3/2015 - 5
Application of Machine Learning Algorithms for the Query Performance Prediction
MILICEVIC, M., BARANOVIC, M., ZUBRINIC, K.
machine learning, prediction algorithms, query processing, transaction databases
About this article
Date of Publication: 2015-08-31
Volume 15, Issue 3, Year 2015, On page(s): 33 - 44
ISSN: 1582-7445, e-ISSN: 1844-7600
Digital Object Identifier: 10.4316/AECE.2015.03005
Web of Science Accession Number: 000360171500005
SCOPUS ID: 84940732050
This paper analyzes the relationship between system load/throughput and query response time in a real online transaction processing (OLTP) environment. OLTP systems are characterized by short transactions, which normally require high availability and consistently short response times, but the need for operational reporting may jeopardize these objectives. We propose a new approach to performance prediction for concurrent database workloads, based on a system state vector consisting of 36 attributes. No attribute is assumed in advance to be more important than the others; instead, machine learning methods are used to determine which attributes best describe the behavior of a particular database server and how to model that system. During the learning phase, the system's profile is created using multiple reference queries, selected to represent frequent business processes. Accurate response time prediction may provide a foundation for automated decision-making in database (DB) query scheduling. Possible applications of the proposed method include adaptive resource allocation, quality of service (QoS) management, and real-time dynamic query scheduling (e.g., estimating the optimal moment to execute a complex query).
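The abstract does not name the specific attributes or learning algorithms, so the following is only a minimal sketch of the general idea, under stated assumptions: a 36-attribute state vector, a simple correlation filter standing in for attribute selection, and k-nearest-neighbor regression standing in for the response time model. The synthetic data, the attribute indices 3 and 17, the 1.5-second limit, and all function names are hypothetical and are not taken from the paper.

```python
import math
import random

N_ATTRIBUTES = 36  # size of the system state vector stated in the abstract


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def rank_attributes(states, times, top_k):
    """Indices of the top_k attributes by |correlation| with response time."""
    scores = []
    for j in range(len(states[0])):
        column = [s[j] for s in states]
        scores.append((abs(pearson(column, times)), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:top_k]]


def predict_knn(states, times, selected, query_state, k=5):
    """Mean response time of the k nearest profiled states (selected attributes only)."""
    def dist(s):
        return math.sqrt(sum((s[j] - query_state[j]) ** 2 for j in selected))
    nearest = sorted(range(len(states)), key=lambda i: dist(states[i]))[:k]
    return sum(times[i] for i in nearest) / len(nearest)


def should_run_now(predicted_seconds, limit_seconds):
    """Hypothetical scheduling rule: run the complex query only if the prediction is under the limit."""
    return predicted_seconds < limit_seconds


def make_synthetic_profile(n_samples, seed=0):
    """Assumed data: response time of a reference query depends on attributes 3 and 17 only."""
    rng = random.Random(seed)
    states, times = [], []
    for _ in range(n_samples):
        s = [rng.random() for _ in range(N_ATTRIBUTES)]
        t = 0.2 + 2.0 * s[3] + 1.0 * s[17] + rng.gauss(0, 0.05)
        states.append(s)
        times.append(t)
    return states, times


# Learning phase: build the profile, then let the data pick the informative attributes.
states, times = make_synthetic_profile(400)
selected = rank_attributes(states, times, top_k=2)

# Two hypothetical system states: lightly and heavily loaded on the relevant attributes.
light = [0.5] * N_ATTRIBUTES
light[3], light[17] = 0.1, 0.1
heavy = [0.5] * N_ATTRIBUTES
heavy[3], heavy[17] = 0.9, 0.9

pred_light = predict_knn(states, times, selected, light)
pred_heavy = predict_knn(states, times, selected, heavy)

print("selected attributes:", selected)
print("run now (light load):", should_run_now(pred_light, 1.5))
print("run now (heavy load):", should_run_now(pred_heavy, 1.5))
```

The filter starts with no prior weighting of the attributes, which mirrors the abstract's point that attribute importance is learned per server rather than assumed. A real deployment would replace the synthetic profile with measurements collected while running the reference queries.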
Faculty of Electrical Engineering and Computer Science
Stefan cel Mare University of Suceava, Romania
All rights reserved: Advances in Electrical and Computer Engineering is a registered trademark of the Stefan cel Mare University of Suceava. No part of this publication may be reproduced, stored in a retrieval system, photocopied, recorded or archived, without the written permission from the Editor. When authors submit their papers for publication, they agree that the copyright for their article be transferred to the Faculty of Electrical Engineering and Computer Science, Stefan cel Mare University of Suceava, Romania, if and only if the articles are accepted for publication. The copyright covers the exclusive rights to reproduce and distribute the article, including reprints and translations.