PilotScope: Steering Databases with Machine Learning Drivers (2024)

research-article

Artifacts Available / v1.1

Authors:
Rong Zhu (Alibaba Group, Hangzhou, China)
Lianggui Weng (Alibaba Group, Hangzhou, China)
Wenqing Wei (Alibaba Group, USTC, Hangzhou, China)
Di Wu (Alibaba Group, HUST, Hangzhou, China)
Jiazhen Peng (Alibaba Group, ZJU, Hangzhou, China)
Yifan Wang (Alibaba Group, UFL, Hangzhou, China)
Bolin Ding (Alibaba Group, Hangzhou, China)
Defu Lian (USTC, Hefei, China)
Bolong Zheng (HUST, Wuhan, China)
Jingren Zhou (Alibaba Group, Hangzhou, China)

Proceedings of the VLDB Endowment, Volume 17, Issue 5, pp 980–993. https://doi.org/10.14778/3641204.3641209

Published: 02 May 2024


Abstract

Learned databases, or AI4DB techniques, have rapidly developed in the last decade. Deploying machine learning (ML) and AI4DB algorithms into actual databases is the gold standard to examine their performance in practice. However, due to the complexity of database systems, the difference between ML and DB programming paradigms, and the diversity of ML models, the tasks of developing and deploying AI4DB algorithms into databases are prohibitively difficult. Most previous works focus on specific AI4DB algorithms and ML models whose deployment requires close cooperation between ML and DB developers and heavy engineering cost.

In this paper, we design and implement PilotScope, an AI4DB middleware with a programming model that largely reduces such difficulties. With a novel abstraction of AI4DB algorithms for, e.g., knob tuning and query optimization, PilotScope consists of two classes of components, AI4DB drivers and DB interactors, with different programming paradigms and roles in AI4DB tasks. ML developers focus on designing and implementing AI4DB drivers, which are algorithmic workflows that collect statistics from databases, train ML models, make decisions, and optimize databases using learned models. AI4DB drivers interact with databases via DB interactors (e.g., for collecting data and enforcing actions in databases). DB developers focus on implementing these interactors on one or more database engines, with the interaction details hidden from ML developers. PilotScope supports a variety of AI4DB tasks, and the implementation of an AI4DB algorithm on PilotScope can be deployed in different databases with only minimal modifications. PilotScope is effective in benchmarking these AI4DB algorithms in real-world scenarios. We hope that PilotScope could significantly accelerate iterating AI4DB research and make AI4DB techniques truly applicable in production.
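
The split between drivers and interactors described in the abstract can be illustrated with a small sketch. This is not PilotScope's actual API: the names `DBInteractor`, `ToyInteractor`, `KnobTuningDriver`, `collect`, and `enforce` are hypothetical, the "database" is an in-memory stand-in with a made-up latency function, and the "model" is only a lookup table of observed latencies. It only aims to show how a driver might depend on an interactor interface rather than on a specific engine.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


class DBInteractor(ABC):
    """Hypothetical engine-facing interface, implemented by DB developers."""

    @abstractmethod
    def collect(self, knob_value: int) -> float:
        """Return the observed latency of a workload under a knob setting."""

    @abstractmethod
    def enforce(self, knob_value: int) -> None:
        """Apply the chosen knob setting to the database."""


@dataclass
class ToyInteractor(DBInteractor):
    """Stand-in for a real engine; latency is an invented function of one knob."""

    engine: str
    applied: list = field(default_factory=list)

    def collect(self, knob_value: int) -> float:
        # Invented cost curve with its minimum at knob_value == 64.
        return (knob_value - 64) ** 2 / 100 + 5.0

    def enforce(self, knob_value: int) -> None:
        self.applied.append(knob_value)


class KnobTuningDriver:
    """Hypothetical driver written by an ML developer; it sees only DBInteractor."""

    def __init__(self, candidates):
        self.candidates = list(candidates)
        self.model = {}  # knob value -> observed latency (placeholder for an ML model)

    def run(self, db: DBInteractor) -> int:
        # 1. Collect statistics through the interactor.
        for knob in self.candidates:
            self.model[knob] = db.collect(knob)
        # 2. Decide: pick the setting with the lowest observed latency.
        best = min(self.model, key=self.model.get)
        # 3. Enforce the decision through the interactor.
        db.enforce(best)
        return best


# The same driver code runs unchanged against two (toy) engines.
results = {}
for engine in ("engine_a", "engine_b"):
    results[engine] = KnobTuningDriver([16, 32, 64, 128]).run(ToyInteractor(engine))
print(results)
```

In this sketch, porting the driver to another engine would only require a new `DBInteractor` subclass, which mirrors the division of labor between ML and DB developers that the abstract describes.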


Published in
Proceedings of the VLDB Endowment, Volume 17, Issue 5
January 2024, 233 pages
ISSN: 2150-8097
Editors: Meihui Zhang (Beijing Institute of Technology), Cyrus Shahabi (University of Southern California)
Publisher: VLDB Endowment