{"id":967236,"date":"2023-12-11T15:19:29","date_gmt":"2023-12-11T23:19:29","guid":{"rendered":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/?post_type=msr-project&#038;p=967236"},"modified":"2023-12-11T15:19:32","modified_gmt":"2023-12-11T23:19:32","slug":"query-optimization-for-database-systems","status":"publish","type":"msr-project","link":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/project\/query-optimization-for-database-systems\/","title":{"rendered":"Query Optimization for Database Systems"},"content":{"rendered":"<section class=\"mb-3 moray-highlight\">\n\t<div class=\"card-img-overlay mx-lg-0\">\n\t\t<div class=\"card-background bg-gray-200 has-background- card-background--full-bleed\">\n\t\t\t\t\t<\/div>\n\t\t<!-- Foreground -->\n\t\t<div class=\"card-foreground d-flex mt-md-n5 my-lg-5 px-g px-lg-0\">\n\t\t\t<!-- Container -->\n\t\t\t<div class=\"container d-flex mt-md-n5 my-lg-5 \">\n\t\t\t\t<!-- Card wrapper -->\n\t\t\t\t<div class=\"w-100 w-lg-col-5\">\n\t\t\t\t\t<!-- Card -->\n\t\t\t\t\t<div class=\"card material-md-card py-5 px-md-5\">\n\t\t\t\t\t\t<div class=\"card-body \">\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\n<h1 class=\"wp-block-heading\" id=\"query-optimization\">Query Optimization <\/h1>\n\n\n\n<p><\/p>\n\n\t\t\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/div>\n<\/section>\n\n\n\n\n\n<p>The query optimizer is a crucial component in a relational database system and is responsible for finding a good execution plan for a SQL query. For cloud database service providers, the importance of query optimization is amplified due to the scale (e.g., millions of databases hosted) and variety of different workloads for which the query optimizer is expected to work well &#8220;out-of-the-box&#8221;. Query optimization is challenging due to the richness of SQL queries that contain operators such as joins, group-by, aggregation, and nested sub-queries, the limited data statistics available during query optimization, and the need to keep time and resources for query optimization small. We are interested in a variety of problems related to query optimization. We highlight below some examples of our recent work in query optimization. <\/p>\n\n\n\n<p>We conducted an empirical study to <a href=\"https:\/\/newed.any0.dpdns.org\/en-us\/research\/publication\/analyzing-the-impact-of-cardinality-estimation-on-execution-plans-in-microsoft-sql-server\/\">quantify the impact of cardinality estimation<\/a> on plan quality in a state-of-the-art query optimizer (in the Microsoft SQL Server database engine), built using the Volcano\/Cascades framework. We have also explored the viability of using <em>data-driven techniques (e.g., ML models) <\/em>that exploit knowledge of the workload for problems such as<a href=\"https:\/\/newed.any0.dpdns.org\/en-us\/research\/publication\/leveraging-query-logs-and-machine-learning-for-parametric-query-optimization\/\"> parameteric query optimization<\/a> and <a href=\"https:\/\/newed.any0.dpdns.org\/en-us\/research\/publication\/selectivity-estimation-for-range-predicates-using-lightweight-models\/\">cardinality estimation<\/a>.  With the increasing focus in research and industry on leveraging data-driven techniques for query optimization, we feel it is important to evaluate such systems fairly by taking into consideration the costs of obtaining training data as well as robustness to changing workload characteristics. For this purpose, we developed and <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/github.com\/microsoft\/dsb\">open-sourced<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> the <a href=\"https:\/\/newed.any0.dpdns.org\/en-us\/research\/publication\/dsb-a-decision-support-benchmark-for-workload-driven-and-traditional-database-systems\/\">DSB benchmark<\/a>. <\/p>\n\n\n","protected":false},"excerpt":{"rendered":"<p>The query optimizer is a crucial component in a relational database system and is responsible for finding a good execution plan for a SQL query. For cloud database service providers, the importance of query optimization is amplified due to the scale (e.g., millions of databases hosted) and variety of different workloads for which the query [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"research-area":[13563],"msr-locale":[268875],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-967236","msr-project","type-msr-project","status-publish","hentry","msr-research-area-data-platform-analytics","msr-locale-en_us","msr-archive-status-active"],"msr_project_start":"","related-publications":[159591,969993,860196,802579,791294,784660,688089,659616,647013,619941,579487,494327,378605,151058,158906,158523,155428,155390,155388,155377,155301,154450,154427,152706,152531,152520],"related-downloads":[],"related-videos":[],"related-groups":[957177],"related-events":[],"related-opportunities":[],"related-posts":[],"related-articles":[],"tab-content":[],"slides":[],"related-researchers":[{"type":"user_nicename","display_name":"Anshuman Dutt","user_id":35537,"people_section":"Related people","alias":"andut"},{"type":"user_nicename","display_name":"Vivek Narasayya","user_id":34602,"people_section":"Related people","alias":"viveknar"},{"type":"user_nicename","display_name":"Surajit Chaudhuri","user_id":33764,"people_section":"Related people","alias":"surajitc"},{"type":"user_nicename","display_name":"Kukjin Lee","user_id":32593,"people_section":"Related people","alias":"kulee"}],"msr_research_lab":[],"msr_impact_theme":[],"_links":{"self":[{"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/967236","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-project"}],"about":[{"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-project"}],"version-history":[{"count":8,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/967236\/revisions"}],"predecessor-version":[{"id":983106,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/967236\/revisions\/983106"}],"wp:attachment":[{"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/media?parent=967236"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=967236"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=967236"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=967236"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=967236"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}