{"id":282170,"date":"2014-01-01T00:00:05","date_gmt":"2014-01-01T08:00:05","guid":{"rendered":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/?post_type=msr-group&#038;p=282170"},"modified":"2026-04-29T11:59:30","modified_gmt":"2026-04-29T18:59:30","slug":"azure-research-systems","status":"publish","type":"msr-group","link":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/group\/azure-research-systems\/","title":{"rendered":"Azure Research \u2013 Systems"},"content":{"rendered":"<section class=\"mb-3 moray-highlight\">\n\t<div class=\"card-img-overlay mx-lg-0\">\n\t\t<div class=\"card-background  has-background-catalina-blue card-background--full-bleed\">\n\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"1400\" height=\"788\" src=\"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-content\/uploads\/2022\/08\/T05_FutureOfTheCloud_1400x788.jpg\" class=\"attachment-full size-full\" alt=\"a blue-lit hallway of servers\" style=\"object-position: 51% 50%\" srcset=\"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-content\/uploads\/2022\/08\/T05_FutureOfTheCloud_1400x788.jpg 1400w, https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-content\/uploads\/2022\/08\/T05_FutureOfTheCloud_1400x788-300x169.jpg 300w, https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-content\/uploads\/2022\/08\/T05_FutureOfTheCloud_1400x788-1024x576.jpg 1024w, https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-content\/uploads\/2022\/08\/T05_FutureOfTheCloud_1400x788-768x432.jpg 768w, https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-content\/uploads\/2022\/08\/T05_FutureOfTheCloud_1400x788-1066x600.jpg 1066w, https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-content\/uploads\/2022\/08\/T05_FutureOfTheCloud_1400x788-655x368.jpg 655w, https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-content\/uploads\/2022\/08\/T05_FutureOfTheCloud_1400x788-343x193.jpg 343w, https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-content\/uploads\/2022\/08\/T05_FutureOfTheCloud_1400x788-240x135.jpg 
240w, https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-content\/uploads\/2022\/08\/T05_FutureOfTheCloud_1400x788-640x360.jpg 640w, https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-content\/uploads\/2022\/08\/T05_FutureOfTheCloud_1400x788-960x540.jpg 960w, https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-content\/uploads\/2022\/08\/T05_FutureOfTheCloud_1400x788-1280x720.jpg 1280w\" sizes=\"auto, (max-width: 1400px) 100vw, 1400px\" \/>\t\t<\/div>\n\t\t<!-- Foreground -->\n\t\t<div class=\"card-foreground d-flex mt-md-n5 my-lg-5 px-g px-lg-0\">\n\t\t\t<!-- Container -->\n\t\t\t<div class=\"container d-flex mt-md-n5 my-lg-5 align-self-center\">\n\t\t\t\t<!-- Card wrapper -->\n\t\t\t\t<div class=\"w-100 w-lg-col-5\">\n\t\t\t\t\t<!-- Card -->\n\t\t\t\t\t<div class=\"card material-md-card py-5 px-md-5\">\n\t\t\t\t\t\t<div class=\"card-body \">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<a href=\"https:\/\/newed.any0.dpdns.org\/en-us\/research\/group\/azure-research\/\" class=\"icon-link icon-link--reverse mb-2\" data-bi-cN=\"Azure Research\">\n\t\t\t\t\t\t\t\t\t<span class=\"c-glyph glyph-chevron-left\" aria-hidden=\"true\"><\/span>\n\t\t\t\t\t\t\t\t\tAzure Research\t\t\t\t\t\t\t\t<\/a>\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\n<h1 class=\"wp-block-heading h2\" id=\"azure-research-systems\">Azure Research \u2013 Systems<\/h1>\n\n\n\n<p>Cloud and AI systems innovation at the core of Azure<\/p>\n\n\t\t\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/div>\n<\/section>\n\n\n\n\n\n<p class=\"has-lighter-gray-background-color has-background\"><\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"overview\">Overview<\/h2>\n\n\n\n<p class=\"has-text-align-left\">Azure Research \u2013 Systems is a research group in Azure Core that brings forward-looking, world-class systems research directly into Azure. 
The group was seeded from the Cloud Efficiency team, which migrated from the <a href=\"https:\/\/newed.any0.dpdns.org\/en-us\/research\/group\/systems-research-group-redmond\/\">Systems Research Group<\/a> at <a href=\"https:\/\/newed.any0.dpdns.org\/research\/\">Microsoft Research<\/a>, for closer integration with Azure.<\/p>\n\n\n\n<p class=\"has-text-align-left\">Our group&#8217;s main mission is to improve the cost efficiency of Microsoft&#8217;s online services and datacenters. We pursue this mission by working closely with the company&#8217;s product groups to (1) propose and lead joint projects that improve efficiency, and (2) do research on potential future efficiency improvements.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"impact-of-our-research\">Impact of our research<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"some-of-our-main-tech-transfers-and-corresponding-papers\">Some of our main tech transfers and corresponding papers<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>The new power regulation feature for AI training that we worked with NVIDIA and OpenAI to create (described in our <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/arxiv.org\/pdf\/2508.14318\">joint report<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>) became part of GB200 GPUs in 2025.<\/li>\n\n\n\n<li>Our approach to AI inference phase splitting (also known as prefill-decode disaggregation; described in our <a href=\"https:\/\/newed.any0.dpdns.org\/en-us\/research\/publication\/splitwise-efficient-generative-llm-inference-using-phase-splitting\/\">ISCA 2024 paper<\/a>) was implemented in NVIDIA&#8217;s Dynamo inference framework and motivated NVIDIA&#8217;s CPX GPUs in 2025.<\/li>\n\n\n\n<li>Our hybrid (software + firmware) implementation of per-VM power capping went into production in Cobalt servers in August 2025.<\/li>\n\n\n\n<li>Our approach for power 
oversubscription of GPU clusters (described in our <a href=\"https:\/\/newed.any0.dpdns.org\/en-us\/research\/publication\/characterizing-power-management-opportunities-for-llms-in-the-cloud\/\">ASPLOS 2024 paper<\/a>) went into production in November 2024.<\/li>\n\n\n\n<li>The server shutdown component of our power emergency management system (described in our <a href=\"https:\/\/newed.any0.dpdns.org\/en-us\/research\/publication\/flex-high-availability-datacenters-with-zero-reserved-power\/\">ISCA 2021 paper<\/a>) went into production in June 2023. The system allows datacenters to allocate all of their reserve\/redundant power and host more servers.<\/li>\n\n\n\n<li>Harvest VMs v2 for harvesting underutilized cores (described as Elastic VMs in our <a href=\"https:\/\/newed.any0.dpdns.org\/en-us\/research\/publication\/smartharvest-harvesting-idle-cpus-safely-and-efficiently-in-the-cloud\/\">EuroSys 2021 paper<\/a>) went into production in January 2023.<\/li>\n\n\n\n<li>The server throttling component of our power emergency management system (also described in our <a href=\"https:\/\/newed.any0.dpdns.org\/en-us\/research\/publication\/flex-high-availability-datacenters-with-zero-reserved-power\/\">ISCA 2021 paper<\/a>) went into production in March 2021.<\/li>\n\n\n\n<li>Our per-VM power capping software (described in our <a href=\"https:\/\/newed.any0.dpdns.org\/en-us\/research\/publication\/prediction-based-power-oversubscription-in-cloud-platforms\/\">ATC 2021 paper<\/a>) went into production in October 2020.<\/li>\n\n\n\n<li>Our hybrid policy for managing cold starts in serverless platforms (described in our <a href=\"https:\/\/newed.any0.dpdns.org\/en-us\/research\/publication\/serverless-in-the-wild-characterizing-and-optimizing-the-serverless-workload-at-a-large-cloud-provider\/\">ATC 2020 paper<\/a>) went into production in Azure Functions in June 2020.<\/li>\n\n\n\n<li>Harvest VMs v1 for harvesting unallocated cores and Harvest Hadoop (our modification of 
YARN and HDFS to benefit from Harvest VMs and described in our <a href=\"https:\/\/newed.any0.dpdns.org\/en-us\/research\/publication\/providing-slos-for-resource-harvesting-vms-in-cloud-platforms\/\">OSDI 2020 paper<\/a>) went into production in November 2019.<\/li>\n\n\n\n<li>Our power capping and oversubscription software went into production in July 2018.<\/li>\n\n\n\n<li>Our tail latency mitigation techniques for HDFS (described in our <a href=\"https:\/\/newed.any0.dpdns.org\/en-us\/research\/publication\/managing-tail-latency-in-datacenter-scale-file-systems-under-production-constraints\/\">EuroSys 2019 paper<\/a>) went into production in June 2018.<\/li>\n\n\n\n<li>Resource Central, our ML and prediction-serving system for cloud platforms (described in our <a href=\"https:\/\/newed.any0.dpdns.org\/en-us\/research\/publication\/resource-central-understanding-predicting-workloads-improved-resource-management-large-cloud-platforms\/\">SOSP 2017 paper<\/a>), went into production in March 2018.<\/li>\n\n\n\n<li>Router-Based HDFS Federation, our system for transparently scaling HDFS to datacenter sizes (described in our <a href=\"https:\/\/newed.any0.dpdns.org\/en-us\/research\/publication\/scaling-distributed-file-systems-resource-harvesting-datacenters\/\">ATC 2017 paper<\/a>), went into production in June 2017.<\/li>\n\n\n\n<li>CPU blind isolation for harvesting spare CPU cycles (described in our <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.usenix.org\/conference\/atc18\/presentation\/iorgulescu\" target=\"_blank\" rel=\"noopener noreferrer\">ATC 2018 paper<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> with the DMX group at MSR) went into production in August 2016.<\/li>\n\n\n\n<li>Perflite, a tool for VM utilization analysis and optimization built from our Floodlight tool, went into production in February 2016.<\/li>\n\n\n\n<li>Our resource-harvesting YARN\/HDFS stack and HDFS data 
placement algorithm for harvesting spare storage (described in our <a href=\"https:\/\/newed.any0.dpdns.org\/en-us\/research\/publication\/harvesting-spare-cycles-and-storage-in-large-scale-datacenters\/\">OSDI 2016 paper<\/a>) went into production in January 2016.<\/li>\n\n\n\n<li>Our analysis of disk reliability (described in our <a href=\"https:\/\/newed.any0.dpdns.org\/en-us\/research\/publication\/environmental-conditions-and-disk-reliability-in-free-cooled-datacenters\/\">FAST 2016 award paper<\/a>) prompted the adoption of a new ambient control policy for Microsoft&#8217;s free-cooling datacenters starting in 2015.<\/li>\n<\/ul>\n\n\n\n<p>None of these successes would have been possible without our close partnership with teams in Azure, Bing, E+D, CO+I, AHSI, Core OS, M365 Research, and MSR.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"some-of-our-recent-best-paper-awards\">Some of our recent best paper awards<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/www.nature.com\/articles\/s41586-025-08832-3\"><strong>Using Life Cycle Assessment to Drive Innovation for Sustainable Cool Clouds.<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a> <strong>Nature 2025.<\/strong> Husam Alissa, Teresa Nick, Ashish Raniwala, Alberto Arribas Herranz, Kali Frost, Ioannis Manousakis, Kari Lio, Brijesh Warrier, Vaidehi Oruganti, T. J. 
DiCaprio, Kathryn Oseen-Senda, Bharath Ramakrishnan, Naval Gupta, Ricardo Bianchini, Jim Kleewein, Christian Belady, Marcus Fontoura, Julie Sinistore, Mukunth Natarajan, Lauren Johnson, VeeAnder Mealing, Praneet Arshi, Madeline Frieze.<\/li>\n\n\n\n<li><a href=\"https:\/\/newed.any0.dpdns.org\/en-us\/research\/publication\/dynamollm-designing-llm-inference-clusters-for-performance-and-energy-efficiency\/\"><strong>DynamoLLM: Designing LLM Inference Clusters for Performance and Energy Efficiency<\/strong><\/a>. <strong>HPCA 2025.<\/strong> Jovan Stojkovic, Chaojie Zhang, \u00cd\u00f1igo Goiri, Josep Torrellas, Esha Choukse. \ud83c\udfc5<em>Best Paper Award.<\/em><\/li>\n\n\n\n<li><strong>Splitwise: Efficient Generative LLM Inference Using Phase Splitting<\/strong>. \ud83c\udfc5<strong>IEEE Micro Top Picks from the 2024 Computer Architecture Conferences<\/strong>. Esha Choukse, Pratyush Patel, Chaojie Zhang, Aashaka Shah, \u00cd\u00f1igo Goiri, Saeed Maleki, Rodrigo Fonseca, Ricardo Bianchini.<\/li>\n\n\n\n<li><strong>Designing Cloud Servers for Lower Carbon<\/strong>. \ud83c\udfc5<strong>IEEE Micro Top Picks from the 2024 Computer Architecture Conferences<\/strong>. Jaylen Wang, Daniel S. Berger, Fiodar Kazhamiaka, Celine Irvene, Chaojie Zhang, Esha Choukse, Kali Frost, Rodrigo Fonseca, Brijesh Warrier, Chetan Bansal, Jonathan Stern, Ricardo Bianchini, Akshitha Sriraman.<\/li>\n\n\n\n<li><a href=\"https:\/\/newed.any0.dpdns.org\/en-us\/research\/publication\/pond-cxl-based-memory-pooling-systems-for-cloud-platforms\/\"><strong>Pond: CXL-Based Memory Pooling Systems for Cloud Platforms<\/strong><\/a>. <strong>ASPLOS 2023<\/strong>. Huaicheng Li, Daniel S. Berger, Stanko Novakovic, Lisa Hsu, Dan Ernst, Pantea Zardoshti, Monish Shah, Samir Rajadnya, Scott Lee, Ishwar Agarwal, Mark D. Hill, Marcus Fontoura, Ricardo Bianchini. 
\ud83c\udfc5<em>Distinguished Paper Award<\/em><\/li>\n\n\n\n<li><a href=\"https:\/\/newed.any0.dpdns.org\/en-us\/research\/publication\/enso-a-streaming-interface-for-nic-application-communication\/\"><strong>Ens\u014d: A Streaming Interface for NIC-Application Communication<\/strong>.<\/a> <strong>OSDI 2023. <\/strong>Hugo Sadok, Nirav Atre, Zhipeng Zhao, Daniel S. Berger, James C. Hoe, Aurojit Panda, Justine Sherry, Ren Wang. \ud83c\udfc5<em>Best Paper Award.<\/em><\/li>\n<\/ul>\n\n\n\n<p>Complete list of publications <a href=\"https:\/\/newed.any0.dpdns.org\/en-us\/research\/group\/azure-systems-research\/publications\/\">here<\/a>.<\/p>\n\n\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Cloud and AI systems innovation at the core of Azure Azure Research \u2013 Systems is a research group in Azure Core that brings forward-looking, world-class systems research directly into Azure. The group was seeded from the Cloud Efficiency team, which migrated from the Systems Research Group at Microsoft Research, for a closer integration with Azure. 
[&hellip;]<\/p>\n","protected":false},"featured_media":871668,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr_group_start":"","footnotes":""},"research-area":[13547],"msr-group-type":[243694],"msr-locale":[268875],"msr-impact-theme":[],"class_list":["post-282170","msr-group","type-msr-group","status-publish","has-post-thumbnail","hentry","msr-research-area-systems-and-networking","msr-group-type-group","msr-locale-en_us"],"msr_group_start":"","msr_detailed_description":"","msr_further_details":"","msr_hero_images":[],"msr_research_lab":[],"related-researchers":[{"type":"user_nicename","display_name":"Ricardo Bianchini","user_id":33393,"people_section":"Group 1","alias":"ricardob"},{"type":"user_nicename","display_name":"Rodrigo Fonseca","user_id":40429,"people_section":"Group 1","alias":"rofons"},{"type":"user_nicename","display_name":"Daniel S. Berger","user_id":38892,"people_section":"Group 1","alias":"daberg"},{"type":"user_nicename","display_name":"Esha Choukse","user_id":40417,"people_section":"Group 1","alias":"eschouks"},{"type":"user_nicename","display_name":"&Iacute;&ntilde;igo Goiri","user_id":32102,"people_section":"Group 1","alias":"inigog"},{"type":"user_nicename","display_name":"Celine Irvene","user_id":40636,"people_section":"Group 1","alias":"celineirvene"},{"type":"user_nicename","display_name":"Fiodar Kazhamiaka","user_id":42714,"people_section":"Group 1","alias":"fkazhamiaka"},{"type":"user_nicename","display_name":"Alok Kumbhare","user_id":36086,"people_section":"Group 1","alias":"Alok Kumbhare"},{"type":"user_nicename","display_name":"Rohan Mahapatra","user_id":44065,"people_section":"Group 1","alias":"romahapatra"},{"type":"user_nicename","display_name":"Pulkit Misra","user_id":38496,"people_section":"Group 1","alias":"pumisra"},{"type":"user_nicename","display_name":"Haoran 
Qiu","user_id":43428,"people_section":"Group 1","alias":"haoranqiu"},{"type":"user_nicename","display_name":"Enrique Saurez","user_id":41820,"people_section":"Group 1","alias":"esaurez"},{"type":"user_nicename","display_name":"Chaojie Zhang","user_id":42705,"people_section":"Group 1","alias":"chaojiezhang"}],"related-publications":[568647,569016,431325,431334,392756,168898,168874,282332,168025,168026,168310,1136488,1083105,1136484,1131135,1127547,1116381,1115865,1113288,1112880,1105797,1102506,1088772,1083438,1083432,1083426,1083420,1083144,1083138,1058097,1153960,1170188,1163710,1163709,1163700,1154877,1154814,1154811,1153976,1139896,1151670,1151668,1151666,1149914,1148330,1148314,1145087,1143223,757534,887910,887382,883242,883236,856551,832030,813553,807802,781513,780187,908127,757489,756961,743035,738673,737704,702511,694536,658218,632832,632478,573207,1012251,1054683,1051530,1026447,1017912,1017714,1017021,1017003,1016991,1016919,1016907,560202,999297,990468,990234,953553,946659,946248,946149,939681,939012,931692,931290],"related-downloads":[],"related-videos":[],"related-projects":[1017939,757045,615975,658236,616008,573039],"related-events":[],"related-opportunities":[1151775],"related-posts":[938859,951852,956640,962838,1025451,1085448,1138612],"tab-content":[],"msr_impact_theme":[],"_links":{"self":[{"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-group\/282170","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-group"}],"about":[{"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-group"}],"version-history":[{"count":89,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-group\/282170\/revisions"}],"predecessor-version":[{"id":1170163,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-group\/282170\/revisions\/1170163"}],"wp:featuredmedia":[{"embeddable":true,"href":"h
ttps:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/media\/871668"}],"wp:attachment":[{"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/media?parent=282170"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=282170"},{"taxonomy":"msr-group-type","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-group-type?post=282170"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=282170"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=282170"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}