{"id":1016919,"date":"2024-03-20T09:56:21","date_gmt":"2024-03-20T16:56:21","guid":{"rendered":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/?post_type=msr-research-item&#038;p=1016919"},"modified":"2024-07-18T20:27:57","modified_gmt":"2024-07-19T03:27:57","slug":"smartoclock-workload-and-risk-aware-overclocking-in-the-cloud","status":"publish","type":"msr-research-item","link":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/publication\/smartoclock-workload-and-risk-aware-overclocking-in-the-cloud\/","title":{"rendered":"SmartOClock: Workload- and Risk-Aware Overclocking in the Cloud"},"content":{"rendered":"<p>Operating server components beyond their voltage and power design limits (i.e., overclocking) enables improving performance and lowering cost for cloud workloads. However, overclocking can significantly degrade component lifetime, increase power consumption, and cause power capping events, eventually diminishing the performance benefits.<\/p>\n<p>In this paper, we characterize the impact of overclocking on cloud workloads by studying their profiles from production deployments. 
Based on the characterization insights, we propose SmartOClock, the first distributed overclocking management platform specifically designed for cloud environments. SmartOClock is a workload-aware scheme that relies on power predictions to heterogeneously distribute the power budgets across its servers based on their needs and then enforce budget compliance locally, per-server, in a decentralized manner.<\/p>\n<p>SmartOClock reduces the tail latency by 9%, application cost by 30% and total energy consumption by 10% for latency-sensitive microservices on a 36-server deployment. 
Simulation analysis using production traces shows that SmartOClock reduces the number of power capping events by up to 95% while increasing the overclocking success rate by up to 62%. We also describe lessons from building a first-of-its-kind overclockable cluster at a cloud provider for production experiments.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Operating server components beyond their voltage and power design limits (i.e., overclocking) enables improving performance and lowering cost for cloud workloads. However, overclocking can significantly degrade component lifetime, increase power consumption, and cause power capping events, eventually diminishing the performance benefits. 
In this paper, we characterize the impact of overclocking on cloud workloads by studying [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":null,"msr_publishername":"","msr_publisher_other":"","msr_booktitle":"","msr_chapter":"","msr_edition":"","msr_editors":"","msr_how_published":"","msr_isbn":"","msr_issue":"","msr_journal":"","msr_number":"","msr_organization":"","msr_pages_string":"","msr_page_range_start":"","msr_page_range_end":"","msr_series":"","msr_volume":"","msr_copyright":"","msr_conference_name":"ISCA","msr_doi":"","msr_arxiv_id":"","msr_s2_paper_id":"","msr_mag_id":"","msr_pubmed_id":"","msr_other_authors":"","msr_other_contributors":"","msr_speaker":"","msr_award":"","msr_affiliation":"","msr_institution":"","msr_host":"","msr_version":"","msr_duration":"","msr_original_fields_of_study":"","msr_release_tracker_id":"","msr_s2_match_type":"","msr_citation_count_updated":"","msr_published_date":"2024-6-1","msr_highlight_text":"","msr_notes":"","msr_longbiography":"","msr_publicationurl":"","msr_external_url":"","msr_secondary_video_url":"","msr_conference_url":"","msr_journal_url":"","msr_s2_pdf_url":"","msr_year":0,"msr_citation_count":0,"msr_influential_citations":0,"msr_reference_count":0,"msr_s2_match_confidence":0,"msr_microsoftintellectualproperty":true,"msr_s2_open_access":false,"msr_s2_author_ids":[],"msr_pub_ids":[],"msr_hide_image_in_river":0,"footnotes":""},"msr-research-highlight":[],"research-area":[13547],"msr-publication-type":[193716],"msr-publisher":[],"msr-focus-area":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[246691],"msr-conference":[259546],"msr-journal":[],"msr-impact-theme":[264846],"msr-pillar":[],"class_list":["post-1016919","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-
area-systems-and-networking","msr-locale-en_us","msr-field-of-study-computer-science"],"msr_publishername":"","msr_edition":"","msr_affiliation":"","msr_published_date":"2024-6-1","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","viewUrl":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-content\/uploads\/2024\/03\/SmartOClock_ISCA24.pdf","id":"1029366","title":"smartoclock_isca24","label_id":"243109","label":0}],"msr_related_uploader":"","msr_citation_count":0,"msr_citation_count_updated":"","msr_s2_paper_id":"","msr_influential_citations":0,"msr_reference_count":0,"msr_arxiv_id":"","msr_s2_author_ids":[],"msr_s2_open_access":false,"msr_s2_pdf_url":null,"msr_attachments":[{"id":1029366,"url":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-content\/uploads\/2024\/04\/SmartOClock_ISCA24.pdf"},{"id":1016925,"url":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-content\/uploads\/2024\/03\/SmartOClock_ISCA.pdf"}],"msr-author-ordering":[{"type":"text","value":"Jovan Stojkovic","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Pulkit Misra","user_id":38496,"rest_url":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Pulkit Misra"},{"type":"user_nicename","value":"&Iacute;&ntilde;igo 
Goiri","user_id":32102,"rest_url":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=&Iacute;&ntilde;igo Goiri"},{"type":"user_nicename","value":"Sam Whitlock","user_id":41024,"rest_url":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Sam Whitlock"},{"type":"user_nicename","value":"Esha Choukse","user_id":40417,"rest_url":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Esha Choukse"},{"type":"user_nicename","value":"Mayukh Das","user_id":41140,"rest_url":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Mayukh Das"},{"type":"user_nicename","value":"Chetan Bansal","user_id":31394,"rest_url":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Chetan Bansal"},{"type":"text","value":"Jason Lee","user_id":0,"rest_url":false},{"type":"text","value":"Zoey Sun","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Haoran Qiu","user_id":43428,"rest_url":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Haoran Qiu"},{"type":"text","value":"Reed Zimmermann","user_id":0,"rest_url":false},{"type":"text","value":"Savyasachi Samal","user_id":0,"rest_url":false},{"type":"guest","value":"brijesh-warrier","user_id":956994,"rest_url":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=brijesh-warrier"},{"type":"text","value":"Ashish Raniwala","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Ricardo Bianchini","user_id":33393,"rest_url":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Ricardo Bianchini"}],"msr_impact_theme":["Computing 
foundations"],"msr_research_lab":[],"msr_event":[],"msr_group":[282170,793670,811276,998211],"msr_project":[],"publication":[],"video":[],"msr-tool":[],"msr_publication_type":"inproceedings","related_content":[],"_links":{"self":[{"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/1016919","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":3,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/1016919\/revisions"}],"predecessor-version":[{"id":1058478,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/1016919\/revisions\/1058478"}],"wp:attachment":[{"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/media?parent=1016919"}],"wp:term":[{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=1016919"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=1016919"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=1016919"},{"taxonomy":"msr-publisher","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-publisher?post=1016919"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=1016919"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=1016919"},{"taxonomy":"msr-post-option","embeddable":true,"hre
f":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=1016919"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=1016919"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=1016919"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=1016919"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=1016919"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=1016919"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}