{"id":152274,"date":"1994-09-01T00:00:00","date_gmt":"1994-09-01T00:00:00","guid":{"rendered":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/msr-research-item\/from-cmu-sphinx-ii-to-microsoft-whisper-making-speech-recognition-usable\/"},"modified":"2018-10-16T22:02:38","modified_gmt":"2018-10-17T05:02:38","slug":"from-cmu-sphinx-ii-to-microsoft-whisper-making-speech-recognition-usable","status":"publish","type":"msr-research-item","link":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/publication\/from-cmu-sphinx-ii-to-microsoft-whisper-making-speech-recognition-usable\/","title":{"rendered":"From CMU Sphinx-II to Microsoft Whisper: Making Speech Recognition Usable"},"content":{"rendered":"<div class=\"asset-content\">\n<p>In this paper, we first review Sphinx-II, a large-vocabulary speaker-independent continuous speech recognition system developed at CMU. We summarize techniques that helped Sphinx-II achieve the state-of-the-art large-vocabulary continuous speech recognition performance. We then review Whisper, a system we developed here at Microsoft Research. We focus on recognition accuracy, efficiency and usability issues, which are critical to the success of commercial speech applications. Whisper has significantly improved Sphinx-II&#8217;s performance in these three areas.<\/p>\n<\/div>\n<p><!-- .asset-content --><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this paper, we first review Sphinx-II, a large-vocabulary speaker-independent continuous speech recognition system developed at CMU. We summarize techniques that helped Sphinx-II achieve the state-of-the-art large-vocabulary continuous speech recognition performance. We then review Whisper, a system we developed here at Microsoft Research. We focus on recognition accuracy, efficiency and usability issues, which are critical [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":null,"msr_publishername":"Kluwer Academic","msr_publisher_other":"","msr_booktitle":"","msr_chapter":"","msr_edition":"Automatic Speech and Speaker Recognition--Advanced Topics","msr_editors":"","msr_how_published":"","msr_isbn":"","msr_issue":"","msr_journal":"","msr_number":"MSR-TR-94-20","msr_organization":"","msr_pages_string":"28","msr_page_range_start":"28","msr_page_range_end":"","msr_series":"","msr_volume":"","msr_copyright":"","msr_conference_name":"","msr_doi":"","msr_arxiv_id":"","msr_s2_paper_id":"","msr_mag_id":"","msr_pubmed_id":"","msr_other_authors":"Doug Beeferman","msr_other_contributors":"","msr_speaker":"","msr_award":"","msr_affiliation":"","msr_institution":"Microsoft Research","msr_host":"","msr_version":"","msr_duration":"","msr_original_fields_of_study":"","msr_release_tracker_id":"","msr_s2_match_type":"","msr_citation_count_updated":"","msr_published_date":"1994-09-01","msr_highlight_text":"","msr_notes":"Automatic Speech and Speaker Recognition--Advanced Topics","msr_longbiography":"","msr_publicationurl":"","msr_external_url":"","msr_secondary_video_url":"","msr_conference_url":"","msr_journal_url":"","msr_s2_pdf_url":"","msr_year":1994,"msr_citation_count":0,"msr_influential_citations":0,"msr_reference_count":0,"msr_s2_match_confidence":0,"msr_microsoftintellectualproperty":true,"msr_s2_open_access":false,"msr_s2_author_ids":[],"msr_pub_ids":[],"msr_hide_image_in_river":0,"footnotes":""},"msr-research-highlight":[],"research-area":[13554],"msr-publication-type":[193718],"msr-publisher":[],"msr-focus-area":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-152274","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-human-computer-interaction","msr-locale-en_us"],"msr_publishername":"Kluwer Academic","msr_edition":"Automatic Speech and Speaker Recognition--Advanced Topics","msr_affiliation":"","msr_published_date":"1994-09-01","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"28","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"MSR-TR-94-20","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"Automatic Speech and Speaker Recognition--Advanced Topics","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"211728","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","title":"tr-94-20.doc","viewUrl":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-content\/uploads\/2016\/02\/tr-94-20.doc","id":211728,"label_id":0}],"msr_related_uploader":"","msr_citation_count":0,"msr_citation_count_updated":"","msr_s2_paper_id":"","msr_influential_citations":0,"msr_reference_count":0,"msr_arxiv_id":"","msr_s2_author_ids":[],"msr_s2_open_access":false,"msr_s2_pdf_url":null,"msr_attachments":[{"id":211728,"url":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-content\/uploads\/2016\/02\/tr-94-20.doc"}],"msr-author-ordering":[{"type":"user_nicename","value":"fil","user_id":31811,"rest_url":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=fil"},{"type":"text","value":"Doug Beeferman","user_id":0,"rest_url":false},{"type":"user_nicename","value":"milindm","user_id":32927,"rest_url":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=milindm"},{"type":"user_nicename","value":"mehwang","user_id":32872,"rest_url":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=mehwang"},{"type":"user_nicename","value":"xdh","user_id":34869,"rest_url":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=xdh"}],"msr_impact_theme":[],"msr_research_lab":[],"msr_event":[],"msr_group":[],"msr_project":[],"publication":[],"video":[],"msr-tool":[],"msr_publication_type":"techreport","related_content":[],"_links":{"self":[{"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/152274","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":2,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/152274\/revisions"}],"predecessor-version":[{"id":541525,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/152274\/revisions\/541525"}],"wp:attachment":[{"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/media?parent=152274"}],"wp:term":[{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=152274"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=152274"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=152274"},{"taxonomy":"msr-publisher","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-publisher?post=152274"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=152274"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=152274"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=152274"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=152274"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=152274"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=152274"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=152274"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=152274"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}