{"id":336134,"date":"2016-12-13T16:13:02","date_gmt":"2016-12-14T00:13:02","guid":{"rendered":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/?post_type=msr-research-item&#038;p=336134"},"modified":"2018-10-16T20:10:30","modified_gmt":"2018-10-17T03:10:30","slug":"novel-framework-text-independent-speaker-verification-based-utterance-transform-iterative-cohort-modeling","status":"publish","type":"msr-research-item","link":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/publication\/novel-framework-text-independent-speaker-verification-based-utterance-transform-iterative-cohort-modeling\/","title":{"rendered":"Novel Framework of Text-independent Speaker Verification based on Utterance Transform and Iterative Cohort Modeling"},"content":{"rendered":"<p>A novel framework for text-independent speaker verification is proposed. The framework is based on a new interpretation of the Universal Background Model (UBM). The UBM in our framework defines a transform that maps a variable-length observation into a fixed-dimensional supervector. Each speech utterance is then mapped to a point in this supervector space. The similarity measure in this vector space is progressively refined via an iterative cohort modeling scheme. Experiments on the NIST 2002 corpus show the effectiveness of the new framework. Overall, the EER drops from 9.21% for the baseline system (with T-Norm) to 8.07% for the final improved system (without T-Norm). The new framework effectively reduces the data dependence of the final output score, as the second set of experiments clearly indicates. After T-Norm, the EER of the final system increases marginally, by 1.73% relative, whereas the EER of the baseline system drops by 16.12% relative. Likewise, the relative improvement in DCF after T-Norm is marginal for the final improved system (2.47%) compared to 33.68% for the baseline system. 
This clearly shows that iterative cohort modeling effectively reduces the data dependence of the final scores, so that T-Norm does not further improve system performance. Also, the performance of the new framework clearly improves as the number of iterations grows, which suggests that the framework progressively refines the similarity measure on the supervector space through iterative cohort modeling. Index Terms: speaker verification, utterance transform, iterative cohort modeling<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A novel framework for text-independent speaker verification is proposed. The framework is based on a new interpretation of the Universal Background Model (UBM). The UBM in our framework defines a transform that maps a variable-length observation into a fixed-dimensional supervector. Each speech utterance is then mapped to a point in this supervector [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":null,"msr_publishername":"","msr_publisher_other":"","msr_booktitle":"","msr_chapter":"","msr_edition":"Proceedings of the Ninth International Conference on Spoken Language Processing (Interspeech 2006 - ICSLP), Pittsburgh, Pennsylvania","msr_editors":"","msr_how_published":"","msr_isbn":"","msr_issue":"","msr_journal":"","msr_number":"","msr_organization":"","msr_pages_string":"","msr_page_range_start":"","msr_page_range_end":"","msr_series":"","msr_volume":"","msr_copyright":"","msr_conference_name":"Proceedings of the Ninth International Conference on Spoken Language Processing (Interspeech 2006 - ICSLP), Pittsburgh, 
Pennsylvania","msr_doi":"","msr_arxiv_id":"","msr_s2_paper_id":"","msr_mag_id":"","msr_pubmed_id":"","msr_other_authors":"","msr_other_contributors":"","msr_speaker":"","msr_award":"","msr_affiliation":"","msr_institution":"","msr_host":"","msr_version":"","msr_duration":"","msr_original_fields_of_study":"","msr_release_tracker_id":"","msr_s2_match_type":"","msr_citation_count_updated":"","msr_published_date":"2006-09-17","msr_highlight_text":"","msr_notes":"","msr_longbiography":"","msr_publicationurl":"","msr_external_url":"","msr_secondary_video_url":"","msr_conference_url":"","msr_journal_url":"","msr_s2_pdf_url":"","msr_year":0,"msr_citation_count":0,"msr_influential_citations":0,"msr_reference_count":0,"msr_s2_match_confidence":0,"msr_microsoftintellectualproperty":true,"msr_s2_open_access":false,"msr_s2_author_ids":[],"msr_pub_ids":[],"msr_hide_image_in_river":0,"footnotes":""},"msr-research-highlight":[],"research-area":[13562],"msr-publication-type":[193716],"msr-publisher":[],"msr-focus-area":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-336134","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-computer-vision","msr-locale-en_us"],"msr_publishername":"","msr_edition":"Proceedings of the Ninth International Conference on Spoken Language Processing (Interspeech 2006 - ICSLP), Pittsburgh, 
Pennsylvania","msr_affiliation":"","msr_published_date":"2006-09-17","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"460413","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","title":"novel-framework-of-text-independent-speaker-verification-based-on","viewUrl":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-content\/uploads\/2016\/12\/Novel-Framework-of-Text-independent-Speaker-Verification-based-on.pdf","id":460413,"label_id":0}],"msr_related_uploader":"","msr_citation_count":0,"msr_citation_count_updated":"","msr_s2_paper_id":"","msr_influential_citations":0,"msr_reference_count":0,"msr_arxiv_id":"","msr_s2_author_ids":[],"msr_s2_open_access":false,"msr_s2_pdf_url":null,"msr_attachments":[],"msr-author-ordering":[{"type":"text","value":"Ming Liu","user_id":0,"rest_url":false},{"type":"text","value":"Huazhong Ning","user_id":0,"rest_url":false},{"type":"text","value":"Thomas S. Huang","user_id":0,"rest_url":false},{"type":"user_nicename","value":"zhang","user_id":35102,"rest_url":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=zhang"}],"msr_impact_theme":[],"msr_research_lab":[],"msr_event":[],"msr_group":[],"msr_project":[336119],"publication":[],"video":[],"msr-tool":[],"msr_publication_type":"inproceedings","related_content":{"projects":[{"ID":336119,"post_title":"Speaker Verification:  Text-Dependent vs. 
Text-Independent","post_name":"speaker-verification-text-dependent-vs-text-independent","post_type":"msr-project","post_date":"2016-12-13 16:04:09","post_modified":"2017-06-19 09:30:04","post_status":"publish","permalink":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/project\/speaker-verification-text-dependent-vs-text-independent\/","post_excerpt":"Speaker verification is the process of verifying the claimed identity of a speaker based on the speech signal from the speaker (voiceprint). There are two types of speaker verification systems: Text-Independent Speaker Verification (TI-SV) and Text-Dependent Speaker Verification (TD-SV). TD-SV requires the speaker saying exactly the enrolled or given password. Text independent Speaker Verification is a process of verifying the identity without constraint on the speech content. Compared to TD-SV, it is more convenient because&hellip;","_links":{"self":[{"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/336119"}]}}]},"_links":{"self":[{"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/336134","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":1,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/336134\/revisions"}],"predecessor-version":[{"id":523980,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/336134\/revisions\/523980"}],"wp:attachment":[{"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/media?parent=336134"}],"wp:term":[{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=336134"},{"taxonomy":"m
sr-research-area","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=336134"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=336134"},{"taxonomy":"msr-publisher","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-publisher?post=336134"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=336134"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=336134"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=336134"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=336134"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=336134"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=336134"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=336134"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/newed.any0.dpdns.org\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=336134"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}