Tests, measurements, and automatic speech recognition
D. S. Pallett and J. Baker
ACM Stand., September 1, 1997. DOI: 10.1145/266231.266238
Figure One shows a representative test cycle for tests implemented by the NIST group. A test cycle is initiated with an analysis and planning phase, typically coordinated by a group of researchers, research sponsors, and NIST staff. During this phase, test protocols and implementation schedules are defined. A data-collection phase leads to the creation or identification of standardized speech and natural language corpora, which are distributed to a community of core technology developers. In most cases, NIST holds a portion of the corpora in reserve as performance-assessment test material. At agreed-upon times, NIST defines and releases development and evaluation test sets to the core technology developers, and they, in turn, provide NIST with the results of their locally implemented tests. NIST then produces a detailed set of uniformly scored, tabulated results, including the results of numerous paired-comparison statistical significance tests and other analyses. These test results and their scientific implications then become an important matter for discussion at technical meetings.

The extent of NIST's work is illustrated by a look at some 60 technical papers on speech recognition submitted to the 1996 IEEE International Conference on Acoustics, Speech and Signal Processing. Twenty-eight of the 60 papers reported results based on NIST-defined test data, test methodologies, and NIST-implemented benchmark tests. Of these 28 papers, 16 were by researchers in the United States and 12 were from other nations.

From Dragon Systems' perspective, the NIST reference speech database measurement and testing methodologies are important for research and necessary to advance the technology. While ideas are plentiful, testing is expensive; researchers and research resources are costly. So sharing data makes sense. Large common databases are statistically more meaningful than smaller proprietary ones, and using these large databases minimizes dead-end approaches.
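The uniform scoring mentioned above rests on aligning each system's hypothesized word string against a reference transcription and counting substitutions, insertions, and deletions. The sketch below is a minimal illustration of that kind of word-error-rate computation via dynamic programming; it is not NIST's actual scoring software, and the example sentences are invented.

```python
def word_error_rate(reference, hypothesis):
    """Return WER = (substitutions + insertions + deletions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # d[i][j] = minimum edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i                          # i deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j                          # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j - 1] + sub,   # match or substitution
                          d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1)          # insertion
    return d[len(ref)][len(hyp)] / len(ref)

# Two errors ("the"->"a", "flights"->"flight") over six reference words
print(word_error_rate("show me the flights to boston",
                      "show me a flight to boston"))  # 0.3333...
```

Scoring every system's output with the same alignment procedure against the same held-out references is what makes results from different laboratories directly comparable.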
At the speech-recognition workshops where the results of NIST's benchmark tests are presented, there are opportunities to compare results and the different approaches pursued at different laboratories. In this way the entire community benefits.
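Because every site is evaluated on the same utterances, comparisons between systems can use matched pairs rather than pooled error rates. As a hedged illustration of the idea behind the paired-comparison significance tests mentioned earlier (NIST's actual analyses used more refined matched-pair procedures), here is a simple two-sided sign test on hypothetical per-utterance error counts from two recognizers:

```python
from math import comb

def sign_test(errors_a, errors_b):
    """Two-sided sign-test p-value for paired per-utterance error counts.

    Ties (utterances where both systems make the same number of errors)
    carry no information about which system is better and are dropped.
    """
    wins_a = sum(1 for a, b in zip(errors_a, errors_b) if a < b)
    wins_b = sum(1 for a, b in zip(errors_a, errors_b) if a > b)
    n = wins_a + wins_b
    k = min(wins_a, wins_b)
    # Probability of at most k heads in n fair coin flips, doubled for two sides
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Invented per-utterance word-error counts for two systems on ten utterances
a = [0, 1, 0, 2, 1, 0, 3, 1, 0, 1]
b = [1, 2, 1, 2, 2, 1, 3, 2, 1, 2]
print(sign_test(a, b))  # 0.0078125: system A is significantly better here
```

A test like this can declare one system reliably better even when the absolute difference in overall error rate is small, which is exactly the kind of conclusion the workshop discussions turn on.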