THCHS-30
Created by Data Decorators / AChenQ

Overview

Speech data is crucially important for speech recognition research. Quite a few speech databases can be purchased at prices that are reasonable for most research institutes. However, for young researchers who are just starting out, or for those who have only recently become interested in this direction, the cost of data is still an annoying barrier. We support the "free data" movement in speech recognition: research institutes (particularly those supported by public funds) publish their data freely so that new researchers can obtain sufficient data to kick off their careers. Here, we follow this trend and release a free Chinese speech database, THCHS-30, that can be used to build a full-fledged Chinese speech recognition system.

Citation

Please use the following citation when referencing the dataset:

@article{DBLP:journals/corr/WangZ15e,
  author    = {Dong Wang and
               Xuewei Zhang},
  title     = {{THCHS-30} : {A} Free Chinese Speech Corpus},
  journal   = {CoRR},
  volume    = {abs/1512.01882},
  year      = {2015},
  url       = {http://arxiv.org/abs/1512.01882},
  archivePrefix = {arXiv},
  eprint    = {1512.01882},
  timestamp = {Mon, 13 Aug 2018 16:46:59 +0200},
  biburl    = {https://dblp.org/rec/journals/corr/WangZ15e.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

License

Custom

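Dataset information

Application scenarios: NLP, ASR
Annotation type: SentenceText
License: Custom
Last updated: 2021-03-24 23:42:43

Data summary

Data format: Audio
Number of data items: 13.39k
File size: 4GB
Number of annotations: 0
Copyright holder: CSLT at Tsinghua University
Annotated by: Unknown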