Free Spoken Digit
Commit a5a6dbf · Jun 27, 2021 1:47 PM · 5 commits

Overview

The Free Spoken Digit Dataset (FSDD) is a simple audio/speech dataset consisting of recordings of spoken digits in WAV files at 8 kHz. The recordings are trimmed so that they have near-minimal silence at the beginning and end.
FSDD is an open dataset, which means it will grow over time as data is contributed. To enable reproducibility and accurate citation, the dataset is versioned using a Zenodo DOI as well as git tags.

Data Collection

Please contribute your own recordings. All recordings should be mono 8 kHz WAV files, trimmed to have minimal silence. Don't forget to update metadata.py with the speaker metadata.
To add your data, follow the recording instructions in acquire_data/say_numbers_prompt.py and then run split_and_label_numbers.py to make your files.
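Before contributing, the format requirement above (mono, 8 kHz WAV) can be checked from the file header. The following is a minimal sketch using only Python's standard `wave` module. The function name `check_recording` is hypothetical and is not part of the FSDD repository, and the check covers only channel count and sample rate, not silence trimming.

```python
import wave


def check_recording(path, expected_rate=8000, expected_channels=1):
    """Return a list of format problems found in a WAV header (empty if none).

    Hypothetical helper for illustration; not part of the FSDD scripts.
    """
    problems = []
    # wave.open reads the header of uncompressed PCM WAV files.
    with wave.open(str(path), "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
    if channels != expected_channels:
        problems.append(f"expected {expected_channels} channel(s), got {channels}")
    if rate != expected_rate:
        problems.append(f"expected {expected_rate} Hz, got {rate} Hz")
    return problems
```

An empty list means the header matches the stated requirements; any other result lists what would need to be fixed before submitting the file.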

Data Format

Files are named in the following format: {digitLabel}_{speakerName}_{index}.wav
Example: 7_jackson_32.wav
The dataset currently contains 3,000 recordings (50 of each digit per speaker) from 6 speakers, with English pronunciations.
metadata.py contains metadata about the speakers' gender and accents.
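Because the label and speaker are encoded in the file name, they can be recovered without opening the audio. The sketch below parses names of the form shown above. The `Recording` type and `parse_filename` function are hypothetical names introduced for illustration, and the pattern assumes a single-digit label and a speaker name that contains no underscores.

```python
import re
from typing import NamedTuple


class Recording(NamedTuple):
    digit: int
    speaker: str
    index: int


# Assumption: one digit, a speaker name without underscores, a numeric index.
_PATTERN = re.compile(r"^(?P<digit>\d)_(?P<speaker>[^_]+)_(?P<index>\d+)\.wav$")


def parse_filename(name: str) -> Recording:
    """Split an FSDD-style file name into (digit, speaker, index)."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"not an FSDD-style file name: {name!r}")
    return Recording(int(match["digit"]), match["speaker"], int(match["index"]))


print(parse_filename("7_jackson_32.wav"))
# Recording(digit=7, speaker='jackson', index=32)
```

The stated total is consistent with this layout: 6 speakers × 10 digits × 50 recordings = 3,000 files.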

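Thanks to Data Decorators for their contribution.

Dataset Information

Application scenario: None
Annotation type: Classification
Task type: Voice Print Recognition, ASR
License: CC BY-SA 4.0
Updated: 2021-03-24 23:35:36

Data Summary

Data format: Audio
Number of items: 6K
Number annotated: 7999
File size: 40 MB
Copyright holder: Zohar Jackson
Annotator: Unknown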