Free Spoken Digit
Created by Data Decorators / AChenQ

Overview

A simple audio/speech dataset consisting of recordings of spoken digits in wav files at 8kHz. The recordings are trimmed so that they have near minimal silence at the beginnings and ends.
FSDD is an open dataset, which means it will grow over time as data is contributed. In order to enable reproducibility and accurate citation the dataset is versioned using Zenodo DOI as well as git tags.

Data Collection

Please contribute your homemade recordings. All recordings should be mono 8kHz wav files and be trimmed to have minimal silence. Don't forget to update metadata.py with the speaker meta-data.
To add your data, follow the recording instructions in acquire_data/say_numbers_prompt.py and then run split_and_label_numbers.py to make your files.
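
Before contributing, it may help to confirm that a recording meets the stated format (mono, 8kHz wav). The following is a minimal sketch using only Python's standard `wave` module; the function name `check_recording` is hypothetical and is not part of the FSDD repository, and the check does not measure leading or trailing silence.

```python
import sys
import wave


def check_recording(path, expected_rate=8000, expected_channels=1):
    """Return a list of format problems for a wav file (empty list means OK).

    Hypothetical helper, not part of the FSDD scripts. It only checks the
    sample rate and channel count; it does not check silence trimming.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
    if rate != expected_rate:
        problems.append(f"sample rate is {rate} Hz, expected {expected_rate} Hz")
    if channels != expected_channels:
        problems.append(f"file has {channels} channel(s), expected {expected_channels}")
    return problems


if __name__ == "__main__":
    # Usage: python check_recording.py file1.wav file2.wav ...
    for wav_path in sys.argv[1:]:
        issues = check_recording(wav_path)
        print(wav_path, "OK" if not issues else "; ".join(issues))
```

An empty result only means the header values match; the recording should still be listened to and trimmed as described above.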

Data Format

Files are named in the following format: {digitLabel}_{speakerName}_{index}.wav (example: 7_jackson_32.wav).
The dataset currently contains 3,000 recordings (50 of each digit per speaker) from 6 speakers, all with English pronunciation.
metadata.py contains metadata about the speakers' gender and accent.
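
Because the label and speaker are encoded in the file name, they can be recovered without opening the audio. The sketch below assumes the underscore-separated pattern shown in the example (7_jackson_32.wav); `Recording`, `parse_filename`, and `count_per_speaker_digit` are hypothetical names used for illustration, not part of the dataset's own tooling.

```python
import os
from collections import Counter
from typing import NamedTuple


class Recording(NamedTuple):
    digit: int
    speaker: str
    index: int


def parse_filename(filename):
    """Split '{digitLabel}_{speakerName}_{index}.wav' into its three fields.

    Raises ValueError if the name does not follow that pattern.
    """
    stem, ext = os.path.splitext(os.path.basename(filename))
    if ext.lower() != ".wav":
        raise ValueError(f"not a wav file: {filename}")
    # Split the digit off the front and the index off the back, so a
    # speaker name containing underscores would still be kept whole.
    digit, rest = stem.split("_", 1)
    speaker, index = rest.rsplit("_", 1)
    return Recording(int(digit), speaker, int(index))


def count_per_speaker_digit(filenames):
    """Count recordings for each (speaker, digit) pair."""
    counts = Counter()
    for name in filenames:
        rec = parse_filename(name)
        counts[(rec.speaker, rec.digit)] += 1
    return counts


# Example: parse_filename("7_jackson_32.wav")
# returns Recording(digit=7, speaker='jackson', index=32)
```

Applied to a full copy of the dataset, `count_per_speaker_digit` gives a quick way to check the stated balance of 50 recordings of each digit per speaker.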

License

CC BY-SA 4.0

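Dataset Information

Application scenarios: Voice Print Recognition, ASR
Annotation type: Classification
License: CC BY-SA 4.0
Last updated: 2021-03-24 23:41:44

Data Summary

Data format: Audio
Number of items: 6k
File size: 40MB
Number of annotations: 6000
Copyright owner: Zohar Jackson
Annotator: Unknown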