This is a public domain speech dataset consisting of 13,100 short audio clips of a single speaker reading passages from 7 non-fiction books. A transcription is provided for each clip. Clips vary in length from 1 to 10 seconds and have a total length of approximately 24 hours.
Statistic | Value |
---|---|
Total Clips | 13,100 |
Total Words | 225,715 |
Total Characters | 1,308,678 |
Total Duration | 23:55:17 |
Mean Clip Duration | 6.57 sec |
Min Clip Duration | 1.11 sec |
Max Clip Duration | 10.10 sec |
Mean Words per Clip | 17.23 |
Distinct Words | 13,821 |
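
The per-clip means in the table follow from the totals. As a quick consistency check, the short sketch below recomputes them from the table's own figures; the variable names are illustrative and not part of the dataset.

```python
# Values copied from the statistics table above.
total_clips = 13_100
total_words = 225_715
hours, minutes, seconds = 23, 55, 17  # Total Duration 23:55:17

# Convert the total duration to seconds, then divide by the clip count.
total_seconds = hours * 3600 + minutes * 60 + seconds
mean_clip_duration = total_seconds / total_clips
mean_words_per_clip = total_words / total_clips

print(f"{total_seconds} s total")                       # 86117 s total
print(f"{mean_clip_duration:.2f} s per clip")           # 6.57 s per clip
print(f"{mean_words_per_clip:.2f} words per clip")      # 17.23 words per clip
```

Both results agree with the Mean Clip Duration and Mean Words per Clip rows.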
Metadata is provided in transcripts.csv. This file consists of one record per line, delimited by the pipe character (0x7c). The fields are:
Each audio file is a single-channel 16-bit PCM WAV with a sample rate of 22050 Hz.
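
The following is a minimal sketch of how the two formats described above might be handled with the Python standard library. The function names (`parse_record`, `check_wav_format`, `clip_duration_seconds`) are hypothetical helpers, not part of the dataset. The parser returns the raw fields without assigning them names, and it assumes the pipe character never occurs inside a field.

```python
import wave


def parse_record(line):
    """Split one metadata record into its raw fields.

    Assumption: fields are separated by '|' (0x7c) and never contain it.
    """
    return line.rstrip("\r\n").split("|")


def check_wav_format(path):
    """Return True if the file is mono, 16-bit PCM, sampled at 22050 Hz."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getnchannels() == 1        # single channel
            and wav.getsampwidth() == 2    # 2 bytes per sample = 16 bits
            and wav.getframerate() == 22050
        )


def clip_duration_seconds(path):
    """Return the clip length in seconds (frame count / sample rate)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()
```

Calling `check_wav_format` on each clip is a cheap way to confirm that a download matches the stated format before feeding it to a training pipeline, and summing `clip_duration_seconds` over all clips should land close to the total duration reported in the table.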
@misc{ljspeech17,
author = {Keith Ito and Linda Johnson},
title = {The LJ Speech Dataset},
howpublished = {\url{https://keithito.com/LJ-Speech-Dataset/}},
year = 2017
}