SynthText in the Wild
Created by Hello Dataset / Robert

Overview

This is a synthetically generated dataset, in which word instances are placed in natural scene images, while taking into account the scene layout.

The dataset consists of roughly 800 thousand images containing approximately 8 million synthetic word instances. Each text instance is annotated with its text string and with word-level and character-level bounding boxes.

Data Format

SynthText in the Wild Dataset

Ankush Gupta, Andrea Vedaldi, and Andrew Zisserman
Visual Geometry Group, University of Oxford, 2016

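SynthText.zip (about 41 GB; the listed size of 42074172 is presumably in kilobytes rather than bytes, since 42,074,172 bytes would be only about 42 MB) contains 858,750 synthetic scene-image files (.jpg) split into 200 directories, with 7,266,866 word instances and 28,971,487 characters.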

Ground-truth annotations are contained in the file "gt.mat" (Matlab format). The file "gt.mat" contains the following cell-arrays, each of size 1x858750:

  1. imnames : names of the image files

  2. wordBB : word-level bounding-boxes for each image, represented by tensors of size 2x4xNWORDS_i, where:

    • the first dimension is 2 for x and y respectively,
    • the second dimension corresponds to the 4 points (clockwise, starting from top-left), and
    • the third dimension, of size NWORDS_i, corresponds to the number of words in the i-th image.

  3. charBB : character-level bounding-boxes for each image, represented by tensors of size 2x4xNCHARS_i (same format as wordBB above)

  4. txt : text-strings contained in each image (char array).

             Words which belong to the same "instance", i.e.,
             those rendered in the same region with the same font, color,
             distortion etc., are grouped together; the instance
             boundaries are demarcated by the line-feed character (ASCII: 10).

             A "word" is any contiguous substring of non-whitespace
             characters.

             A "character" is defined as any non-whitespace character.
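
The layout described above can be unpacked with a few small helpers. The following is a minimal Python sketch, not part of the original distribution. It assumes "gt.mat" has been loaded with scipy.io.loadmat, that each 1x858750 cell-array is exposed as an object array indexed as gt[key][0][i], and that an image with a single word or character may come back as a 2x4 array because the trailing dimension of size 1 is dropped. The function names (boxes_to_polygons, split_words, count_characters, load_image_annotation) are hypothetical.

```python
import numpy as np


def boxes_to_polygons(bb):
    """Convert a 2x4xN box tensor into an N x 4 x 2 array of (x, y) corners.

    The input follows the wordBB/charBB layout: axis 0 is (x, y), axis 1 is
    the 4 corner points (clockwise from top-left), axis 2 indexes the boxes.
    """
    bb = np.asarray(bb, dtype=float)
    if bb.ndim == 2:
        # Assumption: a single box may be stored as 2x4, so restore the box axis.
        bb = bb[:, :, np.newaxis]
    # Reorder axes from (xy, point, box) to (box, point, xy).
    return bb.transpose(2, 1, 0)


def split_words(txt):
    """Flatten the text instances of one image into a list of words.

    Instances are separated by line feeds and words by any whitespace, so
    splitting each entry on whitespace yields the words in order.
    """
    words = []
    for instance in txt:
        words.extend(str(instance).split())
    return words


def count_characters(words):
    """Count non-whitespace characters, matching the README's definition."""
    return sum(len(word) for word in words)


def load_image_annotation(gt, i):
    """Return (name, word polygons, char polygons, words) for the i-th image.

    `gt` is assumed to be the dict returned by scipy.io.loadmat("gt.mat");
    the [0][i] indexing is an assumption about how the cell-arrays are exposed.
    """
    name = str(gt["imnames"][0][i][0])
    word_polys = boxes_to_polygons(gt["wordBB"][0][i])
    char_polys = boxes_to_polygons(gt["charBB"][0][i])
    words = split_words(gt["txt"][0][i])
    return name, word_polys, char_polys, words
```

Under these assumptions, the number of words returned for an image should equal NWORDS_i (the number of word polygons), and count_characters(words) should equal NCHARS_i, which gives a quick consistency check after loading.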

For any questions or comments, contact Ankush Gupta at: removethisifyouarehuman-ankush@robots.ox.ac.uk

Citation

If you use this data, please cite:

@InProceedings{Gupta16,
  author       = "Ankush Gupta and Andrea Vedaldi and Andrew Zisserman",
  title        = "Synthetic Data for Text Localisation in Natural Images",
  booktitle    = "IEEE Conference on Computer Vision and Pattern Recognition",
  year         = "2016",
}
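Dataset information
Application: OCR/Text Detection
Annotation type: Polygon2D
License: Unknown
Last updated: 2021-03-24 22:56:01
Data summary
Data format: Image
Number of items: 800k
File size: 38MB
Number of annotations: 0
Copyright holder: Visual Geometry Group
Annotated by: Unknown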