id (int32) | image (image) | bboxes (string)
---|---|---
1 | | "[{'type': 'box', 'label': 'numbers', 'points': [(1655.38, 519.7), (1851.68, 561.4)], 'attributes': [{'name': 'text', 'text': '61640132'}]}]"
2 | | "[{'type': 'box', 'label': 'numbers', 'points': [(248.8, 241.23), (352.9, 282.22)], 'attributes': [{'name': 'text', 'text': '55638167'}]}]"
3 | | "[{'type': 'box', 'label': 'numbers', 'points': [(504.4, 221.5), (600.4, 240.4)], 'attributes': [{'name': 'text', 'text': '95095857'}]}]"
4 | | "[{'type': 'box', 'label': 'numbers', 'points': [(719.93, 185.34), (828.27, 229.45)], 'attributes': [{'name': 'text', 'text': '63164818'}]}]"
5 | | "[{'type': 'box', 'label': 'numbers', 'points': [(785.2, 240.1), (1010.1, 284.7)], 'attributes': [{'name': 'text', 'text': '63673149'}]}]"
6 | | "[{'type': 'box', 'label': 'numbers', 'points': [(483.47, 286.6), (612.27, 322.83)], 'attributes': [{'name': 'text', 'text': '60517067'}]}]"
7 | | "[{'type': 'box', 'label': 'numbers', 'points': [(295.5, 93.89), (438.09, 133.59)], 'attributes': [{'name': 'text', 'text': '62071246'}]}]"
8 | | "[{'type': 'box', 'label': 'numbers', 'points': [(218.49, 14.1), (328.19, 48.68)], 'attributes': [{'name': 'text', 'text': '52605870'}]}]"
9 | | "[{'type': 'box', 'label': 'numbers', 'points': [(142.71, 257.5), (235.2, 276.4)], 'attributes': [{'name': 'text', 'text': '95500344'}]}]"
10 | | "[{'type': 'box', 'label': 'numbers', 'points': [(189.29, 56.4), (261.72, 78.7)], 'attributes': [{'name': 'text', 'text': '61677258'}]}]"
11 | | "[{'type': 'box', 'label': 'numbers', 'points': [(389.2, 141.66), (482.4, 162.4)], 'attributes': [{'name': 'text', 'text': '42092288'}]}]"
12 | | "[{'type': 'box', 'label': 'numbers', 'points': [(132.8, 164.7), (226.1, 183.8)], 'attributes': [{'name': 'text', 'text': '50915024'}]}]"
13 | | "[{'type': 'box', 'label': 'numbers', 'points': [(495.2, 208.11), (618.1, 234.79)], 'attributes': [{'name': 'text', 'text': '68965877'}]}]"
OCR Trains Dataset
The dataset consists of images of trains annotated for optical character recognition (OCR). Each image is labeled with a bounding box around the train number and the text of that number.
The dataset can be used to train machine learning models that extract and analyze text from train-related documents or images, to develop algorithms or models for real-time updates, or to build intelligent systems related to trains and transportation.
Get the dataset
This is only a sample of the data.
Leave a request at https://trainingdata.pro/data-market to discuss your requirements, learn about the price, and buy the dataset.
Dataset structure
- images - contains the original images of trains
- annotations.xml - contains the bounding-box coordinates and the annotated text for each original photo
Data Format
Each image in the images folder is accompanied by an XML annotation in the annotations.xml file, which gives the coordinates of the bounding boxes for text detection. For each point, the x and y coordinates are provided.
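The preview rows at the top of this card carry the same information in the bboxes column, stored as a string containing a Python-style list of dictionaries. The following is a minimal sketch of how such a string might be parsed; the helper name `parse_bboxes` and the output keys are hypothetical, and it assumes the two entries in `points` are opposite corners of the box, which the card does not state explicitly.

```python
import ast


def parse_bboxes(raw: str) -> list[dict]:
    """Parse a bboxes string from the preview into a list of flat box records.

    Assumption: 'points' holds two opposite corners of the box as (x, y) pairs.
    """
    boxes = []
    # literal_eval safely evaluates the Python-literal string (lists, dicts, tuples).
    for item in ast.literal_eval(raw):
        (x1, y1), (x2, y2) = item["points"]
        # Collect attributes by name, e.g. {"text": "55638167"}.
        attrs = {a["name"]: a["text"] for a in item.get("attributes", [])}
        boxes.append({
            "label": item["label"],
            "x_min": min(x1, x2),
            "y_min": min(y1, y2),
            "x_max": max(x1, x2),
            "y_max": max(y1, y2),
            "text": attrs.get("text"),
        })
    return boxes


# Row 2 of the preview table, copied verbatim.
raw = "[{'type': 'box', 'label': 'numbers', 'points': [(248.8, 241.23), (352.9, 282.22)], 'attributes': [{'name': 'text', 'text': '55638167'}]}]"
box = parse_bboxes(raw)[0]
print(box["label"], box["text"])
```

`ast.literal_eval` is used instead of `json.loads` because the strings use single quotes and tuples, which are not valid JSON.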
Example of XML file structure
Text detection in train images can be performed according to your requirements.
**TrainingData**
More datasets in TrainingData's Kaggle account: https://www.kaggle.com/trainingdatapro/datasets
TrainingData's GitHub: https://github.com/Trainingdata-datamarket/TrainingData_All_datasets