translation |
---|
{
"af": "Lewens:",
"en": "Lives: XX"
} |
{
"af": "Laat veelvuldige Amarok instansies toe om te loop",
"en": "Allow running multiple Amarok instances"
} |
{
"af": "_Vertikale aansig",
"en": "_Vertical View"
} |
{
"af": "_Maak skyf skoon…",
"en": "_Blank Disc…"
} |
{
"af": "Nederlandse sleutelkaart",
"en": "Dutch keymap"
} |
{
"af": "Skep Html Aanbieding",
"en": "Shape Animation"
} |
{
"af": "Speletjie verby",
"en": "Game Over"
} |
{
"af": "Hierdie hand, myne, het hom gedood",
"en": "It was just bad luck. He died at my hand."
} |
{
"af": "fermware ontbreek",
"en": "firmware missing"
} |
{
"af": "Installeer pakkette",
"en": "Installing software"
} |
{
"af": "Figure met getalle",
"en": "Find the number"
} |
{
"af": "Geesdriftige animasie",
"en": "Zealous Animation"
} |
{
"af": "Open Lêer",
"en": "Open File"
} |
{
"af": "Sagter Huidige Laag",
"en": "Mirror Layer Y"
} |
{
"af": "Die aantal myne in 'n pasmaakspel",
"en": "The number of mines in a custom game"
} |
{
"af": "Begin die eerste X-bediener, maar halt dan totdat ons 'n GAAN kry in die eieu",
"en": "Start the first X server but then halt until we get a GO in the fifo"
} |
{
"af": "Kho",
"en": "Kho"
} |
{
"af": "Voo_rskou",
"en": "No preview"
} |
{
"af": "Outomaties bespeurde enkoderings",
"en": "Automatically Detected Encodings"
} |
{
"af": "Westelike Europees",
"en": "Western European"
} |
{
"af": "Wagwoord",
"en": "Password Settings"
} |
{
"af": "Groep Eienskappe",
"en": "Group Properties"
} |
{
"af": "Dis so mooi! Die bloed wat uitkom",
"en": "What blood can be nice ."
} |
{
"af": "Name=Plaaslike Netwerk",
"en": "Name=KDE Plugin Information"
} |
{
"af": "Luister na die voëls",
"en": "Listen to the birds."
} |
{
"af": "ruitetwee",
"en": "Move ~a onto the two of diamonds."
} |
{
"af": "SiprusCountry name (optional, but should be translated)",
"en": "Cyprus"
} |
{
"af": "Jy is met die goue lepel gebore, en jy's jammer daaroor?",
"en": "You're another one of those \"no future\" types. - Wait! - No responsibility!"
} |
{
"af": "Ek het jou een geskuld",
"en": "I owed you for one"
} |
{
"af": "klawervyf",
"en": "five of clubs"
} |
{
"af": "Gevaal",
"en": "Failed"
} |
{
"af": "Wetenskap",
"en": "Science"
} |
{
"af": "Ingevoegde teks",
"en": "Inserted text"
} |
{
"af": "Reeksnommer",
"en": "Real Name:"
} |
{
"af": "SerbieseName",
"en": "Serbian Ijekavian Latin"
} |
{
"af": "Wat is verkeerd met JOU?",
"en": "Anyone else?"
} |
{
"af": "kan begin, sodat hulle",
"en": "their own businesses"
} |
{
"af": "Vinnig! In gelid!",
"en": "Two columns, route step, march!"
} |
{
"af": "Dit mag ek nie sê nie.",
"en": "- I can't tell you."
} |
{
"af": "Handtekening bestaan, maar publieke sleutel word benodig",
"en": "Signature exists, but need public key"
} |
{
"af": "Plaas die plekaanduier onder die inleiding frase",
"en": "Put the cursor below the introduction phrase"
} |
{
"af": "Wissel Diep Lug Objekte",
"en": "Hide Messier objects while moving ?"
} |
{
"af": "Ek wou nog altyd reis en daaroor skryf, soos Budrewicz of Kapuściński, maar ek weet nie hoe dit kan werk nie",
"en": "I've always wanted to travel and write about it. Like Budrewicz or Kapuściński. But I don't know how it would be."
} |
{
"af": "Saam ons's doen ll lewe.",
"en": "we'll live."
} |
{
"af": "Slegs doen 'n rugsteun",
"en": "Next HotSync will be: %1."
} |
{
"af": "Swart dame op %1$s neem wit ruiter op %2$s",
"en": "Black queen at %1$s takes the white knight at %2$s"
} |
{
"af": "N-Z_BAR_Roemeens",
"en": "N-Z_BAR_Romanian"
} |
{
"af": "Vervang",
"en": "& Replace"
} |
{
"af": "Boodskapfilters",
"en": "Message Filters"
} |
{
"af": "LeonCity in California USA",
"en": "Monterey Park"
} |
{
"af": "Jy moet ten minste een drukker kies.",
"en": "You must select at least one printer."
} |
{
"af": "Rol",
"en": "Role"
} |
{
"af": "Boekmerke",
"en": "Bookmarks"
} |
{
"af": "'n Klein foutjie, en ek vergeet dat ek van jou hou.",
"en": "One little mistake and I'll forget I like you."
} |
{
"af": "_Sorteer...",
"en": "_Sort..."
} |
{
"af": "Gelykheid-getallevreters",
"en": "Equality Number Munchers"
} |
{
"af": "verband met die bedrywighede van die Agentskap; en",
"en": "writing, require in connection with the activities of the Agency; and"
} |
{
"af": "Stuurder",
"en": "by Sender"
} |
{
"af": "Geaktiveer",
"en": "Table"
} |
{
"af": "61. (1) Die Owerheid kan regulasies voorskryf wat op uitsaaidienslisensiehouers van",
"en": "61. (1) The Authority may prescribe regulations applicable to broadcasting service"
} |
{
"af": "Ek gaan na die Maldives, wil jy nie saamgaan nie?",
"en": "I'm going to the Maldives. Why don't you come too?"
} |
{
"af": "Areas",
"en": "& Area"
} |
{
"af": "Speletjie",
"en": "Name :"
} |
{
"af": "Huidige Onderhouer",
"en": "Previous Maintainer"
} |
{
"af": "Na die insident, het hy dit net verloor",
"en": "After the incident... he went off the deep end."
} |
{
"af": "(i) die proses en prosedures vir aansoek doen om of registrasie, wysiging, oordrag en hernuwing van een of meer van die lisensies in subartikels (2) en (4) vermeld, uiteengesit word;",
"en": "(i) the process and procedures for applying for or registering, amending, transfering and renewing one or more of the licences specified in subsections (2) and (4);"
} |
{
"af": "Die restaurant?",
"en": "The restaurant?"
} |
{
"af": "Boonste Kas",
"en": "Upper"
} |
{
"af": "wat ek daar beleef het, wat ek...",
"en": "what I experienced there, what I..."
} |
{
"af": "Fonte",
"en": "Fonts"
} |
{
"af": "Filter Log Blaaier ...",
"en": "Filter Log Viewer ..."
} |
{
"af": "Lae Glans",
"en": "Low Gloss"
} |
{
"af": "Lengte",
"en": "Length"
} |
{
"af": "Moet haar liewer niks sê nie, dalk kom alles nog reg",
"en": "Don't tell her anything, maybe things get better."
} |
{
"af": "Verifieer PIN",
"en": "Verify PIN"
} |
{
"af": "Ek hou daarvan.",
"en": "I liked it."
} |
{
"af": "Wit loper op %1$s neem swart loper op %2$s",
"en": "White bishop at %1$s takes the black bishop at %2$s"
} |
{
"af": "Uitsaaiadres",
"en": "Broadcast"
} |
{
"af": "Kan ek u help?",
"en": "Can I help you?"
} |
{
"af": "Nee",
"en": "Let's go."
} |
{
"af": "Ondersteunde beeldlêers",
"en": "Supported image files"
} |
{
"af": "Horisontaal Gidse",
"en": "Horizontal :"
} |
{
"af": "\"Ek kan Maman nie nou vertel nie, haar oogprobleem word erger",
"en": "Despite the autumn sun, her blinds remain shut."
} |
{
"af": "Cups bediener",
"en": "CUPS Server"
} |
{
"af": "Drukreëlvoumodus",
"en": "Printing Line Wrapping Mode"
} |
{
"af": "Ander vinger:",
"en": "Other finger:"
} |
{
"af": "Name",
"en": "Art inspired by real Japanese Mahjongg tiles"
} |
{
"af": "Aanhegselherinnering",
"en": "Attachment contents not loaded"
} |
{
"af": "Nommer",
"en": "Number"
} |
{
"af": "Verklein venster",
"en": "Hide window title bar"
} |
{
"af": "Ek belowe",
"en": "-Cross my heart."
} |
{
"af": "millisekonde,millisekondes",
"en": "millisecond,milliseconds,ms"
} |
{
"af": "Lyste",
"en": "list"
} |
{
"af": "Klein",
"en": "Small"
} |
{
"af": "Oorweeg dit om iets na 'n leë gleuf te skuif",
"en": "Consider moving something into an empty slot"
} |
{
"af": "Daar in?",
"en": "In there?"
} |
{
"af": "Keuse",
"en": "& Selection of page"
} |
{
"af": "Vertoon Staat ...",
"en": "Show Sheet ..."
} |
{
"af": "Versteekte lêer",
"en": "Hidden file"
} |
{
"af": "Ciske!",
"en": "Ciske!"
} |
Dataset Card for Opus100
Dataset Summary
OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side. The corpus covers 100 languages (including English). The languages were selected based on the volume of parallel data available in OPUS.
Supported Tasks and Leaderboards
[More Information Needed]
Languages
OPUS-100 contains approximately 55M sentence pairs. Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.
Dataset Structure
Data Instances
{
"ca": "El departament de bombers té el seu propi equip d'investigació.",
"en": "Well, the fire department has its own investigative unit."
}
Data Fields
- src_tag: string, text in the source language
- tgt_tag: string, translation of the source text in the target language
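As a minimal sketch of how these fields can be read, the snippet below pulls the source and target text out of one record shaped like the instance under Data Instances. The helper name `split_pair` is hypothetical and not part of any library; it only assumes that a record is a plain mapping from language tag to text.

```python
from typing import Dict, Tuple


def split_pair(record: Dict[str, str], src_tag: str, tgt_tag: str) -> Tuple[str, str]:
    """Return (source_text, target_text) from one translation record.

    `split_pair` is a hypothetical helper for illustration; it assumes the
    record maps language tags (e.g. "ca", "en") to sentences.
    """
    missing = [tag for tag in (src_tag, tgt_tag) if tag not in record]
    if missing:
        raise KeyError(f"record has no text for language tag(s): {missing}")
    return record[src_tag], record[tgt_tag]


# The record from the Data Instances section above.
record = {
    "ca": "El departament de bombers té el seu propi equip d'investigació.",
    "en": "Well, the fire department has its own investigative unit.",
}
source, target = split_pair(record, src_tag="ca", tgt_tag="en")
```

Because the corpus is English-centric, one of the two tags is always "en"; either side can serve as the source.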
Data Splits
The dataset is split into training, development, and test portions. The data was prepared by randomly sampling up to 1M sentence pairs per language pair for training and up to 2000 each for development and test. To ensure that there was no overlap (at the monolingual sentence level) between the training and development/test data, the authors applied a filter during sampling to exclude sentences that had already been sampled. Note that this was done cross-lingually so that, for instance, an English sentence in the Portuguese-English portion of the training data could not occur in the Hindi-English test set.
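The sampling and filtering idea can be sketched roughly as follows. This is not the authors' actual procedure: the function name `sample_splits`, the order of operations (development and test sets drawn first for every language pair, then training data filtered against them), and the toy data are all assumptions made for illustration.

```python
import random
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]  # (non-English sentence, English sentence)


def sample_splits(
    corpora: Dict[str, List[Pair]],
    max_train: int = 1_000_000,
    max_heldout: int = 2000,
    seed: int = 0,
) -> Dict[str, Dict[str, List[Pair]]]:
    """Hypothetical sketch: sample dev/test/train with a cross-lingual overlap filter."""
    rng = random.Random(seed)
    heldout: Set[str] = set()  # every sentence used in any dev or test set
    splits: Dict[str, Dict[str, List[Pair]]] = {}
    remaining: Dict[str, List[Pair]] = {}

    # Pass 1: draw dev and test for every language pair and record their sentences.
    for lang in sorted(corpora):
        pairs = list(corpora[lang])
        rng.shuffle(pairs)
        dev = pairs[:max_heldout]
        test = pairs[max_heldout:2 * max_heldout]
        remaining[lang] = pairs[2 * max_heldout:]
        splits[lang] = {"dev": dev, "test": test}
        for src, en in dev + test:
            heldout.update((src, en))

    # Pass 2: keep a training pair only if neither side appears in any held-out set,
    # regardless of which language pair that held-out sentence came from.
    for lang, pairs in remaining.items():
        train = [p for p in pairs if p[0] not in heldout and p[1] not in heldout]
        splits[lang]["train"] = train[:max_train]
    return splits


# Toy corpora in which Portuguese and Hindi share the same English sentences.
toy = {
    "pt": [(f"pt{i}", f"en{i}") for i in range(10)],
    "hi": [(f"hi{i}", f"en{i}") for i in range(10)],
}
toy_splits = sample_splits(toy, max_train=5, max_heldout=2, seed=1)
```

Keeping a single set of held-out sentences across all language pairs is what makes the filter cross-lingual: an English sentence drawn into the Hindi-English test set is also removed from the Portuguese-English training candidates.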
Dataset Creation
Curation Rationale
[More Information Needed]
Source Data
[More Information Needed]
Initial Data Collection and Normalization
[More Information Needed]
Who are the source language producers?
[More Information Needed]
Annotations
Annotation process
[More Information Needed]
Who are the annotators?
[More Information Needed]
Personal and Sensitive Information
[More Information Needed]
Considerations for Using the Data
Social Impact of Dataset
[More Information Needed]
Discussion of Biases
[More Information Needed]
Other Known Limitations
[More Information Needed]
Additional Information
Dataset Curators
[More Information Needed]
Licensing Information
[More Information Needed]
Citation Information
@misc{zhang2020improving,
title={Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation},
author={Biao Zhang and Philip Williams and Ivan Titov and Rico Sennrich},
year={2020},
eprint={2004.11867},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Contributions
Thanks to @vasudevgupta7 for adding this dataset.