Datasets:
id (string) | audio (audio) | file (string) | text (string) | normalized_text (string) |
---|---|---|---|---|
"LJ001-0001" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0001.wav" | "Printing, in the only sense with which we are at present concerned, differs from most if not from all the arts and crafts represented in the Exhibition" | "Printing, in the only sense with which we are at present concerned, differs from most if not from all the arts and crafts represented in the Exhibition" |
|
"LJ001-0002" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0002.wav" | "in being comparatively modern." | "in being comparatively modern." |
|
"LJ001-0003" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0003.wav" | "For although the Chinese took impressions from wood blocks engraved in relief for centuries before the woodcutters of the Netherlands, by a similar process" | "For although the Chinese took impressions from wood blocks engraved in relief for centuries before the woodcutters of the Netherlands, by a similar process" |
|
"LJ001-0004" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0004.wav" | "produced the block books, which were the immediate predecessors of the true printed book," | "produced the block books, which were the immediate predecessors of the true printed book," |
|
"LJ001-0005" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0005.wav" | "the invention of movable metal letters in the middle of the fifteenth century may justly be considered as the invention of the art of printing." | "the invention of movable metal letters in the middle of the fifteenth century may justly be considered as the invention of the art of printing." |
|
"LJ001-0006" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0006.wav" | "And it is worth mention in passing that, as an example of fine typography," | "And it is worth mention in passing that, as an example of fine typography," |
|
"LJ001-0007" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0007.wav" | "the earliest book printed with movable types, the Gutenberg, or "forty-two line Bible" of about 1455," | "the earliest book printed with movable types, the Gutenberg, or "forty-two line Bible" of about fourteen fifty-five," |
|
"LJ001-0008" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0008.wav" | "has never been surpassed." | "has never been surpassed." |
|
"LJ001-0009" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0009.wav" | "Printing, then, for our purpose, may be considered as the art of making books by means of movable types." | "Printing, then, for our purpose, may be considered as the art of making books by means of movable types." |
|
"LJ001-0010" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0010.wav" | "Now, as all books not primarily intended as picture-books consist principally of types composed to form letterpress," | "Now, as all books not primarily intended as picture-books consist principally of types composed to form letterpress," |
|
"LJ001-0011" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0011.wav" | "it is of the first importance that the letter used should be fine in form;" | "it is of the first importance that the letter used should be fine in form;" |
|
"LJ001-0012" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0012.wav" | "especially as no more time is occupied, or cost incurred, in casting, setting, or printing beautiful letters" | "especially as no more time is occupied, or cost incurred, in casting, setting, or printing beautiful letters" |
|
"LJ001-0013" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0013.wav" | "than in the same operations with ugly ones." | "than in the same operations with ugly ones." |
|
"LJ001-0014" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0014.wav" | "And it was a matter of course that in the Middle Ages, when the craftsmen took care that beautiful form should always be a part of their productions whatever they were," | "And it was a matter of course that in the Middle Ages, when the craftsmen took care that beautiful form should always be a part of their productions whatever they were," |
|
"LJ001-0015" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0015.wav" | "the forms of printed letters should be beautiful, and that their arrangement on the page should be reasonable and a help to the shapeliness of the letters themselves." | "the forms of printed letters should be beautiful, and that their arrangement on the page should be reasonable and a help to the shapeliness of the letters themselves." |
|
"LJ001-0016" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0016.wav" | "The Middle Ages brought calligraphy to perfection, and it was natural therefore" | "The Middle Ages brought calligraphy to perfection, and it was natural therefore" |
|
"LJ001-0017" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0017.wav" | "that the forms of printed letters should follow more or less closely those of the written character, and they followed them very closely." | "that the forms of printed letters should follow more or less closely those of the written character, and they followed them very closely." |
|
"LJ001-0018" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0018.wav" | "The first books were printed in black letter, i.e. the letter which was a Gothic development of the ancient Roman character," | "The first books were printed in black letter, i.e. the letter which was a Gothic development of the ancient Roman character," |
|
"LJ001-0019" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0019.wav" | "and which developed more completely and satisfactorily on the side of the "lower-case" than the capital letters;" | "and which developed more completely and satisfactorily on the side of the "lower-case" than the capital letters;" |
|
"LJ001-0020" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0020.wav" | "the "lower-case" being in fact invented in the early Middle Ages." | "the "lower-case" being in fact invented in the early Middle Ages." |
|
"LJ001-0021" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0021.wav" | "The earliest book printed with movable type, the aforesaid Gutenberg Bible, is printed in letters which are an exact imitation" | "The earliest book printed with movable type, the aforesaid Gutenberg Bible, is printed in letters which are an exact imitation" |
|
"LJ001-0022" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0022.wav" | "of the more formal ecclesiastical writing which obtained at that time; this has since been called "missal type,"" | "of the more formal ecclesiastical writing which obtained at that time; this has since been called "missal type,"" |
|
"LJ001-0023" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0023.wav" | "and was in fact the kind of letter used in the many splendid missals, psalters, etc., produced by printing in the fifteenth century." | "and was in fact the kind of letter used in the many splendid missals, psalters, etc., produced by printing in the fifteenth century." |
|
"LJ001-0024" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0024.wav" | "But the first Bible actually dated (which also was printed at Maintz by Peter Schoeffer in the year 1462)" | "But the first Bible actually dated (which also was printed at Maintz by Peter Schoeffer in the year fourteen sixty-two)" |
|
"LJ001-0025" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0025.wav" | "imitates a much freer hand, simpler, rounder, and less spiky, and therefore far pleasanter and easier to read." | "imitates a much freer hand, simpler, rounder, and less spiky, and therefore far pleasanter and easier to read." |
|
"LJ001-0026" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0026.wav" | "On the whole the type of this book may be considered the ne-plus-ultra of Gothic type," | "On the whole the type of this book may be considered the ne-plus-ultra of Gothic type," |
|
"LJ001-0027" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0027.wav" | "especially as regards the lower-case letters; and type very similar was used during the next fifteen or twenty years not only by Schoeffer," | "especially as regards the lower-case letters; and type very similar was used during the next fifteen or twenty years not only by Schoeffer," |
|
"LJ001-0028" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0028.wav" | "but by printers in Strasburg, Basle, Paris, Lubeck, and other cities." | "but by printers in Strasburg, Basle, Paris, Lubeck, and other cities." |
|
"LJ001-0029" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0029.wav" | "But though on the whole, except in Italy, Gothic letter was most often used" | "But though on the whole, except in Italy, Gothic letter was most often used" |
|
"LJ001-0030" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0030.wav" | "a very few years saw the birth of Roman character not only in Italy, but in Germany and France." | "a very few years saw the birth of Roman character not only in Italy, but in Germany and France." |
|
"LJ001-0031" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0031.wav" | "In 1465 Sweynheim and Pannartz began printing in the monastery of Subiaco near Rome," | "In fourteen sixty-five Sweynheim and Pannartz began printing in the monastery of Subiaco near Rome," |
|
"LJ001-0032" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0032.wav" | "and used an exceedingly beautiful type, which is indeed to look at a transition between Gothic and Roman," | "and used an exceedingly beautiful type, which is indeed to look at a transition between Gothic and Roman," |
|
"LJ001-0033" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0033.wav" | "but which must certainly have come from the study of the twelfth or even the eleventh century MSS." | "but which must certainly have come from the study of the twelfth or even the eleventh century MSS." |
|
"LJ001-0034" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0034.wav" | "They printed very few books in this type, three only; but in their very first books in Rome, beginning with the year 1468," | "They printed very few books in this type, three only; but in their very first books in Rome, beginning with the year fourteen sixty-eight," |
|
"LJ001-0035" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0035.wav" | "they discarded this for a more completely Roman and far less beautiful letter." | "they discarded this for a more completely Roman and far less beautiful letter." |
|
"LJ001-0036" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0036.wav" | "But about the same year Mentelin at Strasburg began to print in a type which is distinctly Roman;" | "But about the same year Mentelin at Strasburg began to print in a type which is distinctly Roman;" |
|
"LJ001-0037" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0037.wav" | "and the next year Gunther Zeiner at Augsburg followed suit;" | "and the next year Gunther Zeiner at Augsburg followed suit;" |
|
"LJ001-0038" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0038.wav" | "while in 1470 at Paris Udalric Gering and his associates turned out the first books printed in France, also in Roman character." | "while in fourteen seventy at Paris Udalric Gering and his associates turned out the first books printed in France, also in Roman character." |
|
"LJ001-0039" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0039.wav" | "The Roman type of all these printers is similar in character," | "The Roman type of all these printers is similar in character," |
|
"LJ001-0040" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0040.wav" | "and is very simple and legible, and unaffectedly designed for use; but it is by no means without beauty." | "and is very simple and legible, and unaffectedly designed for use; but it is by no means without beauty." |
|
"LJ001-0041" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0041.wav" | "It must be said that it is in no way like the transition type of Subiaco," | "It must be said that it is in no way like the transition type of Subiaco," |
|
"LJ001-0042" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0042.wav" | "and though more Roman than that, yet scarcely more like the complete Roman type of the earliest printers of Rome." | "and though more Roman than that, yet scarcely more like the complete Roman type of the earliest printers of Rome." |
|
"LJ001-0043" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0043.wav" | "A further development of the Roman letter took place at Venice." | "A further development of the Roman letter took place at Venice." |
|
"LJ001-0044" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0044.wav" | "John of Spires and his brother Vindelin, followed by Nicholas Jenson, began to print in that city," | "John of Spires and his brother Vindelin, followed by Nicholas Jenson, began to print in that city," |
|
"LJ001-0045" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0045.wav" | "1469, 1470;" | "fourteen sixty-nine, fourteen seventy;" |
|
"LJ001-0046" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0046.wav" | "their type is on the lines of the German and French rather than of the Roman printers." | "their type is on the lines of the German and French rather than of the Roman printers." |
|
"LJ001-0047" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0047.wav" | "Of Jenson it must be said that he carried the development of Roman type as far as it can go:" | "Of Jenson it must be said that he carried the development of Roman type as far as it can go:" |
|
"LJ001-0048" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0048.wav" | "his letter is admirably clear and regular, but at least as beautiful as any other Roman type." | "his letter is admirably clear and regular, but at least as beautiful as any other Roman type." |
|
"LJ001-0049" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0049.wav" | "After his death in the "fourteen eighties," or at least by 1490, printing in Venice had declined very much;" | "After his death in the "fourteen eighties," or at least by fourteen ninety, printing in Venice had declined very much;" |
|
"LJ001-0050" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0050.wav" | "and though the famous family of Aldus restored its technical excellence, rejecting battered letters," | "and though the famous family of Aldus restored its technical excellence, rejecting battered letters," |
|
"LJ001-0051" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0051.wav" | "and paying great attention to the "press work" or actual process of printing," | "and paying great attention to the "press work" or actual process of printing," |
|
"LJ001-0052" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0052.wav" | "yet their type is artistically on a much lower level than Jenson's, and in fact" | "yet their type is artistically on a much lower level than Jenson's, and in fact" |
|
"LJ001-0053" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0053.wav" | "they must be considered to have ended the age of fine printing in Italy." | "they must be considered to have ended the age of fine printing in Italy." |
|
"LJ001-0054" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0054.wav" | "Jenson, however, had many contemporaries who used beautiful type," | "Jenson, however, had many contemporaries who used beautiful type," |
|
"LJ001-0055" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0055.wav" | "some of which -- as, e.g., that of Jacobus Rubeus or Jacques le Rouge -- is scarcely distinguishable from his." | "some of which -- as, e.g., that of Jacobus Rubeus or Jacques le Rouge -- is scarcely distinguishable from his." |
|
"LJ001-0056" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0056.wav" | "It was these great Venetian printers, together with their brethren of Rome, Milan," | "It was these great Venetian printers, together with their brethren of Rome, Milan," |
|
"LJ001-0057" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0057.wav" | "Parma, and one or two other cities, who produced the splendid editions of the Classics, which are one of the great glories of the printer's art," | "Parma, and one or two other cities, who produced the splendid editions of the Classics, which are one of the great glories of the printer's art," |
|
"LJ001-0058" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0058.wav" | "and are worthy representatives of the eager enthusiasm for the revived learning of that epoch. By far," | "and are worthy representatives of the eager enthusiasm for the revived learning of that epoch. By far," |
|
"LJ001-0059" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0059.wav" | "the greater part of these Italian printers, it should be mentioned, were Germans or Frenchmen, working under the influence of Italian opinion and aims." | "the greater part of these Italian printers, it should be mentioned, were Germans or Frenchmen, working under the influence of Italian opinion and aims." |
|
"LJ001-0060" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0060.wav" | "It must be understood that through the whole of the fifteenth and the first quarter of the sixteenth centuries" | "It must be understood that through the whole of the fifteenth and the first quarter of the sixteenth centuries" |
|
"LJ001-0061" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0061.wav" | "the Roman letter was used side by side with the Gothic." | "the Roman letter was used side by side with the Gothic." |
|
"LJ001-0062" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0062.wav" | "Even in Italy most of the theological and law books were printed in Gothic letter," | "Even in Italy most of the theological and law books were printed in Gothic letter," |
|
"LJ001-0063" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0063.wav" | "which was generally more formally Gothic than the printing of the German workmen," | "which was generally more formally Gothic than the printing of the German workmen," |
|
"LJ001-0064" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0064.wav" | "many of whose types, indeed, like that of the Subiaco works, are of a transitional character." | "many of whose types, indeed, like that of the Subiaco works, are of a transitional character." |
|
"LJ001-0065" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0065.wav" | "This was notably the case with the early works printed at Ulm, and in a somewhat lesser degree at Augsburg." | "This was notably the case with the early works printed at Ulm, and in a somewhat lesser degree at Augsburg." |
|
"LJ001-0066" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0066.wav" | "In fact Gunther Zeiner's first type (afterwards used by Schussler) is remarkably like the type of the before-mentioned Subiaco books." | "In fact Gunther Zeiner's first type (afterwards used by Schussler) is remarkably like the type of the before-mentioned Subiaco books." |
|
"LJ001-0067" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0067.wav" | "In the Low Countries and Cologne, which were very fertile of printed books, Gothic was the favorite." | "In the Low Countries and Cologne, which were very fertile of printed books, Gothic was the favorite." |
|
"LJ001-0068" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0068.wav" | "The characteristic Dutch type, as represented by the excellent printer Gerard Leew, is very pronounced and uncompromising Gothic." | "The characteristic Dutch type, as represented by the excellent printer Gerard Leew, is very pronounced and uncompromising Gothic." |
|
"LJ001-0069" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0069.wav" | "This type was introduced into England by Wynkyn de Worde, Caxton's successor," | "This type was introduced into England by Wynkyn de Worde, Caxton's successor," |
|
"LJ001-0070" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0070.wav" | "and was used there with very little variation all through the sixteenth and seventeenth centuries, and indeed into the eighteenth." | "and was used there with very little variation all through the sixteenth and seventeenth centuries, and indeed into the eighteenth." |
|
"LJ001-0071" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0071.wav" | "Most of Caxton's own types are of an earlier character, though they also much resemble Flemish or Cologne letter." | "Most of Caxton's own types are of an earlier character, though they also much resemble Flemish or Cologne letter." |
|
"LJ001-0072" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0072.wav" | "After the end of the fifteenth century the degradation of printing, especially in Germany and Italy," | "After the end of the fifteenth century the degradation of printing, especially in Germany and Italy," |
|
"LJ001-0073" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0073.wav" | "went on apace; and by the end of the sixteenth century there was no really beautiful printing done:" | "went on apace; and by the end of the sixteenth century there was no really beautiful printing done:" |
|
"LJ001-0074" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0074.wav" | "the best, mostly French or Low-Country, was neat and clear, but without any distinction;" | "the best, mostly French or Low-Country, was neat and clear, but without any distinction;" |
|
"LJ001-0075" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0075.wav" | "the worst, which perhaps was the English, was a terrible falling-off from the work of the earlier presses;" | "the worst, which perhaps was the English, was a terrible falling-off from the work of the earlier presses;" |
|
"LJ001-0076" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0076.wav" | "and things got worse and worse through the whole of the seventeenth century, so that in the eighteenth printing was very miserably performed." | "and things got worse and worse through the whole of the seventeenth century, so that in the eighteenth printing was very miserably performed." |
|
"LJ001-0077" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0077.wav" | "In England about this time, an attempt was made (notably by Caslon, who started business in London as a type-founder in 1720)" | "In England about this time, an attempt was made (notably by Caslon, who started business in London as a type-founder in seventeen twenty)" |
|
"LJ001-0078" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0078.wav" | "to improve the letter in form." | "to improve the letter in form." |
|
"LJ001-0079" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0079.wav" | "Caslon's type is clear and neat, and fairly well designed;" | "Caslon's type is clear and neat, and fairly well designed;" |
|
"LJ001-0080" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0080.wav" | "he seems to have taken the letter of the Elzevirs of the seventeenth century for his model:" | "he seems to have taken the letter of the Elzevirs of the seventeenth century for his model:" |
|
"LJ001-0081" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0081.wav" | "type cast from his matrices is still in everyday use." | "type cast from his matrices is still in everyday use." |
|
"LJ001-0082" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0082.wav" | "In spite, however, of his praiseworthy efforts, printing had still one last degradation to undergo." | "In spite, however, of his praiseworthy efforts, printing had still one last degradation to undergo." |
|
"LJ001-0083" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0083.wav" | "The seventeenth century founts were bad rather negatively than positively." | "The seventeenth century founts were bad rather negatively than positively." |
|
"LJ001-0084" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0084.wav" | "But for the beauty of the earlier work they might have seemed tolerable." | "But for the beauty of the earlier work they might have seemed tolerable." |
|
"LJ001-0085" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0085.wav" | "It was reserved for the founders of the later eighteenth century to produce letters which are positively ugly, and which, it may be added," | "It was reserved for the founders of the later eighteenth century to produce letters which are positively ugly, and which, it may be added," |
|
"LJ001-0086" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0086.wav" | "are dazzling and unpleasant to the eye owing to the clumsy thickening and vulgar thinning of the lines:" | "are dazzling and unpleasant to the eye owing to the clumsy thickening and vulgar thinning of the lines:" |
|
"LJ001-0087" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0087.wav" | "for the seventeenth-century letters are at least pure and simple in line. The Italian, Bodoni, and the Frenchman, Didot," | "for the seventeenth-century letters are at least pure and simple in line. The Italian, Bodoni, and the Frenchman, Didot," |
|
"LJ001-0088" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0088.wav" | "were the leaders in this luckless change, though our own Baskerville, who was at work some years before them, went much on the same lines;" | "were the leaders in this luckless change, though our own Baskerville, who was at work some years before them, went much on the same lines;" |
|
"LJ001-0089" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0089.wav" | "but his letters, though uninteresting and poor, are not nearly so gross and vulgar as those of either the Italian or the Frenchman." | "but his letters, though uninteresting and poor, are not nearly so gross and vulgar as those of either the Italian or the Frenchman." |
|
"LJ001-0090" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0090.wav" | "With this change the art of printing touched bottom," | "With this change the art of printing touched bottom," |
|
"LJ001-0091" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0091.wav" | "so far as fine printing is concerned, though paper did not get to its worst till about 1840." | "so far as fine printing is concerned, though paper did not get to its worst till about eighteen forty." |
|
"LJ001-0092" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0092.wav" | "The Chiswick press in 1844 revived Caslon's founts, printing for Messrs. Longman the Diary of Lady Willoughby." | "The Chiswick press in eighteen forty-four revived Caslon's founts, printing for Messrs. Longman the Diary of Lady Willoughby." |
|
"LJ001-0093" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0093.wav" | "This experiment was so far successful that about 1850 Messrs. Miller and Richard of Edinburgh" | "This experiment was so far successful that about eighteen fifty Messrs. Miller and Richard of Edinburgh" |
|
"LJ001-0094" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0094.wav" | "were induced to cut punches for a series of "old style" letters." | "were induced to cut punches for a series of "old style" letters." |
|
"LJ001-0095" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0095.wav" | "These and similar founts, cast by the above firm and others," | "These and similar founts, cast by the above firm and others," |
|
"LJ001-0096" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0096.wav" | "have now come into general use and are obviously a great improvement on the ordinary "modern style" in use in England, which is in fact the Bodoni type" | "have now come into general use and are obviously a great improvement on the ordinary "modern style" in use in England, which is in fact the Bodoni type" |
|
"LJ001-0097" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0097.wav" | "a little reduced in ugliness. The design of the letters of this modern "old style" leaves a good deal to be desired," | "a little reduced in ugliness. The design of the letters of this modern "old style" leaves a good deal to be desired," |
|
"LJ001-0098" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0098.wav" | "and the whole effect is a little too gray, owing to the thinness of the letters." | "and the whole effect is a little too gray, owing to the thinness of the letters." |
|
"LJ001-0099" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0099.wav" | "It must be remembered, however, that most modern printing is done by machinery on soft paper, and not by the hand press," | "It must be remembered, however, that most modern printing is done by machinery on soft paper, and not by the hand press," |
|
"LJ001-0100" | "/storage/hf-datasets-cache/all/datasets/64267897617596-config-parquet-and-info-lj_speech-2d45ae89/downloads/extracted/d0704e5a1e2f25829b7420e4212ba4f371d6dbee87e4bc4ff46cc53a49497ae3/LJSpeech-1.1/wavs/LJ001-0100.wav" | "and these somewhat wiry letters are suitable for the machine process, which would not do justice to letters of more generous design." | "and these somewhat wiry letters are suitable for the machine process, which would not do justice to letters of more generous design." |
Dataset Card for lj_speech
Dataset Summary
This is a public domain speech dataset consisting of 13,100 short audio clips of a single speaker reading passages from 7 non-fiction books in English. A transcription is provided for each clip. Clips vary in length from 1 to 10 seconds and have a total length of approximately 24 hours.
The texts were published between 1884 and 1964, and are in the public domain. The audio was recorded in 2016-17 by the LibriVox project and is also in the public domain.
Supported Tasks and Leaderboards
The dataset can be used to train a model for Automatic Speech Recognition (ASR) or Text-to-Speech (TTS).
- other:automatic-speech-recognition: An ASR model is presented with an audio file and asked to transcribe it to written text. The most common ASR evaluation metric is the word error rate (WER).
- other:text-to-speech: A TTS model is given a written text in natural language and asked to generate a speech audio file. A reasonable evaluation metric is the mean opinion score (MOS) of audio quality. The dataset has an active leaderboard which can be found at https://paperswithcode.com/sota/text-to-speech-synthesis-on-ljspeech
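As an illustration of the WER metric mentioned for the ASR task, the following is a minimal sketch in plain Python. It assumes whitespace tokenization and no text normalization; the function name `word_error_rate` is chosen here for illustration and is not part of the dataset or of any library.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words consumed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


reference = "in being comparatively modern"
print(word_error_rate(reference, "in being comparatively modern"))  # 0.0
print(word_error_rate(reference, "in being modern"))                # 0.25
```

In practice, both strings would usually be lowercased and stripped of punctuation before scoring; that step is left out here to keep the sketch short.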
Languages
The transcriptions and audio are in English.
Dataset Structure
Data Instances
A data point comprises the path to the audio file, called file, and its transcription, called text.
A normalized version of the text is also provided.
{
'id': 'LJ002-0026',
'file': '/datasets/downloads/extracted/05bfe561f096e4c52667e3639af495226afe4e5d08763f2d76d069e7a453c543/LJSpeech-1.1/wavs/LJ002-0026.wav',
'audio': {'path': '/datasets/downloads/extracted/05bfe561f096e4c52667e3639af495226afe4e5d08763f2d76d069e7a453c543/LJSpeech-1.1/wavs/LJ002-0026.wav',
'array': array([-0.00048828, -0.00018311, -0.00137329, ..., 0.00079346,
0.00091553, 0.00085449], dtype=float32),
'sampling_rate': 22050},
'text': 'in the three years between 1813 and 1816,',
'normalized_text': 'in the three years between eighteen thirteen and eighteen sixteen,',
}
Each audio file is a single-channel 16-bit PCM WAV with a sample rate of 22050 Hz.
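The stated audio format can be checked with Python's standard `wave` module. The sketch below is an illustration only: the helper names `describe_wav` and `matches_card_format` are made up here, and a synthetic one-second silent clip stands in for a real file so the example runs without downloading the dataset.

```python
import io
import wave


def describe_wav(source) -> dict:
    """Read header fields from a WAV file path or binary file-like object."""
    with wave.open(source, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "bits_per_sample": wav.getsampwidth() * 8,
            "sampling_rate": wav.getframerate(),
            "duration_sec": wav.getnframes() / wav.getframerate(),
        }


def matches_card_format(info: dict) -> bool:
    """True for single-channel, 16-bit audio sampled at 22050 Hz."""
    return (
        info["channels"] == 1
        and info["bits_per_sample"] == 16
        and info["sampling_rate"] == 22050
    )


# Synthetic stand-in clip: one second of 16-bit mono silence at 22050 Hz.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)       # 2 bytes per sample = 16 bits
    wav.setframerate(22050)
    wav.writeframes(b"\x00\x00" * 22050)
buffer.seek(0)

info = describe_wav(buffer)
print(info)
print(matches_card_format(info))  # True
```

To inspect a downloaded clip instead, pass the value of the file field to `describe_wav`.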
Data Fields
id: unique id of the data sample.
file: a path to the downloaded audio file in .wav format.
audio: A dictionary containing the path to the downloaded audio file, the decoded audio array, and the sampling rate. Note that when accessing the audio column, dataset[0]["audio"], the audio file is automatically decoded and resampled to dataset.features["audio"].sampling_rate. Decoding and resampling a large number of audio files might take a significant amount of time, so it is important to query the sample index before the "audio" column: dataset[0]["audio"] should always be preferred over dataset["audio"][0].
text: the transcription of the audio file.
normalized_text: the transcription with numbers, ordinals, and monetary units expanded into full words.
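The advice about access order for the audio field can be illustrated with a toy model. The class below is a hypothetical stand-in written for this example; it is not the `datasets` library and only mimics the behavior described above, where decoding happens at access time. It counts how many files each access pattern would decode.

```python
class ToyAudioDataset:
    """Hypothetical stand-in whose "audio" column decodes files on access."""

    def __init__(self, paths):
        self.paths = list(paths)
        self.decode_calls = 0

    def _decode(self, path):
        # Placeholder for reading and resampling one audio file.
        self.decode_calls += 1
        return {"path": path, "array": [0.0], "sampling_rate": 22050}

    def __getitem__(self, key):
        if isinstance(key, int):
            # Row access: decode only the requested example.
            return {"audio": self._decode(self.paths[key])}
        if key == "audio":
            # Column access: decode every example, then return the list.
            return [self._decode(path) for path in self.paths]
        raise KeyError(key)


paths = [f"LJ001-{i:04d}.wav" for i in range(1, 101)]

row_first = ToyAudioDataset(paths)
_ = row_first[0]["audio"]
print(row_first.decode_calls)     # 1

column_first = ToyAudioDataset(paths)
_ = column_first["audio"][0]
print(column_first.decode_calls)  # 100
```

Indexing the row first touches one file, while indexing the column first decodes all of them before the single element is selected.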
Data Splits
The dataset is not pre-split. Some statistics:
- Total Clips: 13,100
- Total Words: 225,715
- Total Characters: 1,308,678
- Total Duration: 23:55:17
- Mean Clip Duration: 6.57 sec
- Min Clip Duration: 1.11 sec
- Max Clip Duration: 10.10 sec
- Mean Words per Clip: 17.23
- Distinct Words: 13,821
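Because the dataset is not pre-split, users need to create their own held-out set. One possible approach, sketched below, assigns each clip to a split from a hash of its id, so the assignment stays stable across runs and machines. The function name `assign_split` and the 5% validation fraction are assumptions for illustration, not a recommended or official split.

```python
import hashlib


def assign_split(clip_id: str, validation_fraction: float = 0.05) -> str:
    """Map a clip id to 'train' or 'validation' deterministically."""
    digest = hashlib.sha256(clip_id.encode("utf-8")).digest()
    # Interpret the first 4 bytes as an integer and scale it to [0, 1).
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    return "validation" if bucket < validation_fraction else "train"


# Illustrative ids in the same format as the id field.
clip_ids = [f"LJ001-{i:04d}" for i in range(1, 101)]
splits = {clip_id: assign_split(clip_id) for clip_id in clip_ids}

held_out = sum(split == "validation" for split in splits.values())
print(f"{held_out} of {len(clip_ids)} clips held out")
```

Since neighboring clips often come from the same sentence, some users may prefer to hash a coarser key, such as the id prefix before the hyphen, so that related clips land in the same split.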
Dataset Creation
Curation Rationale
[Needs More Information]
Source Data
Initial Data Collection and Normalization
This dataset consists of excerpts from the following works:
- Morris, William, et al. Arts and Crafts Essays. 1893.
- Griffiths, Arthur. The Chronicles of Newgate, Vol. 2. 1884.
- Roosevelt, Franklin D. The Fireside Chats of Franklin Delano Roosevelt. 1933-42.
- Harland, Marion. Marion Harland's Cookery for Beginners. 1893.
- Rolt-Wheeler, Francis. The Science - History of the Universe, Vol. 5: Biology. 1910.
- Banks, Edgar J. The Seven Wonders of the Ancient World. 1916.
- President's Commission on the Assassination of President Kennedy. Report of the President's Commission on the Assassination of President Kennedy. 1964.
Some details about normalization:
- The normalized transcription has the numbers, ordinals, and monetary units expanded into full words (UTF-8)
- 19 of the transcriptions contain non-ASCII characters (for example, LJ016-0257 contains "raison d'être").
- The following abbreviations appear in the text. They may be expanded as follows:
Abbreviation | Expansion |
---|---|
Mr. | Mister |
Mrs. | Misess (*) |
Dr. | Doctor |
No. | Number |
St. | Saint |
Co. | Company |
Jr. | Junior |
Maj. | Major |
Gen. | General |
Drs. | Doctors |
Rev. | Reverend |
Lt. | Lieutenant |
Hon. | Honorable |
Sgt. | Sergeant |
Capt. | Captain |
Esq. | Esquire |
Ltd. | Limited |
Col. | Colonel |
Ft. | Fort |

(*) There is no standard expansion for "Mrs."
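The abbreviation table above can be applied with a small regular-expression substitution. This is a sketch under stated assumptions: the function name `expand_abbreviations` is made up here, the mapping copies the table verbatim (including its spelling for "Mrs."), and the pattern does not try to tell an abbreviation apart from the same letters ending a sentence (for example "No.").

```python
import re

# Mapping copied from the abbreviation table above.
ABBREVIATIONS = {
    "Mr.": "Mister", "Mrs.": "Misess", "Dr.": "Doctor", "No.": "Number",
    "St.": "Saint", "Co.": "Company", "Jr.": "Junior", "Maj.": "Major",
    "Gen.": "General", "Drs.": "Doctors", "Rev.": "Reverend",
    "Lt.": "Lieutenant", "Hon.": "Honorable", "Sgt.": "Sergeant",
    "Capt.": "Captain", "Esq.": "Esquire", "Ltd.": "Limited",
    "Col.": "Colonel", "Ft.": "Fort",
}

# Match any listed abbreviation that starts at a word boundary and is
# followed by its period.
_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(key[:-1]) for key in ABBREVIATIONS) + r")\."
)


def expand_abbreviations(text: str) -> str:
    """Replace each listed abbreviation with its expansion."""
    return _PATTERN.sub(lambda match: ABBREVIATIONS[match.group(0)], text)


print(expand_abbreviations("Mr. Smith and Mrs. Jones met Dr. Brown at St. Paul's."))
# Mister Smith and Misess Jones met Doctor Brown at Saint Paul's.
```

Abbreviations that are not in the table, such as "Messrs.", are left unchanged.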
Who are the source language producers?
[Needs More Information]
Annotations
Annotation process
- The audio clips range in length from approximately 1 second to 10 seconds. They were segmented automatically based on silences in the recording. Clip boundaries generally align with sentence or clause boundaries, but not always.
- The text was matched to the audio manually, and a QA pass was done to ensure that the text accurately matched the words spoken in the audio.
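The card does not say which algorithm was used for the silence-based segmentation. The following is a generic energy-threshold sketch of the idea, not the method actually used for this dataset; the function name `split_on_silence` and all parameter defaults are assumptions chosen for illustration.

```python
def split_on_silence(samples, sampling_rate, frame_ms=20,
                     threshold=0.01, min_silence_ms=200):
    """Return (start, end) sample index pairs for the non-silent regions.

    A frame counts as silent when its mean absolute amplitude is below
    `threshold`. Only silent runs of at least `min_silence_ms` end a segment.
    """
    frame_len = max(1, int(sampling_rate * frame_ms / 1000))
    min_silent_frames = max(1, int(min_silence_ms / frame_ms))

    # Classify every frame as silent (True) or voiced (False).
    silent = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        silent.append(sum(abs(x) for x in frame) / len(frame) < threshold)

    segments = []
    seg_start = None    # first frame of the segment currently open
    last_voiced = None  # most recent voiced frame in that segment
    for i, is_silent in enumerate(silent):
        if not is_silent:
            if seg_start is None:
                seg_start = i
            last_voiced = i
        elif seg_start is not None and i - last_voiced >= min_silent_frames:
            # The silence is long enough: close the segment after its
            # last voiced frame.
            segments.append((seg_start * frame_len, (last_voiced + 1) * frame_len))
            seg_start = None
    if seg_start is not None:
        end = min(len(samples), (last_voiced + 1) * frame_len)
        segments.append((seg_start * frame_len, end))
    return segments


# Synthetic signal at 1000 Hz: 0.5 s of sound, 0.3 s of silence, 0.4 s of sound.
signal = [0.5] * 500 + [0.0] * 300 + [0.5] * 400
print(split_on_silence(signal, sampling_rate=1000))  # [(0, 500), (800, 1200)]
```

A pause shorter than `min_silence_ms` does not split the audio, which is one reason clip boundaries from such a method tend to fall at sentence or clause breaks rather than between every pair of words.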
Who are the annotators?
Recordings by Linda Johnson from LibriVox. Alignment and annotation by Keith Ito.
Personal and Sensitive Information
The dataset consists of recordings from people who have donated their voice online. You agree not to attempt to determine the identity of speakers in this dataset.
Considerations for Using the Data
Social Impact of Dataset
[Needs More Information]
Discussion of Biases
[Needs More Information]
Other Known Limitations
- The original LibriVox recordings were distributed as 128 kbps MP3 files. As a result, they may contain artifacts introduced by the MP3 encoding.
Additional Information
Dataset Curators
The dataset was initially created by Keith Ito and Linda Johnson.
Licensing Information
Public Domain (LibriVox)
Citation Information
@misc{ljspeech17,
author = {Keith Ito and Linda Johnson},
title = {The LJ Speech Dataset},
howpublished = {\url{https://keithito.com/LJ-Speech-Dataset/}},
year = 2017
}
Contributions
Thanks to @anton-l for adding this dataset.