Projects in the TextGrid Repository


  • The CLiGS textbox contains several corpora of literary texts in Romance languages. It was made available by the CLiGS junior research group.


  • Distant Reading for European Literary History (COST Action CA16204) is a project aiming to create a vibrant and diverse network of researchers jointly developing the resources and methods necessary to change the way European literary history is written. Grounded in the Distant Reading paradigm (i.e. using computational methods of analysis for large collections of literary texts), the Action will create a shared theoretical and practical framework to enable innovative, sophisticated, data-driven, computational methods of literary text analysis across at least 10 European languages. Fostering insight into cross-national, large-scale patterns and evolutions across European literary traditions, the Action will facilitate the creation of a broader, more inclusive and better-grounded account of European literary history and cultural identity.


  • The corpus contains novels written by Spanish authors and published between 1880 and 1939. The original corpus contains 358 prose texts in total; however, due to copyright issues, only 219 can currently be published. The corpus design draws on the data of two authoritative histories of literature, and each text is annotated with several types of metadata. Further details on the corpus can be found below.

The TextGrid Repository

The TextGrid Repository is a long-term archive for humanities research data. It provides an extensive, searchable, and reusable repository of texts and images. Aligned with the principles of Open Access and FAIR, the TextGrid Repository was awarded the CoreTrustSeal in 2020. For researchers, the TextGrid Repository offers a sustainable, durable, and secure way to publish their research data in a citable manner and to describe it in an understandable way through required metadata. Read more about sustainability, FAIR and Open Access in the Mission Statement of the TextGrid Repository.

The vast majority of the texts are available in XML/TEI encoding in addition to plain text, allowing for diverse reuse. The repository was established with the acquisition of the Digital Library and continues to evolve through the TextGrid Community. Numerous edition projects created in the virtual research environment of the TextGrid Laboratory contribute manuscripts (images) as well as transcriptions (XML/TEI encoded text data), such as the Library of Neology or ARCHITRAVE, the project on German-French travel correspondence.

Accordingly, the content is partly project-specific and is growing over time. The basis is an extensive corpus of world literature from the beginning of the history of printed books to the 20th century, consisting of texts by around 600 authors: the TextGrid Digital Library contains world literature written in or translated into German. Nevertheless, the TextGrid Repository has no language restriction, and texts in other languages are published here depending on the project context.

In addition to the advanced search, the content of the TextGrid Repository is also explorable by filtering by author, by genre, by file type, and by project.

All published content is open-access, and should be referenced as usual according to the citation suggestion provided in each case. The TextGrid Repository offers the possibility to compile individual collections via the shelf function. These can be downloaded collectively in XML or TXT formats or examined directly with a range of digital tools.
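As an illustration of the reuse that XML/TEI encoding allows, the following is a minimal sketch of reading a title and paragraph text from a TEI document with Python's standard library. It is not an official TextGrid tool or API: the sample document and the function name `extract_title_and_paragraphs` are hypothetical, and real files from the repository may be structured differently (for example, verse or drama rather than prose paragraphs). The only fixed assumption is the standard TEI namespace.

```python
import xml.etree.ElementTree as ET

# Standard TEI P5 namespace.
TEI_URI = "http://www.tei-c.org/ns/1.0"
TEI_NS = {"tei": TEI_URI}

# Hypothetical, minimal TEI document standing in for a downloaded file.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample Title</title></titleStmt>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>First paragraph.</p>
      <p>Second <hi>highlighted</hi> paragraph.</p>
    </body>
  </text>
</TEI>"""


def extract_title_and_paragraphs(tei_xml):
    """Return (title, list of paragraph strings) from a TEI string."""
    root = ET.fromstring(tei_xml)

    # Title from the header; empty string if the element is missing.
    title_el = root.find(".//tei:titleStmt/tei:title", TEI_NS)
    title = "".join(title_el.itertext()).strip() if title_el is not None else ""

    # Paragraphs from the text body, with inline markup flattened
    # and runs of whitespace collapsed to single spaces.
    paragraphs = []
    body = root.find(".//tei:body", TEI_NS)
    if body is not None:
        for p in body.iter("{%s}p" % TEI_URI):
            paragraphs.append(" ".join("".join(p.itertext()).split()))
    return title, paragraphs


title, paragraphs = extract_title_and_paragraphs(SAMPLE_TEI)
print(title)
for paragraph in paragraphs:
    print(paragraph)
```

For a file downloaded from a collection, the same function could be applied to the file's contents after reading it from disk.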

Participation

Would you like your own XML-encoded files to be archived and made citable and accessible through the TextGrid Repository? Then contact us: https://textgrid.de/en/kontakt/