NERDPool: Data Pool for Named Entity Recognition

Named Entity Recognition (NER) is the automatic identification and extraction of named entities, such as person and place names, from unstructured data. It is a topic of growing interest in the Digital Humanities and digital scholarly editing. Hardly any resources exist for training NER models on historical German-language texts. NERDPool tries to overcome this issue by publishing a collection of gold standard named entity annotation samples through this web application and web service.
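
To make the idea of an annotation sample concrete, the sketch below represents one sentence together with labelled character spans. This layout (a text plus start/end/label triples) is an assumption made for illustration and is not necessarily the format NERDPool delivers. The sentence and the helper `annotate` are hypothetical; the labels PER, LOC and DATE are taken from the entity counts listed on this page.

```python
def annotate(text, phrase, label):
    """Return a (start, end, label) character span for the first occurrence of phrase."""
    start = text.index(phrase)  # raises ValueError if the phrase is absent
    return (start, start + len(phrase), label)


# Hypothetical example sentence, not taken from any NERDPool collection.
text = "Emperor Maximilian II met the envoys in Regensburg in June 1576."

entities = [
    annotate(text, "Maximilian II", "PER"),
    annotate(text, "Regensburg", "LOC"),
    annotate(text, "June 1576", "DATE"),
]

# One annotation sample: the raw text plus its entity spans (assumed layout).
sample = {"text": text, "entities": entities}

for start, end, label in sample["entities"]:
    print(label, repr(sample["text"][start:end]))
```

Storing offsets rather than the entity strings themselves keeps each annotation tied to an exact position in the text, which matters when the same name occurs more than once.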

Basic Stats
Data Sources

RTA 10442 samples

Entities

LOC 8735 PER 2083 DATE 2127 TIME 580

Info

{'url': 'https://reichstagsakten-1576.uni-graz.at/en/', 'license': 'https://creativecommons.org/licenses/by-nc/4.0/', 'creators': [{'name': 'Roman Bleier', 'github_name': 'bleierr'}, {'name': 'Jacqueline More', 'github_name': 'jackymore'}, {'name': 'Magdalene Ebner'}], 'description': 'From June to October 1576, Emperor Maximilian II and more than 200 representatives of the Reichsstände discussed and decided on the political fate of (Eastern) Central Europe in Regensburg. Envoys from (almost) all over Europe took this as an opportunity to go to Lower Bavaria as well. They made the Reichstag a place of European politics. This event has left a great deal of written documentation, which is edited and published in the project The Imperial Diet of Regensburg, 1576 -- digital.'}

DATA-Endpoint

RITA 924 samples

Entities

PER 1470 LOC 241 ORG 63

Info

{'url': 'https://hdl.handle.net/21.11115/0000-000D-FEAB-5', 'license': 'https://creativecommons.org/licenses/by/4.0/', 'creators': [{'name': 'Michael Span'}], 'description': "The annotated data come from so-called 'Verfachbücher' from the central Pustertal, dating from the second half of the 18th century."}

DATA-Endpoint

MRP 10319 samples

Entities

ORG 3971 PER 7774 GPE 2787 LOC 412

Info

{'url': 'https://mrp.oeaw.ac.at/', 'license': 'https://creativecommons.org/licenses/by/4.0/', 'creators': [{'name': 'Ademir Hamzabegovic'}], 'description': "The annotated data are based on the 'Ministerratsprotokolle Österreichs und der österreichisch-ungarischen Monarchie 1848–1918'."}

DATA-Endpoint

Chronik Aldersbach 536 samples

Entities

PER 1005 LOC 851 DATE 389

Info

{'url': 'http://gams.uni-graz.at/context:aled', 'license': 'https://creativecommons.org/licenses/by-nc/4.0/', 'creators': [{'name': 'Robert Klugseder'}, {'name': 'Maximilian Vogeltanz'}, {'name': 'Georg Vogeler'}], 'description': 'German-language chronicle (mid-17th century) by Abbot Gerhard Hörger (r. 1651-1659) of the Cistercian abbey of Aldersbach.'}

DATA-Endpoint

DIPKO 695 samples

Entities

LOC 1315 PER 1319

Info

{'url': 'http://gams.uni-graz.at/context:dipko', 'license': 'https://creativecommons.org/licenses/by-nc-sa/4.0', 'creators': [{'name': 'Anna Huemer'}, {'name': 'Lisa Brunner'}, {'name': 'Philipp Humer'}], 'description': 'Itinerarium oder rayß beschreibung von Wien in Österreich nach Constantinopel. The text corpus is based largely on passages taken over from earlier travel accounts, on reports and relations by the internuncio Johann Rudolf Schmid zum Schwarzenhorn, and on notes and observations by Johann Georg Metzger himself. The fair copy of his own notes was most likely produced after the journey had ended.'}

DATA-Endpoint

WRDIARIUM 2359 samples

Entities

PER 3158 LOC 5079 ORG 447

Info

{'url': 'https://digitarium.acdh.oeaw.ac.at/', 'license': 'https://creativecommons.org/licenses/by/4.0/', 'creators': [{'name': 'Ademir Hamzabegovic'}], 'description': 'Issues of the Wiener Diarium.', 'quote_source': 'Wienerisches DIGITARIUM, edited by Claudia Resch and Dario Kampkaspar.'}

DATA-Endpoint
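
The label sets differ between collections (only MRP lists GPE, only RTA lists TIME), so it can help to tally the figures before combining collections for training. The following is a minimal sketch with the counts copied by hand from the listing above. The names `COLLECTIONS` and `summarize` are illustrative and not part of NERDPool.

```python
from collections import Counter

# (collection, number of samples, entity counts), copied from the listing above.
COLLECTIONS = [
    ("RTA", 10442, {"LOC": 8735, "PER": 2083, "DATE": 2127, "TIME": 580}),
    ("RITA", 924, {"PER": 1470, "LOC": 241, "ORG": 63}),
    ("MRP", 10319, {"ORG": 3971, "PER": 7774, "GPE": 2787, "LOC": 412}),
    ("Chronik Aldersbach", 536, {"PER": 1005, "LOC": 851, "DATE": 389}),
    ("DIPKO", 695, {"LOC": 1315, "PER": 1319}),
    ("WRDIARIUM", 2359, {"PER": 3158, "LOC": 5079, "ORG": 447}),
]


def summarize(collections):
    """Return the total sample count and the per-label entity totals."""
    total_samples = sum(samples for _, samples, _ in collections)
    totals = Counter()
    for _, _, entities in collections:
        totals.update(entities)  # adds the counts label by label
    return total_samples, totals


total_samples, entity_totals = summarize(COLLECTIONS)
print(f"samples: {total_samples}")
for label, count in entity_totals.most_common():
    print(f"{label}: {count}")
```

A tally like this makes it easy to see which labels are well represented across all collections (PER and LOC) and which come from a single source.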