News

Successfully completed Master's thesis - Named Entity Extraction on archival data

The goal of this thesis was to find entities in a large set of archival data and to link them to the knowledge bases GND (http://www.dnb.de/DE/Standardisierung/GND/gnd_node.html) and Wikidata (https://www.wikidata.org/wiki/Wikidata:Main_Page). Named Entity Recognition (NER) was performed with a Conditional Random Field, and the linking was implemented with an extension of AGDISTIS (http://aksw.org/Projects/AGDISTIS.html).
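To give a rough idea of what a Conditional Random Field for NER consumes, the sketch below builds per-token feature dictionaries of the kind many CRF toolkits accept as input. It is only an illustration: the feature set, the function names `token_features` and `sentence_features`, and the example sentence are assumptions made here, not the features used in the thesis.

```python
def token_features(tokens, i):
    """Build a feature dict for tokens[i] (hypothetical feature set)."""
    word = tokens[i]
    feats = {
        "lower": word.lower(),        # normalized surface form
        "is_title": word.istitle(),   # capitalization often hints at names
        "is_upper": word.isupper(),
        "is_digit": word.isdigit(),
        "suffix3": word[-3:],         # crude morphological cue
    }
    # Context features: neighbouring tokens, or sentence boundary markers.
    if i > 0:
        feats["prev_lower"] = tokens[i - 1].lower()
    else:
        feats["BOS"] = True
    if i < len(tokens) - 1:
        feats["next_lower"] = tokens[i + 1].lower()
    else:
        feats["EOS"] = True
    return feats


def sentence_features(tokens):
    """Return one feature dict per token of a tokenized sentence."""
    return [token_features(tokens, i) for i in range(len(tokens))]
```

A sequence of such dictionaries, paired with one label per token (for example person, place, or "no entity"), is the typical training input for a CRF tagger.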

The extension of AGDISTIS adds a new measure for calculating the distance between two entities as well as new entity-type-specific features. In addition, a new technique was developed to link entities against both knowledge bases at the same time. Finally, the framework was evaluated on a newly created gold standard for archival data.

For other interesting thesis topics, see http://dice.cs.uni-paderborn.de/teaching/thesis/.
