Organization: Semantic Computing Research Group (SeCo), The Finnish Literature Society (SKS)

Description: The dataset consists of biographies from the National Biography of Finland and other historical databases, published as Linked Data. The data was created by extracting knowledge from more than 13 000 underlying biographical texts using language technology, and enriched by linking it to various biographical databases such as DBpedia and Wikidata, collection databases of memory institutions, and Linked Data repositories. The dataset includes events, people, places, occupations, and other aspects and documentation of biographical history.

License(s): CC BY 4.0

Detailed Dataset Contents

Occupational titles

Information on occupational titles mentioned in the biographies. The graph contains about 7 000 labels.

BiographySampo, places

This graph contains Finnish and international place names covering most of the places mentioned in the biographical data. The data has roughly 5 000 place names. Place resources in Finland were extracted from YSO places or from the Place Name Registry. Foreign places were queried from Wikidata or obtained using the Google Maps API.

People in BiographySampo

The people in the databases of the Finnish Literature Society, drawn from five sources. The data also contains their relatives, extracted from the biographical descriptions.

  • URI:
  • Information included: name, birth, death, occupations, biography, links to known relatives.
  • Detailed source information:
    • National Biography of Finland database. The Finnish Literature Society.
    • Business Leaders. The Finnish Literature Society.
    • Finnish Generals and Admirals 1809–1917. The Finnish Literature Society.
    • Finnish Clergy 1554–1721. The Finnish Literature Society.
    • Finnish Clergy 1800–1920. The Finnish Literature Society.
  • Example resource URI:
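
As a rough illustration of the fields listed above, a person entry could be held in a small record type once it has been retrieved. This is only a sketch: the class name `PersonRecord` and all of its field names are hypothetical and are not taken from the dataset's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PersonRecord:
    """Hypothetical container mirroring the listed fields (not the real schema)."""
    name: str
    birth: Optional[str] = None       # birth date as text, if known
    death: Optional[str] = None       # death date as text, if known
    occupations: List[str] = field(default_factory=list)
    biography: Optional[str] = None   # biographical description text
    relatives: List[str] = field(default_factory=list)  # identifiers of known relatives

    def has_known_relatives(self) -> bool:
        # True when at least one relative link is present.
        return len(self.relatives) > 0
```

Only the name is required here; every other field defaults to empty, on the assumption that not every source supplies all of the listed information.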


Keywords used in the web application. The keywords are terms in the ontology that have been linked to the biographical descriptions.

BiographySampo Events

Information about the events of the biographees was converted from the source CSV files or extracted from the biographical descriptions.


Categories of people mentioned in the Kansallisbiografia dataset. The data has 61 different labels.


Links between the biographies. The links consist of manual additions made by the biography authors and automatic additions constructed with NLP techniques.


Links to person resources or images in external databases, e.g. Wikidata, VIAF, ISNI, WarSampo, etc.


Information on companies mentioned in the biographies. The graph contains about 2 700 labels.


Timespans used in events. Each timespan can have a start time and an end time.
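
A timespan whose start and end are both optional could be modeled along the following lines. This is a minimal sketch under stated assumptions: the class name `Timespan`, its field names, and the `contains` helper are hypothetical and do not reflect how the dataset itself encodes timespans.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Timespan:
    """Hypothetical model of a timespan; either bound may be unknown."""
    start: Optional[date] = None
    end: Optional[date] = None

    def __post_init__(self) -> None:
        # Reject spans whose end precedes their start when both are known.
        if self.start and self.end and self.end < self.start:
            raise ValueError("end must not precede start")

    def contains(self, day: date) -> bool:
        # A missing bound is treated as open-ended on that side.
        after_start = self.start is None or day >= self.start
        before_end = self.end is None or day <= self.end
        return after_start and before_end
```

For example, `Timespan(start=date(1809, 1, 1), end=date(1917, 12, 31)).contains(date(1850, 6, 1))` evaluates to `True`, while a span with only a start is treated as open-ended.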


This graph contains the biographical descriptions of 13 665 biographees. Each biography consists of several paragraphs.


Time periods used in the data.

Family relations

Family relations used in the data.