29 May 2016 | 1st WHiSe Workshop / ESWC 2016, Anissaras, Crete

Data Repositories in the Humanities and the Semantic Web

Modelling, Linking, Visualising

Download: https://github.com/digicademy/whise2016-xtriples
View: http://digicademy.github.io/whise2016-xtriples

Table of Contents

  1. Humanities' data repositories: Examples from the
    Academy of Sciences and Literature Mainz
  2. Implicit and explicit semantics:
    bridging the gap between TEI-XML and RDF
  3. The XTriples Webservice
  4. XTriples use cases: harvesting, transforming and visualising XML data
  5. XTriples: Outlook & Roadmap

01

Humanities' data repositories

Examples from the Academy of Sciences and Literature Mainz

The German Academies of Sciences

Long-term research in the humanities and social sciences

  • The "Academies' Programme" is currently the most comprehensive humanities research programme in Germany.
  • It provides funding for long-term research projects, predominantly in the humanities but also in the social sciences.
  • Within the Academies’ Programme, more than 900 researchers and administrative staff are currently working on 144 projects across 199 locations. Projects run for between 12 and 24 years.
  • The main focus of the programme is on compiling dictionaries, encyclopaedias and critical editions in the fields of theology, philosophy, history, literature and linguistics, art history and archaeology, inscriptions, onomastics, and musicology.
  • Methods from the Digital Humanities are increasingly becoming common practice in the research conducted by the academies. Projects accumulate a broad range of humanities research data that is often made available on specialised online platforms.

Example Data Repositories

Academy of Sciences and Literature | Mainz

Regesta Imperii Online (http://www.regesta-imperii.de)

  • Database with ca. 173,000 regesta of medieval charters from Charlemagne to Maximilian I, covering 800 years of European medieval history
  • Records available under a CC-BY 4.0 license via a REST interface
  • TEI-XML according to the guidelines of the Charters Encoding Initiative (example)

Corpus Vitrearum Medii Aevi (http://www.corpusvitrearum.de)

  • Digital archive with high-resolution TIFF images of medieval stained glass (CC-BY 4.0)
  • Extensive XMP metadata embedded into the images
  • Controlled vocabulary (Iconclass) to describe the iconography
  • REST interface for harvesting images and metadata in XMP and JSON-LD (example)

German Inscriptions Online (http://www.inschriften.net)

  • 17,000 catalogue numbers from the critical edition "Deutsche Inschriften" with 18,000 images, covering medieval and early modern inscriptions in the German-speaking area
  • The records are currently being made available as TEI-XML following the EpiDoc standard
  • REST interface for harvesting the XML data (example)

LOD for the Academies' projects

Great potential in connecting research data repositories

New connections, new questions

Questions that could be analysed using semantic web technologies

But the primary question is: How to get the semantic information out of TEI-XML?

02

Implicit and explicit semantics

Bridging the gap between TEI-XML and RDF

Implicit Semantics in TEI-XML

Metadata of a letter from Johann Wolfgang von Goethe

Link: Johann Wolfgang von Goethe to Samuel Thomas Soemmerring (1793)

    <correspDesc key="686" cs:source="#SOE20">
        <correspAction type="sent">
            <persName ref="http://d-nb.info/gnd/118540238">
                Johann Wolfgang von Goethe
            </persName>
            <placeName ref="http://www.geonames.org/2812482">
                Weimar
            </placeName>
            <date when="1793-12-05">5.12.1793</date>
        </correspAction>
        [...]
    </correspDesc>

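The semantics are "implicit" in the sense that the meaning is carried by the markup itself: element names, element content, attribute names and attribute values. Nothing in the XML states formally how the person, the place and the date relate to the letter.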
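
As a rough illustration of what "implicit" means in practice, the following Python sketch reads the fragment above with the standard library and collects the facts from attribute values and element content. The `cs:source` attribute is left out so that the fragment parses without a namespace declaration, and the dictionary keys are illustrative names, not part of TEI.

```python
import xml.etree.ElementTree as ET

# The correspDesc fragment from above; cs:source is omitted (see lead-in).
TEI_FRAGMENT = """
<correspDesc key="686">
    <correspAction type="sent">
        <persName ref="http://d-nb.info/gnd/118540238">
            Johann Wolfgang von Goethe
        </persName>
        <placeName ref="http://www.geonames.org/2812482">
            Weimar
        </placeName>
        <date when="1793-12-05">5.12.1793</date>
    </correspAction>
</correspDesc>
"""


def read_sent_action(xml_text):
    """Collect the facts carried by attributes and text of the 'sent' action."""
    root = ET.fromstring(xml_text)
    sent = root.find("correspAction[@type='sent']")
    person = sent.find("persName")
    place = sent.find("placeName")
    return {
        "letter_key": root.get("key"),
        "sender_uri": person.get("ref"),
        "sender_name": person.text.strip(),
        "place_uri": place.get("ref"),
        "place_name": place.text.strip(),
        "date": sent.find("date").get("when"),
    }


facts = read_sent_action(TEI_FRAGMENT)
for key, value in facts.items():
    print(f"{key}: {value}")
```

The code has to know in advance that `ref` holds an identifier and `when` holds a normalised date. That knowledge lives in the TEI Guidelines rather than in the data, which is the gap RDF closes.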

Explicit Semantics in RDF

The same information as expressed in RDF

    Goethe           is a              Person ;
                     sends             Letter .

    Letter           dates to          1793 ;
                     sent from         Weimar .

    Weimar           is a              City ;
                     has latitude      50.98 ;
                     has longitude     11.32 .
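
To make the first statements above concrete, here is a minimal sketch that writes them as N-Triples using only the Python standard library. The `http://example.org/` namespace, the property names under it and the letter URI are assumptions made for illustration; the slides do not prescribe a vocabulary at this point. Only the GND and GeoNames identifiers are taken from the TEI fragment.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_GYEAR = "http://www.w3.org/2001/XMLSchema#gYear"
EX = "http://example.org/"  # hypothetical namespace for terms the slides leave open

GOETHE = "http://d-nb.info/gnd/118540238"
WEIMAR = "http://www.geonames.org/2812482"
LETTER = EX + "letter/686"  # hypothetical URI built from the correspDesc key


def uri(value):
    """Wrap an IRI in angle brackets as N-Triples requires."""
    return f"<{value}>"


def literal(value, datatype=None):
    """Quote a literal, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    suffix = f"^^<{datatype}>" if datatype else ""
    return f'"{escaped}"{suffix}'


triples = [
    (uri(GOETHE), uri(RDF_TYPE), uri(FOAF + "Person")),
    (uri(GOETHE), uri(EX + "sends"), uri(LETTER)),
    (uri(LETTER), uri(EX + "datesTo"), literal("1793", XSD_GYEAR)),
    (uri(LETTER), uri(EX + "sentFrom"), uri(WEIMAR)),
    (uri(WEIMAR), uri(RDF_TYPE), uri(EX + "City")),
]


def to_ntriples(statements):
    """Serialise (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in statements)


print(to_ntriples(triples))
```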

Gathering additional data

Enriching the RDF model of Goethe's letter with LOD

How?

03

The XTriples Webservice

A generic webservice to extract RDF statements from XML

Website, Documentation & Code

Version 1.3 (stable) | MIT License

Basic principles of the webservice

  1. Generic: Extract RDF statements out of any XML
  2. Simple: Slim configuration based on statement patterns
  3. Inclusive: Harvest data from other XML resources during extraction
  4. Flexible: Return a broad array of RDF formats
  5. RESTful: Communicate solely over HTTP

XTriples

Process diagram & technical components

Simple XML configuration

Based on statement patterns with XPath / XQuery expressions

<xtriples>
    <configuration>
        <vocabularies>
            <vocabulary prefix="###PREFIX###" uri="###NAMESPACE###"/>
            <vocabulary prefix="rdf" uri="http://www.w3.org/1999/02/22-rdf-syntax-ns#"/>
            [...] <!-- more vocabularies -->
        </vocabularies>
        <triples>
            <statement>
                <subject prefix="###PREFIX###">###XPATH###</subject>
                <predicate prefix="rdf">about</predicate>
                <object type="literal">###XPATH###</object>
            </statement>
            [...] <!-- more statements -->
        </triples>
    </configuration>
    <collection uri="http://xml.collection.somewhere/resources/index.xml">
        <resource uri="http://xml.collection.somewhere/resources/{###XPATH###}.xml"/>
    </collection>
    [...] <!-- more collections -->
</xtriples>
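
One plausible reading of the `{...}` placeholder in the `uri` attribute inside `<collection>` is that the expression between the braces is evaluated against the collection document and each result yields one resource URI. The Python sketch below mimics that reading. It is an assumption about the behaviour, not the webservice's code: the index document is hypothetical, and because ElementTree supports only a subset of XPath, the helper handles only placeholders of the form `path/@attribute`.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical index document; the real collection format is not shown in the slides.
INDEX_XML = """
<index>
    <item id="1"/>
    <item id="2"/>
    <item id="3"/>
</index>
"""


def expand_template(template, index_xml):
    """Expand one '{path/@attribute}' placeholder into a list of URIs."""
    match = re.search(r"\{([^{}]+)\}", template)
    if match is None:
        return [template]  # nothing to expand
    path, separator, attribute = match.group(1).rpartition("/@")
    if not separator:
        raise ValueError("this sketch only supports 'path/@attribute' placeholders")
    root = ET.fromstring(index_xml)
    values = [element.get(attribute) for element in root.findall(path)]
    return [
        template.replace(match.group(0), value)
        for value in values
        if value is not None
    ]


uris = expand_template(
    "http://xml.collection.somewhere/resources/{.//item/@id}.xml",
    INDEX_XML,
)
for resource_uri in uris:
    print(resource_uri)
```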

Example data and configuration

Crawling a collection and extracting FOAF statements

<gods>
    <god id="1">
        <name>
            <greek>Ἀφροδίτη</greek>
            <roman>Venus</roman>
            <english>Aphrodite</english>
        </name>
        <gender>female</gender>
        [...]
    </god>
    <god id="2">
        <name>
            <greek>Ἀπόλλων</greek>
            <roman>Apollon</roman>
            <english>Apollo</english>
        </name>
        <gender>male</gender>
        [...]
    </god>
    <god id="3">
        <name>
            <greek>Ἄρης</greek>
            <roman>Mars</roman>
            <english>Ares</english>
        </name>
        <gender>male</gender>
        [...]
    </god>
    [...]
</gods>

<xtriples>
    <configuration>
        <vocabularies>
            <vocabulary prefix="gods" uri="http://xtriples.spatialhumanities.de/examples/gods/"/>
            <vocabulary prefix="rdf" uri="http://www.w3.org/1999/02/22-rdf-syntax-ns#"/>
            <vocabulary prefix="rdfs" uri="http://www.w3.org/2000/01/rdf-schema#"/>
            <vocabulary prefix="foaf" uri="http://xmlns.com/foaf/0.1/"/>
        </vocabularies>
        <triples>
            <statement>
                <subject prefix="gods">/@id</subject>
                <predicate prefix="rdf">type</predicate>
                <object prefix="foaf" type="uri">Person</object>
            </statement>
            <statement>
                <subject prefix="gods">/@id</subject>
                <predicate prefix="rdfs">label</predicate>
                <object type="literal" lang="en">/name/english</object>
            </statement>
            <statement>
                <subject prefix="gods">/@id</subject>
                <predicate prefix="rdfs">label</predicate>
                <object type="literal" lang="grc">/name/greek</object>
            </statement>
        </triples>
    </configuration>
    <collection uri="http://xtriples.spatialhumanities.de/examples/gods/all.xml">
        <resource uri="{//god}"/>
    </collection>
</xtriples>

Run this live: XML | Configuration | RDF | SVG
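
To make the mapping tangible, the sketch below re-implements two of the statement patterns from this configuration (the `rdf:type` statement and the English `rdfs:label`) by hand in Python and prints N-Triples. It is a simplified stand-in for what the webservice does, not its actual code: the XML is a shortened inline copy of the example data, and reading the leading `/` of each pattern as "relative to the current `<god>` resource" is an assumption about the pattern semantics.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example data: English names only.
GODS_XML = """
<gods>
    <god id="1"><name><english>Aphrodite</english></name></god>
    <god id="2"><name><english>Apollo</english></name></god>
    <god id="3"><name><english>Ares</english></name></god>
</gods>
"""

PREFIXES = {
    "gods": "http://xtriples.spatialhumanities.de/examples/gods/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def extract(xml_text):
    """Emit one rdf:type and one rdfs:label statement per <god> resource."""
    lines = []
    for god in ET.fromstring(xml_text).findall("god"):  # plays the role of {//god}
        subject = f"<{PREFIXES['gods']}{god.get('id')}>"  # pattern /@id
        lines.append(f"{subject} <{PREFIXES['rdf']}type> <{PREFIXES['foaf']}Person> .")
        label = god.findtext("name/english")  # pattern /name/english
        lines.append(f'{subject} <{PREFIXES["rdfs"]}label> "{label}"@en .')
    return lines


for line in extract(GODS_XML):
    print(line)
```

The Greek label pattern would follow the same shape, with a different path and language tag.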

04

XTriples use cases

Harvesting, transforming and visualising XML data

Use case 1: Corpus Vitrearum

Connecting the CVMA image archive with Iconclass and Europeana

Connecting CVMA, Europeana and Iconclass

CVMA - Stage 01

Applying CIDOC, SKOS and dcterms

Graph: one CVMA image resource with its types, title and description.

    <http://id.corpusvitrearum.de/images/2622>
        a                    crm:E18_Physical_Thing, skos:Concept ;
        dcterms:title        "Hl. Michael als Drachentöter" ;
        dcterms:description  "der Erzengel Michael kämpft gegen den Drachen (Teufel, Satan) Drache · Engel · Erzengel · Michael · Religion · Satan · Teufel · christliche Religion · kämpfen · mit Füßen treten · trampeln · übernatürlich" .

XTriples: Configuration | Run this live: SVG | RDF/XML | Turtle

CVMA - Stage 02

Connecting Iconclass

Graph: the same resource, now linked to its Iconclass notation.

    <http://id.corpusvitrearum.de/images/2622>
        a                    crm:E18_Physical_Thing, skos:Concept ;
        dcterms:title        "Hl. Michael als Drachentöter" ;
        dcterms:description  "der Erzengel Michael kämpft gegen den Drachen (Teufel, Satan) Drache · Engel · Erzengel · Michael · Religion · Satan · Teufel · christliche Religion · kämpfen · mit Füßen treten · trampeln · übernatürlich" ;
        rdfs:seeAlso         <http://iconclass.org/11G31> .

XTriples: Configuration | Run this live: SVG | RDF/XML | Turtle

CVMA - Stage 03

Iconographically similar resources from Europeana

Graph: the same resource, with an additional link to a Europeana search for the Iconclass notation.

    <http://id.corpusvitrearum.de/images/2622>
        a                    crm:E18_Physical_Thing, skos:Concept ;
        dcterms:title        "Hl. Michael als Drachentöter" ;
        dcterms:description  "der Erzengel Michael kämpft gegen den Drachen (Teufel, Satan) Drache · Engel · Erzengel · Michael · Religion · Satan · Teufel · christliche Religion · kämpfen · mit Füßen treten · trampeln · übernatürlich" ;
        rdfs:seeAlso         <http://iconclass.org/11G31>,
                             <http://www.europeana.eu/portal/search?q=what%3A%22http%3A%2F%2Ficonclass.org%2F11G31%22> .

XTriples: Configuration | Run this live: SVG | RDF/XML | Turtle
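
The Europeana link in this stage is an ordinary search URL whose `what:` query wraps the Iconclass URI in double quotes. A small sketch of how such a URL can be assembled with Python's standard library follows. The function name is made up for illustration, and the portal address is simply the one that appears in the graph above, which may no longer match Europeana's current search interface.

```python
from urllib.parse import quote

# Base address as it appears in the graph above.
EUROPEANA_SEARCH = "http://www.europeana.eu/portal/search?q="


def europeana_search_url(iconclass_uri):
    """Build a search URL for objects tagged with the given Iconclass URI."""
    query = f'what:"{iconclass_uri}"'
    # safe="" forces ':', '/' and '"' to be percent-encoded as well
    return EUROPEANA_SEARCH + quote(query, safe="")


print(europeana_search_url("http://iconclass.org/11G31"))
```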

CVMA - Stage 04

Multilingual labels from Iconclass

Graph: the same resource, enriched with related Iconclass notations and preferred labels in German and English.

    <http://id.corpusvitrearum.de/images/2622>
        a                    crm:E18_Physical_Thing, skos:Concept ;
        dcterms:title        "Hl. Michael als Drachentöter" ;
        dcterms:description  "der Erzengel Michael kämpft gegen den Drachen (Teufel, Satan) Drache · Engel · Erzengel · Michael · Religion · Satan · Teufel · christliche Religion · kämpfen · mit Füßen treten · trampeln · übernatürlich" ;
        skos:prefLabel       "der Erzengel Michael kämpft gegen den Drachen (Teufel, Satan)",
                             "the Archangel Michael fighting the dragon (devil, Satan)" ;
        skos:related         <http://iconclass.org/73G412>,
                             <http://iconclass.org/71E464> ;
        rdfs:seeAlso         <http://iconclass.org/11G31>,
                             <http://www.europeana.eu/portal/search?q=what%3A%22http%3A%2F%2Ficonclass.org%2F11G31%22> .

XTriples: Configuration | Run this live: SVG | RDF/XML | Turtle

Use case 2: A network of letters

Combining correspSearch with the GND and Geonames

Connecting letter information with the GND Authority File and Geonames

correspSearch - Stage 01

Goethe and his correspondents in correspSearch

Graph: persons (all typed foaf:Person) and the cd:sending edges between them.

    Persons
        http://d-nb.info/gnd/118540238   Goethe, Johann Wolfgang von; Johann Wolfgang von Goethe
        http://d-nb.info/gnd/118805193   Soemmerring, Samuel Thomas
        http://d-nb.info/gnd/122361261   Weber, Genovefa
        http://d-nb.info/gnd/115363688   Luise Augusta Herzogin von Sachsen-Weimar und Eisenach
        http://d-nb.info/gnd/117158542   Weber, Franz Anton
        http://d-nb.info/gnd/118540246   Goethe, Katharina Elisabeth

    cd:sending
        http://d-nb.info/gnd/118805193 -> http://d-nb.info/gnd/118540238
        http://d-nb.info/gnd/118540238 -> http://d-nb.info/gnd/118805193
        http://d-nb.info/gnd/118540238 -> http://d-nb.info/gnd/122361261
        http://d-nb.info/gnd/115363688 -> http://d-nb.info/gnd/118540238
        http://d-nb.info/gnd/117158542 -> http://d-nb.info/gnd/118540238
        http://d-nb.info/gnd/118540246 -> http://d-nb.info/gnd/118540238

XTriples: Configuration | Run this live: SVG | RDF/XML | Turtle
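
Once the correspondence is available as `cd:sending` edges, simple network questions become easy to ask. The sketch below hard-codes the six edges visible in the graph above (GND identifiers only) and counts outgoing and incoming edges per person. It is an illustration of the kind of analysis the extracted RDF enables, not part of XTriples itself.

```python
from collections import Counter

GND = "http://d-nb.info/gnd/"

# (source, target) pairs of the cd:sending edges shown in the graph above
EDGES = [
    ("118805193", "118540238"),
    ("118540238", "118805193"),
    ("118540238", "122361261"),
    ("115363688", "118540238"),
    ("117158542", "118540238"),
    ("118540246", "118540238"),
]


def degree_counts(edges):
    """Count outgoing and incoming edges for every node."""
    outgoing = Counter(source for source, _ in edges)
    incoming = Counter(target for _, target in edges)
    return outgoing, incoming


outgoing, incoming = degree_counts(EDGES)
goethe = "118540238"
print(f"{GND}{goethe}: {outgoing[goethe]} outgoing, {incoming[goethe]} incoming")
```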

correspSearch - Stage 02

Extracting / enriching Goethe's letter network in correspSearch

XTriples: Configuration | Run this live: SVG | RDF/XML | Turtle

05

XTriples

Outlook & Roadmap

Features and improvements

F I N I S

Thank you

Links, Software & Attribution

Links

Software used

Attribution