Workshop: Linked Data and Syriac Sources, Amsterdam, March 2018
Around 30 scholars from more than a dozen different countries met in Amsterdam in mid-March for two days of discussions and presentations on developments in digital humanities in Syriac language and literature.
Participants were welcomed to the workshop and to Amsterdam by Professor Joke van Saane, Vice-Dean of the Faculty of Theology at VU Amsterdam, and Professor Wido van Peursen, the workshop organizer and host.
Following an overview of digital Syriac projects in the Netherlands (CALAP, Turgama, Polemics Visualized, Topic Visualizer for Syriac texts, LinkSyr: Linking Syriac Data (CLARIAH), Linked Data and Syriac Sources, Electronic Peshitta Text, e-CSCO), Professor van Peursen explained the methodology behind the projects, which aim to produce more reliable versions of Syriac texts than are currently available.
Hannes Vlaardingerbroek (Leiden/Amsterdam) presented an overview of the LinkSyr project, which collates tagged and untagged morphological data from existing projects and materials into a single dataset; 160,000 items have already been tagged, out of what will eventually comprise more than 1 million terms. However, there is not yet enough data to train reliable HMM language models: existing tagging methods for other Semitic languages, such as Hebrew and Arabic, rely on large corpora to train language models, and corpora of that size are not currently available for Syriac. Syromorph (BYU) claims high accuracy but is not yet compatible with the LinkSyr data. Mathias Coeckelbergs (Brussels and Leuven) discussed the nature of the data in more detail, along with longer-term plans such as linking terms to the syriaca.org database, providing automatic reading tools for non-Syriac specialists, and offering more efficient search facilities. The data has some limits, as the approach works by recognizing surface forms, and a single surface form can have multiple translations. Eventually, it is hoped that the classification of URIs will be more data-driven and searchable for specific collections of texts.
Following this, George Kiraz (Beth Mardutho) described the process for converting Syriac lexicons from image to text files in order to create an online, searchable dictionary as part of the SEDRA project. While SEDRA was designed specifically for Syriac, the project has the technical capability to be expanded to include other Semitic languages and is looking for funding to achieve this longer-term aim.
David A. Michelson (Vanderbilt) provided an update on the syriaca.org project, which has minted URIs for places, persons, primary source texts and citations (bibliographic items), and published them online. URIs relating to factoids (events), ontology (keyword classification) and manuscripts are available as raw data. The project is currently looking for someone to do the same for artifacts. Daniel L. Schwartz (Texas A&M) talked participants through the various features the site offers.
Jamie Walters (Oxford-BYU Syriac Corpus) talked participants through the structure and functions of the Oxford-BYU website and the new edition of Hugoye, to be launched this summer.
Daniel Stökl ben Ezra (EPHE Paris) demonstrated the interface and search functions offered by the ThALES lectionary database, which includes material in Syriac and Arabic.
In the afternoon, a number of breakout sessions discussed lexicography, named entities, liturgy, text corpus creation, scholars’ needs and interests, how to bridge Syriac linked data and the Syriac community, and linking to other traditions such as Arabic and Ethiopic. The sessions also brainstormed recommendations and suggestions for future projects.
The workshop provided a rare opportunity for face-to-face discussion and exchange amongst scholars working with Syriac in a variety of fields, and it is to be hoped that the connections made at the workshop will continue to develop to the benefit of current and future projects.
© International Qur’anic Studies Association, 2018. All rights reserved.