Thursday, July 28, 2016

Nomisma.org crosses 100,000 coin threshold

Following the publication of RIC 6, 7, 8, and 10 to Online Coins of the Roman Empire, I extracted over 18,000 coins with references to these volumes from the British Museum SPARQL endpoint (with this query) and successfully matched and imported about 17,000 of these into Nomisma's SPARQL endpoint. There are now about 80,000 physical coins linked to the 38,000 Roman imperial coin types in OCRE. I also reprocessed the Berlin LIDO export and published more than 1,000 late Roman coins to OCRE.

This recent import has brought the total number of physical specimens linked to online type corpora (including Coinage of the Roman Republic Online and PELLA) to 116,964, about 98,000 of which come from the British Museum and American Numismatic Society alone. The number of coins has more than doubled in a year.


Technical Process

The XML response to the SPARQL query linked above was processed by a PHP script I wrote several days ago. The script iterates through each result in the XML document, parses the reference text with regular expressions to generate a type ID that conforms to the OCRE convention (e.g., ric.7.anch.109), and tests whether the corresponding URI exists. The result of this concordance process is written to a CSV file, which another PHP script then converts into RDF conforming to the Nomisma.org ontology. For each row in the table, an additional SPARQL query is executed against the British Museum endpoint to extract the weight, diameter, image, etc. The RDF is written to disk and then imported into the Nomisma SPARQL endpoint, where the data are immediately available in OCRE.
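
The concordance step can be illustrated with a minimal sketch. This is not the PHP script itself: it is written in Python, and the reference format ("RIC VII Antioch 109"), the mint lookup table, the OCRE base URI, and the names type_id and build_concordance are all assumptions made for illustration. Only the target ID ric.7.anch.109 comes from the example above. The existence test is passed in as a function, so the sketch runs without network access.

```python
import csv
import io
import re

# Assumed base URI for OCRE type records (not stated in the post).
OCRE_BASE = "http://numismatics.org/ocre/id/"

# Hypothetical lookup tables. Only the "anch" code is attested in the
# post (ric.7.anch.109); a real table would cover every mint.
VOLUMES = {"VI": 6, "VII": 7, "VIII": 8, "X": 10}
MINT_CODES = {"antioch": "anch"}

# Assumed shape of the reference text: "RIC <volume> <mint> <number>".
REF_PATTERN = re.compile(r"^RIC\s+(VIII|VII|VI|X)\s+([A-Za-z]+)\s+(\d+[A-Za-z]?)$")


def type_id(reference):
    """Return an OCRE-style type ID for a reference, or None if unparseable."""
    match = REF_PATTERN.match(reference.strip())
    if match is None:
        return None
    volume, mint, number = match.groups()
    code = MINT_CODES.get(mint.lower())
    if code is None:
        return None
    return f"ric.{VOLUMES[volume]}.{code}.{number.lower()}"


def build_concordance(rows, uri_exists):
    """Build CSV text mapping each coin to a type URI.

    rows is an iterable of (coin_uri, reference_text) pairs; uri_exists is
    a caller-supplied function that reports whether a type URI resolves.
    The type_uri column is left empty when no match can be confirmed.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["coin", "reference", "type_uri"])
    for coin_uri, reference in rows:
        tid = type_id(reference)
        uri = OCRE_BASE + tid if tid else ""
        if uri and not uri_exists(uri):
            uri = ""
        writer.writerow([coin_uri, reference, uri])
    return out.getvalue()
```

In practice uri_exists would issue an HTTP request for the candidate URI; keeping it as a parameter makes the parsing logic easy to check against a handful of known references before running it over the full result set.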
