Reuse and adapt DataArc's code and expand your toolkit with the API

Code Base

All code developed for the DataArc project is open access. It is free to reuse and adapt under a Creative Commons license.

The source code for the search tool is at: https://github.com/castuofa/dataarc-ui

Technical documentation for the 'Why' section is here.

The API code is at: https://github.com/castuofa/dataarc-api

The datasets and metadata used in the tool are at: https://github.com/castuofa/dataarc-source

Data Preparation Code

The data for the Icelandic Saga Map and other related saga datasets are created by tagging the text with relevant concepts in Microsoft Word. These documents are saved as XML, which is then converted to GeoJSON for import into dataARC. The code to convert from XML to GeoJSON is available at: https://github.com/ropitz/docx-geojson-sagamap
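The conversion step can be illustrated with a minimal Python sketch. It is not the code in the repository above: the XML element and attribute names (`place`, `concept`, `lon`, `lat`) and the sample record are invented for illustration, and the XML that Word exports is considerably more complex.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical input: element names, attribute names and values are made up
# for this sketch and do not reflect the XML that Word actually exports.
SAMPLE_XML = """
<places>
  <place name="Example Farm" lon="-21.9" lat="64.56">
    <concept>farms</concept>
    <concept>woods</concept>
  </place>
</places>
"""

def xml_to_geojson(xml_text):
    """Convert the hypothetical <place> elements into a GeoJSON FeatureCollection."""
    root = ET.fromstring(xml_text.strip())
    features = []
    for place in root.iter("place"):
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # GeoJSON orders coordinates as [longitude, latitude].
                "coordinates": [float(place.get("lon")), float(place.get("lat"))],
            },
            "properties": {
                "name": place.get("name"),
                # Each tagged concept becomes an entry in the properties.
                "concepts": [c.text for c in place.findall("concept")],
            },
        })
    return {"type": "FeatureCollection", "features": features}

collection = xml_to_geojson(SAMPLE_XML)
print(json.dumps(collection, indent=2))
```

Keeping the tagged concepts in the feature properties is one plausible layout; the real converter may structure its output differently.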

Search Tool Architecture Overview

API

The DataArc search tool is built on an API. You can use the same API to carry out advanced analyses and queries, to visualise data in Jupyter or R notebooks, or to pull data from DataArc into your own project's infrastructure.

The API technical documentation is at: https://api.data-arc.org/documentation/. A brief API help guide is available at: https://dataarc-demo.readthedocs.io/en/latest/dataarc-api.html.

Use the GraphQL playground at https://api.data-arc.org/graphql to test your queries and learn to use the API.
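Once a query works in the playground, it can be sent from a script or notebook. The Python sketch below uses only the standard library. The query itself is hypothetical: the `datasets` field and its `id` and `title` subfields are assumptions, so check the playground's schema browser for the real names. It also assumes the endpoint accepts unauthenticated POST requests, which the documentation linked above should confirm.

```python
import json
import urllib.request

GRAPHQL_URL = "https://api.data-arc.org/graphql"

# Hypothetical query: the field names below are assumptions for illustration,
# not taken from the published schema.
QUERY = """
query {
  datasets {
    id
    title
  }
}
"""

def build_request(query, variables=None):
    """Build a GraphQL POST request without sending it."""
    body = json.dumps({"query": query, "variables": variables or {}}).encode("utf-8")
    return urllib.request.Request(
        GRAPHQL_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def run_query(query, variables=None):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(query, variables)) as response:
        return json.load(response)

# Building the request is safe offline; call run_query(QUERY) to contact the API.
request = build_request(QUERY)
print(request.get_method(), request.full_url)
```

Separating `build_request` from `run_query` makes it possible to inspect the payload before any network call is made.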

You can see examples of the API in use in our project's Jupyter notebooks. One looks at the levels of connectedness between concepts used in the project. Another provides a basic example of how to use a Jupyter notebook and the API to extract connected data.

API Connections Diagram

dataARC API conceptual schema

Category

Categories for the datasets

  • Has many Datasets

Dataset

A GeoJSON dataset with defined attributes

  • Belongs to one Category
  • Has many Combinators
  • Has many Dataset Fields
  • Has many Features

Dataset Field

Fields pulled from the GeoJSON file so that contributors can set field types and define display text

  • Belongs to one Dataset

Feature

Individual features broken out of the GeoJSON file when a dataset is processed

  • Belongs to one Dataset
  • Belongs to many Combinators

Combinator

Connects dataset features with concepts

  • Belongs to one Dataset
  • Has many Combinator Queries
  • Has many Concepts
  • Has many Features

Combinator Query

Query to perform on the dataset

  • Belongs to one Combinator

Concept

Simple concepts such as “woods” or “farms” that allow us to tie datasets together

  • Related to many Combinators
  • Has many Concept Topics

Concept Map

Topic map file provided by Rachel that contains topics and how they are connected

  • Has many Topics

Concept Topic

Topics parsed out of the concept map

  • Belongs to one Concept Map
  • Belongs to one Concept

Event

Events logged automatically by the API (create, update, delete, etc.)

Temporal Coverage

Predefined time periods for the timeline
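
The core relationships in the conceptual schema above can be sketched as Python dataclasses. This is an illustrative model only, not the API's implementation: the attribute names are assumptions, and Dataset Field, Concept Map, Concept Topic, Event and Temporal Coverage are left out for brevity. It shows how a Combinator ties a Dataset's Features to Concepts.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Concept:
    name: str

@dataclass
class Feature:
    properties: dict

@dataclass
class Combinator:
    """Connects dataset features with concepts."""
    name: str
    queries: List[str] = field(default_factory=list)
    concepts: List[Concept] = field(default_factory=list)
    features: List[Feature] = field(default_factory=list)

@dataclass
class Dataset:
    title: str
    category: str
    features: List[Feature] = field(default_factory=list)
    combinators: List[Combinator] = field(default_factory=list)

def concepts_for_feature(dataset, feature):
    """Return the sorted names of concepts linked to a feature via combinators."""
    names = set()
    for combinator in dataset.combinators:
        # Compare by identity so two features with equal properties stay distinct.
        if any(f is feature for f in combinator.features):
            names.update(c.name for c in combinator.concepts)
    return sorted(names)

# Hypothetical example data.
woods = Concept("woods")
farms = Concept("farms")
site = Feature({"name": "example site"})
other = Feature({"name": "another site"})
dataset = Dataset("Example dataset", "Example category", features=[site, other])
dataset.combinators.append(
    Combinator("wooded farms", queries=["hypothetical query"],
               concepts=[woods, farms], features=[site])
)

print(concepts_for_feature(dataset, site))   # ['farms', 'woods']
print(concepts_for_feature(dataset, other))  # []
```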