
DICE @ TIB workshop about an Open Research Knowledge Graph

On March 20th, we were invited to a workshop at the TIB Hannover to discuss the vision for an Open Research Knowledge Graph (ORKG).

In the following, we describe how the DICE research group can contribute to such an ORKG and which technologies emerged from the discussions. The notes follow the timeline of the workshop.

 

Time

Topic

 

10:00 - 10:30

Overview Presentation on the Open Research Knowledge Graph Vision

Sören Auer: After a short introduction of the participants, whose backgrounds ranged from social science to chemistry, Sören explained the initial vision for an ORKG at the TIB Hannover. The audience and the speaker agreed that work carried out under this initiative should be as open as possible for both researchers and industry.

 

The workshop then kicked off under the theme “From researchers for researchers”.

10:30 - 11:45

Knowledge graph for research: Thoughts by external participants

 
 
  • Roland Ramthun is working on infrastructure for psychological information. He plans to publish different digital services, e.g. a semantic search service.

  • Bernd Müller implemented LIVIVO, a semantic search engine for life science, based on the VIVO technology. VIVO is an open-source database for scholarly data. In particular, different VIVO instances can be connected to query a larger base of information.

  • Michael Kohlhase: A working group including Michael is working on a world digital library for mathematics that will include several million research articles. He does not believe in the “semantic” solution but agrees that we need to go down from the container level (i.e. metadata) to the document level (i.e. content). He suggests the use of a Theory Graph (similar to an ontology graph), which can tolerate contradictory information. Theories should be first-class citizens in an ORKG to accommodate the real world. He invites the audience to overcome the one-brain barrier of science given the sheer number of research articles.

  • Benjamin Zapilko: A research graph for articles was created by the Research Data Alliance and then enriched with data from ORCID and other metadata sources. The research group is already able to identify, among other things, variables, data sources, claims and evidence. Clearly, publication mining is a heavily populated research area and will remain one in the future.

  • Christoph Lange-Bever focused on citation-based metrics and how alternative metrics can be applied, in particular to events such as conferences, workshops and even (informal) meetings. Here, data about events needs to be collected from various sources so that it can serve as the basis for new quality metrics.

  • Niklas Petersen discussed Sarven Capadisli’s effort to advance and automate decentralised publishing. He is seeking to extract information at the sentence level and build a first version of a knowledge graph. He even wants to pursue this in a startup. Very interesting, let’s support him!

  • Georg Rehm: Among other interesting topics, Georg deals with storing and representing different shades of grey in research data. He also mentioned the upcoming German “Nationale Forschungsdateninfrastruktur”.

  • Vitalis Wiens talked about Semantic Zooming into Knowledge Graphs to overcome the information overload using WebVOWL.

  • Harald Sack described his vision for a research knowledge graph and how it can be used to provide easy-to-read information. An example based on DBpedia can be found at http://scihi.org/

  • Bernd Müller, head of Semantic Information Retrieval at ZBMed, presented the LIVIVO platform and how an ORKG could support the search methodology.

 

11:45 - 12:15

Brief position statements by participants

  • Conceptualization of Research Contribution

  • Architecture, Storage, REST API

  • User interface demo

  • RDF/SPARQL/OWL in Neo4J

 

Next, the facilitators of the break-out groups presented their work topics:

  • Anna Kasprzik introduced ideas around a core ontology, which departs from an article-centric view. She even suggested that there is no top-level ontology and that we rather need a first working attempt.

  • Manuel Prinz presented an initial vision for the backend and architecture for ORKG. Currently, there is a small prototype based on Kotlin and Spring Boot. In the future, the development team wants to open the source code and make it a truly community-driven project hosted beyond TIB Hannover.

  • Viktor K. (employed at TIB) presented the current front-end prototype and discussed design decisions, e.g. presentation, authoring, validation, ranking of information.

  • Sören Auer presented his vision for creating a knowledge graph infrastructure that enables community-driven development of different ORKGs as well as visualisations and applications built on top of them.

  • Michael Kohlhase seconded the vision of a simple but meaningful data model (comparable with that of OpenStreetMap) to facilitate the development of an ORKG.

 

12:20 - 13:00

Lunch break

The lunch break offered us the opportunity to come together and learn about different novel research topics, such as patent research at FIZ Karlsruhe, LIVIVO from Cologne or recent developments around the upcoming National Research Data Infrastructure proposal by the German Government.

13:00 - 15:00

Work in parallel workgroups

  • Conceptualization and ontologies  

  • Data storage, backend, REST API

  • Frontend and ORKG user interface  

  • Pilot applications, testbeds and use cases, cooperation and governance  

 

After the lunch break, we closely followed the discussions around pilot application use cases, possible testbeds, cooperation, governance issues, licenses and publisher issues, as well as how to fund the ORKG.

The participants agreed on the use of an open license (including a copyleft clause) for the ORKG but the question of how to attach licensing information to each fact is still open. Michael Kohlhase suggested collaboration on data, and competition on the service level. Finally, the participants discussed whether the ORKG should contain only manually curated data or allow automatic additions.
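To make the two open questions above more tangible, here is a minimal Python sketch of one conceivable approach: every fact carries its own license identifier and a flag saying whether it was curated manually or added automatically. All names (`Fact`, `license_id`, `curated`, the example identifiers and licenses) are our own assumptions for illustration; none of this was decided or proposed at the workshop.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Fact:
    """One statement in a hypothetical research knowledge graph."""
    subject: str
    predicate: str
    obj: str
    license_id: str  # assumption: an SPDX-style license identifier per fact
    curated: bool    # True if entered manually, False if added automatically


def curated_only(facts: Iterable[Fact]) -> List[Fact]:
    """Keep only the manually curated facts."""
    return [f for f in facts if f.curated]


def licenses_used(facts: Iterable[Fact]) -> Set[str]:
    """Collect every license identifier that occurs in the given facts."""
    return {f.license_id for f in facts}


# Purely illustrative example data.
facts = [
    Fact("paper:1", "addressesProblem", "problem:example", "CC-BY-SA-4.0", True),
    Fact("paper:1", "usesDataset", "dataset:example", "CC0-1.0", False),
]
```

With such a structure, a service could choose to show only curated facts or check which licenses apply to a given result set. Whether licensing should really be tracked per fact is exactly the question that remained open.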

 

What do you think?

15:00 - 16:00

Concluding session with presentation of workgroup results and discussion of next steps

At the end, each group presented its work. Surprisingly, many initial ideas had changed in response to community needs.

The ontology group, led by Anna, discussed the core ontology and surveyed different candidates. The backend group, consisting of Manuel from TIB and Natanael Arndt, discussed design decisions such as the storage system, the data model and its relation to the vocabulary, and time dependence. They also discussed the decentralisation aspect of an ORKG implementation. The frontend group’s discussion had one central take-home message: visualisations have a purpose. That is, the initial visualisations need to be use-case-driven and responsive to user preferences and the underlying data detail model. Finally, Michael Kohlhase summarized the work of the last break-out group, and the workshop ended with closing remarks from TIB’s director Sören Auer.

Please note, this blog post is of course highly subjective and written from the view of the author. Soon there will be a website (http://orkg.org), so stay tuned for updates and follow us on Twitter @DiceResearch
