Networking Reference Structures

Author: Gertjan van Heijst

Introduction

The objective of the proposed research for this part of the RNA-project is a network of reference structures (NRS) that is both self-supporting and available online. Developers of thesauri and other reference structures can connect their reference structure to the NRS to make it available to all NRS users and to allow cross-referencing between their reference structure and other reference structures connected to the NRS. The added value of such an NRS over the current practice of isolated, idiosyncratic reference structures can be expressed in terms of synergy, quality and efficiency.

We have distinguished five design criteria for the NRS:

  • Existing reference structures can be embedded in the NRS with minimal effort.
  • The NRS infrastructure should allow cross referencing between terms in different reference structures that are part of the NRS.
  • Reference structures that are embedded in the NRS remain autonomous entities that can be edited and restructured by the owners of the reference structure without interference or supervision from the NRS.
  • Existing tools for development and maintenance of reference structures and collections can be made NRS-compatible with minimal effort.
  • The services of the NRS should be complementary to the services of individual reference structures and their support tools.

An NRS infrastructure that satisfies these requirements should support at least three functions:

  • registration;
  • assurance of interoperability;
  • access control.

Another indispensable function is version management. It is not an intrinsic part of the NRS infrastructure, however, and should be handled by supporting applications.

Registration

Put simply, registration means that the infrastructure maintains a list of embedded reference structures that can be consulted by users or tools to locate particular reference structures and to obtain specific information about each of them.
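As an illustration only, the registration function can be sketched as a small in-memory registry. The class names, the fields of an entry and the example.org addresses are assumptions made for this sketch; the proposal does not prescribe which information is recorded for each reference structure.

```python
from dataclasses import dataclass


@dataclass
class RegistryEntry:
    """Descriptive record for one embedded reference structure (hypothetical fields)."""
    identifier: str   # short unique name within the NRS
    title: str        # human-readable name
    owner: str        # party responsible for the reference structure
    endpoint: str     # address where the reference structure can be consulted


class Registry:
    """The list of embedded reference structures maintained by the infrastructure."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # Each reference structure is registered once under its identifier.
        if entry.identifier in self._entries:
            raise ValueError(f"already registered: {entry.identifier}")
        self._entries[entry.identifier] = entry

    def lookup(self, identifier):
        """Return the entry for one reference structure, or None if unknown."""
        return self._entries.get(identifier)

    def search(self, text):
        """Locate reference structures whose title contains the given text."""
        needle = text.lower()
        return [e for e in self._entries.values() if needle in e.title.lower()]


registry = Registry()
registry.register(RegistryEntry("birds", "Bird Species Thesaurus", "Museum A",
                                "https://example.org/birds"))
registry.register(RegistryEntry("places", "Geographic Places", "Archive B",
                                "https://example.org/places"))
```

A user or tool would call `registry.search("places")` to locate a reference structure and `registry.lookup("birds")` to obtain the information recorded about it.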

Assurance of interoperability

Interoperability assurance means that the participating reference structures share a common data model that allows for data sharing between the reference structures.

The common data model that the NRS should support has two requirements that at first sight conflict. On the one hand, the NRS should impose minimal requirements on the data models of the participating reference structures, to ensure easy embedding of arbitrary reference structures. This implies a 'minimal' common data model which represents a 'common denominator' of the participating reference structures. The downside of such a minimal model is that reference structures with a richer data model lose this 'richness' when viewed through the NRS, because the common data model then works like a filter. The second requirement is therefore that the NRS should allow participating reference structures to communicate the features of their data model that go beyond the common data model to users and, in particular, to tools that connect to the NRS.

An obvious candidate for the common data model is SKOS, the Simple Knowledge Organization System. SKOS is a developing W3C standard for modelling reference structures, and thesauri in particular. SKOS is based on the RDF data model and uses an XML syntax, both of which are widely adopted W3C standards as well.

To allow participating reference structures to extend the minimal data model, OWL is the proper candidate. OWL is an ontology language with sufficient expressive power to express all foreseeable extensions to the common data model that NRS reference structures would need. OWL, which is also a W3C standard, has a formal and therefore machine-interpretable semantics, so when reference structures publish the additional features of their data model in OWL, OWL-enabled tools connected to the NRS can interpret these features and act accordingly.

In summary: using SKOS as the common data model allows existing reference structures to be connected to the NRS easily, and using OWL as a means of extending the data model allows richer reference structures to inform tools connected to the NRS about their additional features.
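The 'filter' effect of a minimal common data model can be made concrete with a small sketch. It represents statements as plain (subject, property, value) tuples rather than using an RDF library. The SKOS namespace and the properties prefLabel, broader and exactMatch are taken from SKOS itself; the concept identifiers, the extension namespace and the wingspanCm property are invented for this example.

```python
# SKOS core namespace as published by the W3C.
SKOS = "http://www.w3.org/2004/02/skos/core#"
# Hypothetical extension vocabulary of one reference structure. In practice the
# owner would describe such a property in OWL (for example as a datatype
# property) so that OWL-enabled tools can interpret it.
EXT = "https://example.org/birds/schema#"

triples = [
    ("birds:robin", SKOS + "prefLabel", "European robin"),
    ("birds:robin", SKOS + "broader", "birds:songbird"),
    # Cross reference to a concept in another reference structure.
    ("birds:robin", SKOS + "exactMatch", "fauna:erithacus-rubecula"),
    # Feature that is not part of the common data model.
    ("birds:robin", EXT + "wingspanCm", "21"),
]


def common_view(statements):
    """What a tool sees that understands only the common (SKOS) data model."""
    return [t for t in statements if t[1].startswith(SKOS)]


def extension_view(statements):
    """The additional features that are filtered out of the common view."""
    return [t for t in statements if not t[1].startswith(SKOS)]
```

Here `common_view(triples)` keeps the label, the hierarchy and the cross reference, while the wingspan statement is visible only to tools that also understand the extension vocabulary.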

Access control

The access control mechanism of the NRS ensures that reference structures can only be consulted or modified by users with sufficient rights to do so.

The NRS infrastructure should take care of access control to facilitate 'reference structure surfing'. From the user's perspective, the most noticeable advantage of the NRS is the ability to jump from one reference structure to another by following the cross references. It would be extremely bothersome if users had to identify themselves to every reference structure they access. There should therefore be a single identification dialogue when the user starts an NRS session, and the NRS should keep track of the user's identity throughout the session. The access policy for a participating reference structure, on the other hand, should remain the responsibility of the owner of that reference structure.
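The division of responsibilities described above can be sketched as follows: the session records the user's identity once, and every reference structure carries its own policy, set by its owner. The class names, the two actions 'consult' and 'modify', and the rule that editors may also consult are assumptions of this sketch, not part of the proposal.

```python
class Session:
    """One NRS session: the user identifies once, when the session starts."""

    def __init__(self, user):
        self.user = user


class ReferenceStructure:
    """A participating reference structure with an owner-defined access policy."""

    def __init__(self, name, readers, editors):
        self.name = name
        self.readers = set(readers)   # users allowed to consult
        self.editors = set(editors)   # users allowed to modify (and consult)

    def allows(self, user, action):
        if action == "consult":
            return user in self.readers or user in self.editors
        if action == "modify":
            return user in self.editors
        return False                  # unknown actions are refused


def authorize(session, structure, action):
    """The NRS supplies the identity; the reference structure's policy decides."""
    return structure.allows(session.user, action)


session = Session("alice")
birds = ReferenceStructure("birds", readers={"alice", "bob"}, editors={"bob"})
places = ReferenceStructure("places", readers=set(), editors={"alice"})
```

With this arrangement the same `session` object is passed along when the user follows a cross reference from `birds` to `places`, so no second identification dialogue is needed, while each owner can still change `readers` and `editors` independently.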

Version management

The envisioned NRS infrastructure supports consultation of the embedded reference structures and use of the cross-references between these reference structures. Development and maintenance of reference structures will mostly take place in a restricted area of the NRS environment or outside the NRS environment. To support the reference structure maintenance cycle, the NRS infrastructure should provide the following features:

  • It must be possible to have a development version and a published version of a reference structure in the NRS environment.
  • Reference structure owners can publish their reference structures at the moment that suits them best. This could be hourly, daily, monthly, yearly or at any arbitrary moment in time.
  • Users of a reference structure (e.g. collection managers that use keywords from the reference structure) are informed about changes in the reference structure that are relevant to them (i.e. because keywords that they use have been modified, replaced or deleted).
  • Managers of other reference structures that link to a modified reference structure are informed about changes in the reference structure that are relevant to them (i.e. because keywords that they refer to have been modified, replaced or deleted).
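The features listed above can be illustrated with a minimal sketch that keeps a development version and a published version side by side and, on publication, reports to each subscriber only the changes to keywords that subscriber uses or links to. All names are hypothetical, keywords are reduced to an identifier with a label, and replacement of one keyword by another is not modelled separately.

```python
class VersionedStructure:
    """A reference structure with a development and a published version."""

    def __init__(self, name):
        self.name = name
        self.published = {}     # keyword id -> label, as visible to NRS users
        self.development = {}   # keyword id -> label, as edited by the owner
        self.subscribers = {}   # subscriber -> keyword ids they use or link to

    def subscribe(self, who, keyword_ids):
        self.subscribers[who] = set(keyword_ids)

    def publish(self):
        """Publish the development version; return the relevant changes per subscriber."""
        changes = {}
        for key in set(self.published) | set(self.development):
            old = self.published.get(key)
            new = self.development.get(key)
            if old is None:
                changes[key] = "added"
            elif new is None:
                changes[key] = "deleted"
            elif old != new:
                changes[key] = "modified"
        self.published = dict(self.development)

        notifications = {}
        for who, used in self.subscribers.items():
            relevant = {k: changes[k] for k in sorted(used) if k in changes}
            if relevant:
                notifications[who] = relevant
        return notifications


birds = VersionedStructure("birds")
birds.development = {"robin": "Robin", "wren": "Wren", "swift": "Swift"}
birds.publish()                                      # first publication

birds.subscribe("collection-manager", {"robin"})     # uses a keyword
birds.subscribe("places-owner", {"wren"})            # links to a keyword
birds.subscribe("archive-owner", {"swift"})          # links to an unchanged keyword

birds.development["robin"] = "European robin"        # modified
del birds.development["wren"]                        # deleted
notifications = birds.publish()                      # owner chooses the moment
```

Because `publish` is called explicitly, the owner decides when a new version becomes visible, and subscribers whose keywords did not change (here `archive-owner`) receive no notification.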