Language Independent Metadata Browsing of European Resources
LIMBER involved five national social science data archives, from the UK, Germany, Greece, France and Norway, which came together to develop a multilingual user interface to the data stored in social science archives across Europe.
Social science archives contain the datasets from many studies by governments and academia. The datasets are described by common metadata, using a set of terms defined in a common multilingual thesaurus. The LIMBER project developed multilingual tools to support user access to the data stored at social science archives across Europe and to integrate it with data from other domains. This made the information available throughout Europe for users to retrieve and communicate in the language(s) of their choice, as a basis for further research, policy making and planning by individuals, companies and government organisations.
The data themselves were coded alphanumeric values from many thousands of social studies, so they did not require translation. Only the metadata needed translating: the definitions of the fields, code values, etc. used to interpret the data.
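To make the distinction between data and metadata concrete, the following is a minimal sketch, not the LIMBER implementation. The variable name `EMPSTAT`, its code values and the translated labels are all hypothetical and chosen only for illustration. It shows how the stored codes can stay identical in every language while only the labels that explain them vary.

```python
# Hypothetical metadata for one coded variable. The codes (1, 2) are
# language-neutral; only the labels used to interpret them are translated.
VARIABLE_METADATA = {
    "name": "EMPSTAT",  # hypothetical field name, not taken from a real study
    "label": {
        "en": "Employment status",
        "fr": "Situation professionnelle",
        "de": "Erwerbsstatus",
    },
    "codes": {
        1: {"en": "Employed", "fr": "En emploi", "de": "Erwerbstätig"},
        2: {"en": "Unemployed", "fr": "Au chômage", "de": "Arbeitslos"},
    },
}


def decode(value, lang, metadata=VARIABLE_METADATA, fallback="en"):
    """Return the label for a coded value in the requested language.

    Falls back to the `fallback` language when no translation exists.
    """
    labels = metadata["codes"][value]
    return labels.get(lang, labels[fallback])


# The same stored records can be presented in any supported language.
records = [1, 2, 1]
french_view = [decode(v, "fr") for v in records]
german_view = [decode(v, "de") for v in records]
```

The records list is never modified: supporting another language in this sketch means adding labels to the metadata, not touching the data.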
The key objectives were to:
- research and evaluate a metadata model and representation (in XML and RDF) for social science datasets to allow their integration within and across data archives
- develop and evaluate a multilingual thesaurus to be used to index and access social science datasets in data archives
- develop and evaluate a multilingual query and retrieval tool that assists queries and retrieval with explanations and draws on the multilingual thesaurus, so that queries to dataset archives can be made in several languages, with keywords and phrases translated in the retrieved data and metadata
- develop and evaluate tools to support the construction and maintenance of datasets in an archive using automatic indexing, drawing on the multilingual thesaurus
- evaluate an XML metadata server using the metadata model and representation for social science datasets and the multilingual thesaurus
The multilingual interface rests on two elements. First, a multilingual thesaurus of social science terms was constructed. Second, a metadata model was developed using the Resource Description Framework (RDF), the World Wide Web Consortium (W3C) standard for metadata, to allow the construction of semantic definitions of the terms in the thesaurus, making them more interpretable by users, and to provide a standard basis for query and retrieval tools. Together these two elements have provided the basis for worldwide access to these valuable data resources in a user's own language. Tools were also developed to construct and maintain the metadata.
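As an illustration of how a multilingual thesaurus can support queries in a user's own language, here is a minimal sketch under stated assumptions. It does not reproduce the LIMBER tools or the project's RDF model: the `Concept` class, the concept identifiers, the sample terms and the functions `find_concept` and `translate_term` are all hypothetical. The idea shown is that each concept carries one preferred label per language, so a keyword entered in one language can be mapped to a concept and then rendered in another.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One thesaurus concept with a preferred label per language (hypothetical model)."""
    concept_id: str
    labels: dict  # language code -> preferred label
    broader: list = field(default_factory=list)  # ids of broader concepts


# Hypothetical sample entries; not taken from any real thesaurus.
THESAURUS = {
    "c1": Concept(
        "c1",
        {"en": "unemployment", "fr": "chômage", "de": "Arbeitslosigkeit"},
        broader=["c2"],
    ),
    "c2": Concept(
        "c2",
        {"en": "labour market", "fr": "marché du travail", "de": "Arbeitsmarkt"},
    ),
}


def find_concept(term, lang):
    """Look up a concept by its label in the given language, ignoring case."""
    needle = term.strip().casefold()
    for concept in THESAURUS.values():
        label = concept.labels.get(lang)
        if label is not None and label.casefold() == needle:
            return concept
    return None


def translate_term(term, source_lang, target_lang):
    """Translate a query keyword via its thesaurus concept, or return None."""
    concept = find_concept(term, source_lang)
    if concept is None:
        return None
    return concept.labels.get(target_lang)


# A French keyword is resolved to a concept, then expressed in English.
english_keyword = translate_term("Chômage", "fr", "en")
```

Because matching happens on concepts rather than on strings, datasets indexed with a concept can be retrieved whichever language the query keyword was typed in.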
The UK Data Archive led the development of the multilingual thesaurus, which was based on the Archive's own HASSET thesaurus.