DBRepo, a repository for databases, was developed within the BMBWF-funded project FAIR Data Austria, which was recently completed successfully. The repository supports researchers in making their research data findable, accessible, interoperable, and reusable. To facilitate this, depositing data produces metadata that is stored in the metadata database and also indexed in DBRepo's search service. Currently, the stored metadata comprises database information (name, description), table information (name, description) and column information (name, type, nullable, unique, primary key), as well as the metadata required to issue a persistent identifier to a database and dataset. This (meta-)data is findable through a wildcard search, entered in a simple text box in the frontend.
This is useful for initial lookups of datasets of interest, but DBRepo currently has no structured search, as this was not part of the FAIR Data Austria project. For example, a researcher looks for datasets that contain ozone measurements, regardless of whether the dataset uses the term O3, Ozone, Ozon (German) or озон (Ukrainian). We refer to this type of search as semantic concept-based search. Similarly, datasets containing particle measurements should be findable and convertible no matter whether their unit of measurement is mg/m3, µg/m3, ppm or ppb. We refer to this type of search as unit of measurement-based search.
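A minimal sketch of semantic concept-based search, assuming each column has been annotated with a concept identifier. The identifier `concept:ozone`, the label table, the `Column` type and the dataset names are hypothetical and only illustrate the idea; they are not DBRepo's actual data model or API.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical mapping from a concept identifier to its known labels.
# Labels are stored case-folded so that lookups ignore letter case.
CONCEPT_LABELS = {
    "concept:ozone": {"o3", "ozone", "ozon", "озон"},
}


@dataclass(frozen=True)
class Column:
    dataset: str            # dataset the column belongs to (hypothetical names)
    name: str               # column name as chosen by the depositor
    concept: Optional[str]  # annotated concept identifier, if any


def resolve_concept(term: str) -> Optional[str]:
    """Map a search term in any supported language to a concept identifier."""
    needle = term.strip().casefold()
    for concept, labels in CONCEPT_LABELS.items():
        if needle in labels:
            return concept
    return None


def search_by_concept(term: str, columns: List[Column]) -> List[str]:
    """Return the sorted names of datasets with a column annotated with the term's concept."""
    concept = resolve_concept(term)
    if concept is None:
        return []
    return sorted({c.dataset for c in columns if c.concept == concept})


# Illustrative sample data, not real DBRepo content.
COLUMNS = [
    Column("air-quality-a", "o3_value", "concept:ozone"),
    Column("air-quality-b", "озон", "concept:ozone"),
    Column("traffic-counts", "vehicles", None),
]
```

With this sample data, searching for "Ozon", "O3" or "озон" resolves to the same concept and therefore returns the same two datasets, even though their column names differ.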
The next step: structured search
We want to extend DBRepo so that (meta-)data becomes findable through structured search that can handle both semantic concept-based and unit of measurement-based search queries. This will enhance the service provided by DBRepo in two ways:
- researchers who want to use high-quality data stored in DBRepo are able to find relevant datasets that match their search query, and
- data in DBRepo gains wider exposure than the existing wildcard search alone provides.
Another task of interest to other TU Wien researchers is retrieving datasets whose semantic concept and unit of measurement are known. For example, a researcher wants to retrieve all datasets from DBRepo that contain moderate ground-level ozone measurements ranging from 55 ppb to 70 ppb. In this case, a single query to the DBRepo search service will also retrieve results recorded in other units of measurement, such as ground-level ozone measurements of 118.2 µg/m3 (at 20 °C and 1 atm atmospheric pressure). We refer to this as unit-independent search.
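The conversion behind this example can be sketched as follows, assuming ideal-gas behaviour and a molar mass of ozone of about 47.998 g/mol. The function names and the supported unit strings are hypothetical; this is an illustration of the arithmetic, not the conversion logic DBRepo will use.

```python
R = 8.314462618      # molar gas constant in J/(mol*K)
M_OZONE = 47.998     # assumed molar mass of ozone in g/mol


def molar_volume_l(temp_c: float = 20.0, pressure_pa: float = 101_325.0) -> float:
    """Molar volume of an ideal gas in L/mol at the given temperature and pressure."""
    return R * (temp_c + 273.15) / pressure_pa * 1000.0


def ppb_to_ugm3(ppb: float, molar_mass: float = M_OZONE,
                temp_c: float = 20.0, pressure_pa: float = 101_325.0) -> float:
    """Convert a mixing ratio in ppb to a mass concentration in µg/m3."""
    return ppb * molar_mass / molar_volume_l(temp_c, pressure_pa)


def ugm3_to_ppb(ugm3: float, molar_mass: float = M_OZONE,
                temp_c: float = 20.0, pressure_pa: float = 101_325.0) -> float:
    """Convert a mass concentration in µg/m3 to a mixing ratio in ppb."""
    return ugm3 * molar_volume_l(temp_c, pressure_pa) / molar_mass


def in_ppb_range(value: float, unit: str, low_ppb: float, high_ppb: float) -> bool:
    """Check whether a stored ozone value lies in a query range given in ppb."""
    if unit == "ppb":
        ppb = value
    elif unit == "µg/m3":
        ppb = ugm3_to_ppb(value)
    else:
        raise ValueError(f"unsupported unit: {unit}")
    return low_ppb <= ppb <= high_ppb
```

Under these assumptions, 118.2 µg/m3 corresponds to roughly 59 ppb at 20 °C and 1 atm, so `in_ppb_range(118.2, "µg/m3", 55, 70)` matches the query from the example.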
The goals of this internal project are:
- Extend the indexed metadata in the search service to cover semantic concepts and units of measurement of columns for tables and allow structured search through facets that assist users in filtering results based on semantic concept and/or unit of measurement.
- Based on task 1, extend the metadata stored for each column that contains measurements so that it also captures the metadata needed to support conversion between units of measurement.
- Based on task 1, extend the search further to allow unit-independent search within the Ontology of units of measurements.
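The faceted filtering described in the first goal can be sketched as below. The index documents, field names and dataset names are hypothetical, and the in-memory list merely stands in for the search service; DBRepo's real index layout may differ.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical index documents: one entry per annotated column.
INDEX: List[Dict[str, str]] = [
    {"dataset": "air-quality-a", "column": "o3", "concept": "ozone", "unit": "µg/m3"},
    {"dataset": "air-quality-b", "column": "озон", "concept": "ozone", "unit": "ppb"},
    {"dataset": "dust-monitoring", "column": "pm10", "concept": "particulate matter", "unit": "µg/m3"},
]


def facet_counts(docs: List[Dict[str, str]], field: str) -> Counter:
    """Count how many documents carry each value of the given facet field."""
    return Counter(doc[field] for doc in docs)


def filter_docs(docs: List[Dict[str, str]], **facets: str) -> List[Dict[str, str]]:
    """Keep only the documents that match every selected facet value."""
    return [d for d in docs if all(d.get(k) == v for k, v in facets.items())]
```

A frontend could show `facet_counts(INDEX, "concept")` next to the result list and call `filter_docs(INDEX, concept="ozone", unit="ppb")` once a user selects those facets. Unit-independent search would then relax the `unit` filter by converting the query into each stored unit.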
The project team consists of:
The envisioned timeframe for the project is nine months. The awarded budget is close to the maximum amount for .dcall projects of 30,000 €.
More information on DBRepo
System description: https://doi.org/10.2218/ijdc.v17i1.825
Demonstration instance: https://dbrepo1.ec.tuwien.ac.at
Sandbox instance: https://dbrepo2.ec.tuwien.ac.at
Center for Research Data Management
Favoritenstraße 16 (top floor), 1040 Vienna