Researchers frequently need to find, use, and publish research data as part of their work. To properly manage this data in an institutional context, we developed DBRepo, a repository for data in databases that assists researchers in making their research data findable, accessible, interoperable, and reusable. The system manages the researcher’s data and derives machine-actionable metadata from it. It gives users transparent access to their own data and lets others explore it. As a result of the .dcall 2023 project, we implemented a module that improves the findability of data, for example by semantic concept and independently of the unit of measurement.
Structured & unit-independent search
When the first datasets were deposited in DBRepo at the beginning of 2023, we found that findability was very limited because only a free-text search was available. Accuracy in particular had room for improvement, since a free-text search returned too many results that were not relevant to the search term.
This situation improved with the completion of the .dcall 2023 project, in which the search index was entirely re-modelled. It now contains an optimised replica of the metadata available in DBRepo, structured in an efficient data model. This allows for a structured, and therefore precise, search across all major components: databases, tables, columns, views, identifiers, users, concepts, and units of measurement. For example, you can search for databases that contain a semantic concept like wd:temperature. This is similar to webshops that let you filter by clothing size, colour, etc.
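The idea of a structured search by semantic concept can be illustrated with a minimal Python sketch. It does not reflect the actual DBRepo search index or its API: the `Database`, `Table`, and `Column` classes, the sample entries, and the function `find_databases_by_concept` are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Column:
    name: str
    concept: Optional[str] = None  # e.g. "wd:temperature"
    unit: Optional[str] = None     # e.g. "om2:degreeCelsius"


@dataclass
class Table:
    name: str
    columns: List[Column] = field(default_factory=list)


@dataclass
class Database:
    name: str
    tables: List[Table] = field(default_factory=list)


# Hypothetical index content, invented for this example.
INDEX = [
    Database("weather_vienna", [
        Table("measurements", [
            Column("temp", "wd:temperature", "om2:degreeCelsius"),
            Column("station"),
        ]),
    ]),
    Database("survey_results", [
        Table("answers", [Column("respondent"), Column("answer")]),
    ]),
]


def find_databases_by_concept(index: List[Database], concept: str) -> List[str]:
    """Return the names of databases with at least one column tagged with `concept`."""
    return [
        db.name
        for db in index
        if any(col.concept == concept for t in db.tables for col in t.columns)
    ]


matches = find_databases_by_concept(INDEX, "wd:temperature")
print(matches)
```

Because the filter compares a structured field rather than matching free text, a database that merely mentions "temperature" in a description would not be returned in this sketch.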
Additionally, to further increase the relevance of search results, a user can search datasets regardless of their unit of measurement, as long as they share a common semantic concept and have convertible units of measurement. This enables a unit-independent search, such as retrieving databases that contain the semantic concept wd:temperature with the units of measurement om2:degreeCelsius and om2:degreeFahrenheit. The search module knows the proper context and only shows results that match the source unit.
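A unit-independent match can be sketched as follows. This is an illustration under stated assumptions, not DBRepo's implementation: the conversion table `TO_CELSIUS`, the sample `COLUMNS`, and the functions `convert` and `search` are hypothetical, and treating degree Celsius as the pivot unit is simply a convenient choice for the example.

```python
# Hypothetical conversion table: unit -> (scale, offset) such that
# celsius = value * scale + offset.
TO_CELSIUS = {
    "om2:degreeCelsius": (1.0, 0.0),
    "om2:degreeFahrenheit": (5.0 / 9.0, -160.0 / 9.0),
}

# Hypothetical index rows: (database, column, concept, unit).
COLUMNS = [
    ("weather_vienna", "temp", "wd:temperature", "om2:degreeCelsius"),
    ("weather_boston", "temp_f", "wd:temperature", "om2:degreeFahrenheit"),
    ("air_quality", "pm10", None, None),
]


def convert(value, source_unit, target_unit):
    """Convert `value` between two units listed in TO_CELSIUS."""
    scale, offset = TO_CELSIUS[source_unit]
    celsius = value * scale + offset
    target_scale, target_offset = TO_CELSIUS[target_unit]
    return (celsius - target_offset) / target_scale


def search(columns, concept, unit):
    """Return databases whose columns share `concept` and whose unit
    is convertible to the requested `unit`."""
    if unit not in TO_CELSIUS:
        return []
    return [
        db
        for db, _col, col_concept, col_unit in columns
        if col_concept == concept and col_unit in TO_CELSIUS
    ]


print(search(COLUMNS, "wd:temperature", "om2:degreeCelsius"))
print(convert(212.0, "om2:degreeFahrenheit", "om2:degreeCelsius"))
```

In this sketch a query in degrees Celsius also finds the Fahrenheit dataset, because both columns carry the same semantic concept and their units can be converted into each other.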
.dcall 2023 final presentation: https://ec.tuwien.ac.at/~weise/pdf/dcall_final_presentation.pdf
We want to thank the .digital office for enabling the development with funding and great collaboration through the internal .dcall 2023, TU.it for the compute resources and great collaboration, as well as all open-source developers involved (Martin Weise, Sotirios Tsepelakis, Nikola Lukic, Max Spannring, Gökay Güçlü, Geoffrey Karnbach).
Center for Research Data Management
Favoritenstraße 14 (top floor), 1040 Vienna