
21. November 2025

Research Spheres: Venturing into new spheres of research

Conversion of the research plan to a data-based approach by 2026

[Image: Annotation tree, with words in black arranged in a circle on a white background. © Thomas Haschka]

Research is not for the disorganised: anyone who wants results has to plan, and that is something TU Wien has always done. Now, however, the university is moving to the next level. By the end of 2026, the Rectorate will shift its research strategy to a data-driven approach. Thomas Haschka, AI consultant for EuroCC Austria, is part of the research team, which brings together Emanuel Sallinger, Eleonora Laurenza, Alessandro Pesare, Roxana Dogaru and Max Tiessler of the “Knowledge Graph Lab” with Moritz Staudinger and Tatiana Beliaeva of Allan Hanbury's team. The team uses AI-based methods such as knowledge graphs and LLMs to analyse TU Wien’s past research output and lay the groundwork for future strategic decisions.

Around 4,600 people conduct research at TU Wien across 51 institutes. They publish roughly 700 articles each year in about 50 journals – an immense amount of data. Alongside project submissions and individual research profiles, this data now serves as the foundation for restructuring the research matrix introduced in 2009. The project in which this is taking place is called Research Spheres.

But why? What is the point of Research Spheres?

“Every university in Austria defined research priorities and related thematic areas years ago. Since research is a dynamic process, it is now time to revise this system,” says Elisabeth Schludermann, Senior Advisor for Research and project lead. “The existing research matrix is highly static and does not reflect many current topics – such as AI or high-performance computing. The project aims to develop approaches and tools that can represent research at TU Wien in a way that is appropriate and meaningful for its various audiences – data-driven and evidence-based.”

Thomas Haschka, a member of the project team, is an AI specialist at TU Wien’s dataLAB and advises the EuroCC Austria team on questions of artificial intelligence. The AI tool he developed is essential to the project: it makes it possible to analyse publication output in great detail and to group it by the similarity of the abstracts. This generates fresh impulses for potential research topics, which can then be developed further and help strengthen the research community.

30,000 abstracts as fuel for the AI

What exactly did Thomas Haschka do? Thanks to the EuroCC project, he was one of the first in Austria to gain access to the new MUSICA high-performance computer. Using 80 GPUs of the supercomputer, which has a total of 272 GPU nodes and 168 CPU nodes, he was able to implement an algorithm and deploy an associated large language model (LLM). Without MUSICA’s immense computing power, this part of the project would not have been possible. And even with MUSICA, annotating all research fields from all TU Wien abstracts took three days.

The “fuel” for the LLM consisted of around 30,000 abstracts from published articles and about 2,000 submitted projects (which usually result in publications later). All abstracts were provided by TU Wien’s research information systems unit, specifically from the ReposiTum (TU internal), Scival-Scopus and Dimensions databases. The project data came from TU-internal databases as well as Dimensions.

Haschka processed these tens of thousands of records, removed duplicates and kept only English texts to provide clean and comparable material for the AI model. He then generated an embedding vector for each abstract – a bit like creating a map where each article appears as a dot. If two articles deal with similar topics, the dots lie close together on this map. Based on these distances, Haschka formed groups – so-called clusters. All articles with related content end up in the same cluster.
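The steps in this paragraph can be illustrated with a small sketch. It is not the project's actual tool: the word-count `embed` function is a toy stand-in for a real embedding model, and the function names, the greedy grouping rule and the `threshold` value are assumptions chosen only to make the idea concrete.

```python
import math
from collections import Counter

def deduplicate(abstracts):
    """Drop repeated abstracts, ignoring case and whitespace differences."""
    seen, unique = set(), []
    for text in abstracts:
        key = " ".join(text.lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(text)
    return unique

def embed(text):
    """Toy stand-in for an embedding model: a sparse word-count vector."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Similarity of two sparse vectors; 1.0 means identical direction."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def cluster(vectors, threshold=0.3):
    """Greedy grouping (assumed rule): join the first cluster whose first
    member is similar enough, otherwise start a new cluster."""
    clusters = []  # each cluster is a list of indices into `vectors`
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine_similarity(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

# Made-up example abstracts, not project data.
abstracts = [
    "quantum spin qubits in silicon",
    "Quantum spin  qubits in silicon",  # duplicate up to case and spacing
    "silicon quantum dots host spin qubits",
    "timber and recycled concrete as sustainable building materials",
    "sustainable building materials from recycled timber",
]
unique = deduplicate(abstracts)
vectors = [embed(t) for t in unique]
groups = cluster(vectors)
print(groups)
```

In this toy data the two quantum abstracts land in one group and the two materials abstracts in another. A real pipeline would use dense vectors from a language model and a more robust clustering method, but the principle of grouping by distance is the same.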

From many clusters to a tree

All these clusters were then arranged hierarchically, like a tree: small groups of closely related articles form branches, which merge into larger limbs until all topics converge at the “trunk”. The closer the thematic similarity between abstracts, the nearer they sit in the tree.
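How such a tree can emerge from distances is sketched below. This is a minimal illustration of bottom-up (agglomerative) clustering, assuming 2-D points as stand-ins for embedding vectors and centroid distance as the merge criterion; the article does not say which method the project used.

```python
import math
from itertools import combinations

def centroid(points):
    """Mean position of a group of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def build_tree(points):
    """Repeatedly merge the two closest groups until a single root remains.

    Leaves are point indices; inner nodes are pairs (left, right).
    """
    # Each entry pairs a subtree with the points it contains.
    nodes = [(i, [p]) for i, p in enumerate(points)]
    while len(nodes) > 1:
        a, b = min(
            combinations(range(len(nodes)), 2),
            key=lambda pair: math.dist(centroid(nodes[pair[0]][1]),
                                       centroid(nodes[pair[1]][1])),
        )
        merged = ((nodes[a][0], nodes[b][0]), nodes[a][1] + nodes[b][1])
        nodes = [n for k, n in enumerate(nodes) if k not in (a, b)] + [merged]
    return nodes[0][0]

# Made-up positions: two tight pairs and one point lying apart.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.0, 5.2), (9.0, 0.0)]
tree = build_tree(points)
print(tree)
```

The closest points are merged first and therefore sit deepest in the tree, while the final merge at the root joins the most dissimilar groups.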

This is where the LLM comes into play. Its role was to find a name for each branch and cluster – a sort of overarching label describing what the articles are about. “Quantum physics”, for instance, or “sustainable materials”, “artificial intelligence” and so on. In the end, every branch has a label. This is crucial for recognising and understanding the various research areas being pursued.
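The labelling pass can be pictured as a walk over the tree that collects the abstracts beneath each branch and asks for a name. In the sketch below, `suggest_label` is a hypothetical stand-in that simply picks the most frequent word. In the project this step was performed by an LLM, whose prompts and interface are not described in the article. The tree format (nested tuples of abstract indices) is also an assumption.

```python
from collections import Counter

STOPWORDS = {"of", "in", "and", "for", "the", "with", "from", "as"}

def suggest_label(texts):
    """Hypothetical stand-in for the LLM call: most frequent non-stopword."""
    words = [w for t in texts for w in t.lower().split() if w not in STOPWORDS]
    return Counter(words).most_common(1)[0][0]

def label_tree(node, abstracts, labels=None):
    """Walk the tree bottom-up and label every branch.

    Returns (texts under this node, dict mapping each branch to its label).
    """
    if labels is None:
        labels = {}
    if isinstance(node, int):  # leaf: index of a single abstract
        return [abstracts[node]], labels
    texts = []
    for child in node:
        child_texts, _ = label_tree(child, abstracts, labels)
        texts.extend(child_texts)
    labels[node] = suggest_label(texts)
    return texts, labels

# Made-up abstracts and a hand-written tree over their indices.
abstracts = [
    "quantum spin qubits",
    "quantum error correction",
    "sustainable timber materials",
    "recycled sustainable concrete",
]
tree = ((0, 1), (2, 3))
_, labels = label_tree(tree, abstracts)
for branch, name in labels.items():
    print(branch, "->", name)
```

Because each branch is labelled from all abstracts beneath it, labels become more general towards the trunk, which matches the hierarchy described above.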
