We meet Dr Adil Mukhtar at the Research Unit of Integral Building Technology at TU Wien after his recently published paper on code and data sharing gaps in building sciences caught our attention. Dr Mukhtar earned his master's degree in computer science in Pakistan, then moved to Austria six years ago to dive deeper into AI – and has since switched gears, now applying machine learning to building sciences. Our conversation ranges from training models for fault detection in building systems, the intricacies of simulation datasets and explainable AI methods, to the contrasting research cultures of building physics and computer science – and what these differences mean for sharing data.
AI-supported fault detection
“With machine learning models – if used in building systems or wherever – most of them, which are really efficient, inherently lack transparency and interpretability. So that's why they are called black box models. You often don't get to know the reasoning behind why the model has made that decision. And that's where explainable AI techniques come in.”
Adil Mukhtar describes how AI-driven fault detection and diagnosis is now standard in building research: models scan sensor data from HVAC (Heating, Ventilation, Air Conditioning) systems to spot anomalies and pinpoint root causes. This turns raw streams of temperature, flow and pressure readings into clear alerts – ideally triggering automatic work orders, setpoint adjustments or schedule changes. Since operators need to trust these alerts, explainable AI (XAI) techniques are necessary to show exactly which sensor values or patterns triggered the fault, so technicians can verify and act confidently rather than second-guessing automated decisions. But training such models requires comparable datasets from building systems – and these tend to be derived from simulations rather than from sensitive real-world data.
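To make this concrete, here is a minimal sketch of what such a fault-detection pipeline can look like in Python. The sensor names, the toy fault rule and all numbers are invented for illustration; none of it is taken from Dr Mukhtar's work.

```python
# Minimal sketch of ML-based fault detection on simulated HVAC sensor data.
# Feature names, fault rule and all values are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 5000

# Synthetic stand-ins for sensor streams: supply temperature (°C),
# water flow rate (l/s) and duct pressure (Pa).
supply_temp = rng.normal(18.0, 2.0, n)
flow_rate = rng.normal(1.2, 0.3, n)
duct_pressure = rng.normal(250.0, 40.0, n)

# Toy fault rule: a stuck valve shows up as high temperature plus low flow.
fault = ((supply_temp > 20.0) & (flow_rate < 1.0)).astype(int)

X = np.column_stack([supply_temp, flow_rate, duct_pressure])
X_train, X_test, y_train, y_test = train_test_split(
    X, fault, test_size=0.2, random_state=42, stratify=fault)

# Tree ensembles are a common, effective choice here - and a typical
# "black box" whose decisions later need explaining.
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```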
Working with simulated datasets
“The problem with working with machine learning in a building environment setting is that you usually don't have real-world data – NDAs and proprietary info, such as number of occupants, building structure, and more. So we turn to simulated datasets to train our models. But simulation has its own drawbacks. It's not perfect. It's not close to reality. It's an oversimplification of reality, and they have like their own initial biases.”
When possible, anonymised real-world data is shared within project consortia, but for fault detection, Dr Mukhtar has relied on simulated datasets published in top journals. Models trained on oversimplified simulations often fail to generalise to messy real-world conditions, amplifying errors when deployed in actual buildings. Explainable AI tools like SHAP help by revealing which simulated inputs – building characteristics, room layouts, fault flags – drive the model's output, though this transparency cannot fully overcome the simulation's inherent limitations. But for Adil, even trusted simulated datasets from top publishing venues only go so far: without detailed metadata on how those datasets were used to build and train the resulting model, other researchers cannot retrace the steps and parameters or verify the results.
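The SHAP workflow Adil refers to can be sketched in a few lines. The data and feature names below are again invented, and the snippet hedges over the fact that different versions of the shap library return class-wise values in different shapes:

```python
# Minimal sketch of explaining a trained fault-detection model with SHAP.
# Requires `pip install shap scikit-learn`; data and names are illustrative.
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([
    rng.normal(18.0, 2.0, n),    # supply_temp
    rng.normal(1.2, 0.3, n),     # flow_rate
    rng.normal(250.0, 40.0, n),  # duct_pressure
])
y = ((X[:, 0] > 20.0) & (X[:, 1] < 1.0)).astype(int)  # toy fault label

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# TreeExplainer computes exact SHAP values for tree ensembles.
explainer = shap.TreeExplainer(clf)
shap_values = explainer.shap_values(X[:200])

# Depending on the shap version, classifier output is a list per class
# or a 3-D array (samples, features, classes); take the fault class.
sv = shap_values[1] if isinstance(shap_values, list) else shap_values[..., 1]

# Mean absolute SHAP value per feature gives a global importance ranking.
for name, imp in zip(["supply_temp", "flow_rate", "duct_pressure"],
                     np.abs(sv).mean(axis=0)):
    print(f"{name}: {imp:.4f}")
```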
Metadata in machine learning
"As a machine learning practitioner, if I have the simulated dataset available from the paper, I will try to share that, but I'll also try my best to have open weights and explain how and why I split the datasets into training, validation and testing portions – so others can reproduce my model and outcome. Machine learning needs more sharing on initialisation – like the methodology part of it, you want to explain fine-tuning as in detail as possible."
Machine learning demands specifics on data splits, model fine-tuning, feature transformations, hyperparameter choices and best-performing weights to avoid lucky fits or conflicting replication results. Without this, others cannot retrain the model under the same conditions or verify whether strong results came from robust methods or chance – especially given the stochastic nature of AI, where different initialisations or splits can lead to wildly different outcomes. Dr Mukhtar recounts how, upon entering building sciences, he discovered significant differences in metadata sharing between computer science and building sciences. His recent paper, Reproducibility of machine learning-based fault detection and diagnosis for HVAC systems in buildings, reveals that building researchers prioritise descriptive metadata about experimental setups but often omit baselines and modelling details.
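A simple way to meet this demand is to persist the seed, split proportions and hyperparameters next to the trained model, so a replication can rerun the exact same conditions. The schema below is a minimal, illustrative sketch – not a format prescribed in Dr Mukhtar's paper:

```python
# Minimal sketch of recording the metadata needed to reproduce a training
# run: seed, split proportions and hyperparameters. Schema is illustrative.
import json
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

SEED = 1234
HYPERPARAMS = {"n_estimators": 200, "max_depth": 8, "random_state": SEED}
SPLIT = {"test_size": 0.2, "val_size": 0.1, "random_state": SEED}

# Placeholder data; a real run would load the simulated dataset here.
X, y = make_classification(n_samples=1000, n_features=8, random_state=SEED)

# Fixed, documented splits: carve out the test set first, then validation.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=SPLIT["test_size"], random_state=SEED, stratify=y)
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval,
    test_size=SPLIT["val_size"] / (1 - SPLIT["test_size"]),
    random_state=SEED, stratify=y_trainval)

model = RandomForestClassifier(**HYPERPARAMS).fit(X_train, y_train)

# Persist everything another researcher needs to retrain under the same
# conditions, stored alongside the model weights.
with open("run_metadata.json", "w") as f:
    json.dump({"seed": SEED,
               "hyperparameters": HYPERPARAMS,
               "split": SPLIT,
               "val_accuracy": model.score(X_val, y_val)}, f, indent=2)
```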
Outlook on reproducibility
“Around 70% of the papers and studies that we've reviewed were not sharing the dataset, nor any information on code or other supplementary material. But what I have observed is that building engineers are more likely to share descriptions of the dataset itself – there is a difference in research culture. All the metadata, the context of how the built environment is, the methodology or the experiment set-up is gladly provided. But generally, there is no requirement during the submission process for sharing your dataset, coding scripts or other material.”
Dr Mukhtar's paper emphasises that detailed metadata sharing – beyond basic building or lab descriptions – is essential for genuine reproducibility. Building engineers document physical contexts like room layouts and HVAC zones well, but are, in Adil's words, often “code shy” (uncertain about the publishing process, wary of exposing details, and so on), while machine learning researchers, when they do share supplementary material, tend to prioritise code transparency and model configurations. He sees little incentive for broader data sharing beyond personal drive, even when an intriguing paper prompts him to contact the authors directly. Reproducibility debates have lingered for decades amid “publish or perish” pressures and proprietary constraints in building projects. Top-down funder mandates with dedicated data-sharing work packages could drive change, and early signs are hopeful: publishers award badges, funders promote FAIR practices. Adil calls for heightened awareness: without cultural shifts and firm rules on code sharing, datasets and black-box models remain siloed, their potential lost to future research.
Contact
Adil Mukhtar
Research Unit of Integral Building Technology
TU Wien
adil.mukhtar@tuwien.ac.at
Center for Research Data Management
TU Wien
research.data@tuwien.ac.at
