Research software

In the course of digitalisation, research software has become a central element of scientific work. This includes the development and use of research software for the simulation, generation, processing, analysis and visualisation of research data, and for the control of research equipment and experiments. The essential importance of research software poses new challenges for good scientific practice with regard to the transparency, traceability and reproducibility of research results. These standards are best met through open access to software according to the FAIR principles (Findability, Accessibility, Interoperability, Reusability).

Handreichung zum Umgang mit Forschungssoftware (translated quote)

Matthias Katerbow, Georg Feulner et al., 2018, page 1 (DOI 10.5281/zenodo.1172970)

More information

By archiving and making available software which you have developed yourself, you ensure that the functionality of your software can be verified and reproduced later and by others. The following tips may help you pursue this goal:

Archive your self-developed software in a dedicated software repository such as Software Heritage to give everyone access to your source code. To increase the visibility of your research, you can also use code repositories like GitHub or TU Wien's TUgitLab. If you develop workflows, repositories such as myExperiment may be right for you.
Assign a suitable license, as free as possible, to the source code you want to publish, such as the GNU General Public License (GPL) or the Apache License 2.0. The tool Choose a License can help you choose an open source license.
In addition to the code, provide a detailed description of your experiments and data sets which others can use to test your software. In Jupyter Notebooks, you can write code and document your experiment alongside it at the same time.
Additionally, help others reproduce your experiment with your specific configuration by providing a Docker container. ReproZip is also a helpful tool, since it bundles all relevant files such as data, software, libraries and environment variables.
You should always assign a persistent identifier to your code (e.g. a DOI through the GitHub integration in Zenodo; there will also be a corresponding link between TUgitLab and the TU data repository in the future). If this is not possible, state the exact date of the last change.
From the outset, specify in your data management plan which source code and related information should ultimately be archived, as well as how and where (see example).
As a rule, follow the standards of your discipline and create good README files to inform others about your experiment and approach.
Adhere to naming and coding conventions of the language used. This improves readability and allows others to easily navigate the structure and process of your project.
Follow the FAIR principles when making your software available.
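
One way to share "your specific configuration" for a Python-based experiment is to pin the exact versions of the packages it uses. The following is only a minimal sketch under that assumption: the function names and the package list (numpy, pandas) are hypothetical placeholders and are not taken from the tips above.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {name: version} for those of the given packages that are installed."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Skip packages that are not installed in this environment.
            continue
    return versions


def format_requirements(versions):
    """Render pinned requirements, one 'name==version' line per package."""
    return "".join(
        f"{name}=={version}\n" for name, version in sorted(versions.items())
    )


if __name__ == "__main__":
    # Hypothetical package list; replace it with what your experiment imports.
    print(format_requirements(installed_versions(["numpy", "pandas"])), end="")
```

The printed lines could be saved as a requirements.txt file and archived together with the code, so that others can install the same versions.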

In this experiment (DOI: 10.5281/zenodo.1209833), the association between alcohol consumption and UFO sightings was investigated. The following two data sets were used for this purpose:

1. Dataset: Alcohol Consumption
OECD (2018), Alcohol consumption (indicator).
DOI: 10.1787/e6895909-en
File Location: data/raw/DP_LIVE_22032018202902423.csv
File Size: 112K

2. Dataset: UFO Sightings
Sigmond Axel. (2014). ufo-reports (Version commitc0915f18186e5e2227083702049a838258001a2a) [Data set]. Zenodo.
File Location: data/raw/ufo-scrubbed-geocoded-time-standardized.csv
File Size: 13M

The following files should be archived as they are relevant to the reproducibility of the experiment:

README.md (text file with instructions for running the experiment)
both Jupyter Notebooks (experiment code and its documentation)
Dockerfile (build instructions for the Docker container)
requirements.txt (list of Python dependencies needed for the experiment)
documentation/architecture.png (diagram of the experiment architecture)
documentation/description.txt (text file which describes the correlation diagram)
documentation/metadata.xml (relevant metadata for the experiment)

Input data is not included in the archive, as it is already stored in other repositories and easily accessible.
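
Before depositing such a set of files, it can help to check that nothing is missing and to record a checksum for each file, so that later readers can verify the archive is intact. The sketch below is one possible approach, not part of the original experiment: the function names and the manifest layout are assumptions, and the two notebooks are left out of the list because their file names are not given here.

```python
import hashlib
from pathlib import Path

# File names taken from the list above; add the notebook file names yourself.
FILES_TO_ARCHIVE = [
    "README.md",
    "Dockerfile",
    "requirements.txt",
    "documentation/architecture.png",
    "documentation/description.txt",
    "documentation/metadata.xml",
]


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root, relative_paths):
    """Return (manifest_lines, missing_paths) for the files below root."""
    root = Path(root)
    lines, missing = [], []
    for rel in relative_paths:
        path = root / rel
        if path.is_file():
            lines.append(f"{sha256_of(path)}  {rel}")
        else:
            missing.append(rel)
    return lines, missing


if __name__ == "__main__":
    manifest, missing = build_manifest(".", FILES_TO_ARCHIVE)
    print("\n".join(manifest))
    if missing:
        print("Missing files:", ", ".join(missing))
```

Each manifest line has the form "checksum, two spaces, path", which is the layout commonly used by checksum tools, so the output could be stored next to the archived files.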

In addition to self-developed software, existing software is also used in research.

These applications must also be documented so that research results can be verified and reproduced. The documentation should include:

a detailed description and citation of the software version used and its configuration
a description of the parameters chosen and the software and hardware environment in which the software was used

The use of Docker containers may also make sense here.
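
The version, parameters and environment of third-party software can also be recorded in a small machine-readable file for each run. The following is a hedged sketch of that idea: the record layout, the function names and the example tool "ExampleTool" with its parameter are hypothetical and only illustrate the kind of information listed above.

```python
import json
import platform


def describe_run(software, parameters):
    """Collect software versions, chosen parameters and environment details."""
    return {
        "software": software,      # e.g. {"ExampleTool": "1.2.3"}
        "parameters": parameters,  # the settings chosen for this run
        "python_version": platform.python_version(),
        "operating_system": platform.platform(),
        "machine": platform.machine(),
    }


def to_json(record):
    """Serialise the record with stable key order so that files are comparable."""
    return json.dumps(record, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Hypothetical tool name, version and parameter, for illustration only.
    record = describe_run({"ExampleTool": "1.2.3"}, {"threshold": 0.5})
    print(to_json(record))
```

Storing this output next to the results gives readers a starting point for rebuilding a comparable environment. It does not replace a written description of the configuration.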

Already when selecting software, you should consider limitations on its use that could affect the reproducibility of your experiments and, if possible, avoid them. For example, consider the following points:

In the case of web applications, it is possible that the version you use will no longer be available later.
In the case of commercial off-the-shelf software, licensing costs and dependence on the software provider are factors which affect you and all other users.
