Our Events

6 September 2023, 10:00 to 16:00

Online course: Data Analysis and Data Preparation for Machine Learning


The one-day online course Data Analysis and Data Preparation for Machine Learning is aimed at professionals from all domains who would like to get a handle on their data. It shows participants how to get a feel for their data, how to visualize it, and how to clean it up where necessary. It also shows how to bring the data into a suitable shape before feeding it into Machine Learning (ML) algorithms further down the line. Please note that this course does not teach ML methods; these are covered in follow-up courses.

This online training is designed for engineers from all disciplines with some programming experience, as well as for professionals who would like to make use of the large amounts of data being collected, such as:

• Marketing professionals with programming experience
• Professionals in quality management with programming experience
• Professionals in machine maintenance with programming experience

Participants will learn how to:

• Get data into a suitable form
• Visualize data
• Clean data
• Transform data
• Analyze data
• Handle data that does not fit in memory
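
As a rough illustration of the cleaning, transforming and analyzing steps listed above, the following sketch uses Pandas and NumPy on a tiny, invented data set. The column names, the values and the 100-degree plausibility threshold are assumptions made for this example only; they are not taken from the course material.

```python
import numpy as np
import pandas as pd

# Hypothetical sensor readings; names and values are invented for illustration.
raw = pd.DataFrame(
    {
        "machine": ["A", "A", "B", "B", "B"],
        "temperature": [71.5, np.nan, 68.0, 68.0, 250.0],
    }
)

# Clean: drop exact duplicate rows, then rows with a missing reading.
clean = raw.drop_duplicates().dropna(subset=["temperature"])

# Clean: discard implausible values (the 100-degree limit is an assumed threshold).
clean = clean[clean["temperature"] < 100.0]

# Transform: standardize the column to zero mean and unit variance.
temps = clean["temperature"].to_numpy()
clean = clean.assign(temperature_scaled=(temps - temps.mean()) / temps.std())

# Analyze: mean temperature per machine.
summary = clean.groupby("machine")["temperature"].mean()
print(summary)
```

Which cleaning rules are appropriate depends entirely on the data at hand; the point here is only that each step maps to one or two short library calls.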


• Overview
Participants learn why data needs to be pre-processed before being passed to ML methods. They also learn what the typical challenges are in data wrangling.
• Pandas
Participants get to know this powerful Python library and find out how to load data into a data frame, get a feel for it, and transform it into the most suitable form.
• NumPy
Much of the Python ML ecosystem builds on this library for numerical operations. Participants will therefore get to know the most important aspects of its API and what can be achieved with it.
• Matplotlib
Humans are visual beings, which is why we prefer looking at graphs rather than endless tables of data. Matplotlib is a widely used Python library for creating all kinds of graphs, which makes data much easier to understand. Participants will learn how to create the most common types of graphs with Matplotlib.
• Dask
In ML problems, we often reach a point where our data does not fit into memory. Even when it does fit, we would like some operations to run faster. Dask addresses both issues by dividing the data into smaller, more manageable chunks and running computations on those chunks in parallel. This makes it possible to handle data that is larger than memory and can also speed up computations. Participants will get to know this tool and see its similarities to the previously learned libraries.
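
The split-and-combine idea behind the Dask topic can be sketched with the Python standard library alone. The example below is not Dask's API; it is a minimal illustration in which the function names (`chunks`, `partial_sum_and_count`, `chunked_mean`) are hypothetical, and a thread pool stands in for a real scheduler.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def chunks(values, size):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(values)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def partial_sum_and_count(chunk):
    """Reduce one chunk to a small summary that is cheap to combine later."""
    return sum(chunk), len(chunk)


def chunked_mean(values, size=1000, workers=4):
    """Compute a mean by combining per-chunk partial results."""
    # Note: Executor.map submits all chunks up front, so this shows the
    # pattern only; it is not a true out-of-core implementation.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum_and_count, chunks(values, size)))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


print(chunked_mean(range(1, 10_001), size=1000))  # prints 5000.5
```

The key design point is that each chunk is reduced to a small partial result (here a sum and a count) that can be merged without holding the full data set in one place. Threads are used only for simplicity; for CPU-bound pure-Python work they do not provide real parallel speed-up.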

Course Format:

This course will be delivered as a LIVE ONLINE COURSE (using Zoom) for remote participation. The participation links will be provided after purchase and before the training.


Participants are expected to have at least basic programming skills in Python.

Hands-on Labs:

The programming language of choice is Python and participants will get to know libraries such as NumPy, Pandas, Scikit-Learn, Matplotlib and Dask.

The content is delivered with Jupyter notebooks on Google Colab, so participants should have a Google account in order to be able to participate fully.

Date, Time, and Location:

6 September 2023, 10:00 - 16:00 CEST, with a 1-hour break at 12:00, LIVE ONLINE COURSE (using Zoom).

Course Material:

The course material will be available for registered attendees at course start.

Prices and Eligibility:

Price for the full course with certificate of attendance:
120 EUR/person (including VAT)



TU Wien,
Zoom, Online



EIT Manufacturing CLC East, EuroCC Austria, Vienna Scientific Cluster
Simeon Harrison (Coordinator Training for Industries, EuroCC Austria and VSC)











Registration required


Register now via EIT Manufacturing CLC East: https://eitmanufacturing-east.eu/product/data-analysis-and-data-preparation-for-machine-learning/