Alessio Colucci

Fault Analysis and Mitigation for Modern Deep Learning Systems

In the last decade, Neural Networks (NNs) have become widely used due to their proficiency in complex tasks. However, while they have reached a level of performance and accuracy that allows system designers to deploy them even on resource-constrained IoT devices, few studies have focused on their reliability in safety-critical real-world scenarios. Most reliability research has focused on permanent faults, neglecting transient faults, which have become increasingly common with the further scaling of technology nodes. Additionally, real-world deployments use different optimization techniques, e.g., quantization and pruning, and different models, e.g., Spiking Neural Networks (SNNs), to further improve performance, while still relying on older mitigation techniques, e.g., Triple-Modular Redundancy, which are not cost-effective for NNs.
Hence, there is a need for further experiments and analysis of transient faults in NNs, especially for SNNs and compressed NNs, in order to develop better error models and mitigations. Our work has focused on developing a modern fault injection framework that makes it easier to analyze different NNs while remaining independent of the model or platform used. With this framework, different fault models can be tested and the injection results gathered to train a general error model for NNs. This model will be used to fine-tune the injected faults and to develop cost-effective mitigations for NNs.
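
To make the notion of a transient fault concrete, the sketch below shows one common fault model, a single bit flip in a 32-bit floating-point weight. It is only an illustration and not the framework described above: the function names `flip_bit_float32` and `inject_transient_fault` are hypothetical, and the choice of the single-bit-flip model and of IEEE 754 float32 storage is an assumption made for this example.

```python
from __future__ import annotations

import random
import struct


def flip_bit_float32(value: float, bit: int) -> float:
    """Return `value`, stored as an IEEE 754 float32, with one bit inverted.

    Hypothetical helper for illustration. Bit 31 is the sign bit,
    bits 30-23 the exponent, bits 22-0 the mantissa.
    """
    if not 0 <= bit < 32:
        raise ValueError("bit index must be in [0, 31]")
    # Reinterpret the float32 encoding as an unsigned 32-bit integer.
    (as_int,) = struct.unpack("<I", struct.pack("<f", value))
    # Invert the selected bit and reinterpret the result as a float32.
    (faulty,) = struct.unpack("<f", struct.pack("<I", as_int ^ (1 << bit)))
    return faulty


def inject_transient_fault(
    weights: list[float], rng: random.Random
) -> tuple[int, int, list[float]]:
    """Copy `weights` and flip one random bit in one random element.

    Returns the chosen element index, the chosen bit, and the faulty copy.
    The original list is left untouched, mirroring the fact that a
    transient fault does not permanently damage the stored model.
    """
    index = rng.randrange(len(weights))
    bit = rng.randrange(32)
    faulty = list(weights)
    faulty[index] = flip_bit_float32(weights[index], bit)
    return index, bit, faulty
```

The impact of such a fault depends strongly on the bit position: flipping the sign bit of 1.0 yields -1.0, while flipping its most significant exponent bit yields infinity. This spread in severity is one reason an error model derived from injection results can be more informative than treating all faults alike.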
