I am conducting research on enhancing the reliability and robustness of machine learning models, with a particular emphasis on detecting and mitigating diverse reliability challenges such as silent data corruptions (SDCs), soft errors, adversarial attacks, and out-of-distribution (OOD) inputs. The objective of this work is to build more trustworthy and dependable deep learning systems capable of maintaining stable performance under uncertain and potentially hostile conditions.
To this end, I utilize a range of widely adopted convolutional neural network architectures, including ResNet, AlexNet, DenseNet, and VGG. These models serve as representative baselines for analyzing how reliability issues propagate through different architectural depths and connectivity patterns. My experiments systematically evaluate the susceptibility of these models to different types of faults and perturbations, providing quantitative and qualitative insights into their reliability characteristics.
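For concreteness, the weight-level faults evaluated in these experiments can be emulated by flipping individual bits of a model parameter. The sketch below is a hedged illustration of that idea in NumPy, not my actual fault-injection harness; the function name and example values are illustrative only.

```python
import numpy as np

def flip_bit(weight: float, bit: int) -> np.float32:
    """Flip one bit of a float32 weight to emulate a transient soft error."""
    as_bits = np.float32(weight).view(np.uint32)   # reinterpret the float's bits
    corrupted = as_bits ^ np.uint32(1 << bit)      # flip the chosen bit position
    return corrupted.view(np.float32)              # reinterpret back as float32

# Flipping the sign bit (31) of 1.0 yields -1.0; flipping the top exponent
# bit (30) blows the value up to infinity -- exactly the kind of silent
# corruption that can propagate through a forward pass unnoticed.
```

Sweeping `bit` over all 32 positions of selected weights yields a simple exhaustive fault-injection campaign for small layers, and sampling positions at random scales the same idea to larger models.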
To address these challenges, I am implementing and evaluating several detection and mitigation techniques aimed at identifying abnormal model behavior at runtime. These include methods targeting specific reliability threats: the Dr. DNA framework (which conceptually targets the detection of silent data corruptions, though no official implementation is currently available), the Mahalanobis-based detector (which flags both adversarial and OOD inputs via distance-based anomaly scores in feature space), and the ODIN detector (which combines temperature scaling with input perturbation for OOD detection). I benchmark these methods across multiple datasets and attack scenarios to assess detection accuracy, computational overhead, and generalization.
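The scoring rules behind the Mahalanobis-based and ODIN detectors can be sketched compactly. The following is a minimal NumPy illustration of the underlying ideas, not the reference implementations: the function names are mine, and ODIN's input-perturbation step is deliberately omitted for brevity.

```python
import numpy as np

def fit_class_gaussians(features, labels):
    """Class-conditional means with a shared (tied) covariance, as used by
    Mahalanobis-style detectors."""
    classes = np.unique(labels)
    means = {c: features[labels == c].mean(axis=0) for c in classes}
    centered = np.vstack([features[labels == c] - means[c] for c in classes])
    cov = centered.T @ centered / len(features)
    # regularize slightly before inverting to keep the precision matrix stable
    precision = np.linalg.inv(cov + 1e-6 * np.eye(cov.shape[1]))
    return means, precision

def mahalanobis_score(x, means, precision):
    """Anomaly score: minimum squared Mahalanobis distance from feature
    vector x to any class mean. Larger scores suggest adversarial or OOD."""
    return min(float((x - mu) @ precision @ (x - mu)) for mu in means.values())

def odin_confidence(logits, temperature=1000.0):
    """ODIN-style temperature-scaled max-softmax confidence (the input
    perturbation step is omitted here). Lower confidence suggests OOD."""
    z = logits / temperature
    p = np.exp(z - z.max())   # numerically stable softmax
    return float(p.max() / p.sum())
```

In practice, `features` would be penultimate-layer activations of one of the CNN baselines above, and a threshold on either score separates in-distribution from anomalous inputs.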
Overall, my research contributes to a deeper understanding of how modern deep networks fail under unconventional or corrupted conditions, and explores principled mechanisms for improving fault tolerance and reliability in real-world machine learning deployments.

