|Functional Error Correction for Reliable Neural Networks
|Kunping Huang, Texas A&M University, United States; Paul H. Siegel, University of California, San Diego, United States; Anxiao Jiang, Texas A&M University, United States
|L.9: Learning Methods and Networks
|Statistics and Learning Theory
|When deep neural networks (DNNs) are implemented in hardware, their weights must be stored in memory devices. As noise accumulates in the stored weights, the DNN's performance degrades. This paper studies how to use error-correcting codes (ECCs) to protect the weights. Unlike classic error correction in data storage, the objective is to maximize the DNN's performance after error correction rather than to minimize the uncorrectable bit error rate of the protected bits. That is, viewing the DNN as a function of its input, the error correction scheme is function-oriented. A main challenge is that a DNN often has millions to hundreds of millions of weights, which causes a large redundancy overhead for ECCs, and that the relationship between the weights and the DNN's performance can be highly complex. To address this challenge, we propose a Selective Protection (SP) scheme, which chooses only a subset of important bits for ECC protection. To find such bits and achieve an optimized tradeoff between the ECC's redundancy and the DNN's performance, we present an algorithm based on deep reinforcement learning. Experimental results verify that, compared to the natural baseline scheme, the proposed algorithm achieves substantially better performance on the functional error correction task.
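The idea of protecting only a subset of bits can be illustrated with a small toy sketch. This is not the paper's deep reinforcement learning algorithm: it uses a fixed heuristic (protect the k most significant bits of every weight), a single linear unit in place of a DNN, 8-bit unsigned weights, and an idealized ECC that always restores protected bits. All names here (`corrupt`, `msb_mask`, `mean_abs_error`) are hypothetical and chosen only for illustration.

```python
import random

BITS = 8  # assumption: 8-bit unsigned fixed-point weights


def corrupt(weights, protected_mask, ber, rng):
    """Flip each unprotected bit independently with probability `ber`.

    Bits set in `protected_mask` are assumed to be perfectly restored by
    an idealized ECC, so they are never flipped here.
    """
    noisy = []
    for w in weights:
        for b in range(BITS):
            if not (protected_mask >> b) & 1 and rng.random() < ber:
                w ^= 1 << b
        noisy.append(w)
    return noisy


def output(weights, x):
    """Toy stand-in for a network: a single linear unit."""
    return sum(w * xi for w, xi in zip(weights, x))


def mean_abs_error(weights, x, protected_mask, ber, trials, seed):
    """Average |noisy output - clean output| over random noise draws."""
    rng = random.Random(seed)
    clean = output(weights, x)
    total = 0.0
    for _ in range(trials):
        noisy = corrupt(weights, protected_mask, ber, rng)
        total += abs(output(noisy, x) - clean)
    return total / trials


def msb_mask(k):
    """Mask selecting the k most significant bits of each weight."""
    return ((1 << k) - 1) << (BITS - k)


init_rng = random.Random(0)
weights = [init_rng.randrange(256) for _ in range(64)]
x = [1] * 64

for k in (0, 2, 4, 8):
    err = mean_abs_error(weights, x, msb_mask(k), ber=0.01, trials=200, seed=1)
    # k / BITS is only a rough proxy for ECC redundancy: the fraction of bits protected.
    print(f"protected MSBs per weight: {k}, fraction protected: {k / BITS:.2f}, "
          f"mean |output error|: {err:.2f}")
```

In this sketch, protecting more bits lowers the output error at the cost of more protected bits, which is the kind of redundancy-versus-performance tradeoff the abstract describes. Which bits actually matter in a real DNN is the harder question that the paper addresses with a learned selection.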