Abstract
The traditional paradigm for achieving exceptional performance in classification tasks is to use deep neural networks (DNNs) trained as “black boxes” on large quantities of labeled data. Unfortunately, two side effects of training DNNs this way are the risk of overfitting to the training set and a lack of understanding of what feature patterns the DNN has extracted from the data to make decisions. To analyze and mitigate these effects, we view the DNN as a composition of its feature extractor, which outputs a final feature vector, and its classifier function, which outputs a predicted label based on the encoding of the feature vector. For robust classification, we expect feature vectors within the same class to be similar and feature vectors from different classes to be dissimilar. In this research, we focus on the intra- and inter-class relationships among the feature vectors output by the feature extractor prior to classification, and we analyze how these relationships relate to test performance. We design a novel DNN evaluation method that operates effectively on either gathered or synthetic data, which are input to the DNN to create class-feature sets. We compute feature similarity metrics based on intra-class compactness, inter-class separation, and intra-class feature covariance, and we find that these metrics bound and correlate with test performance. These observations led us to derive a governing probabilistic bound on the deviation of a feature vector from its class using an alternative form of Chebyshev’s inequality. By augmenting the cross-entropy loss function with the terms of the inequality, we reduce the upper bound and regularize the DNN in a novel training method, which outperforms previous approaches in many settings.
Our feature characterization metrics can be used by the community to better understand the feature-performance relationship, compare pre-trained feature extractors, and implement a new training method that can efficiently mitigate overfitting.
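As a rough illustration of the kind of class-feature statistics described above, the following is a minimal sketch and not the method of this work. It assumes Euclidean distance as the similarity measure, uses the trace of the per-class covariance as a scalar summary of intra-class spread, and applies the standard vector form of Chebyshev's inequality, P(||x - mu|| >= t) <= tr(Sigma) / t^2, which may differ from the alternative form used here. The function names `class_feature_metrics` and `chebyshev_bound` are hypothetical.

```python
import numpy as np

def class_feature_metrics(features, labels):
    """Illustrative intra-/inter-class statistics for a set of feature vectors.

    features: array of shape (n_samples, n_dims), e.g. feature-extractor outputs.
    labels:   array of shape (n_samples,), the class of each feature vector.
    Assumes Euclidean distance and at least two classes (illustrative choices).
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)

    means, compactness, cov_trace = [], [], []
    for c in classes:
        x = features[labels == c]
        mu = x.mean(axis=0)
        means.append(mu)
        # Intra-class compactness: mean distance of each vector to its class mean.
        compactness.append(np.linalg.norm(x - mu, axis=1).mean())
        # Trace of the (biased) class covariance: mean squared distance to the mean.
        cov_trace.append(((x - mu) ** 2).sum(axis=1).mean())

    means = np.stack(means)
    # Inter-class separation: mean pairwise distance between class means.
    dists = np.linalg.norm(means[:, None, :] - means[None, :, :], axis=2)
    upper = np.triu_indices(len(classes), k=1)
    separation = dists[upper].mean()

    return {
        "classes": classes,
        "compactness": np.array(compactness),
        "cov_trace": np.array(cov_trace),
        "separation": separation,
    }

def chebyshev_bound(cov_trace, t):
    """Upper bound on P(||x - mu|| >= t) from the vector Chebyshev inequality."""
    return min(1.0, cov_trace / t ** 2)
```

Under these assumptions, a smaller covariance trace tightens the bound on how far a feature vector can stray from its class mean, which gives some intuition for why penalizing intra-class spread during training could act as a regularizer.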