Transferred Discrepancy: Quantifying the Difference Between Representations

Published in arXiv preprint, 2020

Understanding what information neural networks capture is an essential problem in deep learning, and studying whether different models capture similar features is an initial step toward this goal. Previous works sought to define metrics over the feature matrices to measure the difference between two models. In this work, we propose a novel metric that goes beyond previous approaches. We argue that the metric should be designed around how learned representations are used in practice, namely on downstream tasks. To that end, we introduce the transferred discrepancy (TD), a new metric that defines the difference between two representations based on their downstream-task performance. We also find that TD may be used to evaluate the effectiveness of different training strategies. This suggests that a training strategy that leads to more robust representations also trains models that generalize better.

Recommended citation: **Yunzhen Feng**\*, Runtian Zhai\*, Di He, Liwei Wang, Bin Dong

Enhancing Certified Robustness of Smoothed Classifiers via Weighted Model Ensembling

Published in arXiv preprint, 2020

Randomized smoothing has achieved state-of-the-art certified robustness against l2-norm adversarial attacks. However, how to find the optimal base classifier for randomized smoothing is not yet fully resolved. In this work, we employ a Smoothed WEighted ENsembling (SWEEN) scheme to improve the performance of randomized smoothed classifiers. We show theoretically that SWEEN can be trained to achieve near-optimal risk in the randomized smoothing regime. We also develop an adaptive prediction algorithm to reduce the prediction and certification cost of SWEEN models. Extensive experiments illustrate the benefits of employing SWEEN.

Recommended citation: Chizhou Liu, **Yunzhen Feng**, Ranran Wang, Bin Dong