Publications

Preprints


CaBaGE: Data-Free Model Extraction using ClAss BAlanced Generator Ensemble

Authors: Jonathan Rosenthal, Shanchao Liang, Kevin Zhang, and Lin Tan
Download Paper

Machine Learning as a Service (MLaaS) is often provided as a pay-per-query, black-box system to clients. Such a black-box approach not only hinders open replication, validation, and interpretation of model results, but also makes it harder for white-hat researchers to identify vulnerabilities in MLaaS systems. Model extraction is a promising technique to address these challenges by reverse-engineering black-box models. Since training data is typically unavailable for MLaaS models, this paper focuses on the realistic setting of data-free model extraction. We propose a data-free model extraction approach, CaBaGe, that achieves higher model extraction accuracy with a small number of queries. Our innovations include (1) a novel experience replay that focuses on difficult training samples; (2) an ensemble of generators that steadily produces diverse synthetic data; and (3) a selective filtering process that queries the victim model with harder, more balanced samples. In addition, we introduce, for the first time, a more realistic setting in which the attacker has no knowledge of the number of classes in the victim's training data, and we present a solution that learns the number of classes on the fly. Our evaluation shows that CaBaGe outperforms existing techniques on seven datasets—MNIST, FMNIST, SVHN, CIFAR-10, CIFAR-100, ImageNet-subset, and Tiny ImageNet—improving the accuracy of the extracted models by up to 43.13%. Furthermore, the number of queries required to extract a clone model matching the final accuracy of prior work is reduced by up to 75.7%.
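The class-balancing idea behind innovation (3) can be illustrated with a minimal sketch. The function below is a hypothetical simplification and not the paper's implementation: it only caps how many synthetic samples per victim-predicted class enter a query batch, whereas CaBaGe's actual filtering also weighs sample difficulty.

```python
import numpy as np

def class_balanced_filter(samples, victim_preds, per_class):
    """Keep at most `per_class` synthetic samples for each class the
    victim model predicted, so the clone is trained on a more balanced
    query batch. Illustrative sketch only."""
    keep = []
    for c in np.unique(victim_preds):
        # indices of samples the victim labeled as class c, capped
        idx = np.flatnonzero(victim_preds == c)[:per_class]
        keep.extend(idx.tolist())
    keep = np.sort(np.array(keep))
    return samples[keep], victim_preds[keep]

# Toy usage: 6 synthetic samples, victim labels skewed toward class 0.
samples = np.arange(12).reshape(6, 2)
preds = np.array([0, 0, 0, 1, 1, 2])
balanced, labels = class_balanced_filter(samples, preds, per_class=2)
```

After filtering, no class contributes more than two samples to the batch, which is the balancing property the abstract describes.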

Published in arXiv

Academic Reports


Deep Reinforcement Learning for Pokemon Battling

Authors: Kevin Zhang
Download Paper | Download Slides

Games have long been seen as a test of skill and adaptation, for both humans and artificial algorithms. They vary widely in rule systems and complexity and can serve as test-beds for the development of new algorithms. One such game, popular with a wide variety of demographics all over the world, is Pokemon, and more specifically its system of battling. This project is inspired by recent interest in developing machine learning systems for Pokemon battling, and was carried out with the aim of building experience in Reinforcement Learning (RL) techniques to accomplish such a goal. Different RL algorithms are analyzed, then implemented in PyTorch. The agents are trained and tested in a Pokemon battling environment that has been integrated into an OpenAI Gym environment.
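The Gym-style interaction loop the report relies on can be sketched without the real battle simulator. The `TinyBattleEnv` class below is a hypothetical stand-in that only mimics Gym's `reset`/`step` interface; the report's agents interact with an actual Pokemon battling environment through this same protocol.

```python
class TinyBattleEnv:
    """Minimal stand-in for a Gym-style battling environment.
    Hypothetical: rewards a single 'good' action each turn."""
    def __init__(self, max_turns=10):
        self.max_turns = max_turns
        self.turn = 0

    def reset(self):
        self.turn = 0
        return 0  # initial observation

    def step(self, action):
        self.turn += 1
        reward = 1.0 if action == 1 else 0.0
        done = self.turn >= self.max_turns
        return self.turn, reward, done, {}  # obs, reward, done, info

def run_episode(env, policy):
    """Standard Gym rollout: observe, act, accumulate reward until done."""
    obs, total, done = env.reset(), 0.0, False
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        total += reward
    return total

# A fixed policy that always picks action 1 earns the maximum return.
episode_return = run_episode(TinyBattleEnv(), lambda obs: 1)
```

An RL agent would replace the fixed `lambda obs: 1` policy with a learned (e.g. PyTorch) policy trained on these rollouts.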


Characterization of Data Partitioning Techniques for Ensemble Methods in Automatic Program Repair

Authors: Kevin Zhang, Nan Jiang, Lin Tan
Download Paper | Download Slides

Fixing bugs in code is a time-consuming endeavor. Automatic Program Repair (APR) seeks to autonomously fix bugs present in source code through patch generation. Recently, the application of neural networks and deep learning techniques, including neural machine translation, to this field has yielded good results, achieving state-of-the-art fix rates on the Defects4J and QuixBugs benchmarks. Ensemble techniques have been used to improve the learning properties of these models and achieve better results. However, a systematic measurement of the effectiveness of different ensemble methods has not been carried out. Partitioning of the dataset for training with bagging was selected as a simple and comparable ensemble method. Clustering of the bug type via human categorization and clustering via the encoder hidden-state output of a pre-trained model were compared with random divisions for splitting the training data for ensemble models. This study then compared the results of these different ensemble methods with the same model design on the QuixBugs benchmark to determine their relative effectiveness. It was found that models trained on randomly partitioned data outperformed models trained on data clustered by both human categorization and machine embeddings, fixing 25 bugs on the QuixBugs benchmark as compared to 20 each for the two clustering methods. Further conclusions and observations about the performance of each approach, as well as recommendations for further approaches in ensemble techniques, are provided based on the comparison and analysis of results for these methods.
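The random-division baseline that this study found most effective can be sketched in a few lines. The function below is an illustrative assumption, not the study's code: it splits dataset indices into disjoint partitions, one per ensemble member, which is the "random divisions" scheme the clustering-based splits were compared against.

```python
import random

def random_partitions(n_items, k, seed=0):
    """Randomly split dataset indices into k disjoint partitions,
    one per ensemble member. Illustrative sketch of the
    'random division' baseline."""
    rng = random.Random(seed)  # seeded for reproducible splits
    idx = list(range(n_items))
    rng.shuffle(idx)
    # Deal shuffled indices round-robin into k near-equal partitions.
    return [idx[i::k] for i in range(k)]

# Toy usage: split a 10-example training set among 3 ensemble models.
parts = random_partitions(10, 3)
```

A clustering-based variant would instead group indices by bug category or by encoder hidden-state embeddings before assigning each cluster to an ensemble member.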
