
Marcus Botacin, Malware Detection under Concept Drift: Science and Engineering
The current largest challenge in ML-based malware detection is maintaining high detection rates while samples evolve, causing classifiers to drift. What is the best way to solve this problem? In this talk, Dr. Botacin presents two views on the problem: the scientific and the engineering. In the first part of the talk, Dr. Botacin discusses how to make ML-based drift detectors explainable. The talk discusses how one can split the classifier knowledge into two: (1) the knowledge about the frontier between Malware (M) and Goodware (G); and (2) the knowledge about the concept of the (M and G) classes, to understand whether the concept or the classification frontier changed. The second part of the talk discusses how the experimental conditions in which the drift handling approaches are developed often mismatch the real deployment settings, causing the solutions to fail to achieve the desired results. Dr Botacin points out ideal assumptions that do not hold in reality, such as: (1) the amount of drifted data a system can handle, and (2) the immediate availability of oracle data for drift detection, when in practice, a scenario of label delays is much more frequent. The talk demonstrates a solution for these problems via a 5K+ experiment, which illustrates (1) how to explain every drift point in a malware detection pipeline and (2) how an explainable drift detector also makes online retraining to achieve higher detection rates and requires fewer retraining points than traditional approaches. About the speaker: Dr. Botacin is a Computer Science Assistant Professor at Texas A&M University (TAMU, USA) since 2022. Ph.D. in Computer Science (UFPR, Brazil), Master's in Computer Science and Computer Engineering (UNICAMP, Brazil). Malware Analyst since 2012. Specialist in AV engines and Sandbox Development. Dr. Botacin published research papers at major academic conferences and journals. Dr. Botacin also presented his work at major industry and hacking conferences, such as HackInTheBox and Hou.Sec.Con.Page: https://marcusbotacin.github.io/
정보
- 프로그램
- 주기매주 업데이트
- 발행일2025년 10월 29일 오후 8:30 UTC
- 길이52분
- 시즌31
- 에피소드897
- 등급전체 연령 사용가