Michael Sikorski is a well-known expert in malware analysis. He is a Technical Director at Mandiant and a member of the Mandiant Labs (M-Labs) leadership team. Mr. Sikorski created a series of courses in malware analysis and teaches them to a variety of audiences including the FBI, NSA, private companies, and Black Hat. He is co-author of the book “Practical Malware Analysis,” published in early 2012. Mr. Sikorski received an MS in Computer Science specializing in security from Johns Hopkins University and a BS in Computer Engineering from Columbia University.
The first two questions that malware analysts receive from incident responders in the field are “Is this file malware?” and “Is this malware related to something we have seen before?” To answer these questions more rapidly, we decided to start integrating machine learning into our malware analysis process. This talk will discuss how we extract features and build models to cluster and classify malware, using the following two research tools:
1. Red Forest: Anti-virus products do not recognize all of the malware an analyst encounters in an environment, whether because signatures fail or because heuristics fail. To address this, we created a malware classification tool called Red Forest. The tool is flexible and easy to customize. In this talk, we will show the benefits of Red Forest and how you can extend its classification engine using plug-ins. You will see how users can train the system to build new models for use in their own environments. This is a sneak peek at a free tool that we’ll be releasing later this year.
2. Epar: To increase the speed at which we can analyze many samples at once, we started clustering malware into groups. A byproduct of that effort is that we can now more easily identify malware variants, which happen to make up the majority of the malware we encounter. Our solution is a clustering tool called Epar. It extracts a series of static and dynamic features and uses them to cluster malware. I’ll talk about how it works and demonstrate its capabilities.
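
As a rough illustration of clustering samples by extracted features, the Python sketch below groups samples whose feature sets overlap strongly. It is not Epar's algorithm, which this abstract does not describe. The sample names, the feature strings, the Jaccard similarity measure, the single-linkage grouping, and the 0.5 threshold are all assumptions chosen for the example.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two feature sets (1.0 means identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_samples(features, threshold=0.5):
    """Group samples whose feature sets have similarity >= threshold.

    Uses single-linkage grouping via union-find: if A is similar to B
    and B is similar to C, all three land in the same cluster.
    """
    parent = {name: name for name in features}

    def find(x):
        # Follow parent links to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(features, 2):
        if jaccard(features[a], features[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in features:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical samples with made-up static and dynamic features.
samples = {
    "sample_a": {"import:CreateRemoteThread", "import:WriteProcessMemory",
                 "mutex:xyz", "dns:one.example"},
    "sample_b": {"import:CreateRemoteThread", "import:WriteProcessMemory",
                 "mutex:xyz", "dns:two.example"},
    "sample_c": {"import:InternetOpenA", "section:.upx0", "file:a.tmp"},
}

clusters = cluster_samples(samples)
print(clusters)
```

Here `sample_a` and `sample_b` share three of five distinct features, so they are treated as variants of one another, while `sample_c` shares nothing with either and stays in its own group.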