Machine Learning talks in RSA Con 2017

The RSA Conference is one of the most widely attended security conferences in the world, and the 2017 edition, held in San Francisco, concluded just about 10 days ago.

There were close to 20 presentations this time on using Machine Learning (referred to as ML from here on) to detect or prevent cyber attacks of various kinds. In this post I share my take on, and a summary (detailed in some cases) of, the top 10 talks on ML.

Some of these talks, especially the research projects, deserve a detailed discussion and analysis, and I’ve tried to do justice to them by keeping my summary as detailed as possible. I plan to dive deeper into some of these topics in the future.

Note: I have included a link to the original Talk (presentation or video) wherever I could find them, so do check them out.

1. A Vision for Shared, Central Intelligence to Ebb a Growing Flood of Alerts

Dan Plastina, who heads Threat Protection at Microsoft, gave a talk on striking a balance between using ML for threat detection and using it in the Incident Management/Orchestration process, with linked graphs and chat bots in a “SecOPS Console”, to better manage the growing flood of security alerts. What I found interesting in this talk is the mention of a whole gamut of Microsoft products, many of which are familiar to us, like AD, Office and Azure Security Center. But I couldn’t tell whether Dan was also referring to an IR orchestration tool that Microsoft has built or has on its roadmap. Also, I see that R is being tightly integrated into various Microsoft products.

An interesting talk indeed, and here is the link to the original talk.

2. Advances in Cloud-Scale Machine Learning for Cyber-Defense

Another talk from Microsoft; this one by Mark Russinovich, the CTO for Microsoft Azure. This one was quite a deep dive into how Microsoft uses ML in detecting cyber attacks on the Azure platform. My quick notes below:

  • He started off with some metrics:
    • More than 10,000 location-detected attacks (detected/reflected attacks) – I didn’t understand what exactly he meant here.
    • 1.5 mil compromise attempts deflected
  • Red team and Blue team kill chain – it was interesting to see how each of the blue team’s “responses” is mapped to a stage of the red team’s malicious actions
    • Attack disruption shows execute stage before move stage
  • Their “supervised” learning approach enables detection with minimal FP – this is an interesting claim
  • “Attack disruption” requires us to think of ML beyond detection
  • He also covered the properties of a successful ML solution – adaptable, explainable, actionable, results in successful detection
  • Framework for a successful detection – honestly this is one of the best and simplest visual representations/explanations of what an ML based solution should look like. He also talks about two case studies, one where IPFIX data is used as a training set, and one on detecting malware using a combination of rules and ML
  • Then he goes deep into Case study 2 where he talks about the algorithms and compares fingerprint based detection to behaviour based.
  • Triage incidents not alerts – very valid point
  • In a nutshell – attack disruption means shortening the blue team kill chain
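
To make the “triage incidents not alerts” point concrete, here is a minimal sketch of rolling raw alerts up into incidents. This is not how Azure does it; the alert fields (`host`, `time`, `rule`), the one hour window and the function name `group_alerts` are all assumptions I made up for illustration.

```python
from collections import defaultdict


def group_alerts(alerts, window_seconds=3600):
    """Group alerts per host; a gap longer than window_seconds starts a new incident."""
    by_host = defaultdict(list)
    for alert in alerts:
        by_host[alert["host"]].append(alert)

    incidents = []
    for host in sorted(by_host):
        current = []
        for alert in sorted(by_host[host], key=lambda a: a["time"]):
            # Close the running incident if this alert is too far from the previous one.
            if current and alert["time"] - current[-1]["time"] > window_seconds:
                incidents.append({"host": host, "alerts": current})
                current = []
            current.append(alert)
        incidents.append({"host": host, "alerts": current})
    return incidents


# Hypothetical alerts; "time" is seconds since some arbitrary start.
alerts = [
    {"host": "vm-1", "time": 0, "rule": "suspicious login"},
    {"host": "vm-1", "time": 600, "rule": "new admin account"},
    {"host": "vm-1", "time": 90000, "rule": "port scan"},
    {"host": "vm-2", "time": 100, "rule": "malware signature"},
]
incidents = group_alerts(alerts)
for incident in incidents:
    print(incident["host"], [a["rule"] for a in incident["alerts"]])
```

Four alerts become three incidents here, so the analyst has fewer items to triage, each with more context attached.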

The video of the original talk is available here.

3. Combatting Advanced Cybersecurity Threats with AI and Machine Learning 

This one was by Andrew B. Gardner, Head of Symantec’s ML Program. My notes below:

  • Interesting perspective shared here, but a bit high level.
  • He starts off with comparing AI & ML and how they differ in cyber – interesting point about the use of ML in cybersecurity, rather than AI, for various reasons:
    • complex sequential data
    • not human intuitive (logs)
    • labels are expensive (scarce)
    • closed research models
  • Typical use of ML in cyber today: collect data sets > training algorithms > build a model > updated classifiers > ingested to another “threat detector”
  • Though the advantages of using ML in cybersecurity are clear, Andrew makes an interesting argument about its disadvantages:
    • dependency on data (quality, completeness), and system
    • adversaries also have access to ML
  • ML at Symantec
    • some interesting approaches to optimizing models were shown – the trade-off between true positive and false positive rates (ROC) and how to tune it
    • use of string scoring services – Charlatan

Link to the original talk is here.

4. Automated prevention of ransomware with Machine learning and GPOs

This talk was by Rod Soto (Security Researcher at Splunk) and Joseph Zadeh (Security Data Scientist at Splunk). My notes below:

  • Rod and Joseph started with some key aspects of detecting ransomware in the “new age” – behavioural modeling, unsupervised ML, anomaly detection and leveraging big data
  • Use of Aktaion tool kit for building the detection system
    • Take PCAPs of known (labeled) exploits and known (labeled) benign behavior and convert them to Bro format
    • Convert each Bro log to a sequence of micro behaviors (machine learning input)
    • Compare the sequence of micro behaviors to a set of known benign/malicious samples using a Random Forest Classifier
    • Derive a list of indicators from any log predicted as malicious
    • Pass the list of IOCs (JSON) to a GPO generation script
  • The key is to focus on the delivery of the exploit (in addition to using system specific and call back specific behaviours) – the following key steps were covered:
    • training a model (Random forest algorithm used in this case), to detect exploit delivery, using known malicious indicators
    • tuning the hyper parameters – risk factor, age, session time, entropy, etc.
    • model classifier built with 6 trees
    • the model will start generating output that separates signal from noise (they use the Splunk MLTK in this case)
    • link it to GPO scripts to automate the response procedures via PowerShell (active defense)
  • Training set and test data used in the demo include datasets from Contagio, DeepEnd Research, ransomware samples with some call back and file system level indicators, and labelled benign HTTP user traffic (anonymized Blue Coat logs)
  • The talk then ends with a PoC demo of this whole workflow
  • Summary: ML + GPO = Active Defense
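
The two ends of the workflow above can be sketched in a few lines: turning log records into features, and handing the flagged hosts over as JSON. This is my own illustration and not Aktaion’s code; the record fields, the feature names and the `blocked_hosts` key are assumptions. The classifier in the middle is left out, with a hard-coded prediction list standing in for it (in scikit-learn, a forest of 6 trees would be `RandomForestClassifier(n_estimators=6)`).

```python
import json
import math
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy in bits per character; 0.0 for an empty string."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def micro_behaviors(records):
    """Turn parsed HTTP log records into per-request feature dicts."""
    features = []
    for rec in records:
        features.append({
            "host": rec["host"],
            "uri_entropy": round(shannon_entropy(rec["uri"]), 3),
            "uri_length": len(rec["uri"]),
            "is_post": rec["method"] == "POST",
        })
    return features


def iocs_to_json(features, predictions):
    """Collect the hosts of records predicted malicious into a JSON document."""
    hosts = sorted({f["host"] for f, bad in zip(features, predictions) if bad})
    return json.dumps({"blocked_hosts": hosts}, indent=2)


# Hypothetical records standing in for parsed Bro HTTP log lines.
records = [
    {"host": "intranet.example.com", "uri": "/index.html", "method": "GET"},
    {"host": "bad.example.net", "uri": "/x9Qz7Lk2Vb8Tn4Rw", "method": "POST"},
]
features = micro_behaviors(records)
# Stand-in for the classifier output (True = predicted malicious).
predictions = [False, True]
ioc_json = iocs_to_json(features, predictions)
print(ioc_json)
```

A GPO generation script would then read this JSON and turn each entry into a policy; that part is specific to the environment, so it is not sketched here.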

Link to the original talk here.

5. Big Metadata: Machine Learning on Encrypted Communications

This one was by Jennifer Fernick and Mark Crowley, Security Researchers from University of Waterloo. My notes below:

  • This is derived from a research project, and was a very interesting session where not just the application of ML in cybersecurity was discussed, but also the inverse – security in the computational functions of ML
  • In this talk Jennifer and Mark talk about
    • ML research in cyber security – applying ML to problems in cybersecurity
      • using ML in cyber security
      • cybersecurity for ML – adversarial ML – study of ML systems in adversarial environments, where an attacker might train the system in hopes of modifying its behaviour to allow for an attack
      • a mid way – secure ways of computing ML functions
    • Candidate problems depend on information sources
    • Metadata – how can we use metadata for building the training set, while keeping privacy concerns intact?
    • ML 101 – a crash course
    • Their work in the field, and
    • Future direction
  • In the “security for ML” topic, there were some very interesting concepts presented – secure multi-party computation, privacy preserving data mining, homomorphic encryption, differential privacy. All these are deep mathematical and computational fields in themselves and definitely require intensive reading. And so I am going to stop at that!
  • In the “ML in cybersecurity” topic, some fundamental questions were called out – what problem am I trying to solve
    • securing my learning data?
    • learning my security data?
  • On “ML 101” topic, they give an excellent crash course on ML and how to use it in cybersecurity
    • use of clustering (unsupervised learning) and classification (supervised learning)
    • system design and algorithm choices
  • Their work in ML – use of ML on encrypted data – analysing private and public communication networks to detect anomalies
  • I have to confess I found this talk to be the most difficult to thoroughly grasp, as the talk was research oriented and definitely calls for an in depth reading on each of the sub-topics covered. A great presentation indeed!

Link to the original talk here.

6. Applied Cognitive Security: Complementing the Security Analyst

This one was by Vijay Dheap, Program Director, Cognitive Security at IBM.

  • This talk was primarily about IBM’s Cognitive Security product, built on Watson and their QRadar Security Intelligence Platform, and how it can help a security analyst detect, analyse and respond to security incidents faster.
  • The presentation was high level and didn’t get into the details of how Cognitive Security with IBM Watson actually works. For ex., what algorithms are used, what the typical hyper parameters are, and how they are used in conjunction with contextual feeds (vulnerability, asset, identity, behaviour) to detect security incidents more effectively.
  • The presentation did cover one case study with a Botnet use case, but didn’t reveal much (not even an indication) about the inner workings of how ML and Watson’s AI detected this incident.
  • A good “high level” talk overall.

Link to the original talk here.

7. Dealing with Millions of Anomalies

This one was by Chris Larsen, Threat Researcher with Symantec.

  • The talk was about detecting malicious traffic, by using ML (anomaly detection), and TI data
  • His first approach to picking the “interesting anomalies” out of millions of anomalies is to pick “One Hit Wonders” and “One Day Wonders”, and then investigate them further using various attributes (IP address licenses, ports used, are they DGA, etc.)
  • Once these “interesting anomalies” have been filtered out, run them against good TI to pick the most probable malicious traffic.
  • Summary: good TI is the key, and a good place to start is TI that carries malware/attack “families” context and industry/vertical/geo context.
  • Definitely an interesting talk with real world examples like using IOC data for Angler and Magnitude exploit kits, to filter out “most probable” malicious traffic, and then drilling further down from there.
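
Here is a rough sketch of that two step filter. The definitions are my assumptions, not necessarily Chris’s: a “One Hit Wonder” is a host seen exactly once in the data, and a “One Day Wonder” is a host seen several times but only on a single day. The hosts and the TI entry are made up.

```python
from collections import defaultdict


def find_wonders(events):
    """Split hosts into one-hit wonders and one-day wonders.

    events: iterable of (day, host) pairs, one per observed request.
    """
    hits = defaultdict(int)
    days = defaultdict(set)
    for day, host in events:
        hits[host] += 1
        days[host].add(day)
    one_hit = {h for h, n in hits.items() if n == 1}
    one_day = {h for h in hits if hits[h] > 1 and len(days[h]) == 1}
    return one_hit, one_day


def match_threat_intel(candidates, threat_intel):
    """Keep only the candidates present in the TI feed, with their family label."""
    return {h: threat_intel[h] for h in sorted(candidates) if h in threat_intel}


# Hypothetical traffic: (day, host) pairs.
events = [
    ("2017-02-01", "cdn.example.com"),
    ("2017-02-02", "cdn.example.com"),
    ("2017-02-02", "qzxv1.example.net"),
    ("2017-02-03", "burst.example.org"),
    ("2017-02-03", "burst.example.org"),
]
# Hypothetical TI feed mapping a host to a malware/attack family.
threat_intel = {"qzxv1.example.net": "Angler EK"}

one_hit, one_day = find_wonders(events)
suspicious = match_threat_intel(one_hit | one_day, threat_intel)
print(suspicious)
```

The TI lookup runs only on the small set of wonders, not on the full traffic, which is what keeps the analyst’s queue manageable.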

There is a video of Chris’s talk available here. Definitely worth watching.

8. Machine Learning: Cybersecurity Boon or Boondoggle

This one was by Dr. Zulfikar Ramzan, CTO of RSA.

  • The talk starts at an elementary level, covering the fundamentals of ML and its use in Cyber security.
  • But towards the end, Zulfikar covered some very interesting facts/tips/best practices while using ML in cyber security. For ex.:
    • The importance of ROC (Receiver Operating Characteristic Curve) while making a trade-off between True positive and false positive classifications.
    • ML (in this case unsupervised) is only helpful in detecting bad “actions”, not bad “intent”, and so it ends up calling out a lot of legitimate “unusual actions” as “bad/malicious”.
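
The ROC trade-off is easy to see with a handful of numbers. The sketch below is mine, not from the talk; the scores and labels are invented, and it assumes the data contains at least one malicious and one benign sample.

```python
def roc_points(scores, labels):
    """Return (false_positive_rate, true_positive_rate) pairs, one per threshold.

    scores: classifier scores, higher means more likely malicious.
    labels: 1 for malicious, 0 for benign.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Lower the threshold step by step; each step flags more samples.
    for threshold in sorted(set(scores), reverse=True):
        flagged = [label for s, label in zip(scores, labels) if s >= threshold]
        tp = sum(flagged)
        fp = len(flagged) - tp
        points.append((fp / negatives, tp / positives))
    return points


# Made-up scores for five samples, three malicious and two benign.
scores = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0]
points = roc_points(scores, labels)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

In this toy data, catching the third malicious sample means accepting that half of the benign samples get flagged too. That is exactly the kind of decision the ROC curve puts in front of you.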

Link to the original talk here.

9. Applied Machine Learning: Defeating Modern Malicious Documents

This one was by Evan Gaustad, Sr. Manager, CSIRT – Target.

  • The talk starts with the typical vulnerabilities exploited in Microsoft Office (macros), and some examples of the attack lifecycle using malicious documents
  • Evan then gets into the details of the project he has been working on, where he used supervised ML (classification) to detect malicious documents. There is a video recording of his talk here, and I strongly recommend it. He covers a lot of details of how the model and its classifier actually works, with examples.

There is a video of Evan’s talk available here. It’s a must watch.

10. An Introduction to Graph Theory for Security People Who Can’t Math Good

This one was by Andrew Hay, CISO, Data Gravity.

  • Though this talk didn’t actually cover how ML is used in detecting/preventing cyber attacks, it was a great crash course on graph theory (for the non-mathematicians amongst us), and how it can be extremely useful in visualising an attack lifecycle
  • Applications of graphs in a security context
    • incident response – use of Google’s Fusion Tables to visually represent the communication/interactions between user and entity in a security incident
    • actor tracking – detecting the source of a phishing campaign – using the IOCs available, use Maltego (CE)
  • What was interesting in this talk was – it is so easy to build a visual representation of the interaction. However, it can get way too complicated to interpret, due to a bad choice of dataset and the “vertices” (nodes) and “edges” (connections) in it.
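
For anyone who wants to play with the idea without a tool, here is a tiny sketch of vertices and edges in plain Python. The interactions are invented, and the names `build_graph` and `busiest_vertices` are my own.

```python
from collections import defaultdict


def build_graph(interactions):
    """Build an undirected adjacency map from (source, target) pairs."""
    graph = defaultdict(set)
    for source, target in interactions:
        graph[source].add(target)
        graph[target].add(source)
    return graph


def busiest_vertices(graph, top=3):
    """Rank vertices by degree (number of distinct neighbours), ties by name."""
    ranked = sorted(graph, key=lambda v: (-len(graph[v]), v))
    return [(v, len(graph[v])) for v in ranked[:top]]


# Hypothetical interactions observed during an incident.
interactions = [
    ("alice", "mail-gw"),
    ("bob", "mail-gw"),
    ("mail-gw", "198.51.100.7"),
    ("bob", "file-srv"),
]
graph = build_graph(interactions)
print(busiest_vertices(graph))
```

Even here the choice of what counts as a vertex (user, server, IP address) decides whether the picture is readable, which is the caveat from the talk.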

The link to the original talk is available here.


Thanks for reading through my point of view on RSA Con USA 2017. I hope I was able to provide a byte sized (mega!) summary of some of the most interesting talks at the conference this year.

PS: Do subscribe to this blog, to get notified the moment I publish my next post.


Interesting Data Science projects of 2015

Here is a list of some really interesting Data Science projects of 2015. Thanks to Jeff Leek from @simplystatistics for putting this together. 
Some of my picks from the list are:

* I’m excited about the new R Consortium and the idea of having more organizations that support folks in the R community.

* Emma Pierson’s blog and writeups in various national level news outlets continue to impress. I thought this one on changing the incentives for sexual assault surveys was particularly interesting/good.

* As usual Philip Guo was producing gold over on his blog. I appreciate this piece on twelve tips for data driven research.

* I am really excited about the new field of adaptive data analysis. Basically understanding how we can let people be “real data analysts” and still get reasonable estimates at the end of the day. This paper from Cynthia Dwork and co was one of the initial salvos that came out this year.

* Karl Broman’s post on why reproducibility is hard is a great introduction to the real issues in making data analyses reproducible.

* Datacamp incorporated Python into their platform. The idea of interactive education for R/Python/Data Science is a very cool one and has tons of potential.
