Deep Learning – Limitations and its future

Introduction

The origin of this blog post is the recent debate between Elon Musk and Mark Zuckerberg on whether AI is good or bad for humanity.

Elon is an inspiration to many of us around the world, especially to anyone entrepreneurial, and also to us machine learning enthusiasts, for his thoughts on AI and its applications in his companies (e.g., Tesla's autonomous driving via its Autopilot capabilities).

(Image: Autopilot in action, in a Tesla Model S)

But sometimes I tend to differ with his points of view on certain topics – for example, that we have to leave Earth and go to Mars to sustain humanity (SpaceX was founded primarily to make this possible sooner), and that robots with super-intelligence will take over the planet soon. Elon is a visionary, and like Stephen Hawking, he believes that AI could one day supersede humans. He is right; or rather, he could be right. But the point I would like to make in this blog post is that AI, and deep learning in particular, is at a very nascent stage, and considering the capabilities we have built into AI systems so far, I am pretty sure that the doomsday scenario Elon points to is definitely not in the near future.

As Andrew Ng, the co-founder of Coursera and former chief scientist at the Chinese technology powerhouse Baidu, recently pointed out at a Harvard Business Review event, the more immediate problem we need to address is job displacement due to automation. That is the area we must focus on, rather than being distracted by science-fiction-like, dystopian elements.

Limitations of Deep learning

At the recently held AI By The Bay conference, Francois Chollet, an AI researcher at Google and inventor of the widely used deep learning library Keras, spoke about the limitations of deep learning. He said that deep learning is simply a more powerful pattern recognition system than previous statistical and machine learning methods. "The most important problem for AI today is abstraction and reasoning," said Chollet.

Current supervised perception and reinforcement learning algorithms require lots of data, are terrible at planning, and are merely doing straightforward pattern recognition. By contrast, humans are able to learn from very few examples and can do very long-term planning. We are also capable of forming abstract models of a situation and manipulating those models to achieve "extreme generalization".

Let's take an example of how difficult it is to teach simple human behaviours to a deep learning algorithm: the task of not being hit by a car as you attempt to cross a road. In the case of supervised learning, we would need huge datasets of such (vehicular movement) situations with clearly labeled actions to take, such as "stop" or "move". Then you'd need to train a neural network to learn the mapping between a situation and the appropriate response action. If we go the reinforcement learning route, where we give an algorithm a goal and let it independently determine the appropriate actions to take, the computer would need to die thousands of times before it learns to avoid vehicles in different situations. In summary, humans only need to be told once to avoid cars. We're equipped with the ability to generalize from just a few examples and are capable of imagining (modeling) the dire consequences of being run over by a vehicle. And so, without ever (in most cases) losing our lives or hurting ourselves significantly, most of us quickly learn to avoid being run over by motor vehicles.
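To make the supervised formulation concrete, here is a minimal, hypothetical sketch in Keras; the eight-dimensional "situation" features and the random labels are stand-ins for what would, in reality, be a massive labelled dataset coming out of a perception pipeline:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Hypothetical training data: each row encodes one traffic situation
# (distance to the nearest vehicle, its speed, signal state, ...),
# and each label is the action a human took: 0 = "stop", 1 = "move".
X_train = np.random.rand(10000, 8)              # stand-in for a huge labelled dataset
y_train = np.random.randint(0, 2, size=(10000,))

model = Sequential([
    Dense(32, activation='relu', input_shape=(8,)),
    Dense(32, activation='relu'),
    Dense(1, activation='sigmoid'),             # P("move" is the safe action)
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=5, batch_size=64)
```

The model only ever learns the input-to-output mapping present in the labels; nothing in it "knows" what a car is or why being hit is bad, which is exactly Chollet's point.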

Talking of anthropomorphizing machine learning models, Francois Chollet, in his recent blog post, makes a very interesting observation:

“A fundamental feature of the human mind is our “theory of mind”, our tendency to project intentions, beliefs and knowledge on the things around us. Drawing a smiley face on a rock suddenly makes it “happy”—in our minds. Applied to deep learning, this means that when we are able to somewhat successfully train a model to generate captions to describe pictures, for instance, we are led to believe that the model “understands” the contents of the pictures, as well as the captions it generates. We then proceed to be very surprised when any slight departure from the sort of images present in the training data causes the model to start generating completely absurd captions.”

Another difference between how we humans interpret our surroundings and how these models do so is the "extreme generalisation" we are good at, versus the "local generalisation" that machine learning models are capable of.

(Image: Can humans' "extreme generalisation" abilities be ported into a machine learning model? Picture source: https://blog.keras.io/the-limitations-of-deep-learning.html)

Let's take an example to understand this difference. If we take a young, smart six-year-old boy from Bangalore and leave him in the town of Siem Reap in Cambodia, he will, within a few hours, manage to find a place to eat and start communicating with the people around him, and he will make ends meet within a couple of days. This ability to handle situations we have never experienced before – new language, people, surroundings – that is, to perform abstraction and reasoning far beyond anything we have encountered so far, is arguably the defining characteristic of human cognition. This is "extreme generalization": the ability to adapt to completely new, never-before-experienced situations using very little data, or even no new data at all. It stands in sharp contrast with what deep neural nets can do, which can be referred to as "local generalization": the mapping from inputs to outputs performed by deep nets quickly starts to fall apart if new inputs differ even slightly from what the nets were trained on.

How Deep learning should evolve

A necessary transformational development that we can expect in the field of machine learning is a move away from models that merely perform pattern recognition and can only achieve local generalization, towards models capable of abstraction and reasoning that can achieve extreme generalization. Whilst moving towards this goal, it will also be important for models to require minimal intervention from human engineers. Today, most AI programs capable of basic reasoning are written by human programmers – for example, software that relies on search algorithms. All this should result in deep learning models that are not heavily dependent on supervised learning, as is the case today, and that truly become self-supervised and independent.

As Francois calls out in his blog post, “we will move away from having on one hand “hard-coded algorithmic intelligence” (handcrafted software) and on the other hand “learned geometric intelligence” (deep learning). We will have instead a blend of formal algorithmic modules that provide reasoning and abstraction capabilities, and geometric modules that provide informal intuition and pattern recognition capabilities. The whole system would be learned with little or no human involvement.”

Will this result in machine learning engineers losing jobs? Not really; engineers will move higher up the value chain. They will focus on crafting complex loss functions that encode business goals and use cases, and spend more time understanding how the models they build impact the digital ecosystems in which they are deployed – for example, interacting with and understanding the users that consume a model's predictions and the sources that generate its training data. Today, only the largest companies can afford to have their data scientists spend time in these areas.
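As a toy illustration of what "crafting a loss function to meet a business goal" could look like, here is a hedged Keras sketch of a binary cross-entropy variant that penalises false negatives more heavily than false positives; the 10x weight is an arbitrary, hypothetical business choice, not a recommendation:

```python
from keras import backend as K

def asymmetric_loss(y_true, y_pred):
    """Binary cross-entropy that charges 10x more for a missed positive.

    The 10x factor is a hypothetical business decision, e.g. a missed
    fraud case costing far more than a false alarm.
    """
    eps = K.epsilon()
    y_pred = K.clip(y_pred, eps, 1.0 - eps)
    loss_pos = -10.0 * y_true * K.log(y_pred)          # false negatives hurt more
    loss_neg = -(1.0 - y_true) * K.log(1.0 - y_pred)   # false positives hurt less
    return K.mean(loss_pos + loss_neg)

# model.compile(optimizer='adam', loss=asymmetric_loss)
```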

Another area of development could be models becoming more modular, much as the advent of OOP (object-oriented programming) helped software development, and as the concept of "functions" helps in re-using key functionality within a program. This will pave the way for models becoming re-usable. What we do today along the lines of model reuse across different tasks is to leverage pre-trained weights for models that perform common functions, like visual feature extraction (for image recognition). In the future, we would not only leverage previously learned features (sub-model weights or hyperparameters) but also model architectures and training procedures. And as models become more like programs, we would start reusing program subroutines, like the functions and classes found in our regular programming languages today.
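Today's flavour of that reuse is already only a few lines of Keras; for instance, pulling pre-trained ImageNet weights to serve as a generic visual feature extractor (the image file name below is a placeholder):

```python
import numpy as np
from keras.applications import VGG16
from keras.applications.vgg16 import preprocess_input
from keras.preprocessing import image

# Reuse features learned on ImageNet instead of training from scratch.
base_model = VGG16(weights='imagenet', include_top=False, pooling='avg')

img = image.load_img('some_photo.jpg', target_size=(224, 224))  # any local image
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
features = base_model.predict(x)  # a 512-dim vector, reusable by downstream models
print(features.shape)             # (1, 512)
```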

(Picture source: http://www.astuta.com/how-artificial-intelligence-big-data-will-transform-the-workplace/)

Closing thoughts…

The result of these developments in deep learning models would be a system that attains the state of Artificial General Intelligence (AGI).

So, does all this take us back to the debate I started with – a singularitarian robot apocalypse taking over planet Earth? For now, and for the near-term future, I think it's pure fantasy, originating from a profound misunderstanding of both intelligence and technology.

This post began with Elon Musk's comments, and I will end it with a recent comment from him – "AI will be the best or worst thing ever for humanity…".

If you want to delve deeper into this fantasy of superintelligence, do read Max Tegmark's book Life 3.0, published last month. Elon himself highly recommends it. :)

Another classic is Nick Bostrom's Superintelligence: Paths, Dangers, Strategies. And if you are a Python programmer who wants to start building deep learning models, I strongly recommend Francois Chollet's book Deep Learning with Python.

Further reading

Here are some recent research papers highlighting limitations of Deep Learning:


Cover image source: https://regmedia.co.uk/

Visualising the performance of Machine learning models

Evaluating the performance of machine learning models using metrics like accuracy, precision, and recall is straightforward, but visualising them has never been easy. Ben Bengfort at District Data Labs has developed a Python library for this purpose, called Yellowbrick.
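Here is a minimal sketch of the pattern Yellowbrick encourages – wrapping a scikit-learn estimator in a visualizer. The dataset and model are arbitrary stand-ins, and note that the `poof()` call from the 2017-era API was later renamed `show()`:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from yellowbrick.classifier import ClassificationReport

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Wrap any scikit-learn estimator in a Yellowbrick visualizer.
viz = ClassificationReport(LogisticRegression(), classes=['malignant', 'benign'])
viz.fit(X_train, y_train)    # fits the underlying model
viz.score(X_test, y_test)    # computes per-class precision/recall/F1
viz.poof()                   # renders the report as a heatmap
```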

It definitely looks interesting. Our very own Charles Givre shows how this package can be used, in his blog.

It's a definite read.

Busting the Citadel Trojan developers

Brian Krebs recently reported on the Citadel developers getting busted by the FBI. What is most interesting is the bait the FBI used to trap them.

Citadel boasted an online tech support system for customers designed to let them file bug reports, suggest and vote on new features in upcoming malware versions, and track trouble tickets that could be worked on by the malware developers and fellow Citadel users alike. Citadel customers also could use the system to chat and compare notes with fellow users of the malware.

It was this very interactive nature of Citadel’s support infrastructure that FBI agents ultimately used to locate and identify Vartanyan, the developer of Citadel.

Machine Learning and EU GDPR

In this post, I share my thoughts on the impact of using machine learning to conduct profiling of individuals, in the context of the EU General Data Protection Regulation (hereon referred to as GDPR). My analysis is based specifically on Article 22 of the regulation (which can be found here), covering the "automated processing and profiling of data subjects" requirement.

One of the arguments I discuss is that, though using machine learning for profiling (of users/consumers, hereon referred to as "data subjects") may complicate data controllers' compliance with their obligations under the GDPR, it may at the same time lead to fairer decisions for data subjects: human judgment in classifying data or people is flawed and subject to various influences, whereas machines can eliminate that subjectivity and bias.

Lawful, Fair and Transparent

One of the fundamental principles of EU data protection law is that personal data must be processed lawfully, fairly and in a transparent manner.

GDPR's definition of "processing" is as follows:
‘any operation or set of operations which is performed on personal data or on sets of personal data, whether or not by automated means, such as collection, recording, organisation, structuring, storage, adaptation or alteration, retrieval, consultation, use, disclosure by transmission, dissemination or otherwise making available, alignment or combination, restriction, erasure or destruction’

‘Profiling’ is a subset of automated processing, and GDPR defines it as:
‘the use of personal data to evaluate certain personal aspects relating to a natural person, in particular to analyse or predict aspects concerning that natural person’s performance at work, economic situation, health, personal preferences, interests, reliability, behaviour, location or movements’.

Now, let's analyse the three key tenets of the GDPR requirement: personal data must be processed lawfully, fairly and transparently.

Lawfulness

If we break down the definition of ‘profiling’ in GDPR, in the context of machine learning, following are three key elements in this process:

Data profiling – key elements:

  • Data collection
  • Model development
  • Decision making

The outcome of these steps is that machine learning is used for:

  • Automated data processing for profiling purposes
  • Automated decision making, based on the profiles built

Data collection

The regulation says that the collection of personal data should comply with the data protection principles and there must be a lawful ground for processing of this data. This means that personal data should only be collected for specified, explicit, and legitimate purposes and should not be processed subsequently in a manner that is incompatible with those purposes.

A machine learning algorithm may build a profile of a subject based on data provided by the "data controller", by a third party, or by both. Many organisations use cloud computing services for these activities, as the process may require significant computational power and storage. Depending on the nature of the business/application/use case of such profiling, the processing may take place locally on the data controller's machines, while a copy of the data is also sent to the cloud to continue the dynamic training of the algorithm.

Elaborating on the "lawfulness" of this profiling: an individual's personal data are not only processed to create descriptive profiles about them, but are also checked against predefined patterns of normal behaviour in order to detect anomalies. This stage of profile construction is subject to the GDPR rules governing the processing of personal data, including the legal grounds for processing this data.

An interesting point to note is that the final text of Article 22 of the GDPR refers to a "data subject" and not a "natural person". This could be interpreted to mean that the protection against solely automated decision-making might not apply if the data processed are anonymised. That is, if profiling does not involve the processing of data relating to identifiable individuals, the protection against decisions based on automated profiling may not be applicable, even if such decisions impact a person's behaviour or autonomy. However, as Article 22 seems to apply only to the profiling of individual data subjects and not of groups, the question arises whether data subjects are protected against decisions that have significant effects on them but are based on group profiling.

This can be an issue because, if inferences about individuals are made based on characteristics shared with other members of a group, there may be a significant number of false positives or false negatives. A good example of such "anonymised" data collection for a machine learning application is Apple's approach, which they refer to as "differential privacy".
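To give a flavour of the idea (this is not Apple's actual implementation), here is a toy sketch of randomized response, one of the simplest differentially private collection mechanisms: each individual's answer is noisy and deniable, yet the population-level rate remains recoverable:

```python
import random

def randomized_response(truth):
    """Report the true answer half the time; otherwise report a coin flip.

    Any single report is deniable, yet the population rate is recoverable:
    E[reported True] = 0.5 * p + 0.25, where p is the true rate.
    """
    if random.random() < 0.5:
        return truth
    return random.random() < 0.5

# Simulate collection where the true rate of the sensitive attribute is 30%.
reports = [randomized_response(random.random() < 0.3) for _ in range(100000)]
observed = sum(reports) / len(reports)
estimated_p = (observed - 0.25) / 0.5
print(round(estimated_p, 3))   # ~0.3, without trusting any individual answer
```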

Decision making

When it comes to decision-making based on the "processing" of personal data described above, does "automated individual decision-making" only cover situations where a machine makes decisions without any involvement by human actors? This may not be true in most situations, as some human intervention is likely to occur at some point in the automated decision-making process. And so I think the scope of the protection is broader than wholly automated decision-making. That said, human intervention would have to be actual and substantive – humans would have to exercise real influence on the outcome of a particular decision-making process – for this protection to become inapplicable.

In addition, the GDPR does not specify whether the decision itself has to be made by a human or whether it could potentially be made by a machine. Nevertheless, as I mentioned above, it is highly likely that one or more humans will be involved in designing the model, training it with data, and testing the system that incorporates machine learning.

Legal impact

Another important element of the decision is that it has to produce legal effects on, or similarly significantly affect, the data subject. Examples could include the automatic refusal of an online credit application, or e-recruitment practices without human intervention. The effects can be material and/or immaterial, potentially affecting the data subject's dignity, integrity or reputation. The requirement that the "effects" be "legal" means that the decision must be binding, or that it creates legal obligations for the data subject.

Potential consequences of non-compliance

It is important to bear in mind that if data controllers violate the rights of data subjects under Article 22, they shall "be subject to administrative fines up to 20,000,000 EUR, or in the case of an undertaking, up to 4 % of the total worldwide annual turnover of the preceding financial year, whichever is higher". In the face of potential penalties of this magnitude, and considering the complexities of machine learning, data controllers may have apprehensions about using the technology for automated decision-making in certain situations. Moreover, data controllers may insist that contractual arrangements be put in place with providers in the machine learning supply chain, containing very specific provisions regarding the design, training, testing, operation and outputs of the algorithms, as well as the relevant technical and organisational security measures to be incorporated.

Fairness

Let's now turn to the meaning of "fairness" in the context of using machine learning either to carry out automated processing, including profiling, or to make automated decisions based on such processing. Whether personal data will be processed fairly may depend on a number of factors. Machine learning processes may be biased to produce the results pursued by the person who built the model. The quantity and quality of the data used to train the algorithm, including the reliability of their sources and labelling, may also have a significant impact on the profiles constructed.
For example, an indirect bias may arise where the data relate to a minority group that has been treated unfairly in the past, such that the group is underrepresented in some contexts or overrepresented in others. Likewise, in a hiring application, if fewer women have been hired previously, data about female employees might be less reliable than data about male employees.

So the point is, the reliability of machine learning for automated decision-making will depend on the techniques and the training data used. Further, machine learning techniques often perform better when the training data is large (more data about data subjects) and diverse. However, this may collide with the data minimisation principle in EU data protection law, a strict interpretation of which is that "the data collected on the data subject should be strictly necessary for the specific purpose previously determined by the data controller".

And so it is very important that data controllers decide, at the time of collection, which personal data they are going to process for profiling purposes. They will then have to provide the algorithm with only the data strictly necessary for the specific profiling purpose, even if that leads to a narrower representation of the data subject, and possibly a less fair decision for him or her.

Transparency

Machine learning algorithms may be based on very different computational learning models. Some are more amenable to letting humans track the way they work; others may operate as a "black box". For example, where a process utilises a decision tree, it may be easier to generate a human-readable explanation of how and why the algorithm reached a particular conclusion, though this very much depends on the size and complexity of the tree. The situation may be very different for neural-network-type algorithms, such as deep learning algorithms, because the conclusions reached by neural networks are "non-deductive and thus cannot be legitimated by a deductive explanation of the impact various factors at the input stage have on the ultimate outcome".
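To illustrate the more tractable end of that spectrum, here is a small scikit-learn sketch that prints the human-readable if/else rules of a fitted decision tree (using `export_text`, which is available in newer scikit-learn releases; the dataset is an arbitrary stand-in):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(max_depth=3).fit(data.data, data.target)

# Every prediction corresponds to one root-to-leaf path of if/else rules,
# which could be presented to a data subject as the "logic" of a decision.
print(export_text(tree, feature_names=list(data.feature_names)))
```

No comparably direct read-out exists for a deep neural network, which is the crux of the transparency problem discussed here.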

This opacity of machine learning techniques might have an impact on a data controller’s obligation to process a data subject’s personal data in a transparent way. Whether personal data are obtained directly from the data subject or from an indirect source, the GDPR imposes on the data controller the obligation, at the time when personal data are obtained, to provide the data subject with information regarding:

‘the existence of automated decision-making, including profiling, referred to in Article 22(1) and (4) and, at least in those cases, meaningful information about the logic involved, as well as the significance and the envisaged consequences of such processing for the data subject.’

Does this mean that whenever machine learning is used to conduct profiling the data controller must provide information regarding the existence and type of machine learning algorithms used? If so, to what does the term ‘logic’ refer and what would constitute ‘meaningful information’ about that logic? Does the term ‘logic’ refer to the data set used to train the algorithm, or to the way the algorithm itself works in general, for example the mathematical / statistical theories on which the design of the algorithm is based? And what about the criteria fed into the algorithm, the variables, and the weights attributed to those variables? And how does this relate to the role of different service providers forming part of the ‘machine learning’ supply chain? All these are important clarifications to be sought.

Due to all the above complexities, transparency might not be the most appropriate way of seeking to ensure legal fairness; compliance could instead be verified through technical tools – for example, tools that detect bias with respect to a particular attribute, such as the use of race in credit decisions, or that enforce the requirement that a certain class of analysis be applied for certain decisions. This might also be achieved by testing the trained model against a number of "discrimination testing" datasets, or by assessing the actual outcomes of the machine learning process to show that they comply with the lawfulness and fairness requirements.
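As a crude example of such a technical tool, here is a hypothetical sketch of a "demographic parity" check, comparing a model's positive-decision rates across a protected attribute; the data, the attribute and any acceptable threshold are all stand-ins:

```python
import numpy as np

def demographic_parity_gap(y_pred, protected):
    """Absolute difference in positive-decision rates between two groups.

    y_pred: array of 0/1 model decisions
    protected: array of 0/1 group membership (a hypothetical attribute)
    """
    rate_a = y_pred[protected == 0].mean()
    rate_b = y_pred[protected == 1].mean()
    return abs(rate_a - rate_b)

y_pred = np.random.randint(0, 2, 1000)      # stand-in for real model decisions
protected = np.random.randint(0, 2, 1000)   # stand-in for a protected attribute
gap = demographic_parity_gap(y_pred, protected)
print('parity gap: %.3f' % gap)             # flag for review above a chosen threshold
```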

Conclusion

According to Article 22 of the GDPR, data subjects have a right not to be subject to a decision based solely on automated processing, including profiling that produces legal effects concerning them or significantly affects them. When data controllers use machine learning to carry out automated processing, including profiling of data subjects, they must comply with the requirement of lawful, fair and transparent processing. This may be difficult to achieve due to the way in which machine learning works and / or the way machine learning is integrated into a broader workflow that might involve the use of data of different origins and reliability, specific interventions by human operators, and the deployment of machine learning products and services, including ‘Machine Learning as a Service’ services (provided by Amazon, Google, Microsoft, and others).

In order to be compliant, data controllers must assess how using machine learning to carry out automated processing affects the different stages of profiling and the level of risk to data subjects' rights, and how the data controller can produce evidence of compliance for the regulator and the data subject. In some cases where automated processing, including profiling, is permitted by law, data controllers still have to implement suitable measures to safeguard the data subjects' rights. The underlying objective of the GDPR is that a decision significantly affecting a person cannot just be based on a fully automated assessment of his or her personal characteristics. However, as I called out at the very beginning of this post, in the context of machine learning it might in some cases be more beneficial for data subjects if a final decision is based on an automated assessment, as it is devoid of the prejudices induced by human intervention.

Whether a decision about us is being made by a human or by a machine, right now the best we can hope for is that such a decision – one which can produce legal effects or significantly affect us in any manner – will be as fair as humans can be. And eventually, we, as machine learning practitioners, must aim to build models whose decisions are far fairer than humans could ever make them.

This takes into account that machines may soon be able to overcome the limitations of human decision-makers and provide us with decisions that are demonstrably fair. Indeed, in some contexts it may already make sense to invert the current model: instead of individuals appealing to a human against a machine decision, individuals would have a right to appeal to a machine against a decision made by a human!

Well, that sounds a bit weird, doesn't it! Has the time for Skynet to take over planet Earth finally arrived?

I am sure that many of the questions we machine learning enthusiasts and practitioners have about the implications of GDPR will eventually be answered after the regulation becomes enforceable in May 2018. We will also see interesting changes to how machine learning models are designed and applied, especially in the context of personal data processing.


Title image courtesy: LinkedIn.com

Securely store API keys in R scripts with the “secret” package

When we use an API key to access a secure service through R, or when we need to authenticate in order to access a protected database, we need to store this sensitive information somewhere in our R code. The typical practice is to include the keys as strings in the R code itself – but, as you might guess, that is not secure. Doing so stores our private keys and passwords in plain text on our hard drive, and since most of us use GitHub to collaborate on code, we can also end up unknowingly publishing those keys in a public repo.

Now there is a solution to this: the "secret" package for R, developed by Gábor Csárdi and Andrie de Vries. The package integrates with OpenSSH, providing R functions that allow us to create a vault for keys on our local hard drive, define trusted users who can access those keys, and then include encrypted keys in R scripts or packages that can only be decrypted by the person who wrote the code, or by the people he/she trusts.

Here is the presentation by Andrie de Vries at useR!2017, where they demoed this package, and here is the package itself.


WannaCry ransomware – My thoughts

This weekend started on a very rough note for most of us in the cyber security domain, thanks to the WannaCry / Wcry / WannaCrypt ransomware.

At the time of this writing, this malware has infected hundreds of thousands of machines across the world, impacting organisations in more than 150 countries, with the UK, Russia, Spain and India among the hardest hit.

There has been enough written about WannaCry / Wcry / WannaCrypt already, so in this post I want to focus on the technical aspects of how this malware is constructed and how it propagates.

I found Talos's analysis of this malware to be the most comprehensive; it is an excellent read for anyone keen to understand this malware under the hood.

Let’s approach our analysis alongside the cyber kill chain framework. 
There is enough evidence that email was used as the "Delivery" mechanism for this attack's payload. Once delivered onto a machine, the malware spreads via SMB, the Server Message Block protocol typically used by Windows computers to communicate with other file systems over a network. An infected computer then propagates the infection to other vulnerable machines on the same network. The same vector is also used to spread to externally facing hosts that have inbound connections allowed on TCP ports 139 and 445.
It also appears that, in order to "laterally move", the malware uses the notorious DOUBLEPULSAR backdoor. As Talos notes:

Talos has observed WannaCry samples making use of DOUBLEPULSAR which is a persistent backdoor that is generally used to access and execute code on previously compromised systems. This allows for the installation and activation of additional software, such as malware. This backdoor is typically installed following successful exploitation of SMB vulnerabilities addressed as part of Microsoft Security Bulletin MS17-010. This backdoor is associated with an offensive exploitation framework that was released as part of the Shadow Brokers cache that was recently released to the public. Since its release it has been widely analyzed and studied by the security industry as well as on various underground hacking forums.

And if no machines are found that have been previously compromised and implanted with DOUBLEPULSAR, the malware uses ETERNALBLUE for the initial exploitation of the SMB vulnerability. This is exactly why the activity looks like a worm propagating across the Internet.

The Kill switch

The security researcher tweeting from @MalwareTechBlog has become the #AccidentalHero of the last two days, for saving mankind (or "machinekind") from this whole fiasco.

The researchers from the Cisco Umbrella team first observed DNS requests for one of WCry's kill switch domains (iuqerfsodp9ifjaposdfjhgosurijfaewrwergwea[.]com) early on May 12th, with the hit count going as high as 6,000-7,000 requests per second by late evening.
If you look closely, the domain composition looks almost human-typed, with most of the characters falling into the top rows of a typical keyboard.
The reason this domain is called a kill switch is the role it plays in the overall execution of the malware, as shown in the code below:
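(The original post showed a screenshot of the disassembled subroutine here. As a stand-in, the following is a rough Python sketch of the logic Talos describes; the real sample is native Win32 code, so this is a paraphrase of the behaviour, not the actual implementation:)

```python
import requests

KILL_SWITCH = "http://iuqerfsodp9ifjaposdfjhgosurijfaewrwergwea.com"

def should_continue_infection():
    """Paraphrase of the sample's kill-switch check, per Talos's description."""
    try:
        requests.get(KILL_SWITCH, timeout=5)
    except requests.RequestException:
        return True     # domain unreachable: the sample carries on infecting
    return False        # request succeeded: the subroutine exits quietly

if should_continue_infection():
    pass  # ...infection routine would run here (omitted, obviously)
```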


As Talos notes, 

The above subroutine attempts an HTTP GET to this domain, and if it fails, continues to carry out the infection. However if it succeeds, the subroutine exits. The domain is registered to a well known sinkhole, effectively causing this sample to terminate its malicious activity.

Delving deeper into the dropper and the payload itself: two files are essential for the exploit to run – mssecsvc.exe, which in turn executes tasksche.exe. Once the kill switch domain has been checked, per the logic explained earlier, mssecsvc.exe runs. On its second execution it determines the IP address of the infected system and then attempts to connect on TCP port 445 to every IP address in the same subnet. When a successful response proves that an IP address belongs to a live node on the network, the SMB vulnerability is used to connect and transfer data. This is the vulnerability that Microsoft addressed in bulletin MS17-010.
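From a defender's perspective, that reachability check is easy to reproduce. Here is a small, hypothetical Python sketch that looks for hosts in a subnet with TCP 445 exposed; the subnet is a placeholder, and you should only scan networks you are authorised to test:

```python
import socket
from ipaddress import ip_network

def smb_port_open(host, timeout=0.5):
    """Return True if TCP 445 accepts a connection on the given host."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        return s.connect_ex((host, 445)) == 0

# Hypothetical subnet; WannaCry probed every address of its local subnet this way.
for addr in ip_network("192.168.1.0/24").hosts():
    if smb_port_open(str(addr)):
        print("%s has SMB (445) exposed" % addr)
```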

Talos goes further and explains how the malware scans local and network disk drives, identifying files with certain extensions before starting to encrypt them all. Once the encryption is complete, @wanadecryptor@.exe is run, which displays the ransom note on the victim's screen. An outbound connection is then made to the Tor network, which the malware uses to proxy its external communication.

All this sounds exciting, and too sophisticated for an intrusion detection system to catch, doesn't it? The key point is that detecting this kind of attack is not possible with signatures or fingerprints alone (hashes of the malware file, or the IOCs discovered so far); it requires a good anomaly and behaviour detection system.

I will write more on this as further information about the propagation of this malware comes to light. But in this particular case, it is very evident that prevention is, and has to be, the key focus area for organisations and individual computer users alike.
Before I get into that, I would also like to highlight the political angle to this entire story.
Political angle

Edward Snowden has been quite vocal about NSA’s involvement in this, on his twitter account. 


I am not that interested in getting into the political angle here, as it only causes anger and frustration, but I would just like to recall the drama played out last year, when a whole bunch of people wanted Apple to create a special version of iOS for the U.S. government, under the promise that it would never escape their safe hands and get into the wild. What happened this time, folks? How did this vulnerability get loose and into the hands of these adversaries, impacting businesses, important government departments (including Germany's Deutsche Bahn) and people's lives (including the UK's NHS)?

So what now?

Well, we're pretty much in clean-up mode. All the major anti-virus vendors have released signatures for WannaCry / Wcry / WannaCrypt. And while everyone is busy investigating what happened and how, and some rest easy thanking the #AccidentalHero for the kill switch, I am sure the adversaries are already working on newer variants of this malware.
But the most important lesson to reiterate here is something the security community has been preaching for a long, long time: keep all your software up to date, and apply security patches on a regular basis. The processes and governance around patching matter more than waiting for a machine learning and artificial intelligence based intrusion detection system to pick all this up and block it for you. As our old doctor used to say, prevention is better than cure!
Here are some of those preventive steps again, many of which are obvious to most people:

  1. Keep your operating system, and the applications running on it, up to date
  2. Test and apply patches, especially security patches, early on
  3. Have a fool-proof backup strategy, and test it at least once every year
  4. Do not open emails or attachments from unknown or suspicious senders
  5. Lock down computers, providing minimal access on a need-to-know basis
  6. Limit access to network resources; ransomware can only lock down files on systems that it can access
  7. Only open ports that are essential for your business to operate. In this particular case, inbound TCP 139 and 445 were found to be allowed in many perimeter firewalls.
  8. Block all unnecessary outbound connections – especially those that use anonymity networks like Tor. Only thieves want to conceal themselves.
  9. If you still have difficulty implementing all of the above measures, then you must depend on a strong threat "detection" system, beyond conventional anti-virus applications, and use intrusion detection systems that employ machine and deep learning to detect and block the "unknowns".

I have blogged about that last point in various blog posts. More on using those techniques against WannaCry in another blog post.
Happy patching!

Hunting through Log Data with Excel

SANS just published an interesting paper on using Excel for incident investigations. 

A good read for incident responders to learn how to use Microsoft Excel and some of its more advanced features during an intrusion if a SIEM or similar product is not available (who doesn’t have them these days!?)

This guide will contain up to three methods for each example presented. First, the paper will show some of the things you can do with Excel by just using the toolbar commands. Second, if available, an Excel Function will be created to show how it can be slightly automated. Third, to enhance the Excel Function process even further, Visual Basic for Applications (VBA) code will be provided. Knowing alternate ways of manipulating different types of data will allow you to incorporate the results into the standard output described below.