Big Data – Hari Notes

On one side, there is the debate about data privacy, user privacy, and the legality of surveillance itself. At the same time, there is data that is supposed to be public information, easily accessible to both people and computers, so that it can be processed and put to use.

To set the context for this topic, here is a very interesting and powerful use case: a dashboard created by the open data analytics company Appallicious. It is being billed as a solution that pairs local disaster response resources with open data and offers citizens real-time developments and status updates.

@jasonshueh has an interesting post on GovTech about the methods that could be used to harvest value from open data repositories, with more such use cases.

The Sunlight Foundation is a Washington, D.C.-based non-profit advocacy group promoting open and transparent government. According to the foundation’s California Open Data Handbook, data must first be both “technically open” and “legally open.”

Technically open: [data] available in a machine-readable standard format, which means it can be retrieved and meaningfully processed by a computer application.
Legally open: [data] explicitly licensed in a way that permits commercial and non-commercial use and re-use without restrictions.
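The two definitions above can be turned into a rough check on a dataset's metadata. This is only a minimal sketch: the `Dataset` record, the list of machine-readable formats, and the short list of license identifiers are my own illustrative assumptions, not something taken from the handbook.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed, illustrative lists; a real catalogue would maintain its own.
MACHINE_READABLE_FORMATS = {"csv", "json", "xml", "geojson"}
# Public-domain style dedications, identified by their SPDX ids (lowercased).
UNRESTRICTED_LICENSES = {"cc0-1.0", "pddl-1.0"}


@dataclass
class Dataset:
    """Hypothetical metadata record for one published dataset."""
    name: str
    file_format: str
    license_id: Optional[str]  # None when no license is stated


def is_technically_open(ds: Dataset) -> bool:
    # Technically open: published in a machine-readable standard format.
    return ds.file_format.lower().lstrip(".") in MACHINE_READABLE_FORMATS


def is_legally_open(ds: Dataset) -> bool:
    # Legally open: explicitly licensed for unrestricted use and re-use.
    # A missing license counts as not open, since nothing is explicit.
    return ds.license_id is not None and ds.license_id.lower() in UNRESTRICTED_LICENSES


def is_open(ds: Dataset) -> bool:
    return is_technically_open(ds) and is_legally_open(ds)
```

For example, a scanned PDF released under CC0 would pass the legal check but fail the technical one, while a CSV file with no stated license would fail the legal one.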

I think Junar is doing some interesting work in this area, and I especially liked these lines by Diego May, co-founder and CEO of Junar, in the article:

What we see today is that the real innovation is not necessarily coming from hackathons, but now it’s about working with companies or entrepreneurs to solve problems

The University of Massachusetts Boston is also doing some interesting work in this area, and the Fraunhofer Society in Berlin is doing some great research in this space.

Open data analytics, and the relevance of security to it, is going to be one of the interesting areas in the data analytics space.

Ran Mosessco from Websense Security Labs has a very interesting post on a key issue every security analyst in a SOC (Security Operations Center) faces: the overwhelming number of security alerts, also called attack indicators, that an analyst has to acknowledge and investigate, even after correlation.

Actionable threat intelligence is buried deep within terabytes of seemingly interesting but irrelevant data. Plausible deniability, false positives, lack of traceability and attribution, skillful attackers, adaptation of warfare techniques, and the like only add to the confusion. How does one bubble up prioritized, actionable threat intelligence in an automated fashion from the depths of the data morass?

This approach is still at a nascent stage and requires further study before it becomes an implementable solution. But I think it is a good place to start, and the following lines capture the way forward accurately:

With attacks becoming more advanced and sophisticated each day, combining big data engineering, unsupervised machine learning, global threat intelligence and cybersecurity know-how is required to deal with them in a timely, automated and efficient manner.
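To make the idea of bubbling up alerts without labelled data a little more concrete, here is a toy sketch. It is not the method from the Websense post; it simply ranks alerts by how rare their indicator type is within a batch, on the assumption that rare indicators deserve a look first. The alert fields (`host`, `indicator`) and the `prioritize` function are hypothetical names of my own.

```python
import math
from collections import Counter


def prioritize(alerts):
    """Rank alerts so that the rarest indicator types come first.

    Each alert is a dict with an "indicator" key (an assumed schema).
    The score is the surprisal, -log2(frequency), of the alert's
    indicator type within this batch; no labels are needed.
    """
    counts = Counter(a["indicator"] for a in alerts)
    total = len(alerts)
    scored = [(-math.log2(counts[a["indicator"]] / total), a) for a in alerts]
    # Highest score (rarest indicator) first; ties keep their input order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


alerts = [
    {"host": "10.0.0.5", "indicator": "port_scan"},
    {"host": "10.0.0.6", "indicator": "port_scan"},
    {"host": "10.0.0.7", "indicator": "dns_tunnel"},
    {"host": "10.0.0.8", "indicator": "port_scan"},
]

for score, alert in prioritize(alerts):
    print(f"{score:.2f}  {alert['host']}  {alert['indicator']}")
```

Rarity alone is a crude signal, since a rare alert can still be benign. A real pipeline would combine many more features with threat intelligence, which is what the quoted lines call for.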

This topic is one of my key focus areas professionally, and so I will be writing more about it here.

Title Image credit: communities.websense.com

Tag: Big Data

Harvesting Value from Open Data

Identifying actionable threat intelligence