Mainstream media sites lit up this week with the disclosure that the NSA has tapped into private databases in order to thwart terrorist plots. Surprisingly enough, there are actually some interesting sidebars to the original story.
A fascinating article in the NY Times explains how recent advances in the analysis of Big Data are being used to predict future terrorist activities, even without the detailed contents of the communications themselves.
American laws and American policy view the content of communications as the most private and the most valuable, but that is backwards today … The information associated with communications today is often more significant than the communications itself, and the people who do the data mining know that.
The ‘information associated’ with any communication is otherwise known as metadata. It’s information about the information. A very simple example of metadata would be the GPS location stamp on a picture you take with your smartphone.
All this kerfuffle reminded me of the (science fiction) Foundation Trilogy written by Isaac Asimov in the 1950s. Asimov postulated that (in the real world) theoretical physics ‘works’ because its predictions are based on the activity of a huge number of atomic events, but social science doesn’t ‘work’ because there isn’t enough data for statistically significant predictions.
The premise of the series is that mathematician Hari Seldon spent his life developing a branch of mathematics known as psychohistory … [which] … can predict the future, but only on a large scale; it is error-prone on a small scale. It works on the principle that the behaviour of a mass of people is predictable if the quantity of this mass is very large … The larger the number, the more predictable is the future.
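For the curious, the "larger the number, the more predictable" idea is easy to see in a toy simulation. This is only a sketch under my own assumptions: each "person" is a coin flip (yes or no with equal odds), and the function name `spread_of_average`, the crowd sizes, and the seed are all made up for illustration. It has nothing to do with how psychohistory (or the NSA) would actually work.

```python
import random
from statistics import pstdev


def spread_of_average(crowd_size, trials=200, seed=1):
    """Standard deviation of the crowd-wide 'yes' fraction across repeated trials."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    fractions = []
    for _ in range(trials):
        # Each person independently says 'yes' with probability 0.5 (an assumption).
        yes = sum(rng.random() < 0.5 for _ in range(crowd_size))
        fractions.append(yes / crowd_size)
    return pstdev(fractions)


small = spread_of_average(100)
large = spread_of_average(10_000)
print(f"crowd of 100:    spread {small:.4f}")
print(f"crowd of 10,000: spread {large:.4f}")
```

Any one coin flip is a toss-up, but the average over the big crowd should wander far less than the average over the small one (in theory the spread shrinks with the square root of the crowd size). That is the whole trick: individuals stay unpredictable while the mass does not.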
So 60 years after Asimov, the staggering amount of information flowing off our modern data and communication systems, coupled with real-time analysis of Big Data, now enables us to make predictions that were previously inconceivable. Welcome home, Isaac.
This was really brought home to me just a few days ago. Someone used my AMEX credit card number to make a $290 charge at a restaurant in Montreal. Fortunately an American Express computer flagged the transaction as fraudulent and rejected it. The computer then sent me an email and called my mobile phone to put a human representative on the line to discuss the situation. All within the space of about 30 seconds. While talking to the representative I learned that the computer algorithm was even smart enough to accept valid charges immediately before and after the bogus charge.
For the purpose of our discussion, note that the algorithm flagging the bogus charge is working with the transaction’s metadata. AMEX neither knows nor cares what I purchase; they only know where and when I make purchases, and the pattern those purchases form. The merchants I deal with are the ones who know what I purchase, but that’s a totally different data set, and not particularly useful for predictive purposes.
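To make the "where and when" point concrete, here is a minimal sketch of one well-known metadata-only check, sometimes called an "impossible travel" test. To be clear, this is my own illustration and not AMEX's algorithm, which I know nothing about. The `Charge` class, the `flag_suspicious` function, the 900 km/h speed limit, the 50 km slack, the Denver home base, and the dates are all assumptions invented for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from math import asin, cos, radians, sin, sqrt


@dataclass
class Charge:
    when: datetime
    lat: float
    lon: float
    amount: float  # metadata only: nothing here says what was bought


def km_between(a, b):
    """Great-circle distance in km between two charges (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(radians, (a.lat, a.lon, b.lat, b.lon))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def flag_suspicious(charges, max_kmh=900.0, slack_km=50.0):
    """Return charges that imply impossible travel since the last accepted charge."""
    flagged, last_ok = [], None
    for c in sorted(charges, key=lambda ch: ch.when):
        if last_ok is not None:
            hours = (c.when - last_ok.when).total_seconds() / 3600
            if km_between(last_ok, c) > max_kmh * hours + slack_km:
                flagged.append(c)
                continue  # a rejected charge does not move the cardholder
        last_ok = c
    return flagged


# Hypothetical cardholder in Denver; the Montreal charge echoes the story above.
charges = [
    Charge(datetime(2013, 6, 10, 12, 0), 39.74, -104.99, 42.00),
    Charge(datetime(2013, 6, 10, 12, 20), 45.50, -73.57, 290.00),
    Charge(datetime(2013, 6, 10, 12, 45), 39.74, -104.99, 18.50),
]
flagged = flag_suspicious(charges)
print([c.amount for c in flagged])
```

Notice that the check compares each charge against the last accepted one, so the two local charges on either side of the Montreal one sail through. A real system presumably weighs many more signals, but even this crude version never needs to know what was on the bill.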
So what’s the take-away? Personally, I don’t give a rat’s a– who’s looking at my metadata. But I plan to update the old saying “Keep your friends close, and your enemies closer” to:
Keep your metadata close, and your financial institutions even closer.
By the way, if you are not a burned-out ’60s hippie, my condolences and here’s your Grok link.