Softpanorama

May the source be with you, but remember the KISS principle ;-)
(slightly skeptical) Educational society promoting "Back to basics" movement against IT overcomplexity and  bastardization of classic Unix

How to collect and analyze your own metadata


Commenting on the article "Fisa court order that allowed NSA surveillance is revealed for first time" (The Guardian, 19 November 2013), a reader, Androidian, gave the following advice:

Let me add to this that subscribing to a private commercial VPN service is recommended for any computer user while on the road. It won't protect you from the big guys but it does a good job of protecting you from local bad guys as well as your ISP. The one I use costs less than $5/month for all of my devices combined. PCs, tablets and phones.

Remember that as things get more secure, bad guys (all of them) will go for the weak places. Typically this is at either end of the connection.

Your computer and you are the weakest links.

Learn to and get in the habit of maintaining the security of your own systems and backup your data onto disconnected storage devices.

Try to avoid cloud computing everywhere you can.

Don't post stupid stuff about yourself or family on social media sites.

You gotta do your part too unless you really don't care. In which case i would expect you not to be bitching about it in comment threads like this one.

Still, it is always interesting to know exactly what the government and big cloud providers know about you. In the case of Prism, it is safe to assume that all your email metadata, phone metadata and Web (HTTP) logs are intercepted and stored for at least five years. I am not sure about Amazon and other "cloud" retailers, but they can probably be safely added to the mix. LinkedIn data and communications are also intercepted and stored.

The key lesson is simple: data that leave your computer are not private unless encrypted. If you join Google or Facebook, you should have no expectation of privacy for any data you put "in the cloud". The same is true of your email. If you use Gmail, you essentially declare publicly that those emails are not private.

In the United States, an expectation of privacy is not legally recognized by the Supreme Court once people have voluntarily handed their data to a third party. Here is a relevant Slashdot post:

General Counsel of the Office of the Director of National Intelligence Robert S. Litt explained that our expectation of privacy isn't legally recognized by the Supreme Court once we've offered it to a third party.

Thus, sifting through third party data doesn't qualify 'on a constitutional level' as invasive to our personal privacy. This he brought to an interesting point about volunteered personal data, and social media habits. Our willingness to give our information to companies and social networking websites is baffling to the ODNI.

'Why is it that people are willing to expose large quantities of information to private parties but don't want the Government to have the same information?,' he asked.

... ... ...

While Snowden's leaks have provoked Jimmy Carter into labeling this government a sham, and void of a functioning democracy, Litt presented how these wide data collection programs are in fact valued by our government, have legal justification, and all the necessary parameters.

Litt, echoing the president and his boss James Clapper, explained thusly:

"We do not use our foreign intelligence collection capabilities to steal the trade secrets of foreign companies in order to give American companies a competitive advantage. We do not indiscriminately sweep up and store the contents of the communications of Americans, or of the citizenry of any country. We do not use our intelligence collection for the purpose of repressing the citizens of any country because of their political, religious or other beliefs. We collect metadata—information about communications—more broadly than we collect the actual content of communications, because it is less intrusive than collecting content and in fact can provide us information that helps us more narrowly focus our collection of content on appropriate targets. But it simply is not true that the United States Government is listening to everything said by every citizen of any country."

It's great that the U.S. government behaves better than corporations on privacy—too bad it trusts/subcontracts corporations to deal with that privacy—but it's an uncomfortable thing to even be in a position of having to compare the two. This is the point Litt misses, and it's not a fine one.

So you need to control what you put in the cloud. You can start with your email. It's easy, and your Sent folder is essentially a record of what you have put there.
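As a rough illustration of what a Sent folder reveals, here is a minimal sketch that tallies recipients and sending hours from message headers alone. It assumes you have exported your sent mail to a local mbox file; the function name `summarize_sent` and the file name in the usage note are hypothetical.

```python
import mailbox
from collections import Counter
from email.utils import getaddresses, parsedate_to_datetime


def summarize_sent(mbox_path):
    """Count recipients and sending hours in an mbox file of sent mail.

    Assumption: mbox_path points to a local mbox export of your Sent folder.
    Only headers are read; message bodies are never touched.
    """
    recipients = Counter()
    hours = Counter()
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            # To and Cc headers: whom you wrote to
            fields = msg.get_all("To", []) + msg.get_all("Cc", [])
            for _name, addr in getaddresses(fields):
                if addr:
                    recipients[addr.lower()] += 1
            # Date header: when you wrote (hour in the header's own time zone)
            date = msg.get("Date")
            if date:
                try:
                    hours[parsedate_to_datetime(date).hour] += 1
                except (TypeError, ValueError):
                    pass  # unparseable Date header, skip it
    finally:
        box.close()
    return recipients, hours
```

Calling `summarize_sent("sent.mbox")` and printing `recipients.most_common(10)` gives a quick picture of your most frequent correspondents without reading a single message body.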

Your Web browsing metadata are likewise neatly stored in headers, but logging your HTTP communication requires a proxy.
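Once a proxy is logging your traffic, even a few lines of code show how much the log says about you. The sketch below counts requests per host. It assumes a Squid-style native access log, in which the sixth whitespace-separated field is the request method and the seventh is the URL (or `host:port` for CONNECT requests); other proxies use different layouts, and `host_counts` is a hypothetical helper name.

```python
from collections import Counter
from urllib.parse import urlsplit


def host_counts(lines):
    """Count requests per host in proxy log lines.

    Assumption: Squid-style native format, i.e.
    time elapsed client code/status bytes method URL ident peer type
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # not a complete log line
        method, target = fields[5], fields[6]
        if method == "CONNECT":
            # HTTPS tunnels are logged as host:port, with no path
            host = target.rsplit(":", 1)[0]
        else:
            host = urlsplit(target).hostname or ""
        if host:
            counts[host.lower()] += 1
    return counts
```

Note that for HTTPS traffic the proxy sees only the host name, not the page, which is itself a good demonstration of the difference between metadata and content.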

Similarly, logging your phone communications metadata requires running Asterisk or similar software.
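For the phone side, a PBX such as Asterisk can write call detail records to a CSV file, and those records are exactly the "who, when, how long" metadata discussed here. The following sketch totals answered calls and talk time per dialed number. The column positions are an assumption based on the usual default CSV layout (source in column 2, destination in column 3, billable seconds in column 14, disposition in column 15); verify them against your own configuration before relying on the result.

```python
import csv
from collections import defaultdict

# Assumed zero-based column positions in a default-style CDR CSV row.
# Check these against your own PBX configuration.
DST, BILLSEC, DISPOSITION = 2, 13, 14


def talk_time_by_number(rows):
    """Return {dialed number: (answered calls, total billed seconds)}."""
    totals = defaultdict(lambda: [0, 0])
    for row in rows:
        if len(row) <= DISPOSITION or row[DISPOSITION] != "ANSWERED":
            continue  # skip short rows and unanswered calls
        entry = totals[row[DST]]
        entry[0] += 1
        entry[1] += int(row[BILLSEC])
    return {number: tuple(value) for number, value in totals.items()}
```

A typical call would be `talk_time_by_number(csv.reader(open("Master.csv", newline="")))`, where `Master.csv` stands for wherever your system writes its call records.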

It's actually very interesting (albeit horrifying) to see all your searches for the last seven years or so neatly stored by Google. I stopped using Google just after I saw this cache of data. One important lesson here is that you need to vary search engines, spreading your searches across several of them. You may also think about "spam searches", which slightly "dilute" the "fingerprint" of your searches.



Old News ;-)

The U.S. government takes a big data approach to intelligence gathering. And so can you!

By Mike Elgan

June 22, 2013 (Computerworld)

Everybody's talking about PRISM, the U.S. government's electronic surveillance program.

We don't know all the details about PRISM (also called US-984XN). But we learned enough from a badly designed PowerPoint presentation leaked by NSA contractor Edward Snowden to feel outraged by its reach and audacity.

In a nutshell, PRISM (and related telephone surveillance programs) take a big data approach to spying on foreign terrorists using American servers.

PRISM and related programs may harvest metadata of every phone call, every email, every Internet search, every Facebook post -- everything -- and use algorithmic filtering to find suspicious communication. Once they've found it, they can get a warrant to listen to the actual phone calls and read the actual email to find clues that enable authorities to stop terrorist attacks before they happen. (You know, Minority Report-style precrime.)

Metadata is not the content of the phone call or email, but the information about them: Who contacted whom, when, from where and for how long.

PRISM inspires shock and awe. But if you set aside the shock part -- the privacy and constitutional implications -- you realize the awe component is worth exploring.

The PRISM approach is this: Cast the widest possible information net, then use machine intelligence to serve up just the needles without the haystack.

PRISM works. It gets government snoops what they're looking for. And if it works for the NSA, it can work for you, too.

In fact, the ideas behind PRISM are built into a wide variety of tools available to everybody.

So here's how to run your own private PRISM program:

1. Capture massive amounts of data

One of the NSA's goals is to record the metadata on every phone call and email.

Obviously, no human personally reads all that data. But it's copied and stored anyway for searching later.

You can take the same approach. One easy way is to use integrated Google services together.

Google now offers 15 GB of free storage that can be divided any way you like between Gmail, Google Drive and Google+ photos. And they'll give you more if you pay for it.

Google also offers an Alerts service that searches the Internet and mails you the results. Most people set up only the number of Alerts that they can read. But that's not the NSA way.

The PRISM approach would be to harvest far more Google Alerts than any human could possibly process, then use Gmail filters to automatically skip the inbox and send them straight to a specially created folder within Gmail. You can set up new Alerts every day each time you think of an area of interest. These can include people you know, companies to watch, ideas to keep up with.

Alerts won't send you the data (the story), but the metadata (information about the story, plus the link). One advantage of this approach is that if a site is deleted, making it vanish also from Google Search, you'll still have a record of it with enough metadata to pursue leads.

Note that Google also offers Google Scholar Alerts, which works like regular alerts but that searches academic books, papers and other resources. This is one of the great underappreciated services on the Internet.

You can also spy on yourself NSA style by capturing the metadata on your phone calls and chats. (Of course, the email is already there.)

The trick is to use Google Voice, and turn on the features that save your information to email. (Note that Voice will send your data to any email address, not just a Gmail one.) You'll find the appropriate checkboxes under the Voicemail & Text tab of Google Voice Settings.

This will send metadata on all of your calls, plus full data on all your SMS chats, transcripts of your recorded calls and voicemails and even the sound recording of your voicemails for searching later.

Note that Google's new Hangouts feature, which is accessible in Gmail, Google+ and in the dedicated Hangouts mobile apps, will send the full text of all your chats plus metadata on your video calls to the Gmail address associated with your Google+ account.

You can also use various tools like IFTTT or Zapier to automatically drop all content or metadata from any RSS feed into Google Drive, or alternatives like Evernote for searching later on.
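If you would rather not route this harvesting through a third-party service, the same idea can be sketched locally. The example below pulls only the metadata (title, link, publication date) out of an RSS 2.0 document and appends it to a JSON Lines file for later searching. The function names and the output file are hypothetical, and the parser assumes plain RSS 2.0 `item` elements, not Atom.

```python
import json
import xml.etree.ElementTree as ET


def rss_metadata(xml_text):
    """Extract title, link and date from each item of an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    records = []
    for item in root.iter("item"):
        records.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "published": (item.findtext("pubDate") or "").strip(),
        })
    return records


def append_jsonl(records, path):
    """Append records to a JSON Lines file, one JSON object per line."""
    with open(path, "a", encoding="utf-8") as out:
        for record in records:
            out.write(json.dumps(record, ensure_ascii=False) + "\n")
```

The feed text itself can be fetched with `urllib.request.urlopen(feed_url).read()`, and the resulting file can be searched later with any text tool.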

Remember: Do it the NSA way and go nuts with this, dropping dozens, or even hundreds of items per day into your searchable storage. Don't worry about having too much data. Have faith in existing and future search tools to later find what you're looking for.

Beyond the automated harvesting of data, don't forget the manual approach, either. Capture every document that might someday be relevant and dump it into a special folder in Google Drive by using a browser extension like the Save to Google Drive plug-in. (Chrome has other extensions and so do other browsers.) You can do similar one-click saving using Evernote Web Clipper.

Once all this data and metadata is pouring into Gmail and Drive, you can simply use Google's search features to find what you're looking for.

The key to great NSA-style data harvesting, by the way, is to constantly tweak your code. Keep adding, deleting and modifying your Google Alerts and RSS feeds to make sure they deliver the kind of data you want.

2. Use algorithmic filtering

Algorithmic "noise filters" are popping up everywhere these days, especially on social networks and social media services where users could be overwhelmed by too much information.

But thinking like the NSA, we can use these filters to cast a massively wide information net, then let the filters weed out duplicate and irrelevant information for us. (Note that I got this tip from a conversation with blogger Robert Scoble this week.)

The idea is to set up a special-purpose Twitter feed for information harvesting, then use it to follow vastly more content sources than any human could possibly keep up with.

Then, read that feed using Flipboard, Prismatic or some other site that filters content for you and that supports Twitter. (Note that these services also support Facebook and Google Reader, but Google will discontinue Reader soon. Twitter is probably your best bet.)

One thing these filters do well is eliminate content duplicates. Instead of getting 500 stories about the name of Kanye and Kim's baby, you'll get just one story -- probably the best or most popular one -- and get it over with.

Another way to think about the power of algorithmic de-duping is that normally you might not follow a news source from which only one story in 100 is unique or exclusive. But because duplicate stories are filtered out, you get only the one unique story from that source and not the 99 also-ran stories.

This elimination of duplicates frees you to follow news and content sources promiscuously, casting an ultra-wide net without fear of overloading yourself with redundant content.
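How these services actually detect duplicates is not described here, so the following is only a stand-in for the general idea: treat two headlines as the same story when their word sets overlap heavily (Jaccard similarity) and keep only the first one seen. The function name `dedupe` and the 0.6 threshold are illustrative assumptions.

```python
import re


def _tokens(title):
    """Lower-cased set of alphanumeric words in a headline."""
    return frozenset(re.findall(r"[a-z0-9]+", title.lower()))


def dedupe(titles, threshold=0.6):
    """Keep the first headline of each group of near-duplicates.

    Two headlines count as duplicates when the Jaccard similarity of
    their word sets is at least `threshold` (an arbitrary choice here).
    """
    kept = []  # list of (title, token set) for headlines already accepted
    for title in titles:
        toks = _tokens(title)
        if not toks:
            continue
        is_duplicate = any(
            len(toks & seen) / len(toks | seen) >= threshold
            for _title, seen in kept
        )
        if not is_duplicate:
            kept.append((title, toks))
    return [title for title, _toks in kept]
```

This pairwise comparison is quadratic in the number of headlines, which is fine for a personal feed but would need a smarter index at anything like NSA scale.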

3. Don't forget the new photograph recognition tech

One of the amazing spy tools at the disposal of the NSA is the ability to process photographs for face, object and location information.

These tools are at your disposal, too.

Facebook's new Graph Search feature lets you quickly experiment with finding photos by trying different queries. For example, if you search for "Pictures taken by people who work at ..." followed by a company, you'll get what you asked for. (This is one way to spy on a competitor, for example.)

Google's picture searching takes it even further, enabling you to search not only for tags, keywords, associated text and location, but also content categorization. Google can actually recognize objects, landmarks and other stuff, even if the person who posted it added no such context.

For example, if you search Google+ for something like Sydney Opera House, you'll get a massive trove of pictures of the building, many of which are not accompanied by any mention of the words Sydney, Opera or House. Google actually recognizes the building using machine intelligence.

The same goes for categories of things. You can search for the word "car," which is not a specific thing but a type or category of thing. Google still gives you cars, whether they're tagged or not.

There's one ironic caveat to using the NSA's methods for wide-scale information harvesting and algorithmic filtering, which is that the NSA may theoretically know everything you're doing.

The NSA's domestic surveillance programs are controversial and possibly unconstitutional. But let's face it: They work.

And the NSA's methods can work for you, too.

This article, How to run your own NSA spy program, was originally published at Computerworld.com.

Mike Elgan writes about technology and tech culture. Contact and learn more about Mike at http://Google.me/+MikeElgan. You can also see more articles by Mike Elgan on Computerworld.com.

Prism doesn't have CIOs in a panic -- yet - Computerworld

But CIOs and other tech executives say the spying scandal underscores the need for strong corporate security measures

By Chris Kanaracus

June 17, 2013 11:56 AM ET


IDG News Service - Revelations over the U.S. National Security Agency's Prism surveillance program have much of the general public in uproar, but in terms of the controversy's impact to enterprise IT, some CIOs have measured, albeit watchful, reactions.

"I don't see it as a problem for us," said Mike Zill, CIO of medical-products manufacturer CareFusion. "I don't see the government doing something to systematically damage our company or any company."

That said, CareFusion already has multiple "highly secure" systems in the company for protecting highly sensitive information, but those systems don't cover all of CareFusion's data and employees, Zill said. "The question is, do we push that to everybody? It's a question of the economics and the risk-to-reward [quotient]."

Only certain industries may need to worry, according to another IT professional.

"I think if we were some nuclear or medical company or something like that it would have been different, but the fact that we can tell you when Justin Timberlake is going on tour doesn't matter," said Ian Woodall, project manager and group IT at XL Video, a British company that provides large-scale video equipment for music concerts and festivals.

Many enterprises may be more concerned about industrial espionage than government spy agencies cracking their communications. But Prism should nonetheless serve as a clear wake-up call to CIOs and other IT executives, said Nick Selby, CEO of StreetCred Software and a risk management consultant who advises large organizations on industrial espionage and data breaches.

"If you take a look at what's already known about monitoring the public Internet, what you find is unencrypted email for decades has been entirely susceptible to in-transit copying, monitoring and surveillance," he said. "Most CIOs and most CSOs have not taken to heart the fact that it is not only possible your email will be intercepted and surveilled, it is likely. The value of encrypted communications has never before been so clearly outlined."

Still, Prism isn't going to spark "a major change in direction" at Toyota Motor Engineering and Manufacturing, North America, as the company is already "pretty high up that ladder of locking things down," said Tim Platt, vice president of information systems and information security. "Espionage is one of our larger concerns."

That said, the fallout from Prism "certainly adds weight to some of the considerations we've made in the past," he added. For example, in order to print or scan a document, employees must place their company badge on a reader, which logs the transaction, Platt said.

Toyota in general places strong emphasis on "what information is going outside of our walls, what the content of that is, and who could get at it," Platt added.

Platt also hinted at one potential benefit to CIOs resulting from the Prism revelations.

Security measures "cost money," he said. "Being able to point to the news that everybody's watching and say, 'that's what we're talking about,' that [simplifies] making business cases to executives."

The Prism scandal has shaken all of us, but perhaps mostly as individuals, according to Tony Soderlund, CIO at Salem Municipality in Sweden, which uses Google Apps.

To that end, CareFusion's Zill sees the potential for a post-Prism backlash against enterprises that use tracking tools, data mining, analytics and other technologies to profile customers, send targeted advertisements and, ultimately, sell more products and services.

"I don't want to be tracked," Zill said, citing the "digital leash" companies try to place on consumers. "It's exhausting."

In the wake of Prism, companies "should be prepared to be very clear about how they use customer and prospect information," said analyst Curt Monash of Monash Research. "This news makes the general populace antsier about privacy."

"My general approach to privacy issues is that it's inevitable that information will be passed around," Monash added. The key for enterprises is to be "responsible in its use and be seen to be responsible," he said.

(With reporting from Mikael Ricknas in London).

Quinn_Eskimo

Edward Snowden is an example of the managers of the database!

Feel all better now? Didn't think so.

Quinn_Eskimo

Corporate CIO's don't have a clue. Watch the fully briefed politicians. They're sweatin' to the oldies. Look at Mike Rogers. These guys KNOW what's goin' on. They're running scared like never before.

Corporate CIO's have been obsoleted. They have no clu.

zman58

"I don't see the government doing something to systematically damage our company or any company." Officially that makes sense and of course we would not want the government to plan on damaging business with this treasure trove of information they are collecting.

But consider that ordinary people, who work for the government, are the ones with access to the information. People look at the data--basically ordinary people. They would be government employees and possibly contractors with some level of security clearances. People have weaknesses and faults of all kinds. Some of them are less than honest, or perhaps desperate in some way, and might try to sell or use information in ways that is not officially sanctioned. I would not want troves of business and/or personal data being gathered and sifted through by people I do not know--even if they have managed somehow to obtain "security" clearances.

Think about the possibility of discovering information that could be used in the markets to gain an unfair advantage in securities trading--like inside information. You think it might be tempting for someone? How about competitor information that might be extremely valuable to a business -- extremely valuable $$. How might that be leveraged? This PRISM is a very bad system indeed -- we need to get rid of it.




The Last but not Least Technology is dominated by two types of people: those who understand what they do not manage and those who manage what they do not understand ~Archibald Putt. Ph.D


Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available to advance understanding of computer science, IT technology, economic, scientific, and social issues. We believe this constitutes a 'fair use' of any such copyrighted material as provided by section 107 of the US Copyright Law according to which such material can be distributed without profit exclusively for research and educational purposes.

This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Grammar and spelling errors should be expected. The site contains some broken links as it develops like a living tree...


Disclaimer:

The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the Softpanorama society. We do not warrant the correctness of the information provided or its fitness for any purpose. The site uses AdSense, so you need to be aware of Google's privacy policy. If you do not want to be tracked by Google, please disable Javascript for this site. This site is perfectly usable without Javascript.

Last modified: January, 09, 2020