While journalists seemingly get little assistance, government cyberspies are comparatively spoon-fed: the National Security Agency (NSA) has created a nifty little research book designed to help secret agents uncover the most clandestine intelligence hiding on the web.
If you want to learn how to search like a top secret spy for the most confidential of information on the Internet, simply visit the National Security Agency’s website and download its book titled, “Untangling the Web: A Guide to Internet Research”.
One of the first things that you’ll come across when you download the book is a fair bit of crossing out. That is because when the Internet handbook was first released in 2007, it was for official use only and had not been approved for public release.
However, following a Freedom of Information Act (FOIA) request, the guidebook was declassified recently, meaning that the “For Official Use Only” stamp has been replaced with an “Unclassified” one.
According to Wired, the FOIA request was filed by MuckRock, a website that charges fees to process public records for activists, journalists and researchers. (1)
Fully Exploiting Online Tools and the Invisible Web
The highly informative book is bursting with advice on how to fully exploit the search engines and other online tools that make up the “invisible web”. It tells its readers what to type into the likes of Google in order to find highly sensitive passwords in Russia or confidential company spreadsheets.
By far the most interesting chapter of the once confidential book, though, is titled “Google Hacking”.
Informing readers on how to “Google hack” naturally sparked concern as well as intrigue, although the author of the book is quick to point out that there is nothing illegal about “Google hacking”, stating:
“Google (or search engine) hacking involves using publicly available search engines to access publicly available information that almost certainly was not intended for public distribution. In short, it’s using clever but legal techniques to find information that doesn’t belong on the public internet.”
This is essentially the modern definition of the invisible web. In the next breath, the book explains what kinds of sensitive data can routinely be found using search engines and are most commonly discovered by Google hackers, which it lists as being:
–> Personal and/or financial information
–> User IDs, computer or account logins, passwords
–> Private, confidential, or proprietary company data
–> Sensitive government information
–> Vulnerabilities in websites and servers that could facilitate breaking into the site (2)
The book then goes on to explain how to discover the tiny fraction of the eight billion pages in the Google index that were not meant to be made available to the public, and provides practical advice on typical techniques for hacking into the “secret web”, or the invisible web.
One such technique outlined in the NSA’s guidebook is to search by file type, site type and keyword. As many companies store their financial data and other confidential information about employees and procedures in Excel spreadsheets, and tend to title the sheet “Confidential”, the hacker should include such words in their query.
The book uses the example of a hacker wanting to find sensitive information about an organization in South Africa. That hacker might start their query as:
[Filetype:xls site:za confidential]
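To make the structure of that query concrete, here is a minimal Python sketch that assembles the same kind of search string from a file type, a site restriction and a list of keywords. The function name `build_query` and its parameters are hypothetical and chosen purely for illustration; they do not come from the NSA guidebook. The sketch only builds text and never contacts a search engine. Wrapping multi-word keywords in double quotes is an assumption based on the common convention for exact-phrase searches.

```python
def build_query(filetype, site, keywords):
    """Assemble a search string from a file type, a site restriction and keywords.

    Hypothetical helper for illustration only: it returns a string and
    does not send anything to a search engine.
    """
    parts = [f"filetype:{filetype}", f"site:{site}"]
    for keyword in keywords:
        # Assumption: multi-word keywords are quoted so they are
        # treated as an exact phrase.
        parts.append(f'"{keyword}"' if " " in keyword else keyword)
    return " ".join(parts)


# Rebuild the South Africa example from the article.
query = build_query("xls", "za", ["confidential"])
print(query)  # filetype:xls site:za confidential
```

Swapping in a different file type, domain or keyword list changes the query without changing its three-part shape, which is the point of the technique the book describes.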
More Ammunition for Hackers
Now forgive me if I sound naive, but wouldn’t encouraging users to search for terms such as “internal”, “confidential”, “budget” and “not for distribution” in order to dig up the confidential information of organizations and individuals give the “true” hackers, the ones the FBI is so desperate to combat, more ammunition?
Another chapter that has raised a few eyebrows is the one titled “Uncovering the ‘Invisible’ Internet”.
“The deep (aka hidden or invisible) web continues to elude most search services and users seeking to plumb its depths. The challenge is how to access that part of the web that remains invisible to search engines,” writes the author in his introduction to the chapter. (2)
This collection of websites, which is effectively beyond the reach of conventional search engines and includes everything from the Library of Congress to private company networks, could, as Extreme Tech notes, “be incredibly powerful in the right hands.” (3)
Wouldn’t it be more appropriate to point out that informing, and thereby effectively encouraging, users to delve into the clandestine world of the invisible web could prove incredibly dangerous in the wrong hands?