Why Data Mining Won't Stop Terrorists

Security and tech guru Bruce Schneier writes the definitive rebuttal of data mining as a counterterrorism tool.

Rule number one:

Data mining works best when there's a well-defined profile you're searching for, a reasonable number of attacks per year, and a low cost of false alarms.

Example: credit card fraud. By examining records of your transactions, credit card companies can spot a spending pattern that indicates something nefarious may be afoot. It's different with terrorism:

Terrorist plots are different. There is no well-defined profile, and attacks are very rare. Taken together, these facts mean that data mining systems won't uncover any terrorist plots until they are very accurate, and that even very accurate systems will be so flooded with false alarms that they will be useless.

All data mining systems fail in two different ways: false positives and false negatives. A false positive is when the system identifies a terrorist plot that really isn't one. A false negative is when the system misses an actual terrorist plot. Depending on how you "tune" your detection algorithms, you can err on one side or the other: you can increase the number of false positives to ensure that you are less likely to miss an actual terrorist plot, or you can reduce the number of false positives at the expense of missing terrorist plots.
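The tuning tradeoff described above can be sketched with a toy example. This is only an illustration: the suspicion scores, the `count_errors` helper, and the thresholds are all hypothetical and do not come from Schneier's essay or from any real detection system.

```python
# Hypothetical suspicion scores between 0 and 1, invented for illustration.
benign_scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.70]
plot_scores = [0.45, 0.65, 0.90]


def count_errors(threshold):
    """Return (false_positives, false_negatives) when every score
    at or above the threshold is flagged as a suspected plot."""
    false_positives = sum(1 for s in benign_scores if s >= threshold)
    false_negatives = sum(1 for s in plot_scores if s < threshold)
    return false_positives, false_negatives


# A lower threshold flags more innocent events; a higher one misses more plots.
for threshold in (0.3, 0.5, 0.8):
    fp, fn = count_errors(threshold)
    print(f"threshold={threshold}: {fp} false positives, {fn} false negatives")
```

In this toy data no threshold gets both error counts to zero, because the benign and plot scores overlap. That overlap is the situation the passage describes.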

Data-mining is the equivalent of searching for the proverbial needle in the haystack. Schneier crunches some numbers and reports:

This unrealistically-accurate system will generate one billion false alarms for every real terrorist plot it uncovers. Every day of every year, the police will have to investigate 27 million potential plots in order to find the one real terrorist plot per month. Raise that false-positive accuracy to an absurd 99.9999% and you're still chasing 2,750 false alarms per day -- but that will inevitably raise your false negatives, and you're going to miss some of those ten real plots.
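The arithmetic behind those figures can be reproduced in a few lines. The excerpt above does not state its inputs, so the values below are assumptions chosen to be consistent with the quoted results: one trillion events sifted per year, ten real plots per year, and a false-positive rate of 1 in 100 (99% accurate), later tightened to 1 in 1,000,000 (99.9999%).

```python
# Assumed inputs; only the "ten real plots" figure appears in the excerpt.
EVENTS_PER_YEAR = 10**12      # assumption: events sifted per year
REAL_PLOTS_PER_YEAR = 10      # "those ten real plots" in the quote
DAYS_PER_YEAR = 365


def false_alarms_per_year(one_in):
    """False alarms per year if 1 in `one_in` innocent events is flagged.
    The handful of real plots is ignored as negligible next to the total."""
    return EVENTS_PER_YEAR // one_in


for one_in in (100, 1_000_000):
    yearly = false_alarms_per_year(one_in)
    per_plot = yearly // REAL_PLOTS_PER_YEAR
    per_day = yearly / DAYS_PER_YEAR
    print(f"1 in {one_in:,}: {yearly:,} false alarms/year, "
          f"{per_plot:,} per real plot, {per_day:,.0f} per day")
```

Under these assumptions the 99% case works out to one billion false alarms per real plot and roughly 27 million per day. The 99.9999% case comes to about 2,740 per day, close to the quoted 2,750.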

After some more examples of where data mining can be useful -- think Amazon or Netflix projecting books or movies you might like based on your past purchases or reviews -- Schneier writes:

Finding terrorism plots is not a problem that lends itself to data mining. It's a needle-in-a-haystack problem, and throwing more hay on the pile doesn't make that problem any easier. We'd be far better off putting people in charge of investigating potential plots and letting them direct the computers, instead of putting the computers in charge and letting them decide who should be investigated.

Makes sense to me. Unfortunately, TIA (Total Information Awareness) lives on, as the National Journal reported a few weeks ago. It just went into the equivalent of the witness protection program: it changed its name and moved to the Defense Department.

    Re: Why Data Mining Won't Stop Terrorists (none / 0) (#1)
    by anon55 on Fri Mar 10, 2006 at 12:09:06 AM EST
    Data mining may not help with terrorism, but the Bush administration can monitor left-wing blogs, Democrat candidates' homes and election headquarters, Democrat congressmen - both at home and at work - and businesses owned by Democrats. The list does not end. [Ed. Please use a name, even one that is not your own, so that we can tell a single person is writing your comments. Thanks.]

    Re: Why Data Mining Won't Stop Terrorists (none / 0) (#2)
    by Che's Lounge on Fri Mar 10, 2006 at 07:26:51 AM EST
    Figure that out just now did they? Thank you Captain Obvious.

    Re: Why Data Mining Won't Stop Terrorists (none / 0) (#3)
    by Talkleft Visitor on Fri Mar 10, 2006 at 08:06:29 AM EST
    And there's some little jerk in the FBI keepin' papers on me six feet high, it brings me down, it brings me down

    Re: Why Data Mining Won't Stop Terrorists (none / 0) (#4)
    by jimakaPPJ on Fri Mar 10, 2006 at 09:01:39 AM EST
    Et al - The information provided concerns data mining, and may or may not be accurate. I seem to remember many people thought airplanes could never fly faster than the speed of sound, and further back it was thought people wouldn't be able to breathe on trains as they went faster and faster. You have to watch those engineers, they just keep on inventing while the current crop of experts have a tendency to proclaim there is nothing past their knowledge base. For those of you in the LA area, I urge you to take some time today and go down to the Anaheim Convention Center and tour the exhibits in the OFC/NFOEC show. The technology will blow your mind, and it is free. I seem to remember that the NSA is supposedly listening in on telephone calls, and some data calls, from terrorist telephones/computers outside the US to numbers/computers inside the US and the reverse. Subject activity being done without a warrant. So instead of millions it appears that number is much lower. At any given time probably fewer than 500, courtesy of the NYT giving the terrorists a heads up. In addition voice traffic requires an old fashioned recording device, not the complex programs described. And of course data becomes easy when you have the IP, etc. The other point is: where do the numbers come from, and why the requirement for speed? From what I have read, the initial information comes from captured cell phones, computers and (gasp!) interrogation. So it ages quickly and is mostly useless within a matter of hours. And if you are concerned over data mining, I assume you have been pounding the DNC over its announced plan to spend millions to construct a database of Democrats... It would be interesting to see the filter they use to keep out the Independents and Repubs. Perhaps they will use the old random sample telephone survey. If they use the CBS method they will already have 70% Demos. The key word could be: war. Those who do not go into an immediate rant can be assumed to be the opposition and dropped from the database. How do you spell that word? You know.. hypocrites...

    Re: Why Data Mining Won't Stop Terrorists (none / 0) (#5)
    by Talkleft Visitor on Fri Mar 10, 2006 at 09:44:54 AM EST
    I think that data mining could be turned into a terrorist tool: the terrorists are unpredictable, but the authoritarian response is very predictable. E.g., coded net messages could be designed to provoke an authoritarian response that would drive a wedge between American Muslims and Americans in general. This administration is certainly inept enough to aid AQ in its local recruiting. (Read Wasp by Eric Frank Russell.)

    Re: Why Data Mining Won't Stop Terrorists (none / 0) (#6)
    by Talkleft Visitor on Fri Mar 10, 2006 at 09:46:37 AM EST
    "How do you spell that word? You know.. hypocrites..." Yes, PPJ, a political party putting together a database of its followers is exactly the same as the US Government data-mining the transactions of its citizens, right? Oh, and talking about surveys and political parties........

    Re: Why Data Mining Won't Stop Terrorists (none / 0) (#7)
    by Talkleft Visitor on Fri Mar 10, 2006 at 09:51:21 AM EST
    Nice try, Jim. If the number's that low, it's just that much more reason they can get a FISA warrant, but by all means, my back's actin' up in this damp weather, feel free to keep diggin' your own grave. I'll just sit here and watch. Make the occasional comment. Oh, still goin' with the NYT bashin', eh? Yeah, well, maybe there's five people left in the Country who don't know about Judy Miller's stenography of the administration's wmd lies and bs and the fact that they sat on this warrantless wiretap story for over a year when it may well have influenced the election results against them, doesn't mean ya can't pull it off. Given the crowd you're workin' with, the FACT THAT IT'S A TOTAL LIE AND ABSOLUTE BS DOESN'T MEAN ANYTHING. So, it's your contention that 2 outta 3 Americans are hypocrites?

    Re: Why Data Mining Won't Stop Terrorists (none / 0) (#8)
    by jimakaPPJ on Fri Mar 10, 2006 at 10:24:04 AM EST
    Charliedontsurf1 - Uh, you seem particularly out of touch today. The NYT wrote and published the article. You do remember 12/6 don't you? Of course it was almost three months ago. And your reference to Judith Miller confuses me. Since you claim she lied, are you saying that the article is a lie? I mean, what does one have to do with the other? And the issue isn't quantity, the issue is time. How much time do they have after getting a "number" to monitor for intelligence information before the terrorists know it is compromised? And 66% of Americans are Demos? Who knew? Shirt - Good point. In poker language that is called "false tells." Problem is, it can't be used very often. Dark Avenger - Couple of points. Your link is to a STATE not a NATIONAL effort. And are you saying it is okay because the Repubs are doing it? Wow! Bang!

    Re: Why Data Mining Won't Stop Terrorists (none / 0) (#9)
    by Talkleft Visitor on Fri Mar 10, 2006 at 11:06:00 AM EST
    Posted by JimakaPPJ March 10, 2006 11:24 AM
    Charliedontsurf1 - Uh, you seem particularly out of touch today. The NYT wrote and published the article. You do remember 12/6 don't you? Of course it was almost three months ago. And your reference to Judith Miller confuses me. Since you claim she lied, are you saying that the article is a lie? I mean, what does one have to do with the other? And the issue isn't quantity, the issue is time. How much time do they have after getting a "number" to monitor for intelligence information before the terrorists know it is compromised? And 66% of Americans are Demos? Who knew?
    Once again, in the interest of time and bandwidth, we'll try to confine ourselves to the things you do know. Who said Democrats? Dems or not, We 66 percent sure ain't shrub fans. We've all seen enough of that clown act. OK, I'll repeat it again, slowly. Yeah, 12/6/05. It should've been 10/25/04. Comprende? It was ready to go? Fitz could've had his Press Conference then, too, if not for all the White House's Obstruction of Justice and other criminal activity. Judith Miller published the lies the WH wanted her to. Just like Woodward did. That made the lies appear legit. Am I still goin' too fast for ya? And as for this warrantless wiretap bs, cut the crap. They know they're bein' monitored. That's to keep it secret from the people. They're spyin' on their political enemies. They're spyin' on the Humane Society and Quakers in Basements and people who exercise their rights to peacefully assemble and demonstrate against this War. Just like their parents did for Nixon, only now they've got better technology. But spare me the slop about these punks bein' patriots. People who support or think like this don't have a patriotic bone in their bodies. They don't have a clue what this Country is about. They couldn't.

    Re: Why Data Mining Won't Stop Terrorists (none / 0) (#10)
    by Talkleft Visitor on Fri Mar 10, 2006 at 11:35:43 AM EST
    "Your link is to a STATE not a NATIONAL effort." I said surveys and political parties, and that you did click on the link and read the post was good. "And are you saying it is okay because the Repubs are doing it?" Actually, PPJ, it's a fundraising effort disguised as a survey, and using fear to deplete the bank accounts of Republicans anywhere is fine with me; my objection is to the final destination of such funds. :>)

    Re: Why Data Mining Won't Stop Terrorists (none / 0) (#11)
    by Talkleft Visitor on Fri Mar 10, 2006 at 01:52:46 PM EST
    The "base rate fallacy" is not a technological limitation. All modern communications are digital, except television, but even HDTV is digital. Telephone calls certainly would not require "old fashioned recording device(s)". Such recordings would just make the calls harder to work with. If they had specific communications addresses to work with they wouldn't need data mining. Traffic analysis and human analysis of the data would be more accurate. Data mining is used when there are large numbers of calls and few or no certain addresses. Part of the purpose of the data mining would be to identify the important addresses. Most of all, the terrorists/spies/saboteurs know all about IP addresses, email addresses and telephone numbers. They're unlikely to use the same addresses repeatedly for important communications. They probably know how to use both codes and cyphers, if they need to. In fact, important communications may well take place only in face-to-face meetings. Data mining by itself shouldn't trouble anyone, as Bruce Schneier pointed out. It's how the government is using it that is the problem. An interesting look at WWII signals intelligence challenges might be Leo Marks' "Between Silk and Cyanide". Of course, if you're interested in a broader history there's always Kahn's "The Codebreakers". Or there are any of Schneier's books. Questioning Mr. Schneier's expertise in computer or communications security is, perhaps, a bit like questioning Ms. Merritt's knowledge of criminal law.

    Re: Why Data Mining Won't Stop Terrorists (none / 0) (#12)
    by jimakaPPJ on Fri Mar 10, 2006 at 08:49:55 PM EST
    allen - Trust me. If you are going to listen to what was said, it is reproduced in analog form. charlie - The subject was data mining. Now I know you have zero knowledge about the subject, but if you just be quiet the rest of the world may not figure that out.

    Re: Why Data Mining Won't Stop Terrorists (none / 0) (#13)
    by Talkleft Visitor on Fri Mar 10, 2006 at 09:16:09 PM EST
    PPJ:
    voice traffic requires an old fashioned recording device,
    No, it can be done digitally, PPJ. To listen would take an earphone or headphones or speakers, but the recording doesn't have to be an analog recording; just the output for us human types has to be analog (sound waves).

    Re: Why Data Mining Won't Stop Terrorists (none / 0) (#14)
    by jimakaPPJ on Sat Mar 11, 2006 at 11:08:13 AM EST
    Dark Avenger - Digital recording is an old fashioned device. Where have you been? But you understood the point. Because my point to allen:
    allen - Trust me. If you are going to listen to what was said, it is reproduced in analog form.
    Now you may use speakers or headphones, as desired. allen - I didn't challenge anyone's expertise, I just noted that experts wear out, grow old and fail to keep up. It is the human condition. I don't think anyone outside of NSA has a real grasp of what they can, or cannot, do. I also noted that the claim of "warrantless" is what the complaint is about. My contention is that the number would be so small that the problem noted by Bruce S. does not come into play. I'll stand by that.

    Re: Why Data Mining Won't Stop Terrorists (none / 0) (#15)
    by Talkleft Visitor on Sat Mar 11, 2006 at 07:20:09 PM EST
    "Digital recording is an old fashioned device. Where have you been?" Digital recording is the current method of recording sound, PPJ. Can you tell me what kind of recording device isn't old-fashioned by your standards? Your imprecision of terminology, along with your usual attempt to denigrate the writing of someone whose opinion you don't agree with, tells us a lot here. "I just noted that experts wear out, grow old and fail to keep up. It is the human condition." So you think that one of the three applies to Schneier because.......? "I don't think anyone outside of NSA has a real grasp of what they can, or cannot, do." There are technical limits as to what can be done and not done, PPJ, unless you're one of those tinfoil hat types that believes the government has alien technology from "Area 51". You didn't think that Men in Black was a documentary, did you? "My contention is that the number would be so small that the problem noted by Bruce S. does not come into play." If you can show where he was off in his theory or figures he used in his examples, your contention would have all the gravity of a hypothesis, and thus it would be testable and falsifiable. TTFN.

    Re: Why Data Mining Won't Stop Terrorists (none / 0) (#16)
    by Talkleft Visitor on Sat Jun 03, 2006 at 12:22:52 AM EST
    I can understand why the government wants to data mine. The most successful businesses data-mine their clients and look for trends and opportunities. However, gathering all of this information and having one branch hold all of it will just lead to disaster. Now that the enemy knows that we are doing this to our own citizens, it is just a matter of time before they decide to hack into our computers, and then they will have even more information to use against us. To make sure that this does not happen, the govt. needs to separate the data and store it in separate areas so that no one branch or service has all the compiled data.