Data mining: The new weapon in the war on terrorism?

If the government is analyzing Americans’ phone records to discover and track terrorist networks — or ever plans to do so — the requisite technology would cost a lot of money, demand considerable computing power and raise privacy issues, observers say.

The possibility that the government is sifting through tens of millions of phone records came to the public’s attention earlier this month after USA Today reported that the National Security Agency had collected records from AT&T, Verizon and BellSouth.

Although it is unknown if the government is probing phone records for national security purposes, the possibility shines a spotlight on the potential benefits and drawbacks of a sophisticated technology that few people fully understand.

That technology is data mining, or extracting knowledge from a vast amount of data. The technique requires very fast computers and software capable of running complex algorithms, experts say.

Nathan Hoskin, chief architect of Planning Systems, a data analysis and engineering systems developer, said the government would need supercomputers “on the scale of Blue Gene or Columbia — or you could also create what amounts to a supercomputer out of hundreds or thousands of regular PCs.”

The development of a data-mining system that could analyze U.S. phone data would cost somewhere in the range of $20 million to $50 million, added Hoskin, whose company has worked with federal agencies.

If telecommunications companies hand over their records, three kinds of algorithms might be helpful in investigating potential terrorist cells: clustering algorithms, link analysis and association rule mining.

The first, clustering, groups pieces of data that are similar to one another. The second, link analysis, attempts to connect the dots among disparate data points, such as terrorist conspirators scattered worldwide.
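To make the clustering idea concrete, here is a minimal sketch in Python. It is not a description of any system the government or Hoskin's company uses. It assumes each caller is represented by the set of numbers it has contacted, measures similarity with the Jaccard index, and merges callers whose similarity meets a threshold. The phone numbers, the `cluster_by_contacts` function and the 0.5 threshold are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Similarity of two sets: size of the overlap divided by size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_by_contacts(contacts, threshold=0.5):
    """Group callers whose contact sets are similar (single-linkage clustering).

    contacts maps a caller to the set of numbers it has called.
    Two callers land in the same cluster when a chain of pairwise
    similarities at or above the threshold connects them.
    """
    parent = {caller: caller for caller in contacts}

    def find(x):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(contacts, 2):
        if jaccard(contacts[a], contacts[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for caller in contacts:
        clusters.setdefault(find(caller), set()).add(caller)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical call records: caller -> numbers called.
contacts = {
    "555-0101": {"555-0900", "555-0901", "555-0902"},
    "555-0102": {"555-0900", "555-0901", "555-0903"},
    "555-0103": {"555-0700", "555-0701"},
    "555-0104": {"555-0700", "555-0701", "555-0702"},
    "555-0105": {"555-0300"},
}

clusters = cluster_by_contacts(contacts)
for group in clusters:
    print(group)
```

In this toy data the first two callers share half of their combined contacts and the next two share two-thirds, so each pair forms a cluster, while the last caller stays alone. A real system would likely use richer features, such as call times and durations, and an algorithm that scales to millions of records.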

“Terrorists are smart enough to know that if ‘Al’ and ‘Joe’ are both known criminals, they can’t talk directly without attracting law enforcement’s attention,” said Hoskin, who has worked on data analysis and data-mining projects for corporations such as Equifax and Enron during his 25-year career. “With link analysis algorithms, you can start looking for common sets of paths [or] routes.”

For example, intelligence officials might be able to identify a terrorist cell leader by tracing call routes. The algorithm might show that a Texas-based terrorist who attacked a facility in Austin had previously communicated with a conspirator in Oklahoma City, who had spoken with a co-conspirator in Boston, who in turn had been in touch with someone in Spain, and on and on, until the call route stopped in Pakistan. Then the officials may target the Pakistani caller as a possible cell leader.

But this approach can produce meaningless data because it becomes harder for the link analysis to connect the dots once the route extends five or six hops, Hoskin said.
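The call-route example and the hop limit can be illustrated with a short Python sketch. It assumes call records are reduced to undirected pairs of parties and uses a breadth-first search to find the shortest chain between two of them, giving up once the chain exceeds a maximum number of hops. The `trace_route` function, the `max_hops` default of six and the place names standing in for phone numbers are illustrative assumptions, not details from any actual system.

```python
from collections import deque


def trace_route(calls, start, target, max_hops=6):
    """Find the shortest chain of calls linking start to target.

    calls is a list of (party_a, party_b) pairs, treated as undirected.
    Returns the chain as a list of parties, or None when no chain of
    at most max_hops calls exists.
    """
    graph = {}
    for a, b in calls:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        if len(path) - 1 >= max_hops:
            continue  # chain is already at the hop limit, do not extend it
        for neighbor in sorted(graph.get(path[-1], ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


# Hypothetical call records; places stand in for individual phone numbers.
calls = [
    ("Austin", "Oklahoma City"),
    ("Oklahoma City", "Boston"),
    ("Boston", "Spain"),
    ("Spain", "Pakistan"),
    ("Austin", "Dallas"),
]

route = trace_route(calls, "Austin", "Pakistan")
print(route)

too_short = trace_route(calls, "Austin", "Pakistan", max_hops=3)
print(too_short)
```

The first search returns the four-hop chain from Austin to Pakistan. The second, capped at three hops, returns `None`. The cap reflects the trade-off Hoskin describes: each extra hop multiplies the number of people reachable, so long chains increasingly link unrelated callers.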

The third method — association rule mining — looks for patterns within data. If every time Al gets a call from Oklahoma City he then immediately calls Pakistan, the algorithm associates calls originating in Oklahoma City with the country Pakistan. The association may raise red flags for intelligence officials.
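The Oklahoma City example can be expressed as a simple rule with support and confidence, the two measures commonly used in association rule mining. The Python sketch below is a deliberately simplified stand-in for a full algorithm such as Apriori. It assumes one phone's activity is available as a chronological list of events and looks only at pairs of consecutive events. The `next_call_rules` function, its thresholds and the event data are hypothetical.

```python
from collections import Counter


def next_call_rules(events, min_support=2, min_confidence=0.8):
    """Find rules of the form "incoming call from X -> next call goes out to Y".

    events is a chronological list of (direction, place) tuples for one
    phone, where direction is "in" or "out". Only incoming calls that are
    followed by another event are counted.
    Returns sorted (source, destination, support, confidence) tuples.
    """
    incoming = Counter()   # how often an incoming call from X was observed
    followed = Counter()   # how often it was immediately followed by a call to Y
    for (d1, p1), (d2, p2) in zip(events, events[1:]):
        if d1 == "in":
            incoming[p1] += 1
            if d2 == "out":
                followed[(p1, p2)] += 1

    rules = []
    for (src, dst), support in followed.items():
        confidence = support / incoming[src]
        if support >= min_support and confidence >= min_confidence:
            rules.append((src, dst, support, confidence))
    return sorted(rules)


# Hypothetical activity for one phone ("Al"), in chronological order.
events = [
    ("in", "Oklahoma City"), ("out", "Pakistan"),
    ("in", "Boston"), ("out", "Austin"),
    ("in", "Oklahoma City"), ("out", "Pakistan"),
    ("out", "Boston"),
    ("in", "Oklahoma City"), ("out", "Pakistan"),
    ("in", "Boston"), ("out", "Dallas"),
]

rules = next_call_rules(events)
for src, dst, support, confidence in rules:
    print(f"in from {src} -> out to {dst}: support={support}, confidence={confidence:.2f}")
```

Here every incoming call from Oklahoma City is followed at once by a call to Pakistan, so that rule passes both thresholds. Calls from Boston are followed by different destinations each time, so no rule is reported for them. The thresholds matter: set too low, coincidences are flagged as patterns.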

Computer programs can combine all of those algorithms, too. If the composite picture points to the same person, the government could decide to probe every contact that person has called in the past few years.

Hoskin said he thinks the government would be reluctant to delve into this sort of personal information until the data mining produces convincing evidence.

“If I was an agent of the government, it wouldn’t be until the point that something had really piqued my interest that I’d say… ‘Do a lookup on this number and find all the people associated with it,’” he said.

But privacy advocates say mining phone records could produce a mountain of civil rights violations without ever generating one lead.

Jay Stanley, public education director of the American Civil Liberties Union’s technology and liberty program, said intelligence work could easily creep from mining to wiretapping and other modes of surveillance. “We have to expect that anybody that gets flagged by one tool, like this telephone records database, would find themselves subject to the National Security Agency’s other spying tools, whatever those might be.”

Critics say the possible data-mining initiative resembles the Defense Department’s scrapped Total Information Awareness program, which was envisioned as a way to anticipate potential terrorist attacks by analyzing patterns from a massive and wide-ranging database of electronic information.

“There’s a lot of evidence that the National Security Agency is engaging in data-mining activities that do bear some resemblance to the TIA program,” Stanley said. “I think one of the primary questions that Congress needs to investigate is to what extent they are engaging in TIA-like activities by sharing private phone records.”

Even if phone companies are not giving out personal identifiers (customers’ names, street addresses and other personal information), the government can obtain personal information from a phone number via other databases and services, according to data-mining experts.

“It would take a large bank, much less the National Security Agency, about 10 minutes to assign names to all those phone numbers,” Stanley added.

Earlier this month, a federal auditor testified to the House Judiciary Committee Commercial and Administrative Law Subcommittee that agencies had failed to comply with data-mining protocols as recently as August 2005.

“Increased use by federal agencies of data mining, the analysis of large amounts of data to uncover hidden patterns and relationships, has been accompanied by uncertainty regarding privacy requirements and oversight of such systems,” said Linda Koontz, information management issues director at the Government Accountability Office, testifying before the subcommittee.

“As we reported in previous work, the result was that although agencies employing data mining took many steps needed to protect privacy, such as issuing public notices, none followed all key procedures, such as including in these notices the intended uses of personal information,” she said.

In comparing wiretapping to looking at phone records, observers say both pose threats to Americans’ privacy.

“Listening to the content of calls is more intrusive, but nobody should underestimate the privacy invasion that’s involved in tracing who’s talking to whom,” Stanley said. He added that the effort could expose innocent citizens’ calls to therapists, lovers and hot lines.

“People have the implicit expectation that the list of people they call will not be shared with their neighbors or the government,” Stanley said.

Mining phone records to find terrorists could be a waste of time, akin to tagging the entire U.S. population as a possible suspect, he said.

“Most of the successes we’ve seen in the national security area seem to be old-fashioned, stick-to-the-basics investigative work…start from known leads and work outward,” Stanley said.
