TIPSTER Text Program

The TIPSTER Text Program was a government effort, led by the Defense Advanced Research Projects Agency (DARPA), to advance the state of the art in text processing technologies through the cooperation of researchers and developers in Government, industry and academia. The resulting capabilities were deployed within the intelligence community to provide analysts with improved operational tools. Due to lack of funding, the program formally ended in the fall of 1998.

DARPA, the Department of Defense (DoD) and the Central Intelligence Agency (CIA) jointly funded and managed the program, in close collaboration with the National Institute of Standards and Technology (NIST) and the Space and Naval Warfare Systems Center (SPAWAR, or SSC), formerly NCCOSC/NRaD. A TIPSTER Advisory Board was formed in 1998 with members representing users from other Government agencies interested in automated text processing, such as the Department of Energy (DOE), the Federal Bureau of Investigation (FBI), the Internal Revenue Service (IRS), the National Science Foundation (NSF) and the Treasury Department.

In its efforts to improve the efficiency and cost effectiveness of document processing, TIPSTER focused on three underlying technologies:

  • Document Detection: the capability to locate documents containing the type of information the user wants from either a text stream or a store of documents.
  • Information Extraction: the capability to locate specified information within a text.
  • Summarization: the capability to condense the size of a document or collection while retaining the key ideas in the material.

These three capabilities formed the basis for nearly all other information handling tasks.

TIPSTER Phase I

During the first phase of TIPSTER research (1991-1994), the participants made major advances in creating algorithms for document detection and information extraction, and in improving the techniques for measuring those advances through activities such as the Message Understanding Conferences (MUC) and the Text Retrieval Conferences (TREC). Document Detection technologies improved Recall from roughly 30% to as high as 75%, and the processing of natural language queries also improved significantly. Improvements in Information Extraction raised Recall from roughly 49% to 65% and Precision from 55% to 59%. Dramatic gains were also made in the ability to automatically identify a wide range of items such as names (both personal and organizational), dates, locations, times and phone numbers.
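
The Recall and Precision figures above can be read with the standard set-based definitions in mind: Recall is the fraction of the relevant items that a system found, and Precision is the fraction of the items it returned that are relevant. The following is a minimal sketch of those definitions only, not the scoring software used in MUC or TREC; the function name and the document identifiers are hypothetical and chosen for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for retrieved items against a gold set.

    Standard set-based definitions; an assumption for illustration,
    not the official MUC or TREC scoring procedure.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # items that are both returned and relevant
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical document IDs, for illustration only.
relevant = {"d1", "d2", "d3", "d4"}
retrieved = {"d1", "d2", "d5"}

p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")  # prints "precision=0.67 recall=0.50"
```

In this toy case the system returns three documents, two of which are relevant (Precision 2/3), and finds two of the four relevant documents (Recall 1/2).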

TIPSTER Phase II

During the second phase (April 1994-September 1996), the TIPSTER research and development community turned its attention to the creation of a software architecture in order to standardize the technology components, enable "plug and play" capabilities among the various tools being developed, and permit the sharing of software among the participants. The architecture continued to evolve, as funding permitted, based on feedback from the researchers, developers, and users of the existing prototype and implementation systems.

The Multilingual Entity Task (MET) developed Chinese and Japanese training collections with over 300 documents in each language. The task was initially confined to Named Entity extraction and the development of a variety of tools, such as a word boundary finder, part-of-speech-tagged Chinese lexicons, and dictionaries.

Various research projects and demonstration systems in support of Document Detection and Information Extraction were also completed.

TIPSTER Phase III

Phase III started in October 1996 and continued to build on the achievements of Phases I and II, with new projects in supporting research, development and evaluation areas. Summarization was also added as a fundamental task area. See the Phase III Overview.
