Phase III Overview

DARPA and other TIPSTER members sponsored 17 research and architecture development contracts with academic institutions and commercial companies in their effort to continue a balanced overall program consisting of four basic parts: Advanced Research, Metrics-based Evaluations, a Structured Software Architecture, and Demonstration and Implementation Projects. The Phase III kick-off workshop was held in October 1996.

Advanced Research

TIPSTER's Phase III aimed to continue innovative research in basic areas while adding several new topics of investigation: detection (improving search algorithms; merging results from different engines); extraction (tuning systems to new domains; raising accuracy); summarization (producing a single summary of multiple documents); multilingual capabilities (porting tools and techniques proven in one language to work in other languages); and cross-technology (sharing of information between detection and extraction tools).

Metrics-based Evaluations

The Message Understanding Conferences (MUC) were a forum for assessing and discussing progress in natural language processing; in 1995 they showcased high-performing systems with scores of about 90% on the named entity tagging task. The Text Retrieval Conference (TREC), initiated by TIPSTER to evaluate document detection performance, will continue under NIST leadership. Acquisitions boosted the TREC database to five gigabytes, an invaluable resource for the Information Retrieval (IR) community to test querying and ranking methods. The Summarization Evaluation Conference (SEC) recently completed an initial test of methods to evaluate summarization performance. Some results can be found here.

Software Architecture

Phase III proposed the TIPSTER Architecture Capabilities Platform (ACP). The ACP's goal was to provide a framework for research and development in both Document Detection and Information Extraction. This resource, whose development began in 1997, focused on giving the community the opportunity to test components in a TIPSTER Architecture-compliant environment and to perform experiments using TIPSTER components and modules. Inclusion of CORBA capabilities and the Z39.50 Information Retrieval Protocol supported reusable components and a more standards-based approach to the Architecture.

Demonstration and Implementation Projects

Initial prototyping efforts in Phase II led to several demonstrations of the capabilities of TIPSTER components against operational tasks in the Intelligence Community and elsewhere. The most successful of these early systems were prepared for migration to operational use. Phase III proposed a new round of prototyping and applications development, together with the expansion of the Software Architecture, to bring the results to the user as quickly as possible.

Phase III Work

New Research Areas

Several new research areas were added as Phase III tasks:

  • Text summarization - an enhancement in the information extraction area to develop methods and algorithms to produce, in reasonable English, a summary for each document of interest or a single summary of multiple documents in a collection of interest
  • Merging search results - develop the means to merge results from different search algorithms while maintaining a useful relevance ranking for the retrieved items or fuse the retrieved information with other items
  • Coreference resolution - develop algorithms to resolve text references to the same or different entities; improve the automatic extraction of relationships among entities mentioned in a document
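The result-merging task above can be illustrated with a small sketch. It is not taken from any TIPSTER system; it assumes a well-known fusion scheme (min-max score normalization followed by summing scores, often called CombSUM), and the engine names, document IDs, and scores are made up for illustration.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Rescale one engine's raw scores to [0, 1] so engines become comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as equally strong.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def comb_sum(result_lists):
    """Fuse several {doc_id: score} result sets by summing normalized scores."""
    fused = defaultdict(float)
    for scores in result_lists:
        for doc, s in min_max_normalize(scores).items():
            fused[doc] += s
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical outputs of two search engines with different score scales.
engine_a = {"d1": 12.0, "d2": 9.0, "d3": 3.0}
engine_b = {"d2": 0.9, "d3": 0.8, "d4": 0.1}

merged = comb_sum([engine_a, engine_b])
for doc, score in merged:
    print(f"{doc}\t{score:.3f}")
```

Documents returned by both engines (here `d2`) rise in the merged ranking, which is one simple way to keep a useful relevance ordering across engines whose raw scores are not directly comparable.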

Continuing Research Areas

A number of research areas from Phase I and II required further effort in Phase III:

  • Customization methods - develop more effective ways for a system administrator or end-user to port tools and techniques shown to work in one language or domain to other languages or domains
  • Multilingual capability enhancements in both detection and extraction
  • Further improvements in recall and precision
  • User-interface design and usability testing
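For readers unfamiliar with the two measures named above: precision is the fraction of retrieved items that are relevant, and recall is the fraction of relevant items that were retrieved. The sketch below uses the standard set-based definitions; the function name and the document IDs are illustrative assumptions, not part of any TIPSTER evaluation software.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents actually returned
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical run: four documents returned, three documents truly relevant.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d7"])
print(f"precision={p:.2f} recall={r:.2f}")
```

In this made-up run, two of the four returned documents are relevant and two of the three relevant documents were found, so the two measures differ; improving one often costs the other.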

Architecture and Capabilities Platform Development

The major focus of the Architecture component for Phase III was the development of a Common Object Request Broker Architecture (CORBA) compliant Architecture Capabilities Platform (ACP) to host TIPSTER-compliant software components and modules. The ACP was planned to provide a software platform for testing individual TIPSTER tools and capabilities. Under the plan, developers would demonstrate to the Government the modularity of their text handling systems by plugging components and modules into the ACP and interacting with the other TIPSTER components on the platform. In addition, the ACP was intended to demonstrate the capability to interact with systems based on Z39.50 standards. Supporting components for the ACP were to include document collections, standard detection needs, lexicons, a document manager, and a default Graphical User Interface (GUI). As a continuation from Phase II, the TIPSTER program aimed to foster a cooperative effort among the research entities and the ACP developers to provide enhancements to the TIPSTER Architecture design.
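The idea of plugging independent components into a shared platform can be sketched in a few lines. This is only a toy, in-process illustration of the plug-in pattern: it does not use CORBA or Z39.50, and the `Platform` class, the component names, and the document layout are all hypothetical rather than part of the TIPSTER Architecture specification.

```python
from typing import Any, Callable, Dict, List

# Hypothetical document shape: a plain dict that components read and extend.
Document = Dict[str, Any]
Component = Callable[[Document], Document]


class Platform:
    """Toy host that lets independently written components be registered and chained."""

    def __init__(self) -> None:
        self._components: Dict[str, Component] = {}

    def register(self, name: str, component: Component) -> None:
        if name in self._components:
            raise ValueError(f"component already registered: {name}")
        self._components[name] = component

    def run(self, names: List[str], doc: Document) -> Document:
        # Each component receives the output of the previous one.
        for name in names:
            doc = self._components[name](doc)
        return doc


def tokenizer(doc: Document) -> Document:
    """Split the text on whitespace and attach the tokens."""
    return {**doc, "tokens": doc["text"].split()}


def capitalized_tagger(doc: Document) -> Document:
    """Naive stand-in for an extraction module: tag capitalized tokens."""
    names = [t for t in doc["tokens"] if t[:1].isupper()]
    return {**doc, "annotations": names}


platform = Platform()
platform.register("tokenizer", tokenizer)
platform.register("tagger", capitalized_tagger)

result = platform.run(["tokenizer", "tagger"], {"text": "DARPA funded TIPSTER research"})
print(result["annotations"])
```

Because every component shares one calling convention, either module could be swapped for another implementation without changing the rest of the pipeline, which is the kind of modularity the paragraph above describes.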
