MUC Evaluations and dataset

Since early 1990, the MUC evaluations have funded the development of metrics and statistical algorithms to support government evaluations of emerging information extraction technologies. In the mid-1990s, the MUC evaluations began to provide prepared data and task definitions, along with fully automated scoring software for measuring machine and human performance. The tasks grew from producing a single database of events found in newswire articles from one source to producing multiple databases of increasingly complex information extracted from multiple news sources in multiple languages. The databases now include named entities, multilingual named entities, attributes of those entities, facts about relationships between entities, and events in which the entities participated.

The results of these evaluations were reported at conferences during the 1990s, where developers and evaluators shared their findings and government specialists described their needs. These conferences were called "Message Understanding Conferences (MUC)" as a result of the use of such technology to process military messages. The multilingual portion was known as the "Multilingual Entity Task (MET)". The proceedings of these conferences have all been published, and the last of them appears on this website. All previous proceedings were published in bound form by Morgan Kaufmann Publishers.

MUC Data Sets

For each evaluation, ground truth had to be established to determine the reliability of the participating systems. Datasets were typically prepared by human annotators for training, dry run test, and formal run test usage. These datasets are now being made available wherever possible on this website.

The texts used for MUC 6 and MUC 7 are copyrighted materials and are available only through the Linguistic Data Consortium (LDC) for a small fee. The texts are available as newswire articles for MUC 6 (MUC-VI Text Collection) and newswire articles for MUC 7 (North American News Text Corpora).

Contact the LDC to license the texts, and request the public domain prepared datasets used in MUC along with the MUC scoring software. The MUC 3 and MUC 4 Data Sets are provided completely free of charge courtesy of FBIS (Foreign Broadcast Information Service). The MET 2 Data Sets are provided completely free of charge courtesy of the US Government. They are available here in compressed tar format.

MUC 3 and MUC 4 Data Sets

MET 2 Data Sets

Note: If your browser displays the data rather than a download dialog box, download the file and save it before uncompressing and untarring it.
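Once an archive has been saved locally, the unpacking step in the note above can be sketched in Python. This is only an illustration under stated assumptions: the helper name `extract_archive` is hypothetical, and it assumes the archive uses a compression that Python's standard `tarfile` module can detect (gzip, bzip2, or xz). Archives produced with the older Unix `compress` tool (`.Z`) are not supported by `tarfile` and would need to be uncompressed first.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a tar archive (optionally gzip/bzip2/xz compressed) into dest_dir.

    Hypothetical helper for illustration; returns the sorted member names.
    """
    dest = Path(dest_dir).resolve()
    dest.mkdir(parents=True, exist_ok=True)
    # "r:*" lets tarfile detect the compression transparently.
    with tarfile.open(archive_path, mode="r:*") as archive:
        members = archive.getmembers()
        for member in members:
            # Refuse entries that would be written outside dest_dir.
            target = (dest / member.name).resolve()
            if target != dest and dest not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        archive.extractall(dest, members=members)
    return sorted(m.name for m in members)
```

For example, `extract_archive("muc34.tar.gz", "muc34")` would unpack a saved archive into a `muc34` directory; the file name here is a placeholder, not the actual name of the distributed file.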
