Content(2)

4.4.2  Genomics

(4/17/05)  The LocusLink database of NLM has been superseded by Entrez Gene.  While the latter maintains all of the records of the former (including Gene References into Function, or GeneRIFs), it covers a much larger number of organisms and is integrated within the Entrez searching system.  MEDLINE records that contain information about a gene in Entrez Gene now allow linkage to it through the "Link Out" function.

GenBank has surpassed 40 million sequences and 44 billion base pairs.  For up-to-date statistics, see:
http://www.ncbi.nlm.nih.gov/Genbank/genbankstats.html

Another new genomics resource from NLM is the Genetics Home Reference (Mitchell et al., 2004).  The system draws on publicly available resources, most of which are written for professionals, but presents them with additional material to provide a view more understandable to the lay public.

Mitchell, J., Fun, J., et al. (2004). Design of Genetics Home Reference:  a new NLM consumer health resource. Journal of the American Medical Informatics Association, 11: 439-447.

(4/18/04)  A resource of growing importance in genomics is the model organism database, where all information (e.g., gene nomenclature, nucleotide and protein sequences, literature references, and other data) is brought together into a unified resource.  The major model organism databases were described by Bahls et al. (2003).  An accompanying article described the challenges of building and maintaining such databases (Perkel, 2003).  The five most-developed model organism databases include:
Naturally, the development of all these model organism databases has led to the development of a tool to facilitate their construction, the Generic Model Organism Database Construction Kit (Stein et al., 2002).

These databases and other resources are making a growing effort to annotate the function of genes and proteins in biology.  Most of the annotation is done using the Gene Ontology, which is described in the next chapter.  One resource that attempts to bring together the names, annotations, and linkages to data sets for genome-scale analysis is SOURCE (http://source.stanford.edu), developed at Stanford University (Diehn et al., 2003).  Another attempt, focused on the human genome, is the GDB Human Genome Database (http://www.gdb.org/).

Another aggregation in the molecular biology domain is the Transparent Access to Multiple Bioinformatics Information Sources (TAMBIS) system (Goble et al., 2001), which uses a domain ontology to model the underlying information.

Bahls, C., Weitzman, J., et al. (2003). Biology′s models. The Scientist. June 2, 2003. 5. http://www.the-scientist.com/yr2003/jun/feature_030602.html.
Diehn, M., Sherlock, G., et al. (2003). SOURCE:  a unified genomic resource of functional annotations, ontologies, and gene expression data. Nucleic Acids Research, 31: 219-223.
Goble, C., Stevens, R., et al. (2001). Transparent access to multiple bioinformatics information sources. IBM Systems Journal, 40: 532-552. http://www.cs.man.ac.uk/~stevensr/papers/goble01.pdf.
Perkel, J. (2003). Feeding the info junkies. The Scientist. June 2, 2003. 39. http://www.the-scientist.com/yr2003/jun/feature14_030602.html.
Stein, L., Mungall, C., et al. (2002). The generic genome browser:  a building block for a model organism system database. Genome Research, 12: 1599-1610.

4.4.3  Citations

(4/19/03)  As mentioned above, ResearchIndex (http://citeseer.nj.nec.com/cs) is a database of computer science-oriented (including medical informatics) scientific literature.  A key feature of this database is its citation linkages from each paper to the papers that it cites as well as to those that cite it.
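
The two directions of citation linkage can be pictured as a pair of lookup tables.  What follows is a minimal sketch in Python, not a description of how ResearchIndex itself is built; the class name CitationIndex, its methods, and the paper identifiers are all hypothetical.

```python
from collections import defaultdict


class CitationIndex:
    """Toy bidirectional citation index (hypothetical, for illustration)."""

    def __init__(self):
        self._cites = defaultdict(set)      # paper -> papers it cites
        self._cited_by = defaultdict(set)   # paper -> papers that cite it

    def add_citation(self, citing, cited):
        """Record that `citing` lists `cited` in its references."""
        self._cites[citing].add(cited)
        self._cited_by[cited].add(citing)

    def cites(self, paper):
        """Papers that `paper` cites, in sorted order."""
        return sorted(self._cites.get(paper, ()))

    def cited_by(self, paper):
        """Papers that cite `paper`, in sorted order."""
        return sorted(self._cited_by.get(paper, ()))


# Made-up identifiers, not real records.
index = CitationIndex()
index.add_citation("paper-B", "paper-A")
index.add_citation("paper-C", "paper-A")
index.add_citation("paper-C", "paper-B")

print(index.cites("paper-C"))     # ['paper-A', 'paper-B']
print(index.cited_by("paper-A"))  # ['paper-B', 'paper-C']
```

Storing both directions at insertion time makes the "cited by" lookup as cheap as the "cites" lookup, at the cost of keeping each link twice.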

4.4.4  EBM databases

(4/17/05)  The proliferation of EBM databases continues, with a variety of formats used.  This section provides an update on existing resources as well as description of some new ones.

While some EBM purists argue that Up to Date (http://www.uptodate.com/) is not completely evidence-based, e.g., not all statements are tagged with levels of evidence or support from studies of the highest quality evidence, the resource is comprehensive and very popular among clinicians as well as those in training.  Up to Date has about 4,500 topic reviews in adult and pediatric medicine which are updated continually.  Each topic has an outline that allows easy navigation.  One of those outline headings is "Recommendations," which quickly gives the specific clinical recommendations for diagnosis and/or treatment of the problem.  Topics are linked to both the MEDLINE references of articles cited as well as a drug compendium for specific prescribing information.  Up to Date also provides a "What′s New" area for each clinical topic, describing the latest clinical news in a given field.  The system has also been enhanced with links to the Lexi-Comp drug reference, PubMed MEDLINE references, and patient education information.

(Historical note:  I programmed the first version of Up to Date!  This was before it was called Up to Date and when it was considerably less developed than it is now.  The founder of Up to Date, Dr. Burton Rose of Brigham & Women′s Hospital, sought an informatics fellow to help develop his idea of a resource that would provide simple yet authoritative information to physicians.  Due to my interest in IR, I took on the project, developing the first version in Apple Hypercard.  Of course, I finished my fellowship and moved on, and another fellow, Dr. Joseph Rush, took on the project and now is the senior programmer for Up to Date.  Drs. Rose and Rush have built a substantial enterprise from those humble beginnings!)

Another resource growing in size and comprehensiveness is PIER: The Physicians′ Information and Education Resource (http://pier.acponline.org/) from the American College of Physicians (ACP, http://www.acponline.org/), the specialty society for internal medicine.  PIER is designed to be the comprehensive information resource for practitioners of adult primary care medicine.

At this time, PIER is only available to members of the ACP.  PIER is organized into modules that are categorized under six topic types:
  • Diseases
  • Screening and Prevention
  • Complementary/Alternative Medicine
  • Ethical and Legal Issues
  • Procedures
  • Drug Resource
As of now, the largest category of modules is Diseases, with over 500 developed.  The content for each disease is organized under the following headings:
  • Prevention
  • Screening
  • Diagnosis
  • Consultation for Diagnosis - When to consider obtaining subspecialty consultation for the diagnosis
  • Hospitalization - Important issues to address in the patient hospitalized with this disease
  • Non-drug Therapy
  • Drug Therapy
  • Patient Education - Pertinent issues to educate the patient about with this disease
  • Consultation for Management - When to consider obtaining subspecialty consultation for management
  • Follow-up
Modules also include references, patient information, additional references, and a PDF file of the entire module for printing.  A handheld version is also available (http://pier.acponline.org/pierpdajump.html) and the underlying system is constructed in a modular way to allow access via other applications, such as electronic health records.

Every single guidance statement and recommendation in PIER is given a strength of recommendation rating to help the clinician assess their usefulness.  These evidence ratings come from the procedure used in another ACP publication, ACP Journal Club (http://www.acpjc.org/shared/purpose_and_procedure.htm#criteria).  The strength of recommendation is rated from A-C based on the following criteria:
  A. The preponderance of data supporting this statement is derived from level 1 studies, which meet all of the evidence criteria for that study type
  B. The preponderance of data supporting this statement is derived from level 2 studies, which meet at least one of the evidence criteria for that study type
  C. The preponderance of data supporting this statement is derived from level 3 studies, which meet none of the evidence criteria for that study type or are derived from expert opinion, commentary or consensus
The evidence criteria vary for the study type (e.g., randomized controlled trials for therapeutic or preventive interventions).  References drawn from the medical literature are also given a level of evidence rating:
  1. Studies that meet all of the evidence criteria for that study type
  2. Studies that meet at least one of the evidence criteria for that study type
  3. Studies that meet none of the evidence criteria for that study type or are derived from expert opinion, commentary or consensus
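
The two rating scales above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions, not PIER′s actual procedure:  the function names are hypothetical, "preponderance" is read here as the most frequent level among the supporting studies, and ties are broken toward the weaker rating.  Neither of those last two choices comes from the ACP criteria.

```python
from collections import Counter


def evidence_level(criteria_met, criteria_total):
    """Level of evidence for one study: 1 = all criteria met,
    2 = at least one met, 3 = none met (also used for expert opinion)."""
    if criteria_total <= 0 or not 0 <= criteria_met <= criteria_total:
        raise ValueError("invalid criteria counts")
    if criteria_met == criteria_total:
        return 1
    if criteria_met >= 1:
        return 2
    return 3


def strength_of_recommendation(levels):
    """Map the supporting studies' levels to a rating of A, B, or C.

    Assumption: the 'preponderance' is the most common level, and a tie
    goes to the weaker (numerically larger) level.
    """
    if not levels:
        raise ValueError("at least one supporting study is required")
    counts = Counter(levels)
    top = max(counts.values())
    dominant = max(level for level, n in counts.items() if n == top)
    return {1: "A", 2: "B", 3: "C"}[dominant]


# Three hypothetical studies judged against four criteria each.
levels = [evidence_level(4, 4), evidence_level(4, 4), evidence_level(2, 4)]
print(levels)                              # [1, 1, 2]
print(strength_of_recommendation(levels))  # A
```
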
Another widely distributed and comprehensive resource is Clinical Evidence (http://www.clinicalevidence.com/).  Billed as an "evidence formulary," Clinical Evidence classifies each intervention for a given medical condition into the following categories:
  • Beneficial - Interventions for which effectiveness has been demonstrated by clear evidence from RCTs, and for which expectation of harms is small compared with the benefits
  • Likely to be beneficial - Interventions for which effectiveness is less well established than for those listed under “beneficial.”
  • Trade off between benefits and harms - Interventions for which clinicians and patients should weigh up the beneficial and harmful effects according to individual circumstances and priorities.
  • Unknown effectiveness - Interventions for which there are currently insufficient data or data of inadequate quality.
  • Unlikely to be beneficial - Interventions for which lack of effectiveness is less well established than for those listed under “likely to be ineffective or harmful.”
  • Likely to be ineffective or harmful - Interventions for which ineffectiveness or harmfulness has been demonstrated by clear evidence.
An additional comprehensive collection of EBM content consists of POEMs ("patient-oriented evidence that matters"), which are short evidence-based synopses whose topics are selected based on the following criteria:
  • They address a question faced by physicians.
  • They measure outcomes that physicians and their patients care about: symptoms, morbidity, quality of life, and mortality.
  • They have the potential to change the way medicine is practiced.
The main component of InfoPOEMS (http://www.infopoems.com/) is InfoRetriever, a resource that includes a variety of evidence-based content and tools, including:
  • All POEMs
  • Cochrane Systematic Review abstracts
  • More than 120 decision support tools
  • More than 1,800 diagnostic calculators supporting selection and interpretation of diagnostic tests and the H&P
  • About 400 summaries of practice guidelines
  • Five-Minute Clinical Consult
A less comprehensive EBM resource is Evidence-Based On Call (http://www.eboncall.co.uk/), which provides evidence-based summaries of 38 "on-call" medical conditions.

Some EBM collections take newer approaches, and are in the development stage and thus less comprehensive.  Designed for clinicians at the University of Washington, PrimeAnswers (http://www.primeanswers.org/) aims to provide the "best evidence at the point of care."  The system includes easy access to the other EBM resources, some of which are commercial and thus password-protected.  It also features its own new content, consisting of evidence-based summaries of about 20 common clinical conditions, with linkage to the appropriate evidence.  The Family Practice Inquiries Network (http://www.fpin.org/) is a project led by leading Departments of Family Medicine in the United States.  The goal of FPIN is to develop a resource that answers 80% of primary care clinical questions in 60 seconds.  This will be done by collecting the most common clinical questions and providing specific answers to them.  In some ways this is analogous to the "Frequently Asked Questions" (FAQs) seen on many Web sites.

(4/19/04)  The Best Evidence product described in the book is no longer available, although the component publications that made it up, ACP Journal Club and Evidence-Based Medicine, are still available.

4.4.5  Other databases

(4/17/05)  With the growing concern about bioterrorism and hazardous material incidents more generally, the NLM has created the Wireless System for Emergency Responders (WISER, wiser.nlm.nih.gov).  This system is available for both handheld devices (Palm and Pocket PC) and PCs (Windows and Web versions).  Data for the system comes from a variety of sources, such as the NLM Hazardous Substances Data Bank (HSDB), the Department of Transportation Emergency Response Guidebook, and the POISINDEX system from Micromedex.  The output from searching is presented in different order depending on whether the user is a first responder, hazardous materials (HAZMAT) specialist, or emergency medical system (EMS) specialist.
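
Role-dependent ordering of the same record can be illustrated with a short Python sketch.  Everything in it is an assumption made for illustration:  the section names, the per-role orderings, and the function order_sections are hypothetical and do not describe WISER′s actual layout or data model.

```python
# Hypothetical section names and orderings, for illustration only.
ROLE_SECTION_ORDER = {
    "first_responder": ["identification", "protective_actions",
                        "health_effects", "properties"],
    "hazmat_specialist": ["identification", "properties",
                          "protective_actions", "health_effects"],
    "ems_specialist": ["identification", "health_effects",
                       "protective_actions", "properties"],
}


def order_sections(record, role):
    """Return (section, text) pairs of `record` in the order for `role`.

    Sections missing from the record are skipped.
    """
    try:
        order = ROLE_SECTION_ORDER[role]
    except KeyError:
        raise ValueError(f"unknown role: {role!r}") from None
    return [(name, record[name]) for name in order if name in record]


# Placeholder record; the text values are stand-ins, not real substance data.
record = {
    "identification": "placeholder identification text",
    "properties": "placeholder physical properties",
    "health_effects": "placeholder health effects",
    "protective_actions": "placeholder protective actions",
}

for name, _text in order_sections(record, "ems_specialist"):
    print(name)
```

The record itself is stored once; only the presentation order changes with the role.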

(4/17/05)  A variety of interesting new databases have appeared that do not fall under the rubric of textual databases but are integrated with them:
  • Google Maps - This is not the first map application, but it provides Google′s typical ease of use and links the drawn maps to satellite images (maps.google.com).
  • PubChem - The growing amount of chemical information, particularly that which is relevant to biological activity, has led the NLM to create the PubChem database (pubchem.ncbi.nlm.nih.gov).  This resource shows chemical structures, related substances, biological activity, and linkages to the biomedical literature.
  • Search capabilities over the documents, emails, viewed Web pages, and so forth on one′s own machine.  Both the Windows and Macintosh operating systems now allow searching over information in files on their disks.  In addition, Web search engine vendors such as Google offer "desktop searching" tools.  The forerunner of the Microsoft desktop searching application was described by Dumais et al. (2003).

Dumais, S., Cutrell, E., et al. (2003). Stuff I′ve seen:  a system for personal information retrieval and re-use. Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Toronto, Canada. ACM Press. 72-79. http://research.microsoft.com/~sdumais/SISCore-SIGIR2003-Final.pdf.

(4/19/03)  The NIH CRISP database was the subject of a recent newspaper story in which scientists expressed worry that some of the topics of their research could come under increased scrutiny due to their politically controversial nature (Goode, 2003).

Goode, E. (2003). Certain Words Can Trip Up AIDS Grants, Scientists Say. New York Times. http://www.nytimes.com/2003/04/18/national/18GRAN.html.

(4/19/03)  An interesting resource outside the domain of health and biomedicine is the Software Engineering Body of Knowledge (SWEBOK, http://www.swebok.org/).  The goal of this resource is to map all of the knowledge of the field of software engineering (Bourque et al., 1999).  The paper by Bourque et al. summarized the challenges in creating such a resource.  For example, where does one draw the line between the discipline of software engineering and related ones, such as computer science, cognitive science, management science, and systems engineering?  Likewise, what should be the depth of the material presented?  The project chose to adopt the approach of including "generally accepted" knowledge, which applies to most situations most of the time and has widespread consensus about its value and effectiveness.  This type of knowledge was distinguished from "advanced and research" knowledge, which was not yet mature, and "specialized" knowledge, which was not yet generally applicable.

The knowledge in SWEBOK is organized under a hierarchical breakdown of topics.  Each topic includes:
  • Pointers to reference materials
  • Description
  • Classification by Vincenti′s (1990) taxonomy of engineering design knowledge:  fundamental design concepts, criteria and specifications, theoretical tools, quantitative data, practical considerations, and approaches to problem solving.
  • Ratings by Bloom′s (1984) taxonomy of pedagogical categories:  knowledge, comprehension, application, analysis, synthesis, and evaluation.
  • References to related disciplines
Bloom, B. and Krathwohl, D. (1984). Taxonomy of Educational Objectives. New York. Addison-Wesley.
Bourque, P., Dupuis, R., et al. (1999). The guide to the Software Engineering Body of Knowledge. IEEE Software, 16(6): 35-44. http://www.lrgl.uqam.ca/publications/pdf/463.pdf.
Vincenti, W. (1990). What Engineers Know and How They Know It:  Analytical Studies from Aeronautical History. Baltimore. The Johns Hopkins University Press.

(5/6/03)  The SWEBOK project has motivated a similar effort in medical informatics called the Health Informatics Body of Knowledge (HIBOK, http://www.ehrweb.org/ehrweb/implementation/pages/hibok.htm).  At this point, however, HIBOK is more of a concept than a reality.

4.5  Aggregations

(4/19/03)  A very comprehensive collection of content that has been made available to all clinicians in the United Kingdom is the National Electronic Library for Health (http://www.nelh.nhs.uk/) from the British National Health Service.  A variety of free and commercial resources are available, with the latter available only to those with a password.  The commercial resources include Clinical Evidence, the full text of over 800 journals, the Cochrane Library, and a variety of bibliographic databases.

(4/20/03)  All of the Web sites of the National Cancer Institute (http://www.nci.nih.gov/) are now organized under a single URL (http://cancer.gov/), which includes CancerNet, PDQ, and more.

(4/18/04)  Some MEDLINEplus content oriented to the elderly has been repackaged into the NIH Senior Health Web site (http://nihseniorhealth.gov/).  Some innovative additional features of this site for elderly people with poor vision and/or low reading ability include the capability to enlarge the font size of the text, increase the contrast by using a black background with white or yellow text, and have the content delivered in spoken format.

(4/18/04)  The distinction between aggregations and other resources continues to blur.  For example, all of McGraw-Hill′s textbooks, including the venerable Harrison′s, are now available in a single product called Access Medicine (http://www.accessmedicine.com/).  There are increasing linkages across textbooks as well as links to updates, continuing medical education (CME) self-assessments, and other Web resources.

The market for aggregations of clinical content continues to grow.  A number of commercial products (beyond those mentioned above or in the book) have emerged either de novo or from the aggregation of previous standalone systems:
  • Clineguide (Wolters Kluwer, http://www.clineguide.com) combines a former summary of diseases and treatments with drug information from Medi-Span, full-text resources from the SKOLAR system developed at Stanford, other full-text titles from Lippincott Williams & Wilkins, and the database access system Ovid into a single product.
  • First Consult (Elsevier, http://www.firstconsult.com) integrates the PDxMD summary of diseases and treatments with MDConsult and other resources of the mega-publisher Elsevier.
  • MICROMEDEX Healthcare Series (Thomson Micromedex, http://www.micromedex.com) integrates a number of former standalone databases into a comprehensive clinical information resource.
Not surprisingly, most of the above clinical references are available in formats for Personal Digital Assistants (PDAs).

Of course, a big challenge that remains with all of these wonderful resources is that they are only aggregated among themselves and not with other resources, perhaps with the exception of linkages to MEDLINE references in PubMed.  As a result, one cannot "mix and match" one′s favorite clinical resources into a unified digital library.  Chapter 10 discusses digital libraries and what might be done to make this possible from a technical standpoint.  Not surprisingly, the real barriers are economic, i.e., publishers do not want to link a user to the resources of a competitor.
