INQUERY Algorithm

A. Noun Phrase Extraction:

Text is parsed into noun phrases using Phrasex. The design is modular, so Phrasex can be replaced with another noun phrase extractor in future development.
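The modular design described above could be sketched as a small interface that any noun phrase extractor satisfies. Everything in this sketch is an assumption made for illustration: the `NounPhraseExtractor` interface, the `StopwordChunkExtractor` class, and the `phrases_for_query` helper are hypothetical names, and the toy stopword splitter is only a stand-in, not the actual Phrasex logic.

```python
from typing import List, Protocol


class NounPhraseExtractor(Protocol):
    """Hypothetical interface: anything that turns text into noun phrases."""

    def extract(self, text: str) -> List[str]:
        ...


class StopwordChunkExtractor:
    """Toy stand-in for Phrasex. It splits text at a few stopwords.

    This is NOT how Phrasex works; it only shows where a real
    extractor would plug in.
    """

    STOPWORDS = {"of", "in", "the", "a", "an", "and", "with", "for", "is", "are"}

    def extract(self, text: str) -> List[str]:
        phrases: List[str] = []
        current: List[str] = []
        cleaned = text.lower().replace(",", " ").replace(".", " ")
        for token in cleaned.split():
            if token in self.STOPWORDS:
                # A stopword closes the phrase collected so far.
                if current:
                    phrases.append(" ".join(current))
                    current = []
            else:
                current.append(token)
        if current:
            phrases.append(" ".join(current))
        return phrases


def phrases_for_query(text: str, extractor: NounPhraseExtractor) -> List[str]:
    """Downstream code depends only on the interface, not on one extractor."""
    return extractor.extract(text)
```

Swapping in a different extractor then only requires another class with the same `extract` method; the code that builds queries does not change.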

B. Layered Searching Strategy:

1. The noun phrases are used as query strings against the Inquery database (described in C below). The output of each query is a ranked list of MeSH headings.

2. The layered search method searches the first field; if no record is retrieved, it searches the next field, and so on.

3. The fields for searching are listed in sequence as follows:
    a. TITLE
    b. Synonyms
    c. UMLS Related Concepts
    d. UMLS Co-occurring Concepts
    e. PubMed Citations


4. The ranking scores returned by the Inquery database are used as part of the computation of the Mapping Score. (Currently we do not perform any semantic aggregation of the ranked MeSH headings, as we did in our earlier standalone experiment. We believe that such aggregation would improve performance.)
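The layered strategy in steps 1 to 3 can be sketched as a loop that stops at the first field returning any record. This is a minimal sketch under stated assumptions, not the actual implementation: the `search_field` callback stands in for a real Inquery query, the field codes are assumed to match the record layout in section C (for example, `SYN` for Synonyms), and the toy index and its scores are invented purely for illustration. The Mapping Score computation is not specified here, so it is not modeled.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Search order from step 3; the codes are assumed to match section C.
SEARCH_LAYERS = ["TITLE", "SYN", "REL", "COT", "PMCIT"]

Hit = Tuple[str, float]  # (MeSH heading, ranking score)
SearchFn = Callable[[str, str], List[Hit]]  # (field, phrase) -> hits


def layered_search(phrase: str, search_field: SearchFn) -> Tuple[Optional[str], List[Hit]]:
    """Query each field in order and stop at the first one that returns records.

    Returns the field that matched (or None) and its hits, best score first.
    """
    for field in SEARCH_LAYERS:
        hits = search_field(field, phrase)
        if hits:
            ranked = sorted(hits, key=lambda hit: hit[1], reverse=True)
            return field, ranked
    return None, []


# Hypothetical in-memory index with made-up scores, used only for the demo.
TOY_INDEX: Dict[Tuple[str, str], List[Hit]] = {
    ("SYN", "heart attack"): [("Myocardial Infarction", 0.52)],
    ("PMCIT", "heart attack"): [
        ("Myocardial Infarction", 0.31),
        ("Coronary Disease", 0.40),
    ],
}


def toy_search(field: str, phrase: str) -> List[Hit]:
    """Stand-in for a query against one field of the Inquery database."""
    return TOY_INDEX.get((field, phrase), [])


if __name__ == "__main__":
    print(layered_search("heart attack", toy_search))
```

In the toy index, "heart attack" has no TITLE match, so the search falls through to the synonyms field and never reaches the PubMed citations layer, even though that layer also contains matches.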

C. Inquery Database:

1. MeSH Main Headings serve as the titles of the records.

2. Each record includes the following fields:
    TITLE: MeSH Main Heading
    CUI: Concept Unique Identifier
    SYN: Entry terms from MeSH, plus synonyms from UMLS
    STY: UMLS Semantic Type
    MN: MeSH Tree Number
    REL: UMLS Related Concepts, including broader, narrower, and other related concepts.
    COT: UMLS Co-occurring Concepts (the top 50 terms are taken)
    PMCIT: The top 10 PubMed citations (title, abstract, and MeSH headings) retrieved by a query using the MeSH Heading as Major Topic.
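The record layout above might be modeled as a simple data structure. The sketch below is an illustration only: the class names, the Python types, and the choice to truncate already-ranked lists are assumptions, since the actual storage format of the Inquery database is not described here. The identifier values in the example are placeholders, not real UMLS or MeSH data.

```python
from dataclasses import dataclass, field
from typing import List

MAX_COT = 50    # "top 50 terms are taken"
MAX_PMCIT = 10  # "10 top PubMed citations"


@dataclass
class Citation:
    """One PubMed citation stored in the PMCIT field."""
    title: str
    abstract: str
    mesh_headings: List[str] = field(default_factory=list)


@dataclass
class MeshRecord:
    """One database record, keyed by its MeSH Main Heading."""
    title: str                                            # TITLE
    cui: str                                              # CUI
    syn: List[str] = field(default_factory=list)          # SYN
    sty: List[str] = field(default_factory=list)          # STY
    mn: List[str] = field(default_factory=list)           # MN
    rel: List[str] = field(default_factory=list)          # REL
    cot: List[str] = field(default_factory=list)          # COT
    pmcit: List[Citation] = field(default_factory=list)   # PMCIT

    def __post_init__(self) -> None:
        # Assumes the inputs are already ranked, so truncation keeps the top items.
        self.cot = self.cot[:MAX_COT]
        self.pmcit = self.pmcit[:MAX_PMCIT]


# Placeholder values only; not real UMLS or MeSH identifiers.
example = MeshRecord(
    title="Example Heading",
    cui="C0000000",
    syn=["example entry term"],
    cot=[f"co-occurring term {i}" for i in range(60)],
)
```

Building the record with 60 co-occurring terms leaves only the first 50 in `example.cot`, mirroring the limit stated for the COT field.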


Last Modified: March 16, 2004