Text is parsed into noun phrases
using Phrasex. The design is modular, so that we can replace Phrasex
with other noun phrase extractors in future development.
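The modular design described above can be sketched as a small extractor
interface. This is only an illustration under stated assumptions: the names
NounPhraseExtractor, StopwordChunker, and parse_text are hypothetical, and
the toy chunker is a stand-in that splits on stopwords and punctuation. It
is not Phrasex and does not reproduce its behavior.

```python
import re
from typing import List, Protocol


class NounPhraseExtractor(Protocol):
    """Hypothetical interface: any extractor exposing extract() can be plugged in."""

    def extract(self, text: str) -> List[str]: ...


class StopwordChunker:
    """Toy stand-in extractor (NOT Phrasex): breaks text at stopwords and punctuation."""

    STOPWORDS = {"a", "an", "the", "of", "in", "with", "and", "or", "for", "to"}

    def extract(self, text: str) -> List[str]:
        phrases: List[str] = []
        current: List[str] = []
        # Tokens are either word-like runs or single punctuation characters.
        for token in re.findall(r"[A-Za-z0-9-]+|[^\sA-Za-z0-9-]", text):
            is_break = token.lower() in self.STOPWORDS or not token[0].isalnum()
            if is_break:
                if current:
                    phrases.append(" ".join(current))
                    current = []
            else:
                current.append(token)
        if current:
            phrases.append(" ".join(current))
        return phrases


def parse_text(text: str, extractor: NounPhraseExtractor) -> List[str]:
    """The rest of the pipeline depends only on the interface, not on one extractor."""
    return extractor.extract(text)


phrases = parse_text(
    "Treatment of chronic heart failure with beta blockers.", StopwordChunker()
)
print(phrases)
```

Swapping in a different extractor then only requires another class with the
same extract() method; the calling code stays unchanged.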
B. Layered Searching Strategy:
1. The noun phrases are used as query strings against the Inquery database
(see description in C below). The output of a query is a ranked list of MeSH
headings.
2. The layered search method
searches the first field; if no record is retrieved, it searches the next
field, and so on.
3. The fields for searching are listed in sequence as follows:
c. UMLS Related Concepts
d. UMLS Co-occurring Concepts
e. PubMed Citations
4. The ranking scores returned by the Inquery database are used as part
of the computation of the Mapping Score.
(Currently we are not doing any semantic aggregation of the ranked MeSH
headings, as we did in our earlier standalone experiment. We believe
that such aggregation will improve performance.)
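The layered search in items 2 and 3 can be sketched as a simple fallback
loop. This is a minimal sketch, not the actual Inquery interface: the
function layered_search, the search_field callback, and the toy index with
its scores are all hypothetical and invented for illustration. The demo uses
only the three fields whose order is given in the list above, tagged with the
field names from the record layout in section C.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Hit = Tuple[str, float]  # (MeSH heading, ranking score)
SearchFn = Callable[[str, str], List[Hit]]  # (field, phrase) -> hits


def layered_search(
    phrase: str, fields: Sequence[str], search_field: SearchFn
) -> Tuple[Optional[str], List[Hit]]:
    """Query each field in order and stop at the first one that returns records."""
    for field in fields:
        hits = search_field(field, phrase)
        if hits:
            # Return the hits ranked by score, highest first.
            return field, sorted(hits, key=lambda h: h[1], reverse=True)
    return None, []


# Toy in-memory index standing in for the database; all values are made up.
toy_index: Dict[str, Dict[str, List[Hit]]] = {
    "REL": {},
    "COT": {"heart failure": [("Cardiomyopathies", 0.41), ("Heart Failure", 0.62)]},
    "PMCIT": {"heart failure": [("Heart Failure", 0.90)]},
}


def toy_search(field: str, phrase: str) -> List[Hit]:
    return toy_index.get(field, {}).get(phrase, [])


matched_field, ranked = layered_search(
    "heart failure", ["REL", "COT", "PMCIT"], toy_search
)
print(matched_field, ranked)
```

In this toy run the REL layer returns nothing, so the search falls through to
COT and stops there; PMCIT is never queried.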
C. Inquery Database:
1. MeSH Main Headings serve as the titles of the records.
2. Each record includes the following fields:
TITLE: MeSH Main Heading
CUI: Concept Unique Identifier
SYN: Entry terms from MeSH, plus synonyms from UMLS
STY: UMLS Semantic Type
MN: MeSH Tree Number
REL: UMLS Related Concepts, including broader, narrower,
and other related concepts.
COT: UMLS Co-occurring Concepts; the top 50 terms are taken.
PMCIT: Top 10 PubMed citations with title, abstract,
and MeSH headings, retrieved by a query using the MeSH Heading as Major Topic.
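The record layout above could be modeled roughly as follows. This is a
sketch under assumptions: the class name MeshRecord, the Python types chosen
for each field, and the truncation in __post_init__ are ours; only the field
tags and the limits of 50 co-occurring terms and 10 citations come from the
description above. The example values are placeholders, not real UMLS data.

```python
from dataclasses import dataclass, field
from typing import List

MAX_COT = 50    # top co-occurring terms kept per record
MAX_PMCIT = 10  # top PubMed citations kept per record


@dataclass
class MeshRecord:
    title: str                                      # TITLE: MeSH Main Heading
    cui: str                                        # CUI: Concept Unique Identifier
    syn: List[str] = field(default_factory=list)    # SYN: entry terms and UMLS synonyms
    sty: List[str] = field(default_factory=list)    # STY: UMLS Semantic Type(s)
    mn: List[str] = field(default_factory=list)     # MN: MeSH Tree Number(s)
    rel: List[str] = field(default_factory=list)    # REL: broader, narrower, other related
    cot: List[str] = field(default_factory=list)    # COT: co-occurring concepts
    pmcit: List[str] = field(default_factory=list)  # PMCIT: citation text

    def __post_init__(self) -> None:
        # Assumes the lists arrive already ranked, so slicing keeps the top items.
        self.cot = self.cot[:MAX_COT]
        self.pmcit = self.pmcit[:MAX_PMCIT]


# Placeholder values only.
record = MeshRecord(
    title="Example Heading",
    cui="C0000000",
    cot=[f"term {i}" for i in range(60)],
    pmcit=[f"citation {i}" for i in range(12)],
)
print(len(record.cot), len(record.pmcit))
```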