MTI Post Processing

Indexing Initiative
Once filtering is complete, post-processing is performed regardless of the filtering level used. Post-processing has three parts. First, it cleans up the final recommendation list by removing any terms that survived the filtering process but are invalid for the target audience. Second, it fills out the list by adding Check Tags (CTs), Geographicals, and other MeSH Headings (MHs) based on the text, a machine learning algorithm, and lookup lists. Finally, it attaches subheadings (SHs) to the individual MHs and creates a global list of subheadings applicable to the text.

The first post-processing step identifies the end user so that the correct exclusion list can be used to remove terms from the recommendation list. MTI uses three distinct exclusion lists to provide tailored results for Indexing, Cataloging, and HMD. For example, the MH "Academic Dissertations" is not used by Indexing or Cataloging, but is needed for HMD. The Indexing exclusion list is the default used by MTI. It contains MHs that are too general to be recommended or whose MeSH record contains "not used for indexing" in the Annotation field (e.g., the general MH "Eye Manifestations" with treecode C11.300 in 2013 MeSH).
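The exclusion step can be pictured as a simple set-based filter. The following is a minimal sketch, not MTI's actual implementation: the function name `apply_exclusion_list`, the audience keys, and the tiny list contents are assumptions made for illustration, with only the two MHs named above taken from the text.

```python
# Hypothetical, heavily abbreviated exclusion lists keyed by audience.
# The real MTI lists are far larger; only these two MHs come from the text.
EXCLUSION_LISTS = {
    "indexing": {"Academic Dissertations", "Eye Manifestations"},
    "cataloging": {"Academic Dissertations"},
    "hmd": set(),
}


def apply_exclusion_list(recommendations, audience="indexing"):
    """Return the recommendations with excluded MHs removed.

    'indexing' is the default audience, mirroring the default list in MTI.
    The input order of the surviving terms is preserved.
    """
    excluded = EXCLUSION_LISTS[audience]
    return [mh for mh in recommendations if mh not in excluded]


if __name__ == "__main__":
    recs = ["Academic Dissertations", "Eye Manifestations", "Neonatal Screening"]
    print(apply_exclusion_list(recs))           # default Indexing list
    print(apply_exclusion_list(recs, "hmd"))    # HMD keeps "Academic Dissertations"
```

Keeping one set per audience makes the choice of end user a single lookup, so the same recommendation list can be tailored differently without touching the filtering logic.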

The tailored recommendation list and the text are then reviewed: CTs, Geographical MHs, and other MHs and SHs are added and marked so that they can be displayed as final recommendations. For example, if the MH "Neonatal Screening" is being recommended, MTI automatically adds the CTs "Humans" and "Infant, Newborn" if they are not already in the list. If the text contains the word "Nairobi", the Geographical MH "Kenya" is added to the list if it is not already there. A secondary check is done for "Nairobi" to make sure the text is actually about the country Kenya, since the text may instead be referring to "Nairobi Sheep". MTI has a small set of cases like this that require a secondary check before the MH is actually added to the final recommendation list.
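The two examples above can be sketched with small lookup tables. This is an illustrative sketch only: the names `CT_RULES`, `GEO_RULES`, and `add_terms` are hypothetical, and the secondary check is assumed here to be a simple test that "Nairobi" is not immediately followed by "sheep", which may differ from what MTI actually does.

```python
import re

# Hypothetical lookup: a recommended MH that implies additional Check Tags.
CT_RULES = {
    "Neonatal Screening": ["Humans", "Infant, Newborn"],
}

# Hypothetical geographic triggers. The negative lookahead stands in for the
# secondary check: "Nairobi" followed by "sheep" does not count as a place.
GEO_RULES = [
    (re.compile(r"\bnairobi\b(?!\s+sheep)", re.IGNORECASE), "Kenya"),
]


def add_terms(recommendations, text):
    """Return (final_list, added_terms) after applying CT and geographic rules.

    Terms already in the list are never duplicated; 'added_terms' records
    which terms this step contributed so they can be marked for display.
    """
    final = list(recommendations)
    added = []

    def add(term):
        if term not in final:
            final.append(term)
            added.append(term)

    for mh in recommendations:
        for ct in CT_RULES.get(mh, []):
            add(ct)

    for pattern, geographical in GEO_RULES:
        if pattern.search(text):
            add(geographical)

    return final, added


if __name__ == "__main__":
    print(add_terms(["Neonatal Screening"], "A screening program in Nairobi hospitals."))
    print(add_terms(["Humans"], "Isolation of Nairobi sheep disease virus."))
```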

One final class of additions is a "forced list" of triggers whose presence within the text causes one or more MHs to be recommended. The "forced list" comes mainly from Indexer Feedback of the form "if you see xyz, you should always recommend 'abc'". For example, if "hiv patient" is in the text being processed, MTI will always recommend the MH "Acquired Immunodeficiency Syndrome". MTI performs a case-insensitive search of the text for the "forced list" triggers, adds the MH(s) if not already present, and sets the "forced" flag that tells MTI to always display the term.
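A minimal sketch of the forced-list step follows. The data layout is an assumption: recommendations are modeled as a dict mapping each MH to a dict of flags, and `FORCED_LIST` and `apply_forced_list` are hypothetical names. Only the "hiv patient" trigger comes from the text, and the case-insensitive match is approximated with a lowercase substring search.

```python
# Hypothetical forced list: lowercase trigger phrase -> MHs to always recommend.
FORCED_LIST = {
    "hiv patient": ["Acquired Immunodeficiency Syndrome"],
}


def apply_forced_list(recommendations, text):
    """Return a copy of the recommendations with forced MHs added and flagged.

    'recommendations' maps each MH to a dict of flags. A triggered MH is
    added if it is missing, and its "forced" flag is set either way.
    """
    lowered = text.lower()  # case-insensitive search via lowercasing
    result = {mh: dict(flags) for mh, flags in recommendations.items()}

    for trigger, mhs in FORCED_LIST.items():
        if trigger in lowered:
            for mh in mhs:
                result.setdefault(mh, {})["forced"] = True

    return result


if __name__ == "__main__":
    print(apply_forced_list({}, "Outcomes for the HIV patient population."))
```

Setting the flag even when the MH is already present matters, because the flag (rather than mere membership in the list) is what tells the display step to always show the term.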
Last Modified: May 30, 2019