Extract MeSH Descriptors process

Indexing Initiative

The PubMed Related Citations algorithm returns a scored and ranked list of the 20 most relevant citations. The list is in HTML format, similar to the following sample:

    <HTML> <HEAD> <TITLE>neighbouring</TITLE> </HEAD>
    <BODY bgcolor="ffffff"> <br> <br><pre><b>The neighbours are:</b></pre><hr>
    (1)&nbsp;&nbsp; 97.027&nbsp;&nbsp;Rosenbaum JL, Almli CR, Yundt KD, Altman DI, Powers WJ<br>
    <A href="http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=9339686">9339686</A>
    &nbsp;&nbsp;Higher neonatal cerebral blood flow correlates with worse childhood neurologic outcome.<hr>
    (2)&nbsp;&nbsp; 33.9859&nbsp;&nbsp;Altman DI, Powers WJ, Perlman JM, Herscovitch P, Volpe SL, Volpe JJ<br>
    <A href="http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=3263081">3263081</A>
    &nbsp;&nbsp;Cerebral blood flow requirement for brain viability in newborn infants is lower than in adults.<hr>
    ...
    (20)&nbsp;&nbsp; 24.6715&nbsp;&nbsp;Bednarczyk EM, Rutherford WF, Leisure GP, Munger MA, Panacek EA, Miraldi FD, Green JA<br>
    <A href="http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=2343589">2343589</A>
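
The score and UID of each entry in output like this sample can be pulled out with a regular expression. The following is a minimal sketch, not the Indexing Initiative's own code: the function name `parse_neighbours` and the pattern are illustrative, and the pattern assumes every entry follows the "(rank) score authors ... list_uids=UID" layout shown in the sample.

```python
import re

# Assumed entry layout (taken from the sample, not from a specification):
#   (rank)  score  authors<br> <A href="...list_uids=UID">UID</A>
ENTRY_RE = re.compile(
    r"\((\d+)\)\s+(\d+(?:\.\d+)?)\s+.*?list_uids=(\d+)",
    re.DOTALL,
)


def parse_neighbours(html):
    """Return a list of (rank, score, uid) tuples found in the HTML text."""
    # Turn the non-breaking-space entities into plain spaces so that
    # the whitespace in the pattern can match them.
    text = html.replace("&nbsp;", " ")
    return [
        (int(rank), float(score), uid)
        for rank, score, uid in ENTRY_RE.findall(text)
    ]
```

Each tuple keeps the UID as a string, since it is only used as an identifier in the later steps.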

Once we have this data, we do the following for each of the 20 relevant citations returned:

  1. Parse the output to find the score and reference UID for each of the 20 citations.

  2. Call the NCBI Text Tool Server for each of the 20 relevant citation UIDs. This retrieves the citation from PubMed in MEDLINE format, which lets us parse out the MeSH Headings (the lines tagged "MH  - "). The output looks similar to the following sample:

    	UI  - 97479605
    	PMID- 9339686
    	DA  - 19971114
    	DCOM- 19971114
    	LR  - 20001218
    	IS  - 0028-3878
    	VI  - 49
    	IP  - 4
    	DP  - 1997 Oct
    	TI  - Higher neonatal cerebral blood flow correlates with worse childhood
    	      neurologic outcome.
    	PG  - 1035-41
    	AB  - Cerebral blood flow (CBF) in newborn infants is often below levels
    	      necessary to sustain brain viability in adults. Controversy exists
    	      regarding the effects of such low CBF on subsequent neurologic function.
    	      We determined the current childhood neurologic status and IQ in 26
    	      subjects who had measurements of CBF performed with PET in the neonatal
    	      period between 1983 and 1989 as part of a study of hypoxic-ischemic
    	      encephalopathy. Follow-up information at ages 4 to 12 years was obtained
    	      on all 26 subjects. Ten subjects had died. All 16 survivors underwent
    	      clinical neurologic evaluation, and 14 also underwent intelligence
    	      testing. Eight had abnormal clinical neurologic evaluations; eight were
    	      normal. The mean neonatal CBF in those with abnormal childhood neurologic
    	      outcome was significantly higher than in those with normal childhood
    	      neurologic outcome (35.64 +/- 11.80 versus 18.26 +/- 8.62 mL 100 g(-1)
    	      min(-1), t = 3.36, p = 0.005). A significant negative correlation between
    	      neonatal CBF and childhood IQ was demonstrated (Spearman rank correlation
    	      r = -0.675, p = 0.008). Higher CBF was associated with lower IQ. The
    	      higher CBF in subjects with worse neurologic and intellectual outcome may
    	      reflect greater loss of cerebrovascular autoregulation or other vascular
    	      regulatory mechanisms due to more severe brain damage.
    	AD  - Department of Pediatrics, Washington University School of Medicine, St.
    	      Louis Children's Hospital, MO 63110, USA.
    	AU  - Rosenbaum JL
    	AU  - Almli CR
    	AU  - Yundt KD
    	AU  - Altman DI
    	AU  - Powers WJ
    	LA  - eng
    	ID  - NS06833/NS/NINDS
    	ID  - NS32568/NS/NINDS
    	PT  - Journal Article
    	TA  - Neurology
    	JC  - NZ0
    	JID - 0401060
    	SB  - AIM
    	SB  - IM
    	MH  - Cerebrovascular Circulation/*physiology
    	MH  - *Child Development
    	MH  - Follow-Up Studies
    	MH  - Human
    	MH  - Infant, Newborn
    	MH  - Intelligence
    	MH  - *Nervous System Physiology
    	MH  - Neurologic Examination
    	MH  - Support, Non-U.S. Gov't
    	MH  - Support, U.S. Gov't, P.H.S.
    	MH  - Tomography, Emission-Computed
    	EDAT- 1997/10/27 20:29
    	MHDA- 1997/10/27 20:29
    	PST - ppublish
    	SO  - Neurology 1997 Oct;49(4):1035-41.

    Once we have the citation from PubMed, we need to do the following:

    1. Pull the MeSH Heading ("MH  - ") lines from the citation.

    2. Track whether the MeSH Heading is an IM term or not.

    3. Combine the MH with the appropriate UI and original score from the relevant citation we are working on.

    4. The final output of this process is a scored and ranked list of all MeSH Headings from the 20 relevant citations. A line of "***" separates the MeSH Headings of one relevant citation from those of the next. The output would look similar to the following:

      	   97479605|IM|Brain Diseases|33.9859
      	   97479605|IM|Cerebrovascular Circulation|33.9859
      	   97479605|NIM|Gestational Age|33.9859
      	   97479605|NIM|Infant, Premature|33.9859
      	   97479605|IM|Tomography, Emission-Computed|33.9859
      	   97479605|NIM|Asphyxia Neonatorum|29.0115
      	   97479605|IM|Brain Ischemia|29.0115
      	   97479605|IM|Cerebrovascular Circulation|29.0115
      	   97479605|IM|Hypoxia, Brain|29.0115
      	   97479605|IM|Intracranial Pressure|29.0115
      	   97479605|IM|Cerebrovascular Circulation|24.6715
      	   97479605|NIM|Oxygen Radioisotopes|24.6715
      	   97479605|NIM|Tomography, Emission-Computed|24.6715
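
Steps 1 through 4 above can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation. The function names `extract_mesh` and `format_mesh_lines` are hypothetical. The sketch also assumes that a heading counts as IM when the MH line carries an asterisk on the heading or on a subheading, which this page does not confirm, and that the "UI" field supplies the identifier used in the output.

```python
import re

# Matches a MEDLINE field line such as "MH  - Intelligence":
# a 2-4 letter tag, optional padding, "- ", then the value.
FIELD_RE = re.compile(r"^\s*([A-Z]{2,4})\s*- (.*)$")


def extract_mesh(medline_text):
    """Return (ui, [(heading, is_major), ...]) for one MEDLINE record."""
    ui = None
    headings = []
    for line in medline_text.splitlines():
        match = FIELD_RE.match(line)
        if not match:
            continue  # blank line or continuation of a wrapped field
        tag, value = match.group(1), match.group(2).strip()
        if tag == "UI":
            ui = value
        elif tag == "MH":
            # ASSUMPTION: an asterisk anywhere on the line means "IM".
            is_major = "*" in value
            # Keep only the main heading, dropping subheadings and "*".
            heading = value.split("/")[0].lstrip("*").strip()
            headings.append((heading, is_major))
    return ui, headings


def format_mesh_lines(scored_citations):
    """Build "UI|IM or NIM|heading|score" lines from (score, record) pairs.

    Citations are ordered by descending score, and a "***" line is
    placed between the headings of consecutive citations.
    """
    ranked = sorted(scored_citations, key=lambda pair: float(pair[0]), reverse=True)
    lines = []
    for index, (score, medline_text) in enumerate(ranked):
        if index > 0:
            lines.append("***")
        ui, headings = extract_mesh(medline_text)
        for heading, is_major in headings:
            flag = "IM" if is_major else "NIM"
            lines.append(f"{ui}|{flag}|{heading}|{score}")
    return lines
```

Scores are passed in as strings so that they are written out exactly as the Related Citations output reported them; they are converted to numbers only for ranking.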

Last Modified: May 30, 2019