INFORMATION & RESOURCES

COVID-19 Related Resources

This page provides descriptions of, and links to, all of the Indexing Initiative COVID-19 related activities and resources. It is meant as a single landing page where you can find these resources without having to go through the entire site.

As of May 1, 2020, all of the Indexing Initiative tools (MetaMap Lite, MetaMap, SemRep, and MTI) have been updated to use the UMLS 2020AA release, which contains a small set of COVID-19 related terms that our tools can identify. For a complete list of the specific strings included, please review our COVID-19 Terms Page: https://metamap.nlm.nih.gov/Covid19Terms.shtml.

We also provide a special coronavirus addendum for the Specialist Lexicon (https://lhncbc.nlm.nih.gov/LSG/Projects/lexicon/current/web/index.html), since we are currently in the middle of our release schedule. This coronavirus-specific addendum is compatible with all versions of the Specialist Lexicon.

SemRep processes both the LitCovid and CORD-19 datasets on a weekly basis. Please note: these results are provided without the need for a UMLS License Agreement due to the nature of the data.

SemRep Options Used for Processing Both Datasets:
semrep -A -N -n -S -F -Z 2021AB
     A: Anaphora resolution
     N: use_generic_domain_extension
     n: use_generic_domain_modification
     S: generic_processing
     F: full_fielded_output
     Z 2021AB: use 2021AB data

*** LitCovid Results (399,160 distinct citations as of February 5, 2024):

 LitCovid.RIS.gz (219 MB) - Downloaded RIS-Formatted LitCovid citations

md5sum b2647b5dd68b798bcccb6b46fd693818
sha1sum 4a9110382ad4e326fc3098a9fb8fea007748b0f6

 LitCovid.MEDLINE.ALL.gz (54 MB) - MEDLINE Formatted LitCovid citations

md5sum f1143b9e2d211acbe09e40418568cef6
sha1sum 5c9b74c266c9c52d6c086712e97eadc4344e08a9

 LitCovid.SemRep.ALL.gz (246 MB) - SemRep Results File for LitCovid citations

md5sum 683aa200f1fb6721c7a73048b23e6ea5
sha1sum 20a04744f9ed1ecf453325a2cfc9e246a63a7f5c

*** CORD-19 Results (1,056,977 distinct articles as of July 19, 2022):

 CORD-19.metadata.csv.gz (551 MB) - Downloaded metadata.csv CORD-19 articles

md5sum a0b3f2fe6a19048e6fbcce0fb744874c
sha1sum 80f9c78f7c9d8717258b0fa681969b54b680228c

 CORD-19.MEDLINE.ALL.gz (399 MB) - MEDLINE Formatted CORD-19 articles

md5sum e8701fd1bc80957f08a0af4c85249d7b
sha1sum 09551f6867cdc1b058267b388f072dd7c9d5e94b

 CORD-19.SemRep.ALL.gz (1.6 GB) - SemRep Results File for CORD-19 articles

md5sum cba1c65dce37d97e6ab5a5e5f437ae29
sha1sum 60351ba29b5cf064fc1d2cd2616fd2ca4bad43e6
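
The checksums listed for each file can be used to confirm that a download completed intact. The following is a minimal Python sketch, not part of the Indexing Initiative tooling: the function names file_digests and verify_download are illustrative, and it assumes the file has already been downloaded to a local path. It reads the file in chunks so that the larger archives do not need to fit in memory.

```python
import hashlib


def file_digests(path, chunk_size=1 << 20):
    """Return (md5_hex, sha1_hex) for the file at `path`.

    The file is read in fixed-size chunks so large archives
    are never loaded into memory all at once.
    """
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()


def verify_download(path, expected_md5, expected_sha1):
    """Return True only if both digests match the published values."""
    md5_hex, sha1_hex = file_digests(path)
    return (md5_hex == expected_md5.strip().lower()
            and sha1_hex == expected_sha1.strip().lower())
```

For example, after downloading CORD-19.MEDLINE.ALL.gz to the current directory, calling verify_download("CORD-19.MEDLINE.ALL.gz", "e8701fd1bc80957f08a0af4c85249d7b", "09551f6867cdc1b058267b388f072dd7c9d5e94b") should return True if the file matches the values published above.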

semmedVER42_R is a superset of semmedVER41_R that additionally includes data derived from all PubMed citations downloaded on April 30, 2020 using the query:

( covid OR sars-cov-2 OR wuhan OR coronavirus OR 2019-ncov OR sars ) AND 2019:2020[dp]

The additional COVID-19 citations were processed with SemRep and the 2020AA UMLS data, which includes COVID-19 terms in CUIs C5203670, C5203671, C5203672, C5203673, C5203674, C5203675, and C5203676.
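
For readers who want to run the PubMed query quoted above themselves, the sketch below builds a request URL for the NCBI E-utilities ESearch endpoint. This is an assumption for illustration only: this page does not state how the citations were actually downloaded, the helper name build_esearch_url is hypothetical, and a search run today will not reproduce the April 30, 2020 snapshot because PubMed has changed since then. The code only constructs the URL and makes no network request.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint (assumed access route, not stated on this page).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# The query exactly as quoted above.
COVID_QUERY = ("( covid OR sars-cov-2 OR wuhan OR coronavirus "
               "OR 2019-ncov OR sars ) AND 2019:2020[dp]")


def build_esearch_url(term, retmax=20):
    """Return an ESearch URL that searches PubMed for `term`.

    `retmax` limits how many PMIDs are returned per request.
    """
    params = {
        "db": "pubmed",
        "term": term,
        "retmax": retmax,
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_esearch_url(COVID_QUERY))
```

The printed URL can be opened in a browser or fetched with any HTTP client; NCBI's usage guidelines for E-utilities apply to any automated access.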

To download the SemMedDB database, click here.

The UMLS 2020AA release contains a small set of COVID-19 related terms that our tools can identify. For a complete list of the specific strings included, please review our COVID-19 Terms Page: https://metamap.nlm.nih.gov/Covid19Terms.shtml

MetaMap Lite Downloads: https://metamap.nlm.nih.gov/MetaMapLite.shtml


MetaMap Downloads: https://metamap.nlm.nih.gov/DataSetDownload.shtml