2015 Subject Extraction Test Collection
|
|
PMC Full Text Articles, subject terms, and experiment files used in the
"Extracting Characteristics of the Study Subjects from Full-Text Articles" paper.
|
13 Nov 2015 |
2014 Vocabulary Density Study Datasets
|
|
MeSH Descriptor and MeSH Descriptor/MeSH Qualifier Vocabulary Density Study Datasets used in the
"Recent Enhancements to the NLM Medical Text Indexer" paper and
"Vocabulary Density Method for Customized Indexing of MEDLINE Journals" AMIA poster.
|
25 Sep 2014 |
2013 BioASQ Publication Types Dataset
|
|
Dataset of randomly selected training and testing PMIDs, with
true positives for both sets, used in the
"Identifying Publication Types Using Machine Learning" paper.
|
8 Sep 2013 |
2013 Vitamin D Dataset
|
|
Lists of PMIDs for the Datasets used in the
"Mining MEDLINE for problems associated with vitamin
D" paper.
|
14 Aug 2013 |
2013 MTI_ML Dataset
|
|
Dataset of randomly selected training and testing PMIDs, with
true positives for both sets, used in the
"Comparison and combination of several MeSH indexing
approaches" machine learning paper.
|
29 Jul 2013 |
2012 MTI_ML Dataset
|
|
Dataset of randomly selected training and testing PMIDs used in the
"MeSH indexing: machine learning and lessons
learned" machine learning paper.
|
29 Jul 2013 |
2011 MTI_ML Dataset
|
|
Dataset of randomly selected training and testing PMIDs used in the
"A One-Size-Fits-All Indexing Method Does Not Exist:
Automatic Selection Based on Meta-Learning." and
"Automatic algorithm selection for MeSH Heading indexing
based on meta-learning." machine learning papers.
|
29 Jul 2013 |
151 Citation GIA Test Collection
|
|
Test Collection used in our Gene Indexing Assistant (GIA)
project. The GIA corpus consists of 151 manually annotated
MEDLINE citations, randomly extracted from journals on human
genetics with publication dates between 2002 and 2011.
|
2012 |
Word Sense Disambiguation (WSD) Test Collection
|
|
The test collection consists of 50 highly frequent ambiguous
UMLS concepts from 1998 MEDLINE with manually annotated results.
|
2001 |
500 PubMed Central Full Text Test Collection
|
|
Test Collection used in our Full Text experiments to date.
|
22 Oct 2003, 12 Dec 2003, 6 Feb 2004, 22 Mar 2005 |
200 MEDLINE Citations Test Collection
|
|
Test Collection used in our original experiments and parameter-tuning
phase, and now used to track improvements to MTI.
|
20 Jan 1999, 14 Mar 2007 |
Dosage Info Test Collection
|
|
Test Collection used in experiments for the
"Finding medications doses in the literature" paper.
|
31 Oct 2018 |