The main components of the MTI ML distribution are easily downloaded via the single MTI_ML.tar.gz link below. MTI ML also requires a third party package monq-1.1.1.jar which is available via the link below.
The monq-1.1.1.jar package is a third party open source development resource package that was originally available from berliOS. The monq-1.1.1.jar package is used to parse XML and provide server capabilities to MTI ML. We have preserved a static copy of the monq-1.1.1.jar file locally.
For best results, download and install the MTI ML distribution file MTI_ML.tar.gz and then download and install the monq-1.1.1.jar package in the
new MTI_ML directory that was created by the MTI ML distribution file.
MTI ML has been compiled and verified to work with Java 1.6. If you need to install Java or update your current version, follow this link: http://www.java.com. Be sure that the path to the java program is in the PATH environment variable.
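On Linux and Mac OS, a quick way to confirm that java is reachable through PATH is sketched below. The helper name on_path is hypothetical and introduced only for this illustration; it relies on the standard shell builtin command -v.

```shell
# on_path COMMAND
# Succeeds (exit status 0) when COMMAND can be found through the PATH.
# The name on_path is illustrative and not part of MTI ML.
on_path() {
    command -v "$1" >/dev/null 2>&1
}

if on_path java; then
    # The JVM writes its version banner to standard error.
    java -version
else
    echo "java was not found on the PATH" >&2
fi
```

If the message appears, add the directory that contains the java program to PATH before continuing.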
Move the downloaded files into the directory where you want to install MTI ML. When you uncompress and untar the MTI ML distribution file, it will create a subdirectory called "MTI_ML"; that directory is referred to as <parent_directory> throughout the rest of the instructions. For example, if you create a directory "Project" and install MTI ML in Project, then <parent_directory> should be set to <path to Project>/Project/MTI_ML, or <path to Project>\Project\MTI_ML under Windows.
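For Linux and Mac OS, the unpacking step might look like the following sketch. The function name install_mti_ml and its three arguments are hypothetical; the sketch assumes that the archive unpacks into an MTI_ML subdirectory as described above and that the local tar accepts the -z and -C options.

```shell
# install_mti_ml ARCHIVE JAR TARGET_DIR
# Unpacks ARCHIVE inside TARGET_DIR, then copies JAR into the MTI_ML
# subdirectory that the archive is expected to create.
# (Illustrative helper, not shipped with MTI ML.)
install_mti_ml() {
    archive=$1
    jar=$2
    target=$3
    mkdir -p "$target" || return 1
    # -x extract, -z gunzip, -f archive file, -C extract into this directory
    tar -xzf "$archive" -C "$target" || return 1
    cp "$jar" "$target/MTI_ML/" || return 1
}

# Example call, assuming both downloads are in the current directory:
#   install_mti_ml MTI_ML.tar.gz monq-1.1.1.jar "$HOME/Project"
# after which <parent_directory> would be $HOME/Project/MTI_ML.
```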
The various MTI ML jar files have to be added to the CLASSPATH environment variable or configured directly using the -cp parameter of the Java Virtual Machine (JVM).
# Windows. Open a Windows command prompt. Go to <parent_directory>.
Move to the drive (e.g. C:) where the <parent_directory> is located
[Drive]:
cd <parent_directory>
# Linux and Mac OS. Open a terminal. Go to <parent_directory>.
cd <parent_directory>
# Windows:
set CLASSPATH="<parent_directory>\monq-1.1.1.jar;<parent_directory>\utils.jar;<parent_directory>\mti_prod.jar"
# Linux and Mac OS:
* in C Shell (csh or tcsh)
setenv CLASSPATH <parent_directory>/monq-1.1.1.jar:<parent_directory>/utils.jar:<parent_directory>/mti_prod.jar
* in Bourne Again Shell (bash)
export CLASSPATH=<parent_directory>/monq-1.1.1.jar:<parent_directory>/utils.jar:<parent_directory>/mti_prod.jar
* Bourne Shell (sh)
CLASSPATH=<parent_directory>/monq-1.1.1.jar:<parent_directory>/utils.jar:<parent_directory>/mti_prod.jar; export CLASSPATH
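As an alternative to exporting CLASSPATH, the same jar list can be built once and passed to the JVM with -cp. This is a minimal sketch for Bourne-compatible shells; the helper name mti_classpath and the variable MTI_CP are illustrative and not part of MTI ML.

```shell
# mti_classpath PARENT_DIRECTORY
# Prints the colon-separated classpath for the three jars named in the
# instructions, all located directly under PARENT_DIRECTORY.
mti_classpath() {
    printf '%s/monq-1.1.1.jar:%s/utils.jar:%s/mti_prod.jar\n' "$1" "$1" "$1"
}

# Run from inside <parent_directory>: keep the value in an ordinary shell
# variable instead of exporting CLASSPATH.
MTI_CP=$(mti_classpath "$PWD")
echo "$MTI_CP"
```

The value can then be supplied as java -cp "$MTI_CP" in place of -cp $CLASSPATH.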
The file configuration.txt contains the details of the training. In this case, we are training models for the Humans, Male, and Female MeSH headings.
Details of the training are sent to the standard output. In the example, this is redirected to out.txt and can be used to follow the training progress and understand the generated model.
The training will take several minutes. Check the file out.log to ensure that there were no errors during training.
# Windows:
type citations.train.xml | java -cp %CLASSPATH% -Xmx1G -Xms1G -ss6000k gov.nih.nlm.nls.mti.trainer.OVATrainer gov.nih.nlm.nls.mti.textprocessors.MEDLINEXMLTextProcessor "" gov.nih.nlm.nls.mti.featuresextractors.BinaryFeatureExtractor "-l -n -c -f1" configuration.txt trie.gz classifiers.gz 2> out.log > out.txt
# Linux and Mac OS (from Bourne/Bash Shell):
sh
gunzip -c citations.train.xml.gz | java -cp $CLASSPATH -Xmx1G -Xms1G -ss6000k gov.nih.nlm.nls.mti.trainer.OVATrainer gov.nih.nlm.nls.mti.textprocessors.MEDLINEXMLTextProcessor "" gov.nih.nlm.nls.mti.featuresextractors.BinaryFeatureExtractor "-l -n -c -f1" configuration.txt trie.gz classifiers.gz 2> out.log > out.txt
Annotating will take several minutes. Check the file annotation.log to ensure that there were no errors during annotating.
# Windows:
type citations.test.xml | java -ss6000k -cp %CLASSPATH% gov.nih.nlm.nls.mti.annotator.OVAAnnotator gov.nih.nlm.nls.mti.textprocessors.MEDLINEXMLTextProcessor "" gov.nih.nlm.nls.mti.featuresextractors.BinaryFeatureExtractor "-l -n -c" trie.gz classifiers.gz > annotation.txt 2> annotation.log
# Linux and Mac OS (from Bourne/Bash Shell):
gunzip -c citations.test.xml.gz | java -ss6000k -cp $CLASSPATH gov.nih.nlm.nls.mti.annotator.OVAAnnotator gov.nih.nlm.nls.mti.textprocessors.MEDLINEXMLTextProcessor "" gov.nih.nlm.nls.mti.featuresextractors.BinaryFeatureExtractor "-l -n -c" trie.gz classifiers.gz > annotation.txt 2> annotation.log
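Both the training and annotation steps ask you to check a log file for errors. One way to do that on Linux and Mac OS is sketched below. The helper name log_has_errors is hypothetical, and the search pattern is an assumption: it looks for the words "Exception" or "Error", which is how uncaught Java failures usually show up on standard error, but MTI ML may report problems in other ways.

```shell
# log_has_errors FILE
# Succeeds (exit status 0) when FILE contains a line mentioning
# "Exception" or "Error". The pattern is an assumption, not an MTI ML rule.
log_has_errors() {
    grep -E -q 'Exception|Error' "$1"
}

# Example use after a run:
#   if log_has_errors annotation.log; then
#       echo "annotation.log reports a problem" >&2
#   fi
```

Reading the log by eye remains the more reliable check; the function only flags the most common failure signature.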
# Windows:
type annotation.txt | java -cp %CLASSPATH% gov.nih.nlm.nls.mti.evaluator.Evaluator benchmark.test > benchmark.txt
# Linux and Mac OS (from Bourne/Bash Shell):
cat annotation.txt | java -cp $CLASSPATH gov.nih.nlm.nls.mti.evaluator.Evaluator benchmark.test > benchmark.txt
grep "^Female|" benchmark.txt
grep "^Humans|" benchmark.txt
grep "^Male|" benchmark.txt
The evaluation file benchmark.txt contains results for all the MeSH headings. Each line shows the result for a single MeSH heading, with fields separated by the pipe symbol (|). The fields are, in order: the MeSH heading name, the number of positives in the test set, the true positives, the false negatives, precision, recall, and F-measure.
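To make the field order easier to read, one line of the evaluation file can be labelled with awk, as in the sketch below. The helper name show_heading is hypothetical. The recomputed recall uses the standard definition TP / (TP + FN) and is offered only as a sanity check; it is an assumption that the evaluator computes recall the same way.

```shell
# show_heading HEADING FILE
# Prints the labelled fields for one MeSH heading from a pipe-separated
# evaluation file, followed by recall recomputed as TP / (TP + FN).
# (Illustrative helper, not part of MTI ML.)
show_heading() {
    awk -F'|' -v h="$1" '
        $1 == h {
            printf "heading=%s positives=%s tp=%s fn=%s precision=%s recall=%s f=%s\n",
                   $1, $2, $3, $4, $5, $6, $7
            if ($3 + $4 > 0)
                printf "recomputed_recall=%.4f\n", $3 / ($3 + $4)
        }' "$2"
}

# Example use:
#   show_heading Humans benchmark.txt
```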
The result for the MeSH headings Humans, Male and Female should be similar to these results:
Now that you have managed to train and evaluate classifiers based on the example data set, you can learn more about this tool from the document available here.
Last Modified: May 30, 2019