public class MstParser
extends org.apache.uima.fit.component.JCasAnnotator_ImplBase
Wrapper for the MSTParser (high memory requirements). More information about the parser can be found here.
The MSTParser models tend to be very large, e.g. the Eisner model is about 600 MB uncompressed. With this model, parsing a simple sentence with MSTParser requires about 3 GB heap memory.
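Given the heap requirement described above, it can help to check the JVM's heap limit before the model is loaded. The following is a minimal sketch that uses the approximate 3 GB figure as its threshold. The class `HeapCheck` and its method are hypothetical helpers and not part of this component.

```java
// Hypothetical helper, not part of MstParser: compares the JVM's maximum heap
// with a required size before an expensive model is loaded.
final class HeapCheck {
    // Approximate requirement taken from the description above (about 3 GB).
    static final long REQUIRED_BYTES = 3L * 1024 * 1024 * 1024;

    // Returns true if maxHeapBytes is at least requiredBytes.
    static boolean hasEnoughHeap(long maxHeapBytes, long requiredBytes) {
        return maxHeapBytes >= requiredBytes;
    }

    public static void main(String[] args) {
        // Runtime.maxMemory() reports the most memory the JVM will attempt to use.
        long maxHeap = Runtime.getRuntime().maxMemory();
        if (!hasEnoughHeap(maxHeap, REQUIRED_BYTES)) {
            System.err.printf("Max heap is %d MB; consider a larger -Xmx setting.%n",
                    maxHeap / (1024 * 1024));
        }
    }
}
```

The heap limit is typically raised with the `-Xmx` JVM option. The exact value needed depends on the model in use.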
This component feeds MSTParser only with the FORM (token) and POS (part-of-speech) fields. LEMMA, CPOS, and other columns from the CONLL 2006 format are not generated (cf. DependencyInstance).
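To make the column coverage concrete, the sketch below formats tokens as CoNLL 2006 rows with only FORM and POS filled in and every other column left as the `_` placeholder. It is an illustration only: the class `ConllRowSketch` is hypothetical, the component does not necessarily build text rows like this internally, and placing the POS tag in the POSTAG column is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not the component's actual code.
final class ConllRowSketch {
    // Builds one tab-separated row with the ten CoNLL 2006 columns:
    // ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, PHEAD, PDEPREL.
    // Only ID, FORM and POSTAG carry data; the rest are "_" placeholders.
    static String row(int id, String form, String pos) {
        return String.join("\t",
                Integer.toString(id), form, "_", "_", pos, "_", "_", "_", "_", "_");
    }

    // Builds the rows of one sentence; token IDs start at 1.
    static List<String> rows(String[] forms, String[] posTags) {
        if (forms.length != posTags.length) {
            throw new IllegalArgumentException("forms and posTags differ in length");
        }
        List<String> out = new ArrayList<>();
        for (int i = 0; i < forms.length; i++) {
            out.add(row(i + 1, forms[i], posTags[i]));
        }
        return out;
    }

    public static void main(String[] args) {
        String[] forms = {"Dogs", "bark"};
        String[] posTags = {"NNS", "VBP"};
        for (String line : rows(forms, posTags)) {
            System.out.println(line);
        }
    }
}
```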
Modifier and Type | Field and Description |
---|---|
protected String | dependencyMappingLocation |
protected String | language |
protected String | modelLocation |
static String | PARAM_DEPENDENCY_MAPPING_LOCATION: Load the dependency to UIMA type mapping from this location instead of locating the mapping automatically. |
static String | PARAM_LANGUAGE: Use this language instead of the document language to resolve the model. |
static String | PARAM_MODEL_LOCATION: Load the model from this location instead of locating the model automatically. |
static String | PARAM_ORDER: Specifies the order/scope of features. |
static String | PARAM_PRINT_TAGSET: Log the tag set(s) when a model is loaded. |
static String | PARAM_VARIANT: Override the default variant used to locate the model. |
protected boolean | printTagSet |
protected String | variant |
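Several of these parameters follow the same pattern: an explicitly configured value takes precedence, and the automatically determined value is used otherwise. The sketch below shows that fallback in isolation. `OverrideSketch` and `resolve` are hypothetical names, and the component's real resolution logic may involve more than this.

```java
// Hypothetical illustration of "use the parameter instead of the default".
final class OverrideSketch {
    // Returns the override when one is configured, otherwise the fallback.
    static String resolve(String override, String fallback) {
        return override != null ? override : fallback;
    }

    public static void main(String[] args) {
        String documentLanguage = "de"; // stands in for the document language
        String paramLanguage = null;    // stands in for an unset PARAM_LANGUAGE
        System.out.println(resolve(paramLanguage, documentLanguage)); // prints "de"
        System.out.println(resolve("en", documentLanguage));          // prints "en"
    }
}
```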
Constructor and Description |
---|
MstParser() |
Modifier and Type | Method and Description |
---|---|
void | initialize(org.apache.uima.UimaContext context): Initializes the MSTParser and creates a ModelResourceProvider. |
void | process(org.apache.uima.jcas.JCas jcas): Processes the given text using the MSTParser. |
Inherited methods: getRequiredCasInterface, process, getCasInstancesRequired, hasNext, next
public static final String PARAM_LANGUAGE
protected String language
public static final String PARAM_VARIANT
protected String variant
public static final String PARAM_MODEL_LOCATION
protected String modelLocation
public static final String PARAM_PRINT_TAGSET
Default: false
protected boolean printTagSet
public static final String PARAM_DEPENDENCY_MAPPING_LOCATION
protected String dependencyMappingLocation
public static final String PARAM_ORDER
public void initialize(org.apache.uima.UimaContext context) throws org.apache.uima.resource.ResourceInitializationException
Specified by: initialize in interface org.apache.uima.analysis_component.AnalysisComponent
Overrides: initialize in class org.apache.uima.fit.component.JCasAnnotator_ImplBase
Throws: org.apache.uima.resource.ResourceInitializationException - Cannot be initialized
public void process(org.apache.uima.jcas.JCas jcas) throws org.apache.uima.analysis_engine.AnalysisEngineProcessException
Specified by: process in class org.apache.uima.analysis_component.JCasAnnotator_ImplBase
Parameters: jcas - The JCas containing the textual input
Throws: org.apache.uima.analysis_engine.AnalysisEngineProcessException - No parse created
Copyright © 2007–2018 Ubiquitous Knowledge Processing (UKP) Lab, Technische Universität Darmstadt. All rights reserved.