public class StanfordParser
extends org.apache.uima.fit.component.JCasAnnotator_ImplBase
Modifier and Type | Class and Description |
---|---|
static class | StanfordParser.DependenciesMode |
Modifier and Type | Field and Description |
---|---|
protected String | constituentMappingLocation |
protected String | language |
protected StanfordParser.DependenciesMode | mode |
protected String | modelLocation |
static String | PARAM_ANNOTATIONTYPE_TO_PARSE: This parameter can be used to override the standard behavior, which uses the Sentence annotation as the basic unit for parsing. |
static String | PARAM_CONSTITUENT_MAPPING_LOCATION: Location of the mapping file for constituent tags to UIMA types. |
static String | PARAM_KEEP_PUNCTUATION |
static String | PARAM_LANGUAGE: Use this language instead of the document language to resolve the model and tag set mapping. |
static String | PARAM_MAX_ITEMS: Controls when the factored parser considers a sentence to be too complex and falls back to the PCFG parser. |
static String | PARAM_MAX_SENTENCE_LENGTH: Maximum number of tokens in a sentence. |
static String | PARAM_MODE: Sets the kind of dependencies being created. |
static String | PARAM_MODEL_LOCATION: Location from which the model is read. |
static String | PARAM_POS_MAPPING_LOCATION: Location of the mapping file for part-of-speech tags to UIMA types. |
static String | PARAM_PRINT_TAGSET: Write the tag set(s) to the log when a model is loaded. |
static String | PARAM_PTB3_ESCAPING: Enable all traditional PTB3 token transforms (like -LRB-, -RRB-). |
static String | PARAM_QUOTE_BEGIN: List of extra token texts (usually single-character strings) that should be treated like opening quotes and escaped accordingly before being sent to the parser. |
static String | PARAM_QUOTE_END: List of extra token texts (usually single-character strings) that should be treated like closing quotes and escaped accordingly before being sent to the parser. |
static String | PARAM_READ_POS: Sets whether to use existing POS tags from another annotator during parsing. |
static String | PARAM_VARIANT: Variant of the model. |
static String | PARAM_WRITE_CONSTITUENT: Sets whether to create constituent tags. |
static String | PARAM_WRITE_DEPENDENCY: Sets whether to create dependency annotations. |
static String | PARAM_WRITE_PENN_TREE: If this parameter is set to true, each sentence is annotated with a PennTree annotation containing the whole parse tree in Penn Treebank style format. |
static String | PARAM_WRITE_POS: Sets whether to create POS tags. |
protected String | posMappingLocation |
protected boolean | printTagSet |
protected String | variant |
Constructor and Description |
---|
StanfordParser() |
Modifier and Type | Method and Description |
---|---|
protected void | doCreateDependencyTags(edu.stanford.nlp.parser.common.ParserGrammar aParser, StanfordAnnotator sfAnnotator, edu.stanford.nlp.trees.Tree parseTree, List<Token> tokens) |
void | initialize(org.apache.uima.UimaContext context) |
void | process(org.apache.uima.jcas.JCas aJCas): Processes the given text using the StanfordParser. |
protected edu.stanford.nlp.ling.CoreLabel | tokenToWord(Token aToken) |
getRequiredCasInterface, process
getCasInstancesRequired, hasNext, next
public static final String PARAM_PRINT_TAGSET
protected boolean printTagSet
public static final String PARAM_LANGUAGE
protected String language
public static final String PARAM_VARIANT
protected String variant
public static final String PARAM_MODEL_LOCATION
protected String modelLocation
public static final String PARAM_POS_MAPPING_LOCATION
protected String posMappingLocation
public static final String PARAM_CONSTITUENT_MAPPING_LOCATION
protected String constituentMappingLocation
public static final String PARAM_WRITE_DEPENDENCY
Default: true
public static final String PARAM_MODE
Default: TREE
protected StanfordParser.DependenciesMode mode
public static final String PARAM_WRITE_CONSTITUENT
Default: true
public static final String PARAM_WRITE_PENN_TREE
Default: false
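To make the "Penn Treebank style format" mentioned for PARAM_WRITE_PENN_TREE concrete, the following is a minimal sketch of how a constituent tree can be rendered as a bracketed string. The class `PennTreeSketch` and its `Node` type are hypothetical helpers written for illustration only; they are not part of DKPro Core or Stanford CoreNLP, and the exact whitespace the parser emits may differ.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the DKPro Core or Stanford implementation.
final class PennTreeSketch {

    /** A tree node: a label plus children. A node without children is a word. */
    static final class Node {
        final String label;
        final List<Node> children;

        Node(String label, Node... children) {
            this.label = label;
            this.children = Arrays.asList(children);
        }
    }

    /** Renders a node in single-line bracketed (Penn Treebank style) form. */
    static String toPenn(Node node) {
        if (node.children.isEmpty()) {
            return node.label; // leaf: the token text itself
        }
        StringBuilder sb = new StringBuilder("(").append(node.label);
        for (Node child : node.children) {
            sb.append(' ').append(toPenn(child));
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        Node tree = new Node("ROOT",
                new Node("S",
                        new Node("NP", new Node("PRP", new Node("I"))),
                        new Node("VP", new Node("VBP", new Node("run")))));
        // prints (ROOT (S (NP (PRP I)) (VP (VBP run))))
        System.out.println(toPenn(tree));
    }
}
```

Each constituent becomes a parenthesized group that starts with its label, and the words appear as bare leaves inside their part-of-speech groups.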
public static final String PARAM_ANNOTATIONTYPE_TO_PARSE
If this parameter is set to the name of an annotation type x, the parser parses x annotations instead of Sentence annotations.
Default: null
public static final String PARAM_WRITE_POS
Default: false
public static final String PARAM_READ_POS
Default: true
public static final String PARAM_MAX_SENTENCE_LENGTH
Default: 130
See Also: TestOptions.maxLength, Constant Field Values
public static final String PARAM_MAX_ITEMS
Default: 200000
See Also: TestOptions.MAX_ITEMS, Constant Field Values
public static final String PARAM_PTB3_ESCAPING
See Also: PTBEscapingProcessor, Constant Field Values
public static final String PARAM_QUOTE_BEGIN
public static final String PARAM_QUOTE_END
public static final String PARAM_KEEP_PUNCTUATION
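As an illustration of what PARAM_PTB3_ESCAPING, PARAM_QUOTE_BEGIN and PARAM_QUOTE_END describe, here is a minimal sketch of token-level escaping. It uses the conventional PTB bracket replacements (such as -LRB- and -RRB-) and the PTB quote tokens `` and ''. The class `Ptb3EscapeSketch`, its method signature, and the order in which the checks are applied are assumptions made for this example; the component itself delegates to Stanford's PTBEscapingProcessor, whose behavior is more extensive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not the DKPro Core or Stanford implementation.
final class Ptb3EscapeSketch {

    // Conventional PTB3 replacements for bracket tokens.
    private static final Map<String, String> BRACKETS = new HashMap<>();
    static {
        BRACKETS.put("(", "-LRB-");
        BRACKETS.put(")", "-RRB-");
        BRACKETS.put("[", "-LSB-");
        BRACKETS.put("]", "-RSB-");
        BRACKETS.put("{", "-LCB-");
        BRACKETS.put("}", "-RCB-");
    }

    /**
     * Escapes a single token. Tokens listed in quoteBegin or quoteEnd are
     * mapped to the PTB opening or closing quote token (assumed behavior).
     */
    static String escape(String token, Set<String> quoteBegin, Set<String> quoteEnd) {
        if (quoteBegin.contains(token)) {
            return "``";
        }
        if (quoteEnd.contains(token)) {
            return "''";
        }
        String mapped = BRACKETS.get(token);
        return mapped != null ? mapped : token;
    }

    public static void main(String[] args) {
        // Treat guillemets as extra opening and closing quotes.
        Set<String> quoteBegin = new HashSet<>(Arrays.asList("\u00AB"));
        Set<String> quoteEnd = new HashSet<>(Arrays.asList("\u00BB"));

        List<String> escaped = new ArrayList<>();
        for (String token : Arrays.asList("\u00AB", "(", "parse", ")", "\u00BB")) {
            escaped.add(escape(token, quoteBegin, quoteEnd));
        }
        // prints `` -LRB- parse -RRB- ''
        System.out.println(String.join(" ", escaped));
    }
}
```

Passing the extra quote characters as sets mirrors the idea of a configurable list of token texts; ordinary tokens pass through unchanged.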
public void initialize(org.apache.uima.UimaContext context) throws org.apache.uima.resource.ResourceInitializationException
Specified by: initialize in interface org.apache.uima.analysis_component.AnalysisComponent
Overrides: initialize in class org.apache.uima.fit.component.JCasAnnotator_ImplBase
Throws: org.apache.uima.resource.ResourceInitializationException
public void process(org.apache.uima.jcas.JCas aJCas) throws org.apache.uima.analysis_engine.AnalysisEngineProcessException
Specified by: process in class org.apache.uima.analysis_component.JCasAnnotator_ImplBase
Parameters: aJCas - the JCas to process
Throws: org.apache.uima.analysis_engine.AnalysisEngineProcessException
See Also: JCasAnnotator_ImplBase.process(org.apache.uima.jcas.JCas)
protected void doCreateDependencyTags(edu.stanford.nlp.parser.common.ParserGrammar aParser, StanfordAnnotator sfAnnotator, edu.stanford.nlp.trees.Tree parseTree, List<Token> tokens)
protected edu.stanford.nlp.ling.CoreLabel tokenToWord(Token aToken)
Copyright © 2007–2018 Ubiquitous Knowledge Processing (UKP) Lab, Technische Universität Darmstadt. All rights reserved.