StanfordParser (DKPro Core 1.9.0 API)

java.lang.Object
- org.apache.uima.analysis_component.AnalysisComponent_ImplBase
  - org.apache.uima.analysis_component.Annotator_ImplBase
    - org.apache.uima.analysis_component.JCasAnnotator_ImplBase
      - org.apache.uima.fit.component.JCasAnnotator_ImplBase
        - de.tudarmstadt.ukp.dkpro.core.stanfordnlp.StanfordParser

All Implemented Interfaces:

org.apache.uima.analysis_component.AnalysisComponent
```
public class StanfordParser
extends org.apache.uima.fit.component.JCasAnnotator_ImplBase
```
Stanford Parser component.

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`static class`	`StanfordParser.DependenciesMode`

Field Summary

Fields
Modifier and Type	Field and Description
`protected String`	`constituentMappingLocation`
`protected String`	`language`
`protected StanfordParser.DependenciesMode`	`mode`
`protected String`	`modelLocation`
`static String`	`PARAM_ANNOTATIONTYPE_TO_PARSE` This parameter can be used to override the standard behavior which uses the Sentence annotation as the basic unit for parsing.
`static String`	`PARAM_CONSTITUENT_MAPPING_LOCATION` Location of the mapping file for constituent tags to UIMA types.
`static String`	`PARAM_KEEP_PUNCTUATION`
`static String`	`PARAM_LANGUAGE` Use this language instead of the document language to resolve the model and tag set mapping.
`static String`	`PARAM_MAX_ITEMS` Controls when the factored parser considers a sentence to be too complex and falls back to the PCFG parser.
`static String`	`PARAM_MAX_SENTENCE_LENGTH` Maximum number of tokens in a sentence.
`static String`	`PARAM_MODE` Sets the kind of dependencies being created.
`static String`	`PARAM_MODEL_LOCATION` Location from which the model is read.
`static String`	`PARAM_POS_MAPPING_LOCATION` Location of the mapping file for part-of-speech tags to UIMA types.
`static String`	`PARAM_PRINT_TAGSET` Write the tag set(s) to the log when a model is loaded.
`static String`	`PARAM_PTB3_ESCAPING` Enable all traditional PTB3 token transforms (like -LRB-, -RRB-).
`static String`	`PARAM_QUOTE_BEGIN` List of extra token texts (usually single character strings) that should be treated like opening quotes and escaped accordingly before being sent to the parser.
`static String`	`PARAM_QUOTE_END` List of extra token texts (usually single character strings) that should be treated like closing quotes and escaped accordingly before being sent to the parser.
`static String`	`PARAM_READ_POS` Sets whether to use POS tags already created by another annotator during parsing.
`static String`	`PARAM_VARIANT` Variant of a model.
`static String`	`PARAM_WRITE_CONSTITUENT` Sets whether to create constituent tags.
`static String`	`PARAM_WRITE_DEPENDENCY` Sets whether to create dependency annotations.
`static String`	`PARAM_WRITE_PENN_TREE` If this parameter is set to true, each sentence is annotated with a PennTree-Annotation, containing the whole parse tree in Penn Treebank style format.
`static String`	`PARAM_WRITE_POS` Sets whether to create POS tags.
`protected String`	`posMappingLocation`
`protected boolean`	`printTagSet`
`protected String`	`variant`

Constructor Summary

Constructors
Constructor and Description
`StanfordParser()`

Method Summary

Modifier and Type	Method and Description
`protected void`	`doCreateDependencyTags(edu.stanford.nlp.parser.common.ParserGrammar aParser, StanfordAnnotator sfAnnotator, edu.stanford.nlp.trees.Tree parseTree, List<Token> tokens)`
`void`	`initialize(org.apache.uima.UimaContext context)`
`void`	`process(org.apache.uima.jcas.JCas aJCas)` Processes the given text using the StanfordParser.
`protected edu.stanford.nlp.ling.CoreLabel`	`tokenToWord(Token aToken)`

Methods inherited from class org.apache.uima.fit.component.JCasAnnotator_ImplBase
getLogger

Methods inherited from class org.apache.uima.analysis_component.JCasAnnotator_ImplBase
getRequiredCasInterface, process

Methods inherited from class org.apache.uima.analysis_component.Annotator_ImplBase
getCasInstancesRequired, hasNext, next

Methods inherited from class org.apache.uima.analysis_component.AnalysisComponent_ImplBase
batchProcessComplete, collectionProcessComplete, destroy, getContext, getResultSpecification, reconfigure, setResultSpecification

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - PARAM_PRINT_TAGSET
```
public static final String PARAM_PRINT_TAGSET
```
    Write the tag set(s) to the log when a model is loaded.
    
    See Also:
    
    Constant Field Values
  - printTagSet
```
protected boolean printTagSet
```
  - PARAM_LANGUAGE
```
public static final String PARAM_LANGUAGE
```
    Use this language instead of the document language to resolve the model and tag set mapping.
    
    See Also:
    
    Constant Field Values
  - language
```
protected String language
```
  - PARAM_VARIANT
```
public static final String PARAM_VARIANT
```
    Variant of a model. Used to address a specific model if there are multiple models for one language.
    
    See Also:
    
    Constant Field Values
  - variant
```
protected String variant
```
  - PARAM_MODEL_LOCATION
```
public static final String PARAM_MODEL_LOCATION
```
    Location from which the model is read.
    
    See Also:
    
    Constant Field Values
  - modelLocation
```
protected String modelLocation
```
  - PARAM_POS_MAPPING_LOCATION
```
public static final String PARAM_POS_MAPPING_LOCATION
```
    Location of the mapping file for part-of-speech tags to UIMA types.
    
    See Also:
    
    Constant Field Values
  - posMappingLocation
```
protected String posMappingLocation
```
  - PARAM_CONSTITUENT_MAPPING_LOCATION
```
public static final String PARAM_CONSTITUENT_MAPPING_LOCATION
```
    Location of the mapping file for constituent tags to UIMA types.
    
    See Also:
    
    Constant Field Values
  - constituentMappingLocation
```
protected String constituentMappingLocation
```
  - PARAM_WRITE_DEPENDENCY
```
public static final String PARAM_WRITE_DEPENDENCY
```
    Sets whether to create dependency annotations.
    Default: true
    
    See Also:
    
    Constant Field Values
  - PARAM_MODE
```
public static final String PARAM_MODE
```
    Sets the kind of dependencies being created.
    Default: TREE
    
    See Also:
    
    StanfordParser.DependenciesMode, Constant Field Values
  - mode
```
protected StanfordParser.DependenciesMode mode
```
  - PARAM_WRITE_CONSTITUENT
```
public static final String PARAM_WRITE_CONSTITUENT
```
    Sets whether to create constituent tags. This is required for POS-tagging and lemmatization.
    Default: true
    
    See Also:
    
    Constant Field Values
  - PARAM_WRITE_PENN_TREE
```
public static final String PARAM_WRITE_PENN_TREE
```
    If this parameter is set to true, each sentence is annotated with a PennTree-Annotation, containing the whole parse tree in Penn Treebank style format.
    Default: false
    
    See Also:
    
    Constant Field Values
  - PARAM_ANNOTATIONTYPE_TO_PARSE
```
public static final String PARAM_ANNOTATIONTYPE_TO_PARSE
```
    This parameter can be used to override the standard behavior which uses the Sentence annotation as the basic unit for parsing.
    If the parameter is set to the name of an annotation type x, the parser parses x annotations instead of Sentence annotations.
    
    Default: null
    
    See Also:
    
    Constant Field Values
  - PARAM_WRITE_POS
```
public static final String PARAM_WRITE_POS
```
    Sets whether to create POS tags. The creation of constituent tags must be turned on for this to work.
    Default: false
    
    See Also:
    
    Constant Field Values
  - PARAM_READ_POS
```
public static final String PARAM_READ_POS
```
    Sets whether to use POS tags already created by another annotator during parsing.
    Default: true
    
    See Also:
    
    Constant Field Values
  - PARAM_MAX_SENTENCE_LENGTH
```
public static final String PARAM_MAX_SENTENCE_LENGTH
```
    Maximum number of tokens in a sentence. Longer sentences are not parsed, which avoids out-of-memory errors.
    Default: 130
    
    See Also:
    
    TestOptions.maxLength, Constant Field Values
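The length limit described above can be pictured with a small sketch in plain Java. It is not taken from the DKPro Core sources: the class `SentenceLengthFilter` and its method `parseable` are hypothetical names, sentences are modelled as lists of token strings, and the sketch assumes that "longer" means strictly more tokens than the configured maximum.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration, not part of DKPro Core.
class SentenceLengthFilter {
    // Documented default of PARAM_MAX_SENTENCE_LENGTH.
    static final int DEFAULT_MAX_TOKENS = 130;

    /** Returns, in order, the sentences that have at most maxTokens tokens. */
    static List<List<String>> parseable(List<List<String>> sentences, int maxTokens) {
        List<List<String>> kept = new ArrayList<>();
        for (List<String> tokens : sentences) {
            if (tokens.size() > maxTokens) {
                continue; // assumed behavior: too long, so skipped rather than parsed
            }
            kept.add(tokens);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<List<String>> sentences = Arrays.asList(
                Arrays.asList("A", "short", "sentence", "."),
                Arrays.asList("This", "one", "has", "six", "tokens", "."));
        // With a limit of 5 tokens only the first sentence is kept.
        System.out.println(parseable(sentences, 5).size());
    }
}
```

Skipped sentences simply receive no parse annotations in this model; whether the real component also logs them is not stated on this page.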
  - PARAM_MAX_ITEMS
```
public static final String PARAM_MAX_ITEMS
```
    Controls when the factored parser considers a sentence to be too complex and falls back to the PCFG parser.
    Default: 200000
    
    See Also:
    
    TestOptions.MAX_ITEMS, Constant Field Values
  - PARAM_PTB3_ESCAPING
```
public static final String PARAM_PTB3_ESCAPING
```
    Enable all traditional PTB3 token transforms (like -LRB-, -RRB-).
    
    See Also:
    
    PTBEscapingProcessor, Constant Field Values
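As a rough illustration of what such token transforms look like, the following plain-Java sketch maps bracket tokens to their Penn Treebank escapes. It is not the `PTBEscapingProcessor` implementation and covers only brackets. The class `Ptb3BracketEscaper` is a hypothetical name, and the square and curly bracket escapes are the usual PTB conventions rather than something stated on this page.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration, not the Stanford PTBEscapingProcessor.
class Ptb3BracketEscaper {
    private static final Map<String, String> ESCAPES = new HashMap<>();

    static {
        ESCAPES.put("(", "-LRB-"); // left round bracket
        ESCAPES.put(")", "-RRB-"); // right round bracket
        ESCAPES.put("[", "-LSB-"); // left square bracket
        ESCAPES.put("]", "-RSB-"); // right square bracket
        ESCAPES.put("{", "-LCB-"); // left curly bracket
        ESCAPES.put("}", "-RCB-"); // right curly bracket
    }

    /** Returns the PTB escape for a bracket token, or the token text unchanged. */
    static String escape(String tokenText) {
        return ESCAPES.getOrDefault(tokenText, tokenText);
    }

    public static void main(String[] args) {
        for (String token : new String[] {"(", "see", "above", ")"}) {
            System.out.println(escape(token));
        }
    }
}
```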
  - PARAM_QUOTE_BEGIN
```
public static final String PARAM_QUOTE_BEGIN
```
    List of extra token texts (usually single character strings) that should be treated like opening quotes and escaped accordingly before being sent to the parser.
    
    See Also:
    
    Constant Field Values
  - PARAM_QUOTE_END
```
public static final String PARAM_QUOTE_END
```
    List of extra token texts (usually single character strings) that should be treated like closing quotes and escaped accordingly before being sent to the parser.
    
    See Also:
    
    Constant Field Values
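The effect of the two quote parameters can be sketched in plain Java as follows. This is an assumption-laden illustration, not the component's code: the class `ExtraQuoteEscaper` is hypothetical, and the sketch assumes that "escaped accordingly" means the Penn Treebank convention of two backticks for opening quotes and two apostrophes for closing quotes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration, not part of DKPro Core.
class ExtraQuoteEscaper {
    private final Set<String> quoteBegin; // stands in for the PARAM_QUOTE_BEGIN values
    private final Set<String> quoteEnd;   // stands in for the PARAM_QUOTE_END values

    ExtraQuoteEscaper(String[] begin, String[] end) {
        this.quoteBegin = new HashSet<>(Arrays.asList(begin));
        this.quoteEnd = new HashSet<>(Arrays.asList(end));
    }

    /** Replaces configured quote tokens with PTB-style quotes (assumed convention). */
    String escape(String tokenText) {
        if (quoteBegin.contains(tokenText)) {
            return "``"; // PTB opening quote
        }
        if (quoteEnd.contains(tokenText)) {
            return "''"; // PTB closing quote
        }
        return tokenText;
    }

    public static void main(String[] args) {
        // Guillemets written as Unicode escapes to avoid source-encoding issues.
        ExtraQuoteEscaper escaper = new ExtraQuoteEscaper(
                new String[] {"\u00AB"}, new String[] {"\u00BB"});
        for (String token : new String[] {"\u00AB", "Bonjour", "\u00BB"}) {
            System.out.println(escaper.escape(token));
        }
    }
}
```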
  - PARAM_KEEP_PUNCTUATION
```
public static final String PARAM_KEEP_PUNCTUATION
```
    See Also:
    
    Constant Field Values
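For quick reference, the defaults documented in the field details above can be collected in one place. The plain-Java sketch below is hypothetical: the class `DocumentedDefaults` does not exist in DKPro Core, and the map keys are the constant names from this page, not the constants' actual string values, which this page does not list. The consistency check encodes the one dependency stated above, that writing POS tags requires constituent creation to be turned on.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration, not part of DKPro Core.
class DocumentedDefaults {
    /** Defaults as documented on this page, keyed by constant name (not by value). */
    static Map<String, Object> defaults() {
        Map<String, Object> d = new LinkedHashMap<>();
        d.put("PARAM_WRITE_DEPENDENCY", true);
        d.put("PARAM_MODE", "TREE");
        d.put("PARAM_WRITE_CONSTITUENT", true);
        d.put("PARAM_WRITE_PENN_TREE", false);
        d.put("PARAM_WRITE_POS", false);
        d.put("PARAM_READ_POS", true);
        d.put("PARAM_MAX_SENTENCE_LENGTH", 130);
        d.put("PARAM_MAX_ITEMS", 200000);
        return d;
    }

    /** True unless POS writing is requested while constituent creation is off. */
    static boolean isConsistent(Map<String, Object> settings) {
        boolean writePos = Boolean.TRUE.equals(settings.get("PARAM_WRITE_POS"));
        boolean writeConstituent = Boolean.TRUE.equals(settings.get("PARAM_WRITE_CONSTITUENT"));
        return !writePos || writeConstituent;
    }

    public static void main(String[] args) {
        Map<String, Object> settings = defaults();
        settings.put("PARAM_WRITE_POS", true); // allowed: constituents are on by default
        System.out.println(isConsistent(settings));
    }
}
```

Parameters without a documented default on this page, such as `PARAM_KEEP_PUNCTUATION` and `PARAM_PTB3_ESCAPING`, are deliberately left out of the map.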
- Constructor Detail
  - StanfordParser
```
public StanfordParser()
```
- Method Detail
  - initialize
```
public void initialize(org.apache.uima.UimaContext context)
                throws org.apache.uima.resource.ResourceInitializationException
```
    Specified by:
    
    initialize in interface org.apache.uima.analysis_component.AnalysisComponent
    
    Overrides:
    
    initialize in class org.apache.uima.fit.component.JCasAnnotator_ImplBase
    
    Throws:
    
    org.apache.uima.resource.ResourceInitializationException
  - process
```
public void process(org.apache.uima.jcas.JCas aJCas)
             throws org.apache.uima.analysis_engine.AnalysisEngineProcessException
```
    Processes the given text using the StanfordParser.
    
    Specified by:
    
    process in class org.apache.uima.analysis_component.JCasAnnotator_ImplBase
    
    Parameters:
    
    aJCas - the JCas to process
    
    Throws:
    
    org.apache.uima.analysis_engine.AnalysisEngineProcessException
    
    See Also:
    
    JCasAnnotator_ImplBase.process(org.apache.uima.jcas.JCas)
  - doCreateDependencyTags
```
protected void doCreateDependencyTags(edu.stanford.nlp.parser.common.ParserGrammar aParser,
                                      StanfordAnnotator sfAnnotator,
                                      edu.stanford.nlp.trees.Tree parseTree,
                                      List<Token> tokens)
```
  - tokenToWord
```
protected edu.stanford.nlp.ling.CoreLabel tokenToWord(Token aToken)
```
