public class PhraseAnnotator
extends org.apache.uima.fit.component.JCasAnnotator_ImplBase
In order to identify longer phrases, run the FrequencyCounter and this annotator multiple times, each time taking the results of the previous run as input. From the second run on, set phrases in the feature path parameter PARAM_FEATURE_PATH.
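The effect of repeated runs can be illustrated with a small, self-contained sketch. It does not use the UIMA or DKPro APIs: the class `MultiPassPhrases`, the helper `mergePass`, and the hard-coded sets of accepted pairs are hypothetical stand-ins for what a counting run plus this annotator might select. The point is only that a second pass over the output of the first can join an already formed phrase with its neighbour.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not part of DKPro Core.
class MultiPassPhrases {

    // One pass: scan left to right and join two adjacent units
    // whenever their space-joined pair is in the accepted set.
    static List<String> mergePass(List<String> units, Set<String> acceptedPairs) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < units.size()) {
            if (i + 1 < units.size()) {
                String pair = units.get(i) + " " + units.get(i + 1);
                if (acceptedPairs.contains(pair)) {
                    out.add(pair);
                    i += 2; // both units are consumed by the new phrase
                    continue;
                }
            }
            out.add(units.get(i));
            i++;
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("new", "york", "city", "is", "large");

        // Assumed outcome of the first run: only "new york" is accepted.
        Set<String> run1 = new HashSet<>(Arrays.asList("new york"));
        // Assumed outcome of the second run, which sees phrases as units.
        Set<String> run2 = new HashSet<>(Arrays.asList("new york city"));

        List<String> pass1 = mergePass(tokens, run1);
        List<String> pass2 = mergePass(pass1, run2);

        System.out.println(pass1); // [new york, city, is, large]
        System.out.println(pass2); // [new york city, is, large]
    }
}
```

In a real pipeline the accepted pairs would come from the model file and the configured threshold rather than from fixed sets.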
Modifier and Type | Field and Description |
---|---|
static String | PARAM_COVERING_TYPE: Set this parameter if bigrams should only be counted when occurring within a covering type, e.g. |
static String | PARAM_DISCOUNT: The discount that prevents too many phrases consisting of very infrequent words from being formed. |
static String | PARAM_FEATURE_PATH: The feature path to use for building bigrams. |
static String | PARAM_FILTER_REGEX |
static String | PARAM_LOWERCASE: If true, lowercase everything. |
static String | PARAM_MODEL_LOCATION: The file providing the unigram and bigram counts to use. |
static String | PARAM_REGEX_REPLACEMENT |
static String | PARAM_STOPWORDS_FILE |
static String | PARAM_STOPWORDS_REPLACEMENT |
static String | PARAM_THRESHOLD: The threshold score for phrase construction. |
Constructor and Description |
---|
PhraseAnnotator() |
Modifier and Type | Method and Description |
---|---|
void | initialize(org.apache.uima.UimaContext context) |
void | process(org.apache.uima.jcas.JCas aJCas) |
Inherited methods: getRequiredCasInterface, process, getCasInstancesRequired, hasNext, next
public static final String PARAM_FEATURE_PATH
public static final String PARAM_LOWERCASE
public static final String PARAM_MODEL_LOCATION
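public static final String PARAM_DISCOUNT
The discount that prevents too many phrases consisting of very infrequent words from being formed (cf. FrequencyCounter.PARAM_MIN_COUNT, which is by default set to 5).
public static final String PARAM_THRESHOLD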
public static final String PARAM_STOPWORDS_FILE
public static final String PARAM_STOPWORDS_REPLACEMENT
public static final String PARAM_FILTER_REGEX
public static final String PARAM_REGEX_REPLACEMENT
public static final String PARAM_COVERING_TYPE
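How PARAM_DISCOUNT and PARAM_THRESHOLD interact can be sketched as follows. The page above does not state the scoring formula, so this example assumes a word2phrase-style score, (count(a b) - discount) / (count(a) * count(b)), and a strict "greater than threshold" test. The class `PhraseScoreSketch`, its methods, and all counts are hypothetical and chosen only for illustration.

```java
// Hypothetical illustration only; the formula is an assumption,
// not taken from the PhraseAnnotator documentation.
class PhraseScoreSketch {

    // Assumed score: discounted bigram count divided by the product
    // of the two unigram counts.
    static double score(long bigramCount, long countA, long countB, double discount) {
        return (bigramCount - discount) / ((double) countA * countB);
    }

    // Assumed decision rule: form a phrase if the score exceeds the threshold.
    static boolean isPhrase(long bigramCount, long countA, long countB,
                            double discount, double threshold) {
        return score(bigramCount, countA, countB, discount) > threshold;
    }

    public static void main(String[] args) {
        double threshold = 0.005; // made-up value

        // A frequent pair: seen 50 times, words seen 60 and 80 times.
        System.out.println(isPhrase(50, 60, 80, 5.0, threshold)); // true

        // A rare pair: both words and the pair seen only 5 times.
        // Without a discount the tiny denominator inflates the score.
        System.out.println(isPhrase(5, 5, 5, 0.0, threshold));    // true
        // With a discount of 5 the numerator drops to zero.
        System.out.println(isPhrase(5, 5, 5, 5.0, threshold));    // false
    }
}
```

Under these assumptions, a discount equal to the minimum count removes pairs that occur only as often as the counting cutoff, while well-attested pairs are barely affected.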
public void initialize(org.apache.uima.UimaContext context) throws org.apache.uima.resource.ResourceInitializationException
Specified by: initialize in interface org.apache.uima.analysis_component.AnalysisComponent
Overrides: initialize in class org.apache.uima.fit.component.JCasAnnotator_ImplBase
Throws: org.apache.uima.resource.ResourceInitializationException
public void process(org.apache.uima.jcas.JCas aJCas) throws org.apache.uima.analysis_engine.AnalysisEngineProcessException
Specified by: process in class org.apache.uima.analysis_component.JCasAnnotator_ImplBase
Throws: org.apache.uima.analysis_engine.AnalysisEngineProcessException
Copyright © 2007–2018 Ubiquitous Knowledge Processing (UKP) Lab, Technische Universität Darmstadt. All rights reserved.