public class TfidfAnnotator
extends org.apache.uima.fit.component.JCasAnnotator_ImplBase
This component adds Tfidf annotations consisting of a term and a tfidf weight. It uses a DfStore, which can be created using the TfidfConsumer.

Modifier and Type | Class and Description |
---|---|
static class | TfidfAnnotator.WeightingModeIdf: Available modes for inverse document frequency |
static class | TfidfAnnotator.WeightingModeTf: Available modes for term frequency |
Modifier and Type | Field and Description |
---|---|
protected String | featurePath |
protected boolean | lowercase |
static String | PARAM_FEATURE_PATH: This annotator is type agnostic, so it is mandatory to specify the type of the working annotation and how to obtain the string representation with the feature path. |
static String | PARAM_IDF_MODE: The model for inverse document frequency weighting. Invoke toString() on an enum of TfidfAnnotator.WeightingModeIdf for setup. |
static String | PARAM_LOWERCASE: If set to true, the whole text is handled in lower case. |
static String | PARAM_TF_MODE: The model for term frequency weighting. Invoke toString() on an enum of TfidfAnnotator.WeightingModeTf for setup. |
static String | PARAM_TFDF_PATH: Provide the path to the Df-Model. |
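
The two mode parameters take the string form of an enum constant. The round trip can be sketched as follows, using a hypothetical stand-in enum `ModeSketch` instead of the real `TfidfAnnotator.WeightingModeTf`. Only the constant `NORMAL` is named on this page; `LOG` is an assumed placeholder for illustration.

```java
// Hypothetical stand-in for TfidfAnnotator.WeightingModeTf. Only NORMAL is
// named on this page; LOG is an assumed placeholder for illustration.
enum ModeSketch { NORMAL, LOG }

class ModeParamSketch {
    // Turn an enum constant into the string a configuration parameter would carry.
    static String toParamValue(ModeSketch mode) {
        return mode.toString();
    }

    // Parse the configured string back into the enum constant.
    // Enum.valueOf throws IllegalArgumentException for unknown names.
    static ModeSketch fromParamValue(String value) {
        return ModeSketch.valueOf(value);
    }

    public static void main(String[] args) {
        String configured = toParamValue(ModeSketch.NORMAL);
        System.out.println(configured + " -> " + fromParamValue(configured));
    }
}
```

Passing the result of toString() rather than a hand-typed literal keeps the configured value in step with the enum if a constant is renamed.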
Constructor and Description |
---|
TfidfAnnotator() |
Modifier and Type | Method and Description |
---|---|
protected FreqDist<String> | getTermFrequencies(org.apache.uima.jcas.JCas jcas) |
void | initialize(org.apache.uima.UimaContext context) |
void | process(org.apache.uima.jcas.JCas jcas) |
Inherited methods: getRequiredCasInterface, process, getCasInstancesRequired, hasNext, next
public static final String PARAM_FEATURE_PATH

protected String featurePath

public static final String PARAM_TFDF_PATH
Provide the path to the Df-Model. If a SharedDfModel is bound to this annotator, this is ignored.

public static final String PARAM_LOWERCASE

protected boolean lowercase

public static final String PARAM_TF_MODE
The model for term frequency weighting. Invoke toString() on an enum of TfidfAnnotator.WeightingModeTf for setup.
Default value is "NORMAL" yielding an unweighted tf.

public static final String PARAM_IDF_MODE
The model for inverse document frequency weighting. Invoke toString() on an enum of TfidfAnnotator.WeightingModeIdf for setup.
Default value is "NORMAL" yielding an unweighted idf.
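
How a tf mode and an idf mode might combine into a single weight can be sketched as below. This is not the annotator's implementation: the mode names other than `NORMAL`, the logarithmic formulas, and the reading of an "unweighted idf" as the plain reciprocal 1/df are all assumptions made for illustration.

```java
class TfidfWeightSketch {
    // Hypothetical mode names. This page only confirms that "NORMAL" exists
    // and leaves tf and idf unweighted; the other constants are assumptions.
    enum TfMode { NORMAL, LOG_PLUS_ONE }
    enum IdfMode { NORMAL, LOG }

    // Weight a raw term count. NORMAL returns the count unchanged.
    static double weightTf(double tf, TfMode mode) {
        switch (mode) {
            case LOG_PLUS_ONE:
                return tf > 0 ? Math.log(tf + 1) : 0.0;
            case NORMAL:
            default:
                return tf;
        }
    }

    // Weight the inverse document frequency. NORMAL is assumed to be 1/df;
    // LOG is assumed to be log(numDocs / df). Unseen terms (df <= 0) get 0.
    static double weightIdf(double df, double numDocs, IdfMode mode) {
        if (df <= 0) {
            return 0.0;
        }
        switch (mode) {
            case LOG:
                return Math.log(numDocs / df);
            case NORMAL:
            default:
                return 1.0 / df;
        }
    }

    // The tfidf weight is the product of the two weighted components.
    static double tfidf(double tf, double df, double numDocs, TfMode tfMode, IdfMode idfMode) {
        return weightTf(tf, tfMode) * weightIdf(df, numDocs, idfMode);
    }

    public static void main(String[] args) {
        // A term seen 3 times in the document and in 2 of 10 documents.
        System.out.println(tfidf(3, 2, 10, TfMode.NORMAL, IdfMode.NORMAL));
        System.out.println(tfidf(3, 2, 10, TfMode.LOG_PLUS_ONE, IdfMode.LOG));
    }
}
```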
public void initialize(org.apache.uima.UimaContext context) throws org.apache.uima.resource.ResourceInitializationException
Specified by: initialize in interface org.apache.uima.analysis_component.AnalysisComponent
Overrides: initialize in class org.apache.uima.fit.component.JCasAnnotator_ImplBase
Throws: org.apache.uima.resource.ResourceInitializationException

public void process(org.apache.uima.jcas.JCas jcas) throws org.apache.uima.analysis_engine.AnalysisEngineProcessException
Specified by: process in class org.apache.uima.analysis_component.JCasAnnotator_ImplBase
Throws: org.apache.uima.analysis_engine.AnalysisEngineProcessException
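
The effect of the lowercase option on term counting can be pictured with a small sketch. It uses a plain `Map<String, Integer>` as a stand-in for `FreqDist<String>` and assumes the term strings have already been extracted via the feature path; the class and method names are hypothetical and do not mirror the annotator's code.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class TermFrequencySketch {
    // Count how often each term string occurs. When lowercase is true,
    // terms are folded to lower case before counting, so "Cat" and "cat"
    // share one entry.
    static Map<String, Integer> countTerms(List<String> terms, boolean lowercase) {
        Map<String, Integer> counts = new HashMap<>();
        for (String term : terms) {
            String key = lowercase ? term.toLowerCase(Locale.ROOT) : term;
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("Cat", "cat", "dog");
        System.out.println(countTerms(terms, true));
        System.out.println(countTerms(terms, false));
    }
}
```

Folding case before counting changes the resulting weights, so the same setting should presumably be used when the document frequency model is built.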
Copyright © 2007–2018 Ubiquitous Knowledge Processing (UKP) Lab, Technische Universität Darmstadt. All rights reserved.