public class Web1TWriter
extends org.apache.uima.fit.component.JCasAnnotator_ImplBase
Modifier and Type | Field and Description |
---|---|
protected String | contextType |
static String | PARAM_CONTEXT_TYPE: The type being used for segments. |
static String | PARAM_CREATE_INDEXES: Create the indexes that jWeb1T needs to operate. |
static String | PARAM_INPUT_TYPES: Types to generate n-grams from. |
static String | PARAM_LOWERCASE: Create a lowercase index. |
static String | PARAM_MAX_NGRAM_LENGTH: Maximum n-gram length. |
static String | PARAM_MIN_FREQUENCY: Specifies the minimum frequency an n-gram must have to be written to the final index. |
static String | PARAM_MIN_NGRAM_LENGTH: Minimum n-gram length. |
static String | PARAM_SPLIT_TRESHOLD: The input files are split into smaller files for quick access. |
static String | PARAM_TARGET_ENCODING: Character encoding of the output data. |
static String | PARAM_TARGET_LOCATION: Location to which the output is written. |
Constructor and Description |
---|
Web1TWriter() |
Modifier and Type | Method and Description |
---|---|
void | collectionProcessComplete(): The input files for each n-gram level are read and split according to the frequency of the words' starting letters in the files; the split files are then individually sorted and consolidated. |
void | initialize(org.apache.uima.UimaContext context) |
void | process(org.apache.uima.jcas.JCas jcas) |
getRequiredCasInterface, process
getCasInstancesRequired, hasNext, next
public static final String PARAM_INPUT_TYPES
Example: Token.class.getName() + "/pos/PosValue" for part-of-speech n-grams
public static final String PARAM_TARGET_LOCATION
public static final String PARAM_TARGET_ENCODING
public static final String PARAM_MIN_NGRAM_LENGTH
1
public static final String PARAM_MAX_NGRAM_LENGTH
3
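The two length parameters bound the n-gram sizes that are generated. The following is a minimal sketch of that idea using only the Java standard library. The class `NGramRangeSketch` and its `ngrams` method are hypothetical names made up for illustration, not part of Web1TWriter, and the sketch assumes tokens are joined with a single space.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Web1TWriter.
final class NGramRangeSketch {

    /** Returns every n-gram with minN <= n <= maxN, tokens joined by a single space. */
    static List<String> ngrams(List<String> tokens, int minN, int maxN) {
        List<String> result = new ArrayList<>();
        for (int n = minN; n <= maxN; n++) {
            // Slide a window of n tokens across the sequence.
            for (int start = 0; start + n <= tokens.size(); start++) {
                result.add(String.join(" ", tokens.subList(start, start + n)));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [a, b, c, a b, b c, a b c]
        System.out.println(ngrams(Arrays.asList("a", "b", "c"), 1, 3));
    }
}
```

With a minimum of 1 and a maximum of 3, a three-token input yields three unigrams, two bigrams, and one trigram.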
public static final String PARAM_LOWERCASE
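One plausible reading of a lowercase index is that n-grams differing only in case share a single entry. The sketch below illustrates that reading with plain Java collections. It is an assumption about the behavior, not a description of the component's code, and `LowercaseIndexSketch` is a hypothetical name.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Web1TWriter.
final class LowercaseIndexSketch {

    /** Lowercases every n-gram and sums the counts of n-grams that collapse to the same key. */
    static Map<String, Long> lowercase(Map<String, Long> counts) {
        Map<String, Long> merged = new TreeMap<>();
        for (Map.Entry<String, Long> entry : counts.entrySet()) {
            // Locale.ROOT keeps the result independent of the default locale.
            String key = entry.getKey().toLowerCase(Locale.ROOT);
            merged.merge(key, entry.getValue(), Long::sum);
        }
        return merged;
    }
}
```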
public static final String PARAM_CREATE_INDEXES
public static final String PARAM_MIN_FREQUENCY
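A minimum-frequency cutoff can be pictured as a filter over n-gram counts before they reach the final index. The sketch below assumes the bound is inclusive (an n-gram with exactly the minimum frequency is kept). `MinFrequencySketch` is a hypothetical name and the code is illustrative only.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Web1TWriter.
final class MinFrequencySketch {

    /** Keeps only the n-grams whose count is at least minFrequency (inclusive bound assumed). */
    static Map<String, Long> filter(Map<String, Long> counts, long minFrequency) {
        Map<String, Long> kept = new TreeMap<>();
        for (Map.Entry<String, Long> entry : counts.entrySet()) {
            if (entry.getValue() >= minFrequency) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }
}
```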
public static final String PARAM_SPLIT_TRESHOLD
public static final String PARAM_CONTEXT_TYPE
protected String contextType
public void initialize(org.apache.uima.UimaContext context) throws org.apache.uima.resource.ResourceInitializationException
Specified by: initialize in interface org.apache.uima.analysis_component.AnalysisComponent
Overrides: initialize in class org.apache.uima.fit.component.JCasAnnotator_ImplBase
Throws: org.apache.uima.resource.ResourceInitializationException
public void process(org.apache.uima.jcas.JCas jcas) throws org.apache.uima.analysis_engine.AnalysisEngineProcessException
Specified by: process in class org.apache.uima.analysis_component.JCasAnnotator_ImplBase
Throws: org.apache.uima.analysis_engine.AnalysisEngineProcessException
public void collectionProcessComplete() throws org.apache.uima.analysis_engine.AnalysisEngineProcessException
Specified by: collectionProcessComplete in interface org.apache.uima.analysis_component.AnalysisComponent
Overrides: collectionProcessComplete in class org.apache.uima.analysis_component.AnalysisComponent_ImplBase
Throws: org.apache.uima.analysis_engine.AnalysisEngineProcessException
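The split-and-sort step performed on completion can be sketched in memory as follows. This is a simplified illustration, not the component's code. It assumes the split threshold is a percentage of entries sharing the same starting character, and that entries below the threshold go to a shared bucket. The bucket name `misc` and the class `SplitAndSortSketch` are hypothetical. The final consolidation into index files is not modeled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Web1TWriter.
final class SplitAndSortSketch {

    static final String MISC = "misc"; // assumed name for the catch-all bucket

    /** Groups n-grams by starting character when that character is frequent enough, then sorts each group. */
    static Map<String, List<String>> splitAndSort(List<String> ngrams, double thresholdPercent) {
        // Pass 1: count how often each starting character occurs.
        Map<Character, Integer> letterCounts = new HashMap<>();
        int total = 0;
        for (String ngram : ngrams) {
            if (ngram.isEmpty()) continue;
            letterCounts.merge(ngram.charAt(0), 1, Integer::sum);
            total++;
        }
        // Pass 2: route each n-gram to its own bucket or to the catch-all bucket.
        Map<String, List<String>> buckets = new TreeMap<>();
        for (String ngram : ngrams) {
            if (ngram.isEmpty()) continue;
            char first = ngram.charAt(0);
            double share = 100.0 * letterCounts.get(first) / total;
            String key = share >= thresholdPercent ? String.valueOf(first) : MISC;
            buckets.computeIfAbsent(key, k -> new ArrayList<>()).add(ngram);
        }
        // Each bucket is sorted on its own, mirroring the per-file sort.
        for (List<String> bucket : buckets.values()) {
            Collections.sort(bucket);
        }
        return buckets;
    }
}
```

Splitting before sorting keeps each sort small, and a lookup only has to open the bucket that matches the query's starting character.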
Copyright © 2007–2018 Ubiquitous Knowledge Processing (UKP) Lab, Technische Universität Darmstadt. All rights reserved.