public class TokenizedTextWriter extends JCasFileWriter_ImplBase
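Writes the tokens of each document to a text file, with each unit of the covering type (by default, a sentence) on its own line. Annotations other than token texts (e.g. lemmas) can be selected with PARAM_FEATURE_PATH.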
Nested classes inherited from class JCasFileWriter_ImplBase: JCasFileWriter_ImplBase.NamedOutputStream
Modifier and Type | Field and Description |
---|---|
static String | PARAM_COVERING_TYPE: In the output file, each unit of the covering type is written into a separate line. |
static String | PARAM_EXTENSION: Set the output file extension. |
static String | PARAM_FEATURE_PATH: The feature path, e.g. de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token/lemma/value for lemmas. |
static String | PARAM_NUMBER_REGEX |
static String | PARAM_STOPWORDS_FILE: All the tokens listed in this file (one token per line) are replaced by STOP. |
static String | PARAM_TARGET_ENCODING: Encoding for the target file. |
Fields inherited from class JCasFileWriter_ImplBase: JAR_PREFIX, PARAM_COMPRESSION, PARAM_ESCAPE_DOCUMENT_ID, PARAM_OVERWRITE, PARAM_SINGULAR_TARGET, PARAM_STRIP_EXTENSION, PARAM_TARGET_LOCATION, PARAM_USE_DOCUMENT_ID
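The example value for PARAM_FEATURE_PATH, de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token/lemma/value, can be read as a type name followed by a chain of feature names separated by slashes. The following is a minimal sketch of that reading only. The class FeaturePathSketch and its methods are hypothetical names introduced for illustration, and the way the writer itself resolves feature paths is not shown on this page.

```java
import java.util.Arrays;
import java.util.List;

public class FeaturePathSketch {

    /** Returns the type name, i.e. everything before the first '/'. */
    public static String typeName(String featurePath) {
        int slash = featurePath.indexOf('/');
        return slash < 0 ? featurePath : featurePath.substring(0, slash);
    }

    /** Returns the feature names that follow the type name, in order; empty if there are none. */
    public static List<String> features(String featurePath) {
        int slash = featurePath.indexOf('/');
        if (slash < 0) {
            return List.of();
        }
        return Arrays.asList(featurePath.substring(slash + 1).split("/"));
    }

    public static void main(String[] args) {
        String lemmaPath = "de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token/lemma/value";
        System.out.println(typeName(lemmaPath));
        System.out.println(features(lemmaPath));
    }
}
```

Under this reading, the default value (the bare Token type name) has no feature chain, which matches the note that it selects the token texts.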
Constructor and Description |
---|
TokenizedTextWriter() |
Modifier and Type | Method and Description |
---|---|
void | collectionProcessComplete() |
void | initialize(org.apache.uima.UimaContext context) |
void | process(org.apache.uima.jcas.JCas aJCas) |
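Methods inherited from class JCasFileWriter_ImplBase: getCompressionMethod, getOutputStream, getOutputStream, getRelativePath, getTargetLocation, isStripExtension, isUseDocumentId
Methods inherited from class org.apache.uima.analysis_component.JCasAnnotator_ImplBase: getRequiredCasInterface, process
Methods inherited from class org.apache.uima.analysis_component.Annotator_ImplBase: getCasInstancesRequired, hasNext, next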
public static final String PARAM_TARGET_ENCODING
Encoding for the target file.
public static final String PARAM_FEATURE_PATH
The feature path, e.g. de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token/lemma/value for lemmas. Default: de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token (i.e. token texts).
public static final String PARAM_NUMBER_REGEX
public static final String PARAM_STOPWORDS_FILE
All the tokens listed in this file (one token per line) are replaced by STOP. Empty lines and lines starting with # are ignored. Casing is ignored.
public static final String PARAM_EXTENSION
Set the output file extension. Default: .txt.
public static final String PARAM_COVERING_TYPE
In the output file, each unit of the covering type is written into a separate line. The default (set in DEFAULT_COVERING_TYPE) is sentences, so that each sentence is written to a line. If no line breaks within a document are desired, set this value to null.
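The stopword and covering-type behavior described for these parameters can be sketched in plain Java without any UIMA types. This is an illustration under stated assumptions, not the writer's actual implementation. The class StopwordLineSketch and its methods are hypothetical, sentences are modeled as lists of token strings, tokens are assumed to be joined by a single space, and stopword lines are trimmed before the checks for empty lines and the # prefix.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopwordLineSketch {

    /** Placeholder written in place of a stopword, as named in the parameter description. */
    public static final String STOP = "STOP";

    /**
     * Parses the lines of a stopword file (one token per line).
     * Empty lines and lines starting with '#' are skipped.
     * Entries are lower-cased so that later lookups can ignore casing.
     */
    public static Set<String> parseStopwords(List<String> fileLines) {
        Set<String> stopwords = new HashSet<>();
        for (String line : fileLines) {
            String entry = line.trim(); // assumption: surrounding whitespace is not significant
            if (entry.isEmpty() || entry.startsWith("#")) {
                continue;
            }
            stopwords.add(entry.toLowerCase(Locale.ROOT));
        }
        return stopwords;
    }

    /**
     * Produces one output line per covering unit (here: one sentence per line),
     * replacing every token found in the stopword set with STOP.
     */
    public static List<String> render(List<List<String>> sentences, Set<String> stopwords) {
        List<String> lines = new ArrayList<>();
        for (List<String> sentence : sentences) {
            List<String> out = new ArrayList<>();
            for (String token : sentence) {
                boolean isStopword = stopwords.contains(token.toLowerCase(Locale.ROOT));
                out.add(isStopword ? STOP : token);
            }
            lines.add(String.join(" ", out)); // assumption: single-space token separator
        }
        return lines;
    }

    public static void main(String[] args) {
        Set<String> stopwords = parseStopwords(List.of("# articles", "", "the", "A"));
        List<List<String>> sentences = List.of(
                List.of("The", "cat", "sleeps", "."),
                List.of("A", "dog", "barks", "."));
        render(sentences, stopwords).forEach(System.out::println);
    }
}
```

With the sample input in main, the sketch prints `STOP cat sleeps .` and `STOP dog barks .` on separate lines. Setting the covering type to null would correspond to passing the whole document as a single unit, so that no line breaks are emitted within it.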
public void initialize(org.apache.uima.UimaContext context) throws org.apache.uima.resource.ResourceInitializationException
Specified by: initialize in interface org.apache.uima.analysis_component.AnalysisComponent
Overrides: initialize in class org.apache.uima.fit.component.JCasConsumer_ImplBase
Throws: org.apache.uima.resource.ResourceInitializationException
public void process(org.apache.uima.jcas.JCas aJCas) throws org.apache.uima.analysis_engine.AnalysisEngineProcessException
Specified by: process in class org.apache.uima.analysis_component.JCasAnnotator_ImplBase
Throws: org.apache.uima.analysis_engine.AnalysisEngineProcessException
public void collectionProcessComplete() throws org.apache.uima.analysis_engine.AnalysisEngineProcessException
Specified by: collectionProcessComplete in interface org.apache.uima.analysis_component.AnalysisComponent
Overrides: collectionProcessComplete in class JCasFileWriter_ImplBase
Throws: org.apache.uima.analysis_engine.AnalysisEngineProcessException
Copyright © 2007–2018 Ubiquitous Knowledge Processing (UKP) Lab, Technische Universität Darmstadt. All rights reserved.