public class ImsCwbWriter
extends org.apache.uima.fit.component.JCasAnnotator_ImplBase
Set PARAM_CQP_HOME to directly create output in the native binary CQP format via the original CWB command line tools.
Modifier and Type | Field and Description |
---|---|
static String |
ATTR_BEGIN |
static String |
ATTR_CPOS |
static String |
ATTR_END |
static String |
ATTR_ID |
static String |
ATTR_LEMMA |
static String |
ATTR_POS |
static String |
ATTR_URI |
static String |
E_DOCUMENT |
static String |
E_SENTENCE |
static String |
E_TEXT |
static String |
PARAM_ADDITIONAL_FEATURES
Write additional token-level annotation features.
|
static String |
PARAM_CORPUS_NAME
The name of the generated corpus.
|
static String |
PARAM_CQP_COMPRESS
Set this parameter to compress the token streams and the indexes using cwb-huffcode and
cwb-compress-rdx.
|
static String |
PARAM_CQP_HOME
Set this parameter to the directory containing the cwb-encode and cwb-makeall commands if you
want the writer to encode directly into the CQP binary format.
|
static String |
PARAM_CQPWEB_COMPATIBILITY
Make document IDs compatible with CQPweb.
|
static String |
PARAM_SENTENCE_TAG |
static String |
PARAM_TARGET_ENCODING
Character encoding of the output data.
|
static String |
PARAM_TARGET_LOCATION
Location to which the output is written.
|
static String |
PARAM_WRITE_CPOS
Write coarse-grained part-of-speech tags.
|
static String |
PARAM_WRITE_DOC_ID
Write the document ID for each token.
|
static String |
PARAM_WRITE_DOCUMENT_TAG
Write a pseudo-XML tag with the name document to mark the start and end of a document.
|
static String |
PARAM_WRITE_LEMMA
Write lemmata.
|
static String |
PARAM_WRITE_OFFSETS
Write the start and end position of each token.
|
static String |
PARAM_WRITE_POS
Write part-of-speech tags.
|
static String |
PARAM_WRITE_TEXT_TAG
Write a pseudo-XML tag with the name text to mark the start and end of a document.
|
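The PARAM_WRITE_* switches above control which annotations end up in the output. As an illustration only, the sketch below renders one document in a verticalized layout: one token per line, optional annotations as further tab-separated columns, and pseudo-XML tags on lines of their own. It is not the writer's implementation. The class VerticalFormatSketch, its Token type, the sentence tag name s, and the id attribute on the text tag are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

final class VerticalFormatSketch {
    /** Hypothetical token holder; the real writer reads tokens from the CAS. */
    public static final class Token {
        final String form;
        final String pos;
        final String lemma;

        public Token(String form, String pos, String lemma) {
            this.form = form;
            this.pos = pos;
            this.lemma = lemma;
        }
    }

    /**
     * Renders one document: an optional text tag around the document,
     * one sentence tag pair per sentence, and one line per token.
     * The document ID is inserted as-is (no XML escaping in this sketch).
     */
    public static String render(String docId, List<List<Token>> sentences,
            boolean writeTextTag, boolean writePos, boolean writeLemma) {
        StringBuilder out = new StringBuilder();
        if (writeTextTag) {
            out.append("<text id=\"").append(docId).append("\">\n");
        }
        for (List<Token> sentence : sentences) {
            out.append("<s>\n");
            for (Token token : sentence) {
                // The word form is always the first column.
                List<String> columns = new ArrayList<>();
                columns.add(token.form);
                if (writePos) {
                    columns.add(token.pos);
                }
                if (writeLemma) {
                    columns.add(token.lemma);
                }
                out.append(String.join("\t", columns)).append('\n');
            }
            out.append("</s>\n");
        }
        if (writeTextTag) {
            out.append("</text>\n");
        }
        return out.toString();
    }
}
```

Calling render with writePos enabled and writeLemma disabled yields two columns per token line; disabling all switches leaves only the word forms between the sentence tags.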
Constructor and Description |
---|
ImsCwbWriter() |
Modifier and Type | Method and Description |
---|---|
void |
collectionProcessComplete() |
String |
getCoveredAnnotationFeatureValue(String aFeaturePath,
org.apache.uima.cas.text.AnnotationFS aCoveringAnnotation)
Get the feature value of an annotation which is covered by another annotation.
|
void |
initialize(org.apache.uima.UimaContext context) |
void |
process(org.apache.uima.jcas.JCas jcas) |
getRequiredCasInterface, process
getCasInstancesRequired, hasNext, next
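getCoveredAnnotationFeatureValue expects a feature path of the form your.package.and.annotation.class.name/featureName. The sketch below shows one way such a path could be split into its type name and feature part before lookup. The class FeaturePathSketch and its split method are hypothetical helpers for illustration and are not part of ImsCwbWriter.

```java
final class FeaturePathSketch {
    /**
     * Splits "fully.qualified.TypeName/featureName" at the first slash.
     * Returns a two-element array: {typeName, featurePart}.
     * Anything after the first slash is kept together as the feature part.
     */
    public static String[] split(String featurePath) {
        int slash = featurePath.indexOf('/');
        // Reject paths with no slash, an empty type name, or an empty feature part.
        if (slash <= 0 || slash == featurePath.length() - 1) {
            throw new IllegalArgumentException(
                    "Expected <type>/<feature> but got: " + featurePath);
        }
        return new String[] {
            featurePath.substring(0, slash),
            featurePath.substring(slash + 1)
        };
    }
}
```

Validating the path up front gives a clear error message for a malformed value instead of a failure later during processing.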
public static final String E_SENTENCE
public static final String E_TEXT
public static final String E_DOCUMENT
public static final String ATTR_BEGIN
public static final String ATTR_END
public static final String ATTR_POS
public static final String ATTR_CPOS
public static final String ATTR_LEMMA
public static final String ATTR_ID
public static final String ATTR_URI
public static final String PARAM_TARGET_LOCATION
public static final String PARAM_TARGET_ENCODING
public static final String PARAM_WRITE_DOC_ID
Write the document ID for each token. Alternatively, generate a document tag or a text tag,
which also contain the document ID that can be queried in CQP.
public static final String PARAM_WRITE_POS
public static final String PARAM_WRITE_CPOS
public static final String PARAM_WRITE_LEMMA
public static final String PARAM_WRITE_DOCUMENT_TAG
Write a pseudo-XML tag with the name document to mark the start and end of a document.
public static final String PARAM_WRITE_TEXT_TAG
Write a pseudo-XML tag with the name text to mark the start and end of a document. This is used by CQPweb.
public static final String PARAM_WRITE_OFFSETS
public static final String PARAM_ADDITIONAL_FEATURES
public static final String PARAM_CQPWEB_COMPATIBILITY
public static final String PARAM_CQP_HOME
public static final String PARAM_CQP_COMPRESS
public static final String PARAM_CORPUS_NAME
public static final String PARAM_SENTENCE_TAG
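PARAM_CQP_HOME and PARAM_CQP_COMPRESS refer to the CWB tools cwb-encode, cwb-makeall, cwb-huffcode and cwb-compress-rdx. For readers who convert the text output by hand, the sketch below assembles argument lists for these tools. It is an assumption-laden illustration, not a description of how the writer calls them: the class CwbCommandSketch, its method names, and the choice of options are made up for this example, and the exact options should be checked against the documentation of the installed CWB version.

```java
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CwbCommandSketch {
    /** Resolves a tool name against the directory holding the CWB binaries. */
    private static String tool(String cqpHome, String name) {
        return Paths.get(cqpHome, name).toString();
    }

    /** cwb-encode: reads the verticalized text and writes the binary data files. */
    public static List<String> encode(String cqpHome, String dataDir, String registryFile,
            String inputFile, String charset, List<String> positionalAttrs,
            List<String> structuralAttrs) {
        List<String> cmd = new ArrayList<>();
        cmd.add(tool(cqpHome, "cwb-encode"));
        cmd.addAll(Arrays.asList("-d", dataDir, "-f", inputFile,
                "-R", registryFile, "-c", charset));
        for (String attr : positionalAttrs) {
            cmd.add("-P"); // token-level column, e.g. pos or lemma
            cmd.add(attr);
        }
        for (String attr : structuralAttrs) {
            cmd.add("-S"); // pseudo-XML region, e.g. s or text
            cmd.add(attr);
        }
        return cmd;
    }

    /** cwb-makeall: builds the indexes for an encoded corpus. */
    public static List<String> makeall(String cqpHome, String registryDir, String corpus) {
        return Arrays.asList(tool(cqpHome, "cwb-makeall"), "-r", registryDir, "-V", corpus);
    }

    /** cwb-huffcode and cwb-compress-rdx: optional compression of all attributes. */
    public static List<List<String>> compress(String cqpHome, String registryDir,
            String corpus) {
        List<List<String>> cmds = new ArrayList<>();
        cmds.add(Arrays.asList(tool(cqpHome, "cwb-huffcode"), "-r", registryDir, "-A", corpus));
        cmds.add(Arrays.asList(tool(cqpHome, "cwb-compress-rdx"), "-r", registryDir, "-A", corpus));
        return cmds;
    }
}
```

Each returned list can be passed to new ProcessBuilder(cmd).inheritIO().start(), running the commands in the order encode, makeall, then the two compression steps.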
public void initialize(org.apache.uima.UimaContext context) throws org.apache.uima.resource.ResourceInitializationException
Specified by: initialize in interface org.apache.uima.analysis_component.AnalysisComponent
Overrides: initialize in class org.apache.uima.fit.component.JCasAnnotator_ImplBase
Throws: org.apache.uima.resource.ResourceInitializationException
public void process(org.apache.uima.jcas.JCas jcas) throws org.apache.uima.analysis_engine.AnalysisEngineProcessException
Specified by: process in class org.apache.uima.analysis_component.JCasAnnotator_ImplBase
Throws: org.apache.uima.analysis_engine.AnalysisEngineProcessException
public void collectionProcessComplete() throws org.apache.uima.analysis_engine.AnalysisEngineProcessException
Specified by: collectionProcessComplete in interface org.apache.uima.analysis_component.AnalysisComponent
Overrides: collectionProcessComplete in class org.apache.uima.analysis_component.AnalysisComponent_ImplBase
Throws: org.apache.uima.analysis_engine.AnalysisEngineProcessException
public String getCoveredAnnotationFeatureValue(String aFeaturePath, org.apache.uima.cas.text.AnnotationFS aCoveringAnnotation)
Parameters:
aFeaturePath - The fully qualified feature path of the feature in question: your.package.and.annotation.class.name/featureName
aCoveringAnnotation - The annotation that covers the annotation for which the feature value should be extracted.
Copyright © 2007–2018 Ubiquitous Knowledge Processing (UKP) Lab, Technische Universität Darmstadt. All rights reserved.