public class ImsCwbReader extends ResourceCollectionReaderBase
ResourceCollectionReaderBase.Resource
Modifier and Type | Field and Description |
---|---|
protected String | mappingPosLocation |
static String | PARAM_GENERATE_NEW_IDS: If true, the unit IDs are used only to detect whether a new document (CAS) needs to be created; a new ID is generated for the document ID itself. |
static String | PARAM_ID_IS_URL: If true, the unit text ID encoded in the corpus file is stored as the URI in the document metadata. |
static String | PARAM_POS_MAPPING_LOCATION: Location of the mapping file for part-of-speech tags to UIMA types. |
static String | PARAM_POS_TAG_SET: Specify which tag set should be used to locate the mapping file. |
static String | PARAM_READ_LEMMA: Read lemmas. |
static String | PARAM_READ_POS: Read part-of-speech tags and generate POS annotations or subclasses if a tag set or mapping file is used. |
static String | PARAM_READ_SENTENCES: Read sentences. |
static String | PARAM_READ_TOKEN: Read tokens and generate Token annotations. |
static String | PARAM_REPLACE_NON_XML: Replace non-XML characters with spaces. |
static String | PARAM_SOURCE_ENCODING: Character encoding of the input data. |
protected String | posTagset |
EXCLUDE_PREFIX, INCLUDE_PREFIX, JAR_PREFIX, KEY_RESOURCE_RESOLVER, PARAM_INCLUDE_HIDDEN, PARAM_LANGUAGE, PARAM_LOG_FREQ, PARAM_PATH, PARAM_PATTERNS, PARAM_SOURCE_LOCATION, PARAM_USE_DEFAULT_EXCLUDES
Constructor and Description |
---|
ImsCwbReader() |
Modifier and Type | Method and Description |
---|---|
void | getNext(org.apache.uima.cas.CAS aCAS) |
org.apache.uima.util.Progress[] | getProgress() |
boolean | hasNext() |
void | initialize(org.apache.uima.UimaContext aContext) |
getBase, getBase, getDefaultExcludes, getLanguage, getResolver, getResourceIterator, getResources, getSourceLocation, initCas, initCas, isSingleLocation, locationToUrl, nextFile, scan
close, getLogger, initialize
destroy, getCasInitializer, getProcessingResourceMetaData, initialize, isConsuming, reconfigure, setCasInitializer, typeSystemInit
getConfigParameterValue, getConfigParameterValue, setConfigParameterValue, setConfigParameterValue
getCasManager, getMetaData, getRelativePathResolver, getResourceManager, getUimaContext, getUimaContextAdmin, setLogger, setMetaData
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
public static final String PARAM_SOURCE_ENCODING
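As an illustration of what a source encoding setting controls, the following minimal sketch decodes raw bytes with a named character encoding using only the JDK. The class and method names (`SourceEncodingSketch`, `decode`) are hypothetical and are not part of ImsCwbReader; how the reader itself applies the parameter is not shown on this page.

```java
import java.nio.charset.Charset;

final class SourceEncodingSketch {
    // Decodes raw corpus bytes using the named character encoding.
    // Hypothetical helper; ImsCwbReader's internal decoding may differ.
    static String decode(byte[] raw, String encodingName) {
        return new String(raw, Charset.forName(encodingName));
    }

    public static void main(String[] args) {
        // The byte 0xE4 is a single character in ISO-8859-1 but is not valid UTF-8 on its own.
        byte[] latin1 = { 'B', (byte) 0xE4, 'r' };
        System.out.println(decode(latin1, "ISO-8859-1"));
    }
}
```

Choosing the wrong encoding name typically does not fail; it silently yields replacement characters, which is why the value should match the corpus file.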
public static final String PARAM_POS_MAPPING_LOCATION
protected String mappingPosLocation
public static final String PARAM_POS_TAG_SET
protected String posTagset
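The idea of mapping part-of-speech tags to types can be sketched with a plain lookup table. The example below assumes a properties-style mapping (`tag=Type`) with a `*` entry as the fallback; that layout, the type names `NOUN`, `VERB` and `POS`, and the class `PosMappingSketch` are assumptions made for illustration, not the documented format of the mapping file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class PosMappingSketch {
    // Loads tag-to-type pairs from properties-style text (assumed layout).
    static Properties load(String mappingText) {
        Properties mapping = new Properties();
        try {
            mapping.load(new StringReader(mappingText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return mapping;
    }

    // Returns the type for a tag, falling back to the "*" entry (or null if there is none).
    static String typeFor(Properties mapping, String tag) {
        String type = mapping.getProperty(tag);
        return type != null ? type : mapping.getProperty("*");
    }

    public static void main(String[] args) {
        Properties mapping = load("NN=NOUN\nVB=VERB\n*=POS\n");
        System.out.println(typeFor(mapping, "NN"));   // known tag
        System.out.println(typeFor(mapping, "XYZ"));  // unknown tag uses the fallback
    }
}
```

A fallback entry keeps unknown tags from being dropped: they still receive the generic type rather than no annotation at all.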
public static final String PARAM_READ_TOKEN
Read tokens and generate Token annotations.
Default: true
public static final String PARAM_READ_POS
Read part-of-speech tags and generate POS annotations or subclasses if a tag set or mapping file is used.
Default: true
public static final String PARAM_READ_SENTENCES
Read sentences.
Default: true
public static final String PARAM_READ_LEMMA
Read lemmas.
Default: true
public static final String PARAM_GENERATE_NEW_IDS
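The behavior described for PARAM_GENERATE_NEW_IDS (unit IDs only mark document boundaries, while the document ID itself is newly generated) can be sketched as follows. This is an illustration under assumptions: the class `DocumentIdSketch`, the method `documentIds`, and the `doc-N` format of generated IDs are hypothetical, and the format ImsCwbReader actually generates is not stated on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DocumentIdSketch {
    // Returns one document ID per run of identical unit IDs.
    // With generateNewIds, the unit ID only signals the boundary and a
    // sequential ID ("doc-0", "doc-1", ...; hypothetical format) is used instead.
    static List<String> documentIds(List<String> unitIds, boolean generateNewIds) {
        List<String> ids = new ArrayList<>();
        String previous = null;
        int counter = 0;
        for (String unitId : unitIds) {
            if (!unitId.equals(previous)) {
                // The unit ID changed, so a new document starts here.
                ids.add(generateNewIds ? "doc-" + counter : unitId);
                counter++;
                previous = unitId;
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        List<String> units = Arrays.asList("a", "a", "b", "a");
        System.out.println(documentIds(units, false));
        System.out.println(documentIds(units, true));
    }
}
```

Generating fresh IDs is useful when the same unit ID recurs in non-adjacent parts of a corpus, since reusing it would give two documents the same ID.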
public static final String PARAM_ID_IS_URL
PARAM_GENERATE_NEW_IDS
(Default: false)
public static final String PARAM_REPLACE_NON_XML
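To make "non-XML characters" concrete, the sketch below replaces every code point outside the XML 1.0 `Char` production with a space. It is a minimal illustration, assuming XML 1.0 ranges and a one-for-one replacement; the class `NonXmlCharSketch` and its methods are hypothetical, and ImsCwbReader's own replacement logic may differ.

```java
final class NonXmlCharSketch {
    // True if the code point is allowed by the XML 1.0 "Char" production.
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Replaces each code point that is not a valid XML 1.0 character with one space.
    static String replaceNonXml(String text) {
        StringBuilder result = new StringBuilder(text.length());
        text.codePoints().forEach(cp -> {
            if (isXmlChar(cp)) {
                result.appendCodePoint(cp);
            } else {
                result.append(' ');
            }
        });
        return result.toString();
    }

    public static void main(String[] args) {
        // The control character U+0001 is not allowed in XML 1.0.
        System.out.println(replaceNonXml("to\u0001ken"));
    }
}
```

Replacing rather than deleting keeps character offsets stable for characters in the Basic Multilingual Plane, which matters when annotations refer to positions in the text.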
public void initialize(org.apache.uima.UimaContext aContext) throws org.apache.uima.resource.ResourceInitializationException
Overrides: initialize in class ResourceCollectionReaderBase
Throws: org.apache.uima.resource.ResourceInitializationException
public boolean hasNext() throws IOException, org.apache.uima.collection.CollectionException
Specified by: hasNext in interface org.apache.uima.collection.base_cpm.BaseCollectionReader
Overrides: hasNext in class ResourceCollectionReaderBase
Throws: IOException, org.apache.uima.collection.CollectionException
public void getNext(org.apache.uima.cas.CAS aCAS) throws IOException, org.apache.uima.collection.CollectionException
Throws: IOException, org.apache.uima.collection.CollectionException
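The effect of the token, POS, lemma and sentence switches can be pictured with a small parser over a one-token-per-line corpus. The sketch assumes a layout with tab-separated columns (token, POS, lemma) and `<s>` / `</s>` lines marking sentences; that layout, the class `VerticalFormatSketch`, and its `parse` method are assumptions for illustration and do not reproduce what getNext does internally.

```java
import java.util.ArrayList;
import java.util.List;

final class VerticalFormatSketch {
    // One token row; pos and lemma stay null when they are not read.
    static final class Tok {
        final String form;
        final String pos;
        final String lemma;

        Tok(String form, String pos, String lemma) {
            this.form = form;
            this.pos = pos;
            this.lemma = lemma;
        }
    }

    // Splits the text into sentences of tokens. Tokens outside <s>...</s> are ignored.
    static List<List<Tok>> parse(String text, boolean readPos, boolean readLemma) {
        List<List<Tok>> sentences = new ArrayList<>();
        List<Tok> current = null;
        for (String line : text.split("\n")) {
            if (line.equals("<s>")) {
                current = new ArrayList<>();
            } else if (line.equals("</s>")) {
                if (current != null) {
                    sentences.add(current);
                    current = null;
                }
            } else if (line.isEmpty() || line.startsWith("<")) {
                // Other structural markup (for example a text element) is skipped here.
            } else if (current != null) {
                String[] cols = line.split("\t");
                String pos = readPos && cols.length > 1 ? cols[1] : null;
                String lemma = readLemma && cols.length > 2 ? cols[2] : null;
                current.add(new Tok(cols[0], pos, lemma));
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        String corpus = "<text id=\"d1\">\n<s>\nThe\tDT\tthe\ndogs\tNNS\tdog\n</s>\n</text>";
        for (List<Tok> sentence : parse(corpus, true, true)) {
            for (Tok t : sentence) {
                System.out.println(t.form + " " + t.pos + " " + t.lemma);
            }
        }
    }
}
```

Switching a flag off leaves the corresponding field empty instead of changing the token boundaries, so downstream components can add their own POS tags or lemmas.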
public org.apache.uima.util.Progress[] getProgress()
Specified by: getProgress in interface org.apache.uima.collection.base_cpm.BaseCollectionReader
Overrides: getProgress in class ResourceCollectionReaderBase
Copyright © 2007–2018 Ubiquitous Knowledge Processing (UKP) Lab, Technische Universität Darmstadt. All rights reserved.