public class Conll2002Reader extends JCasResourceCollectionReader_ImplBase
Reads the CoNLL 2002 named entity format. The columns are separated by a single space, as illustrated below.
Wolff B-PER
, O
currently O
a O
journalist O
in O
Argentina B-LOC
, O
played O
with O
Del B-PER
Bosque I-PER
in O
the O
final O
years O
of O
the O
seventies O
in O
Real B-ORG
Madrid I-ORG
. O
Sentences are separated by a blank line.
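The layout described above (one token per line, two space-separated columns, a blank line between sentences) can be sketched with a small standalone parser. This is only an illustration of the file format, not the reader's actual implementation; the class `Conll2002FormatSketch` and its `parse` method are hypothetical names that do not exist in DKPro Core.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper illustrating the file layout; not part of DKPro Core. */
final class Conll2002FormatSketch {

    /**
     * Splits the text into sentences. Each token is returned as a
     * two-element array: {form, namedEntityTag}.
     */
    static List<List<String[]>> parse(String text) {
        List<List<String[]>> sentences = new ArrayList<>();
        List<String[]> current = new ArrayList<>();
        for (String line : text.split("\\R", -1)) {
            if (line.trim().isEmpty()) {
                // A blank line closes the current sentence (if any).
                if (!current.isEmpty()) {
                    sentences.add(current);
                    current = new ArrayList<>();
                }
                continue;
            }
            // Columns are separated by a single space.
            String[] columns = line.trim().split(" ");
            if (columns.length != 2) {
                throw new IllegalArgumentException("Expected 2 columns: " + line);
            }
            current.add(columns);
        }
        // The last sentence may not be followed by a blank line.
        if (!current.isEmpty()) {
            sentences.add(current);
        }
        return sentences;
    }

    public static void main(String[] args) {
        String sample = "Wolff B-PER\n, O\n\nReal B-ORG\nMadrid I-ORG\n. O\n";
        for (List<String[]> sentence : parse(sample)) {
            for (String[] token : sentence) {
                System.out.println(token[0] + "\t" + token[1]);
            }
            System.out.println();
        }
    }
}
```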
ResourceCollectionReaderBase.Resource
Modifier and Type | Field and Description |
---|---|
static String | PARAM_ENCODING: Character encoding of the input data. |
static String | PARAM_INTERN_TAGS: Use the String.intern() method on tags. |
static String | PARAM_LANGUAGE: The language. |
static String | PARAM_READ_NAMED_ENTITY: Read named entity information. |
EXCLUDE_PREFIX, INCLUDE_PREFIX, JAR_PREFIX, KEY_RESOURCE_RESOLVER, PARAM_INCLUDE_HIDDEN, PARAM_PATH, PARAM_PATTERNS, PARAM_SOURCE_LOCATION, PARAM_USE_DEFAULT_EXCLUDES
Constructor and Description |
---|
Conll2002Reader() |
Modifier and Type | Method and Description |
---|---|
void | getNext(org.apache.uima.jcas.JCas aJCas): Subclasses implement this method rather than JCasResourceCollectionReader_ImplBase.getNext(CAS) |
void | initialize(org.apache.uima.UimaContext aContext) |
getNext, initCas, initCas
getBase, getBase, getDefaultExcludes, getLanguage, getProgress, getResolver, getResourceIterator, getResources, getSourceLocation, hasNext, initCas, initCas, isSingleLocation, locationToUrl, nextFile, scan
close, getLogger, initialize
destroy, getCasInitializer, getProcessingResourceMetaData, initialize, isConsuming, reconfigure, setCasInitializer, typeSystemInit
getConfigParameterValue, getConfigParameterValue, setConfigParameterValue, setConfigParameterValue
getCasManager, getMetaData, getResourceManager, getUimaContext, getUimaContextAdmin, setLogger, setMetaData
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
public static final String PARAM_ENCODING
public static final String PARAM_LANGUAGE
public static final String PARAM_INTERN_TAGS
Use the String.intern() method on tags. This is usually a good idea, because it avoids filling the heap with thousands of duplicate strings that represent only a few distinct tags.
Default: true
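The effect of interning can be sketched as follows. Every tag read from a line is a freshly allocated `String`; calling `String.intern()` replaces it with one shared instance per distinct tag value. The class `InternTagsSketch` and the method `readTag` are hypothetical names used for illustration and are not part of the reader.

```java
/** Hypothetical illustration of tag interning; not part of DKPro Core. */
final class InternTagsSketch {

    /** Extracts the last space-separated column and optionally interns it. */
    static String readTag(String line, boolean intern) {
        // substring() allocates a new String object for each line read.
        String tag = line.substring(line.lastIndexOf(' ') + 1);
        return intern ? tag.intern() : tag;
    }

    public static void main(String[] args) {
        String a = readTag("Wolff B-PER", true);
        String b = readTag("Del B-PER", true);
        // With interning, both references point to the same pooled instance.
        System.out.println("same instance: " + (a == b));
    }
}
```

Without interning, a large corpus would hold one `String` object per token for what is typically a handful of distinct tag values.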
public static final String PARAM_READ_NAMED_ENTITY
Default: true
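The named entity column in the sample at the top of this page uses `B-` to open an entity and `I-` to continue it, with `O` for tokens outside any entity. A minimal sketch of how such tags could be grouped into entity spans is given below. It assumes that tagging scheme and is not the reader's actual code; `BioSpanSketch` and `decode` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical B-/I-/O span decoder; not part of DKPro Core. */
final class BioSpanSketch {

    /** Returns each entity as "TYPE:token token ...". */
    static List<String> decode(String[] forms, String[] tags) {
        List<String> entities = new ArrayList<>();
        StringBuilder current = null;
        String currentType = null;
        for (int i = 0; i < forms.length; i++) {
            String tag = tags[i];
            boolean continues = current != null
                    && tag.startsWith("I-")
                    && tag.substring(2).equals(currentType);
            if (continues) {
                // Same entity type continues: extend the open span.
                current.append(' ').append(forms[i]);
                continue;
            }
            if (current != null) {
                // Any other tag closes the open span.
                entities.add(currentType + ":" + current);
                current = null;
                currentType = null;
            }
            if (tag.startsWith("B-") || tag.startsWith("I-")) {
                // Open a new span (a stray I- tag is treated like B-).
                currentType = tag.substring(2);
                current = new StringBuilder(forms[i]);
            }
        }
        if (current != null) {
            entities.add(currentType + ":" + current);
        }
        return entities;
    }

    public static void main(String[] args) {
        String[] forms = {"Del", "Bosque", "in", "Real", "Madrid", "."};
        String[] tags = {"B-PER", "I-PER", "O", "B-ORG", "I-ORG", "O"};
        System.out.println(decode(forms, tags));
    }
}
```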
public void initialize(org.apache.uima.UimaContext aContext) throws org.apache.uima.resource.ResourceInitializationException
Overrides: initialize in class ResourceCollectionReaderBase
Throws: org.apache.uima.resource.ResourceInitializationException
public void getNext(org.apache.uima.jcas.JCas aJCas) throws IOException, org.apache.uima.collection.CollectionException
Description copied from class: JCasResourceCollectionReader_ImplBase
Subclasses implement this method rather than JCasResourceCollectionReader_ImplBase.getNext(CAS)
Specified by: getNext in class JCasResourceCollectionReader_ImplBase
Parameters:
aJCas - the JCas.
Throws:
IOException - if an I/O error occurs reading the data.
org.apache.uima.collection.CollectionException - if another type of error occurs.

Copyright © 2007–2016 Ubiquitous Knowledge Processing (UKP) Lab, Technische Universität Darmstadt. All rights reserved.