public class PdfReader extends ResourceCollectionReaderBase
ResourceCollectionReaderBase.Resource
Modifier and Type | Field and Description |
---|---|
static String | BUILT_IN |
static String | PARAM_END_PAGE: The last page to be extracted from the PDF. |
static String | PARAM_HEADING_TYPE: The type used to annotate headings. |
static String | PARAM_PARAGRAPH_TYPE: The type used to annotate paragraphs. |
static String | PARAM_START_PAGE: The first page to be extracted from the PDF. |
static String | PARAM_SUBSTITUTION_TABLE_LOCATION: The location of the substitution table used to post-process the text extracted from the PDF, e.g. |
EXCLUDE_PREFIX, INCLUDE_PREFIX, JAR_PREFIX, KEY_RESOURCE_RESOLVER, PARAM_INCLUDE_HIDDEN, PARAM_LANGUAGE, PARAM_LOG_FREQ, PARAM_PATH, PARAM_PATTERNS, PARAM_SOURCE_LOCATION, PARAM_USE_DEFAULT_EXCLUDES
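The description of PARAM_SUBSTITUTION_TABLE_LOCATION says only that a substitution table post-processes the extracted text. The following is a minimal sketch of what such a step could look like, assuming the table is a plain mapping from a source string to its replacement. The class `SubstitutionTableSketch`, the method `apply`, and the ligature entries are hypothetical illustrations; they are not part of PdfReader, and the table format the reader actually expects is not documented on this page.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; this is NOT the PdfReader implementation.
final class SubstitutionTableSketch {

    // Replaces every occurrence of each table key with its mapped value.
    // Entries are applied in the iteration order of the map.
    static String apply(String text, Map<String, String> table) {
        String result = text;
        for (Map.Entry<String, String> entry : table.entrySet()) {
            result = result.replace(entry.getKey(), entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed example entries: typographic ligatures that PDF text
        // extraction often leaves in place of plain letter pairs.
        Map<String, String> table = new LinkedHashMap<>();
        table.put("\uFB01", "fi"); // LATIN SMALL LIGATURE FI
        table.put("\uFB02", "fl"); // LATIN SMALL LIGATURE FL

        System.out.println(apply("\uFB01le over\uFB02ow", table)); // prints "file overflow"
    }
}
```

A `LinkedHashMap` is used here so that substitutions run in a predictable order, which matters if one replacement could produce text that a later entry matches.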
Constructor and Description |
---|
PdfReader() |
Modifier and Type | Method and Description |
---|---|
void | getNext(org.apache.uima.cas.CAS aCAS) |
void | initialize(org.apache.uima.UimaContext aContext) |
getBase, getBase, getDefaultExcludes, getLanguage, getProgress, getResolver, getResourceIterator, getResources, getSourceLocation, hasNext, initCas, initCas, isSingleLocation, locationToUrl, nextFile, scan
close, getLogger, initialize
destroy, getCasInitializer, getProcessingResourceMetaData, initialize, isConsuming, reconfigure, setCasInitializer, typeSystemInit
getConfigParameterValue, getConfigParameterValue, setConfigParameterValue, setConfigParameterValue
getCasManager, getMetaData, getRelativePathResolver, getResourceManager, getUimaContext, getUimaContextAdmin, setLogger, setMetaData
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
public static final String BUILT_IN
public static final String PARAM_SUBSTITUTION_TABLE_LOCATION
public static final String PARAM_HEADING_TYPE
public static final String PARAM_PARAGRAPH_TYPE
public static final String PARAM_START_PAGE
public static final String PARAM_END_PAGE
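PARAM_START_PAGE and PARAM_END_PAGE together select a page range. As a rough sketch of how such a range could be interpreted, the example below assumes 1-based, inclusive page numbers and clamps the requested range to the pages the document actually has. The class `PageRangeSketch` and the method `clamp` are hypothetical, and this page does not state the reader's real defaults or its behaviour for out-of-range values.

```java
// Hypothetical illustration only; this is NOT the PdfReader implementation.
final class PageRangeSketch {

    // Returns {first, last} as 1-based inclusive page numbers,
    // clamped to the range 1..pageCount.
    static int[] clamp(int startPage, int endPage, int pageCount) {
        int first = Math.max(1, startPage);
        int last = Math.min(pageCount, endPage);
        if (first > last) {
            throw new IllegalArgumentException(
                    "Empty page range: " + first + ".." + last);
        }
        return new int[] { first, last };
    }

    public static void main(String[] args) {
        // Request "page 2 to the end" of a 10-page document.
        int[] range = clamp(2, Integer.MAX_VALUE, 10);
        System.out.println(range[0] + ".." + range[1]); // prints "2..10"
    }
}
```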
public void initialize(org.apache.uima.UimaContext aContext) throws org.apache.uima.resource.ResourceInitializationException
Overrides: initialize in class ResourceCollectionReaderBase
Throws: org.apache.uima.resource.ResourceInitializationException
public void getNext(org.apache.uima.cas.CAS aCAS) throws IOException, org.apache.uima.collection.CollectionException
Throws: IOException, org.apache.uima.collection.CollectionException
Copyright © 2007–2018 Ubiquitous Knowledge Processing (UKP) Lab, Technische Universität Darmstadt. All rights reserved.