public class BinaryCasWriter extends JCasFileWriter_ImplBase
Format | Description | Type system on load | CAS addresses preserved |
---|---|---|---|
S | CAS structures are dumped to disc as they are using Java serialization (CASSerializer). Because these structures are pre-allocated in memory at larger sizes than actually required, files in this format may be larger than necessary. However, the CAS addresses of feature structures are preserved in this format. When the data is loaded back into a CAS, that CAS must have been initialized with the same type system as the original CAS. | must be the same | yes |
S+ | CAS structures are dumped to disc as they are using Java serialization as in format 0, but now using the CASCompleteSerializer, which includes CAS metadata such as the type system and index repositories. | is reinitialized | yes |
0 | CAS structures are dumped to disc as they are using Java serialization (CASSerializer). This is basically the same as format S, but it includes a UIMA header and can be read using Serialization.deserializeCAS(org.apache.uima.cas.CAS, java.io.InputStream). | must be the same | yes |
4 | UIMA binary serialization saving all feature structures (reachable or not). This format internally uses gzip compression and a binary representation of the CAS, making it much more efficient than format 0. | must be the same | yes |
6 | UIMA binary serialization as format 4, but saving only reachable feature structures. | must be the same | no |
6+ | UIMA binary serialization as format 6, but also containing the type system definition. This allows the BinaryCasReader to load data leniently into a CAS that has been initialized with a different type system. | lenient loading | no |
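The trade-offs in the table can be made explicit in code when choosing a value for PARAM_FORMAT. The following is a minimal sketch, not part of DKPro Core or UIMA: the enum `BinaryCasFormat`, its nested `TypeSystemOnLoad` type, and the `fromId` helper are hypothetical names introduced here for illustration only. The values simply restate the two right-hand columns of the table.

```java
import java.util.Arrays;
import java.util.Optional;

// Hypothetical helper that mirrors the format table above.
// It is NOT part of DKPro Core or UIMA; all names are illustrative assumptions.
enum BinaryCasFormat {
    S("S", TypeSystemOnLoad.MUST_BE_SAME, true),
    S_PLUS("S+", TypeSystemOnLoad.REINITIALIZED, true),
    FORMAT_0("0", TypeSystemOnLoad.MUST_BE_SAME, true),
    FORMAT_4("4", TypeSystemOnLoad.MUST_BE_SAME, true),
    FORMAT_6("6", TypeSystemOnLoad.MUST_BE_SAME, false),
    FORMAT_6_PLUS("6+", TypeSystemOnLoad.LENIENT, false);

    // How the type system is treated when the data is loaded back into a CAS.
    enum TypeSystemOnLoad { MUST_BE_SAME, REINITIALIZED, LENIENT }

    private final String id;
    private final TypeSystemOnLoad typeSystemOnLoad;
    private final boolean preservesAddresses;

    BinaryCasFormat(String id, TypeSystemOnLoad typeSystemOnLoad, boolean preservesAddresses) {
        this.id = id;
        this.typeSystemOnLoad = typeSystemOnLoad;
        this.preservesAddresses = preservesAddresses;
    }

    // The identifier as it appears in the "Format" column.
    String id() {
        return id;
    }

    TypeSystemOnLoad typeSystemOnLoad() {
        return typeSystemOnLoad;
    }

    // True if CAS addresses of feature structures survive a write/read round trip.
    boolean preservesAddresses() {
        return preservesAddresses;
    }

    // Looks up a table row by its format identifier; empty if the identifier is unknown.
    static Optional<BinaryCasFormat> fromId(String id) {
        return Arrays.stream(values()).filter(f -> f.id.equals(id)).findFirst();
    }
}
```

For example, `BinaryCasFormat.fromId("6+")` yields the row with lenient loading and without preserved CAS addresses, while an identifier that is not in the table yields an empty `Optional`.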
JCasFileWriter_ImplBase.NamedOutputStream
Modifier and Type | Field and Description |
---|---|
static String | PARAM_FILENAME_EXTENSION |
static String | PARAM_FORMAT |
static String | PARAM_TYPE_SYSTEM_LOCATION: Location to write the type system to. |
JAR_PREFIX, PARAM_COMPRESSION, PARAM_ESCAPE_DOCUMENT_ID, PARAM_STRIP_EXTENSION, PARAM_TARGET_LOCATION, PARAM_USE_DOCUMENT_ID
Constructor and Description |
---|
BinaryCasWriter() |
Modifier and Type | Method and Description |
---|---|
void | process(org.apache.uima.jcas.JCas aJCas) |
collectionProcessComplete, getCompressionMethod, getOutputStream, getOutputStream, getRelativePath, getTargetPath, getTargetPath, isStripExtension, isUseDocumentId
getLogger, initialize
getRequiredCasInterface, process
getCasInstancesRequired, hasNext, next
public static final String PARAM_TYPE_SYSTEM_LOCATION
Location to write the type system to, e.g. typesystem.ser. The JCasFileWriter_ImplBase.PARAM_COMPRESSION parameter has no effect on the type system. Instead, whether the type system file is compressed is detected from the file name extension (e.g. ".gz").
The SerializedCasReader currently cannot read such files. Use this only if you really know what you are doing.
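Deciding on compression from the file name extension can be sketched with the JDK alone. This is an illustrative stand-in, not the DKPro Core implementation: the class `ExtensionCompression` and its methods are hypothetical, and the sketch assumes that only a ".gz" suffix (matched case-insensitively) triggers compression.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of DKPro Core: picks compression from the file name.
final class ExtensionCompression {
    private ExtensionCompression() {
        // Static utility class; no instances.
    }

    // Assumption: only the ".gz" suffix is recognized, ignoring case.
    static boolean isGzipName(String fileName) {
        return fileName.toLowerCase(Locale.ROOT).endsWith(".gz");
    }

    // Opens the target for writing, wrapping it in gzip compression if the name asks for it.
    static OutputStream open(Path target) throws IOException {
        OutputStream raw = Files.newOutputStream(target);
        if (!isGzipName(target.getFileName().toString())) {
            return raw;
        }
        try {
            return new GZIPOutputStream(raw);
        } catch (IOException e) {
            // Do not leak the underlying stream if the gzip header cannot be written.
            raw.close();
            throw e;
        }
    }
}
```

With this sketch, a path ending in typesystem.ser.gz is written through a `GZIPOutputStream`, while typesystem.ser is written uncompressed, regardless of any separate compression setting for the CAS files themselves.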
public static final String PARAM_FORMAT
public static final String PARAM_FILENAME_EXTENSION
public void process(org.apache.uima.jcas.JCas aJCas) throws org.apache.uima.analysis_engine.AnalysisEngineProcessException
Specified by: process in class org.apache.uima.analysis_component.JCasAnnotator_ImplBase
Throws: org.apache.uima.analysis_engine.AnalysisEngineProcessException
Copyright © 2011–2015. All rights reserved.