Package org.apache.uima.cas.impl
Class BinaryCasSerDes4
java.lang.Object
org.apache.uima.cas.impl.BinaryCasSerDes4
- All Implemented Interfaces:
SlotKindsConstants
User-callable serialization and deserialization of the CAS in a compressed binary format.
This serializes/deserializes the state of the CAS, assuming that the type
information remains constant.
The header tells the reader the format and the compression level.
How to Serialize:
1) create an instance of this class, specifying options that rarely change
2) call serialize(cas, out) to serialize the CAS
You can reuse the instance for a different CAS (as long as the type system is the same);
this will save setup time.
This class lazily constructs customized TypeInfo instances for each type encountered in serializing.
These are preserved across multiple serialization calls, so their setup / initialization is only
needed the first time.
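The lazy setup described above can be pictured with a small cache keyed by type. This is only a sketch under stated assumptions: `TypeInfoSketch`, `LazyTypeInfoCache`, `infoFor` and `buildCount` are hypothetical names invented for illustration and are not part of this class or of UIMA.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-type setup data; a stand-in for the TypeInfo mentioned above. */
final class TypeInfoSketch {
    final String typeName;
    final int slotCount;

    TypeInfoSketch(String typeName, int slotCount) {
        this.typeName = typeName;
        this.slotCount = slotCount;
    }
}

/** Hypothetical holder that builds per-type info on first use and keeps it afterwards. */
final class LazyTypeInfoCache {
    private final Map<String, TypeInfoSketch> infoByType = new HashMap<>();
    private int buildCount = 0; // counts how many times setup actually ran

    /** Returns the cached info for a type, building it only on the first request. */
    TypeInfoSketch infoFor(String typeName, int slotCount) {
        return infoByType.computeIfAbsent(typeName, name -> {
            buildCount++;
            return new TypeInfoSketch(name, slotCount);
        });
    }

    int buildCount() {
        return buildCount;
    }
}
```

In this sketch a second request for the same type returns the same object, which is the saving that reusing one instance across several serialization calls is meant to provide. The sketch is not thread-safe.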
The form of the binary CAS is inserted at the beginning so that receivers can do the
proper deserialization.
The binary format requires that the exact same type system be used when deserializing.
How to Deserialize:
1) get an appropriate CAS to deserialize into. For a delta CAS, it does not have to be empty.
2) call CASImpl: cas.reinit(inputStream). This is the existing method
for binary deserialization, and it now handles this compressed version, too.
Delta CAS is also supported.
Compression/Decompression
Works in two stages:
application of Zip/Unzip to particular sub-collections of CAS data,
grouped according to similar data distribution
collection of like kinds of data (to make the zipping more effective)
There can be up to ~20 of these collections, such as
control info, float-exponents, string chars
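The two stages can be sketched with the standard java.util.zip classes. The example below is an illustration only, not this class's actual code: the class name `GroupedZipSketch`, its methods, the bit layout used to split a float, and the choice of `Deflater.BEST_SPEED` are all assumptions made for the sketch.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

final class GroupedZipSketch {
    /** Collects like kinds of data: exponents in one collection, sign and mantissa in another. */
    static byte[][] splitFloats(float[] values) {
        ByteArrayOutputStream exponents = new ByteArrayOutputStream();
        ByteArrayOutputStream mantissas = new ByteArrayOutputStream();
        try (DataOutputStream mantOut = new DataOutputStream(mantissas)) {
            for (float v : values) {
                int bits = Float.floatToRawIntBits(v);
                exponents.write((bits >>> 23) & 0xFF); // the 8 exponent bits
                mantOut.writeInt(bits & 0x807FFFFF);   // sign bit plus 23 mantissa bits
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new byte[][] { exponents.toByteArray(), mantissas.toByteArray() };
    }

    /** Deflates one collection on its own. */
    static byte[] zip(byte[] raw) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        try (DeflaterOutputStream out = new DeflaterOutputStream(sink, deflater)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            deflater.end(); // release native resources of the caller-supplied Deflater
        }
        return sink.toByteArray();
    }

    /** Runs both stages: split into collections, then zip each collection separately. */
    static byte[][] compress(float[] values) {
        byte[][] groups = splitFloats(values);
        byte[][] zipped = new byte[groups.length][];
        for (int i = 0; i < groups.length; i++) {
            zipped[i] = zip(groups[i]);
        }
        return zipped;
    }
}
```

When many values share an exponent, the exponent collection becomes a long run of identical bytes, which deflates far better than the same bytes interleaved with mantissa data.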
Deserialization:
Read all bytes,
create separate ByteArrayInputStreams for each segment, sharing the byte buffer,
create appropriate unzip data input streams for these
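The deserialization steps above can be sketched as follows. This is a minimal illustration, assuming each segment's offset and length within the shared buffer are already known; `SegmentReaderSketch` and its methods are hypothetical names, and the real segment layout is not shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

final class SegmentReaderSketch {
    /** Opens an unzip data stream over one segment of the shared buffer; no bytes are copied. */
    static DataInputStream openSegment(byte[] shared, int offset, int length) {
        ByteArrayInputStream slice = new ByteArrayInputStream(shared, offset, length);
        return new DataInputStream(new InflaterInputStream(slice));
    }

    /** Helper for the example: writes ints through a deflater to produce one zipped segment. */
    static byte[] deflateInts(int... values) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(new DeflaterOutputStream(sink))) {
            for (int v : values) {
                out.writeInt(v);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    /** Reads a known number of ints back from an opened segment stream. */
    static int[] readInts(DataInputStream in, int count) {
        int[] result = new int[count];
        try {
            for (int i = 0; i < count; i++) {
                result[i] = in.readInt();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }
}
```

Because `ByteArrayInputStream(byte[], int, int)` only wraps the array, every segment stream reads from the same buffer, and each inflater stops at the end of its own slice.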
Properties of Form 4:
1) (Change from V2) Indexes are used to determine what gets serialized, because there's no "heap" to walk,
unless the v2-id-mode is in effect.
2) The number used for references to FSs is a sequentially incrementing one, starting at 1.
This allows better compression.
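One way to see why sequential reference numbers help is to renumber arbitrary identifiers in the order they are first seen and compare encoded sizes. The sketch below is an assumption-laden illustration: `SequentialRefSketch` is a hypothetical name, and the 7-bits-per-byte variable-length encoding is used only to make the size difference visible; it is not claimed to be the encoding this class uses.

```java
import java.util.HashMap;
import java.util.Map;

final class SequentialRefSketch {
    private final Map<Integer, Integer> seqByOriginalId = new HashMap<>();

    /** Assigns 1, 2, 3, ... in the order identifiers are first seen; repeats get the same number. */
    int seqFor(int originalId) {
        Integer seq = seqByOriginalId.get(originalId);
        if (seq == null) {
            seq = seqByOriginalId.size() + 1;
            seqByOriginalId.put(originalId, seq);
        }
        return seq;
    }

    /** Bytes needed to store a non-negative int at 7 payload bits per byte. */
    static int varIntLength(int value) {
        int length = 1;
        while ((value >>>= 7) != 0) {
            length++;
        }
        return length;
    }
}
```

Small, dense numbers need fewer bytes in an encoding like this, and consecutive values also produce small differences, which suits difference-based encoding.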
-
Nested Class Summary
Nested Classes
Modifier and Type - Class - Description
static enum
static enum - Compression alternatives
static enum
private class - Class instantiated once per deserialization. Multiple deserializations in parallel are supported, with multiple instances of this class.
private class - Class instantiated once per serialization. Multiple serializations in parallel are supported, with multiple instances of this class.
-
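The "one instance per call" arrangement described for the two private classes can be sketched as an outer object holding reusable setup and an inner worker holding all mutable state. Everything named here (`PerCallStateSketch`, `Worker`, `sharedSetup`) is hypothetical and chosen for illustration; the sketch also assumes the shared setup is read-only, which this page does not state.

```java
import java.util.ArrayList;
import java.util.List;

final class PerCallStateSketch {
    private final String sharedSetup; // reusable setup, read-only after construction (assumption)

    PerCallStateSketch(String sharedSetup) {
        this.sharedSetup = sharedSetup;
    }

    /** Each call creates its own worker, so concurrent calls do not share mutable state. */
    List<String> serialize(List<String> items) {
        return new Worker().run(items);
    }

    /** Instantiated once per serialize call; holds the mutable working state. */
    private final class Worker {
        private final List<String> output = new ArrayList<>();
        private int nextId = 1;

        List<String> run(List<String> items) {
            for (String item : items) {
                output.add(sharedSetup + ":" + nextId++ + ":" + item);
            }
            return output;
        }
    }
}
```

Each call starts numbering from 1 again because the counter lives in the per-call worker rather than in the shared outer object.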
Field Summary
Fields
Modifier and Type - Field - Description
static final boolean - CAN_BE_NEGATIVE
private final boolean - doMeasurements
(package private) final TypeImpl - fsArrayType
static final boolean - IGNORED
static final boolean - IN_MAIN_HEAP
static final boolean - IS_DIFF_ENCODE
private static final boolean - TRACE_DES
private static final boolean - TRACE_DOUBLE
private static final boolean - TRACE_SER
private final TypeSystemImpl - ts - Things set up for one instance of this class, and reusable
static final int - TYPECODE_COMPR
Fields inherited from interface org.apache.uima.cas.impl.SlotKindsConstants
arrayLength_i, byte_i, control_i, double_Exponent_i, double_Mantissa_Sign_i, float_Exponent_i, float_Mantissa_Sign_i, fsIndexes_i, heapRef_i, int_i, long_High_i, long_Low_i, NBR_SLOT_KIND_ZIP_STREAMS, short_i, strChars_i, strLength_i, strOffset_i, strSeg_i, typeCode_i
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type - Method - Description
void - deserialize(CASImpl cas, InputStream deserIn, boolean isDelta, CommonSerDes.Header h)
static void - dumpCas
(package private) static CommonSerDesSequential - getCsds
private static DataOutputStream - makeDataOutputStream
SerializationMeasures - serialize(AbstractCas cas, Object out)
SerializationMeasures - serialize(AbstractCas cas, Object out, Marker trackingMark)
SerializationMeasures - serialize(AbstractCas cas, Object out, Marker trackingMark, BinaryCasSerDes4.CompressLevel compressLevel)
SerializationMeasures - serialize(AbstractCas cas, Object out, Marker trackingMark, BinaryCasSerDes4.CompressLevel compressLevel, BinaryCasSerDes4.CompressStrat compressStrategy)
void - serializeWithTsi(CASImpl casImpl, Object out)
-
Field Details
-
TRACE_SER
private static final boolean TRACE_SER
-
TRACE_DES
private static final boolean TRACE_DES
-
TRACE_DOUBLE
private static final boolean TRACE_DOUBLE
-
TYPECODE_COMPR
public static final int TYPECODE_COMPR
-
IS_DIFF_ENCODE
public static final boolean IS_DIFF_ENCODE
-
CAN_BE_NEGATIVE
public static final boolean CAN_BE_NEGATIVE
-
IGNORED
public static final boolean IGNORED
-
IN_MAIN_HEAP
public static final boolean IN_MAIN_HEAP
-
ts
private final TypeSystemImpl ts
Things set up for one instance of this class, and reusable
-
doMeasurements
private final boolean doMeasurements
-
fsArrayType
-
-
Constructor Details
-
BinaryCasSerDes4
- Parameters:
ts - the type system
doMeasurements - normally set this to false.
-
-
Method Details
-
serialize
public SerializationMeasures serialize(AbstractCas cas, Object out, Marker trackingMark, BinaryCasSerDes4.CompressLevel compressLevel, BinaryCasSerDes4.CompressStrat compressStrategy) throws IOException
- Parameters:
cas - CAS to serialize
out - output object
trackingMark - tracking mark (for delta serialization)
compressLevel -
compressStrategy -
- Returns:
null or serialization measurements (depending on setting of doMeasurements)
- Throws:
IOException - if the marker is invalid
-
serializeWithTsi
- Throws:
IOException
-
serialize
public SerializationMeasures serialize(AbstractCas cas, Object out, Marker trackingMark, BinaryCasSerDes4.CompressLevel compressLevel) throws IOException
- Throws:
IOException
-
serialize
public SerializationMeasures serialize(AbstractCas cas, Object out, Marker trackingMark) throws IOException
- Throws:
IOException
-
serialize
- Throws:
IOException
-
deserialize
public void deserialize(CASImpl cas, InputStream deserIn, boolean isDelta, CommonSerDes.Header h) throws IOException
- Throws:
IOException
-
makeDataOutputStream
- Parameters:
f - can be a DataOutputStream, an OutputStream, or a File
- Returns:
a data output stream
- Throws:
FileNotFoundException - passed through
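The parameter contract above (a DataOutputStream, an OutputStream, or a File) suggests a type dispatch along these lines. This is a sketch of the idea rather than the method's actual body: `OutputTargetSketch` and `toDataOutputStream` are hypothetical names, and both the buffering of the file stream and the `IllegalArgumentException` for unsupported types are assumptions.

```java
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.OutputStream;

final class OutputTargetSketch {
    /** Wraps the supported target kinds in a DataOutputStream, reusing one if already given. */
    static DataOutputStream toDataOutputStream(Object f) throws FileNotFoundException {
        if (f instanceof DataOutputStream) {
            return (DataOutputStream) f; // already the right type, returned as-is
        }
        if (f instanceof OutputStream) {
            return new DataOutputStream((OutputStream) f);
        }
        if (f instanceof File) {
            // FileOutputStream may throw FileNotFoundException, which is passed through
            return new DataOutputStream(new BufferedOutputStream(new FileOutputStream((File) f)));
        }
        throw new IllegalArgumentException("unsupported output target: " + f);
    }
}
```

The DataOutputStream check comes first because a DataOutputStream is also an OutputStream, and wrapping it again would be redundant.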
-
getCsds
-
dumpCas
-