Class CasSerializerSupport.CasDocSerializer

java.lang.Object
org.apache.uima.cas.impl.CasSerializerSupport.CasDocSerializer
Enclosing class:
CasSerializerSupport

public class CasSerializerSupport.CasDocSerializer extends Object
Inner class that holds the data for serializing a CAS. Each call to serialize() creates its own instance. It is package-private so that a test case can access it, and not static so that it can share the logger and the initializing values (this could be changed).
  • Field Details

    • cas

      public final CASImpl cas
    • tsi

      public final TypeSystemImpl tsi
    • visited_not_yet_written

      public final Set<TOP> visited_not_yet_written
      Set of FSs that have been visited and enqueued to be serialized.
      Exception: arrays and lists which are "inline" are put into this set, but are not enqueued to be serialized.
      FSs are added to this set during the "enqueue" phase, prior to encoding.
      Uses:
        - for arrays and lists, to detect multi-refs
        - for lists, to detect loops
        - during the enqueuing phase, to prevent multiple enqueuings
        - during the encoding phase, to prevent multiple encodings
      Public for use by JsonCasSerializer.
    • enqueued_multiRef_arrays_or_lists

      private final Set<TOP> enqueued_multiRef_arrays_or_lists
      Set of array or list FSs referenced from features marked as multipleReferencesAllowed,
        - which have previously been serialized "inline"
        - which now need to be serialized as separate items
      Set during enqueue scanning, to handle the case where the "visited_not_yet_written" set may have already recorded that this FS has been processed for enqueueing, but it is an array or list item which was being put "in-line" and no element is being written. It has array or list elements where the item needs to be enqueued onto the "queue" list.
      Use: limit putting an item onto the "queue" list to one time.
    • multiRefFSs

      public final Set<TOP> multiRefFSs
      Set of FSs that have multiple references.
      Has an entry for each FS (not just array or list FSs) which is (from some point on) being serialized as a multi-ref, that is, is **not** being serialized (any more) using the special notation for arrays and lists or, for JSON, **not** being serialized using the embedded notation.
      This is for JSON, which computes the multi-refs itself rather than depending on the setting in a feature. It is also for XMI, to enable adding each FS of this kind to the "queue" (once).
      Used to:
        - limit the number of times an FS is put onto the queue to 1
        - skip encoding of items on the "queue" if not in this Set (maybe not needed? 8/2017 mis)
        - serialize if not in the indexed set, dynamic ref == true, and in this set (otherwise serialize only from ref)
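      The "put onto the queue at most once" use can be sketched with the return value of Set.add. This is only an illustration, not the UIMA implementation: the class EnqueueOnce, its methods, and the generic element type standing in for TOP are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in; the documented fields hold TOP instances.
final class EnqueueOnce<T> {
    private final Set<T> multiRefs = new HashSet<>();
    private final Deque<T> queue = new ArrayDeque<>();

    /** Returns true only the first time fs is seen; that is also when it is queued. */
    boolean enqueueOnce(T fs) {
        if (!multiRefs.add(fs)) {
            return false; // already recorded as a multi-ref, so already on the queue
        }
        queue.addLast(fs);
        return true;
    }

    int queueSize() {
        return queue.size();
    }
}
```

      Membership in the set doubles as the "already queued" flag, so no separate bookkeeping is needed in this sketch.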
    • isDynamicMultiRef

      public final boolean isDynamicMultiRef
      Set to true for JSON configuration of using dynamic multi-ref detection for arrays and lists
    • previouslySerializedFSs

      public List<TOP> previouslySerializedFSs
    • modifiedEmbeddedValueFSs

      public List<TOP> modifiedEmbeddedValueFSs
    • indexedFSs

      public final List<TOP>[] indexedFSs
      Array of Lists of all FS that are indexed in some view (other than sofas). Array indexed by view.
    • queue

      private final Deque<TOP> queue
      FSs not in an index, but only being serialized because they're referenced. Exception: the sofas are here.
    • typeCode2namespaceNames

      public XmlElementName[] typeCode2namespaceNames
    • typeUsed

      private final BitSet typeUsed
    • needNameSpaces

      public boolean needNameSpaces
    • nsUriToPrefixMap

      public final Map<String,String> nsUriToPrefixMap
      map from a namespace expanded form to the namespace prefix, to identify potential collisions when generating a namespace string
    • nsPrefixesUsed

      public final Set<String> nsPrefixesUsed
      the set of all namespace prefixes used, to disallow some if they are in use already in set-aside data (xmi serialization) being merged back in
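      How a URI-to-prefix map and a set of used prefixes can cooperate to avoid collisions is sketched below. The class PrefixAllocator, its methods, and the strategy of appending a counter to a colliding prefix are assumptions made for illustration; the documentation above does not say how getNameSpacePrefix resolves a collision.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of UIMA.
final class PrefixAllocator {
    private final Map<String, String> nsUriToPrefix = new HashMap<>();
    private final Set<String> prefixesUsed = new HashSet<>();

    /** Marks a prefix as taken, for example by set-aside data being merged back in. */
    void reserve(String prefix) {
        prefixesUsed.add(prefix);
    }

    /** Returns the prefix for nsUri, allocating a collision-free one on first use. */
    String prefixFor(String nsUri, String preferred) {
        String existing = nsUriToPrefix.get(nsUri);
        if (existing != null) {
            return existing;
        }
        String candidate = preferred;
        int suffix = 2; // assumed scheme: preferred, preferred2, preferred3, ...
        while (!prefixesUsed.add(candidate)) { // add() returns false if already in use
            candidate = preferred + suffix++;
        }
        nsUriToPrefix.put(nsUri, candidate);
        return candidate;
    }
}
```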
    • marker

      public final MarkerImpl marker
      Used to tell if a FS was created before or after mark.
    • sharedData

      public final XmiSerializationSharedData sharedData
      for Delta serialization, holds the info gathered from deserialization needed for delta serialization and for handling out-of-type-system data for both plain and delta serialization
    • isDelta

      public final boolean isDelta
      Whether the serializer needs to serialize only the deltas, that is, new FSs created after the mark represented by the Marker object, and preexisting FSs and Views that have been modified. Set to true if the Marker object is not null and the CASImpl object of this serializer matches the CASImpl in the Marker object.
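      The condition described for isDelta amounts to a null check plus a reference comparison. A minimal sketch follows; the nested Cas and Marker classes are hypothetical stand-ins for CASImpl and MarkerImpl, and comparing by reference is an assumption about what "matches" means.

```java
// Hypothetical illustration, not the UIMA source.
final class DeltaCheck {
    static final class Cas {}

    static final class Marker {
        final Cas cas; // the CAS this marker was created on
        Marker(Cas cas) {
            this.cas = cas;
        }
    }

    /** Delta mode applies only when a marker exists and belongs to the CAS being serialized. */
    static boolean isDelta(Marker marker, Cas casBeingSerialized) {
        return marker != null && marker.cas == casBeingSerialized;
    }
}
```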
    • isFiltering

      public final boolean isFiltering
      Whether the serializer needs to check for filtered-out types/features. Set to true if type system of CAS does not match type system that was passed to constructor of serializer.
    • sortedUsedTypes

      private TypeImpl[] sortedUsedTypes
    • errorHandler2

      private final ErrorHandler errorHandler2
    • filterTypeSystem_inner

      public TypeSystemImpl filterTypeSystem_inner
    • uniqueStrings

      private final Map<String,String> uniqueStrings
    • isFormattedOutput_inner

      public final boolean isFormattedOutput_inner
    • csss

    • sortFssByType

      public final Comparator<TOP> sortFssByType
      Called for JSON serialization. Sorts a view by type, then by begin (ascending) and end (descending) for subtypes of Annotation, then by id.
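      The ordering described for sortFssByType can be sketched as a chained comparison. The Fs class and its fields are hypothetical, and the sketch treats every element as annotation-like; the real comparator applies the begin/end step only to subtypes of Annotation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration, not the UIMA source.
final class FsSortSketch {
    static final class Fs {
        final int typeCode, begin, end, id;
        Fs(int typeCode, int begin, int end, int id) {
            this.typeCode = typeCode;
            this.begin = begin;
            this.end = end;
            this.id = id;
        }
    }

    static final Comparator<Fs> ORDER = (a, b) -> {
        int c = Integer.compare(a.typeCode, b.typeCode); // type first
        if (c != 0) return c;
        c = Integer.compare(a.begin, b.begin);           // begin ascending
        if (c != 0) return c;
        c = Integer.compare(b.end, a.end);               // end descending
        if (c != 0) return c;
        return Integer.compare(a.id, b.id);              // id as the final tie-breaker
    };

    /** Returns a sorted copy, leaving the input list untouched. */
    static List<Fs> sorted(List<Fs> in) {
        List<Fs> out = new ArrayList<>(in);
        out.sort(ORDER);
        return out;
    }
}
```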
  • Constructor Details

  • Method Details

    • reportMultiRefWarning

      private void reportMultiRefWarning(FeatureImpl fi) throws SAXException
      Throws:
      SAXException
    • serialize

      public void serialize() throws Exception
      Starts serialization
      Throws:
      Exception
    • getSofa

      public Sofa getSofa(int sofaNum)
      Parameters:
      sofaNum - the sofa number; starts at 1
      Returns:
      the sofa FS, or null
    • writeViewsCommons

      public void writeViewsCommons() throws Exception
      Throws:
      Exception
    • getSortedUsedTypes

      public TypeImpl[] getSortedUsedTypes()
    • getUsedTypesIterable

      private Iterable<TypeImpl> getUsedTypesIterable()
    • enqueueIncoming

      private void enqueueIncoming()
      Enqueues all FS that are stored in the sharedData's id map. This map is populated during the previous deserialization. This method is used to make sure that all incoming FS are echoed in the next serialization. It is required if there are out-of-type FSs that are being merged back into the serialized form; those might reference some of these.
    • enqueueIndexed

      private void enqueueIndexed()
      Adds the indexed FSs to indexedFSs, by view. Adds the Sofa FSs to the by-ref queue.
    • enqueueNonsharedMultivaluedFS

      private void enqueueNonsharedMultivaluedFS()
      When serializing a Delta CAS, enqueue the encompassing FS of each nonshared multivalued FS that has been modified. The embedded nonshared multivalued item could be a list or an array.
    • enqueueFeaturesOfIndexed

      private void enqueueFeaturesOfIndexed() throws SAXException
      Enqueue everything reachable from features of indexed FSs.
      Throws:
      SAXException
    • enqueueFeaturesOfFSs

      private void enqueueFeaturesOfFSs(List<TOP> fss) throws SAXException
      Throws:
      SAXException
    • enqueueCommon

      int enqueueCommon(TOP fs)
    • enqueueCommonWithoutDeltaAndFilteringCheck

      int enqueueCommonWithoutDeltaAndFilteringCheck(TOP fs)
    • enqueueCommon

      private int enqueueCommon(TOP fs, boolean doDeltaAndFilteringCheck)
      Parameters:
      fs - -
      doDeltaAndFilteringCheck - -
      Returns:
      true to have enqueue put onto "queue" and enqueue features
    • enqueueIndexedFs_only_not_features

      void enqueueIndexedFs_only_not_features(int viewNumber, TOP fs)
    • enqueueFsAndMaybeFeatures

      private void enqueueFsAndMaybeFeatures(TOP fs) throws SAXException
      Enqueue an FS, and everything reachable from it. This call is recursive with enqueueFeatures, and an arbitrarily long chain can cause a stack overflow error. Probably should fix this someday. See https://issues.apache.org/jira/browse/UIMA-106
      Parameters:
      fs - the FS to enqueue
      Throws:
      SAXException
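      The stack-overflow risk noted for enqueueFsAndMaybeFeatures comes from mutual recursion over a long reference chain. One common way to avoid it is an explicit work stack, sketched below. This is a generic technique, not what the UIMA code does; the Node class and the reachable method are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical illustration, not the UIMA source.
final class IterativeEnqueue {
    // Stand-in for an FS: a node holding references to other nodes.
    static final class Node {
        final List<Node> refs = new ArrayList<>();
    }

    /** Collects every node reachable from root using an explicit stack instead of recursion. */
    static List<Node> reachable(Node root) {
        Set<Node> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        List<Node> out = new ArrayList<>();
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            if (!visited.add(n)) {
                continue; // already handled; this also breaks cycles
            }
            out.add(n);
            for (Node ref : n.refs) {
                stack.push(ref);
            }
        }
        return out;
    }
}
```

      The depth of the reference chain then costs heap space in the work stack rather than call-stack frames.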
    • isListElementsMultiplyReferenced

      private boolean isListElementsMultiplyReferenced(TOP listNode)
      For lists, see if this is a plain list:
        - no loops
        - no other refs to list elements from outside the list
      If so, return false. Adds all the elements of the list to visited_not_yet_written, noting if they've already been added; this indicates either a loop or another ref from outside, and in either case returns true.
      Parameters:
      listNode - the list node at which to start checking
      Returns:
      false if no list element is multiply-referenced, true if there is a loop or another ref from outside the list, for one or more list element nodes
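      The check described for isListElementsMultiplyReferenced can be sketched as a single walk over the list that records each node in a shared visited set. The ListNode class is a hypothetical stand-in for an FS list node, and stopping at the first repeated node is a simplification chosen so that the walk terminates on a loop.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration, not the UIMA source.
final class ListMultiRefSketch {
    static final class ListNode {
        ListNode tail; // null marks the end of the list
    }

    /**
     * Walks the list from head, adding each node to visited.
     * Returns true if some node was already in visited, which means
     * either a loop or a reference to that node from outside the list.
     */
    static boolean isMultiplyReferenced(ListNode head, Set<ListNode> visited) {
        for (ListNode n = head; n != null; n = n.tail) {
            if (!visited.add(n)) {
                return true; // seen before: stop, since a loop would never end
            }
        }
        return false;
    }

    /** Convenience overload for a walk that starts with nothing visited. */
    static boolean isMultiplyReferenced(ListNode head) {
        return isMultiplyReferenced(head, new HashSet<>());
    }
}
```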
    • isMultiRef_enqueue

      private boolean isMultiRef_enqueue(FeatureImpl fi, TOP featVal, boolean alreadyVisited, boolean isListNode, boolean isListFeat) throws SAXException
      Ordinary FSs referenced as features are not checked by this routine; it is only called for FS lists of various kinds and FS arrays of various kinds.
      Not all featValues should be enqueued:
        - list or array features which are marked **NOT** multiple-refs-allowed are serialized in-line
        - for JSON, when using dynamicMultiRef (the default), list / array FSs are serialized by ref (not in-line) if there are multiple refs to them
        - for XMI and JSON, any FS ref marked as multiple-refs-allowed forces the item onto the ref "queue"
      (Not handled here: ordinary FSs are serialized in-line in JSON with isDynamicMultiRef.)
      Parameters:
      fi - the feature, used to look up the multiRefAllowed flag
      featVal - the list or array element
      alreadyVisited - true if visited_not_yet_written contains the featVal
      isListNode - -
      isListFeat - -
      Returns:
      false if should skip enqueue because this array or list is being serialized inline
      Throws:
      SAXException
    • enqueueFeatures

      private void enqueueFeatures(TOP fs) throws SAXException
      Enqueue all FSs reachable from features of the given FS.
      Parameters:
      fs - the FS whose features are scanned
      Throws:
      SAXException
    • enqueueFSArrayElements

      private void enqueueFSArrayElements(FSArray fsArray) throws SAXException
      Enqueues all FS reachable from an FSArray.
      Parameters:
      fsArray - the FSArray whose elements are enqueued
      Throws:
      SAXException
    • enqueueFSListElements

      private void enqueueFSListElements(FSList<TOP> node) throws SAXException
      Enqueues all Head values of FSList reachable from an FSList. This does NOT include the list nodes themselves.
      Parameters:
      node - the FSList whose head values are enqueued
      Throws:
      SAXException
    • encodeIndexed

      public void encodeIndexed() throws Exception
      Throws:
      Exception
    • encodeFSs

      private void encodeFSs(List<TOP> fss) throws Exception
      Throws:
      Exception
    • encodeQueued

      public void encodeQueued() throws Exception
      Throws:
      Exception
    • encodeFS

      public void encodeFS(TOP fs) throws Exception
      Encode an individual FS. JSON has 2 encodings.
      For type:
        "typeName" : [ { "@id" : 123, feat : value .... },
                       { "@id" : 456, feat : value .... },
                       ... ],
        ...
      For id:
        "nnnn" : {"@type" : typeName ; feat : value ...}
      For cases where the top level type is an array or list, there is a generated feature name, "@collection", whose value is the list or array of values associated with that type.
      Parameters:
      fs - the FS to be encoded.
      Throws:
      SAXException - passthru
      Exception
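      The two JSON shapes described for encodeFS differ only in which property becomes the key. The sketch below builds both as nested maps. It is an illustration of the layout only: the Fs class with its single feature named "feat" is hypothetical, and no actual JSON text is written.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not the UIMA source.
final class JsonShapes {
    static final class Fs {
        final int id;
        final String typeName;
        final Object feat;
        Fs(int id, String typeName, Object feat) {
            this.id = id;
            this.typeName = typeName;
            this.feat = feat;
        }
    }

    /** By type: "typeName" : [ { "@id" : 123, feat : value }, ... ] */
    static Map<String, List<Map<String, Object>>> byType(List<Fs> fss) {
        Map<String, List<Map<String, Object>>> out = new LinkedHashMap<>();
        for (Fs fs : fss) {
            Map<String, Object> entry = new LinkedHashMap<>();
            entry.put("@id", fs.id);
            entry.put("feat", fs.feat);
            out.computeIfAbsent(fs.typeName, k -> new ArrayList<>()).add(entry);
        }
        return out;
    }

    /** By id: "nnnn" : { "@type" : typeName, feat : value } */
    static Map<String, Map<String, Object>> byId(List<Fs> fss) {
        Map<String, Map<String, Object>> out = new LinkedHashMap<>();
        for (Fs fs : fss) {
            Map<String, Object> entry = new LinkedHashMap<>();
            entry.put("@type", fs.typeName);
            entry.put("feat", fs.feat);
            out.put(String.valueOf(fs.id), entry);
        }
        return out;
    }
}
```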
    • getElementCountForSharedData

      int getElementCountForSharedData()
    • getXmiId

      public String getXmiId(TOP fs)
      Get the XMI ID to use for an FS.
      Parameters:
      fs - the FS
      Returns:
      XMI ID or null
    • getXmiIdAsInt

      public int getXmiIdAsInt(TOP fs)
    • getNameSpacePrefix

      public String getNameSpacePrefix(String uimaTypeName, String nsUri, int lastDotIndex)
    • getUniqueString

      public String getUniqueString(String s)
    • getTypeNameFromXmlElementName

      public String getTypeNameFromXmlElementName(XmlElementName xe)
    • isStaticMultiRef

      public boolean isStaticMultiRef(FeatureImpl fi)