Package com.fasterxml.aalto.in
Class ByteBasedPNameTable
java.lang.Object
com.fasterxml.aalto.util.NameTable
com.fasterxml.aalto.in.ByteBasedPNameTable
This is a symbol table implementation used for storing byte-based
PNames
, specifically, instances of (ByteBasedPName
).-
Nested Class Summary
Nested Classes -
Field Summary
FieldsModifier and TypeFieldDescription(package private) static final int
(package private) static final int
Bucket index is 8 bits, and value 0 is reserved to represent 'empty' status.private int
Total number of PNames in collision buckets (included inmCount
along with primary entries)private int
Index of the first unused collision bucket entry (== size of the used portion of collision list): less than or equal to 0xFF (255), since max number of entries is 255 (8-bit, minus 0 used as 'empty' marker)private ByteBasedPNameTable.Bucket[]
Array of heads of collision bucket chains; size dynamicallyprivate boolean
Flag that indicates whether underlying data structures for the collision list are shared or not.private int
Total number of PNames in the symbol table(package private) static final int
private int[]
Array of 2^N size, which contains combination of 24-bits of hash (0 to indicate 'empty' slot), and 8-bit collision bucket index (0 to indicate empty collision bucket chain; otherwise subtract one from index)private int
Mask used to truncate 32-bit hash value to current hash array size; essentially, hash array size - 1 (since hash array sizes are 2^N).private boolean
Flag that indicates whether underlying data structures for the main hash area are shared or not.private ByteBasedPName[]
Array that containsPName
instances matching entries inmMainHash
.private boolean
private boolean
This flag is set if, after adding a new entry, it is deemed that a rehash is warranted if any more entries are to be added. -
Constructor Summary
ConstructorsConstructorDescriptionByteBasedPNameTable
(int hashSize) Constructor used when creating a child instance -
Method Summary
Modifier and TypeMethodDescriptionstatic final int
calcHash
(int firstQuad) static final int
calcHash
(int[] quads, int qlen) static final int
calcHash
(int firstQuad, int secondQuad) static int[]
calcQuads
(byte[] wordBytes) private void
doAddSymbol
(int hash, ByteBasedPName symbol) private void
private int
Method called to find the best bucket to spill a PName over to: usually the first bucket that has only one entry, but in general first one of the buckets with least number of entriesfindSymbol
(int hash, int[] quads, int qlen) Finds and returns name matching the specified symbol, if such name already exists in the table; or if not, creates name object, adds to the table, and returns it.findSymbol
(int hash, int firstQuad, int secondQuad) Finds and returns name matching the specified symbol, if such name already exists in the table; or if not, creates name object, adds to the table, and returns it.void
boolean
Method called to check to quickly see if a child symbol table may have gotten additional entries.boolean
void
nuke()
Method used by test code, to reset state of the name table.private void
rehash()
int
size()
toString()
private void
private void
Method that needs to be called, if the main hash structure is (may be) shared.private void
-
Field Details
-
MIN_HASH_SIZE
static final int MIN_HASH_SIZE- See Also:
-
INITIAL_COLLISION_LEN
static final int INITIAL_COLLISION_LEN- See Also:
-
LAST_VALID_BUCKET
static final int LAST_VALID_BUCKETBucket index is 8 bits, and value 0 is reserved to represent 'empty' status.- See Also:
-
mCount
private int mCountTotal number of PNames in the symbol table -
mMainHashMask
private int mMainHashMaskMask used to truncate 32-bit hash value to current hash array size; essentially, hash array size - 1 (since hash array sizes are 2^N). -
mMainHash
private int[] mMainHashArray of 2^N size, which contains combination of 24-bits of hash (0 to indicate 'empty' slot), and 8-bit collision bucket index (0 to indicate empty collision bucket chain; otherwise subtract one from index) -
mMainNames
Array that containsPName
instances matching entries inmMainHash
. Contains nulls for unused entries. -
mCollList
Array of heads of collision bucket chains; size dynamically -
mCollCount
private int mCollCountTotal number of PNames in collision buckets (included inmCount
along with primary entries) -
mCollEnd
private int mCollEndIndex of the first unused collision bucket entry (== size of the used portion of collision list): less than or equal to 0xFF (255), since max number of entries is 255 (8-bit, minus 0 used as 'empty' marker) -
mNeedRehash
private transient boolean mNeedRehashThis flag is set if, after adding a new entry, it is deemed that a rehash is warranted if any more entries are to be added.
-
-
Constructor Details
-
ByteBasedPNameTable
public ByteBasedPNameTable(int hashSize) -
ByteBasedPNameTable
ByteBasedPNameTable(ByteBasedPNameTable parent) Constructor used when creating a child instance
-
-
Method Details
-
mergeFromChild
-
nuke
public void nuke()Method used by test code, to reset state of the name table. -
size
public int size() -
maybeDirty
public boolean maybeDirty()Method called to check to quickly see if a child symbol table may have gotten additional entries. Used for checking to see if a child table should be merged into shared table.- Specified by:
maybeDirty
in classNameTable
-
findSymbol
Finds and returns name matching the specified symbol, if such name already exists in the table; or if not, creates name object, adds to the table, and returns it.Note: separate methods to optimize common case of relatively short element/attribute names (8 or less ascii characters)
- Parameters:
firstQuad
- int32 containing first 4 bytes of the pname; if the whole name less than 4 bytes, padded with zero bytes in front (zero MSBs, ie. right aligned)secondQuad
- int32 containing bytes 5 through 8 of the pname; if less than 8 bytes, padded with up to 4 zero bytes in front (zero MSBs, ie. right aligned)- Returns:
- PName matching the symbol passed (or constructed for it)
-
findSymbol
Finds and returns name matching the specified symbol, if such name already exists in the table; or if not, creates name object, adds to the table, and returns it.Note: this is the general purpose method that can be called for names of any length. However, if name is less than 9 bytes long, it is preferable to call the version optimized for short names.
- Parameters:
quads
- Array of int32s, each of which contain 4 bytes of encoded nameqlen
- Number of int32s, starting from index 0, in quads parameter- Returns:
- PName matching the symbol passed (or constructed for it)
-
addSymbol
public ByteBasedPName addSymbol(int hash, String symbolStr, int colonIx, int firstQuad, int secondQuad) -
addSymbol
-
calcHash
public static final int calcHash(int firstQuad) -
calcHash
public static final int calcHash(int firstQuad, int secondQuad) -
calcHash
public static final int calcHash(int[] quads, int qlen) -
calcQuads
public static int[] calcQuads(byte[] wordBytes) -
toString
-
doAddSymbol
-
rehash
private void rehash() -
findBestBucket
private int findBestBucket()Method called to find the best bucket to spill a PName over to: usually the first bucket that has only one entry, but in general first one of the buckets with least number of entries -
expandCollision
private void expandCollision()
-