org.apache.xml.utils
public class FastStringBuffer extends Object
Note that Stree and DTM used a single FastStringBuffer as a string pool, by recording start and length indices within this single buffer. This minimizes heap overhead, but of course requires more work when retrieving the data.
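The string-pool pattern described above can be sketched as follows. This is an illustration only, not the Stree or DTM code: the class `StringPoolSketch` and its methods are hypothetical, and a `StringBuilder` stands in for the shared FastStringBuffer.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical string pool: one shared buffer plus (start, length) pairs. */
final class StringPoolSketch {
    // Stand-in for the single shared FastStringBuffer (assumption for illustration).
    private final StringBuilder pool = new StringBuilder();
    // Each entry is {start, length} within the shared buffer.
    private final List<int[]> entries = new ArrayList<>();

    /** Appends text to the shared buffer and returns a handle for later retrieval. */
    int add(String text) {
        entries.add(new int[] { pool.length(), text.length() });
        pool.append(text);
        return entries.size() - 1;
    }

    /** Rebuilds the String on demand; this copy is the extra work paid at retrieval time. */
    String get(int handle) {
        int[] e = entries.get(handle);
        return pool.substring(e[0], e[0] + e[1]);
    }

    public static void main(String[] args) {
        StringPoolSketch p = new StringPoolSketch();
        int a = p.add("hello");
        int b = p.add("world");
        System.out.println(p.get(a) + " " + p.get(b));
    }
}
```

Only two ints are stored per pooled string, which is where the heap saving comes from; the cost is a fresh String allocation on every `get`.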
FastStringBuffer operates as a "chunked buffer". Doing so reduces the need to recopy existing information when an append exceeds the space available; we just allocate another chunk and flow across to it. (The array of chunks may need to grow, admittedly, but that's a much smaller object.) Some excess recopying may arise when we extract Strings which cross chunk boundaries; larger chunks make that less frequent.
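A minimal sketch of the chunked-buffer idea, assuming a single fixed chunk size. The class `ChunkedBufferSketch` is hypothetical and much simpler than the real implementation, which varies chunk sizes; it only shows that growth copies the small array of chunk references rather than the character data.

```java
import java.util.Arrays;

/** Minimal chunked character buffer; an illustration, not the Xalan implementation. */
final class ChunkedBufferSketch {
    private final int chunkBits;
    private final int chunkSize;
    private final int chunkMask;
    private char[][] chunks = new char[2][]; // small array of chunk references
    private int length = 0;

    ChunkedBufferSketch(int chunkBits) {
        this.chunkBits = chunkBits;
        this.chunkSize = 1 << chunkBits;
        this.chunkMask = chunkSize - 1;
    }

    void append(char c) {
        int chunkIndex = length >>> chunkBits;
        if (chunkIndex == chunks.length) {
            // Only the array of references is recopied, never the characters.
            chunks = Arrays.copyOf(chunks, chunks.length * 2);
        }
        if (chunks[chunkIndex] == null) {
            // Allocate another chunk and flow across to it.
            chunks[chunkIndex] = new char[chunkSize];
        }
        chunks[chunkIndex][length & chunkMask] = c;
        length++;
    }

    char charAt(int pos) {
        if (pos < 0 || pos >= length) {
            throw new IndexOutOfBoundsException("pos=" + pos);
        }
        return chunks[pos >>> chunkBits][pos & chunkMask];
    }

    int length() {
        return length;
    }

    /** A range that crosses a chunk boundary has to be gathered piece by piece. */
    String getString(int start, int len) {
        StringBuilder sb = new StringBuilder(len);
        for (int i = start; i < start + len; i++) {
            sb.append(charAt(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ChunkedBufferSketch buf = new ChunkedBufferSketch(2); // 4 characters per chunk
        for (char c : "hello world!".toCharArray()) {
            buf.append(c);
        }
        System.out.println(buf.getString(3, 5));
    }
}
```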
The size values are parameterized, to allow tuning this code. In theory, Result Tree Fragments might want to be tuned differently from the main document's text.
%REVIEW% An experiment in self-tuning is included in the code (using nested FastStringBuffers to achieve variation in chunk sizes), but this implementation has proven problematic when data is copied from the FSB into itself. We should either re-architect that to make it safe (if possible) or remove that code and clean up for performance and maintainability reasons.
Field Summary | |
---|---|
static int | SUPPRESS_BOTH Manifest constant: Suppress both leading and trailing whitespace. |
static int | SUPPRESS_LEADING_WS Manifest constant: Suppress leading whitespace. |
static int | SUPPRESS_TRAILING_WS Manifest constant: Suppress trailing whitespace. |
Constructor Summary | |
---|---|
FastStringBuffer(int initChunkBits, int maxChunkBits, int rebundleBits) Construct a FastStringBuffer, with allocation policy as per parameters. | |
FastStringBuffer(int initChunkBits, int maxChunkBits) Construct a FastStringBuffer, using a default rebundleBits value. | |
FastStringBuffer(int initChunkBits) Construct a FastStringBuffer, using default maxChunkBits and rebundleBits values. | |
FastStringBuffer() Construct a FastStringBuffer, using a default allocation policy. | |
Method Summary | |
---|---|
void | append(char value) Append a single character onto the FastStringBuffer, growing the storage if necessary. |
void | append(String value) Append the contents of a String onto the FastStringBuffer, growing the storage if necessary. |
void | append(StringBuffer value) Append the contents of a StringBuffer onto the FastStringBuffer, growing the storage if necessary. |
void | append(char[] chars, int start, int length) Append part of the contents of a character array onto the FastStringBuffer, growing the storage if necessary. |
void | append(FastStringBuffer value) Append the contents of another FastStringBuffer onto this FastStringBuffer, growing the storage if necessary. |
char | charAt(int pos) Get a single character from the string buffer. |
String | getString(int start, int length) |
boolean | isWhitespace(int start, int length) |
int | length() Get the number of characters in the FastStringBuffer's content. |
void | reset() Discard the content of the FastStringBuffer, and most of the memory that was allocated by it, restoring the initial state. |
int | sendNormalizedSAXcharacters(ContentHandler ch, int start, int length) Sends the specified range of characters as one or more SAX characters() events, normalizing the characters according to XSLT rules. |
static void | sendNormalizedSAXcharacters(char[] ch, int start, int length, ContentHandler handler) Directly normalize and dispatch the character array. |
void | sendSAXcharacters(ContentHandler ch, int start, int length) Sends the specified range of characters as one or more SAX characters() events. |
void | sendSAXComment(LexicalHandler ch, int start, int length) Sends the specified range of characters as a SAX comment event. |
void | setLength(int l) Directly set how much of the FastStringBuffer's storage is to be considered part of its content. |
int | size() Get the number of characters in the FastStringBuffer's content. |
String | toString() Note that this operation has been somewhat deoptimized by the shift to a chunked array, as there is no factory method to produce a String object directly from an array of arrays and hence a double copy is needed. |
See Also: FastStringBuffer
For coding convenience, I've expressed both allocation sizes in terms of a number of bits. That's needed for the final size of a chunk, to permit fast and efficient shift-and-mask addressing. It's less critical for the initial size, and may be reconsidered.
An alternative would be to accept integer sizes and round to powers of two; that really doesn't seem to buy us much, if anything.
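The arithmetic behind sizes expressed as bit counts can be sketched as below. The helper names are hypothetical and not part of FastStringBuffer; `bitsFor` illustrates the alternative mentioned above of accepting an integer size and rounding it up to a power of two.

```java
/** Illustrative helpers for power-of-two chunk addressing (names are assumptions). */
final class ChunkAddressingSketch {
    /** Chunk size in characters for a given bit count, e.g. 10 bits gives 1024. */
    static int chunkSize(int bits) {
        return 1 << bits;
    }

    /** Which chunk holds character position pos; equivalent to pos / chunkSize(bits). */
    static int chunkIndex(int pos, int bits) {
        return pos >>> bits;
    }

    /** Offset of pos inside its chunk; equivalent to pos % chunkSize(bits). */
    static int offsetInChunk(int pos, int bits) {
        return pos & ((1 << bits) - 1);
    }

    /** Smallest bit count whose chunk size is at least requestedSize. */
    static int bitsFor(int requestedSize) {
        if (requestedSize <= 1) {
            return 0;
        }
        return 32 - Integer.numberOfLeadingZeros(requestedSize - 1);
    }

    public static void main(String[] args) {
        int bits = bitsFor(1000);
        System.out.println(bits + " bits, chunk size " + chunkSize(bits));
        System.out.println("pos 2500 is in chunk " + chunkIndex(2500, bits)
                + " at offset " + offsetInChunk(2500, bits));
    }
}
```

The shift and mask only agree with division and remainder because the chunk size is a power of two, which is why the final chunk size must be given in bits.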
Parameters:
initChunkBits - Length in characters of the initial allocation of a chunk, expressed in log-base-2. (That is, 10 means allocate 1024 characters.) Later chunks will use larger allocation units, to trade off allocation speed of large documents against storage efficiency of small ones.
maxChunkBits - Number of character-offset bits that should be used for addressing within a chunk. Maximum length of a chunk is 2^maxChunkBits characters.
rebundleBits - Number of character-offset bits that addressing should advance before we attempt to take a step from initChunkBits to maxChunkBits.
ISSUE: Should this call assert initial size, or fixed size? Now configured as initial, with a default for fixed.
NOTE THAT after calling append(), previously obtained references to m_array[][] may no longer be valid.... though in fact they should be in this instance.
Parameters: value character to be appended.
NOTE THAT after calling append(), previously obtained references to m_array[] may no longer be valid.
Parameters: value String whose contents are to be appended.
NOTE THAT after calling append(), previously obtained references to m_array[] may no longer be valid.
Parameters: value StringBuffer whose contents are to be appended.
NOTE THAT after calling append(), previously obtained references to m_array[] may no longer be valid.
Parameters:
chars - character array from which data is to be copied.
start - offset in chars of first character to be copied, zero-based.
length - number of characters to be copied.
NOTE THAT after calling append(), previously obtained references to m_array[] may no longer be valid.
Parameters: value FastStringBuffer whose contents are to be appended.
Parameters: pos character position requested.
Returns: A character from the requested position.
Parameters:
start - Offset of first character in the range.
length - Number of characters to extract.
Returns: a new String object initialized from the specified range of characters.
Parameters:
start - Offset of first character in the range.
length - Number of characters to check.
Returns: true if the specified range of characters are all whitespace,
as defined by XMLCharacterRecognizer.
CURRENTLY DOES NOT CHECK FOR OUT-OF-RANGE.
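A sketch of a range whitespace test that does validate its arguments. It assumes XMLCharacterRecognizer treats the four XML whitespace characters (space, tab, line feed, carriage return) as whitespace; the class `WhitespaceRangeSketch` is hypothetical and works on any CharSequence rather than on the FastStringBuffer itself.

```java
/** Illustrative range check over a CharSequence; not the FastStringBuffer code. */
final class WhitespaceRangeSketch {
    /** XML whitespace per the XML S production: space, tab, LF, CR. */
    static boolean isXmlWhitespace(char c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\r';
    }

    /** True if every character in [start, start + length) is XML whitespace. */
    static boolean isWhitespace(CharSequence text, int start, int length) {
        // Explicit bounds check, which the documented method currently omits.
        if (start < 0 || length < 0 || start > text.length() - length) {
            throw new IndexOutOfBoundsException("start=" + start + ", length=" + length);
        }
        for (int i = start; i < start + length; i++) {
            if (!isXmlWhitespace(text.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isWhitespace(" \t\n x", 0, 4));
        System.out.println(isWhitespace(" \t\n x", 0, 5));
    }
}
```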
Returns: the number of characters in the FastStringBuffer's content.
Parameters:
ch - SAX ContentHandler object to receive the event.
start - Offset of first character in the range.
length - Number of characters to send.
Returns: normalization status to apply to the next chunk (because we may have been called recursively to process an inner FSB).
Throws: org.xml.sax.SAXException may be thrown by handler's characters() method.
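The kind of normalization involved can be sketched as below, assuming the XSLT rules in question match XPath normalize-space(): leading and trailing whitespace is dropped and each internal run of whitespace collapses to a single space. The class `NormalizeSpaceSketch` is hypothetical; it works on a whole string and ignores the cross-chunk status that the return value above carries.

```java
/** Illustrative whitespace normalization over a whole string (assumed rules). */
final class NormalizeSpaceSketch {
    static String normalize(CharSequence in) {
        StringBuilder out = new StringBuilder(in.length());
        boolean pendingSpace = false;
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
                // Remember one space, but only once some text has been emitted,
                // so leading whitespace is never written.
                pendingSpace = out.length() > 0;
            } else {
                if (pendingSpace) {
                    out.append(' ');
                }
                pendingSpace = false;
                out.append(c);
            }
        }
        // A space still pending at the end is trailing whitespace and is dropped.
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println("[" + normalize("  a \t b\n\nc  ") + "]");
    }
}
```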
Parameters:
ch - The characters from the XML document.
start - The start position in the array.
length - The number of characters to read from the array.
handler - SAX ContentHandler object to receive the event.
Throws: org.xml.sax.SAXException Any SAX exception, possibly wrapping another exception.
Note too that there is no promise that the output will be sent as a single call. As is always true in SAX, one logical string may be split across multiple blocks of memory and hence delivered as several successive events.
Parameters:
ch - SAX ContentHandler object to receive the event.
start - Offset of first character in the range.
length - Number of characters to send.
Throws: org.xml.sax.SAXException may be thrown by handler's characters() method.
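Because one logical string may arrive as several characters() events, a receiving handler should accumulate rather than assume a single call. The sketch below uses the standard org.xml.sax API; the class `CollectingHandlerSketch` and the `sendInChunks` helper are hypothetical and merely simulate a sender that splits text at chunk boundaries.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Accumulates text from however many characters() events are delivered. */
final class CollectingHandlerSketch extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private int events = 0;

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        text.append(ch, start, length);
        events++;
    }

    String text() {
        return text.toString();
    }

    int events() {
        return events;
    }

    /** Simulated sender: delivers s in pieces of at most chunkSize characters. */
    static void sendInChunks(String s, int chunkSize, ContentHandler handler)
            throws SAXException {
        char[] all = s.toCharArray();
        for (int pos = 0; pos < all.length; pos += chunkSize) {
            handler.characters(all, pos, Math.min(chunkSize, all.length - pos));
        }
    }

    public static void main(String[] args) throws SAXException {
        CollectingHandlerSketch h = new CollectingHandlerSketch();
        sendInChunks("hello world", 4, h);
        System.out.println(h.events() + " events: " + h.text());
    }
}
```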
Note that, unlike sendSAXcharacters, this has to be done as a single call to LexicalHandler#comment.
Parameters:
ch - SAX LexicalHandler object to receive the event.
start - Offset of first character in the range.
length - Number of characters to send.
Throws: org.xml.sax.SAXException may be thrown by handler's comment() method.
Parameters: l New length. If l<0 or l>=length(), this operation will not report an error but future operations will almost certainly fail.
Returns: the number of characters in the FastStringBuffer's content.
(It really is a pity that Java didn't design String as a final subclass of MutableString, rather than having StringBuffer be a separate hierarchy. We'd avoid a lot of double-buffering.)
Returns: the contents of the FastStringBuffer as a standard Java string.
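The double copy mentioned for toString() can be sketched as follows. The class `ChunksToStringSketch` is hypothetical and assumes the content is held as an array of equally usable char[] chunks; it is not the actual FastStringBuffer code.

```java
/** Illustrates why turning an array of chunks into a String costs two copies. */
final class ChunksToStringSketch {
    static String join(char[][] chunks, int totalLength) {
        // First copy: gather the chunks into one contiguous array.
        char[] flat = new char[totalLength];
        int pos = 0;
        for (char[] chunk : chunks) {
            int n = Math.min(chunk.length, totalLength - pos);
            if (n <= 0) {
                break;
            }
            System.arraycopy(chunk, 0, flat, pos, n);
            pos += n;
        }
        // Second copy: the String constructor copies the array so the String stays immutable.
        return new String(flat);
    }

    public static void main(String[] args) {
        char[][] chunks = {
            "abcd".toCharArray(), "efgh".toCharArray(), { 'i', 'j', 0, 0 }
        };
        System.out.println(join(chunks, 10));
    }
}
```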