private static final class TextFormat.Tokenizer
extends java.lang.Object
Represents a stream of tokens parsed from a String.
The Java standard library provides many classes that you might think would be useful for implementing this, but aren't. For example:
java.io.StreamTokenizer: This almost does what we want (or, at least, something that would get us close to what we want), except for one fatal flaw: It automatically un-escapes strings using Java escape sequences, which do not include all the escape sequences we need to support (e.g. '\x').
java.util.Scanner: This seems like a great way at least to parse regular expressions out of a stream (so we wouldn't have to load the entire input into a single string before parsing). Sadly, Scanner requires that tokens be delimited with some delimiter. Thus, although the text "foo:" should parse to two tokens ("foo" and ":"), Scanner would recognize it only as a single token. Furthermore, Scanner provides no way to inspect the contents of delimiters, making it impossible to keep track of line and column numbers.
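The Scanner limitation described above can be reproduced with a few lines of standard Java. This is only an illustrative sketch: the class name ScannerDelimiterDemo and the helper scan are hypothetical and not part of TextFormat.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class ScannerDelimiterDemo {
    /** Collects the tokens Scanner produces with its default whitespace delimiter. */
    static List<String> scan(String input) {
        List<String> tokens = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            while (scanner.hasNext()) {
                tokens.add(scanner.next());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // "foo:" comes back as one token because nothing delimits "foo" from ":".
        System.out.println(scan("foo: 1")); // prints [foo:, 1]
    }
}
```

The whitespace between the tokens is discarded by Scanner, so a caller cannot count the newlines it skipped.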
Luckily, Java's regular expression support does manage to be useful to us. (Barely: We need Matcher.usePattern(), which is new in Java 1.5.) So, we can use that, at least. Unfortunately, this implies that we need to have the entire input in one contiguous string.
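As a rough illustration of why Matcher.usePattern() matters, the sketch below keeps one Matcher over the whole input and switches it between a whitespace pattern and a token pattern. The two patterns here are simplified stand-ins chosen for the example; the actual WHITESPACE and TOKEN patterns of this class are not shown on this page, and the class name UsePatternSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UsePatternSketch {
    // Simplified stand-in patterns (assumptions, not the real ones).
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");
    private static final Pattern TOKEN =
        Pattern.compile("[A-Za-z_][A-Za-z0-9_]*|[0-9]+|\\S");

    static List<String> tokenize(CharSequence text) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = WHITESPACE.matcher(text);
        int pos = 0;
        while (pos < text.length()) {
            // Restrict matching to the unread part of the input.
            matcher.region(pos, text.length());
            matcher.usePattern(WHITESPACE);
            if (matcher.lookingAt()) {
                pos = matcher.end(); // skip whitespace, then look again
                continue;
            }
            // Same matcher, same region, different pattern.
            matcher.usePattern(TOKEN);
            if (!matcher.lookingAt()) {
                throw new IllegalStateException("No token at offset " + pos);
            }
            tokens.add(matcher.group());
            pos = matcher.end();
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("foo: 12 bar")); // prints [foo, :, 12, bar]
    }
}
```

Because the matcher works on a CharSequence, the whole input has to be in memory, which is the trade-off the paragraph above mentions.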
Modifier and Type | Field and Description |
---|---|
private int | column |
private java.lang.String | currentToken |
private static java.util.regex.Pattern | DOUBLE_INFINITY |
private static java.util.regex.Pattern | FLOAT_INFINITY |
private static java.util.regex.Pattern | FLOAT_NAN |
private int | line |
private java.util.regex.Matcher | matcher |
private int | pos |
private int | previousColumn |
private int | previousLine |
private java.lang.CharSequence | text |
private static java.util.regex.Pattern | TOKEN |
private static java.util.regex.Pattern | WHITESPACE |
Modifier | Constructor and Description |
---|---|
private | Tokenizer(java.lang.CharSequence text) Construct a tokenizer that parses tokens from the given text. |
Modifier and Type | Method and Description |
---|---|
boolean | atEnd() Are we at the end of the input? |
void | consume(java.lang.String token) If the next token exactly matches token, consume it. |
boolean | consumeBoolean() If the next token is a boolean, consume it and return its value. |
ByteString | consumeByteString() If the next token is a string, consume it, unescape it as a ByteString, and return it. |
private void | consumeByteString(java.util.List<ByteString> list) Like consumeByteString() but adds each token of the string to the given list. |
double | consumeDouble() If the next token is a double, consume it and return its value. |
float | consumeFloat() If the next token is a float, consume it and return its value. |
java.lang.String | consumeIdentifier() If the next token is an identifier, consume it and return its value. |
int | consumeInt32() If the next token is a 32-bit signed integer, consume it and return its value. |
long | consumeInt64() If the next token is a 64-bit signed integer, consume it and return its value. |
java.lang.String | consumeString() If the next token is a string, consume it and return its (unescaped) value. |
int | consumeUInt32() If the next token is a 32-bit unsigned integer, consume it and return its value. |
long | consumeUInt64() If the next token is a 64-bit unsigned integer, consume it and return its value. |
private TextFormat.ParseException | floatParseException(java.lang.NumberFormatException e) Constructs an appropriate TextFormat.ParseException for the given NumberFormatException when trying to parse a float or double. |
(package private) int | getColumn() |
(package private) int | getLine() |
(package private) int | getPreviousColumn() |
(package private) int | getPreviousLine() |
private TextFormat.ParseException | integerParseException(java.lang.NumberFormatException e) Constructs an appropriate TextFormat.ParseException for the given NumberFormatException when trying to parse an integer. |
boolean | lookingAt(java.lang.String text) Returns true if the current token's text is equal to that specified. |
boolean | lookingAtInteger() Returns true if the next token is an integer, but does not consume it. |
void | nextToken() Advance to the next token. |
TextFormat.ParseException | parseException(java.lang.String description) Returns a TextFormat.ParseException with the current line and column numbers in the description, suitable for throwing. |
TextFormat.ParseException | parseExceptionPreviousToken(java.lang.String description) Returns a TextFormat.ParseException with the line and column numbers of the previous token in the description, suitable for throwing. |
private void | skipWhitespace() Skip over any whitespace so that the matcher region starts at the next token. |
boolean | tryConsume(java.lang.String token) If the next token exactly matches token, consume it and return true. |
boolean | tryConsumeDouble() If the next token is a double, consume it and return true. |
boolean | tryConsumeFloat() If the next token is a float, consume it and return true. |
boolean | tryConsumeIdentifier() If the next token is an identifier, consume it and return true. |
boolean | tryConsumeInt64() If the next token is a 64-bit signed integer, consume it and return true. |
boolean | tryConsumeString() If the next token is a string, consume it and return true. |
boolean | tryConsumeUInt64() If the next token is a 64-bit unsigned integer, consume it and return true. |
TextFormat.UnknownFieldParseException | unknownFieldParseExceptionPreviousToken(java.lang.String unknownField, java.lang.String description) Returns a TextFormat.UnknownFieldParseException with the line and column numbers of the previous token in the description, and the unknown field name, suitable for throwing. |
private final java.lang.CharSequence text
private final java.util.regex.Matcher matcher
private java.lang.String currentToken
private int pos
private int line
private int column
private int previousLine
private int previousColumn
private static final java.util.regex.Pattern WHITESPACE
private static final java.util.regex.Pattern TOKEN
private static final java.util.regex.Pattern DOUBLE_INFINITY
private static final java.util.regex.Pattern FLOAT_INFINITY
private static final java.util.regex.Pattern FLOAT_NAN
private Tokenizer(java.lang.CharSequence text)
int getPreviousLine()
int getPreviousColumn()
int getLine()
int getColumn()
public boolean atEnd()
public void nextToken()
private void skipWhitespace()
public boolean tryConsume(java.lang.String token)
If the next token exactly matches token, consume it and return true. Otherwise, return false without doing anything.
public void consume(java.lang.String token) throws TextFormat.ParseException
If the next token exactly matches token, consume it. Otherwise, throw a TextFormat.ParseException.
Throws: TextFormat.ParseException
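The difference between the "try" form and the throwing form can be sketched as follows. Tokenizer itself is private, so this is a hypothetical stand-in (ConsumeSketch) over a pre-split token list; it uses IllegalStateException where the real class throws TextFormat.ParseException, and it assumes an empty string marks the end of input.

```java
import java.util.Arrays;
import java.util.Iterator;

public class ConsumeSketch {
    private final Iterator<String> rest;
    // Assumption of this sketch: "" means there are no more tokens.
    private String currentToken;

    ConsumeSketch(String... tokens) {
        rest = Arrays.asList(tokens).iterator();
        nextToken();
    }

    void nextToken() {
        currentToken = rest.hasNext() ? rest.next() : "";
    }

    /** Consumes the token and returns true on a match; otherwise leaves state untouched. */
    boolean tryConsume(String token) {
        if (currentToken.equals(token)) {
            nextToken();
            return true;
        }
        return false;
    }

    /** Same check, but a mismatch is an error rather than a false result. */
    void consume(String token) {
        if (!tryConsume(token)) {
            throw new IllegalStateException("Expected \"" + token + "\".");
        }
    }

    public static void main(String[] args) {
        ConsumeSketch t = new ConsumeSketch("foo", ":");
        System.out.println(t.tryConsume(":"));   // prints false, nothing consumed
        System.out.println(t.tryConsume("foo")); // prints true
        t.consume(":");                          // matches, so no exception
    }
}
```

Building consume on top of tryConsume keeps the matching logic in one place; whether the real class is structured this way is not stated on this page.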
public boolean lookingAtInteger()
Returns true if the next token is an integer, but does not consume it.
public boolean lookingAt(java.lang.String text)
Returns true if the current token's text is equal to that specified.
public java.lang.String consumeIdentifier() throws TextFormat.ParseException
If the next token is an identifier, consume it and return its value. Otherwise, throw a TextFormat.ParseException.
Throws: TextFormat.ParseException
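public boolean tryConsumeIdentifier()
If the next token is an identifier, consume it and return true. Otherwise, return false without doing anything.
public int consumeInt32() throws TextFormat.ParseException
If the next token is a 32-bit signed integer, consume it and return its value. Otherwise, throw a TextFormat.ParseException.
Throws: TextFormat.ParseException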
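public int consumeUInt32() throws TextFormat.ParseException
If the next token is a 32-bit unsigned integer, consume it and return its value. Otherwise, throw a TextFormat.ParseException.
Throws: TextFormat.ParseException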
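Since consumeUInt32() returns a Java int, which is signed, values above Integer.MAX_VALUE presumably come back as negative numbers carrying the same 32 bits. The sketch below shows that representation using only the standard library (Java 8 or later); it is an assumption about how callers would interpret the result, not a copy of the tokenizer's own parsing code, and the names UnsignedIntSketch and parseUInt32 are hypothetical.

```java
public class UnsignedIntSketch {
    /** Parses a decimal unsigned 32-bit value into the bits of an int. */
    static int parseUInt32(String token) {
        // Throws NumberFormatException for values outside 0..4294967295.
        return Integer.parseUnsignedInt(token);
    }

    public static void main(String[] args) {
        int bits = parseUInt32("4294967295");
        System.out.println(bits);                           // prints -1
        System.out.println(Integer.toUnsignedString(bits)); // prints 4294967295
        System.out.println(Integer.toUnsignedLong(bits));   // prints 4294967295
    }
}
```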
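public long consumeInt64() throws TextFormat.ParseException
If the next token is a 64-bit signed integer, consume it and return its value. Otherwise, throw a TextFormat.ParseException.
Throws: TextFormat.ParseException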
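public boolean tryConsumeInt64()
If the next token is a 64-bit signed integer, consume it and return true. Otherwise, return false without doing anything.
public long consumeUInt64() throws TextFormat.ParseException
If the next token is a 64-bit unsigned integer, consume it and return its value. Otherwise, throw a TextFormat.ParseException.
Throws: TextFormat.ParseException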
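public boolean tryConsumeUInt64()
If the next token is a 64-bit unsigned integer, consume it and return true. Otherwise, return false without doing anything.
public double consumeDouble() throws TextFormat.ParseException
If the next token is a double, consume it and return its value. Otherwise, throw a TextFormat.ParseException.
Throws: TextFormat.ParseException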
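public boolean tryConsumeDouble()
If the next token is a double, consume it and return true. Otherwise, return false without doing anything.
public float consumeFloat() throws TextFormat.ParseException
If the next token is a float, consume it and return its value. Otherwise, throw a TextFormat.ParseException.
Throws: TextFormat.ParseException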
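public boolean tryConsumeFloat()
If the next token is a float, consume it and return true. Otherwise, return false without doing anything.
public boolean consumeBoolean() throws TextFormat.ParseException
If the next token is a boolean, consume it and return its value. Otherwise, throw a TextFormat.ParseException.
Throws: TextFormat.ParseException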
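public java.lang.String consumeString() throws TextFormat.ParseException
If the next token is a string, consume it and return its (unescaped) value. Otherwise, throw a TextFormat.ParseException.
Throws: TextFormat.ParseException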
public boolean tryConsumeString()
If the next token is a string, consume it and return true.
public ByteString consumeByteString() throws TextFormat.ParseException
If the next token is a string, consume it, unescape it as a ByteString, and return it. Otherwise, throw a TextFormat.ParseException.
Throws: TextFormat.ParseException
private void consumeByteString(java.util.List<ByteString> list) throws TextFormat.ParseException
Like consumeByteString() but adds each token of the string to the given list. String literals (whether bytes or text) may come in multiple adjacent tokens which are automatically concatenated, like in C or Python.
Throws: TextFormat.ParseException
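The concatenation of adjacent string literals can be illustrated with a small standalone sketch. It is deliberately simplified: it handles only double-quoted literals without escape sequences and returns a String rather than a ByteString. The class AdjacentStringsSketch and its QUOTED pattern are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AdjacentStringsSketch {
    // Optional leading whitespace, then a double-quoted literal with no escapes.
    private static final Pattern QUOTED = Pattern.compile("\\s*\"([^\"\\\\]*)\"");

    /** Joins the contents of consecutive quoted literals at the start of the input. */
    static String concatAdjacent(String input) {
        StringBuilder joined = new StringBuilder();
        Matcher m = QUOTED.matcher(input);
        int pos = 0;
        // Keep consuming literals for as long as another one follows directly.
        while (m.region(pos, input.length()).lookingAt()) {
            joined.append(m.group(1));
            pos = m.end();
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        System.out.println(concatAdjacent("\"foo\" \"bar\"")); // prints foobar
    }
}
```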
public TextFormat.ParseException parseException(java.lang.String description)
Returns a TextFormat.ParseException with the current line and column numbers in the description, suitable for throwing.
public TextFormat.ParseException parseExceptionPreviousToken(java.lang.String description)
Returns a TextFormat.ParseException with the line and column numbers of the previous token in the description, suitable for throwing.
private TextFormat.ParseException integerParseException(java.lang.NumberFormatException e)
Constructs an appropriate TextFormat.ParseException for the given NumberFormatException when trying to parse an integer.
private TextFormat.ParseException floatParseException(java.lang.NumberFormatException e)
Constructs an appropriate TextFormat.ParseException for the given NumberFormatException when trying to parse a float or double.
public TextFormat.UnknownFieldParseException unknownFieldParseExceptionPreviousToken(java.lang.String unknownField, java.lang.String description)
Returns a TextFormat.UnknownFieldParseException with the line and column numbers of the previous token in the description, and the unknown field name, suitable for throwing.
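To make the line and column bookkeeping behind these exception helpers concrete, here is a minimal sketch of one way it could work. It assumes zero-based counters that are advanced over every consumed character and a one-based "line:column: description" message format; both are assumptions for illustration, and LineColumnSketch, advance, and describe are hypothetical names.

```java
public class LineColumnSketch {
    // Zero-based position of the next unread character (assumption of this sketch).
    int line = 0;
    int column = 0;

    /** Updates line and column for the characters in text[from, to). */
    void advance(CharSequence text, int from, int to) {
        for (int i = from; i < to; i++) {
            if (text.charAt(i) == '\n') {
                line++;
                column = 0;
            } else {
                column++;
            }
        }
    }

    /** Prefixes a description with the one-based line and column. */
    String describe(String description) {
        return (line + 1) + ":" + (column + 1) + ": " + description;
    }

    public static void main(String[] args) {
        LineColumnSketch tracker = new LineColumnSketch();
        String text = "a: 1\nb ?";
        tracker.advance(text, 0, 7); // everything before the '?'
        System.out.println(tracker.describe("Unexpected token.")); // prints 2:3: Unexpected token.
    }
}
```

A "previous token" variant would simply snapshot line and column before each advance, which matches the previousLine and previousColumn fields listed above.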