org.htmlparser
public class PrototypicalNodeFactory extends Object implements Serializable, NodeFactory
Text and remark nodes are generated from prototypes accessed via the {@link #setTextPrototype(Text) textPrototype} and {@link #setRemarkPrototype(Remark) remarkPrototype} properties respectively. Tag nodes are generated as follows:
Prototype tags, in the form of undifferentiated tags, are held in a hash table. On a request for a tag, the attributes are examined for the name of the tag to be created. If a prototype of that name has been registered (exists in the hash table), it is cloned and the clone is given the characteristics ({@link Attribute Attributes}, start and end position) of the requested tag.
If no tag has been registered under that name, a generic tag is created from the prototype accessed via the {@link #setTagPrototype(Tag) tagPrototype} property.
The hash table of registered tags can be automatically populated with all the known tags from the {@link org.htmlparser.tags} package when the factory is constructed, or it can start out empty and be populated explicitly.
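The lookup, clone and fallback steps described above can be modelled with a minimal, self-contained sketch. The classes `SketchTag`, `SketchLinkTag` and `SketchFactory` are hypothetical stand-ins invented for illustration; they are not part of org.htmlparser, and the real factory may differ in detail (for example in how it derives the tag name from the attributes).

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for a tag node; not the org.htmlparser Tag interface.
class SketchTag implements Cloneable {
    String name = "";
    int start;
    int end;

    @Override
    public SketchTag clone() {
        try {
            // Object.clone preserves the runtime class of the prototype.
            return (SketchTag) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // unreachable: the class is Cloneable
        }
    }
}

// Hypothetical specialised tag, analogous to a registered prototype.
class SketchLinkTag extends SketchTag {
}

// Hypothetical factory modelling lookup, clone and fallback.
class SketchFactory {
    private final Map<String, SketchTag> registry = new HashMap<>();
    private final SketchTag generic = new SketchTag();

    // Assumption: names are normalized to uppercase on registration.
    void register(String id, SketchTag prototype) {
        registry.put(id.toUpperCase(Locale.ENGLISH), prototype);
    }

    SketchTag createTag(String name, int start, int end) {
        SketchTag prototype = registry.get(name.toUpperCase(Locale.ENGLISH));
        if (prototype == null) {
            prototype = generic; // nothing registered: use the generic prototype
        }
        // Clone the prototype, then give the clone the requested characteristics.
        SketchTag tag = prototype.clone();
        tag.name = name;
        tag.start = start;
        tag.end = end;
        return tag;
    }
}
```

Because the clone keeps the prototype's runtime class, a request for a registered name yields the specialised type, while any other name yields the generic one.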
Here is an example of how to override all text issued from {@link org.htmlparser.nodes.TextNode#toPlainTextString() Text.toPlainTextString()}, in this case decoding it (converting character references), which illustrates setting the text prototype:
    PrototypicalNodeFactory factory = new PrototypicalNodeFactory ();
    factory.setTextPrototype (
        // create an inner class that is a subclass of TextNode
        new TextNode () {
            public String toPlainTextString()
            {
                String original = super.toPlainTextString ();
                return (org.htmlparser.util.Translate.decode (original));
            }
        });
    Parser parser = new Parser ();
    parser.setNodeFactory (factory);
Here is an example of using a custom link tag, in this case just printing the URL, which illustrates registering a tag:
    class PrintingLinkTag extends LinkTag
    {
        public void doSemanticAction () throws ParserException
        {
            System.out.println (getLink ());
        }
    }
    PrototypicalNodeFactory factory = new PrototypicalNodeFactory ();
    factory.registerTag (new PrintingLinkTag ());
    Parser parser = new Parser ();
    parser.setNodeFactory (factory);
Field Summary | |
---|---|
protected Map | mBlastocyst: The list of tags to return. |
protected Remark | mRemark: The prototypical remark node. |
protected Tag | mTag: The prototypical tag node. |
protected Text | mText: The prototypical text node. |
Constructor Summary | |
---|---|
PrototypicalNodeFactory() | Create a new factory with all tags registered. |
PrototypicalNodeFactory(boolean empty) | Create a new factory. |
PrototypicalNodeFactory(Tag tag) | Create a new factory with the given tag as the only registered tag. |
PrototypicalNodeFactory(Tag[] tags) | Create a new factory with the given tags registered. |
Method Summary | |
---|---|
void | clear(): Clean out the registry. |
Remark | createRemarkNode(Page page, int start, int end): Create a new remark node. |
Text | createStringNode(Page page, int start, int end): Create a new string node. |
Tag | createTagNode(Page page, int start, int end, Vector attributes): Create a new tag node. |
Tag | get(String id): Gets a tag from the registry. |
Remark | getRemarkPrototype(): Get the object that is cloned to generate remark nodes. |
Set | getTagNames(): Get the list of tag names. |
Tag | getTagPrototype(): Get the object that is cloned to generate tag nodes. |
Text | getTextPrototype(): Get the object that is cloned to generate text nodes. |
Tag | put(String id, Tag tag): Adds a tag to the registry. |
void | registerTag(Tag tag): Register a tag. |
PrototypicalNodeFactory | registerTags(): Register all known tags in the tag package. |
Tag | remove(String id): Remove a tag from the registry. |
void | setRemarkPrototype(Remark remark): Set the object to be used to generate remark nodes. |
void | setTagPrototype(Tag tag): Set the object to be used to generate tag nodes. |
void | setTextPrototype(Text text): Set the object to be used to generate text nodes. |
void | unregisterTag(Tag tag): Unregister a tag. |
Parameters: empty If true, creates an empty factory, otherwise creates a new factory with all tags registered.
Parameters: tag The single tag to register in the otherwise empty factory.
Parameters: tags The tags to register in the otherwise empty factory.
Parameters: page The page the node is on. start The beginning position of the remark. end The ending position of the remark.
Returns: A remark node comprising the indicated characters from the page.
Parameters: page The page the node is on. start The beginning position of the string. end The ending position of the string.
Returns: A text node comprising the indicated characters from the page.
Parameters: page The page the node is on. start The beginning position of the tag. end The ending position of the tag. attributes The attributes contained in this tag.
Returns: A tag node comprising the indicated characters from the page.
Parameters: id The name of the tag to return.
Returns: The tag registered under the id name, or null if none.
Returns: The prototype for {@link Remark} nodes.
See Also: PrototypicalNodeFactory
Returns: The names of the tags currently registered.
Returns: The prototype for {@link Tag} nodes.
See Also: PrototypicalNodeFactory
Returns: The prototype for {@link Text} nodes.
See Also: PrototypicalNodeFactory
Parameters: id The name under which to register the tag. For proper operation, the id should be uppercase so it will be matched by a Map lookup. tag The tag to be returned from a {@link #createTagNode} call.
Returns: The tag previously registered with that id if any, or null if none.
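The need for an uppercase id, and the return value of put, can be illustrated with a small sketch. The class `UppercaseKeyRegistry` is hypothetical and uses strings in place of tag prototypes; it assumes that tag creation normalizes the requested name to uppercase before the Map lookup, which is what the note on put implies but is not the library's actual code.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical registry keyed by tag name; strings stand in for tag prototypes.
class UppercaseKeyRegistry {
    private final Map<String, String> tags = new HashMap<>();

    // Stores the id exactly as given; returns the previous entry, or null if none.
    String put(String id, String tag) {
        return tags.put(id, tag);
    }

    // Assumption: the requested name is converted to uppercase before the lookup.
    String lookup(String requestedName) {
        return tags.get(requestedName.toUpperCase(Locale.ENGLISH));
    }
}
```

In this model an entry stored under "a" is never found, because every lookup asks for "A"; storing it under "A" makes it reachable whatever the case of the requested name.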
For proper operation, the ids are converted to uppercase so they will be matched by a Map lookup.
Parameters: tag The tag to register.
Returns: 'this' node factory as a convenience.
Parameters: id The name of the tag to remove.
Returns: The tag that was registered with that id, or null if none.
Parameters: remark The prototype for {@link Remark} nodes. If null the prototype is set to the default ({@link RemarkNode}).
See Also: PrototypicalNodeFactory
Parameters: tag The prototype for {@link Tag} nodes. If null the prototype is set to the default ({@link TagNode}).
See Also: PrototypicalNodeFactory
Parameters: text The prototype for {@link Text} nodes. If null the prototype is set to the default ({@link TextNode}).
See Also: PrototypicalNodeFactory
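The null handling shared by the three prototype setters can be sketched as follows. `SketchText` and `PrototypeHolder` are hypothetical names used only for illustration; the sketch assumes that passing null restores a fresh default instance, as the parameter descriptions state, and does not reproduce the library's implementation.

```java
// Hypothetical stand-in for the default text node type.
class SketchText {
}

// Hypothetical holder modelling one prototype property.
class PrototypeHolder {
    private SketchText textPrototype = new SketchText();

    // A null argument resets the prototype to the default type.
    void setTextPrototype(SketchText text) {
        textPrototype = (text == null) ? new SketchText() : text;
    }

    SketchText getTextPrototype() {
        return textPrototype;
    }
}
```

With this behaviour a caller can undo a customization by passing null, and the getter never returns null.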
The ids are converted to uppercase to undo the operation of registerTag.
Parameters: tag The tag to unregister.
HTML Parser is an open source library released under LGPL.