Package org.htmlparser.visitors
Class NodeVisitor
java.lang.Object
org.htmlparser.visitors.NodeVisitor
Direct Known Subclasses:
HtmlPage, LinkFindingVisitor, ObjectFindingVisitor, StringBean, StringFindingVisitor, TagFindingVisitor, TextExtractingVisitor, UrlModifyingVisitor
The base class for the 'Visitor' pattern.
Classes that wish to use
visitAllNodesWith() will subclass
this class and provide implementations for methods they are interested in
processing.
The operation of visitAllNodesWith() is to call
beginParsing(), then visitXXX() according to the
types of nodes encountered in depth-first order and finally
finishedParsing().
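The call order described above can be modeled with a small, dependency-free sketch. The types SketchNode and RecordingVisitor are hypothetical stand-ins and are not part of org.htmlparser. The sketch also assumes that an end tag is reported after the tag's children, which the description above does not state.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Simplified model of the visitor life cycle; none of these types belong to org.htmlparser. */
public class VisitOrderSketch {

    /** Hypothetical stand-in for a parsed node: a tag with children, or a text leaf. */
    public static final class SketchNode {
        final String name;          // tag name, or the text itself for a leaf
        final boolean isText;
        final List<SketchNode> children = new ArrayList<>();

        public SketchNode(String name, boolean isText) {
            this.name = name;
            this.isText = isText;
        }

        public SketchNode add(SketchNode child) {
            children.add(child);
            return this;
        }
    }

    /** Hypothetical stand-in for NodeVisitor: records each callback in call order. */
    public static class RecordingVisitor {
        public final List<String> events = new ArrayList<>();

        public void beginParsing() { events.add("beginParsing"); }
        public void visitTag(SketchNode tag) { events.add("visitTag " + tag.name); }
        public void visitEndTag(SketchNode tag) { events.add("visitEndTag " + tag.name); }
        public void visitStringNode(SketchNode text) { events.add("visitStringNode " + text.name); }
        public void finishedParsing() { events.add("finishedParsing"); }
    }

    /** Depth-first walk: the tag first, then its children, then (by assumption) its end tag. */
    static void accept(SketchNode node, RecordingVisitor visitor) {
        if (node.isText) {
            visitor.visitStringNode(node);
            return;
        }
        visitor.visitTag(node);
        for (SketchNode child : node.children) {
            accept(child, visitor);
        }
        visitor.visitEndTag(node);
    }

    /** Models visitAllNodesWith(): begin, walk every top-level node, finish. */
    public static List<String> visitAllNodesWith(List<SketchNode> roots, RecordingVisitor visitor) {
        visitor.beginParsing();
        for (SketchNode root : roots) {
            accept(root, visitor);
        }
        visitor.finishedParsing();
        return visitor.events;
    }

    public static void main(String[] args) {
        // Models <BODY><P>hello</P></BODY>
        SketchNode body = new SketchNode("BODY", false)
                .add(new SketchNode("P", false).add(new SketchNode("hello", true)));
        for (String event : visitAllNodesWith(Collections.singletonList(body), new RecordingVisitor())) {
            System.out.println(event);
        }
    }
}
```

In this model beginParsing is always the first event and finishedParsing the last, so a subclass can set up state in the former and report results in the latter.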
Typical code to print all the tags and text nodes:
import org.htmlparser.Parser;
import org.htmlparser.Tag;
import org.htmlparser.Text;
import org.htmlparser.util.ParserException;
import org.htmlparser.visitors.NodeVisitor;

public class MyVisitor extends NodeVisitor
{
    public MyVisitor ()
    {
    }

    // Called for each tag: print its name and start position.
    public void visitTag (Tag tag)
    {
        System.out.println ("\n" + tag.getTagName () + tag.getStartPosition ());
    }

    // Called for each text node: print it.
    public void visitStringNode (Text string)
    {
        System.out.println (string);
    }

    public static void main (String[] args) throws ParserException
    {
        Parser parser = new Parser ("http://cbc.ca");
        NodeVisitor visitor = new MyVisitor ();
        parser.visitAllNodesWith (visitor);
    }
}
If you want to handle more than one tag type with the same visitor
you will need to check the tag type in the visitTag method. You can
do that by either checking the tag name:
public void visitTag (Tag tag)
{
    if (tag.getTagName ().equals ("BODY"))
    {
        // ... do something with the BODY tag
    }
    else if (tag.getTagName ().equals ("FRAME"))
    {
        // ... do something with the FRAME tag
    }
}
or you can use instanceof if all the tags you want to handle
have a registered
tag (i.e. they are generated by the NodeFactory):
public void visitTag (Tag tag)
{
    if (tag instanceof BodyTag)
    {
        BodyTag body = (BodyTag)tag;
        // ... do something with body
    }
    else if (tag instanceof FrameTag)
    {
        FrameTag frame = (FrameTag)tag;
        // ... do something with frame
    }
    else // other specific tags and generic TagNode objects
    {
    }
}
Constructor Summary
Constructor                                                Description
NodeVisitor()                                              Creates a node visitor that recurses itself and its children.
NodeVisitor(boolean recurseChildren)                       Creates a node visitor that recurses itself and its children only if recurseChildren is true.
NodeVisitor(boolean recurseChildren, boolean recurseSelf)  Creates a node visitor that recurses itself only if recurseSelf is true and its children only if recurseChildren is true.
Method Summary
Modifier and Type  Method                          Description
void               beginParsing()                  Override this method if you wish to do special processing prior to the start of parsing.
void               finishedParsing()               Override this method if you wish to do special processing upon completion of parsing.
boolean            shouldRecurseChildren()         Depth traversal predicate.
boolean            shouldRecurseSelf()             Self traversal predicate.
void               visitEndTag(Tag tag)            Called for each Tag visited that is an end tag.
void               visitRemarkNode(Remark remark)  Called for each RemarkNode visited.
void               visitStringNode(Text string)    Called for each StringNode visited.
void               visitTag(Tag tag)               Called for each Tag visited.
Constructor Details
NodeVisitor
public NodeVisitor()
Creates a node visitor that recurses itself and its children.
NodeVisitor
public NodeVisitor(boolean recurseChildren)
Creates a node visitor that recurses itself and its children only if recurseChildren is true.
Parameters:
recurseChildren - If true, the visitor will visit children, otherwise only the top-level nodes are visited.
NodeVisitor
public NodeVisitor(boolean recurseChildren, boolean recurseSelf)
Creates a node visitor that recurses itself only if recurseSelf is true and its children only if recurseChildren is true.
Parameters:
recurseChildren - If true, the visitor will visit children, otherwise only the top-level nodes are visited.
recurseSelf - If true, the visitor will visit the top-level node.
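The effect of the recurseChildren flag can be illustrated with a small, dependency-free model. SketchTag and SketchVisitor are hypothetical stand-ins, not classes from org.htmlparser, and the traversal shown is an assumption based on the parameter description: each node consults shouldRecurseChildren() before descending.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Simplified model of the recurseChildren flag; not code from org.htmlparser. */
public class RecurseChildrenSketch {

    /** Hypothetical stand-in for a tag node with nested child tags. */
    public static final class SketchTag {
        final String name;
        final List<SketchTag> children;

        public SketchTag(String name, SketchTag... children) {
            this.name = name;
            this.children = Arrays.asList(children);
        }
    }

    /** Hypothetical stand-in for a visitor built with NodeVisitor(boolean recurseChildren). */
    public static class SketchVisitor {
        private final boolean recurseChildren;
        public final List<String> visited = new ArrayList<>();

        public SketchVisitor(boolean recurseChildren) {
            this.recurseChildren = recurseChildren;
        }

        public boolean shouldRecurseChildren() {
            return recurseChildren;
        }

        public void visitTag(SketchTag tag) {
            visited.add(tag.name);
        }
    }

    /** Visits the tag, then descends only when the predicate allows it (assumed behavior). */
    static void accept(SketchTag tag, SketchVisitor visitor) {
        visitor.visitTag(tag);
        if (visitor.shouldRecurseChildren()) {
            for (SketchTag child : tag.children) {
                accept(child, visitor);
            }
        }
    }

    /** Walks a model of <HTML><BODY><P></P></BODY></HTML> and returns the visited names. */
    public static List<String> visitedNames(boolean recurseChildren) {
        SketchTag root = new SketchTag("HTML", new SketchTag("BODY", new SketchTag("P")));
        SketchVisitor visitor = new SketchVisitor(recurseChildren);
        accept(root, visitor);
        return visitor.visited;
    }

    public static void main(String[] args) {
        System.out.println(visitedNames(true));   // prints [HTML, BODY, P]
        System.out.println(visitedNames(false));  // prints [HTML]
    }
}
```

The model leaves out recurseSelf on purpose, since the text above only says that it controls whether the top-level node is visited.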
Method Details
beginParsing
public void beginParsing()
Override this method if you wish to do special processing prior to the start of parsing.
visitTag
public void visitTag(Tag tag)
Called for each Tag visited.
Parameters:
tag - The tag being visited.
visitEndTag
public void visitEndTag(Tag tag)
Called for each Tag visited that is an end tag.
Parameters:
tag - The end tag being visited.
visitStringNode
public void visitStringNode(Text string)
Called for each StringNode visited.
Parameters:
string - The string node being visited.
visitRemarkNode
public void visitRemarkNode(Remark remark)
Called for each RemarkNode visited.
Parameters:
remark - The remark node being visited.
finishedParsing
public void finishedParsing()
Override this method if you wish to do special processing upon completion of parsing.
shouldRecurseChildren
public boolean shouldRecurseChildren()
Depth traversal predicate.
Returns:
true if children are to be visited.
shouldRecurseSelf
public boolean shouldRecurseSelf()
Self traversal predicate.
Returns:
true if a node itself is to be visited.