Package org.htmlparser.nodes
Class TagNode
java.lang.Object
org.htmlparser.nodes.AbstractNode
org.htmlparser.nodes.TagNode
- All Implemented Interfaces:
Serializable, Cloneable, Node, Tag
- Direct Known Subclasses:
BaseHrefTag, CompositeTag, DoctypeTag, FrameTag, ImageTag, InputTag, JspTag, MetaTag, ProcessingInstructionTag
TagNode represents a generic tag.
If no scanner is registered for a given tag name, this is what you get.
This is also the base class for all tags created by the parser.
Field Summary
Fields -
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
void | accept(NodeVisitor visitor) | Default tag visiting code.
boolean | breaksFlow() | Determines if the given tag breaks the flow of text.
String | getAttribute(String name) | Returns the value of an attribute.
Attribute | getAttributeEx(String name) | Returns the attribute with the given name.
Vector | getAttributesEx() | Gets the attributes in the tag.
String[] | getEnders() | Return the set of tag names that cause this tag to finish.
int | getEndingLineNumber() | Get the line number where this tag ends.
Tag | getEndTag() | Get the end tag for this (composite) tag.
String[] | getEndTagEnders() | Return the set of end tag names that cause this tag to finish.
String[] | getIds() | Return the set of names handled by this tag.
String | getRawTagName() | Return the name of this tag.
int | getStartingLineNumber() | Get the line number where this tag starts.
int | getTagBegin() | Gets the nodeBegin.
int | getTagEnd() | Gets the nodeEnd.
String | getTagName() | Return the name of this tag.
String | getText() | Return the text contained in this tag.
Scanner | getThisScanner() | Return the scanner associated with this tag.
boolean | isEmptyXmlTag() | Is this an empty xml tag of the form <tag/>.
boolean | isEndTag() | Predicate to determine if this tag is an end tag (e.g. </HTML>).
void | removeAttribute(String key) | Remove the attribute with the given key, if it exists.
void | setAttribute(String key, String value) | Set attribute with given key, value pair.
void | setAttribute(String key, String value, char quote) | Set attribute with given key, value pair where the value is quoted by quote.
void | setAttribute(Attribute attribute) | Set an attribute.
void | setAttributeEx(Attribute attribute) | Set an attribute.
void | setAttributesEx(Vector attribs) | Sets the attributes.
void | setEmptyXmlTag(boolean emptyXmlTag) | Set this tag to be an empty xml node, or not.
void | setEndTag(Tag end) | Set the end tag for this (composite) tag.
void | setTagBegin(int tagBegin) | Sets the nodeBegin.
void | setTagEnd(int tagEnd) | Sets the nodeEnd.
void | setTagName(String name) | Set the name of this tag.
void | setText(String text) | Parses the given text to create the tag contents.
void | setThisScanner(Scanner scanner) | Set the scanner associated with this tag.
String | toHtml(boolean verbatim) | Render the tag as HTML.
String | toPlainTextString() | Get the plain text from this node.
String | toString() | Print the contents of the tag.
Methods inherited from class org.htmlparser.nodes.AbstractNode
clone, collectInto, doSemanticAction, getChildren, getEndPosition, getFirstChild, getLastChild, getNextSibling, getPage, getParent, getPreviousSibling, getStartPosition, setChildren, setEndPosition, setPage, setParent, setStartPosition, toHtml
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.htmlparser.Node
clone, collectInto, doSemanticAction, getChildren, getEndPosition, getFirstChild, getLastChild, getNextSibling, getPage, getParent, getPreviousSibling, getStartPosition, setChildren, setEndPosition, setPage, setParent, setStartPosition, toHtml
-
Field Details
-
mDefaultScanner
The default scanner for non-composite tags.
-
mAttributes
The tag attributes. Objects of type Attribute. The first element is the tag name, subsequent elements being either whitespace or real attributes.
-
breakTags
Set of tags that break the flow of text.
-
-
Constructor Details
-
TagNode
public TagNode()
Create an empty tag.
-
TagNode
Create a tag with the location and attributes provided.
Parameters:
page - The page this tag was read from.
start - The starting offset of this node within the page.
end - The ending offset of this node within the page.
attributes - The list of attributes that were parsed in this tag.
-
TagNode
Create a tag like the one provided.
Parameters:
tag - The tag to emulate.
scanner - The scanner for this tag.
-
-
Method Details
-
getAttribute
Returns the value of an attribute.
Specified by: getAttribute in interface Tag
Parameters:
name - Name of attribute, case insensitive.
Returns:
The value associated with the attribute or null if it does not exist, or is a stand-alone or
-
setAttribute
Set attribute with given key, value pair. Figures out a quote character to use if necessary.
Specified by: setAttribute in interface Tag
Parameters:
key - The name of the attribute.
value - The value of the attribute.
-
removeAttribute
Remove the attribute with the given key, if it exists.
Specified by: removeAttribute in interface Tag
Parameters:
key - The name of the attribute.
-
setAttribute
Set attribute with given key, value pair where the value is quoted by quote.
Specified by: setAttribute in interface Tag
Parameters:
key - The name of the attribute.
value - The value of the attribute.
quote - The quote character to be used around value. If zero, it is an unquoted value.
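The effect of the quote parameter can be illustrated with a small, self-contained sketch. QuoteSketch and both of its methods are hypothetical stand-ins, not part of org.htmlparser. The render method only shows how a zero quote yields an unquoted value. The pickQuote method is one plausible heuristic for "figuring out" a quote character and is an assumption, not the library's documented algorithm.

```java
/** Hypothetical illustration; not part of org.htmlparser. */
final class QuoteSketch {

    /** Renders key=value, wrapping the value in quote unless quote is zero. */
    static String render(String key, String value, char quote) {
        if (quote == '\0') {
            return key + "=" + value; // zero means an unquoted value
        }
        return key + "=" + quote + value + quote;
    }

    /**
     * Assumed heuristic: quote only when the value contains whitespace,
     * preferring a double quote unless the value already contains one.
     */
    static char pickQuote(String value) {
        boolean needsQuote = false;
        for (int i = 0; i < value.length(); i++) {
            if (Character.isWhitespace(value.charAt(i))) {
                needsQuote = true;
                break;
            }
        }
        if (!needsQuote) {
            return '\0';
        }
        return value.indexOf('"') < 0 ? '"' : '\'';
    }

    public static void main(String[] args) {
        System.out.println(render("href", "a.html", '"'));
        System.out.println(render("width", "10", '\0'));
        System.out.println(render("alt", "a b", pickQuote("a b")));
    }
}
```

With the real class, the corresponding calls would be setAttribute(key, value, quote) for an explicit quote character and setAttribute(key, value) to let the tag choose one.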
-
getAttributeEx
Returns the attribute with the given name.
Specified by: getAttributeEx in interface Tag
Parameters:
name - Name of attribute, case insensitive.
Returns:
The attribute or null if it does not exist.
-
setAttributeEx
Set an attribute.
Specified by: setAttributeEx in interface Tag
Parameters:
attribute - The attribute to set.
-
setAttribute
Set an attribute. This replaces an attribute of the same name. To set the zeroth attribute (the tag name), use setTagName().
Parameters:
attribute - The attribute to set.
-
getAttributesEx
Gets the attributes in the tag.
Specified by: getAttributesEx in interface Tag
Returns:
The list of Attributes in the tag. The first element is the tag name, subsequent elements being either whitespace or real attributes.
-
getTagName
Return the name of this tag.
Note: This value is converted to uppercase and does not begin with "/" if it is an end tag. Nor does it end with a slash in the case of an XML type tag. To get at the original text of the tag name use getRawTagName(). The conversion to uppercase is performed with an ENGLISH locale.
Specified by: getTagName in interface Tag
Returns:
The tag name.
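The normalization described in the note can be sketched as follows. TagNameSketch and normalize are hypothetical names used only for illustration; this is a stand-alone model of the documented behavior, not the library's implementation.

```java
import java.util.Locale;

/** Hypothetical illustration; not part of org.htmlparser. */
final class TagNameSketch {

    /**
     * Mirrors the note on getTagName(): drop a leading "/" (end tag),
     * drop a trailing "/" (XML-style tag), then uppercase using the
     * ENGLISH locale so the result does not depend on the default locale.
     */
    static String normalize(String raw) {
        if (raw == null) {
            return null;
        }
        String name = raw;
        if (name.startsWith("/")) {
            name = name.substring(1);
        }
        if (name.endsWith("/")) {
            name = name.substring(0, name.length() - 1);
        }
        return name.toUpperCase(Locale.ENGLISH);
    }

    public static void main(String[] args) {
        System.out.println(normalize("/html")); // prints HTML
        System.out.println(normalize("br/"));   // prints BR
        System.out.println(normalize("title")); // prints TITLE
    }
}
```

Fixing the locale matters for names containing the letter i: under a Turkish default locale, a plain toUpperCase() would produce a dotted capital I instead of "I".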
-
getRawTagName
Return the name of this tag.
Specified by: getRawTagName in interface Tag
Returns:
The tag name or null if this tag contains nothing or only whitespace.
-
setTagName
Set the name of this tag. This creates or replaces the first attribute of the tag (the zeroth element of the attribute vector).
Specified by: setTagName in interface Tag
Parameters:
name - The tag name.
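The "zeroth element" convention can be modeled with a short sketch. AttributeListSketch is a hypothetical stand-in that stores plain strings instead of Attribute objects and ignores whitespace elements, so it only illustrates the create-or-replace behavior described above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration; not part of org.htmlparser. */
final class AttributeListSketch {

    // Element 0 holds the tag name; later elements hold attributes.
    private final List<String> attributes = new ArrayList<>();

    /** Creates the zeroth element if the list is empty, otherwise replaces it. */
    void setTagName(String name) {
        if (attributes.isEmpty()) {
            attributes.add(name);
        } else {
            attributes.set(0, name);
        }
    }

    /** Returns the zeroth element, or null if nothing has been stored yet. */
    String getRawTagName() {
        return attributes.isEmpty() ? null : attributes.get(0);
    }

    /** Appends an attribute after the tag name. */
    void add(String attribute) {
        attributes.add(attribute);
    }

    int size() {
        return attributes.size();
    }

    public static void main(String[] args) {
        AttributeListSketch tag = new AttributeListSketch();
        tag.setTagName("img");      // creates element 0
        tag.add("src=a.png");       // element 1
        tag.setTagName("IMAGE");    // replaces element 0, list length unchanged
        System.out.println(tag.getRawTagName() + " " + tag.size());
    }
}
```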
-
getText
Return the text contained in this tag.
Specified by: getText in interface Node
Overrides: getText in class AbstractNode
Returns:
The complete contents of the tag (within the angle brackets).
-
setAttributesEx
Sets the attributes. NOTE: Values of the extended hashtable are two element arrays of String, with the first element being the original name (not uppercased), and the second element being the value.
Specified by: setAttributesEx in interface Tag
Parameters:
attribs - The attribute collection to set.
-
setTagBegin
public void setTagBegin(int tagBegin)
Sets the nodeBegin.
Parameters:
tagBegin - The nodeBegin to set.
-
getTagBegin
public int getTagBegin()
Gets the nodeBegin.
Returns:
The nodeBegin value.
-
setTagEnd
public void setTagEnd(int tagEnd)
Sets the nodeEnd.
Parameters:
tagEnd - The nodeEnd to set.
-
getTagEnd
public int getTagEnd()
Gets the nodeEnd.
Returns:
The nodeEnd value.
-
setText
Parses the given text to create the tag contents.
Specified by: setText in interface Node
Overrides: setText in class AbstractNode
Parameters:
text - A string of the form <TAGNAME xx="yy">.
-
toPlainTextString
Get the plain text from this node.
Specified by: toPlainTextString in interface Node
Specified by: toPlainTextString in class AbstractNode
Returns:
An empty string (tag contents do not display in a browser). If you want this tag's HTML equivalent, use toHtml().
-
toHtml
Render the tag as HTML. A call to a tag's toHtml() method will render it in HTML.
Specified by: toHtml in interface Node
Specified by: toHtml in class AbstractNode
Parameters:
verbatim - If true, return as close to the original page text as possible.
Returns:
The tag as an HTML fragment.
-
toString
Print the contents of the tag.
Specified by: toString in interface Node
Specified by: toString in class AbstractNode
Returns:
A string describing the tag. For text that looks like HTML use toHtml().
-
breaksFlow
public boolean breaksFlow()
Determines if the given tag breaks the flow of text.
Specified by: breaksFlow in interface Tag
Returns:
true if following text would start on a new line, false otherwise.
-
accept
Default tag visiting code. Based on isEndTag(), calls either visitTag() or visitEndTag().
Specified by: accept in interface Node
Specified by: accept in class AbstractNode
Parameters:
visitor - The visitor that is visiting this node.
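The dispatch described above can be sketched in isolation. VisitSketch, its nested Visitor interface, and the trace helper are hypothetical stand-ins for illustration; they take raw tag names as strings rather than Tag objects, and the leading-slash test for end tags is an assumption based on the "</HTML>" example given for isEndTag().

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration; not part of org.htmlparser. */
final class VisitSketch {

    /** Simplified stand-in for a visitor with separate start and end callbacks. */
    interface Visitor {
        void visitTag(String rawName);
        void visitEndTag(String rawName);
    }

    /** Assumption: an end tag is one whose raw name starts with "/". */
    static boolean isEndTag(String rawName) {
        return rawName.startsWith("/");
    }

    /** Calls exactly one of the two callbacks, chosen by isEndTag(). */
    static void accept(String rawName, Visitor visitor) {
        if (isEndTag(rawName)) {
            visitor.visitEndTag(rawName);
        } else {
            visitor.visitTag(rawName);
        }
    }

    /** Records which callback fires for each raw tag name, in order. */
    static List<String> trace(String... rawNames) {
        List<String> calls = new ArrayList<>();
        Visitor recorder = new Visitor() {
            @Override public void visitTag(String n) { calls.add("visitTag(" + n + ")"); }
            @Override public void visitEndTag(String n) { calls.add("visitEndTag(" + n + ")"); }
        };
        for (String rawName : rawNames) {
            accept(rawName, recorder);
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(trace("p", "/p")); // prints [visitTag(p), visitEndTag(/p)]
    }
}
```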
-
isEmptyXmlTag
public boolean isEmptyXmlTag()
Is this an empty xml tag of the form <tag/>.
Specified by: isEmptyXmlTag in interface Tag
Returns:
true if the last character of the last attribute is a '/'.
-
setEmptyXmlTag
public void setEmptyXmlTag(boolean emptyXmlTag)
Set this tag to be an empty xml node, or not. Adds or removes an ending slash on the tag.
Specified by: setEmptyXmlTag in interface Tag
Parameters:
emptyXmlTag - If true, ensures there is an ending slash in the node, i.e. <tag/>, otherwise removes it.
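The relationship between isEmptyXmlTag() and setEmptyXmlTag() can be illustrated with a simplified model. EmptyXmlTagSketch is a hypothetical stand-in that represents the attribute list as plain strings. How the real class stores the slash (as its own element or as part of the last attribute) is not stated here, so the choice below is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration; not part of org.htmlparser. */
final class EmptyXmlTagSketch {

    /** True if the last character of the last attribute is '/'. */
    static boolean isEmptyXmlTag(List<String> attributes) {
        if (attributes.isEmpty()) {
            return false;
        }
        return attributes.get(attributes.size() - 1).endsWith("/");
    }

    /** Adds or removes the ending slash so the list matches the requested state. */
    static void setEmptyXmlTag(List<String> attributes, boolean emptyXmlTag) {
        if (emptyXmlTag == isEmptyXmlTag(attributes)) {
            return; // already in the requested state
        }
        if (emptyXmlTag) {
            attributes.add("/"); // assumption: the slash is stored as its own element
        } else {
            int last = attributes.size() - 1;
            String text = attributes.get(last);
            if (text.equals("/")) {
                attributes.remove(last);
            } else {
                attributes.set(last, text.substring(0, text.length() - 1));
            }
        }
    }

    public static void main(String[] args) {
        List<String> br = new ArrayList<>(Arrays.asList("br"));
        setEmptyXmlTag(br, true);
        System.out.println(br + " " + isEmptyXmlTag(br)); // prints [br, /] true
        setEmptyXmlTag(br, false);
        System.out.println(br + " " + isEmptyXmlTag(br)); // prints [br] false
    }
}
```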
-
isEndTag
public boolean isEndTag()
Predicate to determine if this tag is an end tag (e.g. </HTML>).
-
getStartingLineNumber
public int getStartingLineNumber()
Get the line number where this tag starts.
Specified by: getStartingLineNumber in interface Tag
Returns:
The (zero based) line number in the page where this tag starts.
-
getEndingLineNumber
public int getEndingLineNumber()
Get the line number where this tag ends.
Specified by: getEndingLineNumber in interface Tag
Returns:
The (zero based) line number in the page where this tag ends.
-
getIds
Return the set of names handled by this tag. Since this is a generic tag, it has no ids.
-
getEnders
Return the set of tag names that cause this tag to finish. These are the normal (non-end) tags that, if encountered while scanning a composite tag, will cause the generation of a virtual tag. Since this is a non-composite tag, the default is no enders.
-
getEndTagEnders
Return the set of end tag names that cause this tag to finish. These are the end tags that, if encountered while scanning a composite tag, will cause the generation of a virtual tag. Since this is a non-composite tag, it has no end tag enders.
Specified by: getEndTagEnders in interface Tag
Returns:
The names of following end tags that stop further scanning.
-
getThisScanner
Return the scanner associated with this tag.
Specified by: getThisScanner in interface Tag
Returns:
The scanner associated with this tag.
-
setThisScanner
Set the scanner associated with this tag.
Specified by: setThisScanner in interface Tag
Parameters:
scanner - The scanner for this tag.
-
getEndTag
Get the end tag for this (composite) tag. For a non-composite tag this always returns null.
-
setEndTag
Set the end tag for this (composite) tag. For a non-composite tag this is a no-op.
-