public abstract class SgmlPage extends DomNode implements Page, Document
DomNode.ChildIterator, DomNode.DescendantElementsIterator<T extends DomNode>
AS_TEXT_BLANK, AS_TEXT_BLOCK_SEPARATOR, AS_TEXT_NEW_LINE, AS_TEXT_TAB, PROPERTY_ELEMENT, READY_STATE_COMPLETE, READY_STATE_INTERACTIVE, READY_STATE_LOADED, READY_STATE_LOADING, READY_STATE_UNINITIALIZED
ATTRIBUTE_NODE, CDATA_SECTION_NODE, COMMENT_NODE, DOCUMENT_FRAGMENT_NODE, DOCUMENT_NODE, DOCUMENT_POSITION_CONTAINED_BY, DOCUMENT_POSITION_CONTAINS, DOCUMENT_POSITION_DISCONNECTED, DOCUMENT_POSITION_FOLLOWING, DOCUMENT_POSITION_IMPLEMENTATION_SPECIFIC, DOCUMENT_POSITION_PRECEDING, DOCUMENT_TYPE_NODE, ELEMENT_NODE, ENTITY_NODE, ENTITY_REFERENCE_NODE, NOTATION_NODE, PROCESSING_INSTRUCTION_NODE, TEXT_NODE
Constructor and Description |
---|
SgmlPage(WebResponse webResponse,
WebWindow webWindow)
Creates an instance of SgmlPage.
|
Modifier and Type | Method and Description |
---|---|
String |
asXml()
Returns a string representation of the XML document from this element and all its children (recursively).
|
void |
cleanUp()
Clean up this page.
|
protected SgmlPage |
clone()
Creates a clone of this instance.
|
DomAttr |
createAttribute(String name) |
CDATASection |
createCDATASection(String data) |
Comment |
createComment(String data) |
DomDocumentFragment |
createDocumentFragment()
Creates an empty DomDocumentFragment object. |
DomDocumentFragment |
createDomDocumentFragment()
Deprecated. As of 2.18, please use createDocumentFragment() instead. |
abstract Element |
createElement(String tagName)
Creates an element, the type of which depends on the specified tag name.
|
abstract Element |
createElementNS(String namespaceURI,
String qualifiedName)
Create a new Element with the given namespace and qualified name.
|
Text |
createTextNode(String data) |
String |
getCanonicalXPath()
Returns the canonical XPath expression which identifies this node, for instance
"/html/body/table[3]/tbody/tr[5]/td[2]/span/a[3]".
|
DocumentType |
getDoctype()
Returns the document type.
|
DomElement |
getDocumentElement()
Returns the document element.
|
DomNodeList<DomElement> |
getElementsByTagName(String tagName) |
WebWindow |
getEnclosingWindow()
Returns the window that this page is sitting inside.
|
String |
getNodeName()
Gets the name for the current node.
|
short |
getNodeType()
Gets the type of the current node.
|
SgmlPage |
getPage()
Returns the page that contains this node.
|
abstract String |
getPageEncoding()
Returns the page encoding.
|
URL |
getUrl()
Returns the URL of this page.
|
WebClient |
getWebClient()
Returns the WebClient that originally loaded this page.
|
WebResponse |
getWebResponse()
Returns the web response that was originally used to create this page.
|
abstract boolean |
hasCaseSensitiveTagNames()
Returns true if this page has case-sensitive tag names, false otherwise. |
void |
initialize()
Initialize this page.
|
boolean |
isHtmlPage()
Returns true if this page is an HtmlPage.
|
void |
normalizeDocument()
The current implementation just calls DomNode.normalize() on the document element. |
protected void |
setDocumentType(DocumentType type)
Sets the document type.
|
void |
setEnclosingWindow(WebWindow window)
Sets the window that contains this page.
|
addCharacterDataChangeListener, addDomChangeListener, appendChild, asText, checkChildHierarchy, cloneNode, compareDocumentPosition, detach, fireCharacterDataChanged, fireNodeAdded, fireNodeDeleted, getAncestors, getAttributes, getBaseURI, getByXPath, getByXPath, getChildNodes, getChildren, getDescendants, getDomElementDescendants, getEndColumnNumber, getEndLineNumber, getFeature, getFirstByXPath, getFirstByXPath, getFirstChild, getHtmlElementDescendants, getHtmlPageOrNull, getIndex, getLastChild, getLocalName, getNamespaceURI, getNextSibling, getNodeValue, getOwnerDocument, getParentNode, getPrefix, getPreviousSibling, getReadyState, getScriptableObject, getScriptObject, getStartColumnNumber, getStartLineNumber, getTextContent, getUserData, hasAttributes, hasChildNodes, hasFeature, insertBefore, insertBefore, isAncestorOf, isAncestorOfAny, isBlock, isDefaultNamespace, isDirectlyAttachedToPage, isDisplayed, isEqualNode, isSameNode, isSupported, isTrimmedText, lookupNamespaceURI, lookupPrefix, mayBeDisplayed, normalize, notifyIncorrectness, onAddedToDocumentFragment, onAddedToPage, onAllChildrenAddedToPage, printChildrenAsXml, printXml, processImportNode, querySelector, querySelectorAll, remove, removeAllChildren, removeCharacterDataChangeListener, removeChild, removeDomChangeListener, replace, replaceChild, setNextSibling, setNodeValue, setParentNode, setPrefix, setPreviousSibling, setReadyState, setScriptableObject, setTextContent, setUserData
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
adoptNode, createAttributeNS, createEntityReference, createProcessingInstruction, getDocumentURI, getDomConfig, getElementById, getElementsByTagNameNS, getImplementation, getInputEncoding, getStrictErrorChecking, getXmlEncoding, getXmlStandalone, getXmlVersion, importNode, renameNode, setDocumentURI, setStrictErrorChecking, setXmlStandalone, setXmlVersion
appendChild, cloneNode, compareDocumentPosition, getAttributes, getBaseURI, getChildNodes, getFeature, getFirstChild, getLastChild, getLocalName, getNamespaceURI, getNextSibling, getNodeValue, getOwnerDocument, getParentNode, getPrefix, getPreviousSibling, getTextContent, getUserData, hasAttributes, hasChildNodes, insertBefore, isDefaultNamespace, isEqualNode, isSameNode, isSupported, lookupNamespaceURI, lookupPrefix, normalize, removeChild, replaceChild, setNodeValue, setPrefix, setTextContent, setUserData
public SgmlPage(WebResponse webResponse, WebWindow webWindow)
webResponse
- the web response that was used to create this page
webWindow
- the window that this page is being loaded into
public void cleanUp()
public WebResponse getWebResponse()
getWebResponse
in interface Page
public void initialize() throws IOException
initialize
in interface Page
IOException
- if an IO problem occurs
public String getNodeName()
getNodeName
in interface Node
getNodeName
in class DomNode
public short getNodeType()
getNodeType
in interface Node
getNodeType
in class DomNode
public WebWindow getEnclosingWindow()
getEnclosingWindow
in interface Page
public void setEnclosingWindow(WebWindow window)
window
- the new frame or null if this page is being removed from a frame
public WebClient getWebClient()
@Deprecated public DomDocumentFragment createDomDocumentFragment()
Deprecated. As of 2.18, please use createDocumentFragment() instead.
Creates an empty DomDocumentFragment object.
Returns a DomDocumentFragment.
public DomDocumentFragment createDocumentFragment()
Creates an empty DomDocumentFragment object.
createDocumentFragment
in interface Document
Returns a DomDocumentFragment.
public final DocumentType getDoctype()
getDoctype
in interface Document
protected void setDocumentType(DocumentType type)
type
- the document type
public SgmlPage getPage()
public abstract Element createElement(String tagName)
createElement
in interface Document
tagName
- the tag name which determines the type of element to be created
public abstract Element createElementNS(String namespaceURI, String qualifiedName)
createElementNS
in interface Document
namespaceURI
- the URI that identifies an XML namespace
qualifiedName
- the qualified name of the element type to instantiate
public abstract String getPageEncoding()
public DomElement getDocumentElement()
getDocumentElement
in interface Document
protected SgmlPage clone()
public String asXml()
public abstract boolean hasCaseSensitiveTagNames()
Returns true if this page has case-sensitive tag names, false otherwise. In general, XML has case-sensitive tag names, and HTML doesn't. This is especially important during XPath matching.
public void normalizeDocument()
The current implementation just calls DomNode.normalize() on the document element.
normalizeDocument
in interface Document
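As an illustration of what normalizing the document element does, the following is a minimal sketch that uses the JDK's built-in W3C DOM implementation rather than HtmlUnit itself. It assumes only the standard `Node.normalize()` contract (adjacent text nodes are merged); the class name `NormalizeSketch` and its helper methods are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Sketch: normalizing the document element merges adjacent text nodes. */
final class NormalizeSketch {

    /** Builds a root element holding two adjacent text nodes (JDK DOM, not HtmlUnit). */
    static Document buildFragmentedDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("root");
            doc.appendChild(root);
            root.appendChild(doc.createTextNode("Hello, "));
            root.appendChild(doc.createTextNode("world"));
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the number of direct children of the document element. */
    static int childCount(Document doc) {
        return doc.getDocumentElement().getChildNodes().getLength();
    }

    public static void main(String[] args) {
        Document doc = buildFragmentedDocument();
        System.out.println(childCount(doc));  // prints 2: two separate text nodes
        // Equivalent in spirit to the described behavior of normalizeDocument():
        doc.getDocumentElement().normalize();
        System.out.println(childCount(doc));  // prints 1: the text nodes were merged
    }
}
```

With a real `SgmlPage`, the same effect would presumably be reached by calling `normalizeDocument()` on the page.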
public String getCanonicalXPath()
Returns the canonical XPath expression which identifies this node, for instance "/html/body/table[3]/tbody/tr[5]/td[2]/span/a[3]".
WARNING: This sort of automated XPath expression is often quite bad at identifying a node, as it is highly sensitive to changes in the DOM tree.
getCanonicalXPath
in class DomNode
DomNode.getByXPath(String)
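To make the warning above concrete, here is a rough sketch of how a position-based canonical XPath can be derived for a plain W3C DOM element, and how a small DOM change invalidates it. This is not HtmlUnit's implementation; it runs on the JDK's DOM parser, and `CanonicalXPathSketch`, `parse`, and `canonicalXPath` are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Sketch: a position-based XPath for an element, built by walking up the tree. */
final class CanonicalXPathSketch {

    /** Parses an XML string with the JDK parser (assumption: well-formed input). */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns a path such as /html/body/p[2]; an index is added only when needed. */
    static String canonicalXPath(Element element) {
        Node parent = element.getParentNode();
        if (parent == null) {
            return "/" + element.getNodeName(); // detached element
        }
        String prefix = (parent instanceof Element)
                ? canonicalXPath((Element) parent) : "";
        int position = 0; // 1-based position among same-named element siblings
        int total = 0;    // number of same-named element siblings
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && n.getNodeName().equals(element.getNodeName())) {
                total++;
                if (n.isSameNode(element)) {
                    position = total;
                }
            }
        }
        String step = element.getNodeName() + (total > 1 ? "[" + position + "]" : "");
        return prefix + "/" + step;
    }

    public static void main(String[] args) {
        Document doc = parse("<html><body><p>a</p><p>b</p></body></html>");
        Element second = (Element) doc.getElementsByTagName("p").item(1);
        System.out.println(canonicalXPath(second)); // /html/body/p[2]
        // Inserting one sibling in front changes the path of the same node.
        second.getParentNode().insertBefore(doc.createElement("p"), second);
        System.out.println(canonicalXPath(second)); // /html/body/p[3]
    }
}
```

Because the path encodes sibling positions, any insertion or removal earlier in the tree shifts it, which is why such expressions are fragile as long-lived locators.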
public DomAttr createAttribute(String name)
createAttribute
in interface Document
public URL getUrl()
public boolean isHtmlPage()
Returns true if this page is an HtmlPage.
isHtmlPage
in interface Page
public DomNodeList<DomElement> getElementsByTagName(String tagName)
getElementsByTagName
in interface Document
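The note under hasCaseSensitiveTagNames() says that XML tag names are case-sensitive and that this matters for XPath matching. The sketch below shows the XML side of that statement using the JDK's DOM and XPath APIs, not HtmlUnit; `TagCaseSketch` and its methods are hypothetical, and the tag name is assumed to be a simple unprefixed name.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Sketch: tag-name lookups on an XML document are case-sensitive. */
final class TagCaseSketch {

    /** Three elements whose names differ only by letter case. */
    static final String SAMPLE = "<root><Item/><item/><ITEM/></root>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Counts matches through Document.getElementsByTagName. */
    static int countByTagName(String xml, String tagName) {
        return parse(xml).getElementsByTagName(tagName).getLength();
    }

    /** Counts matches through an XPath count() expression over all descendants. */
    static int countByXPath(String xml, String tagName) {
        try {
            Double count = (Double) XPathFactory.newInstance().newXPath()
                    .evaluate("count(//" + tagName + ")", parse(xml), XPathConstants.NUMBER);
            return count.intValue();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countByTagName(SAMPLE, "item")); // only the exact-case element
        System.out.println(countByXPath(SAMPLE, "iTeM"));   // no element has this exact name
    }
}
```

A page whose hasCaseSensitiveTagNames() returns false would be expected to treat these names as equivalent, so the same lookups could return different counts there.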
public CDATASection createCDATASection(String data)
createCDATASection
in interface Document
public Text createTextNode(String data)
createTextNode
in interface Document
public Comment createComment(String data)
createComment
in interface Document
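The factory methods listed above (createElement, createAttribute, createTextNode, createComment, createCDATASection, createDocumentFragment) all come from the W3C `Document` interface. As a hedged illustration of how they fit together, this sketch builds a small tree with the JDK's DOM implementation. On an `SgmlPage` the return types are HtmlUnit classes such as `DomAttr` and `DomDocumentFragment`, so behavior may differ in detail; `DomFactorySketch` is a hypothetical name.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;

/** Sketch: assembling nodes with the Document factory methods. */
final class DomFactorySketch {

    /** Builds a root "note" element with one attribute and three child nodes. */
    static Document build() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("note");
            doc.appendChild(root);

            // Attributes are created detached and then attached to an element.
            Attr lang = doc.createAttribute("lang");
            lang.setValue("en");
            root.setAttributeNode(lang);

            // A fragment is a lightweight container for a group of nodes.
            DocumentFragment fragment = doc.createDocumentFragment();
            fragment.appendChild(doc.createComment("generated"));
            fragment.appendChild(doc.createTextNode("plain text"));
            fragment.appendChild(doc.createCDATASection("<raw>"));

            // Appending the fragment moves its children into root.
            root.appendChild(fragment);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element root = build().getDocumentElement();
        System.out.println(root.getAttribute("lang"));
        System.out.println(root.getChildNodes().getLength());
    }
}
```

Appending a fragment inserts its children rather than the fragment node itself, which makes it convenient for building a group of nodes before attaching them in one step.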
Copyright © 2002–2016 Gargoyle Software Inc. All rights reserved.