public final class BoilerpipeSAXInput extends java.lang.Object implements BoilerpipeInput
Parses an InputSource using SAX and returns a TextDocument.

Constructor and Description |
---|
BoilerpipeSAXInput(org.xml.sax.InputSource is): Creates a new instance of BoilerpipeSAXInput for the given InputSource. |
Modifier and Type | Method and Description |
---|---|
TextDocument | getTextDocument(): Retrieves the TextDocument using a default HTML parser. |
TextDocument | getTextDocument(BoilerpipeHTMLParser parser): Retrieves the TextDocument using the given HTML parser. |
public BoilerpipeSAXInput(org.xml.sax.InputSource is) throws org.xml.sax.SAXException
Creates a new instance of BoilerpipeSAXInput for the given InputSource.
Parameters: is - The InputSource to parse.
Throws: org.xml.sax.SAXException
public TextDocument getTextDocument() throws BoilerpipeProcessingException
Retrieves the TextDocument using a default HTML parser.
Specified by: getTextDocument in interface BoilerpipeInput
Returns: The TextDocument.
Throws: BoilerpipeProcessingException
public TextDocument getTextDocument(BoilerpipeHTMLParser parser) throws BoilerpipeProcessingException
Retrieves the TextDocument using the given HTML parser.
Parameters: parser - The parser used to transform the input into boilerpipe's internal representation.
Returns: The TextDocument.
Throws: BoilerpipeProcessingException
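
The constructor takes a standard org.xml.sax.InputSource, so the caller first has to wrap the HTML in one. The following is a minimal sketch of two common ways to do that using only JDK classes. The class name InputSourceSketch and its helper methods are hypothetical and are not part of boilerpipe.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of boilerpipe: builds the
// org.xml.sax.InputSource that the BoilerpipeSAXInput constructor expects.
public class InputSourceSketch {

    /** Wraps in-memory HTML in an InputSource backed by a character stream. */
    public static InputSource fromString(String html) {
        return new InputSource(new StringReader(html));
    }

    /** Wraps raw bytes in an InputSource and records their character encoding. */
    public static InputSource fromBytes(byte[] bytes, String encoding) {
        InputSource is = new InputSource(new ByteArrayInputStream(bytes));
        is.setEncoding(encoding);
        return is;
    }

    public static void main(String[] args) {
        String html = "<html><body><p>Hello</p></body></html>";

        InputSource fromText = fromString(html);
        InputSource fromRaw = fromBytes(html.getBytes(StandardCharsets.UTF_8), "UTF-8");

        // A character-stream source has a Reader; a byte-stream source has an InputStream.
        System.out.println(fromText.getCharacterStream() != null);
        System.out.println(fromRaw.getByteStream() != null);
        System.out.println(fromRaw.getEncoding());
    }
}
```

With boilerpipe on the classpath, either source could then be passed to the constructor documented above, for example `new BoilerpipeSAXInput(InputSourceSketch.fromString(html)).getTextDocument()`. The caller must handle the declared SAXException and BoilerpipeProcessingException.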