public final class DocumentTitleMatchClassifier extends java.lang.Object implements BoilerpipeFilter
Marks TextBlocks which contain parts of the HTML <TITLE> tag, using some heuristics which are quite specific to the news domain.

Constructor and Description |
---|
DocumentTitleMatchClassifier(java.lang.String title) |
Modifier and Type | Method and Description |
---|---|
java.util.Set<java.lang.String> | getPotentialTitles() |
boolean | process(TextDocument doc) Processes the given document doc. |
public DocumentTitleMatchClassifier(java.lang.String title)
public java.util.Set<java.lang.String> getPotentialTitles()
public boolean process(TextDocument doc) throws BoilerpipeProcessingException
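Description copied from interface: BoilerpipeFilter
Processes the given document doc.
Specified by: process in interface BoilerpipeFilter
Parameters: doc - The TextDocument that is to be processed.
Returns: true if changes have been made to the TextDocument.
Throws: BoilerpipeProcessingException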
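
The contract described above (build the classifier from a title, expose a set of candidate titles, and return true from process only when something was changed) can be illustrated with a small self-contained sketch. It does not use boilerpipe and is not the library's actual algorithm: the class TitleMatchSketch, the nested Block type, and the rule of splitting the title on " - " and " | " are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Illustrative stand-in only; NOT the boilerpipe implementation. */
public class TitleMatchSketch {

    /** Hypothetical stand-in for a text block: its text plus a "title" mark. */
    public static final class Block {
        public final String text;
        public boolean isTitle;

        public Block(String text) {
            this.text = text;
        }
    }

    private final Set<String> potentialTitles = new HashSet<>();

    /** Builds the candidate set from the raw contents of the TITLE tag. */
    public TitleMatchSketch(String title) {
        if (title == null) {
            return;
        }
        String normalized = title.trim().toLowerCase(Locale.ROOT);
        if (normalized.isEmpty()) {
            return;
        }
        potentialTitles.add(normalized);
        // Assumed heuristic: news pages often join headline and site name
        // with " - " or " | ", so each part is also kept as a candidate.
        for (String part : normalized.split("\\s+[|-]\\s+")) {
            if (!part.isEmpty()) {
                potentialTitles.add(part);
            }
        }
    }

    public Set<String> getPotentialTitles() {
        return potentialTitles;
    }

    /** Marks blocks whose text equals a candidate; returns true if any block changed. */
    public boolean process(List<Block> blocks) {
        boolean changed = false;
        for (Block b : blocks) {
            String text = b.text.trim().toLowerCase(Locale.ROOT);
            if (!b.isTitle && potentialTitles.contains(text)) {
                b.isTitle = true;
                changed = true;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        TitleMatchSketch sketch = new TitleMatchSketch("Storm Hits Coast - Example News");
        List<Block> blocks = new ArrayList<>();
        blocks.add(new Block("Example News"));
        blocks.add(new Block("Storm hits coast"));
        blocks.add(new Block("Heavy rain fell overnight."));
        System.out.println("changed: " + sketch.process(blocks));
        for (Block b : blocks) {
            System.out.println(b.isTitle + "\t" + b.text);
        }
    }
}
```

Calling process a second time on the same blocks returns false in this sketch, because already-marked blocks are skipped. That mirrors the documented return value, which reports whether changes were made rather than whether a title was found.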