Package | Description |
---|---|
de.l3s.boilerpipe.filters.simple | The BoilerpipeFilters in this package are straightforward and probably not specific to English. |
Class and Description |
---|
BoilerplateBlockFilter: Removes TextBlocks which have explicitly been marked as "not content". |
InvertedFilter: Inverts the "isContent" flag for all TextBlocks. |
LabelToBoilerplateFilter: Marks all blocks that contain a given label as "boilerplate". |
MarkEverythingContentFilter: Marks all blocks as content. |
MinClauseWordsFilter: Keeps only blocks that have at least one segment fragment ("clause") with at least k words (default: 5). |
SplitParagraphBlocksFilter: Splits TextBlocks at paragraph boundaries. |
SurroundingToContentFilter |
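
The behaviour described for BoilerplateBlockFilter, InvertedFilter and MinClauseWordsFilter can be illustrated with a small, self-contained sketch. It does not use boilerpipe's own classes: `FilterSketch`, `Block` and every method name below are hypothetical stand-ins, and the clause rule (text delimited by `, . : ; ! ?`, words separated by whitespace) is an assumption made for illustration, not necessarily what the library does. Each method returns whether it changed anything, which is also an assumption about how such filters report their effect.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; these are NOT boilerpipe's own classes.
class FilterSketch {

    /** Minimal stand-in for a text block: some text plus an "isContent" flag. */
    static final class Block {
        final String text;
        boolean isContent;

        Block(String text, boolean isContent) {
            this.text = text;
            this.isContent = isContent;
        }
    }

    /** Flips the "isContent" flag of every block; returns true if any block was touched. */
    static boolean invert(List<Block> blocks) {
        for (Block b : blocks) {
            b.isContent = !b.isContent;
        }
        return !blocks.isEmpty();
    }

    /** Removes blocks flagged as "not content"; returns true if anything was removed. */
    static boolean removeBoilerplate(List<Block> blocks) {
        return blocks.removeIf(b -> !b.isContent);
    }

    /** Assumption: a clause is a run of text delimited by one of , . : ; ! ? */
    static boolean hasClauseWithMinWords(String text, int minWords) {
        for (String clause : text.split("[,.:;!?]")) {
            String trimmed = clause.trim();
            if (!trimmed.isEmpty() && trimmed.split("\\s+").length >= minWords) {
                return true;
            }
        }
        return false;
    }

    /** Flags content blocks as "not content" unless some clause has at least minWords words. */
    static boolean minClauseWords(List<Block> blocks, int minWords) {
        boolean changed = false;
        for (Block b : blocks) {
            if (b.isContent && !hasClauseWithMinWords(b.text, minWords)) {
                b.isContent = false;
                changed = true;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        List<Block> blocks = new ArrayList<>();
        blocks.add(new Block("Home, About, Contact", true));
        blocks.add(new Block("The quick brown fox jumps over the lazy dog.", true));

        minClauseWords(blocks, 5);   // k = 5, the default named in the table
        removeBoilerplate(blocks);   // drop everything now flagged "not content"

        for (Block b : blocks) {
            System.out.println(b.text);
        }
    }
}
```

In this sketch the navigation-style block consists only of one-word clauses, so it is flagged as "not content" and then removed, while the full sentence is kept. Chaining the two steps mirrors how such filters are typically applied one after another to the same list of blocks.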