PDF Clown 0.0.8
java.lang.Object
  it.stefanochizzolini.clown.tools.TextExtractor
public class TextExtractor
Tool for extracting text from content contexts.
Nested Class Summary | | |
---|---|---|
static class | TextExtractor.AreaModeEnum | Text-to-area matching mode. |
Constructor Summary | |
---|---|
TextExtractor() | |
TextExtractor(boolean sorted) | |
TextExtractor(List<Rectangle2D> areas, boolean sorted) | |
Method Summary | | |
---|---|---|
Map<Rectangle2D,List<ITextString>> | extract(Contents contents) | Extracts text strings from the given contents. |
Map<Rectangle2D,List<ITextString>> | extract(IContentContext contentContext) | Extracts text strings from the given content context. |
String | extractPlain(Contents contents) | Extracts plain text from the given contents. |
String | extractPlain(IContentContext contentContext) | Extracts plain text from the given content context. |
Map<Rectangle2D,List<ITextString>> | filter(List<? extends ITextString> textStrings, Rectangle2D... areas) | Gets the text strings matching the given areas. |
List<ITextString> | filter(List<? extends ITextString> textStrings, Rectangle2D area) | Gets the text strings matching the given area. |
Map<Rectangle2D,List<ITextString>> | filter(Map<Rectangle2D,List<ITextString>> textStrings, Rectangle2D... areas) | Gets the text strings matching the given areas. |
List<ITextString> | filter(Map<Rectangle2D,List<ITextString>> textStrings, Rectangle2D area) | Gets the text strings matching the given area. |
TextExtractor.AreaModeEnum | getAreaMode() | Gets the text-to-area matching mode. |
List<Rectangle2D> | getAreas() | Gets the graphic areas whose text has to be extracted. |
double | getAreaTolerance() | Gets the admitted outer area (in points) for containment matching purposes. |
boolean | isSorted() | Gets whether the text strings have to be sorted. |
void | setAreaMode(TextExtractor.AreaModeEnum value) | |
void | setAreas(List<Rectangle2D> value) | |
void | setAreaTolerance(double value) | |
void | setSorted(boolean value) | |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public TextExtractor()
public TextExtractor(boolean sorted)
public TextExtractor(List<Rectangle2D> areas, boolean sorted)
Method Detail |
---|
public Map<Rectangle2D,List<ITextString>> extract(IContentContext contentContext)
Extracts text strings from the given content context.
Parameters:
contentContext - Source content context.

public Map<Rectangle2D,List<ITextString>> extract(Contents contents)
Extracts text strings from the given contents.
Parameters:
contents - Source contents.

public String extractPlain(IContentContext contentContext)
Extracts plain text from the given content context.
Parameters:
contentContext - Source content context.

public String extractPlain(Contents contents)
Extracts plain text from the given contents.
Parameters:
contents - Source contents.

public List<ITextString> filter(Map<Rectangle2D,List<ITextString>> textStrings, Rectangle2D area)
Gets the text strings matching the given area.
Parameters:
textStrings - Text strings to filter, grouped by source area.
area - Graphic area against which the text strings are matched.

public Map<Rectangle2D,List<ITextString>> filter(Map<Rectangle2D,List<ITextString>> textStrings, Rectangle2D... areas)
Gets the text strings matching the given areas.
Parameters:
textStrings - Text strings to filter, grouped by source area.
areas - Graphic areas against which the text strings are matched.

public List<ITextString> filter(List<? extends ITextString> textStrings, Rectangle2D area)
Gets the text strings matching the given area.
Parameters:
textStrings - Text strings to filter.
area - Graphic area against which the text strings are matched.

public Map<Rectangle2D,List<ITextString>> filter(List<? extends ITextString> textStrings, Rectangle2D... areas)
Gets the text strings matching the given areas.
Parameters:
textStrings - Text strings to filter.
areas - Graphic areas against which the text strings are matched.

public TextExtractor.AreaModeEnum getAreaMode()
Gets the text-to-area matching mode.

public List<Rectangle2D> getAreas()
Gets the graphic areas whose text has to be extracted.

public double getAreaTolerance()
Gets the admitted outer area (in points) for containment matching purposes.
This measure ensures that text whose boxes slightly overlap the area bounds is not excluded from the match.
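
The effect of an area tolerance can be illustrated with plain `java.awt.geom` types. The sketch below is not PDF Clown's implementation: the class `AreaToleranceSketch` and its methods `expand` and `containedWithTolerance` are hypothetical names, and it assumes that the tolerance grows the area by the same number of points on every side before a containment check is made.

```java
import java.awt.geom.Rectangle2D;

/** Illustrative sketch only; this is not PDF Clown code. */
final class AreaToleranceSketch {
    /** Returns a copy of the area grown outward by `tolerance` points on every side. */
    static Rectangle2D expand(Rectangle2D area, double tolerance) {
        return new Rectangle2D.Double(
            area.getX() - tolerance,
            area.getY() - tolerance,
            area.getWidth() + 2 * tolerance,
            area.getHeight() + 2 * tolerance);
    }

    /** Containment match: the text box must lie entirely inside the expanded area. */
    static boolean containedWithTolerance(Rectangle2D area, Rectangle2D textBox, double tolerance) {
        return expand(area, tolerance).contains(textBox);
    }

    public static void main(String[] args) {
        Rectangle2D area = new Rectangle2D.Double(100, 100, 200, 50);
        // A text box that pokes 1.5 points past the left edge of the area.
        Rectangle2D textBox = new Rectangle2D.Double(98.5, 110, 60, 12);

        System.out.println(containedWithTolerance(area, textBox, 0)); // false
        System.out.println(containedWithTolerance(area, textBox, 2)); // true
    }
}
```

With a tolerance of 0 the box is rejected because its left edge falls outside the area; a tolerance of 2 points absorbs the 1.5-point overhang.
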

public boolean isSorted()
Gets whether the text strings have to be sorted.

public void setAreaMode(TextExtractor.AreaModeEnum value)
See Also:
getAreaMode()

public void setAreas(List<Rectangle2D> value)
See Also:
getAreas()

public void setAreaTolerance(double value)
See Also:
getAreaTolerance()

public void setSorted(boolean value)
See Also:
isSorted()
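
The shape of the value returned by the multi-area `filter` overloads, a map from each area to the text strings matched to it, can be sketched without the library. Everything below is an assumption made for illustration: `AreaFilterSketch`, `Fragment` (a stand-in for `ITextString`), and `groupByArea` are hypothetical names, and matching is done by strict containment, which may differ from the behavior selected through `TextExtractor.AreaModeEnum`.

```java
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; none of these types belong to PDF Clown. */
final class AreaFilterSketch {
    /** Hypothetical stand-in for ITextString: some text plus its bounding box. */
    static final class Fragment {
        final String text;
        final Rectangle2D box;

        Fragment(String text, Rectangle2D box) {
            this.text = text;
            this.box = box;
        }
    }

    /** Groups fragments by the areas that fully contain their boxes, keeping the order of the areas. */
    static Map<Rectangle2D, List<Fragment>> groupByArea(List<Fragment> fragments, Rectangle2D... areas) {
        Map<Rectangle2D, List<Fragment>> result = new LinkedHashMap<>();
        for (Rectangle2D area : areas) {
            List<Fragment> matches = new ArrayList<>();
            for (Fragment fragment : fragments) {
                if (area.contains(fragment.box)) {
                    matches.add(fragment);
                }
            }
            // Every requested area gets an entry, even when nothing matched it.
            result.put(area, matches);
        }
        return result;
    }

    public static void main(String[] args) {
        Rectangle2D header = new Rectangle2D.Double(0, 0, 200, 50);
        Rectangle2D body = new Rectangle2D.Double(0, 50, 200, 300);

        List<Fragment> fragments = new ArrayList<>();
        fragments.add(new Fragment("Title", new Rectangle2D.Double(10, 10, 80, 12)));
        fragments.add(new Fragment("First paragraph", new Rectangle2D.Double(10, 60, 150, 12)));

        Map<Rectangle2D, List<Fragment>> grouped = groupByArea(fragments, header, body);
        for (Fragment fragment : grouped.get(header)) {
            System.out.println("header: " + fragment.text);
        }
        for (Fragment fragment : grouped.get(body)) {
            System.out.println("body: " + fragment.text);
        }
    }
}
```

Keying the result by `Rectangle2D` mirrors the `Map<Rectangle2D,List<ITextString>>` return type documented above, so a caller can look up the matches for each area it passed in.
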