org.apache.poi
Class POITextExtractor

java.lang.Object
  extended by org.apache.poi.POITextExtractor
All Implemented Interfaces:
java.io.Closeable
Direct Known Subclasses:
POIOLE2TextExtractor

public abstract class POITextExtractor
extends java.lang.Object
implements java.io.Closeable

Common parent for text extractors of POI documents. You will typically find the implementation of a given format's text extractor under org.apache.poi.[format].extractor.

See Also:
ExcelExtractor, PowerPointExtractor, VisioTextExtractor, WordExtractor

Constructor Summary
POITextExtractor()
           
 
Method Summary
 void close()
          Frees the resources of the Extractor as soon as it is no longer needed.
abstract  POITextExtractor getMetadataTextExtractor()
          Returns another text extractor, which is able to output the textual content of the document metadata / properties, such as author and title.
abstract  java.lang.String getText()
          Retrieves all the text from the document.
 void setFilesystem(java.io.Closeable fs)
          Sets the filesystem to close, so that its file handles are cleaned up.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

POITextExtractor

public POITextExtractor()

Method Detail

getText

public abstract java.lang.String getText()
Retrieves all the text from the document. How cells, paragraphs, etc. are separated in the text is implementation-specific; see the Javadocs for the specific project for details.

Returns:
All the text from the document
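
Because separators are left to each implementation, two extractors can render comparable content quite differently. The following is a minimal sketch of that idea. It assumes hypothetical stand-in classes (TextSourceSketch, SheetTextSketch, ParagraphTextSketch) rather than the real POI classes, and the tab and newline conventions are illustrative assumptions, not the behaviour of any particular POI extractor.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the getText() contract; NOT the real POI class.
abstract class TextSourceSketch {
    public abstract String getText();
}

// Assumed convention: cells separated by tabs, one row per line.
class SheetTextSketch extends TextSourceSketch {
    private final List<List<String>> rows;

    SheetTextSketch(List<List<String>> rows) {
        this.rows = rows;
    }

    @Override
    public String getText() {
        StringBuilder sb = new StringBuilder();
        for (List<String> row : rows) {
            sb.append(String.join("\t", row)).append('\n');
        }
        return sb.toString();
    }
}

// Assumed convention: one paragraph per line.
class ParagraphTextSketch extends TextSourceSketch {
    private final List<String> paragraphs;

    ParagraphTextSketch(List<String> paragraphs) {
        this.paragraphs = paragraphs;
    }

    @Override
    public String getText() {
        StringBuilder sb = new StringBuilder();
        for (String paragraph : paragraphs) {
            sb.append(paragraph).append('\n');
        }
        return sb.toString();
    }
}

class GetTextDemo {
    public static void main(String[] args) {
        TextSourceSketch sheet = new SheetTextSketch(Arrays.asList(
                Arrays.asList("Name", "Qty"),
                Arrays.asList("Widget", "3")));
        TextSourceSketch doc = new ParagraphTextSketch(
                Arrays.asList("First paragraph.", "Second paragraph."));
        System.out.print(sheet.getText());
        System.out.print(doc.getText());
    }
}
```

Callers that post-process the returned string should therefore not hard-code a separator without checking the documentation of the extractor they actually use.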

getMetadataTextExtractor

public abstract POITextExtractor getMetadataTextExtractor()
Returns another text extractor, which is able to output the textual content of the document metadata / properties, such as author and title.

Returns:
the metadata text extractor
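
The relationship between a document extractor and its metadata extractor can be sketched as follows. This is an illustration under stated assumptions, not POI's implementation: DocExtractorSketch, BodyExtractorSketch and PropertiesExtractorSketch are hypothetical stand-ins, the "key = value" output format is invented for the example, and having the metadata extractor refuse to return a further metadata extractor is one possible design choice.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in mirroring the two abstract methods; NOT the real POI class.
abstract class DocExtractorSketch {
    public abstract String getText();

    public abstract DocExtractorSketch getMetadataTextExtractor();
}

// Renders document properties (author, title, ...) as text.
class PropertiesExtractorSketch extends DocExtractorSketch {
    private final Map<String, String> properties;

    PropertiesExtractorSketch(Map<String, String> properties) {
        this.properties = properties;
    }

    @Override
    public String getText() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : properties.entrySet()) {
            // Assumed output format: one "key = value" pair per line.
            sb.append(e.getKey()).append(" = ").append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    @Override
    public DocExtractorSketch getMetadataTextExtractor() {
        // Assumption: metadata has no metadata of its own.
        throw new UnsupportedOperationException("already the metadata extractor");
    }
}

// Returns the document body and hands out a separate extractor for the metadata.
class BodyExtractorSketch extends DocExtractorSketch {
    private final String body;
    private final Map<String, String> properties;

    BodyExtractorSketch(String body, Map<String, String> properties) {
        this.body = body;
        this.properties = properties;
    }

    @Override
    public String getText() {
        return body;
    }

    @Override
    public DocExtractorSketch getMetadataTextExtractor() {
        return new PropertiesExtractorSketch(properties);
    }
}

class MetadataDemo {
    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("Author", "A. Writer");
        props.put("Title", "Quarterly Report");
        DocExtractorSketch doc = new BodyExtractorSketch("Body text", props);
        System.out.println(doc.getText());
        System.out.print(doc.getMetadataTextExtractor().getText());
    }
}
```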

setFilesystem

public void setFilesystem(java.io.Closeable fs)
Sets the filesystem to close, so that its file handles are cleaned up.

Parameters:
fs - filesystem to close

close

public void close()
           throws java.io.IOException
Frees the resources of the Extractor as soon as it is no longer needed. This may include closing open file handles and freeing memory. The Extractor cannot be used after close has been called.

Specified by:
close in interface java.io.Closeable
Throws:
java.io.IOException
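
Since the class implements java.io.Closeable, an extractor can be managed with a try-with-resources statement. The sketch below illustrates that pattern with a hypothetical stand-in (ExtractorSketch) instead of the real class. It assumes that close() closes whatever was passed to setFilesystem, which is one reading of the descriptions above and not a statement about POI's actual implementation. TrackedResource and FixedTextExtractor are likewise invented for the example.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical stand-in mirroring the documented methods; NOT the real POI class.
abstract class ExtractorSketch implements Closeable {
    private Closeable fsToClose;

    public abstract String getText();

    public void setFilesystem(Closeable fs) {
        this.fsToClose = fs;
    }

    // Assumption: closing the extractor closes the registered filesystem, if any.
    @Override
    public void close() throws IOException {
        if (fsToClose != null) {
            fsToClose.close();
        }
    }
}

// Trivial extractor that returns a fixed string.
class FixedTextExtractor extends ExtractorSketch {
    @Override
    public String getText() {
        return "Hello, POI";
    }
}

class CloseDemo {
    // Stands in for an open file; records whether it has been closed.
    static final class TrackedResource implements Closeable {
        boolean closed = false;

        @Override
        public void close() {
            closed = true;
        }
    }

    static String extractAndClose(TrackedResource fs) throws IOException {
        try (ExtractorSketch extractor = new FixedTextExtractor()) {
            extractor.setFilesystem(fs);
            return extractor.getText();
        } // extractor.close() runs here, even if getText() throws
    }

    public static void main(String[] args) throws IOException {
        TrackedResource fs = new TrackedResource();
        System.out.println(extractAndClose(fs));
        System.out.println("closed: " + fs.closed);
    }
}
```

Reading the text inside the try block and using only the returned String afterwards respects the rule that the Extractor cannot be used once close has been called.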