public class ExcelExtractor extends java.lang.Object implements POIOLE2TextExtractor, ExcelExtractor
Returns the textual content of the file, suitable for indexing by something like Lucene, but not really intended for display to the user.
To turn an Excel file into CSV or a similar format, see the XLS2CSVmra example.
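As a minimal sketch of the indexing use case described above, the following builds a small workbook in memory and pulls its text out with `getText()`. The class name `ExtractTextSketch`, the sheet name, and the cell values are illustrative assumptions and are not part of the API.

```java
import org.apache.poi.hssf.extractor.ExcelExtractor;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;

import java.io.IOException;

// Hypothetical demo class; only ExcelExtractor and HSSFWorkbook come from POI.
public class ExtractTextSketch {

    // Builds a one-row workbook in memory and returns its extracted text.
    static String extract() throws IOException {
        try (HSSFWorkbook wb = new HSSFWorkbook()) {
            Sheet sheet = wb.createSheet("Inventory");
            Row row = sheet.createRow(0);
            row.createCell(0).setCellValue("widget");
            row.createCell(1).setCellValue(42.0);

            // The extractor is Closeable, so try-with-resources releases it.
            try (ExcelExtractor extractor = new ExcelExtractor(wb)) {
                return extractor.getText();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // The result is plain text meant for indexing, not for display.
        System.out.println(extract());
    }
}
```

With the documented defaults, the sheet name should appear in the output alongside the cell text.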
| Constructor and Description |
|---|
| ExcelExtractor(DirectoryNode dir) |
| ExcelExtractor(HSSFWorkbook wb) |
| ExcelExtractor(POIFSFileSystem fs) |
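The constructors listed above accept the same document at different levels of the OLE2 container. The sketch below, under the assumption that an `.xls` file is available as a byte array, opens it through a `POIFSFileSystem` and through that filesystem's root `DirectoryNode`. The class name `ConstructorSketch` and its helper methods are hypothetical.

```java
import org.apache.poi.hssf.extractor.ExcelExtractor;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.poifs.filesystem.DirectoryNode;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Hypothetical demo class showing two of the three constructors.
public class ConstructorSketch {

    // Serializes a tiny workbook so there is an OLE2 stream to read back.
    static byte[] sampleXls() throws IOException {
        try (HSSFWorkbook wb = new HSSFWorkbook();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            wb.createSheet("Data").createRow(0).createCell(0).setCellValue("hello");
            wb.write(out);
            return out.toByteArray();
        }
    }

    // ExcelExtractor(POIFSFileSystem): start from the whole OLE2 container.
    static String viaFileSystem(byte[] xls) throws IOException {
        POIFSFileSystem fs = new POIFSFileSystem(new ByteArrayInputStream(xls));
        // Closing the extractor is expected to release the underlying
        // resources as well, since closeFilesystem is documented as true by default.
        try (ExcelExtractor extractor = new ExcelExtractor(fs)) {
            return extractor.getText();
        }
    }

    // ExcelExtractor(DirectoryNode): start from a directory inside the container.
    static String viaDirectoryNode(byte[] xls) throws IOException {
        POIFSFileSystem fs = new POIFSFileSystem(new ByteArrayInputStream(xls));
        DirectoryNode root = fs.getRoot();
        try (ExcelExtractor extractor = new ExcelExtractor(root)) {
            return extractor.getText();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] xls = sampleXls();
        System.out.println(viaFileSystem(xls));
        System.out.println(viaDirectoryNode(xls));
    }
}
```

Both routes should yield the same text for the same bytes. The `DirectoryNode` form is mainly useful when the workbook is embedded inside another OLE2 document.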
| Modifier and Type | Method and Description |
|---|---|
| static java.lang.String | _extractHeaderFooter(HeaderFooter hf) |
| HSSFWorkbook | getDocument(): Returns the underlying POIDocument. |
| HSSFWorkbook | getFilesystem() |
| java.lang.String | getText(): Retrieves all the text from the document. |
| boolean | isCloseFilesystem() |
| static void | main(java.lang.String[] args): Command line extractor. |
| void | setCloseFilesystem(boolean doCloseFilesystem) |
| void | setFormulasNotResults(boolean formulasNotResults): Should we return the formula itself, and not the result it produces? Default is false. |
| void | setIncludeBlankCells(boolean includeBlankCells): Should blank cells be output? Default is to only output cells that are present in the file and are non-blank. |
| void | setIncludeCellComments(boolean includeCellComments): Should cell comments be included? Default is false. |
| void | setIncludeHeadersFooters(boolean includeHeadersFooters): Should headers and footers be included in the output? Default is true. |
| void | setIncludeSheetNames(boolean includeSheetNames): Should sheet names be included? Default is true. |
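The setters in the table above are applied before calling `getText()`. The following sketch toggles two of them on a workbook that contains a formula cell; the class name `OptionsSketch`, the sheet name `Totals`, and the formula are assumptions chosen for illustration. The formula is evaluated first so that a cached result exists when results rather than formulas are requested.

```java
import org.apache.poi.hssf.extractor.ExcelExtractor;
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;

import java.io.IOException;

// Hypothetical demo class for the extractor's output options.
public class OptionsSketch {

    static String extract(boolean formulasNotResults, boolean includeSheetNames)
            throws IOException {
        HSSFWorkbook wb = new HSSFWorkbook();
        HSSFRow row = wb.createSheet("Totals").createRow(0);
        row.createCell(0).setCellValue(2.0);
        row.createCell(1).setCellValue(3.0);
        row.createCell(2).setCellFormula("A1+B1");
        // Store a cached result so the "results" mode has a value to report.
        wb.getCreationHelper().createFormulaEvaluator().evaluateAll();

        // Closing the extractor also closes the workbook by default.
        try (ExcelExtractor extractor = new ExcelExtractor(wb)) {
            extractor.setFormulasNotResults(formulasNotResults); // default: false
            extractor.setIncludeSheetNames(includeSheetNames);   // default: true
            extractor.setIncludeCellComments(false);             // default: false
            extractor.setIncludeHeadersFooters(false);           // default: true
            return extractor.getText();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(extract(true, false));  // formula text, no sheet name
        System.out.println(extract(false, true));  // cached result, with sheet name
    }
}
```

For search indexing, returning results rather than formulas is usually the more useful setting, which matches the documented default.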
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface POIOLE2TextExtractor: getDocSummaryInformation, getMetadataTextExtractor, getRoot, getSummaryInformation
Methods inherited from interface POITextExtractor: close

public ExcelExtractor(HSSFWorkbook wb)

public ExcelExtractor(POIFSFileSystem fs) throws java.io.IOException
Throws: java.io.IOException

public ExcelExtractor(DirectoryNode dir) throws java.io.IOException
Throws: java.io.IOException

public static void main(java.lang.String[] args) throws java.io.IOException
Command line extractor.
Parameters: args - the command line parameters
Throws: java.io.IOException - if the file can't be read or contains errors

public void setIncludeSheetNames(boolean includeSheetNames)
Should sheet names be included? Default is true.
Specified by: setIncludeSheetNames in interface ExcelExtractor
Parameters: includeSheetNames - true if the sheet names should be included

public void setFormulasNotResults(boolean formulasNotResults)
Should we return the formula itself, and not the result it produces? Default is false.
Specified by: setFormulasNotResults in interface ExcelExtractor
Parameters: formulasNotResults - true if the formula itself is returned

public void setIncludeCellComments(boolean includeCellComments)
Should cell comments be included? Default is false.
Specified by: setIncludeCellComments in interface ExcelExtractor
Parameters: includeCellComments - true if cell comments should be included

public void setIncludeBlankCells(boolean includeBlankCells)
Should blank cells be output? Default is to only output cells that are present in the file and are non-blank.
Parameters: includeBlankCells - true if blank cells should be included

public void setIncludeHeadersFooters(boolean includeHeadersFooters)
Should headers and footers be included in the output? Default is true.
Specified by: setIncludeHeadersFooters in interface ExcelExtractor
Parameters: includeHeadersFooters - true if headers and footers should be included

public java.lang.String getText()
Retrieves all the text from the document.
Specified by: getText in interface POITextExtractor
Specified by: getText in interface ExcelExtractor

public static java.lang.String _extractHeaderFooter(HeaderFooter hf)
public HSSFWorkbook getDocument()
Returns the underlying POIDocument.
Specified by: getDocument in interface POIOLE2TextExtractor
Specified by: getDocument in interface POITextExtractor

public void setCloseFilesystem(boolean doCloseFilesystem)
Specified by: setCloseFilesystem in interface POITextExtractor
Parameters: doCloseFilesystem - true (default), if underlying resources/filesystem should be closed on POITextExtractor.close()

public boolean isCloseFilesystem()
Specified by: isCloseFilesystem in interface POITextExtractor
Returns: true, if resources/filesystem should be closed on POITextExtractor.close()

public HSSFWorkbook getFilesystem()
Specified by: getFilesystem in interface POITextExtractor

Copyright 2021 The Apache Software Foundation or its licensors, as applicable.