public class OldExcelExtractor extends java.lang.Object implements POITextExtractor
Returns much (but not all) of the textual content of the file. The output is suitable for indexing by something like Apache Lucene, or for use by Apache Tika, but is not really intended for display to the user.
Constructor and Description |
---|
OldExcelExtractor(DirectoryNode directory) |
OldExcelExtractor(java.io.File f) |
OldExcelExtractor(java.io.InputStream input) |
OldExcelExtractor(POIFSFileSystem fs) |
Modifier and Type | Method and Description |
---|---|
int | getBiffVersion(): The Biff version, largely corresponding to the Excel version |
java.lang.Object | getDocument() |
java.io.Closeable | getFilesystem() |
int | getFileType(): The kind of the file, one of BOFRecord.TYPE_WORKSHEET, BOFRecord.TYPE_CHART, BOFRecord.TYPE_EXCEL_4_MACRO or BOFRecord.TYPE_WORKSPACE_FILE |
POITextExtractor | getMetadataTextExtractor(): Returns another text extractor, which is able to output the textual content of the document metadata / properties, such as author and title. |
java.lang.String | getText(): Retrieves the text contents of the file, as best we can for these old file formats |
protected void | handleNumericCell(java.lang.StringBuilder text, double value) |
boolean | isCloseFilesystem() |
static void | main(java.lang.String[] args) |
void | setCloseFilesystem(boolean doCloseFilesystem) |
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface POITextExtractor: close
public OldExcelExtractor(java.io.InputStream input) throws java.io.IOException
Throws: java.io.IOException
public OldExcelExtractor(java.io.File f) throws java.io.IOException
Throws: java.io.IOException
public OldExcelExtractor(POIFSFileSystem fs) throws java.io.IOException
Throws: java.io.IOException
public OldExcelExtractor(DirectoryNode directory) throws java.io.IOException
Throws: java.io.IOException
public static void main(java.lang.String[] args) throws java.io.IOException
Throws: java.io.IOException
public int getBiffVersion()
The Biff version, largely corresponding to the Excel version
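To make the relationship between the BIFF version and the Excel release concrete, here is a minimal sketch of a lookup helper. It assumes that getBiffVersion() reports BIFF2 through BIFF5 as the plain integers 2 through 5; the class BiffVersionNames and its describe method are hypothetical and not part of Apache POI.

```java
// Hypothetical helper for illustration only; not part of Apache POI.
final class BiffVersionNames {
    private BiffVersionNames() {}

    // Assumption: the extractor reports BIFF2..BIFF5 as the integers 2..5.
    static String describe(int biffVersion) {
        switch (biffVersion) {
            case 2: return "Excel 2.x";
            case 3: return "Excel 3.0";
            case 4: return "Excel 4.0";
            case 5: return "Excel 5.0 / Excel 95";
            default: return "unknown (BIFF" + biffVersion + ")";
        }
    }

    public static void main(String[] args) {
        // In real use the argument would come from extractor.getBiffVersion().
        System.out.println(describe(4));
    }
}
```

The default branch keeps the helper from failing on a value it does not recognise, which matters because the correspondence is only approximate.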
public int getFileType()
The kind of the file, one of BOFRecord.TYPE_WORKSHEET, BOFRecord.TYPE_CHART, BOFRecord.TYPE_EXCEL_4_MACRO or BOFRecord.TYPE_WORKSPACE_FILE
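Because getFileType() returns a bare int, callers typically compare it against the BOFRecord constants. The sketch below shows one way to turn the value into a readable label. So that it compiles without POI on the classpath, it declares local copies of the four constants; their numeric values are an assumption taken from the BIFF BOF record's substream-type field, and real code should reference BOFRecord directly. The class name OldExcelFileTypes is hypothetical.

```java
// Hypothetical helper for illustration only; not part of Apache POI.
final class OldExcelFileTypes {
    private OldExcelFileTypes() {}

    // Assumption: local stand-ins for the BOFRecord.TYPE_* constants.
    // Prefer the constants on BOFRecord itself in real code.
    static final int TYPE_WORKSHEET = 0x0010;
    static final int TYPE_CHART = 0x0020;
    static final int TYPE_EXCEL_4_MACRO = 0x0040;
    static final int TYPE_WORKSPACE_FILE = 0x0100;

    // Maps a file type value to a short human-readable label.
    static String describe(int fileType) {
        switch (fileType) {
            case TYPE_WORKSHEET: return "worksheet";
            case TYPE_CHART: return "chart";
            case TYPE_EXCEL_4_MACRO: return "Excel 4 macro sheet";
            case TYPE_WORKSPACE_FILE: return "workspace file";
            default: return "unknown (0x" + Integer.toHexString(fileType) + ")";
        }
    }

    public static void main(String[] args) {
        // In real use the argument would come from extractor.getFileType().
        System.out.println(describe(TYPE_CHART));
    }
}
```

An indexing pipeline might use such a label to skip chart or workspace files, which are unlikely to hold much cell text.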
public java.lang.String getText()
Retrieves the text contents of the file, as best we can for these old file formats
Specified by: getText in interface POITextExtractor
protected void handleNumericCell(java.lang.StringBuilder text, double value)
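public POITextExtractor getMetadataTextExtractor()
Description copied from interface: POITextExtractor
Returns another text extractor, which is able to output the textual content of the document metadata / properties, such as author and title.
Specified by: getMetadataTextExtractor in interface POITextExtractor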
public void setCloseFilesystem(boolean doCloseFilesystem)
Specified by: setCloseFilesystem in interface POITextExtractor
Parameters: doCloseFilesystem - true (default), if underlying resources/filesystem should be closed on POITextExtractor.close()
public boolean isCloseFilesystem()
Specified by: isCloseFilesystem in interface POITextExtractor
Returns: true, if resources/filesystem should be closed on POITextExtractor.close()
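The close-filesystem flag decides who owns the underlying resource: with the default of true, closing the extractor also closes the filesystem, while false leaves it open for the caller to manage. The following sketch models that contract with a stand-in class so it runs without POI. ExtractorCloseSketch and closesUnderlying are hypothetical names, and the close() logic is an assumption derived from the parameter description above, not a copy of POI's implementation.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in that mimics the documented contract; not POI code.
final class ExtractorCloseSketch implements Closeable {
    private final Closeable filesystem;
    private boolean closeFilesystem = true; // documented default

    ExtractorCloseSketch(Closeable filesystem) {
        this.filesystem = filesystem;
    }

    void setCloseFilesystem(boolean doCloseFilesystem) {
        this.closeFilesystem = doCloseFilesystem;
    }

    boolean isCloseFilesystem() {
        return closeFilesystem;
    }

    // Assumption: close() releases the underlying resource only when the flag is set.
    @Override
    public void close() throws IOException {
        if (closeFilesystem && filesystem != null) {
            filesystem.close();
        }
    }

    // Reports whether closing the stand-in also closed the underlying resource.
    static boolean closesUnderlying(boolean doCloseFilesystem) {
        boolean[] closed = {false};
        ExtractorCloseSketch sketch = new ExtractorCloseSketch(() -> closed[0] = true);
        sketch.setCloseFilesystem(doCloseFilesystem);
        try {
            sketch.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return closed[0];
    }

    public static void main(String[] args) {
        System.out.println("flag=true closes underlying: " + closesUnderlying(true));
        System.out.println("flag=false closes underlying: " + closesUnderlying(false));
    }
}
```

Setting the flag to false is the natural choice when one filesystem object is shared by several extractors and should outlive each of them.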
public java.io.Closeable getFilesystem()
Specified by: getFilesystem in interface POITextExtractor
public java.lang.Object getDocument()
Specified by: getDocument in interface POITextExtractor
Copyright 2022 The Apache Software Foundation or its licensors, as applicable.