Apache POI - Configuration


The best way to learn about using Apache POI is to read through the feature documentation and other online examples.

To keep the feature documentation focused on the APIs, it says little about the configuration settings that can be enabled. These settings may prove useful to users who have to handle very large documents or very high throughput.

| Configuration Setting | Description |
| --- | --- |
| org.apache.poi.ooxml.POIXMLTypeLoader.DEFAULT_XML_OPTIONS | POI support for the XSSF APIs relies heavily on XMLBeans. This XmlOptions instance can be configured, but take care if you change any of its settings. In POI 5.1.0, we will disallow DOCTYPE parsing by default in the XML files embedded in xlsx/docx/pptx/etc. files. DEFAULT_XML_OPTIONS.setDisallowDocTypeDeclaration(false) will undo this change. |
| org.apache.poi.openxml4j.util.ZipSecureFile.setMinInflateRatio(double ratio) | Sets the minimum ratio between deflated (compressed) and inflated (uncompressed) bytes, used to detect zip bombs. It defaults to 1% (= 0.01d), i.e. when the compression is better than 1% for any given package part that is read, parsing fails with an error indicating a zip bomb. |
| org.apache.poi.openxml4j.util.ZipSecureFile.setMaxEntrySize(long maxEntrySize) | Sets the maximum file size of a single zip entry. It defaults to 4GB, i.e. the 32-bit zip format maximum. This can be used to limit memory consumption and protect against security vulnerabilities when documents are provided by users. POI 5.1.0 removes the previous limit of 4GB on this setting. |
| org.apache.poi.openxml4j.util.ZipSecureFile.setMaxTextSize(long maxTextSize) | Sets the maximum number of characters of text that are extracted before an exception is thrown while extracting text from documents. This can be used to limit memory consumption and protect against security vulnerabilities when documents are provided by users. The default is approximately 10 million characters. Prior to POI 5.1.0, the maximum allowed was approximately 4 billion characters. |
| org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource.setThresholdBytesForTempFiles(int thresholdBytes) | Coming in POI 5.1.0. Number of bytes at which a zip entry is regarded as too large to hold in memory, so that its data is put in a temp file instead. Defaults to -1, meaning temp files are not used and zip entries with more than 2GB of data after decompression will fail. A value of 0 means all zip entries are stored in temp files. A threshold like 50000000 (approximately 50MB) is recommended. |
| org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource.setEncryptTempFiles(boolean encrypt) | Coming in POI 5.1.0. Whether temp files should be encrypted (default false). Only affects temp files related to zip entries. |
| org.apache.poi.openxml4j.opc.ZipPackage.setUseTempFilePackageParts(boolean tempFilePackageParts) | Coming in POI 5.1.0. Whether to save package part data in temp files to save memory (default false). |
| org.apache.poi.openxml4j.opc.ZipPackage.setEncryptTempFilePackageParts(boolean encryptTempFiles) | Coming in POI 5.1.0. Whether to encrypt package part temp files (default false). |
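
The inflate-ratio check described for setMinInflateRatio comes down to simple arithmetic. The following is a minimal sketch of that arithmetic using only java.util.zip from the JDK. It is not POI's implementation: the class InflateRatioSketch and its helper methods are hypothetical names introduced for illustration, and the sketch assumes the ratio is the compressed byte count divided by the uncompressed byte count, as the description suggests.

```java
import java.util.zip.Deflater;

// Illustrative sketch only; this is NOT how POI implements its check.
// All names in this class are hypothetical.
class InflateRatioSketch {

    // Ratio of compressed (deflated) bytes to uncompressed (inflated) bytes.
    static double ratio(long compressedBytes, long uncompressedBytes) {
        if (uncompressedBytes <= 0) {
            throw new IllegalArgumentException("uncompressedBytes must be positive");
        }
        return (double) compressedBytes / uncompressedBytes;
    }

    // True when the data compresses "too well", i.e. the ratio falls
    // below the configured minimum (0.01d is the default named above).
    static boolean looksLikeZipBomb(long compressedBytes, long uncompressedBytes,
                                    double minInflateRatio) {
        return ratio(compressedBytes, uncompressedBytes) < minInflateRatio;
    }

    // Deflates the input in memory and returns the compressed byte count.
    static long deflatedSize(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[8192];
        long total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    public static void main(String[] args) {
        // Highly repetitive data: 10 million zero bytes.
        byte[] zeros = new byte[10_000_000];
        long compressed = deflatedSize(zeros);
        System.out.println("ratio = " + ratio(compressed, zeros.length));
        System.out.println("suspicious = "
                + looksLikeZipBomb(compressed, zeros.length, 0.01d));
    }
}
```

In an application, the chosen threshold would be passed to ZipSecureFile.setMinInflateRatio(double). Under the assumption above, a higher value makes the check stricter and a lower value makes it more permissive.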

by POI Developers