Apache POI - HPSF - Java API for Microsoft Format Document Properties
Microsoft applications like "Word", "Excel" or "PowerPoint" let the user describe a document by properties like "title", "category" and so on. The application itself adds further information: last author, creation date, etc. These document properties are stored in property set streams. A property set stream is a separate document within a POI filesystem. HPSF is POI's pure-Java implementation for reading and writing property sets.
The HPSF HOWTO describes what a Java application should do to read a property set using HPSF, how to retrieve the information it needs, and how to write properties into the document.
HPSF supports OLE2 property set streams in general, and is not limited to the special case of document properties in the Microsoft Office files mentioned above. The HPSF description documents the internal structure of property set streams. A separate document explains the internals of thumbnail images.
by Rainer Klute