move over latest documentation files from poi-site
7
src/documentation/README.txt
Normal file
@@ -0,0 +1,7 @@
|
||||
This is the base documentation directory.
|
||||
|
||||
skinconf.xml # This file customizes Forrest for your project. In it, you
|
||||
# tell forrest the project name, logo, copyright info, etc
|
||||
|
||||
sitemap.xmap # Optional. This sitemap is consulted before all core sitemaps.
|
||||
# See http://forrest.apache.org/docs/project-sitemap.html
|
||||
42
src/documentation/RELEASE-NOTES.txt
Normal file
@@ -0,0 +1,42 @@
|
||||
The Apache POI project is pleased to announce the release of POI @VERSION@.
|
||||
Featured are a handful of new areas of functionality, and numerous bug fixes.
|
||||
|
||||
See the downloads page for source distributions: https://poi.apache.org/download.html
|
||||
|
||||
Release Notes
|
||||
|
||||
Changes
|
||||
------------
|
||||
The most notable changes in this release are:
|
||||
|
||||
@List changes here@
|
||||
|
||||
A full list of changes is available in the change log: https://poi.apache.org/changes.html.
|
||||
People interested should also follow the dev mailing list to track further progress.
|
||||
|
||||
Release Contents
|
||||
----------------
|
||||
|
||||
This release comes in source form:
|
||||
- source archive you can build POI from (poi-src-@VERSION@-@DSTAMP@.zip or poi-src-@VERSION@-@DSTAMP@.tar.gz)
|
||||
Unpack the archive and use the following command to build all POI components with JDK 1.8 or higher:
|
||||
|
||||
gradle jar
|
||||
|
||||
Pre-built versions of all POI components are also available in the central Maven repository
|
||||
under Group ID "org.apache.poi" and Version "@VERSION@".
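
For example, a Maven build might declare the core "poi" artifact as follows.
This is only a sketch: "poi" is one of several artifact IDs, so substitute the
component you need.

    <dependency>
      <groupId>org.apache.poi</groupId>
      <artifactId>poi</artifactId>
      <version>@VERSION@</version>
    </dependency>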
|
||||
|
||||
All release artifacts are accompanied by SHA checksums and PGP signatures
|
||||
that you can use to verify the authenticity of your download.
|
||||
The public key used for the PGP signature can be found at
|
||||
https://svn.apache.org/repos/asf/poi/tags/@RELEASE_TAG@/KEYS
|
||||
|
||||
About Apache POI
|
||||
-----------------------
|
||||
|
||||
Apache POI is well-known in the Java field as a library for reading and
|
||||
writing Microsoft Office file formats, such as Excel, PowerPoint, Word,
|
||||
Visio, Publisher and Outlook. It supports both the older (OLE2) and
|
||||
new (OOXML - Office Open XML) formats.
|
||||
|
||||
See https://poi.apache.org/ for more details
|
||||
328
src/documentation/cli.xconf
Normal file
@@ -0,0 +1,328 @@
|
||||
<?xml version="1.0"?>
|
||||
<!--
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
-->
|
||||
<!--+
|
||||
| This is the Apache Cocoon command line configuration file.
|
||||
| Here you give the command line interface details of where
|
||||
| to find various aspects of your Cocoon installation.
|
||||
|
|
||||
| If you wish, you can also use this file to specify the URIs
|
||||
| that you wish to generate.
|
||||
|
|
||||
| The current configuration information in this file is for
|
||||
| building the Cocoon documentation. Therefore, all links here
|
||||
| are relative to the build context dir, which, in the build.xml
|
||||
| file, is set to ${build.context}
|
||||
|
|
||||
| Options:
|
||||
| verbose: increase amount of information presented
|
||||
| to standard output (default: false)
|
||||
| follow-links: whether linked pages should also be
|
||||
| generated (default: true)
|
||||
| precompile-only: precompile sitemaps and XSP pages, but
|
||||
| do not generate any pages (default: false)
|
||||
| confirm-extensions: check the mime type for the generated page
|
||||
| and adjust filename and links extensions
|
||||
| to match the mime type
|
||||
| (e.g. text/html->.html)
|
||||
|
|
||||
| Note: Whilst using an xconf file to configure the Cocoon
|
||||
| Command Line gives access to more features, the use of
|
||||
| command line parameters is more stable, as there are
|
||||
| currently plans to improve the xconf format to allow
|
||||
| greater flexibility. If you require a stable and
|
||||
| consistent method for accessing the CLI, it is recommended
|
||||
| that you use the command line parameters to configure
|
||||
| the CLI. See documentation at:
|
||||
| http://cocoon.apache.org/2.1/userdocs/offline/
|
||||
| http://wiki.apache.org/cocoon/CommandLine
|
||||
|
|
||||
+-->
|
||||
|
||||
<cocoon verbose="true"
|
||||
follow-links="true"
|
||||
precompile-only="false"
|
||||
confirm-extensions="false">
|
||||
|
||||
<!--+
|
||||
| The context directory is usually the webapp directory
|
||||
| containing the sitemap.xmap file.
|
||||
|
|
||||
| The config file is the cocoon.xconf file.
|
||||
|
|
||||
| The work directory is used by Cocoon to store temporary
|
||||
| files and cache files.
|
||||
|
|
||||
| The destination directory is where generated pages will
|
||||
| be written (assuming the 'simple' mapper is used, see
|
||||
| below)
|
||||
+-->
|
||||
<context-dir>.</context-dir>
|
||||
<config-file>WEB-INF/cocoon.xconf</config-file>
|
||||
<work-dir>../tmp/cocoon-work</work-dir>
|
||||
<dest-dir>../site</dest-dir>
|
||||
|
||||
<!--+
|
||||
| A checksum file can be used to store checksums for pages
|
||||
| as they are generated. When the site is next generated,
|
||||
| files will not be written if their checksum has not changed.
|
||||
| This means that it will be easier to detect which files
|
||||
| need to be uploaded to a server, using the timestamp.
|
||||
|
|
||||
| The default path is relative to the core webapp directory.
|
||||
| An absolute path can be used.
|
||||
+-->
|
||||
<!-- <checksums-uri>build/work/checksums</checksums-uri>-->
|
||||
|
||||
<!--+
|
||||
| Broken link reporting options:
|
||||
| Report into a text file, one link per line:
|
||||
| <broken-links type="text" file="filename"/>
|
||||
| Report into an XML file:
|
||||
| <broken-links type="xml" file="filename"/>
|
||||
| Ignore broken links (default):
|
||||
| <broken-links type="none"/>
|
||||
|
|
||||
| Two attributes to this node specify whether a page should
|
||||
| be generated when an error has occurred. 'generate' specifies
|
||||
| whether a page should be generated (default: true) and
|
||||
| extension specifies an extension that should be appended
|
||||
| to the generated page's filename (default: none)
|
||||
|
|
||||
| Using this, a quick scan through the destination directory
|
||||
| will show broken links, by their filename extension.
|
||||
+-->
|
||||
<broken-links type="xml"
|
||||
file="../brokenlinks.xml"
|
||||
generate="false"
|
||||
extension=".error"
|
||||
show-referrers="true"/>
|
||||
|
||||
<!--+
|
||||
| Load classes at startup. This is necessary for generating
|
||||
| from sites that use SQL databases and JDBC.
|
||||
| The <load-class> element can be repeated if multiple classes
|
||||
| are needed.
|
||||
+-->
|
||||
<!--
|
||||
<load-class>org.firebirdsql.jdbc.Driver</load-class>
|
||||
-->
|
||||
|
||||
<!--+
|
||||
| Configures logging.
|
||||
| The 'log-kit' parameter specifies the location of the log kit
|
||||
| configuration file (usually called logkit.xconf).
|
||||
|
|
||||
| Logger specifies the logging category (for all logging prior
|
||||
| to other Cocoon logging categories taking over)
|
||||
|
|
||||
| Available log levels are:
|
||||
| DEBUG: prints all level of log messages.
|
||||
| INFO: prints all level of log messages except DEBUG
|
||||
| ones.
|
||||
| WARN: prints all level of log messages except DEBUG
|
||||
| and INFO ones.
|
||||
| ERROR: prints all level of log messages except DEBUG,
|
||||
| INFO and WARN ones.
|
||||
| FATAL_ERROR: prints only log messages of this level
|
||||
+-->
|
||||
<!-- <logging log-kit="WEB-INF/logkit.xconf" logger="cli" level="ERROR" /> -->
|
||||
|
||||
<!--+
|
||||
| Specifies the filename to be appended to URIs that
|
||||
| refer to a directory (i.e. end with a forward slash).
|
||||
+-->
|
||||
<default-filename>index.html</default-filename>
|
||||
|
||||
<!--+
|
||||
| Specifies a user agent string to the sitemap when
|
||||
| generating the site.
|
||||
|
|
||||
| A generic term for a web browser is "user agent". Any
|
||||
| user agent, when connecting to a web server, will provide
|
||||
| a string to identify itself (e.g. as Internet Explorer or
|
||||
| Mozilla). It is possible to have Cocoon serve different
|
||||
| content depending upon the user agent string provided by
|
||||
| the browser. If your site does this, then you may want to
|
||||
| use this <user-agent> entry to provide a 'fake' user agent
|
||||
| to Cocoon, so that it generates the correct version of your
|
||||
| site.
|
||||
|
|
||||
| For most sites, this can be ignored.
|
||||
+-->
|
||||
<!--
|
||||
<user-agent>Cocoon Command Line Environment 2.1</user-agent>
|
||||
-->
|
||||
|
||||
<!--+
|
||||
| Specifies an accept string to the sitemap when generating
|
||||
| the site.
|
||||
| User agents can specify to an HTTP server what types of content
|
||||
| (by mime-type) they are able to receive. E.g. a browser may be
|
||||
| able to handle jpegs, but not pngs. The HTTP accept header
|
||||
| allows the server to take the browser's capabilities into account,
|
||||
| and only send back content that it can handle.
|
||||
|
|
||||
| For most sites, this can be ignored.
|
||||
+-->
|
||||
|
||||
<accept>*/*</accept>
|
||||
|
||||
<!--+
|
||||
| Specifies which URIs should be included or excluded, according
|
||||
| to wildcard patterns.
|
||||
|
|
||||
| These includes/excludes are only relevant when you are following
|
||||
| links. A link URI must match an include pattern (if one is given)
|
||||
| and not match an exclude pattern, if it is to be followed by
|
||||
| Cocoon. It can be useful, for example, where there are links in
|
||||
| your site to pages that are not generated by Cocoon, such as
|
||||
| references to api-documentation.
|
||||
|
|
||||
| By default, all URIs are included. If both include and exclude
|
||||
| patterns are specified, a URI is first checked against the
|
||||
| include patterns, and then against the exclude patterns.
|
||||
|
|
||||
| Multiple patterns can be given, using multiple include or exclude
|
||||
| nodes.
|
||||
|
|
||||
| The order of the elements is not significant, as only the first
|
||||
| successful match of each category is used.
|
||||
|
|
||||
| Currently, only the complete source URI can be matched (including
|
||||
| any URI prefix). Future plans include destination URI matching
|
||||
| and regexp matching. If you have requirements for these, contact
|
||||
| dev@cocoon.apache.org.
|
||||
+-->
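  <!--+
      | Illustrative sketch only, not part of the POI configuration.
      | Both patterns below are hypothetical. With these two nodes, a
      | link would be followed only if its URI matches "docs/**" and
      | does not match "docs/drafts/**":
      |
      |   <include pattern="docs/**"/>
      |   <exclude pattern="docs/drafts/**"/>
      +-->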
|
||||
|
||||
<exclude pattern="**/"/>
|
||||
<exclude pattern="api/**"/>
|
||||
<!-- POI Customisation - allow us to have an index page at /apidocs/ -->
|
||||
<exclude pattern="**apidocs/dev/**"/>
|
||||
<exclude pattern="**apidocs/3.*/**"/>
|
||||
<exclude pattern="**apidocs/4.*/**"/>
|
||||
|
||||
<!--
|
||||
This is a workaround for FOR-284 "link rewriting broken when
|
||||
linking to xml source views which contain site: links".
|
||||
See the explanation there and in declare-broken-site-links.xsl
|
||||
-->
|
||||
<exclude pattern="site:**"/>
|
||||
<exclude pattern="ext:**"/>
|
||||
<exclude pattern="lm:**"/>
|
||||
<exclude pattern="**/site:**"/>
|
||||
<exclude pattern="**/ext:**"/>
|
||||
<exclude pattern="**/lm:**"/>
|
||||
|
||||
<!-- Exclude tokens used in URLs to ASF mirrors (interpreted by a CGI) -->
|
||||
<exclude pattern="[preferred]/**"/>
|
||||
<exclude pattern="[location]"/>
|
||||
|
||||
<!-- <include-links extension=".html"/>-->
|
||||
|
||||
<!--+
|
||||
| <uri> nodes specify the URIs that should be generated, and
|
||||
| where required, what should be done with the generated pages.
|
||||
| They describe the way the URI of the generated file is created
|
||||
| from the source page's URI. There are three ways that a generated
|
||||
| file URI can be created: append, replace and insert.
|
||||
|
|
||||
| The "type" attribute specifies one of (append|replace|insert):
|
||||
|
|
||||
| append:
|
||||
| Append the generated page's URI to the end of the source URI:
|
||||
|
|
||||
| <uri type="append" src-prefix="documents/" src="index.html"
|
||||
| dest="build/dest/"/>
|
||||
|
|
||||
| This means that
|
||||
| (1) the "documents/index.html" page is generated
|
||||
| (2) the file will be written to "build/dest/documents/index.html"
|
||||
|
|
||||
| replace:
|
||||
| Completely ignore the generated page's URI - just
|
||||
| use the destination URI:
|
||||
|
|
||||
| <uri type="replace" src-prefix="documents/" src="index.html"
|
||||
| dest="build/dest/docs.html"/>
|
||||
|
|
||||
| This means that
|
||||
| (1) the "documents/index.html" page is generated
|
||||
| (2) the result is written to "build/dest/docs.html"
|
||||
| (3) this works only for "single" pages - and not when links
|
||||
| are followed
|
||||
|
|
||||
| insert:
|
||||
| Insert generated page's URI into the destination
|
||||
| URI at the point marked with a * (example uses fictional
|
||||
| zip protocol)
|
||||
|
|
||||
| <uri type="insert" src-prefix="documents/" src="index.html"
|
||||
| dest="zip://*.zip/page.html"/>
|
||||
|
|
||||
| This means that
|
||||
| (1)
|
||||
|
|
||||
| In any of these scenarios, if the dest attribute is omitted,
|
||||
| the value provided globally using the <dest-dir> node will
|
||||
| be used instead.
|
||||
+-->
|
||||
<!--
|
||||
<uri type="replace"
|
||||
src-prefix="samples/"
|
||||
src="hello-world/hello.html"
|
||||
dest="build/dest/hello-world.html"/>
|
||||
-->
|
||||
|
||||
<!--+
|
||||
| <uri> nodes can be grouped together in a <uris> node. This
|
||||
| enables a group of URIs to share properties. The following
|
||||
| properties can be set for a group of URIs:
|
||||
| * follow-links: should pages be crawled for links
|
||||
| * confirm-extensions: should file extensions be checked
|
||||
| for the correct mime type
|
||||
| * src-prefix: all source URIs should be
|
||||
| pre-pended with this prefix before
|
||||
| generation. The prefix is not
|
||||
| included when calculating the
|
||||
| destination URI
|
||||
| * dest: the base destination URI to be
|
||||
| shared by all pages in this group
|
||||
| * type: the method to be used to calculate
|
||||
| the destination URI. See above
|
||||
| section on <uri> node for details.
|
||||
|
|
||||
| Each <uris> node can have a name attribute. When a name
|
||||
| attribute has been specified, the -n switch on the command
|
||||
| line can be used to tell Cocoon to only process the URIs
|
||||
| within this URI group. When no -n switch is given, all
|
||||
| <uris> nodes are processed. Thus, one xconf file can be
|
||||
| used to manage multiple sites.
|
||||
+-->
|
||||
<!--
|
||||
<uris name="mirrors" follow-links="false">
|
||||
<uri type="append" src="mirrors.html"/>
|
||||
</uris>
|
||||
-->
|
||||
|
||||
<!--+
|
||||
| File containing URIs (plain text, one per line).
|
||||
+-->
|
||||
<!--
|
||||
<uri-file>uris.txt</uri-file>
|
||||
-->
|
||||
</cocoon>
|
||||
41
src/documentation/content/locationmap.xml
Normal file
@@ -0,0 +1,41 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
-->
|
||||
<locationmap xmlns="http://apache.org/forrest/locationmap/1.0">
|
||||
<components>
|
||||
<matchers default="lm">
|
||||
<matcher
|
||||
name="lm"
|
||||
src="org.apache.forrest.locationmap.WildcardLocationMapHintMatcher"/>
|
||||
</matchers>
|
||||
</components>
|
||||
<locator>
|
||||
<!--
|
||||
To locate all your source documents in a slide repository you can do:
|
||||
|
||||
<match pattern="tabs.xml">
|
||||
<location src="http://127.0.0.1:8080/slide/files/tabs.xml"/>
|
||||
</match>
|
||||
<match pattern="site.xml">
|
||||
<location src="http://127.0.0.1:8080/slide/files/site.xml"/>
|
||||
</match>
|
||||
<match pattern="**.xml">
|
||||
<location src="http://127.0.0.1:8080/slide/files/{1}.xml"/>
|
||||
</match>
|
||||
-->
|
||||
</locator>
|
||||
</locationmap>
|
||||
85
src/documentation/content/xdocs/apidocs/index.xml
Normal file
@@ -0,0 +1,85 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Apache POI™ - Javadocs</title>
|
||||
<authors>
|
||||
<person id="PD" name="POI Developers" email="dev@poi.apache.org" />
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section><title>Apache POI Javadocs</title>
|
||||
<p>
|
||||
The Javadocs for the latest (development) version of Apache POI
|
||||
can be <a href="dev/index.html">accessed online here</a>, or built
|
||||
from a <a href="site:subversion">source code checkout</a>
|
||||
by running the <em>javadocs</em> Ant task. The
|
||||
<a href="dev/index.html">latest (development) Javadocs</a> are generally
|
||||
updated every few weeks, so may lag the most recent development slightly.
|
||||
</p>
|
||||
<p>
|
||||
For recent releases, the Javadocs for the latest stable release
|
||||
of each family can also be browsed online:
|
||||
</p>
|
||||
<ul>
|
||||
<li><a href="ext:apidocs/v50">Apache POI 5.0.x Javadocs</a></li>
|
||||
<li><a href="ext:apidocs/v41">Apache POI 4.1.x Javadocs</a></li>
|
||||
<li><a href="ext:apidocs/v40">Apache POI 4.0.x Javadocs</a></li>
|
||||
<li><a href="ext:apidocs/v317">Apache POI 3.17 Javadocs</a></li>
|
||||
</ul>
|
||||
|
||||
<section><title>Older Releases</title>
|
||||
<p>
|
||||
For every release of Apache POI, the specific Javadocs for that
|
||||
version are available with the release.
|
||||
</p>
|
||||
<p>
|
||||
Maven / Gradle / IDE users are able to fetch the javadocs for each
|
||||
of the Apache POI jars from Maven Central (or your preferred Maven
|
||||
mirror). These are made available with the <em>javadoc</em> classifier,
|
||||
e.g. <em>group: 'org.apache.poi', name: 'poi', version: '4.1.1',
|
||||
classifier: 'javadoc'</em>
|
||||
</p>
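<p>
For Maven users, the equivalent declaration might look like the following
sketch. It reuses the <em>poi</em> artifact and version <em>4.1.1</em> from
the example above; substitute the artifact and version you actually use.
</p>
<source><![CDATA[
<dependency>
  <groupId>org.apache.poi</groupId>
  <artifactId>poi</artifactId>
  <version>4.1.1</version>
  <classifier>javadoc</classifier>
</dependency>
]]></source>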
|
||||
<p>
|
||||
If you have downloaded the <em>binary (bin)</em> release, then you
|
||||
can find the Javadocs within the download in the <em>/docs/apidocs/</em>
|
||||
folder.
|
||||
</p>
|
||||
<p>
|
||||
If you have downloaded the <em>source (src)</em> release, then you
|
||||
need to build your own copy. Run the <em>javadocs</em> ant task
|
||||
to have the Javadocs built; the build will tell you the output
|
||||
directory at the end (it varies slightly between POI versions).
|
||||
</p>
|
||||
</section>
|
||||
</section>
|
||||
</body>
|
||||
<footer>
|
||||
<legal>
|
||||
Copyright (c) @year@ The Apache Software Foundation. All rights reserved.
|
||||
<br />
|
||||
Apache POI, POI, Apache, the Apache feather logo, and the Apache
|
||||
POI project logo are trademarks of The Apache Software Foundation.
|
||||
</legal>
|
||||
</footer>
|
||||
</document>
|
||||
451
src/documentation/content/xdocs/casestudies.xml
Normal file
@@ -0,0 +1,451 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Apache POI™ - Case Studies</title>
|
||||
<authors>
|
||||
<person id="AO" name="Andrew C. Oliver" email="acoliver@apache.org"/>
|
||||
<person id="CR" name="Cameron Riley" email="crileyNO@SPAMekmail.com"/>
|
||||
<person id="DF" name="David Fisher" email="dfisher@jmlafferty.com"/>
|
||||
<person id="DS" name="Dominik Stadler" email="centic@apache.org"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section>
|
||||
<title>Introduction</title>
|
||||
<p>
|
||||
A number of people are using POI for a variety of purposes. As with
|
||||
any new API or technology, the first question people generally ask
|
||||
is not "how can I" but rather "Who else is doing what I'm about to
|
||||
do?" This is understandable with the abysmal success rate in the
|
||||
software business. These case statements are meant to help create
|
||||
confidence and understanding.
|
||||
</p>
|
||||
</section>
|
||||
<section>
|
||||
<title>Submitting a Case Study</title>
|
||||
<p>
|
||||
We are actively seeking case studies for this page (after all it
|
||||
just started). To submit a case study, either
|
||||
<a href="site:guidelines">
|
||||
submit a patch for this page</a> or email it to the
|
||||
<a href="site:mailinglists">mailing list
|
||||
</a> (with [PATCH] prefixed subject, please).
|
||||
</p>
|
||||
</section>
|
||||
<section>
|
||||
<title>Case Studies</title>
|
||||
|
||||
<section><title>Manticore Projects - VBox Financial Reporting Software</title>
|
||||
<p>
|
||||
<em>Andreas Reichel, Managing Consultant</em><br/><br/>
|
||||
</p>
|
||||
<p>
|
||||
<strong>Use Case for Apache POI in VBox Financial Reporting Software</strong><br/>
|
||||
<a href="https://manticore-projects.com/">Manticore Projects</a> specializes in Financial Valuation,
|
||||
Accounting, and Reporting under IFRS 9, IFRS 16, and IFRS 17. The software extensively leverages
|
||||
Apache POI for importing, exporting, and visualizing data, making it a cornerstone of the solutions.
|
||||
</p>
|
||||
<p>
|
||||
<strong>SQL Sheet Integration for Data Capture</strong><br/>
|
||||
The software uses and supports <a href="https://github.com/panchmp/sqlsheet">SQL Sheet</a> to build
|
||||
"Data Capture Sheets", allowing end-users to seamlessly upload structured data via Microsoft Excel
|
||||
spreadsheets into applications. <a href="https://github.com/panchmp/sqlsheet">SQL Sheet</a>, a JDBC driver
|
||||
for XLS/XLSX files based on Apache POI, transforms worksheets into database tables, enabling access
|
||||
through plain SQL and JDBC MetaData.
|
||||
</p>
|
||||
<p>
|
||||
<strong>Streamlined Excel Exports for Controllers and Auditors</strong><br/>
|
||||
Within VBox applications, Apache POI enables interactive export of UI content, such as data tables, into
|
||||
formatted Excel spreadsheets. This functionality provides financial controllers and auditors with easy
|
||||
access to complex data and calculations in a familiar format.
|
||||
</p>
|
||||
<p>
|
||||
<strong>ETL-VBox Report Builder for Regulatory Compliance</strong><br/>
|
||||
|
||||
The <a href="https://manticore-projects.com/VBox/etl.html">ETL-VBox Report Builder</a> uses Apache POI
|
||||
to create spreadsheet-based form reports, a critical requirement for regulatory reporting. Regulatory
|
||||
bodies often provide specific MS Excel templates with multiple sheets representing data forms and fields.<br/>
|
||||
With Apache POI, the software visualizes these Excel templates directly in the UI, mimicking the Excel
|
||||
experience. Non-technical users can drag and drop records or values from data cubes into the spreadsheet
|
||||
interface. This "data to cell-range" mapping is stored and used to populate the workbook automatically,
|
||||
ensuring reports are generated accurately and on time—such as during daily end-of-day processes.<br/>
|
||||
One of the standout benefits of this approach is the platform-independent separation of report templates
|
||||
(including corporate design styles, formulas, and charts) from the actual data. By leveraging Apache POI,
|
||||
it bridges the gap between structured data and Excel's flexibility, delivering the best of both worlds for
|
||||
end-users who love working in Excel.
|
||||
</p>
|
||||
<p>
|
||||
<strong>Why Apache POI?</strong><br/>
|
||||
Apache POI has proven to be a high-performance and robust library. It is supported by comprehensive
|
||||
documentation and an excellent community of developers. At
|
||||
<a href="https://manticore-projects.com/">Manticore Projects</a>, we are proud
|
||||
contributors to this vibrant community and deeply value the collaboration that drives the evolution of this
|
||||
indispensable tool.<br/>
|
||||
By integrating Apache POI into our software, we empower users with intuitive and powerful features for
|
||||
financial reporting, helping them meet their regulatory and operational needs with confidence.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>WriteExcel Utilities</title>
|
||||
<p>
|
||||
This WriteExcel distribution package found at <a href="https://stevepritchard.ca/home/WriteExcel/overview.htm">WriteExcel Utilities</a> contains source,
|
||||
documentation, examples, build tools and precompiled classes that wrap the Apache POI Excel interface.
|
||||
</p>
|
||||
<p>
|
||||
WriteExcel creates a Workbook file using a simple interface that uses formatted strings as the primary way
|
||||
of passing information to the support methods which interpret the strings and issue the necessary POI method calls.
|
||||
Access to the <code>Workbook</code> object allows the POI methods to be called directly for cases not handled by the interface.
|
||||
</p>
|
||||
<p>
|
||||
An existing Workbook file can be used as a template source so that sheets can be copied and then left intact, modified and/or supplemented.
|
||||
</p>
|
||||
<p>
|
||||
The creation of Workbooks containing charts is supported by using an existing Workbook file as a template that contains one or more charts
|
||||
and using WriteExcel to modify the data that the chart refers to.
|
||||
</p>
|
||||
<p>
|
||||
The ReadExcelFile component of the package can be used to selectively iterate across existing Workbooks (or Workbooks under construction) and create
|
||||
Java objects with the selected data which can then be forwarded for further processing.
|
||||
<br/><br/>
|
||||
WriteExcel was used to produce the monthly reporting files for a church accounting system among other things.
|
||||
<br/><br/>
|
||||
Steve Pritchard<br/>
|
||||
Rexcel Systems Inc.<br/>
|
||||
July, 2019<br/>
|
||||
<a href="https://stevepritchard.ca">Steve Pritchard Utilities</a>
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>Processing biometric scanner logs - Glassbeam</title>
|
||||
<p>
|
||||
As a small startup, the company has no attendance management system in place, so attendance is recorded in a
manual register. There is also a biometric scanner that controls entry through the office gates
and keeps its own log of entries.
Rather than set up an attendance management system, they decided to use these biometric scanner logs to generate an
Excel report.
|
||||
</p>
|
||||
<p>
|
||||
A <a href="http://www.shivamkapoor.com/blogs/technology/2019/07/10/code-design-template-for-apache-poi-based-excel-writers/">blog post</a> describes how
|
||||
the startup uses Apache POI to generate reports about attendance of employees based on biometric scanner logs.
|
||||
</p>
|
||||
<p>
|
||||
A fully working solution can be found on <a href="https://github.com/codingkapoor/essl-attendance-report-generator">GitHub</a>.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>REWOO Scope</title>
|
||||
<p>
|
||||
<a href="http://www.rewoo.de/">REWOO Scope</a> is a modern and easy to use web-based enterprise content management system. It supports knowledge workers and managers in making the right decisions based upon all relevant information.
|
||||
</p>
|
||||
<p>
|
||||
The system uses Apache POI to extract information stored within Excel files and use it transparently within REWOO Scope. Thus, POI allows our customers to work in their standard office environment while also having all important information in the REWOO Scope system.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>QuestionPro</title>
|
||||
<p>
|
||||
<a href="http://www.questionpro.com">QuestionPro</a> is an online service allowing businesses and individuals to create, deploy and do in-depth analysis of online surveys. The technology is built on open-source frameworks like Struts, Velocity, POI, Lucene; the list goes on. The application is deployed on a Linux application cluster farm with a MySQL database.
|
||||
</p>
|
||||
<p>
|
||||
There are quite a few competitors delivering similar solutions using Microsoft technologies like ASP and .NET. One of the distinct advantages our competitors had over us was the ability to generate Excel spreadsheets, Access databases (MDB) etc. on the fly using the Component Object Model (COM), since their servers were running IIS and they had access to the COM registry and such.
|
||||
</p>
|
||||
<p>
|
||||
QuestionPro's initial solution was to generate CSV files. This was easy; however, it was a cumbersome process for our clients to download the CSV files and then import them into Excel. Moreover, formatting information could not be preserved or captured using the CSV format. This is where POI came to our rescue. With a POI-based solution, we could generate a full report with multiple sheets and all the analytical reports. To keep the solution scalable, we had a dedicated cluster for generating the reports.
|
||||
</p>
|
||||
<p>
|
||||
|
||||
The Apache POI project has helped QuestionPro compete with other players in the marketplace that rely on proprietary technology. It leveled the playing field with respect to reporting and data analysis solutions. It helped in opening doors into closed solutions like Microsoft's CDF. Today about 100 Excel reports are generated daily, each with about 10-30 sheets in them.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
Vivek Bhaskaran
|
||||
</p>
|
||||
<p>
|
||||
<a href="http://www.questionpro.com">QuestionPro, Inc</a>
|
||||
</p>
|
||||
|
||||
<p>
|
||||
POI In Action - <a href="http://www.questionpro.com/marketing/SurveyReport-289.xls">http://www.questionpro.com/marketing/SurveyReport-289.xls</a>
|
||||
</p>
|
||||
|
||||
</section>
|
||||
|
||||
<section><title>Sunshine Systems</title>
|
||||
<p>
|
||||
<a href="http://www.sunshinesys.com/">Sunshine Systems</a> developed a
|
||||
POI based reporting solution for a price optimization software package which
|
||||
is used by major retail chains.
|
||||
</p>
|
||||
<p>The solution allowed the retailer's merchandise planners and managers to request
markdown decision support reports and price change reports using a standard browser.
The users could specify report type, report options, as well as company,
|
||||
division,
|
||||
and department filter criteria. Report generation took place in the
|
||||
multi-threaded
|
||||
application server and was capable of supporting many simultaneous report requests.
|
||||
</p>
|
||||
<p>The reporting application collected business information from the price
|
||||
optimization
|
||||
application's Oracle database. The data was aggregated and summarized
|
||||
based upon the
|
||||
specific report type and filter criteria requested by the user. The
|
||||
final report was
|
||||
rendered as a Microsoft Excel spreadsheet using the POI HSSF API and
|
||||
was stored on
|
||||
the report database server for that specific user as a BLOB. Reports
|
||||
could be
|
||||
seamlessly and easily viewed using the same browser.
|
||||
</p>
|
||||
<p>The retailers liked the solution because they had instantaneous access
|
||||
to critical
|
||||
business data through an extremely easy to use browser interface. They
|
||||
did not need
|
||||
to train the broader user community on all the complexities of the optimization
|
||||
application. Furthermore, the reports were generated in an Excel spreadsheet
|
||||
format,
|
||||
which everyone was familiar with and which also allowed further data
|
||||
analysis using
|
||||
standard Excel features.
|
||||
</p>
|
||||
<p>Rob Stevenson (rstevenson at sunshinesys dot com)
|
||||
</p>
|
||||
</section>
|
||||
<section>
|
||||
<title>Bank of Lithuania</title>
|
||||
<p>
|
||||
The
|
||||
<a href="http://www.lbank.lt/">Bank of Lithuania</a>
|
||||
reports financial statistical data to Excel format using the
|
||||
<a href="https://poi.apache.org/">Apache POI</a>
|
||||
project's
|
||||
<a href="site:spreadsheet">
|
||||
HSSF</a> API. The system is based on Oracle JServer and
|
||||
utilizes a Java stored procedure that outputs to XLS format
|
||||
using the HSSF API. - Arian Lashkov (alaskov at lbank.lt)
|
||||
</p>
|
||||
</section>
|
||||
<!-- <section>-->
|
||||
<!-- <title>Bit Tracker by Tracker Inc., and ThinkVirtual</title>-->
|
||||
<!-- <p>-->
|
||||
<!-- Bit Tracker (http://www.bittracker.com/) is the world's first and only web-based drill bit tracking system to manage your company's critical bit information and use that data to its full potential. It manages all bit related data, including their usage, locations, how they were used, and results such as rate of penetration and dull grade after use. This data needs to be available in Excel format for backwards compatibility and other uses in the industry. After using CSV and HTML formats, we needed something better for creating the spreadsheets and POI is the answer. It works great and was easy to implement. Kudos to the POI team.-->
|
||||
<!-- </p>-->
|
||||
<!-- <p>-->
|
||||
<!-- Travis Reeder (travis at thinkvirtual dot com)-->
|
||||
<!-- </p>-->
|
||||
<!-- </section>-->
|
||||
<section>
|
||||
<title>Edwards And Kelcey Technology</title>
|
||||
<p>
|
||||
Edwards and Kelcey Technology (http://www.ekcorp.com/) developed a
|
||||
Facility
|
||||
Management and Maintenance System for the Telecommunications industry
|
||||
based
|
||||
on Turbine and Velocity. Originally the invoicing was done with a simple
|
||||
CSV
|
||||
sheet which was then marked up by accounts and customized for each client.
|
||||
As the application grew steadily, so did the need for
invoices that did not have to be touched by hand. POI provided the
|
||||
solution to this issue, integrating easily and transparently into the
|
||||
system. POI HSSF was used to create the invoices directly from the server
|
||||
in
|
||||
Excel 97 format and now services over 150 unique invoices per month.
|
||||
</p>
|
||||
<p>
|
||||
Cameron Riley (crileyNO@ SPAMekmail.com)
|
||||
</p>
|
||||
</section>
|
||||
<section>
|
||||
<title>ClickFind</title>
|
||||
<p>
|
||||
<a href="http://www.clickfind.com/">ClickFind Inc.</a> used the POI
project's HSSF API to provide their medical
|
||||
research clients with an Excel export from their electronic data
|
||||
collection web service Data Collector 3.0. The POI team's assistance
|
||||
allowed ClickFind to give their clients a data format that requires less
|
||||
technical expertise than the XML format used by the Data Collector
|
||||
application. This was important to ClickFind as many of their current
|
||||
and potential clients are already using Excel in their day-to-day
|
||||
operations and in established procedures for handling their generated
|
||||
clinical data. - Jared Walker (jared.walker at clickfind.com)
|
||||
</p>
|
||||
</section>
|
||||
<section>
|
||||
<title>IKAN Software NV</title>
|
||||
<p>In addition to Change Management and Database Modelling, IKAN Software NV
|
||||
(http://www.ikan.be/) develops and supports its own ETL
|
||||
(Extract/Transform/Load) tools.</p>
|
||||
|
||||
<p>IKAN's latest product in this domain is called ETL4ALL
|
||||
(http://www.ikan.be/etl4all/). ETL4ALL is an open source tool
|
||||
allowing data transfer from and to virtually any data source. Users can
|
||||
combine and examine data stored in relational databases, XML databases, PDF
|
||||
files, EDI, CSV files, etc.
|
||||
</p>
|
||||
|
||||
<p>Naturally, Microsoft Excel files are also supported.
|
||||
POI has been used to successfully implement this support in ETL4ALL.</p>
|
||||
</section>
|
||||
<section>
|
||||
<title>JM Lafferty Associates, Inc.</title>
|
||||
<p>
|
||||
On its <a href="http://www.forecastworks.com/public/">ForecastWorks</a> website
|
||||
<a href="http://www.jmlafferty.com/">JM Lafferty Associates, Inc.</a> produces dynamic on demand
|
||||
financial analyses of companies and institutional funds. The pages produced are selected and exported
|
||||
in several file formats including PPT and XLS.
|
||||
</p>
|
||||
<ul>
|
||||
<li>The PPT files produced are of high quality which is on a par with similar PDF files.</li>
|
||||
<li>The XLS files produced contain a complex forecasting model built from a template with a VBA Macro.</li>
|
||||
</ul>
|
||||
<p>
|
||||
David Fisher (dfisher@jmlafferty.com)
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>iDATA Development Ltd (IDD)</title>
|
||||
<p>
|
||||
<a href="http://www.iexlsoftware.com/">IDD</a> have developed the iEXL product to
generate Excel spreadsheets directly on the iSeries/AS400 (IBM i on Power) platform.
|
||||
</p>
|
||||
<p>
|
||||
Professional spreadsheets are created via a menu system. Some basic programming is required for more complex options.
When programming is required, it can be carried out using RPG, SQL, QUERY, JAVA, COBOL etc.
In other words, it draws on your existing staff's knowledge.
|
||||
</p>
|
||||
<p>
|
||||
Design spreadsheets with:
|
||||
</p>
|
||||
<ul>
|
||||
<li>Fonts down to cell level</li>
|
||||
<li>Colours (Background and text) down to cell level</li>
|
||||
<li>Shading down to cell level</li>
|
||||
<li>Cell patterns down to cell level</li>
|
||||
<li>Cell initialization</li>
|
||||
<li>Freeze Panes</li>
|
||||
<li>Passwords</li>
|
||||
<li>Images/Pictures both static and dynamic</li>
|
||||
<li>Headings</li>
|
||||
<li>Page breaks</li>
|
||||
<li>Sheet breaks</li>
|
||||
<li>Text insertion and much more</li>
|
||||
<li>Functions/Formula</li>
|
||||
<li>Merge cells</li>
|
||||
<li>Row Height</li>
|
||||
<li>Cell text alignment</li>
|
||||
<li>Text Rotation </li>
|
||||
<li>50 Database files per workbook.</li>
|
||||
<li>E-mail the spreadsheet</li>
|
||||
</ul>
|
||||
<p>
|
||||
The product name is 'iEXL' and has been live on both European and North American systems for over four years.
|
||||
It is being used in preference to more established commercial products which our clients have already purchased.
|
||||
This is due to cost and ease of use.
|
||||
</p>
|
||||
<p>
|
||||
All spreadsheets can be archived if required so that historical spreadsheets can be retrieved.
|
||||
</p>
|
||||
<p>
|
||||
The system has benefits for all departments within an organisation.
|
||||
Examples include the accounts department for things such as aged trial balance,
the distribution department for ASNs, warehousing for stock figures, IS for security reporting etc.
|
||||
</p>
|
||||
<p>
|
||||
Clients have at this point (June 2012) created over 300 spreadsheets which in turn have generated over
|
||||
500,000 E-mails. iEXL has a menu driven email system.
|
||||
</p>
|
||||
<p>
|
||||
Thanks to the Apache POI project, IDD have been able to create the iEXL product.
This is a well-priced product which gives companies of all sizes access to a tool that opens up their reporting capabilities.
|
||||
</p>
|
||||
<p>
|
||||
Within the <a href="http://www.iexlsoftware.com/">iEXLSOFTWARE.COM</a> website you will find a full user manual,
|
||||
installation instructions, a call log (Ticket) system and a downloadable 45 day trial version.
|
||||
</p>
|
||||
<p>
|
||||
<em>Author: Mark.D.Golden</em>
|
||||
</p>
|
||||
</section>
|
||||
<section>
|
||||
<title>Ugly Duckling</title>
|
||||
<p>
|
||||
<a href="http://uglyduckling.nl/">Ugly Duckling</a> focus on Software, Management and Finance.
|
||||
We have recently been using Apache POI to create tools for the mortgage group of
|
||||
<a href="https://www.abnamro.nl/en/personal/index.html">ABN AMRO</a> in the Netherlands.
|
||||
During this project we created a number of what we call 'Robots' using the HSSF API.
|
||||
</p>
|
||||
<p>
|
||||
These <a href="http://uglyduckling.nl/work/robots/">robots</a> run as services on the network and
|
||||
help automate the processing of large amounts of data. Our Robots can be used to spot problems that
|
||||
a human might not, and also to automate repetitive tasks.
|
||||
</p>
|
||||
<p>
|
||||
We found Apache POI to be extremely useful. We took the base API, wrapped it in a builder pattern and
thus created a DSL with a fluent interface. Throughout the project we very much enjoyed working with
Apache POI and found it to be very reliable.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Deutsche Bahn</title>
|
||||
|
||||
<p>Deutsche Bahn uses POI's HWPF component to process complex specification documents stored in the legacy Microsoft Word file format.</p>
|
||||
<p>
|
||||
In a joint effort with other international partners, <a href="http://fahrweg.dbnetze.com/fahrweg-en/start/company_aboutus/">Deutsche Bahn Netz AG</a>,
|
||||
the owner of the German rail infrastructure, developed a novel software toolchain to facilitate the creation of an interoperable on-board component
|
||||
for a pan-European train protection system. One part of this toolchain is a domain-specific specification processor which reads the relevant
|
||||
requirements documents using Apache POI, enhances them and ultimately stores their contents as <a href="http://www.omg.org/spec/ReqIF/">ReqIF</a>.
|
||||
Contrary to DOC, this XML-based file format allows for proper traceability and versioning in a multi-tenant environment. Thus, it lends itself much
|
||||
better to the management and interchange of large sets of system requirements. The resulting ReqIF files are then consumed by the various tools in
|
||||
the later stages of the software development process.
|
||||
</p>
|
||||
<p>
|
||||
Currently available, off-the-shelf software for requirement import performed very poorly on the original specification documents due to their
|
||||
structural complexity and heterogeneous formatting. POI not only helped to create a superior solution thanks to its rich API. Because of its
|
||||
open-source nature it also plays a key role in ensuring the maintainability of the resulting system which is expected to stay in operation for
|
||||
many decades to come.
|
||||
</p>
|
||||
<p>
|
||||
POI has seen various enhancements for this challenging application. Most notably, these include the addition of extensive list numbering support,
|
||||
a feature which is now part of Apache Tika. Numerous smaller improvements, such as support for table cell background shadings, interpretation of
|
||||
certain kinds of OfficeDrawings, and proper conversion of special characters, also helped to derive meaning from the input files. See
|
||||
<a href="http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-182866">here</a> for details.
|
||||
</p>
|
||||
<p>
|
||||
This work was funded by the German Federal Ministry of Education and Research (Grant No. 01IS12021) in the context of the ITEA2 project
|
||||
<a href="http://openetcs.org/">openETCS</a>.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
</section>
|
||||
</body>
|
||||
<footer>
|
||||
<legal>
|
||||
Copyright (c) @year@ The Apache Software Foundation All rights reserved.
|
||||
<br />
|
||||
Apache POI, POI, Apache, the Apache feather logo, and the Apache
|
||||
POI project logo are trademarks of The Apache Software Foundation.
|
||||
</legal>
|
||||
</footer>
|
||||
</document>
|
||||
777
src/documentation/content/xdocs/changes.xml
Normal file
@ -0,0 +1,777 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?><!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE changes PUBLIC "-//APACHE//DTD Changes POI//EN" "changes-poi.dtd">
|
||||
|
||||
<changes>
|
||||
<contexts>
|
||||
<context id="OOXML" title="OOXML"/>
|
||||
<context id="OPC" title="OPC"/>
|
||||
<context id="POI_Overall" title="POI Overall"/>
|
||||
<context id="HSSF" title="Horrible SpreadSheet Format"/>
|
||||
<context id="XSSF" title="ooXml SpreadSheet Format"/>
|
||||
<context id="SXSSF" title="Streaming ooXml SpreadSheet Format"/>
|
||||
<context id="SS_Common" title="SpreadSheet Common"/>
|
||||
<context id="HSLF" title="Horrible SlideShow Format"/>
|
||||
<context id="XSLF" title="ooXml SlideShow Format"/>
|
||||
<context id="SL_Common" title="SlideShow Common"/>
|
||||
<context id="HWPF" title="Horrible WordProcessor Format"/>
|
||||
<context id="XWPF" title="ooXml WordProcessor Format"/>
|
||||
<context id="HDF" title="Horrible Document Format"/>
|
||||
<context id="HPSF" title="Horrible PropertySet Format"/>
|
||||
<context id="HDGF" title="Horrible Dreadful Graph Format"/>
|
||||
<context id="XDGF" title="ooXml Dreadful Graph Format"/>
|
||||
<context id="DDF" title="Dreadful Drawing Format"/>
|
||||
<context id="XDDF" title="ooXml Dreadful Drawing Format"/>
|
||||
<context id="HMEF" title="Horrible Mail Encoder Format"/>
|
||||
<context id="HSMF" title="Horrible Senseless Format"/>
|
||||
<context id="HPBF" title="Horrible Peep Book Format"/>
|
||||
<context id="HWMF" title="Horrible Wannabe Metafile Format"/>
|
||||
<context id="HEMF" title="Horrible Ermahgerd Metafile Format"/>
|
||||
<context id="POIFS" title="Poor Obfuscation Implementation FileSystem"/>
|
||||
</contexts>
|
||||
|
||||
<!--
|
||||
ACTION ATTRIBUTES:
|
||||
|
||||
type: fix, add, remove, update, unknown
|
||||
|
||||
fixes-bug: a comma-separated list of bugzilla bugs or github-##
|
||||
|
||||
breaks-compatibility: used whenever an intentional (or unintentional?) backwards compatibility break
was introduced without having a deprecation warning for at least 2 final releases.
|
||||
Use a value of "true" to indicate a breakage. Otherwise, omit this attribute.
|
||||
|
||||
context: a space-separated list of modules related to the change. Use POI, OOXML, OPC, etc to refer
|
||||
to changes to core POI code rather than listing all of the modules, or SS Common and SL Common
|
||||
when referring to both H??F and X??F formats.
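
Example (a sketch only; the bug ids and description below are hypothetical placeholders, not real entries):

<action type="fix" fixes-bug="12345,github-67" context="XSSF SXSSF">Short description of the change</action>

Here fixes-bug combines one bugzilla id and one github id separated by a comma, and context lists two modules separated by a space.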
|
||||
-->
|
||||
|
||||
<section id="previous_releases">
|
||||
<title>Previous releases</title>
|
||||
<p>The change log for <a href="site:changes3x">POI 3.x</a> and
|
||||
<a href="site:changespre3x">older releases</a>
|
||||
can be found in the history section.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<release version="5.4.2" date="2025-??-??">
|
||||
<summary>
|
||||
<summary-item>Upgrade batik dependency to 1.19</summary-item>
|
||||
<summary-item>Upgrade bouncycastle dependency to 1.80</summary-item>
|
||||
<summary-item>Upgrade commons-collections4 dependency to 4.5.0</summary-item>
|
||||
<summary-item>Upgrade commons-io dependency to 2.19.0</summary-item>
|
||||
<summary-item>Upgrade pdfbox dependency to 3.0.5</summary-item>
|
||||
<summary-item>Upgrade xmlsec dependency to 3.0.6</summary-item>
|
||||
<summary-item>Upgrade JaCoCo code-coverage tooling to 0.8.13</summary-item>
|
||||
</summary>
|
||||
<actions>
|
||||
<action type="fix" fixes-bug="69646" context="SXSSF">SXSSF: check for null _fd instance in dispose call</action>
|
||||
<action type="fix" fixes-bug="69667" context="HSSF">Handle slightly broken WriteAccessRecord gracefully</action>
|
||||
<action type="fix" fixes-bug="69669" context="HSLF">Fix issue where Slide addTitle corrupts the ppt file</action>
|
||||
<action type="add" fixes-bug="github-803" context="SS_Common">Add support for SHEET function</action>
|
||||
</actions>
|
||||
</release>
|
||||
|
||||
<release version="5.4.1" date="2025-04-06">
|
||||
<summary>
|
||||
<summary-item>Note: JDK 24 will change behavior of locale providers, some formatting might be different when upgrading</summary-item>
|
||||
<summary-item>Upgrade commons-codec dependency to 1.18.0</summary-item>
|
||||
<summary-item>Upgrade bouncycastle dependency to 1.80</summary-item>
|
||||
<summary-item>Upgrade pdfbox dependency to 3.0.4</summary-item>
|
||||
<summary-item>Upgrade graphics2d dependency to 3.0.3</summary-item>
|
||||
</summary>
|
||||
<actions>
|
||||
<action type="fix" fixes-bug="69618" context="OOXML">ZipPackage save should check that intermediate steps succeed</action>
|
||||
<action type="add" fixes-bug="github-775" context="OOXML">Allow some OPC compliance checks to be tuned</action>
|
||||
<action type="fix" fixes-bug="66260" context="XWPF">Add getNumberOfTexts() method</action>
|
||||
<action type="fix" fixes-bug="68094" context="SXSSF">Allow to use SXSSFSheet.setArbitraryExtraWidth() to define an adjustment-factor when auto-sizing columns</action>
|
||||
<action type="fix" fixes-bug="57603" context="HWPF">Fix reading/writing of documents with many columns</action>
|
||||
<action type="fix" fixes-bug="65190" context="SS_Common">Handle decimal format '0#' the same way as Excel</action>
|
||||
<action type="fix" fixes-bug="66425" context="POI_Overall">Multiple fixes found by fuzzing Apache POI via oss-fuzz</action>
|
||||
<action type="add" fixes-bug="66260" context="XWPF">Add getNumberOfTexts method to XWPFRun</action>
|
||||
<action type="fix" fixes-bug="69315" context="HSMF">Continue processing properties after multivalued properties</action>
|
||||
<action type="fix" fixes-bug="69529" context="XSSF">Streamed reading: Log failures to format formulas and numbers instead of stopping processing</action>
|
||||
<action type="fix" fixes-bug="69536" context="SXSSF">Fix arbitrary extra width support</action>
|
||||
<action type="fix" fixes-bug="69555" context="SXSSF">Handle extra issue where FontSystem is missing</action>
|
||||
<action type="fix" fixes-bug="69583" context="SS_Common">Cell copy support does not handle Time only values properly</action>
|
||||
<action type="fix" fixes-bug="69681" context="SS_Common">Issue with date/time formats that leave a space before the AM/PM part</action>
|
||||
<action type="add" fixes-bug="69714" context="POI_Overall">Allow custom TempFileCreationStrategy per thread</action>
|
||||
</actions>
|
||||
</release>
|
||||
|
||||
<release version="5.4.0" date="2025-01-08">
|
||||
<summary>
|
||||
<summary-item>Add support for SOURCE_DATE_EPOCH to allow creating reproducible binary files without creation/modification-timestamp being set</summary-item>
|
||||
<summary-item>Breaking change: Some invalid content in the compressed file-formats for xlsx/docx/pptx/... now fails parsing to prevent handling malicious input incorrectly</summary-item>
|
||||
<summary-item>Upgrade ant dependency to 1.10.15</summary-item>
|
||||
<summary-item>Upgrade batik dependency to 1.18</summary-item>
|
||||
<summary-item>Upgrade commons-codec dependency to 1.17.1</summary-item>
|
||||
<summary-item>Upgrade commons-compress dependency to 1.27.1</summary-item>
|
||||
<summary-item>Upgrade commons-io dependency to 2.18.0</summary-item>
|
||||
<summary-item>Upgrade log4j-api dependency to 2.24.3 and add log4j-bom dependency</summary-item>
|
||||
<summary-item>Upgrade pdfbox dependency to 3.0.3</summary-item>
|
||||
<summary-item>Upgrade xmlbeans dependency to 5.3.0</summary-item>
|
||||
<summary-item>Upgrade xmlsec dependency to 3.0.5</summary-item>
|
||||
<summary-item>Upgrade JaCoCo code-coverage tooling to 0.8.12</summary-item>
|
||||
</summary>
|
||||
<actions>
|
||||
<action type="fix" fixes-bug="github-653" context="HSSF">Adjust HSSFWorkbook.getSheet() to return the first case-insensitive match, similar to XSSF</action>
|
||||
<action type="fix" fixes-bug="github-655" context="XWPF">Fix searching text in paragraphs when text is spread across multiple runs</action>
|
||||
<action type="fix" fixes-bug="github-657" context="SXSSF">Support setting an arbitrary extra width value for column widths - not working - fixed in 69536 (5.4.1)</action>
|
||||
<action type="fix" fixes-bug="github-670" context="XWPF">XWPFRun.getText should support delInstrText and noBreakHyphen</action>
|
||||
<action type="fix" fixes-bug="github-672" context="XWPF">Support removing XWPF Styles</action>
|
||||
<action type="fix" fixes-bug="github-673" context="OOXML">Add word10.xsd to poi-ooxml-full</action>
|
||||
<action type="fix" fixes-bug="github-733" context="SS_Common">Fix issue with param order in MIRR function evaluation</action>
|
||||
<action type="fix" fixes-bug="66590" context="POIFS">Number of blocks used by the property table missing from the file header</action>
|
||||
<action type="fix" fixes-bug="69154" context="XSSF">Shifting columns with merged regions generates an error about overlapping regions</action>
|
||||
<action type="fix" fixes-bug="69209" context="SS_Common">default ignoreMissingFontSystem to true</action>
|
||||
<action type="fix" fixes-bug="69323" context="POI_Overall">DefaultTempFileCreationStrategy should worry about OS deleting the temp dir</action>
|
||||
<action type="fix" fixes-bug="69411" context="XSSF">add XSSFReader.getSheetIterator</action>
|
||||
<action type="fix" fixes-bug="69418" context="SS_Common">Issue when evaluating WORKDAY function that has a cell ref as 2nd param</action>
|
||||
<action type="fix" fixes-bug="69620" context="OOXML">Throw exception if xlsx/docx/pptx contains duplicate file names</action>
|
||||
</actions>
|
||||
</release>
|
||||
|
||||
<release version="5.3.0" date="2024-07-02">
|
||||
<summary>
|
||||
<summary-item>Upgrade log4j-api dependency to 2.23.1</summary-item>
|
||||
<summary-item>Upgrade commons-codec dependency to 1.17.0</summary-item>
|
||||
<summary-item>Upgrade commons-compress dependency to 1.26.2</summary-item>
|
||||
<summary-item>Upgrade commons-io dependency to 2.16.1</summary-item>
|
||||
<summary-item>Upgrade pdfbox dependency to 3.0.2 and graphics2d dependency to 3.0.2</summary-item>
|
||||
<summary-item>Upgrade xmlsec dependency to 3.0.4</summary-item>
|
||||
<summary-item>Upgrade bouncycastle dependency to 1.79</summary-item>
|
||||
<summary-item>Upgrade xmlbeans dependency to 5.2.1</summary-item>
|
||||
</summary>
|
||||
<actions>
|
||||
<action type="fix" fixes-bug="63189" context="OOXML">Add support for hyperlink based relationships which are stored separately from other relationships</action>
|
||||
<action type="fix" fixes-bug="68237" context="SXSSF">Some boolean attribute values are written as true instead of 1</action>
|
||||
<action type="fix" fixes-bug="68703" context="XSLF">IllegalArgumentException: Unexpected color choice CTFontCollectionImpl when reading font color for a table cell</action>
|
||||
<action type="fix" fixes-bug="68778" context="SXSSF">Fix issue in SXSSF when there are missing fonts</action>
|
||||
<action type="fix" fixes-bug="68183" context="SXSSF">SXSSFWorkbook now removes temp files when closed - removing need for a separate dispose call</action>
|
||||
<action type="add" fixes-bug="68987" context="OOXML">Support allowStoredEntriesWithDataDescriptor=true when reading zip data</action>
|
||||
<action type="add" fixes-bug="69147" context="OOXML">Fix regression in date handling when evaluating TEXT function</action>
|
||||
<action type="fix" fixes-bug="github-578" context="SXSSF">Rework exception handling for missing fonts to make it more robust</action>
|
||||
<action type="fix" fixes-bug="github-601" context="XDGF">handle elliptical arcs that have colinear points</action>
|
||||
<action type="add" fixes-bug="github-604" context="XDGF">Support for polylines</action>
|
||||
<action type="add" fixes-bug="github-607" context="XWPF">Support SVGs in XWPF</action>
|
||||
</actions>
|
||||
</release>
|
||||
|
||||
<release version="5.2.5" date="2023-11-25">
|
||||
<summary>
|
||||
<summary-item>Upgrade commons-io dependency to 2.15.0</summary-item>
|
||||
<summary-item>Upgrade commons-compress dependency to 1.25.0</summary-item>
|
||||
<summary-item>Upgrade log4j-api dependency to 2.21.1</summary-item>
|
||||
<summary-item>Upgrade xmlsec dependency to 3.0.3</summary-item>
|
||||
<summary-item>Upgrade bouncycastle dependency to 1.77</summary-item>
|
||||
<summary-item>Upgrade xmlbeans dependency to 5.2.0</summary-item>
|
||||
</summary>
|
||||
<actions>
|
||||
<action type="fix" fixes-bug="67475" context="SS_Common">Better support for edge cases in TEXT function</action>
|
||||
<action type="fix" fixes-bug="67510" context="XDDF">Fix issue where chart axes were defaulting to have blank number formats - which recent versions of Excel treat as corrupted.</action>
|
||||
<action type="add" fixes-bug="67735" context="XWPF">Add Complex scripts support in XWPFRun</action>
|
||||
<action type="fix" fixes-bug="67579" context="OOXML">POI 5.2.4 had a regression where it did not close user-provided InputStreams. In POI 5.2.5, user-provided InputStreams are again closed. There are new constructors that allow you to control whether the streams are closed.</action>
|
||||
<action type="fix" fixes-bug="67785" context="XSSF">XSSFExcelExtractor does not format formula results like the streaming based extractor</action>
|
||||
<action type="fix" fixes-bug="68094" context="XSSF">Improve cell width logic to avoid rounding issues</action>
|
||||
<action type="fix" fixes-bug="github-505" context="SL_Common">DrawTextFragment height should include leading space</action>
|
||||
</actions>
|
||||
</release>
|
||||
|
||||
<release version="5.2.4" date="2023-09-28">
|
||||
<summary>
|
||||
<summary-item>Discontinued the binary packages to reduce maintenance overhead, please rebuild the sources locally or use Maven Central for binary files</summary-item>
|
||||
<summary-item>Upgrade log4j-api dependency to 2.20.0</summary-item>
|
||||
<summary-item>Upgrade xmlsec dependency to 3.0.2</summary-item>
|
||||
<summary-item>Upgrade batik dependency to 1.17</summary-item>
|
||||
<summary-item>Upgrade pdfbox dependency to 2.0.29, graphics2d to 0.43</summary-item>
|
||||
<summary-item>Upgrade commons-codec dependency to 1.16.0</summary-item>
|
||||
<summary-item>Upgrade commons-compress dependency to 1.24.0</summary-item>
|
||||
<summary-item>Upgrade commons-io dependency to 2.13.0</summary-item>
|
||||
<summary-item>Upgrade curvesapi dependency to 1.08</summary-item>
|
||||
<summary-item>Upgrade SparseBitSet dependency to 1.3</summary-item>
|
||||
<summary-item>Use jdk18on versions of bouncycastle jars (v1.76)</summary-item>
|
||||
</summary>
|
||||
<actions>
|
||||
<action type="fix" fixes-bug="66598" context="XSSF">Fix invalid loop-condition when cleaning up CTCells</action>
|
||||
<action type="fix" fixes-bug="47950" context="POI_Overall">make stream/directory name lookup in OLE2 case insensitive</action>
|
||||
<action type="fix" fixes-bug="66521" context="POI_Overall">Provide a utility to clear all thread-locals to avoid reports of memory-leaks in web-application containers</action>
|
||||
<action type="fix" fixes-bug="66436" context="POI_Overall">Fix handling padding when decrypting data</action>
|
||||
<action type="fix" fixes-bug="54373" context="XSSF">Include alpha/transparency value when creating an XSSFColor from an AWT Color object</action>
|
||||
<action type="fix" fixes-bug="62272" context="XSSF">Include alpha/transparency value when setting a color-value for a font</action>
|
||||
<action type="fix" fixes-bug="65260" context="SXSSF">Fix graceful handling of missing font-system on the operating system</action>
|
||||
<action type="fix" fixes-bug="65543" context="HSSF">Incomplete Shared String Tables were causing read failures</action>
|
||||
<action type="fix" fixes-bug="66257" context="XSSF">NullPointerException in XSSFReader$SheetIterator.next()</action>
|
||||
<action type="fix" fixes-bug="66278" context="XSLF">Multiple gradient stops at the exact same location causing a rendering failure</action>
|
||||
<action type="add" fixes-bug="66301" context="HSMF">Add a method to properly write the header necessary for a MSG attachment</action>
|
||||
<action type="fix" fixes-bug="66306" context="XSLF">Make XSLFDiagramGroupShape public</action>
|
||||
<action type="fix" fixes-bug="66312" context="XWPF">Inserting paragraph into table from cursor</action>
|
||||
<action type="add" fixes-bug="66347" context="XWPF">Add theme support to XWPF</action>
|
||||
<action type="fix" fixes-bug="66365" context="XSSF">Fix issue where cells with formulas and cached results of string type do not properly support shared strings</action>
|
||||
<action type="fix" fixes-bug="66399" context="XSLF">Text run highlight colors were ignored</action>
|
||||
<action type="fix" fixes-bug="66401" context="SS_Common">Fix parsing formulas with sheet-names which contain single quotes</action>
|
||||
<action type="fix" fixes-bug="66418" context="XSSF">Fix performance issue with XSSFSheet.groupRow</action>
|
||||
<action type="fix" fixes-bug="66433" context="SS_Common">Improve boolean functions empty cell handling</action>
|
||||
<action type="fix" fixes-bug="66473" context="SXSSF">Fix performance issue with SXSSFCell.getColumnIndex()</action>
|
||||
<action type="fix" fixes-bug="66475" context="POI_Overall">SignatureConfig: remove ThreadLocals and deprecated code associated with them</action>
|
||||
<action type="fix" fixes-bug="66514" context="POI_Overall">Remove support for zip/tgz release artifacts</action>
|
||||
<action type="fix" fixes-bug="66532" context="SXSSF">Improve performance of SheetDataWriter outputEscapedString</action>
|
||||
<action type="fix" fixes-bug="66584" context="OOXML">Ensure ZipPackage closes input stream when exceptions happen</action>
|
||||
<action type="fix" fixes-bug="66614" context="SS_Common">Issue where OFFSET function applies limits that should only apply to xls format spreadsheets</action>
|
||||
<action type="fix" fixes-bug="66644" context="POI_Overall">Make jar build reproducible</action>
|
||||
<action type="fix" fixes-bug="66661" context="XSSF">Fix issue with adding table formulas</action>
|
||||
<action type="fix" fixes-bug="66988" context="XWPF">XWPFTableCell: make setText fully replace the text and add appendText method to append</action>
|
||||
<action type="fix" fixes-bug="67005" context="XSLF">Basic support for reading audio files in pptx files</action>
|
||||
<action type="fix" fixes-bug="67396" context="OOXML">Set standalone="yes" in XML declarations when writing OOXML format files</action>
|
||||
</actions>
|
||||
</release>
|
||||
|
||||
<release version="5.2.3" date="2022-09-16">
|
||||
<summary>
|
||||
<summary-item>Upgrade graphics2d dependency to 0.40, pdfbox to 2.0.26</summary-item>
|
||||
<summary-item>Upgrade xmlsec dependency to 3.0.0</summary-item>
|
||||
<summary-item>Upgrade xmlbeans dependency to 5.1.1</summary-item>
|
||||
<summary-item>Upgrade log4j-api dependency to 2.18.0</summary-item>
|
||||
<summary-item>Speed up processing of formulas with column-ranges, e.g. VLOOKUP(A4,$D:$E,2,0)</summary-item>
|
||||
<summary-item>Speed up compilation of jar-files-only builds by avoiding direct dependency on test-execution</summary-item>
|
||||
<summary-item>Avoid some more possible overly large memory allocations on certain input documents</summary-item>
|
||||
</summary>
|
||||
<actions>
|
||||
<action type="fix" fixes-bug="51037" context="SS_Common">setDefaultColumnStyle() in XSSFSheet/SXSSFSheet was not working as expected</action>
|
||||
<action type="add" fixes-bug="55330" context="SS_Common">add PageMargin enum</action>
|
||||
<action type="add" fixes-bug="56155" context="OOXML">Support version property in CoreProperties</action>
|
||||
<action type="add" fixes-bug="58468" context="SS_Common">Support DAYS function</action>
|
||||
<action type="fix" fixes-bug="63575" context="XWPF">Support capitalized text in XWPFWordExtractor</action>
|
||||
<action type="fix" fixes-bug="63576" context="HWPF">Support capitalized text in WordExtractor</action>
|
||||
<action type="fix" fixes-bug="65562" context="SXSSF">SXSSF doesn't update dimension field</action>
|
||||
<action type="fix" fixes-bug="65473" context="XSLF">When slides were copied, the text shapes were still referencing original slide</action>
|
||||
<action type="fix" fixes-bug="65854" context="OOXML">Use revert() instead of close() when OPCPackage is opened read-only</action>
|
||||
<action type="fix" fixes-bug="65973" context="XSSF">Row shifting does not properly handle hyperlinks that span multiple cells</action>
|
||||
<action type="fix" fixes-bug="65988" context="SS_Common">RATE function fails in some cases</action>
|
||||
<action type="fix" fixes-bug="65993" context="XSSF">change XSSFHyperlink code that copies HSSFWorkbook to respect cell ranges</action>
|
||||
<action type="fix" fixes-bug="66022" context="SS_Common">Fix issue with parsing formulas that have sheet names containing certain chars</action>
|
||||
<action type="fix" fixes-bug="66047" context="SS_Common">Fix rounding issue in MROUND function</action>
|
||||
<action type="fix" fixes-bug="66079" context="XWPF">Fix bug where XWPFNumbering.removeAbstractNum removes by list index, not abstractNumId</action>
|
||||
<action type="fix" fixes-bug="github-321" context="SS_Common">DataFormatter issue with rounding in some use cases</action>
|
||||
<action type="add" fixes-bug="github-330" context="SS_Common">Support AVERAGEIF function</action>
|
||||
<action type="fix" fixes-bug="66052" context="SS_Common">XSSFColor could not be used at the same time as org.apache.poi.ss.util classes</action>
|
||||
<action type="add" fixes-bug="66083" context="SS_Common">Support CEILING.MATH and FLOOR.MATH functions</action>
|
||||
<action type="fix" fixes-bug="66087" context="SS_Common">support case insensitive matching in D* functions</action>
|
||||
<action type="add" fixes-bug="66090" context="SS_Common">add support for DCOUNT, DCOUNTA, DAVERAGE, DSTDEV, DSTDEVP, DVAR, DVARP and DPRODUCT functions</action>
|
||||
<action type="add" fixes-bug="66092" context="SS_Common">Add STDEVP, STDEVA, STDEVPA, VARA and VARPA functions</action>
|
||||
<action type="add" fixes-bug="66093" context="SS_Common">add support for unimplemented subfunctions to SUBTOTAL function</action>
|
||||
<action type="add" fixes-bug="66094" context="SS_Common">add support for STDEV.S, STDEV.P, VAR.S and VAR.P functions</action>
|
||||
<action type="add" fixes-bug="66095" context="SS_Common">add support for POISSON.DIST function</action>
|
||||
<action type="add" fixes-bug="66097" context="SS_Common">Support CEILING.PRECISE and FLOOR.PRECISE functions</action>
|
||||
<action type="add" fixes-bug="66098" context="SS_Common">D* functions should support wildcard matches</action>
|
||||
<action type="add" fixes-bug="66105" context="SS_Common">Support Excel CORREL, COVAR, PEARSON and FORECAST functions</action>
|
||||
<action type="fix" fixes-bug="66115" context="HSSF">Some Password protected XLS files are not read</action>
|
||||
<action type="add" fixes-bug="66123" context="XSSF">Support the gte attribute with XSSFConditionalFormattingThreshold</action>
|
||||
<action type="add" fixes-bug="66145" context="OOXML">generate poi-ooxml-full classes for dml-drawing xsd</action>
|
||||
<action type="add" fixes-bug="66146" context="OOXML">generate poi-ooxml-full classes for threaded comment and word12 xsds</action>
|
||||
<action type="fix" fixes-bug="66173" context="SS_Common">add Sheet createSplitPane(int xSplitPos, int ySplitPos, int leftmostColumn, int topRow, PaneType activePane) to eventually replace the existing createSplitPane method (that has a bug in XSSFSheet)</action>
|
||||
<action type="fix" fixes-bug="github-360" context="HSSF">HSSFExtendedColor was not setting RGB colors properly</action>
|
||||
<action type="add" fixes-bug="66176" context="XSLF">Integrate SmartArt diagrams from PowerPoint presentations</action>
|
||||
<action type="fix" fixes-bug="66181" context="SS_Common">POI's implementation of VALUE function did not properly handle empty string input</action>
|
||||
<action type="fix" fixes-bug="66187" context="XWPF">Calling getTextHighlightColor() or getEmphasisMark() on XWPFRun can lead to corruption of file</action>
|
||||
<action type="fix" fixes-bug="66211" context="XSSF">XSSFTable.updateHeaders did not work for Worksheets created using current Excel versions</action>
|
||||
<action type="fix" fixes-bug="66212" context="XSSF">XSSFSheet.removeTable did not remove the links to the table part reference from the sheet</action>
|
||||
<action type="fix" fixes-bug="66213" context="XSSF">XSSFWorkbook.cloneSheet does not clone XSSFTables linked from the sheet</action>
|
||||
<action type="fix" fixes-bug="66215" context="XSSF">Shifting rows or columns can damage formulas in tables</action>
|
||||
<action type="fix" fixes-bug="66216" context="XSSF">XSSFPivotTable.getPivotCacheDefinition() does not work properly when XSSFPivotTable was read from an existing *.xlsx file</action>
|
||||
<action type="fix" fixes-bug="66230" context="SXSSF">SXSSFWorkbook should work even when fonts not installed on OS</action>
|
||||
<action type="fix" fixes-bug="66242" context="XSLF">Issue with orphaned (in package) images and notes post slide removal</action>
|
||||
</actions>
|
||||
</release>
|
||||
|
||||
<release version="5.2.2" date="2022-03-19">
|
||||
<summary>
|
||||
<summary-item>Upgrade log4j-api dependency to 2.17.2 and graphics2d dependency to 0.35 as well as some test dependencies</summary-item>
|
||||
</summary>
|
||||
<actions>
|
||||
<action type="fix" fixes-bug="65915" context="SS_Common">Fix issue where Boolean functions (AND, OR) do not work properly in array context</action>
|
||||
<action type="add" fixes-bug="65934" context="XSLF">add removeTextParagraph to text box API</action>
|
||||
<action type="add" fixes-bug="65935" context="XSLF">add removeTextRun to paragraph API</action>
|
||||
<action type="fix" fixes-bug="65939" context="XSSF">Fix stackoverflow issue when removing formulas with circular references</action>
|
||||
<action type="add" fixes-bug="65943" context="SXSSF">Support rich text strings in SXSSFWorkbook (only when shared string table is used)</action>
|
||||
<action type="fix" fixes-bug="65946" context="OOXML">POIXMLPropertiesTextExtractor returns duplicate key for Core properties</action>
|
||||
<action type="fix" fixes-bug="65950" context="POI_Overall">POI 5.2.1 can allocate byte arrays that are too big</action>
|
||||
</actions>
|
||||
</release>
|
||||
|
||||
<release version="5.2.1" date="2022-03-03">
|
||||
<summary>
|
||||
<summary-item>Upgrade curvesapi dependency to 1.07</summary-item>
|
||||
</summary>
|
||||
<actions>
|
||||
<action type="fix" fixes-bug="65887" context="POI_Overall">IOUtils.toByteArray did not fully take into account value set by IOUtils.setByteArrayMaxOverride</action>
|
||||
<action type="fix" fixes-bug="60541" context="SS_Common">Collapsing a column group was incorrectly implemented</action>
|
||||
<action type="fix" fixes-bug="62857" context="SS_Common">DOLLAR function is not properly implemented</action>
|
||||
<action type="fix" fixes-bug="65792" context="SS_Common">Multiplication in cell formulas can have small rounding issues</action>
|
||||
<action type="fix" fixes-bug="65839" context="SS_Common">Picture resize can lead to infinite loop</action>
|
||||
<action type="add" fixes-bug="65846" context="SS_Common">Add support for NUMBERVALUE function</action>
|
||||
<action type="add" fixes-bug="65850" context="SS_Common">Add support for Normal Distribution functions</action>
|
||||
<action type="add" fixes-bug="65870" context="SS_Common">Add support for BESSELJ function</action>
|
||||
<action type="add" fixes-bug="65871" context="SS_Common">Add support for DOLLARDE and DOLLARFR functions</action>
|
||||
<action type="add" fixes-bug="65879" context="SS_Common">Add support for WORKDAY.INTL function</action>
|
||||
<action type="fix" fixes-bug="65899" context="HMEF">Fix issue where malformed TNEF file can cause memory issues</action>
|
||||
<action type="fix" fixes-bug="65908" context="OPC">XAdES-XL modifications due to specification check errors</action>
|
||||
</actions>
|
||||
</release>
|
||||
|
||||
<release version="5.2.0" date="2022-01-14">
|
||||
<summary>
|
||||
<summary-item>Refactor XSSFReader, SharedStringsTable, CommentsTable and ThemesTable to make them more extensible</summary-item>
|
||||
<summary-item>Upgrade log4j-api dependency to 2.17.1</summary-item>
|
||||
<summary-item>Upgrade BouncyCastle dependency to 1.70</summary-item>
|
||||
<summary-item>Upgrade PDFBox Graphics2d dependency to 0.34 and PDFBox dependency to 2.0.25</summary-item>
|
||||
</summary>
|
||||
<actions>
|
||||
<action type="add" fixes-bug="65668" context="OOXML">upgrade to xmlsec 2.3.0 - make secure validation configurable</action>
|
||||
<action type="add" fixes-bug="65672" context="OOXML">Digital Signature - set commitment type and purpose</action>
|
||||
<action type="fix" fixes-bug="65676" context="XSSF">Issue in XSSFReader where string builder is not always cleared between cell reads</action>
|
||||
<action type="add" fixes-bug="65694" context="HSLF">handle date/time fields and formats</action>
|
||||
<action type="fix" fixes-bug="github-281" context="SS_Common">Cell Conditional Formatting: Change regex to account for decimals with no leading digit</action>
|
||||
<action type="fix" fixes-bug="github-273" context="SS_Common">Log warning when long sheet names are trimmed</action>
|
||||
<action type="add" fixes-bug="github-243" context="SS_Common">Add support for XLOOKUP and XMATCH functions</action>
|
||||
<action type="add" fixes-bug="github-290" context="POI_Overall">Customize Spliterator implementations for better parallelism</action>
|
||||
<action type="fix" fixes-bug="63211" context="SS_Common">DataFormatter incorrectly formats data formats with escaped percent character</action>
|
||||
<action type="fix" fixes-bug="64732" context="XSSF">XSSFSheet.createTable generates corrupted file when a header's cell contains a line break</action>
|
||||
<action type="fix" fixes-bug="65701" context="OOXML">Password Protecting a document when Saxon is on classpath can corrupt the output</action>
|
||||
<action type="add" fixes-bug="65703" context="SS_Common">DataFormatter: add setUse4DigitYearsInAllDateFormats(boolean) method with default of false</action>
|
||||
<action type="add" fixes-bug="65730" context="SS_Common">DataFormatter: add setUseCachedValuesForFormulaCells(boolean) method with default of false</action>
|
||||
<action type="fix" fixes-bug="65715" context="OOXML">Fix issue in XSSFSheet getDrawingPatriarch</action>
|
||||
<action type="fix" fixes-bug="65738" context="OOXML">Fix issue with excessive logging of invalid parts in OOXML files</action>
|
||||
<action type="fix" fixes-bug="65766" context="SS_Common">Cell copy does not respect rich text</action>
|
||||
<action type="fix" fixes-bug="65772" context="POI_Overall">stop using file deleteOnExit in DefaultTempFileCreationStrategy</action>
|
||||
</actions>
|
||||
</release>
|
||||
|
||||
<release version="5.1.0" date="2021-11-01">
|
||||
<summary>
|
||||
<summary-item>XDDF - bug fixes</summary-item>
|
||||
<summary-item>Upgrade Batik dependency to 1.14</summary-item>
|
||||
<summary-item>Upgrade BouncyCastle dependency to 1.69 (including adding dependency on bcutil jar)</summary-item>
|
||||
<summary-item>Upgrade Commons-Compress dependency to 1.21</summary-item>
|
||||
<summary-item>Upgrade XMLSec dependency to 2.2.3</summary-item>
|
||||
<summary-item>Upgrade PDFBox Graphics2d dependency to 0.33 (and test with PDFBox 2.0.24)</summary-item>
|
||||
<summary-item>Add commons-io 2.11.0 as a dependency</summary-item>
|
||||
<summary-item>Upgrade XMLBeans to 5.0.2</summary-item>
|
||||
<summary-item>Internal logging in POI now uses Apache Log4J 2</summary-item>
|
||||
<summary-item>Small refactor to XSSFReader to make it more extensible - should not affect most users unless they subclass XSSFReader</summary-item>
|
||||
<summary-item>By default, no DTDs will be accepted in XML files. This can be relaxed by setting POIXMLTypeLoader.DEFAULT_XML_OPTIONS.setDisallowDocTypeDeclaration(false).</summary-item>
|
||||
</summary>
|
||||
<actions>
|
||||
<action type="fix" fixes-bug="github-221" context="XSLF">XSLFTable - revert addRow to behaviour before 4.1.2</action>
|
||||
<action type="fix" fixes-bug="65016" context="XDDF">Don't throw exception on empty data source</action>
|
||||
<action type="fix" fixes-bug="64950" context="XDDF">Set hole size for doughnut chart</action>
|
||||
<action type="fix" fixes-bug="63901" context="XSSF">XSSFDrawing - import chart from other drawing</action>
|
||||
<action type="fix" fixes-bug="63902" context="XSSF">XSSFWorkbook - reference cloned sheet in cloned chart data</action>
|
||||
<action type="fix" fixes-bug="54470" context="XSSF">XSSFWorkbook - clone sheet with chart</action>
|
||||
<action type="fix" fixes-bug="57835" context="XSLF">XSLFSlide - import slide notes when importing slide content</action>
|
||||
<action type="add" fixes-bug="github-228" context="XDDF">Manipulate individual data point properties</action>
|
||||
<action type="add" fixes-bug="65192" context="HSSF">Allow change of EncryptionMode</action>
|
||||
<action type="add" fixes-bug="65206" context="POI_Overall">Migrate ant / maven to gradle build</action>
|
||||
<action type="fix" fixes-bug="65228" context="XSLF">the method getCap() does not work correctly in xslf.usermodel.XSLFTextRun</action>
|
||||
<action type="fix" fixes-bug="65214" context="OOXML">Document signed by POI reported as 'partially' signed</action>
|
||||
<action type="fix" fixes-bug="65085" context="HSLF">LineRect shall throw more specific exceptions</action>
|
||||
<action type="fix" fixes-bug="64844" context="SL_Common">Incorrect sizes of images in SVG</action>
|
||||
<action type="add" fixes-bug="65304" context="POI_Overall">Add commons-io as a dependency</action>
|
||||
<action type="fix" fixes-bug="64473" context="OOXML">Handle issue where OOXML file has metadata and metadata.xml</action>
|
||||
<action type="add" fixes-bug="60924" context="SS_Common">Support IFS and SWITCH functions</action>
|
||||
<action type="add" fixes-bug="64633" context="SS_Common">Support TEXTJOIN function</action>
|
||||
<action type="fix" fixes-bug="65230" context="SS_Common">TRIM function should trim extra spaces between words</action>
|
||||
<action type="fix" fixes-bug="65464" context="XSSF">Fix issue with removing parent formula when shared formulas are used</action>
|
||||
<action type="add" fixes-bug="65467" context="SS_Common">Support IFNA function</action>
|
||||
<action type="fix" fixes-bug="65471" context="XSSF">Add support for T literal in DateTime formats</action>
|
||||
<action type="fix" fixes-bug="65475" context="SS_Common">SUMIF and SUMIFS functions do not properly handle #N/A values</action>
|
||||
<action type="fix" fixes-bug="github-242" context="SS_Common">add support for MAXIFS, MINIFS, AVERAGEIFS functions</action>
|
||||
<action type="fix" fixes-bug="65501" context="XSLF">Use viewbox when rendering SVG images</action>
|
||||
<action type="add" fixes-bug="65581" context="OOXML">add optional support in ZipArchiveFakeEntry to use a temp file</action>
|
||||
<action type="fix" fixes-bug="65595" context="SS_Common">Strip color formatting in headers and footers</action>
|
||||
<action type="fix" fixes-bug="65606" context="SS_Common">Fix issues with WEEKNUM function evaluation</action>
|
||||
<action type="fix" fixes-bug="65612" context="XSLF">XSLF CustomGeometry - replace XmlStreamReader access with XmlBeans delegate</action>
|
||||
<action type="fix" fixes-bug="49202" context="SS_Common">Support PERCENTRANK and related functions</action>
|
||||
<action type="fix" fixes-bug="64258" context="SS_Common">Support TDIST and related functions</action>
|
||||
<action type="fix" fixes-bug="65490" context="XSSF">Better support for shared hyperlinks</action>
|
||||
<action type="fix" fixes-bug="65042" context="OPC">Add support to ZipPackage to allow temp files to be used to save memory (useful for writing xlsx/pptx/docx files with pictures, etc.).</action>
|
||||
<action type="fix" fixes-bug="65372" context="OPC">Allow ZipSecureFile.setMaxEntrySize to accept sizes above 4GB</action>
|
||||
<action type="fix" fixes-bug="65331" context="XWPF">Fix issue in XWPFTable.setTableAlignment(TableRowAlign tra)</action>
|
||||
<action type="fix" fixes-bug="65623" context="OPC">Create XAdES-T signature with XAdESXLSignatureFacet</action>
|
||||
<action type="fix" fixes-bug="62040" context="SS_Common">QUOTIENT function does not support cell references</action>
|
||||
<action type="fix" fixes-bug="64542" context="OPC">Allow creation of POIFSFileSystem instances from FileChannels but with an optional flag to prevent POI from closing the channel</action>
|
||||
<action type="fix" fixes-bug="65452" context="SS_Common">WorkbookFactory.create(File, ...) should throw exception if the input file is not in a supported format</action>
|
||||
<action type="fix" fixes-bug="65551" context="XSLF">Incorrect fetching of paragraph and text run properties from master shape</action>
|
||||
<action type="fix" fixes-bug="65634" context="XSLF">SlideShowFactory.create(File, ...) should throw exception if the input file is not in a supported format</action>
|
||||
<action type="fix" fixes-bug="65648" context="SXSSF">Remove finalizer on SXSSF SheetDataWriter</action>
|
||||
<action type="fix" fixes-bug="65650" context="POI_Overall">Use image/x-pict as mime type for pict format pictures (previous versions used a mix of image/pict and image/x-pict)</action>
|
||||
<action type="fix" fixes-bug="65653" context="HSLF">HSLF FillType for texture and background color fills ignored</action>
|
||||
</actions>
|
||||
</release>
|
||||
|
||||
<release version="5.0.0" date="2021-01-20">
|
||||
<summary>
|
||||
<summary-item>Upgrade to ECMA-376 5th edition (transitional) schemas - expect API breaks when using XmlBeans directly<br/>
|
||||
some smaller changes are necessary when code uses the low-level CT... classes</summary-item>
|
||||
<summary-item>Change artifact names of poi-/ooxml-schemas to poi-ooxml-lite/full</summary-item>
|
||||
<summary-item>ooxml-security is part of poi-ooxml-full (known as ooxml-schemas) now and won't be provided separately</summary-item>
|
||||
<summary-item>updated dependencies to XMLSec 2.2.1, Bouncycastle 1.68, Commons-Codec 1.15, Commons-Compress 1.20</summary-item>
|
||||
<summary-item>XWPF - improvements in table and paragraph</summary-item>
|
||||
<summary-item>XSLF - improvements for paragraph</summary-item>
|
||||
<summary-item>provide JigSaw modules - some classes moved between packages for the JDK 9+ support, e.g.
|
||||
ExtractorFactory, so imports need to be adjusted</summary-item>
|
||||
<summary-item>removed dependencies to jaxb</summary-item>
|
||||
<summary-item>removed deprecated code</summary-item>
|
||||
<summary-item>new experimental DeferredSXSSFWorkbook which creates fewer temp files by lazily generating rows (see DeferredGeneration in poi-examples)</summary-item>
|
||||
</summary>
|
||||
<actions>
|
||||
<action type="fix" fixes-bug="64494" context="XSSF">Ensure "applyAlignment" in cell-styles is enabled when necessary</action>
|
||||
<action type="fix" fixes-bug="64450" context="OOXML">Allow parsing a file where the relationship-id is an empty string</action>
|
||||
<action type="fix" fixes-bug="64750" context="XSSF">Do not use CTDataValidations.getCount(), instead only rely on getDataValidationArray</action>
|
||||
<action type="fix" fixes-bug="64986" context="SS_Common">Support missing or blank match_type for function Match</action>
|
||||
<action type="fix" fixes-bug="64838" context="XWPF">Do not populate cells with a paragraph when loading an existing document</action>
|
||||
<action type="fix" fixes-bug="65009" context="HSLF">Use correct index for 1-based pictures</action>
|
||||
<action type="fix" fixes-bug="64460" context="XSSF">Fix invalid moving of merged regions</action>
|
||||
<action type="fix" fixes-bug="64791" context="HSSF">Use proper position for the WriteAccessRecord</action>
|
||||
<action type="fix" fixes-bug="64238" context="SS_Common">Make LOOKUP functions deal with empty last arg correctly</action>
|
||||
<action type="fix" fixes-bug="64322" context="POIFS">Improve performance of reading OLE2 files</action>
|
||||
<action type="add" fixes-bug="64393" context="SS_Common">Handle MissingArgEval in relational operators</action>
|
||||
<action type="add" fixes-bug="64420" context="XSSF">Avoid NullPointerException in XSSFReader.SheetIterator.next() if files contain macros</action>
|
||||
<action type="add" fixes-bug="github-177" context="SS_Common">Avoid NullPointerException if RangeCopier encounters empty/missing rows</action>
|
||||
<action type="add" fixes-bug="63294" context="SS_Common">Add some more methods to allow using CellType everywhere</action>
|
||||
<action type="fix" context="XSSF">Fix regression introduced via Bug 60845: There are more items in CTBorder that need to be handled in equals()</action>
|
||||
<action type="fix" fixes-bug="63845" context="XWPF">Adjust handling of formula-cells to fix regression with missing re-calculation introduced in 4.1.0</action>
|
||||
<action type="fix" fixes-bug="55966" context="XWPF">Include content control text in word extraction also if it is part of a paragraph</action>
|
||||
<action type="fix" fixes-bug="64244" context="XSSF">Take the replacement of RichText strings into account when computing length of strings</action>
|
||||
<action type="add" context="SS_Common">SS method to check if a Named Range is hidden or not</action>
|
||||
<action type="add" fixes-bug="github-167" context="HSMF">HSMF enhancements - NamedIdChunk, MultiValueChunks, ByteChunkDeferred</action>
|
||||
<action type="fix" context="SS_Common">Fix incorrect handling of format which should not produce any digit for zero</action>
|
||||
<action type="fix" fixes-bug="58896,52834" context="SS_Common">Speed up auto-sizing of columns when the sheet contains merged regions</action>
|
||||
<action type="fix" fixes-bug="64186" context="OPC">Decrease usage of ThreadLocals in XML Signature API</action>
|
||||
<action type="fix" fixes-bug="64213" context="SS_Common">Picture.resize(double scale) scales width wrong for small pictures and when dx1 is set</action>
|
||||
<action type="fix" fixes-bug="63712" context="OPC">upgrading xmlsec causes junit tests to fail</action>
|
||||
<action type="fix" fixes-bug="64241" context="XSLF">XSLF - Wrong scheme colors used when rendering</action>
|
||||
<action type="fix" fixes-bug="63624" context="XWPF">Method setText in XWPFTableCell updates the xml and also updates the runs and iruns</action>
|
||||
<action type="fix" fixes-bug="github-170" context="XWPF">XWPFTableCell does not process bodyElements when handling paragraphs</action>
|
||||
<action type="fix" fixes-bug="github-171" context="XWPF">XWPFNumbering.addAbstractNum will definitely throw an exception</action>
|
||||
<action type="fix" fixes-bug="64301" context="OPC">Allow try-with-resources with OPCPackage.revert()</action>
|
||||
<action type="fix" fixes-bug="63745" context="HSSF">Add traversing and debugging interface to HSSF</action>
|
||||
<action type="fix" fixes-bug="64350" context="POI_Overall">Sonar fix - "Iterator.next()" methods should throw "NoSuchElementException"</action>
|
||||
<action type="fix" fixes-bug="57843" context="HWPF">RuntimeException on extracting text from Word 97-2004 Document</action>
|
||||
<action type="fix" fixes-bug="55505" context="HSSF">CountryRecord not found</action>
|
||||
<action type="fix" fixes-bug="64387" context="POIFS">Big POIFS streams result in OOM</action>
|
||||
<action type="add" fixes-bug="64411" context="POI_Overall" breaks-compatibility="true">Provide JigSaw modules</action>
|
||||
<action type="fix" fixes-bug="64441" context="SS_Common">Synchronize code that initialises WorkbookFactory</action>
|
||||
<action type="add" fixes-bug="63819" context="SS_Common">Support DateValue function</action>
|
||||
<action type="add" fixes-bug="github-179" context="SS_Common">Add an option for RangeCopier.copyRange() to also clone styles</action>
|
||||
<action type="fix" fixes-bug="63290" context="XSLF">Retrieve default run properties from paragraph</action>
|
||||
<action type="add" fixes-bug="64512" context="POIFS">Ole10Native aka embedded / object packager - handle UTF16 variants</action>
|
||||
<action type="fix" fixes-bug="64561" context="XWPF">XWPFSDTContent.getText() is empty for nested SDT elements</action>
|
||||
<action type="fix" fixes-bug="64595" context="SXSSF">Missing quoting of pre-evaluated string values in formula cells causes corrupt files</action>
|
||||
<action type="fix" fixes-bug="64693" context="HEMF">POI HwmfGraphics cannot read the embedded document title</action>
|
||||
<action type="fix" fixes-bug="64716" context="HWMF">WMF font typeface charset encoding error</action>
|
||||
<action type="fix" fixes-bug="64773" context="POI_Overall">Visual signatures for .xlsx/.docx</action>
|
||||
<action type="fix" fixes-bug="64817" context="POIFS">Fix issue in testXLSXinPPT</action>
|
||||
<action type="fix" fixes-bug="github-193" context="SS_Common">Change TRUNC implementation to use MathX</action>
|
||||
<action type="add" fixes-bug="64867" context="SL_Common">Provide PDF rendering with PPTX2PNG</action>
|
||||
<action type="fix" fixes-bug="64964" context="SS_Common">Converting cell values to boolean should throw IllegalStateException instead of RuntimeException when conversion is not possible</action>
|
||||
<action type="fix" fixes-bug="64971" context="XSSF">XSSFFont setCharset(FontCharset) should use latest class instead of deprecated one</action>
|
||||
<action type="fix" fixes-bug="60397" context="XSSF">Improve performance of cell merge</action>
|
||||
<action type="fix" fixes-bug="github-206" context="SXSSF">Improve performance of SXSSF cell evaluation</action>
|
||||
<action type="fix" fixes-bug="64976" context="SS_Common">Change some methods to return ints instead of shorts (Font and CellStyle)</action>
|
||||
<action type="fix" fixes-bug="56205" context="OOXML" breaks-compatibility="true">Upgrade OOXML schema to 3rd edition (transitional)</action>
|
||||
<action type="fix" fixes-bug="64979" context="OOXML">Change artifact names of poi-/ooxml-schemas</action>
|
||||
<action type="fix" fixes-bug="64981" context="OOXML" breaks-compatibility="true">Upgrade OOXML schema to 5th edition (transitional)</action>
|
||||
<action type="fix" fixes-bug="64876" context="XSLF">Unable to convert pptx to pdf</action>
|
||||
<action type="fix" fixes-bug="65026" context="POI_Overall">Migrate tests to Junit 5</action>
|
||||
<action type="add" fixes-bug="github-207" context="POI_Overall">Use SLF4J instead of commons-logging - use jcl-over-slf4j</action>
|
||||
<action type="fix" fixes-bug="65061" context="XSSF">Handle VmlDrawings containing spreadsheet-ml default namespace</action>
|
||||
<action type="fix" fixes-bug="65063" context="HSLF">WMF parsing failed on closed empty polygon</action>
|
||||
<action type="fix" fixes-bug="github-198" context="POI_Overall">Remove jdk.charset module dependency for spreadsheets generation</action>
|
||||
<action type="fix" fixes-bug="github-196" context="OOXML">Delete unused certificate exceptions</action>
|
||||
<action type="fix" fixes-bug="github-191" context="SS_Common">Fix RuntimeException on array formula referencing blank cell</action>
|
||||
<action type="fix" fixes-bug="github-189" context="SS_Common">Move date parsing logic to DateParser</action>
|
||||
<action type="fix" fixes-bug="github-187" context="XSSF">Add length validation for Excel DataValidations that are list literals</action>
|
||||
<action type="fix" fixes-bug="github-184" context="SXSSF">New EmittingSXSSFWorkbook</action>
|
||||
<action type="fix" fixes-bug="github-176" context="XSSF">Remove limit on number of rules in XSSFSheetConditionalFormatting</action>
|
||||
<action type="fix" fixes-bug="github-177" context="HSSF">Avoid NullPointerException if RangeCopier encounters empty/missing rows</action>
|
||||
</actions>
|
||||
</release>
|
||||
|
||||
<release version="4.1.2" date="2020-02-17">
|
||||
<summary>
|
||||
<summary-item>Removed a lot of internal uses of StringBuffers</summary-item>
|
||||
<summary-item>XDDF - some work on better chart support</summary-item>
|
||||
<summary-item>Common SL / EMF - ongoing rendering fixes</summary-item>
|
||||
<summary-item>XSLF - OOM fixes when parsing arbitrary shape ids + a new dependency to SparseBitSet 1.2</summary-item>
|
||||
<summary-item>updated dependencies to Bouncycastle 1.64</summary-item>
|
||||
</summary>
|
||||
<actions>
|
||||
<action type="fix" fixes-bug="64015" context="POI_Overall">Swap zaxxer.com:SparseBitSet for java.util.BitSet</action>
|
||||
<action type="fix" fixes-bug="63788" context="XWPF">When removing AbstractNum match by abstractNumId, not list index</action>
|
||||
<action type="fix" fixes-bug="63940" context="POI_Overall">Avoid endless loop/out of memory on string-replace with empty search string</action>
|
||||
<action type="fix" fixes-bug="63700" context="POI_Overall">Make D* functions work with numeric result column</action>
|
||||
<action type="fix" fixes-bug="63960" context="SXSSF">Write pre-evaluated string-values in formula cells with the correct type</action>
|
||||
<action type="fix" fixes-bug="63984" context="POI_Overall">Function AND / OR should treat missing parameters as FALSE</action>
|
||||
<action type="fix" fixes-bug="63749" context="POI_Overall">Make getFirstRowNum() and getFirstCellNum() return -1 consistently with empty data</action>
|
||||
<action type="fix" fixes-bug="63569" context="POI_Overall">Make IOUtils.setByteArrayMaxOverride() work correctly</action>
|
||||
<action type="add" context="XSLF">Add, insert and remove columns on XSLFTable</action>
|
||||
<action type="fix" fixes-bug="63842" context="POI_Overall">Fix issue with fractions where the whole number part is too large to store as an int</action>
|
||||
<action type="fix" fixes-bug="63889" context="XDDF">Produce valid PPTX file with several chart series</action>
|
||||
<action type="fix" fixes-bug="63918" context="SL_Common XSLF">Fix texture fill - scale stretched images correctly</action>
|
||||
<action type="add" context="XDDF">Add Doughnut chart data series support</action>
|
||||
<action type="fix" fixes-bug="63955" context="HMEF">HMEFContentsExtractor fails to extract content from winmail.dat</action>
|
||||
<action type="fix" fixes-bug="63927" context="POI_Overall">Inconsistent mapping of Norwegian locales for date formats</action>
|
||||
<action type="fix" fixes-bug="github-163" context="XSSF">Add set level numbering on XWPFParagraph</action>
|
||||
<action type="fix" fixes-bug="github-164" context="XSSF">Fix Bug in XSSFTable.setCellReferences when table is single cell</action>
|
||||
<action type="fix" fixes-bug="64004" context="POI_Overall">Replace Cloneable / clone() with copy constructor</action>
|
||||
<action type="fix" fixes-bug="64036" context="POI_Overall">Replace reflection calls in factories for Java 9+</action>
|
||||
<action type="fix" fixes-bug="64044" context="POI_Overall">Fix issue with setCellValue(LocalDate) not supporting nulls properly</action>
|
||||
<action type="fix" fixes-bug="64088" context="SL_Common XSLF">SlideShow rendering fixes</action>
|
||||
<action type="fix" fixes-bug="64098" context="XWPF">XWPFRun: Whitespace in text not preserved if starting with tab character.</action>
|
||||
<action type="fix" fixes-bug="64108" context="POI_Overall">unsafe pipe character ("|") in Relationship target attribute is not being encoded into a '%7C'.</action>
|
||||
<action type="fix" fixes-bug="github-166" context="XDDF">Expose invert if negative on bar charts</action>
|
||||
<action type="fix" fixes-bug="63998" context="HSSF">Support commas, exclamation marks correctly in AreaReference</action>
|
||||
<action type="fix" fixes-bug="64045" context="XSSF">XSSFWorkbook constructor doesn't close ZipFile if an exception occurs</action>
|
||||
<action type="fix" fixes-bug="64130" context="HSSF">Regression in OldSheetRecord</action>
|
||||
</actions>
|
||||
</release>
|
||||
|
||||
<release version="4.1.1" date="2019-10-20">
|
||||
<summary>
|
||||
<summary-item>XSSF: Memory improvements which use much less memory while writing large xlsx files</summary-item>
|
||||
<summary-item>XDDF: Improved chart support: more types and some API changes around angles and width units</summary-item>
|
||||
<summary-item>updated dependencies to Bouncycastle 1.62, Commons-Codec 1.13, Commons-Collections4 4.4, Commons-Compress 1.19</summary-item>
|
||||
<summary-item>XWPF: Additional API methods</summary-item>
|
||||
<summary-item>XSSF: Fixes to XSSFSheet.addMergedRegion() and XSSFRow.shiftRows()</summary-item>
|
||||
<summary-item>EMF/HSLF: Rendering fixes</summary-item>
|
||||
<summary-item>CVE-2019-12415 - XML External Entity (XXE) Processing in Apache POI</summary-item>
|
||||
</summary>
|
||||
<actions>
|
||||
<action type="add" fixes-bug="63774" context="POI_Overall">Cache pids to speed up custom properties "add" method</action>
|
||||
<action type="add" fixes-bug="63779" context="SS_Common">Add support for the new Java date/time API added in Java 8</action>
|
||||
<action type="fix" fixes-bug="59322" context="HWPF">Avoid NullPointerException when reading Word Document with tables and a cell with a null descriptor</action>
|
||||
<action type="fix" fixes-bug="61490" context="HWPF">Read cells of tables correctly in cases where the last cell is not 'fake'</action>
|
||||
<action type="fix" context="HWPF">Do not use WeakReference for parents in Ranges to avoid spurious failures in tests</action>
|
||||
<action type="fix" fixes-bug="63657" context="XSSF">Fix regression with memory usage in XSSFRow.onDocumentWrite and some other temporary memory leaks</action>
|
||||
<action type="fix" fixes-bug="63842" context="SS_Common">FractionFormat casts whole part of the value into 'int'</action>
|
||||
<action type="fix" fixes-bug="63818" context="HSLF">Allow multiple charsets for same font typeface</action>
|
||||
<action type="fix" fixes-bug="63768" context="XSSF">XSSFExportToXml adjust settings on SchemaFactory</action>
|
||||
<action type="fix" fixes-bug="63541" context="XSLF">NullPointerException from XSLFSimpleShape.getAnchor for empty xfrm tags</action>
|
||||
<action type="add" fixes-bug="63745" context="POI_Overall">Add traversing and debugging interface</action>
|
||||
<action type="fix" fixes-bug="57423,62711" context="XSSF">Fix regression when XSSFRow.shiftRows() is used</action>
|
||||
<action type="fix" fixes-bug="63580" context="SL_Common HSLF XSLF">Fix texture paint handling</action>
|
||||
<action type="fix" fixes-bug="59004" context="HSLF">HSLF rendering - adjust values for presetShapeDefinition differs in HSLF/XSLF</action>
|
||||
<action type="fix" context="HSLF">Don't fallback to master shape properties, if master shape is not assigned</action>
|
||||
<action type="add" context="POI_Overall">Add a ThreadLocalUtil.clearAllThreadLocals which can be used to clear thread-locals</action>
|
||||
<action type="fix" fixes-bug="63371" context="XSSF">XSSFSheet.addMergedRegion should adjust count of merged cells</action>
|
||||
<action type="fix" fixes-bug="63073" context="XSSF">Return value of XSSFSheet.addMergedRegion is off by one</action>
|
||||
<action type="fix" fixes-bug="54803" context="OPC">Error opening XLSX after saving with a Drawing using POI</action>
|
||||
<action type="add" fixes-bug="github-135" context="XDDF">Support to create new chart without reading template</action>
|
||||
<action type="add" fixes-bug="github-143" context="HPSF">MAPIType.isFixedLength: not true in case of length > 8</action>
|
||||
<action type="add" fixes-bug="github-144" context="XDDF">Support for seven new chart types</action>
|
||||
<action type="add" fixes-bug="github-149" context="HSMF">improve MAPIMessage.getHtmlBody</action>
|
||||
<action type="add" fixes-bug="github-150" context="XWPF">Add XWPFPicture getWidth and getDepth methods</action>
|
||||
<action type="add" fixes-bug="github-151" context="XWPF">Add XWPFRun getStyle method</action>
|
||||
<action type="add" fixes-bug="github-152" context="XWPF">Add XWPFParagraph setKeepNext method</action>
|
||||
<action type="add" fixes-bug="github-153" context="XWPF">Add XWPFParagraph createHyperlinkRun method</action>
|
||||
<action type="add" fixes-bug="github-154" context="SXSSF">Improved support for writing large files</action>
|
||||
<action type="add" fixes-bug="github-157" context="OOXML">Add setters to POIXMLProperties</action>
|
||||
<action type="fix" fixes-bug="63153" context="XDDF">Enable safe removal of data series from charts</action>
|
||||
<action type="fix" fixes-bug="59623" context="XDDF">Provide example of threshold line in bar chart</action>
|
||||
</actions>
|
||||
</release>
|
||||
|
||||
<release version="4.1.0" date="2019-04-09">
|
||||
<summary>
|
||||
<summary-item>Improved support/fixes for Java 9+ and IBM JVM</summary-item>
|
||||
<summary-item>New EMF renderer and support of SVG images in XSLF</summary-item>
|
||||
<summary-item>Security, stability and memory/resource handling improvements</summary-item>
|
||||
<summary-item>Various bug fixes across function and conditional format rule evaluation</summary-item>
|
||||
<summary-item>Upgrade to XMLBeans 3.1.0</summary-item>
|
||||
<summary-item>Upgrade to Bouncycastle 1.61</summary-item>
|
||||
<summary-item>Upgrade to Curvesapi 1.06</summary-item>
|
||||
<summary-item>Upgrade to Commons-Codec 1.12</summary-item>
|
||||
<summary-item>Upgrade to Commons-Collections4 4.3</summary-item>
|
||||
<summary-item>Upgrade to XMLSec 2.1.2</summary-item>
|
||||
</summary>
|
||||
<actions>
|
||||
<action type="fix" fixes-bug="63200" context="XSLF">Avoid a possible NullPointerException in XSLFShape.selectPaint()</action>
|
||||
<action type="add" fixes-bug="60724" context="SS_Common">Implement 'ignore hidden rows' variations for existing implemented variants</action>
|
||||
<action type="fix" fixes-bug="63264" context="SS_Common">Conditional Format rule evaluation calculates relative references incorrectly</action>
|
||||
<action type="fix" fixes-bug="61652" context="SS_Common">Fix NPE in EDATE function when date evaluates to an invalid value</action>
|
||||
<action type="fix" fixes-bug="62151" context="POIFS">Work around illegal reflective access in Java 9+ when freeing buffers</action>
|
||||
<action type="add" fixes-bug="63029" context="OPC">OPCPackage Potentially clobbers files on close()</action>
|
||||
<action type="add" fixes-bug="62980" context="SS_Common XSSF HSSF">Make D* functions ignore case in headings</action>
|
||||
<action type="fix" fixes-bug="60977" context="XSSF">Adding custom properties creates invalid .xlsx file on second write</action>
|
||||
<action type="fix" fixes-bug="60460" context="SL_Common">Null pointer exception in ExternSheetNameResolver.prependSheetName method</action>
|
||||
<action type="fix" fixes-bug="60845" context="XSSF">Fix copying styles/conditional formatting</action>
|
||||
<action type="add" fixes-bug="63054" context="SS_Common XSSF HSSF">Improved evaluation of array formulas with errors in arguments</action>
|
||||
<action type="fix" fixes-bug="63047" context="POI_Overall">Make POILogger subclassable</action>
|
||||
<action type="add" fixes-bug="62904" context="SS_Common XSSF HSSF">Support array arguments in IF and logical IS*** functions</action>
|
||||
<action type="add" fixes-bug="63028" context="SL_Common XSLF HSLF">Provide font embedding for slideshows</action>
|
||||
<action type="fix" fixes-bug="61532" context="SXSSF">Fix setting values/types during formula evaluation for SXSSF</action>
|
||||
<action type="fix" fixes-bug="62629" context="OPC">Allow to handle files with invalid content types for pictures</action>
|
||||
<action type="fix" fixes-bug="62839" context="SL_Common">Fix MathX.floor for negative n</action>
|
||||
<action type="fix" fixes-bug="62884" context="SL_Common">Sheetnum is not checked in InternalWorkbook.setSheetHidden()</action>
|
||||
<action type="fix" fixes-bug="62886" context="OPC">Regression extracting text from corrupted docx files</action>
|
||||
<action type="add" fixes-bug="63017" context="SL_Common XSLF">Remove rows from a XSLFTable</action>
|
||||
<action type="add" fixes-bug="60656" context="SL_Common XSLF HSLF">EMF image support in slideshows</action>
|
||||
<action type="add" fixes-bug="62365" context="XSLF">SVG image support in XSLF</action>
|
||||
<action type="add" fixes-bug="github-136" context="XSSF">Support GEOMEAN function</action>
|
||||
<action type="fix" fixes-bug="63011" context="OPC">Multiple digital signature in excel file broke first signature</action>
|
||||
<action type="fix" fixes-bug="62999" context="SL_Common">IBM JDK JIT causes AIOOBE in TexturePaintContext</action>
|
||||
<action type="fix" fixes-bug="62994" context="POI_Overall">IBM JCE workarounds</action>
|
||||
<action type="fix" fixes-bug="62966" context="SL_Common">init presetShapeDefinitions.xml fail under IBM jdk</action>
|
||||
<action type="fix" fixes-bug="62953" context="SL_Common XSLF HSLF">Rendering of FreeformShapes with formula fails</action>
|
||||
<action type="fix" fixes-bug="63005" context="POI_Overall">Remove support for reading files that have XML entity definitions</action>
|
||||
<action type="fix" fixes-bug="63013" context="XWPF">add XWPFRun setLang method</action>
|
||||
<action type="fix" fixes-bug="63240" context="XSSF">Remove unnecessary synchronization on DocumentHelper.newDocumentBuilder and SAXHelper.newXMLReader</action>
|
||||
<action type="fix" fixes-bug="61652" context="SS_Common">Fix NPE in EDATE function when date evaluates to an invalid value</action>
|
||||
<action type="fix" fixes-bug="63264" context="SS_Common">Conditional Format rule evaluation calculates relative references incorrectly</action>
|
||||
<action type="add" fixes-bug="60724" context="SS_Common">Implement 'ignore hidden rows' variations for existing SUBTOTAL function variants</action>
|
||||
<action type="fix" fixes-bug="63268" context="SS_Common">Fix issue with CellUtil.setFont adding unnecessary styles</action>
|
||||
<action type="fix" fixes-bug="61700" context="SS_Common">getForceFormulaRecalculation() returns wrong value</action>
|
||||
<action type="fix" fixes-bug="63292" context="SS_Common">DataFormatter.formatCellValue() ignores use1904Windowing w/4-part date formats</action>
|
||||
</actions>
|
||||
</release>
|
||||
|
||||
<release version="4.0.1" date="2018-12-03">
|
||||
<summary>
|
||||
<summary-item>Fixes pom.xml entries for commons-math3 (missing), curvesapi and commons-codec</summary-item>
|
||||
<summary-item>Improvements for XDDF charts and text manipulation</summary-item>
|
||||
<summary-item>Upgrade to XMLBeans 3.0.2</summary-item>
|
||||
</summary>
|
||||
<actions>
|
||||
<action type="fix" fixes-bug="59773" context="POI_Overall">Move loop invariants outside of loop for faster execution</action>
|
||||
<action type="fix" fixes-bug="59834" context="POI_Overall">poi-ooxml pom.xml should include dependency on poi-scratchpad</action>
|
||||
<action type="fix" fixes-bug="62690" context="POI_Overall">Missing Maven dependency to commons-math3</action>
|
||||
<action type="fix" fixes-bug="62692" context="OPC">WildFly XML parser not properly supported - Property 'http://www.oracle.com/xml/jaxp/properties/entityExpansionLimit' is not recognized</action>
|
||||
<action type="fix" fixes-bug="62699" context="POI_Overall">Download page must link to https://downloads.apache.org/poi/KEYS</action>
|
||||
<action type="fix" fixes-bug="62733" context="XSLF">XSLFBackground setFill() can corrupt the document</action>
|
||||
<action type="fix" fixes-bug="62735" context="XSSF">poi-ooxml 4.0.0 should have dependency on curvesapi 1.05</action>
|
||||
<action type="fix" fixes-bug="62740" context="XSSF">XSSFTable constructor automatically assigns invalid (non-unique) column IDs</action>
|
||||
<action type="fix" fixes-bug="62768" context="OPC">OPCPackage#close() method is incorrectly synchronized</action>
|
||||
<action type="fix" fixes-bug="62796" context="POI_Overall">Remove XML Event parser code from PackagePropertiesMarshaller</action>
|
||||
<action type="fix" fixes-bug="62800" context="XSLF">Fix null pointer exception if a picture shape has no blip id</action>
|
||||
<action type="fix" fixes-bug="62805" context="POI_Overall">Fix Old-Xerces build issues</action>
|
||||
<action type="fix" fixes-bug="62805" context="XSLF">XSLFTableCell#removeBorder(BorderEdge.right) removes the bottom edge not the right edge.</action>
|
||||
<action type="fix" fixes-bug="62811" context="POI_Overall">POI Encryption didn't work with 4.0.0 but did work with 3.17</action>
|
||||
<action type="fix" fixes-bug="62951" context="POI_Overall">FileMagic not correctly identified</action>
|
||||
<action type="fix" fixes-bug="62949" context="SL_Common">SlideShow rendering - keyframe fractions must be increasing</action>
|
||||
<action type="fix" fixes-bug="62921" context="POI_Overall">Provide OOXMLLite alternative for Java 12+</action>
|
||||
<action type="fix" fixes-bug="62625" context="POI_Overall">Handle off-spec, variant REFERENCE_NAME record structure in VBAMacroReader</action>
|
||||
<action type="fix" fixes-bug="62624" context="POI_Overall">Handle module name mapping in VBAMacroReader</action>
|
||||
<action type="fix" fixes-bug="62836" context="SS_Common">Support TREND function</action>
|
||||
<action type="fix" fixes-bug="62859" context="XWPF">Rare NPE while creating XWPFSDTContent</action>
|
||||
<action type="add" fixes-bug="62373" context="SS_Common">Support for FREQUENCY function</action>
|
||||
<action type="fix" fixes-bug="62831" context="POI_Overall">WorkbookFactory.create support for subclass of File, eg from JFileChooser</action>
|
||||
<action type="fix" fixes-bug="62815" context="XSSF">XLSB number extraction improvements</action>
|
||||
<action type="fix" fixes-bug="62373" context="SS_Common">Support FREQUENCY function</action>
|
||||
<action type="fix" fixes-bug="62742" context="POI_Overall">Add commons-compress jar to bin zip/tgz</action>
|
||||
<action type="fix" fixes-bug="62747" context="POI_Overall">Upgrade bouncycastle dependency to 1.60</action>
|
||||
<action type="fix" fixes-bug="62736" context="XWPF">Relations on XSLFPictureShape were removed unconditionally</action>
|
||||
<action type="add" fixes-bug="github-109" context="XDDF">Define XDDF user model for text body, its paragraphs and text runs</action>
|
||||
<action type="add" fixes-bug="github-123" context="XSSF">Import chart on drawing</action>
|
||||
<action type="fix" fixes-bug="62746" context="XDDF">Support axIds in XDDF</action>
|
||||
<action type="fix" fixes-bug="60509" context="XSSF">XSSFWorkbook.setSheetName() does not update references in charts</action>
|
||||
<action type="fix" fixes-bug="59625" context="XWPF">Localisation (Internationalisation in other languages) when applied in charts corrupt the MS Word file</action>
|
||||
</actions>
|
||||
</release>
|
||||
|
||||
<release version="4.0.0" date="2018-09-07">
|
||||
<summary>
|
||||
<summary-item>Removed support for Java 6 and 7 making Java 8 the minimum version supported</summary-item>
|
||||
<summary-item>A new OOXML schema (1.4) is required, because XMLBeans loading is incompatible and no longer goes through POIXMLTypeLoader</summary-item>
|
||||
</summary>
|
||||
<actions>
|
||||
<action type="remove" fixes-bug="62649" breaks-compatibility="true" context="POIFS">Remove OPOIFS*</action>
|
||||
<action type="fix" fixes-bug="61589" context="XSLF">Importing content does not copy hyperlink address</action>
|
||||
<action type="fix" fixes-bug="62587" context="XSLF">repeated call to XSLFSheet.removeShape leads to java.lang.IllegalArgumentException: partName</action>
|
||||
<action type="fix" fixes-bug="62513" context="OOXML">Don't try to parse embedded package relationships</action>
|
||||
<action type="add" fixes-bug="59268" context="OOXML">Work on providing an updated version of XMLBeans</action>
|
||||
<action type="fix" fixes-bug="62451" context="HPSF">Document last printed in the year 27321</action>
|
||||
<action type="fix" fixes-bug="60713" breaks-compatibility="true" context="SXSSF XSSF OPC">(S)XSSFWorkbook/POIXMLDocument.write(OutputStream) closes the OutputStream</action>
|
||||
<action type="add" fixes-bug="62452" context="OPC">Extract configuration while verifying XML signatures</action>
|
||||
<action type="fix" fixes-bug="62187" breaks-compatibility="true" context="OPC">Compiling with Java 10 fails with ClassCastException / use commons-compress</action>
|
||||
<action type="fix" fixes-bug="62355" breaks-compatibility="true" context="POI_Overall">Unsplit packages for Jigsaw / Java 9 compatibility</action>
|
||||
<action type="fix" fixes-bug="62041" context="SL_Common">TestFonts fails on Mac</action>
|
||||
<action type="fix" fixes-bug="62051" context="XSLF">Two shapes have the same shapeId within the same slide</action>
|
||||
<action type="fix" fixes-bug="61633" context="XSLF">Zero width shapes aren't rendered</action>
|
||||
<action type="add" fixes-bug="62037" context="SL_Common">SlideNames should not be null but have a default as if accessed by VBA</action>
|
||||
<action type="fix" fixes-bug="62381" context="SL_Common">Fix rendering of AutoShapes</action>
|
||||
<action type="fix" fixes-bug="59893" context="POI_Overall">Forbid calls to InputStream.available</action>
|
||||
<action type="fix" fixes-bug="61905" context="HSSF">HSSFWorkbook.setActiveCell() does not actually make the cell selected in Excel</action>
|
||||
<action type="fix" fixes-bug="61459" context="HSLF">HSLFShape.getShapeName() returns name of shapeType and not the shape name</action>
|
||||
<action type="add" fixes-bug="62319" breaks-compatibility="true" context="SL_Common">Decommission XSLF-/PowerPointExtractor</action>
|
||||
<action type="add" fixes-bug="62092" context="SL_Common">Text not extracted from grouped text shapes in HSLF</action>
|
||||
<action type="add" fixes-bug="62159" context="OPC">Support XML signature over windows certificate store</action>
|
||||
<action type="add" fixes-bug="57369" context="XDDF">Add support for major and minor units on chart axes</action>
|
||||
<action type="add" fixes-bug="55954" context="XWPF">Added methods to position table</action>
|
||||
<action type="add" fixes-bug="61947" context="POI_Overall">Remove deprecated classes (POI 4.0.0)</action>
|
||||
<action type="add" fixes-bug="55954" context="XWPF">Add functions to get, set, remove outer borders for tables</action>
|
||||
<action type="add" fixes-bug="github-72" context="XDDF">Define XDDF user model for shape properties to be shared between XSLF, XSSF and XWPF</action>
|
||||
<action type="add" fixes-bug="61543" breaks-compatibility="true" context="XSSF">Do not fail with "part already exists" when tables are created/removed</action>
|
||||
<action type="add" fixes-bug="61550" breaks-compatibility="true" context="POI_Overall">Add more information to exception text and verify that it is thrown</action>
|
||||
<action type="add" fixes-bug="61609" breaks-compatibility="true" context="POI_Overall">Add .gitattributes file and set lf for one sample-file</action>
|
||||
<action type="add" fixes-bug="61797" breaks-compatibility="true" context="SL_Common">Embed Excel / Ole objects into powerpoint</action>
|
||||
<action type="fix" fixes-bug="61943" context="SL_Common">narrow generics definition because of tighter java9 checks</action>
|
||||
<action type="add" fixes-bug="61942" context="OPC">Refactor PackagePartName handling and add getUnusedPartIndex method</action>
|
||||
<action type="fix" fixes-bug="61941" context="POIFS">Move Ole marker generation to Ole10Native</action>
|
||||
<action type="fix" fixes-bug="61940" context="POI_Overall">Replace ClassID statics with enum</action>
|
||||
<action type="add" fixes-bug="61939" context="OPC">Provide schema for AlternateContent - provide new ooxml-schemas-1.4.jar</action>
|
||||
<action type="fix" fixes-bug="61787" context="HSSF">Change how deleted content is detected to not incorrectly see too much text as deleted, this was introduced with bug 58067</action>
|
||||
<action type="fix" fixes-bug="61798" context="HSSF">Fix usage of getLastCellNum() when calculating worksheet dimension during saving</action>
|
||||
<action type="fix" fixes-bug="61911" context="HWPF">Avoid IndexOutOfBounds access when reading pictures</action>
|
||||
<action type="fix" fixes-bug="61765" context="HSSF">Support third party tool generated files using WorkBook as their POIFS directory name</action>
|
||||
<action type="fix" fixes-bug="61881" context="HSLF">Regression in ppt parsing: typeface can't be null or empty</action>
|
||||
<action type="add" fixes-bug="github-68" context="XDDF XSLF XSSF XWPF">Share chart data implementation between XSLFChart, XSSFChart and XWPFChart through XDDF</action>
|
||||
<action type="fix" fixes-bug="61809" context="HPSF">Infinite loop in SectionIDMap.get() and .put()</action>
|
||||
<action type="add" fixes-bug="60887" context="XSSF">Surface XSSF Header/Footer Attributes</action>
|
||||
<action type="add" fixes-bug="61730" context="SS_Common">CellRangeAddresses support iterating over their CellAddresses</action>
|
||||
<action type="fix" fixes-bug="61727" context="SS_Common">CellRangeUtil merge cell ranges broken for certain orders of arguments</action>
|
||||
<action type="fix" fixes-bug="57517" context="HSSF">Fix various situations that were handled incorrectly in HSSFOptimiser</action>
|
||||
<action type="add" fixes-bug="61671" context="XSLF">XSLFSlide does not contain isHidden and setHidden like HSLFSlide does</action>
|
||||
<action type="update" fixes-bug="61630" context="XSSF">Performance improvement to XSSFExportToXML</action>
|
||||
<action type="add" fixes-bug="58068" context="XSSF">Add a method to pass the actual Color to StylesTable.findFont()</action>
|
||||
<action type="fix" fixes-bug="61096" context="POIFS">Add support for modules in VBAMacroReader</action>
|
||||
<action type="fix" fixes-bug="61033" context="XSSF">Add XSSFWorkbook.setCellFormulaValidation() to control if formulas are validated during Cell.setCellFormula()</action>
|
||||
<action type="fix" fixes-bug="61148" context="SXSSF">Fix calculating/setting formula value</action>
|
||||
<action type="fix" fixes-bug="61064" context="SS_Common">Support behavior of function CEILING in newer versions of Microsoft Excel</action>
|
||||
<action type="fix" fixes-bug="61516" context="SS_Common">Correctly handle references that end up outside the workbook when cells with formulas are copied</action>
|
||||
<action type="add" fixes-bug="60737" context="XSSF">Add endSheet() to XSSFEventBasedExcelExtractor</action>
|
||||
<action type="fix" fixes-bug="59747" context="OPC">Exchange order of writing parts into Zip to allow some tools to handle files better</action>
|
||||
<action type="add" fixes-bug="github-69" context="SS_Common">Support matrix functions</action>
|
||||
<action type="fix" fixes-bug="60499" context="OPC">Deleting a picture that is used twice on a slide corrupts the slide</action>
|
||||
<action type="fix" fixes-bug="60279" context="POI_Overall">Back-off to brute-force search for macro content if macro offset is incorrect</action>
|
||||
<action type="add" fixes-bug="61528" context="XSSF">Pivot table enhancements</action>
|
||||
<action type="fix" fixes-bug="61906" context="XSSF">add API for working with RichStringText</action>
|
||||
<action type="fix" fixes-bug="61792" context="SS_Common">Avoid iterating over chars (use codepoints instead)</action>
|
||||
<action type="fix" fixes-bug="62254" context="SS_Common">Update OFFSET function to support optional values</action>
|
||||
<action type="update" fixes-bug="62435" context="XSSF">Rename getAllEmbedds method to getAllEmbeddedParts (getAllEmbedds is retained but deprecated)</action>
|
||||
<action type="update" fixes-bug="62438" breaks-compatibility="true" context="POI_Overall">Replace org.apache.poi.openxml4j.util.Nullable with java.util.Optional</action>
|
||||
<action type="fix" fixes-bug="github-90" context="XSSF">Change default DSIG signing algorithm to SHA256</action>
|
||||
<action type="fix" fixes-bug="github-107" context="SS_Common">Support AREAS function</action>
|
||||
<action type="fix" fixes-bug="github-110" breaks-compatibility="true" context="XWPF">Renames org.apache.poi.xwpf.usermodel.TextSegement to org.apache.poi.xwpf.usermodel.TextSegment</action>
|
||||
<action type="fix" fixes-bug="github-114" context="XWPF">Better support for Footnotes and Endnotes</action>
|
||||
</actions>
|
||||
</release>
|
||||
|
||||
</changes>
|
||||
232
src/documentation/content/xdocs/components/configuration.xml
Normal file
@ -0,0 +1,232 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Apache POI™ - Configuration</title>
|
||||
<authors>
|
||||
<person id="POI" name="POI Developers" email="dev@poi.apache.org"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section><title>Overview</title>
|
||||
<p>The best way to learn about using Apache POI is to read through the <a href="index.html">feature documentation</a>
|
||||
and other examples available online.
|
||||
</p>
|
||||
<p>To keep the features documentation focused on the APIs, there is little mention of some of the configuration
|
||||
settings that may prove useful to users who have to handle very large documents or very
|
||||
high throughput.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Configuration via Java-code when calling Apache POI</title>
|
||||
<p>These API methods let you configure the behavior of Apache POI for special needs, e.g. when processing excessively
|
||||
large files.
|
||||
</p>
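<p>To make the idea behind the ZipSecureFile.setMinInflateRatio setting in the table below more concrete, the following
is a minimal, self-contained sketch of an inflate-ratio check. It is an illustration only and not POI's implementation:
the class InflateRatioSketch and its methods are hypothetical names, and the JDK Deflater stands in for a zip entry.
</p>

```java
import java.util.zip.Deflater;

/**
 * Illustration only: a simplified, hypothetical inflate-ratio check.
 * This is NOT how Apache POI implements ZipSecureFile.
 */
final class InflateRatioSketch {

    /** Ratio of compressed to uncompressed size; a smaller value means stronger compression. */
    static double ratio(long compressedBytes, long uncompressedBytes) {
        return (double) compressedBytes / (double) uncompressedBytes;
    }

    /** True when the data compresses better than the configured minimum ratio allows. */
    static boolean looksLikeZipBomb(long compressedBytes, long uncompressedBytes, double minInflateRatio) {
        return minInflateRatio > ratio(compressedBytes, uncompressedBytes);
    }

    /** Deflates the data with the JDK Deflater and returns the compressed size in bytes. */
    static long deflatedSize(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        byte[] buffer = new byte[8192];
        long total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    public static void main(String[] args) {
        // 1 MiB of identical bytes compresses extremely well, much like a zip bomb payload.
        byte[] zeros = new byte[1024 * 1024];
        long compressed = deflatedSize(zeros);
        System.out.println("ratio = " + ratio(compressed, zeros.length));
        System.out.println("flagged at 0.01: " + looksLikeZipBomb(compressed, zeros.length, 0.01d));
    }
}
```

<p>Under these assumptions, lowering the minimum ratio makes the check more permissive, which may be necessary for
legitimate documents that contain highly repetitive data, at the cost of weaker protection against crafted input.
</p>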
|
||||
<table>
|
||||
<tr>
|
||||
<th>Configuration Setting</th>
|
||||
<th>Description</th>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td>org.apache.poi.ooxml.POIXMLTypeLoader.DEFAULT_XML_OPTIONS</td>
|
||||
<td>POI support for XSSF APIs relies heavily on <a href="https://xmlbeans.apache.org">XMLBeans</a>.
|
||||
This instance can be <a href="https://xmlbeans.apache.org/docs/5.0.0/org/apache/xmlbeans/XmlOptions.html">configured</a>.
|
||||
Take care if you change any of the configuration items.
|
||||
From POI 5.1.0 onwards, DOCTYPE declarations in the XML files embedded in xlsx/docx/pptx/etc. files are disallowed by default.
|
||||
DEFAULT_XML_OPTIONS.setDisallowDocTypeDeclaration(false) will undo this change.
|
||||
</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td><a href="https://poi.apache.org/apidocs/5.0/org/apache/poi/util/IOUtils.html#setByteArrayMaxOverride-int-">
|
||||
org.apache.poi.util.IOUtils.setByteArrayMaxOverride(int maxOverride)</a>
|
||||
</td>
|
||||
<td>If this value is greater than 0, IOUtils.safelyAllocate(long, int) will ignore the maximum record length parameter.
|
||||
This is designed to allow users to bypass the hard-coded maximum record lengths if they are willing to accept the risk of allocating memory up to the size specified.
|
||||
It also lets you impose a lower limit than the defaults on very memory-constrained systems.
|
||||
<p>
|
||||
<strong>Note</strong>: This is a per-allocation limit and does not let you limit the overall sum of allocations! Use -1 to apply the limits specified per record type.
|
||||
</p>
|
||||
</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td><a href="https://poi.apache.org/apidocs/5.0/org/apache/poi/openxml4j/util/ZipSecureFile.html#setMinInflateRatio-double-">
|
||||
org.apache.poi.openxml4j.util.ZipSecureFile.setMinInflateRatio(double ratio)</a>
|
||||
</td>
|
||||
<td>Sets the ratio between deflated (compressed) and inflated (uncompressed) bytes that is used to detect a zip bomb.
|
||||
It defaults to 1% (= 0.01d), i.e. when the compression is better than 1% for any given read package part, the parsing will fail indicating a Zip-Bomb.
|
||||
</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td><a href="https://poi.apache.org/apidocs/5.0/org/apache/poi/openxml4j/util/ZipSecureFile.html#setMaxEntrySize-long-">
|
||||
org.apache.poi.openxml4j.util.ZipSecureFile.setMaxEntrySize(long maxEntrySize)</a>
|
||||
</td>
|
||||
<td>Sets the maximum file size of a single zip entry. It defaults to 4GB, i.e. the 32-bit zip format maximum.
|
||||
This can be used to limit memory consumption and protect against security vulnerabilities when documents are provided by users.
|
||||
POI 5.1.0 removes the previous limit of 4GB on this setting.
|
||||
</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td><a href="https://poi.apache.org/apidocs/5.0/org/apache/poi/openxml4j/util/ZipSecureFile.html#setMaxTextSize-long-">
|
||||
org.apache.poi.openxml4j.util.ZipSecureFile.setMaxTextSize(long maxTextSize)</a>
|
||||
</td>
|
||||
<td>Sets the maximum number of characters of text that can be extracted from a document before an exception is thrown.
|
||||
This can be used to limit memory consumption and protect against security vulnerabilities when documents are provided by users.
|
||||
The default is approx 10 million chars. Prior to POI 5.1.0, the max allowed was approx 4 billion chars.
|
||||
</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td>org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource.setThresholdBytesForTempFiles(int thresholdBytes)
|
||||
</td>
|
||||
<td><strong>Added in POI 5.1.0.</strong>
|
||||
Number of bytes at which a zip entry is regarded as too large for holding in memory
|
||||
and the data is put in a temp file instead - defaults to -1 meaning temp files are not used
|
||||
and that zip entries with more than 2GB of data after decompressing will fail; 0 means all
|
||||
zip entries are stored in temp files. A threshold like 50000000 (approx 50MB) is recommended.
|
||||
</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td>org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource.setEncryptTempFiles(boolean encrypt)
|
||||
</td>
|
||||
<td><strong>Added in POI 5.1.0.</strong>
|
||||
Whether temp files should be encrypted (default false). Only affects temp files related to zip entries.
|
||||
</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td>org.apache.poi.openxml4j.opc.ZipPackage.setUseTempFilePackageParts(boolean tempFilePackageParts)
|
||||
</td>
|
||||
<td><strong>Added in POI 5.1.0.</strong>
|
||||
Whether to save package part data in temp files to save memory (default=false).
|
||||
</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td>org.apache.poi.openxml4j.opc.ZipPackage.setEncryptTempFilePackageParts(boolean encryptTempFiles)
|
||||
</td>
|
||||
<td><strong>Added in POI 5.1.0.</strong>
|
||||
Whether to encrypt package part temp files (default=false).
|
||||
</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td>org.apache.poi.extractor.ExtractorFactory.setThreadPrefersEventExtractors(boolean preferEventExtractors) and
|
||||
org.apache.poi.extractor.ExtractorFactory.setAllThreadsPreferEventExtractors(Boolean preferEventExtractors)
|
||||
</td>
|
||||
<td>
|
||||
When creating text extractors for documents, these let you choose a different type of extractor that parses documents
|
||||
via an event-based parser.
|
||||
</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td>Various classes: setMaxRecordLength(int length)
|
||||
</td>
|
||||
<td>
|
||||
Allows overriding the default max record length for various classes which
|
||||
parse input data. E.g. XMLSlideShow, XSSFBParser, HSLFSlideShow, HWPFDocument,
|
||||
HSSFWorkbook, EmbeddedExtractor, StringUtil, ...
|
||||
<br/>
|
||||
This may be useful if you try to process very large files which otherwise trigger
|
||||
the excessive-memory-allocation prevention in Apache POI.
|
||||
</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td>org.apache.poi.xslf.usermodel.XSLFPictureData.setMaxImageSize(int length)
|
||||
</td>
|
||||
<td>
|
||||
Allows overriding the default max image size allowed for XSLF pictures.
|
||||
</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td>org.apache.poi.xssf.usermodel.XSSFPictureData#setMaxImageSize(int length)
|
||||
</td>
|
||||
<td>
|
||||
Allows overriding the default max image size allowed for XSSF pictures.
|
||||
</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td>org.apache.poi.xwpf.usermodel.XWPFPictureData#setMaxImageSize(int length)
|
||||
</td>
|
||||
<td>
|
||||
Allows overriding the default max image size allowed for XWPF pictures.
|
||||
</td>
|
||||
</tr>
|
||||
|
||||
</table>
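<p>As a rough illustration of the ratio that <code>setMinInflateRatio</code> controls, the following is a minimal sketch using only the JDK. It is not POI's implementation: the class and method names (<code>InflateRatioSketch</code>, <code>looksLikeZipBomb</code>) are hypothetical, and the real check in POI may differ in detail.</p>

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

// Hypothetical helper, not part of Apache POI: shows the kind of
// compressed/uncompressed ratio that a zip-bomb check compares to a minimum.
class InflateRatioSketch {

    // Compresses the given bytes with the JDK's Deflater.
    static byte[] deflate(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        while (!deflater.finished()) {
            int written = deflater.deflate(buffer);
            out.write(buffer, 0, written);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Returns true when compressed/uncompressed falls below the minimum ratio,
    // i.e. when the data expands suspiciously much on decompression.
    static boolean looksLikeZipBomb(long compressedBytes, long uncompressedBytes, double minRatio) {
        if (uncompressedBytes == 0) {
            return false; // nothing was inflated, so there is nothing to flag
        }
        return ((double) compressedBytes / uncompressedBytes) < minRatio;
    }

    public static void main(String[] args) {
        byte[] zeros = new byte[1_000_000];   // extremely compressible payload
        byte[] packed = deflate(zeros);
        System.out.println("compressed bytes: " + packed.length);
        System.out.println("flagged at 0.01: "
                + looksLikeZipBomb(packed.length, zeros.length, 0.01d));
    }
}
```

<p>Lowering the ratio passed to <code>setMinInflateRatio</code> makes the check more permissive, so that legitimately well-compressed documents are accepted, at the cost of weaker protection against hostile input.</p>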
|
||||
</section>
|
||||
<section><title>Observed Java System Properties</title>
|
||||
<p>Apache POI supports some Java System Properties.
|
||||
</p>
|
||||
<table>
|
||||
<tr>
|
||||
<th>System property</th>
|
||||
<th>Description</th>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td>java.io.tmpdir</td>
|
||||
<td>
|
||||
Apache POI uses the default mechanism of the JDK for specifying the location of
|
||||
temporary files.
|
||||
</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td>org.apache.poi.hwpf.preserveBinTables and org.apache.poi.hwpf.preserveTextTable</td>
|
||||
<td>
|
||||
Allows adjusting how tables are handled when parsing Word documents via HWPF.
|
||||
</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td>org.apache.poi.ss.ignoreMissingFontSystem</td>
|
||||
<td><strong>Added in POI 5.2.3.</strong>
|
||||
Instructs Apache POI to ignore some errors due to missing fonts and thus allows you
|
||||
to perform more functionality even when no fonts are installed.
|
||||
<br/>
|
||||
Note: Some functionality will still not be possible as it cannot use default-values, e.g. rendering
|
||||
slides, drawing, ...
|
||||
</td>
|
||||
</tr>
|
||||
</table>
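<p>System properties are usually passed on the command line (for example <code>-Dorg.apache.poi.ss.ignoreMissingFontSystem=true</code>). The sketch below sets one programmatically instead; it assumes this happens before any POI class reads the property, which is not verified here, and the class name <code>PoiSystemPropertiesSketch</code> is hypothetical. It only reads <code>java.io.tmpdir</code>, because the JDK may cache that value on first use, so changing it at runtime is not reliable.</p>

```java
// Hypothetical helper, not part of Apache POI.
class PoiSystemPropertiesSketch {

    // Property name taken from the table above.
    static final String IGNORE_MISSING_FONTS = "org.apache.poi.ss.ignoreMissingFontSystem";

    // Enables the "ignore missing font system" switch for this JVM.
    static void enableIgnoreMissingFonts() {
        System.setProperty(IGNORE_MISSING_FONTS, "true");
    }

    // Reports the temp directory the JDK (and therefore POI) will use.
    static String tempDirectory() {
        return System.getProperty("java.io.tmpdir");
    }

    public static void main(String[] args) {
        enableIgnoreMissingFonts();
        System.out.println(IGNORE_MISSING_FONTS + " = " + System.getProperty(IGNORE_MISSING_FONTS));
        System.out.println("java.io.tmpdir = " + tempDirectory());
    }
}
```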
|
||||
</section>
|
||||
</body>
|
||||
|
||||
<footer>
|
||||
<legal>
|
||||
Copyright (c) @year@ The Apache Software Foundation. All rights reserved.
|
||||
<br />
|
||||
Apache POI, POI, Apache, the Apache feather logo, and the Apache
|
||||
POI project logo are trademarks of The Apache Software Foundation.
|
||||
</legal>
|
||||
</footer>
|
||||
</document>
|
||||
107
src/documentation/content/xdocs/components/diagram/index.xml
Normal file
@ -0,0 +1,107 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Apache POI™ - HDGF and XDGF - Java API To Access Microsoft Visio Format Files</title>
|
||||
<subtitle>Overview</subtitle>
|
||||
<authors>
|
||||
<person id="pd" name="POI Developers" email="dev@poi.apache.org"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section>
|
||||
<title>Overview</title>
|
||||
|
||||
<p>HDGF is the POI Project's pure Java implementation of the
|
||||
Visio binary (VSD) file format. XDGF is the POI Project's
|
||||
pure Java implementation of the Visio XML (VSDX) file format.</p>
|
||||
<!-- TODO More about XDGF here! -->
|
||||
<p>Currently, HDGF provides a low-level, read-only API for
|
||||
accessing Visio documents. It also provides a
|
||||
<a href="https://svn.apache.org/repos/asf/poi/trunk/poi-scratchpad/src/main/java/org/apache/poi/hdgf/extractor/">way</a>
|
||||
to extract the textual content from a file.
|
||||
</p>
|
||||
<p>At this time, there is no <em>usermodel</em> API or similar,
|
||||
only low level access to the streams, chunks and chunk commands.
|
||||
Users are advised to check the unit tests to see how everything
|
||||
works. They are also well advised to read the documentation
|
||||
supplied with
|
||||
<a href="https://web.archive.org/web/20071212220759/https://www.gnome.ru/projects/vsdump_en.html">vsdump</a>
|
||||
to get a feel for how Visio files are structured.</p>
|
||||
<p>To get a feel for the contents of a file, and to track down
|
||||
where data of interest is stored, HDGF comes with
|
||||
<a href="https://svn.apache.org/repos/asf/poi/trunk/poi-scratchpad/src/main/java/org/apache/poi/hdgf/dev/">VSDDumper</a>
|
||||
to print out the contents of the file. Users should also make
|
||||
use of
|
||||
<a href="https://web.archive.org/web/20071212220759/https://www.gnome.ru/projects/vsdump_en.html">vsdump</a>
|
||||
to probe the structure of files.</p>
|
||||
|
||||
<note>
|
||||
This code currently lives in the
|
||||
<a href="https://svn.apache.org/viewvc/poi/trunk/poi-scratchpad/">scratchpad area</a>
|
||||
of the POI SVN repository. To use this component, ensure
|
||||
you have the Scratchpad Jar on your classpath, or a dependency
|
||||
defined on the <em>poi-scratchpad</em> artifact - the main POI
|
||||
jar is not enough! See the
|
||||
<a href="site:components">POI Components Map</a>
|
||||
for more details.
|
||||
</note>
|
||||
|
||||
<section>
|
||||
<title>Steps required for write support</title>
|
||||
<p>Currently, HDGF is only able to read Visio files; it is
|
||||
not able to write them back out again. We believe the
|
||||
following are the steps that would need to be taken to
|
||||
implement it.</p>
|
||||
<ol>
|
||||
<li>Re-write the decompression support in LZW4HDGF as
|
||||
HDGFLZW, which will be much better documented, and also
|
||||
under the ASL. <strong>Completed October 2007</strong></li>
|
||||
<li>Add compression support to HDGFLZW.
|
||||
<strong>In progress - works for small streams but encoding
|
||||
goes wrong on larger ones</strong></li>
|
||||
<li>Have HDGF just write back the raw bytes it read in, and
|
||||
have a test to ensure the file is un-changed.</li>
|
||||
<li>Have HDGF generate the bytes to write out from the
|
||||
Stream stores, using the compressed data as appropriate,
|
||||
without re-compressing. Plus test to ensure file is
|
||||
un-changed.</li>
|
||||
<li>Have HDGF generate the bytes to write out from the
|
||||
Stream stores, re-compressing any streams that were
|
||||
decompressed. Plus test to ensure file is un-changed.</li>
|
||||
<li>Have HDGF re-generate the offsets in pointers for the
|
||||
locations of the streams. Plus test to ensure file is
|
||||
un-changed.</li>
|
||||
<li>Have HDGF re-generate the bytes for all the chunks, from
|
||||
the chunk commands. Tests to ensure the chunks are
|
||||
serialized properly, and then that the file is un-changed</li>
|
||||
<li>Alter the data of one command, but keep it the same
|
||||
length, and check that Visio can open the file when written
|
||||
out.</li>
|
||||
<li>Alter the data of one command, to a new length, and
|
||||
check that Visio can open the file when written out.</li>
|
||||
</ol>
|
||||
</section>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
@ -0,0 +1,113 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Apache POI™ - HWPF - Java API to Handle Microsoft Word Files</title>
|
||||
<subtitle>Word File Format</subtitle>
|
||||
<authors>
|
||||
<person name="S. Ryan Ackley" email="sackley@cfl.rr.com"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section><title>The Word 97 File Format in semi-plain English</title>
|
||||
|
||||
<p>The purpose of this document is to give a brief high level overview of the
|
||||
HWPF document format. This document does not go into in-depth technical
|
||||
detail and is only meant as a supplement to the Microsoft Word 97-2007
|
||||
Binary File Format specification, freely available from
|
||||
<a href="https://msdn.microsoft.com/en-us/library/cc313153%28v=office.12%29.aspx">Microsoft</a>.</p>
|
||||
<p>The OLE file format is not discussed in this document. It is assumed that
|
||||
the reader has a working knowledge of the POIFS API. </p>
|
||||
|
||||
<section><title>Word file structure</title>
|
||||
<p>A Word file is made up of the document text and data structures
|
||||
containing formatting information about the text. Of course, this is a
|
||||
very simplified illustration. There are fields and macros and other
|
||||
things that have not been considered. At this stage, HWPF is mainly
|
||||
concerned with formatted text.</p>
|
||||
</section>
|
||||
<section><title>Reading Word files</title>
|
||||
<p>The entry point for HWPF's reading of a Word file is the File Information
|
||||
Block (FIB). This structure is the entry point for the locations and size
|
||||
of a document's text and data structures. The FIB is located at the
|
||||
beginning of the main stream.</p>
|
||||
<section><title>Text</title>
|
||||
<p>The document's text is also located in the main stream. Its starting
|
||||
location is given as FIB.fcMin and its length is given in bytes by
|
||||
FIB.ccpText. These two values are not very useful in getting the text
|
||||
because of Unicode. There may be Unicode text intermingled with ASCII
|
||||
text. That brings us to the piece table.</p>
|
||||
<p>The piece table is used to divide the text into non-Unicode and Unicode
|
||||
pieces. Its offset and size are given in FIB.fcClx and FIB.lcbClx
|
||||
respectively. The piece table may contain Property Modifiers (prm).
|
||||
These are for complex (fast-saved) files and are skipped. Each text piece
|
||||
contains offsets in the main stream that contain text for that piece.
|
||||
If the piece uses 8-bit (non-Unicode) text, a flag bit is set in the stored file offset.
|
||||
Then you have to unmask the bit and divide by 2 to get the real file
|
||||
offset. </p>
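<p>The offset handling can be sketched as follows. This is an illustration only, not the HWPF code: the class name <code>PieceOffsetSketch</code> is hypothetical, and the flag value <code>0x40000000</code> (bit 30, set for 8-bit text) is taken from Microsoft's published Word binary format documentation as an assumption that should be checked against the specification.</p>

```java
// Hypothetical helper, not part of Apache POI.
class PieceOffsetSketch {

    // Assumed flag bit in the stored offset: set when the piece holds
    // 8-bit characters rather than 16-bit Unicode characters.
    static final int COMPRESSED_FLAG = 0x40000000;

    // A piece is Unicode when the flag bit is NOT set.
    static boolean isUnicode(int storedOffset) {
        return (storedOffset & COMPRESSED_FLAG) == 0;
    }

    // Returns the real byte offset of the piece's text in the main stream.
    static int realOffset(int storedOffset) {
        if (isUnicode(storedOffset)) {
            return storedOffset;                    // used as-is
        }
        int unmasked = storedOffset & ~COMPRESSED_FLAG; // clear the flag bit
        return unmasked / 2;                        // then halve it
    }

    public static void main(String[] args) {
        int stored = 0x40000800; // flag bit set, remaining value 0x800
        System.out.println("unicode: " + isUnicode(stored));
        System.out.println("real offset: " + realOffset(stored));
    }
}
```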
|
||||
</section>
|
||||
<section><title>Text Formatting</title>
|
||||
<section><title>Stylesheet</title>
|
||||
<p>All text formatting is based on styles contained in the StyleSheet.
|
||||
The StyleSheet is a data structure containing, among other things, style
|
||||
descriptions. Each style description can contain a paragraph style and
|
||||
a character style or simply a character style. Each style description
|
||||
is stored in compressed form in the file. Basically these are deltas
|
||||
from another style.</p>
|
||||
<p>Eventually, you have to chain back to the nil style which is an
|
||||
imaginary style with certain implied values.</p>
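<p>The chaining of deltas back to the nil style can be sketched as below. Everything here is illustrative: the class <code>StyleChainSketch</code>, the property names and the implied values are hypothetical, the nil style index <code>0x0FFF</code> is an assumption taken from Microsoft's format documentation, and a real implementation would also have to guard against cyclic base-style references.</p>

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model, not the HWPF StyleSheet: each style stores only the
// properties that differ from its base style.
class StyleChainSketch {

    // Assumed index of the imaginary "nil" style that ends every chain.
    static final int NIL_STYLE = 0x0FFF;

    static final class StyleDescription {
        final int baseStyle;
        final Map<String, Integer> deltas;

        StyleDescription(int baseStyle, Map<String, Integer> deltas) {
            this.baseStyle = baseStyle;
            this.deltas = deltas;
        }
    }

    // Implied values of the nil style (made up for this example).
    static Map<String, Integer> nilStyle() {
        Map<String, Integer> implied = new HashMap<>();
        implied.put("bold", 0);
        implied.put("fontSizeHalfPoints", 20);
        return implied;
    }

    // Resolves a style by first resolving its base, then applying its deltas.
    static Map<String, Integer> resolve(Map<Integer, StyleDescription> sheet, int index) {
        if (index == NIL_STYLE) {
            return nilStyle();
        }
        StyleDescription style = sheet.get(index);
        if (style == null) {
            throw new IllegalArgumentException("Unknown style index: " + index);
        }
        Map<String, Integer> resolved = resolve(sheet, style.baseStyle);
        resolved.putAll(style.deltas);
        return resolved;
    }

    // Two-entry sample sheet: style 0 is based on nil, style 1 on style 0.
    static Map<Integer, StyleDescription> sampleSheet() {
        Map<String, Integer> normal = new HashMap<>();
        normal.put("fontSizeHalfPoints", 24);
        Map<String, Integer> heading = new HashMap<>();
        heading.put("bold", 1);

        Map<Integer, StyleDescription> sheet = new HashMap<>();
        sheet.put(0, new StyleDescription(NIL_STYLE, normal));
        sheet.put(1, new StyleDescription(0, heading));
        return sheet;
    }

    public static void main(String[] args) {
        System.out.println(resolve(sampleSheet(), 1));
    }
}
```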
|
||||
</section>
|
||||
<section><title>Paragraph and Character styles</title>
|
||||
<p>Paragraph and Character formatting properties for a document's text are
|
||||
stored on file as deltas from some base style in the Stylesheet. The
|
||||
deltas are used to create a complete uncompressed style in memory.</p>
|
||||
<p>Uncompressed paragraph styles are represented by the Paragraph
|
||||
Properties (PAP) data structure. Uncompressed character styles are
|
||||
represented by the Character Properties (CHP) data structure. The styles
|
||||
for the document text are stored in compressed format in the
|
||||
corresponding Formatted Disk Pages (FKP). A compressed PAP is referred
|
||||
to as a PAPX and a compressed CHP is a CHPX. The FKP locations are
|
||||
stored in the bin table. There are separate bin tables for CHPXs and
|
||||
PAPXs. The bin tables' locations and sizes are stored in the FIB.</p>
|
||||
<p>An FKP is a 512-byte OLE page. It contains the offsets of the beginning
|
||||
and end of each paragraph/character run in the main stream and the
|
||||
compressed properties for that interval. The compressed PAPX is based on
|
||||
its base style in the StyleSheet. The compressed CHPX is based on the
|
||||
enclosing paragraph's base style in the Stylesheet.</p>
|
||||
</section>
|
||||
<section><title>Uncompressing styles and other data structures</title>
|
||||
<p>All compressed properties (CHPX, PAPX, SEPX) contain a grpprl. A grpprl
|
||||
is an array of sprms. A sprm defines a delta from some base property.
|
||||
There is a table of possible sprms in the Word 97 spec. Each sprm is a
|
||||
two-byte opcode followed by a parameter. The parameter size depends on
|
||||
the sprm. Each sprm describes an operation that should be performed on
|
||||
the base style. After every sprm in the grpprl is performed on the base
|
||||
style you will have the style for the paragraph, character run,
|
||||
section, etc.</p>
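<p>How the parameter size can depend on the sprm is sketched below. The bit layout used here (a 9-bit index, a 1-bit flag, a 3-bit property group and a 3-bit size code in the leading two-byte value) follows Microsoft's published Word binary format documentation as recalled here; treat it as an assumption to verify against the specification. The class <code>SprmSketch</code> is hypothetical and is not the HWPF implementation.</p>

```java
// Hypothetical decoder for the leading two-byte value of a sprm.
class SprmSketch {

    // Low 9 bits: index of the property within its group.
    static int ispmd(int sprm) {
        return sprm & 0x01FF;
    }

    // Bit 9: set when the sprm needs special handling.
    static boolean fSpec(int sprm) {
        return ((sprm >> 9) & 0x1) != 0;
    }

    // Bits 10-12: property group (assumed: 1 paragraph, 2 character,
    // 3 picture, 4 section, 5 table).
    static int sgc(int sprm) {
        return (sprm >> 10) & 0x7;
    }

    // Bits 13-15: size code of the parameter that follows.
    static int spra(int sprm) {
        return (sprm >> 13) & 0x7;
    }

    // Parameter size in bytes, or -1 when the size is variable and is
    // stored in the data that follows.
    static int parameterSize(int sprm) {
        switch (spra(sprm)) {
            case 0:
            case 1:
                return 1;
            case 2:
            case 4:
            case 5:
                return 2;
            case 3:
                return 4;
            case 7:
                return 3;
            default:
                return -1; // size code 6: variable length
        }
    }

    public static void main(String[] args) {
        int[] samples = {0x0835, 0x4A43, 0x2403, 0xD608};
        for (int sprm : samples) {
            System.out.println(String.format("0x%04X group=%d index=%d size=%d",
                    sprm, sgc(sprm), ispmd(sprm), parameterSize(sprm)));
        }
    }
}
```

<p>Applying a grpprl then amounts to reading each two-byte value, looking up its parameter size, and applying the operation to the base style before moving on to the next sprm.</p>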
|
||||
</section>
|
||||
</section>
|
||||
</section>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
|
||||
235
src/documentation/content/xdocs/components/document/index.xml
Normal file
@ -0,0 +1,235 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Apache POI™ - HWPF and XWPF - Java API to Handle Microsoft Word Files</title>
|
||||
<subtitle>Overview</subtitle>
|
||||
<authors>
|
||||
<person name="Nicola Ken Barozzi" email="barozzi@nicolaken.com"/>
|
||||
<person name="Andrew C. Oliver" email="acoliver@apache.org"/>
|
||||
<person name="Ryan Ackley" email="sackley@apache.org"/>
|
||||
<person name="Rainer Klute" email="klute@apache.org"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section><title>Overview</title>
|
||||
|
||||
<p>HWPF is the name of our port of the Microsoft Word 97(-2007) file format
|
||||
to pure Java. It also provides limited read-only support for the older
|
||||
Word 6 and Word 95 file formats.</p>
|
||||
|
||||
<p>The partner to HWPF for the new Word 2007 .docx format is <em>XWPF</em>.
|
||||
Whilst HWPF and XWPF provide similar features, there is not a common
|
||||
interface across the two of them at this time.</p>
|
||||
|
||||
<p>Both HWPF and XWPF could be described as "moderately functional". For some
|
||||
use cases, especially around text extraction, support is very strong. For
|
||||
others, support may be limited or incomplete, and it may be necessary to
|
||||
dig down into low-level code. Error checking may be missing in places,
|
||||
so it may be possible to accidentally generate invalid files. Enhancements
|
||||
to fix such things are generally very well received!</p>
|
||||
|
||||
<p>As detailed in the <a href="site:components">Components
|
||||
Page</a>, HWPF is contained within the poi-scratchpad-XXX.jar, while XWPF
|
||||
is in the poi-ooxml-XXX.jar. You will need to ensure you include the appropriate
|
||||
jars (and their dependencies!) in your classpath to use HWPF or XWPF.</p>
|
||||
|
||||
<p>Please note that in version 3.12, due to a bug, you might need to include
|
||||
poi-scratchpad-XXX.jar when using XWPF. This has been fixed for the next
|
||||
release as there should not be such a dependency.</p>
|
||||
|
||||
</section>
|
||||
<section>
|
||||
<title>An overview of the code</title>
|
||||
<p>
|
||||
Source in the <em>org.apache.poi.hwpf.model</em> tree is the Java representation of
|
||||
internal Word format structure. This code is "internal"; it should not
|
||||
be used by your code. Code from <em>org.apache.poi.hwpf.usermodel</em>
|
||||
package is the actual public, user-friendly (as far as possible) API for accessing document
|
||||
parts. Source code in the
|
||||
<em>org.apache.poi.hwpf.extractor</em>
|
||||
tree is a wrapper around this to facilitate easy extraction of interesting things (e.g. the text),
|
||||
and
|
||||
<em>org.apache.poi.hwpf.converter</em>
|
||||
package contains Word-to-HTML and Word-to-FO converters (the latter can be used to generate PDF
|
||||
from Word files when used with
|
||||
<a href="https://xmlgraphics.apache.org/fop/">Apache FOP</a>
|
||||
). Also there is a small file-structure-dumping utility in
|
||||
<em>org.apache.poi.hwpf.dev</em>
|
||||
package, primarily for development purposes.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
The main entry point to HWPF is HWPFDocument. Currently it has a lot of references both to
|
||||
internal interfaces (
|
||||
<em>org.apache.poi.hwpf.model</em>
|
||||
package) and public API (
|
||||
<em>org.apache.poi.hwpf.usermodel</em>
|
||||
package). It is possible that it will be split into two different interfaces (like WordFile
|
||||
and WordDocument) in later versions.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
The main entry point to XWPF is XWPFDocument. From there, you can get the
|
||||
paragraphs, pictures, tables, sections, headers etc.
|
||||
</p>
|
||||
<p>
|
||||
Currently, there are only a handful of example programs using HWPF and XWPF
|
||||
available. They can be found in svn in the examples section, under
|
||||
<a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hwpf">HWPF</a>
|
||||
and
|
||||
<a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xwpf">XWPF</a>.
|
||||
Both HWPF and XWPF have fairly high levels of unit test coverage, which
|
||||
provides examples of using the various areas of functionality of both
|
||||
modules. These can be found in svn, under
|
||||
<a href="https://svn.apache.org/repos/asf/poi/trunk/poi-scratchpad/src/test/java/org/apache/poi/hwpf">HWPF</a>
|
||||
and
|
||||
<a href="https://svn.apache.org/repos/asf/poi/trunk/poi-ooxml/src/test/java/org/apache/poi/xwpf">XWPF</a>.
|
||||
Contributions of more examples, whether inspired by the unit tests or
|
||||
not, would be most welcomed!
|
||||
</p>
|
||||
|
||||
</section>
|
||||
<section>
|
||||
<title>HWPF Notes</title>
|
||||
|
||||
<p>A .doc Word document, as handled by HWPF, can be considered a single, very long
text buffer. The HWPF API provides "pointers"
to document parts, such as sections, paragraphs and character runs. Usually the user iterates
over the sections of the main document part, the paragraphs of each section and the character
runs of each paragraph. Each such interface is a pointer to a document text subrange along with
additional properties (and they all extend the same Range parent class). There are additional
Range implementations such as Table, TableRow, TableCell, etc. Some structures, such as Bookmark
or Field, can also provide subrange pointers.
|
||||
</p>
|
||||
|
||||
<p>Changing file content usually requires a lot of synchronized changes in those structures, such as
updating property boundaries, position handlers, etc. Because of that, the HWPF API should be
considered not thread-safe. In addition, there is a "one pointer" rule for changing
content: you should not use two different Range instances at one time. More
precisely, if you change file content using some range pointer, all other range
pointers except its parents become invalid. For example, if you obtain the overall range (1),
a paragraph range (2) from the overall range and a character run range (3) from the paragraph range, and
then change the text of the paragraph, the character run range is now invalid and should not be used, but
the overall range pointer is still valid. Each time you obtain a range (pointer), a new instance is
created. This means that if you obtain two range pointers and change the document text using the first
one, the second one becomes invalid.
|
||||
</p>
|
||||
|
||||
</section>
|
||||
<section>
|
||||
<title>XWPF Patches Required!</title>
|
||||
|
||||
<p>At the moment, XWPF covers many common use cases for reading and writing
|
||||
.docx files. Whilst this is a great thing, it does mean that XWPF does
|
||||
everything that the current POI committers need it to do, and so none of
|
||||
the committers are actively adding new features.</p>
|
||||
|
||||
<p>If you come across a feature in XWPF that you need, and that isn't currently
|
||||
there, please do send in a patch to add the extra functionality! More details
|
||||
on contributing patches are available on the <a
|
||||
href="site:guidelines">"Contribution to POI" page</a>.</p>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>HWPF Patches Required!</title>
|
||||
|
||||
<p>At the moment we unfortunately do not have someone taking care of HWPF
|
||||
and fostering its development. What we need is someone to stand up, take
|
||||
this thing under their wing and push it forward. Ryan Ackley,
|
||||
who put a lot of effort into HWPF, is no longer on board, so HWPF is an
|
||||
orphan child waiting to be adopted.</p>
|
||||
|
||||
<p>If <strong>you</strong> are interested in becoming the new HWPF
|
||||
pointman, you should look into the Microsoft Word internals. A good
|
||||
starting point seems to be Ryan Ackley's <a
|
||||
href="site:docformat">overview</a>. An introduction to the binary
|
||||
file formats is <a
|
||||
href="https://msdn.microsoft.com/en-us/library/cc998577%28v=office.12%29.aspx">available
|
||||
from Microsoft</a>, which has some good references and links. After that,
|
||||
the full details on the Word format are available from
|
||||
<a href="https://msdn.microsoft.com/en-us/library/cc313153%28v=office.12%29.aspx">Microsoft</a>,
|
||||
but the documentation can be a little hard to get into at first... Try reading the
|
||||
<a href="site:docformat">overview</a> first, and looking at the existing
|
||||
code, then finally look up the documentation for specific missing features.</p>
|
||||
|
||||
<p>As a first step you should familiarize yourself with the source code,
|
||||
examples, test cases, and the HWPF patches available at <a
|
||||
href="https://issues.apache.org/">Bugzilla</a> (if any). Then you
|
||||
should compile an overview of</p>
|
||||
|
||||
<ul>
|
||||
<li>the current HWPF status,</li>
|
||||
<li>the patches in <a
|
||||
href="https://issues.apache.org/bugzilla/">Bugzilla</a> to be checked
|
||||
in (and those that should better be ditched),</li>
|
||||
<li>the available test cases and the test cases still to be written,</li>
|
||||
<li>the available documentation and the docs to be written,</li>
|
||||
<li>anything else that seems reasonable</li>
|
||||
</ul>
|
||||
|
||||
<p>When you start coding, you will not yet have write access to the
|
||||
SVN repository. Please submit your patches to <a
|
||||
href="https://issues.apache.org/">Bugzilla</a> and nag <a
|
||||
href="mailto:dev@poi.apache.org">the dev list</a> until someone commits
|
||||
them. Besides the actual checking in of HWPF patches, current POI
|
||||
committers will also do some minor reviews now and then of your source code
|
||||
patches, test cases and documentation to help ensure software quality. But
|
||||
most of the time you will be on your own. However, anyone offering useful
|
||||
contributions over a period of time will be offered committership!</p>
|
||||
|
||||
<p>Please do not forget to write <a
|
||||
href="https://www.junit.org/">JUnit</a> test cases and documentation!
|
||||
We won't accept code that doesn't come with test cases. And please
|
||||
consider that other contributors should be able to understand your source
|
||||
code easily. If you need any help getting started with JUnit test cases
|
||||
for HWPF, please ask on the developers' mailing list! If you show that you
|
||||
are prepared to stick at it you will most likely be given SVN commit
|
||||
access. See <a href="site:guidelines">"Contribution to POI" page</a>
|
||||
for more details and help getting started.</p>
|
||||
|
||||
<p>Of course we will help you as best as we can. However, presently there
|
||||
is no committer who is really familiar with the Word format, so you'll be
|
||||
mostly on your own. We are looking forward to you and your contributions!
|
||||
The honor and glory of becoming a POI committer await!</p>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
|
||||
<!-- Keep this comment at the end of the file
|
||||
Local variables:
|
||||
mode: xml
|
||||
sgml-omittag:nil
|
||||
sgml-shorttag:nil
|
||||
sgml-namecase-general:nil
|
||||
sgml-general-insert-case:lower
|
||||
sgml-minimize-attributes:nil
|
||||
sgml-always-quote-attributes:t
|
||||
sgml-indent-step:1
|
||||
sgml-indent-data:t
|
||||
sgml-parent-document:nil
|
||||
sgml-exposed-tags:nil
|
||||
sgml-local-catalogs:nil
|
||||
sgml-local-ecat-files:nil
|
||||
End:
|
||||
-->
|
||||
@ -0,0 +1,392 @@
|
||||
<?xml version="1.0"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!-- edited with XMLSPY v5 rel. 4 U (http://www.xmlspy.com) by Ryan Ackley (Myself) -->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
<document>
|
||||
<header>
|
||||
<title>Apache POI™ - HWPF - Java API to Handle Microsoft Word Files</title>
|
||||
<subtitle>Project Plan</subtitle>
|
||||
<authors>
|
||||
<person name="Ryan Ackley" email="sackley@apache.org"/>
|
||||
</authors>
|
||||
</header>
|
||||
<body>
|
||||
<p>HWPF Milestones</p>
|
||||
<table>
|
||||
<tr>
|
||||
<th>
|
||||
Milestones
|
||||
</th>
|
||||
<th>
|
||||
Target Date
|
||||
</th>
|
||||
<th>
|
||||
Owner
|
||||
</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Read in a Word document
|
||||
with minimum formatting
|
||||
(no lists, tables, footnotes,
|
||||
endnotes, headers, footers)
|
||||
and write it back out with the
|
||||
result viewable in Word
|
||||
97/2000
|
||||
</td>
|
||||
<td>
|
||||
07/11/2003
|
||||
</td>
|
||||
<td>
|
||||
Ryan
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Add support for Lists and
|
||||
Tables
|
||||
</td>
|
||||
<td>
|
||||
8/15/2003
|
||||
</td>
|
||||
<td>
|
||||
 
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
HWPF 1.0-alpha release with
|
||||
documentation and examples
|
||||
</td>
|
||||
<td>
|
||||
8/18/2003
|
||||
</td>
|
||||
<td>
|
||||
Praveen/Ryan
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Add support for Headers,
|
||||
Footers, endnotes, and
|
||||
footnotes
|
||||
</td>
|
||||
<td>
|
||||
8/31/2003
|
||||
</td>
|
||||
<td>
|
||||
?
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Add support for forms and
|
||||
mail merge
|
||||
</td>
|
||||
<td>
|
||||
September/October 2003
|
||||
</td>
|
||||
<td>
|
||||
?
|
||||
</td>
|
||||
</tr>
|
||||
</table>
|
||||
<p>HWPF Task Lists</p>
|
||||
<p>Read in a Word document with minimum formatting (no lists, tables, footnotes,
|
||||
endnotes, headers, footers) and write it back out with the result viewable in Word 97/2000</p>
|
||||
<table>
|
||||
<tr>
|
||||
<th>
|
||||
Task
|
||||
</th>
|
||||
<th>
|
||||
Target Date
|
||||
</th>
|
||||
<th>
|
||||
Owner
|
||||
</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Create classes to read and
|
||||
write low level data
|
||||
structures with test cases
|
||||
</td>
|
||||
<td>
|
||||
7/10/2003
|
||||
</td>
|
||||
<td>
|
||||
Ryan
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Create classes to read and
|
||||
write FontTable and Font
|
||||
names with test case
|
||||
</td>
|
||||
<td>
|
||||
7/10/2003
|
||||
</td>
|
||||
<td>
|
||||
Praveen
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Final test
|
||||
</td>
|
||||
<td>
|
||||
7/11/2003
|
||||
</td>
|
||||
<td>
|
||||
Ryan
|
||||
</td>
|
||||
</tr>
|
||||
</table>
|
||||
<p>Develop a user-friendly API so it is fun and easy to read and write Word documents
with Java.</p>
|
||||
<table>
|
||||
<tr>
|
||||
<th>
|
||||
Task
|
||||
</th>
|
||||
<th>
|
||||
Target Date
|
||||
</th>
|
||||
<th>
|
||||
Owner
|
||||
</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Develop a way for SPRMS to
|
||||
be compressed and
|
||||
uncompressed
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Override CHPAbstractType
|
||||
with a concrete class that
|
||||
exposes attributes with
|
||||
human readable names
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Override PAPAbstractType
|
||||
with a concrete class that
|
||||
exposes attributes with
|
||||
human readable names
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Override SEPAbstractType
|
||||
with a concrete class that
|
||||
exposes attributes with
|
||||
human readable names
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Override DOPAbstractType
|
||||
with a concrete class that
|
||||
exposes attributes with
|
||||
human readable names
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Override TAPAbstractType
|
||||
with a concrete class that
|
||||
exposes attributes with
|
||||
human readable names
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Override TCAbstractType
|
||||
with a concrete class that
|
||||
exposes attributes with
|
||||
human readable names
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Develop a VerifyIntegrity
|
||||
class for testing so it is easy
|
||||
to determine if a Word
|
||||
Document is well-formed.
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Develop general intuitive
|
||||
API to tie everything together
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
</tr>
|
||||
</table>
|
||||
<p>Add support for lists and tables</p>
|
||||
<table>
|
||||
<tr>
|
||||
<th>
|
||||
Task
|
||||
</th>
|
||||
<th>
|
||||
Target Date
|
||||
</th>
|
||||
<th>
|
||||
Owner
|
||||
</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Add data structures for
|
||||
reading and writing list data
|
||||
with test cases.
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Add data structures for
|
||||
reading and writing tables
|
||||
with test cases.
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
</tr>
|
||||
</table>
|
||||
<p>HWPF 1.0-alpha release with documentation and examples</p>
|
||||
<table>
|
||||
<tr>
|
||||
<th>
|
||||
Task
|
||||
</th>
|
||||
<th>
|
||||
Target Date
|
||||
</th>
|
||||
<th>
|
||||
Owner
|
||||
</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Document the user model
|
||||
API
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Document the low level
|
||||
classes
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Come up with detailed How-Tos
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
</tr>
|
||||
</table>
|
||||
</body>
|
||||
</document>
|
||||
@ -0,0 +1,89 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>POI-XWPF - A Quick Guide</title>
|
||||
<subtitle>Overview</subtitle>
|
||||
<authors>
|
||||
<person name="Nick Burch" email="nick at torchbox dot com"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<p>XWPF has a fairly stable core API, providing read and write access
|
||||
to the main parts of a Word .docx file, but it isn't complete. For
|
||||
some things, it may be necessary to dive down into the low level XMLBeans
|
||||
objects to manipulate the OOXML structure. If you find yourself having
to do this, please consider sending in a patch to enhance the API; see the
|
||||
<a href="site:guidelines">"Contribution to POI" page</a>.</p>
|
||||
|
||||
<section><title>Basic Text Extraction</title>
|
||||
<p>For basic text extraction, make use of
|
||||
<code>org.apache.poi.xwpf.extractor.XWPFWordExtractor</code>. It accepts an input
|
||||
stream or a <code>XWPFDocument</code>. The <code>getText()</code>
|
||||
method can be used to
|
||||
get the text from all the paragraphs, along with tables, headers etc.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>Specific Text Extraction</title>
|
||||
<p>To get specific bits of text, first create a
|
||||
<code>org.apache.poi.xwpf.usermodel.XWPFDocument</code>. Select the <code>IBodyElement</code>
|
||||
of interest (Table, Paragraph etc), and from there get a <code>XWPFRun</code>.
|
||||
Finally fetch the text and properties from that.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>Headers and Footers</title>
|
||||
<p>To get at the headers and footers of a Word document, first create a
<code>org.apache.poi.xwpf.usermodel.XWPFDocument</code>. Next, you need to create a
<code>org.apache.poi.xwpf.model.XWPFHeaderFooterPolicy</code>, passing it your
XWPFDocument. Finally, the XWPFHeaderFooterPolicy gives you access to the headers and
|
||||
footers, including first / even / odd page ones if defined in your
|
||||
document.</p>
|
||||
</section>
|
||||
|
||||
<section><title>Changing Text</title>
|
||||
<p>From a <code>XWPFParagraph</code>, it is possible to fetch the existing
|
||||
<code>XWPFRun</code> elements that make up the text. To add new text,
|
||||
the <code>createRun()</code> method will add a new <code>XWPFRun</code>
|
||||
to the end of the list. <code>insertNewRun(int)</code> can instead be
|
||||
used to add a new <code>XWPFRun</code> at a specific point in the
|
||||
paragraph.
|
||||
</p>
|
||||
<p>Once you have a <code>XWPFRun</code>, you can use the
|
||||
<code>setText(String)</code> method to make changes to the text. To add
|
||||
whitespace elements such as tabs and line breaks, it is necessary to use
|
||||
methods like <code>addTab()</code> and <code>addCarriageReturn()</code>.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>Further Examples</title>
|
||||
<p>For now, there are a limited number of XWPF examples in the
|
||||
<a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xwpf">Examples Package</a>.
|
||||
Beyond those, the best source of additional examples is in the unit
|
||||
tests. <a href="https://svn.apache.org/viewvc/poi/trunk/poi-ooxml/src/test/java/org/apache/poi/xwpf/">
|
||||
Browse the XWPF unit tests.</a>
|
||||
</p>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
@ -0,0 +1,88 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>POI-HWPF - A Quick Guide</title>
|
||||
<subtitle>Overview</subtitle>
|
||||
<authors>
|
||||
<person name="Nick Burch" email="nick at torchbox dot com"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<p>HWPF is still in early development. It is in the <a
|
||||
href="https://svn.apache.org/viewvc/poi/trunk/poi-scratchpad/">
|
||||
scratchpad section of the SVN.</a> You will need to ensure you
|
||||
either have a recent SVN checkout, or a recent SVN nightly build
|
||||
(including the scratchpad jar!).</p>
|
||||
|
||||
<section><title>Basic Text Extraction</title>
|
||||
<p>For basic text extraction, make use of
|
||||
<code>org.apache.poi.hwpf.extractor.WordExtractor</code>. It accepts an input
|
||||
stream or a <code>HWPFDocument</code>. The <code>getText()</code>
|
||||
method can be used to
|
||||
get the text from all the paragraphs, or <code>getParagraphText()</code>
|
||||
can be used to fetch the text from each paragraph in turn. The other
|
||||
option is <code>getTextFromPieces()</code>, which is very fast, but
|
||||
tends to return things that aren't text from the page. YMMV.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>Specific Text Extraction</title>
|
||||
<p>To get specific bits of text, first create a
|
||||
<code>org.apache.poi.hwpf.HWPFDocument</code>. Fetch the range
|
||||
with <code>getRange()</code>, then get paragraphs from that. You
|
||||
can then get text and other properties.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>Headers and Footers</title>
|
||||
<p>To get at the headers and footers of a Word document, first create a
<code>org.apache.poi.hwpf.HWPFDocument</code>. Next, you need to create a
<code>org.apache.poi.hwpf.usermodel.HeaderStories</code>, passing it your
HWPFDocument. Finally, the HeaderStories gives you access to the headers and
footers, including first / even / odd page ones if defined in your
document. Additionally, HeaderStories provides a method for removing
|
||||
any macros in the text, which is helpful as many headers and footers
|
||||
do end up with macros in them.</p>
|
||||
</section>
|
||||
|
||||
<section><title>Changing Text</title>
|
||||
<p>It is possible to change the text via
|
||||
<code>insertBefore()</code> and <code>insertAfter()</code>
|
||||
on a <code>Range</code> object (either a <code>Range</code>,
|
||||
<code>Paragraph</code> or <code>CharacterRun</code>).
|
||||
It is also possible to delete a <code>Range</code>.
|
||||
This code will work in many, but not all cases, and patches to
|
||||
improve it are gratefully received!
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>Further Examples</title>
|
||||
<p>For now, the best source of additional examples is in the unit
|
||||
tests. <a
|
||||
href="https://svn.apache.org/viewvc/poi/trunk/poi-scratchpad/src/test/java/org/apache/poi/hwpf/">
|
||||
Browse the HWPF unit tests.</a>
|
||||
</p>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
216
src/documentation/content/xdocs/components/hmef/index.xml
Normal file
@ -0,0 +1,216 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>POI-HMEF - Java API To Access Microsoft Transport Neutral Encoding Files (TNEF)</title>
|
||||
<subtitle>Overview</subtitle>
|
||||
<authors>
|
||||
<person name="Nick Burch" email="nick at apache dot org"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section>
|
||||
<title>Overview</title>
|
||||
|
||||
<p>HMEF is the POI Project's pure Java implementation of Microsoft's
|
||||
TNEF (Transport Neutral Encoding Format), aka winmail.dat,
|
||||
which is used by Outlook and Exchange in some situations.</p>
|
||||
<p>Currently, HMEF provides a read-only API for accessing common
|
||||
message and attachment attributes, including the message body
|
||||
and attachment files. In addition, it's possible to have
|
||||
read-only access to all of the underlying TNEF and MAPI
|
||||
attributes of the message and attachments.</p>
|
||||
<p>HMEF also provides a command line tool for extracting out
|
||||
the message body and attachment files from a TNEF (winmail.dat)
|
||||
file.</p>
|
||||
<p>Write support, both for saving changes and for creating new
|
||||
files, is currently unavailable. Anyone interested in working
|
||||
on these areas is advised to read the
|
||||
<a href="site:guidelines">Contribution Guidelines</a> then
|
||||
<a href="site:mailinglists">join the dev list</a>!</p>
|
||||
|
||||
<note>
|
||||
This code currently lives in the
|
||||
<a href="https://svn.apache.org/viewvc/poi/trunk/poi-scratchpad/">scratchpad area</a>
|
||||
of the POI SVN repository. To use this component, ensure
|
||||
you have the Scratchpad Jar on your classpath, or a dependency
|
||||
defined on the <em>poi-scratchpad</em> artifact - the main POI
|
||||
jar is not enough! See the
|
||||
<a href="site:components">POI Components Map</a>
|
||||
for more details.
|
||||
</note>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Using HMEF to access TNEF (winmail.dat) files</title>
|
||||
|
||||
<section>
|
||||
<title>Easy extraction of message body and attachment files</title>
|
||||
|
||||
<p>The class <em>org.apache.poi.hmef.extractor.HMEFContentsExtractor</em>
|
||||
provides both command line and Java extraction. It allows the
|
||||
saving of the message body (an RTF file), and all of the
|
||||
attachment files, to a single directory as specified.</p>
|
||||
|
||||
<p>From the command line, simply call the class specifying the
|
||||
TNEF file to extract, and the directory to place the extracted
|
||||
files into, eg:</p>
|
||||
<source>
|
||||
java -classpath poi-5.4.1.jar:poi-scratchpad-5.4.1.jar org.apache.poi.hmef.extractor.HMEFContentsExtractor winmail.dat /tmp/extracted/
|
||||
</source>
|
||||
|
||||
<p>From Java, there are two method calls on the class, one to
|
||||
extract the message body RTF to a file, and the other to extract
|
||||
all the attachments to a directory. A typical use would be:</p>
|
||||
<source>
|
||||
public void extract(String winmailFilename, String directoryName) throws Exception {
|
||||
HMEFContentsExtractor ext = new HMEFContentsExtractor(new File(winmailFilename));
|
||||
|
||||
File dir = new File(directoryName);
|
||||
File rtf = new File(dir, "message.rtf");
|
||||
if(! dir.exists()) {
|
||||
throw new FileNotFoundException("Output directory " + dir.getName() + " not found");
|
||||
}
|
||||
|
||||
System.out.println("Extracting...");
|
||||
ext.extractMessageBody(rtf);
|
||||
ext.extractAttachments(dir);
|
||||
System.out.println("Extraction completed");
|
||||
}
|
||||
</source>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Attachment attributes and contents</title>
|
||||
|
||||
<p>To get at your attachments, simply call the
|
||||
<em>getAttachments()</em> method on a <em>HMEFMessage</em>
|
||||
instance, and you'll receive a list of all the attachments.</p>
|
||||
<p>When you have a <em>org.apache.poi.hmef.Attachment</em> object,
|
||||
there are several helper methods available. These will all
|
||||
return the value of the appropriate underlying attachment
|
||||
attributes, or null if for some reason the attribute isn't
|
||||
present in your file.</p>
|
||||
<ul>
|
||||
<li><em>getFilename()</em> - returns the name of the attachment
|
||||
file, possibly in 8.3 format</li>
|
||||
<li><em>getLongFilename()</em> - returns the full name of the
|
||||
attachment file</li>
|
||||
<li><em>getExtension()</em> - returns the extension of the
|
||||
attachment file, including the "."</li>
|
||||
<li><em>getModifiedDate()</em> - returns the date that the
|
||||
attachment file was last edited on</li>
|
||||
<li><em>getContents()</em> - returns a byte array of the contents
|
||||
of the attached file</li>
|
||||
<li><em>getRenderedMetaFile()</em> - returns a byte array of
|
||||
a windows meta file representation of the attached file</li>
|
||||
</ul>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Message attributes and message body</title>
|
||||
|
||||
<p>A <em>org.apache.poi.hmef.HMEFMessage</em> instance is created
|
||||
from an <em>InputStream</em> of the underlying TNEF (winmail.dat)
|
||||
file.</p>
|
||||
<p>From a <em>HMEFMessage</em>, there are three main methods of
|
||||
interest to call:</p>
|
||||
<ul>
|
||||
<li><em>getBody()</em> - returns a String containing the RTF
|
||||
contents of the message body. </li>
|
||||
<li><em>getSubject()</em> - returns the message subject</li>
|
||||
<li><em>getAttachments()</em> - returns the list of
|
||||
<em>Attachment</em> objects for the message</li>
|
||||
</ul>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Low level attribute access</title>
|
||||
|
||||
<p>Both Messages and Attachments contain two kinds of attributes.
|
||||
These are <em>TNEFAttribute</em> and <em>MAPIAttribute</em>.</p>
|
||||
<p>TNEFAttribute is specific to TNEF files in terms of the
|
||||
available types and properties. In general, Attachments have a
|
||||
few more useful ones of these than Messages.</p>
|
||||
<p>MAPIAttributes hold standard MAPI properties and values, and
|
||||
work in a similar way to <a href="../hsmf/">HSMF
|
||||
(Outlook)</a>. There are typically many of these on both
|
||||
Messages and Attachments. <em>Note - see limitations</em></p>
|
||||
<p>Both <em>HMEFMessage</em> and <em>Attachment</em>
|
||||
support two different ways of getting to attributes of interest.
|
||||
Firstly, they support list getters, to return all attributes
|
||||
(either TNEF or MAPI). Secondly, they support specific getters by
|
||||
TNEF or MAPI property.</p>
|
||||
<source>
|
||||
HMEFMessage msg = new HMEFMessage(new FileInputStream(file));
|
||||
for(TNEFAttribute attr : msg.getMessageAttributes()) {
|
||||
System.out.println("TNEF : " + attr);
|
||||
}
|
||||
for(MAPIAttribute attr : msg.getMessageMAPIAttributes()) {
|
||||
System.out.println("MAPI : " + attr);
|
||||
}
|
||||
System.out.println("Subject is " + msg.getMessageMAPIAttribute(MAPIProperty.CONVERSATION_TOPIC));
|
||||
|
||||
for(Attachment attach : msg.getAttachments()) {
|
||||
for(TNEFAttribute attr : attach.getAttributes()) {
|
||||
System.out.println("A.TNEF : " + attr);
|
||||
}
|
||||
for(MAPIAttribute attr : attach.getMAPIAttributes()) {
|
||||
System.out.println("A.MAPI : " + attr);
|
||||
}
|
||||
System.out.println("Filename is " + attach.getAttribute(TNEFProperty.ID_ATTACHTITLE));
|
||||
System.out.println("Extension is " + attach.getMAPIAttribute(MAPIProperty.ATTACH_EXTENSION));
|
||||
}
|
||||
</source>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Investigating a TNEF file</title>
|
||||
|
||||
<p>To get a feel for the contents of a file, and to track down
|
||||
where data of interest is stored, HMEF comes with
|
||||
<a href="https://svn.apache.org/repos/asf/poi/trunk/poi-scratchpad/src/main/java/org/apache/poi/hmef/dev/">HMEFDumper</a>
|
||||
to print out the contents of the file.</p>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Limitations</title>
|
||||
|
||||
<p>HMEF is currently a work-in-progress, and not everything
|
||||
works yet. The current limitations are:</p>
|
||||
<ul>
|
||||
<li>Non-standard MAPI properties from the range 0x8000 to 0x8fff
|
||||
may not be turned into attributes quite correctly.
|
||||
The values show up, but the name and type may not always
|
||||
be correct.</li>
|
||||
<li>All testing so far has been performed on a small number of
|
||||
English documents. We think we're correctly turning bytes into
|
||||
Java Unicode strings, but we need a few non-English sample
|
||||
files in the test suite to verify this!</li>
|
||||
<li>There is no support for saving changes, nor for creating new
|
||||
files</li>
|
||||
</ul>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
197
src/documentation/content/xdocs/components/hpbf/file-format.xml
Normal file
@ -0,0 +1,197 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>POI-HPBF - A Guide to the Publisher File Format</title>
|
||||
<subtitle>Overview</subtitle>
|
||||
<authors>
|
||||
<person name="Nick Burch" email="nick at torchbox dot com"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section><title>Document Streams</title>
|
||||
<p>
|
||||
The file is made up of a number of POIFS streams. A typical
|
||||
file will be made up as follows:
|
||||
</p>
|
||||
<source>
|
||||
Root Entry -
|
||||
Objects -
|
||||
(no children)
|
||||
SummaryInformation <(0x05)SummaryInformation>
|
||||
DocumentSummaryInformation <(0x05)DocumentSummaryInformation>
|
||||
Escher -
|
||||
EscherStm
|
||||
EscherDelayStm
|
||||
Quill -
|
||||
QuillSub -
|
||||
CONTENTS
|
||||
CompObj <(0x01)CompObj>
|
||||
Envelope
|
||||
Contents
|
||||
Internal <(0x03)Internal>
|
||||
CompObj <(0x01)CompObj>
|
||||
VBA -
|
||||
(no children)
|
||||
</source>
|
||||
</section>
|
||||
<section><title>Changing Text</title>
|
||||
<p>If you make a change to the text of a file, but do not change
|
||||
how much text there is, then the <em>CONTENTS</em> stream
|
||||
will undergo a small change, and the <em>Contents</em> stream
|
||||
will undergo a large change.</p>
|
||||
<p>If you make a change to the text of a file, and change the
|
||||
amount of text there is, then both the <em>Contents</em> and
|
||||
the <em>CONTENTS</em> streams change.</p>
|
||||
</section>
|
||||
<section><title>Changing Shapes</title>
|
||||
<p>If you alter the size of a textbox, but make no text changes,
|
||||
then both <em>Contents</em> and <em>CONTENTS</em> streams
|
||||
change. There are no changes to the Escher streams.</p>
|
||||
<p>If you set the background colour of a textbox, but make
|
||||
no changes to the text, (to finish off)</p>
|
||||
</section>
|
||||
<section><title>Structure of CONTENTS</title>
|
||||
<p>First we have "CHNKINK ", followed by 24 bytes.</p>
|
||||
<p>Next we have 20 sequences of 24 bytes each. If the first two bytes
are 0x1800, then that sequence entry exists, but if they are 0x0000 then
the entry doesn't exist. If it does exist, we then have 4 bytes of
upper case ASCII text, followed by three little endian shorts.
The first of these seems to be the count of that type, the second is
usually 1, the third is usually zero. Then we have another 4 bytes of
upper case ASCII text, normally but not always the same as the first
text. Finally, we have an unsigned little endian 32 bit offset to
the start of the data for this, then an unsigned little endian
32 bit length of this section.</p>
|
||||
<p>Normally, the first sequence entry is for TEXT, and the text data
|
||||
will start at 0x200. After that is normally two or three STSH entries
|
||||
(so the first short has values 0, then 1, then 2). After that it
|
||||
seems to vary.</p>
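<p>As a rough check of the layout described so far, the sizes can simply be
added up: the 8 bytes of "CHNKINK ", the 24 bytes that follow, and 20 sequence
entries of 24 bytes each. The sketch below is illustrative only; the class and
constant names are made up for this example and are not part of POI.</p>

```java
// Sketch only: names are illustrative, the sizes come from the description above.
final class ContentsLayout {
    static final int MAGIC_LENGTH = 8;    // "CHNKINK " including the trailing space
    static final int HEADER_LENGTH = 24;  // the 24 bytes that follow the magic
    static final int ENTRY_COUNT = 20;    // number of sequence entries
    static final int ENTRY_LENGTH = 24;   // size of each sequence entry

    /** Offset of the first byte after the table of sequence entries. */
    static int firstDataOffset() {
        return MAGIC_LENGTH + HEADER_LENGTH + ENTRY_COUNT * ENTRY_LENGTH;
    }

    public static void main(String[] args) {
        // 8 + 24 + 20 * 24 = 512
        System.out.printf("0x%x%n", firstDataOffset()); // prints "0x200"
    }
}
```

<p>The sum is 512, i.e. 0x200, which would explain why the text data
normally starts at that offset.</p>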
|
||||
<p>At 0x200 we have the text, stored as little endian 16 bit unicode.</p>
|
||||
<p>After the text comes all sorts of other stuff, presumably as
|
||||
described by the sequences.</p>
|
||||
<p>For a contents stream of length 7168 / 0x1c00 bytes, the start
|
||||
looks something like:</p>
|
||||
<source>
|
||||
CHNKINK // "CHNKINK "
|
||||
04 00 07 00 // Normally 04 00 07 00
|
||||
13 00 00 03 // Normally ## 00 00 03
|
||||
00 02 00 00 // Normally 00 ## 00 00
|
||||
00 1c 00 00 // Normally length of the stream
|
||||
f8 01 13 00 // Normally f8 01 11/13 00
|
||||
ff ff ff ff // Normally seems to be ffffffff
|
||||
|
||||
18 00
|
||||
TEXT 00 00 01 00 00 00 // TEXT 0 1 0
|
||||
TEXT 00 02 00 00 d0 03 00 00 // TEXT from: 200 (512), len: 3d0 (976)
|
||||
18 00
|
||||
STSH 00 00 01 00 00 00 // STSH 0 1 0
|
||||
STSH d0 05 00 00 1e 00 00 00 // STSH from: 5d0 (1488), len: 1e (30)
|
||||
18 00
|
||||
STSH 01 00 01 00 00 00 // STSH 1 1 0
|
||||
STSH ee 05 00 00 b8 01 00 00 // STSH from: 5ee (1518), len: 1b8 (440)
|
||||
18 00
|
||||
STSH 02 00 01 00 00 00 // STSH 2 1 0
|
||||
STSH a6 07 00 00 3c 00 00 00 // STSH from: 7a6 (1958), len: 3c (60)
|
||||
18 00
|
||||
FDPP 00 00 01 00 00 00 // FDPP 0 1 0
|
||||
FDPP 00 08 00 00 00 02 00 00 // FDPP from: 800 (2048), len: 200 (512)
|
||||
18 00
|
||||
FDPC 00 00 01 00 00 00 // FDPC 0 1 0
|
||||
FDPC 00 0a 00 00 00 02 00 00 // FDPC from: a00 (2560), len: 200 (512)
|
||||
18 00
|
||||
FDPC 01 00 01 00 00 00 // FDPC 1 1 0
|
||||
FDPC 00 0c 00 00 00 02 00 00 // FDPC from: c00 (3072), len: 200 (512)
|
||||
18 00
|
||||
SYID 00 00 01 00 00 00 // SYID 0 1 0
|
||||
SYID 00 0e 00 00 20 00 00 00 // SYID from: e00 (3584), len: 20 (32)
|
||||
18 00
|
||||
SGP 00 00 01 00 00 00 // SGP 0 1 0
|
||||
SGP 20 0e 00 00 0a 00 00 00 // SGP from: e20 (3616), len: a (10)
|
||||
18 00
|
||||
INK 00 00 01 00 00 00 // INK 0 1 0
|
||||
INK 2a 0e 00 00 04 00 00 00 // INK from: e2a (3626), len: 4 (4)
|
||||
18 00
|
||||
BTEP 00 00 01 00 00 00 // BTEP 0 1 0
|
||||
PLC 2e 0e 00 00 18 00 00 00 // PLC from: e2e (3630), len: 18 (24)
|
||||
18 00
|
||||
BTEC 00 00 01 00 00 00 // BTEC 0 1 0
|
||||
PLC 46 0e 00 00 20 00 00 00 // PLC from: e46 (3654), len: 20 (32)
|
||||
18 00
|
||||
FONT 00 00 01 00 00 00 // FONT 0 1 0
|
||||
FONT 66 0e 00 00 48 03 00 00 // FONT from: e66 (3686), len: 348 (840)
|
||||
18 00
|
||||
TCD 03 00 01 00 00 00 // TCD 3 1 0
|
||||
PLC ae 11 00 00 24 00 00 00 // PLC from: 11ae (4526), len: 24 (36)
|
||||
18 00
|
||||
TOKN 04 00 01 00 00 00 // TOKN 4 1 0
|
||||
PLC d2 11 00 00 0a 01 00 00 // PLC from: 11d2 (4562), len: 10a (266)
|
||||
18 00
|
||||
TOKN 05 00 01 00 00 00 // TOKN 5 1 0
|
||||
PLC dc 12 00 00 2a 01 00 00 // PLC from: 12dc (4828), len: 12a (298)
|
||||
18 00
|
||||
STRS 00 00 01 00 00 00 // STRS 0 1 0
|
||||
PLC 06 14 00 00 46 00 00 00 // PLC from: 1406 (5126), len: 46 (70)
|
||||
18 00
|
||||
MCLD 00 00 01 00 00 00 // MCLD 0 1 0
|
||||
MCLD 4c 14 00 00 16 06 00 00 // MCLD from: 144c (5196), len: 616 (1558)
|
||||
18 00
|
||||
PL 00 00 01 00 00 00 // PL 0 1 0
|
||||
PL 62 1a 00 00 48 00 00 00 // PL from: 1a62 (6754), len: 48 (72)
|
||||
00 00 // Blank entry follows
|
||||
00 00 00 00 00 00
|
||||
00 00 00 00 00 00 00 00
|
||||
00 00 00 00 00 00 00 00
|
||||
|
||||
(the text will then start)
|
||||
</source>
|
||||
<p>We think that the first 4 bytes of text describe the
|
||||
function of the data at the offset. The first short is
|
||||
then the count of that type, eg the 2nd will have 1. We
|
||||
think that the second 4 bytes of text describes the format
|
||||
of the data block at the offset. The format of the text block
|
||||
is easy, but we're still trying to figure out the others.</p>
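<p>The entry layout described above can be sketched in Java as follows.
This is only an illustration, not POI code: the class name
<code>QuillEntrySketch</code> and its fields are hypothetical, and it
assumes each entry is exactly 0x18 bytes, with space-padded 4 character
tags and little endian numbers, as the sample dump suggests.</p>

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of POI): decodes one 0x18-byte entry
// laid out like the ones in the dump above.
class QuillEntrySketch {
    final String type;    // first 4 bytes of text, e.g. "TEXT" or "STSH"
    final int index;      // first short after the type (0, 1, 2, ...)
    final String format;  // second 4 bytes of text, e.g. "TEXT" or "PLC "
    final long offset;    // where the data block starts in the stream
    final long length;    // length of the data block in bytes

    QuillEntrySketch(String type, int index, String format, long offset, long length) {
        this.type = type;
        this.index = index;
        this.format = format;
        this.offset = offset;
        this.length = length;
    }

    static QuillEntrySketch parse(byte[] data, int start) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        buf.position(start);
        int size = Short.toUnsignedInt(buf.getShort());
        if (size != 0x18) {
            throw new IllegalArgumentException("Unexpected entry size: " + size);
        }
        String type = readTag(buf);
        int index = Short.toUnsignedInt(buf.getShort());
        buf.getShort(); // 01 00 in every sample entry, meaning unknown
        buf.getShort(); // 00 00 in every sample entry, meaning unknown
        String format = readTag(buf);
        long offset = Integer.toUnsignedLong(buf.getInt());
        long length = Integer.toUnsignedLong(buf.getInt());
        return new QuillEntrySketch(type, index, format, offset, length);
    }

    // Reads a 4 character ASCII tag such as "TEXT" or "PLC ".
    private static String readTag(ByteBuffer buf) {
        byte[] tag = new byte[4];
        buf.get(tag);
        return new String(tag, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // The first TEXT entry from the dump above.
        byte[] sample = {
            0x18, 0x00,
            'T', 'E', 'X', 'T', 0x00, 0x00, 0x01, 0x00, 0x00, 0x00,
            'T', 'E', 'X', 'T', 0x00, 0x02, 0x00, 0x00, (byte) 0xd0, 0x03, 0x00, 0x00
        };
        QuillEntrySketch e = parse(sample, 0);
        System.out.println(String.format("%s/%s from: %x, len: %x",
                e.type, e.format, e.offset, e.length));
    }
}
```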
|
||||
|
||||
<section><title>Structure of TEXT bit</title>
|
||||
<p>This is very simple. All the text for the document is
|
||||
stored in a single bit of the Quill CONTENTS. The text
|
||||
is stored as little endian 16 bit unicode strings.</p>
|
||||
</section>
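<p>As a minimal sketch of the above, the text can be decoded with the
standard Java charset support once the offset and length of the TEXT bit
are known. The class and method names here are hypothetical and are not
part of POI.</p>

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of POI): decodes the TEXT bit of a
// Quill CONTENTS stream, given its offset and length in bytes.
class QuillTextSketch {
    static String decodeText(byte[] contents, int offset, int length) {
        // The text is little endian 16 bit unicode, so UTF-16LE applies.
        return new String(contents, offset, length, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        // Four bytes of unrelated data, followed by "Hi" as UTF-16LE.
        byte[] contents = {0x01, 0x02, 0x03, 0x04, 'H', 0x00, 'i', 0x00};
        System.out.println(decodeText(contents, 4, 4));
    }
}
```

<p>For the sample stream above, the offset would be 0x200 and the length
0x3d0, as given by the TEXT entry.</p>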
|
||||
<section><title>Structure of PLC bit</title>
|
||||
<p>The first four bytes seem to hold the count of the
|
||||
entries in the bit, and the second four bytes seem to hold
|
||||
the type. There is then some pre-data, and then data for
|
||||
each of the entries, the exact format being dependent on the type.</p>
|
||||
<p>Type 0 has four 2 byte unsigned ints, then a pair of 2 byte
|
||||
unsigned ints for each entry.</p>
|
||||
<p>Type 4 has four 2 byte unsigned ints, then a pair of 4 byte
|
||||
unsigned ints for each entry.</p>
|
||||
<p>Type 8 has seven 2 byte unsigned ints, then a pair of 4 byte
|
||||
unsigned ints for each entry.</p>
|
||||
<p>Type 12 holds hyperlinks, and is very much more complex.
|
||||
See <a href="https://svn.apache.org/viewvc/poi/trunk/poi-scratchpad/src/main/java/org/apache/poi/hpbf/model/qcbits/QCPLCBit.java?view=markup"><code>org.apache.poi.hpbf.model.qcbits.QCPLCBit</code></a>
|
||||
for our best guess as to how the contents match up.</p>
|
||||
</section>
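<p>The simpler PLC types described above could be read along these
lines. This is a sketch under assumptions, not the POI implementation:
the class name <code>PlcBitSketch</code> is hypothetical, the pre-data is
assumed to start directly after the 8 byte count and type header, and
all values are assumed to be little endian. Type 12 (hyperlinks) is not
handled.</p>

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper (not part of POI): reads the per-entry pairs
// from a PLC bit of type 0, 4 or 8.
class PlcBitSketch {
    static long[][] readPairs(byte[] bit) {
        ByteBuffer buf = ByteBuffer.wrap(bit).order(ByteOrder.LITTLE_ENDIAN);
        int count = buf.getInt(); // first four bytes: number of entries
        int type = buf.getInt();  // second four bytes: PLC type
        if (count < 0) {
            throw new IllegalArgumentException("Entry count too large");
        }

        int preDataShorts; // number of 2 byte values before the entries
        boolean wide;      // true if each pair is made of 4 byte values
        switch (type) {
            case 0: preDataShorts = 4; wide = false; break;
            case 4: preDataShorts = 4; wide = true;  break;
            case 8: preDataShorts = 7; wide = true;  break;
            default:
                throw new IllegalArgumentException("Unhandled PLC type " + type);
        }

        // Skip the pre-data, whose meaning is not yet understood.
        buf.position(buf.position() + 2 * preDataShorts);

        long[][] pairs = new long[count][2];
        for (int i = 0; i < count; i++) {
            for (int j = 0; j < 2; j++) {
                pairs[i][j] = wide
                        ? Integer.toUnsignedLong(buf.getInt())
                        : Short.toUnsignedInt(buf.getShort());
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // A made-up type 0 bit with two entries: (1, 2) and (3, 4).
        ByteBuffer b = ByteBuffer.allocate(24).order(ByteOrder.LITTLE_ENDIAN);
        b.putInt(2).putInt(0);
        for (int i = 0; i < 4; i++) {
            b.putShort((short) 0);
        }
        b.putShort((short) 1).putShort((short) 2);
        b.putShort((short) 3).putShort((short) 4);
        for (long[] pair : readPairs(b.array())) {
            System.out.println(pair[0] + ", " + pair[1]);
        }
    }
}
```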
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
77
src/documentation/content/xdocs/components/hpbf/index.xml
Normal file
@ -0,0 +1,77 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>POI-HPBF - Java API To Access Microsoft Publisher Format Files</title>
|
||||
<subtitle>Overview</subtitle>
|
||||
<authors>
|
||||
<person name="Nick Burch" email="nick at apache dot org"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section>
|
||||
<title>Overview</title>
|
||||
|
||||
<p>HPBF is the POI Project's pure Java implementation of the
|
||||
Publisher file format.</p>
|
||||
<p>Currently, HPBF is in an early stage, whilst we try to
|
||||
figure out the file format. So far, we have basic text
|
||||
extraction support, and are able to read some parts within
|
||||
the file. Writing is not yet supported, as we are unable
|
||||
to make sense of the Contents stream, which we think has
|
||||
lots of offsets to other parts of the file.</p>
|
||||
<p>Our initial aim is to produce a text extractor for the format
|
||||
(now done), and be able to extract hyperlinks from within
|
||||
the document (partly supported). Additional low level
|
||||
code to process the file format may follow, if there
|
||||
is demand and developer interest warrants it.</p>
|
||||
<p>Text Extraction is available via the
|
||||
<em>org.apache.poi.hpbf.extractor.PublisherTextExtractor</em>
|
||||
class.</p>
|
||||
<p>At this time, there is no <em>usermodel</em> api or similar.
|
||||
There is only low level support for certain parts of
|
||||
the file, but by no means all of it.</p>
|
||||
<p>Our current understanding of the file format is documented
|
||||
<a href="site:hpbformat">here</a>.</p>
|
||||
<p>As of 2017, we are unaware of a public format specification for
|
||||
Microsoft Publisher .pub files. This format was not included in
|
||||
the Microsoft Open Specifications Promise with the rest of the
|
||||
Microsoft Office file formats.
|
||||
As of <a href="https://social.msdn.microsoft.com/Forums/en-US/63dc6c4e-d6b2-4873-97dd-139ddb304e24/what-about-publisher-file-format?forum=os_binaryfile">2009</a> and <a href="https://social.msdn.microsoft.com/Forums/en-US/a5f55c72-5378-4dc9-944a-9973a12bfaa7/reading-viso-vsdfiles-and-publisher-pubfiles-without-office?forum=os_binaryfile">2016</a>, Microsoft had no plans to document the .pub file format.
|
||||
If this changes in the future, perhaps we will see a spec published
|
||||
on the <a href="https://msdn.microsoft.com/en-us/library/cc313105(v=office.12).aspx">Microsoft Office File Format Open Specification Technical Documentation</a>.
|
||||
</p>
|
||||
|
||||
<note>
|
||||
This code currently lives in the
|
||||
<a href="https://svn.apache.org/viewvc/poi/trunk/poi-scratchpad/">scratchpad area</a>
|
||||
of the POI SVN repository. To use this component, ensure
|
||||
you have the Scratchpad Jar on your classpath, or a dependency
|
||||
defined on the <em>poi-scratchpad</em> artifact - the main POI
|
||||
jar is not enough! See the
|
||||
<a href="site:components">POI Components Map</a>
|
||||
for more details.
|
||||
</note>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
1477
src/documentation/content/xdocs/components/hpsf/how-to.xml
Normal file
73
src/documentation/content/xdocs/components/hpsf/index.xml
Normal file
@ -0,0 +1,73 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Apache POI™ - HPSF - Java API for Microsoft Format Document
|
||||
Properties</title>
|
||||
<subtitle>Overview</subtitle>
|
||||
<authors>
|
||||
<person name="Rainer Klute" email="klute@apache.org"/>
|
||||
</authors>
|
||||
</header>
|
||||
<body>
|
||||
<section><title>Overview</title>
|
||||
|
||||
<p>Microsoft applications like "Word", "Excel" or "Powerpoint" let the user
|
||||
describe a document by properties like "title", "category" and so on. The
|
||||
application itself adds further information: last author, creation date
|
||||
etc. These document properties are stored in <strong>property set
|
||||
streams</strong>. A property set stream is a separate document within a
|
||||
<a href="../poifs/index.html">POI filesystem</a>. HPSF is POI's pure-Java
|
||||
implementation to read and write property sets.</p>
|
||||
|
||||
<p>The <a href="how-to.html">HPSF HOWTO</a> describes what a Java
|
||||
application should do to read a property set using HPSF, how to retrieve
|
||||
the information it needs, and how to write properties into the
|
||||
document.</p>
|
||||
|
||||
<p>HPSF supports OLE2 property set streams in general, and is not limited to
|
||||
the special case of document properties in the Microsoft Office files
|
||||
mentioned above. The <a href="internals.html">HPSF description</a>
|
||||
describes the internal structure of property set streams. A separate
|
||||
document explains the internals of <a href="thumbnails.html">thumbnail
|
||||
images</a>.</p>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
|
||||
<!-- Keep this comment at the end of the file
|
||||
Local variables:
|
||||
mode: xml
|
||||
sgml-omittag:nil
|
||||
sgml-shorttag:nil
|
||||
sgml-namecase-general:nil
|
||||
sgml-general-insert-case:lower
|
||||
sgml-minimize-attributes:nil
|
||||
sgml-always-quote-attributes:t
|
||||
sgml-indent-step:1
|
||||
sgml-indent-data:t
|
||||
sgml-parent-document:nil
|
||||
sgml-exposed-tags:nil
|
||||
sgml-local-catalogs:nil
|
||||
sgml-local-ecat-files:nil
|
||||
End:
|
||||
-->
|
||||
1079
src/documentation/content/xdocs/components/hpsf/internals.xml
Normal file
198
src/documentation/content/xdocs/components/hpsf/thumbnails.xml
Normal file
@ -0,0 +1,198 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>HPSF THUMBNAIL HOW-TO</title>
|
||||
<authors>
|
||||
<person name="Drew Varner" email="Drew.Varner@-deleteThis-sc.edu" />
|
||||
</authors>
|
||||
</header>
|
||||
<body>
|
||||
<section><title>The VT_CF Format</title>
|
||||
|
||||
<p>Thumbnail information is stored as a VT_CF, or Thumbnail Variant. The
|
||||
Thumbnail Variant is used to store various types of information in a
|
||||
clipboard. The VT_CF can store information in formats for the Macintosh or
|
||||
Windows clipboard.</p>
|
||||
|
||||
<p>There are many types of data that can be copied to the clipboard, but the
|
||||
only types of information needed for thumbnail manipulation are the image
|
||||
formats.</p>
|
||||
|
||||
<p>The <code>VT_CF</code> structure looks like this:</p>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Element:</th>
|
||||
<td>Clipboard Size</td>
|
||||
<td>Clipboard Format Tag</td>
|
||||
<td>Clipboard Data</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Size:</th>
|
||||
<td>32 bit unsigned integer (DWord)</td>
|
||||
<td>32 bit signed integer (DWord)</td>
|
||||
<td>variable length (byte array)</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<p>The Clipboard Size refers to the size (in bytes) of Clipboard Data
|
||||
(variable size) plus the Clipboard Format Tag (four bytes).</p>
|
||||
|
||||
<p>Clipboard Format Tag has four possible values:</p>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Value</th>
|
||||
<th>Identifier</th>
|
||||
<th>Description</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><code>-1L</code></td>
|
||||
<td><code>CFTAG_WINDOWS</code></td>
|
||||
<td>a built-in Windows© clipboard format value</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><code>-2L</code></td>
|
||||
<td><code>CFTAG_MACINTOSH</code></td>
|
||||
<td>a Macintosh clipboard format value</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><code>-3L</code></td>
|
||||
<td><code>CFTAG_FMTID</code></td>
|
||||
<td>a format identifier (FMTID). This is rarely used.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><code>0L</code></td>
|
||||
<td><code>CFTAG_NODATA</code></td>
|
||||
<td>No data. This is rarely used.</td>
|
||||
</tr>
|
||||
</table>
|
||||
</section>
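<p>The <code>VT_CF</code> layout above could be split into its three
elements roughly as follows. This is a sketch rather than HPSF code: the
class name <code>VtCfSketch</code> is hypothetical, and the values are
assumed to be little endian, as elsewhere in property set streams.</p>

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper (not part of HPSF): splits a VT_CF value into
// Clipboard Size, Clipboard Format Tag and Clipboard Data.
class VtCfSketch {
    static final int CFTAG_WINDOWS = -1;
    static final int CFTAG_MACINTOSH = -2;
    static final int CFTAG_FMTID = -3;
    static final int CFTAG_NODATA = 0;

    final long clipboardSize; // size of the tag (4 bytes) plus the data
    final int formatTag;      // one of the CFTAG_ values above
    final byte[] data;        // the Clipboard Data byte array

    VtCfSketch(long clipboardSize, int formatTag, byte[] data) {
        this.clipboardSize = clipboardSize;
        this.formatTag = formatTag;
        this.data = data;
    }

    static VtCfSketch parse(byte[] raw) {
        ByteBuffer buf = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN);
        long size = Integer.toUnsignedLong(buf.getInt());
        int tag = buf.getInt();
        // The size includes the four bytes of the format tag.
        long dataLength = size - 4;
        if (dataLength < 0 || dataLength > buf.remaining()) {
            throw new IllegalArgumentException("Bad clipboard size: " + size);
        }
        byte[] data = new byte[(int) dataLength];
        buf.get(data);
        return new VtCfSketch(size, tag, data);
    }

    public static void main(String[] args) {
        // Made-up value: size 6, tag CFTAG_WINDOWS, two data bytes.
        byte[] raw = {0x06, 0x00, 0x00, 0x00,
                      (byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff,
                      0x0a, 0x0b};
        VtCfSketch cf = parse(raw);
        System.out.println("tag=" + cf.formatTag + ", data bytes=" + cf.data.length);
    }
}
```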
|
||||
|
||||
|
||||
|
||||
<section><title>Windows Clipboard Data</title>
|
||||
|
||||
<p>Windows clipboard data has four image formats for thumbnails:</p>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Value</th>
|
||||
<th>Identifier</th>
|
||||
<th>Description</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>3</td>
|
||||
<td><code>CF_METAFILEPICT</code></td>
|
||||
<td>Windows metafile format - recommended</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>8</td>
|
||||
<td><code>CF_DIB</code></td>
|
||||
<td>Device Independent Bitmap</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>14</td>
|
||||
<td><code>CF_ENHMETAFILE</code></td>
|
||||
<td>Enhanced Windows metafile format</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>2</td>
|
||||
<td><code>CF_BITMAP</code></td>
|
||||
<td>Bitmap - Obsolete - Use <code>CF_DIB</code> instead</td>
|
||||
</tr>
|
||||
</table>
|
||||
</section>
|
||||
|
||||
<section><title>Windows Metafile Format</title>
|
||||
|
||||
<p>The most common format for thumbnails on the Windows platform is the
|
||||
Windows metafile format. The Clipboard places an extra header in front of
|
||||
the standard Windows Metafile Format data.</p>
|
||||
|
||||
<p>The Clipboard Data byte array looks like this when an image is stored in
|
||||
Windows' Clipboard WMF format.</p>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Identifier</th>
|
||||
<td>CF_METAFILEPICT</td>
|
||||
<td>mm</td>
|
||||
<td>width</td>
|
||||
<td>height</td>
|
||||
<td>handle</td>
|
||||
<td>WMF data</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Size</th>
|
||||
<td>32 bit unsigned int</td>
|
||||
<td>16 bit unsigned(?) int</td>
|
||||
<td>16 bit unsigned(?) int</td>
|
||||
<td>16 bit unsigned(?) int</td>
|
||||
<td>16 bit unsigned(?) int</td>
|
||||
<td>byte array - variable length</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Clipboard WMF</td>
|
||||
<td>Mapping Mode</td>
|
||||
<td>Image Width</td>
|
||||
<td>Image Height</td>
|
||||
<td>handle to the WMF data array in memory, or 0</td>
|
||||
<td>standard WMF byte stream</td>
|
||||
</tr>
|
||||
</table>
|
||||
</section>
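<p>The extra clipboard header in the table above could be stripped off
along these lines, leaving the standard WMF byte stream. This is only a
sketch: the class name <code>ClipboardWmfSketch</code> is hypothetical,
the fields are assumed to be little endian, and the 16 bit fields are
read as unsigned, which the table itself marks as uncertain.</p>

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper (not part of HPSF): reads the clipboard header
// that precedes WMF data in a Windows clipboard thumbnail.
class ClipboardWmfSketch {
    static final long CF_METAFILEPICT = 3;

    final int mappingMode;
    final int width;
    final int height;
    final int handle;
    final byte[] wmfData; // standard WMF byte stream

    ClipboardWmfSketch(int mappingMode, int width, int height, int handle, byte[] wmfData) {
        this.mappingMode = mappingMode;
        this.width = width;
        this.height = height;
        this.handle = handle;
        this.wmfData = wmfData;
    }

    static ClipboardWmfSketch parse(byte[] clipboardData) {
        ByteBuffer buf = ByteBuffer.wrap(clipboardData).order(ByteOrder.LITTLE_ENDIAN);
        long format = Integer.toUnsignedLong(buf.getInt());
        if (format != CF_METAFILEPICT) {
            throw new IllegalArgumentException("Not CF_METAFILEPICT: " + format);
        }
        int mappingMode = Short.toUnsignedInt(buf.getShort());
        int width = Short.toUnsignedInt(buf.getShort());
        int height = Short.toUnsignedInt(buf.getShort());
        int handle = Short.toUnsignedInt(buf.getShort());
        // Everything after the 12 byte header is the WMF data itself.
        byte[] wmf = new byte[buf.remaining()];
        buf.get(wmf);
        return new ClipboardWmfSketch(mappingMode, width, height, handle, wmf);
    }

    public static void main(String[] args) {
        // Made-up header: mm = 8, width = 100, height = 50, handle = 0,
        // followed by three placeholder bytes standing in for WMF data.
        byte[] data = {0x03, 0x00, 0x00, 0x00,
                       0x08, 0x00, 0x64, 0x00, 0x32, 0x00, 0x00, 0x00,
                       0x01, 0x02, 0x03};
        ClipboardWmfSketch wmf = parse(data);
        System.out.println(wmf.width + "x" + wmf.height
                + ", WMF bytes=" + wmf.wmfData.length);
    }
}
```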
|
||||
|
||||
|
||||
<section><title>Device Independent Bitmap</title>
|
||||
<p><strong>FIXME:</strong> Describe the Device Independent Bitmap
|
||||
format!</p>
|
||||
</section>
|
||||
|
||||
|
||||
|
||||
<section><title>Macintosh Clipboard Data</title>
|
||||
<p><strong>FIXME:</strong> Describe the Macintosh clipboard formats!</p>
|
||||
</section>
|
||||
|
||||
</body>
|
||||
</document>
|
||||
|
||||
<!-- Keep this comment at the end of the file
|
||||
Local variables:
|
||||
mode: xml
|
||||
sgml-omittag:nil
|
||||
sgml-shorttag:nil
|
||||
sgml-namecase-general:nil
|
||||
sgml-general-insert-case:lower
|
||||
sgml-minimize-attributes:nil
|
||||
sgml-always-quote-attributes:t
|
||||
sgml-indent-step:1
|
||||
sgml-indent-data:t
|
||||
sgml-parent-document:nil
|
||||
sgml-exposed-tags:nil
|
||||
sgml-local-catalogs:nil
|
||||
sgml-local-ecat-files:nil
|
||||
End:
|
||||
-->
|
||||
77
src/documentation/content/xdocs/components/hpsf/todo.xml
Normal file
@ -0,0 +1,77 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>To Do</title>
|
||||
<authors>
|
||||
<person name="Rainer Klute" email="klute@rainer-klute.de"/>
|
||||
</authors>
|
||||
</header>
|
||||
<body>
|
||||
<section><title>To Do</title>
|
||||
|
||||
<p>The following functionality should be added to HPSF:</p>
|
||||
|
||||
<ol>
|
||||
<li>
|
||||
Improve writing support! We need convenience classes and methods for
|
||||
easily writing summary information streams and document summary
|
||||
information streams.
|
||||
</li>
|
||||
<li>
|
||||
Add resource bundles to
|
||||
<code>org.apache.poi.hpsf.wellknown</code> to ease
|
||||
localizations. This would be useful for mapping standard property IDs to
|
||||
localized strings. Example: The property ID 4 could be mapped to "Author"
|
||||
in English or "Verfasser" in German.
|
||||
</li>
|
||||
<li>
|
||||
Implement reading functionality for those property types that are not
|
||||
yet supported. HPSF should return proper Java types instead of just byte
|
||||
arrays.
|
||||
</li>
|
||||
<li>
|
||||
Add WMF to <code>java.awt.Image</code> example code in the <a
|
||||
href="thumbnails.html">Thumbnail HOW-TO</a>.
|
||||
</li>
|
||||
</ol>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
|
||||
<!-- Keep this comment at the end of the file
|
||||
Local variables:
|
||||
mode: xml
|
||||
sgml-omittag:nil
|
||||
sgml-shorttag:nil
|
||||
sgml-namecase-general:nil
|
||||
sgml-general-insert-case:lower
|
||||
sgml-minimize-attributes:nil
|
||||
sgml-always-quote-attributes:t
|
||||
sgml-indent-step:1
|
||||
sgml-indent-data:t
|
||||
sgml-parent-document:nil
|
||||
sgml-exposed-tags:nil
|
||||
sgml-local-catalogs:nil
|
||||
sgml-local-ecat-files:nil
|
||||
End:
|
||||
-->
|
||||
65
src/documentation/content/xdocs/components/hsmf/index.xml
Normal file
@ -0,0 +1,65 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>POI-HSMF - Java API To Access Microsoft Outlook MSG Files</title>
|
||||
<subtitle>Overview</subtitle>
|
||||
<authors>
|
||||
<person name="Nick Burch" email="nick at apache dot org"/>
|
||||
<person name="Travis Ferguson" email="uniformstupidity at gmail dot com"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section>
|
||||
<title>Overview</title>
|
||||
|
||||
<p>HSMF is the POI Project's pure Java implementation of the Outlook MSG format.</p>
|
||||
<p>At this time, it provides low-level read access to all of the file, along
|
||||
with a user-facing way to get at the common textual content of MSG files.
|
||||
</p>
|
||||
<p>There is an example MSG textual renderer, which shows how to access the
|
||||
common parts such as sender, subject, message body and examples. This is
|
||||
in the
|
||||
<a href="https://svn.apache.org/viewvc/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hsmf/">HSMF examples area</a>
|
||||
of SVN. You may also wish to look at the unit tests for more usage examples.</p>
|
||||
|
||||
<note>
|
||||
This code currently lives in the
|
||||
<a href="https://svn.apache.org/viewvc/poi/trunk/poi-scratchpad/src/main/java">scratchpad area</a>
|
||||
of the POI SVN repository. To use this component, ensure
|
||||
you have the Scratchpad Jar on your classpath, or a dependency
|
||||
defined on the <em>poi-scratchpad</em> artifact - the main POI
|
||||
jar is not enough! See the
|
||||
<a href="site:components">POI Components Map</a>
|
||||
for more details.
|
||||
</note>
|
||||
<note>
|
||||
This code is subject to change between versions, and being
|
||||
"scratchpad", doesn't maintain the usual Apache POI backwards
|
||||
compatibility guarantees. In particular, the way that property
|
||||
values are fetched is expected to change soon, as part of the
|
||||
work to improve fixed-length property support.
|
||||
</note>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
423
src/documentation/content/xdocs/components/index.xml
Normal file
@ -0,0 +1,423 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Apache POI™ - Component Overview</title>
|
||||
<authors>
|
||||
<person id="AO" name="Andrew C. Oliver" email="acoliver@apache.org"/>
|
||||
<person id="RK" name="Rainer Klute" email="klute@apache.org"/>
|
||||
<person id="DF" name="David Fisher" email="dfisher@jmlafferty.com"/>
|
||||
</authors>
|
||||
</header>
|
||||
<body>
|
||||
<section><title>Apache POI Project Components</title>
|
||||
<p>The Apache POI project is the master project for developing pure
|
||||
Java ports of file formats based on Microsoft's OLE 2 Compound
|
||||
Document Format. OLE 2 Compound Document Format is used by
|
||||
Microsoft Office Documents, as well as by programs using MFC
|
||||
property sets to serialize their document objects.
|
||||
</p>
|
||||
<p>Apache POI is also the master project for developing pure
|
||||
Java ports of file formats based on Office Open XML (ooxml).
|
||||
OOXML is part of an ECMA / ISO standardisation effort. This
|
||||
documentation is quite large, but you can normally find the bit you
|
||||
need without too much effort!
|
||||
<a href="https://ecma-international.org/publications-and-standards/standards/ecma-376/">ECMA-376 standard is here</a>,
|
||||
and is also under the
|
||||
<a href="https://msdn.microsoft.com/en-us/openspecifications/default">Microsoft OSP</a>.
|
||||
</p>
|
||||
|
||||
|
||||
<section><title>POIFS for OLE 2 Documents</title>
|
||||
<p>
|
||||
POIFS is the oldest and most stable part of POI. It is our port of the OLE 2 Compound Document Format to
|
||||
pure Java. It supports both read and write functionality. All of our components for the binary (non-XML)
|
||||
Microsoft Office formats ultimately rely on it by
|
||||
definition. Please see <a href="./poifs/index.html">the POIFS project page</a> for more information.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>HSSF and XSSF for Excel Documents</title>
|
||||
<p>
|
||||
HSSF is our port of the Microsoft Excel 97 (-2003) file format (BIFF8) to pure
|
||||
Java. XSSF is our port of the Microsoft Excel XML (2007+) file format (OOXML) to
|
||||
pure Java. SS is a package that provides common support for both formats with a common API.
|
||||
They both support read and write capability. Please see
|
||||
<a href="site:spreadsheet">the HSSF+XSSF project page</a> for more
|
||||
information.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>HWPF and XWPF for Word Documents</title>
|
||||
<p>
|
||||
HWPF is our port of the Microsoft Word 97 (-2003) file format to pure
|
||||
Java. It supports read, and limited write capabilities. It also provides
|
||||
simple text extraction support for the older Word 6 and Word 95 formats.
|
||||
Please see <a href="site:document">the HWPF project page for more
|
||||
information</a>. This component remains in early stages of
|
||||
development. It can already read and write simple files.
|
||||
</p>
|
||||
<p>
|
||||
We are also working on the XWPF for the WordprocessingML (2007+) format from the
|
||||
OOXML specification. This provides read and write support for simpler
|
||||
files, along with text extraction capabilities.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>HSLF and XSLF for PowerPoint Documents</title>
|
||||
<p>
|
||||
HSLF is our port of the Microsoft PowerPoint 97(-2003) file format to pure
|
||||
Java. It supports read and write capabilities. Please see <a
|
||||
href="site:slideshow">the HSLF project page for more
|
||||
information</a>.
|
||||
</p>
|
||||
<p>
|
||||
We are also working on the XSLF for the PresentationML (2007+) format from the
|
||||
OOXML specification.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>HPSF for OLE 2 Document Properties</title>
|
||||
<p>
|
||||
HPSF is our port of the OLE 2 property set format to pure
|
||||
Java. Property sets are mostly used to store a document's properties
|
||||
(title, author, date of last modification etc.), but they can be used
|
||||
for application-specific purposes as well.
|
||||
</p>
|
||||
<p>
|
||||
HPSF supports both reading and writing of properties.
|
||||
</p>
|
||||
<p>
|
||||
Please see <a href="./hpsf/index.html">the HPSF project
|
||||
page</a> for more information.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>HDGF and XDGF for Visio Documents</title>
|
||||
<p>
|
||||
HDGF is our port of the Microsoft Visio 97(-2003) file format to pure
|
||||
Java. It currently only supports reading at a very low level, and
|
||||
simple text extraction. Please see <a
|
||||
href="./diagram/index.html">the HDGF / Diagram project page for more
|
||||
information</a>.
|
||||
</p>
|
||||
<p>
|
||||
XDGF is our port of the Microsoft Visio XML (.vsdx) file format to pure
|
||||
Java. It has slightly more support than HDGF. Please see <a
|
||||
href="./diagram/index.html">the XDGF / Diagram project page for more
|
||||
information</a>.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>HPBF for Publisher Documents</title>
|
||||
<p>
|
||||
HPBF is our port of the Microsoft Publisher 98(-2007) file format to pure
|
||||
Java. It currently only supports reading at a low level for around
|
||||
half of the file parts, and simple text extraction. Please see <a
|
||||
href="./hpbf/index.html">the HPBF project page for more
|
||||
information</a>.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>HMEF for TNEF (winmail.dat) Outlook Attachments</title>
|
||||
<p>
|
||||
HMEF is our port of the Microsoft TNEF (Transport Neutral Encoding
|
||||
Format) file format to pure Java. TNEF is sometimes used by Outlook
|
||||
for encoding the message, and will typically come through as
|
||||
winmail.dat. HMEF currently only supports reading at a low level, but
|
||||
we hope to add text and attachment extraction. Please see <a
|
||||
href="./hmef/index.html">the HMEF project page for more
|
||||
information</a>.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>HSMF for Outlook Messages</title>
|
||||
<p>
|
||||
HSMF is our port of the Microsoft Outlook message file format to pure
|
||||
Java. It currently only supports some of the textual content of MSG files, and
|
||||
some attachments. Further support and documentation is coming in slowly.
|
||||
For now, users are advised to consult the unit tests for example use.
|
||||
Please see <a href="./hsmf/index.html">the HSMF project page for more
|
||||
information</a>.
|
||||
</p>
|
||||
<p>
|
||||
Microsoft has recently added the Outlook file format to its OSP. More information
|
||||
is now available, making this API easier to implement.
|
||||
</p>
|
||||
</section>
|
||||
</section>
|
||||
<section id="components"><title>Component Map</title>
|
||||
<p>
|
||||
The Apache POI distribution consists of support for many document file formats. This support is provided
|
||||
in several Jar files. Not all of the Jars are needed for every format. The following tables
|
||||
show the relationships between POI components, Maven repository tags, and the project's Jar files.
|
||||
</p>
|
||||
<table>
|
||||
<tr>
|
||||
<th>Component</th>
|
||||
<th>Application type</th>
|
||||
<th>Maven artifactId</th>
|
||||
<th>Notes</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><a href="./poifs/index.html">POIFS</a></td>
|
||||
<td>OLE2 Filesystem</td>
|
||||
<td><em>poi</em></td>
|
||||
<td>Required to work with OLE2 / POIFS based files</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><a href="./hpsf/index.html">HPSF</a></td>
|
||||
<td>OLE2 Property Sets</td>
|
||||
<td><em>poi</em></td>
|
||||
<td> </td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><a href="site:spreadsheet">HSSF</a></td>
|
||||
<td>Excel XLS</td>
|
||||
<td><em>poi</em></td>
|
||||
<td>For HSSF only, if common SS is needed see below</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><a href="site:slideshow">HSLF</a></td>
|
||||
<td>PowerPoint PPT</td>
|
||||
<td><em>poi-scratchpad</em></td>
|
||||
<td> </td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><a href="site:document">HWPF</a></td>
|
||||
<td>Word DOC</td>
|
||||
<td><em>poi-scratchpad</em></td>
|
||||
<td> </td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><a href="./diagram/index.html">HDGF</a></td>
|
||||
<td>Visio VSD</td>
|
||||
<td><em>poi-scratchpad</em></td>
|
||||
<td> </td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><a href="./hpbf/index.html">HPBF</a></td>
|
||||
<td>Publisher PUB</td>
|
||||
<td><em>poi-scratchpad</em></td>
|
||||
<td> </td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><a href="./hsmf/index.html">HSMF</a></td>
|
||||
<td>Outlook MSG</td>
|
||||
<td><em>poi-scratchpad</em></td>
|
||||
<td> </td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>DDF</td>
|
||||
<td>Escher common drawings</td>
|
||||
<td><em>poi</em></td>
|
||||
<td> </td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>HWMF</td>
|
||||
<td>WMF drawings</td>
|
||||
<td><em>poi-scratchpad</em></td>
|
||||
<td> </td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><a href="./oxml4j/index.html">OpenXML4J</a></td>
|
||||
<td>OOXML</td>
|
||||
<td><em>poi-ooxml</em> plus either <em>poi-ooxml-lite</em> or<br/>
|
||||
<em>poi-ooxml-full</em></td>
|
||||
<td>See notes below for differences between these options</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><a href="site:spreadsheet">XSSF</a></td>
|
||||
<td>Excel XLSX</td>
|
||||
<td><em>poi-ooxml</em></td>
|
||||
<td> </td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><a href="site:slideshow">XSLF</a></td>
|
||||
<td>PowerPoint PPTX</td>
|
||||
<td><em>poi-ooxml</em></td>
|
||||
<td> </td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><a href="site:document">XWPF</a></td>
|
||||
<td>Word DOCX</td>
|
||||
<td><em>poi-ooxml</em></td>
|
||||
<td> </td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><a href="./diagram/index.html">XDGF</a></td>
|
||||
<td>Visio VSDX</td>
|
||||
<td><em>poi-ooxml</em></td>
|
||||
<td> </td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><a href="./slideshow/index.html">Common SL</a></td>
|
||||
<td>PowerPoint PPT and PPTX</td>
|
||||
<td><em>poi-scratchpad</em> and <em>poi-ooxml</em></td>
|
||||
<td>SL code is in the core POI jar, but implementations are in poi-scratchpad
|
||||
and poi-ooxml.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><a href="site:spreadsheet">Common SS</a></td>
|
||||
<td>Excel XLS and XLSX</td>
|
||||
<td><em>poi-ooxml</em></td>
|
||||
<td>WorkbookFactory and friends all require poi-ooxml, not just core poi</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<p><br /></p>
|
||||
|
||||
<p>
|
||||
This table maps each artifact to its jar file name. "version-yyyymmdd" is
|
||||
the POI version stamp. You can see what the latest stamp is on the
|
||||
<a href="site:download">downloads page</a>.
|
||||
</p>
|
||||
<table>
|
||||
<tr>
|
||||
<th>Maven artifactId</th>
|
||||
<th>Prerequisites</th>
|
||||
<th>JAR</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>poi</td>
|
||||
<td><a href="https://search.maven.org/#artifactdetails|org.apache.logging.log4j|log4j-api|2.24.3|jar">log4j 2.x</a>,
|
||||
<a href="https://search.maven.org/#artifactdetails|commons-codec|commons-codec|1.17.1|jar">commons-codec</a>,
|
||||
<a href="https://search.maven.org/#artifactdetails|org.apache.commons|commons-collections4|4.4|jar">commons-collections</a>,
|
||||
<a href="https://search.maven.org/#artifactdetails|org.apache.commons|commons-math3|3.6.1|jar">commons-math3</a>,
|
||||
<a href="https://search.maven.org/#artifactdetails|commons-io|commons-io|2.19.0|jar">commons-io</a>
|
||||
</td>
|
||||
<td>poi-version-yyyymmdd.jar</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>poi-scratchpad</td>
|
||||
<td><a href="https://search.maven.org/#search|gav|1|g:org.apache.poi AND a:poi">poi</a></td>
|
||||
<td>poi-scratchpad-version-yyyymmdd.jar</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>poi-ooxml</td>
|
||||
<td><a href="https://search.maven.org/#search|gav|1|g:org.apache.poi AND a:poi">poi</a>,
|
||||
<a href="https://search.maven.org/#search|gav|1|g:org.apache.poi AND a:poi-ooxml-lite">poi-ooxml-lite</a>,
|
||||
<a href="https://search.maven.org/#artifactdetails|org.apache.commons|commons-compress|1.23.0|jar">commons-compress</a>,
|
||||
<a href="https://search.maven.org/#artifactdetails|com.zaxxer|SparseBitSet|1.2|jar">SparseBitSet</a><br/>
|
||||
For SVG support:
|
||||
<a href="https://search.maven.org/#search|gav|1|g:org.apache.xmlgraphics AND a:batik-all">batik-all</a>,
|
||||
<a href="https://search.maven.org/#search|gav|1|g:xml-apis AND a:xml-apis-ext">xml-apis-ext</a>,
|
||||
<a href="https://search.maven.org/#search|gav|1|g:org.apache.xmlgraphics AND a:xmlgraphics-commons">xmlgraphics-commons</a><br/>
|
||||
For PDF support:
|
||||
<a href="https://search.maven.org/#search|gav|1|g:org.apache.pdfbox AND a:pdfbox">pdfbox</a>,
|
||||
<a href="https://search.maven.org/#search|gav|1|g:org.apache.pdfbox AND a:fontbox">fontbox</a>,
|
||||
<a href="https://search.maven.org/#search|gav|1|g:de.rototor.pdfbox AND a:graphics2d">rototor graphics2d</a>
|
||||
</td>
|
||||
<td>poi-ooxml-version-yyyymmdd.jar</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>poi-ooxml-lite</td>
|
||||
<td><a href="https://search.maven.org/#artifactdetails|org.apache.xmlbeans|xmlbeans|5.3.0|jar">xmlbeans</a></td>
|
||||
<td>poi-ooxml-lite-version-yyyymmdd.jar</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>poi-examples</td>
|
||||
<td><a href="https://search.maven.org/#search|gav|1|g:org.apache.poi AND a:poi">poi</a>,
|
||||
<a href="https://search.maven.org/#search|gav|1|g:org.apache.poi AND a:poi-scratchpad">poi-scratchpad</a>,
|
||||
<a href="https://search.maven.org/#search|gav|1|g:org.apache.poi AND a:poi-ooxml">poi-ooxml</a>
|
||||
</td>
|
||||
<td>poi-examples-version-yyyymmdd.jar</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>poi-ooxml-full (formerly known as ooxml-schemas)</td>
|
||||
<td><a href="https://search.maven.org/#artifactdetails|org.apache.xmlbeans|xmlbeans|5.3.0|jar">xmlbeans</a><br/>
|
||||
For signing:
|
||||
<a href="https://search.maven.org/#artifactdetails|org.bouncycastle|bcpkix-jdk18on|1.81|jar">bcpkix-jdk18on</a>,
|
||||
<a href="https://search.maven.org/#artifactdetails|org.bouncycastle|bcutil-jdk18on|1.81|jar">bcprov-jdk18on</a>,
|
||||
<a href="https://search.maven.org/#artifactdetails|org.apache.santuario|xmlsec|3.0.6|bundle">xmlsec</a>,
|
||||
<a href="https://search.maven.org/#artifactdetails|org.slf4j|slf4j-api|2.0.17|jar">slf4j-api</a>
|
||||
</td>
|
||||
<td>poi-ooxml-full-version-yyyymmdd.jar</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<p> </p>
|
||||
<note>
|
||||
Apache commons-math3 and commons-compress were added as dependencies in POI 4.0.0.<br/>
|
||||
Zaxxer SparseBitSet was added as a dependency in POI 4.1.2.<br/>
|
||||
Apache commons-io was added as a dependency in POI 5.1.0.
|
||||
</note>
|
||||
<p>
|
||||
poi-ooxml requires poi-ooxml-lite. This is a substantially smaller
|
||||
version of the poi-ooxml-full jar (ooxml-schemas-1.4.jar for POI 4.0.0,
|
||||
ooxml-schemas-1.3.jar for POI 3.14 up to POI 3.17,
|
||||
ooxml-schemas-1.1.jar for POI 3.7 up to POI 3.13, ooxml-schemas-1.0.jar
|
||||
for POI 3.5 and 3.6).
|
||||
The larger poi-ooxml-full (formerly, ooxml-schemas) jar is <a href="../help/index.html#faq-N10025">normally</a>
|
||||
only required for features that are not fully implemented in poi-ooxml.
|
||||
There used to also be an ooxml-security jar, which contained
|
||||
all of the classes relating to encryption and signing. POI 5 no longer needs this jar.
|
||||
The equivalent classes are now in poi-ooxml-full and poi-ooxml-lite.
|
||||
This JAR was ooxml-security-1.1.jar for POI 3.14 and POI 4. ooxml-security-1.0.jar
|
||||
was used prior to that.
|
||||
</p>
|
||||
<p>
|
||||
The OOXML jars require a stax implementation, but now that Apache
|
||||
POI requires Java 8, that dependency is provided by the JRE and no additional
|
||||
stax jars are required. The OOXML jars used to require DOM4J, but
|
||||
the code has now been changed to use JAXP and no additional dom4j
|
||||
jars are required. See this <a href="../help/index.html#faq-N1017E">FAQ</a>
|
||||
if you have problems when using a non-Oracle JDK.
|
||||
</p>
|
||||
<p>
|
||||
The ooxml schemas jars are compiled with <a href="https://xmlbeans.apache.org/">Apache XMLBeans</a>.
|
||||
It is recommended that you use the XMLBeans version that was used to build the POI OOXML schemas.
|
||||
It may be possible to use newer XMLBeans jars but there are no guarantees, especially if the XMLBeans version
|
||||
numbers differ a lot.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Examples</title>
|
||||
<p>
|
||||
Small sample programs using the POI API are available in the
|
||||
<a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples">src/examples</a>
|
||||
(<a href="https://svn.apache.org/viewvc/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples">viewvc</a>) directory of the source distribution.
|
||||
</p>
|
||||
<p>
|
||||
All of the examples are included in POI distributions as a poi-examples artifact.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Running POI on other JVM languages</title>
|
||||
<p>
|
||||
POI can be run on most languages that run on the JVM. For code examples,
|
||||
see <a href="poi-jvm-languages.html">Running POI on other JVM languages</a>.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Contributed Software</title>
|
||||
<p>
|
||||
Besides the "official" components outlined above there is some further
|
||||
software distributed with POI. This is called "contributed" software. It
|
||||
is not explicitly recommended or even maintained by the POI team, but
|
||||
it might still be useful to you.
|
||||
</p>
|
||||
<p>
|
||||
See <a href="poi-ruby.html">POI Ruby Bindings</a> and other code in the
|
||||
<a
|
||||
href="https://svn.apache.org/repos/asf/poi/trunk/src/contrib/">poi-contrib module</a>.
|
||||
</p>
|
||||
</section>
|
||||
</body>
|
||||
<footer>
|
||||
<legal>
|
||||
Copyright (c) @year@ The Apache Software Foundation. All rights reserved.
|
||||
<br />
|
||||
Apache POI, POI, Apache, the Apache feather logo, and the Apache
|
||||
POI project logo are trademarks of The Apache Software Foundation.
|
||||
</legal>
|
||||
</footer>
|
||||
</document>
|
||||
290
src/documentation/content/xdocs/components/logging.xml
Normal file
@ -0,0 +1,290 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Apache POI™ - Logging Framework</title>
|
||||
<authors>
|
||||
<person id="DS" name="Dominik Stadler" email="centic@apache.org"/>
|
||||
<person id="MV" name="Marius Volkhart" email="mariusvolkhart@apache.org"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section>
|
||||
<title>Introduction</title>
|
||||
<p>
|
||||
Logging in POI is used primarily as a debugging mechanism, not a normal runtime
|
||||
logging system. Logging at levels noisier than WARN is ONLY for autopsy type debugging, and should
|
||||
NEVER be enabled on a production system.
|
||||
</p>
|
||||
</section>
|
||||
<section>
|
||||
<title>POI 5.1.0 and above</title>
|
||||
<p>
|
||||
Since version 5.1.0 Apache POI uses <a href="https://logging.apache.org/log4j/2.x/">Apache Log4j v2</a> directly.
|
||||
</p>
|
||||
<p>
|
||||
Apache POI only depends on log4j-api and allows choosing which logging framework to use. log4j-core is
|
||||
just one of many options.
|
||||
If you want to continue to use another SLF4J compatible logging framework, you can deploy the
|
||||
<a href="https://logging.apache.org/log4j/log4j-2.2/log4j-to-slf4j/index.html">log4j-to-slf4j</a> jar to
|
||||
facilitate this.
|
||||
</p>
|
||||
<p>
|
||||
POI tries to name loggers after the canonical name of the containing class. For example,
|
||||
<code>org.apache.poi.poifs.filesystem.POIFSFileSystem</code>. Use your logging framework's typical
|
||||
mechanisms for activating and deactivating logging for specific loggers.
|
||||
</p>
|
||||
<p>
|
||||
All loggers are named <code>org.apache.poi.*</code>, so rules applied to <code>org.apache.poi</code>
|
||||
will affect all POI loggers.
|
||||
</p>
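<p>
The effect of a rule on a dotted logger hierarchy can be sketched as below. This is only an illustration of
prefix matching, not Log4j's implementation; the class and method names are hypothetical, and the logger
name is the example given above.
</p>

```java
public class LoggerNameSketch {
    /**
     * Illustrative only: true if a rule declared for rulePrefix would apply to
     * loggerName in a dotted hierarchy (exact match, or a child below the prefix).
     */
    static boolean covers(String rulePrefix, String loggerName) {
        return loggerName.equals(rulePrefix) || loggerName.startsWith(rulePrefix + ".");
    }

    public static void main(String[] args) {
        String logger = "org.apache.poi.poifs.filesystem.POIFSFileSystem";
        System.out.println(covers("org.apache.poi", logger));      // true
        System.out.println(covers("org.apache.poi.hssf", logger)); // false
    }
}
```

<p>
Matching on the prefix plus a trailing dot keeps a rule for one package from leaking into an unrelated
package that merely shares the same leading characters.
</p>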
|
||||
</section>
|
||||
<section>
|
||||
<title>Logging with Log4j 2 Core</title>
|
||||
<p>
|
||||
Capturing POI logs using Log4j 2 Core is as simple as including the
|
||||
<a href="https://logging.apache.org/log4j/2.x/maven-artifacts.html"><code>log4j-core</code></a> JAR in
|
||||
your project. POI also has dependencies on libraries that make use of the SLF4J and Apache Commons
|
||||
Logging APIs. Gather logs from these dependencies by adding the
|
||||
<a href="https://logging.apache.org/log4j/2.x/log4j-jcl/index.html">Commons Logging Bridge</a> and the
|
||||
<a href="https://logging.apache.org/log4j/2.x/log4j-slf4j-impl/index.html">SLF4J Binding</a> to your
|
||||
project.
|
||||
</p>
|
||||
<p>
|
||||
The simplest configuration is to capture all POI logs at the same level as your application. You might
|
||||
want to collect all messages at <code>INFO</code> and higher, and not mind capturing POI messages as well.
|
||||
</p>
|
||||
<source>
|
||||
<Configuration ...>
|
||||
<Loggers>
|
||||
<Root level="INFO">
|
||||
...
|
||||
</Root>
|
||||
</Loggers>
|
||||
</Configuration>
|
||||
</source>
|
||||
|
||||
<p>
|
||||
The recommended configuration is to capture only messages from loggers you opt in to. For example,
|
||||
you might want to capture all messages from <code>com.example.myapplication</code> at <code>INFO</code>
|
||||
but only POI messages at <code>WARN</code> or more severe.
|
||||
</p>
|
||||
<source>
|
||||
<Configuration ...>
|
||||
<Loggers>
|
||||
<Logger name="com.example.myapplication" level="INFO" />
|
||||
<Logger name="org.apache.poi" level="WARN" />
|
||||
|
||||
<Root level="OFF">
|
||||
...
|
||||
</Root>
|
||||
</Loggers>
|
||||
</Configuration>
|
||||
</source>
|
||||
|
||||
<p>Another strategy you may decide to use is to capture all messages except those coming from POI.</p>
|
||||
<source>
|
||||
<Configuration ...>
|
||||
<Loggers>
|
||||
<Logger name="org.apache.poi" level="OFF" />
|
||||
|
||||
<Root level="INFO">
|
||||
...
|
||||
</Root>
|
||||
</Loggers>
|
||||
</Configuration>
|
||||
</source>
|
||||
</section>
|
||||
<section>
|
||||
<title>Log4J SimpleLogger</title>
|
||||
<p>
|
||||
If your main aim is just to get rid of the scary log message from Log4J that says
|
||||
'ERROR StatusLogger Log4j2 could not find a logging implementation.', then one option is to
|
||||
enable the SimpleLogger using a system property.
|
||||
</p>
|
||||
<p>
|
||||
<code>-Dlog4j2.loggerContextFactory=org.apache.logging.log4j.simple.SimpleLoggerContextFactory</code>
|
||||
</p>
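<p>
If you cannot change the JVM command line, the same property can presumably be set from code. The sketch
below assumes it runs before any POI or Log4j class has been initialized; the class and method names are
hypothetical, and only the property name and factory class come from the option above.
</p>

```java
public class SimpleLoggerSetup {
    // Property name and factory class as used in the -D option above.
    static final String FACTORY_PROPERTY = "log4j2.loggerContextFactory";
    static final String SIMPLE_FACTORY =
            "org.apache.logging.log4j.simple.SimpleLoggerContextFactory";

    /** Selects the SimpleLogger factory unless a factory was already chosen. */
    static String selectSimpleLogger() {
        if (System.getProperty(FACTORY_PROPERTY) == null) {
            System.setProperty(FACTORY_PROPERTY, SIMPLE_FACTORY);
        }
        return System.getProperty(FACTORY_PROPERTY);
    }

    public static void main(String[] args) {
        // Assumption: this must happen before the first POI call in the application.
        System.out.println(selectSimpleLogger());
    }
}
```

<p>
Checking for an existing value first means a <code>-D</code> option given on the command line still wins.
</p>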
|
||||
</section>
|
||||
<section>
|
||||
<title>Logging with SLF4J</title>
|
||||
<p>
|
||||
If you want to continue to use another SLF4J compatible logging framework, you can deploy the
|
||||
<a href="https://logging.apache.org/log4j/log4j-2.2/log4j-to-slf4j/index.html">log4j-to-slf4j</a> jar
|
||||
and the intended slf4j-bridges to facilitate this.
|
||||
</p>
|
||||
<p>
|
||||
See <a href="https://www.slf4j.org/">https://www.slf4j.org/</a> for more details about using SLF4J.
|
||||
</p>
|
||||
</section>
|
||||
<section>
|
||||
<title>Logging with Logback</title>
|
||||
<p>
|
||||
Capturing POI logs using Logback requires adding the
|
||||
<a href="https://logging.apache.org/log4j/2.x/log4j-to-slf4j/index.html">Log4j to SLF4J Adapter</a> to
|
||||
your project, along with the standard Logback dependencies. POI also has dependencies on libraries that
|
||||
make use of the SLF4J and Apache Commons Logging APIs. Gather logs from these dependencies by adding the
|
||||
<a href="https://www.slf4j.org/legacy.html#jcl-over-slf4j">Commons Logging Bridge</a> to your project.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
The simplest configuration is to capture all POI logs at the same level as your application. You might
|
||||
want to collect all messages at <code>INFO</code> and higher, and not mind capturing POI messages as well.
|
||||
</p>
|
||||
<source>
|
||||
<configuration ...>
|
||||
<root level="INFO">
|
||||
...
|
||||
</root>
|
||||
</configuration>
|
||||
</source>
|
||||
|
||||
<p>
|
||||
The recommended configuration is to capture only messages from loggers you opt in to. For example,
|
||||
you might want to capture all messages from <code>com.example.myapplication</code> at <code>INFO</code>
|
||||
but only POI messages at <code>WARN</code> or more severe.
|
||||
</p>
|
||||
<source>
|
||||
<configuration ...>
|
||||
<logger name="com.example.myapplication" level="INFO" />
|
||||
<logger name="org.apache.poi" level="WARN" />
|
||||
|
||||
<root level="OFF">
|
||||
...
|
||||
</root>
|
||||
</configuration>
|
||||
</source>
|
||||
|
||||
<p>Another strategy you may decide to use is to capture all messages except those coming from POI.</p>
|
||||
<source>
|
||||
<configuration ...>
|
||||
<logger name="org.apache.poi" level="OFF" />
|
||||
|
||||
<root level="INFO">
|
||||
...
|
||||
</root>
|
||||
</configuration>
|
||||
</source>
|
||||
</section>
|
||||
<section>
|
||||
<title>POI 5.0.0</title>
|
||||
<p>
|
||||
POI 5.0.0 switched to using <a href="https://www.slf4j.org/">SLF4J</a> for logging. If you want
|
||||
to enable logging, please read up on the various SLF4J compatible logging frameworks.
|
||||
<a href="https://logging.apache.org/log4j/2.x/">Apache Log4j v2</a> is a good choice.
|
||||
<a href="https://logback.qos.ch/">Logback</a> is also widely used.
|
||||
</p>
|
||||
</section>
|
||||
<section>
|
||||
<title>Legacy POI Logging Framework (no longer supported in POI 5.0.0 and above)</title>
|
||||
<p>
|
||||
Prior to POI 5.0.0, POI used a custom logging framework that lets you configure where logs are sent.
|
||||
</p>
|
||||
<p>
|
||||
Logging in POI 3 and 4 is used only as a debugging mechanism, not as a normal runtime
|
||||
logging system. Logging at level debug/info is ONLY for debugging, and should
|
||||
NEVER be enabled on a production system.
|
||||
</p>
|
||||
<p>
|
||||
The framework is extensible so that you can send log messages to any logging framework
|
||||
that your application uses.
|
||||
</p>
|
||||
<p>
|
||||
A number of default logging implementations are supported by POI out-of-the-box and can be selected via a
|
||||
system property.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>POI 4.x and before: Enable Legacy POI Logging Framework</title>
|
||||
<p>
|
||||
By default, logging is disabled in POI 3 and 4. Sometimes, it might be useful
|
||||
to enable logging to see some debug messages printed out which can
|
||||
help in analyzing problems.
|
||||
</p>
|
||||
<p>
|
||||
You can select the logging framework by setting the system property <em>org.apache.poi.util.POILogger</em> during application startup or by calling System.setProperty():
|
||||
</p>
|
||||
<source>
|
||||
System.setProperty("org.apache.poi.util.POILogger", "org.apache.poi.util.CommonsLogger" );
|
||||
</source>
|
||||
<p>
|
||||
Note: You need to call <em>setProperty()</em> before any POI functionality is invoked as the logger is only initialized during startup.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>POI 4.x and before: Available Legacy POI Logging Framework implementations</title>
|
||||
<p>
|
||||
The following logger implementations are provided by POI 3 and 4:
|
||||
</p>
|
||||
<table>
|
||||
<tr>
|
||||
<th>Class</th>
|
||||
<th>Type</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>org.apache.poi.util.SystemOutLogger</td>
|
||||
<td>Sends log output to the system console</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>org.apache.poi.util.NullLogger</td>
|
||||
<td>Default logger, does not log anything</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>org.apache.poi.util.CommonsLogger</td>
|
||||
<td>Allows the use of <a href="https://commons.apache.org/proper/commons-logging/">Apache Commons Logging</a> for logging. This can use JDK1.4 logging,
|
||||
log4j, logkit, etc. The log4j dependency was removed in POI 5.0.0, so you will need to include this dependency yourself if you need it.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>org.apache.poi.util.DummyPOILogger</td>
|
||||
<td>Simple logger which will keep all log-lines in memory for later analysis (this class is not in the jar, just in the test source).
|
||||
Used primarily for testing. Note: this may cause a memory leak if used in a production application!</td>
|
||||
</tr>
|
||||
</table>
|
||||
</section>
|
||||
<section><title>POI 4.x and before: Sending logs to a different log framework</title>
|
||||
<p>
|
||||
You can send logs to other logging frameworks by implementing the interface <em>org.apache.poi.util.POILogger</em>.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>POI 4.x and before: Implementation details</title>
|
||||
<p>
|
||||
Every class uses a <code>POILogger</code> to log, and gets it using a static method
|
||||
of the <code>POILogFactory</code>.
|
||||
</p>
|
||||
<p>
|
||||
Each class in POI can log using a <code>POILogger</code>, which is an abstract class.
|
||||
We decided to make our own logging facade because:</p>
|
||||
<ol>
|
||||
<li>we need to log many values, so we put many methods in this class to spare the programmer from writing string concatenations;</li>
|
||||
<li>we need to be able to use POI without any logger package present.</li>
|
||||
</ol>
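<p>
The two reasons above can be illustrated with a minimal facade sketch. All class and method names below are
hypothetical and are not POI's actual <code>POILogger</code> API; the sketch only shows how a varargs method
can spare callers from string concatenation and how a do-nothing default avoids any logging dependency.
</p>

```java
import java.util.ArrayList;
import java.util.List;

public class FacadeSketch {
    /** Hypothetical stand-in for a logging facade; not POI's POILogger. */
    abstract static class SketchLogger {
        static final int DEBUG = 1;
        static final int WARN = 5;

        abstract boolean check(int level);
        abstract void emit(int level, String message);

        /** Joins the parts only when the level is enabled, so callers never concatenate. */
        void log(int level, Object... parts) {
            if (!check(level)) {
                return;
            }
            StringBuilder sb = new StringBuilder();
            for (Object part : parts) {
                sb.append(part);
            }
            emit(level, sb.toString());
        }
    }

    /** Default that needs no logging library: every level is disabled. */
    static class NoOpLogger extends SketchLogger {
        boolean check(int level) { return false; }
        void emit(int level, String message) { }
    }

    /** Keeps enabled messages in memory, similar in spirit to a test logger. */
    static class MemoryLogger extends SketchLogger {
        final List<String> lines = new ArrayList<>();
        boolean check(int level) { return level >= WARN; }
        void emit(int level, String message) { lines.add(message); }
    }

    public static void main(String[] args) {
        MemoryLogger log = new MemoryLogger();
        log.log(SketchLogger.DEBUG, "skipped ", 1);
        log.log(SketchLogger.WARN, "row ", 7, " has ", 43, " cells");
        System.out.println(log.lines); // [row 7 has 43 cells]
    }
}
```

<p>
Because the level check happens before the parts are joined, a disabled logger costs little more than the
method call itself.
</p>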
|
||||
</section>
|
||||
</body>
|
||||
<footer>
|
||||
<legal>
|
||||
Copyright (c) @year@ The Apache Software Foundation. All rights reserved.
|
||||
<br />
|
||||
Apache POI, POI, Apache, the Apache feather logo, and the Apache
|
||||
POI project logo are trademarks of The Apache Software Foundation.
|
||||
</legal>
|
||||
</footer>
|
||||
</document>
|
||||
45
src/documentation/content/xdocs/components/oxml4j/index.xml
Normal file
@ -0,0 +1,45 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>POI-OpenXML4J - Java API To Access Office Open XML documents</title>
|
||||
<subtitle>Overview</subtitle>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section>
|
||||
<title>Overview</title>
|
||||
<p>OpenXML4J is the POI Project's pure Java implementation of the Open Packaging Conventions (OPC) defined in
|
||||
<a href="https://ecma-international.org/publications-and-standards/standards/ecma-376/">ECMA-376</a>.</p>
|
||||
<p>Every OpenXML file comprises a collection of byte streams called parts, combined into a container called a package.
|
||||
POI OpenXML4J provides a physical implementation of the OPC that uses the Zip file format.</p>
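<p>
Because the physical container is a Zip file, its entries can be inspected with nothing but the JDK. The
following is a rough sketch using <code>java.util.zip</code> rather than the OpenXML4J API; the class name,
method names and placeholder content are assumptions, and the container it builds is not a valid OOXML document.
</p>

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class OpcZipSketch {
    /** Builds a tiny Zip container in memory; entry names and content are illustrative only. */
    static byte[] buildContainer() throws IOException {
        String[] names = {"[Content_Types].xml", "_rels/.rels", "xl/workbook.xml"};
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (String name : names) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write("placeholder".getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        }
        return bytes.toByteArray();
    }

    /** Lists the Zip entry names of a container in storage order. */
    static List<String> listEntries(byte[] container) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(container))) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                names.add(e.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(listEntries(buildContainer()));
    }
}
```

<p>
Running the same listing against a real .xlsx or .docx file shows the byte streams that OpenXML4J exposes as parts.
</p>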
|
||||
</section>
|
||||
<section>
|
||||
<title>History</title>
|
||||
<p>OpenXML4J was originally developed by
|
||||
<a href="https://web.archive.org/web/20090611063015/https://www.openxml4j.org/">openxml4j.org</a>,
|
||||
and was contributed to Apache POI in 2008. The original code is available at
|
||||
<a href="https://sourceforge.net/projects/openxml4j/">https://sourceforge.net/projects/openxml4j/</a>.
|
||||
Thanks go to Julien Chable for his support and guidance.</p>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
351
src/documentation/content/xdocs/components/poi-jvm-languages.xml
Normal file
@ -0,0 +1,351 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>JVM languages</title>
|
||||
<authors>
|
||||
<person id="JO" name="Javen O'Neal" email="onealj@apache.org"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section><title>Intro</title>
|
||||
<p>
|
||||
Apache POI can be used with any
|
||||
<a href="https://en.wikipedia.org/wiki/List_of_JVM_languages">JVM language</a>
|
||||
that can import Java jar files, such as Jython, Groovy, Scala, Kotlin, and JRuby.
|
||||
</p>
|
||||
<ul>
|
||||
<li><a href="#Jython+example">Jython</a></li>
|
||||
<li><a href="#Scala+example">Scala</a></li>
|
||||
<li><a href="#Groovy+example">Groovy</a></li>
|
||||
<li><a href="#Clojure+example">Clojure</a></li>
|
||||
</ul>
|
||||
</section>
|
||||
|
||||
|
||||
<section><title>Tested Environments</title>
|
||||
<ul>
|
||||
<li><a href="https://www.jython.org/">Jython</a> 2.5+ (older versions probably work, but are untested)</li>
|
||||
<li><a href="https://www.scala-lang.org/">Scala</a> 2.x</li>
|
||||
<li><a href="https://groovy-lang.org/">Groovy</a> 2.4 (anything from 1.6 onwards ought to work, but only the latest 2.4 releases have been tested by us)</li>
|
||||
<li><a href="https://clojure.org/">Clojure</a> 1.5.1+</li>
|
||||
</ul>
|
||||
<p>If you use POI in a different language (Kotlin, JRuby, ...) and would like to share a <em>Hello POI!</em> example,
|
||||
please share it.</p>
|
||||
<p>Please <a href="site:mailinglists">let us know</a> if you use POI in an environment not listed here.</p>
|
||||
</section>
|
||||
|
||||
<!-- FIXME: Need to make each language section expandable/collapseable so that users can compare their language to Java on one screen. See https://jsfiddle.net/eJX8z/ for an example implementation. -->
|
||||
<section><title>Java code</title>
|
||||
<section><title>POILanguageExample.java</title>
|
||||
<source> <!-- lang="java" -->
|
||||
// include poi-{version}-{yyyymmdd}.jar, poi-ooxml-{version}-{yyyymmdd}.jar,
|
||||
// and poi-ooxml-lite-{version}-{yyyymmdd}.jar on Java classpath
|
||||
|
||||
// Import the POI classes
|
||||
import java.io.File;
|
||||
import java.io.FileOutputStream;
|
||||
import java.io.OutputStream;
|
||||
import org.apache.poi.ss.usermodel.Cell;
|
||||
import org.apache.poi.ss.usermodel.Row;
|
||||
import org.apache.poi.ss.usermodel.Sheet;
|
||||
import org.apache.poi.ss.usermodel.Workbook;
|
||||
import org.apache.poi.ss.usermodel.WorkbookFactory;
|
||||
import org.apache.poi.ss.usermodel.DataFormatter;
|
||||
|
||||
// Read the contents of the workbook
|
||||
File f = new File("SampleSS.xlsx");
|
||||
Workbook wb = WorkbookFactory.create(f);
|
||||
DataFormatter formatter = new DataFormatter();
|
||||
int i = 1;
|
||||
int numberOfSheets = wb.getNumberOfSheets();
|
||||
for ( Sheet sheet : wb ) {
    System.out.println("Sheet " + i + " of " + numberOfSheets + ": " + sheet.getSheetName());
    for ( Row row : sheet ) {
        System.out.println("\tRow " + row.getRowNum());
        for ( Cell cell : row ) {
            System.out.println("\t\t"+ cell.getAddress().formatAsString() + ": " + formatter.formatCellValue(cell));
        }
    }
    i++; // advance the 1-based sheet counter
}
|
||||
|
||||
// Modify the workbook
|
||||
Sheet sh = wb.createSheet("new sheet");
|
||||
Row row = sh.createRow(7);
|
||||
Cell cell = row.createCell(42);
|
||||
cell.setAsActiveCell();
|
||||
cell.setCellValue("The answer to life, the universe, and everything");
|
||||
|
||||
// Save and close the workbook
|
||||
OutputStream fos = new FileOutputStream("SampleSS-updated.xlsx");
|
||||
wb.write(fos);
|
||||
fos.close();
wb.close();
|
||||
</source>
|
||||
</section> <!-- POILanguageExample.java -->
|
||||
</section> <!-- Java code -->
|
||||
|
||||
<section><title>Jython example</title>
|
||||
<source> <!-- lang="python" -->
|
||||
# Add <a href="site:components">poi jars</a> onto the python classpath or add them at run time
|
||||
import sys
|
||||
for jar in ('poi', 'poi-ooxml', 'poi-ooxml-lite'):
    sys.path.append('/path/to/%s-5.4.1.jar' % jar)
|
||||
|
||||
from java.io import File, FileOutputStream
|
||||
from contextlib import closing
|
||||
|
||||
# Import the POI classes
|
||||
from org.apache.poi.ss.usermodel import <a href="../apidocs/dev/org/apache/poi/ss/usermodel/WorkbookFactory.html">WorkbookFactory</a>, <a href="../apidocs/dev/org/apache/poi/ss/usermodel/DataFormatter.html">DataFormatter</a>
|
||||
|
||||
# Read the contents of the workbook
|
||||
wb = WorkbookFactory.create(File('<a href="https://svn.apache.org/viewvc/poi/trunk/test-data/spreadsheet/SampleSS.xlsx">SampleSS.xlsx</a>'))
|
||||
formatter = DataFormatter()
|
||||
for i, sheet in enumerate(wb, start=1):
|
||||
print('Sheet %d of %d: %s'.format(i, wb.numberOfSheets, sheet.sheetName))
|
||||
    for row in sheet:
|
||||
        print('\tRow %i' % row.rowNum)
|
||||
        for cell in row:
|
||||
            print('\t\t%s: %s' % (cell.address, formatter.formatCellValue(cell)))
|
||||
|
||||
# Modify the workbook
|
||||
sh = wb.createSheet('new sheet')
|
||||
row = sh.createRow(7)
|
||||
cell = row.createCell(42)
|
||||
cell.setAsActiveCell()
|
||||
cell.cellValue = 'The answer to life, the universe, and everything'
|
||||
|
||||
# Save and close the workbook
|
||||
with closing(FileOutputStream('SampleSS-updated.xlsx')) as fos:
|
||||
    wb.write(fos)
|
||||
wb.close()
|
||||
</source>
|
||||
<p>There are several websites that have examples of using Apache POI in Jython projects:
|
||||
<a href="https://wiki.python.org/jython/PoiExample">python.org</a>,
|
||||
<a href="https://www.jython.org/jythonbook/en/1.0/appendixB.html#working-with-spreadsheets">jython.org</a>, and many others.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>Scala example</title>
|
||||
<section><title>build.sbt</title>
|
||||
<source> <!-- lang="scala" -->
|
||||
// Add the POI core and OOXML support dependencies into your build.sbt
|
||||
libraryDependencies ++= Seq(
|
||||
"org.apache.poi" % "poi" % "5.4.1",
|
||||
"org.apache.poi" % "poi-ooxml" % "5.4.1",
|
||||
"org.apache.poi" % "poi-ooxml-lite" % "5.4.1"
|
||||
)
|
||||
</source>
|
||||
</section>
|
||||
<section><title>XSSFMain.scala</title>
|
||||
<source> <!-- lang="scala" -->
|
||||
// Import the required classes
|
||||
import org.apache.poi.ss.usermodel.{<a href="../apidocs/dev/org/apache/poi/ss/usermodel/WorkbookFactory.html">WorkbookFactory</a>, <a href="../apidocs/dev/org/apache/poi/ss/usermodel/DataFormatter.html">DataFormatter</a>}
|
||||
import java.io.{File, FileOutputStream}
|
||||
|
||||
object XSSFMain extends App {
|
||||
|
||||
// Automatically convert Java collections to Scala equivalents
|
||||
import scala.collection.JavaConversions._
|
||||
|
||||
// Read the contents of the workbook
|
||||
val workbook = WorkbookFactory.create(new File("<a href="https://svn.apache.org/viewvc/poi/trunk/test-data/spreadsheet/SampleSS.xlsx">SampleSS.xlsx</a>"))
|
||||
val formatter = new DataFormatter()
|
||||
for {
|
||||
// Iterate and print the sheets
|
||||
(sheet, i) <- workbook.zipWithIndex
|
||||
_ = println(s"Sheet ${i + 1} of ${workbook.getNumberOfSheets}: ${sheet.getSheetName}")
|
||||
|
||||
// Iterate and print the rows
|
||||
row <- sheet
|
||||
_ = println(s"\tRow ${row.getRowNum}")
|
||||
|
||||
// Iterate and print the cells
|
||||
cell <- row
|
||||
} {
|
||||
println(s"\t\t${cell.getAddress.formatAsString}: ${formatter.formatCellValue(cell)}")
|
||||
}
|
||||
|
||||
// Add a sheet to the workbook
|
||||
val sheet = workbook.createSheet("new sheet")
|
||||
val row = sheet.createRow(7)
|
||||
val cell = row.createCell(42)
|
||||
cell.setAsActiveCell()
|
||||
cell.setCellValue("The answer to life, the universe, and everything")
|
||||
|
||||
// Save the updated workbook as a new file
|
||||
val fos = new FileOutputStream("SampleSS-updated.xlsx")
|
||||
workbook.write(fos)
fos.close()
|
||||
workbook.close()
|
||||
}
|
||||
</source>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
<section><title>Groovy example</title>
|
||||
<section><title>build.gradle</title>
|
||||
<source> <!-- lang="groovy" -->
|
||||
// Add the POI core and OOXML support dependencies into your gradle build,
|
||||
// along with all of Groovy so it can run as a standalone script
|
||||
repositories {
|
||||
mavenCentral()
|
||||
}
|
||||
dependencies {
|
||||
runtimeOnly 'org.codehaus.groovy:groovy-all:2.5.15'
|
||||
runtimeOnly 'org.apache.poi:poi:5.4.1'
|
||||
runtimeOnly 'org.apache.poi:poi-ooxml:5.4.1'
|
||||
}
|
||||
</source>
|
||||
</section>
|
||||
<section><title>SpreadSheetDemo.groovy</title>
|
||||
<source> <!-- lang="groovy" -->
|
||||
import org.apache.poi.ss.usermodel.*
|
||||
import org.apache.poi.ss.util.*
|
||||
import java.io.File
|
||||
|
||||
if (args.length == 0) {
|
||||
println "Use:"
|
||||
println " SpreadSheetDemo <excel-file> [output-file]"
|
||||
return 1
|
||||
}
|
||||
|
||||
File f = new File(args[0])
|
||||
DataFormatter formatter = new DataFormatter()
|
||||
WorkbookFactory.create(f,null,true).withCloseable { workbook ->
|
||||
println "Has ${workbook.getNumberOfSheets()} sheets"
|
||||
|
||||
// Dump the contents of the spreadsheet
|
||||
(0..<workbook.getNumberOfSheets()).each { sheetNum ->
|
||||
println "Sheet ${sheetNum} is called ${workbook.getSheetName(sheetNum)}"
|
||||
|
||||
def sheet = workbook.getSheetAt(sheetNum)
|
||||
sheet.each { row ->
|
||||
def nonEmptyCells = row.grep { c -> c.getCellType() != CellType.BLANK }
|
||||
println " Row ${row.getRowNum()} has ${nonEmptyCells.size()} non-empty cells:"
|
||||
nonEmptyCells.each { c ->
|
||||
def cRef = [c] as CellReference
|
||||
println " * ${cRef.formatAsString()} = ${formatter.formatCellValue(c)}"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Add two new sheets and populate
|
||||
CellStyle headerStyle = makeHeaderStyle(workbook)
|
||||
Sheet ns1 = workbook.createSheet("Generated 1")
|
||||
exportHeader(ns1, headerStyle, null, ["ID","Title","Num"] as String[])
|
||||
ns1.createRow(1).createCell(0).setCellValue("TODO - Populate with data")
|
||||
|
||||
Sheet ns2 = workbook.createSheet("Generated 2")
|
||||
exportHeader(ns2, headerStyle, "This is a demo sheet",
|
||||
["ID","Title","Date","Author","Num"] as String[])
|
||||
ns2.createRow(2).createCell(0).setCellValue(1)
|
||||
ns2.createRow(3).createCell(0).setCellValue(4)
|
||||
ns2.createRow(4).createCell(0).setCellValue(1)
|
||||
|
||||
// Save
|
||||
File output = File.createTempFile("output-", (f.getName() =~ /(\.\w+$)/)[0][0])
|
||||
output.withOutputStream { os -> workbook.write(os) }
|
||||
println "Saved as ${output}"
|
||||
}
|
||||
|
||||
CellStyle makeHeaderStyle(Workbook wb) {
|
||||
int HEADER_HEIGHT = 18
|
||||
CellStyle style = wb.createCellStyle()
|
||||
|
||||
style.setFillForegroundColor(IndexedColors.AQUA.getIndex())
|
||||
style.setFillPattern(FillPatternType.SOLID_FOREGROUND)
|
||||
|
||||
Font font = wb.createFont()
|
||||
font.setFontHeightInPoints((short)HEADER_HEIGHT)
|
||||
font.setBold(true)
|
||||
style.setFont(font)
|
||||
|
||||
return style
|
||||
}
|
||||
void exportHeader(Sheet s, CellStyle headerStyle, String info, String[] headers) {
|
||||
Row r
|
||||
int rn = 0
|
||||
int HEADER_HEIGHT = 18
|
||||
// Do they want an info row at the top?
|
||||
if (info != null && !info.isEmpty()) {
|
||||
r = s.createRow(rn)
|
||||
r.setHeightInPoints(HEADER_HEIGHT+1)
|
||||
rn++
|
||||
|
||||
Cell c = r.createCell(0)
|
||||
c.setCellValue(info)
|
||||
c.setCellStyle(headerStyle)
|
||||
s.addMergedRegion(new CellRangeAddress(0,0,0,headers.length-1))
|
||||
}
|
||||
// Create the header row, of the right size
|
||||
r = s.createRow(rn)
|
||||
r.setHeightInPoints(HEADER_HEIGHT+1)
|
||||
// Add the column headings
|
||||
headers.eachWithIndex { col, idx ->
|
||||
Cell c = r.createCell(idx)
|
||||
c.setCellValue(col)
|
||||
c.setCellStyle(headerStyle)
|
||||
s.autoSizeColumn(idx)
|
||||
}
|
||||
// Make all the columns filterable
|
||||
s.setAutoFilter(new CellRangeAddress(rn, rn, 0, headers.length-1))
|
||||
}
|
||||
</source>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
<section><title>Clojure example</title>
|
||||
<section><title>SpreadSheetDemo.clj</title>
|
||||
<!-- code example provided by Blake Watson -->
|
||||
<source> <!-- lang="clojure" -->
|
||||
(ns poi.core
|
||||
(:gen-class)
|
||||
(:use [clojure.java.io :only [input-stream]])
|
||||
(:import [org.apache.poi.ss.usermodel WorkbookFactory DataFormatter]))
|
||||
|
||||
|
||||
(defn sheets [wb] (map #(.getSheetAt wb %1) (range 0 (.getNumberOfSheets wb))))
|
||||
|
||||
(defn print-all [wb]
|
||||
(let [df (DataFormatter.)]
|
||||
(doseq [sheet (sheets wb)]
|
||||
(doseq [row (seq sheet)]
|
||||
(doseq [cell (seq row)]
|
||||
(println (.formatAsString (.getAddress cell)) ": " (.formatCellValue df cell)))))))
|
||||
|
||||
(defn -main [& args]
|
||||
(when-let [name (first args)]
|
||||
(let [wb (WorkbookFactory/create (input-stream name))]
|
||||
(print-all wb))))
|
||||
</source>
|
||||
</section>
|
||||
</section>
|
||||
</body>
|
||||
<footer>
|
||||
<legal>
|
||||
Copyright (c) @year@ The Apache Software Foundation. All rights reserved.
|
||||
<br />
|
||||
Apache POI, POI, Apache, the Apache feather logo, and the Apache
|
||||
POI project logo are trademarks of The Apache Software Foundation.
|
||||
</legal>
|
||||
</footer>
|
||||
</document>
|
||||
152
src/documentation/content/xdocs/components/poi-ruby.xml
Normal file
@ -0,0 +1,152 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>POI Ruby Bindings</title>
|
||||
<authors>
|
||||
<person id="AS" name="Avik Sengupta" email="avik@apache.org"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section><title>Intro</title>
|
||||
<p>The POI library can now be compiled as a Ruby extension, allowing the API to be called from
|
||||
Ruby language programs. Ruby users can therefore read and write OLE2 documents, such as Excel files
|
||||
with ease.
|
||||
</p>
|
||||
<p>The bindings are generated by compiling POI with <a href="https://gcc.gnu.org/java/">gcj</a>,
|
||||
and generating the Ruby wrapper using <a href="https://www.swig.org">SWIG</a>. The aim is to keep
|
||||
the POI API as-is. However, where Java standard library objects are used, an effort is made to transform them smoothly
|
||||
into Ruby objects. Therefore, where the POI API takes an OutputStream, you can pass an IO object. Where POI works with a
|
||||
java.util.Date or java.util.Calendar object, you can work with a Ruby Time object. </p>
|
||||
</section>
|
||||
|
||||
|
||||
<section><title>Getting Started</title>
|
||||
<section><title>Pre-Requisites</title>
|
||||
<p>The bindings have been developed with GCC 3.4.3 and Ruby 1.8.2. You are unlikely to get correct results with
|
||||
versions of GCC prior to 3.4 or versions of Ruby prior to 1.8. To compile the Ruby extension, you must have
|
||||
GCC (compiled with Java language support), Ruby development headers, and SWIG. To run, you will need Ruby (obviously!) and
|
||||
<em>libgcj</em>, presumably from the same version of GCC with which you compiled.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Subversion</title>
|
||||
<p>
|
||||
The POI-Ruby module sits under the POI <a href="https://svn.apache.org/repos/asf/poi/trunk/src/contrib/poi-ruby/">Subversion</a>
|
||||
<a href="https://svn.apache.org/viewvc/poi/trunk/src/contrib/poi-ruby/">(viewvc)</a>. Running <em>make</em>
|
||||
inside that directory will create a loadable ruby extension <em>poi4r.so</em> in the release subdirectory. Tests
|
||||
are in the <em>tests/</em> subdirectory, and should be run from the <em>poi-ruby</em> directory. Please read the tests to figure out the usage.
|
||||
</p>
|
||||
<p>Note that the makefile, though designed to work across Linux/OS X/Cygwin, has been tested only on Linux.
|
||||
There are likely to be issues on other platforms; fixes are gratefully accepted! </p>
|
||||
</section>
|
||||
<section><title>Binary</title>
|
||||
<p>A version of poi4r.so is available <a href="https://www.apache.org/~avik/dist/poi4r.so">here</a> (broken link). It has been compiled on a Linux box
|
||||
with GCC 3.4.3 and Ruby 1.8.2. It dynamically links to libgcj. No guarantees about working on any other box. </p>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
|
||||
|
||||
|
||||
<section>
|
||||
<title>Usage</title>
|
||||
<p>The following Ruby code shows some of the things you can do with POI in Ruby:</p>
|
||||
<source>
|
||||
h=Poi4r::HSSFWorkbook.new
|
||||
#Test Sheet Creation
|
||||
s=h.createSheet("Sheet1")
|
||||
|
||||
#Test setting cell values
|
||||
s=h.getSheetAt(0)
|
||||
r=s.createRow(0)
|
||||
c=r.createCell(0)
|
||||
c.setCellValue(1.5)
|
||||
|
||||
c=r.createCell(1)
|
||||
c.setCellValue("Ruby")
|
||||
|
||||
#Test styles
|
||||
st = h.createCellStyle()
|
||||
c=r.createCell(2)
|
||||
st.setAlignment(Poi4r::HSSFCellStyle.ALIGN_CENTER)
|
||||
c.setCellStyle(st)
|
||||
c.setCellValue("centr'd")
|
||||
|
||||
#Date handling
|
||||
c=r.createCell(3)
|
||||
t1=Time.now
|
||||
c.setCellValue(t1)
|
||||
t2= c.getDateCellValue().gmtime
|
||||
|
||||
st=h.createCellStyle();
|
||||
st.setDataFormat(Poi4r::HSSFDataFormat.getBuiltinFormat("m/d/yy h:mm"))
|
||||
c.setCellStyle(st)
|
||||
|
||||
#Formulas
|
||||
c=r.createCell(4)
|
||||
c.setCellFormula("A1*2")
|
||||
c.getCellFormula()
|
||||
|
||||
#Writing
|
||||
h.write(File.new("test.xls","w"))
|
||||
</source>
|
||||
<p> The <em>tc_base_tests.rb</em> file in the <em>tests</em> sub directory of the source distribution
|
||||
contains examples of simple uses of the API. The <a href="spreadsheet/quick-guide.html">quick guide</a> is the best
|
||||
place to learn HSSF API use. (Note however that none of the Drawing features are implemented in the Ruby binding.)
|
||||
See also the <a href="site:javadocs">POI API documentation</a> for more details.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Future Directions</title>
|
||||
<section><title>TODOs</title>
|
||||
<ul>
|
||||
<li>Implement support for reading Excel files (easy)</li>
|
||||
<li>Expose POIFS API to read raw OLE2 files from Ruby</li>
|
||||
<li>Expose HPSF API to read property streams </li>
|
||||
<li>Tests... Tests... Tests...</li>
|
||||
</ul>
|
||||
</section>
|
||||
<section><title>Limitations</title>
|
||||
<ul>
|
||||
<li>Check operations on 64-bit machines - Java primitive types are fixed irrespective of machine type, unlike C/C++ types. The wrapping code
|
||||
that converts C/C++ primitive types to/from Java types is making assumptions on type sizes that MAY be incorrect on wide architectures. </li>
|
||||
<li>The current implementation is based on the POI 2.0 release. The 2.5 release adds support for Excel drawing primitives, and
|
||||
thus has a dependency on Java AWT. Since AWT is not very mature in gcj, leaving it out seemed to be the safer option.</li>
|
||||
<li>Packaging - The current make file makes no effort to install the extension into the standard ruby directories. This should probably be
|
||||
packaged as a <a href="https://www.rubygems.org">gem</a>.</li>
|
||||
</ul>
|
||||
</section>
|
||||
|
||||
</section>
|
||||
|
||||
</body>
|
||||
<footer>
|
||||
<legal>
|
||||
Copyright (c) @year@ The Apache Software Foundation. All rights reserved.
|
||||
<br />
|
||||
Apache POI, POI, Apache, the Apache feather logo, and the Apache
|
||||
POI project logo are trademarks of The Apache Software Foundation.
|
||||
</legal>
|
||||
</footer>
|
||||
</document>
|
||||
1099
src/documentation/content/xdocs/components/poifs/design.xml
Normal file
95
src/documentation/content/xdocs/components/poifs/embeded.xml
Normal file
@ -0,0 +1,95 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
<document>
|
||||
<header>
|
||||
<title>Apache POI™ - POIFS - Documents embedded in other documents</title>
|
||||
<subtitle>Overview</subtitle>
|
||||
<authors>
|
||||
<person name="Nick Burch" email="nick@apache.org"/>
|
||||
<person name="Yegor Kozlov" email="yegor@apache.org"/>
|
||||
</authors>
|
||||
</header>
|
||||
<body>
|
||||
<section><title>Overview</title>
|
||||
<p>It is possible for one OLE 2 based document to have other
|
||||
OLE 2 documents embedded in it. For example, an Excel file
|
||||
may have a Word document and a PowerPoint slideshow
|
||||
embedded as part of it.</p>
|
||||
<p>Normally, these other documents are stored in subdirectories
|
||||
of the OLE 2 (POIFS) filesystem. The exact location of the
|
||||
embedded documents will vary depending on the type of the
|
||||
master document, and the exact directory names will differ
|
||||
each time. To figure out exactly which directory to look
|
||||
in, you will either need to process the appropriate OLE 2
|
||||
linking entry in the master document, or simply iterate
|
||||
over all the directories in the filesystem.</p>
|
||||
<p>As a general rule, you will find the same OLE 2 entries
|
||||
in the subdirectories as you would have found at the root
|
||||
of the filesystem had the document not been embedded.</p>
|
||||
|
||||
<section><title>Files embedded in Excel</title>
|
||||
<p>Excel normally stores embedded files in subdirectories
|
||||
of the filesystem root. Typically these subdirectories
|
||||
are named starting with MBD, with 8 hex characters following.</p>
|
||||
</section>
|
||||
|
||||
<section><title>Files embedded in Word</title>
|
||||
<p>Word normally stores embedded files in subdirectories
|
||||
of the ObjectPool directory, itself a subdirectory of the
|
||||
filesystem root. Typically these subdirectories are named
|
||||
starting with an underscore, followed by 10 digits.</p>
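<p>The naming conventions described for Excel and Word can be sketched as a pair of
regular expressions. This is only an illustration: the class and method names are
hypothetical and not part of POI, and since the text says these names are merely
"typical", a match is a hint rather than proof that a directory holds an embedded document.</p>

```java
import java.util.regex.Pattern;

/** Hypothetical helper, not part of POI: recognises typical embedded-document directory names. */
final class EmbeddedDirNames {
    // "MBD" followed by 8 hex characters, as typically used by Excel
    private static final Pattern EXCEL = Pattern.compile("MBD[0-9A-Fa-f]{8}");
    // An underscore followed by 10 digits, as typically used by Word under ObjectPool
    private static final Pattern WORD = Pattern.compile("_[0-9]{10}");

    static boolean looksLikeExcelEmbedding(String directoryName) {
        return EXCEL.matcher(directoryName).matches();
    }

    static boolean looksLikeWordEmbedding(String directoryName) {
        return WORD.matcher(directoryName).matches();
    }
}
```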
|
||||
</section>
|
||||
|
||||
<section><title>Files embedded in PowerPoint</title>
|
||||
<p>PowerPoint does not normally store embedded files
|
||||
in the OLE2 layer. Instead, they are held within records
|
||||
of the main PowerPoint file.
|
||||
<br/>See the <a href="./../slideshow/how-to-shapes.html#OLE">HSLF Tutorial</a>
|
||||
for how to retrieve embedded OLE objects from a presentation.</p>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
<section><title>Listing POIFS contents</title>
|
||||
<p>POIFS provides a simple tool for listing the contents of
|
||||
OLE2 files. This can allow you to see what your POIFS file
|
||||
contains, and hence whether it has any embedded documents in it,
|
||||
and where.</p>
|
||||
<p>The tool to use is <em>org.apache.poi.poifs.dev.POIFSLister</em>.
|
||||
This tool may be run from the command line, and takes a filename
|
||||
as its parameter. It will print out all the directories and
|
||||
files contained within the POIFS file.</p>
|
||||
</section>
|
||||
|
||||
<section><title>Opening embedded files</title>
|
||||
<p>All of the POIDocument classes (HSSFWorkbook, HSLFSlideShow,
|
||||
HWPFDocument and HDGFDiagram) can either be opened from
|
||||
a POIFSFileSystem, or from a specific directory within a
|
||||
POIFSFileSystem. So, to open embedded files, simply locate the
|
||||
appropriate DirectoryNode that represents the subdirectory
|
||||
of interest, and pass this, together with the overall POIFSFileSystem, to
|
||||
the constructor.</p>
|
||||
<p>If you want to extract the textual contents of the embedded file,
|
||||
then open the appropriate POIDocument, and then pass this to
|
||||
the extractor class, instead of simply passing the POIFSFileSystem
|
||||
to the extractor.</p>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
703
src/documentation/content/xdocs/components/poifs/fileformat.xml
Normal file
@ -0,0 +1,703 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
<document>
|
||||
<header>
|
||||
<title>POIFS File System Internals</title>
|
||||
<authors>
|
||||
<person email="mjohnson@apache.org" name="Marc Johnson" id="MJ"/>
|
||||
</authors>
|
||||
</header>
|
||||
<body>
|
||||
<section><title>POIFS File System Internals</title>
|
||||
<section><title>Introduction</title>
|
||||
<p>POIFS file systems are essentially normal files stored on a
|
||||
Java-compatible platform's native file system. They are
|
||||
typically identified by names ending in a four character
|
||||
extension noting what type of data they contain. For
|
||||
example, a file ending in ".xls" would likely
|
||||
contain spreadsheet data, and a file ending in
|
||||
".doc" would probably contain a word processing
|
||||
document. POIFS file systems are called "file
|
||||
system", because they contain multiple embedded files
|
||||
in a manner similar to traditional file systems. Along
|
||||
functional lines, it would be more accurate to call these
|
||||
POIFS archives. For the remainder of this document it is
|
||||
referred to as a file system in order to avoid confusion
|
||||
with the "files" it contains.</p>
|
||||
<p>POIFS file systems are compatible with those document
|
||||
formats used by a well-known software company's popular
|
||||
office productivity suite and programs outputting
|
||||
compatible data. Because the POIFS file system does not
|
||||
provide compression, encryption or any other worthwhile
|
||||
feature, it's not a good choice unless you require
|
||||
interoperability with these programs.</p>
|
||||
<p>The POIFS file system does not encode the documents
|
||||
themselves. For example, if you had a word processor file
|
||||
with the extension ".doc", you would actually
|
||||
have a POIFS file system with a document file archived
|
||||
inside of that file system.</p>
|
||||
<p>Note - this document is a good overview and explanation of
|
||||
the file format, but for the very nitty-gritty details,
|
||||
you should refer to
|
||||
<a href="https://msdn.microsoft.com/en-us/library/dd942138%28v=prot.13%29.aspx">[MS-CFB].pdf</a>
|
||||
in the (now public) Microsoft Documentation.</p>
|
||||
</section>
|
||||
<section><title>Document Conventions</title>
|
||||
<p>This document utilizes the numeric types as described by
|
||||
the Java Language Specification, which can be found at
|
||||
<a href="https://java.sun.com">https://java.sun.com</a>. In
|
||||
short:</p>
|
||||
<ul>
|
||||
<li>A <em>byte</em> is an 8 bit signed integer ranging from
|
||||
-128 to 127.</li>
|
||||
<li>A <em>short</em> is a 16 bit signed integer ranging from
|
||||
-32768 to 32767.</li>
|
||||
<li>An <em>int</em> is a 32 bit signed integer ranging from
|
||||
-2147483648 to 2147483647.</li>
|
||||
<li>A <em>long</em> is a 64 bit signed integer ranging from
|
||||
-9.22E18 to 9.22E18.</li>
|
||||
</ul>
|
||||
<p>The Java Language Specification spells out a number of
|
||||
other types that are not referred to by this document.</p>
|
||||
<p>Where this document makes references to "endian
|
||||
conversion" it is referring to the byte order of
|
||||
stored numbers. Numbers in "little-endian order"
|
||||
are stored with the <em>least</em> significant byte first. In
|
||||
order to properly read a short, for example, you'd read two
|
||||
bytes and then shift the second byte 8 bits to the left
|
||||
before performing an <code>or</code> operation to it
|
||||
against the first byte. The following code illustrates this
|
||||
method:</p>
|
||||
<source>
|
||||
public int getShort (byte[] rec)
|
||||
{
|
||||
return ((rec[1] << 8) | (rec[0] & 0x00ff));
|
||||
}</source>
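<p>As an alternative sketch, the same little-endian decoding can be left to
<code>java.nio.ByteBuffer</code> from the Java standard library. The class and method
names below are hypothetical and chosen only for illustration.</p>

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper: reads little-endian numbers from a byte array. */
final class LittleEndianDemo {
    /** Reads a 16 bit signed value stored least significant byte first. */
    static short readShort(byte[] data, int offset) {
        return ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN).getShort(offset);
    }

    /** Reads a 32 bit signed value stored least significant byte first. */
    static int readInt(byte[] data, int offset) {
        return ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN).getInt(offset);
    }
}
```

<p>For example, the two bytes <code>0x34 0x12</code> decode to the short <code>0x1234</code>.</p>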
|
||||
</section>
|
||||
<section><title>File System Walkthrough</title>
|
||||
<p>This is a walkthrough of a POIFS file system and how it is
|
||||
put together. It is not intended to give a concise
|
||||
description but to give a "big picture" of the
|
||||
general structure and how it's interpreted.</p>
|
||||
<p>A POIFS file system begins with a header. This header
|
||||
identifies locations in the file by function and provides a
|
||||
sanity check identifying a file as a POIFS file system.</p>
|
||||
<p>The first 64 bits of the header compose a <em>magic number
|
||||
identifier.</em> This identifier tells the client software
|
||||
that this is indeed a POIFS file system and that it should
|
||||
be treated as such. This is a "sanity check" to
|
||||
make sure this is a POIFS file system and not some other
|
||||
format. The header also contains an <em>array of block
|
||||
numbers</em>. These block numbers refer to blocks in the
|
||||
file. When these blocks are read together they form the
|
||||
<em>Block Allocation Table</em>. The header also contains a
|
||||
pointer to the first element in the <em>property table</em>,
|
||||
also known as the <em>root element</em>, and a pointer to the
|
||||
<em>small Block Allocation Table (SBAT)</em>.</p>
|
||||
<p>The <em>block allocation table</em> or <em>BAT</em>, along with
|
||||
the <em>property table</em>, specify which blocks in the file
|
||||
system belong to which files. After the header block, the
|
||||
file system is divided into identically sized blocks of
|
||||
data, numbered from 0 to however many blocks there are in
|
||||
the file system. For each file in the file system, its
|
||||
entry in the property table includes the index of the first
|
||||
block in the array of blocks. Each block's index into the
|
||||
array of blocks is also its index into the BAT, and the
|
||||
integer value stored at that index in the BAT gives the
|
||||
index of the next block in the array (and thus the index of
|
||||
the next BAT value). A special value is stored in the BAT
|
||||
to indicate "end of file".</p>
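<p>The chain walk described above can be sketched as follows. This is an illustration
under stated assumptions, not POI's implementation: the class name is hypothetical, the
BAT is assumed to be already loaded into an <code>int[]</code>, blocks are assumed to be
512 bytes, and the end-of-file marker is assumed to be -2 (0xFFFFFFFE, the end-of-chain
value given in [MS-CFB]).</p>

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of following a file's blocks through the BAT. */
final class BatChain {
    static final int END_OF_CHAIN = -2;  // assumption: 0xFFFFFFFE, per [MS-CFB]
    static final int BLOCK_SIZE = 512;   // assumption: 512 byte big blocks

    /** Returns the block indices of a file, in order, starting from its start block. */
    static List<Integer> blocksOf(int[] bat, int startBlock) {
        List<Integer> chain = new ArrayList<>();
        int block = startBlock;
        while (block != END_OF_CHAIN) {
            // Guard against out-of-range indices and cycles in a corrupt BAT
            if (block < 0 || block >= bat.length || chain.size() >= bat.length) {
                throw new IllegalStateException("Corrupt BAT chain at block " + block);
            }
            chain.add(block);
            block = bat[block];  // the BAT entry at a block's index names the next block
        }
        return chain;
    }

    /** Byte offset of a block in the file; block 0 starts right after the header block. */
    static long fileOffsetOf(int block) {
        return (long) (block + 1) * BLOCK_SIZE;
    }
}
```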
|
||||
<p>The <em>property table</em> is essentially the directory
|
||||
storage for the file system. It consists of the name of the
|
||||
file or directory, its <em>start block</em> in both the file
|
||||
system and <em>BAT</em>, and its actual size. The first
|
||||
property in the property table is the <em>root
|
||||
element</em>. It has two purposes: to be a directory entry
|
||||
(the root of the directory tree, to be specific), and to
|
||||
hold the start block for the <em>small block data</em>.</p>
|
||||
<p>Small block data is a special file that contains the data
|
||||
for small files (less than 4K bytes). It subdivides its
|
||||
blocks into smaller blocks and there is a special small
|
||||
block allocation table that, like the main BAT for larger
|
||||
files, is used to map a small file to its small blocks.</p>
|
||||
</section>
|
||||
<section><title>Header Block</title>
|
||||
<p>The POIFS file system begins with a <em>header
|
||||
block</em>. The first 64 bits of the header form a long
|
||||
<em>file type id</em> or <em>magic number identifier</em> of
|
||||
<code>0xE11AB1A1E011CFD0L</code>. This is basically a
|
||||
sanity check. If this isn't the first thing in the header
|
||||
(and consequently the file system) then this is not a
|
||||
POIFS file system and should be read with some other
|
||||
library.</p>
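<p>A minimal sketch of this sanity check, assuming the first bytes of the file have
already been read into a byte array. The class and method names are hypothetical; only
the magic number itself comes from the text above.</p>

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical sketch of the header magic number check. */
final class MagicCheck {
    static final long MAGIC = 0xE11AB1A1E011CFD0L;

    /** Returns true if the first 8 bytes, read as a little-endian long, match the magic number. */
    static boolean looksLikePoifs(byte[] header) {
        if (header.length < 8) {
            return false;
        }
        long id = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN).getLong(0);
        return id == MAGIC;
    }
}
```

<p>Because the value is stored in little-endian order, the file itself begins with the
bytes <code>D0 CF 11 E0 A1 B1 1A E1</code>.</p>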
|
||||
<p>It is useful to know the most important parts of the
|
||||
header. These are discussed in the rest of this
|
||||
section.</p>
|
||||
<section><title>BATs</title>
|
||||
<p>At offset <em>0x2C</em> is an int specifying the number
|
||||
of elements in the <em>BAT array</em>. The array at
|
||||
<em>0x4C</em> is an array of ints. This array contains the
|
||||
indices of every block in the Block Allocation
|
||||
Table.</p>
|
||||
</section>
|
||||
<section><title>XBATs</title>
|
||||
<p>Very large POIFS archives may have more blocks than can
|
||||
be addressed by the BAT blocks enumerated in the header
|
||||
block. How large? Well, the BAT array in the header can
|
||||
contain up to 109 BAT block indices; each BAT block
|
||||
references up to 128 blocks, and each block is 512
|
||||
bytes, so we're talking about 109 * 128 * 512 =
|
||||
6.8MB. That's a pretty respectable document! But, you
|
||||
could have much more data than that, and in today's
|
||||
world of cheap gigabyte drives, why not? So, the BAT
|
||||
may be extended in that event. The integer value at
|
||||
offset <em>0x44</em> of the header is the index of the
|
||||
first <em>extended BAT (XBAT) block</em>. At offset
|
||||
<em>0x48</em> of the header, there is an int value that
|
||||
specifies how many XBAT blocks there are. The XBAT
|
||||
blocks begin at the specified index into the array of
|
||||
blocks making up the POIFS file system, and are chained
|
||||
for the specified count of XBAT blocks.</p>
|
||||
<p>Each XBAT block contains the indices of up to 127 BAT
|
||||
blocks, so the document size can be expanded by another
|
||||
~8MB for each XBAT block. The BAT blocks indexed by an
|
||||
XBAT block are appended to the end of the list of BAT
|
||||
blocks enumerated in the header block. Thus the BAT
|
||||
blocks enumerated in the header block are BAT blocks 0
|
||||
through 108, the BAT blocks enumerated in the first
|
||||
XBAT block are BAT blocks 109 through 235, the BAT
|
||||
blocks enumerated in the second XBAT block are BAT
|
||||
blocks 236 through 362, and so on.</p>
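<p>The numbering above can be turned into a small lookup: given the number of a BAT
block, work out whether its index is stored in the header or in an XBAT block, and at
which position. The class and method names are hypothetical; the constants 109 and 127
come from the text.</p>

```java
/** Hypothetical sketch: locates where the index of a given BAT block is stored. */
final class BatBlockLocator {
    static final int HEADER_BAT_ENTRIES = 109;  // BAT block indices held in the header
    static final int XBAT_BAT_ENTRIES = 127;    // BAT block indices held in each XBAT block

    /** Returns -1 if the BAT block is listed in the header, else the zero-based XBAT block number. */
    static int xbatOrdinal(int batBlockNumber) {
        if (batBlockNumber < HEADER_BAT_ENTRIES) {
            return -1;
        }
        return (batBlockNumber - HEADER_BAT_ENTRIES) / XBAT_BAT_ENTRIES;
    }

    /** Returns the position of the entry within the header array or within its XBAT block. */
    static int slot(int batBlockNumber) {
        if (batBlockNumber < HEADER_BAT_ENTRIES) {
            return batBlockNumber;
        }
        return (batBlockNumber - HEADER_BAT_ENTRIES) % XBAT_BAT_ENTRIES;
    }
}
```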
|
||||
<p>While a normal BAT block holds 128 entries, each XBAT
|
||||
only references 127 BAT blocks. The last, 128th entry
|
||||
in an XBAT is the offset to the next XBAT block in the
|
||||
chain (or -1 if this is the last XBAT).</p>
|
||||
<p>Through the use of XBAT blocks, the limit on the
|
||||
overall document size is that imposed by the 4-byte
|
||||
block indices; if the indices are unsigned ints, the
|
||||
maximum file size is 2 terabytes, 1 terabyte if the
|
||||
indices are treated as signed ints. Either way, I have
|
||||
yet to see a disk drive large enough to accommodate
|
||||
such a file on the shelves at the local office supply
|
||||
stores.</p>
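<p>The size figures quoted in this section follow from simple multiplication. The sketch below just redoes that arithmetic, assuming 512-byte big blocks and 128 entries per BAT block as described above; the class name is hypothetical.</p>
<source><![CDATA[
```java
/** Sketch of the capacity arithmetic used in this section (illustrative names). */
final class PoifsCapacity {
    static final long BLOCK_SIZE = 512;            // bytes per big block
    static final long ENTRIES_PER_BAT_BLOCK = 128; // block indices per BAT block

    /** Bytes addressable through the 109 BAT blocks listed in the header. */
    static long headerOnlyBytes() {
        return 109 * ENTRIES_PER_BAT_BLOCK * BLOCK_SIZE; // 7,143,424 bytes, about 6.8MB
    }

    /** Extra bytes addressable for each XBAT block (127 more BAT blocks). */
    static long bytesPerXbatBlock() {
        return 127 * ENTRIES_PER_BAT_BLOCK * BLOCK_SIZE; // 8,323,072 bytes, about 8MB
    }

    /** Upper limit imposed by 4-byte block indices. */
    static long maxBytes(boolean unsignedIndices) {
        long blocks = unsignedIndices ? (1L << 32) : (1L << 31);
        return blocks * BLOCK_SIZE; // 2 terabytes unsigned, 1 terabyte signed
    }
}
```
]]></source>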
|
||||
</section>
|
||||
<section><title>SBATs</title>
|
||||
<p>If a file contained in a POIFS archive is smaller than
|
||||
4096 bytes, it is stored in small blocks. Small blocks
|
||||
are 64 bytes in length and are contained within big
|
||||
blocks, up to 8 to a big block. As the main BAT is used
|
||||
to navigate the array of big blocks, so the <em>small
|
||||
block allocation table</em> is used to navigate the
|
||||
array of small blocks. The SBAT's start block index is
|
||||
found at offset <em>0x3C</em> of the header block, and
|
||||
remaining blocks constituting the SBAT are found by
|
||||
walking the main BAT as if it were an ordinary file in
|
||||
the POIFS file system (this process is described
|
||||
below).</p>
|
||||
</section>
|
||||
<section><title>Property Table Start Index</title>
|
||||
<p>An integer at offset <em>0x30</em> specifies the start
|
||||
index of the property table. This integer is specified
|
||||
as a <em>"block index"</em>. The Property Table
|
||||
is stored, as is almost everything in a POIFS file
|
||||
system, in big blocks and walked via the BAT. The
|
||||
Property Table is described below.</p>
|
||||
</section>
|
||||
</section>
|
||||
<section><title>Property Table</title>
|
||||
<p>The property table is essentially nothing more than the
|
||||
directory system. Properties are 128-byte records
|
||||
contained within the 512-byte blocks. The first property
|
||||
is always the Root Entry. The following applies to
|
||||
individual properties within a property table:</p>
|
||||
<ul>
|
||||
<li>At offset <em>0x00</em> in the property is the
|
||||
"<em>name</em>". This is stored as an
|
||||
uncompressed 16-bit Unicode string. In short, every
|
||||
other byte corresponds to an "ASCII"
|
||||
character. The size of this string is stored at offset
|
||||
<em>0x40</em> (<em>string size</em>) as a short.</li>
|
||||
<li>At offset <em>0x42</em> is the <em>property type</em>
|
||||
(byte). The type is 1 for directory, 2 for file or 5
|
||||
for the Root Entry.</li>
|
||||
<li>At offset <em>0x43</em> is the <em>node color</em>
|
||||
(byte). The color is either 1 (black) or 0
|
||||
(red). Properties are apparently meant to be arranged
|
||||
in a red-black binary tree, subject to the following
|
||||
rules:
|
||||
<ol>
|
||||
<li>The root of the tree is always black</li>
|
||||
<li>Two consecutive nodes cannot both be red</li>
|
||||
<li>A property is less than another property if its
|
||||
name length is less than the other property's name
|
||||
length</li>
|
||||
<li>If two properties have the same name length, the
|
||||
sort order is determined by the sort order of the
|
||||
properties' names.</li>
|
||||
</ol></li>
|
||||
<li>At offset <em>0x44</em> is the index (int) of the
|
||||
<em>previous property</em>.</li>
|
||||
<li>At offset <em>0x48</em> is the index (int) of the
|
||||
<em>next property</em>.</li>
|
||||
<li>At offset <em>0x4C</em> is the index (int) of the
|
||||
<em>first directory entry</em>. This is used by
|
||||
directory entries.</li>
|
||||
<li>At offset <em>0x74</em> is an integer giving the
|
||||
<em>start block</em> for the file described by this
|
||||
property. This index corresponds to an index in the
|
||||
array of indices that is the Block Allocation Table
|
||||
(or the Small Block Allocation Table) as well as the
|
||||
index of the first block in the file. This is used by
|
||||
files and the root entry.</li>
|
||||
<li>At offset <em>0x78</em> is an integer giving the total
|
||||
<em>actual size</em> of the file pointed at by this
|
||||
property. If the file size is less than 4096, the file
|
||||
is stored in small blocks and the SBAT is used to walk
|
||||
the small blocks making up the file. If the file size
|
||||
is 4096 or larger, the file is stored in big blocks
|
||||
and the main BAT is used to walk the big blocks making
|
||||
up the file. The exception to this rule is the <em>Root
|
||||
Entry</em>, which, regardless of its size, is
|
||||
<em>always</em> stored in big blocks and the main BAT is
|
||||
used to walk the big blocks making up this special
|
||||
file.</li>
|
||||
</ul>
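<p>As a rough illustration of the layout just listed, the following sketch decodes one 128-byte property record. It is not POI's own implementation: the class and field names are invented for this example, little-endian byte order is assumed, and the name is read up to its null terminator rather than relying on the stored string size. The comparison helper uses plain <code>String.compareTo</code> for equal-length names, which is also an assumption, since the exact collation is not spelled out here.</p>
<source><![CDATA[
```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Sketch: decode a single 128-byte property record (illustrative names). */
final class PropertyRecord {
    final String name;
    final int type;       // 1 = directory, 2 = file, 5 = Root Entry
    final int color;      // 0 = red, 1 = black
    final int previous;   // index of previous property, -1 if unused
    final int next;       // index of next property, -1 if unused
    final int child;      // index of first directory entry, -1 if unused
    final int startBlock; // start block of the file
    final int size;       // actual size of the file in bytes

    PropertyRecord(byte[] raw) {
        // Assumption: multi-byte values are little-endian.
        ByteBuffer b = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN);
        StringBuilder sb = new StringBuilder();
        for (int off = 0; off < 0x40; off += 2) {
            char c = b.getChar(off);
            if (c == 0) {
                break; // stop at the null terminator
            }
            sb.append(c);
        }
        name = sb.toString();
        type = b.get(0x42);
        color = b.get(0x43);
        previous = b.getInt(0x44);
        next = b.getInt(0x48);
        child = b.getInt(0x4C);
        startBlock = b.getInt(0x74);
        size = b.getInt(0x78);
    }

    /** Files under 4096 bytes live in small blocks; the Root Entry never does. */
    boolean usesSmallBlocks() {
        return type == 2 && size < 4096;
    }

    /** Ordering rule from the list above: shorter names first, then by name. */
    static int compareNames(String a, String b) {
        int byLength = Integer.compare(a.length(), b.length());
        return byLength != 0 ? byLength : a.compareTo(b);
    }
}
```
]]></source>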
|
||||
</section>
|
||||
<section><title>Root Entry</title>
|
||||
<p>The <em>Root Entry</em> in the <em>Property Table</em>
|
||||
contains the information necessary to read and write
|
||||
small files, which are files less than 4096 bytes
|
||||
long. The start block field of the Root Entry is the
|
||||
start index of the <em>Small Block Array</em>, which is
|
||||
read like any other file in the POIFS file system. Since
|
||||
the SBAT cannot be used without the Small Block Array,
|
||||
the Root Entry MUST be read or written using the <em>Block
|
||||
Allocation Table</em>. The blocks making up the Small
|
||||
Block Array are divided into 64-byte small blocks, up to
|
||||
the size indicated in the Root Entry (which should always
|
||||
be a multiple of 64).</p>
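<p>Once the Small Block Array has been read as a single stream, finding a given small block is a matter of division. The helper below is a minimal sketch of that arithmetic, assuming 64-byte small blocks and 512-byte big blocks as described above; the names are hypothetical.</p>
<source><![CDATA[
```java
/** Sketch: locating small blocks inside the Small Block Array (illustrative names). */
final class SmallBlocks {
    static final int SMALL_BLOCK_SIZE = 64;
    static final int BIG_BLOCK_SIZE = 512;
    static final int SMALL_PER_BIG = BIG_BLOCK_SIZE / SMALL_BLOCK_SIZE; // 8

    /** Byte offset of small block k within the Small Block Array stream. */
    static long offsetInArray(int k) {
        return (long) k * SMALL_BLOCK_SIZE;
    }

    /** 0-based position, within the array's big block chain, of the big block holding small block k. */
    static int bigBlockOrdinal(int k) {
        return k / SMALL_PER_BIG;
    }

    /** Byte offset of small block k inside that big block. */
    static int offsetInBigBlock(int k) {
        return (k % SMALL_PER_BIG) * SMALL_BLOCK_SIZE;
    }

    /** Number of small blocks implied by the size stored in the Root Entry. */
    static int smallBlockCount(int rootEntrySize) {
        return rootEntrySize / SMALL_BLOCK_SIZE;
    }
}
```
]]></source>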
|
||||
</section>
|
||||
<section><title>Walking the Nodes of the Property Table</title>
|
||||
<p>The individual properties form a directory tree, with the
|
||||
<em>Root Entry</em> as the directory tree's root, as shown
|
||||
in the accompanying drawing. Note the numbers in
|
||||
parentheses in each node; they represent the node's index
|
||||
in the array of properties. The <em>NEXT_PROP</em>,
|
||||
<em>PREVIOUS_PROP</em>, and <em>CHILD_PROP</em> fields hold
|
||||
these indices, and are used to navigate the tree.</p>
|
||||
<p><img alt="property set" src="images/PropertySet.jpg" /></p>
|
||||
<p>Each directory entry (i.e., a property whose type is
|
||||
<em>directory</em> or <em>root entry</em>) uses its
|
||||
<em>CHILD_PROP</em> field to point to one of its
|
||||
subordinate (child) properties. It doesn't seem to matter
|
||||
which of its children it points to. Thus in the previous
|
||||
drawing, the Root Entry's CHILD_PROP field may contain 1,
|
||||
4, or the index of one of its other children. Similarly,
|
||||
the directory node (index 1) may have, in its CHILD_PROP
|
||||
field, 2, 3, or the index of one of its other
|
||||
children.</p>
|
||||
<p>The children of a given directory property point to each
|
||||
other in a similar fashion by using their
|
||||
<em>NEXT_PROP</em> and <em>PREVIOUS_PROP</em> fields.</p>
|
||||
<p>Unused <em>NEXT_PROP</em>, <em>PREVIOUS_PROP</em>, and
|
||||
<em>CHILD_PROP</em> fields contain the marker value of
|
||||
-1. All file properties have a value of -1 for their
|
||||
CHILD_PROP fields, for example.</p>
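<p>One way to picture the navigation is a small recursive walk: start at a directory's CHILD_PROP and follow PREVIOUS_PROP and NEXT_PROP until the -1 marker is reached. The sketch below assumes the three fields have already been read into parallel arrays indexed by property number. It is an illustration with made-up names, not the POI implementation, and it does not guard against corrupt files containing cycles.</p>
<source><![CDATA[
```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: list the children of a directory property (illustrative names). */
final class PropertyTreeWalker {
    static final int UNUSED = -1; // marker for an unused PREVIOUS/NEXT/CHILD field

    /** Returns the indices of all properties directly contained in directory dir. */
    static List<Integer> childrenOf(int dir, int[] previous, int[] next, int[] child) {
        List<Integer> out = new ArrayList<>();
        visit(child[dir], previous, next, out);
        return out;
    }

    // In-order walk: previous side first, then the node itself, then the next side.
    private static void visit(int index, int[] previous, int[] next, List<Integer> out) {
        if (index == UNUSED) {
            return;
        }
        visit(previous[index], previous, next, out);
        out.add(index);
        visit(next[index], previous, next, out);
    }
}
```
]]></source>
<p>The walk only ever touches the siblings of one directory; descending into a subdirectory means calling <code>childrenOf</code> again with that subdirectory's index.</p>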
|
||||
</section>
|
||||
<section><title>Block Allocation Table</title>
|
||||
<p>The <em>BAT blocks</em> are pointed at by the BAT array
|
||||
contained in the header and supplemented, if necessary,
|
||||
by the <em>XBAT blocks</em>. These blocks form a large
|
||||
table of integers. These integers are block numbers. The
|
||||
<em>Block Allocation Table</em> holds chains of integers.
|
||||
These chains are terminated with -2. The elements in
|
||||
these chains refer to blocks in the files. The starting
|
||||
block of a file is NOT specified in the BAT. It is
|
||||
specified by the <em>property</em> for a given file. The
|
||||
elements in this BAT are both the block number (within
|
||||
the file minus the header) <em>and</em> the number of the
|
||||
next BAT element in the chain. This can be thought of as
|
||||
a linked list of blocks. The BAT array contains the links
|
||||
from one block to the next, including the end of chain
|
||||
marker.</p>
|
||||
<p>Here's an example: Let's assume that the BAT begins as
|
||||
follows:</p>
|
||||
<p><code>BAT[ 0 ] = 2</code></p>
|
||||
<p><code>BAT[ 1 ] = 5</code></p>
|
||||
<p><code>BAT[ 2 ] = 3</code></p>
|
||||
<p><code>BAT[ 3 ] = 4</code></p>
|
||||
<p><code>BAT[ 4 ] = 6</code></p>
|
||||
<p><code>BAT[ 5 ] = -2</code></p>
|
||||
<p><code>BAT[ 6 ] = 7</code></p>
|
||||
<p><code>BAT[ 7 ] = -2</code></p>
|
||||
<p><code>...</code></p>
|
||||
<p>Now, if we have a file whose Property Table entry says it
|
||||
begins with index 0, we walk the BAT array and see that
|
||||
the file consists of blocks 0 (because the start block is
|
||||
0), 2 (because BAT[ 0 ] is 2), 3 (BAT[ 2 ] is 3), 4 (BAT[
|
||||
3 ] is 4), 6 (BAT[ 4 ] is 6), and 7 (BAT[ 6 ] is 7). It
|
||||
ends at block 7 because BAT[ 7 ] is -2, which is the end
|
||||
of chain marker.</p>
|
||||
<p>Similarly, a file beginning at index 1 consists of
|
||||
blocks 1 and 5.</p>
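<p>The walk in this example can be written as a short loop. The following is a minimal sketch with hypothetical names, assuming the BAT has already been loaded into an <code>int</code> array. The <code>fileOffset</code> helper reflects one reading of "within the file minus the header": block 0 is taken to start right after the 512-byte header block.</p>
<source><![CDATA[
```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: follow a block chain through the BAT (illustrative names). */
final class BatChain {
    static final int END_OF_CHAIN = -2;
    static final int BLOCK_SIZE = 512;

    /** Returns the block indices making up the file that starts at startBlock. */
    static List<Integer> walk(int[] bat, int startBlock) {
        List<Integer> blocks = new ArrayList<>();
        int current = startBlock;
        while (current != END_OF_CHAIN) {
            // Reject out-of-range links and chains longer than the BAT itself (a cycle).
            if (current < 0 || current >= bat.length || blocks.size() >= bat.length) {
                throw new IllegalStateException("corrupt chain at " + current);
            }
            blocks.add(current);
            current = bat[current]; // the BAT entry is the link to the next block
        }
        return blocks;
    }

    /** Assumed byte offset of a big block in the file; the header occupies the first 512 bytes. */
    static long fileOffset(int blockIndex) {
        return (long) (blockIndex + 1) * BLOCK_SIZE;
    }
}
```
]]></source>
<p>Fed the sample BAT above, <code>walk(bat, 0)</code> yields blocks 0, 2, 3, 4, 6, 7 and <code>walk(bat, 1)</code> yields blocks 1 and 5.</p>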
|
||||
<p>Other special numbers in a BAT array are:</p>
|
||||
<ul>
|
||||
<li>-1, which indicates an unused block</li>
|
||||
<li>-3, which indicates a "special" block, such
|
||||
as a block used to make up the Small Block Array, the
|
||||
Property Table, the main BAT, or the SBAT</li>
|
||||
</ul>
|
||||
</section>
|
||||
<section><title>File System Structures</title>
|
||||
<p>The following outlines the basic file system structures.</p>
|
||||
<section><title>Header (block 1) -- 512 (0x200) bytes</title>
|
||||
<table>
|
||||
<tr>
|
||||
<td><em>Field</em></td>
|
||||
<td><em>Description</em></td>
|
||||
<td><em>Offset</em></td>
|
||||
<td><em>Length</em></td>
|
||||
<td><em>Default value or const</em></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>FILETYPE</td>
|
||||
<td>Magic number identifying this as a POIFS file
|
||||
system.</td>
|
||||
<td>0x0000</td>
|
||||
<td>Long</td>
|
||||
<td>0xE11AB1A1E011CFD0</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>UK1</td>
|
||||
<td>Unknown constant</td>
|
||||
<td>0x0008</td>
|
||||
<td>Integer</td>
|
||||
<td>0</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>UK2</td>
|
||||
<td>Unknown Constant</td>
|
||||
<td>0x000C</td>
|
||||
<td>Integer</td>
|
||||
<td>0</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>UK3</td>
|
||||
<td>Unknown Constant</td>
|
||||
<td>0x0014</td>
|
||||
<td>Integer</td>
|
||||
<td>0</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>UK4</td>
|
||||
<td>Unknown Constant (revision?)</td>
|
||||
<td>0x0018</td>
|
||||
<td>Short</td>
|
||||
<td>0x003B</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>UK5</td>
|
||||
<td>Unknown Constant (version?)</td>
|
||||
<td>0x001A</td>
|
||||
<td>Short</td>
|
||||
<td>0x0003</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>UK6</td>
|
||||
<td>Unknown Constant</td>
|
||||
<td>0x001C</td>
|
||||
<td>Short</td>
|
||||
<td>-2</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>LOG_2_BIG_BLOCK_SIZE</td>
|
||||
<td>Log, base 2, of the big block size</td>
|
||||
<td>0x001E</td>
|
||||
<td>Short</td>
|
||||
<td>9 (2 ^ 9 = 512 bytes)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>LOG_2_SMALL_BLOCK_SIZE</td>
|
||||
<td>Log, base 2, of the small block size</td>
|
||||
<td>0x0020</td>
|
||||
<td>Integer</td>
|
||||
<td>6 (2 ^ 6 = 64 bytes)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>UK7</td>
|
||||
<td>Unknown Constant</td>
|
||||
<td>0x0024</td>
|
||||
<td>Integer</td>
|
||||
<td>0</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>UK8</td>
|
||||
<td>Unknown Constant</td>
|
||||
<td>0x0028</td>
|
||||
<td>Integer</td>
|
||||
<td>0</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>BAT_COUNT</td>
|
||||
<td>Number of elements in the BAT array</td>
|
||||
<td>0x002C</td>
|
||||
<td>Integer</td>
|
||||
<td>required</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>PROPERTIES_START</td>
|
||||
<td>Block index of the first block of the property
|
||||
table</td>
|
||||
<td>0x0030</td>
|
||||
<td>Integer</td>
|
||||
<td>required</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>UK9</td>
|
||||
<td>Unknown Constant</td>
|
||||
<td>0x0034</td>
|
||||
<td>Integer</td>
|
||||
<td>0</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>UK10</td>
|
||||
<td>Unknown Constant</td>
|
||||
<td>0x0038</td>
|
||||
<td>Integer</td>
|
||||
<td>0x00001000</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>SBAT_START</td>
|
||||
<td>Block index of first big block containing the small
|
||||
block allocation table (SBAT)</td>
|
||||
<td>0x003C</td>
|
||||
<td>Integer</td>
|
||||
<td>-2</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>SBAT_Block_Count</td>
|
||||
<td>Number of big blocks holding the SBAT</td>
|
||||
<td>0x0040</td>
|
||||
<td>Integer</td>
|
||||
<td>1</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>XBAT_START</td>
|
||||
<td>Block index of the first block in the Extended Block
|
||||
Allocation Table (XBAT)</td>
|
||||
<td>0x0044</td>
|
||||
<td>Integer</td>
|
||||
<td>-2</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>XBAT_COUNT</td>
|
||||
<td>Number of elements in the Extended Block Allocation
|
||||
Table (to be added to the BAT)</td>
|
||||
<td>0x0048</td>
|
||||
<td>Integer</td>
|
||||
<td>0</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>BAT_ARRAY</td>
|
||||
<td>Array of block indices constituting the Block
|
||||
Allocation Table (BAT)</td>
|
||||
<td>0x004C, 0x0050, 0x0054 ... 0x01FC</td>
|
||||
<td>Integer[]</td>
|
||||
<td>-1 for unused elements, at least first element must
|
||||
be filled.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>N/A</td>
|
||||
<td>Header block data not otherwise described in this
|
||||
table</td>
|
||||
<td>N/A</td>
|
||||
<td>N/A</td>
|
||||
<td>-1</td>
|
||||
</tr>
|
||||
</table>
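<p>To make the table concrete, here is a hedged sketch that pulls the named fields out of a 512-byte header block. The class and field names are invented for this example, little-endian byte order is assumed, and the fields marked as unknown constants are simply skipped.</p>
<source><![CDATA[
```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Sketch: read the documented fields of the header block (illustrative names). */
final class HeaderFields {
    static final long MAGIC = 0xE11AB1A1E011CFD0L; // FILETYPE
    static final int HEADER_BAT_SLOTS = 109;       // entries at 0x004C ... 0x01FC

    final int batCount;
    final int propertiesStart;
    final int sbatStart;
    final int sbatBlockCount;
    final int xbatStart;
    final int xbatCount;
    final int[] batArray = new int[HEADER_BAT_SLOTS];

    HeaderFields(byte[] header) {
        if (header.length < 512) {
            throw new IllegalArgumentException("header block must be 512 bytes");
        }
        // Assumption: multi-byte values are little-endian.
        ByteBuffer b = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN);
        if (b.getLong(0x00) != MAGIC) {
            throw new IllegalArgumentException("not a POIFS file system");
        }
        batCount = b.getInt(0x2C);
        propertiesStart = b.getInt(0x30);
        sbatStart = b.getInt(0x3C);
        sbatBlockCount = b.getInt(0x40);
        xbatStart = b.getInt(0x44);
        xbatCount = b.getInt(0x48);
        for (int i = 0; i < HEADER_BAT_SLOTS; i++) {
            batArray[i] = b.getInt(0x4C + 4 * i); // -1 marks an unused element
        }
    }
}
```
]]></source>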
|
||||
</section>
|
||||
<section>
|
||||
<title>Block Allocation Table Block -- 512 (0x200) bytes</title>
|
||||
<table>
|
||||
<tr>
|
||||
<td>
|
||||
<em>Field</em>
|
||||
</td>
|
||||
<td>
|
||||
<em>Description</em>
|
||||
</td>
|
||||
<td>
|
||||
<em>Offset</em>
|
||||
</td>
|
||||
<td>
|
||||
<em>Length</em>
|
||||
</td>
|
||||
<td>
|
||||
<em>Default value or const</em>
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>BAT_ELEMENT</td>
|
||||
<td>Any given element in the BAT block</td>
|
||||
<td>0x0000, 0x0004, 0x0008, ... 0x01FC</td>
|
||||
<td>Integer</td>
|
||||
<td>
|
||||
-1 = unused<br/>
|
||||
-2 = end of chain<br/>
|
||||
-3 = special (e.g., BAT block)<br/>
|
||||
All other values point to the next element in the
|
||||
chain and the next index of a block composing the
|
||||
file.
|
||||
</td>
|
||||
</tr>
|
||||
</table>
|
||||
</section>
|
||||
<section><title>Property Block -- 512 (0x200) byte block</title>
|
||||
<table>
|
||||
<tr>
|
||||
<td><em>Field</em></td>
|
||||
<td><em>Description</em></td>
|
||||
<td><em>Offset</em></td>
|
||||
<td><em>Length</em></td>
|
||||
<td><em>Default value or const</em></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Properties[]</td>
|
||||
<td>This block contains the properties.</td>
|
||||
<td>0x0000, 0x0080, 0x0100, 0x0180</td>
|
||||
<td>128 bytes</td>
|
||||
<td>All unused space is set to -1.</td>
|
||||
</tr>
|
||||
</table>
|
||||
</section>
|
||||
<section><title>Property -- 128 (0x80) byte block</title>
|
||||
<table>
|
||||
<tr>
|
||||
<td><em>Field</em></td>
|
||||
<td><em>Description</em></td>
|
||||
<td><em>Offset</em></td>
|
||||
<td><em>Length</em></td>
|
||||
<td><em>Default value or const</em></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>NAME</td>
|
||||
<td>A null-terminated, uncompressed 16-bit Unicode string
|
||||
(lose the high bytes) containing the name of the
|
||||
property.</td>
|
||||
<td>0x00, 0x02, 0x04, ... 0x3E</td>
|
||||
<td>Short[]</td>
|
||||
<td>0x0000 for unused elements, field required, 32
|
||||
elements (0x40 bytes) max</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>NAME_SIZE</td>
|
||||
<td>Number of characters in the NAME field</td>
|
||||
<td>0x40</td>
|
||||
<td>Short</td>
|
||||
<td>Required</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>PROPERTY_TYPE</td>
|
||||
<td>Property type (directory, file, or root)</td>
|
||||
<td>0x42</td>
|
||||
<td>Byte</td>
|
||||
<td>1 (directory), 2 (file), or 5 (root entry)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>NODE_COLOR</td>
|
||||
<td>Node color</td>
|
||||
<td>0x43</td>
|
||||
<td>Byte</td>
|
||||
<td>0 (red) or 1 (black)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>PREVIOUS_PROP</td>
|
||||
<td>Previous property index</td>
|
||||
<td>0x44</td>
|
||||
<td>Integer</td>
|
||||
<td>-1</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>NEXT_PROP</td>
|
||||
<td>Next property index</td>
|
||||
<td>0x48</td>
|
||||
<td>Integer</td>
|
||||
<td>-1</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>CHILD_PROP</td>
|
||||
<td>First child property index</td>
|
||||
<td>0x4c</td>
|
||||
<td>Integer</td>
|
||||
<td>-1</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>SECONDS_1</td>
|
||||
<td>Seconds component of the created timestamp?</td>
|
||||
<td>0x64</td>
|
||||
<td>Integer</td>
|
||||
<td>0</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>DAYS_1</td>
|
||||
<td>Days component of the created timestamp?</td>
|
||||
<td>0x68</td>
|
||||
<td>Integer</td>
|
||||
<td>0</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>SECONDS_2</td>
|
||||
<td>Seconds component of the modified timestamp?</td>
|
||||
<td>0x6C</td>
|
||||
<td>Integer</td>
|
||||
<td>0</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>DAYS_2</td>
|
||||
<td>Days component of the modified timestamp?</td>
|
||||
<td>0x70</td>
|
||||
<td>Integer</td>
|
||||
<td>0</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>START_BLOCK</td>
|
||||
<td>Starting block of the file, used as the first block
|
||||
in the file and the pointer to the next block from
|
||||
the BAT</td>
|
||||
<td>0x74</td>
|
||||
<td>Integer</td>
|
||||
<td>Required</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>SIZE</td>
|
||||
<td>Actual size of the file this property points
|
||||
to. (used to truncate the blocks to the real
|
||||
size).</td>
|
||||
<td>0x78</td>
|
||||
<td>Integer</td>
|
||||
<td>0</td>
|
||||
</tr>
|
||||
</table>
|
||||
</section>
|
||||
</section>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
649
src/documentation/content/xdocs/components/poifs/how-to.xml
Normal file
@ -0,0 +1,649 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?><!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
<document>
|
||||
<header>
|
||||
<title>How To Use the POIFS APIs</title>
|
||||
<authors>
|
||||
<person email="mjohnson@apache.org" name="Marc Johnson" id="MJ"/>
|
||||
</authors>
|
||||
</header>
|
||||
<body>
|
||||
<section>
|
||||
<title>How To Use the POIFS APIs</title>
|
||||
<p>This document describes how to use the POIFS APIs to read, write, and modify files that employ a
|
||||
POIFS-compatible data structure to organize their content.
|
||||
</p>
|
||||
<section>
|
||||
<title>Target Audience</title>
|
||||
<p>This document is intended for Java developers who need to use the POIFS APIs to read, write, or
|
||||
modify files that employ a POIFS-compatible data structure to organize their content. It is not
|
||||
necessary for developers to understand the POIFS data structures, and an explanation of those data
|
||||
structures is beyond the scope of this document. It is expected that the members of the target
|
||||
audience will understand the rudiments of a hierarchical file system, and familiarity with the event
|
||||
pattern employed by Java APIs such as AWT would be helpful.
|
||||
</p>
|
||||
</section>
|
||||
<section>
|
||||
<title>Glossary</title>
|
||||
<p>This document attempts to be consistent in its terminology, which is defined here:</p>
|
||||
<dl>
|
||||
<dt>Directory</dt>
|
||||
<dd>A special file that may contain other directories and documents.</dd>
|
||||
<dt>DirectoryEntry</dt>
|
||||
<dd>Representation of a directory within another directory.</dd>
|
||||
<dt>Document</dt>
|
||||
<dd>A file containing data, such as word processing data or a spreadsheet workbook.</dd>
|
||||
<dt>DocumentEntry</dt>
|
||||
<dd>Representation of a document within a directory.</dd>
|
||||
<dt>Entry</dt>
|
||||
<dd>Representation of a file in a directory.</dd>
|
||||
<dt>File</dt>
|
||||
<dd>A named entity, managed and contained by the file system.</dd>
|
||||
<dt>File System</dt>
|
||||
<dd>The POIFS data structures, plus the contained directories and documents, which are maintained in
|
||||
a hierarchical directory structure.
|
||||
</dd>
|
||||
<dt>Root Directory</dt>
|
||||
<dd>The directory at the base of a file system. All file systems have a root directory. The POIFS
|
||||
APIs will not allow the root directory to be removed or renamed, but it can be accessed for the
|
||||
purpose of reading its contents or adding files (directories and documents) to it.
|
||||
</dd>
|
||||
</dl>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>The different ways of working with POIFS</title>
|
||||
<p>The POIFS API provides ways to read, modify and write files and streams that employ a POIFS-compatible
|
||||
data structure to organize their content. The following use cases are covered:
|
||||
</p>
|
||||
<ul>
|
||||
<li>
|
||||
<a href="#reading">Reading a File System</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="#reading_poifsfilesystem">Conventional Reading with POIFSFileSystem</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="#reading_event">Event-Driven Reading</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="#writing">Writing a File System</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="#modifying">Modifying a File System</a>
|
||||
</li>
|
||||
</ul>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Reading a File System</title>
|
||||
<anchor id="reading"/>
|
||||
<p>This section covers reading a file system. There are two ways to read a file system; these techniques are
|
||||
sketched out in the following list, and then explained in greater depth in the sections following the
list.
|
||||
</p>
|
||||
<dl>
|
||||
<dt>Conventional Reading with POIFSFileSystem</dt>
|
||||
<dd>
|
||||
<ul>
|
||||
<li class="pro">Simpler API similar to reading a conventional file system.</li>
|
||||
<li class="pro">Can read documents in any order.</li>
|
||||
<li class="pro">Well tested read and write support.</li>
|
||||
<li class="con">If created from an InputStream, all files are resident in memory. (If created
|
||||
from a File, only certain key structures are)
|
||||
</li>
|
||||
</ul>
|
||||
</dd>
|
||||
<dt>Event-Driven Reading</dt>
|
||||
<dd>
|
||||
<ul>
|
||||
<li class="pro">Reduced footprint -- only the documents you care about are processed.</li>
|
||||
<li class="pro">Improved performance -- no time is wasted reading the documents you're not
|
||||
interested in.
|
||||
</li>
|
||||
<li class="con">More complicated API.</li>
|
||||
<li class="con">Need to know in advance which documents you want to read.</li>
|
||||
<li class="con">No control over the order in which the documents are read.</li>
|
||||
<li class="con">No way to go back and get additional documents except to re-read the file
|
||||
system, which may not be possible, e.g., if the file system is being read from an input
|
||||
stream that lacks random access support.
|
||||
</li>
|
||||
</ul>
|
||||
</dd>
|
||||
</dl>
|
||||
|
||||
<section>
|
||||
<title>Conventional Reading with POIFSFileSystem</title>
|
||||
<anchor id="reading_poifsfilesystem"/>
|
||||
<p>In this technique for reading, certain key structures are loaded into memory, and the entire
|
||||
directory tree can be walked by the application, reading specific documents at leisure.
|
||||
</p>
|
||||
<p>If you create a POIFSFileSystem instance from a File, the memory footprint is very small. However, if
|
||||
you create a POIFSFileSystem instance from an input stream, then the whole contents must be
|
||||
buffered into memory to allow random access. As such, you should budget for memory use of up to 20%
|
||||
of the file size when using a File, or up to 120% of the file size when using an InputStream.
|
||||
</p>
|
||||
|
||||
<section>
|
||||
<title>Preparation</title>
|
||||
<p>Before an application can read a file from the file system, the file system needs to be opened
|
||||
and core parts processed. This is done using the
|
||||
<code>org.apache.poi.poifs.filesystem.POIFSFileSystem</code>
|
||||
class. Once the file system has been loaded into memory, the application may need the root
|
||||
directory. The following code fragment will accomplish this preparation stage:
|
||||
</p>
|
||||
<source><![CDATA[
|
||||
// This is the most memory efficient way to open the FileSystem
|
||||
try (POIFSFileSystem fs = new POIFSFileSystem(new File(filename))) {
|
||||
DirectoryEntry root = fs.getRoot();
|
||||
} catch (IOException e) {
|
||||
// an I/O error occurred, or the File did not provide a compatible
|
||||
// POIFS data structure
|
||||
}
|
||||
|
||||
// Using an InputStream requires more memory than using a File
|
||||
try (POIFSFileSystem fs = new POIFSFileSystem(inputStream)) {
|
||||
DirectoryEntry root = fs.getRoot();
|
||||
} catch (IOException e) {
|
||||
// an I/O error occurred, or the InputStream did not provide
|
||||
// a compatible POIFS data structure
|
||||
}
|
||||
]]></source>
|
||||
<p>Assuming no exception was thrown, the file system can then be read.</p>
|
||||
</section>
|
||||
<section>
|
||||
<title>Reading the Directory Tree</title>
|
||||
<p>Once the file system has been loaded into memory and the root directory has been obtained, the
|
||||
root directory can be read. The following code fragment shows how to read the entries in an <code>
|
||||
org.apache.poi.poifs.filesystem.DirectoryEntry
|
||||
</code> instance:
|
||||
</p>
|
||||
<source><![CDATA[
|
||||
// dir is an instance of DirectoryEntry ...
|
||||
for (Entry entry : dir) {
|
||||
System.out.println("found entry: " + entry.getName());
|
||||
if (entry instanceof DirectoryEntry) {
|
||||
// .. recurse into this directory
|
||||
} else if (entry instanceof DocumentEntry) {
|
||||
// entry is a document, which you can read
|
||||
} else {
|
||||
// currently, either an Entry is a DirectoryEntry or a DocumentEntry,
|
||||
// but in the future, there may be other entry subinterfaces.
|
||||
// The internal data structure certainly allows for a lot more entry types.
|
||||
}
|
||||
}
|
||||
]]></source>
|
||||
</section>
|
||||
<section>
|
||||
<title>Reading a Specific Document</title>
|
||||
<p>There are a couple of ways to read a document, depending on whether the document resides in the
|
||||
root directory or in another directory. Either way, you will obtain an <code>
|
||||
org.apache.poi.poifs.filesystem.DocumentInputStream
|
||||
</code> instance.
|
||||
</p>
|
||||
<section>
|
||||
<title>DocumentInputStream</title>
|
||||
<p>The DocumentInputStream class is a simple implementation of InputStream that makes a few
|
||||
guarantees worth noting:
|
||||
</p>
|
||||
<ul>
|
||||
<li>
|
||||
<code>available()</code>
|
||||
always returns the number of bytes in the document from your current position in the
|
||||
document.
|
||||
</li>
|
||||
<li>
|
||||
<code>markSupported()</code>
|
||||
returns <code>true</code>.
|
||||
</li>
|
||||
<li>
|
||||
<code>mark(int limit)</code>
|
||||
ignores the limit parameter; basically the method marks the current position in the
|
||||
document.
|
||||
</li>
|
||||
<li>
|
||||
<code>reset()</code>
|
||||
takes you back to the position when <code>mark()</code> was last called, or to the
|
||||
beginning of the document if <code>mark()</code> has not been called.
|
||||
</li>
|
||||
<li>
|
||||
<code>skip(long n)</code>
|
||||
will take you to your current position + n (but not past the end of the document).
|
||||
</li>
|
||||
</ul>
|
||||
<p>The behavior of <code>available</code> means you can read in a document in a single read call
|
||||
like this:
|
||||
</p>
|
||||
<source><![CDATA[
|
||||
byte[] content = new byte[ stream.available() ];
|
||||
stream.read(content);
|
||||
stream.close();
|
||||
]]></source>
|
||||
<p>The combination of <code>mark</code>, <code>reset</code>, and <code>skip</code> provides the
|
||||
basic mechanisms needed for random access of the document contents.
|
||||
</p>
|
||||
</section>
|
||||
<section>
|
||||
<title>Reading a Document From the Root Directory</title>
|
||||
<p>If the document resides in the root directory, you can obtain a <code>DocumentInputStream
|
||||
</code> like this:
|
||||
</p>
|
||||
<source><![CDATA[
|
||||
// load file system
|
||||
try (DocumentInputStream stream = filesystem.createDocumentInputStream(documentName)) {
|
||||
// process data from stream
|
||||
} catch (IOException e) {
|
||||
// no such document, or the Entry represented by documentName is not a DocumentEntry
|
||||
}
|
||||
]]></source>
|
||||
</section>
|
||||
<section>
|
||||
<title>Reading a Document From an Arbitrary Directory</title>
|
||||
<p>A more generic technique for reading a document is to obtain an <code>
|
||||
org.apache.poi.poifs.filesystem.DirectoryEntry
|
||||
</code> instance for the directory containing the desired document (recall that you can use <code>
|
||||
getRoot()
|
||||
</code> to obtain the root directory from its file system). From that DirectoryEntry, you can
|
||||
then obtain a <code>DocumentInputStream</code> like this:
|
||||
</p>
|
||||
<source><![CDATA[
|
||||
DocumentEntry document = (DocumentEntry)directory.getEntry(documentName);
|
||||
DocumentInputStream stream = new DocumentInputStream(document);
|
||||
]]></source>
|
||||
</section>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Event-Driven Reading</title>
|
||||
<anchor id="reading_event"/>
|
||||
<p>The event-driven API for reading documents is a little more complicated and requires that your
|
||||
application know, in advance, which files it wants to read. The benefit of using this API is that
|
||||
each document is in memory just long enough for your application to read it, and documents that you
|
||||
never read at all are not in memory at all. When you're finished reading the documents you wanted,
|
||||
the file system has no data structures associated with it at all and can be discarded.
|
||||
</p>
|
||||
<section>
|
||||
<title>Preparation</title>
|
||||
<p>The preparation phase involves creating an instance of <code>
|
||||
org.apache.poi.poifs.eventfilesystem.POIFSReader
|
||||
</code> and then registering one or more <code>
|
||||
org.apache.poi.poifs.eventfilesystem.POIFSReaderListener
|
||||
</code> instances with the <code>POIFSReader</code>.
|
||||
</p>
|
||||
<source><![CDATA[
|
||||
POIFSReader reader = new POIFSReader();
|
||||
// register for everything
|
||||
reader.registerListener(myOmnivorousListener);
|
||||
// register for selective files
|
||||
reader.registerListener(myPickyListener, "foo");
|
||||
reader.registerListener(myPickyListener, "bar");
|
||||
// register for selective files in specific directories
|
||||
reader.registerListener(myOtherPickyListener, new POIFSDocumentPath(), "fubar");
|
||||
reader.registerListener(myOtherPickyListener, new POIFSDocumentPath( new String[] { "usr", "bin" } ), "fubar");
|
||||
]]></source>
|
||||
</section>
|
||||
<section>
|
||||
<title>POIFSReaderListener</title>
|
||||
<p>
|
||||
<code>org.apache.poi.poifs.eventfilesystem.POIFSReaderListener</code>
|
||||
is an interface used to register for documents. When a matching document is read by the <code>
|
||||
org.apache.poi.poifs.eventfilesystem.POIFSReader</code>, the <code>POIFSReaderListener</code> instance
|
||||
receives an <code>org.apache.poi.poifs.eventfilesystem.POIFSReaderEvent</code> instance, which
|
||||
contains an open <code>DocumentInputStream</code> and information about the document.
|
||||
</p>
|
||||
<p>A <code>POIFSReaderListener</code> instance can register for individual documents, or it can
|
||||
register for all documents; once it has registered for all documents, subsequent (and previous!)
|
||||
registration requests for individual documents are ignored. There is no way to unregister
|
||||
a <code>POIFSReaderListener</code>.
|
||||
</p>
|
||||
<p>Thus, it is possible to register a single <code>POIFSReaderListener</code> for multiple documents
|
||||
- one, some, or all documents. It is guaranteed that a single <code>POIFSReaderListener</code> will
|
||||
receive exactly one notification per registered document. There is no guarantee as to the order
|
||||
in which it will receive notification of its documents, as future implementations of <code>
|
||||
POIFSReader
|
||||
</code> are free to change the algorithm for walking the file system's directory structure.
|
||||
</p>
|
||||
<p>It is also permitted to register more than one <code>POIFSReaderListener</code> for the same
|
||||
document. There is no guarantee of ordering for notification of <code>POIFSReaderListener</code> instances
|
||||
that have registered for the same document when <code>POIFSReader</code> processes that
|
||||
document.
|
||||
</p>
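<p>The registration rules above can be modelled with a small registry. This is a minimal sketch using only
the Java standard library; <code>ListenerRegistrySketch</code> and its method names are hypothetical and are
not part of POI. It assumes that registering for all documents discards that listener's earlier selective
registrations and causes later ones to be ignored.
</p>
<source><![CDATA[
```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for the registration rules; not POI's implementation. */
final class ListenerRegistrySketch {
    // listeners registered for every document
    private final Set<Object> omnivorous = new LinkedHashSet<>();
    // listeners registered for individual documents, keyed by directory components plus name
    private final Map<List<String>, Set<Object>> selective = new HashMap<>();

    void registerForAll(Object listener) {
        omnivorous.add(listener);
        // previous selective registrations of this listener are dropped
        for (Set<Object> listeners : selective.values()) {
            listeners.remove(listener);
        }
    }

    void register(Object listener, List<String> directory, String name) {
        if (omnivorous.contains(listener)) {
            return; // subsequent selective registrations are ignored
        }
        selective.computeIfAbsent(key(directory, name), k -> new LinkedHashSet<>()).add(listener);
    }

    /** Listeners to notify for one document; each listener appears at most once. */
    Set<Object> listenersFor(List<String> directory, String name) {
        Set<Object> result = new LinkedHashSet<>(omnivorous);
        result.addAll(selective.getOrDefault(key(directory, name), new HashSet<>()));
        return result;
    }

    private static List<String> key(List<String> directory, String name) {
        List<String> key = new ArrayList<>(directory);
        key.add(name);
        return key;
    }
}
```
]]></source>
<p>Because the result is a set, a listener is notified once per document even if it was registered
for that document more than once.
</p>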
|
||||
<p>It is guaranteed that all notifications occur in the same thread. A future enhancement may be
|
||||
made to provide multi-threaded notifications, but such an enhancement would very probably be
|
||||
made in a new reader class, a <code>ThreadedPOIFSReader</code> perhaps.
|
||||
</p>
|
||||
<p>The following describes the three ways to register a <code>POIFSReaderListener</code> for a
|
||||
document or set of documents:
|
||||
</p>
|
||||
<dl>
|
||||
<dt>registers <em>listener</em> for all documents.
|
||||
</dt>
|
||||
<dd>registerListener(POIFSReaderListener <em>listener</em>)
|
||||
</dd>
|
||||
<dt>registers <em>listener</em> for a document with the specified <em>name</em> in the root
|
||||
directory.
|
||||
</dt>
|
||||
<dd>registerListener(POIFSReaderListener <em>listener</em>, String <em>name</em>)
|
||||
</dd>
|
||||
<dt>registers <em>listener</em> for a document with the specified <em>name</em> in the directory
|
||||
described by
|
||||
<em>path</em>
|
||||
</dt>
|
||||
<dd>registerListener(POIFSReaderListener <em>listener</em>, POIFSDocumentPath <em>path</em>,
|
||||
String <em>name</em>)
|
||||
</dd>
|
||||
</dl>
|
||||
</section>
|
||||
<section>
|
||||
<title>POIFSDocumentPath</title>
|
||||
<p>The <code>org.apache.poi.poifs.filesystem.POIFSDocumentPath</code> class is used to describe a
|
||||
directory in a POIFS file system. Since there are no reserved characters in the name of a file
|
||||
in a POIFS file system, a more traditional string-based solution for describing a directory,
|
||||
with special characters delimiting the components of the directory name, is not feasible. The
|
||||
constructors for the class are used as follows:
|
||||
</p>
|
||||
<table>
|
||||
<tr>
|
||||
<td>
|
||||
<em>Constructor example</em>
|
||||
</td>
|
||||
<td>
|
||||
<em>Directory described</em>
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>new POIFSDocumentPath()</td>
|
||||
<td>The root directory.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>new POIFSDocumentPath(null)</td>
|
||||
<td>The root directory.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>new POIFSDocumentPath(new String[ 0 ])</td>
|
||||
<td>The root directory.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>new POIFSDocumentPath(new String[ ] { "foo", "bar"} )</td>
|
||||
<td>in Unix terminology, "/foo/bar".</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>new POIFSDocumentPath(new POIFSDocumentPath(new String[] { "foo" }), new String[ ] {
|
||||
"fu", "bar"} )
|
||||
</td>
|
||||
<td>in Unix terminology, "/foo/fu/bar".</td>
|
||||
</tr>
|
||||
</table>
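<p>To see why a delimited string cannot describe a directory, consider what happens when a name itself
contains the delimiter. The sketch below uses only the Java standard library; the class
<code>PathAmbiguitySketch</code> is hypothetical and exists purely for illustration.
</p>
<source><![CDATA[
```java
import java.util.List;

/** Illustration only: why a delimiter-joined string cannot describe a POIFS directory. */
final class PathAmbiguitySketch {
    /** Joins components Unix-style; "/" is an ordinary character in POIFS names. */
    static String join(List<String> components) {
        return "/" + String.join("/", components);
    }

    /** True when two different component lists collapse to the same joined string. */
    static boolean collide(List<String> a, List<String> b) {
        return !a.equals(b) && join(a).equals(join(b));
    }
}
```
]]></source>
<p>For example, the component lists <code>{ "a/b", "c" }</code> and <code>{ "a", "b/c" }</code> are different
directories, yet both join to the same string. Keeping the components in an array avoids the ambiguity.
</p>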
|
||||
</section>
|
||||
<section>
|
||||
<title>Processing POIFSReaderEvent Events</title>
|
||||
<p>Processing <code>org.apache.poi.poifs.eventfilesystem.POIFSReaderEvent</code> events is
|
||||
relatively easy. After all of the <code>POIFSReaderListener</code> instances have been
|
||||
registered with <code>POIFSReader</code>, the <code>POIFSReader.read(InputStream stream)</code> method
|
||||
is called.
|
||||
</p>
|
||||
<p>Assuming that there are no problems with the data, as the <code>POIFSReader</code> processes the
|
||||
documents in the specified <code>InputStream</code>'s data, it calls registered <code>
|
||||
POIFSReaderListener
|
||||
</code> instances' <code>processPOIFSReaderEvent</code> method with a <code>POIFSReaderEvent
|
||||
</code> instance.
|
||||
</p>
|
||||
<p>The <code>POIFSReaderEvent</code> instance contains information to identify the document (a <code>
|
||||
POIFSDocumentPath
|
||||
</code> object to identify the directory that the document is in, and the document name), and an
|
||||
open <code>DocumentInputStream</code> instance from which to read the document.
|
||||
</p>
|
||||
</section>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Writing a File System</title>
|
||||
<anchor id="writing"/>
|
||||
<p>Writing a file system is very much like reading a file system in that there are multiple ways to do so.
|
||||
You can load an existing file system into memory and modify it (removing files, renaming files) and/or
|
||||
add new files to it, and write it, or you can start with a new, empty file system:
|
||||
</p>
|
||||
<source>
|
||||
POIFSFileSystem fs = new POIFSFileSystem();
|
||||
</source>
|
||||
<section>
|
||||
<title>The Naming of Names</title>
|
||||
<p>There are two restrictions on the names of files in a file system that must be considered when
|
||||
creating files:
|
||||
</p>
|
||||
<ol>
|
||||
<li>The name of the file must not exceed 31 characters. If it does, the POIFS API will silently
|
||||
truncate the name to fit.
|
||||
</li>
|
||||
<li>The name of the file must be unique within its containing directory. This seems pretty obvious,
|
||||
but if it isn't spelled out, there'll be hell to pay, to be sure. Uniqueness, of course, is
|
||||
determined <em>after</em> the name has been truncated, if the original name was too long to
|
||||
begin with.
|
||||
</li>
|
||||
</ol>
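<p>The interaction between the two rules can be sketched in plain Java. The class
<code>NameRulesSketch</code> below is hypothetical and only mirrors the behaviour described above
(truncate to 31 characters, then test for uniqueness); it does not call POI.
</p>
<source><![CDATA[
```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper mirroring the two naming rules; not part of POI. */
final class NameRulesSketch {
    static final int MAX_NAME_LENGTH = 31; // limit stated in rule 1

    /** Cuts a name down to the maximum length, leaving shorter names untouched. */
    static String truncate(String name) {
        return name.length() > MAX_NAME_LENGTH ? name.substring(0, MAX_NAME_LENGTH) : name;
    }

    /** Returns true if every name is still unique after truncation. */
    static boolean uniqueAfterTruncation(String... names) {
        Set<String> seen = new HashSet<>();
        for (String name : names) {
            if (!seen.add(truncate(name))) {
                return false;
            }
        }
        return true;
    }
}
```
]]></source>
<p>Two 32-character names that differ only in their last character are distinct as given, but collide
once both have been truncated.
</p>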
|
||||
</section>
|
||||
<section>
|
||||
<title>Creating a Document</title>
|
||||
<p>A document can be created by acquiring a <code>DirectoryEntry</code> and calling one of the two <code>
|
||||
createDocument
|
||||
</code> methods:
|
||||
</p>
|
||||
|
||||
<dl>
|
||||
<dt>createDocument(String name, InputStream stream)</dt>
|
||||
<dd>
|
||||
<ul>
|
||||
<li class="pro">Simple API</li>
|
||||
<li class="con">Increased memory footprint (document is in memory until file system is
|
||||
written).
|
||||
</li>
|
||||
</ul>
|
||||
</dd>
|
||||
<dt>createDocument(String name, int size, POIFSWriterListener writer)</dt>
|
||||
<dd>
|
||||
<ul>
|
||||
<li class="pro">Decreased memory footprint (only very small documents are held in memory,
|
||||
and then only for a short time).
|
||||
</li>
|
||||
<li class="con">More complex API.</li>
|
||||
<li class="con">Determining document size in advance may be difficult.</li>
|
||||
<li class="con">Lose control over when document is to be written.</li>
|
||||
</ul>
|
||||
</dd>
|
||||
</dl>
|
||||
|
||||
<p>Unlike reading, you don't have to choose between the in-memory and event-driven writing models; both
|
||||
can co-exist in the same file system.
|
||||
</p>
|
||||
<p>Writing is initiated when the <code>POIFSFileSystem</code> instance's <code>writeFilesystem()</code> method
|
||||
is called with an <code>OutputStream</code> to write to.
|
||||
</p>
|
||||
<p>The event-driven model is quite similar to the event-driven model for reading, in that the file
|
||||
system calls your <code>org.apache.poi.poifs.filesystem.POIFSWriterListener</code> when it's time to
|
||||
write your document, just as the <code>POIFSReader</code> calls your <code>POIFSReaderListener
|
||||
</code> when it's time to read your document. Internally, when <code>writeFilesystem()</code> is
|
||||
called, the final POIFS data structures are created and are written to the specified <code>
|
||||
OutputStream</code>. When the file system needs to write a document out that was created with
|
||||
the event-driven model, it calls the <code>POIFSWriterListener</code> back, calling its <code>
|
||||
processPOIFSWriterEvent()
|
||||
</code> method, passing an <code>org.apache.poi.poifs.filesystem.POIFSWriterEvent</code> instance.
|
||||
This object contains the <code>POIFSDocumentPath</code> and name of the document, its size, and an
|
||||
open <code>org.apache.poi.poifs.filesystem.DocumentOutputStream</code> to which to write. A <code>
|
||||
DocumentOutputStream
|
||||
</code> is a wrapper over the <code>OutputStream</code> that was provided to the <code>
|
||||
POIFSFileSystem
|
||||
</code> to write to, and has the responsibility of making sure that the document your application
|
||||
writes fits within the size you specified for it.
|
||||
</p>
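<p>The size check performed by the wrapper can be pictured as follows. This is a sketch under stated
assumptions, not POI's code: <code>BoundedOutputStreamSketch</code> is a hypothetical class, and it assumes
that writing past the declared size is reported as an <code>IOException</code>; the real
<code>DocumentOutputStream</code> may enforce the limit differently.
</p>
<source><![CDATA[
```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical wrapper that refuses to write more than a declared number of bytes. */
final class BoundedOutputStreamSketch extends FilterOutputStream {
    private final long limit;
    private long written;

    BoundedOutputStreamSketch(OutputStream target, long limit) {
        super(target);
        this.limit = limit;
    }

    @Override
    public void write(int b) throws IOException {
        if (written + 1 > limit) {
            throw new IOException("document exceeds declared size of " + limit + " bytes");
        }
        out.write(b);
        written++;
    }

    @Override
    public void write(byte[] buf, int off, int len) throws IOException {
        if (written + len > limit) {
            throw new IOException("document exceeds declared size of " + limit + " bytes");
        }
        out.write(buf, off, len);
        written += len;
    }

    /** Bytes that may still be written before the limit is reached. */
    long remaining() {
        return limit - written;
    }
}
```
]]></source>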
|
||||
<p>If you are using a <code>POIFSFileSystem</code> loaded from a
|
||||
<code>File</code>
|
||||
with <code>readOnly</code> set to false, it is also possible to do an in-place write. Simply call <code>
|
||||
writeFilesystem()
|
||||
</code> to have the (limited) in-memory structures synced with the disk, then <code>close()</code> to
|
||||
finish.
|
||||
</p>
|
||||
</section>
|
||||
<section>
|
||||
<title>Creating a Directory</title>
|
||||
<p>Creating a directory is similar to creating a document, except that there's only one way to do so:
|
||||
</p>
|
||||
<source>
|
||||
DirectoryEntry createdDir = existingDir.createDirectory(name);
|
||||
</source>
|
||||
</section>
|
||||
<section>
|
||||
<title>Using POIFSFileSystem Directly To Create a Document Or Directory</title>
|
||||
<p>As with reading documents, it is possible to create a new document or directory in the root directory
|
||||
by using convenience methods of POIFSFileSystem.
|
||||
</p>
|
||||
<table>
|
||||
<tr>
|
||||
<td>
|
||||
<em>DirectoryEntry Method Signature</em>
|
||||
</td>
|
||||
<td>
|
||||
<em>POIFSFileSystem Method Signature</em>
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>createDocument(String name, InputStream stream)</td>
|
||||
<td>createDocument(InputStream stream, String name)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>createDocument(String name, int size, POIFSWriterListener writer)</td>
|
||||
<td>createDocument(String name, int size, POIFSWriterListener writer)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>createDirectory(String name)</td>
|
||||
<td>createDirectory(String name)</td>
|
||||
</tr>
|
||||
</table>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Modifying a File System</title>
|
||||
<anchor id="modifying"/>
|
||||
<p>It is possible to modify an existing POIFS file system, whether it's one your application has loaded into
|
||||
memory, or one which you are creating on the fly.
|
||||
</p>
|
||||
<section>
|
||||
<title>Removing a Document</title>
|
||||
<p>Removing a document is simple: you get the <code>Entry</code> corresponding to the document and call
|
||||
its <code>delete()</code> method. This is a boolean method, but should always return <code>
|
||||
true</code>, indicating that the operation succeeded.
|
||||
</p>
|
||||
</section>
|
||||
<section>
|
||||
<title>Removing a Directory</title>
|
||||
<p>Removing a directory is also simple: you get the <code>Entry</code> corresponding to the directory
|
||||
and call its <code>delete()</code> method. This is a boolean method, but, unlike deleting a
|
||||
document, it may return <code>false</code>, indicating that the operation failed. Here are
|
||||
the reasons why the operation may fail:
|
||||
</p>
|
||||
<ul>
|
||||
<li>The directory still has files in it (to check, call <code>isEmpty()</code> on its
|
||||
DirectoryEntry; is the return value <code>false</code>?)
|
||||
</li>
|
||||
<li>The directory is the root directory. You cannot remove the root directory.</li>
|
||||
</ul>
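<p>The two failure cases can be modelled with a small in-memory tree. The class <code>DirSketch</code>
below is hypothetical and uses only the Java standard library; it is meant to show the rules, not how
POI's <code>DirectoryEntry</code> is implemented.
</p>
<source><![CDATA[
```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical in-memory model of the delete rules; not POI's DirectoryEntry. */
final class DirSketch {
    private final DirSketch parent; // null for the root directory
    private final String name;
    private final Map<String, DirSketch> children = new LinkedHashMap<>();

    private DirSketch(DirSketch parent, String name) {
        this.parent = parent;
        this.name = name;
    }

    static DirSketch root() {
        return new DirSketch(null, "(root)");
    }

    DirSketch createDirectory(String childName) {
        DirSketch child = new DirSketch(this, childName);
        children.put(childName, child);
        return child;
    }

    boolean isEmpty() {
        return children.isEmpty();
    }

    /** Fails for the root directory and for any directory that still has entries. */
    boolean delete() {
        if (parent == null || !isEmpty()) {
            return false;
        }
        parent.children.remove(name);
        return true;
    }
}
```
]]></source>
<p>In this model a nested directory has to be emptied from the inside out: delete the innermost entries
first, then their parents.
</p>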
|
||||
</section>
|
||||
<section>
|
||||
<title>Changing a File's contents</title>
|
||||
<p>There are two ways available to change the contents of an existing file within a POIFS file system.
|
||||
One is using a <code>DocumentOutputStream</code>, the other is with
|
||||
<code>POIFSDocument.replaceContents</code>.
|
||||
</p>
|
||||
<p>If you have an <code>InputStream</code> from which to read the new file contents, then the
|
||||
easiest way is via
|
||||
<code>POIFSDocument.replaceContents</code>. You would do something like:
|
||||
</p>
|
||||
<source><![CDATA[
|
||||
// Get the input stream from somewhere
|
||||
InputStream inp = db.getContentStream();
|
||||
|
||||
// Open the POIFS File System, and get the entry of interest
|
||||
|
||||
POIFSFileSystem fs = new POIFSFileSystem(new File(filename), false);
|
||||
DirectoryEntry root = fs.getRoot();
|
||||
DocumentEntry myDocE = (DocumentEntry)root.getEntry("ToChange");
|
||||
|
||||
// Replace the contents
|
||||
POIFSDocument myDoc = new POIFSDocument(myDocE);
|
||||
myDoc.replaceContents(inp);
|
||||
|
||||
// Save the changes to the file in-place
|
||||
fs.writeFilesystem();
|
||||
fs.close();
|
||||
]]></source>
|
||||
<p>Alternatively, if you have a byte array, or you need to write as you go along, then the
|
||||
OutputStream interface provided by
|
||||
<code>DocumentOutputStream</code>
|
||||
will likely be a better bet. Your code would look something like:
|
||||
</p>
|
||||
<source><![CDATA[
|
||||
// Open the POIFS File System, and get the entry of interest
|
||||
try (POIFSFileSystem fs = new POIFSFileSystem(new File(filename))) {
|
||||
DirectoryEntry root = fs.getRoot();
|
||||
DocumentEntry myDoc = (DocumentEntry)root.getEntry("ToChange");
|
||||
|
||||
// Replace the content by writing to the document's output stream
|
||||
try (DocumentOutputStream docOut = new DocumentOutputStream(myDoc)) {
|
||||
docOut.write(newContents); // newContents: hypothetical byte[] holding the replacement data
|
||||
}
|
||||
|
||||
// Save the changes to a new file
|
||||
try (FileOutputStream out = new FileOutputStream("NewFile.ole2")) {
|
||||
fs.writeFilesystem(out);
|
||||
}
|
||||
}
|
||||
]]></source>
|
||||
<p>For an example of an in-place change to one stream within a file, you can see the example
|
||||
<a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hpsf/ModifyDocumentSummaryInformation.java">
|
||||
org/apache/poi/examples/hpsf/ModifyDocumentSummaryInformation.java
|
||||
</a>
|
||||
</p>
|
||||
</section>
|
||||
<section>
|
||||
<title>Renaming a File</title>
|
||||
<p>Regardless of whether the file is a directory or a document, it can be renamed, with one exception -
|
||||
the root directory has a special name that is expected by the components of a major software
|
||||
vendor's office suite, and the POIFS API will not let that name be changed. Renaming is done by
|
||||
acquiring the file's corresponding <code>Entry</code> instance and calling its <code>renameTo</code> method,
|
||||
passing in the new name.
|
||||
</p>
|
||||
<p>Like <code>delete</code>, <code>renameTo</code> returns <code>true</code> if the operation succeeded,
|
||||
otherwise <code>false</code>. Reasons for failure include these:
|
||||
</p>
|
||||
<ul>
|
||||
<li>The new name is the same as another file in the same directory. And don't forget - if the new
|
||||
name is longer than 31 characters, it <em>will</em> be silently truncated. In its original
|
||||
length, the new name may have been unique, but truncated to 31 characters, it may not be unique
|
||||
any longer.
|
||||
</li>
|
||||
<li>You tried to rename the root directory.</li>
|
||||
</ul>
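<p>The collision case is easy to overlook, so here is a minimal sketch of it in plain Java.
<code>RenameSketch</code> is a hypothetical class that tracks the names in a single directory; it assumes
the silent truncation to 31 characters described above and does not model the root directory.
</p>
<source><![CDATA[
```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of renaming within one directory; not POI's Entry.renameTo. */
final class RenameSketch {
    private static final int MAX_NAME_LENGTH = 31;
    private final Set<String> names = new HashSet<>();

    private static String truncate(String name) {
        return name.length() > MAX_NAME_LENGTH ? name.substring(0, MAX_NAME_LENGTH) : name;
    }

    /** Adds an entry; returns false if the (truncated) name is already taken. */
    boolean add(String name) {
        return names.add(truncate(name));
    }

    /** Returns false if oldName is unknown or the truncated new name is already in use. */
    boolean renameTo(String oldName, String newName) {
        String stored = truncate(newName);
        if (!names.contains(oldName) || names.contains(stored)) {
            return false;
        }
        names.remove(oldName);
        names.add(stored);
        return true;
    }
}
```
]]></source>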
|
||||
</section>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
58
src/documentation/content/xdocs/components/poifs/index.xml
Normal file
@ -0,0 +1,58 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
<document>
|
||||
<header>
|
||||
<title>Apache POI™ - POIFS - Java implementation of the OLE 2 Compound Document format</title>
|
||||
<subtitle>Overview</subtitle>
|
||||
<authors>
|
||||
<person name="Andrew C. Oliver" email="acoliver@apache.org"/>
|
||||
<person name="Nicola Ken Barozzi" email="barozzi@nicolaken.com"/>
|
||||
</authors>
|
||||
</header>
|
||||
<body>
|
||||
<section><title>Overview</title>
|
||||
<p>POIFS is a pure Java implementation of the OLE 2 Compound
|
||||
Document format.</p>
|
||||
<p>By definition, all APIs developed by the POI project are
|
||||
based somehow on the POIFS API.</p>
|
||||
<p>A common point of confusion is just what POIFS buys you or what the OLE
|
||||
2 Compound Document format is exactly. POIFS does not buy you
|
||||
DOC, or XLS, but is necessary to generate or read DOC or XLS
|
||||
files. You see, all file formats based on the OLE 2 Compound
|
||||
Document Format have a common structure. The OLE 2 Compound
|
||||
Document Format is essentially a convoluted archive
|
||||
format. Think of POIFS as a "zip" library. Once you can get
|
||||
the data in a zip file you still need to interpret the
|
||||
data. As a general rule, while all of our formats <em>use</em>
|
||||
POIFS, most of them attempt to abstract you from it. There
|
||||
are some circumstances where this is not possible, but as a
|
||||
general rule this is true.</p>
|
||||
<p>If you're an end user type just looking to generate XLS
|
||||
files, then you'd be looking for HSSF not POIFS; however, if
|
||||
you have legacy code that uses MFC property sets, POIFS is
|
||||
for you! Regardless, you may or may not need to know how to
|
||||
use POIFS but ultimately if you use technologies that come
|
||||
from the POI project, you're using POIFS underneath. Perhaps
|
||||
we should have a branding campaign "POIFS Inside!". ;-)</p>
|
||||
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
653
src/documentation/content/xdocs/components/poifs/usecases.xml
Normal file
@ -0,0 +1,653 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
<document>
|
||||
<header>
|
||||
<title>POIFS Use Cases</title>
|
||||
<authors>
|
||||
<person email="mjohnson@apache.org" name="Marc Johnson" id="MJ"/>
|
||||
</authors>
|
||||
</header>
|
||||
<body>
|
||||
<section><title>POIFS Use Cases</title>
|
||||
<section><title>Use Case 1: Read existing file system</title>
|
||||
<table>
|
||||
<tr>
|
||||
<td><em>Primary Actor:</em></td>
|
||||
<td>POIFS client</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Scope:</em></td>
|
||||
<td>POIFS</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Level:</em></td>
|
||||
<td>Summary</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Stakeholders and Interests:</em></td>
|
||||
<td>
|
||||
POIFS client - wants to read the content of the file
|
||||
system<br/>
|
||||
POIFS - understands POIFS file system
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Precondition:</em></td>
|
||||
<td>None</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Minimal Guarantee:</em></td>
|
||||
<td>None</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Main Success Guarantee:</em></td>
|
||||
<td>
|
||||
1. POIFS client requests POIFS to read a POIFS file
|
||||
system, providing an
|
||||
<code>InputStream</code>
|
||||
containing the POIFS file system in question.<br/>
|
||||
2. POIFS reads from the
|
||||
<code>InputStream</code> in
|
||||
512 byte blocks.<br/>
|
||||
3. POIFS verifies that the first block begins with
|
||||
the well known signature
|
||||
(
|
||||
<code>0xE11AB1A1E011CFD0</code>)<br/>
|
||||
4. POIFS reads the Block Allocation Table from the
|
||||
first block and, if necessary, from the XBAT
|
||||
blocks.<br/>
|
||||
5. POIFS obtains the start block of the Property
|
||||
Table and reads the Property Table (use case 9,
|
||||
read file)<br/>
|
||||
6. POIFS reads the individual entries in the Property
|
||||
Table<br/>
|
||||
7. POIFS obtains the start block of the Small Block
|
||||
Allocation Table and reads the Small Block
|
||||
Allocation Table (use case 9, read file)<br/>
|
||||
8. POIFS obtains the start block of the Small Block
|
||||
store from the first entry in the Property Table
|
||||
and reads the Small Block Array (use case 9, read
|
||||
file)<br/>
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Extensions:</em></td>
|
||||
<td>
|
||||
2a. If the last block read is not a 512 byte
|
||||
block, the
|
||||
<code>InputStream</code> is not that of
|
||||
a POIFS file system, and POIFS throws an
|
||||
appropriate exception.
|
||||
<br/>
|
||||
3a. If the signature is incorrect, the
|
||||
<code>InputStream</code> is not that of a POIFS
|
||||
file system, and POIFS throws an appropriate
|
||||
exception.<br/>
|
||||
</td>
|
||||
</tr>
|
||||
</table>
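<p>Steps 2 and 3, together with extensions 2a and 3a, can be sketched with the Java standard library alone.
<code>BlockReaderSketch</code> is a hypothetical class, not POI's reader. It assumes the signature is stored
as a little-endian 64-bit value in the first eight bytes of the first block, and it reports both error cases
as <code>IOException</code>.
</p>
<source><![CDATA[
```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of block-wise reading and signature verification. */
final class BlockReaderSketch {
    static final int BLOCK_SIZE = 512;
    static final long SIGNATURE = 0xE11AB1A1E011CFD0L;

    /** Reads the whole stream as 512-byte blocks; a short final block is an error (2a). */
    static List<byte[]> readBlocks(InputStream in) throws IOException {
        List<byte[]> blocks = new ArrayList<>();
        while (true) {
            byte[] block = new byte[BLOCK_SIZE];
            int filled = 0;
            while (filled < BLOCK_SIZE) {
                int n = in.read(block, filled, BLOCK_SIZE - filled);
                if (n < 0) {
                    break; // end of stream
                }
                filled += n;
            }
            if (filled == 0) {
                return blocks; // clean end on a block boundary
            }
            if (filled < BLOCK_SIZE) {
                throw new IOException("incomplete block of " + filled + " bytes");
            }
            blocks.add(block);
        }
    }

    /** The first eight bytes, read little-endian, must equal the signature (3a). */
    static void verifySignature(byte[] firstBlock) throws IOException {
        long actual = ByteBuffer.wrap(firstBlock).order(ByteOrder.LITTLE_ENDIAN).getLong(0);
        if (actual != SIGNATURE) {
            throw new IOException("invalid signature: 0x" + Long.toHexString(actual));
        }
    }
}
```
]]></source>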
|
||||
</section>
|
||||
<section><title>Use Case 2: Write file system</title>
|
||||
<table>
|
||||
<tr>
|
||||
<th>Primary Actor:</th>
|
||||
<td>POIFS client</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Scope:</th>
|
||||
<td>POIFS</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Level:</th>
|
||||
<td>Summary</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Stakeholders and Interests:</th>
|
||||
<td>
|
||||
POIFS client - wants to write file system out.<br/>
|
||||
POIFS - knows how to write file system out.
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Precondition:</th>
|
||||
<td>
|
||||
File system has been read (use case 1, read
|
||||
existing file system) and subsequently modified
|
||||
(use case 4, replace file in file system; use case
|
||||
5, delete file from file system; or use case 6,
|
||||
write new file to file system; in any
|
||||
combination)
|
||||
<br/>or<br/>
|
||||
File system has been created (use case 3, create
|
||||
new file system)
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Minimal Guarantee:</th>
|
||||
<td>None</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Main Success Guarantee:</th>
|
||||
<td>
|
||||
1. POIFS client provides an
|
||||
<code>OutputStream</code>
|
||||
to write the file system to.
|
||||
<br/>
|
||||
2. POIFS gets the sizes of the Property Table and
|
||||
each file in the file system.<br/>
|
||||
3. If any file in the file system requires storage
|
||||
in a Small Block Array, POIFS creates a Small
|
||||
Block Array of sufficient size to hold all of the
|
||||
small files.<br/>
|
||||
4. POIFS calculates the number of big blocks needed
|
||||
to hold all of the large files, the Property
|
||||
Table, and, if necessary, the Small Block Array
|
||||
and the Small Block Allocation Table.<br/>
|
||||
5. POIFS creates a set of big blocks sufficient to
|
||||
store the Block Allocation Table<br/>
|
||||
6. POIFS creates and writes the header block<br/>
|
||||
7. POIFS writes out the XBAT blocks, if needed.<br/>
|
||||
8. POIFS writes out the Small Block Array, if
|
||||
needed<br/>
|
||||
9. POIFS writes out the Small Block Allocation Table,
|
||||
if needed<br/>
|
||||
10. POIFS writes out the Property Table<br/>
|
||||
11. POIFS writes out the large files, if needed<br/>
|
||||
12. POIFS closes the <code>OutputStream</code>.
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Extensions:</th>
|
||||
<td>
|
||||
6a. Exceptions writing to the
|
||||
<code>OutputStream</code> will be propagated back
|
||||
to the POIFS client.
|
||||
<br/>
|
||||
7a. Exceptions writing to the
|
||||
<code>OutputStream</code> will be propagated back
|
||||
to the POIFS client.
|
||||
<br/>
|
||||
8a. Exceptions writing to the
|
||||
<code>OutputStream</code> will be propagated back
|
||||
to the POIFS client.
|
||||
<br/>
|
||||
9a. Exceptions writing to the
|
||||
<code>OutputStream</code> will be propagated back
|
||||
to the POIFS client.
|
||||
<br/>
|
||||
10a. Exceptions writing to the
|
||||
<code>OutputStream</code> will be propagated back
|
||||
to the POIFS client.
|
||||
<br/>
|
||||
11a. Exceptions writing to the
|
||||
<code>OutputStream</code> will be propagated back
|
||||
to the POIFS client.
|
||||
<br/>
|
||||
12a. Exceptions closing the
|
||||
<code>OutputStream</code> will be propagated back
|
||||
to the POIFS client.
|
||||
<br/>
|
||||
</td>
|
||||
</tr>
|
||||
</table>
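<p>The block arithmetic behind steps 4 and 5 can be sketched as follows. <code>BlockCountSketch</code> is a
hypothetical class. It assumes 512-byte big blocks, 4-byte allocation table entries (so one table block
indexes 128 blocks), and that the Block Allocation Table must also index its own blocks; XBAT blocks and
the small-block structures are ignored to keep the sketch short.
</p>
<source><![CDATA[
```java
/** Hypothetical block arithmetic; ignores XBAT blocks and small-block storage. */
final class BlockCountSketch {
    static final int BLOCK_SIZE = 512;
    // Assumption: each allocation table entry is a 4-byte int, so one block holds 128 entries.
    static final int ENTRIES_PER_BAT_BLOCK = BLOCK_SIZE / 4;

    /** Number of big blocks needed to hold sizeInBytes bytes (rounded up). */
    static long blocksFor(long sizeInBytes) {
        return (sizeInBytes + BLOCK_SIZE - 1) / BLOCK_SIZE;
    }

    /** Table blocks needed to index dataBlocks plus the table blocks themselves. */
    static long batBlocksFor(long dataBlocks) {
        long bat = 0;
        // Grow the table until its capacity covers the data blocks and its own blocks.
        while (bat * ENTRIES_PER_BAT_BLOCK < dataBlocks + bat) {
            bat++;
        }
        return bat;
    }
}
```
]]></source>
<p>Under these assumptions, 127 data blocks fit in a single table block, while 128 data blocks need two,
because the table block itself consumes one entry.
</p>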
|
||||
</section>
|
||||
<section><title>Use Case 3: Create new file system</title>
|
||||
<table>
|
||||
<tr>
|
||||
<th>Primary Actor:</th>
|
||||
<td>POIFS client</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Scope:</th>
|
||||
<td>POIFS</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Level:</th>
|
||||
<td>Summary</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Stakeholders and Interests:</th>
|
||||
<td>
|
||||
POIFS client - wants to create a new file
|
||||
system<br/>
|
||||
POIFS - knows how to create a new file system
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Precondition:</th>
|
||||
<td>None</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Minimal Guarantee:</th>
|
||||
<td>None</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Main Success Guarantee:</th>
|
||||
<td>
|
||||
POIFS creates an empty Property Table.
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Extensions:</th>
|
||||
<td>None</td>
|
||||
</tr>
|
||||
</table>
|
||||
</section>
|
||||
<section><title>Use Case 4: Replace file in file system</title>
|
||||
<table>
|
||||
<tr>
|
||||
<td><em>Primary Actor:</em></td>
|
||||
<td>POIFS client</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Scope:</em></td>
|
||||
<td>POIFS</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Level:</em></td>
|
||||
<td>Summary</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Stakeholders and Interests:</em></td>
|
||||
<td>
|
||||
1. POIFS client - wants to replace an existing file in
|
||||
the file system<br/>
|
||||
2. POIFS - knows how to manage the file system
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Precondition:</em></td>
|
||||
<td>
|
||||
Either
|
||||
<br/><br/>
|
||||
The file system has been read (use case 1, read
|
||||
existing file system) and a file has been
|
||||
extracted from the file system (use case 7, read
|
||||
existing file from file system)
|
||||
<br/><br/>or<br/><br/>
|
||||
The file system has been created (use case 3,
|
||||
create new file system) and a file has been
|
||||
written to the file system (use case 6, write new
|
||||
file to file system)
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Minimal Guarantee:</em></td>
|
||||
<td>None</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Main Success Guarantee:</em></td>
|
||||
<td>
|
||||
1. POIFS discards storage of the existing file.<br/>
|
||||
2. POIFS updates the existing file's entry in the
|
||||
Property Table<br/>
|
||||
3. POIFS stores the new file's data
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Extensions:</em></td>
|
||||
<td>
|
||||
1a. POIFS throws an exception if the file does not
|
||||
exist.
|
||||
</td>
|
||||
</tr>
|
||||
</table>
|
||||
</section>
|
||||
<section><title>Use Case 5: Delete file from file system</title>
|
||||
<table>
|
||||
<tr>
|
||||
<td><em>Primary Actor:</em></td>
|
||||
<td>POIFS client</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Scope:</em></td>
|
||||
<td>POIFS</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Level:</em></td>
|
||||
<td>Summary</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Stakeholders and Interests:</em></td>
|
||||
<td>
|
||||
* POIFS client - wants to remove a file from a file
|
||||
system<br/>
|
||||
* POIFS - knows how to manage the file system
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Precondition:</em></td>
|
||||
<td>
|
||||
Either<br/><br/>
|
||||
The file system has been read (use case 1, read
|
||||
existing file system) and a file has been
|
||||
extracted from the file system (use case 7, read
|
||||
existing file from file system)<br/>
|
||||
<br/>
|
||||
or<br/>
|
||||
<br/>
|
||||
The file system has been created (use case 3,
|
||||
create new file system) and a file has been
|
||||
written to the file system (use case 6, write new
|
||||
file to file system)
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Minimal Guarantee:</em></td>
|
||||
<td>None</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Main Success Guarantee:</em></td>
|
||||
<td>
|
||||
1. POIFS discards the specified file's storage.<br/>
|
||||
2. POIFS discards the file's Property Table
|
||||
entry.
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Extensions:</em></td>
|
||||
<td>
|
||||
1a. POIFS throws an exception if the file does not
|
||||
exist.
|
||||
</td>
|
||||
</tr>
|
||||
</table>
|
||||
</section>
|
||||
<section><title>Use Case 6: Write new file to file system</title>
|
||||
<table>
|
||||
<tr>
|
||||
<td><em>Primary Actor:</em></td>
|
||||
<td>POIFS client</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Scope:</em></td>
|
||||
<td>POIFS</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Level:</em></td>
|
||||
<td>Summary</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Stakeholders and Interests:</em></td>
|
||||
<td>
|
||||
* POIFS client - wants to add a new file to the file
|
||||
system<br/>
|
||||
* POIFS - knows how to manage the file system
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Precondition:</em></td>
|
||||
<td>The specified file does not yet exist in the file
|
||||
system</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Minimal Guarantee:</em></td>
|
||||
<td>None</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Main Success Guarantee:</em></td>
|
||||
<td>
|
||||
1. The POIFS client provides a file name<br/>
|
||||
2. POIFS creates a new Property Table entry for the
|
||||
new file<br/>
|
||||
3. POIFS provides the POIFS client with an
|
||||
<code>OutputStream</code> to write to.<br/>
|
||||
4. The POIFS client writes data to the provided
|
||||
<code>OutputStream</code>.<br/>
|
||||
5. The POIFS client closes the provided
|
||||
<code>OutputStream</code><br/>
|
||||
6. POIFS updates the Property Table entry with the
|
||||
new file's size
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Extensions:</em></td>
|
||||
<td>
|
||||
1a. POIFS throws an exception if a file with the
|
||||
specified name already exists in the file
|
||||
system.<br/>
|
||||
1b. POIFS throws an exception if the file name is
|
||||
too long. The limit on file name length is 31
|
||||
characters.
|
||||
</td>
|
||||
</tr>
|
||||
</table>
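<p>
The main success scenario and extensions 1a and 1b can be illustrated with a small, self-contained sketch.
<code>ToyFileSystem</code> and its methods are hypothetical names invented for this illustration and are not
part of the POIFS API. The Property Table is reduced to a map from file name to size, and the use of
<code>IllegalArgumentException</code> is an assumption, since the use case only says that an exception is thrown.
</p>
<source>
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a POIFS file system; NOT the real POIFS API.
class ToyFileSystem {
    // Name length limit taken from extension 1b.
    static final int MAX_NAME_LENGTH = 31;

    // Simplified Property Table: file name -> file size in bytes.
    private final Map<String, Integer> propertyTable = new HashMap<>();

    // Steps 1-3: check the name, create the entry, return a stream to write to.
    OutputStream createFile(final String name) {
        if (name.length() > MAX_NAME_LENGTH) {
            throw new IllegalArgumentException("name too long: " + name);  // 1b
        }
        if (propertyTable.containsKey(name)) {
            throw new IllegalArgumentException("file exists: " + name);    // 1a
        }
        propertyTable.put(name, 0);  // step 2: entry exists, size not yet known
        return new ByteArrayOutputStream() {
            @Override
            public void close() throws IOException {
                super.close();
                propertyTable.put(name, size());  // step 6: store the final size
            }
        };
    }

    // Size recorded in the Property Table, or -1 if there is no such file.
    int sizeOf(String name) {
        return propertyTable.getOrDefault(name, -1);
    }

    public static void main(String[] args) throws IOException {
        ToyFileSystem fs = new ToyFileSystem();
        try (OutputStream out = fs.createFile("Workbook")) {  // steps 4 and 5
            out.write(new byte[] {1, 2, 3});
        }
        System.out.println(fs.sizeOf("Workbook"));  // prints 3
    }
}
</source>
<p>
Recording the size when the stream is closed mirrors step 6: in this sketch the entry only learns the
final size of the file once the client has finished writing.
</p>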
|
||||
</section>
|
||||
<section><title>Use Case 7: Read existing file from file system</title>
|
||||
<table>
|
||||
<tr>
|
||||
<td><em>Primary Actor:</em></td>
|
||||
<td>POIFS client</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Scope:</em></td>
|
||||
<td>POIFS</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Level:</em></td>
|
||||
<td>Summary</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Stakeholders and Interests:</em></td>
|
||||
<td>
|
||||
* POIFS client - wants to read a file from the file
|
||||
system<br/>
|
||||
* POIFS - knows how to manage the file system
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Precondition:</em></td>
|
||||
<td>
|
||||
* The file system has been read (use case 1, read
|
||||
existing file system) or has been created and
|
||||
written to (use case 3, create new file system;
|
||||
use case 6, write new file to file system).<br/>
|
||||
* The specified file exists in the file system.
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Minimal Guarantee:</em></td>
|
||||
<td>None</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Main Success Guarantee:</em></td>
|
||||
<td>
|
||||
1. The POIFS client provides the name of a file to be read.<br/>
2. POIFS provides an <code>InputStream</code> to read from.<br/>
3. The POIFS client reads from the <code>InputStream</code>.<br/>
4. The POIFS client closes the <code>InputStream</code>.
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Extensions:</em></td>
|
||||
<td>1a. POIFS throws an exception if no file with the
|
||||
specified name exists.</td>
|
||||
</tr>
|
||||
</table>
|
||||
</section>
|
||||
<section><title>Use Case 8: Read file system directory</title>
|
||||
<table>
|
||||
<tr>
|
||||
<td><em>Primary Actor:</em></td>
|
||||
<td>POIFS client</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Scope:</em></td>
|
||||
<td>POIFS</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Level:</em></td>
|
||||
<td>Summary</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Stakeholders and Interests:</em></td>
|
||||
<td>
|
||||
* POIFS client - wants to know what files exist in
|
||||
the file system<br/>
|
||||
* POIFS - knows how to manage the file system
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Precondition:</em></td>
|
||||
<td>The file system has been read (use case 1, read
|
||||
existing file system) or created (use case 3, create
|
||||
new file system)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Minimal Guarantee:</em></td>
|
||||
<td>None</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Main Success Guarantee:</em></td>
|
||||
<td>
|
||||
1. The POIFS client requests the file system
|
||||
directory.<br/>
|
||||
2. POIFS returns an <code>Iterator</code>. The
|
||||
<code>Iterator</code> will not include the root
|
||||
entry in the Property Table, and may be an
|
||||
<code>Iterator</code> over an empty
|
||||
<code>Collection</code>.
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Extensions:</em></td>
|
||||
<td>None</td>
|
||||
</tr>
|
||||
</table>
|
||||
</section>
|
||||
<section><title>Use Case 9: Read file</title>
|
||||
<table>
|
||||
<tr>
|
||||
<td><em>Primary Actor:</em></td>
|
||||
<td>POIFS</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Scope:</em></td>
|
||||
<td>POIFS</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Level:</em></td>
|
||||
<td>Summary</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Stakeholders and Interests:</em></td>
|
||||
<td>
|
||||
POIFS - POIFS needs to read a file, or something
|
||||
resembling a file (i.e., the Property Table, the
|
||||
Small Block Array, or the Small Block Allocation
|
||||
Table)
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Precondition:</em></td>
|
||||
<td>None</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Minimal Guarantee:</em></td>
|
||||
<td>None</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Main Success Guarantee:</em></td>
|
||||
<td>
|
||||
1. POIFS begins with a start block, a file size, and
|
||||
a flag indicating whether to use the Big Block
|
||||
Allocation Table or the Small Block Allocation
|
||||
Table<br/>
|
||||
2. POIFS returns an <code>InputStream</code>.<br/>
|
||||
3. Reads from the <code>InputStream</code> are
|
||||
performed by walking the specified Block
|
||||
Allocation Table and reading the blocks
|
||||
indicated.<br/>
|
||||
4. POIFS closes the <code>InputStream</code> when it has
finished reading the file, or when its client wants to
close the <code>InputStream</code>.
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Extensions:</em></td>
|
||||
<td>3a. An exception will be thrown if the specified Block
|
||||
Allocation Table is corrupt, as evidenced by an index
|
||||
pointing to a non-existent block, or by a chain
|
||||
extending past the known size of the file.</td>
|
||||
</tr>
|
||||
</table>
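<p>
Step 3 and extension 3a can be sketched as a walk along a chain of block indices. The code below is a
simplified illustration, not the POIFS implementation: <code>BlockChainReader</code> is a hypothetical name,
the blocks are held in memory as byte arrays, the Block Allocation Table is assumed to have one entry per
block, and the end-of-chain marker value is an assumption made for this example.
</p>
<source>
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Simplified illustration of walking a Block Allocation Table; NOT the POIFS implementation.
class BlockChainReader {
    // Assumed marker meaning "no further block in this chain".
    static final int END_OF_CHAIN = -2;

    // blocks[i] is the content of block i; bat[i] is the block that follows block i.
    // Assumes bat has one entry per block.
    static byte[] read(byte[][] blocks, int[] bat, int startBlock, int fileSize) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int remaining = fileSize;
        int block = startBlock;
        while (block != END_OF_CHAIN) {
            if (block < 0 || block >= blocks.length) {
                throw new IllegalStateException("chain points to non-existent block " + block);
            }
            if (remaining <= 0) {
                throw new IllegalStateException("chain extends past the known file size");
            }
            int n = Math.min(remaining, blocks[block].length);
            out.write(blocks[block], 0, n);  // copy only the bytes that belong to the file
            remaining -= n;
            block = bat[block];              // follow the chain to the next block
        }
        if (remaining > 0) {
            throw new IllegalStateException("chain ends before the file size is reached");
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[][] blocks = {
            "ABCD".getBytes(StandardCharsets.US_ASCII),
            "IJ..".getBytes(StandardCharsets.US_ASCII),
            "EFGH".getBytes(StandardCharsets.US_ASCII)
        };
        int[] bat = {2, END_OF_CHAIN, 1};  // chain: 0 -> 2 -> 1 -> end
        byte[] data = read(blocks, bat, 0, 10);
        System.out.println(new String(data, StandardCharsets.US_ASCII));  // prints ABCDEFGHIJ
    }
}
</source>
<p>
Checking the remaining size before each read also stops a chain that loops back on itself, because such a
chain will eventually extend past the known size of the file.
</p>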
|
||||
</section>
|
||||
<section><title>Use Case 10: Rename existing file in the file system</title>
|
||||
<table>
|
||||
<tr>
|
||||
<td><em>Primary Actor:</em></td>
|
||||
<td>POIFS client</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Scope:</em></td>
|
||||
<td>POIFS</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Level:</em></td>
|
||||
<td>Summary</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Stakeholders and Interests:</em></td>
|
||||
<td>
|
||||
* POIFS client - wants to rename an existing file in
|
||||
the file system.<br/>
|
||||
* POIFS - knows how to manage the file system.
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Precondition:</em></td>
|
||||
<td>
|
||||
* The file system has been read (use case 1, read
|
||||
existing file system) or has been created and
|
||||
written to (use case 3, create new file system;
|
||||
use case 6, write new file to file system).<br/>
|
||||
* The specified file exists in the file system.<br/>
|
||||
* The new name for the file does not duplicate
|
||||
another file in the file system.
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Minimal Guarantee:</em></td>
|
||||
<td>None</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Main Success Guarantee:</em></td>
|
||||
<td>
|
||||
1. POIFS updates the Property Table entry for the
|
||||
specified file with its new name.
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><em>Extensions:</em></td>
|
||||
<td>
|
||||
* 1a. If the old file name is not in the file
|
||||
system, POIFS throws an exception.<br/>
|
||||
* 1b. If the new file name already exists in the
|
||||
file system, POIFS throws an exception.<br/>
|
||||
* 1c. If the new file name is too long (the limit is
|
||||
31 characters), POIFS throws an exception.
|
||||
</td>
|
||||
</tr>
|
||||
</table>
|
||||
</section>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
@ -0,0 +1,642 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Busy Developers' Guide to HSLF drawing layer</title>
|
||||
<authors>
|
||||
<person email="yegor@dinom.ru" name="Yegor Kozlov" id="CO"/>
|
||||
</authors>
|
||||
</header>
|
||||
<body>
|
||||
<section><title>Busy Developers' Guide to HSLF drawing layer</title>
|
||||
<section><title>Index of Features</title>
|
||||
<ul>
|
||||
<li><a href="#NewPresentation">How to create a new presentation and add new slides to it</a></li>
|
||||
<li><a href="#PageSize">How to retrieve or change slide size</a></li>
|
||||
<li><a href="#GetShapes">How to get shapes contained in a particular slide</a></li>
|
||||
<li><a href="#Shapes">Drawing a shape on a slide</a></li>
|
||||
<li><a href="#Pictures">How to work with pictures</a></li>
|
||||
<li><a href="#SlideTitle">How to set slide title</a></li>
|
||||
<li><a href="#Fill">How to work with slide/shape background</a></li>
|
||||
<li><a href="#Bullets">How to create bulleted lists</a></li>
|
||||
<li><a href="#Hyperlinks">Hyperlinks</a></li>
|
||||
<li><a href="#Tables">Tables</a></li>
|
||||
<li><a href="#RemoveShape">How to remove shapes</a></li>
|
||||
<li><a href="#OLE">How to retrieve embedded OLE objects</a></li>
|
||||
<li><a href="#Sound">How to retrieve embedded sounds</a></li>
|
||||
<li><a href="#Freeform">How to create shapes of arbitrary geometry</a></li>
|
||||
<li><a href="#Graphics2D">Shapes and Graphics2D</a></li>
|
||||
<li><a href="#Render">How to convert slides into images</a></li>
|
||||
<li><a href="#HeadersFooters">Headers / Footers</a></li>
|
||||
</ul>
|
||||
</section>
|
||||
<section><title>Features</title>
|
||||
<anchor id="NewPresentation"/>
|
||||
<section><title>New Presentation</title>
|
||||
<source>
|
||||
//create a new empty slide show
|
||||
HSLFSlideShow ppt = new HSLFSlideShow();
|
||||
|
||||
//add first slide
|
||||
HSLFSlide s1 = ppt.createSlide();
|
||||
|
||||
//add second slide
|
||||
HSLFSlide s2 = ppt.createSlide();
|
||||
|
||||
//save changes in a file
|
||||
FileOutputStream out = new FileOutputStream("slideshow.ppt");
|
||||
ppt.write(out);
|
||||
out.close();
|
||||
</source>
|
||||
</section>
|
||||
<anchor id="PageSize"/>
|
||||
<section><title>How to retrieve or change slide size</title>
|
||||
<source>
|
||||
HSLFSlideShow ppt = new HSLFSlideShow(new HSLFSlideShowImpl("slideshow.ppt"));
|
||||
//retrieve page size. Coordinates are expressed in points (72 dpi)
|
||||
java.awt.Dimension pgsize = ppt.getPageSize();
|
||||
int pgx = pgsize.width; //slide width
|
||||
int pgy = pgsize.height; //slide height
|
||||
|
||||
//set new page size
|
||||
ppt.setPageSize(new java.awt.Dimension(1024, 768));
|
||||
//save changes
|
||||
FileOutputStream out = new FileOutputStream("slideshow.ppt");
|
||||
ppt.write(out);
|
||||
out.close();
|
||||
</source>
|
||||
</section>
|
||||
<anchor id="GetShapes"/>
|
||||
<section><title>How to get shapes contained in a particular slide</title>
|
||||
<p>
|
||||
The following code demonstrates how to iterate over shapes for each slide.
|
||||
</p>
|
||||
<source>
|
||||
HSLFSlideShow ppt = new HSLFSlideShow(new HSLFSlideShowImpl("slideshow.ppt"));
|
||||
// get slides
|
||||
for (HSLFSlide slide : ppt.getSlides()) {
|
||||
for (HSLFShape sh : slide.getShapes()) {
|
||||
// name of the shape
|
||||
String name = sh.getShapeName();
|
||||
|
||||
// shape's anchor, which defines the position of this shape in the slide
|
||||
java.awt.Rectangle anchor = sh.getAnchor();
|
||||
|
||||
if (sh instanceof HSLFLine) {
HSLFLine line = (HSLFLine) sh;
|
||||
// work with Line
|
||||
} else if (sh instanceof HSLFAutoShape) {
|
||||
HSLFAutoShape shape = (HSLFAutoShape) sh;
|
||||
// work with AutoShape
|
||||
} else if (sh instanceof HSLFTextBox) {
|
||||
HSLFTextBox shape = (HSLFTextBox) sh;
|
||||
// work with TextBox
|
||||
} else if (sh instanceof HSLFPictureShape) {
|
||||
HSLFPictureShape shape = (HSLFPictureShape) sh;
|
||||
// work with Picture
|
||||
}
|
||||
}
|
||||
}
|
||||
</source>
|
||||
</section>
|
||||
<anchor id="Shapes"/>
|
||||
<section><title>Drawing a shape on a slide</title>
|
||||
<warning>
|
||||
To work with graphic objects, HSLF uses Java2D classes
that may throw exceptions if a graphical environment is not available. If no graphical environment
is available, you must tell Java that you are running in headless mode by
setting the system property <code>java.awt.headless=true</code>
(either via the <code>-Djava.awt.headless=true</code> startup parameter or via <code>System.setProperty("java.awt.headless", "true")</code>).
|
||||
</warning>
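<p>
A minimal sketch of the programmatic variant follows. <code>HeadlessSetup</code> is a hypothetical helper
name; the sketch assumes that the property is set before any AWT class has been initialized, because the
JVM may otherwise already have decided whether it runs headless.
</p>
<source>
import java.awt.GraphicsEnvironment;

// Hypothetical helper: switch the JVM to headless mode before any drawing code runs.
class HeadlessSetup {
    // Sets the system property and reports whether AWT now considers itself headless.
    static boolean enableHeadless() {
        System.setProperty("java.awt.headless", "true");
        return GraphicsEnvironment.isHeadless();
    }

    public static void main(String[] args) {
        // Call this first, before creating shapes, fonts or images.
        System.out.println("headless: " + enableHeadless());
    }
}
</source>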
|
||||
<p>
|
||||
When you add a shape, you usually specify the dimensions of the shape and the position
|
||||
of the upper left corner of the bounding box for the shape relative to the upper left
|
||||
corner of the slide. Distances in the drawing layer are measured in points (72 points = 1 inch).
|
||||
</p>
|
||||
<source>
|
||||
HSLFSlideShow ppt = new HSLFSlideShow();
|
||||
|
||||
HSLFSlide slide = ppt.createSlide();
|
||||
|
||||
//Line shape
|
||||
HSLFLine line = new HSLFLine();
|
||||
line.setAnchor(new java.awt.Rectangle(50, 50, 100, 20));
|
||||
line.setLineColor(new Color(0, 128, 0));
|
||||
line.setLineCompound(LineCompound.DOUBLE);
|
||||
slide.addShape(line);
|
||||
|
||||
//TextBox
|
||||
HSLFTextBox txt = new HSLFTextBox();
|
||||
txt.setText("Hello, World!");
|
||||
txt.setAnchor(new java.awt.Rectangle(300, 100, 300, 50));
|
||||
|
||||
// use TextRun to work with the text format
|
||||
HSLFTextParagraph tp = txt.getTextParagraphs().get(0);
|
||||
tp.setAlignment(TextAlign.RIGHT);
|
||||
HSLFTextRun rt = tp.getTextRuns().get(0);
|
||||
rt.setFontSize(32.);
|
||||
rt.setFontFamily("Arial");
|
||||
rt.setBold(true);
|
||||
rt.setItalic(true);
|
||||
rt.setUnderlined(true);
|
||||
rt.setFontColor(Color.red);
|
||||
|
||||
slide.addShape(txt);
|
||||
|
||||
// Autoshape
|
||||
// 32-point star
|
||||
HSLFAutoShape sh1 = new HSLFAutoShape(ShapeType.STAR_32);
|
||||
sh1.setAnchor(new java.awt.Rectangle(50, 50, 100, 200));
|
||||
sh1.setFillColor(Color.red);
|
||||
slide.addShape(sh1);
|
||||
|
||||
//Trapezoid
|
||||
HSLFAutoShape sh2 = new HSLFAutoShape(ShapeType.TRAPEZOID);
|
||||
sh2.setAnchor(new java.awt.Rectangle(150, 150, 100, 200));
|
||||
sh2.setFillColor(Color.blue);
|
||||
slide.addShape(sh2);
|
||||
|
||||
FileOutputStream out = new FileOutputStream("slideshow.ppt");
|
||||
ppt.write(out);
|
||||
out.close();
|
||||
</source>
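<p>
Because distances are measured in points, anchors given in inches have to be multiplied by 72. The helper
below is a sketch of that arithmetic only; <code>PointUnits</code> and its methods are hypothetical names and
not part of HSLF. The resulting <code>java.awt.Rectangle</code> could be passed to <code>setAnchor</code>
as in the example above.
</p>
<source>
import java.awt.Rectangle;

// Hypothetical helper for converting inches to the points used by shape anchors.
class PointUnits {
    static final double POINTS_PER_INCH = 72.0;

    // Converts a length in inches to whole points, rounding to the nearest point.
    static int inchesToPoints(double inches) {
        return (int) Math.round(inches * POINTS_PER_INCH);
    }

    // Builds an anchor from the upper left corner and the size, all given in inches.
    static Rectangle anchorInInches(double x, double y, double width, double height) {
        return new Rectangle(inchesToPoints(x), inchesToPoints(y),
                inchesToPoints(width), inchesToPoints(height));
    }

    public static void main(String[] args) {
        // A 4 x 2 inch box, 1 inch from the left edge and half an inch from the top.
        Rectangle anchor = anchorInInches(1.0, 0.5, 4.0, 2.0);
        // prints 72 36 288 144
        System.out.println(anchor.x + " " + anchor.y + " " + anchor.width + " " + anchor.height);
    }
}
</source>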
|
||||
</section>
|
||||
<anchor id="Pictures"/>
|
||||
<section><title>How to work with pictures</title>
|
||||
|
||||
<p>
|
||||
Currently, the HSLF API supports the following types of pictures:
|
||||
</p>
|
||||
<ul>
|
||||
<li>Windows Metafiles (WMF)</li>
|
||||
<li>Enhanced Metafiles (EMF)</li>
|
||||
<li>JPEG Interchange Format</li>
|
||||
<li>Portable Network Graphics (PNG)</li>
|
||||
<li>Macintosh PICT</li>
|
||||
</ul>
|
||||
|
||||
<source>
|
||||
HSLFSlideShow ppt = new HSLFSlideShow(new HSLFSlideShowImpl("slideshow.ppt"));
|
||||
|
||||
// extract all pictures contained in the presentation
|
||||
int idx = 1;
|
||||
for (HSLFPictureData pict : ppt.getPictureData()) {
|
||||
// picture data
|
||||
byte[] data = pict.getData();
|
||||
|
||||
PictureData.PictureType type = pict.getType();
|
||||
String ext = type.extension;
|
||||
FileOutputStream out = new FileOutputStream("pict_" + idx + ext);
|
||||
out.write(data);
|
||||
out.close();
|
||||
idx++;
|
||||
}
|
||||
|
||||
// add a new picture to this slideshow and insert it in a new slide
|
||||
HSLFPictureData pd = ppt.addPicture(new File("clock.jpg"), PictureData.PictureType.JPEG);
|
||||
|
||||
HSLFPictureShape pictNew = new HSLFPictureShape(pd);
|
||||
|
||||
// set image position in the slide
|
||||
pictNew.setAnchor(new java.awt.Rectangle(100, 100, 300, 200));
|
||||
|
||||
HSLFSlide slide = ppt.createSlide();
|
||||
slide.addShape(pictNew);
|
||||
|
||||
// now retrieve pictures contained in the first slide and save them to disk
|
||||
idx = 1;
|
||||
slide = ppt.getSlides().get(0);
|
||||
for (HSLFShape sh : slide.getShapes()) {
|
||||
if (sh instanceof HSLFPictureShape) {
|
||||
HSLFPictureShape pict = (HSLFPictureShape) sh;
|
||||
HSLFPictureData pictData = pict.getPictureData();
|
||||
byte[] data = pictData.getData();
|
||||
PictureData.PictureType type = pictData.getType();
|
||||
FileOutputStream out = new FileOutputStream("slide0_" + idx + type.extension);
|
||||
out.write(data);
|
||||
out.close();
|
||||
idx++;
|
||||
}
|
||||
}
|
||||
|
||||
FileOutputStream out = new FileOutputStream("slideshow.ppt");
|
||||
ppt.write(out);
|
||||
out.close();
|
||||
</source>
|
||||
</section>
|
||||
<anchor id="SlideTitle"/>
|
||||
<section><title>How to set slide title</title>
|
||||
<source>
|
||||
HSLFSlideShow ppt = new HSLFSlideShow();
|
||||
HSLFSlide slide = ppt.createSlide();
|
||||
HSLFTextBox title = slide.addTitle();
|
||||
title.setText("Hello, World!");
|
||||
|
||||
// save changes
|
||||
FileOutputStream out = new FileOutputStream("slideshow.ppt");
|
||||
ppt.write(out);
|
||||
out.close();
|
||||
</source>
|
||||
<p>
|
||||
Below is the equivalent code in PowerPoint VBA:
|
||||
</p>
|
||||
<source>
|
||||
Set myDocument = ActivePresentation.Slides(1)
|
||||
myDocument.Shapes.AddTitle.TextFrame.TextRange.Text = "Hello, World!"
|
||||
</source>
|
||||
</section>
|
||||
<anchor id="Fill"/>
|
||||
<section><title>How to modify background of a slide master</title>
|
||||
<source>
|
||||
HSLFSlideShow ppt = new HSLFSlideShow();
|
||||
HSLFSlideMaster master = ppt.getSlideMasters().get(0);
|
||||
|
||||
HSLFFill fill = master.getBackground().getFill();
|
||||
HSLFPictureData pd = ppt.addPicture(new File("background.png"), PictureData.PictureType.PNG);
|
||||
fill.setFillType(HSLFFill.FILL_PICTURE);
|
||||
fill.setPictureData(pd);
|
||||
</source>
|
||||
</section>
|
||||
<section><title>How to modify background of a slide</title>
|
||||
<source>
|
||||
HSLFSlideShow ppt = new HSLFSlideShow();
|
||||
HSLFSlide slide = ppt.createSlide();
|
||||
|
||||
// This slide has its own background.
|
||||
// Without this line it will use master's background.
|
||||
slide.setFollowMasterBackground(false);
|
||||
HSLFFill fill = slide.getBackground().getFill();
|
||||
HSLFPictureData pd = ppt.addPicture(new File("background.png"), PictureData.PictureType.PNG);
|
||||
fill.setFillType(HSLFFill.FILL_PATTERN);
|
||||
fill.setPictureData(pd);
|
||||
</source>
|
||||
</section>
|
||||
<section><title>How to modify background of a shape</title>
|
||||
<source>
|
||||
HSLFSlideShow ppt = new HSLFSlideShow();
|
||||
HSLFSlide slide = ppt.createSlide();
|
||||
|
||||
HSLFShape shape = new HSLFAutoShape(ShapeType.RECT);
|
||||
shape.setAnchor(new java.awt.Rectangle(100, 100, 200, 200));
|
||||
HSLFFill fill = shape.getFill();
|
||||
fill.setFillType(HSLFFill.FILL_SHADE);
|
||||
fill.setBackgroundColor(Color.red);
|
||||
fill.setForegroundColor(Color.green);
|
||||
|
||||
slide.addShape(shape);
|
||||
</source>
|
||||
</section>
|
||||
<anchor id="Bullets"/>
|
||||
<section><title>How to create bulleted lists</title>
|
||||
<source>
|
||||
HSLFSlideShow ppt = new HSLFSlideShow();
|
||||
|
||||
HSLFSlide slide = ppt.createSlide();
|
||||
|
||||
HSLFTextBox shape = new HSLFTextBox();
|
||||
HSLFTextParagraph tp = shape.getTextParagraphs().get(0);
|
||||
tp.setBullet(true);
|
||||
tp.setBulletChar('\u263A'); //bullet character
|
||||
tp.setIndent(0.); //bullet offset
|
||||
tp.setLeftMargin(50.); //text offset (should be greater than bullet offset)
|
||||
HSLFTextRun rt = tp.getTextRuns().get(0);
|
||||
shape.setText(
|
||||
"January\r" +
|
||||
"February\r" +
|
||||
"March\r" +
|
||||
"April");
|
||||
rt.setFontSize(42.);
|
||||
|
||||
|
||||
shape.setAnchor(new java.awt.Rectangle(50, 50, 500, 300)); //position of the text box in the slide
|
||||
slide.addShape(shape);
|
||||
|
||||
FileOutputStream out = new FileOutputStream("bullets.ppt");
|
||||
ppt.write(out);
|
||||
out.close();
|
||||
</source>
|
||||
</section>
|
||||
<anchor id="Hyperlinks"/>
|
||||
<section><title>How to read hyperlinks from a slide show</title>
|
||||
<source>
|
||||
FileInputStream is = new FileInputStream("slideshow.ppt");
|
||||
HSLFSlideShow ppt = new HSLFSlideShow(is);
|
||||
is.close();
|
||||
|
||||
for (HSLFSlide slide : ppt.getSlides()) {
|
||||
//read hyperlinks from the text runs
|
||||
for (List<HSLFTextParagraph> txt : slide.getTextParagraphs()) {
|
||||
for (HSLFTextParagraph para : txt) {
|
||||
for (HSLFTextRun run : para) {
|
||||
HSLFHyperlink link = run.getHyperlink();
|
||||
if (link != null) {
|
||||
String title = link.getLabel();
|
||||
String address = link.getAddress();
|
||||
String text = run.getRawText();
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
//in PowerPoint you can assign a hyperlink to a shape without text,
|
||||
//for example to a Line object. The code below demonstrates how to
|
||||
//read such hyperlinks
|
||||
for (HSLFShape sh : slide.getShapes()) {
|
||||
if (sh instanceof HSLFSimpleShape) {
|
||||
HSLFHyperlink link = ((HSLFSimpleShape)sh).getHyperlink();
|
||||
if(link != null) {
|
||||
String title = link.getLabel();
|
||||
String address = link.getAddress();
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
</source>
|
||||
</section>
|
||||
<anchor id="Tables"/>
|
||||
<section><title>How to create tables</title>
|
||||
<source>
|
||||
//table data
|
||||
String[][] data = {
|
||||
{"INPUT FILE", "NUMBER OF RECORDS"},
|
||||
{"Item File", "11,559"},
|
||||
{"Vendor File", "300"},
|
||||
{"Purchase History File", "10,000"},
|
||||
{"Total # of requisitions", "10,200,038"}
|
||||
};
|
||||
|
||||
HSLFSlideShow ppt = new HSLFSlideShow();
|
||||
|
||||
HSLFSlide slide = ppt.createSlide();
|
||||
//create a table of 5 rows and 2 columns
|
||||
HSLFTable table = new HSLFTable(5, 2);
|
||||
for (int i = 0; i < data.length; i++) {
|
||||
for (int j = 0; j < data[i].length; j++) {
|
||||
HSLFTableCell cell = table.getCell(i, j);
|
||||
cell.setText(data[i][j]);
|
||||
|
||||
HSLFTextRun rt = cell.getTextParagraphs().get(0).getTextRuns().get(0);
|
||||
rt.setFontFamily("Arial");
|
||||
rt.setFontSize(10.);
|
||||
|
||||
cell.setVerticalAlignment(VerticalAlignment.MIDDLE);
|
||||
cell.setHorizontalCentered(true);
|
||||
}
|
||||
}
|
||||
|
||||
//set table borders
|
||||
Line border = table.createBorder();
|
||||
border.setLineColor(Color.black);
|
||||
border.setLineWidth(1.0);
|
||||
table.setAllBorders(border);
|
||||
|
||||
//set width of the 1st column
|
||||
table.setColumnWidth(0, 300);
|
||||
//set width of the 2nd column
|
||||
table.setColumnWidth(1, 150);
|
||||
|
||||
slide.addShape(table);
|
||||
table.moveTo(100, 100);
|
||||
|
||||
FileOutputStream out = new FileOutputStream("hslf-table.ppt");
|
||||
ppt.write(out);
|
||||
out.close();
|
||||
</source>
|
||||
</section>
|
||||
|
||||
<anchor id="RemoveShape"/>
|
||||
<section><title>How to remove shapes from a slide</title>
|
||||
<source>
|
||||
for (HSLFShape shape : slide.getShapes()) {
|
||||
// remove the shape
|
||||
boolean ok = slide.removeShape(shape);
|
||||
if (ok) {
|
||||
// the shape was removed. Do something.
|
||||
}
|
||||
}
|
||||
</source>
|
||||
</section>
|
||||
<anchor id="OLE"/>
|
||||
<section><title>How to retrieve embedded OLE objects</title>
|
||||
<source>
|
||||
for (HSLFShape shape : slide.getShapes()) {
|
||||
if (shape instanceof OLEShape) {
|
||||
OLEShape ole = (OLEShape) shape;
|
||||
HSLFObjectData data = ole.getObjectData();
|
||||
String name = ole.getInstanceName();
|
||||
if ("Worksheet".equals(name)) {
|
||||
HSSFWorkbook wb = new HSSFWorkbook(data.getData());
|
||||
} else if ("Document".equals(name)) {
|
||||
HWPFDocument doc = new HWPFDocument(data.getData());
|
||||
}
|
||||
}
|
||||
}
|
||||
</source>
|
||||
</section>
|
||||
|
||||
<anchor id="Sound"/>
|
||||
<section><title>How to retrieve embedded sounds</title>
|
||||
<source>
|
||||
FileInputStream is = new FileInputStream(args[0]);
|
||||
HSLFSlideShow ppt = new HSLFSlideShow(is);
|
||||
is.close();
|
||||
|
||||
for (HSLFSoundData sound : ppt.getSoundData()) {
|
||||
// save WAV sounds to disk
|
||||
if (sound.getSoundType().equals(".WAV")) {
|
||||
FileOutputStream out = new FileOutputStream(sound.getSoundName());
|
||||
out.write(sound.getData());
|
||||
out.close();
|
||||
}
|
||||
}
|
||||
</source>
|
||||
</section>
|
||||
|
||||
<anchor id="Freeform"/>
|
||||
<section><title>How to create shapes of arbitrary geometry</title>
|
||||
<source>
|
||||
HSLFSlideShow ppt = new HSLFSlideShow();
|
||||
HSLFSlide slide = ppt.createSlide();
|
||||
|
||||
java.awt.geom.GeneralPath path = new java.awt.geom.GeneralPath();
|
||||
path.moveTo(100, 100);
|
||||
path.lineTo(200, 100);
|
||||
path.curveTo(50, 45, 134, 22, 78, 133);
|
||||
path.curveTo(10, 45, 134, 56, 78, 100);
|
||||
path.lineTo(100, 200);
|
||||
path.closePath();
|
||||
|
||||
HSLFFreeformShape shape = new HSLFFreeformShape();
|
||||
shape.setPath(path);
|
||||
slide.addShape(shape);
|
||||
</source>
|
||||
</section>
|
||||
|
||||
<anchor id="Graphics2D"/>
|
||||
<section><title>How to draw into a slide using Graphics2D</title>
|
||||
<warning>
|
||||
The current implementation of the PowerPoint Graphics2D driver is not fully compliant with the java.awt.Graphics2D specification.
Some features, such as clipping and drawing of images, are not yet supported.
|
||||
</warning>
|
||||
<source>
|
||||
HSLFSlideShow ppt = new HSLFSlideShow();
|
||||
HSLFSlide slide = ppt.createSlide();
|
||||
|
||||
// draw a simple bar graph
|
||||
// bar chart data.
|
||||
// The first value is the bar color,
|
||||
// the second is the width
|
||||
Object[] def = new Object[]{
|
||||
Color.yellow, 100,
Color.green, 150,
Color.gray, 75,
Color.red, 200,
|
||||
};
|
||||
|
||||
// all objects are drawn into a shape group so we need to create one
|
||||
|
||||
HSLFGroupShape group = new HSLFGroupShape();
|
||||
// define position of the drawing in the slide
|
||||
Rectangle bounds = new java.awt.Rectangle(200, 100, 350, 300);
|
||||
// if you want to draw in the entire slide area then define the anchor
|
||||
// as follows:
|
||||
// Dimension pgsize = ppt.getPageSize();
|
||||
// java.awt.Rectangle bounds = new java.awt.Rectangle(0, 0,
|
||||
// pgsize.width, pgsize.height);
|
||||
|
||||
group.setAnchor(bounds);
|
||||
slide.addShape(group);
|
||||
|
||||
// draw a simple bar chart
|
||||
Graphics2D graphics = new PPGraphics2D(group);
|
||||
int x = bounds.x + 50, y = bounds.y + 50;
|
||||
graphics.setFont(new Font("Arial", Font.BOLD, 10));
|
||||
for (int i = 0, idx = 1; i < def.length; i += 2, idx++) {
|
||||
graphics.setColor(Color.black);
|
||||
int width = ((Integer) def[i + 1]).intValue();
|
||||
graphics.drawString("Q" + idx, x - 20, y + 20);
|
||||
graphics.drawString(width + "%", x + width + 10, y + 20);
|
||||
graphics.setColor((Color) def[i]);
|
||||
graphics.fill(new Rectangle(x, y, width, 30));
|
||||
y += 40;
|
||||
}
|
||||
graphics.setColor(Color.black);
|
||||
graphics.setFont(new Font("Arial", Font.BOLD, 14));
|
||||
graphics.draw(bounds);
|
||||
graphics.drawString("Performance", x + 70, y + 40);
|
||||
|
||||
FileOutputStream out = new FileOutputStream("hslf-graphics2d.ppt");
|
||||
ppt.write(out);
|
||||
out.close();
|
||||
</source>
|
||||
</section>
|
||||
|
||||
<anchor id="Render"/>
|
||||
<section><title>Export PowerPoint slides into java.awt.Graphics2D</title>
|
||||
<p>
|
||||
HSLF provides a way to export slides into images. You can capture slides into a java.awt.Graphics2D object (or any other)
and serialize the result into PNG or JPEG format. Please note that, although HSLF attempts to render slides as close to PowerPoint as possible,
the output may look different from PowerPoint for the following reasons:
|
||||
</p>
|
||||
<ul>
|
||||
<li>Java2D renders fonts differently from PowerPoint. There are always some differences in the way the font glyphs are painted.</li>
|
||||
<li>HSLF uses java.awt.font.LineBreakMeasurer to break text into lines. PowerPoint may do it in a different way.</li>
|
||||
<li>If a font from the presentation is not available, then the JDK default font will be used.</li>
|
||||
</ul>
|
||||
<p>
|
||||
Current Limitations:
|
||||
</p>
|
||||
<ul>
|
||||
<li>Some types of shapes are not yet supported (WordArt, complex auto-shapes)</li>
|
||||
<li>Only bitmap images (PNG, JPEG, DIB) can be rendered in Java</li>
|
||||
</ul>
|
||||
<source>
|
||||
FileInputStream is = new FileInputStream("slideshow.ppt");
|
||||
HSLFSlideShow ppt = new HSLFSlideShow(is);
|
||||
is.close();
|
||||
|
||||
Dimension pgsize = ppt.getPageSize();
|
||||
|
||||
int idx = 1;
|
||||
for (HSLFSlide slide : ppt.getSlides()) {
|
||||
|
||||
BufferedImage img = new BufferedImage(pgsize.width, pgsize.height, BufferedImage.TYPE_INT_RGB);
|
||||
Graphics2D graphics = img.createGraphics();
|
||||
// clear the drawing area
|
||||
graphics.setPaint(Color.white);
|
||||
graphics.fill(new Rectangle2D.Float(0, 0, pgsize.width, pgsize.height));
|
||||
|
||||
// render
|
||||
slide.draw(graphics);
|
||||
|
||||
// save the output
|
||||
FileOutputStream out = new FileOutputStream("slide-" + idx + ".png");
|
||||
javax.imageio.ImageIO.write(img, "png", out);
|
||||
out.close();
|
||||
|
||||
idx++;
|
||||
}
|
||||
</source>
|
||||
</section>
|
||||
|
||||
</section>
|
||||
<anchor id="HeadersFooters"/>
|
||||
<section><title>How to extract Headers / Footers from an existing presentation</title>
|
||||
<source>
|
||||
FileInputStream is = new FileInputStream("slideshow.ppt");
|
||||
HSLFSlideShow ppt = new HSLFSlideShow(is);
|
||||
is.close();
|
||||
|
||||
// presentation-scope headers / footers
|
||||
HeadersFooters hdd = ppt.getSlideHeadersFooters();
|
||||
if (hdd.isFooterVisible()) {
|
||||
String footerText = hdd.getFooterText();
|
||||
}
|
||||
|
||||
// per-slide headers / footers
|
||||
for (HSLFSlide slide : ppt.getSlides()) {
|
||||
HeadersFooters hdd2 = slide.getHeadersFooters();
|
||||
if (hdd2.isFooterVisible()) {
|
||||
String footerText = hdd2.getFooterText();
|
||||
}
|
||||
if (hdd2.isUserDateVisible()) {
|
||||
String customDate = hdd2.getDateTimeText();
|
||||
}
|
||||
if (hdd2.isSlideNumberVisible()) {
|
||||
int slideNum = slide.getSlideNumber();
|
||||
}
|
||||
}
|
||||
</source>
|
||||
</section>
|
||||
<section><title>How to set Headers / Footers</title>
|
||||
<source>
|
||||
HSLFSlideShow ppt = new HSLFSlideShow();
|
||||
|
||||
// presentation-scope headers / footers
|
||||
HeadersFooters hdd = ppt.getSlideHeadersFooters();
|
||||
hdd.setSlideNumberVisible(true);
|
||||
hdd.setFootersText("Created by POI-HSLF");
|
||||
</source>
|
||||
</section>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
@ -0,0 +1,72 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>POI-HSLF and POI-XSLF - Java API To Access Microsoft PowerPoint Format Files</title>
|
||||
<subtitle>Overview</subtitle>
|
||||
<authors>
|
||||
<person name="Avik Sengupta" email="avik at apache dot org"/>
|
||||
<person name="Nick Burch" email="nick at apache dot org"/>
|
||||
<person name="Yegor Kozlov" email="yegor at apache dot org"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section>
|
||||
<title>POI-HSLF</title>
|
||||
|
||||
<p>HSLF is the POI Project's pure Java implementation of the PowerPoint '97(-2007) file format. </p>
|
||||
<p>HSLF provides a way to read, create or modify PowerPoint presentations. In particular, it provides:
|
||||
</p>
|
||||
<ul>
|
||||
<li>api for data extraction (text, pictures, embedded objects, sounds)</li>
|
||||
<li>usermodel api for creating, reading and modifying ppt files</li>
|
||||
</ul>
|
||||
<note>
|
||||
This code currently lives in the
|
||||
<a href="https://svn.apache.org/viewvc/poi/trunk/poi-scratchpad/">scratchpad area</a>
|
||||
of the POI SVN repository. To use this component, ensure
|
||||
you have the Scratchpad Jar on your classpath, or a dependency
|
||||
defined on the <em>poi-scratchpad</em> artifact - the main POI
|
||||
jar is not enough! See the
|
||||
<a href="site:components">POI Components Map</a>
|
||||
for more details.
|
||||
</note>
|
||||
<p>The <a href="./quick-guide.html">quick guide</a> documentation provides
|
||||
information on using this API. Comments and fixes gratefully accepted on the POI
|
||||
dev mailing lists.</p>
|
||||
</section>
|
||||
<section>
|
||||
<title>POI-XSLF</title>
|
||||
<p>
|
||||
XSLF is the POI Project's pure Java implementation of the PowerPoint 2007 OOXML (.pptx) file format.
|
||||
Whilst HSLF and XSLF provide similar features, there is not a common interface across the two of them at this time.
|
||||
</p>
|
||||
<p>
|
||||
Please note that XSLF is still in early development and is subject to incompatible changes in the future.
|
||||
</p>
|
||||
<p>
|
||||
A quick guide is available in the <a href="./xslf-cookbook.html">XSLF Cookbook</a>
|
||||
</p>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
@ -0,0 +1,367 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>POI-HSLF - A Guide to the PowerPoint File Format</title>
|
||||
<subtitle>Overview</subtitle>
|
||||
<authors>
|
||||
<person name="Nick Burch" email="nick at torchbox dot com"/>
|
||||
<person name="Yegor Kozlov" email="yegor at dinom dot ru"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section><title>Records, Containers and Atoms</title>
|
||||
<p>
|
||||
PowerPoint documents are made up of a tree of records. A record may
|
||||
contain either other records (in which case it is a Container),
|
||||
or data (in which case it's an Atom). A record can't hold both.
|
||||
</p>
|
||||
<p>
|
||||
PowerPoint documents don't have one overall container record. Instead,
|
||||
there are a number of different container records to be found at
|
||||
the top level.
|
||||
</p>
|
||||
<p>
|
||||
Any numbers or strings stored in the records are always stored in
|
||||
Little Endian format (least significant byte first). This is the case
|
||||
no matter what platform the file was written on - be that a
|
||||
Little Endian or a Big Endian system.
|
||||
</p>
|
||||
<p>
|
||||
PowerPoint may have Escher (DDF) records embedded in it. These
|
||||
are always held as the children of a PPDrawing record (record
|
||||
type 1036). Escher records have the same format as PowerPoint
|
||||
records.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>Record Headers</title>
|
||||
<p>
|
||||
All records, be they containers or atoms, have the same standard
|
||||
8 byte header. It is:
|
||||
</p>
|
||||
<ul><li>1/2 byte container flag</li>
|
||||
<li>1.5 byte option field</li>
|
||||
<li>2 byte record type</li>
|
||||
<li>4 byte record length</li></ul>
|
||||
<p>
|
||||
If the first byte of the header, BINARY_AND with 0x0f, is 0x0f,
|
||||
then the record is a container. Otherwise, it's an atom. The rest
|
||||
of the first two bytes are used to store the "options" for the
|
||||
record. Most commonly, this is used to indicate the version of
|
||||
the record, but the exact usage is record specific.
|
||||
</p>
|
||||
<p>
|
||||
The record type is a little endian number, which tells you what
|
||||
kind of record you're dealing with. Each different kind of record
|
||||
has its own value that gets stored here. PowerPoint records have
|
||||
a type that's normally less than 6000 (decimal). Escher records
|
||||
normally have a type between 0xF000 and 0xF1FF.
|
||||
</p>
|
||||
<p>
|
||||
The record length is another little endian number. For an atom,
|
||||
it's the size of the data part of the record, i.e. the length
|
||||
of the record <em>less</em> its 8 byte record header. For a
|
||||
container, it's the size of all the records that are children of
|
||||
this record. That means that the size of a container record is the
|
||||
length, plus 8 bytes for its record header.
|
||||
</p>
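<p>
The header layout described above can be sketched in plain Java. This is a minimal illustration, not POI code: the class name <code>RecordHeaderSketch</code> and its fields are hypothetical, and the sample bytes are the header of a record with type 6002 and 32 bytes of data.
</p>

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Illustrative only: RecordHeaderSketch is a hypothetical name, not a POI class. */
final class RecordHeaderSketch {
    final boolean container; // true if the low 4 bits of the first byte are all set
    final int options;       // remaining 12 bits of the first two bytes
    final int type;          // 2 byte record type
    final long length;       // 4 byte record length, excluding the 8 byte header

    private RecordHeaderSketch(boolean container, int options, int type, long length) {
        this.container = container;
        this.options = options;
        this.type = type;
        this.length = length;
    }

    /** Reads the 8 byte header that starts at the given offset. */
    static RecordHeaderSketch parse(byte[] data, int offset) {
        ByteBuffer buf = ByteBuffer.wrap(data, offset, 8).order(ByteOrder.LITTLE_ENDIAN);
        int flagAndOptions = buf.getShort() & 0xFFFF; // container flag + options
        int type = buf.getShort() & 0xFFFF;
        long length = buf.getInt() & 0xFFFFFFFFL;
        boolean container = (flagAndOptions & 0x0F) == 0x0F;
        return new RecordHeaderSketch(container, flagAndOptions >>> 4, type, length);
    }

    /** Size of the whole record on disk: the stored length plus the 8 byte header. */
    long totalSize() {
        return length + 8;
    }

    public static void main(String[] args) {
        byte[] header = {0x00, 0x00, 0x72, 0x17, 0x20, 0x00, 0x00, 0x00};
        RecordHeaderSketch h = parse(header, 0);
        // prints: container=false type=6002 length=32 total=40
        System.out.println("container=" + h.container + " type=" + h.type
                + " length=" + h.length + " total=" + h.totalSize());
    }
}
```

<p>
Masking with <code>0xFFFF</code> and <code>0xFFFFFFFFL</code> keeps the values unsigned, since Java has no unsigned short or int types.
</p>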
|
||||
</section>
|
||||
|
||||
<section><title>CurrentUserAtom, UserEditAtom and PersistPtrIncrementalBlock</title>
|
||||
<p><strong>aka Records that care about the byte level position of other records</strong></p>
|
||||
<p>
|
||||
A small number of records contain byte level position offsets to other
|
||||
records. If you change the position of any records in the file, then
|
||||
there's a good chance that you will need to update some of these
|
||||
special records.
|
||||
</p>
|
||||
<p>
|
||||
First up, CurrentUserAtom. This is actually stored in a different
|
||||
OLE2 (POIFS) stream to the main PowerPoint document. It contains
|
||||
a few bits of information on who last edited the file. Most
|
||||
importantly, at byte 8 of its contents, it stores (as a 32 bit
|
||||
little endian number) the offset in the main stream to the most
|
||||
recent UserEditAtom.
|
||||
</p>
|
||||
<p>
|
||||
The UserEditAtom contains two byte level offsets (again as 32 bit
|
||||
little endian numbers). At byte 12 is the offset to the
|
||||
PersistPtrIncrementalBlock associated with this UserEditAtom
|
||||
(each UserEditAtom has one and only one PersistPtrIncrementalBlock).
|
||||
At byte 8, there's the offset to the previous UserEditAtom. If this
|
||||
is 0, then you're at the first one.
|
||||
</p>
|
||||
<p>
|
||||
Every time you do a non full save in PowerPoint, it tacks on another
|
||||
UserEditAtom and another PersistPtrIncrementalBlock. The
|
||||
CurrentUserAtom is updated to point to this new UserEditAtom, and the
|
||||
new UserEditAtom points back to the previous UserEditAtom. You then
|
||||
end up with a chain, starting from the CurrentUserAtom, linking
|
||||
back through all the UserEditAtoms, until you reach the first one
|
||||
from a full save.
|
||||
</p>
|
||||
<source>
|
||||
/-------------------------------\
| CurrentUserAtom (own stream)  |
|   OffsetToCurrentEdit = 10562 |==\
\-------------------------------/  |
                                   |
/==================================/
|                                           /-----------------------------------\
|                                           | PersistPtrIncrementalBlock @ 6144 |
|                                           \-----------------------------------/
|  /---------------------------------\                  |
|  | UserEditAtom @ 6176             |                  |
|  |   LastUserEditAtomOffset = 0    |                  |
|  |   PersistPointersOffset = 6144  |==================/
|  \---------------------------------/
|                 |                         /-----------------------------------\
|                 \====================\    | PersistPtrIncrementalBlock @ 8646 |
|                                      |    \-----------------------------------/
|  /---------------------------------\ |                |
|  | UserEditAtom @ 8674             | |                |
|  |   LastUserEditAtomOffset = 6176 |=/                |
|  |   PersistPointersOffset = 8646  |==================/
|  \---------------------------------/
|                 |                         /------------------------------------\
|                 \====================\    | PersistPtrIncrementalBlock @ 10538 |
|                                      |    \------------------------------------/
|  /---------------------------------\ |                |
\==| UserEditAtom @ 10562            | |                |
   |   LastUserEditAtomOffset = 8674 |=/                |
   |   PersistPointersOffset = 10538 |==================/
   \---------------------------------/
|
||||
</source>
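<p>
Walking the chain shown above could look roughly like the following sketch. It is not POI code: <code>UserEditChainSketch</code> and its methods are hypothetical names, and it assumes that "byte 8" and "byte 12" are counted from the end of the 8 byte record header. The demo data is a synthetic stream holding only the two offsets per UserEditAtom.
</p>

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: UserEditChainSketch is a hypothetical helper, not a POI class. */
final class UserEditChainSketch {
    private static final int HEADER_SIZE = 8;

    /** Reads an unsigned 32 bit little endian number at the given position. */
    static long readUInt(byte[] data, int pos) {
        return ByteBuffer.wrap(data, pos, 4).order(ByteOrder.LITTLE_ENDIAN).getInt() & 0xFFFFFFFFL;
    }

    /** Walks back from the newest UserEditAtom; returns {userEditOffset, persistPtrOffset} pairs, newest first. */
    static List<long[]> walk(byte[] mainStream, long newestUserEditOffset) {
        List<long[]> chain = new ArrayList<>();
        long offset = newestUserEditOffset;
        while (true) {
            // Assumption: the field positions are counted from the end of the 8 byte header.
            int body = (int) offset + HEADER_SIZE;
            long previous = readUInt(mainStream, body + 8);
            long persistPtr = readUInt(mainStream, body + 12);
            chain.add(new long[] {offset, persistPtr});
            if (previous == 0) {
                return chain; // reached the UserEditAtom written by the first full save
            }
            if (previous >= offset) {
                throw new IllegalStateException("chain does not move backwards at offset " + offset);
            }
            offset = previous;
        }
    }

    /** Test data helper: writes only the two offsets of a fake UserEditAtom. */
    static void writeFakeUserEdit(byte[] stream, int at, int previous, int persistPtr) {
        ByteBuffer buf = ByteBuffer.wrap(stream).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(at + HEADER_SIZE + 8, previous);
        buf.putInt(at + HEADER_SIZE + 12, persistPtr);
    }

    public static void main(String[] args) {
        byte[] stream = new byte[256];
        writeFakeUserEdit(stream, 40, 0, 10);    // first full save
        writeFakeUserEdit(stream, 120, 40, 90);  // later incremental save
        for (long[] link : walk(stream, 120)) {
            System.out.println("UserEditAtom @ " + link[0] + " -> PersistPtrIncrementalBlock @ " + link[1]);
        }
    }
}
```

<p>
The starting offset passed to <code>walk</code> would come from the CurrentUserAtom. The backwards-only check is a defensive choice of this sketch so that a corrupt file cannot cause an endless loop.
</p>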
|
||||
<p>
|
||||
The PersistPtrIncrementalBlock contains byte offsets to all the
|
||||
Slides, Notes, Documents and MasterSlides in the file. The first
|
||||
PersistPtrIncrementalBlock will point to all the ones that
|
||||
were present the first time the file was saved. Subsequent
|
||||
PersistPtrIncrementalBlocks will contain pointers to all the ones
|
||||
that were changed in that edit. To find the offset to a given
|
||||
sheet in the latest version, start with the most recent
|
||||
PersistPtrIncrementalBlock. If this knows about the sheet, use the
|
||||
offset it has. If it doesn't, then work back through older
|
||||
PersistPtrIncrementalBlocks until you find one which does, and
|
||||
use that.
|
||||
</p>
|
||||
<p>
|
||||
Each PersistPtrIncrementalBlock can contain a number of entry
|
||||
blocks. Each block holds information on a sequence of sheets.
|
||||
Each block starts with a 32 bit little endian integer. Once read
|
||||
into memory, the lower 20 bits contain the starting number for the
|
||||
sequence of sheets to be described. The higher 12 bits contain
|
||||
the count of the number of sheets described. Following that is
|
||||
one 32 bit little endian integer for each sheet in the sequence,
|
||||
the value being the offset to that sheet. If there is any data
|
||||
left after parsing a block, then it corresponds to the next block.
|
||||
</p>
|
||||
<source>
|
||||
hex on disk   decimal      description
-----------   -------      -----------
0000          0            No options
7217          6002         Record type is 6002
2000 0000     32           Length of data is 32 bytes
0100 5000     5242881      Count is 5 (12 highest bits)
                           Starting number is 1 (20 lowest bits)
0000 0000     0            Sheet (1+0)=1 starts at offset 0
900D 0000     3472         Sheet (1+1)=2 starts at offset 3472
E403 0000     996          Sheet (1+2)=3 starts at offset 996
9213 0000     5010         Sheet (1+3)=4 starts at offset 5010
BE15 0000     5566         Sheet (1+4)=5 starts at offset 5566
0900 1000     1048585      Count is 1 (12 highest bits)
                           Starting number is 9 (20 lowest bits)
4418 0000     6212         Sheet (9+0)=9 starts at offset 6212
|
||||
</source>
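<p>
The entry block layout can be decoded with a few lines of plain Java. The following is only a sketch under stated assumptions: <code>PersistPtrSketch</code>, <code>decode</code> and <code>resolve</code> are hypothetical names rather than POI API, and the input is the data part of the record (everything after the 8 byte header). The demo bytes are the 32 data bytes from the listing above.
</p>

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: PersistPtrSketch is a hypothetical helper, not a POI class. */
final class PersistPtrSketch {

    /** Decodes the data part of one PersistPtrIncrementalBlock into sheet number -> byte offset. */
    static Map<Integer, Long> decode(byte[] body) {
        ByteBuffer buf = ByteBuffer.wrap(body).order(ByteOrder.LITTLE_ENDIAN);
        Map<Integer, Long> offsets = new LinkedHashMap<>();
        while (buf.remaining() >= 4) {
            int info = buf.getInt();
            int firstSheet = info & 0xFFFFF; // lower 20 bits: starting sheet number
            int count = info >>> 20;         // upper 12 bits: number of sheets that follow
            for (int i = 0; i < count; i++) {
                offsets.put(firstSheet + i, buf.getInt() & 0xFFFFFFFFL);
            }
        }
        return offsets;
    }

    /** Looks a sheet up in the newest block first, then in older ones; null if no block knows it. */
    static Long resolve(List<Map<Integer, Long>> newestFirst, int sheet) {
        for (Map<Integer, Long> block : newestFirst) {
            Long offset = block.get(sheet);
            if (offset != null) {
                return offset;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        byte[] body = {
            0x01, 0x00, 0x50, 0x00,                                     // count 5, starting at sheet 1
            0x00, 0x00, 0x00, 0x00, (byte) 0x90, 0x0D, 0x00, 0x00,
            (byte) 0xE4, 0x03, 0x00, 0x00, (byte) 0x92, 0x13, 0x00, 0x00,
            (byte) 0xBE, 0x15, 0x00, 0x00,
            0x09, 0x00, 0x10, 0x00,                                     // count 1, starting at sheet 9
            0x44, 0x18, 0x00, 0x00
        };
        // prints: {1=0, 2=3472, 3=996, 4=5010, 5=5566, 9=6212}
        System.out.println(decode(body));
    }
}
```

<p>
<code>resolve</code> expects the decoded blocks ordered from the most recent edit to the oldest, which mirrors the lookup order described earlier in this section.
</p>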
|
||||
</section>
|
||||
|
||||
<section><title>Paragraph and Text Styling</title>
|
||||
<p>
|
||||
There are quite a number of records that affect the styling
|
||||
of text, and a smaller number that are responsible for the
|
||||
styling of paragraphs.
|
||||
</p>
|
||||
<p>
|
||||
By default, a given set of text will inherit paragraph and text
|
||||
stylings from the appropriate master sheet. If anything differs
|
||||
from the master sheet, then appropriate styling records will
|
||||
follow the text record.
|
||||
</p>
|
||||
<p>
|
||||
<em>(We don't currently know enough about master sheet styling
|
||||
to write about it)</em>
|
||||
</p>
|
||||
<p>
|
||||
Normally, PowerPoint will have one text record (TextBytesAtom
|
||||
or TextCharsAtom) for every paragraph, with a preceding
|
||||
TextHeaderAtom to describe what sort of paragraph it is.
|
||||
If any of the stylings differ from the master's, then a
|
||||
StyleTextPropAtom will follow the text record. This contains
|
||||
the paragraph style information, and the styling information
|
||||
for each section of the text which has a different style.
|
||||
(More on StyleTextPropAtom later)
|
||||
</p>
|
||||
<p>
|
||||
For every font used, a FontEntityAtom must exist for that font.
|
||||
The FontEntityAtoms live inside a FontCollection record, and
|
||||
there's one of those inside the Environment record inside the
|
||||
Document record. <em>(More on Fonts to be discovered)</em>
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>StyleTextPropAtom</title>
|
||||
<p>
|
||||
If the text or paragraph stylings for a given text record
|
||||
differ from those of the appropriate master, then there will
|
||||
be one of these records.
|
||||
</p>
|
||||
<p>
|
||||
This record is made up of two lists of lists. Firstly,
|
||||
there's a list of paragraph stylings - each made up of the
|
||||
number of characters it applies to, followed by the matching
|
||||
styling elements. Following that is the equivalent for
|
||||
character stylings.
|
||||
</p>
|
||||
<p>
|
||||
Each styling list (in either list) starts with the number
|
||||
of characters it applies to, stored in a 4 byte little
|
||||
endian number. If it is a paragraph styling, it will be
|
||||
followed by a 2 byte number (of unknown use). After this is
|
||||
a four byte number, which is a mask indicating which stylings
|
||||
will follow. You then have an entry for each of the stylings
|
||||
indicated in the mask. Finally, you move onto the next set
|
||||
of stylings.
|
||||
</p>
|
||||
<p>
|
||||
Each styling has a specific mask flag to indicate its
|
||||
presence. (The list may be found towards the top of
|
||||
org.apache.poi.hslf.record.StyleTextPropAtom.java, and is
|
||||
too long to sensibly include here). Each styling entry
|
||||
will occur in the order of its mask value (so one with mask
|
||||
1 will come first, followed by the next highest mask value).
|
||||
Depending on the styling, it is either made up of a 2 byte
|
||||
or 4 byte numeric value. The meaning of the value will
|
||||
depend on the styling (eg for font.size, it is the font
|
||||
size in points).
|
||||
</p>
|
||||
<p>
|
||||
Some stylings are actually mask stylings. For these, the
|
||||
value will be a 2 byte number. This is then processed as a
|
||||
mask, to indicate a number of different sub-stylings.
|
||||
The styling for bold/italic/underline is one such example.
|
||||
</p>
|
||||
<source>
|
||||
hex on disk   decimal      description
-----------   -------      -----------

0000          0            No options
A10F          4001         Record type is 4001
8000 0000     128          Length of data is 128 bytes
1E00 0000     30           The paragraph styling applies to 30 characters
0000          0            Paragraph options are 0
0018 0000     6144         0x0800=Text Alignment, 0x1000=Line Spacing
0000          0            Text Alignment = Left
5000          80           Line Spacing = 80

1C00 0000     28           The paragraph styling applies to 28 characters
0000          0            Paragraph options are 0
0010 0000     4096         0x1000=Line Spacing
5000          80           Line Spacing = 80

1900 0000     25           The paragraph styling applies to 25 characters
0000          0            Paragraph options are 0
0018 0000     6144         0x0800=Text Alignment, 0x1000=Line Spacing
0200          2            Text Alignment = Right
5000          80           Line Spacing = 80

6100 0000     97           The paragraph styling applies to 97 characters
                           (includes final CR)
0000          0            Paragraph options are 0
0018 0000     6144         0x0800=Text Alignment, 0x1000=Line Spacing
0000          0            Text Alignment = Left
5000          80           Line Spacing = 80

1E00 0000     30           The character styling applies to 30 characters
0100 0200     131073       0x0001=Char Props Mask, 0x20000=Font Size
0100          1            Char Props 0x0001=Bold
1400          20           Font Size = 20

1C00 0000     28           The character styling applies to 28 characters
0200 0600     393218       0x0002=Char Props Mask, 0x20000=Font Size, 0x40000=Font Color
0200          2            Char Props 0x0002=Italic
1400          20           Font Size = 20
0000 0005     83886080     Blue

1900 0000     25           The character styling applies to 25 characters
0000 0600     393216       0x20000=Font Size, 0x40000=Font Color
1400          20           Font Size = 20
FF33 00FE     4261426175   Red

6000 0000     96           The character styling applies to 96 characters
0400 0300     196612       0x0004=Char Props Mask, 0x10000=Font Index, 0x20000=Font Size
0400          4            Char Props 0x0004=Underlined
0100          1            Font Index = 1 (2nd Font in table)
1800          24           Font Size = 24
|
||||
</source>
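<p>
To illustrate the "mask first, then one entry per flagged styling" idea, here is a deliberately partial sketch that reads a single character styling run. It is not how POI implements StyleTextPropAtom: <code>CharStyleRunSketch</code> is a hypothetical class, it only knows the four character stylings that appear in the listing above, and it follows the byte sizes shown there. Treating any of the low 16 mask bits as announcing one 2 byte flags field is an assumption drawn from that listing.
</p>

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Illustrative only: handles just the character stylings seen in the worked example. */
final class CharStyleRunSketch {
    static final int FLAGS_MASK = 0x0000FFFF; // assumption: any low bit means one 2 byte flags field follows
    static final int FONT_INDEX = 0x00010000;
    static final int FONT_SIZE  = 0x00020000;
    static final int FONT_COLOR = 0x00040000;

    int characters;
    boolean bold, italic, underlined;
    Integer fontIndex; // null when the run does not override it
    Integer fontSize;
    Long fontColor;

    /** Reads one character styling run from a little endian buffer positioned at its start. */
    static CharStyleRunSketch read(ByteBuffer buf) {
        CharStyleRunSketch run = new CharStyleRunSketch();
        run.characters = buf.getInt();
        int mask = buf.getInt();
        if ((mask & ~(FLAGS_MASK | FONT_INDEX | FONT_SIZE | FONT_COLOR)) != 0) {
            // Entry sizes for other stylings are not covered here, so stop rather than guess.
            throw new IllegalArgumentException("unsupported styling in mask 0x" + Integer.toHexString(mask));
        }
        // Entries follow in order of their mask value.
        if ((mask & FLAGS_MASK) != 0) {
            int flags = buf.getShort() & 0xFFFF;
            run.bold = (flags & 0x0001) != 0;
            run.italic = (flags & 0x0002) != 0;
            run.underlined = (flags & 0x0004) != 0;
        }
        if ((mask & FONT_INDEX) != 0) run.fontIndex = buf.getShort() & 0xFFFF;
        if ((mask & FONT_SIZE) != 0) run.fontSize = buf.getShort() & 0xFFFF;
        if ((mask & FONT_COLOR) != 0) run.fontColor = buf.getInt() & 0xFFFFFFFFL;
        return run;
    }

    public static void main(String[] args) {
        // Second character run of the example: 28 characters, italic, size 20, colour 0x05000000.
        byte[] bytes = {0x1C, 0, 0, 0, 0x02, 0, 0x06, 0, 0x02, 0, 0x14, 0, 0, 0, 0, 0x05};
        CharStyleRunSketch run = read(ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN));
        System.out.println(run.characters + " chars, italic=" + run.italic
                + ", size=" + run.fontSize + ", color=" + run.fontColor);
    }
}
```

<p>
Because the buffer position advances with every read, calling <code>read</code> repeatedly on the same buffer would step through consecutive runs, as long as they only use the stylings this sketch understands.
</p>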
|
||||
</section>
|
||||
|
||||
<section><title>Fonts in PowerPoint</title>
|
||||
<p>
|
||||
PowerPoint stores information about the fonts used in FontEntityAtoms,
|
||||
which live inside Document.Environment.FontCollection. For every different
|
||||
font used, a FontEntityAtom must exist for that font. There is always at
|
||||
least one FontEntityAtom in Document.Environment.FontCollection,
|
||||
which describes the default font.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>FontEntityAtom</title>
|
||||
<p>
|
||||
The instance field of the record header contains the zero based index of the
|
||||
font. Font index entries in StyleTextPropAtoms will refer to their required
|
||||
font via this index.
|
||||
</p>
|
||||
<p>
|
||||
The length of FontEntityAtoms is always 68 bytes. The first 64 bytes of
|
||||
it hold the typeface name of the font to be used. This is stored as
|
||||
a null-terminated string, encoded as little endian Unicode (UTF-16LE). (The
|
||||
length of the string must not exceed 32 characters including the null
|
||||
termination, so the typeface name cannot exceed 31 characters).
|
||||
</p>
|
||||
|
||||
<p>
|
||||
After the typeface name there are 4 bytes of bitmask flags. The details of these
|
||||
can be found in the Windows API, under the LOGFONT structure.
|
||||
The 65th byte is the output precision, which defines how closely the system chosen
|
||||
font must match the requested font, in terms of height, width, pitch etc.
|
||||
The 66th byte is the clipping precision, which defines how to clip characters
|
||||
that occur partly outside the clipping region.
|
||||
The 67th byte is the output quality, which defines how closely the system
|
||||
must match the logical font's attributes to those of the physical font used.
|
||||
The 68th (and final) byte is the pitch and family, which is used by the
|
||||
system when matching fonts.
|
||||
</p>
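<p>
Reading the typeface name out of the 68 byte body could be sketched as follows. The names <code>FontEntitySketch</code>, <code>typefaceName</code>, <code>pitchAndFamily</code> and <code>bodyFor</code> are hypothetical and only serve this illustration. The sketch relies solely on the layout described above: 64 bytes of null-terminated UTF-16LE text, followed by 4 single byte fields.
</p>

```java
import java.nio.charset.StandardCharsets;

/** Illustrative only: FontEntitySketch is a hypothetical helper, not a POI class. */
final class FontEntitySketch {

    /** Extracts the null-terminated typeface name from the first 64 bytes of the body. */
    static String typefaceName(byte[] body) {
        if (body.length != 68) {
            throw new IllegalArgumentException("expected 68 bytes, got " + body.length);
        }
        int chars = 0;
        // Each character takes 2 bytes; stop at the first 16 bit zero or after 32 characters.
        while (chars < 32 && (body[2 * chars] != 0 || body[2 * chars + 1] != 0)) {
            chars++;
        }
        return new String(body, 0, 2 * chars, StandardCharsets.UTF_16LE);
    }

    /** Returns the 68th (final) byte as an unsigned value. */
    static int pitchAndFamily(byte[] body) {
        return body[67] & 0xFF;
    }

    /** Demo helper: builds a body holding only the name, with all other bytes left at zero. */
    static byte[] bodyFor(String name) {
        byte[] encoded = name.getBytes(StandardCharsets.UTF_16LE);
        if (encoded.length > 62) {
            throw new IllegalArgumentException("typeface name longer than 31 characters");
        }
        byte[] body = new byte[68]; // zero filled, so the name is always null terminated
        System.arraycopy(encoded, 0, body, 0, encoded.length);
        return body;
    }

    public static void main(String[] args) {
        // prints: Arial
        System.out.println(typefaceName(bodyFor("Arial")));
    }
}
```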
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
@ -0,0 +1,210 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?><!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Rendering slideshows, WMF, EMF and EMF+</title>
|
||||
</header>
|
||||
<body>
|
||||
<note>Please be aware that the documentation on this page reflects current development, which might not
|
||||
have been released. If you rely on an unreleased feature, either use a
|
||||
<a href="site:download">nightly development build</a> or feel free to ask on the
|
||||
<a href="site:mailinglists">mailing list</a> for the release schedule.</note>
|
||||
<section>
|
||||
<title>Rendering slideshows, WMF, EMF and EMF+</title>
|
||||
<p>
|
||||
For rendering slideshows (HSLF/XSLF), WMF, EMF and EMF+ pictures, POI provides a utility class
|
||||
<a href="https://svn.apache.org/viewvc/poi/trunk/poi-ooxml/src/main/java/org/apache/poi/xslf/util/PPTX2PNG.java?view=markup">
|
||||
PPTX2PNG</a>:
|
||||
</p>
|
||||
|
||||
<source><![CDATA[
|
||||
Usage: PPTX2PNG [options] <.ppt/.pptx/.emf/.wmf file or 'stdin'>
|
||||
|
||||
Options:
|
||||
-scale <float> scale factor
|
||||
-fixSide <side> specify side (long,short,width,height) to fix - use <scale> as amount of pixels
|
||||
-slide <integer> 1-based index of a slide to render
|
||||
-format <type> png,gif,jpg,svg,pdf (log,null for testing)
|
||||
-outdir <dir> output directory, defaults to origin of the ppt/pptx file
|
||||
-outfile <file> output filename, defaults to "${basename}-${slideno}.${format}"
|
||||
-outpat <pattern> output filename pattern, defaults to "${basename}-${slideno}.${format}"
|
||||
patterns: basename, slideno, format, ext
|
||||
-dump <file> dump the annotated records to a file
|
||||
-quiet do not write to console (for normal processing)
|
||||
-ignoreParse ignore parsing error and continue with the records read until the error
|
||||
-extractEmbedded extract embedded parts
|
||||
-inputType <type> default input file type (OLE2,WMF,EMF), default is OLE2 = Powerpoint
|
||||
some files (usually wmf) don't have a header, i.e. an identifiable file magic
|
||||
-textAsShapes text elements are saved as shapes in SVG, necessary for variable spacing
|
||||
often found in math formulas
|
||||
-charset <cs> sets the default charset to be used, defaults to Windows-1252
|
||||
-emfHeaderBounds force the usage of the emf header bounds to calculate the bounding box
|
||||
|
||||
-fontdir <dir> (PDF only) font directories separated by ";" - use $HOME for current users home dir
|
||||
defaults to the usual platform directories
|
||||
-fontTtf <regex> (PDF only) regex to match the .ttf filenames
|
||||
-fontMap <map> ";"-separated list of font mappings <typeface from>:<typeface to>
|
||||
]]>
|
||||
</source>
|
||||
|
||||
<section>
|
||||
<title>Instructions to run</title>
|
||||
<p>
|
||||
Download the <a href="https://ci-builds.apache.org/job/POI/job/POI-DSL-1.8/lastSuccessfulBuild/artifact/build/dist/">current nightly</a>
|
||||
and for SVG/PDF the <a href="site:components/index/batikpdf">additional dependencies</a>.</p>
|
||||
<p>Execute the java command (Unix paths need to be adapted on Windows; use "-charset" for non-western WMF/EMFs):</p>
|
||||
<source>
|
||||
java -cp poi-5.4.1.jar:poi-ooxml-5.4.1.jar:poi-ooxml-lite-5.4.1.jar:poi-scratchpad-5.4.1.jar:lib/*:ooxml-lib/*:auxiliary/* org.apache.poi.xslf.util.PPTX2PNG -format png -fixside long -scale 1000 -charset GBK file.pptx
|
||||
</source>
|
||||
<p>
|
||||
If you want to use the renderer on the module path (JPMS), there are currently a few more steps necessary:
|
||||
</p>
|
||||
<ul>
|
||||
<li>Create a build project using Maven, Gradle or your favorite build tool.</li>
|
||||
<li>Alternatively, download the jars from https://repo1.maven.org/maven2/org/apache/poi/</li>
|
||||
<li>Move poi-ooxml-full-5.4.1.jar, poi-javadoc-5.4.1.jar and auxiliary/xml-apis-1.4.01.jar (Java 11+) into a new subdirectory "unused"</li>
|
||||
<li>Move all other jars in current directory into a new subdirectory "poi"</li>
|
||||
<li>Remove auxiliary/batik-script-1.14.jar:/META-INF/services/org.apache.batik.script.InterpreterFactory - see <a href="https://issues.apache.org/jira/browse/BATIK-1260">BATIK-1260</a></li>
|
||||
<li>Invoke PPTX2PNG:
|
||||
<source>
|
||||
java --module-path poi:lib:auxiliary:ooxml-lib --module org.apache.poi.ooxml/org.apache.poi.xslf.util.PPTX2PNG -format png -fixside long -scale 1000 file.pptx
|
||||
</source>
|
||||
</li>
|
||||
</ul>
|
||||
<note>
|
||||
By default, JDK 1.8 uses the PiscesRenderingEngine and is affected by
|
||||
<a href="https://github.com/AdoptOpenJDK/openjdk-build/issues/716">Busy loop hangs</a>.
|
||||
To work around this, use the MarlinRenderingEngine, which is provided as an experimental option starting from
|
||||
<a href="https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8143849">openjdk8u252 (JDK-8143849)</a>
|
||||
via <code>-Dsun.java2d.renderer=sun.java2d.marlin.MarlinRenderingEngine</code> or for older jdk builds,
|
||||
<a href="https://github.com/bourgesl/marlin-renderer/wiki/How-to-use">preload the marlin jar</a>.
|
||||
</note>
|
||||
</section>
|
||||
|
||||
</section>
|
||||
<section>
|
||||
<title>Integrate rendering in your code</title>
|
||||
<section>
|
||||
<title>#1 - Use PPTX2PNG via file or stdin</title>
|
||||
<p>For file system access, you need to save your slideshow/WMF/EMF/EMF+ to disk first and then call <code>
|
||||
PPTX2PNG.main()
|
||||
</code> with the corresponding parameters.
|
||||
</p>
|
||||
|
||||
<p>For stdin access, you need to redirect <code>System.in</code> beforehand:
|
||||
</p>
|
||||
<source><![CDATA[
|
||||
/* the file content */
|
||||
InputStream is = ...;
|
||||
/* Save and set System.in */
|
||||
InputStream oldIn = System.in;
|
||||
try {
|
||||
System.setIn(is);
|
||||
|
||||
String[] args = {
|
||||
"-format", "png", // png,gif,jpg,svg or null for test
|
||||
"-outdir", new File("out/").getCanonicalPath(),
|
||||
"-outfile", "export.png",
|
||||
"-fixside", "long",
|
||||
"-scale", "800",
|
||||
"-ignoreParse",
|
||||
"stdin"
|
||||
};
|
||||
PPTX2PNG.main(args);
|
||||
|
||||
} finally {
|
||||
System.setIn(oldIn);
|
||||
}
|
||||
]]></source>
|
||||
</section>
|
||||
<section>
|
||||
<title>#2 - Render WMF / EMF / EMF+ via the *Picture classes</title>
|
||||
<source><![CDATA[
|
||||
File f = samples.getFile("santa.wmf");
|
||||
try (FileInputStream fis = new FileInputStream(f)) {
|
||||
// for WMF
|
||||
HwmfPicture wmf = new HwmfPicture(fis);
|
||||
|
||||
// for EMF / EMF+
|
||||
// HemfPicture emf = new HemfPicture(fis); // use instead of the WMF line above; the stream can only be consumed once
|
||||
|
||||
Dimension dim = wmf.getSize();
|
||||
int width = Units.pointsToPixel(dim.getWidth());
|
||||
// keep aspect ratio for height
|
||||
int height = Units.pointsToPixel(dim.getHeight());
|
||||
double max = Math.max(width, height);
|
||||
if (max > 1500) {
|
||||
width *= 1500/max;
|
||||
height *= 1500/max;
|
||||
}
|
||||
|
||||
BufferedImage bufImg = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
|
||||
Graphics2D g = bufImg.createGraphics();
|
||||
g.setRenderingHint(RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_ON);
|
||||
g.setRenderingHint(RenderingHints.KEY_RENDERING, RenderingHints.VALUE_RENDER_QUALITY);
|
||||
g.setRenderingHint(RenderingHints.KEY_INTERPOLATION, RenderingHints.VALUE_INTERPOLATION_BICUBIC);
|
||||
g.setRenderingHint(RenderingHints.KEY_FRACTIONALMETRICS, RenderingHints.VALUE_FRACTIONALMETRICS_ON);
|
||||
|
||||
wmf.draw(g, new Rectangle2D.Double(0,0,width,height));
|
||||
|
||||
g.dispose();
|
||||
|
||||
ImageIO.write(bufImg, "PNG", new File("bla.png"));
|
||||
}
|
||||
]]>
|
||||
</source>
|
||||
</section>
|
||||
<section>
|
||||
<title>#3 - Render slideshows directly</title>
|
||||
<source><![CDATA[
|
||||
File file = new File("example.pptx");
|
||||
double scale = 1.5;
|
||||
try (SlideShow<?, ?> ss = SlideShowFactory.create(file, null, true)) {
|
||||
Dimension pgsize = ss.getPageSize();
|
||||
int width = (int) (pgsize.width * scale);
|
||||
int height = (int) (pgsize.height * scale);
|
||||
|
||||
for (Slide<?, ?> slide : ss.getSlides()) {
|
||||
BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
|
||||
Graphics2D graphics = img.createGraphics();
|
||||
|
||||
// default rendering options
|
||||
graphics.setRenderingHint(RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_ON);
|
||||
graphics.setRenderingHint(RenderingHints.KEY_RENDERING, RenderingHints.VALUE_RENDER_QUALITY);
|
||||
graphics.setRenderingHint(RenderingHints.KEY_INTERPOLATION, RenderingHints.VALUE_INTERPOLATION_BICUBIC);
|
||||
graphics.setRenderingHint(RenderingHints.KEY_FRACTIONALMETRICS, RenderingHints.VALUE_FRACTIONALMETRICS_ON);
|
||||
graphics.setRenderingHint(Drawable.BUFFERED_IMAGE, new WeakReference<>(img));
|
||||
|
||||
graphics.scale(scale, scale);
|
||||
|
||||
// draw stuff
|
||||
slide.draw(graphics);
|
||||
|
||||
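// one file per slide, otherwise every slide would overwrite the same output file
ImageIO.write(img, "PNG", new File("slide-" + slide.getSlideNumber() + ".png"));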
|
||||
graphics.dispose();
|
||||
img.flush();
|
||||
}
|
||||
}
|
||||
]]></source>
|
||||
</section>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
@ -0,0 +1,133 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>POI-HSLF - A Quick Guide</title>
|
||||
<subtitle>Overview</subtitle>
|
||||
<authors>
|
||||
<person name="Nick Burch" email="nick at torchbox dot com"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section><title>Basic Text Extraction</title>
|
||||
<p>For basic text extraction, make use of
|
||||
<code>org.apache.poi.sl.extractor.SlideShowExtractor</code>.
|
||||
It accepts a slideshow which can be created from a file or stream via <code>org.apache.poi.sl.usermodel.SlideShowFactory</code>.
|
||||
The <code>getText()</code> method can be used to get the text from the slides.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>Specific Text Extraction</title>
|
||||
<p>To get specific bits of text, first create a <code>org.apache.poi.hslf.usermodel.HSLFSlideShow</code>
|
||||
(from a <code>org.apache.poi.hslf.usermodel.HSLFSlideShowImpl</code>, which accepts a file or an input
|
||||
stream). Use <code>getSlides()</code> and <code>getNotes()</code> to get the slides and notes.
|
||||
These can be queried to get their page ID (though they should be returned
|
||||
in the right order).</p>
|
||||
<p>You can then call <code>getTextParagraphs()</code> on these, to get
|
||||
their blocks of text. (A list of <code>HSLFTextParagraph</code> normally holds all the text in a
|
||||
given area of the page, eg in the title bar, or in a box).
|
||||
From the <code>HSLFTextParagraph</code>, you can extract the text, and check
|
||||
what type of text it is (eg Body, Title). You can also call
|
||||
<code>getTextRuns()</code>, which will return the
|
||||
<code>HSLFTextRun</code>s that make up the <code>HSLFTextParagraph</code>. A
|
||||
<code>HSLFTextRun</code> is a text fragment whose characters all share the same formatting.
|
||||
The paragraph formatting is defined in the parent <code>HSLFTextParagraph</code>.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>Poor Quality Text Extraction</title>
|
||||
<p>If speed is the most important thing for you, you don't care
|
||||
about getting duplicate blocks of text, you don't care about
|
||||
getting text from master sheets, and you don't care about getting
|
||||
old text, then
|
||||
<code>org.apache.poi.hslf.extractor.QuickButCruddyTextExtractor</code>
|
||||
might be of use.</p>
|
||||
<p>QuickButCruddyTextExtractor doesn't use the normal record
|
||||
parsing code, instead it uses a tree structure blind search
|
||||
method to get all text holding records. You will get all the text,
|
||||
including lots of text you normally wouldn't ever want. However,
|
||||
you will get it back very very fast!</p>
|
||||
<p>There are two ways of getting the text back.
|
||||
<code>getTextAsString()</code> will return a single string with all
|
||||
the text in it. <code>getTextAsVector()</code> will return a
|
||||
vector of strings, one for each text record found in the file.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>Changing Text</title>
|
||||
<p>It is possible to change the text via
|
||||
<code>HSLFTextParagraph.setText(List<HSLFTextParagraph>,String)</code> or
|
||||
<code>HSLFTextRun.setText(String)</code>. It is possible to add additional TextRuns
|
||||
with <code>HSLFTextParagraph.appendText(List<HSLFTextParagraph>,String,boolean)</code>
|
||||
or <code>HSLFTextParagraph.addTextRun(HSLFTextRun)</code>.</p>
|
||||
<p>When calling <code>HSLFTextParagraph.setText(List<HSLFTextParagraph>,String)</code>, all
|
||||
the text will end up with the same formatting. When calling
|
||||
<code>HSLFTextRun.setText(String)</code>, the text will retain
|
||||
the old formatting of that <code>HSLFTextRun</code>.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>Adding Slides</title>
|
||||
<p>You may add new slides by calling
|
||||
<code>HSLFSlideShow.createSlide()</code>, which will add a new slide
|
||||
to the end of the SlideShow. It is possible to re-order slides with <code>HSLFSlideShow.reorderSlide(...)</code>.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>Guide to key classes</title>
|
||||
<ul>
|
||||
<li><code>org.apache.poi.hslf.usermodel.HSLFSlideShowImpl</code>
|
||||
Handles reading in and writing out files. Calls
|
||||
<code>org.apache.poi.hslf.record.Record</code> to build a tree
|
||||
of all the records in the file, which it allows access to.
|
||||
</li>
|
||||
<li><code>org.apache.poi.hslf.record.Record</code>
|
||||
Base class of all records. Also provides the main record generation
|
||||
code, which will build up a tree of records for a file.
|
||||
</li>
|
||||
<li><code>org.apache.poi.hslf.usermodel.HSLFSlideShow</code>
|
||||
Builds up model entries from the records, and presents a user facing
|
||||
view of the file.
|
||||
</li>
|
||||
<li><code>org.apache.poi.hslf.usermodel.HSLFSlide</code>
|
||||
A user facing view of a Slide in a slideshow. Allows you to get at the
|
||||
Text of the slide, and at any drawing objects on it.
|
||||
</li>
|
||||
<li><code>org.apache.poi.hslf.usermodel.HSLFTextParagraph</code>
|
||||
A list of <code>HSLFTextParagraph</code>s holds all the text in a given area of the Slide, and will
|
||||
contain one or more <code>HSLFTextRun</code>s.
|
||||
</li>
|
||||
<li><code>org.apache.poi.hslf.usermodel.HSLFTextRun</code>
|
||||
Holds a run of text, all having the same character stylings. It is possible to modify text, and/or text stylings.
|
||||
</li>
|
||||
<li><code>org.apache.poi.sl.extractor.SlideShowExtractor</code>
|
||||
Uses the model code to allow extraction of text from files.
|
||||
</li>
|
||||
<li><code>org.apache.poi.hslf.extractor.QuickButCruddyTextExtractor</code>
|
||||
Uses the record code to extract all the text from files very fast,
|
||||
but including deleted text (and other bits of Crud).
|
||||
</li>
|
||||
</ul>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
@ -0,0 +1,304 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>XSLF Cookbook</title>
|
||||
<authors>
|
||||
<person email="yegor@apache.org" name="Yegor Kozlov" id="YK"/>
|
||||
</authors>
|
||||
</header>
|
||||
<body>
|
||||
<section><title>XSLF Cookbook</title>
|
||||
<p>
|
||||
This page offers a short introduction to the XSLF API. More examples can be found in the
|
||||
<a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xslf/">XSLF Examples</a>
|
||||
in the POI SVN repository.
|
||||
</p>
|
||||
<note>
|
||||
Please note that XSLF is still in early development and is subject to incompatible changes in a future release.
|
||||
</note>
|
||||
<section><title>Index of Features</title>
|
||||
<ul>
|
||||
<li><a href="#NewPresentation">Create a new presentation</a></li>
|
||||
<li><a href="#ReadPresentation">Read an existing presentation</a></li>
|
||||
<li><a href="#SlideLayout">Create a slide with a predefined layout</a></li>
|
||||
<li><a href="#DeleteSlide">Delete slide</a></li>
|
||||
<li><a href="#MoveSlide">Re-order slides</a></li>
|
||||
<li><a href="#SlideSize">Change slide size</a></li>
|
||||
<li><a href="#GetShapes">Read shapes</a></li>
|
||||
<li><a href="#AddImage">Add image</a></li>
|
||||
<li><a href="#ReadImages">Read images contained in a presentation</a></li>
|
||||
<li><a href="#Text">Format text</a></li>
|
||||
<li><a href="#Hyperlinks">Hyperlinks</a></li>
|
||||
<li><a href="#PPTX2PNG">Convert .pptx slides into images</a></li>
|
||||
<li><a href="#Merge">Merge multiple presentations together</a></li>
|
||||
</ul>
|
||||
</section>
|
||||
<section><title>Cookbook</title>
|
||||
<anchor id="NewPresentation"/>
|
||||
<section><title>New Presentation</title>
|
||||
<p>
|
||||
The following code creates a new .pptx slide show and adds a blank slide to it:
|
||||
</p>
|
||||
<source>
|
||||
//create a new empty slide show
|
||||
XMLSlideShow ppt = new XMLSlideShow();
|
||||
|
||||
//add first slide
|
||||
XSLFSlide blankSlide = ppt.createSlide();
|
||||
</source>
|
||||
</section>
|
||||
<anchor id="ReadPresentation"/>
|
||||
<section><title>Read an existing presentation and append a slide to it</title>
|
||||
<source>
|
||||
XMLSlideShow ppt = new XMLSlideShow(new FileInputStream("slideshow.pptx"));
|
||||
|
||||
//append a new slide to the end
|
||||
XSLFSlide blankSlide = ppt.createSlide();
|
||||
</source>
|
||||
</section>
|
||||
|
||||
<anchor id="SlideLayout"/>
|
||||
<section><title>Create a new slide from a predefined slide layout</title>
|
||||
<source>
|
||||
XMLSlideShow ppt = new XMLSlideShow(new FileInputStream("slideshow.pptx"));
|
||||
|
||||
// first see what slide layouts are available:
|
||||
System.out.println("Available slide layouts:");
|
||||
for(XSLFSlideMaster master : ppt.getSlideMasters()){
|
||||
for(XSLFSlideLayout layout : master.getSlideLayouts()){
|
||||
System.out.println(layout.getType());
|
||||
}
|
||||
}
|
||||
|
||||
// blank slide
|
||||
XSLFSlide blankSlide = ppt.createSlide();
|
||||
|
||||
// there can be multiple masters each referencing a number of layouts
|
||||
// for demonstration purposes we use the first (default) slide master
|
||||
XSLFSlideMaster defaultMaster = ppt.getSlideMasters().get(0);
|
||||
|
||||
// title slide
|
||||
XSLFSlideLayout titleLayout = defaultMaster.getLayout(SlideLayout.TITLE);
|
||||
// fill the placeholders
|
||||
XSLFSlide slide1 = ppt.createSlide(titleLayout);
|
||||
XSLFTextShape title1 = slide1.getPlaceholder(0);
|
||||
title1.setText("First Title");
|
||||
|
||||
// title and content
|
||||
XSLFSlideLayout titleBodyLayout = defaultMaster.getLayout(SlideLayout.TITLE_AND_CONTENT);
|
||||
XSLFSlide slide2 = ppt.createSlide(titleBodyLayout);
|
||||
|
||||
XSLFTextShape title2 = slide2.getPlaceholder(0);
|
||||
title2.setText("Second Title");
|
||||
|
||||
XSLFTextShape body2 = slide2.getPlaceholder(1);
|
||||
body2.clearText(); // unset any existing text
|
||||
body2.addNewTextParagraph().addNewTextRun().setText("First paragraph");
|
||||
body2.addNewTextParagraph().addNewTextRun().setText("Second paragraph");
|
||||
body2.addNewTextParagraph().addNewTextRun().setText("Third paragraph");
|
||||
</source>
|
||||
</section>
|
||||
|
||||
<anchor id="DeleteSlide"/>
|
||||
<section><title>Delete slide</title>
|
||||
<source>
|
||||
XMLSlideShow ppt = new XMLSlideShow(new FileInputStream("slideshow.pptx"));
|
||||
|
||||
ppt.removeSlide(0); // 0-based index of a slide to be removed
|
||||
</source>
|
||||
</section>
|
||||
|
||||
<anchor id="MoveSlide"/>
|
||||
<section><title>Re-order slides</title>
|
||||
<source>
|
||||
XMLSlideShow ppt = new XMLSlideShow(new FileInputStream("slideshow.pptx"));
|
||||
List<XSLFSlide> slides = ppt.getSlides();
|
||||
|
||||
XSLFSlide thirdSlide = slides.get(2);
|
||||
ppt.setSlideOrder(thirdSlide, 0); // move the third slide to the beginning
|
||||
</source>
|
||||
</section>
|
||||
|
||||
<anchor id="SlideSize"/>
|
||||
<section><title>How to retrieve or change slide size</title>
|
||||
<source>
|
||||
XMLSlideShow ppt = new XMLSlideShow();
|
||||
//retrieve page size. Coordinates are expressed in points (72 dpi)
|
||||
java.awt.Dimension pgsize = ppt.getPageSize();
|
||||
int pgx = pgsize.width; //slide width in points
|
||||
int pgy = pgsize.height; //slide height in points
|
||||
|
||||
//set new page size
|
||||
ppt.setPageSize(new java.awt.Dimension(1024, 768));
|
||||
</source>
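<p>
Because page sizes are expressed in points (1/72 of an inch), converting them to inches or to
pixels at a given resolution is simple arithmetic. The following is a minimal sketch of that
conversion; the <code>PointUnits</code> class and its method names are illustrative assumptions
and are not part of the POI API.
</p>

```java
/**
 * Illustrative helper (not part of POI) for converting slide
 * coordinates, which are expressed in points, to other units.
 */
public final class PointUnits {
    /** One point is 1/72 of an inch. */
    public static final double POINTS_PER_INCH = 72.0;

    private PointUnits() {
    }

    /** Converts a length in points to inches. */
    public static double toInches(double points) {
        return points / POINTS_PER_INCH;
    }

    /** Converts a length in points to whole pixels at the given resolution (dots per inch). */
    public static int toPixels(double points, double dpi) {
        return (int) Math.round(points * dpi / POINTS_PER_INCH);
    }
}
```

<p>
For example, a page that is 720 points wide is 10 inches wide, which corresponds to 960 pixels at 96 dpi.
</p>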
|
||||
</section>
|
||||
<anchor id="GetShapes"/>
|
||||
<section><title>How to read shapes contained in a particular slide</title>
|
||||
<p>
|
||||
The following code demonstrates how to iterate over shapes for each slide.
|
||||
</p>
|
||||
<source>
|
||||
XMLSlideShow ppt = new XMLSlideShow(new FileInputStream("slideshow.pptx"));
|
||||
// get slides
|
||||
for (XSLFSlide slide : ppt.getSlides()) {
|
||||
for (XSLFShape sh : slide.getShapes()) {
|
||||
// name of the shape
|
||||
String name = sh.getShapeName();
|
||||
|
||||
// the shape's anchor, which defines the position of this shape in the slide
|
||||
if (sh instanceof PlaceableShape) {
|
||||
java.awt.geom.Rectangle2D anchor = ((PlaceableShape)sh).getAnchor();
|
||||
}
|
||||
|
||||
if (sh instanceof XSLFConnectorShape) {
|
||||
XSLFConnectorShape line = (XSLFConnectorShape) sh;
|
||||
// work with Line
|
||||
} else if (sh instanceof XSLFTextShape) {
|
||||
XSLFTextShape shape = (XSLFTextShape) sh;
|
||||
// work with a shape that can hold text
|
||||
} else if (sh instanceof XSLFPictureShape) {
|
||||
XSLFPictureShape shape = (XSLFPictureShape) sh;
|
||||
// work with Picture
|
||||
}
|
||||
}
|
||||
}
|
||||
</source>
|
||||
</section>
|
||||
<anchor id="AddImage"/>
|
||||
<section><title>Add Image to Slide</title>
|
||||
<source>
|
||||
XMLSlideShow ppt = new XMLSlideShow();
|
||||
XSLFSlide slide = ppt.createSlide();
|
||||
|
||||
byte[] pictureData = IOUtils.toByteArray(new FileInputStream("image.png"));
|
||||
|
||||
XSLFPictureData pd = ppt.addPicture(pictureData, PictureData.PictureType.PNG);
|
||||
XSLFPictureShape pic = slide.createPicture(pd);
|
||||
</source>
|
||||
</section>
|
||||
|
||||
<anchor id="ReadImages"/>
|
||||
<section><title>Read Images contained within a presentation</title>
|
||||
<source>
|
||||
XMLSlideShow ppt = new XMLSlideShow(new FileInputStream("slideshow.pptx"));
|
||||
for(XSLFPictureData data : ppt.getAllPictures()){
|
||||
byte[] bytes = data.getData();
|
||||
String fileName = data.getFileName();
|
||||
|
||||
}
|
||||
</source>
|
||||
</section>
|
||||
|
||||
<anchor id="Text"/>
|
||||
<section><title>Basic text formatting</title>
|
||||
<source>
|
||||
XMLSlideShow ppt = new XMLSlideShow();
|
||||
XSLFSlide slide = ppt.createSlide();
|
||||
|
||||
XSLFTextBox shape = slide.createTextBox();
|
||||
XSLFTextParagraph p = shape.addNewTextParagraph();
|
||||
|
||||
XSLFTextRun r1 = p.addNewTextRun();
|
||||
r1.setText("The");
|
||||
r1.setFontColor(Color.blue);
|
||||
r1.setFontSize(24.);
|
||||
|
||||
XSLFTextRun r2 = p.addNewTextRun();
|
||||
r2.setText(" quick");
|
||||
r2.setFontColor(Color.red);
|
||||
r2.setBold(true);
|
||||
|
||||
XSLFTextRun r3 = p.addNewTextRun();
|
||||
r3.setText(" brown");
|
||||
r3.setFontSize(12.);
|
||||
r3.setItalic(true);
|
||||
r3.setStrikethrough(true);
|
||||
|
||||
XSLFTextRun r4 = p.addNewTextRun();
|
||||
r4.setText(" fox");
|
||||
r4.setUnderline(true);
|
||||
</source>
|
||||
</section>
|
||||
<anchor id="Hyperlinks"/>
|
||||
<section><title>How to create a hyperlink</title>
|
||||
<source>
|
||||
XMLSlideShow ppt = new XMLSlideShow();
|
||||
XSLFSlide slide = ppt.createSlide();
|
||||
|
||||
// assign a hyperlink to a text run
|
||||
XSLFTextBox shape = slide.createTextBox();
|
||||
XSLFTextRun r = shape.addNewTextParagraph().addNewTextRun();
|
||||
r.setText("Apache POI");
|
||||
XSLFHyperlink link = r.createHyperlink();
|
||||
link.setAddress("https://poi.apache.org");
|
||||
</source>
|
||||
</section>
|
||||
<anchor id="PPTX2PNG"/>
|
||||
<section><title>PPTX2PNG is an application that converts each slide of a .pptx slideshow into a PNG image</title>
|
||||
<source>
|
||||
Usage: PPTX2PNG [options] <pptx file>
|
||||
Options:
|
||||
-scale <float> scale factor (default is 1.0)
|
||||
-slide <integer> 1-based index of a slide to render. Default is to render all slides.
|
||||
</source>
|
||||
<p>How it works:</p>
|
||||
<p>
|
||||
The XSLFSlide object implements a draw(Graphics2D graphics) method that recursively paints all shapes
|
||||
in the slide into the supplied graphics canvas:
|
||||
</p>
|
||||
<source>
|
||||
slide.draw(graphics);
|
||||
</source>
|
||||
<p>
|
||||
where graphics is a class implementing java.awt.Graphics2D. In PPTX2PNG the graphic canvas is derived from
|
||||
java.awt.image.BufferedImage, i.e. the destination is an image in memory, but in the general case you can pass
|
||||
any compliant implementation of java.awt.Graphics2D.
|
||||
Find more information in the designated <a href="site:slrender">render page</a>, e.g. on how to render SVG images.
|
||||
</p>
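<p>
The in-memory rendering described above can be sketched with the JDK alone. In the sketch below the
<code>painter</code> callback stands in for the call to <code>slide.draw(graphics)</code>; the
<code>RenderSketch</code> class, its <code>render</code> method and its parameters are assumptions
made for illustration and are not taken from the PPTX2PNG source.
</p>

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.geom.Rectangle2D;
import java.awt.image.BufferedImage;
import java.util.function.Consumer;

/** Illustrative sketch (not POI code): paint a page into an in-memory image. */
public final class RenderSketch {
    private RenderSketch() {
    }

    /**
     * Creates an image large enough for a page of the given size (in points)
     * at the given scale, clears it to white and lets the painter draw on it.
     * With POI, the painter would typically delegate to slide.draw(graphics).
     */
    public static BufferedImage render(int widthPt, int heightPt, double scale,
                                       Consumer<Graphics2D> painter) {
        int width = (int) Math.ceil(widthPt * scale);
        int height = (int) Math.ceil(heightPt * scale);
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
        Graphics2D graphics = img.createGraphics();
        try {
            // clear the drawing area before any scaling is applied
            graphics.setPaint(Color.WHITE);
            graphics.fill(new Rectangle2D.Double(0, 0, width, height));
            // page coordinates are in points, so scale them to image pixels
            graphics.scale(scale, scale);
            painter.accept(graphics);
        } finally {
            graphics.dispose();
        }
        return img;
    }
}
```

<p>
The resulting image can then be written out with <code>javax.imageio.ImageIO.write(img, "png", file)</code>.
</p>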
|
||||
</section>
|
||||
<anchor id="Merge"/>
|
||||
<section>
|
||||
<title>Merge multiple presentations together</title>
|
||||
<source>
|
||||
XMLSlideShow ppt = new XMLSlideShow();
|
||||
String[] inputs = {"presentation1.pptx", "presentation2.pptx"};
|
||||
for(String arg : inputs){
|
||||
FileInputStream is = new FileInputStream(arg);
|
||||
XMLSlideShow src = new XMLSlideShow(is);
|
||||
is.close();
|
||||
|
||||
for(XSLFSlide srcSlide : src.getSlides()){
|
||||
ppt.createSlide().importContent(srcSlide);
|
||||
}
|
||||
}
|
||||
|
||||
FileOutputStream out = new FileOutputStream("merged.pptx");
|
||||
ppt.write(out);
|
||||
out.close();
|
||||
</source>
|
||||
</section>
|
||||
|
||||
</section>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
1532
src/documentation/content/xdocs/components/spreadsheet/chart.xml
Normal file
@ -0,0 +1,232 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Upgrading to POI 3.5, including converting existing HSSF Usermodel code to SS Usermodel (for XSSF and HSSF)</title>
|
||||
<authors>
|
||||
<person email="nick@apache.org" name="Nick Burch" id="NB"/>
|
||||
</authors>
|
||||
</header>
|
||||
<body>
|
||||
<section><title>Things that have to be changed when upgrading to POI 3.5</title>
|
||||
<p>Wherever possible, we have tried to ensure that you can use your
|
||||
existing POI code with POI 3.5 without requiring any changes. However,
|
||||
Java doesn't always make that easy, and unfortunately there are a
|
||||
few changes that may be required for some users.</p>
|
||||
<section><title>org.apache.poi.hssf.usermodel.HSSFFormulaEvaluator.CellValue</title>
|
||||
<p>Annoyingly, Java will not let you access a static inner class via
|
||||
a child of the parent one. So, all references to
|
||||
<em>org.apache.poi.hssf.usermodel.HSSFFormulaEvaluator.CellValue</em>
|
||||
will need to be changed to
|
||||
<em>org.apache.poi.ss.usermodel.FormulaEvaluator.CellValue</em>
|
||||
</p>
|
||||
</section>
|
||||
<section><title>org.apache.poi.hssf.usermodel.HSSFRow.MissingCellPolicy</title>
|
||||
<p>Annoyingly, Java will not let you access a static inner class via
|
||||
a child of the parent one. So, all references to
|
||||
<em>org.apache.poi.hssf.usermodel.HSSFRow.MissingCellPolicy</em>
|
||||
will need to be changed to
|
||||
<em>org.apache.poi.ss.usermodel.Row.MissingCellPolicy</em>
|
||||
</p>
|
||||
</section>
|
||||
<section><title>DDF and org.apache.poi.hssf.record.RecordFormatException</title>
|
||||
<p>Previously, record level errors within DDF would throw an
|
||||
exception from the hssf class hierarchy. Now, record level errors
|
||||
within DDF will throw a more general RecordFormatException,
|
||||
<em>org.apache.poi.util.RecordFormatException</em></p>
|
||||
<p>In addition, org.apache.poi.hssf.record.RecordFormatException
|
||||
has been changed to inherit from the new
|
||||
<em>org.apache.poi.util.RecordFormatException</em>, so you may
|
||||
wish to change catches of the hssf version to the new util version.
|
||||
</p>
|
||||
</section>
|
||||
</section>
|
||||
<section><title>Converting existing HSSF Usermodel code to SS Usermodel (for XSSF and HSSF)</title>
|
||||
|
||||
<section><title>Why change?</title>
|
||||
<p>If you have existing HSSF usermodel code that works just
|
||||
fine, and you don't want to use the new OOXML XSSF support,
|
||||
then you probably don't need to. Your existing HSSF only code
|
||||
will continue to work just fine.</p>
|
||||
<p>However, if you want to be able to work with both HSSF for
|
||||
your .xls files, and also XSSF for .xlsx files, then you will
|
||||
need to make some slight tweaks to your code.</p>
|
||||
</section>
|
||||
<section><title>org.apache.poi.ss.usermodel</title>
|
||||
<p>The new SS usermodel (org.apache.poi.ss.usermodel) is very
|
||||
heavily based on the old HSSF usermodel
|
||||
(org.apache.poi.hssf.usermodel). The main difference is that
|
||||
the package name and class names have been tweaked to remove
|
||||
HSSF from them. Otherwise, the new SS Usermodel interfaces
|
||||
should provide the same functionality.</p>
|
||||
</section>
|
||||
<section><title>Constructors</title>
|
||||
<p>Calling the empty HSSFWorkbook constructor remains the way to
|
||||
create a new, empty Workbook object. To open an existing
|
||||
Workbook, you should now call WorkbookFactory.create(inp).</p>
|
||||
<p>For all other cases when you would have called a
|
||||
Usermodel constructor, such as 'new HSSFRichTextString()' or
|
||||
'new HSSFDataFormat', you should instead use a CreationHelper.
|
||||
There's a method on the Workbook to get a CreationHelper, and
|
||||
the CreationHelper will then handle constructing new objects
|
||||
for you.</p>
|
||||
</section>
|
||||
<section><title>Other Code</title>
|
||||
<p>For all other code, generally change a reference from
|
||||
org.apache.poi.hssf.usermodel.HSSFFoo to a reference to
|
||||
org.apache.poi.ss.usermodel.Foo. Method signatures should
|
||||
otherwise remain the same, and it should all then work for
|
||||
both XSSF and HSSF.</p>
|
||||
</section>
|
||||
</section>
|
||||
<section><title>Worked Examples</title>
|
||||
<section><title>Old HSSF Code</title>
|
||||
<source><![CDATA[
|
||||
// import java.io.FileOutputStream;
// import org.apache.poi.hssf.usermodel.*;
// import org.apache.poi.hssf.util.HSSFColor;
|
||||
|
||||
HSSFWorkbook wb = new HSSFWorkbook();
|
||||
// create a new sheet
|
||||
HSSFSheet s = wb.createSheet();
|
||||
// declare a row object reference
|
||||
HSSFRow r = null;
|
||||
// declare a cell object reference
|
||||
HSSFCell c = null;
|
||||
// create 2 cell styles
|
||||
HSSFCellStyle cs = wb.createCellStyle();
|
||||
HSSFCellStyle cs2 = wb.createCellStyle();
|
||||
HSSFDataFormat df = wb.createDataFormat();
|
||||
|
||||
// create 2 fonts objects
|
||||
HSSFFont f = wb.createFont();
|
||||
HSSFFont f2 = wb.createFont();
|
||||
|
||||
// Set font 1 to 12 point type, red and bold
|
||||
f.setFontHeightInPoints((short) 12);
|
||||
f.setColor( HSSFColor.RED.index );
|
||||
f.setBoldweight(HSSFFont.BOLDWEIGHT_BOLD);
|
||||
|
||||
// Set font 2 to 10 point type, red and bold
|
||||
f2.setFontHeightInPoints((short) 10);
|
||||
f2.setColor( HSSFColor.RED.index );
|
||||
f2.setBoldweight(HSSFFont.BOLDWEIGHT_BOLD);
|
||||
|
||||
// Set cell style and formatting
|
||||
cs.setFont(f);
|
||||
cs.setDataFormat(df.getFormat("#,##0.0"));
|
||||
|
||||
// Set the other cell style and formatting
|
||||
cs2.setBorderBottom(HSSFCellStyle.BORDER_THIN);
|
||||
cs2.setDataFormat(HSSFDataFormat.getBuiltinFormat("text"));
|
||||
cs2.setFont(f2);
|
||||
|
||||
|
||||
// Define a few rows
|
||||
for(short rownum = (short)0; rownum < 30; rownum++) {
|
||||
r = s.createRow(rownum); // reuse the row reference declared above
|
||||
for(short cellnum = (short)0; cellnum < 10; cellnum += 2) {
|
||||
c = r.createCell(cellnum); // reuse the cell reference declared above
|
||||
HSSFCell c2 = r.createCell(cellnum+1);
|
||||
|
||||
c.setCellValue((double)rownum + (cellnum/10));
|
||||
c2.setCellValue(new HSSFRichTextString("Hello! " + cellnum));
|
||||
}
|
||||
}
|
||||
|
||||
// Save
|
||||
FileOutputStream out = new FileOutputStream("workbook.xls");
|
||||
wb.write(out);
|
||||
out.close();
|
||||
]]></source>
|
||||
</section>
|
||||
<section><title>New, generic SS Usermodel Code</title>
|
||||
<source><![CDATA[
|
||||
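// import java.io.FileOutputStream;
// import org.apache.poi.ss.usermodel.*;
// import org.apache.poi.hssf.usermodel.HSSFWorkbook;
// import org.apache.poi.xssf.usermodel.XSSFWorkbook;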
|
||||
|
||||
Workbook[] wbs = new Workbook[] { new HSSFWorkbook(), new XSSFWorkbook() };
|
||||
for(int i=0; i<wbs.length; i++) {
|
||||
Workbook wb = wbs[i];
|
||||
CreationHelper createHelper = wb.getCreationHelper();
|
||||
|
||||
// create a new sheet
|
||||
Sheet s = wb.createSheet();
|
||||
// declare a row object reference
|
||||
Row r = null;
|
||||
// declare a cell object reference
|
||||
Cell c = null;
|
||||
// create 2 cell styles
|
||||
CellStyle cs = wb.createCellStyle();
|
||||
CellStyle cs2 = wb.createCellStyle();
|
||||
DataFormat df = wb.createDataFormat();
|
||||
|
||||
// create 2 fonts objects
|
||||
Font f = wb.createFont();
|
||||
Font f2 = wb.createFont();
|
||||
|
||||
// Set font 1 to 12 point type, red and bold
|
||||
f.setFontHeightInPoints((short) 12);
|
||||
f.setColor( IndexedColors.RED.getIndex() );
|
||||
f.setBoldweight(Font.BOLDWEIGHT_BOLD);
|
||||
|
||||
// Set font 2 to 10 point type, red and bold
|
||||
f2.setFontHeightInPoints((short) 10);
|
||||
f2.setColor( IndexedColors.RED.getIndex() );
|
||||
f2.setBoldweight(Font.BOLDWEIGHT_BOLD);
|
||||
|
||||
// Set cell style and formatting
|
||||
cs.setFont(f);
|
||||
cs.setDataFormat(df.getFormat("#,##0.0"));
|
||||
|
||||
// Set the other cell style and formatting
|
||||
cs2.setBorderBottom(CellStyle.BORDER_THIN);
|
||||
cs2.setDataFormat(df.getFormat("text"));
|
||||
cs2.setFont(f2);
|
||||
|
||||
|
||||
// Define a few rows
|
||||
for(int rownum = 0; rownum < 30; rownum++) {
|
||||
r = s.createRow(rownum); // reuse the row reference declared above
|
||||
for(int cellnum = 0; cellnum < 10; cellnum += 2) {
|
||||
c = r.createCell(cellnum); // reuse the cell reference declared above
|
||||
Cell c2 = r.createCell(cellnum+1);
|
||||
|
||||
c.setCellValue((double)rownum + (cellnum/10));
|
||||
c2.setCellValue(
|
||||
createHelper.createRichTextString("Hello! " + cellnum)
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
// Save
|
||||
String filename = "workbook.xls";
|
||||
if(wb instanceof XSSFWorkbook) {
|
||||
filename = filename + "x";
|
||||
}
|
||||
|
||||
FileOutputStream out = new FileOutputStream(filename);
|
||||
wb.write(out);
|
||||
out.close();
|
||||
}
|
||||
]]></source>
|
||||
</section>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
@ -0,0 +1,40 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>HSSF</title>
|
||||
<subtitle>Overview</subtitle>
|
||||
<authors>
|
||||
<person name="Andrew C. Oliver" email="acoliver@apache.org"/>
|
||||
<person name="Nicola Ken Barozzi" email="barozzi@nicolaken.com"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section>
|
||||
<title>Usermodel Class Diagram by Matthew Young</title>
|
||||
<p>
|
||||
<img src="images/usermodel.gif" alt="Usermodel"/>
|
||||
</p>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
@ -0,0 +1,56 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>HSSF</title>
|
||||
<subtitle>Overview</subtitle>
|
||||
<authors>
|
||||
<person name="Andrew C. Oliver" email="acoliver@apache.org"/>
|
||||
<person name="Nicola Ken Barozzi" email="barozzi@nicolaken.com"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section><title>Overview</title>
|
||||
<p>
|
||||
This section is intended for diagrams (UML/etc) that help
|
||||
explain HSSF.
|
||||
</p>
|
||||
<ul>
|
||||
<li>
|
||||
<a href="diagram1.html">HSSF usermodel class diagram</a> -
|
||||
by Matthew Young (myoung at westernasset dot com)
|
||||
</li>
|
||||
</ul>
|
||||
<p>
|
||||
Have more? Add a new "bug" to the bug database with [DOCUMENTATION]
|
||||
prefacing the description and a link to the file on an http server
|
||||
somewhere. If you don't have your own webserver, then you can email it
|
||||
to (acoliver at apache dot org) provided it's smaller than 5MB. Diagrams should be
|
||||
in some format that can be read at least on Linux and Windows. Diagrams
|
||||
that can be edited are preferable, but let's face it, there aren't too
|
||||
many good affordable UML tools yet! And no they don't HAVE to be UML...
|
||||
just useful.
|
||||
</p>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
@ -0,0 +1,591 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Developing Formula Evaluation</title>
|
||||
<authors>
|
||||
<person email="amoweb@yahoo.com" name="Amol Deshmukh" id="AD"/>
|
||||
<person email="yegor@apache.org" name="Yegor Kozlov" id="YK"/>
|
||||
</authors>
|
||||
</header>
|
||||
<body>
|
||||
<section><title>Introduction</title>
|
||||
<p>
|
||||
This document is for developers wishing to contribute to the
|
||||
FormulaEvaluator API functionality.
|
||||
</p>
|
||||
<p>
|
||||
When evaluating workbooks you may encounter an <code>org.apache.poi.ss.formula.eval.NotImplementedException</code>
|
||||
which indicates that a function is not (yet) supported by POI. Is there a workaround?
|
||||
Yes, the POI framework makes it easy to add implementations of new functions. Prior to POI-3.8
you had to check out the source code from svn and make a custom build with your function implementation.
Since POI-3.8 you can register new functions at run-time.
|
||||
</p>
|
||||
<p>
|
||||
Currently, contribution is desired for implementing the standard MS
|
||||
Excel functions. Placeholder classes for these have been created;
|
||||
contributors only need to insert implementation for the
|
||||
individual <code>evaluate()</code> methods that do the actual evaluation.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Overview of FormulaEvaluator </title>
|
||||
<p>
|
||||
Briefly, a formula string (along with the sheet and workbook that
|
||||
form the context in which the formula is evaluated) is first parsed
|
||||
into Reverse Polish Notation (RPN) tokens using the <code>FormulaParser</code> class.
|
||||
(If you don't know what RPN tokens are, now is a good time to
|
||||
read <a href="http://www-stone.ch.cam.ac.uk/documentation/rrf/rpn.html">
|
||||
Anthony Stone's description of RPN</a>.)
|
||||
</p>
|
||||
<section><title> The big picture</title>
|
||||
<p>
|
||||
RPN tokens are mapped to <code>Eval</code> classes. (The class hierarchy for the <code>Eval</code>s
|
||||
is best understood if you view it in a class diagram
|
||||
viewer.) Depending on the type of RPN token (also called <code>Ptg</code>s
|
||||
henceforth since that is what the <code>FormulaParser</code> calls the classes), a
|
||||
specific type of <code>Eval</code> wrapper is constructed to wrap the RPN token and
|
||||
is pushed on the stack, unless the <code>Ptg</code> is an <code>OperationPtg</code>. If it is an
|
||||
<code>OperationPtg</code>, an <code>OperationEval</code> instance is created for the specific
|
||||
type of <code>OperationPtg</code>. And depending on how many operands it takes,
|
||||
that many <code>Eval</code>s are popped off the stack and passed in an array to
|
||||
the <code>OperationEval</code> instance's evaluate method which returns an <code>Eval</code>
|
||||
of subtype <code>ValueEval</code>. Thus an operation in the formula is evaluated.
|
||||
</p>
|
||||
<note> An <code>Eval</code> is of subinterface <code>ValueEval</code> or <code>OperationEval</code>.
|
||||
Operands are always <code>ValueEval</code>s, and operations are always <code>OperationEval</code>s.</note>
|
||||
<p>
|
||||
<code>OperationEval.evaluate(Eval[])</code> returns an <code>Eval</code> which is supposed
|
||||
to be an instance of one of the implementations of
|
||||
<code>ValueEval</code>. The <code>ValueEval</code> resulting from <code>evaluate()</code> is pushed on the
|
||||
stack and the next RPN token is evaluated. This continues until
|
||||
eventually there are no more RPN tokens, at which point, if the formula
|
||||
string was correctly parsed, there should be just one <code>Eval</code> on the
|
||||
stack — which contains the result of evaluating the formula.
|
||||
</p>
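<p>
The stack discipline described above can be sketched in plain Java. This is a minimal
illustration under stated assumptions, not POI code: the class name <code>RpnSketch</code> is
hypothetical, plain doubles stand in for <code>ValueEval</code>s, and operator strings stand in
for <code>OperationPtg</code>s.
</p>

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration only: plain doubles stand in for ValueEvals and
// operator strings stand in for OperationPtgs. None of this is POI API.
final class RpnSketch {

    static double evaluate(String[] rpnTokens) {
        Deque<Double> stack = new ArrayDeque<>();
        for (String token : rpnTokens) {
            switch (token) {
                case "+": case "-": case "*": case "/": {
                    // A binary operation pops its two operands; the right-hand one is on top.
                    double right = stack.pop();
                    double left = stack.pop();
                    // The result of the operation goes back on the stack.
                    stack.push(apply(token, left, right));
                    break;
                }
                default:
                    // Operand tokens are pushed straight onto the stack.
                    stack.push(Double.parseDouble(token));
            }
        }
        // A correctly parsed formula leaves exactly one value: the result.
        if (stack.size() != 1) {
            throw new IllegalStateException("Malformed RPN: " + stack.size() + " values left");
        }
        return stack.pop();
    }

    private static double apply(String op, double left, double right) {
        switch (op) {
            case "+": return left + right;
            case "-": return left - right;
            case "*": return left * right;
            case "/": return left / right;
            default: throw new IllegalArgumentException("Unknown operator: " + op);
        }
    }

    public static void main(String[] args) {
        // (1 + 2) * 4 written in RPN is: 1 2 + 4 *
        System.out.println(evaluate(new String[] {"1", "2", "+", "4", "*"}));
    }
}
```

<p>
The final size check mirrors the remark above that a correctly parsed formula should leave
exactly one value on the stack.
</p>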
|
||||
<p>
|
||||
Two special <code>Ptg</code>s, <code>AreaPtg</code> and <code>RefPtg</code>,
are handled a little differently, but the code for that should be
self-explanatory. Very briefly, the cells included in <code>AreaPtg</code> and
|
||||
<code>RefPtg</code> are examined and their values are populated in individual
|
||||
<code>ValueEval</code> objects which are set into the implementations of
|
||||
<code>AreaEval</code> and <code>RefEval</code>.
|
||||
</p>
|
||||
<p>
|
||||
<code>OperationEval</code>s for the standard operators have been implemented and tested.
|
||||
</p>
|
||||
</section>
|
||||
</section>
|
||||
<section><title>What functions are supported?</title>
|
||||
<p>
|
||||
As of release 5.2.0, POI implements 202 built-in functions,
|
||||
see <a href="#appendixA">Appendix A</a> for the list of supported functions with an implementation.
|
||||
You can programmatically list supported / unsupported functions using the following helper methods:
|
||||
</p>
|
||||
<source>import java.util.Collection;
import org.apache.poi.ss.formula.WorkbookEvaluator;
|
||||
|
||||
// list of functions that POI can evaluate
|
||||
Collection<String> supportedFuncs = WorkbookEvaluator.getSupportedFunctionNames();
|
||||
|
||||
// list of functions that are not supported by POI
|
||||
Collection<String> unsupportedFuncs = WorkbookEvaluator.getNotSupportedFunctionNames();
|
||||
</source>
|
||||
<section><title>I need a function that isn't supported!</title>
|
||||
<p>
|
||||
If you need a function that POI doesn't currently support, you have two options.
|
||||
You can create the function yourself, and have your program add it to POI at
|
||||
run-time. Doing this will help you get the function you need as soon as possible.
|
||||
The other option is to create the function yourself, and build it into the POI library,
|
||||
possibly contributing the code to the POI project. Doing this will help you get the
|
||||
function you need, but you'll have to build POI from source yourself. And if you
|
||||
contribute the code, you'll help others who need the function in the future, because
|
||||
it will already be supported in the next release of POI. The two options require
|
||||
almost identical code, but the process of deploying the function is different.
|
||||
If your function is a User Defined Function, you'll always take the run-time option,
|
||||
as POI doesn't distribute UDFs.
|
||||
</p>
|
||||
<p>
|
||||
In the sections ahead, we'll implement the Excel <code>SQRTPI()</code> function, first
|
||||
at run-time, and then we'll show how to change it to a library-based implementation.
|
||||
</p>
|
||||
</section>
|
||||
</section>
|
||||
<section><title>Two base interfaces to start your implementation</title>
|
||||
<p>
|
||||
All Excel formula function classes implement either the
|
||||
<code>org.apache.poi.ss.formula.functions.Function</code> or the
<code>org.apache.poi.ss.formula.functions.FreeRefFunction</code> interface.
|
||||
<code>Function</code> is a common interface for the functions defined in the Binary Excel File Format (BIFF8): these are "classic" Excel functions like <code>SUM</code>, <code>COUNT</code>, <code>LOOKUP</code>, <em>etc</em>.
|
||||
<code>FreeRefFunction</code> is a common interface for the functions from the Excel Analysis ToolPak, for User Defined Functions that you create,
|
||||
and for Excel built-in functions that have been defined since BIFF8 was defined.
|
||||
In the future these two interfaces are expected to be unified into one, but for now you have to start your implementation from two slightly different roots.
|
||||
</p>
|
||||
|
||||
<section><title>Which interface to start from?</title>
|
||||
<p>
|
||||
You are about to implement a function and don't know which interface to start from: <code>Function</code> or <code>FreeRefFunction</code>.
|
||||
You should use <code>Function</code> if the function is part of the Excel BIFF8
|
||||
definition, and <code>FreeRefFunction</code> for a function that is part of the Excel Analysis ToolPak, was added to Excel after BIFF8, or that you are creating yourself.
|
||||
</p>
|
||||
<p>
|
||||
You can check the list of Analysis ToolPak functions defined in <code>org.apache.poi.ss.formula.atp.AnalysisToolPak.createFunctionsMap()</code>
|
||||
to see if the function is part of the Analysis ToolPak.
|
||||
The list of BIFF8 functions is defined as a text file, in the
|
||||
<code>poi/src/main/resources/org/apache/poi/ss/formula/function/functionMetadata.txt</code> file.
|
||||
</p>
|
||||
<p>
|
||||
You can also use the following code to check which base class your function should implement, if it is not a User Defined function (UDFs must implement <code>FreeRefFunction</code>):
|
||||
</p>
|
||||
<source>import org.apache.poi.ss.formula.atp.AnalysisToolPak;

if (!AnalysisToolPak.isATPFunction(functionName)){
    // the function must implement org.apache.poi.ss.formula.functions.Function
} else {
    // the function must implement org.apache.poi.ss.formula.functions.FreeRefFunction
|
||||
}
|
||||
</source>
|
||||
</section>
|
||||
</section>
|
||||
<section><title>Implementing a function.</title>
|
||||
<p>
|
||||
Here is the fun part: let's walk through the implementation of the Excel function <code>SQRTPI()</code>,
|
||||
which POI does not currently support.
|
||||
</p>
|
||||
<p>
|
||||
<code>AnalysisToolPak.isATPFunction("SQRTPI")</code> returns true, so this is an Analysis ToolPak function.
|
||||
Thus the base interface must be <code>FreeRefFunction</code>. The same would be true if we were implementing
|
||||
a UDF.
|
||||
</p>
|
||||
<p>
|
||||
Because we're taking the run-time deployment option, we'll create this new function in a source
|
||||
file in our own program. Our function will return an <code>Eval</code> that is either
|
||||
its proper result, or an <code>ErrorEval</code> that describes the error. All that work
|
||||
is done in the function's <code>evaluate()</code> method:
|
||||
</p>
|
||||
<source>package ...;
|
||||
import org.apache.poi.ss.formula.OperationEvaluationContext;
import org.apache.poi.ss.formula.eval.EvaluationException;
|
||||
import org.apache.poi.ss.formula.eval.ErrorEval;
|
||||
import org.apache.poi.ss.formula.eval.NumberEval;
|
||||
import org.apache.poi.ss.formula.eval.OperandResolver;
|
||||
import org.apache.poi.ss.formula.eval.ValueEval;
|
||||
import org.apache.poi.ss.formula.functions.FreeRefFunction;
|
||||
|
||||
public final class SqrtPi implements FreeRefFunction {
|
||||
|
||||
public ValueEval evaluate(ValueEval[] args, OperationEvaluationContext ec) {
|
||||
ValueEval arg0 = args[0];
|
||||
int srcRowIndex = ec.getRowIndex();
|
||||
int srcColumnIndex = ec.getColumnIndex();
|
||||
try {
|
||||
// Retrieves a single value from a variety of different argument types according to standard
|
||||
// Excel rules. Does not perform any type conversion.
|
||||
ValueEval ve = OperandResolver.getSingleValue(arg0, srcRowIndex, srcColumnIndex);
|
||||
|
||||
// Applies some conversion rules if the supplied value is not already a number.
|
||||
// Throws EvaluationException(#VALUE!) if the supplied parameter is not a number
|
||||
double arg = OperandResolver.coerceValueToDouble(ve);
|
||||
|
||||
// this is where all the heavy lifting happens
|
||||
double result = Math.sqrt(arg*Math.PI);
|
||||
|
||||
// Excel uses the error code #NUM! instead of IEEE NaN and Infinity,
|
||||
// so when a numeric function evaluates to Double.NaN or to an infinite value,
|
||||
// be sure to translate the result to the appropriate error code
|
||||
if (Double.isNaN(result) || Double.isInfinite(result)) {
|
||||
throw new EvaluationException(ErrorEval.NUM_ERROR);
|
||||
}
|
||||
|
||||
return new NumberEval(result);
|
||||
} catch (EvaluationException e){
|
||||
return e.getErrorEval();
|
||||
}
|
||||
}
|
||||
}
|
||||
</source>
|
||||
<p>
|
||||
If our function had been one of the BIFF8 Excel built-ins, it would have been based on
|
||||
the <code>Function</code> interface instead.
|
||||
There are sub-interfaces of <code>Function</code> that make life easier when implementing numeric functions
|
||||
or functions
|
||||
with a small, fixed number of arguments:
|
||||
</p>
|
||||
<ul>
|
||||
<li><code>org.apache.poi.ss.formula.functions.NumericFunction</code></li>
<li><code>org.apache.poi.ss.formula.functions.Fixed0ArgFunction</code></li>
<li><code>org.apache.poi.ss.formula.functions.Fixed1ArgFunction</code></li>
<li><code>org.apache.poi.ss.formula.functions.Fixed2ArgFunction</code></li>
<li><code>org.apache.poi.ss.formula.functions.Fixed3ArgFunction</code></li>
<li><code>org.apache.poi.ss.formula.functions.Fixed4ArgFunction</code></li>
|
||||
</ul>
|
||||
<p>
|
||||
Since <code>SQRTPI()</code> takes exactly one argument, we would start our implementation from
|
||||
<code>Fixed1ArgFunction</code>. The differences for a BIFF8 <code>Fixed1ArgFunction</code>
|
||||
are pretty small:
|
||||
</p>
|
||||
<source>package ...;
|
||||
import org.apache.poi.ss.formula.eval.EvaluationException;
|
||||
import org.apache.poi.ss.formula.eval.ErrorEval;
|
||||
import org.apache.poi.ss.formula.eval.NumberEval;
|
||||
import org.apache.poi.ss.formula.eval.OperandResolver;
|
||||
import org.apache.poi.ss.formula.eval.ValueEval;
|
||||
import org.apache.poi.ss.formula.functions.Fixed1ArgFunction;
|
||||
|
||||
public final class SqrtPi extends Fixed1ArgFunction {
|
||||
|
||||
public ValueEval evaluate(int srcRowIndex, int srcColumnIndex, ValueEval arg0) {
|
||||
try {
|
||||
...
|
||||
}
|
||||
}
|
||||
</source>
|
||||
<p>
|
||||
Now that the implementation is ready, we need to register it with the formula evaluator.
|
||||
This is the same no matter which kind of function we're creating. We simply add the
|
||||
following line to the program that is using POI:
|
||||
</p>
|
||||
<source>WorkbookEvaluator.registerFunction("SQRTPI", new SqrtPi());
|
||||
</source>
|
||||
<p>
|
||||
Voila! The formula evaluator now recognizes <code>SQRTPI()</code>!
|
||||
</p>
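<p>
The effect of run-time registration can be pictured as a lookup table from function names to
implementations. The following is a toy model in plain Java, not POI's implementation: the
class <code>FunctionRegistrySketch</code> and its methods are hypothetical names chosen for
this example, and a <code>DoubleUnaryOperator</code> stands in for a <code>FreeRefFunction</code>.
</p>

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.DoubleUnaryOperator;

// Toy model, not POI's implementation: function names map to implementations,
// and an unknown name fails until something is registered for it.
final class FunctionRegistrySketch {

    private final Map<String, DoubleUnaryOperator> functions = new HashMap<>();

    // Registers an implementation under a case-insensitive function name.
    void registerFunction(String name, DoubleUnaryOperator impl) {
        functions.put(name.toUpperCase(Locale.ROOT), impl);
    }

    // Looks the function up by name and applies it to a single numeric argument.
    double call(String name, double arg) {
        DoubleUnaryOperator impl = functions.get(name.toUpperCase(Locale.ROOT));
        if (impl == null) {
            // Stands in for the "not implemented" failure an evaluator would report.
            throw new UnsupportedOperationException("Function not implemented: " + name);
        }
        return impl.applyAsDouble(arg);
    }

    public static void main(String[] args) {
        FunctionRegistrySketch registry = new FunctionRegistrySketch();
        registry.registerFunction("SQRTPI", x -> Math.sqrt(x * Math.PI));
        System.out.println(registry.call("sqrtpi", 2.0));
    }
}
```

<p>
In this model a call made before registration fails, and the same call made afterwards
succeeds, which is the behaviour change that registering a function is meant to produce.
</p>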
|
||||
<section><title>Moving the function into the library</title>
|
||||
<p>
|
||||
If we choose instead to implement our function as part of the POI
|
||||
library, the code is nearly identical. All POI functions
|
||||
are part of one of two Java packages: <code>org.apache.poi.ss.formula.functions</code>
|
||||
for BIFF8 Excel built-in functions, and <code>org.apache.poi.ss.formula.atp</code>
|
||||
for Analysis ToolPak functions. The function still needs to implement the
|
||||
appropriate base class, just as before. To implement our <code>SQRTPI()</code>
|
||||
function in the POI library, we need to move the source code to
|
||||
<code>poi/src/main/java/org/apache/poi/ss/formula/atp/SqrtPi.java</code> in
|
||||
the POI source code, change the <code>package</code> statement, and add a
|
||||
singleton instance:
|
||||
</p>
|
||||
<source>package org.apache.poi.ss.formula.atp;
|
||||
...
|
||||
public final class SqrtPi implements FreeRefFunction {
|
||||
|
||||
public static final FreeRefFunction instance = new SqrtPi();
|
||||
|
||||
private SqrtPi() {
|
||||
// Enforce singleton
|
||||
}
|
||||
...
|
||||
}
|
||||
</source>
|
||||
<p>
|
||||
If our function had been one of the BIFF8 Excel built-ins, we would instead have moved
|
||||
the source code to
|
||||
<code>poi/src/main/java/org/apache/poi/ss/formula/functions/SqrtPi.java</code> in
|
||||
the POI source code, and changed the <code>package</code> statement to:
|
||||
</p>
|
||||
<source>package org.apache.poi.ss.formula.functions;
|
||||
</source>
|
||||
<p>
|
||||
POI library functions are registered differently from run-time-deployed functions.
|
||||
Again, the techniques differ for the two types of library functions (remembering
|
||||
that POI never releases the third type, UDFs).
|
||||
For our Analysis ToolPak function, we have to update the list of functions in
|
||||
<code>org.apache.poi.ss.formula.atp.AnalysisToolPak.createFunctionsMap()</code>:
|
||||
</p>
|
||||
<source>...
|
||||
private Map<String, FreeRefFunction> createFunctionsMap() {
|
||||
Map<String, FreeRefFunction> m = new HashMap<>(114);
|
||||
...
|
||||
r(m, "SQRTPI", SqrtPi.instance);
|
||||
...
|
||||
}
|
||||
...
|
||||
</source>
|
||||
<p>
|
||||
If our function had been one of the BIFF8 Excel built-ins,
|
||||
the registration instead would require updating an entry in the formula-function table,
|
||||
<code>poi/src/main/resources/org/apache/poi/ss/formula/function/functionMetadata.txt</code>:
|
||||
</p>
|
||||
<source>...
|
||||
#Columns: (index, name, minParams, maxParams, returnClass, paramClasses, isVolatile, hasFootnote )
|
||||
...
|
||||
359 SQRTPI 1 1 V V
|
||||
...
|
||||
</source>
|
||||
<p>
|
||||
and also updating the list of function implementations in
|
||||
<code>org.apache.poi.ss.formula.eval.FunctionEval.produceFunctions()</code>:
|
||||
</p>
|
||||
<source>...
|
||||
private static Function[] produceFunctions() {
|
||||
...
|
||||
retval[359] = new SqrtPi();
|
||||
...
|
||||
}
|
||||
...
|
||||
</source>
|
||||
</section>
|
||||
<section><title>Floating Point Arithmetic in Excel</title>
|
||||
<p>
|
||||
Excel uses the IEEE Standard for Double Precision Floating Point numbers
|
||||
except in two cases where it does not adhere to IEEE 754:
|
||||
</p>
|
||||
<ol>
|
||||
<li>Positive and Negative Infinities: Infinities occur when you divide by 0.
|
||||
Excel does not support infinities, rather, it gives a #DIV/0! error in these cases.
|
||||
</li>
|
||||
<li>Not-a-Number (NaN): NaN is used to represent invalid operations
|
||||
(such as infinity/infinity, infinity-infinity, or the square root of -1).
|
||||
NaNs allow a program to continue past an invalid operation.
|
||||
Excel instead immediately generates an error such as #NUM! or #DIV/0!.
|
||||
</li>
|
||||
</ol>
|
||||
<p>
|
||||
Be aware of these two cases when saving results of your scientific calculations in Excel:
|
||||
“where are my Infinities and NaNs? They are gone!”
|
||||
</p>
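<p>
The two cases can be illustrated with a small plain-Java sketch. It is an assumption-laden
example rather than POI code: <code>ExcelNumberSketch</code> and its methods are hypothetical,
and Excel error codes are represented as plain strings instead of <code>ErrorEval</code> instances.
</p>

```java
// Hypothetical helper, not POI API: shows how IEEE NaN and Infinity results
// are translated into Excel-style error codes, represented here as strings.
final class ExcelNumberSketch {

    // Mirrors the SQRTPI example: NaN or an infinite result becomes #NUM!.
    static String sqrtPi(double x) {
        double result = Math.sqrt(x * Math.PI);
        if (Double.isNaN(result) || Double.isInfinite(result)) {
            return "#NUM!";
        }
        return Double.toString(result);
    }

    // Division by zero yields #DIV/0! instead of an IEEE infinity or NaN.
    static String divide(double a, double b) {
        if (b == 0.0) {
            return "#DIV/0!";
        }
        return Double.toString(a / b);
    }

    public static void main(String[] args) {
        System.out.println(sqrtPi(-1.0));      // square root of a negative number
        System.out.println(divide(1.0, 0.0));  // division by zero
        System.out.println(sqrtPi(2.0));       // an ordinary numeric result
    }
}
```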
|
||||
</section>
|
||||
<section><title>Testing Framework</title>
|
||||
<p>
|
||||
Automated testing of the implemented Function is easy.
|
||||
The source code for this is in the file: <code>org.apache.poi.hssf.record.formula.GenericFormulaTestCase.java</code>.
|
||||
This class has a reference to the test xls file (not <em>a</em> test xls, <em>the</em> test xls :) )
|
||||
which may need to be changed for your environment. Once you do that, in the test xls,
|
||||
locate the entry for the function that you have implemented and enter different tests
|
||||
in a cell in the FORMULA row. Then copy the "value of" the formula that you entered in the
|
||||
cell just below it (this is easily done in Excel as:
|
||||
[copy the formula cell] > [go to cell below] > Edit > Paste Special > Values > "ok").
|
||||
You can enter multiple such formulas and paste their values in the cell below and the
|
||||
test framework will automatically test if the formula evaluation matches the expected
|
||||
value (Again, hard to put in words, so if you will, please take time to quickly look
|
||||
at the code and the currently entered tests in the patch attachment "FormulaEvalTestData.xls"
|
||||
file).
|
||||
</p>
|
||||
<note>This style of testing appears to have been abandoned. This section needs to be completely rewritten.</note>
|
||||
</section>
|
||||
</section>
|
||||
<anchor id="appendixA"/>
|
||||
<section>
|
||||
<title>Appendix A — Functions supported by POI</title>
|
||||
<p>
|
||||
Functions supported by POI (as of v5.2.0 release)
|
||||
</p>
|
||||
<source>ABS
|
||||
ACOS
|
||||
ACOSH
|
||||
ADDRESS
|
||||
AND
|
||||
AREAS
|
||||
ASIN
|
||||
ASINH
|
||||
ATAN
|
||||
ATAN2
|
||||
ATANH
|
||||
AVEDEV
|
||||
AVERAGE
|
||||
AVERAGEIFS
|
||||
BIN2DEC
|
||||
CEILING
|
||||
CHAR
|
||||
CHOOSE
|
||||
CLEAN
|
||||
CODE
|
||||
COLUMN
|
||||
COLUMNS
|
||||
COMBIN
|
||||
COMPLEX
|
||||
CONCAT
|
||||
CONCATENATE
|
||||
COS
|
||||
COSH
|
||||
COUNT
|
||||
COUNTA
|
||||
COUNTBLANK
|
||||
COUNTIF
|
||||
COUNTIFS
|
||||
DATE
|
||||
DATEVALUE
|
||||
DAY
|
||||
DAYS360
|
||||
DEC2BIN
|
||||
DEC2HEX
|
||||
DEGREES
|
||||
DELTA
|
||||
DEVSQ
|
||||
DGET
|
||||
DMAX
|
||||
DMIN
|
||||
DOLLAR
|
||||
DSUM
|
||||
EDATE
|
||||
EOMONTH
|
||||
ERROR.TYPE
|
||||
EVEN
|
||||
EXACT
|
||||
EXP
|
||||
FACT
|
||||
FACTDOUBLE
|
||||
FALSE
|
||||
FIND
|
||||
FIXED
|
||||
FLOOR
|
||||
FREQUENCY
|
||||
FV
|
||||
GEOMEAN
|
||||
HEX2DEC
|
||||
HLOOKUP
|
||||
HOUR
|
||||
HYPERLINK
|
||||
IF
|
||||
IFERROR
|
||||
IFNA
|
||||
IFS
|
||||
IMAGINARY
|
||||
IMREAL
|
||||
INDEX
|
||||
INDIRECT
|
||||
INT
|
||||
INTERCEPT
|
||||
IPMT
|
||||
IRR
|
||||
ISBLANK
|
||||
ISERR
|
||||
ISERROR
|
||||
ISEVEN
|
||||
ISLOGICAL
|
||||
ISNA
|
||||
ISNONTEXT
|
||||
ISNUMBER
|
||||
ISODD
|
||||
ISREF
|
||||
ISTEXT
|
||||
LARGE
|
||||
LEFT
|
||||
LEN
|
||||
LN
|
||||
LOG
|
||||
LOG10
|
||||
LOOKUP
|
||||
LOWER
|
||||
MATCH
|
||||
MAX
|
||||
MAXA
|
||||
MAXIFS
|
||||
MDETERM
|
||||
MEDIAN
|
||||
MID
|
||||
MIN
|
||||
MINA
|
||||
MINIFS
|
||||
MINUTE
|
||||
MINVERSE
|
||||
MIRR
|
||||
MMULT
|
||||
MOD
|
||||
MODE
|
||||
MONTH
|
||||
MROUND
|
||||
NA
|
||||
NETWORKDAYS
|
||||
NOT
|
||||
NOW
|
||||
NPER
|
||||
NPV
|
||||
OCT2DEC
|
||||
ODD
|
||||
OFFSET
|
||||
OR
|
||||
PERCENTILE
|
||||
PERCENTRANK
|
||||
PERCENTRANK.EXC
|
||||
PERCENTRANK.INC
|
||||
PI
|
||||
PMT
|
||||
POISSON
|
||||
POWER
|
||||
PPMT
|
||||
PRODUCT
|
||||
PROPER
|
||||
PV
|
||||
QUOTIENT
|
||||
RADIANS
|
||||
RAND
|
||||
RANDBETWEEN
|
||||
RANK
|
||||
RATE
|
||||
REPLACE
|
||||
REPT
|
||||
RIGHT
|
||||
ROMAN
|
||||
ROUND
|
||||
ROUNDDOWN
|
||||
ROUNDUP
|
||||
ROW
|
||||
ROWS
|
||||
SEARCH
|
||||
SECOND
|
||||
SIGN
|
||||
SIN
|
||||
SINGLE
|
||||
SINH
|
||||
SLOPE
|
||||
SMALL
|
||||
SQRT
|
||||
STDEV
|
||||
SUBSTITUTE
|
||||
SUBTOTAL
|
||||
SUM
|
||||
SUMIF
|
||||
SUMIFS
|
||||
SUMPRODUCT
|
||||
SUMSQ
|
||||
SUMX2MY2
|
||||
SUMX2PY2
|
||||
SUMXMY2
|
||||
SWITCH
|
||||
T
|
||||
T.DIST
|
||||
T.DIST.2T
|
||||
T.DIST.RT
|
||||
TAN
|
||||
TANH
|
||||
TDIST
|
||||
TEXT
|
||||
TEXTJOIN
|
||||
TIME
|
||||
TIMEVALUE
|
||||
TODAY
|
||||
TRANSPOSE
|
||||
TREND
|
||||
TRIM
|
||||
TRUE
|
||||
TRUNC
|
||||
UPPER
|
||||
VALUE
|
||||
VAR
|
||||
VARP
|
||||
VLOOKUP
|
||||
WEEKDAY
|
||||
WEEKNUM
|
||||
WORKDAY
|
||||
XLOOKUP
|
||||
XMATCH
|
||||
YEAR
|
||||
YEARFRAC</source>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
410
src/documentation/content/xdocs/components/spreadsheet/eval.xml
Normal file
@ -0,0 +1,410 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Formula Evaluation</title>
|
||||
</header>
|
||||
<body>
|
||||
<section><title>Introduction</title>
|
||||
<p>The POI formula evaluation code enables you to calculate the result of
|
||||
formulas in Excel sheets read in or created with POI. This document explains
|
||||
how to use the API to evaluate your formulas.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<anchor id="WhyEvaluate"/>
|
||||
<section><title>Why do I need to evaluate formulas?</title>
|
||||
<p>The Excel file format (both .xls and .xlsx) stores a "cached" result for
|
||||
every formula along with the formula itself. This means that when the file
|
||||
is opened, it can be quickly displayed, without needing to spend a long
|
||||
time calculating all of the formula results. It also means that when reading
|
||||
a file through Apache POI, the result is quickly available to you too!
|
||||
</p>
|
||||
<p>After making changes with Apache POI to either Formula Cells themselves,
|
||||
or those that they depend on, you should normally perform a Formula
|
||||
Evaluation to have these "cached" results updated. This is normally done
|
||||
after all changes have been performed, but before you write the file out.
|
||||
If you don't do this, there's a good chance that when you open the file in
|
||||
Excel, until you go to the cell and hit enter or F9, you will either see
|
||||
the old value or '#VALUE!' for the cell. (Sometimes Excel will notice
|
||||
itself, and trigger a recalculation on load, but unless you know you are
|
||||
using volatile functions it's generally best to trigger a <a href="#recalculation">Recalculation</a>
|
||||
through POI)
|
||||
</p>
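<p>
Why a cached result goes stale can be shown with a toy model in plain Java. This is only a
sketch, not POI API: <code>CachedFormulaSketch</code> is a hypothetical class in which a
<code>DoubleSupplier</code> plays the role of the formula and a field plays the role of the
cached result stored in the file.
</p>

```java
import java.util.function.DoubleSupplier;

// Toy model, not POI API: a formula cell stores its formula together with the
// last computed ("cached") result, the way the Excel file formats do.
final class CachedFormulaSketch {

    private final DoubleSupplier formula;
    private double cachedResult;

    CachedFormulaSketch(DoubleSupplier formula) {
        this.formula = formula;
        // The result is computed once, when the formula is first entered.
        this.cachedResult = formula.getAsDouble();
    }

    // What a reader sees without evaluating: the stored result, possibly stale.
    double getCachedResult() {
        return cachedResult;
    }

    // Re-runs the formula and refreshes the stored result.
    double evaluate() {
        cachedResult = formula.getAsDouble();
        return cachedResult;
    }

    public static void main(String[] args) {
        double[] a1 = {1};
        double[] b1 = {1};
        CachedFormulaSketch c1 = new CachedFormulaSketch(() -> a1[0] + b1[0]);

        a1[0] = 2; // change an input without re-evaluating the formula
        System.out.println("cached: " + c1.getCachedResult());
        System.out.println("evaluated: " + c1.evaluate());
    }
}
```

<p>
Changing an input leaves the cached result untouched until <code>evaluate()</code> is called,
which is why a formula evaluation is recommended before writing the file out.
</p>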
|
||||
</section>
|
||||
|
||||
<anchor id="Status"/>
|
||||
<section><title>Status</title>
|
||||
<p>The code currently provides implementations for all the arithmetic operators.
It also provides implementations for approximately 140 built-in
|
||||
functions in Excel. The framework however makes it easy to add
|
||||
implementation of new functions. See the <a href="eval-devguide.html"> Formula
|
||||
evaluation development guide</a> and <a href="../../apidocs/dev/org/apache/poi/hssf/record/formula/functions/package-summary.html">javadocs</a>
|
||||
for details. </p>
|
||||
<p> Both HSSFWorkbook and XSSFWorkbook are supported, so you can
|
||||
evaluate formulas on both .xls and .xlsx files.</p>
|
||||
<p> User-defined functions are <a href="user-defined-functions.html">supported</a>,
|
||||
but must be rewritten in Java and registered with the macro-enabled workbook in order to be evaluated.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>User API How-TO</title>
|
||||
<p>The following code demonstrates how to use the FormulaEvaluator
|
||||
in the context of other POI Excel-reading code.
|
||||
</p>
|
||||
<p>There are several ways in which you can use the FormulaEvaluator API.</p>
|
||||
|
||||
<anchor id="Evaluate"/>
|
||||
<section><title>Using FormulaEvaluator.<strong>evaluate</strong>(Cell cell)</title>
|
||||
<p>This evaluates a given cell, and returns the new value,
|
||||
without affecting the cell</p>
|
||||
<source>
|
||||
FileInputStream fis = new FileInputStream("c:/temp/test.xls");
|
||||
Workbook wb = new HSSFWorkbook(fis); //or new XSSFWorkbook("c:/temp/test.xls")
|
||||
Sheet sheet = wb.getSheetAt(0);
|
||||
FormulaEvaluator evaluator = wb.getCreationHelper().createFormulaEvaluator();
|
||||
|
||||
// suppose your formula is in B3
|
||||
CellReference cellReference = new CellReference("B3");
|
||||
Row row = sheet.getRow(cellReference.getRow());
|
||||
Cell cell = row.getCell(cellReference.getCol());
|
||||
|
||||
CellValue cellValue = evaluator.evaluate(cell);
|
||||
|
||||
switch (cellValue.getCellType()) {
|
||||
case Cell.CELL_TYPE_BOOLEAN:
|
||||
System.out.println(cellValue.getBooleanValue());
|
||||
break;
|
||||
case Cell.CELL_TYPE_NUMERIC:
|
||||
System.out.println(cellValue.getNumberValue());
|
||||
break;
|
||||
case Cell.CELL_TYPE_STRING:
|
||||
System.out.println(cellValue.getStringValue());
|
||||
break;
|
||||
case Cell.CELL_TYPE_BLANK:
|
||||
break;
|
||||
case Cell.CELL_TYPE_ERROR:
|
||||
break;
|
||||
|
||||
// CELL_TYPE_FORMULA will never happen
|
||||
case Cell.CELL_TYPE_FORMULA:
|
||||
break;
|
||||
}
|
||||
</source>
|
||||
<p>Thus using the retrieved value (of type
<code>org.apache.poi.ss.usermodel.CellValue</code>) returned
|
||||
by FormulaEvaluator is similar to using a Cell object
|
||||
containing the value of the formula evaluation. CellValue is
|
||||
a simple value object and does not maintain reference
|
||||
to the original cell.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<anchor id="EvaluateFormulaCell"/>
|
||||
<section><title>Using FormulaEvaluator.<strong>evaluateFormulaCell</strong>(Cell cell)</title>
|
||||
<p><strong>evaluateFormulaCell</strong>(Cell cell)
|
||||
will check to see if the supplied cell is a formula cell.
|
||||
If it isn't, then no changes will be made to it. If it is,
|
||||
then the formula is evaluated. The value for the formula
|
||||
is saved alongside it, to be displayed in excel. The
|
||||
formula remains in the cell, just with a new value</p>
|
||||
<p>The return of the function is the type of the
|
||||
formula result, such as Cell.CELL_TYPE_BOOLEAN</p>
|
||||
<source>
|
||||
FileInputStream fis = new FileInputStream("/somepath/test.xls");
|
||||
Workbook wb = new HSSFWorkbook(fis); //or new XSSFWorkbook("/somepath/test.xls")
|
||||
Sheet sheet = wb.getSheetAt(0);
|
||||
FormulaEvaluator evaluator = wb.getCreationHelper().createFormulaEvaluator();
|
||||
|
||||
// suppose your formula is in B3
|
||||
CellReference cellReference = new CellReference("B3");
|
||||
Row row = sheet.getRow(cellReference.getRow());
|
||||
Cell cell = row.getCell(cellReference.getCol());
|
||||
|
||||
if (cell!=null) {
|
||||
switch (evaluator.evaluateFormulaCell(cell)) {
|
||||
case Cell.CELL_TYPE_BOOLEAN:
|
||||
System.out.println(cell.getBooleanCellValue());
|
||||
break;
|
||||
case Cell.CELL_TYPE_NUMERIC:
|
||||
System.out.println(cell.getNumericCellValue());
|
||||
break;
|
||||
case Cell.CELL_TYPE_STRING:
|
||||
System.out.println(cell.getStringCellValue());
|
||||
break;
|
||||
case Cell.CELL_TYPE_BLANK:
|
||||
break;
|
||||
case Cell.CELL_TYPE_ERROR:
|
||||
System.out.println(cell.getErrorCellValue());
|
||||
break;
|
||||
|
||||
// CELL_TYPE_FORMULA will never occur
|
||||
case Cell.CELL_TYPE_FORMULA:
|
||||
break;
|
||||
}
|
||||
}
|
||||
</source>
|
||||
</section>
|
||||
|
||||
<anchor id="EvaluateInCell"/>
|
||||
<section><title>Using FormulaEvaluator.<strong>evaluateInCell</strong>(Cell cell)</title>
|
||||
<p><strong>evaluateInCell</strong>(Cell cell) will check to
|
||||
see if the supplied cell is a formula cell. If it isn't,
|
||||
then no changes will be made to it. If it is, then the
|
||||
formula is evaluated, and the new value saved into the cell,
|
||||
in place of the old formula.</p>
|
||||
<source>
|
||||
FileInputStream fis = new FileInputStream("/somepath/test.xls");
|
||||
Workbook wb = new HSSFWorkbook(fis); //or new XSSFWorkbook("/somepath/test.xls")
|
||||
Sheet sheet = wb.getSheetAt(0);
|
||||
FormulaEvaluator evaluator = wb.getCreationHelper().createFormulaEvaluator();
|
||||
|
||||
// suppose your formula is in B3
|
||||
CellReference cellReference = new CellReference("B3");
|
||||
Row row = sheet.getRow(cellReference.getRow());
|
||||
Cell cell = row.getCell(cellReference.getCol());
|
||||
|
||||
if (cell!=null) {
|
||||
switch (evaluator.evaluateInCell(cell).getCellType()) {
|
||||
case Cell.CELL_TYPE_BOOLEAN:
|
||||
System.out.println(cell.getBooleanCellValue());
|
||||
break;
|
||||
case Cell.CELL_TYPE_NUMERIC:
|
||||
System.out.println(cell.getNumericCellValue());
|
||||
break;
|
||||
case Cell.CELL_TYPE_STRING:
|
||||
System.out.println(cell.getStringCellValue());
|
||||
break;
|
||||
case Cell.CELL_TYPE_BLANK:
|
||||
break;
|
||||
case Cell.CELL_TYPE_ERROR:
|
||||
System.out.println(cell.getErrorCellValue());
|
||||
break;
|
||||
|
||||
// CELL_TYPE_FORMULA will never occur
|
||||
case Cell.CELL_TYPE_FORMULA:
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
</source>
|
||||
</section>
|
||||
|
||||
<anchor id="EvaluateAll"/>
|
||||
<section><title>Re-calculating all formulas in a Workbook</title>
|
||||
<source>
|
||||
FileInputStream fis = new FileInputStream("/somepath/test.xls");
|
||||
Workbook wb = new HSSFWorkbook(fis); //or new XSSFWorkbook("/somepath/test.xls")
|
||||
FormulaEvaluator evaluator = wb.getCreationHelper().createFormulaEvaluator();
|
||||
for (Sheet sheet : wb) {
|
||||
for (Row r : sheet) {
|
||||
for (Cell c : r) {
|
||||
if (c.getCellType() == CellType.FORMULA) {
|
||||
evaluator.evaluateFormulaCell(c);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
</source>
|
||||
|
||||
<p>Alternatively, if you know which of HSSF or XSSF you're working
|
||||
with, then you can call the static
|
||||
<strong>evaluateAllFormulaCells</strong> method on the appropriate
|
||||
HSSFFormulaEvaluator or XSSFFormulaEvaluator class.</p>
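<p>A short sketch of the static variant for HSSF follows. The workbook is built in memory
with made-up values, and the class and method names are illustrative only.</p>

```java
import org.apache.poi.hssf.usermodel.HSSFFormulaEvaluator;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;

// Hypothetical helper class, not part of POI.
class EvaluateAllSketch {

    // Returns the cached result of C1 after all formula cells were re-evaluated.
    static double recalculate() throws java.io.IOException {
        try (HSSFWorkbook wb = new HSSFWorkbook()) {
            Row row = wb.createSheet("Data").createRow(0);
            row.createCell(0).setCellValue(1);   // A1
            row.createCell(1).setCellValue(2);   // B1
            Cell sum = row.createCell(2);
            sum.setCellFormula("A1+B1");         // C1

            // Static helper: walks every sheet, row and cell of the workbook
            HSSFFormulaEvaluator.evaluateAllFormulaCells(wb);

            // The formula is kept; only its cached result is updated
            return sum.getNumericCellValue();
        }
    }

    public static void main(String[] args) throws java.io.IOException {
        System.out.println(recalculate());
    }
}
```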
|
||||
</section>
|
||||
</section>
|
||||
|
||||
<anchor id="recalculation"/>
|
||||
<section><title>Recalculation of Formulas</title>
|
||||
<p>
|
||||
In certain cases you may want to force Excel to re-calculate formulas when the workbook is opened.
|
||||
Consider the following example:
|
||||
</p>
|
||||
<p>
|
||||
Open Excel and create a new workbook. On the first sheet set A1=1, B1=1, C1=A1+B1.
|
||||
Excel automatically calculates formulas and the value in C1 is 2. So far so good.
|
||||
</p>
|
||||
<p>
|
||||
Now modify the workbook with POI:
|
||||
</p>
|
||||
<source>
|
||||
Workbook wb = WorkbookFactory.create(new FileInputStream("workbook.xls"));
|
||||
|
||||
Sheet sh = wb.getSheetAt(0);
|
||||
sh.getRow(0).getCell(0).setCellValue(2); // set A1=2
|
||||
|
||||
FileOutputStream out = new FileOutputStream("workbook2.xls");
|
||||
wb.write(out);
|
||||
out.close();
|
||||
</source>
|
||||
<p>
|
||||
Now open workbook2.xls in Excel and the value in C1 is still 2 while you expected 3. Wrong? No!
|
||||
The point is that Excel caches previously calculated results and you need to trigger recalculation to update them.
It is not an issue when you are creating new workbooks from scratch, but important to remember when you are modifying
|
||||
existing workbooks with formulas. This can be done in two ways:
|
||||
</p>
|
||||
<p>
|
||||
1. Re-evaluate formulas with POI's FormulaEvaluator:
|
||||
</p>
|
||||
<source>
|
||||
Workbook wb = WorkbookFactory.create(new FileInputStream("workbook.xls"));
|
||||
|
||||
Sheet sh = wb.getSheetAt(0);
|
||||
sh.getRow(0).getCell(0).setCellValue(2); // set A1=2
|
||||
|
||||
wb.getCreationHelper().createFormulaEvaluator().evaluateAll();
|
||||
</source>
|
||||
<p>
|
||||
2. Delegate re-calculation to Excel. The application will perform a full recalculation when the workbook is opened:
|
||||
</p>
|
||||
<source>
|
||||
Workbook wb = WorkbookFactory.create(new FileInputStream("workbook.xls"));
|
||||
|
||||
Sheet sh = wb.getSheetAt(0);
|
||||
sh.getRow(0).getCell(0).setCellValue(2); // set A1=2
|
||||
|
||||
wb.setForceFormulaRecalculation(true);
|
||||
</source>
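<p>The stale cached value can also be observed from POI itself. The sketch below is
illustrative only (in-memory workbook, made-up class and method names): it reads the
cached result of C1 after A1 has changed, then re-evaluates with a fresh evaluator.</p>

```java
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Workbook;

// Hypothetical helper class, not part of POI.
class RecalculationSketch {

    // Returns { cached value before re-evaluation, value after re-evaluation }.
    static double[] staleThenFresh() throws java.io.IOException {
        try (Workbook wb = new HSSFWorkbook()) {
            Row row = wb.createSheet("Data").createRow(0);
            Cell a1 = row.createCell(0);
            a1.setCellValue(1);                  // A1=1
            row.createCell(1).setCellValue(1);   // B1=1
            Cell c1 = row.createCell(2);
            c1.setCellFormula("A1+B1");          // C1=A1+B1

            // First evaluation stores 2 as the cached result of C1
            wb.getCreationHelper().createFormulaEvaluator().evaluateAll();

            a1.setCellValue(2);                  // set A1=2; C1's cached result is untouched
            double stale = c1.getNumericCellValue();

            // A new evaluator starts with empty caches and recomputes C1
            wb.getCreationHelper().createFormulaEvaluator().evaluateAll();
            double fresh = c1.getNumericCellValue();

            return new double[] { stale, fresh };
        }
    }

    public static void main(String[] args) throws java.io.IOException {
        double[] values = staleThenFresh();
        System.out.println(values[0] + " then " + values[1]);
    }
}
```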
|
||||
</section>
|
||||
|
||||
<anchor id="external"/>
|
||||
<section><title>External (Cross-Workbook) references</title>
|
||||
<p>It is possible for a formula in an Excel spreadsheet to
|
||||
refer to a Named Range or Cell in a different workbook.
|
||||
These cross-workbook references are normally called <em>External
|
||||
References</em>. These are formulas which look something like:</p>
|
||||
<source>
|
||||
=SUM([Finances.xlsx]Numbers!D10:D25)
|
||||
=SUM('C:\Data\[Finances.xlsx]Numbers'!D10:D25)
|
||||
=SUM([Finances.xlsx]Range20)
|
||||
</source>
|
||||
<p>If you don't have access to these other workbooks, then you
|
||||
should call
|
||||
<a href="../../apidocs/dev/org/apache/poi/ss/usermodel/FormulaEvaluator.html#setIgnoreMissingWorkbooks(boolean)">setIgnoreMissingWorkbooks(true)</a>
|
||||
to tell the Formula Evaluator to skip evaluating any external
|
||||
references it can't look up.</p>
|
||||
<p>In order for POI to be able to evaluate external references, it
|
||||
needs access to the workbooks in question. As these don't necessarily
|
||||
have the same names on your system as in the workbook, you need to
|
||||
give POI a map of external references to open workbooks, through
|
||||
the
|
||||
<a href="../../apidocs/dev/org/apache/poi/ss/usermodel/FormulaEvaluator.html#setupReferencedWorkbooks(java.util.Map)">setupReferencedWorkbooks(java.util.Map&lt;java.lang.String,FormulaEvaluator&gt; workbooks)</a>
|
||||
method. You should normally do something like:</p>
|
||||
<source>
|
||||
// Create a FormulaEvaluator to use
|
||||
FormulaEvaluator mainWorkbookEvaluator = workbook.getCreationHelper().createFormulaEvaluator();
|
||||
|
||||
// Track the workbook references
|
||||
Map<String,FormulaEvaluator> workbooks = new HashMap<String, FormulaEvaluator>();
|
||||
// Add this workbook
|
||||
workbooks.put("report.xlsx", mainWorkbookEvaluator);
|
||||
// Add two others
|
||||
workbooks.put("input.xls", WorkbookFactory.create(new File("C:\\temp\\input22.xls")).getCreationHelper().createFormulaEvaluator());
workbooks.put("lookups.xlsx", WorkbookFactory.create(new File("/home/poi/data/tmp-lookups.xlsx")).getCreationHelper().createFormulaEvaluator());
|
||||
|
||||
// Attach them
|
||||
mainWorkbookEvaluator.setupReferencedWorkbooks(workbooks);
|
||||
|
||||
// Evaluate
|
||||
mainWorkbookEvaluator.evaluateAll();
|
||||
</source>
|
||||
</section>
|
||||
|
||||
<anchor id="Performance"/>
|
||||
<section><title>Performance Notes</title>
|
||||
<ul>
|
||||
<li>Generally you should only need to create one FormulaEvaluator
|
||||
instance per Workbook. The FormulaEvaluator will cache
|
||||
evaluations of dependent cells, so if you have multiple
|
||||
formulas all depending on a cell then subsequent evaluations
|
||||
will be faster.
|
||||
</li>
|
||||
<li>You should normally perform all of your updates to cells,
|
||||
before triggering the evaluation, rather than doing one
|
||||
cell at a time. By waiting until all the updates/sets are
|
||||
performed, you'll be able to take best advantage of the caching
|
||||
for complex formulas.
|
||||
</li>
|
||||
<li>If you do end up making changes to cells part way through
|
||||
evaluation, you should call <em>notifySetFormula</em> or
|
||||
<em>notifyUpdateCell</em> to trigger suitable cache clearance.
|
||||
Alternatively, you could instantiate a new FormulaEvaluator,
|
||||
which will start with empty caches.
|
||||
</li>
|
||||
<li>Also note that FormulaEvaluator maintains a reference to
|
||||
the sheet and workbook, so ensure that the evaluator instance
|
||||
is available for garbage collection when you are done with it
|
||||
(in other words, don't maintain a long-lived reference to
|
||||
FormulaEvaluator if you don't really need to - unless
|
||||
all references to the sheet and workbook are removed, these
|
||||
don't get garbage collected and continue to occupy potentially
|
||||
large amounts of memory).
|
||||
</li>
|
||||
<li>CellValue instances, however, do not maintain a reference to the
|
||||
Cell or the sheet or workbook, so these can be long-lived
|
||||
objects without any adverse effect on performance.
|
||||
</li>
|
||||
</ul>
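<p>The cache-clearing note above can be sketched as follows. This is an illustrative
example on an in-memory workbook with made-up names and values; it reuses one evaluator
and tells it about a changed input via notifyUpdateCell.</p>

```java
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.FormulaEvaluator;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Workbook;

// Hypothetical helper class, not part of POI.
class EvaluatorCacheSketch {

    // Returns the value of C1 after A1 changed and the evaluator was notified.
    static double reuseEvaluator() throws java.io.IOException {
        try (Workbook wb = new HSSFWorkbook()) {
            Row row = wb.createSheet("Data").createRow(0);
            Cell a1 = row.createCell(0);
            a1.setCellValue(1);                  // A1=1
            row.createCell(1).setCellValue(1);   // B1=1
            Cell c1 = row.createCell(2);
            c1.setCellFormula("A1+B1");          // C1=A1+B1

            // One evaluator for the whole workbook
            FormulaEvaluator evaluator = wb.getCreationHelper().createFormulaEvaluator();
            evaluator.evaluate(c1);              // first pass fills the evaluator's cache

            a1.setCellValue(5);                  // change an input part way through
            evaluator.notifyUpdateCell(a1);      // clear cached results that depend on A1

            return evaluator.evaluate(c1).getNumberValue();
        }
    }

    public static void main(String[] args) throws java.io.IOException {
        System.out.println(reuseEvaluator());
    }
}
```

<p>Note that the notification is sent after the new value has been set, so the evaluator
can pick up the updated input.</p>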
|
||||
</section>
|
||||
<section><title>Formula Evaluation Debugging</title>
|
||||
<p>POI is not perfect and you may stumble across formula evaluation problems (Java exceptions
|
||||
or just different results) in your special use case. To support an easy detailed analysis, a special
|
||||
logging of the full evaluation is provided.</p>
|
||||
<p>POI 5.1.0 and above uses <a href="https://logging.apache.org/log4j/2.x/">Log4J 2.x</a> as a logging framework. Try to set up a logging
|
||||
configuration that lets you see the info and other log messages.</p>
|
||||
<p>Example use:</p>
|
||||
<source>
|
||||
// open your file
|
||||
Workbook wb = new HSSFWorkbook(new FileInputStream("foobar.xls"));
|
||||
FormulaEvaluator evaluator = wb.getCreationHelper().createFormulaEvaluator();
|
||||
|
||||
// get your cell
|
||||
Cell cell = wb.getSheetAt(0).getRow(0).getCell(0); // just a dummy example
|
||||
|
||||
// perform debug output for the next evaluate-call only
|
||||
evaluator.setDebugEvaluationOutputForNextEval(true);
|
||||
evaluator.evaluateFormulaCell(cell);
|
||||
evaluator.evaluateFormulaCell(cell); // no logging performed for this next evaluate-call
|
||||
</source>
|
||||
<p>The special Logger called "POI.FormulaEval" is used (useful if you use the CommonsLogger and a detailed logging configuration).
The log levels used are WARN and INFO (for detailed parameter info and results). The levels are set this high so that
this special logging can be enabled without being drowned out by the many DEBUG log entries from other classes.</p>
|
||||
</section>
|
||||
|
||||
<anchor id="sxssf"/>
|
||||
<section><title>Formula Evaluation and SXSSF</title>
|
||||
<p>For versions before 3.13 final, no formula evaluation is possible with
|
||||
SXSSF.</p>
|
||||
<p>If you are using POI 3.13 final or newer, formula evaluation is possible with SXSSF,
|
||||
but with some caveats.</p>
|
||||
<p>The biggest restriction is that, since evaluating a cell needs that cell in memory
|
||||
and any others it depends on, only pure-function formulas and formulas referencing
|
||||
nearby cells can be evaluated with SXSSF. If a formula references a cell that hasn't
|
||||
yet been written, or one which has already been flushed to disk, then it won't be
|
||||
possible to evaluate it.</p>
|
||||
<p>Because of this, a call to <em>wb.getCreationHelper().createFormulaEvaluator().evaluateAll();</em>
|
||||
will very rarely work on SXSSF, as it's very rare that all the cells will be available
|
||||
and in memory at any time! Instead, it is suggested to evaluate formula cells just
|
||||
after writing them, or shortly after when cells they depend on are added. Just make
|
||||
sure that all cells needing or needed for evaluation are inside the window.</p>
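<p>A minimal sketch of evaluating a formula while its inputs are still inside the window
is shown below. It assumes POI 3.13 final or newer with the poi-ooxml module on the
classpath; the window size of 100 rows, the class name and the cell values are
arbitrary choices made for illustration.</p>

```java
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.FormulaEvaluator;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;

// Hypothetical helper class, not part of POI.
class SxssfEvalSketch {

    // Evaluates C1 right after writing it, while row 0 is still in memory.
    static double evaluateInsideWindow() throws java.io.IOException {
        SXSSFWorkbook wb = new SXSSFWorkbook(100); // keep up to 100 rows in memory
        try {
            Row row = wb.createSheet("Data").createRow(0);
            row.createCell(0).setCellValue(1);     // A1
            row.createCell(1).setCellValue(2);     // B1
            Cell c1 = row.createCell(2);
            c1.setCellFormula("A1+B1");            // C1 only references cells in the window

            FormulaEvaluator evaluator = wb.getCreationHelper().createFormulaEvaluator();
            return evaluator.evaluate(c1).getNumberValue();
        } finally {
            wb.dispose();                          // remove temporary files
            wb.close();
        }
    }

    public static void main(String[] args) throws java.io.IOException {
        System.out.println(evaluateInsideWindow());
    }
}
```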
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
@ -0,0 +1,274 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>HSSF and XSSF Examples</title>
|
||||
<authors>
|
||||
<person id="YK" name="Yegor Kozlov" email="user@poi.apache.org"/>
|
||||
</authors>
|
||||
</header>
|
||||
<body>
|
||||
<section><title>HSSF and XSSF common examples</title>
|
||||
<p>Apache POI comes with a number of examples that demonstrate how you
|
||||
can use the POI API to create documents from "real life".
|
||||
The examples below are based on the common XSSF-HSSF interfaces so that you
|
||||
can generate either *.xls or *.xlsx output just by setting a
|
||||
command-line argument:
|
||||
</p>
|
||||
<source>
|
||||
BusinessPlan -xls
|
||||
or
|
||||
BusinessPlan -xlsx
|
||||
</source>
|
||||
<p>All sample source is available in <a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/ss/">SVN</a></p>
|
||||
<p>In addition, there are a handful of
|
||||
<a href="#hssf-only">HSSF only</a> and
|
||||
<a href="#xssf-only">XSSF only</a> examples as well.
|
||||
</p>
|
||||
|
||||
<section><title>Available Examples</title>
|
||||
<p>
|
||||
The following examples are available:
|
||||
</p>
|
||||
<ul>
|
||||
<li><a href="#ss-common">Common HSSF and XSSF</a><ul>
|
||||
<li><a href="#business-plan">Business Plan</a></li>
|
||||
<li><a href="#calendar">Calendar</a></li>
|
||||
<li><a href="#loan-calculator">Loan Calculator</a></li>
|
||||
<li><a href="#timesheet">Timesheet</a></li>
|
||||
<li><a href="#conditional-formats">Conditional Formats</a></li>
|
||||
<li><a href="#common-formulas">Formula Examples</a></li>
|
||||
<li><a href="#add-dimensioned-image">Add Dimensioned Image</a></li>
|
||||
<li><a href="#aligned-cells">Aligned Cells</a></li>
|
||||
<li><a href="#cell-style-details">Cell Style Details</a></li>
|
||||
<li><a href="#linked-dropdown">Linked Dropdown Lists</a></li>
|
||||
<li><a href="#performance-test">Common SS Performance Test</a></li>
|
||||
<li><a href="#to-html">To HTML</a></li>
|
||||
<li><a href="#to-csv">To CSV</a></li>
|
||||
</ul></li>
|
||||
<li><a href="#hssf-only">HSSF-Only</a></li>
|
||||
<li><a href="#xssf-only">XSSF-Only</a></li>
|
||||
</ul>
|
||||
</section>
|
||||
|
||||
<anchor id="ss-common" />
|
||||
<anchor id="business-plan" />
|
||||
<section><title>Business Plan</title>
|
||||
<p> The <a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/ss/BusinessPlan.java">BusinessPlan</a>
|
||||
application creates a sample business plan with three phases, weekly iterations and time highlighting. Demonstrates advanced cell formatting
|
||||
(number and date formats, alignments, fills, borders) and various settings for organizing data in a sheet (frozen panes, grouped rows).
|
||||
</p>
|
||||
<figure src="images/businessplan.jpg" alt="business plan demo"/>
|
||||
</section>
|
||||
|
||||
<anchor id="calendar" />
|
||||
<section><title>Calendar</title>
|
||||
<p> The <a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/ss/CalendarDemo.java">Calendar</a>
|
||||
demo creates a multi sheet calendar. Each month is on a separate sheet.
|
||||
</p>
|
||||
<figure src="images/calendar.jpg" alt="calendar demo"/>
|
||||
</section>
|
||||
|
||||
<anchor id="loan-calculator" />
|
||||
<section><title>Loan Calculator</title>
|
||||
<p> The <a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/ss/LoanCalculator.java">LoanCalculator</a>
|
||||
demo creates a simple loan calculator. Demonstrates advanced usage of cell formulas and named ranges.
|
||||
</p>
|
||||
<figure src="images/loancalc.jpg" alt="loan calculator demo"/>
|
||||
</section>
|
||||
|
||||
<anchor id="timesheet" />
|
||||
<section><title>Timesheet</title>
|
||||
<p> The <a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/ss/TimesheetDemo.java">Timesheet</a>
|
||||
demo creates a weekly timesheet with automatic calculation of total hours. Demonstrates advanced usage of cell formulas.
|
||||
</p>
|
||||
<figure src="images/timesheet.jpg" alt="timesheet demo"/>
|
||||
</section>
|
||||
|
||||
<anchor id="conditional-formats" />
|
||||
<section><title>Conditional Formats</title>
|
||||
<p> The <a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/ss/ConditionalFormats.java">ConditionalFormats</a>
|
||||
demo is a collection of short examples showing what you can do with Excel conditional formatting in POI:
|
||||
</p>
|
||||
<ul>
|
||||
<li>Highlight cells based on their values</li>
|
||||
<li>Highlight a range of cells based on a formula</li>
|
||||
<li>Hide errors</li>
|
||||
<li>Hide the duplicate values</li>
|
||||
<li>Highlight duplicate entries in a column</li>
|
||||
<li>Highlight items that are in a list on the worksheet</li>
|
||||
<li>Highlight payments that are due in the next thirty days</li>
|
||||
<li>Shade alternating rows on the worksheet</li>
|
||||
<li>Shade bands of rows on the worksheet</li>
|
||||
</ul>
|
||||
</section>
|
||||
|
||||
<anchor id="common-formulas" />
|
||||
<section><title>Formula Examples</title>
|
||||
<p>The <a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/ss/formula/CalculateMortgage.java">CalculateMortgage</a>
|
||||
example demonstrates a simple user-defined function to calculate
|
||||
principal and interest.</p>
|
||||
<p>The <a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/ss/formula/CheckFunctionsSupported.java">CheckFunctionsSupported</a>
|
||||
example shows how to check which functions and formulas used in a
given file aren't supported.</p>
|
||||
<p>The <a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/ss/formula/SettingExternalFunction.java">SettingExternalFunction</a>
|
||||
example demonstrates how to use externally provided (third-party)
|
||||
formula add-ins.</p>
|
||||
<p>The <a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/ss/formula/UserDefinedFunctionExample.java">UserDefinedFunctionExample</a>
|
||||
example demonstrates how to invoke a User Defined Function for a
|
||||
given Workbook instance using POI's UDFFinder implementation.</p>
|
||||
</section>
|
||||
|
||||
<anchor id="add-dimensioned-image" />
|
||||
<section><title>Add Dimensioned Image</title>
|
||||
<p>The <a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/ss/AddDimensionedImage.java">AddDimensionedImage</a>
|
||||
example demonstrates how to add an image to a worksheet and set that
|
||||
image's size to a specific number of millimetres irrespective of the
|
||||
width of the columns or height of the rows.</p>
|
||||
</section>
|
||||
|
||||
<anchor id="aligned-cells" />
|
||||
<section><title>Aligned Cells</title>
|
||||
<p>The <a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/ss/AligningCells.java">AligningCells</a>
|
||||
example demonstrates how various alignment options work.</p>
|
||||
</section>
|
||||
|
||||
<anchor id="cell-style-details" />
|
||||
<section><title>Cell Style Details</title>
|
||||
<p>The <a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/ss/CellStyleDetails.java">CellStyleDetails</a>
|
||||
example demonstrates how to read Excel styles for cells.</p>
|
||||
</section>
|
||||
|
||||
<anchor id="linked-dropdown" />
|
||||
<section><title>Linked Dropdown Lists</title>
|
||||
<p>The <a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/ss/LinkedDropDownLists.java">LinkedDropDownLists</a>
|
||||
example demonstrates one technique that may be used to create linked
|
||||
or dependent drop down lists.</p>
|
||||
</section>
|
||||
|
||||
<anchor id="performance-test" />
|
||||
<section><title>Common SS Performance Test</title>
|
||||
<p>The <a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/ss/SSPerformanceTest.java">SSPerformanceTest</a>
|
||||
example provides a way to create simple example files of varying
|
||||
sizes, and to measure how long that takes. Useful for benchmarking
|
||||
your system, and to also test if slow performance is due to Apache
|
||||
POI itself or to your own code.</p>
|
||||
</section>
|
||||
|
||||
<anchor id="to-html" />
|
||||
<section><title>ToHtml</title>
|
||||
<p> The <a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/ss/html/ToHtml.java">ToHtml</a>
|
||||
example shows how to display a spreadsheet in HTML using the classes for spreadsheet display.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<anchor id="to-csv" />
|
||||
<section><title>ToCSV</title>
|
||||
<p>The <a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/ss/ToCSV.java">ToCSV</a>
|
||||
example demonstrates <em>one</em> way to convert an Excel spreadsheet into a CSV file.
|
||||
</p>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
<anchor id="hssf-only" />
|
||||
<section><title>HSSF-only Examples</title>
|
||||
<p>All the HSSF-only examples can be found in
|
||||
<a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/">SVN</a></p>
|
||||
<ul>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/usermodel/CellComments.java">CellComments</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/usermodel/HyperlinkFormula.java">HyperlinkFormula</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/usermodel/EventExample.java">EventExample</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/usermodel/OfficeDrawingWithGraphics.java">OfficeDrawingWithGraphics</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/usermodel/CreateDateCells.java">CreateDateCells</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/usermodel/NewWorkbook.java">NewWorkbook</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/usermodel/EmeddedObjects.java">EmeddedObjects</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/usermodel/Hyperlinks.java">Hyperlinks</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/usermodel/OfficeDrawing.java">OfficeDrawing</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/usermodel/HSSFReadWrite.java">HSSFReadWrite</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/usermodel/NewSheet.java">NewSheet</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/usermodel/SplitAndFreezePanes.java">SplitAndFreezePanes</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/usermodel/InCellLists.java">InCellLists</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/usermodel/RepeatingRowsAndColumns.java">RepeatingRowsAndColumns</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/usermodel/MergedCells.java">MergedCells</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/usermodel/CellTypes.java">CellTypes</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/usermodel/ZoomSheet.java">ZoomSheet</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/usermodel/ReadWriteWorkbook.java">ReadWriteWorkbook</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/usermodel/CreateCells.java">CreateCells</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/usermodel/Alignment.java">Alignment</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/usermodel/FrillsAndFills.java">FrillsAndFills</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/usermodel/AddDimensionedImage.java">AddDimensionedImage</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/usermodel/Borders.java">Borders</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/usermodel/NewLinesInCells.java">NewLinesInCells</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/usermodel/WorkingWithFonts.java">WorkingWithFonts</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/usermodel/BigExample.java">BigExample</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/usermodel/Outlines.java">Outlines</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/eventusermodel/XLS2CSVmra.java">XLS2CSVmra</a></li>
|
||||
</ul>
|
||||
</section>
|
||||
|
||||
<anchor id="xssf-only" />
|
||||
<section><title>XSSF-only Examples</title>
|
||||
<p>All the XSSF-only examples can be found in
|
||||
<a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/">SVN</a></p>
|
||||
<ul>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/CellComments.java">CellComments</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/HeadersAndFooters.java">HeadersAndFooters</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/CreateUserDefinedDataFormats.java">CreateUserDefinedDataFormats</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/CreatePivotTable.java">CreatePivotTable</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/CreatePivotTable2.java">CreatePivotTable2</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/FillsAndColors.java">FillsAndColors</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/WorkingWithBorders.java">WorkingWithBorders</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/BigGridDemo.java">BigGridDemo</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/CreateTable.java">CreateTable</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/CalendarDemo.java">CalendarDemo</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/AligningCells.java">AligningCells</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/SplitAndFreezePanes.java">SplitAndFreezePanes</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/WorkingWithPageSetup.java">WorkingWithPageSetup</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/WorkingWithPictures.java">WorkingWithPictures</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/MergingCells.java">MergingCells</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/CustomXMLMapping.java">CustomXMLMapping</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/SelectedSheet.java">SelectedSheet</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/EmbeddedObjects.java">EmbeddedObjects</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/WorkbookProperties.java">WorkbookProperties</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/NewLinesInCells.java">NewLinesInCells</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/Outlining.java">Outlining</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/CreateCell.java">CreateCell</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/IterateCells.java">IterateCells</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/BarChart.java">BarChart</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/BarAndLineChart.java">BarAndLineChart</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/LineChart.java">LineChart</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/ScatterChart.java">ScatterChart</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/WorkingWithFonts.java">WorkingWithFonts</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/HyperlinkExample.java">HyperlinkExample</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/ShiftRows.java">ShiftRows</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/WorkingWithRichText.java">WorkingWithRichText</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/usermodel/FitSheetToOnePage.java">FitSheetToOnePage</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/streaming/HybridStreaming.java">HybridStreaming</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/streaming/Outlining.java">Outlining (SXSSF output)</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/streaming/DeferredGeneration.java">DeferredGeneration (SXSSF output)</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/streaming/SavePasswordProtectedXlsx.java">SavePasswordProtectedXlsx (SXSSF output)</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/eventusermodel/XLSX2CSV.java">XLSX2CSV (streaming read)</a></li>
|
||||
<li><a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/eventusermodel/FromHowTo.java">FromHowTo (streaming read)</a></li>
|
||||
</ul>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
@ -0,0 +1,317 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>ExcelAnt - Ant Tasks for Validating Excel Spreadsheets</title>
|
||||
<authors>
|
||||
<person email="jon@loquatic.com" name="Jon Svede" id="JDS"/>
|
||||
<person email="brian.bush@nrel.gov" name="Brian Bush" id="BWB"/>
|
||||
</authors>
|
||||
</header>
|
||||
<body>
|
||||
<section><title>ExcelAnt - Ant Tasks for Validating Excel Spreadsheets</title>
|
||||
|
||||
<section><title>Introduction</title>
|
||||
<p>ExcelAnt is a set of Ant tasks that make it possible to verify or test
|
||||
a workbook without having to write Java code. Of course, the tasks themselves
|
||||
are written in Java, but to use this framework you only need to know a little
|
||||
bit about Ant.</p>
|
||||
<p>This document covers the basic usage and set up of ExcelAnt.</p>
|
||||
<p>This document will assume basic familiarity with Ant and Ant build files.</p>
|
||||
</section>
|
||||
<section><title>Setup</title>
|
||||
<p>To start with ExcelAnt, you'll need to have the POI 3.8 or higher jar files. If you test only .xls
|
||||
workbooks then you need to have the following jars in your path:</p>
|
||||
<ul>
|
||||
<li>poi-excelant-$version-YYYYMMDD.jar</li>
|
||||
<li>poi-$version-YYYYMMDD.jar</li>
|
||||
<li>poi-ooxml-$version-YYYYMMDD.jar</li>
|
||||
</ul>
|
||||
<p> If you evaluate .xlsx workbooks then you need to add these: </p>
|
||||
<ul>
|
||||
<li>poi-ooxml-lite-$version-YYYYMMDD.jar</li>
|
||||
<li>xmlbeans.jar</li>
|
||||
</ul>
|
||||
<p>For example, if you have these jars in a lib/ dir in your project, your build.xml
|
||||
might look like this:</p>
|
||||
<source><![CDATA[
|
||||
<property name="lib.dir" value="lib" />
|
||||
|
||||
<path id="excelant.path">
|
||||
<pathelement location="${lib.dir}/poi-excelant-3.8-beta1-20101230.jar" />
|
||||
<pathelement location="${lib.dir}/poi-3.8-beta1-20101230.jar" />
|
||||
<pathelement location="${lib.dir}/poi-ooxml-3.8-beta1-20101230.jar" />
|
||||
</path>
|
||||
]]></source>
|
||||
<p>Next, you'll need to define the Ant tasks. There are several ways to use ExcelAnt:</p>
|
||||
|
||||
<ul><li>The traditional way:</li></ul>
|
||||
<source><![CDATA[
|
||||
<typedef resource="org/apache/poi/ss/excelant/antlib.xml" classpathref="excelant.path" />
|
||||
]]></source>
|
||||
<p>
|
||||
Where excelant.path refers to the classpath with POI jars.
|
||||
Using this approach the provided extensions will live in the default namespace. Note that the default task/typenames (evaluate, test) may be too generic and should either be explicitly overridden or used with a namespace.
|
||||
</p>
|
||||
<ul><li>Similar, but assigning a namespace URI:</li></ul>
|
||||
<source><![CDATA[
|
||||
<project name="excelant-demo" xmlns:poi="antlib:org.apache.poi.ss.excelant">
|
||||
|
||||
<typedef resource="org/apache/poi/ss/excelant/antlib.xml"
|
||||
classpathref="excelant.path"
|
||||
uri="antlib:org.apache.poi.ss.excelant"/>
|
||||
|
||||
<target name="test-nofile">
|
||||
<poi:excelant>
|
||||
|
||||
</poi:excelant>
|
||||
</target>
|
||||
</project>
|
||||
]]></source>
|
||||
</section>
|
||||
|
||||
<section><title>A Simple Example</title>
|
||||
<p>The simplest example of using ExcelAnt is the ability to validate that POI is giving you back
|
||||
the value you expect it to. Does this mean that POI is inaccurate? Hardly. There are cases
|
||||
where POI is unable to evaluate cells for a variety of reasons. If you need to write code
|
||||
to integrate a worksheet into an app, you may want to know that it's going to work before
|
||||
you actually try to write that code. ExcelAnt helps with that.</p>
|
||||
|
||||
<p>Consider the <a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/ss/excelant/simple-mortgage-calculation.xls">mortgage-calculation.xls</a>
|
||||
file found in the Examples (link broken / file is missing). This sheet is shown below:</p>
|
||||
|
||||
<figure src="images/simple-xls-with-function.jpg" alt="mortgage calculation spreadsheet"/>
|
||||
|
||||
<p>This sheet calculates the principal and interest payment for a mortgage based
|
||||
on the amount of the loan, term and rate. To write a simple ExcelAnt test you
|
||||
need to tell ExcelAnt about the file like this:</p>
|
||||
<source><![CDATA[
|
||||
<property name="xls.file" value="" />
|
||||
|
||||
<target name="simpleTest">
|
||||
<excelant fileName="${xls.file}">
|
||||
<test name="checkValue" showFailureDetail="true">
|
||||
<evaluate showDelta="true" cell="'MortgageCalculator'!$B$4" expectedValue="790.7936" precision="1.0e-4" />
|
||||
</test>
|
||||
</excelant>
|
||||
</target>
|
||||
]]></source>
|
||||
|
||||
|
||||
<p>This code sets up ExcelAnt to access the file defined in the ant property
|
||||
xls.file. Then it creates a 'test' named 'checkValue'. Finally it tries
|
||||
to evaluate the B4 on the sheet named 'MortgageCalculator'. There are some assumptions
|
||||
here that are worth explaining. For starters, ExcelAnt is focused on testing
|
||||
numerically oriented sheets. The &lt;evaluate&gt; task is actually evaluating the
|
||||
cell as a formula using a FormulaEvaluator instance from POI. Therefore it will fail
|
||||
if you point it to a cell that doesn't contain a formula or a plain old number.</p>
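<p>To illustrate how a cell attribute such as 'MortgageCalculator'!$B$4 breaks down, the
following sketch splits it into sheet name, column and row using only the JDK. The class
CellRefSketch and its parse method are hypothetical names introduced for this example; they
are not part of ExcelAnt or POI, which have their own reference handling.</p>
<source><![CDATA[
```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of POI: splits 'Sheet'!$B$4 into its parts. */
final class CellRefSketch {
    // quoted sheet name, '!', optional '$', column letters, optional '$', row digits
    private static final Pattern REF =
            Pattern.compile("^'([^']+)'!\\$?([A-Za-z]+)\\$?(\\d+)$");

    final String sheet;
    final int columnIndex; // zero-based: A = 0, B = 1, ...
    final int rowNumber;   // one-based, as displayed in Excel

    private CellRefSketch(String sheet, int columnIndex, int rowNumber) {
        this.sheet = sheet;
        this.columnIndex = columnIndex;
        this.rowNumber = rowNumber;
    }

    static CellRefSketch parse(String ref) {
        Matcher m = REF.matcher(ref);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a sheet-qualified cell reference: " + ref);
        }
        int col = 0;
        for (char ch : m.group(2).toUpperCase(Locale.ROOT).toCharArray()) {
            col = col * 26 + (ch - 'A' + 1); // column letters are base 26 with A = 1
        }
        return new CellRefSketch(m.group(1), col - 1, Integer.parseInt(m.group(3)));
    }

    public static void main(String[] args) {
        CellRefSketch r = parse("'MortgageCalculator'!$B$4");
        System.out.println(r.sheet + " column=" + r.columnIndex + " row=" + r.rowNumber);
    }
}
```
]]></source>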
|
||||
|
||||
<p>Having said all that, here is what the output looks like:</p>
|
||||
|
||||
<source><![CDATA[
|
||||
simpleTest:
|
||||
[excelant] ExcelAnt version 0.4.0 Copyright 2011
|
||||
[excelant] Using input file: resources/excelant.xls
|
||||
[excelant] 1/1 tests passed.
|
||||
BUILD SUCCESSFUL
|
||||
Total time: 391 milliseconds
|
||||
]]></source>
|
||||
|
||||
</section>
|
||||
|
||||
<section><title>Setting Values into a Cell</title>
|
||||
<p>So now we know that at a minimum POI can use our sheet to calculate the existing value.
|
||||
This is an important point: in many cases sheets have dependencies, i.e., cells they reference.
|
||||
As is often the case, these cells may have dependencies, which may have dependencies, etc.
|
||||
The point is that sometimes a dependent cell may get adjusted by a macro or a function
|
||||
and it may be that POI doesn't have the capabilities to do the same thing. This test
|
||||
verifies that we can rely on POI to retrieve the default value, based on the stored values
|
||||
of the sheet. Now we want to know if we can manipulate those dependencies and verify
|
||||
the output.</p>
|
||||
|
||||
<p>To verify that we can manipulate cell values, we need a way in ExcelAnt to set a value.
|
||||
This is provided by the following task types:</p>
|
||||
<ul>
|
||||
<li>setDouble() - sets the specified cell as a double.</li>
|
||||
<li>setFormula() - sets the specified cell as a formula.</li>
|
||||
<li>setString() - sets the specified cell as a String.</li>
|
||||
</ul>
|
||||
|
||||
<p>For the purposes of this example we'll use the &lt;setDouble&gt; task. Let's
|
||||
start with a $240,000, 30 year loan at 11% (let's pretend it's like 1984). Here
|
||||
is how we will set that up:</p>
|
||||
|
||||
<source><![CDATA[
|
||||
<setDouble cell="'MortgageCalculator'!$B$1" value="240000"/>
|
||||
<setDouble cell="'MortgageCalculator'!$B$2" value ="0.11"/>
|
||||
<setDouble cell="'MortgageCalculator'!$B$3" value ="30"/>
|
||||
<evaluate showDelta="true" cell="'MortgageCalculator'!$B$4" expectedValue="2285.576149" precision="1.0e-4" />
|
||||
]]></source>
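<p>The expected value above can also be reproduced outside the spreadsheet. The sketch below
assumes that cell B4 implements the standard fixed-rate annuity formula,
payment = P * r / (1 - (1 + r)^-n), with monthly rate r and n monthly payments. This document
does not show the sheet's actual formula, so treat that as an assumption; the class name
MortgagePaymentSketch is hypothetical.</p>
<source><![CDATA[
```java
/** Standard fixed-rate annuity payment; assumed (not confirmed) to match cell B4. */
final class MortgagePaymentSketch {

    static double monthlyPayment(double principal, double annualRate, int years) {
        double r = annualRate / 12.0; // monthly interest rate
        int n = years * 12;           // number of monthly payments
        return principal * r / (1.0 - Math.pow(1.0 + r, -n));
    }

    public static void main(String[] args) {
        double payment = monthlyPayment(240000, 0.11, 30);
        // Print the computed payment next to the expected value used in the test
        System.out.printf("computed=%.6f expected=%.6f%n", payment, 2285.576149);
    }
}
```
]]></source>
<p>Running a check like this before editing the build file is a quick way to confirm that the
expectedValue you are about to type in is plausible.</p>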
|
||||
|
||||
<p>Don't forget that we're verifying the behavior so you need to put all this
|
||||
into the sheet. That is how I got the result of $2,285 and change. So save your
|
||||
changes and run it; you should get the following: </p>
|
||||
|
||||
<source><![CDATA[
|
||||
Buildfile: C:\opt\eclipse\workspaces\excelant\excelant.examples\build.xml
|
||||
simpleTest:
|
||||
[excelant] ExcelAnt version 0.4.0 Copyright 2011
|
||||
[excelant] Using input file: resources/excelant.xls
|
||||
[excelant] 1/1 tests passed.
|
||||
BUILD SUCCESSFUL
|
||||
Total time: 406 milliseconds
|
||||
]]></source>
|
||||
|
||||
</section>
|
||||
|
||||
<section><title>Getting More Details</title>
|
||||
|
||||
<p>This is great, it's working! However, suppose you want to see a little more detail. The
|
||||
ExcelAnt tasks leverage the Ant logging so you can add the -verbose and -debug flags to
|
||||
the Ant command line to get more detail. Try adding -verbose. Here is what
|
||||
you should see:</p>
|
||||
|
||||
<source><![CDATA[
|
||||
simpleTest:
|
||||
[excelant] ExcelAnt version 0.4.0 Copyright 2011
|
||||
[excelant] Using input file: resources/excelant.xls
|
||||
[evaluate] test precision = 1.0E-4 global precision = 0.0
|
||||
[evaluate] Using evaluate precision of 1.0E-4
|
||||
[excelant] 1/1 tests passed.
|
||||
BUILD SUCCESSFUL
|
||||
Total time: 406 milliseconds
|
||||
]]></source>
|
||||
|
||||
|
||||
<p>We see a little more detail. Notice that we see that there is a setting for global precision.
|
||||
Up until now we've been setting the precision on each evaluate that we call. This
|
||||
is obviously useful but it gets cumbersome. It would be better if there were a way
|
||||
that we could specify a global precision - and there is. There is a &lt;precision&gt;
|
||||
tag that you can specify as a child of the &lt;excelant&gt; tag. Let's go back to
|
||||
our original task we set up earlier and modify it:</p>
|
||||
|
||||
<source><![CDATA[
|
||||
<property name="xls.file" value="" />
|
||||
|
||||
<target name="simpleTest">
|
||||
<excelant fileName="${xls.file}">
|
||||
<precision value="1.0e-3"/>
|
||||
<test name="checkValue" showFailureDetail="true">
|
||||
<evaluate showDelta="true" cell="'MortgageCalculator'!$B$4" expectedValue="790.7936" />
|
||||
</test>
|
||||
</excelant>
|
||||
</target>
|
||||
]]></source>
|
||||
|
||||
<p>In this example we have set the global precision to 1.0e-3. This means that
|
||||
in the absence of something more stringent, all tests in the task will use
|
||||
the global precision. We can still override this by specifying the
|
||||
precision attribute of an individual &lt;evaluate&gt; task. Let's first run
|
||||
this task with the global precision and the -verbose flag:</p>
|
||||
|
||||
<source><![CDATA[
|
||||
simpleTest:
|
||||
[excelant] ExcelAnt version 0.4.0 Copyright 2011
|
||||
[excelant] Using input file: resources/excelant.xls
|
||||
[excelant] setting precision for the test checkValue
|
||||
[test] setting globalPrecision to 0.0010 in the evaluator
|
||||
[evaluate] test precision = 0.0 global precision = 0.0010
|
||||
[evaluate] Using global precision of 0.0010
|
||||
[excelant] 1/1 tests passed.
|
||||
]]></source>
|
||||
|
||||
|
||||
<p>As the output clearly shows, the test itself has no precision but there is
|
||||
the global precision. Additionally, it tells us we're going to use that
|
||||
global value. Now suppose that for this test we want
|
||||
to use a more stringent precision, say 1.0e-4. We can do that by adding
|
||||
the precision attribute back to the &lt;evaluate&gt; task:</p>
|
||||
|
||||
<source><![CDATA[
|
||||
<excelant fileName="${xls.file}">
|
||||
<precision value="1.0e-3"/>
|
||||
<test name="checkValue" showFailureDetail="true">
|
||||
<setDouble cell="'MortgageCalculator'!$B$1" value="240000"/>
|
||||
<setDouble cell="'MortgageCalculator'!$B$2" value ="0.11"/>
|
||||
<setDouble cell="'MortgageCalculator'!$B$3" value ="30"/>
|
||||
<evaluate showDelta="true" cell="'MortgageCalculator'!$B$4" expectedValue="2285.576149" precision="1.0e-4" />
|
||||
</test>
|
||||
</excelant>
|
||||
]]></source>
|
||||
|
||||
|
||||
<p>Now when you re-run this test with the verbose flag you will see that
|
||||
your test ran and passed with the higher precision:</p>
|
||||
<source><![CDATA[
|
||||
simpleTest:
|
||||
[excelant] ExcelAnt version 0.4.0 Copyright 2011
|
||||
[excelant] Using input file: resources/excelant.xls
|
||||
[excelant] setting precision for the test checkValue
|
||||
[test] setting globalPrecision to 0.0010 in the evaluator
|
||||
[evaluate] test precision = 1.0E-4 global precision = 0.0010
|
||||
[evaluate] Using evaluate precision of 1.0E-4 over the global precision of 0.0010
|
||||
[excelant] 1/1 tests passed.
|
||||
BUILD SUCCESSFUL
|
||||
Total time: 390 milliseconds
|
||||
]]></source>
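<p>The selection between the two precisions, as it appears in the -verbose output, can be
sketched as follows. This is not ExcelAnt's implementation: the class and method names are
hypothetical, and the pass condition (absolute difference within the precision) is an
assumption based on how the precision attribute is described here.</p>
<source><![CDATA[
```java
/** Hypothetical sketch of the precision choice seen in the -verbose output. */
final class PrecisionSketch {

    // A precision of 0.0 is treated as "not set", matching the log lines
    // "test precision = 0.0 global precision = 0.0010".
    static double effectivePrecision(double evaluatePrecision, double globalPrecision) {
        return evaluatePrecision > 0.0 ? evaluatePrecision : globalPrecision;
    }

    // Assumption: a check passes when |actual - expected| is within the precision.
    static boolean passes(double actual, double expected, double precision) {
        return Math.abs(actual - expected) <= precision;
    }

    public static void main(String[] args) {
        double global = 1.0e-3;
        // No per-evaluate precision: fall back to the global value
        System.out.println(effectivePrecision(0.0, global));
        // Per-evaluate precision present: it takes priority over the global value
        System.out.println(effectivePrecision(1.0e-4, global));
        System.out.println(passes(2285.57618, 2285.576149, effectivePrecision(1.0e-4, global)));
    }
}
```
]]></source>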
|
||||
</section>
|
||||
|
||||
<section><title>Leveraging User Defined Functions</title>
|
||||
<p>POI has an excellent feature (besides ExcelAnt) called <a href="user-defined-functions.html">User Defined Functions</a>,
|
||||
that allows you to write Java code that will be used in place of custom VB
|
||||
code or macros in a spreadsheet. If you have read the documentation and written
|
||||
your own FreeRefFunction implementations, ExcelAnt can make use of this code.
|
||||
For each &lt;excelant&gt; task you define, you can nest a &lt;udf&gt; tag
|
||||
which allows you to specify the function alias and the class name.</p>
|
||||
|
||||
<p>Consider the previous example of the mortgage calculator. What if, instead
|
||||
of being a formula in a cell, it was a function defined in a VB macro? As luck
|
||||
would have it, we already have an example of this in the examples from the
|
||||
User Defined Functions example, so let's use that. In the example spreadsheet
|
||||
there is a tab for MortgageCalculatorFunction, which we will use. If you look in
|
||||
cell B4, you see that rather than a messy cell based formula, there is only the function
|
||||
call. Let's not get bogged down in the function/Java implementation, as these
|
||||
are covered in the User Defined Function documentation. Let's just add
|
||||
a new target and test to our existing build file:</p>
|
||||
<source><![CDATA[
|
||||
<target name="functionTest">
|
||||
<excelant fileName="${xls.file}">
|
||||
<udf functionAlias="calculatePayment" class="org.apache.poi.ss.examples.formula.CalculateMortgage"/>
|
||||
<precision value="1.0e-3"/>
|
||||
<test name="checkValue" showFailureDetail="true">
|
||||
<setDouble cell="'MortgageCalculator'!$B$1" value="240000"/>
|
||||
<setDouble cell="'MortgageCalculator'!$B$2" value ="0.11"/>
|
||||
<setDouble cell="'MortgageCalculator'!$B$3" value ="30"/>
|
||||
<evaluate showDelta="true" cell="'MortgageCalculatorFunction'!$B$4" expectedValue="2285.576149" precision="1.0e-4" />
|
||||
</test>
|
||||
</excelant>
|
||||
</target>
|
||||
]]></source>
|
||||
|
||||
<p>So if you look at this carefully it looks the same as the previous examples. We
|
||||
still use the global precision, we're still setting values, and we still want
|
||||
to evaluate a cell. The only real differences are the sheet name and the
|
||||
addition of the function.</p>
|
||||
</section>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
@ -0,0 +1,120 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Formula Support</title>
|
||||
<authors>
|
||||
<person email="avik@apache.org" name="Avik Sengupta" id="AS"/>
|
||||
</authors>
|
||||
</header>
|
||||
<body>
|
||||
<section><title>Introduction</title>
|
||||
<p>
|
||||
This document describes the current state of formula support in POI.
|
||||
The information in this document currently applies to the 3.13 version of POI.
|
||||
Since this area is a work in progress, this document will be updated with new
|
||||
features as and when they are added.
|
||||
</p>
|
||||
|
||||
</section>
|
||||
<section><title>The basics</title>
|
||||
<p>
|
||||
In org.apache.poi.ss.usermodel.Cell
|
||||
<strong> setCellFormula("formulaString") </strong> is used to add a
|
||||
formula to a sheet, and <strong> getCellFormula() </strong> is used to retrieve
|
||||
the string representation of a formula.
|
||||
</p>
|
||||
<p>
|
||||
We aim to support the complete Excel grammar for formulas. Thus, the string that
|
||||
you pass in to the <em> setCellFormula </em> call should be what you expect to
|
||||
type into Excel. Also, note that you should NOT add a "=" to the front of the string.
|
||||
</p>
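<p>If formula text arrives the way a user would type it into Excel, the leading "=" has to be
removed before the string is handed to setCellFormula. A minimal sketch of that step, using
only the JDK, might look like this; the class FormulaTextSketch and its method are hypothetical
helpers, not part of POI.</p>
<source><![CDATA[
```java
/** Hypothetical helper, not part of POI. */
final class FormulaTextSketch {

    /** Drops a leading '=' (and surrounding whitespace) from Excel-style formula text. */
    static String stripLeadingEquals(String typed) {
        String s = typed.trim();
        return s.startsWith("=") ? s.substring(1).trim() : s;
    }

    public static void main(String[] args) {
        // The result is the form expected by setCellFormula
        System.out.println(stripLeadingEquals("=SUM(A1:A3)"));
    }
}
```
]]></source>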
|
||||
<p>
|
||||
Please note that localized versions of Excel allow you to enter localized
|
||||
function names. Internally, however, Excel stores the English names, and thus POI
|
||||
only supports these and not the localized ones. Also note that only commas may be
|
||||
used to separate arguments, as per the Excel English style; alternate delimiters
|
||||
used in other localizations are not supported.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Supported Features</title>
|
||||
<ul>
|
||||
<li>References: single cell &amp; area, 2D &amp; 3D, relative &amp; absolute</li>
|
||||
<li>Literals: number, text, boolean, error and array</li>
|
||||
<li>Operators: arithmetic and logical, some region operators</li>
|
||||
<li>Built-in functions: over 350 recognised, 280 evaluatable</li>
|
||||
<li>Add-in functions: 24 from Analysis ToolPak</li>
|
||||
<li>Array Formulas: via Sheet.setArrayFormula() and Sheet.removeArrayFormula()</li>
|
||||
<li>Region operators: union, intersection</li>
|
||||
</ul>
|
||||
</section>
|
||||
<section><title>Not yet supported</title>
|
||||
<ul>
|
||||
<li>Manipulating table formulas (In Excel, formulas that look like "{=...}" as opposed to "=...")</li>
|
||||
<li>Parsing of previously uncalled add-in functions</li>
|
||||
<li>Preservation of whitespace in formulas (when POI manipulates them)</li>
|
||||
</ul>
|
||||
</section>
|
||||
|
||||
<section><title>Supported Functions</title>
|
||||
<p>To get the list of formula functions that POI supports, you need to
|
||||
call some code!</p>
|
||||
<p>The methods you need are available on
|
||||
<a href="../../apidocs/dev/org/apache/poi/ss/formula/eval/FunctionEval.html">org.apache.poi.ss.formula.eval.FunctionEval</a>.
|
||||
To find which functions your copy of Apache POI supports, use
|
||||
<a href="../../apidocs/dev/org/apache/poi/ss/formula/eval/FunctionEval.html#getSupportedFunctionNames()">getSupportedFunctionNames()</a>
|
||||
to get a list of the implemented function names. For the list of functions that
|
||||
POI knows the name of, but doesn't currently implement, use
|
||||
<a href="../../apidocs/dev/org/apache/poi/ss/formula/eval/FunctionEval.html#getNotSupportedFunctionNames()">getNotSupportedFunctionNames()</a>.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>Internals</title>
|
||||
<p>
|
||||
Formulas in Excel are stored as sequences of tokens in Reverse Polish Notation order. The
|
||||
<a href="https://sc.openoffice.org/excelfileformat.pdf">open office XLS spec</a> is the best
|
||||
documentation you will find for the format.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
The tokens used by Excel are modeled as individual *Ptg classes in the <strong>
|
||||
org.apache.poi.ss.formula.ptg</strong> package.
|
||||
</p>
|
||||
<p>
|
||||
The task of parsing a formula string into an array of RPN ordered tokens is done by the <strong>
|
||||
org.apache.poi.ss.formula.FormulaParser</strong> class. This class implements a hand
|
||||
written recursive descent parser.
|
||||
</p>
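<p>To make the idea concrete, here is a toy recursive descent parser that turns an infix
arithmetic expression into a list of tokens in RPN order. It is only a sketch of the technique:
the grammar covers just +, -, *, /, parentheses and simple operands, the class name RpnSketch
is hypothetical, and none of this is taken from POI's FormulaParser, which handles the full
formula grammar and produces Ptg objects rather than strings.</p>
<source><![CDATA[
```java
import java.util.ArrayList;
import java.util.List;

/** Toy recursive descent parser: infix arithmetic in, RPN token list out. */
final class RpnSketch {
    private final String src;
    private int pos = 0;
    private final List<String> out = new ArrayList<>();

    private RpnSketch(String src) {
        this.src = src.replace(" ", "");
    }

    static List<String> toRpn(String formula) {
        RpnSketch p = new RpnSketch(formula);
        p.expr();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("Unexpected input at " + p.pos);
        }
        return p.out;
    }

    // expr := term (('+' | '-') term)*
    private void expr() {
        term();
        while (peek() == '+' || peek() == '-') {
            char op = src.charAt(pos++);
            term();
            out.add(String.valueOf(op)); // operator is emitted after both operands
        }
    }

    // term := factor (('*' | '/') factor)*
    private void term() {
        factor();
        while (peek() == '*' || peek() == '/') {
            char op = src.charAt(pos++);
            factor();
            out.add(String.valueOf(op));
        }
    }

    // factor := '(' expr ')' | operand
    private void factor() {
        if (peek() == '(') {
            pos++;
            expr();
            if (peek() != ')') {
                throw new IllegalArgumentException("Missing ')' at " + pos);
            }
            pos++;
            return;
        }
        int start = pos;
        while (pos < src.length() && isOperandChar(src.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Operand expected at " + pos);
        }
        out.add(src.substring(start, pos)); // number or cell reference, emitted as-is
    }

    private static boolean isOperandChar(char c) {
        return Character.isLetterOrDigit(c) || c == '.' || c == '$';
    }

    private char peek() {
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    public static void main(String[] args) {
        System.out.println(toRpn("A1+B2*3"));   // [A1, B2, 3, *, +]
        System.out.println(toRpn("(A1+B2)*3")); // [A1, B2, +, 3, *]
    }
}
```
]]></source>
<p>Operator precedence falls out of the grammar: because term() is called from expr(), the
multiplication tokens are emitted before the surrounding addition.</p>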
|
||||
<p>
|
||||
Formula tokens in Excel are stored in one of three possible <em> operand classes </em>:
|
||||
Reference, Value and Array. Based on the location of a token, its class can change
|
||||
in complicated and undocumented ways. While we have support for most cases, we
|
||||
are not sure if we have covered all bases (since there is no documentation for this area.)
|
||||
We would therefore like you to report any
|
||||
occurrence of #VALUE! in a cell upon opening a POI generated workbook in Excel. (Check that
|
||||
typing the formula into Excel directly gives a valid result.)
|
||||
</p>
|
||||
<p>Check out the <a href="site:javadocs">javadocs </a> for details.
|
||||
</p>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
@ -0,0 +1,89 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Hacking HSSF</title>
|
||||
<authors>
|
||||
<person email="user@poi.apache.org" name="Glen Stampoultzis" id="GJS"/>
|
||||
<person email="acoliver@apache.org" name="Andrew Oliver" id="AO"/>
|
||||
</authors>
|
||||
</header>
|
||||
<body>
|
||||
<section><title>Where Can I Find Documentation on Feature X</title>
|
||||
<p>
|
||||
You might find the
|
||||
'Excel 97 Developer's Kit' (out of print, Microsoft Press, no
|
||||
restrictive covenants, available on Amazon.com) helpful for
|
||||
understanding the file format.
|
||||
</p>
|
||||
<p>
|
||||
Also useful is the <a href="https://sc.openoffice.org/excelfileformat.pdf">open office XLS spec</a>. We
|
||||
are collaborating with the maintainer of the spec so if you think you can add something to their
|
||||
document just send through your changes.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Help, I Can't Find Feature X Documented Anywhere</title>
|
||||
<ol>
|
||||
<li>
|
||||
Look at the OpenOffice.org or Gnumeric sources to see whether it is implemented there.
|
||||
</li>
|
||||
<li>
|
||||
Use org.apache.poi.hssf.dev.BiffViewer to view the structure of the
|
||||
file. Experiment by adding one criteria entry at a time. See what it
|
||||
does to the structure, infer behavior and structure from it. Using the
|
||||
Unix diff command (or get Cygwin from www.cygwin.com for Windows) you
|
||||
can figure out a lot very quickly. Unimplemented records show up as
|
||||
'UNKNOWN' and are printed as a hex dump.
|
||||
</li>
|
||||
</ol>
|
||||
</section>
|
||||
<section><title>Low-level Record Generation</title>
|
||||
<p>
|
||||
Low level records can be time consuming to create. We created a record
|
||||
generator to help with some of the simpler tasks.
|
||||
</p>
|
||||
<p>
|
||||
We use XML
|
||||
descriptors to generate the Java code (which sure beats the heck out of
|
||||
the Perl scripts originally used ;-) for low level records. The
|
||||
generator is kinda alpha-ish right now and could use some enhancement,
|
||||
so you may find that to be about 1/2 of the work. Notice this is in
|
||||
org.apache.poi.hssf.record.definitions.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Important Notice</title>
|
||||
<p>One thing to note: If you are making a large code contribution we need to ensure
|
||||
any participants in this process have never
|
||||
signed a "Non Disclosure Agreement" with Microsoft, and have not
|
||||
received any information covered by such an agreement. If they have
|
||||
they'll not be able to participate in the POI project. For large contributions we
|
||||
may ask you to sign an agreement.</p>
|
||||
</section>
|
||||
<section><title>What Can I Work On?</title>
|
||||
<p>Ask in the dev mailing list for advice.</p>
|
||||
</section>
|
||||
<section><title>What Else Should I Know?</title>
|
||||
<p>Make sure you <a href="site:guidelines">read the contributing section</a>
|
||||
as it contains more general information about contributing to POI in general.</p>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
@ -0,0 +1,884 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>The New Halloween Document</title>
|
||||
<authors>
|
||||
<person email="acoliver2@users.sourceforge.net" name="Andrew C. Oliver" id="AO"/>
|
||||
<person email="user@poi.apache.org" name="Glen Stampoultzis" id="GJS"/>
|
||||
<person email="nick@apache.org" name="Nick Burch" id="NB"/>
|
||||
<person email="sergeikozello@mail.ru" name="Sergei Kozello" id="SK"/>
|
||||
</authors>
|
||||
</header>
|
||||
<body>
|
||||
<section><title>How to use the HSSF API</title>
|
||||
|
||||
<section><title>Capabilities</title>
|
||||
<p>This release of the how-to outlines functionality for the
|
||||
current svn trunk.
|
||||
Those looking for information on previous releases should
|
||||
look in the documentation distributed with that release.</p>
|
||||
<p>
|
||||
HSSF allows numeric, string, date or formula cell values to be written to
|
||||
or read from an XLS file. Also
|
||||
in this release is row and column sizing, cell styling (bold,
|
||||
italics, borders, etc.), and support for both built-in and user
|
||||
defined data formats. Also available is
|
||||
an event-based API for reading XLS files.
|
||||
It differs greatly from the read/write API
|
||||
and is intended for intermediate developers who need a smaller
|
||||
memory footprint.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Different APIs</title>
|
||||
<p>There are a few different ways to access the HSSF API. These
|
||||
have different characteristics, so you should read up on
|
||||
all to select the best for you.</p>
|
||||
<ul>
|
||||
<li><a href="#user_api">User API (HSSF and XSSF)</a></li>
|
||||
<li><a href="#event_api">Event API (HSSF Only)</a></li>
|
||||
<li><a href="#record_aware_event_api">Event API with extensions to be Record Aware (HSSF Only)</a></li>
|
||||
<li><a href="#xssf_sax_api">XSSF and SAX (Event API)</a></li>
|
||||
<li><a href="#sxssf">SXSSF (Streaming User API)</a></li>
|
||||
<li><a href="#low_level_api">Low Level API</a></li>
|
||||
</ul>
|
||||
</section>
|
||||
</section>
|
||||
<section><title>General Use</title>
|
||||
<anchor id="user_api" />
|
||||
<section><title>User API (HSSF and XSSF)</title>
|
||||
<section><title>Writing a new file</title>
|
||||
|
||||
<p>The high level API (package: org.apache.poi.ss.usermodel)
|
||||
is what most people should use. Usage is very simple.
|
||||
</p>
|
||||
<p>Workbooks are created by creating an instance of
|
||||
org.apache.poi.ss.usermodel.Workbook. Either create
|
||||
a concrete class directly
|
||||
(org.apache.poi.hssf.usermodel.HSSFWorkbook or
|
||||
org.apache.poi.xssf.usermodel.XSSFWorkbook), or use
|
||||
the handy factory class
|
||||
org.apache.poi.ss.usermodel.WorkbookFactory.
|
||||
</p>
|
||||
<p>Sheets are created by calling createSheet() from an existing
|
||||
instance of Workbook; the created sheet is automatically added in
|
||||
sequence to the workbook. Sheets do not in themselves have a sheet
|
||||
name (the tab at the bottom); you set
|
||||
the name associated with a sheet by calling
|
||||
Workbook.setSheetName(sheetindex, "SheetName").
Very old HSSF releases also took an encoding argument
(HSSFWorkbook.ENCODING_COMPRESSED_UNICODE for 8 bits per character, the default,
or HSSFWorkbook.ENCODING_UTF_16 for Unicode). In current releases
both HSSF and XSSF handle the name as Unicode automatically.
|
||||
</p>
|
||||
<p>Rows are created by calling createRow(rowNumber) from an existing
|
||||
instance of Sheet. Only rows that have cell values should be
|
||||
added to the sheet. To set the row's height, you just call
|
||||
setRowHeight(height) on the row object. The height must be given in
|
||||
twips, or 1/20th of a point. If you prefer, there is also a
|
||||
setRowHeightInPoints method.
|
||||
</p>
|
||||
<p>Cells are created by calling createCell(column, type) from an
|
||||
existing Row. Only cells that have values should be added to the
|
||||
row. Cells should have their cell type set to either
|
||||
CellType.NUMERIC or CellType.STRING depending on
|
||||
whether they contain a numeric or textual value. Cells must also have
|
||||
a value set. Set the value by calling setCellValue with either a
|
||||
String or double as a parameter. Individual cells do not have a
|
||||
width; you must call setColumnWidth(colindex, width) (use units of
|
||||
1/256th of a character) on the Sheet object. (You can't do it on
|
||||
an individual basis in the GUI either).</p>
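<p>The two raw units mentioned above (twips for row heights, 1/256th of a
character for column widths) are easy to mix up. The following is a minimal
sketch of a conversion helper; SheetUnits and its methods are hypothetical
names introduced here for illustration and are not part of POI.</p>

```java
// Illustrative helper, not part of POI: converts human-friendly sizes
// into the raw units described above.
final class SheetUnits {
    private SheetUnits() {}

    /** Row heights are expressed in twips: 1 point = 20 twips. */
    static short pointsToTwips(double points) {
        return (short) Math.round(points * 20);
    }

    /** Column widths are expressed in 1/256th of a character width. */
    static int charsToColumnWidth(double characters) {
        return (int) Math.round(characters * 256);
    }

    public static void main(String[] args) {
        System.out.println(pointsToTwips(12.75));   // 255
        System.out.println(charsToColumnWidth(20)); // 5120
    }
}
```

<p>With such a helper, a row height of 29.25 points corresponds to the raw
value 0x249, and a width of 31.25 characters corresponds to 8000.</p>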
|
||||
<p>Cells are styled with CellStyle objects, which in turn contain
a reference to a Font object. These are created via the
Workbook object by calling createCellStyle() and createFont().
Once you create the object you must set its parameters (colors,
borders, etc). To set a font for a CellStyle call
setFont(fontobj).
</p>
|
||||
<p>Once you have generated your workbook, you can write it out by
|
||||
calling write(outputStream) from your instance of Workbook, passing
|
||||
it an OutputStream (for instance, a FileOutputStream or
|
||||
ServletOutputStream). You must close the OutputStream yourself. HSSF
|
||||
does not close it for you.
|
||||
</p>
|
||||
<p>Here is some example code (excerpted and adapted from
|
||||
org.apache.poi.hssf.dev.HSSF test class):</p>
|
||||
<source><![CDATA[
|
||||
|
||||
|
||||
// create a new file
|
||||
FileOutputStream out = new FileOutputStream("workbook.xls");
|
||||
// create a new workbook
|
||||
Workbook wb = new HSSFWorkbook();
|
||||
// create a new sheet
|
||||
Sheet s = wb.createSheet();
|
||||
// declare a row object reference
|
||||
Row r = null;
|
||||
// declare a cell object reference
|
||||
Cell c = null;
|
||||
// create 3 cell styles
|
||||
CellStyle cs = wb.createCellStyle();
|
||||
CellStyle cs2 = wb.createCellStyle();
|
||||
CellStyle cs3 = wb.createCellStyle();
|
||||
DataFormat df = wb.createDataFormat();
|
||||
// create 2 fonts objects
|
||||
Font f = wb.createFont();
|
||||
Font f2 = wb.createFont();
|
||||
|
||||
//set font 1 to 12 point type
|
||||
f.setFontHeightInPoints((short) 12);
|
||||
//make it blue
|
||||
f.setColor( (short)0xc );
|
||||
// make it bold (Arial is the default font)
f.setBold(true);
|
||||
|
||||
//set font 2 to 10 point type
|
||||
f2.setFontHeightInPoints((short) 10);
|
||||
//make it red
|
||||
f2.setColor( (short)Font.COLOR_RED );
|
||||
//make it bold
f2.setBold(true);
|
||||
|
||||
f2.setStrikeout( true );
|
||||
|
||||
//set cell style
|
||||
cs.setFont(f);
|
||||
//set the cell format
|
||||
cs.setDataFormat(df.getFormat("#,##0.0"));
|
||||
|
||||
//set a thin border
|
||||
cs2.setBorderBottom(BorderStyle.THIN);
|
||||
//fill w fg fill color
|
||||
cs2.setFillPattern(FillPatternType.SOLID_FOREGROUND);
|
||||
//set the cell format to text see DataFormat for a full list
|
||||
cs2.setDataFormat(HSSFDataFormat.getBuiltinFormat("text"));
|
||||
|
||||
// set the font
|
||||
cs2.setFont(f2);
|
||||
|
||||
// set the sheet name in Unicode
|
||||
wb.setSheetName(0, "\u0422\u0435\u0441\u0442\u043E\u0432\u0430\u044F " +
|
||||
"\u0421\u0442\u0440\u0430\u043D\u0438\u0447\u043A\u0430" );
|
||||
// in case of plain ascii
|
||||
// wb.setSheetName(0, "HSSF Test");
|
||||
// create a sheet with 30 rows (0-29)
|
||||
int rownum;
|
||||
for (rownum = (short) 0; rownum < 30; rownum++)
|
||||
{
|
||||
// create a row
|
||||
r = s.createRow(rownum);
|
||||
// on every other row
|
||||
if ((rownum % 2) == 0)
|
||||
{
|
||||
// make the row height bigger (in twips - 1/20 of a point)
|
||||
r.setHeight((short) 0x249);
|
||||
}
|
||||
|
||||
//r.setRowNum(( short ) rownum);
|
||||
// create 10 cells (0-9) (the += 2 becomes apparent later)
|
||||
for (short cellnum = (short) 0; cellnum < 10; cellnum += 2)
|
||||
{
|
||||
// create a numeric cell
|
||||
c = r.createCell(cellnum);
|
||||
// do some goofy math to demonstrate decimals
|
||||
c.setCellValue(rownum * 10000 + cellnum
|
||||
+ (((double) rownum / 1000)
|
||||
+ ((double) cellnum / 10000)));
|
||||
|
||||
String cellValue;
|
||||
|
||||
// create a string cell in the next column (this is why the loop uses += 2)
|
||||
c = r.createCell((short) (cellnum + 1));
|
||||
|
||||
// on every other row
|
||||
if ((rownum % 2) == 0)
|
||||
{
|
||||
// set this cell to the first cell style we defined
|
||||
c.setCellStyle(cs);
|
||||
// set the cell's string value to "Test"
|
||||
c.setCellValue( "Test" );
|
||||
}
|
||||
else
|
||||
{
|
||||
c.setCellStyle(cs2);
|
||||
// set the cell's string value to "\u0422\u0435\u0441\u0442"
|
||||
c.setCellValue( "\u0422\u0435\u0441\u0442" );
|
||||
}
|
||||
|
||||
|
||||
// make this column a bit wider
|
||||
s.setColumnWidth((short) (cellnum + 1), (short) ((50 * 8) / ((double) 1 / 20)));
|
||||
}
|
||||
}
|
||||
|
||||
//draw a thick black border on the row at the bottom using BLANKS
|
||||
// advance 2 rows
|
||||
rownum++;
|
||||
rownum++;
|
||||
|
||||
r = s.createRow(rownum);
|
||||
|
||||
// define the third style to be the default
|
||||
// except with a thick black border at the bottom
|
||||
cs3.setBorderBottom(BorderStyle.THICK);
|
||||
|
||||
//create 50 cells
|
||||
for (short cellnum = (short) 0; cellnum < 50; cellnum++)
|
||||
{
|
||||
//create a blank type cell (no value)
|
||||
c = r.createCell(cellnum);
|
||||
// set it to the thick black border style
|
||||
c.setCellStyle(cs3);
|
||||
}
|
||||
|
||||
//end draw thick black border
|
||||
|
||||
|
||||
// demonstrate adding/naming and deleting a sheet
|
||||
// create a sheet, set its title then delete it
|
||||
s = wb.createSheet();
|
||||
wb.setSheetName(1, "DeletedSheet");
|
||||
wb.removeSheetAt(1);
|
||||
//end deleted sheet
|
||||
|
||||
// write the workbook to the output stream
wb.write(out);
// close our file (don't blow out our file handles)
out.close();
|
||||
]]></source>
|
||||
</section>
|
||||
<section><title>Reading or modifying an existing file</title>
|
||||
|
||||
<p>Reading in a file is equally simple. To read in a file, create a
new instance of org.apache.poi.poifs.filesystem.POIFSFileSystem, passing in an open InputStream, such as a FileInputStream
for your XLS, to the constructor. Construct a new instance of
org.apache.poi.hssf.usermodel.HSSFWorkbook passing the
POIFSFileSystem instance to the constructor. From there you have access to
all of the high level model objects through their accessor methods
(workbook.getSheetAt(sheetNum), sheet.getRow(rownum), etc).
</p>
|
||||
<p>Modifying the file you have read in is simple. You retrieve the
object via an accessor method, remove it via a parent object's remove
method (sheet.removeRow(hssfrow)) and create objects just as you
would if creating a new xls. When you are done modifying cells,
call workbook.write(outputstream) just as you did above.</p>
|
||||
<p>An example of this can be seen in
|
||||
<a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/usermodel/HSSFReadWrite.java">org.apache.poi.hssf.usermodel.examples.HSSFReadWrite</a>.</p>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
<anchor id="event_api" />
|
||||
<section><title>Event API (HSSF Only)</title>
|
||||
|
||||
<p>The event API is newer than the User API. It is intended for intermediate
developers who are willing to learn a little bit of the low level API
structures. It's relatively simple to use, but requires a basic
understanding of the parts of an Excel file (or willingness to
learn). The advantage provided is that you can read an XLS with a
relatively small memory footprint.
</p>
|
||||
<p>One important thing to note with the basic Event API is that it
|
||||
triggers events only for things actually stored within the file.
|
||||
With the XLS file format, it is quite common for things that
|
||||
have yet to be edited to simply not exist in the file. This means
|
||||
there may well be apparent "gaps" in the record stream, which
|
||||
you either need to work around, or use the
|
||||
<a href="#record_aware_event_api">Record Aware</a> extension
|
||||
to the Event API.</p>
|
||||
<p>To use this API you construct an instance of
org.apache.poi.hssf.eventusermodel.HSSFRequest. Register a class you
create that implements the
org.apache.poi.hssf.eventusermodel.HSSFListener interface using
HSSFRequest.addListener(yourlistener, recordsid). The recordsid
should be a static reference number (such as BOFRecord.sid) contained
in the classes in org.apache.poi.hssf.record. The trick is you
have to know what these records are. Alternatively you can call
HSSFRequest.addListenerForAllRecords(mylistener). In order to learn
about these records you can either read all of the javadoc in the
org.apache.poi.hssf.record package or you can just hack up a
copy of org.apache.poi.hssf.dev.EFHSSF and adapt it to your
needs. TODO: better documentation on records.</p>
|
||||
<p>Once you've registered your listeners in the HSSFRequest object
you can construct an instance of
org.apache.poi.poifs.filesystem.POIFSFileSystem (see POIFS howto) and
pass it your XLS file inputstream. You can either pass this, along
with the request you constructed, to an instance of HSSFEventFactory
via the HSSFEventFactory.processWorkbookEvents(request, POIFSFileSystem)
method, or you can get an instance of DocumentInputStream from
POIFSFileSystem.createDocumentInputStream("Workbook") and pass
it to HSSFEventFactory.processEvents(request, inputStream). Once you
make this call, the listeners that you constructed receive calls to
their processRecord(Record) methods with each Record they are
registered to listen for until the file has been completely read.
</p>
|
||||
<p>A code excerpt from org.apache.poi.hssf.dev.EFHSSF (which is
in Subversion or the source distribution) is reprinted below with excessive
comments:</p>
|
||||
<source><![CDATA[
|
||||
/**
|
||||
* This example shows how to use the event API for reading a file.
|
||||
*/
|
||||
public class EventExample
|
||||
implements HSSFListener
|
||||
{
|
||||
private SSTRecord sstrec;
|
||||
|
||||
/**
|
||||
* This method listens for incoming records and handles them as required.
|
||||
* @param record The record that was found while reading.
|
||||
*/
|
||||
public void processRecord(Record record)
|
||||
{
|
||||
switch (record.getSid())
|
||||
{
|
||||
// the BOFRecord can represent either the beginning of a sheet or the workbook
|
||||
case BOFRecord.sid:
|
||||
BOFRecord bof = (BOFRecord) record;
|
||||
if (bof.getType() == bof.TYPE_WORKBOOK)
|
||||
{
|
||||
System.out.println("Encountered workbook");
|
||||
// assigned to the class level member
|
||||
} else if (bof.getType() == bof.TYPE_WORKSHEET)
|
||||
{
|
||||
System.out.println("Encountered sheet reference");
|
||||
}
|
||||
break;
|
||||
case BoundSheetRecord.sid:
|
||||
BoundSheetRecord bsr = (BoundSheetRecord) record;
|
||||
System.out.println("New sheet named: " + bsr.getSheetname());
|
||||
break;
|
||||
case RowRecord.sid:
|
||||
RowRecord rowrec = (RowRecord) record;
|
||||
System.out.println("Row found, first column at "
|
||||
+ rowrec.getFirstCol() + " last column at " + rowrec.getLastCol());
|
||||
break;
|
||||
case NumberRecord.sid:
|
||||
NumberRecord numrec = (NumberRecord) record;
|
||||
System.out.println("Cell found with value " + numrec.getValue()
|
||||
+ " at row " + numrec.getRow() + " and column " + numrec.getColumn());
|
||||
break;
|
||||
// SSTRecords store an array of unique strings used in Excel.
|
||||
case SSTRecord.sid:
|
||||
sstrec = (SSTRecord) record;
|
||||
for (int k = 0; k < sstrec.getNumUniqueStrings(); k++)
|
||||
{
|
||||
System.out.println("String table value " + k + " = " + sstrec.getString(k));
|
||||
}
|
||||
break;
|
||||
case LabelSSTRecord.sid:
|
||||
LabelSSTRecord lrec = (LabelSSTRecord) record;
|
||||
System.out.println("String cell found with value "
|
||||
+ sstrec.getString(lrec.getSSTIndex()));
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Read an excel file and spit out what we find.
|
||||
*
|
||||
* @param args Expect one argument that is the file to read.
|
||||
* @throws IOException When there is an error processing the file.
|
||||
*/
|
||||
public static void main(String[] args) throws IOException
|
||||
{
|
||||
// create a new file input stream with the input file specified
|
||||
// at the command line
|
||||
FileInputStream fin = new FileInputStream(args[0]);
|
||||
// create a new org.apache.poi.poifs.filesystem.POIFSFileSystem
POIFSFileSystem poifs = new POIFSFileSystem(fin);
// get the Workbook (excel part) stream in an InputStream
InputStream din = poifs.createDocumentInputStream("Workbook");
// construct our HSSFRequest object
|
||||
HSSFRequest req = new HSSFRequest();
|
||||
// lazy listen for ALL records with the listener shown above
|
||||
req.addListenerForAllRecords(new EventExample());
|
||||
// create our event factory
|
||||
HSSFEventFactory factory = new HSSFEventFactory();
|
||||
// process our events based on the document input stream
|
||||
factory.processEvents(req, din);
|
||||
// once all the events are processed close our file input stream
|
||||
fin.close();
|
||||
// and our document input stream (don't want to leak these!)
|
||||
din.close();
|
||||
System.out.println("done.");
|
||||
}
|
||||
}
|
||||
]]></source>
|
||||
</section>
|
||||
|
||||
<anchor id="record_aware_event_api" />
|
||||
<section><title>Record Aware Event API (HSSF Only)</title>
|
||||
<p>
|
||||
This is an extension to the normal
|
||||
<a href="#event_api">Event API</a>. With this, your listener
|
||||
will be called with extra, dummy records. These dummy records should
|
||||
alert you to records which aren't present in the file (eg cells that have
|
||||
yet to be edited), and allow you to handle these.
|
||||
</p>
|
||||
<p>
|
||||
There are three dummy records that your HSSFListener will be called with:
|
||||
</p>
|
||||
<ul>
|
||||
<li>org.apache.poi.hssf.eventusermodel.dummyrecord.MissingRowDummyRecord
|
||||
<br />
|
||||
This is called during the row record phase (which typically occurs before
|
||||
the cell records), and indicates that the row record for the given
|
||||
row is not present in the file.</li>
|
||||
<li>org.apache.poi.hssf.eventusermodel.dummyrecord.MissingCellDummyRecord
|
||||
<br />
|
||||
This is called during the cell record phase. It is called when a cell
record is encountered which leaves a gap between it and the previous one.
You can get several of these before the real cell record.</li>
|
||||
<li>org.apache.poi.hssf.eventusermodel.dummyrecord.LastCellOfRowDummyRecord
|
||||
<br />
|
||||
This is called after the last cell of a given row. It indicates that there
|
||||
are no more cells for the row, and also tells you how many cells you have
|
||||
had. For a row with no cells, this will be the only record you get.</li>
|
||||
</ul>
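<p>To make the ordering of these dummy records concrete, here is a small
stand-alone simulation of what a listener would observe for one row. It is
only a sketch of the behaviour described in the list above: GapFillSketch and
the event labels are made up for illustration, and nothing here is POI code.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Stand-alone simulation; none of these names are POI classes.
final class GapFillSketch {
    private GapFillSketch() {}

    /**
     * Given the ascending column indexes of the cells actually stored
     * for one row, returns the event sequence a gap-aware listener
     * would observe for that row.
     */
    static List<String> eventsForRow(int[] storedColumns) {
        List<String> events = new ArrayList<>();
        int expected = 0; // next column we expect to see
        for (int col : storedColumns) {
            // one "missing" event per skipped column
            for (; expected < col; expected++) {
                events.add("MISSING_CELL(" + expected + ")");
            }
            events.add("CELL(" + col + ")");
            expected = col + 1;
        }
        // always emitted, even for a row without any cells
        events.add("LAST_CELL_OF_ROW(" + (expected - 1) + ")");
        return events;
    }

    public static void main(String[] args) {
        System.out.println(eventsForRow(new int[] {0, 3}));
    }
}
```

<p>For a row that stores only columns 0 and 3, the simulation reports the
real cell 0, two missing cells, the real cell 3, and then the end-of-row
marker.</p>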
|
||||
<p>
|
||||
To use the Record Aware Event API, you should create an
|
||||
org.apache.poi.hssf.eventusermodel.MissingRecordAwareHSSFListener, and pass
|
||||
it your HSSFListener. Then, register the MissingRecordAwareHSSFListener
|
||||
to the event model, and start that as normal.
|
||||
</p>
|
||||
<p>
|
||||
One example use for this API is to write a CSV outputter, which always
|
||||
outputs a minimum number of columns, even where the file doesn't contain
|
||||
some of the rows or cells. It can be found at
|
||||
<code>/poi-examples/src/main/java/org/apache/poi/examples/hssf/eventusermodel/XLS2CSVmra.java</code>,
|
||||
and may be called on the command line, or from within your own code.
|
||||
The latest version is always available from
|
||||
<a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/hssf/eventusermodel/">subversion</a>.
|
||||
</p>
|
||||
<p>
|
||||
<em>In POI versions before 3.0.3, this code lived in the scratchpad section.
|
||||
If you're using one of these older versions of POI, you will either
|
||||
need to include the scratchpad jar on your classpath, or build from a</em>
|
||||
<a href="site:subversion">subversion checkout</a>.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<anchor id="xssf_sax_api"/>
|
||||
<section><title>XSSF and SAX (Event API)</title>
|
||||
|
||||
<p>If memory footprint is an issue, then for XSSF, you can get at
the underlying XML data, and process it yourself. This is intended
for intermediate developers who are willing to learn a little bit of
low level structure of .xlsx files, and who are happy processing
XML in Java. It's relatively simple to use, but requires a basic
understanding of the file structure. The advantage provided is that
you can read an XLSX file with a relatively small memory footprint.
</p>
|
||||
<p>One important thing to note with the basic Event API is that it
|
||||
triggers events only for things actually stored within the file.
|
||||
With the XLSX file format, it is quite common for things that
|
||||
have yet to be edited to simply not exist in the file. This means
|
||||
there may well be apparent "gaps" in the record stream, which
|
||||
you need to work around.</p>
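<p>One way to work around such gaps is to compare the cell references
(the "r" attribute of consecutive cell elements) and count how many columns
were skipped. The sketch below assumes plain upper-case A1-style references
without "$" signs; CellRefGaps and its methods are hypothetical names, not
part of POI.</p>

```java
// Stand-alone sketch; the class and method names are hypothetical.
final class CellRefGaps {
    private CellRefGaps() {}

    /** Converts the letters of an A1-style reference to a 0-based column index. */
    static int columnIndex(String cellRef) {
        int col = 0;
        for (int i = 0; i < cellRef.length(); i++) {
            char ch = cellRef.charAt(i);
            if (ch < 'A' || ch > 'Z') {
                break; // reached the row digits
            }
            col = col * 26 + (ch - 'A' + 1);
        }
        return col - 1;
    }

    /** Number of cells absent between two references on the same row. */
    static int missingBetween(String previousRef, String currentRef) {
        return columnIndex(currentRef) - columnIndex(previousRef) - 1;
    }

    public static void main(String[] args) {
        System.out.println(columnIndex("A1"));          // 0
        System.out.println(columnIndex("AA10"));        // 26
        System.out.println(missingBetween("B5", "E5")); // 2
    }
}
```

<p>A handler could call missingBetween for each new cell and emit that many
empty values before the real one, for instance when writing CSV.</p>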
|
||||
<p>To use this API you construct an instance of
org.apache.poi.xssf.eventusermodel.XSSFReader. This will optionally
provide a nice interface on the shared strings table, and the styles.
It provides methods to get the raw xml data from the rest of the
file, which you will then pass to SAX.</p>
|
||||
<p>This example shows how to get at a single known sheet, or at
|
||||
all sheets in the file. It is based on the example in
|
||||
<a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/eventusermodel/FromHowTo.java">svn
|
||||
poi-examples/src/main/java/org/apache/poi/examples/xssf/eventusermodel/FromHowTo.java</a></p>
|
||||
<source><![CDATA[
|
||||
import java.io.InputStream;
|
||||
import java.util.Iterator;
|
||||
|
||||
import org.apache.poi.util.XMLHelper;
|
||||
import org.apache.poi.openxml4j.opc.OPCPackage;
|
||||
import org.apache.poi.xssf.eventusermodel.XSSFReader;
|
||||
import org.apache.poi.xssf.model.SharedStringsTable;
|
||||
import org.xml.sax.Attributes;
|
||||
import org.xml.sax.ContentHandler;
|
||||
import org.xml.sax.InputSource;
|
||||
import org.xml.sax.SAXException;
|
||||
import org.xml.sax.XMLReader;
|
||||
import org.xml.sax.helpers.DefaultHandler;
|
||||
|
||||
import javax.xml.parsers.ParserConfigurationException;
|
||||
|
||||
public class ExampleEventUserModel {
|
||||
public void processOneSheet(String filename) throws Exception {
|
||||
OPCPackage pkg = OPCPackage.open(filename);
|
||||
XSSFReader r = new XSSFReader( pkg );
|
||||
SharedStringsTable sst = r.getSharedStringsTable();
|
||||
|
||||
XMLReader parser = fetchSheetParser(sst);
|
||||
|
||||
// To look up the Sheet Name / Sheet Order / rID,
|
||||
// you need to process the core Workbook stream.
|
||||
// Normally it's of the form rId# or rSheet#
|
||||
InputStream sheet2 = r.getSheet("rId2");
|
||||
InputSource sheetSource = new InputSource(sheet2);
|
||||
parser.parse(sheetSource);
|
||||
sheet2.close();
|
||||
}
|
||||
|
||||
public void processAllSheets(String filename) throws Exception {
|
||||
OPCPackage pkg = OPCPackage.open(filename);
|
||||
XSSFReader r = new XSSFReader( pkg );
|
||||
SharedStringsTable sst = r.getSharedStringsTable();
|
||||
|
||||
XMLReader parser = fetchSheetParser(sst);
|
||||
|
||||
Iterator<InputStream> sheets = r.getSheetsData();
|
||||
while(sheets.hasNext()) {
|
||||
System.out.println("Processing new sheet:\n");
|
||||
InputStream sheet = sheets.next();
|
||||
InputSource sheetSource = new InputSource(sheet);
|
||||
parser.parse(sheetSource);
|
||||
sheet.close();
|
||||
System.out.println("");
|
||||
}
|
||||
}
|
||||
|
||||
public XMLReader fetchSheetParser(SharedStringsTable sst) throws SAXException, ParserConfigurationException {
|
||||
XMLReader parser = XMLHelper.newXMLReader();
|
||||
ContentHandler handler = new SheetHandler(sst);
|
||||
parser.setContentHandler(handler);
|
||||
return parser;
|
||||
}
|
||||
|
||||
/**
|
||||
* See org.xml.sax.helpers.DefaultHandler javadocs
|
||||
*/
|
||||
private static class SheetHandler extends DefaultHandler {
|
||||
private SharedStringsTable sst;
|
||||
private String lastContents;
|
||||
private boolean nextIsString;
|
||||
|
||||
private SheetHandler(SharedStringsTable sst) {
|
||||
this.sst = sst;
|
||||
}
|
||||
|
||||
public void startElement(String uri, String localName, String name,
|
||||
Attributes attributes) throws SAXException {
|
||||
// c => cell
|
||||
if(name.equals("c")) {
|
||||
// Print the cell reference
|
||||
System.out.print(attributes.getValue("r") + " - ");
|
||||
// Figure out if the value is an index in the SST
|
||||
String cellType = attributes.getValue("t");
|
||||
if(cellType != null && cellType.equals("s")) {
|
||||
nextIsString = true;
|
||||
} else {
|
||||
nextIsString = false;
|
||||
}
|
||||
}
|
||||
// Clear contents cache
|
||||
lastContents = "";
|
||||
}
|
||||
|
||||
public void endElement(String uri, String localName, String name)
|
||||
throws SAXException {
|
||||
// Process the last contents as required.
|
||||
// Do now, as characters() may be called more than once
|
||||
if(nextIsString) {
|
||||
int idx = Integer.parseInt(lastContents);
|
||||
lastContents = sst.getItemAt(idx).getString();
|
||||
nextIsString = false;
|
||||
}
|
||||
|
||||
// v => contents of a cell
|
||||
// Output after we've seen the string contents
|
||||
if(name.equals("v")) {
|
||||
System.out.println(lastContents);
|
||||
}
|
||||
}
|
||||
|
||||
public void characters(char[] ch, int start, int length) {
|
||||
lastContents += new String(ch, start, length);
|
||||
}
|
||||
}
|
||||
|
||||
public static void main(String[] args) throws Exception {
|
||||
ExampleEventUserModel example = new ExampleEventUserModel();
|
||||
example.processOneSheet(args[0]);
|
||||
example.processAllSheets(args[0]);
|
||||
}
|
||||
}
|
||||
]]></source>
|
||||
<p>
|
||||
For a fuller example, including support for fetching number formatting
|
||||
information and applying it to numeric cells (eg to format dates or
|
||||
percentages), please see
|
||||
<a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/eventusermodel/XLSX2CSV.java">the XLSX2CSV example in svn</a>
|
||||
</p>
|
||||
<p>An example is also <a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/streaming/HybridStreaming.java">provided</a>
|
||||
showing how to combine the user API and the SAX API by doing a streaming parse
|
||||
of larger worksheets and a traditional user-model parse of the rest of a workbook.</p>
|
||||
</section>
|
||||
<anchor id="sxssf"/>
|
||||
<section><title>SXSSF (Streaming Usermodel API)</title>
|
||||
<p>
|
||||
SXSSF (package: org.apache.poi.xssf.streaming) is an API-compatible streaming extension of XSSF to be used when
|
||||
very large spreadsheets have to be produced, and heap space is limited.
|
||||
SXSSF achieves its low memory footprint by limiting access to the rows that
|
||||
are within a sliding window, while XSSF gives access to all rows in the
|
||||
document. Older rows that are no longer in the window become inaccessible,
|
||||
as they are written to the disk.
|
||||
</p>
|
||||
<p>
|
||||
You can specify the window size at workbook construction time via <em>new SXSSFWorkbook(int windowSize)</em>
|
||||
or you can set it per-sheet via <em>SXSSFSheet#setRandomAccessWindowSize(int windowSize)</em>
|
||||
</p>
|
||||
<p>
|
||||
When a new row is created via createRow() and the total number
|
||||
of unflushed records would exceed the specified window size, then the
|
||||
row with the lowest index value is flushed and cannot be accessed
|
||||
via getRow() anymore.
|
||||
</p>
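<p>The windowing rule can be pictured with a small stand-alone simulation.
This is only a sketch of the behaviour described above, not SXSSF's actual
implementation; RowWindowSketch is a hypothetical class and the "rows" are
plain strings.</p>

```java
import java.util.TreeMap;

// Stand-alone simulation of the windowing rule; not SXSSF's actual code.
final class RowWindowSketch {
    private final int windowSize;
    private final TreeMap<Integer, String> inMemory = new TreeMap<>();
    private int flushed = 0;

    RowWindowSketch(int windowSize) {
        this.windowSize = windowSize;
    }

    /** Adds a row; if the window overflows, the lowest-indexed row is flushed. */
    void createRow(int rowNum) {
        inMemory.put(rowNum, "row " + rowNum);
        if (windowSize >= 0 && inMemory.size() > windowSize) {
            inMemory.pollFirstEntry(); // a real implementation writes it to disk here
            flushed++;
        }
    }

    /** Returns null once a row has left the window. */
    String getRow(int rowNum) {
        return inMemory.get(rowNum);
    }

    int flushedCount() {
        return flushed;
    }

    public static void main(String[] args) {
        RowWindowSketch sheet = new RowWindowSketch(100);
        for (int i = 0; i < 1000; i++) {
            sheet.createRow(i);
        }
        System.out.println(sheet.getRow(899));    // null
        System.out.println(sheet.getRow(900));    // row 900
        System.out.println(sheet.flushedCount()); // 900
    }
}
```

<p>In this sketch a negative window size disables the automatic flushing,
so every row stays accessible.</p>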
|
||||
<p>
|
||||
The default window size is <em>100</em> and defined by SXSSFWorkbook.DEFAULT_WINDOW_SIZE.
|
||||
</p>
|
||||
<p>
|
||||
A windowSize of -1 indicates unlimited access. In this case all
|
||||
records that have not been flushed by a call to flushRows() are available
|
||||
for random access.
|
||||
</p>
|
||||
<p>
|
||||
Note that SXSSF allocates temporary files that you <strong>must</strong> always clean up explicitly, by calling the dispose method.
|
||||
</p>
|
||||
<p>
|
||||
SXSSFWorkbook defaults to using inline strings instead of a shared strings
table. This is very efficient, since no document content needs to be kept in
memory, but is also known to produce documents that are incompatible with
some clients. With shared strings enabled, all unique strings in the document
have to be kept in memory. Depending on your document content this could use
a lot more resources than with shared strings disabled.
|
||||
</p>
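<p>The memory cost of shared strings comes from the lookup structure itself.
The following sketch shows the general idea of such a table, assuming a simple
map from string to index; SharedStringsSketch is a hypothetical class and does
not reflect how POI stores its table internally.</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a shared strings table; not a POI class.
final class SharedStringsSketch {
    private final Map<String, Integer> indexByString = new HashMap<>();
    private final List<String> strings = new ArrayList<>();

    /** Returns the table index for a string, adding it on first use. */
    int indexOf(String value) {
        Integer existing = indexByString.get(value);
        if (existing != null) {
            return existing;
        }
        int index = strings.size();
        strings.add(value);
        indexByString.put(value, index);
        return index;
    }

    /** Every unique string stays referenced until the document is written. */
    int uniqueCount() {
        return strings.size();
    }

    public static void main(String[] args) {
        SharedStringsSketch table = new SharedStringsSketch();
        System.out.println(table.indexOf("Test"));  // 0
        System.out.println(table.indexOf("Other")); // 1
        System.out.println(table.indexOf("Test"));  // 0
        System.out.println(table.uniqueCount());    // 2
    }
}
```

<p>Repeated values cost almost nothing extra, but a document whose cells are
mostly unique text keeps all of that text in the table.</p>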
|
||||
<p>
|
||||
Please note that some features may still consume a large
amount of memory, e.g. merged regions,
hyperlinks, comments, ... are still only stored in memory and thus may require a lot of
memory if used extensively.
|
||||
</p>
|
||||
<p>
|
||||
Carefully review your memory budget and compatibility needs before deciding
|
||||
whether to enable shared strings or not.
|
||||
</p>
|
||||
<p> The example below writes a sheet with a window of 100 rows. When the row count reaches 101,
the row with rownum=0 is flushed to disk and removed from memory; when the row count reaches 102, the row with rownum=1 is flushed, etc.
|
||||
</p>
|
||||
|
||||
|
||||
<source><![CDATA[
|
||||
import java.io.FileOutputStream;

import junit.framework.Assert;
|
||||
import org.apache.poi.ss.usermodel.Cell;
|
||||
import org.apache.poi.ss.usermodel.Row;
|
||||
import org.apache.poi.ss.usermodel.Sheet;
|
||||
import org.apache.poi.ss.usermodel.Workbook;
|
||||
import org.apache.poi.ss.util.CellReference;
|
||||
import org.apache.poi.xssf.streaming.SXSSFWorkbook;
|
||||
|
||||
public static void main(String[] args) throws Throwable {
|
||||
SXSSFWorkbook wb = new SXSSFWorkbook(100); // keep 100 rows in memory, exceeding rows will be flushed to disk
|
||||
Sheet sh = wb.createSheet();
|
||||
for(int rownum = 0; rownum < 1000; rownum++){
|
||||
Row row = sh.createRow(rownum);
|
||||
for(int cellnum = 0; cellnum < 10; cellnum++){
|
||||
Cell cell = row.createCell(cellnum);
|
||||
String address = new CellReference(cell).formatAsString();
|
||||
cell.setCellValue(address);
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
// Rows with rownum < 900 are flushed and not accessible
|
||||
for(int rownum = 0; rownum < 900; rownum++){
|
||||
Assert.assertNull(sh.getRow(rownum));
|
||||
}
|
||||
|
||||
// the last 100 rows are still in memory
|
||||
for(int rownum = 900; rownum < 1000; rownum++){
|
||||
Assert.assertNotNull(sh.getRow(rownum));
|
||||
}
|
||||
|
||||
FileOutputStream out = new FileOutputStream("/temp/sxssf.xlsx");
|
||||
wb.write(out);
|
||||
out.close();
|
||||
|
||||
// dispose of temporary files backing this workbook on disk
|
||||
wb.dispose();
|
||||
}
|
||||
|
||||
|
||||
]]></source>
|
||||
<p>The next example turns off auto-flushing (windowSize=-1) and the code manually controls how portions of data are written to disk</p>
|
||||
<source><![CDATA[
|
||||
import java.io.FileOutputStream;

import org.apache.poi.ss.usermodel.Cell;
|
||||
import org.apache.poi.ss.usermodel.Row;
|
||||
import org.apache.poi.ss.usermodel.Sheet;
|
||||
import org.apache.poi.ss.usermodel.Workbook;
|
||||
import org.apache.poi.ss.util.CellReference;
|
||||
import org.apache.poi.xssf.streaming.SXSSFSheet;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;
|
||||
|
||||
public static void main(String[] args) throws Throwable {
|
||||
SXSSFWorkbook wb = new SXSSFWorkbook(-1); // turn off auto-flushing and accumulate all rows in memory
|
||||
Sheet sh = wb.createSheet();
|
||||
for(int rownum = 0; rownum < 1000; rownum++){
|
||||
Row row = sh.createRow(rownum);
|
||||
for(int cellnum = 0; cellnum < 10; cellnum++){
|
||||
Cell cell = row.createCell(cellnum);
|
||||
String address = new CellReference(cell).formatAsString();
|
||||
cell.setCellValue(address);
|
||||
}
|
||||
|
||||
// manually control how rows are flushed to disk
|
||||
if(rownum % 100 == 0) {
|
||||
((SXSSFSheet)sh).flushRows(100); // retain 100 last rows and flush all others
|
||||
|
||||
// ((SXSSFSheet)sh).flushRows() is a shortcut for ((SXSSFSheet)sh).flushRows(0),
|
||||
// this method flushes all rows
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
FileOutputStream out = new FileOutputStream("/temp/sxssf.xlsx");
|
||||
wb.write(out);
|
||||
out.close();
|
||||
|
||||
// dispose of temporary files backing this workbook on disk
|
||||
wb.dispose();
|
||||
}
|
||||
|
||||
|
||||
]]></source>
|
||||
<p>SXSSF flushes sheet data to temporary files (a temp file per sheet) and the size of these temporary files
can grow to a very large value. For example, for 20 MB of CSV data the size of the temp XML becomes more than a gigabyte.
If the size of the temp files is an issue, you can tell SXSSF to use gzip compression:
|
||||
</p>
|
||||
<source><![CDATA[
|
||||
SXSSFWorkbook wb = new SXSSFWorkbook();
|
||||
wb.setCompressTempFiles(true); // temp files will be gzipped
|
||||
|
||||
]]></source>
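<p>To get a feel for why gzip helps with this kind of data, the following
stand-alone sketch compresses some repetitive row markup using only the JDK.
The XML is made-up sample data that merely resembles sheet markup, and
GzipRatioSketch is a hypothetical class; actual savings depend on your
content.</p>

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Stand-alone sketch using only the JDK; the XML below is made-up sample data.
final class GzipRatioSketch {
    private GzipRatioSketch() {}

    /** Builds repetitive row markup resembling sheet XML. */
    static byte[] sampleSheetXml(int rows) {
        StringBuilder sb = new StringBuilder();
        for (int r = 1; r <= rows; r++) {
            sb.append("<row r=\"").append(r).append("\"><c r=\"A").append(r)
              .append("\" t=\"inlineStr\"><is><t>value</t></is></c></row>\n");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    /** Returns the gzip-compressed size of the given bytes. */
    static int gzippedSize(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        byte[] raw = sampleSheetXml(10_000);
        System.out.println("raw bytes:     " + raw.length);
        System.out.println("gzipped bytes: " + gzippedSize(raw));
    }
}
```

<p>The trade-off is extra CPU time spent compressing and decompressing the
temp files.</p>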
|
||||
</section>
|
||||
|
||||
<anchor id="low_level_api" />
|
||||
<section><title>Low Level APIs</title>
|
||||
|
||||
<p>The low level API is not much to look at. It consists of lots of
"Records" in the org.apache.poi.hssf.record.* package,
and a set of helper classes in org.apache.poi.hssf.model.*. The
record classes are consistent with the low level binary structures
inside a BIFF8 file (which is embedded in a POIFS file system). You
probably need the book: "Microsoft Excel 97 Developer's Kit"
from Microsoft Press in order to understand how these fit together
(out of print but easily obtainable from Amazon's used books). In
order to gain a good understanding of how to use the low level APIs
you should view the source in org.apache.poi.hssf.usermodel.* and
the classes in org.apache.poi.hssf.model.*. You should read the
documentation for the POIFS libraries as well.</p>
|
||||
</section>
|
||||
<section><title>Generating XLS from XML</title>
|
||||
<p>If you wish to generate an XLS file from some XML, it is possible to
|
||||
write your own XML processing code, then use the User API to write out
|
||||
the document.</p>
|
||||
<p>The other option is to use <a href="https://cocoon.apache.org/">Cocoon</a>.
|
||||
In Cocoon, there is the <a href="https://cocoon.apache.org/2.1/userdocs/xls-serializer.html">HSSF Serializer</a>,
|
||||
which takes in XML (in the gnumeric format), and outputs an XLS file for you.</p>
|
||||
</section>
|
||||
<section><title>HSSF Class/Test Application</title>
|
||||
|
||||
<p>The HSSF application is nothing more than a test for the high
|
||||
level API (and indirectly the low level support). The main body of
|
||||
its code is repeated above. To run it:
|
||||
</p>
|
||||
<ul>
|
||||
<li>download the poi-alpha build and untar it (tar xvzf
|
||||
tarball.tar.gz)
|
||||
</li>
|
||||
<li>set up your classpath as follows:
|
||||
<code>export HSSFDIR={wherever you put HSSF's jar files}
|
||||
export LOG4JDIR={wherever you put LOG4J's jar files}
|
||||
export CLASSPATH=$CLASSPATH:$HSSFDIR/hssf.jar:$HSSFDIR/poi-poifs.jar:$HSSFDIR/poi-util.jar:$LOG4JDIR/log4j.jar</code>
|
||||
</li><li>type:
|
||||
<code>java org.apache.poi.hssf.dev.HSSF ~/myxls.xls write</code></li>
|
||||
</ul>
|
||||
<p></p>
|
||||
<p>This should generate a test sheet in your home directory called <code>"myxls.xls"</code>. </p>
|
||||
<ul>
|
||||
<li>Type:
|
||||
<code>java org.apache.poi.hssf.dev.HSSF ~/input.xls output.xls</code>
|
||||
<br/>
|
||||
<br/>
|
||||
This is the read/write/modify test. It reads in the spreadsheet, modifies a cell, and writes it back out.
Failing this test is not necessarily a bad thing. If HSSF tries to modify a nonexistent sheet then this will
most likely fail. No big deal. </li>
|
||||
</ul>
|
||||
</section>
|
||||
<section><title>HSSF Developer's Tools</title>
|
||||
|
||||
<p>HSSF has a number of tools useful for developers to debug/develop
|
||||
stuff using HSSF (and more generally XLS files). We've already
|
||||
discussed the app for testing HSSF read/write/modify capabilities;
|
||||
now we'll talk a bit about BiffViewer. Early on in the development of
|
||||
HSSF, it was decided that knowing what was in a record, what was
|
||||
wrong with it, etc. was virtually impossible with the available
|
||||
tools. So we developed BiffViewer. You can find it at
|
||||
org.apache.poi.hssf.dev.BiffViewer. It performs two basic
|
||||
functions and a derivative.
|
||||
</p>
|
||||
<p>The first is "biffview". To do this you run it (assumes
|
||||
you have everything set up in your classpath and that you know what
|
||||
you're doing enough to be thinking about this) with an xls file as a
|
||||
parameter. It will give you a listing of all understood records with
|
||||
their data and a list of not-yet-understood records with no data
|
||||
(because it doesn't know how to interpret them). This listing is
|
||||
useful for several things. First, you can look at the values and SEE
|
||||
what is wrong in quasi-English. Second, you can send the output to a
|
||||
file and compare it.
|
||||
</p>
|
||||
<p>The second function is "big freakin dump": just pass a
|
||||
file and a second argument matching "bfd" exactly. This
|
||||
will just make a big hexdump of the file.
|
||||
</p>
|
||||
<p>Lastly, there is "mixed" mode which does the same as
|
||||
regular biffview, only it includes hex dumps of certain records
|
||||
intertwined. To use that just pass a file with a second argument
|
||||
matching "on" exactly.</p>
|
||||
<p>In the next release cycle we'll also have something called a
|
||||
FormulaViewer. The class is already there, but it's not very useful
|
||||
yet. When it does something, we'll document it.</p>
|
||||
|
||||
</section>
|
||||
<section><title>What's Next?</title>
|
||||
|
||||
<p>Further effort on HSSF is going to focus on the following major areas: </p>
|
||||
<ul>
|
||||
<li>Performance: POI currently uses a lot of memory for large sheets.</li>
|
||||
<li>Charts: This is a hard problem, with very little documentation.</li>
|
||||
</ul>
|
||||
<p><a href="site:guidelines"> So jump in! </a> </p>
|
||||
|
||||
</section>
|
||||
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
119
src/documentation/content/xdocs/components/spreadsheet/index.xml
Normal file
@ -0,0 +1,119 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>POI-HSSF and POI-XSSF/SXSSF - Java API To Access Microsoft Excel Format Files</title>
|
||||
<subtitle>Overview</subtitle>
|
||||
<authors>
|
||||
<person name="Andrew C. Oliver" email="acoliver@apache.org"/>
|
||||
<person name="Nicola Ken Barozzi" email="barozzi@nicolaken.com"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section>
|
||||
<title>Overview</title>
|
||||
|
||||
<p>HSSF is the POI Project's pure Java implementation of the
|
||||
Excel '97(-2007) file format. XSSF is the POI Project's pure
|
||||
Java implementation of the Excel 2007 OOXML (.xlsx) file
|
||||
format.</p>
|
||||
<p>HSSF and XSSF provide ways to create,
|
||||
modify, read and write XLS spreadsheets. They provide:
|
||||
</p>
|
||||
<ul>
|
||||
<li>low level structures for those with special needs</li>
|
||||
<li>an eventmodel api for efficient read-only access</li>
|
||||
<li>a full usermodel api for creating, reading and modifying XLS files</li>
|
||||
</ul>
|
||||
<p>If you are converting from the pure HSSF usermodel and wish
|
||||
to use the joint SS Usermodel for HSSF and XSSF support,
|
||||
see the <a href="converting.html">ss usermodel converting
|
||||
guide</a>.
|
||||
</p>
|
||||
<p>
|
||||
An alternate way of generating a spreadsheet is via the <a href="https://cocoon.apache.org">Cocoon</a> serializer (yet you'll still be using HSSF indirectly).
|
||||
With Cocoon you can serialize any XML datasource (which might be an ESQL page outputting in SQL, for instance) by simply
|
||||
applying the stylesheet and designating the serializer.
|
||||
</p>
|
||||
<p>
|
||||
If you're merely reading spreadsheet data, then use the
|
||||
eventmodel api in either the org.apache.poi.hssf.eventusermodel
|
||||
package, or the org.apache.poi.xssf.eventusermodel package, depending
|
||||
on your file format.
|
||||
</p>
|
||||
<p>
|
||||
If you're modifying spreadsheet data then use the usermodel api. You
|
||||
can also generate spreadsheets this way.
|
||||
</p>
|
||||
<p>
|
||||
Note that the usermodel system has a higher memory footprint than
|
||||
the low level eventusermodel, but has the major advantage of being
|
||||
much simpler to work with. Also please be aware that as the new
|
||||
XSSF supported Excel 2007 OOXML (.xlsx) files are XML based,
|
||||
the memory footprint for processing them is higher than for the
|
||||
older HSSF supported (.xls) binary files.
|
||||
</p>
|
||||
|
||||
|
||||
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>SXSSF (Since POI 3.8 beta3)</title>
|
||||
<p>Since 3.8-beta3, POI provides a low-memory footprint SXSSF API built on top of XSSF.</p>
|
||||
<p>
|
||||
SXSSF is an API-compatible streaming extension of XSSF to be used when
|
||||
very large spreadsheets have to be produced, and heap space is limited.
|
||||
SXSSF achieves its low memory footprint by limiting access to the rows that
|
||||
are within a sliding window, while XSSF gives access to all rows in the
|
||||
document. Older rows that are no longer in the window become inaccessible,
|
||||
as they are written to the disk.
|
||||
</p>
|
||||
<p>
|
||||
In auto-flush mode the size of the access window can be specified, to hold a certain number of rows in memory.
|
||||
When that value is reached, the creation of an additional row causes the row with the lowest index to be
|
||||
removed from the access window and written to disk. Or, the window size can be set to grow dynamically;
|
||||
it can be trimmed periodically by an explicit call to flushRows(int keepRows) as needed.
|
||||
</p>
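<p>
The access window described above can be pictured with a small, self-contained sketch. It is a simplified
model for illustration only and is not POI's SXSSF implementation: the class name RowWindowSketch, the String
row payload and the in-memory list standing in for the temporary file on disk are all assumptions. Only
the method name flushRows(int keepRows) is taken from the text above.
</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Simplified model of a sliding row window; an illustration only, not POI's SXSSF code. */
class RowWindowSketch {
    private final int windowSize;                                    // rows kept in memory; -1 = grow dynamically
    private final TreeMap<Integer, String> window = new TreeMap<>(); // in-memory rows, keyed by row index
    private final List<String> flushed = new ArrayList<>();          // hypothetical stand-in for the file on disk

    RowWindowSketch(int windowSize) {
        this.windowSize = windowSize;
    }

    /** Adds a row; in auto-flush mode the lowest-index rows are written out once the window is full. */
    void createRow(int rowIndex, String data) {
        window.put(rowIndex, data);
        if (windowSize >= 0) {
            flushRows(windowSize);
        }
    }

    /** Writes out lowest-index rows until at most keepRows remain in memory. */
    void flushRows(int keepRows) {
        while (window.size() > keepRows) {
            flushed.add(window.pollFirstEntry().getValue());
        }
    }

    /** Returns the row if it is still inside the window, or null once it has been flushed. */
    String getRow(int rowIndex) {
        return window.get(rowIndex);
    }

    int flushedCount() {
        return flushed.size();
    }

    public static void main(String[] args) {
        RowWindowSketch sheet = new RowWindowSketch(2);  // keep two rows in memory
        sheet.createRow(0, "row 0");
        sheet.createRow(1, "row 1");
        sheet.createRow(2, "row 2");                     // pushes row 0 out of the window
        System.out.println(sheet.getRow(0));             // null: row 0 is no longer accessible
        System.out.println(sheet.getRow(2));             // row 2
    }
}
```

<p>
Constructing the sketch with a window size of -1 models the dynamically growing window: nothing is written
out until flushRows(int keepRows) is called explicitly.
</p>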
|
||||
<p>
|
||||
Due to the streaming nature of the implementation, there are the following
|
||||
limitations when compared to XSSF:
|
||||
</p>
|
||||
<ul>
|
||||
<li>Only a limited number of rows are accessible at a point in time.</li>
|
||||
<li>Sheet.clone() is not supported.</li>
|
||||
<li>Formula evaluation is not supported</li>
|
||||
</ul>
|
||||
|
||||
<p> See more details at <a href="how-to.html#sxssf">SXSSF How-To</a></p>
|
||||
|
||||
<p>The table below synopsizes the comparative features of POI's Spreadsheet API:</p>
|
||||
<p><em>Spreadsheet API Feature Summary</em></p>
|
||||
|
||||
<p>
|
||||
<img src="images/ss-features.png" alt="Spreadsheet API Feature Summary"/>
|
||||
</p>
|
||||
</section>
|
||||
|
||||
</body>
|
||||
</document>
|
||||
@ -0,0 +1,99 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Apache POI™ - HSSF and XSSF Limitations</title>
|
||||
<authors>
|
||||
<person email="user@poi.apache.org" name="Glen Stampoultzis" id="GJS"/>
|
||||
</authors>
|
||||
</header>
|
||||
<body>
|
||||
<section><title>Current HSSF / XSSF main limitations</title>
|
||||
<p>
|
||||
The intent of this document is to outline some of the known limitations of the
|
||||
POI HSSF and XSSF APIs. It is not intended to be a complete list of every bug
|
||||
or missing feature of HSSF or XSSF; rather, its purpose is to provide a broad
|
||||
feel for some of the functionality that is missing or broken.
|
||||
</p>
|
||||
<ul>
|
||||
<li>
|
||||
File sizes/Memory usage<br/><br/>
|
||||
<ul>
|
||||
<li>
|
||||
There are some inherent limits in the Excel file formats. These are defined in class
|
||||
<a href="../../apidocs/dev/org/apache/poi/ss/SpreadsheetVersion.html">SpreadsheetVersion</a>.
|
||||
As long as you have enough main-memory, you should be able to handle files up to these limits. For huge files
|
||||
using the default POI classes you will likely need a very large amount of memory.
|
||||
<br/>
|
||||
<br/>
|
||||
There are ways to overcome the main-memory limitations if needed:
|
||||
<br/>
|
||||
<ul>
|
||||
<li>
|
||||
For writing very huge files, there is <a href="site:spreadsheet">SXSSFWorkbook</a>
|
||||
which allows you to stream data out to files (with certain limitations on what you can do, as only
|
||||
parts of the file are held in memory).
|
||||
</li>
|
||||
<li>
|
||||
For reading very huge files, take a look at the sample
|
||||
<a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/eventusermodel/XLSX2CSV.java">XLSX2CSV</a>
|
||||
which shows how you can read a file in streaming fashion (again with some limitations on what information you
|
||||
can read out of the file, but there are ways to get at most of it if necessary).
|
||||
</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>
|
||||
Charts<br/><br/>
|
||||
<ul>
|
||||
<li>
|
||||
HSSF has some limited support for creating a handful of very simple Chart types,
|
||||
but largely this isn't supported. HSSF (largely) doesn't support changing Charts.
|
||||
You can however create a chart in Excel using Named ranges, modify the chart data
|
||||
values using HSSF and write a new spreadsheet out. This is possible because POI
|
||||
attempts to keep existing records intact as far as possible.<br/>
|
||||
</li>
|
||||
<li>
|
||||
XSSF has only limited chart support including making some simple changes
|
||||
and adding at least some line and scatter charts, see the examples <em>LineChart</em>
|
||||
and <em>ScatterChart</em>.<br/><br/>
|
||||
</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>
|
||||
Macros<br/><br/>
|
||||
Macros cannot be created. There are currently no plans to support macros.
|
||||
However, reading and re-writing files containing macros will safely preserve
|
||||
the macros. Recent versions of Apache POI support extracting the macro data
|
||||
via <a href="../../apidocs/dev/org/apache/poi/poifs/macros/VBAMacroExtractor.html">VBAMacroExtractor</a>
|
||||
and <a href="../../apidocs/dev/org/apache/poi/poifs/macros/VBAMacroReader.html">VBAMacroReader</a><br/><br/>
|
||||
</li>
|
||||
<li>
|
||||
Pivot Tables<br/><br/>
|
||||
HSSF doesn't have support for reading or creating Pivot tables. XSSF has limited
|
||||
support for creating Pivot Tables, and very limited read/change support.
|
||||
</li>
|
||||
</ul>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
@ -0,0 +1,212 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Record Generator HOWTO</title>
|
||||
<authors>
|
||||
<person email="user@poi.apache.org" name="Glen Stampoultzis" id="glens"/>
|
||||
<person email="acoliver@apache.org" name="Andrew C. Oliver" id="acoliver"/>
|
||||
</authors>
|
||||
</header>
|
||||
<body>
|
||||
<section><title>How to Use the Record Generator</title>
|
||||
|
||||
<section><title>History</title>
|
||||
<p>
|
||||
The record generator was born from frustration with translating
|
||||
the Excel records to Java classes. Doing this manually is a time
|
||||
consuming process. It's also very easy to make mistakes.
|
||||
</p>
|
||||
<p>
|
||||
A utility was needed to take the definition of what a
|
||||
record looked like and do all the boring and repetitive work.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>Capabilities</title>
|
||||
<p>
|
||||
The record generator takes XML as input and produces the following
|
||||
output:
|
||||
</p>
|
||||
<ul>
|
||||
<li>A Java file capable of decoding and encoding the record.</li>
|
||||
<li>A test class that provides a fill-in-the-blanks implementation
|
||||
of a test case for ensuring the record operates as
|
||||
designed.</li>
|
||||
</ul>
|
||||
</section>
|
||||
<section><title>Usage</title>
|
||||
<p>
|
||||
The record generator is invoked as an Ant target
|
||||
(generate-records). It goes through looking for all files in
|
||||
<code>src/records/definitions</code> ending with _record.xml.
|
||||
It then creates two files: the Java record definition and the
|
||||
Java test case template.
|
||||
</p>
|
||||
<p>
|
||||
The records themselves have the following general layout:
|
||||
</p>
|
||||
<source><![CDATA[
|
||||
<record id="0x1032" name="Frame" package="org.apache.poi.hssf.record"
|
||||
excel-record-id="FRAME">
|
||||
<description>The frame record indicates whether there is a border
|
||||
around the displayed text of a chart.</description>
|
||||
<author>Glen Stampoultzis (glens at apache.org)</author>
|
||||
<fields>
|
||||
<field type="int" size="2" name="border type">
|
||||
<const name="regular" value="0" description="regular rectangle or no border"/>
|
||||
<const name="shadow" value="1" description="rectangle with shadow"/>
|
||||
</field>
|
||||
<field type="int" size="2" name="options">
|
||||
<bit number="0" name="auto size"
|
||||
description="excel calculates the size automatically if true"/>
|
||||
<bit number="1" name="auto position"
|
||||
description="excel calculates the position automatically"/>
|
||||
</field>
|
||||
</fields>
|
||||
</record>
|
||||
]]></source>
|
||||
<p>
|
||||
The following table details the allowable types and sizes for
|
||||
the fields.
|
||||
</p>
|
||||
<table>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<th>Size</th>
|
||||
<th>Java Type</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>int</td>
|
||||
<td>1</td>
|
||||
<td>byte</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>int</td>
|
||||
<td>2</td>
|
||||
<td>short</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>int</td>
|
||||
<td>4</td>
|
||||
<td>int</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>int</td>
|
||||
<td>8</td>
|
||||
<td>long</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>int</td>
|
||||
<td>varword</td>
|
||||
<td>array of shorts</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>bits</td>
|
||||
<td>1</td>
|
||||
<td>A byte made up of bits (defined by the bit element)
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>bits</td>
|
||||
<td>2</td>
|
||||
<td>A short made up of bits</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>bits</td>
|
||||
<td>4</td>
|
||||
<td>An int made up of bits</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>float</td>
|
||||
<td>8</td>
|
||||
<td>double</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>hbstring</td>
|
||||
<td>java expression</td>
|
||||
<td>String</td>
|
||||
</tr>
|
||||
</table>
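<p>
To make the mapping in the table more concrete, here is a hand-written sketch of how the two 2-byte fields
of the Frame record shown earlier could be decoded. It is not output of the record generator. The class and
method names (FrameRecordSketch, isAutoSize, isAutoPosition) are assumptions made for illustration, and the
sketch reads only the record body, without the record id and length header.
</p>

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hand-written illustration of decoding the Frame record body; not generator output. */
class FrameRecordSketch {
    static final short BORDER_TYPE_REGULAR = 0;  // const "regular"
    static final short BORDER_TYPE_SHADOW = 1;   // const "shadow"

    final short borderType;  // a 2-byte int field maps to a Java short
    final short options;     // second 2-byte field, carrying the bit flags

    FrameRecordSketch(byte[] body) {
        // BIFF records store multi-byte integers in little-endian order.
        ByteBuffer in = ByteBuffer.wrap(body).order(ByteOrder.LITTLE_ENDIAN);
        borderType = in.getShort();
        options = in.getShort();
    }

    /** Bit number 0 of the options field. */
    boolean isAutoSize() {
        return (options & 0x0001) != 0;
    }

    /** Bit number 1 of the options field. */
    boolean isAutoPosition() {
        return (options & 0x0002) != 0;
    }

    public static void main(String[] args) {
        // border type = 1 (shadow), options = 0x0002 (auto position only)
        byte[] body = {0x01, 0x00, 0x02, 0x00};
        FrameRecordSketch frame = new FrameRecordSketch(body);
        System.out.println(frame.borderType == BORDER_TYPE_SHADOW);  // true
        System.out.println(frame.isAutoSize());                      // false
        System.out.println(frame.isAutoPosition());                  // true
    }
}
```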
|
||||
<p>
|
||||
The Java records are regenerated each time the record generator is
|
||||
run; however, the test stubs are only created if the test stub does
|
||||
not already exist. What this means is that you may change test
|
||||
stubs but not the generated records.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Custom Field Types</title>
|
||||
<p>
|
||||
Occasionally the built-in types are not enough. More control
|
||||
over the encoding and decoding of the streams is required. This
|
||||
can be achieved using a custom type.
|
||||
</p>
|
||||
<p>
|
||||
A custom type lets you escape to Java to define the way in which
|
||||
the field encodes and decodes. To code a custom type you
|
||||
declare your field like this:
|
||||
</p>
|
||||
<source><![CDATA[
|
||||
<field type="custom:org.apache.poi.hssf.record.LinkedDataFormulaField"
|
||||
size="var" name="formula of link" description="formula"/>
|
||||
]]></source>
|
||||
<p>
|
||||
Where the class name specified after <code>custom:</code> is a
|
||||
class implementing the interface <code>CustomField</code>.
|
||||
</p>
|
||||
<p>
|
||||
You can then implement the encoding yourself.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>How it Works</title>
|
||||
<p>
|
||||
The record generation works by taking an XML file and styling it
|
||||
using XSLT. Given that XSLT is a little limited in some ways it was
|
||||
necessary to add a little Java code to the mix.
|
||||
</p>
|
||||
<p>
|
||||
See record.xsl, record_test.xsl, FieldIterator.java,
|
||||
RecordUtil.java, RecordGenerator.java
|
||||
</p>
|
||||
<p>
|
||||
There is a corresponding "type" generator for HWPF.
|
||||
See the HWPF documentation for details.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Limitations</title>
|
||||
<p>
|
||||
The record generator does not handle all possible record types and
|
||||
does not intend to perform this function. When dealing with a
|
||||
non-standard record sometimes the cost-benefit of coding the
|
||||
record by hand will be greater than attempting to modify the
|
||||
generator. The main point of the record generator is to save
|
||||
time, so keep that in mind.
|
||||
</p>
|
||||
<p>
|
||||
Currently the XSL file that generates the record calls out to
|
||||
Java objects. The Java code for the record generation is
|
||||
currently quite messy with minimal comments.
|
||||
</p>
|
||||
</section>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
@ -0,0 +1,200 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>HSSF Use Cases</title>
|
||||
<authors>
|
||||
<person email="marc.johnson@yahoo.com" name="Marc Johnson" id="MJ"/>
|
||||
</authors>
|
||||
</header>
|
||||
<body>
|
||||
<section><title>HSSF Use Cases</title>
|
||||
<section><title>Use Case 1: Read existing HSSF</title>
|
||||
|
||||
<p><strong>Primary Actor:</strong> HSSF client</p>
|
||||
<p><strong>Scope:</strong> HSSF</p>
|
||||
<p><strong>Level:</strong> Summary</p>
|
||||
<p><strong>Stakeholders and Interests:</strong></p>
|
||||
<ul>
|
||||
<li>HSSF client - wants to read content
|
||||
of HSSF file</li>
|
||||
<li>HSSF - understands HSSF file</li>
|
||||
<li>POIFS - understands underlying POI
|
||||
file system</li>
|
||||
</ul>
|
||||
<p><strong>Precondition:</strong> None</p>
|
||||
<p><strong>Minimal Guarantee:</strong> None</p>
|
||||
<p><strong>Main Success Guarantee:</strong></p>
|
||||
<ol>
|
||||
<li>HSSF client requests HSSF to read
|
||||
an HSSF file, providing an InputStream
|
||||
containing the HSSF file in question.</li>
|
||||
<li>HSSF requests POIFS to read the HSSF
|
||||
file, passing the InputStream
|
||||
object to POIFS (POIFS use case 1, read existing file system)</li>
|
||||
<li>HSSF reads the "Workbook"
|
||||
file (use case 4, read workbook entry)</li>
|
||||
</ol>
|
||||
<p><strong>Extensions:</strong></p>
|
||||
<p>2a. Exceptions
|
||||
thrown by POIFS will be passed on to the HSSF client.</p>
|
||||
</section>
|
||||
<section><title>Use Case 2: Write HSSF file</title>
|
||||
|
||||
<p><strong>Primary Actor:</strong> HSSF client</p>
|
||||
<p><strong>Scope:</strong> HSSF</p>
|
||||
<p><strong>Level:</strong> Summary</p>
|
||||
<p><strong>Stakeholders and Interests:</strong></p>
|
||||
<ul>
|
||||
<li>HSSF client - wants to write file
|
||||
out.</li>
|
||||
<li>HSSF - knows how to write file
|
||||
out.</li>
|
||||
<li>POIFS - knows how to write file
|
||||
system out.</li>
|
||||
</ul>
|
||||
<p><strong>Precondition:</strong></p>
|
||||
<ul>
|
||||
<li>File has been
|
||||
read (use case 1, read existing HSSF file) and subsequently modified
|
||||
or file has been created (use case 3, create HSSF file)</li>
|
||||
</ul>
|
||||
<p><strong>Minimal Guarantee:</strong> None</p>
|
||||
<p><strong>Main Success Guarantee:</strong></p>
|
||||
<ol>
|
||||
<li>HSSF client
|
||||
provides an OutputStream to
|
||||
write the file to.</li>
|
||||
<li>HSSF writes
|
||||
the "Workbook" to its associated POIFS file system (use case
|
||||
5, write workbook entry)</li>
|
||||
<li>HSSF
|
||||
requests POIFS to write its file system out, using the OutputStream
|
||||
obtained from the HSSF client (POIFS use case 2, write file system).</li>
|
||||
</ol>
|
||||
<p><strong>Extensions:</strong></p>
|
||||
<p>3a. Exceptions
|
||||
from POIFS are passed to the HSSF client.</p>
|
||||
|
||||
</section>
|
||||
<section><title>Use Case 3: Create HSSF file</title>
|
||||
|
||||
<p><strong>Primary Actor:</strong> HSSF client</p>
|
||||
<p><strong>Scope:</strong> HSSF</p>
|
||||
<p>
|
||||
<strong>Level:</strong> Summary</p>
|
||||
<p><strong>Stakeholders and Interests:</strong></p>
|
||||
<ul>
|
||||
<li>HSSF client - wants to create a new
|
||||
file.</li>
|
||||
<li>HSSF - knows how to create a new
|
||||
file.</li>
|
||||
<li>POIFS - knows how to create a new
|
||||
file system.</li>
|
||||
</ul>
|
||||
<p><strong>Precondition:</strong></p>
|
||||
<p><strong>Minimal Guarantee:</strong> None</p>
|
||||
<p><strong>Main Success Guarantee:</strong></p>
|
||||
<ol>
|
||||
<li>HSSF requests
|
||||
POIFS to create a new file system (POIFS use case 3, create new file
|
||||
system)</li>
|
||||
</ol>
|
||||
<p><strong>Extensions:</strong>
|
||||
None</p>
|
||||
|
||||
</section>
|
||||
<section><title>Use Case 4: Read workbook entry</title>
|
||||
<p><strong>Primary Actor:</strong> HSSF</p>
|
||||
<p><strong>Scope:</strong> HSSF</p>
|
||||
<p>
|
||||
<strong>Level:</strong> Summary</p>
|
||||
<p><strong>Stakeholders and Interests:</strong></p>
|
||||
<ul>
|
||||
<li>HSSF - knows how to read the
|
||||
workbook entry</li>
|
||||
<li>POIFS - knows how to manage the file
|
||||
system.</li>
|
||||
</ul>
|
||||
<p><strong>Precondition:</strong></p>
|
||||
<ul>
|
||||
<li>The file
|
||||
system has been read (use case 1, read existing HSSF file) or has
|
||||
been created and written to (use case 3, create HSSF file system;
|
||||
use case 5, write workbook entry).</li>
|
||||
</ul>
|
||||
<p><strong>Minimal
|
||||
Guarantee:</strong> None</p>
|
||||
<p><strong>Main Success Guarantee:</strong></p>
|
||||
<ol>
|
||||
<li>
|
||||
HSSF requests POIFS for the "Workbook" file</li>
|
||||
<li>POIFS returns
|
||||
an InputStream for the file.</li>
|
||||
<li>HSSF reads
|
||||
from the InputStream provided by POIFS</li>
|
||||
<li>HSSF closes
|
||||
the InputStream provided by POIFS</li>
|
||||
</ol>
|
||||
<p><strong>Extensions:</strong></p>
|
||||
<p>3a. Exceptions
|
||||
thrown by POIFS will be passed on</p>
|
||||
</section>
|
||||
<section><title>Use Case 5: Write workbook entry</title>
|
||||
|
||||
|
||||
<p><strong>Primary Actor:</strong> HSSF</p>
|
||||
<p><strong>Scope:</strong> HSSF</p>
|
||||
<p>
|
||||
<strong>Level:</strong> Summary</p>
|
||||
<p><strong>Stakeholders and Interests:</strong></p>
|
||||
<ul>
|
||||
<li>HSSF - knows how to
|
||||
write the workbook entry.</li>
|
||||
<li>POIFS - knows how to manage the file
|
||||
system.</li>
|
||||
</ul>
|
||||
<p><strong>Precondition:</strong>
|
||||
</p>
|
||||
<ul>
|
||||
<li>Either an existing HSSF file has
|
||||
been read (use case 1, read existing HSSF file) or an HSSF file has
|
||||
been created (use case 3, create HSSF file).</li>
|
||||
</ul>
|
||||
<p><strong>Minimal Guarantee:</strong> None</p>
|
||||
<p><strong>Main Success Guarantee:</strong></p>
|
||||
<ol>
|
||||
<li>HSSF
|
||||
checks the POIFS file system directory for the "Workbook"
|
||||
file (POIFS use case 8, read file system directory)</li>
|
||||
<li>If "Workbook" is in the directory, HSSF requests POIFS to
|
||||
replace it with the new workbook entry (POIFS use case 4, replace file
|
||||
in file system). Otherwise, HSSF requests POIFS to write the new
|
||||
workbook file, with the name "Workbook" (POIFS use case 6,
|
||||
write new file to file system)</li>
|
||||
</ol>
|
||||
<p><strong>Extensions:</strong> None</p>
|
||||
</section>
|
||||
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
@ -0,0 +1,414 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>User Defined Functions</title>
|
||||
<authors>
|
||||
<person email="jon@loquatic.com" name="Jon Svede" id="JDS"/>
|
||||
<person email="brian.bush@nrel.gov" name="Brian Bush" id="BWB"/>
|
||||
</authors>
|
||||
</header>
|
||||
<body>
|
||||
<section><title>How to Create and Use User Defined Functions</title>
|
||||
|
||||
<section><title>Description</title>
|
||||
<p>This document describes the User Defined Functions within POI.
|
||||
User defined functions allow you to take code that is written in VBA
|
||||
and rewrite it in Java for use within POI. Consider the following example.</p>
|
||||
</section>
|
||||
<section><title>An Example</title>
|
||||
<p>Suppose you are given a spreadsheet that can calculate the principal and interest
|
||||
payments for a mortgage. The user enters the principal loan amount, the interest rate
|
||||
and the term of the loan. The Excel spreadsheet does the rest.</p>
|
||||
<p>
|
||||
<img src="images/simple-xls-with-function.jpg" alt="mortgage calculation spreadsheet"/>
|
||||
</p>
|
||||
<p>When you actually look at the workbook you discover that rather than having
|
||||
the formula in a cell, it has been written as a VBA function. You review the
|
||||
function and determine that it could be written in Java:</p>
|
||||
<p>
|
||||
<img src="images/calculatePayment.jpg" alt="VBA code"/>
|
||||
</p>
|
||||
<p>If we write a small program to try to evaluate this cell, we'll fail. Consider this source code:</p>
|
||||
<source><![CDATA[
|
||||
import java.io.File ;
|
||||
import java.io.FileInputStream ;
|
||||
import java.io.FileNotFoundException ;
|
||||
import java.io.IOException ;
|
||||
|
||||
import org.apache.poi.openxml4j.exceptions.InvalidFormatException ;
|
||||
import org.apache.poi.ss.formula.functions.FreeRefFunction ;
|
||||
import org.apache.poi.ss.formula.udf.AggregatingUDFFinder ;
|
||||
import org.apache.poi.ss.formula.udf.DefaultUDFFinder ;
|
||||
import org.apache.poi.ss.formula.udf.UDFFinder ;
|
||||
import org.apache.poi.ss.usermodel.Cell ;
|
||||
import org.apache.poi.ss.usermodel.CellValue ;
|
||||
import org.apache.poi.ss.usermodel.FormulaEvaluator ;
|
||||
import org.apache.poi.ss.usermodel.Row ;
|
||||
import org.apache.poi.ss.usermodel.Sheet ;
|
||||
import org.apache.poi.ss.usermodel.Workbook ;
|
||||
import org.apache.poi.ss.usermodel.WorkbookFactory ;
|
||||
import org.apache.poi.ss.util.CellReference ;
|
||||
|
||||
public class Evaluator {
|
||||
|
||||
|
||||
|
||||
public static void main( String[] args ) {
|
||||
|
||||
System.out.println( "fileName: " + args[0] ) ;
|
||||
System.out.println( "cell: " + args[1] ) ;
|
||||
|
||||
File workbookFile = new File( args[0] ) ;
|
||||
|
||||
try {
|
||||
FileInputStream fis = new FileInputStream(workbookFile);
|
||||
Workbook workbook = WorkbookFactory.create(fis);
|
||||
|
||||
FormulaEvaluator evaluator = workbook.getCreationHelper().createFormulaEvaluator();
|
||||
|
||||
CellReference cr = new CellReference( args[1] ) ;
|
||||
String sheetName = cr.getSheetName() ;
|
||||
Sheet sheet = workbook.getSheet( sheetName ) ;
|
||||
int rowIdx = cr.getRow() ;
|
||||
int colIdx = cr.getCol() ;
|
||||
Row row = sheet.getRow( rowIdx ) ;
|
||||
Cell cell = row.getCell( colIdx ) ;
|
||||
|
||||
CellValue value = evaluator.evaluate( cell ) ;
|
||||
|
||||
System.out.println("returns value: " + value ) ;
|
||||
|
||||
|
||||
} catch( FileNotFoundException e ) {
|
||||
e.printStackTrace();
|
||||
} catch( InvalidFormatException e ) {
|
||||
e.printStackTrace();
|
||||
} catch( IOException e ) {
|
||||
e.printStackTrace();
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
]]></source>
|
||||
<p>If you run this code, you're likely to get the following error:</p>
|
||||
|
||||
<source><![CDATA[
|
||||
Exception in thread "main" org.apache.poi.ss.formula.eval.NotImplementedException: Error evaluating cell Sheet1!B4
|
||||
at org.apache.poi.ss.formula.WorkbookEvaluator.addExceptionInfo(WorkbookEvaluator.java:321)
|
||||
at org.apache.poi.ss.formula.WorkbookEvaluator.evaluateAny(WorkbookEvaluator.java:288)
|
||||
at org.apache.poi.ss.formula.WorkbookEvaluator.evaluate(WorkbookEvaluator.java:221)
|
||||
at org.apache.poi.hssf.usermodel.HSSFFormulaEvaluator.evaluateFormulaCellValue(HSSFFormulaEvaluator.java:320)
|
||||
at org.apache.poi.hssf.usermodel.HSSFFormulaEvaluator.evaluate(HSSFFormulaEvaluator.java:182)
|
||||
at poi.tests.Evaluator.main(Evaluator.java:61)
|
||||
Caused by: org.apache.poi.ss.formula.eval.NotImplementedException: calculatePayment
|
||||
at org.apache.poi.ss.formula.UserDefinedFunction.evaluate(UserDefinedFunction.java:59)
|
||||
at org.apache.poi.ss.formula.OperationEvaluatorFactory.evaluate(OperationEvaluatorFactory.java:129)
|
||||
at org.apache.poi.ss.formula.WorkbookEvaluator.evaluateFormula(WorkbookEvaluator.java:456)
|
||||
at org.apache.poi.ss.formula.WorkbookEvaluator.evaluateAny(WorkbookEvaluator.java:279)
|
||||
... 4 more
|
||||
|
||||
]]></source>
|
||||
|
||||
<p>How would we make it so POI can use this sheet?</p>
|
||||
</section>
|
||||
|
||||
<section><title>Defining Your Function</title>
|
||||
<p>To 'convert' this code to Java and make it available to POI you need to implement
|
||||
a FreeRefFunction instance. FreeRefFunction is an interface in the org.apache.poi.ss.formula.functions
|
||||
package. This interface defines one method, evaluate(ValueEval[] args, OperationEvaluationContext ec),
|
||||
which is how you will receive the argument values from POI.</p>
|
||||
<p>The evaluate() method as defined above is where you will convert the ValueEval instances to the
|
||||
proper number types. The following code snippet shows you how to get your values:</p>
|
||||
|
||||
<source><![CDATA[
|
||||
public class CalculateMortgage implements FreeRefFunction {
|
||||
|
||||
@Override
|
||||
public ValueEval evaluate( ValueEval[] args, OperationEvaluationContext ec ) {
|
||||
if (args.length != 3) {
|
||||
return ErrorEval.VALUE_INVALID;
|
||||
}
|
||||
|
||||
double principal, rate, years, result;
|
||||
try {
|
||||
ValueEval v1 = OperandResolver.getSingleValue( args[0],
|
||||
ec.getRowIndex(),
|
||||
ec.getColumnIndex() ) ;
|
||||
ValueEval v2 = OperandResolver.getSingleValue( args[1],
|
||||
ec.getRowIndex(),
|
||||
ec.getColumnIndex() ) ;
|
||||
ValueEval v3 = OperandResolver.getSingleValue( args[2],
|
||||
ec.getRowIndex(),
|
||||
ec.getColumnIndex() ) ;
|
||||
|
||||
principal = OperandResolver.coerceValueToDouble( v1 ) ;
|
||||
rate = OperandResolver.coerceValueToDouble( v2 ) ;
|
||||
years = OperandResolver.coerceValueToDouble( v3 ) ;
|
||||
]]></source>
|
||||
|
||||
<p>The first thing we do is check the number of arguments being passed since there is no sense
|
||||
in attempting to go further if you are missing critical information.</p>
|
||||
<p>Next we declare our variables. In our case we need variables for:</p>
|
||||
<ul>
|
||||
<li>principal - the amount of the loan</li>
|
||||
<li>rate - the interest rate as a decimal</li>
|
||||
<li>years - the length of the loan in years</li>
|
||||
<li>result - the result of the calculation</li>
|
||||
</ul>
|
||||
<p>Next, we use the OperandResolver to convert the ValueEval instances to doubles, though not directly.
|
||||
First we start by getting discrete values. Using the OperandResolver.getSingleValue() method
|
||||
we retrieve each of the values passed in by the cell in the spreadsheet. Next, we use the
|
||||
OperandResolver again to convert the ValueEval instances to doubles, in this case. This
|
||||
class has other methods of coercion for getting Strings, ints and booleans. Now that we've
|
||||
got our primitive values we can move on to calculating the value.</p>
|
||||
<p>As shown previously, we have the VBA source. We need to add code to our class to calculate
|
||||
the payment. To do this you could simply add it to the method we've already created but I've
|
||||
chosen to add it as its own method. Add the following method: </p>
|
||||
<source><![CDATA[
|
||||
public double calculateMortgagePayment( double p, double r, double y ) {
|
||||
|
||||
double i = r / 12 ;
|
||||
double n = y * 12 ;
|
||||
|
||||
double principalAndInterest =
|
||||
p * (( i * Math.pow((1 + i),n ) ) / ( Math.pow((1 + i),n) - 1)) ;
|
||||
|
||||
return principalAndInterest ;
|
||||
}
|
||||
]]></source>
|
||||
<p>The biggest change necessary is related to the exponents; Java doesn't have a notation for this
|
||||
so we had to add calls to Math.pow(). Now we need to add this call to our previous method:</p>
|
||||
<source><![CDATA[
|
||||
result = calculateMortgagePayment( principal, rate, years ) ;
|
||||
]]></source>
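<p>To make the arithmetic concrete, the sketch below repeats the same formula in a standalone class that runs without POI. The class name <code>MortgageFormulaSketch</code> and the sample inputs (a loan of 100000 at 5% over 15 years) are assumptions chosen for illustration; they are not read from the spreadsheet. It also tries one input for which the formula cannot produce a usable number.</p>
<source><![CDATA[
```java
final class MortgageFormulaSketch {

    // Same arithmetic as calculateMortgagePayment in the text above.
    static double monthlyPayment(double p, double r, double y) {
        double i = r / 12;   // monthly interest rate
        double n = y * 12;   // number of monthly payments
        double growth = Math.pow(1 + i, n);
        return p * ((i * growth) / (growth - 1));
    }

    public static void main(String[] args) {
        // Assumed sample inputs, not taken from the workbook.
        double payment = monthlyPayment(100000, 0.05, 15);
        System.out.println("payment: " + payment);

        // A zero rate makes the formula evaluate 0.0 / 0.0, which is NaN.
        double zeroRate = monthlyPayment(100000, 0.0, 15);
        System.out.println("zero rate gives NaN: " + Double.isNaN(zeroRate));
    }
}
```
]]></source>
<p>With the assumed inputs the payment comes to roughly 790.79. A rate of zero yields NaN rather than a payment, so the result of the calculation should not be returned to the caller unchecked.</p>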
|
||||
<p>Having done that, the last things we need to do are to check to make sure we didn't get a bad result and,
|
||||
if not, we need to return the value. Add the following code to the class:</p>
|
||||
<source><![CDATA[
|
||||
static final void checkValue(double result) throws EvaluationException {
|
||||
if (Double.isNaN(result) || Double.isInfinite(result)) {
|
||||
throw new EvaluationException(ErrorEval.NUM_ERROR);
|
||||
}
|
||||
}
|
||||
]]></source>
|
||||
<p>Then add a line of code to our evaluate method to call this new static method, complete our try/catch and return the value:</p>
|
||||
<source><![CDATA[
|
||||
checkValue(result);
|
||||
|
||||
} catch (EvaluationException e) {
|
||||
e.printStackTrace() ;
|
||||
return e.getErrorEval();
|
||||
}
|
||||
|
||||
return new NumberEval( result ) ;
|
||||
]]></source>
|
||||
|
||||
<p>So the whole class would be as follows:</p>
|
||||
|
||||
<source><![CDATA[
|
||||
import org.apache.poi.ss.formula.OperationEvaluationContext ;
|
||||
import org.apache.poi.ss.formula.eval.ErrorEval ;
|
||||
import org.apache.poi.ss.formula.eval.EvaluationException ;
|
||||
import org.apache.poi.ss.formula.eval.NumberEval ;
|
||||
import org.apache.poi.ss.formula.eval.OperandResolver ;
|
||||
import org.apache.poi.ss.formula.eval.ValueEval ;
|
||||
import org.apache.poi.ss.formula.functions.FreeRefFunction ;
|
||||
|
||||
/**
|
||||
* A simple function to calculate principal and interest.
|
||||
*
|
||||
* @author Jon Svede
|
||||
*
|
||||
*/
|
||||
public class CalculateMortgage implements FreeRefFunction {
|
||||
|
||||
@Override
|
||||
public ValueEval evaluate( ValueEval[] args, OperationEvaluationContext ec ) {
|
||||
if (args.length != 3) {
|
||||
return ErrorEval.VALUE_INVALID;
|
||||
}
|
||||
|
||||
double principal, rate, years, result;
|
||||
try {
|
||||
ValueEval v1 = OperandResolver.getSingleValue( args[0],
|
||||
ec.getRowIndex(),
|
||||
ec.getColumnIndex() ) ;
|
||||
ValueEval v2 = OperandResolver.getSingleValue( args[1],
|
||||
ec.getRowIndex(),
|
||||
ec.getColumnIndex() ) ;
|
||||
ValueEval v3 = OperandResolver.getSingleValue( args[2],
|
||||
ec.getRowIndex(),
|
||||
ec.getColumnIndex() ) ;
|
||||
|
||||
principal = OperandResolver.coerceValueToDouble( v1 ) ;
|
||||
rate = OperandResolver.coerceValueToDouble( v2 ) ;
|
||||
years = OperandResolver.coerceValueToDouble( v3 ) ;
|
||||
|
||||
result = calculateMortgagePayment( principal, rate, years ) ;
|
||||
|
||||
checkValue(result);
|
||||
|
||||
} catch (EvaluationException e) {
|
||||
e.printStackTrace() ;
|
||||
return e.getErrorEval();
|
||||
}
|
||||
|
||||
return new NumberEval( result ) ;
|
||||
}
|
||||
|
||||
public double calculateMortgagePayment( double p, double r, double y ) {
|
||||
double i = r / 12 ;
|
||||
double n = y * 12 ;
|
||||
|
||||
//M = P [ i(1 + i)^n ] / [ (1 + i)^n - 1 ]
|
||||
double principalAndInterest =
|
||||
p * (( i * Math.pow((1 + i),n ) ) / ( Math.pow((1 + i),n) - 1)) ;
|
||||
|
||||
return principalAndInterest ;
|
||||
}
|
||||
|
||||
/**
|
||||
* Excel does not support infinities and NaNs, rather, it gives a #NUM! error in these cases
|
||||
*
|
||||
* @throws EvaluationException (#NUM!) if <tt>result</tt> is <tt>NaN</tt> or <tt>Infinity</tt>
|
||||
*/
|
||||
static final void checkValue(double result) throws EvaluationException {
|
||||
if (Double.isNaN(result) || Double.isInfinite(result)) {
|
||||
throw new EvaluationException(ErrorEval.NUM_ERROR);
|
||||
}
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
]]></source>
|
||||
|
||||
<p>Great! Now we need to go back to our original program that failed to evaluate our cell and add code that will allow it to run our new Java code.</p>
|
||||
|
||||
</section>
|
||||
|
||||
<section><title>Registering Your Function</title>
|
||||
<p>Now we need to register our function in the Workbook, so that the Formula Evaluator can resolve the name "calculatePayment"
|
||||
and map it to the actual implementation (CalculateMortgage). This is done using the UDFFinder object.
|
||||
The UDFFinder manages FreeRefFunctions, which are our analog of the VBA code. We need to create a UDFFinder. There are
|
||||
a few things we need to know in order to do this:</p>
|
||||
<ul>
|
||||
<li>The name of the function in the VBA code (in our case it is calculatePayment)</li>
|
||||
<li>The Class name of our FreeRefFunction</li>
|
||||
</ul>
|
||||
<p>UDFFinder is actually an interface, so we need to use an actual implementation of this interface. Therefore we use the org.apache.poi.ss.formula.udf.DefaultUDFFinder class. If you refer to the Javadocs you'll see that this class expects to get two arrays, one
|
||||
containing the alias and the other containing an instance of the class that will represent that alias. In our case our alias will be calculatePayment
|
||||
and our class instance will be of the CalculateMortgage type. This class needs to be available at compile and runtime. Be sure to keep these arrays
|
||||
well organized because you'll run into problems if these arrays are of different sizes or an alias and its implementation aren't in the same relative position in their respective
|
||||
arrays. Add the following code:</p>
|
||||
<source><![CDATA[
|
||||
String[] functionNames = { "calculatePayment" } ;
|
||||
FreeRefFunction[] functionImpls = { new CalculateMortgage() } ;
|
||||
|
||||
UDFFinder udfs = new DefaultUDFFinder( functionNames, functionImpls ) ;
|
||||
UDFFinder udfToolpack = new AggregatingUDFFinder( udfs ) ;
|
||||
]]></source>
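<p>The pairing rule described above (a name and its implementation share the same index, and both arrays have the same length) can be illustrated without POI. The following is a simplified stand-in, not the actual DefaultUDFFinder code: the class <code>NamedFunctionRegistrySketch</code> is hypothetical, and it uses <code>DoubleUnaryOperator</code> in place of FreeRefFunction so that it runs on a plain JDK.</p>
<source><![CDATA[
```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.DoubleUnaryOperator;

final class NamedFunctionRegistrySketch {

    private final Map<String, DoubleUnaryOperator> byName = new HashMap<>();

    NamedFunctionRegistrySketch(String[] names, DoubleUnaryOperator[] impls) {
        // Parallel arrays only make sense when they have the same length.
        if (names.length != impls.length) {
            throw new IllegalArgumentException(
                    names.length + " names but " + impls.length + " implementations");
        }
        // Position i in one array is paired with position i in the other.
        for (int i = 0; i < names.length; i++) {
            byName.put(names[i], impls[i]);
        }
    }

    // Returns the implementation registered under the name, or null if none.
    DoubleUnaryOperator find(String name) {
        return byName.get(name);
    }

    public static void main(String[] args) {
        String[] names = { "twice", "negate" };
        DoubleUnaryOperator[] impls = { x -> x * 2, x -> -x };

        NamedFunctionRegistrySketch registry =
                new NamedFunctionRegistrySketch(names, impls);
        System.out.println(registry.find("twice").applyAsDouble(21));
    }
}
```
]]></source>
<p>Swapping two entries in only one of the arrays would bind each name to the wrong implementation without any error being raised, which is why the order matters as much as the length.</p>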
|
||||
<p>Now we have our UDFFinder instance and we've created the AggregatingUDFFinder instance. The last step is to pass this to our Workbook:</p>
|
||||
|
||||
<source><![CDATA[
|
||||
workbook.addToolPack(udfToolpack);
|
||||
]]></source>
|
||||
<p>So now the whole class will look like this:</p>
|
||||
<source><![CDATA[
|
||||
import java.io.File ;
|
||||
import java.io.FileInputStream ;
|
||||
import java.io.FileNotFoundException ;
|
||||
import java.io.IOException ;
|
||||
|
||||
import org.apache.poi.openxml4j.exceptions.InvalidFormatException ;
|
||||
import org.apache.poi.ss.formula.functions.FreeRefFunction ;
|
||||
import org.apache.poi.ss.formula.udf.AggregatingUDFFinder ;
|
||||
import org.apache.poi.ss.formula.udf.DefaultUDFFinder ;
|
||||
import org.apache.poi.ss.formula.udf.UDFFinder ;
|
||||
import org.apache.poi.ss.usermodel.Cell ;
|
||||
import org.apache.poi.ss.usermodel.CellValue ;
import org.apache.poi.ss.usermodel.FormulaEvaluator ;
|
||||
import org.apache.poi.ss.usermodel.Row ;
|
||||
import org.apache.poi.ss.usermodel.Sheet ;
|
||||
import org.apache.poi.ss.usermodel.Workbook ;
|
||||
import org.apache.poi.ss.usermodel.WorkbookFactory ;
|
||||
import org.apache.poi.ss.util.CellReference ;
|
||||
|
||||
public class Evaluator {
|
||||
|
||||
public static void main( String[] args ) {
|
||||
|
||||
System.out.println( "fileName: " + args[0] ) ;
|
||||
System.out.println( "cell: " + args[1] ) ;
|
||||
|
||||
File workbookFile = new File( args[0] ) ;
|
||||
|
||||
try {
|
||||
FileInputStream fis = new FileInputStream(workbookFile);
|
||||
Workbook workbook = WorkbookFactory.create(fis);
|
||||
|
||||
String[] functionNames = { "calculatePayment" } ;
|
||||
FreeRefFunction[] functionImpls = { new CalculateMortgage() } ;
|
||||
|
||||
UDFFinder udfs = new DefaultUDFFinder( functionNames, functionImpls ) ;
|
||||
UDFFinder udfToolpack = new AggregatingUDFFinder( udfs ) ;
|
||||
|
||||
workbook.addToolPack(udfToolpack);
|
||||
|
||||
FormulaEvaluator evaluator = workbook.getCreationHelper().createFormulaEvaluator();
|
||||
|
||||
CellReference cr = new CellReference( args[1] ) ;
|
||||
String sheetName = cr.getSheetName() ;
|
||||
Sheet sheet = workbook.getSheet( sheetName ) ;
|
||||
int rowIdx = cr.getRow() ;
|
||||
int colIdx = cr.getCol() ;
|
||||
Row row = sheet.getRow( rowIdx ) ;
|
||||
Cell cell = row.getCell( colIdx ) ;
|
||||
|
||||
CellValue value = evaluator.evaluate( cell ) ;
|
||||
|
||||
System.out.println("returns value: " + value ) ;
|
||||
|
||||
|
||||
} catch( FileNotFoundException e ) {
|
||||
e.printStackTrace();
|
||||
} catch( InvalidFormatException e ) {
|
||||
e.printStackTrace();
|
||||
} catch( IOException e ) {
|
||||
e.printStackTrace();
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
]]></source>
|
||||
<p>Now that our evaluator is aware of the UDFFinder which in turn is aware of our FreeRefFunction, we're ready to re-run our example:</p>
|
||||
<source>Evaluator mortgage-calculation.xls Sheet1!B4</source>
|
||||
<p>which prints the following output in the console:</p>
|
||||
<source><![CDATA[
|
||||
fileName: mortgage-calculation.xls
|
||||
cell: Sheet1!B4
|
||||
returns value: org.apache.poi.ss.usermodel.CellValue [790.7936267415464]
|
||||
]]></source>
|
||||
<p>That is it! Now you can create Java code and register it, allowing your POI-based application to run spreadsheets that previously were inaccessible.</p>
|
||||
<p>This example can be found in the <a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/ss/formula">poi-examples/src/main/java/org/apache/poi/examples/ss/formula</a> folder in the source.</p>
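<p>As a side note, the cell argument <code>Sheet1!B4</code> is turned into a sheet name plus zero-based row and column indexes before the cell is looked up. The sketch below shows that mapping for simple references only. <code>A1ReferenceSketch</code> is a hypothetical helper written for this page, not POI's CellReference, and it makes no attempt to handle quoted sheet names, absolute markers such as <code>$B$4</code>, or ranges.</p>
<source><![CDATA[
```java
final class A1ReferenceSketch {

    final String sheetName; // null when the reference has no sheet prefix
    final int row;          // zero-based row index
    final int col;          // zero-based column index

    private A1ReferenceSketch(String sheetName, int row, int col) {
        this.sheetName = sheetName;
        this.row = row;
        this.col = col;
    }

    static A1ReferenceSketch parse(String ref) {
        int bang = ref.lastIndexOf('!');
        String sheet = bang >= 0 ? ref.substring(0, bang) : null;
        String cell = ref.substring(bang + 1);

        // Column letters are a base-26 number with A=1, Z=26, AA=27.
        int pos = 0;
        int colNumber = 0;
        while (pos < cell.length()) {
            char c = Character.toUpperCase(cell.charAt(pos));
            if (c < 'A' || c > 'Z') {
                break;
            }
            colNumber = colNumber * 26 + (c - 'A' + 1);
            pos++;
        }
        if (pos == 0 || pos == cell.length()) {
            throw new IllegalArgumentException("Not a simple A1 reference: " + ref);
        }

        // The remaining digits are the one-based row number.
        int rowNumber = Integer.parseInt(cell.substring(pos));
        return new A1ReferenceSketch(sheet, rowNumber - 1, colNumber - 1);
    }

    public static void main(String[] args) {
        A1ReferenceSketch ref = parse("Sheet1!B4");
        // prints: Sheet1 row=3 col=1
        System.out.println(ref.sheetName + " row=" + ref.row + " col=" + ref.col);
    }
}
```
]]></source>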
|
||||
</section>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
|
||||
408
src/documentation/content/xdocs/devel/guidelines.xml
Normal file
@ -0,0 +1,408 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Apache POI™ - Contribution Guidelines</title>
|
||||
<authors>
|
||||
<person name="Nick Burch" email="dev@poi.apache.org"/>
|
||||
<person name="David Fisher" email="dev@poi.apache.org"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
|
||||
<section><title>Index of Contribution Guidelines</title>
|
||||
<ul>
|
||||
<li><a href="#Introduction">Introduction</a></li>
|
||||
<li><a href="#WhereHelpNeeded">Where is help needed on the project?</a></li>
|
||||
<li><a href="#GetInvolved">I just want to get involved, but don't know where to start?</a></li>
|
||||
<li><a href="#SubmittingPatches">Submitting Patches</a></li>
|
||||
<li><a href="#CodeStyle">Code Style</a></li>
|
||||
<li><a href="#Mentoring">Mentoring and Committership</a></li>
|
||||
<li><a href="#FileFormatInformation">File Format Information</a></li>
|
||||
</ul>
|
||||
</section>
|
||||
|
||||
<anchor id="Introduction"/>
|
||||
<section><title>Introduction</title>
|
||||
|
||||
<section><title>Disclaimer</title>
|
||||
<p>
|
||||
Any information in here that might be perceived as legal information is
|
||||
informational only. We're not lawyers, so consult a legal professional
|
||||
if needed.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>The Licensing</title>
|
||||
<p>
|
||||
The POI project is <a href="http://www.opensource.org">OpenSource</a>
|
||||
and developed/distributed under the <a
|
||||
href="https://www.apache.org/foundation/license-faq.html">
|
||||
Apache Software License v2</a>. Unlike some other licenses, the Apache
|
||||
license allows free open source development. Unlike some other Open Source
|
||||
licenses, it does not require you to release your source or use any
|
||||
particular license for your code which builds on top of it. (There are a
|
||||
handful of restrictions, especially around attribution, notices and trademarks,
|
||||
so it's worth a read of the license - it isn't scary!). If you wish to
|
||||
contribute to Apache POI (which you're very welcome and encouraged to do),
|
||||
then you must agree to grant your contributions to us under the same license.
|
||||
</p>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
<anchor id="WhereHelpNeeded"/>
|
||||
<section><title>Where is help needed on the project?</title>
|
||||
<p>There are a lot of open issues in Bugzilla and TODOs in the code. Please see
|
||||
the section below for more on these. Get in touch using our mailing lists if you want
|
||||
to volunteer.</p>
|
||||
</section>
|
||||
|
||||
<anchor id="GetInvolved"/>
|
||||
<section><title>I just want to get involved, but don't know where to start?</title>
|
||||
<ul>
|
||||
<li>Read the rest of the website, understand what POI is and what it does,
|
||||
the project vision, etc.</li>
|
||||
<li>Use POI a bit, look for gaps in the documentation and examples.</li>
|
||||
<li>Join the <a href="site:mailinglists">mailing lists</a> and share your knowledge with others.</li>
|
||||
<li>Get <a href="site:subversion">Subversion</a> and check out the POI source tree</li>
|
||||
<li>Documentation is always the best place to start contributing. If you found that the documentation would make more sense if it just told you how to do X, modify the documentation.</li>
|
||||
<li>Contribute examples - if there's something people are often asking about on the <a href="site:mailinglists">user list</a> which isn't covered in the documentation or current examples, try writing an example of this and uploading it as a patch.</li>
|
||||
<li>Get used to building POI, you'll be doing it a lot, be one with the build, know its targets, etc.</li>
|
||||
<li>Write Unit Tests. Great way to understand POI. Look for classes that aren't tested, or aren't tested on a public/protected method level, start there.</li>
|
||||
<li>Download the file format documentation from Microsoft -
|
||||
<a href="https://msdn.microsoft.com/en-us/library/cc313105%28v=office.12%29.aspx">OLE2 Binary
|
||||
File Formats</a> or
|
||||
<a href="https://ecma-international.org/publications-and-standards/standards/ecma-376/">OOXML XML File Formats</a></li>
|
||||
<li>Submit patches (see below) of your contributions, modifications.</li>
|
||||
<li>Check the <a href="https://issues.apache.org/bugzilla/buglist.cgi?product=POI">bug database</a> for simple problem reports, and write a patch to fix the problem</li>
|
||||
<li>Review existing patches in the <a href="http://issues.apache.org/bugzilla/buglist.cgi?product=POI">bug database</a>, and report if they still apply, if they need unit tests etc.</li>
|
||||
<li>Take a look at all the <a href="https://issues.apache.org/bugzilla/buglist.cgi?product=POI;bug_status=NEW;bug_status=NEEDINFO">unresolved issues in the bug database</a>, and see if you can help with testing or patches for them</li>
|
||||
<li>Add in new features, see <a href="https://issues.apache.org/bugzilla/buglist.cgi?product=POI">Bug database</a> for suggestions.</li>
|
||||
</ul>
|
||||
|
||||
<p>The Apache <a href="https://infra.apache.org/contributors.html">Contributors Tech Guide</a> gives a good overview of how to start contributing patches.</p>
|
||||
|
||||
<p>The Nutch project also have a very useful guide on becoming a
|
||||
new developer in their project. While it is written for their project,
|
||||
a large part of it will apply to POI too. You can read it at
|
||||
<a href="https://wiki.apache.org/nutch/Becoming_A_Nutch_Developer">http://wiki.apache.org/nutch/Becoming_A_Nutch_Developer</a>. The
|
||||
<a href="https://community.apache.org/">Apache Community Development
|
||||
Project</a> also provides guidance and mentoring for new contributors.</p>
|
||||
</section>
|
||||
|
||||
<anchor id="SubmittingPatches"/>
|
||||
<section><title>Submitting Patches</title>
|
||||
<p>
|
||||
If you use GitHub, you can submit Pull Requests to https://github.com/apache/poi. It is probably
|
||||
a good idea to create an issue in the <a href="https://issues.apache.org/bugzilla/buglist.cgi?product=POI">Bug Database</a>
|
||||
first and reference it in the PR.
|
||||
</p>
|
||||
<p>
|
||||
For Subversion fans, you can add patch files to the Bugzilla issues at
|
||||
<a href="https://issues.apache.org/bugzilla/buglist.cgi?product=POI">Bug Database</a>.
|
||||
If there is already a bug-report, attach it there, otherwise create a new bug,
|
||||
set the subject to [PATCH] followed by a brief description.
|
||||
Explain your patch and any special instructions and submit/save it.
|
||||
Next, go back to the bug, and create attachments for the patch files you
|
||||
created. Be sure to describe not only the file's purpose, but also its format.
|
||||
(Is that a ZIP or a tgz or a bz2 or what?).
|
||||
</p>
|
||||
<p>
|
||||
Ideally, patches should be submitted early and often. This is for
|
||||
two key reasons. Firstly, it's much easier to review smaller patches
|
||||
than large ones. This means that smaller patches are much more likely
|
||||
to be applied to SVN in a timely fashion. Secondly, by sending in your
|
||||
patches earlier rather than later, it's much easier to get feedback
|
||||
on your coding and direction. If you've missed an easier way to do something,
|
||||
or are duplicating some (probably hidden) existing code, or taking things
|
||||
in an unusual direction, it's best to get the feedback sooner rather than
|
||||
later! As such, when submitting patches to POI, as with other Apache
|
||||
Software Foundation projects, do please try to submit early and often, rather
|
||||
than "throwing a large patch over the wall" at the end.
|
||||
</p>
|
||||
<p>
|
||||
A number of Apache projects provide far more comprehensive guides to producing
|
||||
and submitting patches than we do; you may wish to review some of their
|
||||
information if you're unsure. The
|
||||
<a href="https://commons.apache.org/patches.html">Apache Commons</a> one
|
||||
is fairly similar as a starting point.
|
||||
</p>
|
||||
<p>You may create your patch file using any of the following approaches (the committers recommend the first):</p>
|
||||
<section><title>Approach 1 - use Ant</title>
|
||||
<p>Use Ant to generate a patch file to POI: </p>
|
||||
<source>
|
||||
ant -f patch.xml
|
||||
</source>
|
||||
<p>
|
||||
This will create a file named <code>patch.tar.gz</code> that will contain a unified diff of files that have been modified
|
||||
and also include files that have been added. Review the file for completeness and correctness. This approach
|
||||
is recommended because it standardizes the way in which patch files are constructed. It also eliminates the
|
||||
chance of you forgetting to submit new files that form part of the patch.
|
||||
</p>
|
||||
<p>
|
||||
To apply a previously generated <code>patch.tar.gz</code> file to a clean subversion checkout, use the following command.
|
||||
It will unpack the tarball and add new files to the subversion working copy.
|
||||
</p>
|
||||
<source>
|
||||
ant -f patch.xml apply
|
||||
</source>
|
||||
</section>
|
||||
<section><title>Approach 2 - the manual way</title>
|
||||
<p>
|
||||
Patches to existing files should be generated with <code>svn diff filename</code>, with the output saved to a file.
|
||||
If you want to get the changes made to multiple files in a directory, just use <code>svn diff</code>.
|
||||
Then, tar and gzip the patch file as well as any new files that you have added.
|
||||
</p>
|
||||
<p>If you use a unix shell, you may find the following
|
||||
sequence of commands useful for building the files to attach.</p>
|
||||
<source>
|
||||
# run this in the root of the checkout, i.e. the directory holding
|
||||
# build.xml and poi.pom
|
||||
|
||||
# build the directory to hold new files
|
||||
mkdir /tmp/poi-patch/
|
||||
mkdir /tmp/poi-patch/new-files/
|
||||
|
||||
# get changes to existing files
|
||||
svn diff > /tmp/poi-patch/diff.txt
|
||||
|
||||
# capture any new files, as svn diff won't include them
|
||||
# preserve the path
|
||||
svn status | grep "^\?" | awk '{printf "cp --parents %s /tmp/poi-patch/new-files/\n", $2 }' | sh -s
|
||||
|
||||
# tar up the new files
|
||||
cd /tmp/poi-patch/new-files/
|
||||
tar jcvf ../new-files.tar.bz2 .
|
||||
cd ..
|
||||
|
||||
# upload these to bugzilla
|
||||
echo "please upload to bugzilla:"
|
||||
echo " /tmp/poi-patch/diff.txt"
|
||||
echo " /tmp/poi-patch/new-files.tar.bz2"
|
||||
</source>
|
||||
</section>
|
||||
<section><title>Approach 3 - the git way</title>
|
||||
<p>
|
||||
If you are working on a Git clone of Apache POI (see the
|
||||
<a href="site:subversion">Version Control page</a> for
|
||||
more on the read-only Git mirrors), it is possible to generate
|
||||
a patch of your changes (including new binary files) using Git.
|
||||
</p>
|
||||
<p>
|
||||
For new developers, we'd normally suggest using Subversion and
|
||||
one of the methods above, as they tend to be simpler. For people
|
||||
who are already proficient with Git, then generating a patch
|
||||
from Git can be an easy way to contribute!
|
||||
</p>
|
||||
<p>
|
||||
When generating a patch / patch set from Git, for many related and
|
||||
small changes a squashed patch is probably best, as it makes the
|
||||
(manual) review quicker. For larger changes, several distinct
|
||||
patches are probably best.
|
||||
</p>
|
||||
<p>
|
||||
If you intend to do a noticeable amount of work enhancing Apache POI
|
||||
on your own Git repo, we would suggest sending in patches early and
|
||||
asking for advice. There's nothing worse than spending a week working
|
||||
hard on your own on a change, only to discover you did something on
|
||||
Day 1 that isn't acceptable to the project meaning your whole patch
|
||||
needs re-doing... Git's offline workflow makes this easier, so try not
|
||||
to fall into that trap!
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Checklist before submitting a patch</title>
|
||||
<ul>
|
||||
<li>Added code complies with <a href="#CodeStyle">coding standards</a>.</li>
|
||||
<li>Added code compiles and runs on Java 1.8 and preferably newer versions.</li>
|
||||
<li>New java files begin with the <a href="https://www.apache.org/foundation/license-faq.html">
|
||||
Apache Software License</a> statement.</li>
|
||||
<li>The code does not depend on code that is unlicensed or
|
||||
<a href="https://www.apache.org/legal/resolved.html#category-a">incompatibly licensed with ASL 2.0</a>.
|
||||
<a href="https://www.apache.org/licenses/GPL-compatibility.html">GPL</a> and LGPL code may not be used.</li>
|
||||
<li>The code does not include <code>@author</code> tags.</li>
|
||||
<li>Existing test cases succeed.</li>
|
||||
<li>New test cases written and succeed (Use <code>@Disabled</code> from <code>org.junit.jupiter.api</code> for in-progress work).</li>
|
||||
<li>Documentation page extended as appropriate.</li>
|
||||
<li>Examples updated or added as appropriate.</li>
|
||||
<li>Diff files generated using <code>svn diff</code>.</li>
|
||||
<li>Newly added files are included in the patch or alongside the patch.</li>
|
||||
<li>The <a href="https://bz.apache.org/bugzilla/describecomponents.cgi?product=POI">bugzilla</a> subject contains [PATCH], the task name and the patch reason.</li>
|
||||
<li>The bugzilla description contains a rationale for the patch.</li>
|
||||
<li>Attachment to the bugzilla entry contains the patch file.</li>
|
||||
</ul>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
<anchor id="CodeStyle"/>
|
||||
<section><title>Code Style</title>
|
||||
<p>The long standing
|
||||
<a href="site:res001">Minimal
|
||||
Coding Standards</a> from 2002 still largely apply to the project.</p>
|
||||
<p>When making changes to an existing file, please try to follow the
|
||||
same style that that file already uses. This will keep things
|
||||
looking similar, and will prevent patches becoming largely about
|
||||
whitespace. Whitespace fixing changes, if needed, should normally be
|
||||
in their own commit, so that they don't crowd out coding changes
|
||||
in review.</p>
|
||||
<p>Normally, tabs should not be used to indent code. Instead, spaces
|
||||
should be used. If starting on a fresh file, please use 4 spaces to
|
||||
indent your code. If working on an existing file, please use
|
||||
whichever of 3 or 4 spaces that file already follows.</p>
|
||||
<p>Normally, braces should open on the same line as the decision
|
||||
statement. Braces should normally close on their own line. Brackets
|
||||
should normally have a space before them when they are the first.</p>
|
||||
<p>Lines normally shouldn't be too long. There's no hard and fast rule,
|
||||
but if your line is getting above about 90 characters think about
|
||||
splitting it, and you should rarely create something over about 100
|
||||
characters without a very good reason!</p>
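<p>As a rough illustration of the conventions above, here is a small class written with 4-space indentation, braces that open on the same line as the statement and close on their own line, and lines kept under about 90 characters. The class <code>LineLengthStyleSketch</code> and its thresholds are made up for this example, mirroring the approximate figures mentioned above; it is not part of POI.</p>
<source><![CDATA[
```java
final class LineLengthStyleSketch {

    // Braces open on the same line as the declaration or decision.
    static String classify(int lineLength) {
        if (lineLength > 100) {
            return "too long";
        }
        if (lineLength > 90) {
            return "consider splitting";
        }
        return "ok";
    }

    public static void main(String[] args) {
        // A long expression is split over several lines, not left wide.
        String report = "80 -> " + classify(80)
                + ", 95 -> " + classify(95)
                + ", 120 -> " + classify(120);
        System.out.println(report);
    }
}
```
]]></source>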
|
||||
</section>
|
||||
|
||||
<anchor id="Mentoring"/>
|
||||
<section><title>Mentoring and Committership</title>
|
||||
<p>The POI project will generally offer committership to contributors who send
|
||||
in consistently good patches over a period of several months.</p>
|
||||
<p>The requirement for "good patches" generally means patches which can be applied
|
||||
to SVN with little or no changes. These patches should include unit tests, and
|
||||
appropriate documentation. Whilst your first patch to POI may require quite a
|
||||
bit of work before it can be committed by an existing committer, with any luck
|
||||
your later patches will be applied with no / minor tweaks. Please do take note
|
||||
of any changes required by your earlier patches, to learn for later ones! If
|
||||
in doubt, ask on the <a href="site:mailinglists">dev mailing list</a>.</p>
|
||||
<p>The requirement for patches over several months is to ensure that committers
|
||||
remain with the project. It's very easy for a good developer to fire off half
|
||||
a dozen good patches in the couple of weeks that they're working on a POI
|
||||
powered project. However, if that developer then moves away, and stops
|
||||
contributing to POI after that spurt, then they're not a good candidate for
|
||||
committership. As such, we generally require people to stay around for a while,
|
||||
submitting patches and helping on the mailing list before considering them
|
||||
for committership.</p>
|
||||
<p>Where possible, patches should be submitted early and often. For more details
|
||||
on this, please see the "Submitting Patches" section above.</p>
|
||||
|
||||
<p>Where possible, the existing developers will try to help and mentor new
|
||||
contributors. However, everyone involved in POI is a volunteer, and it may
|
||||
happen that your first few patches come in at a time when all the committers
|
||||
are very busy. Do please have patience, and remember to use the
|
||||
<a href="site:mailinglists">dev mailing list</a> so that other
|
||||
contributors can assist you!</p>
|
||||
<p>For more information on getting started at Apache, mentoring, and local
|
||||
Apache Committers near you who can offer advice, please see the
|
||||
<a href="http://community.apache.org/">Apache Community Development
|
||||
Project</a> website.</p>
|
||||
</section>
|
||||
|
||||
<anchor id="FileFormatInformation"/>
|
||||
<section><title>File Format Information</title>
|
||||
<section><title>Publicly Available Information on the file formats</title>
|
||||
<p>
|
||||
In early 2008, Microsoft made a fairly complete set of documentation
|
||||
on the binary file formats freely and publicly available. These were
|
||||
released under the <a href="https://msdn.microsoft.com/en-us/openspecifications/default">
|
||||
Open Specification Promise</a>, which does allow us to use them for
|
||||
building open source software under the <a
|
||||
href="https://www.apache.org/foundation/license-FAQ.html">
|
||||
Apache Software License</a>.
|
||||
</p>
|
||||
<p>
|
||||
You can download the documentation on Excel, Word, PowerPoint and
|
||||
Escher (drawing) from
|
||||
<a href="https://msdn.microsoft.com/en-us/library/cc313118.aspx">http://msdn.microsoft.com/en-us/library/cc313118.aspx</a>.
|
||||
Documentation on a few of the supporting technologies used in these
|
||||
file formats can be downloaded from
|
||||
<a href="https://msdn.microsoft.com/en-us/library/jj633110.aspx">http://msdn.microsoft.com/en-us/library/jj633110.aspx</a>.
|
||||
</p>
|
||||
<p>
|
||||
For the VSDX format (implemented in Apache POI as XDGF), an
|
||||
<a href="https://msdn.microsoft.com/en-us/library/office/jj228622.aspx">introduction
|
||||
is available from Microsoft</a>, and full details are available
|
||||
<a href="https://msdn.microsoft.com/en-us/library/office/jj684209(v=office.15).aspx">here</a>
|
||||
and
|
||||
<a href="https://msdn.microsoft.com/en-us/library/hh645006(v=office.12).aspx">here</a>.
|
||||
</p>
|
||||
<p>
|
||||
Previously, Microsoft published a book on the Excel 97 file format.
|
||||
It can still be of plenty of use, and is handy in dead tree form. Pick up
|
||||
a copy of "Excel 97 Developer's Kit" from your favourite second hand
|
||||
book store.
|
||||
</p>
|
||||
<p>
|
||||
The newer Office Open XML (ooxml) file formats are documented as part
|
||||
of the ECMA / ISO standardisation effort for the formats. This
|
||||
documentation is quite large, but you can normally find the bit you
|
||||
need without too much effort! This can be downloaded from
|
||||
<a href="https://ecma-international.org/publications-and-standards/standards/ecma-376/">https://ecma-international.org/publications-and-standards/standards/ecma-376/</a>,
|
||||
and is also under the
|
||||
<a href="https://msdn.microsoft.com/en-us/openspecifications/default">OSP</a>.
|
||||
</p>
|
||||
<p>
|
||||
Additionally for the newer Office Open XML (ooxml) file formats, you can
|
||||
find some good introductory documentation (often clearer for getting
|
||||
started with) at <a href="http://officeopenxml.com/">officeopenxml.com</a>,
|
||||
which is an independent site documenting the file formats.
|
||||
</p>
|
||||
<p>
|
||||
It is also worth checking the documentation and code of the other
|
||||
open source implementations of the file formats.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>I just signed an NDA to get a spec from Microsoft and I'd like to contribute</title>
|
||||
<p>
|
||||
In short, stay away, stay far far away. Implementing these file formats
|
||||
in POI is done strictly by using public information. Most of this Public
|
||||
Information currently comes from the documentation that Microsoft
|
||||
makes freely available (see above). The rest of the public information
|
||||
includes sources from other open source projects, books that state the
|
||||
intended purpose is to allow implementation of the file format and
|
||||
do not require any non-disclosure agreement and just hard work.
|
||||
We are intent on keeping it legal; by contributing patches you agree to
|
||||
do the same.
|
||||
</p>
|
||||
<p>
|
||||
If you've ever received information regarding the OLE 2 Compound Document
|
||||
Format under any type of exclusionary agreement from Microsoft, or
|
||||
received such information from a person bound by such an agreement, you
|
||||
cannot participate in this project. Sorry. Well, unless you can persuade
|
||||
Microsoft to release you from the terms of the NDA on the grounds that
|
||||
most of the information is now publicly available. However, if you have
|
||||
been party to a Microsoft NDA, you will need to get clearance from Microsoft
|
||||
before contributing.
|
||||
</p>
|
||||
<p>
|
||||
Those submitting patches that show insight into the file format may be
|
||||
asked to state explicitly that they have only ever read the publicly
|
||||
available file format information, and not any received under an NDA
|
||||
or similar, and have only made use of the public documentation.
|
||||
</p>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
</body>
|
||||
<footer>
|
||||
<legal>
|
||||
Copyright (c) @year@ The Apache Software Foundation. All rights reserved.
|
||||
<br />
|
||||
Apache POI, POI, Apache, the Apache feather logo, and the Apache
|
||||
POI project logo are trademarks of The Apache Software Foundation.
|
||||
</legal>
|
||||
</footer>
|
||||
</document>
|
||||
2255
src/documentation/content/xdocs/devel/history/changes-3x.xml
Normal file
326
src/documentation/content/xdocs/devel/history/changes-pre3x.xml
Normal file
@ -0,0 +1,326 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE changes PUBLIC "-//APACHE//DTD Changes POI//EN" "changes-poi.dtd">
|
||||
|
||||
<changes>
|
||||
<contexts>
|
||||
<context id="OOXML" title="OOXML"/>
|
||||
<context id="OPC" title="OPC"/>
|
||||
<context id="POI_Overall" title="POI Overall"/>
|
||||
<context id="HSSF" title="Horrible SpreadSheet Format"/>
|
||||
<context id="XSSF" title="ooXml SpreadSheet Format"/>
|
||||
<context id="SXSSF" title="Streaming ooXml SpreadSheet Format"/>
|
||||
<context id="SS_Common" title="SpreadSheet Common"/>
|
||||
<context id="HSLF" title="Horrible SlideShow Format"/>
|
||||
<context id="XSLF" title="ooXml SlideShow Format"/>
|
||||
<context id="SL_Common" title="SlideShow Common"/>
|
||||
<context id="HWPF" title="Horrible WordProcessor Format"/>
|
||||
<context id="XWPF" title="ooXml WordProcessor Format"/>
|
||||
<context id="HDF" title="Horrible Document Format"/>
|
||||
<context id="HPSF" title="Horrible PropertySet Format"/>
|
||||
<context id="HDGF" title="Horrible Dreadful Graph Format"/>
|
||||
<context id="XDGF" title="ooXml Dreadful Graph Format"/>
|
||||
<context id="DDF" title="Dreadful Drawing Format"/>
|
||||
<context id="XDDF" title="ooXml Dreadful Drawing Format"/>
|
||||
<context id="HMEF" title="Horrible Mail Encoder Format"/>
|
||||
<context id="HSMF" title="Horrible Senseless Format"/>
|
||||
<context id="HPBF" title="Horrible Peep Book Format"/>
|
||||
<context id="HWMF" title="Horrible Wannabe Metafile Format"/>
|
||||
<context id="HEMF" title="Horrible Ermahgerd Metafile Format"/>
|
||||
<context id="POIFS" title="Poor Obfuscation Implementation FileSystem"/>
|
||||
</contexts>
|
||||
|
||||
<section id="current_release">
|
||||
<title>Current releases</title>
|
||||
<p>The change log for the <a href="site:changes">current release</a> can be found in the home section.</p>
|
||||
</section>
|
||||
|
||||
|
||||
<release version="2.5.1-FINAL" date="2004-02-29">
|
||||
<actions>
|
||||
<action type="add" context="HSSF">Outlining support</action>
|
||||
<action type="fix" fixes-bug="27574" context="HSSF">HSSFDateUtil.getExcelDate() is one hour off when DST changes</action>
|
||||
<action type="fix" fixes-bug="26465" context="HSSF">wrong lastrow entry</action>
|
||||
<action type="fix" fixes-bug="28203" context="HSSF">Unable to open read-write excel file including forms</action>
|
||||
</actions>
|
||||
</release>
|
||||
|
||||
<release version="2.5-FINAL" date="2004-02-29">
|
||||
<actions>
|
||||
<action type="add" context="DDF">Add support for the Escher file format</action>
|
||||
<action type="fix" fixes-bug="27005" context="HSSF">java.lang.IndexOutOfBoundsException during Workbook.cloneSheet()</action>
|
||||
</actions>
|
||||
</release>
|
||||
|
||||
<release version="2.0-FINAL" date="2004-01-26">
|
||||
<actions>
|
||||
<action type="update" context="POI_Overall">No changes</action>
|
||||
</actions>
|
||||
</release>
|
||||
|
||||
<release version="2.0-RC2" date="2004-01-11">
|
||||
<actions>
|
||||
<action type="fix" fixes-bug="25695" context="HSSF">HSSFCell.getStringCellValue() on a cell which has a string formula will return byte-swapped unicode characters</action>
|
||||
<action type="fix" context="POI_Overall">Updated website for upcoming release</action>
|
||||
<action type="fix" fixes-bug="25457" context="HSSF">Formula Parser fixes with tests, by Peter M Murray Bug 25457</action>
|
||||
<action type="fix" context="HSSF">Fixed cloning merge regions</action>
|
||||
<action type="fix" context="HSSF">The cloned reference for merged cells did not create a new collection, so deletes cascaded to the original</action>
|
||||
<action type="fix" fixes-bug="24519" context="HSSF">Call to getCustomPalette() from a newly created workbook now works</action>
|
||||
<action type="fix" fixes-bug="24397" context="POI_Overall">Some compilations hit ambiguous class references; the classes are now imported explicitly. Patch supplied by Jean-Pierre Paris</action>
|
||||
</actions>
|
||||
</release>
|
||||
|
||||
<release version="2.0-RC1" date="2003-11-02">
|
||||
<actions>
|
||||
<action type="fix" fixes-bug="12561" context="HSSF" importance="minor">HSSFWorkbook throws Exceptions</action>
|
||||
<action type="fix" fixes-bug="12730" context="HSSF" importance="normal">values don't get copied to another sheet</action>
|
||||
<action type="fix" fixes-bug="13224" context="POI_Overall" importance="major">Exception thrown when cell has =Names call</action>
|
||||
<action type="fix" fixes-bug="13796" context="HSSF" importance="normal">Error Reading Formula Record (optimized if, external link)</action>
|
||||
<action type="fix" fixes-bug="13921" context="HSSF" importance="normal">Sheet name cannot exceed 31 characters and cannot contain :</action>
|
||||
<action type="fix" fixes-bug="14330" context="HSSF" importance="normal">Error reading FormulaRecord</action>
|
||||
<action type="fix" fixes-bug="14460" context="HSSF" importance="normal">Name in Formula - ArrayOutOfBoundsException</action>
|
||||
<action type="fix" fixes-bug="15228" context="HSSF" importance="critical">ArrayIndexoutofbounds Exception. POI - Version 1.8</action>
|
||||
<action type="fix" fixes-bug="16488" context="HSSF" importance="major">Unable to open written spreadsheet in Excel, but can in Open</action>
|
||||
<action type="fix" fixes-bug="16559" context="HSSF" importance="normal">testCustomPalette.xls crashes Excel 97</action>
|
||||
<action type="fix" fixes-bug="16560" context="HSSF" importance="normal">testBoolErr.xls crashes Excel '97</action>
|
||||
<action type="fix" fixes-bug="17374" context="HSSF" importance="minor">HSSFFont - BOLDWEIGHT_NORMAL</action>
|
||||
<action type="fix" fixes-bug="18800" context="HSSF" importance="major">The sheet made by HSSFWorkbook#cloneSheet() doesn't work cor</action>
|
||||
<action type="fix" fixes-bug="18846" context="POI_Overall" importance="minor">[RFE]Refactor the transformation between byte array a</action>
|
||||
<action type="fix" fixes-bug="19599" context="HSSF" importance="minor">java.lang.IllegalArgumentException</action>
|
||||
<action type="fix" fixes-bug="19961" context="HSSF" importance="normal">Sheet.getColumnWidth() returns wrong value</action>
|
||||
<action type="fix" fixes-bug="21066" context="HSSF" importance="blocker">Can not modify a blank spreadsheet</action>
|
||||
<action type="fix" fixes-bug="21444" context="HSSF" importance="enhancement">Macro functions</action>
|
||||
<action type="fix" fixes-bug="21447" context="HSSF" importance="normal">[RFE]String Formula Cells</action>
|
||||
<action type="fix" fixes-bug="21674" context="HSSF" importance="enhancement">Documentation changes for @(Greater|Less|Not)EqualPt</action>
|
||||
<action type="fix" fixes-bug="21863" context="POI_Overall" importance="enhancement">build.xml fixes</action>
|
||||
<action type="fix" fixes-bug="22195" context="POIFS" importance="normal">[RFE] Support for Storage Class ID</action>
|
||||
<action type="fix" fixes-bug="22742" context="HSSF" importance="critical">Failed to create HSSFWorkbook!</action>
|
||||
<action type="fix" fixes-bug="22922" context="HSSF" importance="critical">HSSFSheet.shiftRows() throws java.lang.IndexOutOfBoundsExcep</action>
|
||||
<action type="fix" fixes-bug="22963" context="HPSF" importance="normal">org.apache.poi.hpsf.SummaryInformation.getEditTime() should</action>
|
||||
<action type="fix" fixes-bug="24149" context="POIFS" importance="major">Error passing inputstream to POIFSFileSystem</action>
|
||||
<action type="fix" fixes-bug="21722" context="HSSF" importance="normal">Add a ProtectRecord to Sheets and give control over</action>
|
||||
<action type="fix" fixes-bug="9576" context="HSSF" importance="normal">DBCELL, INDEX EXTSST (was Access 97 import)</action>
|
||||
<action type="fix" fixes-bug="13478" context="POIFS" importance="blocker">[RFE] POIFS, RawDataBlock: Missing workaround for lo</action>
|
||||
<action type="fix" fixes-bug="14824" context="HSSF" importance="normal">Unable to modify empty sheets</action>
|
||||
<action type="fix" fixes-bug="12843" context="HSSF" importance="critical">Make POI handle Chinese better</action>
|
||||
<action type="fix" fixes-bug="15353" context="HSSF" importance="normal">[RFE] creating a cell with a hyperlink</action>
|
||||
<action type="fix" fixes-bug="15375" context="HSSF" importance="blocker">Post 1.5.1 POI causes spreadsheet to become unopenable</action>
|
||||
</actions>
|
||||
</release>
|
||||
|
||||
<release version="2.0-pre3" date="2003-07-29">
|
||||
<actions>
|
||||
<action type="add" context="HPSF">HPSF is now able to read properties which are given in the property set stream but which don't have a value ("variant" type VT_EMPTY). The getXXX() methods of the PropertySet class return null if their return type is a reference (like a string) or 0
|
||||
if the return type is numeric. Details about the return types and about how to distinguish between a property value of zero and a property value that is not present can be found in the API documentation</action>
|
||||
<action type="fix" context="HSSF">Gridlines can now be turned on and off</action>
|
||||
<action type="fix" context="HSSF">NamePTG refactoring/fixes</action>
|
||||
<action type="fix" context="HSSF">minor fixes to ExternSheet and formula strings</action>
|
||||
<action type="fix" context="HSSF">Sheet comparisons now ignore case</action>
|
||||
</actions>
|
||||
</release>
|
||||
|
||||
<release version="2.0-pre2" date="2003-07-06">
|
||||
<actions>
|
||||
<action type="fix" context="HSSF">A nasty concurrency problem has been fixed. Any users working in a multithreaded environment should seriously consider upgrading to this release</action>
|
||||
<action type="update" context="HSSF">The EXTSST record has been implemented. This record is used by Excel for optimized reading of strings</action>
|
||||
<action type="update" context="HSSF">When rows are shifted, the merged regions now move with them. If a row contains 2 merged cells, the resulting shifted row should have those cells merged as well</action>
|
||||
<action type="fix" context="HSSF">There were some issues when removing merged
|
||||
regions (specifically, removing all of them and then adding some more); these have been resolved.
|
||||
</action>
|
||||
<action type="fix" context="HSSF">When a sheet contained shared formulas (when a formula is
|
||||
dragged across greater than 6 cells), the clone would fail. We now support cloning of
|
||||
sheets that contain this Excel optimization.
|
||||
</action>
|
||||
<action type="add" context="HSSF">Support added for reading formulas with UnaryPlus and UnaryMinus operators</action>
|
||||
</actions>
|
||||
</release>
|
||||
<release version="2.0-pre1" date="2003-05-17">
|
||||
<actions>
|
||||
<action type="add" context="HSSF">Applied the provided patch for deep cloning of worksheets</action>
|
||||
<action type="add" context="HSSF">Patch applied to allow sheet reordering</action>
|
||||
<action type="add" context="HSSF">Added additional print area setting methods using row/column numbers</action>
|
||||
<action type="fix" context="HDF">Negative Array size fix</action>
|
||||
<action type="update" context="HSSF">Added argument pointers to support the IF formula</action>
|
||||
<action type="update" context="HSSF">Formulas: Added special character support for string literals, specifically for SUMIF formula support; this also addresses a bug</action>
|
||||
<action type="fix" context="POIFS">BlockingInputStream committed to help ensure reads</action>
|
||||
<action type="fix" context="HSSF">Fixed problem with NaN values differing from the investigated value from file reads in FormulaRecords</action>
|
||||
<action type="fix" context="HSSF">Patch for getColumnWidth in HSSF</action>
|
||||
<action type="add" context="HDF">Patch for dealing with multi-level numbered lists in HDF</action>
|
||||
<action type="fix" context="HSSF">Due to named reference work, several named-ranged bugs were closed</action>
|
||||
<action type="fix" context="HSSF">Patch applied to prevent sheet corruption after a template modification</action>
|
||||
<action type="update" context="HSSF">Shared Formulas now Supported</action>
|
||||
<action type="update" context="HSSF">Added GreaterEqual, LessEqual and NotEqual to Formula Parser</action>
|
||||
<action type="update" context="HSSF">Added GreaterThan and LessThan functionality to formulas</action>
|
||||
<action type="fix" context="POI_Overall">Patches for i10n</action>
|
||||
<action type="update" context="POI_Overall">POI Build System Updated</action>
|
||||
<action type="fix" context="HSSF">font names can now be null</action>
|
||||
</actions>
|
||||
</release>
|
||||
<release version="1.10-dev" date="2003-02-19">
|
||||
<actions>
|
||||
<action type="add" context="HSSF">Support for zoom level</action>
|
||||
<action type="add" context="HSSF">Freeze and split pane support</action>
|
||||
<action type="add" context="HSSF">Row and column headers on printouts</action>
|
||||
</actions>
|
||||
</release>
|
||||
<release version="1.8-dev" date="2002-09-20">
|
||||
<actions>
|
||||
<action type="add" context="HSSF">Custom Data Format Support</action>
|
||||
<action type="add" context="HSSF">Enhanced Unicode Support for Russian and Japanese</action>
|
||||
<action type="add" context="HSSF">Enhanced formula support including read-only for
|
||||
"optimized if" statements.
|
||||
</action>
|
||||
<action type="add" context="HSSF">Support for cloning objects</action>
|
||||
<action type="add" context="HSSF">Fixes for header/footer</action>
|
||||
<action type="add" context="POI_Overall">Spanish Documentation translations</action>
|
||||
<action type="add" context="HSSF">Support for preserving VBA macros</action>
|
||||
</actions>
|
||||
</release>
|
||||
<release version="1.7-dev" date="Release date not recorded">
|
||||
<actions>
|
||||
<action type="update" context="POI_Overall">Removed runtime dependency on commons logging</action>
|
||||
<action type="update" context="HSSF">Formula support</action>
|
||||
</actions>
|
||||
</release>
|
||||
<release version="1.5.1" date="2002-06-16">
|
||||
<actions>
|
||||
<action type="update" context="POI_Overall">Removed dependency on commons logging. Now define poi.logging system property to enable logging to standard out</action>
|
||||
<action type="fix" context="HSSF">Fixed SST string handling so that spreadsheets with rich text or extended text will be read correctly</action>
|
||||
</actions>
|
||||
</release>
|
||||
<release version="1.5" date="2002-05-06">
|
||||
<actions>
|
||||
<action type="update" context="POI_Overall">New project build</action>
|
||||
<action type="update" context="POI_Overall">New project documentation system based on Cocoon</action>
|
||||
<action type="update" context="POI_Overall">Package rename</action>
|
||||
<action type="fix" context="POI_Overall">Various bug fixes</action>
|
||||
<action type="add" context="HSSF">Early stages of HSSF development (not ready for development)</action>
|
||||
<action type="add" context="HSSF">Initial low level record support for charting (not complete)</action>
|
||||
</actions>
|
||||
</release>
|
||||
<release version="1.2.0" date="2002-01-19">
|
||||
<actions>
|
||||
<action type="update" context="POI_Overall">Changes not recorded</action>
|
||||
</actions>
|
||||
</release>
|
||||
<release version="1.1.0" date="2002-01-04">
|
||||
<actions>
|
||||
<action type="update" context="HSSF">Created new event model</action>
|
||||
<action type="update" context="HSSF">Optimizations made to HSSF including aggregate records for values, rows, etc.</action>
|
||||
<action type="update" context="POI_Overall">predictive sizing, offset based writing (instead of lots of array copies)</action>
|
||||
<action type="update" context="POI_Overall">minor re-factoring and bug fixes</action>
|
||||
</actions>
|
||||
</release>
|
||||
<release version="1.0.2" date="2002-01-11">
|
||||
<actions>
|
||||
<action type="update" context="POI_Overall">Changes not recorded</action>
|
||||
</actions>
|
||||
</release>
|
||||
<release version="1.0.1" date="2002-01-04">
|
||||
<actions>
|
||||
<action type="update" context="POI_Overall">Changes not recorded</action>
|
||||
</actions>
|
||||
</release>
|
||||
<release version="1.0.0" date="2001-12-30">
|
||||
<actions>
|
||||
<action type="update" context="POI_Overall">Minor documentation updates</action>
|
||||
</actions>
|
||||
</release>
|
||||
<release version="0.14.0" date="2001-12-22">
|
||||
<actions>
|
||||
<action type="update" context="HSSF">Added DataFormat helper class and exposed set and get format on HSSFCellStyle</action>
|
||||
<action type="update" context="HSSF">Fixed column width apis (unit wise) and various javadoc on the subject</action>
|
||||
<action type="update" context="HSSF">Fix for Dimensions record (again)... (one of these days I'll write a unit test for this ;-p).</action>
|
||||
<action type="update" context="HSSF">Some optimization on sheet creation</action>
|
||||
</actions>
|
||||
</release>
|
||||
<release version="0.13.0" date="2001-12-16">
|
||||
<actions>
|
||||
<action type="update" context="POI_Overall">Changes not recorded</action>
|
||||
</actions>
|
||||
</release>
|
||||
<release version="0.12.0" date="2001-12-12">
|
||||
<actions>
|
||||
<action type="update" context="HSSF">Added MulBlank, Blank, ColInfo</action>
|
||||
<action type="update" context="POI_Overall">Added log4j facility and removed all sys.out type logging</action>
|
||||
<action type="update" context="HSSF">Added support for adding fonts, styles and corresponding high level api for styling cells</action>
|
||||
<action type="update" context="HSSF">added support for changing row height, cell width and default row height/cell width.</action>
|
||||
<action type="update" context="HSSF">Added fixes for internationalization (UTF-16 should work now from HSSFCell.setStringValue, etc when the encoding is set)</action>
|
||||
<action type="update" context="HSSF">added support for adding/removing and naming sheets</action>
|
||||
</actions>
|
||||
</release>
|
||||
<release version="0.11.0" date="2001-12-08">
|
||||
<actions>
|
||||
<action type="update" context="HSSF">Bugfix release. We were throwing an exception when reading RKRecord objects.</action>
|
||||
</actions>
|
||||
</release>
|
||||
<release version="0.10.0" date="2001-12-02">
|
||||
<actions>
|
||||
<action type="update" context="HSSF">Got continuation records to work (read/write)</action>
|
||||
<action type="update" context="HSSF">Added various pre-support for formulas</action>
|
||||
<action type="update" context="POI_Overall">Massive API reorganization, repackaging</action>
|
||||
<action type="update" context="POI_Overall">Better API support for modification</action>
|
||||
</actions>
|
||||
</release>
|
||||
<release version="0.7 (and interim releases)" date="2001-11-17">
|
||||
<actions>
|
||||
<action type="update" context="HSSF">Added encoding flag to high and low level api to use utf-16
|
||||
when needed (HSSFCell.setEncoding())
|
||||
</action>
|
||||
<action type="update" context="HSSF">added read only support for Label records (which are
|
||||
reinterpreted as LabelSST when written)
|
||||
</action>
|
||||
<action type="update" context="HSSF">Broken continuation record implementation (oops)</action>
|
||||
<action type="update" context="POIFS HSSF">BiffViewer class added for validating POI and/or HSSF Output.</action>
|
||||
</actions>
|
||||
</release>
|
||||
<release version="0.6" date="2001-11-11">
|
||||
<actions>
|
||||
<action type="update" context="POIFS">Support for read/write and modify</action>
|
||||
<action type="update" context="HSSF">Read only support for MulRK records (converted to Number when writing)</action>
|
||||
</actions>
|
||||
</release>
|
||||
<release version="0.5" date="2001-11-05">
|
||||
<actions>
|
||||
<action type="update" context="POI_Overall">Changes not recorded</action>
|
||||
</actions>
|
||||
</release>
|
||||
<release version="0.4" date="2001-10-31">
|
||||
<actions>
|
||||
<action type="update" context="POI_Overall">Changes not recorded</action>
|
||||
</actions>
|
||||
</release>
|
||||
<release version="0.3" date="2001-10-26">
|
||||
<actions>
|
||||
<action type="update" context="POI_Overall">Changes not recorded</action>
|
||||
</actions>
|
||||
</release>
|
||||
<release version="0.2" date="2001-09-24">
|
||||
<actions>
|
||||
<action type="update" context="POI_Overall">Changes not recorded</action>
|
||||
</actions>
|
||||
</release>
|
||||
<release version="0.1" date="2001-08-28">
|
||||
<actions>
|
||||
<action type="update" context="POI_Overall">First ever public release</action>
|
||||
</actions>
|
||||
</release>
|
||||
|
||||
</changes>
|
||||
163
src/documentation/content/xdocs/devel/history/index.xml
Normal file
@ -0,0 +1,163 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Apache POI™ - Project History</title>
|
||||
<authors>
|
||||
<person id="AO" name="Andrew C. Oliver" email="acoliver@apache.org"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
|
||||
<section><title>Apache POI™ - The name</title>
|
||||
<p>Refer to the <a href="https://en.wikipedia.org/wiki/Apache_POI#History_and_roadmap">explanation on Wikipedia</a>
|
||||
for some folklore about how the name "POI" came into existence.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>Apache POI™ - Brief Project History</title>
|
||||
|
||||
<p>The POI project was dreamed up back around April 2001, when
|
||||
Andrew Oliver landed a short term contract to do Java-based
|
||||
reporting to Excel. He'd done this project a few times before
|
||||
and knew right where to look for the tools he needed.
|
||||
Ironically, the API he used to use had skyrocketed from around
|
||||
$300 ($US) to around $10K ($US). He figured it would take two
|
||||
people around six months to write an Excel port so he
|
||||
recommended the client fork out the $10K.
|
||||
</p>
|
||||
|
||||
<p>Around June 2001, Andrew started thinking how great it would
|
||||
be to have an open source Java tool to do this and, while he
|
||||
had some spare time, he started on the project and learned
|
||||
about OLE 2 Compound Document Format. After hitting some real
|
||||
stumpers he realized he'd need help. He posted a message to
|
||||
his local Java User's Group (JUG) and asked if anyone else
|
||||
would be interested. He lucked out and the most talented Java
|
||||
programmer he'd ever met, Marc Johnson, joined the project. He
|
||||
ran rings around Andrew at porting OLE 2 CDF and rewrote his
|
||||
skeletal code into a more sophisticated library. It took Marc
|
||||
a few iterations to get something they were happy with.
|
||||
</p>
|
||||
|
||||
<p>While Marc worked on that, Andrew ported XLS to Java, based
|
||||
on Marc's library. Several users wrote in asking to read XLS
|
||||
(not just write as had originally been planned) and one user
|
||||
had special requests for a different use for POIFS. Before
|
||||
long, the project scope had tripled. POI 1.0 was released a
|
||||
month later than planned, but with far more features. Marc
|
||||
quickly wrote the serializer framework and HSSF Serializer in
|
||||
record time and Andrew banged out more documentation and worked
|
||||
on making people aware of the project.
|
||||
</p>
|
||||
|
||||
<p> Shortly before the release, POI was fortunate to come into
|
||||
contact with Nicola -Ken- Barozzi who gave them samples for
|
||||
the HSSF Serializer and helped uncover its unfortunate bugs
|
||||
(which were promptly fixed). More recently, Ken ported most
|
||||
of the POI project documentation to XML from Andrew's crappy
|
||||
HTML docs he wrote with Star Office.
|
||||
</p>
|
||||
|
||||
<p> Around the same time as the release, Glen Stampoultzis
|
||||
joined the project. Glen was ticked off at Andrew's flippant attitude
|
||||
towards adding graphing to HSSF. Glen got so ticked off he decided to
|
||||
grab a hammer and do it himself. Glen has already become an integral
|
||||
part of the POI development community; his contributions to HSSF have
|
||||
already started making waves.
|
||||
</p>
|
||||
|
||||
<p>Somewhere in there we decided to finally submit the project
|
||||
to <a href="https://cocoon.apache.org/">The Apache
|
||||
Cocoon Project</a>, only to discover the project had
|
||||
outgrown fitting nicely into just Cocoon long ago.
|
||||
Furthermore, Andrew started eyeing other projects he'd like to
|
||||
see POI functionality added to. So it was decided to donate
|
||||
the Serializers and Generators to Cocoon, other POI
|
||||
integration components to other projects, and the POI APIs
|
||||
would become part of Jakarta. It was a bumpy road but it
|
||||
looks like everything turned out since you're reading this!
|
||||
</p>
|
||||
|
||||
<p>In early 2007, we graduated from
|
||||
<a href="https://jakarta.apache.org/">Jakarta</a>, and became
|
||||
our own Top Level Project (TLP) within Apache.</p>
|
||||
</section>
|
||||
|
||||
<!--
|
||||
<section><title>What's next for Poi</title>
|
||||
<p>First we'll tackle this from a project standpoint: Well, we
|
||||
made an offer to Microsoft and Actuate (tongue in cheek
|
||||
... well mostly) that we'd quit the project and retire if
|
||||
they'd simply write us each a really large check. I've yet to
|
||||
get a phone call or email so I'm assuming they're not going to
|
||||
pay us to go away.
|
||||
</p>
|
||||
<p>Next, we've got some work to do here at Jakarta to finish
|
||||
integrating POI into the community. Furthermore, we're
|
||||
still transitioning the Serializer to Cocoon.
|
||||
</p>
|
||||
<p>HSSF, during the 2.0 cycle, will undergo a few
|
||||
optimizations. We'll also be adding new features like a full
|
||||
implementation of Formulas and custom text formats. We're
|
||||
hoping to be able to generate smaller files by adding
|
||||
write-support for RK, MulRK and MulBlank records. I'm also
|
||||
going to work on a Cocoon 2 Generator. Currently, reading is
|
||||
not very efficient in HSSF. This is mainly because in order to
|
||||
write or modify, one needs to be able to update upstream
|
||||
pointers to downstream data. To do this you have to have
|
||||
everything between in memory. A Generator would allow SAX
|
||||
events to be processed instead. (This will be based on the low
|
||||
level structures). One of the great things about this is that,
|
||||
you'll not only have a more efficient way to read the file,
|
||||
you'll have a great way to use spreadsheets as XML data
|
||||
sources.
|
||||
</p>
|
||||
<p>The HSSF Serializer, will further separate into a general
|
||||
framework for creating serializers for other formats and the
|
||||
HSSF Serializer specific implementation. (This is largely
|
||||
already true). We'll also be adding support for features
|
||||
already supported by HSSF (styles, fonts, text formats). We're
|
||||
hoping to add support for formulas during this cycle.
|
||||
</p>
|
||||
<p>We're beginning to expand our scope yet again. If we could
|
||||
do all of this for XLS files, what about Doc files or PowerPoint
|
||||
files? We're thinking that our next component (HWPF - Manipulates
|
||||
Word Processor Format) should follow the same pattern. We're hoping
|
||||
that new blood will join the team and allow us to tackle this
|
||||
even faster (in part because POIFS is already finished). But
|
||||
maybe what we need most is you! </p>
|
||||
</section> -->
|
||||
|
||||
</body>
|
||||
<footer>
|
||||
<legal>
|
||||
Copyright (c) @year@ The Apache Software Foundation. All rights reserved.
|
||||
<br />
|
||||
Apache POI, POI, Apache, the Apache feather logo, and the Apache
|
||||
POI project logo are trademarks of The Apache Software Foundation.
|
||||
</legal>
|
||||
</footer>
|
||||
|
||||
|
||||
</document>
|
||||
168
src/documentation/content/xdocs/devel/index.xml
Normal file
@ -0,0 +1,168 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Apache POI™ - How To Build</title>
|
||||
<authors>
|
||||
<person email="user@poi.apache.org" name="Glen Stampoultzis" id="GS"/>
|
||||
<person email="tetsuya@apache.org" name="Tetsuya Kitahata" id="TK"/>
|
||||
<person email="dfisher@jmlafferty.com" name="David Fisher" id="DF"/>
|
||||
</authors>
|
||||
</header>
|
||||
<body>
|
||||
<section>
|
||||
<title>JDK Version</title>
|
||||
<p>
|
||||
POI 4.0 and later require JDK version 1.8 or later. JDK version 11 is required to compile module support.
|
||||
</p>
|
||||
<p>
|
||||
POI 3.11 and later 3.x versions require JDK version 1.6 or later.
|
||||
</p>
|
||||
<p>
|
||||
POI 3.5 to 3.10 required JDK version 1.5 or later.
|
||||
Versions prior to 3.5 required JDK 1.4+.
|
||||
</p>
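<p>
As a rough illustration only (it is not part of the POI build), a small Java check along these lines could confirm that the running JDK meets the 1.8 minimum for POI 4.0 and later. The class and method names are hypothetical; "java.specification.version" is a standard system property that reads "1.8" on Java 8 and "11" on Java 11.
</p>

```java
// Hypothetical helper for illustration only; it is not part of POI.
final class JavaVersionSketch {

    // Converts a specification version such as "1.8" or "11" to its feature number.
    static int featureVersion(String specVersion) {
        // Releases up to Java 8 use the legacy "1.x" scheme.
        String v = specVersion.startsWith("1.") ? specVersion.substring(2) : specVersion;
        int dot = v.indexOf('.');
        return Integer.parseInt(dot < 0 ? v : v.substring(0, dot));
    }

    // POI 4.0 and later need JDK 1.8 or later, as stated above.
    static boolean canBuildPoi4(String specVersion) {
        return featureVersion(specVersion) >= 8;
    }

    public static void main(String[] args) {
        String spec = System.getProperty("java.specification.version");
        System.out.println("Java " + featureVersion(spec)
                + (canBuildPoi4(spec) ? " meets" : " does not meet")
                + " the JDK 1.8 minimum for POI 4.0 and later");
    }
}
```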
|
||||
</section>
|
||||
<section>
|
||||
<title>Install Apache Forrest</title>
|
||||
<p>
|
||||
The POI build system requires
|
||||
<a href="https://forrest.apache.org/">Apache Forrest</a>
|
||||
to build the documentation.
|
||||
</p>
|
||||
<p>
|
||||
Specifically, the build has been tested to work with Forrest 0.9. When building with Forrest,
|
||||
it is recommended to use Java 8.
|
||||
</p>
|
||||
<p>
|
||||
Remember to set the FORREST_HOME environment variable.
|
||||
</p>
|
||||
</section>
|
||||
<section>
|
||||
<title>Building Targets with Gradle</title>
|
||||
<p>
|
||||
The main Apache POI build was traditionally done with <a href="https://ant.apache.org/">Apache Ant</a>.
|
||||
In 2021, we moved to using <a href="https://gradle.org/">Gradle</a>.
|
||||
After <a href="subversion.html">checking out</a> the POI code, you will find <strong>gradlew</strong> and
|
||||
<strong>gradlew.bat</strong>. These command files are used for running Gradle on Linux/Mac and Windows respectively.
|
||||
Gradlew checks whether you have the right version of Gradle installed and installs it if you don't.
|
||||
</p>
|
||||
<p>
|
||||
Note that our source releases no longer contain gradlew or gradlew.bat. You can install the Gradle tool
|
||||
yourself and use it to build POI.
|
||||
</p>
|
||||
<p>
|
||||
The main targets of interest to our users are:
|
||||
</p>
|
||||
<table>
|
||||
<tr>
|
||||
<th>Gradle Target</th>
|
||||
<th>Description</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>clean</td>
|
||||
<td>Erase all build work products (i.e. everything in the
build directory)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>test</td>
|
||||
<td>Run all unit tests from main, ooxml and scratchpad</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>jar</td>
|
||||
<td>Produce jar files</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>jenkins</td>
|
||||
<td>
|
||||
Runs the same tests that Jenkins, our Continuous Integration system, runs. This includes the unit tests and various code quality checks.
It also packages up the jars and builds distributions.
|
||||
</td>
|
||||
</tr>
|
||||
</table>
|
||||
<p>
|
||||
To run the tests from just one test class, use a command like one of the following (the first form is for Linux/Mac, the second for Windows):
|
||||
</p>
|
||||
<p>
|
||||
./gradlew poi-ooxml:test --tests *TestXSSFBugs
|
||||
</p>
|
||||
<p>
|
||||
gradlew poi-ooxml:test --tests *TestXSSFBugs
|
||||
</p>
|
||||
<p>
|
||||
The example command runs tests in the poi-ooxml sub-project that match the name '*TestXSSFBugs'.
|
||||
The '*' wildcard is useful to avoid typing the full Java package name.
|
||||
</p>
|
||||
</section>
|
||||
<section>
|
||||
<title>Working with Eclipse</title>
|
||||
<p>
|
||||
Apache POI no longer includes a pre-defined Eclipse project file. When importing the POI project,
|
||||
your IDE should recognise that there is Gradle support and offer to do the build using that.
|
||||
</p>
|
||||
<p>
|
||||
First make sure that Java is set up properly and that you can execute the 'javac' executable in your shell.
|
||||
</p>
|
||||
<p>
|
||||
Next, open Eclipse and create either a local SVN repository, or a copy of the Git repository,
|
||||
and import the project into Eclipse.
|
||||
</p>
|
||||
<p>
|
||||
Note: when executing junit tests from within Eclipse, you might need to set the system
|
||||
property "POI.testdata.path" to the actual location of the 'test-data' directory to make
|
||||
the test framework find the required test-files. A simple value of 'test-data' usually works.
|
||||
</p>
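<p>
The lookup described in the note can be sketched as follows. This is a minimal illustration, not POI's actual test framework code: the class TestDataLocator is hypothetical, and only the property name "POI.testdata.path" and the fallback value 'test-data' come from the text above.
</p>

```java
import java.io.File;

// Hypothetical helper for illustration; POI's own test code may resolve this differently.
final class TestDataLocator {

    // Property name taken from the note above.
    static final String PROPERTY = "POI.testdata.path";

    // Returns the configured directory, or the relative 'test-data' default if unset.
    static File resolve() {
        String path = System.getProperty(PROPERTY, "test-data");
        return new File(path);
    }

    public static void main(String[] args) {
        System.out.println("Test data directory: " + resolve().getAbsolutePath());
    }
}
```

<p>
In an IDE run configuration, a system property like this is normally passed as a JVM argument, for example -DPOI.testdata.path=test-data.
</p>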
|
||||
</section>
|
||||
<section>
|
||||
<title>Working with IntelliJ IDEA</title>
|
||||
<p>
|
||||
Import the Gradle project into your IDE. Execute a build to get all the dependencies and generated code
|
||||
in place.
|
||||
</p>
|
||||
<p>
|
||||
Note: when executing junit tests from within IntelliJ, you might need to set the system
|
||||
property "POI.testdata.path" to the actual location of the 'test-data' directory to make
|
||||
the test framework find the required test-files. A simple value of 'test-data' usually works.
|
||||
</p>
|
||||
</section>
|
||||
<section>
|
||||
<title>Setting environment variables</title>
|
||||
<p>Linux:
|
||||
<a href="https://help.ubuntu.com/community/EnvironmentVariables">help.ubuntu.com</a>,
|
||||
<a href="http://unix.stackexchange.com/questions/117467/how-to-permanently-set-environmental-variables">unix.stackexchange.com</a>
|
||||
</p>
|
||||
<p>Windows:
|
||||
<a href="https://en.wikipedia.org/wiki/Environment_variable#DOS.2C_OS.2F2_and_Windows">en.wikipedia.org</a>
|
||||
</p>
|
||||
</section>
|
||||
</body>
|
||||
<footer>
|
||||
<legal>
|
||||
Copyright (c) @year@ The Apache Software Foundation. All rights reserved.
|
||||
<br />
|
||||
Apache POI, POI, Apache, the Apache feather logo, and the Apache
|
||||
POI project logo are trademarks of The Apache Software Foundation.
|
||||
</legal>
|
||||
</footer>
|
||||
</document>
|
||||
|
||||
|
||||
57
src/documentation/content/xdocs/devel/nightly.xml
Normal file
@ -0,0 +1,57 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Apache POI™ - Nightly Builds</title>
|
||||
</header>
|
||||
<body>
|
||||
<section id="nightly">
|
||||
<title>Nightly Builds</title>
|
||||
<p>The POI nightly builds are run on the <a href="https://ci-builds.apache.org/job/POI/">Jenkins</a>
|
||||
continuous integration server.<br/>
|
||||
<strong>These builds should not be used in production</strong>: they are mostly intended for use by
|
||||
developers to help with resolving bugs and evaluating new features, or by users who want to try out the
|
||||
latest version.
|
||||
</p>
|
||||
<ul>
|
||||
<li>
|
||||
<a href="https://ci-builds.apache.org/job/POI/job/POI-DSL-1.8/lastSuccessfulBuild/artifact/build/dist/">
|
||||
Last Successful Jenkins build for POI-trunk
|
||||
</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="https://sonarcloud.io/dashboard?id=poi-parent">Sonar statistics for the nightly</a>
|
||||
</li>
|
||||
</ul>
|
||||
</section>
|
||||
</body>
|
||||
<footer>
|
||||
<legal>
|
||||
Copyright (c) @year@ The Apache Software Foundation. All rights reserved.
|
||||
<br />
|
||||
Apache POI, POI, Apache, the Apache feather logo, and the Apache
|
||||
POI project logo are trademarks of The Apache Software Foundation.
|
||||
</legal>
|
||||
</footer>
|
||||
</document>
|
||||
|
||||
|
||||
74
src/documentation/content/xdocs/devel/plan/index.xml
Normal file
@ -0,0 +1,74 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Planning Documentation</title>
|
||||
<subtitle>Overview</subtitle>
|
||||
<authors>
|
||||
<person name="David Crossley" email="crossley@apache.org"/>
|
||||
<person name="Nicola Ken Barozzi" email="barozzi@nicolaken.com"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section><title>Overview</title>
|
||||
|
||||
<p>This is a collection of notes to assist with long-term planning and
|
||||
development.
|
||||
</p>
|
||||
|
||||
<p>There is much discussion of issues and research topic (RT) threads on
|
||||
the <code>dev</code> mailing list (and elsewhere). However, details
|
||||
get lost in the sheer volume. This is the place to document the summary of
|
||||
discussions on some key topics. Some new and complex capabilities will take
|
||||
lots of design and specification before they can be implemented.
|
||||
</p>
|
||||
|
||||
<p>Another use for this collection of notes is as a place to quickly store
|
||||
a snippet from an email discussion or even a link to a discussion thread.
|
||||
The concepts can then be fleshed-out over time.
|
||||
</p>
|
||||
|
||||
<p>Anyone can participate in this process. Please get involved in discussion
|
||||
on <code>dev</code> and contribute patches for these summary planning
|
||||
documents via the normal <a href="site:guidelines">contribution</a>
|
||||
process.
|
||||
</p>
|
||||
|
||||
<p>These planning documents are intended to be concise notes only. They are
|
||||
also ever-evolving, because as issues are addressed these notes will be
|
||||
revised.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>Topics and Issues</title>
|
||||
|
||||
<ul>
|
||||
<li><a href="site:vision20">POI Version 2.0 Vision</a>
|
||||
</li>
|
||||
<li><a href="site:vision10">POI Version 1.0 Vision</a>
|
||||
</li>
|
||||
</ul>
|
||||
</section>
|
||||
|
||||
</body>
|
||||
</document>
|
||||
521
src/documentation/content/xdocs/devel/plan/vision10.xml
Normal file
@ -0,0 +1,521 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>POI 1.0 Vision Document</title>
|
||||
<authors>
|
||||
<person name="Andrew C. Oliver" email="acoliver@apache.org"/>
|
||||
<person name="Marcus W. Johnson" email="mjohnson@apache.org"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
|
||||
<section><title>Preface</title>
|
||||
<p>
|
||||
(21-Jan-02) While this document is just full of useful project
|
||||
introductory information and I do suggest those interested in getting
|
||||
involved in the project read it, it is woefully out of date.
|
||||
</p>
|
||||
<p>
|
||||
We deliberately allowed this document to run out of date because it
|
||||
is a good reflection of what the original vision was for POI 1.0.
|
||||
You'll note that some of the terminology is not used in quite the same
|
||||
way any longer. I've made some minor corrections where reading this
|
||||
confused me. An example: in some places this document may refer to
|
||||
POI API instead of POIFS API. When this vision was written we had
|
||||
an incomplete understanding of the project.
|
||||
</p>
|
||||
<p>
|
||||
Lastly, the scope of the project expanded dramatically near the end
|
||||
of the 1.0 cycle. Our vision at the time was to focus merely on the
|
||||
Excel port (having no idea how the project would grow or be received)
|
||||
and provide the OLE 2 Compound Document port for others to port later
|
||||
formats. We now plan to spearhead these ports under the umbrella of
|
||||
the POI project. So, you've been warned. Read on, but just realize
|
||||
that we had a fuzzy view of things to come, and hindsight is 20-20.
|
||||
</p>
|
||||
<p>
|
||||
If I recall, the major holes were: a complete understanding of the
OLE 2 Compound Document format, the Excel file format, and exactly how
Cocoon 2 Serializers worked. (That just about covers the whole range,
huh?)
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>1. Introduction</title>
|
||||
<section><title>1.1 Purpose of this document</title>
|
||||
<p>
|
||||
The purpose of this document is to
|
||||
collect, analyze and define high-level requirements, user needs and
|
||||
features of the HSSF Serializer for Cocoon 2 and related libraries.
|
||||
The HSSF Serializer is a Java class supporting the Serializer
interface from the Cocoon 2 project and outputting in a format
compatible with that used by the spreadsheet program Microsoft Excel '97.
|
||||
The HSSF Serializer will be responsible for converting XML
|
||||
spreadsheet-like documents into Excel-compatible XLS spreadsheets.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
|
||||
<section><title>1.2 Project Overview</title>
|
||||
<p>
|
||||
Many web apps today hit a brick wall
|
||||
when it comes to the user request that they be able to easily
|
||||
manipulate their reports and data extracts in the popular Microsoft
|
||||
Excel spreadsheet format. This often causes inferior technologies to be
|
||||
chosen for the project simply because they easily support this
|
||||
format. This project seeks to extend existing XML, Java and Apache
|
||||
Cocoon 2 project technologies by:
|
||||
</p>
|
||||
|
||||
<ul>
|
||||
<li>
|
||||
providing an extensible library
|
||||
(POIFS) which reads/writes in a format compatible with OLE 2 Compound
|
||||
Document Format (aka Structured Storage Format) for easy
|
||||
implementation of other document types;
|
||||
</li>
|
||||
<li>
|
||||
providing a library (HSSF) for
|
||||
manipulating spreadsheet data and outputting it in a format
compatible with the Microsoft Excel XLS format;
|
||||
</li>
|
||||
<li>
|
||||
and providing a Cocoon 2
|
||||
Serializer (HSSFSerializer) for serializing XML documents as
|
||||
Excel-compatible spreadsheets.
|
||||
</li>
|
||||
</ul>
|
||||
|
||||
</section>
|
||||
</section>
|
||||
<section><title>2. User Description</title>
|
||||
<section><title>2.1 User/Market Demographics</title>
|
||||
<p>
|
||||
There are a number of enthusiastic
|
||||
users of XML, UNIX and Java technology. Secondly, the Microsoft
|
||||
solution for outputting Office Document formats often involves
|
||||
actually manipulating the software as an OLE Server. This method
|
||||
provides extremely low performance, extremely high overhead and is
|
||||
only capable of handling one document at a time.
|
||||
</p>
|
||||
<ol>
|
||||
<li>
|
||||
Our intended audience for the HSSF
|
||||
Serializer portion of this project is developers writing reports or
|
||||
data extracts in XML format.
|
||||
</li>
|
||||
<li>
|
||||
Our intended audience for the HSSF
|
||||
library portion of this project is ourselves as we are developing
|
||||
the Serializer and anyone who needs to write to Excel spreadsheets
|
||||
in a non-XML Java environment or who has specific needs not
|
||||
addressed by the Serializer.
|
||||
</li>
|
||||
<li>
|
||||
Our intended audience for the
|
||||
"POIFS" OLE 2 Compound Document format reader/writer is
|
||||
ourselves as we are writing the HSSF library and secondly, anyone
|
||||
wishing to provide other libraries for reading/writing OLE 2
|
||||
Compound Document Format in Java.
|
||||
</li>
|
||||
</ol>
|
||||
</section>
|
||||
<section><title>2.2. User environment</title>
|
||||
<p>
|
||||
The users of this software shall be
|
||||
developers in a Java environment on any Operating System or power
|
||||
users who are capable of XML document generation/deployment.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>2.3. Key User Needs</title>
|
||||
<p>
|
||||
The OLE 2 Compound Document format is
|
||||
undocumented for all practical purposes and cryptic for all
|
||||
impractical purposes. Developer needs in this area include
|
||||
documentation and an easy to use library for reading and writing in
|
||||
this format without requiring the developer to have intimate
|
||||
knowledge of the format.
|
||||
</p>
|
||||
<p>
|
||||
There is currently no good way to write
|
||||
to Microsoft Excel documents from Java or from a non-Microsoft
|
||||
Windows based platform for that matter. Developers need an easy to
|
||||
use library that supports a reasonable feature set and allows
|
||||
separation of data from formatting/stylistic concerns.
|
||||
</p>
|
||||
<p>
|
||||
There is currently no good way to
|
||||
transform XML data to Microsoft Excel. Apache's Cocoon 2 project
|
||||
supplies a complete framework for XML, but nothing for outputting in
|
||||
Excel's XLS format. Developers and power users alike need a simple
|
||||
method to output XML documents to Excel through server-side
|
||||
processing.
|
||||
</p>
|
||||
|
||||
|
||||
</section>
|
||||
</section>
|
||||
<section><title>3. Project Overview</title>
|
||||
<section><title>3.1. Project Perspective</title>
|
||||
<p>
|
||||
The produced code shall be licensed under
|
||||
the Apache License as used by the Cocoon 2 project and maintained on
|
||||
a project page until such time as the Cocoon 2 developers accept it
|
||||
as a donation (at which time the copyright will be turned over to
|
||||
them).
|
||||
</p>
|
||||
</section>
|
||||
<section><title>3.2. Project Position Statement</title>
|
||||
<p>
|
||||
For developers on a Java and/or XML
|
||||
environment this project will provide all the tools necessary for
|
||||
outputting XML data in the Microsoft Excel format. This project seeks
|
||||
to make the use of Microsoft Windows based servers unnecessary for
|
||||
file format considerations and to fully document the OLE 2 Compound
|
||||
Document format. The project aims not only to provide the tools for
|
||||
serializing XML to Excel's file format and the tools for writing to
|
||||
that file format from Java, but also to provide the tools for later
|
||||
projects to convert other OLE 2 Compound Document formats to pure
|
||||
Java APIs.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>3.3. Summary of Capabilities</title>
|
||||
<p>
|
||||
HSSF Serializer for Apache Cocoon 2
|
||||
</p>
|
||||
<table>
|
||||
<tr>
|
||||
<td>
|
||||
Benefit
|
||||
</td>
|
||||
<td>
|
||||
Supporting Features
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Standard XML tag language for sheet data
|
||||
</td>
|
||||
<td>
|
||||
Serializer will transform documents utilizing a defined tag
|
||||
language
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Utilize XML to output in Excel
|
||||
</td>
|
||||
<td>
|
||||
Serializer will output in Excel
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Java API to output in Excel on any platform
|
||||
</td>
|
||||
<td>
|
||||
The project will develop an API that outputs in Excel using
|
||||
pure Java.
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Make it easy for developers to port other OLE 2 Compound
|
||||
Document-based formats to Java.
|
||||
</td>
|
||||
<td>
|
||||
The POIFS library will contain both a high-level abstraction
|
||||
along with low-level constructs. The project will fully document
|
||||
the OLE 2 Compound Document Format.
|
||||
</td>
|
||||
</tr>
|
||||
</table>
|
||||
</section>
|
||||
<section><title>3.4. Assumptions and Dependencies</title>
|
||||
<ul>
|
||||
<li>
|
||||
The HSSF Serializer will run on
|
||||
any Java 2 supporting platform with Apache Cocoon 2 installed along
|
||||
with the HSSF and POIFS APIs.
|
||||
</li>
|
||||
<li>
|
||||
The HSSF API requires a Java 2
|
||||
implementation and the POI API.
|
||||
</li>
|
||||
<li>
|
||||
The POIFS API requires a Java 2
|
||||
implementation.
|
||||
</li>
|
||||
</ul>
|
||||
</section>
|
||||
</section>
|
||||
<section><title>4. Project Features</title>
|
||||
<p>
|
||||
The POIFS API will include:
|
||||
</p>
|
||||
<ul>
|
||||
<li>
|
||||
Low level structures representing
|
||||
the structures in a POI filesystem.
|
||||
</li>
|
||||
<li>
|
||||
A low-level API for
|
||||
creating/manipulating POI filesystems.
|
||||
</li>
|
||||
<li>
|
||||
A set of high level interfaces
|
||||
abstracting the user from the POI filesystem constructs and
|
||||
representing it as a standard filesystem (Files, directories, etc)
|
||||
</li>
|
||||
</ul>
|
||||
<p>
|
||||
The HSSF API will include:
|
||||
</p>
|
||||
<ul>
|
||||
<li>
|
||||
Low level structures representing
|
||||
the structures in an Excel file.
|
||||
</li>
|
||||
<li>
|
||||
A low-level API for creating and
|
||||
manipulating Excel files and writing them into POI filesystems.
|
||||
</li>
|
||||
<li>
|
||||
A high level model and style
|
||||
interface for manipulating spreadsheet data without knowing anything
|
||||
about the Excel format itself.
|
||||
</li>
|
||||
</ul>
|
||||
<section><title>4.1 POI Filesystem API</title>
|
||||
<p>
|
||||
The POI Filesystem API includes:
|
||||
</p>
|
||||
<ul>
|
||||
<li>An implementation of Big Blocks</li>
|
||||
<li>An implementation of Small Blocks</li>
|
||||
<li>An implementation of Header Blocks</li>
|
||||
<li>An implementation of Block Allocation Tables</li>
|
||||
<li>An implementation of Property Sets</li>
|
||||
<li>An implementation of the POI
|
||||
filesystem including functions to get and set the above constructs;
|
||||
compound functions for reading/writing files/directories.
|
||||
</li>
|
||||
<li>An abstraction of the POI
|
||||
filesystem providing interfaces representing Files, Directories,
|
||||
FileSystems in normal terminology and encapsulating the above
|
||||
constructs.
|
||||
</li>
|
||||
<li>Full documentation of the POI file
|
||||
format.
|
||||
</li>
|
||||
<li>Full documentation of the APIs and
|
||||
interfaces provided through Javadoc, user documentation (aimed at
|
||||
developers using the APIs)
|
||||
</li>
|
||||
<li>Examples aimed at teaching the
|
||||
user to write code using POI. (titled: recipes for POI)
|
||||
</li>
|
||||
<li>Performance specifications.
|
||||
(Example POI filesystems rated by some measure of complexity along
|
||||
with system specifications and execution times for given operations)
|
||||
</li>
|
||||
</ul>
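<p>
To make the Header Block item above a little more concrete, here is a hedged sketch of the simplest check a reader might perform before parsing anything else. It is not POI's implementation, and the class name is hypothetical. The eight-byte signature D0 CF 11 E0 A1 B1 1A E1 is the well-known magic number at the start of OLE 2 compound files; it is not given in this document.
</p>

```java
import java.util.Arrays;

// Hypothetical sketch for illustration; it is not POI's actual header block class.
final class Ole2SignatureSketch {

    // Magic number found in the first eight bytes of an OLE 2 compound file.
    static final byte[] SIGNATURE = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Returns true if the given bytes begin with the OLE 2 signature.
    static boolean looksLikeOle2(byte[] header) {
        if (header == null || header.length < SIGNATURE.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(header, SIGNATURE.length), SIGNATURE);
    }

    public static void main(String[] args) {
        byte[] header = new byte[512]; // stand-in for the first block of a file
        System.arraycopy(SIGNATURE, 0, header, 0, SIGNATURE.length);
        System.out.println("OLE 2 signature present: " + looksLikeOle2(header));
    }
}
```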
|
||||
</section>
|
||||
<section><title>4.2 HSSF API</title>
|
||||
<p>
|
||||
The HSSF API includes:
|
||||
</p>
|
||||
<ul>
|
||||
<li>An implementation of Record
|
||||
(binary 2 byte type followed by 2 byte size (n) followed by n bytes)</li>
|
||||
<li>Implementations of many standard
|
||||
record types mapping the data bytes to fields along with methods to
|
||||
reserialize those fields</li>
|
||||
<li>An implementation of the HSSF File
|
||||
including functions to get/set the above constructs, create a blank
|
||||
file with the minimum required record types and mappings between
|
||||
getting/setting data and style in a workbook to the creation of
|
||||
record types, and read HSSF files.</li>
|
||||
<li>An abstraction of the HSSF file
|
||||
format providing interfaces representing the HSSF File, HSSF
|
||||
Workbook, HSSF Sheet, HSSF Column, HSSF Formulas in a manner
|
||||
separating the data from the styling and encapsulating the above
|
||||
constructs.</li>
|
||||
<li>Full documentation of the HSSF
|
||||
file format (which will be a subset of the Excel '97 File format).
|
||||
This must be done with care for legal reasons.</li>
|
||||
<li>Full documentation of the APIs and
|
||||
interfaces provided through Javadoc, user documentation (aimed at
|
||||
developers using the APIs).</li>
|
||||
<li>Examples aimed at teaching
|
||||
developers to use the APIs.
|
||||
</li>
|
||||
<li>Performance specifications.
|
||||
(Example files rated by some measure of complexity along with system
|
||||
specifications and execution times for given operations - possibly
|
||||
the same files used for POI's tests)</li>
|
||||
</ul>
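<p>
As a rough illustration of the Record layout named in the first list item above
(a 2 byte type, a 2 byte size n, then n bytes of data), the following Java sketch
reads one record and writes it back out. The class name RawRecord and its methods
are hypothetical and are not part of the HSSF API. Little-endian byte order is an
assumption made here, since this document does not state the byte order.
</p>

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical illustration only; not a class from the HSSF API. */
final class RawRecord {
    final int type;     // 2 byte record type identifier
    final byte[] data;  // n bytes of record data

    RawRecord(int type, byte[] data) {
        this.type = type;
        this.data = data;
    }

    /** Reads one record starting at the buffer's current position. */
    static RawRecord read(ByteBuffer in) {
        in.order(ByteOrder.LITTLE_ENDIAN);               // assumption: little-endian
        int type = Short.toUnsignedInt(in.getShort());   // 2 byte type
        int size = Short.toUnsignedInt(in.getShort());   // 2 byte size (n)
        byte[] data = new byte[size];
        in.get(data);                                    // n bytes of data
        return new RawRecord(type, data);
    }

    /** Reserializes the record into the same type/size/data layout. */
    byte[] serialize() {
        ByteBuffer out = ByteBuffer.allocate(4 + data.length)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        out.putShort((short) type);
        out.putShort((short) data.length);
        out.put(data);
        return out.array();
    }
}
```

<p>
A typed record implementation, as described in the second list item, would
interpret the data bytes as named fields instead of keeping them as an opaque array.
</p>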
|
||||
</section>
|
||||
<section><title>4.3 HSSF Serializer</title>
|
||||
<p>
|
||||
The HSSF Serializer subproject:
|
||||
</p>
|
||||
<ul>
|
||||
<li>A class supporting the Cocoon 2
|
||||
Serializer Interface.</li>
|
||||
<li>An interface between the SAX
|
||||
events and the HSSF APIs.</li>
|
||||
<li>A specified tag language for using
|
||||
with the Serializer.</li>
|
||||
<li>Documentation on the tag language
|
||||
for the HSSF Serializer</li>
|
||||
<li>Normal javadocs.</li>
|
||||
<li>Example XML files</li>
|
||||
<li>Performance specifications.
|
||||
(Example XML docs and stylesheets rated by some measure of
|
||||
complexity along with system specifications and execution times)</li>
|
||||
</ul>
|
||||
</section>
|
||||
</section>
|
||||
<section><title>5. Other Product Requirements</title>
|
||||
<section><title>5.1. Applicable Standards</title>
|
||||
<p>
|
||||
All Java code will be 100% pure Java.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>5.2. System Requirements</title>
|
||||
<p>
|
||||
The minimum system requirements for POIFS are:
|
||||
</p>
|
||||
<ul>
|
||||
<li>64 Mbytes memory</li>
|
||||
<li>Java 2 environment</li>
|
||||
<li>Pentium or better processor (or equivalent on other platforms)</li>
|
||||
</ul>
|
||||
<p>
|
||||
The minimum system requirements for HSSF are:
|
||||
</p>
|
||||
<ul>
|
||||
<li>64 Mbytes memory</li>
|
||||
<li>Java 2 environment</li>
|
||||
<li>Pentium or better processor (or equivalent on other platforms)</li>
|
||||
<li>POIFS API</li>
|
||||
</ul>
|
||||
<p>
|
||||
The minimum system requirements for the HSSF Serializer are:
|
||||
</p>
|
||||
<ul>
|
||||
<li>64 Mbytes memory</li>
|
||||
<li>Java 2 environment</li>
|
||||
<li>Pentium or better processor (or equivalent on other platforms)</li>
|
||||
<li>Cocoon 2</li>
|
||||
<li>HSSF API</li>
|
||||
<li>POI API</li>
|
||||
</ul>
|
||||
</section>
|
||||
<section><title>5.3. Performance Requirements</title>
|
||||
<p>
|
||||
All components must perform well enough
|
||||
to be practical for use in a webserver environment (especially
|
||||
Cocoon2/Tomcat/Apache combo)
|
||||
</p>
|
||||
</section>
|
||||
<section><title>5.4. Environmental Requirements</title>
|
||||
<p>
|
||||
The software will run primarily in
|
||||
developer environments. We should make some allowances for
|
||||
not-highly-technical users to write XML documents for the HSSF
|
||||
Serializer. All other components will assume intermediate Java 2
|
||||
knowledge. No XML knowledge will be required except for using the
|
||||
HSSF Serializer. As much documentation as is practical shall be
|
||||
required for all components as XML is relatively new, and the
|
||||
concepts introduced for writing spreadsheets and to POI filesystems
|
||||
will be brand new to Java and many Java developers.
|
||||
</p>
|
||||
</section>
|
||||
</section>
|
||||
<section><title>6. Documentation Requirements</title>
|
||||
<section><title>6.1 POI Filesystem</title>
|
||||
<p>
|
||||
The filesystem as read and written by
|
||||
POI shall be fully documented and explained so that the average Java
|
||||
developer can understand it.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>6.2. POI API</title>
|
||||
<p>
|
||||
The POI API will be fully documented
|
||||
through Javadoc. A walkthrough of using the high level POI API shall
|
||||
be provided. No documentation outside of the Javadoc shall be
|
||||
provided for the low-level POI APIs.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>6.3. HSSF File Format</title>
|
||||
<p>
|
||||
The HSSF File Format as implemented by
|
||||
the HSSF API will be fully documented. No documentation will be
|
||||
provided for features that are not supported by HSSF API that are
|
||||
supported by the Excel 97 File Format. Care will be taken not to
|
||||
infringe on any "legal stuff".
|
||||
</p>
|
||||
</section>
|
||||
<section><title>6.4. HSSF API</title>
|
||||
<p>
|
||||
The HSSF API will be documented by
|
||||
javadoc. A walkthrough of using the high level HSSF API shall be
|
||||
provided. No documentation outside of the Javadoc shall be provided
|
||||
for the low level HSSF APIs.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>6.5. HSSF Serializer</title>
|
||||
<p>
|
||||
The HSSF Serializer will be documented
|
||||
by javadoc.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>6.6 HSSF Serializer Tag language</title>
|
||||
<p>
|
||||
The XML tag language along with
|
||||
function and usage shall be fully documented. Examples will be
|
||||
provided as well.
|
||||
</p>
|
||||
</section>
|
||||
</section>
|
||||
<section><title>7. Terminology</title>
|
||||
<section><title>7.1 Filesystem</title>
|
||||
<p>
|
||||
filesystem shall refer only to the POI formatted archive.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>7.2 File</title>
|
||||
<p>
|
||||
file shall refer to the embedded data stream within a
|
||||
POI filesystem. This will be the actual embedded document.
|
||||
</p>
|
||||
</section>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
594
src/documentation/content/xdocs/devel/plan/vision20.xml
Normal file
@ -0,0 +1,594 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>POI 2.0 Vision Document</title>
|
||||
<authors>
|
||||
<person name="Andrew C. Oliver" email="acoliver2@users.sourceforge.net"/>
|
||||
<person name="Marcus W. Johnson" email="mjohnson@apache.org"/>
|
||||
<person name="Glen Stampoultzis" email="user@poi.apache.org"/>
|
||||
<person name="Nicola Ken Barozzi" email="barozzi@nicolaken.com"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
|
||||
<section><title>Preface</title>
|
||||
<p>
|
||||
This is the POI 2.0 cycle vision document. Although the vision
|
||||
has not changed and this document is certainly not out of date,
|
||||
the structure of the project has
|
||||
changed a bit. We're not going to change the vision document to
|
||||
reflect this (however proper that may be) because it would only
|
||||
involve deletion. There is no purpose in providing less
|
||||
information provided we give clarification.
|
||||
</p>
|
||||
<p>
|
||||
This document was created before the POI components for
|
||||
<a href="https://xml.apache.org/cocoon">Apache Cocoon</a>
|
||||
were accepted into the Cocoon project itself. It was also
|
||||
written before POI was accepted into Jakarta. So while the
|
||||
vision hasn't changed, some of the components are actually now
|
||||
part of other projects. We'll still be working on them on the
|
||||
same timeline roughly (minus the overhead of coordination with
|
||||
other groups), but they are no longer technically part of the
|
||||
POI project itself.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>1. Introduction</title>
|
||||
<section><title>1.1 Purpose of this document</title>
|
||||
<p>
|
||||
The purpose of this document is to
|
||||
collect, analyze and define high-level requirements, user needs,
|
||||
and features of the second release of the POI project software.
|
||||
The POI project currently consists of the following components:
|
||||
the HSSF Serializer, the HSSF library and the POIFS library.
|
||||
</p>
|
||||
<ul>
|
||||
<li>
|
||||
The HSSF Serializer is a set of Java classes whose main
|
||||
class supports the Serializer interface from the Cocoon
|
||||
2 project and outputs the serialized data in a format
|
||||
compatible with the spreadsheet program Microsoft Excel
|
||||
'97.
|
||||
</li>
|
||||
<li>
|
||||
The HSSF library is a set of classes for reading and
|
||||
writing the Microsoft Excel 97 file format using pure Java.
|
||||
</li>
|
||||
<li>
|
||||
The POIFS library is a set of classes for reading and
|
||||
writing Microsoft's OLE 2 Compound Document format using
|
||||
pure Java.
|
||||
</li>
|
||||
</ul>
|
||||
<p>By the completion of this release cycle the POI project will also
|
||||
include the HSSF Generator and the HWPF library.
|
||||
</p>
|
||||
<ul>
|
||||
<li>The HSSF Generator will be responsible for using HSSF to read
|
||||
in the XLS (Excel 97) file format and create SAX events. The HSSF
|
||||
Generator will support the applicable interfaces specified by the
|
||||
Apache Cocoon 2 project.
|
||||
</li>
|
||||
<li>The HWPF library will provide a set of high level interfaces
|
||||
for reading and writing the Microsoft Word 97 file format using pure
|
||||
Java.</li>
|
||||
</ul>
|
||||
|
||||
</section>
|
||||
|
||||
|
||||
<section><title>1.2 Project Overview</title>
|
||||
<p>
|
||||
The first release of the POI project
|
||||
was an astounding success. This release seeks to build on that
|
||||
success by:
|
||||
</p>
|
||||
<ul>
|
||||
<li>
|
||||
Refactor POIFS into input and
|
||||
output classes as well as an event-driven API for reading.
|
||||
</li>
|
||||
<li>
|
||||
Refactor HSSF for greater
|
||||
performance as well as an event-driven API for reading
|
||||
</li>
|
||||
<li>
|
||||
Extend HSSF by adding the ability to read and write formulas.
|
||||
</li>
|
||||
<li>
|
||||
Extend HSSF by adding the ability to read and write
|
||||
user-defined styles.
|
||||
</li>
|
||||
<li>
|
||||
Create a Cocoon 2 Generator for HSSF using the same tags
|
||||
as the HSSF Serializer.
|
||||
</li>
|
||||
<li>
|
||||
Create a new library (HWPF) for reading and writing
|
||||
Microsoft Word DOC format.
|
||||
</li>
|
||||
<li>
|
||||
Refactor the HSSFSerializer into a separate extensible
|
||||
POIFSSerializer and HSSFSerializer
|
||||
</li>
|
||||
<li>
|
||||
Provide the ability to create Excel charts (write only).
|
||||
</li>
|
||||
</ul>
|
||||
</section>
|
||||
</section>
|
||||
<section><title>2. User Description</title>
|
||||
<section><title>2.1 User/Market Demographics</title>
|
||||
<p>
|
||||
There are a number of enthusiastic
|
||||
users of XML, UNIX and Java technology. Furthermore, the Microsoft
|
||||
solution for outputting Office Document formats often involves
|
||||
actually manipulating the software as an OLE Server. This method
|
||||
provides extremely low performance, extremely high overhead and is
|
||||
only capable of handling one document at a time.
|
||||
</p>
|
||||
<ol>
|
||||
<li>
|
||||
Our intended audience for the HSSF
|
||||
Serializer portion of this project is developers writing reports or
|
||||
data extracts in XML format.
|
||||
</li>
|
||||
<li>
|
||||
Our intended audience for the HSSF
|
||||
library portion of this project is ourselves as we are developing
|
||||
the HSSF serializer and anyone who needs to read and write Excel
|
||||
spreadsheets in a non-XML Java environment, or who has specific
|
||||
needs not addressed by the Serializer
|
||||
</li>
|
||||
<li>
|
||||
Our intended audience for the
|
||||
POIFS library is ourselves as we are developing the HSSF and HWPF
|
||||
libraries and anyone wishing to provide other libraries for
|
||||
reading/writing other file formats utilizing the OLE 2 Compound
|
||||
Document Format in Java.
|
||||
</li>
|
||||
<li>
|
||||
Our intended audience for the HSSF
|
||||
generator is developers who need to export Excel spreadsheets to
|
||||
XML in a non-proprietary environment.
|
||||
</li>
|
||||
<li>
|
||||
Our intended audience for the HWPF
|
||||
library is ourselves, as we will be developing a HWPF Serializer in a
|
||||
later release, and anyone wishing to add .DOC file processing and
|
||||
creation to their projects.
|
||||
</li>
|
||||
</ol>
|
||||
</section>
|
||||
<section><title>2.2. User environment</title>
|
||||
<p>
|
||||
The users of this software shall be
|
||||
developers in a Java environment on any operating system, or power
|
||||
users who are capable of XML document generation/deployment.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>2.3. Key User Needs</title>
|
||||
<p>
|
||||
The HSSF library currently requires a
|
||||
full object representation to be created before reading values. This
|
||||
results in very high memory utilization. We need to reduce this
|
||||
substantially for reading. It would be preferable to do this for
|
||||
writing, but it may not be possible due to the constraints imposed by
|
||||
the file format itself. Memory utilization during read is our top
|
||||
user complaint.
|
||||
</p>
|
||||
<p>
|
||||
The POIFS library currently requires a
|
||||
full object representation to be created before reading values. This
|
||||
results in very high memory utilization. We need to reduce this
|
||||
substantially for reading.
|
||||
</p>
|
||||
<p>
|
||||
The HSSF library currently ignores
|
||||
formula cells and identifies them as "UnknownRecord" at the
|
||||
lower level of the API. We must provide a way to read and write
|
||||
formulas. This is now the top requested feature.
|
||||
</p>
|
||||
<p>
|
||||
The HSSF library currently does not support
|
||||
charts. This is a key requirement of some users who wish to use HSSF
|
||||
in a reporting engine.
|
||||
</p>
|
||||
<p>
|
||||
The HSSF Serializer currently does not
|
||||
provide serialization for cell styling. Users will want stylish
|
||||
spreadsheets to result from their XML.
|
||||
</p>
|
||||
<p>
|
||||
There is currently no way to generate
|
||||
the XML from an XLS that is consistent with the format used by the
|
||||
HSSF Serializer.
|
||||
</p>
|
||||
<p>
|
||||
There should be a way to read and write
|
||||
the DOC file format using pure Java.
|
||||
</p>
|
||||
|
||||
</section>
|
||||
</section>
|
||||
<section><title>3. Project Overview</title>
|
||||
<section><title>3.1. Project Perspective</title>
|
||||
<p>
|
||||
The produced code shall be licensed by
|
||||
the Apache License as used by the Cocoon 2 project (APL 1.1) and
|
||||
maintained at <a href="http://poi.sourceforge.net/">http://poi.sourceforge.net</a>
|
||||
and <a href="http://sourceforge.net/projects/poi">http://sourceforge.net/projects/poi</a>.
|
||||
It is our hope to at some point integrate with the various Apache
|
||||
projects (xml.apache.org and jakarta.apache.org), at which point we'd
|
||||
turn the copyright over to them.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>3.2. Project Position Statement</title>
|
||||
<p>
|
||||
For developers on a Java and/or XML
|
||||
environment this project will provide all the tools necessary for
|
||||
outputting XML data in the Microsoft Excel format. This project seeks
|
||||
to make the use of Microsoft Windows based servers unnecessary for
|
||||
file format considerations and to fully document the OLE 2 Compound
|
||||
Document format. The project aims not only to provide the tools for
|
||||
serializing XML to Excel and Word file formats and the tools for
|
||||
writing to those file formats from Java, but also to provide the
|
||||
tools for later projects to convert other OLE 2 Compound Document
|
||||
formats to pure Java APIs.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>3.3. Summary of Capabilities</title>
|
||||
<p>
|
||||
HSSF Serializer for Apache Cocoon 2
|
||||
</p>
|
||||
<table>
|
||||
<tr>
|
||||
<th>
|
||||
Benefit
|
||||
</th>
|
||||
<th>
|
||||
Supporting Features
|
||||
</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Ability to serialize styles from XML spreadsheets.
|
||||
</td>
|
||||
<td>
|
||||
HSSFSerializer will support styles.
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Ability to read and write formulas in XLS files.
|
||||
</td>
|
||||
<td>
|
||||
HSSF will support reading/writing formulas.
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Ability to output in MS Word on any platform using Java.
|
||||
</td>
|
||||
<td>
|
||||
The project will develop an API that outputs in Word format
|
||||
using pure Java.
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Enhance performance for reading and writing XLS files.
|
||||
</td>
|
||||
<td>
|
||||
HSSF will undergo a number of performance enhancements. HSSF
|
||||
will include a new event-based API for reading XLS files. POIFS
|
||||
will support a new event-based API for reading OLE2 CDF files.
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
Ability to generate XML from XLS files
|
||||
</td>
|
||||
<td>
|
||||
The project will develop an HSSF Generator.
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>
|
||||
The ability to generate charts
|
||||
</td>
|
||||
<td>
|
||||
HSSF will provide low level support for chart records as well
|
||||
as high level API support for generating charts. The ability
|
||||
to read chart information will not initially be provided.
|
||||
</td>
|
||||
</tr>
|
||||
|
||||
</table>
|
||||
</section>
|
||||
<section><title>3.4. Assumptions and Dependencies</title>
|
||||
<ul>
|
||||
<li>
|
||||
The HSSF Serializer and Generator
|
||||
will support the Gnumeric 1.0 XML tag language.
|
||||
</li>
|
||||
<li>
|
||||
The HSSF Generator and HSSF
|
||||
Serializer will be mutually validating. It should be possible to
|
||||
have an XLS file created by the Serializer run through the Generator
|
||||
and the output back through the Serializer (via the Cocoon pipeline)
|
||||
and get the same file or a reasonable facsimile (no one cares if it
|
||||
differs by the order of the binary records in some minor but
|
||||
non-visually recognizable manner).
|
||||
</li>
|
||||
<li>
|
||||
The HSSF Generator will run on any
|
||||
Java 2 supporting platform with Apache Cocoon 2 installed along with
|
||||
the HSSF and POIFS APIs.
|
||||
</li>
|
||||
<li>
|
||||
The HSSF Serializer will run on
|
||||
any Java 2 supporting platform with Apache Cocoon 2 installed along
|
||||
with the HSSF and POIFS APIs.
|
||||
</li>
|
||||
<li>
|
||||
The HWPF API requires a Java 2
|
||||
implementation and the POIFS API.
|
||||
</li>
|
||||
<li>
|
||||
The HSSF API requires a Java 2
|
||||
implementation and the POIFS API.
|
||||
</li>
|
||||
<li>
|
||||
The POIFS API requires a Java 2
|
||||
implementation.
|
||||
</li>
|
||||
|
||||
</ul>
|
||||
</section>
|
||||
</section>
|
||||
<section><title>4. Project Features</title>
|
||||
<p>
|
||||
Enhancements to the POIFS API will
|
||||
include:
|
||||
</p>
|
||||
<ul>
|
||||
<li>
|
||||
An event driven API for reading
|
||||
POIFS Filesystems.
|
||||
</li>
|
||||
<li>
|
||||
A low-level API for
|
||||
creating/manipulating POI filesystems.
|
||||
</li>
|
||||
<li>
|
||||
Code improvements supporting
|
||||
greater separation between read and write structures.
|
||||
</li>
|
||||
</ul>
|
||||
<p>
|
||||
Enhancements to the HSSF API will
|
||||
include:
|
||||
</p>
|
||||
<ul>
|
||||
<li>
|
||||
An event driven API for reading
|
||||
XLS files.
|
||||
</li>
|
||||
<li>
|
||||
Performance improvements.
|
||||
</li>
|
||||
<li>
|
||||
Formula support (read/write)
|
||||
</li>
|
||||
<li>
|
||||
Support for user-defined data
|
||||
formats
|
||||
</li>
|
||||
<li>
|
||||
Better documentation of the file
|
||||
format and structure.
|
||||
</li>
|
||||
<li>
|
||||
An API for creation of charts.
|
||||
</li>
|
||||
</ul>
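<p>
The event driven reading item in the list above can be pictured with a small Java
sketch: records are handed to a callback one at a time, so only the current record's
data is held in memory. The names RecordListener and EventReaderSketch are
hypothetical and are not the HSSF API. The 2 byte type, 2 byte size, n bytes of data
layout and little-endian byte order are assumptions made for illustration.
</p>

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical callback; the name is an assumption, not an HSSF interface. */
interface RecordListener {
    void onRecord(int type, byte[] data);
}

/** Hypothetical sketch of event driven reading over a stream of records. */
final class EventReaderSketch {

    /** Passes each record to the listener and returns how many were seen. */
    static int read(InputStream source, RecordListener listener) throws IOException {
        DataInputStream in = new DataInputStream(source);
        int count = 0;
        while (true) {
            int low = in.read();
            if (low == -1) {
                return count;                      // clean end of stream
            }
            // Assumed little-endian: low byte first, then high byte.
            int type = low + 256 * in.readUnsignedByte();
            int size = in.readUnsignedByte() + 256 * in.readUnsignedByte();
            byte[] data = new byte[size];
            in.readFully(data);                    // only this record is buffered
            listener.onRecord(type, data);
            count++;
        }
    }
}
```

<p>
A caller interested in a single record type can ignore every other callback, which
avoids building a full object representation of the file before reading values.
</p>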
|
||||
<p>
|
||||
The HSSF Generator will include:
|
||||
</p>
|
||||
<ul>
|
||||
<li>
|
||||
A set of classes supporting the
|
||||
Cocoon 2 Generator interfaces providing a method for reading XLS
|
||||
files and outputting SAX events.
|
||||
</li>
|
||||
<li>
|
||||
The same tag format used by the
|
||||
HSSFSerializer in any given release.
|
||||
</li>
|
||||
</ul>
|
||||
<p>
|
||||
The HWPF API will include:
|
||||
</p>
|
||||
<ul>
|
||||
<li>
|
||||
An event driven API for reading
|
||||
DOC files.
|
||||
</li>
|
||||
<li>
|
||||
A set of high and low level APIs
|
||||
for reading and writing DOC files.
|
||||
</li>
|
||||
<li>
|
||||
Documentation of the DOC file
|
||||
format or enhancements to existing documentation.
|
||||
</li>
|
||||
</ul>
|
||||
</section>
|
||||
<section><title>5. Other Product Requirements</title>
|
||||
<section><title>5.1. Applicable Standards</title>
|
||||
<p>
|
||||
All Java code will be 100% pure Java.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>5.2. System Requirements</title>
|
||||
<p>
|
||||
The minimum system requirements for the POIFS API are:
|
||||
</p>
|
||||
<ul>
|
||||
<li>64 Mbytes memory</li>
|
||||
<li>Java 2 environment</li>
|
||||
<li>Pentium or better processor (or equivalent on other platforms)</li>
|
||||
</ul>
|
||||
<p>
|
||||
The minimum system requirements for the HSSF API are:
|
||||
</p>
|
||||
<ul>
|
||||
<li>64 Mbytes memory</li>
|
||||
<li>Java 2 environment</li>
|
||||
<li>Pentium or better processor (or equivalent on other platforms)</li>
|
||||
<li>POIFS API</li>
|
||||
</ul>
|
||||
<p>
|
||||
The minimum system requirements for the HWPF API are:
|
||||
</p>
|
||||
<ul>
|
||||
<li>64 Mbytes memory</li>
|
||||
<li>Java 2 environment</li>
|
||||
<li>Pentium or better processor (or equivalent on other platforms)</li>
|
||||
<li>POIFS API</li>
|
||||
</ul>
|
||||
|
||||
<p>
|
||||
The minimum system requirements for the HSSF Serializer are:
|
||||
</p>
|
||||
<ul>
|
||||
<li>64 Mbytes memory</li>
|
||||
<li>Java 2 environment</li>
|
||||
<li>Pentium or better processor (or equivalent on other platforms)</li>
|
||||
<li>Cocoon 2</li>
|
||||
<li>HSSF API</li>
|
||||
<li>POI API</li>
|
||||
</ul>
|
||||
</section>
|
||||
<section><title>5.3. Performance Requirements</title>
|
||||
<p>
|
||||
All components must perform well enough
|
||||
to be practical for use in a webserver environment (especially
|
||||
the "killer trio": Cocoon2/Tomcat/Apache combo)
|
||||
</p>
|
||||
</section>
|
||||
<section><title>5.4. Environmental Requirements</title>
|
||||
<p>
|
||||
The software will run primarily in
|
||||
developer environments. We should make some allowances for
|
||||
not-highly-technical users to write XML documents for the HSSF
|
||||
Serializer. All other components will assume intermediate Java 2
|
||||
knowledge. No XML knowledge will be required except for using the
|
||||
HSSF Serializer. As much documentation as is practical shall be
|
||||
required for all components as XML is relatively new, and the
|
||||
concepts introduced for writing spreadsheets and to POI filesystems
|
||||
will be brand new to Java and many Java developers.
|
||||
</p>
|
||||
</section>
|
||||
</section>
|
||||
<section><title>6. Documentation Requirements</title>
|
||||
<section><title>6.1 POI Filesystem</title>
|
||||
<p>
|
||||
The filesystem as read and written by
|
||||
POI shall be fully documented and explained so that the average Java
|
||||
developer can understand it.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>6.2. POI API</title>
|
||||
<p>
|
||||
The POI API will be fully documented
|
||||
through Javadoc. A walkthrough of using the high level POI API shall
|
||||
be provided. No documentation outside of the Javadoc shall be
|
||||
provided for the low-level POI APIs.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>6.3. HSSF File Format</title>
|
||||
<p>
|
||||
The HSSF File Format as implemented by
|
||||
the HSSF API will be fully documented. No documentation will be
|
||||
provided for features that are not supported by HSSF API that are
|
||||
supported by the Excel 97 File Format. Care will be taken not to
|
||||
infringe on any "legal stuff". Additionally, we are
|
||||
collaborating with the fine folks at OpenOffice.org on
|
||||
*free* documentation of the format.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>6.4. HSSF API</title>
|
||||
<p>
|
||||
The HSSF API will be documented by
|
||||
javadoc. A walkthrough of using the high level HSSF API shall be
|
||||
provided. No documentation outside of the Javadoc shall be provided
|
||||
for the low level HSSF APIs.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>6.5 HWPF API</title>
|
||||
<p>
|
||||
The HWPF API will be documented by
|
||||
javadoc. A walkthrough of using the high level HWPF API shall be
|
||||
provided. No documentation outside of the Javadoc shall be provided
|
||||
for the low level HWPF APIs.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>6.6 HSSF Serializer</title>
|
||||
<p>
|
||||
The HSSF Serializer will be documented
|
||||
by javadoc.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>6.7 HSSF Generator</title>
|
||||
<p>
|
||||
The HSSF Generator will be documented
|
||||
by javadoc.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>6.8 HSSF Serializer Tag language</title>
|
||||
<p>
|
||||
The XML tag language along with
|
||||
function and usage shall be fully documented. Examples will be
|
||||
provided as well.
|
||||
</p>
|
||||
</section>
|
||||
</section>
|
||||
<section><title>7. Terminology</title>
|
||||
<section><title>7.1 Filesystem</title>
|
||||
<p>
|
||||
filesystem shall refer only to the POI formatted archive.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>7.2 File</title>
|
||||
<p>
|
||||
file shall refer to the embedded data stream within a
|
||||
POI filesystem. This will be the actual embedded document.
|
||||
</p>
|
||||
</section>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
@ -0,0 +1,86 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Third Party Contributions</title>
|
||||
<authors>
|
||||
<person name="Nicola Ken Barozzi" email="barozzi@nicolaken.com"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
|
||||
<section><title>How to Contribute</title>
|
||||
<p>
|
||||
See <a href="contrib.xml">How to contribute to POI</a>.
|
||||
</p>
|
||||
|
||||
</section>
|
||||
|
||||
<section><title>Contributed Components</title>
|
||||
<p>
|
||||
These are not necessarily deemed to be high enough quality to be included in the
|
||||
core distribution, but they have been tested under <a href="contrib.xml">
|
||||
several key environments</a>, they are provided under the same license
|
||||
as POI, and they are included in the POI distribution under the
|
||||
<code>contrib/</code> directory.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
<strong>None as yet!</strong> - although you can expect that some of the links
|
||||
listed below will eventually migrate to the "contributed components" level, and
|
||||
then maybe even into the main distribution.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>Patch Queue</title>
|
||||
<p><a href="patches.html">Submissions of modifications</a>
|
||||
to POI which are awaiting review. Anyone can
|
||||
comment on them on the dev mailing list - code reviewers are needed!
|
||||
<strong>Use these at your own risk</strong> - although POI has no guarantee
|
||||
either, these patches have not been reviewed, let alone accepted.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>Other Extensions</title>
|
||||
<p>The other extensions listed here are <strong>not endorsed</strong> by the POI
|
||||
project either - they are provided as a convenience only. They may or may not work,
|
||||
they may or may not be open source, etc.
|
||||
</p>
|
||||
|
||||
<p>To have a link added to this table, see <a href="contrib.xml">How to contribute
|
||||
to POI</a>.</p>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Name and Link</th>
|
||||
<th>Type</th>
|
||||
<th>Description</th>
|
||||
<th>Status</th>
|
||||
<th>Licensing</th>
|
||||
<th>Contact</th>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
66
src/documentation/content/xdocs/devel/references/index.xml
Normal file
@ -0,0 +1,66 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Live Sites using POI</title>
|
||||
<authors>
|
||||
<person name="Donald Ball" email="balld@webslingerZ.com"/>
|
||||
<person name="Stefano Mazzocchi" email="stefano@apache.org"/>
|
||||
<person name="Robin Green" email="greenrd@hotmail.com"/>
|
||||
<person name="Nicola Ken Barozzi" email="barozzi@nicolaken.com"/>
|
||||
<person name="Glen Stampoultzis" email="user@poi.apache.org"/>
|
||||
<person name="Rainer Klute" email="klute@rainer-klute.org"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section><title>References</title>
|
||||
|
||||
<section><title>Live Sites using POI</title>
|
||||
<p>Currently we don't have any sites listed that use POI, but we're
|
||||
sure they're out there. Help us change this. If you've written a site
|
||||
that utilises POI, let us know.</p>
|
||||
<!--
|
||||
<ul>
|
||||
<li><a href=""></a></li>
|
||||
</ul>
|
||||
-->
|
||||
</section>
|
||||
|
||||
<section><title>Products/Projects using POI</title>
|
||||
<p>Publicly available products/projects using POI include:</p>
|
||||
<ul>
|
||||
<li><a href="http://jtimetracker.sourceforge.net/">JTimeTracker</a></li>
|
||||
</ul>
|
||||
</section>
|
||||
|
||||
<section><title>File Format Descriptions</title>
|
||||
<p>POI depends on publicly available documents describing various
|
||||
file formats. The list below contains links to some of them.</p>
|
||||
<ul>
|
||||
<li><a href="http://www.wotsit.org/">Wotsit's Format</a></li>
|
||||
</ul>
|
||||
</section>
|
||||
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
194
src/documentation/content/xdocs/devel/references/logocontest.xml
Normal file
@ -0,0 +1,194 @@
|
||||
<?xml version="1.0" encoding="ISO-8859-1"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title></title>
|
||||
<authors>
|
||||
<person id="AO" name="Andrew C. Oliver" email="acoliver@apache.org"/>
|
||||
<person id="GS" name="Glen Stampoultzis" email="user@poi.apache.org"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section><title>POI logos</title>
|
||||
<p>
|
||||
Here are the current logo submissions. Thanks to the artists!
|
||||
</p>
|
||||
<section><title>Michael Mosmann</title>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoMichaelMosmann.png"/>
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Loïc Lefèvre</title>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoLoicLefevre.png"/>
|
||||
<img alt="logo" src="images/logoLoicLefevre2.png"/>
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Glen Stampoultzis</title>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoGlenStampoutlzis.png"/>
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Marcus Gustafsson</title>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoGustafsson1.png"/>
|
||||
<img alt="logo" src="images/logoGustafsson2.png"/>
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Adrianus Handoyo</title>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoAdria1.png"/>
|
||||
<img alt="logo" src="images/logoAdria2.png"/>
|
||||
<img alt="logo" src="images/logoAdria3.png"/>
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Russell Beattie</title>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoRussellBeattie1.png"/>
|
||||
<img alt="logo" src="images/logoRussellBeattie2.png"/>
|
||||
<img alt="logo" src="images/logoRussellBeattie3.png"/>
|
||||
</p>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoRussellBeattie4.png"/>
|
||||
<img alt="logo" src="images/logoRussellBeattie5.png"/>
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Daniel Fernandez</title>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoDanielFernandez.png"/>
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Andrew Clements</title>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoAndrewClements.png"/>
|
||||
<img alt="logo" src="images/logoAndrewClements2.png"/>
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Wendy Wise</title>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoWendyWise.png"/>
|
||||
<img alt="logo" src="images/logoWendyWise2.png"/>
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Nikhil Karmokar</title>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoKarmokar1.png"/>
|
||||
<img alt="logo" src="images/logoKarmokar1s.png"/>
|
||||
</p>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoKarmokar2.png"/>
|
||||
<img alt="logo" src="images/logoKarmokar2s.png"/>
|
||||
</p>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoKarmokar3.png"/>
|
||||
<img alt="logo" src="images/logoKarmokar3s.png"/>
|
||||
</p>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoKarmokar4.png"/>
|
||||
<img alt="logo" src="images/logoKarmokar4s.png"/>
|
||||
</p>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoKarmokar5.png"/>
|
||||
<img alt="logo" src="images/logoKarmokar5s.png"/>
|
||||
</p>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoKarmokar6.png"/>
|
||||
<img alt="logo" src="images/logoKarmokar6s.png"/>
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Lieven Janssen</title>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoJanssen1.png"/>
|
||||
<img alt="logo" src="images/logoJanssen2.png"/>
|
||||
</p>
|
||||
</section>
|
||||
<section><title>RaPi GmbH</title>
|
||||
<p>
|
||||
Contact Person: Fancy at: fancy at my-feiqi.com
|
||||
</p>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoRaPiGmbH1.png"/>
|
||||
<img alt="logo" src="images/logoRaPiGmbH2.png"/>
|
||||
</p>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoRaPiGmbH5.png"/>
|
||||
</p>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoRaPiGmbH6.png"/>
|
||||
</p>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoRaPiGmbH7.png"/>
|
||||
</p>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoRaPiGmbH8.png"/>
|
||||
</p>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoRaPiGmbH9.png"/>
|
||||
</p>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoRaPiGmbH10.png"/>
|
||||
</p>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoRaPiGmbH11.png"/>
|
||||
</p>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoRaPiGmbH12.png"/>
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Randy Stanard</title>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoRandyStanard01.png"/>
|
||||
</p>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoRandyStanard02.png"/>
|
||||
</p>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoRandyStanard03.png"/>
|
||||
</p>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoRandyStanard04.png"/>
|
||||
</p>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoRandyStanard05.png"/>
|
||||
</p>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoRandyStanard06.png"/>
|
||||
</p>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoRandyStanard07.png"/>
|
||||
</p>
|
||||
<p>
|
||||
<img alt="logo" src="images/logoRandyStanard08.png"/>
|
||||
</p>
|
||||
</section>
|
||||
</section>
|
||||
</body>
|
||||
<footer>
|
||||
<legal>
|
||||
Copyright (c) @year@ The Apache Software Foundation. All rights reserved.
|
||||
<br />
|
||||
Apache POI, POI, Apache, the Apache feather logo, and the Apache
|
||||
POI project logo are trademarks of The Apache Software Foundation.
|
||||
</legal>
|
||||
</footer>
|
||||
</document>
|
||||
55
src/documentation/content/xdocs/devel/resolutions/index.xml
Normal file
@ -0,0 +1,55 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Resolutions</title>
|
||||
<subtitle>About this section</subtitle>
|
||||
<authors>
|
||||
<person name="Andrew C. Oliver" email="acoliver@apache.org"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section><title>About Resolutions</title>
|
||||
<p>
|
||||
Every project in Apache has resolutions that they vote on.
|
||||
Decisions are made, etc. But what happens once those decisions
|
||||
are made? They are archived in the mail list archive never to
|
||||
be read again (once it's not in the top 10 or so posts). So they
|
||||
get discussed again and again.
|
||||
</p>
|
||||
<p>
|
||||
Rather than have that big waste of time, we have this section to
|
||||
record important POI decisions. Once a decision is passed it
|
||||
need only be linked to this page (either by creating a page for
|
||||
it or by simply linking it to the archive messages). Wherever
|
||||
possible, a brief note about how many votes were for and against, and maybe
|
||||
some background should be posted.
|
||||
</p>
|
||||
<p>
|
||||
This section is intended mainly to keep big waste-of-time
|
||||
discussions from taking away from what's important...developing
|
||||
POI! :-D
|
||||
</p>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
113
src/documentation/content/xdocs/devel/resolutions/res001.xml
Normal file
@ -0,0 +1,113 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>POI Resolution</title>
|
||||
<subtitle>Resolution 001 - Minimal Coding Standards</subtitle>
|
||||
<authors>
|
||||
<person name="Andrew C. Oliver" email="acoliver@apache.org"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section><title>Resolution 001 - Minimal Coding Standards</title>
|
||||
<section><title>Majority Position</title>
|
||||
<p>
|
||||
As the POI project has grown the "styles" used have become more
|
||||
varied. Some see this as a bad thing, but in reality it
|
||||
can be a good thing. Each can learn from the different
|
||||
styles by working with different code. That being said
|
||||
there are some universal "good quality" guidelines that
|
||||
must be adopted on a project of any proportions.
|
||||
</p>
|
||||
<p>
|
||||
Marc Johnson authored the following resolution:
|
||||
</p>
|
||||
<p>
|
||||
On Tue, 2002-01-08 at 22:23, Marc Johnson wrote:
|
||||
Standards are wonderful; everyone should have a set.
|
||||
Here's what I propose for coding standards for POI WRT comments (should I
|
||||
feel the need, I'll post more of these little gems):
|
||||
</p>
|
||||
<ol>
|
||||
<li>
|
||||
All classes and interfaces MUST have, right at the
|
||||
beginning of the file, the Apache Software License
|
||||
2.0 License Header. (see /legal/LICENSE).
|
||||
</li>
|
||||
<li>
|
||||
All classes and interfaces MUST include class javadoc. Conventionally,
|
||||
this goes after the package and imports, and before the start of the class
|
||||
or interface.
|
||||
<!-- No more author tags -->
|
||||
<!-- The class javadoc MUST have at least one @author tag -->
|
||||
</li>
|
||||
<li>
|
||||
All methods that are accessible outside the class MUST have javadoc
|
||||
comments. In other words, if it isn't private, it MUST have javadoc
|
||||
comments. Simple getters can consist of a simple @return tag; simple setters
|
||||
can consist of a simple @param tag. Anything else requires some verbiage
|
||||
plus all the standard javadoc tags as appropriate. You MUST include @throws
|
||||
tags for any non-runtime exceptions, and you SHOULD document any
|
||||
runtime exceptions you expect to throw. @throws tags SHOULD
|
||||
include an explanation of why that exception would be thrown. If your method
|
||||
might return null, you MUST say so. An accompanying explanation of the
|
||||
circumstances for doing so would be nice.
|
||||
</li>
|
||||
</ol>
|
||||
</section>
|
||||
<section><title>Amendments (informal by extension and not by vote)</title>
|
||||
<section><title>License</title>
|
||||
<p>
|
||||
As opposed to the formerly used POI License (which was
|
||||
based on the Apache Public License), now that POI is
|
||||
part of Apache, use the standard Apache Software
|
||||
License 2.0 header. As per standard Apache Software
|
||||
Foundation policy, the full (long) version of the
|
||||
header should be used.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>2 cents</title>
|
||||
<p>
|
||||
Tip: No laughing or joking allowed in conversations regarding coding
|
||||
standards.
|
||||
Any mail on coding standards will be treated very seriously,
|
||||
and sent here with a RTFM.
|
||||
</p>
|
||||
</section>
|
||||
</section>
|
||||
<section><title>Dissent</title>
|
||||
<p>
|
||||
The motion was passed unanimously with no negative or
|
||||
neutral votes.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Comments</title>
|
||||
<p>
|
||||
Andy didn't feel like going through his mail and sucking
|
||||
out the comments.. If there is anything you feel should
|
||||
be added here do it yourself ;-).
|
||||
</p>
|
||||
</section>
|
||||
</section>
|
||||
</body>
|
||||
</document>
|
||||
250
src/documentation/content/xdocs/devel/subversion.xml
Normal file
@ -0,0 +1,250 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Apache POI™ - Source Code Repository</title>
|
||||
<authors>
|
||||
<person id="NB" name="Nick Burch" email="dev@poi.apache.org"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section><title>Download the Source</title>
|
||||
<p>
|
||||
Most users of the source code probably don't need to have day to
|
||||
day access to the source code as it changes. Therefore most users will want
|
||||
to make use of our <a href="site:download">source release</a>
|
||||
packages, which contain the complete source tree for each binary
|
||||
release, suitable for browsing or debugging. These source releases
|
||||
are available from our
|
||||
<a href="site:download">download page.</a>
|
||||
</p>
|
||||
<p>
|
||||
The Apache POI source code is also available as source artifacts
|
||||
in the <a href="https://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.poi%22">Maven Central repository</a>,
|
||||
which may be helpful for those users who make use of POI via Maven
|
||||
and wish to inspect the source (e.g. when debugging in an IDE).
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Access the Version Controlled Source Code</title>
|
||||
<p>
|
||||
For general information on connecting to the ASF Subversion
|
||||
repositories, see the
|
||||
<a href="https://www.apache.org/dev/version-control.html">version control page.</a>
|
||||
</p>
|
||||
|
||||
<p>Apache POI uses <a href="https://subversion.apache.org">Subversion</a> as its version control system,
|
||||
but also has a read-only Git mirror.
|
||||
</p>
|
||||
|
||||
<p><strong>NOTE</strong>: When checking out a subproject using
|
||||
subversion, either perform a sparse checkout or check out
|
||||
the trunk or a single branch or tag to avoid filling up
|
||||
your hard-disk and wasting bandwidth.
|
||||
</p>
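<p>As a rough sketch of the sparse checkout mentioned above, something like the
following should work with a reasonably recent Subversion client. The target
directory name <code>poi-trunk</code>, the deepened sub-directory <code>src</code>,
the function name and the <code>DRY_RUN</code> switch are illustrative
assumptions, not project conventions.</p>

```shell
#!/bin/sh
# Sketch of a sparse Subversion checkout of the POI trunk.
# "poi-trunk" and "src" are assumed example names; adjust them to your needs.
# Set DRY_RUN=1 to print the commands instead of running them.

POI_TRUNK_URL="https://svn.apache.org/repos/asf/poi/trunk"

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

poi_sparse_checkout() {
    target="${1:-poi-trunk}"
    subdir="${2:-src}"
    # check out only the top-level files and empty directory placeholders
    run svn checkout --depth immediates "$POI_TRUNK_URL" "$target" || return 1
    # then pull in the full contents of the one directory you need
    run svn update --set-depth infinity "$target/$subdir" || return 1
}
```

<p>Calling <code>poi_sparse_checkout</code> without arguments uses the assumed
defaults; pass a directory name and a sub-directory to override them.</p>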
|
||||
|
||||
<ul>
|
||||
<li>For read only access to the latest Apache POI code, please use
|
||||
<a href="https://svn.apache.org/repos/asf/poi/trunk/">https://svn.apache.org/repos/asf/poi/trunk/</a></li>
|
||||
<li>To browse the svn repository in your web browser, please use
|
||||
<a href="https://svn.apache.org/viewvc/poi/">ViewVC</a></li>
|
||||
</ul>
|
||||
|
||||
<p>If you are not a <em>Committer</em>, but you want to submit patches
|
||||
or even request commit privileges, please see our
|
||||
<a href="site:guidelines">Contribution Guidelines</a> for more
|
||||
information.</p>
|
||||
</section>
|
||||
<section><title>Git access to POI sources</title>
|
||||
<p>
|
||||
The master source repository for Apache POI is the Subversion
|
||||
one listed above. To support those users and developers who prefer
|
||||
to use the Git tooling, read-only access to the POI source tree is
|
||||
also available via Git. The Git mirrors normally track SVN to
|
||||
within a few minutes.
|
||||
</p>
|
||||
<p>
|
||||
The official read-only Git repository for Apache POI is available
|
||||
from <a href="https://git.apache.org/">git.apache.org/</a> .
|
||||
The Git Clone URL is: <a href="git://git.apache.org/poi.git">git://git.apache.org/poi.git</a>
|
||||
and the HTTPS clone URL is: <a href="https://git.apache.org/poi.git">https://git.apache.org/poi.git</a> .
|
||||
Please see the <a href="https://git.apache.org/">Git at
|
||||
Apache</a> page for more details on the service.
|
||||
</p>
|
||||
<p>
|
||||
In addition to the <a href="https://git.apache.org/">git.apache.org</a>
|
||||
repository, changes are also mirrored in near-realtime to GitHub.
|
||||
The GitHub repository is available at
|
||||
<a href="https://github.com/apache/poi">https://github.com/apache/poi</a> .
|
||||
Please note that the GitHub repository is read-only, but pull requests sent
|
||||
to it will result in an email being sent to the mailing list. A Git-formatted
|
||||
patch added to Bugzilla is generally preferred though, as it can be tracked
|
||||
along with all the other contributions. Please see the
|
||||
<a href="site:guidelines">contribution guidelines</a> for more
|
||||
information on getting involved in the project.</p>
|
||||
</section>
|
||||
<section><title>Using Git via the SVN-Git bridge</title>
|
||||
<section><title>General information</title>
|
||||
<p>
|
||||
Git provides "git-svn", a feature that lets you read the history
|
||||
of a Subversion repository and convert it into a full Git repository. This
|
||||
will keep information from the SVN revisions so that the Git repository can
|
||||
be updated with newer revisions from Subversion and so that you can push
|
||||
commits from Git "upstream" into the Subversion repository. See the
|
||||
<a href="https://www.kernel.org/pub/software/scm/git/docs/git-svn.html">
|
||||
official documentation</a> for more details.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Set up the repository</title>
|
||||
<p>
|
||||
The git-svn functionality is provided as a set of sub-commands to
|
||||
"git svn". To start retrieving information from SVN and create the
|
||||
initial Git repository run the following command:
|
||||
|
||||
</p>
|
||||
<source>
|
||||
git svn clone https://svn.apache.org/repos/asf/poi/trunk poisvngit --revision <a href="https://svn.apache.org/viewvc?view=revision&revision=1732982">1732982</a>:HEAD
|
||||
</source>
|
||||
<p>
|
||||
Running without <code>--revision from:HEAD</code> will run for a long time and will retrieve the full version history of
|
||||
the Subversion repository. If you need more repository history, change the <code>from</code> revision to an
|
||||
<a href="https://svn.apache.org/viewvc/poi/tags/">earlier release</a> or omit the <code>--revision</code>
|
||||
specifier altogether.
|
||||
</p>
|
||||
<p>
|
||||
When this finishes you have a Git repository whose "master" branch
|
||||
mirrors the SVN "trunk".
|
||||
<br/>
|
||||
From here you can use the full power of Git, i.e. quick branching,
|
||||
rebasing, merging, ...
|
||||
<br/>
|
||||
See below for some common usage hints.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Fetching newer SVN revisions</title>
|
||||
<p>
|
||||
In order to fetch the latest SVN revisions, you need to "rebase" onto
|
||||
the SVN trunk:
|
||||
</p>
|
||||
<source>
|
||||
git checkout master
|
||||
git svn rebase
|
||||
</source>
|
||||
<p>
|
||||
This will fetch the latest changes from Subversion and will rebase
|
||||
the master-branch onto them.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Pushing Git commits to Subversion</title>
|
||||
<p>
|
||||
The following command will push all changes on <code>master</code> back to
|
||||
Subversion:
|
||||
</p>
|
||||
<source>
|
||||
git svn dcommit
|
||||
</source>
|
||||
<p>
|
||||
Note that usually all commits on master will be sent to Subversion
|
||||
in one go, so it's similar to a "push" to another Git repository.
|
||||
|
||||
The dcommit may fail if there are newer revisions in Subversion; you
|
||||
will need to run a <code>git svn rebase</code> first in this case.
|
||||
</p>
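<p>The "rebase first, then dcommit again" recovery described above can be
sketched as a small shell function. The function name <code>push_to_svn</code>
and the <code>GIT</code> override variable are hypothetical helpers introduced
for this illustration only; the git sub-commands are the ones already shown.</p>

```shell
#!/bin/sh
# Sketch: send the commits on master to Subversion, retrying once after a
# rebase if Subversion already contains newer revisions.
# GIT can be overridden (e.g. for testing); it defaults to the real git.

GIT="${GIT:-git}"

push_to_svn() {
    "$GIT" checkout master || return 1
    if ! "$GIT" svn dcommit; then
        echo "dcommit failed; rebasing onto the latest SVN revisions and retrying" >&2
        # a failing rebase (e.g. on conflicts) stops here and needs manual work
        "$GIT" svn rebase && "$GIT" svn dcommit
    fi
}
```

<p>If the rebase itself stops on a conflict, the function returns a non-zero
status and the conflict has to be resolved by hand before trying again.</p>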
|
||||
</section>
|
||||
<section><title>General usage guidelines</title>
|
||||
<p>
|
||||
Although you can use the full power of Git, there are a few
|
||||
things that work well and some things that will get you into
|
||||
trouble:
|
||||
</p>
|
||||
<p>
|
||||
You should not develop on master, rather use some branching
|
||||
concept where you do work on sub-branches and only merge/cherry-pick the
|
||||
changes that are ready for being sent upstream.
|
||||
It seems to work better to constantly rebase changes onto the
|
||||
master branch as this will keep the history clean compared to
|
||||
the SVN repository and will avoid sending useless "Merge" commits to
|
||||
Subversion.
|
||||
</p>
|
||||
<p>
|
||||
You can keep some changes that are only useful locally by using
|
||||
two branches that are rebased onto each other. E.g.
|
||||
something like the following has proven to work well:
|
||||
</p>
|
||||
<source>
|
||||
master
|
||||
-> localchanges - commits that should not be sent upstream ->
|
||||
-> workbranch - place for doing development work
|
||||
</source>
|
||||
<p>
|
||||
When things are ready in the workbranch do a
|
||||
</p>
|
||||
<source>
|
||||
git checkout master
|
||||
git cherry-pick commitid ...
|
||||
</source>
|
||||
<p>
|
||||
to get all the finished commits onto master as preparation for pushing them upstream.
|
||||
|
||||
Then you can <code>git svn dcommit</code> to send the changes upstream
|
||||
and a <code>git svn rebase</code> to get master updated with the newly
|
||||
created SVN revisions.
|
||||
|
||||
Finally, do the following to rebase both branches onto the new SVN head:
|
||||
</p>
|
||||
<source>
|
||||
# rebase your local changes onto the latest SVN state
|
||||
git checkout localchanges
|
||||
git rebase master
|
||||
|
||||
# also set the working branch to the latest state from SVN.
|
||||
git checkout workbranch
|
||||
git rebase localchanges
|
||||
</source>
|
||||
<p>
|
||||
Sounds like too much work? Put these steps into a small script and all
|
||||
this will become a simple <code>poiupdate</code> to get all branches
|
||||
rebased onto HEAD from Subversion.
|
||||
</p>
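<p>A minimal sketch of such a <code>poiupdate</code> script is shown here. It
assumes the three-branch layout from this section (<code>master</code>,
<code>localchanges</code> stacked on it, and <code>workbranch</code> stacked on
<code>localchanges</code>), so <code>workbranch</code> is rebased onto
<code>localchanges</code>. The <code>run</code> helper and the
<code>DRY_RUN</code> switch are assumptions added for illustration.</p>

```shell
#!/bin/sh
# Hypothetical "poiupdate" helper: brings master, localchanges and workbranch
# up to date with the latest Subversion revisions.
# Set DRY_RUN=1 to print the commands instead of running them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

poiupdate() {
    # fetch the newest SVN revisions and rebase master onto them
    run git checkout master || return 1
    run git svn rebase || return 1
    # replay the local-only commits on top of the updated master
    run git checkout localchanges || return 1
    run git rebase master || return 1
    # replay the development work on top of the local-only commits
    run git checkout workbranch || return 1
    run git rebase localchanges || return 1
}
```

<p>Each step stops the function on the first failure, so a rebase conflict
leaves you on the branch where it happened.</p>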
|
||||
</section>
|
||||
</section>
|
||||
<section><title>Code metrics </title>
|
||||
<p>
|
||||
Code quality reports for Apache POI are available on the
|
||||
<a href="https://sonarcloud.io/dashboard?id=poi-parent">Apache Sonar instance</a>.
|
||||
</p>
|
||||
<p>
|
||||
Sonar provides lots of useful numbers and statistics. In particular,
|
||||
watching the project over time shows how some of the indicators evolve
|
||||
and lets you see which areas need some polishing.
|
||||
</p>
|
||||
</section>
|
||||
</body>
|
||||
<footer>
|
||||
<legal>
|
||||
Copyright (c) @year@ The Apache Software Foundation. All rights reserved.
|
||||
<br />
|
||||
Apache POI, POI, Apache, the Apache feather logo, and the Apache
|
||||
POI project logo are trademarks of The Apache Software Foundation.
|
||||
</legal>
|
||||
</footer>
|
||||
</document>
|
||||
134
src/documentation/content/xdocs/devel/who.xml
Normal file
@ -0,0 +1,134 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Apache POI™ - Who We Are</title>
|
||||
<authors>
|
||||
<person name="Apache POI Developers" email="dev@poi.apache.org"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
|
||||
<section><title>Apache POI™ - Who we are</title>
|
||||
<p>
|
||||
The Apache POI Project operates on a meritocracy: the more you do, the more
|
||||
responsibility you will obtain. This page lists all of the people who have
|
||||
gone the extra mile and are Committers. If you would like to get involved,
|
||||
the first step is to join the <a href="site:mailinglists">mailing lists</a>.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
We ask that you please do not send us emails privately asking for support.
|
||||
We are non-paid volunteers who help out with the project and we do not
|
||||
necessarily have the time or energy to help people on an individual basis.
|
||||
The <a href="site:mailinglists">mailing lists</a> have many individuals
|
||||
who will help answer detailed requests for help. The benefit of
|
||||
using mailing lists over private communication is that they are a shared
|
||||
resource where others can also learn from common questions.
|
||||
</p>
|
||||
<p>
|
||||
POI Developers count on feedback from the mailing lists. Many developers do take
|
||||
an active role on the lists.
|
||||
</p>
|
||||
|
||||
<!-- <section><title>Advisors</title>-->
|
||||
<!-- <ul>-->
|
||||
<!-- <li><a href="http://www.betaversion.org/~stefano/">Stefano Mazzocchi</a> (stefano at apache dot org)-->
|
||||
<!-- </li>-->
|
||||
<!-- </ul>-->
|
||||
<!-- </section>-->
|
||||
|
||||
<section><title>Project Chair</title>
|
||||
<ul>
|
||||
<li>Dominik Stadler (centic at apache dot org)</li>
|
||||
</ul>
|
||||
</section>
|
||||
<section><title>Committers</title>
|
||||
<ul>
|
||||
<!-- Alphabetical by surname -->
|
||||
<li>Tim Allison (tallison at apache dot org)</li>
|
||||
<li><a href="https://people.apache.org/list_B.html#kiwiwings">Andreas Beeker</a> (kiwiwings at apache dot org)</li>
|
||||
<li>Nick Burch (nick at apache dot org)</li>
|
||||
<li>Amol S Deshmukh (amol at apache dot org)</li>
|
||||
<li>David Fisher (wave at apache dot org)</li>
|
||||
<li>Jason Height (jheight at apache dot org)</li>
|
||||
<li>Marc Johnson (mjohnson at apache dot org)</li>
|
||||
<li><a href="http://www.rainer-klute.de/">Rainer Klute</a> (klute at apache dot org)</li>
|
||||
<li>Yegor Kozlov (yegor at apache dot org)</li>
|
||||
<li>Shawn Laubach (slaubach at apache dot org)</li>
|
||||
<li>Josh Micich (josh at apache dot org)</li>
|
||||
<li>Mark Murphy (jmarkmurphy at apache dot org)</li>
|
||||
<li>Danny Mui (dmui at apache dot org)</li>
|
||||
<li><a href="https://people.apache.org/~dnorth/">David North</a> (dnorth at apache dot org)</li>
|
||||
<li>Javen O'Neal (onealj at apache dot org)</li>
|
||||
<li>Uwe Schindler (uschindler at apache dot org)</li>
|
||||
<li>Avik Sengupta (avik at apache dot org)</li>
|
||||
<li>Dominik Stadler (centic at apache dot org)</li>
|
||||
<li><a href="http://members.iinet.net.au/~gstamp/glen/">Glen Stampoultzis</a> (glens at apache.org)</li>
|
||||
<li>Jon Svede (jsvede at apache dot org)</li>
|
||||
<li>Maxim Valyanskiy (maxcom at apache dot org)</li>
|
||||
<li>Sergey Vladimirov (sergey at apache dot org)</li>
|
||||
<li>Greg Woolsey (gwoolsey at apache dot org)</li>
|
||||
</ul>
|
||||
</section>
|
||||
|
||||
<section><title>Emeritus Committers</title>
|
||||
<ul>
|
||||
<li>Andrew C. Oliver (acoliver at gmail dot com)</li>
|
||||
<li>Nicola Ken Barozzi (barozzi at nicolaken dot com)</li>
|
||||
<li>Ryan Ackley (sackley at apache dot org)</li>
|
||||
<li>Tetsuya Kitahata (ai at spa dot nifty dot com)</li>
|
||||
</ul>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
<section><title>I want some progress on a bug report!</title>
|
||||
<p>
|
||||
So you took the time to report a bug, provided information that should make
|
||||
it possible to reproduce the problem and fix it. Surely the fix is easy and
|
||||
should take a seasoned developer a few minutes at max to fix!
|
||||
|
||||
So why is there no progress on your bug report? Is there nobody
|
||||
taking care when your problem is clearly stopping nearly everybody
|
||||
from using POI?
|
||||
|
||||
We know that the absence of responses on bug reports can be frustrating, and
|
||||
sometimes bugs lie dormant for a long time for no apparent reason.
|
||||
|
||||
Please always remember: <em>nobody is paid to work on POI</em>. The team is
|
||||
a bunch of volunteers who look at things in their free time.
|
||||
|
||||
Because of that, developers might choose to work on things based on a
|
||||
different priority than yours! Especially the quality and maturity of
|
||||
bug reports will affect whether somebody decides to look at them.
|
||||
|
||||
So the best way to help a bug report see progress is to provide more information
|
||||
if available or supply patches together with unit-tests.
|
||||
|
||||
If you can, look at <a href="site:guidelines">Contribution Guidelines</a>
|
||||
for more information about providing patches.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
</body>
|
||||
</document>
|
||||
171
src/documentation/content/xdocs/download.xml
Normal file
@ -0,0 +1,171 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Apache POI™ - Download Release Artifacts</title>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section>
|
||||
<title>Available Downloads</title>
|
||||
<p>
|
||||
This page provides instructions on how to download and verify the Apache POI release artifacts. There
|
||||
are different versions available depending on how stable your code should be.
|
||||
</p>
|
||||
<ul>
|
||||
<li>
|
||||
<a href="#POI-5.4.1">The latest stable release is Apache POI 5.4.1</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="#archive">Archives of all prior releases</a>
|
||||
</li>
|
||||
</ul>
|
||||
<p>
|
||||
Apache POI releases are available under the
|
||||
<a href="https://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.
|
||||
See the NOTICE file contained in each release artifact for applicable copyright attribution notices.
|
||||
</p>
|
||||
<p>
|
||||
To ensure that you have downloaded the true release you should
|
||||
<a href="#verify">verify the integrity</a>
|
||||
of the files using the signatures and checksums available from this page.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<!-- latest final release -->
|
||||
|
||||
<section id="POI-5.4.1"><title>6 April 2025 - POI 5.4.1 available</title>
|
||||
<p>The Apache POI team is pleased to announce the release of 5.4.1.
|
||||
Featured are a handful of new areas of functionality and numerous bug fixes.</p>
|
||||
<p>A summary of changes is available in the
|
||||
<a href="https://www.apache.org/dyn/closer.lua/poi/dev/RELEASE-NOTES-5.4.1.txt">Release Notes</a>.
|
||||
A full list of changes is available in the <a href="site:changes">change log</a>.
|
||||
People interested should also follow the <a href="site:mailinglists">dev list</a>
|
||||
to track progress.</p>
|
||||
<p>
|
||||
The POI source release is listed below.
|
||||
Pre-built versions of all <a href="site:components">POI components</a>
|
||||
are available in the central Maven repository under Group ID "org.apache.poi" and Version
|
||||
"5.4.1".
|
||||
</p>
|
||||
<section id="POI-5.4.1-src"><title>Source Distribution</title>
|
||||
<ul>
|
||||
<li>
|
||||
<a href="https://www.apache.org/dyn/closer.lua/poi/release/src/apache-poi-src-5.4.1-20250401.tgz">apache-poi-src-5.4.1-20250401.tgz</a>
|
||||
(116 MB, <a href="https://downloads.apache.org/poi/release/src/apache-poi-src-5.4.1-20250401.tgz.asc">signature (.asc)</a>,
|
||||
checksum: <a href="https://downloads.apache.org/poi/release/src/apache-poi-src-5.4.1-20250401.tgz.sha512">SHA-512</a>)
|
||||
</li>
|
||||
<li>
|
||||
<a href="https://www.apache.org/dyn/closer.lua/poi/release/src/apache-poi-src-5.4.1-20250401.zip">apache-poi-src-5.4.1-20250401.zip</a>
|
||||
(120 MB, <a href="https://downloads.apache.org/poi/release/src/apache-poi-src-5.4.1-20250401.zip.asc">signature (.asc)</a>,
|
||||
checksum: <a href="https://downloads.apache.org/poi/release/src/apache-poi-src-5.4.1-20250401.zip.sha512">SHA-512</a>)
|
||||
</li>
|
||||
</ul>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
<section id="POI-bin-artifacts">
|
||||
<title>Binary Artifacts</title>
|
||||
<p>
|
||||
POI 5.2.3 was the last version where we produced a set of poi-bin*.zip and poi-bin*.tgz files.
|
||||
We will continue to publish jars to Maven Central. If you are not using a build tool like
|
||||
Apache Maven or Gradle, you can still find these jars by traversing the directories at
|
||||
<a href="https://repo1.maven.org/maven2/org/apache/poi/">https://repo1.maven.org/maven2/org/apache/poi/</a>.
|
||||
</p>
|
||||
<p>
|
||||
If you want to download a legacy poi-bin archive, see the
|
||||
<a href="#archive">archives of all prior releases</a>.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section id="verify">
|
||||
<title>Verify</title>
|
||||
<p>
|
||||
It is essential that you verify the integrity of the downloaded files using the PGP signatures and SHA-2 checksums.
|
||||
Please read
|
||||
<a href="https://httpd.apache.org/dev/verification.html">Verifying Apache HTTP Server Releases</a>
|
||||
for more information on why you should verify our releases. That page provides detailed instructions
|
||||
which you can use for POI artifacts.
|
||||
</p>
|
||||
<p>
|
||||
The PGP signatures can be verified using PGP or GPG. First download the
|
||||
<a href="https://downloads.apache.org/poi/KEYS">KEYS</a>
|
||||
file as well as the .asc signature files for the relevant release packages. Make sure you get these
|
||||
files from the main distribution directory, rather than from a mirror.
|
||||
Then <a href="https://www.apache.org/info/verification.html">verify the signatures</a>.
|
||||
</p>
|
||||
<p>Batch check of all distribution files:</p>
|
||||
<source>
|
||||
find . -name "*.sha256" -type f -execdir sha256sum -c {} \;
|
||||
find . -name "*.sha512" -type f -execdir sha512sum -c {} \;
|
||||
find . -name "*.asc" -exec gpg --no-secmem-warning --verify {} \;
|
||||
</source>
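<p>Where sha512sum is not available, the same checksum comparison can be done from Java. The following is only a
minimal sketch using the JDK: the class name Sha512Check is hypothetical and not part of POI, and it assumes that
the .sha512 file starts with the hex digest (optionally followed by a file name), which may not hold for every
checksum file layout.</p>

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of POI: compares a file's SHA-512 digest with a published hex value.
final class Sha512Check {

    /** Streams the file through SHA-512 and returns the digest as lowercase hex. */
    static String sha512Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-512");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    /** Assumption: the checksum file begins with the hex digest, optionally followed by a file name. */
    static boolean matches(Path file, Path checksumFile) throws IOException, NoSuchAlgorithmException {
        String content = new String(Files.readAllBytes(checksumFile), StandardCharsets.UTF_8).trim();
        String expected = content.split("\\s+")[0];
        return expected.equalsIgnoreCase(sha512Hex(file));
    }

    public static void main(String[] args) throws Exception {
        // args[0]: downloaded archive, args[1]: the matching .sha512 file
        boolean ok = matches(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println(ok ? "checksum OK" : "checksum MISMATCH");
    }
}
```

<p>A checksum only detects a corrupted download. It does not replace the PGP signature check described above.</p>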
|
||||
<p>Sample verification of poi-bin-3.5-FINAL-20090928.tgz</p>
|
||||
<source>% gpg --import KEYS
|
||||
gpg: key 12DAE9BE: "Glen Stampoultzis <glens at apache dot org>" not changed
|
||||
gpg: key 4CEED75F: "Nick Burch <nick at gagravarr dot org>" not changed
|
||||
gpg: key 84B5A42E: "Rainer Klute <rainer.klute at gmx dot de>" not changed
|
||||
gpg: key F5BB52CD: "Yegor Kozlov <yegor.kozlov at gmail dot com>" not changed
|
||||
gpg: Total number processed: 4
|
||||
gpg: unchanged: 4
|
||||
% gpg --verify poi-bin-3.5-FINAL-20090928.tgz.asc poi-bin-3.5-FINAL-20090928.tgz
|
||||
gpg: Signature made Mon Sep 28 10:28:25 2009 PDT using DSA key ID F5BB52CD
|
||||
gpg: Good signature from "Yegor Kozlov <yegor.kozlov at gmail dot com>"
|
||||
gpg: aka "Yegor Kozlov <yegor at dinom dot ru>"
|
||||
gpg: aka "Yegor Kozlov <yegor at apache dot org>"
|
||||
Primary key fingerprint: 7D77 0C77 6CE7 754E E6AF 23AA 6934 0A02 F5BB 52CD
|
||||
% gpg --fingerprint F5BB52CD
|
||||
pub 1024D/F5BB52CD 2007-06-18 [expires: 2012-06-16]
|
||||
Key fingerprint = 7D77 0C77 6CE7 754E E6AF 23AA 6934 0A02 F5BB 52CD
|
||||
uid Yegor Kozlov <yegor.kozlov at gmail dot com>
|
||||
uid Yegor Kozlov <yegor at dinom dot ru>
|
||||
uid Yegor Kozlov <yegor at apache dot org>
|
||||
sub 4096g/7B45A98A 2007-06-18 [expires: 2012-06-16]</source>
|
||||
</section>
|
||||
<section id="archive">
|
||||
<title>Release Archives</title>
|
||||
<p>
|
||||
Apache POI became a top level project in June 2007 and POI 3.0 artifacts were re-released. Prior to that
|
||||
date POI was a sub-project of
|
||||
<a href="https://jakarta.apache.org/">Apache Jakarta</a>.
|
||||
</p>
|
||||
<ul>
|
||||
<li>
|
||||
<a href="https://archive.apache.org/dist/poi/release/src/">Source Artifacts</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="https://archive.apache.org/dist/poi/release/bin/">Binary Artifacts</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="https://archive.apache.org/dist/jakarta/poi/release/">Artifacts from prior to 3.0</a>
|
||||
</li>
|
||||
</ul>
|
||||
</section>
|
||||
</body>
|
||||
<footer>
|
||||
<legal>
|
||||
Copyright (c) @year@ The Apache Software Foundation. All rights reserved.<br/>
|
||||
Apache POI, POI, Apache, the Apache feather logo, and the Apache POI project logo are trademarks of The
|
||||
Apache Software Foundation.
|
||||
</legal>
|
||||
</footer>
|
||||
</document>
|
||||
510
src/documentation/content/xdocs/encryption.xml
Normal file
@ -0,0 +1,510 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Apache POI™ - Encryption support</title>
|
||||
<authors>
|
||||
<person id="maxcom" name="Maxim Valyanskiy" email="maxcom@apache.org"/>
|
||||
<person id="AB" name="Andreas Beeker" email="kiwiwings@apache.org"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section>
|
||||
<title>Overview</title>
|
||||
|
||||
<p>Apache POI contains support for reading a few variants of encrypted office files: </p>
|
||||
<ul>
|
||||
<li>Binary formats (.xls, .ppt, .doc, ...)<br/>
|
||||
Encryption is format-dependent and has to be implemented separately for each format.<br/>
|
||||
Use <a href="apidocs/dev/org/apache/poi/hssf/record/crypto/Biff8EncryptionKey.html">
|
||||
Biff8EncryptionKey</a>.<a href="apidocs/dev/org/apache/poi/hssf/record/crypto/Biff8EncryptionKey.html#setCurrentUserPassword(java.lang.String)">setCurrentUserPassword</a>(String password)
|
||||
to specify the decryption password before opening the file or (where applicable) before saving.
|
||||
Setting a null password before saving removes the password protection.<br/>
|
||||
The password is set in a thread local variable. Do not forget to reset it to null after text extraction.
|
||||
</li>
|
||||
<li>XML-based formats (.xlsx, .pptx, .docx, ...)<br/>
|
||||
use the same encryption logic over all formats. When encrypted, the zipped files will be
|
||||
stored within an OLE file in the EncryptedPackage stream.<br/>
|
||||
If you plan to use POI to actually generate encrypted documents, be careful not to use anything weaker than
|
||||
agile encryption, because <a href="https://eprint.iacr.org/2005/007.pdf">RC4 is not really secure</a> and
|
||||
<a href="https://blog.cryptographyengineering.com/2011/12/01/how-not-to-use-symmetric-encryption/">ECB chaining is problematic too</a>.
|
||||
Of course you'll need to make sure that your clients can read the documents,
|
||||
e.g. the various free Excel, PowerPoint and Word viewers have limitations in the cipher or hashing parameters.<br/>
|
||||
If you want to use high encryption parameters, you need to install the "Java Cryptography Extension (JCE) Unlimited
|
||||
Strength Jurisdiction Policy Files" for your JRE version
|
||||
(Oracle <a href="http://www.oracle.com/technetwork/java/javase/downloads/jce-6-download-429243.html">JDK6</a>,
|
||||
<a href="http://www.oracle.com/technetwork/java/javase/downloads/jce-7-download-432124.html">JDK7</a>,
|
||||
<a href="http://www.oracle.com/technetwork/java/javase/downloads/jce8-download-2133166.html">JDK8</a>,
|
||||
IBM <a href="https://www.ibm.com/support/knowledgecenter/en/SSYKE2_8.0.0/com.ibm.java.security.component.80.doc/security-component/sdkpolicyfiles.html">JDK8</a>).
|
||||
</li>
|
||||
</ul>
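<p>Whether the policy files mentioned above are needed can be checked at runtime. The sketch below uses only the
JDK call Cipher.getMaxAllowedKeyLength; the class name JcePolicyCheck is hypothetical and not part of POI. With a
restricted policy the reported AES limit is usually 128 bits, so larger key sizes such as aes192 would need the
unlimited policy. Recent JDK releases typically ship with the unlimited policy already enabled, but this is worth
verifying on the target JRE rather than assuming.</p>

```java
import java.security.NoSuchAlgorithmException;
import javax.crypto.Cipher;

// Hypothetical helper, not part of POI: reports the largest AES key size the running JRE permits.
final class JcePolicyCheck {

    /** Returns the maximum AES key length in bits allowed by the installed policy. */
    static int maxAesKeyBits() throws NoSuchAlgorithmException {
        return Cipher.getMaxAllowedKeyLength("AES");
    }

    /** True if 256-bit AES keys can be used on this JRE. */
    static boolean supportsAes256() throws NoSuchAlgorithmException {
        return maxAesKeyBits() >= 256;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Max AES key length: " + maxAesKeyBits() + " bits");
        System.out.println("AES-256 usable: " + supportsAes256());
    }
}
```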
|
||||
|
||||
<p>Some "write-protected" files are encrypted with the built-in password "VelvetSweatshop"; POI can read those files too.</p>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Supported feature matrix</title>
|
||||
|
||||
<table class="autosize POITable">
|
||||
<tr>
|
||||
<th>Encryption</th>
|
||||
<th>HSSF</th>
|
||||
<th>HSLF</th>
|
||||
<th>HWPF</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><a href="https://msdn.microsoft.com/en-us/library/dd949802(v=office.12).aspx">XOR obfuscation *)</a></td>
|
||||
<td class="feature-yes">Yes (Writing since 3.16)</td>
|
||||
<td class="feature-na">N/A</td>
|
||||
<td class="feature-no">No</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><a href="https://msdn.microsoft.com/en-us/library/dd909583(v=office.12).aspx">40-bit RC4 encryption</a></td>
|
||||
<td class="feature-yes">Yes (Writing since 3.16)</td>
|
||||
<td class="feature-na">N/A</td>
|
||||
<td class="feature-yes">Yes (since 3.17)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><a href="https://msdn.microsoft.com/en-us/library/dd910113(v=office.12).aspx">Office Binary Document RC4 CryptoAPI Encryption</a></td>
|
||||
<td class="feature-yes">Yes (Since 3.16)</td>
|
||||
<td class="feature-yes">Yes</td>
|
||||
<td class="feature-yes">Yes (since 3.17)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th/>
|
||||
<th>XSSF</th>
|
||||
<th>XSLF</th>
|
||||
<th>XWPF</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><a href="https://msdn.microsoft.com/en-us/library/dd907466(v=office.12).aspx">Office Binary Document RC4 Encryption **)</a></td>
|
||||
<td class="feature-yes">Yes</td>
|
||||
<td class="feature-yes">Yes</td>
|
||||
<td class="feature-yes">Yes</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><a href="https://msdn.microsoft.com/en-us/library/dd906131(v=office.12).aspx">ECMA-376 Standard Encryption</a></td>
|
||||
<td class="feature-yes">Yes</td>
|
||||
<td class="feature-yes">Yes</td>
|
||||
<td class="feature-yes">Yes</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><a href="https://msdn.microsoft.com/en-us/library/dd906131(v=office.12).aspx">ECMA-376 Agile Encryption</a></td>
|
||||
<td class="feature-yes">Yes</td>
|
||||
<td class="feature-yes">Yes</td>
|
||||
<td class="feature-yes">Yes</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><a href="https://msdn.microsoft.com/en-us/library/ms757845(v=vs.85).aspx">ECMA-376 XML Signature</a></td>
|
||||
<td class="feature-yes">Yes</td>
|
||||
<td class="feature-yes">Yes</td>
|
||||
<td class="feature-yes">Yes</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<p>*) the xor encryption is flawed and works only for very small files - see <a href="https://bz.apache.org/bugzilla/show_bug.cgi?id=59857">#59857</a>.
|
||||
</p>
|
||||
|
||||
<p>**) the <a href="https://msdn.microsoft.com/en-us/library/cc313071(v=office.12).aspx">MS-OFFCRYPTO</a>
|
||||
documentation only mentions the RC4 (without CryptoAPI) encryption as an "in place" encryption, but
|
||||
apparently there's also a container based method with that key generation logic.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>Binary formats</title>
|
||||
<p>As mentioned above, use
|
||||
<a href="apidocs/dev/org/apache/poi/hssf/record/crypto/Biff8EncryptionKey.html">
|
||||
Biff8EncryptionKey</a>.<a href="apidocs/dev/org/apache/poi/hssf/record/crypto/Biff8EncryptionKey.html#setCurrentUserPassword(java.lang.String)">setCurrentUserPassword</a>(String password)
|
||||
to specify the password.</p>
|
||||
|
||||
<section>
|
||||
<title>XOR/RC4 decryption for xls</title>
|
||||
<source><![CDATA[
|
||||
Biff8EncryptionKey.setCurrentUserPassword("pass");
|
||||
POIFSFileSystem fs = new POIFSFileSystem(new File("file.xls"), true);
|
||||
HSSFWorkbook hwb = new HSSFWorkbook(fs.getRoot(), true);
|
||||
Biff8EncryptionKey.setCurrentUserPassword(null);
|
||||
]]></source>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>RC4 CryptoApi support ppt - decryption</title>
|
||||
<source><![CDATA[
|
||||
Biff8EncryptionKey.setCurrentUserPassword("pass");
|
||||
POIFSFileSystem fs = new POIFSFileSystem(new File("file.ppt"), true);
|
||||
HSLFSlideShow hss = new HSLFSlideShow(fs);
|
||||
...
|
||||
// Option 1: remove password
|
||||
Biff8EncryptionKey.setCurrentUserPassword(null);
|
||||
OutputStream os = new FileOutputStream("decrypted.ppt");
|
||||
hss.write(os);
|
||||
os.close();
|
||||
...
|
||||
// Option 2: change encryption settings (experimental)
|
||||
// need to cache data (i.e. read all data) before changing the key size
|
||||
PictureData picsExpected[] = hss.getPictures();
|
||||
hss.getDocumentSummaryInformation();
|
||||
EncryptionInfo ei = hss.getDocumentEncryptionAtom().getEncryptionInfo();
|
||||
((CryptoAPIEncryptionHeader)ei.getHeader()).setKeySize(0x78);
|
||||
OutputStream os = new FileOutputStream("file_120bit.ppt");
|
||||
hss.write(os);
|
||||
os.close();
|
||||
]]></source>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>XML-based formats - Decryption</title>
|
||||
|
||||
<p>Encrypted XML-based formats are stored in the OLE package stream "EncryptedPackage". Use org.apache.poi.poifs.crypt.Decryptor
|
||||
to decrypt the file:</p>
|
||||
|
||||
<source><![CDATA[
|
||||
EncryptionInfo info = new EncryptionInfo(filesystem);
|
||||
Decryptor d = Decryptor.getInstance(info);
|
||||
|
||||
try {
|
||||
if (!d.verifyPassword(password)) {
|
||||
throw new RuntimeException("Unable to process: document is encrypted and the password does not match");
|
||||
}
|
||||
|
||||
InputStream dataStream = d.getDataStream(filesystem);
|
||||
|
||||
// parse dataStream
|
||||
|
||||
} catch (GeneralSecurityException ex) {
|
||||
throw new RuntimeException("Unable to process encrypted document", ex);
|
||||
}
|
||||
]]></source>
|
||||
|
||||
<p>If you want to read a file encrypted with the built-in password, use Decryptor.DEFAULT_PASSWORD.</p>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>XML-based formats - Encryption</title>
|
||||
|
||||
<p>Encrypting a file is similar to the above decryption process. Basically you'll need to choose between
|
||||
<a href="apidocs/dev/org/apache/poi/poifs/crypt/EncryptionMode.html">binaryRC4, standard and agile encryption</a>;
|
||||
the cryptoAPI mode is used internally and its direct use would result in an incomplete file.
|
||||
Apart from the encryption mode, the EncryptionInfo class provides further parameters to specify the cipher and
|
||||
hashing algorithm to be used.</p>
|
||||
<source><![CDATA[
|
||||
try (POIFSFileSystem fs = new POIFSFileSystem()) {
|
||||
EncryptionInfo info = new EncryptionInfo(EncryptionMode.agile);
|
||||
// EncryptionInfo info = new EncryptionInfo(EncryptionMode.agile, CipherAlgorithm.aes192, HashAlgorithm.sha384, -1, -1, null);
|
||||
|
||||
Encryptor enc = info.getEncryptor();
|
||||
enc.confirmPassword("foobaa");
|
||||
|
||||
// Read in an existing OOXML file and write to encrypted output stream
|
||||
// don't forget to close the output stream otherwise the padding bytes aren't added
|
||||
try (OPCPackage opc = OPCPackage.open(new File("..."), PackageAccess.READ_WRITE);
|
||||
OutputStream os = enc.getDataStream(fs)) {
|
||||
opc.save(os);
|
||||
}
|
||||
|
||||
// Write out the encrypted version
|
||||
try (FileOutputStream fos = new FileOutputStream("...")) {
|
||||
fs.writeFilesystem(fos);
|
||||
}
|
||||
}
|
||||
]]></source>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>XML-based formats - Signing (XML Signature)</title>
|
||||
|
||||
<note>As of <a href="https://bz.apache.org/bugzilla/show_bug.cgi?id=64186">#64186</a> the configuration of the
|
||||
OPCPackage has changed; the examples below have been adapted to reflect the POI 5.0.0 API.</note>
|
||||
|
||||
<p>An Office document can be digitally signed with an <a href="https://en.wikipedia.org/wiki/XML_Signature">XML Signature</a>
|
||||
to protect it from unauthorized modifications, i.e. modifications without having the original certificate.
|
||||
The current implementation is based on the <!--<a href="http://eid-applet.googlecode.com">eID Applet</a>-->
|
||||
<a href="https://github.com/e-Contract/eid-applet">eID Applet</a> which
|
||||
is dual-licensed under
|
||||
<a href="https://github.com/e-Contract/eid-applet/blob/master/README.md#7-license">Apache License 2.0 and LGPL v3.0</a>.
|
||||
Instead of using the internal <a href="http://www.jsourcecode.com/class.php?proj=jdk%5Copenjdk&jar=openjdk-6-b14&class=org.jcp.xml.dsig.internal.dom.DOMXMLSignatureFactory">JDK API</a>
|
||||
this version is based on <a href="https://santuario.apache.org">Apache Santuario</a>.</p>
|
||||
<p>The classes have been tested against the following libraries, which need to be included in addition to the
|
||||
<a href="site:components">default dependencies</a>:</p>
|
||||
<ul>
|
||||
<li>BouncyCastle bcpkix, bcprov and bcutil (tested against 1.81)</li>
|
||||
<li>Apache Santuario "xmlsec" (tested against 3.0.5)</li>
|
||||
<li>and slf4j-api (tested against 2.0.x)</li>
|
||||
</ul>
|
||||
<p>Depending on the <a href="apidocs/dev/org/apache/poi/poifs/crypt/dsig/SignatureConfig.html">configuration</a>
|
||||
and the activated <a href="apidocs/dev/org/apache/poi/poifs/crypt/dsig/facets/package-summary.html">facets</a>
|
||||
various <a href="https://en.wikipedia.org/wiki/XAdES">XAdES levels</a> are supported - the support for higher levels (XAdES-T+)
|
||||
depends on supporting services and although the code is adopted, the integration is not well tested ... please support us on
|
||||
integration (testing) with timestamp and revocation (OCSP) services.
|
||||
</p>
|
||||
<p>Further test examples can be found in the corresponding <a href="https://svn.apache.org/viewvc/poi/trunk/poi-ooxml/src/test/java/org/apache/poi/poifs/crypt/dsig/TestSignatureInfo.java?view=markup">test class</a>.</p>
|
||||
|
||||
<p>If you want to use a hash algorithm with a 64-byte digest (currently this only applies to SHA512),
|
||||
<a href="https://bz.apache.org/bugzilla/show_bug.cgi?id=42061">a base64 "feature"</a> in xmlsec
|
||||
leads to line breaks in the digest values, which won't be accepted by Office. To work around this, you
|
||||
need to set the following system property:<br/>
|
||||
<strong>-Dorg.apache.xml.security.ignoreLineBreaks=true</strong></p>
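<p>Why only 64-byte digests are affected can be seen from the arithmetic: base64 turns 64 bytes into 88 characters,
which exceeds the usual 76-character line limit, while 32 or 48 bytes give 44 or 64 characters. The sketch below
illustrates this with the JDK's java.util.Base64 MIME encoder as a stand-in for a line-wrapping base64 encoder;
that xmlsec wraps in exactly the same way is an assumption and may differ between versions. The class name
DigestLineBreakDemo is hypothetical.</p>

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical demo, not part of POI or xmlsec: shows when line-wrapped base64 output contains a line break.
final class DigestLineBreakDemo {

    /** Base64-encodes a digest with MIME line wrapping (line separator after 76 characters). */
    static String mimeBase64Digest(String algorithm, byte[] data) throws NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance(algorithm).digest(data);
        return Base64.getMimeEncoder().encodeToString(digest);
    }

    /** True if the encoded value contains a carriage return or line feed. */
    static boolean hasLineBreak(String encoded) {
        return encoded.indexOf('\n') >= 0 || encoded.indexOf('\r') >= 0;
    }

    public static void main(String[] args) throws Exception {
        byte[] data = "sample".getBytes(StandardCharsets.UTF_8);
        for (String algorithm : new String[] { "SHA-256", "SHA-384", "SHA-512" }) {
            String encoded = mimeBase64Digest(algorithm, data);
            System.out.println(algorithm + " wrapped: " + hasLineBreak(encoded));
        }
    }
}
```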
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Validating a signed office document</title>
|
||||
|
||||
<source><![CDATA[
|
||||
OPCPackage pkg = OPCPackage.open(..., PackageAccess.READ);
|
||||
SignatureConfig sic = new SignatureConfig();
|
||||
SignatureInfo si = new SignatureInfo();
|
||||
si.setOpcPackage(pkg);
|
||||
si.setSignatureConfig(sic);
|
||||
boolean isValid = si.verifySignature();
|
||||
...
|
||||
]]></source>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Signing an office document</title>
|
||||
|
||||
<section>
|
||||
<title>Signing a file</title>
|
||||
<source><![CDATA[
|
||||
// loading the keystore - pkcs12 is used here, but of course jks & co are also valid
|
||||
// the keystore needs to contain a private key and its certificate having a
|
||||
// 'digitalSignature' key usage
|
||||
char password[] = "test".toCharArray();
|
||||
File file = new File("test.pfx");
|
||||
KeyStore keystore = KeyStore.getInstance("PKCS12");
|
||||
FileInputStream fis = new FileInputStream(file);
|
||||
keystore.load(fis, password);
|
||||
fis.close();
|
||||
|
||||
// extracting private key and certificate
|
||||
String alias = "xyz"; // alias of the keystore entry
|
||||
Key key = keystore.getKey(alias, password);
|
||||
X509Certificate x509 = (X509Certificate)keystore.getCertificate(alias);
|
||||
|
||||
// filling the SignatureConfig entries (minimum fields, more options are available ...)
|
||||
SignatureConfig signatureConfig = new SignatureConfig();
|
||||
signatureConfig.setKey((PrivateKey) key);
|
||||
signatureConfig.setSigningCertificateChain(Collections.singletonList(x509));
|
||||
|
||||
// adding the signature document to the package
|
||||
SignatureInfo si = new SignatureInfo();
|
||||
OPCPackage pkg = OPCPackage.open(..., PackageAccess.READ_WRITE);
|
||||
si.setOpcPackage(pkg);
|
||||
si.setSignatureConfig(signatureConfig);
|
||||
si.confirmSignature();
|
||||
// optionally verify the generated signature
|
||||
boolean b = si.verifySignature();
|
||||
assert (b);
|
||||
// write the changes back to disk
|
||||
pkg.close();
|
||||
]]></source>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Signing a stream - in-memory</title>
|
||||
|
||||
<p>When saving an OOXML document, POI creates missing relations on the fly. Therefore calling the signing method before saving
|
||||
would result in an invalid signature. Instead of trying to fix all save invocations, the user is asked to save the stream
|
||||
first to an intermediate byte array (stream) and process this stream instead.</p>
|
||||
|
||||
<source><![CDATA[
|
||||
// load the key and setup SignatureConfig ... - see "Signing a file"
|
||||
|
||||
SignatureInfo si = new SignatureInfo();
|
||||
si.setSignatureConfig(signatureConfig);
|
||||
|
||||
// populate sample object
|
||||
XSSFWorkbook wb = new XSSFWorkbook();
|
||||
wb.createSheet().createRow(1).createCell(1).setCellValue("Test");
|
||||
ByteArrayOutputStream bos = new ByteArrayOutputStream(100000);
|
||||
wb.write(bos);
|
||||
wb.close();
|
||||
|
||||
// process the intermediate stream
|
||||
OPCPackage pkg = OPCPackage.open(new ByteArrayInputStream(bos.toByteArray()));
|
||||
|
||||
si.setOpcPackage(pkg);
|
||||
si.confirmSignature();
|
||||
bos.reset();
|
||||
pkg.save(bos);
|
||||
pkg.close();
|
||||
|
||||
// bos now contains the signed ooxml document
|
||||
]]></source>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Encrypting temporary files created when unzipping an OOXML document</title>
|
||||
|
||||
<p>For security-conscious environments where data at rest must be stored encrypted,
|
||||
the creation of plaintext temporary files is a grey area.</p>
|
||||
|
||||
<p>The code example, written by PJ Fanning, modifies the behavior of SXSSFWorkbook
|
||||
to extract an OOXML spreadsheet zipped container and write the contents to disk using AES
|
||||
encryption.</p>
|
||||
|
||||
<p>See <a href="https://svn.apache.org/viewvc/poi/trunk/poi-ooxml/src/main/java/org/apache/poi/poifs/crypt/temp/SXSSFWorkbookWithCustomZipEntrySource.java?view=markup">SXSSFWorkbookWithCustomZipEntrySource.java</a>
|
||||
and other <a href="https://svn.apache.org/viewvc?view=revision&revision=1768744">files</a>
|
||||
that are needed for this example.</p>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Debugging XML signature issues</title>
|
||||
<p>Finding the source of an XML signature problem can sometimes be a pain in the ... neck, because
|
||||
the hashing of the canonicalized form is more or less done in the background.</p>
|
||||
|
||||
<!-- TODO: find original source -->
|
||||
<p>One of the tripping hazards is <a href="https://stackoverflow.com/questions/36063375">different
|
||||
linebreaks in Windows/Unix</a>, therefore use the non-indented form of the XML files. Furthermore the
|
||||
elements/ancestors containing namespace definitions and the used prefix might also differ.</p>
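<p>The effect of line endings on a digest can be reproduced in isolation. The sketch below only shows that a hash
over raw bytes changes when LF becomes CRLF and that normalizing the line endings restores the match. It does not
reproduce the XML canonicalization that POI and Santuario perform, and the class name LineEndingDigestDemo is
hypothetical.</p>

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demo, not part of POI: digests over raw bytes are sensitive to line endings.
final class LineEndingDigestDemo {

    /** Returns the SHA-256 digest of the UTF-8 bytes of the text as lowercase hex. */
    static String sha256Hex(String text) throws NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance("SHA-256").digest(text.getBytes(StandardCharsets.UTF_8));
        StringBuilder sb = new StringBuilder();
        for (byte b : digest) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    /** Converts Windows (CRLF) and old Mac (CR) line endings to Unix (LF). */
    static String normalizeLineEndings(String text) {
        return text.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) throws Exception {
        String unix = "first line\nsecond line\n";
        String windows = "first line\r\nsecond line\r\n";
        System.out.println("raw digests equal: " + sha256Hex(unix).equals(sha256Hex(windows)));
        System.out.println("normalized digests equal: "
                + sha256Hex(unix).equals(sha256Hex(normalizeLineEndings(windows))));
    }
}
```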
|
||||
|
||||
<p>The next thing is to compare successfully signed documents from Office with POI's generated signature,
|
||||
i.e. unzip both files and look for differences. Usually the package relations (*.rels) will be different,
|
||||
and the sig1.xml, core.xml and [Content_Types].xml due to different order of the references.</p>
|
||||
|
||||
<p>The package relationships (*.rels) will be specially handled, i.e. they will be filtered and only
|
||||
a subset will be processed - see <a href="https://www.ecma-international.org/activities/Office%20Open%20XML%20Formats/Draft%20ECMA-376%203rd%20edition,%20March%202011/Office%20Open%20XML%20Part%202%20-%20Open%20Packaging%20Conventions.pdf">13.2.4.24 Relationships Transform Algorithm</a>.</p>
|
||||
|
||||
<p>POI and Santuario (XmlSec) use <a href="https://logging.apache.org/log4j/2.x">Log4J 2.x</a> and
|
||||
<a href="https://www.slf4j.org/">SLF4J</a> respectively for logging.</p>
|
||||
|
||||
<ul>
|
||||
<li>
|
||||
(Since the change to Log4J 2 in POI 5.1.0, this hasn't been tested, and you need to adapt the
|
||||
logging settings to get all log output of XmlSec and POI)
|
||||
</li>
|
||||
<li>
|
||||
add the following JVM parameters (note that -Xbootclasspath/p is only available up to Java 8):
|
||||
<source><![CDATA[
|
||||
-Djava.io.tmpdir=<custom temp directory>
|
||||
-Xbootclasspath/p:<preload dir, which contains /org/apache/xml/security/utils/UnsyncBufferedOutputStream.class>
|
||||
]]></source>
|
||||
</li>
|
||||
<li>
|
||||
To check the processed files in the canonicalized form, the below UnsyncBufferedOutputStream class needs
|
||||
to be injected/replaced. Put the .class file in a separate directory and add it to the JVM parameters (see above):
|
||||
|
||||
<source><![CDATA[
|
||||
package org.apache.xml.security.utils;
|
||||
|
||||
import java.io.File;
|
||||
import java.io.FileOutputStream;
|
||||
import java.io.IOException;
|
||||
import java.io.OutputStream;
|
||||
|
||||
public class UnsyncBufferedOutputStream extends OutputStream {
|
||||
static final int size = 8*1024;
|
||||
static int filecnt = 0;
|
||||
|
||||
private int pointer = 0;
|
||||
private final OutputStream out;
|
||||
private final FileOutputStream out2;
|
||||
|
||||
private final byte[] buf;
|
||||
|
||||
public UnsyncBufferedOutputStream(OutputStream out) {
|
||||
buf = new byte[size];
|
||||
this.out = out;
|
||||
synchronized(UnsyncBufferedOutputStream.class) {
|
||||
try {
|
||||
String tmpDir = System.getProperty("java.io.tmpdir");
|
||||
if (tmpDir == null) {
|
||||
tmpDir = "build";
|
||||
}
|
||||
File f = new File(tmpDir, "unsync-"+filecnt+".xml");
|
||||
out2 = new FileOutputStream(f);
|
||||
} catch (IOException e) {
|
||||
throw new RuntimeException(e);
|
||||
} finally {
|
||||
filecnt++;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
public void write(byte[] arg0) throws IOException {
|
||||
write(arg0, 0, arg0.length);
|
||||
}
|
||||
|
||||
public void write(byte[] arg0, int arg1, int len) throws IOException {
|
||||
int newLen = pointer+len;
|
||||
if (newLen > size) {
|
||||
flushBuffer();
|
||||
if (len > size) {
|
||||
out.write(arg0, arg1,len);
|
||||
out2.write(arg0, arg1,len);
|
||||
return;
|
||||
}
|
||||
newLen = len;
|
||||
}
|
||||
System.arraycopy(arg0, arg1, buf, pointer, len);
|
||||
pointer = newLen;
|
||||
}
|
||||
|
||||
private void flushBuffer() throws IOException {
|
||||
if (pointer > 0) {
|
||||
out.write(buf, 0, pointer);
|
||||
out2.write(buf, 0, pointer);
|
||||
}
|
||||
pointer = 0;
|
||||
|
||||
}
|
||||
|
||||
public void write(int arg0) throws IOException {
|
||||
if (pointer >= size) {
|
||||
flushBuffer();
|
||||
}
|
||||
buf[pointer++] = (byte)arg0;
|
||||
|
||||
}
|
||||
|
||||
public void flush() throws IOException {
|
||||
flushBuffer();
|
||||
out.flush();
|
||||
out2.flush();
|
||||
}
|
||||
|
||||
public void close() throws IOException {
|
||||
flush();
|
||||
out.close();
|
||||
out2.close();
|
||||
}
|
||||
|
||||
}
|
||||
]]></source>
|
||||
</li>
|
||||
</ul>
|
||||
</section>
|
||||
</body>
|
||||
|
||||
<footer>
|
||||
<legal>
|
||||
Copyright (c) @year@ The Apache Software Foundation. All rights reserved.
|
||||
<br />
|
||||
Apache POI, POI, Apache, the Apache feather logo, and the Apache
|
||||
POI project logo are trademarks of The Apache Software Foundation.
|
||||
</legal>
|
||||
</footer>
|
||||
</document>
|
||||
742
src/documentation/content/xdocs/help/faq.xml
Normal file
@ -0,0 +1,742 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE faqs PUBLIC "-//APACHE//DTD FAQ V2.0//EN" "faq-v20.dtd">
|
||||
|
||||
<faqs>
|
||||
<title>Frequently Asked Questions</title>
|
||||
<faq id="faq-N10006">
|
||||
<question>
|
||||
My code uses some new feature, compiles fine but fails when live with a "MethodNotFoundException" or "IncompatibleClassChangeError"
|
||||
</question>
|
||||
<answer>
|
||||
<p>You almost certainly have an older version of Apache POI
|
||||
on your classpath. Quite a few runtimes and other packages
|
||||
will ship an older version of Apache POI, so this is an easy problem
|
||||
to hit without your realising. Some will ship just one old jar,
|
||||
some may ship a full set of old POI jars.</p>
|
||||
<p>The best way to identify the offending earlier jar files is
|
||||
with a few lines of Java. These will load a Core POI class, an
|
||||
OOXML class and a Scratchpad class, and report where they all came
|
||||
from.</p>
|
||||
<source><![CDATA[
|
||||
ClassLoader classloader =
|
||||
org.apache.poi.poifs.filesystem.POIFSFileSystem.class.getClassLoader();
|
||||
URL res = classloader.getResource(
|
||||
"org/apache/poi/poifs/filesystem/POIFSFileSystem.class");
|
||||
String path = res.getPath();
|
||||
System.out.println("POI Core came from " + path);
|
||||
|
||||
classloader = org.apache.poi.ooxml.POIXMLDocument.class.getClassLoader();
|
||||
res = classloader.getResource("org/apache/poi/ooxml/POIXMLDocument.class");
|
||||
path = res.getPath();
|
||||
System.out.println("POI OOXML came from " + path);
|
||||
|
||||
classloader = org.apache.poi.hslf.usermodel.HSLFSlideShow.class.getClassLoader();
|
||||
res = classloader.getResource("org/apache/poi/hslf/usermodel/HSLFSlideShow.class");
|
||||
path = res.getPath();
|
||||
System.out.println("POI Scratchpad came from " + path);]]></source>
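<p>As a variation on the snippet above, the lookup can be wrapped in a small helper that
also reports classes which cannot be loaded at all. This is only a sketch: the
<em>ClassOrigin</em> class and its <em>describe</em> methods are hypothetical names, not part of
Apache POI, and it uses the class's protection domain instead of a resource lookup.</p>

```java
import java.net.URL;
import java.security.CodeSource;

/** Hypothetical helper (not part of POI): reports which jar or directory a class came from. */
final class ClassOrigin {

    /** Describes where an already loaded class came from. */
    static String describe(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        URL location = (source == null) ? null : source.getLocation();
        // Core JDK classes usually expose no code source, so allow for null.
        return cls.getName() + " came from "
                + (location == null ? "(no code source, probably the JDK itself)" : location.toString());
    }

    /** Looks the class up by name, so that a missing or broken class is reported instead of thrown. */
    static String describe(String className) {
        try {
            return describe(Class.forName(className));
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " could not be loaded: " + e;
        }
    }

    public static void main(String[] args) {
        // One representative class per POI jar, the same ones the FAQ snippet uses.
        System.out.println(describe("org.apache.poi.poifs.filesystem.POIFSFileSystem"));
        System.out.println(describe("org.apache.poi.ooxml.POIXMLDocument"));
        System.out.println(describe("org.apache.poi.hslf.usermodel.HSLFSlideShow"));
    }
}
```

<p>Run it with the same classpath as the failing application; if two of the reported
locations point at jars with different version numbers, that mismatch is the likely culprit.</p>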
|
||||
</answer>
|
||||
</faq>
|
||||
<faq id="faq-N10019">
|
||||
<question>
|
||||
My code uses the scratchpad, compiles fine but fails to run with a "MethodNotFoundException"
|
||||
</question>
|
||||
<answer>
|
||||
<p>You almost certainly have an older version earlier on your
|
||||
classpath. See the prior answer.</p>
|
||||
</answer>
|
||||
</faq>
|
||||
<faq id="faq-N10025">
|
||||
<question>
|
||||
I'm using the poi-ooxml-lite (previously known as poi-ooxml-schemas) jar, but my code is failing with "java.lang.NoClassDefFoundError: org/openxmlformats/schemas/*something*"
|
||||
</question>
|
||||
<answer>
|
||||
<p>To use the new OOXML file formats, POI requires a jar containing
|
||||
the file format XSDs, as compiled by
|
||||
<a href="https://xmlbeans.apache.org/">XMLBeans</a>. These
|
||||
XSDs, once compiled into Java classes, live in the
|
||||
<em>org.openxmlformats.schemas</em> namespace.</p>
|
||||
<p>There are two jar files available, as described in
|
||||
<a href="site:components">the components overview section</a>.
|
||||
The <em>full jar of all of the schemas is poi-ooxml-full-XXX.jar (previously known as ooxml-schemas)
|
||||
(lower versions for older releases, see table below)</em>,
|
||||
and it is currently around 16 MB. The <em>smaller poi-ooxml-lite (previously known as poi-ooxml-schemas)
|
||||
jar</em> is only about 6 MB. This latter jar file only contains the
|
||||
typically used parts though.</p>
|
||||
<p>Many users choose to use the smaller poi-ooxml-lite jar to save
|
||||
space. However, the poi-ooxml-lite jar only contains the XSDs and
|
||||
classes that are typically used, as identified by the unit tests.
|
||||
Every so often, you may try to use part of the file format which
|
||||
isn't included in the minimal poi-ooxml-lite jar. In this case,
|
||||
you should switch to the full poi-ooxml-full jar. Longer term,
|
||||
you may also wish to submit a new unit test which uses the extra
|
||||
parts of the XSDs, so that a future poi-ooxml-lite jar will
|
||||
include them.</p>
|
||||
<p>There are a number of ways to get the full poi-ooxml-full jar.
|
||||
If you are a Maven user, see the
|
||||
<a href="site:components">components overview section</a>
|
||||
for the artifact details to have Maven download it for you.
|
||||
If you download the source release of POI, and/or checkout the
|
||||
source code from <a href="site:subversion">subversion</a>,
|
||||
then you can run the ant task "compile-ooxml-xsds" to have the
|
||||
OOXML schemas downloaded and compiled for you (This will also
|
||||
give you the XMLBeans generated source code, in case you wish to
|
||||
look at this). Finally, you can download the jar by hand from the
|
||||
<a href="http://mirrors.ibiblio.org/apache/poi/">POI
|
||||
Maven Repository</a>.</p>
|
||||
<p>Note that historically, different versions of poi-ooxml-full / ooxml-schemas were
|
||||
used with different versions of POI:</p>
|
||||
|
||||
<table class="POITable autosize">
|
||||
<tr>
|
||||
<th>Version of ooxml-schemas</th>
|
||||
<th>Version of POI</th>
|
||||
<th>Comment</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>ooxml-schemas-1.0.jar</td>
|
||||
<td>POI 3.5 and 3.6</td>
|
||||
<td></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>ooxml-schemas-1.1.jar</td>
|
||||
<td>POI 3.7 to POI 3.13</td>
|
||||
<td>Generics support added, can be used with POI 3.5 and POI 3.6 as well</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>ooxml-schemas-1.2.jar</td>
|
||||
<td>-</td>
|
||||
<td>Not released</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>ooxml-schemas-1.3.jar</td>
|
||||
<td>POI 3.14 and newer</td>
|
||||
<td>Visio XML format support added, can be used with POI 3.7 - POI 3.13 as well</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>ooxml-schemas-1.4.jar</td>
|
||||
<td>POI 4.*.*</td>
|
||||
<td>Provide schema for AlternateContent, can be used with previous versions of POI as well</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>poi-ooxml-full jar</td>
|
||||
<td>POI 5.0.0 and newer</td>
|
||||
<td>Upgrade to ECMA-376 5th edition - which is not downward compatible</td>
|
||||
</tr>
|
||||
</table>
|
||||
</answer>
|
||||
</faq>
|
||||
<faq id="faq-N100B6">
|
||||
<question>
|
||||
Why is reading a simple sheet taking so long?
|
||||
</question>
|
||||
<answer>
|
||||
<p>You've probably enabled logging. Logging is intended only for
|
||||
autopsy style debugging. Having it enabled will reduce performance
|
||||
by a factor of at least 100. Logging is helpful for understanding
|
||||
why POI can't read some file or developing POI itself. Important
|
||||
errors are thrown as exceptions, which means you probably don't need
|
||||
logging.</p>
|
||||
</answer>
|
||||
</faq>
|
||||
<faq id="faq-N100C2">
|
||||
<question>
|
||||
What is the HSSF "eventmodel"?
|
||||
</question>
|
||||
<answer>
|
||||
<p>The SS eventmodel package is an API for reading Excel files without loading the whole spreadsheet into memory. It does
|
||||
require more knowledge on the part of the user, but reduces memory consumption by more than
|
||||
tenfold. It is based on the AWT event model in combination with SAX. If you need read-only
|
||||
access, this is the best way to do it.</p>
|
||||
</answer>
|
||||
</faq>
|
||||
<faq id="faq-N100CE">
|
||||
<question>
|
||||
Why can't I read the document I created using Star Office 5.1?
|
||||
</question>
|
||||
<answer>
|
||||
<p>Star Office 5.1 writes some records using the older BIFF standard. This causes some problems
|
||||
with POI which supports only BIFF8.</p>
|
||||
</answer>
|
||||
</faq>
|
||||
<faq id="faq-N100DA">
|
||||
<question>
|
||||
Why am I getting an exception each time I attempt to read my spreadsheet?
|
||||
</question>
|
||||
<answer>
|
||||
<p>It's possible your spreadsheet contains a feature that is not currently supported by POI.
|
||||
If you encounter this then please create the simplest file that demonstrates the trouble and submit it to
|
||||
<a href="https://issues.apache.org/bugzilla/buglist.cgi?product=POI">Bugzilla</a>.</p>
|
||||
</answer>
|
||||
</faq>
|
||||
<faq id="faq-N100E9">
|
||||
<question>
|
||||
How do you tell if a spreadsheet cell contains a date?
|
||||
</question>
|
||||
<answer>
|
||||
<p>Excel stores dates as numbers, so the only way to determine if a cell is
|
||||
actually stored as a date is to look at the formatting. There is a helper method
|
||||
in HSSFDateUtil that checks for this.
|
||||
Thanks to Jason Hoffman for providing the solution.</p>
|
||||
<source><![CDATA[
|
||||
case HSSFCell.CELL_TYPE_NUMERIC:
|
||||
double d = cell.getNumericCellValue();
|
||||
// test if a date!
|
||||
if (HSSFDateUtil.isCellDateFormatted(cell)) {
|
||||
// format in form of M/D/YY
|
||||
cal.setTime(HSSFDateUtil.getJavaDate(d));
|
||||
cellText =
|
||||
(String.valueOf(cal.get(Calendar.YEAR))).substring(2);
|
||||
cellText = cal.get(Calendar.MONTH)+1 + "/" +
|
||||
cal.get(Calendar.DAY_OF_MONTH) + "/" +
|
||||
cellText;
|
||||
}]]></source>
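<p>To make "dates are stored as numbers" concrete, the following sketch converts such a number
to a calendar date with plain Java. It assumes the 1900 date system and only handles serial
values from 61 (1 March 1900) onward, which sidesteps Excel's fictitious 29 February 1900.
The <em>ExcelSerialDate</em> class is a hypothetical illustration; in real code the POI
date helper mentioned above should do the conversion.</p>

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

/** Hypothetical illustration (not part of POI) of how a numeric cell value maps to a date. */
final class ExcelSerialDate {

    // Day zero chosen so that serial 61 maps to 1 March 1900 in the 1900 date system.
    private static final LocalDate DAY_ZERO = LocalDate.of(1899, 12, 30);

    /** Converts the whole-day part of a serial value; the fraction (time of day) is ignored. */
    static LocalDate toLocalDate(double serial) {
        if (serial < 61) {
            throw new IllegalArgumentException("serials before 1 March 1900 are not handled: " + serial);
        }
        return DAY_ZERO.plusDays((long) Math.floor(serial));
    }

    /** Formats the date as M/D/YY, the same shape the FAQ snippet builds by hand. */
    static String formatMDYY(double serial) {
        return toLocalDate(serial).format(DateTimeFormatter.ofPattern("M/d/yy"));
    }

    public static void main(String[] args) {
        System.out.println(formatMDYY(44197.75)); // 1 January 2021, 18:00 as stored in a cell
    }
}
```

<p>The number alone cannot say whether it is meant as a date, which is why the cell's
format still has to be checked first.</p>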
|
||||
</answer>
|
||||
</faq>
|
||||
<faq id="faq-N100F9">
|
||||
<question>
|
||||
I'm trying to stream an XLS file from a servlet and I'm having some trouble. What's the problem?
|
||||
</question>
|
||||
<answer>
|
||||
<p>
|
||||
The problem usually manifests itself as junk characters being shown on
|
||||
screen. The problem persists even though you have set the correct mime type.
|
||||
</p>
|
||||
<p>
|
||||
The short answer is, don't depend on IE to display a binary file type properly if you stream it via a
|
||||
servlet. Every minor version of IE has different bugs on this issue.
|
||||
</p>
|
||||
<p>
|
||||
The problem in most versions of IE is that it does not use the mime type on
|
||||
the HTTP response to determine the file type; rather it uses the file extension
|
||||
on the request. Thus you might want to add a
|
||||
<strong>.xls</strong> to your request
|
||||
string. For example
|
||||
<em>http://yourserver.com/myServlet.xls?param1=xx</em>. This is
|
||||
easily accomplished through URL mapping in any servlet container. Sometimes
|
||||
a request like
|
||||
<em>http://yourserver.com/myServlet?param1=xx&dummy=file.xls</em> is also
|
||||
known to work.
|
||||
</p>
|
||||
<p>
|
||||
To guarantee opening the file properly in Excel from IE, write out your file to a
|
||||
temporary file under your web root from your servlet. Then send an http response
|
||||
to the browser to do a client side redirection to your temp file. (Note that using a
|
||||
server side redirect using RequestDispatcher will not be effective in this case)
|
||||
</p>
|
||||
<p>
|
||||
Note also that when you request a document that is opened with an
|
||||
external handler, IE sometimes makes two requests to the webserver. So if your
|
||||
generating process is heavy, it makes sense to write out to a temporary file, so that multiple
|
||||
requests happen for a static file.
|
||||
</p>
|
||||
<p>
|
||||
None of this is particular to Excel. The same problem arises when you try to
|
||||
generate any binary file dynamically to an IE client. For example, if you generate
|
||||
pdf files using
|
||||
<a href="https://xml.apache.org/fop">FOP</a>, you will come across many of the same issues.
|
||||
</p>
|
||||
<!-- Thanks to Avik for the answer -->
|
||||
</answer>
|
||||
</faq>
|
||||
<faq id="faq-N10123">
|
||||
<question>
|
||||
I want to set a cell format (data format of a cell) of an Excel sheet as ###,###,###.#### or ###,###,###.0000. Is it possible using POI?
|
||||
</question>
|
||||
<answer>
|
||||
<p>
|
||||
Yes. You first need to get a DataFormat object from the workbook and call getFormat with the desired format. Some examples are <a href="../components/spreadsheet/quick-guide.html#DataFormats">here</a>.
|
||||
</p>
|
||||
</answer>
|
||||
</faq>
|
||||
<faq id="faq-N10133">
|
||||
<question>
|
||||
I want to set a cell format (data format of a cell) of an Excel sheet as text. Is it possible using POI?
|
||||
</question>
|
||||
<answer>
|
||||
<p>
|
||||
Yes. This is a built-in format for Excel that you can get from the DataFormat object using the format string "@". Also, the string "text" will alias this format.
|
||||
</p>
|
||||
</answer>
|
||||
</faq>
|
||||
<faq id="faq-N1013F">
|
||||
<question>
|
||||
How do I add a border around a merged cell?
|
||||
</question>
|
||||
<answer>
|
||||
<p>Add blank cells around where the cells normally would have been and set the borders individually for each cell.
|
||||
We will probably enhance HSSF in the future to make this process easier.</p>
|
||||
</answer>
|
||||
</faq>
|
||||
<faq id="faq-N1014B">
|
||||
<question>
|
||||
I am using styles when creating a workbook in POI, but Excel refuses to open the file, complaining about "Too Many Styles".
|
||||
</question>
|
||||
<answer>
|
||||
<p>Create the styles OUTSIDE of the loop in which you create cells, so that each style is created once and then reused.</p>
|
||||
<p>GOOD:</p>
|
||||
<source><![CDATA[
|
||||
HSSFWorkbook wb = new HSSFWorkbook();
|
||||
HSSFSheet sheet = wb.createSheet("new sheet");
|
||||
HSSFRow row = sheet.createRow(1000); // row for the aqua sample cell; the loop below fills rows 0-999
|
||||
|
||||
// Aqua background
|
||||
HSSFCellStyle style = wb.createCellStyle();
|
||||
style.setFillBackgroundColor(HSSFColor.AQUA.index);
|
||||
style.setFillPattern(HSSFCellStyle.BIG_SPOTS);
|
||||
HSSFCell cell = row.createCell((short) 1);
|
||||
cell.setCellValue("X");
|
||||
cell.setCellStyle(style);
|
||||
|
||||
// Orange "foreground",
|
||||
// foreground being the fill foreground not the font color.
|
||||
style = wb.createCellStyle();
|
||||
style.setFillForegroundColor(HSSFColor.ORANGE.index);
|
||||
style.setFillPattern(HSSFCellStyle.SOLID_FOREGROUND);
|
||||
|
||||
for (int x = 0; x < 1000; x++) {
|
||||
|
||||
// Create a row and put some cells in it. Rows are 0 based.
|
||||
row = sheet.createRow((short) x);
|
||||
|
||||
for (int y = 0; y < 100; y++) {
|
||||
cell = row.createCell((short) y);
|
||||
cell.setCellValue("X");
|
||||
cell.setCellStyle(style);
|
||||
}
|
||||
}
|
||||
|
||||
// Write the output to a file
|
||||
FileOutputStream fileOut = new FileOutputStream("workbook.xls");
|
||||
wb.write(fileOut);
|
||||
fileOut.close();
|
||||
]]></source>
|
||||
<p>BAD:</p>
|
||||
<source><![CDATA[
|
||||
HSSFWorkbook wb = new HSSFWorkbook();
|
||||
HSSFSheet sheet = wb.createSheet("new sheet");
|
||||
HSSFRow row = sheet.createRow(1000); // initial row so the first loop pass has a row to write to
|
||||
|
||||
for (int x = 0; x < 1000; x++) {
|
||||
// Aqua background
|
||||
HSSFCellStyle style = wb.createCellStyle();
|
||||
style.setFillBackgroundColor(HSSFColor.AQUA.index);
|
||||
style.setFillPattern(HSSFCellStyle.BIG_SPOTS);
|
||||
HSSFCell cell = row.createCell((short) 1);
|
||||
cell.setCellValue("X");
|
||||
cell.setCellStyle(style);
|
||||
|
||||
// Orange "foreground",
|
||||
// foreground being the fill foreground not the font color.
|
||||
style = wb.createCellStyle();
|
||||
style.setFillForegroundColor(HSSFColor.ORANGE.index);
|
||||
style.setFillPattern(HSSFCellStyle.SOLID_FOREGROUND);
|
||||
|
||||
// Create a row and put some cells in it. Rows are 0 based.
|
||||
row = sheet.createRow((short) x);
|
||||
|
||||
for (int y = 0; y < 100; y++) {
|
||||
cell = row.createCell((short) y);
|
||||
cell.setCellValue("X");
|
||||
cell.setCellStyle(style);
|
||||
}
|
||||
}
|
||||
|
||||
// Write the output to a file
|
||||
FileOutputStream fileOut = new FileOutputStream("workbook.xls");
|
||||
wb.write(fileOut);
|
||||
fileOut.close();]]></source>
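<p>When styles are chosen deep inside the cell loop and cannot easily be hoisted out of it,
a small cache gives the same effect: each distinct style is created once and reused afterwards.
The sketch below shows only the caching pattern with plain Java; <em>StyleCache</em> is a
hypothetical name and is not part of POI. With POI, the factory passed to <em>get</em> would be
the place to call <em>wb.createCellStyle()</em> and configure the style.</p>

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical helper (not part of POI): creates each named style once, then reuses it. */
final class StyleCache<S> {

    private final Map<String, S> styles = new HashMap<>();

    /** Returns the cached style for the name, calling the factory only on first use. */
    S get(String name, Supplier<S> factory) {
        return styles.computeIfAbsent(name, key -> factory.get());
    }

    /** Number of distinct styles created so far. */
    int size() {
        return styles.size();
    }

    public static void main(String[] args) {
        StyleCache<Object> cache = new StyleCache<>();
        int[] created = {0};
        for (int x = 0; x < 1000; x++) {
            for (int y = 0; y < 100; y++) {
                // A plain Object stands in for a cell style in this sketch.
                cache.get("orange", () -> { created[0]++; return new Object(); });
            }
        }
        System.out.println("styles created: " + created[0]);
    }
}
```

<p>The key should capture everything that distinguishes one style from another
(colour, pattern, data format), otherwise unrelated cells end up sharing a style.</p>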
|
||||
</answer>
|
||||
</faq>
|
||||
<faq id="faq-N10165">
|
||||
<question>
|
||||
I think POI is using too much memory! What can I do?
|
||||
</question>
|
||||
<answer>
|
||||
<p>This one comes up quite a lot, but often the reason isn't what
|
||||
you might initially think. So, the first thing to check is - what's
|
||||
the source of the problem? Your file? Your code? Your environment?
|
||||
Or Apache POI?</p>
|
||||
<p>(If you're here, you probably think it's Apache POI. However, it
|
||||
often isn't! A moderate laptop, with a decent but not excessive heap
|
||||
size, from a standing start, can normally read or write a file with
|
||||
100 columns and 100,000 rows in under a couple of seconds, including
|
||||
the time to start the JVM).</p>
|
||||
<p>Apache POI ships with a few programs and a few example programs,
|
||||
which can be used to do some basic performance checks. For testing
|
||||
file generation, the class to use is in the examples package,
|
||||
<a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/ss/SSPerformanceTest.java">SSPerformanceTest</a>
|
||||
(<a href="https://svn.apache.org/viewvc/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/ss/SSPerformanceTest.java">viewvc</a>).
|
||||
Run SSPerformanceTest with arguments of the writing type (HSSF, XSSF
|
||||
or SXSSF), the number of rows, the number of columns, and if the file
|
||||
should be saved. If you can't run that with 50,000 rows and 50 columns
|
||||
in HSSF and SXSSF in under 3 seconds, and XSSF in under 20 seconds
|
||||
(and ideally all 3 in less than that!), then the problem is with
|
||||
your environment.</p>
|
||||
<p>Next, use the example program
|
||||
<a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/ss/ToCSV.java">ToCSV</a>
|
||||
(<a href="https://svn.apache.org/viewvc/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/ss/ToCSV.java">viewvc</a>)
|
||||
to try reading the file in with HSSF or XSSF. Related is
|
||||
<a href="https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/eventusermodel/XLSX2CSV.java">XLSX2CSV</a>
|
||||
(<a href="https://svn.apache.org/viewvc/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/eventusermodel/XLSX2CSV.java">viewvc</a>),
|
||||
which uses SAX parsing for .xlsx. Run this against both your problem file,
|
||||
and a simple one generated by SSPerformanceTest of the same size. If this is
|
||||
slow, then there could be an Apache POI problem with how the file is being
|
||||
processed (POI makes some assumptions that might not always be right on all
|
||||
files). If these tests are fast, then performance problems likely are in your
|
||||
code.</p>
|
||||
</answer>
|
||||
</faq>
|
||||
<faq id="faq-N10192">
|
||||
<question>
|
||||
I can't seem to find the source for the OOXML CT.. classes, where do they
|
||||
come from?
|
||||
</question>
|
||||
<answer>
|
||||
<p>The OOXML support in Apache POI is built on top of the file format
|
||||
XML Schemas, as compiled into Java using
|
||||
<a href="https://xmlbeans.apache.org/">XMLBeans</a>. Currently,
|
||||
the compilation is done with XMLBeans 5.x, for maximum compatibility
|
||||
with installations.</p>
|
||||
<p>All of the <em>org.openxmlformats.schemas.spreadsheetml.x2006</em> CT...
|
||||
classes are auto-generated by XMLBeans. The resulting generated Java goes
|
||||
in the <em>poi-ooxml-full-*-sources</em> jar, and the compiled version into the
|
||||
<em>poi-ooxml-full</em> jar.</p>
|
||||
<p>The full <em>poi-ooxml-full</em> jar is distributed with Apache POI,
|
||||
along with the cut-down <em>poi-ooxml-lite</em> jar containing just
|
||||
the common parts. Use the sources of <em>poi-ooxml-full</em> for the lite version too,
|
||||
which is available from Maven Central - ask your favourite Maven
|
||||
mirror for the <em>poi-ooxml-full-*-sources</em> jar. Alternately, if you download
|
||||
the POI source distribution (or checkout from SVN) and build, Ant will
|
||||
automatically compile it for you to generate the source and binary poi-ooxml-full jars.</p>
|
||||
</answer>
|
||||
</faq>
|
||||
<faq id="faq-N101BA">
|
||||
<question>
|
||||
An OLE2 ("binary") file is giving me problems, but I can't share it. How can I investigate the problem on my own?
|
||||
</question>
|
||||
<answer>
|
||||
<p>The first thing to try is running the
|
||||
<a href="https://blogs.msdn.com/b/officeinteroperability/archive/2011/07/12/microsoft-office-binary-file-format-validator-is-now-available.aspx">Binary File Format Validator</a>
|
||||
from Microsoft against the file, which will report if the file
|
||||
complies with the specification. If your input file doesn't, then this
|
||||
may well explain why POI isn't able to process it correctly. You
|
||||
should probably in this case speak to whoever is generating the file,
|
||||
and have them fix it there. If your POI generated file is identified
|
||||
as having an issue, and you're on the
|
||||
<a href="site:howtobuild">latest codebase</a>, report a new
|
||||
POI bug and include the details of the validation failure.</p>
|
||||
<p>Another thing to try, especially if the file is valid but POI isn't
|
||||
behaving as expected, are the POI Dev Tools for the component you're
|
||||
using. For example, HSSF has <em>org.apache.poi.hssf.dev.BiffViewer</em>
|
||||
which will allow you to view the file as POI does. This will often
|
||||
allow you to check that things are being read as you expect, and
|
||||
narrow in on problem records and structures.</p>
|
||||
</answer>
|
||||
</faq>
|
||||
<faq id="faq-N101D4">
|
||||
<question>
|
||||
An OOXML ("xml") file is giving me problems, but I can't share it. How can I investigate the problem on my own?
|
||||
</question>
|
||||
<answer>
|
||||
<p>There's not currently a simple validator tool as there is for the
|
||||
OLE2 based (binary) file formats, but checking the basics of a file
|
||||
is generally much easier.</p>
|
||||
<p>Files such as .xlsx, .docx and .pptx are actually a zip file of XML
|
||||
files, with a special structure. Your first step in diagnosing the
|
||||
issues with the input or output file will likely be to unzip the
|
||||
file, and look at the XML of it. Newer versions of Office will
|
||||
normally tell you which area of the file is problematic, so
|
||||
narrow in on there. Looking at the XML, does it look correct?</p>
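<p>Because the container is an ordinary zip file, its parts can be listed with the JDK alone
before unpacking anything. The following is a minimal sketch; <em>OoxmlPartLister</em> is a
hypothetical name, and the file path is taken from the command line.</p>

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

/** Hypothetical helper (not part of POI): lists the parts inside an .xlsx, .docx or .pptx file. */
final class OoxmlPartLister {

    /** Returns the entry names in the order they appear in the zip directory. */
    static List<String> listParts(Path file) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(file.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: OoxmlPartLister <file.xlsx|file.docx|file.pptx>");
            return;
        }
        for (String name : listParts(Paths.get(args[0]))) {
            System.out.println(name);
        }
    }
}
```

<p>If the file cannot even be opened as a zip, the problem lies in the container itself
rather than in the XML inside it.</p>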
|
||||
<p>When reporting bugs, ideally include the whole file, but if you're
|
||||
unable to then include the snippet of XML for the problem area, and
|
||||
reference the OOXML standard for what it should contain.</p>
|
||||
</answer>
|
||||
</faq>
|
||||
<faq id="faq-N101E6">
|
||||
<question>
|
||||
Why do I get a java.lang.NoClassDefFoundError: javax/xml/stream/XMLEventFactory.newFactory()?
|
||||
</question>
|
||||
<answer>
|
||||
<p><strong>Applies to versions <= 3.17 (Java 6): </strong></p>
|
||||
<p>This error indicates that the class XMLEventFactory does not provide
|
||||
functionality which POI is depending upon. There can be a number of
|
||||
different reasons for this:</p>
|
||||
<ul>
|
||||
<li>Outdated xml-apis.jar, stax-apis.jar or xercesImpl.jar:<br/>
|
||||
These libraries were required with Java 5 and lower, but are not actually
|
||||
required with spec-compliant Java 6 implementations, so try removing those
|
||||
libraries from your classpath. If this is not possible, try upgrading to a
|
||||
newer version of those jar files.
|
||||
</li>
|
||||
<li>Running IBM Java 6 (potentially as part of WebSphere Application Server):<br/>
|
||||
IBM Java 6 does not provide all the interfaces required by the XML standards,
|
||||
only IBM Java 7 seems to provide the correct interfaces, so try upgrading
|
||||
your JDK.
|
||||
</li>
|
||||
<li>Sun/Oracle Java 6 with outdated patchlevel:<br/>
|
||||
Some of the interfaces were only included/fixed in some of the patchlevels for
|
||||
Java 6. Try running with the latest available patchlevel or even better use
|
||||
Java 7/8 where this functionality should be available in all cases.
|
||||
</li>
|
||||
</ul>
|
||||
</answer>
|
||||
</faq>
|
||||
<faq id="faq-N10204">
|
||||
<question>
|
||||
Can I mix POI jars from different versions?
|
||||
</question>
|
||||
<answer>
|
||||
<p>No. This is not supported.</p>
|
||||
<p>All POI jars in use must come from the same version. A combination
|
||||
such as <em>poi-3.11.jar</em> and <em>poi-ooxml-3.9.jar</em> is not
|
||||
supported, and will fail to work in unpredictable ways.</p>
|
||||
<p>If you're not sure which POI jars you're using at runtime, and/or
|
||||
you suspect it might not be the one you intended, see
|
||||
<a href="#faq-N10006">this FAQ entry</a> for details on
|
||||
diagnosing it. If you aren't sure what POI jars you need, see the
|
||||
<a href="site:components">Components Overview</a>
|
||||
for details.</p>
|
||||
</answer>
|
||||
</faq>
|
||||
<faq id="faq-N10224">
|
||||
<question>
|
||||
Can I access/modify workbooks/documents/slideshows in multiple threads?
|
||||
What are the multi-threading guarantees that Apache POI makes?
|
||||
</question>
|
||||
<answer>
|
||||
<p>In short: <em>Handling different document-objects in different threads will
|
||||
work. Accessing the same document in multiple threads will not work.</em></p>
|
||||
<p>This means the workbook/document/slideshow objects are not checked for
|
||||
thread safety, but any globally held object like global caches or other
|
||||
data structures are guarded against multi threaded access accordingly.</p>
|
||||
<p>There have been
|
||||
<a href="https://mail-archives.apache.org/mod_mbox/poi-user/201109.mbox/%3C1314859350817-4757295.post@n5.nabble.com%3E">discussions</a>
|
||||
about accessing different Workbook-sheets
|
||||
in different threads concurrently. While this may work to some degree, it may lead
|
||||
to errors that are very hard to track down, as multi-threading issues typically only
|
||||
manifest after long runtime when many threads are active and the system
|
||||
is under high load, i.e. in production use! Also it might break in future
|
||||
versions of Apache POI as we do not specifically test using the library
|
||||
this way.</p>
|
||||
</answer>
|
||||
</faq>
|
||||
<faq id="faq-N1023C">
|
||||
<question>
|
||||
What are the advantages and disadvantages of the different constructor and
|
||||
write methods?
|
||||
</question>
|
||||
<answer>
|
||||
<p>Across most of the UserModel classes (
|
||||
<a href="../apidocs/dev/org/apache/poi/POIDocument.html">POIDocument</a>
|
||||
and
|
||||
<a href="../apidocs/dev/org/apache/poi/ooxml/POIXMLDocument.html">POIXMLDocument</a>),
|
||||
you can open the document from a read-only <em>File</em>, a read-write <em>File</em>
|
||||
or an <em>InputStream</em>. You can always write out to an <em>OutputStream</em>,
|
||||
and increasingly also to a <em>File</em>.
|
||||
</p>
|
||||
<p>Opening your document from a <em>File</em> is suggested wherever possible.
|
||||
This will always be quicker and use less memory than using an <em>InputStream</em>,
|
||||
as the latter has to buffer things in memory.</p>
|
||||
<p>When writing, you can use an <em>OutputStream</em> to write to a new file, or
|
||||
overwrite an existing one (provided it isn't already open!). On slow links / disks,
|
||||
wrapping with a <em>BufferedOutputStream</em> is suggested. To write like this, use
|
||||
<a href="../apidocs/dev/org/apache/poi/POIDocument.html#write(java.io.OutputStream)">write(OutputStream)</a>.
|
||||
</p>
|
||||
<p>To write to the currently open file (an in-place write / replace), you need to
|
||||
have opened your document from a <em>File</em>, not an <em>InputStream</em>. In
|
||||
addition, you need to have opened from the <em>File</em> in read-write mode, not
|
||||
read-only mode. To write to the currently open file, on formats that support it
|
||||
(not all do), use
|
||||
<a href="../apidocs/dev/org/apache/poi/POIDocument.html#write()">write()</a>.
|
||||
</p>
|
||||
<p>You can also write out to a new <em>File</em>. This is available no matter how
|
||||
you opened the document, and will create/replace a new file. It is faster and lower
|
||||
memory than writing to an <em>OutputStream</em>. However, you can't use this to
|
||||
replace the currently open file, only files not currently open. To write to a
|
||||
new / different file, use
|
||||
<a href="../apidocs/dev/org/apache/poi/POIDocument.html#write(java.io.File)">write(File)</a>.
|
||||
</p>
|
||||
<p>More information is also available in the
|
||||
<a href="../components/spreadsheet/quick-guide.html#FileInputStream">HSSF and XSSF documentation</a>,
|
||||
which largely applies to the other formats too.
|
||||
</p>
|
||||
<p>Note that currently (POI 3.15 beta 3), not all of the write methods are available
|
||||
for the OOXML formats yet.
|
||||
</p>
|
||||
</answer>
|
||||
</faq>
|
||||
<faq id="faq-N1029C">
|
||||
<question>
|
||||
Can POI be used with OSGi?
|
||||
</question>
|
||||
<answer>
|
||||
<p>Starting with POI 3.16 there's a workaround for OSGi's context classloader handling,
|
||||
i.e. it replaces the thread's current context classloader with an implementation of
|
||||
limited class view. This will lead to IllegalStateExceptions, as xmlbeans can't find
|
||||
the xml schema definitions in this reduced view. The workaround is to initialize
|
||||
the classloader delegate of <em>POIXMLTypeLoader</em> , which defaults to the current
|
||||
thread context classloader. The initialization should take place before any other
|
||||
OOXML related calls. The class in the example could be any class, which is
|
||||
part of the poi-ooxml-schemas or ooxml-schemas jar:<br/>
|
||||
<em> POIXMLTypeLoader.setClassLoader(CTTable.class.getClassLoader());</em>
|
||||
</p>
|
||||
</answer>
|
||||
</faq>
|
||||
<faq id="faq-N102B0">
|
||||
<question>
|
||||
Can Apache POI be compiled/used with Java 11, 17 and 21?
|
||||
</question>
|
||||
<answer>
|
||||
<p>
|
||||
POI is successfully tested with many different versions of Java. It is
|
||||
recommended that you use Java versions that have Long Term Support (Java 11, 17 and 21).
|
||||
</p>
|
||||
<p>Including the existing binaries as normal jar-files
|
||||
should work when using recent versions of Apache POI. You may see
|
||||
some warnings about illegal reflective access, but it should work fine
|
||||
despite those. We are working on getting the code changed so we avoid
|
||||
discouraged accesses in the future.
|
||||
</p>
|
||||
<p>NOTE: Apache POI tries to support the Java module system but it is more complicated
|
||||
because Apache POI is still supporting Java 8 and the module system
|
||||
cannot be fully supported while maintaining such support.
|
||||
</p>
|
||||
<p>
|
||||
FYI, JAXB in current versions also causes some warnings about reflective access.
|
||||
We cannot fix those until JAXB >= 2.4.0 is available; see
|
||||
https://stackoverflow.com/a/50251510/411846 for details. To avoid this warning, you can set the system
|
||||
property "com.sun.xml.bind.v2.bytecode.ClassTailor.noOptimize".
|
||||
</p>
|
||||
<p>
|
||||
For compiling Apache POI, you should use the sources of version 4.1.0 or later,
or a recent trunk checkout.
|
||||
</p>
|
||||
<p>
|
||||
If you are building POI yourself from source files, use an up-to-date version of Gradle.
|
||||
If you use Ant, check that the Ant version supports the version of Java you are using.
|
||||
</p>
|
||||
</answer>
|
||||
</faq>
|
||||
<faq id="faq-java10">
|
||||
<question>
|
||||
Can Apache POI be compiled/used with Java 9 or Java 10?
|
||||
</question>
|
||||
<answer>
|
||||
<p>Apache POI no longer actively supports Java 9 or Java 10, as those versions are
no longer supported by Oracle. See the previous FAQ entry for information about support for
|
||||
Java LTS versions.
|
||||
</p>
|
||||
</answer>
|
||||
</faq>
|
||||
<faq id="faq-ibmjdk">
|
||||
<question>
|
||||
Anything to consider when using IBM JDK?
|
||||
</question>
|
||||
<answer>
|
||||
<p>The IBM Java runtime uses a JIT compiler which sometimes misbehaves. ;)
|
||||
Especially when rendering slideshows it throws errors, which don't occur when debugging the code.
|
||||
E.g. an ArrayIndexOutOfBoundsException is thrown in TexturePaintContext when the image contains
|
||||
textures - see <a href="https://bz.apache.org/bugzilla/show_bug.cgi?id=62999">#62999</a> for more
|
||||
details on how to detect JIT errors.</p>
|
||||
<p>To prevent the JIT errors, the affected methods need to be excluded from JIT compilation.
|
||||
Currently (tested with IBM JDK 1.8.0_144 and _191) the following should be added to the VM parameters:<br/>
|
||||
</p>
|
||||
<source>
|
||||
-Xjit:exclude={sun/java2d/pipe/AAShapePipe.renderTiles(Lsun/java2d/SunGraphics2D;Ljava/awt/Shape;Lsun/java2d/pipe/AATileGenerator;[I)V},exclude={sun/java2d/pipe/AlphaPaintPipe.renderPathTile(Ljava/lang/Object;[BIIIIII)V},exclude={java/awt/TexturePaintContext.getRaster(IIII)Ljava/awt/image/Raster;}
|
||||
</source>
|
||||
</answer>
|
||||
</faq>
|
||||
<faq id="faq-thread-local-memory-leaks">
|
||||
<question>
|
||||
Tomcat is reporting memory leaks caused by some class in Apache POI which uses ThreadLocal
|
||||
</question>
|
||||
<answer>
|
||||
<p>Apache POI uses Java <a href="https://docs.oracle.com/javase/8/docs/api/java/lang/ThreadLocal.html">ThreadLocals</a>
|
||||
in order to cache some data when Apache POI is used in a multi-threading environment (see also the FAQ about thread-safety above!)
|
||||
</p>
|
||||
<p>Web servers like Tomcat use thread-pooling to re-use threads to avoid the cost of frequent thread-startup and shutdown.
|
||||
In order to guard against memory-leaks, Tomcat performs checks on allocated memory in ThreadLocals and reports them as warnings.
|
||||
</p>
|
||||
<p>In order to get rid of these warnings, Apache POI, starting with version 5.2.4, provides a utility ThreadLocalUtil which can
|
||||
be used to clear all objects held in thread-local objects before returning the thread back to the global pool.
|
||||
</p>
|
||||
<source>
|
||||
org.apache.poi.util.ThreadLocalUtil.clearAllThreadLocals();
|
||||
|
||||
// if you use poi-ooxml, also clear thread-locals in XMLBeans
|
||||
org.apache.xmlbeans.ThreadLocalUtil.clearAllThreadLocals();
|
||||
</source>
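<p>To illustrate why pooled threads need this cleanup, here is a minimal sketch that uses only the JDK.
It is not POI code: the class <em>ThreadLocalCleanupSketch</em>, its <em>CACHE</em> field and the method
<em>valueSeenBySecondTask</em> are hypothetical names chosen for this example. In a real application, the
<em>clearAllThreadLocals()</em> calls from the snippet above would take the place of <em>CACHE.remove()</em>.
</p>
<source><![CDATA[
```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreadLocalCleanupSketch {
    // Hypothetical per-thread cache, standing in for data a library keeps in a ThreadLocal.
    static final ThreadLocal<String> CACHE = new ThreadLocal<>();

    // Runs two tasks one after the other on a single pooled thread and returns
    // what the second task finds in the cache.
    static String valueSeenBySecondTask(boolean cleanUp) {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            Runnable firstTask = () -> {
                try {
                    CACHE.set("left over from task 1");
                } finally {
                    if (cleanUp) {
                        // In a POI application, the clearAllThreadLocals() calls would go here.
                        CACHE.remove();
                    }
                }
            };
            Callable<String> secondTask = () -> CACHE.get();
            pool.submit(firstTask).get();
            return pool.submit(secondTask).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("without cleanup: " + valueSeenBySecondTask(false));
        System.out.println("with cleanup: " + valueSeenBySecondTask(true));
    }
}
```
]]></source>
<p>Doing the cleanup in a finally block means the thread goes back to the pool in a clean state even if the
task fails with an exception.
</p>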
|
||||
</answer>
|
||||
</faq>
|
||||
<faq id="faq-demand-fix-asap">
|
||||
<question>
|
||||
How can I demand fixes or features in Apache POI to be done with urgency?
|
||||
</question>
|
||||
<answer>
|
||||
<p>Apache POI is an open source project developed by a very small group of volunteers.
|
||||
</p>
|
||||
<p>Currently no-one is paid to work on new features or bug-fixes.
|
||||
</p>
|
||||
<p>It is therefore considered fairly rude to "demand" things. Asking for something "ASAP" is especially frowned
upon and may even reduce the likelihood that your issue is picked up and worked on.
|
||||
</p>
|
||||
<p>If you would like to increase the chances that your problem is tackled, you can do a number of things,
listed here in order of the amount of effort which may be required from you:
|
||||
</p>
|
||||
<ul>
|
||||
<li>Ensure your bug-report is complete and contains instructions/samples which allow others to reproduce the problem.
|
||||
Ideally a self-sufficient test-case which does not need lots of manual setup.</li>
|
||||
<li>Provide a summary of research of the root-cause of your problem.</li>
|
||||
<li>Provide a patch which fixes the problem. We usually like to have unit-tests accompanying changes to
|
||||
have high code-coverage and good confidence that issues are fixed and few regressions are introduced
|
||||
over time.</li>
|
||||
<li>Become a contributor! The entry threshold is actually not too high once you have provided your
|
||||
first successful bugfix. If you think you can spare the time to contribute for some longer time,
|
||||
becoming an official committer should not be too hard.</li>
|
||||
</ul>
|
||||
</answer>
|
||||
</faq>
|
||||
<faq id="faq-reproducible-build-and-output">
|
||||
<question>
|
||||
Does Apache POI support building reproducibly and/or producing reproducible output?
|
||||
</question>
|
||||
<answer>
|
||||
<p>There are two angles to reproducibility: building reproducible jars for Apache POI itself and making Apache POI
|
||||
produce byte-for-byte identical files when it is used to create documents.
|
||||
</p>
|
||||
<ul>
|
||||
<li>The build of jars for Apache POI should be reproducible since version 5.2.4 by removing the build-timestamp
|
||||
from the generated Version.java. Make sure the exact same combination of build-tools is used,
|
||||
especially the version of the JDK.</li>
|
||||
<li>Producing reproducible output files will be supported in a future release (after version 5.3.0); initial support is available in
|
||||
nightly builds.<br/>
|
||||
Note: Files are only written without timestamps if the environment variable SOURCE_DATE_EPOCH is set to a
|
||||
non-empty value.</li>
|
||||
</ul>
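<p>As a rough illustration of the SOURCE_DATE_EPOCH convention, the following JDK-only sketch shows how an
application could check the variable before writing files. It assumes the value is a whole number of seconds
since the Unix epoch, as described on reproducible-builds.org. The class <em>SourceDateEpochSketch</em> and its
<em>parse</em> method are hypothetical names, and how Apache POI itself handles the variable may differ.
</p>
<source><![CDATA[
```java
import java.time.Instant;
import java.util.Optional;

class SourceDateEpochSketch {
    // Parses a SOURCE_DATE_EPOCH-style value (assumed: seconds since the Unix epoch).
    // Returns an empty Optional when the value is missing or blank.
    // A non-numeric value makes Long.parseLong throw a NumberFormatException.
    static Optional<Instant> parse(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(Instant.ofEpochSecond(Long.parseLong(raw.trim())));
    }

    public static void main(String[] args) {
        Optional<Instant> fixed = parse(System.getenv("SOURCE_DATE_EPOCH"));
        System.out.println(fixed
                .map(instant -> "fixed timestamp requested: " + instant)
                .orElse("SOURCE_DATE_EPOCH is not set"));
    }
}
```
]]></source>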
|
||||
<p>Please create a bug entry if you find things which break reproducibility, both for building and output files.<br/>
|
||||
Please provide the exact steps to reproduce your issue!
|
||||
</p>
|
||||
<p>See <a href="https://reproducible-builds.org/">https://reproducible-builds.org/</a> for general information about why reproducible builds
|
||||
and output may be important.
|
||||
</p>
|
||||
</answer>
|
||||
</faq>
|
||||
</faqs>
|
||||
142
src/documentation/content/xdocs/help/index.xml
Normal file
@ -0,0 +1,142 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Mailing Lists</title>
|
||||
<authors>
|
||||
<person id="NB" name="Nick Burch" email="nick@apache.org"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section><title>Mailing Lists - Guidelines</title>
|
||||
<p>
|
||||
<strong>Before subscribing or participating in any of the mailing
|
||||
lists, we suggest you read and understand the following
|
||||
guidelines:</strong>
|
||||
</p>
|
||||
<ul>
|
||||
<li><a href="https://www.apache.org/foundation/mailinglists.html">ASF guide to Mailing Lists</a></li>
|
||||
<li><a href="https://www.apache.org/dev/contrib-email-tips.html">ASF Tips for email contributors</a></li>
|
||||
<li><a href="https://jakarta.apache.org/site/mail.html">The Jakarta guide to Mailing Lists</a></li>
|
||||
</ul>
|
||||
</section>
|
||||
<section><title>Lists</title>
|
||||
<section><title>The POI Developer List</title>
|
||||
<p>
|
||||
<strong>Medium Traffic</strong>
|
||||
<a href="https://lists.apache.org/list.html?dev@poi.apache.org">View,
|
||||
Participate and Subscribe to the Dev List</a>
|
||||
</p>
|
||||
<p>
|
||||
This is the list where participating developers of the POI
|
||||
project meet and discuss issues, code changes/additions, etc.
|
||||
Subscribers to this list also get notices of each and every
|
||||
code change, build results, testing notices, etc.
|
||||
<strong>Do not send mail to this list with usage questions or
|
||||
configuration problems. Use the <a href="#The+POI+User+List">POI User List</a> or community sites
|
||||
such as <a href="#Stack+Overflow+and+other+communities">Stack Overflow</a>, instead.</strong>
|
||||
</p>
|
||||
<p>
|
||||
Alternate options:
|
||||
<a href="mailto:dev-subscribe@poi.apache.org">Subscribe</a>
|
||||
<a href="mailto:dev-unsubscribe@poi.apache.org">Unsubscribe</a>
|
||||
<a href="https://mail-archives.apache.org/mod_mbox/poi-dev/">Old Archive</a>
|
||||
<!--a href="http://news.gmane.org/gmane.comp.jakarta.poi.devel">gmane.org</a-->
|
||||
<a href="http://apache-poi.1045710.n5.nabble.com/POI-Dev-f2312866.html">Nabble</a>
|
||||
<a href="http://markmail.org/search/org.apache.poi.dev+list:org.apache.poi.dev">MarkMail</a>
|
||||
</p>
|
||||
</section>
|
||||
<section><title>The POI User List</title>
|
||||
<p>
|
||||
<strong>Low Traffic</strong>
|
||||
<a href="https://lists.apache.org/list.html?user@poi.apache.org">View,
|
||||
Participate and Subscribe to the User List</a>
|
||||
</p>
|
||||
<p>
|
||||
This list is for users of POI to ask questions, share knowledge,
|
||||
and discuss issues. POI developers are also expected to be
|
||||
lurking on this list to offer support to users of POI.
|
||||
</p>
|
||||
<p>
|
||||
Alternate options:
|
||||
<a href="mailto:user-subscribe@poi.apache.org">Subscribe</a>
|
||||
<a href="mailto:user-unsubscribe@poi.apache.org">Unsubscribe</a>
|
||||
<a href="https://mail-archives.apache.org/mod_mbox/poi-user/">Old Archive</a>
|
||||
<!--a href="http://news.gmane.org/thread.php?group=gmane.comp.jakarta.poi.user">gmane.org</a-->
|
||||
<a href="http://apache-poi.1045710.n5.nabble.com/POI-User-f2280730.html">Nabble</a>
|
||||
<a href="http://markmail.org/search/org.apache.poi.user+list:org.apache.poi.user">MarkMail</a>
|
||||
</p>
|
||||
</section>
|
||||
<section><title>The POI General List</title>
|
||||
<p>
|
||||
<strong>Very Low Traffic</strong>
|
||||
<a href="https://lists.apache.org/list.html?general@poi.apache.org">View,
|
||||
Participate and Subscribe to the General List</a>
|
||||
</p>
|
||||
<p>
|
||||
This list exists for general discussions on POI, not specific to
|
||||
code or problems with code. Used for discussion of general matters
|
||||
relating to all of the POI project, such as the website and
|
||||
changes in procedures.
|
||||
</p>
|
||||
<p>
|
||||
Alternate options:
|
||||
<a href="mailto:general-subscribe@poi.apache.org">Subscribe</a>
|
||||
<a href="mailto:general-unsubscribe@poi.apache.org">Unsubscribe</a>
|
||||
<a href="https://mail-archives.apache.org/mod_mbox/poi-general/">Old Archive</a>
|
||||
</p>
|
||||
</section>
|
||||
</section>
|
||||
<section><title>Stack Overflow and other communities</title>
|
||||
<p>
|
||||
There are many POI users in the Stack Overflow community who have asked
|
||||
and answered questions that may be similar to the problem you are facing.
|
||||
Search for the <a href="http://stackoverflow.com/questions/tagged/apache-poi">apache-poi</a>
|
||||
tag on Stack Overflow.
|
||||
</p>
|
||||
<p>Regardless of which community you seek help from, remember to be courteous.
|
||||
Short, working code examples, an explanation of observed and expected behavior,
|
||||
the version of POI you are using, and genuine troubleshooting and research effort
|
||||
on your part go a long way towards getting a helpful answer.
|
||||
</p>
|
||||
<p>Please read through the <a href="site:faq">FAQ</a>,
|
||||
<a href="site:ssquickguide">Quick Guide</a>,
|
||||
<a href="site:sshowto">How To</a> or
|
||||
<a href="site:xslfcook">Cookbook</a>, and
|
||||
<a href="site:ssexamples">Examples</a>
|
||||
of the POI module that you are trying to use before asking for help. You may also find that your
question has already been answered on the POI <a href="#The+POI+Developer+List">dev</a>
or <a href="#The+POI+User+List">user</a> mailing lists, or in
<a href="https://bz.apache.org/bugzilla/describecomponents.cgi?product=POI">bugzilla</a>.
|
||||
</p>
|
||||
</section>
|
||||
</body>
|
||||
<footer>
|
||||
<legal>
|
||||
Copyright (c) @year@ The Apache Software Foundation. All rights reserved.
|
||||
<br />
|
||||
Apache POI, POI, Apache, the Apache feather logo, and the Apache
|
||||
POI project logo are trademarks of The Apache Software Foundation.
|
||||
</legal>
|
||||
</footer>
|
||||
</document>
|
||||
BIN
src/documentation/content/xdocs/images/add.png
Normal file
|
After Width: | Height: | Size: 626 B |
BIN
src/documentation/content/xdocs/images/favicon.ico
Normal file
|
After Width: | Height: | Size: 2.2 KiB |
BIN
src/documentation/content/xdocs/images/fix.png
Normal file
|
After Width: | Height: | Size: 587 B |
BIN
src/documentation/content/xdocs/images/group-logo.png
Normal file
|
After Width: | Height: | Size: 5.9 KiB |
522
src/documentation/content/xdocs/images/group.svg
Normal file
@ -0,0 +1,522 @@
|
||||
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<svg
|
||||
xmlns:dc="http://purl.org/dc/elements/1.1/"
|
||||
xmlns:cc="http://creativecommons.org/ns#"
|
||||
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
|
||||
xmlns:svg="http://www.w3.org/2000/svg"
|
||||
xmlns="http://www.w3.org/2000/svg"
|
||||
xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
|
||||
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
|
||||
version="1.1"
|
||||
id="Layer_2"
|
||||
x="0px"
|
||||
y="0px"
|
||||
viewBox="0 0 7127.6 2890"
|
||||
enable-background="new 0 0 7127.6 2890"
|
||||
xml:space="preserve"
|
||||
sodipodi:docname="asf_logo.svg"
|
||||
inkscape:export-filename="/home/kiwiwings/project/xmlbeans/site/src/documentation/resources/images/asf_logo.png"
|
||||
inkscape:export-xdpi="6.6435986"
|
||||
inkscape:export-ydpi="6.6435986"
|
||||
inkscape:version="0.92.3 (2405546, 2018-03-11)"><metadata
|
||||
id="metadata5077"><rdf:RDF><cc:Work
|
||||
rdf:about=""><dc:format>image/svg+xml</dc:format><dc:type
|
||||
rdf:resource="http://purl.org/dc/dcmitype/StillImage" /></cc:Work></rdf:RDF></metadata><defs
|
||||
id="defs5075" /><sodipodi:namedview
|
||||
pagecolor="#ffffff"
|
||||
bordercolor="#666666"
|
||||
borderopacity="1"
|
||||
objecttolerance="10"
|
||||
gridtolerance="10"
|
||||
guidetolerance="10"
|
||||
inkscape:pageopacity="0"
|
||||
inkscape:pageshadow="2"
|
||||
inkscape:window-width="3770"
|
||||
inkscape:window-height="2096"
|
||||
id="namedview5073"
|
||||
showgrid="false"
|
||||
inkscape:zoom="0.41360345"
|
||||
inkscape:cx="3563.8"
|
||||
inkscape:cy="1445"
|
||||
inkscape:window-x="70"
|
||||
inkscape:window-y="27"
|
||||
inkscape:window-maximized="1"
|
||||
inkscape:current-layer="Layer_2" />
|
||||
<path
|
||||
fill="#6D6E71"
|
||||
d="M7104.7,847.8c15.3,15.3,22.9,33.7,22.9,55.2c0,21.5-7.6,39.9-22.9,55.4c-15.3,15.4-33.8,23.1-55.6,23.1 c-21.8,0-40.2-7.6-55.4-22.9c-15.1-15.3-22.7-33.7-22.7-55.2c0-21.5,7.6-39.9,22.9-55.4c15.3-15.4,33.7-23.1,55.4-23.1 C7070.9,824.9,7089.4,832.5,7104.7,847.8z M7098.1,951.9c13.3-13.6,20-29.8,20-48.7s-6.6-35-19.8-48.5 c-13.2-13.4-29.4-20.1-48.6-20.1c-19.2,0-35.4,6.7-48.7,20.2c-13.3,13.5-19.9,29.7-19.9,48.7c0,19,6.6,35.2,19.7,48.6 c13.1,13.4,29.3,20.1,48.5,20.1S7084.7,965.4,7098.1,951.9z M7087.1,888.1c0,14-6.1,22.8-18.4,26.4l22.5,30.5h-18.2l-20.3-28.3 h-18.6v28.3h-14.7v-84.6h31.8c12.8,0,22,2.2,27.6,6.6C7084.4,871.4,7087.1,878.4,7087.1,888.1z M7068.2,900c3-2.4,4.4-6.5,4.4-12 c0-5.5-1.5-9.4-4.5-11.6c-3-2.2-8.4-3.2-16-3.2h-18v30.5h17.5C7059.7,903.6,7065.3,902.4,7068.2,900z"
|
||||
id="path4880" />
|
||||
<path
|
||||
fill="#6D6E71"
|
||||
d="M1803.6,499.8v155.4h-20V499.8h-56.8v-19.2h133.9v19.2H1803.6z"
|
||||
id="path4882" />
|
||||
<path
|
||||
fill="#6D6E71"
|
||||
d="M2082.2,655.2v-76.9h-105.2v76.9h-20V480.5h20v78.9h105.2v-78.9h20v174.7H2082.2z"
|
||||
id="path4884" />
|
||||
<path
|
||||
fill="#6D6E71"
|
||||
d="M2241.4,499.8v57.4h88.1v19.2h-88.1v59.8h101.8v19h-121.8V480.5H2340v19.2H2241.4z"
|
||||
id="path4886" />
|
||||
<path
|
||||
fill="#D22128"
|
||||
d="M1574.5,1852.4l417.3-997.6h80.1l417.3,997.6h-105.4l-129.3-311.9h-448.2l-127.9,311.9H1574.5z M2032.6,970 l-205.1,493.2h404.7L2032.6,970z"
|
||||
id="path4888" />
|
||||
<path
|
||||
fill="#D22128"
|
||||
d="M2596.9,1852.4V854.8H3010c171.4,0,295.1,158.8,295.1,313.3c0,163-115.2,316.1-286.6,316.1h-324.6v368.1 H2596.9z M2693.9,1397.1h318.9c118,0,193.9-108.2,193.9-229c0-125.1-92.7-226.2-202.3-226.2h-310.5V1397.1z"
|
||||
id="path4890" />
|
||||
<path
|
||||
fill="#D22128"
|
||||
d="M3250.5,1852.4l417.3-997.6h80.1l417.3,997.6h-105.4l-129.3-311.9h-448.2l-127.9,311.9H3250.5z M3708.6,970 l-205.1,493.2h404.7L3708.6,970z"
|
||||
id="path4892" />
|
||||
<path
|
||||
fill="#D22128"
|
||||
d="M4637.3,849.1c177,0,306.3,89.9,368.1,217.8l-78.7,47.8c-63.2-132.1-186.9-177-295.1-177 c-238.9,0-369.5,213.6-369.5,414.5c0,220.6,161.6,420.1,373.7,420.1c112.4,0,244.5-56.2,307.7-185.5l81.5,42.1 c-64.6,148.9-241.7,231.8-394.8,231.8c-274,0-466.5-261.3-466.5-514.2C4163.8,1106.3,4336.6,849.1,4637.3,849.1z"
|
||||
id="path4894" />
|
||||
<path
|
||||
fill="#D22128"
|
||||
d="M5949.1,854.8v997.6h-98.4v-466.5h-591.5v466.5h-96.9V854.8h96.9v444h591.5v-444H5949.1z"
|
||||
id="path4896" />
|
||||
<path
|
||||
fill="#D22128"
|
||||
d="M6844.6,1765.2v87.1h-670.2V854.8H6832v87.1h-560.6v359.7h489v82.9h-489v380.8H6844.6z"
|
||||
id="path4898" />
|
||||
<path
|
||||
fill="#6D6E71"
|
||||
d="M1667.6,2063.6c11.8,3.5,22.2,8.3,31,14.2l-10.3,22.6c-9-6-18.6-10.4-28.9-13.4c-10.2-2.9-20-4.4-29.2-4.4 c-13.6,0-24.5,2.4-32.6,7.3c-8.1,4.9-12.2,11.8-12.2,20.7c0,7.6,2.3,14,6.8,19c4.5,5,10.2,8.9,17,11.7c6.8,2.8,16.1,6,28,9.6 c14.4,4.6,26,8.9,34.7,12.9c8.8,4,16.3,9.9,22.5,17.8c6.2,7.8,9.3,18.2,9.3,31c0,11.7-3.2,21.8-9.5,30.6 c-6.3,8.7-15.3,15.5-26.8,20.3c-11.6,4.8-24.9,7.2-40,7.2c-15.1,0-29.7-2.9-43.9-8.7c-14.2-5.8-26.4-13.6-36.6-23.4l10.7-21.6 c9.6,9.4,20.7,16.7,33.3,21.9c12.6,5.2,24.8,7.8,36.8,7.8c15.3,0,27.3-3,36.1-8.9c8.8-5.9,13.2-13.9,13.2-23.9 c0-7.8-2.3-14.3-6.9-19.4c-4.6-5.1-10.3-9-17.1-11.9c-6.8-2.8-16.1-6-28-9.6c-14.2-4.2-25.7-8.3-34.6-12.2 c-8.9-3.9-16.4-9.7-22.5-17.5c-6.1-7.7-9.2-17.9-9.2-30.6c0-10.9,3-20.4,9-28.6c6-8.2,14.6-14.6,25.6-19.1 c11.1-4.5,23.8-6.8,38.2-6.8C1643.8,2058.3,1655.7,2060.1,1667.6,2063.6z"
|
||||
id="path4900" />
|
||||
<path
|
||||
fill="#6D6E71"
|
||||
d="M1980.1,2072.8c16.8,9.4,30.2,22.3,40,38.4c9.8,16.2,14.8,33.9,14.8,53.3c0,19.5-4.9,37.4-14.8,53.6 c-9.8,16.3-23.2,29.1-40,38.6c-16.8,9.5-35.3,14.3-55.2,14.3c-20.3,0-38.8-4.7-55.7-14.3c-16.8-9.5-30.2-22.4-40-38.6 c-9.8-16.3-14.8-34.1-14.8-53.6c0-19.5,4.9-37.3,14.8-53.5c9.8-16.2,23.2-29,40-38.3c16.8-9.4,35.4-14,55.7-14 C1944.8,2058.6,1963.2,2063.3,1980.1,2072.8z M1881.9,2092.7c-13.1,7.4-23.6,17.5-31.4,30.1c-7.8,12.6-11.8,26.5-11.8,41.7 c0,15.3,3.9,29.3,11.8,42c7.8,12.7,18.3,22.8,31.4,30.2c13.1,7.4,27.4,11.1,42.9,11.1c15.5,0,29.7-3.7,42.7-11.1 c13-7.4,23.3-17.4,31.1-30.2c7.7-12.7,11.6-26.7,11.6-42s-3.9-29.2-11.6-41.8c-7.7-12.6-18.1-22.6-31.1-30 c-13-7.4-27.2-11.2-42.6-11.2C1909.4,2081.5,1895.1,2085.2,1881.9,2092.7z"
|
||||
id="path4902" />
|
||||
<path
|
||||
fill="#6D6E71"
|
||||
d="M2186.5,2082.4v74h98.4v23.2h-98.4v90.2h-24.1v-210.6h133.8v23.2H2186.5z"
|
||||
id="path4904" />
|
||||
<path
|
||||
fill="#6D6E71"
|
||||
d="M2491.6,2082.4v187.4h-24.1v-187.4h-68.4v-23.2h161.4v23.2H2491.6z"
|
||||
id="path4906" />
|
||||
<path
|
||||
fill="#6D6E71"
|
||||
d="M2871.8,2269.8l-56.8-177.4l-57.6,177.4h-24.5l-70.5-210.6h25.9l57.9,182.7l57.1-182.4l24.1-0.3l57.7,182.7 l57.1-182.7h25l-70.6,210.6H2871.8z"
|
||||
id="path4908" />
|
||||
<path
|
||||
fill="#6D6E71"
|
||||
d="M3087.3,2216.6l-23.5,53.2h-25.6l94.4-210.6h25l94.1,210.6h-26.1l-23.5-53.2H3087.3z M3144.5,2086.6 l-46.9,106.8h94.4L3144.5,2086.6z"
|
||||
id="path4910" />
|
||||
<path
|
||||
fill="#6D6E71"
|
||||
d="M3461.1,2202.7c-6,0.4-10.7,0.6-14.1,0.6h-56v66.5H3367v-210.6h80c26.2,0,46.6,6.2,61.2,18.5 c14.5,12.3,21.8,29.8,21.8,52.3c0,17.2-4.1,31.7-12.2,43.3c-8.1,11.6-19.8,20-35,25l49.2,71.5h-27.3L3461.1,2202.7z M3491.3,2167.6 c10.3-8.4,15.5-20.8,15.5-37c0-15.9-5.2-27.9-15.5-36c-10.3-8.1-25.1-12.2-44.3-12.2h-56v97.8h56 C3466.2,2180.2,3481,2176,3491.3,2167.6z"
|
||||
id="path4912" />
|
||||
<path
|
||||
fill="#6D6E71"
|
||||
d="M3688.3,2082.4v69.2h106.2v23.2h-106.2v72.1h122.8v22.9h-146.9v-210.6h142.9v23.2H3688.3z"
|
||||
id="path4914" />
|
||||
<path
|
||||
fill="#6D6E71"
|
||||
d="M4147,2082.4v74h98.4v23.2H4147v90.2h-24.1v-210.6h133.8v23.2H4147z"
|
||||
id="path4916" />
|
||||
<path
|
||||
fill="#6D6E71"
|
||||
d="M4523.3,2072.8c16.8,9.4,30.2,22.3,40,38.4c9.8,16.2,14.8,33.9,14.8,53.3c0,19.5-4.9,37.4-14.8,53.6 c-9.8,16.3-23.2,29.1-40,38.6c-16.8,9.5-35.3,14.3-55.2,14.3c-20.3,0-38.8-4.7-55.7-14.3c-16.8-9.5-30.2-22.4-40-38.6 c-9.8-16.3-14.8-34.1-14.8-53.6c0-19.5,4.9-37.3,14.8-53.5c9.8-16.2,23.2-29,40-38.3c16.8-9.4,35.4-14,55.7-14 C4488.1,2058.6,4506.5,2063.3,4523.3,2072.8z M4425.2,2092.7c-13.1,7.4-23.6,17.5-31.4,30.1c-7.8,12.6-11.8,26.5-11.8,41.7 c0,15.3,3.9,29.3,11.8,42c7.8,12.7,18.3,22.8,31.4,30.2c13.1,7.4,27.4,11.1,42.9,11.1c15.5,0,29.7-3.7,42.7-11.1 c13-7.4,23.3-17.4,31.1-30.2c7.7-12.7,11.6-26.7,11.6-42s-3.9-29.2-11.6-41.8c-7.7-12.6-18.1-22.6-31.1-30 c-13-7.4-27.2-11.2-42.6-11.2C4452.6,2081.5,4438.3,2085.2,4425.2,2092.7z"
|
||||
id="path4918" />
|
||||
<path
|
||||
fill="#6D6E71"
|
||||
d="M4854.7,2247.7c-15.7,15.5-37.3,23.3-64.8,23.3c-27.7,0-49.4-7.8-65.1-23.3c-15.7-15.5-23.6-37-23.6-64.6 v-124h24.1v124c0,20.3,5.8,36.1,17.3,47.5c11.6,11.4,27.3,17.1,47.3,17.1c20.1,0,35.8-5.7,47.1-17c11.4-11.3,17-27.2,17-47.7v-124 h24.1v124C4878.2,2210.7,4870.4,2232.2,4854.7,2247.7z"
|
||||
id="path4920" />
|
||||
<path
|
||||
fill="#6D6E71"
|
||||
d="M5169.5,2269.8l-126.3-169.1v169.1h-24.1v-210.6h25l126.3,169.3v-169.3h23.8v210.6H5169.5z"
|
||||
id="path4922" />
|
||||
<path
|
||||
fill="#6D6E71"
|
||||
d="M5478.4,2073.1c16.4,9.3,29.4,21.9,38.9,37.9c9.6,16,14.3,33.9,14.3,53.5s-4.8,37.6-14.3,53.6 c-9.5,16.1-22.6,28.7-39.3,37.9c-16.6,9.2-35.2,13.8-55.5,13.8h-84.3v-210.6h85.2C5443.7,2059.2,5462,2063.8,5478.4,2073.1z M5362.3,2246.9h61.4c15.5,0,29.6-3.5,42.3-10.6c12.7-7.1,22.8-16.9,30.2-29.5c7.4-12.5,11.1-26.5,11.1-42 c0-15.5-3.8-29.4-11.3-41.9c-7.5-12.5-17.7-22.3-30.6-29.6c-12.8-7.2-27-10.9-42.6-10.9h-60.5V2246.9z"
|
||||
id="path4924" />
|
||||
<path
|
||||
fill="#6D6E71"
|
||||
d="M5668.6,2216.6l-23.5,53.2h-25.6l94.4-210.6h25l94.1,210.6H5807l-23.5-53.2H5668.6z M5725.8,2086.6 l-46.9,106.8h94.4L5725.8,2086.6z"
|
||||
id="path4926" />
|
||||
<path
|
||||
fill="#6D6E71"
|
||||
d="M5991,2082.4v187.4H5967v-187.4h-68.4v-23.2h161.4v23.2H5991z"
|
||||
id="path4928" />
|
||||
<path
|
||||
fill="#6D6E71"
|
||||
d="M6175.9,2269.8v-210.6h24.1v210.6H6175.9z"
|
||||
id="path4930" />
|
||||
<path
|
||||
fill="#6D6E71"
|
||||
d="M6493.7,2072.8c16.8,9.4,30.2,22.3,40,38.4c9.8,16.2,14.8,33.9,14.8,53.3c0,19.5-4.9,37.4-14.8,53.6 c-9.8,16.3-23.2,29.1-40,38.6c-16.8,9.5-35.3,14.3-55.2,14.3c-20.3,0-38.8-4.7-55.7-14.3c-16.8-9.5-30.2-22.4-40-38.6 c-9.8-16.3-14.8-34.1-14.8-53.6c0-19.5,4.9-37.3,14.8-53.5c9.8-16.2,23.2-29,40-38.3c16.8-9.4,35.4-14,55.7-14 C6458.5,2058.6,6476.9,2063.3,6493.7,2072.8z M6395.6,2092.7c-13.1,7.4-23.6,17.5-31.4,30.1c-7.8,12.6-11.8,26.5-11.8,41.7 c0,15.3,3.9,29.3,11.8,42c7.8,12.7,18.3,22.8,31.4,30.2c13.1,7.4,27.4,11.1,42.9,11.1c15.5,0,29.7-3.7,42.7-11.1 c13-7.4,23.3-17.4,31.1-30.2c7.7-12.7,11.6-26.7,11.6-42s-3.9-29.2-11.6-41.8c-7.7-12.6-18.1-22.6-31.1-30 c-13-7.4-27.2-11.2-42.6-11.2C6423,2081.5,6408.8,2085.2,6395.6,2092.7z"
|
||||
id="path4932" />
|
||||
<path
|
||||
fill="#6D6E71"
|
||||
d="M6826.5,2269.8l-126.3-169.1v169.1h-24.1v-210.6h25l126.3,169.3v-169.3h23.8v210.6H6826.5z"
|
||||
id="path4934" />
|
||||
<linearGradient
|
||||
id="SVGID_1_"
|
||||
gradientUnits="userSpaceOnUse"
|
||||
x1="-4516.6152"
|
||||
y1="-2338.7222"
|
||||
x2="-4108.4111"
|
||||
y2="-1861.3982"
|
||||
gradientTransform="matrix(0.4226 -0.9063 0.9063 0.4226 5117.8774 -2859.9343)">
|
||||
<stop
|
||||
offset="0"
|
||||
style="stop-color:#F69923"
|
||||
id="stop4936" />
|
||||
<stop
|
||||
offset="0.3123"
|
||||
style="stop-color:#F79A23"
|
||||
id="stop4938" />
|
||||
<stop
|
||||
offset="0.8383"
|
||||
style="stop-color:#E97826"
|
||||
id="stop4940" />
|
||||
</linearGradient>
|
||||
<path
|
||||
fill="url(#SVGID_1_)"
|
||||
d="M1230.1,13.7c-45.3,26.8-120.6,102.5-210.5,212.3l82.6,155.9c58-82.9,116.9-157.5,176.3-221.2 c4.6-5.1,7-7.5,7-7.5c-2.3,2.5-4.6,5-7,7.5c-19.2,21.2-77.5,89.2-165.5,224.4c84.7-4.2,214.9-21.6,321.1-39.7 c31.6-177-31-258-31-258S1323.4-41.4,1230.1,13.7z"
|
||||
id="path4943" />
|
||||
<path
|
||||
fill="none"
|
||||
d="M1090.2,903.1c0.6-0.1,1.2-0.2,1.8-0.3l-11.9,1.3c-0.7,0.3-1.4,0.7-2.1,1 C1082.1,904.4,1086.2,903.7,1090.2,903.1z"
|
||||
id="path4945" />
|
||||
<path
|
||||
fill="none"
|
||||
d="M1005.9,1182.3c-6.7,1.5-13.7,2.7-20.7,3.7C992.3,1185,999.2,1183.8,1005.9,1182.3z"
|
||||
id="path4947" />
|
||||
<path
|
||||
fill="none"
|
||||
d="M432.9,1808.8c0.9-2.3,1.8-4.7,2.6-7c18.2-48,36.2-94.7,54-140.1c20-51,39.8-100.4,59.3-148.3 c20.6-50.4,40.9-99.2,60.9-146.3c21-49.4,41.7-97,62-142.8c16.5-37.3,32.8-73.4,48.9-108.3c5.4-11.7,10.7-23.2,16-34.6 c10.5-22.7,21-44.8,31.3-66.5c9.5-20,19-39.6,28.3-58.8c3.1-6.4,6.2-12.8,9.3-19.1c0.5-1,1-2,1.5-3.1l-10.2,1.1l-8-15.9 c-0.8,1.6-1.6,3.1-2.4,4.6c-14.5,28.8-28.9,57.9-43.1,87.2c-8.2,16.9-16.4,34-24.6,51c-22.6,47.4-44.8,95.2-66.6,143.3 c-22.1,48.6-43.7,97.5-64.9,146.5c-20.8,48.1-41.3,96.2-61.2,144.2c-20,48-39.5,95.7-58.5,143.2c-19.9,49.5-39.2,98.7-58,147.2 c-4.2,10.9-8.5,21.9-12.7,32.8c-15,39.2-29.7,77.8-44,116l12.7,25.1l11.4-1.2c0.4-1.1,0.8-2.3,1.3-3.4 C396.7,1905.4,414.9,1856.4,432.9,1808.8z"
|
||||
id="path4949" />
|
||||
<path
|
||||
fill="none"
|
||||
d="M980,1186.8L980,1186.8c0.1,0,0.1,0,0.1-0.1C980.1,1186.8,980.1,1186.8,980,1186.8z"
|
||||
id="path4951" />
|
||||
<path
|
||||
fill="#BE202E"
|
||||
d="M952.6,1323c-10.6,1.9-21.4,3.8-32.5,5.7c-0.1,0-0.1,0.1-0.2,0.1c5.6-0.8,11.2-1.7,16.6-2.6 C942,1325.2,947.3,1324.1,952.6,1323z"
|
||||
id="path4953" />
|
||||
<path
|
||||
opacity="0.35"
|
||||
fill="#BE202E"
|
||||
d="M952.6,1323c-10.6,1.9-21.4,3.8-32.5,5.7c-0.1,0-0.1,0.1-0.2,0.1c5.6-0.8,11.2-1.7,16.6-2.6 C942,1325.2,947.3,1324.1,952.6,1323z"
|
||||
id="path4955" />
|
||||
<path
|
||||
fill="#BE202E"
|
||||
d="M980.3,1186.7C980.2,1186.7,980.2,1186.7,980.3,1186.7c-0.1,0.1-0.2,0.1-0.2,0.1c1.8-0.2,3.5-0.5,5.2-0.8 c7-1,13.9-2.2,20.7-3.7C997.5,1183.8,989,1185.2,980.3,1186.7L980.3,1186.7L980.3,1186.7z"
|
||||
id="path4957" />
|
||||
<path
|
||||
opacity="0.35"
|
||||
fill="#BE202E"
|
||||
d="M980.3,1186.7C980.2,1186.7,980.2,1186.7,980.3,1186.7c-0.1,0.1-0.2,0.1-0.2,0.1 c1.8-0.2,3.5-0.5,5.2-0.8c7-1,13.9-2.2,20.7-3.7C997.5,1183.8,989,1185.2,980.3,1186.7L980.3,1186.7L980.3,1186.7z"
|
||||
id="path4959" />
|
||||
<linearGradient
|
||||
id="SVGID_2_"
|
||||
gradientUnits="userSpaceOnUse"
|
||||
x1="-7537.7339"
|
||||
y1="-2391.4075"
|
||||
x2="-4625.4141"
|
||||
y2="-2391.4075"
|
||||
gradientTransform="matrix(0.4226 -0.9063 0.9063 0.4226 5117.8774 -2859.9343)">
|
||||
<stop
|
||||
offset="0.3233"
|
||||
style="stop-color:#9E2064"
|
||||
id="stop4961" />
|
||||
<stop
|
||||
offset="0.6302"
|
||||
style="stop-color:#C92037"
|
||||
id="stop4963" />
|
||||
<stop
|
||||
offset="0.7514"
|
||||
style="stop-color:#CD2335"
|
||||
id="stop4965" />
|
||||
<stop
|
||||
offset="1"
|
||||
style="stop-color:#E97826"
|
||||
id="stop4967" />
|
||||
</linearGradient>
|
||||
<path
|
||||
fill="url(#SVGID_2_)"
|
||||
d="M858.6,784.7c25.1-46.9,50.5-92.8,76.2-137.4c26.7-46.4,53.7-91.3,80.9-134.7 c1.6-2.6,3.2-5.2,4.8-7.7c27-42.7,54.2-83.7,81.6-122.9L1019.5,226c-6.2,7.6-12.5,15.3-18.8,23.2c-23.8,29.7-48.6,61.6-73.9,95.5 c-28.6,38.2-58,78.9-87.8,121.7c-27.6,39.5-55.5,80.9-83.5,123.7c-23.8,36.5-47.7,74-71.4,112.5c-0.9,1.4-1.8,2.9-2.6,4.3 l107.5,212.3C811.8,873.6,835.1,828.7,858.6,784.7z"
|
||||
id="path4970" />
|
||||
<linearGradient
|
||||
id="SVGID_3_"
|
||||
gradientUnits="userSpaceOnUse"
|
||||
x1="-7186.1777"
|
||||
y1="-2099.3059"
|
||||
x2="-5450.7183"
|
||||
y2="-2099.3059"
|
||||
gradientTransform="matrix(0.4226 -0.9063 0.9063 0.4226 5117.8774 -2859.9343)">
|
||||
<stop
|
||||
offset="0"
|
||||
style="stop-color:#282662"
|
||||
id="stop4972" />
|
||||
<stop
|
||||
offset="9.548390e-02"
|
||||
style="stop-color:#662E8D"
|
||||
id="stop4974" />
|
||||
<stop
|
||||
offset="0.7882"
|
||||
style="stop-color:#9F2064"
|
||||
id="stop4976" />
|
||||
<stop
|
||||
offset="0.9487"
|
||||
style="stop-color:#CD2032"
|
||||
id="stop4978" />
|
||||
</linearGradient>
|
||||
<path
|
||||
fill="url(#SVGID_3_)"
|
||||
d="M369,1981c-14.2,39.1-28.5,78.9-42.9,119.6c-0.2,0.6-0.4,1.2-0.6,1.8c-2,5.7-4.1,11.5-6.1,17.2 c-9.7,27.4-18,52.1-37.3,108.2c31.7,14.5,57.1,52.5,81.1,95.6c-2.6-44.7-21-86.6-56.2-119.1c156.1,7,290.6-32.4,360.1-146.6 c6.2-10.2,11.9-20.9,17-32.2c-31.6,40.1-70.8,57.1-144.5,53c-0.2,0.1-0.3,0.1-0.5,0.2c0.2-0.1,0.3-0.1,0.5-0.2 c108.6-48.6,163.1-95.3,211.2-172.6c11.4-18.3,22.5-38.4,33.8-60.6c-94.9,97.5-205,125.3-320.9,104.2l-86.9,9.5 C374.4,1966.3,371.7,1973.6,369,1981z"
|
||||
id="path4981" />
|
||||
<linearGradient
|
||||
id="SVGID_4_"
|
||||
gradientUnits="userSpaceOnUse"
|
||||
x1="-7374.1626"
|
||||
y1="-2418.5454"
|
||||
x2="-4461.8428"
|
||||
y2="-2418.5454"
|
||||
gradientTransform="matrix(0.4226 -0.9063 0.9063 0.4226 5117.8774 -2859.9343)">
|
||||
<stop
|
||||
offset="0.3233"
|
||||
style="stop-color:#9E2064"
|
||||
id="stop4983" />
|
||||
<stop
|
||||
offset="0.6302"
|
||||
style="stop-color:#C92037"
|
||||
id="stop4985" />
|
||||
<stop
|
||||
offset="0.7514"
|
||||
style="stop-color:#CD2335"
|
||||
id="stop4987" />
|
||||
<stop
|
||||
offset="1"
|
||||
style="stop-color:#E97826"
|
||||
id="stop4989" />
|
||||
</linearGradient>
|
||||
<path
|
||||
fill="url(#SVGID_4_)"
|
||||
d="M409.6,1786.3c18.8-48.5,38.1-97.7,58-147.2c19-47.4,38.5-95.2,58.5-143.2 c20-48,40.4-96.1,61.2-144.2c21.2-49,42.9-97.8,64.9-146.5c21.8-48.1,44-95.9,66.6-143.3c8.1-17.1,16.3-34.1,24.6-51 c14.2-29.3,28.6-58.4,43.1-87.2c0.8-1.6,1.6-3.1,2.4-4.6L681.4,706.8c-1.8,2.9-3.5,5.8-5.3,8.6c-25.1,40.9-50,82.7-74.4,125.4 c-24.7,43.1-49,87.1-72.7,131.7c-20,37.6-39.6,75.6-58.6,113.9c-3.8,7.8-7.6,15.5-11.3,23.2c-23.4,48.2-44.6,94.8-63.7,139.5 c-21.7,50.7-40.7,99.2-57.5,145.1c-11,30.2-21,59.4-30.1,87.4c-7.5,24-14.7,47.9-21.5,71.8c-16,56.3-29.9,112.4-41.2,168.3 L353,1935.1c14.3-38.1,28.9-76.8,44-116C401.1,1808.2,405.4,1797.3,409.6,1786.3z"
|
||||
id="path4992" />
|
||||
<linearGradient
|
||||
id="SVGID_5_"
|
||||
gradientUnits="userSpaceOnUse"
|
||||
x1="-7161.7642"
|
||||
y1="-2379.1431"
|
||||
x2="-5631.2524"
|
||||
y2="-2379.1431"
|
||||
gradientTransform="matrix(0.4226 -0.9063 0.9063 0.4226 5117.8774 -2859.9343)">
|
||||
<stop
|
||||
offset="0"
|
||||
style="stop-color:#282662"
|
||||
id="stop4994" />
|
||||
<stop
|
||||
offset="9.548390e-02"
|
||||
style="stop-color:#662E8D"
|
||||
id="stop4996" />
|
||||
<stop
|
||||
offset="0.7882"
|
||||
style="stop-color:#9F2064"
|
||||
id="stop4998" />
|
||||
<stop
|
||||
offset="0.9487"
|
||||
style="stop-color:#CD2032"
|
||||
id="stop5000" />
|
||||
</linearGradient>
|
||||
<path
|
||||
fill="url(#SVGID_5_)"
|
||||
d="M243.5,1729.4c-13.6,68.2-23.2,136.2-28,203.8c-0.2,2.4-0.4,4.7-0.5,7.1 c-33.7-54-124-106.8-123.8-106.2c64.6,93.7,113.7,186.7,120.9,278c-34.6,7.1-82-3.2-136.8-23.3c57.1,52.5,100,67,116.7,70.9 c-52.5,3.3-107.1,39.3-162.1,80.8c80.5-32.8,145.5-45.8,192.1-35.3C148.1,2414.2,74.1,2645,0,2890c22.7-6.7,36.2-21.9,43.9-42.6 c13.2-44.4,100.8-335.6,238-718.2c3.9-10.9,7.8-21.8,11.8-32.9c1.1-3,2.2-6.1,3.3-9.2c14.5-40.1,29.5-81.1,45.1-122.9 c3.5-9.5,7.1-19,10.7-28.6c0.1-0.2,0.1-0.4,0.2-0.6l-107.9-213.2C244.6,1724.4,244,1726.9,243.5,1729.4z"
|
||||
id="path5003" />
|
||||
<linearGradient
|
||||
id="SVGID_6_"
|
||||
gradientUnits="userSpaceOnUse"
|
||||
x1="-7374.1626"
|
||||
y1="-2117.1309"
|
||||
x2="-4461.8428"
|
||||
y2="-2117.1309"
|
||||
gradientTransform="matrix(0.4226 -0.9063 0.9063 0.4226 5117.8774 -2859.9343)">
|
||||
<stop
|
||||
offset="0.3233"
|
||||
style="stop-color:#9E2064"
|
||||
id="stop5005" />
|
||||
<stop
|
||||
offset="0.6302"
|
||||
style="stop-color:#C92037"
|
||||
id="stop5007" />
|
||||
<stop
|
||||
offset="0.7514"
|
||||
style="stop-color:#CD2335"
|
||||
id="stop5009" />
|
||||
<stop
|
||||
offset="1"
|
||||
style="stop-color:#E97826"
|
||||
id="stop5011" />
|
||||
</linearGradient>
|
||||
<path
|
||||
fill="url(#SVGID_6_)"
|
||||
d="M805.6,937c-3.1,6.3-6.2,12.7-9.3,19.1c-9.3,19.2-18.8,38.8-28.3,58.8 c-10.3,21.7-20.7,43.9-31.3,66.5c-5.3,11.4-10.6,22.9-16,34.6c-16.1,35-32.4,71.1-48.9,108.3c-20.3,45.8-41,93.4-62,142.8 c-20,47.1-40.3,95.9-60.9,146.3c-19.5,47.9-39.3,97.3-59.3,148.3c-17.8,45.4-35.9,92.1-54,140.1c-0.9,2.3-1.8,4.7-2.6,7 c-18,47.6-36.2,96.6-54.6,146.8c-0.4,1.1-0.8,2.3-1.3,3.4l86.9-9.5c-1.7-0.3-3.5-0.5-5.2-0.9c103.9-13,242.1-90.6,331.4-186.5 c41.1-44.2,78.5-96.3,113-157.3c25.7-45.4,49.8-95.8,72.8-151.5c20.1-48.7,39.4-101.4,58-158.6c-23.9,12.6-51.2,21.8-81.4,28.2 c-5.3,1.1-10.7,2.2-16.1,3.1c-5.5,1-11,1.8-16.6,2.6l0,0l0,0c0.1,0,0.1-0.1,0.2-0.1c96.9-37.3,158-109.2,202.4-197.4 c-25.5,17.4-66.9,40.1-116.6,51.1c-6.7,1.5-13.7,2.7-20.7,3.7c-1.7,0.3-3.5,0.6-5.2,0.8l0,0l0,0c0.1,0,0.1,0,0.1-0.1 c0,0,0.1,0,0.1,0l0,0c33.6-14.1,62-29.8,86.6-48.4c5.3-4,10.4-8.1,15.3-12.3c7.5-6.5,14.7-13.3,21.5-20.5c4.4-4.6,8.6-9.3,12.7-14.2 c9.6-11.5,18.7-23.9,27.1-37.3c2.6-4.1,5.1-8.3,7.6-12.6c3.2-6.2,6.3-12.3,9.3-18.3c13.5-27.2,24.4-51.5,33-72.8 c4.3-10.6,8.1-20.5,11.3-29.7c1.3-3.7,2.5-7.2,3.7-10.6c3.4-10.2,6.2-19.3,8.4-27.3c3.3-12,5.3-21.5,6.4-28.4l0,0l0,0 c-3.3,2.6-7.1,5.2-11.3,7.7c-29.3,17.5-79.5,33.4-119.9,40.8l79.8-8.8l-79.8,8.8c-0.6,0.1-1.2,0.2-1.8,0.3c-4,0.7-8.1,1.3-12.2,2 c0.7-0.3,1.4-0.7,2.1-1l-273,29.9C806.6,935,806.1,936,805.6,937z"
|
||||
id="path5014" />
|
||||
<linearGradient
|
||||
id="SVGID_7_"
|
||||
gradientUnits="userSpaceOnUse"
|
||||
x1="-7554.8232"
|
||||
y1="-2132.0981"
|
||||
x2="-4642.5034"
|
||||
y2="-2132.0981"
|
||||
gradientTransform="matrix(0.4226 -0.9063 0.9063 0.4226 5117.8774 -2859.9343)">
|
||||
<stop
|
||||
offset="0.3233"
|
||||
style="stop-color:#9E2064"
|
||||
id="stop5016" />
|
||||
<stop
|
||||
offset="0.6302"
|
||||
style="stop-color:#C92037"
|
||||
id="stop5018" />
|
||||
<stop
|
||||
offset="0.7514"
|
||||
style="stop-color:#CD2335"
|
||||
id="stop5020" />
|
||||
<stop
|
||||
offset="1"
|
||||
style="stop-color:#E97826"
|
||||
id="stop5022" />
|
||||
</linearGradient>
|
||||
<path
|
||||
fill="url(#SVGID_7_)"
|
||||
d="M1112.9,385.1c-24.3,37.3-50.8,79.6-79.4,127.5c-1.5,2.5-3,5.1-4.5,7.6 c-24.6,41.5-50.8,87.1-78.3,137c-23.8,43.1-48.5,89.3-74.3,139c-22.4,43.3-45.6,89.2-69.4,137.8l273-29.9 c79.5-36.6,115.1-69.7,149.6-117.6c9.2-13.2,18.4-27,27.5-41.3c28-43.8,55.6-92,80.1-139.9c23.7-46.3,44.7-92.2,60.7-133.5 c10.2-26.3,18.4-50.8,24.1-72.3c5-19,8.9-36.9,11.9-54.1C1327.9,363.5,1197.6,380.9,1112.9,385.1z"
|
||||
id="path5025" />
|
||||
<path
|
||||
fill="#BE202E"
|
||||
d="M936.5,1326.1c-5.5,1-11,1.8-16.6,2.6l0,0C925.5,1328,931,1327.1,936.5,1326.1z"
|
||||
id="path5027" />
|
||||
<path
|
||||
opacity="0.35"
|
||||
fill="#BE202E"
|
||||
d="M936.5,1326.1c-5.5,1-11,1.8-16.6,2.6l0,0C925.5,1328,931,1327.1,936.5,1326.1z"
|
||||
id="path5029" />
|
||||
<linearGradient
|
||||
id="SVGID_8_"
|
||||
gradientUnits="userSpaceOnUse"
|
||||
x1="-7374.1626"
|
||||
y1="-2027.484"
|
||||
x2="-4461.8433"
|
||||
y2="-2027.484"
|
||||
gradientTransform="matrix(0.4226 -0.9063 0.9063 0.4226 5117.8774 -2859.9343)">
|
||||
<stop
|
||||
offset="0.3233"
|
||||
style="stop-color:#9E2064"
|
||||
id="stop5031" />
|
||||
<stop
|
||||
offset="0.6302"
|
||||
style="stop-color:#C92037"
|
||||
id="stop5033" />
|
||||
<stop
|
||||
offset="0.7514"
|
||||
style="stop-color:#CD2335"
|
||||
id="stop5035" />
|
||||
<stop
|
||||
offset="1"
|
||||
style="stop-color:#E97826"
|
||||
id="stop5037" />
|
||||
</linearGradient>
|
||||
<path
|
||||
fill="url(#SVGID_8_)"
|
||||
d="M936.5,1326.1c-5.5,1-11,1.8-16.6,2.6l0,0C925.5,1328,931,1327.1,936.5,1326.1z"
|
||||
id="path5040" />
|
||||
<path
|
||||
fill="#BE202E"
|
||||
d="M980,1186.8c1.8-0.2,3.5-0.5,5.2-0.8C983.5,1186.3,981.8,1186.6,980,1186.8L980,1186.8z"
|
||||
id="path5042" />
|
||||
<path
|
||||
opacity="0.35"
|
||||
fill="#BE202E"
|
||||
d="M980,1186.8c1.8-0.2,3.5-0.5,5.2-0.8C983.5,1186.3,981.8,1186.6,980,1186.8L980,1186.8z"
|
||||
id="path5044" />
|
||||
<linearGradient
|
||||
id="SVGID_9_"
|
||||
gradientUnits="userSpaceOnUse"
|
||||
x1="-7374.1626"
|
||||
y1="-2037.7417"
|
||||
x2="-4461.8433"
|
||||
y2="-2037.7417"
|
||||
gradientTransform="matrix(0.4226 -0.9063 0.9063 0.4226 5117.8774 -2859.9343)">
|
||||
<stop
|
||||
offset="0.3233"
|
||||
style="stop-color:#9E2064"
|
||||
id="stop5046" />
|
||||
<stop
|
||||
offset="0.6302"
|
||||
style="stop-color:#C92037"
|
||||
id="stop5048" />
|
||||
<stop
|
||||
offset="0.7514"
|
||||
style="stop-color:#CD2335"
|
||||
id="stop5050" />
|
||||
<stop
|
||||
offset="1"
|
||||
style="stop-color:#E97826"
|
||||
id="stop5052" />
|
||||
</linearGradient>
|
||||
<path
|
||||
fill="url(#SVGID_9_)"
|
||||
d="M980,1186.8c1.8-0.2,3.5-0.5,5.2-0.8C983.5,1186.3,981.8,1186.6,980,1186.8L980,1186.8z"
|
||||
id="path5055" />
|
||||
<path
|
||||
fill="#BE202E"
|
||||
d="M980.2,1186.7C980.2,1186.7,980.2,1186.7,980.2,1186.7L980.2,1186.7L980.2,1186.7L980.2,1186.7 C980.2,1186.7,980.2,1186.7,980.2,1186.7z"
|
||||
id="path5057" />
|
||||
<path
|
||||
opacity="0.35"
|
||||
fill="#BE202E"
|
||||
d="M980.2,1186.7C980.2,1186.7,980.2,1186.7,980.2,1186.7L980.2,1186.7L980.2,1186.7 L980.2,1186.7C980.2,1186.7,980.2,1186.7,980.2,1186.7z"
|
||||
id="path5059" />
|
||||
<linearGradient
|
||||
id="SVGID_10_"
|
||||
gradientUnits="userSpaceOnUse"
|
||||
x1="-5738.0635"
|
||||
y1="-2039.799"
|
||||
x2="-5094.3457"
|
||||
y2="-2039.799"
|
||||
gradientTransform="matrix(0.4226 -0.9063 0.9063 0.4226 5117.8774 -2859.9343)">
|
||||
<stop
|
||||
offset="0.3233"
|
||||
style="stop-color:#9E2064"
|
||||
id="stop5061" />
|
||||
<stop
|
||||
offset="0.6302"
|
||||
style="stop-color:#C92037"
|
||||
id="stop5063" />
|
||||
<stop
|
||||
offset="0.7514"
|
||||
style="stop-color:#CD2335"
|
||||
id="stop5065" />
|
||||
<stop
|
||||
offset="1"
|
||||
style="stop-color:#E97826"
|
||||
id="stop5067" />
|
||||
</linearGradient>
|
||||
<path
|
||||
fill="url(#SVGID_10_)"
|
||||
d="M980.2,1186.7C980.2,1186.7,980.2,1186.7,980.2,1186.7L980.2,1186.7L980.2,1186.7L980.2,1186.7 C980.2,1186.7,980.2,1186.7,980.2,1186.7z"
|
||||
id="path5070" />
|
||||
</svg>
|
||||
|
After Width: | Height: | Size: 24 KiB |
BIN
src/documentation/content/xdocs/images/icon.png
Normal file
|
After Width: | Height: | Size: 696 B |
BIN
src/documentation/content/xdocs/images/poweredby-poi-logo.png
Normal file
|
After Width: | Height: | Size: 7.0 KiB |
1025
src/documentation/content/xdocs/images/poweredby-poi.svg
Normal file
|
After Width: | Height: | Size: 68 KiB |
BIN
src/documentation/content/xdocs/images/project-header.png
Normal file
|
After Width: | Height: | Size: 9.4 KiB |
BIN
src/documentation/content/xdocs/images/remove.png
Normal file
|
After Width: | Height: | Size: 539 B |
BIN
src/documentation/content/xdocs/images/update.png
Normal file
|
After Width: | Height: | Size: 665 B |
BIN
src/documentation/content/xdocs/images/usemap.gif
Normal file
|
After Width: | Height: | Size: 1.6 KiB |
259
src/documentation/content/xdocs/index.xml
Normal file
@ -0,0 +1,259 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Apache POI™ - the Java API for Microsoft Documents</title>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section><title>Project News</title>
|
||||
|
||||
<section><title>8 April 2025 - CVE-2025-31672 - Improper Input Validation vulnerability in Apache POI before 5.4.0</title>
|
||||
<p>
|
||||
When parsing OOXML format files such as xlsx, docx and pptx, a specially crafted file could
|
||||
be used to provide multiple entries with the same name in the zip-compressed file-format.
|
||||
<br/>
|
||||
Products reading the affected file could see different data, because each product may select
|
||||
a different one of the zip entries that share the duplicate name.<br/><br/>
|
||||
This issue affects Apache POI component poi-ooxml before 5.4.0. Starting with 5.4.0 poi-ooxml performs
|
||||
a check that throws an exception if zip entries with duplicate file names are found in the input file.<br/><br/>
|
||||
Users are advised to upgrade to poi-ooxml 5.4.0 or later, which fixes the issue.
|
||||
Please refer to our <a href="https://poi.apache.org/security.html">security guidelines</a>
|
||||
for recommendations about how to use the POI libraries securely.
|
||||
</p>
|
||||
<p>
|
||||
References:
|
||||
</p>
|
||||
<ul>
|
||||
<li><a href="https://bz.apache.org/bugzilla/show_bug.cgi?id=69620">Bug 69620</a></li>
|
||||
<li><a href="https://www.cve.org/CVERecord?id=CVE-2025-31672">CVE-2025-31672</a></li>
|
||||
</ul>
|
||||
</section>
|
||||
|
||||
<!-- latest final release -->
|
||||
<section><title>6 April 2025 - POI 5.4.1 available</title>
|
||||
<p>The Apache POI team is pleased to announce the release of 5.4.1.
|
||||
Several dependencies were updated to their latest versions to pick up security fixes and other improvements.</p>
|
||||
<p>A summary of changes is available in the
|
||||
<a href="https://www.apache.org/dyn/closer.lua/poi/release/RELEASE-NOTES.txt">Release Notes</a>.
|
||||
A full list of changes is available in the <a href="changes.html#5.4.1">change log</a>.
|
||||
People interested should also follow the <a href="site:mailinglists">dev list</a> to track progress.</p>
|
||||
<p>See the <a href="download.html#POI-5.4.1">downloads</a> page for more details.</p>
|
||||
<p>POI requires Java 8 or newer since version 4.0.1.</p>
|
||||
</section>
|
||||
|
||||
<section><title>11 November 2024 - Avoid log4j-api 2.24.1</title>
|
||||
<p>While testing a potential Apache POI 5.4.0 release, we discovered a serious bug in
|
||||
log4j-api 2.24.1. This leads to NullPointerExceptions when you use a version of log4j-core that is not of
|
||||
the exact same version (2.24.1). We recommend that users avoid log4j 2.24.1 and use the latest
|
||||
2.24.x version, in which this issue is fixed.</p>
|
||||
<p>XMLBeans release 5.2.2 had the problematic log4j-api 2.24.1 dependency and thus
|
||||
can lead to such issues if used in some other context. In the meantime a version 5.3.0
|
||||
of XMLBeans was released, which avoids this issue.</p>
|
||||
<p>Please direct any queries to the Log4j Team. The main issue is
|
||||
<a href="https://github.com/apache/logging-log4j2/issues/3143">Issue 3143</a>.</p>
|
||||
</section>
|
||||
|
||||
<section><title>4 March 2022 - CVE-2022-26336 - A carefully crafted TNEF file can cause an out of memory exception in Apache POI poi-scratchpad versions prior to 5.2.1</title>
|
||||
<p>Description:<br/>
|
||||
A shortcoming in the HMEF package of poi-scratchpad (Apache POI) allows an attacker to cause an Out of Memory exception.
|
||||
This package is used to read TNEF files (Microsoft Outlook and Microsoft Exchange Server).
|
||||
If an application uses poi-scratchpad to parse TNEF files and the application allows untrusted users to supply them, then a carefully crafted file can cause an Out of Memory exception.</p>
|
||||
|
||||
<p>Mitigation:<br/>
|
||||
Affected users are advised to update to poi-scratchpad 5.2.1 or above
|
||||
which fixes this vulnerability. It is recommended that you use the same versions of all POI jars.</p>
|
||||
|
||||
</section>
|
||||
|
||||
<section><title>10+16+18 December 2021 - Log4j vulnerabilities CVE-2021-44228, CVE-2021-45046 and CVE-2021-45105</title>
|
||||
<p>The Apache POI PMC has evaluated the security vulnerabilities reported
|
||||
for Apache Log4j.</p>
|
||||
<p>POI 5.1.0 and XMLBeans 5.0.2 only have dependencies on log4j-api 2.14.1.
|
||||
The security vulnerabilities are not in log4j-api - they are in log4j-core.</p>
|
||||
<p>If any POI or XMLBeans user uses log4j-core to control the logging of their application,
|
||||
we strongly recommend that they upgrade all their log4j dependencies to the latest
|
||||
version (currently v2.20.0) - including log4j-api.</p>
|
||||
</section>
|
||||
|
||||
<section><title>13 January 2021 - CVE-2021-23926 - XML External Entity (XXE) Processing in Apache XMLBeans versions prior to 3.0.0</title>
|
||||
<p>Description:<br/>
|
||||
When parsing XML files using XMLBeans 2.6.0 or below, the underlying parser
|
||||
created by XMLBeans could be susceptible to XML External Entity (XXE) attacks.</p>
|
||||
|
||||
<p>This issue was fixed a few years ago but on review, we decided we should have a CVE
|
||||
to raise awareness of the issue.</p>
|
||||
|
||||
<p>Mitigation:<br/>
|
||||
Affected users are advised to update to Apache XMLBeans 3.0.0 or above
|
||||
which fixes this vulnerability. XMLBeans 4.0.0 or above is preferable.</p>
|
||||
|
||||
<p>References:
|
||||
<a href="https://en.wikipedia.org/wiki/XML_external_entity_attack">XML external entity attack</a>
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>20 October 2019 - CVE-2019-12415 - XML External Entity (XXE) Processing in Apache POI versions prior to 4.1.1</title>
|
||||
<p>Description:<br/>
|
||||
When using the tool XSSFExportToXml to convert user-provided Microsoft
|
||||
Excel documents, a specially crafted document can allow an attacker to
|
||||
read files from the local filesystem or from internal network resources
|
||||
via XML External Entity (XXE) Processing.</p>
|
||||
|
||||
<p>Mitigation:<br/>
|
||||
Apache POI 4.1.0 and before: users who do not use the tool XSSFExportToXml
|
||||
are not affected. Affected users are advised to update to Apache POI 4.1.1
|
||||
which fixes this vulnerability.</p>
|
||||
|
||||
<p>Credit:
|
||||
This issue was discovered by Artem Smotrakov from SAP</p>
|
||||
|
||||
<p>References:
|
||||
<a href="https://en.wikipedia.org/wiki/XML_external_entity_attack">XML external entity attack</a>
|
||||
</p>
|
||||
</section>
|
||||
|
||||
|
||||
<!-- xmlbeans 3.1.0 release -->
|
||||
<section><title>26 March 2019 - XMLBeans 3.1.0 available</title>
|
||||
<p>The Apache POI team is pleased to announce the release of XMLBeans 3.1.0.
|
||||
Featured are a handful of bug fixes.</p>
|
||||
<p>The Apache POI project has unretired the XMLBeans codebase and is maintaining it as a sub-project,
|
||||
due to its importance in the poi-ooxml codebase.</p>
|
||||
<p>A summary of changes is available in the
|
||||
<a href="https://svn.apache.org/viewvc/xmlbeans/trunk/CHANGES.txt?view=markup">Release Notes</a>.
|
||||
People interested should also follow the <a href="site:mailinglists">POI dev list</a> to track progress.</p>
|
||||
<p>The XMLBeans <a href="https://issues.apache.org/jira/projects/XMLBEANS">JIRA project</a> has been reopened; feel free to open issues.</p>
|
||||
<p>POI 4.1.0 uses XMLBeans 3.1.0.</p>
|
||||
<p>XMLBeans requires Java 6 or newer since version 3.0.2.</p>
|
||||
</section>
|
||||
|
||||
<section><title>11 January 2019 - Initial support for JDK 11</title>
|
||||
<p>We did some work to verify that compilation with Java 11 is working and
|
||||
that all unit-tests pass.
|
||||
</p>
|
||||
<p>See the details in the <a href="help/faq.html#faq-N102B0">FAQ entry</a>.</p>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
<section><title>Mission Statement</title>
|
||||
<p>
|
||||
The Apache POI Project's mission is to create and maintain Java APIs for manipulating various file formats
|
||||
based upon the Office Open XML standards (OOXML) and Microsoft's OLE 2 Compound Document format (OLE2).
|
||||
In short, you can read and write MS Excel files using Java.
|
||||
In addition, you can read and write MS Word and MS PowerPoint files using Java. Apache POI is your Java Excel
|
||||
solution (for Excel 97-2008). We have a complete API for porting other OOXML and OLE2 formats and welcome others to participate.
|
||||
</p>
|
||||
<p>
|
||||
OLE2 files include most Microsoft Office files such as XLS, DOC, and PPT as well as MFC serialization API based file formats.
|
||||
The project provides APIs for the <a href="site:poifs">OLE2 Filesystem (POIFS)</a> and
|
||||
<a href="site:hpsf">OLE2 Document Properties (HPSF)</a>.
|
||||
</p>
|
||||
<p>
|
||||
Office OpenXML Format is the new standards-based XML file format found in Microsoft Office 2007 and 2008.
|
||||
This includes XLSX, DOCX and PPTX. The project provides a low-level API to support the Open Packaging Conventions
|
||||
using <a href="site:oxml4j">openxml4j</a>.
|
||||
</p>
|
||||
<p>
|
||||
For each MS Office application there exists a component module that attempts to provide a common high-level Java API to both OLE2 and OOXML
|
||||
document formats. This is most developed for <a href="site:spreadsheet">Excel workbooks (SS=HSSF+XSSF)</a>.
|
||||
Work is progressing for <a href="site:document">Word documents (WP=HWPF+XWPF)</a> and
|
||||
<a href="site:slideshow">PowerPoint presentations (SL=HSLF+XSLF)</a>.
|
||||
</p>
|
||||
<p>
|
||||
The project has some support for <a href="site:hsmf">Outlook (HSMF)</a>. Microsoft opened the specifications
|
||||
to this format in October 2007. We would welcome contributions.
|
||||
</p>
|
||||
<p>
|
||||
There are also projects for
|
||||
<a href="site:diagram">Visio (HDGF and XDGF)</a>,
|
||||
<a href="site:hmef">TNEF (HMEF)</a>,
|
||||
and <a href="site:hpbf">Publisher (HPBF)</a>.
|
||||
</p>
|
||||
<p>
|
||||
As a general policy we collaborate as much as possible with other projects to
|
||||
provide this functionality. Examples include: <a href="https://xml.apache.org/cocoon">Cocoon</a> for
|
||||
which there are serializers for HSSF;
|
||||
<a href="https://www.openoffice.org">OpenOffice.org</a> with whom we collaborate in documenting the
|
||||
XLS format; and <a href="https://tika.apache.org/">Tika</a> /
|
||||
<a href="https://lucene.apache.org">Lucene</a>,
|
||||
for which we provide format interpreters. When practical, we donate
|
||||
components directly to those projects for POI-enabling them.
|
||||
</p>
|
||||
<section><title>Why should I use Apache POI?</title>
|
||||
<p>
|
||||
A major use of the Apache POI API is for <a href="text-extraction.html">Text Extraction</a> applications
|
||||
such as web spiders, index builders, and content management systems.
|
||||
</p>
|
||||
<p>
|
||||
So why should you use POIFS, HSSF or XSSF?
|
||||
</p>
|
||||
<p>
|
||||
You'd use POIFS if you had a document written in OLE 2 Compound Document Format, probably written using
|
||||
MFC, that you needed to read in Java. Alternatively, you'd use POIFS to write OLE 2 Compound Document Format
|
||||
if you needed to inter-operate with software running on the Windows platform. We are not just bragging when
|
||||
we say that POIFS is the most complete and correct implementation of this file format to date!
|
||||
</p>
|
||||
<p>
|
||||
You'd use HSSF if you needed to read or write an Excel file using Java (XLS). You'd use
|
||||
XSSF if you needed to read or write an OOXML Excel file using Java (XLSX). The combined
|
||||
SS interface allows you to easily read and write all kinds of Excel files (XLS and XLSX)
|
||||
using Java. Additionally, there is a specialized SXSSF implementation which allows you to write
|
||||
very large Excel (XLSX) files in a memory-optimized way.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Components</title>
|
||||
<p>
|
||||
The Apache POI Project provides several component modules, some of which may not be of interest to you.
|
||||
Use the information on our <a href="site:components">Components</a> page to determine which
|
||||
jar files to include in your classpath.
|
||||
</p>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
<section><title>Contributing</title>
|
||||
<p>
|
||||
So you'd like to contribute to the project? Great! We need enthusiastic,
|
||||
hard-working, talented folks to help us on the project, no matter your
|
||||
background. So if you're motivated, ready, and have the time: Download the
|
||||
source from the
|
||||
<a href="site:subversion">Subversion Repository</a>,
|
||||
<a href="site:howtobuild">build the code</a>, join the
|
||||
<a href="site:mailinglists">mailing lists</a>, and we'll be happy to
|
||||
help you get started on the project!
|
||||
</p>
|
||||
<p>
|
||||
Please read our <a href="site:guidelines">Contribution Guidelines</a>.
|
||||
When your contribution is ready, submit a patch to our
|
||||
<a href="https://bz.apache.org/bugzilla/buglist.cgi?product=POI">Bug Database</a>.
|
||||
</p>
|
||||
</section>
|
||||
</body>
|
||||
<footer>
|
||||
<legal>
|
||||
Copyright (c) @year@ The Apache Software Foundation. All rights reserved.
|
||||
<br />
|
||||
Apache POI, POI, Apache, the Apache feather logo, and the Apache
|
||||
POI project logo are trademarks of The Apache Software Foundation.
|
||||
</legal>
|
||||
</footer>
|
||||
</document>
|
||||
100
src/documentation/content/xdocs/legal.xml
Normal file
@ -0,0 +1,100 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Apache POI™ - Legal Stuff</title>
|
||||
<authors>
|
||||
<person id="TK" name="Tetsuya Kitahata" email="tetsuya@apache.org"/>
|
||||
<person id="DF" name="David Fisher" email="dfisher@jmlafferty.com"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section><title>License and Notice</title>
|
||||
<p>
|
||||
Apache POI™ releases are available under the <a href="https://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0.</a>
|
||||
See the NOTICE file contained in each release artifact for applicable copyright attribution notices. Release artifacts are available
|
||||
from the <a href="site:download">Download</a> page.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Copyrights and Trademarks</title>
|
||||
<p>
|
||||
All material on this website is Copyright © 2002-2025, The Apache
|
||||
Software Foundation.
|
||||
</p>
|
||||
<p>
|
||||
Apache POI, POI, Apache, the Apache feather logo, and the Apache POI
|
||||
project logo are trademarks of The Apache Software Foundation.
|
||||
</p>
|
||||
<p>
|
||||
Sun, Sun Microsystems, Solaris, Java, JavaServer Web Development Kit,
|
||||
and JavaServer Pages are trademarks or registered trademarks of Sun
|
||||
Microsystems, Inc. UNIX is a registered trademark in the United States
|
||||
and other countries, exclusively licensed through 'The Open Group'.
|
||||
Microsoft, Windows, WindowsNT, Excel, Word, PowerPoint, Visio, Publisher, Outlook,
|
||||
and Win32 are registered trademarks of Microsoft Corporation.
|
||||
Linux is a registered trademark of Linus Torvalds.
|
||||
All other product names mentioned herein and throughout the entire
|
||||
web site are trademarks of their respective owners.
|
||||
</p>
|
||||
<section><title>Cryptography Notice</title>
|
||||
<p>
|
||||
This distribution includes cryptographic software. The country in
|
||||
which you currently reside may have restrictions on the import,
|
||||
possession, use, and/or re-export to another country, of
|
||||
encryption software. BEFORE using any encryption software, please
|
||||
check your country's laws, regulations and policies concerning the
|
||||
import, possession, or use, and re-export of encryption software, to
|
||||
see if this is permitted. See
|
||||
<a href="http://www.wassenaar.org/">http://www.wassenaar.org/</a>
|
||||
for more information.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
The U.S. Government Department of Commerce, Bureau of Industry and
|
||||
Security (BIS), has classified this software as Export Commodity
|
||||
Control Number (ECCN) 5D002.C.1, which includes information security
|
||||
software using or performing cryptographic functions with asymmetric
|
||||
algorithms. The form and manner of this Apache Software Foundation
|
||||
distribution makes it eligible for export under the License Exception
|
||||
ENC Technology Software Unrestricted (TSU) exception (see the BIS
|
||||
Export Administration Regulations, Section 740.13) for both object
|
||||
code and source code.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
The cryptographic software used is from <em>java.security</em> and
|
||||
<em>javax.crypto</em> and is used when processing encrypted and
|
||||
protected documents.
|
||||
</p>
|
||||
</section>
|
||||
</section>
|
||||
</body>
|
||||
<footer>
|
||||
<legal>
|
||||
Copyright (c) @year@ The Apache Software Foundation. All rights reserved.
|
||||
<br />
|
||||
Apache POI, POI, Apache, the Apache feather logo, and the Apache
|
||||
POI project logo are trademarks of The Apache Software Foundation.
|
||||
</legal>
|
||||
</footer>
|
||||
</document>
|
||||
233
src/documentation/content/xdocs/news.xml
Normal file
@ -0,0 +1,233 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Apache POI™ - In the News around the world</title>
|
||||
<authors>
|
||||
<person id="AO" name="Andrew C. Oliver" email="acoliver@apache.org"/>
|
||||
<person id="TK" name="Tetsuya Kitahata" email="tetsuya@apache.org"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section><title>POI in the news</title>
|
||||
<p>
|
||||
These are articles/etc. posted about POI around the web. If you
|
||||
see POI in the news or mentioned at least somewhat prominently
|
||||
on a site (not your homepage that you put the word POI on in
|
||||
order to get us to link you and by the way here is a picture of
|
||||
your wife and kids) then send a patch to the list. In general
|
||||
equal time will be given so please feel free to send inflammatory
|
||||
defamation as well as favorable, technical and factual. Really
|
||||
stupid things won't be mentioned (sorry).
|
||||
</p>
|
||||
</section>
|
||||
<section><title>English</title>
|
||||
<ul>
|
||||
<li>
|
||||
<a href="http://archive.midrange.com/web400/200204/msg00023.html">Discussion about using POI on AS/400s</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="http://www.somelist.com/mails/23819.html">Discussion from back when we almost had POI as the filter for KOffice if politics and licenses hadn't killed it</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="http://www.oreillynet.com/pub/wlg/1552?page=last&x-showcontent=text">Java discussion on O'Reilly Network including discussion about POI</a> - O'Reilly.net
|
||||
</li>
|
||||
<li>
|
||||
<a href="http://www.rollerweblogger.org/page/roller/20020715">Poor Obfuscation Implementation.</a> - Blog of David M. Johnson
|
||||
</li>
|
||||
<li>
|
||||
<a href="http://www.jsurfer.org/article.php?sid=322">
|
||||
POI 1.5-dev-rc2 released </a> - JSurfer
|
||||
</li>
|
||||
|
||||
<li>
|
||||
<a href="http://directory.google.com/Top/Computers/Programming/Languages/Java/Class_Libraries/Data_Formats/Microsoft_Formats/"> Google says we're the most important in our category </a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="http://www.javaworld.com/javaworld/javaqa/2002-05/01-qa-0503-excel3.html">It's POI-fect</a> - Tony Sintes, Javaworld
|
||||
</li>
|
||||
<li>
|
||||
<a href="http://www.need-a-cake.com/categories/cocoonWeblog/2002/03/07.html">
|
||||
Nicola announces POI serialization code
|
||||
</a> - Matthew Langham's Radio Weblog
|
||||
</li>
|
||||
<li>
|
||||
<a href="http://javalobby.org/discussionContext/showThreaded/frm/javalobby?folderId=20&discussionContextId=11523">
|
||||
Jakarta POI 1.4583 Released</a> - JavaLobby
|
||||
</li>
|
||||
<li>
|
||||
<a href="http://javalobby.org/discussionContext/showThreaded/frm/javalobby?discussionContextId=11442&folderId=20">
|
||||
POI project moves to Jakarta (OLE 2 CDF/Excel/Word in
|
||||
pure java)</a> - JavaLobby
|
||||
</li>
|
||||
<li>
|
||||
<a
|
||||
href="http://www.geocities.com/marcoschmidt.geo/java-image-coding.html">
|
||||
List of Java libraries to read and write image and document files
|
||||
</a> Marco Schmidt's homepage (normally we wouldn't
|
||||
feature someone's homepage but it's an extensive list of
|
||||
information including "alternatives to POI" (for those
|
||||
of you who are very wealthy). But heck I think I'll
|
||||
bookmark his page for myself since he's like got every
|
||||
piece of info known to man linked or featured on it!)
|
||||
</li>
|
||||
<li>
|
||||
<a href="http://radio.weblogs.com/0101350/">
|
||||
The Experiences of an Operator (Måns af Klercker)
|
||||
</a> - radio.weblogs.com
|
||||
</li>
|
||||
<li>
|
||||
<a href="http://dataconv.org/apps_office.html">
|
||||
DATACONV - Data Conversion Tools: Office
|
||||
</a> DATACONV
|
||||
</li>
|
||||
<li>
|
||||
<a href="http://chicago.sourceforge.net/devel/">
|
||||
Chicago Developer Page
|
||||
</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="http://www.onjava.com/pub/d/1157">
|
||||
POI/POI Serialization Project
|
||||
</a> - Man you know you've hit the bigtime when
|
||||
O'Reilly Likes you.. ;-)
|
||||
</li>
|
||||
<li>
|
||||
<a
|
||||
href="http://www.javaworld.com/netnews/index.shtml">
|
||||
News Around the Net
|
||||
</a> - Java World
|
||||
</li>
|
||||
|
||||
</ul>
|
||||
</section>
|
||||
<section><title>Nederlandstalige (Dutch)</title>
|
||||
<ul>
|
||||
<li>
|
||||
<a
|
||||
href="http://www.ster.be/java/java9.html">
|
||||
Een Excel-werkboek maken vanuit Java - Lieven Smits
|
||||
</a>
|
||||
</li>
|
||||
</ul>
|
||||
</section>
|
||||
<section><title>Deutsch (German)</title>
|
||||
<ul>
|
||||
<li> <a
|
||||
href="http://www.entwickler.com/itr/news/show.php3?id=6132&nodeid=82 ">Apache POI veröffentlicht</a> - entwickler.com
|
||||
</li>
|
||||
<li>
|
||||
<a
|
||||
href="http://www.jsp-develop.de/newsletter/10/">
|
||||
Apache Jakarta-Projekt bringt Word und Excel in die Java-Welt </a> - jsp-develop.de (for the misguided who use JSP ;-) )
|
||||
</li>
|
||||
<li>
|
||||
<a
|
||||
href="http://www.entwickler.com/news/2002/02/5718/news.shtml">
|
||||
Neues Apache-Projekt bringt Word- und Excel nach Java
|
||||
</a> - entwickler.com
|
||||
</li>
|
||||
</ul>
|
||||
</section>
|
||||
<section><title>Español (Spanish)</title>
|
||||
<ul>
|
||||
<li>
|
||||
<a href="http://www.javahispano.com/noticias/todas.jsp">
|
||||
OLE2 desde Java nativo
|
||||
</a> - javaHispano
|
||||
</li>
|
||||
<li>
|
||||
<a href="http://p2p.wrox.com/archive/java_espanol/2002-08/3.asp">Spanish discussion about Excel and Java including POI from Wrox forums</a>
|
||||
</li>
|
||||
</ul>
|
||||
</section>
|
||||
<section><title>Français (French)</title>
|
||||
<ul>
|
||||
<li>
|
||||
<a href="http://linuxfr.org/section/D%E9veloppeur,0,1,8,0.html">
|
||||
Excel/OLE accessibles
|
||||
</a> - Da Linux French Page
|
||||
</li>
|
||||
<li>
|
||||
<a href="http://www.sogid.com/javalist/f2002/traiter_word_java.html">Discussion on POI in French</a>
|
||||
</li>
|
||||
</ul>
|
||||
</section>
|
||||
<section><title>Nihongo (Japanese)</title>
|
||||
<ul>
|
||||
<li>
|
||||
<a href="http://drpanda.freezope.org/Memo/docs/jakarta/poi/poi_sample">100% PureJava...</a> - Dr. Panda Portal
|
||||
</li>
|
||||
<li>
|
||||
<a
|
||||
href="http://www.gimlay.org/~andoh/java/javanew.html">
|
||||
What's new with java?
|
||||
</a> - gimlay.org
|
||||
</li>
|
||||
<li><a href="http://taka-2.com/jclass/POI/">Java de Excel</a> - How to use Japanese with POI</li>
|
||||
<li><a href="http://www.tech-arts.co.jp/macosx/webobjects-jp/htdocs/3200/3218.html">Various discussion in Japanese including on POI</a></li>
|
||||
<li><a href="http://muimi.com/j/jakarta/">Japanese resources on Jakarta projects including POI</a></li>
|
||||
<li><a href="http://www.fk.urban.ne.jp/home/kishida/">Kishida's site</a> -- Weekly Forte Lectures -- includes a snip about POI and Japanese.</li>
|
||||
|
||||
</ul>
|
||||
</section>
|
||||
<section><title>Russkii Yazyk (Russian)</title>
|
||||
<ul>
|
||||
<li>
|
||||
<a href="http://www.nestor.minsk.by/kg/kg02/21/kg22108.html">
|
||||
Probably a translation of the Javalobby announcement of 1.5-final
|
||||
</a> -- Computer News (What's New)
|
||||
</li>
|
||||
</ul>
|
||||
</section>
|
||||
<section><title>Hangul (Korean)</title>
|
||||
<ul>
|
||||
<li>
|
||||
<a href="http://www.javabrain.co.kr/AnswerView?questionId=1189&categoryId=8">Various discussion in Korean about Excel output/APIs including POI</a>
|
||||
</li>
|
||||
</ul>
|
||||
</section>
|
||||
<section><title>No freaking idea</title>
|
||||
<p>
|
||||
If you can read one of these languages, send mail to the list
|
||||
telling us what language it is and we'll categorize it!
|
||||
</p>
|
||||
<ul>
|
||||
<li>
|
||||
<a
|
||||
href="http://www.javacentrix.com/index.htm">
|
||||
If I had to guess, I'd say this is Thai, but
|
||||
maybe you actually know</a> - javacentrix.com
|
||||
</li>
|
||||
</ul>
|
||||
</section>
|
||||
</body>
|
||||
<footer>
|
||||
<legal>
|
||||
Copyright (c) @year@ The Apache Software Foundation. All rights reserved.
|
||||
<br />
|
||||
Apache POI, POI, Apache, the Apache feather logo, and the Apache
|
||||
POI project logo are trademarks of The Apache Software Foundation.
|
||||
</legal>
|
||||
</footer>
|
||||
</document>
|
||||
254
src/documentation/content/xdocs/related-projects.xml
Normal file
@ -0,0 +1,254 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Apache POI™ - Related Projects</title>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section>
|
||||
<title>Introduction</title>
|
||||
<p>
|
||||
This page lists other projects that you might find interesting when working with documents of various types. Suggestions for additional links are welcome; however, please note that we only list open source projects here. Commercial applications can provide <a href="casestudies.html">case studies</a> if they want to show their support for POI.
|
||||
</p>
|
||||
</section>
|
||||
<section>
|
||||
<title>Apache projects</title>
|
||||
<section>
|
||||
<title>Apache Tika</title>
|
||||
<p>
|
||||
<a href="https://tika.apache.org/">Apache Tika</a>
|
||||
is a toolkit which detects and extracts metadata and text from over a thousand different file types.
|
||||
</p>
|
||||
</section>
|
||||
<section>
|
||||
<title>Apache Drill</title>
|
||||
<p>
|
||||
<a href="https://drill.apache.org/">Apache Drill</a>
|
||||
is a toolkit that allows the use of SQL querying on numerous file and data formats. The POI support is in
|
||||
the <a href="https://drill.apache.org/docs/excel-format-plugin/">excel-format-plugin</a>.
|
||||
</p>
|
||||
</section>
|
||||
<section>
|
||||
<title>Apache Hop</title>
|
||||
<p>
|
||||
<a href="https://hop.apache.org/">Apache Hop</a>
|
||||
is a data orchestration and data engineering platform. The POI support is in
|
||||
the <a href="https://hop.apache.org/manual/latest/pipeline/transforms/excelinput.html">excelinput</a> transform
|
||||
and the <a href="https://hop.apache.org/manual/latest/pipeline/transforms/excelwriter.html">excelwriter</a> transform.
|
||||
|
||||
</p>
|
||||
</section>
|
||||
<section>
|
||||
<title>Apache DolphinScheduler</title>
|
||||
<p>
|
||||
<a href="https://dolphinscheduler.apache.org/">Apache DolphinScheduler</a>
|
||||
is a distributed and easy-to-extend visual workflow scheduler system. The POI support is in
|
||||
the alert email component.
|
||||
|
||||
</p>
|
||||
</section>
|
||||
<section>
|
||||
<title>Worksheet plugin for JSPWiki</title>
|
||||
<p>
|
||||
There is a <a href="https://jspwiki-wiki.apache.org/Wiki.jsp?page=WorksheetPlugin#top">Worksheet
|
||||
plugin</a> for <a href="https://jspwiki.apache.org/">JSPWiki</a> which allows you to display contents of Excel
|
||||
files as a table in JSPWiki.
|
||||
</p>
|
||||
</section>
|
||||
</section>
|
||||
<section>
|
||||
<title>Apache incubating projects</title>
|
||||
<section><title>Apache Linkis</title>
|
||||
<p>
|
||||
<a href="https://linkis.apache.org/">Apache Linkis (incubating)</a> is a computation middleware layer.
|
||||
The linkis-storage component has an Excel read capability built using Apache POI.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Apache Seatunnel</title>
|
||||
<p>
|
||||
<a href="https://seatunnel.apache.org/">Apache Seatunnel (incubating)</a> is a high-performance, distributed, massive data integration framework.
|
||||
The seatunnel-connector-spark-email component uses Apache POI.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Apache ODF Toolkit (retired)</title>
|
||||
<p>
|
||||
<a href="https://incubator.apache.org/projects/odftoolkit.html">Apache ODF Toolkit (incubating)</a> is a set of Java modules that allow programmatic creation, scanning and manipulation of OpenDocument Format (ISO/IEC 26300 == ODF) documents.
|
||||
See also <a href="https://odftoolkit.org/">new website</a>.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section><title>Apache Corinthia (retired)</title>
|
||||
<p>
|
||||
<a href="https://corinthia.incubator.apache.org/">Apache Corinthia (incubating)</a> is a toolkit/application written in C++ for converting between and editing common office file formats, with an initial focus on word processing.
|
||||
</p>
|
||||
</section>
|
||||
</section>
|
||||
<section>
|
||||
<title>Other projects</title>
|
||||
<section><title>Jackcess</title>
|
||||
<p>
|
||||
<a href="http://jackcess.sourceforge.net/">Jackcess</a> is a pure Java library for reading from and writing to MS Access databases available under Apache License 2.0.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>poi-mail-merge</title>
|
||||
<p>
|
||||
<a href="https://github.com/centic9/poi-mail-merge">poi-mail-merge</a> is a small tool to automate mail-merges, i.e. replacing strings in a template Microsoft Word file multiple times with data from a list of replacements
|
||||
provided as Excel/CSV data. Available under the BSD 2-Clause License.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>poi-visio</title>
|
||||
<p>Merged into POI as of version 3.14</p>
|
||||
<p>
|
||||
<a href="https://github.com/BBN-D/poi-visio">poi-visio</a> is a Java library that loads Visio OOXML (vsdx) files and creates an in-memory data structure that allows full access to the contents of the document.
|
||||
It has built-in support for easily traversing the content of the document in a structured way, and can render pages to simplified PNG files, or other backends supported by Java AWT.
|
||||
Currently, the library only operates in read-only mode, but its design does not exclude being able to modify existing documents or creating new documents.
|
||||
Available under the Apache License, Version 2.0.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>poi-visio-graph</title>
|
||||
<p>
|
||||
<a href="https://github.com/BBN-D/poi-visio-graph">poi-visio-graph</a> is a Java library that loads Visio OOXML (vsdx) files using the poi-visio library and creates an in-memory graph structure from the objects present on the page.
|
||||
It utilizes user-specified connection points and also performs analysis to infer logical visual connection points between the objects on each page.
|
||||
One possible use of this library is to create a network diagram from a Visio document.
|
||||
Available under the Apache License, Version 2.0.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>NPOI</title>
|
||||
<p>
|
||||
<a href="https://npoi.codeplex.com/">NPOI</a> is a .NET version of Apache POI available under Apache License 2.0.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Vaadin Spreadsheet</title>
|
||||
<p>
|
||||
<a href="https://github.com/vaadin/spreadsheet">Vaadin Spreadsheet</a> is a UI component add-on for Vaadin 7 which provides means to view and edit Excel spreadsheets in Vaadin applications.
|
||||
Available under the Commercial Vaadin Add-on License version 3 (CVALv3).
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Excel module for Apache Isis</title>
|
||||
<p>
|
||||
<a href="https://github.com/isisaddons/isis-module-excel">Excel module for Apache Isis</a> is an add-on for Apache Isis and provides a domain service so that a collection of (view model)
|
||||
objects can be exported to an Excel spreadsheet, or recreated by importing from Excel.
|
||||
Available under the Apache License, Version 2.0.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Excel Streaming Reader</title>
|
||||
<p>
|
||||
<a href="https://github.com/monitorjbl/excel-streaming-reader">Excel Streaming Reader</a> uses the POI Streaming API to provide Row/Cell like read-access to large Excel spreadsheets.
|
||||
Available under the Apache License, Version 2.0.
|
||||
</p>
|
||||
<p>
|
||||
<a href="https://github.com/pjfanning/excel-streaming-reader">Forked Version</a> that supports the latest POI versions.
|
||||
Has support for a number of extra features, including Strict OOXML files.
|
||||
Also available under the Apache License, Version 2.0.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>fast-excel</title>
|
||||
<p>
|
||||
<a href="https://github.com/dhatim/fastexcel/">fastexcel</a> is a benchmarked library for reading and writing Excel files.
|
||||
Available under the Apache License, Version 2.0.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>poi-shared-strings</title>
|
||||
<p>
|
||||
<a href="https://github.com/pjfanning/poi-shared-strings/">poi-shared-strings</a> is a memory efficient Shared Strings Table and Comments Table implementation for POI streaming.
|
||||
Available under the Apache License, Version 2.0.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>The Wordinator</title>
|
||||
<p>
|
||||
<a href="https://github.com/drmacro/wordinator">The Wordinator</a> abstracts the general problem of mapping from XML (or any similar structured content--with XSLT 3 you could just as easily process JSON content or some other format) to word processing data through a relatively simple XML structure, the Simple Word Processing Markup Language (SWPX), which is basically OOXML simplified way down.
|
||||
Available under the Apache License, Version 2.0.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>POI-TL</title>
|
||||
<p>
|
||||
<a href="https://github.com/Sayi/poi-tl">POI-TL</a> is a Word template engine that generates new documents based on a Word template and data.
|
||||
Available under the Apache License, Version 2.0.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>XDocReport</title>
|
||||
<p>
|
||||
<a href="https://github.com/opensagres/xdocreport">XDocReport</a> is a Java API to merge XML documents created with MS Office (docx), OpenOffice (odt)
|
||||
or LibreOffice (odt) with a Java model to generate a report and, if needed, convert it to another format (PDF, XHTML...).
|
||||
XDocReport code is licensed under the MIT license, but the samples are licensed under the LGPL license.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Frosted Sheets</title>
|
||||
<p>
|
||||
<a href="https://bitbucket.org/erosa/frostedsheets/overview">Frosted Sheets</a> is a Groovy library which provides decorators for Apache POI spreadsheets, making it easier to work with spreadsheets
|
||||
in Groovy.
|
||||
Frosted Sheets is licensed under the Apache License, Version 2.0.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>IEXL Software</title>
|
||||
<p>
|
||||
<a href="http://www.iexlsoftware.com/">iEXL</a> is a commercial product which allows you to generate Excel spreadsheets on AS/400, iSeries, i5 or IBM i on Power systems.
|
||||
It uses Apache POI internally.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>jotlmsg</title>
|
||||
<p>
|
||||
<a href="https://github.com/ctabin/jotlmsg">jotlmsg</a> is a simple API (on top of POI) to easily generate Microsoft Outlook message files (.msg).
|
||||
</p>
|
||||
</section>
|
||||
<section><title>HadoopOffice</title>
|
||||
<p>
|
||||
<a href="https://github.com/ZuInnoTe/hadoopoffice">HadoopOffice</a> allows you to read and write Office documents while using the Hadoop ecosystem.
|
||||
Available under the Apache License, Version 2.0.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Scala POI Wrapper</title>
|
||||
<p>
|
||||
<a href="https://github.com/norbert-radyk/spoiwo">SPOIWO</a> allows you to read and write Office documents using Scala friendly APIs.
|
||||
Available under the MIT License.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>Spark Excel</title>
|
||||
<p>
|
||||
<a href="https://github.com/crealytics/spark-excel">Spark Excel</a> allows you to read and write Excel documents into/from Spark Dataframes.
|
||||
Available under the Apache License, Version 2.0.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>ExcelUtil</title>
|
||||
<p>
|
||||
<a href="https://github.com/nambach/ExcelUtil">ExcelUtil</a> is a Java wrapper using Apache POI to read and write Excel files in declarative fashion.
|
||||
Available under the Apache License, Version 2.0.
|
||||
</p>
|
||||
</section>
|
||||
<section><title>bld-commons/dev-excel</title>
|
||||
<p>
|
||||
<a href="https://github.com/bld-commons/dev-excel">dev-excel</a> is a Java wrapper using Apache POI to read and write Excel files.
|
||||
Available under the MIT License.
|
||||
</p>
|
||||
</section>
|
||||
</section>
|
||||
</body>
|
||||
<footer>
|
||||
<legal>
|
||||
Copyright (c) @year@ The Apache Software Foundation. All rights reserved.
|
||||
<br />
|
||||
Apache POI, POI, Apache, the Apache feather logo, and the Apache
|
||||
POI project logo are trademarks of The Apache Software Foundation.
|
||||
</legal>
|
||||
</footer>
|
||||
</document>
|
||||
114
src/documentation/content/xdocs/security.xml
Normal file
@ -0,0 +1,114 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Apache POI™ - Security guidance</title>
|
||||
<authors>
|
||||
<person id="centic" name="Dominik Stadler" email="centic@apache.org"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section>
|
||||
<title>Overview</title>
|
||||
|
||||
<p>This page provides some guidance about how Apache POI can be used in security-sensitive areas.</p>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Information about related security vulnerabilities</title>
|
||||
|
||||
<p>Information about security issues is included in the <a href="index.html">Project News</a>.</p>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Reporting security vulnerabilities</title>
|
||||
|
||||
<p>Apache POI will try to fix security-related bugs with priority.</p>
|
||||
|
||||
<p>Please follow the general <a href="https://www.apache.org/security/">Apache Security Guidelines</a>
|
||||
for proper handling.</p>
|
||||
|
||||
<p>But please note that by the nature of processing external files, you should design your application
|
||||
in a way which limits impact of malicious documents as much as possible. The higher your security-related
|
||||
requirements are, the more you will likely need to invest in your application to contain effects.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Architecting your Application</title>
|
||||
|
||||
<p>If you are processing documents from an untrusted source, you should add a number of safeguards to
|
||||
your application to contain any unexpected side effects.</p>
|
||||
|
||||
<p>Apache POI cannot fully protect against some documents causing impact on the current process, therefore
|
||||
we suggest the following additional layers of security.</p>
|
||||
|
||||
<ul>
|
||||
<li><strong>Expect any type of Exception when processing documents</strong><br/>
|
||||
As parsing the various formats is very complex and involved, there are some unexpected types of
|
||||
exceptions which can be thrown, e.g. StackOverflowError or many different types of RuntimeException.
|
||||
<br/>
|
||||
Make sure to have a broad catch-statement around your document-parsing functionality and be prepared
|
||||
to handle all those gracefully.
|
||||
</li>
|
||||
<li><strong>Expect long parsing time</strong><br/>
|
||||
As parsing the various formats is very complex and involved, some documents might cause prolonged CPU
|
||||
usage and long parsing time.
|
||||
<br/>
|
||||
If this is a concern, make sure to have a way to stop processing after some time, maybe by the
|
||||
sandboxing approach described below.
|
||||
</li>
|
||||
<li><strong>Memory use can be very high</strong><br/>
|
||||
The data in Microsoft format files is usually compressed so even small files can have a lot of data.
|
||||
<br/>
|
||||
The core POI APIs are not optimized to avoid excessive memory use. POI has streaming APIs for reading
|
||||
and writing xlsx files - so if you are working with large xlsx files, you should consider using the
|
||||
streaming APIs.
|
||||
</li>
|
||||
<li><strong>Consider sandboxing document-parsing</strong><br/>
|
||||
If you operate in a highly sensitive environment and would like to avoid any side effect from
|
||||
parsing documents on your application, then consider extracting the parsing logic into a separate
|
||||
process which is configured with appropriate memory settings and which you stop after some timeout.
|
||||
It is a good idea to be able to auto-restart the process in case of a crash.
|
||||
<br />
|
||||
</li>
|
||||
<li><strong>Keep up to date with releases</strong><br/>
|
||||
Apache POI does occasionally issue CVEs for security issues. There are also other bug fixes and
|
||||
improvements in each release. Some of these fixes will be to make POI more robust against malicious
|
||||
inputs, even if they are not explicitly security-related.
|
||||
<br />
|
||||
</li>
|
||||
</ul>
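<p>As a minimal sketch of the first two points in the list above (expecting any kind of failure and
bounding the parsing time), the following Java helper runs a parsing step on a worker thread with a
timeout. It uses only JDK classes. <code>GuardedParser</code> and its <code>parse</code> method are
hypothetical names chosen for illustration and are not part of Apache POI; the parsing step is whatever
POI call your application makes.</p>
<source><![CDATA[
```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper for illustration only; not part of Apache POI. */
final class GuardedParser {

    private GuardedParser() {}

    /**
     * Runs the parsing step on a worker thread and returns its result, or
     * Optional.empty() if it threw anything or did not finish in time.
     */
    static <T> Optional<T> parse(Callable<T> parsingStep, long timeoutMillis) {
        // Daemon thread: a parse that ignores interruption cannot keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
            Thread worker = new Thread(runnable, "guarded-parser");
            worker.setDaemon(true);
            return worker;
        });
        Future<T> future = executor.submit(parsingStep);
        try {
            return Optional.ofNullable(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            // cancel(true) only interrupts the worker; CPU-bound code may keep running.
            future.cancel(true);
            return Optional.empty();
        } catch (ExecutionException e) {
            // Wraps whatever the parsing step threw, including RuntimeException
            // and Errors such as StackOverflowError (see e.getCause()).
            return Optional.empty();
        } catch (InterruptedException e) {
            // Restore the interrupt flag for the caller and give up on the parse.
            Thread.currentThread().interrupt();
            future.cancel(true);
            return Optional.empty();
        } finally {
            executor.shutdownNow();
        }
    }
}
```
]]></source>
<p>A caller might, for example, wrap a call such as <code>WorkbookFactory.create(file)</code> in the
parsing step and treat an empty result as a rejected document. Because cancelling only interrupts the
worker thread, a CPU-bound parse may continue in the background, so this sketch complements rather than
replaces the separate-process sandboxing described in the list above.</p>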
|
||||
</section>
|
||||
</body>
|
||||
|
||||
<footer>
|
||||
<legal>
|
||||
Copyright (c) @year@ The Apache Software Foundation. All rights reserved.
|
||||
<br />
|
||||
Apache POI, POI, Apache, the Apache feather logo, and the Apache
|
||||
POI project logo are trademarks of The Apache Software Foundation.
|
||||
</legal>
|
||||
</footer>
|
||||
</document>
|
||||
172
src/documentation/content/xdocs/site.xml
Normal file
@ -0,0 +1,172 @@
|
||||
<?xml version="1.0"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!--
|
||||
Forrest site.xml
|
||||
|
||||
This file contains an outline of the site's information content. It is used to:
|
||||
- Generate the website menus (though these can be overridden - see docs)
|
||||
- Provide semantic, location-independent aliases for internal 'site:' URIs, eg
|
||||
<a href="site:changes"> links to changes.html (or ../changes.html if in
|
||||
subdir).
|
||||
- Provide aliases for external URLs in the external-refs section. Eg, <a
|
||||
href="ext:cocoon"> links to https://xml.apache.org/cocoon/
|
||||
|
||||
See https://xml.apache.org/forrest/linking.html for more info
|
||||
-->
|
||||
<site label="POI" xmlns="http://apache.org/forrest/linkmap/1.0">
|
||||
<home label="Overview" tab="home">
|
||||
<index label="Home" href="index.html"/>
|
||||
<download label="Download" href="download.html"/>
|
||||
<changes label="Changelog" href="changes.html"/>
|
||||
<javadocs label="Javadocs" href="apidocs/index.html"/>
|
||||
<extraction label="Text Extraction" href="text-extraction.html"/>
|
||||
<encryption label="Encryption support" href="encryption.html"/>
|
||||
<encryption label="Secure processing" href="security.html"/>
|
||||
<casestudies label="Case Studies" href="casestudies.html"/>
|
||||
<related label="Related projects" href="related-projects.html"/>
|
||||
<legal label="Legal" href="legal.html"/>
|
||||
</home>
|
||||
<apache label="Apache Wide" tab="home">
|
||||
<asf label="Apache Software Foundation" href="https://www.apache.org/"/>
|
||||
<license label="License" href="https://www.apache.org/licenses/"/>
|
||||
<sponsor label="Sponsorship" href="https://www.apache.org/foundation/sponsorship.html"/>
|
||||
<thanks label="Thanks" href="https://www.apache.org/foundation/thanks.html"/>
|
||||
<security label="Security" href="https://www.apache.org/security/"/>
|
||||
<privacy label="Privacy" href="https://privacy.apache.org/policies/privacy-policy-public.html"/>
|
||||
</apache>
|
||||
<components label="Component APIs" href="components/" tab="components">
|
||||
<index label="Overview" href="index.html">
|
||||
<batikpdf href="#components"/>
|
||||
</index>
|
||||
<javadocs label="Javadocs" href="site:javadocs"/>
|
||||
<spreadsheet label="Excel (HSSF/XSSF)" href="spreadsheet/">
|
||||
<index label="Overview" href="index.html"/>
|
||||
<ssquickguide label="Quick Guide" href="quick-guide.html"/>
|
||||
<sshowto label="HOWTO" href="how-to.html"/>
|
||||
<sscommon label="HSSF to SS Converting" href="converting.html"/>
|
||||
<formsupp label="Formula Support" href="formula.html"/>
|
||||
<formeval label="Formula Evaluation" href="eval.html"/>
|
||||
<evalguide label="Eval Dev Guide" href="eval-devguide.html"/>
|
||||
<ssexamples label="Examples" href="examples.html"/>
|
||||
<usecases label="Use Case" href="use-case.html"/>
|
||||
<picdocs label="Pictorial Docs" href="diagrams.html"/>
|
||||
<limits label="Limitations" href="limitations.html"/>
|
||||
<udf label="User Defined Functions" href="user-defined-functions.html"/>
|
||||
<excelant label="ExcelAnt Tests" href="excelant.html"/>
|
||||
<hackhssf label="Hacking HSSF" href="hacking-hssf.html"/>
|
||||
<recordgen label="Record Generator" href="record-generator.html"/>
|
||||
<charts label="Charts" href="chart.html"/>
|
||||
</spreadsheet>
|
||||
<slideshow label="PowerPoint (HSLF/XSLF)" href="slideshow/">
|
||||
<index label="Overview" href="index.html"/>
|
||||
<slquickguide label="Quick Guide" href="quick-guide.html"/>
|
||||
<hslfcook label="HSLF Cookbook" href="how-to-shapes.html"/>
|
||||
<xslfcook label="XSLF Cookbook" href="xslf-cookbook.html"/>
|
||||
<slrender label="Render SL/WMF/EMF" href="ppt-wmf-emf-renderer.html"/>
|
||||
<format label="PPT File Format" href="ppt-file-format.html"/>
|
||||
</slideshow>
|
||||
<document label="Word (HWPF/XWPF)" href="document/">
|
||||
<index label="Overview" href="index.html"/>
|
||||
<hwpfquick label="HWPF Quick Guide" href="quick-guide.html"/>
|
||||
<xwpfquick label="XWPF Quick Guide" href="quick-guide-xwpf.html"/>
|
||||
<docformat label="HWPF Format" href="docoverview.html"/>
|
||||
<hwpfplan label="HWPF Project plan" href="projectplan.html"/>
|
||||
</document>
|
||||
<hsmf label="Outlook (HSMF)" href="hsmf/index.html"/>
|
||||
<diagram label="Visio (HDGF+XDGF)" href="diagram/index.html"/>
|
||||
<hpbf label="Publisher (HPBF)" href="hpbf/">
|
||||
<index label="Overview" href="index.html"/>
|
||||
<hpbformat label="File Format" href="file-format.html"/>
|
||||
</hpbf>
|
||||
<poifs label="OLE2 Filesystem (POIFS)" href="poifs/">
|
||||
<index label="Overview" href="index.html"/>
|
||||
<howto label="How To" href="how-to.html"/>
|
||||
<embedded label="Embedded Documents" href="embeded.html"/>
|
||||
<format label="File System Documentation" href="fileformat.html"/>
|
||||
<usecases label="Use Cases" href="usecases.html"/>
|
||||
<design label="Design" href="design.html"/>
|
||||
</poifs>
|
||||
<hpsf label="OLE2 Document Props (HPSF)" href="hpsf/">
|
||||
<index label="Overview" href="index.html"/>
|
||||
<howto label="How To" href="how-to.html"/>
|
||||
<thumbnails label="Thumbnails" href="thumbnails.html"/>
|
||||
<internals label="Internals" href="internals.html"/>
|
||||
<todo label="To Do" href="todo.html"/>
|
||||
</hpsf>
|
||||
<hmef label="TNEF (HMEF) for winmail.dat" href="hmef/index.html"/>
|
||||
<oxml4j label="OpenXML4J (OOXML)" href="oxml4j/index.html"/>
|
||||
<log label="Logging framework" href="logging.html"/>
|
||||
<config label="Configuration" href="configuration.html"/>
|
||||
</components>
|
||||
<help label="Help" tab="help" href="help/">
|
||||
<mailinglists label="Mailing Lists" href="index.html"/>
|
||||
<faq label="FAQ" href="faq.html"/>
|
||||
<bugs label="Bug Database" href="https://bz.apache.org/bugzilla/buglist.cgi?product=POI"/>
|
||||
</help>
|
||||
<!-- can't use directory "community" because of forrest default rules in sitemap.xmap -> revisions -->
|
||||
<community label="Getting Involved" href="devel/" tab="community">
|
||||
<howtobuild label="How To Build" href="index.html"/>
|
||||
<nightly label="Nightly Builds" href="nightly.html"/>
|
||||
<subversion label="Subversion Repository" href="subversion.html"/>
|
||||
<guidelines label="Contribution Guidelines" href="guidelines.html"/>
|
||||
<whoweare label="Who We Are" href="who.html"/>
|
||||
<plan label="Planning Documents" href="plan/">
|
||||
<planning label="Overview" href="index.html"/>
|
||||
<vision10 label="1.0 Vision" href="vision10.html"/>
|
||||
<vision20 label="2.0 Vision" href="vision20.html"/>
|
||||
</plan>
|
||||
<references label="References" href="references/">
|
||||
<index label="Overview" href="index.html"/>
|
||||
<logocontest label="Logo Submissions" href="logocontest.html"/>
|
||||
<xlsspec label="XLS spec [PDF]" href="https://sc.openoffice.org/excelfileformat.pdf"/>
|
||||
<cocoon label="Apache Cocoon" href="https://xml.apache.org/cocoon/"/>
|
||||
</references>
|
||||
<resolutions label="Resolutions" href="resolutions/">
|
||||
<index label="Overview" href="index.html"/>
|
||||
<res001 label="Minimal Coding Standards" href="res001.html"/>
|
||||
</resolutions>
|
||||
<history label="History" href="history/">
|
||||
<historyfuture label="The early years" href="index.html"/>
|
||||
<changes3x label="Changelog 3.x" href="changes-3x.html"/>
|
||||
<changespre3x label="Changelog 0-2.x" href="changes-pre3x.html"/>
|
||||
</history>
|
||||
</community>
|
||||
|
||||
<external-refs>
|
||||
<forrest href="http://forrest.apache.org/">
|
||||
<aing href="docs/linking.html"/>
|
||||
<validation href="docs/validation.html"/>
|
||||
<webapp href="docs/your-project.html#webapp"/>
|
||||
<dtd-docs href="docs/dtd-docs.html"/>
|
||||
</forrest>
|
||||
<cocoon href="https://cocoon.apache.org/"/>
|
||||
<xml.apache.org href="https://xml.apache.org/"/>
|
||||
<junit href="junit/index.html"/>
|
||||
<jdepend href="jdepend/index.html"/>
|
||||
<download href="https://www.apache.org/dyn/closer.lua/poi/"/>
|
||||
<apidocs href="https://poi.apache.org/apidocs/">
|
||||
<v317 href="3.17/"/>
|
||||
<v40 href="4.0/"/>
|
||||
<v41 href="4.1/"/>
|
||||
<v50 href="5.0/"/>
|
||||
<dev href="dev/"/>
|
||||
</apidocs>
|
||||
</external-refs>
|
||||
</site>
|
||||
@ -0,0 +1,89 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
"""
|
||||
|
||||
"""This is a really crude throwaway script to get data out of Bugzilla and into status.xml
|
||||
It'd be far better to have Forrest look this information up whenever the site is rebuilt.
|
||||
Hopefully this is a one-time effort.
|
||||
If a closed bug's component is changed in Bugzilla, this script could be used to keep the changelog in sync.
|
||||
|
||||
requires Python 3.1+
|
||||
(Python 2.x doesn't do Unicode in CSVs nicely)
|
||||
"""
|
||||
|
||||
import csv, io
|
||||
import sys
|
||||
import requests
|
||||
|
||||
def get_fixesbug_attr(line):
    """Return the raw value of the fixes-bug attribute on an <action> line."""
    pieces = [x.strip() for x in line.split('"')]
    bugs = pieces[pieces.index('fixes-bug=') + 1]
    return bugs


def get_bugzilla_bug_to_component():
    """Download all POI bugs from Bugzilla as CSV and map bug id to component."""
    print("Fetching details of POI bugs, please wait...")
    bugzilla_bug_to_component = {}
    r = requests.get('https://bz.apache.org/bugzilla/buglist.cgi?bug_status=__all__&limit=0&no_redirect=1&product=POI&query_format=advanced&ctype=csv&human=1')
    # Fail early on an HTTP error instead of parsing an error page as CSV
    r.raise_for_status()
    with io.StringIO(r.text) as f:
        csvreader = csv.DictReader(f)
        for row in csvreader:
            bugzilla_bug_to_component[row['Bug ID']] = row['Component']
    return bugzilla_bug_to_component


def unique(seq):
    """Yield the items of seq in their original order, skipping repeats."""
    seen = set()
    for x in seq:
        if x not in seen:
            seen.add(x)
            yield x


def add_module_frombugzilla_attr(line):
    """Add module="XSSF" to <action ...> tag

    line is a string, containing the <action> opening tag
    """
    global bugzilla_bug_to_component
    assert 'module=' not in line, \
        "Invalid action line, should not already contain module: %s" % line

    bugs = [x.strip() for x in get_fixesbug_attr(line).split(',')]
    # Bugs unknown to Bugzilla (or without a component) are dropped
    modules = filter(bool, [bugzilla_bug_to_component.get(bug) for bug in bugs])
    module_frombugzilla = ','.join(unique(modules))
    # Only the first '>' is replaced, i.e. the end of the opening tag
    line_with_module_frombugzilla = line.replace('>', ' module="{}">'.format(module_frombugzilla), 1)
    return line_with_module_frombugzilla


def add_module_attribute(inputfile, outputfile):
    print("Generating %s from %s and Bugzilla details" % (outputfile, inputfile))
    with open(inputfile, 'r') as infile, open(outputfile, 'w') as outfile:
        for line in infile:
            if '<action ' in line and ' fixes-bug=' in line and ' module=' not in line:
                # append module="XXXX" at end of <action> tag
                outfile.write(add_module_frombugzilla_attr(line))
            else:
                outfile.write(line)


if __name__ == '__main__':
    if len(sys.argv) != 3:
        print('Usage: python changelog.py inputfile outputfile')
    else:
        bugzilla_bug_to_component = get_bugzilla_bug_to_component()
        add_module_attribute(sys.argv[1], sys.argv[2])
        print("Generation complete!")
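To see what the script does to a single changelog line without contacting Bugzilla, the sketch below re-implements its string handling and runs it on a made-up `<action>` line. The bug IDs, the contents of `SAMPLE_COMPONENTS` and the helper name `add_module` are assumptions for illustration only; they are not data from the POI tracker or names used by the script.

```python
def get_fixesbug_attr(line):
    """Return the raw value of the fixes-bug attribute from an <action> tag."""
    pieces = [x.strip() for x in line.split('"')]
    return pieces[pieces.index('fixes-bug=') + 1]


def unique(seq):
    """Yield items in first-seen order, skipping duplicates."""
    seen = set()
    for x in seq:
        if x not in seen:
            seen.add(x)
            yield x


def add_module(line, bug_to_component):
    """Hypothetical helper: same logic as the script, with the lookup table passed in."""
    bugs = [b.strip() for b in get_fixesbug_attr(line).split(',')]
    # Skip bugs that are missing from the table or have an empty component
    modules = [bug_to_component[b] for b in bugs if bug_to_component.get(b)]
    # Replace only the first '>', which closes the opening <action> tag
    return line.replace('>', ' module="{}">'.format(','.join(unique(modules))), 1)


# Made-up stand-in for the Bugzilla CSV download.
SAMPLE_COMPONENTS = {'10001': 'XSSF', '10002': 'HSSF', '10003': 'XSSF'}

sample = '<action dev="PD" type="fix" fixes-bug="10001, 10002,10003">Example entry</action>\n'
result = add_module(sample, SAMPLE_COMPONENTS)
print(result, end='')
```

Note that the attribute lookup only works when `fixes-bug` is not the first attribute of the tag, because the split piece would then still carry the `<action ` prefix; the sketch shares that limitation with the script.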
|
||||
35
src/documentation/content/xdocs/tabs.xml
Normal file
@ -0,0 +1,35 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE tabs PUBLIC "-//APACHE//DTD Cocoon Documentation Tab V1.1//EN" "http://forrest.apache.org/dtd/tab-cocoon-v11.dtd">
|
||||
|
||||
<tabs software="POI" title="POI" copyright="The Apache Software Foundation">
|
||||
|
||||
<!-- The rules are:
|
||||
@dir will always have /index.html added.
|
||||
@href is not modified unless it is root-relative and obviously specifies a
|
||||
directory (ends in '/'), in which case /index.html will be added
|
||||
-->
|
||||
|
||||
<tab id="home" label="Home" dir=""/>
|
||||
<tab id="help" label="Help" dir="help/"/>
|
||||
<tab id="components" label="Component APIs" dir="components/"/>
|
||||
<tab id="community" label="Getting Involved" dir="devel/"/>
|
||||
|
||||
</tabs>
|
||||
186
src/documentation/content/xdocs/text-extraction.xml
Normal file
@ -0,0 +1,186 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
====================================================================
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
====================================================================
|
||||
-->
|
||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
|
||||
|
||||
<document>
|
||||
<header>
|
||||
<title>Apache POI™ - Text Extraction</title>
|
||||
<authors>
|
||||
<person id="NB" name="Nick Burch" email="nick@apache.org"/>
|
||||
</authors>
|
||||
</header>
|
||||
|
||||
<body>
|
||||
<section><title>Overview</title>
|
||||
<p>For a number of years now, Apache POI has provided basic
text extraction for all the project supported file formats. In
addition to the (plain) text, these extractors provide access to
the metadata associated with a given file, such as title and
author.</p>
|
||||
<p>For more advanced text extraction needs, including Rich Text
|
||||
extraction (such as formatting and styling), along with XML and
|
||||
HTML output, Apache POI works closely with
|
||||
<a href="https://tika.apache.org/">Apache Tika</a> to deliver
|
||||
POI-powered Tika Parsers for all the project supported file formats.</p>
|
||||
<p>If you are after turn-key text extraction, including the latest
|
||||
support, styles etc, you are strongly advised to make use of
|
||||
<a href="https://tika.apache.org/">Apache Tika</a>, which builds
|
||||
on top of POI to provide Text and Metadata extraction. If you wish
|
||||
to have something very simple and stand-alone, or you wish to make
|
||||
heavy modifications, then the POI provided text extractors documented
|
||||
below might be a better fit for your needs.</p>
|
||||
</section>
|
||||
|
||||
<section><title>Common functionality</title>
|
||||
<p>All of the POI text extractors extend from
|
||||
<em>org.apache.poi.extractor.POITextExtractor</em>. This provides a common
|
||||
method across all extractors, getText(). For many cases, the text
|
||||
returned will be all you need. However, many extractors do provide
|
||||
more targeted text extraction methods, so you may wish to use
|
||||
these in some cases.</p>
|
||||
<p>All POIFS / OLE 2 based text extractors also extend from
|
||||
<em>org.apache.poi.extractor.POIOLE2TextExtractor</em>. This additionally
|
||||
provides common methods to get at the <a href="hpsf/">HPSF
|
||||
document metadata</a>.</p>
|
||||
<p>All OOXML based text extractors also extend from
|
||||
<em>org.apache.poi.ooxml.extractor.POIXMLTextExtractor</em>. This additionally
|
||||
provides common methods to get at the OOXML metadata.</p>
|
||||
</section>
|
||||
|
||||
<section><title>Text Extractor Factory</title>
|
||||
<p>POI provides a common class to select the appropriate text extractor
|
||||
for you, based on the supplied document's contents.
|
||||
<em>ExtractorFactory</em> provides a
|
||||
similar function to WorkbookFactory. You simply pass it an
|
||||
InputStream, a File, a POIFSFileSystem or an OOXML Package. It
|
||||
figures out the correct text extractor for you, and returns it.</p>
|
||||
<p>For complete detection and text extractor auto-selection, users
|
||||
are strongly encouraged to investigate
|
||||
<a href="https://tika.apache.org/">Apache Tika</a>.</p>
|
||||
</section>
|
||||
|
||||
<section><title>Excel</title>
|
||||
<p>For .xls files, there is
|
||||
<em>org.apache.poi.hssf.extractor.ExcelExtractor</em>, which will
|
||||
return text, optionally with formulas instead of their contents.
|
||||
Similarly, for .xlsx files there is
|
||||
<em>org.apache.poi.xssf.extractor.XSSFExcelExtractor</em>, which
|
||||
provides the same functionality.</p>
|
||||
<p>For those working in constrained memory footprints, there are
|
||||
two more Excel text extractors available. For .xls files, it's
|
||||
<em>org.apache.poi.hssf.extractor.EventBasedExcelExtractor</em>,
|
||||
based on the streaming EventUserModel code, and will generally
|
||||
deliver a lower memory footprint for extraction. However, it will
|
||||
have problems correctly outputting more complex formulas, as it
|
||||
works with records as they pass, and so doesn't have access to all
|
||||
parts of complex and shared formulas. For .xlsx files the equivalent is
|
||||
<em>org.apache.poi.xssf.extractor.XSSFEventBasedExcelExtractor</em>,
|
||||
which is based on the XSSF SAX Event codebase.</p>
|
||||
</section>
|
||||
|
||||
<section><title>Word</title>
|
||||
<p>For .doc files from Word 97 - Word 2003, in scratchpad there is
|
||||
<em>org.apache.poi.hwpf.extractor.WordExtractor</em>, which will
|
||||
return text for your document.</p>
|
||||
<p>You can also extract simple textual content from
|
||||
older Word 6 and Word 95 files, using the scratchpad class
|
||||
<em>org.apache.poi.hwpf.extractor.Word6Extractor</em>.</p>
|
||||
<p>For .docx files, the relevant class is
|
||||
<em>org.apache.poi.xwpf.extractor.XWPFWordExtractor</em>.</p>
|
||||
</section>
|
||||
|
||||
<section><title>PowerPoint</title>
|
||||
<p>For .ppt and .pptx files, there is a common extractor,
<em>org.apache.poi.sl.extractor.SlideShowExtractor</em>, which
will return text for your slideshow, optionally restricted to just
slides text or notes text. For .ppt you need to add the poi-scratchpad.jar,
and for .pptx the poi-ooxml.jar and its dependencies are needed.</p>
|
||||
</section>
|
||||
|
||||
<section><title>Publisher</title>
|
||||
<p>For .pub files, in scratchpad there is
|
||||
<em>org.apache.poi.hpbf.extractor.PublisherExtractor</em>, which
|
||||
will return text for your file.</p>
|
||||
</section>
|
||||
|
||||
<section><title>Visio</title>
|
||||
<p>For .vsd files, in scratchpad there is
|
||||
<em>org.apache.poi.hdgf.extractor.VisioTextExtractor</em>, which
|
||||
will return text for your file.</p>
|
||||
</section>
|
||||
|
||||
<section><title>Embedded Objects</title>
|
||||
<p>Extractors already exist for Excel, Word, PowerPoint and Visio;
|
||||
if one of these objects is embedded into a worksheet, the ExtractorFactory class can be used to recover an extractor for it.
|
||||
</p>
|
||||
<source>
FileInputStream fis = new FileInputStream(inputFile);
POIFSFileSystem fileSystem = new POIFSFileSystem(fis);
// Firstly, get an extractor for the Workbook
POIOLE2TextExtractor oleTextExtractor =
    ExtractorFactory.createExtractor(fileSystem);
// Then a List of extractors for any embedded Excel, Word, PowerPoint
// or Visio objects embedded into it.
POITextExtractor[] embeddedExtractors =
    ExtractorFactory.getEmbededDocsTextExtractors(oleTextExtractor);
for (POITextExtractor textExtractor : embeddedExtractors) {
    // If the embedded object was an Excel spreadsheet.
    if (textExtractor instanceof ExcelExtractor) {
        ExcelExtractor excelExtractor = (ExcelExtractor) textExtractor;
        System.out.println(excelExtractor.getText());
    }
    // A Word Document
    else if (textExtractor instanceof WordExtractor) {
        WordExtractor wordExtractor = (WordExtractor) textExtractor;
        String[] paragraphText = wordExtractor.getParagraphText();
        for (String paragraph : paragraphText) {
            System.out.println(paragraph);
        }
        // Display the document's header and footer text
        System.out.println("Footer text: " + wordExtractor.getFooterText());
        System.out.println("Header text: " + wordExtractor.getHeaderText());
    }
    // PowerPoint Presentation.
    else if (textExtractor instanceof PowerPointExtractor) {
        PowerPointExtractor powerPointExtractor =
            (PowerPointExtractor) textExtractor;
        System.out.println("Text: " + powerPointExtractor.getText());
        System.out.println("Notes: " + powerPointExtractor.getNotes());
    }
    // Visio Drawing
    else if (textExtractor instanceof VisioTextExtractor) {
        VisioTextExtractor visioTextExtractor =
            (VisioTextExtractor) textExtractor;
        System.out.println("Text: " + visioTextExtractor.getText());
    }
}
</source>
|
||||
</section>
|
||||
</body>
|
||||
|
||||
<footer>
|
||||
<legal>
|
||||
Copyright (c) @year@ The Apache Software Foundation. All rights reserved.
|
||||
<br />
|
||||
Apache POI, POI, Apache, the Apache feather logo, and the Apache
|
||||
POI project logo are trademarks of The Apache Software Foundation.
|
||||
</legal>
|
||||
</footer>
|
||||
</document>
|
||||
61
src/documentation/publish-poi-site.txt
Normal file
@ -0,0 +1,61 @@
|
||||
# Licensed to the Apache Software Foundation (ASF) under one
|
||||
# or more contributor license agreements. See the NOTICE file
|
||||
# distributed with this work for additional information
|
||||
# regarding copyright ownership. The ASF licenses this file
|
||||
# to you under the Apache License, Version 2.0 (the
|
||||
# "License"); you may not use this file except in compliance
|
||||
# with the License. You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing,
|
||||
# software distributed under the License is distributed on an
|
||||
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
||||
# KIND, either express or implied. See the License for the
|
||||
# specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
|
||||
==============================
|
||||
Publishing POI Web Site
|
||||
==============================
|
||||
|
||||
The Apache POI web site is https://poi.apache.org/
|
||||
|
||||
The HTML and other files for the web site are stored in svn at https://svn.apache.org/repos/asf/poi/site
|
||||
Committing files to the `publish` directory of this repo will automatically lead to the web site being updated.
|
||||
There may be a small delay and you might need to force a refresh in your browser.
|
||||
|
||||
The site is built from the main POI svn at https://svn.apache.org/repos/asf/poi/trunk
|
||||
|
||||
Prerequisites
|
||||
-------------
|
||||
|
||||
You will need an up to date version of Apache Ant installed (Ant 1.10 works well).
|
||||
You also need to install Apache Forrest. Forrest is no longer maintained but PJ has a fork with a few small changes.
|
||||
This is at https://github.com/pjfanning/apache-forrest-0.9
|
||||
|
||||
You can use the last official Apache Forrest release but you may notice some diffs when you build the site and try to
|
||||
publish it. https://forrest.apache.org/
|
||||
|
||||
You will need to create an environment variable called FORREST_HOME and set it to match the directory location
|
||||
where you installed Apache Forrest.
|
||||
|
||||
Building and Deploying the Site
|
||||
-------------------------------
|
||||
|
||||
It is recommended that you open a command prompt and set up Java 8 as your default. The web site build will fail
|
||||
if you use a very recent Java version.
|
||||
|
||||
In your local copy of the POI svn (https://svn.apache.org/repos/asf/poi/trunk), run:
|
||||
|
||||
ant site
|
||||
|
||||
After this completes, you can copy the files in `build/site` to the `publish` directory in your poi-site checkout
|
||||
(https://svn.apache.org/repos/asf/poi/site).
|
||||
|
||||
A command like this might work.
|
||||
|
||||
cp -r ~/svn/poi/build/site/* ~/svn/poi-site/publish/
|
||||
|
||||
I would recommend that you use `svn stat` and `svn diff` before committing the changes to poi-site.
|
||||
367
src/documentation/release-guide.txt
Normal file
@ -0,0 +1,367 @@
|
||||
# Licensed to the Apache Software Foundation (ASF) under one
|
||||
# or more contributor license agreements. See the NOTICE file
|
||||
# distributed with this work for additional information
|
||||
# regarding copyright ownership. The ASF licenses this file
|
||||
# to you under the Apache License, Version 2.0 (the
|
||||
# "License"); you may not use this file except in compliance
|
||||
# with the License. You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing,
|
||||
# software distributed under the License is distributed on an
|
||||
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
||||
# KIND, either express or implied. See the License for the
|
||||
# specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
|
||||
==============================
|
||||
POI Release Guide
|
||||
==============================
|
||||
|
||||
|
||||
(I) Prerequisites
|
||||
|
||||
1. You should read the <a href="https://www.apache.org/dev/release.html">Apache Release FAQ</a>
|
||||
2a. You must have shell access to people.apache.org; and you should
|
||||
have key-based authentication set up
|
||||
1. Generate ssh key with ssh-keygen -t rsa -b 4096
|
||||
(e.g. <a href="http://www.linuxproblem.org/art_9.html">how to</a>.)
|
||||
2. Add contents of id_rsa.pub to SSH Key (authorized_keys) line on https://id.apache.org/
|
||||
3. ssh -v username@people.apache.org
|
||||
Verify authenticity of host: https://www.apache.org/dev/machines
|
||||
4. Only sftp access is necessary
|
||||
2b. You must be a member of the committee group
|
||||
3. Release manager must have their public key appended to the KEYS file checked in to SVN and the key published on one of the public key servers.
|
||||
More info can be found here: <a href="https://www.apache.org/dev/release-signing.html">https://www.apache.org/dev/release-signing.html</a>
|
||||
4. You must have Java JDK 1.8 installed and active.
|
||||
5. You must have the following utilities installed on your local machine and available in your path:
|
||||
* <a href="https://www.openssh.com">ssh</a>
* <a href="https://www.gnupg.org">gnupg</a>
* <a href="https://www.openssl.org">openssl</a>
|
||||
For Windows users, install Cygwin and make sure you have the above utilities
|
||||
6a. The POI build system requires two components to perform a build
|
||||
* <a href="https://ant.apache.org">Ant</a> 1.9.x or higher
|
||||
* <a href="https://forrest.apache.org/">Forrest</a> 0.9.
|
||||
Make sure ANT_HOME and FORREST_HOME are set.
|
||||
|
||||
6b. Ensure you can log in to https://repository.apache.org/ with your Apache
|
||||
credentials, and that you can see the "Staging Repositories" area on
|
||||
the left hand side.
|
||||
|
||||
6c. It's a good idea to check at https://ci-builds.apache.org/job/POI/
|
||||
that Jenkins is in a good state (i.e. most recent build passed
|
||||
and is up to date with SVN). You probably also want to e-mail
|
||||
the dev list with a note to say you're building a release.
|
||||
|
||||
7. Before building, you should run the "rat-check" build task, which
|
||||
uses <a href="https://creadur.apache.org/rat/">Apache Rat</a>
|
||||
to check the source tree for files lacking license headers. Files
|
||||
without headers should be either fixed, or added to the exclude list
|
||||
|
||||
8. Check file permissions are correct in SVN.
|
||||
There can be files in the SVN tree marked executable (have the
|
||||
svn:executable property set), but which should not be. Checking them
|
||||
out will cause the executable bit to be set for them on filesystems
|
||||
which support it. The flag can be removed in batch using
|
||||
|
||||
{code:sh}
|
||||
svn pd 'svn:executable' $(find -name .svn -prune -or -type f ! -name \*.sh \
|
||||
-print0 | xargs -0 svn pg 'svn:executable' | cut -d ' ' -f 1)
|
||||
{code}
|
||||
|
||||
9. Before building, ensure that the year in the NOTICE file is correct,
|
||||
and review any new or updated dependencies to ensure that if they
|
||||
required LICENSE or NOTICE updates then these were done.
|
||||
|
||||
10. Ensure that the changelog is up to date
|
||||
|
||||
11. Ensure that the KEYS files in the dist areas are up-to-date with the
|
||||
latest ones in svn:
|
||||
https://dist.apache.org/repos/dist/dev/poi/KEYS
|
||||
https://dist.apache.org/repos/dist/release/poi/KEYS
|
||||
Dist is a regular svn repo that can be checked out and committed to.
|
||||
To upload to dist: https://www.apache.org/dev/release-distribution
|
||||
|
||||
You can use these commands to do a sparse checkout of dist.apache.org.
|
||||
There are so many files from all the Apache projects that it is not recommended to do a full checkout.
|
||||
|
||||
{code:sh}
|
||||
svn checkout https://dist.apache.org/repos/dist/ --depth immediates
|
||||
svn update --set-depth infinity dist/dev/poi/
|
||||
svn update --set-depth infinity dist/release/poi/
|
||||
{code}
|
||||
|
||||
|
||||
(II) Making release artifacts
|
||||
Run these commands from a clean checkout of https://svn.apache.org/repos/asf/poi/trunk
|
||||
|
||||
1. Update the version number in these files and commit the changes to svn.
|
||||
- build.xml (version.id)
|
||||
- build.gradle
|
||||
- osgi/pom.xml (version and poi.version)
|
||||
|
||||
2. Force a new build at https://ci-builds.apache.org/job/POI/job/POI-DSL-1.8
|
||||
- when build completes, download the built jars from
|
||||
https://ci-builds.apache.org/job/POI/job/POI-DSL-1.8/lastSuccessfulBuild/artifact/
|
||||
|
||||
3. To produce the source distributions, run
|
||||
- ./gradlew srcDistZip
|
||||
- ./gradlew srcDistTar
|
||||
|
||||
4. Copy the build/dist files to your svn checkout of dist.apache.org (dist/dev/poi/src)
|
||||
{code:sh}
|
||||
svn co https://dist.apache.org/repos/dist/release/poi /opt/poi-dist
|
||||
cp build/dist/*.zip build/*.tgz /opt/poi-dist/dev/
|
||||
{code}
|
||||
|
||||
5. Generate SHA512 checksums
|
||||
|
||||
{code:sh}
|
||||
for f in *.zip *.tgz
|
||||
do
|
||||
sha512sum "$f" > "$f.sha512"
|
||||
done
|
||||
{code}
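
On a machine without the sha512sum utility, the same checksum files can be
produced with a few lines of Python. This is only a sketch under assumptions:
the function name write_sha512 is made up for illustration, and the output
layout (hex digest, two spaces, file name) mirrors what sha512sum writes so
that the validation step can still read it.

```python
import hashlib
from pathlib import Path


def write_sha512(path):
    """Write <path>.sha512 as 'HEXDIGEST  FILENAME' and return the new file's path."""
    path = Path(path)
    digest = hashlib.sha512()
    with path.open('rb') as f:
        # Read in 1 MiB chunks so large archives are not loaded into memory at once
        for chunk in iter(lambda: f.read(1 << 20), b''):
            digest.update(chunk)
    out = path.with_name(path.name + '.sha512')
    out.write_text('{}  {}\n'.format(digest.hexdigest(), path.name))
    return out
```

Calling write_sha512 once for each .zip and .tgz file in the directory gives
the same set of .sha512 files as the shell loop.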
|
||||
|
||||
6. Generate signatures
|
||||
- The 1556F3A4 key in the command below is just an example, replace the value with your own key id
|
||||
|
||||
{code:sh}
|
||||
for f in *.zip *.tgz; do gpg --default-key 1556F3A4 -ab "$f"; done
|
||||
{code}
|
||||
|
||||
7. Validate the checksums and signatures
|
||||
|
||||
{code:sh}
|
||||
find . -name "*.sha512" -type f -execdir sha512sum -c {} \;
|
||||
find . -name "*.asc" -exec gpg --no-secmem-warning --verify {} \;
|
||||
{code}
|
||||
|
||||
8. Deploy the source distribution files to the dev area of dist.apache.org
|
||||
- Remove any old release candidates (only need to keep the latest one)
|
||||
- svn commit the changes
|
||||
|
||||
(III) Deploy Jars to Maven Staging
|
||||
|
||||
1. Ensure that there has been a build with the right svn commit and then download the jars
|
||||
- https://ci-builds.apache.org/job/POI/job/POI-DSL-1.8
|
||||
- https://ci-builds.apache.org/job/POI/job/POI-DSL-1.8/lastSuccessfulBuild/artifact/
|
||||
|
||||
2. Set up a `poi-prep` directory and copy all the jars other than the `test` jars into it.
|
||||
- The artifacts in the archive.zip that you can download (see 1 above) are grouped in different dirs
|
||||
- you can unzip the archive.zip and use the sample script from the `poi-prep` dir
|
||||
This is an example script:
|
||||
{code:sh}
|
||||
export DOWNLOADED_JARS_DIR=/path/to/unzipped/archive
|
||||
mv $DOWNLOADED_JARS_DIR/build/dist/maven/poi/*.jar .
|
||||
mv $DOWNLOADED_JARS_DIR/build/dist/maven/poi-excelant/*.jar .
|
||||
mv $DOWNLOADED_JARS_DIR/build/dist/maven/poi-examples/*.jar .
|
||||
mv $DOWNLOADED_JARS_DIR/build/dist/maven/poi-ooxml/*.jar .
|
||||
mv $DOWNLOADED_JARS_DIR/build/dist/maven/poi-ooxml-full/*.jar .
|
||||
mv $DOWNLOADED_JARS_DIR/build/dist/maven/poi-ooxml-lite/*.jar .
|
||||
mv $DOWNLOADED_JARS_DIR/build/dist/maven/poi-scratchpad/*.jar .
|
||||
mv $DOWNLOADED_JARS_DIR/build/dist/maven/poi-javadoc/*.jar .
|
||||
mv $DOWNLOADED_JARS_DIR/build/dist/maven/poi-examples-javadoc/*.jar .
|
||||
mv $DOWNLOADED_JARS_DIR/build/dist/maven/poi-excelant-javadoc/*.jar .
|
||||
mv $DOWNLOADED_JARS_DIR/build/dist/maven/poi-ooxml-javadoc/*.jar .
|
||||
mv $DOWNLOADED_JARS_DIR/build/dist/maven/poi-scratchpad-javadoc/*.jar .
|
||||
{code}
|
||||
|
||||
3. We need to create pom files. Copy the ones from the last release into the directory with the jars.
|
||||
- ensure the pom file names match the jar names (same version)
|
||||
|
||||
This is an example script:
|
||||
{code:sh}
|
||||
export POI_RELEASE=5.3.0
|
||||
export POI_LAST_RELEASE=5.2.5
|
||||
curl https://repo1.maven.org/maven2/org/apache/poi/poi/$POI_LAST_RELEASE/poi-$POI_LAST_RELEASE.pom --output poi-$POI_RELEASE.pom
|
||||
curl https://repo1.maven.org/maven2/org/apache/poi/poi-scratchpad/$POI_LAST_RELEASE/poi-scratchpad-$POI_LAST_RELEASE.pom --output poi-scratchpad-$POI_RELEASE.pom
|
||||
curl https://repo1.maven.org/maven2/org/apache/poi/poi-ooxml/$POI_LAST_RELEASE/poi-ooxml-$POI_LAST_RELEASE.pom --output poi-ooxml-$POI_RELEASE.pom
|
||||
curl https://repo1.maven.org/maven2/org/apache/poi/poi-ooxml-lite/$POI_LAST_RELEASE/poi-ooxml-lite-$POI_LAST_RELEASE.pom --output poi-ooxml-lite-$POI_RELEASE.pom
|
||||
curl https://repo1.maven.org/maven2/org/apache/poi/poi-ooxml-full/$POI_LAST_RELEASE/poi-ooxml-full-$POI_LAST_RELEASE.pom --output poi-ooxml-full-$POI_RELEASE.pom
|
||||
curl https://repo1.maven.org/maven2/org/apache/poi/poi-excelant/$POI_LAST_RELEASE/poi-excelant-$POI_LAST_RELEASE.pom --output poi-excelant-$POI_RELEASE.pom
|
||||
curl https://repo1.maven.org/maven2/org/apache/poi/poi-examples/$POI_LAST_RELEASE/poi-examples-$POI_LAST_RELEASE.pom --output poi-examples-$POI_RELEASE.pom
|
||||
{code}
|
||||
|
||||
4. Fix up the version values in the poms
|
||||
- I would recommend using an IDE and loading up the full `poi-prep` directory
|
||||
- Replace all instances of the old version number with the new one
|
||||
- if using 'Replace All', approve each change just in case the old version number might also match the version of a non-POI dependency
|
||||
- update the dependency versions
|
||||
- check if we need to remove or add dependencies
|
||||
- great care must be taken at this stage because this step is very error prone
|
||||
- feel free to generate the poms using the Gradle build instead but the Gradle build will get some aspects wrong
|
||||
- the poi-ooxml-lite build is one aspect that messes up the generated poms
|
||||
- you can hand modify the poms to fix issues (compare against the poms from the last release)
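
Before replacing anything, it can help to list every line that mentions the
old version so each hit can be reviewed by hand. The sketch below does that
for one pom's text. The function name find_version_lines and the sample pom
fragment (including the dependency called some-other-lib) are made up for
illustration and are not part of the POI build.

```python
def find_version_lines(pom_text, old_version):
    """Return (line_number, stripped_line) for every line mentioning old_version."""
    return [(n, line.strip())
            for n, line in enumerate(pom_text.splitlines(), start=1)
            if old_version in line]


# Made-up pom fragment: the second hit belongs to a non-POI dependency that
# happens to share the version string, so it must be checked rather than
# bumped blindly.
SAMPLE_POM = """<project>
  <artifactId>poi</artifactId>
  <version>5.2.5</version>
  <dependency>
    <artifactId>some-other-lib</artifactId>
    <version>5.2.5</version>
  </dependency>
</project>
"""

hits = find_version_lines(SAMPLE_POM, '5.2.5')
for number, text in hits:
    print('{}: {}'.format(number, text))
```

The listing only reports candidates; deciding which of them are POI versions
is still a manual step.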
|
||||
|
||||
5. Generate signatures (no need for SHA checksums because they are automatically created later)
|
||||
- The 1556F3A4 key in the command below is just an example, replace the value with your own key id
|
||||
|
||||
{code:sh}
|
||||
for f in *.jar *.pom; do gpg --default-key 1556F3A4 -ab "$f"; done
|
||||
{code}
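Before building the bundles, it may be worth confirming that the loop produced a signature for every file. The sketch below is an assumption about how one might check this; `missing_signatures` is a hypothetical helper name, and it only tests that the `.asc` files exist, not that they are valid.

```sh
# Hypothetical helper: prints every *.jar or *.pom file in a directory that
# has no detached signature (<file>.asc) next to it.
# $1 = directory to check (defaults to the current directory)
missing_signatures() {
  dir=${1:-.}
  for f in "$dir"/*.jar "$dir"/*.pom; do
    [ -e "$f" ] || continue            # skip patterns that matched nothing
    [ -e "$f.asc" ] || printf '%s\n' "$f"
  done
}
```

An empty result means every jar and pom has a signature file. To check a signature itself, `gpg --verify somefile.jar.asc somefile.jar` can be run for individual files.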
|
||||
|
||||
6. Create bundle jars
|
||||
|
||||
{code:sh}
|
||||
jar -cvf poi-bundle.jar poi-5*.pom poi-5*.pom.asc poi-5*.jar poi-5*.jar.asc
|
||||
jar -cvf poi-ooxml-bundle.jar poi-ooxml-5*.pom poi-ooxml-5*.pom.asc poi-ooxml-5*.jar poi-ooxml-5*.jar.asc
|
||||
jar -cvf poi-ooxml-full-bundle.jar poi-ooxml-full*.pom poi-ooxml-full*.pom.asc poi-ooxml-full*.jar poi-ooxml-full*.jar.asc
|
||||
jar -cvf poi-ooxml-lite-bundle.jar poi-ooxml-lite*.pom poi-ooxml-lite*.pom.asc poi-ooxml-lite*.jar poi-ooxml-lite*.jar.asc
|
||||
jar -cvf poi-scratchpad-bundle.jar poi-scratchpad*.pom poi-scratchpad*.pom.asc poi-scratchpad*.jar poi-scratchpad*.jar.asc
|
||||
jar -cvf poi-excelant-bundle.jar poi-excelant*.pom poi-excelant*.pom.asc poi-excelant*.jar poi-excelant*.jar.asc
|
||||
jar -cvf poi-examples-bundle.jar poi-examples*.pom poi-examples*.pom.asc poi-examples*.jar poi-examples*.jar.asc
|
||||
{code}
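The commands above should produce seven bundle jars, one per module. A small check such as the following can confirm that none was skipped before deploying; it is only a sketch, and `missing_bundles` is a hypothetical helper name.

```sh
# Hypothetical helper: prints the name of every expected bundle jar that is
# not present in a directory.
# $1 = directory to check (defaults to the current directory)
missing_bundles() {
  dir=${1:-.}
  for module in poi poi-ooxml poi-ooxml-full poi-ooxml-lite poi-scratchpad poi-excelant poi-examples; do
    [ -e "$dir/$module-bundle.jar" ] || printf '%s\n' "$module-bundle.jar"
  done
}
```

An empty result means all seven bundles exist. The contents of a single bundle can be listed with `jar -tf poi-bundle.jar`; each should contain a pom, a jar and their two `.asc` signatures.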
|
||||
|
||||
7. Deploy bundle jars to repository.apache.org
|
||||
- Login with your Apache username and password
|
||||
- If you have never deployed a bundle artifact then read
|
||||
https://help.sonatype.com/repomanager2/staging-releases/artifact-bundles
|
||||
- deploy each of the 7 bundle jars - watch out for exceptions when loading them
|
||||
|
||||
(IV) Calling the vote:
|
||||
|
||||
1. The release manager should call the vote
|
||||
2. Include the URL of the release artifacts
|
||||
3. Include the time for the vote to run (3 day minimum, can be longer)
|
||||
4. Provide guidance on what needs to be checked
|
||||
5. Complete a tally, and send a result once the time has passed
|
||||
|
||||
(V) After the vote:
|
||||
|
||||
Deploy the artifacts from the staging area (https://dist.apache.org/repos/dist/dev/poi/)
|
||||
to the release area of the dist repo:
|
||||
https://dist.apache.org/repos/dist/release/poi/release/
|
||||
|
||||
And remove any old releases from the staging area if they exist (should have been deleted by Step 12)
|
||||
Staging area: https://dist.apache.org/repos/dist/dev/poi/
|
||||
|
||||
{code:sh}
|
||||
svn rm https://dist.apache.org/repos/dist/dev/poi/FIXME3.16-RC1 -m "remove old release from staging area"
|
||||
{code}
|
||||
|
||||
You should get an email from the Apache Reporter Service (no-reply@reporter.apache.org)
|
||||
at your Apache email address.
|
||||
The email instructions will ask you to log on to https://reporter.apache.org/addrelease.html?poi
|
||||
and add your release data (version and date) to the database.
|
||||
|
||||
Log into https://repository.apache.org/ and go to the "Staging Repositories" area.
|
||||
Find the "orgapachepoi" entry, check it has the right content, then Close the repository
|
||||
(it was probably already closed by release-prep3).
|
||||
Select all artifacts and Release (and Automatically Drop) them.
|
||||
Refresh to verify that the artifacts are no longer in the Staging Repositories area.
|
||||
|
||||
2. Wait for the distributions to appear on your favourite mirror (anywhere from 3-24 hours)
|
||||
https://www.apache.org/dyn/closer.lua/poi/dev/
|
||||
|
||||
3. Wait for the maven artifacts to appear on Maven Central, and ensure they work:
|
||||
Maven Central: https://search.maven.org/#search|ga|1|g%3A%22org.apache.poi%22
|
||||
|
||||
Create a simple project and make sure the release artifacts are accessible
|
||||
by maven:
|
||||
|
||||
{code:sh}
|
||||
mvn archetype:generate -DgroupId=org.apache.poi.scratchpad -DartifactId=maven-test -DinteractiveMode=false
|
||||
cd maven-test
|
||||
{code}
|
||||
|
||||
edit pom.xml and add the release artifacts to the project dependencies:
|
||||
|
||||
{code:xml}
|
||||
<dependency>
|
||||
<groupId>org.apache.poi</groupId>
|
||||
<artifactId>poi-ooxml</artifactId>
|
||||
<version>4.0.0</version>
|
||||
</dependency>
|
||||
<dependency>
|
||||
<groupId>org.apache.poi</groupId>
|
||||
<artifactId>poi-scratchpad</artifactId>
|
||||
<version>4.0.0</version>
|
||||
</dependency>
|
||||
{code}
|
||||
|
||||
edit src/main/java/Test.java and add this:
|
||||
|
||||
{code:java}
|
||||
import org.apache.poi.ss.usermodel.Workbook;
|
||||
import org.apache.poi.ss.usermodel.WorkbookFactory;
|
||||
|
||||
public class Test {}
|
||||
{code}
|
||||
|
||||
{code:sh}
|
||||
mvn compile
|
||||
{code}
|
||||
|
||||
You should see [INFO] BUILD SUCCESS at the end, which tells you that
|
||||
the jars could be downloaded fine.
|
||||
|
||||
4. Edit the website homepage and list the new release there.
|
||||
* poi/site/src/documentation/content/xdocs/index.xml
|
||||
* poi/site/src/documentation/content/xdocs/changes.xml
|
||||
remove older releases.
|
||||
|
||||
5. Edit the website download page, and list the new release there. This should
|
||||
reference the checksums, so take care when updating
|
||||
* poi/site/src/documentation/content/xdocs/download.xml
|
||||
{code:sh}
|
||||
# the following generates a download-snipplet.xml to be copied and pasted into download.xml
|
||||
ant update-download -Dversion.id="3.15" -Dreltype=dev -Drel_date="02 July 2016" -Dfile_date="20160702"
|
||||
{code}
|
||||
And copy the contents from the output, download-snipplet.xml, to the appropriate section
|
||||
in poi/site/src/documentation/content/xdocs/download.xml.
|
||||
|
||||
Additionally there are some further files to be updated ... check the results and commit them:
|
||||
{code:sh}
|
||||
# the following updates various references from the previous release to the current release
|
||||
ant release-finish -Dfile_date="20160702"
|
||||
{code}
|
||||
|
||||
|
||||
6. Build site using a recent version of Java 1.8
|
||||
Commit the site changes to svn, and publish live
|
||||
|
||||
7. Copy the built javadocs to a stable location under /apidocs/{ver}/
|
||||
- For a major release, create a new subfolder to hold the javadocs for
|
||||
this release family, eg 4.1 for 4.1.x
|
||||
- For a minor release, replace the existing subfolder for the release
|
||||
family, eg 4.1.2 uses 4.1 replacing the previous 4.1.1
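The release-family folder name is the version with its last component removed. As a minimal sketch, assuming a three-part version number such as 4.1.2, it can be derived like this (`release_family` is a hypothetical helper name):

```sh
# Hypothetical helper: prints the release family for a version by stripping
# the last dot-separated component, e.g. 4.1.2 becomes 4.1.
# $1 = full version number, assumed to have the form major.minor.patch
release_family() {
  printf '%s\n' "${1%.*}"
}
```

With this, the javadocs for 4.1.2 would go under `/apidocs/$(release_family 4.1.2)/`, the same folder that previously held 4.1.1.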
|
||||
|
||||
8. Don't forget to upload the latest version of the site and javadocs
|
||||
|
||||
9. Send announcements:
|
||||
From: your @apache.org e-mail address
|
||||
To: user@poi.apache.org, dev@poi.apache.org, general@poi.apache.org, and announce@apache.org
|
||||
Subject: [ANNOUNCE] Apache POI FIXME3.16 released
|
||||
Body:
|
||||
"""
|
||||
The Apache POI PMC is pleased to announce the release of Apache POI FIXME3.16.
|
||||
|
||||
Apache POI is a Java library for reading and writing Microsoft Office files.
|
||||
|
||||
For detailed changes in this release, refer to the release notes [1] and the changelog [2].
|
||||
|
||||
Thank you to all our contributors for making this release possible.
|
||||
|
||||
On behalf of the Apache POI PMC,
|
||||
Your Name
|
||||
|
||||
[1] Release notes: https://www.apache.org/dyn/closer.lua/poi/dev/RELEASE-NOTES-FIXME3.16.txt
|
||||
[2] Changelog: https://poi.apache.org/changes.html#FIXME3.16
|
||||
"""
|
||||
|
||||
Note, announcements should be sent from your @apache.org e-mail address.
|
||||
|
||||
10. In Bugzilla, add a new version and the next "...-dev" version. Also close the n-2 -dev version to new bugs.
|
||||
|
||||
11. Add the version to the DOAP file too
|
||||
https://svn.apache.org/repos/asf/poi/trunk/doap_POI.rdf
|
||||
|
||||
12. Delete directory that held RC.
|
||||
|
||||
e.g.
|
||||
{code:sh}
|
||||
svn delete -m "delete empty RC directory for 4.0.0" https://dist.apache.org/repos/dist/dev/poi/4.0.0-RC1
|
||||
{code}
|
||||