2026-02-16 20:14:18 +01:00

398 lines
15 KiB
HTML

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta content="Apache Forrest" name="Generator">
<meta name="Forrest-version" content="0.9">
<meta name="Forrest-skin-name" content="pelt">
<title>Apache POI&trade; - POIFS - Documents embedded in other documents</title>
<link type="text/css" href="../../skin/basic.css" rel="stylesheet">
<link media="screen" type="text/css" href="../../skin/screen.css" rel="stylesheet">
<link media="print" type="text/css" href="../../skin/print.css" rel="stylesheet">
<link type="text/css" href="../../skin/profile.css" rel="stylesheet">
<script src="../../skin/getBlank.js" language="javascript" type="text/javascript"></script><script src="../../skin/getMenu.js" language="javascript" type="text/javascript"></script><script src="../../skin/fontsize.js" language="javascript" type="text/javascript"></script>
<link rel="shortcut icon" href="../../images/favicon.ico">
</head>
<body onload="init()">
<script type="text/javascript">ndeSetTextSize();</script>
<div id="top">
<!--+
|breadtrail
+-->
<div class="breadtrail">
<a href="https://www.apache.org">Apache Software Foundation</a> &gt; <a href="https://poi.apache.org">Apache POI</a><script src="../../skin/breadcrumbs.js" language="JavaScript" type="text/javascript"></script>
</div>
<!--+
|header
+-->
<div class="header">
<!--+
|start group logo
+-->
<div class="grouplogo">
<a href="https://www.apache.org"><img class="logoImage" alt="Apache Software Foundation" src="../../images/asflogo_horizontal_color.svg" title="The Apache Software Foundation is a cornerstone of the modern Open Source software ecosystem &ndash; supporting some of the most widely used and important software solutions powering today's Internet economy."></a>
</div>
<!--+
|end group logo
+-->
<!--+
|start Project Logo
+-->
<div class="projectlogo">
<a href="https://poi.apache.org"><img class="logoImage" alt="Apache POI" src="../../images/project-header.png" title="Apache POI is well-known in the Java field as a library for reading and writing Microsoft Office file formats, such as Excel, PowerPoint, Word, Visio, Publisher and Outlook. It supports both the older (OLE2) and new (OOXML - Office Open XML) formats."></a>
</div>
<!--+
|end Project Logo
+-->
<!--+
|start Search
+-->
<div class="searchbox">
<form action="https://www.google.com/search" method="get" class="roundtopsmall">
<input value="poi.apache.org" name="sitesearch" type="hidden"><input onFocus="getBlank (this, 'Search the site with google');" size="25" name="q" id="query" type="text" value="Search the site with google">&nbsp;
<input name="Search" value="Search" type="submit">
</form>
</div>
<!--+
|end search
+-->
<!--+
|start Tabs
+-->
<ul id="tabs">
<li>
<a class="unselected" href="../../index.html">Home</a>
</li>
<li>
<a class="unselected" href="../../help/index.html">Help</a>
</li>
<li class="current">
<a class="selected" href="../../components/index.html">Component APIs</a>
</li>
<li>
<a class="unselected" href="../../devel/index.html">Getting Involved</a>
</li>
</ul>
<!--+
|end Tabs
+-->
</div>
</div>
<div id="main">
<div id="publishedStrip">
<!--+
|start Subtabs
+-->
<div id="level2tabs"></div>
<!--+
|end Endtabs
+-->
<script type="text/javascript"><!--
document.write("Last Published: " + document.lastModified);
// --></script>
</div>
<!--+
|breadtrail
+-->
<div class="breadtrail">
&nbsp;
</div>
<!--+
|start Menu, mainarea
+-->
<!--+
|start Menu
+-->
<div id="menu">
<div onclick="SwitchMenu('menu_selected_1.1', '../../skin/')" id="menu_selected_1.1Title" class="menutitle" style="background-image: url('../../skin/images/chapter_open.gif');">Component APIs</div>
<div id="menu_selected_1.1" class="selectedmenuitemgroup" style="display: block;">
<div class="menuitem">
<a href="../../components/index.html">Overview</a>
</div>
<div class="menuitem">
<a href="../../apidocs/index.html">Javadocs</a>
</div>
<div onclick="SwitchMenu('menu_1.1.3', '../../skin/')" id="menu_1.1.3Title" class="menutitle">Excel (HSSF/XSSF)</div>
<div id="menu_1.1.3" class="menuitemgroup">
<div class="menuitem">
<a href="../../components/spreadsheet/index.html">Overview</a>
</div>
<div class="menuitem">
<a href="../../components/spreadsheet/quick-guide.html">Quick Guide</a>
</div>
<div class="menuitem">
<a href="../../components/spreadsheet/how-to.html">HOWTO</a>
</div>
<div class="menuitem">
<a href="../../components/spreadsheet/converting.html">HSSF to SS Converting</a>
</div>
<div class="menuitem">
<a href="../../components/spreadsheet/formula.html">Formula Support</a>
</div>
<div class="menuitem">
<a href="../../components/spreadsheet/eval.html">Formula Evaluation</a>
</div>
<div class="menuitem">
<a href="../../components/spreadsheet/eval-devguide.html">Eval Dev Guide</a>
</div>
<div class="menuitem">
<a href="../../components/spreadsheet/examples.html">Examples</a>
</div>
<div class="menuitem">
<a href="../../components/spreadsheet/use-case.html">Use Case</a>
</div>
<div class="menuitem">
<a href="../../components/spreadsheet/diagrams.html">Pictorial Docs</a>
</div>
<div class="menuitem">
<a href="../../components/spreadsheet/limitations.html">Limitations</a>
</div>
<div class="menuitem">
<a href="../../components/spreadsheet/user-defined-functions.html">User Defined Functions</a>
</div>
<div class="menuitem">
<a href="../../components/spreadsheet/excelant.html">ExcelAnt Tests</a>
</div>
<div class="menuitem">
<a href="../../components/spreadsheet/hacking-hssf.html">Hacking HSSF</a>
</div>
<div class="menuitem">
<a href="../../components/spreadsheet/record-generator.html">Record Generator</a>
</div>
<div class="menuitem">
<a href="../../components/spreadsheet/chart.html">Charts</a>
</div>
</div>
<div onclick="SwitchMenu('menu_1.1.4', '../../skin/')" id="menu_1.1.4Title" class="menutitle">PowerPoint (HSLF/XSLF)</div>
<div id="menu_1.1.4" class="menuitemgroup">
<div class="menuitem">
<a href="../../components/slideshow/index.html">Overview</a>
</div>
<div class="menuitem">
<a href="../../components/slideshow/quick-guide.html">Quick Guide</a>
</div>
<div class="menuitem">
<a href="../../components/slideshow/how-to-shapes.html">HSLF Cookbook</a>
</div>
<div class="menuitem">
<a href="../../components/slideshow/xslf-cookbook.html">XSLF Cookbook</a>
</div>
<div class="menuitem">
<a href="../../components/slideshow/ppt-wmf-emf-renderer.html">Render SL/WMF/EMF</a>
</div>
<div class="menuitem">
<a href="../../components/slideshow/ppt-file-format.html">PPT File Format</a>
</div>
</div>
<div onclick="SwitchMenu('menu_1.1.5', '../../skin/')" id="menu_1.1.5Title" class="menutitle">Word (HWPF/XWPF)</div>
<div id="menu_1.1.5" class="menuitemgroup">
<div class="menuitem">
<a href="../../components/document/index.html">Overview</a>
</div>
<div class="menuitem">
<a href="../../components/document/quick-guide.html">HWPF Quick Guide</a>
</div>
<div class="menuitem">
<a href="../../components/document/quick-guide-xwpf.html">XWPF Quick Guide</a>
</div>
<div class="menuitem">
<a href="../../components/document/docoverview.html">HWPF Format</a>
</div>
<div class="menuitem">
<a href="../../components/document/projectplan.html">HWPF Project plan</a>
</div>
</div>
<div class="menuitem">
<a href="../../components/hsmf/index.html">Outlook (HSMF)</a>
</div>
<div class="menuitem">
<a href="../../components/diagram/index.html">Visio (HDGF+XDGF)</a>
</div>
<div onclick="SwitchMenu('menu_1.1.8', '../../skin/')" id="menu_1.1.8Title" class="menutitle">Publisher (HPBF)</div>
<div id="menu_1.1.8" class="menuitemgroup">
<div class="menuitem">
<a href="../../components/hpbf/index.html">Overview</a>
</div>
<div class="menuitem">
<a href="../../components/hpbf/file-format.html">File Format</a>
</div>
</div>
<div onclick="SwitchMenu('menu_selected_1.1.9', '../../skin/')" id="menu_selected_1.1.9Title" class="menutitle" style="background-image: url('../../skin/images/chapter_open.gif');">OLE2 Filesystem (POIFS)</div>
<div id="menu_selected_1.1.9" class="selectedmenuitemgroup" style="display: block;">
<div class="menuitem">
<a href="../../components/poifs/index.html">Overview</a>
</div>
<div class="menuitem">
<a href="../../components/poifs/how-to.html">How To</a>
</div>
<div class="menupage">
<div class="menupagetitle">Embedded Documents</div>
</div>
<div class="menuitem">
<a href="../../components/poifs/fileformat.html">File System Documentation</a>
</div>
<div class="menuitem">
<a href="../../components/poifs/usecases.html">Use Cases</a>
</div>
<div class="menuitem">
<a href="../../components/poifs/design.html">Design</a>
</div>
</div>
<div onclick="SwitchMenu('menu_1.1.10', '../../skin/')" id="menu_1.1.10Title" class="menutitle">OLE2 Document Props (HPSF)</div>
<div id="menu_1.1.10" class="menuitemgroup">
<div class="menuitem">
<a href="../../components/hpsf/index.html">Overview</a>
</div>
<div class="menuitem">
<a href="../../components/hpsf/how-to.html">How To</a>
</div>
<div class="menuitem">
<a href="../../components/hpsf/thumbnails.html">Thumbnails</a>
</div>
<div class="menuitem">
<a href="../../components/hpsf/internals.html">Internals</a>
</div>
<div class="menuitem">
<a href="../../components/hpsf/todo.html">To Do</a>
</div>
</div>
<div class="menuitem">
<a href="../../components/hmef/index.html">TNEF (HMEF) for winmail.dat</a>
</div>
<div class="menuitem">
<a href="../../components/oxml4j/index.html">OpenXML4J (OOXML)</a>
</div>
<div class="menuitem">
<a href="../../components/logging.html">Logging framework</a>
</div>
<div class="menuitem">
<a href="../../components/configuration.html">Configuration</a>
</div>
</div>
<div id="credit"></div>
<div id="roundbottom">
<img style="display: none" class="corner" height="15" width="15" alt="" src="../../skin/images/rc-b-l-15-1body-2menu-3menu.png"></div>
<!--+
|alternative credits
+-->
<div id="credit2">
<a href="https://donate.apache.org/"><img border="0" title="Support Apache" alt="Support Apache - logo" src="../../images/support-asf.png" style="width: 125px;height: 125px;"></a><a href="https://www.apache.org/foundation/press/kit/#poweredby"><img border="0" title="powered by POI" alt="powered by POI - logo" src="../../images/poweredby-poi-logo.png" style="width: 125px;height: 125px;"></a>
</div>
</div>
<!--+
|end Menu
+-->
<!--+
|start content
+-->
<div id="content">
<h1>Apache POI&trade; - POIFS - Documents embedded in other documents</h1>
<h3>Overview</h3>
<div id="front-matter"></div>
<a name="Overview"></a>
<h2 class="boxed">Overview</h2>
<div class="section">
<p>It is possible for one OLE 2 based document to have other
OLE 2 documents embedded in it. For example, an Excel file
may have a Word document and a PowerPoint slideshow
embedded as part of it.</p>
<p>Normally, these other documents are stored in subdirectories
of the OLE 2 (POIFS) filesystem. The exact location of the
embedded documents will vary depending on the type of the
master document, and the exact directory names will differ
each time. To figure out exactly which directory to look
in, you will either need to process the appropriate OLE 2
linking entry in the master document, or simple iterate
over all the directories in the filesystem.</p>
<p>As a general rule, you will find the same OLE 2 entries
in the subdirectories, as you would've found at the root
of the filesystem were a document to not be embedded.</p>
<a name="Files+embedded+in+Excel"></a>
<h3 class="boxed">Files embedded in Excel</h3>
<p>Excel normally stores embedded files in subdirectories
of the filesystem root. Typically these subdirectories
are named starting with MBD, with 8 hex characters following.</p>
<a name="Files+embedded+in+Word"></a>
<h3 class="boxed">Files embedded in Word</h3>
<p>Word normally stores embedded files in subdirectories
of the ObjectPool directory, itself a subdirectory of the
filesystem root. Typically these subdirectories and named
starting with an underscore, followed by 10 numbers.</p>
<a name="Files+embedded+in+PowerPoint"></a>
<h3 class="boxed">Files embedded in PowerPoint</h3>
<p>PowerPoint does not normally store embedded files
in the OLE2 layer. Instead, they are held within records
of the main PowerPoint file.
<br>See the <a href="./../slideshow/how-to-shapes.html#OLE">HSLF Tutorial</a>
for how to retrieve embedded OLE objects from a presentation</p>
</div>
<a name="Listing+POIFS+contents"></a>
<h2 class="boxed">Listing POIFS contents</h2>
<div class="section">
<p>POIFS provides a simple tool for listing the contents of
OLE2 files. This can allow you to see what your POIFS file
contents, and hence if it has any embedded documents in it,
and where.</p>
<p>The tool to use is <em>org.apache.poi.poifs.dev.POIFSLister</em>.
This tool may be run from the command line, and takes a filename
as its parameter. It will print out all the directories and
files contained within the POIFS file.</p>
</div>
<a name="Opening+embedded+files"></a>
<h2 class="boxed">Opening embedded files</h2>
<div class="section">
<p>All of the POIDocument classes (HSSFWorkbook, HSLFSlideShow,
HWPFDocument and HDGFDiagram) can either be opened from
a POIFSFileSystem, or from a specific directory within a
POIFSFileSystem. So, to open embedded files, simply locate the
appropriate DirectoryNode that represents the subdirectory
of interest, and pass this + the overall POIFSFileSystem to
the constructor.</p>
<p>I you want to extract the textual contents of the embedded file,
then open the appropriate POIDocument, and then pass this to
the extractor class, instead of simply passing the POIFSFilesystem
to the extractor.</p>
</div>
<p align="right">
<font size="-2">by&nbsp;Nick Burch,&nbsp;Yegor Kozlov</font>
</p>
</div>
<!--+
|end content
+-->
<div class="clearboth">&nbsp;</div>
</div>
<div id="footer">
<!--+
|start bottomstrip
+-->
<div class="lastmodified">
<script type="text/javascript"><!--
document.write("Last Published: " + document.lastModified);
// --></script>
</div>
<div class="copyright">
Copyright &copy;
2001-2026 <a href="https://www.apache.org/">The Apache Software Foundation</a>
<br>
Apache POI, POI, Apache, the Apache logo, and the Apache
POI project logo are trademarks of The Apache Software Foundation.
</div>
<div id="feedback">
Send feedback about the website to:
<a id="feedbackto" href="mailto:dev@poi.apache.org?subject=Feedback%C2%A0components/poifs/embeded.html">dev@poi.apache.org</a>
</div>
<!--+
|end bottomstrip
+-->
</div>
</body>
</html>