Read and Write Open XML Files Using MS Office 2007
by Zeljko Svedic

Microsoft Open XML format

Every Open XML file is essentially a ZIP archive containing many other files.  Office-specific data is stored in multiple XML files inside that archive.  This is in direct contrast with the older WordML and SpreadsheetML formats, which were single, uncompressed XML files.  Although more complex, the new approach offers a few benefits:

•        You do not need to process entire files in order to extract specific data.

•        Images and multimedia are now stored in their native binary format, not encoded as text streams.

•        Files are smaller as a result of compression and native multimedia storage.

In Microsoft’s terminology, an Open XML ZIP file is called a package, and the files inside that package are called parts.  Every part has an explicitly declared content type; nothing is inferred silently from a file extension.  Instead, the “[Content_Types].xml” file in the package declares the type of each part, either as a default for an extension or as an override for one specific part.  A content type can describe anything: application XML, user XML, images, sounds, video or any other binary object.  Every part must be connected to the package or to some other part by a relationship.  Relationships are defined in special XML files with the “.rels” extension inside the package.  There is also a start part (sometimes called the “root”, which is a bit misleading because the graph containing all parts does not have to be a tree), so the entire structure looks like Figure 1.

Figure 1
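
To make the content type rule concrete, here is a minimal Python sketch of how a part's type could be looked up.  The sample XML, the part names and the helper content_type_of are illustrative assumptions rather than anything taken from a real document, and the namespace is the one used by the published Open Packaging Conventions (pre-release builds may differ).

```python
import xml.etree.ElementTree as ET

# Namespace of [Content_Types].xml in the published standard (assumption:
# the file you are reading follows that standard).
CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"

# Hypothetical, trimmed-down [Content_Types].xml for a word-processing package.
SAMPLE_CONTENT_TYPES = (
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels"'
    ' ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="png" ContentType="image/png"/>'
    '<Override PartName="/word/document.xml"'
    ' ContentType="application/vnd.openxmlformats-officedocument'
    '.wordprocessingml.document.main+xml"/>'
    '</Types>'
)


def content_type_of(part_name, content_types_xml):
    """Return the declared content type of a part, or None if undeclared."""
    root = ET.fromstring(content_types_xml)
    # An Override for the exact part name wins over any extension Default.
    for override in root.findall(f"{{{CT_NS}}}Override"):
        if override.get("PartName", "").lower() == part_name.lower():
            return override.get("ContentType")
    # Otherwise fall back to the Default declared for the part's extension.
    extension = part_name.rsplit(".", 1)[-1].lower() if "." in part_name else ""
    for default in root.findall(f"{{{CT_NS}}}Default"):
        if default.get("Extension", "").lower() == extension:
            return default.get("ContentType")
    return None
```

Note that a part whose name matches neither an Override nor a Default simply has no declared type in this sketch; the function returns None instead of guessing from the extension.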


To cut a long story short, in order to read the data from an Open XML file you need to:

1)       Open the package as a ZIP archive; any standard ZIP library will do.

2)       Find the parts that contain the data you want to read.  You can either navigate the relationship graph (more complex) or assume that certain parts have a fixed name and path (simpler, but Microsoft may change those names in the future).

3)       Read the parts you are interested in, using a standard XML library (if they are XML) or some other method (if they are images, sounds or some other type).
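
The three reading steps can be sketched in Python with only the standard library.  This is a rough illustration, not a complete reader: the function names find_start_part and read_start_part are hypothetical, and the relationship namespace and the officeDocument relationship type are assumed to be the ones from the published standard.

```python
import posixpath
import xml.etree.ElementTree as ET
import zipfile

# Assumed identifiers from the published standard; not stated in the article.
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
OFFICE_DOCUMENT_REL = (
    "http://schemas.openxmlformats.org/officeDocument/2006/"
    "relationships/officeDocument"
)


def find_start_part(package):
    """Step 2: follow the package-level relationship to the start part."""
    rels = ET.fromstring(package.read("_rels/.rels"))
    for rel in rels.findall(f"{{{REL_NS}}}Relationship"):
        if rel.get("Type") == OFFICE_DOCUMENT_REL:
            # Targets are relative to the package root, while ZIP entry
            # names carry no leading slash.
            return posixpath.normpath(rel.get("Target")).lstrip("/")
    raise ValueError("no officeDocument relationship found")


def read_start_part(path_or_file):
    """Steps 1 to 3: open the ZIP, locate the start part, parse it as XML."""
    with zipfile.ZipFile(path_or_file) as package:
        part_name = find_start_part(package)
        return part_name, ET.fromstring(package.read(part_name))
```

Navigating through “_rels/.rels” is the more complex of the two options in step 2; the simpler alternative would be to call package.read with a hard-coded part name and accept the risk that the name changes.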

On the other hand, if you want to create a new Open XML file, you need to:

1)       Create or obtain all necessary parts, either by generating them with a standard XML library (if they are XML), by copying them, or by some other method.

2)       Create all relationships by creating “.rels” files.

3)       Create content types by creating a “[Content_Types].xml” file.

4)       Package everything into a ZIP file with the appropriate extension (DOCX, XLSX or PPTX); any standard ZIP library will do.
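
As a rough Python sketch of these four steps, the following builds a very small word-processing package in memory.  The function build_docx, the part name “word/document.xml” and the XML bodies are assumptions made for illustration (they follow the published standard as far as I know, but have not been checked against any particular Office release), so treat this as a starting point rather than a reference implementation.

```python
import io
import zipfile
from xml.sax.saxutils import escape

# Step 3: content types (assumed minimal set for a one-part document).
CONTENT_TYPES_XML = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels"'
    ' ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml"'
    ' ContentType="application/vnd.openxmlformats-officedocument'
    '.wordprocessingml.document.main+xml"/>'
    '</Types>'
)

# Step 2: the package-level relationship that points at the start part.
ROOT_RELS_XML = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships'
    ' xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1"'
    ' Type="http://schemas.openxmlformats.org/officeDocument/2006/'
    'relationships/officeDocument"'
    ' Target="word/document.xml"/>'
    '</Relationships>'
)

# Step 1: the only "real" part, a document with a single paragraph.
DOCUMENT_TEMPLATE = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<w:document'
    ' xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    '<w:body><w:p><w:r><w:t>{text}</w:t></w:r></w:p></w:body>'
    '</w:document>'
)


def build_docx(text):
    """Step 4: zip the parts together and return the package as bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES_XML)
        package.writestr("_rels/.rels", ROOT_RELS_XML)
        # escape() keeps characters such as & and < from breaking the XML.
        package.writestr("word/document.xml",
                         DOCUMENT_TEMPLATE.format(text=escape(text)))
    return buffer.getvalue()
```

Writing the returned bytes to a file with a DOCX extension gives a package whose contents can be inspected with any ZIP tool.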

The whole story about packages, parts, content types and relationships is the same for all Open XML documents (regardless of their originating application), and Microsoft refers to it as the Open Packaging Conventions.
