Newsletter Archives
How to dig into Excel files using the Office XML file format
ISSUE 21.22 • 2024-05-27 OFFICE
By Mary Branscombe
The tools built into Excel assume colleagues might make honest mistakes. If you suspect something more nefarious, look in the XML of the file for clues.
Last time, I looked at what’s inside an Office document. In essence, it is a package of different files that contain both the content and the formatting of your document, kept together in what is effectively a ZIP file.
For Excel, this file collection also includes a lot of information about how a spreadsheet was put together.
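Because the workbook is effectively a ZIP file, a short script can open it without Excel. The following is a minimal sketch, not the article's own method: it lists the files in the package and reads the sheet names from xl/workbook.xml, including each sheet's "state" attribute, which marks hidden sheets. The helper names (list_parts, list_sheets, make_sample_package) and the sample sheet names are hypothetical, and the in-memory sample is only a stand-in so the code runs without a real workbook.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by SpreadsheetML parts such as xl/workbook.xml.
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def list_parts(source):
    """Return the names of all files stored inside the package."""
    with zipfile.ZipFile(source) as package:
        return package.namelist()


def list_sheets(source):
    """Return (name, state) for each sheet declared in xl/workbook.xml."""
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read("xl/workbook.xml"))
    sheets = root.findall(f"{{{MAIN_NS}}}sheets/{{{MAIN_NS}}}sheet")
    # A missing "state" attribute means the sheet is visible.
    return [(s.get("name"), s.get("state", "visible")) for s in sheets]


def make_sample_package():
    """Build a tiny stand-in package in memory (hypothetical, for illustration).

    It holds only xl/workbook.xml, so Excel itself would not open it.
    """
    workbook_xml = (
        f'<workbook xmlns="{MAIN_NS}"><sheets>'
        '<sheet name="Summary" sheetId="1"/>'
        '<sheet name="Adjustments" sheetId="2" state="hidden"/>'
        "</sheets></workbook>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("xl/workbook.xml", workbook_xml)
    buffer.seek(0)
    return buffer


sample = make_sample_package()
print(list_parts(sample))
print(list_sheets(sample))
```

To try this on a real workbook, pass its path instead of the sample, for example list_sheets("budget.xlsx"), where the file name is a placeholder.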
Read the full story in our Plus Newsletter (21.22.0, 2024-05-27).
This story also appears in our public Newsletter.
Understanding Office document formats
OFFICE
By Mary Branscombe
Inside every Office file is a hierarchy of formats and XML markup.
If you understand these structures, you can use that knowledge to extract information directly from most Office app files.
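As one illustration of extracting information directly, Word, Excel, and PowerPoint files in the XML-based formats usually carry a docProps/core.xml part with the author, the last editor, and timestamps. The sketch below reads those fields with Python's standard library. It assumes the part sits at its conventional path (strictly, the package relationships define the location), and the function names and sample values are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by the core properties part of an Office package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Friendly key -> element to look up under the root of core.xml.
FIELDS = {
    "creator": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def read_core_properties(source, part="docProps/core.xml"):
    """Return selected metadata fields; a field is None when absent."""
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read(part))
    return {key: root.findtext(path, namespaces=NS) for key, path in FIELDS.items()}


def make_sample_document():
    """Build a stand-in package in memory with made-up metadata (assumption)."""
    core_xml = (
        f'<cp:coreProperties xmlns:cp="{NS["cp"]}" xmlns:dc="{NS["dc"]}" '
        f'xmlns:dcterms="{NS["dcterms"]}">'
        "<dc:creator>Alex Example</dc:creator>"
        "<cp:lastModifiedBy>Sam Sample</cp:lastModifiedBy>"
        "<dcterms:modified>2024-04-29T17:30:00Z</dcterms:modified>"
        "</cp:coreProperties>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("docProps/core.xml", core_xml)
    buffer.seek(0)
    return buffer


properties = read_core_properties(make_sample_document())
for key, value in properties.items():
    print(f"{key}: {value}")
```

The same function should work unchanged on a .docx, .xlsx, or .pptx path, since the metadata part is shared across the three formats.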
When Word, Excel, and PowerPoint first came out, they stored documents in proprietary binary file formats, with text, styles, page layout, and multimedia all encoded in the same file. That was fairly efficient: the binary file is compact, and there’s only one file to copy per document when you want to move it around or share it with someone.
Read the full story in our Plus Newsletter (21.18.0, 2024-04-29).