Achieving Openness: A Closer Look at ODF and OOXML
by Sam Hiser
An open, XML-based standard for displaying and storing data files (text documents, spreadsheets, and presentations) offers a new and promising approach to data storage and document exchange among office applications. A comparison of the two XML-based formats–OpenDocument Format ("ODF") and Office Open XML ("OOXML")–across widely accepted "openness" criteria has revealed substantial differences, including the following:
- ODF is developed and maintained in an open, multi-vendor, multi-stakeholder process that protects against control by a single organization. OOXML is less open in its development and maintenance, despite being submitted to a formal standards body, because control of the standard ultimately rests with one organization.
- ODF is the only openly available standard, published fully in a document that is freely available and easy to comprehend. This openness is reflected in the number of competing applications in which ODF is already implemented. Unlike ODF, OOXML's complexity, extraordinary length, technical omissions, and single-vendor dependencies combine to make alternative implementation unattractive as well as legally and practically impossible.
- ODF is the only format unencumbered by intellectual property rights (IPR) restrictions on its use in other software, as certified by the Software Freedom Law Center. Conversely, many elements designed into the OOXML formats but left undefined in the OOXML specification require behaviors upon document files that only Microsoft Office applications can provide. This makes data inaccessible and breaks work group productivity whenever alternative software is used.
- ODF offers interoperability with ODF-compliant applications on most of the common operating system platforms. OOXML is designed to operate fully within the Microsoft environment only. Though it will work elegantly across the many products in the Microsoft catalog, OOXML ignores accepted standards and best practices regarding its use of XML.
Overall, a comparison of both formats reveals significant differences in their levels of openness. While ODF proves sufficiently open across all four key criteria, OOXML shows relative weakness on each criterion and exhibits fundamental flaws that undermine its candidacy as a global standard.
In today's knowledge economy, information and communication technology (ICT) architectures need to be flexible. They must be modular, pluggable, easy to set up, fast to integrate, and fast to take down and repurpose if governments and businesses alike are to meet the demands of their citizens and customers. These architectures need to be built around agreed standard protocols, and data needs to flow seamlessly across different applications and platforms.
Open standards are at the core of these new interoperable systems. Based largely on the framework of TCP/IP and HTML, both open standards, the Internet's open architecture has enabled new and unimagined ways of communicating, working, and innovating. The Internet is the best example of what can be achieved when systems interoperate around open standards.
With that in mind, this article analyzes the "openness" of two emerging XML-based document formats. The analysis is timely. ODF was approved as an international standard in May 2006 (ISO/IEC 26300).i OOXML was recently submitted to JTC1 of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC), triggering a 9- to 12-month process during which OOXML will be reviewed and voted on by national standards bodies.ii Much of the information in this paper has appeared before, but not in a synthesis on the openness theme.
"Openness" in Document Formats
With the emergence of flexible ICT architectures that depend upon interoperability, a document format's degree of openness will affect the free flow of information across the world's computer systems. ODF and OOXML each promise different results for data access as well as cost, choice, and innovation in software.
Given the document-intensive nature of their day-to-day business, governments have a special interest in open document formats. An open standard for documents that is widely available in many software products would allow agencies and departments to exchange and collaborate on office documents, store them for long periods of time, ensure public access to them, and enable electronic communication with citizens without forcing on themselves or their citizens any particular brand of software. This is why interest in open document standards is growing and why the number of governments around the world requiring their use is increasing.iii
Various definitions of an "open standard" have been proposed.iv With document formats particularly in mind, these definitions converge on four basic areas:
- Open Life-Cycle
- Open Availability
- Multiple Implementations
- Interoperability Across Different Systems
Following is an analysis of both formats across each of these four consensus criteria and a measurement of the degree to which ODF and OOXML satisfy each one. In satisfying these criteria thoroughly, a document format can be deemed sufficiently open to bring us fully into the Internet era of low-cost, collaborative computing based on modular services and architectures.
(I) Open Life-Cycle
A format has an open life-cycle when it evolves in a process that is open to public participation: meetings are held in the open; meeting artifacts (notes, minutes, e-mail correspondence, and documentation) are published; and all participants–individuals as well as companies–have a voice in consensus decision-making on the standard's technical make-up. An open standard should be platform- and vendor-neutral, so the involvement of multiple implementers working on multiple platforms is essential.
ODF was developed, and continues to evolve, in an open, appealable, and published process. It was developed by a Technical Committee of the Organization for the Advancement of Structured Information Standards (OASIS), which operates in public view and invites the participation of any interested party. The committee's e-mail, meeting notes, and documentation are archived on the OASIS web site. Participation in technical committee meetings is not limited: individual members of OASIS as well as corporate representatives participate equally, with voting eligibility established by the level of individual participation.
ODF's Technical Committee and Sub-Committees include multiple active participants representing both proprietary and open source implementers. Other participants include accessibility advocates, academic and government representatives, and consumer groups.
Originally the default format of the OpenOffice.org application, ODF went through a rigorous, open evolution process starting in 2003, when it was submitted to OASIS. OASIS members improved it over the course of two years. It then underwent a year-long review at ISO, where it received further comments and corrections, and was officially published as an ISO standard in November 2006. During these four years of collaborative technical refinement, many software application vendors implemented it to varying degrees of completeness in both proprietary and open source solutions.
Ecma International ("Ecma") Technical Committee 45 ("TC45"), which maintains OOXML, works in an opaque manner: its voting, balloting, and appeals policies are not published. It is unclear whether voting, balloting, or appeals processes are used at all in the development of OOXML, since the formats were developed in advance within Microsoft's Office software development group and Microsoft retains veto power over any changes proposed in TC45. Moreover, while there is after-the-fact reporting by press release, the meeting activities of TC45 and the committee's work-in-progress documents and e-mail are not public.v
Barriers to participation in the development of OOXML are many. Ecma membership requirements are limiting: individuals are not welcome to participate except by special invitation or through their corporate membership at Ecma. Only senior corporate members have the right to vote on a TC. The OOXML specification, which runs to more than 6,000 pages, was reviewed by Ecma in less than a year and submitted to ISO in December 2006 without a reference implementation in software.
OOXML is a single-vendor specification that does not have an open life-cycle. Ecma TC45 behaves only as a consultative body. A single vendor, Microsoft, retains control over development of OOXML. Performance on such key criteria as interoperability (see Section IV, below) therefore remains in the hands of one private entity. This contrasts significantly with ODF's open life-cycle as maintained at the OASIS ODF Technical Committee.
(II) Open Availability
An open format is published in its entirety in a specification document, which is freely available and easy to comprehend. Open Availability also means that a format is freely available for implementation in software.
ODF is published in its complete form in the .odt and .pdf formats, which are downloadable from the OASIS website free of cost.vi
ODF has been implemented in many different vendors' products, under both proprietary and open source software licenses and on numerous operating system platforms. This is possible because the ODF specification is technically explicit and contains references to other open standards, and because its length is reasonable and manageable.
The OOXML specification is free and may be downloaded from the Ecma website. It is, however, difficult to manage: it is so long, divided into so many parts, so complex and inconsistent in its technical terminology, and marked by so many deliberate omissions that questions arise about its availability on a practical level.vii In the following areas, OOXML presents significant questions and challenges regarding full, open availability:
- Non-disclosure of elements of OOXML
OOXML contains numerous undocumented elements. For example, OOXML preserves certain file data in binary form based upon legacy formats that are not, and have never been, disclosed to outside developers. This means it is impossible for any entity besides Microsoft to create effective alternative implementations of the formats.
A second example is the implementation of OOXML for spreadsheets in Office 2007 (Excel 2007), which also makes use of data in binary form. As these binary formats have not yet been shared openly, it is presently impossible for other vendors or developers to create working alternative implementations of the OOXML binary spreadsheet format.
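As a rough illustration of how a developer might check a given file for such content, the following Python sketch lists the parts of a ZIP-based document package that are not stored as XML. It rests on two assumptions: that the file is a ZIP package, and that a part's file extension is a fair indicator of whether it is XML. The function names and the contents of the demo package are hypothetical, and the sketch says nothing about what any binary part actually contains.

```python
import io
import zipfile

# Assumption: parts with these extensions are XML; everything else is
# treated as opaque (non-XML) content. This is only a heuristic.
XML_SUFFIXES = (".xml", ".rels")


def non_xml_parts(package_bytes):
    """Return the sorted names of package parts not stored as XML."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as pkg:
        return sorted(
            name
            for name in pkg.namelist()
            if not name.lower().endswith(XML_SUFFIXES)
        )


def build_demo_package():
    """Build a tiny, hypothetical package in memory for demonstration."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as pkg:
        pkg.writestr("[Content_Types].xml", "<Types/>")
        pkg.writestr("xl/workbook.xml", "<workbook/>")
        pkg.writestr("xl/printerSettings/printerSettings1.bin", b"\x00\x01")
        pkg.writestr("xl/embeddings/oleObject1.bin", b"\x00\x02")
    return buf.getvalue()


if __name__ == "__main__":
    for part in non_xml_parts(build_demo_package()):
        print(part)
```

A check of this kind only locates opaque parts; reading them would still require documentation of the binary formats themselves.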
- OOXML elements require an application to emulate Microsoft Office
Numerous elements designed into but left undefined by the OOXML specification require application behaviors that are particular to legacy Microsoft Office and WordPerfect applications.viii Examples from the OOXML specification include:
Emulate Word 6.0 Line Wrapping for East Asian Text
Emulate Word 5.x for Macintosh Small Caps Formatting
Emulate Word 97 Text Wrapping Around Floating Objects
Emulate WordPerfect 6.x Font Height Calculation
Emulate Word 2002 Table Style Rules
Emulate Word 97 East Asian Line Breaking
Emulate WordPerfect 6.x Paragraph Justification
The practical effect is that the data associated with these features, once it is contained in OOXML files, will not be readable, editable or renderable by software applications that cannot perfectly emulate Microsoft Office or WordPerfect. While the stated purpose of OOXML is to ensure backward compatibility with old files, such deprecated legacy data creates a dependency upon Microsoft's Windows operating system and office suite applications.ix
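To make the settings listed above more concrete, here is a minimal Python sketch of how an alternative application might at least detect them in a word-processing document's settings. It assumes that the settings appear as child elements of a `w:compat` element in the WordprocessingML namespace. The element names in the sample (`lineWrapLikeWord6`, `useWord97LineBreakRules`, `wpJustification`) are given as recalled from the specification and should be checked against it; the sample XML itself is invented for illustration.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (assumed here for the settings part).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Invented sample; the element names are assumptions to verify against the spec.
SAMPLE_SETTINGS = f"""<w:settings xmlns:w="{W_NS}">
  <w:zoom w:percent="100"/>
  <w:compat>
    <w:lineWrapLikeWord6/>
    <w:useWord97LineBreakRules/>
    <w:wpJustification/>
  </w:compat>
</w:settings>"""


def list_compat_flags(settings_xml):
    """Return the local names of all children of w:compat, in order."""
    root = ET.fromstring(settings_xml)
    compat = root.find(f"{{{W_NS}}}compat")
    if compat is None:
        return []
    # ElementTree tags look like "{namespace}localName"; keep the local name.
    return [child.tag.split("}", 1)[1] for child in compat]


if __name__ == "__main__":
    for flag in list_compat_flags(SAMPLE_SETTINGS):
        print("legacy behavior requested:", flag)
```

Detecting a flag is the easy part. The sketch can report that a document requests a legacy behavior, but it cannot reproduce that behavior, because the behavior is not defined in the specification.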
Such dependencies fail to meet the criterion of Open Availability. The OOXML format includes interactions with Microsoft's earlier, unspecified formats. The result is that other vendors, developers, and users cannot access data in Microsoft formats to the same degree as Microsoft software can, nor to the degree expected of standard XML.
These unspecified format characteristics and application behaviors are not explicit in the OOXML technical specification, nor are they legally allowed to be duplicated by developers. Microsoft's license for OOXML, the Open Specification Promise, prohibits such application behavior emulation and, therefore, blocks access by non-Microsoft entities to the data in OOXML form–in effect, this makes the specification unavailable while it also defeats the purpose of having an XML document format.x
As evidenced by its implementation in multiple products that are offered through multiple vendors, ODF achieves open availability. However, the OOXML specification's complexity, its length, omissions, and single-vendor dependencies prohibit efficient, cost-effective, or fully working implementations of the format in other software. OOXML is therefore unlikely to ever be fully implemented by any application other than Microsoft's Office, for which it was created.