Conrad Muller
Seattle, Washington

Email: conrad at
dot com

XML (eXtensible Markup Language)

Document Markup

All documents, audible, print or electronic, consist of content, organization, and presentation.

All programs that deal with text, numbers, and graphics require commands that tell the printer or viewing program (browser) how to organize and present the content for viewing. In electronic documents, markup languages describe layout (organization) and formatting (presentation) to the printer, browser, or other viewing medium.

In the beginning, Web page content was formatted for the browser by (marked up by) HTML, and it was good, if a little limiting. HTML (HyperText Markup Language) was originally devised by Tim Berners-Lee of the European Particle Physics Laboratory (CERN), beginning in 1989.

HTML was meant to be easy to use. It was designed to be flexible, but above all, it had to be easy to learn so that many people would use it. Power in markup was sacrificed to keep HTML simple.

Then the browser vendors began to expand HTML by adding new tags (commands) to the browsers, as well as a browser programming language (JavaScript). The new commands, such as tables and frames, were very useful. However, each browser vendor implemented the new tags in its own non-standard way, and that was bad.

User organizations formed to pressure browser vendors to standardize, and the vendors ignored the users. So some of the users said: "What if we take making new tags out of the hands of the browser vendors? What if we get the vendors to add code to the browsers that allows us to make up our own tags?"

The users wanted to extend the traditional set of markup commands (the markup language) provided by HTML.

After lots of volunteer effort, which included people from the major browser vendors, the W3C presented the eXtensible Markup Language.

eXtensible Markup Language

XML is the acronym for eXtensible Markup Language. XML is already revolutionizing EDI (Electronic Data Interchange), making business-to-business transactions over the Internet more flexible and less expensive than pre-existing proprietary communications protocols. XML tags can be nested inside other tags, so that hierarchical data structures can be maintained. For example, a tagged section of data for an employee can include a section for name, which contains tagged sections for first, middle, and last names. A section for address can contain tagged sections for street address, city, state, and zip. All of these sections are contained within the tags for that employee.
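The nested employee record described above can be sketched in Python with the standard library's xml.etree.ElementTree module. The tag names, the sample values, and the full_name helper are assumptions made up for illustration; they do not come from any standard vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical employee record; tag names and values are illustrative only.
EMPLOYEE_XML = """
<employee>
  <name>
    <first>Ada</first>
    <middle>King</middle>
    <last>Lovelace</last>
  </name>
  <address>
    <street>12 Example Street</street>
    <city>Seattle</city>
    <state>WA</state>
    <zip>98101</zip>
  </address>
</employee>
"""


def full_name(employee):
    """Join the first, middle, and last name sections into one string."""
    name = employee.find("name")
    return " ".join(name.findtext(part) for part in ("first", "middle", "last"))


employee = ET.fromstring(EMPLOYEE_XML)

# A path expression walks down the hierarchy: address, then city.
city = employee.findtext("address/city")
```

Because the name and address sections sit inside the employee tags, a program can ask for one branch of the hierarchy (here, the city) without reading the rest of the record.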

To make the new tags work, the browsers must understand XML, and there must be a programming language and a set of organizing and formatting functions built into the browsers. The programming language is usually JavaScript, the basic organizing is done in HTML, and final positioning and formatting is best done by Cascading Style Sheets.

To recap: We need a markup language to tell Web browsers how to display a document. HTML has provided a simple markup language, and XML allows us to be more creative with our document structure and data manipulation.

Is that it? Is XML simply a language to create custom tags to ease communications between computers?

That's part of it. XML allows more flexible control of markup in documents and data. XML is already playing an important role in Web commerce, enabling transactions between vendors and clients at all levels of the economic food chain. While it is possible to create unique XML tags for a stand-alone application, it is usually easier to use standard tags for most solutions. Remember, both the client and the server (or the document and the browser) must agree on the set of definitions. If multiple clients need to communicate with multiple servers, all of the clients and all of the servers must understand all of the XML tags. The tag definitions are communicated in a list called a DTD (Document Type Definition).
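The need for a shared set of tag definitions can be illustrated with a small Python sketch. Note the assumptions: the parser in Python's standard library does not validate a document against a DTD, so this is only a simplified stand-in that checks tag names against a hypothetical agreed list. A real DTD also describes how tags may be nested and ordered, which this sketch does not check. The AGREED_TAGS set, the unknown_tags function, and the sample orders are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary that the client and server have agreed on.
AGREED_TAGS = frozenset({"order", "customer", "item", "quantity"})


def unknown_tags(xml_text, agreed=AGREED_TAGS):
    """Return the sorted tag names in xml_text that are not in the agreed set."""
    root = ET.fromstring(xml_text)
    found = {element.tag for element in root.iter()}
    return sorted(found - agreed)


# This message uses only agreed tags.
good = "<order><customer>Acme</customer><item>Bolt</item><quantity>4</quantity></order>"

# This message uses a tag (sku) that the receiver has never heard of.
bad = "<order><customer>Acme</customer><sku>B-17</sku></order>"
```

A receiver could call unknown_tags on each incoming message and reject any message for which the result is not empty.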

All of the required technologies are implemented in current browsers.

The Rest of the Story: Data Markup

Today, XML is even more popular as a data markup language. The power to create custom markup tags works as well for data, numbers, and images as it does for text. Even very complex data can be formatted in XML for transmission and storage. XML is better suited to data than HTML because its custom tags can be nested inside other tags, allowing hierarchical data to be represented directly. XML is written in plain text (Unicode, of which ASCII is a subset), so it is easy to store and transfer data in XML files.

XML is used to store configuration information and temporary data for applications. Application development environments use XML to store information about databases being accessed, and information about the structure of the pages and code being generated.
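As a minimal sketch of XML used for configuration, the following Python example reads application settings from an XML string. The element names, attribute names, host name, and the load_config function are hypothetical; a real application would define its own layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical application settings; element and attribute names are made up.
CONFIG_XML = """
<config>
  <database host="db.example.com" port="5432" name="inventory"/>
  <cache enabled="true" ttl_seconds="300"/>
</config>
"""


def load_config(xml_text):
    """Parse the settings into a plain dictionary with typed values."""
    root = ET.fromstring(xml_text)
    database = root.find("database")
    cache = root.find("cache")
    return {
        "db_host": database.get("host"),
        # XML attribute values are always text, so numbers must be converted.
        "db_port": int(database.get("port")),
        "db_name": database.get("name"),
        "cache_enabled": cache.get("enabled") == "true",
        "cache_ttl": int(cache.get("ttl_seconds")),
    }


settings = load_config(CONFIG_XML)
```

The conversion step is worth noticing: XML carries everything as text, so the program reading the file decides which values are numbers or true/false flags.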

XML also has negative aspects. XML can add considerable overhead to the data. The markup tags add to the file size, and marking up the data and then parsing it again slows access to the data. However, for many uses, XML offers the same advantages for data that tagging offers for text. By including the tag definitions with the data file, we can have self-documenting data. By following the standards, data can be transferred between computers running otherwise incompatible software. This is similar to the way any brand of browser can display any standards-compliant Web page no matter what brand of Web server it came from.
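The size overhead mentioned above can be measured with a rough Python sketch that writes the same small data set as XML and as CSV (comma-separated values) and compares the byte counts. The sample rows, field names, and helper functions are assumptions chosen for illustration, and the exact ratio depends heavily on the data and the tag names used.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data used only to compare sizes.
FIELDS = ("name", "city", "age")
ROWS = [("Ada", "Seattle", 36), ("Grace", "Tacoma", 41), ("Alan", "Spokane", 29)]


def to_xml(rows):
    """Serialize rows as XML, with one tagged element per field."""
    root = ET.Element("people")
    for row in rows:
        person = ET.SubElement(root, "person")
        for field, value in zip(FIELDS, row):
            ET.SubElement(person, field).text = str(value)
    return ET.tostring(root, encoding="unicode")


def to_csv(rows):
    """Serialize the same rows as CSV with a single header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(FIELDS)
    writer.writerows(rows)
    return buffer.getvalue()


xml_size = len(to_xml(ROWS).encode("utf-8"))
csv_size = len(to_csv(ROWS).encode("utf-8"))

# How many times larger the XML form is than the CSV form.
overhead_ratio = xml_size / csv_size
```

The XML version repeats every field name twice per record (opening and closing tag), while the CSV version names the fields once in its header. That repetition is the price of self-documenting data.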

This page is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License.
