An Intro to XML

What it is

In one of the many in-depth XML sessions at Microsoft Tech-Ed 2000 (June 5-8, 2000 in Orlando, Florida), Microsoft XML Product leads Eric Schmidt and Zak Davies began with a tongue-in-cheek demonstration of just how misunderstood this meta-markup language really is. "XML is cool. Let's go buy some XML," they bantered, before showing a video of a pair of nerdy Web developers in an elevator accosting another passenger, in a scene that went a little like this:

"Can you believe that? He didn't know what XML stands for! Every idiot knows what XML is, right?" they asked the poor bystander, who managed to stammer out "Je ne parle Anglais" seconds before a man entered the elevator and exposed the bystander's ruse with "Hiya Fred - how's it goin?"

If you feel a little like poor Fred, this article may help you get a handle on the topic.

For starters, XML, or Extensible Markup Language, differs from HTML in its relationship to SGML, the Standard Generalized Markup Language. XML is a simplified subset of SGML, whereas HTML is an application of SGML: a single, fixed vocabulary of tags defined with it. More importantly, HTML mainly describes how to display data, while XML describes what the data means.

In other words, XML provides a format for describing structured data. The language is still in an early (1.0) phase of its evolution. It formalizes several aspects of the more relaxed syntax of HTML. For example, XML elements must be nested correctly, and every element must be closed. Attribute values must also be quoted: the construct <hr size=5> is acceptable in HTML, but it must be written as <hr size="5"/> in XML, with the trailing slash closing the empty element. XML has numerous other formal conventions as well, such as case-sensitive tag names and a formalized XML declaration (currently in the form <?xml version="1.0"?>) at the beginning of a document, to name a few.
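The rules above are easy to try out with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module rather than any Microsoft tool; the helper name is_well_formed and the sample strings are made up for illustration. It simply reports whether each fragment parses as well-formed XML.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the string parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Illustrative fragments only; each one exercises a rule described above.
samples = {
    "unquoted attribute": "<hr size=5>",
    "quoted but unclosed": '<hr size="5">',
    "quoted and closed": '<hr size="5"/>',
    "badly nested": "<b><i>text</b></i>",
    "properly nested": "<b><i>text</i></b>",
    "mismatched case": "<Note>text</note>",
}

for label, markup in samples.items():
    status = "well formed" if is_well_formed(markup) else "rejected"
    print(f"{label:20} {markup:24} {status}")
```

Only the "quoted and closed" and "properly nested" fragments are accepted. An HTML browser would render most of the others without complaint, which is exactly the looseness XML removes.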

Perhaps most importantly, XML is, as its name implies, an extensible language, in which developers can define their own tags to describe virtually any data structure they wish.
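As a rough illustration of what "defining your own tags" means in practice, the sketch below builds a small document from an invented vocabulary and reads it back. It uses Python's standard xml.etree.ElementTree module; the tag and attribute names (order, customer, item, sku, price) are assumptions made up for this example and belong to no standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: these tag names were invented for this example.
order = ET.Element("order", id="1001")
ET.SubElement(order, "customer").text = "Fred"
item = ET.SubElement(order, "item", sku="XML-101")
ET.SubElement(item, "description").text = "Intro to XML handbook"
ET.SubElement(item, "price", currency="USD").text = "19.95"

# Serialize the tree to text, as it would be stored or sent over the wire.
document = ET.tostring(order, encoding="unicode")
print(document)

# Read it back: the tag names tell a program what each value means.
parsed = ET.fromstring(document)
customer_name = parsed.findtext("customer")
price = parsed.find("item/price")
print(customer_name, price.text, price.get("currency"))
```

Nothing in the parser knows what an "order" is. The meaning comes from the agreement between whoever writes the document and whoever reads it, which is where schemas come in.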

There are a number of other XML-related technologies and terms that figure strongly in Microsoft's application of the language:

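  • XML DOM, the Document Object Model, exposes a parsed document as a tree of objects that programs can read and modify
  • Schemas provide validation of a document's structure and content, beyond the basic requirement that the document be "well formed"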
  • XPath provides XML query capabilities, and
  • XSLT allows for the transformation of XML data into other structures.
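Two of the items in this list, the DOM and XPath-style queries, can be sketched with nothing more than Python's standard library. This is an illustration of the concepts, not of Microsoft's MSXML component: xml.dom.minidom stands in for the DOM, and xml.etree.ElementTree supports only a limited subset of XPath. The catalog document is invented for the example. Schema validation and XSLT are left out because the Python standard library does not provide them.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Invented sample data for illustration.
CATALOG = """<?xml version="1.0"?>
<catalog>
  <book id="b1" topic="xml"><title>XML Basics</title></book>
  <book id="b2" topic="html"><title>HTML Basics</title></book>
  <book id="b3" topic="xml"><title>XSLT Basics</title></book>
</catalog>"""

# DOM: the document becomes a tree of node objects that code can walk.
dom = minidom.parseString(CATALOG)
titles = [node.firstChild.data for node in dom.getElementsByTagName("title")]
print(titles)

# XPath-style query: select only the books whose topic attribute is "xml".
root = ET.fromstring(CATALOG)
xml_titles = [
    book.findtext("title") for book in root.findall("./book[@topic='xml']")
]
print(xml_titles)
```

The DOM walk visits every title, while the path expression filters on an attribute in a single line. That difference is the reason a query language sits alongside the object model.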

Microsoft did not invent XML, nor does it control the language (yet). The company does, however, provide a COM component (technically a DLL, or dynamic-link library) that delivers these services, as well as a validating parser (DTD and Schema) and remoting over HTTP via the XMLHTTPRequest object. More information on these and other important XML concepts is available at www.msdn.microsoft.com/xml.

So, why are developers looking to XML as a potential solution for data-driven applications? Schmidt explains that, as TCP/IP-based applications (networks, email, FTP, etc.) began to require a user interface, HTML became increasingly important as a presentation-level protocol. However, as the demand for more innovative services drove developers toward ever more database-driven Web applications, they hit the wall of HTML's considerable limitations in this regard. Ironically, one of the leading Web application technologies (by most accounts, the leading one) is none other than Microsoft's own Active Server Pages (ASP). It may therefore be surprising to hear Microsoft staffers stand up on a stage and proclaim that "using ASP to write HTML out to the browser is a sin now." In the new order, XML is perceived as a much easier and more extensible way of getting data to the browser.

However, there are some potential snags. In the current generation of XSLT tools from Microsoft, at least, the HTML is embedded in the style sheet and applied when the file is sent out to the customer's browser. This usually means that a style sheet-compliant browser is required to view the documents. This isn't a problem for users of Internet Explorer, Netscape, Opera, iCab and a handful of other full-featured browsers. However, it may present an issue to developers targeting the emerging class of Web-enabled screen phones and other so-called "microbrowser" products. Fortunately, it is possible to query end users' browsers with JavaScript or a similar scripting language and send appropriate code as necessary.

Microsoft has bet heavily on XML. In addition to building XML functionality into virtually all aspects of Office 2000, the company has made the technology central to nearly all of its new server-side tools due in 2000.

You'll find XML support in:

  • BizTalk Server 2000 (expected in 2001)
  • Commerce Server 2000
  • SQL Server 2000
  • SOAP (Simple Object Access Protocol) 1.1
  • MSXML

BizTalk and SQL Server 2000 are particularly interesting as XML enablers. Using schema mapping tools that Microsoft provides within its BizTalk framework, developers can transform data from virtually any format to any other. For example, if you received data in an unexpected format (e.g., attribute-based), you could use the BizTalk Mapper (which in turn uses XSLT) to transform it to an element-based destination format compatible with, for example, SQL Server. This transformation is practically as simple as drawing lines from one list on the left side of the screen to another list on the right, matching up source and desired destination fields. The result is a mapping file that can then be applied to any subsequent data in the same source format. As a result, Web applications can have direct access to XML data in essentially any format the developer requires.
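To make the attribute-based versus element-based distinction concrete, here is a minimal sketch of that kind of conversion. It is plain Python using the standard xml.etree.ElementTree module, not the BizTalk Mapper and not XSLT, and the function name attributes_to_elements and the customer data are hypothetical. It turns every attribute into a child element of the same name.

```python
import xml.etree.ElementTree as ET


def attributes_to_elements(source: ET.Element) -> ET.Element:
    """Return a copy of `source` with each attribute rewritten as a child element.

    Hypothetical helper for illustration; a real mapping tool would also
    handle renaming, reordering and type conversion.
    """
    target = ET.Element(source.tag)
    # Keep any meaningful text content of the original element.
    if source.text and source.text.strip():
        target.text = source.text
    # Each attribute name becomes a tag; its value becomes that tag's text.
    for name, value in source.attrib.items():
        ET.SubElement(target, name).text = value
    # Convert nested elements the same way.
    for child in source:
        target.append(attributes_to_elements(child))
    return target


# Attribute-based source format (invented sample data).
SOURCE = '<customers><customer id="42" name="Fred" city="Orlando"/></customers>'

result = attributes_to_elements(ET.fromstring(SOURCE))
print(ET.tostring(result, encoding="unicode"))
```

The printed document carries the same data as the source, with id, name and city now appearing as child elements of customer instead of attributes. A mapping file produced by a graphical tool records this kind of field-to-field correspondence so it can be reapplied to later documents.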

For Further Reading:

  • There are several resources XML enthusiasts can turn to for further information. Most notable is BizTalk.org. Here, you'll find the first open Web schema library, tools to register your own XML schemas, 400+ third-party XML schemas and links to over 150 other XML developer organizations. Dig in!
  • xmlsucks.org - Does XML Suck?
  • Slashdot: After an earlier discussion on why XML sucks for programmers, Tim Bray clarifies his stance on his co-creation, XML, and gets back on his pulpit to declare that XML Doesn’t Suck. [Mar. 28, 2003]
