XML in a Citizen-Centered E-Government: Having our Cake and Eating It Too
Owen Ambur, Co-Chair, XML Working Group
November 9, 2001

If something sounds too good to be true, it usually is. There's no such thing as a "free lunch" and raising false expectations is a sure path to disappointment, if not outright failure. So the last thing we should do is oversell the benefits of eXtensible Markup Language (XML). As has often been said, XML is merely a syntax - a way of structuring information. Moreover, structure without substance is meaningless ... or worse.

In many instances, too much structure is a bigger part of the problem than its solution. We've all experienced hierarchies that have outlived their useful lives and cannot cope with rapidly changing realities in the cyberage, much less add real value in the best interests of their stakeholders. However, as inefficient and ineffective as inflexible, poorly designed, and outmoded structures can be, some degree of commonality is required for the conveyance of meaning not only among human beings but especially among machines. It is a truism that each of us cannot manage our own lives, much less interact productively together in organizations, large and small, without some degree of structure.

And that generalized truism becomes an absolute requirement in the realm of machines. It has been widely suggested, for example, that the beauty of the Web is its "openness" and lack of structure, particularly the lack of any centralized authority controlling what may be posted. However, without standards like Internet Protocol (IP), Hypertext Transfer Protocol (HTTP), and Hypertext Markup Language (HTML), the Web just ain't happenin'. The truth is that the beauty of the Internet and the Web is in their structures. For it is those standards which enable us to use computers to express ourselves more openly and efficiently, thereby freeing the collective creativity of the human spirit.

Apart from gestures that can be interpreted logically and nonverbal expressions that seem to be universal regardless of culture, human beings simply cannot communicate without a common language in which words and phrases have widely recognized meanings. XML is not a "language" in that sense, since in and of itself, it does not convey meaning. However, it enables us to "mark up," that is, to add structure to our documents so as not only to make them more readily comprehensible by people but particularly to enable computers to analyze, manipulate, display, re-purpose and reuse them in ways that are highly useful to us. And that is what technology should be about - serving the needs and wishes of people, thereby enabling us to lead "freer," more productive, meaningful, prosperous, enjoyable, and fulfilling lives.

As human beings, we have a marvelous and indeed a wondrous ability that computers lack. We can fill in gaps and apply our own understandings and interpretations where the full, express meanings are lacking. Of course, to the degree that our perceptions and interpretations may differ from those intended by others, those capabilities can get us into a lot of trouble, including not only hard feelings and physical conflict on a personal level but also lawsuits among business partners and, ultimately, war among nations. In effect, we need to help ourselves avoid such unintended consequences - by helping those poor "dumb" computers do a better job of helping us understand each other.

So what does that mean to us in our daily lives as government employees? What should we be doing about it? And how can we work most effectively together to capitalize on the potential of XML in pursuit of our agency missions and the priorities established for us, such as those outlined in the Administration's Citizen-Centered E-Government Action Plan?

Two of the original proposals that led the CIO Council to charter the XML Working Group concerned the potential to:

1) use XML metadata tags to classify, manage, access, and retrieve electronic records, including Web pages, Governmentwide, and

2) render Government forms in XML and gather the data from those forms in XML.

Rendering forms in XML means that they can be filled in with a Web "browser." Gathering the data in XML means that it can readily be captured, manipulated, and analyzed in databases while at the same time the original, completed forms can be maintained as inviolate records for the appropriate periods specified in agency records retention schedules. Maintaining the original, completed forms as E-records apart from the databases provides for redundancy in the event of disaster, a requirement made gravely more apparent by the events of September 11. It also enhances security and the protection of privacy in cases where personal or other sensitive data may be required for authentication of each submission but may not be appropriate for inclusion in a database, where it is subject to the risks of inappropriate access, modification, and use.
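The separation between the inviolate record and the database can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the form layout, the element names (formSubmission, applicantName, taxpayerId, requestedAmount), and the choice of which field counts as sensitive are all hypothetical and are not drawn from any actual agency form.

```python
import xml.etree.ElementTree as ET

# Hypothetical completed form; every element name here is an assumption.
COMPLETED_FORM = """<formSubmission formId="EX-100">
  <applicantName>Jane Doe</applicantName>
  <taxpayerId>000-00-0000</taxpayerId>
  <requestedAmount>1250.00</requestedAmount>
</formSubmission>"""

# Fields needed to authenticate the submission but kept out of the database.
SENSITIVE_FIELDS = {"taxpayerId"}


def extract_database_row(form_xml: str) -> dict:
    """Return the non-sensitive fields of a completed form as a flat dict."""
    root = ET.fromstring(form_xml)
    row = {"formId": root.get("formId")}
    for child in root:
        if child.tag not in SENSITIVE_FIELDS:
            row[child.tag] = (child.text or "").strip()
    return row


# The original XML text is left untouched and can be stored as the record;
# only the extracted row would be loaded into a database.
row = extract_database_row(COMPLETED_FORM)
print(row)
```

The completed form itself is never modified, so it can be retained as the E-record for the period the retention schedule requires, while the database holds only what is appropriate for analysis.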

Equally important, maintaining records apart from the databases facilitates audits of those databases. Indeed, if the original E-records are gathered and maintained in well-formed XML, audits can be largely automated. Not only does that mean auditors can devote their attention and talents to much higher-value activities than tracking down pieces of paper to confirm numbers in spreadsheets, but it also means that any stakeholder will be able to conduct his or her own "audit" of the public records of any agency or program, on the Internet, anytime he or she sees fit to do so! And that's a big part of what "citizen-centered" E-government is all about - providing accountability by making the actions of public officials transparent to their stakeholders.
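As a rough sketch of what such an automated audit might look like, the Python below re-reads the original records, confirms that each is well-formed, and compares the amount in each record with the figure held in the database. The record layout, the formId attribute, the requestedAmount element, and the in-memory dictionary standing in for a database are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical original E-records; the third is deliberately not well-formed.
ORIGINAL_RECORDS = [
    '<formSubmission formId="EX-100"><requestedAmount>1250.00</requestedAmount></formSubmission>',
    '<formSubmission formId="EX-101"><requestedAmount>300.50</requestedAmount></formSubmission>',
    '<formSubmission formId="EX-102"><requestedAmount>75.25</requestedAmount>',
]

# Stand-in for the database being audited; EX-101 disagrees with its record.
DATABASE_AMOUNTS = {"EX-100": Decimal("1250.00"), "EX-101": Decimal("305.00")}


def audit(records, database):
    """Return a list of findings where records and database disagree."""
    findings = []
    for position, text in enumerate(records):
        try:
            root = ET.fromstring(text)
        except ET.ParseError:
            findings.append(f"record {position}: not well-formed XML")
            continue
        form_id = root.get("formId")
        recorded = Decimal(root.findtext("requestedAmount"))
        stored = database.get(form_id)
        if stored != recorded:
            findings.append(
                f"{form_id}: record says {recorded}, database says {stored}"
            )
    return findings


findings = audit(ORIGINAL_RECORDS, DATABASE_AMOUNTS)
for finding in findings:
    print(finding)
```

Because the check needs nothing beyond the records themselves and a standard XML parser, the same comparison could in principle be run by anyone with access to the published records.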

An equally big part of citizen-centered government is making our records readily accessible by our stakeholders - in terms that are meaningful to them, rather than merely to us as employees of various offices within bureaucracies. Pursuant to Section 508 of the Rehabilitation Act, as amended by the Workforce Investment Act, the term "accessibility" has come to mean enabling the rendition of information in a manner that is comprehensible to persons whose sensory organs are diminished or disabled. That is certainly an important benefit of XML, which enables the ready reformatting and reuse of electronic records via specialized devices such as computer screen readers. However, that assumes people are able to discover and retrieve the information they need without great difficulty, regardless of whether they are disabled or not. And that is where the use of XML metatags can deliver large gains in efficiency and effectiveness for all of us, by reducing search and retrieval time as well as improving precision.

Metatags are the equivalent of the information provided in a library card catalogue. They identify the attributes of each record that are of significant interest to its stakeholders. By embedding them in records on the Internet, agencies can automate the management and processing of their records while at the same time vastly enhancing the services provided by existing full-text search sites like FirstGov.
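The gain in precision can be illustrated with a small Python sketch that compares a full-text search with a search on an embedded metatag. The two records, the metadata, title, subject, and body element names, and both search functions are hypothetical; they are not the schema of FirstGov or of any agency.

```python
import xml.etree.ElementTree as ET

# Two hypothetical records; element names and contents are illustrative only.
RECORDS = [
    "<record><metadata><title>Small Business Loan Guide</title>"
    "<subject>loans</subject></metadata>"
    "<body>How to apply for a loan from the agency.</body></record>",
    "<record><metadata><title>Park Permit Rules</title>"
    "<subject>permits</subject></metadata>"
    "<body>Permit fees cannot be paid with loans.</body></record>",
]


def full_text_search(records, word):
    """Return titles of records whose text contains the word anywhere."""
    hits = []
    for text in records:
        root = ET.fromstring(text)
        if word.lower() in "".join(root.itertext()).lower():
            hits.append(root.findtext("metadata/title"))
    return hits


def metatag_search(records, subject):
    """Return titles of records whose subject metatag equals the subject."""
    hits = []
    for text in records:
        root = ET.fromstring(text)
        if root.findtext("metadata/subject") == subject:
            hits.append(root.findtext("metadata/title"))
    return hits


full_text_hits = full_text_search(RECORDS, "loans")
metatag_hits = metatag_search(RECORDS, "loans")
print(full_text_hits)
print(metatag_hits)
```

In this sketch the full-text search returns both records, because the word "loans" happens to appear in the permit rules, while the metatag search returns only the record that is actually about loans.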

Since there is no limit on the number of metatags that may be included in each record, agencies can indeed "have their cake and eat it too" - by incorporating not only those metadata elements required to describe their records in terms that are most meaningful to the members of the public who are their stakeholders, but also those that are important for internal administrative purposes. That is a critical distinction relating to the use of metatags, which can be embedded within individual records and maintained by "suppliers" on a widely distributed basis ... while at the same time being dynamically and selectively used in other contexts by "customers" (citizens), as well as by third-party organizations serving more specialized needs and interests.

By contrast, static indices of hypertext links are limited by screen space, reflect the context of the supplier rather than the customer, and are difficult to keep current on a centralized basis. However, XML does not force an either/or choice between metatags that are embedded within widely distributed records and external indices maintained by librarians, authoritative organizations, and other value-added service providers on a centralized basis. By virtue of its "extensibility," XML enables the efficient and cooperative use of both embedded and external metadata.

Indeed, externalized XML metadata may be used to construct "topic maps" that identify relationships and draw linkages among records that may not have occurred to their authors, or that are beyond their authors' capabilities or interests to specify. For example, an agency may not have the resources to identify many of the relationships of its records to those created and maintained by other agencies. However, commercial and nonprofit organizations representing those agencies' stakeholders may be more than willing to do so. Rendering agency records in well-formed XML enables others to provide value-added indexing services efficiently and effectively, building upon and without having to re-create any of the work already done by the agencies themselves. That's another example of how XML can enable us to "have our cake and eat it too" with respect to overcoming the false choice between embedded and external indices.
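A much-simplified sketch of such an external index follows, written in Python. It is not the syntax of any topic map standard; the index, topic, and recordRef element names, the maintaining organization, and the example.org style addresses are all invented for illustration. The point is only that a third party can relate records from different agencies without touching the records themselves.

```python
import xml.etree.ElementTree as ET

# Hypothetical external index kept by a third party, not by the agencies.
EXTERNAL_INDEX = """<index maintainer="Example Nonprofit">
  <topic name="disaster recovery">
    <recordRef agency="Agency A" href="https://agency-a.example/records/17"/>
    <recordRef agency="Agency B" href="https://agency-b.example/records/42"/>
  </topic>
  <topic name="small business">
    <recordRef agency="Agency B" href="https://agency-b.example/records/42"/>
  </topic>
</index>"""


def related_records(index_xml, href):
    """Return other records that share at least one topic with href."""
    root = ET.fromstring(index_xml)
    related = set()
    for topic in root.iter("topic"):
        hrefs = [ref.get("href") for ref in topic.iter("recordRef")]
        if href in hrefs:
            related.update(other for other in hrefs if other != href)
    return sorted(related)


related = related_records(EXTERNAL_INDEX, "https://agency-a.example/records/17")
print(related)
```

Here a record from one agency is linked to a record from another solely through the external index, which neither agency had to create or maintain.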

While XML is still a relatively new technology and the tools to apply it efficiently are immature, it is incumbent upon each of us as public servants to be thinking about and planning how best to use it to address the needs and interests of the citizens who are our stakeholders. Toward that end, the XML Working Group has been chartered by the CIO Council to undertake four general activities:

The Working Group meets monthly. Our meetings are open and participation is encouraged. Likewise, our listserv is open and we are hosting the xml.gov site as a primary means of sharing information with our stakeholders. However, beyond generalized awareness-building, education, and outreach activities, the primary value the Working Group can add is to accelerate the availability of a repository in which "inherently governmental" XML data elements and schemas can be registered and made readily available for use in applications Governmentwide. The registry is the key to reducing, if not eliminating, needless inconsistencies and redundancies among the data stored in ill-coordinated information technology systems that do not interoperate and, thus, constitute "stovepipes" or "islands of information." Moreover, eliminating needless inconsistencies and redundancies is the key not only to making government more efficient and effective, but also to reducing the information collection burdens imposed upon the public.

Unfortunately, life is not as simple as it used to be and the ever-increasing complexities seem at times overwhelming. No technology can insulate us completely from the new realities we face. However, XML embodies a powerful new potential to represent both that which we all have in common and that which makes each of us and our organizations unique. Unlike many resources, knowledge is not "consumed" in the sense of being used up or converted to waste. The more knowledge is used, the more valuable it becomes. While there is no such thing as a free lunch, if we play our XML schemas right, we can have our cake and eat it too - in terms of an ever more rapidly expanding base of common and specialized knowledge, leading inexorably to a brighter future for all of us ... suppliers and customers, public servants and citizens alike!