
SGML (Standard Generalized Markup Language) is an ISO standard (ISO 8879:1986) for defining generalized markup languages for documents.

In 1969, while employed by IBM, Charles Goldfarb headed a research project on integrated law office information systems. Along with Edward Mosher and Raymond Lorie, he invented the Generalized Markup Language (GML) as a means of text editing, formatting, and information retrieval.

Goldfarb continued development of the system, adding new concepts, such as short references, link processes, and concurrent document types, that were later incorporated into SGML. Thus, SGML descended from GML, which not only served as an acronym for Generalized Markup Language, but also reflected the initials of the surnames of its three principal authors, Goldfarb, Mosher, and Lorie.

In 1978, the American National Standards Institute (ANSI) established a committee that became known as the Computer Languages for the Processing of Text committee. Goldfarb was asked to join the committee, which was chaired by Charles Card of Univac, and to lead a project for a text description language standard based on GML. The GCA GenCode committee supported the effort, and SGML was developed collaboratively by the people involved. The first working draft of the SGML standard was published in 1980.

As a document markup language, SGML was originally designed to enable the sharing of machine-readable documents in the military, government, law, aerospace, and other large industries, particularly for large projects whose documents needed to remain readable for several decades. XML, which is derived from SGML, was applied to general-purpose projects on a smaller scale. HTML was defined as an SGML application until HTML5, which abandoned any attempt to define it that way.

Generalized markup is based on two axioms. First, markup should be declarative, describing a document's structure and other attributes rather than specifying the processing to be performed on it, as declarative markup is less likely to conflict with future processing needs and techniques. Second, markup should be rigorous, so that the techniques available for processing rigorously defined objects, like programs and databases, can be used for processing documents as well.
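The first axiom can be illustrated with a small sketch. The Python below is an illustration only, not an SGML processor: it assumes a tiny, well-formed XML fragment (XML being derived from SGML) with the hypothetical element names memo, to, and para, and the functions render_plain and render_html are invented for this example. The markup states only what each part is; each renderer decides separately what to do with it.

```python
import xml.etree.ElementTree as ET

# Declarative markup: element names describe what each part is,
# not how it should be printed. The element names are hypothetical.
SOURCE = "<memo><to>Legal staff</to><para>Filing is due Friday.</para></memo>"


def render_plain(memo):
    """One possible processing: a plain-text layout."""
    lines = ["TO: " + memo.findtext("to")]
    lines += [para.text for para in memo.findall("para")]
    return "\n".join(lines)


def render_html(memo):
    """A second, independent processing of the same markup."""
    parts = ["<h1>Memo to " + memo.findtext("to") + "</h1>"]
    parts += ["<p>" + para.text + "</p>" for para in memo.findall("para")]
    return "".join(parts)


memo = ET.fromstring(SOURCE)
plain_output = render_plain(memo)
html_output = render_html(memo)
```

Because the source never says "indent this" or "use a large font", a third renderer could be added later without touching the document itself.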

SGML is a tagging language that handles logical structures, and forms a file-linking and addressing scheme. It is also a database language for text, serving as a foundation for multimedia and hypertext, and a document representation language for any architecture. It allows coded text to be reused in ways not anticipated by the coder. It is a metalanguage for defining document types, and an extensible document description language. It has served as a standard for communication among different hardware platforms and software applications.
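To give a rough sense of what defining a document type means, the following Python sketch treats a document type as a table of which elements may appear inside which. It is a deliberate simplification and an assumption of this example, not how SGML expresses it: real SGML document type definitions also describe order, repetition, and attributes, and the names MEMO_TYPE and violations are hypothetical. The sample documents are written as well-formed XML so that the standard library can parse them.

```python
import xml.etree.ElementTree as ET

# A toy "document type": each element name maps to the set of child
# elements it may contain. Real SGML content models also express order
# and repetition; this sketch deliberately ignores both.
MEMO_TYPE = {
    "memo": {"to", "para"},
    "to": set(),
    "para": {"emph"},
    "emph": set(),
}


def violations(element, doctype):
    """Return messages for elements the document type does not allow."""
    if element.tag not in doctype:
        return ["undeclared element: " + element.tag]
    problems = []
    for child in element:
        if child.tag not in doctype[element.tag]:
            problems.append(child.tag + " not allowed inside " + element.tag)
        else:
            problems.extend(violations(child, doctype))
    return problems


valid = ET.fromstring(
    "<memo><to>Staff</to><para>Due <emph>Friday</emph>.</para></memo>"
)
invalid = ET.fromstring("<memo><to><para>Misplaced</para></to></memo>")

valid_problems = violations(valid, MEMO_TYPE)
invalid_problems = violations(invalid, MEMO_TYPE)
```

Checking documents against a declared type in this way is one instance of the rigor that lets documents be processed with the same kinds of tools as programs and databases.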

Document markup languages that are defined using SGML are known as SGML applications. Prior to XML, most SGML applications were proprietary to the organizations that developed them, and not available for use on the World Wide Web. Applications developed prior to XML include AAP DTD, CALS, DocBook, EDGAR, HyTime, ISO 12083, LinuxDoc, SGMLguid, and the Text Encoding Initiative (TEI). Several open-source implementations of SGML have also been produced; among the more significant are ARC-SGML, ASP-SGML, Project YAO, SGMLS, and SP.

This guide focuses on Standard Generalized Markup Language (SGML) and its applications, although some of the more significant applications, like XML, may be included in their own separate categories. User forums, tutorials, guides, and informational pages about SGML are also appropriate for this category.


