Construction of HTML5

Construction

The current HTML version (4.01), published as a W3C Recommendation in 1999, defines the language only as an SGML application. In the years since, other recommendations such as XML and the DOM have appeared alongside it. HTML 4.01 was reformulated in XML, which produced XHTML 1.0.

Both languages share an identical vocabulary. In addition, an HTML DOM was specified on this basis, which allows HTML documents to be manipulated dynamically. As a result of these separate publications, there are three specifications that deal with HTML, XHTML and the DOM.

HTML5 will change this. The current draft of HTML5 defines not just a single language but a common vocabulary, of which HTML, XHTML and the DOM are different representations.
The aim is to reduce the differences between the language variants and to achieve better cooperation between the components.
HTML5

HTML is, up to and including version 4.01, an SGML application, but even modern web browsers have no "real" SGML parser and process documents with an HTML or tag-soup parser instead. As a result, certain syntactic rules of SGML cannot be used in HTML in practice.
HTML5 is no longer based on SGML. Instead, it defines its own HTML syntax, modeled on the classic SGML-based HTML. It also defines areas such as error handling. In the process, the specification attempts to follow the existing browser implementations, so the syntax is backward compatible. Browsers will nevertheless have to make changes so that the language is processed identically whatever the browser. Even today, current browsers can process many HTML5 documents (at least in theory, as not all features are implemented yet).
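The tolerant behaviour of a tag-soup parser can be illustrated with a small sketch. It uses Python's standard `html.parser` module, which is a lenient tokenizer and not an implementation of the HTML5 parsing algorithm; the `TagCollector` class, the `collect_tags` helper and the sample markup are assumptions made for this illustration.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records start and end tags in the order the parser reports them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def collect_tags(markup):
    """Feed markup to the lenient parser and return the recorded tag events."""
    parser = TagCollector()
    parser.feed(markup)
    parser.close()
    return parser.events


# Hypothetical "tag soup": two unclosed <p> elements and a mis-nested </b></i> pair.
SOUP = "<p>first<p><b><i>second</b></i>"
events = collect_tags(SOUP)

for kind, tag in events:
    print(kind, tag)
```

The parser reports every tag in source order without raising an error. How the unclosed and mis-nested elements should end up nested in the document tree is the kind of error handling that HTML5 sets out to define.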
XHTML5
The XHTML 1.0 specification allows documents that follow certain rules to be sent with the media type text/html. This is necessary so that XHTML documents can be processed by browsers such as Microsoft's Internet Explorer and Lynx (which then process them as HTML).
In practice, there are some differences between "real" XHTML and XHTML that is sent as text/html. HTML5 aims to reduce these differences to enable a smooth transition between the language variants.
XHTML5 is an XML format and, as such, is subject to the rules for processing XML documents. The HTML5 working group has no influence on these rules, so compromises have to be made in the formulation of the HTML syntax and the DOM. The WHATWG has created a page listing the differences between HTML5 and XHTML5.
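Because XHTML5 follows XML processing rules, markup that an HTML parser would tolerate is rejected outright by an XML parser. The following sketch uses Python's standard `xml.etree.ElementTree` to check two sample documents for well-formedness; `is_well_formed` and both sample strings are hypothetical names chosen for illustration, not part of any specification.

```python
import xml.etree.ElementTree as ET

# A well-formed XHTML document: every element is closed, <br/> is self-closing.
WELL_FORMED = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>t</title></head>"
    "<body><p>first</p><p>second<br/></p></body>"
    "</html>"
)

# Similar content written as tag soup: unclosed <p> and <br> elements.
TAG_SOUP = "<html><body><p>first<p>second<br></body></html>"


def is_well_formed(markup):
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


print(is_well_formed(WELL_FORMED))
print(is_well_formed(TAG_SOUP))
```

The XML parser stops at the first well-formedness error instead of recovering, which is one of the practical differences between the two syntaxes.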

DOM 5
The entire HTML5 specification is written in DOM terminology.
Although not all user agents (rendering programs, search engines) are DOM-capable, HTML5 defines how these programs have to behave when they encounter HTML5 documents. The Document Object Model itself also undergoes changes and enhancements in the specification.

In the future, all HTML elements (which, in contrast to XHTML elements, previously had no namespace) will belong to the XHTML namespace (http://www.w3.org/1999/xhtml).
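What it means for an element to belong to the XHTML namespace can be seen in a DOM built from an XML document. This is a minimal sketch using Python's `xml.dom.minidom`, which is an XML DOM and not an HTML5 parser, so it only shows the XML side; the sample document and the variable names are assumptions.

```python
from xml.dom import minidom

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical XHTML document that declares the XHTML namespace on its root.
doc = minidom.parseString(
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<body><p>text</p></body></html>"
)

# Elements parsed from the document inherit the declared default namespace.
p = doc.getElementsByTagNameNS(XHTML_NS, "p")[0]

# An element created explicitly in the XHTML namespace.
section = doc.createElementNS(XHTML_NS, "section")

# An element created without a namespace, for comparison.
plain = doc.createElement("p")

print(p.localName, p.namespaceURI)
print(section.localName, section.namespaceURI)
print(plain.tagName, plain.namespaceURI)
```

In this XML DOM, the parsed `p` element and the explicitly created `section` element both report the XHTML namespace URI, while the element created with `createElement` has none.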

Scope of the language

HTML5 defines many elements and attributes. On closer inspection, the scope of the language is comparable to that of HTML version 4.01. The specification also has to define how older elements and attributes are to be processed, because HTML5 browsers will need to be able to deal with existing Web pages that were not written according to the rules of this language.
