HyperText Markup Language, or HTML, is the language used to create webpages. All web browsers can parse HTML code and render it as the visual elements that make up a webpage.
HTML has been designed and developed over a long period of time, so several versions and variants of the language exist. The first line of your HTML file should declare the document type (the type and version of the language) used in the page. This allows web browsers and other programs to parse the file against the right language specification.
This declaration at the top of the page is called the Document Type Declaration. It is commonly referred to as the DocType, after the <!DOCTYPE> keyword used to write it. The Document Type Definition, or DTD for short, is the separate file that the declaration points to; the two terms are often used interchangeably, but they are not the same thing. There are a few reasons why a DocType declaration is important in an HTML page.
Syntax Verification: To verify the correctness of the code, it is important to specify the document type, so that a validator checks the code against the exact type and version of the specification.
Faster Parsing: It allows readers, parsers and browsers to choose an appropriate HTML implementation ahead of time, which removes a lot of guesswork. This leads to the next point on the list.
Standards vs Quirks Parsing: It enables standards mode in web browsers. Without it, browsers fall back to what is called quirks mode, which emulates the behavior of older browsers and can make pages render inconsistently.
The DocType declaration should be the very first line in the file, before the html tag. This is possible because the DocType declaration is not an HTML tag; it is an instruction to the parser about the document that follows.
As mentioned, there are several types of HTML: HTML 4.01 Strict, HTML 4.01 Transitional, XHTML 1.0, HTML5 and so on. HTML 4.01 and earlier versions are based on SGML, and XHTML is based on XML (itself a subset of SGML), so they all require a reference to the corresponding DTD. The latest HTML5 specification is not based on SGML and does not need one.
Let’s look at how to write the Document Type Declaration for each of the commonly used HTML types. All of them use the same <!DOCTYPE …> declaration at the top of the page, and it follows a generic pattern. Its parts are as follows, in order of appearance:
Root Element: This is the root element of the document that follows. For HTML and XHTML it is always html; other XML document types name their own root element.
DTD Type: The second part specifies what kind of DTD is referenced. The most common value is PUBLIC, which means the DTD is publicly available. The other value is SYSTEM, which means the DTD is identified only by its location, typically a private or local DTD.
Formal Public Identifier (FPI): If the DTD type is PUBLIC, you need to specify the details of the DTD, such as its provider, name and language. This is required only for PUBLIC DTD types.
URI: This is the location of the DTD. It must point to the DTD that matches the DTD type given in the second part.
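To show how the parts combine when the DTD type is SYSTEM, here is a small sketch. It uses the legacy-compatibility form that the HTML5 specification permits for tools that cannot emit the short declaration. With SYSTEM there is no FPI, so the keyword is followed directly by the URI. The comment on the second line is only an annotation added for illustration.

```html
<!DOCTYPE html SYSTEM "about:legacy-compat">
<!-- root element: html | DTD type: SYSTEM | FPI: none | URI: about:legacy-compat -->
```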
HTML5
<!DOCTYPE html>
The HTML5 declaration is very simple: all it needs is a single part stating that the root element is html. As mentioned previously, it does not need a DTD type, FPI or URI, because HTML5 is not based on SGML.
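For context, a minimal sketch of a complete HTML5 page might look like the following. Only the first line is the DocType declaration; the lang value, title and paragraph text are placeholders chosen for illustration.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Minimal HTML5 page</title>
  </head>
  <body>
    <p>Hello, world.</p>
  </body>
</html>
```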
HTML 4.01 Strict
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
The first part specifies the root element, which is html. The next part, PUBLIC, states that we are using a public DTD, and it is followed by the FPI for that DTD. The FPI states that the DTD is provided by the W3C, that it describes HTML 4.01, and that it is written in English (EN). The last part is the URI of that DTD.
HTML 4.01 Transitional
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
HTML 4.01 Frameset
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">
XHTML 1.1
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
XHTML 1.0 Strict
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
So, what happens if you do not specify the document type? Most parsers and browsers will make a good-faith effort to handle the document anyway. In web browsers, this fallback is known as quirks mode. More often than not, the page will still render acceptably, provided there are no major errors in the code itself, although layout details can differ from standards mode.
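If you want to see which mode a browser picked for a page, one way is to read the document.compatMode property from a script. The page below is a minimal sketch of that idea; the element id and the wording of the message are made up for illustration.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Rendering mode check</title>
  </head>
  <body>
    <p id="mode"></p>
    <script>
      // document.compatMode is "CSS1Compat" in standards mode
      // and "BackCompat" in quirks mode.
      var mode = document.compatMode === "CSS1Compat" ? "standards" : "quirks";
      document.getElementById("mode").textContent = "Rendering mode: " + mode;
    </script>
  </body>
</html>
```

Opened as written, the page should report standards mode. Deleting the first line and reloading should make the same script report quirks mode.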