If you create a Web site, it is good practice to declare the encoding. Properly encoded Web pages declare the encoding to a browser through a meta tag in the header. Without this tag, a browser may not know to switch to the proper encoding and characters may be displayed as gibberish.
Some example declarations for common encodings are given below. If you are not sure which encoding to declare, you may want to refer to the individual By Language Page or look at which encoding is declared in other Web sites written in the language.
The encoding meta tag is placed in the header. The encoding name (e.g. utf-8 for Unicode) is given after charset= at the end of the tag's content attribute.
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
...
</head>
In XHTML there are two declarations: the encoding attribute in the initial XML declaration and the charset meta tag (with a final slash). Both should be included for cross-browser compatibility.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<meta http-equiv="content-type" content="application/xhtml+xml; charset=UTF-8" />
...
</head>
Note: These tags should be included even though XML is theoretically Unicode by default. Not all browsers will parse a page as Unicode unless the meta tag is present.
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
...
</head>
NOTE: It IS good practice to declare the encoding even for an English Web site. One function of this tag is to "reset" the user's browser back to Latin-1 and ensure proper font settings. The Unicode "utf-8" encoding also ensures that any special characters inserted, such as "smart quotes", currency symbols, em-dashes and so forth, will be properly displayed in most browsers.
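The point about special characters can be checked with a short Python sketch. It is only an illustration: the sample string and the helper name `can_encode` are made up for this example, and the check uses Python's strict ISO-8859-1 codec, which may be less forgiving than what a given browser does with a Latin-1 page.

```python
# Characters the note mentions: curly quotes, a currency symbol (euro), an em dash.
# They are written as escapes so the example itself stays plain ASCII.
sample = "\u201cSmart quotes\u201d, \u20ac, \u2014"


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character in text can be stored in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


for name in ("utf-8", "iso-8859-1"):
    print(name, can_encode(sample, name))
```

Under these assumptions the sample fits in utf-8 but not in strict Latin-1, which is one reason to prefer utf-8 when such characters may appear in a page.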
<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1251">
...
</head>
See the individual By Language Page for other encodings, or open pages written in your script and use the View Source window to see which encodings are generally used.
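Instead of reading the source by eye, the declared encoding can also be pulled out of a page programmatically. The following is a minimal sketch using Python's standard `html.parser` module. The class `CharsetFinder` and the function `declared_charset` are hypothetical names introduced here, and the sketch only looks at the `http-equiv="Content-Type"` form of the meta tag shown on this page.

```python
from html.parser import HTMLParser
from typing import Optional


class CharsetFinder(HTMLParser):
    """Records the first charset declared in a Content-Type meta tag."""

    def __init__(self) -> None:
        super().__init__()
        self.charset: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only the first matching meta tag is used.
        if tag != "meta" or self.charset is not None:
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "content-type":
            return
        # The content attribute looks like "text/html; charset=utf-8".
        for part in attr.get("content", "").split(";"):
            key, _, value = part.strip().partition("=")
            if key.strip().lower() == "charset" and value:
                self.charset = value.strip().lower()


def declared_charset(html_source: str) -> Optional[str]:
    """Return the declared charset in lower case, or None if none is found."""
    finder = CharsetFinder()
    finder.feed(html_source)
    return finder.charset


page = (
    "<head>\n"
    '<meta http-equiv="Content-Type" content="text/html; charset=windows-1251">\n'
    "</head>"
)
print(declared_charset(page))
```

The same function handles the XHTML form with a final slash, because the parser reports self-closing tags through the same start-tag handler.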
If no encoding is declared, the browser uses its default setting, which in the U.S. is typically Latin-1. If the page is actually in some other script but no encoding is specified, the browser will use a Roman alphabet font and display gibberish.
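The gibberish effect can be reproduced in a few lines of Python. This is a sketch under stated assumptions: the sample word (Russian "Привет") and the variable names are chosen for illustration, and decoding with `iso-8859-1` stands in for a browser falling back to a Latin-1 default.

```python
# The Russian word "Привет", written with escapes so the example stays plain ASCII.
text = "\u041f\u0440\u0438\u0432\u0435\u0442"

# The bytes a server would send for a page saved as windows-1251.
raw = text.encode("windows-1251")

# With no declaration, a Latin-1 default maps each byte to a Western character.
wrong = raw.decode("iso-8859-1")

# With the correct declaration, the same bytes decode back to the original word.
right = raw.decode("windows-1251")

print(wrong)  # Ïðèâåò
print(right)  # Привет
```

The bytes are identical in both cases. Only the encoding used to interpret them differs, which is exactly the information the meta tag supplies.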
