Latin-1 Entities

The following table gives the character entity reference, decimal character reference, and hexadecimal character reference for 8-bit characters in the Latin-1 (ISO-8859-1) character set, as well as the rendering of each in your browser. Glyphs of the characters are available at the Unicode Consortium.

Character, Entity, Decimal, Hex, Rendering in Your Browser (Entity, Decimal, Hex)
no-break space = non-breaking space
inverted exclamation mark ¡ ¡ ¡ ¡ ¡ ¡
cent sign ¢ ¢ ¢ ¢ ¢ ¢
pound sign £ £ £ £ £ £
currency sign ¤ ¤ ¤ ¤ ¤ ¤
yen sign = yuan sign ¥ ¥ ¥ ¥ ¥ ¥
broken bar = broken vertical bar ¦ ¦ ¦ ¦ ¦ ¦
section sign § § § § § §
diaeresis = spacing diaeresis ¨ ¨ ¨ ¨ ¨ ¨
copyright sign © © © © © ©
feminine ordinal indicator ª ª ª ª ª ª
left-pointing double angle quotation mark = left pointing guillemet « « « « « «
not sign ¬ ¬ ¬ ¬ ¬ ¬
soft hyphen = discretionary hyphen ­ ­ ­ ­ ­ ­
registered sign = registered trade mark sign ® ® ® ® ® ®
macron = spacing macron = overline = APL overbar ¯ ¯ ¯ ¯ ¯ ¯
degree sign ° ° ° ° ° °
plus-minus sign = plus-or-minus sign ± ± ± ± ± ±
superscript two = superscript digit two = squared ² ² ² ² ² ²
superscript three = superscript digit three = cubed ³ ³ ³ ³ ³ ³
acute accent = spacing acute ´ ´ ´ ´ ´ ´
micro sign µ µ µ µ µ µ
pilcrow sign = paragraph sign ¶ ¶ ¶ ¶ ¶ ¶
middle dot = Georgian comma = Greek middle dot · · · · · ·
cedilla = spacing cedilla ¸ ¸ ¸ ¸ ¸ ¸
superscript one = superscript digit one ¹ ¹ ¹ ¹ ¹ ¹
masculine ordinal indicator º º º º º º
right-pointing double angle quotation mark = right pointing guillemet » » » » » »
vulgar fraction one quarter = fraction one quarter ¼ ¼ ¼ ¼ ¼ ¼
vulgar fraction one half = fraction one half ½ ½ ½ ½ ½ ½
vulgar fraction three quarters = fraction three quarters ¾ ¾ ¾ ¾ ¾ ¾
inverted question mark = turned question mark ¿ ¿ ¿ ¿ ¿ ¿
Latin capital letter A with grave = Latin capital letter A grave À À À À À À
Latin capital letter A with acute Á Á Á Á Á Á
Latin capital letter A with circumflex Â Â Â Â Â Â
Latin capital letter A with tilde Ã Ã Ã Ã Ã Ã
Latin capital letter A with diaeresis Ä Ä Ä Ä Ä Ä
Latin capital letter A with ring above = Latin capital letter A ring Å Å Å Å Å Å
Latin capital letter AE = Latin capital ligature AE Æ Æ Æ Æ Æ Æ
Latin capital letter C with cedilla Ç Ç Ç Ç Ç Ç
Latin capital letter E with grave È È È È È È
Latin capital letter E with acute É É É É É É
Latin capital letter E with circumflex Ê Ê Ê Ê Ê Ê
Latin capital letter E with diaeresis Ë Ë Ë Ë Ë Ë
Latin capital letter I with grave Ì Ì Ì Ì Ì Ì
Latin capital letter I with acute Í Í Í Í Í Í
Latin capital letter I with circumflex Î Î Î Î Î Î
Latin capital letter I with diaeresis Ï Ï Ï Ï Ï Ï
Latin capital letter ETH Ð Ð Ð Ð Ð Ð
Latin capital letter N with tilde Ñ Ñ Ñ Ñ Ñ Ñ
Latin capital letter O with grave Ò Ò Ò Ò Ò Ò
Latin capital letter O with acute Ó Ó Ó Ó Ó Ó
Latin capital letter O with circumflex Ô Ô Ô Ô Ô Ô
Latin capital letter O with tilde Õ Õ Õ Õ Õ Õ
Latin capital letter O with diaeresis Ö Ö Ö Ö Ö Ö
multiplication sign × × × × × ×
Latin capital letter O with stroke = Latin capital letter O slash Ø Ø Ø Ø Ø Ø
Latin capital letter U with grave Ù Ù Ù Ù Ù Ù
Latin capital letter U with acute Ú Ú Ú Ú Ú Ú
Latin capital letter U with circumflex Û Û Û Û Û Û
Latin capital letter U with diaeresis Ü Ü Ü Ü Ü Ü
Latin capital letter Y with acute Ý Ý Ý Ý Ý Ý
Latin capital letter THORN Þ Þ Þ Þ Þ Þ
Latin small letter sharp s = ess-zed ß ß ß ß ß ß
Latin small letter a with grave = Latin small letter a grave à à à à à à
Latin small letter a with acute á á á á á á
Latin small letter a with circumflex â â â â â â
Latin small letter a with tilde ã ã ã ã ã ã
Latin small letter a with diaeresis ä ä ä ä ä ä
Latin small letter a with ring above = Latin small letter a ring å å å å å å
Latin small letter ae = Latin small ligature ae æ æ æ æ æ æ
Latin small letter c with cedilla ç ç ç ç ç ç
Latin small letter e with grave è è è è è è
Latin small letter e with acute é é é é é é
Latin small letter e with circumflex ê ê ê ê ê ê
Latin small letter e with diaeresis ë ë ë ë ë ë
Latin small letter i with grave ì ì ì ì ì ì
Latin small letter i with acute í í í í í í
Latin small letter i with circumflex î î î î î î
Latin small letter i with diaeresis ï ï ï ï ï ï
Latin small letter eth ð ð ð ð ð ð
Latin small letter n with tilde ñ ñ ñ ñ ñ ñ
Latin small letter o with grave ò ò ò ò ò ò
Latin small letter o with acute ó ó ó ó ó ó
Latin small letter o with circumflex ô ô ô ô ô ô
Latin small letter o with tilde õ õ õ õ õ õ
Latin small letter o with diaeresis ö ö ö ö ö ö
division sign ÷ ÷ ÷ ÷ ÷ ÷
Latin small letter o with stroke = Latin small letter o slash ø ø ø ø ø ø
Latin small letter u with grave ù ù ù ù ù ù
Latin small letter u with acute ú ú ú ú ú ú
Latin small letter u with circumflex û û û û û û
Latin small letter u with diaeresis ü ü ü ü ü ü
Latin small letter y with acute ý ý ý ý ý ý
Latin small letter thorn þ þ þ þ þ þ
Latin small letter y with diaeresis ÿ ÿ ÿ ÿ ÿ ÿ
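
The three reference forms listed in the table can also be generated rather than typed by hand. The following is a minimal Python sketch, assuming the standard library's `html.entities.codepoint2name` mapping of HTML 4 entity names; the function names `latin1_references` and `latin1_table` are illustrative and do not come from the table above.

```python
from html import unescape
from html.entities import codepoint2name


def latin1_references(code_point):
    """Return (entity, decimal, hex) references for one Latin-1 code point."""
    name = codepoint2name[code_point]  # e.g. 169 -> "copy"
    return (f"&{name};", f"&#{code_point};", f"&#x{code_point:X};")


def latin1_table():
    """Build one row per code point in the range the table covers (160-255)."""
    rows = []
    for code_point in range(0xA0, 0x100):
        entity, decimal, hexadecimal = latin1_references(code_point)
        # All three references should decode to the same character.
        assert unescape(entity) == unescape(decimal) == unescape(hexadecimal)
        assert unescape(entity) == chr(code_point)
        rows.append((chr(code_point), entity, decimal, hexadecimal))
    return rows


if __name__ == "__main__":
    for character, entity, decimal, hexadecimal in latin1_table():
        print(entity, decimal, hexadecimal, character)
```

Each row holds the character followed by its entity, decimal, and hexadecimal references, which is the same information the first four columns of the table present.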

What is the maximum length of a URL?

Microsoft Internet Explorer (Browser)

Microsoft states that the maximum length of a URL in Internet Explorer is 2,083 characters, with no more than 2,048 characters in the path portion of the URL. In my tests, attempts to use URLs longer than this produced a clear error message in Internet Explorer.
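
A quick way to check a URL against these two figures is sketched below in Python. It is only an illustration: the function name `fits_internet_explorer` is made up here, and it assumes that the "path portion" means everything after the scheme and host (path, query string, and fragment), which the statement above does not spell out.

```python
from urllib.parse import urlsplit, urlunsplit

IE_MAX_URL = 2083   # total characters, the figure quoted above
IE_MAX_PATH = 2048  # characters in the path portion, the figure quoted above


def fits_internet_explorer(url):
    """Return True if the URL is within both limits quoted above.

    Assumption: the "path portion" is the path, query string, and
    fragment taken together, without the scheme and host.
    """
    parts = urlsplit(url)
    # Rebuild only the part of the URL that follows the host name.
    path_portion = urlunsplit(("", "", parts.path, parts.query, parts.fragment))
    return len(url) <= IE_MAX_URL and len(path_portion) <= IE_MAX_PATH
```

For example, `fits_internet_explorer("http://example.com/?q=" + "a" * 2100)` returns `False`, because the URL as a whole exceeds 2,083 characters.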

Firefox (Browser)

After 65,536 characters, the location bar no longer displays the URL in Windows Firefox 1.5.x. However, longer URLs will work. I stopped testing after 100,000 characters.

Safari (Browser)

At least 80,000 characters will work. I stopped testing after 80,000 characters.

Opera (Browser)

At least 190,000 characters will work. I stopped testing after 190,000 characters. Opera 9 for Windows continued to display a fully editable, copyable and pasteable URL in the location bar even at 190,000 characters.

Apache (Server)

My early attempts to measure the maximum URL length in web browsers bumped into a server URL length limit of approximately 4,000 characters, after which Apache produced a “413 Entity Too Large” error. I used the current, up-to-date Apache build found in Red Hat Enterprise Linux 4. The official Apache documentation mentions only an 8,192-byte limit on an individual field in a request.

Microsoft Internet Information Server (Server)

The default limit is 16,384 characters (yes, Microsoft’s web server accepts longer URLs than Microsoft’s web browser). This is configurable.

Perl HTTP::Daemon (Server)

Up to 8,000 bytes will work. Those constructing web application servers with Perl’s HTTP::Daemon module will encounter a 16,384-byte limit on the combined size of all HTTP request headers. This does not include POST-method form data, file uploads, etc., but it does include the URL. In practice this resulted in a 413 error when a URL was significantly longer than 8,000 characters. This limitation can be easily removed: look for all occurrences of 16*1024 in Daemon.pm and replace them with a larger value. Of course, this does increase your exposure to denial-of-service attacks.

Recommendations

Extremely long URLs are usually a mistake. URLs over 2,000 characters will not work in the most popular web browser. Don’t use them if you intend your site to work for the majority of Internet users.

When you wish to submit a form containing many fields, which would otherwise produce a very long URL, the standard solution is to use the POST method rather than the GET method:

<form action="myscript.php" method="POST">
...
</form>

The form fields are then transmitted as part of the HTTP transaction body, not as part of the URL, and are not subject to the URL length limit.

Short-lived information should not be stored in URLs. As a rule of thumb, if a piece of information isn’t needed to regenerate the same page as a result of returning to a favorite or bookmark, then it doesn’t belong in the URL.
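
The difference between the two methods can be seen without sending anything over the network. The Python sketch below builds both kinds of request with the standard `urllib` module; the host `example.com` and the 200 generated field names are placeholders invented for the illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical form data: 200 fields, enough to exceed a 2,000-character URL.
fields = {f"field{i}": "x" * 10 for i in range(200)}
encoded = urlencode(fields)

# GET: the fields become part of the URL itself.
get_request = Request("http://example.com/myscript.php?" + encoded)

# POST: the URL stays short and the fields travel in the request body.
post_request = Request(
    "http://example.com/myscript.php",
    data=encoded.encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

print(len(get_request.full_url))   # grows with every field added
print(len(post_request.full_url))  # stays at the length of the script URL
print(post_request.get_method())   # POST, because a body is attached
```

Only the GET request's URL grows as fields are added; the POST request keeps the same short URL no matter how much form data it carries.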

The Bookmark Problem

In very rare cases, it may be useful to keep a large amount of “state” information in a URL. For instance, users of a map-navigating website might wish to add the currently displayed map to their “bookmarks” or “favorites” list and return later. If you must do this and your URLs are approaching 2,000 characters in length, keep your representation of the information as compact as you can, squeezing out as much “air” as possible. If your field names take up too much space, use a fixed field order instead. Squeeze out any field that doesn’t really need to be bookmarked. And avoid large decimal numbers: use only as much accuracy as you must, and consider a base-64 representation using letters and digits (I didn’t say this was easy).

In extreme cases, consider using the gzip algorithm to compress your pretty but excessively long URL. Then re-encode that binary data in base64 using only characters that are legal in URLs. This can yield a 3-4x space gain, at the cost of some CPU time when you unzip the URL again on the next visit. Again, I never said it was easy!
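
The compress-then-encode idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: it uses the standard `zlib` module (the same deflate compression that gzip uses, with a smaller wrapper) in place of a full gzip container, and the names `pack_state` and `unpack_state` are invented here. The actual space saving depends heavily on how repetitive the state is.

```python
import base64
import zlib


def pack_state(state):
    """Compress a state string and encode it with URL-safe base64."""
    compressed = zlib.compress(state.encode("utf-8"), 9)
    # urlsafe_b64encode uses "-" and "_" instead of "+" and "/".
    # The "=" padding is stripped because it can be restored on decode.
    return base64.urlsafe_b64encode(compressed).rstrip(b"=").decode("ascii")


def unpack_state(token):
    """Reverse pack_state: restore padding, decode, and decompress."""
    padding = "=" * (-len(token) % 4)
    compressed = base64.urlsafe_b64decode(token + padding)
    return zlib.decompress(compressed).decode("utf-8")
```

A token produced by `pack_state` contains only letters, digits, `-`, and `_`, so it can be placed in a query string without further escaping, and `unpack_state` recovers the original string on the next visit.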

An alternative is to store the state information in a file or a database. Then you can store only the identifier needed to look up that information again in the URL. The disadvantage here is that you will have many state files or database records, some of which might be linked to from websites run by others. One solution to this problem is to delete the state files or database records for the URLs that have not been revisited after a certain amount of time.
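
A rough Python sketch of this approach follows. It keeps the records in memory purely for illustration, where a real site would use files or a database as described above; the class name `StateStore`, its methods, and the `now` parameter (which exists only to make the expiry logic easy to exercise) are all hypothetical.

```python
import secrets
import time


class StateStore:
    """In-memory store mapping short identifiers to saved state."""

    def __init__(self, max_age_seconds):
        self.max_age_seconds = max_age_seconds
        self._records = {}  # identifier -> (state, time of last visit)

    def save(self, state, now=None):
        """Store the state and return a short identifier for the URL."""
        now = time.time() if now is None else now
        identifier = secrets.token_urlsafe(8)
        self._records[identifier] = (state, now)
        return identifier

    def load(self, identifier, now=None):
        """Return the saved state, or None if it is unknown or was purged."""
        now = time.time() if now is None else now
        record = self._records.get(identifier)
        if record is None:
            return None
        state, _ = record
        self._records[identifier] = (state, now)  # a visit resets the clock
        return state

    def purge(self, now=None):
        """Delete records not revisited within max_age_seconds."""
        now = time.time() if now is None else now
        stale = [key for key, (_, last_visit) in self._records.items()
                 if now - last_visit > self.max_age_seconds]
        for key in stale:
            del self._records[key]
        return len(stale)
```

The URL then only needs to carry the identifier returned by `save`, and calling `purge` periodically removes records that nobody has come back to.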

“What happens if the URL is too long for the server?”

What exactly happens if a browser that supports very long URLs (such as Firefox) submits a long URL to a web server that does not support very long URLs (such as a standard build of Apache)?

The answer: nothing dramatic. Apache responds with a “413 Entity Too Large” error, and the request fails.

This response is preferable to cutting the URL short, because the results of cutting the URL short are unpredictable. What would that mean to the web application? It varies. So it’s better for the request to fail.

In the bad old days, some web servers and web browsers failed to truncate or ignore long URLs, resulting in dangerous “buffer overflow” situations. These could be used to insert executable code where it didn’t belong… resulting in a security hole that could be exploited to do bad things.

These days, the major browsers and servers are secure against such obvious attacks – although more subtle security flaws are often discovered (and, usually, promptly fixed).

While it’s true that modern servers are themselves well-secured against long URLs, there are still badly written CGI programs out there. Those who write CGI programs in C and other low-level languages must take responsibility for paying close attention to potential buffer overflows. The CGIC library can help with this.

In any case, if you’re a web developer and you’re still asking this question, then you probably haven’t paid attention to my advice about how to avoid the problem completely.