At best, it’s irritating to get an email that contains unreadable characters. At worst, it can prevent you from reading the mail at all. Sometimes, changing the encoding in Outlook shows those missing characters and lets you read the message. Here’s how to do it.

What is Character Encoding?

If you’re not sure what “character encoding” is, we’ve got a comprehensive explanation for you. The less-comprehensive explanation is that a character is a glyph that appears on screen when you type something. So every letter in this article is a glyph that represents a letter—a, b, c, and so on. Behind the scenes, your computer represents these glyphs using a code that is interpreted by a program—like a web browser or a word processor—and then renders them on screen as a character.
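To make the idea of "a code behind each character" concrete, here is a minimal Python sketch. It only illustrates the general principle; the variable names are made up for this example and have nothing to do with how Outlook works internally.

```python
# Each character is stored as a number (its code).
text = "abc"
codes = [ord(ch) for ch in text]
print(codes)  # [97, 98, 99]

# A program turns those numbers back into characters for display.
restored = "".join(chr(n) for n in codes)
print(restored)  # abc
```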

RELATED: What Are Character Encodings Like ANSI and Unicode, and How Do They Differ?

So far, so simple, especially if you think there are only 26 letters in the alphabet, ten digits, and some punctuation marks and symbols like ! or @.

However, there are also 26 uppercase letters and far more punctuation marks than you might realize (your keyboard only shows a small subset of the possible marks, even for English). And this only covers one language, English, which is written in one alphabet, Latin (also known as the Roman alphabet). The Latin alphabet is used by most Western European languages and has a large number of diacritic symbols that aren’t used in English. Diacritic symbols are things like accents, umlauts, cedillas, and other marks that change the pronunciation of a letter or word.

Then there are the many other writing systems, such as Cyrillic (most widely known as the script used for Russian), Greek, Kanji (Japanese), and Chinese, many of which are used by more than one language.

You can now start to see the scale of the characters that need to be encoded. There are over 70,000 Chinese glyphs alone. A character encoding contains a number of code points, each of which can encode one character. ASCII, which you have probably heard of, was an early Latin alphabet encoding that had 128 code points, nowhere near enough to cover all the possible characters people use.
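The 128-code-point limit is easy to see in practice. The following Python sketch, using sample strings chosen purely for illustration, shows plain English text fitting into ASCII while a single accented letter does not.

```python
# Plain English text fits in ASCII: every value is below 128.
ascii_text = "Hello!"
ascii_bytes = ascii_text.encode("ascii")
print(list(ascii_bytes))

# A word with a diacritic cannot be represented in ASCII at all.
accented = "café"
try:
    accented.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)  # False
```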

The W3C’s recommended encoding for HTML is called UTF-8, which can encode 1,112,064 code points. This is enough to cover pretty much all of the characters in all of the languages in all of the alphabets (although not every single one), and is used in 93% of all websites. UTF-8 is also the encoding recommended by the Internet Mail Consortium.
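As a rough illustration, the Python sketch below shows where the 1,112,064 figure comes from, how many bytes UTF-8 uses for a few sample characters, and what happens when UTF-8 bytes are read with a different encoding. The choice of Windows-1252 (`cp1252`) as the "wrong" encoding is an assumption made for the example; a real message could be mislabeled with any encoding.

```python
# Unicode values run from U+0000 to U+10FFFF. The 2,048 surrogate
# values (U+D800 to U+DFFF) are reserved and cannot be encoded in UTF-8.
total_values = 0x10FFFF + 1
surrogates = 0xDFFF - 0xD800 + 1
encodable = total_values - surrogates
print(encodable)  # 1112064

# UTF-8 uses between one and four bytes per character.
samples = ["a", "é", "中", "😀"]
byte_lengths = [len(ch.encode("utf-8")) for ch in samples]
print(byte_lengths)  # [1, 2, 3, 4]

# The same bytes read with the wrong encoding come out garbled;
# reading them with the right one restores the text.
original = "café"
raw = original.encode("utf-8")
garbled = raw.decode("cp1252")  # assumed wrong encoding, for illustration
fixed = raw.decode("utf-8")
print(garbled)  # cafÃ©
print(fixed)    # café
```

The last part is the situation a mail reader is in when a message arrives with the wrong encoding label: the bytes are intact, and choosing the right encoding makes them readable again.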

Why Would I Bother Changing It?


