Utf Seven

Defined in RFC 2152, this is yet another UniCode encoding, this one targeted at mail communication over 7-bit SMTP. ASCII equivalent for 'direct characters' -- alphanumerics plus some punctuation. Depending on the profile, some subset of the other printable ASCII characters except '+' ('optional direct characters') may also be ASCII equivalent.

'+' is expressed as '+-'. All other non-direct code points are UTF-16 encoded and then Base64 encoded (MIME profile minus the '=' padding) and put between a '+' and any non-Base64 character. If the terminating character is a '-', it is ignored by the decoder, otherwise, the character is preserved.


Example:

Native: Vässarö, scoutön

UTF-7: V+AOQ-ssar+APY, scout+APY-n


Slightly less efficient than QuotedPrintable?-encoded latin-1 for typical Western European texts, with the bonus of being able to cover all UniCode code points. Excellent for ASCII text with the occasional math, wingding or CJK character.


EditText of this page (last edited November 3, 2006) or FindPage with title or text search