Supported Character Sets

The string encoding plug-in converts data from certain website character encodings to the profile's character encoding when the profile is analyzed.

The following table lists the website encodings that the string encoding plug-in can convert and the profile encodings that your data is converted to when the profile is analyzed. Some characters that exist in the website encoding (typically in incoming referrer search strings) may not exist in the profile encoding and cannot be converted. For example, if the website uses a Traditional Chinese encoding and the profile uses a Simplified Chinese encoding, the string encoding plug-in cannot convert the characters that are not common to both.

Characters that cannot be converted are not included in reports.
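The behavior described above can be illustrated with a minimal Python sketch. It is not the plug-in's implementation: the function name `convert_for_profile` is hypothetical, and the sketch relies on Python's built-in codecs, whose character coverage may differ from the plug-in's. It decodes text from the website encoding, keeps only the characters that also exist in the profile encoding, and returns the characters that were left out.

```python
def convert_for_profile(raw: bytes, site_encoding: str, profile_encoding: str):
    """Convert raw bytes from the website encoding to the profile encoding.

    Hypothetical helper for illustration only. Characters with no
    representation in the profile encoding are omitted from the result
    and returned separately so they can be inspected.
    """
    text = raw.decode(site_encoding)
    kept, dropped = [], []
    for ch in text:
        try:
            ch.encode(profile_encoding)  # probe: does this character exist in the target?
            kept.append(ch)
        except UnicodeEncodeError:
            dropped.append(ch)
    return "".join(kept).encode(profile_encoding), dropped


# A Traditional Chinese search string from a BIG5 website,
# analyzed in a profile that uses GB2312 (Simplified Chinese).
raw = "繁體中文".encode("big5")
converted, dropped = convert_for_profile(raw, "big5", "gb2312")
print(converted.decode("gb2312"))  # the characters common to both encodings
print(dropped)                     # the characters that could not be converted
```

In this sketch the character 體 has no GB2312 representation, so it is left out of the converted string, in the same way that unconvertible characters are left out of reports.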

Table 1. Supported Character Sets

| Language | Website Encoding | Profile Encoding |
| --- | --- | --- |
| Chinese | EUC-CN, HZ, GBK, GB18030, EUC-TW, CP950, BIG5-HKSCS, ISO-2022-CN, ISO-2022-CN-EXT, and the Unicode encodings listed in this table | BIG5, GB2312, UTF-8 |
| Japanese | EUC-JP, CP932, ISO-2022-JP, ISO-2022-JP-2, ISO-2022-JP-1, and the Unicode encodings listed in this table | SHIFT_JIS, UTF-8 |
| Korean | CP949, ISO-2022-KR, JOHAB, and the Unicode encodings listed in this table | EUC-KR, UTF-8 |
| Unicode | UCS-2, UCS-2BE, UCS-2LE, UCS-4, UCS-4BE, UCS-4LE, UTF-16, UTF-16BE, UTF-16LE, UTF-32, UTF-32BE, UTF-32LE, UTF-7, C99, JAVA | UTF-8 |