Character set aliases

Web servers can report the same charset under different names. For example, iso-8859-2, iso8859-2, and latin2 all refer to the same charset. The search engine recognizes the following charset name aliases:

  1. Aliases for all ISO charsets (using iso-8859-2 as an example):

    iso-8859-2, iso8859-2, iso8859.2, iso-8859.2, iso_8859-2:1988, iso_8859-2, iso_8859.2

  2. Aliases for all MS charsets (using windows-1250 as an example):

    windows-1250, cp-1250, cp1250, windows1250, x-cp1250

  3. Aliases for Cyrillic koi8-r:

    koi8-r, koi8r, koi-8-r, koi8, koi-8, koi

  4. Aliases for x-mac-cyrillic:

    x-mac-cyrillic, mac

  5. Aliases for DOS cp-866 Cyrillic:

    cp-866, cp866, csibm866, 866, ibm866, x-cp866, x-ibm866, alt

  6. Aliases for some Latin character sets:

    latin1 for iso-8859-1

    latin2 for iso-8859-2

    latin4 for iso-8859-4

    latin5 for iso-8859-9

    latin7 for iso-8859-13
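
The alias rules above can be sketched as a small lookup routine. This is only an illustration in Python, not the search engine's own code: the function name canonical_charset, the table layout, the choice of canonical spelling for each charset, and the case-insensitive matching are all assumptions made for the example.

```python
import re

# Fixed aliases copied from lists 3-6 above. The canonical name chosen for
# each group (the dictionary key) is an assumption made for this sketch.
_GROUPS = {
    "koi8-r": ["koi8-r", "koi8r", "koi-8-r", "koi8", "koi-8", "koi"],
    "x-mac-cyrillic": ["x-mac-cyrillic", "mac"],
    "cp-866": ["cp-866", "cp866", "csibm866", "866", "ibm866",
               "x-cp866", "x-ibm866", "alt"],
    "iso-8859-1": ["latin1"],
    "iso-8859-2": ["latin2"],
    "iso-8859-4": ["latin4"],
    "iso-8859-9": ["latin5"],
    "iso-8859-13": ["latin7"],
}
FIXED_ALIASES = {alias: canonical
                 for canonical, aliases in _GROUPS.items()
                 for alias in aliases}

# List 1: iso-8859-N, iso8859-N, iso8859.N, iso-8859.N, iso_8859-N:YYYY,
# iso_8859-N, iso_8859.N
ISO_PATTERN = re.compile(r"iso[-_]?8859[-._](\d{1,2})(?::\d{4})?")

# List 2: windows-125N, cp-125N, cp125N, windows125N, x-cp125N
MS_PATTERN = re.compile(r"(?:windows-?|x-cp|cp-?)(125\d)")


def canonical_charset(name):
    """Return a canonical charset name for a known alias, or None."""
    key = name.strip().lower()  # assumption: matching ignores case
    if key in FIXED_ALIASES:
        return FIXED_ALIASES[key]
    match = ISO_PATTERN.fullmatch(key)
    if match:
        return "iso-8859-" + match.group(1)
    match = MS_PATTERN.fullmatch(key)
    if match:
        return "windows-" + match.group(1)
    return None  # not covered by the alias lists above


if __name__ == "__main__":
    for reported in ("iso_8859-2:1988", "x-cp1250", "alt", "latin5"):
        print(reported, "->", canonical_charset(reported))
```

The fixed table is checked first so that short names such as 866 or alt resolve directly, and the two patterns cover the ISO and MS families without listing every charset number.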

