Forum Discussion

sljesra27's avatar
sljesra27
Copper Contributor
Dec 11, 2024
Solved

SQL Server Collation

Hi Experts,

I am doing cross database migration. source database characterset is ISO_1 (ISO 8859-1) and binary sorting.

While checking equivalent collation in MS SQL, I found Latin1_General_BIN.

Later, I checked that all 4 collations are equivalent to the source database characterset and sorting.

Latin1_General_BIN

Latin1_General_BIN2

Latin1_General_100_BIN2

Latin1_General_100_BIN2_UTF8

Could you share your expert advice on the difference between the above collations and which one is best?

  • Apart from backward compatibility concerns that you can't solve or work around, there is no good reason to use the first two collations over the third (10.0 is a later version and has more Unicode mappings, and BIN2 uses the fully correct Unicode sorting).  The question is between the third and the fourth. That's a UTF8 question. If you don't need to use UTF8 in (var)char fields (i.e. if you store Unicode only in n(var)char fields, as UTF-16), then the drawback of the fourth collation (vs the third) is that the fourth collation is going to be slower on certain operations on (var)char fields, perhaps substantially so, but I haven't seen tests on that.

  • techvslife's avatar
    techvslife
    Copper Contributor

    Apart from backward compatibility concerns that you can't solve or work around, there is no good reason to use the first two collations over the third (10.0 is a later version and has more Unicode mappings, and BIN2 uses the fully correct Unicode sorting).  The question is between the third and the fourth. That's a UTF8 question. If you don't need to use UTF8 in (var)char fields (i.e. if you store Unicode only in n(var)char fields, as UTF-16), then the drawback of the fourth collation (vs the third) is that the fourth collation is going to be slower on certain operations on (var)char fields, perhaps substantially so, but I haven't seen tests on that.

Resources