Forum Discussion

panoslondon1's avatar
panoslondon1
Copper Contributor
Mar 30, 2023

BUG in REPLACE FUNCTION?

Hi,

I have a very weird issue. We have a requirement to get rid of special Euro language characters. So below replaces Þ with T. However, random letters like H are affected. The below gives GAT as a result. I had a similar issue with S and . Basically unrelated letters are dropped!

 

declare @outStr nvarchar(2000)

SET @outStr = 'GATH'  

SET @outStr = REPLACE(@outStr, 'Þ', 'T')  

select @outStr

^^^^^^^^

GAT

 

if i replace nvarchar with varchar works ok but we need to preserve any special chars

Any ideas what's going on?

 

Thank you,

Panos

  • panoslondon1 

     

    Hi, Panos.

     

    This is expected behaviour and is related to character set (or collation) translation.

     

    Here's some examples that highlight the issue as well as making use of COLLATE to perform a binary comparison instead, which avoids the issue.

     

    SELECT
    	REPLACE(N'GATH', N'Þ', N'P')
    	, REPLACE(N'GATH' COLLATE Latin1_General_BIN, N'Þ', N'P')
    	, REPLACE(N'GAÞ' COLLATE Latin1_General_BIN, N'Þ', N'P')

     

    Output:

     

    • The first test reproduces the problem;
    • The second test demonstrates that nothing is replaced, as is expected
    • The third test demonstrates that only the specified character is replaced, as is expected.

     

    Cheers,

    Lain

  • LainRobertson's avatar
    LainRobertson
    Silver Contributor

    panoslondon1 

     

    Hi, Panos.

     

    This is expected behaviour and is related to character set (or collation) translation.

     

    Here's some examples that highlight the issue as well as making use of COLLATE to perform a binary comparison instead, which avoids the issue.

     

    SELECT
    	REPLACE(N'GATH', N'Þ', N'P')
    	, REPLACE(N'GATH' COLLATE Latin1_General_BIN, N'Þ', N'P')
    	, REPLACE(N'GAÞ' COLLATE Latin1_General_BIN, N'Þ', N'P')

     

    Output:

     

    • The first test reproduces the problem;
    • The second test demonstrates that nothing is replaced, as is expected
    • The third test demonstrates that only the specified character is replaced, as is expected.

     

    Cheers,

    Lain

  • olafhelper's avatar
    olafhelper
    Bronze Contributor

    Any ideas what's going on?


    panoslondon1 , yes, you mix ASCII and Unicode and of course that goes wrong. You have to use the N literal for Unicode

     

    SET @outStr = REPLACE(@outStr, N'Þ', N'T')  

     

Resources