SOLVED

Scandic Characters not supported in SQL Server

%3CLINGO-SUB%20id%3D%22lingo-sub-265766%22%20slang%3D%22en-US%22%3EScandic%20Characters%20not%20supported%20in%20SQL%20Server%3C%2FLINGO-SUB%3E%3CLINGO-BODY%20id%3D%22lingo-body-265766%22%20slang%3D%22en-US%22%3E%3CP%3E%3CEM%3EHi%2C%3C%2FEM%3E%3C%2FP%3E%3CP%3E%3CEM%3EI%20am%20having%20problem%20with%20scandic%20characters(mainly%2C%C3%A4%2C%C3%B6%2C%C3%A5)%20in%20SQL%20SERVER.%3C%2FEM%3E%3C%2FP%3E%3CP%3E%3CEM%3EThe%20issue%20with%20my%20database%20is%20that%20if%20i%20enter%20the%20special%20characters%20(by%20saying%20special%20characters%2C%20i%20am%20focused%20on%20%C3%A4%2C%C3%A5%2C%C3%B6)%20manually(%20just%20editing%20the%20value%20in%20table%20column)%2Cits%20working%20but%20when%20i%20use%20select%20statement%20to%20read%20from%20json%20file%20and%20insert%20into%20table%20%2C%20it%20looks%20something%20like%20this%3A%26nbsp%3BM%C3%83%C2%A4le%2C%26nbsp%3BFem%C3%83%C2%A4le.%20From%20this%2C%20what%20i%20have%20concluded%20is%20%2C%20database%20is%20supporting%20those%20characters%20but%20issue%20is%20in%20select%20statement.%3C%2FEM%3E%3C%2FP%3E%3CP%3E%3CEM%3EFollowing%20is%20the%20test%20sql%20script%20I%20have%20used%3A%3C%2FEM%3E%3C%2FP%3E%3CP%3E%26nbsp%3B%3C%2FP%3E%3CP%3Eselect%20gender%3CBR%20%2F%3E%26nbsp%3Bfrom%20OPENROWSET(bulk%20'abc%2FtestJsonFile.json'%2CDATA_SOURCE%3D'MyAzureBlobStorage'%2Csingle_clob)%20as%20j%3CBR%20%2F%3Ecross%20apply%20OPENJSON(BulkColumn)%3CBR%20%2F%3EWITH%20(%3CBR%20%2F%3Egender%20nvarchar(200)%20N'%24.gender'%3CBR%20%2F%3E)%3B%3C%2FP%3E%3CP%3E%26nbsp%3B%3C%2FP%3E%3CP%3E%26nbsp%3B%3C%2FP%3E%3CP%3E%3CEM%3EActually%2C%20I%20have%20used%20special%20characters%20in%20gender%20field%20in%20json%20file%20in%20order%20to%20test%20it.%3C%2FEM%3E%3C%2FP%3E%3CP%3E%3CEM%3EMale-%26gt%3BM%C3%A4le%26nbsp%3B%20and%26nbsp%3B%20Female-%26gt%3BFem%C3%A4le%3C%2FEM%3E%3C%2FP%3E%3CP%3E%3CEM%3Eand%20the%20output%20looks%20like%20this%3A%3C%2FEM%3E%3C%2FP%3E%3CP%3E%26nbsp%3B%3C%2FP%3E%3CP%3E%3CEM%3Egender%3C%2FEM%3E%3CBR%20%2F%3EFem%C3%83%C2%A4le%3CBR%20%2F%3EM%C3%83%C2%A4le%3CBR%20%2F%3EFem%C3%83%C2%A4le%3C%2FP%3E%3CP%3E%26nbsp%3B%3C%2FP%3E%3CP%3E%3CEM%3EI%20found%20that%20database%20collation%20also%20has%20some%20role%20in%20supporting%20special%20characters.%20Is%20it%20somewhat%20related%20to%20collation%20issue%20or%20something%20else%3F%3C%2FEM%3E%3C%2FP%3E%3CP%3E%3CEM%3ELooking%20forward%20to%20suggestions.%3C%2FEM%3E%3C%2FP%3E%3CP%3E%3CEM%3EThanks.%3C%2FEM%3E%3C%2FP%3E%3C%2FLINGO-BODY%3E%3CLINGO-SUB%20id%3D%22lingo-sub-270366%22%20slang%3D%22en-US%22%3ERe%3A%20Scandic%20Characters%20not%20supported%20in%20SQL%20Server%3C%2FLINGO-SUB%3E%3CLINGO-BODY%20id%3D%22lingo-body-270366%22%20slang%3D%22en-US%22%3E%3CP%3EHi%2C%3C%2FP%3E%3CP%3E%26nbsp%3B%3C%2FP%3E%3CP%3EThe%20issue%20is%20resolved%20now.%20I%20thought%20it%20would%20be%20beneficial%20for%20someone%20like%20me%20having%20the%20same%20issue.Therefore%2C%20i%20am%20posting%20the%20solution.%3C%2FP%3E%3CP%3E%26nbsp%3B%3C%2FP%3E%3CP%3EIt%20was%26nbsp%3Bactually%26nbsp%3Brelated%20to%20the%20encoding%20of%20the%20file.%20The%20json%20file%20i%20was%20using%20was%20in%20UTF-8.%20It%20was%20causing%20the%20problem%20as%20SQL%20Server%202017%20doesn't%20support%20UTF-8%20characters.%20Therefore%2C%20i%20used%20Azure%20Data%20Factory%20CopyActivity%20to%20transfer%20the%20file%20in%20UTF-8%20to%20file%20in%20UTF-16%20using%20%3CSTRONG%3Esource%3C%2FSTRONG%3E%20and%20%3CSTRONG%3Esink%3C%2FSTRONG%3E%26nbsp%3B%3CSTRONG%3E%22ENCODING%20NAME%22%26nbsp%3B%3C%2FSTRONG%3Ein%20copy%20activity.%20The%20desired%20file%20was%20in%20UTF-16%20now.%3C%2FP%3E%3CP%3EAfter%20that%20i%20made%20very%20small%20change%20in%20the%20script%20as%20follows%3A%3C%2FP%3E%3CP%3E%26nbsp%3B%3C%2FP%3E%3CP%3E%3CSPAN%3Eselect%20gender%3C%2FSPAN%3E%3CBR%20%2F%3E%3CSPAN%3E%26nbsp%3Bfrom%20OPENROWSET(bulk%20'abc%2FtestJsonFile.json'%2CDATA_SOURCE%3D'MyAzureBlobStorage'%2C%3CSTRONG%3Esingle_nclob%3C%2FSTRONG%3E)%20as%20j%3C%2FSPAN%3E%3CBR%20%2F%3E%3CSPAN%3Ecross%20apply%20OPENJSON(BulkColumn)%3C%2FSPAN%3E%3CBR%20%2F%3E%3CSPAN%3EWITH%20(%3C%2FSPAN%3E%3CBR%20%2F%3E%3CSPAN%3Egender%20nvarchar(200)%20N'%24.gender'%3C%2FSPAN%3E%3CBR%20%2F%3E%3CSPAN%3E)%3B%3C%2FSPAN%3E%3C%2FP%3E%3CP%3E%26nbsp%3B%3C%2FP%3E%3CP%3E%3CSPAN%3EI%20changed%20%3CSTRONG%3Esingle_clob%3C%2FSTRONG%3E%20to%20%3CSTRONG%3Esingle_nclob%3C%2FSTRONG%3E%20as%20the%20specified%20file%20was%20unicode.%3C%2FSPAN%3E%3C%2FP%3E%3CP%3E%3CSPAN%3EAnd%2C%20it%20worked%20fine.%3C%2FSPAN%3E%3C%2FP%3E%3CP%3E%26nbsp%3B%3C%2FP%3E%3CP%3E%3CSPAN%3EThanks.%3C%2FSPAN%3E%3C%2FP%3E%3C%2FLINGO-BODY%3E
New Contributor

Hi,

I am having problem with scandic characters(mainly,ä,ö,å) in SQL SERVER.

The issue with my database is that if i enter the special characters (by saying special characters, i am focused on ä,å,ö) manually( just editing the value in table column),its working but when i use select statement to read from json file and insert into table , it looks something like this: Mäle, Femäle. From this, what i have concluded is , database is supporting those characters but issue is in select statement.

Following is the test sql script I have used:

 

select gender
 from OPENROWSET(bulk 'abc/testJsonFile.json',DATA_SOURCE='MyAzureBlobStorage',single_clob) as j
cross apply OPENJSON(BulkColumn)
WITH (
gender nvarchar(200) N'$.gender'
);

 

 

Actually, I have used special characters in gender field in json file in order to test it.

Male->Mäle  and  Female->Femäle

and the output looks like this:

 

gender
Femäle
Mäle
Femäle

 

I found that database collation also has some role in supporting special characters. Is it somewhat related to collation issue or something else?

Looking forward to suggestions.

Thanks.

1 Reply
best response confirmed by Chetan Sharma Kandel (New Contributor)
Solution

Hi,

 

The issue is resolved now. I thought it would be beneficial for someone like me having the same issue.Therefore, i am posting the solution.

 

It was actually related to the encoding of the file. The json file i was using was in UTF-8. It was causing the problem as SQL Server 2017 doesn't support UTF-8 characters. Therefore, i used Azure Data Factory CopyActivity to transfer the file in UTF-8 to file in UTF-16 using source and sink "ENCODING NAME" in copy activity. The desired file was in UTF-16 now.

After that i made very small change in the script as follows:

 

select gender
 from OPENROWSET(bulk 'abc/testJsonFile.json',DATA_SOURCE='MyAzureBlobStorage',single_nclob) as j
cross apply OPENJSON(BulkColumn)
WITH (
gender nvarchar(200) N'$.gender'
);

 

I changed single_clob to single_nclob as the specified file was unicode.

And, it worked fine.

 

Thanks.