Forum Discussion
How To Extract Values from the string that is key value pairs with different formats
Tried Below No luck Yet any help please?
SELECT
Dataval,
CASE
WHEN LEN(SUBSTRING(SUBSTRING(Dataval, CHARINDEX('"Id":"', Dataval) + 6, LEN(Dataval)), 1, CASE WHEN CHARINDEX('","No"', SUBSTRING(Dataval, CHARINDEX('"Id":"', Dataval) + 6, LEN(Dataval))) > 0 THEN CHARINDEX('","No"', SUBSTRING(Dataval, CHARINDEX('"Id":"', Dataval) + 6, LEN(Dataval))) - 1 ELSE LEN(Dataval) END)) > 70
THEN SUBSTRING(SUBSTRING(Dataval, CHARINDEX('"Id":"', Dataval) + 6, LEN(Dataval)), 1, CASE WHEN CHARINDEX('","No"', SUBSTRING(Dataval, CHARINDEX('"Id":"', Dataval) + 6, LEN(Dataval))) > 0 THEN CHARINDEX('","No"', SUBSTRING(Dataval, CHARINDEX('"Id":"', Dataval) + 6, LEN(Dataval))) - 1 ELSE LEN(Dataval) END)
ELSE SUBSTRING(SUBSTRING(Dataval, CHARINDEX(',Id:', Dataval) + 5, LEN(Dataval)), 1, CASE WHEN CHARINDEX(',No', SUBSTRING(Dataval, CHARINDEX(',Id:', Dataval) + 5, LEN(Dataval))) > 0 THEN CHARINDEX(',No:', SUBSTRING(Dataval, CHARINDEX(',Id:', Dataval) + 5, LEN(Dataval))) - 1 ELSE LEN(Dataval) END)
END AS COSTID,
CASE
WHEN LEN(SUBSTRING(SUBSTRING(Dataval, CHARINDEX('"OffId":"', Dataval) + 9, LEN(Dataval)), 1, CASE WHEN CHARINDEX('","Except"', SUBSTRING(Dataval, CHARINDEX('"OffId":"', Dataval) + 9, LEN(Dataval))) > 0 THEN CHARINDEX('","Except"', SUBSTRING(Dataval, CHARINDEX('"OffId":"', Dataval) + 9, LEN(Dataval))) - 1 ELSE LEN(Dataval) END)) > 70
THEN SUBSTRING(SUBSTRING(Dataval, CHARINDEX('"OffId":"', Dataval) + 9, LEN(Dataval)), 1, CASE WHEN CHARINDEX('","Except"', SUBSTRING(Dataval, CHARINDEX('"OffId":"', Dataval) + 9, LEN(Dataval))) > 0 THEN CHARINDEX('","Except"', SUBSTRING(Dataval, CHARINDEX('"OffId":"', Dataval) + 9, LEN(Dataval))) - 1 ELSE LEN(Dataval) END)
ELSE SUBSTRING(SUBSTRING(Dataval, CHARINDEX(',OffId:', Dataval) + 7, LEN(Dataval)), 1, CASE WHEN CHARINDEX(',Except', SUBSTRING(Dataval, CHARINDEX(',OffId:', Dataval) + 7, LEN(Dataval))) > 0 THEN CHARINDEX(',Except', SUBSTRING(Dataval, CHARINDEX(',OffId:', Dataval) + 7, LEN(Dataval))) - 1 ELSE LEN(Dataval) END)
END AS Branch,
CASE
WHEN LEN(SUBSTRING(SUBSTRING(Dataval, CHARINDEX('"Email":"', Dataval) + 8, LEN(Dataval)), 1, CASE WHEN CHARINDEX('","Add"', SUBSTRING(Dataval, CHARINDEX('"Email":"', Dataval) + 8, LEN(Dataval))) > 0 THEN CHARINDEX('","Add"', SUBSTRING(Dataval, CHARINDEX('"Email":"', Dataval) + 8, LEN(Dataval))) - 1 ELSE LEN(Dataval) END)) > 70
THEN SUBSTRING(SUBSTRING(Dataval, CHARINDEX('"Email":"', Dataval) + 8, LEN(Dataval)), 1, CASE WHEN CHARINDEX('","Add"', SUBSTRING(Dataval, CHARINDEX('"Email":"', Dataval) + 8, LEN(Dataval))) > 0 THEN CHARINDEX('","Add"', SUBSTRING(Dataval, CHARINDEX('"Email":"', Dataval) + 8, LEN(Dataval))) - 1 ELSE LEN(Dataval) END)
ELSE SUBSTRING(SUBSTRING(Dataval, CHARINDEX(',Email:', Dataval) + 8, LEN(Dataval)), 1, CASE WHEN CHARINDEX(',Add:', SUBSTRING(Dataval, CHARINDEX(',Email:', Dataval) + 8, LEN(Dataval))) > 0 THEN CHARINDEX(',Add:', SUBSTRING(Dataval, CHARINDEX(',Email:', Dataval) + 8, LEN(Dataval))) - 1 ELSE LEN(Dataval) END)
END AS ContactEmail,
CASE
WHEN LEN(SUBSTRING(SUBSTRING(Dataval, CHARINDEX('"Zip":"', Dataval) + 6, LEN(Dataval)), 1, CASE WHEN CHARINDEX('","Not"', SUBSTRING(Dataval, CHARINDEX('"Zip":"', Dataval) + 6, LEN(Dataval))) > 0 THEN CHARINDEX('","Not"', SUBSTRING(Dataval, CHARINDEX('"Zip":"', Dataval) + 6, LEN(Dataval))) - 1 ELSE LEN(Dataval) END)) > 70
THEN SUBSTRING(SUBSTRING(Dataval, CHARINDEX('"Zip":"', Dataval) + 6, LEN(Dataval)), 1, CASE WHEN CHARINDEX('","Not"', SUBSTRING(Dataval, CHARINDEX('"Zip":"', Dataval) + 6, LEN(Dataval))) > 0 THEN CHARINDEX('","Not"', SUBSTRING(Dataval, CHARINDEX('"Zip":"', Dataval) + 6, LEN(Dataval))) - 1 ELSE LEN(Dataval) END)
ELSE SUBSTRING(SUBSTRING(Dataval, CHARINDEX(',Zip:', Dataval) + 5, LEN(Dataval)), 1, CASE WHEN CHARINDEX(',Not', SUBSTRING(Dataval, CHARINDEX(',Zip:', Dataval) + 5, LEN(Dataval))) > 0 THEN CHARINDEX(',Not', SUBSTRING(Dataval, CHARINDEX(',Zip:', Dataval) + 5, LEN(Dataval))) - 1 ELSE LEN(Dataval) END)
END AS PostalCode
FROM DataForce c
;
Thanks
Asita
Hi, Asita.
The problem at the moment is your data - it's not uniformly presented.
The second row is in standards-compliant JSON format (excluding the final comma) but rows 1 and 3 are not.
If the data could be uniformly presented the way row 2 is, then things become simple as you can use the SQL JSON functionality:
- Work with JSON data in SQL Server - SQL Server | Microsoft Learn
- Parse and Transform JSON Data with OPENJSON - SQL Server | Microsoft Learn
But as the data currently stands, this is in the "too hard" basket - there's just too many things that can go wrong even at the parsing level.
You'd have to get into CLR and code a robust solution for this (since there's going to be a bit of regular expression work to do given the data isn't uniformly presented). SQL has mechanisms for XML and JSON, but that's all.
Cheers,
Lain
- buzza1234Jan 25, 2024Copper ContributorThank Lain.
but these 3 are common examples I have given some records with " some with ' some records with no quotes at all
Is there anyway we can pass through these could be 3 queries also fine
Just checking to see some light.
Thank you Very Much Again for inputs Lain- LainRobertsonJan 25, 2024Silver Contributor
Hi, Asita.
I can't think of any reliable way of handling the conversions using out-of-the-box SQL string functions, no. Given even a small presentation difference would need a different handler, the problem just scales out into too many combinations too quickly.
If I were faced with this poorly-presented data, I'd be pursuing the following, in order of most preferred to least preferred:
- Get the data presented uniformly, and ideally using the standards-compliant JSON format;
- Write a CLR module to handle the parsing (not an option if you're using Azure SQL Database; you need Azure SQL MI or SQL Server for this option);
- Introduce a two-step process where:
- Step one would be to author a stored procedure to standardise the data presentation to JSON format;
- Step two would be to use the JSON functions within any queries/views (so, the same approach as the top point in this list).
Cheers,
Lain
- buzza1234Jan 29, 2024Copper Contributor
LainRobertson Thank you for your guide lines and advise
here is what I tried and able to get in
DROP TABLE IF EXISTS #Mytemptable; CREATE TABLE #Mytemptable ( Id INT IDENTITY(1, 1) ,Dataval NVARCHAR(MAX) NOT NULL ); INSERT #Mytemptable ( Dataval ) VALUES (N'{''ani'':''80052242'',''dnis'':''90425935'',''cid'':''4601923812'',''Id'':''046067ce-7fec-4af7-bb51-e4d31b4720'',''No'':''3048'',''FN'':''Bod Conner'',''Ent'':''2012-08-27'',''Email'':''email address removed for privacy reasons'',''Add'':''P.o. Box 146'',''Zip'':''31040'',''Not'':''English'',''Off'':''FL'',''ACode'':'''',''OffId'':''0002 Home Office FL'',''Except'':''''}, ' -- Dataval - nvarchar(max) ) ,(N'{"ani":"30926539","dnis":"33300180","cid":"4594296356","Id":"a2a5bac2-b6ad-4239-95b2-2da730c2b6","No":"93872","FN":"Martinael u0026 Kar Schmale","Ent":"2023-07-08","Email":"email address removed for privacy reasons","Add":"552 WPlainview cir","Zip":"13498","Not":"","Off":"FL","ACode":"1|a2a5bac2-b6ad-4239-95b2-2da780c2b6|0061FL|V|000000000000000000|00000000","OffId":"0061 Rand FL","Except":""},') ,(N'{ani:63179364,dnis:88825454,cid:4563900520,Id:c1144de0-1945-4862-b057-fe25eabe73,No:97158,FN:Bombay and Marsha Osrow,Ent:2023-11-03,Email:email address removed for privacy reasons,Add:10014 SW Vitalia Court,Zip:14987,Not:English,Off:FL,ACode:,OffId:0021 Zanfini FL,Except:},'); SELECT jn.Id , cid = MAX (REPLACE (IIF(c2.ordinal = 3, c2.value, ''), 'cid:', '')) , Id = MAX (REPLACE (IIF(c2.ordinal = 4, c2.value, ''), 'Id:', '')) , Email = MAX (REPLACE (IIF(c2.ordinal = 8, c2.value, ''), 'Email:', '')) FROM #Mytemptable jn CROSS APPLY (SELECT Removedquotes = REPLACE (REPLACE (jn.Dataval, '''', ''), '"', '')) c1 CROSS APPLY (SELECT ss.* FROM STRING_SPLIT(c1.Removedquotes, ',', 1) ss) c2 GROUP BY jn.Id ORDER BY jn.Id;