Indexing JSON arrays using full-text search indexes
Published Mar 23 2019 02:05 PM 7,262 Views
Microsoft
First published on MSDN on Nov 09, 2015

In the previous post I described how you can index JSON fields using standard B-Tree indexes . In this post we will see how to use Full-text search indexes on JSON arrays.


Problem


In the database are stored arrays of JSON values e.g. ["Manufacturer","Quality"] and query needs to find all rows where array contains some value (e.g. "Manufacturer").


Solution


SQL Server enables you to analyze JSON arrays and use elements in queries. As an alternative, full text search can be used to find arrays that contains some value since JSON is a plain text that can be indexed.


Table structure


Full-text indexes can be used if JSON in your column is formatted as a simple array of scalar values. In the ideal case as array of numbers, but array of strings will also work in most of the cases. In this example I will use Sales.SalesOrder_json table that can be found in official AdventureWorks2016CTP3 database. We can assume that Sales.SalesOrder_json table has the following data:


SalesOrderID SalesOrderNumber ......... SalesReasons
1 SO44484 ........ ["Manufacturer","Quality"]
2 SO44352 ........ ["Price"]
3 SO44367 .......... ["Manufacturer","Quality"]
4 SO44464 ......... ["Price","Quality"]

Instead of related tables SalesOrderHeaderSalesReasons and SalesReasons I have placed sales reasons as an array of JSON strings.


If we create Full-text index on SalesReasons column, it will split tokens in this text using separators " and ,. This can be used to improve your queries that search for sales orders by some sales reason values in this array.


Use Case 1: Filtering values using OPENJSON


OPENJSON enables you to OPEN array of JSON objects and use elements from the array in queries. The following query finds ass SalesOrder rows where array stored in SalesReasons column contains value 'Manufacturer'



SELECT SalesOrderNumber, OrderDate, SalesReasons
FROM Sales.SalesOrder_json
CROSS APPLY OPENJSON (SalesReasons)
WHERE value = 'Manufacturer'



Query is simple but it requires full table scan, so it is better to use this kind of query with some additional filter.


Use Case 2: Filtering values using Full text index


We can create standard full text search index on JSON array. FTS will split elements in the array using brackets, quotes and commas as separators and it will index values.


First we will create full text search catalog and index:


-- Create full text catalog for JSON data
CREATE FULLTEXT CATALOG jsonFullTextCatalog;
GO

-- Create full text index on SalesReason column.
CREATE FULLTEXT INDEX ON Sales.SalesOrder_json(SalesReasons)
KEY INDEX PK_SalesOrder__json_SalesOrderID
ON jsonFullTextCatalog;
GO


With the full text index we can easily find all rows where SalesReasons JSON array contains some value (e.g. Price):


SELECT SalesOrderNumber, OrderDate, SalesReasons
FROM Sales.SalesOrder_json
WHERE CONTAINS(SalesReasons, 'Price')

We can even use more complex expression with AND, OR, NOT operators. Execution plan is shown in the following figure:


.




As you can see, although FTS is not specialized for JSON text, you can use it to improve performance of your queries where you need to filter results by some value in JSON array.


Note that FTS handles JSON as regular text - it does not have any custom parser. If you have an array of numbers formatted as JSON this structure is perfect fit for the FTS search. FTS will split text using commas and brackets so you will have numbers in FTI. It will also work fine if your string values in arrays do not have some characters that can split words (e.g. string 'SQL Server' will be split into "SQL" and "Server"). In that case you will need to use AND predicates in queries such as  CONTAINS(json, 'SQL and Server'); or Custom proximity search with NEAR .



.







Version history
Last update:
‎Mar 23 2019 02:05 PM
Updated by: