Query by JSON data & Indexing

novabyte · December 8, 2019, 10:23pm

Hey @Ndidi. This is a great post. Thanks for the question.

I’ll share as much detail as I can on what approach we recommend that will scale well with Nakama with these kinds of requirements but first some context.

Nakama’s storage engine API supports key-value batch gets, updates, and deletes as well as listings to return a range of objects in a collection (which optionally belong to a specific user). We don’t have an official partial update API yet or the option to filter within the contents of storage objects but this does not mean it’s not possible with the server.

As you’ve observed its possible to drop under the official API and use custom SQL queries to access your data in an optimal way. I’d definitely recommend this approach when you know your access patterns to subsets of your data like in your example and can set custom indexes.

SELECT
    collection, key, user_id, value
FROM
    storage
WHERE
    value @> '{"state":"Complete"}'

The SQL query example you’ve given needs to access a field called "state" with a value called "Complete" or similar. You’re using the "@>" (containment operator) which works well with a GIN index:

CREATE INDEX IF NOT EXISTS 
    storage_value_ginidx
ON
    storage
USING GIN(value);

Now if you’re certain that you’ll have specific states to the value for the field you want to query on within the storage object you could target your index to be more specific with a BTREE index and an index expression (NOTE: These only work in Postgres although support for them will come soon with cockroachdb).

CREATE INDEX IF NOT EXISTS
    storage_value_state_idx
ON
    storage ((value->>'state'))
WHERE
    (value->>'state') = ANY(ARRAY['Joinable', 'Complete']);

With a BTREE index you should use a different operator to access the data to take advantage of it.

SELECT
    collection, key, user_id, value
FROM
    storage
WHERE
    value->>'state' = 'Complete'

When you’ve created a custom index which will best fit the data you can verify what the performance profile will look like using an EXPLAIN statement. You want to look to avoid full table scans which are the worst performance you can achieve with any database.

EXPLAIN (VERBOSE) SELECT collection, key, user_id, value FROM storage WHERE value->>'state' = 'Complete' LIMIT 100;

                                              QUERY PLAN                                              
------------------------------------------------------------------------------------------------------
 Limit  (cost=0.12..8.14 rows=1 width=596)
   Output: collection, key, user_id, value
   ->  Index Scan using storage_value_state_idx on public.storage  (cost=0.12..8.14 rows=1 width=596)
         Output: collection, key, user_id, value
         Index Cond: ((storage.value ->> 'state'::text) = 'Complete'::text)
(5 rows)

You can see that the query analyzer would use an “Index Scan” which is what you’d want rather than a sequence scan.

You should also remember to make sure that your queries use LIMIT to set an upper bound on the number of rows returned. If you return large ranges of rows in your results it will become a source of performance problems as the datasets grow.

In your later query you’re looking at more than one field.

SELECT
    collection, key, user_id, value
FROM
    storage
WHERE
    value @> '{"state":"Joinable"}'
    AND value->>'minPlayers' >= $1
    AND value->>'maxPlayers' <= $2

In this query because it includes ranges of results a composite index would not be optimal so instead you’d likely need to combine a few individual indexes. The specifics of how best to structure your indexes is one of the reasons we’ve not yet come up a generalized way to solve this in our storage engine API.

Finally for where to set these indexes you’ll want to apply them slightly differently based on whether your server-side logic is in Lua or Go.

In Lua use run_once.

local nk = require('nakama')
nk.run_once(function(context)
    local params = {}
    nk.sql_exec([[
CREATE INDEX IF NOT EXISTS ... etc
]], params)
end)

In Go you’d use the standard SQL driver in the InitModule function.

func InitModule(ctx context.Context, logger runtime.Logger, db *sql.DB, nk runtime.NakamaModule, initializer runtime.Initializer) error {
    exec, err := db.ExecContext(ctx, `
CREATE INDEX IF NOT EXISTS ... etc
`)
    if err != nil {
        return err
    }
    logger.Debug("index created: %v", exec.RowsAffected())
    return nil
}

Hope this helps.

Topic		Replies	Views
How to create index about json object? Runtime Framework storage	27	618	December 15, 2023
How to Index Storage JSON for Custom SQL Query? Runtime Framework golang , storage	1	435	May 3, 2023
Collections and Database Implementation in conjunction with SQL Tables Local Setup	12	2324	March 28, 2020
How to make a custom SQL query on a collection in storage? Runtime Framework golang , storage	1	456	May 4, 2023
Creating a Database Table or Storage Object with a list of Words with Definitions; to be accessed by all users Game Design getting-started , server-framework , storage	3	653	December 12, 2021

Query by JSON data & Indexing

Related topics