Lua VM Modules

Hello everyone,
Recently, as we approach some bigger testing milestones, we noticed what I would call a huge issue with how the Lua runtime works, so I wanted to confirm it here.

We stress tested the game on the following machine:
2 vCPU, 4 GB RAM, running only Nakama; no Prometheus or CockroachDB.

On that instance I was only able to create 50 matches before memory spiked enormously.
After removing all modules and keeping just a stock match handler with the same state data as a normal match, I was able to create about 3500 matches before memory reached 4 GB.

The only difference is modules and importing. So my question is:
Since each match is an instance of a Lua VM, does that mean that if I have, for example, something like this:

  • Match 1
    • Imports or requires the network module at path (‘Network.handler’)
  • Match 2
    • Imports or requires the network module at path (‘Network.handler’)

Our expectation is that on the first import (by whichever match was created first), the second match uses the cached module and doesn’t import it again, so the memory cost is only paid once, on first load.
Can this be verified?

Or is each import in a match handler “global” only to that match handler’s Lua VM, rather than server-wide?
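
For reference, this is the stock-Lua caching behavior we are assuming (a minimal sketch; the module name comes from the example above):

```lua
-- Within a single Lua state, require caches modules in package.loaded, so
-- the load cost is paid once per VM. The open question is whether Nakama
-- shares this cache across match VMs or each match VM starts empty.
local network1 = require("Network.handler") -- loads and caches the module
local network2 = require("Network.handler") -- returns the cached table
assert(network1 == network2)                -- same instance within one VM
assert(package.loaded["Network.handler"] ~= nil)
```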

Thanks

An update:

  • No matter whether modules are already loaded or required elsewhere, for instance in main.lua,

  • they will be fully loaded again per match.

  • Is there a way to mark specific modules to act like system libs, so they are not instantiated/required again per match but globally shared?

  • I am talking about modules such as, let’s say, network.

  • The network module itself contains the code to send a packet using the dispatcher and to log it for later storage; the code was organised so that anything that deals with packets is handled there (see the sketch after this list).

  • Maybe introduce shared libs or something like that?
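
A rough sketch of what such a module looks like (hypothetical contents, for illustration only; dispatcher.broadcast_message is the standard Nakama match dispatcher call, the rest is made up):

```lua
-- Network/handler.lua -- simplified illustration, not the real module.
local Network = { log = {} }

-- Send a state packet via the match dispatcher and remember it so it can
-- be stored later. Everything packet-related lives in this one module.
function Network.send(dispatcher, op_code, payload, presences)
  dispatcher.broadcast_message(op_code, payload, presences)
  table.insert(Network.log, { op_code = op_code, payload = payload })
end

return Network
```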

Hello @Eatos, unfortunately this isn’t possible - the Lua VMs are sandboxed and don’t support multi-threading.

For HTTP requests, a pool of VMs is used to amortize the cost of spinning up new VMs, which improves performance and caps memory utilization - but for matches, each match needs its own VM with its internal state.

We’ll see if upgrading gopher-lua (the Lua interpreter that Nakama uses) to the latest version brings some improvements in the next release.

For this matter:

  • Could you extend the docs, maybe illustrating this scenario a bit differently, so devs know they will pay the module/data-structure price per match?
  • Could you also extend the docs with the best cases/scenarios for using Lua as the runtime and what to avoid, or is that too broad?

@HeroicNathan some things we should consider improving in the runtimes docs.

@sesposito Will make a ticket for this.

While we are on this topic, is it possible to somehow extend or configure how Nakama counts local variables for the registry overflow check?
For instance, if I count the normal way, using the debug.getlocal function in a script, I get 28 local variables in the pure native Lua 5.1 interpreter, but in Nakama a registry overflow occasionally happens at the same place.
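
For reference, this is how I count them in plain Lua 5.1 (a minimal sketch; it assumes the debug library is available, and level 2 means the caller of the helper):

```lua
local function count_locals(level)
  -- Walk the local slots at the given stack level until debug.getlocal
  -- returns nil, which marks the end of the visible locals.
  local count = 0
  while debug.getlocal(level, count + 1) ~= nil do
    count = count + 1
  end
  return count
end

print("locals at caller:", count_locals(2))
```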

It’s difficult to pinpoint the issue without more info; if you could reproduce the bug it would help. Be mindful that, because of the VM pooling, different VMs may observe different local variables (if you don’t have read_only_globals set to true) and code can mutate the state of the VM during execution (not recommended). If you believe there’s an issue with how Lua is behaving, it could be an issue with gopher-lua itself.

Heya @Eatos, here’s the link to our Lua Runtime docs, just in case you haven’t found them: Lua Runtime - Heroic Labs Documentation.

Let me know if you have any feedback about them, I’m planning on adding a note about VM memory usage.

Yes, this could maybe be demonstrated with a picture in the docs. For instance, when we talk about VM pooling, is every runtime VM pooled or not?
By this I mean: I assumed each match VM is per match, so the match strictly uses that VM.
Once the match is created, one VM is allocated for it and used across the lifetime of the match. So technically, if I have one match running with, say, 4 players, it should theoretically see the same local variable count as stock Lua?
read_only_globals is set to true, so that is not it.

@Eatos yes, you’re right: for matches there’s no pooling, each match has its own dedicated VM; it’s for HTTP requests that there’s a pool of VMs. So what you say about local variables within the same match holds true.

One more question:

  • Nakama counts local variables per match loop invocation and not per function, correct?

By this I mean that, given the game code’s entry point is almost certainly the match loop function, the local variables from there down to whatever gets executed will all be counted, correct?

Because if that is the case, then sometimes hitting the limit of 512 does make sense to me, especially if code along the way is executed recursively.
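
For example, something shaped like this (a made-up sketch, not our real code):

```lua
-- Each live call frame keeps its locals alive in the VM until it returns,
-- so recursive processing multiplies the per-frame local count.
local function handle_opcode(state, depth)
  local packet, sender, op, payload = {}, {}, 1, "" -- four locals per frame
  if depth > 0 then
    handle_opcode(state, depth - 1) -- frames pile up before any unwind
  end
end

handle_opcode({}, 100) -- at the deepest point, ~100 frames of locals are live
```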

I believe your statement should hold true, unless there’s a bug in gopher-lua.

If you’re hitting the maximum call-stack or registry size, you can have a look at the Lua runtime configuration options (lua_call_stack_size and lua_registry_size) and try to tune the values according to your needs.
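
For example, in the Nakama YAML config (shown with what I believe are the default values; tune them for your workload):

```yaml
runtime:
  lua_call_stack_size: 128 # maximum call stack size per Lua VM
  lua_registry_size: 512   # maximum registry size per Lua VM
```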

Have a look at this: GitHub - yuin/gopher-lua: GopherLua: VM and compiler for Lua in Go
Nakama sets MinimizeStackMemory to false, which means that gopher-lua won’t release memory it is not using, so that allocations are more efficient.

If you still have issues, I’d suggest you fork Nakama, set the above option to true in all places where the Lua VMs are initialized, and see if it improves memory usage in your Lua code (this may have a non-negligible performance cost). If so, we may consider accepting a PR that exposes this option in the Nakama config.

I have figured out where the issue is, or at least its main actor. For my testing I lowered lua_registry_size to 256 so I could hit the limit sooner and not play a game of catch. Apparently, during gameplay, the last function we had, which occasionally broadcasts a state packet, did some packing and a lot of repacking of variables. Down the line this caused the registry overflow because of the whole chain:

  • match loop → processing packet opcode → handling packet opcode → validating state

It would be nice to somehow expose, as read-only, how many locals are currently loaded and what the registry size is; it would help quite a bit :slight_smile:

@sesposito I have found a case with 100% reproduction on Nakama 3.17; I will check on 3.24.
Basically I have this table, which gets encoded with msgpack, and it fails at table.concat.
The table has about 115 elements, and it fails when trying to merge it with a registry size of 256.
It shouldn’t, yet it always does.

Here it is:

local test_table = {"\ufffd","\ufffd\u0000B","\ufffd)\ufffd(\u001c","\u0001","\u0000","\u0000","\ufffd\u0019\ufffd(\u0000","\u0000","\u0000","\u0000","\ufffd9\ufffd(\u0000","\u0000","\u0000","\u0000","\ufffd\u0005\ufffd(\u0000","\u0000","\u0000","\u0000","\ufffd)\ufffd((","\u0001","\u0000","\u0000","\ufffd\u0019\ufffd(\u0018","\u0001","\u0000","\u0001","\ufffd\u0008 ","\ufffd9\ufffd(\u0000","\u0000","\u0000","\u0000","\ufffd\u0005\ufffd(\u0000","\u0000","\u0000","\u0000","\ufffd)\ufffd\",","\u0001","\u0000","\u0001","\ufffd\u0008 ","\ufffd\u0019\ufffd\"\u0018","\u0001","\u0000","\u0000","\ufffd9\ufffd\"\u0000","\u0000","\u0000","\u0000","\ufffd\u0005\ufffd\"\u0000","\u0000","\u0000","\u0000","\ufffd)\ufffd*\"","\u0001","\u0000","\u0000","\ufffd\u0019\ufffd*\u0000","\u0000","\u0000","\u0000","\ufffd9\ufffd*\u0000","\u0000","\u0000","\u0000","\ufffd\u0005\ufffd*\u0000","\u0000","\u0000","\u0000","\ufffd","\ufffd\u0000\u0013","\u0000","\u0000","\u0002","\ufffd\u0000\n\ufffd\u0000","\ufffd\u0000\u0005\u0008\u0000","\u0001","\u0001","\u0000","\u0003","\ufffd\u0000\u0005\u0008@","\u0002","\u0002","\u0000","\u0000","\u0002","\u0003","\u0000","\u0000","\u0002","\ufffd\u0000\u0010","\u0000","h","̀","\u0006","\u0001","h","̀","\u0006","\u0002","h","̀","\u0007","\u0003","h","̀","\u0008","\ufffd","\u0000","\u0001","\u0003","\u0000","\u0001","\ufffd\u0001,","\u0001","\u0000"}
local result = table.concat(test_table)

I am failing to understand why it crashes or throws a registry overflow here, when the table size is 115 and the max registry size is 256.
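
For comparison, the same shape without the binary payload (a hypothetical sketch with placeholder strings, just to match the element count):

```lua
-- Build a table of 115 short strings and concatenate them, mirroring the
-- failing case above without the forum-mangled binary data.
local t = {}
for i = 1, 115 do
  t[i] = string.char(i % 64 + 32) -- arbitrary printable placeholder bytes
end
local ok, res = pcall(table.concat, t)
print(ok) -- true in stock Lua 5.1; the report above is that the real
          -- payload overflows a 256-entry registry in Nakama
```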

@Eatos I’m not sure why that would be the case. I’d put together a small gopher-lua test outside of Nakama to reproduce it and try to understand what’s happening with the registry; otherwise, I think it’s best if you ask in the gopher-lua community directly.