Preserve match states after server restart?

According to the match_terminate section of the docs, all running matches are forcefully terminated when the server shuts down.

Is it possible to preserve the list of matches along with their states?

I don’t want to save the state to player storage, since it contains asymmetric data that I do not want players to access (bluffs they need to guess, etc.). I’m also not sure it would be a good idea to save the whole match state to storage during match_loop, since that’s a lot of unnecessary writes.

Does Nakama have any internal mechanism to do this automatically?

When the server shuts down there is no choice but to stop running matches. The shutdown grace period gives you an opportunity to persist exactly as you want.

After receiving the termination signal you should notify players that the match is stopping, save the state to a storage object, and stop the match. Use a system-owned storage object with read 0, write 0 permissions so users can’t read it. There’s no need to do this in the match loop; you can do it in the terminate hook itself.
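A minimal sketch of the save step, assuming a state shape, collection name, and helper name that are all made up for illustration. The `StorageWriter` interface is a hypothetical stand-in that mirrors the shape of `nk.storageWrite` in the TypeScript runtime, so the snippet can be read and run without the server:

```typescript
// Hypothetical stand-in for the part of the Nakama runtime used here.
// It mirrors the shape of nk.storageWrite in the TypeScript runtime.
interface StorageWriteRequest {
  collection: string;
  key: string;
  userId: string;
  value: { [key: string]: any };
  permissionRead: number;
  permissionWrite: number;
}

interface StorageWriter {
  storageWrite(writes: StorageWriteRequest[]): void;
}

// Hypothetical match state containing data players must not see.
interface MatchState {
  secrets: { [userId: string]: string };
  turn: number;
}

// The nil UUID is the owner ID of system-owned storage objects.
const SYSTEM_USER_ID = "00000000-0000-0000-0000-000000000000";
// Assumed collection name; pick whatever suits your project.
const SAVED_MATCHES_COLLECTION = "saved_matches";

// Intended to be called once from the terminate hook, not from the loop.
function persistMatchState(nk: StorageWriter, matchId: string, state: MatchState): void {
  nk.storageWrite([
    {
      collection: SAVED_MATCHES_COLLECTION,
      key: matchId,
      userId: SYSTEM_USER_ID,
      value: { secrets: state.secrets, turn: state.turn },
      permissionRead: 0, // no client can read the object
      permissionWrite: 0, // no client can write the object
    },
  ]);
}
```

In a real module you would call something like this from your matchTerminate handler, passing the runtime's `nk` object and the match ID from the context, after broadcasting a "match is stopping" message to the connected presences.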

Then, when the match needs to resume, load the storage object in match init and restore the in-memory match state from there.
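A rough sketch of the restore step, under the same kind of assumptions: the state shape, collection name, and function name are invented for illustration, and `StorageReader` is a hypothetical stand-in that mirrors the shape of `nk.storageRead` in the TypeScript runtime. It falls back to a fresh state when nothing was saved:

```typescript
// Hypothetical stand-ins mirroring the shape of nk.storageRead.
interface StorageReadRequest {
  collection: string;
  key: string;
  userId: string;
}

interface StoredObject {
  key: string;
  value: { [key: string]: any };
}

interface StorageReader {
  storageRead(reads: StorageReadRequest[]): StoredObject[];
}

// Hypothetical match state; must match whatever was saved on terminate.
interface MatchState {
  secrets: { [userId: string]: string };
  turn: number;
}

// Returns the saved state for savedMatchId, or a fresh state if none exists.
function restoreMatchState(nk: StorageReader, savedMatchId: string | undefined): MatchState {
  // These two values must be the same ones used when the state was saved.
  const collection = "saved_matches"; // assumed collection name
  const systemUserId = "00000000-0000-0000-0000-000000000000"; // system owner

  if (savedMatchId) {
    const objects = nk.storageRead([
      { collection: collection, key: savedMatchId, userId: systemUserId },
    ]);
    if (objects.length > 0) {
      const value = objects[0].value;
      return {
        secrets: value["secrets"] ?? {},
        turn: value["turn"] ?? 0,
      };
    }
  }
  // Nothing to restore: start a new match from scratch.
  return { secrets: {}, turn: 0 };
}
```

In a real module the saved match ID would typically arrive through the params passed to match creation, and the returned object would become the state your matchInit handler returns.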

Thank you, but how long should the shutdown grace period be in order to save all the matches?

This depends on the matches you have running, as well as the infrastructure the server runs on. This is why we’ve left this value configurable.

I tried to implement it and had some success.
The restore part is still a little unclear.

  1. The clients were notified about the termination and disconnected from the terminating server. A new server instance is now running, and the match state is stored in the DB with key=old_match_id.
  2. If there is only a single server instance, clients have to wait until it restarts. I make them wait graceSeconds and then check whether the /healthcheck endpoint returns an OK status.
  3. After that, clients can authenticate.
  4. One of the clients (chosen by the server on termination) has to call an RPC that creates a new match from the saved state. When the new match is created, the server sends a notification (from inside the matchInit handler) to all other clients from the old match, telling them that the match has been restored with new_match_id.
  5. After that, clients can join the new, restored match.
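The notification part of step 4 could look roughly like the sketch below. The function name, notification code, and subject are made up, and `Notifier` is a hypothetical stand-in that mirrors the shape of `nk.notificationsSend` in the TypeScript runtime. It assumes the player IDs of the old match were saved together with the match state:

```typescript
// Hypothetical stand-ins mirroring the shape of nk.notificationsSend.
interface NotificationRequest {
  userId: string;
  subject: string;
  content: { [key: string]: any };
  code: number;
  persistent: boolean;
}

interface Notifier {
  notificationsSend(notifications: NotificationRequest[]): void;
}

// Assumed application-defined code; application codes must be greater than 0.
const MATCH_RESTORED_CODE = 100;

// Tells every player of the old match which match ID to join now.
function announceRestoredMatch(
  nk: Notifier,
  playerIds: string[],
  oldMatchId: string,
  newMatchId: string
): void {
  const notifications = playerIds.map(
    (userId): NotificationRequest => ({
      userId: userId,
      subject: "match_restored",
      content: { oldMatchId: oldMatchId, newMatchId: newMatchId },
      code: MATCH_RESTORED_CODE,
      // Persistent, so a player who reconnects late can still list it.
      persistent: true,
    })
  );
  if (notifications.length > 0) {
    nk.notificationsSend(notifications);
  }
}
```

Marking the notification as persistent is a design choice in this sketch: a client that comes back after the new match was created can still find the new match ID by listing its notifications.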

I don’t think this is reliable enough. Is there a better solution?
For example, the server could restart faster than graceSeconds, but how would clients know about that?
For another example, the client chosen to initiate the new match creation could just leave.

For example, the server could restart faster than graceSeconds, but how would clients know about that?

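If you’ve set up your deployment correctly, the server will not restart faster than graceSeconds. Make sure your deployment starts Nakama from a shell script using exec, so that termination signals such as SIGTERM are passed through to the Nakama process and trigger the graceful shutdown. Note that SIGKILL can never be caught by a process, so anything that sends it early will cut the grace period short.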

For another example, the client chosen to initiate the new match creation could just leave.

Yes - you’ll need to make this a server-based operation. You’ll need to have a roster of all available Nakama instances with their URLs (effectively service discovery), call an RPC on another server to create the match there, and then migrate the clients to connect to that instance to continue gameplay.
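A rough sketch of that server-to-server call, with several assumptions: the roster format, the RPC ID `restore_match`, and the `{ matchId }` response shape are invented for illustration, and `HttpRequestFn` is a hypothetical stand-in shaped like `nk.httpRequest` in the TypeScript runtime. It relies on Nakama's HTTP RPC endpoint (`/v2/rpc/<id>` with an `http_key` and the `unwrap` flag) and assumes the target instance has registered that RPC:

```typescript
// Hypothetical roster entry; how you populate the roster is up to you.
interface NakamaInstance {
  name: string;
  baseUrl: string; // e.g. "http://10.0.0.2:7350"
}

interface HttpResponse {
  code: number;
  body: string;
}

// Hypothetical stand-in shaped like nk.httpRequest (synchronous).
type HttpRequestFn = (
  url: string,
  method: string,
  headers: { [header: string]: string },
  body: string
) => HttpResponse;

// Assumed ID of an RPC you register yourself on every instance.
const RESTORE_RPC_ID = "restore_match";

// Tries each other instance in turn until one recreates the match.
function migrateMatch(
  http: HttpRequestFn,
  roster: NakamaInstance[],
  selfName: string,
  httpKey: string,
  savedMatchId: string
): { instance: NakamaInstance; matchId: string } | null {
  for (const instance of roster) {
    if (instance.name === selfName) continue; // skip the terminating server
    const url =
      instance.baseUrl + "/v2/rpc/" + RESTORE_RPC_ID +
      "?http_key=" + encodeURIComponent(httpKey) + "&unwrap";
    try {
      const res = http(url, "post", { "Content-Type": "application/json" },
        JSON.stringify({ savedMatchId: savedMatchId }));
      if (res.code === 200) {
        // Assumes the RPC returns JSON of the form { "matchId": "..." }.
        const parsed = JSON.parse(res.body);
        return { instance: instance, matchId: parsed.matchId };
      }
    } catch (e) {
      // Instance unreachable or bad response: fall through to the next one.
    }
  }
  return null;
}
```

The returned instance URL and match ID are what you would then send to the players so that they can reconnect to the new instance and join the recreated match.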

For what it’s worth, our Heroic Cloud solution has already solved a lot of the problems you are having to reimplement.
