We rely on a runtime.Initializer.RegisterAfterAuthenticateCustom() callback to handle setting up new users after they have authenticated and the api.Session.Created flag is set. This sets up various properties that the new user should get.
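For readers following along, here is a minimal sketch of the logic such a callback carries. It deliberately avoids importing Nakama so that it runs on its own: Session, propertyStore, afterAuthenticate and the "coins" default are all hypothetical stand-ins, not our real code or the Nakama API. The only thing taken from the description above is that setup happens only when the Created flag is set.

```go
package main

import "fmt"

// Session is a stand-in for the session returned by authentication.
// Only the Created flag matters for this sketch.
type Session struct {
	UserID  string
	Created bool
}

// propertyStore is a hypothetical in-memory replacement for real storage.
type propertyStore map[string]map[string]int

// afterAuthenticate mirrors the role of the after hook: it writes default
// properties only when the account was created by this very request.
func afterAuthenticate(store propertyStore, s Session) {
	if !s.Created {
		return // existing user, nothing to set up
	}
	// The "coins" default is an invented example value.
	store[s.UserID] = map[string]int{"coins": 100}
}

func main() {
	store := propertyStore{}
	afterAuthenticate(store, Session{UserID: "new-user", Created: true})
	afterAuthenticate(store, Session{UserID: "old-user", Created: false})
	fmt.Println(len(store["new-user"]), len(store["old-user"]))
}
```

Because setup is tied to the Created flag, a request that creates the account but never reaches this function leaves the user without defaults, and later logins will not repair it.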
While developing and frequently restarting the server, we noticed that a client can apparently authenticate right after the server has started but before our plugin has loaded. In that case our custom callback is not called and the user is created without the required properties. This is not a huge problem during development, but if it happens while testing with just a few developers, it will probably happen all the time once the game is live and millions of players try to authenticate.
Is there any way to delay or block all client authentications until the server has fully started up? Like a flag in the config file or something? Our plugin would then do something to allow authentications when it has initialized. Could not find anything in the docs, but it’s a somewhat tricky topic to search for.
The server shouldn’t start processing requests if one of its dependencies is not ready, and that includes loading the plugins and preparing the runtime. Only after those objects have been instantiated are the goroutines that process requests launched. The scenario you describe seems quite complex. If you don’t mind, could you provide the following information?
The code snippets and a description of your development environment, including the version of Nakama and the client used.
Have you confirmed the functions have been loaded properly into Nakama? Nakama logs a message (info level) per function loaded into the Runtime.
Have you confirmed that there is no other Nakama server running that may be processing requests?
What is the restart procedure like?
Do you have a minimal version of the code I could try to run and verify the problem?
If the server does not process requests before our Go plugin has loaded then that should pretty much rule out this scenario. We’re not certain that this is what happened, but a user that was not initialised was created while we were having issues with some other code in the server and had to restart it a few times.
Our initializeUser sets up new users with default properties, wallet contents etc and saves to disk. We run this development server in a Linux virtual machine on Linode and start it via systemd. So restarting is done with systemctl restart. The version of Nakama was 3.17.1 when this happened. There is no other Nakama server running on this system.
All functions seemed to load properly and the user handling is in a file that was not touched recently. If it failed to set up the callback the server would panic.
Given your service configuration in systemd, would it be possible for the following to happen?
Nakama is operating normally
A developer triggers a deployment by running systemctl restart
A SIGTERM is sent by systemd to Nakama which starts the graceful shutdown procedure (if it’s configured to do so).
Due to the other code you mentioned, the graceful shutdown process hangs.
Since the service isn’t terminating within some specified time in systemd, systemd times out and sends a SIGKILL to Nakama
Nakama is forcefully interrupted, so any ongoing request handling is interrupted as well, including the AuthenticateCustom request, which executed all the code up to the AfterAuthenticateCustom hook but never finished the latter.
Nakama process is forcefully terminated and systemd launches a new Nakama process
I’m not familiar with the internals of systemd, but I think this is a possibility.