Group-based Nakama load balancer

Hi, I want to congratulate you on the amazing framework. I love every part of it!

We are in the process of deploying our authoritative game. I’ve read almost all the posts related to scaling and the benchmarks. I also read this helpful post (Why CockroachDB not PostgreSQL - #4 by novabyte). We would like to start with a minimal, scalable, high-availability solution. Right now, the enterprise solution is out of our reach financially. However, we will switch once the game gains traction.

Our planned setup is 1x load balancer, 2x Nakama servers, and 2x CockroachDB servers. As I understand it, Nakama’s real-time features are all server-dependent. Fortunately, all our real-time features are within the group, so as long as the users in a given group are connected to the same server, they should be able to experience the full features of the game. We don’t care about the geolocation of our users, as long as players from the same group can interact with each other.

So we would like a load balancer that keeps track of the currently active groups and routes users accordingly. My questions are:

  • Is this a viable and practical solution? If so, do you have some technical recommendations on how to achieve that on the client, server, and load balancer?
  • Is the auth session also stored in memory inside the server? If so, is there a way to share it within the Nakama cluster?

Hi @Ben welcome :wave:

Thanks for the kind words about the game tech. It’s great to hear positive feedback from game teams about the success they have with the server.

2x CockroachDB servers

If you can, you should usually run 3 instances of CockroachDB rather than 2, because with just 2 instances the cluster cannot achieve consensus if one of them goes down. It might be worth a look at the CockroachDB Cloud to ease the burden of the infrastructure if you don’t want to use Heroic Cloud to manage it all right now.
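The arithmetic behind that recommendation can be sketched as follows, assuming a majority-quorum consensus protocol such as Raft (which CockroachDB uses for replication). The function names are illustrative only and are not part of any CockroachDB API.

```python
def quorum(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """How many replicas can be down while a majority is still reachable."""
    return replicas - quorum(replicas)


# Compare a few cluster sizes; a 2-node cluster tolerates no failures,
# which is no better than a single node.
for n in (1, 2, 3, 5):
    print(f"{n} replicas: quorum={quorum(n)}, "
          f"tolerated failures={tolerated_failures(n)}")
```

With 2 instances the majority is 2, so losing either one stalls consensus; with 3 instances the majority is still 2, so one instance can be down.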

  • Is this a viable and practical solution? If so, do you have some technical recommendations on how to achieve that on the client, server, and load balancer?

We’ve never used this approach before with a game team on a project. With the Nakama Enterprise version available on the Heroic Cloud we just cluster the system and have the load balancer round-robin requests to any instance to maintain the socket connection.

In theory you could use sticky sessions to ensure that all interaction that a user has via the client SDK is routed by the load balancer to the same Nakama server. I must admit, though, that I’ve never tried it. It could also introduce other complications, because the server design makes assumptions about how users interact with the API.
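As a rough illustration of the sticky-session idea, here is a minimal sketch that deterministically maps a stable key to one backend using rendezvous (highest-random-weight) hashing. This is one possible technique and not something Nakama or any particular load balancer provides; real load balancers usually implement stickiness with cookies or source-IP hashing. The server addresses and the `pick_server` helper are hypothetical.

```python
import hashlib

# Hypothetical backend addresses, for illustration only.
SERVERS = ["nakama-1:7350", "nakama-2:7350"]


def pick_server(key: str, servers: list[str]) -> str:
    """Return the same server for the same key, regardless of list order."""
    def weight(server: str) -> int:
        # Hash the (server, key) pair and use the first 8 bytes as a score.
        digest = hashlib.sha256(f"{server}|{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    # The server with the highest score for this key wins.
    return max(servers, key=weight)


# Every request carrying the same key lands on the same backend.
print(pick_server("user-42", SERVERS))
```

The key can be any stable identifier. Keying on a user ID keeps one user on one server; it does not by itself place users who play together on the same server.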

  • Is the auth session also stored in memory inside the server? If so, is there a way to share it within the Nakama cluster?

To minimize the infrastructure needed, we don’t require the use of a separate system that acts as a session cache. This is all handled by the Nakama server and scaled with the cluster system that I’ve described in the other post you linked.

I do wonder, though: if you’re going to run the load balancer, egress traffic, multiple CRDB instances, and multiple Nakama instances, is it really going to be less costly, once you include the total cost of ownership of the maintenance work, than just running on the Heroic Cloud at this point? :thinking:

Thank you @novabyte for your prompt response.

I do wonder, though: if you’re going to run the load balancer, egress traffic, multiple CRDB instances, and multiple Nakama instances, is it really going to be less costly, once you include the total cost of ownership of the maintenance work, than just running on the Heroic Cloud at this point? :thinking:

This scaled-out setup is for when we operate at maximum capacity. However, we will scale down to 1x load balancer, 1x Nakama server, and 1x CockroachDB during low traffic. I agree that a managed cloud is easier to manage. However, until we reach a certain volume of daily active users, we can’t afford the minimum Heroic Cloud option that includes high availability.

If you can, you should usually run 3 instances of CockroachDB rather than 2, because with just 2 instances the cluster cannot achieve consensus if one of them goes down. It might be worth a look at the CockroachDB Cloud to ease the burden of the infrastructure if you don’t want to use Heroic Cloud to manage it all right now.

I will check out CockroachDB Cloud. Right now, we don’t need high availability around the clock.

With the Nakama Enterprise version available on the Heroic Cloud we just cluster the system and have the load balancer round-robin requests to any instance to maintain the socket connection.

In the Enterprise version, this should work since the session is distributed to all the cluster nodes. I believe this feature is not available in the open-source version, so we have to find a way to route people who play together to the same server.
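A minimal sketch of what such routing could look like, assuming the client can tell the router which group it belongs to before it opens its socket. The `GroupRouter` class and the server names are hypothetical; this is an in-memory, single-process illustration that ignores server failures and is not part of Nakama.

```python
class GroupRouter:
    """Tracks active groups and pins each one to a single server."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.group_server = {}   # group_id -> assigned server
        self.group_members = {}  # group_id -> connected user count

    def connect(self, group_id):
        """Return the server a user of this group should connect to."""
        server = self.group_server.get(group_id)
        if server is None:
            # New group: pick the server with the fewest connected users.
            load = {s: 0 for s in self.servers}
            for gid, srv in self.group_server.items():
                load[srv] += self.group_members[gid]
            server = min(self.servers, key=lambda s: load[s])
            self.group_server[group_id] = server
            self.group_members[group_id] = 0
        self.group_members[group_id] += 1
        return server

    def disconnect(self, group_id):
        """Forget the group once its last user has left."""
        if group_id not in self.group_server:
            return
        self.group_members[group_id] -= 1
        if self.group_members[group_id] <= 0:
            del self.group_members[group_id]
            del self.group_server[group_id]


router = GroupRouter(["nakama-1", "nakama-2"])  # hypothetical server names
print(router.connect("guild-a"), router.connect("guild-b"))
```

Once a group is inactive its mapping is dropped, so the next session of that group can land on whichever server is least loaded at that time.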

I do wonder, though: if you’re going to run the load balancer, egress traffic, multiple CRDB instances, and multiple Nakama instances, is it really going to be less costly, once you include the total cost of ownership of the maintenance work, than just running on the Heroic Cloud at this point?

As I mentioned, this setup is for when we expect heavy traffic. Most of the time we are going to scale down because traffic is low. The current Heroic Cloud options don’t fit the requirements of our current phase.

@Ben Understood. I’m not sure there’s more I can add to this discussion. The approach I suggested above should, in theory, work for your use case. :+1: