Memory leak/profiling

I am encountering an issue that I believe is related to a memory leak in the server, but I have not been able to use pprof to investigate it.
I am currently running the server on ECS (I am waiting for the new version of the managed cloud platform). Every few hours (6-12) the process reaches 100% memory utilization and fails to serve any new connections, returning a storage read error. I have checked the console to make sure that all sessions, presences, and matches are closed when completed, and that seems fine; the only thing that appears to increase over the lifetime of the process is the number of goroutines (I do not use any custom goroutines in my plugin). Because it is very difficult to investigate this without pprof, I wonder whether you are already familiar with this behavior. I have attached a screenshot of the memory utilization for the last few days, along with the metrics from the console at the point when the server stops serving.
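
If it would help, the only diagnostic I can think of without pprof is to add a small loop to the plugin that periodically logs the goroutine count and heap usage. A rough sketch is below; the one-minute interval and the use of the standard log package are arbitrary choices, and it does spin up one goroutine of its own, purely for logging.

```go
package example

import (
	"log"
	"runtime"
	"time"
)

// startDiagnostics launches a single goroutine that logs the goroutine
// count and heap usage once a minute. Interval and logger are arbitrary
// choices for illustration only.
func startDiagnostics() {
	go func() {
		ticker := time.NewTicker(time.Minute)
		defer ticker.Stop()
		for range ticker.C {
			var m runtime.MemStats
			runtime.ReadMemStats(&m)
			log.Printf("goroutines=%d heap_alloc=%d MiB heap_objects=%d",
				runtime.NumGoroutine(), m.HeapAlloc/1024/1024, m.HeapObjects)
		}
	}()
}
```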

Thanks,
Oshri.

There are other resources that might increase but are not tracked by the status view. Your “storage read error” might be an indication that database resources are not being cleaned up correctly, for example. Do you have custom database queries that use the sql.DB handle in the Go runtime? If so, are you closing the result sets correctly?
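
As a reference point, this is a minimal sketch of the query pattern being described, assuming a query run through the sql.DB handle passed to the Go runtime. The table and column names are placeholders; the important parts are the deferred rows.Close() and the rows.Err() check.

```go
package example

import (
	"context"
	"database/sql"
)

// listScores is a placeholder query; the statement is illustrative only.
// The deferred rows.Close() returns the underlying connection to the pool
// even when iteration stops early or an error causes an early return.
func listScores(ctx context.Context, db *sql.DB, userID string) ([]int64, error) {
	rows, err := db.QueryContext(ctx, "SELECT score FROM leaderboard WHERE user_id = $1", userID)
	if err != nil {
		return nil, err
	}
	defer rows.Close() // without this, connections can leak on early returns

	var scores []int64
	for rows.Next() {
		var s int64
		if err := rows.Scan(&s); err != nil {
			return nil, err
		}
		scores = append(scores, s)
	}
	return scores, rows.Err()
}
```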

Also look for other resources in your code that might be growing or not being cleaned up. Typical culprits are maps or slices of values that grow without bound, references that are retained longer than needed, and so on.
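
A hypothetical example of that kind of retention is sketched below: a per-match cache that is only ever added to will grow for the life of the process, so the entry needs to be deleted when the match ends. The type and method names are made up for illustration.

```go
package example

import "sync"

// matchCache is a hypothetical per-match cache. If entries are only ever
// added, the map grows for the life of the process and retains everything
// the values reference.
type matchCache struct {
	mu    sync.Mutex
	state map[string][]byte // matchID -> serialized state
}

func (c *matchCache) put(matchID string, data []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.state == nil {
		c.state = make(map[string][]byte)
	}
	c.state[matchID] = data
}

// drop must be called when the match terminates, otherwise the entry
// (and everything it references) is retained forever.
func (c *matchCache) drop(matchID string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.state, matchID)
}
```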

Thanks @zyro. I have scanned the code for resources that are not cleaned up correctly. I have custom queries and also transactions, but everything seems to be handled correctly. The problem is that the same behavior was reproduced on a server that does not serve any connections (one I deployed for testing).
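
For reference, the transaction pattern I checked my code against looks roughly like the sketch below (the statement is a placeholder). The deferred Rollback releases the transaction and its connection on every error path and becomes a no-op once Commit has succeeded.

```go
package example

import (
	"context"
	"database/sql"
)

// updateScore is a placeholder write; the statement is illustrative only.
func updateScore(ctx context.Context, db *sql.DB, userID string, score int64) error {
	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		return err
	}
	defer tx.Rollback() // no-op after a successful Commit

	if _, err := tx.ExecContext(ctx,
		"UPDATE leaderboard SET score = $1 WHERE user_id = $2", score, userID); err != nil {
		return err
	}
	return tx.Commit()
}
```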

I spoke with @oshribin outside the forums about this issue and it appears to be related to Elastic Container Service at AWS, but it's not clear what causes it. This is not an issue with Nakama, and he will investigate further if needed.