Exception handling of server errors

Server exceptions are not bubbled up to the calling context. Am I doing something wrong, and what is the proper way to handle them?

I created an rpc in TypeScript that throws an error, as shown in the documentation.

throw new Error("No such code as '" + requestedCode + "'");

In Unity the rpc is called via the following code

            try
            {
                var payload = new Dictionary<string, string> { { "code", code } };
                var response = await Client.RpcAsync(Session.Result, "code_redeem", payload.ToJson());
                Debug.Log("Code redeemed successfully" + response);
            }
            catch( ApiResponseException ex )
            {
                Debug.LogFormat("Error: {0}", ex.Message);
            }
            catch( Exception ex )
            {
                Debug.LogError("What else could have happened? " + ex.Message);
            }

When the server throws an error, the Unity console shows

Received: status=InternalServerError, contents='{"code":13,"error":{"StackTrace":"Error: No such code as 'asdfd'\n\tat rpcRedeemCode (.:36:15(76))\n","Type":"exception"},"message":"Error: No such code as 'asdfd' at rpcRedeemCode (.:36:15(76))"}'

But the catch blocks in the calling code are never reached. I think the SDK is swallowing the exception. That seems unusual to me, since the calling code needs to be able to respond when the server fails, whatever the reason.

So I suppose my questions are

  • How to properly receive and handle server errors on the client?
  • What is the recommended way to return errors from the server during rpc calls in Typescript?

Versions: Nakama 3.5, Mac Docker, Unity 2019.4.30f1
Languages: TS


This does sound a bit unusual. What version of the Unity SDK are you using? Can you update to the latest and retry, please?

Sorry, I missed that piece of information. I’m on 3.2.0, installed two days ago.

If it helps, for each call to the RpcAsync method, the console displays 2 sets of send/receive messages.

Send: method='POST', uri='http://127.0.0.1:7350/v2/rpc/code_redeem', body='System.Byte[]'
Received: status=InternalServerError, contents='{"code":13,"error":{"StackTrace":"Error: No such code as 'asdas'\n\tat rpcRedeemCode (.:36:15(76))\n","Type":"exception"},"message":"Error: No such code as 'asdas' at rpcRedeemCode (.:36:15(76))"}'
Send: method='POST', uri='http://127.0.0.1:7350/v2/rpc/code_redeem', body='System.Byte[]'
Received: status=InternalServerError, contents='{"code":13,"error":{"StackTrace":"Error: No such code as 'asdas'\n\tat rpcRedeemCode (.:36:15(76))\n","Type":"exception"},"message":"Error: No such code as 'asdas' at rpcRedeemCode (.:36:15(76))"}'

I have also tried using the ContinueWith syntax instead of await but I’m getting the exact same behaviour

                Client.RpcAsync(Session.Result, "code_redeem", payload.ToJson()).ContinueWith(t =>
                {
                    if( t.IsCompleted )
                    {
                        if( t.IsFaulted )
                            Debug.LogError(t.Exception);
                        else
                            Debug.Log("Code redeemed successfully");
                    }
                });

So, I have created a new empty 2019.4.30f1 project, imported Nakama 3.2.0 via the Unity package file, and added a very minimal NakamaManager that tries to do an rpc call. I’m getting the same result (the exception does not bubble up to my code).

I’ve tried different target platforms (iOS and PC), different .net levels (2.0 and 4.x) and different unity scripting backend (mono and il2cpp) but none of that changed the behavior.

@adbourdages It’s possible this is a bug in the Unity client with socket RPCs. The error should be raised to the Task which was the callback for the socket response. Can you open an issue on the .NET client issue tracker?

After digging a lot more into this, I think I have figured out the cause of this issue. There are two main facets to it. One is the RetryInvoker timer logic, and the second is the overloaded meaning of ApiResponseException when it comes to rpc calls.

First, why doesn’t the following code complete correctly if the server side rpc throws an exception?

            try
            {
                var payload = new Dictionary<string, string> { { "code", code } };
                var response = await Client.RpcAsync(Session.Result, "code_redeem", payload.ToJson());
                Debug.Log("Code redeemed successfully" + response);
            }
            catch( ApiResponseException ex )
            {
                Debug.LogFormat("Error: {0}", ex.Message);
            }
            catch( Exception ex )
            {
                Debug.LogError("What else could have happened? " + ex.Message);
            }

It stems from the RetryInvoker logic. On an exception, there is a check made to see if it should be considered a “transient” exception. (From the source file comments: “For example, timeouts can be transient in cases where the server is experiencing temporarily high load.”).

ApiResponseExceptions with a status code of 500 or more are considered transient. Using throw new Error(""); in typescript does indeed result in that kind of exception. Here is the http response: status=500, {"code":13,"error":{"StackTrace":"Error: No such code as 'h'\n\tat rpcRedeemCode (.:38:15(76))\n","Type":"exception"},"message":"Error: No such code as 'h' at rpcRedeemCode (.:38:15(76))"}

So, if the exception is transient, the SDK will re-attempt the server call (this applies to all server calls, not only rpc). Before re-attempting, the client waits for a delay determined by a RetryConfiguration object. The default one is configured with a base delay of 500ms and a maximum of 4 retry attempts.

    public RetryConfiguration GlobalRetryConfiguration { get; set; } = new RetryConfiguration(500, 4, (RetryListener) null, new Jitter(RetryJitter.FullJitter));

The exact time to wait between each attempt is further based on the previous number of failed attempts

    private Retry CreateNewRetry(RetryHistory history)
    {
      int int32 = Convert.ToInt32(Math.Pow((double) history.Configuration.BaseDelay, (double) (history.Retries.Count + 1)));
      int jitterBackoff = history.Configuration.Jitter((IList<Retry>) history.Retries, int32, new Random(this.JitterSeed));
      return new Retry(int32, jitterBackoff);
    }

The following math is a bit suspicious: Math.Pow((double) history.Configuration.BaseDelay, (double) (history.Retries.Count + 1)). The base delay is raised to the power of the attempt number rather than multiplied by a growing factor. With a base delay of 500ms, the first retry waits 500ms. Good. On the second retry you get 500^2 = 250,000ms, a little over 4 minutes. On the third attempt we are up to 500^3 = 125,000,000ms, or ~34h…
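To make the numbers concrete, here is a small TypeScript sketch that reproduces only the quoted formula. It ignores the jitter step, and the function names are mine, not the SDK's. The second function is a conventional doubling backoff shown purely for comparison; it is not necessarily what the updated .NET client uses.

```typescript
// Mirrors Math.Pow(BaseDelay, Retries.Count + 1) from the quoted CreateNewRetry.
// Jitter is deliberately left out of this sketch.
function quotedDelayMs(baseDelayMs: number, previousRetries: number): number {
  return Math.pow(baseDelayMs, previousRetries + 1);
}

// A conventional exponential backoff for comparison (an assumption, not SDK code):
// the base delay is multiplied by 2^n instead of being raised to a power.
function doublingDelayMs(baseDelayMs: number, previousRetries: number): number {
  return baseDelayMs * Math.pow(2, previousRetries);
}

const baseDelayMs = 500;
const attempts: number[] = [0, 1, 2];

// Delay before the 1st, 2nd and 3rd retry using the quoted formula.
const quotedDelays: number[] = attempts.map((n) => quotedDelayMs(baseDelayMs, n));
// quotedDelays is [500, 250000, 125000000]

// The same three retries with a doubling backoff.
const doublingDelays: number[] = attempts.map((n) => doublingDelayMs(baseDelayMs, n));
// doublingDelays is [500, 1000, 2000]

// Third quoted delay expressed in hours: 125000000 / 3600000, roughly 34.7.
const thirdDelayHours: number = quotedDelays[2] / (1000 * 60 * 60);
```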

Therefore, the original call to RpcAsync is simply waiting for hours before actually completing. It is easy enough to work around this issue by providing a RetryConfiguration object with a maxRetry of 1.

I’ll be posting this in the original issue on the .NET client, but I think they’ve already figured out the problem, as there was a commit 5 days ago that changes the logic of the delay calculation. However, this update has not propagated to the Unity SDK DLL yet.


Which raises the question: what is the proper way to return errors from rpc calls? The documentation samples I’ve seen so far use (for TypeScript) throw new Error("something");. First, this has the unfortunate side effect of the client automatically retrying the rpc, because the exception is deemed “transient”. Second, it is hard to extract the actual error from the resulting exception stack (it is a sub-exception of the main TaskCanceledException).

Also, the errors an rpc call can return do not all fall under the umbrella of a failed api call. There could simply be a logical failure: the code doesn’t exist, the entry has already been removed, parameters are outside the desired range. Is there a way to distinguish those, on the client side, from a failed database connection or a divide-by-zero kind of exception? Could we throw new LogicalError("something") that wouldn’t get wrapped in an ApiResponseException?

Reading the new commit on the .Net client from 5 days ago, I see that it will be possible to override the check on whether an exception is “transient”. That would solve part of the problem. Are there plans to update the Unity DLL?

I’ve just encountered the same problem: when my Go runtime code calls runtime.NewError("Some error message", 13), the Unity client fails to catch the ApiResponseException, and there’s a noticeable delay of a second or two before Unity throws the exception to the console. Error code 13 is apparently “INTERNAL”. Just by accident, I tried a different error code, 10 (ABORTED), and found that the Unity client does catch the ApiResponseException. I went through all 17 error codes listed in the Nakama docs, and these are my findings:

const (
	OK                  = 0  // doesn't catch ApiResponseException
	CANCELLED           = 1  // okay
	UNKNOWN             = 2  // doesn't catch ApiResponseException
	INVALID_ARGUMENT    = 3  // okay
	DEADLINE_EXCEEDED   = 4  // doesn't catch ApiResponseException
	NOT_FOUND           = 5  // okay
	ALREADY_EXISTS      = 6  // okay
	PERMISSION_DENIED   = 7  // okay
	RESOURCE_EXHAUSTED  = 8  // okay
	FAILED_PRECONDITION = 9  // okay
	ABORTED             = 10 // okay
	OUT_OF_RANGE        = 11 // okay
	UNIMPLEMENTED       = 12 // doesn't catch ApiResponseException
	INTERNAL            = 13 // doesn't catch ApiResponseException
	UNAVAILABLE         = 14 // doesn't catch ApiResponseException
	DATA_LOSS           = 15 // doesn't catch ApiResponseException
	UNAUTHENTICATED     = 16 // okay
)

Does anyone know why certain error codes throw the exception immediately and others don’t? Also, is the server retrying something when runtime.NewError returns a code such as 13, hence the delay before it times out and throws the exception?
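One possible explanation, consistent with the retry analysis earlier in the thread: the codes that are not caught may be exactly those that reach the client as an HTTP status of 500 or more, which the client treats as transient and retries. The TypeScript sketch below checks that idea. The code-to-status table is an assumption based on the conventions used by grpc-gateway, not something confirmed in this thread for Nakama, and the exact status for CANCELLED differs between grpc-gateway versions (it is below 500 either way).

```typescript
// ASSUMPTION: gRPC status code -> HTTP status, following grpc-gateway conventions.
// Not taken from the Nakama source; verify against your server's actual responses.
const httpStatusForGrpcCode: Record<number, number> = {
  0: 200,  // OK (not an error, so no exception is expected at all)
  1: 499,  // CANCELLED (older grpc-gateway versions used 408)
  2: 500,  // UNKNOWN
  3: 400,  // INVALID_ARGUMENT
  4: 504,  // DEADLINE_EXCEEDED
  5: 404,  // NOT_FOUND
  6: 409,  // ALREADY_EXISTS
  7: 403,  // PERMISSION_DENIED
  8: 429,  // RESOURCE_EXHAUSTED
  9: 400,  // FAILED_PRECONDITION
  10: 409, // ABORTED
  11: 400, // OUT_OF_RANGE
  12: 501, // UNIMPLEMENTED
  13: 500, // INTERNAL
  14: 503, // UNAVAILABLE
  15: 500, // DATA_LOSS
  16: 401, // UNAUTHENTICATED
};

// The rule described earlier in the thread: a status of 500 or more is "transient".
function isTransientStatus(httpStatus: number): boolean {
  return httpStatus >= 500;
}

// gRPC codes that the client would retry under the assumed mapping.
const retriedCodes: number[] = Object.keys(httpStatusForGrpcCode)
  .map(Number)
  .filter((code) => isTransientStatus(httpStatusForGrpcCode[code]));
// retriedCodes is [2, 4, 12, 13, 14, 15]
```

Under that assumption, the retried codes line up with the ones marked "doesn't catch ApiResponseException" above (apart from OK, which is not an error), and the delay would come from the client retrying rather than from the server.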

Maybe take a look at the SendRequest method in UnityWebRequestAdapter, it is where the exceptions are created.

We ended up using a specific error signature for logical script errors: throw new Error(JSON.stringify({ reason: "whatever", extra: "0" }));. When the server returns an error, the client parses the returned value, and when it contains both reason and extra we use a different error handling mechanism. This makes it possible to distinguish a logic error (missing parameters, etc.) from a script error (divide by zero, crash, etc.).
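A minimal sketch of that convention, written entirely in TypeScript for illustration. The helper names (logicalError, parseServerError) are hypothetical. The parsing half would have to be ported to C# for a Unity client, and it assumes the server wraps the thrown message with an "Error: " prefix and a trailing " at ..." location, as in the responses logged earlier in the thread.

```typescript
// Result of classifying an error message received from the server.
type ParsedServerError =
  | { kind: "logic"; reason: string; extra: string }
  | { kind: "script"; message: string };

// Server side: encode a logical failure as JSON inside the Error message.
function logicalError(reason: string, extra: string): Error {
  return new Error(JSON.stringify({ reason, extra }));
}

// Client side: look for an embedded JSON object carrying both reason and extra.
// Anything else is treated as a genuine script error.
function parseServerError(message: string): ParsedServerError {
  const start = message.indexOf("{");
  const end = message.lastIndexOf("}");
  if (start !== -1 && end > start) {
    try {
      const candidate: unknown = JSON.parse(message.slice(start, end + 1));
      if (typeof candidate === "object" && candidate !== null) {
        const { reason, extra } = candidate as Record<string, unknown>;
        if (typeof reason === "string" && typeof extra === "string") {
          return { kind: "logic", reason, extra };
        }
      }
    } catch {
      // The braces did not enclose valid JSON; fall through to "script".
    }
  }
  return { kind: "script", message };
}

// Example: a message shaped like the ones in the logs above.
const wrapped =
  "Error: " + logicalError("no_such_code", "0").message + " at rpcRedeemCode (.:36:15(76))";
const parsedLogic = parseServerError(wrapped);
const parsedScript = parseServerError("Error: division by zero at rpcRedeemCode (.:36:15(76))");
```

Note that this only changes how the client interprets the error. If the response still arrives with a 500 status, the retry behaviour described earlier in the thread would presumably still apply.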