MattScott

Citadel lag issue fix


Hi all,

 

I want to take some time to walk through this, because we have been chasing it for a couple of weeks trying to diagnose and fix the issue.

I also want to thank all the players who sent us latency reports and helped report the issue.


Based on initial reports, it seemed like this was an obvious issue with our provider.

So that's where we started, but after many days of extensive research, we couldn't find anything affecting our service.

 

It turns out the problem was in our orchestration layer that controls spinning up and spinning down districts on our many district servers.

This system is also responsible for detecting hung/non-responsive districts by looking at ping times and CCU counts.

If the district has been sitting empty for a specific time period, then we throw it away and spin up a new one.
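
As a rough illustration of the kind of check described above (this is a sketch, not the actual orchestration code; `DistrictStatus`, `is_hung`, `should_recycle`, and both thresholds are hypothetical names and values made up for the example):

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds; the post does not state the real values.
MAX_PING_MS = 2000.0
MAX_EMPTY_SECONDS = 600.0


@dataclass
class DistrictStatus:
    name: str
    ping_ms: Optional[float]  # None means the district did not answer the ping
    ccu: int                  # concurrent users currently in the district
    empty_seconds: float      # how long the district has had zero players


def is_hung(d: DistrictStatus) -> bool:
    """Treat a district as hung if it does not answer or answers too slowly."""
    return d.ping_ms is None or d.ping_ms > MAX_PING_MS


def should_recycle(d: DistrictStatus) -> bool:
    """Recycle a district that is hung, or that has sat empty for too long."""
    empty_too_long = d.ccu == 0 and d.empty_seconds >= MAX_EMPTY_SECONDS
    return is_hung(d) or empty_too_long
```

A populated, responsive district should never satisfy `should_recycle`; if it does, the inputs (ping or CCU data) are being read wrong somewhere upstream.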

 

At some point in the last month, we had an issue where the orchestration layer was killing districts with players in them. This issue was difficult to diagnose, because it only happens on Citadel, so it's not a system-wide issue. While we were investigating, my network team disabled the ability for orchestration to kill districts, so that we wouldn't accidentally keep kicking players. Then we got caught up in Halloween and a number of other pressing tasks, and this issue was left on the back burner.

 

During the last two weeks, it appears that orchestration was still detecting non-responsive districts, attempting to kill them (which was disabled), and then spinning up a new district on the same server. Currently, due to memory and CPU constraints, we can only run 3 districts per physical server. Over time, orchestration had fired up 7-8 districts, which caused massive amounts of processing lag and eventually started crashing districts.
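
To make the failure mode concrete, here is a small simulation of it. This is only a sketch under assumptions: the `Server` class, `recycle_unguarded`, `recycle_guarded`, and `simulate` are hypothetical names, and the capacity guard is shown for contrast only; it is not the fix described in this post, which was a configuration change. The limit of 3 districts per server is the only number taken from the post.

```python
MAX_DISTRICTS_PER_SERVER = 3  # limit stated in the post


class Server:
    """Toy model of one physical server hosting districts."""

    def __init__(self, kill_enabled: bool):
        self.kill_enabled = kill_enabled
        self.districts = []
        self._next_id = 1

    def spawn(self) -> str:
        name = f"district-{self._next_id}"
        self._next_id += 1
        self.districts.append(name)
        return name

    def kill(self, name: str) -> bool:
        # With kills disabled, the request is silently ignored.
        if not self.kill_enabled:
            return False
        self.districts.remove(name)
        return True


def recycle_unguarded(server: Server, name: str) -> None:
    """Kill then respawn, ignoring whether the kill actually happened."""
    server.kill(name)
    server.spawn()


def recycle_guarded(server: Server, name: str) -> None:
    """Same, but refuse to spawn once the server is at capacity."""
    server.kill(name)
    if len(server.districts) < MAX_DISTRICTS_PER_SERVER:
        server.spawn()


def simulate(recycle, kill_enabled: bool, attempts: int) -> int:
    """Start with a full server, recycle its first district repeatedly."""
    server = Server(kill_enabled)
    for _ in range(MAX_DISTRICTS_PER_SERVER):
        server.spawn()
    for _ in range(attempts):
        recycle(server, server.districts[0])
    return len(server.districts)
```

With kills disabled, `simulate(recycle_unguarded, False, 5)` ends with 8 districts on a server sized for 3, which matches the 7-8 seen in practice. The same call with `recycle_guarded`, or with kills enabled, stays at 3.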

 

As of today, we believe we have figured out why orchestration was killing populated districts. It involved a small configuration error from when Nekrova physical servers were merged with Citadel servers (which also explains why only Citadel servers were affected).

 

We have started putting a fix in place, and the lag issue should get better.

 

Thanks,

Matt

  • Like 4
  • Thanks 8


poor citadel players, without the lag there's no reason for them to stay away from playing apb anymore

 

my condolences 

  • Haha 1


That actually sounds pretty good.

Now I can't wait to get my hands on 2.1

Edited by HighSociety
  • Like 1

58 minutes ago, MattScott said:

-snip-

I really appreciate the transparency (and open-mindedness) with issues like this and in general, so [image: tprEzav.png]

back in the day we would've gotten a "have you run a tracert yet?" at best - or no response at all, with the issue left hanging there for months, even years.

  • Like 1

Guest

I love the transparency Matt, you know that, thank you very much, I appreciate that.

Can't wait to see some progress behind the other updates and delayed things.


awesome. can't wait to see how things improve, but honestly... rather impressed that things worked at all with that workload!

  • Like 1

13 hours ago, Solamente said:

poor citadel players, without the lag there's no reason for them to stay away from playing apb anymore

 

my condolences 

poor na players, without people on citadel they'll have no one to play against

  • Dislike 1


I still feel a very heavy delay on some of the district instances at the moment. Trying to look into /latencyreport data really doesn't tell me much sadly.

 

I hope the issue gets resolved quickly.


Is this why you guys went quiet on the server merge? Because you were too busy fixing the Citadel issue (which is understandable)? People are still waiting to know if it will happen or not.


Hi all,

 

I'm checking with the network team to see if the fix was applied to all servers yet.


Thanks,
Matt

  • Like 1
