MattScott 15242 Posted November 16, 2019
Hi all,
I want to take some time to walk through this, because we have been chasing it for a couple of weeks trying to diagnose and fix the issue. I also want to thank all the players who sent us latency reports and helped report the issue.
Based on initial reports, it seemed like this was an obvious issue with our provider. So that's where we started, but after many days of extensive research, we couldn't find anything affecting our service.
It turns out the problem was in our orchestration layer, which controls spinning up and spinning down districts on our many district servers. This system is also responsible for detecting hung/non-responsive districts by looking at ping times and CCU counts. If a district has been sitting empty for a specific time period, then we throw it away and spin up a new one.
At some point in the last month, we had an issue where the orchestration layer was killing districts with players in them. This issue was difficult to diagnose, because it only happens on Citadel, so it's not a system-wide issue.
While we were investigating, my network team disabled the ability for orchestration to kill districts, so that we wouldn't accidentally keep kicking players. Then we got caught up in Halloween and a number of other pressing tasks, and this issue was left on the back burner.
During the last two weeks, it appears that orchestration was still detecting non-responsive districts, attempting to kill them (which was disabled), and then spinning up a new district on the same server. Currently, due to memory and CPU constraints, we can only run 3 districts per physical server. Over time, orchestration had fired up 7-8 districts, which caused massive amounts of processing lag and eventually started crashing districts.
As of today, we believe we have figured out why orchestration was killing populated districts. It involved a small configuration error from when Nekrova physical servers were merged with Citadel servers (which also explains why only Citadel servers were affected).
We have started putting a fix in place, and the lag issue should get better.
Thanks,
Matt
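For readers who want to see how the pile-up described in the post can happen, here is a minimal sketch of the failure mode. It is not the actual orchestration code: the `District`, `Server`, `recycle`, and `simulate` names are hypothetical, and the only figure taken from the post is the limit of 3 districts per physical server. The sketch assumes one district is flagged as non-responsive on every pass, and shows that when the kill step is disabled but the replacement step still runs, the district count grows past the limit.

```python
from dataclasses import dataclass, field

MAX_DISTRICTS_PER_SERVER = 3  # from the post: memory/CPU limit per physical server


@dataclass
class District:
    name: str
    responsive: bool = True


@dataclass
class Server:
    districts: list = field(default_factory=list)
    spawned: int = 0

    def spawn(self):
        # Start a new district with a unique (hypothetical) name.
        self.spawned += 1
        self.districts.append(District(f"district-{self.spawned}"))


def recycle(server, kill_enabled, enforce_cap):
    """One orchestration pass: try to replace every non-responsive district."""
    for district in list(server.districts):
        if district.responsive:
            continue
        if kill_enabled:
            server.districts.remove(district)
        # With the cap enforced, a replacement only starts when a slot is free.
        if enforce_cap and len(server.districts) >= MAX_DISTRICTS_PER_SERVER:
            continue
        server.spawn()


def simulate(passes, kill_enabled, enforce_cap):
    """Run several passes and return how many districts the server ends up with."""
    server = Server()
    for _ in range(MAX_DISTRICTS_PER_SERVER):
        server.spawn()
    for _ in range(passes):
        # Assumption for illustration: one district is flagged on every pass.
        server.districts[0].responsive = False
        recycle(server, kill_enabled, enforce_cap)
    return len(server.districts)
```

With these assumptions, `simulate(5, kill_enabled=False, enforce_cap=False)` ends with 8 districts on one server, while either re-enabling the kill step or checking the per-server limit before spawning keeps the count at 3.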
vsb 6170 Posted November 16, 2019 poor citadel players, without the lag there's no reason for them to stay away from playing apb anymore my condolences
Acornie 490 Posted November 16, 2019 Well that's good news
HighSociety 148 Posted November 16, 2019 (edited) That actually sounds pretty good. Now I can't wait to get my hands on 2.1. Edited November 16, 2019 by HighSociety
Snubnose 639 Posted November 16, 2019 58 minutes ago, MattScott said: -snip- I really appreciate the transparency (and open-mindedness) with issues like this and in general. Back in the day we would've gotten a "have you run a tracert yet?" at best, or no response at all, with the issue left hanging there for months, even years.
Guest Posted November 16, 2019 I love the transparency Matt, you know that. Thank you very much, I appreciate that. Can't wait to see some progress behind the other updates and delayed things.
gordIsMyName 104 Posted November 16, 2019 awesome. can't wait to see how things improve, but honestly... rather impressed that things worked at all with that workload!
ninjarrrr 248 Posted November 16, 2019 13 hours ago, Solamente said: poor citadel players, without the lag there's no reason for them to stay away from playing apb anymore my condolences poor na players, without people on citadel they'll have no one to play against
KyoukiDotExe 231 Posted November 16, 2019 I still feel a very heavy delay on some of the district instances at the moment. Trying to look into /latencyreport data really doesn't tell me much sadly. I hope the issue gets resolved quickly.
Nickolai 206 Posted November 16, 2019 Is this why you guys went quiet on the server merge? Because you were too busy fixing the Citadel issue (which is understandable)? People are still waiting to know if it will happen or not.
HighSociety 148 Posted November 17, 2019 So after playing for a while now, I have to say not much of a change for me... still laggy.
MattScott 15242 Posted November 17, 2019 Hi all, I'm checking with the network team to see if the fix was applied to all servers yet. Thanks, Matt