MattScott

CEO
  • Content Count: 1281

Posts posted by MattScott


  1. Hi all,

     

    We are going to try getting our secondary environment up and running.

    Then we can test a new set of data. This won't be a super fast process, and many of my team are recuperating this weekend.

    Bear with us.

     

    My advice for current players would be not to go crazy. If we can confirm a solid way to restore the two weeks of missing progress, we will.

    But that will mean losing anything earned since the servers came back online.


    Thanks,
    Matt

    • Like 3

  2. Hi all,

     

    I am looking at the data issue with my team, and there is no easy answer.

    Right now we have two choices.

     

    1) Leave things as they are and work on fixing the accounts that lost paid items. We have records on the payment side, so restoring those should be straightforward. The team is exhausted, but over time we can also try restoring some of the newer data in a separate area, which would let us verify the bigger lost progression items / in-game items and grant those back to players (see the sketch after these options).

    or

    2) Take the servers down and try another round of restoring data from a different set of possibly newer backups. We already know one of the databases in this newer set is significantly out of sync and older, so there is a risk that we would introduce a whole new batch of problems from incompatible data.
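
    To make option 1 concrete, here is a minimal sketch of what that separate-area verification could look like: restore the backups into an isolated environment, export per-account item lists from both the restored data and the live database, and diff them. Everything below (the export shape, the account and item names) is a hypothetical illustration, not the actual game schema.

        # Sketch: find items an account owned in the restored backup but no
        # longer has in the live data. All names here are made up.

        def load_inventory(rows):
            """rows: iterable of (account_id, item_id) pairs -> {account: set(items)}."""
            inv = {}
            for account_id, item_id in rows:
                inv.setdefault(account_id, set()).add(item_id)
            return inv

        def missing_items(backup_rows, live_rows):
            """Items present in the restored backup but absent from the live data."""
            backup = load_inventory(backup_rows)
            live = load_inventory(live_rows)
            return {acct: lost
                    for acct, items in backup.items()
                    if (lost := items - live.get(acct, set()))}

        # Hypothetical example data:
        backup = [(1, "rare_mount"), (1, "premium_skin"), (2, "crafting_kit")]
        live = [(1, "premium_skin"), (2, "crafting_kit")]
        print(missing_items(backup, live))  # {1: {'rare_mount'}}

    The per-account grants could then be reviewed by hand before anything is pushed back to the live servers.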

    Personally, as painful as it is, I'm going to recommend that we plow forward and escalate the support tickets for the players who lost real-money purchases.
    Apologies for the issues.

     

    Sorry,
    Matt

    • Like 1

  3. 30 minutes ago, zefcool said:

    Hi, were there no plans for a hardware upgrade that would speed up the boot and troubleshooting to begin with?

    Seems like moving the relics first and upgrading them to 2019 standards for the FE 2.0 project was a bad idea; the relics died in the middle...

    Maybe it's a sign to upgrade the hardware now rather than later?

    We can't upgrade the hardware; the code for the servers runs on an OS that isn't available anymore.

    For the move, we made images of everything, backed everything up, and then moved the hardware one-to-one so as not to disrupt anything.

    Even with all of our precautions, we still have problems.

     

    My approach is to upgrade the code first and then we can update the hardware properly. That effort has been underway for quite a while now.

    • Like 2

  4. Hi all,

     

    I know how frustrating it is for everyone right now. I wish I had more details to give out. Fallen Earth is a collection of many databases and many different servers that all have to mesh together properly before we can unlock the doors and let players in. There were problems in the move and in the restore process that we are still working out.

     

    To give you an idea of how old and large the game is, it takes over an hour to fully reboot everything and come back online. That means we make some changes, then wait a significant amount of time just to see the results. Then we make more changes and wait some more. It’s been extremely frustrating.

     

    The team is going to continue working on the servers this weekend until they are online properly.

     

    Stay tuned.

     

    Thanks,

    Matt

    • Like 1
    • Thanks 7

  5. Hi all,

     

    We're in a weird position at this point.

    Technically, all the servers are working properly. All the databases are restored and up.

    However, the system overall is unbearably slow. It takes hours to boot and load all the data.

    Once it is up, the timeouts between servers make the system unstable and unplayable.

     

    We took everything down and restarted it all from scratch about an hour ago, and not everything is fully booted yet.

    I'll keep you posted.

     

    Thanks,
    Matt 

    • Thanks 2

  6. Hi all,

     

    One more brief update. We're waiting on a trouble ticket with our new data center to fix a final hardware issue.

    Once that's resolved, we have two remaining tasks to complete, and then the servers can come back online.

     

    I can't predict how long the trouble ticket will take to get addressed.

    But I'll let everyone know once we're back online.


    Thanks,
    Matt

    • Like 3
    • Thanks 2

  7. 2 hours ago, arkup said:

    Hi Matt, I don't know if you've changed server locations or something, but I now get 480ms to Jericho, up from around 220ms, which is totally unplayable. Citadel hasn't changed though; I'm still getting 350ms there.

    I recommend this site: http://www.azurespeed.com/

    You can see what your latency is to various parts of the country without going through our network.

     

    We did not move any of the district servers (Jericho, Citadel, or Nekrova). So your latency shouldn't be affected.

    However, if the speed site reports no latency issues, you can get on Jericho, run /latencytest, and send the results to me in a PM on the forums.
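
    If you'd rather script this kind of check yourself, here is a minimal sketch that times TCP handshakes from your machine. The endpoint below is just a placeholder (the district server addresses aren't listed here), so substitute whichever host you want to measure against.

        import socket
        import time

        # Placeholder endpoint -- substitute the host/port you want to test.
        ENDPOINTS = [("example.com", 443)]

        def tcp_latency_ms(host, port, samples=5):
            """Average time in milliseconds to complete a TCP handshake with host:port."""
            total = 0.0
            for _ in range(samples):
                start = time.perf_counter()
                with socket.create_connection((host, port), timeout=3):
                    total += (time.perf_counter() - start) * 1000.0
            return total / samples

        for host, port in ENDPOINTS:
            print(f"{host}:{port} -> {tcp_latency_ms(host, port):.1f} ms")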

    • Thanks 1

  8. Hi all,

     

    With all the craziness this week and the server outages, the development team hasn't made enough progress on the changes to RIOT for a play test this week.

     

    We're going to aim for May 10th instead.

    I have updated the original post to reflect the new date.

    Thanks,
    Matt

    • Like 1
    • Thanks 2

  9. Hi everyone,

     

    All servers are back online for PC, XB1, and PS4 across Jericho, Citadel, and Nekrova.

     

    I sincerely apologize to all the players for this outage.

     

    I have no interest in sugarcoating this or dodging blame. I own this failure. What appeared relatively straightforward ended up being riddled with landmines. This network move took much too long, and our inability to properly communicate when the game would come back online was unacceptable. Needless to say, I am unhappy with how this all went, and I am sorry for our performance.

     

    I am happy to do a future post walking through why we had to do this and what led to the various issues. In the meantime, I will do my best to make sure this never happens again, and to make sure we properly compensate the players for the downtime.

     

    Sorry,

    Matt

    • Like 24
    • Thanks 16

  10. Hi all,


    Sorry for the lack of updates.

     

    The team is working as fast as they can. We had to send one engineer to bed after nearly three days of being mostly awake.

    At this point it's a waiting game. We believe we've identified the problem: a routing issue inside the new data center that affects all of the servers.

    We're currently going box to box to fix the issue, and then we'll spin up the game.
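
    For anyone curious, "going box to box" here is essentially verifying each machine's network path by hand. A rough first pass can be scripted; the sketch below just probes each box's service port and flags the ones that don't answer. The addresses and port are made-up placeholders, and the actual routing fix still happens on each affected box.

        import socket

        # Hypothetical box addresses and service port -- placeholders only.
        BOXES = [("10.0.0.11", 7777), ("10.0.0.12", 7777), ("10.0.0.13", 7777)]

        def reachable(host, port, timeout=2.0):
            """True if a TCP connection to host:port succeeds within the timeout."""
            try:
                with socket.create_connection((host, port), timeout=timeout):
                    return True
            except OSError:
                return False

        for host, port in BOXES:
            status = "ok" if reachable(host, port) else "UNREACHABLE -- check routing"
            print(f"{host}:{port} {status}")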

     

    Apologies,
    Matt

    • Thanks 2

  11. Hi all,


    Sorry for the lack of updates.

     

    The team is working as fast as they can. We had to send one engineer to bed after nearly three days of being mostly awake.

    At this point it's a waiting game. We believe we've identified the problem: a routing issue inside the new data center that affects all of the servers.

    We're currently going box to box to fix the issue, and then we'll spin up the game.

     

    Apologies,
    Matt

    • Like 2
    • Thanks 3

  12. Hi all,

     

    I am sorry for the unexpected outage. We moved some critical services from one physical location to another. 

     

    Everything appeared fine yesterday.

     

    However, after disconnecting the old location and turning everything off, we found that several services hadn’t been transitioned properly, causing the new servers to stop functioning correctly.

     

    Unfortunately several of the engineers had already been up for 24 hours and were asleep when we noticed the issue.

     

    Everyone is back up and looking at it now. I’ll have a better update soon.

     

    Apologies,

    Matt

    • Like 5
    • Thanks 3

  13. Hi all,

     

    We are not getting DDoSed, but we did move colocation facilities for part of the APB hosting. I checked our monitoring and I see some players on districts in every world.

     

    Can I get a little extra information to pass on to the devs? What world and platform are you on?

     

    We had some intermittent outages with the login server. Can you double-check that it is still down?

     

    Thanks,

    Matt

    • Like 6

  14. Hi all,

     

    As you know, Fallen Earth is quite dated, and it requires operating systems that no longer exist.

     

    We did our best to back up everything properly and organize the smoothest transition possible, but downtime was required for hand-carrying the servers between locations.

     

    The techs hit a number of unforeseen issues, which caused the extended downtime. Unfortunately, with these kinds of problems, fix times are difficult to estimate.

     

    I believe we are most of the way there now. The databases are back online. The servers are reinstalled. We are going to bring everything up manually to get the game back online, and then we’ll dial in the rest of the automation tomorrow.

     

    I’ll post an update here as soon as possible.

     

    Thanks,

    Matt

    • Like 3