MattScott

CEO
  • Content Count

    1281

Everything posted by MattScott

  1. Sure thing. I can see the confusion. It was both. The contract was up, and we had already put our notice in with the old hosting facility. However, at the very end there was a delay with the new hardware arriving and getting set up on time at the new facility. My preference would have been to get 2 more weeks at the old facility, but they denied our request to stay a bit longer.
  2. Hi all, Small update. I have updated the main post, but we hit a serious snag and are working around it now. Unfortunately, we are going to have to extend the maintenance window. Thanks, Matt
  3. Hi all, We spent some time yesterday looking at the character login issues and found that 6 tables across our various databases needed heavy re-indexing after they got restored during the move. The tables are all over 5GB in size with the largest being 550MM rows at 55GB. This largest table contains all character attributes for the more than 2.5MM characters made in Fallen Earth to date. The short term fix is to re-index and stabilize things. The long term fix will be to move old character data to an archive database where it can be restored on the fly if those older players choose to log back in. Since 1am we spent a couple hours backing everything up, and then about 8 hours re-indexing. We are down to the last large table. It appears that it will take about 5 hours more and then we need about another hour to turn on servers and run through a quick QA test. That puts our earliest ETA for letting players back in at ~6pm Pacific time. With a little luck we should have things running much better for the weekend. I'll keep monitoring this and updating the community.
    EDIT: We hit a serious snag and ran out of drive space on the volume while doing the repair. The team is working around it, but we are going to have to extend the maintenance.
    EDIT: 5/10 1am Pacific. Nearly done restoring and adding more space for the repair. No ETA yet.
    EDIT: 5/10 3:20am Pacific. The server dumped. We've decided to scrap it and replace it due to ongoing issues with it. The team is working on replacing it now.
    EDIT: 5/10 12:00pm Pacific. New hardware is online. Reinstalling everything and then we'll start the database restore / repair process.
    EDIT: 5/10 3:20pm Pacific. Current estimate is 9:30pm for the restore to finish. Then the repair will start.
    EDIT: 5/11 3:30am Pacific. Still waiting on various parts to finish.
    EDIT: 5/11 12pm Pacific. Finishing one last task and then we'll be bringing servers online for QA internal testing. To be clear, backups are intact. No rollback is expected.
    Thanks, Matt
  4. Out of curiosity, did you create your account in April?
  5. There is no intention to change the core mechanics to the game. Specifically we are not turning Fallen Earth into a first person game. Most of this effort is just to upgrade the tech so it is more supportable.
  6. Hmm. Did you lose all your achievements? According to my understanding, we only lost about 2 weeks of progression. So the large majority of your 8.5 year progress should be preserved.
  7. Hi all, This post went up today. It's been a while since I posted engine update progress for @FallenEarth. One big blocker has been figuring out the animation structure and porting them over to match the new characters and creatures. Looks like we have that solved now. #ThisOldGame Thanks, Matt
  8. Hi all, The team has organized how to make up for the lost progression, and we will be starting 2 weeks of Commander for all players on Friday 5/10. I have edited the original post to reflect this. Thanks, Matt
  9. Hi there, We recently had an issue with our primary database. The situation is fixed, and no data was lost. But we are still restoring the old Hoplon data that is used by the migration system. You should be able to migrate your account by Friday. Thanks, Matt
  10. For all the PS4 players, we are looking at the login issue.
  11. Hi all, Looks like the PS4 services on our end are having trouble communicating with Sony’s end. We are investigating the issue. Thanks, Matt
  12. All servers are back online now. We're monitoring things to see if there are any straggler issues.
  13. Hi all, I'm not interested in sugarcoating or spinning this last week. I am sorry for all the downtime. Many players have asked what happened, so I'm going to do my best to summarize the events.
    On Monday the 29th, we took the servers down to move data centers and migrate to all new hardware for our backend systems. This move was necessary for a couple reasons, but mainly because a lot of the hardware was more than 5 years old, and we were already seeing failures and system performance problems. It was only a matter of time before something critical failed. The timing also happened to line up with the end of our last legacy (ludicrously overpriced) hosting contract, and we needed to make some network architecture changes to facilitate some of the features coming after the Engine Upgrade. I figured we could kill three birds with one stone.
    The core challenge was managing all the information needed to drive APB since all the hardware was new. That meant backing everything up and then hand-carrying a series of large hard drives from one location to another. The hardware had been prepped and configured in advance, so while we knew this would be challenging, we had a fairly detailed plan and felt we could manage any issues that popped up.
    Problem #1: Unfortunately, during the move we unearthed some buried issues that delayed our ability to bring servers back online for quite a while. The team did a solid job of working through those problems and even recompiling code in a couple places to remove landmines that we stumbled over. But once we got the servers back online, I felt we did a decent job.
    Problem #2: Shortly after we went live, the brand new RAID controller in our new primary database server quickly degraded in performance. To make the situation worse, we rushed to get servers back online and decided it would be okay to let players back in while the secondary database server finished syncing. The hardware failure hit so quickly that the secondary wasn't ready, so we couldn't fail over. At this point we made an effort to keep the servers online through the weekend, and while our jury-rigged fix allowed some players to get on, it also led to many other players being unable to log in (Error 9). The team decided the quickest way to fix the issue would be to build an entirely new primary database server and then swap everything over on Monday. We didn't want to risk moving damaged drives to the new servers, so we needed a complete backup to make sure we didn't lose anything.
    Problem #3: Once we shut off all the servers and started the backup, we found that the faulty RAID controller could only copy files at the rate of 1GB per minute. After 18+ hours, we were finally able to complete the backup, and then finally get the new server finalized and back online.
    There are a lot of things that went wrong, but in the end, I should have planned better. With that much new hardware, we were bound to have an issue somewhere. To make it up to everyone, on Friday we will be turning on 2 weeks of free Premium for all players. For anyone who has existing Premium, this will add to it. I never want to have to make this kind of apology again. Little Orbit can and will do better in the future.
    EDIT: We just started awarding the Premium (4pm Friday Pacific time). There are a lot of players to gift, so we're doing this in waves. To start we will hit everyone who has logged in within the last 30 days sorted by most recent login first. Then we'll go back 30 days, etc.
    Sorry, Matt
  14. Hi everyone, Unfortunately, we are still working on backing things up properly. The file transfer from the bad server is very slow. We’ve done the math on the remaining files at their current rate and added repair time, and we feel the outage is going to run another 6 hours to roughly 6am Pacific time. We are working through it as fast as we can. Sorry, Matt
  15. Hi everyone, We're running a little long on the hardware maintenance. We're currently estimating another 6 hours with servers coming online around midnight Pacific time. We apologize for the delay, but we also feel it is critical to get this piece of equipment working properly for everyone. I'll post as soon as servers are back online. Sorry, Matt
  16. Hi all, I’m going to go ahead and close this thread. I appreciate the OP’s effort to raise awareness. We are working on the issue. Apologies. Thanks, Matt
  17. MattScott

    error code 9

    Hi there, Lixil has updated her post on this issue. We are hoping to fix the issue tomorrow (5/6). Thanks, Matt
  18. Hi all, Lixil has updated her original post. We are waiting on replacement hardware, but we should be able to do some maintenance tomorrow (5/6) to fix the issue. Thanks, Matt
  19. Hi all, Lixil has updated her original post with more details. Our brand new primary login server has some bad hardware. We have a replacement arriving soon, and we are hoping to schedule maintenance for tomorrow (5/6). Thanks, Matt
  20. Hi all, Lixil has updated her original post with more information. We have some faulty equipment that is hopefully ready to be fixed tomorrow (5/6). Sorry, Matt
  21. Hi all, We are working on the issue. Lixil has updated her original post with more information. We moved many servers last week. Unfortunately, one of them has some bad hardware and is malfunctioning. Logging in is hit or miss right now. We are hoping to fix the hardware tomorrow (5/6). Thanks, Matt
  22. Hi there, Lixil just updated her post in Social. Essentially, we have a bad piece of equipment in the new data center that is causing a problem. We have already secured a replacement part and we hope to schedule downtime tomorrow (5/6) to fix the issue. Sorry, Matt
  23. Hi Wastelanders, Servers are up. My full post about the incident is here: We will schedule Commander early this week so everyone gets notice and can enjoy it properly. Thanks, Matt
  24. Hi all, On Monday, April 29th, we scheduled downtime to physically move from one data center to a new one. This move was unavoidable based on a very expensive, legacy Reloaded contract that had ended. Even though we asked for an extension, we were forced to move out before May 1st 2019. As you know, Fallen Earth is an extremely old game. The servers run on an OS that is no longer available to download, it takes more than an hour to even reboot, and it requires many different databases that all have to be synced to operate correctly. We made backups and planned to move each system one by one into the new data center. However, despite many precautions, some data was lost. The engineers have spent their waking hours attempting to find the right mix of files to get everything restored properly. We did eventually get the system back online, but it appears roughly 2 weeks of progress was lost. Having exhausted all other options, we are going to be putting the servers back online and moving forward. In the meantime, we'll be doing the following to help players recover:
    - We'll be giving out Commander to all players for 2 weeks to help them get caught back up
    - Anyone who lost purchases due to the rollback can open a trouble ticket at http://support.gamersfirst.com, and we'll escalate getting those taken care of as quickly as we can
    We're not going to start the Commander for another couple days, so that all the players can get up to speed on what has happened. I want everyone to be able to take full advantage of the boost over the coming weeks. Please know that the team worked very hard to get us to this point, and we are committed to getting the back end re-written so it can be properly supported in the future.
    EDIT: We will be turning on 2 weeks of Commander for the Fallen Earth players.
    EDIT: We waited an extra couple of days to try and make sure server performance was better. Effective 5/16, we have activated a 4 week Commander code as compensation to the players in the hopes of helping them catch up on lost progress. The code is: FallenNot4gotten
    Apologies, Matt
  25. Hi all, I think we've reached the limits of what we can do, and the down time has already been excessive. We did some tests and the newer data is too incomplete to work. With that in mind, the servers are going to be put back online shortly, and I'll make a public post about the missing data for the rest of the players. Moving forward:
    - We'll be giving out Commander to all players for 2 weeks to help them get caught back up
    - Anyone who lost purchases due to the rollback can open a trouble ticket at http://support.gamersfirst.com, and we'll escalate getting those taken care of as quickly as we can
    The team is committed to getting the back end re-written so it can be properly supported. Apologies, Matt
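
As a rough sanity check on the backup figures quoted in post 13 (a copy rate of about 1GB per minute, sustained for 18+ hours), the arithmetic can be sketched as below. This is only an illustration and not anything the team ran: the function names are hypothetical, and it assumes the copy rate stayed constant for the whole window, which the post does not state.

```python
def data_copied_gb(rate_gb_per_min: float, hours: float) -> float:
    """Total data copied (GB) at a constant rate over the given number of hours."""
    return rate_gb_per_min * hours * 60


def hours_to_copy(size_gb: float, rate_gb_per_min: float) -> float:
    """Hours needed to copy size_gb at a constant rate (assumed, not measured)."""
    if rate_gb_per_min <= 0:
        raise ValueError("rate must be positive")
    return size_gb / rate_gb_per_min / 60


if __name__ == "__main__":
    # Figures quoted in post 13: roughly 1GB per minute for 18+ hours.
    print(data_copied_gb(1.0, 18))     # 1080.0
    print(hours_to_copy(1080.0, 1.0))  # 18.0
```

Under that constant-rate assumption, 18 hours at 1GB per minute works out to roughly 1,080GB, a little over 1TB, which gives a sense of why the backup dominated the outage.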