After a tremendously long development period, Dyescape has finally been able to put out a first release: v0.1.0. Those who were active in the Discord or the Twitch livestream yesterday will have already seen that we ran into a few technical struggles. This thread is meant to be transparent towards the community: it explains what happened, why it happened, how it could happen and what has been done to address it. On a positive note, we'll also list a few technical things that did go well.
Database cluster
In a project like this, being dynamic with content changes is crucial. Software and content are two completely separate complexities, and in order to ensure that they don't become a single, even larger complexity, we've made everything configurable. From items to skills, creatures, quests and random encounters: everything is configurable. To facilitate this, a JSON document based storage system was set up. For the past years during development, this has been done through flatfile & SFTP to allow the content team to quickly make changes, test these and commit them to version control. A few months ago, a MongoDB cluster was set up to make it production ready.
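To illustrate what a configurable content document can look like, here is a minimal sketch. The field names (id, type, display_name, stats) and the load_definition helper are assumptions made up for this example and are not the actual Dyescape content format.

```python
import json

# Hypothetical item definition; every field name here is an assumption
# for illustration, not the real Dyescape schema.
ITEM_JSON = """
{
  "id": "rusty_sword",
  "type": "item",
  "display_name": "Rusty Sword",
  "stats": {"damage": 4, "level_requirement": 1}
}
"""

# Fields this sketch treats as mandatory for any content document.
REQUIRED_FIELDS = ("id", "type", "display_name")


def load_definition(raw: str) -> dict:
    """Parse one content document and reject it if a required field is missing."""
    doc = json.loads(raw)
    missing = [field for field in REQUIRED_FIELDS if field not in doc]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    return doc


item = load_definition(ITEM_JSON)
print(item["id"], item["stats"]["damage"])
```

Validating documents when they are loaded means a broken content change is caught at import time rather than in the middle of gameplay.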
Having set up the database cluster, everything was operational. We could import our content, play the game, save characters, swap to a different server, and everything would be there. However, before launch, we still had a few important content & software changes to process. While doing so, something in the database got upset, causing cluster sharding initialisation to fail and the cluster to become unusable. This was the first major technical setback we ran into, and it is what caused the initial release to be two hours late. Thankfully, the team quickly jumped in to help and we managed to solve the issue.
GEO load balancing
GEO load balancing is a technical setup that automatically routes users to the nearest server. This is set up by our Anti-DDoS & Load Balancing provider. From what we could see, this load balancing worked for most users. In earlier playtests, North American users saw a ping of roughly 40 to 60ms on average, and west European users a ping of around 15 to 25ms. This GEO load balancing is set up for two main reasons: to give users the best possible connection and reduce lag, and to limit cross-continent bandwidth usage.
However, due to a misconfiguration on the proxy service discovery side, players were often sent to a fallback server in a different region. For instance, they would connect to a North American proxy, but then to a European fallback server. In some unlucky cases, people didn't get the expected GEO load balancing on the proxy to begin with. This caused, for example, American users to connect to a European proxy and then to an American fallback server, doubling the already bad latency as a result. However, this only affected a handful of people and was likely caused by some fallback servers crashing (more on this later).
In order to fix a few latency issues some players were having, two new domains have been created: eu.dyescape.com and na.dyescape.com. In order to keep things as stable as possible for the time being, the proxy & fallback setup has now been fully split into two separate networks, so there's no chance of ever being routed to a fallback server of the wrong region. The play.dyescape.com domain should still provide accurate GEO load balancing. If not, please contact the team.
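The idea behind the split can be sketched as follows: a proxy only ever considers fallback servers in its own region, and fails loudly rather than crossing regions. This is a simplified illustration under assumptions. The Server type, the server names and the pick_fallback function are hypothetical and do not reflect our actual proxy configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    name: str      # hypothetical server name
    region: str    # "eu" or "na"
    healthy: bool  # whether the server currently accepts players


def pick_fallback(proxy_region: str, servers: list[Server]) -> Server:
    """Return a healthy fallback server in the proxy's own region only."""
    candidates = [s for s in servers if s.region == proxy_region and s.healthy]
    if not candidates:
        # Fail loudly instead of silently routing to another continent.
        raise LookupError(f"no healthy fallback server in region {proxy_region!r}")
    return candidates[0]


SERVERS = [
    Server("fallback-eu-1", "eu", healthy=True),
    Server("fallback-na-1", "na", healthy=False),
    Server("fallback-na-2", "na", healthy=True),
]

print(pick_fallback("na", SERVERS).name)  # prints "fallback-na-2"
```

In this sketch a crashed server in one region can never cause a player to be handed to the other region; the trade-off is that a region with no healthy fallback server rejects the connection instead.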
Cross-continent bandwidth
We have explicitly set up the infrastructure to have considerably strong network bandwidth. At our hosting provider, we run a large private network consisting of 4 solid dedicated servers with 8 physical cores and 64GB memory each, and a 2 Gbit connection. Despite our initial thoughts yesterday, we currently see no signs of bandwidth falling short. The timeout errors we were seeing yesterday were actually caused by an issue in the software. The resulting issues for players, however, made us think it was a network-related issue.
Crashing fallback servers & timeouts
While the Dukes were playing, we were notified of numerous timeout & crashing issues. After some investigation, it seemed to be caused by a recent software change in our interactive chat plugin. A code issue caused an infinite loop, exhausting the CPU capacity and killing the instance. This issue was fixed at around 02:15 AM our time. Afterwards, the Dukes could continue playing.
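For readers curious how this class of bug can be guarded against, here is a generic sketch. It is not the actual chat plugin code: the placeholder expansion scenario, the expand function and the MAX_PASSES cap are all assumptions chosen to show how a loop that might never settle can be bounded.

```python
# Assumed cap on expansion passes; the right value depends on the use case.
MAX_PASSES = 10


def expand(message: str, placeholders: dict[str, str]) -> str:
    """Expand {key} placeholders repeatedly until the text stops changing.

    A self-referencing placeholder would keep an unbounded loop spinning
    forever, so the number of passes is capped and an error is raised instead.
    """
    for _ in range(MAX_PASSES):
        expanded = message
        for key, value in placeholders.items():
            expanded = expanded.replace("{" + key + "}", value)
        if expanded == message:
            return expanded  # nothing changed in this pass, so we are done
        message = expanded
    raise RuntimeError("placeholder expansion did not settle; possible cycle")


print(expand("{greeting}!", {"greeting": "hi {name}", "name": "Duke"}))
```

With a self-referencing entry such as {"a": "x{a}"}, this version raises an error after ten passes instead of exhausting the CPU.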
Remaining issues & alpha queue
The question we have seen posted in Discord most often since the release is why the queue is not processing. There's a good reason for this: although we've fixed every infrastructure & fatal software related issue we know of, there are still a few in-game issues causing quests to become stuck, blocking progression. These issues are scheduled for a v0.1.1 patch release, which is planned to go live in the upcoming days. The team is working round the clock to get this patch out.
Because these remaining issues prevent gameplay progression, we've decided not to progress the queue until v0.1.1 goes live. We deemed the best decision would be to have only Dukes test the game. It's a small group of people that can help us efficiently identify critical issues.
Positive notes
On to some positive comments, because despite the technical issues, there are also multiple compliments worth mentioning. I'll go over some of these below:
Gameplay feedback (from Dukes): while currently only Dukes are able to play, we've received very positive feedback from them. The game is smooth, skill usage is bliss, content is interesting & understandable, and there are multiple quality of life features that are very much appreciated. After having fixed the regional connections & freezes, ping seems to be good, and the average milliseconds per tick seems to be healthy. Once the in-game issues are fixed in the v0.1.1 launch, we can likely start processing the queue.
Conclusion & thank you
We want to close this off with a massive thank you. Albeit rough around the edges, Dyescape has launched. It has taken us well over 4 and a half years to get to this point. It has been an incredible ride to reach this state. We've had our ups, we've had our downs. We've seen team members come and go, we've seen content be revamped, software be overhauled and we've seen a community grow.
Despite the messy infrastructure launch, in all of our years of playing Minecraft, we have not seen a more considerate, friendly & heartwarming community. The support is incredible. We will continue to work hard getting the fixes out and to have everyone be able to join. Hope to see everyone on the server in v0.1.1!