Stake Wars 2 Launch Recap
After the success of Stake Wars 1, our incentivized staking and operating initiative, a number of updates were made to staking and domains. For example, nominator limits were removed, storage fee funds were introduced and cross-domain messaging (XDM) was implemented. Testing these thoroughly is a prerequisite for launching mainnet. To that end, Stake Wars 2 was conceived and announced, with a healthy 0.3% of total token supply up for grabs for those who help us find problems we are not yet aware of.
The original Stake Wars was immensely successful, with no big issues being reported and our wonderful community enjoying the journey we all went on together. This, of course, set an expectation that we write perfect code and have perfect testnet infrastructure 100% of the time. We launched Stake Wars 2 on a community call on July 10th 2024 to great excitement, and many operators were set up over the next 24 hours. Shortly after launch we could see that a few problems were developing. Of course, this is what testnets are for and why they are incentivized - so that we can work with our community to identify issues that only present themselves when the protocol is tested at scale, and to discover edge cases, corner cases and broken processes.
We have been providing regular updates in Discord but felt it was time to provide a consolidated breakdown of what has happened and where we are going.
Slashing
At first the issues centered on the user interface, Astral, which experienced stability problems as well as data inaccuracies and many small UI issues. On the Friday/Saturday after launch, operators started reporting they had been slashed, which was confirmed by the Autonomys team. Slashing is the process of confiscating stake from operators who are not acting in good faith. The operators were running official binaries and following the documentation so what gives?!?
It was soon discovered that the root cause was an existing issue we were already aware of: the gloriously named “unbounded gap between head receipt and the best domain block” problem. In simple terms, this gap arises because execution receipts (ERs) are submitted with bundles, causing potential mismatches between the confirmed domain block and the head receipt. The fix was to introduce a mechanism that ensures ERs keep up with the latest domain blocks, so the gap cannot grow without limit.
Obviously, changes to fraud proofs and the security of the network call for a sure-footed and methodical approach. Ning, one of our experienced protocol engineers, produced branches that were tested on our internal devnet, where both the network upgrade path and the fixes were confirmed as working. Afterwards, the same process was run on the Gemini 3h testnet. The fix was to introduce a submit_receipt to fill up the unbounded gap between HeadDomainNumber and HeadReceiptNumber. A secondary issue was found and addressed in this phase, which did cause a further delay in the protocol being ready for a restart. To confirm, the build that is running on testnet right now no longer exhibits the original issue or the secondary one.
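To make the gap concrete, here is a toy model in Python. It is a sketch under stated assumptions, not the protocol's implementation: the two counters borrow the names HeadDomainNumber and HeadReceiptNumber from above, while the class, the `produce_domain_block` helper and the rule that each receipt carried by a bundle advances the head receipt by exactly one block are simplifications invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class DomainState:
    """Toy stand-in for the two counters named in the post (not real chain state)."""

    head_domain_number: int = 0   # number of the best domain block
    head_receipt_number: int = 0  # highest domain block covered by a receipt

    @property
    def gap(self) -> int:
        return self.head_domain_number - self.head_receipt_number

    def produce_domain_block(self, receipts_in_bundles: int) -> None:
        """Assumed rule: receipts only arrive inside bundles, one block each.

        Receipts cover already-produced blocks, then a new domain block is added.
        """
        self.head_receipt_number = min(
            self.head_receipt_number + receipts_in_bundles,
            self.head_domain_number,
        )
        self.head_domain_number += 1

    def submit_receipt(self) -> bool:
        """Hypothetical standalone receipt: advances the head receipt without a bundle."""
        if self.gap == 0:
            return False  # nothing left to fill
        self.head_receipt_number += 1
        return True


def simulate() -> tuple[int, int, int]:
    state = DomainState()
    # Five domain blocks are produced while no receipts arrive: the gap opens.
    for _ in range(5):
        state.produce_domain_block(receipts_in_bundles=0)
    gap_after_stall = state.gap
    # Receipts resume at one per block: the gap stops growing but never shrinks.
    for _ in range(5):
        state.produce_domain_block(receipts_in_bundles=1)
    gap_after_recovery = state.gap
    # Standalone receipts fill the gap without waiting for more bundles.
    while state.submit_receipt():
        pass
    return gap_after_stall, gap_after_recovery, state.gap


if __name__ == "__main__":
    print(simulate())
```

In this simplified model, receipts that only travel with bundles can keep pace with new domain blocks but cannot catch up once they have fallen behind, which is why a separate way to submit receipts closes the gap.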
Astral Problems
So we had the situation where the protocol was causing us issues but this was compounded by a number of problems with Astral which were becoming apparent. As Astral is the official window into the staking process this caused a lot of consternation within the operator and nominator community. Note that while Polkadot.js and Subscan provide insights into the state of the network, using them to understand the state of domains can be awkward and unwieldy.
Here is a rundown on the issues identified:
Unstable under load
- Cause: Inefficient GraphQL query strategy, with more calls being made than necessary
- Mitigation:
- Status: Mostly mitigated, can improve further via “live queries”
- Testing plan: Monitor for instability.
Staking data inaccurate
- Cause: Indexer not properly capturing data to be indexed
- Mitigation:
  - Short-term fix: Use RPC data for “live” operator and nominator status
  - Long-term fix: Use dedicated staking indexer
- Status: In progress
- Testing plan: Extensive internal testing by the team
Indexer out of sync
- Cause: Indexing slowed down due to the increase in transactions being submitted for Stake Wars 2.
- Mitigation:
  - Bandwidth for the squid was increased and indexing resumed normally after about 12 hours of syncing.
  - This situation will improve with Autonomys' Subsquid design improvements.
- Status: Mitigated (for now)
- Testing plan: Monitor
Various Astral UI issues
- Cause: Required more thorough testing to highlight issues
- Mitigation:
  - GitHub issues have been created and are being worked through. These are mostly minor.
- Status: In progress
- Testing Plan: Extensive internal testing by the team
As you can see, there were a number of problems to deal with and the inability to see what was going on with the network in Astral made things very difficult for those trying to stake or unstake.
Progress On Fixing Astral
In the past few weeks there has been a veritable tsunami of pull requests into the Astral repository from Marc. As you can see above, this has ranged from updating the foundational infrastructure with a new indexer topology, built around specialized Subsquid instances rather than larger, monolithic ones, to ensuring that operators and nominators are only offered the operations available to them in their current state. And everything in between!
We believe we are almost there with the frontend changes supported by the backend changes. As an added bonus there are many new features and data points available to operators and nominators. We hope that Marc will be able to give us a full rundown on all of this at some point soon - we think you’ll be impressed!
Requirements to Restart
Before we restart there are a small number of remaining issues we need to clear out and we would also like to have you, our community, do some “pressure-off” initial testing of Astral to confirm readiness.
The other items we have to resolve are:
- Investigate an issue when syncing an operator node from scratch. There is a suspicion there could be a regression in the XDM relayer causing a problem here.
- Reimburse slashed tokens.
  - Note that any nominator who claimed MinStake from the faucet but did not register an operator will be docked 100 TSSC from their reimbursement.
  - Abusers of the faucet will be excluded from Stake Wars 2 rewards. Up until we restart there is an amnesty on any ill-gotten or erroneously claimed tokens, which can be returned to st9ykCvB5M7AqUfq1nzALZsBRB7hVrQ3GZbAFPnBmYK3sXDC9.
  - NOTE: We will still be running the MinStake claim initiative for those who want to participate but do not have enough TSSC farmed. This will be a manual process and we invite you to reach out to a member of the Ecosystem team in Discord or email jim@autonomys.xyz.
- As mentioned above, we need to release and test the huge changes to Astral that will bring stability, performance and workflow enhancements. All the things that tripped us up at the initial launch.
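For anyone wanting to sanity-check their expected reimbursement, the docking rule above can be written as a small Python sketch. Only the 100 TSSC deduction comes from this post. The function name, its parameters and the choice to floor the result at zero are assumptions made for illustration, and the team's actual calculation may differ.

```python
# Deduction stated in the post for nominators who claimed MinStake from the
# faucet but never registered an operator.
MIN_STAKE_DOCK_TSSC = 100


def reimbursement_tssc(slashed_tssc: int,
                       claimed_min_stake: bool,
                       registered_operator: bool) -> int:
    """Hypothetical helper: amount returned for a given slashed amount."""
    amount = slashed_tssc
    if claimed_min_stake and not registered_operator:
        amount -= MIN_STAKE_DOCK_TSSC
    # Assumption: a reimbursement is never negative.
    return max(amount, 0)


if __name__ == "__main__":
    # Claimed from the faucet and registered an operator: no deduction.
    print(reimbursement_tssc(250, claimed_min_stake=True, registered_operator=True))
    # Claimed from the faucet without registering an operator: docked 100 TSSC.
    print(reimbursement_tssc(250, claimed_min_stake=True, registered_operator=False))
```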
The mechanics of the restart will be the same as the original launch, where a block will be marked live on a community call. We apologize for the hiccup and acknowledge that more internal testing may have helped, but we do want to reiterate that events like these are exactly what we run a testnet for. We hope that all our original participants and more will be standing at our side when the time for the restart comes. Keep an eye on announcements for confirmation of when that will be. Keep the faith, Stake Warriors!