- On August 14, Terence Tsao, a member of the Prysmatic Labs team, reported a bug that caused a synchronisation problem with Cloudflare’s roughtime clock.
- The beacon chain, which supports the whole Ethereum 2.0 network, came under heavy strain, leaving the other clients struggling to keep up.
Although only a short time has passed since its launch on August 4, the Ethereum 2.0 Medalla testnet had been running well apart from some initial setbacks. However, an incident involving the Prysm client alarmed the developers, and the situation quickly turned, with about 80% of the network vanishing.
The team announced in a Medium post that it is working on making its clients more robust, which should return Medalla to a workable state, albeit with some damage to everyone’s balance. The team may consider restarting from scratch if the participation rate does not recover enough to keep the network running.
Chaos Created by a Clock Bug
Of the five clients on the Medalla testnet, Prysm has been the most widely used. On August 14, Terence Tsao, a member of the Prysmatic Labs team, reported a bug that caused a synchronisation problem with Cloudflare’s roughtime clock. Apparently, roughtime started serving a time 4 hours ahead of the actual time. As a result, all Prysm clients began producing blocks and proofs for a part of the chain that did not exist yet.
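To see why a skewed clock sends a client into the future of the chain, it helps to recall that a beacon node derives the current slot from wall-clock time. The following is a minimal sketch, not Prysm’s actual code: it assumes the 12-second slots and 32-slot epochs of the Ethereum 2.0 phase 0 specification, and the genesis timestamp and function names are hypothetical, chosen only for illustration.

```python
SECONDS_PER_SLOT = 12   # slot duration in the phase 0 specification
SLOTS_PER_EPOCH = 32    # slots per epoch in the phase 0 specification


def current_slot(now: int, genesis_time: int) -> int:
    """Return the slot a node believes is current, given its clock reading."""
    if now < genesis_time:
        return 0
    return (now - genesis_time) // SECONDS_PER_SLOT


# Hypothetical genesis timestamp (NOT Medalla's real one).
GENESIS_TIME = 1_600_000_000

# A node with a correct clock, ten days after genesis.
true_now = GENESIS_TIME + 10 * 24 * 3600
# A node whose time source runs 4 hours ahead, as reported for roughtime.
skewed_now = true_now + 4 * 3600

honest_slot = current_slot(true_now, GENESIS_TIME)
skewed_slot = current_slot(skewed_now, GENESIS_TIME)
slot_gap = skewed_slot - honest_slot

print(f"honest node is at slot {honest_slot}")
print(f"skewed node is at slot {skewed_slot}")
print(f"gap: {slot_gap} slots, about {slot_gap // SLOTS_PER_EPOCH} full epochs")
```

Under these assumptions, a 4-hour skew puts a node 1,200 slots ahead of its peers, so everything it signs refers to slots the rest of the network has not reached yet.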
Despite the missing blocks and the proofs arriving from the future, the remaining clients were able to keep building the original chain. Things fell back into place once the clocks were readjusted, and participation began to rise. The real chaos started a few hours after the original incident. After the clocks were readjusted, the future-dated proofs that Prysm clients had already produced started to become valid. As their slashing protection kicked in, the rejoined nodes began dropping out again to avoid further conflicts in proofs.
With all this going on, the beacon chain, which supports the whole Ethereum 2.0 network, came under heavy strain, leaving the other clients struggling to keep up. The memory and CPU requirements of working through the issue were overwhelming: one Lighthouse client was seen using 30 GB, while Teku had trouble getting by with 12 GB.
In light of the mess created by the bug, members of Prysmatic Labs responded efficiently, which prevented the total collapse of the testnet. The team planned to slow the network down in order to speed things up.
Blessing in Disguise
This incident has shed light on what the testnet’s developers need to take care of. In their post, they described it as a blessing in disguise: there is little point in having a testnet that doesn’t test anything, and expecting only a happy flow is simply unrealistic.
This has become a good opportunity for them to pay more attention to time synchronization and its importance in Ethereum 2.0. They noted that a robust network needs a diversity of clients, with no single client dominating. When such situations arise, fewer validators will then drop out, which makes it easier for the network to recover.
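The case for client diversity can be illustrated with simple arithmetic. The beacon chain can only finalize while validators holding at least two thirds of the stake are attesting; that threshold is background from the Ethereum 2.0 specification, not a figure from the developers’ post. The sketch below uses made-up client shares and hypothetical function names to compare a network dominated by one client with a more balanced one when the largest client goes offline.

```python
from fractions import Fraction

# Finality needs attestations from at least 2/3 of the stake (phase 0 spec).
FINALITY_THRESHOLD = Fraction(2, 3)


def participation_after_outage(client_shares: dict, failed_client: str) -> Fraction:
    """Fraction of validators still online if every node of one client fails."""
    total = sum(client_shares.values())
    remaining = total - client_shares[failed_client]
    return Fraction(remaining, total)


def can_finalize(participation: Fraction) -> bool:
    """True if the remaining participation is enough for finality."""
    return participation >= FINALITY_THRESHOLD


# Hypothetical validator shares in percent (illustrative, not measured data).
dominated = {"client_a": 70, "client_b": 20, "client_c": 10}
balanced = {"client_a": 30, "client_b": 25, "client_c": 25, "client_d": 20}

for name, shares in (("dominated", dominated), ("balanced", balanced)):
    left = participation_after_outage(shares, "client_a")
    print(f"{name}: {float(left):.0%} still online, can finalize: {can_finalize(left)}")
```

In the dominated network, losing the largest client leaves 30% of validators online and finality stalls. In the balanced one, 70% remain and the chain keeps finalizing while the faulty client is fixed.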
Currently, they plan to work on a format that lets stakers switch their signing keys to hot-standby nodes of other clients that provide slashing protection. Once finished, it will be possible to switch the validator clients themselves, not just the keys. The developers strongly supported the idea of individuals getting involved and running their own validators.
The network is now recovering slowly, and clients such as Prysm and Lighthouse have become fairly efficient at finding the right chain head and continuing to build the beacon chain.