Re-org Limits and Checkpoints

Background

A re-org limit in this proposal means an agreed-upon number of blocks beyond which nodes will not switch over to a new chain. For example, if the re-org limit is 10 blocks and a node reaches block 100, that node will never change blocks at or before sequence 90. Blocks <= 90 are ‘locked in’. A hash-based checkpoint in this proposal means a block height with a hardcoded block hash. For example, the block hash at height 100 must be b42... Any other block at sequence 100 would be invalid.
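
A minimal Python sketch of the two rules defined above. The function names and the placeholder hash are hypothetical and are not taken from the Iron Fish codebase; the sketch only illustrates the checks as this proposal describes them.

```python
from typing import Dict, Optional


def locked_sequence(head_sequence: int, reorg_limit: int) -> int:
    """Highest sequence that can no longer change (0 if nothing is locked yet)."""
    return max(head_sequence - reorg_limit, 0)


def is_reorg_allowed(head_sequence: int, fork_sequence: int, reorg_limit: int) -> bool:
    """fork_sequence is the last block shared by the current and competing chains.

    The re-org is allowed only if it leaves every locked-in block untouched.
    """
    return fork_sequence >= locked_sequence(head_sequence, reorg_limit)


def passes_checkpoints(sequence: int, block_hash: str, checkpoints: Dict[int, str]) -> bool:
    """A block is valid if its height has no checkpoint or its hash matches it."""
    expected: Optional[str] = checkpoints.get(sequence)
    return expected is None or expected == block_hash


# Re-org limit of 10 with the head at block 100: blocks <= 90 are locked in.
print(locked_sequence(100, 10))        # 90
print(is_reorg_allowed(100, 90, 10))   # True: only blocks 91..100 would change
print(is_reorg_allowed(100, 89, 10))   # False: block 90 would be replaced

# Placeholder hash for illustration only; not a real Iron Fish block hash.
checkpoints = {100: "placeholder-hash-for-block-100"}
print(passes_checkpoints(100, "placeholder-hash-for-block-100", checkpoints))  # True
print(passes_checkpoints(100, "some-other-hash", checkpoints))                 # False
print(passes_checkpoints(101, "any-hash", checkpoints))                        # True
```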

Iron Fish currently has no concept of re-org limits or checkpoints, but they have a number of advantages. Primarily, they provide further guards against large 51% attacks. If a miner attempts a 51% attack that reorganizes the chain further back than the re-org limit, the attack will fail.

Motivation

With the upcoming switch to FishHash, mitigating 51% re-org attacks is a particular priority because there will be a large decrease in network difficulty at the hardfork. The FishHash algorithm produces blocks of considerably lower difficulty than blake3 because it’s a much slower algorithm. Consider the following situation:

  1. FishHash activates at block 100
  2. 100 FishHash blocks are mined at much lower total difficulty than blake3 blocks
  3. The total difficulty of the longest chain is now 100b + 100f, b being the average blake3 difficulty and f being the average FishHash difficulty
  4. A bad actor re-mines a blake3 block 100 at a higher difficulty such that the difficulty difference between block 100 and block 100’ is greater than 100f. This is possible because the FishHash average difficulty is much lower than the blake3 average difficulty
  5. The bad actor can initiate a large re-org with very little hashpower

Without a re-org limit, this attack could be viable for a very long and indeterminate period of time (until the cumulative difficulty of the FishHash blocks exceeds what could reasonably be overcome with blake3 hash power).
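
The arithmetic in the scenario above can be checked with a small Python sketch. The values chosen for b and f are hypothetical and only meant to show the orders of magnitude involved. The sketch takes the scenario at face value and does not model the difficulty adjustment rules that decide whether a block at a given difficulty is actually valid.

```python
from typing import List


def attacker_chain_is_heavier(shared_work: int, honest_tail: List[int], attacker_tail: List[int]) -> bool:
    """Compare total work of two chains that share a common prefix."""
    return shared_work + sum(attacker_tail) > shared_work + sum(honest_tail)


b = 1_000_000  # hypothetical average blake3 block difficulty
f = 1_000      # hypothetical average FishHash block difficulty

shared_work = 99 * b           # blocks both chains agree on
honest_tail = [b] + [f] * 100  # original block 100, then 100 FishHash blocks

# Total work of the honest chain matches step 3: 100b + 100f.
honest_total = shared_work + sum(honest_tail)
print(honest_total == 100 * b + 100 * f)  # True

# The single re-mined block 100' must beat block 100 by more than 100f.
threshold = b + 100 * f
print(threshold)  # 1100000
print(attacker_chain_is_heavier(shared_work, honest_tail, [threshold]))      # False
print(attacker_chain_is_heavier(shared_work, honest_tail, [threshold + 1]))  # True
```

With these made-up numbers, a single block about 10% heavier than an average blake3 block outweighs 100 FishHash blocks.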

With a re-org limit, however, this attack would only be viable up to sequence HARD_FORK_SEQUENCE + RE_ORG_LIMIT. Additionally, at this point a hash-based checkpoint could be adopted for the first FishHash block to further solidify the canonical chain.

Potential Downsides

The downsides of a re-org limit are directly related to how large the limit is. If the re-org limit is, say, 10 blocks (~10 mins), then a network interruption of only 10 minutes could cause the network to split and be in an irrecoverable state. If the re-org limit is 1500 blocks (~1 day), it’s much less likely that a network disruption would last that long. The key is to add a sensible limit that balances the risk of a network split against protection from a potential 51% attack.

With the switch over to FishHash there is also some nuance around the block limit. Until the difficulty adjusts, it could take a while for the first few FishHash blocks to be mined, so 10 blocks could correspond to much more than 10 minutes (possibly hours). The choice of re-org limit must take that into account as well.
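
A rough Python sketch of how a block-count limit translates into wall-clock time. The 60-second target block time is an assumption inferred from "10 blocks (~10 mins)" above, and the 30-minute slow-block figure is purely hypothetical.

```python
TARGET_BLOCK_SECONDS = 60  # assumption: implied by "10 blocks (~10 mins)"


def reorg_window_hours(reorg_limit_blocks: int, seconds_per_block: float = TARGET_BLOCK_SECONDS) -> float:
    """Wall-clock time covered by a re-org limit at a given average block time."""
    return reorg_limit_blocks * seconds_per_block / 3600


# At the target block time.
print(round(reorg_window_hours(10) * 60, 2))  # 10.0 (minutes)
print(reorg_window_hours(1500))               # 25.0 (hours, roughly a day)

# Right after the hardfork, blocks may arrive far more slowly.
# Hypothetical: 30 minutes per block until difficulty adjusts.
print(reorg_window_hours(10, seconds_per_block=30 * 60))  # 5.0 (hours)
```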

Hard vs Soft Limit

The re-org limit could be coded into the consensus parameters of the network (hard limit), or it could be set as a config value for node runners to change if they wanted to (soft limit).

The benefit of a hard limit is that users will know that the network as a whole will not change at a certain block height. If a user has a transaction at that block height they can be very confident that it will never change. With a soft limit users do not know what other node operators have set their limits to. So there is no overall assurance at a network level.

The benefit of a soft limit is that it can be changed to either counteract a network split or allow specific users to change their tolerance to a re-org. Although again, setting a stricter soft limit does not give a network-level guarantee.
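
One way to picture the difference is to ask at what depth a block is locked in on every node. The Python sketch below is a simplified model, not Iron Fish code: each node is represented only by its re-org limit, and None stands for a node that has no limit configured.

```python
from typing import List, Optional


def network_finality_depth(node_limits: List[Optional[int]]) -> Optional[int]:
    """Depth at which a block is locked in on every node.

    Returns None if any node has no limit, since that node can always re-org.
    """
    if not node_limits or any(limit is None for limit in node_limits):
        return None
    return max(node_limits)


# Hard limit: every node runs the same consensus value.
print(network_finality_depth([1500, 1500, 1500]))  # 1500

# Soft limit: operators pick their own values (hypothetical settings).
print(network_finality_depth([10, 1500, 5000]))    # 5000
print(network_finality_depth([10, 1500, None]))    # None
```

In the soft limit case a user cannot see the other operators' settings, so the depth computed here is unknown to them in practice.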

Proposal

We propose:

  • add a hard re-org limit with a default value of 3000 blocks (~2 days).
  • add logic for both hard limit and soft limit hash-based checkpoints. Once users are reasonably sure of the first FishHash block, they can manually add that block hash to the list of soft limit hash-based checkpoints stored in the node’s config. In the next release of Iron Fish, a hard limit checkpoint with that block hash will also be added to the consensus parameters.

The hard re-org limit of 2 days is a very conservative number. It is meant as a fallback to finalize the hardfork if node runners don’t update their nodes with the hash-based checkpoints. It’s also meant to limit the future impact of 51% attacks without having to periodically adopt more hash-based checkpoints. More aggressive finalization can happen manually with the soft hash-based checkpoint. This will allow FishHash blocks to quickly finalize with important node runners like mining pools, exchanges, and block explorer nodes so that users can start sending transactions on the network again.

Had some discussions with @mat about this. Some of our takeaways:

  • User-configurable hash-based checkpoints can be dangerous. Node runners could potentially mess up the state of their node by adding the wrong checkpoints by accident or even through some sort of malicious attack. Bitcoin only has consensus-level hash-based checkpoints, which seem much safer to implement.
  • Adding a consensus hash-based checkpoint before the re-org limit is hit seems risky. If a new node version is released with a hash-based checkpoint that points to the canonical FishHash chain, it will take a while for all nodes to upgrade. If there is a re-org in that time, we could end up with some nodes being incompatible with the hash checkpoint and left in an unrecoverable state.
  • It seems much safer just to allow the FishHash chain to solidify by nodes passing the HARD_FORK_SEQUENCE + RE_ORG_LIMIT block. Let’s call this block the Fish Hash Finalization Block: the block after which a node can no longer re-org back to before the hardfork. Reaching the Fish Hash Finalization Block will happen at almost exactly the same time for all nodes, so there’s much less chance that a malicious re-org could affect some nodes and not others. Additionally, as long as the majority of hash power passes the Fish Hash Finalization Block at the same time, that chain will eventually become heavier than any malicious re-org attempt. So any one node that gets left behind will eventually be brought back to the main chain once enough blocks are mined on top of it.
  • Because we are relying on the re-org limit to solidify the hard fork, 3000 blocks seems a little too long. 1000 blocks (~16 hrs) still seems like a conservative enough number that will allow the network to quickly finalize the hardfork. With the change in difficulty, 1000 blocks might take more like ~3 days to finalize, which still seems fine as long as users are not sending transactions during that time.
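
A small Python sketch of the Fish Hash Finalization Block described in the takeaways above. The activation height of 100 is hypothetical (reused from the earlier example), and the sketch assumes HARD_FORK_SEQUENCE is the sequence of the first FishHash block; neither value comes from the actual network parameters.

```python
HARD_FORK_SEQUENCE = 100  # hypothetical activation height, for illustration only
RE_ORG_LIMIT = 1000       # the limit suggested in the takeaways above


def finalization_sequence(hard_fork_sequence: int, reorg_limit: int) -> int:
    """Sequence of the Fish Hash Finalization Block."""
    return hard_fork_sequence + reorg_limit


def can_reorg_before_hard_fork(head_sequence: int, hard_fork_sequence: int, reorg_limit: int) -> bool:
    """True while a re-org could still replace the first FishHash block.

    Blocks at or below head_sequence - reorg_limit are locked in, so the
    hardfork block locks once the head reaches the finalization sequence.
    """
    return head_sequence < finalization_sequence(hard_fork_sequence, reorg_limit)


print(finalization_sequence(HARD_FORK_SEQUENCE, RE_ORG_LIMIT))             # 1100
print(can_reorg_before_hard_fork(1099, HARD_FORK_SEQUENCE, RE_ORG_LIMIT))  # True
print(can_reorg_before_hard_fork(1100, HARD_FORK_SEQUENCE, RE_ORG_LIMIT))  # False
```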

So the newly proposed plan is:

  • add a hard re-org limit with a default value of 1000 blocks (~3 days after the hardfork and ~16 hrs once chain difficulty has stabilized).
  • add logic for hard limit hash-based checkpoints. Once the Fish Hash Finalization Block is reached by most of the network, release a new version of the node that adds a hash-based checkpoint for the first FishHash block after the hardfork. Users can upgrade on their own time.

Some interesting reading on reorg caps for Eth clients: "Reorg Caps and Confirmation Delays: A Primer on Finality Arbitration" by Ethereum Classic Labs (etc_core, Medium).

After more discussion, I think that a single hash-based checkpoint after the hardfork would be sufficient here. We realized that the potential 51% attack is much less threatening than previously thought. An attacker could not replace blake3 hashpower 1 for 1 by re-mining old blocks. It would be more like 0.5% of blake3 hashpower, since the attacker has to overcome the work of the previously mined blocks as well as add their own additional work. This makes the attack much less viable, and once the majority has upgraded to the hash-based checkpoint, it is no longer viable.

The downsides of no rolling re-org limit are:

  • nodes will have to upgrade versions once more after the hardfork
  • the network will not have a re-org limit going forward (something that could be useful for bridges)

We can always add a re-org limit later on, but since it’s not strictly necessary for FishHash, we are leaning towards not adding it in this hardfork.