Bitcoin's initial block download problem (IBD)

Chris Dev

July 14, 2019

Sometimes, when I write about flaws in Bitcoin, I get negative feedback, but I think it's important that we do so. These criticisms are not "an attack on Bitcoin". They simply state facts and outline potential solutions. Part of the intention is to open a dialog, which is part of any healthy community. This is part of my critique of the current Bitcoin Maximalism ethos: there is a culture where, if you discuss any weaknesses in Bitcoin, you're considered an enemy. It's somewhat ironic, since I'm sure many of the people pointing fingers have less of an interest in the success of Bitcoin than I do. Coming from a science and engineering background, I understand that this sort of healthy skepticism is the only way we can move forward.

One of the major problems in Bitcoin is that not very many people run a full node. In fact, when you talk to many Bitcoiners about running a full node, they think it's something you need a degree in computer science to even attempt. Why is that the case? Isn't software something that can be refined to the point where it's easy to run? A modern-day operating system is VERY complex, much more complex than the software in a full node, but these same people run that incredibly complex piece of software without trouble. Is it just that we're still early and full node software is not well designed? No, the reason most users don't run a full node is what's known as the initial block download (IBD) problem. In order to validate even a single new Bitcoin transaction, you must first download the entire existing Bitcoin blockchain. Downloading and processing this 220+ GB of data takes a long time, and on slower connections it can take several days.

Before we go on to ways of addressing this issue, you may be asking: why do I need to run a full node? I'll say something that some will disagree with: on a day-to-day basis, you don't. Personally, I run a full node because I want to actually validate all my transactions, but I don't use a full node to store Bitcoin. I use it exclusively to validate my transactions, which are very rare because I mostly use Bitcoin as a long-term savings account and don't make day-to-day transactions. After I have validated my transaction, I turn my full node off, sometimes for years, and only turn it back on when I make another transaction. Truth be told, it's probably OK to rely on several independent block explorers (like Blockchair, Blockchain.info, etc.) to validate your transactions, but what is VERY important in my opinion is that, when needed, users should be ABLE to run a full node.

In order to understand this, let's talk about a hypothetical cryptocurrency that has only one full node. If such a cryptocurrency existed, the only person who could validate transactions would be the single full node operator. That means the single full node operator could censor any transactions they wish, or any that a government tells them to. If this were to happen with Bitcoin, it would lose its distinguishing characteristic - censorship resistance. In addition, the full node operator could create new rules, such as additional inflation. At that point, we just have another PayPal or Venmo, and it becomes not very interesting.

But what if there are a few nodes instead of one? If the blockchain is very difficult to store and there are only a few full node operators, you run the risk of cartelization. What if those node operators collude? Essentially, it boils down to a high risk of ending up with the same consequences as the single node operator scenario. The more spread out the nodes are in terms of geographical region, operator ideology and political beliefs, even language, the better for things like censorship resistance.

We can even see this today with Ethereum, which has a blockchain that has exceeded 1 TB in size. Even the Ethereum Foundation no longer runs a full node. For the average user, it's practically impossible to run a full node, and it's getting worse all the time. The most important instance where users running full nodes mattered was during the activation of Segwit. Segwit was supposed to be activated by what's known as miner signaling, where miners signal that they are ready for a particular upgrade, but the miners, many of whom didn't like Segwit, decided to use the signaling mechanism as a voting mechanism instead. If miners were the only ones operating full nodes, we'd likely not have Segwit today. Instead, many users, myself included, ran a modified full node that included a patch implementing BIP 148. This patch forced the miners' hand: had they not activated Segwit by August 1, 2017, there potentially would have been a chain split where some miners would refuse Segwit transactions and some accept them. This sort of activation is known as a UASF (user-activated soft fork), and it was one of the most important moments for Bitcoin, since it proved that users of Bitcoin had the ultimate say and miners had to succumb to the wishes of their masters - the hodlers. This is when the importance of running a full node, or at least being able to do so on demand, became clear to all. Now miners, and the companies that work with them, will tell you that what I'm saying is not the case and that they just decided to activate Segwit on their own, but without a doubt, users drove the activation of Segwit.

So, as Bitcoin holders, it's in our interest to protect this capability should the need arise in the future. To do that, we need to prevent the size of the blockchain from growing out of control. This brings us to the initial block download problem. When you start up a full node, the first thing it does is download all the blocks in the history and validate them. This is part of the design of Bitcoin, and without literally every byte of data in the blockchain, you can't do anything useful. This includes EVERY single transaction ever made. Yes, even if Bitcoin were to run for 1000 years, users would still be downloading that first transaction that Satoshi sent to Hal Finney in 2009, along with EVERY other transaction that has occurred over those 1000 years. You are probably starting to see the problem. Over time, this will add up to an enormous amount of data.
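
To put a rough number on "enormous", here is a back-of-the-envelope sketch in Python. It rests on two assumptions of mine that are not measurements: a block arrives every 10 minutes on average, and every block is completely full at the 1 MB size used in this post (real blocks vary, and Segwit changes the accounting). Treat the result as a rough upper bound under those assumptions, not a forecast.

```python
# Back-of-the-envelope upper bound on blockchain growth.
# Assumptions (illustrative, not measured): one block every 10 minutes,
# and every block completely full at 1 MB.
MINUTES_PER_BLOCK = 10
MAX_BLOCK_MB = 1.0

# 365 days * 24 hours * 60 minutes, divided by the block interval
blocks_per_year = 365 * 24 * 60 // MINUTES_PER_BLOCK

# Using 1 GB = 1000 MB
max_growth_gb_per_year = blocks_per_year * MAX_BLOCK_MB / 1000


def max_chain_size_gb(years):
    """Upper bound on chain size after `years` of completely full blocks."""
    return years * max_growth_gb_per_year


for years in (10, 100, 1000):
    print(f"{years:>4} years: up to {max_chain_size_gb(years):,.0f} GB")
```

Under these assumptions the chain can grow by a little over 50 GB per year, and a brand new node has to download all of it before it can validate anything.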

So, how does Mimblewimble address this IBD problem? One of the main distinguishing characteristics of Mimblewimble is that spent transactions can be deleted. This means that as soon as Hal Finney spent the 10 Bitcoin that Satoshi sent him, the data representing the transaction where Satoshi sent the coins to Hal could be deleted. PERMANENTLY. That means, had Bitcoin been implemented initially as a Mimblewimble blockchain, a lot of the data that needs to be downloaded by full nodes would be gone. Since it all depends on how often people spend their coins, it's hard to say exactly how much space would be saved, but one analysis showed that if Bitcoin had been run as a Mimblewimble chain from the beginning, the initial block download would only require around 70 GB as opposed to the current 200+ GB. So, as a rule of thumb, we say that Mimblewimble would shrink the initial block download to roughly a third of its current size, a reduction of about 65%. This is a huge improvement.
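
To make the idea of deleting spent data concrete, here is a toy sketch in Python. It is only a bookkeeping model: it contains none of Mimblewimble's actual cryptography, and the `Tx` type, the `unspent_outputs` function, and the output names and amounts are all hypothetical, chosen just for illustration. As I understand it, a real Mimblewimble chain also keeps a small "kernel" for every transaction, which this sketch ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    """Toy transaction: names of the outputs it spends and creates."""
    txid: str
    inputs: tuple   # ids of the outputs this transaction spends
    outputs: tuple  # ids of the outputs this transaction creates


def unspent_outputs(history):
    """Return the output ids that were created but never spent.

    In this toy model these are the only outputs a new node still
    needs; every spent output can be dropped.
    """
    created = [o for tx in history for o in tx.outputs]
    spent = {i for tx in history for i in tx.inputs}
    return [o for o in created if o not in spent]


# Hypothetical history loosely modeled on the Satoshi -> Hal example.
history = [
    Tx("coinbase", (), ("satoshi_50",)),
    Tx("satoshi_to_hal", ("satoshi_50",), ("hal_10", "satoshi_40")),
    Tx("hal_spends", ("hal_10",), ("someone_10",)),
]

remaining = unspent_outputs(history)
print(remaining)  # ['satoshi_40', 'someone_10']
```

Four outputs were created in this little history, but only two are still unspent. The more often coins move, the larger the share of historical data that can be dropped.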

The initial block download problem is serious enough that it has led core developer Luke Dashjr to propose a soft fork to reduce the block size from 1 MB to 0.3 MB. Obviously, such a change would have drastic consequences, like vastly higher fees and far fewer Bitcoin transactions taking place, but it illustrates how big a problem some see it as. It's an interesting coincidence that this reduction is approximately the same reduction that would be achieved by implementing Mimblewimble in the base layer of Bitcoin while keeping the transaction throughput the same. As more people come to understand that Mimblewimble is actually a scaling improvement in addition to a privacy solution, I would hope this type of block size reduction proposal would be revisited and something like Mimblewimble be used instead. That would give us the best of both worlds.
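
The coincidence is easy to check with the figures quoted in this post. The short Python sketch below only restates that arithmetic; the variable names are mine, and the 70 GB and 200 GB values are the rough estimates mentioned above, not fresh measurements.

```python
# Figures quoted in this post (rough estimates, not measurements).
CURRENT_BLOCK_MB = 1.0
PROPOSED_BLOCK_MB = 0.3
CURRENT_IBD_GB = 200
MIMBLEWIMBLE_IBD_GB = 70

# Fraction of the current data rate that remains after each change
block_size_ratio = PROPOSED_BLOCK_MB / CURRENT_BLOCK_MB
mw_ratio = MIMBLEWIMBLE_IBD_GB / CURRENT_IBD_GB

print(f"Smaller blocks: {block_size_ratio:.0%} of today's data per block")
print(f"Mimblewimble:   {mw_ratio:.0%} of today's download size")
```

The two ratios come out at 30% and 35%. The difference is that the block size cut also removes about 70% of the transaction capacity, while the Mimblewimble estimate assumes the same transactions still take place.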

As we always mention, all this is reliant on Mimblewimble being properly tested and developed to the point where it can be used. That's a large part of what we are doing with this airdrop. Help us by registering today.