To be adopted by the masses, blockchains have to stay functional when many people use them at once. What are the different solutions for making blockchains usable at that scale?
How to scale blockchains is the philosopher’s stone of the industry. Anyone who knew how to scale a blockchain while keeping it decentralized and secure would be rich. Combining all three runs into the infamous scalability trilemma, which postulates that you cannot have security, decentralization, and scalability at the same time. It is therefore the key challenge all modern blockchains face. This lesson will cover:
- Why scalability is an objective in the first place
- What different types of scalability exist
- Different solutions to scaling blockchains
- The upsides and downsides of these solutions
Why scaling blockchains is the goal
The first blockchains were not designed for widespread adoption. If Satoshi Nakamoto had wanted Bitcoin to replace Visa, he would probably have designed it to support more than roughly seven transactions per second. The problem is that block space is scarce: a Bitcoin block is limited to about one megabyte, and a new block is mined only about every ten minutes. Miners have an incentive to include the transactions that pay the highest fees first, and everything else waits for a later block. So if you want to pay a low transaction fee, you will have to wait. The ten-minute block interval is an extra roadblock for scalability.
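The fee-first behavior described above can be sketched in a few lines. This is a minimal illustration, not how any real miner software works: the `Tx` class, the `build_block` helper, and the sample fees are all hypothetical, and the selection is a simple greedy pass by fee per byte.

```python
from dataclasses import dataclass


@dataclass
class Tx:
    txid: str
    size_bytes: int
    fee_sats: int

    @property
    def fee_rate(self) -> float:
        # Fee paid per byte of block space the transaction occupies.
        return self.fee_sats / self.size_bytes


def build_block(mempool, max_block_bytes=1_000_000):
    """Greedy sketch: take the highest fee rate first until the block is full."""
    chosen, used = [], 0
    for tx in sorted(mempool, key=lambda t: t.fee_rate, reverse=True):
        if used + tx.size_bytes <= max_block_bytes:
            chosen.append(tx)
            used += tx.size_bytes
    return chosen


mempool = [
    Tx("a", 250, 500),   # 2 sat/byte
    Tx("b", 250, 5000),  # 20 sat/byte
    Tx("c", 250, 250),   # 1 sat/byte
]

# A toy block with room for only two transactions.
block = build_block(mempool, max_block_bytes=500)
print([tx.txid for tx in block])  # ['b', 'a']
```

The cheapest transaction, `c`, is left waiting for a later block, which is exactly the trade-off between low fees and waiting time.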
Ethereum faces a similar problem. Blocks are capped by their gas limit, which bounds how much computation fits into one block. The more people want to use Ethereum, the higher the gas price climbs, because a bigger pool of people is willing to pay a higher fee to get their transactions processed. The block does not get any bigger, though, so users who bid less are priced out or have to wait. Essentially, Ethereum shoots itself in the foot by being popular.
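To see why popularity drives up the gas price, here is a small sketch under simplifying assumptions: every transaction is a plain Ether transfer costing 21,000 gas, users simply bid a gas price, and the block takes the highest bids until the gas limit is reached. The `cutoff_gas_price` helper, the toy gas limit, and the bids are made up for illustration.

```python
def cutoff_gas_price(bids_gwei, block_gas_limit, gas_per_tx=21_000):
    """Lowest bid that still makes it into the block (hypothetical helper)."""
    capacity = block_gas_limit // gas_per_tx      # how many transfers fit
    ranked = sorted(bids_gwei, reverse=True)      # highest bidders go first
    included = ranked[:capacity]
    return min(included) if included else None


GAS_LIMIT = 105_000  # toy limit: room for exactly five simple transfers

quiet = [10, 12, 9]                      # fewer bids than slots
busy = [10, 12, 9, 50, 40, 30, 25, 60]   # more bids than slots

print(cutoff_gas_price(quiet, GAS_LIMIT))  # 9
print(cutoff_gas_price(busy, GAS_LIMIT))   # 25
```

With the same block capacity, the price needed to get in rises from 9 to 25 once demand exceeds the five available slots.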
Hence, scaling means processing more transactions per second without compromising on the other two crucial factors, security and decentralization. Blockchains are designed in this particular way because putting transactions in blocks adds a layer of interaction, which, in theory, allows for better scalability. If you had a chain of individual transactions instead, each transaction would only be able to interact with the one preceding and the one following it.
Think of it like this: if you have two freelancers interacting with each other, there is a limit to the impact they can have. But if you have two companies interacting with each other, the potential results are much more significant. This comparison is massively simplifying the underlying problem, but it works as a mental model.
The different forms of scalability
Let’s first look at what impacts scalability. The arrow at the end of each factor indicates the direction needed to improve scalability.
- Throughput: in plain English, how many transactions can be processed per second. High scalability essentially equals high throughput. ↑
- Block size: as we saw, the bigger a block, the more transactions fit and the higher the throughput will be. ↑
- Latency: a fancy way of saying confirmation time. For instance, with Bitcoin, your transaction is usually included in a block after about 10 minutes, but waiting for additional confirmations takes longer. ↓
- Block time: for Bitcoin, it takes about ten minutes to mine a block by solving a proof-of-work hash puzzle. Not all blockchains take the same approach as Bitcoin; proof-of-stake blockchains can, and usually do, produce blocks faster because there is no mining. ↓
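How these factors combine can be shown with a back-of-the-envelope calculation. The sketch below assumes an average transaction size of 250 bytes, which is an illustrative figure rather than a measured one, and the `throughput_tps` helper is hypothetical.

```python
def throughput_tps(block_size_bytes, avg_tx_bytes, block_time_s):
    """Transactions per second = transactions per block / seconds per block."""
    txs_per_block = block_size_bytes // avg_tx_bytes
    return txs_per_block / block_time_s


# Bitcoin-like baseline: 1 MB blocks every 600 seconds.
baseline = throughput_tps(1_000_000, 250, 600)
# Doubling the block size or halving the block time has the same effect.
bigger_blocks = throughput_tps(2_000_000, 250, 600)
faster_blocks = throughput_tps(1_000_000, 250, 300)

print(round(baseline, 2))       # 6.67
print(round(bigger_blocks, 2))  # 13.33
print(round(faster_blocks, 2))  # 13.33
```

The baseline lands close to the roughly seven transactions per second mentioned earlier, and it shows why block size (up) and block time (down) are the two obvious levers.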
With that in mind, you can divide all solutions into two basic categories.
Vertical scalability refers to adding more hardware or software capacity to speed up transactions. If you imagine the blockchain as a highway, this would enable the cars (applications) to run faster on the existing infrastructure. Scaling vertically would mean having more powerful nodes. In practice, this could mean faster internet or better database technology. Assume you have a blockchain with 1 GB blocks (good for throughput); downloading them would take much longer, and you would need more storage space as well. But if internet speeds were 100 times faster than today and storing massive amounts of data were easy and cheap, this wouldn’t be a problem. So it’s reasonable to expect that blockchains will scale vertically to some degree as technology progresses.
Horizontal scalability refers to increasing transactional capacity by adding more nodes. Instead of making cars run faster, you just add more lanes so more cars can fit. This is the type of scaling solution blockchains are currently trying to figure out.
Different solutions to scaling blockchains
So, if we need to add more lanes, how can we go about this?
There are two basic categories of solutions here: layer 1 scaling and layer 2 scaling. Layer 1 scaling refers to scaling the existing blockchain itself. Ethereum or Bitcoin would add more lanes to their own blockchain highway. Layer 2 solutions add more lanes through other blockchains that are connected to Ethereum or Bitcoin. Roughly speaking, another construction company would offer to build a connection and lanes adjacent to the existing highway, so cars can conveniently switch lanes. This de-clogs the existing highway and incentivizes cars to use the new lanes.
Both approaches have been tried and there are several proposed solutions with their own advantages and disadvantages. Let’s look at layer 1 scaling first.
Layer 1 scaling
Bigger block size
If Bitcoin had 1 GB blocks instead of 1 MB blocks, roughly 1,000 times more transactions would fit into each block. That sounds logical, but it wouldn’t make Bitcoin 1,000 times faster, because the blocks would take longer to download and storing them would require specialized infrastructure. That infrastructure isn’t available to everyone with today’s technology, so Bitcoin would be far less decentralized, and here the scalability trilemma catches up with this solution. Smaller capacity increases have been tried, although they did not go far enough to make a real difference. For instance, SegWit, a soft fork that modestly raised Bitcoin’s effective block capacity, brought only minor improvements to the original Bitcoin blockchain, while the surrounding block size dispute led to hard forks like Bitcoin Cash.
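The cost of bigger blocks for an ordinary node can be estimated with simple arithmetic. The sketch below assumes a 50 Mbit/s home connection and one block every ten minutes; both figures and the helper names are illustrative assumptions, and real block propagation involves more than a single download.

```python
BLOCKS_PER_DAY = 24 * 60 // 10  # one block every ten minutes -> 144 per day


def download_seconds(block_bytes, bandwidth_mbit_s):
    """Time to download one block at the given bandwidth."""
    return block_bytes * 8 / (bandwidth_mbit_s * 1_000_000)


def storage_gb_per_day(block_bytes):
    """Disk space consumed per day if every block is full."""
    return block_bytes * BLOCKS_PER_DAY / 1_000_000_000


for label, size in [("1 MB", 1_000_000), ("1 GB", 1_000_000_000)]:
    print(
        label,
        "blocks:",
        download_seconds(size, 50), "s to download,",
        storage_gb_per_day(size), "GB of storage per day",
    )
```

Under these assumptions a 1 MB block downloads in a fraction of a second, while a 1 GB block takes close to three minutes of a ten-minute block interval and adds well over 100 GB of storage per day. That is the kind of requirement that pushes ordinary users out and concentrates the network in fewer hands.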
Consensus protocol improvements
This is an elaborate way of proposing a consensus mechanism other than proof-of-work. Proof-of-work, in other words using a lot of computational power to mine coins, is highly secure and decentralized but has proven itself to be excessively hard to scale and has major energy consumption implications.
Another approach would be to use proof-of-stake. Instead of mining, each node stakes coins, in other words, deposits them with the blockchain like in a savings account, to secure it and validate transactions. A lottery decides who gets to validate, and the winner receives a share of the fees of the transactions they validate. The more coins you have staked, the higher your chances of being chosen as a validator.
This approach is indeed faster, but in practice often at the expense of being more centralized. EOS is a blockchain that can process a lot of transactions per second, but it has only 21 active validators (called block producers), which essentially secure the entire network. Other blockchains have more validators but don’t support certain functions like decentralized applications, as in the case of Stellar. Effectively, these blockchains can only be used for transactions.
While there is a general consensus that blockchains will need to use proof-of-stake if they want to scale, no one has figured out exactly how to do it yet.
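The stake-weighted lottery can be sketched as a weighted random draw. This is a toy model under stated assumptions: the validator names and stakes are invented, and Python's seeded `random` module stands in for the verifiable randomness a real blockchain would need.

```python
import random
from collections import Counter


def pick_validator(stakes, rng):
    """Draw one validator with probability proportional to its stake."""
    names = list(stakes)
    weights = [stakes[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


stakes = {"alice": 60, "bob": 30, "carol": 10}  # hypothetical staked coins
rng = random.Random(42)                         # seeded for repeatability

rounds = 10_000
wins = Counter(pick_validator(stakes, rng) for _ in range(rounds))
shares = {name: wins[name] / rounds for name in stakes}
print(shares)
```

Over many rounds each validator wins roughly in proportion to its stake, here about 60%, 30%, and 10%.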
Sharding
Sharding is a way of splitting up a database. Instead of having all data in one database, you split it up into smaller partitions to better organize the data and spread the workload.
This is the most promising layer 1 solution, and also the solution Ethereum has chosen, in combination with changing to proof-of-stake. Instead of just having a massive highway with no lanes, Ethereum essentially plans to expand its highway and meticulously divide it into lanes. That would regulate the flow of traffic and ideally increase its speed on the blockchain highway.
In practice, sharding means breaking up data into smaller data sets that can be simultaneously processed. Not all of the nodes would have all the information, which would reduce the load on individual nodes and improve throughput. Recalling the example of blockchains adding one layer of interaction, sharding would add another one to improve scalability even further.
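As a rough illustration of breaking data into smaller sets, the sketch below assigns each transaction to a shard by hashing the sender's account. This hash-based rule, the helper names, and the four-shard setup are assumptions made for the example, not Ethereum's actual design.

```python
import hashlib
from collections import defaultdict


def shard_for(account, num_shards):
    """Map an account to a shard deterministically via its hash."""
    digest = hashlib.sha256(account.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(transactions, num_shards):
    """Group transactions by the shard their sender belongs to."""
    shards = defaultdict(list)
    for tx in transactions:
        shards[shard_for(tx["sender"], num_shards)].append(tx)
    return dict(shards)


txs = [{"sender": f"account-{i}", "amount": i} for i in range(1000)]
by_shard = partition(txs, num_shards=4)

# Each shard only has to process its own slice of the workload.
for shard_id in sorted(by_shard):
    print(shard_id, len(by_shard[shard_id]))
```

Because the mapping is deterministic, every node can work out which shard is responsible for an account without storing all of the data itself.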
Under this plan, Ethereum would have 64 shards, and today’s Ethereum blockchain would become one of them. Nodes would make up these shards and validate the transactions they receive. The proposed solution is that a coordinating chain, called the Beacon Chain, would just handle the receipts of the transactions that happen in the shards. It would pretty much work as a ledger of ledgers by keeping a tally of what is happening on the shards.
While this sounds promising, it comes with a lot of potential problems that still need solving. Firstly, the way Ethereum is designed now, more nodes mean more time needed to reach a consensus. When Ethereum changes in the proposed way, shards would need a way of knowing what is going on in other shards. In other words, since not all of the nodes in all the shards have all information, they need to communicate with each other. Otherwise, the consensus mechanism would break down. Too much communication, though, slows the system down again, which is why one proposed solution would be shards only seeing receipts of what happened in other shards. This would work a bit like asynchronous communication in the sense that shards see the end result but don’t need to keep up with the process.
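The receipt idea can be sketched as follows: the sending shard debits the account and emits a receipt, and the receiving shard credits the amount once when it sees that receipt. The `Receipt` and `Shard` classes are hypothetical and leave out everything that makes this hard in practice, such as proving that a receipt is genuine.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Receipt:
    receipt_id: int
    to_shard: int
    to_account: str
    amount: int


@dataclass
class Shard:
    shard_id: int
    balances: dict = field(default_factory=dict)
    claimed: set = field(default_factory=set)

    def send_cross_shard(self, sender, to_shard, to_account, amount, receipt_id):
        """Debit locally and hand back a receipt for the other shard."""
        if self.balances.get(sender, 0) < amount:
            raise ValueError("insufficient funds")
        self.balances[sender] -= amount
        return Receipt(receipt_id, to_shard, to_account, amount)

    def claim(self, receipt):
        """Credit the recipient once; the shard never sees the sender's state."""
        if receipt.to_shard != self.shard_id:
            raise ValueError("receipt is for another shard")
        if receipt.receipt_id in self.claimed:
            raise ValueError("receipt already claimed")
        self.claimed.add(receipt.receipt_id)
        current = self.balances.get(receipt.to_account, 0)
        self.balances[receipt.to_account] = current + receipt.amount


shard_a = Shard(0, {"alice": 100})
shard_b = Shard(1)
receipt = shard_a.send_cross_shard("alice", 1, "bob", 40, receipt_id=1)
shard_b.claim(receipt)
print(shard_a.balances, shard_b.balances)  # {'alice': 60} {'bob': 40}
```

The receiving shard acts only on the end result, which mirrors the asynchronous style of communication described above.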
Another potential pitfall is security. For instance, what happens if an attacker takes over one shard, essentially 1/64th of the entire Ethereum blockchain? And how many nodes would they need to take over in the shard to be successful? An attacker could try a Sybil attack, which means creating a lot of fake identities (nodes) to gain disproportionate influence over the network, or in this case over a single shard.
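One way to reason about the question of how many nodes a shard needs is a simple probability model. The sketch below assumes that nodes are assigned to a shard at random, that the attacker controls a fixed fraction of all nodes, and that controlling a strict majority of a shard's nodes is enough to take it over. All three are modelling assumptions, and the `takeover_probability` helper is hypothetical.

```python
from math import comb


def takeover_probability(committee_size, attacker_fraction):
    """Chance that a random committee has a strict attacker majority (binomial model)."""
    needed = committee_size // 2 + 1
    p = attacker_fraction
    return sum(
        comb(committee_size, k) * p**k * (1 - p) ** (committee_size - k)
        for k in range(needed, committee_size + 1)
    )


# An attacker holding 30% of all nodes, for different shard sizes.
for size in (10, 50, 200):
    print(size, takeover_probability(size, 0.3))
```

Under this model the attacker's chances shrink quickly as the shard grows, which is why the number of nodes per shard is a security parameter and not just a performance one.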
Also, one needs to keep in mind that not all nodes have the same computing power. Some nodes are faster than others, which could slow the system down in this type of architecture.
In conclusion, sharding is a highly promising but untested scaling solution. Its designers still need to figure out:
- How to split the network into shards
- How to select the best consensus mechanism
- How to verify transactions between shards
- How many nodes one shard needs
Therefore, even just constructing the architecture and starting to use it is only the beginning of the scaling solution.
Layer 2 scaling
Nested blockchains and sidechains
Layer 2 scaling refers to adding another highway, constructed by an external company, next to the existing highway. By adding another blockchain on top of the existing blockchain, smaller transactions can be handled by the new blockchain and the results can be transmitted to the underlying settlement layer. These scaling solutions are being employed and developed at the moment.
Nested blockchains and sidechains are two very similar concepts. Both work pretty much like parent and child, which is why the terms child chain and parent chain are often used. In both solutions, the child chain does the work, i.e., processes the transactions, and the parent chain records the results. For instance, if you wanted to trade Ether for altcoins and start trading these altcoins, it would cost you a fortune due to Ethereum’s high gas fees. But by transferring your Ether to a layer 2 solution and trading there, you can save yourself most of the gas fees. Once you want to withdraw, you would transfer your coins back to Ethereum.
The biggest difference is that nested blockchains work more closely with the parent chain and rely on it for security. Sidechains are more independent from the parent chain and rely on their own security. The former would be more applicable for private blockchains, while the latter often results in semi-centralized solutions. Popular Ethereum scaling solutions like Polygon are sidechains of Ethereum.
Payment channels
A payment channel is a way of conducting transactions away from the blockchain, with only the end result being reported back to it. For example, by opening a payment channel on Bitcoin, you could transfer money back and forth with your counterparty without paying Bitcoin’s transaction fees each time. Once you have concluded trading with your counterparty, you close the payment channel and relay the final tally to the Bitcoin blockchain. This solution was explored for Bitcoin since it’s limited to payments only. However, the main problem is that payment channels are vulnerable to attacks. The Lightning Network for Bitcoin is an example that has struggled to gain the desired traction, partly because of recurring attacks on it.
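The open, pay, and close cycle can be sketched with a small class. This is a bookkeeping toy under stated assumptions: the `PaymentChannel` class is hypothetical, and it omits the signatures, timeouts, and dispute handling that real channels depend on.

```python
class PaymentChannel:
    """Toy two-party channel: many off-chain updates, one final settlement."""

    def __init__(self, deposit_a, deposit_b):
        self.balances = {"a": deposit_a, "b": deposit_b}
        self.off_chain_updates = 0
        self.on_chain_txs = 1  # opening the channel is recorded on-chain
        self.closed = False

    def pay(self, sender, receiver, amount):
        """Shift balance between the parties without touching the blockchain."""
        if self.closed:
            raise RuntimeError("channel is closed")
        if amount <= 0 or self.balances[sender] < amount:
            raise ValueError("invalid payment")
        self.balances[sender] -= amount
        self.balances[receiver] += amount
        self.off_chain_updates += 1

    def close(self):
        """Relay only the final tally to the blockchain."""
        if self.closed:
            raise RuntimeError("channel is already closed")
        self.closed = True
        self.on_chain_txs += 1
        return dict(self.balances)


channel = PaymentChannel(deposit_a=100, deposit_b=50)
channel.pay("a", "b", 30)
channel.pay("b", "a", 10)
channel.pay("a", "b", 5)
final = channel.close()
print(final, channel.off_chain_updates, channel.on_chain_txs)
```

Three payments changed hands, but the blockchain only saw two transactions, the opening and the closing, with final balances of 75 each.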
Blockchains will scale one way or another, especially taking into account technological advances that are happening unrelated to blockchains. Whether there will be one solution to rule them all remains to be seen. Another unresolved question is interoperability between blockchains, in itself a way of scaling. If users could effortlessly switch between different blockchain highways, that would automatically result in much higher adoption by consumers and producers alike. As of today, though, there is no quick or easy fix for this.