Blockchain: What Is Sharding? [Explained]

If you’ve spent any time in the blockchain space, studying the industry, its promises, and its challenges, you may have come across the term sharding. Far from a novel concept in database management, sharding is a partitioning technique now being tested in blockchain as a possible answer to some of the technology’s biggest hurdles: the ones standing between it and a future in which many of our daily internet services run on decentralized networks and benefit from their unique security promises.

Down below, we will explain exactly what sharding is, how the different types of sharding work, and how it may solve one of blockchain’s toughest riddles.

What is Sharding?

To sum up sharding in a single sentence: think of it as dividing a blockchain into multiple sub-chains, each of which works independently and takes on a share of the network’s workload, improving throughput and reducing latency.

While the nitty-gritty is substantially more complicated (and more interesting) than that, sharding really does boil down to partitioning the decentralized network’s nodes into independent clusters, or shards, that can verify transactions and add to their own ledgers.

However, before we wade any further into why developers are exploring sharding as a means of improving blockchain performance, it’s important to first understand how blockchain works, the problems sharding might solve, and the problems it may pose.

Blockchain’s Biggest Problems

If you are reading about sharding, there’s a good chance you already have a decent grasp of how decentralized networks like blockchain work. But just in case you don’t, or don’t remember, here’s a quick refresher: a blockchain is designed to serve as an immutable, public ledger that is viewable by everyone on the network and that is, practically speaking, almost impossible for malicious actors to alter.

Consensus algorithms like Proof of Work and Proof of Stake rely on the participation of individual computers, or nodes, which contribute the computing power necessary to validate transactions and add them to the blockchain in a series of data blocks (hence the term blockchain).

Cryptographic puzzles built on tools like one-way hashing are used to ensure the veracity of a transaction before it is written in stone on the publicly viewable blockchain.
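
As a rough illustration of the kind of puzzle described above, the Python sketch below searches for a number (a nonce) that makes a SHA-256 hash start with a set number of zeros. The function names, the data format, and the difficulty setting are all assumptions chosen for illustration; real networks use far harder targets and very different block structures.

```python
import hashlib

def block_hash(data: str, nonce: int) -> str:
    """One-way hash of the block data combined with a nonce."""
    return hashlib.sha256(f"{data}:{nonce}".encode()).hexdigest()

def mine(data: str, difficulty: int = 3) -> int:
    """Try nonces until the hash starts with `difficulty` zero hex digits."""
    target = "0" * difficulty
    nonce = 0
    while not block_hash(data, nonce).startswith(target):
        nonce += 1
    return nonce

def verify(data: str, nonce: int, difficulty: int = 3) -> bool:
    """Checking a claimed solution takes one hash; finding it takes many."""
    return block_hash(data, nonce).startswith("0" * difficulty)

nonce = mine("alice pays bob 5")
print(nonce, verify("alice pays bob 5", nonce))
```

The asymmetry is the point: finding the nonce takes thousands of attempts even at this toy difficulty, while any other node can check it with a single hash.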

Every full node in the network maintains a complete copy of this ledger. This makes it easy to spot a malicious actor’s attempt to forge transactions or alter the record. Think about it: if 99 out of 100 people hold the same historical record of events, it becomes pretty hard for someone to pass off a fake record as the real McCoy.
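
The 99-out-of-100 idea can be sketched in a few lines of Python. This is only a toy comparison of ledger copies, not how any real network reaches consensus, and the ledger entries and function names are made up for illustration.

```python
import hashlib
from collections import Counter

def fingerprint(ledger: list[str]) -> str:
    """Hash every entry in order so any change alters the fingerprint."""
    h = hashlib.sha256()
    for entry in ledger:
        h.update(entry.encode())
        h.update(b"\n")
    return h.hexdigest()

def majority_fingerprint(copies):
    """Return the most common fingerprint and how many copies share it."""
    counts = Counter(fingerprint(c) for c in copies)
    return counts.most_common(1)[0]

honest = ["alice->bob 5", "bob->carol 2"]
forged = ["alice->bob 5", "bob->mallory 200"]
copies = [honest] * 99 + [forged]  # 99 honest nodes, 1 forger

agreed, votes = majority_fingerprint(copies)
outliers = [i for i, c in enumerate(copies) if fingerprint(c) != agreed]
print(votes, outliers)  # prints: 99 [99]
```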

It is because the network is spread across so many independent nodes, rather than stored on a single server, that it is known as a decentralized network. This decentralization is one of the primary tenets of cryptocurrency and blockchain in general: a trustless exchange environment that does not rely on the trustworthiness of a third party to execute transactions and handle data safely and ethically.

Many believe in the power of blockchain almost purely on a philosophical level, envisioning an Internet where individuals can make transactions across borders and write self-executing, immutable smart contracts that can safely oversee interactions without depending on another entity to store payment information and sensitive user data. But, for all the virtues and promises of decentralized networks, they are not without challenges yet unsolved.

Improving Scalability

The core issue at the heart of blockchain right now is how to improve the scalability of decentralized networks in order to meet growing demand.

While mainstream financial technology organizations are increasingly adopting blockchain-based technologies for their own operations, the end-user performance of even the most popular blockchains is a far cry from the capabilities of well-entrenched heavyweights like Visa.

Ethereum, for example, can only process 10 to 15 transactions per second, with individual transactions typically taking several minutes to complete. Meanwhile, the aging VisaNet can handle somewhere around 1,700 per second, with most transactions completed in a matter of seconds.

The slow speed inherent to many decentralized networks stems from their very nature: rather than relying on a high-powered, easily upgradable centralized data center to process transactions as quickly as possible, each and every node connected to the network has to process and store the updated ledger.

As the distributed ledger grows in size, so do the local storage demands on each and every member node. This is why decentralized networks have yet to overtake the centralized paradigm of financial technology despite the immense security advantages they provide.

Preventing Centralization

The ever-increasing size of the distributed ledger creates a secondary problem for the blockchain as a whole: an increasing impediment to the addition of new, individual nodes and thus an increased risk of centralization. As the blockchain grows bigger, it becomes harder and more expensive for individual users to set up nodes capable of holding the entire transaction history of the network.

But under blockchain’s current consensus algorithms, nodes have no choice. In both Proof of Work and Proof of Stake, an individual node does the work required to confirm the validity of a transaction and add it to the blockchain (in Proof of Work, by contributing the computational power to solve a cryptographic puzzle). Every other node on the network stores the entire ledger so that it can then check that work and validate the authenticity of the record.

The demands that this increasingly large ledger places on individual nodes constitute a barrier to entry for the network, leaving only larger, better-funded entities well positioned to join it. Having fewer, larger entities in control of a network is exactly the sort of centralization that blockchain was designed to free users from, and it presents the same security dilemmas that accompany leaving large amounts of data processing in the hands of a select few.

How Sharding Works

Now that you have an overview of the problems that scalability issues spawn for any decentralized network, we can take a look at how sharding works in theory and practice and the arguments for and against it. While sharding essentially boils down to horizontal database partitioning to spread workloads, the term, funnily enough, actually comes from the venerated MMO hall-of-famer, Ultima Online.

As the game grew in size, the developers looked for a lore-friendly way of partitioning the game into multiple independent servers (or worlds, as most MMOs would call them now) and settled on “shards,” based on the concept that each server is canonically a world encapsulated within a shard of a broken crystal. Pretty cool stuff, and an unexpected origin story for what is now a commonplace term in database management.

While similar, rather than shattering a single crystal into multiple shards, sharding in the context of blockchain would essentially replace a single large crystal with numerous smaller but whole crystals. Sort of. The analogy holds until later on, when we get into relay chains and specialized shards.

Think of it like running multiple independent blockchains simultaneously: the nodes within each smaller blockchain, or shard, need only store the ledger data for their own shard rather than for the entire network.

This way, instead of using the vast multitude of nodes connected to, say, the entire Ethereum network for one transaction at a time, the network could be split into, say, ten subordinate shards and complete ten at a time, with the entire consensus algorithm being carried out within each shard. This would essentially allow a blockchain to multitask and could theoretically result in a manifold increase in transaction speeds.
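
To make the splitting idea concrete, here is a minimal Python sketch that routes transactions to one of ten shards by hashing the sender’s account name. The routing rule, the account names, and the function names are assumptions for illustration only; no real blockchain is claimed to assign work this way, and the sketch ignores transactions that touch two shards.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 10  # assumed shard count, matching the example in the text

def shard_for(account: str, num_shards: int = NUM_SHARDS) -> int:
    """Map an account to a shard by hashing its name (illustrative rule)."""
    digest = hashlib.sha256(account.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

def partition(transactions, num_shards: int = NUM_SHARDS):
    """Group (sender, receiver, amount) tuples by the sender's shard."""
    shards = defaultdict(list)
    for tx in transactions:
        shards[shard_for(tx[0], num_shards)].append(tx)
    return shards

txs = [(f"user{i}", f"user{i + 1}", 1) for i in range(1000)]
shards = partition(txs)

# Each shard's nodes would validate and store only their own slice.
for shard_id in sorted(shards):
    print(shard_id, len(shards[shard_id]))
```

Because each group can be processed independently, ten shards can in principle work on ten batches at once, which is where the theoretical speed-up comes from.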

This would solve the local storage problem for individual nodes, since no member would need to keep a record of the entire network’s history on their machine. By lowering this barrier to entry, sharding could also help stave off the unwanted centralization that accompanies rising storage and equipment costs.

Sharding: Problems & Solutions

Below, we examine what makes sharding an attractive option for developers looking to tackle blockchain’s scalability issues, and take a look at a few of the unique challenges that sharding itself poses in terms of both security and feasibility.

Shard Vulnerability

While sharding is a theoretical answer to the problems of scalability and centralization, it comes with a significant trade-off in security. Blockchains like Bitcoin’s that rely on a Proof of Work consensus algorithm to maintain their ledger are vulnerable to a hypothetical cyber-attack called a 51% attack.

Because the Proof of Work protocol rewards the miners who win the “race” to solve the cryptographic puzzle that verifies a transaction, those with more computing power have proportionately greater chances of being the one who verifies the transaction. More power equals more influence on the network.

A 51% attack becomes possible when any single entity obtains more than 50% of the total computing power in a network (despite the name, even 50.01% would be sufficient, as long as it is more than half), giving it the power to dictate every transaction in the network and prevent others from validating the authenticity of the blockchain.

While in control, malicious actors could double-spend coins and enrich themselves with complete control of the mining process. In practice, however, this is considered extremely unlikely simply due to how much power 51% of a major blockchain’s total computing power really is.

In the context of crypto mining, computing power is generally measured as a hash rate, in hashes per second. A standard PC is generally capable of only a few thousand hashes per second (kH/s), meaning it can generate a few thousand 64-digit hexadecimal hashes per second.

The entire Bitcoin network, on the other hand, is currently measured at around 156 EH/s, meaning 156 quintillion hashes per second. High-end mining machines like the Bitmain Antminer S9 that go for thousands of dollars are capable of putting out trillions of hashes per second, still many orders of magnitude below the Bitcoin network’s 50% threshold.
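
A quick back-of-the-envelope calculation shows the scale of that gap. The network figure is the one quoted above; the per-machine hash rate of ten trillion hashes per second is an assumed round number for illustration, not a measured specification of any particular device.

```python
NETWORK_HASHRATE = 156e18  # 156 EH/s, the figure quoted in the text
MACHINE_HASHRATE = 10e12   # ASSUMED: 10 TH/s for one mining machine

# An attacker needs more than half of the network's total hash rate.
threshold = NETWORK_HASHRATE / 2
machines_needed = threshold / MACHINE_HASHRATE

print(f"Threshold: {threshold:.2e} hashes per second")
print(f"Machines needed: about {machines_needed / 1e6:.1f} million")
```

Under these assumptions an attacker would need millions of dedicated machines, before even counting the electricity to run them.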

However, because sharding divides a network into multiple independent shards, the total power required to take over a single shard is divided accordingly. Let’s say Ethereum’s total computing power is 100, and the network is divided into 20 different, independently operating shards.

The transaction speed could be multiplied accordingly, but the total computing power of each shard is now 5. This means that to take over a single shard, all that is needed is computing power above 2.5. While the takeover of a single shard may not jeopardize the entire network, corruption consigns that one shard to permanent loss.
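
The arithmetic in this example can be written out as a small Python function. It assumes, as the example does, that computing power is spread evenly across shards; the function name is made up for illustration.

```python
def takeover_threshold(total_power: float, num_shards: int) -> float:
    """Power an attacker must exceed to hold a majority of one shard,
    assuming power is split evenly across all shards."""
    power_per_shard = total_power / num_shards
    return power_per_shard / 2

print(takeover_threshold(100, 1))   # unsharded network: 50.0
print(takeover_threshold(100, 20))  # 20 shards: 2.5
```

The same split that multiplies throughput by 20 also divides the cost of attacking any one shard by 20.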

Even if it doesn’t destroy the entire network outright, a shard takeover lets attackers threaten to dismantle the network piece by piece and erodes trust in its security, which is blockchain’s primary selling point right now.

The Beacon Chain – a Double-Edged Sword

To combat this critical vulnerability, blockchains like Ethereum are exploring how randomness can be used as a shield against attackers. In the example above, compromising an individual shard requires just over 2.5% of the network’s total computing power.

However small this threshold may be, it depends on all of that computing power being assigned to a single shard. If a malicious node cannot select the shard in which it will serve as a validator, it becomes exponentially more difficult to compromise a shard.
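
A simple probability model shows why random assignment helps. The Python sketch below assumes each seat in a shard is filled independently at random, so the number of malicious seats follows a binomial distribution. That is a simplification, and the 30% attacker share and the committee sizes are made-up inputs, not figures from any real network.

```python
from math import comb

def takeover_probability(attacker_share: float, committee_size: int) -> float:
    """Chance that more than half of a randomly drawn committee is malicious.

    Simplifying assumption: each seat is filled independently, so the
    number of malicious seats is binomially distributed.
    """
    majority = committee_size // 2 + 1
    return sum(
        comb(committee_size, k)
        * attacker_share**k
        * (1 - attacker_share) ** (committee_size - k)
        for k in range(majority, committee_size + 1)
    )

# Hypothetical attacker holding 30% of all validators.
for size in (10, 50, 200):
    print(size, takeover_probability(0.3, size))
```

Under this model, an attacker who controls well under half of the validators sees the odds of landing a majority in any one shard shrink rapidly as the number of validators per shard grows.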

To oversee the task of randomizing validator selection, a second blockchain is created that does not participate directly in the computation inside any particular shard.

Instead, its sole focus is performing the separate computational operations required for the upkeep of the entire network: generating random numbers for the selection process, recording shard states (snapshots of a shard’s ledger without a complete transactional history of each block), and providing other network-wide services. This central, overarching chain is known as the Beacon Chain in Ethereum and the Relay Chain in Polkadot.
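
The random-selection duty can be sketched as a seeded shuffle. This is not the algorithm used by Ethereum or Polkadot: the function name, the validator labels, and the seed are hypothetical, and Python’s `random` module is not cryptographically secure. The seed simply stands in for whatever randomness the coordinating chain publishes.

```python
import random

def assign_validators(validators, num_shards, seed):
    """Shuffle validators with a shared seed, then deal them out to shards.

    Every node that knows the seed can recompute the same assignment,
    but no validator gets to choose its own shard.
    """
    rng = random.Random(seed)
    shuffled = list(validators)
    rng.shuffle(shuffled)
    # Deal like cards: shard s takes every num_shards-th validator.
    return {s: shuffled[s::num_shards] for s in range(num_shards)}

validators = [f"validator{i}" for i in range(100)]
assignment = assign_validators(validators, num_shards=20, seed=12345)

for shard_id, members in assignment.items():
    print(shard_id, members)
```

Changing the seed for each round would reshuffle everyone, so an attacker could not settle into one shard over time.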

However, as seems to be true with most solutions in blockchain, this answer is a double-edged sword. While, theoretically, sharding can completely address any and all scalability issues inherent to a non-sharded blockchain, its dependence on a separate beacon chain to oversee its functioning and help maintain its security poses its own constraint on scaling because the beacon chain is not sharded.

Because the beacon chain is responsible for a number of computational services required to oversee all of the shards, it too can bottleneck throughput as the number of shards outgrows the computational power provided by the nodes that contribute to the beacon chain. It’s a tradeoff that developers are still working to resolve.

Shard Interoperability

Another major impediment to fully isolated shards is their inability to communicate with one another. Many proponents of sharding argue for a specialized-shard approach in which whole shards are dedicated to specific tasks, rather than simply cutting up the blockchain into miniatures that each handle the whole gamut of data processes the original, non-sharded chain handled.

This, however, requires shards to be able to talk to each other, something that the oft-cited theoretical model doesn’t explicitly describe. Validators need to be able to exchange accurate information without running into the same scalability issues they would face if each validator had to authenticate all of the data on an external shard it needs to interact with.

This is a complex issue with only a few proposed solutions, such as having all shards create prospective new blocks simultaneously or dividing the process into a sequential validation system.

At the end of the day, sharding is a technologically complex solution to blockchain’s biggest problems, but one that is far from crystallization.

What do you make of sharding? A viable answer to blockchain’s most stalwart gatekeeper to mainstream acceptance or far-flung fool’s gold best left on the wayside in pursuit of better solutions?
