Blockchain technology. You’ve heard about it. You probably know it’s used for cryptocurrencies. But few people actually understand how it works, and how to use it.
So, I figured I’d outline what Blockchain is, and how it compares to traditional, centralized databases.
Check out my follow-up article about how to set up a Blockchain database in your project. Hint: It’s as easy as setting up Postgres or MongoDB.
The Big Picture
Blockchain is just another way of storing data. It is a distributed, decentralized, non-hierarchical database. Or, more specifically, it’s a bunch of database nodes constantly talking to each other, continuously syncing all the incoming data between themselves.
In short: Blockchain is a network of identical databases.
Keeping that in mind, let’s dive into the basics of how the technology actually works. Because, as a quote often attributed to Einstein goes, “You do not really understand something unless you can explain it to your grandmother.”
A hash is the fixed-length output of a cryptographic hash function which has been fed some data (a.k.a. the payload). It is unique for all practical purposes: finding two payloads with the same hash is computationally infeasible. Hashes cannot feasibly be reversed back into the original data. The hash function will always yield the same hash for the same data.
For example, the string “Blockchains are awesome,” using the SHA-256 algorithm, will always yield the hash “F7BC6C6148 … A4CB553D27,” but the string “Blockchains are awesome!” (notice the added exclamation mark) will yield “15B4F08F90 … 9A5BA41103.” So the same input always gives the same hash, while even a one-character change produces a completely different one.
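To make the two properties above concrete, here is a minimal sketch using Python’s standard hashlib module. The helper name sha256_hex is made up for this example, and the exact digests depend on the exact bytes fed in (quotes, punctuation, encoding), so the output is not guaranteed to match the abbreviated values quoted above.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as uppercase hex."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest().upper()


first = sha256_hex("Blockchains are awesome")
second = sha256_hex("Blockchains are awesome!")

# Same input, same hash: the function is deterministic.
assert first == sha256_hex("Blockchains are awesome")

# One extra character yields a different hash.
assert first != second

# SHA-256 always outputs 256 bits, i.e. 64 hex characters.
print(len(first), len(second))
```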
Hashes are one of the underlying building blocks of Blockchain technology. With that in mind, let's proceed to the next core part: transactions.
A traditional database consists of rows (records) of data, e.g. one row per user in a users table.
In the same sense, a Blockchain database consists of transactions. The main difference is that each transaction (tx for short) in the Blockchain is cryptographically tied to the previous tx.
Each tx is signed with its publisher’s private key, and the signed payload includes the previous tx’s hash. This means that each tx is part of the next tx’s hash payload. Combining this with our knowledge of hashes, we can see how the transactions are connected: the smallest change in any earlier tx changes its hash, which breaks the link stored in every subsequent transaction. That’s the basis of Blockchain’s immutability feature.
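The linking described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the transaction layout (data, prev_hash), the JSON serialization and the helper names are invented for the example, and signing is left out entirely so the focus stays on the hash links.

```python
import hashlib
import json
from typing import List, Optional


def tx_hash(tx: dict) -> str:
    """Hash a transaction's JSON form (keys sorted) with SHA-256."""
    encoded = json.dumps(tx, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def make_tx(data: str, prev_tx: Optional[dict]) -> dict:
    """Create a transaction whose payload includes the previous tx's hash."""
    prev_hash = tx_hash(prev_tx) if prev_tx is not None else None
    return {"data": data, "prev_hash": prev_hash}


def links_are_valid(txs: List[dict]) -> bool:
    """Check that every tx still points at the current hash of its predecessor."""
    for prev, current in zip(txs, txs[1:]):
        if current["prev_hash"] != tx_hash(prev):
            return False
    return True


tx1 = make_tx("Alice pays Bob 5", None)
tx2 = make_tx("Bob pays Carol 2", tx1)
tx3 = make_tx("Carol pays Dave 1", tx2)
txs = [tx1, tx2, tx3]
assert links_are_valid(txs)

# Tampering with an old transaction breaks the link stored in the next one.
tx1["data"] = "Alice pays Bob 500"
assert not links_are_valid(txs)
```

To hide the tampering, an attacker would have to recompute tx2 with the new hash, which in turn changes tx2’s own hash and forces a rewrite of tx3, and so on down the line.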
Now that we have a better understanding of transactions, we come to the bigger building elements: blocks. A Blockchain is, well... a chain of blocks. Each block is linked to the previous one by, again, a hash value.
Each block in the chain consists of an ordered set of transactions. These transactions must not conflict with each other or with the transactions in previous blocks. Just like a tx, a block propagates quickly through the network to the other nodes. Once the block reaches its maximum allowed size (or meets whatever other sealing rule the particular blockchain follows), it is sealed. And once a tx appears in a block, it is “confirmed,” which makes the other nodes reject any conflicting transactions.
Each block’s hash is included in the next block’s payload. This way, exactly as with the transactions, a chain of blocks is created in which every block’s hash depends on all the blocks before it. Hence the name “Blockchain.”
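Here is a rough sketch of how blocks could be sealed and chained. The sealing rule (a hypothetical MAX_TXS_PER_BLOCK limit), the block layout and the all-zero placeholder hash for the first block are assumptions made for this example, not the rules of any particular blockchain.

```python
import hashlib
import json
from typing import List

MAX_TXS_PER_BLOCK = 2  # hypothetical sealing rule: a block is full after 2 txs


def block_hash(block: dict) -> str:
    """Hash a block's JSON form (keys sorted) with SHA-256."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def build_chain(transactions: List[str]) -> List[dict]:
    """Group transactions into sealed blocks, each storing the previous block's hash."""
    chain: List[dict] = []
    prev_hash = "0" * 64  # placeholder: the first block has no predecessor
    for start in range(0, len(transactions), MAX_TXS_PER_BLOCK):
        block = {
            "index": len(chain),
            "transactions": transactions[start:start + MAX_TXS_PER_BLOCK],
            "prev_hash": prev_hash,
        }
        chain.append(block)
        prev_hash = block_hash(block)
    return chain


def chain_is_valid(chain: List[dict]) -> bool:
    """Check that every block still points at the current hash of its predecessor."""
    for prev, current in zip(chain, chain[1:]):
        if current["prev_hash"] != block_hash(prev):
            return False
    return True


chain = build_chain(["tx-a", "tx-b", "tx-c", "tx-d", "tx-e"])
assert len(chain) == 3  # two full blocks plus one holding the remaining tx
assert chain_is_valid(chain)

# Rewriting a transaction in an old block invalidates the chain after it.
chain[0]["transactions"][0] = "tx-forged"
assert not chain_is_valid(chain)
```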
How does the network ensure that the nodes agree on the state of the chain? And who creates these blocks? This is a complex topic for another time. But the short answer is: It’s the job of the Consensus Algorithm.
Traditional DB vs. Blockchain
Performance is an area in which Blockchain lags behind centralized databases. This isn’t because Blockchain technologies are poorly optimized. It is simply a result of Blockchain’s nature. Like other databases, Blockchain follows more or less the same steps when processing a transaction, but Blockchain has three additional burdens:
- Signature verification - Every transaction going into the Blockchain has to be digitally signed using a public-key (asymmetric) cryptography scheme. This is needed in order to prove the transaction’s origin, due to the peer-to-peer nature of the Blockchain. Centralized databases do not have to prove the origin of each incoming transaction once a connection to the database has been established.
- Consensus mechanism - Because of Blockchain’s reliance on distribution, it takes effort to ensure that the nodes in the network can reach a consensus on a transaction’s validity. This might involve significant back-and-forth communication between the nodes.
- Redundancy - Centralized databases process a transaction once, while a Blockchain requires the redundancy of a transaction being processed individually by every single node in the network.
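To give a feel for the first burden, here is a small sketch of signing and verifying a transaction. Real blockchains typically rely on elliptic-curve schemes such as ECDSA or Ed25519; this example swaps in a Lamport one-time signature purely because it can be built from Python’s standard library alone. All function names are made up for the illustration.

```python
import hashlib
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def generate_keypair():
    """Private key: 256 pairs of random secrets. Public key: their hashes."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(zero), _h(one)) for zero, one in private]
    return private, public


def _message_bits(message: bytes):
    """Return the 256 bits of the message's SHA-256 digest."""
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def sign(message: bytes, private):
    """Reveal one secret per digest bit. A Lamport key must sign only once."""
    return [private[i][bit] for i, bit in enumerate(_message_bits(message))]


def verify(message: bytes, signature, public) -> bool:
    """Check that each revealed secret hashes to the matching public value."""
    bits = _message_bits(message)
    return all(_h(signature[i]) == public[i][bit] for i, bit in enumerate(bits))


private_key, public_key = generate_keypair()
tx = b"Alice pays Bob 5"
signature = sign(tx, private_key)

# Anyone holding only the public key can check where the tx came from.
assert verify(tx, signature, public_key)

# The signature does not carry over to a modified transaction.
assert not verify(b"Alice pays Bob 500", signature, public_key)
```

In a Blockchain, every node has to run a check like verify on every incoming transaction, which is the cost a centralized database avoids once a connection has been authenticated.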
In terms of robustness, a Blockchain is extremely tolerant of faults. This is one of its default features. Each node in the network does the same job of processing transactions, which means that the failure of a single node is in no way detrimental to the overall database. Furthermore, the consensus algorithm in Blockchain technology ensures that failed nodes can catch up to the latest database state once they are back up and running.
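The catch-up behavior can be pictured with a toy simulation. It is heavily simplified: the Node class, the broadcast helper and the “copy from the peer with the longest chain” rule are all assumptions for this sketch, and a real network would validate every block and rely on its consensus algorithm to decide which chain to trust.

```python
from typing import List


class Node:
    """A toy database node that stores blocks as plain strings."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.blocks: List[str] = []
        self.online = True

    def receive(self, block: str) -> None:
        # An offline node simply misses the broadcast.
        if self.online:
            self.blocks.append(block)

    def catch_up(self, peers: List["Node"]) -> None:
        """Come back online and copy the missing blocks from the longest peer chain."""
        self.online = True
        best = max(peers, key=lambda peer: len(peer.blocks))
        self.blocks.extend(best.blocks[len(self.blocks):])


nodes = [Node("a"), Node("b"), Node("c")]


def broadcast(block: str) -> None:
    for node in nodes:
        node.receive(block)


broadcast("block-1")
nodes[2].online = False  # node "c" fails
broadcast("block-2")
broadcast("block-3")

# The rest of the network kept working while "c" was down.
assert nodes[0].blocks == ["block-1", "block-2", "block-3"]
assert nodes[2].blocks == ["block-1"]

# Once back up, "c" syncs to the latest state from its peers.
nodes[2].catch_up([nodes[0], nodes[1]])
assert nodes[2].blocks == nodes[0].blocks
```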
Sure, many techniques ensure robustness in traditional databases - like replication, master-slave setups, disaster recovery and others. Nevertheless, all these techniques require setup and configuration procedures, which are usually difficult to do correctly. And once they’ve been set up, these mechanisms require constant and close monitoring and maintenance.
Disintermediation means removing the intermediaries in the communication between entities (e.g. removing third parties). Essentially, this characteristic enables direct sharing under agreed trust rules, without requiring central administration. The Blockchain achieves that through the transactions’ own proof of validity and authorization - as opposed to having centralized application logic that enforces these constraints, as is the case with centralized databases.
In summary: Having a Blockchain as your database gets you these sweet properties:
- Automatic replication across the nodes
- No hierarchy between the nodes
- Synchronized written input between the nodes
- Multiple identical copies
That’s all for now. If you’d like to know how to easily set up a Blockchain in your backend project (e.g. with Node.js) - I cover that here.
Thanks for your time!