Blockchain: Under the Hood
Blockchain is increasingly mentioned in various business circles, but people often lack a concrete understanding of what it is, particularly when it comes to its underlying technology.
This article is the first of two that focus on blockchain from a technical perspective. In this piece, we explore what blockchain is for those who have heard of it and would like to know how it actually works and why it’s important. In the next piece, we will showcase how we applied the blockchain methodology in a proof of concept with a global financial services organization.
What is Blockchain?
First introduced in 2008, blockchain’s claim to fame comes from its role as the underpinning technology for Bitcoin, but it has evolved to be used for much more than cryptocurrencies. It can be described in many ways, but at its core it is a decentralized, independently verifiable, and immutable ledger.
When transactions are recorded on the ledger, every member of the network is updated. The decentralized nature of this ledger ensures that these transactions and the information they carry are tamper-proof[1]. Thanks to blockchain’s use of Merkle trees, which summarize a whole set of transactions in a single hash, the network can verify the validity of each transaction cheaply.
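To make the Merkle tree idea concrete, here is a minimal Python sketch. It is an illustration, not any particular blockchain's implementation: the function name merkle_root, the sample transactions, and the use of a single SHA-256 pass (Bitcoin, for instance, hashes twice) are assumptions made for this example.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(transactions: list[bytes]) -> bytes:
    """Hash pairs of nodes level by level until a single root hash remains."""
    if not transactions:
        raise ValueError("at least one transaction is required")
    level = [sha256(tx) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: duplicate the last hash when a level has an odd count.
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


txs = [b"Robb pays Arya 5", b"Arya pays Jon 2", b"Jon pays Brienne 1"]
root = merkle_root(txs)

# Changing any single transaction changes the root hash.
tampered = [b"Robb pays Arya 50", b"Arya pays Jon 2", b"Jon pays Brienne 1"]
print(root != merkle_root(tampered))  # True
```

Because the root depends on every transaction beneath it, comparing one 32-byte value is enough to detect that something in the set was altered.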
Git also uses Merkle trees, and this similarity is helpful because a blockchain can be seen as a peer-to-peer (P2P) hosted Git repository. In order to modify this repository, users must have a copy of the whole repository and pull the latest commits[2]. When you download a blockchain client and run it, you have to check out the entire blockchain history, much like checking out a Git repo.
Running a node therefore means downloading the entire blockchain and running the blockchain client software. Each commit in the repository can be thought of as a block in the blockchain, as it modifies state and is immutable once committed.
The P2P nature of blockchain removes the need for a centralized repository. Instead of requiring something like GitHub, each node publishes its changes directly to the other nodes. As with Git commits, each block contains its own hash and the hash of the previous block; this plays a central role in the transaction verification process.
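The way each block points at its predecessor's hash can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the Block class, its fields, the JSON serialization, and the helper names append_block and is_valid are hypothetical and chosen only for illustration.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    index: int
    prev_hash: str
    transactions: tuple[str, ...]

    def hash(self) -> str:
        # Serialize deterministically so the same block always hashes the same way.
        payload = json.dumps(
            {
                "index": self.index,
                "prev_hash": self.prev_hash,
                "transactions": list(self.transactions),
            },
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode()).hexdigest()


def append_block(chain: list[Block], transactions: tuple[str, ...]) -> None:
    prev_hash = chain[-1].hash() if chain else "0" * 64
    chain.append(Block(len(chain), prev_hash, transactions))


def is_valid(chain: list[Block]) -> bool:
    """Every block must reference the hash of the block before it."""
    return all(chain[i].prev_hash == chain[i - 1].hash() for i in range(1, len(chain)))


chain: list[Block] = []
append_block(chain, ("genesis",))
append_block(chain, ("Robb pays Arya 5",))
append_block(chain, ("Arya pays Jon 2",))
print(is_valid(chain))  # True

# Rewriting block 1 changes its hash, so block 2 no longer points at it.
chain[1] = Block(1, chain[1].prev_hash, ("Robb pays Jon 5",))
print(is_valid(chain))  # False
```

To hide the change, an attacker would have to recompute every later block as well, which is where the verification and mining steps come in.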
You can think of this verification as a Git pre-commit hook. For instance, it will verify that the person sending currency is authorized to sign for that transaction, that they own the coins they are trying to send, and that they have sufficient funds to make the transaction. Blockchain is also built to prevent merge conflicts, since it cannot resolve them.
A merge conflict can be thought of as any time the blockchain has conflicting versions of history. Imagine Robb has only five bitcoin (BTC), yet he sends five BTC to Arya and five BTC to Jon. This is the double-spend problem, which is prevented by the concept of proof-of-work[3], essentially a weighted lottery system that ensures transactions happen sequentially, so that the five BTC go to Arya and not Jon.
The lottery is won by whichever node is first to solve a computationally hard puzzle; nodes can increase their chances of winning by allocating more computing power to the search.
This process is called mining. The more miners there are on a network, the less likely it is that any one miner can monopolize the ability to choose transaction order. Once a miner has found a solution, they receive a reward for their efforts: freshly minted cryptocurrency. They then commit this newly mined block to the blockchain with all its transactions.
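A toy version of mining can be sketched as a brute-force search for a nonce. This is a sketch under assumptions, not Bitcoin's actual scheme: the functions mine and verify, the string-based block data, and the "leading zero hex digits" difficulty rule are simplifications invented for this example.

```python
import hashlib


def block_digest(block_data: str, nonce: int) -> str:
    return hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()


def mine(block_data: str, difficulty: int) -> tuple[int, str]:
    """Try nonces in order until the hash starts with `difficulty` zero hex digits."""
    target_prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = block_digest(block_data, nonce)
        if digest.startswith(target_prefix):
            return nonce, digest
        nonce += 1


def verify(block_data: str, nonce: int, difficulty: int) -> bool:
    """Checking a claimed solution takes a single hash."""
    return block_digest(block_data, nonce).startswith("0" * difficulty)


data = "Robb pays Arya 5"
nonce, digest = mine(data, difficulty=4)
print(verify(data, nonce, difficulty=4))  # True
```

The asymmetry is the point: finding the nonce takes many attempts, and more computing power means more attempts per second, while any other node can confirm the result almost instantly.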
Whoever wins the lottery chooses the order of transactions. By default, transactions will execute in the order they were submitted, but without an arbiter, there is no guarantee the transactions will execute in the same order on everyone's instance because events are asynchronous. Therefore, someone has to decide what truth all nodes should follow.
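Why ordering matters for the double-spend example can be shown with a small Python sketch. It assumes a deliberately simple account-balance model (Bitcoin itself tracks unspent outputs rather than balances), and the function name apply_in_order is hypothetical.

```python
Transfer = tuple[str, str, int]  # (sender, recipient, amount)


def apply_in_order(balances: dict[str, int], transfers: list[Transfer]) -> list[Transfer]:
    """Apply transfers one after another; return those rejected for lack of funds."""
    rejected = []
    for sender, recipient, amount in transfers:
        if balances.get(sender, 0) < amount:
            rejected.append((sender, recipient, amount))
            continue
        balances[sender] -= amount
        balances[recipient] = balances.get(recipient, 0) + amount
    return rejected


to_arya = ("Robb", "Arya", 5)
to_jon = ("Robb", "Jon", 5)

# The order chosen by the winning miner decides which transfer succeeds.
order_a = {"Robb": 5}
rejected_a = apply_in_order(order_a, [to_arya, to_jon])
print(order_a, rejected_a)  # {'Robb': 0, 'Arya': 5} [('Robb', 'Jon', 5)]

order_b = {"Robb": 5}
rejected_b = apply_in_order(order_b, [to_jon, to_arya])
print(order_b, rejected_b)  # {'Robb': 0, 'Jon': 5} [('Robb', 'Arya', 5)]
```

Either order yields a consistent ledger; what the network cannot tolerate is different nodes applying different orders, which is why a single winner picks one.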
As long as the amount of the reward exceeds the cost of maintaining the infrastructure for the winning node, the network itself is an economically viable utility service independently operated by those who transact on top of it.
Most blocks in the blockchain will contain transactions, hashes of the current and previous blocks, unstructured data, and signatures from the miners and transaction senders. These signatures are tied to public keys rather than to a person’s identity; therefore, all transactions on blockchain are pseudonymous[4].
Unlike Bitcoin, some blockchains have block sizes large enough to contain decentralized applications (dApps)[5]. These dApps use smart contracts which contain self-executing code and are stored on the blockchain. These smart contracts have many uses, ranging from crowd-funding an organization to renting your bike to strangers. All these dApps are reachable via their unique addresses, similar to a domain name one would visit on the internet.
Ethereum is the most notable blockchain implementation to support dApps, and it has gained a lot of traction for this capability. Much of the blockchain space can be broken down into Bitcoin and altcoins. Many of these altcoins are actually forks of the Bitcoin source code or used Bitcoin to bootstrap their cryptocurrency. For example, Ethereum launched its blockchain by conducting a pre-sale that amassed more than 25,000 BTC.
Why Blockchain Matters
Now that you understand the basic tenets of blockchain, you’re probably wondering why you should even care. How is this actually useful?
If you think of blockchain as a distributed ledger, you can start to envision how it can be a distributed database or even a decentralized hosting provider. Everledger, a fraud detection system built on blockchain, helps cryptographically insure diamonds by storing unique images of each diamond on the blockchain.
If Brienne buys a diamond, insures it, and then loses it, the insurance company can take ownership of the diamond and hope that it is eventually recovered. If it is found and someone tries to resell it, the purchasing jeweler can easily verify ownership of the diamond, at which point the insurance company will repossess the diamond.
This whole model can exist in a non-blockchain world, but it is much easier to build and maintain on a blockchain. Trust is decentralized and information on the diamond is ubiquitously available yet immutable. No one will question the validity of the diamond’s image because its immutability is backed by the network.
Everledger is one application that stores digital representations of physical goods on blockchain, but as more goods, data, and content are stored on blockchain, it will shift the way we host and serve applications. Once data and applications are decentralized, it makes less sense to rely on hosting providers or data centers to serve our content.
Currently, most companies set up their own servers or use public/private clouds. While these solutions are often secure and resilient to outages, they still have concentrated points of failure.
Blockchain shifts trust from individual administrators or hosting providers and puts it in the hands of people running blockchain nodes. Since the data is immutable and pseudonymous, you can trust the data you receive without having to trust the people you receive it from. The integrity of the network is maintained by a large body of people all acting in their own self-interest. If a person’s best interest is to compromise the network, they would have to convince 51% of the network to collude, which is computationally extremely difficult and very unlikely in a sufficiently large network. In this world of decentralized data and applications, everyone is now a server.
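The role of the majority threshold can be illustrated with the attacker catch-up probability from the original Bitcoin paper: an attacker controlling a share q of the mining power, starting z blocks behind the honest chain, eventually catches up with probability (q/p)^z when q < p (where p = 1 - q), and with certainty otherwise. The Python function below is a minimal sketch of that formula; its name and parameters are chosen for this example.

```python
def catch_up_probability(attacker_share: float, blocks_behind: int) -> float:
    """Probability that an attacker ever catches up from `blocks_behind` blocks back."""
    if not 0 <= attacker_share <= 1:
        raise ValueError("attacker_share must be between 0 and 1")
    if blocks_behind < 0:
        raise ValueError("blocks_behind must be non-negative")
    q = attacker_share
    p = 1 - q  # share of computing power held by honest nodes
    if q >= p:
        return 1.0  # with half or more of the power, catching up is certain
    return (q / p) ** blocks_behind


for share in (0.10, 0.30, 0.51):
    print(share, catch_up_probability(share, blocks_behind=6))
```

Below the halfway mark, the chance of rewriting history shrinks exponentially with every additional block; at or above it, the guarantee disappears.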
Decentralized servers built on blockchain are not that far away, but they are not currently a feasible reality. One barrier is that it is not easy for someone to interact with a blockchain unless they are very technical. Even a simple game of Rock-Paper-Scissors on Ethereum requires some familiarity with the terminal. The Ethereum Mist browser, however, aims to change that by allowing people to create smart contracts, interact with contracts, and “surf” the blockchain, all via a browser.
A world with decentralized servers threatens the business models of existing companies. Many companies currently profit from owning the data of individuals; however, if everyone can see what people are sharing without intermediaries, how will these companies profit? In this futuristic world, many companies will have to transition from data monopolizers to value adders. While this state is admittedly years away, it will be very interesting to see what happens between now and then.
Many are excited about the potential for blockchain; however, in order for this future to materialize, we as technologists have some work to do. Blockchain is still foreign to many developers and that community will have to grow in order to build the tools and applications we are looking forward to seeing.
For blockchain to reach mass adoption, there are also technical challenges that need to be addressed, such as scalability and private key management. Members of the blockchain community are fervently addressing these problems, but they won’t be solved overnight.
Imagine a world that forgoes Postgres in favor of blockchain to build a distributed database, where people can anonymously store and retrieve data that is perpetually transparent. That world is here today, but to fully realize it, we must pick up our keyboards and start building the apps and tools that will take blockchain to the next level.
[In this edition of the Thoughtworks Beacon Podcast, Software Architect Neal Ford, Prasanna Pendse, Tech Principal, and Jonny LeRoy, Head of Technology, North America, discuss blockchain, how it differs from Bitcoin and what the future holds for the technology.]
Footnotes
[1] A blockchain’s ability to resist tampering is directly related to the number of nodes running. It can only be truly tamper-proof if the network is very large, much like the Bitcoin network.
[2] Light clients exist for many blockchains and they do not require users to run a blockchain node. These clients are useful; however, they do come with security risks and do not help decentralize the network.
[3] Proof-of-work is not the only way to solve this problem. Proof of stake is a less resource-intensive alternative, but it requires people to have already obtained a given currency, which undermines the decentralized nature of the network if it is not large enough.
[4] They are not fully anonymous because if you can connect transactions to an identity, then you can subsequently connect the public key of those transactions to that identity.
[5] dApps are still very nascent and require some development infrastructure to ensure their robustness and reliability. The DAO exploit highlighted that need.
Disclaimer: The statements and opinions expressed in this article are those of the author(s) and do not necessarily reflect the positions of Thoughtworks.