There’s more than one way to put code on a blockchain
In most discussions about blockchains, it doesn’t take long for the notion of “smart contracts” to come up. In the popular imagination, smart contracts automate the execution of interparty interactions, without requiring a trusted intermediary. By expressing legal relationships in code rather than words, they promise to enable transactions to take place directly and without errors, whether accidental or deliberate.
From a technical viewpoint, a smart contract is something more specific: computer code that lives on a blockchain and defines the rules for that chain’s transactions. This description sounds simple enough, but behind it lies a great deal of variation in how these rules are expressed, executed and validated. When choosing a blockchain platform for a new application, the question “Does this platform support smart contracts?” isn’t the right one to ask. Instead, we need to be asking: “What type of smart contracts does this platform support?”
In this article, my goal is to examine some of the major differences between smart contract approaches and the trade-offs they represent. I’ll do this by looking at four popular enterprise blockchain platforms which support some form of customized on-chain code. First, IBM’s Hyperledger Fabric, which calls its contracts “chaincode”. Second, our MultiChain platform, which introduces smart filters in version 2.0. Third, Ethereum (and its permissioned Quorum and Burrow spin-offs), which popularized the “smart contract” name. And finally, R3 Corda, which references “contracts” in its transactions. Despite all of the different terminology, ultimately all of these refer to the same thing – application-specific code that defines the rules of a chain.
Before going any further, I should warn the reader that much of the following content is technical in nature, and assumes some familiarity with general programming and database concepts. For better or worse, this cannot be avoided – without getting into the details, it’s impossible to make an informed decision about whether to use a blockchain for a particular project, and (if so) the right type of blockchain to use.
Let’s begin with some context. Imagine an application that is shared by multiple organizations, which is based on an underlying database. In a traditional centralized architecture, this database is hosted and administered by a single party which all of the participants trust, even if they do not trust each other. Transactions which modify the database are initiated only by applications on this central party’s systems, often in response to messages received from the participants. The database simply does what it’s told because the application is implicitly trusted to only send it transactions that make sense.
Blockchains provide an alternative way of managing a shared database, without a trusted intermediary. In a blockchain, each participant runs a “node” that holds a copy of the database and independently processes the transactions which modify it. Participants are identified using public keys or “addresses”, each of which has a corresponding private key known only to the identity owner. While transactions can be created by any node, they are “digitally signed” by their initiator’s private key in order to prove their origin.
Nodes connect to each other in a peer-to-peer fashion, rapidly propagating transactions and the “blocks” in which they are timestamped and confirmed across the network. The blockchain itself is literally a chain of these blocks, which forms an ordered log of every historical transaction. A “consensus algorithm” is used to ensure that all nodes reach agreement on the content of the blockchain, without requiring centralized control. (Note that some of this description does not apply to Corda, in which each node has only a partial copy of the database and there is no global blockchain. We’ll talk more about that later on.)
In principle, any shared database application can be architected by using a blockchain at its core. But doing so creates a number of technical challenges which do not exist in a centralized scenario:
Transaction rules. If any participant can directly change the database, how do we ensure that they follow the application’s rules? What stops one user from corrupting the database’s contents in a self-serving way?
Determinism. Once these rules are defined, they will be applied multiple times by multiple nodes when processing transactions for their own copy of the database. How do we ensure that every node obtains exactly the same result?
Conflict prevention. With no central coordination, how do we deal with two transactions that each follow the application’s rules, but nonetheless conflict with each other? Conflicts can stem from a deliberate attempt to game the system, or be the innocent result of bad luck and timing.
So where do smart contracts, smart filters and chaincode come in? Their core purpose is to work with a blockchain’s underlying infrastructure in order to solve these challenges. Smart contracts are the decentralized equivalent of application code – instead of running in one central place, they run on multiple nodes in the blockchain, creating or validating the transactions which modify the shared database’s contents.
Let’s begin with transaction rules, the first of these challenges, and see how they are expressed in Fabric, MultiChain, Ethereum and Corda respectively.
Transaction rules perform a specific function in blockchain-powered databases – restricting the transformations that can be performed on that database’s state. This is necessary because a blockchain’s transactions can be initiated by any of its participants, and these participants do not trust each other sufficiently to allow them to modify the database at will.
Let’s see two examples of why transaction rules are needed. First, imagine a blockchain designed to aggregate and timestamp PDF documents that are published by its participants. In this case, nobody should have the right to remove or change documents, since doing so would undermine the entire purpose of the system – document persistence. Second, consider a blockchain representing a shared financial ledger, which keeps track of the balances of its users. We cannot allow a participant to arbitrarily inflate their own balance, or take others’ money away.
Inputs and outputs
Our blockchain platforms rely on two broad approaches for expressing transaction rules. The first, which I call the “input–output model”, is used in MultiChain and Corda. Here, transactions explicitly list the database rows or “states” which they delete and create, forming a set of “inputs” and “outputs” respectively. Modifying a row is expressed as the equivalent operation of deleting that row and creating a new one in its place.
Since database rows are only deleted in inputs and only created in outputs, every input must “spend” a previous transaction’s output. The current state of the database is defined as the set of “unspent transaction outputs” or “UTXOs”, i.e. outputs from previous transactions which have not yet been used. Transactions may also contain additional information, called “metadata”, “commands” or “attachments”, which don’t become part of the database but help to define their meaning or purpose.
Given these three sets of inputs, outputs and metadata, the validity of a transaction in MultiChain or Corda is defined by some code which can perform arbitrary computations on those sets. This code can validate the transaction, or else return an error with a corresponding explanation. You can think of the input–output model as an automated “inspector” holding a checklist which ensures that transactions follow each and every rule. If the transaction fails any one of those checks, it will automatically be rejected by all of the nodes in the network.
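To make this concrete, here’s a minimal sketch in Python of how a node might process an input–output transaction. Everything here is a hypothetical illustration, not any platform’s real API – real platforms also verify digital signatures, handle metadata and much more:

```python
# Minimal sketch of the input–output ("UTXO") model. All names are
# hypothetical illustrations; signature checking and much else is omitted.

# The database state: the set of unspent outputs, keyed by (txid, index).
utxos = {
    ("tx0", 0): {"owner": "alice", "amount": 50},
}

def apply_transaction(txid, inputs, outputs, rule):
    """Validate a transaction against the UTXO set and an
    application-specific rule, then spend its inputs and create
    its outputs."""
    if any(ref not in utxos for ref in inputs):
        return "error: input already spent or unknown"
    spent = [utxos[ref] for ref in inputs]
    error = rule(spent, outputs)          # apply the application's rules
    if error:
        return error
    for ref in inputs:                    # rows deleted by this transaction
        del utxos[ref]
    for index, row in enumerate(outputs): # rows created by this transaction
        utxos[(txid, index)] = row
    return None                           # accepted by this node

# An example rule requiring that asset quantities are conserved:
def conserve(inputs, outputs):
    if sum(r["amount"] for r in inputs) != sum(r["amount"] for r in outputs):
        return "error: input and output totals differ"
    return None

result = apply_transaction(
    "tx1", [("tx0", 0)],
    [{"owner": "bob", "amount": 20}, {"owner": "alice", "amount": 30}],
    conserve)
```

Note how modifying alice’s row is expressed as spending her old output and creating a new, smaller one in its place.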
It should be noted that, despite sharing the input–output model, MultiChain and Corda implement it very differently. In MultiChain, outputs can contain assets and/or data in JSON, text or binary format. The rules are defined in “transaction filters” or “stream filters”, which can be set to check all transactions, or only those involving particular assets or groupings of data. By contrast, a Corda output “state” is represented by an object in the Java or Kotlin programming language, with defined data fields. Corda’s rules are defined in “contracts” which are attached to specific states, and a state’s contract is only applied to transactions which contain that state in their inputs or outputs. This relates to Corda’s unusual visibility model, in which transactions can only be seen by their counterparties or those whose subsequent transactions they affect.
Contracts and messages
The second approach, which I call the “contract–message model”, is used in Hyperledger Fabric and Ethereum. Here, multiple “smart contracts” or “chaincodes” can be created on the blockchain, and each has its own database and associated code. A contract’s database can only be modified by its code, rather than directly by blockchain transactions. This design pattern is similar to the “encapsulation” of code and data in object-oriented programming.
With this model, a blockchain transaction begins as a message sent to a contract, with some optional parameters or data. The contract’s code is executed in reaction to the message and parameters, and is free to read and write its own database as part of that reaction. Contracts can also send messages to other contracts, but cannot access each other’s databases directly. In the language of relational databases, contracts act as enforced “stored procedures”, where all access to the database goes via some predefined code.
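As a sketch of this encapsulation, consider a toy contract in Python which maintains a counter in its own private database. The names and message format here are hypothetical, not any platform’s real API:

```python
# Sketch of the contract–message model, with hypothetical names: each
# contract encapsulates a private database which only its own code can
# read and write, in reaction to incoming messages.

class CounterContract:
    def __init__(self):
        self.db = {}                      # this contract's own database

    def handle(self, message, params):
        # The contract reacts to a message, reading and writing its
        # own database as part of that reaction.
        if message == "increment":
            key = params["key"]
            self.db[key] = self.db.get(key, 0) + 1
            return self.db[key]
        return "error: unknown message"

counter = CounterContract()
counter.handle("increment", {"key": "visits"})
second = counter.handle("increment", {"key": "visits"})
```

The key design point is that no code outside the contract touches `db` directly – all access goes through the contract’s own message handler.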
Both Fabric and Quorum, a variation on Ethereum, complicate this picture by allowing a network to define multiple “channels” or “private states”. The aim is to mitigate the problem of blockchain confidentiality by creating separate environments, each of which is only visible to a particular sub-group of participants. While this sounds promising in theory, in reality the contracts and data in each channel or private state are isolated from those in the others. As a result, in terms of smart contracts, these environments are equivalent to separate blockchains.
Let’s see how to implement the transaction rules for a single-asset financial ledger with these two models. Each row in our ledger’s database has two columns, containing the owner’s address and the quantity of the asset owned. In the input–output model, transactions must satisfy two conditions:
The total quantity of assets in a transaction’s outputs has to match the total in its inputs. This prevents users from creating or deleting money arbitrarily.
Every transaction has to be signed by the owner of each of its inputs. This stops users from spending each other’s money without permission.
Taken together, these two conditions are all that is needed to create a simple but viable financial system.
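These two conditions can be sketched as a single rule function in Python. The `signers` set stands in for the addresses whose signatures the transaction carries – a real platform would verify those signatures cryptographically:

```python
# The two ledger conditions above, as a hypothetical rule function.
# Each row is a dict with the owner's address and the asset quantity.

def check_ledger_rules(inputs, outputs, signers):
    # Rule 1: total output quantity must match total input quantity.
    if sum(row["amount"] for row in inputs) != sum(row["amount"] for row in outputs):
        return "error: assets created or destroyed"
    # Rule 2: every input's owner must have signed the transaction.
    if any(row["owner"] not in signers for row in inputs):
        return "error: missing signature from input owner"
    return None  # transaction is valid

ok = check_ledger_rules(
    inputs=[{"owner": "alice", "amount": 50}],
    outputs=[{"owner": "bob", "amount": 20}, {"owner": "alice", "amount": 30}],
    signers={"alice"})
bad = check_ledger_rules(
    inputs=[{"owner": "alice", "amount": 50}],
    outputs=[{"owner": "bob", "amount": 60}],
    signers={"alice"})
```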
In the contract–message model, the asset’s contract supports a “send payment” message, which takes three parameters: the sender’s address, recipient’s address, and quantity to be sent. In response, the contract executes the following four steps:
Verify that the transaction was signed by the sender.
Check that the sender has sufficient funds.
Deduct the requested quantity from the sender’s row.
Add that quantity to the recipient’s row.
If either of the checks in the first two steps fails, the contract will abort and no payment will be made.
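Here’s how those four steps might look as a hypothetical contract method in Python, with `signer` standing in for the cryptographically verified transaction signer:

```python
# The four "send payment" steps above, sketched as a hypothetical
# contract method. Real platforms verify the signature cryptographically.

class AssetContract:
    def __init__(self, balances):
        self.balances = dict(balances)    # the contract's own database

    def send_payment(self, sender, recipient, quantity, signer):
        if signer != sender:                          # step 1: verify signature
            return "error: not signed by sender"
        if self.balances.get(sender, 0) < quantity:   # step 2: check funds
            return "error: insufficient funds"
        self.balances[sender] -= quantity             # step 3: deduct from sender
        self.balances[recipient] = (                  # step 4: credit recipient
            self.balances.get(recipient, 0) + quantity)
        return None  # payment made

ledger = AssetContract({"alice": 50})
paid = ledger.send_payment("alice", "bob", 20, signer="alice")
refused = ledger.send_payment("alice", "bob", 100, signer="alice")
```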
So both the input–output and contract–message models are effective ways to define transaction rules and keep a shared database safe. Indeed, on a theoretical level, each of these models can be used to simulate the other. In practice however, the most appropriate model will depend on the application being built. Does each transaction affect few or many pieces of information? Do we need to be able to guarantee transaction independence? Does each piece of data have a clear owner or is there some global state to be shared?
It is beyond our scope here to explore how the answers should influence a choice between these two models. But as a general guideline, when developing a new blockchain application, it’s worth trying to express its transaction rules in both forms, and seeing which fits more naturally. The difference will express itself in terms of: (a) ease of programming, (b) storage requirements and throughput, and (c) speed of conflict detection. We’ll talk more about this last issue later on.
Permissions + assets + streams
When it comes to transaction rules, there is one way in which MultiChain specifically differs from Fabric, Ethereum and Corda. Unlike these other platforms, MultiChain has several built-in abstractions that provide some basic building blocks for blockchain-driven applications, without requiring developers to write their own code. These abstractions cover three areas that are commonly needed: (a) dynamic permissions, (b) transferrable assets, and (c) data storage.
For example, MultiChain manages permissions for connecting to the network, sending and receiving transactions, creating assets or streams, or controlling the permissions of other users. Multiple fungible assets can be issued, transferred, retired or exchanged safely and atomically. Any number of “streams” can be created on a chain, for publishing, indexing and retrieving on-chain or off-chain data in JSON, text or binary formats. All of the transaction rules for these abstractions are available out-of-the-box.
When developing an application in MultiChain, it’s possible to ignore this built-in functionality, and express transaction rules using smart filters only. However, smart filters are designed to work together with MultiChain’s built-in abstractions, by enabling their default behavior to be restricted in customized ways. For example, the permission for certain activities might be controlled by specific administrators, rather than the default behavior where any administrator will do. The transfer of certain assets can be limited by time, or require additional approval above a certain amount. The data in a particular stream can be validated to ensure that it consists only of JSON structures with required fields and values.
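To illustrate the last of these examples, here’s the kind of JSON check such a filter might perform, sketched in Python for readability. A real MultiChain smart filter would express this against the platform’s own filter API, and the required field names here are invented:

```python
import json

# Illustrative sketch of a stream validation rule: accept an item only
# if it is a JSON object containing a set of required fields. The field
# names are hypothetical examples.

REQUIRED_FIELDS = {"invoice_id", "amount", "currency"}

def validate_stream_item(raw_item):
    try:
        data = json.loads(raw_item)
    except ValueError:
        return "error: item is not valid JSON"
    if not isinstance(data, dict):
        return "error: item is not a JSON object"
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        return "error: missing fields " + ", ".join(sorted(missing))
    return None  # item accepted

good = validate_stream_item('{"invoice_id": "A17", "amount": 99.5, "currency": "EUR"}')
bad = validate_stream_item('{"invoice_id": "A18"}')
```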
In all of these cases, smart filters create additional requirements for transactions to be validated, but do not remove the simple rules that are built in. This can help address one of the key challenges in blockchain applications: the fact that a bug in some on-chain code can lead to disastrous consequences. We’ve seen endless examples of this problem in the public Ethereum blockchain, most famously in the demise of The DAO and the Parity multisignature bugs. Broader surveys have found a large number of common vulnerabilities in Ethereum smart contracts that enable attackers to steal or freeze other people’s funds.
Of course, MultiChain smart filters may contain bugs too, but their consequences are more limited in scope. For example, the built-in asset rules prevent one user from spending another’s money, or accidentally making their own money disappear, no matter what other logic a smart filter contains. If a bug is found in a smart filter, it can be deactivated and replaced with a corrected version, while the ledger’s basic integrity is protected. Philosophically, MultiChain is closer to traditional database architectures, where the database platform provides a number of built-in abstractions, such as columns, tables, indexes and constraints. More powerful features such as triggers and stored procedures can optionally be coded up by application developers, in cases where they are actually needed.
The determinism requirement
Let’s move on to the next part of our showdown. No matter which approach we choose, the custom transaction rules of a blockchain application are expressed as computer code written by application developers. And unlike centralized applications, this code is going to be executed more than one time and in more than one place for each transaction. This is because multiple blockchain nodes belonging to different participants have to each verify and/or execute that transaction for themselves.
This repeated and redundant code execution introduces a new requirement that is rarely found in centralized applications: determinism. In the context of computation, determinism means that a piece of code will always give the same answer for the same parameters, no matter where and when it is run. This is absolutely crucial for code that interacts with a blockchain because, without determinism, the consensus between the nodes on that chain can catastrophically break down.
Let’s see how this looks in practice, first in the input–output model. If two nodes have a different opinion about whether a transaction is valid, then one will accept a block containing that transaction and the other will not. Since every block explicitly links back to a previous block, this will create a permanent “fork” in the network, with one or more nodes not accepting the majority opinion about the entire blockchain’s contents from that point on. The nodes in the minority will be cut off from the database’s evolving state, and will no longer be able to effectively use the application.
Now let’s see what happens if consensus breaks down in the contract–message model. If two nodes have a different opinion about how a contract should respond to a particular message, this can lead to a difference in their databases’ contents. This in turn can affect the contract’s response to future messages, including messages it sends to other contracts. The end result is an increasing divergence between different nodes’ view of the database’s state. (The “state root” field in Ethereum blocks ensures that any difference in contracts’ responses leads immediately to a fully catastrophic blockchain fork, rather than risking staying hidden for a period of time.)
Sources of non-determinism
So non-determinism in blockchain code is clearly a problem. But if the basic building blocks of computation, such as arithmetic, are deterministic, what do we have to worry about? Well, it turns out, quite a few things:
Most obviously, random number generators, since by definition these are designed to produce a different result every time.
Checking the current time, since nodes won’t be processing transactions at exactly the same time, and in any event their clocks may be out of sync. (It’s still possible to implement time-dependent rules by making reference to timestamps within the blockchain itself.)
Querying external resources such as the Internet, disk files, or other programs running on a computer. These resources cannot be guaranteed to always give the same response, and may become unavailable.
Running multiple pieces of code in parallel “threads”, since this leads to a “race condition” where the order in which these processes finish cannot be predicted.
Performing floating point calculations, since these can give minutely different answers on different computer processor architectures.
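The second of these pitfalls, and its standard remedy, can be illustrated in a few lines of Python (the deadline value is an arbitrary example):

```python
import time

# One source of non-determinism and its standard fix: rules that depend
# on time should consult timestamps recorded in the blockchain itself,
# not the node's own clock. Names and values are hypothetical.

DEADLINE = 1_700_000_000      # an agreed cut-off, as a Unix timestamp

def non_deterministic_check():
    # Different nodes run this at different moments, with clocks that
    # may disagree - so they can reach different verdicts.
    return time.time() <= DEADLINE

def deterministic_check(block_timestamp):
    # Every node sees the same timestamp in the block, so every node
    # reaches the same verdict for the same transaction.
    return block_timestamp <= DEADLINE
```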
Our four blockchain platforms employ several different approaches to avoiding these pitfalls.
Determinism by endorsement
When it comes to determinism, Hyperledger Fabric adopts a completely different approach. In Fabric, when a “client” node wants to send a message to some chaincode, it first sends that message to some “endorser” nodes. Each of these nodes executes the chaincode independently, forming an opinion of the message’s effect on that chaincode’s database. These opinions are sent back to the client together with a digital signature which constitutes a formal “endorsement”. If the client receives enough endorsements of the intended outcome, it creates a transaction containing those endorsements, and broadcasts it for inclusion in the chain.
In order to guarantee determinism, each piece of chaincode has an “endorsement policy” which defines exactly which nodes, and how many of them, must endorse its transactions in order for those transactions to be considered valid. So long as enough endorsers agree on a transaction’s outcome, any stray non-determinism in the chaincode is caught before the transaction enters the chain, rather than corrupting nodes’ databases afterwards.
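The endorsement flow can be sketched in Python as follows, with stand-ins for chaincode execution and digital signatures (all names are hypothetical):

```python
# Sketch of Fabric-style endorsement: the client sends its message to
# several endorsers, each executes the chaincode independently, and the
# transaction only goes ahead if enough endorsers agree on the outcome.
# All names here are hypothetical illustrations.

def execute_chaincode(message, params):
    # Deterministic stand-in for running the chaincode against its database.
    return {"new_balance": params["balance"] - params["amount"]}

def endorse(endorser_id, message, params):
    outcome = execute_chaincode(message, params)
    return {"endorser": endorser_id, "outcome": outcome,
            "signature": "sig-of-" + endorser_id}  # stand-in for a real signature

def collect_endorsements(endorsers, required, message, params):
    endorsements = [endorse(e, message, params) for e in endorsers]
    # The client only proceeds if enough endorsers agree on the outcome.
    first = endorsements[0]["outcome"]
    agreed = [e for e in endorsements if e["outcome"] == first]
    if len(agreed) < required:
        return None          # not enough agreement: abandon the transaction
    return {"message": message, "params": params, "endorsements": agreed}

tx = collect_endorsements(["peer1", "peer2", "peer3"], required=2,
                          message="send payment",
                          params={"balance": 50, "amount": 20})
```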
By now it’s clear that many blockchain use cases have nothing to do with financial transactions. Instead, the chain’s purpose is to enable the decentralized aggregation, ordering, timestamping and archiving of any type of information, including structured data, correspondence or documentation. The blockchain’s core value is enabling its participants to provably and permanently agree on exactly what data was entered, when and by whom, without relying on a trusted intermediary. For example, SAP’s recently launched blockchain platform, which supports MultiChain and Hyperledger Fabric, targets a broad range of supply chain and other non-financial applications.
The simplest way to use a blockchain for recording data is to embed each piece of data directly inside a transaction. Every blockchain transaction is digitally signed by one or more parties, replicated to every node, ordered and timestamped by the chain’s consensus algorithm, and stored permanently in a tamper-proof way. Any data within the transaction will therefore be stored identically but independently by every node, along with a proof of who wrote it and when. The chain’s users are able to retrieve this information at any future time.
For example, MultiChain 1.0 allowed one or more named “streams” to be created on a blockchain and then used for storing and retrieving raw data. Each stream has its own set of write permissions, and each node can freely choose which streams to subscribe to. If a node is subscribed to a stream, it indexes that stream’s content in real-time, allowing items to be retrieved quickly based on their ordering, timestamp, block number or publisher address, as well as via a “key” (or label) by which items can be tagged. MultiChain 2.0 (since alpha 1) extended streams to support Unicode text or JSON data, as well as multiple keys per item and multiple items per transaction. It also added summarization functions such as “JSON merge” which combine items with the same key or publisher in a useful way.
Confidentiality and scalability
While storing data directly on a blockchain works well, it suffers from two key shortcomings – confidentiality and scalability. To begin with confidentiality, the content of every stream item is visible to every node on the chain, and this is not necessarily a desirable outcome. In many cases a piece of data should only be visible to a certain subset of nodes, even if other nodes are needed to help with its ordering, timestamping and notarization.
Confidentiality is a relatively easy problem to solve, by encrypting information before it is embedded in a transaction. The decryption key for each piece of data is only shared with those participants who are meant to see it. Key delivery can be performed on-chain using asymmetric cryptography, or via some off-chain mechanism, as preferred. Any node lacking the key to decrypt an item will see nothing more than binary gibberish.
Scalability, on the other hand, is a more significant challenge. Let’s say that any decent blockchain platform should support a network throughput of 500 transactions per second. If the purpose of the chain is information storage, then the size of each transaction will depend primarily on how much data it contains. Each transaction will also need (at least) 100 bytes of overhead to store the sender’s address, digital signature and a few other bits and pieces.
If we take an easy case, where each item is a small JSON structure of 100 bytes, the overall data throughput would be 100 kilobytes per second, calculated from 500 × (100+100). This translates to under 1 megabit/second of bandwidth, which is comfortably within the capacity of any modern Internet connection. Data would accumulate at a rate of around 3 terabytes per year, which is no small amount. But with 12 terabyte hard drives now widely available, and RAID controllers which combine multiple physical drives into a single logical one, we could easily store 10-20 years of data on every node without too much hassle or expense.
However, things look very different if we’re storing larger pieces of information, such as scanned documentation. A reasonable quality JPEG scan of an A4 sheet of paper might be 500 kilobytes in size. Multiply this by 500 transactions per second, and we’re looking at a throughput of 250 megabytes per second. This translates to 2 gigabits/second of bandwidth, which is faster than most local networks, let alone connections to the Internet. At Amazon Web Services’ cheapest published price of $0.05 per gigabyte, it means an annual bandwidth bill of $400,000 per node. And where will each node store the 8000 terabytes of new data generated annually?
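For those who like to check the arithmetic, both scenarios can be re-derived in a few lines of Python, using the same assumptions as above:

```python
# Re-deriving the back-of-envelope numbers in the text. Assumptions:
# 500 transactions/second, 100 bytes of per-transaction overhead.

TX_PER_SEC = 500
OVERHEAD = 100                        # bytes per transaction
SECONDS_PER_YEAR = 60 * 60 * 24 * 365

# Case 1: small JSON items of 100 bytes each.
small_throughput = TX_PER_SEC * (100 + OVERHEAD)       # bytes/second
small_mbits = small_throughput * 8 / 1_000_000         # megabits/second
small_tb_per_year = small_throughput * SECONDS_PER_YEAR / 1e12

# Case 2: 500 kilobyte document scans.
large_throughput = TX_PER_SEC * (500_000 + OVERHEAD)   # bytes/second
large_gbits = large_throughput * 8 / 1_000_000_000     # gigabits/second
large_tb_per_year = large_throughput * SECONDS_PER_YEAR / 1e12
```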
It’s clear that, for blockchain applications storing many large pieces of data, straightforward on-chain storage is not a practical choice. To add insult to injury, if data is encrypted to solve the problem of confidentiality, nodes are being asked to store a huge amount of information that they cannot even read. This is not an attractive proposition for the network’s participants.
The hashing solution
So how do we solve the problem of data scalability? How can we take advantage of the blockchain’s decentralized notarization of data, without replicating that data to every node on the chain?
The answer is with a clever piece of technology called a “hash”. A hash is a long number (think 256 bits, or around 80 decimal digits) which uniquely identifies a piece of data. The hash is calculated from the data using a one-way function which has an important cryptographic property: Given any piece of data, it is easy and fast to calculate its hash. But given a particular hash, it is computationally infeasible to find a piece of data that would generate that hash. And when we say “computationally infeasible”, we mean more calculations than there are atoms in the known universe.
Hashes play a crucial role in all blockchains, by uniquely identifying transactions and blocks. They also underlie the computational challenge in proof-of-work systems like bitcoin. Many different hash functions have been developed, with gobbledygook names like BLAKE2, MD5 and RIPEMD160. But in order for any hash function to be trusted, it must endure extensive academic review and testing. These tests come in the form of attempted attacks, such as “preimage” (finding an input with the given hash), “second preimage” (finding a second input with the same hash as the given input) and “collision” (finding any two different inputs with the same hash). Surviving this gauntlet is far from easy, with a long and tragic history of broken hash functions proving the famous maxim: “Don’t roll your own crypto.”
To go back to our original problem, we can solve data scalability in blockchains by embedding the hashes of large pieces of data within transactions, instead of the data itself. Each hash acts as a “commitment” to its input data, with the data itself being stored outside of the blockchain or “off-chain”. For example, using the popular SHA256 hash function, a 500 kilobyte JPEG image can be represented by a 32-byte number, a reduction of over 15,000×. Even at a rate of 500 images per second, this puts us comfortably back in the territory of feasible bandwidth and storage requirements, in terms of the data stored on the chain itself.
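Here’s the commitment scheme in a few lines of Python, using a block of zeroes as a stand-in for the 500 kilobyte image:

```python
import hashlib

# Committing to a large file by storing only its SHA256 hash on-chain.
data = b"\x00" * 500_000          # stand-in for a 500 kilobyte JPEG scan

on_chain_hash = hashlib.sha256(data).hexdigest()   # 32 bytes of hash

# The on-chain footprint shrinks from 500,000 bytes to 32:
reduction = len(data) / 32

# Later, anyone holding a candidate file can check it against the chain:
def verify_against_chain(candidate, expected_hash):
    return hashlib.sha256(candidate).hexdigest() == expected_hash
```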
Of course, any blockchain participant that needs an off-chain image cannot reproduce it from its hash. But if the image can be retrieved in some other way, then the on-chain hash serves to confirm who created it and when. Just like regular on-chain data, the hash is embedded inside a digitally signed transaction, which was included in the chain by consensus. If an image file falls out of the sky, and the hash for that image matches a hash in the blockchain, then the origin and timestamp of that image is confirmed. So the blockchain is providing exactly the same value in terms of notarization as if the image was embedded in the chain directly.
A question of delivery
So far, so good. By embedding hashes in a blockchain instead of the original data, we have an easy solution to the problem of scalability. Nonetheless, one crucial question remains:
How do we deliver the original off-chain content to those nodes which need it, if not through the chain itself?
This question has several possible answers, and we know of MultiChain users applying them all. One basic approach is to set up a centralized repository at some trusted party, where all off-chain data is uploaded then subsequently retrieved. This system could naturally use “content addressing”, meaning that the hash of each piece of data serves directly as its identifier for retrieval. However, while this setup might work for a proof-of-concept, it doesn’t make sense for production, because the whole point of a blockchain is to remove trusted intermediaries. Even if on-chain hashes prevent the intermediary from falsifying data, it could still delete data or fail to deliver it to some participants, due to a technical failure or the actions of a rogue employee.
A more promising possibility is point-to-point communication, in which the node that requires some off-chain data requests it directly from the node that published it. This avoids relying on a trusted intermediary, but suffers from three alternative shortcomings:
It requires a map of blockchain addresses to IP addresses, to enable the consumer of some data to communicate directly with its publisher. Blockchains can generally avoid this type of static network configuration, which can be a problem in terms of failover and privacy.
If the original publisher node has left the network, or is temporarily out of service, then the data cannot be retrieved by anyone else.
If a large number of nodes are interested in some data, then the publisher will be overwhelmed by requests. This can create severe network congestion, slow the publisher’s system down, and lead to long delays for those trying to retrieve that data.
In order to avoid these problems, we’d ideally use some kind of decentralized delivery mechanism. Nodes should be able to retrieve the data they need without relying on any individual system – be it a centralized repository or the data’s original publisher. If multiple parties have a piece of data, they should share the burden of delivering it to anyone else who wants it. Nobody needs to trust an individual data source, because on-chain hashes can prove that data hasn’t been tampered with. If a malicious node delivers me the wrong data for a hash, I can simply discard that data and try asking someone else.
For those who have experience with peer-to-peer file sharing protocols such as Napster, Gnutella or BitTorrent, this will all sound very familiar. Indeed, many of the basic principles are the same, but there are two key differences. First, assuming we’re using our blockchain in an enterprise context, the system runs within a closed group of participants, rather than the Internet as a whole. Second, the blockchain adds a decentralized ordering, timestamping and notarization backbone, enabling all users to maintain a provably consistent and tamper-resistant view of exactly what happened, when and by whom.
How might a blockchain application developer achieve this decentralized delivery of off-chain content? One common choice is to take an existing peer-to-peer file sharing platform, such as the amusingly-named InterPlanetary File System (IPFS), and use it together with the blockchain. Each participant runs both a blockchain node and an IPFS node, with some middleware coordinating between the two. When publishing off-chain data, this middleware stores the original data in IPFS, then creates a blockchain transaction containing that data’s hash. To retrieve some off-chain data, the middleware extracts the hash from the blockchain, then uses this hash to fetch the content from IPFS. The local IPFS node automatically verifies the retrieved content against the hash to ensure it hasn’t been changed.
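The publish-and-retrieve pattern can be sketched in Python as follows, with an in-memory dictionary standing in for IPFS and a simple list standing in for the blockchain node (all names are hypothetical):

```python
import hashlib

# Sketch of the middleware pattern described above. To keep the example
# self-contained, an in-memory dict stands in for IPFS's content-addressed
# storage, and a list of "transactions" stands in for the blockchain node.

ipfs_store = {}       # content-addressed: hash -> data
blockchain = []       # each transaction carries only a hash

def publish(data):
    digest = hashlib.sha256(data).hexdigest()
    ipfs_store[digest] = data            # store the original data off-chain
    blockchain.append({"hash": digest})  # commit only the hash on-chain
    return digest

def retrieve(digest):
    data = ipfs_store[digest]            # fetch content by its hash
    # Verify the retrieved content against the on-chain commitment:
    if hashlib.sha256(data).hexdigest() != digest:
        raise ValueError("retrieved content does not match on-chain hash")
    return data

h = publish(b"scanned invoice bytes")
roundtrip = retrieve(h)
```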
While this solution is possible, it’s all rather clumsy and inconvenient. First, every participant has to install, maintain and update three separate pieces of software (blockchain node, IPFS node and middleware), each of which stores its data in a separate place. Second, there will be two separate peer-to-peer networks, each with its own configuration, network ports, identity system and permissioning (although it should be noted that IPFS doesn’t yet support closed networks). Finally, tightly coupling IPFS and the blockchain together would make the middleware increasingly complex. For example, if we want the off-chain data referenced by some blockchain transactions to be instantly retrieved (with automatic retries), the middleware would need to be constantly up and running, maintaining its own complex state. Wouldn’t it be nice if the blockchain node did all of this for us?
Off-chain data in MultiChain 2.0
Today we’re delighted to release the third preview version (alpha 3) of MultiChain 2.0, with a fully integrated and seamless solution for off-chain data. Every piece of information published to a stream can be on-chain or off-chain as desired, and MultiChain takes care of everything else.
No really, we mean everything. As a developer building on MultiChain, you won’t have to worry about hashes, local storage, content discovery, decentralized delivery or data verification. Here’s what happens behind the scenes:
The publishing MultiChain node writes the new data in its local storage, slicing large items into chunks for easy digestion and delivery.
The transaction for publishing off-chain stream items is automatically built, containing the chunk hash(es) and size(s) in bytes.
This transaction is signed and broadcast to the network, propagating between nodes and entering the blockchain in the usual way.
When a node subscribed to a stream sees a reference to some off-chain data, it adds the chunk hashes for that data to its retrieval queue. (When subscribing to an old stream, a node also queues any previously published off-chain items for retrieval.)
As a background process, if there are chunks in a node’s retrieval queue, queries are sent out to the network to locate those chunks, as identified by their hashes.
These chunk queries are propagated to other nodes in the network in a peer-to-peer fashion (limited to two hops for now – see technical details below).
Any node which has the data for a chunk can respond, and this response is relayed to the subscriber back along the same path as the query.
If no node answers the chunk query, the chunk is returned to the queue for a later retry.
Otherwise, the subscriber chooses the most promising source for a chunk (based on hops and response time), and sends it a request for that chunk’s data, again along the same peer-to-peer path as the previous response.
The source node delivers the data requested, using the same path again.
The subscriber verifies the data’s size and hash against the original request.
If everything checks out, the subscriber writes the data to its local storage, making it immediately available for retrieval via the stream APIs.
If the requested content did not arrive, or did not match the desired hash or size, the chunk is returned to the queue for future retrieval from a different source.
Most importantly, all of this happens extremely quickly. In networks with low latency, small pieces of off-chain data will arrive at subscribers within a split second of the transaction that references them. And for high load applications, our testing shows that MultiChain 2.0 alpha 3 can sustain a rate of over 1000 off-chain items or 25 MB of off-chain data retrieved per second, on a mid-range server (Core i7) with a decent Internet connection. Everything works fine with off-chain items up to 1 GB in size, far beyond the 64 MB limit for on-chain data. Of course, we hope to improve these numbers further as we spend time optimizing MultiChain 2.0 during its beta phase.
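For the curious, the retrieval pipeline described above boils down to a retry loop over a queue of chunk hashes. Here's an illustrative sketch in Python – real delivery happens peer-to-peer over the network, of course, not via in-process dictionaries:

```python
import hashlib
from collections import deque

def retrieve_chunks(queue: deque, sources: list, storage: dict) -> None:
    """Drain the retrieval queue once, re-queueing chunks that cannot
    be fetched or that fail verification (illustrative sketch only)."""
    for _ in range(len(queue)):
        chunk_hash, expected_size = queue.popleft()
        # Query the network: which sources hold this chunk?
        responders = [s for s in sources if chunk_hash in s]
        if not responders:
            queue.append((chunk_hash, expected_size))  # retry later
            continue
        data = responders[0][chunk_hash]  # pick the most promising source
        # Verify size and hash against the on-chain metadata
        if len(data) == expected_size and \
                hashlib.sha256(data).hexdigest() == chunk_hash:
            storage[chunk_hash] = data
        else:
            queue.append((chunk_hash, expected_size))  # try another source

data = b"off-chain payload"
h = hashlib.sha256(data).hexdigest()
queue = deque([(h, len(data))])
storage = {}
retrieve_chunks(queue, sources=[{h: data}], storage=storage)
assert storage[h] == data
```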
When using off-chain rather than on-chain data in streams, MultiChain application developers have to do exactly two things:
When publishing data, pass an “offchain” flag to the appropriate APIs.
When using the stream querying APIs, consider the possibility that some off-chain data might not yet be available, as reported by the “available” flag. While this situation will be rare under normal circumstances, it’s important for application developers to handle it appropriately.
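In terms of MultiChain's JSON-RPC API, those two things look something like the sketch below. The requests are built by hand for illustration rather than sent to a live node, and the stream and key names are invented:

```python
import json
from binascii import hexlify

def rpc_payload(method: str, *params) -> str:
    """Build a JSON-RPC request body for a MultiChain node."""
    return json.dumps({"method": method, "params": list(params), "id": 1})

# Publishing an item off-chain uses the same API as on-chain
# publishing, plus the "offchain" option as the final parameter.
data_hex = hexlify(b"large payload").decode()
request = rpc_payload("publish", "stream1", "key1", data_hex, "offchain")

# When reading items back, check the "available" flag before using
# the data - off-chain content may still be in transit.
item = {"keys": ["key1"], "offchain": True, "available": False}
if item["offchain"] and not item["available"]:
    print("data not yet retrieved - handle gracefully")
```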
Of course, to prevent every node from retrieving every off-chain item, items should be grouped together into streams in an appropriate way, with each node subscribing to those streams of interest.
On-chain and off-chain items can be used within the same stream, and the various stream querying and summarization functions relate to both types of data identically. This allows publishers to make the appropriate choice for every item in a stream, without affecting the rest of an application. For example, a stream of JSON items about people’s activities might use off-chain data for personally identifying information, and on-chain data for the rest. Subscribers can use MultiChain’s JSON merging to combine both types of information into a single JSON for reading.
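As a simplified model of that merging, later items' top-level keys override earlier ones. The sketch below is illustrative only; MultiChain's own merging behavior is exposed via its stream summarization APIs:

```python
def json_object_merge(items: list) -> dict:
    """Merge a list of JSON objects in order, with later keys
    overriding earlier ones - a simplified model of MultiChain's
    JSON merging."""
    merged = {}
    for item in items:
        merged.update(item)
    return merged

# The on-chain item holds non-sensitive fields; the off-chain item
# carries the personally identifying information.
onchain = {"activity": "login", "timestamp": 1514764800}
offchain = {"name": "Alice Jones", "email": "alice@example.com"}
combined = json_object_merge([onchain, offchain])
assert combined["activity"] == "login" and combined["name"] == "Alice Jones"
```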
If you want to give off-chain stream items a try, just follow MultiChain’s regular Getting Started tutorial, and be sure not to skip section 5.
So what’s next?
With seamless support for off-chain data, MultiChain 2.0 will offer a big step forward for blockchain applications focused on large scale data timestamping and notarization. In the longer term, we’re already thinking about a ton of possible future enhancements to this feature for the Community and/or Enterprise editions of MultiChain:
Implementing stream read permissions using a combination of off-chain items, salted hashes, signed chunk queries and encrypted delivery.
Allowing off-chain data to be explicitly “forgotten”, either voluntarily by individual nodes, or by all nodes in response to an on-chain message.
Selective stream subscriptions, in which nodes only retrieve the data for off-chain items with particular publishers or keys.
Using merkle trees to enable a single on-chain hash to represent an unlimited number of off-chain items, giving another huge jump in terms of scalability.
Pluggable storage engines, allowing off-chain data to be kept in databases or external file systems rather than local disk.
Nodes learning over time where each type of off-chain data is usually available in a network, and focusing their chunk queries appropriately.
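To illustrate the merkle tree item in the list above: a single root hash can commit to any number of off-chain items, with any change to any item changing the root. Here's a minimal sketch – not necessarily the exact construction MultiChain would use:

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list) -> bytes:
    """Compute a merkle root over a list of items, so that one
    on-chain hash can commit to any number of off-chain items."""
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:               # duplicate the last node if odd
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

root = merkle_root([b"item1", b"item2", b"item3"])
assert len(root) == 32
# Changing any single item changes the root
assert root != merkle_root([b"item1", b"item2", b"item4"])
```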
We’d love to hear your feedback on the list above as well as off-chain items in general. With MultiChain 2.0 still officially in alpha, there’s plenty of time to enhance this feature before its final release.
In the meantime, we’ve already started work on “Smart Filters”, the last major feature planned for MultiChain 2.0 Community. A Smart Filter is a piece of code embedded in the blockchain which implements custom rules for validating data or transactions. Smart Filters have some similarities with “smart contracts”, and can do many of the same things, but have key differences in terms of safety and performance. We look forward to telling you more in due course.
While off-chain stream items in MultiChain 2.0 are simple to use, they contain many design decisions and additional features that may be of interest. The list below will mainly be relevant for developers building blockchain applications, and can be skipped by less technical types:
Per-stream policies. When a MultiChain stream is created, it can optionally be restricted to allow only on-chain or off-chain data. There are several possible reasons for doing this, rather than allowing each publisher to decide for themselves. For example, on-chain items offer an ironclad availability guarantee, whereas old off-chain items may become irretrievable if their publisher and other subscribers drop off the network. On the flip side, on-chain items cannot be “forgotten” without modifying the blockchain, while off-chain items are more flexible. This can be important in terms of data privacy rules, such as Europe’s new GDPR regulations.
On-chain metadata. For off-chain items, the on-chain transaction still contains the item’s publisher(s), key(s), format (JSON, text or binary) and total size. All this takes up very little space, and helps application developers determine whether the unavailability of an off-chain item is of concern for a particular stream query.
Two-hop limit. When relaying chunk queries across the peer-to-peer network, there is a trade-off between reachability and performance. While it would be nice for every query to be propagated along every single path, this can clog the network with unnecessary “chatter”. So for now chunk queries are limited to two hops, meaning that a node can retrieve off-chain data from any peer of its peers. In the smaller networks of under 1000 nodes that tend to characterize enterprise blockchains, we believe this will work just fine, but it’s easy for us to adjust this constraint (or offer it as a parameter) if we turn out to be wrong.
Local storage. Each MultiChain node stores off-chain data within the “chunks” directory of its regular blockchain directory, using an efficient binary format and LevelDB index. A separate subdirectory is used for the items in each of the subscribed streams, as well as those published by the node itself. Within each of these subdirectories, duplicate chunks (with the same hash) are only stored once. When a node unsubscribes from a stream, it can choose whether or not to purge the off-chain data retrieved for that stream.
Binary cache. When publishing large pieces of binary data, whether on-chain or off-chain, it may not be practical for application developers to send that data to MultiChain’s API in a single JSON-RPC request. So MultiChain 2.0 implements a binary cache, which enables large pieces of data to be built up over multiple API calls, and then published in a brief final step. Each item in the binary cache is stored as a simple file in the “cache” subdirectory of the blockchain directory, allowing gigabytes of data to also be pushed directly via the file system.
Monitoring APIs. MultiChain 2.0 alpha 3 adds two new APIs for monitoring the asynchronous retrieval of off-chain data. The first API describes the current state of the queue, showing how many chunks (and how much data) are waiting or being queried or retrieved. The second API provides aggregate statistics for all chunk queries and requests sent since the node started up, including counts of different types of failure.
As time goes on, the blockchain world has been separating into two distinct parts. On one hand, public blockchains with their associated cryptocurrencies have enjoyed a remarkable recent comeback, minting many a multi-millionaire. On the other hand, permissioned or enterprise blockchains have seen quiet but steady growth in usage, with their first live deployments across multiple industries during 2017.
One interesting question to consider is the appropriate level of similarity between these two types of chain. Both implement a shared database using peer-to-peer networking, public–private key cryptography, transaction rules and consensus mechanisms that can survive malicious actors. That’s a great deal of common ground. Nonetheless, public and private blockchains have different requirements in terms of confidentiality, scalability and governance. Perhaps these differences point to the need for radically divergent designs.
The Corda platform, developed by the R3 banking consortium, adopts a clear stance on this question. While some aspects were inspired by public blockchains, Corda was designed from scratch based on the needs of R3’s members. Indeed, although R3 still uses the word “blockchain” extensively to help market their product, Corda has no chain of blocks at all. More than any other “distributed ledger” platform I’m aware of, Corda departs radically from the architecture of conventional blockchains.
My goal in this piece is to explain these differences and discuss their implications, for good and bad. Actually, good and bad is the wrong way to put it, because the more interesting question is “Good and bad for what?” This article is far from short. But by the end of it, I hope that readers will gain some understanding of the differences in Corda and their consequent trade-offs. Corda is important because its design decisions bring many of the dilemmas of enterprise blockchains into sharp relief.
One last thing before we dive in. As the CEO of the company behind MultiChain, a popular enterprise blockchain platform, why am I writing in such depth about a supposedly competing product? The standard reason would be to argue for MultiChain’s superiority, but that’s not my motivation here. In fact, I do not see Corda and MultiChain as competitors, because they are fundamentally different in terms of design, architecture and audience. Corda and MultiChain compete in the same way as cruise liners and jet skis – while both transport people by sea, there are almost no real-world situations in which both could be used.
On a more personal note, I’ve learned a great deal from Corda’s technical leadership over the past few years, whether through meetings, correspondence or their public writings, much of which occurred before they joined R3. Some of my interest in Corda stems from the respect I have for this team, and for this reason alone, Corda is worth studying for anyone seeking an understanding of the distributed ledger field.
In order to understand Corda, it’s helpful to start with conventional blockchains. The purpose of a blockchain is to enable a database or ledger to be directly and safely shared by non-trusting parties. This contrasts with centralized databases, which are stored and controlled by a single organization. A blockchain has multiple “nodes”, each of which stores a copy of the database and can belong to a different organization. Nodes connect to each other in a dense peer-to-peer fashion, using a “gossip protocol” in which each node is constantly telling its peers everything it learns. As a result, any node can rapidly broadcast a message to the entire network via many alternative paths.
A database, whether centralized or blockchain-powered, begins in an empty state, and is updated via “transactions”. A transaction is defined as a set of database changes which are “atomic”, meaning that they succeed or fail as a whole. Imagine a database representing a financial ledger, with one row per account. A transaction in which Alice pays $10 to Bob has three steps: (1) verify that Alice’s account contains at least $10, (2) subtract $10 from Alice’s account, and (3) add $10 to Bob’s account. As a basic requirement, any database platform must ensure that no transaction interferes with another. This “isolation” is achieved by locking the rows for both Alice and Bob while the payment is under way. Any other transaction involving these rows must wait until this one is finished.
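Here's a minimal sketch of that atomic, isolated payment in Python, with a lock standing in for the row-level locking a real database engine would use:

```python
import threading

def transfer(ledger: dict, lock: threading.Lock,
             sender: str, recipient: str, amount: int) -> bool:
    """Atomic transfer: all three steps succeed or fail together,
    with a lock providing isolation from concurrent transactions."""
    with lock:
        if ledger.get(sender, 0) < amount:            # step 1: verify funds
            return False
        ledger[sender] -= amount                      # step 2: debit sender
        ledger[recipient] = ledger.get(recipient, 0) + amount  # step 3: credit
        return True

ledger = {"alice": 15, "bob": 0}
lock = threading.Lock()
assert transfer(ledger, lock, "alice", "bob", 10)
assert ledger == {"alice": 5, "bob": 10}
assert not transfer(ledger, lock, "alice", "bob", 10)  # insufficient funds
```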
In a blockchain, every node independently processes every transaction on its own copy of the database. Transactions are created anywhere on the network and automatically propagated to all other nodes. Since the organizations running nodes may have different (or even conflicting) interests, they cannot trust each other to transact fairly. Blockchains therefore need rules which define whether or not a particular transaction is valid. In a shared financial ledger, these rules prevent users from spending each other’s money, or conjuring funds from thin air.
Along with the rules that determine transaction validity, blockchains must also define how transactions will be ordered, since in many cases this ordering is critical. If Alice has $15 and tries to send $10 to both Bob and Charlie in two separate transactions, only one of these payments can succeed. While we might like to say that the first transaction takes precedence, a peer-to-peer network has no objective definition of “first”, since messages can arrive at different nodes in different orders.
In a general sense, the information in any database is separated into records or “rows”, and a transaction can do three different things: delete rows, create rows, and/or modify rows. These can be reduced further to two, since modifying a row is equivalent to deleting that row and creating a new one in its place. To go back to Alice’s payment to Bob, her row containing $15 is deleted, and two new rows are created – one containing $10 for Bob and the other with $5 in “change” for Alice.
Following bitcoin’s and Corda’s terminology, we denote the rows deleted by a transaction as its “inputs”, and those created as its “outputs”. Any row deleted by a transaction must have been created by a previous transaction. Therefore each transaction input consumes (or “spends”) a previous transaction’s output. The up-to-date content of the database is defined by the set of “unspent transaction outputs” or “UTXOs”.
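A toy model makes the input–output structure concrete – the database state is just a dictionary of unspent outputs, keyed by (transaction id, output index):

```python
# Minimal model of the input-output (UTXO) transaction structure.
# The up-to-date database state is the set of unspent outputs.
utxos = {
    ("tx0", 0): {"owner": "alice", "amount": 15},  # created by an earlier tx
}

def apply_transaction(utxos: dict, txid: str,
                      inputs: list, outputs: list) -> None:
    """Delete the consumed rows (inputs) and create new rows (outputs)."""
    for ref in inputs:
        del utxos[ref]
    for n, out in enumerate(outputs):
        utxos[(txid, n)] = out

# Alice pays Bob $10: her $15 output is consumed, and two new
# outputs are created - $10 for Bob and $5 in "change" for Alice.
apply_transaction(
    utxos, "tx1",
    inputs=[("tx0", 0)],
    outputs=[{"owner": "bob", "amount": 10},
             {"owner": "alice", "amount": 5}],
)
assert ("tx0", 0) not in utxos
assert utxos[("tx1", 0)]["amount"] == 10
```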
In a blockchain, a transaction is valid if it fulfills the following three conditions:
Correctness. The transaction must represent a legitimate transformation from inputs to outputs. For example, in a financial ledger, the total quantity of funds in the inputs must match the total in the outputs, to prevent money from magically appearing or disappearing. The only exceptions are special “issuance” or “retirement” transactions, in which funds are explicitly added or removed.
Authorization. The transaction must be authorized by the owner of every output consumed by its inputs. In a financial ledger, this prevents participants from spending each other’s money without permission. Transaction authorization is managed using asymmetric (or public–private key) cryptography. Every row has an owner, identified by a public key, whose corresponding private key is kept secret. In order to be authorized, a transaction must be digitally signed by the owner of each of its inputs. (Note that rows can also have more complex “multisignature” owners, for example where any two out of three parties can authorize their use.)
Uniqueness. If a transaction consumes a particular output, then no other transaction can consume that output again. This is how we prevent Alice from making conflicting payments to both Bob and Charlie. While the transactions for both of these payments could be correct and authorized, the uniqueness rule ensures that only one will be processed by the database.
In a conventional blockchain, every node checks every transaction in terms of these three rules. Later on, we’ll see how Corda divides up this responsibility differently.
A blockchain is literally a chain of blocks, in which every block links to the previous one via a “hash” that uniquely identifies its contents. Each block contains an ordered set of transactions which must not conflict with each other or with those in previous blocks, as well as a timestamp and some other information. Just like transactions, blocks propagate rapidly across the network and are independently verified by every node. Once a transaction appears in a block, it is “confirmed”, leading nodes to reject any conflicting transaction.
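The hash linking between blocks can be sketched in a few lines – the block structure here is simplified, of course:

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """Hash a block's contents deterministically."""
    serialized = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(serialized).hexdigest()

def make_block(prev_hash: str, transactions: list, timestamp: int) -> dict:
    return {"prev": prev_hash, "txs": transactions, "time": timestamp}

genesis = make_block("0" * 64, ["issue 100 to alice"], 1514764800)
block1 = make_block(block_hash(genesis), ["alice pays bob 10"], 1514764860)

# Tampering with an earlier block breaks every later link.
genesis["txs"] = ["issue 1000000 to mallory"]
assert block1["prev"] != block_hash(genesis)
```

Because each block commits to the hash of the one before it, rewriting history anywhere in the chain invalidates every subsequent block – which is exactly what makes confirmed transactions so hard to reverse.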
Who is responsible for creating these blocks, and how can we be sure that all nodes will agree on the authoritative chain? This question of “consensus algorithms” is a huge subject in itself, filled with wondrous acronyms such as PoW (Proof of Work), PBFT (Practical Byzantine Fault Tolerance) and DPoS (Delegated Proof of Stake). We won’t be getting into all that here. Suffice to say that permissioned blockchains for enterprises use some kind of voting scheme, where votes are granted to “validator nodes” who are collectively responsible. The scheme ensures that, so long as a good majority of validator nodes are functioning correctly and honestly, transactions will enter the chain in a (close to) fair order, timestamps will be (approximately) correct, and confirmed transactions cannot be subsequently reversed.
Before discussing some of the challenges of blockchains, I’d like to clarify three additional points. First, while I am using a financial ledger as an example throughout this piece, the input–output model of transactions supports a much broader variety of use cases. Each row can contain a rich data object (think JSON) containing many different types of information – indeed, Corda uses the word “state” rather than “row” for this reason. Richer states change nothing fundamental about transaction rules: correctness is still defined in terms of inputs and outputs, authorization is still required for every input, and uniqueness ensures that each output can only be spent once.
Second, there are many blockchain use cases in which rows are only created in the database, and never deleted. These applications relate to general data storage, timestamping and notarization, rather than maintaining some kind of ledger which is in flux. In these data-only applications, transactions add data in their outputs but consume none in their inputs, allowing the rules for correctness, authorization and uniqueness to be simplified. Although data-only use cases are an increasing focus of our own development at MultiChain, I only mention them in passing here, since Corda was clearly not designed with them in mind.
Finally, it’s worth noting that some blockchain platforms do not use an input–output model. Ethereum presents an alternative paradigm, in which the chain controls a virtual computer with a global state that is managed by “contracts”, and transactions do not connect to each other explicitly. A discussion of Ethereum’s model in permissioned blockchains is beyond our scope here, but see this article for a detailed explanation and critique. One key advantage of the input–output paradigm is that most transactions can be processed in parallel and independently of each other. This property is crucial for Corda, as we’ll see later on.
Let’s imagine that the world’s banks created a shared ledger to represent the ownership, transfer and exchange of a variety of financial assets. In theory, this could be implemented on a regular blockchain, as described above. Each row would contain three columns – an asset identifier such as GOOG or USD, the quantity owned, and the owner’s public key. Each transaction would transfer one or more assets from its inputs to its outputs, with special cases for issuance and retirement.
Every bank in the network would run one or more nodes which connect to the others, propagating and verifying transactions. Senior members would act as validators, with the collective responsibility of confirming, ordering and timestamping transactions. Any validator’s misbehavior would be visible to all the nodes in the network, leading to censure, banishment and/or legal proceedings. With all this in place, any financial asset could be moved across the world in seconds, with the rules of correctness, authorization and uniqueness guaranteeing the ledger’s integrity.
What’s wrong with this picture? Actually, there are three problems: scalability, confidentiality and interoperability. The issue of scalability is simple enough. Our proposed interbank blockchain would require every member to verify, process and store every transaction performed by every bank in the world. Even if this would be technically feasible for the largest financial institutions, the cost of computation and storage would create a significant barrier for many. Surely we’d prefer a system in which participants only see those transactions in which they are immediately involved.
But let’s put scalability aside, since it can ultimately be solved using expensive computers and clever engineering. A more fundamental issue is confidentiality. While it might sound utopian for every transaction to be visible everywhere, in the real world such radical transparency is a non-starter in terms of competition and regulation. If J.P. Morgan and HSBC exchange a pair of assets, they’re unlikely to want Citi and the Bank of China to see what they did. If the transaction was conducted on behalf of these banks’ customers, it could be illegal for them to expose it in this way.
One proposed solution to the problem of confidentiality is “channels”, as implemented in Hyperledger Fabric. Each channel has certain members, who are a subset of the nodes in the network as a whole. A channel’s transactions are visible only to its members, so that each channel effectively acts as a separate blockchain. While this does help with confidentiality, it also undermines the entire point of the exercise. Assets cannot be moved from one channel to another without the help of a trusted intermediary which is active on both. The difficulty of this approach was recently highlighted by SWIFT’s reconciliation proof-of-concept, which estimated that over 100,000 channels would be needed in production. That’s 100,000 islands between which assets cannot be directly moved.
In data-only use cases, where transactions do not consume data in inputs, the confidentiality problem can be sidestepped by encrypting or hashing the data in outputs, and delivering the decryption key or unhashed data outside of the chain. But for a transaction whose inputs consume other transactions’ outputs, every node has to see those inputs and outputs in order to validate the transaction. While advanced cryptographic techniques such as confidential assets and zero knowledge proofs have been developed to partially or completely solve this problem for financial ledgers, these impose a significant performance burden and/or cannot be generalized to any correctness rule.
Finally, let’s talk about interoperability. In an ideal world, every bank would immediately join our global blockchain on the day it was launched. In reality however, multiple blockchains would be adopted by different groups of banks, based on geography or pre-existing relationships. Over time, a member of one group might wish to start transacting with a member of another, by transferring an asset between chains. Just as with channels, this can only be achieved with the help of a trusted intermediary, defeating the blockchain’s purpose.
Corda aims to solve these interrelated problems of scalability, confidentiality and interoperability via a radical rethink of how distributed ledgers work.
Corda’s partial view
The fundamental difference in Corda is easy to explain: Each node only sees some, rather than all, of the transactions processed on the network. While a single logical and conceptual ledger is defined by all these transactions, no individual node sees that ledger in its entirety. To draw a comparison, at any point in time, every dollar bill in the world is in a particular place, but nobody knows where they all are.
So which transactions does a Corda node see? First of all, those in which it is directly involved, because it owns one of that transaction’s inputs or outputs. In a financial ledger, this includes every transaction in which a node is sending or receiving funds. Let’s say Alice creates a transaction which consumes her $15 in an input and has two outputs – one with $10 for Bob, and the other with $5 in “change” for her. After Alice sends Bob this transaction, he can check it for correctness and authorization, verifying that the inputs and outputs balance and that Alice has signed.
However, this transaction on its own is not enough. Bob also needs to verify that Alice’s $15 input state really exists, and that she didn’t just make it up. That means he needs to see the transaction which created this state, and check it for correctness and authorization as well. If this previous transaction, which sent Alice $15, has a $10 input belonging to Denzel and another $5 input from Eric, then he must also verify the transactions which created those. And so on it goes, all the way back to the original “issuance” transaction in which the asset was created. The number of transactions Bob needs to verify will depend on how many times the assets have changed hands and the extent of backwards branching.
Since Corda nodes don’t automatically see every transaction, how do they obtain the ones they need? The answer is from the sender of each new transaction. Before Alice creates a transaction consuming her $15, she must already have verified the transaction in which she received it. And since Alice must have applied the recursive technique above, she will have a copy of every transaction needed for this verification. Bob simply requests these transactions from Alice as part of their interaction. If Alice doesn’t respond appropriately, Bob concludes that Alice is trying to trick him, and rejects the incoming payment. In the case where Bob is sent a new transaction whose inputs have multiple owners, he can obtain the necessary proofs from each.
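The recursive verification described above can be sketched as a walk backwards through a graph of transactions. Signatures and amounts are omitted here to keep the focus on provenance:

```python
def verify_chain_of_provenance(txid, transactions, verified=None):
    """Recursively verify a transaction and every earlier transaction
    whose outputs it consumes, back to the original issuance.
    (Sketch only - real verification also checks signatures,
    amounts and per-state contract rules.)"""
    if verified is None:
        verified = set()
    if txid in verified:
        return True
    tx = transactions.get(txid)
    if tx is None:
        return False  # the sender failed to supply a needed transaction
    verified.add(txid)
    return all(verify_chain_of_provenance(parent, transactions, verified)
               for parent, _index in tx["inputs"])

# Issuance -> Denzel and Eric pay Alice -> Alice pays Bob
transactions = {
    "issue": {"inputs": []},
    "toAlice": {"inputs": [("issue", 0), ("issue", 1)]},
    "toBob": {"inputs": [("toAlice", 0)]},
}
assert verify_chain_of_provenance("toBob", transactions)

# If the sender withholds a dependency, verification fails
incomplete = {k: v for k, v in transactions.items() if k != "issue"}
assert not verify_chain_of_provenance("toBob", incomplete)
```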
So far we’ve explained how Bob can verify the correctness and authorization of an incoming transaction, including recursively retracing its inputs’ origins. But there is one more rule we need to think about: uniqueness. Let’s say Alice is malicious. She can generate one transaction in which she pays $10 to Bob, and another in which she pays the same $10 to Charlie. She can send these transactions to Bob and Charlie respectively, along with a full proof of correctness and authorization of each. While both transactions conflict with each other by consuming the same state, there is no way for Bob and Charlie to know this.
Conventional blockchains solve this problem by every node seeing every transaction, making conflicts easy to detect and reject. So how does Corda, with its partial transaction visibility, address the same problem? The answer is with the help of a “notary”. A notary is a trusted party (or parties working together) which guarantees that a particular state is only consumed once. Each state has a specific notary, which must sign any transaction in which that state is consumed. Once a notary has done this, it must not sign another transaction for the same state. Notaries are the network’s guardians of transaction uniqueness.
While every state can have a different notary, all of the states consumed by a particular transaction must be assigned to the same one. This avoids issues relating to deadlocks and synchronization, which should be familiar for those with distributed database experience. Let’s say Alice and Bob agree to exchange Alice’s $10 for Bob’s £7. The transaction for this exchange must be signed by the notaries of both states, but which one goes first? If Alice’s notary signs but Bob’s fails for some reason, then Alice will be left with an incomplete transaction and can never use her $10 again. If Bob’s signs first then he is similarly exposed. While we might like notaries to simply work together, in practice this requires mutual trust and the use of a consensus protocol, complications which Corda’s designers chose to avoid.
If states with different notaries are required as inputs to a single transaction, their owners first execute special “notary change” transactions, which move a state from one notary to another, changing nothing else. So when parties are building a transaction with multiple inputs, they must first agree on the notary to be used, and then perform the notary changes necessary. While the developer in me felt a small twinge of pain when reading about this workaround, there’s no reason why it won’t work so long as notaries play along.
It should also be clarified that, while each notary is a single logical actor in terms of signing transactions, it need not be under the control of a single party. A group of organizations could run a notary collectively, using an appropriate consensus protocol in which a majority of the participants are needed to generate a valid signature. This would prevent any single malicious party from undermining uniqueness by signing transactions that conflict. In theory, we could even allow every node in the network to participate in this kind of shared notarization, although in that case we’d be more-or-less back to a conventional blockchain.
Let’s recap the key differences between Corda and conventional blockchains. In Corda, there is no unified blockchain which contains all of the transactions confirmed. Nodes only see those transactions in which they are directly involved, or upon which they depend historically. Nodes are responsible for checking transaction correctness and authorization but rely on trusted notaries to verify uniqueness.
Of course, there is a lot more to Corda than this: the use of digital certificates to authenticate identity, “network maps” to help nodes find and trust each other, per-state “contracts” which define correctness from each state’s perspective, a deterministic version of the Java Virtual Machine which executes these contracts, “flows” which automate transaction negotiations, “time windows” which restrict transactions by time, “oracles” that attest to external facts and “CorDapps” which bundle many things together for easy distribution. While each of these features is interesting, equivalents for all can be found in other blockchain platforms. My goal in this article is to focus on that which makes Corda unique.
So does Corda live up to its promise? Does it solve the scalability, confidentiality and interoperability problems of blockchains? And in making its particular choices, how much of a price does Corda pay?
More scalable, sometimes
Let’s start with scalability. Here, Corda’s advantage appears clear, since each node sees only some of the transactions in a network. In a regular blockchain, the maximum throughput is constrained by the speed of the slowest node in processing transactions. By contrast, a Corda network as a whole could process a million transactions per second, while each node sees just a tiny fraction of that. Scalability extends to notaries as well, since the task of signing transactions for uniqueness can be spread between many different notaries, each taking responsibility for a different subset of the network’s states.
Per-asset permissions, capacity upgrading and inline metadata
Today we’re pleased to unveil the second preview release of MultiChain 2.0. This makes substantial progress along the MultiChain 2.0 roadmap, and includes one important surprise feature relating to asset permissions.
Let’s start with the surprise. This release adds the ability to separately control the send and receive permissions for each asset issued on the blockchain. This control is important in environments where each asset has different characteristics in terms of regulation, user identification requirements and so on.
At the time a new asset is issued, it can optionally be specified as receive- and/or send-restricted. Receive-restricted assets can only appear in transaction outputs whose address has receive permissions for that asset. Similarly, send-restricted assets can only be spent in transaction inputs by addresses which have per-asset send permissions. (Note that in all cases, addresses need global send and receive permissions to appear in inputs and outputs respectively.)
The send and receive permissions for an asset can be granted or revoked by any address which has admin or activate permissions for that asset. By default, these permissions are only assigned to the asset issuer, but the issuer (or any subsequently added asset administrator) can extend them to other addresses as well.
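As a sketch of the validation logic described above, the following Python snippet checks whether an address may receive a given asset. The data structures and names here are my own illustration, not MultiChain’s internals; the send-side check is symmetrical.

```python
# Illustrative sketch of the per-asset receive-permission rules described
# above. All names and data structures are hypothetical, for clarity only.

def output_allowed(address, asset, global_receive, receive_perms, receive_restricted):
    """An output is valid if the address has global receive permission and,
    for receive-restricted assets, per-asset receive permission as well."""
    if address not in global_receive:
        return False  # global receive permission is always required
    if asset in receive_restricted and address not in receive_perms.get(asset, set()):
        return False  # restricted asset, and no per-asset permission
    return True

global_receive = {"addr1", "addr2"}          # addresses with global receive permission
receive_restricted = {"bond2025"}            # assets issued as receive-restricted
receive_perms = {"bond2025": {"addr1"}}      # per-asset receive permissions

print(output_allowed("addr1", "bond2025", global_receive, receive_perms, receive_restricted))   # True
print(output_allowed("addr2", "bond2025", global_receive, receive_perms, receive_restricted))   # False
print(output_allowed("addr2", "openasset", global_receive, receive_perms, receive_restricted))  # True
```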
Blockchain parameter upgrades
One of the major features in development for MultiChain 2.0 is blockchain upgrading, to allow many of a chain’s parameters to be changed over time. This is vital because blockchains are designed to run for the long term, and it’s hard to predict how computer systems will be used many years after their creation.
MultiChain 1.0.x already provides a facility for upgrading a single parameter – the chain’s protocol version. This release of MultiChain 2.0 takes a significant step forwards, allowing changes to seven additional parameters related to blockchain performance and scaling. These include the target block time, maximum block size, maximum transaction size and maximum size of metadata.
As with other crucial operations relating to governance, upgrading a chain’s parameters can only be performed by the chain’s administrator(s), subject to a customizable level of consensus. We’re continuing to work on this feature, so look out for more upgradable parameters in future releases of MultiChain 2.0.
Inline metadata
MultiChain 1.0.x already supports unformatted (binary) transaction metadata, which can be embedded raw or wrapped in a stream item. The first preview release of MultiChain 2.0 extended this to allow metadata to be optionally represented in text or JSON format. In all of these cases the metadata appears in a separate transaction output containing an OP_RETURN, which makes the output unspendable by subsequent transactions.
This release of MultiChain 2.0 introduces a new type of metadata which we call “inline”. Inline metadata is stored within a regular spendable transaction output, and so is associated directly with that output’s address and/or assets. As with other forms of metadata, inline metadata can be in binary, text or JSON formats, and is easily writable and readable via a number of different APIs.
The road ahead
With this second preview/alpha release, we’ve completed about half of the work scheduled for the open source Community edition of MultiChain 2.0. You can download and try out alpha 2 by visiting the MultiChain 2.0 preview releases page. On this page you’ll also find documentation for the new and enhanced APIs.
We’ve already started working on the next major feature for MultiChain 2.0, which we’re calling off-chain stream items. In an off-chain item, only a hash of the item’s payload is embedded inside the chain, alongside the item’s keys and some other metadata. The payload itself is stored locally by the publisher and propagated to the stream’s subscribers using peer-to-peer file sharing techniques, with the on-chain hash providing verification. The result is a huge improvement in the scalability and performance of blockchains used to record large amounts of information, where some of this information is only of interest to certain participants. While not originally planned for MultiChain 2.0, this feature rose up our list of priorities in response to user demand.
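The verification side of this scheme can be sketched in a few lines of Python. The use of SHA-256 here is my own illustration of the principle, not a statement about MultiChain’s internals: only the hash is embedded on-chain, and subscribers check each delivered payload against it.

```python
import hashlib

# Sketch of the off-chain stream item idea described above: only the
# payload's hash goes on-chain, while the payload itself travels
# peer-to-peer and is verified on receipt. (SHA-256 is illustrative.)

def publish_offchain(payload: bytes) -> str:
    """Return the hash to embed on-chain; the payload is shared off-chain."""
    return hashlib.sha256(payload).hexdigest()

def verify_offchain(payload: bytes, onchain_hash: str) -> bool:
    """A subscriber checks a received payload against the on-chain hash."""
    return hashlib.sha256(payload).hexdigest() == onchain_hash

h = publish_offchain(b"large document contents")
print(verify_offchain(b"large document contents", h))  # True
print(verify_offchain(b"tampered contents", h))        # False
```

The chain itself stays small (one 32-byte hash per item) no matter how large the payload, which is where the scalability gain comes from.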
As always, we welcome your feedback on the progress of MultiChain 2.0, and look forward to delivering the next preview release in due course.
Today we’re delighted to share the first preview release of MultiChain 2.0, which implements one major part of the MultiChain 2.0 roadmap published earlier this year – a richer data model for streams.
Streams have proven to be a popular feature in MultiChain, providing a natural abstraction for general purpose data storage and retrieval on a blockchain. A MultiChain chain can contain any number of named streams, each of which can have individual write permissions or be open for writing by all. In MultiChain 1.0, each stream item has one or more publishers (who sign it), an optional key for efficient retrieval, a binary data payload up to 64 MB in size, and a timestamp derived from the block in which it’s embedded.
This preview release of MultiChain 2.0, numbered alpha 1, takes streams functionality to a whole new level:
JSON items. As an optional alternative to raw binary data, stream items can now contain any JSON structure, which is stored on the blockchain in the efficient UBJSON serialization format. Since the MultiChain API already uses JSON throughout, these JSON structures can be read and written in a natural and obvious way.
Text items. Stream items may also contain Unicode text, stored efficiently on the blockchain in UTF-8 encoding. Text items can also be read and written directly via the MultiChain API.
Multiple keys. Each stream item can now have multiple keys instead of only one. This enables much more flexible schemes for tagging, indexing and retrieval.
Multiple items per transaction. Multiple items can now be written to the same stream in a single atomic transaction. This allows multiple stream items to: (a) be naturally grouped together under a single transaction ID, (b) take up less space on the blockchain and (c) require fewer signature verifications.
JSON merging. There are new APIs to summarize the items in a stream with a particular key or publisher. The first type of summary offered is a merge of all of the JSON objects in those items. The outcome of the merge is a new object containing all the JSON keys from the individual objects, where the value corresponding to each JSON key is taken from the last item in which that key appears. The merge can be customized in various ways, e.g. to control whether sub-objects are merged recursively and whether null values should be included.
The purpose of JSON merging is to enable a stream to serve as a flexible database for applications built on MultiChain, with the stream key or publisher (as appropriate) acting as a “primary key” for each database entry. The advantage over a regular database is that the stream contains a fully signed and timestamped history of how each entry was changed over time, with the blockchain securing this history immutably through multiparty consensus.
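To illustrate the merge semantics described above, here’s a Python sketch. The parameter names are my own rather than MultiChain’s actual API, and the treatment of null values (skipping them unless asked to keep them) is one possible interpretation of the options mentioned.

```python
# Illustrative sketch of last-wins JSON merging across stream items.
# Parameter names are hypothetical, not MultiChain's real API options.

def merge_json(items, recursive=True, keep_null=False):
    result = {}
    for item in items:
        for key, value in item.items():
            if value is None and not keep_null:
                continue  # one interpretation: drop null values entirely
            if (recursive and isinstance(value, dict)
                    and isinstance(result.get(key), dict)):
                # merge sub-objects recursively instead of replacing them
                result[key] = merge_json([result[key], value], recursive, keep_null)
            else:
                result[key] = value  # later items win for the same key
    return result

items = [
    {"name": "alice", "contact": {"email": "a@x.com"}},
    {"contact": {"phone": "555-1234"}, "age": 30},
    {"age": None},
]
print(merge_json(items))
# {'name': 'alice', 'contact': {'email': 'a@x.com', 'phone': '555-1234'}, 'age': 30}
```

With `recursive=False`, the second item’s `contact` object would replace the first wholesale rather than being merged into it.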
As in previous versions, each node can freely decide which streams to subscribe to, or can subscribe to all streams automatically. If a node is subscribed to a stream, it indexes that stream’s content in real time, allowing efficient retrieval by publisher, key, block, timestamp or position – and now summarization by key or publisher.
Aside from stream items, MultiChain 2.0 alpha 1 also supports JSON and text in raw transaction metadata, as alternatives to the raw binary data supported in MultiChain 1.0.
Finally, this release allows the custom fields of issued assets and created streams to contain any JSON object, instead of the text-only key/value pairs offered in MultiChain 1.0. For forwards compatibility, MultiChain 1.0.2 includes the ability to read (but not write) these richer asset and stream custom fields.
To try out these new features, visit the MultiChain 2.0 preview releases page and download alpha 1. The page also provides detailed documentation on the new APIs and parameters available.
We’d love to hear your feedback on this new functionality. And of course we’re already hard at work on the next major set of enhancements for MultiChain 2.0, scheduled for release early next year.
Here at Coin Sciences, we’re best known for MultiChain, a popular platform for creating and deploying permissioned blockchains. But we began life in March 2014 in the cryptocurrency space, with the goal of developing a “bitcoin 2.0” protocol called CoinSpark. CoinSpark leverages transaction metadata to add external assets (now called tokens) and notarized messaging to bitcoin. Our underlying thinking was this: If a blockchain is a secure decentralized record, surely that record has applications beyond managing its native cryptocurrency.
After less than a year, we stopped developing CoinSpark, due to both a push and a pull. The push was the lack of demand for the protocol – conventional companies were (understandably) reluctant to entrust their core processes to a public blockchain. But there was also a pull, in terms of the developing interest we saw in closed or permissioned distributed ledgers. These can be defined as databases which are safely and directly shared by multiple known but non-trusting parties, and which no single party controls. So in December 2014 we started developing MultiChain to address this interest – a change in direction that Silicon Valley would call a “pivot”.
Two years since its first release, MultiChain has proven an unqualified success, and will remain our focus for the foreseeable future. But we still take an active interest in the cryptocurrency space and its rapid pace of development. We’ve studied Ethereum’s gas-limited virtual machine, confidential CryptoNote-based systems like Monero, Zcash with its (relatively) efficient zero knowledge proofs, and new entrants such as Tezos and Eos. We’ve also closely observed the crypto world’s endless dramas, such as bitcoin’s block size war of attrition, the failures of numerous exchanges, Ethereum’s DAO disaster and Tether’s temporary untethering. Crypto news is the gift that keeps on giving.
Crypto and the enterprise
Aside from sheer curiosity, there’s a good reason for us to watch so closely. We fully expect that many of the technologies developed for cryptocurrencies will eventually find their way into permissioned blockchains. And I should stress here the word eventually, because the crypto community has (to put it mildly) a far higher risk appetite than enterprises exploring new techniques for integration.
It’s important to be clear about the similarities and differences between cryptocurrencies and enterprise blockchains, because so much anguish is caused by the use of the word “blockchain” to describe both. Despite the noisy objections of some, I believe this usage is reasonable, because both types of chain share the goal of achieving decentralized consensus between non-trusting entities over a record of events. As a result, they share many technical characteristics, such as digitally signed transactions, peer-to-peer networking, transaction constraints and a consensus mechanism based on a chain of blocks.
Despite these similarities, the applications of open cryptocurrency blockchains and their permissioned enterprise counterparts appear to be utterly distinct. If you find this surprising or implausible, consider the following parallels: The TCP/IP networking protocol is used to connect my computer to my printer, but also powers the entire Internet. Graphics cards make 3D video games more realistic, but can also simulate neural networks for “deep learning”. Compression based on repeating sequences makes web sites faster, but also helps scientists store genetic data efficiently. In computing, multi-purpose technologies are the norm.
So here at Coin Sciences, we believe that blockchains will be used for both cryptocurrencies and enterprise integration over the long term. We don’t fall on either side of the traditional (almost tribal) divide between advocates of public and private chains. Perhaps this reflects an element of wishful thinking, because a thriving cryptocurrency ecosystem will develop more technologies (under liberal open source licenses) that we can use in MultiChain. But I don’t think that’s the only reason. I believe there is a compelling argument in favor of cryptocurrencies, which can stand on its own.
In favor of crypto
What is the point of cryptocurrencies like bitcoin? What do they bring to the world? I believe the answer is the same now as in 2008, when Satoshi Nakamoto published her famous white paper. They enable direct transfers of economic value over the Internet, without a trusted intermediary, and this is an incredibly valuable thing. But unlike Satoshi’s original vision, I do not see this as a better way to buy coffee in person or kettles online. Rather, cryptocurrencies are a new class of asset for people looking to diversify their financial holdings in terms of risk and control.
Let me explain. In general people can own two types of asset – physical and financial. For most of us physical assets are solid and practical items, like land, houses, cars, furniture, food and clothing, while a lucky few might own a boat or some art. By contrast, financial assets consist of a claim on the physical assets or government-issued money held by others. Unlike physical assets, financial assets are useless on their own, but can easily be exchanged for useful things. This liquidity and exchangeability makes them attractive despite their abstract form.
Depending on who you ask, the total value of the world’s financial assets is between $250 and $300 trillion, or an average of $35-40k per person alive. The majority of this sum is tied up in bonds – that is, money lent to individuals, companies and governments. Most of the rest consists of shares in public companies, spread across the stock exchanges of the world. Investors have plenty of choice.
Nonetheless, all financial assets have something in common – their value depends on the good behavior of specific third parties. Furthermore, with the exception of a few lingering bearer assets, they cannot be transferred or exchanged without a trusted intermediary. These characteristics create considerable unease for these assets’ owners, and that unease intensifies during periods of financial instability. If a primary purpose of wealth is to make people feel safe in the face of political or personal storms, and the wealth itself is at risk from such a storm, then it’s failing to do its job.
So it’s natural for people to seek money-like assets which don’t depend on the good behavior of any specific third party. This drive underlies the amusingly-named phenomenon of gold bugs – people who hold a considerable portion of their assets in physical gold. Gold has been perceived as valuable by humans for thousands of years, so it’s reasonable to assume this will continue. The value of gold cannot be undermined by governments, who often succumb to the temptation to print too much of their own currency. And just as in medieval times, gold can be immediately used for payment without a third party’s assistance or approval.
Despite these qualities, gold is far from ideal. It’s expensive to store, heavy to transport, and can only be handed over through an in-person interaction. In the information age, surely we’d prefer an asset which is decentralized like gold but is stored digitally rather than physically, and can be sent across the world in seconds. This, in short, is the value proposition of cryptocurrencies – teleportable gold.
On intrinsic value
The most immediate and obvious objection to this thesis is that, well, it’s clearly ridiculous. You can’t just invent a new type of money, represented in bits and bytes, and call it Gold 2.0. Gold is a real thing – look it’s shiny! – and it has “intrinsic value” which is independent of its market price. Gold is a corrosion-resistant conductor of electricity and can be used for dental fillings. Unlike bitcoin, if nobody else in the world wanted my gold, I could still do something with it.
There’s some merit to this argument, but it’s weaker than it initially sounds. Yes, gold has some intrinsic value, but its market price is not derived from that value. In July 2001 an ounce of gold cost $275, ten years later it cost $1840, and today it’s back around the $1200 mark. Did the practical value of dental fillings and electrical wiring rise sevenfold in ten years and then plummet in the subsequent six?
Clearly not. The intrinsic value argument is about something more subtle – it places a lower bound on gold’s market price. If gold ever became cheaper than its functional substitutes, such as copper wiring or dental amalgam, electricians and dentists would snap it up. So if you buy some gold today, you can be confident that it will always be worth something, even if it’s (drastically) less than the price you paid.
Cryptocurrencies lack the same type of lower bound, derived from their practical utility (we’ll discuss a different form of price support later on). If everyone in the world lost interest in bitcoin, or it was permanently shut down by governments, or the bitcoin blockchain ceased to function, then any bitcoins you hold would indeed be worthless. These are certainly risks to be aware of, but their nature also points to the source of a cryptocurrency’s value – the network of people who have an interest in holding and transacting in it. For bitcoin and others, that network is large and continuing to grow.
Indeed, if we look around, we can find many types of asset which are highly valued but have negligible practical use. Examples include jewelry, old paintings, special car license plates, celebrity autographs, rare stamps and branded handbags. We might even say that, in terms of suitability for purpose, property in city centers is drastically overpriced compared to the suburbs. In these cases and more, it’s hard to truly justify why people find something valuable – the reason is buried deep in our individual and collective psyches. The only thing these assets have in common is their relative scarcity.
So I wouldn’t claim that bitcoin’s success was a necessary or predictable consequence of its invention, however brilliant that may have been. What happened was a complete surprise to most people, myself included, like the rise of texting, social media, sudoku and fidget spinners. There’s only one reason to believe that people will find cryptocurrencies valuable, and that is the fact that they appear to be doing so, in greater and greater numbers. Bitcoin and its cousins have struck a psychoeconomic nerve. People like the idea of owning digital money which is under their ultimate control.
Against crypto maximalism
At this point I should clarify that I am not a “cryptocurrency maximalist”. I do not believe that this new form of money will take over the world, replacing the existing financial landscape that we depend on. The reason for my skepticism is simple: Cryptocurrencies are a poor solution for the majority of financial transactions.
I’m not just talking about their sky-high fees and poor scalability, which can be technically resolved with time. The real problem with bitcoin is its core raison d’être – the removal of financial intermediaries. In reality, intermediaries play a crucial role in making our financial activity secure. Do consumers want online payments to be irreversible, if a merchant has ripped them off? Do companies want a data loss or breach to cause immediate bankruptcy? One of my favorite Twitter memes is this from Dave Birch (although note that bitcoin is not truly anonymous or untraceable):
While it’s wonderful to send value directly across the Internet, the price of this wizardry is a lack of recourse when something goes wrong. For the average Joe buying a book or a house, this trade-off is simply a bad deal. And the endless news stories about stolen cryptocurrency and hacked bitcoin exchanges aren’t going to change his mind. As a result, I believe cryptocurrencies will always be a niche asset, and nothing more. They will find their place inside or outside of the existing financial order, alongside small cap stocks and high yield bonds. Not enough people are thinking about the implications of this boring and intermediate outcome, which to me seems most likely of all.
A pointed historical analogy can be drawn with the rise of e-commerce. In the heady days of the dot com boom, pundits were predicting that online stores would supersede their physical predecessors. Others said that nobody would want to buy unseen goods from web-based upstarts. Twenty years later, Amazon, Ebay and Alibaba have indeed built their empires, but physical stores are still with us and still attract plenty of shoppers. In practice, most of us purchase some things online, and other things offline, depending on the item in question. There are trade-offs between these two forms of commerce, just as there are between cryptocurrencies and other asset classes. He who diversifies wins.
Now about that price
If cryptocurrencies will be around in the long term, but won’t destroy the existing financial order, then the really interesting question is this: Exactly how big are they going to get? Fifty years from now, what will be the total market capitalization of all the cryptocurrency in the world?
In my view, the only honest answer can be: I’ve no idea. I can make a strong case for a long-term (inflation-adjusted) market cap of $15 billion, since that’s exactly where crypto was before this year’s (now deflating) explosion. And I can make an equally strong case for $15 trillion, since the total value of the world’s gold is currently $7 trillion, and cryptocurrencies are better in so many ways. I’d be surprised if the final answer went outside of this range, but a prediction this wide is as good as no prediction at all.
Most financial assets have some kind of metric which acts to anchor their price. Even in turbulent markets, they don’t stray more than 2-3x in either direction before rational investors bring them back into line. For example, the exchange rates between currencies gravitate towards purchasing power parity, defined as the rate at which a basket of common goods costs the same in every country. Bonds gravitate towards their redemption price, adjusted for interest, inflation and risk, which depends on the issuing party. Stocks gravitate towards a price/earnings ratio of 10 to 25, because of the alternatives available to income-seeking investors. (One exception appears to be high-growth technology stocks, but even these eventually come back down to earth. Yes, Amazon, your day will come.)
When it comes to the world of crypto, there is no such grounding. Cryptocurrencies aren’t used for pricing common goods, and they don’t pay dividends or have a deadline for redemption. They also lack the pedigree of gold or artwork, whose price has been discovered over hundreds of years. As a result, crypto prices are entirely at the mercy of Keynesian animal spirits, namely the irrational, impulsive and herd-like decisions that people make in the face of uncertainty. To paraphrase Benjamin Graham, who wrote the book on stock market investing, Mr Crypto Market is madder than a madman. The geeks among us might call it chaos theory in action, with thousands of speculators feeding off each other in an informational vacuum.
Of course, some patterns can be discerned in the noise. I don’t want to write (or be accused of writing) a guide to cryptocurrency investing, so I’ll mention them only in brief: reactions to political uncertainty and blockchain glitches, periods of media-driven speculation, profit-taking by crypto whales, 2 to 4 year cycles, deliberate pump-and-dump schemes, and the relentless downward pressure caused by proof-of-work mining. But if I could give one piece of advice, it would be this: Buy or sell to ensure you’ll be equally happy (and unhappy) whether crypto prices double or halve in the next week. Because either can happen, and you have no way of knowing which.
If the price of a cryptocurrency isn’t tied to anything and moves unpredictably, could it go down to zero? Barring a blockchain’s catastrophic technical failure, I think the answer is no. Consider those speculators who bought bitcoin in 2015 and sold out during the recent peak, making a 10x return. If the price of bitcoin goes back to its 2015 level, it would be a no-brainer for them to buy back in again. In the worst case, they’ll lose a small part of their overall gains. But if history repeats itself, they can double those gains. And maybe next time round, the price will go even higher.
This rational behavior of previous investors translates into a cryptocurrency’s price support, at between 10% and 25% (my estimate) of its historical peak. That’s exactly what happened during 2015 (see chart below) when bitcoin’s price stabilized in the $200-$250 range after dropping dramatically from over $1000 a year earlier. At the time there was no good reason to believe that it would ever rise again, but the cost of taking a punt became too low to resist.
So I believe that cryptocurrencies will be with us for the long term. As long as bitcoin is worth some non-trivial amount, it can be used as a means of directly sending money online. And as long as it serves this purpose, it will be an attractive alternative investment for people seeking to diversify. The same goes for other cryptocurrencies that have reached a sufficient level of interest and support, such as Ethereum and Litecoin. In Ethereum’s case, this logic applies whether or not smart contracts ever find serious applications.
On that subject, I should probably (and reluctantly) mention the recent wave of token Initial Coin Offerings (ICOs) on Ethereum. For the most part, I don’t see these as attractive investments, because their offer price may well be a high point to which they never return. And the sums involved are often ridiculous – if $18 million was enough to fund the initial development of Ethereum, I don’t see why much simpler projects are raising ten times that amount. My best guess is that many ICO investors are looking for something to do with their newly-found Ether riches, which they prefer not to sell for fear of driving down the price. Ironically, after being collected by these ICOs, much of that Ether is being sold anyway.
Back to reality
There’s a certain symmetry between people’s reactions to cryptocurrencies and enterprise blockchains. In both cases, some shamelessly drive the hype, claiming that bitcoin will destroy the financial system, or that enterprise chains will replace relational databases. Others are utterly dismissive, seeing cryptocurrencies as elaborate Ponzi schemes and permissioned blockchains as a technological farce.
In my view, these extreme positions are all ignoring a simple truth – that there are trade-offs between different ways of doing things, and in the case of both cryptocurrencies and enterprise blockchains, these trade-offs are clear to see. A technology doesn’t need to be good for everything in order to succeed – it just needs to be good for some things. The people who are doing those things tend to find it.
So when it comes to both public and private blockchains, it’s time to stop thinking in binary terms. Each type of chain will find its place in the world, and provide value when used appropriately. In the case of cryptocurrencies, as an intermediary-free method for digital value transfer and an alternative asset class. And in the case of enterprise blockchains, as a new approach to database sharing without a trusted intermediary.
That, at least, is the bet that we’re making here.
Disclosure: The author has a financial interest in various cryptocurrencies. Coin Sciences Ltd does not.
Where flexible thinking is preferable to dogmatism
“The highest good, than which there is no higher, is the blockchain, and consequently it is immutably good, hence truly eternal and truly immortal.” — Saint Augustine, De natura boni, i, 405 C.E. (with minor edits)
If you ask someone well-informed about the characteristics of blockchains, the word “immutable” will invariably appear in the response. In plain English, this word is used to denote something which can never be modified or changed. In a blockchain, it refers to the global log of transactions, which is created by consensus between the chain’s participants. The basic notion is this: once a blockchain transaction has received a sufficient level of validation, some cryptography ensures that it can never be replaced or reversed. This marks blockchains as different from regular files or databases, in which information can be edited and deleted at will. Or so the theory goes.
In the raucous arena of blockchain debate, immutability has become a quasi-religious doctrine – a core belief that must not be shaken or questioned. And just like the doctrines in mainstream religions, members of opposing camps use immutability as a weapon of derision and ridicule. The past year has witnessed two prominent examples:
Cryptocurrency advocates claiming that immutability can only be achieved through decentralized economic mechanisms such as proof-of-work. From this perspective, private blockchains are laughable because they depend on the collective good behavior of a known group of validators, who clearly cannot be trusted.
Scorn poured on the idea of an editable (or mutable) blockchain, in which retroactive modifications can be made to the transaction history under certain conditions. Mockers posed the question: What could possibly be the point of a blockchain if its contents can easily be changed?
For those of us on the sidelines, it’s fun to watch the mudslinging. Not least because both of these criticisms are plain wrong, and stem from a fundamental misunderstanding of the nature of immutability in blockchains (and indeed any computer system). For those short on time, here’s the bottom line:
In blockchains, there is no such thing as perfect immutability. The real question is: What are the conditions under which a particular blockchain can and cannot be changed? And do those conditions match the problem we’re trying to solve?
To put it another way, a blockchain’s transactions are not written into the mind of God (with apologies to Augustine above). Instead, the chain’s behavior depends on a network of corporeal computer systems, which will always be vulnerable to destruction or corruption. But before we get into the details of how, let’s proceed by recapping some basics of blockchains themselves.
Blockchains in brief
A blockchain runs on a set of nodes, each of which may be under the control of a separate company or organization. These nodes connect to each other in a dense peer-to-peer network, so that no individual node acts as a central point of control or failure. Each node can generate and digitally sign transactions which represent operations in some kind of ledger or database, and these transactions rapidly propagate to other nodes across the network in a gossip-like way.
Each node independently verifies every new incoming transaction for validity, in terms of: (a) its compliance with the blockchain’s rules, (b) its digital signature and (c) any conflicts with previously seen transactions. If a transaction passes these tests, it enters that node’s local list of provisional unconfirmed transactions (the “memory pool”), and will be forwarded on to its peers. Transactions which fail are rejected outright, while others whose evaluation depends on unseen transactions are placed in a temporary holding area (the “orphan pool”).
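The validation steps above can be sketched in a few lines of Python. The `Node` and `Tx` structures and the rule checks here are illustrative assumptions, not any real platform's API:

```python
# Hypothetical sketch of per-node transaction handling, assuming simplified
# Tx and Node structures. Real nodes verify signatures cryptographically;
# here the results of those checks are passed in as booleans.
from dataclasses import dataclass, field

@dataclass
class Tx:
    txid: str
    inputs: tuple          # txids this transaction spends or depends on
    signature_ok: bool     # result of digital signature verification
    rule_ok: bool          # result of checking the chain's rules

@dataclass
class Node:
    confirmed: set = field(default_factory=set)   # txids already in blocks
    spent: set = field(default_factory=set)       # inputs consumed so far
    mempool: dict = field(default_factory=dict)   # provisional transactions
    orphans: dict = field(default_factory=dict)   # waiting on unseen inputs

    def receive(self, tx: Tx) -> str:
        if not (tx.rule_ok and tx.signature_ok):
            return "rejected"                     # fails rules or signature
        if any(i in self.spent for i in tx.inputs):
            return "rejected"                     # conflicts with a seen tx
        if any(i not in self.confirmed and i not in self.mempool
               for i in tx.inputs):
            self.orphans[tx.txid] = tx            # depends on unseen txs
            return "orphaned"
        self.spent.update(tx.inputs)              # mark inputs consumed
        self.mempool[tx.txid] = tx                # provisional; forward on
        return "accepted"
```

A transaction that spends an already-spent input is rejected outright, while one referencing a transaction the node has never seen sits in the orphan pool until its dependencies arrive.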
At periodic intervals, a new block is generated by one of the “validator” nodes on the network, containing a set of as-yet unconfirmed transactions. Every block has a unique 32-byte identifier called a “hash”, which is determined entirely by the block’s contents. Each block also includes a timestamp and a link to a previous block via its hash, creating a literal “block chain” going back to the very beginning.
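As a rough illustration of how each block's hash commits to both its contents and its predecessor, consider the toy function below. Real chains hash a structured binary header rather than a string, but the chaining property is the same:

```python
# Toy illustration of block chaining: each block's 32-byte identifier is a
# hash over its timestamp, transactions and the previous block's hash.
import hashlib

def block_hash(prev_hash: str, timestamp: int, txs: list) -> str:
    payload = prev_hash + str(timestamp) + ",".join(txs)
    return hashlib.sha256(payload.encode()).hexdigest()  # 32-byte digest

genesis = block_hash("0" * 64, 1500000000, ["coinbase"])
block1  = block_hash(genesis,  1500000600, ["tx1", "tx2"])

# Changing any transaction in the first block changes its hash, which no
# longer matches the link stored in block1 -- the property on which the
# chain's tamper-evidence relies.
tampered = block_hash("0" * 64, 1500000000, ["forged"])
assert tampered != genesis
```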
Just like transactions, blocks propagate across the network in a peer-to-peer fashion and are independently verified by each node. To be accepted by a node, a block must contain a set of valid transactions which do not conflict with each other or with those in the previous blocks to which it links. If a block passes this and other tests, it is added to that node’s local copy of the blockchain, and the transactions within are “confirmed”. Any transactions in the node’s memory pool or orphan pool which conflict with those in the new block are immediately discarded.
Every chain employs some sort of strategy to ensure that blocks are generated by a plurality of its participants. This ensures that no individual or small group of nodes can seize control of the blockchain’s contents. Most public blockchains like bitcoin use “proof-of-work” which allows blocks to be created by anyone on the Internet who can solve a pointless and fiendishly difficult mathematical puzzle. By contrast, in private blockchains, blocks tend to be signed by one or more permitted validators, using an appropriate scheme to prevent minority control. Our product MultiChain uses a technique called “mining diversity” which requires a minimum proportion of the permitted validators to participate in order to create a valid chain.
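One simplified reading of such a round-robin rule can be sketched as follows. The exact look-back formula here is an assumption for illustration, not MultiChain's precise specification:

```python
# Simplified sketch of a "mining diversity"-style rule: a validator may not
# sign a new block if it signed any of the last k blocks, where k is derived
# from the diversity fraction. The rounding used here is an assumption.
import math

def can_validate(validator: str, recent_signers: list,
                 num_validators: int, diversity: float) -> bool:
    k = math.ceil(diversity * num_validators) - 1   # blocks to look back
    if k <= 0:
        return True                                 # diversity effectively off
    return validator not in recent_signers[-k:]

# With 4 permitted validators and diversity 0.75, each validator must sit
# out the 2 blocks after any block it signs, so no single validator (or
# small clique) can produce a long run of blocks alone.
```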
Depending on the consensus mechanism used, two different validator nodes might simultaneously generate conflicting blocks, both of which point to the same previous one. When such a “fork” happens, different nodes in the network will see different blocks first, leading them to have different opinions about the chain’s recent history. These forks are automatically resolved by the blockchain software, with consensus regained once a new block arrives on one of the branches. Nodes that were on the shorter branch automatically rewind their last block and replay the two blocks on the longer one. If we’re really unlucky and both branches are extended simultaneously, the conflict will be resolved after the third block on one branch, or the one after that, and so on. In practice, the probability of a fork persisting drops exponentially as its length increases. In private chains with a limited set of validators, the likelihood can be reduced to zero after a small number of blocks.
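The rewind-and-replay behavior can be sketched as follows, with the branch representation (a simple list of block identifiers) an assumption for illustration:

```python
# Illustrative sketch of fork resolution: when a competing branch grows
# longer than the local one, the node rewinds back to the common ancestor
# and adopts the longer branch.
def resolve_fork(my_branch: list, rival_branch: list) -> list:
    # Find how many blocks the two branches share from the start
    # (the common ancestor is the last shared block).
    shared = 0
    for mine, theirs in zip(my_branch, rival_branch):
        if mine != theirs:
            break
        shared += 1
    if len(rival_branch) <= len(my_branch):
        return my_branch            # our branch is at least as long; keep it
    rewound = my_branch[shared:]    # blocks to undo; their transactions go
                                    # back to the memory pool unless they
                                    # conflict with the replayed branch
    return rival_branch             # adopt the longer branch
```

Because each new block extends one branch or the other, a tie between equal-length branches is broken as soon as the next block arrives, which is why the probability of a persistent fork falls off so quickly.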
Nonetheless, it’s important to remember that each node is running on a computer system owned and controlled by a particular person or organization, so the blockchain cannot force it to do anything. The purpose of the chain is to help honest nodes to stay in sync, but if enough of its participants choose to change the rules, no earthly power can stop them. That’s why we need to stop asking whether a particular blockchain is truly and absolutely immutable, because the answer will always be no. Instead, we should consider the conditions under which a particular blockchain can be modified, and then check if we’re comfortable with those conditions for the use case we have in mind.
Mutability in public chains
Let’s return to the two examples cited in the introduction, in which the doctrine of immutability has been used as a basis for ridicule. We’ll begin with the claim that the consensual validation procedures used in permissioned blockchains cannot bring about the “true immutability” promised by public chains.
This criticism is most easily addressed by pointing to the vulnerability of public blockchains themselves. Take, for example, the Ethereum blockchain, which suffered a devastating exploit in June 2016. Someone found a coding loophole in a smart contract called “The DAO”, in which almost $250 million had been invested, and began draining its funds at speed. While this clearly violated the intentions of the contract’s creators and investors, its terms and conditions relied on the mantra that “code is law”. Law or not, less than a month later, the Ethereum software was updated to prevent the hacker from withdrawing the cryptocurrency “earned”.
Of course, this update could not be enforced, since every Ethereum user controls their own computer. Nonetheless, it was publicly supported by Vitalik Buterin, Ethereum’s founder, as well as many other community leaders. As a result, most users complied, and the blockchain with the new rules kept the name “Ethereum”. A minority disagreed with the change and continued the blockchain according to its original rules, earning the title “Ethereum Classic”. A more accurate choice of names might be “Ethereum compromised” and “Ethereum the pure”. Either way, democracy is democracy, and (the pragmatic and popular) “Ethereum” is now worth over ten times (the idealistic but sidelined) “Ethereum Classic”.
Now let’s consider a less benevolent way in which public blockchain immutability can be undermined. Recall that block creation or “mining” in bitcoin and Ethereum uses a proof-of-work scheme, in which a mathematical problem must be solved in order to generate a block and claim its reward. The value of this reward inevitably turns mining into an arms race, with miners competing to solve the problems faster. To compensate, the network periodically adjusts the difficulty to maintain a constant rate of block creation, once every 10 minutes in bitcoin or 15 seconds in Ethereum.
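The retargeting idea can be sketched as follows. The constants mirror bitcoin's 2016-block, 10-minute schedule; clamping and other details of the real rule are omitted:

```python
# Sketch of proof-of-work difficulty retargeting: scale difficulty by the
# ratio of expected time to actual time, so blocks keep arriving on schedule
# even as total mining power grows.
def retarget(old_difficulty: float, actual_seconds: float,
             blocks: int = 2016, target_spacing: int = 600) -> float:
    expected_seconds = blocks * target_spacing
    return old_difficulty * expected_seconds / actual_seconds

# If 2016 blocks arrived every 9 minutes instead of 10 (miners got faster),
# difficulty rises by about 11% to restore the 10-minute average:
# retarget(1.0, 2016 * 540)
```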
In the last 5 years, bitcoin’s difficulty has increased by a factor of 350,000. Today, the vast majority of bitcoin mining takes place on expensive specialized hardware, in locations where the weather is cold and electricity is cheap. For example, $1,089 will buy you an Antminer S9, which mines blocks 10,000 times faster than any desktop computer and burns 10 times more electricity. This is all a long way from the democratic ideals with which bitcoin was created, even if it does make the blockchain extremely secure.
Well, kind of secure. If someone wanted to undermine the immutability of the bitcoin blockchain, here’s how they would do it. First, they would install more mining capacity than the rest of the network put together, creating a so-called “51% attack”. Second, instead of openly participating in the mining process, they would mine their own “secret branch”, containing whichever transactions they approve and censoring the rest. Finally, when the desired amount of time had passed, they would anonymously broadcast their secret branch to the network. Since the attacker has more mining power than the rest of the network, their branch will contain more proof-of-work than the public one. Every bitcoin node will therefore switch over, since the rules of bitcoin state that the more difficult branch wins. Any previously confirmed transactions not in the secret branch will be reversed, and the bitcoin they spent could be sent elsewhere.
By now, most bitcoin believers will be laughing, because I wrote “install more mining capacity than the rest of the network put together” as if this is trivial to achieve. And they have a point, because of course it’s not easy, otherwise lots of people would already have done it. You need a lot of mining equipment, and a lot of electricity to power it, both of which cost a ton of money. But here’s the inconvenient fact that most bitcoiners brush over: For the government of any mid-size country, the money required is still small change.
Let’s estimate the cost of a 51% attack which reverses a year of bitcoin transactions. At the current bitcoin price of $1500 and reward of 15 bitcoins (including transaction fees) per 10-minute block, miners earn around $1.2 billion per year ($1500 × 15 × 6 × 24 × 365). Assuming (reasonably) that they are not losing money overall, or at least not losing much, this means that total miner expenses must also be in the same range. (I’m simplifying here by amortizing the one-time cost of purchasing mining equipment, but $400 million will buy you enough Antminer S9s to match the current bitcoin network’s mining capacity, so we’re in the right ballpark.)
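For the skeptical, here is the arithmetic behind that figure, using the article's 2017 price and reward assumptions rather than current values:

```python
# The ~$1.2 billion annual mining revenue estimate, spelled out.
price_usd = 1500                  # USD per bitcoin (article's assumption)
reward_btc = 15                   # coins per block, including fees
blocks_per_year = 6 * 24 * 365    # one block every 10 minutes

annual_revenue = price_usd * reward_btc * blocks_per_year
print(annual_revenue)             # 1182600000, i.e. roughly $1.2 billion
```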
Now think about the reports that bitcoin is being used by Chinese citizens to circumvent their country’s capital controls. And consider further that the Chinese government’s tax revenues are approximately $3 trillion per year. Would a non-democratic country’s government spend 0.04% of its budget to shut down a popular method for illegally taking money out of that country? I wouldn’t claim that the answer is necessarily yes. But if you think the answer is definitely no, you’re being more than a little naive. Especially considering that China reportedly employs 2 million people to police Internet content, which totals $10 billion/year if we assume a low wage of $5,000 per year. That puts the $1.2 billion cost of reversing a year of bitcoin transactions in perspective.
Even this analysis understates the problem, because the Chinese government could undermine the bitcoin network much more easily and cheaply. It appears that the majority of bitcoin mining takes place in China, due to low-cost hydroelectric power and other factors. Given a few tanks and platoons, China’s army could physically seize these bitcoin mining operations, and repurpose them to censor or reverse transactions. While the wider bitcoin world would undoubtedly notice, there’s nothing it could do without fundamentally altering the governance structure (and therefore nature) of bitcoin itself. What was that about censorship-free money?
None of this should be construed as a criticism of bitcoin’s design, or a prediction that a network catastrophe will actually happen. The bitcoin blockchain is a remarkable piece of engineering, perhaps even perfect for the purpose its creator(s) had in mind. And if I had to put money on it, I would bet that China and other governments probably won’t attack bitcoin in this way, because it’s not in their ultimate interest to do so. More likely, they’ll focus their wrath on its more untraceable cousins like Dash, Zcash and Monero.
Nonetheless, the mere possibility of this form of interference puts the cryptocurrency immutability doctrine in its place. The bitcoin blockchain and its ilk are not immutable in any perfect or absolute sense. Rather, they are immutable so long as nobody big enough and rich enough decides to destroy them. Still, by relying on the economic cost of subverting the network, cryptocurrency immutability satisfies the specific needs of people who don’t want to trust governments, companies and banks. It may not be perfect, but it’s the best they can do.
Rewriteable private chains
Now let’s move on to private blockchains, designed for the needs of governments and large companies. We can begin by noting that, from the perspective of these organizations, immutability based on proof-of-work is a commercial, legal and regulatory non-starter, because it allows any (sufficiently rich) actor to anonymously attack the network. For institutions, immutability can only be grounded in the good behavior of other similar institutions, with whom they can sign a contract and sue if need be. As a bonus, private blockchains are far less costly to run, since blocks only need a simple digital signature from the nodes that approve them. So long as a majority of validator nodes are following the rules, the end result is stronger and cheaper immutability than any public cryptocurrency can offer.
Of course, immutability is still easy to undermine if all the participants in a chain decide to do so together. Let’s imagine a private blockchain used by six hospitals to aggregate data on infections. A program in one hospital writes a large and erroneous data set to the chain, which is a source of inconvenience for the other participants. A few phone calls later, the IT departments of all the hospitals agree to “rewind” their nodes back one hour, delete the problematic data, and then allow the chain to continue as if nothing happened. If all the hospitals agree to do this, who’s going to stop them? Indeed, apart from the staff involved, who will even know that it happened? (It should be noted that some consensus algorithms like PBFT don’t provide an official mechanism for rollbacks, but this doesn’t help with governance since nodes are still free to bypass the rules.)
Now consider a case where most of a private blockchain’s participants agree to rewind and remove some transaction, but a few withhold their consent. Since every organization’s node is under its ultimate control, nobody can force the minority to join the consensus. However, by sticking to their principles, these users will find themselves on a fork being ignored by everyone else. Like the virtuous proponents of Ethereum Classic, their place in heaven may well be assured. But back here on earth, they will be excluded from the consensus process for which the chain was deployed, and might as well give up completely. The only practical application of transactions outside the consensus is to serve as evidence in a court of law.
With this in mind, let’s talk about the second case in which the doctrine of blockchain immutability has been used to ridicule ideas. Here, we’re referring to Accenture’s idea of using a chameleon hash to enable a block buried deep in a chain to be easily replaced. The primary motivation, as described by David Treat, is to allow an old problematic transaction to be quickly and efficiently removed. Under the scheme, if a block substitution does occur, a “scar” is left behind which all participants can see. (It should be noted that any later transactions that depend on the deleted one would need to be removed as well.)
It’s hard to overstate how many people poured scorn on this idea when it was announced. Twitter and LinkedIn were aghast and aflutter. And I’m not just talking about the crypto crowd, which takes sporting pleasure in mocking anything related to enterprise blockchains. The idea was broadly slammed by private blockchain advocates as well.
And yet, under the right conditions, the idea of allowing blockchains to be modified retroactively via chameleon hashes can make perfect sense. To understand why, we begin with a simple question: in this type of blockchain, who would actually have the power to replace old blocks? Clearly, it can’t be any unidentified network participant, because that would render the chain ungovernable.
The answer is that a chameleon hash can only be used by those who hold its secret key. The key is required to enable a new version of a block, with different transactions, to be given the same chameleon hash as before. Of course, we probably don’t want centralized control in a blockchain, so we can make the scheme stronger by having multiple chameleon hashes per block, each of whose key is held by a different party. Or we might use secret sharing techniques to divide a single chameleon hash key between multiple parties. Either way, the chain can be configured so that a retroactive block substitution can only occur if a majority of key holders approve it. Is this starting to sound familiar?
Allow me to render the parallel more explicit. Let’s say that we share control over chameleon hashes between those same validating nodes which are responsible for block creation. This means that an old block can only be replaced if a majority of validating nodes agree to do so. And yet, as we discussed earlier, any blockchain can already be retroactively modified by a majority of validating nodes, via the rewind and replay mechanism. So in terms of governance, chameleon hashes subject to a validator majority make no difference at all.
If so, why bother with them? The answer is: performance optimization, because chameleon hashes allow old blocks to be substituted in a chain far more efficiently than before. Imagine that we need to remove a transaction from the start of a blockchain that has been running for 5 years. Perhaps this is due to the European Union’s right to be forgotten legislation, which allows individuals to have their personal data removed from companies’ records. Nodes can’t just wipe the offending transaction from their disks, because that would change the corresponding block’s hash and break a link in the chain. The next time the blockchain was scanned or shared, everything would fall apart.
To solve this problem without chameleon hashes, nodes would have to rewrite the early block without the problematic transaction, calculate the block’s new hash, then change the hash embedded in the next block to match. But this would also affect the next block’s own hash, which must be recalculated and updated in the subsequent block, and so on all the way along the chain. While this mechanism is possible in principle, it could take hours or days to complete in a blockchain with millions of blocks and transactions. Even worse, while engaged in this process, a node may be incapable of processing new incoming network activity. So chameleon hashes provide a far more computationally efficient way to achieve the same goal. If you imagine a bad transaction as a rock buried many miles underground, chameleon hashes can teleport the rock to the surface, instead of making us dig all the way down, retrieve the rock, and fill in the hole.
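To make the trapdoor idea concrete, here is a toy discrete-log chameleon hash. The parameters are deliberately tiny and insecure, and real schemes (such as Accenture's) use large groups and more careful constructions; the point is only to show that the holder of the secret key can swap a block's contents without changing its hash:

```python
# Toy discrete-log chameleon hash: H(m, r) = g^m * h^r mod p, where
# h = g^x and x is the trapdoor. Whoever knows x can find a new r' so
# that a different message m' hashes to the same value. Parameters are
# illustratively small, NOT secure.
p, q, g = 23, 11, 4          # q divides p-1; g generates an order-q subgroup
x = 7                         # trapdoor (secret key)
h = pow(g, x, p)              # public key

def ch_hash(m: int, r: int) -> int:
    return (pow(g, m % q, p) * pow(h, r % q, p)) % p

def forge(m: int, r: int, m_new: int) -> int:
    # With the trapdoor x, solve m + x*r = m_new + x*r' (mod q) for r',
    # so that ch_hash(m_new, r') == ch_hash(m, r).
    return (r + (m - m_new) * pow(x, -1, q)) % q

original = ch_hash(5, 9)      # "block" containing message 5
r_new = forge(5, 9, 2)        # replace message 5 with message 2
assert ch_hash(2, r_new) == original   # same hash, different contents
```

Without the trapdoor `x`, finding such a collision is as hard as computing discrete logarithms, which is why only the designated key holders (or a majority of them, under secret sharing) can perform the substitution.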
Immutability is nuanced
By reviewing the risks of proof-of-work blockchains and the technical value of chameleon hashes, I hope to have convinced you that blockchain immutability is far more nuanced than a “yes or no” question. To quote Simon Taylor quoting Ian Grigg, the question must always be “who are you and what do you want to achieve?”
For cryptocurrency believers who want to avoid government-issued money and the traditional banking system, it makes perfect sense to believe in a public proof-of-work blockchain, whose immutability rests on economics rather than trusted parties. Even if they must live with the possibility of a large government (or other wealthy actor) bringing down the network, they can take solace in the fact that this would be a painful and expensive operation. And no doubt they hope that cryptocurrencies will only get more secure, as their value and mining capacity continues to grow.
On the other hand, for enterprises and other institutions that want to safely share a database across organizational boundaries, proof-of-work immutability makes no sense at all. Not only is it astoundingly expensive, but it allows any sufficiently motivated participant to anonymously seize control of the chain and censor or reverse transactions. What these users need is immutability grounded in the good behavior of a majority of identified validator nodes, backed by contracts and..