Mastering Corda (2024)

Blockchain is evolving rapidly and is poised to dramatically change how we operate as a civilization and how societies transact, do business, and exchange value. Although the current hype may be dizzying, a no-nonsense, quiet storm is occurring within enterprises as they evaluate, test, and build solutions on blockchain, and we’ll see the fruits of this labor over the next few years. The blockchain cat is out of the bag, and there is no looking back.

Enterprises, whether private corporations, nonprofits, or governments, that have their ears tuned to this shift will potentially reap the benefits of new revenue streams and sustainable competitive advantage through significant cost reduction. Enterprises taking a wait-and-see approach may realize gains later but with higher risks or costs and reduced upside potential, or they’ll just be disrupted into oblivion. This technology is called blockchain, enterprise blockchain, or distributed ledger technology (DLT).¹

One of the leading platforms in this emerging market is Corda, which takes the best of public blockchains like Bitcoin and Ethereum, where anyone can transact, and retrofits them for enterprise requirements, where privacy and identity are critical.

Corda is a platform built on blockchain concepts—an interpretation of blockchain and how it can be applied to enterprise and business use cases, much like how MySQL and MongoDB are interpretations of database concepts and tenets. Corda is a platform for doing traditional business in new ways and new business in ways previously thought to be impossible.

In geek speak, Corda is a distributed, decentralized, permissioned, open source smart contract platform that does not have a native cryptocurrency, mining, or chaining of blocks, or any need for them. Transactions on the open source Corda platform are private instead of public, known only to transacting parties, and free of per-transaction fees. Corda mitigates double spend through a decentralizable service, known as a notary, that tracks whether a digital asset like a token or debt obligation has already been spent.
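To make the notary idea concrete, here is a minimal sketch in plain Java of a service that remembers which asset states have been consumed and rejects any transaction that tries to consume one again. It is an illustration only: the class name `NotarySketch`, the method `notarise`, and the use of strings as state references are assumptions made for this example and are not Corda's actual API.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified notary. Names are illustrative, not Corda's API. */
final class NotarySketch {
    // References to asset states that have already been consumed (spent).
    private final Set<String> consumed = new HashSet<>();

    /**
     * Accepts a transaction only if none of its input states was spent before.
     * Synchronized so that two competing spends cannot both succeed.
     */
    synchronized boolean notarise(List<String> inputStateRefs) {
        for (String ref : inputStateRefs) {
            if (consumed.contains(ref)) {
                return false; // double-spend attempt: reject and record nothing
            }
        }
        consumed.addAll(inputStateRefs);
        return true;
    }

    public static void main(String[] args) {
        NotarySketch notary = new NotarySketch();
        System.out.println(notary.notarise(List.of("token-1")));            // true: first spend
        System.out.println(notary.notarise(List.of("token-1", "token-2"))); // false: token-1 reused
        System.out.println(notary.notarise(List.of("token-2")));            // true: rejected tx left token-2 unspent
    }
}
```

Note that the check and the update happen in one step, so a rejected transaction leaves no trace. That all-or-nothing behavior is the property that matters for double-spend protection in this sketch.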

Corda borrows heavily from public blockchain technologies and movements like Bitcoin and Ethereum and reorients them for use in business-to-business applications with specific and nonnegotiable requirements, such as scalability and privacy. The purpose of this book is to provide you with the business and technical depth and breadth needed to stand up enterprise-grade use cases. You’ll learn how to build applications and systems that leverage distributed informational models and transaction consensus. You’ll learn not only about blockchain fundamentals, Corda internals, and how to build enterprise-grade applications on top of Corda, but also some of the best practices and patterns I’ve uncovered through engagement with countless enterprise executives and clients, architects, developers, and business users. You’ll learn about their use cases and, more importantly, the challenges of and solutions to working with those use cases.

If you’re a decision maker, either on the business or technology side, then the first few chapters should provide you deep insight into the Corda value proposition and an understanding of the Corda architecture and how it compares to other blockchains. If you’re a technologist, developer, or blockchain enthusiast, then a large portion of the book will help you understand the powerful features Corda offers you and how to code and build distributed applications, and you’ll become intimately familiar with the suite of tools and services Corda offers.

In this chapter, we’ll explore the business case for Corda, why you might want to consider it for your business, and which potentially new business models or revenue streams might be available to you. We’ll save as much technical jargon as possible for later chapters; the early ones build a strong foundation of blockchain concepts and Corda, and the later chapters progressively get more technical. The goal of this chapter is to understand why Corda is something you should think about and what it is or can do for you, and then finally how it can do those things (because if you’re not sold on the why, you probably don’t care about the how).

When thinking of blockchain, imagine a notebook that tracks and stores economic activity between individuals in tamper-proof form—a ledger that’s publicly accessible and owned by no one person exclusively. Anyone can read and write to this ledger, but a moat around it, enforced by the mathematics of cryptography, provides economic incentives and disincentives to writing in it. Because anyone is allowed to have a copy of the ledger and continuously synchronize updates and changes made to it with other copyholders, there’s no central authority. To write on a public blockchain ledger, you need to spend some amount of money through the expenditure of energy consumed by computing power, and anyone else holding a copy has the right to vote and agree that what you wrote is valid. Therefore, you’ll think twice about what and how much you’ll want to write. A blockchain is effectively an information storage system with economic moats around it that is open and accessible to the public.

You can think of a distributed ledger as a type, subset, or cousin of what is generally termed blockchain. Distributed ledgers leverage many of the concepts of public blockchains like Bitcoin and Ethereum but make trade-offs by giving up certain features, such as pure decentralization and unfettered public access, to gain other features enterprises require, like data privacy, legal recourse, transaction performance, and transaction rate scalability. Distributed ledgers allow multiple parties to have a consistent view of their transactions with one another on a need-to-know basis.

Blockchain solves the decades-old digital problem of double spend. Double spend is the notion that a digital anything, like a PDF, can be reused, copied, and pasted infinitely. As a result, as more copies become available or are perceived to be available, the value of anything digital plummets to zero. The value of a digital asset that cannot be protected from double spend will drop precipitously, especially if it’s actively shared. In Figure 1-1, the supply/demand curve for a PDF file slopes downward sharply. The more times a PDF is shared, the more rapidly its initial value of $10 drops to near zero. In some cases, a PDF can have a negative value, which can happen if you have more than one copy of the PDF on your hard drive. Any additional copy is taking up hard drive resources and thus has an economic cost or a negative value to you.

Blockchains have a specific mix of properties beyond being just a database of transactions. These properties can (but don’t always) include decentralization, double-spend protection, and programmability. (Chapter 4 does a deep dive into these properties.)

Bitcoin just happens to use a blockchain to store Bitcoin transaction data. If I send you Bitcoins, I have an address stored on the blockchain, and you have an address stored on the blockchain. The act of sending you Bitcoins records that transaction on the blockchain so that everyone can see it and know that the coins I’ve just sent you are in your possession. Many computers, called nodes, keep copies of the fact that we’ve transacted so that if any one computer goes offline, the record of our transaction will still exist. This results in decentralization, the idea that no single authority can control the books.

Ever since Ethereum ushered in the ability to deploy rich smart contracts, business logic, and code that resides and executes on a blockchain, new blockchains making all sorts of claims seem to be cropping up every month. As it stands today, the leaders in the enterprise blockchain space are starting to emerge, one of which is Corda, alongside Bitcoin, Ethereum, Quorum, and Hyperledger.

Corda is a platform on which two or more cooperating enterprises or domains (companies, departments, teams, etc.) can define and execute a consistent, agreed-upon set of semantics and business processes that can reside and run on top of a shared ledger system. This allows for seamless collaboration and more efficient consensus of business deals and transactions. Corda takes the best of the business process–management world and mixes the innovations of blockchain in with it without compromising security and privacy. A Corda network is the amalgamation of a group of cooperating organizational units, called participants or parties, their respective shared and private business models and logic, the Corda platform, and the services the platform provides.

Corda is middleware technology, like messaging, an application server, an enterprise service bus, or an object broker, but with advancements borrowed from Bitcoin, one of the first blockchain contributions to gain public and open source community adoption. Those advancements are now being retrofitted for the enterprise.

Although the Bitcoin and Ethereum blockchains seek to disintermediate by removing central authorities, like a central bank, Corda’s core disruption is not only that it encourages or promises disintermediation via decentralization, but also that it encourages business partners to mediate among themselves and think about consensus at the business level.

This is not to imply that business partners are currently in open conflict with one another; they wouldn’t be partners if they were, although they may be competitors. But there are inherent, deep differences in how two cooperating partners, say a mortgage bank and a title insurance company, do the very same business, and those differences create unnecessary costs. If the business partners can sit down and find economic incentives by discussing how to commonalize the shared business data models and processes that they are interdependent on, then this could result in a significant increase in trust between the partners, a reduction in costs, and potentially a radical change in how business is done.

While Corda is a shared platform, data or infrastructure is not necessarily shared. Organizations can expect complete data and transaction privacy, full ownership and control of the data, and all the protections available behind a firewall. Any participating organization can choose to walk away with their data at any time, maintaining full possession of it at all times. Corda is a framework that is entirely happy to live inside of corporate firewalls and mind its own business and communicate with other business partners through those firewalls.

Corda can transform how your business works, especially in the B2B space, and create value via new revenue streams and significant and material cost reductions. Any problem that can be solved by automating multilateral consensus or agreement can benefit from Corda—and this is a large number of business problems. We can categorize the business cases broadly into digital assets, reconciliation, and traceability.

Decentralized Finance and Digital Assets

For the purposes of this book, a digital asset is defined as any digital representation that benefits from double-spend mitigation. Bitcoin and Ether are effectively just tradeable digital assets that just so happen to be perceived as currencies. Corda creates opportunities for new types of rich and complex digital assets to be designed from scratch, and the platform provides the pen and paper to programmatically draw up and create new digital assets and then sell or trade them.

Designing and engineering assets that have specific financial or economic behavior can now be accomplished entirely digitally. But we can already trade electronically, can we not? Online stock trading is nearly ubiquitous. However, unlike the electronic trade of an equity, bond, or credit default swap, when trading a digital asset on a blockchain, the problem of double spend is taken care of, and a central exchange is not required.

What this means is that we move from a world where we can electronically record the trade of a credit default swap to a world where the credit default swap itself is transferred electronically. The ledger then acts as a settlement and clearing system where the trade of natively digital assets can move between balance sheets instantaneously. You can own digital assets just like you can own digital art or other collectibles. Mathew McDermott, global head of digital assets at Goldman Sachs, says it best: “In the next five to 10 years, you could see a financial system where all assets and liabilities are native to a blockchain, with all transactions natively happening on chain.” This new form of finance, where trading of financial instruments occurs peer to peer with few to no intermediaries, is referred to as DeFi or decentralized finance.

Tokenization

Tokens (covered in depth in Chapter 9) can be used as a conduit to trade and fractionalize, on a blockchain, assets that already exist in the real world, such as a house, car, credit default swap, or equity, and gain the benefits of mitigating double spend. The asset itself is not traded on the blockchain because its full digitized representation is not possible, owing to limitations of the industry or physicality, and so a proxy digital representation is traded instead. For example, because they’re physical assets, a shipping container or house cannot literally be brought into the digital world and so are instead represented by tokens. An equity stock is a set of rights and is more easily represented as a token. A stock certificate is a paper token of those rights.

At the most basic level, a token is a transferable digital pointer to an asset, whether tangible or intangible. In most use cases today, tokens are used to represent existing real-world assets, and trading those tokens is equivalent to trading the real-world asset, as shown in Figure 1-2. Of course, in such scenarios, the right legal framework is required to assure the buyer or seller of a token that the asset or title to it is in fact legally transferred and that the transfer is enforceable by law if need be.

Tokens represent several business opportunities, including the ability to raise capital (discussed in the next section), increase liquidity, represent sovereign currency (i.e., central bank digital currencies, or CBDCs), and tap potential new revenue sources from transaction fees, custodian services, and exchanges.

Tokens are a stepping-stone to a broader emerging opportunity of converting traditional assets like equities into native digital assets, as described in “Decentralized Finance and Digital Assets”. A native digital asset, depicted in Figure 1-3, is itself the asset, whereas a token represents an asset. A physical paper stock certificate could be redesigned to be a digital asset with all the properties of the stock certificate, like its serial number or par value, and the paper would no longer be required. A property title, which represents ownership in real property, could also be redesigned to be entirely digital. Trading of the digital deed to the title would represent ownership of the real estate the title represents.

Capital Raising

A common use of tokens has been to raise funds through crowdfunding campaigns via token issuances; a simple conceptual framework of this is shown in Figure 1-4. These were very popular from 2016 to 2018, with many of the campaigns skirting regulatory requirements set by regulators like the Securities and Exchange Commission. Tokens with no underlying control and redemption schemes, no clear legal recourse, and not necessarily representing any underlying asset were sold to raise capital for technology and blockchain projects. As a result, many fundraising campaigns became defunct, and many investors lost money.

As regulation entered into the token issuance markets in 2019, token issuances subsided momentarily only to begin to pick up again slightly in early 2020. The Jumpstart Our Business Startups (JOBS) Act allows capital raising through the public up to $1,070,000 for projects that meet specific requirements. This amount is expected to go up to $5 million in the coming years, creating new venues for entrepreneurs to raise capital through token issuances.

Traceability and Provenance

Many use cases, such as supply chains, art auctions, pharmaceuticals, waste management, real estate, and collectibles, require or benefit from a clear and accurate ownership history or asset chain of custody. For example, when buying a house, the history of title holders needs to be known so that no future claim against the purchased house may come as a surprise, or worse, result in forfeiture of ownership. In the case of supply chains, tracking a fruit or vegetable’s origin back to the specific farm that grew it could enable containment in the event of a public health risk. Corda is designed for these use cases, allowing for digital stamping at ports and transparency around an asset’s custody history.

Reconciliation Cost Reduction

Most organizations have IT systems that represent business information in formats that are specific to how that organization operates and that are usually a product of the organization’s industry, history (often going back decades), technology choices, and appetite for innovation. IT systems all leverage some form of datastore, and we can conceptually refer to the collection of datastores as a business’s information model. As businesses exchange information with other businesses, the need to standardize how information is passed is critical. This has led to all kinds of standard formats, like the Financial Information eXchange (FIX) protocol for trading, Mortgage Industry Standards Maintenance Organization (MISMO) standards for mortgages, broader initiatives set by groups like OASIS and W3C, and, if you go back decades, formats like Electronic Data Interchange (EDI) and Electronic Business XML (ebXML).

Although standardization has been useful in lowering costs and allowing enterprises to integrate faster, it has only meaningfully occurred at the boundaries between organizations. For example, as shown in Figure 1-5, two mortgage companies can exchange information using standard MISMO formats, but how data is represented internally within those organizations is very different. This creates operational inefficiencies because information has to be reinterpreted, processed, translated, etc., leaving enormous room for error and resulting in duplicated manual efforts, reconciliation efforts, and crosstalk between organizations. We can refer to this as the reconciliation problem, and it costs industry billions of dollars annually.

What Corda provides is a means for businesses to not only standardize their communication, but also to have a consistent, identical information model³ without having to give up control of the data, as shown in Figure 1-6.

Before Corda, this was typically the job of a SaaS vendor that provided software and employed a consistent information model across all its clients. The problem with the SaaS model was that data was always in the hands of the SaaS vendor. This loss of control does not occur with Corda. Corda adds another layer of standardization (as shown in Table 1-1) and uses cryptography to enforce those standards.

Table 1-1. Each additional layer adds consistency at higher levels of semantics
Layer	Reconciliation of	Solution	How
1	Communication packets	TCP/IP	Standardize how raw data packets can be transmitted, plus their sizes, sequence semantics, retries, timeout protocols, and management
2	Payload/message transmission	HTTP, AMQP	Standardize how messages can be carried and delivered between two or more endpoints
3	Payload/message description	XML, JSON, SQL DDL	Standardize how payload data can be described to enable interchange between parties
4	Domain message information models	FIX, XBRL, MISMO	Standardize domain-level information exchange
5	Organization-specific information models	Shared data models	Standardize shared data and business processes with cryptographic assurances without loss of data privacy and control

Reconciliation is fundamentally an information model synchronization problem that’s solvable through some consensus mechanism. To automate reconciliation and drive its cost down, there needs to be consensus around the semantics and taxonomies used and around transactions conducted using those semantics. Organizations conducting transactions or making arrangements with one another typically store and represent information related to those transactions and arrangements in different ways. Information models are siloed and not shared because internal processes modify those models, and external input adds little value. Even if a group agrees to a common information model, the next question will be: who will hold the data?

Organizations communicate and interoperate with other organizations to arrive at agreements and obligations and conduct transactions to fulfill those agreements and obligations. Mechanically, this can occur over the phone or electronically. These exchanges can be internal to an organization, like between teams, or between competing enterprises. As information flows between these organizations, each respective organization stores their understanding of the status of their relationship with another organization. The information is stored using a data model and set of semantics and taxonomy that is typically unique to an organization. This has a “siloing” effect where a relatively small group of people understand the unique business model, and other groups, although they may be in the same business, might not readily understand the internal lingo, formats, data structures, or semantics used to represent the same industry information.

Example: The housing market

The US housing market is a massive labyrinth made up of a large number of companies of varying sizes, from large warehouse lenders like JPMorgan and government-sponsored entities (GSEs) like Freddie Mac and Fannie Mae to mom-and-pop title insurance providers and mortgage servicers. This includes residential and commercial real estate, from single-family homes and condos to multifamily buildings and high-end developments.

Within the space, any given party, like a lender, servicer, or GSE, interacts with several partners providing liquidity, recapitalization, loans, brokerage, and insurance products and services to a whole host of buyers, investors, and owners. This creates an environment where large amounts of information need to be transmitted between parties to cover a real estate closing or loan commitment, and more often than not, these parties maintain their own siloed and proprietary sets of information, all representing their understanding of any given business transaction or deal.

Changes to the data maintained need to be circulated to the appropriate parties, which creates enormous opportunities for misinformation and errors to creep in. For example, for any given loan transaction, the date, the amount of the loan, closing and custody information, and the date of a derivative like a mortgage-backed security (MBS) and its closing are emailed around with screenshots and manually rekeyed elsewhere. Lenders and custodians can change mid-stream, resulting in many parties becoming out of sync, often unaware of critical changes, which delays closing of the deal. With warehouse lenders, different parties in the loan process separately generate bailee letters that contain identical information, and the effort to maintain the same information represents a high cost in terms of human resources, risks to a deal’s closing, and the time it takes for a deal to complete from its inception.

Example: Trade breaks

I can recall sitting in a small room known as the “break room,” embedded inside a larger office known as the “back office,” of an equities market-making and trading firm at 120 Broadway in New York City in the mid-90s, my first job out of college. (I stayed there for two years, and a few weeks after I left, it was raided by the SEC; the co-head market makers were a husband-and-wife team who then turned on each other!) The room was full of operations personnel picking up their phones, dialing numbers, talking to some counterpart in another break room in some other brokerage or trading firm, and then slamming the phone back down, only to pick it up and dial again while scribbling on some stack of papers. There were easily over a dozen of them singularly dedicated to resolving breaks, a situation where one party in the very same equity trade had a different understanding of the trade than the trade’s counterparty. In other words, a trade of 1,000 IBM shares occurred at $98.75 according to the buyer, but if you asked the seller, it occurred at $98.65 or was “DK” (i.e., they didn’t know the trade even occurred). In a trade mismatch or “break,” depicted in Figure 1-7, downstream systems are out of sync with one another across multiple parties. Surprisingly, this is a relatively common occurrence.

The break room’s responsibility is to figure out where the trade discrepancy arises from and why and how the parties couldn’t reconcile their trades. Breaks occur not because the traders are trying to eke out extra profits but because back-office operations, communications, and integrations with other parties (like clearing firms) are done sporadically and in bulk or by hand or email and cause delayed updates across the entire trading ecosystem.
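The kind of check a break room performs by phone can be sketched in a few lines of Java. The example below compares the buyer’s and seller’s records of the same trade ID and classifies the result as matched, a break, or a “DK.” The class name `TradeBreakSketch`, the record fields, and the three-way status are assumptions chosen for illustration; real post-trade systems compare many more attributes.

```java
import java.math.BigDecimal;
import java.util.Map;

/** Illustrative reconciliation of two parties' views of the same trade. */
final class TradeBreakSketch {
    /** One side's record of a trade; the fields are hypothetical. */
    record TradeRecord(String symbol, int quantity, BigDecimal price) {}

    enum Status { MATCHED, BREAK, DK }

    /** Compares what each side holds for a trade ID. */
    static Status reconcile(String tradeId,
                            Map<String, TradeRecord> buyerBook,
                            Map<String, TradeRecord> sellerBook) {
        TradeRecord buyerSide = buyerBook.get(tradeId);
        TradeRecord sellerSide = sellerBook.get(tradeId);
        if (buyerSide == null || sellerSide == null) {
            return Status.DK; // one side does not know the trade at all
        }
        boolean same = buyerSide.symbol().equals(sellerSide.symbol())
                && buyerSide.quantity() == sellerSide.quantity()
                // compareTo ignores scale, so 99.10 and 99.1 count as equal
                && buyerSide.price().compareTo(sellerSide.price()) == 0;
        return same ? Status.MATCHED : Status.BREAK;
    }

    public static void main(String[] args) {
        Map<String, TradeRecord> buyer = Map.of(
                "T1", new TradeRecord("IBM", 1000, new BigDecimal("98.75")),
                "T2", new TradeRecord("IBM", 500, new BigDecimal("101.00")),
                "T3", new TradeRecord("IBM", 200, new BigDecimal("99.10")));
        Map<String, TradeRecord> seller = Map.of(
                "T1", new TradeRecord("IBM", 1000, new BigDecimal("98.65")),
                "T3", new TradeRecord("IBM", 200, new BigDecimal("99.1")));

        System.out.println(reconcile("T1", buyer, seller)); // BREAK: prices differ
        System.out.println(reconcile("T2", buyer, seller)); // DK: seller has no record
        System.out.println(reconcile("T3", buyer, seller)); // MATCHED
    }
}
```

The sketch detects a break only after the fact, which is exactly the cost being described. The point of a shared ledger is that both parties sign one agreed record, so there are no separate books to compare later.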

This is one of the main reasons trades took three days (or T+3) to settle and confirm from the date of the trade. This T+3 settlement window allowed the break rooms, clearing houses, and brokerages to get their reconciliation done. The SEC originally established settlement time frames at T+5. Today, at T+2, surprisingly not much has changed and breaks still occur, but clearly there’s room for improvement in a world where equities trading is done almost entirely electronically.

Corporate politics, fiefdoms, and incumbents try to maintain their hold on the processing of a trade and are incentivized to keep settlement time frames the way they are, and conversations about how to reduce the time frames have been difficult. The only options a few years ago were to have a shared database, which no one is or was willing to do, or to build on top of open exchange standards like the FIX protocol. The impact and results from the latter were limited—just because two parties could speak the same language didn’t mean they understood things in the same way.

Reconciliation Revenue Streams

Reconciliation is not always just about cost reduction and operational efficiencies. New ways of reconciling business processes can produce new business models and revenue streams. An example of such an approach involves real-time consensus reconciliation of pricing, which provides a valuable service to market participants. Let’s see how.

Example: Blind pricing

Often when an exotic financial instrument needs to be priced, the holder of the instrument is reluctant to publish a price lest they undersell themselves. Instead, the holder may be willing to submit a confidential, or blind, indication to a market center that can then run it through a model and provide pricing transparency to all holders of the same instrument, resulting in a market price.

This is an example of parties who are looking for consensus but are not willing to disclose information to help arrive at that consensus unless they have assurances that information is not disclosed to any other party. A blind pricing system would be an ideal use case for a peer-to-peer architecture like the one presented in Figure 1-8, where no one central party controls the pricing information, but a price emerges between indicating parties and is processed through agreed-upon valuation models.
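The core of blind pricing, keeping each indication confidential while publishing only an aggregate, can be sketched as follows. This is a deliberately simplified, single-process stand-in: it models the market-center variant rather than a true peer-to-peer network, it uses the median as a placeholder for an agreed-upon valuation model, and the names `BlindPricingSketch`, `submit`, and `consensusPrice` are hypothetical.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Illustrative blind pricing: individual indications are never exposed. */
final class BlindPricingSketch {
    // Confidential indications keyed by party; there is no accessor for this map.
    private final Map<String, BigDecimal> indications = new HashMap<>();
    private final int minimumSubmitters;

    BlindPricingSketch(int minimumSubmitters) {
        if (minimumSubmitters < 1) {
            throw new IllegalArgumentException("need at least one submitter");
        }
        this.minimumSubmitters = minimumSubmitters;
    }

    /** Records (or replaces) a party's confidential indication. */
    void submit(String party, BigDecimal indication) {
        indications.put(party, indication);
    }

    /** Publishes the median only once enough parties have submitted. */
    Optional<BigDecimal> consensusPrice() {
        if (indications.size() < minimumSubmitters) {
            return Optional.empty(); // too few submissions to publish anything
        }
        List<BigDecimal> sorted = new ArrayList<>(indications.values());
        Collections.sort(sorted);
        int mid = sorted.size() / 2;
        if (sorted.size() % 2 == 1) {
            return Optional.of(sorted.get(mid));
        }
        BigDecimal sum = sorted.get(mid - 1).add(sorted.get(mid));
        return Optional.of(sum.divide(BigDecimal.valueOf(2))); // halving is always exact
    }

    public static void main(String[] args) {
        BlindPricingSketch center = new BlindPricingSketch(3);
        center.submit("FundA", new BigDecimal("101.50"));
        center.submit("FundB", new BigDecimal("99.00"));
        System.out.println(center.consensusPrice().isPresent()); // false: only two submitters
        center.submit("FundC", new BigDecimal("102.00"));
        System.out.println(center.consensusPrice().get());       // 101.50
    }
}
```

The minimum-submitter threshold is one simple way to avoid publishing a price that could be attributed to a single holder. A production design would need stronger guarantees than this sketch offers.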

Better AI

Artificial intelligence and machine learning (ML) algorithms are heavily, if not entirely, dependent on data. The prediction accuracy of a model is tied directly to the quality of the training data used to produce models. One of the hardest challenges⁴ of building accurate ML models is obtaining usable, clean, and complete data. Data is often muddied with errors that are byproducts of manual errors or redundant processes, or it’s missing patches of information that need to be interpolated. Even if data is clean and accurate, because the data is only a partial view of a single business instead of an entire consortium of businesses, models can be overfitted and produce predictions that are highly accurate only within the scope of the training data, resulting in poor predictive quality when the model is applied to a broader scope.

Corda can contribute to significant improvement in the quality of data that organizations store and use. It requires cryptographically enforced data models between participants of a Corda network. This enforcement of data structure rigidity leads to a higher quality of data, and better data leads directly to better AI.

This is by no means a guarantee that data quality will always be better; blockchains are not immune to GIGO (garbage in, garbage out). However, cryptographically enforced data models can significantly limit how much stray and bad data enters the blockchain, creating opportunities for higher-quality data and higher returns on an AI investment and ultimately a sustainable competitive advantage.

Enterprises looking to capture the value proposition of blockchain may have some concerns about how public versions of the blockchain currently⁵ operate and how B2B transactions would occur. Many businesses would instead prefer to trade away some of the benefits of public blockchain in order to meet specific requirements, resulting in some differences, as shown in Table 1-2. Some of these requirements include data privacy; security; knowledge and awareness of counterparties in a transaction, or KYC (know your customer); transaction finality (covered in Chapter 4); performance; availability; and scalability. In addition, enterprises often require support from vendors that are legal entities with service-level agreements in place and potential recourse for breaches.

Table 1-2. Enterprise blockchain trade-offs
Feature	Public blockchain	Corda
Information model definition	Open source community	Organization collaborators
Data visibility	Public	Need-to-know
Data privacy	Unencrypted	Encrypted
Identity privacy	Pseudonymous	Known, confidential
Participation	Permissionless	Permissioned
Consensus	Software and algorithms	Counterparty agreement and notary

Privacy

Most businesses are reluctant to broadcast all of their business dealings on a public ledger and want, for a whole host of reasons, to keep knowledge of transactions on a need-to-know basis—typically only to the organizations involved in the transaction or to regulators. On Bitcoin and Ethereum, transactions occur in the open, and although there is some degree of pseudonymity, it is insufficient for any business to rely on as a means of complete privacy. Transactions on Corda are private by design, revealed only to the participants of a transaction by mutual agreement.
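A minimal way to picture need-to-know distribution is to give every party its own private store and deliver a transaction only to the stores of its participants, so that no global, publicly readable ledger exists. The Java sketch below does just that; `NeedToKnowSketch`, `record`, and `vaultOf` are hypothetical names for this illustration and do not correspond to Corda’s API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative need-to-know distribution of transactions. */
final class NeedToKnowSketch {
    // One private store per party; there is no shared global ledger.
    private final Map<String, List<String>> vaults = new HashMap<>();

    /** Delivers the transaction only to the parties that took part in it. */
    void record(String txId, Set<String> participants) {
        for (String party : participants) {
            vaults.computeIfAbsent(party, k -> new ArrayList<>()).add(txId);
        }
    }

    /** Returns a read-only copy of what a given party can see. */
    List<String> vaultOf(String party) {
        return List.copyOf(vaults.getOrDefault(party, List.of()));
    }

    public static void main(String[] args) {
        NeedToKnowSketch network = new NeedToKnowSketch();
        network.record("tx-1", Set.of("BankA", "BankB"));
        network.record("tx-2", Set.of("BankB", "InsurerC"));

        System.out.println(network.vaultOf("BankA"));    // [tx-1]
        System.out.println(network.vaultOf("BankB"));    // [tx-1, tx-2]
        System.out.println(network.vaultOf("Outsider")); // []
    }
}
```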

Know Your Counterparty

Blockchains like Bitcoin and Ethereum are pseudonymous systems, where a participant is effectively identified by a random number. This allows for a participant to conceal their identity, although there are means available to law enforcement to circumvent some pseudonymity. In addition, anyone can gather and analyze metadata about a pseudonymous participant. For businesses that want to conduct transactions, knowing who they’re transacting with is a business and, in the case of banks and financial institutions, regulatory requirement.

Permissioning

Because knowledge of a counterparty in a transaction is necessary, the network model Corda operates under is called permissioned. This means the operator of the network can allow or disallow participation in a network, resulting in a private network. The public blockchain universe implies a permissionless environment, and because blockchains suited for enterprises run contrary to that assumption, they have come to be known as permissioned blockchains. The default in the enterprise is permissioned, and the default in the public blockchain world is permissionless.
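As a rough illustration of what “permissioned” means mechanically, the sketch below models a network operator that admits and revokes members by legal name and lets a transaction proceed only when every party involved is an admitted member. The class `PermissionedNetworkSketch` and its methods are assumptions for this example; an actual Corda network relies on certificates issued to identified participants rather than a simple in-memory set.

```java
import java.util.HashSet;
import java.util.Set;

/** Illustrative membership gate for a permissioned network. */
final class PermissionedNetworkSketch {
    // Legal names the network operator has admitted.
    private final Set<String> admitted = new HashSet<>();

    void admit(String legalName) {
        admitted.add(legalName);
    }

    void revoke(String legalName) {
        admitted.remove(legalName);
    }

    /** True only if at least one party is named and all of them are members. */
    boolean canTransact(String... parties) {
        if (parties.length == 0) {
            return false;
        }
        for (String party : parties) {
            if (!admitted.contains(party)) {
                return false; // unknown counterparty: not allowed on this network
            }
        }
        return true;
    }

    public static void main(String[] args) {
        PermissionedNetworkSketch network = new PermissionedNetworkSketch();
        network.admit("MortgageBank");
        network.admit("TitleInsurer");

        System.out.println(network.canTransact("MortgageBank", "TitleInsurer")); // true
        System.out.println(network.canTransact("MortgageBank", "Anonymous"));    // false
        network.revoke("TitleInsurer");
        System.out.println(network.canTransact("MortgageBank", "TitleInsurer")); // false
    }
}
```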

Scalability and Performance

Transactions on Bitcoin and Ethereum are relatively slow, especially compared to payment rails like Visa,⁶ which have the capacity to process tens of thousands of transactions per second. Bitcoin peaks at 12–15 transactions per second and Ethereum at about 15–20 transactions per second. Although transaction rates are improving (for example, via the Lightning Network for Bitcoin), Corda can process several thousand transactions per second, and that rate continues to improve.

Integration and Developer Adoption

Integration is the backbone of enterprise technology. Any new piece of technology that is brought into an enterprise is evaluated for the cost of integrating it into the enterprise’s existing infrastructure. Integration into identity systems, database farms, firewalls, and messaging layers needs to be understood up front. For software that uses obscure technologies or technologies with low adoption rates, integration can be expensive, and consulting fees add to the cost.

The Corda framework is based on the Java Virtual Machine (JVM), battle-hardened technology that has been in the enterprise for more than 20 years. Languages like Java, Kotlin, and Scala can be used to build applications on Corda. As of this writing, JVM languages, especially Java, have a very active and large developer and experience base within enterprises. Although the Corda framework is written in Kotlin, the framework and any application written in Java are bytecode compatible and will run as if everything were written in one language. Compared to other platforms like Bitcoin, Ethereum, and Hyperledger Fabric,⁷ Corda does not require a serious retooling of skill sets. This is important because it means organizations can begin building applications more quickly using existing talent and can focus their retooling on understanding the DLT platform rather than on learning new and potentially complex languages.

If you have a distributed computing background and have worked with technologies like J2EE, COM, DCOM, CORBA, DCE-RPC, or messaging abstractions like pub/sub and queues, then much of how the underlying Corda framework operates should be at least somewhat familiar. What DLTs add on top of these common enterprise components is more use of cryptography to provide certain guarantees related to transactions, signing, and immutability. The engineering team behind Corda at R3 is cut not only from the cloth of blockchain but also from enterprise systems where integration is important, and thus they used a technology stack most familiar to enterprise developers and managers.

Blockchain technology is an unintended consequence of the Bitcoin digital currency invented in 2008 by an anonymous individual or entity known only as Satoshi Nakamoto. Very little is known about Satoshi, but they provided two key groundbreaking contributions.

The first is the Bitcoin whitepaper that describes both a public, decentralized, tamper-proof digital currency, known as Bitcoin or BTC, and a ledger system to openly record exchanges of the digital currency that can operate entirely without a central clearing authority, like a bank or PayPal. Bitcoin was born out of the 2008 financial crisis (at the time, I was working as a vice president at Lehman Brothers in the mortgage origination and securitization area), and it represented a worldview in which monetary policy would be placed in the hands of the public rather than a central authority.

Satoshi’s second contribution is the implementation of the concepts described in the paper as working and usable open source software. The currency and the enabling software (or associated software) are both referred to as Bitcoin. The software materialized Bitcoins and made them digitally real, enabling Bitcoins to be minted and traded while also solving the key problem of double spend that prevented prior attempts at creating digital currencies from gaining traction.

The ledger system described by Satoshi is what we call blockchain, although the term does not appear in Satoshi’s paper. Since the paper’s publication, Bitcoin’s blockchain concepts have been extracted, isolated, and repurposed by other projects, engineers, entrepreneurs, and visionaries.

Today, blockchain technology has by and large divorced itself from its Bitcoin origins and runs a life of its own, spawning new variations and flavors of itself as more and more innovative people and resources are applied to advance it further, taking it from a single-purpose currency system into broader use cases.

Blockchain essentially brings together for the first time in history a set of properties that were available independently of one another in other software, such as public accessibility (like the web), decentralization (like torrents or the Gnutella music sharing service), immutability or tamper-proof data storage (like how hash checksums are used in file downloads), and consensus mechanisms. We’ll cover these properties in greater detail in the coming chapters.

Until Bitcoin came along, these properties were never aggregated into a single, cohesive, working piece of software machinery, and Satoshi’s genius and key innovation was to seamlessly orchestrate all of them to deliver powerful new value propositions that the world had not seen before. As will ideally become apparent throughout this and the next three chapters, the result of Satoshi’s innovation is new ways of conducting transactions (business or otherwise) that are so disruptive that the idea has extended far beyond the original intent of Bitcoin, which was to replace fiat currency and central monetary authorities, and has become disruptive on its own merits.

Vitalik Buterin and Gavin Wood released the Ethereum blockchain in 2015. It not only introduced another digital currency called Ether, but it also extended blockchain technology to have bespoke, customizable behavior and data structures, addressing the need for applicability to broader use cases and programmability beyond just transactional exchange of a cryptocurrency. This programmability, known as smart contracts,⁸ created broader awareness of the potential for blockchain.

Almost a year after the launch of Ethereum, a company called R3 released Corda, which, much like Ethereum, was designed for broader use cases. Corda was a member of a new breed of blockchains known as enterprise blockchains—blockchains that met the requirements of businesses. Because Corda and other enterprise blockchains were pared down and moved away from some of the properties espoused in the Bitcoin blockchain, blockchain purists deemed Corda not truly a blockchain.⁹

Corda initially focused on the specific needs of bank and financial use cases and on creating a frictionless environment with lower impedance and overhead in transactions (refer back to “Reconciliation Cost Reduction”). However, in recent years, Corda has broadened its applicability and is able to support a diverse set of use cases, such as supply chains, insurance, capital markets, and real estate. Corda did away with some of the properties traditionally associated with blockchain (for example, public transactions) in order to make the trade-offs required by its target audience: enterprises that value sharing and harmony (see Figure 1-9) but also expect transaction privacy and scalability, neither of which Bitcoin or Ethereum currently offers.

R3 was founded in 2014 by the namesake trio of David Rutter, Todd McDonald, and Jesse Edwards (see Figure 1-10). R3’s eventual intent was to build a blockchain platform that solved real business problems and to identify problems first instead of creating solutions in search of problems. R3 initially focused on the banking and financial world, but today Corda is involved in many use cases, including real estate, aeronautics, supply chains, and more. The “R” in R3 comes from David’s last name, and 3 signifies the three partners that David brought together. Todd and David were active in the capital markets space: David ran an FX exchange, and Todd was an FX trading client of David’s. Early notions of what would eventually become Corda began to emerge in 2013 when Todd, struck by Bitcoin’s chart patterns, began exploring and trading Bitcoin.

In 2016, Richard Brown, the CTO of R3, and Mike Hearn, developer of the Java implementation of Bitcoin, met at an event. Mike’s journey with Bitcoin started in 2009 when he had conversations directly with Satoshi and they traded Bitcoins with each other over the Bitcoin network, which was hardly used by anyone else at the time. From 2011 through 2015, Mike worked on Bitcoin but was eventually discouraged¹⁰ by the internal discord among Bitcoin developers and looked for a different project to engage in. Mike reached out to Richard in 2015, and they began brainstorming the idea of a permissioned system in late 2015, around the time R3 raised $120 million from a consortium of banks and financial institutions like Barclays, Bank of America, Goldman Sachs, JPMorgan, UBS, BNY Mellon, TD, HSBC, Morgan Stanley, Societe Generale, and almost 100 others. Mike joined R3 full time in 2016, releasing open source Corda that year and the generally available Corda version 1.0 in 2017. Mike also helped turn R3 into a software firm, and the rest is history.

For R3, the early years were heavy on research by the leadership team. R3’s full legal name, R3 CEV, stood for Consulting, Exchanges, and Ventures, and the founders did just that by meeting with banks, investors, and family offices (see Figure 1-11) to identify what the market needed.

This was a stark contrast to the Bitcoin approach, which established the idea of a new world order at the outset. The pragmatic, grounded approach garnered interest from banks, and the leaders of R3 transformed it into one of the leading blockchain companies in the world.

Today, we can see Corda in many use cases and industries, including insurance, central banking, payments, equities and fixed income, mortgage, real estate, trade finance, supply chains, foreign exchange, aerospace, and healthcare.

The immutable nature of blockchain and its relatively lower transaction throughput, compared to traditional databases at least, require us to consider what types of transactions should or should not go on the ledger. For example, in the case of a purchase order (PO), is storing just the order’s header information sufficient, or should all of the items in the PO be included on the ledger? Is it possible to store the items off the ledger, or off-ledger, and a hash of the items on the ledger instead? These are design considerations that we must work through. Because transactions require consensus between multiple parties and updates to multiple databases simultaneously, transaction throughput is lower than that of a raw database.

The richer a data structure is on the ledger, the higher the probability that it will be subject to transaction updates. Updates to multiple items on a PO may require multiple row updates in a ledger where those updates would be subject to the lower blockchain transaction throughputs. Alternatively, if items are stored off-ledger, in another high-performance SQL database, and a large number of items are updated, then all that needs to be updated on the ledger may be just a hash identifying the PO and its items.
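The off-ledger option can be illustrated with a minimal Java sketch. It is not Corda code: the class name `PurchaseOrderAnchor`, the method `hashItems`, and the comma-separated item format are all hypothetical, chosen only to show how a single SHA-256 fingerprint could stand in on the ledger for PO items that live in another database. It assumes the parties agree on item order and encoding, since any difference in either changes the hash.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; none of these names come from Corda's API.
class PurchaseOrderAnchor {

    // Computes a SHA-256 fingerprint (hex string) over PO items held off-ledger.
    static String hashItems(List<String> items) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            for (String item : items) {
                byte[] bytes = item.getBytes(StandardCharsets.UTF_8);
                // A length prefix keeps item boundaries unambiguous,
                // so ["ab", "c"] and ["a", "bc"] hash differently.
                digest.update(ByteBuffer.allocate(4).putInt(bytes.length).array());
                digest.update(bytes);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }

    public static void main(String[] args) {
        // Assumed item format: "name,quantity,unitPrice".
        List<String> items = new ArrayList<>(List.of("widget,10,4.50", "gasket,200,0.12"));

        // The only value the ledger would need to carry for these items.
        String onLedgerHash = hashItems(items);

        // Any party holding the off-ledger items can recompute and compare.
        System.out.println("matches: " + onLedgerHash.equals(hashItems(items)));

        // An off-ledger edit changes the fingerprint, so it cannot go unnoticed.
        items.set(1, "gasket,250,0.12");
        System.out.println("matches after edit: " + onLedgerHash.equals(hashItems(items)));
    }
}
```

Under this design, a batch of item changes costs one ledger update (the new hash) rather than one per item, at the price of having to keep the off-ledger store available to every party that needs to verify it.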

The richer a data structure is on the ledger, the higher the probability that it will be subject to a schema change. Schema changes are a fact of life, and an enterprise can go through a long process to modify the data type of a column in a production table; how tedious the process is depends on the number of applications that depend on that column. Schema changes on the ledger can take an order of magnitude more orchestration because immutable ledger schemas cannot be modified retroactively.

Corda is not meant to be a full replacement of a traditional database—it’s not a dumping ground for anything and everything an enterprise may choose to store. You have to carefully consider which data should be on-ledger, as opposed to off-ledger (in a different database, but linked).

With a DLT like Corda, you’ll want to store only relevant business essentials on-ledger—that is, what’s really required to have business parties arrive at consensus and the relevant data items that specifically matter to that consensus.

In fact, any organization or enterprise that’s looking to adopt a blockchain, regardless of what platform it uses, needs to develop a heuristic or set of best practices that will help guide what should or should not be stored on the ledger. It’s critical to make sure there’s as little on the DLT as possible; reducing what’s stored on the DLT to the absolute essentials will stave off creating a load on transaction throughput and reduce maintenance costs if schemas on the ledger change. It might make sense to establish a minimal set of shared ledger data fields and then only add to it if there’s clear justification for those fields. Any data you want to track in a DLT should pass the following tests:

The data should have meaningful business impact and relevance across multiple partners.
The data should be directly relevant to and contain the necessary knowledge parties need in order to arrive at consensus.
The data should be the most succinct method to track whatever it’s trying to represent.

Enterprises I’ve come in contact with that have pressed forward in their adoptions of enterprise blockchains have typically had a mix of failures and successes.¹¹ In most cases, these enterprises are running a number of proofs-of-concept, exploring a few blockchain options, and then deciding which technologies fit which use cases best. The proof-of-concept often helped internal teams separate the hype from the reality and increase competency. In cases where it was clear the concept had traction, enterprises continued building and have taken Corda applications into production. Banks have taken up Corda; for example, HSBC has put more than $10 billion of private placement deals on Corda, and the NASDAQ has adopted Corda for digital assets. There’s no doubt real adoption is occurring, and the startup scene, a sample of which is shown in Table 1-3, is robust.

Table 1-3. The Corda startup landscape
Name	URL	Use case/industry	Stage
Reilize	reilize.com	Real estate wholesaling contract trading	MVP
Disburso	disburso.com	Large payment disbursement and management	MVP
CordaIQ	cordaiq.com	Corda node management	MVP
Aerotrax technologies	aerotrax.org	Tracing history of aircraft parts and components	MVP
AFOX	afoxexchange.com	Media Investment	MVP
Agora	agoradcm.com	Smart contracts to create smartbonds	MVP
Amedici Ltd	amedici.io	Capital Markets Engagement	POC
Archax	archax.com	Digital Securities Exchange	N/A
Banyan Infrastructure	Banyaninfrastructure.com	Loan operation for infrastructure projects.	Commercial Pilot
Birth Venue	birthvenue.in	Blockchain Development	POC
BlockSpaces	blockspaces.io	Development company, platform as a service	Live
BlueBox music	bluebox.info/	Empowering creators to monetize, copyright & license	MVP
Bond180	bond180.com	Data-driven matching service	Raised pre-seed
Cephas research	cephasreseach.com	Enterprise blockchain solutions	MVP
Chainstack	chainstack.com	Managed blockchain service	Live
Coadjute	coadjute.com/	Making home buying simpler, faster & cheaper	MVP
Cognitive view	cognitiveview.com	Analyzes customer communication to identify	Commercial Pilot
Consenso Labs	consensolabs.com	Blockchain research labs	MVP
Copyright Delta	copyrightdelta.com	Digital rights management	POC
Corditize	corditize.com	Architecture & Digital asset	POC
CoreChain	corechainb2b.com	Payments and financing	MVP
Custom Blockchain Solution	customblock.com	Industry agnostic blockchain	N/A
DASL	lab577.io/dasl/	Digital Assets Network	Live
Deriveum	deriveum.com	Restore trust in CDS risk sharing	MVP
Digipharm Ltd	digipharm.io/	Digital transformation to outcome-healthcare	Commercial Pilot
Dilichain	Dilichain.com	Empowering financial institutions	MVP
Elandbridge limited	elandbridge.com	Apply technology to create seamless borders	POC
Emali.IO Limited	emali.io	Health & Insurance	MVP
Fardoe software Ltd	fardoesoftw.com	Providing an intelligent ecosystem product	N/A
Fiducia	fiducia.eco	Digital advertising/marketing	N/A
Finteum	finteum.com	Global market for bank treasures	MVP
Flexvpc	flexvpc.com	Software blockchain lab	Live
Fragmos chain	fragmos.com	Blockchain development	MVP
Fund AdminChain	fundadminchain.com	Fund Operations	N/A
Gavea	gavea.com/e	Digital blockchain-Commodities exchange	Commercial Pilot
IDWorks	idworks.io	Digital identity	MVP
Instate	instate.io	Innovative financing network platform	MVP
Internet think tank, Inc	inttk.com	Develops network technology	MVP
Ivno Limited	ivno.io	Digital Assets	Commercial Pilot
Ledgertech Ag	legertech.com	Insurance	MVP
Loanxchain	loanxchain.com	Institutional investors	Commercial Pilot
Loppex	loppex.com	Illiquid Assets	Raising seed capital
Neb Tech OU	Envoychain.io	Trade Finance	MVP
OCYAN	ocyan-sa.com	Expat Services	Production
Procredex	procredex.com	Healthcare organizations	MVP
qiibee	qibee.com	Loyalty on the blockchain	Ideation/design
reThought insurance corporation	rethoughins.com	Protects valuable assets and reinsurers	Live
Schrocken Inc	schrocken.com	Digital intelligent supply chain for drug & device	Commercial Pilot
Things Protocol	thingsprotocol.com	B2B Payments	POC
Trade cloud commodities	tradecloud.sg/	Commodities	MVP
Trames Pte Ltd	Thames.sg	New age of collaboration across supply chains	Live
Trusterras Inc	trusterras.com	Provenance	POC
Trustlayer, Inc	trustlayer.io/	Risk management solution	POC
Umazi	umazi.io	Tokenizing corporate credentials	POC
Valk	valktech.io	Effortless management, investment and trading of unlisted assets	Live
Vyoma software, Inc	vyomasoft.com	Develops Healthcare Software technology	POC
YDK Technologies Ldt	wwwyourdatakey.com	Digital identity	MVP
Zeeve deeptech pvt	zeeve.io	Deployments, analytics and monitoring blockchain	MVP

The Critical Mass Challenge

Corda can only work in an ecosystem where two or more participants of a business transaction are using it. This creates an adoption challenge. Usually one party, say a mortgage company, can decide which platform it wants to use to conduct transactions, potentially settling on a Java stack with a MySQL database. Any counterparty it conducts business with is free to use any other platform as long as the communication protocol and semantics between the two parties are agreed upon. A mortgage company can send an invoice to another organization that receives and processes it using a different set of technologies. Both organizations can independently select and evolve their respective platforms, which liberates both sides from any interdependencies.

In the case of Corda, true value can only be realized if multiple parties, via either a mutually established consortium or some collaboration scheme, have adopted it, because a Corda system will only communicate with another Corda system in the context of a business transaction. This means any undertaking to adopt Corda requires multilateral coordination of one or more business partners. The up-front work of organizing and getting multiple business partners, who are often also competitors, to standardize on a platform can be a significant uphill battle.

Alternatives to Corda

Corda is not the only entrant in the enterprise blockchain space. A number of startups and incumbents have started to build blockchain platforms that target enterprise use cases. The largest competitor is a suite of blockchain framework tools under the Hyperledger community, which is hosted by the Linux Foundation. It has the backing of IBM and is entirely open source. Fabric and Sawtooth are Hyperledger’s blockchain frameworks. JPMorgan launched its own enterprise blockchain called Quorum, which was recently purchased by blockchain company ConsenSys. Quorum is a derivative of the Ethereum blockchain with modifications that enable private transactions and zero transaction fees. Not to be outdone, the Ethereum Alliance released an Enterprise Ethereum fork, making the same trade-offs Corda makes natively. In addition, public blockchains like Algorand (based in Boston) offer public and private versions of their blockchains.

Portability

Applications built on any blockchain platform are not portable to other blockchains. A decentralized application built for Ethereum will not run natively on Corda, and vice versa. Similarly, chaincode built on Hyperledger Fabric, even if written in Java, will not port unaided to Corda. Although Ethereum is entirely open source, transactions have a cost associated with them (known as gas). Corda’s Enterprise edition, the version typically used in production environments, has a license fee. Is it possible to build a portable, decentralized application that reduces vendor lock-in? This is a concern for executives, but it is by no means a new problem. The database world is an excellent example. Functions or stored procedures written for, say, the Oracle database will not run on any other database system, creating significant switching costs. Despite standardized technologies like SQL, vendors will continue to add specialized features that are difficult to abstract. In the blockchain space, one company, Digital Asset, is attempting to allow us to build portable decentralized apps with a technology called Digital Asset Modeling Language (DAML). A brief discussion of DAML can be found in Appendix D.

R3 offers an enterprise edition¹² of Corda: a closed source fork of Corda with additional enterprise-grade features, including significantly increased performance. Much of the performance can be attributed to changes to Corda’s core; it’s like retrofitting the open source Corda’s single-threaded engine with a multithreaded one. The enterprise edition also offers more tools and capabilities related to availability and managing connections through firewalls, hardware security modules (HSMs), network management tools, such as Corda Enterprise Network Manager (CENM), and high-availability deployment options, such as notary database clustering through Percona XtraDB Cluster software. Corda Enterprise is not free and requires a license from R3, and, as such, it is not covered in this book. For developers, this should not be a concern, as any application built on the open source edition of Corda is completely portable to the enterprise edition.

R3 provides the Corda Business Networks Toolkit, a set of Corda applications that solve some common business problems. This includes a Business Networks Membership Service (slated to be included in Corda by late 2020), used to manage one or more Corda networks, as well as ledger sync, CorDapp distribution, and billing and metering services. Details can be found on the Corda Solutions GitHub page.

The business cases for Corda are compelling and paint a picture of a new digital future, much like the internet did in the late 1990s. Digital assets allow us to create new tradeable, valuable, and fractionalizable financial or collectible products. Businesses can reduce the friction and reconciliation costs they face in trying to translate the same business terms to one another, and industries of all types are finding ways to use Corda to open new business opportunities.

¹ Although there are differences between blockchain, enterprise blockchain, and DLTs, we will regard the three as generally synonymous throughout this book.

² See an equivalent claim by an HSBC paper entitled, “The 10x Potential of Tokenisation” at MasteringCorda.com.

³ As Austin Moothart of R3 puts it, “What you see is what I see.”

⁴ In speaking to hundreds of students that take my machine learning course, the most common pain point has been the inordinate amount of time spent cleaning and organizing data.

⁵ I say “currently” because the two worlds of public and private blockchains are converging.

⁶ “This year’s stress test showed that VisaNet could process more than 65,000 transaction messages per second.” —Visa’s 2018 Annual Report

⁷ Hyperledger Fabric now supports Java.

⁸ The term smart contract has been around for decades.

⁹ See Andreas Antonopoulos’s five pillars of blockchain.

¹⁰ For those interested in the background story, as per my conversation with Mike, Mike states: “2015: The so-called block size ‘debate’ turns into more of a civil war. It becomes clear that a small number of devs have gone rogue and started blocking, loudly shouting down, and obfuscating over a scheduled increase to Bitcoin’s capacity. This takes more and more time away from the upgrades I intended to work on. It turns into a giant civil war, which eventually the small blockers win using totalitarian tactics; e.g., anyone who disagrees with them is DoS’d off the internet and/or banned from all community forums.”

¹¹ A large number of blockchain projects (decentralized apps, or “Dapps”) have been failures. Much of this can be attributed to a rush to rebuild and reimagine a new world too quickly. However, the rate of successful projects is increasing.

¹² To obtain the link to download the evaluation version of the enterprise edition, go to http://masteringcorda.com.


Figure 1-1. The value of any arbitrary PDF file containing desired information and its value as the quantity available increases.

Decentralized Finance and Digital Assets

Tokenization

Figure 1-2. A token is a digital representation of a real-world asset.

Figure 1-3. A digital asset or native digital token is the asset itself.

Capital Raising

Figure 1-4. Token issuance and investment via a blockchain.

Traceability and Provenance

Reconciliation Cost Reduction

Figure 1-5. Standardization at the edges and inconsistent data models internally.

Figure 1-6. Consistent informational model across businesses and their internal systems.