Generating OpenAPI or Swagger From Code is an Anti-Pattern, and Here’s Why

(This article was originally posted on Medium.)

I’ve been using Swagger/OpenAPI for a few years now, and RAML before that. I’m a big fan of these “API documentation” tools because they provide a number of additional benefits beyond simply being able to generate nice-looking documentation for customers and keep client-side and server-side development teams on the same page. However, many projects fail to fully realize the potential of OpenAPI because they approach it the way they approach Javadoc or JSDoc: they add it to their code, instead of using it as an API design tool.

Here are six reasons why generating OpenAPI specifications from code is a bad idea.

You wind up with a poorer API design when you fail to design your API.

You do actually design your API, right? It seems pretty obvious, but in order to produce a high-quality API, you need to put in some up-front design work before you start writing code. If you don’t know what data objects your application will need or how you do and don’t want to allow API consumers to manipulate those objects, you can’t produce a quality API design.

OpenAPI gives you a lightweight, easy-to-understand way to describe what those objects are at a high level and how they relate to one another, without getting bogged down in the details of how they’ll be represented in a database. Separating your API object definitions from the back-end code that implements them also helps you break another anti-pattern: deriving your API object model from your database object model. Similarly, it helps you to “think in REST” by separating the semantics of invoking the API from the operations themselves. For example, a user (noun) can’t log in (verb), because the “log in” verb doesn’t exist in REST; you’d create (POST) a session resource instead. In this case, limiting the vocabulary you have to work with results in a better design.
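As a rough sketch of what this looks like in practice, here is a minimal, hypothetical OpenAPI 3.0 document describing the session-resource approach. The path, schema, and field names are illustrative, not taken from any particular API.

```yaml
# Hypothetical, minimal OpenAPI 3.0 spec sketching "create a session instead of logging in".
# All names here (paths, schemas, fields) are illustrative.
openapi: 3.0.3
info:
  title: Example API (illustrative only)
  version: 1.0.0
paths:
  /sessions:
    post:
      summary: Create a session (the REST equivalent of "logging in")
      operationId: createSession
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/Credentials'
      responses:
        '201':
          description: Session created
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Session'
        '401':
          description: Invalid credentials
components:
  schemas:
    Credentials:
      type: object
      required: [username, password]
      properties:
        username:
          type: string
        password:
          type: string
          format: password
    Session:
      type: object
      properties:
        token:
          type: string
        expiresAt:
          type: string
          format: date-time
```

Nothing in this file says how sessions are stored or validated; it only pins down the nouns (Credentials, Session) and the single REST verb used to act on them (POST).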

It takes longer to get development teams moving when you start with code.

It’s simply quicker to rough out an API by writing OpenAPI YAML than it is to start creating and annotating Java classes or writing and decorating Express stubs. To generate basic sample data from an OpenAPI-defined API, all it takes is filling out the example property for each field. Code generators are available for just about every mainstream client- and server-side development platform you can think of, and you can easily integrate those generators into your build workflow or CI pipeline. You can have skeleton codebases for both your client and your server, plus sample data, with little more than a properly configured CI pipeline and a YAML file.
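To illustrate, a schema fragment like the one below (the Order object and its fields are invented for this example) is usually all that mocking tools and code generators need in order to serve plausible sample responses; exactly how they consume the example values depends on the tool you wire into your pipeline.

```yaml
# Hypothetical schema fragment: the example values below are what mock servers and
# generators typically use to produce sample responses without any hand-written code.
components:
  schemas:
    Order:
      type: object
      properties:
        id:
          type: string
          example: "ord-10042"
        status:
          type: string
          enum: [pending, shipped, delivered]
          example: shipped
        total:
          type: number
          format: float
          example: 47.50
        createdAt:
          type: string
          format: date-time
          example: "2019-03-01T14:30:00Z"
```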

You’ll wind up reworking the API more often when you start with code.

This is really a side-effect of #1, above. If your API grows organically from your implementation, you’re going to eventually hit a point where you want to reorganize things to make the API easier to use. Is it possible to have enough discipline to avoid this pitfall? Maybe, but I haven’t seen it in the wild.

It’s harder to rework your API design when you find a problem with it.

If you want to move things around in a code-first API, you have to go into your code, find all of the affected paths or objects, and rework them individually. Then test. If you’re good, lucky, or your API is small enough, maybe that’s not a huge amount of work or risk. If you’re at this point at all, though, it’s likely that you’ve got some spaghetti on your hands that you need to straighten out. If you started with OpenAPI, you simply update your paths and objects in the YAML file and re-generate the API. As long as your tags and operationIds have remained consistent, and you’ve used some mechanism to separate hand-written code from generated code, all that’s left to change is the business logic and the mapping of the API’s object model to its representation in the database.
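As a hypothetical illustration of why stable tags and operationIds matter: most generators derive the name of the generated controller or interface from the tag and the name of the method stub from the operationId, so the path itself can move without disturbing the hand-written code that plugs into those stubs. The path and names below are made up, and the Order schema is assumed to be defined elsewhere in the spec.

```yaml
# Hypothetical path item: if this moves from /orders/{orderId} to /v2/orders/{orderId},
# regenerating the server stubs typically leaves the "orders" API class and the
# getOrderById() handler name intact, so hand-written business logic keeps working.
paths:
  /orders/{orderId}:
    get:
      tags: [orders]              # commonly mapped to an "OrdersApi" class or interface
      operationId: getOrderById   # commonly mapped to a getOrderById() method stub
      summary: Fetch a single order
      parameters:
        - name: orderId
          in: path
          required: true
          schema:
            type: string
      responses:
        '200':
          description: The requested order
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Order'
        '404':
          description: No order with that id
```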

The bigger your team, the more single-threaded your API development workflow becomes.

In larger teams building in mixed development environments, it’s likely you have people who specialize in client-side versus server-side development. So, what happens when you need to add to or change your API? Typically, your server-side developer makes the changes to the API before handing it off to the client-side developer to build against. Or you exchange a few emails, each developer goes off to do their own thing, and you hope that when everyone’s done, the client implementation matches up with the server implementation. In a setting where the team reviews proposed API changes before moving forward with implementation, code you’ve already written may be thrown away if the team decides to go in a different direction than the one the developer proposed.

It’s easy to avoid this if you start with the OpenAPI definition. It’s faster to sketch out the changes and easier for the rest of the team to review: they can read the YAML, or they can read HTML-formatted documentation generated from it. If changes need to be made, they can be made quickly without throwing away any code. Finally, any developer can make changes to the design; you don’t have to know the server-side implementation language to contribute to the API. Once the design is approved, your CI pipeline or build process generates stubs and mock data so that everyone can get started on their piece of the implementation right away.

The quality of your generated documentation is worse.

Developers are lazy documenters. We just are. If it doesn’t make the code run, we don’t want to do it. That leads us to omit or skimp on documentation, skip the example values, and, generally speaking, weasel out of work that seems unimportant but really isn’t. Writing OpenAPI YAML is just less work than decorating code with annotations that don’t contribute to its function.

Introducing Aviator DLT by TxMQ’s DTG

In 2017, TxMQ’s Disruptive Technologies Group created Exo – an open-source framework for developing applications using the Swirlds Hashgraph consensus SDK. Our intention for Exo was to make it easier for Java developers to build Hashgraph and other DLT applications. It provided an application architecture for organizing business logic, and an integration layer over REST or Java sockets. Version 2 of the framework introduced a pipeline model for processing transactions and the ability to monitor transactions throughout their life cycle over web sockets.

TxMQ has used Exo as the basis of work we’ve delivered to our customers, as well as the foundation for a number of proofs of concept. As the framework continued to mature, we began to realize its potential as the backbone for a private distributed ledger platform.

By keeping the programming and integration model consistent, we are able to offer a configurable platform that is friendlier to enterprise developers who don’t come from a distributed ledger background. We wanted developers to be able to maximize the investment they’ve made in the skills they already have, instead of having to tackle the considerable learning curve associated with new languages and environments.

Enterprises, like developers, also require approachability – though from a different perspective. Enterprise IT is an ecosystem in which any number of applications, databases, APIs, and clients are interconnected. For enterprises, distributed ledger is another tool that needs to live in the ecosystem. In order for DLT to succeed in an enterprise setting, it needs to be integrated into the existing ecosystem. It needs to be manageable in a way that fits with how enterprise IT manages infrastructure. From the developer writing the first line of code for their new DApp all the way down to the team that manages the deployment and maintenance of that DApp, everyone needs tooling to help them come to grips with this new thing called DLT. And so the idea for Aviator was born!

Aviator is an application platform and toolset for developing DLT applications. We like to say that it is enterprise-focused yet startup-ready. We want to enable the development of private ledger applications that sit comfortably in enterprise IT environments, flattening the learning curve for everyone involved.

There are three components of Aviator: the core platform, developer tools, and management tools.

Think of the core platform as an application server for enterprise DApps. It hosts your APIs, runs your business logic, handles security, and holds your application data. Each of those components is meant to be configurable so Aviator can work with the infrastructure and skill sets you already have. We’ll be able to integrate with any Hibernate-supported relational database, plus NoSQL datastores like MongoDB or CouchDB. We’ll be delivering smart contract engines for languages commonly used in enterprise development, like JavaScript, Java, and C#. Don’t worry if you’re a Solidity or Python developer; we have you on our radar too. The core platform will provide a security mechanism based on a public key infrastructure, which can be integrated into your organization’s directory-based security scheme or PKI if one is already in place. We can even tailor the consensus mechanism to the needs of an application or enterprise.

Developing and testing DApps can be complicated, especially when those applications are integrated into larger architectures. You’re likely designing and developing client code, an API layer, business logic, and persistence. You’re also likely writing a lot of boilerplate code. Debugging an application in a complicated environment can also be very challenging. Aviator developer tools help address these challenges. Aviator can generate a lot of your code from OpenAPI (Swagger) documents in a way that’s designed to work seamlessly with the platform. This frees developers to concentrate on the important parts and cuts down on the number of bugs introduced through hand-coding. We’ve got tools to help you deploy and test smart contracts, and more tools to help you look at the data and make sure everything is doing what it’s supposed to do. Finally, we’re working on delivering those tools the way developers will want to use them, whether that’s through integrations with existing IDEs like Visual Studio Code or Eclipse, or in an Aviator-focused IDE.

The work doesn’t end when the developers have delivered. Deploying and managing development, QA, and production DLT networks is seriously challenging. DLT architectures include a number of components, deployed across a number of physical or virtual machines, scaled across a number of identical nodes. Aviator aims to have IT systems administrators and managers covered there as well. We’re working on a toolset for visually designing your DLT network infrastructure, and a way to automatically deploy that design to your physical or virtual hardware. We’ll be delivering tooling to monitor and manage those assets, either through our own management tools or by integrating with the network management tooling your enterprise may already have. This is an area where even the most mature DLT platforms struggle, and there are exciting opportunities to reduce the friction of managing DLT networks through better management capabilities.

So what does this all mean for Exo, the framework that started our remarkable journey? For starters, it’s getting a new name and a new GitHub home. Exo has become the Aviator Core Framework, and can now be found on TxMQ’s official GitHub at https://github.com/txmq. TxMQ is committed to maintaining the core framework as a free, open-source development framework that anyone can use to develop applications based on Swirlds Hashgraph. The framework is a critical component of the Aviator Platform, and TxMQ will continue to develop and maintain it. There will be a path for applications developed on the framework to be deployed on the Aviator Platform should you decide to take advantage of the platform’s additional capabilities.

For more information on Aviator, please visit our website at http://aviatordlt.com and sign up for updates.

What Digital Cats Taught Us About Blockchain

Given the number of cat pictures that the internet serves up every day, perhaps we shouldn’t be surprised that blockchain’s latest pressure test involves digital cats. CryptoKitties is a Pokémon-style collecting and trading game built on Ethereum where players buy, sell, and breed digital cats. In the space of a week, the game has gone from a relatively obscure but novel decentralized application (DApp) to the largest DApp currently running on Ethereum. Depending on when you sampled throughput, CryptoKitties accounted for somewhere in the neighborhood of 14% of Ethereum’s overall transaction volume. At the time I wrote this, players had spent over $5.2 million in Ether buying digital cats. The other day, a single kitty sold for over $117,000.

Wednesday morning I attended a local blockchain meet-up, and the topic was CryptoKitties.

Congestion on the Ethereum node that one player was connected to was so bad that gas fees for buying a kitty could be as high as $100. The node was so busy that game performance degraded to the point where the game became unusable. Prior to the game’s launch, pending transaction volume on Ethereum was under 2,000 transactions; now it’s in the range of 10,000-12,000. To summarize: a game where people pay (lots of) real money to trade digital cats is degrading the performance of the world’s most viable general-purpose public blockchain.

If you’re someone who has been evaluating the potential of blockchain for enterprise use, that sounds pretty scary. However, most of what has been illustrated by the CryptoKitties phenomenon isn’t news. We already knew scalability was a challenge for blockchain. A proliferation of off-chain and side-chain protocols is emerging to mitigate these challenges, alongside projects like IOTA and Swirlds that aim to provide better throughput and scalability by changing how the network communicates and reaches consensus. Work is ongoing to advance the state of the art, but we’re not there yet, and nobody has a crystal ball.

So, what are the key takeaways from the CryptoKitties phenomenon?

Economics Aren’t Enough to Manage the Network

Put simplistically, as the cost of trading digital cats rises, the amount of digital cat trading should go down (in an idealized, rational market economy, that is). Yet both the cost of the kitties themselves – currently anywhere from $20 to over $100,000 – and the gas cost required to buy, sell, and breed kitties have risen to absurd levels. The developers of the game have also increased fees in a bid to slow down trading. So far, nothing has worked.

In many ways, it’s an interesting illustration of cryptocurrency in general: cats have value because people believe they do, and the value of a cat is simply determined by how much people are willing to pay for it. It’s also clear that this is not an optimized, ideal, or rational market economy.

The knock-on effects for the network as a whole aren’t clear either. Basic economics would dictate that as a resource becomes more scarce, those who control that resource will charge more for it. On Ethereum, that could come in the form of miners demanding higher gas prices, which would put upward pressure on the cost of running transactions on the Ethereum network in general.

For businesses looking to leverage public blockchains, the implication is that the risk of transacting business on public blockchains increases. The idea that something like CryptoKitties can come along and impact the costs of doing business adds another wrinkle to the economics of transacting on the blockchain. Instability in the markets for cryptocurrency already makes it difficult to predict the costs of operating distributed applications. Competition between consumers for limited processing power will only increase the risk, and likely the cost, of running on public blockchains.

Simplify, and Add Lightness

Interestingly, the open and decentralized nature of blockchains seems to be working against a solution to the problem of network monopolization. Aside from economic disincentives, there isn’t a method for ensuring that the network isn’t overwhelmed by a single application or set of applications. There isn’t much incentive for applications to be good citizens when the costs can be passed on to end-users who are willing to absorb those costs.

If you’re an enterprise looking to transact on a public chain, your mitigation strategy is both obvious and counter-intuitive: use the blockchain as little as possible. Structure your smart contracts to be as simple as they can be, and handle as much as you can either in application logic or off-chain. Building applications that are inexpensive to run will only become more valuable in a future where the cost of transacting rises. Use the right tools for the job, do what you can off-chain, and settle to the chain when necessary.

Private Blockchains for Enterprise Applications

The easiest way to assert control over your DApps is to deploy them to a network you control. In the enterprise, the trustless, censorship-free aspects of the public blockchain are much less relevant. Deploying to private blockchains like Hyperledger or Quorum (a permissioned variant of Ethereum) gives organizations a measure of control over the network and its participants. Your platform then exists to support your application, and your application can be structured to manage the performance issues associated with blockchain platforms.

Even when the infrastructure is under the direct control of the enterprise, it’s still important to follow the architectural best practices for DApp development. Use the blockchain only when necessary, keep your smart contracts as simple as possible, and handle as much as you can off-chain. In contrast to traditional distributed computing environments, scaling a distributed ledger platform by adding nodes increases fault tolerance and data security but not performance. Structuring your smart contracts to be as efficient as possible will ensure that you make the best use of transaction bandwidth as application usage scales.

Emerging Solutions

Solving for scalability is an area of active development. I’ve already touched on solutions which move processing off-chain. Development on the existing platforms is also continuing, with a focus on the mechanism used to achieve consensus. Ethereum’s Casper proposal would change the consensus mechanism to proof of stake, where validators put up an amount of cryptocurrency as collateral that can be forfeited if they act maliciously. While proof of stake has the potential to increase throughput, that potential hasn’t yet been proven.

Platforms built on alternatives to mining are also emerging.

IOTA has been gaining traction as an Internet of Things-scale solution for peer-to-peer transacting. It has the backing of a number of large enterprises, including Microsoft, and it is open-source and freely available. IOTA uses a directed acyclic graph as its core data structure, which differs from a blockchain and allows the network to reach consensus much more quickly. Swirlds is coming to market with a solution based on the Hashgraph data structure. Similar to IOTA, this structure allows for much faster time to consensus and higher transaction throughput. In contrast to IOTA, Swirlds is leaderless and Byzantine fault tolerant.

As with any emerging technology, disruption within the space can happen at a fast pace. Over the next 18 months, I expect blockchain and distributed ledger technology to continue to mature. There will be winners and losers along the way, and it’s entirely possible that new platforms will supplant existing leaders in the space.

Walk Before You Run

Distributed ledger technology is an immature space. There are undeniable opportunities for early adopters, but there are also pitfalls – both technological and organizational. For organizations evaluating distributed ledger, it is important to start small, iterate often, and fail fast. Your application roadmap needs to incorporate these tenets if it is to be successful. Utilize proofs of concept to validate assumptions and drive out the technological and organizational roadblocks that need to be addressed for a successful production application. Iterate as the technology matures. Undertake pilot programs to test production readiness, and carefully plan application rollout to manage go-live and production scale.

If your organization hasn’t fully embraced agile methods for application development, now is the time to make the leap. The waterfall model of rigorous requirements, volumes of documentation, and strictly defined timelines simply won’t be flexible enough to successfully deliver products on an emerging technology. If your IT department hasn’t begun to embrace a DevOps-centric approach, then deploying DApps is likely to meet internal resistance – especially on a public chain. In larger enterprises, governance policies may need to be reviewed and updated for applications based on distributed ledger.

The Future Is Still Bright

Despite the stresses placed on the Ethereum network by an explosion of digital cats, the future continues to look bright for distributed ledger and blockchain. Flaws in blockchain technology have been exposed somewhat glaringly, but for the most part these flaws were known before the CryptoKitties phenomenon. Solutions to these issues were under development before digital cats. The price of Ether hasn’t crashed, and the platform is demonstrating some degree of resilience under pressure.

We continue to see incredible potential in the space for organizations of all sizes. New business models will continue to be enabled by distributed ledger and tokenization. The future is still bright – and filled with cats!