On COBOL

We’ve all seen that the world (well, governments, specifically state governments, to say nothing of the banks) is screaming for COBOL programmers—a cry that goes up roughly every five years. We somehow muddle through the crisis at hand, then people forget that it was ever a problem. It’s time we asked what the crisis really is, and why it keeps returning.

COBOL is one of the earliest programming languages; it was invented in 1960 and rose to prominence fairly quickly as a language that required minimal programming skills. (Real programmers wrote FORTRAN.) That’s not how COBOL’s inventors put it, but that is, to some extent, what they meant: a language that was supposed to be easy for programmers to learn, and that could also be understood by business people. Just look at what COBOL stands for: “Common Business-Oriented Language.” A programming language for business.

COBOL’s influence faded in the 1980s, and now, there are billions of lines of code in governments, banks, enterprises, and elsewhere performing essential business functions with nobody to maintain them. COBOL programmers have grown old and retired, and nobody came along afterward.

What’s the language like? I’ve had occasion to look at COBOL code, and my reaction hasn’t been what I expected. It doesn’t look like any “modern” language. But it’s not a strange antique from the days before people knew how to design decent languages. COBOL is a well-thought-out domain-specific language. It’s a business language that uses the language of businesspeople. Remember when Rubyists were proud that they could write statements that looked like idiomatic English? And that they could use metaprogramming to create domain-specific languages that used the vocabularies and concepts of different application domains? That was no small achievement. And COBOL did it 40-odd years earlier.

Like other useful languages, COBOL never disappeared; but it has had surprisingly little influence on the development of computer languages, and that makes it look like it has died. In “10 Most(ly) Dead Influential Programming Languages,” Hillel Wayne argues that COBOL had little influence on the development of programming languages because it came from the business community, and academics weren’t interested in it; for academics, it “wasn’t worth paying attention to.” Who wants to write code that’s readable by bankers and business people anyway? The allure of speaking a secret language that nobody else understood was always attractive to programmers.

COBOL nevertheless made a number of important innovations. It had a concept of records (like rows in a database), which was related to a concept of hierarchical structures, looking forward to C structs and perhaps even objects. And it has a report generator—if that doesn’t sound interesting, remember that one of the initial applications for Perl was report generation. And that another nearly forgotten early language, RPG, was invented purely to generate reports. Reports aren’t glamorous, but they’re important.

Syntactically, COBOL asked a really good question: Why do we need to use the bastardized language of mathematics to move money around, by saying something like “total = total + deposit”? Wouldn’t it be more natural to MOVE amounts from one account to another? Don’t get too excited. MOVE sounds like a proto-transaction, but it isn’t; it’s just an assignment. However, if you’re thinking about MOVE-ing money rather than assignment to a variable, those thoughts will lead you to atomic transactional operations sooner rather than later.

Of course, there’s a lot that COBOL doesn’t offer. While COBOL has been updated with most of the features you’d expect in a modern language (since 2002, it’s even object-oriented), COBOL tends to lead to very awkward spaghetti code and monoliths. That’s 1960s programming for you. GOTO was an essential part of every programming language (even C has a goto statement). Modularization wasn’t well-understood, if it was understood at all. Libraries? The earliest versions of COBOL didn’t have a standard library, let alone user-defined libraries. Web frameworks? You don’t want to know. Microservices? Forget it.

So, where are we now, with our billions of lines of COBOL running the world’s governments and finances? I doubt there are many 1960s mainframes left, but there are plenty of emulations of 1960s mainframes running COBOL in the cloud much faster than the hardware it ran on initially. And that’s one strategy I’ve seen for maintaining COBOL: leave it as it is, run it on an emulator, wrap it up in a microservice written in some “modern” language, and hope you never have to touch it. That buys time, but while “hope” may solve the immediate problem, it’s a poor long-term strategy.

The real problem isn’t just the lack of programmers fluent in a language that is no longer popular. There are also cultural problems that need to be addressed—and that have solutions that go beyond “train up a new batch of COBOL programmers.” First, one casualty of the “language wars” of the 90s and 00s is that we have an increasing number of programmers who identify with one language: they’re JavaScript programmers, or Java programmers, or Python programmers, or Rubyists. Dave Thomas’ and Andy Hunt’s advice to learn a new programming language every year is just as valid as it was when they first wrote The Pragmatic Programmer; but it goes sadly unheeded. To be a good programmer, you need to expose yourself to new ideas, new ways of thinking about problems—and, in the case of COBOL, old ideas. Programmers who can’t be coaxed out of their comfort zone aren’t going to learn COBOL; but in the long run, they’ll prove to be less valuable, regardless of what modern language they know.

Second, COBOL programming requires an understanding of business programming. Regardless of the language, that’s an increasingly rare specialty. How do you handle financial quantities, like dollars and cents? If you say “floating point,” go to the back of the class. Roundoff errors will kill you. If you say “use integers, and divide by 100,” that’s not much better. The fundamental problem is that binary numbers are not good at representing decimal fractional values. But that’s lore that most current programmers have never learned. (And we haven’t even started thinking about currency conversions.)
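
To make the roundoff point concrete, here is a minimal sketch in Python (not COBOL) that compares binary floating point with the standard library's decimal type. The helper `add_interest` and the sample amounts are hypothetical, invented only for illustration; the choice of half-up rounding to whole cents is also an assumption, since real systems follow whatever rounding rule the business or regulator specifies.

```python
from decimal import Decimal, ROUND_HALF_UP

# Binary floating point cannot represent most decimal fractions exactly,
# so ten cents plus twenty cents does not compare equal to thirty cents.
float_matches = (0.10 + 0.20 == 0.30)

# Decimal stores base-10 digits, so the same sum is exact.
decimal_matches = (Decimal("0.10") + Decimal("0.20") == Decimal("0.30"))


def add_interest(balance: Decimal, rate: Decimal) -> Decimal:
    """Hypothetical helper: apply a rate, then round to whole cents.

    Half-up rounding is an assumption made for this sketch.
    """
    cents = Decimal("0.01")
    return (balance * (1 + rate)).quantize(cents, rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    print("float sum matches:  ", float_matches)
    print("decimal sum matches:", decimal_matches)
    print("balance after interest:", add_interest(Decimal("10.25"), Decimal("0.0125")))
```

Constructing each `Decimal` from a string matters here: building one from a float would carry the binary approximation along with it.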

Third, engineering decisions made in the 1960s, 1970s, and even 1980s aren’t the decisions we’d make today. The engineering was certainly valid for its time, but modern engineers frequently don’t understand why. I’ve heard many contemporary programmers talk about the Y2K problem (representing years in the 1900s with two digits) as “technical debt.” That represents a misunderstanding of the issues the original programmers faced. In an environment where data was entered on 80-column punched cards, saving 2 characters was a Big Deal. In an environment where the largest computers had memories measured in kilobytes (and a small number of K at that), saving 2 characters was a Big Deal. This isn’t engineering that has to be replicated, but it does need to be understood.
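
For readers who never saw the two-digit year problem up close, the following Python sketch shows how date arithmetic breaks when a record stores only the last two digits of a year. Both function names are hypothetical. The second function illustrates a remediation commonly called windowing; the pivot value of 50 is an arbitrary assumption chosen for the example, not a figure from any particular system.

```python
def years_elapsed_two_digit(start_yy: int, end_yy: int) -> int:
    """Naive elapsed years when only the last two digits of each year are stored."""
    return end_yy - start_yy


def expand_year(yy: int, pivot: int = 50) -> int:
    """Windowing sketch: two-digit years below the pivot are read as 20xx,
    the rest as 19xx. The pivot of 50 is an assumption for illustration."""
    return 2000 + yy if yy < pivot else 1900 + yy


if __name__ == "__main__":
    # Someone born in 1965 (stored as 65), evaluated in 2000 (stored as 00).
    print(years_elapsed_two_digit(65, 0))     # negative age
    print(expand_year(0) - expand_year(65))   # age after windowing
```

Windowing keeps the two-character field on disk and moves the interpretation into code, which is why it only postpones the ambiguity to a later date rather than removing it.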

Fourth, old business software was monolithic, and monolithic in a very deep sense. It tended to model forms that humans would fill out, and that couldn’t be submitted until the form was complete. There’s often no way to save your work. Why would you need to? You went to the unemployment office in person; you left when you handed the application to the person behind the desk. An incomplete form went into the wastebasket; why waste valuable storage on it? Putting a web interface in front of those monoliths leads to a predictable result: long, complex forms that can take hours to fill out, and that are close to unusable on the modern Web. In creating GetCalFresh, a streamlined application for food assistance in California, Code for America found that the old form took an hour to fill out, but applicants often relied on public computers in libraries that didn’t allow sessions longer than a half-hour. Since incomplete forms couldn’t be saved, it was impossible for applicants to finish applying. Moving a COBOL application to an emulator, running it in the cloud, and hacking together a Web frontend isn’t going to solve problems like this. The good news is that this is an opportunity to re-think your service and make it more effective. The bad news is that it’s not a quick fix.

So, what’s needed? Yes, we need more people who know and understand COBOL programs. There’s a lot of old code that needs to be maintained, pure and simple. But it goes deeper. COBOL is just another programming language; if we’re going to maintain (or replace) that software, we need programmers who understand the engineering decisions that made the software what it is. We also need engineers and managers who are willing to look at our current situation—for example, the huge surge in unemployment applications—and think beyond the short-term solution. What does it take to re-invent current systems, rather than just replace them? How can they become more human-centric? Can they be redesigned to match the way we live and work now? Putting a web front-end on a monolithic business process from the 1950s is the road to failure.

That’s the new generation of COBOL programmers that we need: people to do the tedious, unglamorous work of re-inventing, re-engineering, and automating government applications, business applications, and much more. Reimagining these processes is creative work, but it requires a different kind of creativity from implementing a new website. I previously wrote about the distinction between blue- and white-collar programming. COBOL is very, very blue-collar. And very, very important. Every time the cry for COBOL programmers has gone up, we’ve muddled through; this time, we should do something better.

The future of programming is re-understanding the past, and re-inventing it to meet our current challenges.