How A Simple 500-Word Memo Changed the Way We Talk About the Internet
The unlikely trio MUST, SHOULD, and MAY in internet standards
by Joe Honton
In March of 1997, Scott Bradner penned a short memo titled "Key Words For Use in RFCs to Indicate Requirement Levels". It is more commonly referred to as simply RFC 2119. The entire document fits comfortably on two pages of paper.
That memo would go on to become the single most referenced document of the Internet Engineering Task Force (IETF).
The IETF Standards Process
Every day, the software engineers who build and maintain the computers that make up the internet consult technical standards to troubleshoot problems and to design innovative software that meets new challenges. Many of those standards are published by the IETF in documents called "Request for Comments" or, in geek jargon, RFCs.
These documents take shape when a computer scientist or engineer encounters a problem that needs to be addressed, and proposes a solution, writing it up with all of the flourish that makes for good bedtime reading. The draft manuscript is reviewed by an editor, who helps the author to frame the problem within a suitable historical context. Then the manuscript is crowd-sourced, and anyone can suggest corrections, or nit-pick the finer details, or argue for an entirely different solution.
All of this happens with complete transparency. Everything is voluntary; there are no financial incentives involved. It's just good people, standing on the shoulders of giants, paying it forward. In this way, the culture of the IETF and the RFC process infuses the internet with collaborative goodness. It's one of the great success stories of open governance. No one is in charge, anyone can participate, everyone benefits.
It's important to note that the IETF revises its standards in an unusual way. Instead of modifying the original standard by adding and removing sentences in situ, it updates an RFC by publishing a new RFC that references the original and describes the problems that weren't previously considered. The new RFC then proposes solutions to those problems, while keeping the legacy solution intact.
Every RFC that's ever passed muster can readily be found online, and mined for historical insight. Some are simply stamped "obsolete" and archived, while others are "updated" and form a daisy-chain thread of context and specification.
With this approach, the old and the new can operate side-by-side. There's never a cataclysmic "Stop the internet, we're upgrading to a new version" crisis.
Here's an example of the IETF process that we can all relate to. It's the standard that describes how email gets delivered from sender to recipient.
A long time ago, long before the World Wide Web, and before the network of networks that we now refer to as the Internet, there was an experimental communications system, set up by the U.S. Advanced Research Projects Agency, called ARPANET. This was a loose collection of about 50 computers, mostly at universities and research facilities, that were networked together using the Network Control Program (NCP), the forerunner of the TCP/IP protocols that now underpin the internet.
Each ARPANET-connected computer was shared by all of the professors and researchers at each facility. At that time, the idea of a "personal computer" was still ridiculously far-fetched, so sending a file from one place to another was accompanied by a challenge: how to send a letter to a recipient on another computer to inform them about the contents of the data that was being sent or received. The solution was straightforward: simply preface the body of the letter with header lines beginning with the words: "Date", "From", "Subject" and "At".
This convention was good enough to get things where they needed to be, but loose enough to cause four researchers from MIT, BBN, and SRI-ARC — Abhay Bhushan, Ken Pogran, Ray Tomlinson, and Jim White — to decide that it needed formalizing. They spelled it out in excruciating detail in September of 1973 in the aptly named "Standardizing Network Mail Headers" (RFC 561).
It soon became apparent that improvements could be made and by November of 1977, the original four mail headers were replaced with something that all of us now (five decades later) comfortably recognize: "From", "Sender", "Date", "To", "cc", "bcc", "Subject", "Comments", "Message-ID", "In-Reply-To", "References" and "Keywords". This was codified in "Standard for the Format of ARPA Network Text Messages" (RFC 733).
But by August of 1982, the ARPANET was no longer just a network of shared computers. It had become a network of networks, and the mail message protocol now needed to handle the problem of inter-network retransmission.
David Crocker, who had co-authored RFC 733, took up the challenge, writing "Standard for the Format of ARPA Internet Text Messages" (RFC 822). The subtle change in the new document's title, from network to internet, may be one of the first occurrences of a word that has since become part of our everyday vernacular. Going one step further, Crocker also made history by tentatively dropping the phrase "text message" and introducing the phrase "electronic mail".
Email became the #1 "killer app", so wildly popular that everyone had to have it. Over the next four decades, that success kept building upon itself as new challenges arose and new solutions were implemented. The daisy-chain continued to grow:
- RFC 2822 "Internet Message Format" (April 2001) reframes the context of an email to be viewed as having an envelope and contents, rather than its former context of being viewed as a letter with headers and a body.
- RFC 4021 "Registration of Mail and MIME Header Fields" (March 2005) adds headers to describe the path of a message as it moves from host to host. It also defines headers that describe the contents of the email: its language, compression, encoding, disposition, and type (e.g. text, image, document).
- RFC 5322 "Internet Message Format" (October 2008) revises its predecessors using simpler notation and easier to understand examples.
- RFC 6532 "Internationalized Email Headers" (February 2012) allows use of Unicode in headers.
- RFC 6854 "Update to Internet Message Format to Allow Group Syntax in the 'From:' and 'Sender:' Header Fields" (March 2013).
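To see how much of that lineage survives in everyday tooling, here is a minimal sketch that reads a message using Python's standard `email` package. The message itself, including the names and the `example.com` addresses, is invented purely for illustration; the header names are the ones that trace back to RFC 733.

```python
from email import message_from_string
from email.utils import parseaddr

# A made-up message: every name, address, and value here is illustrative.
raw = (
    "From: Alice <alice@example.com>\n"
    "To: Bob <bob@example.com>\n"
    "Cc: carol@example.com\n"
    "Subject: Lunch on Friday?\n"
    "Message-ID: <1234@example.com>\n"
    "\n"
    "Are you free at noon?\n"
)

msg = message_from_string(raw)

# Header lookups are case-insensitive, and a missing header yields None.
subject = msg["subject"]
keywords = msg["Keywords"]

# Split a display name from the address it labels.
sender_name, sender_address = parseaddr(msg["From"])

# Everything after the first blank line is the body.
body = msg.get_payload()

print(subject, sender_name, sender_address, keywords)
```

The blank line between the headers and the body is the same dividing line the early RFCs relied on, which is part of why a parser written today can still make sense of a message laid out in the old style.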
Accommodating the old and the new
As the lineage of email-related RFCs grew, there was a continual need to allow old servers to function properly without getting tripped up by new extensions to the email specification.
Some things may be allowed, some things may not. Some things should be accommodated, other things should not. Some things are required and simply must be present in a certain form, other things must never be present.
So now we can reintroduce Scott Bradner and his 500-word memo (RFC 2119). Bradner gave us a way to succinctly specify all of the above, without ambiguity, by defining these five imperatives (quoted here, just as he wrote them in March of 1997):
- MUST ⠀ This word, or the terms "REQUIRED" or "SHALL", mean that the definition is an absolute requirement of the specification.
- MUST NOT ⠀ This phrase, or the phrase "SHALL NOT", mean that the definition is an absolute prohibition of the specification.
- SHOULD ⠀ This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.
- SHOULD NOT ⠀ This phrase, or the phrase "NOT RECOMMENDED" mean that there may exist valid reasons in particular circumstances when the particular behavior is acceptable or even useful, but the full implications should be understood and the case carefully weighed before implementing any behavior described with this label.
- MAY ⠀ This word, or the adjective "OPTIONAL", mean that an item is truly optional. One vendor may choose to include the item because a particular marketplace requires it or because the vendor feels that it enhances the product while another vendor may omit the same item. An implementation which does not include a particular option MUST be prepared to interoperate with another implementation which does include the option, though perhaps with reduced functionality. In the same vein an implementation which does include a particular option MUST be prepared to interoperate with another implementation which does not include the option (except, of course, for the feature the option provides.)
Nearly every RFC written in the past 25 years has referenced RFC 2119 and used these imperatives to be explicit about what is mandatory, what is advisable, and what is allowed but not required.
Interestingly, RFC 2119 itself has only ever been updated once, by RFC 8174 in May of 2017, to clarify that when these imperatives are not capitalized, they have their normal English meanings.
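The capitalization rule is easy to picture in code. The following is only a rough sketch, not a tool the IETF uses: a hypothetical `count_keywords` helper that tallies the capitalized imperatives in a piece of text and ignores their lowercase cousins. The sample sentences are invented for the demonstration and are not taken from any RFC.

```python
import re
from collections import Counter

# The key words from RFC 2119, with longer phrases listed first so that
# "MUST NOT" is tried before "MUST" at the same position.
KEYWORDS = [
    "MUST NOT", "SHALL NOT", "SHOULD NOT", "NOT RECOMMENDED",
    "MUST", "REQUIRED", "SHALL", "SHOULD", "RECOMMENDED", "MAY", "OPTIONAL",
]

# Case-sensitive on purpose: lowercase "must" or "may" is ordinary English.
PATTERN = re.compile(r"\b(" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b")


def count_keywords(text: str) -> Counter:
    """Count the capitalized requirement key words found in text."""
    return Counter(PATTERN.findall(text))


# Invented sample text, for illustration only.
sample = (
    "A server MUST accept the From header. "
    "A client SHOULD NOT send empty Subject lines, "
    "and it may help to add Keywords, which are OPTIONAL. "
    "Implementations MUST NOT drop unknown headers."
)

counts = count_keywords(sample)
print(counts)
```

The lowercase "may" in the sample is skipped, which is exactly the distinction the update spells out. A real checker would also need to cope with phrases split across line breaks, which this sketch does not attempt.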
World Standards Day
October 14th is World Standards Day. It's a day when we can pause and give thanks for all the efforts of the many volunteers who contribute to the efficient running of things.
Regarding the internet, there are many organizations that help to establish the standards that contribute to its success. Their acronyms are familiar to anyone working in software development:
- IEEE SA - Institute of Electrical and Electronics Engineers Standards Association
- ISO/IEC - International Organization for Standardization and the International Electrotechnical Commission
- ANSI - American National Standards Institute
- W3C - World Wide Web Consortium
- IETF - Internet Engineering Task Force
And there are others, each filling a technical niche: computer languages, data storage, compression, caching, and many more.
Every domain has its own collection of standards, and its own reasons for establishing them. For those of us who write software, standards make it possible for you and me and someone we've never met to create state-of-the-art software that works in harmony with legacy software written years ago, and with novel software that may be written years from now.
Scott Bradner went on to become one of the stalwarts of the IETF, participating at every level of the process. We owe a lot to him. One way to pay him back is to volunteer in the standards-making process.
Let us be thankful to all the scientists, inventors, engineers, and writers who volunteer their time and expertise towards the establishment of standards.