Saturday, May 12, 2007


Protocols and APIs

In a couple of previous posts, I mentioned protocols and programming interfaces in reference to Internet standards. It's time to talk about the differences between them, and about when one would want to use each.

I'm talking here specifically about situations in which one computer program uses a service provided by another program. Such programs can have a client/server relationship, where one program is clearly the server and the other, the client, uses the service. Or they can have a peer-to-peer relationship, where each program takes its turn at providing and using services.

The two programs can run on the same computer, or on different computers on a network (such as the Internet). Since this discussion grew out of comments about Internet standards, I'm going to assume that the programs are on different computers on the Internet.

Whatever the service relationship, the two programs must agree on how to communicate. One way for them to do that is to use the same application programming interface (API). Another way is to use the same protocol.


When we define a protocol, we specify what information one program sends to the other program. We also specify the sequence of commands, information, and acknowledgements that the two programs send back and forth. Think of a protocol that we might use on the phone in formal settings:

caller: [enters callee's phone number]
callee: Hello, Acme Advertising, this is Jim Anderson.
caller: Hello, this is Ward Cleaver calling for Bentley Gregg.
callee: Mr Gregg is in a meeting. May I help you?
caller: This is important; I need to speak with Mr Gregg directly, about the Ricardo account.
callee: One moment, and I'll see if I can get him.
caller: Thank you. [waits...]
callee: Hello, this is Bentley Gregg.
One can imagine that Mr Gregg might not have come to the phone if Mr Cleaver had not followed this protocol, and had instead said something like, “Don't give me this ‘meeting’ story, just put Mr Gregg on now!”

We specify Internet protocols more precisely, of course, but it's the same idea. The protocol used to send email around the Internet is the Simple Mail Transfer Protocol (SMTP). The specification defines a number of commands and responses, and tells us how those have to be used in order for one mail relay to transfer mail to another. The result looks something like this:

relay 1: [makes network connection to relay 2]
relay 2: 220 ESMTP Sendmail 8.12
relay 1: HELO
relay 2: 250
relay 1: MAIL FROM:
relay 2: 250 2.1.0 sender OK
relay 1: RCPT TO:
relay 2: 250 2.1.5 recipient OK
relay 1: RCPT TO:
relay 2: 250 2.1.5 recipient OK
relay 1: DATA
relay 2: 354 send message
relay 1: [sends the email message]
relay 1: .
relay 2: 250 2.0.0 message accepted
relay 1: QUIT
relay 2: 221 2.0.0 closing connection
Note how the protocol specifies exactly what to send in order to identify a recipient for the message (“RCPT TO:”). It also specifies the significant part of the response, and allows a human-readable bit at the end (“recipient OK”), which the program is told to ignore and which makes debugging easier.

But nothing here tells the programmer how to write the program, nor gives any tools to help. Any program that sends the right commands in the right order (HELO, MAIL FROM, and so on) and handles the responses will be accepted as an SMTP client, for example. And that's why we standardize these protocols — it's what allows anyone to create an email program, and makes sure that we don't have to buy all of our software from the same company. It's just up to the programmer to put all that together.
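
As a small taste of what "handling the responses" means for the programmer, here's a minimal sketch, in Java, of pulling the significant part out of a reply line. The class name `SmtpReply` and its methods are my own invention for illustration, not part of any real mail library, and the sketch handles only single-line replies like the ones in the transcript above.

```java
// Hypothetical helper for illustration; not part of any real mail library.
final class SmtpReply {
    final int code;     // the three-digit code, the part a program acts on
    final String text;  // everything after the code, useful mainly in logs

    private SmtpReply(int code, String text) {
        this.code = code;
        this.text = text;
    }

    // Parses a single-line reply such as "250 2.1.5 recipient OK".
    static SmtpReply parse(String line) {
        if (line == null || line.length() < 3) {
            throw new IllegalArgumentException("not an SMTP reply: " + line);
        }
        int code = 0;
        for (int i = 0; i < 3; i++) {
            char c = line.charAt(i);
            if (c < '0' || c > '9') {
                throw new IllegalArgumentException("not an SMTP reply: " + line);
            }
            code = code * 10 + (c - '0');
        }
        // Skip the separator after the code, if there is one.
        String text = line.length() > 4 ? line.substring(4) : "";
        return new SmtpReply(code, text);
    }

    // Codes in the 200s mean the server accepted the command.
    boolean isAccepted() {
        return code >= 200 && code < 300;
    }
}
```

With this, `SmtpReply.parse("250 2.1.5 recipient OK")` gives a code of 250 and reports the command as accepted, while the "354 send message" reply does not count as accepted, because it only invites the client to go on and send the message.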

Putting it together can be involved, especially for more complex protocols. To help with that we often define APIs, which provide an easy way to program the conversation between the client and the server, something that makes more sense to a programmer than having to get every bit of the protocol right. Here's java-like pseudo-code for one sort of API that could be used for the SMTP conversation above:

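try {
  // One call makes the connection and sends the HELO
  SMTP.connect("", "");
  List to = new List();
  // One call names the sender and the list of recipients
  SMTP.fromto("", to);
} catch (Exception x) {
  [...handle errors...]
}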
This hypothetical API has made it easier because the programmer doesn't have to worry about the exact data that goes “on the wire”, and doesn't have to check the responses explicitly (errors are sent down to the “catch” section automatically). The API has also collapsed some of the protocol elements into a single line in the program (making the connection and sending the HELO, for instance).

But someone still has to do the protocol stuff, ultimately. There's still a program that turns the API into on-the-wire protocol. That program is the implementation of the API. The nitty-gritty work is still done, just not by the person writing the example above.
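
To make that concrete, here's a rough sketch of what the implementation behind such an API might look like. Everything in it is an assumption of mine: the class name `SmtpSession`, the method names `connect` and `fromTo`, and the choice to take a reader and a writer rather than open a network connection itself (in real use they would wrap a socket's streams). It covers only the steps up through naming the recipients, and checks only the leading reply code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.util.List;

// Hypothetical implementation behind an SMTP-style API. The caller never
// sees command strings or reply codes; this class deals with both.
final class SmtpSession {
    private final BufferedReader in;
    private final Writer out;

    // In real use, the reader and writer would wrap a socket's streams.
    SmtpSession(Reader in, Writer out) {
        this.in = new BufferedReader(in);
        this.out = out;
    }

    // Reads the server's greeting, then sends HELO: two protocol steps, one call.
    void connect(String clientName) throws IOException {
        expect(220);
        command("HELO " + clientName, 250);
    }

    // Names the sender, then each recipient in turn.
    void fromTo(String from, List<String> to) throws IOException {
        command("MAIL FROM:<" + from + ">", 250);
        for (String recipient : to) {
            command("RCPT TO:<" + recipient + ">", 250);
        }
    }

    // Sends one command line (SMTP lines end in CRLF) and checks the reply.
    private void command(String line, int expectedCode) throws IOException {
        out.write(line + "\r\n");
        out.flush();
        expect(expectedCode);
    }

    // Turns an unexpected reply into an exception, so callers need no checks.
    private void expect(int expectedCode) throws IOException {
        String reply = in.readLine();
        if (reply == null || !reply.startsWith(Integer.toString(expectedCode))) {
            throw new IOException("unexpected SMTP reply: " + reply);
        }
    }
}
```

The person calling `connect` and `fromTo` writes two lines and a catch block; the command strings, the line endings, and the reply checking all live in here.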

Now, APIs can be used, as here, to make it easier to implement standard protocols. But they can also be used to allow the implementation of non-standard protocols. There are lots of things in Windows, for example, that involve non-standard, unpublished protocols — protocols that might change from one release of Windows to another. But there are published, supported APIs that are guaranteed to continue working, because the implementation of the API keeps up with the changes.

What it all comes down to is this:

When we define a standard protocol, it allows you to buy a server program from one company, me to buy a client program from another company, and us to have those programs talk to each other without anyone doing anything special, or having to know whom we bought our software from... as long as both programs use the standard.

When we define a standard API, your server does not have to use any standards at all, and can use a proprietary protocol of its own. But if I have a client from another company that uses that API, I have to install your server's implementation of the API on my computer in order for the two to communicate — that's the bridge between the API and your server's protocol.
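
Here's a toy sketch of that arrangement in Java. The interface `MessageService` plays the part of the standard API, and the two vendor classes are the implementations that each vendor would ship. All of the names and both of the "wire formats" are made up for illustration, and a list of strings stands in for the network.

```java
import java.util.ArrayList;
import java.util.List;

// The "standard API": the only thing client code is written against.
interface MessageService {
    void send(String recipient, String body);
}

// Hypothetical vendor A: a command line followed by the body.
final class VendorAService implements MessageService {
    final List<String> wire = new ArrayList<>(); // stands in for the network

    public void send(String recipient, String body) {
        wire.add("SEND " + recipient + " " + body.length());
        wire.add(body);
    }
}

// Hypothetical vendor B: the same request as one delimited record.
final class VendorBService implements MessageService {
    final List<String> wire = new ArrayList<>(); // stands in for the network

    public void send(String recipient, String body) {
        wire.add("MSG|" + recipient + "|" + body);
    }
}

// Client code: identical whichever vendor's implementation is installed.
final class NotifierClient {
    static void greet(MessageService service, String recipient) {
        service.send(recipient, "hello");
    }
}
```

The client never changes, but what goes on the wire depends entirely on which implementation it was handed, and vendor A's implementation is of no use for talking to vendor B's server.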

In the IETF, we concentrate on standard protocols, because that's what makes things the most interoperable. We can still layer published or standardized APIs on top of the standard protocols, but they are now just a programming convenience, not required to make anything work at all.
