Yesterday on Twitter I made a comment criticizing the practice of putting an API key in a query string parameter. I was surprised by the amount of attention it got, and there were a number of responses questioning the significance of my objection. Rather than try to reply in 140-character chunks, I decided a blog post was in order.
Most of the comments were security related.
It is true that whether the API key is put in the URL or in the Authorization header, it is going to be sent over the wire in clear text. If security is critical then HTTPS is going to be necessary and both approaches would be equivalent… over the wire.
The security problem is not really what happens while the message is going over the wire; it is what happens to it on the client and server. We developers like writing things out to log files, and URLs are full of useful information for debugging.
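To make the logging point concrete, here is a minimal sketch of the kind of request logging many of us write. The endpoint, the key, the `api_key` parameter name and the `Bearer` scheme are all made up for illustration, and it assumes a logger that records the URL but not the headers, which is common but by no means guaranteed.

```python
import logging
from io import StringIO

# Capture log output in memory so the effect is easy to inspect.
log_buffer = StringIO()
logger = logging.getLogger("api_client_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(logging.StreamHandler(log_buffer))


def log_request(method, url, headers):
    # Typical debug logging: record the method and URL, not the headers.
    logger.info("%s %s", method, url)


# Hypothetical key and endpoint, for illustration only.
API_KEY = "s3cr3t-key"

# Key in the query string: it ends up in the log output.
log_request("GET", "https://api.example.com/items?api_key=" + API_KEY, {})

# Key in the Authorization header: the logged URL carries no secret.
log_request(
    "GET",
    "https://api.example.com/items",
    {"Authorization": "Bearer " + API_KEY},
)

logged_lines = log_buffer.getvalue().splitlines()
```

The first logged line contains the key and the second does not. A logger that dumps headers would leak the key either way, so this only lowers the odds.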
My friend Pedro pointed me to an article that demonstrates how API keys in URLs can become a major problem.
I’m not suggesting that putting the API key into an Authorization header makes all the problems go away. It is just a matter of reducing the chances of sensitive information being stored in unsecured places and then being misused.
It reminds me of the choice we make to lock our car doors. Anyone who has locked their keys in the car knows how easy it is for someone with the right tools to get into a locked car. However, locking your doors does significantly reduce the chance of theft.
Andrew makes the suggestion to use the username:password convention that was introduced in RFC 1738 back in 1994.
When RFC 1738 was revised in 1998 and became RFC 2396, the following text was added:
Some URL schemes use the format “user:password” in the userinfo field. This practice is NOT RECOMMENDED
In the latest revision of the URI specification, RFC 3986, they went further,
Use of the format “user:password” in the userinfo field is deprecated.
The reasons for deprecating this feature are very much applicable to the use of API keys in the query string. It is unfortunate that we aren’t quicker at learning from the mistakes of those who came before us.
Forget the security issue
When I wrote the tweet, I really wasn’t complaining about having an API key in the URL for security reasons. For me, there are a number of benefits of using the authorization header.
One of the constraints of REST is called the “Uniform Interface”. A benefit of this constraint is that when you start working with a new API, there should be consistency in the way it works. This helps to reduce the learning curve and makes it easier to build re-usable code that depends on this consistency.
Many HTTP client libraries have the ability to set default headers that will automatically be sent with every request. It’s one line of code and you get to forget about API keys and focus on actually using the API.
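As a rough sketch of what that looks like, here is the idea using only Python's standard library. The key value and the `Bearer` scheme are assumptions on my part; use whatever scheme the API you are calling documents.

```python
import urllib.request


def make_opener(api_key):
    """Return an opener that sends the API key on every request."""
    opener = urllib.request.build_opener()
    # addheaders is a list of (name, value) pairs added to each request
    # made through this opener. The "Bearer" scheme is an assumption.
    opener.addheaders.append(("Authorization", "Bearer " + api_key))
    return opener


opener = make_opener("s3cr3t-key")  # hypothetical key
# From here on, calls such as opener.open("https://api.example.com/items")
# carry the header without any per-request code.
```

The line that matters is the `append`. After that, the URLs you request are exactly the URLs the API documents, with nothing bolted on.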
When assigning an API key in a URL, you first need to know if the parameter is key, apikey, api-key or api_key. Then you need to modify the URL that you want to call to add the API key. Futzing around with strings to add a query parameter to an existing URL is full of annoying little gotchas. It is not so hard to do on a case-by-case basis, but trying to write generic code that will work for any URI is just painful.
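To give a feel for those gotchas, here is one way a generic helper might look in Python. The function name `add_query_param` and the example URLs are mine, not part of any particular API, and this is a sketch rather than a complete solution.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_query_param(url, name, value):
    """Return url with name=value in its query string, replacing any old value."""
    parts = urlsplit(url)
    # Keep existing parameters (including blank ones), dropping any old copy of `name`.
    pairs = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k != name
    ]
    pairs.append((name, value))
    # urlencode percent-encodes the values; urlunsplit puts the fragment back last.
    return urlunsplit(parts._replace(query=urlencode(pairs)))


# No existing query string: the helper has to add the "?".
plain = add_query_param("https://api.example.com/items", "api_key", "abc")

# Existing query and fragment: the key must go before the "#", joined with "&".
with_fragment = add_query_param(
    "https://api.example.com/items?page=2#top", "api_key", "abc"
)
```

Even this version has a wrinkle: parsing and re-encoding the query can change how the existing parameters are escaped (for example `%20` becoming `+`), which is exactly the sort of irritant I mean.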
I’m quite sure that string manipulation of URLs is one of the primary reasons API providers create API-specific client libraries to insulate client developers from these irritants.
Another REST constraint is the hypermedia constraint. I realize that hypermedia usage in the API world is still very exceptional, but its popularity is growing. Having to define a URI template for every embedded link, just to add an API key, would be really annoying.
Believe it or not, caching is another REST constraint. HTTP caches use the URL as part of the primary cache key, including the query string parameters. If you add an API key to the URL, you make it difficult to take advantage of HTTP caches for resources that are common to all users. A cache would end up keeping a duplicate copy of the resource representation for every user. Not only is this a waste of cache space, but it also reduces the cache hit ratio massively.
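A toy model shows the duplication. This is not how any real cache is implemented: it ignores freshness, Vary and everything else, and only assumes that the cache key is the method plus the full URL. The keys, the `api_key` parameter name and the endpoint are hypothetical.

```python
def cache_key(method, url):
    # Toy model: the primary cache key is the method plus the full URL,
    # query string included.
    return (method, url)


def count_cache_entries(requests):
    """Store one entry per distinct cache key and report how many exist."""
    cache = {}
    for method, url in requests:
        cache.setdefault(cache_key(method, url), "representation of /countries")
    return len(cache)


users = ["key-alice", "key-bob", "key-carol"]  # hypothetical API keys

# Key in the query string: every user produces a different URL.
with_key_in_url = [
    ("GET", "https://api.example.com/countries?api_key=" + k) for k in users
]

# Key in a header: every user requests the same URL.
with_key_in_header = [("GET", "https://api.example.com/countries") for _ in users]

entries_url = count_cache_entries(with_key_in_url)
entries_header = count_cache_entries(with_key_in_header)
```

Three users give three cache entries when the key is in the URL and one entry when it is not. The header case assumes the server has explicitly marked the response as cacheable by shared caches.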
It is interesting to note that if you use an Authorization header, HTTP caches have special logic that will prevent caching by public caches unless you specifically allow it using a cache-control directive. When the API key is buried in a query string parameter, intermediaries have no idea that the representation has come from a protected resource and therefore don’t realize that caching it might be a bad idea. Using standard HTTP features, the way they were defined, allows intermediary components to perform useful functions because they can have a limited understanding of the message.
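Here is a simplified sketch of that rule as I understand it from the HTTP caching specification: a shared cache should only reuse a response to a request that carried an Authorization header if the response opts in with `public`, `must-revalidate` or `s-maxage`. The function names are mine, and a real cache checks far more than this.

```python
SHARED_CACHE_OPT_IN = {"public", "must-revalidate", "s-maxage"}


def parse_directives(cache_control):
    """Return the lowercase directive names from a Cache-Control value."""
    names = set()
    for part in cache_control.split(","):
        name = part.strip().split("=", 1)[0].lower()
        if name:
            names.add(name)
    return names


def shared_cache_may_reuse(request_headers, response_cache_control):
    """Simplified check of the Authorization rule for a shared cache."""
    directives = parse_directives(response_cache_control)
    if directives & {"no-store", "private"}:
        return False
    has_auth = any(h.lower() == "authorization" for h in request_headers)
    if not has_auth:
        # Nothing tells the cache that the resource is protected.
        return True
    # Protected request: only reuse if the response explicitly allows it.
    return bool(directives & SHARED_CACHE_OPT_IN)


# Key in the header, no opt-in: the shared cache leaves the response alone.
protected = shared_cache_may_reuse({"Authorization": "Bearer x"}, "max-age=60")

# Key hidden in the query string: the cache sees an ordinary request.
unnoticed = shared_cache_may_reuse({}, "max-age=60")
```

In the second case the intermediary has no signal at all, which is the point: the protection only kicks in when the credentials are where HTTP expects them to be.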
A Thousand paper cuts
I believe that the usability of an API is hugely impacted by many small factors that, in isolation, seem fairly inconsequential. It is the combined effect of these small issues that is significant. There is also the impact of change. What doesn’t matter today might be very significant sometime in the future.
The HTTP specifications define a set of guidelines for building distributed applications that have been proven to work, in real running applications. Disregarding the advice they contain is throwing money down the drain.
A final comment that I would like to address came from Bret,
My original objection was that using the Authorization header was not an option in the API I was trying to use. I understand why some users prefer to use query string parameters. Providing an easy path to get users working with your product is critical, and if providing them with a query string parameter to send the auth key helps that process then do it. However, I also believe part of the role of an API provider is to help educate API consumers on the best way to work with an API to get the best results over the long run. Give them an easy way, and when they are ready, educate them on the better way.
Hopefully, this blog post has provided some concrete reasons as to why using an Authorization header is a better solution.