Needless to say, this is completely from my perspective. I really wish Mark Delany in particular would write something similar, as it's the other half of the equation and his perspective would be really enlightening. DKIM is a remarkable piece of convergent evolution.
IIM
(Photo: Tasman Drive)
In 2004 Cisco, just like everybody else, was being inundated by spam. With my personal mail server, SpamAssassin couldn't keep up with the permutations. Cisco had no visibility or expertise with email but we were heavy users of email, so we had an outsider's view that the situation was really bad and didn't seem like it would get better any time soon. So Dave Rossetti assembled myself, Fred Baker, Eliot Lear, Jim Fenton and maybe one other that I'm forgetting to talk about what Cisco could do about the spam problem. The main thing going on at the time was Bayesian filtering, but that was being defeated by image spam. After one of these meetings, I came up with an idea that if the mail servers did nothing more than apply an unanchored digital signature to the mail but with a consistent key, that maybe the Bayesian filters could latch onto that as a signal for spam or ham. I remember talking to Eliot after a meeting telling him my idea, and he was interested as I recall, but dubious that a free floating key would work. Some time after I told Jim too, but he had a better idea: why not anchor the key to a domain? And thus the genesis of Identified Internet Mail, IIM. I'm fairly certain Jim came up with the name IIM because if left to me I would have probably tried to make some cutesy tortured acronym à la KINK.
Since we now had a trust anchor (i.e., the sending domain) it became obvious that we could also publish a record which said whether the sending domain signed all of its mail or not. If the receiving domain received unverified mail and the sending domain says it signs everything, it would be a forgery in the eyes of the sending domain. Thus the concept of Sender Signing Policy (SSP) was born.
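The decision a receiver makes under that kind of policy can be sketched in a few lines. This is only an illustration of the reasoning above, not the actual SSP record syntax or my original code; the names `Verdict`, `evaluate`, and the `domain_signs_all` flag are hypothetical.

```python
from enum import Enum
from typing import Optional


class Verdict(Enum):
    VERIFIED = "verified"   # a signature from the sending domain checked out
    FORGERY = "forgery"     # unsigned/unverified, but the domain says it signs everything
    UNKNOWN = "unknown"     # unsigned/unverified and the domain makes no such promise


def evaluate(has_valid_signature: bool, domain_signs_all: Optional[bool]) -> Verdict:
    """Hypothetical receiver-side policy check.

    domain_signs_all is True if the sending domain publishes a policy saying it
    signs all of its mail, False if it says it does not, and None if no policy
    record could be found.
    """
    if has_valid_signature:
        return Verdict.VERIFIED
    if domain_signs_all:
        # The sending domain claims to sign everything, so in its eyes
        # an unverified message is a forgery.
        return Verdict.FORGERY
    return Verdict.UNKNOWN
```

What the receiver then does with a `FORGERY` verdict (reject, quarantine, just log it) is a separate local decision.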
So off we went. Jim was still part of his group, and I was still working for Dave Oran at the time, so we were more or less doing this free-form and under the radar. Jim wrote most of the IIM draft, and I wrote the actual IIM code, telling Jim what the syntax of the header was from my running code, and how I implemented the SSP code. IIM had a concept of a key registration server (KRS) that ran on top of HTTP. For discovery, we used a well-known top level SRV record to find the KRS. We were a little nervous about the overhead from HTTP for fetching the key, but we had a means to allow it to be cached, so we figured it was probably acceptable. We were also really nervous about the overhead of the RSA signing operation. But when I wrote the code using a sendmail milter I quickly found out that the signing overhead was drowned out by the overall processing of the message so it wasn't a problem.
While this was going on we had heard of some exec at another company falling for a spear phishing attack purportedly from another employee. We didn't think our execs were any brighter or more security savvy -- and frankly, none of the engineers either, since it isn't easy to figure out even if you're looking for it. So with Dave Rossetti we decided that spear phishing was a scary problem for Cisco and decided to create a research group within Cisco charged with dealing with this employee-employee spear phishing attack, where I was employee #1 (Jim stayed in his group throughout this). We got some coworkers that we had worked with before including one -- Bailey Szeto -- who had close ties to Cisco IT. The object was to create an IIM signing/verifying MTA and insert it into the mail pipeline to sign and verify signatures.
While this was going on, we were starting to reach out and socialize the ideas externally. Our co-worker Dan Wing was good friends with Jon Callas then at PGP Corp so we had him over to talk it over to make certain we weren't crazy. I'm not sure if Jon was impressed or not, but he didn't find anything substantially wrong as I recall, so we weren't going to badly embarrass ourselves going to IETF at least. We were making fast progress on actually implementing IIM internally as well while this was happening, and getting buy in from the IT folks to insert my IIM code into the email pipeline. Finally holding our breath we went live with IIM in the mail pipeline. A little at first then a little more until we were signing and verifying signatures for an entire Fortune 100 company. A company that lives and dies by email, I'll add.
Domain Keys
(Photo: Tasman Adjacent)
We kept our feelers outside of Cisco and eventually found out that right down the street, a mile or two away at Yahoo!, Mark Delany was working on something called Domain Keys (DK) and had actually deployed it into their mail pipeline. What was remarkable about DK is how similar it was to IIM. He too was working on an internet draft documenting DK. DK also had a signing policy mechanism, but as I recall Mark saying it was more tentative and maybe aspirational, which makes sense from the perspective of an email provider. When we finally became aware of each other we started meeting in a larger group of interested people, maybe about a dozen, informally called the Email Signing Technical Group, to try to figure out what to do, both with the two I-D's and generally how to standardize something. Barry Leiba was part of that early group and, along with Stephen Farrell, would go on to be one of the DKIM working group chairs. Nothing is simple with the IETF world, and it takes time to agree on the color of the sky even on good days, so it's usually the best plan to have quite a bit of buy in and a coherent front for the inevitable push back and vested interests. Mark's DK worked. Ours worked. It was deployed just like ours was. They were both fundamentally doing the same thing.
The IIM draft was first published on June 3, 2004 and DK was published on June 24th, 2004. As I recall we both had live implementations running when we published our drafts. I don't know when Yahoo! started signing its outgoing mail, but I have always assumed it was before us, but who knows (and if you do, let me know and I'll update it).
The Fusion of DKIM
Mark being at Yahoo! was very service provider oriented. Our situation at Cisco, coming from an enterprise standpoint, was more complex, and the IIM draft laid out a bunch of the use cases that needed to be supported. It wasn't entirely clear whether they could be supported by DK or not. As I recall, we met with Mark at Cisco to see if we could hammer out a combined spec instead of the usual routine at IETF of having two competing drafts and the pissing matches that ensued. The pissing match was already happening in SPF-land with SenderId. There was a real engineering trade off between using DNS and using HTTPS. Security was easy for HTTPS, much more of a stretch for DNS. But DNS lookups are cheap vs. HTTPS and we kept going around on that, though neither of us was dogmatic. I liked DK's header syntax better as mine was a little overwrought. The big deal though was whether DK could do the enterprise-y things that we wanted.
After the meeting I thought about it for several days, reading the DK draft and comparing it to IIM and its use cases until I convinced myself that it was a product of convergent evolution; DK could just be extended for our needs. I bit the bullet and told our group we should just adopt the DK mechanism and add the things we needed. The lingering concern about HTTP performance was greater than the security concerns of DNS. The irony these days is that DNS over HTTPS (DoH) is now a thing so we're back to where we started with IIM: we could have used HTTP from a performance standpoint after all. The other part of basing it off of DK was tactical: Yahoo! was a big fish in the email world where Cisco was a barely hatched fry. That said, I think IIM had it right in the long run. DKIM gets knocked all of the time about DNS and the lack of deployment of DNSSEC. While I think that is overblown, you can't argue with the fact that setting up TLS on an HTTP server was a well known skill even in those days.
At that time we already had IIM deployed throughout Cisco and were starting to gather some stats for our stated goal of dealing with spear phishing. Part of the problem was identifying the sources of email in the company that were not routed through the Cisco mail pipeline, and that was daunting and proved something of an Achilles heel, though not entirely. DMARC's reporting facility would have been very helpful, but of course that requires wide deployment from other domains, and we didn't even have a merged protocol yet. Our main problem was with external mailing lists, of which we were painfully aware because that's how IETF did its business. I wrote a bunch of heuristics to recover signatures that went through mailing lists to see if they could be validated. I got tantalizingly close with about 90% recovery, but we had a lot of unsigned email from other sources so we couldn't take action.
(Photo: Where Eric Allman, Mark Delany, Jim Fenton, Jon Callas, Miles Libbey, and I hammered out DKIM at my place in San Francisco)
The combined spec was coming together. Eric Allman was given the editor's pen for the combined spec that was hammered out in my dining room in San Francisco with all of the named authors in attendance. When enough of DKIM was cobbled together I got to work converting my IIM implementation into DKIM. I had found out that Murray Kucherawy at Sendmail had a DK implementation written as a milter as well (it was never clear to me if that's what Yahoo! was using. Edit: Mark says it was Murray's milter). So the race was on. I got done enough that I sent Murray email (signed!) telling him I was ready to interop. Murray was right behind me and the next day we started to debug our implementations. Murray was at a big advantage because the protocol looked on the outside a lot like DK. Our main interop issue was me getting the NFWS body canonicalization correct as I recall. Beyond that I think we had interop possibly that day, but certainly within a few days.
As it turns out, that was a theme with lots of implementations to follow, and most importantly lots of interest across the industry. The next step was to take the combined DKIM draft to IETF. As I mentioned IETF is a painful process, and getting a working group spun up is always extremely difficult because everybody and their brother gets their $.02 worth in. DKIM had the advantage that it was a fully formed spec with a lot of vetting at that point from a lot of eyeballs as well as implementations. If I recall correctly it was at the Paris IETF in 2005 where we had our debutante's ball. There was a lot of sound and fury from the usual attack poodles. Jim got saddled with writing a threats informational RFC, much of it written sitting on the floor in the halls of the Paris IETF venue as I recall. The one thing I do recall out of all of the sound and fury was that Harald Alvestrand (then IETF chair) stood up saying this entire process was ridiculous and should just proceed. Thanks Harald!
I don't recall whether we actually were chartered in Paris, but I do remember filling up a friend's tiny restaurant with an assortment of IETF folks, with the wonderful food coming from her postage stamp kitchen including a chocolate mousse with cayenne. Everybody loved it. So anyway the working group was chartered, the threats draft was published and work began on what was already a pretty mature draft with a growing number of interoperable implementations. Probably the single biggest change to the original draft was the message body canonicalization. NFWS turned into "relaxed" for reasons I don't really recall. Relaxed seemed better, but not that much better and required us to re-interop. Oh well, something had to change. We did eventually have an in-person interop with probably 20 different implementations hosted by the affable Arvel Hathcock at Altn (now MDaemon) in Dallas. We were treated to a Brazilian restaurant where prodigious amounts of meat were consumed.
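For the curious, here is roughly what "relaxed" body canonicalization does as it eventually ended up in the published spec (RFC 6376, section 3.4.4, as I read it): strip whitespace at the end of each line, collapse runs of spaces and tabs to a single space, and drop empty lines at the end of the body. This is a minimal sketch written for this post, not code from my implementation or Murray's, and `relaxed_body` is just a name I made up for it.

```python
import re


def relaxed_body(body: bytes) -> bytes:
    """Sketch of DKIM 'relaxed' body canonicalization (assumes CRLF line endings)."""
    canonical = []
    for line in body.split(b"\r\n"):
        # Reduce every run of spaces/tabs within the line to a single space...
        line = re.sub(rb"[ \t]+", b" ", line)
        # ...then ignore whitespace at the end of the line.
        canonical.append(line.rstrip(b" "))

    # Ignore all empty lines at the end of the body.
    while canonical and canonical[-1] == b"":
        canonical.pop()

    # An empty body stays empty; otherwise the body ends with exactly one CRLF.
    if not canonical:
        return b""
    return b"\r\n".join(canonical) + b"\r\n"
```

The point of all this is that the body hash should survive the kind of harmless whitespace mangling that mail relays do, which is also why small disagreements here show up immediately as interop failures.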
So at this point DKIM was pretty well set and would go on to become a proposed standard RFC 4871 in mid 2007. Believe it or not, that was a good speed for IETF process, but we did have the advantage of an interoperable spec without any competing specs, or in IETF parlance the rough consensus and running code were there before the working group was formed. On the home front I continued to do experiments as we tried with increasing frustration to find all our sources of email.
SSP/ADSP
Early on, I believe after the working group formed, it was decided to split DKIM and SSP apart. That's a fine decision in retrospect -- they are two different on the wire protocols. But SSP elicited shall we say fervor from people who disliked it. It still seems to elicit similar fervor in its DMARC instantiation, which makes me wonder why the people who dislike it participate at all. But there was a lot of resistance to SSP, suffice it to say. It was at some point renamed ADSP for reasons lost to me, but for all of the bickering it remained pretty much the same SSP with some tag wordsmithing, I assume so as to justify the name change. One of the authors was even in the resistance crowd, which again makes you wonder why you'd work on something you don't support. To this day, DMARC, which is yet another bite at the apple, is fundamentally the same (modulo the reports) as ADSP. It also added support for SPF to be used as a policy check along with DKIM. As for DMARC, I really don't know why they went off to reinvent ADSP instead of just extending it, but it's possible that the, shall we say, fervent poisoned the well too much. One of them even wrote an article for a tech rag against its existence after participating -- mainly delaying -- in its production. Finally RFC 5617 was made a proposed standard in mid 2009.
That's All Folks
At Cisco we had deployed DKIM into the mail pipeline, but we were also working on a more ambitious project that could take multiple protocols and apply security to the various streams instead of just email. I was most intrigued with SIP because SIP has a lot of the same issues that email does, the inter-domain problem being the biggest. Since I had been working on VoIP stuff before DKIM, I still kept tabs on what was going on with SIP. SIP was at that time creating what was called the P-Asserted-Identity header, which supposedly told you what caller-id was being asserted. I was a regular Cassandra shouting at the top of my lungs that their assertion that voice would be an old-boys-network just like the old days was wrong and this was going to backfire on them since there was no authentication mechanism. I even hacked up a SIP stack and started DKIM-signing SIP INVITEs to prove it could be done with probably little or maybe no changes to DKIM. More later.
Cisco had decided a ways back that it was in fact interested in getting into the email security business. I did due-diligence for a number of companies including Ironport which was eventually chosen and integrated making our prototyping work redundant (they even participated at the Altn interop too). Both Jim and I had figured that we'd just move over to Ironport. Apparently we were too "Old Cisco" and both of us were rejected with myself at least labeled completely unqualified to write code or something like that. We just won you the fucking startup lottery, assholes. Thanks a fucking lot you ingrates. Have I mentioned how much I dislike puffed up egos?
Epilog
Off to Ski
My group (one that I was responsible for forming) had decided to go off on some wacky scheme with Skype which I had absolutely no input on and absolutely no interest. As I was looking around for something new to do at Cisco, I was also fascinated by having taken my Garmin GPS to Kirkwood skiing and dumping all of my points into Google Maps. I was completely fascinated by this with all of the possibilities of finding your friends on the mountain, seeing how fast you were going ("It can tell me how fast I'm going?!") and gaming with your friends. Since I didn't find anything interesting at Cisco and was bored, I left to go ski for 5 years in August 2008. Two months later Android came out and my adventure into phone apps began.
DKIM to STD
While I was off skiing the DKIM working group kept bumping along. While most RFC's stay at the proposed standard level, there is a complicated process to move it from proposed to draft standard and then to a full internet standard. By the time I decided to ski for a living, I had had it with the petty politics and stopped paying attention to the working group altogether and unsubscribed from the mailing list. I have no idea what happened in the intervening 3 years but in 2011 DKIM became STD 76. 76 makes it clear that there are not many protocols that make it to full standard. By the time I left, DKIM was already widely deployed at the major email providers with billions of pieces of email signed and verified every day.
One of the interesting things that came out of DKIM is that it implements a public key infrastructure (PKI) and is probably the second largest PKI next only to HTTPS/TLS. What I particularly like is that it shows that it is not inevitable that a PKI needs to use certificates. In fact DKIM shows that X.509 is particularly dated and unnecessarily complex with its CA's, ASN.1, blah blah blah. TLS is water under the bridge at this point, of course, but there seems to be some magical thinking that if you use asymmetric keys that certificates are required. DKIM proves that is emphatically wrong and that it can be just as simple as publishing the key/name in DNS or pointing to a web server to fetch it.
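To show how little machinery that actually takes: a DKIM public key is just a TXT record published at `<selector>._domainkey.<domain>`, containing a short list of `tag=value` pairs such as `v=DKIM1; k=rsa; p=<base64 key>`. The sketch below builds the name to query and parses such a record. It doesn't do the DNS lookup itself, the function names are made up for this post, and the selector, domain and key data in the usage note are placeholders, not a real key.

```python
def key_query_name(selector: str, domain: str) -> str:
    """DNS name where the key for this selector/domain pair is published."""
    return f"{selector}._domainkey.{domain}"


def parse_tags(record: str) -> dict:
    """Parse a 'tag=value; tag=value' list like the one in a DKIM key record."""
    tags = {}
    for part in record.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing ';'
        # Split on the first '=' only: base64 values may end in '=' padding.
        name, sep, value = part.partition("=")
        if not sep:
            continue  # not a tag=value pair; skipped in this sketch
        tags[name.strip()] = value.strip()
    return tags
```

So for a hypothetical selector `sel1` at `example.com`, a verifier would query `sel1._domainkey.example.com` for a TXT record, and `parse_tags("v=DKIM1; k=rsa; p=QUJDRA==")` hands back the key type and the base64 key material. No CA, no ASN.1, no certificate chain.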
Ah SIP, My Old Friend
As a sad case in point I submit to you STIR/SHAKEN (RFC 8226). My main beef with STIR is that it solves the wrong problem -- they are trying to determine whether somebody is allowed to assert a given telephone number rather than just hold the senders accountable as DKIM did. They also clung to the X.509 world, which made it much less comprehensible in the process. On top of that there are many classes of deployments that STIR can't address at all. The RFC was published in 2018, 10 years after I had shown that they could just reuse DKIM. STIR is so rife with errors and under-specification that I had to stop writing a blog post about it. If it flops -- and there is a good chance it may -- there is always the DKIM route, which has the benefit that it also solves for the non-bellheaded use cases in the From address.
ARC, WTF?
I had vaguely heard that a set of people created a standard which was a successor to ADSP, which at some point was brought to IETF as an informational RFC. I looked it over more carefully and it seems to unify policy with SPF, which is fine -- we didn't care about SPF at the time because they had their own policy mechanism, so why pick needless fights? It also has a reporting mechanism for when signatures fail, etc., which is in reality a completely different protocol than ADSP and gains no advantage from being tied to the ADSP policy mechanism.
That said, while looking at the headers of a message I happened to see a weird DKIM-like signature called ARC-Signature along with ARC-Seal and ARC-Authentication-Results. I joined the DMARC working group trying to figure out what this was about. There were a lot of fresh new-to-me faces on the working group, but also a lot of people who should have known better that ARC brings nothing more to the table than plain old DKIM. The main premise is that ARC is supposed to solve the mailing list traversal problem, or more generally intermediaries who invalidate the originating DKIM signature. There is definitely a lot of magical thinking, because when pressed on how ARC will do what DKIM supposedly can't, the answer is that it depends on the receiving domain to trust the ARC-Signature's domain. Doh. Uh, folks, that resolves to a previously unsolved problem, because intermediaries DKIM sign all of the time these days and there has been absolutely nothing stopping a receiving domain from trusting that domain for the past dozen years. I really can't understand how the IESG let this happen because it is really ill conceived, though it is just a (failed, imo) experimental RFC at least. Through the process, however, I have come to the conclusion we should just ignore the mailing list traversal problem and set p=reject and let the chips fall where they may. For the vast majority of domains it is unlikely to ever be a problem. I wrote a post here about why.
Conclusion
DKIM is definitely one of the biggest achievements of my life and I'm very proud of it. Starting from a kooky idea about feeding Bayesian filters, working up a fully fleshed out implementation and internet draft, finding convergent evolution just down the street and marrying them off instead of a protracted pissing match to a full internet standard 76. What a trip!
I recently came across a really interesting study about DKIM, SPF and DMARC showing what effects they have had: TL;DR not a silver bullet -- nothing is with spam -- but it's having a noticeable effect on the problem. It's an interesting if long read but worthwhile if you're into email security.