Showing posts with label DKIM. Show all posts

Wednesday, December 30, 2020

Are Mailing Lists Toast?

Definitely Toast
 

From the very beginning, when IIM (Cisco's email authentication draft) was merged with DK (DomainKeys from Yahoo!) to become DKIM, we both envisioned a sender signing policy module which allowed a domain to say "we sign all of our mail, so if you have unsigned mail or the signature is broken that's purportedly from us, that's not cool". Since we were all experienced with internet standards, it was plain as day that there was a serious deployment problem, since mailing lists mangle messages and thus break signatures. Mailing lists would thus throw a wrench into the policy gears. This was 16 years ago.

Our effort at Cisco was driven primarily by phishing, and spear phishing in particular. We had heard tell of some exec at another company falling for a spear phishing attack, and we didn't figure our execs were any more clueful so that was pretty frightening. Since Cisco had exactly no presence in email at all, it also gave some plausible deniability as to what we were up to. We weren't looking to get into the email biz, but we weren't not looking either. 

We formed a group which included Jim Fenton and me, and created a project to sign all of the mail at Cisco with a goal of being able to annotate suspicious email purporting to be coming from Cisco employees. This required finding and signing all legitimate Cisco email. So off we went trying to find all of the sources of unsigned email in the company so that we could route it through the DKIM signing infrastructure. We didn't have the nifty reporting feature of DMARC so it wasn't the easiest thing to figure out. It was made much worse because Cisco had tons of acquisitions so there was a lot of legacy infrastructure left over from them, and who knows whether they were still using their mail servers or not. This was very slow going to say the least.

Most of the DKIM working group was pretty cavalier and dismissive about the mailing list problem. A DKIM signature from the mailing list would somehow solve the problem. We just needed to somehow trust that mailing list. Coming back after a dozen years and a lot of skiing under my belt, it seems that the previously unsolved problem remains unsolved: nobody knows how to "trust" a mailing list.

When I was still at Cisco I used a bunch of heuristics to try to answer the question of whether the originating domain signature could actually be reconstructed and verified. It was met with a lot of derision and hysteria from the usual attack poodles, but I didn't care and kept trying to improve my recovery rate, ultimately achieving northward of 90% recovery. It was interesting and tantalizing that the false positive rate was close enough to be worth considering marking suspicious mail up with warnings. The next step would have been to differentiate what that broken signature traffic was, where it was coming from, what it was doing that was not reversible and ultimately whether we cared about that case enough. There was no silver bullet that we could find, and we definitely didn't know how to "trust" mailing lists.
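As an illustration only (this is not the heuristic set described above), here is a minimal Python sketch of one kind of recovery attempt: undo a list footer and see whether the body hash matches the `bh=` value again. The relaxed body canonicalization follows RFC 6376; the footer marker, the function names and the sample message are all hypothetical, and a real signature check would also have to cover the signed headers.

```python
import base64
import hashlib
import re


def relaxed_body(body: bytes) -> bytes:
    """Relaxed body canonicalization (RFC 6376, section 3.4.4)."""
    out = []
    for line in body.split(b"\r\n"):
        line = re.sub(rb"[ \t]+", b" ", line)  # collapse runs of whitespace
        out.append(line.rstrip(b" "))          # drop whitespace at end of line
    while out and out[-1] == b"":              # ignore empty lines at end of body
        out.pop()
    return b"\r\n".join(out) + b"\r\n" if out else b""


def body_hash(body: bytes) -> str:
    """Base64 SHA-256 of the canonicalized body, comparable to a bh= tag."""
    digest = hashlib.sha256(relaxed_body(body)).digest()
    return base64.b64encode(digest).decode("ascii")


# Hypothetical footer separator; real lists differ, which is the hard part.
FOOTER_MARKER = b"\r\n_______________"


def strip_list_footer(body: bytes) -> bytes:
    """Cut everything from the last footer marker onward, if one is present."""
    idx = body.rfind(FOOTER_MARKER)
    return body if idx < 0 else body[: idx + 2]  # keep the CRLF before the marker


def recoverable(body: bytes, signed_bh: str) -> bool:
    """Try the body as received, then with the footer removed."""
    candidates = [body, strip_list_footer(body)]
    return any(body_hash(c) == signed_bh for c in candidates)


original = b"Hi all,\r\n\r\nSee the draft.\r\n"
signed_bh = body_hash(original)  # what the originating domain would have signed
mangled = original + b"_______________\r\nexample mailing list\r\n"

print(body_hash(mangled) == signed_bh)  # False: the footer broke the hash
print(recoverable(mangled, signed_bh))  # True: stripping the footer restores it
```

Each transformation that a list applies (subject tags, footers, MIME rewrapping) would need its own reversal like this, and anything not reversible stays broken.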

So as we slogged on with hunting down our infrastructure and me still hacking on the heuristics, Cisco decided that it was interested in the email security angle that we had pioneered. We did diligence on a bunch of companies and settled on Ironport, just down the peninsula from me. Cisco bought them, and the group I was with decided -- without my input -- that they were switching to some wacky telephony thing that I had no interest in. When Ironport wouldn't transfer me, I wandered around a while and then decided it was time to quit and ski. The one thing I regret is that we didn't get a chance to finish off the research side of the problem, especially with mailing lists, to separate out the enormity of the remaining problem, which seemingly nobody knows to this day.

Fast forward 12 years and I found these curious creatures called an ARC-Signature and an ARC-Seal in my email headers. The signature looked pretty much identical to a DKIM signature, which I found very odd, and it also had an ARC version of the Authentication-Results header. So what's going on? I found that ARC is an experimental RFC which was seemingly trying to solve the mailing list problem. Again. But using even more machinery in the form of the ARC-Seal. What is the Seal's purpose? I determined it is so that it can bind the ARC auth-res to the ARC signature. Why are they doing that? Because it's supposed to be of some value to the receiving domain to know what the original assessment of the originating DKIM signature was. But DKIM can already do that if a mailing list re-signs the email. It was all a mystery.

So wondering whether there was some secret sauce that I had completely overlooked all those years ago, I posted on the DMARC working group list for the first time asking what that secret sauce might be. "It requires the receiver to trust the mailing list" I was told. I said that you can do that right now if the mailing list re-signs the mail with DKIM, which I assume most do at this point (and screw them if they don't). Why does this require a different signature than a DKIM signature? They wanted to bind the ARC auth-res to the signature, I was told. Why can't you just add a new tag to DKIM, assuming that is a problem at all, which I am not very convinced it is? Never got a good answer for that. And most importantly, why does the receiver care about the original auth-res in the first place? Never got a good answer for that either.

So this all boils down to trusting a mailing list at its heart. It's not clear to me whether some receivers are using the list's DKIM signature to bind it to some whitelist or reputation service. Somebody as big as Google could easily roll their own and just not tell anybody about it since it's completely proprietary just like the rest of the spam filtering. So on the outside we really don't know whether mailing list reputation is a Thing or not, but the assumption that the working group seems to be operating on is that it is not a Thing and remains a previously unsolved problem. I am willing to believe that assumption since it seemed like a hard problem all those years ago too. That and Google itself is participating with ARC, so that suggests they aren't any better off than anybody else. But who knows? Part of the problem of being opaque is that nobody on the outside can scrutinize whether there is some magic thinking going on, or whether there actually is some there there.

So here we are over a decade and a half after DKIM's inception, right back where we started. As far as I can tell, ARC doesn't bring much of anything new to the table and the original auth-res doesn't address the fundamental problem of trusting the mailing list or not. Whether the original signature verified on the list or not seems completely beside the point: if I trust the mailing list, why do I even care whether it verified or not? If they are forwarding messages without valid originating signatures, that is a very good reason to not trust them for starters. Any reputation system needs to take into account a lot of factors, and requiring signature verification at the mailing list seems like table stakes.

Mailing lists. Again. We are completely wrapped around the axle: we are able to provide pretty good forgery protection against mail from malicious domains, but domains can't seem to pull the trigger on asking receiving domains to toss mail without valid signatures because it would cause mailing list traffic to be discarded as well. There have been other drafts beyond my experiment on signature recovery that have not been well received, and have languished as they probably should. The worst part of this problem is that there is no way to determine what success looks like. How many false positives are acceptable? How can we assess quality and quantity? Cisco, for example, would grind to a halt if its upper level engineers working on IETF standards ceased to work because it implemented a p=reject policy (I just checked, it's p=quarantine at 0 percent, which is basically p=none with a little bit of attitude).
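For readers who haven't stared at DMARC records, here is a rough Python sketch of why a quarantine policy at 0 percent behaves like p=none. The record string is a made-up example rather than any real domain's record, the function names are hypothetical, and the fallback table reflects my reading of RFC 7489 (mail not selected by `pct` gets the next weaker policy).

```python
# Policy applied to mail that the pct sampling does NOT select.
FALLBACK = {"reject": "quarantine", "quarantine": "none", "none": "none"}


def parse_dmarc(record: str) -> dict:
    """Split a DMARC TXT record into a tag -> value dictionary."""
    tags = {}
    for part in record.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    return tags


def describe(record: str) -> tuple:
    """Return (requested policy, percent it applies to, policy for the rest)."""
    tags = parse_dmarc(record)
    policy = tags.get("p", "none").lower()  # assumption: treat a missing p= as none
    pct = int(tags.get("pct", "100"))       # pct defaults to 100 when absent
    return policy, pct, FALLBACK.get(policy, "none")


# Hypothetical record for illustration only.
record = "v=DMARC1; p=quarantine; pct=0; rua=mailto:dmarc-reports@example.com"
policy, pct, rest = describe(record)
print(f"{pct}% of failing mail gets '{policy}', the rest gets '{rest}'")
```

With `pct=0` nothing is selected for quarantine, so every failing message falls through to `none`.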

Been toast a little too long

So pretty much we're in the same trenches as we were from the beginning with no forward progress, and no forward progress on the horizon. It then caused me to question the unthinkable: are mailing lists actually worth saving? They are ancient technology that never had security constraints from the beginning and they are operating in a world where security is a Must requirement. Likewise, it's not like there aren't alternatives to mailing lists. Web based forums have been around for decades, so it doesn't even require new infrastructure. Lots of things that once were a Thing are now gone. Take Usenet for example. Usenet was revolutionary in many ways and provided the first thing we would recognize as social media on the nascent internet. It's been sidelined by the new social media companies, but one of its biggest problems was that it couldn't adapt to spam for whatever reason. Probably not technical and more likely neglect, but it is basically dead now. The world moved on, and now it's just a relic of the past. It's not even that the new tech has to be particularly good: Reddit is the most similar to Usenet of the social media platforms and it is a terrible and buggy reimplementation of Usenet, yet it is popular and Usenet is done.

So are mailing lists a relic of the past too? Going by the total volume of email, it sounds like they're down in the noise from what I've heard -- like 1% -- but it would be good if some of the big providers stepped up with some concrete numbers. For many companies, perhaps most, access to external mailing lists wouldn't even be missed at all. Those of us in the internet community are profoundly attached to mailing lists because that's how business is done. But let's be serious here: we are complete outliers. Changing over to some off the shelf forum software, while not trivial, is certainly well within the capability of the internet community if it needed to. The same goes for other lists. It's quite possible that a hybrid system could even be done in transition where mail is gateway'd to the forums or some such.

In conclusion, the bad security properties of mailing lists and the like are causing better security to not be deployed. So yes, we should just ignore mailing lists and let the people who run them adapt however they see fit. After 15+ years since the advent of DKIM-like technologies and the ability to determine who is sending mail and discover what that domain desires on receipt of mail that doesn't verify, we need to just move on and accept that the desire for verifiable mail from domains is more important than a legacy technology with ample alternatives.

This is not to say that mailing lists need to be burned to the ground or anything drastic. I've noticed that some lists have taken to re-writing From headers seemingly from domains with a DMARC policy other than "none". This is pretty awful from a security standpoint because it further trains people to rely on the pretty name rather than the email address, but to be fair that is just building on an existing problem and MUA's are the real culprit of a lot of this since they do little to help people know who they are actually talking to. That said, none of this would be necessary if we used a more modern technology.

Sorry mailing lists. But it's come to this.








Saturday, February 15, 2020

SIP: what about the From: header? No love?

I posted a while ago with questions about SIP's STIR/SHAKEN stuff (RFC 7340 has a very good problem statement worth reading) that I became aware of. Well, it turns out that it's really trying to shore up the miserable P-Asserted-Identity mess. I actually kind of like saying I told you so, so if that makes me a horrible person I'll own it. For SIP, P-Asserted-Identity is really nothing more than passing on the PSTN identities (caller-id, e.164 addresses). Which begs the question of whether you can trust their contents. The answer now is the same as the answer 15 years ago, and that answer is... no. The only real surprise is how long it took for $EVIL to figure this out.

For many, many reasons trying to give some guarantees about whether somebody is allowed to assert a given e.164 address is a very hard problem. The new standards have had to deal with this and it's not pretty. Not a knock on the work, it's just that the problem is really, really awful and hard: the PSTN never, ever envisioned the sort of trust model that has become common, or the financial incentives to not care about the problem. This is going to take a great amount of effort to roll out and that's just the beginning. $EVIL is not a static thing, and if I understand correctly there are some pretty significant holes that can't really be plugged.

Which got me to thinking. Why in the hell do I even care about e.164 addresses in this day and age? They are, actually, quite a nuisance. I can barely remember my own phone number, let alone anybody else's. SIP from the very beginning didn't really envision co-existing with the PSTN. It was a new way to use internet mechanisms instead of the inadequate PSTN standards. SIP was just like email, in that it had headers, one of which is a From: header whose addresses look identical to email addresses. The idea is that if you wanted to email me, you'd use mike@mtcc.com. If you wanted to call me, you'd use mike@mtcc.com. Simple. It wasn't until telcos started getting interested in SIP that PSTN integration started to rear its ugly head. And hence the sorry situation we're in today.

So let's go retro for a moment. Maybe the original idea of using From: addresses wasn't so bad an idea, and it is certainly widely in use today. A lot has changed since the telcos butted into the VoIP world. For one thing, the PSTN is practically extinct. If it weren't for cell phones and the last mile it probably would be extinct. From what I can tell, it's IP the second it hits telco equipment. My little provider here in the Sierra has a gadget that terminates POTS and sends it out as SIP and RTP, has a DSLAM and backhauls IP over fiber, and is battery backed up from the CO. Pretty nifty that stuff I've worked on is a block or two away. I think it's pretty much the same for the cellular RAN networks. Since POTS is pretty much dead that just leaves cellular. And if you believe the hype about 5G it will be pretty redundant since it supposedly deals with jitter, latency and other things that make VoIP a little dodgy on 4G. I'm not sure of the exact details, but I'll take them at their word that VoIP will be pretty acceptable on 5G. Update: found out that VoLTE is a Thing. So PSTN stuff is now almost completely redundant.

So it's a pretty SIP-y world, and it's about to get a lot more. If I have a SIP UA on my phone, I can completely decouple who provides the bits from who provides the rendezvous services. And I can guarantee you that the telcos are not going to be my first choice. So I may well get my wish that the From: address becomes what people expect on an incoming call, not PSTN anachronisms. So all is good, right? Well, no. Not quite. We still have the problem of spoofed addresses, but now it's put on the From: header instead of the P-Asserted-Identity header.

As far as I can tell (and I could be wrong because there's a mountain of SIP RFC's), there's really not a viable end to end or end to middle or middle to middle kind of way of asserting identity. Yes, I know there is an RFC for S/MIME, but client certs have never seen any wide adoption, and probably never will. And S/MIME is really about end to end crypto which, while useful, is not exactly the problem that SIP's version of the "caller id" problem is trying to solve.

What we learned with email back in the DKIM days is that end-to-end authentication is a hopeless task. Domain based aggregation, on the other hand, seemed quite tractable. That is, a domain can claim responsibility for a particular message (email for DKIM) as having come from or passed through its infrastructure. The way we characterized it is that DKIM is a "blame me" mechanism if something malicious happened with one of its users. The tradeoff that DKIM made, however, is that you really don't know if the user part of the email address is who they say they are. But for the purposes of reporting abuse that's not necessary: it's really the sending provider's problem to figure that out. As it turned out, a lot of providers and probably all of the major providers nowadays require SMTP auth. I'm not sure if there was any cause and effect from DKIM to adoption of SMTP auth, but it was certainly in the air at the time.

Now back to SIP. Given the spam we're seeing it sure would be nice to have a "blame me" mechanism to see who injected a particular piece of voice spam into the SIP legs of the INVITE. Reputations can be aggregated at a domain level, and signing policies can be advertised for evaluation of the message. While I might not trust my provider on every front, our interests are aligned when dealing with spam and misuse. Even if I can't verify the incoming INVITE directly (say, you're on a 4G phone), I do trust that my provider can verify it on my behalf and they could stuff the verified message's From: into the caller-id, or somesuch. With VoLTE, they're using SIP so you wouldn't even need to do anything heroic: just show the From: address.
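To make the "just show the From: address" idea a bit more concrete, here is a small Python sketch of the bookkeeping a provider might do after verifying a domain signature on an INVITE: pull the domain out of the From: header and check that it lines up with the domain that signed. There is no standard DKIM-for-SIP that I'm assuming here; the signing domain is simply passed in, the alignment rule is a simplified one borrowed from the email world, and all names and the sample message are hypothetical.

```python
import re
from typing import Optional


def from_domain(sip_message: str) -> Optional[str]:
    """Return the host part of the From: URI in a SIP message, or None."""
    for line in sip_message.split("\r\n"):
        if line == "":  # a blank line ends the headers
            break
        name, _, value = line.partition(":")
        if name.strip().lower() in ("from", "f"):  # "f" is the compact form
            match = re.search(
                r"sips?:(?:[^@>;\s]+@)?([^>;:\s]+)", value, re.IGNORECASE
            )
            return match.group(1).lower() if match else None
    return None


def aligned(signing_domain: str, sender_domain: str) -> bool:
    """Simplified check: same domain, or sender is a subdomain of the signer."""
    d = signing_domain.lower().rstrip(".")
    s = sender_domain.lower().rstrip(".")
    return s == d or s.endswith("." + d)


# Hypothetical INVITE; only the headers matter for this sketch.
INVITE = "\r\n".join([
    "INVITE sip:bob@example.net SIP/2.0",
    "Via: SIP/2.0/UDP pc33.example.com;branch=z9hG4bK776asdhds",
    "From: Alice <sip:alice@example.com>;tag=1928301774",
    "To: Bob <sip:bob@example.net>",
    "Call-ID: a84b4c76e66710@pc33.example.com",
    "CSeq: 314159 INVITE",
    "",
    "",
])

sender = from_domain(INVITE)
print(sender)                          # example.com
print(aligned("example.com", sender))  # True
```

If the signature verifies and the domains line up, the provider has something it can reasonably display; if not, the call can be annotated or scored accordingly.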

A nice property of this is that the unpluggable holes with e.164 address security aren't a problem in a world that is becoming more and more native SIP. We should be looking forward to that future in addition to any backward looking legacy problems. DKIM has been amazingly successful and extremely widely deployed with tremendous volumes. And since email message structure is the template for many protocols including SIP, it should be pretty easily transferable. In fact, back in the day I actually wrote a SIP DKIM signer and verifier just for fun, so mechanically there aren't any problems.

There are definitely questions to be answered though: Should verbs other than INVITE be signed? Should the replies? I'm not sure of what benefit there would be to signing REGISTER, for example, but it may be just as well to sign everything regardless of whether it's useful. And then there is the ever-present problem of B2BUA's (back to back UA's). Honestly, these aren't entirely different than the Mailing List Problem with DKIM. The answer there is that the entity in the middle that breaks the signature should re-sign it. And it's probably not as bad a problem as with mailing lists because, if I understand correctly, B2BUA's are mostly being used as session border controllers which are typically in the same domain as the sender, which is typically not the case with mailing lists.

In conclusion, while it might be worthwhile to solve the E.164 problem, we definitely need to look to a future where it eventually shrivels up and dies. The future is being able to verify the sending domain of SIP messages, and especially knowing whether the From: address checks out, which should be the case in a large percentage of signaling traffic. That would greatly help the voice spam problem since we would be able to reliably blame the sending domain.