I posted a while ago with questions about SIP's STIR/SHAKEN work, which I had recently become aware of (RFC 7340 has a very good problem statement worth reading). Well, it turns out that it's really trying to shore up the miserable P-Asserted-Identity mess. I actually kind of like saying I told you so, so if that makes me a horrible person I'll own it. For SIP, P-Asserted-Identity is really nothing more than passing on the PSTN identities (caller-id, E.164 addresses). Which raises the question of whether you can trust their contents. The answer now is the same as the answer 15 years ago, and that answer is... no. The only real surprise is how long it took for $EVIL to figure this out.
For many, many reasons, trying to give some guarantees about whether somebody is allowed to assert a given E.164 address is a very hard problem. The new standards have had to deal with this and it's not pretty. That's not a knock on the work, it's just that the problem is really, really awful and hard: the PSTN never, ever envisioned the sort of trust model that has become common, or the financial incentives to not care about the problem. This is going to take a great amount of effort to roll out and that's just the beginning. $EVIL is not a static thing, and if I understand correctly there are some pretty significant holes that can't really be plugged.
Which got me to thinking. Why in the hell do I even care about E.164 addresses in this day and age? They are, actually, quite a nuisance. I can barely remember my own phone number, let alone anybody else's. SIP from the very beginning didn't really envision co-existing with the PSTN. It was a new way to use internet mechanisms instead of the inadequate PSTN standards. SIP was just like email, in that it had headers, one of which is a From: header that looks just like an email address. The idea is that if you wanted to email me, you'd use mike@mtcc.com. If you wanted to call me, you'd use mike@mtcc.com. Simple. It wasn't until telcos started getting interested in SIP that PSTN integration started to rear its ugly head. And hence the sorry situation we're in today.
So let's go retro for a moment. Maybe the original idea of using From: addresses wasn't such a bad one, and addresses of that form are certainly in wide use today. A lot has changed since the telcos butted into the VoIP world. For one thing, the PSTN is practically extinct. If it weren't for cell phones and the last mile it probably would be extinct. From what I can tell, it's IP the second it hits telco equipment. My little provider here in the Sierra has a gadget that terminates POTS and sends it out as SIP and RTP, has a DSLAM, backhauls IP over fiber, and is battery backed up from the CO. Pretty nifty that stuff I've worked on is a block or two away. I think it's pretty much the same for the cellular RAN. Since POTS is pretty much dead, that just leaves cellular. And if you believe the hype about 5G, the cellular voice network will be pretty redundant too, since 5G supposedly deals with jitter, latency and other things that make VoIP a little dodgy on 4G. I'm not sure of the exact details, but I'll take them at their word that VoIP will be pretty acceptable on 5G. Update: found out that VoLTE is a Thing. So PSTN stuff is now almost completely redundant.
So it's a pretty SIP-y world, and it's about to get a lot more so. If I have a SIP UA on my phone, I can completely decouple who provides the bits from who provides the rendezvous services. And I can guarantee you that the telcos are not going to be my first choice. So I may well get my wish that the From: address becomes what people expect on an incoming call, not PSTN anachronisms. So all is good, right? Well, no. Not quite. We still have the problem of spoofed addresses, but now the spoofing lands on the From: header instead of the P-Asserted-Identity header.
As far as I can tell (and I could be wrong because there's a mountain of SIP RFCs), there's really not a viable end-to-end, end-to-middle, or middle-to-middle way of asserting identity. Yes, I know there is an RFC for S/MIME, but client certs have never seen any wide adoption, and probably never will. And S/MIME is really about end-to-end crypto which, while useful, is not exactly the problem that SIP's version of the "caller id" problem is trying to solve.
What we learned with email back in the DKIM days is that end-to-end authentication is a hopeless task. Domain-based aggregation, on the other hand, seemed quite tractable. That is, a domain can claim responsibility for a particular message (email for DKIM) as having come from or passed through its infrastructure. The way we characterized it is that DKIM is a "blame me" mechanism: the domain takes the blame if something malicious happens with one of its users. The tradeoff that DKIM made, however, is that you really don't know if the user part of the email address is who they say they are. But for the purposes of reporting abuse that's not necessary: it's really the sending provider's problem to figure that out. As it turned out, a lot of providers, and probably all of the major providers nowadays, require SMTP auth. I'm not sure if there was any cause and effect from DKIM to adoption of SMTP auth, but it was certainly in the air at the time.
Now back to SIP. Given the spam we're seeing, it sure would be nice to have a "blame me" mechanism to see who injected a particular piece of voice spam into the SIP legs of the INVITE. Reputations can be aggregated at a domain level, and signing policies can be advertised for evaluation of the message. While I might not trust my provider on every front, our interests are aligned when dealing with spam and misuse. Even if I can't verify the incoming INVITE directly (say, you're on a 4G phone), I do trust that my provider can verify it on my behalf, and they could stuff the verified message's From: into the caller-id, or somesuch. With VoLTE, they're using SIP so you wouldn't even need to do anything heroic: just show the From: address.
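To make "reputations can be aggregated at a domain level" a bit more concrete, here is a rough sketch of what the bookkeeping might look like. Everything in it is an assumption for illustration: the `DomainReputation` class, the `from_domain` helper, and the idea of a simple complaint rate are mine, not anything from a SIP spec or a real product. The one design point it tries to capture is that only INVITEs whose signature verified get counted against a domain, since an unverified From: can be spoofed and proves nothing.

```python
import re

# Captures the host part of a sip:/sips: URI inside a From: header value.
# Simplified on purpose: IPv6 literals and other corner cases are not handled.
_SIP_URI = re.compile(r"sips?:(?:[^@>;\s]+@)?([^:;>\s?]+)", re.IGNORECASE)


def from_domain(from_header):
    """Return the lowercased domain of a SIP From: header value."""
    match = _SIP_URI.search(from_header)
    if match is None:
        raise ValueError("no SIP URI found in From: header")
    return match.group(1).lower()


class DomainReputation:
    """Hypothetical per-domain tally of verified calls and complaints."""

    def __init__(self):
        self._stats = {}  # domain -> [verified calls, complaints]

    def record(self, from_header, verified, complained):
        # An unverified From: could be spoofed, so it blames nobody.
        if not verified:
            return
        stats = self._stats.setdefault(from_domain(from_header), [0, 0])
        stats[0] += 1
        stats[1] += int(complained)

    def complaint_rate(self, domain):
        """Fraction of verified calls that drew a complaint, or None if unseen."""
        stats = self._stats.get(domain.lower())
        return None if stats is None else stats[1] / stats[0]
```

A provider could feed this from whatever does the verification and then decide what to do with domains whose complaint rate gets ugly. What the right threshold or policy is, I have no idea.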
A nice property of this is that the unpluggable holes with E.164 address security aren't a problem in a world that is becoming more and more native SIP. We should be looking forward to that future in addition to any backward-looking legacy problems. DKIM has been amazingly successful and is extremely widely deployed, handling tremendous volumes. And since email message structure is the template for many protocols including SIP, it should be pretty easily transferable. In fact, back in the day I actually wrote a SIP DKIM signer and verifier just for fun, so mechanically there aren't any problems.
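For the curious, the mechanics of a DKIM-style signature over a SIP message might look roughly like the sketch below. To be clear about what is made up: this is not the signer I wrote back then and it is not wire-compatible with DKIM. The list of signed headers and the function names are assumptions, and to keep it runnable with nothing but the standard library it uses an HMAC with a shared key as a stand-in for the real thing, where the domain signs with a private key and the verifier fetches the public key from DNS. The shape is the same though: hash the body, canonicalize a chosen set of headers, sign them together with the signature's own tags, and name the responsible domain in `d=`.

```python
import base64
import hashlib
import hmac

# Assumed set of headers worth covering; a real profile would need to pick these.
SIGNED_HEADERS = ("from", "to", "call-id", "cseq", "date")


def _canon(name, value):
    # Loosely modeled on DKIM "relaxed": lowercase name, collapse whitespace.
    return f"{name.lower()}:{' '.join(value.split())}\r\n"


def _signing_input(headers, names, prefix):
    lowered = {k.lower(): v for k, v in headers.items()}
    parts = [_canon(n, lowered.get(n, "")) for n in names]
    parts.append(prefix)  # the signature tags themselves, with b= left empty
    return "".join(parts).encode()


def _body_hash(body):
    return base64.b64encode(hashlib.sha256(body).digest()).decode()


def sign(headers, body, domain, key):
    """Return a signature value claiming `domain` is responsible for the message."""
    present = {k.lower() for k in headers}
    names = [n for n in SIGNED_HEADERS if n in present]
    prefix = f"v=1; d={domain}; h={':'.join(names)}; bh={_body_hash(body)}; b="
    # HMAC is a stand-in here; real DKIM uses a public-key signature.
    mac = hmac.new(key, _signing_input(headers, names, prefix), hashlib.sha256)
    return prefix + base64.b64encode(mac.digest()).decode()


def verify(headers, body, signature, keys):
    """Return the signing domain if the signature checks out, else None."""
    tags = dict(t.split("=", 1) for t in signature.split("; "))
    key = keys.get(tags["d"])  # stand-in for the DNS key lookup
    if key is None or tags["bh"] != _body_hash(body):
        return None
    names = tags["h"].split(":")
    prefix = f"v={tags['v']}; d={tags['d']}; h={tags['h']}; bh={tags['bh']}; b="
    mac = hmac.new(key, _signing_input(headers, names, prefix), hashlib.sha256)
    expected = base64.b64encode(mac.digest()).decode()
    return tags["d"] if hmac.compare_digest(expected, tags["b"]) else None
```

Note that `verify` hands back the domain, not a user. That is the "blame me" tradeoff: the signature says which domain is taking responsibility, and whether the user part of the From: is legit is that domain's problem.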
There are definitely questions to be answered though: Should methods other than INVITE be signed? Should the replies? I'm not sure what benefit there would be to signing REGISTER, for example, but it may be just as well to sign everything regardless of whether it's useful. And then there is the ever-present problem of B2BUAs (back-to-back UAs). Honestly, these aren't entirely different from the Mailing List Problem with DKIM. The answer there is that the entity in the middle that breaks the signature should re-sign the message. And it's probably not as bad a problem as with mailing lists because, if I understand correctly, B2BUAs are mostly being used as session border controllers, which are typically in the same domain as the sender. That is typically not the case with mailing lists.
In conclusion, while it might be worthwhile to solve the E.164 problem, we definitely need to look to a future where it eventually shrivels up and dies. The future is being able to verify the sending domain of SIP messages, and especially knowing whether the From: address checks out, which should be the case in a large percentage of signaling traffic. That would greatly help the voice spam problem since we would be able to reliably blame the sending domain.