Note: This is part three of a four-part series where security expert Jon Callas breaks down the fatal flaws of a recent proposal to add a secret user — the government — to our encrypted conversations. Part two can be found here.
The latest intelligence community proposal for circumventing encryption suggests a scheme that would enable surveillance on otherwise securely encrypted communications by secretly adding an extra user — the government — to a conversation.
The GCHQ authors insist that the proposal does not “break encryption.” They’re wrong. As multiple computer scientists and security experts, including Seth Schoen, Susan Landau, Matt Green, and Bruce Schneier, have explained, the “ghost user” encryption back door won’t work as promised. It is hard enough to build secure tools; it is impossible to build technology that is impenetrable for everyone except the “right people.”
In this third post of a four-part series on the GCHQ “Ghost User” proposal, I address a technological assertion that the authors make in their essay: the idea that adding a secret listener to a conversation is just like attaching “crocodile clips” to a phone wire. This analogy is a rhetorical device designed to normalize the idea of a secret listener-in. It is wildly inaccurate from a technological point of view, and it serves to obscure major flaws with the GCHQ authors’ idea: The “ghost user” proposal won’t keep conversations private from unauthorized attackers, and criminals and terrorists will readily be able to defeat the tool.
Crocodile Clips Are Nearly Extinct
The GCHQ authors wax nostalgic, insisting that all they’re asking for is for the world “to go back a few decades” and enable investigators to apply “virtual crocodile clips” to internet communication platforms. After all, long ago, police could sneak over to the phone wires running outside of your home and convert a two-way call into a three-way call (with the government as the third party) using crocodile clips. Crocodile clips, called alligator clips here in the United States, are small metal clips that one can attach temporarily to an electrical connection.
The metaphor may be simple and easy to understand, but it doesn’t work here. True, the early telephone system could be wiretapped with crocodile clips and a tape recorder. But this method quickly proved ineffective, to say the least. Since the early 1990s, when the first digital phone switches were deployed, wiretaps have been conducted centrally via special interception technology that phone companies are obligated to use under federal law. As charming as the crocodile clip metaphor is, for almost three decades the reality of wiretapping has been that it happens through (yes) a mandated backdoor in the telephone architecture, one which itself creates security vulnerabilities.
More importantly, the Internet is radically different in design from the phone network. The telephone system is centralized — an expensive, top-down network connects to cheap, dumb devices in our homes and offices. In contrast, the Internet is decentralized — instead of a centralized network connecting dumb end-user devices, it’s a dumb network connecting smart devices. Today’s smartphones are pocket computers that replace our music players and televisions — and, by the way, also make phone calls.
That difference in design means that the centralized wiretapping that works on the old telephone system will not work for the Internet. Communications on the Internet can transmit at any time, take different paths, change their paths while in motion, and send information out of sequence.
This brings me to the most important point: On the decentralized Internet, there is no place to put virtual crocodile clips except for on our personal smartphones, computers, and other end-point devices. When Alice and Bob talk to each other, you can either listen on Alice’s device or Bob’s. There is no one path the communication follows, no one network router or switch that carries all of Alice’s and Bob’s conversation.
Crocodile in the Machine
Now you might argue that modern Internet communications are far more centralized than they used to be. Today’s “client-server” architecture looks a lot more like the phone network than like the decentralized Internet of old. Communications travel, not from Alice to Bob, but from Alice to Facebook, then to Bob. WhatsApp, the GCHQ proposal’s ripest target for a “ghost user” implementation, with its 1.5 billion users securely communicating via end-to-end encryption, operates over Facebook servers. So, if you’re GCHQ, Facebook seems to be a good place to add a secret listener.
That assumption is not accurate, though, because that’s not the way today’s cryptography works. Surveillance of otherwise encrypted conversations cannot take place purely at the centralized office (Facebook) because all the work, from encryption to emojis, is done on the users’ devices, not on Facebook’s network. Eavesdroppers have to put the “ghost user” code on cell phones and computers, too. This gives surveillance targets an easy way to evade interception: look for and delete that kind of code on their own devices.
This is a bit complicated; bear with me while I explain why this is true.
Two’s Company, Three’s a Conference
Today’s cryptography works point-to-point, between two parties, even when there are more than two people in the conversation. Without belaboring the math, here’s a common scenario. When you and I want to talk to each other securely, the communications software on each of our smartphones selects a random number that will serve as a key. The number my phone chooses is what I want you to use as an encryption key when you talk to me, and the number your phone chooses is the key I will use to talk to you. The largest technical problem of encryption is how we tell each other our respective keys, which cryptographers call “key exchange.”
There are a number of ways to do key exchange, and they all follow a similar path. The most common one we use today (called Diffie-Hellman) works like this: I do some math on my key. I send you the result of that math. You take the thing I sent you along with your key, and do some more math. You then send that result back to me. At the end of this dance, I know your key and you know mine. Importantly, we can do this dance completely in public and still, no one else knows our keys.
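For readers who want to see the dance in motion, here is a minimal sketch of textbook Diffie-Hellman. The small prime `P`, the base `G`, and the helper names are assumptions chosen purely for illustration; real apps use vetted groups or elliptic curves and authenticate the exchange. Note one simplification in the description above: in the textbook form, the two sides end up computing the same shared secret rather than literally learning each other's private numbers.

```python
import secrets

# Toy public parameters (illustrative assumptions, far too weak for real use).
P = 2**127 - 1   # a Mersenne prime
G = 3            # public base

def make_keypair():
    """Pick a random private number and compute the public value sent on the wire."""
    private = secrets.randbelow(P - 3) + 2   # random value in [2, P - 2]
    public = pow(G, private, P)              # "some math on my key"
    return private, public

def shared_secret(my_private, their_public):
    """Combine my private number with the value the other side sent me."""
    return pow(their_public, my_private, P)

# Each side keeps its private number and publishes only the public value.
alice_private, alice_public = make_keypair()
bob_private, bob_public = make_keypair()

# Both sides arrive at the same secret, though it never crossed the network.
alice_secret = shared_secret(alice_private, bob_public)
bob_secret = shared_secret(bob_private, alice_public)
print(alice_secret == bob_secret)  # True
```

An eavesdropper sees only `alice_public` and `bob_public`; with properly sized parameters, recovering the secret from those two values is believed to be computationally infeasible.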
This key exchange is a miracle of the modern age and it is what makes private Internet communications possible. The encrypted communications protocol that we all use most, TLS, is implicitly two-party. The cryptographic models that describe TLS’s security guarantees depend explicitly on the fact that there are only two parties involved. Authentication interfaces and user-visible indicators also make the same assumption.
If that’s true, you might ask, how do group chats on our texting apps work securely? Well, many of the present apps (like WhatsApp, iMessage, Signal, and others) emulate a multi-party conversation by encrypting each message for each device of each participant. While the exact details vary from one app to the next, the basic principle is that I am in a two-party conversation with each of the people in the chat at the same time, and they are also in a two-party conversation with each other. Alice is talking to Bob, Alice is talking to Carol, Bob is talking to Carol — all at the same time. So when Alice sends a message, she’s sending it to both Bob’s and Carol’s devices, even though to all three, it appears they’re all in the same “room.”
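The fan-out described above can be sketched in a few lines. This is an illustration under stated assumptions, not how any particular app does it: the device names, the `fan_out` helper, and the toy SHA-256 keystream cipher are all hypothetical stand-ins (real apps use authenticated ciphers and ratcheting keys). The point is only that one "group" message becomes a separate ciphertext per recipient device, each under its own pairwise key.

```python
import hashlib
import secrets
from itertools import count

def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with SHA-256(key + nonce + counter) blocks.
    Illustration only; not secure for real use."""
    stream = bytearray()
    for block_index in count():
        if len(stream) >= len(data):
            break
        stream.extend(hashlib.sha256(key + nonce + block_index.to_bytes(8, "big")).digest())
    return bytes(a ^ b for a, b in zip(data, stream))

def fan_out(pairwise_keys: dict, plaintext: bytes) -> dict:
    """Encrypt one message separately for every recipient device."""
    envelopes = {}
    for device, key in pairwise_keys.items():
        nonce = secrets.token_bytes(16)
        envelopes[device] = (nonce, keystream_xor(key, nonce, plaintext))
    return envelopes

# Alice's device holds one pairwise key per other device in the chat.
alice_keys = {
    "bob-phone": secrets.token_bytes(32),
    "carol-phone": secrets.token_bytes(32),
}
envelopes = fan_out(alice_keys, b"lunch at noon?")

# Bob's device decrypts its own envelope with the key it shares with Alice.
nonce, ciphertext = envelopes["bob-phone"]
bob_plaintext = keystream_xor(alice_keys["bob-phone"], nonce, ciphertext)
print(bob_plaintext)  # b'lunch at noon?'
```

Notice that the sender's device must hold a key for every recipient. Adding a listener means adding an entry to `alice_keys`, which is exactly why an extra party cannot be bolted on at the server alone.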
This fact leads to a raw truth: in order to have end-to-end encryption with multiple ends, each and every end has to know about all the other ends. When a new person joins the conversation, every device of every person in the conversation must establish a new, encrypted two-party connection with the new person. Let’s say GCHQ joins the conversation between Alice, Bob, and Carol. Alice’s, Bob’s, and Carol’s cell phones each must now make an encrypted connection to GCHQ. Even under developing message encryption standards, every participant’s device requires a complete and accurate roster of the group and their keys.
Ghostly Footsteps
The GCHQ authors know this. But a government’s exceptional access endpoint must be invisible to everyone in the conversation lest the participants stop speaking freely. What good is a wiretap that everyone knows about? So the proposal includes something else: the app has to lie to the user.
Currently, secure apps alert participants when someone joins or leaves a call, and they all make different choices about what and how much to tell their users. Signal’s app notifies everyone about every update to the participant list. Apple’s iMessage system tells the account owner about a change to their own devices; if I get a new phone, for example, my laptop and my tablet both display that there is a new device in my account, and thus my conversations. In the past, WhatsApp has not told people about device changes, but this is changing as part of their security improvements. These kinds of notifications are integral to maintaining a secure communications network in the modern age, and they are widely recognized as such. Without coming out and saying it, the GCHQ essay proposes that this security best practice would have to stop.
Even if that happened, participants would still be able to find out when they’ve made new connections to the spooks. An app might suppress user notification of a “ghost user” joining a conversation, but the device still has all the information it needs for a technologically savvy person to find out whether an interloper is there. After all, in order to transmit the messages of interest, the smartphone must be connected in a two-party conversation with that ghost and have the ghost’s key stored in its memory. Like the Wizard of Oz, a government agent may be behind the curtain, but they must be in the same room as Dorothy and her friends.
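To make that concrete, here is a hedged sketch of the kind of check a savvy user or researcher might run. Everything in it is hypothetical: the `session_keys` table, the device names, and the `unexpected_keys` helper stand in for whatever an actual app stores, which varies by app and is not documented here. The idea is simply to compare the keys the device actually holds against the roster the app chose to display.

```python
import hashlib

def fingerprint(public_key: bytes) -> str:
    """Short, human-comparable digest of a public key."""
    return hashlib.sha256(public_key).hexdigest()[:16]

def unexpected_keys(session_keys: dict, displayed_roster: set) -> dict:
    """Return sessions the device holds keys for that the app never showed the user."""
    return {
        name: fingerprint(key)
        for name, key in session_keys.items()
        if name not in displayed_roster
    }

# Hypothetical contents of the device's key store for one conversation.
session_keys = {
    "bob-phone": b"bob-public-key",
    "carol-phone": b"carol-public-key",
    "device-7f3a": b"unknown-public-key",   # the silent extra participant
}
# What the app's user interface claims the conversation contains.
displayed_roster = {"bob-phone", "carol-phone"}

extras = unexpected_keys(session_keys, displayed_roster)
print(sorted(extras))  # ['device-7f3a']
```

The app can hide the notification, but it cannot hide the key: the device needs it in order to encrypt to the ghost at all.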
In sum, both the network architecture of smart devices on a decentralized network and the mathematics of encryption force the ghost to be on the participants’ devices. The crocodile metaphor describes a situation where the eavesdropper is not present, yet listening in. In reality, that situation breaks down completely, leaving nothing but a nostalgic, rhetorical spin.
In my next essay, I tie all of these threads together and show how a “ghost user” will inevitably be exposed, rendering the proposal worthless.
Further Reading
As you have noticed, this issue is incredibly complex and hard to summarize: the fundamental way we communicate changed first with the computer revolution and again with the Internet. Here are some additional resources:
Steven M. Bellovin, Matt Blaze, Susan Landau, Stephanie K. Pell, “It’s Too Complicated: How the Internet Upends Katz, Smith, and Electronic Surveillance Law”
Vassilis Prevelakis and Diomidis Spinellis, “The Athens Affair: How some extremely smart hackers pulled off the most audacious cell-network break-in ever”
Whitfield Diffie and Susan Landau, “Privacy on the Line: The Politics of Wiretapping and Encryption”