So it appears our government wants to resurrect plans for the Interception Modernisation Programme, which was dropped by our previous government owing to massive cost and huge controversy. If indeed they do want to return to the same idea of recording every contact between individuals on the Internet (which was the main gist of the original proposal) then the next couple of years could get pretty interesting for the Internet in the UK.
One of the biggest issues with such a plan is the scope, or rather the boundaries of that scope. The plan was always only to track the who, when and where of Internet communications, not the what. This is similar to how phone records are currently stored: the intelligence services can pull up records on who contacted whom, when and (to a limited extent) where. So I know that Jane contacted Jill on December 21st at 5pm, and the communication happened from a landline phone (the address of which is known) to a mobile (which can be located using base-station triangulation).
This in itself is a massive amount of information which, properly analysed, can tell you a lot about a person’s movements over time. This information is obviously very valuable to the intelligence services as it allows them to determine what relationships groups of people have to one another by who they contact.
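To make the "who, when and where" concrete, here is a minimal sketch of what analysing such records might look like. The `CallRecord` fields, the names and the year are my own assumptions for illustration, not any real record format used by a phone company.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CallRecord:
    """Hypothetical call record: who, when and for how long. No content."""
    caller: str
    callee: str
    start: datetime
    duration_s: int


def contact_counts(records):
    """Count how often each unordered pair of parties was in contact."""
    counts = Counter()
    for record in records:
        pair = tuple(sorted((record.caller, record.callee)))
        counts[pair] += 1
    return counts


# Made-up sample data; the year is an arbitrary placeholder.
RECORDS = [
    CallRecord("Jane", "Jill", datetime(2010, 12, 21, 17, 0), 300),
    CallRecord("Jill", "Jane", datetime(2010, 12, 22, 9, 30), 120),
    CallRecord("Jane", "John", datetime(2010, 12, 22, 18, 15), 60),
]

COUNTS = contact_counts(RECORDS)
```

Even without any call content, `COUNTS` already says that Jane and Jill talk more often than Jane and John, which is exactly the kind of relationship mapping described above.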
This works for phones because they carry one medium, voice calls (text messaging works similarly, since it uses the same endpoints). When Jane calls Jill it is very easy to record the time, endpoints, duration and (if needed) location of the endpoints. The content of the call isn’t intercepted (unless the intelligence services have a wiretapping order) and cannot be analysed “after the fact”.
For communications on the Internet a similar form of recording does not work. This is because the content of a message (the what) and the details of who, when and where are too closely linked together. This is due to the nature of TCP/IP and “flows” of data on the Internet over higher level protocols.
For example, say Jane is communicating with Jill using an instant messaging application. Jane is on her home broadband connection, Jill is at work behind her corporate network. We intercept the communication at some midpoint (a so-called “black box” installed in the infrastructure of Jane’s ISP). What can we see? At the IP level we see a series of packets moving backwards and forwards between two endpoints. However, the endpoints are not easily identified as Jane and Jill.
At Jane’s end we’d see the public IP address of Jane’s broadband connection. This could potentially be the same for all members of Jane’s household, so intercepting only the source and destination IP addresses would lead to communications coming from Jane being confused with communications coming from her son, John. Similarly on Jill’s end it’s likely that the entire corporate network will be behind a NAT and firewall, and consequently appear as a single IP address. That one address could stand for thousands of employees.
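The household problem can be sketched with a toy address translator. This is a simplified model under my own assumptions (the `HomeNat` class, the port numbering and the documentation-range addresses are all made up), not how any particular router is implemented.

```python
class HomeNat:
    """Toy NAT: rewrites private (ip, port) pairs to one shared public IP."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.table = {}  # (private_ip, private_port) -> public_port

    def translate(self, private_ip, private_port):
        """Return the (public_ip, public_port) an outside observer sees."""
        key = (private_ip, private_port)
        if key not in self.table:
            self.table[key] = self.next_port
            self.next_port += 1
        return (self.public_ip, self.table[key])


nat = HomeNat("203.0.113.7")  # hypothetical public address of the household

# Jane's laptop and John's laptop, each on its own private address.
jane_seen_as = nat.translate("192.168.1.10", 51000)
john_seen_as = nat.translate("192.168.1.11", 51000)

# Both appear on the ISP side with the same source IP address.
same_public_ip = jane_seen_as[0] == john_seen_as[0]
```

Only the translation table inside the router knows which private machine each public port belongs to, and that table lives in the home, not at the ISP.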
Almost all IP endpoints in the UK are behind some kind of NAT. Very few home broadband connections lack this, and it makes simple IP-based tracking impossible. Similar issues are present for mobile data networks and ISP-level NAT, but these can be worked around by installing a “black box” in the network to track which subscriber maps to which IP address at any time. This technique could not be used for every home network (at least not without massive government interference, which I would hope people might rebel against!)
The other problem comes from the lack of any real meaning carried by an IP packet. All you can tell by looking at it is that it’s from somewhere (which could be “fake” info, e.g. a NAT address) and it’s going to somewhere (with the same issue). No other information is readily available without looking inside the packet. (You could probably tell what protocol it is employing from the source and destination ports in the transport header, but that isn’t particularly useful either.)
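To show how little is on offer, here is a rough sketch that decodes the fixed 20-byte IPv4 header of a hand-built packet. The sample packet and its addresses are invented for illustration; the field layout is the standard IPv4 one.

```python
import ipaddress
import struct

IPV4_HEADER = "!BBHHHBBH4s4s"  # fixed 20-byte IPv4 header, network byte order


def parse_ipv4_header(packet: bytes) -> dict:
    """Decode the fields an observer gets from the IP header alone."""
    (version_ihl, _tos, total_length, _ident, _flags_frag,
     ttl, protocol, _checksum, src, dst) = struct.unpack(IPV4_HEADER, packet[:20])
    return {
        "version": version_ihl >> 4,
        "header_len": (version_ihl & 0x0F) * 4,
        "total_length": total_length,
        "ttl": ttl,
        "protocol": protocol,  # 6 means TCP, 17 means UDP
        "src": str(ipaddress.IPv4Address(src)),
        "dst": str(ipaddress.IPv4Address(dst)),
    }


# Hand-built sample header (checksum left as 0, which is fine for a demo).
SAMPLE = struct.pack(
    IPV4_HEADER, 0x45, 0, 40, 1, 0, 64, 6, 0,
    ipaddress.IPv4Address("203.0.113.7").packed,
    ipaddress.IPv4Address("198.51.100.20").packed,
)

FIELDS = parse_ipv4_header(SAMPLE)
```

Nothing in `FIELDS` names a person or an application. You get two addresses, a protocol number and some housekeeping values, and that is all.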
Of course modern networking equipment can track “flows” of information, e.g. TCP sessions, and give more information about them. The IM conversation may take place over one or more TCP sessions between ports on the endpoints, and this session can be tracked. Without looking inside the content of the message, though, it is impossible to reliably say that it was Jane talking to Jill.
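Flow tracking of this kind usually boils down to grouping packets by protocol, addresses and ports. The sketch below is my own simplified version, with made-up packet dictionaries standing in for whatever a real device would capture.

```python
from collections import defaultdict


def flow_key(pkt):
    """Direction-independent key: protocol plus the two (ip, port) endpoints."""
    a = (pkt["src"], pkt["sport"])
    b = (pkt["dst"], pkt["dport"])
    return (pkt["proto"],) + tuple(sorted((a, b)))


def summarise_flows(packets):
    """Group packets into flows and total up their packet and byte counts."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for pkt in packets:
        stats = flows[flow_key(pkt)]
        stats["packets"] += 1
        stats["bytes"] += pkt["length"]
    return dict(flows)


# Invented sample traffic: one IM session in both directions, plus one other flow.
PACKETS = [
    {"proto": "tcp", "src": "203.0.113.7", "sport": 40000,
     "dst": "198.51.100.20", "dport": 5222, "length": 120},
    {"proto": "tcp", "src": "198.51.100.20", "sport": 5222,
     "dst": "203.0.113.7", "dport": 40000, "length": 80},
    {"proto": "tcp", "src": "203.0.113.7", "sport": 40001,
     "dst": "198.51.100.20", "dport": 443, "length": 500},
]

FLOWS = summarise_flows(PACKETS)
IM_FLOW = FLOWS[flow_key(PACKETS[0])]
```

The summary tells you that two machines exchanged 200 bytes over one session, but the keys are still just addresses and ports. Neither "Jane" nor "Jill" appears anywhere in it.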
For the same kind of tracking plan to work for all forms of Internet communication as it does for phones, the tracking equipment will need to understand the higher level protocols being used by each messaging system. They’ll need to understand SMTP, to tell from whom and to whom email is being sent. They’ll need to understand proprietary protocols (e.g. Skype). They’ll need to understand things like Facebook too.
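As a small taste of what "understanding SMTP" means, this sketch pulls the sender and recipients out of the client side of an unencrypted SMTP session. The transcript and addresses are invented, and the function is a hypothetical helper of mine, nowhere near a complete SMTP parser.

```python
import re

_MAIL_FROM = re.compile(r"^MAIL FROM:\s*<([^>]*)>", re.IGNORECASE)
_RCPT_TO = re.compile(r"^RCPT TO:\s*<([^>]*)>", re.IGNORECASE)


def smtp_envelope(client_lines):
    """Extract (sender, recipients) from SMTP client commands before DATA."""
    sender = None
    recipients = []
    for line in client_lines:
        line = line.strip()
        if line.upper() == "DATA":
            break  # everything after this is message content, the "what"
        match = _MAIL_FROM.match(line)
        if match:
            sender = match.group(1)
            continue
        match = _RCPT_TO.match(line)
        if match:
            recipients.append(match.group(1))
    return sender, recipients


# Invented client-side transcript of a plain-text SMTP session.
SESSION = [
    "EHLO mail.example.org",
    "MAIL FROM:<jane@example.org>",
    "RCPT TO:<jill@example.com>",
    "DATA",
    "Subject: hello",
]

ENVELOPE = smtp_envelope(SESSION)
```

Note that the equipment has to read the application data to get this far, and it has to stop at exactly the right line to avoid reading the message itself. Every other protocol needs its own equivalent of this, and if the session is upgraded with STARTTLS none of these commands are visible from a midpoint at all.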
The other thing they’ll have to understand is the links between these services and the people using them. They’ll need to correlate Jane’s Facebook, MSN, Skype and Twitter accounts with her phone, postal address and so forth. Quickly the amount of information, and the complexity of obtaining it, grows enormously compared to simply logging phone calls.
One other issue is tracking non-immediate forms of communication, for example postings on message boards or social networking services. Here a person can make a post which is viewed by thousands of other people. If these connections are not tracked then these kinds of systems would be easy ways to exchange information. The message “My cat Jess just had kittens” may seem innocuous enough but it could easily be code directing a terrorist cell to launch an attack. These kinds of techniques have been used since before the Cold War and they are made all the more effective by the sheer size of the Internet. The larger the haystack the harder it is to find the needle.
All this isn’t even taking into account encryption, which will present a real impediment to tracking communications. IPsec can encrypt the contents of IP packets, meaning that none of this information will even be available. Many protocols have crypto built into them at higher levels as well (the most commonly used for most people being SSL/TLS, as used in HTTPS). It’s true that many people don’t know anything about encryption, but it is very easy to find out about it and to start to use systems which conceal not only what you are saying but who you are talking to as well (for example, Tor).
It’s also true that the people most likely to use such technologies are likely to be the ones who have something to hide. The phrase “if you have nothing to hide you have nothing to fear” works both ways – if you have something to fear you damn well better hide it. Criminals and terrorists know how to use cryptographic software and steganography techniques to hide their communications – the only reason we catch these people at the moment is that they are careless or incompetent (and we have to be thankful that they are!)
It’s my opinion that this plan is doomed to failure one way or another, and we can only hope that it fails before much money is spent on it. My main objection to it isn’t on cost grounds, or the civil liberties issues it raises (though both of these things I object to strongly…); it’s that it is just such an impossibly stupid proposal.
It won’t work and cannot work. I only hope that others realise this soon.