
February 26, 2016

A look at the latest French laws on intelligence collection


For the second time we have an article written in cooperation with the French weblog about intelligence and defence Zone d'Intérêt:


Introduction

Over the last year, the French parliament passed new laws granting additional powers to intelligence services regarding interception of communications and data requests. This is part of a broader reform aimed at creating a legal framework for intelligence practices which were not formally authorized by law before 2015. In the press, it was said that these laws allowed sweeping new surveillance powers, legalizing highly intrusive methods without guarantees for individual freedom and privacy.

This article will focus on the provisions related to communications intelligence (COMINT), including targeted telephone tapping (lawful interception or LI), metadata collection and data requests to internet service providers (ISPs). Targeted interception of the content of internet communications is not regulated by these new laws, but only by older decrees which remain somewhat unclear. For internet communications, the new laws only cover the collection of metadata.

In France, communications interception is authorized under two distinct frameworks:
- Judicial interceptions ordered by a judge of inquiry (juge d'instruction) during a criminal investigation. These interceptions can be done by the police, the gendarmerie (a military force charged with police duties) and by the security service DGSI.

- Administrative interceptions, also known as security interceptions, which are requested by both the domestic security and the foreign intelligence services.

Administrative interceptions are approved by the Prime Minister for various motives, such as defending and supporting major national interests including national defense, foreign policy interests, economic and industrial interests, or preventing terrorism and organized crime. Whereas the United States strongly denies conducting commercial espionage in the sense of stealing trade secrets for the benefit of individual companies, France is known for being less strict on this.



Diagram of the various interception capabilities of French intelligence
(Diagram: ZonedInteret.net)


The main French security and intelligence services are:
Direction Générale de la Sécurité Intérieure (DGSI), which reports to the Interior Ministry and is responsible for domestic security. It has some 3500 employees and an annual budget of 300 million euros. DGSI was formed in 2008 through the merger of the Direction Centrale des Renseignements Généraux (RG) and the Direction de la Surveillance du Territoire (DST) of the French National Police.

Direction Générale de la Sécurité Extérieure (DGSE), which reports to the Minister of Defence and is responsible for collecting foreign intelligence on civilian issues and also performs paramilitary and counterintelligence operations abroad. DGSE is responsible for both HUMINT and SIGINT.

Direction du Renseignement Militaire (DRM), which reports directly to the Chief of Staff and to the President of France as supreme commander of the French military. DRM is responsible for collecting military intelligence in support of the French armed forces.

Direction de la Protection et de la Sécurité de la Défense (DPSD), which is also part of the Ministry of Defence. DPSD is responsible for the security of information, personnel, material and facilities of the armed forces as well as the defence industry.



Headquarters of the French foreign intelligence agency DGSE in Paris



A special advisory commission on intelligence activities

The French laws, such as Loi n° 2015-912 and Loi n° 2015-1556, from July and November 2015, grant the Prime Minister full authority to order and approve intelligence activities both domestic and foreign. Each collection request is sent by the intelligence service director to its parent ministry and to the Prime Minister, who gives final approval. An advisory commission known as the CNCTR (Commission Nationale de Contrôle des Techniques de Renseignement, or National Commission for the Control of Intelligence Techniques) is kept informed of all requests for oversight purposes.

In most cases, before the Prime Minister can approve a request, this control commission must receive information related to its approval, including the request justification, the identity and location of the targeted individual, or any other identifying information (occupation, username, etc.) when his identity is unknown.

The CNCTR consists of nine members: four from the Parliament, two from the Council of State, two from the Court of Cassation, and one appointed telecommunications expert. This commission is considered an "Independent administrative authority": it is neither part of the Parliament, even though members of Parliament are among its members, nor part of the judicial branch, even though some of its members are magistrates.

The CNCTR only holds advisory power, as it cannot stop any decision from the Prime Minister regarding data requests or intelligence collection. The commission can express disapproval of a collection request, but the Prime Minister can overrule this advice and still authorize intelligence collection.

The CNCTR can access all transcripts and logs from intelligence collected under the Prime Minister's authority, but it cannot compel any intelligence service to hand over documents or information, and it cannot investigate any irregularity on its own. However, it can make recommendations regarding intelligence procedures and bring any irregularity to the Council of State. All debates inside the commission, as well as all its communications with the Prime Minister and intelligence services, are classified.

Journalists, lawyers and members of parliament have been granted a special status: when intelligence requests apply to them, the CNCTR must be informed just before collection starts so it can state whether the collection is necessary and proportionate. The CNCTR must also receive transcripts of the intercepted communications afterwards. The difference with eavesdropping operations against regular citizens is that for them, the CNCTR can access the transcripts if it asks for them, while for the privileged professions, the CNCTR must receive and review them.

In theory, any individual living in France or abroad can ask the CNCTR to check if he has been placed under surveillance following proper procedure. The control commission must check for any irregularity, but can neither confirm nor deny to the individual that he has been placed under such surveillance. The commission only states that proper verification has been made, and if any irregularity is detected it can report it to the Council of State.



Headquarters of the French domestic security service DGSI in Paris
(Photo: Bertrand Guay/AFP)


New provisions for domestic intelligence collection

This section applies to all main intelligence services such as DGSI, DGSE and DRM. DGSE is a foreign intelligence service, which is not supposed to operate on French territory, but it is authorized to request data and intercept domestic communications. DGSE holds most technical capabilities for decryption and high-end communications collection and provides other agencies, such as DGSI or DRM, with technical means and expertise in this regard.

A recent decree provided authority to more than twenty police and gendarmerie services, some of which are not officially intelligence services, to intercept communications and request data, mostly for counterterrorism purposes. Allowing police services to collect communication intelligence is a shift from older French habits, which the French government justified by the ongoing terrorist threat.

As in most countries, French law provides higher privacy protection to its own citizens and to people communicating from France than to people communicating from abroad, who receive little legal protection against intelligence collection. Intelligence collection under the Prime Minister's approval may apply to all electronic means of communication traced to a targeted individual, from mobile phones to landlines, to all metadata from his internet service provider, and even metadata from online services.

In France, telephone companies, ISPs and online service providers can be compelled to provide a wide range of metadata regarding a targeted user, including: technical data related to the identification of connection or subscription numbers (phone numbers, IP addresses, etc.), a list of all connection or subscription numbers linked to a targeted individual, location data of all devices traced to a targeted individual, and call detail records (CDR).

Under the Prime Minister's authority, telephone companies can be compelled to cooperate with intelligence services conducting targeted interceptions of phone calls. French intelligence services are not supposed to carry out interceptions on their own, but have to go through a dedicated government technical agency called GIC (Groupement Interministériel de Contrôle or Interministerial Control Group).

The GIC operates under the Prime Minister's direct authority, receiving approved requests and ordering telephone companies and ISPs to provide information or access to their networks for interception. Providers compelled to cooperate are forbidden to reveal any information related to interceptions or data requests, or to inform their users that they have been targeted. Provider personnel refusing to cooperate could be sentenced to a fine of €150,000 and up to two years of imprisonment.

The parliament recently authorized intelligence services to use devices such as IMSI-catchers to identify and locate mobile phones or computers linked to targeted individuals. Intelligence services can only use IMSI-catchers to collect metadata, and all collected data unrelated to specified targets must be destroyed.

Regarding domestic communications, voice communication recordings must be destroyed 30 days after collection, but transcripts can be kept "as long as necessary" by intelligence services. Metadata requested from ISPs and telcos can be stored for up to 4 years. Intercepted communications that are encrypted can be stored for up to 6 years.



The French satellite intercept station at the Tontouta naval air base
near Noumea on the main island of New Caledonia
(Photo: Google Earth)


A loose framework for the surveillance of foreign communications

Fewer restrictions apply to the surveillance of foreign communications, whether collected by the domestic security service DGSI, the foreign intelligence service DGSE or one of the military agencies.

The Prime Minister issues broad authorizations to intelligence services to monitor and collect communications, either for whole geographical regions, countries, organizations or individuals. The Prime Minister specifies which types of communication networks can be targeted for collection. These authorizations last for 4 months, but they can be renewed without restriction.

Foreign intercepted communications can be kept for 1 year after processing, or up to 4 years after collection. Collected metadata can be stored for 6 years. Encrypted data can be stored for up to 6 years after decryption, or up to 8 years after it has been collected. With these retention periods, French law is stricter than, for example, American law, which allows NSA to store encrypted data for an unlimited period of time.
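To make these overlapping limits easier to compare, here is a minimal Python sketch that computes the date by which foreign data would have to be destroyed. It assumes that when two limits are given, the one that expires first applies; that reading, the category labels and the function names are ours and are not taken from the law itself.

```python
from datetime import date

# Limits in years as quoted in this article for foreign communications.
# The category labels and field names are made up for this sketch.
FOREIGN_LIMITS = {
    "content": {"after_collection": 4, "after_processing": 1},
    "metadata": {"after_collection": 6, "after_processing": None},
    # For encrypted data, "processing" stands for decryption.
    "encrypted": {"after_collection": 8, "after_processing": 6},
}


def add_years(day, years):
    """Shift a date by whole years, mapping 29 February to 28 February if needed."""
    try:
        return day.replace(year=day.year + years)
    except ValueError:
        return day.replace(year=day.year + years, day=28)


def destroy_by(category, collected, processed=None):
    """Last date the data may be held, assuming the earliest limit wins."""
    limits = FOREIGN_LIMITS[category]
    deadline = add_years(collected, limits["after_collection"])
    if processed is not None and limits["after_processing"] is not None:
        deadline = min(deadline, add_years(processed, limits["after_processing"]))
    return deadline
```

For example, under these assumptions content collected on January 1, 2016 and processed on June 1, 2016 would have to be destroyed by June 1, 2017, well before the 4-year ceiling.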


From French territory

The law on surveillance of foreign communications only applies to communications between users who are outside of France, but which are collected from French territory. Here it should be noted that many former French colonies spread around the globe are also considered part of French territory, and French law applies there, especially as this is stated in the latest intelligence laws.

This means that these laws not only apply to data collected from major fiber-optic cables and satellite intercept stations inside France, but also to those from the overseas satellite stations like those in French Guiana, on the island of New Caledonia in the South Pacific and on Mayotte in the Indian Ocean - providing French intelligence with global SATCOM coverage probably second only to that of the Five Eyes partnership. After ECHELON, this French network was dubbed FRENCHELON.

If data is collected under the foreign communications status, but is then traced back to domestic communications (call number or subscription located in France), it can be processed only if approved under the domestic communications framework, or it must be destroyed within 6 months.



The DGSE satellite intercept station near Kourou in French Guiana,
which was built in cooperation with German BND
(Image: Google Maps)


Outside French territory

Intelligence collection conducted by French intelligence services outside of France is not restricted by law. Because the overseas satellite stations are considered to be on French territory, this situation only applies to, for example, covert eavesdropping operations in foreign countries, as well as to tactical SIGINT collected through land, sea and airborne platforms during military operations abroad. French armed forces are based in countries such as Mali, Gabon, Djibouti and the UAE. This will mainly result in the collection of communications for military purposes.

While this kind of collection is not regulated by law, it will be limited by the available resources and the specific goals set by the government in the annual PNOR (Plan National d’Orientation du Renseignement or National intelligence orientation plan), a classified document sent to the chiefs of intelligence services and to the parliamentary delegation for intelligence (DPR - Délégation Parlementaire au Renseignement), which only receives a redacted version of this document.



A French army vehicle for collecting tactical SIGINT and ELINT in Afghanistan
(Photo: ageat.asso.fr)


Automated bulk metadata collection

In July 2015, a law introduced a new automated bulk metadata collection system against terrorism. The Prime Minister can order French internet service providers to add specified metadata collection and filtering systems to their networks. He can issue such orders for 2 months, and they can be renewed without restriction. Data collected on ISPs' networks can be stored for up to 60 days, and would be filtered and processed by government-issued algorithms to detect terrorism-related threats. If such a threat is detected, the Prime Minister can compel ISPs to identify related users.

The development of threat-detection algorithms, and their so-called "black boxes", should be done under supervision from the CNCTR. However, providing oversight at the hardware and software level could be very difficult, especially as the algorithms would be updated and modified very regularly, and it would also require specialized knowledge of such internet filter systems.

The scope and purpose of this metadata provision is largely a mystery. At first sight it may look similar to what NSA did by collecting domestic telephone records in order to find unknown terrorist associates by contact chaining. But if that was the purpose of this French law too, then it would have been much easier to order the ISPs to hand over their metadata in bulk, just like it happened in the US.

Actually, French telecommunications and internet service providers already have to store their customers' metadata for at least one year under the EU data retention directive. Moreover, a French legal decree even requires web hosting companies, like Facebook, Google and Amazon, to store their user data for at least one year and provide it to government authorities at their request. However, these metadata may only be used for targeted investigations, as intelligence services must provide specific requests to ISPs and web hosting companies with either the full name of a target, its user name, IP address or other identifying information.

It seems that installing "black boxes" at ISP networks serves the bulk collection of smaller sets of data: they filter traffic using specific threat-detection algorithms, so they will likely only pull in those metadata that match certain communication patterns and routines, based on digital forensics from counterterrorism investigations. The metadata would then be used to identify the users showing such patterns.

Given the very high data rates of traffic passing through internet service providers, such filter systems are very expensive, and ISPs generally don't like external systems to be plugged into their networks. That makes it surprising that the orders for installing them are valid for just 2 months, and although they are renewable without any limitations, it's not clear whether these "black boxes" would be removed from ISPs' networks at the end of each order, or if they would only be turned off until further notice.



Cyber defense

Interestingly, filtering internet traffic using threat-detection algorithms sounds very much like detecting and preventing malware and cyber attacks. But except perhaps for the case in which a terrorist group conducts cyber attacks, the law explicitly states that this "black box" metadata filtering and collection system can only be used to detect terrorist threats. It cannot be used for any other purpose, including cybersecurity, counterintelligence or criminal investigations.

Nonetheless, the cyber domain did receive special attention from French lawmakers in the latest regulations on intelligence. All collected intelligence which is related to cyber attacks can be stored indefinitely for technical analysis. In addition, all penalties for computer hacking and cyber-related crimes have been doubled as part of the new Law on Intelligence passed in July 2015. This fits a general shift of intelligence agencies towards "cyber", as for example in the US, where cyber threats have replaced terrorism as the top priority for the intelligence community since 2013.



Links and Sources
- New York Times: French Inquiry Urges Changes to Intelligence Services in Light of Failures
- The Guardian: France passes new surveillance law in wake of Charlie Hebdo attack
- Matthew Aid: French SIGINT: Part II
- Overview of French intercept sites: Comment on peut, en trois clics, découvrir la carte des stations d'écoute des espions de la DGSE

February 13, 2016

How NSA contact chaining combines domestic and foreign phone records

(Updated: September 18, 2017)

In the previous posting we saw that the domestic telephone records, which NSA collected under authority of Section 215 of the USA PATRIOT Act (internally referred to as BR-FISA), were stored in the centralized contact chaining system MAINWAY, which also contains all kinds of metadata collected overseas.

Here we will take a step-by-step look at what NSA analysts do with these data in order to find yet unknown conspirators of foreign terrorist organisations.

It becomes clear that the initial contact chaining is followed by various analysis methods, and that the domestic metadata are largely integrated with the foreign ones, something NSA never talked about and which only very few observers noticed.

What is described here is the situation until the end of 2015. The current practice under the USA FREEDOM Act differs in various ways. The information in this article is almost completely derived from documents declassified by the US government, but these have various parts redacted.


 

RAS-approval

As a seed for starting a contact chain, NSA analysts can take a telephone identifier like a phone number (also called a selector), based upon:
- their own ongoing analysis on an existing target set;
- a Request for Information (RFI) from another government agency;
- a notification of a match between a known counterterrorism-related selector and an identifier among newly ingested phone metadata.

Access to the domestic phone records was granted to about 125 intelligence analysts from the Homeland Security Analysis Center (HSAC, or S2I4) of the NSA's Signals Intelligence Directorate. There were also up to 22 specially trained officials called Homeland Mission Coordinators or HMCs (initially shift coordinators).

As required by the FISA Court orders, only these HMCs, the chief and the deputy chief of the HSAC are allowed to determine that there is a Reasonable, Articulable Suspicion (RAS) that a certain selector is associated with a designated foreign terrorism group and/or Iran. Such a RAS-approval is only needed for the domestic phone records, not the ones collected overseas.

NSA has a special RAS Identifier Management System to streamline the adjudication of the requests for RAS approval and the documentation thereof. The codename of this system is IRONMAN, as we learn from a declassified 2011 training presentation (.pdf) in which this codeword was left unredacted in two places:



A RAS-approval is effective for one year, meaning that during the next year, repeated queries using the approved seed selector can be made. If the selector is reasonably believed to be used by a US person, the approval period is 6 months.

The number of RAS-approved identifiers varied substantially over the years, but in 2012, there were fewer than 300. According to the annual Transparency Report from the Director of National Intelligence (DNI), there were 423 such selectors in 2013, but just 161 in 2014. It's not known how many of these belonged to Americans.
 


Different kinds of queries

From various declassified documents analysed in an article on the weblog EmptyWheel, it becomes clear that there are three different kinds of queries that NSA analysts conducted on the domestic phone records database:
1. Queries for data integrity purposes
2. Queries for "Ident lookups"
3. Queries for contact chaining

In the EmptyWheel article it's assumed that besides these queries, NSA also conducted some kind of pattern analysis: in many declassified documents a redaction appears right after the term "contact chaining", which according to EmptyWheel could hide something like "pattern analysis".

Given that in these documents the targets are also redacted, there's also the possibility that the redaction hides a description of the target, like "contact chaining al-Qaida affiliates".

At least one NSA memorandum from 2009 indeed speaks about "chaining and analysis", but there can be two kinds of analysis: one conducted on the bulk of raw metadata records, and another one on selected results of contact chaining.

NSA always denied that it conducts pattern analysis on the bulk metadata themselves, stating that every search begins with a specific telephone number or other specific selection term. So far, there are no indications to the contrary, so the analysis apparently refers to the results of contact chaining queries, which is confirmed by the 2014 report (.pdf) about the Section 215 program by the Privacy and Civil Liberties Oversight Board (PCLOB).

As we will see later on, this second type of analysis is indispensable for making the contact chaining queries useful for foreign intelligence purposes.




(1) Data integrity queries

The first way the domestic phone records were queried was for data integrity purposes. This was done by some 25 specialized Data Integrity Analysts (DIAs). They didn't conduct target analysis, but helped intelligence analysts with questions on a target. For those cases, a DIA could use a standard login (with appropriate controls) to query the phone records for foreign intelligence purposes.

However, when they queried for data integrity purposes, DIAs used a special login that bypassed the normal controls (like EAR) and also the auditing. This was because, for this task, they were allowed to use identifiers that were not RAS-approved (although selectors that had expired because they were not revalidated remained off limits).

One goal of these data integrity queries was to discover selectors that, for reasons that were redacted in the review report, should not become part of analysis, both for BR FISA and other purposes. These selectors could then be added to a defeat list of identifiers that were deemed to be of little analytic value, and/or to a database holding those that should not be tasked onto the collection system.

There was of course a risk of mixing up these tasks, and after an expired identifier had been queried in March 2010, the NSA Inspector General recommended that the duties of DIAs and foreign intelligence analysts should be clearly separated.


(2) Ident lookup queries

A second kind of query was for so-called "ident lookup". According to an NSA Inspector General test report (.pdf) from April 2010, this refers to:
"querying a selector using [tool name redacted] to determine the approval status of a selector. In such cases, the Emphatic Access Restriction controls will prevent chaining of a selector that is not marked as approved for querying, and return an error message to the analyst. Because the selector was not actually chained, there is no violation of the Order"

Emphatic Access Restriction (EAR, pronounced as "ear") is a tool that was installed at the MAINWAY database in February 2009. It automatically prevents using a selector that is not RAS-approved. It seems therefore that when an analyst started a query and the seed selector appeared to be not approved, that query was called an "ident lookup" (although EmptyWheel has a different interpretation).
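The behaviour described here, combined with the approval periods mentioned earlier, can be illustrated with a small Python sketch. It is only a toy model of the idea: the class and function names are hypothetical, the real tool is redacted in the declassified documents, and "six months" is approximated as 180 days.

```python
from dataclasses import dataclass
from datetime import date, timedelta


class SelectorNotApproved(Exception):
    """Raised instead of chaining when a seed has no valid RAS approval."""


@dataclass(frozen=True)
class RasApproval:
    selector: str
    approved_on: date
    us_person: bool = False

    def expires_on(self):
        # One year, or six months for a selector believed to be used by a
        # US person (the day counts are an approximation for this sketch).
        return self.approved_on + timedelta(days=180 if self.us_person else 365)


def require_approval(selector, approvals, today):
    """Return the approval for a seed, or refuse before any chaining happens."""
    approval = approvals.get(selector)
    if approval is None:
        raise SelectorNotApproved(f"{selector}: not RAS-approved")
    if today > approval.expires_on():
        raise SelectorNotApproved(f"{selector}: RAS approval expired")
    return approval
```

The point of the design is that the check runs before the query touches the metadata, so a refused selector is never actually chained, which matches the Inspector General's remark that in such cases there is no violation of the Order.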

This could be the way it worked before the IRONMAN system was established, as in a training module from 2011, it is said that by then, analysts just had to "use [tool name redacted] to determine the identifier’s approval status".
 


(3) Contact chaining queries

The most important queries on the domestic phone records were of course those conducted by intelligence analysts in order to "identify unknown terrorist operatives through their contacts with known suspects, discover links between known suspects, and monitor the pattern of communications among suspects".

For this, an analyst took a RAS-approved selector (often a telephone number) and entered it into a specialized metadata tool, which searched the telephone metadata in the MAINWAY contact chaining system. To limit the number of results, the analyst could set a certain timeframe for the query.

The metadata tool then returns "a .cml file, usually referred to as a chain, which is made up of the individual first hop contacts of the seed". Usually, the analyst will also be interested in the second-hop contacts, and then the tool will retrieve the batches of one-hop chains for the identifiers that had been in direct contact with those from the first hop series.
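In graph terms, this is a breadth-first search that stops after a fixed number of hops. The Python sketch below shows the principle on simple call records; the record layout and function names are assumptions made for illustration and say nothing about how MAINWAY or the redacted query tool are actually implemented.

```python
from collections import defaultdict
from datetime import date


def build_contact_index(records, start=None, end=None):
    """Index (caller, callee, day) records as an undirected contact graph.

    The optional start and end dates model the timeframe an analyst can set.
    """
    contacts = defaultdict(set)
    for caller, callee, day in records:
        if start is not None and day < start:
            continue
        if end is not None and day > end:
            continue
        contacts[caller].add(callee)
        contacts[callee].add(caller)
    return contacts


def chain(contacts, seed, max_hops=2):
    """Return each reachable identifier with its hop distance from the seed."""
    hop_of = {seed: 0}
    frontier = [seed]
    for hop in range(1, max_hops + 1):
        next_frontier = []
        for identifier in frontier:
            for contact in sorted(contacts.get(identifier, ())):
                if contact not in hop_of:
                    hop_of[contact] = hop
                    next_frontier.append(contact)
        frontier = next_frontier
    return hop_of
```

With records linking A to B, B to C and C to D, a 2-hop chain from seed A returns B at hop 1 and C at hop 2, while D only appears when a third hop is allowed.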



Number of hops

Based upon the FISA Court orders, NSA analysts were also allowed to retrieve the numbers in contact with all the numbers from the second hop, which would make a third hop. The software tools are said to prevent looking beyond the third hop, or performing a query of a selection term that has not been RAS-approved.

The initial authorizations under the President's Surveillance Program (PSP) did not prohibit chaining more than two degrees of separation from the target, but "NSA analysts determined that it was not analytically useful to do so".* When this collection was brought under supervision of the FISA Court, it limited contact chaining to 3 hops.

But despite that authorization, the policy of NSA's Counter Terrorism branch restricted chaining to 2 hops, as can be seen in an NSA training presentation (.pdf) from 2007:


A 2011 training module says that chaining to a third hop is possible, but only after prior approval by the analyst's division management (for example when a contact that comes up with the first hop appears to be an already known suspect).

Strangely enough, neither a government white paper nor the PCLOB report mentions this policy restriction, and in the latter it's even assumed that chaining 3 hops was regular practice:
"If a seed number has seventy-five direct contacts, for instance, and each of these first-hop contact has seventy-five new contacts of its own, then each query would provide the government with the complete calling records of 5,625 telephone numbers. And if each of those second-hop numbers has seventy-five new contacts of its own, a single query would result in a batch of calling records involving over 420,000 telephone numbers"

As of 2012, the FISA Court also allowed an automated chaining process in which "the NSA's database periodically performs queries on all RAS-approved seed terms, up to three hops away from the approved seeds. The database places the results of these queries together in a repository called the 'corporate store'". NSA was never able to get that working though (although the PCLOB report, again, describes it as if it was actually implemented).
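The arithmetic in the PCLOB example quoted above is easy to check. The short Python sketch below follows the same simplifying assumption as the report, namely that every number has the same count of contacts and that all of them are new at each hop; the function names are ours.

```python
def new_numbers_at_hop(contacts_per_number, hop):
    """Numbers first reached at a given hop, assuming no overlap between contacts."""
    return contacts_per_number ** hop


def total_numbers(contacts_per_number, hops):
    """All numbers reached from the seed within the given number of hops."""
    return sum(new_numbers_at_hop(contacts_per_number, h) for h in range(1, hops + 1))


print(new_numbers_at_hop(75, 2))  # 5625
print(new_numbers_at_hop(75, 3))  # 421875
print(total_numbers(75, 3))       # 427575
```

These are upper bounds: in real calling data many contacts overlap, and the difference between a 2-hop and a 3-hop policy is a factor of 75 in this example.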


Visualization

The results from a contact chaining query can be visualized by a contact graph. An example was published by the German magazine Der Spiegel, showing a slide from an NSA presentation with a 2-hop contact graph for the e-mail addresses of the CEO and the chairwoman of the Chinese telecommunications company Huawei:




Domestic and foreign results

Generally, it is said that analysts query the "Section 215 calling records", the "BR metadata" or something similar. This sounds like they only access the domestic telephone records and that therefore the resulting contact chains would fully consist of American phone numbers.

The initial seed number however will often be a foreign number, as the whole purpose of the Section 215 program is to discover connections between foreign terrorists and potential conspirators inside the US. Analysts will therefore choose a seed for which they expect a good chance it has a domestic nexus, which probably explains the low numbers of RAS-approved identifiers.

But as we have seen in the previous article, NSA stored the domestic phone records in MAINWAY, which also contains the foreign telephone and internet metadata collected overseas. That means that a contact chaining query will not only return identifiers from the domestic, but also from the NSA's worldwide metadata collection.


Federated queries

Queries that return results from multiple sources are called federated queries. According to a 2011 training module, BR FISA queries were initially always federated, but in later versions of the query tool, the analyst could also check boxes to conduct an "unfederated" query and choose individual collection sources.

These options can be seen in the following screenshot from the user interface (the codename of which is redacted) used to conduct the contact chaining:


Selecting the "FISABR Mode" causes an additional checkbox for the EO12333 source to appear. An NSA memorandum explains that when this BR FISA option is chosen, the analyst will not only be provided with the domestic telephone metadata, but also with those from the SIGINT realm (which is collection overseas under EO 12333 authority), dating back to late 1998.

When the analyst used a RAS-approved selector, he could also check the box for PENREGISTRY, or PR/TT, which refers to the domestic internet metadata, although that collection ended at the end of 2011. Normal mode is for all other metadata collected abroad.

Analysts can determine the collection sources of each result by examining the Producer Designator Digraph (PDDG) and/or SIGINT Activity Designator (SIGAD) from each line of the contact chain file. BR FISA metadata can be identified by specific SIGADs.

SPCMA

There's also a fourth box for SPCMA mode, which stands for the "Special Procedures governing Communications Metadata Analysis" from January 2011. These allow contact chaining and other types of analysis on metadata that have already been collected under EO 12333, regardless of nationality and location (because metadata aren't constitutionally protected).

This means that US person identifiers that were in contact with valid foreign intelligence targets may be used for searching these foreign metadata too.

NSA isn't allowed to collect US data overseas, but these do come in "incidentally" when for example foreigners communicate with Americans - precisely the kind of communications that could reveal conspirators inside the US. Many international phone calls from or to the US will likely be intercepted by NSA collection facilities abroad too.


In other words:
- By default, any contact chaining query will use the foreign metadata collected overseas. For these, any useful selector may be used as a seed, and, under SPCMA, even one that belongs to an American.

- If the seed selector is RAS-approved, then the domestic phone records will be used too, which could lead to the discovery of additional contacts within the US.

The fact that most contact chains will consist of both foreign and domestic identifiers means that they contain far fewer American numbers than suggested by calculations like the one from PCLOB, which give the impression that queries resulted in up to 3 hops of domestic numbers.


 


Analysing the contact chains

It should be noted that the phone numbers (or other selectors) which are returned after an initial contact chaining query are anonymous and therefore meaningless. They're just numbers which could belong to anyone: from a pizza delivery to a dangerous conspirator.

So, in order to identify which numbers are of interest for finding unknown suspects, additional analysis is needed - a comprehensive GCHQ book (.pdf) disclosed last week calls contact chaining the start of a "painstaking process of assembling information about a terrorist cell or network".


Analytic tools

In the early years of the President's Surveillance Program (PSP), only the SIGINT Navigator (SIGNAV) tool was available to view the output of the MAINWAY contact chaining system. Later, new tools were created to improve efficiency and to obtain the most complete results; they were designed to use phone records collected both domestically and overseas.

According to the 2009 BR FISA review, there were 19 different analytic tools used for analysing both the raw metadata and the results of contact chaining. The glossary of the review lists the following tools, unfortunately with their codenames redacted:


S................?
"This tool is used by HMCs to conduct contact chaining against BR FISA metadata and provide the results to the [...]team. HMCs only used RAS-approved selectors when using this tool. The [...] team ultimately provided the results to NSA's [....]"

S.........?
"The primary desktop graphical user interface (GUI) for access to [....] data and services"

S....?
"An analytic query tool used to seek out additional information on telephony selectors from [MAINWAY?] and other knowledge bases and reporting repositories"

[SYNAPSE Workbench?]
"A next generation metadata analysis graphical user interface (GUI) which is the replacement for [......]"

W......?
"The query tool, which indicates whether a telephony selector is present in NSA data repositories, the total number of unique contacts, total number of calls, and "first heard" and "last heard" information for the selector"


The 2009 PR/TT review also mentions the following tool, which could have been redacted in the BR FISA review:

M.....?
"A database analytic system and user interface tool for integrated analysis of multiple types of metadata, facilitating more comprehensive target activity tracking"


Update:
According to the internal NSA newsletter SIDToday from March 4, 2005, which was published by The Intercept in September 2017, MAINWAY's Sigint Navigator (SigNav) version 4.0 became the vehicle for the new single sign-on tool GLOBALVISION, which gave analysts access to 11 databases.


Combining multiple contact chains

In 2006, a "high-level Bush Administration intelligence official" told Seymour Hersh that analysts could, for example, check whether any number that is two or three hops away from the seed number is also in direct contact with that original suspect number. That sounds smart, but in that case the number that is supposedly two or three hops away is simply a first-hop contact.

Finding suspects just by looking at connections between anonymous numbers could work, however, when several contact chains (from related suspect seed numbers, for example) are combined: a number that appears to be in contact with both seed #1 and seed #2 would be suspicious, as it apparently belongs to someone known by both initial suspects.
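
Both points can be made concrete with a small Python sketch. It assumes the contacts are available as a simple undirected graph of adjacency sets; the graph, the identifiers and the function names are hypothetical. A breadth-first search assigns every number its shortest hop distance from the seed, which shows why a number in direct contact with the seed can never count as a second or third hop, and a set intersection finds the numbers that two seeds have in common.

```python
from collections import deque


def hop_distances(graph, seed, max_hops=3):
    """Breadth-first search: shortest hop count from the seed to each number."""
    distances = {seed: 0}
    queue = deque([seed])
    while queue:
        number = queue.popleft()
        if distances[number] == max_hops:
            continue  # do not expand beyond the hop limit
        for contact in graph.get(number, ()):
            if contact not in distances:
                distances[contact] = distances[number] + 1
                queue.append(contact)
    return distances


def shared_contacts(graph, seeds):
    """Numbers in direct contact with every seed (excluding the seeds)."""
    common = set.intersection(*(set(graph.get(s, ())) for s in seeds))
    return common - set(seeds)


# Hypothetical undirected contact graph as adjacency sets.
GRAPH = {
    "seed1": {"x", "y"},
    "seed2": {"y", "z"},
    "x": {"seed1", "w"},
    "y": {"seed1", "seed2"},
    "z": {"seed2"},
    "w": {"x"},
}

distances = hop_distances(GRAPH, "seed1")
common = shared_contacts(GRAPH, ["seed1", "seed2"])
```

Here the only number shared by both seeds is `y`, which is exactly the kind of overlap that would make an otherwise anonymous number stand out.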

This approach was seen in the CBS television program 60 Minutes from December 15, 2013, in which an NSA employee gave a demonstration of how metadata contact chaining works. He used a tool for foreign collection under EO 12333, resulting in some contact chains of almost fully masked phone numbers from Somalia. Clearly visible are numbers that different targets had in common:



Detailed call record analysis

Besides analysing the breadth of the contact chains, each contact between two phone numbers can also be analysed in depth. For this, the analytic software provides analysts access to the complete calling records associated with all the phone calls from a contact chain.

Such a record, as provided by the telecoms, includes the calling and the called number, a calling-card number, the IMEI number of a mobile handset and the IMSI number of a SIM card, as well as the date and time of the call, its duration and technical information about how the call was routed through the telephone networks.

This provides analysts with information like which number initiated the call, the day and time the call was made, and how long it lasted. And although the domestic phone records may not contain cell phone location data, the area code and prefix of a landline telephone number, as well as the trunk identifier for mobile networks, still indicate the area where a particular phone was located.
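
As a rough illustration, such a record could be modelled as follows in Python. The field names, the `CallRecord` class and the helper `area_code_and_prefix` are assumptions for this sketch, not the actual format used by the telecoms or by NSA, and the sample numbers are fictional.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class CallRecord:
    """Hypothetical call detail record with the fields named in the text."""
    calling_number: str
    called_number: str
    start: datetime
    duration_seconds: int
    trunk_id: Optional[str] = None
    imei: Optional[str] = None
    imsi: Optional[str] = None
    calling_card: Optional[str] = None


def area_code_and_prefix(number):
    """Split a 10-digit North American number into (area code, prefix)."""
    digits = "".join(ch for ch in number if ch.isdigit())
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the country code
    if len(digits) != 10:
        raise ValueError("expected a 10-digit North American number")
    return digits[:3], digits[3:6]


record = CallRecord(
    calling_number="+1 (202) 555-0143",
    called_number="+1 (212) 555-0199",
    start=datetime(2009, 6, 1, 14, 5),
    duration_seconds=180,
    trunk_id="TRUNK-EXAMPLE",
)
area, prefix = area_code_and_prefix(record.calling_number)
```

Even without any location field, the area code and prefix derived from the calling number already hint at a region, which is the point made above for landline numbers.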

As described in the previous article, these data weren't derived from the MAINWAY system, but from a second database which holds "individual BR FISA metadata call records for access by authorized Homeland Security Analysis Center (HSAC) and data integrity analysts to view detailed information about specific telephony calling events".


Searching the second database

This database of calling records also enables analysts to subject these records "to other analytic methods or techniques besides querying", like for example searching them "using numbers, words, or symbols that uniquely identify a particular caller or device", or using "selection terms that are not uniquely associated with any particular caller or device" - according to the PCLOB report.

So, when analysing one or more contact chains resulted in finding several suspicious phone numbers, analysts can then use those numbers for querying the second database in order to see whether these numbers also appear in phone records that were not included in their initial contact chains.

And it also seems possible to query for example a trunk identifier to discover other phones from the same region. These kinds of searches can therefore provide potential connections that could not have been found by conducting a direct contact chaining query.
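
The two kinds of searches described here can be sketched in a few lines of Python. The records, the field names and both function names are made up for illustration; how the second database is actually queried is not described in the available documents.

```python
# Hypothetical call records; all values are made up for illustration.
CALL_RECORDS = [
    {"caller": "A", "callee": "B", "trunk": "T1"},
    {"caller": "B", "callee": "C", "trunk": "T1"},
    {"caller": "E", "callee": "F", "trunk": "T1"},
    {"caller": "G", "callee": "B", "trunk": "T2"},
]


def records_with_numbers(records, numbers):
    """All records in which any of the given numbers is caller or callee."""
    wanted = set(numbers)
    return [r for r in records if r["caller"] in wanted or r["callee"] in wanted]


def numbers_on_trunk(records, trunk):
    """All numbers seen on a trunk, a selection term not tied to one caller."""
    found = set()
    for r in records:
        if r["trunk"] == trunk:
            found.update((r["caller"], r["callee"]))
    return found


chain_numbers = {"A", "B", "C"}  # numbers already in the contact chain
hits = records_with_numbers(CALL_RECORDS, {"B"})
new_numbers = numbers_on_trunk(CALL_RECORDS, "T1") - chain_numbers
```

In this toy example the trunk search surfaces `E` and `F`, two numbers that never appear in the contact chain of `A`, `B` and `C` and that a direct contact chaining query would therefore not have returned.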

Update:
An NSA slide that was already published in December 2013, shows that MAINWAY can indeed be used for queries with cell tower identifiers, in order to find selectors in certain geographical areas:



Some numbers

In a Department of Justice report (.pdf) from 2006 it's said that NSA "estimated that only a tiny fraction (0.000025% or one in four million) of the call-detail records [...] were expected to be analyzed". This would mean that of the 1.8 billion domestic phone records provided daily by AT&T, just 450 would be used for analysis.

So in a year, the records (not the content) of roughly 164,000 individual calls (450 per day over 365 days) from the domestic metadata collection could have been used for analysis in addition to contact chaining.
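
The arithmetic behind these figures can be checked with a short Python calculation. It only uses the percentage and the daily volume quoted above and assumes a 365-day year; exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction

# 0.000025% expressed as an exact fraction of all records.
fraction = Fraction("0.000025") / 100
records_per_day = 1_800_000_000  # daily records cited in the text

one_in = 1 / fraction                        # "one in ..." records
analyzed_per_day = records_per_day * fraction
analyzed_per_year = analyzed_per_day * 365   # assuming a 365-day year

print(one_in, analyzed_per_day, analyzed_per_year)
```

This confirms the "one in four million" ratio and the 450 records per day, and gives 164,250 records for a full year at that rate.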



Foreign call records

As we have seen, a contact chaining query on Section 215 telephone metadata will generally result in both foreign and domestic numbers. Analysts will therefore want to analyze not only the associated call records from the domestic collection, but also those from foreign collection conducted abroad.

These foreign phone records could be retrieved from the known metadata repositories like ASSOCIATION (for mobile calls) and BANYAN (for landline calls), or from a single foreign "SIGINT" database, as is suggested by an NSA memorandum from 2009.


Enrichment

Analyzing the detailed call records will still not provide names or other information that allows the identification of the people to whom the numbers from a contact chain belong. For that, the phone numbers have to be correlated ("enriched") with other kinds of information.

The easiest way is probably to combine them with target watch lists to see if the contact chains contain phone numbers that belong to already known targets. This is demonstrated in the following video, which shows contact chain analysis using Sentinel Visualizer, which is a commercially available program for this purpose:





Telephone identifiers found through contact chaining and subsequent analysis can of course also be correlated with internet metadata. NSA does not collect domestic internet metadata anymore, but its collection abroad results in over 10 billion internet metadata records a day being stored in the MARINA database.

The metadata from contact chains can also be enriched with data from, for example, GPS and TomTom, billing records and bank transactions, passenger manifests, voter registration rolls, property records and unspecified tax data, for both Americans and foreigners. This is according to a New York Times report, in which NSA denies using such data for the domestic metadata collected under Section 215.
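
The simplest form of enrichment mentioned above, matching the numbers from a contact chain against a watch list, can be sketched as follows in Python. The watch list, the labels, the fictional phone numbers and the function names are all assumptions for illustration.

```python
# Hypothetical watch list mapping known selectors to target labels.
WATCH_LIST = {
    "+1 202 555 0101": "known target 1",
    "+1 202 555 0102": "known target 2",
}

# Numbers returned by a contact chaining query, in mixed formatting.
chain_numbers = ["1-202-555-0101", "1-202-555-0155", "1-212-555-0143"]


def normalize(number):
    """Keep only digits so differently formatted numbers compare equal."""
    return "".join(ch for ch in number if ch.isdigit())


def enrich(numbers, watch_list):
    """Attach a watch-list label to each number, or None if unknown."""
    index = {normalize(n): label for n, label in watch_list.items()}
    return {n: index.get(normalize(n)) for n in numbers}


enriched = enrich(chain_numbers, WATCH_LIST)
matches = {n: label for n, label in enriched.items() if label is not None}
```

Normalizing the numbers before the lookup is a design choice of this sketch: identifiers from different data sets rarely share one format, so a naive string comparison would miss matches.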


SYNAPSE Data Model

With all this, analysts can build extensive social network graphs (or "community of interest" profiles) using 164 different relationship types like "travelsWith, hasFather, sentForumMessage, employs". It seems that this refers to the SYNAPSE Data Model, for which internal NSA relationships are shown in the following diagram that was published by The New York Times too:



Apparently also based upon this data model is SYNAPSE Workbench, which seems to be the "next generation metadata analysis graphical user interface (GUI)" described in the 2009 BR FISA review. SYNAPSE Workbench is apparently capable of fusing metadata from multiple sources and is also enabled for SPCMA searches.
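
A graph with typed relationships of this kind can be sketched in Python as follows. The relationship names are the examples quoted above; the `RelationshipGraph` class, its methods and the entity names are hypothetical and are not meant to represent the actual SYNAPSE Data Model.

```python
from collections import defaultdict


class RelationshipGraph:
    """Directed graph whose edges carry a relationship type."""

    def __init__(self):
        self._edges = defaultdict(set)  # subject -> {(relation, object)}

    def add(self, subject, relation, obj):
        self._edges[subject].add((relation, obj))

    def related(self, subject, relation=None):
        """Objects linked from subject, optionally filtered by relation type."""
        return {
            obj
            for rel, obj in self._edges.get(subject, ())
            if relation is None or rel == relation
        }


graph = RelationshipGraph()
graph.add("person1", "travelsWith", "person2")
graph.add("person1", "hasFather", "person3")
graph.add("person1", "employs", "person4")

companions = graph.related("person1", "travelsWith")
everyone = graph.related("person1")
```

Storing the relationship type on each edge is what allows one graph to hold data fused from very different sources while still being searchable per type of connection.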


Further action

When all this leads an analyst to believe that a certain telephone identifier belongs to someone who is of interest but wasn't yet known or identified, the following actions can be taken:
- If the identifier is American and of counterterrorism value, it can be passed on to the FBI for further intelligence or criminal investigation. From 2006-2009, NSA provided the FBI (and other intelligence agencies) a total of 277 reports containing 2883 telephone identifiers.

- If the identifier is foreign, NSA can use it as a selector to retrieve the content of associated communications that might be already in its databases. It can also be entered into the NSA collection system in order to pull in the content of any future communications of the target systematically.

In case the identifier of the yet unknown suspect is foreign, the analyst might have found a name through the various enrichment correlations. If not, a name can also be obtained by listening in on the content of associated phone calls or through additional Human Intelligence (HUMINT) methods.


 

Conclusion

As we have seen, the domestic phone records collected by NSA under Section 215 are used for contact chaining that combines both domestic and foreign identifiers. NSA never explicitly explained this, probably because they didn't want to draw attention to their foreign metadata collection and analysis efforts. But it did become clear from the many documents about the Section 215 program that were declassified by the US government.

These documents made clear that NSA rarely went to 3 hops of contact chaining, which is contrary to what most people, including the Privacy and Civil Liberties Oversight Board (PCLOB), assumed. Because of the federated queries, the resulting contact chains were made up of both domestic and foreign identifiers, which means contact chaining under the Section 215 program involved far fewer American phone numbers than often presumed.

The documents also show that contact chaining for finding yet unknown conspirators isn't as easy as it may appear. It's not that one enters a phone number and the software provides a list of suspects. Data retrieved through the contact chains have to be analysed and correlated with other data sets in order to find out which numbers could matter. It still depends on experience, analysis and eventually even guessing which data and which numbers might be worth a closer investigation.

How successful this contact chaining and subsequent analysis is, is difficult to say. The PCLOB report judged that there was "no instance in which the [Section 215] program directly contributed to the discovery of a previously unknown terrorist plot or the disruption of a terrorist attack" - but it's also possible that there were just no such conspirators.

The PCLOB report noted that analysing the domestic telephone metadata did provide some value "by offering additional leads regarding the contacts of terrorism suspects already known to investigators, and by demonstrating that foreign terrorist plots do not have a U.S. nexus" - although useful, this seems a rather meager result of what surely required a lot of work.


> Next: Collection of domestic phone records under the USA FREEDOM Act



Links and Sources
- Lawfare Blog: Understanding Footnote 14: NSA Lawyering, Oversight, and Compliance (2016)
- EmptyWheel.net: Federated Queries and EO 12333 FISC Workaround (2013) - What We Know about the Section 215 Phone Dragnet and Location Data (2016)
- PCLOB: Report on the Telephone Records Program Conducted under Section 215 of the USA PATRIOT Act (pdf) (2014)
- Cryptome.org: NSA FISA Business Records Offer a Lot to Learn (2013)
- Huffingtonpost.com: The NSA's Telephone Meta-data Program: Part I (2013)
- US Administration White Paper: Bulk Collection of Telephony Metadata under Section 215 of the USA PATRIOT Act (pdf) (2013)
- The New Yorker: What the N.S.A. Wants to Know About Your Phone Calls (2013)
- NSA: Business Records FISA NSA Review (.pdf) (2009)