Over a year ago, an anonymous source contacted the Süddeutsche Zeitung
(SZ) and submitted encrypted internal documents from Mossack Fonseca, a
Panamanian law firm that sells anonymous offshore companies around the
world. These shell firms enable their owners to cover up their business
dealings, no matter how shady.
In the months that
followed, the number of documents continued to grow far beyond the
original leak. Ultimately, SZ acquired about 2.6 terabytes of data,
making the leak the biggest that journalists had ever worked with. The
source wanted neither financial compensation nor anything else in
return, apart from a few security measures.
The data provides rare
insights into a world that can only exist in the shadows. It proves how
a global industry led by major banks, legal firms, and asset management
companies secretly manages the estates of the world’s rich and famous:
from politicians, FIFA officials, fraudsters and drug smugglers, to
celebrities and professional athletes.
A group effort
The Süddeutsche Zeitung
decided to analyze the data in cooperation with the International
Consortium of Investigative Journalists (ICIJ). ICIJ had already
coordinated the research for past projects that SZ was also involved in,
among them Offshore Leaks, Lux Leaks, and Swiss Leaks. Panama Papers is
the biggest-ever international cooperation of its kind. In the past 12
months, around 400 journalists from more than 100 media organizations in
over 80 countries have taken part in researching the documents. These
have included teams from the Guardian and the BBC in England, Le Monde in France, and La Nación
in Argentina. In Germany, SZ journalists have cooperated with their
colleagues from two public broadcasters, NDR and WDR. Journalists from
the Swiss Sonntagszeitung and the Austrian weekly Falter
have also worked on the project, as have their colleagues at ORF,
Austria’s national public broadcaster. The international team initially
met in Washington, Munich, Lillehammer and London to map out the
research approach.
The data
The Panama Papers include approximately 11.5 million documents – more
than the combined total of the Wikileaks Cablegate, Offshore Leaks, Lux
Leaks, and Swiss Leaks. The data primarily comprises e-mails, PDF
files, photo files, and excerpts of an internal Mossack Fonseca
database. It covers a period spanning from the 1970s to the spring of
2016.
Moreover,
the journalists cross-checked a large number of documents, including
passport copies. About two years ago, a whistleblower had already sold
internal Mossack Fonseca data to the German authorities, but the dataset
was much older and smaller in scope: while it addressed a few hundred
offshore companies, the Panama Papers provide data on some 214,000
companies. In the wake of the data purchase, last year investigators
searched the homes and offices of about 100 people. Commerzbank was
also raided. As a consequence of their business dealings with Mossack
Fonseca, Commerzbank, HSH Nordbank, and Hypovereinsbank agreed to pay
fines of around 20 million euros each. Since then, other
countries have also acquired data from the initial smaller leak, among
them the United States, the UK, and Iceland.
The system
The
leaked data is structured as follows: Mossack Fonseca created a folder
for each shell firm. Each folder contains e-mails, contracts,
transcripts, and scanned documents. In some instances, there are several
thousand pages of documentation. First, the data had to be
systematically indexed to make searching through this sea of information
possible. To this end, the Süddeutsche Zeitung used Nuix, the same
program that international investigators work with. Süddeutsche Zeitung
and ICIJ uploaded millions of documents onto high-performance computers.
They applied optical character recognition (OCR) to transform the data
into machine-readable, easily searchable files. The process turned images –
such as scanned IDs and signed contracts – into searchable text. This
was an important step: it enabled journalists to comb through as large a
portion of the leak as possible using a simple search mask similar to
Google.
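
To illustrate the idea of indexing one folder per shell firm so that it can be searched by keyword, here is a minimal Python sketch. It is not how Nuix works internally, and it is not the journalists' actual tooling. It assumes the OCR step has already produced plain text, and the company names, file names, and the functions `build_index` and `search` are hypothetical, invented for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_index(folders):
    """Map each token to the set of (folder, document) pairs containing it."""
    index = defaultdict(set)
    for folder, documents in folders.items():
        for doc_name, text in documents.items():
            for token in tokenize(text):
                index[token].add((folder, doc_name))
    return index


def search(index, query):
    """Return the documents that contain every word of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    hits = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        hits &= index.get(token, set())
    return hits


# Hypothetical sample data: one folder per shell firm, already OCR'd to text.
folders = {
    "Example Holdings Ltd": {
        "contract.txt": "Share purchase agreement signed by John Doe",
        "passport_scan.txt": "Passport holder: Jane Roe",
    },
    "Sample Trading Inc": {
        "email_001.txt": "Please forward the agreement to John Doe",
    },
}

index = build_index(folders)
results = search(index, "john doe")
```

A query such as `search(index, "john doe")` returns every (folder, document) pair whose text contains both words, which is roughly what a simple search mask offers the person typing into it. A production system would add ranking, phrase matching, and handling of OCR errors, none of which is attempted here.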