Panama Papers
About the Panama Papers
Over a year ago, an anonymous source contacted the Süddeutsche Zeitung (SZ) and submitted encrypted internal documents from Mossack Fonseca, a Panamanian law firm that sells anonymous offshore companies around the world. These shell firms enable their owners to cover up their business dealings, no matter how shady.
In the months that followed, the number of documents continued to grow far beyond the original leak. Ultimately, SZ acquired about 2.6 terabytes of data, making the leak the biggest that journalists had ever worked with. The source wanted neither financial compensation nor anything else in return, apart from a few security measures.
The data provides rare insights into a world that can only exist in the shadows. It proves how a global industry led by major banks, law firms, and asset management companies secretly manages the estates of the world’s rich and famous: from politicians, FIFA officials, fraudsters and drug smugglers, to celebrities and professional athletes.
A group effort
The Süddeutsche Zeitung decided to analyze the data in cooperation with the International Consortium of Investigative Journalists (ICIJ). ICIJ had already coordinated the research for past projects that SZ was also involved in, among them Offshore Leaks, Lux Leaks, and Swiss Leaks. Panama Papers is the biggest-ever international cooperation of its kind. In the past 12 months, around 400 journalists from more than 100 media organizations in over 80 countries have taken part in researching the documents. These have included teams from the Guardian and the BBC in England, Le Monde in France, and La Nación in Argentina. In Germany, SZ journalists have cooperated with their colleagues from two public broadcasters, NDR and WDR. Journalists from the Swiss Sonntagszeitung and the Austrian weekly Falter have also worked on the project, as have their colleagues at ORF, Austria’s national public broadcaster. The international team initially met in Washington, Munich, Lillehammer and London to map out the research approach.
The data
The Panama Papers include approximately 11.5 million documents – more than the combined total of the WikiLeaks Cablegate, Offshore Leaks, Lux Leaks, and Swiss Leaks. The data primarily comprises e-mails, PDF files, photo files, and excerpts of an internal Mossack Fonseca database. It covers a period spanning from the 1970s to the spring of 2016.
Moreover, the journalists crosschecked a large number of documents, including passport copies. About two years ago, a whistleblower had already sold internal Mossack Fonseca data to the German authorities, but the dataset was much older and smaller in scope: while it addressed a few hundred offshore companies, the Panama Papers provide data on some 214,000 companies. In the wake of the data purchase, investigators last year searched the homes and offices of about 100 people. Commerzbank was also raided. As a consequence of their business dealings with Mossack Fonseca, Commerzbank, HSH Nordbank, and Hypovereinsbank agreed to pay fines of around 20 million euros each. Since then, other countries have also acquired data from the initial smaller leak, among them the United States, the UK, and Iceland.
The system
The leaked data is structured as follows: Mossack Fonseca created a folder for each shell firm. Each folder contains e-mails, contracts, transcripts, and scanned documents. In some instances, there are several thousand pages of documentation. First, the data had to be systematically indexed to make searching through this sea of information possible. To this end, the Süddeutsche Zeitung used Nuix, the same program that international investigators work with. Süddeutsche Zeitung and ICIJ uploaded millions of documents onto high-performance computers. They applied optical character recognition (OCR) to transform the data into machine-readable, easily searchable files. The process turned images – such as scanned IDs and signed contracts – into searchable text. This was an important step: it enabled journalists to comb through as large a portion of the leak as possible using a simple search mask similar to Google.
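The general idea of indexing one folder per shell firm and then searching it by keyword can be illustrated with a small Python sketch. This is not how Nuix or the ICIJ platform works internally; it is a toy inverted index that assumes OCR has already produced plain text. The company names, file names, and the functions `tokenize`, `build_index`, and `search` are all hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(folders):
    """Map each word to the set of (company, document) pairs containing it.

    `folders` mirrors the one-folder-per-company layout:
    {company_name: {document_name: extracted_text}}.
    """
    index = defaultdict(set)
    for company, documents in folders.items():
        for doc_name, text in documents.items():
            for word in tokenize(text):
                index[word].add((company, doc_name))
    return index


def search(index, query):
    """Return the documents that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)


# Hypothetical sample data standing in for OCR output.
folders = {
    "Example Holdings Ltd": {
        "contract.txt": "Signed contract for nominee director",
        "passport_scan.txt": "Passport of John Doe",
    },
    "Sample Trading SA": {
        "email_001.txt": "Please find the signed contract attached",
    },
}

index = build_index(folders)
for company, doc_name in search(index, "signed contract"):
    print(company, doc_name)
```

A query here matches only documents that contain all of its words, which is the simplest behavior a search mask could offer; real forensic tools add phrase search, fuzzy matching, and metadata filters on top.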