What Do Big Data Paris and the Panama Papers Have In Common?

Airbus Presentation at Big Data Paris

We’re back from an exciting week at Big Data Paris 2017, where our customer Airbus featured us in their presentation on enterprise big data integration and graph visualization. We were delighted to meet up with our Neo Technology France and UK partners, and we had a great turnout for our workshop on Airbus and Twitter big data.

At our booth, in addition to the very cool Airbus GAIA demonstration on a two-meter-wide multi-touch screen, the hot topics were fraud detection, criminal investigations, and networking. Our crime network and Panama Papers demos were very popular with attendees looking for big data and graph visualization technology.

Since these topics were so popular at the Paris event, we’re bringing back the two-part series we did with Neo Technology on using Neo4j and our graph visualizations to uncover the secrets behind the Panama Papers offshore leaks database.

Read on to learn how we used Tom Sawyer Perspectives to identify hidden and surprising connections between sports celebrities and corporate executives, uncover fraud, make sense of big data from the ICIJ database, and much more.

The Offshore Leaks Database Challenge

By Caroline Scharf and Uli Foessmeier, Tom Sawyer Software for the Neo4j Blog

The Panama Papers investigation and resulting Offshore Leaks database present an interesting challenge for investigators.

If you’re not familiar with this investigation, it was led by the ICIJ – The International Consortium of Investigative Journalists – to expose the people behind companies and trusts incorporated in tax havens. While some offshore entities and trusts are legitimate, their anonymous nature more easily facilitates money laundering, tax evasion, fraud, and other crimes. For more information about the Offshore Leaks database, visit offshoreleaks.icij.org.

The Offshore Leaks database contains more than 320,000 entities, often with duplicate entries. Navigating the massive amount of information, visualizing it in a format that can be digested and understood, and knowing what clues to look for are all unique challenges for anyone using this database.

Tom Sawyer Software specializes in helping businesses rapidly build sophisticated enterprise graph and data visualization applications that help them make sense of and analyze their Big Data, such as the volume of information in the Offshore Leaks database.

In this first of two articles, we walk you through our Panama Papers example application, built with our flagship product Tom Sawyer Perspectives. We discuss two scenarios that can help you make sense of the Offshore Leaks data, so you can focus your investigation on suspicious people and companies, spot areas of potential fraud and make connections.

Using Tom Sawyer Perspectives to Focus Your Investigation

When you begin an investigation, you may know the person or network of people you want to investigate, such as a well-known political figure or celebrity, or you may know several individuals who you suspect are connected, or the name or address of a company.

Using the 2015 FIFA corruption scandal as a backdrop, we want to find out if there are any connections between those charged in that case, and any other prominent FIFA-connected individuals.

When we search the database for all the individuals indicted in the case, three names show results: Eugenio Figueredo, Hugo Jinkis, and Mariano Jinkis. For Figueredo, the search returns a number of results with that surname.

Data integrity problems are common in this database, so we included a feature in our example application to automatically merge nodes with identical names, along with the ability to manually merge nodes. We merge the two identical nodes and delete the nodes with different first names that we know are not relevant. We’re not sure which of the remaining three “Eugenio Figueredo” nodes are valid, so we decide not to merge them until we are more certain.
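To make the automatic merge step concrete, here is a minimal Python sketch of merging nodes with identical names. It is only an illustration, not how Tom Sawyer Perspectives implements the feature: the function names, the node and edge structures, and the sample records are all hypothetical, and "identical" is assumed to mean equal after ignoring case and extra whitespace.

```python
from collections import defaultdict

def normalize(name):
    """Ignore case and repeated whitespace when comparing names (an assumption)."""
    return " ".join(name.lower().split())

def merge_identical_names(nodes, edges):
    """Merge nodes whose normalized names match.

    nodes: dict mapping node id -> name
    edges: iterable of (id, id) pairs
    Returns (merged_nodes, merged_edges, mapping from old id to surviving id).
    """
    groups = defaultdict(list)
    for node_id, name in nodes.items():
        groups[normalize(name)].append(node_id)

    # The smallest id in each group survives; the others are redirected to it.
    mapping = {}
    for ids in groups.values():
        keep = min(ids)
        for node_id in ids:
            mapping[node_id] = keep

    merged_nodes = {keep: nodes[keep] for keep in set(mapping.values())}
    # Rewire edges to the surviving ids; the set drops duplicates created by the merge.
    merged_edges = {
        (mapping[a], mapping[b]) for a, b in edges if mapping[a] != mapping[b]
    }
    return merged_nodes, merged_edges, mapping

# Hypothetical sample records, not taken from the Offshore Leaks database.
sample_nodes = {
    1: "Jane Example",
    2: "JANE  EXAMPLE",
    3: "John Example",
    4: "Example Holdings S.A.",
}
sample_edges = [(1, 4), (2, 4), (3, 4)]

merged_nodes, merged_edges, mapping = merge_identical_names(sample_nodes, sample_edges)
print(sorted(merged_nodes))
print(sorted(merged_edges))
```

Nodes 1 and 2 collapse into one, while "John Example" stays separate. That mirrors the cautious approach above: only exact matches are merged automatically, and anything uncertain is left for a manual decision.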

Panama Papers Jinkis Figueredo
Initial Search Results Prove Inconclusive and Require More Research

Our Panama Papers application shows a cardinality for each of the nodes, which indicates the number of connections between each person and other entries in the database. We start to load these individuals’ connections to build out the network.
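In graph terms, the cardinality described here corresponds to a node's degree: the number of edges that touch it. As a rough sketch under that assumption (the edge list and node names below are hypothetical), it can be computed from an edge list like this:

```python
from collections import Counter

def cardinality(edges):
    """Count how many connections each node has in an undirected edge list."""
    counts = Counter()
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return counts

# Hypothetical edges between officers, entities, and an address.
sample_edges = [
    ("officer_a", "entity_1"),
    ("officer_a", "entity_2"),
    ("officer_a", "address_1"),
    ("officer_b", "entity_2"),
]

counts = cardinality(sample_edges)
for node, count in counts.most_common():
    print(node, count)
```

A count like this is a quick way to see in advance which nodes will add many connections to the diagram when they are expanded.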

We decide not to load connections of intermediaries for now, because they typically have many connections and can clutter our diagram. It also seems doubtful that intermediaries and their connections would lead to any factual connections between two companies simply because both were created by the same intermediary. So we continue focusing on connections between people, companies and addresses.
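The expansion strategy described above can be sketched as a one-hop expansion that refuses to load the connections of certain node types. This is a simplified illustration, not the application's actual logic; the `expand` function, the `node_types` labels, and the sample graph are hypothetical.

```python
def expand(visible, adjacency, node_types, skip_types=("intermediary",)):
    """Add one hop of neighbors to the visible set.

    Nodes whose type is in skip_types may appear in the diagram,
    but their own connections are not loaded.
    """
    added = set()
    for node in visible:
        if node_types.get(node) in skip_types:
            continue  # do not load the connections of intermediaries
        for neighbor in adjacency.get(node, ()):
            if neighbor not in visible:
                added.add(neighbor)
    return visible | added

# Hypothetical sample graph.
adjacency = {
    "officer_a": ["entity_1", "address_1"],
    "address_1": ["officer_a"],
    "entity_1": ["officer_a", "intermediary_x"],
    "intermediary_x": ["entity_1", "entity_2", "entity_3"],
    "entity_2": ["intermediary_x"],
    "entity_3": ["intermediary_x"],
}
node_types = {
    "officer_a": "officer",
    "address_1": "address",
    "entity_1": "entity",
    "entity_2": "entity",
    "entity_3": "entity",
    "intermediary_x": "intermediary",
}

step1 = expand({"officer_a"}, adjacency, node_types)
step2 = expand(step1, adjacency, node_types)
step3 = expand(step2, adjacency, node_types)
print(sorted(step3))
```

In this sketch the intermediary shows up in the second step, but the entities that hang off it never load, so the diagram stops growing. Passing `skip_types=()` would lift the restriction, which is the kind of choice we revisit later when the people-only network reaches a dead end.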

After expanding the network several times, we see that two of the Figueredo results are indeed connected, and one is not. So we remove the unconnected group from our drawing and merge the other two. However, after expanding the connections as far as we can, we still do not see any connection between Figueredo and either of the Jinkises.

Panama Papers Graph and Data Visualization
Expanded Networks Reveal No Connection, Yet

Undeterred, we decide to expand the intermediary PGA Consultores that is in the Jinkis network. Loading its 34 entities still shows no connection to Figueredo, so we decide to expand those entities one by one.

Bingo! The entity LEONIDAS PROPERTIES S.A. shows a clear connection between the Figueredo and Jinkis networks. Knowing we are on the right track now, and given that there are only 33 more entities, each with only a few connections, we continue to expand and grow the network.
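Expanding nodes by hand until two networks meet is, in effect, a search for a path between two nodes. A minimal sketch of that idea, assuming the graph is already loaded into an adjacency dictionary, is a breadth-first search. The node names below are placeholders, not records from the database.

```python
from collections import deque

def shortest_path(adjacency, start, goal):
    """Return one shortest path from start to goal, or None if they are not connected."""
    if start == goal:
        return [start]
    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor in previous:
                continue
            previous[neighbor] = node
            if neighbor == goal:
                # Walk back from the goal to the start to rebuild the path.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None

# Hypothetical graph: two people linked only through an intermediary's entities.
adjacency = {
    "person_a": ["entity_1"],
    "entity_1": ["person_a", "intermediary_x"],
    "intermediary_x": ["entity_1", "entity_2"],
    "entity_2": ["intermediary_x", "person_b"],
    "person_b": ["entity_2"],
    "person_c": ["entity_3"],
    "entity_3": ["person_c"],
}

print(shortest_path(adjacency, "person_a", "person_b"))
print(shortest_path(adjacency, "person_a", "person_c"))
```

The first call returns a path that runs through the intermediary, and the second returns `None` because `person_c` sits in a separate component, much like the unconnected Figueredo result we removed earlier.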

As we do so, our powerful graph layout automatically lays out the connections in a readable format, and we easily spot data integrity issues along the way. We see the name of the person connecting the two networks and El Portador is misspelled a number of times. We merge the nodes as we find these duplicates, and continue expanding. Each time, we look at the names and entities of the connections that are revealed.
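We spotted the misspellings visually, but candidates for merging can also be flagged programmatically. The sketch below uses `difflib.SequenceMatcher` from Python's standard library to score name similarity. The 0.85 threshold and the sample names are assumptions chosen for illustration, and any flagged pair would still need a human decision before merging.

```python
from difflib import SequenceMatcher
from itertools import combinations

def likely_duplicates(names, threshold=0.85):
    """Return (name, name, score) for pairs whose similarity meets the threshold."""
    pairs = []
    for a, b in combinations(names, 2):
        score = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if score >= threshold:
            pairs.append((a, b, round(score, 2)))
    return pairs

# Hypothetical names, not taken from the Offshore Leaks database.
sample_names = [
    "Example Holdings S.A.",
    "Exmaple Holdings S.A.",
    "Northwind Trust Ltd.",
]

duplicates = likely_duplicates(sample_names)
for a, b, score in duplicates:
    print(a, "<->", b, score)
```

Comparing every pair is quadratic in the number of names, so on a data set of this size it would make sense to run it only on the nodes currently in the drawing rather than on the whole database.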

Tom Sawyer Perspectives Finds Data Integrity Issues in Panama Papers
Powerful Graph Layout Ensures Your Diagram is Readable as the Network Grows

After a few minutes, a name catches our eye: Damiani. We know that Juan Pedro Damiani is a member of FIFA’s Independent Ethics Committee. There is a J. P. Damiani and Associates intermediary and a Juan Pedro Damiani Sobrero, both from Uruguay where the committee member lives. Now we are really onto something!

Using our people network and running a social network analysis, we see the person El Portador is central in this network between Figueredo, the Jinkises, and Damiani. This seems a likely place to continue our investigation and dive a little deeper to understand the connection.
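The article does not say which centrality measure the analysis uses. One common choice for finding a node that sits between groups is betweenness centrality, which counts how many shortest paths pass through each node. The following is a small pure-Python sketch of Brandes' algorithm for an unweighted, undirected graph. The graph and node names are hypothetical, and the adjacency dictionary is assumed to be symmetric with every node present as a key.

```python
from collections import deque

def betweenness(adjacency):
    """Unnormalized betweenness centrality via Brandes' algorithm."""
    score = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # Breadth-first search that counts shortest paths from the source.
        order = []
        parents = {v: [] for v in adjacency}
        paths = {v: 0 for v in adjacency}
        paths[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    parents[w].append(v)
        # Accumulate dependencies, farthest nodes first.
        dependency = {v: 0.0 for v in adjacency}
        while order:
            w = order.pop()
            for v in parents[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each undirected pair was counted from both endpoints.
    return {v: s / 2 for v, s in score.items()}

# Hypothetical people network.
adjacency = {
    "hub": ["a", "b", "c"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub", "d"],
    "d": ["c"],
}

scores = betweenness(adjacency)
most_central = max(scores, key=scores.get)
print(most_central, scores[most_central])
```

Here `hub` scores highest because five of the shortest paths between other pairs run through it. A node with that profile is a natural place to focus the next round of investigation.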

Read more about the alleged connection between Damiani and those individuals indicted in the FIFA scandal in this ICIJ article.

Panama Papers El Portador Connection to Figueredo, the Jinkises and Damiani
El Portador Identified as the Connection Between Figueredo, the Jinkises and Damiani

The Power of Tom Sawyer Perspectives

As we’ve illustrated in this article, the power of Tom Sawyer Perspectives lies in revealing the hidden connections in a visually understandable way, whether it’s among members of an organization, elements in a network, systems in an aircraft or automobile, or vendors in a supply chain.

Tom Sawyer Software specializes in helping clients with needs in link analysis; network topology; architectures and models; schematics and maps; and dependencies, flows, and processes. We help them federate and integrate their data from multiple sources, and build the graph and data visualization applications that are critical to analyzing and gaining insight into their data.

Visit www.tomsawyer.com to access our Panama Papers and other demonstrations built using Tom Sawyer Perspectives, and to learn how we can help solve your visualization and analysis challenges.
