Graphing Congressional Lobbying

exploring the relationships formed by legal bribery

Carlos Yudia / Shutterstock

‘Money in politics’ gets thrown around a lot. Depending on who you ask, it’s either standard, run-of-the-mill corruption or the biggest threat to democracy in the Information Age.

Lobbying, of course, has decent legal reasoning behind it. It can be defined as “a practice performed by either individuals or organizations whereby public campaigns (which are legally registered with the government) are undertaken to pressure governments into specific public policy actions”. This is all fine, First Amendment stuff: band together with your fellows to petition the government for a redress of grievances. Sounds great on paper.

In reality, you vote for Senators and Representatives who you believe will best champion your interests, defend your ideals and support what you care dearly about. Then they receive generous campaign ‘donations’ from various lobbies, and suddenly they sneak some nasty clause into a 1,018-page (no, really) bill that reroutes your tax dollars to the offshore accounts of some nameless Davos men. Your labor has helped to widen the wealth gap. Justice cries out and is promptly buried in bureaucracy.

But some people do their best to cut through that bureaucracy. They collect, organize and publicize whatever details aren’t hidden in little black books and [REDACTED] by our trustworthy government agencies. Today, we’ll be using data from the Center for Responsive Politics to visualize the flow of money. We’ll be able to answer some neat questions like “which of our legislators Bends the Knee to the oil companies killing the planet?” and “whose new house is funded by the opioid-pushing pharmaceutical firms?”. It doesn’t take much: a few Python tools, a graph visualizer and a graph database are all we need to shine some much-needed light on the money.

Graph Networks (or network graphs)

Game of Thrones characters linked by shared scenes. Node and edge size is weighted by frequency.

Network graphs are a simple and effective way to represent relationships between data points and uncover ‘structure’ hidden in numbers. They’re composed of nodes (little circles) linked by edges (lines), both of which can carry properties/attributes (any information you’d like). Facebook employs nodes of users linked by friendships to detect communities/cliques and determine shared interests. Google’s PageRank algorithm makes a graph of webpage nodes, forming edges when pages link to one another. There are plenty of interesting things to do with a well-furnished graph, like pathfinding, search optimization and betweenness calculations. We’re going to create nodes for congressmembers and the various industries that line their pockets, and connect them with edges weighted by donation amount.
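To make the vocabulary concrete, here is a tiny sketch in NetworkX. The names and dollar amounts are invented for illustration; the point is only that nodes and edges can both hold attributes, and that something like betweenness comes almost for free once the graph exists.

```python
import networkx as nx

# Toy graph: two member nodes sharing one industry node.
# Every name and amount here is made up for illustration.
G = nx.Graph()
G.add_node("Rep. A", kind="member", party="D")
G.add_node("Rep. B", kind="member", party="R")
G.add_node("Oil & Gas", kind="industry")

# Edges carry attributes too; here, the donation amount.
G.add_edge("Rep. A", "Oil & Gas", weight=10_000)
G.add_edge("Rep. B", "Oil & Gas", weight=25_000)

print(G.nodes["Rep. A"])                   # attribute dict of a node
print(G["Rep. B"]["Oil & Gas"]["weight"])  # attribute of an edge

# The industry sits on the only path between the two members,
# so it gets the highest betweenness score.
betweenness = nx.betweenness_centrality(G)
print(betweenness)
```

In a graph this small the result is obvious, but the same call works unchanged on a few hundred members and industries.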

Gathering Data

I’ll gloss over this part; it’s not terribly interesting. I used Selenium to scrape each congressmember’s unique ID from the 115th Congress page, then signed up for an API key, and used the IDs to request each member’s lobbying records organized by industry. The data is quite pleasant to work with; I parsed it from XML to JSON, and then prepared to convert it to a graph.
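For the curious, the XML-to-JSON step might look roughly like the sketch below. The sample response is invented, and its element and attribute names (`industries`, `industry_name`, `total` and so on) are my assumption about the layout rather than a copy of the real schema, so check them against an actual response before relying on this.

```python
import json
import xml.etree.ElementTree as ET

# Illustrative response only: the tag/attribute names are assumed and the
# member ID, names and amounts are invented.
SAMPLE_XML = """
<response>
  <industries cand_name="Example, Pat (D)" cid="N00000000" cycle="2018">
    <industry industry_name="Retired" indivs="120000" pacs="0" total="120000"/>
    <industry industry_name="Oil &amp; Gas" indivs="5000" pacs="20000" total="25000"/>
  </industries>
</response>
"""


def industries_to_record(xml_text):
    """Flatten one member's industry XML into a JSON-friendly dict."""
    root = ET.fromstring(xml_text.strip())
    block = root.find("industries")
    return {
        "cid": block.get("cid"),
        "name": block.get("cand_name"),
        "industries": [
            {"industry": ind.get("industry_name"), "total": int(ind.get("total"))}
            for ind in block.findall("industry")
        ],
    }


record = industries_to_record(SAMPLE_XML)
print(json.dumps(record, indent=2))
```

Keeping one record per member, with a list of industries inside it, maps neatly onto the member-to-industry edges built later.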


NetworkX makes it quite easy to graph complex relationships. I processed each member, creating new member_nodes and industry_nodes as needed, and keeping track of each node’s total money given or received (allowing us to display size according to $ weight). Represented in Gephi, it’s messy but somewhat sensible:
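A minimal sketch of that processing loop, under a few assumptions: the `records` list is hypothetical stand-in data (invented names and amounts), the attribute names `kind`, `total` and `weight` are my own choices, and the output filename is arbitrary.

```python
import networkx as nx

# Hypothetical parsed records; the real ones would come from the
# parsed API responses.
records = [
    {"name": "Rep. A (D)", "industries": [
        {"industry": "Retired", "total": 120000},
        {"industry": "Lawyers", "total": 40000}]},
    {"name": "Rep. B (R)", "industries": [
        {"industry": "Retired", "total": 90000},
        {"industry": "Oil & Gas", "total": 25000}]},
]


def build_graph(records):
    """Create member and industry nodes and link them by donation amount."""
    G = nx.Graph()
    for rec in records:
        member = rec["name"]
        G.add_node(member, kind="member", total=0)
        for item in rec["industries"]:
            industry, amount = item["industry"], item["total"]
            # Create the industry node only the first time we see it.
            if industry not in G:
                G.add_node(industry, kind="industry", total=0)
            G.add_edge(member, industry, weight=amount)
            # Running totals: money received (member) or given (industry).
            G.nodes[member]["total"] += amount
            G.nodes[industry]["total"] += amount
    return G


G = build_graph(records)
print(G.nodes["Retired"]["total"])

# GEXF is a format Gephi opens directly; uncomment to write the file.
# nx.write_gexf(G, "lobbying.gexf")
```

In Gephi, the `total` attribute can then drive node size and `weight` can drive edge thickness.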

Image for post

A short disclaimer: there are only ~200 Representatives in that graph. The API limits the Industry method to 200 calls per day, so I’ve been designing some workarounds.

What’s great about Gephi is its selection of layout algorithms: you can quickly run any of several to arrange your graph sensibly. In this case, edges ‘pull’ nodes closer together with more or less strength depending on the money in that edge, so we can create a 2D space where members and industries cluster around their benefactors/puppets.
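If you’d rather stay in Python, NetworkX’s `spring_layout` gives a rough feel for the same idea. To be clear, this is a stand-in (a Fruchterman-Reingold force-directed layout), not the algorithm I ran in Gephi, and the toy graph and the `pull` attribute below are invented for the example.

```python
import networkx as nx

# Toy donation graph with invented amounts.
G = nx.Graph()
G.add_edge("Rep. A", "Retired", weight=120000)
G.add_edge("Rep. B", "Retired", weight=90000)
G.add_edge("Rep. B", "Oil & Gas", weight=25000)

# Rescale dollar amounts to the 0..1 range so the attraction strengths
# stay in a sane range; 'pull' is a made-up attribute name.
edges = list(G.edges(data="weight"))
max_weight = max(w for _, _, w in edges)
for u, v, w in edges:
    G[u][v]["pull"] = w / max_weight

# Heavier edges attract their endpoints more strongly.
# A fixed seed makes the layout reproducible between runs.
pos = nx.spring_layout(G, weight="pull", seed=42)

for node, (x, y) in pos.items():
    print(f"{node:10s} x={x:+.2f} y={y:+.2f}")
```

The returned `pos` dict maps each node to 2D coordinates, which can be passed to `nx.draw` or exported alongside the graph.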

We can see some structure: Republicans and Democrats have clear sides of the circle, and some are more bipartisan than others. Some industries are more partisan (Human Rights, Mining), and some are more central. Two thoughts immediately come to my mind: 1) this is an absolute mess, and 2) does the Retired lobby secretly control all of America?

I’ve been pushing the narrative that banker oligarchs are pulling the legal strings of our nation, but some of these central industries aren’t quite so nefarious. I’m more worried about the Sacklers influencing politics than the AARP.

Further Organization

Gephi is a great tool to get an overview of your graph, and there are fancier things you can do with it as well (community detection, among others). However, I want to be able to query this data to get descriptive statistics for any member or industry, so the next step is uploading the (completed) graph to Neo4j, a graph database with a SQL-ish query language (Cypher) and tons of features.

Tune in next week to uncover more information about our beloved legislators!

Written by

data scientist, machine learning engineer. passionate about ecology, biotech and AI.
