So often when someone starts a Twitter message with the label “Must read” I get defensive. You’re not my teacher. I’m a grown-up. I get to decide what I’m going to read, thank you very much. But I’m really tempted to start this post with “Must read” because Cathy O’Neil’s book, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy, is important and covers issues everyone should care about. Bonus points: it’s accessible, compelling, and – something I wasn’t expecting – really fun to read.
O’Neil is a data scientist who taught at Barnard before being seduced by the excitement of applying mathematics to finance, working for a Wall Street hedge fund before the crash of 2008. She quickly learned that one difference from academic mathematics was that employees were treated like members of an Al Qaida cell: the amount of information they could share was strictly limited so that if anyone was captured by a competing firm, they couldn’t reveal too much. Also, the scale of their collective if obscure work was ginormous. Subprime mortgages were a three trillion dollar market, but the markets created around them through credit default swaps, synthetic CDOs, and other weird financial inventions based on math and baloney were twenty times that size. As it all began to collapse, the damage cascaded, and people, lots of people, got hurt.
These risky financial instruments, like many other proprietary big data projects – what O’Neil calls “weapons of math destruction” – have features in common. They are opaque (few people could understand them even if they weren’t trade secrets that cannot be examined by those who are subject to the decisions they make); they work at large scale; and because they are sealed systems, they can’t learn from their mistakes. They can do a lot of damage and are bizarrely unaccountable for it, often claiming greater objectivity than the fallible humans who encode them. Her experience in high finance is a cautionary tale because the features that crashed the world economy are present in big data systems that affect our lives in myriad ways, from education to jobs to the criminal justice system to how we are persuaded to vote.
Two of her early chapters deal with the effect of these big data systems on higher education. On the one hand, they feed the hysterical and costly race among elite schools to game U.S. News’s ranking system, making colleges “manage their student populations almost like an investment portfolio” and enhance their status, often at the expense of actual students. On the other, she examines how for-profit schools manipulated personal data to target the vulnerable.
The marketing of these universities is a far cry from the early promise of the Internet as a great equalizing and democratizing force. They find inequality and feast on it. If it was true during the early dot-com days that “nobody knows you’re a dog,” it’s the opposite today. We are ranked, categorized, and scored in hundreds of models, on the basis of our revealed preferences and patterns. This establishes a powerful basis for legitimate ad campaigns, but it also fuels their predatory cousins: ads that pinpoint people in great need and sell them false or overpriced promises. The result is that they perpetuate our existing social stratification, with all of its injustices.
That injustice is scaled up in programs like predictive policing, which magnifies human prejudices, and algorithm-based sentencing: “we criminalize poverty, believing all the while that our tools are not only scientific but fair.” Companies that have turned to Big Data to manage the tiresome business of hiring low-wage workers use personality tests to “exclude as many people as possible as cheaply as possible” – and they don’t correct for their mistakes because they never check to see if that excluded employee actually turns out to be very good at something. Of course, these tests are used more on the poor than on the wealthy, who would never put up with it. Big data also is used to schedule workers to maximize efficiency, giving them a few days’ notice, making it impossible to schedule daycare or take courses and get ahead in life. Systems that rate teachers are based on nonsense, and secret and flawed formulas are used to determine whether we are credit-worthy.
Her penultimate chapter looks at ways these WMDs undermine democracy, and it's particularly timely. Because campaigns analyze personal data and target us with highly personalized political messages, we never know what promises a candidate is making to our neighbors or see the same news they read that persuades them of something that seems to us nonsensical. We are increasingly living in different worlds, sculpted to fit different world views, micro-targeted to the point that "e pluribus unum" no longer pertains. Rather, divide, divide, divide and conquer.
O’Neil doesn’t hate data. She loves mathematics, but hates the ways it’s being deployed on us. In her conclusion she suggests the ways we could use data to make lives better, if we didn't use money as a proxy for goodness. With transparency, open audits, and a willingness to take up the problems these systems expose, we could do great things for the public good.
Dare I say it? I will. For anyone who cares about information systems and how society works these days, it's a must read.