• Just Visiting

    A blog by John Warner, author of the story collection Tough Day for the Army, and a novel, The Funny Man, on teaching, writing and never knowing when you're going to be asked to leave.

Probability Isn't Destiny

Data should inform us, not rule us.

November 7, 2016

Nate Silver’s FiveThirtyEight has gotten some things wrong this past year.

The Cleveland Cavaliers had less than a 10% chance of winning the NBA title when they went down 2-0 to the Golden State Warriors.

The chances of Donald Trump winning the Republican nomination were pegged as low as 2% at the time of the announcement of his candidacy, and were still under 15% as the first caucuses and primaries loomed on the horizon.

And who can forget that the Chicago Cubs were given less than a 15% chance of winning the World Series once they fell behind 3-1 to the Cleveland Indians?[1]

Of course, Nate Silver will tell us, correctly, that he didn’t get anything “wrong.” He delivers probabilities, not predictions, so even if something has a 90% likelihood of happening according to his calculations, and the unlikely thing happens, it’s just one of those times when the 1 in 10 chance cashes in.
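To put rough numbers on that idea, here is a minimal sketch in Python. It is not anything FiveThirtyEight publishes or uses. The function names are hypothetical, the three probabilities are just the "less than" ceilings quoted above (each measured at a different moment), and treating the three events as independent is my assumption.

```python
from math import prod


def chance_at_least_one_upset(underdog_probs):
    """Chance that at least one underdog wins.

    Assumes the events are independent, which is an illustrative
    simplification, not a claim about any real forecasting model.
    """
    # Probability that every favorite wins, then take the complement.
    return 1 - prod(1 - p for p in underdog_probs)


def expected_upsets(underdog_prob, n_forecasts):
    """Expected number of upsets across n forecasts that share one underdog probability."""
    return underdog_prob * n_forecasts


# Ceilings quoted in this post: Cavaliers (<10%), Trump nomination (<15%), Cubs (<15%).
upper_bounds = [0.10, 0.15, 0.15]

print(chance_at_least_one_upset(upper_bounds))
print(expected_upsets(0.10, 10))
```

Under those assumptions, the chance that at least one of three such long shots comes through is roughly one in three, and ten separate 90% calls should produce about one miss. A forecaster whose 10% events never happened would be the one getting the probabilities wrong.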

Silver attributed his errors in estimating the probability of a Trump nomination to acting “too much like a pundit,” saying that the site’s early estimates weren’t based on the statistical models it has become known for.

According to Silver, there were two reasons for that: the nominating process lacked sufficient data to produce a model (the process we have today only stretches back to 1972), and the contest was too high in “structural complexity” (17 candidates).

As I write (the Sunday before the election), Silver’s outfit is the most bullish of the data journalists when it comes to Trump’s chances, giving him a 35% probability of victory. In contrast, the New York Times data outfit, The Upshot, gives Trump a 16% probability.

Anxious supporters of Hillary Clinton have imbued Silver with a kind of mystical power, their moods fluctuating with Trump’s probability of winning, even though Nate Silver has exactly nothing to do with the outcome of the election. A writer for the Huffington Post accused FiveThirtyEight of "just guessing," which is not true.

But it's also sort of true, in that Silver's model relies on assumptions about what the data means that are, at their heart, guesses: informed guesses, but guesses nonetheless.

These marvelous tools we have to collect and aggregate data have given us another route to informing our opinions, but we would never let FiveThirtyEight tell us who the next president is.

If Trump wins on Tuesday it will be in defiance of all known probability. Those rumored secret Trump voters will actually be real, I suppose.

The recriminations over the bad modeling will commence – something similar would happen if Hillary Clinton wins by an unexpectedly large margin – but presumably, the models will be improved with new information, more data.

Unexpected things happen, as our recent sporting championships indicate. Fortunately, models don’t determine the World Series Champion or the President of the United States.

The competitors actually get to play the game, no matter how daunting the odds, and sometimes, surprises happen.

--

I like data. I cite data often in this space, data like average student debt, or the student loads for contingent writing faculty.

I am not a fan of “data driven” decision making, however, because we know that data is not infallible, and when we put data inside of models, models created by humans with all of our biases and flaws, we can do real harm.

But even when the model is valid, we should be cautious. Imagine a model that says 85% of students who share a certain set of traits or previous experiences are likely to do poorly in a particular program or major, and based on this data, a student is guided away from that path.

We’re just playing the odds, which is smart, but when we’re talking about individual lives, is it right? Is it just?

The 7th game of the World Series was an object lesson in improbabilities. The Cubs’ leadoff hitter, Dexter Fowler, was the first player ever to lead off the 7th game of a World Series with a home run. Aroldis Chapman, the Cubs’ closer, hadn’t given up a home run since he’d become a Cub, but he surrendered a two-run lead on just such an event. Miguel Montero, who delivered the final Cubs RBI, was hitting less than .100 in the playoffs.

The pitcher who recorded the final out, Mike Montgomery, had never earned a save in his entire professional career.

And maybe you’d heard, but the Cubs hadn’t won the World Series in more than a century.

When we are told that it is necessary to “do more with less” in education, the less almost invariably means less human contact. Some in K-12 education have rushed eagerly toward the era of the “robo-tutor,” as software that promises “personalized learning” is trumpeted as the latest savior.

I’m sure these are useful tools, but we can’t lose sight of the fact that they’re tools meant to be used according to human discretion.

Decision making should be data-informed, but never data-driven, and when we’re talking about students and individual choice, definitely never data-dictated.

Probabilities are about playing the odds, and we should let them inform our practices, and we should share this information as far and wide as possible, but we have to also remember that when it comes to “structural complexity,” it’s hard to think of anything more complex than teaching and learning.

The only people who should get to make a bet on what’s going to happen are the students themselves, and everyone deserves a chance to play the game of their choosing.

[1] Not me. I cried for the second time in my life over the Chicago Cubs, this time with happiness. The first time was in 1984, when they lost a 2-0 series lead and a 3-0 lead in the decisive Game 5 against the San Diego Padres.

 
