Big Data’s Usability Problem


Sen. Lindsey Graham just told Fox News that the reason the FBI never realized that Boston Marathon bombing suspect Tamerlan Tsarnaev went to Russia in 2011 is that “when he got on the Aeroflot plane, they misspelled his name, so it never went into the system that he actually went to Russia.” Meanwhile, the Reinhart-Rogoff paper that has been a catalyst for government austerity policies worldwide since 2010 turns out to have accidentally left out several countries’ worth of critical data in Excel.

As one blogger sums up scathingly: “One of the core empirical points providing the intellectual foundation for the global move to austerity in the early 2010s was based on someone accidentally not updating a row formula in Excel.”

Taken together, these two stories offer a critical lesson about the power and limits of Big Data today. In both scenarios, data management tools (i.e., the FBI’s systems and Excel) were undone by fairly simple errors: in one, a misspelling; in the other, a failure to code a spreadsheet properly. And in both scenarios, the results were dire: an awful tragedy, and a potentially misdirected government economic policy in the midst of a recession.
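To make the misspelling case concrete, here is a minimal sketch of how a lookup could tolerate small spelling differences instead of demanding an exact match. It uses Python's standard difflib module. The function names, the 0.85 threshold, and the sample names are all hypothetical, and nothing here describes how the FBI's systems actually work.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Return a similarity ratio from 0 to 1, ignoring case and extra spaces."""
    norm_a = " ".join(a.lower().split())
    norm_b = " ".join(b.lower().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()


def find_possible_matches(query, known_names, threshold=0.85):
    """Return (name, score) pairs at or above the threshold, best match first.

    The threshold is an illustrative assumption; a real system would tune it
    and would likely combine it with other identifiers such as date of birth.
    """
    scored = [(name, name_similarity(query, name)) for name in known_names]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Made-up names: the query is a misspelling of the first entry.
known_names = ["John Smith", "Maria Garcia"]
matches = find_possible_matches("Jon Smithe", known_names)
print(matches)
```

An exact-match lookup would return nothing for the misspelled query. A tolerant one surfaces the near match for a human to review, which is the kind of guidance good usability should provide.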

As someone who spends day and night thinking through data management and workflow, I draw three observations from these two stories:

  • As a society, we’re hugely reliant on data management platforms for our most critical information.
  • Our core data platforms often aren’t set up to handle human error, from basic coding flaws to spelling mistakes.
  • The wealth of data in our data tools can mask that human error. Consider: The Reinhart-Rogoff study examined “new data on forty-four countries spanning about two hundred years” with “over 3,700 annual observations covering a wide range of political systems, institutions, exchange rate arrangements, and historic circumstances.”

In such a wide sea of data, a few lines of code can be very easy to overlook, even if they have strong ramifications for analysis.
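The spreadsheet case can be sketched in the same spirit. The snippet below averages a selected range of rows and also reports which rows the range left out, so an omission is visible instead of silent. The function name and the numbers are made up for illustration; this is not the Reinhart-Rogoff data, and it is not a claim about how their spreadsheet was built.

```python
from statistics import mean


def checked_mean(values, first_row, last_row):
    """Average values[first_row..last_row] (inclusive, 0-based).

    Also returns the indexes of any rows outside the selected range,
    so the caller can see what the calculation skipped.
    """
    if not 0 <= first_row <= last_row < len(values):
        raise ValueError("range falls outside the data")
    skipped = list(range(0, first_row)) + list(range(last_row + 1, len(values)))
    return mean(values[first_row:last_row + 1]), skipped


# Hypothetical figures, one per row.
growth = [1.0, 2.0, 3.0, 4.0, 5.0]
average, skipped_rows = checked_mean(growth, 0, 2)
if skipped_rows:
    print(f"Warning: rows {skipped_rows} were not included in the average")
```

The design choice is simply to make the tool state what it ignored. A formula that stops a few rows short produces a plausible number either way; only the warning tells the analyst something is missing.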

There are lots of things to take away from these three points, but I’ll just focus on one: The promise of Big Data is that it can make everyday processes — from critical analyses to mundane tasks — work smarter through data intelligence. Ultimately, all that data management translates into an economy and society that lets machines handle the minutiae as humans think through the larger picture.

To a large extent, that vision is already here. But at the same time, more human/data interaction means a lot more room for error (and inefficiency) around increasingly critical data sets, which, as we’ve seen, can have very serious results. So if we want to make the reality of Big Data match the dream, we need to invest serious time in two things: usability that guides human users toward the best way to engage with the data, and automation that takes human interaction (and human error) out of the picture for a lot of the basic calculations and tasks, and for some of the complicated ones, too.

If Big Data can’t fit hand in glove with usability and workflow, a lot of the promise of Big Data will be empty data crunching. That’s not just a problem for getting where we want to be in the evolution of computing. It’s a situation that can lead to bad data management, which translates into bad economics and, sometimes, far worse.

Bill Wise is CEO of Mediaocean. You can follow him on Twitter at @billwise.
