
Big Data Has an Exhaust Problem
Hey, what should we do with our data? Perhaps you've heard, or asked, that question before. It's a common query these days, a byproduct of the growing interest in -- some may say obsession with -- big data and data science. Unfortunately, it's not the right question to ask, says Steve Weber, a professor at the University of California School of Information's data science program.
"A better question is: 'What do my customers really want and need and desire?' " Weber tells InformationWeek. "And then: 'What kind of data would I need to collect, and what would I need to do with it to help them?' "
Sounds like obvious stuff, Weber admits, but it's a pragmatic approach that big-data-obsessed organizations often overlook.
"When you start with the data, it's like putting the cart before the horse," he says. "It's an obsession with the tools, an obsession with the data exhaust. You're searching around in the haystack for the needle."
It's far more efficient, he asserts, to start with a core business question. Example: What value-added service or product do I want to provide to my customers, but can't today? And then the follow-up: What data would allow me to design it?
"The technology comes on, and suddenly everyone feels like they've got to get it in place before they really know what they're going to do with it. You can do that, but I'm not sure it's the most efficient way to go."
And Hadoop fans, take note: It's wise not to become too enamored of a particular big-data platform or tool.
"Hadoop is a great piece of software, or a great platform, but it's not the only one -- it's an early one. Lots of people are starting to build tools to democratize the ability to work with [big data]."
The web analogy is applicable here. "In the early days of the web, writing HTML was really complicated. Now, basically, you don't really need to know any HTML to make a web page."
And the move to democratize complex technology may be happening much faster in the big data space. Says Weber: "Hadoop's good, but if you bet on it for the long run, you're likely to be surprised."
One big data development that Weber finds "unbelievably exciting" is the Internet of Things, or, as he calls it, the "really cheap sensors world."
"I'm wearing sensors, and everything I interact with is instrumented in some fashion. It starts to become mind-boggling, both in terms of what we can know, and the almost unlimited number of things we can do with that data."
That depends, of course, on whether the information is collected and analyzed, ideally by well-intentioned parties.
"People sometimes use the term 'data exhaust' [to describe] all the data that their interaction with the world is throwing off, and how little of it gets collected," says Weber.
Source: http://www.informationweek.com