Trifacta Aims to Make Big Data Useful, Lands $4.3 Million From Accel Partners
We hear a lot about big data these days. It’s the IT industry catch phrase of the moment, and describes the notion of capturing and analyzing data from many quarters in order to find patterns and other information that is useful in making business decisions.
One part of that equation is easy to solve: Moore’s law and the persistent march of storage technology have made the number-crunching and storage of all the data relatively easy and inexpensive.
The other part is hard: Gleaning actual information from the data by interpreting it, recognizing what matters and then deciding on a course of action. In the end, that requires a person. And people are a little complicated.
A study by McKinsey reckons that the demand for people with deep analytical skills will exceed the supply by about 140,000 to 190,000 people by 2018. So, while computing engines will become incrementally more powerful and cheaper, the supply of people available to understand all that data being gathered will not keep up.
That fact was an epiphany to Joe Hellerstein, a computer science professor at the University of California at Berkeley, and to Jeffrey Heer, a professor in the Human-Computer Interaction research group at Stanford University. Why not make that analysis — the bits that only a human can do — easier and more accessible?
The result of their research is Trifacta. The start-up came out of stealth mode today and announced a $4.3 million Series A investment from Accel Partners led by Ping Li, head of the firm’s Big Data Fund. The company also has investments from X/Seed Capital, Data Collective and angel investors Dave Goldberg, Venky Harinarayan and Anand Rajaraman.
Trifacta’s aim is to help close that analysis gap by making data science more efficient, which means making the data itself easier to manipulate and rearrange. A lot of data analysis is simply asking what would happen if one condition or another were different, or if one or two key assumptions changed. How might sales of a widget suffer if the gross domestic product of a certain country drops next year by a few basis points? How might the yield of a certain crop be better or worse if the mean temperature during the growing season is one or two degrees hotter? Those are super-simple examples that I’m making up off the top of my head, but you get the idea.
Tackling that problem requires expertise in three areas: Database systems, data visualization and machine learning. As it happens, Trifacta has assembled a team of some of the best experts in all three.
Hellerstein is Trifacta’s chief executive officer and a professor of computer science at Berkeley. He’s also a leading authority on data-centric systems. Heer, the chief experience officer, is known for his work at Stanford on open-source data visualization libraries such as Protovis and D3.js. Sean Kandel, Trifacta’s CTO, is a former financial analyst who did his dissertation work at Stanford studying analyst behavior and designing tools to improve productivity.
They’ve assembled a team of advisers that includes Michael Bostock, a data visualizer with the New York Times; and Jeff Hammerbacher, founder and chief scientist at Cloudera, the start-up focusing on the open-source big-data engine Hadoop. Other advisers include Sam Madden, a computer science professor at MIT; Tim O’Reilly, CEO of O’Reilly Media; and DJ Patil, the data scientist in residence at Greylock Ventures.
Trifacta’s technology takes data, whether found in small files or in huge petabyte-sized troves in Hadoop, and lets users manipulate it through a guided, iterative process.
Accel’s Li, who is taking a board seat at Trifacta, summarized his interest like this: “The world doesn’t need another Hadoop or SQL company. The biggest problem with big data is around the ability to get information out of it. That gap is huge, and it’s not going to be solved anytime soon. This is really the soft underbelly of big data right now.”