Arik Hesseldahl

Exclusive: Hadoop Companies Multiply as MapR Lands $20M in Funding

Hadoop, it seems, is everywhere these days. If you have a big data job to do, Hadoop is more than likely capable of helping you get it done.

Hadoop is an open source technology known for its cute cartoon elephant mascot (hence the photo). Its roots are at Google: It was inspired by MapReduce, one of the fundamental technologies that make the Google search experience what it is. It was created at Yahoo, which donated it to the open source community by way of the Apache Software Foundation. That means it’s free.
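For readers who haven’t seen the MapReduce pattern, here is a rough single-machine sketch of the idea in Python, using the classic word-count example. It is not Hadoop’s API and not Google’s implementation; the function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration, and a real cluster would spread each phase across many machines.

```python
from collections import defaultdict

# Illustrative only: these names are hypothetical, not part of Hadoop.

def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)

def word_count(documents):
    """Run map, shuffle and reduce in sequence on a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

counts = word_count(["big data big jobs", "big clusters"])
print(counts)
```

The appeal of the pattern is that the map and reduce steps each work on independent pieces of data, which is what lets a framework farm them out to a large cluster.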

It’s used by companies as varied as Facebook, Groupon and AOL to turn workloads involving huge sets of data into manageable tasks. It’s so popular, in fact, that several companies have sprung up hoping to turn a profit by helping other companies run Hadoop, in much the same way that Red Hat makes money by helping companies run Linux.

I’ve written here in the past about Cloudera, and Yahoo’s Hadoop team recently spun out as Hortonworks (again with the elephant references).

Now there’s another Hadoop company on the scene — MapR — and it has just secured a $20 million round of venture capital funding led by Redpoint Ventures, with Lightspeed Venture Partners and New Enterprise Associates also participating. This comes on top of a strategic relationship with storage giant EMC, in which the hardware maker is offering MapR’s Hadoop distribution with some of its systems.

So what does MapR aim to do? Create an industrial-strength version of Hadoop that’s ready for the enterprise. I talked with CEO John Schroeder. “We created a reliable and dependable platform that’s built for high availability so clusters don’t fail. And we also added data protection, so you can back up your data and recover to a point in time that works in large clusters,” he said.

MapR also tuned its version of Hadoop for speed. It’s not uncommon, Schroeder said, for MapR to run two to five times faster than other distributions on standard benchmark tests. As you might expect, faster is better. You can arrive at your analytical answers sooner, or run more workloads on larger data sets, or you can run the same ones on cheaper hardware. So Schroeder is only half kidding when he says it’s “cheaper than free.”
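To see how the “cheaper than free” arithmetic might work, here is a back-of-the-envelope sketch in Python. Only the two-to-five-times speedup range comes from Schroeder; the 100-node cluster, the $4,000 per-node price and the assumption that throughput scales linearly with node count are hypothetical, chosen purely for illustration.

```python
import math

def nodes_needed(baseline_nodes, speedup):
    """Nodes required to match the baseline cluster's throughput,
    assuming (hypothetically) that throughput scales linearly with nodes."""
    return math.ceil(baseline_nodes / speedup)

def hardware_savings(baseline_nodes, speedup, cost_per_node):
    """Hardware spending avoided by running the same workload on fewer nodes."""
    saved_nodes = baseline_nodes - nodes_needed(baseline_nodes, speedup)
    return saved_nodes * cost_per_node

# Hypothetical cluster: 100 nodes at $4,000 each.
for speedup in (2, 5):
    nodes = nodes_needed(100, speedup)
    savings = hardware_savings(100, speedup, 4000)
    print(f"{speedup}x faster: {nodes} nodes, ${savings:,} less hardware")
```

Under those assumptions, the hardware savings would have to exceed whatever MapR charges for the claim to hold, and real clusters rarely scale perfectly linearly.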

I talked with Satish Dharmaraj, a general partner at Redpoint, and asked him what he sees in MapR. The market for “big data,” he says, is real. “It’s pretty clear to us that the MapReduce method of crunching big sets of data is the easiest and most cost-efficient way of doing things, and it’s disrupting the analytics and software industry in how they process big sets of data.”

Dharmaraj also likes the team. Schroeder was previously CEO of Calista Technologies, which he sold to Microsoft, and before that, CEO of Rainfinity, now part of EMC. His co-founder and CTO is M.C. Srivas, who ran one of Google’s search infrastructure teams, and so has an intimate familiarity with the original MapReduce to which Hadoop is so closely related. Srivas was also chief architect at Spinnaker Networks, now part of NetApp; before that, he ran the engineering team at Transarc, now part of IBM.

Finally, Dharmaraj likes MapR’s approach. “Hadoop is great, but it’s an open source project, so there’s nobody really building all the things around it that an enterprise would need, like disaster recovery. It’s also really fast. Jobs that take 30 hours on other versions are taking five hours,” he said. “That, to us, makes this the first version of Hadoop for the enterprise.”


