Everyone Loves Hadoop, So Cloudera Makes It Easier to Manage
Name a Web company of any size doing anything reasonably complex, and chances are it’s using Hadoop. The open source software makes working with big data sets practical in distributed computing environments. Hadoop is based on concepts developed at Google, and since it’s open source, it’s technically free for anyone who wants to download and install it on their own systems. Plus it has a cheerful yellow elephant for a mascot that looks like something out of an old Warner Bros. cartoon. What’s not to like?
One company that’s been making a lot of noise around Hadoop is the start-up Cloudera. Backed by $36 million in venture capital investments from Accel Partners, Greylock Partners, Meritech Capital Partners and In-Q-Tel (that’s the CIA’s venture capital arm), Cloudera aims to create a booming business around Hadoop. Companies using Cloudera’s version of Hadoop include eBay, Groupon and AOL.
How to build a business around something that’s free? By making what Cloudera says is the best Hadoop distribution around and helping companies run it by offering services and support. I liken it a bit to what Red Hat does with Linux.
Today Cloudera announced another step in that process: Cloudera Enterprise 3.5, available now, is a big update that includes new automated configuration and monitoring tools and what the company describes as “one click” security for clusters of machines running Hadoop. It’s a subscription service and includes a management suite and production support. Charles Zedlewski, Cloudera’s VP of products, told me that the tools include a real-time monitoring dashboard. “We saw there was no performance monitoring in Hadoop like there is in the database world, and we thought that was a shame. So we built this, so now companies can have a real-time view of the performance of their workloads in Hadoop,” he said.
The company is also launching Cloudera SCM Express. SCM stands for Service and Configuration Manager, and its point is to make it easy for anyone to install and configure a complete Hadoop stack. “It’s basically a way of automating all the changes you might need to make to the Hadoop stack,” Zedlewski said. “It handles everything: configuration changes, restarts, adding new servers, all the normal operational stuff.” He said that up to now the way to do all those things was to connect to the machines remotely using SSH and copy files using an arcane command-line interface. “We didn’t think that was any way to live, so we built this.” SCM Express is available as a free download and can be used to operate a Hadoop cluster of up to 50 nodes.
If you’re still having a hard time getting your head around Hadoop, Cloudera has a two-minute video with Doug Cutting, who is basically the founder of the Hadoop project. Before joining Cloudera, Cutting led the Hadoop team at its old home, Yahoo.
Incidentally, Yahoo announced this week that it is spinning off its Hadoop team as a new start-up, with the help of Benchmark Capital. It will be called Hortonworks, named after the elephant in the Dr. Seuss books, naturally.