John Paczkowski

Google and the Evolution of Search II: Cheating the System

This is the second of three interviews with members of the Google (GOOG) team responsible for overseeing search algorithms at the company. The introduction and Part I, an interview with Scott Huffman, appeared yesterday. In today’s installment, Google software engineer Matt Cutts talks about search quality and spam. In Part III tomorrow, Google Fellow Amit Singhal will wrap up the series.

Google and the Evolution of Search

  1. Human Evaluators — Google Engineering director Scott Huffman
  2. Cheating the System — Google software engineer Matt Cutts
  3. What’s Next in Search? Much, Much Better Search — Google Fellow Amit Singhal

Part II: Matt Cutts

John Paczkowski: How do you maintain quality in search?

Matt Cutts: Well, broadly, we improve our algorithms and hopefully, every so often, develop some punctuated equilibrium where we create totally new ways to improve our relevance. My contribution… is ensuring that people who try to cheat the system don’t show up higher than they deserve to in our results. We want sites ranking high based on merit, not based on shortcuts.

JP: OK, so how do you do that?

MC: Essentially we look at a wide variety of input. We look at user complaints, for example. We also have a variety of internal metrics we use to track current trends. They help show us what people are using to spam right now. What’s getting past our defenses. And when we detect those things, we write some new algorithms or develop some tool that helps us detect and, hopefully, counteract them. So a large part of what we do is simply spotting trends in spam.

JP: Is there a human evaluation element here as well?

MC: Each team is responsible for general search-quality evaluations, but it’s not like they’re changing rankings or anything like that. That said, there are some policy violations that are pretty egregious. So, for example, if you type in your name and instead of getting All Things Digital, you got a porn site, you would get pretty angry about that. And you might complain to Google. And it would be frustrating if our reply was, “Yeah, well, we think we might have an algorithm that might fix that problem in five or six months, so we’re just going to leave that porn site as the top result for All Things D until we get an algorithm up to help you out.” Obviously, that’s a deeply dissatisfying answer.

So in spam, we are sometimes willing to take manual action on those sorts of policy violations. But Google’s philosophy is that wherever you can use machines and algorithms, it is much better, more robust, more scalable. And so, to the extent that we can, we always want to rely on the computers as our first line of defense.

JP: But you’re willing to remove spam manually until you can find an algorithm to counteract it. Do you think that will always be the case? Will we someday reach a point where human intervention of the sort you just described won’t be necessary, or are we headed toward increasing human intervention?

MC: That’s a really fascinating question, but I don’t know the answer. What’s interesting to think about is that PageRank, the raw PageRank algorithm, actually improves as it ranks more pages. So the more pages you add to it, the easier it is to determine how reputable a particular page is without human intervention.

But as the Web grows in size we also encounter new and different policy violations: hidden text, cloaking. Those are the sorts of things that humans are very good at spotting. You can certainly identify some of them with a computer algorithm, but not all. And so our intent is always to try to make sure that we handle things efficiently with machines and algorithms. But I don’t know that we will ever get there completely.
