Google's Latest Search Changes Could Be Very Bad News For Retailers

A Brooklyn retailer was arrested Monday (Dec. 6) and federally charged with fraud and harassment. But the most heinous offense of eyewear-hawker Vitaly Borker was his criminally cynical manipulation of retail rankings within Google.

Borker figured out that any kind of customer comment, including a really negative one, would send his pageviews from Google soaring. Note: This didn't help him if a customer typed in his retail brand (Decormyeyes), but few prospects had a reason to do that. They'd be much more likely to type in major optical brands such as Ciba Vision, which Borker resold. Because of the Borker case, Google has changed its search mechanism. But that might be bad news for many legitimate retailers.

Google is being cagey about the changes it made, but Google Fellow Amit Singhal did post a few comments on the Google blog: "In the last few days, we developed an algorithmic solution which detects [Borker] along with hundreds of other merchants that, in our opinion, provide an extremely poor user experience. The algorithm we incorporated into our search rankings represents an initial solution to this issue, and Google users are now getting a better experience as a result. We can't say for sure that no one will ever find a loophole in our ranking algorithms in the future. We know that people will keep trying: Attempts to game Google's ranking go on 24 hours a day, every single day. That's why we cannot reveal the details of our solution—the underlying signals, data sources and how we combined them to improve our rankings—beyond what we've already said."

Before we get back to Google, we want to say that what Borker is accused of—and he seemed to concede many of the accusations in an interview with The New York Times—is serious and quite criminal. He literally threatened one customer who wanted a refund with rape (anal rape, to be specific) and backed up his threats by sending her a picture of the front of her house. But the changes Google made will likely go well beyond punishing merchants who behave poorly and even criminally.

The theoretical goal of every search engine is to figure out what searchers really want and to deliver that information to them. The assumption is that someone searching for "Panasonic HDTV and Samsung HDTV" is quite likely thinking of buying an HDTV. The system then suggests some major retailers that are known to sell such products.

So far, that purchase assumption will probably work. Sure, some searchers may want market stats on HDTV activities, HDTV buying guides or maybe technical discussions on how HDTVs work. But when two competing brands are mentioned in the search field, the buying intent guess seems legitimate.

The problem is the next step: Which retailers should be displayed and in which order? The old way was, more or less, a clean numeric calculation. How many links to each retailer's site exist and how many are coming from popular or well-regarded external sites? And, at issue here, how many people are referencing these sites in comments, which we will informally refer to as the "famous" factor.

If a consumer is searching for a movie actress, the results that will please most people will go to the actresses who are referenced most frequently and most recently. A famous actress of the 1950s might have 50 million references to her in the archives, but few are particularly recent. Why doesn't Google, or any other search engine, rank actresses by how well they can act or singers by how well they can sing? Should politicians be ranked by intelligence, honesty or the number of votes received?
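To make the "famous" factor concrete, here is a minimal sketch of a purely numeric ranking. Google has not disclosed how it actually scores sites, so everything here is an assumption for illustration: the `Retailer` fields, the `fame_score` function, the weights and the sample numbers are all hypothetical. The point is only that a score which counts mentions without reading them rewards notoriety as readily as praise.

```python
from dataclasses import dataclass


@dataclass
class Retailer:
    name: str
    inbound_links: int   # all links pointing at the retailer's site
    trusted_links: int   # links from popular or well-regarded sites
    mentions: int        # comments referencing the site, good or bad


def fame_score(r: Retailer,
               w_links: float = 1.0,
               w_trusted: float = 5.0,
               w_mentions: float = 2.0) -> float:
    """Hypothetical score: a weighted count that never reads the comments."""
    return (w_links * r.inbound_links
            + w_trusted * r.trusted_links
            + w_mentions * r.mentions)


# Two made-up retailers that differ only in how much they get talked about.
quiet = Retailer("quiet-and-competent", inbound_links=100, trusted_links=10, mentions=20)
notorious = Retailer("loud-and-hated", inbound_links=100, trusted_links=10, mentions=500)

ranked = sorted([quiet, notorious], key=fame_score, reverse=True)
for r in ranked:
    print(r.name, fame_score(r))
```

In this toy setup the retailer with 500 mentions outranks the one with 20, even if every one of those 500 comments is a complaint.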

These are judgments that software doesn't make very well. That's partly because humans don't make them very well, either, since such judgments are so entangled with emotion and personal preference. (For the record, the only personal preferences that should matter are mine, but my wife disagrees. Go figure.)

When you do a search for a word on Google, Bing, Yahoo or any other major engine, how often does it correctly figure out what you really want to know? If it guesses correctly 70 percent of the time, that is amazingly impressive. But it also means it's wrong almost one out of three times.

How much would your search engine referral business suffer if Google concluded that your very happy customers, the ones who used a lot of exclamation points to praise your superb customer service, were actually yelling in anger? Will the underlying software understand sarcasm or humor?

Let's assume a customer wrote: "Wal-Mart had these laptops on a clearance sale. They were fully loaded, with an MSRP of $2,000, and I picked up three for $250 each. I love a great ripoff!" Would the software recognize that Wal-Mart was being complimented on having great prices? Or would it zero in on "ripoff" with an exclamation point and conclude that the customer was angry?
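As a rough illustration of why this is hard, the sketch below runs that comment through two simple keyword heuristics. Neither is Google's method, which has not been published. The word lists and both function names (`balance_vote`, `alarm_word_rule`) are invented for this example.

```python
import re

# Tiny hand-picked word lists, purely hypothetical.
POSITIVE = {"love", "great", "superb"}
NEGATIVE = {"ripoff", "scam", "fraud"}


def tokens(text: str) -> list[str]:
    """Lowercase alphabetic words only; digits and punctuation are dropped."""
    return re.findall(r"[a-z]+", text.lower())


def balance_vote(comment: str) -> str:
    """Heuristic A: count positive words minus negative words."""
    words = tokens(comment)
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def alarm_word_rule(comment: str) -> str:
    """Heuristic B: a negative word in an exclaimed sentence means anger."""
    for sentence in re.findall(r"[^.!?]*[.!?]", comment):
        if sentence.strip().endswith("!") and NEGATIVE & set(tokens(sentence)):
            return "negative"
    return balance_vote(comment)


comment = ("Wal-Mart had these laptops on a clearance sale. They were fully "
           "loaded, with an MSRP of $2,000, and I picked up three for $250 "
           "each. I love a great ripoff!")

print(balance_vote(comment))     # positive
print(alarm_word_rule(comment))  # negative
```

Two equally plausible rules read the same happy customer in opposite ways, and neither one actually understands the joke.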

I have tremendous respect and admiration for some of the extremely talented folk at Google. Heck, the number of truly great minds on Google's payroll (and, for what it's worth, Amazon's) is staggeringly high. That's how the company gets all the way up to 70 percent and maybe even 75 percent in successful search results. But the more Google, or any other search engine, leaves the numeric realm and gets deeper into the subjective realm, the more its accuracy is going to have to drop.

Also, will Google's software factor in the realities of consumer comments? For instance, consumers expect a perfect experience, in the sense of "deliver to me what you promised, at the price you promised and in the timeframe you promised." If you do that perfectly, the consumer will be happy, but not necessarily overjoyed. Competent professional performance will not prompt many consumers to bother posting a comment. But screw up and consumers are highly motivated to post.

Presumably, Google's software (which, if nothing else, can count well) would spot this trend. But how would it deal with it? The larger the chain, the more negative comments it will attract in absolute terms. Will the software determine a percentage for each retailer, as in "of all the comments you received, 17 percent were negative"? Or will it compare retailers with each other? Will the software consider differences such as a company-owned chain versus a franchised one, where owner groups could perform quite differently?
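The difference between counting complaints and computing a percentage is easy to show with made-up numbers. In the sketch below, the two retailers and their comment totals are hypothetical, and `negative_share` is just a name chosen for this example. The 17 percent figure is borrowed from the question above.

```python
def negative_share(negative: int, total: int) -> float:
    """Fraction of a retailer's comments that are negative (0.0 if none exist)."""
    if total == 0:
        return 0.0
    return negative / total


# Hypothetical comment tallies for a national chain and a one-store shop.
retailers = {
    "big_chain": {"negative": 1700, "total": 10000},
    "small_shop": {"negative": 90, "total": 100},
}

worst_by_count = max(retailers, key=lambda n: retailers[n]["negative"])
worst_by_share = max(
    retailers,
    key=lambda n: negative_share(retailers[n]["negative"], retailers[n]["total"]),
)

print(worst_by_count)  # big_chain: 1,700 complaints beats 90
print(worst_by_share)  # small_shop: 90 percent negative beats 17 percent
```

Which retailer looks worse depends entirely on which of the two measures the software uses, and neither accounts for the fact that satisfied customers rarely bother to post.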

I'm all in favor of punishing retailers that abuse their customers. It's just the idea of software making that determination that makes me nervous. Right, HAL?