Foursquare Crashes Risk Shaking Retailer Loyalty

Check-in services like Foursquare are supposed to help keep customers engaged with retailers. But Foursquare, which has 3 million users, was offline for 11 hours last Monday (Oct. 4) due to a database problem, and the same problem took it down for another six hours the next day. While those outages were happening, it wasn't just Foursquare that felt the pain; it also fell on the retailers whose customers use the service.

That's a key problem with tying your loyalty or rewards programs to an outside company, especially a quickly rising startup with inevitable growing pains. From the point of view of your customers, when they check in with Foursquare at your store or restaurant, they're checking in with you. And if Foursquare is unresponsive, you're the one who will get blamed.

The technical problem that took down Foursquare is one that, in theory, shouldn't have happened. Foursquare uses a document-oriented database called MongoDB that's designed to be highly scalable in ways that relational databases can't be. Foursquare counted on the database's ability to balance the load in storing user check-ins.

But according to the company, at 11 a.m. Monday (New York time), one section of the database—a "shard," in MongoDB jargon—got overloaded. For 90 minutes, Foursquare techs tried to rebalance the load. Nothing worked. At 12:30 p.m., they tried to manually create a new shard to split the data. That caused the entire site to go down.
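
For readers who want to see how one shard can end up overloaded while another sits nearly idle, here is a minimal Python sketch. It is an illustration only, not Foursquare's or MongoDB's actual logic: the range-based placement by user ID, the function names and the capacity figure are all assumptions made for the example. It shows that splitting users evenly across shards does not split the check-in load evenly when a few users are far more active than the rest.

```python
from collections import Counter

def shard_for(user_id, num_shards, users_per_shard):
    """Hypothetical range-based placement: users 0..N-1 on shard 0, and so on."""
    return min(user_id // users_per_shard, num_shards - 1)

def load_per_shard(checkins, num_shards, users_per_shard):
    """Count how many check-ins land on each shard."""
    load = Counter()
    for user_id in checkins:
        load[shard_for(user_id, num_shards, users_per_shard)] += 1
    return [load[s] for s in range(num_shards)]

def overloaded(loads, capacity):
    """Return the shards whose check-in count exceeds an assumed capacity."""
    return [s for s, n in enumerate(loads) if n > capacity]

if __name__ == "__main__":
    # 100 users check in once each; the first 10 users check in 20 more times.
    checkins = list(range(100)) + [u for u in range(10) for _ in range(20)]
    loads = load_per_shard(checkins, num_shards=2, users_per_shard=50)
    print(loads)                             # [250, 50]
    print(overloaded(loads, capacity=200))   # [0]
```

Both shards hold 50 users, yet the first carries five times the check-ins of the second, which is the kind of imbalance that rebalancing is meant to correct.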

After another six hours of futile efforts to fix the problem, followed by five more hours of reindexing and testing, the database was working again, and Foursquare was finally back up and accepting check-ins.

The next day it happened again—the same problem, but this time the service was only down for six hours, from 6:30 p.m. to 12:30 a.m.

That was Foursquare's problem. Your problem as a retailer, as you head into the holidays, is that your check-in-oriented customers aren't likely to see a real line dividing Foursquare from you. To them, when Foursquare is down, you're down.

Maybe that sounds painfully familiar. It should. Only a few years ago, you were struggling to keep your Web site working during the holiday season. And just as you worked all the kinks out of your internal operations, you began to run up against the problems of your partners.

You had outsourced your site's shopping cart, and that collapsed under too heavy a load. You depended on UPS and FedEx to handle shipping information on your site, and their response time slowed to a crawl when things got too busy. Every time you handed off a function to partners, you faced the possibility that they wouldn't come through—and no matter how crisp your own operations were, they could make you look bad. It was up to you to find ways of making sure they didn't do that.

That's pretty much the state of check-in services today, too. Can you pressure Foursquare and other check-in vendors to improve their performance and reliability? Perhaps not. But you can insist that they participate in the same dry runs that you require of your own team. After all, for all practical purposes, they are now part of it.