Macy's Debit Fiasco Not A POS Issue. The Switch Is To Blame

Editor's Note: After our coverage last week of the Macy's internal probe over its December debit card over-charge fiasco, On-Line Strategies' Andy Orrock and his colleague, Dave Bergert, wrote a blog post about their thoughts. StorefrontBacktalk was impressed by their analysis and asked Orrock to write a modified version as a GuestView column for us. He generously agreed.
The incident of double- and triple-billing of debit cards at Macy's in December sounds more and more like a failure at the payment switch level, not in the point of sale register system.

But before we delve into the significance of where the failure happened, it's important to acknowledge the payment system's different perspective on debit versus credit and how a debit transaction can so much more easily go awry. With offline debit and credit, the payment switch only has to perform reasonably well to avoid making unfortunate headlines, because the subsequent extract file will typically be right.

Settlement of credit transactions is based on a "clearing file," whereas settlement of online debit happens at the transaction level. There is no clearing file for settling online debit transactions. So, with online debit, the switch has to perform consistently and exceptionally well. One spate of bad processing can make news like the Macy's glitch happen.

Some things to consider:

  • Scope
    "The glitch was limited to PIN debit transactions, ignoring signature debit transactions."
  • Resolution
    "Within a half-hour of noticing the first incident, Macy's began shifting stores to an alternate gateway that was operating properly, and the shift of all stores was completed in less than two hours, said Macy's officials."
  • Other concern #1
    "An even more vexing failure occurred in an application that was supposed to keep an eye out for identical sales tickets that are processed multiple times."
  • Other concern #2
    The President of Macy's credit and consumer services division said his "immediate concern" was why the automated reversal system didn't work as designed to prevent double and several triple charges. The system was supposed to automatically send a credit to the bank for the second debit.

    Let's break this down. First, it's unlikely this was a problem with the "point-of-sale register system." That phrase implies an in-store system. An institution the size of Macy's will have a payment switch at their operations center through which they'll run all transactions. The point-of-sale register system communicates with the centralized host that, in turn, communicates with payment gateway providers like FDR and Fifth Third and direct authorization links to AMEX and Discover.

    The problem was probably on this host application of theirs. We can say this because the problem got resolved by "shifting stores to an alternate gateway that was operating properly." In this instance, gateway refers to Macy's host. Nothing at the store system level got changed to get things back on track.

    Regarding the scope, I guarantee you that credit and offline debit transactions got caught up in this crap, too. (See my credit versus debit post for background.) The difference is that credit and offline debit have a wide tolerance for processing error. You can screw up the online interaction (multiple times, in fact), but most times you'll end with one good interaction — all the customer intended — and that's the one that ends up in the nightly extract/settlement file, which is the financial letter of record. By contrast, PIN-ed Debit/EBT has very little tolerance for error. If you screw up the online interaction, you are well and truly screwed. Take it from somebody who has the scars on his back to prove it.
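
To make that contrast concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration, including the `Attempt` record, its field names and the two settlement functions; it is not how Macy's switch or any real network models transactions. It shows three attempts at one sale where the issuer approved each one but the switch only saw the last approval in time.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    ticket: str                 # hypothetical sales-ticket identifier
    amount_cents: int
    issuer_approved: bool       # the issuer approved the request
    switch_saw_approval: bool   # the switch received that approval in time


def credit_settlement(attempts):
    """Credit / offline debit: money moves on the nightly extract file,
    which holds one record per ticket the switch recorded as approved."""
    settled = {}
    for a in attempts:
        if a.switch_saw_approval:
            settled[a.ticket] = a.amount_cents
    return sum(settled.values())


def pin_debit_settlement(attempts):
    """PIN debit: each issuer approval posts at transaction time, and
    there is no later file to correct it (reversals not modeled here)."""
    return sum(a.amount_cents for a in attempts if a.issuer_approved)


attempts = [
    Attempt("T1", 5000, True, False),  # approval arrived too late
    Attempt("T1", 5000, True, False),  # same again
    Attempt("T1", 5000, True, True),   # the one good interaction
]

print(credit_settlement(attempts))     # 5000
print(pin_debit_settlement(attempts))  # 15000
```

With the same three messy attempts, the extract file charges the customer once, while the PIN debit path charges three times unless reversals go out.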

    Regarding the concern about the "vexing failure" on the component that was supposed to check for duplicates, I suspect that the "Dup Check" (as I like to call it) wouldn't have worked here. There's a strong chance that the Macy's payment switch had these original transaction attempts flagged as failures. The Dup Check is going to look for an approved transaction that looks like the one coming through. But there was no approval on file, only a failure.
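
A rough sketch of that suspicion, in Python. The record layout, field names and the `is_duplicate` function are all hypothetical; the actual Macy's duplicate check is not described in any detail in the reporting. The sketch assumes the check only matches against prior transactions the switch recorded as approved.

```python
def is_duplicate(incoming, history):
    """Hypothetical dup check: flag the incoming sale only if an APPROVED
    transaction with the same card, amount and ticket is already on file."""
    return any(
        prior["status"] == "approved"
        and prior["card"] == incoming["card"]
        and prior["amount_cents"] == incoming["amount_cents"]
        and prior["ticket"] == incoming["ticket"]
        for prior in history
    )


# The first attempt was charged at the bank, but the switch logged a failure.
history = [
    {"card": "x1234", "amount_cents": 5000, "ticket": "T1", "status": "failed"},
]
retry = {"card": "x1234", "amount_cents": 5000, "ticket": "T1"}

print(is_duplicate(retry, history))  # False
```

Because the only matching record on file is a failure, the retry sails through as if it were a brand-new sale.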

    So what most likely actually happened? My best guesses:

  • Transaction responses from the authorizer were received late, reversals were placed into a Store and Forward (SAF) queue, but the queue was corrupted.
  • A garbage or malformed record got to the head of the SAF queue and no one could figure out how to remove it. They then tried to move to the other server node with a clean SAF queue. When the corrupted SAF queue was removed from the picture, so were the legitimate debit reversal transactions that should have been sent to reverse incomplete or failed debit sale transactions.
  • Macy's had a slowdown internally on its server and didn't have a threshold mechanism in place to prevent sending out already-aged requests for remote authorization. The threshold setting implements a source-based timeout mechanism before committing to send a transaction out for external authorization. The total time elapsed on the transaction so far is compared to the threshold. If elapsed time exceeds the threshold value (indicative of internal problems or slowdowns), then the transaction will be internally rejected prior to committing to external routing.
  • The request/response match-up mechanism in the host system's multiplexer quit working above a certain volume level due to a programmatic flaw. I've seen implementations where this gets so twisted up that killing the process is the only solution.
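
The threshold mechanism in the third guess can be sketched in a few lines of Python. The function name, parameter names and the use of plain seconds are assumptions made for illustration, not taken from any particular switch product.

```python
def should_route_externally(received_at, now, threshold_seconds):
    """Source-based timeout: compare the time already spent on the
    transaction inside the host against the threshold. If the elapsed
    time exceeds the threshold, reject internally instead of sending an
    already-aged request out for external authorization."""
    elapsed = now - received_at
    return elapsed <= threshold_seconds


# Timestamps are seconds on an arbitrary clock; threshold is 5 seconds.
print(should_route_externally(100.0, 102.0, 5.0))  # True: still fresh, send it
print(should_route_externally(100.0, 112.0, 5.0))  # False: aged, reject internally
```

The point of rejecting internally is that the store system has likely given up on an aged request already, so an external approval that arrives later would have nothing left to match against.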

    Note that in any one of those scenarios, the reversal processing would have failed. In scenarios 1 and 2, the reversals are generated, but the SAF is a jail. In scenario 3, no reversal is generated because things "worked" from the host application's perspective even though the store system has long since closed out the transaction. In scenario 4, all hell has broken loose. If there are no debit reversals to counter incomplete or failed debit sales, customers will get charged multiple times.
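
The "SAF is a jail" situation in scenarios 1 and 2 can be illustrated with a small Python sketch. It assumes a strictly in-order queue that stops draining at the first record it cannot send; the `drain` and `send` functions and the record layout are hypothetical, not a description of Macy's actual implementation.

```python
from collections import deque


def drain(queue, send):
    """Strict in-order SAF drain: stop at the first record that fails."""
    sent = []
    while queue:
        if not send(queue[0]):
            break  # the head record blocks everything queued behind it
        sent.append(queue.popleft())
    return sent


def send(record):
    # Hypothetical sender: a record with no ticket is treated as malformed.
    return "ticket" in record


saf = deque([
    {"garbage": "??"},                     # malformed record at the head
    {"type": "reversal", "ticket": "T1"},  # legitimate reversals stuck behind it
    {"type": "reversal", "ticket": "T2"},
])

sent = drain(saf, send)
stranded = [r for r in saf if r.get("type") == "reversal"]
print(len(sent), len(stranded))  # 0 2
```

Nothing gets sent, and if the whole queue is then set aside in favor of a clean one, the two stranded reversals go with it.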