Issue affecting the Gateway API on the Braintree platform

140 points by siddharthgoel88 1 year ago | 65 comments
  • mauvehaus 1 year ago
    Meanwhile, anyone in Boston who only read the headline is completely unsurprised to hear that the red line isn't running and is wondering how this merits making the front page of HN.
  • sschueller 1 year ago
    I received an email for every PayPal authorized connection I had, indicating that it has been terminated. Related?

       This email confirms that you have canceled your payment agreement with ###### No further payments will be made from your PayPal account to this merchant. If you have any further questions about the agreement, or wish to reinstate it, please contact ###### directly.
    • smeej 1 year ago
      Not sure, but anecdotally I haven't received any such notices even though I do have active connections.
    • preinheimer 1 year ago
      Hugops to the team working on it.

      The biggest incidents have the best post mortems.

      • shaftoe 1 year ago
        "Some merchants may be seeing a higher-than-usual decline due to Gateway rejections Fraud."

        Is this really "down"?

        • Aurornis 1 year ago
          The trend is to downplay the issue in status messages to obscure the real problem. That message could mean anything from an extra 1% of rejections to 99% of transactions failing.

          The vagueness is the point, because they want to avoid admitting serious problems.

          We had this problem with some devops hires who came from a big company. They’d delay updating the status page as long as possible, then update with the weakest language that was technically correct. “Some customers might experience degraded performance” was their go-to message for nearly complete outages. They’d argue that it was technically correct because some requests were getting through in some logs somewhere.

          It was a side effect of working in an environment where their bonuses depended on downtime and the severity of outages. The game was to admit as little as possible to keep those bonus numbers high. We didn’t calculate bonuses that way but they had ingrained the behavior from years of BigCo performance reviews.

          • s_dev 1 year ago
            >We had this problem with some devops hires who came from a big company.

            Amazon.

            All you have to do is look at their status page of green lights when us-east goes down completely to lose complete faith in their status page reflecting anything but wishful thinking.

            • wging 1 year ago
              Seems unlikely, bonuses are not an Amazon thing, and iiuc status pages aren’t a decision such people would be making anyway. A dedicated “devops” person at Amazon (to the extent that’s even a thing, mostly engineering teams own their own ops) would be unlikely to benefit from minimizing issues. The status page issue you’re discussing is real but I don’t think it’s the fault of lower level engineers.
            • sontek 1 year ago
              Updating the status page in the middle of the incident is always an art. Sometimes you can truly define impact and update the status page without weak language but other times you can't.

              You still want to notify customers they may be seeing issues even if you aren't confident on the percentage of impacted customers yet.

            • siddharthgoel88 1 year ago
              For us the rejection rate is 90+% which is equivalent to down for me.
              • piva00 1 year ago
                For merchants it is. I worked on a marketplace before, and having checkout flows with higher-than-usual declines will eat into your sales. People don't tolerate it well and will either drop the purchase completely if it's a "want" and not a "need", or will go to the competition to finish the purchase.
                • bastawhiz 1 year ago
                  If a site is being DDoSed and only 10% of legitimate traffic is going through, is it "up"? I think you'd be hard pressed to call that "not down". So if the proximate cause is fraud instead of network requests, how is it any different?
                  • digitalsin 1 year ago
                    I absolutely despise this kind of language that is becoming so commonplace now and is obvious BS. I wish I could pay my bills to these same companies using language like this. "It's not that I didn't pay my bill, it's just that some dollars may experience longer-than-usual time to get to your bank."
                    • ricardobayes 1 year ago
                      If you're trying to order lunch or pay for your medicine online, then yeah I'd say it's pretty much down.
                      • dna_polymerase 1 year ago
                        Unless it's hosted on Solana...

                        or

                        on the Ethereum blockchain, where, yes, the service is not technically down but unavailable to anyone who isn't paying a hundred bucks for a simple transaction.

                        • ranting-moth 1 year ago
                          It's technically not a lie.

                          It's a half-truth which is technically the same or worse than a lie.

                        • xyst 1 year ago
                          Extended downtime like this usually means prod database deleted. Sucks to be that SRE team. Can’t wait for the Kevin Fang re-enactment!
                          • consoomer 1 year ago
                            Or that one microservice holding up the other 10,000 microservices keeled over :)
                            • ranting-moth 1 year ago
                              But microservices are standalone services.

                              Anyone who correctly implements ms wouldn't make it depend on other ms or get it shipped to prod anyway.

                              Edit: /s

                              • btilly 1 year ago
                                I'm trying to figure out the odds that you intended this as sarcasm.

                                In the real world, it doesn't work like that.

                                • resonious 1 year ago
                                  Anyone who correctly implements a payment service wouldn't make it go down for 7 hours.
                                  • consoomer 1 year ago
                                    Right... I'm sure that's what happens.
                                    • sp332 1 year ago
                                      I would actually love to know if you have done this in prod and what it looked like.
                                      • kikimora 1 year ago
                                        This is brilliant :)
                                        • tklinglol 1 year ago
                                          [dead]
                                      • sp332 1 year ago
                                        Last time Stripe had extended downtime, it was DNS. I think that's more likely than a missing database.
                                      • jxf 1 year ago
                                        I don't think that's the most common cause. Deleting the production database is obviously disastrous but there are many other reasons for extended downtime, and they're usually about the distributed-ness of the underlying architecture.
                                      • xbenjii 1 year ago
                                        We had to disable the fraud protection on our account to be able to accept transactions temporarily, luckily we have alternative fraud protection in play too.
                                        • sparrish 1 year ago
                                          Our account didn't see any interruption. Subscriptions processed fine and new subscriptions were processed as well.
                                          • m00dy 1 year ago
                                            Defi has never been down.
                                            • tmpX7dMeXU 1 year ago
                                              I could name plenty of other systems nobody wants and speak to their uptime, too? My Plex server probably has better uptime than GitHub lately but I’m not gonna start pushing my code there. It does its job serving the barbie movie, just like defi tells me who to unfollow.
                                              • ta1243 1 year ago
                                                The whole point of git was its decentralisation.

                                                Then people just moved to a client/server model hanging off github.

                                                But you're right, local servers are more reliable than big cloud ones.

                                                • hdctambien 1 year ago
                                                  I think you may have missed a bit of their point... a plex server is a video hosting server.. like a personal Netflix. They wouldn't push their code to their Plex server because it's not a server that accepts code pushes.
                                                  • LadyCailin 1 year ago
                                                    Not on average. On average, with proper failover and redundancy, you can get five nines, which is 5 minutes of downtime per year. I have that much every time I reboot my machine.
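
                                                    A rough check of the five-nines figure, as a minimal shell sketch. The function name downtime_minutes is made up for illustration, and it assumes a 365-day year:

                                                    ```shell
                                                    # Hypothetical helper: minutes of allowed downtime per 365-day year
                                                    # for a given availability fraction (e.g. 0.99999 for "five nines").
                                                    downtime_minutes() {
                                                      awk -v a="$1" 'BEGIN { printf "%.2f\n", (1 - a) * 365 * 24 * 60 }'
                                                    }

                                                    downtime_minutes 0.99999   # five nines
                                                    downtime_minutes 0.999     # three nines
                                                    ```

                                                    Five nines works out to about 5.26 minutes per year, so "5 minutes" is a fair rounding; three nines already allows several hours.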
                                                • lifefeed 1 year ago
                                                  "My toilet pipes are broken and I have to get a plumber."

                                                  "Shitting in the woods is never broken."

                                                • vermilingua 1 year ago
                                                  It’s also never been up
                                                  • oooyay 1 year ago
                                                    Well yeah, in DeFi you also wouldn't have an anti-fraud gateway because then there'd be no transactions.
                                                    • LightRailTycoon 1 year ago
                                                      I've developed a DeFi anti-fraud gateway, first version runs on stdin:

                                                        yes '"Fraudulent Transaction Detected"'
                                                      
                                                      I also have a json api:

                                                        while true; do echo 'HTTP/1.1 200 OK
                                                        Content-Type: application/json; charset=UTF-8
                                                        Server: anti-fraud gateway
                                                        
                                                        "Fraudulent Transaction Detected"'|nc -l 8123;done
                                                    • q87b 1 year ago
                                                      • piva00 1 year ago
                                                        Defi doesn't have disputes nor chargebacks either, complete no-go for any mainstream checkout flow.
                                                        • arrowsmith 1 year ago
                                                          And yet no-one apart from a tiny handful of techbros uses it.

                                                          Almost as if uptime isn't the only thing that matters.

                                                          • anaganisk 1 year ago
                                                            But it is illegal depending on the country, or a scam depending on the receiver, or centralized if you use a hosted wallet. DeFi's merits on paper sound life-changing, but reality is far from it. It's an unorganized, volatile mess.
                                                            • danheskett 1 year ago
                                                              Not a great message when billionaire tech-heads are getting scammed for seven figures and are entirely defenseless. If Mark Cuban can't get the vendor/tech mix right, what chance does an average Joe have?
                                                              • fullspectrumdev 1 year ago
                                                                Assuming because someone is successful and competent in one area means they must be competent in other, related areas has a habit of going poorly, and happens a lot in tech.
                                                                • smeej 1 year ago
                                                                  The average Joe has a whole different set of incentives than Mark Cuban.

                                                                  It's much easier to sort out the smaller set of incentives the average Joe has and align them.

                                                              • jklinger410 1 year ago
                                                                The value is down on 99% of the coins, though
                                                                • flangola7 1 year ago
                                                                  Defi has no fraud protection
                                                                  • xyst 1 year ago
                                                                    People in the comments are getting nasty. Sure, the ecosystem is plagued with scammers, but the financial system in general is filled with scammers. We just have insurance products or the faith of the US government to bail you out lol

                                                                    I still believe in digital currency as the future.

                                                                    • inductive_magic 1 year ago
                                                                      The (highly entertaining) comments are rightfully bashing the de facto state of defi in the context of GP's comment.

                                                                      Remind me when a digital currency actually exists, because I haven't seen one yet. The problem isn't just that the ecosystem is filled with scammers, it's that the tech doesn't solve any existing problem but creates new ones instead. Web3 is a joke.