Issue affecting the Gateway API on the Braintree platform
140 points by siddharthgoel88 1 year ago | 65 comments
- mauvehaus 1 year agoMeanwhile, anyone in Boston who only read the headline is completely unsurprised to hear that the red line isn't running and is wondering how this merits making the front page of HN.
- fasteo 1 year agoSome context[1] for the non-Bostonians
    Massachusetts Bay Transportation Authority RED LINE Ashmont/Braintree 4 current alerts
So it seems this line is pretty unreliable.
- dublinben 1 year agoThat's quite an understatement. It's currently taking 68 minutes longer than it should to get from Braintree to the other end.[0]
https://dashboard.transitmatters.org/system/slowzones/?chart...
- Tempest1981 1 year ago15 miles south of Boston: https://en.wikipedia.org/wiki/Braintree,_Massachusetts
- nerdjon 1 year agoI haven't had enough coffee yet, I saw the headline and couldn't figure out why something didn't make sense reading it and I live in Boston.
Glad to see this comment to explain it.
- NetOpWibby 1 year agoHaven't lived in Boston since early 2018 and I've only heard worse and worse things. Good grief.
- daveguy 1 year agoFor context, the original HN title was:
BrainTree has been down for more than 7 hours now
- rd 1 year agoHahaha this is what my brain went to immediately as well
- sschueller 1 year agoI received an email for every paypal authorized connection I had indicating that it has been terminated. Related?
This email confirms that you have canceled your payment agreement with ###### No further payments will be made from your PayPal account to this merchant. If you have any further questions about the agreement, or wish to reinstate it, please contact ###### directly.
- smeej 1 year agoNot sure, but anecdotally I haven't received any such notices even though I do have active connections.
- preinheimer 1 year agoHugops to the team working on it.
The biggest incidents have the best post mortems.
- shaftoe 1 year ago"Some merchants may be seeing a higher-than-usual decline due to Gateway rejections Fraud."
Is this really "down"?
- Aurornis 1 year agoThe trend is to downplay the issue in status messages to obscure the real problem. That message could mean anything from an extra 1% of rejections to 99% of transactions failing.
The vagueness is the point, because they want to avoid admitting serious problems.
We had this problem with some devops hires who came from a big company. They’d delay updating the status page as long as possible, then update with the weakest language that was technically correct. “Some customers might experience degraded performance” was their go-to message for nearly complete outages. They’d argue that it was technically correct because some requests were getting through in some logs somewhere.
It was a side effect of working in an environment where their bonuses depended on downtime and the severity of outages. The game was to admit as little as possible to keep those bonus numbers high. We didn’t calculate bonuses that way but they had ingrained the behavior from years of BigCo performance reviews.
- s_dev 1 year ago>We had this problem with some devops hires who came from a big company.
Amazon.
All you have to do is look at their status page of green lights when us-east goes down completely to lose complete faith in their status page reflecting anything but wishful thinking.
- wging 1 year agoSeems unlikely, bonuses are not an Amazon thing, and iiuc status pages aren’t a decision such people would be making anyway. A dedicated “devops” person at Amazon (to the extent that’s even a thing, mostly engineering teams own their own ops) would be unlikely to benefit from minimizing issues. The status page issue you’re discussing is real but I don’t think it’s the fault of lower level engineers.
- sontek 1 year agoUpdating the status page in the middle of the incident is always an art. Sometimes you can truly define impact and update the status page without weak language but other times you can't.
You still want to notify customers they may be seeing issues even if you aren't confident on the percentage of impacted customers yet.
- siddharthgoel88 1 year agoFor us the rejection rate is 90+% which is equivalent to down for me.
- piva00 1 year agoFor merchants it is, I worked on a marketplace before and having checkout flows with higher than usual declines will eat on your sales. People don't tolerate it so well and will either drop the purchase completely if it's a "want" and not a "need", or will go to the competition to finish the purchase.
- bastawhiz 1 year agoIf a site is being DDoSed and only 10% of legitimate traffic is going through, is it "up"? I think you'd be hard pressed to call that "not down". So if the proximate cause is fraud instead of network requests, how is it any different?
- digitalsin 1 year agoI absolutely despise this kind of language that is becoming so commonplace now and is obvious BS. I wish I could pay my bills to these same companies using language like this. "It's not that I didn't pay my bill, it's just that some dollars may experience longer-than-usual time to get to your bank."
- ricardobayes 1 year agoIf you're trying to order lunch or pay for your medicine online, then yeah I'd say it's pretty much down.
- dna_polymerase 1 year agoUnless it's hosted on Solana...
or
on the Ethereum blockchain, where, yes, the service is not technically down but unavailable to anyone who isn't paying a hundred bucks for a simple transaction.
- ranting-moth 1 year agoIt's technically not a lie.
It's a half-truth which is technically the same or worse than a lie.
- xyst 1 year agoExtended downtime like this usually means prod database deleted. Sucks to be that SRE team. Can’t wait for the Kevin Fang re-enactment!
- consoomer 1 year agoOr that one microservice holding up the other 10,000 microservices keeled over :)
- ranting-moth 1 year agoBut microservices are standalone services.
Anyone who correctly implements ms wouldn't make it depend on other ms or get it shipped to prod anyway.
Edit: /s
- btilly 1 year agoI'm trying to figure out the odds that you intended this as sarcasm.
In the real world, it doesn't work like that.
- resonious 1 year agoAnyone who correctly implements a payment service wouldn't make it go down for 7 hours.
- consoomer 1 year agoRight... I'm sure that's what happens.
- sp332 1 year agoI would actually love to know if you have done this in prod and what it looked like.
- kikimora 1 year agoThis is brilliant :)
- tklinglol 1 year ago[dead]
- sp332 1 year agoLast time Stripe had extended downtime, it was DNS. I think that's more likely than a missing database.
- granzymes 1 year agoYou mean Square? They had a big DNS outage two weeks ago.
https://techcrunch.com/2023/09/11/square-daylong-outage-dns-...
- sp332 1 year agoYup, that was the one.
- jxf 1 year agoI don't think that's the most common cause. Deleting the production database is obviously disastrous but there are many other reasons for extended downtime, and they're usually about the distributed-ness of the underlying architecture.
- xbenjii 1 year agoWe had to disable the fraud protection on our account to be able to accept transactions temporarily, luckily we have alternative fraud protection in play too.
- sparrish 1 year agoOur account didn't see any interruption. Subscriptions processed fine and new subscriptions were processed as well.
- m00dy 1 year agoDefi has never been down.
- tmpX7dMeXU 1 year agoI could name plenty of other systems nobody wants and speak to their uptime, too? My Plex server probably has better uptime than GitHub lately but I’m not gonna start pushing my code there. It does its job serving the barbie movie, just like defi tells me who to unfollow.
- ta1243 1 year agoThe whole point of git was its decentralisation.
Then people just moved to a client/server model hanging off github.
But you're right, local servers are more reliable than big cloud ones.
- hdctambien 1 year agoI think you may have missed a bit of their point... a plex server is a video hosting server.. like a personal Netflix. They wouldn't push their code to their Plex server because it's not a server that accepts code pushes.
- LadyCailin 1 year agoNot on average. On average, with proper failover and redundancy, you can get five nines, which is 5 minutes of downtime per year. I have that much every time I reboot my machine.
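For anyone who wants to check the "five nines" figure, here is a minimal sketch of the arithmetic. The function name and the 365.25-day year are assumptions made for illustration, not anything from the comment above.

```python
# Minutes in an average year; 365.25 days is an assumption that averages in leap years.
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(nines: int) -> float:
    """Allowed downtime per year for an availability with `nines` nines (5 -> 99.999%)."""
    unavailability = 10 ** -nines  # e.g. 5 nines -> 0.00001
    return MINUTES_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in range(2, 6):
        print(f"{n} nines: {downtime_minutes_per_year(n):.2f} minutes/year")
```

Five nines works out to a little over 5 minutes a year, so a single reboot really can use up the whole budget.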
- lifefeed 1 year ago"My toilet pipes are broken and I have to get a plumber."
"Shitting in the woods is never broken."
- 11235813213455 1 year agountil you step on it
- vermilingua 1 year agoIt’s also never been up
- oooyay 1 year agoWell yeah, in DeFi you also wouldn't have an anti-fraud gateway because then there'd be no transactions.
- LightRailTycoon 1 year agoI've developed a DeFi anti-fraud gateway, first version runs on stdin:
    yes '"Fraudulent Transaction Detected"'
I also have a json api:
    while true; do printf 'HTTP/1.1 200 OK\r\nContent-Type: application/json; charset=UTF-8\r\nServer: anti-fraud gateway\r\n\r\n"Fraudulent Transaction Detected"\n' | nc -l 8123; done
- piva00 1 year agoDefi doesn't have disputes nor chargebacks either, complete no-go for any mainstream checkout flow.
- arrowsmith 1 year agoAnd yet no-one apart from a tiny handful of techbros uses it.
Almost as if uptime isn't the only thing that matters.
- anaganisk 1 year agoBut it is illegal depending on the country, or a scam depending on the receiver, or centralized if you use a hosted wallet. DeFi's merits on paper sound life changing, but reality is far from it. It's an unorganized, volatile mess.
- danheskett 1 year agoNot a great message when billionaire tech-heads are getting scammed for seven figures and are entirely defenseless. If Mark Cuban can't get the vendor/tech mix right, what chance does an average Joe have?
- fullspectrumdev 1 year agoAssuming because someone is successful and competent in one area means they must be competent in other, related areas has a habit of going poorly, and happens a lot in tech.
- smeej 1 year agoThe average Joe has a whole different set of incentives than Mark Cuban.
It's much easier to sort out the smaller set of incentives the average Joe has and align them.
- jklinger410 1 year agoThe value is down on 99% of the coins, though
- flangola7 1 year agoDefi has no fraud protection
- xyst 1 year agoPeople in the comments are getting nasty. Sure, the ecosystem is plagued with scammers, but the financial system in general is filled with scammers. We just have insurance products or the faith of the US government to bail you out lol
I still believe in digital currency as the future.
- inductive_magic 1 year agoThe -highly entertaining- comments are rightfully bashing the de facto state of defi in the context of GP's comment.
Remind me when a digital currency actually exists, because I haven't seen one yet. The problem isn't just that the ecosystem is filled with scammers, it's that the tech doesn't solve any existing problem but creates new ones instead. Web3 is a joke.