Track HN: Survival Rate of Show HN Stories
230 points by namiwang 2 years ago | 73 comments
- gkoberger 2 years agoI'm happy to say that the reports of my death here are greatly exaggerated :)
I'm the owner of both #4 and #140 on the Top-scoring Show HN Stories that Didn’t Survive... but both are very much alive!
#4 StackSort was a Github.com page, but in 2021 they made it so only Github.io works. If dang sees this, I'd really appreciate if you could change the URL for https://news.ycombinator.com/item?id=5395463 to use github.io!
#140 ReadMe has the same io/com issue, in the opposite direction! We redirect readme.io to readme.com now, which seems to be why it's flagged.
- 12907835202 2 years agoHow on earth did you get readme.com?
I'm assuming someone else owned it, whenever I see that and all the "make an offer" links I move on and ignore it. Was the process easy?
- gkoberger 2 years agoIt wasn't called ReadMe originally... I happened to be browsing HN, and came across a post with someone offering readme.io for free, and I was like "oh that's a great name!" (I ultimately paid $3k as a thank you)
https://news.ycombinator.com/item?id=6397526
For the first few years, we used readme.io as our domain. When we did our Series A, I finally bought the .com for $170k. By that point I knew we were successful, and I figured the longer I waited the more it'd cost.
- 12907835202 2 years agoWow I can't fathom the .com being worth that much. Did you manage to do any sort of math on whether the .com has helped bring in $170k of sales? Or how many years it would take to break even.
- Lerc 2 years agoSomeone has to be the first one.
I once registered web.site. I had it for a week or two before I tried pointing it at a server at which time someone noticed and took it off me.
- nl 2 years ago> at which time someone noticed and took it off me.
This... isn't how domain registration works.
- bottled_poe 2 years ago“Someone”…?
- codetrotter 2 years agoThe main page https://gkoberger.github.io/ that the https://gkoberger.github.com/ link suggests going to gives a 404 as well. Could be a good idea to add a main page for https://gkoberger.github.io/ that links the StackSort page and anything else
- gkoberger 2 years agoGood call! I'll fix that up today.
- echelon 2 years agoIf we're going by vote count, my "show HN" should be #30, but it's not on the list at all.
- mkl 2 years agoIf you mean https://news.ycombinator.com/item?id=23965787, that still survives, right? Which is why it's not in the list of "Top-scoring Show HN Stories that Didn’t Survive" (where it would be #31).
- reaperman 2 years ago> Extra: ChatGPT Gave a Wrong Regex
> I consulted ChatGPT for a regex to extract domains from URLs, and it gave a flawed one:
> ^(?:https?:\/\/)?(?:[^@\n]+@)?(?:www\.)?([^:\/\n?]+).
> It even gave reasonably detailed explanations, which convinced me. Later tests revealed that this regex doesn’t work for URLs with @ in the path, such as https://foo.com/@./bar. The correct one should be
> ^(?:https?:\/\/)?(?:[^@\/\n]+@)?(?:www\.)?([^:\/?\n]+).
---------------------
The trick is to ask ChatGPT what the right tool for the job is in your language of choice. For python, ChatGPT will happily give you:
  from urllib.parse import urlparse
  extract_domain = lambda url: urlparse(url).netloc.replace('www.', '', 1)
  # Example usage
  url = 'https://foo.com/@./bar'
  domain = extract_domain(url)
  print(domain)  # Output: foo.com
I don't think RegEx is typically the "most" correct tool for the job for things which likely have built-in parser libraries (XML, HTML, URLs, JSON, etc)
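To make the comparison concrete, here is a minimal Python sketch that runs both regexes quoted in this comment and a parser-based approach against the same URL. It assumes the trailing "." after each quoted regex is sentence punctuation rather than part of the pattern, and the helper names are made up for illustration.

```python
import re
from urllib.parse import urlparse

# Both patterns are copied from the quote above, minus the trailing "."
# (assumed to be sentence punctuation, not part of the regex).
FLAWED = re.compile(r"^(?:https?:\/\/)?(?:[^@\n]+@)?(?:www\.)?([^:\/\n?]+)")
FIXED = re.compile(r"^(?:https?:\/\/)?(?:[^@\/\n]+@)?(?:www\.)?([^:\/?\n]+)")


def domain_by_regex(pattern, url):
    """Return the first capture group, or None if the pattern does not match."""
    match = pattern.match(url)
    return match.group(1) if match else None


def domain_by_parser(url):
    """Use the standard-library URL parser, then drop one leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


url = "https://foo.com/@./bar"
print(domain_by_regex(FLAWED, url))
print(domain_by_regex(FIXED, url))
print(domain_by_parser(url))
```

For this URL the flawed pattern treats "foo.com/" as userinfo and captures only ".", while the corrected pattern and the parser both return foo.com. The parser also strips userinfo and ports without any extra pattern work.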
- folli 2 years agoNice work!
I'd actually be interested in factors that make a Show HN a success vs failure.
Objectively, there's an obvious one in your dataset: time of submission. Tuesday afternoon (which timezone? I assume US west coast?) seems to be key. No way this correlates with the quality of submissions.
Subjectively: it seems to have become much harder recently. I managed once a couple of years ago for a short time to reach the front page with an Android app, now I'm barely able to get above 20 points, even though the product is (again, subjectively) cooler and has a possibly wider audience (https://news.ycombinator.com/item?id=35671245).
Not complaining, but perhaps nowadays Show HN is not an easy way anymore to "get the word out" and get some early user feedback for and from indie hackers? Any other sites that might be of interest?
- OJFord 2 years agoIts badge on a product's home page is to me a negative signal, but partly since it does still happen (quite a lot) - people do seem to use ProductHunt.
(I suppose I'd use it - and pretty much anything - but just not put 'omg #1' badge on my site, if I had something to launch myself.)
Completely tangential now, but I think its problem is right in the title - who is hunting a product? It's a complete echo chamber, surely nobody who doesn't have something to launch is actively using it - 'it's Wednesday so I need a new Gmail-integrating Jira spline reticulator'.
- noncoml 2 years ago> which timezone?
I’m wondering the same. Earlier in the article he mentions UTC.
So it’s either afternoon or early in the morning Pacific time.
- jumploops 2 years agoNo affiliation, but the second to top deceased site is still alive and kicking [0]
Spot checking the top results might give a better estimate for how many are actually alive vs. just using bot protection.
- bagels 2 years agoIt just errors out right now. How can we differentiate: always errors out vs dead?
- gkoberger 2 years agoVercel (and AWS) are down right now, hence the error.
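One way to approach the "always errors out vs dead" question is to probe each site several times on different days and only call it dead when every probe fails. The sketch below is a rough illustration of that idea in Python; the outcome labels, the status-code groupings, and the three-probe threshold are all assumptions, not the article's methodology.

```python
# Hypothetical outcome labels for a single probe of a site.
OK, BLOCKED, CERT_ERROR, FAILED = "ok", "blocked", "cert_error", "failed"


def classify_probe(status=None, error=None):
    """Map one probe result (an HTTP status or an error name) to a label."""
    if error == "cert":
        return CERT_ERROR      # site answers, but the certificate is invalid
    if error is not None or status is None:
        return FAILED          # DNS failure, timeout, connection refused
    if status < 400:
        return OK
    if status in (401, 403, 429):
        return BLOCKED         # likely bot protection or rate limiting
    return FAILED              # 404, 5xx and similar


def verdict(outcomes, min_probes=3):
    """Combine probes taken on different days into one verdict."""
    if any(o in (OK, BLOCKED, CERT_ERROR) for o in outcomes):
        return "alive"         # something answered at least once
    if len(outcomes) >= min_probes:
        return "dead"          # failed every time we looked
    return "uncertain"         # too few probes; could be a temporary outage


print(verdict([classify_probe(status=503)]))
print(verdict([classify_probe(status=503), classify_probe(error="timeout"),
               classify_probe(status=200)]))
```

A single failed probe, such as one taken during a hosting outage, comes out as "uncertain" rather than "dead", and a site that answers with a 403 or an expired certificate still counts as alive.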
- sentrysapper 2 years agohttps://harvestsignal.com/ is also still alive, but the site certificate expired.
- david_shaw 2 years agoThanks for making and sharing this - although I'm surprised it's not a "Show HN" itself!
I was curious about the top post that didn't survive - an HTML5 game called "airma.sh" - and I wanted to check it out. I think I found a working mirror: https://www.crazygames.com/game/airmash
It's possible that this is a different game, but it seems to fit the description.
Interestingly, the person who submitted that post stopped being active on HN after that discussion.
- flyinglizard 2 years agoAirmash lives very well on this community hosted site: https://airmash.online/
The original author was never to be heard from again.
- karaterobot 2 years agoI know you mention there are lots of reasons for false positives and negatives, but does your methodology account for length of time at all? Meaning, if a project was posted to HN in 2009, it could have been successful for 14 years and then closed down, or just changed URLs somewhere along the way, and in that case it would be counted as a failure even though it wasn't. Likewise, if it was posted in May, 2023 and is still around, that doesn't mean much because it's still flying the Grand Opening banner, practically.
- h0l0cube 2 years agoExactly. Some of these graphs are really flawed. Like the heatmap for the top 1% which pretty much mirrors the submission heatmap. I want to see what portion of submissions for that time slot reached 1%, not of all submissions. There could be time slots that perform exceedingly well outside of popular times.
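The per-slot rate described here could be computed roughly as follows. This is a sketch in Python with made-up sample data; the field names (`weekday`, `hour`, `score`) and the function name are assumptions, not the article's schema.

```python
from collections import defaultdict


def top_share_by_slot(stories, top_fraction=0.01):
    """For each (weekday, hour) slot, return the share of that slot's
    stories whose score is in the overall top fraction."""
    if not stories:
        return {}
    scores = sorted((s["score"] for s in stories), reverse=True)
    # Score of the last story inside the top fraction (at least one story).
    cutoff = scores[max(1, int(len(scores) * top_fraction)) - 1]

    totals = defaultdict(int)
    hits = defaultdict(int)
    for s in stories:
        slot = (s["weekday"], s["hour"])
        totals[slot] += 1
        if s["score"] >= cutoff:   # ties at the cutoff are counted as hits
            hits[slot] += 1
    return {slot: hits[slot] / totals[slot] for slot in totals}


# Made-up sample: a busy slot and a quiet slot with one hit each.
sample = [
    {"weekday": "Tue", "hour": 15, "score": 500},
    {"weekday": "Tue", "hour": 15, "score": 3},
    {"weekday": "Tue", "hour": 15, "score": 2},
    {"weekday": "Tue", "hour": 15, "score": 1},
    {"weekday": "Sun", "hour": 4, "score": 400},
    {"weekday": "Sun", "hour": 4, "score": 5},
]
print(top_share_by_slot(sample, top_fraction=0.34))
```

In the sample both slots contain one top story, so a heatmap of raw counts would show them as equal, but the quiet slot has the higher rate (0.5 against 0.25) because far fewer stories were submitted in it.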
- Semaphor 2 years agoThe top 250 has 8 dead projects from 2023. Of those 8, 5 are not dead at all, 1 is alive but has an expired certificate and only 2 (the lowest ranked) are dead. This does not seem like useful data.
- actuallyalys 2 years agoThat's definitely a red flag, although I'd expect the 2023 data to have a disproportionate number of false negatives relative to true negatives (since the vast majority of 2023 projects are still alive).
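As a quick back-of-the-envelope check, the counts reported in the parent comment can be turned into rates. The snippet below only uses those eight entries, so it says nothing about the rest of the dataset.

```python
# Counts taken from the parent comment: 2023 entries in the top 250
# that the article lists as dead.
flagged_dead = 8
actually_alive = 5
alive_expired_cert = 1
actually_dead = 2

assert actually_alive + alive_expired_cert + actually_dead == flagged_dead

correctly_flagged = actually_dead / flagged_dead
wrongly_flagged = (actually_alive + alive_expired_cert) / flagged_dead

print(f"correctly flagged as dead: {correctly_flagged:.0%}")
print(f"wrongly flagged as dead:   {wrongly_flagged:.0%}")
```

So in this small sample only a quarter of the "dead" labels hold up, which is consistent with the point that very recent projects will mostly show up as mislabeled rather than truly dead.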
- gadgetoid 2 years agoAirmash still lives at https://airmash.online/ and there’s also a space mod - Starmash - at https://airmash.cc/
I apologise in advance for the hours you’ll lose to these (again?)
- zX41ZdbW 2 years ago> Looking for a Sponsor to Host the Database Publicly
> In the meantime, it’d be great if anyone can query the database. I tried to host a public database and real-time query interface online, but couldn’t afford the bill for a smooth Postgres instance to hold around 20G (40M rows plus indices) data. While a $20 instance could suffice, it’s pretty slow from usable, comparing to the local one on my M2 MacBook Air.
Here is the database with publicly available SQL endpoint: https://play.clickhouse.com/play?user=play#U0VMRUNUICogRlJPT...
- SushiHippie 2 years agoNice, but seems to be last updated 2022-12-12 and funnily the IDs that don't exist have a time of 1970-01-01 00:00:00
- oliverobscure 2 years agoGreat visualisation. I was quite surprised that the submission dates and times appeared unimodal around an American morning peak.
- hawski 2 years agoRegarding database hosting, if you would consider giving the data away, I would suggest converting it to an SQLite database and sharing it over Torrent.
- rahimnathwani 2 years agoI'm guessing OP wants to share a database that's always up to date.
A torrent containing a single sqlite file would be good for a snapshot in time, but each update would require a new torrent, even if it only contains the updates since the base or last release.
IIRC IPFS can be used to distribute files that change over time, with only the changes being transferred, although of course there would need to be a place where OP publishes the hash of the most recent file.
In either case, someone would need to seed the file to guarantee it's always available.
- xnx 2 years agoI second this. You've done a great service to collect this data. I'm guessing the file must be much smaller than 20GB when compressed.
- zX41ZdbW 2 years agoI also did an experiment by generating and searching embeddings for all the comments on HN. Here is the walkthrough: https://www.youtube.com/watch?v=hGRNcftpqAk
- zX41ZdbW 2 years agoIt is only around 5 GB in ClickHouse. Details: https://github.com/ClickHouse/ClickHouse/issues/29693
- nvy 2 years agoNeat idea, thanks for sharing.
Curious choice to highlight Show HNs that didn't survive, but not the ones that did.
Is there a reason for this?
- malfist 2 years agoSame, I read the article twice in case I missed it, but no, nothing about the ones that did survive, even on the "more data" section.
- gnicholas 2 years ago> Send me your interesting queries
I'd be interested to see what the top Show HN posts were, after adjusting for the growing size of the HN community. That is, posts from 10 years ago would not have garnered as many upvotes simply because the community was smaller, and presumably posts were upvoted less back then, in general.
I don't know the best way to measure this; it could be normed based on the median number of upvotes for the top story each week, bucketed by month. Probably someone has a better idea for this.
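One possible reading of that normalization, sketched in Python: take the top score of each week, use the median of those weekly tops within a month as that month's baseline, and divide every story's score by the baseline of its month. The story titles and numbers below are invented, and the function name is an assumption.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def normalized_scores(stories):
    """stories: list of (title, day, score) tuples with score >= 1.
    Returns {title: score / baseline of the story's month}."""
    weekly_top = defaultdict(int)  # (year, month, ISO week) -> best score
    for _, day, score in stories:
        key = (day.year, day.month, day.isocalendar()[1])
        weekly_top[key] = max(weekly_top[key], score)

    tops_by_month = defaultdict(list)
    for (year, month, _), top in weekly_top.items():
        tops_by_month[(year, month)].append(top)
    baseline = {m: median(tops) for m, tops in tops_by_month.items()}

    return {title: score / baseline[(day.year, day.month)]
            for title, day, score in stories}


# Invented example: a 2013 story with fewer raw points than a 2023 one.
sample = [
    ("old hit", date(2013, 3, 5), 200),
    ("old other", date(2013, 3, 12), 100),
    ("new hit", date(2023, 3, 7), 600),
    ("new other", date(2023, 3, 14), 1200),
]
result = normalized_scores(sample)
for title, value in sorted(result.items(), key=lambda kv: -kv[1]):
    print(f"{title}: {value:.2f}")
```

With these made-up numbers the 2013 story scores about 1.33 times its month's baseline and the 2023 story about 0.67, so the older post ranks higher after normalization even though it has a third of the raw points.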
- AndrewKemendo 2 years agoI am also, along with gkoberger, happy to say that we didn't die after our Show HN (Show HN: A Covid-19 testing location site that a group of us are building)
https://news.ycombinator.com/item?id=22650725
In fact we were so successful that we were able to shut it down less than a year after we started (It's on the list as a very reasonable Type II error ;))
Thanks to the HN community for helping us get an amazing Temporary product out and shut down successfully
- elaus 2 years agoRecently I was browsing through old threads where users showed off their personal websites and blogs. I wanted to find some inspiration for my own website.
What I found instead were about 3/4 dead links – even though the threads were all from the last 4-5 years. I found that quite sad, because people often talked with great passion about their websites and they sounded really cool. Also I LOVE those small, personal islands in the big, commercialized and in many ways centralized web.
- manuelmoreale 2 years agoSadly that is nothing new. I used to run a website gallery and the rate of link rot is incredibly high.
Same is true for another couple of projects I’m running now. I’m collecting personal websites and quirky small web experiments and the same is happening there.
Somewhat related is the phenomenon of dead blogs. Plenty of those with a couple of interesting posts and then abandoned.
- qwytw 2 years ago> So I’m looking for a sponsor to host the database publicly. I need one mediocre VM for a Rails stack app and a semi-powerful hosted Postgres instance. Contact me if you’re interested
The Oracle Cloud Free tier is a great deal. They give you 4 Ampere A1 Cores + 24 GB RAM + 200GB storage for free. More than enough for a 20G (40M rows plus indices) Postgres instance.
- gsatic 2 years agoIs there a way to see how long a link stays on the HN front page on average, and whether that average is rising or falling over time? I read that the average time a Twitter hashtag spends on the Twitter trending page has been falling year over year, indicating that people are paying less attention to any one thing.
- billllll 2 years agoI'd love to get some correlation with rank, or even filtering of lower scoring posts.
From what I know, HN posts are often used as a signal for viability of a project. In that case, you can't make a conclusion on the effectiveness of Show HN posts, because some of them will die off by design.
- TomNomNom 2 years agoJust a silly aside with regards to the regex to extract domains from URLs, my little tool called unfurl [0] exists to solve that exact sort of problem :)
- opello 2 years agobagder (of curl) also made trurl to address URL manipulation:
- smallerfish 2 years agoPhind (#2 on your list) is still up and running also (https://www.phind.com/search?q=false%20negative&source=searc...).
- CryptoBanker 2 years agoHow do you have 40mm rows of data on Show HN for only ~126,000 stories?
- SushiHippie 2 years agoComments and the stories that are not "SHOW HN".
From TFA:
> For this analysis, I considered submissions made before May 31, 2023, 23:59 UTC. The dataset consists of 4,714,023 stories and 30,363,533 comments from 867,097 users.
- welder 2 years agoMy Show HN from 2013 is still alive but it's listed as dead (#590). Probably because the link from the post uses https but my 301 redirect only works using http.
- littlestymaar 2 years agoOh, Airmash is dead. I remember seeing it on HN then spending half of my workday that day playing it.
- gadgetoid 2 years agoThe community revived it to https://airmash.online/ pretty sharpish, does this count as dead?
- ryry 2 years agoThis is neat! One of my sites is on this list - I'm gonna have to put up a 418 on it as well.
- coding123 2 years agoIs this why HN was so slow yesterday?
- firecall 2 years agoVery cool!
Did you have any conclusions?
I had a look at the page, couldn't see anything you'd written up :-)
- jedberg 2 years agoWhat is the timezone for the heat maps? I assume UTC but wanted to check.
- ravenstine 2 years agoYou're telling me substack.com doesn't even make the top 100?
- gnicholas 2 years agoIf you're referring to the domains, it's by submission count. Presumably only one Show HN was linked to substack.com.
- ravenstine 2 years agoIt's just strange to me because medium.com comes up as #4, but in recent years Substack links get posted very often.
- gnicholas 2 years agoPeople post Show HNs and link to substack.com? I guess I don't understand why medium.com would show up either, but I can't recall seeing a substack link for a Show HN.
- fergbrain 2 years agoWhat about low ranking Show HN that did survive?
- tagawa 2 years agoWhat timezone is used for the submission heatmap?
- trewqasdf 2 years agoThe pandemic really got the activity going during 2020 (first bar chart), but maybe that's not so surprising with everyone pivoting to remote work. And obviously all the discussions about vaccines and how different governments were handling things.
- _andrei_ 2 years agoPhind, the 2nd entry, is alive and well.