Track HN: Survival Rate of Show HN Stories

230 points by namiwang 2 years ago | 73 comments
  • gkoberger 2 years ago
    I'm happy to say that the reports of my death here are greatly exaggerated :)

    I'm the owner of both #4 and #140 on the Top-scoring Show HN Stories that Didn’t Survive... but both are very much alive!

    #4 StackSort was a Github.com page, but in 2021 they made it so only Github.io works. If dang sees this, I'd really appreciate it if you could change the URL for https://news.ycombinator.com/item?id=5395463 to use github.io!

    #140 ReadMe has the same io/com issue, in the opposite direction! We redirect readme.io to readme.com now, which seems to be why it's flagged.

    • 12907835202 2 years ago
      How on earth did you get readme.com?

      I'm assuming someone else owned it, whenever I see that and all the "make an offer" links I move on and ignore it. Was the process easy?

      • gkoberger 2 years ago
        It wasn't called ReadMe originally... I happened to be browsing HN, and came across a post with someone offering readme.io for free, and I was like "oh that's a great name!" (I ultimately paid $3k as a thank you)

        https://news.ycombinator.com/item?id=6397526

        For the first few years, we used readme.io as our domain. When we did our Series A, I finally bought the .com for $170k. By that point I knew we were successful, and I figured the longer I waited the more it'd cost.

        • 12907835202 2 years ago
          Wow, I can't fathom the .com being worth that much. Did you manage to do any sort of math on whether the .com has helped bring in $170k of sales? Or how many years it would take to break even?
        • Lerc 2 years ago
          Someone has to be the first one.

          I once registered web.site. I had it for a week or two before I tried pointing it at a server at which time someone noticed and took it off me.

          • nl 2 years ago
            > at which time someone noticed and took it off me.

            This... isn't how domain registration works.

            • bottled_poe 2 years ago
              “Someone”…?
          • codetrotter 2 years ago
            The main page https://gkoberger.github.io/ that the https://gkoberger.github.com/ link suggests going to gives a 404 as well. Could be a good idea to add a main page for https://gkoberger.github.io/ that links to the StackSort page and anything else.
            • gkoberger 2 years ago
              Good call! I'll fix that up today.
            • echelon 2 years ago
              If we're going by vote count, my "show HN" should be #30, but it's not on the list at all.
          • reaperman 2 years ago
            > Extra: ChatGPT Gave a Wrong Regex
            > I consulted ChatGPT for a regex to extract domains from urls, and it gave a flawed one:

            ^(?:https?:\/\/)?(?:[^@\n]+@)?(?:www\.)?([^:\/\n?]+).

            It even gave reasonable detailed explanations which convinced me. Later tests revealed that this regex doesn’t work for url with @ in path, such as https://foo.com/@./bar. The correct one should be

            ^(?:https?:\/\/)?(?:[^@\/\n]+@)?(?:www\.)?([^:\/?\n]+).

            ---------------------

            The trick is to ask ChatGPT what the right tool for the job is in your language of choice. For Python, ChatGPT will happily give you:

              from urllib.parse import urlparse
              extract_domain = lambda url: urlparse(url).netloc.replace('www.', '', 1)
              # Example usage
              url = 'https://foo.com/@./bar'
              domain = extract_domain(url)
              print(domain)  # Output: foo.com
            
            -------------

            I don't think RegEx is typically the "most" correct tool for the job for things which likely have built-in parser libraries (XML, HTML, URLs, JSON, etc.).
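
            To make the difference concrete, here is a minimal sketch that runs both quoted patterns and a standard-library parser against the URL from the article. The trailing "." after each quoted pattern is assumed to be sentence punctuation and is dropped here, and `domain_via_urlparse` is a hypothetical helper name. It reads `hostname` rather than `netloc` so that userinfo and port are excluded, and it strips `www.` only when it is a prefix.

```python
import re
from urllib.parse import urlparse

# The two patterns quoted above, minus the trailing "." (assumed to be punctuation).
FLAWED = re.compile(r"^(?:https?:\/\/)?(?:[^@\n]+@)?(?:www\.)?([^:\/\n?]+)")
FIXED = re.compile(r"^(?:https?:\/\/)?(?:[^@\/\n]+@)?(?:www\.)?([^:\/?\n]+)")


def domain_via_urlparse(url: str) -> str:
    """Return the host of `url` without a leading "www." (hypothetical helper)."""
    host = urlparse(url).hostname or ""
    # Strip "www." only as a prefix, never in the middle of the host.
    return host[4:] if host.startswith("www.") else host


url = "https://foo.com/@./bar"
flawed_result = FLAWED.match(url).group(1)  # the "@" in the path is read as userinfo
fixed_result = FIXED.match(url).group(1)
parsed_result = domain_via_urlparse(url)
print(flawed_result, fixed_result, parsed_result)
```

            The flawed pattern treats everything up to the "@" in the path as userinfo, so it captures the wrong piece of the URL, while the corrected pattern and the parser agree on foo.com.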

            • folli 2 years ago
              Nice work!

              I'd actually be interested in factors that make a Show HN a success vs failure.

              Objectively, there's an obvious one in your dataset: time of submission. Tuesday afternoon (which timezone? I assume US west coast?) seems to be key. No way this correlates with the quality of submissions.

              Subjectively: it seems to have become much harder recently. I managed once, a couple of years ago, to reach the front page for a short time with an Android app; now I'm barely able to get above 20 points, even though the product is (again, subjectively) cooler and has a possibly wider audience (https://news.ycombinator.com/item?id=35671245).

              Not complaining, but perhaps nowadays Show HN is not an easy way anymore to "get the word out" and get some early user feedback for and from indie hackers? Any other sites that might be of interest?

              • OJFord 2 years ago
                Its badge on a product's home page is to me a negative signal, but partly since it does still happen (quite a lot) - people do seem to use ProductHunt.

                (I suppose I'd use it - and pretty much anything - but just not put 'omg #1' badge on my site, if I had something to launch myself.)

                Completely tangential now, but I think its problem is right in the title - who is hunting a product? It's a complete echo chamber, surely nobody who doesn't have something to launch is actively using it - 'it's Wednesday so I need a new Gmail-integrating Jira spline reticulator'.

                • noncoml 2 years ago
                  > which timezone?

                  I’m wondering the same. Earlier in the article he mentions UTC.

                  So it’s either afternoon or early in the morning Pacific time.

                • jumploops 2 years ago
                  No affiliation, but the second to top deceased site is still alive and kicking [0]

                  Spot checking the top results might give a better estimate for how many are actually alive vs. just using bot protection.

                  [0]https://news.ycombinator.com/item?id=35543668

                  • bagels 2 years ago
                    It just errors out right now. How can we differentiate: always errors out vs dead?
                    • gkoberger 2 years ago
                      Vercel (and AWS) are down right now, hence the error.
                    • sentrysapper 2 years ago
                      https://harvestsignal.com/ is also still alive, but the site certificate expired.
                    • david_shaw 2 years ago
                      Thanks for making and sharing this - although I'm surprised it's not a "Show HN" itself!

                      I was curious about the top post that didn't survive - an HTML5 game called "airma.sh" - and I wanted to check it out. I think I found a working mirror: https://www.crazygames.com/game/airmash

                      It's possible that this is a different game, but it seems to fit the description.

                      Interestingly, the person who submitted that post stopped being active on HN after that discussion.

                    • karaterobot 2 years ago
                      I know you mention there are lots of reasons for false positives and negatives, but does your methodology account for length of time at all? Meaning, if a project was posted to HN in 2009, it could have been successful for 14 years and then closed down, or just changed URLs somewhere along the way, and in that case it would be counted as a failure even though it wasn't. Likewise, if it was posted in May, 2023 and is still around, that doesn't mean much because it's still flying the Grand Opening banner, practically.
                      • h0l0cube 2 years ago
                        Exactly. Some of these graphs are really flawed. Like the heatmap for the top 1% which pretty much mirrors the submission heatmap. I want to see what portion of submissions for that time slot reached 1%, not of all submissions. There could be time slots that perform exceedingly well outside of popular times.
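
                        A rough sketch of the per-slot normalization described here, assuming each post is reduced to a (weekday, hour, score) tuple and that "top 1%" is expressed as a score threshold. The function name, the threshold, and the sample numbers are made up for illustration and are not taken from the article's dataset.

```python
from collections import Counter


def top_share_by_slot(posts, threshold):
    """Map each (weekday, hour) slot to the fraction of its posts scoring >= threshold."""
    totals = Counter()
    hits = Counter()
    for weekday, hour, score in posts:
        slot = (weekday, hour)
        totals[slot] += 1
        if score >= threshold:
            hits[slot] += 1
    # Divide by the slot's own submission count, not by all submissions.
    return {slot: hits[slot] / totals[slot] for slot in totals}


# Invented data: a busy slot with 2 hits out of 100, a quiet slot with 1 hit out of 10.
busy = [(1, 17, 300)] * 2 + [(1, 17, 5)] * 98
quiet = [(6, 3, 300)] * 1 + [(6, 3, 5)] * 9
rates = top_share_by_slot(busy + quiet, threshold=200)
print(rates)
```

                        In this invented sample the busy slot has more hits in absolute terms, but the quiet slot has the higher hit rate, which is the effect a raw-count heatmap would hide.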
                      • Semaphor 2 years ago
                        The top 250 has 8 dead projects from 2023. Of those 8, 5 are not dead at all, 1 is alive but has an expired certificate and only 2 (the lowest ranked) are dead. This does not seem like useful data.
                        • actuallyalys 2 years ago
                          That's definitely a red flag, although I'd expect the 2023 data to have a disproportionate number of false negatives relative to true negatives (since the vast majority of 2023 projects are still alive).
                        • gadgetoid 2 years ago
                          Airmash still lives at https://airmash.online/ and there’s also a space mod - Starmash - at https://airmash.cc/

                          I apologise in advance for the hours you’ll lose to these (again?)

                          • zX41ZdbW 2 years ago
                            > Looking for a Sponsor to Host the Database Publicly
                            > In the meantime, it’d be great if anyone can query the database. I tried to host a public database and real-time query interface online, but couldn’t afford the bill for a smooth Postgres instance to hold around 20G (40M rows plus indices) data. While a $20 instance could suffice, it’s pretty slow from usable, comparing to the local one on my M2 MacBook Air.

                            Here is the database with publicly available SQL endpoint: https://play.clickhouse.com/play?user=play#U0VMRUNUICogRlJPT...

                            • SushiHippie 2 years ago
                              Nice, but seems to be last updated 2022-12-12 and funnily the IDs that don't exist have a time of 1970-01-01 00:00:00
                            • oliverobscure 2 years ago
                              Great visualisation. I was quite surprised that the submission dates and times appeared unimodal around an American morning peak.
                              • oezi 2 years ago
                                Using a stacked barchart for dead vs alive isn't a great choice in my mind. Normalize to 100% please.
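
                                The normalization being asked for could look roughly like this, assuming the chart's input is a pair of alive/dead counts per group. The function name and the sample counts are hypothetical.

```python
def to_percent_shares(counts):
    """Convert {label: (alive, dead)} into {label: (alive_pct, dead_pct)}."""
    shares = {}
    for label, (alive, dead) in counts.items():
        total = alive + dead
        if total == 0:
            continue  # nothing to normalize for an empty group
        shares[label] = (100 * alive / total, 100 * dead / total)
    return shares


# Invented counts: very different volumes, directly comparable once normalized.
shares = to_percent_shares({"2012": (30, 70), "2022": (900, 100), "empty": (0, 0)})
print(shares)
```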
                                • _dain_ 2 years ago
                                  n=1, but I know at least one non-American who has stayed up late so that the submission coincides with this peak time.
                                • hawski 2 years ago
                                  Regarding database hosting, if you would consider giving the data away, I would suggest converting it to an SQLite database and sharing it over Torrent.
                                  • rahimnathwani 2 years ago
                                    I'm guessing OP wants to share a database that's always up to date.

                                    A torrent containing a single sqlite file would be good for a snapshot in time, but each update would require a new torrent, even if it only contains the updates since the base or last release.

                                    IIRC IPFS can be used to distribute files that change over time, with only the changes being transferred, although of course there would need to be a place where OP publishes the hash of the most recent file.

                                    In either case, someone would need to seed the file to guarantee it's always available.

                                    • xnx 2 years ago
                                      I second this. You've done a great service to collect this data. I'm guessing the file must be much smaller than 20GB when compressed.
                                  • nvy 2 years ago
                                    Neat idea, thanks for sharing.

                                    Curious choice to highlight Show HNs that didn't survive, but not the ones that did.

                                    Is there a reason for this?

                                    • malfist 2 years ago
                                      Same, I read the article twice in case I missed it, but no, nothing about the ones that did survive, even on the "more data" section.
                                    • gnicholas 2 years ago
                                      > Send me your interesting queries

                                      I'd be interested to see what the top Show HN posts were, after adjusting for the growing size of the HN community. That is, posts from 10 years ago would not have garnered as many upvotes simply because the community was smaller, and presumably posts were upvoted less back then, in general.

                                      I don't know the best way to measure this; it could be normed based on the median number of upvotes for the top story each week, bucketed by month. Probably someone has a better idea for this.
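
                                      One way the idea above could be sketched, assuming posts are available as (date, score) pairs: take the top score in each ISO week, use the median of those weekly tops as the baseline for the month, and divide every score by its month's baseline. The function names and sample data are hypothetical, and a week that straddles two months is simply counted in both.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def monthly_baselines(posts):
    """Baseline per (year, month): the median of that month's weekly top scores."""
    weekly_top = defaultdict(int)
    for day, score in posts:
        week = day.isocalendar()[1]
        key = (day.year, day.month, week)
        weekly_top[key] = max(weekly_top[key], score)
    by_month = defaultdict(list)
    for (year, month, _week), top in weekly_top.items():
        by_month[(year, month)].append(top)
    return {month: median(tops) for month, tops in by_month.items()}


def normalized(posts):
    """Divide each score by its month's baseline (assumes scores are at least 1)."""
    baselines = monthly_baselines(posts)
    return [(day, score / baselines[(day.year, day.month)]) for day, score in posts]


# Invented data: two weeks in May 2013 and two weeks in May 2023.
sample = [
    (date(2013, 5, 6), 40), (date(2013, 5, 7), 100), (date(2013, 5, 14), 60),
    (date(2023, 5, 8), 300), (date(2023, 5, 9), 150), (date(2023, 5, 16), 500),
]
baselines = monthly_baselines(sample)
adjusted = normalized(sample)
print(baselines, adjusted)
```

                                      With these invented numbers, a 100-point post from 2013 and a 500-point post from 2023 end up with the same adjusted score.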

                                      • AndrewKemendo 2 years ago
                                        I am also, along with gkoberger, happy to say that we didn't die after our Show HN (Show HN: A Covid-19 testing location site that a group of us are building)

                                        https://news.ycombinator.com/item?id=22650725

                                        In fact we were so successful that we were able to shut it down less than a year after we started (It's on the list as a very reasonable Type II error ;))

                                        Thanks to the HN community for helping us get an amazing temporary product out and shut down successfully.

                                        • elaus 2 years ago
                                          Recently I was browsing through old threads where users showed off their personal websites and blogs. I wanted to find some inspiration for my own website.

                                          What I found instead were about 3/4 dead links – even though the threads were all from the last 4-5 years. I found that quite sad, because people often talked with great passion about their websites and they sounded really cool. Also I LOVE those small, personal islands in the big, commercialized and in many ways centralized web.

                                          • manuelmoreale 2 years ago
                                            Sadly that is nothing new. I used to run a website gallery and the rate of link rot is incredibly high.

                                            Same is true for another couple of projects I’m running now. I’m collecting personal websites and quirky small web experiments and the same is happening there.

                                            Somewhat related is the phenomenon of dead blogs. Plenty of those with a couple of interesting posts and then abandoned.

                                          • qwytw 2 years ago
                                            > So I’m looking for a sponsor to host the database publicly. I need one mediocre VM for a Rails stack app and a semi-powerful hosted Postgres instance. Contact me if you’re interested

                                            The Oracle Cloud Free tier is a great deal. They give you 4 Ampere A1 Cores + 24 GB RAM + 200GB storage for free. More than enough for a 20G (40M rows plus indices) Postgres instance.

                                            • gsatic 2 years ago
                                              Is there a way to see how long a link stays on the HN front page on average, and if that average is rising or falling over time? I read that the average time spent by a Twitter hashtag on the Twitter trending page has been falling year over year, indicating people are paying less attention to any one thing.
                                              • billllll 2 years ago
                                                I'd love to get some correlation with rank, or even filtering of lower scoring posts.

                                                From what I know, HN posts are often used as a signal for viability of a project. In that case, you can't draw a conclusion on the effectiveness of Show HN posts, because some of them will die off by design.

                                                • TomNomNom 2 years ago
                                                  Just a silly aside with regards to the regex to extract domains from URLs, my little tool called unfurl [0] exists to solve that exact sort of problem :)

                                                  [0]https://github.com/tomnomnom/unfurl

                                                • smallerfish 2 years ago
                                                  Phind (#2 on your list) is still up and running also (https://www.phind.com/search?q=false%20negative&source=searc...).
                                                  • CryptoBanker 2 years ago
                                                    How do you have 40mm rows of data on Show HN for only ~126,000 stories?
                                                    • SushiHippie 2 years ago
                                                      Comments and the stories that are not "SHOW HN".

                                                      From TFA:

                                                      > For this analyze, I considered submissions made before May 31, 2023, 23:59 UTC. The dataset consists of 4,714,023 stories and 30,363,533 comments from 867,097 users.

                                                    • welder 2 years ago
                                                      My Show HN from 2013 is still alive but it's listed as dead (#590). Probably because the link from the post uses https but my 301 redirect only works using http.
                                                        • littlestymaar 2 years ago
                                                          Oh, Airmash is dead. I remember seeing it on HN then spending half of my workday that day playing it.
                                                        • ryry 2 years ago
                                                          This is neat! One of my sites is on this list - I'm gonna have to put up a 418 on it as well.
                                                          • coding123 2 years ago
                                                            Is this why HN was so slow yesterday?
                                                            • firecall 2 years ago
                                                              Very cool!

                                                              Did you have any conclusions?

                                                              I had a look at the page, couldn't see anything you'd written up :-)

                                                              • jedberg 2 years ago
                                                                What is the timezone for the heat maps? I assume UTC but wanted to check.
                                                                • ravenstine 2 years ago
                                                                  You're telling me substack.com doesn't even make the top 100?
                                                                  • gnicholas 2 years ago
                                                                    If you're referring to the domains, it's by submission count. Presumably only one Show HN was linked to substack.com.
                                                                    • ravenstine 2 years ago
                                                                      It's just strange to me because medium.com comes up as #4, but in recent years Substack links get posted very often.
                                                                      • gnicholas 2 years ago
                                                                        People post Show HNs and link to substack.com? I guess I don't understand why medium.com would show up either, but I can't recall seeing a substack link for a Show HN.
                                                                  • fergbrain 2 years ago
                                                                    What about low ranking Show HN that did survive?
                                                                    • tagawa 2 years ago
                                                                      What timezone is used for the submission heatmap?
                                                                      • trewqasdf 2 years ago
                                                                        The pandemic really got the activity going during 2020 (first bar chart), but maybe not so surprising with everyone pivoting to remote work. And obviously all the discussions about vaccines and how different governments were handling things.
                                                                        • _andrei_ 2 years ago
                                                                          Phind, the 2nd entry, is live and well.