Subdomain.center – discover all subdomains for a domain
385 points by adam_gyroscope 1 year ago | 124 comments

- gnyman 1 year ago You cannot hide anything on the internet anymore: the full IPv4 range is scanned regularly by multiple entities. If you open a port on a public IP, it will get found.
If it's an obscure non-standard port it might take longer, but anything on a standard port will get probed very quickly and included in tools like shodan.io
The reason I'm repeating this is that not everyone knows it. People still (albeit less often) put up Elasticsearch and MongoDB instances with no authentication on public IPs.
The second thing which isn't well known is the Certificate Transparency logs. This is the reason why you can't hide any HTTPS service (without a wildcard cert). When you ask Let's Encrypt (or any CA, actually) to issue a certificate for veryobscure.domain.tld, they will send it to the Certificate Transparency logs. You can find every certificate ever issued for a domain with a tool like https://crt.sh
There are many tools like subdomain.center; https://hackertarget.com/find-dns-host-records/ comes to mind. The most impressive one I've seen, which found much more than expected, is Detectify (a paid service, no affiliation); they seem to combine passive data collection (like subdomain.center) with active brute-forcing to find even more subdomains.
But you can probably get 95% of the way there by using CT and a brute-force tool like https://github.com/aboul3la/Sublist3r
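A minimal sketch of the CT part, assuming crt.sh's public JSON endpoint (example.com and the parsing details are illustrative, not the site's actual pipeline):

    import json
    import urllib.request

    def ct_subdomains(domain):
        # crt.sh CT-log search; %25 is a URL-encoded "%" wildcard
        url = "https://crt.sh/?q=%25." + domain + "&output=json"
        with urllib.request.urlopen(url, timeout=30) as resp:
            entries = json.load(resp)
        names = set()
        for entry in entries:
            # name_value may contain several newline-separated SAN entries
            for name in entry["name_value"].split("\n"):
                if not name.startswith("*."):
                    names.add(name.lower())
        return names

    print(sorted(ct_subdomains("example.com")))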
- fuzzy2 1 year ago The Certificate Transparency log is very important. I recently spun up a service with HTTPS certs from Let's Encrypt. By coincidence I was watching the logs. Within just 80 seconds of the certificate being issued I could see the first automated "attacks".
If you get a certificate, be ready for the consequences.
- mixdup 1 year ago Were these automated "attacks" hitting you by hostname or IP? Because there's a chance you would've been getting them regardless, just from people scanning the entire IPv4 space.
- fuzzy2 1 year ago They would not have been reverse proxied to the Docker container without the hostname.
- tikkabhuna 1 year ago This is really interesting. For my homelab I've been playing around with using Let's Encrypt rather than spinning up my own CA. "What's the worst that could happen?"
Guess I'll be looking to spin up my own CA now!
- dspillett 1 year ago Getting a wildcard certificate from LE might be a better option, depending on how easy the extra bit of plumbing is with your lab setup.
You need to use DNS-based domain validation, and once you have a cert, distribute it to all your services. The former can be automated using various common tools (look at https://github.com/joohoi/acme-dns; self-host it, unless you are only securing toys you don't really care about, if you self-host DNS or your registrar doesn't have useful API access), or you can leave it as a manual job every ~ten weeks. The latter involves scripts to update your various services when a new certificate is available (either pushing from where you receive the certificate, or picking it up from elsewhere). I have a little VM that holds the couple of wildcard certificates (renewing them via DNS-01 and acme-dns on a separate machine, so this one is impossible to see from the outside world); it pushes the new key and certificate out to the other hosts (simple SSH to copy over, then restart nginx/Apache/other).
Of course you may decide that signing with your own CA is easier than setting all this up, as you can sign long-lived certificates for yourself. I prefer this because I don't need to switch to something else if I decide to give friends/others access to something.
Your top level (sub)domain for the wildcard is still in the transparency logs of course, but nothing under it is.
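A rough sketch of that renew-and-push loop, assuming certbot's default /etc/letsencrypt layout; the host names and target paths are placeholders:

    import subprocess

    HOSTS = ["web1.lan", "web2.lan"]            # placeholder internal hosts
    LIVE = "/etc/letsencrypt/live/example.com"  # certbot's default layout

    # Renew the wildcard cert; DNS-01 is handled by whichever
    # plugin/hook (e.g. acme-dns) certbot was configured with.
    subprocess.run(["certbot", "renew", "--quiet"], check=True)

    # Push key + chain to each host over SSH, then reload the web server.
    for host in HOSTS:
        for f in ("fullchain.pem", "privkey.pem"):
            subprocess.run(["scp", f"{LIVE}/{f}", f"root@{host}:/etc/nginx/certs/"], check=True)
        subprocess.run(["ssh", f"root@{host}", "systemctl reload nginx"], check=True)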
- rbut 1 year ago If you're homelab'ing then you should be using private IPs to host your services anyway. Don't put them on a public IP unless you absolutely have to (e.g. port 25 for mail).
Use your internal DNS server (e.g. your router's) for DNS entries for each service. Or, if you wish, you can put them in public DNS too, e.g. gitlab.myhome.com A 192.168.33.11
You can then access your services over an always-on VPN like WireGuard when you're away from home.
Then it doesn't matter if anyone knows what subdomains you have, they can't access them anyway.
- KronisLV 1 year ago > Guess I'll be looking to spin up my own CA now!
I was looking for a lazy/easy way to do this manually and settled on KeyStore Explorer, a GUI tool that lets you work with various keystores and do everything from making your own CA to signing and exporting certificates in various formats: https://github.com/kaikramer/keystore-explorer (to me it feels easier than working with OpenSSL directly, provided I trust the tool)
In addition, I set up mTLS or even basic auth at the web server (reverse proxy) level for some of my sites, which seems to help that little bit more, given that some automated attacks might ignore TLS errors but won't be able to provide my client certs or the username/password. I also run fail2ban and mod_security, though that's more opinionated.
- oooyay 1 year ago I use a wildcard certificate for my home infrastructure. For all the talk of hiding, though, it's wise not to count on hiding behind a wildcard. Properly configure your firewalls and network policy. For the services you do have exposed, implement rate limiting and privileged access. I stuck most of my LE services behind Tailscale, so they get their certificates but aren't routable outside my Tailscale network.
- ahoka 1 year ago Didn't you read the original comment? It's just a matter of time until someone starts to poke your IPs. Your own CA will be harder to get right.
- teekert 1 year ago Can Tailscale MagicDNS + tunnel obscure things? Or only when you keep a service within the tailnet? (Still a + for selfhosters)
- implements 1 year ago Recently, I opened 80 and 443 so I could use Let's Encrypt's acme-client to get a certificate (and then test it). Tightening up security a bit, I configured an HTTP relay to filter out clients accessing port 80 by IP address rather than domain name; some scanners are still trying domain and sub-domain names I was using weeks ago, which goes to show how organised hackers are about attacking targets.
- BoberMod 1 year ago You can use the DNS-01 challenge [1] to get a certificate. You just need to add a temporary TXT record to your DNS. It also supports wildcard certificates.
Most popular DNS providers (like Cloudflare) have APIs, so it can be easily automated.
I'm using it in my local network: I have a publicly available domain for it (intranet.domain.com) and I don't want to expose my local services to the world just to issue a certificate trusted by the root CAs on all my devices. This method allows me to issue a valid Let's Encrypt wildcard cert (*.intranet.domain.com) for all my internal services without opening any ports to the world.
[1]: https://letsencrypt.org/docs/challenge-types/#dns-01-challen...
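A hedged sketch of the TXT-record step against Cloudflare's v4 API; the token and zone id are placeholders, and real ACME clients (certbot DNS plugins, acme.sh, etc.) handle this, including cleanup, for you:

    import requests

    TOKEN = "..."    # API token with DNS-edit rights (placeholder)
    ZONE_ID = "..."  # Cloudflare zone id (placeholder)

    def publish_challenge(domain, challenge_value):
        # Create the _acme-challenge TXT record the CA checks during DNS-01
        resp = requests.post(
            f"https://api.cloudflare.com/client/v4/zones/{ZONE_ID}/dns_records",
            headers={"Authorization": f"Bearer {TOKEN}"},
            json={
                "type": "TXT",
                "name": f"_acme-challenge.{domain}",
                "content": challenge_value,
                "ttl": 120,
            },
            timeout=30,
        )
        resp.raise_for_status()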
- intothemild 1 year ago Once you expose something long enough to get scanned, it's going to keep getting scanned pretty much forever.
I self-host a couple of web services, but none are open; you need strong authentication to get in.
It's not ideal; ideally I'd close the HTTPS web traffic and use some form of VPN to get in. But sadly that's just not feasible in my use case. So strong auth it is.
- fragmede 1 year ago Not to underestimate the power of Shodan, and oh god don't spin up a default Mongo with no auth, but port knocking would seem to counteract this to enough of a degree, not to mention having a service only accessible via Tor.
https://wiki.archlinux.org/title/Port_knocking#:~:text=Port%....
- gnyman 1 year ago Yes, you can hide with a little bit of effort. Port knocking or Tor will stop almost anything (but don't rely on it as the sole protection, just as another layer).
I like to prefix anything "I don't want scraped" with a random prefix, like domain.com/kwo4sx_grafana/, and nobody will find it (as long as you don't link to it anywhere). I still keep auth enabled, but at least I don't have to worry about automated attacks exploiting it before I have time to patch.
Something as simple as moving SSH to a non-standard port reduces the amount of noise from automated scanners by 99% (made-up number, but a lot).
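For the port-knocking idea above, a toy client-side sketch; the host and ports are placeholders, and the server side would typically be knockd or firewall rules watching for the sequence:

    import socket

    TARGET = "server.example.com"        # placeholder
    KNOCK_SEQUENCE = [7000, 8000, 9000]  # placeholder ports agreed with the server

    # Fire a short-lived TCP SYN at each port in order; the server's firewall
    # watches for the sequence and only then opens the real service port.
    for port in KNOCK_SEQUENCE:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.settimeout(0.5)
        try:
            s.connect((TARGET, port))  # expected to fail; the SYN itself is the knock
        except OSError:
            pass
        finally:
            s.close()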
- Vorh 1 year ago Have you had any problems with browsers leaking the prefixed sites, as seen here?
- sgjohnson 1 year ago You don't even need "multiple entities". Absolutely anyone can do that. Scanning a single port across the entire IPv4 internet takes about 40 minutes.
- codethief 1 year ago > You cannot hide anything on the internet anymore, the full IPv4 range is scanned regularly by multiple entities. If you open a port on a public IP it will get found.
Sure, but you might still host multiple virtual hosts (e.g. subdomains) on the same web server. Unless an attacker knows their exact hostnames, they won't be able to access them.
- gustavus 1 year ago There are several easy ways to skirt that.
First, you can simply try brute-forcing subdomains; second, if you are using HTTPS you can simply pull the cert and look at the aliases (SANs) listed there. Two ways off the top of my head.
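The second technique is a few lines with Python's standard library; a small sketch (example.com is a placeholder):

    import socket
    import ssl

    def cert_sans(host, port=443):
        # Complete a TLS handshake and read the subjectAltName entries
        ctx = ssl.create_default_context()
        with socket.create_connection((host, port), timeout=10) as sock:
            with ctx.wrap_socket(sock, server_hostname=host) as tls:
                cert = tls.getpeercert()
        return [v for k, v in cert.get("subjectAltName", ()) if k == "DNS"]

    print(cert_sans("example.com"))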
- codethief 1 year ago Of course, but my point was that none of them involve IP scanning.
- tamimio 1 year ago > This is the reason why you can't (without a wildcard cert)
Guess being security conscious pays off: testing those on some domains I have, they only managed to show what I want to show, since the wildcard just masks the rest.
That being said, I don't think anyone should consider a subdomain hidden; it's an address, after all. Assume it's accessible, or put it behind a firewall or VPN with proper authentication. Security by obscurity never works.
- matheusmoreira 1 year ago > the full IPv4 range is scanned regularly by multiple entities
Single packet authorization. The server just drops any and all packets unless you first send it a cryptographically signed packet. To all these observers, it's as if the server isn't even there.
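A deliberately simplified sketch of the idea, assuming a shared HMAC key; real implementations (fwknop, for example) add encryption and proper replay protection:

    import hashlib
    import hmac
    import socket
    import subprocess
    import time

    SECRET = b"shared-secret"  # placeholder; real SPA uses per-client keys

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", 62201))

    while True:
        data, (src_ip, _) = sock.recvfrom(1024)
        ts, _, mac = data.partition(b"|")  # client sends "timestamp|HMAC(timestamp)"
        expected = hmac.new(SECRET, ts, hashlib.sha256).hexdigest().encode()
        try:
            fresh = abs(time.time() - float(ts.decode())) < 30
        except ValueError:
            continue
        if fresh and hmac.compare_digest(mac, expected):
            # Whitelist the sender's IP for SSH; everything else stays dropped
            subprocess.run(["iptables", "-I", "INPUT", "-s", src_ip,
                            "-p", "tcp", "--dport", "22", "-j", "ACCEPT"], check=True)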
- danielvaughn 1 year ago At my company we got bit by this several months ago. Luckily the database was either empty or only had testing data, but like you said, the port was exposed and someone found it.
- pid-1 1 year ago > full IPv4 range is scanned regularly by multiple entities.
Yet another good reason to use IPv6.
- gnyman 1 year ago IPv6 won't get found by brute force, but there are a few projects which try to gather IPv6 addresses by various means and scan them as they are found.
Shodan did (maybe still does) provide NTP servers to some NTP pools and scanned anyone who sent requests to them.
https://arstechnica.com/information-technology/2016/02/using...
So, as with everything, layer the defences; don't rely on your IPv6 address being secret as the only defence.
- banana_giraffe 1 year ago Cute, it managed to find 121,486 subdomains for amazonaws.com [1], and somehow I suspect that's a tiny fraction of what's in use.
https://gist.githubusercontent.com/Q726kbXuN/bf8a9a22b81fe65...
- Brananarchy 1 year ago As others have said, Certificate Transparency seems to be doing some heavy lifting here. It reports subdomains for me that have never had a public CNAME or A record, but have had Let's Encrypt certs issued for internal use.
It's also missing some that have not had certs issued, but that are in public DNS.
- TekMol 1 year ago That's why HTTPS is still a pain in the butt, 30 years after it was invented.
I don't want internally used subdomains to be public. Because of certificate transparency, the only way to achieve that is via wildcard certs.
Let's Encrypt only supports cumbersome validation methods for those, like changing DNS records every time you need to renew the cert.
Pretty annoying.
- proto_lambda 1 year ago If the subdomains aren't supposed to be public, the public also doesn't need to trust the TLS certs. Sign them with your own CA and trust it on the devices that should be able to access the domains.
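A minimal sketch of that with the pyca/cryptography package; the names and lifetimes are arbitrary choices, not recommendations:

    from datetime import datetime, timedelta

    from cryptography import x509
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import ec
    from cryptography.x509.oid import NameOID

    def make_ca():
        # Self-signed CA certificate, ten-year lifetime
        key = ec.generate_private_key(ec.SECP256R1())
        name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "Homelab CA")])
        cert = (
            x509.CertificateBuilder()
            .subject_name(name).issuer_name(name)
            .public_key(key.public_key())
            .serial_number(x509.random_serial_number())
            .not_valid_before(datetime.utcnow())
            .not_valid_after(datetime.utcnow() + timedelta(days=3650))
            .add_extension(x509.BasicConstraints(ca=True, path_length=None), critical=True)
            .sign(key, hashes.SHA256())
        )
        return key, cert

    def sign_leaf(ca_key, ca_cert, hostname):
        # Leaf cert for one internal hostname, signed by the CA above
        key = ec.generate_private_key(ec.SECP256R1())
        cert = (
            x509.CertificateBuilder()
            .subject_name(x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, hostname)]))
            .issuer_name(ca_cert.subject)
            .public_key(key.public_key())
            .serial_number(x509.random_serial_number())
            .not_valid_before(datetime.utcnow())
            .not_valid_after(datetime.utcnow() + timedelta(days=365))
            .add_extension(x509.SubjectAlternativeName([x509.DNSName(hostname)]), critical=False)
            .sign(ca_key, hashes.SHA256())
        )
        return key, cert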
- paranoidrobot 1 year ago Adding CAs to trust stores on devices and in apps is a major pain.
If you have unmanaged devices this becomes even more painful.
"Oh, hi, welcome to the company, please install this Root CA onto your machine to access <internal service>"
Because you can't scope CAs to specific domains, this causes everyone with any idea about security to start being concerned.
- Figs 1 year ago Non-public usage doesn't necessarily mean that only devices under your direct control need access. Slack needs access to some of my organization's systems, for example, to support the way we collaborate on our projects -- but the general public doesn't and would likely just be confused if they stumbled into one of our infrastructure subdomains instead of visiting our public website.
- withinboredom 1 year ago Yeah. In that case, it's just easier to get a really cheap wildcard cert signed by a low-cost reseller for <50 bucks. The only reason to care about big-name certs is compatibility with all the devices out there, but if you don't need compatibility, then get the cheapest thing you can.
- justsomehnguy 1 year ago > and trust it on the devices that should be able to access the domains
Sometimes that's not an option. I spent too many hours trying to figure out why some Android apps didn't want to talk to a service I self-hosted. They just ignored my root CA cert installed on the phone.
- speedgoose 1 year ago I think you are supposed to automate the renewal with the DNS record method.
- tjoff 1 year ago Which most registrars don't support.
- Hamuko 1 year ago I have a single wildcard certificate for my internal domain name and ~10 CNAMEs for various service subdomains on the network (plex.server.com, grafana.server.com, etc.). This tool found zero subdomains for my internal domain.
- rft 1 year ago I have a similar setup (*.home.domain.com, DNS auth with LE -> service1.home.domain.com etc.) for my personal, but externally reachable, domain, and I get the same results. I went the wildcard route out of a bit of paranoia; nice to see that it actually worked out in this case.
As this (I expect) heavily uses Certificate Transparency in the background, I want to point out another use case for that service. You can search the CT logs with wildcards to find your domain's "neighbors" on other TLDs: https://crt.sh/?Identity=google.%25&match=ILIKE This usually gives you somewhat more active websites than just checking whether you can register the domain, and somewhat weeds out squatted domains. I found that way that for our company, one TLD contained an NSFW games store.
- SushiHippie 1 year ago You should check again after some time. The first time I looked up my domain there were no results; a few minutes later it found some of my subdomains.
- Hamuko 1 year ago Still nothing.
- Symbiote 1 year ago At work we have a wildcard certificate for most services we host on our own infrastructure. Most public websites have been detected, and some internal ones which have probably been referenced in public GitHub issues and so on.
They've done simple reverse DNS lookups on our public IP range and indexed all those hostnames.
Certificate transparency logs have found names used for externally hosted websites.
There are some pretty old hostnames which haven't been used for 5 years or more, and were probably found with reverse DNS at the time.
- TheHappyOddish 1 year ago Hardly "all subdomains". Unless it's doing an AXFR of my zone file (unlikely), this isn't possible.
It's a scraper/guesser, using cert transparency, common names, etc. Cute toy, but a false claim.
- panki27 1 year ago You are correct, I've tested it with my own domains. It does not know the ones running with a wildcard certificate, for example.
- wlonkly 1 year ago It knows many of the wildcard-served customer subdomains of one of my former employers. (They're probably just scraped from search or something, but a wildcard is not sufficient to prevent discovery.)
- hankchinaski 1 year ago I would be keen to know what techniques are used. Usually subdomain discovery is done with a DNS AXFR transfer request, which leaks the entire DNS zone (but this only works on ancient and unpatched nameservers), or with dictionary attacks. There are some other techniques you can check if you look at the source code of amass (an open-source Go reconnaissance/security tool), or CT logs. DNSdumpster is one of the tools I used, alongside pentest tools (commercial) and amass (OSS).
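An AXFR attempt is short with dnspython; a sketch (any sane nameserver will refuse the transfer):

    import dns.query
    import dns.resolver
    import dns.zone

    def try_axfr(domain):
        # Ask each authoritative nameserver for a full zone transfer
        for ns in dns.resolver.resolve(domain, "NS"):
            ns_ip = str(dns.resolver.resolve(str(ns), "A")[0])
            try:
                zone = dns.zone.from_xfr(dns.query.xfr(ns_ip, domain))
                return sorted(f"{n}.{domain}" for n in zone.nodes if str(n) != "@")
            except Exception:
                continue  # refused or timed out, the usual case
        return []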
- cobertos 1 year ago I mean, doesn't it say right on the front page?
* Apache Nutch - So they're crawling either some part of the domain itself or some other websites to find subdomains. Honestly, it might help to query Common Crawl too.
* Calidog's Certstream - As you said, you can look at the CT logs
* OpenAI Embeddings - So I guess it also uses an LLM to try to generate candidates to test, too.
* Proprietary Tools - your guess is as good as mine
Probably a common list of subdomains to test against too.
Seems like multiple techniques to try to squeeze out as much info as possible.
- Zuiii 1 year ago I'd also add insecure DNSSEC implementations that allow you to "walk" the entire record chain for the domain.
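A sketch of that zone walk with dnspython, assuming the zone uses plain NSEC (not NSEC3); each NSEC record names the next entry in the zone:

    import dns.resolver

    def walk_nsec(domain, limit=100):
        # Follow the NSEC chain: each record points at the next name in the zone
        found, current = [], domain
        for _ in range(limit):
            try:
                answer = dns.resolver.resolve(current, "NSEC")
            except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN):
                break
            next_name = str(answer[0].next).rstrip(".")
            if next_name == domain or next_name in found:
                break  # chain wrapped back to the apex
            found.append(next_name)
            current = next_name
        return found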
- kevincox 1 year ago Calling this "insecure" is a bit harsh. This is required for offline signing, which provides better security but worse privacy.
- piffey 1 year ago Proprietary tools means passive DNS.
- smarx007 1 year ago How can one avoid their browsing ending up in passive DNS logs? For example, is using 1.1.1.1, 8.8.8.8, or 9.9.9.9 (CF, Google, and Quad9, respectively) good or bad in this regard?
For example, where does Spamhaus get their passive DNS data? They write [1] that it comes from "trusted third parties, including hosting companies, enterprises, and ISPs." But that's rather vague. Are CF, Google, and Quad9 some of those "hosting companies, enterprises, and ISPs"?
[1]: https://www.spamhaus.com/resource-center/what-is-passive-dns...
- derefr 1 year ago Interesting. Our domain has some subdomains with a numeric suffix, and the API response here has entries in that pattern not only for the particular subdomains that exist or ever existed, but also for subdomains of the same pattern beyond any suffix number we've ever actually used.
You'd think they'd at least filter their response by checking which subdomains actually have an A/AAAA/CNAME record on them...
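Such a filter is cheap to sketch with dnspython (the candidate names are hypothetical):

    import dns.resolver

    def resolves(name):
        # True if the name currently has an A, AAAA or CNAME record
        for rtype in ("A", "AAAA", "CNAME"):
            try:
                dns.resolver.resolve(name, rtype)
                return True
            except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN):
                continue
        return False

    candidates = ["app1.example.com", "app2.example.com"]  # hypothetical API output
    live = [n for n in candidates if resolves(n)]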
- blueflow 1 year ago I entered my own domains and got so many garbage entries. It feels like an AI reading Let's Encrypt logs and then adding made-up shit to it.
- internet2000 1 year ago For my personal domain: it got the ones in the SSL cert's subject alternative names, made up three, returned one I deleted more than a year ago, and didn't find two. Very curious.
- DaiPlusPlus 1 year ago Those SAN and CN names will appear in publicly visible Certificate Transparency lists ( https://en.wikipedia.org/wiki/Certificate_Transparency ): so if you ever get a TLS certificate for a super-seeekret internal sub-sub-sub-domain-name from a major CA, it won't be secret for long. The only way to keep a publicly-resolvable DNS subdomain confidential is to either get a wildcard cert for the parent domain, find a dodgy (yet somehow widely-trusted) CA that doesn't participate in CT, or use a self-signed cert.
This subdomain.center database returned one of my "private" sub-sub-domains (which just points to my NAS) for which I did get a cert from Let's Encrypt, but it doesn't have any of my other sub-sub-domains listed (despite them resolving to the same A IPv4 address as the listed subdomain), because those subdomains have only ever been secured by a wildcard cert.
- donatj 1 year ago Interesting. It only found less than a quarter of the subdomains of the site I work on, and everything it did find is public-facing. I wonder if that's maybe something to do with how we set up certificates for public vs internal subdomains? It even missed "staging.", which should be nearly identical in configuration to www.
- SushiHippie 1 year ago Note: if you looked up a domain and it had no results, check back again after some minutes. I looked my domain up and had zero results, which was weird as it should at least find some in the CT logs, but a few minutes later it showed some subdomains.
- LinuxBender 1 year ago It took about 5 minutes for me. It found my apex domain and a sub-domain that must have belonged to the previous renter of my domain name. [1] So I was curious, and it turns out the previous renter's pages were in the Wayback Machine. [2] That page renders as mostly little boxes for me. Funny, I had never bothered to check that. I should check if any of my other domains have snapshots from before I rented them.
[1] - https://api.subdomain.center/?domain=ohblog.net
[2] - https://web.archive.org/web/20090302094112/https://ohblog.ne...
- SushiHippie 1 year ago The Web Archive can also somewhat act as a subdomain finder (not really in this case, only the www subdomain, but still interesting): https://web.archive.org/web/*/ohblog.net*
- RockRobotRock 1 year ago This is certificate transparency doing most of the work, right?
- zootboy 1 year ago I would assume so. I tested it on one of my private domains that generally isn't linked to anywhere, and it just returned the few subdomains that I generate Let's Encrypt certs for, plus my nameservers.
Interestingly, I did not receive any DNS queries on my authoritative nameservers during the query, so they don't seem to be doing any active DNS probes.
- out-of-ideas 1 year ago it may utilize a few techniques, as there are subdomains I am aware of that've never been published other than in the zone config at my registrar that are returned from the API query
- pbhjpbhj 1 year ago I use SiteGround and it has a staging server that AFAIK hasn't been used for at least 6 years...
Nothing at the host has any details of it, archive.org doesn't have it in their site URLs, it's not in DNS records, not in .well-known; it was a transient test years ago... really curious, it must be historic data from somewhere?
- RockRobotRock 1 year ago I use Cloudflare for DNS and the only ones it found had LE certs. It's not doing a simple brute-force on common names, I don't think; otherwise it probably would have found a lot more. Curious about how it works.
- Arubis 1 year ago If this were able to determine which wildcard subdomains were active for a given domain, you could use it to figure out a lot of B2B companies' client/customer lists.
- Xorakios 1 year ago Just for giggles, does anyone else remember when "subdomains" were called "machine names", because physical devices were limited to one service?
www. ftp. mail.
... weren't theoretical or merely mnemonic.
Felt like an old coot when using "machine name" with a 40-year-old IT professional and she was perplexed!
- semi 1 year ago I'm a somewhat old coot and do remember those days, but I think the term still makes sense, just only in a LAN environment.
Machines still have hostnames, and home routers will often trust your DHCP clients' machine names.
So I can still look up steamdeck.lan and find the IP of my Steam Deck, and in that context calling it a machine name is perfectly apt and I think would still be well understood.
- p4bl0 1 year ago It gave me empty results for some of my domains that have multiple subdomains with TLS certificates associated with them, so those must appear in the Certificate Transparency logs.
I guess it should be "discover some subdomains for some domains".
- Semaphor 1 year ago Empty for all my and my work's domains. Then I tested random .com domains and got results. Seems pretty useless.
- pabs3 1 year ago More options here: https://wiki.archiveteam.org/index.php/Finding_subdomains
- sea-gold 1 year ago Thanks. This is a really helpful list which includes many of the sites/tools listed here.
- ohuf 1 year ago The subdomain explorer may be fun, but their Exploit Observer is really useful: https://www.exploit.observer/
- g147 1 year ago thanks!
- keepamovin 1 year ago This is fantastic!!!
What kind of security considerations are there to having multi-tenant user applications on subdomains and then having them exposed like this?
I'm building a SaaS right now, and I guess one thing is that a given username can then be discovered as a valid login for the system... but obviously that's only part of the login credential.
Maintaining a list of mappings to opaque subdomains seems to reduce targeting and conceal partial login credentials, but doesn't seem to offer much besides.
Analysis?
- thorum 1 year ago It doesn't seem to detect subdomains set up with Kubernetes ingresses, based on results for one of my domains, so that might be a place to start research.
- davidkuennen 1 year ago It also doesn't find any subdomains for my domain.
In my case I use Google Cloud DNS. Maybe they have some sort of protection in place (I wouldn't be surprised).
- cm2187 1 year ago One thing I noticed looking at my logs is that there is almost no unsolicited traffic (i.e. failed authentication attempts, exploits of various WordPress bugs, etc.) over IPv6. I think it's a function of 1) that traffic coming from networks (compromised home devices, etc.) that don't support v6, and 2) the v6 address space being too large to scan (the size of an encryption key), so good security by obscurity. This would nullify 2).
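Back-of-the-envelope, reusing the ~40-minute IPv4 sweep figure mentioned upthread:

    # ~40 minutes to sweep one port across all of IPv4 (figure from upthread)
    rate = 2**32 / (40 * 60)               # ~1.8 million probes per second
    years = 2**128 / rate / (3600 * 24 * 365)
    print(f"{years:.1e}")                  # ~6.0e24 years to sweep all of IPv6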
- weird-eye-issue 1 year ago I got back an empty list for my domain on Cloudflare with several subdomains (non-wildcard)
edit: I retried on my computer (was on my phone earlier) and now it returns all of our subdomains, even picking up our test R2 bucket. I'm guessing I was rate-limited because I accidentally loaded the example file a few times.
- hbcondo714 1 year ago Wolfram Alpha is similar yet still useful: just enter a domain and click on the "Subdomains" button.
- franky47 1 year ago Sublist3r [1] does a similar job, as long as you have the authorisation to use it on a particular domain, as it uses more aggressive discovery techniques.
[1]: https://github.com/aboul3la/Sublist3r
- asmor 1 year ago https://github.com/projectdiscovery/subfinder does this, but it explains all the methods and lets you choose to only do a passive scan.
- johntiger1 1 year ago Took a while, but I was impressed it detected all of ours: https://api.subdomain.center/?domain=radiantai.health
- DistractionRect 1 year ago Certificate transparency does a lot of the heavy lifting.
- Semaphor 1 year ago Only that actually works. I get hundreds of entries for my domain there, including entries from before Let's Encrypt was a thing, while the subdomain checker returns an empty array.
- perryizgr8 1 year ago It detects only some of mine. To be precise, it does not detect subdomains being served by a service behind a Cloudflare Tunnel.
- xg15 1 year ago I think as soon as cert transparency was introduced, it was pretty clear we would eventually get something like this.
- TechBro8615 1 year ago I get a rate-limit error when I click the text input (I'm on a VPN).
- 867-5309 1 year ago use an obscure country like North Macedonia
- mmarquezs 1 year ago Nice, last time I used WolframAlpha for this.
- webprofusion 1 year ago This is a CT log search, right?
- zX41ZdbW 1 year ago How can I download the entire dataset from this service?
- maul666 1 year ago dpd.co.uk
- Ocha 1 year ago Missed some for me
- Ocha 1 year ago Maybe because I use wildcard certs with Let's Encrypt
- ThePowerOfFuet 1 year ago Instead of replying to yourself, try editing your first comment!
- tobinfekkes 1 year ago This is crazy, I was just looking for this exact thing a couple of days ago. Thank you for sharing. Brilliant work.