Part 5 – Certificate Transparency, CT Logs and Cert-Based Discovery

Introduction

Every time a company obtains a publicly trusted SSL/TLS certificate, it leaves a public trace.
That trace lives in Certificate Transparency logs.

Most teams forget this.
Attackers and good researchers do not.

In this Part, we will learn how to use certificate data to discover subdomains, staging apps, forgotten hosts, and third-party services.
Quietly. Reliably. With high confidence.

If passive OSINT was listening at the door, CT logs are reading the guest list.


Why certificate transparency matters

  • Publicly trusted SSL/TLS certificates are logged publicly by design.
  • Companies often issue certs for internal, staging, or test domains.
  • Even deleted infrastructure leaves historical cert records.
  • Cert data usually has higher signal than brute force wordlists.

If a hostname ever needed HTTPS, there is a good chance it exists in CT logs.


What you can find using CT logs

  • Subdomains never linked on the main site
  • Staging and dev environments
  • Old infrastructure still reachable
  • Third-party services tied to the company
  • Wildcard cert usage patterns
  • Candidates for subdomain takeover checks

This is one of the cleanest ways to grow your asset list.


How certificate transparency works (simple explanation)

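Whenever a publicly trusted SSL/TLS certificate is issued,
the Certificate Authority submits it to public, append-only CT logs.
Certificates from private, internal CAs usually never reach these logs.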

These logs include:

  • Domain names in the certificate
  • Issue date and expiry
  • Issuing CA

Anyone can query these logs.
That is what tools like crt.sh and certstream do.
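
To make those fields concrete, here is a minimal sketch that prints them from crt.sh-style JSON. It assumes each record carries the keys name_value, not_before, not_after and issuer_name, which may change on the crt.sh side. The function name ct_fields and the sample record are made up for illustration.

```shell
# ct_fields: read a crt.sh-style JSON array on stdin and print one
# tab-separated line per certificate: names, not_before, not_after, issuer.
# Assumption: the key names below match what crt.sh returns.
ct_fields() {
  jq -r '.[] | [(.name_value | gsub("\n"; ",")), .not_before, .not_after, .issuer_name] | @tsv'
}

# Illustrative record, not real data. A single cert can carry several names.
sample='[{"name_value":"dev.example.com\nstage.example.com","not_before":"2024-01-01T00:00:00","not_after":"2024-03-31T00:00:00","issuer_name":"C=US, O=Example CA"}]'

printf '%s\n' "$sample" | ct_fields
```

Against live data you would pipe the curl output into ct_fields instead of the sample.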


Tools you will use

  • crt.sh – manual and automated CT queries
  • certstream – live certificate monitoring
  • amass – cert enrichment and correlation
  • jq, sed, sort – parsing and cleanup
  • curl – automated queries
  • dnsx / dig – validation and enrichment

No heavy tooling needed here. Signal over noise.


Step-by-step: CT log discovery workflow

1. Manual crt.sh search (fast sanity check)

Open in browser:

https://crt.sh/?q=%25.example.com

Look for:

  • Unexpected subdomains
  • Environment names like dev, stage, internal
  • Long or oddly named hosts

This gives you intuition before automation.


2. Automated crt.sh extraction (copy-paste)

curl -s "https://crt.sh/?q=%25.example.com&output=json" \
| jq -r '.[].name_value' \
| sed 's/\*\.//g' \
| sort -u > crt_raw.txt

Explanation in simple words:

  • Fetch all cert records for the domain
  • Extract domain names
  • Remove wildcard prefixes
  • Deduplicate the list

This file is your CT asset base.


3. Cleaning noisy entries

Certificates sometimes include:

  • Email addresses
  • Invalid formatting
  • Duplicate hostnames

Clean them:

grep -E "^[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$" crt_raw.txt | sort -u > crt_clean.txt

Now you have only syntactically valid hostnames. Whether they still exist is a separate question, answered by resolving them.
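
If you want the cleanup to be a bit stricter, a small wrapper helps. The sketch below lowercases names, trims trailing whitespace and dots, strips a leading wildcard, and then applies the same hostname pattern. The function name clean_hosts is hypothetical, and the sample input is invented.

```shell
# clean_hosts: normalize raw CT names from stdin into plausible hostnames.
clean_hosts() {
  tr '[:upper:]' '[:lower:]' |                  # DNS names are case-insensitive
    sed -e 's/[[:space:]]*$//' \
        -e 's/^\*\.//' \
        -e 's/\.$//' |                          # trim whitespace, "*." and trailing dot
    grep -E '^[a-z0-9.-]+\.[a-z]{2,}$' |        # drops emails and malformed entries
    sort -u
}

# Invented sample: mixed case, wildcard, email, trailing dot, junk.
printf '%s\n' 'API.Example.com' '*.example.com' 'admin@example.com' \
  'api.example.com.' 'not a host' | clean_hosts
```

Usage on real data: clean_hosts < crt_raw.txt > crt_clean.txt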


Wildcard certificates and how to handle them

Wildcard certs look like this:

*.example.com

Important points:

  • A wildcard cert does NOT mean all subdomains exist
  • It only means the cert allows them
  • You must still resolve hosts to confirm they are real

Always treat wildcard entries as candidates, not assets.


Detecting wildcard certificate usage

List the unique wildcard entries (append | wc -l to count them):

curl -s "https://crt.sh/?q=%25.example.com&output=json" | jq -r '.[].name_value' | grep '\*' | sort -u

If wildcards dominate:

  • Expect noise
  • Focus on unique non-wildcard names first

Resolution and validation (critical step)

Never trust CT data blindly.

Resolve discovered hostnames:

dnsx -l crt_clean.txt -a -cname -resp -silent -o crt_resolved.txt

What this gives you:

  • IP addresses
  • CNAME chains
  • Confirmation of existence

Only resolved hosts move forward.
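
Because -resp adds record data to every line, the output file is not a plain host list. The sketch below splits it back apart. It assumes lines shaped like "host [A] [ip]" and "host [CNAME] [target]", which is what recent dnsx versions print; older versions may omit the record type, in which case only resolved_hosts applies. Both function names are hypothetical.

```shell
# resolved_hosts: unique hostnames from dnsx output (first column).
resolved_hosts() {
  awk 'NF { print $1 }' | sort -u
}

# cname_targets: "host target" pairs for lines tagged [CNAME].
# Assumption: the record type appears as the second column.
cname_targets() {
  awk '$2 == "[CNAME]" { gsub(/\[|\]/, "", $3); print $1, $3 }'
}

# Invented sample output (documentation IP ranges).
sample_dnsx='api.example.com [A] [203.0.113.10]
blog.example.com [CNAME] [example.github.io]
blog.example.com [A] [203.0.113.20]'

printf '%s\n' "$sample_dnsx" | resolved_hosts
printf '%s\n' "$sample_dnsx" | cname_targets
```

Usage on real data: resolved_hosts < crt_resolved.txt > crt_live_hosts.txt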


Enriching CT data

For each resolved host, collect:

  • IP address
  • CNAME target
  • Cloud provider hint
  • CDN usage

Example quick enrichment:

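# crt_resolved.txt holds dnsx output (hostname plus record data), so keep only the hostname column
awk '{print $1}' crt_resolved.txt | sort -u | while read -r h; do
  echo "---- $h ----"
  dig +short A "$h"        # IP addresses
  dig +short CNAME "$h"    # CNAME target, if any
done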

This helps identify:

  • Cloud services
  • Third-party hosting
  • Takeover opportunities later
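
A CNAME target often tells you the provider at a glance. The sketch below maps a few well-known suffixes to a hint. The list is a small illustrative sample, not a complete fingerprint set, and the function name provider_hint is hypothetical.

```shell
# provider_hint: map a CNAME target to a rough hosting hint.
provider_hint() {
  # dig prints targets with a trailing dot, so strip it and lowercase.
  t=$(printf '%s' "${1%.}" | tr '[:upper:]' '[:lower:]')
  case "$t" in
    *.cloudfront.net)    echo "AWS CloudFront" ;;
    *.amazonaws.com)     echo "AWS" ;;
    *.azurewebsites.net) echo "Azure App Service" ;;
    *.github.io)         echo "GitHub Pages" ;;
    *.herokuapp.com)     echo "Heroku" ;;
    *.fastly.net)        echo "Fastly" ;;
    *)                   echo "unknown" ;;
  esac
}

# Invented targets.
provider_hint 'd111111abcdef8.cloudfront.net.'
provider_hint 'internal.example.com'
```

A hint is only a starting point for a takeover check. Confirm the service state before drawing conclusions.
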

Live monitoring with certstream (advanced but powerful)

Certificate transparency is not only historical.
You can watch new certs in real time.

Certstream use-case

  • Catch new staging domains immediately
  • Monitor large programs continuously
  • Discover assets before others

Basic certstream setup (Python)

pip install certstream

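Simple listener example. The CLI streams every newly logged certificate, so filter the output for your domain:

certstream --full --disable-colors | grep --line-buffered -F "example.com"

Whenever a new cert covering the domain is logged, you see it in near real time.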

This is gold for active bug bounty programs.
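
For tighter matching than a plain text search, you can filter on the certificate names themselves. This is a sketch under two assumptions: that the certstream CLI can emit one JSON message per line (its --json option), and that names live under data.leaf_cert.all_domains. The function name watch_domain and the sample messages are made up.

```shell
# watch_domain: read certstream JSON messages (one per line) on stdin and
# print names that equal the target domain or sit under it.
watch_domain() {
  target="$1"
  jq --unbuffered -r '.data.leaf_cert.all_domains[]?' |
    awk -v d="$target" '
      { n = tolower($0); sub(/^\*\./, "", n) }   # normalize, drop "*."
      n == d || (length(n) > length(d) &&
                 substr(n, length(n) - length(d)) == "." d) { print n; fflush() }
    '
}

# Invented messages: one match, one lookalike domain, one heartbeat.
sample_stream='{"message_type":"certificate_update","data":{"leaf_cert":{"all_domains":["*.example.com","new-stage.example.com"]}}}
{"message_type":"certificate_update","data":{"leaf_cert":{"all_domains":["notexample.com"]}}}
{"message_type":"heartbeat"}'

printf '%s\n' "$sample_stream" | watch_domain example.com
```

Live usage would look like: certstream --json | watch_domain example.com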


How to reduce false positives

CT logs may include:

  • Typo domains
  • Unrelated subdomains
  • Shared certs

Filtering tips:

  • Resolve everything
  • Compare IP ranges with known assets
  • Check HTTP response and headers
  • Correlate with repo leaks and OSINT

Only keep what connects logically to the target.
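
Comparing IPs with known assets can be roughed out in a few lines. The sketch below tags each "host ip" pair as known or unknown by plain prefix matching against a file of IPv4 prefixes such as "203.0.113." (one per line). It is a simplification, not real CIDR math, and the function name tag_known_ips is hypothetical.

```shell
# tag_known_ips: read "host ip" pairs on stdin; $1 is a file of known IPv4
# prefixes, one per line. Prints host, ip, and "known" or "unknown".
tag_known_ips() {
  awk -v pf="$1" '
    BEGIN { while ((getline line < pf) > 0) if (line != "") known[line] = 1 }
    {
      tag = "unknown"
      for (p in known) if (index($2, p) == 1) tag = "known"   # ip starts with prefix
      print $1, $2, tag
    }
  '
}

# Invented data using documentation IP ranges.
prefixes=$(mktemp)
printf '%s\n' '203.0.113.' > "$prefixes"
printf '%s\n' 'a.example.com 203.0.113.7' 'b.example.com 198.51.100.9' |
  tag_known_ips "$prefixes"
rm -f "$prefixes"
```

An "unknown" tag does not mean out of scope. It only marks hosts that deserve a closer look.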


How CT data feeds other recon steps

CT output goes directly into:

  • Subdomain enumeration refinement
  • Subdomain takeover checks
  • URL collection
  • JS harvesting
  • Cloud footprint mapping

CT logs act as a high-confidence seed list.


Practical real-world use-cases

  • Finding staging.example.com used only for QA
  • Discovering forgotten admin-old.example.com
  • Catching new infrastructure added during a release
  • Identifying SaaS services connected via CNAME
  • Spotting takeover-prone dangling records

These are real bugs found repeatedly in programs.


Mini lab exercise (25–30 minutes)

  1. Pick a domain you own or a lab domain.
  2. Extract CT data:
curl -s "https://crt.sh/?q=%25.yourdomain.com&output=json" \
| jq -r '.[].name_value' \
| sed 's/\*\.//g' \
| sort -u > crt_raw.txt
  3. Clean and resolve:
grep -E "^[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$" crt_raw.txt | sort -u > crt_clean.txt
dnsx -l crt_clean.txt -a -cname -resp -silent -o crt_resolved.txt
  4. Open top three resolved hosts in your browser.
  5. Add notes to your tracker:
     • Host
     • Status
     • Why it looks interesting

This trains you to treat cert data seriously.


Common mistakes and how to avoid them

Mistake: Treating wildcard certs as real hosts
Fix: Always resolve and validate

Mistake: Ignoring historical certs
Fix: Old infrastructure is often still reachable

Mistake: Not correlating CT with other data
Fix: Combine with repo leaks and OSINT

Mistake: Forgetting live monitoring
Fix: Use certstream for active programs


Quick command summary

CT extraction:

curl -s "https://crt.sh/?q=%25.example.com&output=json" | jq -r '.[].name_value' | sed 's/\*\.//g' | sort -u > crt_raw.txt

Cleaning:

grep -E "^[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$" crt_raw.txt | sort -u > crt_clean.txt

Resolution:

dnsx -l crt_clean.txt -a -cname -resp -silent -o crt_resolved.txt

Wildcard check:

curl -s "https://crt.sh/?q=%25.example.com&output=json" | jq -r '.[].name_value' | grep '\*'

What to do after this Part

  • Merge CT hosts into your master subdomain list
  • Run takeover checks on CNAME-based hosts
  • Feed live hosts into URL and JS collection
  • Set up certstream monitoring for active targets

CT data becomes a long-term asset, not a one-time check.
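
Treating CT data as a long-term asset mostly means merging each new run into a master list and noticing what is new. A minimal sketch follows; the function name merge_new and the file names are hypothetical.

```shell
# merge_new: print hosts from file $2 that are not yet in master file $1,
# then fold them into the master list (kept sorted and unique).
merge_new() {
  master="$1"; fresh="$2"
  touch "$master"
  tmp=$(mktemp)
  sort -u "$master" > "$tmp"
  sort -u "$fresh" | comm -13 "$tmp" -     # lines only in the fresh list
  sort -u "$tmp" "$fresh" > "$master"
  rm -f "$tmp"
}

# Invented data in a throwaway directory.
demo=$(mktemp -d)
printf '%s\n' 'a.example.com' 'b.example.com' > "$demo/master.txt"
printf '%s\n' 'b.example.com' 'c.example.com' > "$demo/run.txt"
merge_new "$demo/master.txt" "$demo/run.txt"
rm -rf "$demo"
```

Run it after every extraction and the printed lines are your "new since last time" list.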


Next post preview

Part 6 – DNS Records, Zone Discovery and Wildcard Handling

We will cover:

  • DNS record types that matter for web apps
  • Zone transfer checks
  • Misconfigured DNS setups
  • Wildcard DNS detection and filtering
  • How DNS mistakes lead to real bugs

This builds directly on CT and subdomain work.


Closing thought

Certificates were meant to improve security.
They also improved visibility.

Use that visibility calmly and ethically.
CT logs reward patience and pattern recognition.


Disclaimer

This content is for educational purposes only. Use it ethically and only against targets you own or have explicit permission to test. Do not use any techniques described here in ways that break laws, platform rules, or third-party rights. If in doubt, stop and get permission.
