
November 11, 2025

Robots.txt


What is a robots.txt?

The robots.txt is a small but incredibly important text file that lives in the root directory of your website.
It tells search engines which pages or sections they may or may not crawl.

Think of it like a digital doorman.
Before a search engine enters your site, it first looks at this file to see where it's welcome and where it's not.
As an SEO specialist, I use robots.txt to control the crawling behavior of bots and prevent unimportant or sensitive content from being indexed.

Where can you find the robots.txt?

You can always view it by appending /robots.txt to your domain name.
For example:

https://www.rankrocket.nl/robots.txt

The file is publicly visible; anyone can view it, but only crawlers actually use it.

What does a robots.txt look like?

A basic example:

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Sitemap: https://www.rankrocket.nl/sitemap.xml

Explanation:

  • User-agent: * means that the rules apply to all crawlers.
  • Disallow blocks certain folders or pages.
  • Allow gives exceptions within a blocked folder.
  • Sitemap refers to the XML sitemap so that Google knows where the important URLs are.

In my own projects, I always add this last point — it helps Google find the right pages right away.
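If you want to sanity-check how these directives combine, you can parse the example file with Python's built-in urllib.robotparser; a minimal sketch (no network request needed, URLs taken from the example above):

```python
from urllib.robotparser import RobotFileParser

# The example robots.txt from above, parsed locally.
rules = [
    "User-agent: *",
    "Disallow: /wp-admin/",
    "Allow: /wp-admin/admin-ajax.php",
    "Sitemap: https://www.rankrocket.nl/sitemap.xml",
]

rp = RobotFileParser()
rp.parse(rules)

# The admin folder is off-limits, but a normal page is fine.
print(rp.can_fetch("*", "https://www.rankrocket.nl/wp-admin/"))  # False
print(rp.can_fetch("*", "https://www.rankrocket.nl/blog/"))      # True
```

One caveat: Python's parser applies rules first-match, while Google picks the most specific matching rule, so edge cases around Allow exceptions can differ between the two.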

Why is robots.txt important for SEO?

The robots.txt has a direct influence on how Google crawls your site.
A properly configured file prevents crawlers from wasting time on irrelevant or duplicate pages, such as:

  • /wp-admin/ or /cgi-bin/
  • internal search results
  • filter pages or URLs with tracking parameters
  • testing or staging environments

By excluding those parts, Google focuses its crawl budget on the pages that really matter.
That means faster indexation of important content and less noise in your search results.
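For a typical WordPress site, an exclusion block covering those categories might look like this (the paths are illustrative, not taken from a real site; the * wildcard in paths is understood by Google and Bing but is not part of the original robots.txt standard):

```
User-agent: *
Disallow: /wp-admin/
Disallow: /cgi-bin/
Disallow: /?s=
Disallow: /search/
Disallow: /*?orderby=
Disallow: /*?utm_
Disallow: /staging/
```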

Robots.txt in practice

In my work at Rank Rocket, I often come across it: a wrong line in the robots.txt that unintentionally blocks an entire website.
At one customer, someone had accidentally added this:

Disallow: /

As a result, Google couldn't crawl anything at all.
The site disappeared from search results within a few days.

Since then, checking the robots.txt has been the first step of every SEO audit I run.
One wrong slash can literally cost thousands of dollars in organic traffic.
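You can reproduce that disaster scenario locally with Python's urllib.robotparser; a small sketch (the domain is hypothetical):

```python
from urllib.robotparser import RobotFileParser

# The accidental "block everything" file from the story above.
rp = RobotFileParser()
rp.parse(["User-agent: *", "Disallow: /"])

# One slash is enough to lock every crawler out of every URL.
print(rp.can_fetch("Googlebot", "https://www.example.com/"))           # False
print(rp.can_fetch("Googlebot", "https://www.example.com/any/page/"))  # False
```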

What's better not to do

  • Don't use robots.txt to hide pages that have already been indexed. For that, use noindex tags or remove the URL.
  • Don't block CSS or JS files that Google needs to render and understand the page.
  • Never put confidential information in a public path — robots.txt only stops bots, not humans.

Robots.txt and sitemaps

I always add a line to the XML sitemap, like:

Sitemap: https://www.rankrocket.nl/sitemap.xml

This isn't mandatory, but it makes it easier for search engines to find all important URLs right away.
It's a small effort that often has a noticeable effect on crawling efficiency.
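To verify programmatically that the Sitemap line is picked up, Python's urllib.robotparser exposes it via site_maps() (available since Python 3.8); a quick sketch:

```python
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /wp-admin/",
    "Sitemap: https://www.rankrocket.nl/sitemap.xml",
])

# site_maps() returns all Sitemap lines, or None if the file has none.
print(rp.site_maps())  # ['https://www.rankrocket.nl/sitemap.xml']
```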

Tools to test your robots.txt

I use three methods myself:

  1. Google Search Console: under Settings > Crawling you can check for blockages.
  2. Robots.txt Tester: part of the old Google Webmaster Tools, still usable via a direct link.
  3. Screaming Frog: this simulates Googlebot's behavior and shows exactly which URLs are excluded.

This is how I quickly discover whether a block is technically correct, or whether something goes wrong in how the rules are interpreted.
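Alongside those tools, a few lines of Python can act as a quick regression check during an audit: feed in the robots.txt content plus a list of URLs that must stay crawlable, and flag anything that gets blocked. A sketch (the helper name and example URLs are my own illustration):

```python
from urllib.robotparser import RobotFileParser

def blocked_urls(robots_txt, urls, agent="Googlebot"):
    """Return the URLs that `robots_txt` blocks for `agent`."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return [u for u in urls if not rp.can_fetch(agent, u)]

rules = "User-agent: *\nDisallow: /wp-admin/\n"
important = [
    "https://www.example.com/",
    "https://www.example.com/blog/",
    "https://www.example.com/wp-admin/",
]
# Only the admin path should show up as blocked.
print(blocked_urls(rules, important))  # ['https://www.example.com/wp-admin/']
```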

Looking for help with your SEO?

Get in touch for free and let's take a look at your website together!

🚀 Free SEO scan

Get instant insight into the SEO opportunities for your website.


Daan Coenen

I am Daan Coenen, SEO specialist and founder of Rank Rocket. For more than six years I have been helping companies in the Netherlands and beyond become sustainably more visible in Google, with strategy, technical work, and content that really works.