
November 11, 2025

Meta robots tag

Author:

Daan Coenen

What is the meta robots tag

The meta robots tag is an HTML instruction that tells search engines how to treat your page.
With this tag, I decide whether or not a page can be indexed, whether links should be followed, and whether certain snippets may appear in the search results.

In short: with the meta robots tag, you give Google direction on indexing, crawling, and display.

<meta name="robots" content="index, follow">

This example tells Google: “you can include this page in the index and follow the links”.
By using the tag smartly, I control the way search engines read your website, which is essential for long-term SEO performance.

Why the meta robots tag is important

Without clear meta robots instructions, Google assumes the default: index, follow.
This means that everything that is found is also included in the index.

This is fine for small websites, but for larger projects, such as a webshop with thousands of product variants or filter pages, it leads to the indexation of irrelevant or duplicate URLs.
The result: Google wastes crawl budget on pages you'd rather not see in search results.

By using the meta robots tag strategically, I ensure that:

  • Only valuable content is indexed
  • Internal link value is distributed in a targeted manner
  • Crawl budget is used efficiently
  • Search engines don't get confused by duplicates

How to use the meta robots tag

In my SEO processes, I use the meta robots tag at three levels:

1. Indexation Management

For each page type, I decide whether or not it can be included in the index.
For example:

  • Blog articles and product pages → index, follow
  • Filter or sort pages → noindex, follow
  • Thank you and login pages → noindex, nofollow
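For example, a filter page that should pass on link value but stay out of the index would carry the second directive in its `<head>`:

```html
<meta name="robots" content="noindex, follow">
```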

2. Snippet control

I use additional attributes to control what Google can show in search results.

<meta name="robots" content="max-snippet:160, max-image-preview:large, max-video-preview:0">

This way, you can maintain control over how your snippet is displayed without blocking anything.

3. Temporary or seasonal pages

For temporary campaigns, I use:

<meta name="googlebot" content="unavailable_after: 31 Dec 2025 23:59:00 CET">

After this date, the page will automatically disappear from the index.

The different values of the meta robots tag

| Value | Function | Use case |
| ----------------- | ------------------------------------------------------- | ------------------------------------------------ |
| index | Allows indexation | Default setting |
| noindex | Excludes the page from indexation | Test, filter, or thank-you pages |
| follow | Lets Google follow links | Default for internal navigation |
| nofollow | Tells Google to ignore links | Only for confidential or temporary pages |
| nosnippet | Hides the description in search results | For private or paid content |
| noarchive | Prevents caching by Google | For pages with time-sensitive info |
| notranslate | Prevents automatic translation | For multilingual brands |
| noimageindex | Excludes images from indexation | For exclusive visual content |
| max-snippet | Limits the snippet length | For control over SERP display |
| unavailable_after | Removes the page after a specific date | For campaigns or temporary promotions |
| indexifembedded | Allows content to be indexed when embedded in an iframe | For embedded videos or tools |

Meta robots vs X-Robots-Tag

I use the meta robots tag for HTML pages.
For files such as PDFs or images, there's the X-Robots-Tag, which you add to the HTTP header.

<FilesMatch "\.(pdf|doc|jpg)$">
Header set X-Robots-Tag "noindex, noarchive"
</FilesMatch>

This is useful for files that have no HTML to place a meta tag in, but that you still want to keep out of the index.

Meta robots vs robots.txt

The meta robots tag and robots.txt look similar, but serve completely different functions:

| Element | What it does | Level | Does Google still read the page? |
| ----------- | ---------------------- | ------------ | ----------------------------------- |
| robots.txt | Blocks crawl traffic | Path level | No, Google does not see the content |
| meta robots | Controls indexation | Page level | Yes, Google reads and follows the page |

A mistake I often see: people set a page to noindex, but also block it in robots.txt.
Google can then never read the noindex, so the page remains in the index.
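To illustrate the conflict (the /filter/ path is a hypothetical example): a robots.txt rule like the one below stops Google from ever fetching the page, so the noindex in its HTML is never seen.

```
User-agent: *
Disallow: /filter/
```

Remove the Disallow rule first, let Google crawl the noindex, and only block the path afterwards if needed.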

My rule:

“Do you want Google to forget something? Then let it read the page first.”

Meta robots and canonical

The canonical tag and meta robots can work together, but not every combination makes sense.
I use canonical when I want Google to consider multiple pages as one (e.g. product color variants).
I only use the noindex if the page really shouldn't be visible.

Important: do not combine them in contradictory ways.
A page with both a noindex and a canonical pointing to itself is confusing: Google doesn't know which instruction should win.
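As an illustration (with a hypothetical URL), this is the contradictory combination to avoid on a single page:

```html
<meta name="robots" content="noindex">
<link rel="canonical" href="https://example.com/page/">
```

The canonical nominates the page for indexation while the noindex excludes it, so pick one signal per page.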

Meta robots and crawl budget

On large websites, crawl budget also plays a role.
Every Google crawl costs time and resources.
By setting irrelevant pages to noindex, follow, the internal link structure stays intact while the index is cleaned up.

For a customer with 40,000 filter pages, we applied this approach.
Result:

  • 60% fewer redundant URLs in the index
  • New products were indexed 3 times faster
  • 25% increase in crawl efficiency (according to server logs)

Audit and Validation

After implementation, I always check that Google interprets the instructions correctly:

  1. URL inspection (Google Search Console)
    • See if the page shows “Allowed for indexing.”
  2. Screaming Frog (SEO Spider)
    • Use custom extraction to retrieve meta robots.
  3. site: search in Google
    • Check whether the page still appears in the search results.
  4. Server logs
    • See if Googlebot continues to crawl the page after the change.

Common mistakes

  • Setting noindex on staging but forgetting to remove it when going live.
  • Combining noindex with a Disallow rule in robots.txt.
  • Using nofollow on internal links, breaking your own link structure.
  • Still including noindex pages in the XML sitemap.
  • Templates that accidentally ship with a sitewide noindex.

The future of meta robots

Although AI and machine learning are getting smarter, the meta robots tag remains essential.
Even Google's generative search models (such as SGE) use these signals to determine what content should or should not be displayed in contextual responses.

A well-designed meta robots strategy not only helps with indexation management, but also with brand control in AI results.

My advice

The meta robots tag is a powerful tool, but only if you use it strategically.
I always use three steps:

  1. Create a decision tree per page type: what is allowed in the index, what is not?
  2. Verify that robots.txt, canonical, and sitemap work together logically.
  3. After going live, test with Search Console and a technical crawler.

At Rank Rocket, this is a standard part of every technical SEO audit.
The right meta robots implementation saves frustration, indexing problems and hours of manual work.

Looking for help with your SEO?

Get in touch for free and let's take a look at your website together!

🚀 Free SEO scan

Get instant insight into the SEO opportunities for your website.


Daan Coenen

I am Daan Coenen, SEO specialist and founder of Rank Rocket. For more than six years, I have been helping businesses in the Netherlands and beyond become sustainably more findable in Google, with strategy, technology, and content that truly works.