Next.js Discord

Discord Forum

How to allow bots to index all pages in robots.txt?

Unanswered
Netherland Dwarf posted this in #help-forum
Netherland DwarfOP
Hello, I have the following code:
import { MetadataRoute } from 'next';
import { BASE_URL } from '@/constants';
const robots = (): MetadataRoute.Robots => {
  return {
    rules: {
      userAgent: '*',
    },
    sitemap: `${BASE_URL}/sitemap.xml`,
  };
};
export default robots;

I checked the Next.js docs and they show an allow prop, but if I remove the allow prop as shown in the video, will all pages still be allowed for bots by default?
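For reference, a minimal sketch of the same file with the allow rule written out explicitly, assuming the same BASE_URL constant and that the file lives at app/robots.ts:

import { MetadataRoute } from 'next';
import { BASE_URL } from '@/constants';

// Explicitly allow every path instead of relying on the default behaviour
const robots = (): MetadataRoute.Robots => {
  return {
    rules: {
      userAgent: '*',
      allow: '/',
    },
    sitemap: `${BASE_URL}/sitemap.xml`,
  };
};

export default robots;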

9 Replies

Netherland DwarfOP
Update: I looked at the original robots.txt docs via the wiki linked in the Next.js doc
And it mentions that an empty disallow: and allow: '/' both mean all pages are allowed
If anyone can confirm this that would be great
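For what it's worth, with allow: '/' set as in the sketch above, the generated file should come out along these lines (the exact Sitemap URL depends on BASE_URL; example.com is just a placeholder):

User-Agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml

Either way, a user-agent group with no Disallow rule blocks nothing, so everything stays crawlable.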
Bengal
I've honestly never used allow:'/', but I do use Disallow: as empty.
Typically in my case I only want Google to index me, so my robots.txt always contains
User-agent: Googlebot
Disallow:
User-agent: Googlebot-Image
Disallow:

You might not need the two Disallow lines
With robots.txt you can always check how other websites handle it, since /robots.txt is always public
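If you wanted to express that Googlebot-only setup through the same Metadata API, the rules field can also take an array of rule objects. A rough sketch, reusing the BASE_URL constant from above and using allow: '/' in place of the blank Disallow, which is awkward to express in the typed API:

import { MetadataRoute } from 'next';
import { BASE_URL } from '@/constants';

const robots = (): MetadataRoute.Robots => {
  return {
    rules: [
      // One rule object per user agent, mirroring the robots.txt above
      { userAgent: 'Googlebot', allow: '/' },
      { userAgent: 'Googlebot-Image', allow: '/' },
    ],
    sitemap: `${BASE_URL}/sitemap.xml`,
  };
};

export default robots;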
Netherland DwarfOP
@Bengal thanks, and I can't because I'm using TypeScript
It will throw an error if I leave it empty
So I gave it an empty array
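For anyone hitting the same type error, a small sketch of a rule shape that type-checks when nothing should be blocked (the constant name here is just for illustration):

import { MetadataRoute } from 'next';

// disallow accepts string | string[], so an empty array satisfies the type;
// with nothing listed, no path ends up blocked
const openRule: MetadataRoute.Robots['rules'] = {
  userAgent: '*',
  allow: '/',
  disallow: [],
};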
Bengal
Check what's generated. You can always go to localhost:3000/robots.txt, or check the generated robots.txt file; it should be somewhere in your project files
Netherland DwarfOP
Yeah, it shows the same as what I typed