Next.js Discord

Discord Forum

lighthouse crawler file malformed

Answered
Arinji posted this in #help-forum
ArinjiOP
```ts
return {
  title: `${seoJson.data.name} (@${params.biolink}) · Feds.lol`,
  description: `Stay connected with @${params.biolink} and discover their online world in one convenient location.`,
  openGraph: {
    title: `${seoJson.data.name} (@${params.biolink}) · Feds.lol`,
    description: `Stay connected with @${params.biolink} and discover their online world in one convenient location.`,
    images: [
      {
        url: seoJson.data.profile_picture,
        alt: `${seoJson.data.name} Profile Picture`,
      },
    ],
  },
  robots: {
    index: true,
    follow: true,
    nocache: true,
    googleBot: {
      index: true,
      follow: true,
      noimageindex: false,
      "max-video-preview": -1,
      "max-image-preview": "large",
      "max-snippet": -1,
    },
  },
};
```

When I do a Lighthouse report my SEO score drops because "crawler file is malformed".

Any idea why this happens?
Answered by not-milo.tsx

27 Replies

ArinjiOP
The docs show the same code, so I'm really confused now lol
ArinjiOP
Bump
not-milo.tsx
I don't think it's related to the code you posted. That generates some meta tags for web crawlers.

If you want to have a properly formed robots.txt file, either include one at the root of your app directory or generate one dynamically with a [robots.ts](https://nextjs.org/docs/app/api-reference/file-conventions/metadata/robots) file.
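As a rough sketch of that second option, an app/robots.ts could look like the one below. The /dash path and the sitemap URL are assumptions for illustration; in a real Next.js app the return value would be typed as MetadataRoute.Robots from "next", but the import and annotation are omitted here so the snippet stands alone.

```typescript
// Sketch of app/robots.ts (type annotation from "next" omitted so this runs standalone).
// Next.js serves the returned object as a generated robots.txt.
export default function robots() {
  return {
    rules: {
      userAgent: "*",
      allow: "/",        // explicitly allow crawling of public pages
      disallow: "/dash", // keep the (assumed) dashboard route out of the index
    },
    sitemap: "https://example.com/sitemap.xml", // hypothetical sitemap URL
  };
}
```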
ArinjiOP
oh that could be it actually, lemme try :D
ArinjiOP
ok so i just got this new Lighthouse SEO issue
(screenshot of the Lighthouse SEO issue)
my robots.txt file:

```
User-agent: *
Disallow: /dash
```

The page im trying to index is https://website-git-frontend-v2-fedslol.vercel.app/techy
not-milo.tsx
You need an Allow entry like this:

```
User-Agent: *
Allow: /
Disallow: /private/

Sitemap: https://acme.com/sitemap.xml
```
Answer
ArinjiOP
oh ok, lemme try :D
also for the sitemap, i have a lot of users with their own pages, like /techy
so do i need to like dynamically generate all of them?
if i want them to be indexed as well
not-milo.tsx
Yup, you have to and you can do that dynamically too as explained here: https://nextjs.org/docs/app/api-reference/file-conventions/metadata/sitemap#generate-a-sitemap
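For the per-user pages, a minimal app/sitemap.ts sketch could look like this. fetchUsernames() is a hypothetical stand-in for whatever database or API call returns the registered biolinks; it's stubbed with fixed data here so the snippet runs on its own, and the domain is a placeholder.

```typescript
// Hypothetical stand-in for a real DB/API call that returns every user's biolink.
async function fetchUsernames(): Promise<string[]> {
  return ["techy", "arinji"]; // stubbed data for illustration
}

// Sketch of app/sitemap.ts: one entry for the home page, plus one per user page.
export default async function sitemap() {
  const users = await fetchUsernames();
  return [
    { url: "https://example.com", lastModified: new Date() },
    ...users.map((name) => ({
      url: `https://example.com/${name}`,
      lastModified: new Date(),
    })),
  ];
}
```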
ArinjiOP
Alright, thank you, lemme check if the Allow entry works
Also sorry about the cross posting thing, i thought it was like one question per thread
not-milo.tsx
Right now it's only limited to the url and lastModified fields. So if you need something more sophisticated you'll have to generate it in a different way. (If you need it I can show you)
ArinjiOP
Oh no need, the lastModified thing is optional right?
not-milo.tsx
Oh no, not necessarily. If the questions are related you can post them in the same thread. The only thing the server doesn't allow is posting the same question in multiple places.
ArinjiOP
yea, got it now. Thanks :D
not-milo.tsx
Yup, the sitemap return type is this one:

```ts
type Sitemap = Array<{
  url: string
  lastModified?: string | Date
}>
```
ArinjiOP
alright, actually i was thinking, we might want to add some more details like the name and the description of the user. So could you show me how i would have to set that up?
not-milo.tsx
The trick is to use middleware and return a custom xml response when the crawler requests /sitemap.xml.

See it in action here: https://github.com/milovangudelj/milovangudelj.com/blob/master/middleware.ts#L31-L39

And here: https://github.com/milovangudelj/milovangudelj.com/blob/master/lib/sitemap.ts
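The XML-building half of that approach can be sketched like this. The entry shape (url + lastModified) follows the standard sitemap protocol; the helper name and data are illustrative, not taken from the linked repo. In actual middleware you would return the string via something like `new NextResponse(xml, { headers: { "Content-Type": "application/xml" } })`.

```typescript
// Builds a sitemap XML string by hand, as the middleware approach does.
type Entry = { url: string; lastModified: Date };

function buildSitemapXml(entries: Entry[]): string {
  // One <url> element per entry, with <loc> and <lastmod> children.
  const body = entries
    .map(
      (e) =>
        `  <url>\n    <loc>${e.url}</loc>\n    <lastmod>${e.lastModified.toISOString()}</lastmod>\n  </url>`
    )
    .join("\n");
  return `<?xml version="1.0" encoding="UTF-8"?>\n<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n${body}\n</urlset>`;
}
```

Because you build the string yourself, any extra per-user detail you can fetch can be embedded in the entries.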
ArinjiOP
oh, cool lemme check it out
oh that's very neat, thank you so much :D
not-milo.tsx
No problem ✌🏻
If that's all you can mark the question as resolved
ArinjiOP
Yup, it's deploying the site, once i can verify the Allow thing has worked imma resolve it :D
yay 100 SEO, thank you so much man, you are awesome.
not-milo.tsx
Great success 💪🏻