The Complete Guide to XML Sitemaps (And Why Yours Is Probably Wrong)
What an XML Sitemap Actually Does
An XML sitemap is a file that lists the URLs you want search engines to discover and index. Think of it as a roadmap for Google's crawler.
Important clarification: submitting a URL in your sitemap does not guarantee indexation. Google still decides whether a page is worth indexing. But a sitemap helps Google find pages faster, especially new pages and pages buried deep in your site structure.
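To make the file format concrete, here is a minimal sketch in Python that builds a sitemap from a list of URLs using only the standard library. The function name build_sitemap and the example.com URLs are illustrative assumptions, not part of any particular CMS or tool.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML for (url, lastmod) pairs; lastmod may be None.

    build_sitemap is a hypothetical helper written for this example.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & in the URL text.
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    # Save the returned string as UTF-8 so it matches the declaration.
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap([
        ("https://example.com/", "2025-01-15"),
        ("https://example.com/about", None),
    ]))
```

Only the loc element is required for each URL; lastmod is optional and is left out here when no date is known.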
Common Sitemap Mistakes
Including Non-Indexable Pages
Your sitemap should only contain indexable pages: URLs that return a 200 status code, are not blocked by robots.txt, carry no noindex directive, and do not have a canonical tag pointing elsewhere. Including redirects, 404s, or noindexed pages sends mixed signals and wastes your crawl budget.
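The eligibility rules above can be sketched as a simple filter. This assumes you already have crawl data for each page; the Page record and its field names are hypothetical and would need to be mapped to whatever your crawler or CMS exports.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    """Hypothetical crawl record; field names are assumptions for this sketch."""
    url: str
    status: int
    noindex: bool = False
    blocked_by_robots: bool = False
    canonical: Optional[str] = None  # None means the page has no canonical tag


def is_sitemap_eligible(page: Page) -> bool:
    """Apply the checks described above, one rule per condition."""
    if page.status != 200:
        return False  # redirects, 404s, server errors
    if page.noindex or page.blocked_by_robots:
        return False
    if page.canonical is not None and page.canonical != page.url:
        return False  # canonical points to a different URL
    return True


def eligible_urls(pages):
    """Return the URLs that are safe to list in the sitemap."""
    return [p.url for p in pages if is_sitemap_eligible(p)]
```

Running a filter like this before every sitemap build is one way to keep redirects and noindexed pages from slipping back in.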
One Giant Sitemap
If you have more than 50,000 URLs or a sitemap file larger than 50MB uncompressed, split it into multiple sitemaps and list them in a sitemap index file. Even for smaller sites, organizing sitemaps by section (blog, products, categories) makes monitoring easier.
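As a rough sketch of the splitting step, the following Python divides a URL list into files of at most 50,000 entries. The helper names and the sitemap-N.xml naming scheme are assumptions for illustration, and the file size limit is not checked here.

```python
MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit from the sitemap protocol


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a sequence of URLs into consecutive chunks of at most `size`."""
    urls = list(urls)
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def plan_sitemap_files(urls, prefix="sitemap", size=MAX_URLS_PER_SITEMAP):
    """Map hypothetical filenames (sitemap-1.xml, ...) to their URL chunks."""
    chunks = chunk_urls(urls, size)
    return {f"{prefix}-{n}.xml": chunk for n, chunk in enumerate(chunks, start=1)}
```

A site with 120,001 URLs would end up with three files under this scheme: two full ones and a third holding the remaining 20,001 URLs.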
Stale lastmod Dates
The lastmod tag should reflect when the page content actually changed, not when the sitemap was regenerated. If every lastmod value carries the same timestamp, Google has no reason to trust the dates and may ignore them entirely.
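One way to keep lastmod honest is to change it only when the page content itself changes. The sketch below does this by comparing a hash of the content; the function names and the (fingerprint, lastmod) tuple are assumptions for this example, and a real CMS may already store a reliable modification date you can use instead.

```python
import hashlib
from datetime import datetime, timezone


def content_fingerprint(body: str) -> str:
    """Hash the page content so changes can be detected cheaply."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def next_lastmod(previous, body, now=None):
    """Return a (fingerprint, lastmod) pair for the page.

    `previous` is the pair stored on the last build, or None for a new page.
    The date only moves forward when the content hash differs.
    """
    now = now or datetime.now(timezone.utc)
    fingerprint = content_fingerprint(body)
    if previous is not None and previous[0] == fingerprint:
        return previous  # content unchanged: keep the old lastmod
    stamp = now.astimezone(timezone.utc).isoformat(timespec="seconds")
    return (fingerprint, stamp)
```

In practice you would hash only the main content rather than the full HTML, since template changes such as a new footer would otherwise update every date at once.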
Missing from Robots.txt
Your robots.txt file should reference your sitemap location. Add this line:
Sitemap: https://yoursite.com/sitemap.xml
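To confirm that the directive is actually present, you can parse robots.txt with Python's standard urllib.robotparser module (its site_maps() method requires Python 3.8 or later). The wrapper function sitemaps_declared is a hypothetical name used for this sketch, and it works on text you have already fetched rather than making a network request.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_declared(robots_txt: str):
    """Return the sitemap URLs listed in the given robots.txt content."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    sample = (
        "User-agent: *\n"
        "Disallow: /admin/\n"
        "\n"
        "Sitemap: https://yoursite.com/sitemap.xml\n"
    )
    print(sitemaps_declared(sample))
```

An empty list is a signal that the Sitemap line is missing or malformed.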
Sitemap Best Practices
Only Include Canonical URLs
If page A has a canonical tag pointing to page B, only include page B in your sitemap.
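A small sketch of that rule, assuming you have a mapping from each crawled URL to the canonical URL it declares (the canonical_of dictionary and the function name are hypothetical):

```python
def canonical_sitemap_urls(canonical_of):
    """Return each canonical target once, in first-seen order.

    canonical_of maps every crawled URL to its declared canonical URL
    (a self-canonical page maps to itself). A target is kept only if it
    was crawled and is itself self-canonical, which is a conservative
    choice made for this sketch: it drops canonical chains and targets
    that have not been verified.
    """
    kept = []
    for target in dict.fromkeys(canonical_of.values()):
        if canonical_of.get(target) == target:
            kept.append(target)
    return kept
```

For example, if /shoes?sort=price declares /shoes as canonical and /shoes declares itself, only /shoes ends up in the sitemap.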
Update Dynamically
When you publish, update, or delete a page, your sitemap should update automatically. Most CMS platforms handle this, but verify it works correctly.
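If your platform does not handle this for you, the idea can be sketched as a small registry that your publish, update, and delete hooks call. The SitemapRegistry class and its method names are assumptions for illustration and do not correspond to any specific CMS API.

```python
class SitemapRegistry:
    """In-memory record of sitemap URLs and their lastmod values (a sketch)."""

    def __init__(self):
        self._lastmod = {}

    def on_publish(self, url, modified_at):
        """Call when a new page goes live."""
        self._lastmod[url] = modified_at

    def on_update(self, url, modified_at):
        """Call when page content changes; overwrites the stored lastmod."""
        self._lastmod[url] = modified_at

    def on_delete(self, url):
        """Call when a page is removed; unknown URLs are ignored."""
        self._lastmod.pop(url, None)

    def entries(self):
        """Return (url, lastmod) pairs sorted by URL for stable output."""
        return sorted(self._lastmod.items())
```

A real implementation would persist this state and regenerate the sitemap file from entries() after each change.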
Use Sitemap Index for Large Sites
Structure your sitemaps logically:
- sitemap-posts.xml
- sitemap-pages.xml
- sitemap-products.xml
- sitemap-categories.xml
Each of these is then referenced in a sitemap-index.xml file.
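A sitemap index is itself a small XML file that lists the location of each child sitemap. The following sketch generates one for the filenames above; build_sitemap_index and the yoursite.com base URL are assumptions for this example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(base_url, filenames):
    """Return sitemap index XML that references each child sitemap file."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for name in filenames:
        sitemap = ET.SubElement(index, "sitemap")
        # Child sitemap locations must be absolute URLs.
        ET.SubElement(sitemap, "loc").text = f"{base_url.rstrip('/')}/{name}"
    body = ET.tostring(index, encoding="unicode")
    # Save the returned string as UTF-8 so it matches the declaration.
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap_index("https://yoursite.com", [
        "sitemap-posts.xml",
        "sitemap-pages.xml",
        "sitemap-products.xml",
        "sitemap-categories.xml",
    ]))
```

You then submit only the index file, and search engines discover the child sitemaps from it.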
Monitor in Search Console
After submitting your sitemap, monitor the "Submitted" vs "Indexed" numbers. A large gap between submitted and indexed pages indicates quality or technical issues that need investigation.
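If you export the list of indexed URLs, the gap can be measured with a simple set comparison. This is a sketch under the assumption that you have both lists as plain URL strings; the indexing_gap function is a hypothetical helper, not a Search Console API.

```python
def indexing_gap(submitted, indexed):
    """Compare submitted sitemap URLs with URLs reported as indexed.

    Returns (missing, ratio): the submitted URLs that are not indexed,
    sorted for stable output, and the share of submitted URLs indexed.
    """
    submitted = set(submitted)
    indexed = set(indexed)
    missing = sorted(submitted - indexed)
    ratio = len(submitted & indexed) / len(submitted) if submitted else 0.0
    return missing, ratio
```

The list of missing URLs is usually more useful than the ratio, because it tells you which pages to inspect first.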
Priority and Changefreq
Google ignores the priority and changefreq values, so do not spend time optimizing them. Focus on accurate lastmod dates and clean URL inclusion instead.
When Sitemaps Matter Most
Sitemaps are most critical for:
- New sites without many backlinks
- Large sites with thousands of pages
- Sites with pages not reachable through internal links
- Sites that publish frequently and need new content discovered quickly
Small, well-linked sites benefit less from sitemaps, but there is no downside to having one.