sico – A Sitemap comparison tool

A sitemap comparison tool that helps you not fuck up your website migration.


Website migration

Imagine you want to migrate a website. Let's say you move your personal website (incl. a blog) from Hugo to Astro. A few of your blog posts rank pretty well in Google, so you want to keep all previous URLs.

This tool checks whether all URLs from the old website’s sitemap are present in the new one. It is not a 1:1 check: the new site’s sitemap may contain additional URLs.
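The one-directional check described above can be sketched in a few lines of Go. This is only an illustration, not sico's actual code: the names `urlSet`, `parseLocs` and `missingURLs` are made up for this example, and the `example.com` URLs are placeholders.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// urlSet mirrors the <urlset> root element of a sitemap file.
// Hypothetical type for this sketch, not taken from sico.
type urlSet struct {
	URLs []struct {
		Loc string `xml:"loc"`
	} `xml:"url"`
}

// parseLocs extracts all <loc> values from a sitemap document.
func parseLocs(doc []byte) ([]string, error) {
	var set urlSet
	if err := xml.Unmarshal(doc, &set); err != nil {
		return nil, err
	}
	locs := make([]string, 0, len(set.URLs))
	for _, u := range set.URLs {
		locs = append(locs, u.Loc)
	}
	return locs, nil
}

// missingURLs returns every entry of oldURLs that is absent from newURLs.
// Extra entries in newURLs are ignored, so this is not a 1:1 comparison.
func missingURLs(oldURLs, newURLs []string) []string {
	present := make(map[string]struct{}, len(newURLs))
	for _, u := range newURLs {
		present[u] = struct{}{}
	}
	var missing []string
	for _, u := range oldURLs {
		if _, ok := present[u]; !ok {
			missing = append(missing, u)
		}
	}
	return missing
}

func main() {
	oldDoc := []byte(`<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/a/</loc></url>
  <url><loc>https://example.com/b/</loc></url>
</urlset>`)
	newDoc := []byte(`<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/a/</loc></url>
  <url><loc>https://example.com/c/</loc></url>
</urlset>`)

	oldURLs, err := parseLocs(oldDoc)
	if err != nil {
		panic(err)
	}
	newURLs, err := parseLocs(newDoc)
	if err != nil {
		panic(err)
	}
	fmt.Println("missing in new sitemap:", missingURLs(oldURLs, newURLs))
}
```

The lookup uses a map so that each old URL is checked in constant time. URLs are compared as exact strings here; whether sico normalizes them (trailing slashes, case) is not stated in this README.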


Usage of ./sico:
  -exclude value
        Regex to match against URLs in {source} sitemap that don't need to be in {new} sitemap. It can be defined multiple times.
  -new string
        New Sitemap URL - Sitemap entries you want to check for presence (default "")
  -newBaseURL new
        Base URL that will be used if new contains a SitemapIndex to replace the SitemapIndex entries
  -source string
        Source Sitemap URL - Sitemap you want to check against (default "")
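A flag that "can be defined multiple times", like `-exclude`, is commonly built in Go by implementing the standard `flag.Value` interface. The sketch below shows that pattern under the assumption that each occurrence is compiled as a regular expression; `regexList` and `parseExcludes` are hypothetical names and may differ from what sico does internally.

```go
package main

import (
	"flag"
	"fmt"
	"regexp"
	"strings"
)

// regexList collects every occurrence of a repeatable flag as a compiled regex.
// Hypothetical type for this sketch.
type regexList []*regexp.Regexp

// String implements flag.Value. It must tolerate a nil receiver.
func (r *regexList) String() string {
	if r == nil {
		return ""
	}
	parts := make([]string, 0, len(*r))
	for _, re := range *r {
		parts = append(parts, re.String())
	}
	return strings.Join(parts, ",")
}

// Set implements flag.Value. The flag package calls it once per occurrence.
func (r *regexList) Set(value string) error {
	re, err := regexp.Compile(value)
	if err != nil {
		return err
	}
	*r = append(*r, re)
	return nil
}

// parseExcludes parses args and returns all -exclude patterns in order.
func parseExcludes(args []string) (regexList, error) {
	var excludes regexList
	fs := flag.NewFlagSet("sico-sketch", flag.ContinueOnError)
	fs.Var(&excludes, "exclude", "regex for source URLs that may be missing (repeatable)")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	return excludes, nil
}

func main() {
	excludes, err := parseExcludes([]string{"-exclude", `/tags/`, "-exclude", `/page/\d+/`})
	if err != nil {
		panic(err)
	}
	fmt.Println("excludes configured:", len(excludes))
	fmt.Println("patterns:", excludes.String())
}
```

Compiling the pattern inside `Set` means an invalid regex is rejected while the command line is parsed, before any sitemap is fetched.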

An example

The call …

./sico -source "" \
       -new "" \
       -newBaseURL "" \
       -exclude "andygrunwald\\.com/tags/"

… means:

  1. We read the -source sitemap and collect the URLs:

    <urlset xmlns="" xmlns:xhtml="">
  2. We read the -new sitemap and collect the URLs:

    <sitemapindex xmlns="">
    1. If this URL is a SitemapIndex (one sitemap split into multiple sitemaps)
    2. AND `-newBaseURL` is set, replace the base URL (scheme + host) of each sub-sitemap with `-newBaseURL`
    3. This means each sub-sitemap URL keeps its path, but its scheme and host become those of `-newBaseURL`
  3. Loop through all URLs from -source: check whether the URL matches a configured -exclude and should be skipped; if not, check whether it is part of the -new sitemap. If yes, all good; if not, report it in the output (see below)
  4. Result:

    Source Sitemap:
    URLs checked (from source sitemap): 60
    New Sitemap:
    Excludes configured: 1
    URLs skipped because they matched an exclude: 24
    URLs missing from source sitemap in new sitemap: 14
    Missing URLs in the new sitemap:
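The steps above (base URL replacement, exclude matching, presence check) can be sketched roughly as follows. This is an assumption-laden illustration rather than sico's implementation: `rewriteBase`, `compare`, `matchesAny` and the `report` type are hypothetical, the URLs are placeholders, and the counters only approximate the result lines shown above.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
)

// rewriteBase replaces scheme and host of rawURL with those of base and
// keeps the path and query untouched (step 2 in the list above).
func rewriteBase(rawURL, base string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	b, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Scheme = b.Scheme
	u.Host = b.Host
	return u.String(), nil
}

// report is a hypothetical summary of one comparison run.
type report struct {
	Total   int      // number of source URLs looked at
	Skipped int      // source URLs that matched an exclude
	Missing []string // source URLs absent from the new sitemap
}

// matchesAny reports whether u matches at least one exclude pattern.
func matchesAny(u string, excludes []*regexp.Regexp) bool {
	for _, re := range excludes {
		if re.MatchString(u) {
			return true
		}
	}
	return false
}

// compare implements step 3: skip excluded URLs, then check presence.
func compare(source, newURLs, excludePatterns []string) (report, error) {
	excludes := make([]*regexp.Regexp, 0, len(excludePatterns))
	for _, p := range excludePatterns {
		re, err := regexp.Compile(p)
		if err != nil {
			return report{}, fmt.Errorf("invalid exclude %q: %w", p, err)
		}
		excludes = append(excludes, re)
	}
	present := make(map[string]bool, len(newURLs))
	for _, u := range newURLs {
		present[u] = true
	}
	r := report{Total: len(source)}
	for _, u := range source {
		if matchesAny(u, excludes) {
			r.Skipped++
			continue
		}
		if !present[u] {
			r.Missing = append(r.Missing, u)
		}
	}
	return r, nil
}

func main() {
	sub, err := rewriteBase("https://example.com/sitemap-0.xml", "http://localhost:8080")
	if err != nil {
		panic(err)
	}
	fmt.Println("sub-sitemap fetched from:", sub)

	source := []string{"https://example.com/a/", "https://example.com/tags/x/", "https://example.com/b/"}
	r, err := compare(source, []string{"https://example.com/a/"}, []string{`example\.com/tags/`})
	if err != nil {
		panic(err)
	}
	fmt.Printf("total=%d skipped=%d missing=%v\n", r.Total, r.Skipped, r.Missing)
}
```

Rewriting only scheme and host is what makes it possible to compare a production sitemap against a new site that still runs on a local or staging host.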

Production ready?

No, not really. But it does the job.

This tool was put together quickly. It has no focus on reliability, proper error handling, or anything else that belongs in the production-ready category. However, this is (partially) not needed.

This tool only reads data from the web and compares it. It has no write functionality, so it cannot do any damage.

In other words: you can consider it (kind of) production-ready.

