It is common to have several URLs leading to the same page or to similar ones, for instance, desktop and mobile versions. For us, it’s the same page with the same content, but search engines see a bunch of different URLs that each appear to point to a separate “unique” page.
This is why it is important to specify which URL is the canonical one. This way, you can be more confident that users will see the right URL in search results and that Googlebot won’t get lost among duplicate pages. Otherwise, Google will make this choice for you, and the result can cause unwanted problems with crawling and indexation.
If you want to learn more about the factors search engines weigh when picking a canonical URL, you may watch this video by John Mueller.
What Is a Non-Canonical URL in a Sitemap?
Non-canonical URLs in the sitemap mislead search engines. The issue occurs when a URL listed in the sitemap doesn’t match the canonical URL declared for that page. As a result, you are asking robots to index an address that differs from the canonical version. To avoid these problems, always point the canonical tag at the preferred URL and list only that URL in the sitemap. For instance, if the same page can be accessed with and without WWW, tag the version you want search engines to index.
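To make the mismatch concrete, here is a minimal Python sketch that reads the canonical link from a page's HTML and compares it with the URL the page was listed under. The helper names (`CanonicalFinder`, `find_canonical`, `is_non_canonical`) and the example.com URLs are assumptions made for illustration, and the exact string comparison is a simplification: a real audit would probably normalize URLs first.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.canonical = attr["href"]


def find_canonical(page_url, html):
    """Return the absolute canonical URL declared in html, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.canonical is None:
        return None
    # Relative hrefs are resolved against the page's own URL.
    return urljoin(page_url, finder.canonical)


def is_non_canonical(page_url, html):
    """True when the page declares a canonical URL different from its own."""
    canonical = find_canonical(page_url, html)
    return canonical is not None and canonical != page_url


# Hypothetical page that names the WWW version as canonical.
sample_html = (
    '<html><head>'
    '<link rel="canonical" href="https://www.example.com/shoes/">'
    '</head></html>'
)
print(is_non_canonical("https://example.com/shoes/", sample_html))      # True
print(is_non_canonical("https://www.example.com/shoes/", sample_html))  # False
```

A sitemap entry for which `is_non_canonical` returns True is the kind of URL this issue refers to.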
What Triggers This Issue?
Here are the most common causes of canonical issues:
- There are several URLs leading to pages with identical or similar content, and nothing indicates which URL is the default version. This leads to duplicates of the same page: with and without WWW, etc.
- The website can be accessed over both HTTP and HTTPS, so there is a duplicate of each page.
- Your website has mobile and desktop versions on separate URLs.
- You haven’t properly set up redirects. As a result, you’re sending mixed signals to crawlers.
- Your eCommerce website generates different URLs for the same page depending on the filters applied.
Why Is This Important?
If search engines get misleading signals from your XML sitemap, there is a risk that they will rely on it less in the future. Moreover, duplicate content issues may cause SEO problems. For correct indexation, your sitemap should include only canonical URLs. This is how you inform crawlers about the most important pages on the website. Until the canonical issues are fixed, you also won’t be able to properly monitor your website’s traffic, because visits to the same page may be reported under several different URLs.
How To Check It
To understand if you have any issues with canonicalization, type the different versions of your site’s address into the browser: the ones that start with HTTP or HTTPS, with and without WWW. If any of these variations fails to lead to your preferred URL, you’re facing problems with non-canonical pages.
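The manual check above could be scripted roughly as follows. This is only a sketch: `host_variants` and `misdirected_variants` are names invented here, and `resolve` is an assumed callable that returns the final URL a request ends up at after redirects. The example uses a stand-in resolver instead of real network requests.

```python
from urllib.parse import urlsplit, urlunsplit


def host_variants(url):
    """Return the four HTTP/HTTPS x WWW/non-WWW variants of url."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    bare = host[4:] if host.startswith("www.") else host
    variants = []
    for scheme in ("http", "https"):
        for netloc in (bare, "www." + bare):
            variants.append(
                urlunsplit((scheme, netloc, parts.path or "/", parts.query, ""))
            )
    return variants


def misdirected_variants(preferred_url, resolve):
    """List the variants whose final URL, per resolve(), is not the preferred one."""
    return [v for v in host_variants(preferred_url) if resolve(v) != preferred_url]


def fake_resolve(url):
    # Stand-in for a real HTTP client: pretend one variant is not redirected.
    return url if url == "http://example.com/" else "https://www.example.com/"


print(misdirected_variants("https://www.example.com/", fake_resolve))
# ['http://example.com/']
```

In a real check, `resolve` would wrap an HTTP client that follows redirects and reports the final URL. Any variant the function returns is one that does not end up at the preferred address.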
Another option is to crawl all the pages of your website using tools like Screaming Frog, SiteAnalyzer, and others. Some of these programs can find duplicate pages and canonical problems, which spares you from going through each URL manually.
How To Fix the Issue
Your goal is to remove all non-canonical URLs from your XML sitemap. After replacing them with the canonical ones, resubmit your sitemap in Google Search Console.
These guidelines will tell you more about the dos and don’ts of the canonicalization process.
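Removing non-canonical entries from the sitemap can be sketched in Python as shown here. The function name `drop_non_canonical` and the `canonical_of` mapping (listed URL to its canonical URL, for example gathered from a crawl) are assumptions for illustration, as are the example.com URLs. The sketch only produces the cleaned XML; resubmitting it in Google Search Console remains a separate step.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def drop_non_canonical(sitemap_xml, canonical_of):
    """Return sitemap XML without the <url> entries whose <loc> is not canonical.

    canonical_of maps a listed URL to its canonical URL (hypothetical input).
    URLs missing from the mapping are treated as canonical and kept.
    """
    # Keep the sitemap namespace as the default one when serializing.
    ET.register_namespace("", SITEMAP_NS)
    root = ET.fromstring(sitemap_xml)
    for url_el in list(root.findall(f"{{{SITEMAP_NS}}}url")):
        loc = url_el.findtext(f"{{{SITEMAP_NS}}}loc", default="").strip()
        if canonical_of.get(loc, loc) != loc:
            root.remove(url_el)
    return ET.tostring(root, encoding="unicode")


sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/shoes/</loc></url>
  <url><loc>https://www.example.com/shoes/?color=red</loc></url>
</urlset>"""

canonical_of = {
    "https://www.example.com/shoes/?color=red": "https://www.example.com/shoes/",
}

cleaned = drop_non_canonical(sitemap, canonical_of)
print(cleaned)
```

Only the entry for the filtered URL is dropped here, because its canonical URL points elsewhere; the canonical page itself stays in the sitemap.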
What else you can do:
- Create 301 redirects for duplicate URLs. This way, Google will understand which page is the preferred one. You can implement redirects on the web server (Apache or Nginx) or contact your hosting provider’s support.
- Add a rel="canonical" tag (also known as a canonical link) to each page. That is how you indicate the preferred page among its duplicates. A CMS usually has a built-in option or a plugin for this, so you don’t have to tag pages manually one by one. It is a softer alternative to a 301 redirect.
- Use the rel="canonical" HTTP header if you have access to the server. It is useful when you need to set a canonical URL for a non-HTML file, such as a PDF.
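As a rough illustration of the redirect and HTTP header options, the following Python WSGI sketch sends a 301 to the preferred origin for any other scheme or host, and attaches a `Link` header with rel="canonical" to PDF responses. Everything named here (`CANONICAL_ORIGIN`, `app`, `call`, the example.com host) is an assumption for the example. On a real site these rules would normally live in the Apache or Nginx configuration rather than in application code.

```python
CANONICAL_ORIGIN = "https://www.example.com"  # assumed preferred scheme + host


def canonical_link_header(canonical_url):
    """Build the Link header that marks canonical_url as the canonical one."""
    return ("Link", f'<{canonical_url}>; rel="canonical"')


def app(environ, start_response):
    """WSGI sketch: 301 duplicates to the canonical origin, tag PDFs with Link."""
    scheme = environ.get("wsgi.url_scheme", "http")
    host = environ.get("HTTP_HOST", "")
    path = environ.get("PATH_INFO", "/")
    query = environ.get("QUERY_STRING", "")

    if f"{scheme}://{host}" != CANONICAL_ORIGIN:
        # Duplicate scheme/host: send a permanent redirect to the preferred URL.
        target = CANONICAL_ORIGIN + path + ("?" + query if query else "")
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]

    is_pdf = path.endswith(".pdf")
    headers = [("Content-Type", "application/pdf" if is_pdf else "text/html")]
    if is_pdf:
        # A PDF cannot carry a <link> tag, so the canonical goes in a header.
        headers.append(canonical_link_header(CANONICAL_ORIGIN + path))
    start_response("200 OK", headers)
    return [b""]  # body omitted in this sketch


def call(wsgi_app, scheme, host, path, query=""):
    """Invoke a WSGI app with a minimal hand-built environ; return (status, headers)."""
    captured = {}

    def start_response(status, headers):
        captured["status"], captured["headers"] = status, dict(headers)

    environ = {
        "wsgi.url_scheme": scheme,
        "HTTP_HOST": host,
        "PATH_INFO": path,
        "QUERY_STRING": query,
    }
    wsgi_app(environ, start_response)
    return captured["status"], captured["headers"]


print(call(app, "http", "example.com", "/shoes/"))
print(call(app, "https", "www.example.com", "/guide.pdf"))
```

The first call shows a duplicate address being redirected, and the second shows a PDF served from the preferred origin with its canonical URL declared in the response headers.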