What Happens If There are Non-ASCII Characters in a URL?

URL encoding is an integral part of web development. A URL is the address of your web page, so it needs to be formed correctly.

Essentially, you need to know which characters a URL may contain and which it may not. It’s best to stick to standard ASCII characters to avoid transmission issues.

What Does “Non-ASCII Characters in a URL” Mean?

Only a limited set of characters can be used within a URL. They all come from the American Standard Code for Information Interchange (ASCII), and the ones that are always safe to use as-is are digits (0-9), letters (A-Z, a-z), and a few special symbols (-, ., _, ~). Some other ASCII characters, such as /, ?, # and &, are reserved because they have a special meaning inside a URL. Non-ASCII characters are any characters outside the ASCII set, such as accented letters. A URL that contains them in raw form may not be transmitted reliably over the internet, which can lead to an error page or a page that fails to load.

What Triggers This Issue?

When generating a URL, only ASCII symbols should be used. An example of a non-ASCII character is Ñ. A URL can’t contain a raw non-ASCII character, and it can’t contain a literal space either: a space is an ASCII character, but it still has to be encoded. This issue commonly arises when developers misuse symbols or make coding mistakes, whether through lack of knowledge or oversight.

How To Check the Issue

There are third-party tools that automatically check for non-ASCII characters in URLs. For example, you can run an audit in Sitechecker to find all URLs with non-ASCII characters.

The result is a list of all affected URLs, so you can fix them.

If you’re a developer working on code that triggers a non-ASCII warning, there’s a simple procedure you can follow.

To look for non-ASCII characters, try following these steps in any source code editor:

  1. Open the code editor.
  2. Press Ctrl + F to run a Find or Search command.
  3. Enter [^\x00-\x7F]+ in the search box.
  4. Choose “Regular expression” as the search mode. Click Next.
  5. Wait for the results.
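
The same search can also be scripted when there are many URLs to check. Below is a minimal Python sketch that applies the regular expression from step 3 to a list of URLs. The function name find_non_ascii and the example.com URLs are made up for illustration; they are not part of any tool mentioned here.

```python
import re

# Same pattern as in the editor search: one or more characters
# outside the ASCII range 0x00-0x7F.
NON_ASCII = re.compile(r"[^\x00-\x7F]+")


def find_non_ascii(text):
    """Return (start_index, matched_text) for each run of non-ASCII characters."""
    return [(m.start(), m.group()) for m in NON_ASCII.finditer(text)]


# Hypothetical URLs used only to demonstrate the check.
urls = [
    "https://example.com/products/shoes",
    "https://example.com/espa\u00f1a/ni\u00f1o",
]

for url in urls:
    hits = find_non_ascii(url)
    status = "OK" if not hits else f"non-ASCII at {hits}"
    print(f"{url}: {status}")
```

An empty list means the URL contains only ASCII characters. Note that this check does not flag spaces or reserved symbols, which are ASCII but may still need encoding.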

Why Is This Important?

It’s important to detect and replace non-ASCII characters to avoid problems with your URLs and site. Raw non-ASCII characters are risky because web browsers, servers, and search engines may interpret them in different ways. Replacing or encoding them makes URLs more reliably transmissible over the internet.

How To Fix the Issue

Aside from finding and manually replacing the non-ASCII symbols, there’s a specific method you can use to fix the issue: percent-encoding, also referred to as URL encoding. This method converts the characters into a universally accepted ASCII form.

With URL encoding, each byte of the character is replaced by “%” followed by two hexadecimal digits. For example, Ñ becomes %D1 in Windows-1252 and %C3%91 in UTF-8. The result depends on the character encoding, so the same encoding has to be used consistently.
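
As an illustration, percent-encoding can be done with Python’s standard urllib.parse module. This is a minimal sketch, not the only way to do it; the path /españa/niño is a made-up example. The quote function encodes with UTF-8 by default, and a different encoding can be passed explicitly.

```python
from urllib.parse import quote, unquote

# UTF-8 is the default encoding: Ñ takes two bytes, so two %XX groups.
utf8_encoded = quote("\u00d1")
print(utf8_encoded)  # %C3%91

# The same character in Windows-1252 is a single byte.
cp1252_encoded = quote("\u00d1", encoding="cp1252")
print(cp1252_encoded)  # %D1

# Encoding a whole (hypothetical) path; "/" is left untouched by default.
path = quote("/espa\u00f1a/ni\u00f1o")
print(path)  # /espa%C3%B1a/ni%C3%B1o

# A space is ASCII but still has to be encoded.
print(quote("my page"))  # my%20page

# Decoding reverses the process.
print(unquote(utf8_encoded))  # Ñ
```

Only the path or query value should be passed to quote, not the full URL, because characters such as ":" in "https://" would otherwise be encoded as well.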
