URL encoding is an integral part of web development. A URL is the address of your web page, so it's important to form it correctly.
Essentially, you have to understand which characters to include and which to omit when building a URL. It's best to stick to the standard ASCII characters to avoid transmission issues.
What Does “Non-ASCII Characters in a URL” Mean?
There's only a limited set of characters you can use within a URL. They all come from the American Standard Code for Information Interchange (ASCII). The ones that are always safe are digits (0-9), letters (A-Z, a-z), and a few special symbols (-, ., _, ~). Other ASCII symbols, such as /, ?, and &, are reserved for the structure of the URL. Technically, non-ASCII characters are symbols that fall outside the ASCII set. A URL that contains them can't be reliably transmitted over the internet as written. This can result in an error page or a page that doesn't load.
What Triggers This Issue?
When generating a URL, only ASCII characters are allowed. An example of a non-ASCII character is Ñ. A URL can't contain any non-ASCII character, and it can't contain a literal space either, even though the space is part of ASCII. This issue commonly arises when developers misuse symbols or make coding mistakes, whether from a lack of knowledge or from negligence.
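As a rough illustration of the rule above, the following Python sketch lists the characters in a URL that are either non-ASCII or a literal space. The function name find_disallowed and the example.com address are made up for this example, and the check covers only these two cases, not every character that a URL disallows.

```python
def find_disallowed(url):
    """Return the characters in `url` that are non-ASCII or a literal space.

    Hypothetical helper for illustration; it does not validate the URL.
    """
    return [ch for ch in url if ord(ch) > 127 or ch == " "]


# "\u00f1" is the letter ñ, written as an escape so the source stays ASCII.
print(find_disallowed("https://example.com/espa\u00f1a page"))
print(find_disallowed("https://example.com/a-b_c.d~e"))
```

The first call reports the ñ and the space, while the second URL uses only letters and the symbols -, ., _, and ~, so nothing is reported.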
How To Check the Issue
There are third-party tools you can use to automatically check for non-ASCII characters within a URL. For example, you can conduct an audit with Sitechecker to find all URLs with non-ASCII characters.
The result is a list of all affected URLs, so you can fix them.
If you're the developer and you're working on code that triggers a non-ASCII warning, there's a simple procedure you can follow.
To look for non-ASCII characters, try following these steps in any source code editor:
- Open the code editor.
- Press Ctrl + F to run a Find or Search command.
- Enter [^\x00-\x7F]+ in the search box.
- Choose “Regular expression” as the search mode. Click Next.
- Wait for the results.
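The same search can also be run from a script instead of an editor. Here is a minimal sketch in Python that uses the regular expression from the steps above. The function name find_non_ascii and the sample markup are assumptions made for the example.

```python
import re

# Same pattern as in the editor search: one or more characters outside 0x00-0x7F.
NON_ASCII = re.compile(r"[^\x00-\x7F]+")


def find_non_ascii(text):
    """Yield (line_number, column, matched_text) for each non-ASCII run.

    Line numbers and columns are 1-based, like in most editors.
    """
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in NON_ASCII.finditer(line):
            yield line_no, match.start() + 1, match.group()


# Hypothetical sample; the escapes stand for ñ, é, and è.
sample = 'href="/menu/jalape\u00f1o"\nhref="/about"\nhref="/caf\u00e9/cr\u00e8me"'
for hit in find_non_ascii(sample):
    print(hit)
```

Each hit gives the line and column where the non-ASCII text starts, which makes it easier to locate the URL that needs fixing.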
Why Is This Important?
It's important to detect and replace non-ASCII characters to avoid issues with your URL and site. Non-ASCII characters are risky because web browsers and search engines can interpret them in different ways. Replacing them makes URLs more consistently readable and reliably transmissible over the internet.
How To Fix the Issue
Aside from checking and manually replacing the non-ASCII symbols, there's a specific method you can use to fix the issue: percent encoding, also referred to as URL encoding. This method converts the characters into a universally accepted ASCII form.
With URL encoding, each byte of a disallowed character is replaced by “%” followed by two hexadecimal digits. The result depends on the character encoding used. For example, Ñ becomes %D1 in Windows-1252 and %C3%91 in UTF-8.
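To see percent encoding in practice, here is a short Python sketch built on the standard library's urllib.parse.quote and unquote. The wrapper name percent_encode and the sample path are assumptions for this example. quote uses UTF-8 unless told otherwise, and "cp1252" is Python's name for Windows-1252.

```python
from urllib.parse import quote, unquote


def percent_encode(text, encoding="utf-8"):
    """Percent-encode `text`, leaving "/" untouched so paths keep their shape."""
    return quote(text, safe="/", encoding=encoding)


n_tilde = "\u00d1"  # the letter Ñ

print(percent_encode(n_tilde))                     # UTF-8: %C3%91
print(percent_encode(n_tilde, encoding="cp1252"))  # Windows-1252: %D1

# Spaces are encoded too (as %20); "\u00fa" is ú and "\u00ed" is í.
print(percent_encode("/men\u00fa del d\u00eda/"))

# Decoding must use the same character encoding that was used to encode.
print(unquote("%C3%91"))
print(unquote("%D1", encoding="cp1252"))
```

Both decoding calls return Ñ. Decoding %D1 as UTF-8 would not, because the single byte D1 is not a complete UTF-8 sequence.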