sitechecker.pro logo mobile

How to Fix Invalid URL Characters

How to Fix Invalid URL Characters

Free Complete Site Audit

Access a full website audit with over 300 technical insights.

Something went wrong. Please, try again later.
Trusted by
Sitechecker trusted company

Free Website SEO Checker & Audit Tool

  • Scan the site for 300+ technical issues
  • Monitor your site health 24/7
  • Track website rankings in any geo

What Does Invalid URL Mean?

A URL or Uniform Resource Locator is a web address of any resource on the internet. It’s made of a limited set of special characters, specifically from the US-ASCII character set. A valid web address means that the link works and directs a user to the webpage when accessed.

Now, its opposite, an invalid one, can mean various things, including:

  • The finder isn’t allowed to access the specified web page.
  • There’s a mistake when typing the web address or within the web address itself.
  • The site or page you’re trying to access has been redirected to another page or site, but the developers didn’t do it correctly.

What Triggers This Issue?

Invalid web addresses happen because of the following factors or instances:

  • The legal page or site doesn’t exist anymore.
  • There were mistakes in the link when it was entered. Examples include inputting spaces.
  • Invalid characters that are not allowed or accepted in the list of chars were used.
  • It is incomplete and missing some characters.

How to Check the Issue

There are various tools and techniques to check validity. Developers, programmers, and people with enough coding knowledge can use Java or Regex to check.

To manually check if a web address is valid or not, you can try observing and checking the link itself. There should be no invalid characters, and reserved characters must not be used excessively. There should also be no spaces in the web address. Additionally, the link should use https:// or http://, followed by the domain name.

In this section of our Site Audit tool, you’re looking at a crucial aspect of your website’s health: the correctness of your URLs. We’ve highlighted any web addresses that contain non-standard characters or have uppercase characters which might not conform to best practices.

Invalid URL Issues

By clicking on the issue labeled “URL contains non-ASCII characters,” the tool reveals a list of specific pages that have this problem. Each listed URL will show you key details such as the page’s weight, status code, and the date the issue was detected.

URL Contains non ASCII Characters Pages

Optimize Your URLs for Better Search Rankings!

Ensure your URLs are up to SEO standards with our Sitechecker Audit tool.

Something went wrong. Please, try again later.

How to Fix the Issue

Fixing invalid web address characters involves replacing or encoding characters that are not allowed in URLs. Here’s a step-by-step guide on how to handle this issue:

1. Identify Invalid Characters

URLs can only contain certain characters, including letters, numbers, and a few special characters (-, _, ., ~). Other characters need to be encoded.

2. URL Encoding

Web address encoding converts characters into a format that can be transmitted over the internet. For example, spaces are encoded as %20.

3. Using Programming Languages to Encode web address

Different programming languages offer built-in functions to handle link encoding. Below are examples in some common languages:

Python

Use the urllib.parse module.


import urllib.parse

def fix_invalid_url(url):
return urllib.parse.quote(url, safe='/:?=&')

# Example usage
url = "http://example.com/foo bar/baz"
fixed_url = fix_invalid_url(url)
print(fixed_url) # Output: http://example.com/foo%20bar/baz

JavaScript

Use the encodeURIComponent function.


function fixInvalidUrl(url) {
return encodeURIComponent(url);
}

// Example usage
let url = "http://example.com/foo bar/baz";
let fixedUrl = fixInvalidUrl(url);
console.log(fixedUrl); // Output: http%3A%2F%2Fexample.com%2Ffoo%20bar%2Fbaz

PHP

Use the urlencode function.


function fixInvalidUrl($url) {
return urlencode($url);
}

// Example usage
$url = "http://example.com/foo bar/baz";
$fixedUrl = fixInvalidUrl($url);
echo $fixedUrl; // Output: http%3A%2F%2Fexample.com%2Ffoo+bar%2Fbaz

4. Handling Specific Characters

Sometimes, you may want to preserve certain characters like : or / while encoding the rest. Adjust the encoding functions accordingly.

Python (keeping /: characters unencoded)


import urllib.parse

def fix_invalid_url(url):
return urllib.parse.quote(url, safe='/:?=&')

# Example usage
url = "http://example.com/foo bar/baz"
fixed_url = fix_invalid_url(url)
print(fixed_url) # Output: http://example.com/foo%20bar/baz

5. Regular Expressions

You can also use regular expressions to identify and replace invalid characters if needed.

Python


import re

def fix_invalid_url(url):
return re.sub(r'[^a-zA-Z0-9\-_.~:/?=&]', lambda x: '%{:02X}'.format(ord(x.group())), url)
# Example usage
url = "http://example.com/foo bar/baz"
fixed_url = fix_invalid_url(url)
print(fixed_url) # Output: http://example.com/foo%20bar/baz

Conclusion

To fix invalid web address characters, it is crucial to use link encoding functions provided by your programming language. These functions ensure that all characters are properly encoded and that your web addresses are valid for use on the web.

FAQ
Quote encodes special characters into percent-encoded format but leaves spaces as %20, while quote_plus replaces spaces with + and encodes other special characters.
Yes, link encoding can be reversed using link decoding functions provided by programming languages, such as urllib.parse.unquote in Python or decodeURIComponent in JavaScript.
Common issues include broken links, incorrect routing, security vulnerabilities, and data corruption during transmission.
Fast Links

You may also like

View More Posts
How to fix URLs where h1 tag and title tag are the same
Site Audit Issues
How to fix URLs where h1 tag and title tag are the same
Ivan Palii
Oct 31, 2022
Why using a canonical URL to point to a canonicalized page occurs and how to fix this indexing bug?
Site Audit Issues
Why using a canonical URL to point to a canonicalized page occurs and how to fix this indexing bug?
Ivan Palii
Oct 28, 2022
Not Found (4XX) URL in XML Sitemaps
Site Audit Issues
Not Found (4XX) URL in XML Sitemaps
Ivan Palii
Oct 9, 2024

So, soon? Well, before you go…

Get instant on-page SEO analysis of your home page

  • Detect broken links
  • Detect issues with content optimization
  • Check PageSpeed for mobile and desktop
Something went wrong. Please, try again later.
You’ll get the report in 2 seconds without required signup
exit-popup-image
close