How to Create a Unicode-Aware Slugify Function in PHP
Introduction
Creating clean, SEO-friendly URLs often involves transforming a title or other string into a "slug": a lowercase string with words separated by dashes and no special characters. PHP developers typically write a slugify() function for this, but many forget to account for non-ASCII characters such as à, ä, ă, ö, and ß.
The Problem with Basic Slugify Functions
Here’s a common slugify function found in many PHP projects:
function slugify($title) {
    $slug = strtolower($title);
    $slug = str_replace(' ', '-', $slug);
    $slug = preg_replace('/[^a-z0-9-]/', '', $slug);
    $slug = preg_replace('/-+/', '-', $slug);
    $slug = trim($slug, '-');
    return $slug;
}
While this works for basic English titles, it fails to convert special characters like à or ß into their ASCII equivalents. Instead, it removes them entirely, so a title like café becomes caf instead of the correct cafe.
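To make the failure concrete, the sketch below runs the basic function on two accented titles. The name slugify_basic is used here only for illustration. Because the character class in the preg_replace() call allows only ASCII letters, digits, and dashes, every byte of an accented character is dropped.

```php
<?php
// The basic function from above, under a hypothetical name.
function slugify_basic(string $title): string
{
    $slug = strtolower($title);
    $slug = str_replace(' ', '-', $slug);
    // Multi-byte UTF-8 characters are not in this class, so they vanish here.
    $slug = preg_replace('/[^a-z0-9-]/', '', $slug);
    $slug = preg_replace('/-+/', '-', $slug);
    return trim($slug, '-');
}

echo slugify_basic('Café au lait'), "\n"; // caf-au-lait
echo slugify_basic('straße'), "\n";       // strae
```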
Improved Unicode-Aware Slugify Function
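To handle special characters properly, you can use PHP’s iconv() function with the //TRANSLIT suffix. It converts UTF-8 characters into ASCII equivalents where possible. The exact output depends on the iconv implementation PHP is built against and, on some systems, on the active locale.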
Here’s the improved slugify function:
function slugify($title) {
    // Transliterate UTF-8 characters to their closest ASCII equivalents
    $slug = iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $title);
    // Convert to lowercase
    $slug = strtolower($slug);
    // Replace spaces with dashes
    $slug = str_replace(' ', '-', $slug);
    // Remove all characters that are not alphanumeric or dashes
    $slug = preg_replace('/[^a-z0-9-]/', '', $slug);
    // Replace multiple dashes with a single dash
    $slug = preg_replace('/-+/', '-', $slug);
    // Trim dashes from beginning and end
    $slug = trim($slug, '-');
    return $slug;
}
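One thing the function does not handle is failure: iconv() returns false when the conversion cannot be performed, and a title made up only of symbols produces an empty slug. The following is a minimal sketch of a more defensive variant, not a definitive implementation. The name slugify_safe, the $fallback parameter, and its default value 'n-a' are assumptions made for this example.

```php
<?php
// Hypothetical variant: same steps as slugify(), plus handling for the
// cases where iconv() fails or the resulting slug is empty.
function slugify_safe(string $title, string $fallback = 'n-a'): string
{
    // iconv() returns false on failure; @ silences the notice it may raise.
    $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $title);
    if ($ascii === false) {
        // Assumption: continuing with the raw title is acceptable. Its
        // non-ASCII bytes are stripped by the regex further down.
        $ascii = $title;
    }

    $slug = strtolower($ascii);
    $slug = str_replace(' ', '-', $slug);
    $slug = preg_replace('/[^a-z0-9-]/', '', $slug);
    $slug = preg_replace('/-+/', '-', $slug);
    $slug = trim($slug, '-');

    // Never return an empty slug.
    return $slug !== '' ? $slug : $fallback;
}

echo slugify_safe('Hello World'), "\n"; // hello-world
echo slugify_safe('!!!'), "\n";         // n-a
```

Whether a fixed fallback string is appropriate depends on the application. Some projects may prefer to fall back to a record ID or to raise an error instead.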
Examples
à la carte → a-la-carte
Crème brûlée → creme-brulee
straße → strasse
Înălțime → inaltime
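Since transliteration results can differ between iconv implementations, it may be worth checking what your own platform produces for these titles. The sketch below prints the raw iconv() output before any characters are stripped. The helper name transliteration_report is hypothetical, and no expected output is shown because it varies by system.

```php
<?php
// Hypothetical helper: returns the raw iconv() output for each title so the
// transliteration step can be inspected on its own.
function transliteration_report(array $titles): array
{
    $report = [];
    foreach ($titles as $title) {
        // The value is a string on success, or false if iconv() fails.
        $report[$title] = @iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $title);
    }
    return $report;
}

var_dump(transliteration_report(['à la carte', 'Crème brûlée', 'straße', 'Înălțime']));
```

If a value contains question marks or comes back as false, the final slug will lose those characters just as it did with the basic function.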
Server Requirements
Make sure the iconv extension is enabled in your PHP environment. It’s usually enabled by default. You can check with:
php -m | grep iconv
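The same check can be made from inside PHP, which is useful when you have no shell access. This is a small sketch using extension_loaded() and the ICONV_IMPL and ICONV_VERSION constants that the extension defines. The helper name iconv_status is an assumption for this example.

```php
<?php
// Hypothetical helper that reports whether iconv is usable at runtime.
function iconv_status(): array
{
    $loaded = extension_loaded('iconv');
    return [
        'loaded' => $loaded,
        // Both constants are defined by the iconv extension when it is loaded.
        'implementation' => $loaded && defined('ICONV_IMPL') ? ICONV_IMPL : null,
        'version' => $loaded && defined('ICONV_VERSION') ? ICONV_VERSION : null,
    ];
}

print_r(iconv_status());
```

The implementation name is worth noting because it indicates which iconv library is performing the transliteration.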
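By using iconv() in your slugify function, you get slugs that keep accented Latin letters as plain ASCII instead of losing them, which covers a wide range of European languages. Characters with no ASCII equivalent, such as those in many non-Latin scripts, may still be dropped. This helps improve SEO, readability, and the overall reliability of your URLs.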
Feel free to use and share this function in your PHP projects.