Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/5305879/

Date: 2020-08-25 20:56:29  Source: igfitidea

Generate SEO friendly URLs (slugs)

php string seo friendly-url slug

Asked by GG.

Definition

From Wikipedia:

A slug is the part of a URL which identifies a page using human-readable keywords.

To make the URL easier for users to type, special characters are often removed or replaced as well. For instance, accented characters are usually replaced by letters from the English alphabet; punctuation marks are generally removed; and spaces (which have to be encoded as %20 or +) are replaced by dashes (-) or underscores (_), which are more aesthetically pleasing.

Context

I developed a photo-sharing website on which users can upload, share and view photos.

All pages are generated automatically, and I have no control over the titles. Because the title of a photo or the name of a user may contain accented characters or spaces, I needed a function to automatically create slugs and keep the URLs readable.

I created the following function, which replaces accented characters (âèêëçî), removes punctuation and bad characters (#@&~^!), and converts spaces to dashes.

Questions

  • What do you think about this function?
  • Do you know any other functions to create slugs?

Code

php:

function sluggable($str) {

    $before = array(
        'àáâãäåòóôõöøèéêëðçìíîïùúûüñšž',
        '/[^a-z0-9\s]/',
        array('/\s/', '/--+/', '/---+/')
    );

    $after = array(
        'aaaaaaooooooeeeeeciiiiuuuunsz',
        '',
        '-'
    );

    $str = strtolower($str);
    $str = strtr($str, $before[0], $after[0]);
    $str = preg_replace($before[1], $after[1], $str);
    $str = trim($str);
    $str = preg_replace($before[2], $after[2], $str);

    return $str;
}
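
One caveat about the function above: with two string arguments, strtr() translates byte by byte, so if the source file and the input are UTF-8 (an assumption about the setup), each accented character takes two bytes and the two lookup strings no longer line up. The following is a minimal sketch of the same steps using the array form of strtr() and mb_strtolower() (which needs the mbstring extension). The name sluggable_utf8 and the character map are illustrative and not part of the question.

```php
<?php
// Hypothetical UTF-8-safe variant of the question's sluggable().
function sluggable_utf8(string $str): string
{
    // The array form of strtr() replaces whole substrings, so each
    // multi-byte character is matched as a unit. Illustrative map only.
    $map = [
        'à' => 'a', 'á' => 'a', 'â' => 'a', 'ã' => 'a', 'ä' => 'a', 'å' => 'a',
        'ò' => 'o', 'ó' => 'o', 'ô' => 'o', 'õ' => 'o', 'ö' => 'o', 'ø' => 'o',
        'è' => 'e', 'é' => 'e', 'ê' => 'e', 'ë' => 'e',
        'ç' => 'c',
        'ì' => 'i', 'í' => 'i', 'î' => 'i', 'ï' => 'i',
        'ù' => 'u', 'ú' => 'u', 'û' => 'u', 'ü' => 'u',
        'ñ' => 'n', 'š' => 's', 'ž' => 'z',
    ];

    $str = mb_strtolower($str, 'UTF-8');             // lowercase accented letters too
    $str = strtr($str, $map);                        // accented -> plain ASCII
    $str = preg_replace('/[^a-z0-9\s]/', '', $str);  // drop punctuation and leftovers
    $str = trim($str);
    $str = preg_replace('/\s+/', '-', $str);         // whitespace runs -> one dash

    return $str;
}

echo sluggable_utf8('Été à la plage !'), "\n"; // ete-a-la-plage
```

Collapsing whitespace runs in a single step replaces the three separate patterns in the original; that is a simplification of this sketch, not a change the question asked for.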

Accepted answer by AlfaTeK

It seems OK, though it may be incomplete. Check http://code.google.com/p/php-slugs/ for a code example.

Answer by Natxet

I like the php-slugs solution on Google Code. But if you want a simpler one that works with UTF-8:

function format_uri( $string, $separator = '-' )
{
    $accents_regex = '~&([a-z]{1,2})(?:acute|cedil|circ|grave|lig|orn|ring|slash|th|tilde|uml);~i';
    $special_cases = array( '&' => 'and', "'" => '');
    $string = mb_strtolower( trim( $string ), 'UTF-8' );
    $string = str_replace( array_keys($special_cases), array_values( $special_cases), $string );
    $string = preg_replace( $accents_regex, '$1', htmlentities( $string, ENT_QUOTES, 'UTF-8' ) );
    $string = preg_replace("/[^a-z0-9]/u", "$separator", $string);
    $string = preg_replace("/[$separator]+/u", "$separator", $string);
    return $string;
}

So

echo format_uri("#@&~^!âèêëçî");

outputs

-and-aeeeci

Please comment if you find any errors.

Answer by rybo111

A few people have linked to "php-slugs" on google.com, but it looks like their page is a little screwy now, so here it is if anyone needs it:

// source: https://code.google.com/archive/p/php-slugs/

function my_str_split($string)
{
    $sArray=array();
    $slen=strlen($string);
    for($i=0; $i<$slen; $i++)
    {
        // square brackets: the {} string-offset syntax was removed in PHP 8
        $sArray[$i]=$string[$i];
    }
    return $sArray;
}

function noDiacritics($string)
{
    // Cyrillic transliteration
    $cyrylicFrom = array('А', 'Б', 'В', 'Г', 'Д', 'Е', 'Ё', 'Ж', 'З', 'И', 'Й', 'К', 'Л', 'М', 'Н', 'О', 'П', 'Р', 'С', 'Т', 'У', 'Ф', 'Х', 'Ц', 'Ч', 'Ш', 'Щ', 'Ъ', 'Ы', 'Ь', 'Э', 'Ю', 'Я', 'а', 'б', 'в', 'г', 'д', 'е', 'ё', 'ж', 'з', 'и', 'й', 'к', 'л', 'м', 'н', 'о', 'п', 'р', 'с', 'т', 'у', 'ф', 'х', 'ц', 'ч', 'ш', 'щ', 'ъ', 'ы', 'ь', 'э', 'ю', 'я');
    $cyrylicTo   = array('A', 'B', 'W', 'G', 'D', 'Ie', 'Io', 'Z', 'Z', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'R', 'S', 'T', 'U', 'F', 'Ch', 'C', 'Tch', 'Sh', 'Shtch', '', 'Y', '', 'E', 'Iu', 'Ia', 'a', 'b', 'w', 'g', 'd', 'ie', 'io', 'z', 'z', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'r', 's', 't', 'u', 'f', 'ch', 'c', 'tch', 'sh', 'shtch', '', 'y', '', 'e', 'iu', 'ia'); 


    $from = array("á", "à", "?", "?", "?", "ā", "?", "?", "?", "?", "?", "?", "?", "?", "?", "?", "?", "D", "é", "è", "?", "ê", "?", "ě", "ē", "?", "?", "?", "?", "?", "?", "á", "à", "a", "?", "?", "ā", "?", "?", "?", "?", "?", "?", "?", "?", "?", "?", "?", "e", "é", "è", "?", "ê", "?", "ě", "ē", "?", "?", "?", "?", "?", "?", "?", "?", "I", "í", "ì", "?", "?", "?", "ī", "?", "?", "?", "?", "?", "?", "?", "?", "?", "?", "ó", "ò", "?", "?", "?", "?", "?", "?", "?", "?", "?", "?", "í", "ì", "i", "?", "?", "ī", "?", "?", "?", "?", "?", "?", "ń", "ň", "?", "?", "ó", "ò", "?", "?", "?", "?", "?", "?", "?", "?", "?", "?", "?", "?", "?", "?", "?", "T", "ú", "ù", "?", "ü", "?", "ū", "?", "?", "?", "?", "?", "Y", "?", "?", "?", "?", "?", "?", "?", "?", "?", "?", "?", "?", "?", "?", "t", "ú", "ù", "?", "ü", "?", "ū", "?", "?", "?", "?", "?", "y", "?", "?", "?", "?", "?");
    $to   = array("A", "A", "A", "AE", "A", "A", "A", "A", "A", "AE", "C", "C", "C", "C", "C", "D", "D", "D", "E", "E", "E", "E", "E", "E", "E", "E", "G", "G", "G", "G", "G", "a", "a", "a", "ae", "ae", "a", "a", "a", "a", "ae", "c", "c", "c", "c", "c", "d", "d", "d", "e", "e", "e", "e", "e", "e", "e", "e", "g", "g", "g", "g", "g", "H", "H", "I", "I", "I", "I", "I", "I", "I", "I", "IJ", "J", "K", "L", "L", "N", "N", "N", "N", "O", "O", "O", "OE", "O", "O", "O", "O", "CE", "h", "h", "i", "i", "i", "i", "i", "i", "i", "i", "ij", "j", "k", "l", "l", "n", "n", "n", "n", "o", "o", "o", "oe", "o", "o", "o", "o", "o", "R", "R", "S", "S", "S", "S", "T", "T", "T", "U", "U", "U", "UE", "U", "U", "U", "U", "U", "U", "W", "Y", "Y", "Y", "Z", "Z", "Z", "r", "r", "s", "s", "s", "s", "ss", "t", "t", "b", "u", "u", "u", "ue", "u", "u", "u", "u", "u", "u", "w", "y", "y", "y", "z", "z", "z");


    $from = array_merge($from, $cyrylicFrom);
    $to   = array_merge($to, $cyrylicTo);

    $newstring=str_replace($from, $to, $string);
    return $newstring;
}

function makeSlugs($string, $maxlen=0)
{
    $newStringTab=array();
    $string=strtolower(noDiacritics($string));
    if(function_exists('str_split'))
    {
        $stringTab=str_split($string);
    }
    else
    {
        $stringTab=my_str_split($string);
    }

    $numbers=array("0","1","2","3","4","5","6","7","8","9","-");
    //$numbers=array("0","1","2","3","4","5","6","7","8","9");

    foreach($stringTab as $letter)
    {
        if(in_array($letter, range("a", "z")) || in_array($letter, $numbers))
        {
            $newStringTab[]=$letter;
        }
        elseif($letter==" ")
        {
            $newStringTab[]="-";
        }
    }

    if(count($newStringTab))
    {
        $newString=implode($newStringTab);
        if($maxlen>0)
        {
            $newString=substr($newString, 0, $maxlen);
        }

        $newString = removeDuplicates('--', '-', $newString);
    }
    else
    {
        $newString='';
    }

    return $newString;
}


function checkSlug($sSlug)
{
    if(preg_match("/^[a-zA-Z0-9]+[a-zA-Z0-9\-]*$/", $sSlug) == 1)
    {
        return true;
    }

    return false;
}

function removeDuplicates($sSearch, $sReplace, $sSubject)
{
    $i=0;
    do{

        $sSubject=str_replace($sSearch, $sReplace, $sSubject);
        $pos=strpos($sSubject, $sSearch);

        $i++;
        if($i>100)
        {
            die('removeDuplicates() loop error');
        }

    }while($pos!==false);

    return $sSubject;
}

Answer by Juscelino Iene

setlocale(LC_ALL, 'en_US.UTF8');

function slugify($text)
{
    // replace non letter or digits by -
    $text = preg_replace('~[^\pL\d]+~u', '-', $text);

    // trim
    $text = trim($text, '-');

    // transliterate
    $text = iconv('utf-8', 'us-ascii//TRANSLIT', $text);

    // lowercase
    $text = strtolower($text);

    // remove unwanted characters
    $text = preg_replace('~[^-\w]+~', '', $text);

    if (empty($text))
    {
        return 'n-a';
    }

    return $text;
}


$slug = slugify($var);

Answer by Amit Rajput

This works fine and returns a clean URL slug.

$string = '(1234) S*m@#ith S)&+*t `E}{xam)ple?>land   - - 1!_2)#3)(*4""5';

// remove all non alphanumeric characters except spaces
$clean =  preg_replace('/[^a-zA-Z0-9\s]/', '', strtolower($string)); 

// replace one or multiple spaces into single dash (-)
$clean =  preg_replace('!\s+!', '-', $clean); 

echo $clean; // 1234-smith-st-exampleland-12345

Answer by rsplak

I found this on the net. It does exactly what you want, but preserves case.

function sluggable($p) {
    $ts = array("/[À-Å]/","/Æ/","/Ç/","/[È-Ë]/","/[Ì-Ï]/","/Ð/","/Ñ/","/[Ò-ÖØ]/","/×/","/[Ù-Ü]/","/[Ý-ß]/","/[à-å]/","/æ/","/ç/","/[è-ë]/","/[ì-ï]/","/ð/","/ñ/","/[ò-öø]/","/÷/","/[ù-ü]/","/[ý-ÿ]/");
    $tn = array("A","AE","C","E","I","D","N","O","X","U","Y","a","ae","c","e","i","d","n","o","x","u","y");
    return preg_replace($ts,$tn, $p);
}

source

Answer by 0b10011

This is the class we use. While it can perform individual operations, it can also turn strings (or paths) into slugs (only a-z, 0-9, and - appear in the final output). It also does a couple of extra things, such as converting ampersands (&) to the word "and".

Usage:

echo (new Str('My Cover Letter & Résumé'))->slugify()->__toString();

my-cover-letter-and-resume

Str class:

<?php

use RuntimeException;
use Transliterator;

class Str
{
    /**
     * Will hold an instance of Transliterator
     * for removing accents from characters.
     * Same instance for all instances of this class is fine.
     */
    private static $accent_transliterator;
    private $string;

    public function __construct(string $string)
    {
        $this->string = $string;
    }

    public function __toString()
    {
        return $this->string;
    }

    public function cleanForUrlPath(): self
    {
        $path = '';

        // Loop through path sections (separated by `/`)
        // and slugify each section.
        foreach (explode('/', $this->string) as $section) {
            $section = (new static($section))->slugify()->__toString();
            if ($section !== '') {
                $path .= "/$section";
            }
        }

        // Save the cleaned path
        $this->string = "$path/";

        return $this;
    }

    public function cleanUpSlugDashes(): self
    {
        // Remove extra dashes
        $this->string = preg_replace('/--+/', '-', $this->string);

        // Remove leading and trailing dashes
        $this->string = trim($this->string, '-');

        return $this;
    }

    /**
     * Replace symbols with word replacements.
     * Eg, `&` becomes ` and `.
     */
    public function convertSymbolsToWords(): self
    {
        $this->string = strtr($this->string, [
            '@' => ' at ',
            '%' => ' percent ',
            '&' => ' and ',
        ]);

        return $this;
    }

    public static function getSpacerCharacters(
        array $with = [],
        array $without = []
    ): array {
        return array_unique(array_diff(array_merge([
            ' ', // space
            '…', // ellipsis
            '–', // en dash
            '—', // em dash
            '/', // slash
            '\\', // backslash
            ':', // colon
            ';', // semi-colon
            '.', // period
            '+', // plus sign
            '#', // pound sign
            '~', // tilde
            '_', // underscore
            '|', // pipe
        ], array_values($with)), array_values($without)));
    }

    public function lower(): self
    {
        $this->string = strtolower($this->string);

        return $this;
    }

    /**
     * Replaces all accented characters
     * with similar ASCII characters.
     */
    public function removeAccents(): self
    {
        // If no accented characters are found,
        // return the given string as-is.
        if (!preg_match('/[\x80-\xff]/', $this->string)) {
            return $this;
        }

        // Instantiate Transliterator if we haven't already
        if (!isset(self::$accent_transliterator)) {
            self::$accent_transliterator = Transliterator::create(
                'Any-Latin; Latin-ASCII;'
            );

            if (self::$accent_transliterator === null) {
                // @codeCoverageIgnoreStart
                throw new RuntimeException(
                    'Could not create a transliterator'
                );
                // @codeCoverageIgnoreEnd
            }
        }

        // Save transliterated string
        $this->string = (self::$accent_transliterator)->transliterate(
            $this->string
        );

        return $this;
    }

    public function replace($search, $replace)
    {
        $this->string = str_replace($search, $replace, $this->string);

        return $this;
    }

    public function replaceRegex($pattern, $replacement): self
    {
        $this->string = preg_replace($pattern, $replacement, $this->string);

        return $this;
    }

    /**
     * @param int $length number of bytes to shorten the string to
     */
    public function shorten(int $length): self
    {
        // If the string is already `$length` or shorter,
        // return it as-is.
        if (strlen($this->string) <= $length) {
            return $this;
        }

        // Shorten by 2 additional characters
        // to account for the three periods that are appended.
        // Only need to shorten by 2
        // as there's always at least one character (space) removed
        // when the last word is popped off of the array.
        $length -= 2;

        // Shorten the string to `$length` and split into words
        $words = explode(' ', substr($this->string, 0, $length));

        // Discard the last word as it's a partial word,
        // or empty if the last character happened to be a space.
        // If there's only one word,
        // then it was longer than `$length`
        // and the truncated version should be returned.
        if (count($words) > 1) {
            array_pop($words);
        }

        // Save the shortened string with "..." appended
        $this->string = rtrim(implode(' ', $words), ':').'...';

        return $this;
    }

    public function slugify(): self
    {
        // If the string is already a slug
        if (preg_match('/^[a-z0-9\-]+$/', $this->string)) {
            return $this;
        }

        // - Normalize accents
        // - Normalize symbols
        // - Lowercase
        // - Replace space characters with dashes
        // - Remove non-slug characters
        // - Clean up leading, trailing, and consecutive dashes
        return $this
            ->removeAccents()
            ->convertSymbolsToWords()
            ->lower()
            ->spacersToDashes()
            ->replaceRegex('/([^a-z0-9\-]+)/', '')
            ->cleanUpSlugDashes();
    }

    public function spacersToDashes(): self
    {
        return $this->replace(static::getSpacerCharacters(), '-');
    }
}

Answer by Yan Bourgeois

function remove_accents($string)
{
    $a = 'ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûýýþÿŔŕ';
    $b = 'aaaaaaaceeeeiiiidnoooooouuuuybsaaaaaaaceeeeiiiidnoooooouuuyybyRr';
    $string = strtr(utf8_decode($string), utf8_decode($a), $b);
    return utf8_encode($string);
}

function format_slug($title)
{
    $title = remove_accents($title);
    $title = trim(strtolower($title));
    $title = preg_replace('#[^a-z0-9\-/]#i', '_', $title);
    return trim(preg_replace('/-+/', '-', $title), '-/');
}

Usage: echo format_slug($var);

Answer by Mahesh Dobhal

A slug should be an SEO-friendly permalink. Keep the URL short and SEO-friendly by using keywords. You can leave prepositions out of post permalinks, and don't repeat any keyword.
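
This advice can be sketched as a post-processing step applied to a slug that already exists. It is only an illustration: the function name shorten_slug, the small English stop-word list, and the five-word default are assumptions, not something the answer specifies.

```php
<?php
// Hypothetical helper: trims an already-generated, dash-separated slug.
function shorten_slug(string $slug, int $maxWords = 5): string
{
    // Illustrative stop-word list (articles and prepositions); extend as needed.
    $stopWords = ['a', 'an', 'the', 'of', 'in', 'on', 'at', 'to', 'for', 'with', 'by', 'from'];

    $words = array_filter(explode('-', $slug), 'strlen'); // drop empty parts
    $words = array_diff($words, $stopWords);              // leave out prepositions
    $words = array_unique($words);                        // do not repeat keywords
    $words = array_slice($words, 0, $maxWords);           // keep the URL short

    return implode('-', $words);
}

echo shorten_slug('photos-of-the-sunset-at-the-beach-sunset-photos'), "\n";
// photos-sunset-beach
```

If every word is a stop word, the result is an empty string, so a caller would probably want a fallback in that case.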

Answer by XpertSpot

function seourl($phrase, $maxLength = 100000000000000) {
    $result = strtolower($phrase);

    $result = preg_replace("~[^A-Za-z0-9-\s]~", "", $result);
    $result = trim(preg_replace("~[\s-]+~", " ", $result));
    $result = trim(substr($result, 0, $maxLength));
    $result = preg_replace("~\s~", "-", $result);

    return $result;
}
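
To illustrate the $maxLength parameter, here is a sketch that repeats the function and calls it with a made-up phrase. Two things are assumptions of this sketch rather than part of the answer: the hyphen in the first character class is escaped for clarity, and the default limit is PHP_INT_MAX.

```php
<?php
// Copy of the answer's function with the two small changes noted above.
function seourl($phrase, $maxLength = PHP_INT_MAX) {
    $result = strtolower($phrase);

    // Keep only letters, digits, dashes and whitespace.
    $result = preg_replace("~[^A-Za-z0-9\-\s]~", "", $result);
    // Collapse runs of whitespace and dashes into a single space.
    $result = trim(preg_replace("~[\s-]+~", " ", $result));
    // Truncate BEFORE spaces become dashes.
    $result = trim(substr($result, 0, $maxLength));
    $result = preg_replace("~\s~", "-", $result);

    return $result;
}

$phrase = 'Hello, World! This is -- a test';

echo seourl($phrase), "\n";     // hello-world-this-is-a-test
echo seourl($phrase, 11), "\n"; // hello-world
echo seourl($phrase, 8), "\n";  // hello-wo
```

The last call shows that the limit counts bytes, not words, so a short limit can cut a word in half.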