PHP Transliteration
Notice: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/1284535/
Asked by troelskn
Are there any solutions that will convert all foreign characters to A-z equivalents? I have searched extensively on Google and could not find a solution or even a list of characters and equivalents. The reason is I want to display A-z only URLs, plus plenty of other trip ups when dealing with these characters.
Answer by troelskn
You can use iconv, which has a special transliteration encoding.
When the string "//TRANSLIT" is appended to tocode, transliteration is activated. This means that when a character cannot be represented in the target character set, it can be approximated through one or several characters that look similar to the original character.
-- http://www.gnu.org/software/libiconv/documentation/libiconv/iconv_open.3.html
See here for a complete example that matches your use case.
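As a rough sketch of the iconv approach described above (not the linked example, which is not reproduced on this page): `ascii_slug` is a hypothetical helper name, and the fallback branch is an assumption for iconv builds that reject the //TRANSLIT suffix. The exact replacements for non-ASCII characters depend on the iconv implementation and the locale.

```php
<?php
// Hypothetical helper (not from the original answer): builds an ASCII-only URL slug.
function ascii_slug(string $text): string
{
    // //TRANSLIT asks iconv to approximate characters that ASCII cannot represent.
    $ascii = iconv('UTF-8', 'ASCII//TRANSLIT', $text);
    if ($ascii === false) {
        // Conversion failed (invalid input or unsupported suffix): drop non-ASCII bytes.
        $ascii = preg_replace('/[^\x00-\x7F]/', '', $text);
    }

    // Collapse every run of characters outside a-z and 0-9 into a single hyphen.
    $slug = preg_replace('/[^a-z0-9]+/', '-', strtolower($ascii));
    return trim($slug, '-');
}

echo ascii_slug('Hello World!'), "\n"; // hello-world
```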
Answer by Shane O'Grady
If you are using iconv, make sure your locale is set correctly before you try the transliteration; otherwise some characters will not be transliterated correctly.
setlocale(LC_CTYPE, 'en_US.UTF8');
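Locale names differ between systems, so it can help to check whether the call succeeded. The sketch below passes several candidate names (all of them assumptions; check what is installed on your machine) and inspects the return value of setlocale().

```php
<?php
// Candidate locale names are assumptions; availability differs per system
// (on Linux, `locale -a` lists the installed ones).
$candidates = ['en_US.UTF-8', 'en_US.UTF8', 'en_US.utf8', 'C.UTF-8'];

// setlocale() tries each name in order and returns the one it applied,
// or false when none of them is available.
$applied = setlocale(LC_CTYPE, $candidates);

if ($applied === false) {
    echo "No UTF-8 locale available; iconv //TRANSLIT results may be degraded.\n";
} else {
    echo "LC_CTYPE set to {$applied}\n";
}
```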
Answer by Ilyich
This will convert as many foreign characters as possible (including Cyrillic, Chinese, Arabic, etc.) to their A-z equivalents:
$AzString = transliterator_transliterate('Any-Latin;Latin-ASCII;', $foreignString);
You might need to install the PHP Intl extension first.
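A minimal sketch of wrapping that call with error handling, assuming the Intl extension may or may not be loaded. `to_ascii` is a hypothetical helper name, not part of PHP or of the original answer.

```php
<?php
// Hypothetical wrapper around the Intl call shown in the answer.
function to_ascii(string $text): string
{
    if (!function_exists('transliterator_transliterate')) {
        throw new RuntimeException('The intl extension is not loaded.');
    }

    // Any-Latin converts other scripts to Latin letters; Latin-ASCII then
    // reduces accented and special Latin letters to plain ASCII.
    $ascii = transliterator_transliterate('Any-Latin; Latin-ASCII', $text);
    if ($ascii === false) {
        throw new RuntimeException('Transliteration failed.');
    }
    return $ascii;
}

// Only run the demo when the extension is present.
if (extension_loaded('intl')) {
    echo to_ascii('Ærøskøbing, Москва'), "\n";
}
```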
Answer by Kemal Dağ
If you are stuck with a development and release environment that doesn't support PHP 5.4 or newer, you should use either iconv or a custom transliteration library.
As for iconv, I find it extremely unhelpful, especially when using it on Arabic or Cyrillic alphabets. I would go for the Transliterator class built into PHP 5.4 (part of the Intl extension) or a custom transliteration class.
One of the solutions posted mentioned a custom library, which I did not test.
When I was using Drupal, I loved its transliteration module so much that I recently ported it to make it usable without Drupal.
You can download it here and use it as follows:
<?php
include "JTransliteration.php";
$mombojombotext = "誓曰:『時日害喪?予及女偕亡。』民欲與之偕亡,雖有";
$nonmombojombotex = JTransliteration::transliterate($mombojombotext);
echo $nonmombojombotex;
?>
Answer by Johnny Broadway
Note: I'm reposting this from another similar question in the hope that it's helpful to others.
I ended up writing a PHP library based on URLify.js from the Django project, since I found iconv() to be too incomplete. You can find it here:
https://github.com/jbroadway/urlify
Handles Latin characters as well as Greek, Turkish, Russian, Ukrainian, Czech, Polish, and Latvian.
Answer by Johnny Broadway
For Composer users, there is slugify:
https://github.com/cocur/slugify
<?php
// Assumes the library was installed with Composer (composer require cocur/slugify)
require __DIR__ . '/vendor/autoload.php';

use Cocur\Slugify\Slugify;

$slugify = new Slugify();
echo $slugify->slugify('Hello World!'); // hello-world
// You can also change the separator used by Slugify:
echo $slugify->slugify('Hello World!', '_'); // hello_world
// The library also contains Cocur\Slugify\SlugifyInterface. Use this interface whenever you need to type hint an instance of Slugify.
// To add additional transliteration rules you can use the addRule() method.
$slugify->addRule('i', 'ey');
echo $slugify->slugify('Hi'); // hey
Answer by bulforce
<?php
/**
 * @author bulforce[]gmail.com # 2011
 * Simple class to attempt transliteration of Bulgarian Latin text into Bulgarian Cyrillic text
 */
// Usage:
// $text = "yagoda i yundola";
// $tl = new Transliterate();
// echo $tl->lat_to_cyr($text); // ягода и юндола
class Transliterate {
    // Single-letter pairs: $lat_identical[$i] is replaced by $cyr_identical[$i]
    private $cyr_identical = array("а", "б", "в", "в", "г", "д", "е", "ж", "з", "и", "к", "л", "м", "н", "о", "п", "р", "с", "т", "у", "ф", "х", "ц", "ъ", "я");
    private $lat_identical = array("a", "b", "v", "w", "g", "d", "e", "j", "z", "i", "k", "l", "m", "n", "o", "p", "r", "s", "t", "u", "f", "h", "c", "y", "q");
    // Multi-letter pairs: $lat_fricative[$i] is replaced by $cyr_fricative[$i]
    private $cyr_fricative = array("ж", "ч", "ш", "щ", "ц", "я", "ю", "я", "ю");
    private $lat_fricative = array("zh", "ch", "sh", "sht", "ts", "ia", "iu", "ya", "yu");

    public function __construct() {
        $this->identical_to_upper();
        $this->fricative_to_variants();
    }

    public function lat_to_cyr($str) {
        // Replace multi-letter sequences first so their letters are not consumed one by one
        for ($i = 0; $i < count($this->cyr_fricative); $i++) {
            $c_cyr = $this->cyr_fricative[$i];
            $c_lat = $this->lat_fricative[$i];
            $str = str_replace($c_lat, $c_cyr, $str);
        }
        for ($i = 0; $i < count($this->cyr_identical); $i++) {
            $c_cyr = $this->cyr_identical[$i];
            $c_lat = $this->lat_identical[$i];
            $str = str_replace($c_lat, $c_cyr, $str);
        }
        return $str;
    }

    private function identical_to_upper() {
        foreach ($this->cyr_identical as $k => $v) {
            $this->cyr_identical[] = mb_strtoupper($v, 'UTF-8');
        }
        foreach ($this->lat_identical as $k => $v) {
            $this->lat_identical[] = mb_strtoupper($v, 'UTF-8');
        }
    }

    private function fricative_to_variants() {
        foreach ($this->lat_fricative as $k => $v) {
            // This handles all chars to Upper
            $this->lat_fricative[] = mb_strtoupper($v, 'UTF-8');
            $this->cyr_fricative[] = mb_strtoupper($this->cyr_fricative[$k], 'UTF-8');
            // This handles variants
            // TODO: fix the 3 letter sounds
            // $v is a string, so use strlen() (count() on a string fails in PHP 8)
            // and stop before the end of the string
            for ($i = 0; $i < strlen($v); $i++) {
                $v[$i] = mb_strtoupper($v[$i], 'UTF-8');
                $this->lat_fricative[] = $v;
                if ($i == 0) {
                    $this->cyr_fricative[] = mb_strtoupper($this->cyr_fricative[$k], 'UTF-8');
                } else {
                    $this->cyr_fricative[] = $this->cyr_fricative[$k];
                }
                $v[$i] = mb_strtolower($v[$i], 'UTF-8');
            }
        }
    }
}
Answer by T.Todua
Nice libraries can be found at:
1) https://github.com/ashtokalo/php-translit (many languages, though some are missing)
2) https://github.com/fre5h/transliteration (only for Russian and Ukrainian)
Answer by Alin Razvan
Try this one:
function Unaccent( $string ) {
    $transliterator = Transliterator::createFromRules(':: NFD; :: [:Nonspacing Mark:] Remove; :: NFC;', Transliterator::FORWARD);
    $normalized = $transliterator->transliterate($string);
    return $normalized;
}
Answer by AAA
The problem with your query is that it is a very hard thing to do. Not all glyphs in most languages have a-z equivalents. All glyphs have phonetic equivalents (but these are words, not letters). If you are just dealing with Latin-based languages, things are a little easier, but you still have issues with things like I-mutation.
Your best solution would be to come up with a crude list of phonetic sounds -> a-z equivalents. It won't be perfect, but without any more information on your exact requirements it is hard to develop a solution.
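A crude list of that kind could be sketched with strtr(). The map below is a hypothetical, deliberately incomplete sample (a few German, Scandinavian and Bulgarian letters), and `map_to_ascii` is an illustrative name; a real table would need rules for every language you care about.

```php
<?php
// Hypothetical, incomplete mapping; extend it for the languages you need.
const TRANSLIT_MAP = [
    'ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue', 'ß' => 'ss',
    'é' => 'e', 'è' => 'e', 'ñ' => 'n', 'ø' => 'o', 'å' => 'a',
    'ж' => 'zh', 'ч' => 'ch', 'ш' => 'sh', 'щ' => 'sht', 'я' => 'ya', 'ю' => 'yu',
];

function map_to_ascii(string $text): string
{
    // strtr() with an array prefers the longest matching key and never
    // rescans text it has already replaced.
    $mapped = strtr($text, TRANSLIT_MAP);

    // Anything still outside printable ASCII has no rule yet; each such run
    // of bytes is marked with a single '?'.
    return preg_replace('/[^\x20-\x7E]+/', '?', $mapped);
}

echo map_to_ascii('Grüße'), "\n"; // Gruesse
```

Unmapped characters show up as '?', which makes gaps in the table easy to spot while it is being built.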

