PHP: Foreign characters and LDAP. What encoding/charset does LDAP expect?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/11035359/

Date: 2020-08-24 23:37:56  Source: igfitidea

Tags: php, encoding, active-directory, ldap

Asked by OmidTahouri

I am parsing XML, with simplexml_load_string(), and using the data within it to update Active Directory (AD) objects, via LDAP.

Example XML (simplified):

<?xml version="1.0" encoding="UTF-8"?>
<users>
    <user>Bìlb? Bágg?n?</user>
    <user>G?ńd??f Thê Gr?at</user>
    <user>?ām Wī??</user>
</users>

I first run an ldap_search() to find a single user and then proceed to change their attributes. Pumping the above values straight into AD, using LDAP, results in some pretty mangled characters showing up.

For example: B??lb?? B??gg?ˉn??

I've tried the following functions, to no avail:

utf8_encode($str);
utf8_decode($str);
iconv("UTF-8", "ISO-8859-1//TRANSLIT", $str);
iconv("UTF-8", "ASCII//TRANSLIT", $str);
iconv("UTF-8", "T.61", $str);

Ideally, I don't want to do any of these string conversions. UTF-8 should be fine, right?!

I've also noticed the following: I printed out the values to see how they come out. curl-ing the script in CLI shows the correct characters, but web browsers show the same mangled characters as AD.

What's going on? Should I be looking at something else, e.g. URL encoding? I'm hoping this is down to a simple mistake on my end.

EDIT: I entered these characters using the AD admin GUI to see how they would come out. I can read them fine via LDAP. The correct characters are displayed in a browser, whereas curl-ing via CLI shows question marks instead of foreign characters. Passing one of these returned values into mb_detect_encoding() returns UTF-8.

I decided to immediately modify the same object by not writing in a new string, but just reversing the existing value and saving the object. This works fine - I see the correct value (reversed) in AD.

  • Developing on Mac OS X 10.7 Lion - PHP 5.4.3
  • Running production on: Red Hat 6 - PHP 5.4.3
  • AD server: Windows 2003

UPDATE: After a few months, I was still unable to find a solution to this problem. In the end, I went with replacing characters with their non-accented equivalents (NOT ideal, I know).

Answered by Mike Mackintosh

Are you using LDAP v3?

ldap_set_option($ldap, LDAP_OPT_PROTOCOL_VERSION, 3);

LDAPv3 uses UTF-8 by default and expects requests and responses to be UTF-8 encoded. See here: http://technet.microsoft.com/en-us/library/cc961766.aspx
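
To make the order of operations concrete, here is a minimal sketch, assuming PHP's ldap extension is installed. The helper names connectLdapV3 and replaceDisplayName, and the choice of the displayName attribute, are made up for illustration and do not come from the question or the answers. The idea is to request protocol version 3 before binding and to refuse values that are not valid UTF-8 rather than converting them blindly.

```php
<?php
// Hypothetical helpers for illustration only; the names and the displayName
// attribute are assumptions, not taken from the original posts.

function connectLdapV3(string $uri)
{
    // ldap_connect() only creates a handle; the server is contacted on bind.
    $ldap = ldap_connect($uri);
    if ($ldap === false) {
        throw new RuntimeException("Could not create LDAP handle for $uri");
    }
    // Request protocol version 3 explicitly, before calling ldap_bind().
    if (!ldap_set_option($ldap, LDAP_OPT_PROTOCOL_VERSION, 3)) {
        throw new RuntimeException('Could not switch to LDAPv3');
    }
    return $ldap;
}

function replaceDisplayName($ldap, string $dn, string $name): bool
{
    // Refuse anything that is not valid UTF-8 instead of re-encoding it.
    if (preg_match('//u', $name) !== 1) {
        throw new InvalidArgumentException('Value is not valid UTF-8');
    }
    return ldap_mod_replace($ldap, $dn, ['displayName' => [$name]]);
}
```

A caller would create the handle with connectLdapV3(), bind with ldap_bind(), and then pass the user's DN and the new value to replaceDisplayName(); the host name, bind credentials and DN are all placeholders the caller has to supply.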

Answered by MrD

Here is the solution that worked for me. Do the following:

1.) First make sure you are using LDAP protocol version 3, which uses UTF-8 by default:

ldap_set_option($ldap, LDAP_OPT_PROTOCOL_VERSION, 3);

2.) If you want to change a user's password, then make sure that the "use TLS" option is set to true and "use SSL" is set to false.

ldap_start_tls($ldapConnection);

3.) I used port number 389.

4.) Use the PHP function ldap_mod_replace to replace the user's password.

5.) Use the following function to encode your $password:

function encodePassword($password)
{
    // AD expects the password wrapped in double quotes
    $password = "\"" . $password . "\"";
    $encoded = "";
    // Append a NUL byte after every byte of the quoted password
    for ($i = 0; $i < strlen($password); $i++) {
        $encoded .= "{$password[$i]}\000";
    }
    return $encoded;
}

6.) Use the following logic to change the user's password:

$password = "test";
if (mb_detect_encoding($password) == 'UTF-8')
{
    $password = utf8_decode($password);
}

$add = array();
$add["unicodePwd"][0] = encodePassword($password);

$result = @ldap_mod_replace($ldapConnection, $userDn, $add);
if ($result === false) {
    // your action on failure
}
else {
    // your action on success
}
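
As an aside to steps 5 and 6: the byte-by-byte loop in encodePassword only lines up with UTF-16LE when every character is a single byte. The following is a minimal sketch of an alternative, assuming the mbstring extension is available and the password arrives as UTF-8. The function name encodeUnicodePwd is made up for this illustration and is not part of the original answer.

```php
<?php
// Hypothetical helper (not from the original answer): build the value for
// AD's unicodePwd attribute directly from a UTF-8 password.
function encodeUnicodePwd(string $utf8Password): string
{
    // Wrap the password in double quotes, then convert the whole string
    // from UTF-8 to UTF-16LE in one step.
    return mb_convert_encoding('"' . $utf8Password . '"', 'UTF-16LE', 'UTF-8');
}

// Inspect the bytes: each ASCII character is followed by a NUL byte.
echo bin2hex(encodeUnicodePwd('test')), PHP_EOL; // 220074006500730074002200
```

For passwords limited to ISO-8859-1 characters this should give the same bytes as utf8_decode() followed by encodePassword(), without the intermediate decode step.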

7.) Please note that the encodePassword function works byte by byte: it appends a NUL byte to every byte of $password, which gives the UTF-16LE value that AD expects in unicodePwd only if each character is a single byte. If your password is UTF-8 encoded, you therefore have to decode it before sending it to the encodePassword function. That is why I wrote the line:

if (mb_detect_encoding($password) == 'UTF-8')
{
    $password = utf8_decode($password);
}

This code worked for me when I provided German umlauts (ä, ö, ü, etc.) in the password.

Answered by mteodor

I managed to add foreign characters to LDAP in two steps:

  • add the user with ASCII characters only (iconv "ASCII//TRANSLIT")

  • use ldapmodify to update the field(s) with UTF-8 characters

LDAPv3 is UTF-8, but the tool I used (from smbldap-tools) was not dealing with it properly.

Answered by ChadSikorra

Another thing to mention for those stumbling across this:

If your text is already in UTF-8, then do NOT attempt to re-encode it. Note the following remarks on the doc page for utf8_encode. Re-encoding an already encoded string will result in garbled text. Additionally, the function only converts from one specific encoding (ISO-8859-1) to another (UTF-8).

You could easily test if you need to UTF-8 encode the string by doing something like:

if (!preg_match('//u', $value)) {
    // do your encoding process...
}
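
To illustrate the point about re-encoding, here is a small sketch, assuming the mbstring extension is available and that non-UTF-8 input is ISO-8859-1 (an assumption about the data source, not something the question states). The helper name ensureUtf8 is hypothetical. It converts only when the preg_match('//u') check fails, and the last line shows the bytes you get when an already encoded string is encoded again.

```php
<?php
// Hypothetical helper: convert to UTF-8 only when the input is not already
// valid UTF-8. The ISO-8859-1 default is an assumption about the source data.
function ensureUtf8(string $value, string $assumedSource = 'ISO-8859-1'): string
{
    if (preg_match('//u', $value) === 1) {
        return $value; // already valid UTF-8, leave it untouched
    }
    return mb_convert_encoding($value, 'UTF-8', $assumedSource);
}

$utf8   = "B\xC3\xACl"; // "Bìl" as UTF-8 bytes
$latin1 = "B\xECl";     // "Bìl" as ISO-8859-1 bytes

var_dump(ensureUtf8($utf8) === $utf8);   // bool(true)
var_dump(ensureUtf8($latin1) === $utf8); // bool(true)

// Encoding the UTF-8 string a second time is what produces the garbage:
echo bin2hex(mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1')), PHP_EOL; // 42c383c2ac6c
```

Note that the check is only a heuristic: an ISO-8859-1 string whose bytes happen to form valid UTF-8 would pass through unchanged.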

As for the characters that display correctly on the CLI but not on a web page, make sure you are setting the correct charset in your headers:

header('Content-type: text/html; charset=utf-8');