PHP best way to MD5 multi-dimensional array?

Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/2254220/

Time: 2020-08-25 05:44:25. Source: igfitidea

php arrays multidimensional-array hash md5

Asked by Peter John

What is the best way to generate an MD5 (or any other hash) of a multi-dimensional array?

I could easily write a loop which would traverse through each level of the array, concatenating each value into a string, and simply performing the MD5 on the string.

However, this seems cumbersome at best and I wondered if there was a funky function which would take a multi-dimensional array, and hash it.

Answered by Nathan J.B.

(Copy-n-paste-able function at the bottom)

As mentioned prior, the following will work.

md5(serialize($array));

However, it's worth noting that (ironically) json_encode performs noticeably faster:

md5(json_encode($array));

In fact, the speed gain is two-fold here: (1) json_encode alone performs faster than serialize, and (2) json_encode produces a smaller string, so md5 has less to handle.
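The second point is easy to check by comparing the length of the two encodings. The following is a minimal sketch; the `$sample` array is hypothetical and chosen only for illustration, so the exact sizes will differ for your own data.

```php
<?php
// Hypothetical sample data, not taken from the benchmark in this answer.
$sample = array(
    'user' => array('id' => 7, 'tags' => array('a', 'b', 'c')),
    'note' => 'hello',
);

$serialized = serialize($sample);
$json       = json_encode($sample);

// Compare how many bytes md5() would have to process in each case.
echo 'serialize():   ' . strlen($serialized) . " bytes\n";
echo 'json_encode(): ' . strlen($json) . " bytes\n";
```

serialize() stores a type tag and a length for every key and value, which is why its output tends to be longer for data like this.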

Edit: Here is evidence to support this claim:

<?php //this is the array I'm using -- it's multidimensional.
$array = unserialize('a:6:{i:0;a:0:{}i:1;a:3:{i:0;a:0:{}i:1;a:0:{}i:2;a:3:{i:0;a:0:{}i:1;a:0:{}i:2;a:0:{}}}i:2;s:5:"hello";i:3;a:2:{i:0;a:0:{}i:1;a:0:{}}i:4;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:0:{}}}}}}}i:5;a:5:{i:0;a:0:{}i:1;a:4:{i:0;a:0:{}i:1;a:0:{}i:2;a:3:{i:0;a:0:{}i:1;a:0:{}i:2;a:0:{}}i:3;a:6:{i:0;a:0:{}i:1;a:3:{i:0;a:0:{}i:1;a:0:{}i:2;a:3:{i:0;a:0:{}i:1;a:0:{}i:2;a:0:{}}}i:2;s:5:"hello";i:3;a:2:{i:0;a:0:{}i:1;a:0:{}}i:4;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:0:{}}}}}}}i:5;a:5:{i:0;a:0:{}i:1;a:3:{i:0;a:0:{}i:1;a:0:{}i:2;a:3:{i:0;a:0:{}i:1;a:0:{}i:2;a:0:{}}}i:2;s:5:"hello";i:3;a:2:{i:0;a:0:{}i:1;a:0:{}}i:4;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:0:{}}}}}}}}}}i:2;s:5:"hello";i:3;a:2:{i:0;a:0:{}i:1;a:0:{}}i:4;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:0:{}}}}}}}}}');

// The serialize test
$b4_s = microtime(true);
for ($i = 0; $i < 10000; $i++) {
    $serial = md5(serialize($array));
}
echo 'serialize() w/ md5() took: '.($sTime = microtime(true) - $b4_s).' sec<br/>';

// The json test
$b4_j = microtime(true);
for ($i = 0; $i < 10000; $i++) {
    $serial = md5(json_encode($array));
}
echo 'json_encode() w/ md5() took: '.($jTime = microtime(true) - $b4_j).' sec<br/><br/>';
// The percentage printed is the ratio $sTime / $jTime, so 250% means 2.5x
echo 'json_encode is <strong>'.( round(($sTime/$jTime)*100,1) ).'%</strong> faster with a difference of <strong>'.($sTime-$jTime).' seconds</strong>';

json_encode is consistently over 250% (2.5x) faster (often over 300%); this is not a trivial difference. You may see the results of the test with this live script here:

Now, one thing to note is that array(1,2,3) will produce a different MD5 than array(3,2,1). If this is NOT what you want, try the following code:

//Optionally make a copy of the array (if you want to preserve the original order)
$original = $array;

array_multisort($array);
$hash = md5(json_encode($array));

Edit: There's been some question as to whether reversing the order would produce the same results. So, I've done that (correctly) here:

As you can see, the results are exactly the same. Here's the (corrected) test originally created by someone related to Drupal:

And for good measure, here's a function/method you can copy and paste (tested in 5.3.3-1ubuntu9.5):

function array_md5(array $array) {
    // Since we're inside a function (which receives a copy of the array,
    // not a reference), you shouldn't need to copy the array first.
    // Note: array_multisort() sorts only the top level, not nested arrays.
    array_multisort($array);
    return md5(json_encode($array));
}

Answered by Brock Batsell

md5(serialize($array));

Answered by dotancohen

I'm joining a very crowded party by answering, but there is an important consideration that none of the extant answers address. The values of json_encode() and serialize() both depend upon the order of elements in the array!

Here are the results of not sorting and sorting the arrays, on two arrays with identical values but added in a different order (code at bottom of post):

    serialize()
1c4f1064ab79e4722f41ab5a8141b210
1ad0f2c7e690c8e3cd5c34f7c9b8573a

    json_encode()
db7178ba34f9271bfca3a05c5dddf502
c9661c0852c2bd0e26ef7951b4ca9e6f

    Sorted serialize()
1c4f1064ab79e4722f41ab5a8141b210
1c4f1064ab79e4722f41ab5a8141b210

    Sorted json_encode()
db7178ba34f9271bfca3a05c5dddf502
db7178ba34f9271bfca3a05c5dddf502

Therefore, the two methods that I would recommend to hash an array would be:

// You will need to write your own deep_ksort(), or see
// my example below

md5(   serialize(deep_ksort($array)) );

md5( json_encode(deep_ksort($array)) );

The choice of json_encode() or serialize() should be determined by testing on the type of data that you are using. In my own testing on purely textual and numerical data, unless the code runs in a tight loop thousands of times, the difference is not even worth benchmarking. I personally use json_encode() for that type of data.

Here is the code used to generate the sorting test above:

$a = array();
$a['aa'] = array( 'aaa'=>'AAA', 'bbb'=>'ooo', 'qqq'=>'fff',);
$a['bb'] = array( 'aaa'=>'BBBB', 'iii'=>'dd',);

$b = array();
$b['aa'] = array( 'aaa'=>'AAA', 'qqq'=>'fff', 'bbb'=>'ooo',);
$b['bb'] = array( 'iii'=>'dd', 'aaa'=>'BBBB',);

echo "    serialize()\n";
echo md5(serialize($a))."\n";
echo md5(serialize($b))."\n";

echo "\n    json_encode()\n";
echo md5(json_encode($a))."\n";
echo md5(json_encode($b))."\n";



$a = deep_ksort($a);
$b = deep_ksort($b);

echo "\n    Sorted serialize()\n";
echo md5(serialize($a))."\n";
echo md5(serialize($b))."\n";

echo "\n    Sorted json_encode()\n";
echo md5(json_encode($a))."\n";
echo md5(json_encode($b))."\n";

My quick deep_ksort() implementation fits this case, but check it before using it in your own projects:

/*
* Sort an array by keys, and additionally sort its array values by keys
*
* Does not try to sort an object, but does iterate its properties to
* sort arrays in properties
*/
function deep_ksort($input)
{
    if ( !is_object($input) && !is_array($input) ) {
        return $input;
    }

    foreach ( $input as $k=>$v ) {
        if ( is_object($v) || is_array($v) ) {
            $sorted = deep_ksort($v);
            // Plain objects do not support [] access, so assign via property
            if ( is_object($input) ) {
                $input->$k = $sorted;
            } else {
                $input[$k] = $sorted;
            }
        }
    }

    if ( is_array($input) ) {
        ksort($input);
    }

    // Do not sort objects

    return $input;
}

Answered by Alexander Yancharuk

The answer depends heavily on the data types of the array values. For big strings use:

md5(serialize($array));

For short strings and integers use:

md5(json_encode($array));

Four built-in PHP functions can transform an array into a string: serialize(), json_encode(), var_export(), print_r().

Notice: the json_encode() function slows down while processing associative arrays with strings as values. In this case, consider using the serialize() function.

Test results for multi-dimensional array with md5-hashes (32 char) in keys and values:

Test name       Repeats         Result          Performance     
serialize       10000           0.761195 sec    +0.00%
print_r         10000           1.669689 sec    -119.35%
json_encode     10000           1.712214 sec    -124.94%
var_export      10000           1.735023 sec    -127.93%

Test results for numeric multi-dimensional array:

Test name       Repeats         Result          Performance     
json_encode     10000           1.040612 sec    +0.00%
var_export      10000           1.753170 sec    -68.47%
serialize       10000           1.947791 sec    -87.18%
print_r         10000           9.084989 sec    -773.04%

Associative array test source. Numeric array test source.

Answered by Chris Jester-Young

Aside from Brock's excellent answer (+1), any decent hashing library allows you to update the hash in increments, so you should be able to update with each string sequentially, instead of having to build up one giant string.

See: hash_update
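As a rough sketch of that idea, the array can be walked recursively and each key and value fed to hash_update() as it is visited. The function names `hash_array_update` and `array_md5_incremental` are hypothetical, and the length prefixes and bracket markers are just one assumed way to keep adjacent entries and nesting levels from running together. The result is a valid MD5, but it will not equal md5(serialize($array)).

```php
<?php
// Hypothetical helper: feeds one array level into an existing hash context.
function hash_array_update($ctx, array $array)
{
    foreach ($array as $key => $value) {
        $k = (string) $key;
        // Length-prefix the key so "ab" + "c" cannot collide with "a" + "bc".
        hash_update($ctx, strlen($k) . ':' . $k);
        if (is_array($value)) {
            // Mark where a nested array starts and ends.
            hash_update($ctx, '[');
            hash_array_update($ctx, $value);
            hash_update($ctx, ']');
        } else {
            // serialize() keeps the type of the leaf value (1 vs "1" vs true).
            $v = serialize($value);
            hash_update($ctx, strlen($v) . ':' . $v);
        }
    }
}

// Hypothetical wrapper: hashes a whole array without building one big string.
function array_md5_incremental(array $array)
{
    $ctx = hash_init('md5');
    hash_array_update($ctx, $array);
    return hash_final($ctx);
}

echo array_md5_incremental(array('a' => array(1, 2, 3), 'b' => 'hello')), "\n";
```

Like the other approaches on this page, this depends on element order, so sort the array first if order should not matter.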

Answered by Max Wheeler

md5(serialize($array));

This will work, but the hash will change depending on the order of the array (that might not matter, though).

Answered by Willem-Jan

Note that serialize and json_encode act differently when it comes to numeric arrays where the keys don't start at 0, or associative arrays. json_encode will store such arrays as an Object, so json_decode returns an Object, whereas unserialize will return an array with exactly the same keys.
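A small sketch of that difference, using a hypothetical two-element array whose keys start at 1:

```php
<?php
// Hypothetical sample: a numeric array whose keys start at 1, not 0.
$array = array(1 => 'a', 2 => 'b');

echo json_encode($array), "\n";           // encoded as a JSON object
echo json_encode(array('a', 'b')), "\n";  // keys 0..n-1 are encoded as a JSON list

$fromJson   = json_decode(json_encode($array));  // stdClass by default
$fromSerial = unserialize(serialize($array));    // array with the original keys

var_dump($fromJson instanceof stdClass);  // bool(true)
var_dump($fromSerial === $array);         // bool(true)
```

For hashing alone the round trip does not matter, but it does if the encoded string is also stored and decoded later.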

Answered by Andrej Pandovich

I think that this could be a good tip:

Class hasharray {

    public function array_flat($in,$keys=array(),$out=array()){
        foreach($in as $k => $v){
            $keys[] = $k; 
            if(is_array($v)){
                $out = $this->array_flat($v,$keys,$out);
            }else{
                $out[implode("/",$keys)] = $v;
            }
            array_pop($keys);
        }
        return $out;  
    }

    public function array_hash($in){
        $a = $this->array_flat($in);
        ksort($a);
        return md5(json_encode($a));
    }

}

$h = new hasharray;
echo $h->array_hash($multi_dimensional_array);

Answered by TermiT

Important note about serialize()

I don't recommend using it as part of a hashing function, because it can return different results for data that looks the same. Check the examples below:

Simple example:

$a = new \stdClass;
$a->test = 'sample';

$b = new \stdClass;
$b->one = $a;
$b->two = clone $a;

Produces

"O:8:"stdClass":2:{s:3:"one";O:8:"stdClass":1:{s:4:"test";s:6:"sample";}s:3:"two";O:8:"stdClass":1:{s:4:"test";s:6:"sample";}}"

But the following code:

<?php

$a = new \stdClass;
$a->test = 'sample';

$b = new \stdClass;
$b->one = $a;
$b->two = $a;

Output:

"O:8:"stdClass":2:{s:3:"one";O:8:"stdClass":1:{s:4:"test";s:6:"sample";}s:3:"two";r:2;}"

So instead of serializing the second object, PHP just creates a link "r:2;" to the first instance. It's definitely a good and correct way to serialize data, but it can lead to issues with your hashing function.
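To see the effect on a hash, the two cases can be compared side by side. This is a sketch with hypothetical variable names (`$cloned`, `$shared`); json_encode() is included for contrast because it writes each object's public properties out in full and has no back-reference notation.

```php
<?php
$a = new stdClass;
$a->test = 'sample';

// Two separate but equal objects.
$cloned = new stdClass;
$cloned->one = $a;
$cloned->two = clone $a;

// The same object referenced twice.
$shared = new stdClass;
$shared->one = $a;
$shared->two = $a;

// serialize() emits "r:2;" for the shared case, so the hashes differ.
var_dump(md5(serialize($cloned)) === md5(serialize($shared)));     // bool(false)

// json_encode() produces the same string for both, so the hashes match.
var_dump(md5(json_encode($cloned)) === md5(json_encode($shared))); // bool(true)
```

Which behavior is right depends on whether object identity should count as part of the data being hashed.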

Answered by ymakux

// Convert nested arrays to a simple array
$array = array();
array_walk_recursive($input, function ($a) use (&$array) {
    $array[] = $a;
});

sort($array);

$hash = md5(json_encode($array));

----

These arrays have the same hash:
$arr1 = array(0 => array(1, 2, 3), 1, 2);
$arr2 = array(0 => array(1, 3, 2), 1, 2);
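
Wrapping the snippet in a function makes the claim easy to check. The name `flat_sorted_md5` and the `$arr3` array are hypothetical, added only for illustration. Because flattening discards keys and nesting, arrays with quite different shapes also end up with the same hash, which may or may not be what you want.

```php
<?php
// Hypothetical wrapper around the flatten-and-sort snippet above.
function flat_sorted_md5(array $input)
{
    $values = array();
    // Collect every leaf value; keys and nesting are dropped.
    array_walk_recursive($input, function ($value) use (&$values) {
        $values[] = $value;
    });
    sort($values);
    return md5(json_encode($values));
}

$arr1 = array(0 => array(1, 2, 3), 1, 2);
$arr2 = array(0 => array(1, 3, 2), 1, 2);
var_dump(flat_sorted_md5($arr1) === flat_sorted_md5($arr2)); // bool(true)

// Different keys and nesting, same leaf values: still the same hash.
$arr3 = array('x' => 1, 'y' => array(2, 3), 'z' => array(1, 2));
var_dump(flat_sorted_md5($arr1) === flat_sorted_md5($arr3)); // bool(true)
```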