Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/2556345/

Time: 2020-08-25 06:55:38  Source: igfitidea

Detect base64 encoding in PHP?

php base64 encode

Asked by Ian McIntyre Silber

Is there some way to detect if a string has been base64_encoded() in PHP?

We're converting some storage from plain text to base64 and part of it lives in a cookie that needs to be updated. I'd like to reset their cookie if the text has not yet been encoded, otherwise leave it alone.

Answered by chrishiestand

Apologies for a late response to an already-answered question, but I don't think base64_decode($x,true) is a good enough solution for this problem. In fact, there may not be a very good solution that works against any given input. For example, I can put lots of bad values into $x and not get a false return value.

var_dump(base64_decode('wtf mate',true));
string(5) "???j?"

var_dump(base64_decode('This is definitely not base64 encoded',true));
string(24) "N???^~)??r??[j?????"

I think that in addition to the strict return value check, you'd also need to do post-decode validation. The most reliable approach is to decode and then check the result against a known set of possible values.
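
A minimal sketch of that idea, assuming the cookie may only hold one of a few known plain-text values. The ALLOWED_VALUES list and the decode_known_value() helper are hypothetical names made up for this illustration, not part of any library:

```php
<?php
// Hypothetical whitelist: the plain-text values the cookie is allowed to hold.
const ALLOWED_VALUES = ['light', 'dark', 'system'];

/**
 * Returns the decoded value if $raw is strict base64 for an allowed value,
 * otherwise null.
 */
function decode_known_value(string $raw): ?string
{
    $decoded = base64_decode($raw, true);
    if ($decoded === false) {
        return null; // input contained characters outside the base64 alphabet
    }
    // Post-decode validation: only accept values we expect to see.
    return in_array($decoded, ALLOWED_VALUES, true) ? $decoded : null;
}

var_dump(decode_known_value(base64_encode('dark'))); // string(4) "dark"
var_dump(decode_known_value('dark'));                // NULL
```

The second call shows why the whitelist matters: 'dark' passes the strict decode because it only uses base64 characters, but the decoded bytes are not an allowed value.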

A more general solution with less than 100% accuracy (better for longer strings, unreliable for short ones) is to check whether many bytes of the decoded output fall outside the normal range of UTF-8 (or whatever encoding you use) characters.

See this example:

<?php
// Byte values of ordinary text, including some multi-byte UTF-8 characters
$english = array();
foreach (str_split('az019AZ~~~!@#$%^*()_+|}?><": Iñtërnatiônàlizætiøn') as $char) {
  echo ord($char) . "\n";
  $english[] = ord($char);
}
echo "Max value english = " . max($english) . "\n";

// Byte values obtained by "decoding" a string that was never base64 encoded
$nonsense = array();
echo "\n\nbase64:\n";
foreach (str_split(base64_decode('Not base64 encoded',true)) as $char) {
  echo ord($char) . "\n";
  $nonsense[] = ord($char);
}
echo "Max nonsense = " . max($nonsense) . "\n";

?>

Results:

Max value english = 195
Max nonsense = 233

So you may do something like this:

if ( $maxDecodedValue > 200 ) {} //decoded string is Garbage - original string not base64 encoded

else {} //decoded string is useful - it was base64 encoded

You should probably use the mean of the decoded values instead of the max. I only used max() in this example because there is sadly no built-in mean() in PHP. Which measure you use (mean, max, etc.) and which threshold you compare it against (e.g. 200) depend on your estimated usage profile.
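
Since there is no built-in mean(), one possible way to compute it is array_sum() divided by count(). The mean_byte_value() helper below is a hypothetical name for this sketch, and any threshold you compare the result against is something you would have to tune yourself:

```php
<?php
/**
 * Mean byte value (0-255) of a binary string; 0.0 for an empty string.
 */
function mean_byte_value(string $bytes): float
{
    if ($bytes === '') {
        return 0.0; // avoid dividing by zero
    }
    // unpack('C*') yields one unsigned integer per byte.
    $values = unpack('C*', $bytes);
    return array_sum($values) / count($values);
}

$decoded = base64_decode('Not base64 encoded', true);
if ($decoded !== false) {
    echo "Mean nonsense = " . mean_byte_value($decoded) . "\n";
}
```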

In conclusion, the only winning move is not to play. I'd try to avoid having to discern base64 in the first place.

Answered by alex

function is_base64_encoded($data)
{
    // Only base64 alphabet characters, followed by at most two '=' padding characters
    return preg_match('%^[a-zA-Z0-9/+]*={0,2}$%', $data) === 1;
}

is_base64_encoded("iash21iawhdj98UH3"); // true
is_base64_encoded("#iu3498r"); // false
is_base64_encoded("asiudfh9w=8uihf"); // false
is_base64_encoded("a398UIhnj43f/1!+sadfh3w84hduihhjw=="); // false

http://php.net/manual/en/function.base64-decode.php#81425

Answered by Amir

I had the same problem and ended up with this solution:

if ( base64_encode(base64_decode($data)) === $data){
    echo '$data is valid';
} else {
    echo '$data is NOT valid';
}
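
As a rough illustration of how this round-trip check behaves, here is the same comparison wrapped in a function (round_trips() is a hypothetical name used only for this sketch). Note that plain text which happens to use only base64 characters, with a length that is a multiple of four, also passes:

```php
<?php
// Same comparison as above, wrapped so it can be called repeatedly.
function round_trips(string $data): bool
{
    return base64_encode(base64_decode($data)) === $data;
}

var_dump(round_trips(base64_encode('hello world'))); // bool(true)
var_dump(round_trips('not base64!'));                // bool(false)
var_dump(round_trips('node'));                       // bool(true), although 'node' is plain text
```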

Answered by Abhinav bhardwaj

We can combine three checks into one function to test whether a given string is valid base64-encoded data.

function validBase64($string)
{
    // Check that the string contains no invalid characters
    if (!preg_match('/^[a-zA-Z0-9\/\r\n+]*={0,2}$/', $string)) return false;

    // Decode the string in strict mode. Compare against false explicitly so that
    // a decoded value such as "0" (from "MA==") is not mistaken for a failure;
    // an empty result is still rejected
    $decoded = base64_decode($string, true);
    if ($decoded === false || $decoded === '') return false;

    // Encode again and compare with the original
    if (base64_encode($decoded) !== $string) return false;

    return true;
}

Answered by Marki

Better late than never: you could maybe use mb_detect_encoding() to find out whether the decoded string appears to be some kind of text:

function is_base64_string($s) {
  // first check if we're dealing with an actual valid base64 encoded string
  if (($b = base64_decode($s, TRUE)) === FALSE) {
    return FALSE;
  }

  // now check whether the decoded data could be actual text
  $e = mb_detect_encoding($b);
  if (in_array($e, array('UTF-8', 'ASCII'))) { // YMMV
    return TRUE;
  } else {
    return FALSE;
  }
}

Answered by Albert

I was about to build a base64 toggle in PHP; this is what I did:

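function base64Toggle($str) {
    // Only try to decode when the string contains nothing but base64 characters
    if (!preg_match('~[^0-9a-zA-Z+/=]~', $str)) {
        $decoded = base64_decode($str);
        // An empty result has no bytes to inspect (and would divide by zero below)
        if ($decoded === '') return $decoded;
        // Count the bytes above the printable ASCII range
        $x = 0;
        foreach (str_split($decoded) as $char) if (ord($char) > 126) $x++;
        // Treat the input as base64 if fewer than 30% of the bytes are non-ASCII
        if ($x / strlen($decoded) * 100 < 30) return $decoded;
    }
    return base64_encode($str);
}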

It works perfectly for me. Here are my complete thoughts on it: http://www.albertmartin.de/blog/code.php/19/base64-detection

And here you can try it: http://www.albertmartin.de/tools

Answered by Special K.

Here's my solution:

if(empty(htmlspecialchars(base64_decode($string, true)))) { return false; }

It will return false if the decoded $string is invalid, for example: "node", "123", " ", etc.

Answered by Sivaguru

By default, base64_decode() will not return FALSE if the input is not valid base64-encoded data. Use imap_base64() instead; it returns FALSE if $text contains characters outside the Base64 alphabet. See the imap_base64() reference.
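
For comparison, here is a small sketch of how base64_decode() treats a character outside the alphabet in its default mode and in strict mode (the second argument set to true). The sample input is just an illustrative string; imap_base64() is not shown because it requires the IMAP extension:

```php
<?php
$input = '#iu3498r'; // '#' is not part of the base64 alphabet

// Default mode silently skips the '#' and still returns a string.
var_dump(base64_decode($input) === false);       // bool(false)

// Strict mode rejects the input.
var_dump(base64_decode($input, true) === false); // bool(true)
```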

Answered by Francisco Luz

$is_base64 = function(string $string) : bool {
    $zero_one = ['MA==', 'MQ=='];
    if (in_array($string, $zero_one)) return TRUE;

    if (empty(htmlspecialchars(base64_decode($string, TRUE))))
        return FALSE;

    return TRUE;
};

var_dump('*** These yell false ***');
var_dump($is_base64(''));
var_dump($is_base64('This is definitely not base64 encoded'));
var_dump($is_base64('node'));
var_dump($is_base64('node '));
var_dump($is_base64('123'));
var_dump($is_base64(0));
var_dump($is_base64(1));
var_dump($is_base64(123));
var_dump($is_base64(1.23));

var_dump('*** These yell true ***');
var_dump($is_base64(base64_encode('This is definitely base64 encoded')));
var_dump($is_base64(base64_encode('node')));
var_dump($is_base64(base64_encode('123')));
var_dump($is_base64(base64_encode(0)));
var_dump($is_base64(base64_encode(1)));
var_dump($is_base64(base64_encode(123)));
var_dump($is_base64(base64_encode(1.23)));
var_dump($is_base64(base64_encode(TRUE)));

var_dump('*** Should these yell true? Might be edge cases ***');
var_dump($is_base64(base64_encode('')));
var_dump($is_base64(base64_encode(FALSE)));
var_dump($is_base64(base64_encode(NULL)));

Answered by Digital Human

Your best option is:

$base64_test = mb_substr(trim($some_base64_data), 0, 76);
return (base64_decode($base64_test, true) === FALSE ? FALSE : TRUE);