php: Measure string size in bytes in PHP

Notice: this page is derived from a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it under the same license, but you must cite the original URL and attribute it to the original authors (not me): StackOverflow. Original: http://stackoverflow.com/questions/7568949/

Time: 2020-08-26 02:57:51  Source: igfitidea

Measure string size in Bytes in php

php string string-length

Asked by Liam Bailey

I am doing a real estate feed for a portal and it is telling me the max length of a string should be 20,000 bytes (20kb), but I have never run across this before.


How can I measure the byte size of a varchar string, so that I can then use a while loop to trim it down?

Accepted answer by Foo Bah

You have to figure out whether the string is ASCII-encoded or encoded in a multi-byte format.

In the former case, you can just use strlen.


In the latter case you need to find the number of bytes per character.


The strlen documentation gives an example of how to do it: http://www.php.net/manual/en/function.strlen.php#72274
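As a rough sketch of the distinction (assuming the string is UTF-8 and the mbstring extension is loaded without function overloading; the sample strings are made up for illustration), strlen() and mb_strlen() agree for pure ASCII and diverge once multi-byte characters appear:

```php
<?php
// Assumption: strings are UTF-8. "\u{00F1}" is the two-byte character
// "n with tilde" written with the PHP 7+ Unicode escape syntax.
$ascii = 'canones';
$utf8  = "ca\u{00F1}ones";

// mb_check_encoding() reports whether the bytes are valid in the given encoding.
$asciiIsAscii = mb_check_encoding($ascii, 'ASCII');
$utf8IsAscii  = mb_check_encoding($utf8, 'ASCII');

// strlen() counts bytes; mb_strlen() counts characters in the given encoding.
echo strlen($ascii), ' bytes, ', mb_strlen($ascii, 'UTF-8'), " characters\n";
echo strlen($utf8), ' bytes, ', mb_strlen($utf8, 'UTF-8'), " characters\n";
```

If the check reports plain ASCII, the byte count and the character count are the same number, so strlen() alone is enough.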


Answer by PhoneixS

You can use mb_strlen() to get the byte length by passing an encoding that has only single-byte characters, without worrying about whether the string is multibyte or single-byte. For example, as drake127 says in a comment on mb_strlen, you can use the '8bit' encoding:

<?php
    $string = 'Cién cañones por banda';
    echo mb_strlen($string, '8bit');
?>

You can run into problems with the strlen function, since PHP has an option (the mbstring.func_overload setting, deprecated in PHP 7.2 and removed in PHP 8.0) to overload strlen so that it actually calls mb_strlen. See more about it at http://php.net/manual/en/mbstring.overload.php

To trim the string by byte length without splitting it in the middle of a multibyte character, you can use:

mb_strcut(string $str, int $start [, int $length [, string $encoding ]] )
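A minimal sketch of how that call might be used for the 20,000-byte limit from the question, assuming UTF-8 input and the mbstring extension; the helper name trimToBytes and the sample string are hypothetical:

```php
<?php
// Hypothetical helper (not from the original answer); assumes UTF-8 input.
function trimToBytes(string $text, int $maxBytes, string $encoding = 'UTF-8'): string
{
    // mb_strcut() takes $start and $length in bytes, but moves the cut
    // back to a character boundary instead of splitting a character.
    return mb_strcut($text, 0, $maxBytes, $encoding);
}

// Five two-byte characters ("\u{00F1}" needs PHP 7+), 10 bytes in total.
$text = str_repeat("\u{00F1}", 5);

// A 5-byte limit would fall in the middle of the third character.
$trimmed = trimToBytes($text, 5);

echo strlen($trimmed), " bytes after trimming\n";
var_dump(mb_check_encoding($trimmed, 'UTF-8')); // result is still valid UTF-8
```

For the feed in the question, the call would be trimToBytes($description, 20000), with no while loop needed. A string already under the limit comes back unchanged.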

Answer by soulmerge

Do you mean byte size or string length?


Byte size is measured with strlen(), whereas string length is queried using mb_strlen(). You can use substr() to trim a string to X bytes (note that this will break the string if it has a multi-byte encoding, as pointed out by Darhazer in the comments) and mb_substr() to trim it to X characters in the encoding of the string.
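To illustrate the caveat about substr(), here is a small sketch assuming UTF-8 input and the mbstring extension without function overloading; the sample word is made up:

```php
<?php
// Assumption: UTF-8 input. "\u{00E9}" is the two-byte character "e with acute" (PHP 7+ syntax).
$word = "caf\u{00E9}"; // 4 characters, 5 bytes

// substr() counts bytes, so a 4-byte cut keeps only half of the last character.
$byBytes = substr($word, 0, 4);

// mb_substr() counts characters in the given encoding, so all 4 characters survive.
$byChars = mb_substr($word, 0, 4, 'UTF-8');

var_dump(mb_check_encoding($byBytes, 'UTF-8')); // the byte-based cut is no longer valid UTF-8
var_dump(strlen($byChars));                     // byte length of the character-based cut
```

The character-based cut is safe for display but does not guarantee a byte limit, since 4 characters here occupy 5 bytes.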


Answer by mIFO

PHP's strlen() function returns the number of bytes, which equals the number of characters only when every character is a single byte (as in ASCII).

strlen('borsc') -> 5 (bytes)

strlen('boršč') -> 7 (bytes)

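// Maximum chunk size; despite the variable name, the value is in bytes (20,000 bytes).
$limit_in_kBytes = 20000;

$pointer = 0;
// Loop while more than one chunk's worth of bytes remains.
while (strlen($your_string) > (($pointer + 1) * $limit_in_kBytes)) {
    // substr() counts bytes, so a chunk boundary may split a multi-byte character.
    $str_to_handle = substr($your_string, $pointer * $limit_in_kBytes, $limit_in_kBytes);
    // here you can handle (0 - n) parts of string
    $pointer++;
}

$str_to_handle = substr($your_string, $pointer * $limit_in_kBytes, $limit_in_kBytes);
// here you can handle last part of string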

...or you can use a function like this:

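// Split $string into chunks of at most $limit_in_kBytes bytes
// (the value is in bytes, despite the parameter name).
function parseStrToArr($string, $limit_in_kBytes){
    $ret = array();

    $pointer = 0;
    // Collect every full-size chunk except the last one.
    while(strlen($string) > (($pointer + 1) * $limit_in_kBytes)){
        $ret[] = substr($string, ($pointer * $limit_in_kBytes ), $limit_in_kBytes);
        $pointer++;
    }

    // Append the remaining (possibly shorter) last chunk.
    $ret[] = substr($string, ($pointer * $limit_in_kBytes), $limit_in_kBytes);

    return $ret;
}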

$arr = parseStrToArr($your_string, $limit_in_kBytes = 20000);
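As an alternative that is not part of the original answer, PHP's built-in str_split() also cuts a string into fixed-size byte chunks. A small sketch, assuming byte-based chunks are acceptable (multi-byte characters may be split at chunk boundaries) and using a made-up $data string:

```php
<?php
// Hypothetical sample data: 45,000 single-byte characters.
$data = str_repeat('a', 45000);

// str_split() counts bytes, like substr(); the last chunk holds the remainder.
$chunks = str_split($data, 20000);

echo count($chunks), " chunks\n";
echo strlen(end($chunks)), " bytes in the last chunk\n";
```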

Answer by Ulver

Further to PhoneixS's answer on getting the correct length of a string in bytes: since mb_strlen() is slower than strlen(), for the best performance one can check the "mbstring.func_overload" ini setting so that mb_strlen() is used only when it is really required:

$content_length = ini_get('mbstring.func_overload') ? mb_strlen($content , '8bit') : strlen($content);