Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). Original Stack Overflow post: http://stackoverflow.com/questions/11424549/

Time: 2020-08-27 15:08:40  Source: igfitidea

How to check if a string contains spaces/tabs/new lines (anything that's blank)?

c++

Asked by unwise guy

I know there's an "isspace" function that checks for spaces, but that would require me to iterate through every character in the string, which can be bad on performance since this would be called a lot. Is there a fast way to check if a std::string contains only spaces?

ex:

function("       ") // returns true
function("    4  ") // returns false

One solution I've thought of is to use regex, then I'll know that it only contains whitespace if it's false... but I'm not sure if this would be more efficient than the isspace function.

regex: [\w\W] //checks for any word character(a,b,c..) and non-word character([,],..)

thanks in advance!

Accepted answer by Ernest Friedman-Hill

Any method would, of necessity, need to look at each character of the string. A loop that calls isspace() on each character is pretty efficient. If isspace() is inlined by the compiler, then this would be darn near optimal.

The loop should, of course, abort as soon as a non-space character is seen.
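A minimal sketch of such a loop, for illustration only. The function name is_all_space is made up here, and treating an empty string as blank is an assumption rather than something the question specifies:

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: true when every character is whitespace.
// An empty string is treated as blank (an assumption).
bool is_all_space(const std::string& s) {
    // Iterating as unsigned char avoids passing negative values to isspace,
    // which would be undefined behavior.
    for (unsigned char c : s) {
        if (!std::isspace(c)) {
            return false;  // abort at the first non-space character
        }
    }
    return true;
}
```

With the inputs from the question, is_all_space("       ") would be true and is_all_space("    4  ") would be false.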

Answer by Yuushi

With a regular string, the best you can do will be of the form:

return str.find_first_not_of("\t\n ") == std::string::npos;  // str is the std::string being tested

This will be O(n) in the worst case, but without knowing anything else about the string, this will be the best you can do.
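Wrapped in a function, the idea might look like the sketch below. The name is_blank is hypothetical, and the character set is widened (my choice, not the answer's) to the six characters that isspace accepts in the default "C" locale:

```cpp
#include <string>

// Hypothetical helper: true when the string holds nothing but whitespace.
// The set " \t\n\v\f\r" matches what isspace accepts in the "C" locale.
bool is_blank(const std::string& s) {
    return s.find_first_not_of(" \t\n\v\f\r") == std::string::npos;
}
```

find_first_not_of stops at the first character outside the set, so a string such as "    4  " is rejected without scanning past the 4.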

Answer by Preet Kukreti

You are assuming that regex doesn't iterate over the string. Regex is probably much heavier than a linear search, since it might build an FSM and traverse the string based on that.
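For comparison, a regex version could be sketched as follows, assuming C++11's <regex>. The helper name is_blank_regex and the pattern \s* are my own choices, not the asker's pattern:

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when the whole string is whitespace (or empty).
bool is_blank_regex(const std::string& s) {
    // Build the pattern once; constructing a std::regex on every call
    // would add to the cost.
    static const std::regex only_space("\\s*");
    // regex_match succeeds only if the entire string matches the pattern.
    return std::regex_match(s, only_space);
}
```

This still has to visit every character, so it is unlikely to beat a plain loop. Measure before relying on it.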

The only way to speed it up further and make it a near-constant-time operation is to amortize the cost: iterate on every update to the string and cache a bool/bit that tracks whether there is a space-like character, return that cached value if no changes have been made since, and update the bit whenever you write to the string. However, this slows down modifying operations in order to speed up your custom has_space().
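One way to sketch that caching idea is a small wrapper class. Everything here is hypothetical (the class name TrackedString, its methods, and the two write operations it supports). Instead of a single bool it keeps a count of non-whitespace characters, which makes the "only whitespace" check from the question O(1):

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical wrapper (not from the answer): every write updates a running
// count of non-whitespace characters, so the blank check needs no scan.
class TrackedString {
public:
    // Replacing the contents scans the new text once.
    void assign(std::string s) {
        text_ = std::move(s);
        non_space_ = count_non_space(text_);
    }

    // Appending scans only the appended part.
    void append(const std::string& s) {
        text_ += s;
        non_space_ += count_non_space(s);
    }

    // O(1): uses the cached count instead of iterating.
    bool is_blank() const { return non_space_ == 0; }

    const std::string& str() const { return text_; }

private:
    static std::size_t count_non_space(const std::string& s) {
        std::size_t n = 0;
        for (unsigned char c : s) {
            if (!std::isspace(c)) ++n;
        }
        return n;
    }

    std::string text_;
    std::size_t non_space_ = 0;
};
```

The trade-off is the one described above: every write pays for a scan of the written text, so this only helps if the check runs far more often than the string changes.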

Answer by Jerry Coffin

For what it's worth, a locale has a function (scan_is) to do things like this:

#include <locale>
#include <iostream>
#include <iomanip>
#include <string>

int main() {

    std::string inputs[] = { 
        "all lower",
        "including a space"
    };

    std::locale loc(std::locale::classic());

    std::ctype_base::mask m = std::ctype_base::space;

    for (int i=0; i<2; i++) {
        char const *pos;
        // data() and size() avoid dereferencing end(), which is undefined behavior
        char const *b = inputs[i].data();
        char const *e = b + inputs[i].size();

        std::cout << "Input: " << std::setw(20) << inputs[i] << ":\t";
        if ((pos=std::use_facet<std::ctype<char> >(loc).scan_is(m, b, e)) == e)
            std::cout << "No space character\n";
        else
            std::cout << "First space character at position " << pos - b << "\n";
    }
    return 0;
}

It's probably open to (a lot of) question whether this gives much (if any) real advantage over using isspace in a loop (or using std::find_if).
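For reference, the standard-algorithm alternative might look like this sketch, assuming C++11 lambdas. The helper names all_space and first_space are made up for illustration:

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: true when every character is whitespace.
bool all_space(const std::string& s) {
    return std::all_of(s.begin(), s.end(),
                       [](unsigned char c) { return std::isspace(c) != 0; });
}

// Hypothetical helper: index of the first whitespace character,
// or std::string::npos if there is none.
std::string::size_type first_space(const std::string& s) {
    auto it = std::find_if(s.begin(), s.end(),
                           [](unsigned char c) { return std::isspace(c) != 0; });
    return it == s.end()
               ? std::string::npos
               : static_cast<std::string::size_type>(it - s.begin());
}
```

Both algorithms stop at the first character that settles the answer, just like a hand-written loop with an early exit.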

Answer by Gabriel

You can also use find_first_not_of if you want all the characters to be in a given list. Then you can avoid writing the loop yourself.

Here is an example

#include <iostream>
#include <string>
using namespace std;
int main()
{
    string str1="      ";
    string str2="      u    ";
    // Each flag is true when the string contains only characters from the given list.
    bool OnlyListed1=(str1.find_first_not_of("\t\n ")==string::npos);
    bool OnlyListed2=(str2.find_first_not_of("\t\n ")==string::npos);
    bool OnlyListed3=(str2.find_first_not_of("\t\n u")==string::npos);
    cout << OnlyListed1 <<endl;
    cout << OnlyListed2 <<endl;
    cout << OnlyListed3 <<endl;
    return 0;
}

Output:
1: because str1 contains only blanks
0: because u is not in the list "\t\n "
1: because str2 contains only blanks and a u, all of which are in the list "\t\n u"

Hope it helps. Tell me if you have any questions.
