C++: how to test a string for letters only

Notice: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/7616867/

Time: 2020-08-28 17:13:27  Source: igfitidea

how to test a string for letters only

c++

Asked by miatech

How could I test that a string contains only valid characters, such as the letters a-z?...

string name;

cout << "Enter your name";
cin >> name;

string letters = "qwertyuiopasdfghjklzxcvbnm";

string::iterator it;

for(it = name.begin(); it != name.end(); it++)
{
  size_t found = letters.find(*it);  // string::npos if *it is not in letters
}

Accepted answer by GreenScape

STL way:

struct TestFunctor
{
  bool stringIsCorrect;
  TestFunctor()
  :stringIsCorrect(true)
  {}

  void operator() (char ch)
  {
    if(stringIsCorrect && !((ch <= 'z' && ch >= 'a') || (ch <= 'Z' && ch >= 'A')))
      stringIsCorrect = false;
  }
};

TestFunctor functor;

// for_each takes the functor by value and returns the copy it used
functor = for_each(name.begin(), name.end(), functor);

if(functor.stringIsCorrect)
  cout << "Yay";
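
The detail that matters with a stateful functor is that std::for_each works on a copy and hands that copy back as its return value. The following is a minimal sketch of that behaviour; NonLetterCounter and count_non_letters are hypothetical names introduced only for illustration:

```cpp
#include <algorithm>
#include <string>

// Hypothetical functor: counts characters outside a-z and A-Z.
struct NonLetterCounter {
    int count;
    NonLetterCounter() : count(0) {}
    void operator()(char ch) {
        bool is_letter = (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z');
        if (!is_letter) ++count;
    }
};

// std::for_each operates on a copy of the functor and returns that copy,
// so the accumulated state has to be read from the return value.
int count_non_letters(const std::string& s) {
    return std::for_each(s.begin(), s.end(), NonLetterCounter()).count;
}
```

A result of 0 means the string holds only ASCII letters; like the functor above, this assumes an ASCII-compatible character set.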

Answer by Jon Purdy

First, using std::cin >> name will fail if the user enters John Smith, because >> splits input on whitespace characters and so reads only John. You should use std::getline() to get the name:

std::getline(std::cin, name);
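
To see the difference without typing at a console, the two reading styles can be compared on a string stream. This is only a sketch; read_token and read_line are hypothetical helper names, not part of the original answer:

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads one whitespace-delimited token, like `cin >> name`.
std::string read_token(std::istream& in) {
    std::string token;
    in >> token;
    return token;
}

// Hypothetical helper: reads everything up to the next newline, like std::getline.
std::string read_line(std::istream& in) {
    std::string line;
    std::getline(in, line);
    return line;
}
```

Fed the input John Smith, read_token stops at the space while read_line keeps the whole name.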

Here we go…

There are a number of ways to check that a string contains only alphabetic characters. The simplest is probably s.find_first_not_of(t), which returns the index of the first character in s that is not in t:

bool contains_non_alpha
    = name.find_first_not_of("abcdefghijklmnopqrstuvwxyz") != std::string::npos;
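
Because find_first_not_of returns a position rather than a yes/no answer, it can also report where the first offending character is. A small sketch, where first_non_lowercase is a hypothetical name used only for illustration:

```cpp
#include <string>

// Hypothetical helper: index of the first character that is not a lowercase
// ASCII letter, or std::string::npos when every character is in a-z.
std::string::size_type first_non_lowercase(const std::string& s) {
    return s.find_first_not_of("abcdefghijklmnopqrstuvwxyz");
}
```

Comparing the result against std::string::npos gives the boolean used above; any other value is the index of the first character that failed.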

That rapidly becomes cumbersome, however. To also match uppercase alphabetic characters, you'd have to add 26 more characters to that string! Instead, you may want to use a combination of find_if from the <algorithm> header and std::isalpha from <cctype>:

#include <algorithm>
#include <cctype>

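struct non_alpha {
    bool operator()(char c) const {
        // Cast first: std::isalpha is undefined for negative char values.
        return !std::isalpha(static_cast<unsigned char>(c));
    }
};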

bool contains_non_alpha
    = std::find_if(name.begin(), name.end(), non_alpha()) != name.end();

find_if searches a range for a value that matches a predicate, in this case a functor non_alpha that returns whether its argument is a non-alphabetic character. If find_if(name.begin(), name.end(), ...) returns name.end(), then no match was found.

But there's more!

To do this as a one-liner, you can use the adaptors from the <functional> header:

#include <algorithm>
#include <cctype>
#include <functional>

bool contains_non_alpha
    = std::find_if(name.begin(), name.end(),
                   std::not1(std::ptr_fun((int(*)(int))std::isalpha))) != name.end();

std::not1 produces a function object that returns the logical inverse of its input; by supplying a pointer to a function with std::ptr_fun(...), we can tell std::not1 to produce the logical inverse of std::isalpha. The cast (int(*)(int)) is there to select the overload of std::isalpha which takes an int (treated as a character) and returns an int (treated as a Boolean). Note that std::ptr_fun was deprecated in C++11 and removed in C++17, and std::not1 was deprecated in C++17 and removed in C++20, so this form only suits older code.

Or, if you can use a C++11 compiler, using a lambda cleans this up a lot:

#include <algorithm>
#include <cctype>

bool contains_non_alpha
    = std::find_if(name.begin(), name.end(),
                   [](char c) { return !std::isalpha(static_cast<unsigned char>(c)); }) != name.end();

[](char c) -> bool { ... } denotes a function that accepts a character and returns a bool. In our case we can omit the -> bool return type because the function body consists of only a return statement. This works just the same as the previous examples, except that the function object can be specified much more succinctly.
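
Wrapped in a function, the lambda version might look like the sketch below. The function name is reused from the snippets above purely for illustration. The cast to unsigned char is an addition: std::isalpha has undefined behaviour when passed a value that is neither EOF nor representable as unsigned char, which can happen when char is signed.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// True when at least one character of `s` is not alphabetic
// according to the current C locale.
bool contains_non_alpha(const std::string& s) {
    return std::find_if(s.begin(), s.end(), [](char c) {
        // Convert to unsigned char before calling std::isalpha.
        return !std::isalpha(static_cast<unsigned char>(c));
    }) != s.end();
}
```

Note that an empty string contains no non-alphabetic character, so this returns false for it; add an explicit empty() check if an empty name should be rejected.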

And (almost) finally…

In C++11 you can also use a regular expression to perform the match:

#include <regex>

bool contains_non_alpha
    = !std::regex_match(name, std::regex("^[A-Za-z]+$"));
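
As a reusable function this could be sketched as follows; is_letters_regex is a hypothetical name. Since std::regex_match only succeeds when the entire string matches the pattern, the ^ and $ anchors are optional and are left out here:

```cpp
#include <regex>
#include <string>

// True when `s` is one or more ASCII letters and nothing else.
bool is_letters_regex(const std::string& s) {
    // Built once and reused; constructing a std::regex is comparatively costly.
    static const std::regex letters("[A-Za-z]+");
    return std::regex_match(s, letters);
}
```

Because of the + quantifier, an empty string does not match, unlike the iterator-based checks, which treat an empty string as having no bad characters.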

But of course…

None of these solutions addresses the issue of locale or character encoding! For a locale-aware version of isalpha(), you'd need to use the C++ header <locale>:

#include <locale>

bool isalpha(char c) {
    std::locale locale; // Default locale.
    return std::use_facet<std::ctype<char> >(locale).is(std::ctype<char>::alpha, c);
}
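
The <locale> header also provides a two-argument std::isalpha(c, loc) that performs the same facet lookup. The sketch below passes the locale in explicitly so the caller decides which rules apply; is_alpha_in is a hypothetical name chosen for illustration:

```cpp
#include <locale>
#include <string>

// True when every character of `s` is alphabetic in the given locale.
bool is_alpha_in(const std::string& s, const std::locale& loc) {
    for (std::string::const_iterator it = s.begin(); it != s.end(); ++it) {
        // Two-argument overload from <locale>, not the one from <cctype>.
        if (!std::isalpha(*it, loc)) return false;
    }
    return true;
}
```

For example, is_alpha_in(name, std::locale::classic()) applies the rules of the "C" locale regardless of what the global locale has been set to.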

Ideally we would use char32_t, but ctype doesn't seem to be able to classify it, so we're stuck with char. Lucky for us we can dance around the issue of locale entirely, because you're probably only interested in English letters. There's a handy header-only library called UTF8-CPP that will let us do what we need to do in a more encoding-safe way. First we define our version of isalpha() that uses UTF-32 code points:

bool isalpha(uint32_t c) {
    return (c >= 0x0041 && c <= 0x005A)
        || (c >= 0x0061 && c <= 0x007A);
}

Then we can use the utf8::iterator adaptor to adapt the basic_string::iterator from octets into UTF-32 code points:

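#include <utf8.h>

// utf8::iterator is a class template over the underlying octet iterator.
typedef utf8::iterator<std::string::iterator> utf8_iterator;
utf8_iterator first(name.begin(), name.begin(), name.end());
utf8_iterator last(name.end(), name.begin(), name.end());

// Compare against the adapted end iterator, not name.end().
bool contains_non_alpha
    = std::find_if(first, last,
                   [](uint32_t c) { return !isalpha(c); }) != last;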

For slightly better performance at the cost of safety, you can use utf8::unchecked::iterator:

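#include <utf8.h>

typedef utf8::unchecked::iterator<std::string::iterator> unchecked_iterator;
unchecked_iterator first(name.begin());
unchecked_iterator last(name.end());

// Compare against the adapted end iterator, not name.end().
bool contains_non_alpha
    = std::find_if(first, last,
                   [](uint32_t c) { return !isalpha(c); }) != last;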

This will fail on some invalid input.

Using UTF8-CPP in this way assumes that the host encoding is UTF-8, or a compatible encoding such as ASCII. In theory this is still an imperfect solution, but in practice it will work on the vast majority of platforms.
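
If only the English letters matter and pulling in a library is not an option, a byte-wise check is a possible alternative. This is a swapped-in technique, not UTF8-CPP: it relies on the fact that UTF-8 encodes U+0000 to U+007F as single bytes and sets the high bit on every byte of a multi-byte sequence, so no byte of a non-ASCII character can be mistaken for a letter. Both function names are hypothetical:

```cpp
#include <string>

// Hypothetical helper: true for the bytes 'A'-'Z' (0x41-0x5A) and 'a'-'z' (0x61-0x7A).
bool is_ascii_letter(unsigned char byte) {
    return (byte >= 0x41 && byte <= 0x5A) || (byte >= 0x61 && byte <= 0x7A);
}

// True when the UTF-8 text holds anything other than ASCII letters.
// Bytes of multi-byte sequences are all >= 0x80, so they count as non-letters.
bool contains_non_ascii_letter(const std::string& utf8_text) {
    for (std::string::size_type i = 0; i < utf8_text.size(); ++i) {
        if (!is_ascii_letter(static_cast<unsigned char>(utf8_text[i]))) return true;
    }
    return false;
}
```

Unlike the checked UTF8-CPP iterator, this does not validate the encoding; it only answers whether every byte is an English letter.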

I hope this answer is finally complete!

Answer by Lev

If you use Boost, you can use the boost::algorithm::is_alpha predicate to perform this check. Here is how to use it:

const char* text = "hello world";
bool isAlpha = all( text, is_alpha() );  // false here: the space is not a letter

Update: As the documentation states, "all() checks all elements of a container to satisfy a condition specified by a predicate". The call to all() is needed here, since is_alpha() actually operates on characters.

I hope this helps.

Answer by Alex Reece

I would suggest investigating the ctype library: http://www.cplusplus.com/reference/std/locale/ctype/

For example, the function is (see ctype.is) is a way to check properties of characters in a locale-sensitive manner:

#include <locale>
using namespace std;
bool is_alpha(char c) {
    locale loc;  // copy of the current global locale
    bool alpha = use_facet< ctype<char> >(loc).is( ctype<char>::alpha, c);
    return alpha;
}

Answer by Pubby

  for (string::iterator it=name.begin(); it!=name.end(); ++it)
  {
    if ((*it) < 0x61 || (*it) > 0x7A)  // outside 'a' (0x61) .. 'z' (0x7A)
    {
      // string contains characters other than a-z
    }
  }

Answer by Galik

C++11 approach using std::all_of:

bool all_alpha = std::all_of(std::begin(name), std::end(name),
    [](char c){ return std::isalpha(static_cast<unsigned char>(c)); });

std::all_of will only return true if all of the elements are true according to the supplied predicate function.
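
As a complete function this might be sketched as follows; is_alpha_only is a hypothetical name, and the cast to unsigned char is an added precaution because std::isalpha is undefined for negative char values:

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// True when every character of `s` is alphabetic in the current C locale.
bool is_alpha_only(const std::string& s) {
    return std::all_of(s.begin(), s.end(), [](char c) {
        return std::isalpha(static_cast<unsigned char>(c)) != 0;
    });
}
```

One edge case to keep in mind: std::all_of returns true for an empty range, so an empty string passes this check unless it is rejected separately.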
