Notice: this page is a side-by-side Chinese/English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original: http://stackoverflow.com/questions/735204/

Time: 2020-08-27 17:01:13  Source: igfitidea

Convert a String In C++ To Upper Case

c++ string

Asked by OrangeAlmondSoap

How can one convert a string to upper case? The examples I have found from googling only deal with chars.

Accepted answer by Tony Edgecombe

Boost string algorithms:

#include <boost/algorithm/string.hpp>
#include <string>

std::string str = "Hello World";

boost::to_upper(str);

std::string newstr = boost::to_upper_copy<std::string>("Hello World");

Answer by Pierre

#include <algorithm>
#include <string>

std::string str = "Hello World";
std::transform(str.begin(), str.end(),str.begin(), ::toupper);

Answer by Thanasis Papoutsidakis

Short solution using C++11 and toupper().

for (auto & c: str) c = toupper(c);

Answer by dirkgently

struct convert {
   void operator()(char& c) { c = toupper((unsigned char)c); }
};

// ... 
string uc_str;
for_each(uc_str.begin(), uc_str.end(), convert());

Note: A couple of problems with the top solution:

21.5 Null-terminated sequence utilities

The contents of these headers shall be the same as the Standard C Library headers <ctype.h>, <wctype.h>, <string.h>, <wchar.h>, and <stdlib.h> [...]

  • Which means that the cctype members may well be macros not suitable for direct consumption in standard algorithms.

  • Another problem with the same example is that it does not cast the argument or verify that it is non-negative; this is especially dangerous on systems where plain char is signed. (The reason being: if this is implemented as a macro it will probably use a lookup table, and your argument indexes into that table. A negative index will give you UB.)
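
A minimal sketch of one way to sidestep both problems: wrap the call in a lambda, so that no macro or overloaded name is handed to the algorithm, and convert the argument to unsigned char before it reaches toupper. The helper name to_upper_safe is hypothetical and not part of the answer.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: upper-cases in place using the current C locale.
inline void to_upper_safe(std::string& s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) {   // char -> unsigned char: never negative
                       return static_cast<char>(std::toupper(c));
                   });
}
```

Taking the lambda parameter as unsigned char performs the conversion implicitly for every element, so the call site stays as short as the original one-liner.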

Answer by Peter Cordes

This problem is vectorizable with SIMD for the ASCII character set.



Speedup comparisons:

Preliminary testing with x86-64 gcc 5.2 -O3 -march=native on a Core2Duo (Merom). The same string of 120 characters (mixed lowercase and non-lowercase ASCII), converted in a loop 40M times (with no cross-file inlining, so the compiler can't optimize away or hoist any of it out of the loop). Same source and dest buffers, so no malloc overhead or memory/cache effects: data is hot in L1 cache the whole time, and we're purely CPU-bound.

  • boost::to_upper_copy<char*, std::string>(): 198.0s. Yes, Boost 1.58 on Ubuntu 15.10 is really this slow. I profiled and single-stepped the asm in a debugger, and it's really, really bad: there's a dynamic_cast of a locale variable happening per character!!! (dynamic_cast takes multiple calls to strcmp). This happens with LANG=C and with LANG=en_CA.UTF-8.

    I didn't test using a RangeT other than std::string. Maybe the other form of to_upper_copy optimizes better, but I think it will always new/malloc space for the copy, so it's harder to test. Maybe something I did differs from a normal use-case, and maybe normally stopped g++ can hoist the locale setup stuff out of the per-character loop. My loop reading from a std::string and writing to a char dstbuf[4096] makes sense for testing.

  • loop calling glibc toupper: 6.67s (not checking the int result for potential multi-byte UTF-8, though. This matters for Turkish.)

  • ASCII-only loop: 8.79s (my baseline version for the results below.) Apparently a table-lookup is faster than a cmov, with the table hot in L1 anyway.
  • ASCII-only auto-vectorized: 2.51s. (120 chars is half way between worst case and best case, see below)
  • ASCII-only manually vectorized: 1.35s

See also this question about toupper() being slow on Windows when a locale is set.

I was shocked that Boost is an order of magnitude slower than the other options. I double-checked that I had -O3 enabled, and even single-stepped the asm to see what it was doing. It's almost exactly the same speed with clang++ 3.8. It has huge overhead inside the per-character loop. The perf record / report result (for the cycles perf event) is:

  32.87%  flipcase-clang-  libstdc++.so.6.0.21   [.] _ZNK10__cxxabiv121__vmi_class_type_info12__do_dyncastElNS_17__class_type_info10__sub_kindEPKS1_PKvS4_S6_RNS1_16
  21.90%  flipcase-clang-  libstdc++.so.6.0.21   [.] __dynamic_cast                                                                                                 
  16.06%  flipcase-clang-  libc-2.21.so          [.] __GI___strcmp_ssse3                                                                                            
   8.16%  flipcase-clang-  libstdc++.so.6.0.21   [.] _ZSt9use_facetISt5ctypeIcEERKT_RKSt6locale                                                                     
   7.84%  flipcase-clang-  flipcase-clang-boost  [.] _Z16strtoupper_boostPcRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE                                   
   2.20%  flipcase-clang-  libstdc++.so.6.0.21   [.] strcmp@plt                                                                                                     
   2.15%  flipcase-clang-  libstdc++.so.6.0.21   [.] __dynamic_cast@plt                                                                                             
   2.14%  flipcase-clang-  libstdc++.so.6.0.21   [.] _ZNKSt6locale2id5_M_idEv                                                                                       
   2.11%  flipcase-clang-  libstdc++.so.6.0.21   [.] _ZNKSt6locale2id5_M_idEv@plt                                                                                   
   2.08%  flipcase-clang-  libstdc++.so.6.0.21   [.] _ZNKSt5ctypeIcE10do_toupperEc                                                                                  
   2.03%  flipcase-clang-  flipcase-clang-boost  [.] _ZSt9use_facetISt5ctypeIcEERKT_RKSt6locale@plt                                                                 
   0.08% ...


Autovectorization

Gcc and clang will only auto-vectorize loops when the iteration count is known ahead of the loop (i.e. search loops like a plain-C implementation of strlen won't autovectorize).

Thus, for strings small enough to fit in cache, we get a significant speedup for strings ~128 chars long from doing strlen first. This won't be necessary for explicit-length strings (like C++ std::string).

// char, not int, is essential: otherwise gcc unpacks to vectors of int!  Huge slowdown.
char ascii_toupper_char(char c) {
    return ('a' <= c && c <= 'z') ? c^0x20 : c;    // ^ autovectorizes to PXOR: runs on more ports than paddb
}

// gcc can only auto-vectorize loops when the number of iterations is known before the first iteration.  strlen gives us that
size_t strtoupper_autovec(char *dst, const char *src) {
    size_t len = strlen(src);
    for (size_t i=0 ; i<len ; ++i) {
        dst[i] = ascii_toupper_char(src[i]);  // gcc does the vector range check with psubusb / pcmpeqb instead of pcmpgtb
    }
    return len;
}
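
For an explicit-length string the trip count is already known, so no strlen pass is needed. The following is a minimal sketch under the same ASCII-only assumption; ascii_toupper_inplace is a hypothetical name, and whether a particular compiler actually vectorizes the loop is not guaranteed.

```cpp
#include <cstddef>
#include <string>

// ASCII-only: bytes outside 'a'..'z' are left untouched.
inline char ascii_toupper_char(char c) {
    return ('a' <= c && c <= 'z') ? static_cast<char>(c ^ 0x20) : c;
}

// Hypothetical helper: the length comes from the string itself,
// so the iteration count is known before the loop starts.
inline void ascii_toupper_inplace(std::string& s) {
    const std::size_t len = s.size();
    char* p = &s[0];
    for (std::size_t i = 0; i < len; ++i) {
        p[i] = ascii_toupper_char(p[i]);
    }
}
```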

Any decent libc will have an efficient strlen that's much faster than looping a byte at a time, so separate vectorized strlen and toupper loops are faster.

Baseline: a loop that checks for a terminating 0 on the fly.

Times for 40M iterations, on a Core2 (Merom) 2.4GHz. gcc 5.2 -O3 -march=native. (Ubuntu 15.10). dst != src (so we make a copy), but they don't overlap (and aren't nearby). Both are aligned.

  • 15 char string: baseline: 1.08s. autovec: 1.34s
  • 16 char string: baseline: 1.16s. autovec: 1.52s
  • 127 char string: baseline: 8.91s. autovec: 2.98s // non-vector cleanup has 15 chars to process
  • 128 char string: baseline: 9.00s. autovec: 2.06s
  • 129 char string: baseline: 9.04s. autovec: 2.07s // non-vector cleanup has 1 char to process

Some results are a bit different with clang.

The microbenchmark loop that calls the function is in a separate file. Otherwise it inlines and strlen() gets hoisted out of the loop, and it runs dramatically faster, esp. for 16 char strings (0.187s).

This has the major advantage that gcc can auto-vectorize it for any architecture, but the major disadvantage that it's slower for the usually-common case of small strings.

So there are big speedups, but compiler auto-vectorization doesn't make great code, esp. for cleanup of the last up-to-15 characters.

Manual vectorization with SSE intrinsics:

Based on my case-flip function that inverts the case of every alphabetic character. It takes advantage of the "unsigned compare trick", where you can do low < a && a <= high with a single unsigned comparison by range shifting, so that any value less than low wraps to a value that's greater than high. (This works if low and high aren't too far apart.)

SSE only has a signed compare-greater, but we can still use the "unsigned compare" trick by range-shifting to the bottom of the signed range: Subtract 'a'+128, so the alphabetic characters range from -128 to -128+25 (-128+'z'-'a')

Note that adding 128 and subtracting 128 are the same thing for 8bit integers. There's nowhere for the carry to go, so it's just xor (carryless add), flipping the high bit.

#include <immintrin.h>

__m128i upcase_si128(__m128i src) {
    // The above 2 paragraphs were comments here
    __m128i rangeshift = _mm_sub_epi8(src, _mm_set1_epi8('a'+128));
    __m128i nomodify   = _mm_cmpgt_epi8(rangeshift, _mm_set1_epi8(-128 + 25));  // 0:lower case   -1:anything else (upper case or non-alphabetic).  25 = 'z' - 'a'

    __m128i flip  = _mm_andnot_si128(nomodify, _mm_set1_epi8(0x20));            // 0x20:lcase    0:non-lcase

    // just mask the XOR-mask so elements are XORed with 0 instead of 0x20
    return          _mm_xor_si128(src, flip);
    // it's easier to xor with 0x20 or 0 than to AND with ~0x20 or 0xFF
}
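
The range-shift logic can be sanity-checked in scalar code. The sketch below models a single byte lane of the function above and is meant to agree with the plain two-comparison test for all 256 byte values. upcase_byte is a hypothetical name, and the wrap-around subtraction is done on uint8_t to keep it well defined.

```cpp
#include <cstdint>

// Scalar model of one lane of the vector function (hypothetical helper).
inline std::uint8_t upcase_byte(std::uint8_t c) {
    // Subtract 'a'+128 modulo 256: 'a'..'z' land on -128..-128+25 as signed bytes.
    const std::uint8_t shifted = static_cast<std::uint8_t>(c - ('a' + 128));
    // Reinterpret the byte as a signed 8-bit value.
    const int as_signed = (shifted < 128) ? shifted : static_cast<int>(shifted) - 256;
    const bool nomodify = as_signed > (-128 + 25);   // true: not a lower-case letter
    return nomodify ? c : static_cast<std::uint8_t>(c ^ 0x20);
}
```

Looping over every byte value and comparing against `(c >= 'a' && c <= 'z') ? c - 32 : c` is a cheap way to confirm the constants before moving to intrinsics. The same loop can confirm that adding 128, subtracting 128 and XORing with 0x80 give the same 8-bit result.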

Given this function that works for one vector, we can call it in a loop to process a whole string. Since we're already targeting SSE2, we can do a vectorized end-of-string check at the same time.

We can also do much better for the "cleanup" of the last up-to-15 bytes left over after doing vectors of 16B: upper-casing is idempotent, so re-processing some input bytes is fine. We do an unaligned load of the last 16B of the source, and store it into the dest buffer overlapping the last 16B store from the loop.

The only time this doesn't work is when the whole string is under 16B: even when dst=src, non-atomic read-modify-write is not the same thing as not touching some bytes at all, and can break multithreaded code.

We have a scalar loop for that, and also to get src aligned. Since we don't know where the terminating 0 will be, an unaligned load from src might cross into the next page and segfault. If we need any bytes in an aligned 16B chunk, it's always safe to load the whole aligned 16B chunk.

Full source: in a github gist.

// FIXME: doesn't always copy the terminating 0.
// microbenchmarks are for this version of the code (with _mm_store in the loop, instead of storeu, for Merom).
size_t strtoupper_sse2(char *dst, const char *src_begin) {
    const char *src = src_begin;
    // scalar until the src pointer is aligned
    while ( (0xf & (uintptr_t)src) && *src ) {
        *(dst++) = ascii_toupper(*(src++));
    }

    if (!*src)
        return src - src_begin;

    // current position (p) is now 16B-aligned, and we're not at the end
    int zero_positions;
    do {
        __m128i sv = _mm_load_si128( (const __m128i*)src );
        // TODO: SSE4.2 PCMPISTRI or PCMPISTRM version to combine the lower-case and '\0' detection?

        __m128i nullcheck = _mm_cmpeq_epi8(_mm_setzero_si128(), sv);
        zero_positions = _mm_movemask_epi8(nullcheck);
        // TODO: unroll so the null-byte check takes less overhead
        if (zero_positions)
            break;

        __m128i upcased = upcase_si128(sv);   // doing this before the loop break lets gcc realize that the constants are still in registers for the unaligned cleanup version.  But it leads to more wasted insns in the early-out case

        _mm_storeu_si128((__m128i*)dst, upcased);
        //_mm_store_si128((__m128i*)dst, upcased);  // for testing on CPUs where storeu is slow
        src += 16;
        dst += 16;
    } while(1);

    // handle the last few bytes.  Options: scalar loop, masked store, or unaligned 16B.
    // rewriting some bytes beyond the end of the string would be easy,
    // but doing a non-atomic read-modify-write outside of the string is not safe.
    // Upcasing is idempotent, so unaligned potentially-overlapping is a good option.

    unsigned int cleanup_bytes = ffs(zero_positions) - 1;  // excluding the trailing null
    const char* last_byte = src + cleanup_bytes;  // points at the terminating '\0'

    // FIXME: copy the terminating 0 when we end at an aligned vector boundary
    // optionally special-case cleanup_bytes == 15: final aligned vector can be used.
    if (cleanup_bytes > 0) {
        if (last_byte - src_begin >= 16) {
            // if src==dest, this load overlaps with the last store:  store-forwarding stall.  Hopefully OOO execution hides it
            __m128i sv = _mm_loadu_si128( (const __m128i*)(last_byte-15) ); // includes the \0
            _mm_storeu_si128((__m128i*)(dst + cleanup_bytes - 15), upcase_si128(sv));
        } else {
            // whole string less than 16B
            // if this is common, try 64b or even 32b cleanup with movq / movd and upcase_si128
#if 1
            for (unsigned int i = 0 ; i <= cleanup_bytes ; ++i) {
                dst[i] = ascii_toupper(src[i]);
            }
#else
            // gcc stupidly auto-vectorizes this, resulting in huge code bloat, but no measurable slowdown because it never runs
            for (int i = cleanup_bytes - 1 ;  i >= 0 ; --i) {
                dst[i] = ascii_toupper(src[i]);
            }
#endif
        }
    }

    return last_byte - src_begin;
}

Times for 40M iterations, on a Core2 (Merom) 2.4GHz. gcc 5.2 -O3 -march=native. (Ubuntu 15.10). dst != src (so we make a copy), but they don't overlap (and aren't nearby). Both are aligned.

  • 15 char string: baseline: 1.08s. autovec: 1.34s. manual: 1.29s
  • 16 char string: baseline: 1.16s. autovec: 1.52s. manual: 0.335s
  • 31 char string: manual: 0.479s
  • 127 char string: baseline: 8.91s. autovec: 2.98s. manual: 0.925s
  • 128 char string: baseline: 9.00s. autovec: 2.06s. manual: 0.931s
  • 129 char string: baseline: 9.04s. autovec: 2.07s. manual: 1.02s

(Actually timed with _mm_store in the loop, not _mm_storeu, because storeu is slower on Merom even when the address is aligned. It's fine on Nehalem and later. I've also left the code as-is for now, instead of fixing the failure to copy the terminating 0 in some cases, because I don't want to re-time everything.)

So for short strings longer than 16B, this is dramatically faster than auto-vectorized. Lengths one-less-than-a-vector-width don't present a problem. They might be a problem when operating in-place, because of a store-forwarding stall. (But note that it's still fine to process our own output, rather than the original input, because toupper is idempotent).

There's a lot of scope for tuning this for different use-cases, depending on what the surrounding code wants, and the target microarchitecture. Getting the compiler to emit nice code for the cleanup portion is tricky. Using ffs(3) (which compiles to bsf or tzcnt on x86) seems to be good, but obviously that bit needs a re-think since I noticed a bug after writing up most of this answer (see the FIXME comments).

Vector speedups for even smaller strings can be obtained with movq or movd loads/stores. Customize as necessary for your use-case.



UTF-8:

We can detect when our vector has any bytes with the high bit set, and in that case fall back to a scalar UTF-8-aware loop for that vector. The dst pointer can advance by a different amount than the src pointer, but once we get back to an aligned src pointer, we'll still just do unaligned vector stores to dst.

For text that's UTF-8, but mostly consists of the ASCII subset of UTF-8, this can be good: high performance in the common case with correct behaviour in all cases. When there's a lot of non-ASCII, it will probably be worse than staying in the scalar UTF-8 aware loop all the time, though.
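
A scalar sketch of the check that drives this dispatch, assuming the caller walks the input in 16-byte chunks and supplies its own UTF-8-aware slow path (not shown): a chunk takes the fast ASCII path only if no byte has its high bit set. chunk_is_ascii is a hypothetical name. With SSE2 the same test can be done with one _mm_movemask_epi8 on the loaded vector, since that instruction collects the high bit of every byte.

```cpp
#include <cstddef>

// True if none of the n bytes has the high bit set, i.e. the chunk is pure ASCII.
inline bool chunk_is_ascii(const char* p, std::size_t n) {
    unsigned char acc = 0;
    for (std::size_t i = 0; i < n; ++i) {
        acc |= static_cast<unsigned char>(p[i]);   // OR all bytes together
    }
    return (acc & 0x80) == 0;
}
```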

Making English faster at the expense of other languages is not a future-proof decision if the downside is significant.



Locale-aware:

In the Turkish locale (tr_TR), the correct result from toupper('i') is 'İ' (U+0130), not 'I' (plain ASCII). See Martin Bonner's comments on a question about tolower() being slow on Windows.

We can also check for an exception-list and fallback to scalar there, like for multi-byte UTF8 input characters.

With this much complexity, SSE4.2 PCMPISTRM or something might be able to do a lot of our checks in one go.

Answer by Milan Babuškov

Do you have ASCII or International characters in strings?

If it's the latter case, "uppercasing" is not that simple, and it depends on the alphabet used. There are bicameral and unicameral alphabets. Only bicameral alphabets have different characters for upper and lower case. Also, there are composite characters, like Latin capital letter 'DZ' (\u01F1 'DZ'), which use the so-called title case. This means that only the first character (D) gets changed.

I suggest you look into ICU, and the difference between Simple and Full Case Mappings. This might help:

http://userguide.icu-project.org/transforms/casemappings

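Answer by user648545

string StringToUpper(string strToConvert)
{
   for (std::string::iterator p = strToConvert.begin(); strToConvert.end() != p; ++p)
       *p = toupper(*p);

   return strToConvert;   // return the converted string, not the iterator
}

Or,

string StringToUpper(string strToConvert)
{
    std::transform(strToConvert.begin(), strToConvert.end(), strToConvert.begin(), ::toupper);

    return strToConvert;
}

Answer by Pabitra Dash

The following works for me.

#include <algorithm>
#include <string>

void  toUpperCase(std::string& str)
{
    std::transform(str.begin(), str.end(), str.begin(), ::toupper);
}

int main()
{
   std::string str = "hello";
   toUpperCase(str);   // pass the string itself: the parameter is a reference, not a pointer
}

Answer by Byron

Use a lambda.

std::string s("change my case");

auto to_upper = [] (char ch) { return std::use_facet<std::ctype<char>>(std::locale()).toupper(ch); };

std::transform(s.begin(), s.end(), s.begin(), to_upper);

Answer by Luca C.

The faster one if you use only ASCII characters:

for (size_t i = 0; str[i] != 0; i++)
  if (str[i] <= 'z' && str[i] >= 'a')
    str[i] -= 32;

Please note that this code runs faster but only works on ASCII and is not an "abstract" solution.

If you need UNICODE solutions or more conventional and abstract solutions, go for other answers and work with methods of C++ strings.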