Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/198199/

Time: 2020-08-27 13:38:13  Source: igfitidea

How do you reverse a string in place in C or C++?

Tags: c++, c, string, reverse

Asked by Greg Rogers

How do you reverse a string in C or C++ without requiring a separate buffer to hold the reversed string?

Accepted answer by Anders Eurenius

The standard algorithm is to use pointers to the start / end, and walk them inward until they meet or cross in the middle. Swap as you go.

Reverse ASCII string, i.e. a 0-terminated array where every character fits in 1 char. (Or other non-multibyte character sets).

void strrev(char *head)
{
  if (!head || !*head) return;  // NULL or empty: nothing to do (and --tail would step before the buffer)
  char *tail = head;
  while(*tail) ++tail;    // find the 0 terminator, like head+strlen
  --tail;               // tail points to the last real char
                        // head still points to the first
  for( ; head < tail; ++head, --tail) {
      // walk pointers inwards until they meet or cross in the middle
      char h = *head, t = *tail;
      *head = t;           // swapping as we go
      *tail = h;
  }
}

// test program that reverses its args
#include <stdio.h>

int main(int argc, char **argv)
{
  do {
    printf("%s ",  argv[argc-1]);
    strrev(argv[argc-1]);
    printf("%s\n", argv[argc-1]);
  } while(--argc);

  return 0;
}

The same algorithm works for integer arrays with known length; just use tail = start + length - 1 instead of the end-finding loop.
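
As a minimal sketch of that variant, assuming the caller passes the element count explicitly (the name reverse_ints is made up for this illustration and is not from the original answer):

```c
#include <stddef.h>

/* Reverse `length` ints in place. Hypothetical helper for illustration. */
void reverse_ints(int *start, size_t length)
{
    if (!start || length < 2) return;    /* nothing to swap */
    int *tail = start + length - 1;      /* replaces the end-finding loop */
    for (; start < tail; ++start, --tail) {
        int tmp = *start;                /* swap through a temporary */
        *start = *tail;
        *tail = tmp;
    }
}
```

Calling reverse_ints(a, 5) on an array holding 1, 2, 3, 4, 5 should leave it holding 5, 4, 3, 2, 1. The early return for length < 2 also keeps tail from being computed as start - 1 when the array is empty.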

(Editor's note: this answer originally used XOR-swap for this simple version, too. Fixed for the benefit of future readers of this popular question. XOR-swap is highly not recommended; it is hard to read and makes your code compile less efficiently. You can see on the Godbolt compiler explorer how much more complicated the asm loop body is when XOR-swap is compiled for x86-64 with gcc -O3.)

Ok, fine, let's fix the UTF-8 chars...

(This is the XOR-swap thing. Take care to note that you must avoid swapping with self, because if *p and *q are the same location you'll zero it with a^a==0. XOR-swap depends on having two distinct locations, using them each as temporary storage.)

Editor's note: you can replace SWP with a safe inline function using a tmp variable.
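
One way that replacement might look, as a sketch only: swap_char and strrev_tmp are names invented here, not part of the original answer, and the byte-wise reversal is otherwise the same.

```c
#include <string.h>

/* Swap two chars through a temporary. Unlike XOR-swap, this is
 * safe even when a and b point to the same location. */
static inline void swap_char(char *a, char *b)
{
    char tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Byte-wise in-place reversal using swap_char instead of SWP.
 * Hypothetical name, chosen to avoid clashing with strrev. */
void strrev_tmp(char *p)
{
    if (!p || !*p) return;           /* NULL or empty: nothing to do */
    char *q = p + strlen(p) - 1;     /* last real char */
    for (; p < q; ++p, --q)
        swap_char(p, q);
}
```

Because the swap goes through tmp, calling swap_char with both arguments pointing at the same char leaves the value unchanged rather than zeroing it.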

#include <stdio.h>

#define SWP(x,y) (x^=y, y^=x, x^=y)

void strrev(char *p)
{
  char *q = p;
  if (!p || !*p) return; /* NULL or empty string: nothing to swap */
  while(*q) ++q; /* find eos */
  for(--q; p < q; ++p, --q) SWP(*p, *q);
}

void strrev_utf8(char *p)
{
  char *q = p;
  if (!p || !*p) return; /* NULL or empty string: nothing to reverse */
  strrev(p); /* call base case */

  /* Ok, now fix bass-ackwards UTF chars. */
  while(q && *q) ++q; /* find eos */
  while(p < --q)
    switch( (*q & 0xF0) >> 4 ) {
    case 0xF: /* U+010000-U+10FFFF: four bytes. */
      SWP(*(q-0), *(q-3));
      SWP(*(q-1), *(q-2));
      q -= 3;
      break;
    case 0xE: /* U+000800-U+00FFFF: three bytes. */
      SWP(*(q-0), *(q-2));
      q -= 2;
      break;
    case 0xC: /* fall-through */
    case 0xD: /* U+000080-U+0007FF: two bytes. */
      SWP(*(q-0), *(q-1));
      q--;
      break;
    }
}

int main(int argc, char **argv)
{
  do {
    printf("%s ",  argv[argc-1]);
    strrev_utf8(argv[argc-1]);
    printf("%s\n", argv[argc-1]);
  } while(--argc);

  return 0;
}
  • Why, yes, if the input is borked, this will cheerfully swap outside the place.
  • Useful link when vandalising in the UNICODE: http://www.macchiato.com/unicode/chart/
  • Also, UTF-8 over 0x10000 is untested (as I don't seem to have any font for it, nor the patience to use a hexeditor)
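
To avoid swapping outside the buffer on malformed input, one option is to validate first and only then reverse. The sketch below is a different technique from the answer's code: it reverses the bytes of each multi-byte sequence, then reverses the whole buffer. The names reverse_bytes, utf8_seq_len and strrev_utf8_checked are made up for this illustration, and the check is structural only (lead byte plus the right number of continuation bytes); it does not reject overlong encodings or surrogates.

```c
#include <stddef.h>
#include <string.h>

/* Reverse the bytes in [lo, hi] using a temporary (no XOR-swap). */
static void reverse_bytes(char *lo, char *hi)
{
    while (lo < hi) {
        char tmp = *lo;
        *lo++ = *hi;
        *hi-- = tmp;
    }
}

/* Sequence length implied by a UTF-8 lead byte, or 0 if the byte
 * cannot start a sequence (e.g. a stray continuation byte). */
static size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 0;
}

/* Returns 0 on success, -1 on NULL or structurally malformed input
 * (in which case the buffer is left untouched). */
int strrev_utf8_checked(char *s)
{
    if (!s) return -1;
    size_t n = strlen(s);

    /* Pass 1: validate without modifying anything. */
    for (size_t i = 0; i < n; ) {
        size_t len = utf8_seq_len((unsigned char)s[i]);
        if (len == 0 || len > n - i) return -1;
        for (size_t k = 1; k < len; ++k)
            if (((unsigned char)s[i + k] & 0xC0) != 0x80) return -1;
        i += len;
    }
    if (n == 0) return 0;

    /* Pass 2: reverse the bytes inside each sequence. */
    for (size_t i = 0; i < n; ) {
        size_t len = utf8_seq_len((unsigned char)s[i]);
        reverse_bytes(s + i, s + i + len - 1);
        i += len;
    }

    /* Pass 3: reverse the whole buffer, which restores each sequence. */
    reverse_bytes(s, s + n - 1);
    return 0;
}
```

This reverses code points, not user-perceived characters, so combining marks would still end up attached to the wrong base character.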

Examples:

$ ./strrev Räksmörgås ░▒▓○◔◑◕●

░▒▓○◔◑◕● ●◕◑◔○▓▒░

Räksmörgås sågrömskäR

./strrev verrts/.

Answer by Greg Rogers

#include <algorithm>
std::reverse(str.begin(), str.end());

This is the simplest way in C++.

Answer by Greg Rogers

Read Kernighan and Ritchie

#include <string.h>

void reverse(char s[])
{
    int length = strlen(s) ;
    int c, i, j;

    for (i = 0, j = length - 1; i < j; i++, j--)
    {
        c = s[i];
        s[i] = s[j];
        s[j] = c;
    }
}

Answer by slashdottir

Reverse a string in place (visualization):

[Image: Reverse a string in place]

Answer by Chris Conway

Non-evil C, assuming the common case where the string is a null-terminated char array:

#include <stddef.h>
#include <string.h>

/* PRE: str must be either NULL or a pointer to a 
 * (possibly empty) null-terminated string. */
void strrev(char *str) {
  char temp, *end_ptr;

  /* If str is NULL or empty, do nothing */
  if( str == NULL || !(*str) )
    return;

  end_ptr = str + strlen(str) - 1;

  /* Swap the chars */
  while( end_ptr > str ) {
    temp = *str;
    *str = *end_ptr;
    *end_ptr = temp;
    str++;
    end_ptr--;
  }
}

Answer by Nemanja Trifunovic

Use the std::reverse algorithm from the C++ Standard Library.

Answer by karlphillip

It's been a while and I don't remember which book taught me this algorithm, but I thought it was quite ingenious and simple to understand:

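#include <stdio.h>
#include <string.h>

int main(void)
{
    char input[] = "moc.wolfrevokcats";

    int length = strlen(input);
    int last_pos = length-1;
    // swap the i-th char from the front with the i-th char from the back
    for(int i = 0; i < length/2; i++)
    {
        char tmp = input[i];
        input[i] = input[last_pos - i];
        input[last_pos - i] = tmp;
    }

    printf("%s\n", input);
    return 0;
}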

Answer by user2628229

Use the std::reverse function from the STL:

std::reverse(str.begin(), str.end());

You will have to include the "algorithm" header: #include <algorithm>.

Answer by Eclipse

Note that the beauty of std::reverse is that it works with char * strings and std::wstrings just as well as std::strings:

#include <algorithm>  // std::reverse
#include <cstring>    // strlen

void strrev(char *str)
{
    if (str == NULL)
        return;
    std::reverse(str, str + strlen(str));
}

Answer by Juan Pablo Califano

If you want to reverse NUL-terminated buffers, most solutions posted here are OK. But, as Tim Farley already pointed out, these algorithms work only if it is valid to assume that a string is semantically an array of bytes (i.e. single-byte strings), which I think is a wrong assumption.

Take, for example, the string "año" (year in Spanish).

The Unicode code points are 0x61, 0xf1, 0x6f.

Consider some of the most used encodings:

Latin1 / iso-8859-1 (single byte encoding, 1 character is 1 byte and vice versa):

Original:

0x61, 0xf1, 0x6f, 0x00

Reverse:

0x6f, 0xf1, 0x61, 0x00

The result is OK

UTF-8:

Original:

0x61, 0xc3, 0xb1, 0x6f, 0x00

Reverse:

0x6f, 0xb1, 0xc3, 0x61, 0x00

The result is gibberish and an illegal UTF-8 sequence
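
To see why the reversed bytes are illegal, here is a small sketch (the helper names reverse_bytes_naive and is_utf8_continuation are invented for this illustration). It applies a plain byte-wise reversal and then inspects the second byte of the result.

```c
#include <string.h>

/* Plain byte-wise reversal, as used by the single-byte solutions. */
void reverse_bytes_naive(char *s)
{
    if (!s || !*s) return;
    char *tail = s + strlen(s) - 1;
    for (; s < tail; ++s, --tail) {
        char tmp = *s;
        *s = *tail;
        *tail = tmp;
    }
}

/* Returns 1 if b is a UTF-8 continuation byte (bit pattern 10xxxxxx). */
int is_utf8_continuation(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}
```

Starting from the bytes 0x61, 0xc3, 0xb1, 0x6f, the reversal yields 0x6f, 0xb1, 0xc3, 0x61. The byte 0xb1 is a continuation byte that now appears with no lead byte before it, and the lead byte 0xc3 is followed by 0x61 instead of a continuation byte, so a UTF-8 decoder has to reject both.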

UTF-16 Big Endian:

Original:

0x00, 0x61, 0x00, 0xf1, 0x00, 0x6f, 0x00, 0x00

The first byte will be treated as a NUL-terminator. No reversing will take place.

UTF-16 Little Endian:

Original:

0x61, 0x00, 0xf1, 0x00, 0x6f, 0x00, 0x00, 0x00

The second byte will be treated as a NUL-terminator. The result will be 0x61, 0x00, a string containing the 'a' character.
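
The two UTF-16 cases can be checked with a short sketch. The array and function names below are invented for illustration; the byte values are the ones listed above. It compares what a char-based routine sees (bytes up to the first 0x00) with what a scan over 16-bit code units sees.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The two byte sequences from the text, including the 16-bit terminator. */
static const unsigned char utf16be[] = { 0x00, 0x61, 0x00, 0xf1, 0x00, 0x6f, 0x00, 0x00 };
static const unsigned char utf16le[] = { 0x61, 0x00, 0xf1, 0x00, 0x6f, 0x00, 0x00, 0x00 };

/* What a char-based routine sees: bytes before the first 0x00 byte. */
size_t byte_length(const unsigned char *buf)
{
    return strlen((const char *)buf);
}

/* What a UTF-16-aware routine sees: code units before the first 0x0000 unit. */
size_t utf16_length(const uint16_t *units)
{
    size_t n = 0;
    while (units[n] != 0) ++n;
    return n;
}
```

byte_length reports 0 for the big-endian bytes and 1 for the little-endian bytes, which is why the char-based reversals do nothing useful here, while utf16_length on the code units 0x0061, 0x00f1, 0x006f, 0x0000 reports 3.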
