C language: How to extract a substring from a string in C?

Disclaimer: This page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/19555434/

Time: 2020-09-02 07:48:40  Source: igfitidea

How to extract a substring from a string in C?

Tags: c, string, c-strings

Asked by ShadyBears

I tried using strncmp but it only works if I give it a specific number of bytes I want to extract.

// Three alternative inputs (only one definition of line would exist at a time)
char line[256] = "This \"is\" an example."; // I want to extract "is"
char line[256] = "This is \"also\" an example."; // I want to extract "also"
char line[256] = "This is the final \"example\"."; // I want to extract "example"
char substring[256];

How would I extract everything between the double quotes and put it in the variable substring?

Answered by Floris

Note: I edited this answer after I realized that, as written, the code would cause a problem because strtok doesn't like to operate on const char* variables. This was more an artifact of how I wrote the example than a problem with the underlying principle - but apparently it deserved a double downvote. So I fixed it.

The following works (tested on Mac OS 10.7 using gcc):

#include <stdio.h>
#include <string.h>

int main(void) {
    const char *lineConst = "This \"is\" an example"; // the "input string"
    char line[256];  // where we will put a modifiable copy of the input
    char *subString; // the "result"

    strcpy(line, lineConst);

    subString = strtok(line, "\"");  // first token: the text before the first double quote
    subString = strtok(NULL, "\"");  // second token: the text between the first and second double quotes

    if (subString != NULL)           // NULL means there was no second token
        printf("the thing in between quotes is '%s'\n", subString);
    return 0;
}

Here is how it works: strtok looks for "delimiters" (second argument) - in this case, the first ". Internally, it knows "how far it got", and if you call it again with NULL as the first argument (instead of a char*), it will start again from there. Thus, on the second call it returns exactly the string between the first and second double quote, which is what you wanted.
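
To make the successive-call behaviour concrete, here is a minimal sketch that collects every token strtok returns for one buffer. The helper name split_on_quotes and its parameters are made up for illustration and are not part of the original answer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the answer): splits buf in place on double
 * quotes and stores up to maxTokens token pointers in tokens[].
 * Returns the number of tokens stored. buf is modified by strtok. */
size_t split_on_quotes(char *buf, char *tokens[], size_t maxTokens)
{
    size_t count = 0;
    assert(buf != NULL && tokens != NULL);

    /* The first call passes the buffer; later calls pass NULL so that
     * strtok continues from where it stopped. */
    for (char *tok = strtok(buf, "\"");
         tok != NULL && count < maxTokens;
         tok = strtok(NULL, "\"")) {
        tokens[count++] = tok;
    }
    return count;
}
```

For a buffer holding This "is" an example, this should yield three tokens, with the quoted text as the second one. Because strtok skips leading delimiters, an input that starts with a quote would put the quoted text in the first token instead.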

Warning: strtok typically replaces delimiters with '\0' as it "eats" the input. You must therefore count on your input string getting modified by this approach. If that is not acceptable, you have to make a local copy first. In essence, I do that in the above when I copy the string constant to a variable. It would be cleaner to do this with a call to line=malloc(strlen(lineConst)+1); and a free(line); afterwards - but if you intend to wrap this inside a function, you have to consider that the return value has to remain valid after the function returns. Because strtok returns a pointer to the right place inside the string, it doesn't make a copy of the token. Passing a pointer to the space where you want the result to end up, and creating that space inside the function (with the correct size), then copying the result into it, would be the right thing to do. All this is quite subtle. Let me know if this is not clear!
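
One possible way to wrap this in a function is sketched below. It is only an illustration of the warning above, not the answerer's code: the name extract_quoted, the caller-supplied dest buffer and the 0/-1 return convention are all assumptions. The function works on a private malloc'd copy and copies the token out before freeing it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper: copies the text that follows the first double quote
 * in src (up to the next quote or the end of the string) into dest.
 * Returns 0 on success, -1 if no such text was found, it did not fit in
 * destSize bytes, or memory could not be allocated. src is not modified.
 * Empty quotes ("") are not handled: strtok merges adjacent delimiters. */
int extract_quoted(const char *src, char *dest, size_t destSize)
{
    assert(src != NULL && dest != NULL && destSize > 0);
    dest[0] = '\0';

    /* Work on a private copy because strtok overwrites delimiters. */
    char *copy = malloc(strlen(src) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, src);

    int status = -1;
    char *tok = strtok(copy, "\"");
    /* If src does not start with a quote, the first token is the text
     * before the opening quote, so the quoted text is the next token. */
    if (tok != NULL && src[0] != '"')
        tok = strtok(NULL, "\"");

    if (tok != NULL && strlen(tok) < destSize) {
        strcpy(dest, tok);   /* copy out before the private copy is freed */
        status = 0;
    }

    free(copy);              /* tok pointed into copy; it is not used after this */
    return status;
}
```

The caller owns dest, so the result stays valid after the function returns, for example extract_quoted(line, substring, sizeof substring).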

Answered by Keith Nicholas

If you want to do it with no library support...

void extract_between_quotes(char* s, char* dest)
{
   int in_quotes = 0;
   *dest = 0;
   while(*s != 0)
   {
      if(in_quotes)
      {
         if(*s == '"') return;
         dest[0]=*s;
         dest[1]=0;
         dest++;
      }
      else if(*s == '"') in_quotes=1;
      s++;
   }
}

Then call it:

extract_between_quotes(line, substring);

Answered by fizzer

#include <string.h>
...
substring[0] = '\0';
const char *start = strchr(line, '"') + 1;
strncat(substring, start, strcspn(start, "\""));

Bounds and error checking omitted. Avoid strtok because it has side effects.
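
For reference, a sketch of the same strchr/strcspn/strncat idea with the omitted checks filled in might look like this. The function name substring_between_quotes, the destSize parameter and the 0/-1 return values are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the text between the first two double quotes
 * of line into dest. Returns 0 on success, -1 if there is no opening quote,
 * no closing quote, or the text does not fit in destSize bytes. */
int substring_between_quotes(const char *line, char *dest, size_t destSize)
{
    assert(line != NULL && dest != NULL && destSize > 0);
    dest[0] = '\0';

    const char *open = strchr(line, '"');
    if (open == NULL)
        return -1;                      /* no opening quote */

    const char *start = open + 1;
    size_t len = strcspn(start, "\"");  /* characters before the next quote */
    if (start[len] != '"')
        return -1;                      /* hit the end of the string: no closing quote */
    if (len >= destSize)
        return -1;                      /* would not fit together with the terminator */

    strncat(dest, start, len);          /* appends len characters plus a '\0' */
    return 0;
}
```

Unlike strtok, this leaves line untouched, and an empty pair of quotes simply produces an empty string.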

Answered by sukhvir

Here is a long way to do this, assuming the string to be extracted is enclosed in quotation marks. (Fixed to include the error check suggested by Keith in the comments.)

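#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(){

    char input[100];
    char extract[100];
    int i=0,j=0,k=0,endFlag=0;

    printf("Input string: ");
    fgets(input,sizeof(input),stdin);
    input[strcspn(input,"\n")] = '\0'; // strip the trailing newline, if any

    for(i=0;i<strlen(input);i++){
        if(input[i] == '"'){
            j =i+1;
            while(input[j]!='"'){
                if(input[j] == '\0'){ // reached the end without a closing quote
                    endFlag++;
                    break;
                }
                extract[k] = input[j];
                k++;
                j++;
            }
            break; // stop after the first quoted string
        }
    }
    extract[k] = '\0';

    if(endFlag==1){
        printf("1.Your code only had one quotation mark.\n");
        printf("2.So the code extracted everything after that quotation mark\n");
        printf("3.To make sure buffer overflow doesn't happen in this case:\n");
        printf("4.Modify the extract buffer size to be the same as input buffer size\n");
        printf("\nextracted string: %s\n",extract);
    }else{
        printf("Extract = %s\n",extract);
    }
    return 0;
}

Output(1):

$ ./test
Input string: extract "this" from this string
Extract = this

Output(2):

$ ./test
Input string: Another example to extract "this gibberish" from this string
Extract = this gibberish

Output(3): (error check suggested by Keith)

$ ./test
Input string: are you "happy now Kieth ?
1.Your code only had one quotation mark.
2.So the code extracted everything after that quotation mark
3.To make sure buffer overflow doesn't happen in this case:
4.Modify the extract buffer size to be the same as input buffer size

extracted string: happy now Kieth ?

--------------------------------------------------------------------------------------------------------------------------------

Although it was not asked for, the following code extracts multiple words from the input string as long as they are in quotation marks:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(){

    char input[100];
    char extract[100]; // same size as input so the copy cannot overflow
    int i=0,j=0,k=0,endFlag=0;

    printf("Input string: ");
    fgets(input,sizeof(input),stdin);
    input[strcspn(input,"\n")] = '\0'; // strip the trailing newline, if any

    for(i=0;i<strlen(input);i++){
        if(input[i] == '"'){
            if(endFlag==0){ // opening quote: copy up to the closing quote
                j =i+1;
                while(input[j]!='"' && input[j]!='\0'){
                    extract[k] = input[j];
                    k++;
                    j++;
                }
                endFlag = 1;
            }else{ // closing quote: the next quote opens a new word
                endFlag =0;
            }
        }
    }
    extract[k] = '\0';

    printf("Extract = %s\n",extract);
    return 0;
}

Output:

$ ./test
Input string: extract "multiple" words "from" this "string"
Extract = multiplefromstring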

Answered by godel9

Have you tried looking at the strchr function? You should be able to call that function twice to get pointers to the first and second instances of the " character, and use a combination of memcpy and pointer arithmetic to get what you want.
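
A minimal sketch of that suggestion follows. The function name copy_quoted, the destSize parameter and the 0/-1 return convention are assumptions for illustration; the answer itself only names strchr, memcpy and pointer arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: finds the first two double quotes in line with
 * strchr and copies the text between them into dest with memcpy.
 * Returns 0 on success, -1 if either quote is missing or the text does
 * not fit in destSize bytes. */
int copy_quoted(const char *line, char *dest, size_t destSize)
{
    assert(line != NULL && dest != NULL && destSize > 0);
    dest[0] = '\0';

    const char *first = strchr(line, '"');
    if (first == NULL)
        return -1;                        /* no opening quote */

    const char *second = strchr(first + 1, '"');
    if (second == NULL)
        return -1;                        /* no closing quote */

    /* Pointer arithmetic: number of characters strictly between the quotes. */
    size_t len = (size_t)(second - (first + 1));
    if (len >= destSize)
        return -1;                        /* no room for the text plus '\0' */

    memcpy(dest, first + 1, len);
    dest[len] = '\0';                     /* memcpy does not add a terminator */
    return 0;
}
```

With the question's variables, the call would be copy_quoted(line, substring, sizeof substring).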