C: How to turn a hex string into an unsigned char array?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original: http://stackoverflow.com/questions/3221170/

Time: 2020-09-02 05:52:10  Source: igfitidea

How to turn a hex string into an unsigned char array?

c arrays hex

Asked by Gbps

For example, I have a C string "E8 48 D8 FF FF 8B 0D" (including spaces) which needs to be converted into the equivalent unsigned char array {0xE8,0x48,0xD8,0xFF,0xFF,0x8B,0x0D}. What's an efficient way to do this? Thanks!

EDIT: I can't use the std library... so consider this a C question. I'm sorry!

Accepted answer by Norman Ramsey

You'll never convince me that this operation is a performance bottleneck. The efficient way is to make good use of your time by using the standard C library:

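#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Parse one hex byte starting at s; *endptr is set just past it. */
static unsigned char gethex(const char *s, char **endptr) {
  assert(s);
  while (isspace((unsigned char)*s)) s++;
  assert(*s); /* trailing whitespace is not accepted */
  return (unsigned char)strtoul(s, endptr, 16);
}

/* Expects "XX XX ... XX"; the caller frees the returned buffer. */
unsigned char *convert(const char *s, int *length) {
  unsigned char *answer = malloc((strlen(s) + 1) / 3);
  unsigned char *p;
  if (!answer) { *length = 0; return NULL; }
  for (p = answer; *s; p++)
    *p = gethex(s, (char **)&s);
  *length = (int)(p - answer);
  return answer;
}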

Compiled and tested. Works on your example.
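
The same strtoul idea can also report bad input instead of asserting. The following is only a sketch under stated assumptions: the helper name hex_to_bytes, the caller-supplied buffer, and the -1 error convention are illustrative and not part of the answer above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not from the answer): parse whitespace-separated hex
   bytes from s into out[0..cap-1]. Returns the number of bytes stored, or -1
   on a malformed token, a value above 0xFF, or more bytes than cap. */
static int hex_to_bytes(const char *s, unsigned char *out, size_t cap)
{
    size_t n = 0;

    assert(s != NULL && out != NULL);
    for (;;) {
        char *end;
        unsigned long v;

        while (isspace((unsigned char)*s)) s++;       /* skip separators */
        if (*s == '\0') break;                        /* clean end of input */
        if (!isxdigit((unsigned char)*s)) return -1;  /* also rejects '+' and '-' */
        v = strtoul(s, &end, 16);
        if (end == s || v > 0xFF || n == cap) return -1;
        out[n++] = (unsigned char)v;
        s = end;                                      /* continue after this token */
    }
    return (int)n;
}
```

Called as hex_to_bytes("E8 48 D8 FF FF 8B 0D", buf, sizeof buf) with an unsigned char buf[8], this should return 7 and leave the seven bytes in buf. Unlike convert(), it tolerates leading and trailing whitespace and does not allocate.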

Answer by James McNellis

This answers the original question, which asked for a C++ solution.

You can use an istringstream with the hex manipulator:

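#include <sstream>
#include <string>
#include <vector>

int main()
{
    std::string hex_chars("E8 48 D8 FF FF 8B 0D");

    std::istringstream hex_chars_stream(hex_chars);
    std::vector<unsigned char> bytes;

    // Extract into an unsigned int so that >> parses hex numbers, not single characters.
    unsigned int c;
    while (hex_chars_stream >> std::hex >> c)
    {
        bytes.push_back(static_cast<unsigned char>(c));
    }
}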

Note that c must be an int (or long, or some other integer type), not a char; if it is a char (or unsigned char), the wrong >> overload will be called and individual characters will be extracted from the string, not hexadecimal integer strings.

Additional error checking to ensure that the extracted value fits within a char would be a good idea.

Answer by Ben Voigt

  • Iterate through all the characters.
    • If you have a hex digit, the number is (ch >= 'A')? (ch - 'A' + 10): (ch - '0').
      • Left shift your accumulator by four bits and add (or OR) in the new digit.
    • If you have a space, and the previous character was not a space, then append your current accumulator value to the array and reset the accumulator back to zero.
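
The steps above can be sketched in C roughly as follows. The function name parse_hex_acc, the caller-supplied buffer, and the -1 error return are assumptions made for illustration. The sketch also flushes the accumulator at the end of the string, which the list leaves implicit, and it accepts only upper-case digits because the digit formula assumes them (and an ASCII-like character set).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: accumulate hex digits four bits at a time and emit a
   byte whenever a run of digits ends. Returns the byte count or -1 on error. */
static int parse_hex_acc(const char *s, unsigned char *out, size_t cap)
{
    unsigned acc = 0;    /* value of the number being built */
    int have_digit = 0;  /* nonzero while inside a run of digits */
    size_t n = 0;

    assert(s != NULL && out != NULL);
    for (;; s++) {
        char ch = *s;
        if ((ch >= '0' && ch <= '9') || (ch >= 'A' && ch <= 'F')) {
            unsigned digit = (ch >= 'A') ? (unsigned)(ch - 'A' + 10)
                                         : (unsigned)(ch - '0');
            acc = (acc << 4) | digit;      /* shift left by four, OR in the digit */
            if (acc > 0xFF) return -1;     /* more than two digits in one number */
            have_digit = 1;
        } else if (ch == ' ' || ch == '\0') {
            if (have_digit) {              /* previous character was a digit */
                if (n == cap) return -1;   /* output buffer is full */
                out[n++] = (unsigned char)acc;
                acc = 0;
                have_digit = 0;
            }
            if (ch == '\0') break;
        } else {
            return -1;                     /* neither a hex digit nor a space */
        }
    }
    return (int)n;
}
```

Because a byte is emitted only when a run of digits ends, repeated, leading, and trailing spaces need no special handling.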

Answer by Diego Medaglia

If you know the length of the string to be parsed beforehand (e.g. you are reading something from /proc), you can use sscanf with the 'hh' length modifier, which specifies that the next conversion is one of d, i, o, u, x, or X and that the pointer to store the result refers to either a signed char or an unsigned char.

// example: ipv6 address as seen in /proc/net/if_inet6:
char myString[] = "fe80000000000000020c29fffe01bafb";
unsigned char addressBytes[16];
sscanf(myString,
       "%2hhx%2hhx%2hhx%2hhx%2hhx%2hhx%2hhx%2hhx"
       "%2hhx%2hhx%2hhx%2hhx%2hhx%2hhx%2hhx%2hhx",
       &addressBytes[0], &addressBytes[1], &addressBytes[2], &addressBytes[3],
       &addressBytes[4], &addressBytes[5], &addressBytes[6], &addressBytes[7],
       &addressBytes[8], &addressBytes[9], &addressBytes[10], &addressBytes[11],
       &addressBytes[12], &addressBytes[13], &addressBytes[14], &addressBytes[15]);

int i;
for (i = 0; i < 16; i++){
    printf("addressBytes[%d] = %02x\n", i, addressBytes[i]);
}

Output:

addressBytes[0] = fe
addressBytes[1] = 80
addressBytes[2] = 00
addressBytes[3] = 00
addressBytes[4] = 00
addressBytes[5] = 00
addressBytes[6] = 00
addressBytes[7] = 00
addressBytes[8] = 02
addressBytes[9] = 0c
addressBytes[10] = 29
addressBytes[11] = ff
addressBytes[12] = fe
addressBytes[13] = 01
addressBytes[14] = ba
addressBytes[15] = fb
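
When the byte count is known, the same conversion can be applied in a loop rather than written out sixteen times. This is a minimal sketch, assuming a contiguous string of exactly 2*n hex digits; the helper name hex_fixed and its return convention are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: decode exactly n bytes from 2*n contiguous hex digits.
   Returns 0 on success, -1 if the length or any digit is wrong. */
static int hex_fixed(const char *s, unsigned char *out, size_t n)
{
    size_t i;

    assert(s != NULL && out != NULL);
    if (strlen(s) != 2 * n) return -1;
    for (i = 0; i < n; i++) {
        const char *pair = s + 2 * i;

        /* Check both characters first: %x on its own would skip whitespace. */
        if (!isxdigit((unsigned char)pair[0]) || !isxdigit((unsigned char)pair[1]))
            return -1;
        /* %2hhx reads at most two hex digits into an unsigned char (C99). */
        if (sscanf(pair, "%2hhx", &out[i]) != 1) return -1;
    }
    return 0;
}
```

For the IPv6 string above, hex_fixed(myString, addressBytes, 16) should fill the same sixteen bytes as the single long sscanf call.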

Answer by amigo

Use the "old" sscanf() function:

string s_hex = "E8 48 D8 FF FF 8B 0D"; // source string
char *a_Char = new char[ s_hex.length()/3 +1 ]; // output char array

for( unsigned i = 0, uchr ; i < s_hex.length() ; i += 3 ) {
    sscanf( s_hex.c_str()+ i, "%2x", &uchr ); // conversion
    a_Char[i/3] = uchr; // save as char
  }
delete[] a_Char; // array new must be paired with array delete

Answer by bjg

For a pure C implementation I think you can persuade sscanf(3) to do what you want. I believe this should be portable so long as your input string is only ever going to contain two-character hex values. Note that %X stores through an unsigned int *, so each value is scanned into an unsigned int and then narrowed to unsigned char.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    char hex[] = "E8 48 D8 FF FF 8B 0D";
    char *p;
    size_t cnt = (strlen(hex) + 1) / 3; // Whether or not there's a trailing space
    unsigned char *result = malloc(cnt), *r;
    unsigned int c; // %X requires an unsigned int *

    if (result == NULL) return 1;
    // Stop after cnt bytes so p is never read past the terminator
    for (p = hex, r = result; r < result + cnt; p += 3) {
        if (sscanf(p, "%2X", &c) != 1) {
            break; // Didn't parse as expected
        }
        *r++ = (unsigned char)c;
    }
    free(result);
    return 0;
}

Answer by kriss

The old C way, do it by hand ;-) (there are many shorter ways, but I'm not golfing, I'm going for run-time speed).

enum { NBBYTES = 7 };
char res[NBBYTES+1];
const char * c = "E8 48 D8 FF FF 8B 0D";
const char * p = c;
int i = 0;

for (i = 0; i < NBBYTES; i++){
    switch (*p){
    case '0': case '1': case '2': case '3': case '4':
    case '5': case '6': case '7': case '8': case '9':
      res[i] = *p - '0';
    break;
    case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
      res[i] = *p - 'A' + 10;
    break;
   default:
     // parse error, throw exception
     ;
   }
   p++;
   switch (*p){
   case '0': case '1': case '2': case '3': case '4':
   case '5': case '6': case '7': case '8': case '9':
      res[i] = res[i]*16 + *p - '0';
   break;
   case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
      res[i] = res[i]*16 + *p - 'A' + 10;
   break;
   default:
      // parse error, throw exception
      ;
   }
   p++;
   if (*p == 0) { continue; }
   if (*p == ' ') { p++; continue; }
   // parse error, throw exception
}

// let's show the result, C style IO, just cout if you want C++
for (i = 0 ; i < 7; i++){
   printf("%2.2x ", 0xFF & res[i]);
}
printf("\n");

Now another one that allows any number of digits per number and any number of spaces to separate them, including leading or trailing spaces (Ben's specs):

#include <stdio.h>
#include <stdlib.h>

int main(){
    enum { NBBYTES = 7 };
    char res[NBBYTES];
    const char * c = "E8 48 D8 FF FF 8B 0D";
    const char * p = c;
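    int i = -1; // index of the byte being built; no byte is open yet

    char ch = ' '; // pretend a space precedes the input so the first number opens a byte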
    while (ch && i < NBBYTES){
       switch (ch){
       case '0': case '1': case '2': case '3': case '4':
       case '5': case '6': case '7': case '8': case '9':
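          ch -= '0' + 10 - 'A'; // pre-adjust so the shared subtraction below yields 0..9
          /* fall through */
       case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
          ch -= 'A' - 10;
          res[i] = res[i]*16 + ch;
          break;
       case ' ':
         if (*p != ' ' && *p != '\0') { // a new number follows; trailing spaces are fine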
             if (i == NBBYTES-1){
                 printf("parse error, throw exception\n");
                 exit(-1);
            }
            res[++i] = 0;
         }
         break;
       case 0:
         break;
       default:
         printf("parse error, throw exception\n");
         exit(-1);
       }
       ch = *(p++);
    }
    if (i != NBBYTES-1){
        printf("parse error, throw exception\n");
        exit(-1);
    }

   for (i = 0 ; i < 7; i++){
      printf("%2.2x ", 0xFF & res[i]);
   }
   printf("\n");
}

No, it's not really obfuscated... but well, it looks like it is.
