C++: Converting a hex string to a byte array

Question

Asked by oracal

What is the best way to convert a variable-length hex string, e.g. "01A1", to a byte array containing that data?

I.e., converting this:

std::string str = "01A1";

into this:

char* hexArray;
int hexLength;

or this:

std::vector<char> hexArray;

so that when I write this to a file and run hexdump -C on it, I get the binary data containing 01A1.

Answer 1

Accepted answer by Niels Keurentjes

This ought to work:

#include <stdexcept>

// Returns the value (0-15) of a single hex digit; throws on any other character.
int char2int(char input)
{
  if(input >= '0' && input <= '9')
    return input - '0';
  if(input >= 'A' && input <= 'F')
    return input - 'A' + 10;
  if(input >= 'a' && input <= 'f')
    return input - 'a' + 10;
  throw std::invalid_argument("Invalid input string");
}

// This function assumes src to be a zero terminated sanitized string with
// an even number of [0-9a-fA-F] characters, and target to be sufficiently large
void hex2bin(const char* src, char* target)
{
  while(*src && src[1])
  {
    *(target++) = char2int(*src)*16 + char2int(src[1]);
    src += 2;
  }
}

Depending on your platform, there is probably also a standard implementation.
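
For the std::vector form asked about in the question, the same digit-by-digit technique can be wrapped so that the caller does not have to size a buffer. The following is only a minimal sketch: the names hexDigitValue and hexToVector are made up for illustration and are not part of the answer above, and rejecting odd-length input is an assumption about the desired behavior.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: value of one hex digit, throws on anything else.
inline int hexDigitValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    throw std::invalid_argument("not a hex digit");
}

// Hypothetical wrapper: decodes an even-length hex string into bytes.
inline std::vector<unsigned char> hexToVector(const std::string& hex)
{
    // Assumption: an odd number of digits is treated as an error.
    if (hex.size() % 2 != 0)
        throw std::invalid_argument("hex string must have an even length");

    std::vector<unsigned char> out;
    out.reserve(hex.size() / 2);
    for (std::size_t i = 0; i < hex.size(); i += 2)
    {
        // High nibble first, then low nibble.
        out.push_back(static_cast<unsigned char>(
            hexDigitValue(hex[i]) * 16 + hexDigitValue(hex[i + 1])));
    }
    return out;
}
```

Writing the result to a std::ofstream opened with std::ios::binary, via write(reinterpret_cast<const char*>(out.data()), out.size()), should give a file that hexdump -C shows as the bytes 01 a1 for the input "01A1".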

Answer 2

Answered by Chris Vasselli

This implementation uses the built-in strtol function to handle the actual conversion from text to bytes, and will work for any even-length hex string.

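#include <cstdlib>
#include <string>
#include <vector>

std::vector<char> HexToBytes(const std::string& hex) {
  std::vector<char> bytes;

  for (std::string::size_type i = 0; i < hex.length(); i += 2) {
    std::string byteString = hex.substr(i, 2);
    // strtol parses the two-character substring as a base-16 number
    char byte = static_cast<char>(std::strtol(byteString.c_str(), nullptr, 16));
    bytes.push_back(byte);
  }

  return bytes;
}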
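
One caveat with strtol is that it does not reject bad input by itself: it returns 0 when nothing can be parsed, and it also accepts leading whitespace and a sign. A possible stricter variant, sketched here under the assumption that invalid input should throw, checks each character with std::isxdigit before converting. The name hexToBytesChecked is hypothetical.

```cpp
#include <cctype>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stricter variant of the strtol-based approach.
inline std::vector<char> hexToBytesChecked(const std::string& hex)
{
    if (hex.length() % 2 != 0)
        throw std::invalid_argument("odd number of hex digits");

    std::vector<char> bytes;
    bytes.reserve(hex.length() / 2);
    for (std::string::size_type i = 0; i < hex.length(); i += 2)
    {
        // std::isxdigit must be given a value representable as unsigned char.
        const unsigned char high = static_cast<unsigned char>(hex[i]);
        const unsigned char low = static_cast<unsigned char>(hex[i + 1]);
        if (!std::isxdigit(high) || !std::isxdigit(low))
            throw std::invalid_argument("invalid hex digit");

        // Both characters are hex digits, so strtol parses exactly this pair.
        const std::string byteString = hex.substr(i, 2);
        bytes.push_back(static_cast<char>(
            std::strtol(byteString.c_str(), nullptr, 16)));
    }
    return bytes;
}
```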

Answer 3

Answered by Rob Yull

So for fun, I was curious whether I could do this kind of conversion at compile time. It doesn't have a lot of error checking and was done in VS2015, which doesn't support C++14 constexpr functions yet (hence the way HexCharToInt is written). It takes a C-string array, converts pairs of characters into single bytes, and expands those bytes into a uniform initialization list used to initialize the type T provided as a template parameter. T could be replaced with something like std::array to automatically return an array.

#include <array>
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <stdexcept>
#include <utility>

/* Quick and dirty conversion from a single character to its hex equivalent */
constexpr std::uint8_t HexCharToInt(char Input)
{
    return
    ((Input >= 'a') && (Input <= 'f'))
    ? (Input - 87)
    : ((Input >= 'A') && (Input <= 'F'))
    ? (Input - 55)
    : ((Input >= '0') && (Input <= '9'))
    ? (Input - 48)
    : throw std::exception{};
}

/* Position the characters into the appropriate nibble */
constexpr std::uint8_t HexChar(char High, char Low)
{
    return (HexCharToInt(High) << 4) | (HexCharToInt(Low));
}

/* Adapter that converts sets of 2 characters into single bytes and combines the results into a uniform initialization list used to initialize T */
template <typename T, std::size_t Length, std::size_t ... Index>
constexpr T HexString(const char (&Input)[Length], const std::index_sequence<Index...>&)
{
    return T{HexChar(Input[(Index * 2)], Input[((Index * 2) + 1)])...};
}

/* Entry function */
template <typename T, std::size_t Length>
constexpr T HexString(const char (&Input)[Length])
{
    return HexString<T>(Input, std::make_index_sequence<(Length / 2)>{});
}

// Note: the KS::Utility qualifier assumes the functions above are declared inside that namespace.
constexpr auto Y = KS::Utility::HexString<std::array<std::uint8_t, 3>>("ABCDEF");
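
With a compiler that supports C++17, the same compile-time idea can probably be written with an ordinary loop instead of an index sequence. This is a sketch under that assumption, not the author's code; hexDigit, hexLiteral and kBytes are hypothetical names. A throw that is reached during constant evaluation turns an invalid digit into a compile error.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical helper: one hex digit to its value, throws on anything else.
constexpr std::uint8_t hexDigit(char c)
{
    if (c >= '0' && c <= '9') return static_cast<std::uint8_t>(c - '0');
    if (c >= 'a' && c <= 'f') return static_cast<std::uint8_t>(c - 'a' + 10);
    if (c >= 'A' && c <= 'F') return static_cast<std::uint8_t>(c - 'A' + 10);
    throw std::invalid_argument("not a hex digit");
}

// N counts the terminating '\0', so a literal with 2k digits has N == 2k + 1.
template <std::size_t N>
constexpr std::array<std::uint8_t, (N - 1) / 2> hexLiteral(const char (&text)[N])
{
    static_assert(N % 2 == 1, "hex literal needs an even number of digits");
    std::array<std::uint8_t, (N - 1) / 2> out{};
    for (std::size_t i = 0; i < out.size(); ++i)
    {
        // High nibble from the first character of the pair, low from the second.
        out[i] = static_cast<std::uint8_t>(
            (hexDigit(text[2 * i]) << 4) | hexDigit(text[2 * i + 1]));
    }
    return out;
}

// Evaluated entirely at compile time.
constexpr auto kBytes = hexLiteral("ABCDEF");
static_assert(kBytes.size() == 3, "three bytes expected");
static_assert(kBytes[0] == 0xAB && kBytes[1] == 0xCD && kBytes[2] == 0xEF,
              "decoded at compile time");
```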

Answer 4

Answered by Zan Lynx

You said "variable length." Just how variable do you mean?

For hex strings that fit into an unsigned long, I have always liked the C function strtoul. To make it convert hex, pass 16 as the radix value.

Code might look like:

#include <cstdlib>
#include <string>

std::string str = "01a1";
unsigned long val = std::strtoul(str.c_str(), nullptr, 16); // val == 0x01a1
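
strtoul yields a number rather than a byte array, so one more step is needed to get the bytes asked for in the question. A minimal sketch, assuming 8-bit bytes and an input short enough to fit into an unsigned long, could shift the value out most significant byte first. hexViaStrtoul is a hypothetical name, and taking the byte count from the digit count (so that leading zero bytes are kept) is a design assumption.

```cpp
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: decodes a short hex string through strtoul.
// Note: strtoul itself also accepts leading whitespace, a sign and a 0x
// prefix; this sketch does not try to reject those.
inline std::vector<unsigned char> hexViaStrtoul(const std::string& hex)
{
    // Each byte takes two hex digits, so this is the longest input that fits.
    const std::size_t maxDigits = sizeof(unsigned long) * 2;
    if (hex.empty() || hex.size() > maxDigits)
        throw std::invalid_argument("hex string is empty or too long");

    char* end = nullptr;
    const unsigned long value = std::strtoul(hex.c_str(), &end, 16);
    if (*end != '\0')
        throw std::invalid_argument("invalid hex digit");

    // Derive the byte count from the digit count so leading zero bytes survive.
    const std::size_t byteCount = (hex.size() + 1) / 2;
    std::vector<unsigned char> bytes(byteCount);
    for (std::size_t i = 0; i < byteCount; ++i)
    {
        // Most significant byte first, matching the order of the text.
        const std::size_t shift = 8 * (byteCount - 1 - i);
        bytes[i] = static_cast<unsigned char>((value >> shift) & 0xFF);
    }
    return bytes;
}
```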

Answer 5

Answered by samoz

If you want to use OpenSSL to do it, there is a nifty trick I found:

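#include <cstdlib>
#include <openssl/bn.h>

BIGNUM *input = BN_new();
int input_length = BN_hex2bn(&input, argv[2]); // number of hex digits parsed, 0 on error
input_length = (input_length + 1) / 2;         // two hex digits per byte
unsigned char *input_buffer = (unsigned char*)malloc(input_length);
int retval = BN_bn2bin(input, input_buffer);   // bytes written; leading zero bytes are not stored
BN_free(input);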

Just be sure to strip any leading '0x' from the string.

Answer 6

Answered by TheoretiCAL

This can be done with a stringstream; you just need to store the value in an intermediate numeric type such as an int:

  #include <sstream>
  #include <string>
  #include <vector>

  std::string test = "01A1"; // assuming this is an even length string
  std::vector<char> bytes(test.length() / 2); // a variable-length array is not standard C++
  std::stringstream converter;
  for (std::size_t i = 0; i < test.length(); i += 2)
  {
      converter << std::hex << test.substr(i, 2);
      int byte;
      converter >> byte;
      bytes[i / 2] = byte & 0xFF;
      converter.str(std::string());
      converter.clear();
  }

Answer 7

Answered by Stamen Rakov

C++11 variant (with gcc 4.7 - little endian format):

    #include <cstdint>
    #include <string>
    #include <vector>

    std::vector<uint8_t> decodeHex(const std::string & source)
    {
        if ( std::string::npos != source.find_first_not_of("0123456789ABCDEFabcdef") )
        {
            // you can throw exception here
            return {};
        }

        union
        {
            uint64_t binary;
            char byte[8];
        } value{};

        auto size = source.size(), offset = (size % 16);
        std::vector<uint8_t> binary{};
        binary.reserve((size + 1) / 2);

        if ( offset )
        {
            value.binary = std::stoull(source.substr(0, offset), nullptr, 16);

            for ( auto index = (offset + 1) / 2; index--; )
            {
                binary.emplace_back(value.byte[index]);
            }
        }

        for ( ; offset < size; offset += 16 )
        {
            value.binary = std::stoull(source.substr(offset, 16), nullptr, 16);
            for ( auto index = 8; index--; )
            {
                binary.emplace_back(value.byte[index]);
            }
        }

        return binary;
    }

Crypto++ variant (with gcc 4.7):

#include <string>
#include <vector>

#include <crypto++/filters.h>
#include <crypto++/hex.h>

std::vector<unsigned char> decodeHex(const std::string & source)
{
    std::string hexCode;
    CryptoPP::StringSource(
              source, true,
              new CryptoPP::HexDecoder(new CryptoPP::StringSink(hexCode)));

    return std::vector<unsigned char>(hexCode.begin(), hexCode.end());
}

Note that the first variant is about twice as fast as the second one and also works with both odd and even numbers of nibbles (the result of "a56ac" is {0x0a, 0x56, 0xac}). Crypto++ discards the last nibble if there is an odd number of nibbles (the result of "a56ac" is {0xa5, 0x6a}) and silently skips invalid hex characters (the result of "a5sac" is {0xa5, 0xac}).
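
The first variant reads the bytes of a uint64_t through a union, which is why it is labelled little endian. If that dependency is unwanted, the same odd-length behavior can be obtained by treating an odd input as if it had a leading '0'. The following is a sketch of that idea, not a drop-in replacement and with no claim about its speed; nibbleValue and decodeHexPadded are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: value of a hex digit, or -1 if c is not one.
inline int nibbleValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical byte-order-independent variant: returns an empty vector on
// invalid input, mirroring the early return in the first variant.
inline std::vector<std::uint8_t> decodeHexPadded(const std::string& source)
{
    // An odd number of nibbles is treated as if it had a leading '0'.
    const std::string padded = (source.size() % 2 != 0) ? "0" + source : source;

    std::vector<std::uint8_t> binary;
    binary.reserve(padded.size() / 2);
    for (std::size_t i = 0; i < padded.size(); i += 2)
    {
        const int high = nibbleValue(padded[i]);
        const int low = nibbleValue(padded[i + 1]);
        if (high < 0 || low < 0)
            return {};
        binary.push_back(static_cast<std::uint8_t>((high << 4) | low));
    }
    return binary;
}
```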

Answer 8

Answered by metamystical

#include <cstdio>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

int main() {
    std::string s("313233");
    char delim = ',';
    int len = s.size();
    for(int i = 2; i < len; i += 3, ++len) s.insert(i, 1, delim);
    std::istringstream is(s);
    std::ostringstream os;
    is >> std::hex;
    int n;
    while (is >> n) {
        char c = (char)n;
        os << std::string(&c, 1);
        if(is.peek() == delim) is.ignore();
    }

    // std::string form
    std::string byte_string = os.str();
    std::cout << byte_string << std::endl;
    printf("%s\n", byte_string.c_str());

    // std::vector form
    std::vector<char> byte_vector(byte_string.begin(), byte_string.end());
    byte_vector.push_back('\0'); // needed for a c-string
    printf("%s\n", byte_vector.data());
}

The output is

123
123
123

('1' == 0x31, etc.)

Answer 9

Answered by TooTone

I would use a standard function like sscanf to read the string into an unsigned integer, and then you already have the bytes you need in memory. If you were on a big-endian machine you could just write out (memcpy) the memory of the integer from the first non-zero byte. However, you can't safely assume this in general, so you can use some bit masking and shifting to get the bytes out.

#include <cstdio>

const char* src = "01A1";
char hexArray[256] = {0};
int hexLength = 0;

// read in the string
unsigned int hex = 0;
sscanf(src, "%x", &hex);

// write it out
for (unsigned int mask = 0xff000000, bitPos=24; mask; mask>>=8, bitPos-=8) {
    unsigned int currByte = hex & mask;
    if (currByte || hexLength) {
        hexArray[hexLength++] = currByte>>bitPos;
    }
}

Answer 10

Answered by Igor

You can use boost:

#include <algorithm>
#include <string>
#include <boost/algorithm/hex.hpp>

char bytes[60] = {0};
std::string hash = boost::algorithm::unhex(std::string("313233343536373839"));
std::copy(hash.begin(), hash.end(), bytes);
