Java Comparator for byte arrays (lexicographic)

Question

Asked by marcorossi

I have a hashmap with byte[] keys. I'd like to sort it through a TreeMap.

What is the most effective way to implement the comparator for lexicographic order?

Answer 1

Answered by ColinD

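Using Guava, you can use either UnsignedBytes.lexicographicalComparator() or SignedBytes.lexicographicalComparator(), depending on whether the bytes should compare as unsigned or signed values.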

The UnsignedBytes comparator appears to have an optimized form, based on Unsafe, that it uses when it can. Comments in the code indicate that it may be at least twice as fast as a normal Java implementation.

Answer 2

Answered by marcorossi

Found this nice piece of code in Apache HBase:

    public int compare(byte[] left, byte[] right) {
        for (int i = 0, j = 0; i < left.length && j < right.length; i++, j++) {
            int a = (left[i] & 0xff);
            int b = (right[j] & 0xff);
            if (a != b) {
                return a - b;
            }
        }
        return left.length - right.length;
    }
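
To tie this back to the question, here is a minimal sketch of how such a comparator could back a TreeMap. The class name LexicographicByteArrays and the helper sorted() are made up for illustration; the comparison loop mirrors the snippet above rather than copying HBase's actual class.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class LexicographicByteArrays {
    // Compare bytes as unsigned values (0 to 255); if one array is a prefix
    // of the other, the shorter array sorts first.
    static final Comparator<byte[]> UNSIGNED_LEXICOGRAPHIC = (left, right) -> {
        int n = Math.min(left.length, right.length);
        for (int i = 0; i < n; i++) {
            int a = left[i] & 0xff;
            int b = right[i] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        return left.length - right.length;
    };

    // Hypothetical helper: copy any map with byte[] keys into a sorted TreeMap.
    static TreeMap<byte[], String> sorted(Map<byte[], String> unsorted) {
        TreeMap<byte[], String> result = new TreeMap<>(UNSIGNED_LEXICOGRAPHIC);
        result.putAll(unsorted);
        return result;
    }

    public static void main(String[] args) {
        Map<byte[], String> map = new HashMap<>();
        map.put(new byte[] {(byte) 0xFF}, "high");
        map.put(new byte[] {0x00, 0x01}, "low");
        map.put(new byte[] {0x00}, "prefix");
        System.out.println(sorted(map).values()); // prints [prefix, low, high]
    }
}
```

Note that a TreeMap built this way treats two distinct arrays with equal contents as the same key, unlike a HashMap keyed on byte[], which uses identity.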

Answer 3

Answered by Julius Musseau

I'm assuming the problem is just with the "byte vs. byte" comparison. Dealing with the arrays is straightforward, so I won't cover it. With respect to byte vs. byte, my first thought is to do this:

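    import java.util.Comparator;

    public class ByteComparator implements Comparator<Byte> {
      // Naive version: Byte.compareTo compares signed values (-128 to 127).
      @Override
      public int compare(Byte b1, Byte b2) {
        return b1.compareTo(b2);
      }
    }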

But that won't be lexicographic: 0xFF (the signed byte for -1) will be considered smaller than 0x00, when lexicographically it's bigger. I think this should do the trick:

    import java.util.Comparator;

    public class ByteComparator implements Comparator<Byte> {
      @Override
      public int compare(Byte b1, Byte b2) {
        // Convert to unsigned values (0 to 255) before comparing them.
        int i1 = b1 < 0 ? 256 + b1 : b1;
        int i2 = b2 < 0 ? 256 + b2 : b2;
        // i1 - i2 gives ascending order; the range 0..255 cannot overflow.
        return i1 - i2;
      }
    }

There is probably something in Apache's commons-lang or commons-math libraries that does this, but I don't know of it offhand.
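
As a side note, newer JDKs cover both steps without a third-party library: Byte.toUnsignedInt (Java 8) does the per-byte conversion, and Arrays.compareUnsigned (Java 9) compares whole byte arrays lexicographically as unsigned values. The sketch below assumes Java 9 or later; the class name JdkUnsignedCompare is only illustrative.

```java
import java.util.Arrays;
import java.util.Comparator;

public class JdkUnsignedCompare {
    // Per-byte comparison: Byte.toUnsignedInt maps the signed byte -1 (0xFF) to 255.
    static int compareUnsignedByte(byte b1, byte b2) {
        return Integer.compare(Byte.toUnsignedInt(b1), Byte.toUnsignedInt(b2));
    }

    // Whole-array comparison: unsigned, lexicographic, shorter prefix first.
    static final Comparator<byte[]> COMPARATOR = Arrays::compareUnsigned;

    public static void main(String[] args) {
        // Signed comparison puts 0xFF below 0x00; unsigned comparison does not.
        System.out.println(Byte.compare((byte) 0xFF, (byte) 0x00) < 0);        // prints true
        System.out.println(compareUnsignedByte((byte) 0xFF, (byte) 0x00) > 0); // prints true
        System.out.println(COMPARATOR.compare(new byte[] {1}, new byte[] {1, 0}) < 0); // prints true
    }
}
```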

Answer 4

Answered by Peter Lawrey

You can use a comparator that compares the Character.toLowerCase() of each byte in the array (assuming the byte[] holds ASCII). If it does not, you will need to decode the characters yourself or use new String(bytes, charSet).toLowerCase(), but that is unlikely to be efficient.
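
A rough sketch of that idea, assuming the arrays contain only ASCII text and that a case-insensitive order is actually wanted (the question itself asks for plain lexicographic order). The class name AsciiCaseInsensitiveBytes is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Comparator;

public class AsciiCaseInsensitiveBytes {
    // Lower-case each byte before comparing. Assumption: the data is ASCII,
    // so every byte maps to exactly one character.
    static final Comparator<byte[]> COMPARATOR = (left, right) -> {
        int n = Math.min(left.length, right.length);
        for (int i = 0; i < n; i++) {
            int a = Character.toLowerCase(left[i] & 0xff);
            int b = Character.toLowerCase(right[i] & 0xff);
            if (a != b) {
                return a - b;
            }
        }
        return left.length - right.length;
    };

    public static void main(String[] args) {
        byte[] upper = "Apple".getBytes(StandardCharsets.US_ASCII);
        byte[] lower = "apple".getBytes(StandardCharsets.US_ASCII);
        System.out.println(COMPARATOR.compare(upper, lower)); // prints 0
    }
}
```

Because this comparator reports "Apple" and "apple" as equal, a TreeMap using it would keep only one of those two keys.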
