Note: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/31007054/

Time: 2020-08-19 09:19:32  Source: igfitidea

Hamming distance between two binary strings not working

python binary bit hamming-distance

Asked by Hyperion

I found an interesting algorithm to calculate the Hamming distance on this site:

def hamming2(x,y):
    """Calculate the Hamming distance between two bit strings"""
    assert len(x) == len(y)
    count,z = 0,x^y
    while z:
        count += 1
        z &= z-1 # magic!
    return count

The problem is that this algorithm relies on bitwise operators, so it does not work on ordinary strings, and I'm trying to compare two binary values that are stored in string format, like

'100010'
'101000'

How can I make them work with this algorithm?

Accepted answer by dlask

Implement it:

def hamming2(s1, s2):
    """Calculate the Hamming distance between two bit strings"""
    assert len(s1) == len(s2)
    return sum(c1 != c2 for c1, c2 in zip(s1, s2))

And test it:

assert hamming2("1010", "1111") == 2
assert hamming2("1111", "0000") == 4
assert hamming2("1111", "1111") == 0

Answer by Adam Hammes

If we are to stick with the original algorithm, we need to convert the strings to integers to be able to use the bitwise operators.

def hamming2(x_str, y_str):
    """Calculate the Hamming distance between two bit strings"""
    assert len(x_str) == len(y_str)
    x, y = int(x_str, 2), int(y_str, 2)  # '2' specifies we are reading a binary number
    count, z = 0, x ^ y
    while z:
        count += 1
        z &= z - 1  # magic!
    return count

Then we can call it as follows:

print(hamming2('100010', '101000'))
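
To see why the loop marked "magic!" counts the differing bits, here is a minimal sketch that traces it for the two strings from the question. The helper name popcount_trace is hypothetical and exists only for illustration. Each pass of z &= z - 1 clears the lowest set bit of z, so the number of passes equals the number of 1 bits in x ^ y.

```python
def popcount_trace(z):
    """Return the successive values of z while its lowest set bit is cleared.

    Hypothetical helper for illustration; not part of the original answer.
    """
    steps = []
    while z:
        steps.append(format(z, "b"))
        z &= z - 1  # clears the lowest set bit of z
    return steps


x, y = int("100010", 2), int("101000", 2)
steps = popcount_trace(x ^ y)
print(steps)       # ['1010', '1000']
print(len(steps))  # 2, the Hamming distance
# Cross-check: count the '1' characters in the binary form of x ^ y
print(bin(x ^ y).count("1"))  # 2
```

The cross-check on the last line gives the same count without an explicit loop, which may be a simpler option if the bit-clearing trick is not needed for its own sake.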

While this algorithm is cool as a novelty, having to convert from a string likely negates any speed advantage it might have. The answer @dlask posted is much more succinct.
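
Whether the conversion really cancels the speed advantage depends on the Python version and the length of the inputs, so it is worth measuring rather than guessing. The following is a rough sketch using the standard timeit module. The function names hamming_bits and hamming_zip and the 64-character random test inputs are assumptions made for the comparison, and no particular result is implied.

```python
import random
import timeit


def hamming_bits(x_str, y_str):
    """Integer/bitwise version, following the approach in this answer."""
    assert len(x_str) == len(y_str)
    count, z = 0, int(x_str, 2) ^ int(y_str, 2)
    while z:
        count += 1
        z &= z - 1
    return count


def hamming_zip(s1, s2):
    """Character-by-character version, following the accepted answer."""
    assert len(s1) == len(s2)
    return sum(c1 != c2 for c1, c2 in zip(s1, s2))


# Assumed test data: two random 64-character bit strings (fixed seed)
rng = random.Random(0)
a = "".join(rng.choice("01") for _ in range(64))
b = "".join(rng.choice("01") for _ in range(64))

# Both versions must agree before their timings are worth comparing
assert hamming_bits(a, b) == hamming_zip(a, b)

for func in (hamming_bits, hamming_zip):
    seconds = timeit.timeit(lambda: func(a, b), number=10_000)
    print(f"{func.__name__}: {seconds:.4f} s for 10,000 calls")
```

Timings from a run like this are only indicative; repeating the measurement with the string lengths that actually occur in your data gives a fairer picture.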

Answer by Mikheil Zhghenti

I think this explains the Hamming distance between two strings well:

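def hammingDist(s1, s2):
    # Encode both strings as ASCII bytes so each position is an integer
    bytesS1 = bytes(s1, encoding="ascii")
    bytesS2 = bytes(s2, encoding="ascii")
    diff = 0
    # Compare only up to the length of the shorter string
    for i in range(min(len(bytesS1), len(bytesS2))):
        # XOR is nonzero exactly when the two bytes differ
        if bytesS1[i] ^ bytesS2[i] != 0:
            diff += 1
    return diff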

Answer by Panos Kal.

This is what I use to calculate the Hamming distance.
It counts the number of positions at which two equal-length strings differ.

def hamdist(str1, str2):
    diffs = 0
    for ch1, ch2 in zip(str1, str2):
        if ch1 != ch2:
            diffs += 1
    return diffs
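
One caveat: zip stops at the end of the shorter string, so this function silently ignores any extra characters when the lengths differ. If that is not wanted, a length check can be added. The sketch below is one possible variant; the name hamdist_strict and the choice of ValueError are assumptions, not part of the original answer.

```python
def hamdist_strict(str1, str2):
    """Count differing positions, rejecting strings of unequal length."""
    if len(str1) != len(str2):
        raise ValueError("strings must have the same length")
    return sum(ch1 != ch2 for ch1, ch2 in zip(str1, str2))


print(hamdist_strict("100010", "101000"))  # 2

try:
    hamdist_strict("1010", "10")
except ValueError as err:
    print("rejected:", err)
```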