Java: Using a for loop to get the Hamming distance between 2 strings

Notice: this page is a Chinese-English side-by-side translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/16260752/

Time: 2020-10-31 22:26:28  Source: igfitidea

Using for loop to get the Hamming distance between 2 strings

Tags: java, string, for-loop, compare, equals

Asked by Doh

In this task I need to get the Hamming distance between the two strings sequence1 and sequence2. (From Wikipedia: the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different.)

First I made 2 new strings, which are the 2 original strings converted to lower case to make comparing easier. Then I resorted to using a for loop and an if to compare the 2 strings. For each difference in characters between the 2 strings, the loop adds 1 to a counter, int a = 0. The method returns the value of this counter.

public static int getHammingDistance(String sequence1, String sequence2) {
    int a = 0;
    String sequenceX = sequence1.toLowerCase();
    String sequenceY = sequence2.toLowerCase();
    for (int x = 0; x < sequenceX.length(); x++) {
        for (int y = 0; y < sequenceY.length(); y++) {
            if (sequenceX.charAt(x) == sequenceY.charAt(y)) {
                a += 0;
            } else if (sequenceX.charAt(x) != sequenceY.charAt(y)) {
                a += 1;
            }
        }
    }
    return a;
}

So does the code look good and functional enough? Is there anything I could fix or optimize? Thanks in advance. I'm a huge noob, so pardon me if I asked anything silly.

Answered by Francisco Spaeth

From my point of view, the following implementation would be OK:

public static int getHammingDistance(String sequence1, String sequence2) {
    char[] s1 = sequence1.toCharArray();
    char[] s2 = sequence2.toCharArray();

    int shorter = Math.min(s1.length, s2.length);
    int longest = Math.max(s1.length, s2.length);

    int result = 0;
    for (int i=0; i<shorter; i++) {
        if (s1[i] != s2[i]) result++;
    }

    result += longest - shorter;

    return result;
}
  1. It uses arrays, which avoids invoking a method (charAt) twice for every pair of characters that needs to be compared;
  2. It avoids an exception when one string is longer than the other, by counting each extra position in the longer string as a difference.
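
A quick usage sketch of the method above, mainly to show what the length handling does. The class name LengthTolerantHamming and the sample inputs are only illustrative. Note that, strictly speaking, the Hamming distance is only defined for strings of equal length, so counting the extra characters as differences is a design choice of this answer.

```java
// Illustrative wrapper class; the name is hypothetical, not from the answer.
class LengthTolerantHamming {

    // Same logic as the answer above: compare position by position,
    // then count every extra position of the longer string as a difference.
    static int getHammingDistance(String sequence1, String sequence2) {
        char[] s1 = sequence1.toCharArray();
        char[] s2 = sequence2.toCharArray();

        int shorter = Math.min(s1.length, s2.length);
        int longest = Math.max(s1.length, s2.length);

        int result = 0;
        for (int i = 0; i < shorter; i++) {
            if (s1[i] != s2[i]) result++;
        }
        return result + (longest - shorter);
    }

    public static void main(String[] args) {
        // Equal length: positions 2, 3 and 4 differ.
        System.out.println(getHammingDistance("karolin", "kathrin")); // 3
        // Unequal length: "abc" matches, the 2 extra characters count as differences.
        System.out.println(getHammingDistance("abc", "abcde"));       // 2
    }
}
```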

Answered by radai

Your code is completely off. As you said yourself, the distance is the number of places where the strings differ, so you should only have 1 loop, going over both strings at once. Instead you have 2 nested loops that compare every index in string a to every index in string b.

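To make the difference concrete, here is a small sketch that puts the nested-loop counting from the question next to a single-loop version. The class and method names (NestedLoopDemo, nestedCount, singleLoopCount) are made up for this illustration. For two identical 3-character strings the nested loops look at all 9 index pairs and count the 6 pairs that do not match, even though the strings do not differ anywhere.

```java
// Illustrative demo class; all names here are hypothetical.
class NestedLoopDemo {

    // Same counting logic as the nested loops in the question:
    // every index of x is compared against every index of y.
    static int nestedCount(String x, String y) {
        int a = 0;
        for (int i = 0; i < x.length(); i++) {
            for (int j = 0; j < y.length(); j++) {
                if (x.charAt(i) != y.charAt(j)) {
                    a++;
                }
            }
        }
        return a;
    }

    // Single loop: position i of x is compared only with position i of y.
    // Assumes both strings have the same length.
    static int singleLoopCount(String x, String y) {
        int a = 0;
        for (int i = 0; i < x.length(); i++) {
            if (x.charAt(i) != y.charAt(i)) {
                a++;
            }
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(nestedCount("abc", "abc"));     // 6
        System.out.println(singleLoopCount("abc", "abc")); // 0
        System.out.println(singleLoopCount("abc", "abd")); // 1
    }
}
```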

Also, writing an if condition that results in a += 0 is a waste of time.

Try this instead:

for (int x = 0; x < sequenceX.length(); x++) { //both are of the same length
    if (sequenceX.charAt(x) != sequenceY.charAt(x)) {
        a += 1;
    }
}

Also, this is still a naive approach which will probably not work with complex Unicode characters (where 2 characters can be logically equal yet not have the same character code).

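One possible way to soften the Unicode problem, sketched here under assumptions and not part of the answer above: normalize both strings to the same form (NFC, via java.text.Normalizer) and compare code points instead of char values. The class name UnicodeHammingSketch and the choice to throw on unequal length are hypothetical. Normalization does not cover every notion of "logically equal", so treat this as a starting point only.

```java
import java.text.Normalizer;

// Illustrative sketch; the class name and error handling are assumptions.
class UnicodeHammingSketch {

    static int distance(String s1, String s2) {
        // NFC composes sequences such as 'e' + combining acute accent into one code point.
        int[] a = Normalizer.normalize(s1, Normalizer.Form.NFC).codePoints().toArray();
        int[] b = Normalizer.normalize(s2, Normalizer.Form.NFC).codePoints().toArray();

        if (a.length != b.length) {
            throw new IllegalArgumentException("inputs must have the same number of code points");
        }

        int d = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] != b[i]) {
                d++;
            }
        }
        return d;
    }

    public static void main(String[] args) {
        String composed = "caf\u00e9";    // 'é' as a single char
        String decomposed = "cafe\u0301"; // 'e' followed by a combining acute accent

        System.out.println(composed.length());              // 4
        System.out.println(decomposed.length());            // 5
        System.out.println(distance(composed, decomposed)); // 0
    }
}
```

Comparing code points also keeps a supplementary character (stored as a surrogate pair) as one position instead of two.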

Answered by threadfin

public static int getHammingDistance(String sequenceX, String sequenceY) {
    int a = 0;
    // String sequenceX = sequence1.toLowerCase();
    // String sequenceY = sequence2.toLowerCase();
    if (sequenceX.length() != sequenceY.length()) {
        return -1; //input strings should be of equal length
    }

    for (int i = 0; i < sequenceX.length(); i++) {
        if (sequenceX.charAt(i) != sequenceY.charAt(i)) {
            a++;
        }
    }
    return a;
}
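
Returning -1 means every caller has to remember to check for the sentinel value. A different design, shown here only as a sketch and not as the answerer's method, is to throw an IllegalArgumentException on unequal lengths. This variant also restores the case-insensitive comparison from the question by lowercasing each char as it is compared. The class name StrictHamming is hypothetical.

```java
// Illustrative variant; the class name and the exception-based contract are assumptions.
class StrictHamming {

    static int distance(String sequenceX, String sequenceY) {
        if (sequenceX.length() != sequenceY.length()) {
            throw new IllegalArgumentException("input strings should be of equal length");
        }

        int a = 0;
        for (int i = 0; i < sequenceX.length(); i++) {
            // Lowercase per char so the comparison ignores case, as in the question.
            char cx = Character.toLowerCase(sequenceX.charAt(i));
            char cy = Character.toLowerCase(sequenceY.charAt(i));
            if (cx != cy) {
                a++;
            }
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(distance("GATTACA", "gattaca")); // 0
        System.out.println(distance("karolin", "kathrin")); // 3
    }
}
```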

Answered by AlexR

Your code is OK; however, I'd suggest the following improvements.

  1. Do not use charAt() of String. Get a char array from the string using toCharArray() before the loop and then work with this array. This is more readable and more efficient.
  2. The structure

        if (sequenceX.charAt(x) == sequenceY.charAt(y)) {
            a += 0;
        } else if (sequenceX.charAt(x) != sequenceY.charAt(y)) {
            a += 1;
        }
    

    looks redundant. Fix it to: if (sequenceX.charAt(x) == sequenceY.charAt(y)) { a += 0; } else { a += 1; }

Moreover, taking into account that I recommended working with arrays, change it to something like:

a += seqX[x] == seqY[x] ? 0 : 1;

Less code, fewer bugs...

EDIT: as mentioned by @radai, you do not need the if/else structure at all: adding 0 to a is redundant.

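Putting these suggestions together, the whole method might look roughly like the sketch below. It assumes a single loop over both arrays (as pointed out in the other answers) and adds a length check that is not part of this answer; the class name ArrayHamming is hypothetical.

```java
// Illustrative sketch; the class name and the length check are assumptions.
class ArrayHamming {

    static int getHammingDistance(String sequence1, String sequence2) {
        // Convert once before the loop instead of calling charAt() repeatedly.
        char[] seqX = sequence1.toLowerCase().toCharArray();
        char[] seqY = sequence2.toLowerCase().toCharArray();

        if (seqX.length != seqY.length) {
            throw new IllegalArgumentException("input strings should be of equal length");
        }

        int a = 0;
        for (int x = 0; x < seqX.length; x++) {
            // No else branch needed: adding 0 would change nothing.
            if (seqX[x] != seqY[x]) {
                a++;
            }
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(getHammingDistance("ABC", "abd")); // 1
    }
}
```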