How to perform string diffs in Java?
Original source: http://stackoverflow.com/questions/132478/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
Stack Overflow
How to perform string Diffs in Java?
Asked by Sergio del Amo
I need to compute diffs between Java strings. I would like to be able to rebuild a later version of a string from the original string plus the stored diffs. Has anyone done this in Java? What library do you use?
// Diff is a hypothetical class that illustrates the API I am looking for.
String a1; // This can be a long text
String a2; // e.g. the above text with spelling corrections
String a3; // e.g. the above text with spelling corrections and an additional sentence
Diff diff = new Diff();
String differences_a1_a2 = diff.getDifferences(a1, a2);
String differences_a2_a3 = diff.getDifferences(a2, a3);
String[] diffs = new String[]{a1, differences_a1_a2, differences_a2_a3};
String new_a3 = diff.build(diffs);
a3.equals(new_a3); // this is true
Accepted answer by bernardn
This library seems to do the trick: google-diff-match-patch. It can create a patch string from the differences between two texts and lets you reapply that patch later.
Edit: Another solution might be java-diff-utils: https://code.google.com/p/java-diff-utils/
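Answer by Paul Whelan
Apache Commons Lang has a string difference method in StringUtils:
import org.apache.commons.lang.StringUtils;
// Returns the remainder of the second string, starting where it first differs from the first one.
StringUtils.difference("foobar", "foo"); // returns "" because "foo" is a prefix of "foobar"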
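To make that behaviour concrete, here is a minimal plain-Java sketch of the same idea. It is not the library's implementation: the class name PrefixDifference is made up for illustration, and null handling is left out. It only assumes the semantics given in the Commons Lang Javadoc (return the tail of the second string from the first differing index).

```java
public class PrefixDifference {

    /**
     * Returns the remainder of {@code second}, starting at the first index
     * where it differs from {@code first}. Hypothetical helper, not the
     * Commons Lang source.
     */
    public static String difference(String first, String second) {
        int limit = Math.min(first.length(), second.length());
        int i = 0;
        // Skip the common prefix of the two strings.
        while (i < limit && first.charAt(i) == second.charAt(i)) {
            i++;
        }
        return second.substring(i);
    }

    public static void main(String[] args) {
        System.out.println("[" + difference("foobar", "foo") + "]");  // prints "[]"
        System.out.println("[" + difference("foo", "foobar") + "]");  // prints "[bar]"
        System.out.println("[" + difference("abcde", "abxyz") + "]"); // prints "[xyz]"
    }
}
```

Note that a result like this only describes the tail after the first mismatch, so on its own it is not enough to rebuild one string from the other when the change is in the middle of the text.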
Answer by Torsten Marek
Use the Levenshtein distance and extract the edit log from the matrix the algorithm builds up. The Wikipedia article links to a couple of implementations; I'm sure there's a Java implementation among them.
Levenshtein distance is closely related to the Longest Common Subsequence algorithm (an edit distance that allows only insertions and deletions is equivalent to finding the LCS), so you might also want to have a look at that.
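As a rough illustration of extracting an edit log from such a matrix, the sketch below builds a Longest Common Subsequence table and walks it to produce an edit script, then replays the script against the original. Everything here is an assumption made for the example: the class name LcsDiff, the methods diff and apply, and the script encoding ('=' keeps one character, '-' drops one, '+' followed by a character inserts it). It uses O(n*m) memory, so it is only meant for short texts.

```java
public class LcsDiff {

    /** Builds an edit script that turns {@code a} into {@code b}. */
    public static String diff(String a, String b) {
        int n = a.length();
        int m = b.length();
        // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
        int[][] lcs = new int[n + 1][m + 1];
        for (int i = n - 1; i >= 0; i--) {
            for (int j = m - 1; j >= 0; j--) {
                lcs[i][j] = a.charAt(i) == b.charAt(j)
                        ? lcs[i + 1][j + 1] + 1
                        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }
        // Walk the table from the top-left corner, emitting one operation per step.
        StringBuilder script = new StringBuilder();
        int i = 0;
        int j = 0;
        while (i < n && j < m) {
            if (a.charAt(i) == b.charAt(j)) {
                script.append('=');
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                script.append('-');
                i++;
            } else {
                script.append('+').append(b.charAt(j));
                j++;
            }
        }
        while (i < n) { script.append('-'); i++; }
        while (j < m) { script.append('+').append(b.charAt(j)); j++; }
        return script.toString();
    }

    /** Rebuilds the changed string from the original and a script made by diff. */
    public static String apply(String original, String script) {
        StringBuilder out = new StringBuilder();
        int pos = 0; // next unread character of the original
        for (int k = 0; k < script.length(); k++) {
            char op = script.charAt(k);
            if (op == '=') {
                out.append(original.charAt(pos++));
            } else if (op == '-') {
                pos++;
            } else if (op == '+') {
                out.append(script.charAt(++k)); // the inserted character follows '+'
            } else {
                throw new IllegalArgumentException("Unknown operation: " + op);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String a1 = "The quick brwn fox";
        String a2 = "The quick brown fox";
        String a3 = "The quick brown fox jumps.";
        String d12 = diff(a1, a2);
        String d23 = diff(a2, a3);
        // Rebuild a3 from the original text and the two stored diffs.
        String rebuilt = apply(apply(a1, d12), d23);
        System.out.println(rebuilt.equals(a3)); // prints "true"
    }
}
```

The script stores only the inserted characters, not the kept ones, so it stays small when the two versions are similar. A production library would usually group runs of operations and use a more memory-efficient algorithm.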
Answer by Paul Whelan
As Torsten says, you can use:
import org.apache.commons.lang.StringUtils;
System.err.println(StringUtils.getLevenshteinDistance("foobar", "bar")); // prints 3
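If pulling in the library is not an option, the distance itself is short to compute. The following is a minimal sketch of the standard dynamic-programming approach using two rows; the class name Levenshtein and the method distance are made up for this example and are not part of Commons Lang.

```java
public class Levenshtein {

    /** Returns the minimum number of single-character insertions, deletions and substitutions. */
    public static int distance(String s, String t) {
        int[] prev = new int[t.length() + 1];
        int[] curr = new int[t.length() + 1];
        // Turning the empty prefix of s into the first j characters of t takes j insertions.
        for (int j = 0; j <= t.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= s.length(); i++) {
            curr[0] = i; // deleting the first i characters of s
            for (int j = 1; j <= t.length(); j++) {
                int cost = s.charAt(i - 1) == t.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1), // insertion, deletion
                        prev[j - 1] + cost);                    // match or substitution
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[t.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("foobar", "bar"));     // prints 3
        System.out.println(distance("kitten", "sitting")); // prints 3
    }
}
```

The distance alone only tells you how far apart two strings are. To rebuild one string from the other you still need the edit operations themselves, which requires keeping the full matrix and tracing back through it.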
Answer by Alexander
If you need to handle differences between large amounts of data and want those differences compressed efficiently, you could try a Java implementation of xdelta, which in turn implements RFC 3284 (VCDIFF) for binary diffs (it should work with strings too).
Answer by dnaumenko
The java-diff-utils library might be useful.
Answer by Sandeep Raj Urs
public class Stringdiff {
    public static void main(String[] args) {
        System.out.println(strcheck("sum", "sumsum")); // prints "sum"
    }

    // Returns the tail of the longer string, starting at the first index where the two strings differ.
    public static String strcheck(String str1, String str2) {
        // The original length check could never be true; reject null input instead.
        if (str1 == null || str2 == null) {
            return "Invalid";
        }
        int num = diffcheck1(str1, str2);
        if (num == -1) {
            return "Empty"; // the strings are identical
        }
        if (str1.length() > str2.length()) {
            return str1.substring(num);
        } else {
            return str2.substring(num);
        }
    }

    // Returns the index of the first differing character, or -1 if the strings are equal.
    public static int diffcheck1(String str1, String str2) {
        int limit = Math.min(str1.length(), str2.length());
        int i;
        for (i = 0; i < limit; i++) {
            if (str1.charAt(i) != str2.charAt(i)) {
                return i;
            }
        }
        // One string is a proper prefix of the other: they differ where the shorter one ends.
        if (i < str1.length() || i < str2.length()) {
            return i;
        }
        return -1;
    }
}
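Answer by Ahmed Ashour
Apache Commons Text now has StringsComparator:
import org.apache.commons.text.diff.CommandVisitor;
import org.apache.commons.text.diff.StringsComparator;

// s1 and s2 are the two strings to compare.
StringsComparator c = new StringsComparator(s1, s2);
// Visit the edit script that turns s1 into s2, one character at a time.
c.getScript().visit(new CommandVisitor<Character>() {
    @Override
    public void visitKeepCommand(Character object) {
        System.out.println("k: " + object); // character present in both strings
    }
    @Override
    public void visitInsertCommand(Character object) {
        System.out.println("i: " + object); // character only in s2
    }
    @Override
    public void visitDeleteCommand(Character object) {
        System.out.println("d: " + object); // character only in s1
    }
});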