Faster alternatives to replace method in a Java String?

Notice: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/1010928/

Time: 2020-08-11 22:15:25  Source: igfitidea

java replace

Asked by ojblass

The fact that the replace method returns a new String object rather than replacing the contents of the given string is a little obtuse (but understandable once you know that strings are immutable in Java). I am taking a major performance hit from deeply nested replace calls in some code. Is there something I can replace them with that would be faster?

Accepted answer by paxdiablo

This is what StringBuilder is meant for. If you're going to be doing a lot of manipulation, do it on a StringBuilder, then turn that into a String whenever you need to.

StringBuilder is described thus:

"A mutable sequence of characters. This class provides an API compatible with StringBuffer, but with no guarantee of synchronization".

It has replace (and append, insert, delete, et al.), and you can use toString to morph it into a real String.
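
As a minimal sketch of that idea, the class below runs several replacements against one StringBuilder and only creates a String at the end. The class and method names (BuilderReplace, replaceAll, cleanUp) are hypothetical and chosen for illustration; only the StringBuilder calls (indexOf, replace, toString) come from the JDK.

```java
final class BuilderReplace {
    // Replaces every occurrence of target inside sb, in place.
    static void replaceAll(StringBuilder sb, String target, String replacement) {
        if (target.isEmpty()) {
            return; // an empty target would match everywhere and never terminate
        }
        int idx = sb.indexOf(target);
        while (idx != -1) {
            sb.replace(idx, idx + target.length(), replacement);
            // Continue after the inserted text so it is never rescanned.
            idx = sb.indexOf(target, idx + replacement.length());
        }
    }

    // Hypothetical example of what used to be nested String.replace calls:
    // all edits share one buffer and a String is created once.
    static String cleanUp(String input) {
        StringBuilder sb = new StringBuilder(input);
        replaceAll(sb, "\t", " ");
        replaceAll(sb, "-", "--");
        return sb.toString();
    }
}
```

Whether this beats String.replace depends on the input and the JDK version, so it is worth measuring on your own data before committing to it.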

Answer by Artem Barger

String manipulation in general is very slow. Consider using StringBuffer: it is not exactly like the String class, but the two have a lot in common, and StringBuffer is mutable.

Answer by Jeremy

I agree with the above. Use StringBuffer for thread safety and StringBuilder when working with a single thread.

Answer by David Nouls

The previous posts are right: StringBuilder and StringBuffer are one solution.

But you also have to question whether it is a good idea to do the replace on big Strings in memory.

I often have String manipulations that are implemented as a stream, so instead of replacing text in the string and then sending the result to an OutputStream, I do the replace at the moment I send the String to the OutputStream. That works much faster than any replace.

This works much faster if you want the replace to implement a template mechanism. Streaming is always faster since you consume less memory, and if the client is slow you only need to generate output at a slow pace, so it scales much better.
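
A rough sketch of replacing while writing, assuming a character Writer as the destination and a literal placeholder as the target. The names StreamingReplace, writeReplaced and render are hypothetical; the ${name} placeholder syntax in the usage note is likewise only an illustration.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

final class StreamingReplace {
    // Writes src to out, substituting replacement for each occurrence of target.
    // The fully replaced String is never built in memory.
    static void writeReplaced(String src, String target, String replacement, Writer out)
            throws IOException {
        if (target.isEmpty()) {
            out.write(src);
            return;
        }
        int from = 0;
        int hit = src.indexOf(target);
        while (hit >= 0) {
            out.write(src, from, hit - from); // unchanged text before the match
            out.write(replacement);
            from = hit + target.length();
            hit = src.indexOf(target, from);
        }
        out.write(src, from, src.length() - from); // text after the last match
    }

    // Convenience wrapper for trying it out; a real caller would pass the
    // response or file writer directly instead of a StringWriter.
    static String render(String src, String target, String replacement) {
        StringWriter sink = new StringWriter();
        try {
            writeReplaced(src, target, replacement, sink);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }
}
```

For example, render("Hello ${name}, bye ${name}", "${name}", "Ann") produces "Hello Ann, bye Ann"; with a network or file Writer the same loop emits output piece by piece.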

Answer by sb4

If you have a number of strings to replace (such as XML escape sequences), especially where the replacements differ in length from the patterns, a finite-state-machine (FSM), lexer-style algorithm seems like it might be most efficient. This is similar to the suggestion of processing in a stream fashion, where the output is built incrementally.

Perhaps a Matcher object could be used to do that efficiently.
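
One way this might look with a Matcher is sketched below: a single precompiled character class finds every special character in one pass, and a lookup table supplies the replacement. The class name XmlEscaper and the choice of five XML entities are assumptions for illustration; Matcher.appendReplacement, Matcher.appendTail and Matcher.quoteReplacement are standard JDK methods.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class XmlEscaper {
    // One pattern covering every character that needs escaping.
    private static final Pattern SPECIALS = Pattern.compile("[&<>\"']");
    private static final Map<String, String> ENTITIES = new HashMap<>();
    static {
        ENTITIES.put("&", "&amp;");
        ENTITIES.put("<", "&lt;");
        ENTITIES.put(">", "&gt;");
        ENTITIES.put("\"", "&quot;");
        ENTITIES.put("'", "&apos;");
    }

    // Scans the input once and builds the output incrementally,
    // instead of calling String.replace five times.
    static String escape(String input) {
        Matcher m = SPECIALS.matcher(input);
        StringBuffer out = new StringBuffer(input.length() + 16);
        while (m.find()) {
            // quoteReplacement keeps '$' and '\' in the replacement literal.
            m.appendReplacement(out, Matcher.quoteReplacement(ENTITIES.get(m.group())));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

The StringBuffer overload of appendReplacement is used because it is available on every Java version; whether the regex engine outperforms a hand-written loop here is something to measure rather than assume.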

Answer by Tim

Just get the char[] of the String and iterate through it. Use a temporary StringBuilder.

While iterating, look for the pattern you want to replace. If you don't find the pattern, write the characters you scanned to the StringBuilder; otherwise, write the replacement text to the StringBuilder.
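
The scan described above could be sketched as follows. This is one straightforward reading of the approach, with hypothetical names (CharScanReplace, replace, matchesAt); it compares the pattern at every position, which is simple rather than optimal.

```java
final class CharScanReplace {
    // Walks the char[] once, copying unmatched characters and substituting
    // the replacement wherever the target starts.
    static String replace(String str, String target, String replacement) {
        if (target.isEmpty()) {
            return str;
        }
        char[] chars = str.toCharArray();
        char[] pattern = target.toCharArray();
        StringBuilder out = new StringBuilder(chars.length);
        int i = 0;
        while (i < chars.length) {
            if (matchesAt(chars, i, pattern)) {
                out.append(replacement);
                i += pattern.length; // skip the matched text
            } else {
                out.append(chars[i]);
                i++;
            }
        }
        return out.toString();
    }

    // True if pattern occurs in chars starting at index pos.
    private static boolean matchesAt(char[] chars, int pos, char[] pattern) {
        if (pos + pattern.length > chars.length) {
            return false;
        }
        for (int j = 0; j < pattern.length; j++) {
            if (chars[pos + j] != pattern[j]) {
                return false;
            }
        }
        return true;
    }
}
```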

Answer by Leo

Adding to @paxdiablo's answer, here's a sample implementation of replaceAll using a StringBuilder that, by my measurement, is about 3.7 times faster than String.replaceAll():

Code:

public static String replaceAll(final String str, final String searchChars, String replaceChars)
{
  // Nothing to do for null or empty input, an empty search string, or a no-op replacement.
  if (str == null || searchChars == null || "".equals(str) || "".equals(searchChars)
      || searchChars.equals(replaceChars))
  {
    return str;
  }
  if (replaceChars == null)
  {
    replaceChars = "";
  }
  final int searchCharsLength = searchChars.length();
  final int replaceCharsLength = replaceChars.length();
  StringBuilder buf = new StringBuilder(str);
  boolean modified = false;
  // Resume each search just past the previous replacement, so that inserted
  // text is never rescanned (and never replaced a second time).
  int start = buf.indexOf(searchChars);
  while (start != -1)
  {
    buf.replace(start, start + searchCharsLength, replaceChars);
    modified = true;
    start = buf.indexOf(searchChars, start + replaceCharsLength);
  }
  // Return the original instance when nothing changed.
  return modified ? buf.toString() : str;
}

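Test case. The reported output was Delta1 = 1917009502; Delta2 = 7241000026 (nanoseconds). These figures come from a run in which Delta2 was measured from start1, so Delta2 includes the time of the first loop. Subtracting Delta1 leaves about 5324 ms for String.replaceAll() against about 1917 ms for the method above, a ratio closer to 2.8 than 3.7.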

@Test
public void testReplaceAll() 
{
  String origStr = "1234567890-1234567890-";

  String replacement1 =  StringReplacer.replaceAll(origStr, "0", "a");
  String expectedRep1 = "123456789a-123456789a-";

  String replacement2 =  StringReplacer.replaceAll(origStr, "0", "ab");
  String expectedRep2 = "123456789ab-123456789ab-";

  String replacement3 =  StringReplacer.replaceAll(origStr, "0", "");
  String expectedRep3 = "123456789-123456789-";


  String replacement4 =  StringReplacer.replaceAll(origStr, "012", "a");
  String expectedRep4 = "1234567890-1234567890-";

  String replacement5 =  StringReplacer.replaceAll(origStr, "123", "ab");
  String expectedRep5 = "ab4567890-ab4567890-";

  String replacement6 =  StringReplacer.replaceAll(origStr, "123", "abc");
  String expectedRep6 = "abc4567890-abc4567890-";

  String replacement7 =  StringReplacer.replaceAll(origStr, "123", "abcdd");
  String expectedRep7 = "abcdd4567890-abcdd4567890-";

  String replacement8 =  StringReplacer.replaceAll(origStr, "123", "");
  String expectedRep8 = "4567890-4567890-";

  String replacement9 =  StringReplacer.replaceAll(origStr, "123", "");
  String expectedRep9 = "4567890-4567890-";

  assertEquals(replacement1, expectedRep1);
  assertEquals(replacement2, expectedRep2);
  assertEquals(replacement3, expectedRep3);
  assertEquals(replacement4, expectedRep4);
  assertEquals(replacement5, expectedRep5);
  assertEquals(replacement6, expectedRep6);
  assertEquals(replacement7, expectedRep7);
  assertEquals(replacement8, expectedRep8);
  assertEquals(replacement9, expectedRep9);

  long start1 = System.nanoTime();
  for (long i = 0; i < 10000000L; i++)
  {
    String rep =  StringReplacer.replaceAll(origStr, "123", "abcdd");
  }
  long delta1 = System.nanoTime() -start1;

  long start2= System.nanoTime();

  for (long i = 0; i < 10000000L; i++)
  {
    String rep =  origStr.replaceAll( "123", "abcdd");
  }

  long delta2 = System.nanoTime() - start2;  // time only the second loop

  assertTrue(delta1 < delta2);

  System.out.printf("Delta1 = %d; Delta2 =%d", delta1, delta2);


}

Answer by Hummeling Engineering BV

When you're replacing single characters, consider iterating over your character array and replacing characters by looking them up in a (pre-created) HashMap<Character, Character>.

I use this strategy to convert an integer exponent string to Unicode superscript characters.

It's about twice as fast as String.replace(char, char). Note that the time needed to create the hash map isn't included in this comparison.
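
A small sketch of the lookup-table idea, using the superscript use case mentioned above. The class and method names (Superscripts, toSuperscript) are hypothetical, and the mapping covers only the digits and the minus sign.

```java
import java.util.HashMap;
import java.util.Map;

final class Superscripts {
    // Built once and reused for every conversion.
    private static final Map<Character, Character> SUPERSCRIPT = new HashMap<>();
    static {
        String plain = "0123456789-";
        // Unicode superscript digits 0-9 followed by superscript minus.
        String raised = "\u2070\u00B9\u00B2\u00B3\u2074\u2075\u2076\u2077\u2078\u2079\u207B";
        for (int i = 0; i < plain.length(); i++) {
            SUPERSCRIPT.put(plain.charAt(i), raised.charAt(i));
        }
    }

    // Replaces every mapped character in a single pass over the char[];
    // characters without a mapping are left unchanged.
    static String toSuperscript(String exponent) {
        char[] chars = exponent.toCharArray();
        for (int i = 0; i < chars.length; i++) {
            Character mapped = SUPERSCRIPT.get(chars[i]);
            if (mapped != null) {
                chars[i] = mapped;
            }
        }
        return new String(chars);
    }
}
```

The point of the design is that all eleven substitutions happen in one traversal, where chained String.replace(char, char) calls would traverse the string once per character to be replaced.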

Answer by Horcrux7

The following code is approximately 30 times faster if there is no match and 5 times faster if there is a match.

static String fastReplace( String str, String target, String replacement ) {
    int targetLength = target.length();
    if( targetLength == 0 ) {
        return str;
    }
    int idx2 = str.indexOf( target );
    if( idx2 < 0 ) {
        return str;
    }
    StringBuilder buffer = new StringBuilder( targetLength > replacement.length() ? str.length() : str.length() * 2 );
    int idx1 = 0;
    do {
        buffer.append( str, idx1, idx2 );
        buffer.append( replacement );
        idx1 = idx2 + targetLength;
        idx2 = str.indexOf( target, idx1 );
    } while( idx2 > 0 );
    buffer.append( str, idx1, str.length() );
    return buffer.toString();
}

Answer by Mykhaylo Adamovych

Because String.replace(CharSequence target, CharSequence replacement) uses Pattern.compile, matcher, and replaceAll internally, you can optimize it slightly by using a precompiled pattern constant for the target, like this:

private static final Pattern COMMA_REGEX = Pattern.compile(",");
...
COMMA_REGEX.matcher(value).replaceAll(replacement);
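
A self-contained variant of the snippet above, as a sketch. It assumes the goal is to match String.replace semantics exactly, so the target is compiled with Pattern.LITERAL and the replacement is passed through Matcher.quoteReplacement; without those, regex metacharacters in the target or '$' and '\' in the replacement would be interpreted. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CommaReplacer {
    // Compiled once; LITERAL makes the target a plain string, not a regex.
    private static final Pattern COMMA = Pattern.compile(",", Pattern.LITERAL);

    static String replaceCommas(String value, String replacement) {
        // quoteReplacement keeps '$' and '\' in the replacement literal,
        // which is how String.replace treats its second argument.
        return COMMA.matcher(value).replaceAll(Matcher.quoteReplacement(replacement));
    }
}
```

As far as I know, this only pays off on older JDKs (through Java 8), where String.replace compiled a pattern on every call; later versions implement String.replace without the regex engine, so measure before adopting it.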