Java charAt() or substring? Which is faster?

Question

Asked by estacado

I want to go through each character in a String and pass each character of the String as a String to another function.

String s = "abcdefg";
for (int i = 0; i < s.length(); i++) {
    newFunction(s.substring(i, i + 1));
}

or

String s = "abcdefg";
for (int i = 0; i < s.length(); i++) {
    newFunction(Character.toString(s.charAt(i)));
}

The final result needs to be a String. So any idea which will be faster or more efficient?

Answer 1

Accepted answer by mhaller

As usual: it doesn't matter. But if you insist on spending time on micro-optimization, or if you really want to optimize for your very special use case, try this:

import org.junit.Assert;
import org.junit.Test;

public class StringCharTest {

    // The three times listed for each test were measured with:
    // 1. Initialization of "s" outside the loop
    // 2. Initialization of "s" inside the loop
    // 3. newFunction() actually checking the string length,
    //    so the call will not be optimized away by the HotSpot compiler

    @Test
    // Fastest: 237ms / 562ms / 2434ms
    public void testCacheStrings() throws Exception {
        // Cache all possible Char strings
        // One slot per possible char value, including Character.MAX_VALUE itself
        String[] char2string = new String[Character.MAX_VALUE + 1];
        for (int c = Character.MIN_VALUE; c <= Character.MAX_VALUE; c++) {
            char2string[c] = Character.toString((char) c);
        }

        for (int x = 0; x < 10000000; x++) {
            char[] s = "abcdefg".toCharArray();
            for (int i = 0; i < s.length; i++) {
                newFunction(char2string[s[i]]);
            }
        }
    }

    @Test
    // Fast: 1687ms / 1725ms / 3382ms
    public void testCharToString() throws Exception {
        for (int x = 0; x < 10000000; x++) {
            String s = "abcdefg";
            for (int i = 0; i < s.length(); i++) {
                // Fast: Creates new String objects, but does not copy an array
                newFunction(Character.toString(s.charAt(i)));
            }
        }
    }

    @Test
    // Very fast: 1331ms / 1414ms / 3190ms
    public void testSubstring() throws Exception {
        for (int x = 0; x < 10000000; x++) {
            String s = "abcdefg";
            for (int i = 0; i < s.length(); i++) {
                // Fastest without a cache in these runs. JDKs before 7u6 reuse the
                // internal char array here; later versions copy the characters.
                newFunction(s.substring(i, i + 1));
            }
        }
    }

    @Test
    // Slowest: 2525ms / 2961ms / 4703ms
    public void testNewString() throws Exception {
        char[] value = new char[1];
        for (int x = 0; x < 10000000; x++) {
            char[] s = "abcdefg".toCharArray();
            for (int i = 0; i < s.length; i++) {
                value[0] = s[i];
                // Slow! Copies the array
                newFunction(new String(value));
            }
        }
    }

    private void newFunction(String string) {
        // Do something with the one-character string
        Assert.assertEquals(1, string.length());
    }

}

Answer 2

Answered by Jesper

Does newFunction really need to take a String? It would be better if you could make newFunction take a char and call it like this:

newFunction(s.charAt(i));

That way, you avoid creating a temporary String object.
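
A minimal sketch of that idea follows. The body of newFunction is a hypothetical stand-in (it just reports whether the character is a letter), since the question does not say what the real function does; countLetters is likewise an illustrative name.

```java
public class CharOverloadSketch {

    // Hypothetical stand-in for newFunction, taking a char instead of a String.
    public static int newFunction(char c) {
        return Character.isLetter(c) ? 1 : 0;
    }

    // Calls newFunction once per character without building one-character Strings.
    public static int countLetters(String s) {
        int letters = 0;
        for (int i = 0; i < s.length(); i++) {
            letters += newFunction(s.charAt(i));
        }
        return letters;
    }

    public static void main(String[] args) {
        System.out.println(countLetters("abcdefg"));
    }
}
```

If some callers still hold a String, a thin String overload can delegate to the char version, so only those callers pay for the extra object.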

To answer your question: It's hard to say which one is more efficient. In both examples, a String object has to be created which contains only one character. Which is more efficient depends on how exactly String.substring(...) and Character.toString(...) are implemented on your particular Java implementation. The only way to find out is to run your program through a profiler and see which version uses more CPU and/or more memory. Normally, you shouldn't worry about micro-optimizations like this. Only spend time on this when you've discovered that it is the cause of a performance and/or memory problem.

Answer 3

Answered by Will

The answer is: it doesn't matter.

Profile your code. Is this your bottleneck?

Answer 4

Answered by Will

I would first obtain a char[] from the source String using String.toCharArray() (which returns a copy of the characters) and then proceed to call newFunction.

But I do agree with Jesper that it would be best if you could just deal with characters and avoid all the String functions...
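
As a rough sketch of the toCharArray() approach, assuming newFunction still has to receive a String: the class name, the toOneCharStrings helper, and the list standing in for the newFunction calls are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class CharArraySketch {

    // Copies the characters out once, then builds one String per character.
    // The returned list stands in for the sequence of newFunction(...) arguments.
    public static List<String> toOneCharStrings(String s) {
        char[] chars = s.toCharArray();
        List<String> result = new ArrayList<>(chars.length);
        for (char c : chars) {
            result.add(String.valueOf(c));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(toOneCharStrings("abcdefg"));
    }
}
```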

Answer 5

Answered by Andrzej Doyle

Of the two snippets you've posted, I wouldn't want to say. I'd agree with Will that it almost certainly is irrelevant in the overall performance of your code - and if it's not, you can just make the change and determine for yourself which is fastest for your data with your JVM on your hardware.

That said, it's likely that the second snippet would be better if you converted the String into a char array first, and then performed your iterations over the array. Doing it this way would incur the String overhead only once (converting to the array) instead of on every call. Additionally, you could then pass the array directly to the String constructor with some indices, which is more efficient than taking a char out of an array to pass it individually (which then gets turned into a one-character array):

String s = "abcdefg";
char[] chars = s.toCharArray();
for(int i = 0; i < chars.length; i++) {
    newFunction(String.valueOf(chars, i, 1));
}

But to reinforce my first point, look at what you're actually avoiding on each call of String.charAt(): two bounds checks, a (lazy) boolean OR, and an addition. This is not going to make any noticeable difference. Neither is the difference in the String constructors.
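
To make that cost concrete, here is a simplified model of a string that shares a char array through an offset and a count, which is the layout this description of charAt() implies. SharedString is a hypothetical class written for this illustration, not the actual java.lang.String source, and current JDKs store strings differently.

```java
public class CharAtSketch {

    // Hypothetical, simplified model of a String backed by a shared char array.
    public static final class SharedString {
        private final char[] value;
        private final int offset;
        private final int count;

        public SharedString(char[] value, int offset, int count) {
            this.value = value;
            this.offset = offset;
            this.count = count;
        }

        // Two bounds checks joined by a short-circuit OR, then one addition.
        public char charAt(int index) {
            if ((index < 0) || (index >= count)) {
                throw new StringIndexOutOfBoundsException(index);
            }
            return value[index + offset];
        }
    }

    public static void main(String[] args) {
        SharedString s = new SharedString("xxabc".toCharArray(), 2, 3);
        System.out.println(s.charAt(0));
    }
}
```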

Essentially, both idioms are fine in terms of performance (neither is immediately obviously inefficient), so you should not spend any more time working on them unless a profiler shows that this takes up a large amount of your application's runtime. And even then you could almost certainly get more performance gains by restructuring your supporting code in this area (e.g. have newFunction take the whole string itself); java.lang.String is pretty well optimised by this point.

Answer 6

Answered by DimitrisMel

Leetcode seems to prefer the substring option here.

This is how I solved that problem:

class Solution {
    public int strStr(String haystack, String needle) {
        if (needle.length() == 0) {
            return 0;
        }

        if (haystack.length() == 0) {
            return -1;
        }

        for (int i = 0; i <= haystack.length() - needle.length(); i++) {
            int count = 0;
            for (int j = 0; j < needle.length(); j++) {
                if (haystack.charAt(i + j) == needle.charAt(j)) {
                    count++;
                }
            }
            if (count == needle.length()) {
                return i;
            }
        }
        return -1;
    }
}

And this is the optimal solution they give:

class Solution {
    public int strStr(String haystack, String needle) {
        int length;
        int n = needle.length();
        int h = haystack.length();
        if (n == 0)
            return 0;
        // if (n == h)
        //     length = h;
        // else
        length = h - n;
        if (h == n && haystack.charAt(0) != needle.charAt(0))
            return -1;
        for (int i = 0; i <= length; i++) {
            if (haystack.substring(i, i + needle.length()).equals(needle))
                return i;
        }
        return -1;
    }
}

Honestly, I can't figure out why it would matter.
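
One possible factor, offered as a guess rather than a measured result: the charAt() version above keeps comparing every character of needle at each start position, even after a mismatch, while String.equals() stops at the first differing character. A sketch of a charAt() variant that also stops early (StrStrSketch is an illustrative name) might look like this:

```java
public class StrStrSketch {

    // Returns the index of the first occurrence of needle in haystack, or -1.
    public static int strStr(String haystack, String needle) {
        int n = needle.length();
        int last = haystack.length() - n;
        for (int i = 0; i <= last; i++) {
            int j = 0;
            // Stop comparing at the first mismatching character.
            while (j < n && haystack.charAt(i + j) == needle.charAt(j)) {
                j++;
            }
            if (j == n) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(strStr("hello", "ll"));
    }
}
```

This variant allocates no temporary Strings, so whether it beats the substring() version on a given JVM is something only a measurement can settle.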
