Java: is "startsWith" faster than "indexOf"?
Original question: http://stackoverflow.com/questions/1062064/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Is "startsWith" faster than "indexOf"?
Asked by Krishna Kumar
I am writing code in Java where I branch based on whether a string starts with certain characters while looping through a dataset, and my dataset is expected to be large.
I was wondering whether startsWith is faster than indexOf. I experimented with 2000 records but did not find any difference.
Accepted answer by Priyank
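public class Test
{
    public static void main(String[] args) {
        // Time 100 million indexOf calls on a short literal.
        // Note: the results are discarded, so the JIT may optimize some of this work away.
        long value1 = System.currentTimeMillis();
        for (long i = 0; i < 100000000; i++)
        {
            "abcd".indexOf("a");
        }
        long value2 = System.currentTimeMillis();
        System.out.println(value2 - value1); // elapsed milliseconds for indexOf

        // Time the same number of startsWith calls.
        value1 = System.currentTimeMillis();
        for (long i = 0; i < 100000000; i++)
        {
            "abcd".startsWith("a");
        }
        value2 = System.currentTimeMillis();
        System.out.println(value2 - value1); // elapsed milliseconds for startsWith
    }
}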
I tested it with this piece of code, and the performance of startsWith seems to be better, for the obvious reason that it doesn't have to traverse the string. In the best case both should perform about the same, while in the worst case startsWith will always perform better than indexOf.
Answer by Jon Skeet
startsWith only needs to check for the presence at the very start of the string - it's doing less work, so it should be faster.
My guess is that your 2000 records finished in a few milliseconds (if that). Whenever you want to benchmark one approach against another, try to do it for enough time that differences in timing will be significant. I find that 10-30 seconds is long enough to show significant improvements, but short enough to make it bearable to run the tests multiple times. (If this were a serious investigation I'd probably try for longer times. Most of my benchmarking is for fun.)
Also make sure you've got varied data - indexOf and startsWith should have roughly the same running time in the case where indexOf returns 0. So if all your records match the pattern, you're not really testing correctly. (I don't know whether that was the case in your tests of course - it's just something to watch out for.)
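The two points above (run long enough, use varied data) can be sketched roughly as follows. This is only an illustration under assumptions: the class name, the "XYZ:" prefix, the record sizes, and the round counts are all made up for the example, and a hand-rolled timing loop like this is still crude. For serious measurements a dedicated harness such as JMH is the usual choice.

```java
import java.util.Random;

// Hypothetical benchmark sketch; all names and sizes here are illustrative.
class PrefixBenchmark {

    // Builds records of random lowercase letters; roughly half are given
    // the prefix, so indexOf has to scan the whole string on the misses.
    static String[] makeRecords(int count, String prefix, long seed) {
        Random random = new Random(seed);
        String[] records = new String[count];
        for (int i = 0; i < count; i++) {
            StringBuilder sb = new StringBuilder();
            if (random.nextBoolean()) {
                sb.append(prefix);
            }
            int bodyLength = 20 + random.nextInt(200);
            for (int j = 0; j < bodyLength; j++) {
                sb.append((char) ('a' + random.nextInt(26)));
            }
            records[i] = sb.toString();
        }
        return records;
    }

    static int countStartsWith(String[] records, String prefix) {
        int hits = 0;
        for (String record : records) {
            if (record.startsWith(prefix)) {
                hits++;
            }
        }
        return hits;
    }

    static int countIndexOfZero(String[] records, String prefix) {
        int hits = 0;
        for (String record : records) {
            if (record.indexOf(prefix) == 0) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String prefix = "XYZ:"; // uppercase, so it cannot occur in the random body
        String[] records = makeRecords(100_000, prefix, 42L);
        long sink = 0; // accumulated and printed so the results are actually used

        // Warm-up rounds give the JIT a chance to compile both code paths.
        for (int i = 0; i < 20; i++) {
            sink += countStartsWith(records, prefix);
            sink += countIndexOfZero(records, prefix);
        }

        int rounds = 200;
        long t0 = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            sink += countStartsWith(records, prefix);
        }
        long t1 = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            sink += countIndexOfZero(records, prefix);
        }
        long t2 = System.nanoTime();

        System.out.println("startsWith:   " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("indexOf == 0: " + (t2 - t1) / 1_000_000 + " ms");
        System.out.println("(sink = " + sink + ")");
    }
}
```

Raising rounds until each phase takes several seconds brings the run closer to the 10-30 second range suggested above.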
Answer by Sean Reilly
In general, the golden rule of micro-optimization applies here:
"Measure, don't guess".
As with all optimizations of this type, the difference between the two calls almost certainly won't matter unless you are checking millions of strings that are each tens of thousands of characters long.
Run a profiler over your code, and only optimize this call when you can measure that it's slowing you down. Till then, go with the more readable option (startsWith, in this case). Once you know that this block is slowing you down, try both and use whichever is faster. Rinse. Repeat ;-)
Academically, my guess is that startsWith will likely be implemented using indexOf. Check the source code if you're interested. (It turns out that startsWith does not call indexOf.)
Answer by dmeister
Even without looking into the sources, it should be clear that startsWith is faster, at least for large strings and short patterns:
The running time of a.startsWith(b) is bounded by the length of b. Once at most the first b.length() characters have been checked, the search is finished.
The running time of a.indexOf(b) is larger (depending on the actual algorithm). Every algorithm has a running time that depends at least on the length of a. Roughly speaking, you have to look at each character once to check whether the pattern starts at that position.
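To make the two bounds concrete, here is a small sketch that counts character comparisons for a naive prefix check and a naive search. These loops are assumptions written for illustration, not the JDK's actual implementations of startsWith and indexOf, and the class and method names are hypothetical.

```java
// Hypothetical illustration; naive algorithms, not the JDK's own code.
class ComparisonCount {

    // Helper to build a string of n copies of c.
    static String repeat(char c, int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    // Comparisons made by a naive prefix check: at most b.length().
    static int prefixComparisons(String a, String b) {
        if (b.length() > a.length()) {
            return 0;
        }
        int comparisons = 0;
        for (int i = 0; i < b.length(); i++) {
            comparisons++;
            if (a.charAt(i) != b.charAt(i)) {
                break;
            }
        }
        return comparisons;
    }

    // Comparisons made by a naive search that tries every start position
    // until the pattern is found or the text is exhausted.
    static int searchComparisons(String a, String b) {
        int comparisons = 0;
        for (int start = 0; start + b.length() <= a.length(); start++) {
            int j = 0;
            while (j < b.length()) {
                comparisons++;
                if (a.charAt(start + j) != b.charAt(j)) {
                    break;
                }
                j++;
            }
            if (j == b.length()) {
                break; // pattern found at this position
            }
        }
        return comparisons;
    }

    public static void main(String[] args) {
        String a = repeat('x', 10000);
        String b = "ab";
        System.out.println(prefixComparisons(a, b)); // prints 1
        System.out.println(searchComparisons(a, b)); // prints 9999
    }
}
```

With a 10,000-character text that never matches, the prefix check stops after one comparison while the naive search touches almost every position, which is the gap described above.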
However, as always, whether you really see a difference in practice depends on the actual use case. Measuring the difference in real life is never bad.
Answer by Fredrik
Probably: if it doesn't match, startsWith can stop looking, whereas indexOf needs to look for occurrences later in the string.
Answer by Steve M
startsWith is clearer than indexOf == 0.
Have you identified the test as a performance bottleneck for which you need to sacrifice readability?
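For comparison, the two spellings of the same prefix test look like this side by side. The class and method names, and the "ERR:" prefix, are hypothetical and only there to make the example compile.

```java
// Hypothetical illustration of the two ways to write the same branch.
class PrefixBranch {

    // States the intent directly.
    static boolean viaStartsWith(String record, String prefix) {
        return record.startsWith(prefix);
    }

    // Same result, but the reader has to decode what "== 0" means.
    static boolean viaIndexOf(String record, String prefix) {
        return record.indexOf(prefix) == 0;
    }

    public static void main(String[] args) {
        String[] records = { "ERR: disk full", "note: ERR: later", "ok" };
        for (String record : records) {
            System.out.println(record + " -> "
                    + viaStartsWith(record, "ERR:") + " / "
                    + viaIndexOf(record, "ERR:"));
        }
    }
}
```

Both methods agree on every input; the difference is only in how much the reader has to work out.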
Answer by PeterMmm
You mentioned the dataset is expected to be large, so I would bet that a lot of the running time will go into accessing this dataset and handling it in memory. That means using one or the other will not change the performance significantly. But if this is important to you, you could write your own startsWith method, which could be significantly faster than the standard library methods, or at least you would know exactly what is done.
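A hand-written prefix check of the kind mentioned here might look like the sketch below. The class and method names are hypothetical, and there is no guarantee it beats the library method; that would have to be measured.

```java
// Hypothetical hand-rolled prefix check; not claimed to be faster than String.startsWith.
class ManualPrefix {

    static boolean startsWithManual(String text, String prefix) {
        int n = prefix.length();
        if (n > text.length()) {
            return false; // the prefix cannot fit
        }
        // Compare character by character and stop at the first mismatch.
        for (int i = 0; i < n; i++) {
            if (text.charAt(i) != prefix.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(startsWithManual("abcd", "ab")); // prints true
        System.out.println(startsWithManual("abcd", "bc")); // prints false
        System.out.println(startsWithManual("a", "ab"));    // prints false
    }
}
```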

