Java: String.substring vs String[].split

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/13997361/

Time: 2020-10-31 14:41:19  Source: igfitidea

Tags: java, memory, garbage-collection

Asked by Duncan Krebs

I have a comma-delimited string; calling String.split(",") on it returns an array of about 60 elements. In a specific use case I only need the second value in that array. For example, given "Q,BAC,233,sdf,sdf," all I want is the text after the first ',' and before the second ','. My question is about performance: am I better off parsing it myself using substring, or using the split method and then taking the second value in the array? Any input would be appreciated. This method will be called hundreds of times a second, so it's important that I understand the best approach regarding performance and memory allocation.

-Duncan

Answer by dasblinkenlight

Since String.split returns a String[], using a 60-way split would result in about sixty needless allocations per line. split goes through your entire string and creates sixty new objects plus the array object itself. Of these sixty-one objects you keep exactly one, and let the garbage collector deal with the remaining sixty.
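
To make the allocation pattern concrete, here is a minimal sketch. The class name SplitAllocationDemo and the helper secondViaSplit are made up for illustration; the sample line is the one from the question. It shows that split materializes every field even though only one is used.

```java
class SplitAllocationDemo {
    // Hypothetical helper: returns the second field using split.
    static String secondViaSplit(String line) {
        // Allocates the result array plus one String per field.
        String[] parts = line.split(",");
        // Only this one element is kept; the rest become garbage.
        return parts[1];
    }

    public static void main(String[] args) {
        String line = "Q,BAC,233,sdf,sdf,";
        // Trailing empty strings are dropped, so this sample yields 5 fields.
        System.out.println(line.split(",").length);
        System.out.println(secondViaSplit(line));
    }
}
```

With the roughly 60-field lines described in the question, the same call would create roughly 60 strings per line just to read one of them.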

If you are calling this in a tight loop, a substring would definitely be more efficient: it goes through the portion of your string up to the second comma, and then creates one new object that you keep.

String s = "quick,brown,fox,jumps,over,the,lazy,dog";
int from = s.indexOf(',');            // index of the first comma
int to = s.indexOf(',', from + 1);    // index of the second comma
String brown = s.substring(from + 1, to);
System.out.println(brown);

The above prints brown.
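
The snippet above assumes the string contains at least two commas; if the second indexOf returns -1, substring throws a StringIndexOutOfBoundsException. A guarded variant might look like the sketch below. The class SecondField and the method secondField are hypothetical names, and returning null for a line without any comma is an assumption about what the caller wants, not something the answer specifies.

```java
class SecondField {
    // Hypothetical helper (not from the answer): returns the text between the
    // first and second comma, or null when the line contains no comma at all.
    static String secondField(String line) {
        int first = line.indexOf(',');
        if (first < 0) {
            return null; // no comma, so there is no second field
        }
        int second = line.indexOf(',', first + 1);
        // With only one comma, the second field runs to the end of the line.
        return second < 0
                ? line.substring(first + 1)
                : line.substring(first + 1, second);
    }

    public static void main(String[] args) {
        System.out.println(secondField("Q,BAC,233,sdf,sdf,")); // BAC
        System.out.println(secondField("Q,BAC"));              // BAC
        System.out.println(secondField("Q"));                  // null
    }
}
```

Like the original snippet, this scans only up to the second comma and allocates a single new String.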

When you run this multiple times, substring wins on time hands down: 1,000,000 iterations of split take 3.36s, while 1,000,000 iterations of substring take only 0.05s. And that's with only eight components in the string! The difference for sixty components would be even more drastic.

Answer by Jigar Joshi

Of course: why iterate through the whole string? Just use substring() and indexOf().

Answer by fge

You are certainly better off doing it by hand for two reasons:

  • .split() takes a string as an argument, but this string is interpreted as a Pattern, and for your use case a Pattern is costly;
  • as you say, you only need the second element: the algorithm to grab that second element is simple enough to do by hand.
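
The first point is easy to see in practice. The sketch below uses illustrative names (SplitRegexDemo, fieldCount, secondViaLimitedSplit) to show that the argument to split is treated as a regular expression. It also shows the standard two-argument overload split(regex, limit), which stops splitting early; that overload is not mentioned in the answer and is included here only as a related option.

```java
import java.util.regex.Pattern;

class SplitRegexDemo {
    // Number of fields produced when `regex` is used as the split pattern.
    static int fieldCount(String s, String regex) {
        return s.split(regex).length;
    }

    // With limit 3 the pattern is applied at most twice; the third element
    // holds the unsplit remainder of the line.
    static String secondViaLimitedSplit(String line) {
        return line.split(",", 3)[1];
    }

    public static void main(String[] args) {
        // "." is a regex that matches any character, so every character is a
        // delimiter and all resulting (empty) fields are dropped.
        System.out.println(fieldCount("a.b.c", "."));                 // 0
        // Quoting the delimiter makes it match a literal dot.
        System.out.println(fieldCount("a.b.c", Pattern.quote(".")));  // 3
        System.out.println(secondViaLimitedSplit("Q,BAC,233,sdf,sdf,")); // BAC
    }
}
```

The limited split still allocates an array and three strings, so it does less work than a full split but more than the indexOf and substring approach.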

Answer by MrSmith42

I would use something like:

final int first = searchString.indexOf(",");
final int second = searchString.indexOf(",", first + 1);
String result = searchString.substring(first + 1, second);

Answer by Justin Niessner

My first inclination would be to find the index of the first and second commas and take the substring.

The only real way to tell for sure, though, is to test each in your particular scenario. Break out the appropriate stopwatch and measure the two.
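
A rough way to do that measurement is sketched below. The class and method names are made up for illustration, and the test string is borrowed from dasblinkenlight's answer. This is a naive stopwatch with no warm-up phase, so the numbers it prints are only indicative; a harness such as JMH would give more trustworthy results.

```java
class SplitVsSubstringTiming {
    static String viaSplit(String s) {
        return s.split(",")[1];
    }

    static String viaSubstring(String s) {
        int from = s.indexOf(',');
        int to = s.indexOf(',', from + 1);
        return s.substring(from + 1, to);
    }

    public static void main(String[] args) {
        String s = "quick,brown,fox,jumps,over,the,lazy,dog";
        int iterations = 1_000_000;
        long sink = 0; // accumulate results so the work cannot be optimized away

        long t0 = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += viaSplit(s).length();
        }
        long t1 = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += viaSubstring(s).length();
        }
        long t2 = System.nanoTime();

        System.out.println("split:     " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("substring: " + (t2 - t1) / 1_000_000 + " ms");
        System.out.println("checksum:  " + sink);
    }
}
```

Running it against real input lines, with the real field count, matters more than the absolute numbers from this sample string.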
