Why is String.chars() a stream of ints in Java 8?

Notice: This page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/22435833/

Date: 2020-08-13 15:42:45  Source: igfitidea

Why is String.chars() a stream of ints in Java 8?

Tags: java, string, java-8

Asked by Adam Dyga

In Java 8, there is a new method String.chars() which returns a stream of ints (IntStream) that represent the character codes. I guess many people would expect a stream of chars here instead. What was the motivation to design the API this way?

Accepted answer by skiwi

As others have already mentioned, the design decision behind this was to prevent the explosion of methods and classes.

Still, personally I think this was a very bad decision. Given that they did not want to add a CharStream, which is reasonable, there should have been different methods instead of chars(). I would think of:

  • Stream<Character> chars(), which gives a stream of boxed characters and carries a light performance penalty.
  • IntStream unboxedChars(), which would be used for performance-sensitive code.

However, instead of focusing on why it is currently done this way, I think this answer should focus on showing a way to do it with the API that we got with Java 8.

In Java 7 I would have done it like this:

for (int i = 0; i < hello.length(); i++) {
    System.out.println(hello.charAt(i));
}

And I think a reasonable method to do it in Java 8 is the following:

hello.chars()
        .mapToObj(i -> (char)i)
        .forEach(System.out::println);

Here I obtain an IntStream and map it to an object via the lambda i -> (char)i. This automatically boxes it into a Stream<Character>, and then we can do what we want and, as a plus, still use method references.

Be aware, though, that you must use mapToObj. If you forget and use map, nothing will complain, but you will still end up with an IntStream, and you might be left wondering why it prints the integer values instead of the strings representing the characters.
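
To make the difference concrete, here is a minimal sketch that collects both variants into lists so they can be compared side by side. The class and method names (MapVsMapToObj, withMap, withMapToObj) are made up for illustration and are not part of any API:

```java
import java.util.List;
import java.util.stream.Collectors;

public class MapVsMapToObj {
    // map keeps the stream an IntStream: the char produced by the lambda
    // is widened straight back to int, so boxed() yields Integer values.
    public static List<Integer> withMap(String s) {
        return s.chars()
                .map(i -> (char) i)
                .boxed()
                .collect(Collectors.toList());
    }

    // mapToObj switches to Stream<Character>: each char is boxed.
    public static List<Character> withMapToObj(String s) {
        return s.chars()
                .mapToObj(i -> (char) i)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(withMap("hi"));      // [104, 105]
        System.out.println(withMapToObj("hi")); // [h, i]
    }
}
```

Both versions compile without warnings, which is why the mistake is easy to miss.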

Other ugly alternatives for Java 8:

If you remain in an IntStream and ultimately want to print the characters, you can no longer use method references for printing:

hello.chars()
        .forEach(i -> System.out.println((char)i));

Moreover, method references to your own methods no longer work! Consider the following:

private void print(char c) {
    System.out.println(c);
}

and then

hello.chars()
        .forEach(this::print);

This will give a compile error, as there possibly is a lossy conversion.
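
For completeness, here is a sketch of two ways around that compile error, as far as I can tell. The class name MethodRefWorkarounds and the StringBuilder used to record output are assumptions made for the sake of a self-contained example; only print(char) mirrors the method above:

```java
public class MethodRefWorkarounds {
    // Collects the characters instead of printing them, purely so that
    // the result can be inspected (illustrative only).
    private final StringBuilder out = new StringBuilder();

    private void print(char c) {
        out.append(c);
    }

    // Option 1: box to Character first. The method reference then compiles,
    // because a Character argument can be unboxed to char at the call.
    public String viaMapToObj(String s) {
        out.setLength(0);
        s.chars().mapToObj(i -> (char) i).forEach(this::print);
        return out.toString();
    }

    // Option 2: stay in the IntStream and cast explicitly inside a lambda.
    public String viaLambda(String s) {
        out.setLength(0);
        s.chars().forEach(i -> print((char) i));
        return out.toString();
    }

    public static void main(String[] args) {
        MethodRefWorkarounds demo = new MethodRefWorkarounds();
        System.out.println(demo.viaMapToObj("hello")); // hello
        System.out.println(demo.viaLambda("hello"));   // hello
    }
}
```

Option 1 pays for boxing each character; option 2 avoids boxing but gives up the method reference.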

Conclusion:

The API was designed this way because the designers did not want to add a CharStream. I personally think that the method should return a Stream<Character>, and the workaround currently is to use mapToObj(i -> (char)i) on the IntStream to be able to work properly with the characters.

Answer by Stuart Marks

The answer from skiwi covered many of the major points already. I'll fill in a bit more background.

The design of any API is a series of tradeoffs. In Java, one of the difficult issues is dealing with design decisions that were made long ago.

Primitives have been in Java since 1.0. They make Java an "impure" object-oriented language, since the primitives are not objects. The addition of primitives was, I believe, a pragmatic decision to improve performance at the expense of object-oriented purity.

This is a tradeoff we're still living with today, nearly 20 years later. The autoboxing feature added in Java 5 mostly eliminated the need to clutter source code with boxing and unboxing method calls, but the overhead is still there. In many cases it's not noticeable. However, if you were to perform boxing or unboxing within an inner loop, you'd see that it can impose significant CPU and garbage collection overhead.

When designing the Streams API, it was clear that we had to support primitives. The boxing/unboxing overhead would kill any performance benefit from parallelism. We didn't want to support all of the primitives, though, since that would have added a huge amount of clutter to the API. (Can you really see a use for a ShortStream?) "All" or "none" are comfortable places for a design to be, yet neither was acceptable. So we had to find a reasonable value of "some". We ended up with primitive specializations for int, long, and double. (Personally I would have left out int but that's just me.)

For CharSequence.chars() we considered returning Stream<Character> (an early prototype might have implemented this), but it was rejected because of boxing overhead. Considering that a String has char values as primitives, it would seem to be a mistake to impose boxing unconditionally when the caller would probably just do a bit of processing on the value and unbox it right back into a string.
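
As an illustration of that "process and put it right back into a string" pattern, here is a minimal sketch that never leaves the IntStream, so no Character objects are created. The class and method names (UpperCaseChars, toUpper) are hypothetical, and upper-casing is just a stand-in for "a bit of processing":

```java
public class UpperCaseChars {
    // Transforms each char value as an int and appends it to a
    // StringBuilder, casting back to char only at the final step.
    public static String toUpper(String s) {
        return s.chars()
                .map(Character::toUpperCase)
                .collect(StringBuilder::new,
                         (sb, c) -> sb.append((char) c),
                         StringBuilder::append)
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(toUpper("hello")); // HELLO
    }
}
```

The cast in sb.append((char) c) is the one place where the caller has to remember that the ints are really chars; without it, append(int) would add the decimal digits instead.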

We also considered a CharStream primitive specialization, but its use would seem to be quite narrow compared to the amount of bulk it would add to the API. It didn't seem worthwhile to add it.

The penalty this imposes on callers is that they have to know that the IntStream contains char values represented as ints and that casting must be done at the proper place. This is doubly confusing because there are overloaded API calls like PrintStream.print(char) and PrintStream.print(int) that differ markedly in their behavior. An additional point of confusion possibly arises because the codePoints() call also returns an IntStream but the values it contains are quite different.
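
Both sources of confusion can be seen in a small sketch. The class name, helper methods, and sample string below are assumptions chosen for illustration; the sample is the letter 'a' followed by U+1F600, a character outside the Basic Multilingual Plane that is stored as two char values:

```java
public class CharsVsCodePoints {
    // 'a' followed by U+1F600, written as its surrogate pair.
    static final String SAMPLE = "a\uD83D\uDE00";

    // chars() yields one int per UTF-16 char value.
    public static long charCount(String s) {
        return s.chars().count();
    }

    // codePoints() yields one int per Unicode code point.
    public static long codePointCount(String s) {
        return s.codePoints().count();
    }

    public static void main(String[] args) {
        int value = "a".chars().findFirst().getAsInt();
        System.out.println(value);        // println(int) is chosen: 97
        System.out.println((char) value); // println(char) is chosen: a

        System.out.println(charCount(SAMPLE));      // 3
        System.out.println(codePointCount(SAMPLE)); // 2
    }
}
```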

So, this boils down to choosing pragmatically among several alternatives:

  1. We could provide no primitive specializations, resulting in a simple, elegant, consistent API, but which imposes a high performance and GC overhead;

  2. we could provide a complete set of primitive specializations, at the cost of cluttering up the API and imposing a maintenance burden on JDK developers; or

  3. we could provide a subset of primitive specializations, giving a moderately sized, high performing API that imposes a relatively small burden on callers in a fairly narrow range of use cases (char processing).

We chose the last one.
