Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/11607020/

Date: 2020-10-22 04:21:59  Source: igfitidea

Haskell, Scala, Clojure, what to choose for high performance pattern matching and concurrency

Tags: scala, haskell, clojure, programming-languages, ocaml

Asked by 2ndlife

I started working with FP recently after reading a lot of blogs and posts about its advantages for concurrent execution and performance. My need for FP is largely driven by the application I am developing: a state-based data injector that feeds another subsystem where timing is crucial (close to 2 million transactions per second). I have a couple of such subsystems that need to be tested. I am seriously considering FP for its parallelism and want to take the right approach. Many posts on SO discuss the advantages and disadvantages of Scala, Haskell and Clojure with respect to language constructs, libraries and JVM support. From a language point of view, I am OK with learning any language as long as it helps me achieve the result.

Certain posts favor Haskell for its pattern matching and simplicity, while JVM-based FP languages have a big advantage in being able to use existing Java libraries. Jane Street is a big OCaml supporter, but I am really not sure about developer support and help forums for OCaml.

If anybody has worked with handling such large data, please share your experience.

Answered by Rex Kerr

Do you want fast or do you want easy?

If you want fast, you should use C++, even if you're using FP principles to aid in correctness. Since timing is crucial, the support for soft (and hard, if need be) real-time programming will be important. You can decide exactly how and when you have time to recover memory, and spend only as much time as you have on that task.

The three languages you've listed all tend to be ~2-3x slower than near-optimally hand-tuned C++, and then only when used in a rather traditional imperative way. They all use garbage collection, which will introduce uncontrolled random delays in your transactions.
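
To put the timing concern in numbers, here is a minimal back-of-the-envelope sketch in Java. The 2 million transactions per second figure comes from the question; the class name, method names and the 10 ms pause used in the usage note are assumptions chosen purely for illustration, not measurements of any particular runtime or collector.

```java
// Back-of-the-envelope latency budget.
// The pause length passed in is a hypothetical input, not a measured GC pause.
final class LatencyBudget {
    private LatencyBudget() {}

    /** Average time available per transaction, in nanoseconds. */
    static double nanosPerTransaction(double transactionsPerSecond) {
        return 1_000_000_000.0 / transactionsPerSecond;
    }

    /** Number of transactions that arrive while the process is paused. */
    static long transactionsDuringPause(double transactionsPerSecond, double pauseMillis) {
        return Math.round(transactionsPerSecond * pauseMillis / 1_000.0);
    }
}
```

Calling `LatencyBudget.nanosPerTransaction(2_000_000.0)` gives an average budget of 500 ns per transaction, and `LatencyBudget.transactionsDuringPause(2_000_000.0, 10.0)` shows that a hypothetical 10 ms pause would span 20,000 transactions. That is why an uncontrolled pause matters far more for timing than for average throughput.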

Now, that said, it's a lot of work to get this running in bulletproof fashion with C++. Applying FP principles requires considerably more boilerplate (even in C++11), and most libraries are mutable by default. (Edit: Rust is becoming a good alternative, but it is beyond the scope of this answer to describe Rust in sufficient detail.)

Maybe you don't have the time and can afford to scale back on other specifications. If it is not timing but throughput that is crucial, for example, then you probably want Scala over Clojure (see the Computer Languages Benchmark Game, where Scala wins every benchmark as of this writing and has lower code size in almost every case). (Edit: CLBG is not helpful in this regard any more, though you may find archives supporting these statements on the Web Archive.) OCaml and Haskell should be chosen for other reasons (similar benchmark scores, but they have different syntax and interoperability and so on).

As far as which system has the best concurrency support, Haskell, Clojure and Scala are all just fine while OCaml is a bit lacking.

This pretty much narrows it down to Haskell and Scala. Do you need to use Java libraries? Scala. Do you need to use C libraries? Probably Haskell. Do you need neither? Then you can choose either on the basis of which one you prefer stylistically without having to worry overly much that you've made your life vastly harder by choosing the wrong one.

Answered by mikera

I've done this with Clojure, which proved pretty effective for the following reasons:

  • Being on the JVM is a huge advantage in terms of libraries. This effectively ruled out Haskell and OCaml for my purposes, as we needed easy access to the Java ecosystem and integration with JVM-based tools (Maven build etc.)
  • You can drop into pure Java if you need to tightly optimise inner loops. We did this for some custom code processing large double[] arrays, but 99% of the time Clojure can get you the performance you need. See http://www.infoq.com/presentations/Why-Prismatic-Goes-Faster-With-Clojure for some examples of how to make Clojure go really fast (quite technical video, assumes some prior knowledge!). Once you start counting the ease of exploiting multiple cores, Clojure is very competitive on performance.
  • Clojure has very nice multi-core concurrency support. This proved extremely useful for managing concurrent tasks. See http://www.infoq.com/presentations/Value-Identity-State-Rich-Hickey
  • The REPL makes a very good environment for testing and exploratory work on data.
  • Clojure is lazy, which makes it suitable for handling larger-than-memory data sets (assuming you are careful not to try and force the whole data set into memory at once). There are also some nice libraries available in such an environment, most notably Storm and Aleph. Storm may be particularly interesting for you, as it's designed for distributed realtime processing of large numbers of events.
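
As a rough illustration of the "drop into pure Java for inner loops" point, a kernel over primitive `double[]` arrays might look like the sketch below. The class and method names are hypothetical and this is not the code the answer refers to; it only shows the general shape of such a helper, which avoids boxing and allocation inside the loop.

```java
// Hypothetical helper class, shown package-private to keep the sketch short.
// In a real project it would typically be a public class in a named package
// so that other JVM languages can reach it through Java interop.
final class ArrayKernels {
    private ArrayKernels() {}

    /** Dot product over primitive arrays: no boxing, no allocation in the loop. */
    static double dot(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }
}
```

For example, `ArrayKernels.dot(new double[] {1, 2, 3}, new double[] {4, 5, 6})` returns 32.0. The idea is to keep only the hot loop in Java and leave the surrounding orchestration in the higher-level language.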

I can't speak with quite so much experience of the other languages, but my impression from some practical experience of Haskell and Scala is:

  • Haskell is great if you care about purity and strict functional programming with static types. The static typing can be a strong guarantee of correctness, so it might be suitable for highly algorithmic work. Personally, I find pure FP a little too rigid: there are many times when mutable state is useful, and I think Clojure strikes a slightly better balance here (by allowing controlled mutability through managed references).
  • Scala is a great language and shares with Clojure the advantages of being on the JVM. To me Scala is more like a "better Java" with functional features and a very impressive type system. It's less of a paradigm shift than Clojure. The downside is that the type system can get quite complex and confusing.
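
The idea of "controlled mutability through managed references" can be sketched in plain Java as a single mutable reference that only ever holds immutable values and is updated atomically. The `ManagedRef` class below is a hypothetical illustration built on `java.util.concurrent.atomic.AtomicReference`; it is meant as a loose analogue of the concept, not as a description of how any of the languages discussed here implement it.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

// Hypothetical sketch: one mutable cell holding an immutable value.
final class ManagedRef<T> {
    private final AtomicReference<T> ref;

    ManagedRef(T initial) {
        this.ref = new AtomicReference<>(initial);
    }

    /** Read the current value without locking. */
    T get() {
        return ref.get();
    }

    /**
     * Atomically replace the value with update(current) and return the new value.
     * The function may be re-applied under contention, so it should be free of
     * side effects.
     */
    T swap(UnaryOperator<T> update) {
        return ref.updateAndGet(update);
    }
}
```

A counter, for instance, could be created with `new ManagedRef<>(0L)` and incremented from any thread with `counter.swap(n -> n + 1L)`. All mutation goes through one well-defined point, while the values themselves stay immutable.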

Overall, I think you could be happy with any of these. It will probably come down to how much you care about the JVM and your view on type systems.
