Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/11651081/

Time: 2020-10-26 13:54:42  Source: igfitidea

Javascript and Scientific Processing?

Tags: javascript, data-mining, scientific-computing

Asked by MikeB

Matlab, R, and Python are powerful but either costly or slow for some data mining work I'd like to do. I'm considering using JavaScript for its speed, its good visualization libraries, and the ability to use the browser as an interface.

The first question I faced is the obvious one for scientific programming: how do I do I/O on data files? The second: client-side or server-side? The last: can I make something that is truly portable, i.e. put it all on a USB drive and run it from there?

I've spent a couple of weeks looking for answers. Server2go seems to address the client/server needs, which I think means I can get data to and from the programs on the client side. Server2go also allows running from a USB drive. The data files I work with are usually XML, and there seem to be several JavaScript XML-to-JSON converters.

However, after all the looking around, I'm not sure if my approach makes sense. So before I commit further, any advice/thoughts/guidance on Javascript as a portable tool for scientific data processing?

Accepted answer by Odalrick

I have to agree with the comments that JavaScript is not a good fit for scientific processing. However, you know your requirements best; maybe you have already found useful libraries that do what you need. Just be aware that you'll have to implement all logic yourself. There is no built-in handling of complex numbers, matrices, integrals, or ... Usually programmer time is far more valuable than machine time. Personally, I'd look into compiled languages only after I had created a first version, in whatever language I like the most, that isn't fast enough.
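
To give a sense of what "implement it yourself" means for something like complex numbers, here is a minimal sketch in plain JavaScript. The `complex`, `add`, and `mul` helpers are names made up for this illustration; they are not part of the language or of any particular library.

```javascript
// Hypothetical helpers: JavaScript has no built-in complex number type,
// so a complex value is represented here as a plain { re, im } object.
function complex(re, im) {
  return { re, im };
}

function add(a, b) {
  return complex(a.re + b.re, a.im + b.im);
}

// (a + bi)(c + di) = (ac - bd) + (ad + bc)i
function mul(a, b) {
  return complex(a.re * b.re - a.im * b.im, a.re * b.im + a.im * b.re);
}

const i = complex(0, 1);
const iSquared = mul(i, i); // real part -1, imaginary part 0
console.log(iSquared.re, iSquared.im);
```

Every further operation (division, exponentials, matrices of complex values) would have to be written or imported in the same way.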

Assuming that JavaScript is the way to go:

Data I/O

I can think of three options:

Sending and receiving data with ajax to a server

This seems to be the solution you've found with Server2go. It requires you to write a server back end, but that can be kept quite simple. All it really needs to do is read and write files in response to your client-side application.
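
As a rough sketch of how small such a back end can be, the following uses Node.js's built-in `http` and `fs` modules rather than Server2go (that swap is an assumption made for this example). The function names `resolveDataFile` and `createDataServer`, the data directory, and the choice of GET for reading and PUT for writing are all illustrative.

```javascript
const http = require('http');
const fs = require('fs');
const path = require('path');

// Map a URL path such as "/results.json" to a file inside dataDir.
// Returns null if the path would escape the data directory.
function resolveDataFile(dataDir, urlPath) {
  const root = path.resolve(dataDir);
  const file = path.resolve(root, '.' + urlPath);
  return file.startsWith(root + path.sep) ? file : null;
}

// Build (but do not start) a server that reads files on GET
// and writes them on PUT.
function createDataServer(dataDir) {
  return http.createServer((req, res) => {
    const { pathname } = new URL(req.url, 'http://localhost');
    const file = resolveDataFile(dataDir, pathname);
    if (!file) {
      res.writeHead(400).end('Bad path');
      return;
    }
    if (req.method === 'GET') {
      // Read: send the file contents back to the client.
      fs.readFile(file, (err, data) => {
        if (err) {
          res.writeHead(404).end('Not found');
          return;
        }
        res.writeHead(200).end(data);
      });
    } else if (req.method === 'PUT') {
      // Write: collect the request body, then save it.
      const chunks = [];
      req.on('data', (chunk) => chunks.push(chunk));
      req.on('end', () => {
        fs.writeFile(file, Buffer.concat(chunks), (err) => {
          res.writeHead(err ? 500 : 204).end();
        });
      });
    } else {
      res.writeHead(405).end();
    }
  });
}

// To start it: createDataServer('./data').listen(8080);
```

Assuming the page itself is served from the same origin, the client side could then read a file with `fetch('/results.json').then((r) => r.json())` and write one with `fetch('/results.json', { method: 'PUT', body: JSON.stringify(data) })`.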

Using a non-browser implementation of V8 which includes file I/O

For instance Node.js. You could then avoid the need for a server and simply use a command-line interface, and all code will be JavaScript. Other than that it is roughly equivalent to the first option.
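
A minimal sketch of this option, assuming Node.js and its built-in `fs` module: read a JSON array of numbers from a file, compute a couple of summary statistics, and write the result to another file. The function name `summarizeFile` and the file names are made up for the example.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Read a JSON array of numbers, compute summary statistics, write them out.
function summarizeFile(inputPath, outputPath) {
  const values = JSON.parse(fs.readFileSync(inputPath, 'utf8'));
  const n = values.length;
  const mean = values.reduce((sum, x) => sum + x, 0) / n;
  // Population variance (divides by n, not n - 1).
  const variance = values.reduce((sum, x) => sum + (x - mean) ** 2, 0) / n;
  const summary = { n, mean, variance };
  fs.writeFileSync(outputPath, JSON.stringify(summary));
  return summary;
}

// Demo with temporary files, so the sketch does not touch real data.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'sci-'));
const input = path.join(dir, 'input.json');
const output = path.join(dir, 'summary.json');
fs.writeFileSync(input, JSON.stringify([1, 2, 3, 4]));
console.log(summarizeFile(input, output));
```

In a real command-line tool the two paths would come from `process.argv` instead of a temporary directory.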

Creating a file object using the File API, which you ask the user to save or load

It is the worst option in my opinion, as user interaction is required. It would avoid the need for a server; your application could be a simple HTML file that loads all data files with ajax requests. You'd have to start Chrome with a special switch to allow ajax requests with the file:// protocol.
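
A sketch of what the user-driven route can look like with the browser File API. It assumes a page containing an `<input type="file">` element and simple comma-separated numeric data; `parseNumericCsv`, `onFileChosen`, and `offerDownload` are hypothetical names chosen for this example.

```javascript
// Parse simple comma-separated text into rows of numbers.
// Assumes no quoted fields; a real CSV parser handles more cases.
function parseNumericCsv(text) {
  return text
    .trim()
    .split(/\r?\n/)
    .map((line) => line.split(',').map(Number));
}

// Browser only: call with the change event of an <input type="file">.
async function onFileChosen(event) {
  const file = event.target.files[0];
  return parseNumericCsv(await file.text());
}

// Browser only: offer `rows` to the user as a JSON download.
function offerDownload(rows, filename) {
  const blob = new Blob([JSON.stringify(rows)], { type: 'application/json' });
  const link = document.createElement('a');
  link.href = URL.createObjectURL(blob);
  link.download = filename;
  link.click();
}
```

The user has to pick the input file and confirm the download each time, which is the interaction cost described above.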

These options are only concerned with file I/O, and you can't do file I/O in JavaScript. This is because browsers cannot allow arbitrary web code to do arbitrary file I/O; the security implications would be horrendous. Each option describes one way to not do file I/O.

The first communicates with a server that does the file I/O for the client.

The second uses "special" versions of JavaScript, which run under conditions different from the browser's, so the security implications do not matter. But that means you'll have to look up how file I/O is done in the implementation you actually use; it is not standard across JavaScript.

The third requires the user to control the file I/O.

Interface

Even if you don't use JavaScript to do the actual processing, which so far is the consensus, there is nothing stopping you from using a browser as the interface or JavaScript libraries for visualisation. That is something JavaScript is good at.

If you want to interactively control your data mining tool, you will need a server that can control the tool. Server2go should work, or the built-in server in Node.js if you use that, or ... If you don't need interactive control of the data tool, that is, you first generate the processed data and then look at it, a server can be avoided by using the file:// protocol and JSONP. But really, avoiding a server shouldn't be a goal.
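
The JSONP approach can be sketched roughly as follows. The idea is that the processing step writes its output as a small script that calls a known function, and the page loads that script with a `<script>` tag instead of an ajax request. The file name `results.js` and the callback name `receiveResults` are assumptions made for this example.

```javascript
// Assumed layout: the processing step writes a file "results.js" next to
// the HTML page, containing one call to a callback of our choosing:
//
//   receiveResults({ "n": 4, "mean": 2.5 });
//
// Both the file name and the callback name are made up for this sketch.

const received = [];

// The callback that the data file invokes once the browser executes it.
function receiveResults(data) {
  received.push(data);
}

// Browser only: a <script> tag loads and runs the data file.
function loadDataScript(src) {
  const script = document.createElement('script');
  script.src = src;
  document.head.appendChild(script);
}
```

In the page, `loadDataScript('results.js')` would then trigger `receiveResults` with the processed data, which the visualisation code can pick up from `received`.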

I won't go into detail about interface issues, as there is nothing specific to say and very nearly everything that has been written about JavaScript is about interfaces.

One thing: do use a declarative data binding library like Angular.js or Knockout.js.

Answer by Has QUIT--Anony-Mousse

JavaScript speed is heavily overrated. This is a Web 2.0 myth.

Let me explain this claim a bit (and don't just downvote me for saying something you do not want to hear!)

Sure, JavaScript V8 is a quite highly optimized VM. It does beat many other scripting languages in naive benchmarks.

However, it is a language of very limited scope. It is meant for the "ADHD world" of the web. It is best effort: it may just fail, and you have few guarantees that things complete, or complete on time.

Consider for example MongoDB. At first it seems good and fast and seems to offer a lot. Until you see, for example, that its MapReduce is single-threaded only and thus really slow. All that glitters is not gold!

Now look at libraries relevant to data mining, such as BLAS: basic linear algebra, math operations and such. All CPU manufacturers, like Intel and AMD, offer optimized versions for their CPUs. This is an optimization that requires detailed understanding of the individual CPUs, way beyond the capabilities of our current compilers. The libraries contain optimized code paths for various CPUs, all essentially doing the same thing. For these operations, using an optimized library such as BLAS can easily yield a 5-20x speedup; at the same time, matrix operations, which are often in O(n^2) or O(n^3), will dominate your overall runtime.
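
For reference, this is roughly what the unoptimized version of such an operation looks like when written by hand in JavaScript: a naive dense matrix product with three nested loops, hence O(n^3). It is a sketch of the kind of loop that a tuned BLAS matrix multiply routine replaces, not a claim about how any BLAS implementation works internally; `matmul` is a name chosen for this example.

```javascript
// Naive dense matrix product C = A * B for n x n matrices stored
// row-major in Float64Array. Three nested loops give O(n^3) work.
function matmul(a, b, n) {
  const c = new Float64Array(n * n);
  for (let i = 0; i < n; i++) {
    for (let k = 0; k < n; k++) {
      const aik = a[i * n + k];
      for (let j = 0; j < n; j++) {
        c[i * n + j] += aik * b[k * n + j];
      }
    }
  }
  return c;
}

const a = Float64Array.from([1, 2, 3, 4]); // [[1, 2], [3, 4]]
const identity = Float64Array.from([1, 0, 0, 1]);
console.log(matmul(a, identity, 2)); // same values as a
```

Doubling n multiplies the work by eight, which is why this step tends to dominate the runtime.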

So a good language for data mining will let you go all the way to machine code!

Python's SciPy and R are good choices here. They have the optimized libraries inside and easily accessible, but at the same time let you do the wrapper stuff in a simpler language.

Have a look at this programming language benchmark:

http://benchmarksgame.alioth.debian.org/u32/which-programs-are-fastest.html

Pure JavaScript has a high variance, indicating that it can do some things fast (mostly regular expressions!) and others much slower. It can clearly beat PHP, but it will just as clearly be beaten by C and Java.

Multithreading is also important for modern data mining. Few large systems today have a single core, and you do want to make use of all cores. So you need libraries and a programming language with a powerful set of multithreading operations. This is actually why Fortran and C are losing popularity here. Other languages such as Java are much better here.
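
For comparison, here is a sketch of what spreading work over several cores can look like in JavaScript, assuming a Node.js version that ships the `worker_threads` module. This is an illustration only and not something this answer proposes; `parallelSumSquares` and the other names are made up, and each worker receives a copy of its chunk rather than sharing memory.

```javascript
const { Worker } = require('worker_threads');

// Code run inside each worker: sum the squares of the chunk it was given.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  let sum = 0;
  for (const x of workerData) sum += x * x;
  parentPort.postMessage(sum);
`;

// Start one worker for one chunk and resolve with its partial sum.
function sumSquaresInWorker(chunk) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: chunk });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

// Split the input into roughly equal chunks, one per worker,
// then add up the partial results.
async function parallelSumSquares(values, workerCount) {
  const chunkSize = Math.ceil(values.length / workerCount);
  const jobs = [];
  for (let i = 0; i < values.length; i += chunkSize) {
    jobs.push(sumSquaresInWorker(values.slice(i, i + chunkSize)));
  }
  const partials = await Promise.all(jobs);
  return partials.reduce((a, b) => a + b, 0);
}

parallelSumSquares([1, 2, 3, 4], 2).then((total) => console.log(total)); // logs 30
```

The explicit message passing and data copying shown here is part of the overhead to weigh against languages with shared-memory threads.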

Answer by user1082748

Although this discussion is a bit old and I am not a JavaScript guru by any stretch of the imagination, I find the above arguments about JavaScript lacking the processing speed or the capabilities for advanced math operations doubtful. WebGL is a JavaScript API for rendering advanced 2D and 3D graphics, which relies heavily on advanced math operations. I believe the capabilities are there from a technical point of view; what is lacking is good libraries for handling statistical analysis, natural language processing and the other predictive analytics included in data mining.

WebGL is based on OpenGL, which in turn uses libraries like BLAS.

Advances like Node.js and V8 make it technically possible. What is lacking is libraries like those we can find in R and Scilab to do the same operations.
