How to import Pandas with Jython

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/36213908/

Time: 2020-09-14 00:55:53  Source: igfitidea

How can I import Pandas with Jython

python pandas jython-2.7

Asked by Jimmy Chu

I'm new to Python, and I've installed Jython 2.7.0.

Java

import org.python.util.PythonInterpreter;
import org.python.core.*;

public class Main {
    public static void main(String[] args) {
        // Run the Python script inside an embedded Jython interpreter.
        PythonInterpreter interp = new PythonInterpreter();
        interp.execfile("D:/Users/JY/Desktop/test/for_java_test.py");
        interp.close();
    }
}

Python

import pandas as pd
import ctypes

def main():
    data = pd.read_csv('for_test.csv')
    data_mean = data.a*2
    data_mean.to_csv('catch_test.csv',index=False)
    ctypes.windll.user32.MessageBoxW(0, "Done. Output: a * 2", "Output csv", 0)

if __name__ == '__main__':
    main()

Then I got this error.

Exception in thread "main" Traceback (most recent call last):
File "D:\Users\JYJU\Desktop\test_java\for_java_test.py", line 1, in <module>
    import pandas as pd
ImportError: No module named pandas

How can I fix this if I want to use pandas?

Answered by stewori

You currently cannot use Pandas with Jython, because it depends on CPython-specific native extensions. One dependency is NumPy; the other is Cython (which is not itself a native CPython extension, but generates them).

Keep an eye on the JyNI project ("Jython Native Interface"). It enables Jython to use native CPython extensions, and its exact purpose is to solve issues like the one you encountered. However, it is still under heavy development and not yet capable of loading Pandas or NumPy into Jython, although both frameworks are high on the priority list.

(E.g. ctypes is already working to some extent.)

Also, it is currently POSIX only (tested on Linux and OSX).

If you don't require Jython specifically, but just some Java/Pandas interoperation, an already workable solution is to embed the CPython interpreter. JPY and JEP are projects that provide this. With either of them you should be able to interoperate between Java and Pandas (or any other CPython-specific framework).

Answered by farzad

As far as I know, pandas is written in Cython and is a CPython extension. This means that it's meant to be used with the CPython implementation of the Python language (which is the primary implementation most people use).

Jython is a Python implementation that runs Python programs on the JVM. It is used to integrate with Java libraries, to provide Python scripting for Java programs, etc.

Python modules implemented as CPython extensions (like pandas) are not necessarily compatible with all Python implementations (well-known implementations other than CPython are Jython, PyPy and IronPython).
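A script can check which implementation it is running on and fail with a clearer message than the bare ImportError from the question. The following is only a sketch: `running_on_jython` and `import_pandas_or_explain` are hypothetical helper names made up for illustration, and the check relies on the standard library's `platform.python_implementation()`.

```python
import platform


def running_on_jython():
    """Return True when the current interpreter reports itself as Jython."""
    # python_implementation() returns names such as 'CPython', 'Jython',
    # 'PyPy' or 'IronPython'.
    return platform.python_implementation() == "Jython"


def import_pandas_or_explain():
    """Import pandas, or explain why that cannot work on this interpreter.

    Hypothetical helper for illustration; not part of pandas or Jython.
    """
    if running_on_jython():
        raise ImportError(
            "pandas relies on CPython native extensions and "
            "cannot be imported under Jython"
        )
    import pandas  # only attempted on non-Jython interpreters
    return pandas
```

Calling `import_pandas_or_explain()` at the top of a script would then point at the real cause instead of "No module named pandas".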

If you really have to use Jython and pandas together and you could not find another way to solve the issue, then I suggest using them in different processes.

One process is your Jython application running on the JVM (either Java code calling Jython libraries, or Python code that requires integration with some Java libraries), and a separate CPython process runs to provide the operations required from pandas.

Then use some form of IPC (or tool) to communicate (standard IO, sockets, OS pipes, shared memory, memcache, Redis, etc.).

The Java process sends a request to the CPython process (or registers the request in shared storage), providing the processing parameters. The CPython process uses pandas to calculate the results and sends back a serialized form of them (or puts the results back in the shared storage).
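The CPython side of such a request/response exchange could look roughly like this. It is a sketch under several assumptions that are not in the answer: requests and responses are single JSON lines on standard input and output, the request keys `input` and `output` are invented names, and `double_column` simply mirrors the `a * 2` operation from the question.

```python
import json
import sys


def double_column(request):
    """Example operation: read a CSV, double column 'a', write the result."""
    import pandas as pd  # imported lazily; only the CPython worker needs it

    data = pd.read_csv(request["input"])
    (data["a"] * 2).to_csv(request["output"], index=False)
    return {"output": request["output"]}


def handle_request(line, operation=double_column):
    """Turn one JSON request line into one JSON response line."""
    try:
        request = json.loads(line)
        response = {"ok": True, "result": operation(request)}
    except Exception as error:  # report failures to the caller instead of dying
        response = {"ok": False, "error": str(error)}
    return json.dumps(response, sort_keys=True)


def serve(stdin=sys.stdin, stdout=sys.stdout):
    """Answer requests line by line until the caller closes the input."""
    for line in iter(stdin.readline, ""):
        if line.strip():
            stdout.write(handle_request(line) + "\n")
            stdout.flush()  # the caller is waiting for this line
```

A worker script would end with a call to `serve()`. The Java side would start it as a child process, write one request line and read one response line. JSON over standard IO is just one of the IPC options listed above; sockets or shared storage would follow the same request/response shape.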

This approach requires extra coding, both to split the tasks into separate processes and to serialize the requests and responses (how to do that depends on the application and the data it's trying to process).

For example, with the sample code in the question, the Java process can provide the CSV filename to the CPython process, which processes the CSV file using pandas, generates the result CSV file and returns the name of the new file to the Java process.
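The calling side of this filename exchange can be sketched with the standard `subprocess` module. The assumptions here are mine, not the answer's: the worker is some command that reads one CSV path from standard input and prints the path of the result file, and `run_worker` is a hypothetical helper name. Whether the code runs unchanged under a particular Jython release should be verified.

```python
import subprocess


def run_worker(command, csv_path):
    """Send one CSV path to a worker process and return the path it prints."""
    worker = subprocess.Popen(
        command,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # exchange text rather than bytes
    )
    # communicate() writes the request, closes stdin and waits for the reply.
    reply, _ = worker.communicate(csv_path + "\n")
    if worker.returncode != 0:
        raise RuntimeError("worker exited with status %d" % worker.returncode)
    return reply.strip()
```

A call might look like `run_worker(["python", "pandas_worker.py"], "for_test.csv")`, where `pandas_worker.py` is a hypothetical CPython script that does the pandas work. Starting a new process per request is simple but slow; a long-running worker would avoid paying the interpreter and pandas start-up cost each time.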