Running an R script from Python

Question

Asked by Ehsan

I searched for this question and found some answers, but none of them seem to work. This is the Python code I'm using to run my R script:

import subprocess
retcode = subprocess.call("/usr/bin/Rscript --vanilla -e 'source(\"/pathto/MyrScript.r\")'", shell=True)

and I get this error:

Error in read.table(file = file, header = header, sep = sep, quote = quote,  : 
  no lines available in input
Calls: source ... withVisible -> eval -> eval -> read.csv -> read.table
Execution halted

and here is the content of my R script (pretty simple!)

data = read.csv('features.csv')
data1 = read.csv("BagofWords.csv")
merged = merge(data,data1)
write.table(merged, "merged.csv",quote=FALSE,sep=",",row.names=FALSE)
for (i in 1:length(merged$fileName))
{
        fileConn<-file(paste("output/",toString(merged$fileName[i]),".txt",sep=""))
        writeLines((toString(merged$BagofWord[i])),fileConn)
        close(fileConn)
}

The R script works fine when I use source('MyrScript.r') in the R command line. Moreover, when I run the exact command that I pass to the subprocess.call function (i.e., /usr/bin/Rscript --vanilla -e 'source("/pathto/MyrScript.r")') in my command line, it works fine. I don't really get what the problem is.

Answer 1

Answered by pyCthon

I would not suggest using a system call, since there are many differences between Python and R, especially when passing data around.

There are many standard libraries to choose from for calling R from Python; see this answer.

Answer 2

Answered by Aaron Hall

I think RPy2 is worth looking into; here is a cool presentation on R-bloggers.com to get you started:

http://www.r-bloggers.com/accessing-r-from-python-using-rpy2/

Essentially, it gives you access to R libraries through R objects, and it provides both a high-level and a low-level interface.

Here are the docs on the most recent version: https://rpy2.github.io/doc/latest/html/

I like to point Python users to Anaconda, and if you use its package manager, conda, to install rpy2, it will also ensure that R is installed.

$ conda install rpy2

And here's a vignette based on the documentation's introduction:

>>> from rpy2 import robjects
>>> pi = robjects.r['pi']
>>> pi
R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x7fde1c00a088 / R:0x562b8fbbe118>
[3.141593]

>>> from rpy2.robjects.packages import importr
>>> base = importr('base')
>>> utils = importr('utils')

>>> import rpy2.robjects.packages as rpackages
>>> utils = rpackages.importr('utils')
>>> packnames = ('ggplot2', 'hexbin')
>>> from rpy2.robjects.vectors import StrVector
>>> names_to_install = [x for x in packnames if not rpackages.isinstalled(x)]
>>> if len(names_to_install) > 0:
...     utils.install_packages(StrVector(names_to_install))

And running an R snippet:

>>> robjects.r('''
...         # create a function `f`
...         f <- function(r, verbose=FALSE) {
...             if (verbose) {
...                 cat("I am calling f().\n")
...             }
...             2 * pi * r
...         }
...         # call the function `f` with argument value 3
...         f(3)
...         ''')
R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x7fde1be0d8c8 / R:0x562b91196b18>
[18.849556]

And a small self-contained graphics demo:

from rpy2.robjects.packages import importr
graphics = importr('graphics')
grdevices = importr('grDevices')
base = importr('base')
stats = importr('stats')

import array

x = array.array('i', range(10))
y = stats.rnorm(10)

grdevices.X11()

graphics.par(mfrow = array.array('i', [2,2]))
graphics.plot(x, y, ylab = "foo/bar", col = "red")

kwargs = {'ylab':"foo/bar", 'type':"b", 'col':"blue", 'log':"x"}
graphics.plot(x, y, **kwargs)


m = base.matrix(stats.rnorm(100), ncol=5)
pca = stats.princomp(m)
graphics.plot(pca, main="Eigen values")
stats.biplot(pca, main="biplot")

Answer 3

Answered by dmontaner

I would not put too much trust in the source within the Rscript call, as you may not fully understand where your different nested R sessions are running. The process may fail because of simple things such as your working directory not being the one you think.

Rscript lets you run a script directly (see man Rscript if you are using Linux).

Then you can simply do:

subprocess.call ("/usr/bin/Rscript --vanilla /pathto/MyrScript.r", shell=True)

or, better, pass the Rscript command and its parameters as a list:

subprocess.call (["/usr/bin/Rscript", "--vanilla", "/pathto/MyrScript.r"])
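
As a hedged sketch of the list form, the helper below runs a command without a shell and returns its exit code together with the captured output, so that R's error text is available to Python. The name run_command is made up for this illustration (it is not part of subprocess), and the snippet assumes Python 3.7 or later for capture_output.

```python
import subprocess


def run_command(args):
    """Run a command given as a list of arguments, without a shell.

    Hypothetical helper for illustration: returns (returncode, stdout, stderr).
    """
    # Each list element reaches the program as exactly one argument,
    # so no shell quoting or escaping is involved.
    proc = subprocess.run(args, capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr


# Usage with the paths from the answer above (requires R to be installed):
# code, out, err = run_command(["/usr/bin/Rscript", "--vanilla", "/pathto/MyrScript.r"])
# if code != 0:
#     print(err)
```

A non-zero return code means the command failed; printing the captured stderr then shows the same message R would print in a terminal.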

Also, to make things easier, you could make the R script itself executable. For this you just need to add the following as the first line of the script:

#! /usr/bin/Rscript

and give it execute permission. See here for details.

Then you can make your Python call as if it were any other shell command or script:

subprocess.call ("/pathto/MyrScript.r")
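
Giving the script execute permission is normally done with chmod in a shell; the same step can be sketched from Python. This is only an illustration: make_executable is a hypothetical helper name, and it assumes a POSIX system where the #! line is honoured.

```python
import os
import stat


def make_executable(path):
    """Add the owner-execute bit to a file, similar to `chmod u+x path`."""
    mode = os.stat(path).st_mode
    # Keep the existing permission bits and switch on execute for the owner.
    os.chmod(path, mode | stat.S_IXUSR)


# Usage with the path from the answer above:
# make_executable("/pathto/MyrScript.r")
```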

Answer 4

Answered by Mips42

If you just want to run a script, you can use system("shell command") from the os library, available via import os. If the command produces useful output, you can save it by adding " > outputfilename" at the end of your shell command.

For example:

import os

os.system("ls -al > output.txt")
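
The same redirection can be done without a shell by handing subprocess an open file to write to. A minimal sketch, assuming the command is given as a list of arguments; run_to_file is a made-up helper name, not a library function.

```python
import subprocess


def run_to_file(args, output_path):
    """Run a command (list of arguments) and write its stdout to output_path."""
    with open(output_path, "w") as out_file:
        # The child process writes straight into the file; no shell is involved.
        completed = subprocess.run(args, stdout=out_file)
    return completed.returncode


# Equivalent of the shell command "ls -al > output.txt":
# run_to_file(["ls", "-al"], "output.txt")
```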

Answer 5

Answered by enpitsu

Try adding a line to the beginning of your R script that says:

setwd("path-to-working-directory")

but replace the path with the path to the folder containing the files features.csv and BagofWords.csv.

I think the problem you are having is that when you run this script from R, your working directory is already the correct path, but when you run the script from Python, it runs in a working directory somewhere else (likely the top of the user directory), because the R process inherits the working directory of the Python process that starts it.

By adding the extra line at the beginning of your R script, you explicitly set the working directory, and the code that reads these files will work. Alternatively, you could replace the filenames in read.csv() with the full file paths of these files.

@dmontaner suggested this possibility in his answer:

The process may fail because of simple things such as your working directory not being the one you think.
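
The working directory can also be set from the Python side instead of with setwd(). The sketch below passes cwd to subprocess.run so that relative paths in the R script resolve against the script's own folder. It assumes that features.csv and BagofWords.csv sit next to the script, which the question does not state; rscript_command and run_r_script are hypothetical helper names.

```python
import subprocess
from pathlib import Path


def rscript_command(script_path, rscript="/usr/bin/Rscript"):
    """Build the argument list and the working directory for an Rscript call."""
    script = Path(script_path).resolve()
    # Assumption: the data files live in the same folder as the script.
    return [rscript, "--vanilla", str(script)], script.parent


def run_r_script(script_path, rscript="/usr/bin/Rscript"):
    """Run the R script with its own folder as the working directory."""
    command, workdir = rscript_command(script_path, rscript)
    return subprocess.run(command, cwd=workdir, capture_output=True, text=True)


# Usage (requires R to be installed):
# result = run_r_script("/pathto/MyrScript.r")
# print(result.returncode, result.stderr)
```

If the data files live somewhere else, that folder can be used as cwd in place of script.parent.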
