Running R script from python
Source: http://stackoverflow.com/questions/19894365/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverflow.
Asked by Ehsan
I searched for this question and found some answers on this, but none of them seem to work. This is the script that I'm using in python to run my R script.
import subprocess
retcode = subprocess.call("/usr/bin/Rscript --vanilla -e 'source(\"/pathto/MyrScript.r\")'", shell=True)
and I get this error:
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
no lines available in input
Calls: source ... withVisible -> eval -> eval -> read.csv -> read.table
Execution halted
and here is the content of my R script (pretty simple!)
data = read.csv('features.csv')
data1 = read.csv("BagofWords.csv")
merged = merge(data,data1)
write.table(merged, "merged.csv",quote=FALSE,sep=",",row.names=FALSE)
for (i in 1:length(merged$fileName))
{
    fileConn <- file(paste("output/", toString(merged$fileName[i]), ".txt", sep=""))
    writeLines(toString(merged$BagofWord[i]), fileConn)
    close(fileConn)
}
The R script works fine when I use source('MyrScript.r') at the R command line. Moreover, when I run the exact command that I pass to the subprocess.call function (i.e., /usr/bin/Rscript --vanilla -e 'source("/pathto/MyrScript.r")') in my command line, it works fine. I don't really get what the problem is.
Answered by pyCthon
Answered by Aaron Hall
I think RPy2 is worth looking into. Here is a cool presentation on R-bloggers.com to get you started:
http://www.r-bloggers.com/accessing-r-from-python-using-rpy2/
Essentially, it gives you access to R libraries and R objects from Python, and it provides both a high-level and a low-level interface.
Here are the docs on the most recent version: https://rpy2.github.io/doc/latest/html/
I like to point Python users to Anaconda. If you use its package manager, conda, to install rpy2, it will also ensure that R is installed.
$ conda install rpy2
And here's a vignette based on the documentation's introduction:
>>> from rpy2 import robjects
>>> pi = robjects.r['pi']
>>> pi
R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x7fde1c00a088 / R:0x562b8fbbe118>
[3.141593]
>>> from rpy2.robjects.packages import importr
>>> base = importr('base')
>>> utils = importr('utils')
>>> import rpy2.robjects.packages as rpackages
>>> utils = rpackages.importr('utils')
>>> packnames = ('ggplot2', 'hexbin')
>>> from rpy2.robjects.vectors import StrVector
>>> names_to_install = [x for x in packnames if not rpackages.isinstalled(x)]
>>> if len(names_to_install) > 0:
...     utils.install_packages(StrVector(names_to_install))
And running an R snippet:
>>> robjects.r('''
...     # create a function `f`
...     f <- function(r, verbose=FALSE) {
...         if (verbose) {
...             cat("I am calling f().\n")
...         }
...         2 * pi * r
...     }
...     # call the function `f` with argument value 3
...     f(3)
...     ''')
R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x7fde1be0d8c8 / R:0x562b91196b18>
[18.849556]
And a small self-contained graphics demo:
from rpy2.robjects.packages import importr
graphics = importr('graphics')
grdevices = importr('grDevices')
base = importr('base')
stats = importr('stats')
import array
x = array.array('i', range(10))
y = stats.rnorm(10)
grdevices.X11()
graphics.par(mfrow = array.array('i', [2,2]))
graphics.plot(x, y, ylab = "foo/bar", col = "red")
kwargs = {'ylab':"foo/bar", 'type':"b", 'col':"blue", 'log':"x"}
graphics.plot(x, y, **kwargs)
m = base.matrix(stats.rnorm(100), ncol=5)
pca = stats.princomp(m)
graphics.plot(pca, main="Eigen values")
stats.biplot(pca, main="biplot")
Answered by dmontaner
I would not put too much trust in the source call inside the Rscript call, as you may not fully understand where your different nested R sessions are running. The process may fail because of simple things, such as your working directory not being the one you think it is.
Rscript lets you run a script directly (see man Rscript if you are using Linux).
Then you can call it directly:
subprocess.call("/usr/bin/Rscript --vanilla /pathto/MyrScript.r", shell=True)
or, better, pass the Rscript command and its parameters as a list:
subprocess.call(["/usr/bin/Rscript", "--vanilla", "/pathto/MyrScript.r"])
Also, to make things easier, you could turn the R script into an executable file. For this you just need to add this as the first line of the script:
#! /usr/bin/Rscript
and give it execute permission. See here for details.
Then you can make your python call as if it were any other shell command or script:
subprocess.call("/pathto/MyrScript.r")
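As a minimal sketch of the list-based call above, the helpers below build the argument list and use subprocess.run so that the exit status and R's error output are visible to the Python caller. The names rscript_command and run_command are hypothetical (they do not come from the original answer), and the default interpreter path /usr/bin/Rscript is simply the one used in the question.

```python
import subprocess


def rscript_command(script_path, rscript="/usr/bin/Rscript"):
    """Build the argument list for running an R script without a shell."""
    return [rscript, "--vanilla", script_path]


def run_command(cmd):
    """Run cmd (a list of arguments) and return (exit status, stdout, stderr)."""
    # capture_output/text require Python 3.7 or newer.
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr
```

A call such as run_command(rscript_command("/pathto/MyrScript.r")) would then return a non-zero status together with the "Execution halted" message in stderr if the R script fails, instead of only a bare return code.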
Answered by Mips42
If you just want to run a script, then you can use system("shell command") from the os library, available via import os. If you have useful output, you can save the result by adding " > outputfilename" at the end of your shell command.
For example:
import os
os.system("ls -al > output.txt")
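The same redirect-to-file effect can also be obtained without a shell. The sketch below is one possible way, not part of the original answer: it hands an open file to subprocess.run as stdout. The helper name run_to_file is hypothetical.

```python
import subprocess


def run_to_file(cmd, output_path):
    """Run cmd (a list of arguments) and write its standard output to output_path.

    Returns the exit status of the command.
    """
    with open(output_path, "w") as out:
        completed = subprocess.run(cmd, stdout=out)
    return completed.returncode
```

For instance, run_to_file(["ls", "-al"], "output.txt") mirrors the ls -al > output.txt example, and the same helper could be given the Rscript argument list instead.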
Answered by enpitsu
Try adding a line to the beginning of your R script that says:
setwd("path-to-working-directory")
In that line, replace path-to-working-directory with the path to the folder containing the files features.csv and BagofWords.csv.
I think the problem you are having is because when you run this script from R your working directory is already the correct path, but when you run the script from python, it defaults to a working directory somewhere else (likely the top of the user directory).
By adding the extra line at the beginning of your R script, you explicitly set the working directory, and the code that reads these files will work. Alternatively, you could replace the filenames in read.csv() with the full file paths of these files.
@dmontaner suggested this possibility in his answer:
The process may fail because of simple things such as your working directory not being the one you think.
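The working directory can also be set from the Python side rather than inside the R script. As a sketch under stated assumptions (the helper run_in_script_dir is hypothetical, and it assumes features.csv and BagofWords.csv sit in the same folder as the R script), subprocess.run accepts a cwd argument that starts the child process in the given directory, so relative file names resolve there.

```python
import os
import subprocess


def run_in_script_dir(cmd_prefix, script_path):
    """Run cmd_prefix + [script_path] with the working directory set to the script's folder."""
    script_path = os.path.abspath(script_path)
    workdir = os.path.dirname(script_path)
    # cwd makes relative paths inside the script resolve against workdir.
    return subprocess.run(cmd_prefix + [script_path], cwd=workdir,
                          capture_output=True, text=True)
```

With the paths from the question, the call would look like run_in_script_dir(["/usr/bin/Rscript", "--vanilla"], "/pathto/MyrScript.r"); the returned object carries returncode, stdout, and stderr for inspection.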