Python Pandas to R dataframe

Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/24094476/

Time: 2020-09-13 22:08:30  Source: igfitidea

Tags: python, r, pandas, rpy2

Asked by JonghoKim

I want to convert a Python pandas DataFrame to a data frame in R. I found a few libraries for this problem, namely rpy2, described here:

http://pandas.pydata.org/pandas-docs/stable/r_interface.html

But I couldn't find a method for saving it or transferring it to R.

First I tried "to_csv":

df_R = com.convert_to_r_dataframe(df_total)
df_R.to_csv(direc+"/qap/detail_summary_R/"+"distance_"+str(gp_num)+".csv",sep = ",")

But it gives me an error:

"AttributeError: 'DataFrame' object has no attribute 'to_csv'"

So I checked its data type, which was:

<class 'rpy2.robjects.vectors.DataFrame'>

How can I save an object of this type to a CSV file or transfer it to R?

Accepted answer by lgautier

The recent documentation https://rpy2.github.io/doc/v3.2.x/html/generated_rst/pandas.html has a section about interacting with pandas.

Otherwise, objects of type rpy2.robjects.vectors.DataFrame have a method to_csvfile, not to_csv:

https://rpy2.github.io/doc/v3.2.x/html/vector.html#rpy2.robjects.vectors.DataFrame.to_csvfile
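Applied to the code in the question, the fix could look like the sketch below. This is an illustration rather than part of the original answer: the helper names distance_csv_path and save_r_dataframe are hypothetical, the directory layout is copied from the question, and df_r is assumed to be an rpy2.robjects.vectors.DataFrame.

```python
import os


def distance_csv_path(direc, gp_num):
    """Build the output path used in the question (layout taken from the question)."""
    return os.path.join(direc, "qap", "detail_summary_R",
                        "distance_" + str(gp_num) + ".csv")


def save_r_dataframe(df_r, path):
    """Write an rpy2 DataFrame to a CSV file.

    Assumes df_r is an rpy2.robjects.vectors.DataFrame, which provides
    to_csvfile (a pandas DataFrame provides to_csv instead).
    """
    df_r.to_csvfile(path, sep=",")
    return path
```

With the variables from the question, the call would be save_r_dataframe(df_R, distance_csv_path(direc, gp_num)).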

If you want to pass data between Python and R, there are more efficient ways than writing and reading CSV files. Try the conversion system:

from rpy2.robjects import pandas2ri
pandas2ri.activate()

from rpy2.robjects.packages import importr

base = importr('base')
# call an R function on a Pandas DataFrame
base.summary(my_pandas_dataframe)

Answer by jayelm

If standard text-based formats (CSV) are too slow or bulky, I'd recommend feather, a serialization format built on Apache Arrow. It was explicitly developed by the creators of RStudio/ggplot2/etc (Hadley Wickham) and pandas (Wes McKinney) for performance and interoperability between Python and R.

You need pandas version 0.20.0+ and the feather-format package (pip install feather-format); then you can use the to_feather/read_feather operations as drop-in replacements for to_csv/read_csv:

import pandas as pd
# df_R must be a pandas DataFrame here, not an rpy2 DataFrame
df_R.to_feather('filename.feather')
df_R = pd.read_feather('filename.feather')

The R equivalents (using the package feather) are:

df <- feather::read_feather('filename.feather')
feather::write_feather(df, 'filename.feather')

Apart from some minor tweaks (e.g. you can't save custom DataFrame indexes in feather, so you'll need to call df.reset_index() first), this is a fast and easy drop-in replacement for csv, pickle, etc.
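The index caveat can be wrapped in a small helper. The sketch below is an illustration, not part of the original answer: the name write_feather_flat is hypothetical, and it assumes df is a pandas DataFrame with pandas 0.20.0+ and a feather backend installed.

```python
def write_feather_flat(df, path):
    """Write df to a Feather file after moving its index into columns.

    Assumes df is a pandas DataFrame. reset_index() turns the index into
    ordinary column(s), which Feather can store.
    """
    flat = df.reset_index()
    flat.to_feather(path)
    return flat
```

After reading the file back, the former index is an ordinary column and can be restored with set_index if needed.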

Answer by agstudy

Once you have your data.frame, you can save it using write.table or one of its wrappers, for example write.csv.

In rpy2:

import rpy2.robjects as robjects
## get a reference to the R function 
write_csv = robjects.r('write.csv')
## save 
write_csv(df_R,'filename.csv')