pandas: Converting R code into Python code
Note: this page is adapted from a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/50538247/
Converting R code into Python code
Asked by vinay karagod
Working code in R:
library(dplyr)
tmp <- test %>%
  group_by(InvoiceDocNumber) %>%
  summarise(invoiceprob = max(itemprob)) %>%
  mutate(invoicerank = rank(desc(invoiceprob)))
But I want to rewrite the code in Python. I wrote the code below, but it throws an error. I am using dfply, a Python package that offers a dplyr-like interface.
from dfply import *
tmp = (test >>
       group_by(test.InvoiceDocNumber) >>
       summarize(invoiceprob=max(test.itemprob)) >>
       mutate(invoicerank=rankdata(test.invoiceprob)))
AttributeError: 'DataFrame' object has no attribute 'invoiceprob'
Can anyone help me?
Accepted answer by andrew_reece
You can use assign to get it all in one chain:
(
    test.groupby("InvoiceDocNumber", as_index=False)
    .itemprob.max()
    .rename(columns={"itemprob": "invoiceprob"})
    .assign(invoicerank=lambda x: x.invoiceprob.rank(ascending=False))
)
Output:
InvoiceDocNumber invoiceprob invoicerank
0 0 0.924193 5.0
1 1 0.974173 4.0
2 2 0.978962 3.0
3 3 0.992663 2.0
4 4 0.994243 1.0
Data:
import numpy as np
import pandas as pd
n = 100
test = pd.DataFrame({"InvoiceDocNumber": np.random.choice(np.arange(5), size=n),
                     "itemprob": np.random.uniform(size=n)})
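As a side note on the error in the question: a likely cause is that Python evaluates `test.invoiceprob` immediately, against the original `test` frame, before any piping happens, and that frame has no `invoiceprob` column yet. The sketch below reproduces this with plain pandas and shows why a lambda inside assign avoids it. The small frame is made-up illustration data, not the asker's data.

```python
import pandas as pd

# Hypothetical toy data using the column names from the question.
test = pd.DataFrame({
    "InvoiceDocNumber": [1, 1, 2, 2, 3],
    "itemprob": [0.2, 0.9, 0.5, 0.4, 0.7],
})

# Attribute access on the original frame is evaluated right away,
# and `invoiceprob` does not exist there, so this raises AttributeError.
try:
    test.invoiceprob
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)

# A lambda passed to assign is only called later, with the intermediate
# frame that already contains the renamed `invoiceprob` column.
result = (
    test.groupby("InvoiceDocNumber", as_index=False)["itemprob"]
    .max()
    .rename(columns={"itemprob": "invoiceprob"})
    .assign(invoicerank=lambda df: df["invoiceprob"].rank(ascending=False))
)
print(result)
```

The key design point is deferral: the lambda receives whatever frame the chain has produced so far, so it can refer to columns created earlier in the same chain.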
Answer by vinay karagod
I got the answer:
# Highest itemprob per invoice
ddd = test.groupby('InvoiceDocNumber', as_index=False).agg({"itemprob": "max"})
ddd = ddd.rename(columns={'itemprob': 'invoiceprob'})
# Rank invoices so that the highest probability gets rank 1
ddd['invoicerank'] = ddd['invoiceprob'].rank(ascending=False)
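The three steps above can also be folded into one expression with pandas named aggregation, which sets the output column name directly and makes the rename unnecessary. This is a minimal sketch, not the method from either answer: the function name `rank_invoices` and the sample data are made up for illustration. It also spells out `method="average"`, which is the pandas default and matches the default tie handling of R's `rank`.

```python
import pandas as pd


def rank_invoices(items: pd.DataFrame) -> pd.DataFrame:
    """Return the max itemprob per invoice plus a descending rank.

    Hypothetical helper for illustration; column names follow the question.
    """
    return (
        items.groupby("InvoiceDocNumber", as_index=False)
        # Named aggregation: output column = (input column, function).
        .agg(invoiceprob=("itemprob", "max"))
        # Tied values share the average of the ranks they span.
        .assign(
            invoicerank=lambda df: df["invoiceprob"].rank(
                ascending=False, method="average"
            )
        )
    )


# Made-up data in which invoices A and B tie on their maximum.
items = pd.DataFrame({
    "InvoiceDocNumber": ["A", "A", "B", "C"],
    "itemprob": [0.3, 0.9, 0.9, 0.5],
})

ranked = rank_invoices(items)
print(ranked)
```

If ties should instead get the same integer rank, `method="min"` or `method="dense"` are available alternatives in `Series.rank`.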