Original question: http://stackoverflow.com/questions/45878720/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Dataframe apply doesn't accept axis argument
Asked by sleepophile
I have two dataframes: data and rules.
>>>data              >>>rules
    vendor                rule
0   googel           0    google
1   google           1    dell
2   googly           2    macbook
I am trying to add two new columns to the data dataframe after computing the Levenshtein similarity between each vendor and rule. Ideally, my dataframe should contain columns looking like this:
>>>data
    vendor    rule      similarity
0   googel    google    0.8
So far I am trying to use an apply call that returns this structure, but the dataframe apply is not accepting the axis argument.
>>> for index,r in rules.iterrows():
...     data[['rule','similarity']]=data['vendor'].apply(lambda row:[r[0],ratio(row[0],r[0])],axis=1)
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/home/mnnr/test/env/test-1.0/runtime/lib/python3.4/site-packages/pandas/core/series.py", line 2220, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas/src/inference.pyx", line 1088, in pandas.lib.map_infer (pandas/lib.c:62658)
  File "/home/mnnr/test/env/test-1.0/runtime/lib/python3.4/site-packages/pandas/core/series.py", line 2209, in <lambda>
    f = lambda x: func(x, *args, **kwds)
TypeError: <lambda>() got an unexpected keyword argument 'axis'
Could someone please help me figure out what I am doing wrong? Any change I make just creates new errors. Thank you.
Accepted answer by EdChum
You're calling the Series version of apply, for which it doesn't make sense to have an axis arg, hence the error.
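The difference between the two apply variants can be seen in a minimal sketch. The sample data mirrors the question; the variable names (element_types, row_types, axis_error) are only illustrative. It assumes a reasonably recent pandas, where Series.apply forwards unknown keyword arguments such as axis to the function itself.

```python
import pandas as pd

data = pd.DataFrame({"vendor": ["googel", "google", "googly"]})

# Series.apply: the function is called once per element (here a plain str).
element_types = data["vendor"].apply(lambda v: type(v).__name__)

# DataFrame.apply with axis=1: the function is called once per row,
# and each row arrives as a Series.
row_types = data[["vendor"]].apply(lambda row: type(row).__name__, axis=1)

# Series.apply has no axis parameter, so axis=1 is forwarded to the
# lambda as a keyword argument, which the lambda does not accept.
try:
    data["vendor"].apply(lambda v: v, axis=1)
    axis_error = None
except TypeError as exc:
    axis_error = str(exc)

print(list(element_types))
print(list(row_types))
print(axis_error)
```

Selecting with double brackets (data[['vendor']]) yields a one-column DataFrame, which is why the axis argument becomes meaningful there.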
If you did:
data[['rule','similarity']]=data[['vendor']].apply(lambda row:[r[0],ratio(row[0],r[0])],axis=1)
then this makes a single-column DataFrame, for which this would work.
Or just remove the axis arg:
data[['rule','similarity']]=data['vendor'].apply(lambda row:[r[0],ratio(row[0],r[0])])
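One thing worth checking in the Series variant: the lambda receives each vendor as a plain string, so indexing it with [0] picks the first character rather than the whole value. A small sketch with the question's sample data (first_chars and whole_values are illustrative names) shows the difference:

```python
import pandas as pd

data = pd.DataFrame({"vendor": ["googel", "google", "googly"]})

# With Series.apply, each argument is a str, so [0] is its first character.
first_chars = data["vendor"].apply(lambda row: row[0])

# Using the argument directly passes the whole vendor string along.
whole_values = data["vendor"].apply(lambda vendor: vendor)

print(list(first_chars))
print(list(whole_values))
```

If the whole vendor name is meant to be compared, it may be safer to pass the argument to the similarity function without indexing it.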
Update
Looking at what you're doing, you need to calculate the Levenshtein ratio for each rule against every vendor.
You can do this by:
data['vendor'].apply(lambda row: rules['rule'].apply(lambda x: ratio(x, row)))
I think this should calculate the ratio for each vendor against every rule.
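As a rough sketch of how the nested apply could be turned into the rule and similarity columns the question asks for, the example below rests on two assumptions that are not in the original post. First, ratio is replaced by a stand-in built on the standard library's difflib.SequenceMatcher, so its scores are not guaranteed to match those of a Levenshtein package. Second, it assumes the goal is to keep the best-scoring rule for each vendor.

```python
from difflib import SequenceMatcher

import pandas as pd


def ratio(a, b):
    """Stand-in similarity in [0, 1]; an assumption, not a Levenshtein library call."""
    return SequenceMatcher(None, a, b).ratio()


data = pd.DataFrame({"vendor": ["googel", "google", "googly"]})
rules = pd.DataFrame({"rule": ["google", "dell", "macbook"]})

# The inner apply returns a Series per vendor, so the outer apply
# produces a DataFrame: one row per vendor, one column per rule.
scores = data["vendor"].apply(
    lambda vendor: rules["rule"].apply(lambda rule: ratio(rule, vendor))
)
scores.columns = list(rules["rule"])

# Assumption: keep only the best-matching rule and its score per vendor.
data["rule"] = scores.idxmax(axis=1)
data["similarity"] = scores.max(axis=1)

print(data)
```

Because scores holds every vendor/rule pair, other selections (for example, all rules above a threshold) can be derived from the same table.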