Attribution: this content comes from StackOverflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/45878720/

Time: 2020-09-14 04:19:14  Source: igfitidea

Dataframe apply doesn't accept axis argument

Tags: python, pandas, dataframe, apply

Asked by sleepophile

I have two dataframes: data and rules.

>>>data                            >>>rules
   vendor                             rule
0  googel                           0 google
1  google                           1 dell
2  googly                           2 macbook

I am trying to add two new columns to the data dataframe after computing the Levenshtein similarity between each vendor and each rule. Ideally, my dataframe should contain columns like this:

>>>data
  vendor   rule    similarity
0 googel   google    0.8

So far I am trying to use an apply function that returns this structure, but the dataframe apply is not accepting the axis argument.

>>> for index,r in rules.iterrows():
...     data[['rule','similarity']]=data['vendor'].apply(lambda row:[r[0],ratio(row[0],r[0])],axis=1)
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/home/mnnr/test/env/test-1.0/runtime/lib/python3.4/site-packages/pandas/core/series.py", line 2220, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas/src/inference.pyx", line 1088, in pandas.lib.map_infer (pandas/lib.c:62658)
  File "/home/mnnr/test/env/test-1.0/runtime/lib/python3.4/site-packages/pandas/core/series.py", line 2209, in <lambda>
    f = lambda x: func(x, *args, **kwds)
TypeError: <lambda>() got an unexpected keyword argument 'axis'

Could someone please help me figure out what I am doing wrong? Every change I make just creates new errors. Thank you.

Accepted answer by EdChum

You're calling the Series version of apply, for which an axis argument doesn't make sense, hence the error.

If you did:

data[['rule','similarity']]=data[['vendor']].apply(lambda row:[r[0],ratio(row[0],r[0])],axis=1)

then this makes a single-column df, and DataFrame.apply does accept the axis argument.
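
To see the difference concretely, here is a minimal sketch using the vendor column from the question. The uppercase lambda is only a placeholder for the real similarity function, and the variable names are illustrative rather than taken from the original post.

```python
import pandas as pd

data = pd.DataFrame({"vendor": ["googel", "google", "googly"]})

# Single brackets give a Series. Series.apply has no axis parameter,
# so axis=1 is forwarded to the function as a keyword argument.
series = data["vendor"]
try:
    series.apply(lambda value: value.upper(), axis=1)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Double brackets give a one-column DataFrame, and DataFrame.apply takes axis.
frame = data[["vendor"]]
upper = frame.apply(lambda row: row["vendor"].upper(), axis=1)

print(type(series).__name__, type(frame).__name__)
print(error_message)
print(upper.tolist())
```

With axis=1, the function receives each row as a Series, which is why the DataFrame version indexes into the row while the Series version receives the plain string.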

Or just remove the axis arg:

data[['rule','similarity']]=data['vendor'].apply(lambda row:[r[0],ratio(row[0],r[0])])
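
As a rough sketch of the no-axis version for a single rule: Series.apply hands each vendor string (not a row) to the function, so the rule column and the similarity column can be filled separately. The `similarity` helper is a stand-in built on the standard library's difflib, which is an assumption on my part; the question's `ratio` presumably comes from a Levenshtein package and may give different numbers.

```python
from difflib import SequenceMatcher

import pandas as pd


def similarity(a, b):
    # Stand-in for the question's ratio(); difflib is an assumption here.
    return SequenceMatcher(None, a, b).ratio()


data = pd.DataFrame({"vendor": ["googel", "google", "googly"]})
rule = "google"  # a single rule, chosen for illustration

# A scalar assignment broadcasts the rule to every row.
data["rule"] = rule
# Series.apply passes each vendor string to the lambda; no axis is needed.
data["similarity"] = data["vendor"].apply(lambda vendor: similarity(vendor, rule))

print(data)
```

Inside a loop over all rules, these assignments would be overwritten on every pass, so only the last rule would survive.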

Update

Looking at what you're doing, you need to calculate the Levenshtein ratio for each rule against every vendor.

You can do this by:

data['vendor'].apply(lambda row: rules['rule'].apply(lambda x: ratio(x, row)))

I think this should calculate the ratio for each vendor against every rule.
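
One possible way to get from that table of ratios to the rule and similarity columns the question asks for is to keep the best-scoring rule per vendor. Picking the maximum is my assumption about the intent, and `similarity` is a difflib stand-in for `ratio`, so treat this as a sketch rather than the answer's own method.

```python
from difflib import SequenceMatcher

import pandas as pd


def similarity(a, b):
    # Stand-in for the question's ratio(); difflib is an assumption here.
    return SequenceMatcher(None, a, b).ratio()


data = pd.DataFrame({"vendor": ["googel", "google", "googly"]})
rules = pd.DataFrame({"rule": ["google", "dell", "macbook"]})

# Nested apply: the inner apply returns one Series of ratios per vendor,
# so the outer apply yields a DataFrame with one row per vendor.
scores = data["vendor"].apply(
    lambda vendor: rules["rule"].apply(lambda rule: similarity(vendor, rule))
)
# Label the columns with the rule text instead of the rules' index.
scores.columns = rules["rule"].tolist()

# Keep the best-matching rule and its score for each vendor.
data["rule"] = scores.idxmax(axis=1)
data["similarity"] = scores.max(axis=1)

print(data)
```

The intermediate `scores` frame is worth inspecting on its own, since it shows every vendor against every rule before anything is discarded.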