pandas dataframe apply not accepting the axis argument

Question

Asked by sleepophile

I have two dataframes: data and rules.

>>>data                            >>>rules
   vendor                             rule
0  googel                           0 google
1  google                           1 dell
2  googly                           2 macbook

I am trying to add two new columns to the data dataframe after computing the Levenshtein similarity between each vendor and rule. Ideally, my dataframe should contain columns like this:

>>>data
  vendor   rule    similarity
0 googel   google    0.8

So far I am trying to use an apply function that will return this structure, but the dataframe apply is not accepting the axis argument.

>>> for index,r in rules.iterrows():
...     data[['rule','similarity']]=data['vendor'].apply(lambda row:[r[0],ratio(row[0],r[0])],axis=1)
...
Traceback (most recent call last):

File "<stdin>", line 2, in <module>

File "/home/mnnr/test/env/test-1.0/runtime/lib/python3.4/site-packages/pandas/core/series.py", line 2220, in apply
mapped = lib.map_infer(values, f, convert=convert_dtype)
File "pandas/src/inference.pyx", line 1088, in pandas.lib.map_infer (pandas/lib.c:62658)
File "/home/mnnr/test/env/test-1.0/runtime/lib/python3.4/site-packages/pandas/core/series.py", line 2209, in <lambda>
f = lambda x: func(x, *args, **kwds)

TypeError: <lambda>() got an unexpected keyword argument 'axis'

Could someone please help me figure out what I am doing wrong? Any change I make just creates new errors. Thank you.

Answer 1

Accepted answer by EdChum

You're calling the Series version of apply, for which an axis argument doesn't make sense, hence the error.

If you did:

data[['rule','similarity']]=data[['vendor']].apply(lambda row:[r[0],ratio(row[0],r[0])],axis=1)

then this makes a single-column df, for which the axis argument would work.

Or just remove the axis argument:

data[['rule','similarity']]=data['vendor'].apply(lambda row:[r[0],ratio(row[0],r[0])])
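
To make the difference concrete, here is a minimal sketch using toy data that mirrors the question. The built-in len stands in for the similarity function (my substitution, not part of the question), and the names series_lengths, frame_lengths and axis_error are only illustrative.

```python
import pandas as pd

data = pd.DataFrame({"vendor": ["googel", "google", "googly"]})

# Series.apply: the function receives each scalar value (a str here).
series_lengths = data["vendor"].apply(len)

# DataFrame.apply with axis=1: the function receives each row as a Series.
frame_lengths = data[["vendor"]].apply(lambda row: len(row["vendor"]), axis=1)

# Extra keyword arguments given to Series.apply are forwarded to the function,
# so axis=1 reaches the lambda as an unexpected keyword argument.
try:
    data["vendor"].apply(lambda value: len(value), axis=1)
    axis_error = None
except TypeError as exc:
    axis_error = exc
```

Both calls give the same lengths here; what differs is what the function is handed (a single value versus a whole row).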

Update

Looking at what you're doing, you need to calculate the Levenshtein ratio for each rule against every vendor.

You can do this with:

data['vendor'].apply(lambda row: rules['rule'].apply(lambda x: ratio(x, row)))

I think this should calculate the ratio for each vendor against every rule.
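
One possible way to go from those pairwise ratios to the rule and similarity columns asked for in the question is sketched below. Two things here are assumptions of mine, not part of the question: the ratio function is a stand-in built on the standard library's difflib.SequenceMatcher (its scores can differ from Levenshtein's ratio), and "pick the highest-scoring rule per vendor" is my reading of the desired output. The name scores is illustrative.

```python
from difflib import SequenceMatcher

import pandas as pd


def ratio(a, b):
    # Stand-in for the Levenshtein ratio (assumption): difflib similarity in [0, 1].
    return SequenceMatcher(None, a, b).ratio()


data = pd.DataFrame({"vendor": ["googel", "google", "googly"]})
rules = pd.DataFrame({"rule": ["google", "dell", "macbook"]})

# Score table: one row per vendor, one column per rule.
scores = pd.DataFrame(
    {
        rule: data["vendor"].apply(lambda vendor: ratio(vendor, rule))
        for rule in rules["rule"]
    }
)

# For each vendor, keep the best-matching rule and its score.
data["rule"] = scores.idxmax(axis=1)
data["similarity"] = scores.max(axis=1)
```

Building the score table column by column keeps every apply call returning plain numbers, and idxmax(axis=1) then reads the winning rule name straight from the column labels.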
