Pandas: Using .isin() returns the error: "AttributeError: 'float' object has no attribute 'isin'"

Note: this page is adapted from a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/48852855/

Tags: python, pandas, csv, dataframe

Asked by BoroBorooooooooooooooooooooooo

I am using Pandas and Python to import a CSV, and the data in the imported dataframe is manipulated so that a new column is made.

Each row in the new column is based on the values in the corresponding row of column A and column B. The dataframe contains more columns with data, but these are irrelevant to the code below.

The imported dataframe has several thousand rows.

Both column A and column B contain numerical values between 0 and 99, inclusive.

import pandas as pd

import csv

df = pd.read_csv("import.csv", names=["Id", "Month", "Name", "ColA", "ColB" ])

def f(row):
    if row['ColA'].isin([10, 11, 12, 13, 14, 15, 20, 21, 22, 23, 24, 48]) and row['ColB'].isin([30, 31, 32, 33, 34, 35, 57, 58]):
        val = row['ColA']
    elif row['ColB'].isin([10, 11, 12, 13, 14, 15, 20, 21, 22, 23, 24, 48]) and  row['ColA'].isin([30, 31, 32, 33, 34, 35, 57, 58]):
        val = row['ColB']
    elif row['ColA'] > row['ColB']:
        val = row['ColA']
    elif row['ColA'] < row['ColB']:
        val = row['ColB']
    else: 
        val = row['ColA']
    return val            

df['NewColumnName'] = df.apply(f, axis=1)   

df.to_csv("export.csv", encoding='utf-8')

Running the above code returns the error:

AttributeError: ("'float' object has no attribute 'isin'", 'occurred at index 0')

So obviously .isin() can't be used in that manner. Any suggestions on how this could be solved?

EDIT: To add a third column to which the same conditions apply, I guess the code using jezrael's approach would look as follows:

m1 = (df['ColA'].isin(L1) & df['ColB'].isin(L2)) | (df['ColA'] > df['ColB'])
m2 = (df['ColB'].isin(L1) & df['ColA'].isin(L2)) | (df['ColA'] < df['ColB'])
m3 = (df['ColC'].isin(L1) & df['ColB'].isin(L2)) | (df['ColC'] > df['ColB'])
m4 = (df['ColB'].isin(L1) & df['ColC'].isin(L2)) | (df['ColC'] < df['ColB'])
m5 = (df['ColC'].isin(L1) & df['ColA'].isin(L2)) | (df['ColC'] > df['ColA'])
m6 = (df['ColA'].isin(L1) & df['ColC'].isin(L2)) | (df['ColC'] < df['ColA'])

df['NewColumnName'] = np.select([m1, m2, m3, m4, m5, m6], [df['ColA'], df['ColB'], df['ColC'], df['ColA'], df['ColB'], df['ColC']], default=df['ColA'])

Answered by jezrael

In pandas it is best to avoid loops, so it is better to use numpy.select and chain the conditions with & for AND and | for OR:

import numpy as np

# Value lists used for the membership tests
L1 = [10, 11, 12, 13, 14, 15, 20, 21, 22, 23, 24, 48]
L2 = [30, 31, 32, 33, 34, 35, 57, 58]

# Boolean masks computed on whole columns (column names match the read_csv call)
m1 = (df['ColA'].isin(L1) & df['ColB'].isin(L2)) | (df['ColA'] > df['ColB'])
m2 = (df['ColB'].isin(L1) & df['ColA'].isin(L2)) | (df['ColA'] < df['ColB'])

df['NewColumnName'] = np.select([m1, m2], [df['ColA'], df['ColB']], default=df['ColA'])
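As a minimal, self-contained sketch of the approach above, the following applies the same masks to a small made-up dataframe. The sample values and the column name NewColumnStrict are assumptions for illustration only and do not come from the question's CSV.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real CSV is not shown in the question.
df = pd.DataFrame({"ColA": [10, 30, 5, 40, 7],
                   "ColB": [30, 10, 9, 2, 7]})

L1 = [10, 11, 12, 13, 14, 15, 20, 21, 22, 23, 24, 48]
L2 = [30, 31, 32, 33, 34, 35, 57, 58]

# Masks as written in the answer: list membership OR a plain comparison.
m1 = (df["ColA"].isin(L1) & df["ColB"].isin(L2)) | (df["ColA"] > df["ColB"])
m2 = (df["ColB"].isin(L1) & df["ColA"].isin(L2)) | (df["ColA"] < df["ColB"])
df["NewColumnName"] = np.select([m1, m2], [df["ColA"], df["ColB"]],
                                default=df["ColA"])

# Variant (an assumption, not part of the answer): keep the membership tests
# as separate, earlier conditions so they take precedence over the
# comparisons, mirroring the if/elif order in the question.
a_first = df["ColA"].isin(L1) & df["ColB"].isin(L2)
b_first = df["ColB"].isin(L1) & df["ColA"].isin(L2)
df["NewColumnStrict"] = np.select(
    [a_first, b_first, df["ColA"] > df["ColB"], df["ColA"] < df["ColB"]],
    [df["ColA"], df["ColB"], df["ColA"], df["ColB"]],
    default=df["ColA"],
)

print(df)
```

np.select uses the first condition that is true for each row. In the second sample row (ColA = 30, ColB = 10) both m1 and m2 are true, so the combined masks pick ColA, whereas the if/elif chain in the question would pick ColB. If that order of precedence matters, the variant with separate conditions may be closer to the original intent.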

Answered by relay

You need to use it like this:

df[df['ColA'].isin([10, 11, 12, 13, 14, 15, 20, 21, 22, 23, 24, 48])]

This will give you the rows where the ColA value is in the list shown above. You are trying to call it on a single value, but this method applies to a whole column. If you want to check whether one value is in this list, you can write something like this in your function using numpy:

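# Wrapping the list in np.array keeps the comparison elementwise even when row['ColA'] is a plain Python float
if np.any(row['ColA'] == np.array([10, 11, 12, 13, 14, 15, 20, 21, 22, 23, 24, 48])):
    val = row['ColA']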
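If the row-wise apply is kept, a plain Python membership test with `in` is an alternative to np.any, because each row[...] lookup yields a single value rather than a Series. The following is a minimal sketch on made-up sample data (the values and the Name column contents are assumptions), not the asker's actual file:

```python
import pandas as pd

# Sets give fast membership tests; values are taken from the question.
L1 = {10, 11, 12, 13, 14, 15, 20, 21, 22, 23, 24, 48}
L2 = {30, 31, 32, 33, 34, 35, 57, 58}

def f(row):
    a, b = row["ColA"], row["ColB"]
    if a in L1 and b in L2:
        return a
    if b in L1 and a in L2:
        return b
    # Otherwise take the larger value (ColA when both are equal).
    return max(a, b)

# Hypothetical sample data standing in for the imported CSV.
df = pd.DataFrame({"Name": ["w", "x", "y", "z"],
                   "ColA": [10.0, 30.0, 5.0, 7.0],
                   "ColB": [30.0, 10.0, 9.0, 7.0]})

df["NewColumnName"] = df.apply(f, axis=1)
print(df)
```

This keeps the structure of the original function, but on several thousand rows the column-wise masks are usually faster than apply with axis=1.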