Original question: http://stackoverflow.com/questions/19266798/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow (not me).
Python Pandas: Resolving "List Object has no Attribute 'Loc'"
Asked by Parseltongue
I import a CSV as a DataFrame using:
import numpy as np
import pandas as pd
df = pd.read_csv("test.csv")
Then I'm trying to do a simple replace based on IDs:
df.loc[df.ID == 103, ['fname', 'lname']] = 'Michael', 'Johnson'
I get the following error:
AttributeError: 'list' object has no attribute 'loc'
Note: when I run print(pd.__version__) I get 0.12.0, so (at least as far as I understand) this is not caused by running a version older than 0.11. Any ideas?
Accepted answer by Carst
To pick up from your comment: "I was doing this:"
df = [df.hc== 2]
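What you create there is a "mask": a boolean array that says which rows of the index fulfil your condition. Because of the outer square brackets, that mask is wrapped in a plain Python list, so df is now a list rather than a DataFrame, and a list has no .loc.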
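A minimal sketch of the difference between the mask and the bracketed version. The data here is made up for illustration (the `hc` values and the `fname` column are assumptions, not the asker's CSV):

```python
import pandas as pd

# Hypothetical data standing in for the asker's CSV.
df = pd.DataFrame({"hc": [1, 2, 2], "fname": ["Ann", "Bob", "Cy"]})

mask = df.hc == 2        # boolean Series, one entry per row
wrapped = [df.hc == 2]   # plain Python list that merely contains that Series

print(type(mask).__name__)      # Series
print(type(wrapped).__name__)   # list
print(hasattr(wrapped, "loc"))  # False: lists have no .loc

filtered = df[mask]      # keeps only the rows where hc == 2
print(len(filtered))     # 2
```

Assigning `wrapped` back to the name `df` is what turns later `df.loc[...]` calls into an AttributeError.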
To filter your dataframe on your condition you want to do this:
df = df[df.hc == 2]
A bit more explicit is this:
mask = df.hc == 2
df = df[mask]
If you want to keep the entire dataframe and only want to replace specific values, there are methods such as replace (see the question "Python pandas equivalent for replace"). Another method, which also performs well, is to create a separate DataFrame with the from/to values as columns and use pd.merge to combine it into the existing DataFrame. Using your mask to set values is also possible:
df.loc[mask, 'fname'] = 'Johnson'
But for a larger set of replacements you would want to use one of the two other methods, or use "apply" with a lambda function (for value transformations). Last but not least: you can use .fillna('bla') to quickly fill in NA values.
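The alternatives mentioned in this answer could be sketched roughly as follows. The rows, the `mapping` table and the `new_lname` helper column are all hypothetical, chosen only to make the example self-contained:

```python
import pandas as pd

# Hypothetical data; column names follow the question, values are made up.
df = pd.DataFrame({
    "ID": [101, 102, 103],
    "fname": ["Ann", "Bob", None],
    "lname": ["Lee", "Smith", "Jones"],
})

# 1) replace: swap specific values in one column, leaving the rest untouched.
replaced = df.replace({"lname": {"Smith": "Johnson"}})

# 2) merge: keep the from/to pairs in their own DataFrame and join them in.
mapping = pd.DataFrame({"lname": ["Lee", "Jones"], "new_lname": ["Li", "Johns"]})
merged = df.merge(mapping, on="lname", how="left")
# Rows without a match in the mapping keep their original value.
merged["lname"] = merged["new_lname"].fillna(merged["lname"])
merged = merged.drop(columns="new_lname")

# 3) apply with a lambda: transform every value in a column.
upper = df["lname"].apply(lambda name: name.upper())

# 4) fillna: fill missing values in one call.
filled = df.fillna({"fname": "unknown"})
```

Each call returns a new object and leaves `df` itself unchanged, so the results can be compared side by side before deciding which approach to keep.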
Answer by Boud
The traceback indicates that df is a list and not a DataFrame, as your line of code expects.
It means that between df = pd.read_csv("test.csv") and df.loc[df.ID == 103, ['fname', 'lname']] = 'Michael', 'Johnson' you have other lines of code that assign a list object to df. Review that piece of code to find your bug.
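One way to narrow this down is to check the type of df right after loading and again just before the failing line. The sketch below assumes a small in-memory CSV in place of test.csv, and the rebinding in the middle is a made-up stand-in for whatever the intervening code does:

```python
import io
import pandas as pd

# Stand-in for pd.read_csv("test.csv"); the rows are made up.
csv_text = "ID,fname,lname\n103,Mike,Jonson\n104,Ann,Lee\n"
df = pd.read_csv(io.StringIO(csv_text))
print(type(df).__name__)   # DataFrame

# Hypothetical intervening line: stray brackets rebind df to a list.
df = [df.ID == 103]
print(type(df).__name__)   # list

try:
    df.loc
except AttributeError as exc:
    message = str(exc)
    print(message)
```

Printing `type(df)` (or asserting `isinstance(df, pd.DataFrame)`) at a few points between the two statements shows where the name stops referring to a DataFrame.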
Answer by Jeff
@Boud's answer is correct. Assignment with .loc works fine if the right-hand-side list matches the number of elements being replaced:
In [56]: df = DataFrame(dict(A=[1,2,3], B=[4,5,6], C=[7,8,9]))
In [57]: df
Out[57]:
   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9
In [58]: df.loc[1,['A','B']] = -1,-2
In [59]: df
Out[59]:
   A  B  C
0  1  4  7
1 -1 -2  8
2  3  6  9
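As a standalone script, the same idea could look like the sketch below. The first part repeats the session above with an explicit import; the second applies the question's boolean-mask assignment to made-up rows (the `people` frame and its values are assumptions for illustration):

```python
import pandas as pd

# Same example as the session above: two target cells, two values.
df = pd.DataFrame(dict(A=[1, 2, 3], B=[4, 5, 6], C=[7, 8, 9]))
df.loc[1, ["A", "B"]] = -1, -2
print(df)

# The question's pattern on hypothetical rows: the boolean mask selects
# the row, and the two values fill the two listed columns.
people = pd.DataFrame({
    "ID": [101, 102, 103],
    "fname": ["Ann", "Bob", "Mike"],
    "lname": ["Lee", "Smith", "Jonson"],
})
people.loc[people.ID == 103, ["fname", "lname"]] = "Michael", "Johnson"
print(people)
```

This only works while `people` (or `df`) is still a DataFrame, which is the point of the answers above.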