Original question: http://stackoverflow.com/questions/23546555/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
Stack Overflow
In pandas, what's the difference between df['column'] and df.column?
Asked by Anton
I'm working my way through Pandas for Data Analysis and learning a ton. However, one thing keeps coming up. The book typically refers to columns of a dataframe as df['column']; however, sometimes, without explanation, the book uses df.column.
I don't understand the difference between the two. Any help would be appreciated.
Below is some code demonstrating what I am talking about:
In [5]:
import pandas as pd
data = {'column1': ['a', 'a', 'a', 'b', 'c'],
        'column2': [1, 4, 2, 5, 3]}
df = pd.DataFrame(data, columns=['column1', 'column2'])
df
Out[5]:
  column1  column2
0       a        1
1       a        4
2       a        2
3       b        5
4       c        3
5 rows × 2 columns
df.column:
In [8]:
df.column1
Out[8]:
0    a
1    a
2    a
3    b
4    c
Name: column1, dtype: object
df['column']:
In [9]:
df['column1']
Out[9]:
0    a
1    a
2    a
3    b
4    c
Name: column1, dtype: object
Answered by acushner
For setting values, in particular for creating a new column, you need to use df['column'] = series.
Once this is done, however, you can refer to that column in the future with df.column, assuming it's a valid Python name. (So df.column works, but df.6column would still have to be accessed with df['6column'].)
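As a rough illustration of the point about valid names, here is a minimal sketch. The small frame below is made up for the example (it is not the asker's data), and '6column' is just the invalid-identifier name mentioned above.

```python
import pandas as pd

# Hypothetical frame for illustration only.
df = pd.DataFrame({'column1': ['a', 'b', 'c'], '6column': [1, 2, 3]})

# With a valid identifier, attribute access and bracket access
# give back the same Series.
same = df.column1.equals(df['column1'])
print(same)  # True

# '6column' starts with a digit, so it is not a valid Python identifier.
# Writing df.6column would be a SyntaxError; bracket access still works.
print('6column'.isidentifier())  # False
total = int(df['6column'].sum())
print(total)  # 6
```

Bracket access works for any column label, so it is the safer default when column names come from external data.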
I think the subtle difference here is that when you set something with df['column'] = ser, pandas goes ahead and adds it to the columns and does some other bookkeeping (I believe by overriding __setitem__). If you do df.column = ser with a name that is not already a column, it's just like adding a new field to any existing object via __setattr__, and pandas does not seem to turn it into a column.
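The difference between item assignment and attribute assignment can be checked with a short sketch like the one below. The column names are made up for the example, and the behavior shown is what I would expect from recent pandas versions (which also emit a UserWarning on the attribute assignment, silenced here); older versions may differ in the details.

```python
import warnings
import pandas as pd

# Hypothetical frame and series for illustration only.
df = pd.DataFrame({'column1': ['a', 'b', 'c']})
ser = pd.Series([1, 2, 3])

# Item assignment goes through __setitem__ and creates a real column.
df['column2'] = ser
print(list(df.columns))  # ['column1', 'column2']

# Attribute assignment with a new name only sets a plain Python attribute.
# Recent pandas versions warn about this; the warning is silenced for the demo.
with warnings.catch_warnings():
    warnings.simplefilter('ignore', UserWarning)
    df.column3 = ser
print('column3' in df.columns)  # False

# Attribute assignment to a name that is already a column updates that column.
df.column2 = [10, 20, 30]
print(df['column2'].tolist())  # [10, 20, 30]
```

So df.column = ser is only safe when the column already exists; for new columns, stick with df['column'] = ser.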

