In pandas, what's the difference between df['column'] and df.column?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/23546555/

Date: 2020-09-13 22:01:29  Source: igfitidea

python pandas

Asked by Anton

I'm working my way through Pandas for Data Analysis and learning a ton. However, one thing keeps coming up. The book typically refers to columns of a dataframe as df['column']; however, sometimes, without explanation, the book uses df.column.

I don't understand the difference between the two. Any help would be appreciated.

Below is some code demonstrating what I am talking about:

In [5]:

import pandas as pd

data = {'column1': ['a', 'a', 'a', 'b', 'c'], 
        'column2': [1, 4, 2, 5, 3]}
df = pd.DataFrame(data, columns = ['column1', 'column2'])
df

Out[5]:
  column1  column2
0       a        1
1       a        4
2       a        2
3       b        5
4       c        3
5 rows × 2 columns


df.column:

In [8]:

df.column1
Out[8]:
0    a
1    a
2    a
3    b
4    c
Name: column1, dtype: object


df['column']:

In [9]:

df['column1']
Out[9]:
0    a
1    a
2    a
3    b
4    c
Name: column1, dtype: object

Answered by acushner

For setting values, you need to use df['column'] = series.

Once this is done, however, you can refer to that column in the future with df.column, assuming it's a valid Python name. (So df.column works, but df.6column would still have to be accessed with df['6column'].)
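A minimal sketch of the point above, using made-up column names ('column2', '6column', 'count') that are only for illustration. It creates columns with bracket assignment, reads one back through attribute access, and shows two cases where only brackets reach the column: a name that is not a valid Python identifier, and a name that collides with an existing DataFrame method.

```python
import pandas as pd

df = pd.DataFrame({'column1': ['a', 'a', 'a', 'b', 'c']})

# Bracket assignment creates a real column.
df['column2'] = pd.Series([1, 4, 2, 5, 3])

# Once the column exists, attribute access returns the same data.
same = df.column2.equals(df['column2'])

# A name that is not a valid Python identifier only works with brackets;
# writing df.6column would be a SyntaxError before pandas is even involved.
df['6column'] = [10, 20, 30, 40, 50]
six = df['6column'].tolist()

# A column named like an existing DataFrame method is shadowed by that
# method under attribute access, so df.count is still the method here.
df['count'] = [0, 0, 0, 0, 0]
count_is_method = callable(df.count)

print(same, six, count_is_method)
```

Because of cases like the last two, bracket access is the form that works for every column name, while attribute access is a convenience for names that happen to be safe.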

I think the subtle difference here is that when you set something with df['column'] = ser, pandas goes ahead and adds it to the columns and does some other stuff (I believe by overriding the functionality in __setitem__). If you do df.column = ser with a name that is not already a column, it's just like adding a new attribute to any existing object via __setattr__, and pandas does not seem to turn it into a column.
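The difference can be checked directly. The sketch below assumes a reasonably recent pandas version and uses hypothetical names ('column2', 'column3'). Recent versions emit a UserWarning when a list-like value is assigned to a new attribute name, so the warning is silenced here to keep the demonstration quiet.

```python
import warnings
import pandas as pd

df = pd.DataFrame({'column1': ['a', 'b', 'c']})
ser = pd.Series([1, 2, 3])

# Bracket assignment goes through __setitem__ and adds a column.
df['column2'] = ser

# Attribute assignment with a new name only sets a plain Python attribute.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')
    df.column3 = ser

has_column2 = 'column2' in df.columns   # a real column
has_column3 = 'column3' in df.columns   # not a column
has_attr = hasattr(df, 'column3')       # but the attribute exists

# Attribute assignment to a column that already exists does update it.
df.column2 = [10, 20, 30]
updated = df['column2'].tolist()

print(has_column2, has_column3, has_attr, updated)
```

So attribute assignment is only safe for overwriting a column that already exists; creating a column needs the bracket form.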