pandas: How to get the column name when iterating through a DataFrame?

Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/47481874/

Time: 2020-09-14 04:49:01  Source: igfitidea

How to get the column name when iterating through a pandas DataFrame?

python pandas

Asked by veridian

I only want the column name when iterating using:

for index, row in df.iterrows()

Answered by cs95

When iterating over a dataframe using df.iterrows:

for i, row in df.iterrows():
    ...

Each row `row` is converted to a Series, where row.index corresponds to df.columns, and row.values corresponds to df.loc[i].values, the column values at row i.

Minimal Code Sample

import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]}, index=['a', 'b'])
df
#    A  B
# a  1  3
# b  2  4

row = None
for i, row in df.iterrows():
    print(row['A'], row['B'])
# 1 3
# 2 4

row   # outside the loop, `row` holds the last row
# A    2
# B    4
# Name: b, dtype: int64

row.index
# Index(['A', 'B'], dtype='object')

row.index.equals(df.columns)
# True

row.index[0]
# 'A'
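
When both the column name and its value are needed inside the loop, one option (not part of the original answer) is to iterate over each row with Series.items(), which yields (label, value) pairs. A minimal sketch, reusing the same sample frame; the `pairs` list is only there for illustration:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]}, index=['a', 'b'])

pairs = []  # hypothetical container, used only to collect the results
for i, row in df.iterrows():
    # row is a Series whose index holds the column names
    for column_name, value in row.items():
        pairs.append((i, column_name, int(value)))

print(pairs)
```

This avoids indexing row.index by position when the goal is simply to walk through every column of the current row.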

Answered by Sraffa

You are already getting the column name, so if you just want to drop the Series you can use the throwaway variable _ when starting the loop.

for column_name, _ in df.items():  # df.iteritems() was removed in pandas 2.0
    pass  # do something

However, I don't really understand the use case. You could just iterate over the column names directly:

for column in df.columns:
    pass  # do something
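
To make the difference between the two loops concrete, here is a small sketch on an assumed two-column frame (the frame and the variable names are illustrative, not from the original answer). It uses df.items(), the current name for column-wise iteration:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]}, index=['a', 'b'])

# Column names only, taken directly from df.columns
names_direct = [column for column in df.columns]

# (column name, column Series) pairs; the Series is discarded via `_`
names_from_items = [column_name for column_name, _ in df.items()]

# Keeping the Series is useful when a per-column value is also needed
column_sums = {column_name: int(column.sum()) for column_name, column in df.items()}

print(names_direct, names_from_items, column_sums)
```

Both loops visit the same names in the same order; df.items() is only worth it when the column data is used as well.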

Answered by javac

When using for index, row in df.iterrows(), row.index[i] gives the name of the column at position i, for example:

import numpy as np
import pandas as pd

pdf = pd.DataFrame(np.random.randint(0, 10, size=(10, 4)), columns=list('ABCD'))
pdf.head(5)
#    A  B  C  D
# 0  3  1  2  6
# 1  5  8  7  3
# 2  7  2  2  5
# 3  0  9  9  4
# 4  1  8  1  4

for index, row in pdf[:3].iterrows():  # check only the first 3 rows
    for i in range(4):
        if row.iloc[i] > 7:  # positional access; row[i] is deprecated for a label-indexed Series
            print(row.index[i])  # for the random sample shown above, the answer is B
# B
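
Because the frame above is random, the printed column changes from run to run. As a reproducible sketch (an alternative, not the answer's own code), the first three rows of the sample output can be hard-coded, and a boolean mask can replace the positional inner loop:

```python
import pandas as pd

# Fixed data copied from the first three rows of the sample output above
pdf = pd.DataFrame(
    [[3, 1, 2, 6], [5, 8, 7, 3], [7, 2, 2, 5]],
    columns=list('ABCD'),
)

matches = []  # hypothetical container for (row label, column name) hits
for index, row in pdf.iterrows():
    # row > 7 is a boolean Series; indexing with it keeps only the matching columns
    for column_name in row[row > 7].index:
        matches.append((int(index), column_name))

print(matches)
```

With this data only row 1 has a value above 7, in column B.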