Python/Pandas dataframe - return column name
Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/38169342/
Asked by lte__
Is there a way to get the name/header of a column as a string in a pandas dataframe? I want to work with groups of columns that share the same prefix. The dataframe header looks like this:
col_00 | col_01 | ... | col_51 | bc_00 | cd_00 | cd_01 | ... | cd_90
I'd like to apply a function to each row, but only from col_00 to col_51 and from cd_00 to cd_90, separately. To do this, I thought I'd collect the column names into a list: e.g. to_work_with would be the list of columns starting with the prefix 'col', and I'd apply the function to df[to_work_with]. Then I'd change to_work_with so that it contains the list of columns starting with the 'cd' prefix, et cetera. But I don't know how to iterate through the column names.
So basically, the thing I'm looking for is this function:
to_work_with = column names in the df that start with "thisstring"
How can I do that? Thank you!
Answered by jezrael
You can use boolean indexing with str.startswith:
cols = df.columns[df.columns.str.startswith('cd')]
print (cols)
Index(['cd_00', 'cd_01', 'cd_02', 'cd_90'], dtype='object')
Sample:
print (df)
   col_00  col_01  col_02  col_51  bc_00  cd_00  cd_01  cd_02  cd_90
0       1       2       3       4      5      6      7      8      9
cols = df.columns[df.columns.str.startswith('cd')]
print (cols)
Index(['cd_00', 'cd_01', 'cd_02', 'cd_90'], dtype='object')
# Apply a function to the filtered columns only
def f(x):
    return x + 1

df[cols] = df[cols].apply(f)
print (df)
   col_00  col_01  col_02  col_51  bc_00  cd_00  cd_01  cd_02  cd_90
0       1       2       3       4      5      7      8      9     10
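The question asks for a function applied to each row, separately for each prefix group. The sample above applies f column by column, which is the default for apply. A minimal sketch of the row-wise variant follows. It assumes a one-row frame like the sample, and row_total and the results dict are hypothetical names used only for illustration, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data mirroring the layout shown in the question.
df = pd.DataFrame(
    [[1, 2, 3, 4, 5, 6, 7, 8, 9]],
    columns=["col_00", "col_01", "col_02", "col_51", "bc_00",
             "cd_00", "cd_01", "cd_02", "cd_90"],
)

def row_total(row):
    # Placeholder row-wise function; swap in the real computation.
    return row.sum()

results = {}
for prefix in ("col_", "cd_"):
    # Select the column names that start with the current prefix.
    to_work_with = df.columns[df.columns.str.startswith(prefix)]
    # axis=1 passes each row, restricted to the selected columns, to the function.
    results[prefix] = df[to_work_with].apply(row_total, axis=1)

print(results["col_"].tolist())  # [10]
print(results["cd_"].tolist())   # [30]
```

Keeping the results in a separate dict rather than writing them back into df avoids adding new columns that would themselves match one of the prefixes.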
Another solution uses a list comprehension:
cols = [col for col in df.columns if col.startswith("cd")]
print (cols)
['cd_00', 'cd_01', 'cd_02', 'cd_90']
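The list comprehension can be wrapped in a small helper, which is roughly the function the question asks for. The sketch below assumes the same sample columns. columns_with_prefix is a hypothetical name, not a pandas function. For comparison it also shows DataFrame.filter with a regex, a different technique that is not part of the original answer and that returns the matching sub-frame rather than just the names.

```python
import pandas as pd

# Hypothetical sample data mirroring the layout shown in the question.
df = pd.DataFrame(
    [[1, 2, 3, 4, 5, 6, 7, 8, 9]],
    columns=["col_00", "col_01", "col_02", "col_51", "bc_00",
             "cd_00", "cd_01", "cd_02", "cd_90"],
)

def columns_with_prefix(frame, prefix):
    """Return the column names of frame that start with prefix, as a plain list."""
    # str() is an assumption here: it guards against non-string column labels.
    return [col for col in frame.columns if str(col).startswith(prefix)]

to_work_with = columns_with_prefix(df, "cd")
print(to_work_with)  # ['cd_00', 'cd_01', 'cd_02', 'cd_90']

# Alternative: let pandas select the matching columns directly.
# "^cd" anchors the pattern at the start of each column name.
cd_frame = df.filter(regex="^cd")
print(list(cd_frame.columns))  # ['cd_00', 'cd_01', 'cd_02', 'cd_90']
```

The helper returns a list, while the boolean-indexing version returns an Index. Both can be passed to df[...] to select the columns.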