Python pandas -> select by condition on column names

Question

Asked by CezarySzulc

I have a df with column names: 'a', 'b', 'c' ... 'z'.

print(my_df.columns)
Index(['a', 'b', 'c', ... 'y', 'z'],
  dtype='object', name=0)

I have a function that determines which columns should be displayed. For example:

start = con_start()
stop = con_stop()
print((my_df.columns >= start) & (my_df.columns <= stop))

My result is:

[False False ... False False False False  True  True
True  True False False]

My goal is to display the dataframe with only the columns that satisfy my condition. If start = 'a' and stop = 'b', I want to get:

0                                      a              b         
index1       index2                                                  
New York     New York           0.000000       0.000000          
California   Los Angeles   207066.666667  214466.666667     
Illinois     Chicago       138400.000000  143633.333333     
Pennsylvania Philadelphia   53000.000000   53633.333333      
Arizona      Phoenix       111833.333333  114366.666667

Answer 1

Accepted answer by piRSquared

I want to make this robust and with as few assumptions as possible.

option 1
use iloc with array slicing
Assumptions:

my_df.columns.is_unique evaluates to True
columns are already in order

start = df.columns.get_loc(con_start())
stop = df.columns.get_loc(con_stop())

df.iloc[:, start:stop + 1]
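
As a minimal, self-contained sketch of option 1: the frame below is made up, and the literals 'b' and 'd' are stand-ins for whatever con_start() and con_stop() return in the question.

```python
import pandas as pd

# Hypothetical data: five unique, already-ordered column labels.
df = pd.DataFrame([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]], columns=list("abcde"))

# Stand-ins for con_start() and con_stop() from the question.
start_label, stop_label = "b", "d"

# With unique labels, get_loc returns a single integer position.
start = df.columns.get_loc(start_label)
stop = df.columns.get_loc(stop_label)

# Positional slices exclude the end position, hence the + 1.
selected = df.iloc[:, start:stop + 1]
print(selected.columns.tolist())
```

If the labels were not unique, get_loc could return a slice or a boolean mask instead of an integer, which is why the uniqueness assumption matters here.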

option 2
use loc with boolean slicing
Assumptions:

column values are comparable

start = con_start()
stop = con_stop()

c = df.columns.values
m = (start <= c) & (stop >= c)

df.loc[:, m]
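
A small sketch of option 2 on made-up data may help show that the boolean mask does not depend on column order. The shuffled labels and the 'b' / 'd' bounds are illustrative assumptions, not values from the question.

```python
import pandas as pd

# Hypothetical data with columns deliberately out of order.
df = pd.DataFrame([[1, 2, 3, 4, 5]], columns=list("daecb"))

# Stand-ins for con_start() and con_stop() from the question.
start, stop = "b", "d"

c = df.columns.values            # array of column labels
m = (start <= c) & (stop >= c)   # elementwise comparison gives a boolean mask

# Columns whose mask entry is True are kept, in their existing order.
selected = df.loc[:, m]
print(selected.columns.tolist())
```

The selected columns keep the order they had in the frame; the mask only decides which ones survive.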

Answer 2

Answered by Scott Boston

You can use slicing to achieve this with .loc:

df.loc[:, 'a':'b']
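
One detail worth illustrating is that label slices in .loc include both endpoints, unlike positional slices in .iloc. The following sketch uses a made-up four-column frame to show the difference.

```python
import pandas as pd

# Hypothetical data with ordered, unique column labels.
df = pd.DataFrame([[1, 2, 3, 4]], columns=list("abcd"))

# Label-based slice: both 'a' and 'b' are included.
by_label = df.loc[:, "a":"b"]

# Position-based slice: the end position 1 is excluded.
by_position = df.iloc[:, 0:1]

print(by_label.columns.tolist())
print(by_position.columns.tolist())
```

This is the shortest form when the start and stop labels are known, but it relies on the columns being in the desired order.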

Answer 3

Answered by Petr Matuska

If your conditions are about as complex as the ones shown in your example, there is no need for an additional function; just filter directly, e.g.

sweet_and_red_fruit = fruit[(fruit["sweet"] == 1) & (fruit["colour"] == "red")]
print(sweet_and_red_fruit)

Or, if you just want to print:

print(fruit[(fruit["sweet"] == 1) & (fruit["colour"] == "red")])
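
For a runnable version of this filter, here is a sketch on invented data. The fruit frame and its "name" column are assumptions for illustration; only "sweet" and "colour" come from the snippet above. Note that this selects rows by value, whereas the question selects columns by label.

```python
import pandas as pd

# Hypothetical data; only the "sweet" and "colour" column names come from the answer.
fruit = pd.DataFrame({
    "name": ["apple", "lemon", "cherry", "banana"],
    "sweet": [1, 0, 1, 1],
    "colour": ["red", "yellow", "red", "yellow"],
})

# Each comparison yields a boolean Series. The parentheses are required
# because & binds more tightly than ==.
mask = (fruit["sweet"] == 1) & (fruit["colour"] == "red")

sweet_and_red_fruit = fruit[mask]
print(sweet_and_red_fruit["name"].tolist())
```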

Answer 4

Answered by acidtobi

Generate a list of columns to display:

cols = [x for x in my_df.columns if start <= x <= stop]

Use only these columns in your DataFrame:

my_df[cols]
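
Put together on made-up data, the two steps might look like the sketch below. The frame and the 'b' / 'd' bounds are illustrative assumptions standing in for the question's start and stop values.

```python
import pandas as pd

# Hypothetical data with single-letter column labels.
my_df = pd.DataFrame([[1, 2, 3, 4, 5]], columns=list("abcde"))

# Stand-ins for the start and stop values from the question.
start, stop = "b", "d"

# Chained comparison keeps every label within the inclusive range.
cols = [x for x in my_df.columns if start <= x <= stop]

# Indexing with a list of labels returns just those columns.
selected = my_df[cols]
print(cols)
```

Because the list is built in plain Python, the same pattern extends to any per-label test, not only a range check.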

Answer 5

Answered by Binyamin Even

Assuming result is your [True/False] array and letters is [a...z]:

res = [letters[i] for i, r in enumerate(result) if r]
new_df = df[res]
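
A self-contained sketch of this approach follows, with an invented four-column frame and a hand-written mask standing in for result. It also shows, as an alternative not mentioned in the answer, that .loc accepts the boolean list directly.

```python
import pandas as pd

# Hypothetical data; letters mirrors the column labels.
df = pd.DataFrame([[1, 2, 3, 4]], columns=list("abcd"))
letters = list(df.columns)

# Hypothetical mask, one entry per column.
result = [False, True, True, False]

# Keep the labels whose mask entry is True, then select those columns.
res = [letters[i] for i, r in enumerate(result) if r]
new_df = df[res]

# Alternative: pass the boolean mask straight to .loc.
same_df = df.loc[:, result]
print(res)
```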
