Slicing multiple column ranges in Pandas by a list of names

Question

Asked by Guga

I am trying to select multiple columns in a Pandas dataframe using two different approaches:

1) via the column numbers, for example, columns 1-3 and columns 6 onwards.

and

2) via a list of column names, for instance:

years = list(range(2000,2017))
months = list(range(1,13))
years_month = ["A", "B", "C"]
for y in years:
    for m in months:
        y_m = str(y) + "-" + str(m)
        years_month.append(y_m)

Then years_month would produce a list like the following (shown here through 2001):

['A',
 'B',
 'C',
 '2000-1',
 '2000-2',
 '2000-3',
 '2000-4',
 '2000-5',
 '2000-6',
 '2000-7',
 '2000-8',
 '2000-9',
 '2000-10',
 '2000-11',
 '2000-12',
 '2001-1',
 '2001-2',
 '2001-3',
 '2001-4',
 '2001-5',
 '2001-6',
 '2001-7',
 '2001-8',
 '2001-9',
 '2001-10',
 '2001-11',
 '2001-12']

That said, what is the best (or correct) way to load only the columns whose names are in the list years_month, using each of the two approaches?

Answer 1

Answered by jezrael

I think you need numpy.r_ to concatenate the column positions, then use iloc for selecting:

print (df.iloc[:, np.r_[1:3, 6:len(df.columns)]])

For the second approach, subset by the list:

print (df[years_month])
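
One caveat with subsetting by a list: in recent pandas versions, df[years_month] raises a KeyError if any name in the list is not an actual column. The following is a minimal sketch of two ways to keep only the names that exist; the small frame and the shortened years_month list are made up for illustration and are not the asker's data.

```python
import pandas as pd

# Hypothetical frame: only some of the requested names exist as columns.
df = pd.DataFrame({"A": [1, 2], "B": [3, 4], "2000-1": [5, 6], "other": [7, 8]})
years_month = ["A", "B", "C", "2000-1", "2000-2"]

# Option 1: boolean mask over the columns (keeps the DataFrame's column order).
by_mask = df.loc[:, df.columns.isin(years_month)]

# Option 2: filter the list first (keeps the order of the list).
present = [name for name in years_month if name in df.columns]
by_list = df[present]

print(by_mask.columns.tolist())
print(by_list.columns.tolist())
```

Which option fits depends on whether the result should follow the column order of the frame or the order of the list.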

Sample:

import numpy as np
import pandas as pd

df = pd.DataFrame({'2000-1':[1,3,5],
                   '2000-2':[5,3,6],
                   '2000-3':[7,8,9],
                   '2000-4':[1,3,5],
                   '2000-5':[5,3,6],
                   '2000-6':[7,8,9],
                   '2000-7':[1,3,5],
                   '2000-8':[5,3,6],
                   '2000-9':[7,4,3],
                   'A':[1,2,3],
                   'B':[4,5,6],
                   'C':[7,8,9]})

print (df)
   2000-1  2000-2  2000-3  2000-4  2000-5  2000-6  2000-7  2000-8  2000-9  A  \
0       1       5       7       1       5       7       1       5       7  1   
1       3       3       8       3       3       8       3       3       4  2   
2       5       6       9       5       6       9       5       6       3  3   

   B  C  
0  4  7  
1  5  8  
2  6  9  

print (df.iloc[:, np.r_[1:3, 6:len(df.columns)]])
   2000-2  2000-3  2000-7  2000-8  2000-9  A  B  C
0       5       7       1       5       7  1  4  7
1       3       8       3       3       4  2  5  8
2       6       9       5       6       3  3  6  9

You can also concatenate plain ranges (casting to list is necessary in Python 3):

rng = list(range(1,3)) + list(range(6, len(df.columns)))
print (rng)
[1, 2, 6, 7, 8, 9, 10, 11]

print (df.iloc[:, rng])
   2000-2  2000-3  2000-7  2000-8  2000-9  A  B  C
0       5       7       1       5       7  1  4  7
1       3       8       3       3       4  2  5  8
2       6       9       5       6       3  3  6  9

Answer 2

Answered by wonce

I'm not sure exactly what you are asking, but in general DataFrame.loc lets you select by label and DataFrame.iloc by integer position.

For example, selecting the columns at positions 0, 1 and 4:

dataframe.iloc[:, [0, 1, 4]]

and selecting columns labelled 'A', 'B' and 'C':

dataframe.loc[:, ['A', 'B', 'C']]
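
The two selectors can also be combined to pick several ranges given by column names. This is a sketch under the assumption that the column labels are unique (so Index.get_loc returns a single integer); the frame and the chosen labels are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical one-row frame with unique column labels.
columns = ["2000-1", "2000-2", "2000-3", "2000-4", "A", "B", "C"]
df = pd.DataFrame([range(len(columns))], columns=columns)

# A single label range: .loc slices include BOTH endpoints.
label_slice = df.loc[:, "A":"C"]

# Several label ranges: translate labels to positions, then use iloc.
cols = df.columns
positions = np.r_[
    cols.get_loc("2000-2"):cols.get_loc("2000-3") + 1,  # +1 to include the end label
    cols.get_loc("A"):len(cols),                        # from "A" to the last column
]
multi_range = df.iloc[:, positions]

print(label_slice.columns.tolist())
print(multi_range.columns.tolist())
```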
