pandas: select multiple DataFrame columns by position

Question

Asked by aiden rosenblatt

I have a (large) dataframe. How can I select specific columns by position? e.g. columns 1..3, 5, 6

Rather than just drop column4, I am trying to do it in this way because there are a ton of rows in my dataset and I want to select by position:

 df=df[df.columns[0:2,4:5]]

but that gives IndexError: too many indices for array

DF input

 Col1     Col2     Col3       Col4        Col5       Col6
 1        apple    tomato     pear        banana     banana
 1        apple    grape      nan         banana     banana
 1        apple    nan        banana      banana     banana
 1        apple    tomato     banana      banana     banana
 1        apple    tomato     banana      banana     banana
 1        apple    tomato     banana      banana     banana
 1        avacado  tomato     banana      banana     banana
 1        toast    tomato     banana      banana     banana
 1        grape    tomato     egg         banana     banana

DF output - desired

 Col1     Col2     Col3       Col5       Col6
 1        apple    tomato     banana     banana
 1        apple    grape      banana     banana
 1        apple    nan        banana     banana
 1        apple    tomato     banana     banana
 1        apple    tomato     banana     banana
 1        apple    tomato     banana     banana     
 1        avacado  tomato     banana     banana     
 1        toast    tomato     banana     banana     
 1        grape    tomato     banana     banana

Answer 1

Answered by YOBEN_S

What you need is numpy's np.r_, which concatenates several slices into a single array of integer positions:

df.iloc[:,np.r_[0:2,4:5]]
Out[265]: 
   Col1     Col2    Col5
0     1    apple  banana
1     1    apple  banana
2     1    apple  banana
3     1    apple  banana
4     1    apple  banana
5     1    apple  banana
6     1  avacado  banana
7     1    toast  banana
8     1    grape  banana
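
Note that 0:2,4:5 picks positions 0, 1 and 4 only, because slice ends are exclusive. To reproduce the desired output from the question (Col1 to Col3, Col5, Col6), the slices would presumably be 0:3 and 4:6. A minimal sketch, assuming a small stand-in frame rebuilt from the first three rows of the question's sample data:

```python
import numpy as np
import pandas as pd

# Stand-in for the question's frame (first three rows only; an assumption for illustration).
df = pd.DataFrame({
    "Col1": [1, 1, 1],
    "Col2": ["apple", "apple", "apple"],
    "Col3": ["tomato", "grape", np.nan],
    "Col4": ["pear", np.nan, "banana"],
    "Col5": ["banana", "banana", "banana"],
    "Col6": ["banana", "banana", "banana"],
})

# np.r_ concatenates the slices into one integer array; slice ends are exclusive.
positions = np.r_[0:3, 4:6]
print(positions)  # [0 1 2 4 5]

# Select all rows and only the listed column positions.
result = df.iloc[:, positions]
print(list(result.columns))  # ['Col1', 'Col2', 'Col3', 'Col5', 'Col6']
```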

Answer 2

Answered by jpp

You can select columns 0, 1, 4 in this way:

df.iloc[:, [0, 1, 4]]

You can read more about this in Indexing and Selecting Data.

.iloc is primarily integer position based (from 0 to length-1 of the axis), but may also be used with a boolean array. .iloc will raise IndexError if a requested indexer is out-of-bounds, except slice indexers, which allow out-of-bounds indexing (this conforms with Python/NumPy slice semantics). Allowed inputs are:
- An integer, e.g. 5
- A list or array of integers, e.g. [4, 3, 0]
- A slice object with ints, e.g. 1:7
- A boolean array
- A callable function with one argument (the calling Series, DataFrame or Panel) that returns valid output for indexing (one of the above)
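
The allowed inputs listed above can be tried on the column axis as follows. This is only a sketch on a hypothetical one-row, six-column frame; the column names a to f are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical six-column frame, used only for illustration.
df = pd.DataFrame([[1, 2, 3, 4, 5, 6]], columns=list("abcdef"))

by_list = df.iloc[:, [4, 3, 0]]             # list of integers, order is preserved
by_slice = df.iloc[:, 1:7]                  # slice end may run past the last column
mask = np.array([True, False, True, False, False, True])
by_mask = df.iloc[:, mask]                  # boolean array, one flag per column
by_callable = df.iloc[:, lambda d: [0, 1]]  # callable returning a valid indexer

print(list(by_list.columns))   # ['e', 'd', 'a']
print(list(by_slice.columns))  # ['b', 'c', 'd', 'e', 'f']

# Unlike a slice, an out-of-bounds integer inside a list raises IndexError.
raised = False
try:
    df.iloc[:, [0, 6]]
except IndexError:
    raised = True
print(raised)  # True
```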

Answer 3

Answered by YanSym

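Use the pandas iloc indexer. Positions are zero-based, so columns 1..3, 5, 6 of the question are positions 0, 1, 2, 4, 5 (position 6 would be out of bounds on a six-column frame):

df_filtered = df.iloc[:, [0, 1, 2, 4, 5]]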

Answer 4

Answered by Tai

The error the OP faces comes from df.columns[0:2,4:5], where too many indices are passed to a one-dimensional Index. IIUC, you can put all the column names you need together to do the selection.

from itertools import chain
cols_to_select = list(v for v in chain(df.columns[0:2], df.columns[4:5]))
df_filtered = df[cols_to_select]

If there can be name conflicts in cols_to_select, select with iloc as jp_data_analysis suggested, or with np.r_ as Wen suggested.
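
To illustrate the name-conflict caveat: when a column label is repeated, selecting by name returns every column with that label, while selecting by position returns exactly one. A small sketch on a hypothetical frame with a duplicated label "x":

```python
import pandas as pd

# Hypothetical frame with a repeated column label "x".
df = pd.DataFrame([[1, 2, 3]], columns=["x", "y", "x"])

# Label-based: the name taken from position 0 is "x", which matches both "x" columns.
by_label = df[list(df.columns[0:1])]

# Position-based: only the column at position 0 is returned.
by_position = df.iloc[:, 0:1]

print(by_label.shape)     # (1, 2)
print(by_position.shape)  # (1, 1)
```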

Answer 5

Answered by student

You can also use range with np.concatenate from numpy, where np.concatenate combines two different ranges of column positions:

import numpy as np
df = df[df.columns[np.concatenate([range(0,3),range(4,6)])]]
df

Output:

   Col1     Col2    Col3    Col5    Col6
0     1    apple  tomato  banana  banana
1     1    apple   grape  banana  banana
2     1    apple     NaN  banana  banana
3     1    apple  tomato  banana  banana
4     1    apple  tomato  banana  banana
5     1    apple  tomato  banana  banana
6     1  avacado  tomato  banana  banana
7     1    toast  tomato  banana  banana
8     1    grape  tomato  banana  banana
