How to get every nth column in pandas?

Question

Asked by Angelo

I have a dataframe which looks like this:

    a1    b1    c1    a2    b2    c2    a3    ...
x   1.2   1.3   1.2   ...   ...   ...   ...
y   1.4   1.2   ...   ...   ...   ...   ...
z   ...

What I want is to group by every nth column. In other words, I want one dataframe with all the a columns, one with the b columns, and one with the c columns:

    a1     a2     a4
x   1.2    ...    ...
y
z

In another SO question I saw that it is possible to do df.iloc[::5,:], for example, to get every 5th row. I could of course do df.iloc[:,::3] to get one group of columns (the a columns, since the slice starts at the first column), but I don't see how to get the other two groups.

Any ideas?

Answer 1

Answered by EdChum

Slice the columns:

df[df.columns[::2]]

This selects every second column; use a step of n to get every nth column.

Example:

In [2]:
cols = ['a1','b1','c1','a2','b2','c2','a3']
df = pd.DataFrame(columns=cols)
df

Out[2]:
Empty DataFrame
Columns: [a1, b1, c1, a2, b2, c2, a3]
Index: []

In [3]:
df[df.columns[::3]]

Out[3]:
Empty DataFrame
Columns: [a1, a2, a3]
Index: []

You can also filter using startswith:

In [5]:
a = df.columns[df.columns.str.startswith('a')]
df[a]

Out[5]:
Empty DataFrame
Columns: [a1, a2, a3]
Index: []

Then do the same for the b and c columns.
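As an alternative to building the mask by hand, DataFrame.filter with a regex selects columns by label in one call. This is a minimal sketch rather than part of the original answer; the sample values and the name b_cols are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data using the question's column layout.
df = pd.DataFrame(
    [[1.2, 1.3, 1.2, 0.5, 0.6, 0.7, 0.9],
     [1.4, 1.2, 1.1, 0.4, 0.3, 0.2, 0.8]],
    index=["x", "y"],
    columns=["a1", "b1", "c1", "a2", "b2", "c2", "a3"],
)

# Keep only the columns whose label starts with "b".
b_cols = df.filter(regex=r"^b")
print(list(b_cols.columns))
```

Changing the pattern to "^a" or "^c" gives the other two groups.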

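You can get the unique column prefixes using the following. On recent pandas versions, pass expand=False so that extract returns an Index rather than a DataFrame; the exact repr and dtype shown depend on the pandas version:

In [19]:
df.columns.str.extract(r'([a-zA-Z])', expand=False).unique()

Out[19]:
Index(['a', 'b', 'c'], dtype='object')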

You can then use these values to filter the columns with startswith.
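Putting the two steps together might look like the sketch below, which builds one sub-frame per prefix. It assumes every label begins with a single letter, as in the question; the names prefixes and groups are illustrative.

```python
import pandas as pd

df = pd.DataFrame(columns=["a1", "b1", "c1", "a2", "b2", "c2", "a3"])

# First letter of each label; expand=False keeps the result one-dimensional.
prefixes = df.columns.str.extract(r"([a-zA-Z])", expand=False).unique()

# One sub-frame per prefix, selected with a boolean mask over the columns.
groups = {p: df.loc[:, df.columns.str.startswith(p)] for p in prefixes}

for prefix, frame in groups.items():
    print(prefix, list(frame.columns))
```

Unlike positional slicing, this does not depend on the columns appearing in a fixed repeating order.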

Answer 2

Answered by divandc

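The following should work (df.ix has since been removed from pandas, so df.iloc is used here):

df.iloc[:, ::2]   # every second column, beginning with the first
df.iloc[:, 1::2]  # every second column, beginning with the second
....

With the three-column pattern in the question (a, b, c), use a step of 3 instead of 2.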

I was searching for a solution to the same problem, and this solved it.

Answer 3

Answered by joctee

As of pandas 0.24, this works:

Getting your 'a' columns:

df.iloc[:, ::3]

Getting your 'b' columns:

df.iloc[:, 1::3]

Getting your 'c' columns:

df.iloc[:, 2::3]
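The three slices above follow one pattern, df.iloc[:, i::n] for each offset i below n, so they can be generated in a loop. A minimal sketch, assuming the columns repeat in a fixed order; the helper name split_every_nth and the sample values are hypothetical.

```python
import pandas as pd


def split_every_nth(df, n):
    """Return n frames; frame i holds columns i, i + n, i + 2n, ..."""
    return [df.iloc[:, i::n] for i in range(n)]


# Hypothetical one-row frame with the question's column layout.
df = pd.DataFrame(
    [[1, 2, 3, 4, 5, 6, 7]],
    columns=["a1", "b1", "c1", "a2", "b2", "c2", "a3"],
)

a, b, c = split_every_nth(df, 3)
print(list(a.columns), list(b.columns), list(c.columns))
```

Because the slicing is purely positional, the column count does not have to be a multiple of n; the later groups simply end up one column shorter.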
