Pandas: setting a multi-index on rows, then transposing to columns

Question

Asked by sheridp

If I have a simple dataframe:

print(a)

  one  two three
0   A    1     a
1   A    2     b
2   B    1     c
3   B    2     d
4   C    1     e
5   C    2     f

I can easily create a multi-index on the rows by issuing:

a.set_index(['one', 'two'])

        three
one two      
A   1       a
    2       b
B   1       c
    2       d
C   1       e
    2       f

Is there a similarly easy way to create a multi-index on the columns?

I'd like to end up with:

    one A       B       C   
    two 1   2   1   2   1   2
    0   a   b   c   d   e   f

In this case, it would be pretty simple to create the row multi-index and then transpose it, but in other examples, I'll be wanting to create a multi-index on both the rows and columns.

Answer 1

Accepted answer by piRSquared

Yes! It's called transposition.

a.set_index(['one', 'two']).T
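As a minimal sketch, the transposition can be run end to end on the question's data. The frame a is rebuilt here from the printed table, and the name wide is only illustrative.

```python
import pandas as pd

# Rebuild the question's frame from its printed form
a = pd.DataFrame({
    "one": ["A", "A", "B", "B", "C", "C"],
    "two": [1, 2, 1, 2, 1, 2],
    "three": ["a", "b", "c", "d", "e", "f"],
})

# Build the row MultiIndex, then swap rows and columns
wide = a.set_index(["one", "two"]).T

print(wide)
print(wide.columns.names)  # level names carried over from the row index
```

Note that the single remaining row is labelled three (the former column name) rather than 0, because transposing turns the old column labels into the row index.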

Let's borrow from @ragesz's post because they used a much better example to demonstrate with.

import pandas as pd

df = pd.DataFrame({'a':['foo_0', 'bar_0', 1, 2, 3], 'b':['foo_0', 'bar_1', 11, 12, 13],
    'c':['foo_1', 'bar_0', 21, 22, 23], 'd':['foo_1', 'bar_1', 31, 32, 33]})

df.T.set_index([0, 1]).T
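A sketch of what this produces for the borrowed example, under a few assumptions: the level names first and second, the reset of the row index, and the final astype(int) cleanup are additions for illustration and are not part of the answer. Because the header rows shared columns with the data, the values stay object dtype until they are converted.

```python
import pandas as pd

df = pd.DataFrame({
    "a": ["foo_0", "bar_0", 1, 2, 3],
    "b": ["foo_0", "bar_1", 11, 12, 13],
    "c": ["foo_1", "bar_0", 21, 22, 23],
    "d": ["foo_1", "bar_1", 31, 32, 33],
})

# Rows 0 and 1 become columns after the first transpose, so they can be
# passed to set_index; the second transpose moves them onto the columns.
out = df.T.set_index([0, 1]).T

# Assumed cleanup steps (not in the original answer):
out.columns.names = ["first", "second"]   # levels are otherwise named 0 and 1
out = out.reset_index(drop=True)          # row labels would otherwise start at 2
out = out.astype(int)                     # remaining values are all integers here

print(out)
```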

Answer 2

Answered by Nickil Maveli

You could use pivot_table followed by a series of manipulations on the dataframe to get the desired form:

import numpy as np
import pandas as pd

# df is the question's dataframe (called a there)
df_pivot = pd.pivot_table(df, index=['one', 'two'], values='three', aggfunc=np.sum)

def rename_duplicates(old_list):    # Replace duplicates in the index with an empty string
    seen = {}
    for x in old_list:
        if x in seen:
            seen[x] += 1
            yield " " 
        else:
            seen[x] = 0
            yield x

col_group = df_pivot.unstack().stack().reset_index(level=-1)
col_group.index = rename_duplicates(col_group.index.tolist())
col_group.index.name = df_pivot.index.names[0]
col_group.T

one  A     B     C   
two  1  2  1  2  1  2
0    a  b  c  d  e  f
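The question also mentions wanting a multi-index on both rows and columns. One option, which is not part of this answer, is to put every key column into the row index with set_index and then move some levels to the columns with unstack. The sketch below assumes hypothetical extra key columns batch and rep and made-up values; it needs each key combination to be unique.

```python
from itertools import product

import pandas as pd

# Hypothetical long-format data: two row keys (batch, rep) and the
# question's two column keys (one, two); values are made up for illustration.
records = [
    {"batch": b, "rep": r, "one": o, "two": t, "three": f"{b}{r}{o}{t}"}
    for b, r, o, t in product(["x", "y"], [1, 2], ["A", "B", "C"], [1, 2])
]
long_df = pd.DataFrame(records)

# All keys go into the row index first; unstack then moves the chosen
# levels to the columns, leaving the others as a row MultiIndex.
wide = (
    long_df.set_index(["batch", "rep", "one", "two"])["three"]
    .unstack(["one", "two"])
)

print(wide)
```

Here the levels left in the row index (batch, rep) and the levels passed to unstack (one, two) can be chosen freely, so the same pattern covers the columns-only case as well.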

Answer 3

Answered by ragesz

I think the short answer is no. To get multi-index columns, the dataframe needs two (or more) rows that can be converted into headers, just as multi-index rows are built from columns. If you have that kind of dataframe, creating a multi-index header is not difficult. It can be done in one very long line of code that you can reuse on any other dataframe; you only need to keep the row numbers of the headers in mind and change them if they differ:

import pandas as pd

df = pd.DataFrame({'a':['foo_0', 'bar_0', 1, 2, 3], 'b':['foo_0', 'bar_1', 11, 12, 13],
    'c':['foo_1', 'bar_0', 21, 22, 23], 'd':['foo_1', 'bar_1', 31, 32, 33]})

The dataframe:

       a      b      c      d
0  foo_0  foo_0  foo_1  foo_1
1  bar_0  bar_1  bar_0  bar_1
2      1     11     21     31
3      2     12     22     32
4      3     13     23     33

Creating the multi-index object:

arrays = [df.iloc[0].tolist(), df.iloc[1].tolist()]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])

df.columns = index

Multi-index header result:

first   foo_0         foo_1       
second  bar_0  bar_1  bar_0  bar_1
0       foo_0  foo_0  foo_1  foo_1
1       bar_0  bar_1  bar_0  bar_1
2           1     11     21     31
3           2     12     22     32
4           3     13     23     33

Finally, we need to drop rows 0 and 1, which now duplicate the header, and then reset the row index:

df = df.iloc[2:].reset_index(drop=True)

The "one-line" version (the only things you have to change are the header row indexes and the dataframe itself):

idx_first_header = 0
idx_second_header = 1

df.columns = pd.MultiIndex.from_tuples(list(zip(*[df.iloc[idx_first_header].tolist(),
    df.iloc[idx_second_header].tolist()])), names=['first', 'second'])

df = df.drop([idx_first_header, idx_second_header], axis=0).reset_index(drop=True)
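For reuse across dataframes, the same steps could be wrapped in a small helper. This is only a sketch: the function name promote_rows_to_header and its parameters are hypothetical, it swaps MultiIndex.from_tuples for MultiIndex.from_arrays (which takes the per-level lists directly), and the closing infer_objects call is an added assumption that the remaining data should get better dtypes than object.

```python
import pandas as pd


def promote_rows_to_header(df, header_rows, names=None):
    """Return a copy of df whose columns are a MultiIndex built from
    the rows at the given positions (hypothetical helper)."""
    # One list of labels per header level, read by row position
    arrays = [df.iloc[pos].tolist() for pos in header_rows]
    # Drop the header rows from the data and renumber what is left
    out = df.drop(index=df.index[header_rows]).reset_index(drop=True)
    out.columns = pd.MultiIndex.from_arrays(arrays, names=names)
    # Let pandas re-infer dtypes now that the string rows are gone
    return out.infer_objects()


df = pd.DataFrame({
    "a": ["foo_0", "bar_0", 1, 2, 3],
    "b": ["foo_0", "bar_1", 11, 12, 13],
    "c": ["foo_1", "bar_0", 21, 22, 23],
    "d": ["foo_1", "bar_1", 31, 32, 33],
})

result = promote_rows_to_header(df, [0, 1], names=["first", "second"])
print(result)
```

Unlike the assignments above, this version leaves the original df untouched and returns a new frame.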
