Column order in Python pandas.concat

Question

Asked by Edward

I run the following:

data1 = pd.DataFrame({ 'b' : [1, 1, 1], 'a' : [2, 2, 2]})
data2 = pd.DataFrame({ 'b' : [1, 1, 1], 'a' : [2, 2, 2]})
frames = [data1, data2]
data = pd.concat(frames)
data


   a    b
0   2   1
1   2   1
2   2   1
0   2   1
1   2   1
2   2   1

The columns of data come out in alphabetical order. Why is that, and how can I keep the original order?

Answer 1

Accepted answer by albert

You are creating DataFrames out of dictionaries. Before Python 3.7, dictionaries were unordered, which means the keys had no guaranteed order. So

d1 = {'key_a': 'val_a', 'key_b': 'val_b'}

and

d2 = {'key_b': 'val_b', 'key_a': 'val_a'}

are the same: d1 == d2 evaluates to True.
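
As a quick check of the point about dictionaries, here is a minimal sketch. It assumes Python 3.7 or later, where insertion order is part of the language; the variable names same, keys_d1 and keys_d2 are only for illustration.

```python
d1 = {'key_a': 'val_a', 'key_b': 'val_b'}
d2 = {'key_b': 'val_b', 'key_a': 'val_a'}

# Equality compares keys and values only, never the order of the keys.
same = d1 == d2

# On Python 3.7+, iteration still follows the order of insertion.
keys_d1 = list(d1)
keys_d2 = list(d2)

print(same, keys_d1, keys_d2)
```

So two dictionaries can be equal and still iterate in different orders, which is why the column order you get depends on how the library reads the keys.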

In addition, I assume that pandas sorts the dictionary's keys in ascending order by default (unfortunately I did not find anything in the docs to confirm this assumption), which leads to the behavior you encountered.

So the fix is to reorder the columns of your DataFrame. You can do this as follows:

import pandas as pd

data1 = pd.DataFrame({ 'b' : [1, 1, 1], 'a' : [2, 2, 2]})
data2 = pd.DataFrame({ 'b' : [1, 1, 1], 'a' : [2, 2, 2]})
frames = [data1, data2]
data = pd.concat(frames)

print(data)

cols = ['b' , 'a']
data = data[cols]

print(data)

Answer 2

Answer by Philip Zelitchenko

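import pandas as pd

def concat_ordered_columns(frames):
    # Collect column names in first-seen order across all frames.
    columns_ordered = []
    for frame in frames:
        columns_ordered.extend(x for x in frame.columns if x not in columns_ordered)
    final_df = pd.concat(frames)
    # Reselect the columns so the result follows the first-seen order.
    return final_df[columns_ordered]

# Usage (df_a, df_b and df_c are DataFrames you already have)
dfs = [df_a, df_b, df_c]
full_df = concat_ordered_columns(dfs)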

This should work.
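
To make the usage concrete, here is a self-contained sketch. The frames df_a and df_b and their contents are made up for illustration, and it assumes a pandas version that keeps dict insertion order when building a DataFrame (0.23 or later on Python 3.6+).

```python
import pandas as pd

def concat_ordered_columns(frames):
    # Column names in first-seen order across all frames.
    columns_ordered = []
    for frame in frames:
        columns_ordered.extend(x for x in frame.columns if x not in columns_ordered)
    return pd.concat(frames)[columns_ordered]

# Hypothetical input frames with partially overlapping columns.
df_a = pd.DataFrame({'b': [1, 1], 'a': [2, 2]})
df_b = pd.DataFrame({'c': [3, 3], 'b': [4, 4]})

full_df = concat_ordered_columns([df_a, df_b])
result_columns = list(full_df.columns)
print(result_columns)  # ['b', 'a', 'c']
```

Columns that a frame does not have are filled with NaN for that frame's rows, as with any outer concat.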

Answer 3

Answer by Michael H.

Starting from version 0.23.0, you can prevent concat() from sorting the columns of the returned DataFrame by passing sort=False. For example:

df1 = pd.DataFrame({ 'a' : [1, 1, 1], 'b' : [2, 2, 2]})
df2 = pd.DataFrame({ 'b' : [1, 1, 1], 'a' : [2, 2, 2]})
df = pd.concat([df1, df2], sort=False)

A future version of pandas will stop sorting by default.
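
The effect of the sort argument is easiest to see when the frames do not share exactly the same columns. A small sketch, with made-up frames and the illustrative names unsorted_cols and sorted_cols:

```python
import pandas as pd

# Hypothetical frames whose columns only partly overlap.
df1 = pd.DataFrame({'b': [1, 1], 'a': [2, 2]})
df2 = pd.DataFrame({'c': [3, 3], 'a': [4, 4]})

# sort=False keeps columns in order of first appearance.
unsorted_cols = list(pd.concat([df1, df2], sort=False).columns)

# sort=True sorts the column labels.
sorted_cols = list(pd.concat([df1, df2], sort=True).columns)

print(unsorted_cols)  # ['b', 'a', 'c']
print(sorted_cols)    # ['a', 'b', 'c']
```

Passing sort explicitly makes the result independent of which default your pandas version uses.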

Answer 4

Answer by mohrtw

You can create the original DataFrames from OrderedDicts:

from collections import OrderedDict

odict = OrderedDict()
odict['b'] = [1, 1, 1]
odict['a'] = [2, 2, 2]
data1 = pd.DataFrame(odict)
data2 = pd.DataFrame(odict)
frames = [data1, data2]
data = pd.concat(frames)
data


    b    a
0   1    2
1   1    2
2   1    2
0   1    2
1   1    2
2   1    2

Answer 5

Answer by Oumab10

You can also specify the order like this:

import pandas as pd

data1 = pd.DataFrame({ 'b' : [1, 1, 1], 'a' : [2, 2, 2]})
data2 = pd.DataFrame({ 'b' : [1, 1, 1], 'a' : [2, 2, 2]})
listdf = [data1, data2]
data = pd.concat(listdf)
sequence = ['b','a']
data = data.reindex(columns=sequence)
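
One practical difference between reindex and plain list selection (data[cols]) is how they treat a name that is not in the frame. The sketch below illustrates this; the column name 'z' is hypothetical and deliberately absent.

```python
import pandas as pd

data = pd.DataFrame({'b': [1, 1, 1], 'a': [2, 2, 2]})

# reindex accepts a label that is absent and fills that column with NaN.
reindexed = data.reindex(columns=['a', 'b', 'z'])

# Selecting with a list raises KeyError when a label is absent.
try:
    data[['a', 'b', 'z']]
    raised = False
except KeyError:
    raised = True

print(list(reindexed.columns), raised)
```

If a silent NaN column would hide a typo in your column list, list selection may be the safer choice; if missing columns are expected, reindex is more forgiving.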

Answer 6

Answer by Emre Tatbak

The simplest way is to first put the columns in the same order, then concat:

df2 = df2[df1.columns]
df = pd.concat((df1, df2), axis=0)
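
A runnable version of the same idea, as a sketch with made-up frames. It assumes df2 contains every column of df1; otherwise the selection raises KeyError.

```python
import pandas as pd

# Hypothetical frames with the same columns in a different order.
df1 = pd.DataFrame({'b': [1, 1], 'a': [2, 2]})
df2 = pd.DataFrame({'a': [4, 4], 'b': [3, 3]})

# Put df2's columns in df1's order before concatenating.
df2 = df2[df1.columns]
df = pd.concat((df1, df2), axis=0)

cols = list(df.columns)
b_values = df['b'].tolist()
print(cols, b_values)  # ['b', 'a'] [1, 1, 3, 3]
```

Values are matched by column name, not by position, so reordering changes only the layout of the result.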
