Column order in pandas.concat
Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow.
Original URL: http://stackoverflow.com/questions/39046931/
Asked by Edward
I do the following:
data1 = pd.DataFrame({ 'b' : [1, 1, 1], 'a' : [2, 2, 2]})
data2 = pd.DataFrame({ 'b' : [1, 1, 1], 'a' : [2, 2, 2]})
frames = [data1, data2]
data = pd.concat(frames)
data
   a  b
0  2  1
1  2  1
2  2  1
0  2  1
1  2  1
2  2  1
The columns of data come out in alphabetical order. Why is that, and how can I keep the original order?
Accepted answer by albert
You are creating DataFrames out of dictionaries. Before Python 3.7, dictionaries were unordered, which means the keys had no guaranteed order. So
d1 = {'key_a': 'val_a', 'key_b': 'val_b'}
and
d2 = {'key_b': 'val_b', 'key_a': 'val_a'}
are the same.
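To make the point about dictionary equality concrete, here is a minimal sketch using only the standard library. It assumes Python 3.7 or later, where plain dicts keep insertion order when iterated even though equality still ignores that order; the variable names are illustrative only.

```python
from collections import OrderedDict

d1 = {'key_a': 'val_a', 'key_b': 'val_b'}
d2 = {'key_b': 'val_b', 'key_a': 'val_a'}

# Plain dicts compare equal regardless of the order the keys were written in.
same_as_dicts = (d1 == d2)

# On Python 3.7+ iteration follows insertion order, so the key lists differ.
keys_d1 = list(d1)
keys_d2 = list(d2)

# OrderedDict equality, by contrast, is sensitive to order.
same_as_ordered = (OrderedDict(d1) == OrderedDict(d2))

print(same_as_dicts, keys_d1, keys_d2, same_as_ordered)
```

So two dicts can be "the same" under == and still present their keys in a different order to whatever consumes them.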
In addition, I assume that pandas sorts the dictionary's keys in ascending order by default (unfortunately, I did not find anything in the docs to confirm that assumption), which leads to the behavior you encountered.
So the basic idea is to reorder the columns in your DataFrame. You can do this as follows:
import pandas as pd
data1 = pd.DataFrame({ 'b' : [1, 1, 1], 'a' : [2, 2, 2]})
data2 = pd.DataFrame({ 'b' : [1, 1, 1], 'a' : [2, 2, 2]})
frames = [data1, data2]
data = pd.concat(frames)
print(data)
cols = ['b' , 'a']
data = data[cols]
print(data)
Answer by Philip Zelitchenko
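import pandas as pd

def concat_ordered_columns(frames):
    # Collect column names in order of first appearance across all frames
    columns_ordered = []
    for frame in frames:
        columns_ordered.extend(x for x in frame.columns if x not in columns_ordered)
    final_df = pd.concat(frames)
    # Select the columns in the collected order
    return final_df[columns_ordered]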
# Usage
dfs = [df_a,df_b,df_c]
full_df = concat_ordered_columns(dfs)
This should work.
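For a concrete run of the same idea, here is a hedged variant rather than the answer's exact code. It builds the first-appearance column order with dict.fromkeys, which assumes Python 3.7+ for ordered dicts, and passes sort=False, which assumes pandas 0.23 or later. The helper name ordered_union_of_columns and the three small frames are made up for illustration.

```python
import pandas as pd

def ordered_union_of_columns(frames):
    # dict.fromkeys drops duplicates while keeping first-appearance order
    return list(dict.fromkeys(col for frame in frames for col in frame.columns))

# Hypothetical frames; `columns=` pins the column order explicitly.
df_a = pd.DataFrame({'b': [1], 'a': [2]}, columns=['b', 'a'])
df_b = pd.DataFrame({'c': [3], 'b': [4]}, columns=['c', 'b'])
df_c = pd.DataFrame({'a': [5], 'd': [6]}, columns=['a', 'd'])

frames = [df_a, df_b, df_c]
columns_ordered = ordered_union_of_columns(frames)

# Concatenate, then select the columns in first-appearance order.
full_df = pd.concat(frames, sort=False)[columns_ordered]
print(full_df)
```

Cells that a frame does not provide are filled with NaN, as usual for an outer concat.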
Answer by Michael H.
Starting from version 0.23.0, you can prevent concat() from sorting the columns of the returned DataFrame. For example:
df1 = pd.DataFrame({ 'a' : [1, 1, 1], 'b' : [2, 2, 2]})
df2 = pd.DataFrame({ 'b' : [1, 1, 1], 'a' : [2, 2, 2]})
df = pd.concat([df1, df2], sort=False)
A future version of pandas will stop sorting by default.
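The effect of the sort argument is easiest to see when the frames do not share exactly the same columns. The sketch below assumes pandas 0.23 or later; the frames and the names unsorted and sorted_ are illustrative only.

```python
import pandas as pd

# `columns=` pins the column order so the example does not rely on dict ordering.
df1 = pd.DataFrame({'b': [1, 1], 'a': [2, 2]}, columns=['b', 'a'])
df2 = pd.DataFrame({'c': [3, 3], 'a': [4, 4]}, columns=['c', 'a'])

# sort=False keeps columns in order of first appearance across the inputs.
unsorted = pd.concat([df1, df2], sort=False)

# sort=True sorts the combined column labels.
sorted_ = pd.concat([df1, df2], sort=True)

print(list(unsorted.columns))
print(list(sorted_.columns))
```

Passing sort explicitly, whichever value you want, also keeps the result independent of the installed version's default.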
Answer by mohrtw
You can create the original DataFrames with OrderedDicts:
from collections import OrderedDict
odict = OrderedDict()
odict['b'] = [1, 1, 1]
odict['a'] = [2, 2, 2]
data1 = pd.DataFrame(odict)
data2 = pd.DataFrame(odict)
frames = [data1, data2]
data = pd.concat(frames)
data
   b  a
0  1  2
1  1  2
2  1  2
0  1  2
1  1  2
2  1  2
Answer by Oumab10
You can also specify the order like this:
import pandas as pd
data1 = pd.DataFrame({ 'b' : [1, 1, 1], 'a' : [2, 2, 2]})
data2 = pd.DataFrame({ 'b' : [1, 1, 1], 'a' : [2, 2, 2]})
listdf = [data1, data2]
data = pd.concat(listdf)
sequence = ['b','a']
data = data.reindex(columns=sequence)
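One practical difference between reindex and plain bracket selection shows up when a requested column does not exist. The following is a small sketch of that behavior; the column name 'z' is hypothetical and deliberately absent from the frame.

```python
import pandas as pd

data = pd.DataFrame({'a': [2, 2], 'b': [1, 1]}, columns=['a', 'b'])

# reindex reorders existing columns...
reordered = data.reindex(columns=['b', 'a'])

# ...and creates an all-NaN column for a label that is missing ('z' is hypothetical).
with_extra = data.reindex(columns=['b', 'a', 'z'])

# Bracket selection with a missing label raises KeyError instead.
try:
    data[['b', 'a', 'z']]
    raised = False
except KeyError:
    raised = True

print(list(reordered.columns), list(with_extra.columns), raised)
```

If a silently added empty column would hide a typo in your column list, bracket selection may be the safer choice.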
Answer by Emre Tatbak
The simplest way is to first put the columns in the same order, then concatenate:
df2 = df2[df1.columns]
df = pd.concat((df1, df2), axis=0)