Preserving Column Order - Python Pandas and Column Concat
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors on StackOverflow.
Original: http://stackoverflow.com/questions/32533944/
Asked by Jibril
So my google-fu doesn't seem to be doing me justice on what seems like it should be a trivial procedure.
In Pandas for Python I have 2 datasets that I want to merge. This works fine using .concat. The issue is that .concat reorders my columns. From a data retrieval point of view, this is trivial. From an "I just want to open the file and quickly see the most important column" point of view, this is annoying.
File1.csv
Name   Username    Alias1
Tom    Tomfoolery  TJZ
Meryl  MsMeryl     Mer
Timmy  Midsize     Yoda
File2.csv
Name  Username    Alias1  Alias2
Bob   Firedbob    Fire    Gingy
Tom   Tomfoolery  TJZ     Awww
Result.csv
   Alias1  Alias2  Name   Username
0  TJZ     NaN     Tom    Tomfoolery
1  Mer     NaN     Meryl  MsMeryl
2  Yoda    NaN     Timmy  Midsize
0  Fire    Gingy   Bob    Firedbob
1  TJZ     Awww    Tom    Tomfoolery
The result is fine, but the data file I'm working with has 1,000 columns, and the 2-3 most important ones are now in the middle. Is there a way, in this toy example, I could've forced "Username" to be the first column and "Name" to be the second column, while preserving the values below each all the way down?
Also, as a side note, when I save to file it also saves that numbering on the side (0 1 2 0 1). If there's a way to prevent that too, that'd be cool. If not, it's not a big deal since it's a quick fix to remove.
Thanks!
Accepted answer by YS-L
Assuming the concatenated DataFrame is df, you can perform the reordering of columns as follows:
important = ['Username', 'Name']
reordered = important + [c for c in df.columns if c not in important]
df = df[reordered]
print(df)
Output:
   Username    Name   Alias1  Alias2
0  Tomfoolery  Tom    TJZ     NaN
1  MsMeryl     Meryl  Mer     NaN
2  Midsize     Timmy  Yoda    NaN
0  Firedbob    Bob    Fire    Gingy
1  Tomfoolery  Tom    TJZ     Awww
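For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the toy data from the question in memory, concatenates it, and applies the same reordering. The frame names `file1` and `file2` and the helper `move_to_front` are made up for illustration and are not part of pandas. Depending on your pandas version, `concat` may already keep the columns in order of first appearance rather than sorting them, but the reordering step gives the same result either way.

```python
import pandas as pd

# Toy stand-ins for File1.csv and File2.csv from the question.
file1 = pd.DataFrame({
    "Name": ["Tom", "Meryl", "Timmy"],
    "Username": ["Tomfoolery", "MsMeryl", "Midsize"],
    "Alias1": ["TJZ", "Mer", "Yoda"],
})
file2 = pd.DataFrame({
    "Name": ["Bob", "Tom"],
    "Username": ["Firedbob", "Tomfoolery"],
    "Alias1": ["Fire", "TJZ"],
    "Alias2": ["Gingy", "Awww"],
})

# Stack the rows; columns missing from one frame are filled with NaN.
df = pd.concat([file1, file2])


def move_to_front(frame, important):
    """Hypothetical helper: return frame with `important` columns first."""
    rest = [c for c in frame.columns if c not in important]
    return frame[important + rest]


df = move_to_front(df, ["Username", "Name"])
print(df)
```

The remaining columns keep whatever relative order `concat` gave them, so with 1,000 columns only the handful you name get moved.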
The list of numbers [0, 1, 2, 0, 1] is the index of the DataFrame. To prevent it from being written to the output file, you can use the index=False option in to_csv():
df.to_csv('Result.csv', index=False, sep=' ')
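To see what index=False changes without touching the disk, the sketch below writes a small, made-up two-row frame to in-memory buffers, once with the index and once without. The buffer names are arbitrary, and the sample data is a cut-down version of the toy example, not the asker's real file.

```python
import io

import pandas as pd

# Small illustrative frame with a repeated index, as you get after concat.
df = pd.DataFrame(
    {"Username": ["Tomfoolery", "Firedbob"], "Name": ["Tom", "Bob"]},
    index=[0, 0],
)

# Default behaviour: the index is written as an extra leading column.
with_index = io.StringIO()
df.to_csv(with_index, sep=" ")

# index=False drops that leading column from the output.
without_index = io.StringIO()
df.to_csv(without_index, index=False, sep=" ")

print(with_index.getvalue())
print(without_index.getvalue())
```

If you would rather keep an index but have it run 0, 1, 2, ... without repeats, passing ignore_index=True to pd.concat is another option.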

