Keeping columns in the specified order when using usecols in pandas read_csv
Original question: http://stackoverflow.com/questions/40024406/
Note: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me): Stack Overflow
Asked by AButkov
I have a csv file with 50 columns of data. I am using Pandas read_csv function to pull in a subset of these columns, using the usecols parameter to choose the ones I want:
import pandas as pd

cols_to_use = [0, 1, 5, 16, 8]  # column positions, in the order I want them
df_ret = pd.read_csv(filepath, index_col=False, usecols=cols_to_use)
The trouble is df_ret contains the correct columns, but not in the order I specified. They are in ascending order, so [0,1,5,8,16]. (By the way the column numbers can change from run to run, this is just an example.) This is a problem because the rest of the code has arrays which are in the "correct" order and I would rather not have to reorder all of them.
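A minimal reproduction of the behavior described above. The in-memory CSV, the use of header=None, and the variable names are assumptions for illustration, not part of the original question. The pandas documentation for usecols notes that element order is ignored, which matches what this shows:

```python
import io
import pandas as pd

# Hypothetical 6-column CSV without a header row
csv_text = "10,11,12,13,14,15\n20,21,22,23,24,25\n"
cols_to_use = [0, 1, 5, 4, 2]  # requested order, deliberately not ascending

df_demo = pd.read_csv(io.StringIO(csv_text), header=None, usecols=cols_to_use)

# The columns come back in file order, not in the order given to usecols
returned_order = list(df_demo.columns)
print(returned_order)
```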
Is there any clever pandas way of pulling in the columns in the order specified? Any help would be much appreciated!
Accepted answer by MaxU
You can reuse the same cols_to_use list to select the columns in the desired order:
df_ret = pd.read_csv(filepath, index_col=False, usecols=cols_to_use)[cols_to_use]
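One caveat worth checking: the trailing [cols_to_use] selects by column label. That matches when cols_to_use holds column names, or when the file is read with header=None so the labels are the integer positions. If the file has a header row and cols_to_use holds integer positions, the labels are the header strings and the lookup would be expected to fail with a KeyError. A sketch that reorders by position instead, assuming a small made-up CSV with a header row and no repeated positions in cols_to_use:

```python
import io
import pandas as pd

# Hypothetical CSV with a header row
csv_text = "a,b,c,d,e\n1,2,3,4,5\n6,7,8,9,10\n"
cols_to_use = [0, 3, 1]  # desired order, as positions in the file

df = pd.read_csv(io.StringIO(csv_text), index_col=False, usecols=cols_to_use)

# read_csv returns the selected columns in ascending file order,
# so find where each requested position landed in the result
sorted_positions = sorted(cols_to_use)
order = [sorted_positions.index(p) for p in cols_to_use]

df_ret = df.iloc[:, order]
print(list(df_ret.columns))
```

Because the reordering uses iloc, it does not depend on what the header strings are, which may help when the column numbers change from run to run.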
Answer by PeptideWitch
Just piggybacking off this question here (hi from 2018).
I ran into the same problem with pandas read_csv and wanted a way to do the reordering step using column header strings. It's as simple as defining a list of the header strings to use (index_strings here) and indexing the result with it:
pd.read_csv(filepath, index_col=False, usecols=cols_to_use)[index_strings]
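For illustration, a self-contained sketch of the header-string variant. The CSV content and column names are made up, and here index_strings is passed both to usecols and to the final selection, which is one possible way to wire it up rather than necessarily what the answer's author did:

```python
import io
import pandas as pd

# Hypothetical CSV with named columns
csv_text = "id,name,score,city\n1,Ann,90,Oslo\n2,Bob,85,Rome\n"
index_strings = ["city", "id", "score"]  # desired columns, in the desired order

# usecols limits what is parsed; the trailing [index_strings] fixes the order
df_named = pd.read_csv(io.StringIO(csv_text), index_col=False,
                       usecols=index_strings)[index_strings]
print(list(df_named.columns))
```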