Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Stack Overflow original: http://stackoverflow.com/questions/40024406/

Date: 2020-09-14 02:12:10  Source: igfitidea

Keeping columns in the specified order when using usecols in pandas read_csv

python pandas dataframe

Asked by AButkov

I have a csv file with 50 columns of data. I am using Pandas read_csv function to pull in a subset of these columns, using the usecols parameter to choose the ones I want:

import pandas as pd

cols_to_use = [0, 1, 5, 16, 8]
df_ret = pd.read_csv(filepath, index_col=False, usecols=cols_to_use)

The trouble is df_ret contains the correct columns, but not in the order I specified. They are in ascending order, so [0,1,5,8,16]. (By the way the column numbers can change from run to run, this is just an example.) This is a problem because the rest of the code has arrays which are in the "correct" order and I would rather not have to reorder all of them.

Is there any clever pandas way of pulling in the columns in the order specified? Any help would be much appreciated!

Accepted answer by MaxU

You can reuse the same cols_to_use list to select the columns in the desired order:

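# Works when the resulting column labels match cols_to_use, i.e. cols_to_use
# holds column names, or integer positions with header=None. With a header row
# and integer positions, the labels are header strings and this raises KeyError.
df_ret = pd.read_csv(filepath, index_col=False, usecols=cols_to_use)[cols_to_use]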
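If the file has a header row and cols_to_use holds integer positions (as in the question), the labels of the resulting DataFrame are header strings, so selecting by the integers will not work. Here is a minimal sketch of one way around that, assuming the positions in cols_to_use are unique. The sample data, csv_text, and the order helper list are made up for illustration and are not from the original answer:

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the real 50-column file.
csv_text = "a,b,c,d,e\n1,2,3,4,5\n6,7,8,9,10\n"
cols_to_use = [0, 3, 1]  # desired order, given as integer positions

# read_csv ignores the order of usecols and returns columns in file order.
df = pd.read_csv(io.StringIO(csv_text), index_col=False, usecols=cols_to_use)

# The rank of each requested position among the sorted positions is its
# location in df, so this maps the requested order onto df's columns.
order = [sorted(cols_to_use).index(c) for c in cols_to_use]
df_ret = df.iloc[:, order]

print(list(df_ret.columns))
```

Selecting with iloc keeps everything positional, so it does not matter what the header strings are.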

Answer by PeptideWitch

Just piggybacking off this question here (hi from 2018).

I discovered the same problem with my pandas read_csv and wanted to figure out a way to take the [col_reorder] using column header strings. It's as simple as defining an array of strings to use.

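# index_strings is a list of column header strings in the desired order;
# every name in it must be among the columns selected by usecols.
pd.read_csv(filepath, index_col=False, usecols=cols_to_use)[index_strings]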
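A small self-contained sketch of the header-string variant might look like the following. The sample data and column names are invented for illustration, and the same list is used both for usecols and for the reordering:

```python
import io

import pandas as pd

# Hypothetical sample data; the column names are made up for this example.
csv_text = "a,b,c,d,e\n1,2,3,4,5\n6,7,8,9,10\n"
index_strings = ["d", "a", "b"]  # header strings in the desired order

# usecols picks the subset (in file order); indexing with the same list
# afterwards puts the columns into the requested order.
df_ret = pd.read_csv(
    io.StringIO(csv_text), index_col=False, usecols=index_strings
)[index_strings]

print(list(df_ret.columns))
```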