Original question: http://stackoverflow.com/questions/33424503/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Pandas usecols all except last
Asked by Leb
I have a csv file. Is it possible to have usecols take all columns except the last one when using read_csv, without listing every column needed?
For example, if I have a 13-column file, I can do usecols=[0,1,...,10,11], but usecols=[:-1] gives me a syntax error.
Is there another alternative? I'm using pandas 0.17
Accepted answer by EdChum
You can just read a single line using nrows=1 to get the cols, and then re-read the full csv, skipping the last col by slicing the column array from the first read:
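import pandas as pd
cols = pd.read_csv(file, nrows=1).columns  # first read: one row, just to get the column names
df = pd.read_csv(file, usecols=cols[:-1])  # second read: every column except the last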
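For anyone who wants to try this without a file on disk, here is a minimal self-contained sketch of the same two-pass idea. The sample data and the name csv_text are made up for illustration, and the sliced columns are wrapped in list(...) only to pass a plain list of names to usecols.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for the csv file from the question.
csv_text = "a,b,c,d\n1,2,3,4\n5,6,7,8\n"

# First pass: read a single row, only to learn the column names.
cols = pd.read_csv(io.StringIO(csv_text), nrows=1).columns

# Second pass: read the whole thing, keeping every column except the last.
df = pd.read_csv(io.StringIO(csv_text), usecols=list(cols[:-1]))

print(df.columns.tolist())
```

The first read only parses the header and one data row, so it stays cheap even for a large file.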
Answer by gibbone
Starting from version 0.20, the usecols parameter of read_csv in pandas accepts a callable filter, e.g. a lambda expression. Hence, if you know the names of the columns you want to skip, you can do as follows:
columns_to_skip = ['foo', 'bar']
df = pd.read_csv(file, usecols=lambda x: x not in columns_to_skip)
Here's the documentation reference.
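A small runnable sketch of the callable filter, assuming pandas 0.20 or later. The sample data and the names csv_text, last_col and df_no_last are made up for illustration. The second half shows one possible way to apply the same filter to the original question (dropping the last column) when its name is not known in advance, by reading the header first.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for the csv file.
csv_text = "foo,bar,baz,qux\n1,2,3,4\n5,6,7,8\n"

# Skip columns by name: the callable is evaluated against each column name.
columns_to_skip = ["foo", "bar"]
df = pd.read_csv(io.StringIO(csv_text),
                 usecols=lambda name: name not in columns_to_skip)

# Skip the last column without knowing its name up front:
# read only the header (nrows=0 parses no data rows), then filter on it.
last_col = pd.read_csv(io.StringIO(csv_text), nrows=0).columns[-1]
df_no_last = pd.read_csv(io.StringIO(csv_text),
                         usecols=lambda name: name != last_col)

print(df.columns.tolist())
print(df_no_last.columns.tolist())
```

Note that filtering on the name of the last column assumes the column names in the header are unique.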