pandas: unable to remove the header when using pd.read_csv

Question

Asked by Tiberius

I have a .csv that contains column headers and is displayed below. I need to suppress the column labeling when I ingest the file as a data frame.

date,color,id,zip,weight,height,locale
11/25/2013,Blue,122468,1417464,3546600,254,7

When I issue the following command:

 df = pd.read_csv('c:/temp1/test_csv.csv', usecols=[4,5], names = ["zip","weight"], header = 0, nrows=10)

I get:

zip               weight
0   1417464       3546600

I have tried various manipulations of header=True and header=0. If I don't use header=0, then the columns will all print out on top of the rows like so:

    zip           weight
    height        locale
0   1417464       3546600

I have tried skiprows=0 and skiprows=1, but neither removes the headers, although the command does skip the specified line.

I could really use some additional insight or a solve. Thanks in advance for any assistance you could provide.

Tiberius

Answer 1

Answered by jrovegno

Using the example of @jezrael, if you want to skip the header and suppress the column labeling:

import io
import pandas as pd

temp = """date,color,id,zip,weight,height,locale
11/25/2013,Blue,122468,1417464,3546600,254,7"""
# after testing, replace io.StringIO(temp) with the filename
# header=None stops pandas from reading names; skiprows=1 drops the header line
df = pd.read_csv(io.StringIO(temp), usecols=[4,5], header=None, skiprows=1)
print(df)
         4    5
0  3546600  254
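
A DataFrame always carries column labels, so header=None with skiprows=1 replaces the names from the file with integer positions rather than removing labels altogether. The following is a minimal sketch, assuming the goal is to work with the bare values or to print them without any labels; the variable names (csv_text, labels, values, text) are illustrative and not from the original answer:

```python
import io
import pandas as pd

csv_text = """date,color,id,zip,weight,height,locale
11/25/2013,Blue,122468,1417464,3546600,254,7"""

# Skip the header line and let pandas assign integer column labels.
df = pd.read_csv(io.StringIO(csv_text), usecols=[4, 5], header=None, skiprows=1)

# The labels are now the original column positions, not names from the file.
labels = list(df.columns)

# Bare values without any labels, as a NumPy array.
values = df.to_numpy()

# Text rendering with neither the column labels nor the row index.
text = df.to_string(header=False, index=False)
print(text)
```

If the labels only matter at display or export time, it may be simpler to keep the real names while reading and drop them at output instead.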

Answer 2

Answered by π?δα? ?κ??

I'm not sure I entirely understand why you want to remove the headers, but you could comment out the header line as follows, as long as the character 'd' appears nowhere else in the file. pandas ignores a line that begins with the comment character, and it also drops the remainder of any other line from the first occurrence of that character onward:

>>> df = pd.read_csv('test.csv', usecols=[3,4], header=None, comment='d')  # comments out lines beginning with 'date,color' . . .
>>> df
         3        4
0  1417464  3546600

It would be better to comment out the line in the csv file with the crosshatch character (#) and then use the same approach (again, as long as you have not commented out any other lines with a crosshatch):

>>> df = pd.read_csv('test.csv', usecols=[3,4], header=None, comment='#')   # comments out lines with #
>>> df
         3        4
0  1417464  3546600
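
To see why the choice of comment character matters, here is a small sketch with made-up data (not the asker's file) in which a 'd' also occurs inside a value. It assumes the default parser settings; the names csv_text, first_color, and second_color are hypothetical:

```python
import io
import pandas as pd

# Hypothetical data: the header starts with 'd', and one value ("red") contains a 'd'.
csv_text = """date,color
1,red
2,blue"""

# The header line begins with the comment character, so it is ignored entirely.
# In the other lines, everything from the first 'd' onward is dropped as well.
df = pd.read_csv(io.StringIO(csv_text), header=None, comment="d")

first_color = df.iloc[0, 1]   # value read from the line "1,red"
second_color = df.iloc[1, 1]  # value read from the line "2,blue"
print(df)
```

The value on the first data line comes back truncated, which is why a character such as '#' that does not occur in the data is the safer choice.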

Answer 3

Answered by jezrael

I think you are right.

So you can change the column names to a and b:

import io
import pandas as pd

temp = """date,color,id,zip,weight,height,locale
11/25/2013,Blue,122468,1417464,3546600,254,7"""
# after testing, replace io.StringIO(temp) with the filename
# header=0 marks the first line as the header so that names can replace it
df = pd.read_csv(io.StringIO(temp), usecols=[4,5], names=["a","b"], header=0, nrows=10)
print(df)
         a    b
0  3546600  254

Now these columns have new names instead of weight and height, which is what you get without names:

df = pd.read_csv(io.StringIO(temp), usecols=[4,5], header=0, nrows=10)
print(df)
    weight  height
0  3546600     254

You can check the docs for read_csv:

header: int, list of ints, default 'infer'
Row number(s) to use as the column names, and the start of the data. Defaults to 0 if no names passed, otherwise None. Explicitly pass header=0 to be able to replace existing names. The header can be a list of integers that specify row locations for a multi-index on the columns, e.g. [0,1,3]. Intervening rows that are not specified will be skipped (e.g. 2 in this example are skipped). Note that this parameter ignores commented lines and empty lines if skip_blank_lines=True, so header=0 denotes the first line of data rather than the first line of the file.
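
The effect of that default can be reproduced with the sample data from the question. This is a rough sketch rather than a definitive recipe; the variable names df_names_only and df_replaced are made up for illustration:

```python
import io
import pandas as pd

temp = """date,color,id,zip,weight,height,locale
11/25/2013,Blue,122468,1417464,3546600,254,7"""

# names without header: header is inferred as None, so the header line of
# the file is read as an ordinary data row.
df_names_only = pd.read_csv(io.StringIO(temp), usecols=[4, 5], names=["a", "b"])

# names with header=0: the first line is consumed as the header and its
# labels are replaced by the supplied names.
df_replaced = pd.read_csv(io.StringIO(temp), usecols=[4, 5], names=["a", "b"], header=0)

print(df_names_only)
print(df_replaced)
```

The first frame has two rows, with the file's own labels appearing as data, which matches the doubled header described in the question; the second has only the data row.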
