Importing a CSV file into a Pandas DataFrame

Question

Asked by user7289

I have a CSV file taken from a SQL dump that looks like the below (first few lines using head file.csv from terminal):

??AANAT,AANAT1576,4
AANAT,AANAT1704,1
AAP,AAP-D-12-00691,8
AAP,AAP-D-12-00834,3

When I use the pd.read_csv('file.csv') command I get an error "ValueError: No columns to parse from file".

Any ideas on how to import the CSV file into a table and avoid the error?

ELABORATION OF QUESTION (following Ed's comment)

I have tried header = None, skiprows=1 to avoid the ?? (which appear when using the head command from the terminal).

The file path to the extract is http://goo.gl/jyYlIK

Answer 1

Answered by EdChum

So the ?? characters you see are in fact non-printable characters. Looking at your raw CSV file in a hex editor shows that they are the bytes FF FE, which is the Byte-Order-Mark (BOM) for UTF-16 little endian.
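
If no hex editor is at hand, the same check can be done from Python by looking at the first bytes of the file. The following is only a sketch: detect_bom is a hypothetical helper (not part of pandas), and the sample bytes are built in memory rather than read from the original file.

```python
import codecs


def detect_bom(raw: bytes):
    """Return the codec name implied by a leading BOM, or None if there is none.

    detect_bom is a hypothetical helper written for this illustration.
    """
    # UTF-32 LE is checked before UTF-16 LE because its BOM (FF FE 00 00)
    # begins with the UTF-16 LE BOM (FF FE).
    candidates = [
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF32_LE, "utf-32"),
        (codecs.BOM_UTF32_BE, "utf-32"),
        (codecs.BOM_UTF16_LE, "utf-16"),
        (codecs.BOM_UTF16_BE, "utf-16"),
    ]
    for bom, name in candidates:
        if raw.startswith(bom):
            return name
    return None


# In-memory stand-in for the start of the file: BOM followed by UTF-16 LE text.
sample = codecs.BOM_UTF16_LE + "AANAT,AANAT1576,4\n".encode("utf-16-le")

print(sample[:2])          # the two BOM bytes
print(detect_bom(sample))  # codec name to pass to read_csv
```

For a real file, the first few bytes could be read with open('file.csv', 'rb') and passed to the helper in the same way.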

So all you need to do is pass encoding='utf-16' to read_csv (together with header=None, since the file has no header row) and it reads in fine:

In [46]:

df = pd.read_csv('otherfile.csv', encoding='utf-16', header=None)
df
Out[46]:
       0               1   2
0  AANAT       AANAT1576   4
1  AANAT       AANAT1704   1
2    AAP  AAP-D-12-00691   8
3    AAP  AAP-D-12-00834   3
4    AAP  AAP-D-13-00215  10
5    AAP  AAP-D-13-00270   7
6    AAP  AAP-D-13-00435   5
7    AAP  AAP-D-13-00498   4
8    AAP  AAP-D-13-00530   0
9    AAP  AAP-D-13-00747   3
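
For anyone who wants to reproduce this without downloading the file, here is a minimal self-contained sketch. It writes a few of the rows above to a temporary UTF-16 file with a BOM and reads it back. The file name sample_utf16.csv and the column names are assumptions made up for the illustration; the original file has no header.

```python
import codecs
import os
import tempfile

import pandas as pd

# A few rows from the question, encoded as UTF-16 little endian with a BOM.
rows = "AANAT,AANAT1576,4\nAANAT,AANAT1704,1\nAAP,AAP-D-12-00691,8\n"
raw = codecs.BOM_UTF16_LE + rows.encode("utf-16-le")

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file name, used only for this example.
    path = os.path.join(tmp, "sample_utf16.csv")
    with open(path, "wb") as fh:
        fh.write(raw)

    # encoding='utf-16' consumes the BOM; header=None because there is no
    # header row. The column names are illustrative assumptions.
    df = pd.read_csv(
        path,
        encoding="utf-16",
        header=None,
        names=["journal", "manuscript_id", "count"],
    )

print(df)
```

Passing names is optional; without it the columns are numbered 0, 1, 2 as in the output above.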
