Why is the object I get from reading a csv file with Pandas a TextFileReader object?

Question

Asked by Long Ye

I read a csv file using pandas:

data_raw = pd.read_csv(filename, chunksize=chunksize)
print(data_raw['id'])

Then, it reports TypeError:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'TextFileReader' object has no attribute '__getitem__'

How can I resolve this problem, and how can I convert data_raw into a DataFrame object? I am using Python 2.7 and pandas v0.19.1.

Answer 1

Answered by DYZ

When you pass the chunksize option to read_csv(), it returns a TextFileReader: an open-file-like object that can be used to read the original file in chunks. It is not a DataFrame, which is why indexing it with data_raw['id'] raises the TypeError. For a usage example, see "How to read a 6 GB csv file with pandas". When this option is not provided, the function reads the whole file and returns a DataFrame.
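As a minimal sketch of how the chunked reader might be used: the in-memory CSV text, its column names, and the chunk size below are made up for illustration and stand in for the real file. Each item the reader yields is an ordinary DataFrame, and the chunks can be combined with pd.concat() if a single DataFrame is needed.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the real csv file.
csv_text = "id,value\n1,10\n2,20\n3,30\n4,40\n5,50\n"

# With chunksize set, read_csv returns a TextFileReader, not a DataFrame.
reader = pd.read_csv(io.StringIO(csv_text), chunksize=2)

# Iterating over the reader yields one DataFrame per chunk.
chunks = []
for chunk in reader:
    chunks.append(chunk)

# Combine the chunks into a single DataFrame.
data = pd.concat(chunks, ignore_index=True)

# Column access now works because data is a DataFrame.
print(data['id'])
```

Note that concatenating every chunk loads the whole file into memory, so for a file too large to fit it is probably better to process each chunk inside the loop and keep only the results.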

Answer 2

Answered by Mikhail

One way around this problem is to set the nrows parameter of pd.read_csv() instead of chunksize. That way you select the subset of rows you want to load, and the function returns a DataFrame. The drawback, of course, is that you won't be able to see or work with the full dataset. Code example:

data = pd.read_csv(filename, nrows=100000)
