Original question: http://stackoverflow.com/questions/41844485/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Why is the object I get from reading a csv file with pandas a TextFileReader object?
Asked by Long Ye
I read a csv file using pandas:
data_raw = pd.read_csv(filename, chunksize=chunksize)
print(data_raw['id'])
Then, it reports TypeError:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'TextFileReader' object has no attribute '__getitem__'
What can I do to resolve the problem? And how can I convert data_raw into a DataFrame object? I use Python 2.7 and pandas v0.19.1.
Answered by DYZ
When you pass the chunksize option to read_csv(), it returns a TextFileReader, an open-file-like object that can be used to read the original file in chunks. It is not a DataFrame, which is why indexing it with data_raw['id'] raises the TypeError. See the usage example in the Stack Overflow question "How to read a 6 GB csv file with pandas". When this option is not provided, the function reads the whole file content and returns a DataFrame.
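As a minimal sketch of the point above: iterating over the reader yields ordinary DataFrame chunks, and pd.concat can join them back into a single DataFrame. The in-memory CSV text and its column names are made up for illustration and stand in for the asker's file; the sketch also assumes Python 3 and a recent pandas rather than the Python 2.7 setup in the question.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the real file on disk.
csv_text = "id,value\n1,10\n2,20\n3,30\n4,40\n5,50\n"

# With chunksize set, read_csv returns a TextFileReader, not a DataFrame.
reader = pd.read_csv(io.StringIO(csv_text), chunksize=2)

chunks = []
for chunk in reader:
    # Each chunk is a regular DataFrame holding at most 2 rows here.
    chunks.append(chunk)

# Join the chunks into one DataFrame; column access now works as expected.
data = pd.concat(chunks, ignore_index=True)
ids = data['id'].tolist()
```

Concatenating every chunk loads the whole file into memory again, so this only helps when the full data fits in memory; otherwise each chunk can be processed inside the loop.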
Answered by Mikhail
One way around this problem is to set the nrows parameter in pd.read_csv() instead of chunksize. That way you load only the first nrows rows of the file, and the result is a regular DataFrame. Of course, the drawback is that you won't be able to see and work with the full dataset. Code example:
data = pd.read_csv(filename, nrows=100000)
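If the full dataset is needed but is too large to load at once, one possible alternative to nrows is to keep chunksize and reduce each chunk as it is read. The sketch below is only an illustration under assumptions: the CSV text and the value column are hypothetical, and it targets Python 3 with a recent pandas.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a large file.
csv_text = "id,value\n1,10\n2,20\n3,30\n4,40\n5,50\n"

total = 0
row_count = 0
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=2):
    # Only one chunk is held in memory at a time.
    total += int(chunk['value'].sum())
    row_count += len(chunk)

# Mean over all rows, computed without building the full DataFrame.
mean_value = total / row_count
```

This works for reductions that can be accumulated chunk by chunk, such as sums and counts; operations that need all rows at once still require the full DataFrame.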