Original question: http://stackoverflow.com/questions/34447448/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
StringIO and pandas read_csv
Asked by JohnE
I'm trying to mix StringIO and BytesIO with pandas and struggling with some basic stuff. For example, I can't get "output" below to work, whereas "output2" below does work. But "output" is closer to the real world example I'm trying to do. The way in "output2" is from an old pandas example but not really a useful way for me to do it.
import io  # Python 3; in Python 2, import StringIO instead
import pandas as pd
output = io.StringIO()
output.write('x,y\n')
output.write('1,2\n')
output2 = io.StringIO("""x,y
1,2
""")
They seem to be the same in terms of type and contents:
type(output) == type(output2)
Out[159]: True
output.getvalue() == output2.getvalue()
Out[160]: True
But no, not the same:
output == output2
Out[161]: False
More to the point of the problem I'm trying to solve:
pd.read_csv(output) # ValueError: No columns to parse from file
pd.read_csv(output2) # works fine, same as reading from a file
Answered by DSM
io.StringIO here is behaving just like a file: you wrote to it, and now the file pointer is pointing at the end. When you try to read from it after that, there's nothing after the point you wrote, so: no columns to parse.
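To make the file-pointer behavior concrete, here is a small sketch using only the standard library. The variable names (buf, pos_after_write, tail, full) are illustrative and not from the original post.

```python
import io

buf = io.StringIO()
buf.write('x,y\n')
buf.write('1,2\n')

pos_after_write = buf.tell()  # the position now sits after the written text
tail = buf.read()             # reading from the end yields an empty string

buf.seek(0)                   # rewind to the start of the buffer
full = buf.read()             # now the whole contents come back

print(pos_after_write, repr(tail), repr(full))
```

The empty string from the first read() is what pd.read_csv sees when it is handed the buffer without rewinding.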
Instead, just like you would with an ordinary file, seek to the start, and then read:
>>> output = io.StringIO()
>>> output.write('x,y\n')
4
>>> output.write('1,2\n')
4
>>> output.seek(0)
0
>>> pd.read_csv(output)
   x  y
0  1  2
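If you would rather not move the pointer of the original buffer, one alternative (not part of the answer above, just a sketch) is to build a fresh StringIO from getvalue(), which returns the full contents regardless of the current position. It assumes pandas is installed and imported as pd.

```python
import io
import pandas as pd

output = io.StringIO()
output.write('x,y\n')
output.write('1,2\n')

# getvalue() ignores the current position, so the new buffer
# built from it starts at position 0 and can be parsed directly.
df = pd.read_csv(io.StringIO(output.getvalue()))
print(df)
```

This copies the text, so for large buffers seek(0) on the original object is the cheaper choice.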