pandas to_csv output quoting issue
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/21147058/
Asked by user3199761
I'm having trouble getting pandas dataframe.to_csv(...) to quote strings correctly in its output.
import pandas as pd
text = 'this is "out text"'
df = pd.DataFrame(index=['1'],columns=['1','2'])
df.loc['1','1']=123
df.loc['1','2']=text
df.to_csv('foo.txt',index=False,header=False)
The output is:
123,"this is ""out text"""
But I would like:
123,this is "out text"
Does anyone know how to get this right?
Accepted answer by DSM
You could pass quoting=csv.QUOTE_NONE, for example:
>>> df.to_csv('foo.txt',index=False,header=False)
>>> !cat foo.txt
123,"this is ""out text"""
>>> import csv
>>> df.to_csv('foo.txt',index=False,header=False, quoting=csv.QUOTE_NONE)
>>> !cat foo.txt
123,this is "out text"
But in my experience it's better to quote more, rather than less.
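The effect of the quoting argument can be compared side by side. The following is a minimal sketch, not part of the original answer: it rebuilds the question's one-row frame (the column names a and b are illustrative) and relies on to_csv returning the CSV text as a string when no path is given.

```python
import csv
import pandas as pd

# One-row frame mirroring the question; the column names are illustrative.
df = pd.DataFrame({"a": [123], "b": ['this is "out text"']})

# With no path argument, to_csv returns the CSV text instead of writing a file.
minimal = df.to_csv(index=False, header=False).strip()
quote_all = df.to_csv(index=False, header=False, quoting=csv.QUOTE_ALL).strip()
quote_none = df.to_csv(index=False, header=False, quoting=csv.QUOTE_NONE).strip()

print(minimal)     # default (QUOTE_MINIMAL): embedded quotes are doubled
print(quote_all)   # every field is wrapped in quotes
print(quote_none)  # fields are written verbatim
```

Only the QUOTE_NONE variant produces the line the question asks for; the other two stay parseable by any standard CSV reader.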
Answer by ericmjl
Instead of writing to 'foo.txt', write to 'foo.csv'. That solved the issue. When the CSV file is opened in Excel, the extra quotation marks are absent.
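The file name does not change what to_csv writes; what plausibly changes is that the file is then opened by a CSV-aware parser, which treats the doubled quotes as escapes. A small sketch of that round trip, using Python's csv module as a stand-in for Excel (an assumption made for illustration, with made-up column names):

```python
import csv
import io
import pandas as pd

df = pd.DataFrame({"a": [123], "b": ['this is "out text"']})

# Raw text as to_csv writes it with default quoting.
raw = df.to_csv(index=False, header=False)
print(raw.strip())

# A CSV parser undoes the quoting and returns the original field value.
rows = list(csv.reader(io.StringIO(raw)))
print(rows[0][1])
```

The doubled quotes exist only in the raw text; after parsing, the field is the original string again.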
Answer by Owen
Note: there is currently a small error in the Pandas to_string documentation. It says:
- quoting : int, Controls whether quotes should be recognized. Values are taken from csv.QUOTE_* values. Acceptable values are 0, 1, 2, and 3 for QUOTE_MINIMAL, QUOTE_ALL, QUOTE_NONE, and QUOTE_NONNUMERIC, respectively.
But this reverses how csv defines the QUOTE_NONE and QUOTE_NONNUMERIC variables.
In [13]: import csv
In [14]: csv.QUOTE_NONE
Out[14]: 3
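To see the whole mapping at once, the four constants can be printed directly from the standard library. This is a short check rather than anything pandas-specific:

```python
import csv

# Collect the integer value behind each quoting constant.
names = ("QUOTE_MINIMAL", "QUOTE_ALL", "QUOTE_NONNUMERIC", "QUOTE_NONE")
constants = {name: int(getattr(csv, name)) for name in names}

for name, value in constants.items():
    print(f"csv.{name} = {value}")
```

Passing the named constant instead of a bare integer avoids depending on this ordering at all.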
Answer by alvas
To use quoting=csv.QUOTE_NONE, you need to set the escapechar, e.g.
# Create a tab-separated file with quotes
$ echo abc$'\t'defg$'\t'$'"xyz"' > in.tsv
$ cat in.tsv
abc defg "xyz"
# Gotcha: the quotes around "xyz" disappear when pandas reads the file
$ python3
>>> import pandas as pd
>>> import csv
>>> df = pd.read_csv("in.tsv", sep="\t")
>>> df
Empty DataFrame
Columns: [abc, defg, xyz]
Index: []
# When reading in pandas, to keep the `"..."` quotes,
# you have to explicitly say there's no `quotechar`
>>> df = pd.read_csv("in.tsv", sep="\t", quotechar='\0')
>>> df
Empty DataFrame
Columns: [abc, defg, "xyz"]
Index: []
# To write it back out without pandas adding quotes, set an escapechar
>>> df.to_csv("out.tsv", sep="\t", quoting=csv.QUOTE_NONE, escapechar="\\")
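The need for escapechar shows up as soon as a field contains the delimiter. A minimal sketch with a comma-separated frame (the data and column names are made up for illustration):

```python
import csv
import pandas as pd

# The string field contains the delimiter itself.
df = pd.DataFrame({"n": [1], "s": ["a,b"]})

# Without an escapechar, QUOTE_NONE has no way to protect the comma.
try:
    df.to_csv(index=False, header=False, quoting=csv.QUOTE_NONE)
    failed = False
except csv.Error as exc:
    failed = True
    print("to_csv failed:", exc)

# With an escapechar, the delimiter inside the field is escaped instead.
escaped = df.to_csv(
    index=False, header=False, quoting=csv.QUOTE_NONE, escapechar="\\"
).strip()
print(escaped)
```

Whatever reads the file back then has to be told about the same escapechar, otherwise the backslash is treated as data.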
Answer by penduDev
To use it without escapechar:
Replace the comma character , (Unicode: U+002C) in your df with a single low-9 quotation mark character ‚ (Unicode: U+201A).
After this, you can simply use:
import csv
df.to_csv('foo.txt', index=False, header=False, quoting=csv.QUOTE_NONE)
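The replacement step itself is not shown in the answer. One way it might be done, assuming the text lives in a single string column (named s here purely for illustration):

```python
import csv
import pandas as pd

df = pd.DataFrame({"n": [1], "s": ['a, "b"']})

# Swap every comma (U+002C) for the look-alike character U+201A.
df["s"] = df["s"].str.replace(",", "\u201a", regex=False)

# No delimiter is left inside the field, so QUOTE_NONE needs no escapechar.
out = df.to_csv(index=False, header=False, quoting=csv.QUOTE_NONE).strip()
print(out)
```

The written data is no longer identical to the original text, so any consumer that needs real commas has to reverse the substitution.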

