Original question: http://stackoverflow.com/questions/21005059/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
Stack Overflow
Solving error "delimiter must be a 1-character string" while writing a dataframe to a csv file
Asked by Julia
Using the question "Pandas writing dataframe to CSV file" as a model, I wrote the following code to make a csv file:
df.to_csv('/Users/Lab/Desktop/filteredwithheading.txt', sep='\s+', header=True)
But it returns the following error:
TypeError: "delimiter" must be an 1-character string
I have looked up the documentation for this at http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_csv.html but I can't figure out what I am missing, or what that error means. I also tried using (sep='\s') in the code, but got the same error.
Answered by binarysubstrate
Note that although the solution to this error was to use a single-character string instead of a regex, pandas also raises this error when from __future__ import unicode_literals is in effect, even with a valid Unicode separator. As of 2015-11-16 (release 0.16.2), this is still a known bug in pandas:
"to_csv chokes if not passed sep as a string, even when encoding is set to unicode" #6035
For example, where df is a pandas DataFrame:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import pandas as pd
df.to_csv(pdb_seq_fp, sep='\t', encoding='utf-8')
TypeError: "delimiter" must be an 1-character string
Using a byte literal with the specified encoding (UTF-8 by default in Python 3; here declared with -*- coding: utf-8 -*-) resolves this in pandas 0.16.2: (b'\t'). I haven't tested with previous versions or 0.17.0.
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import pandas as pd
df.to_csv(pdb_seq_fp, sep=b'\t', encoding='utf-8')
(Note that with versions 0.13.0 - ???, it was necessary to use from pandas.compat import u; but by 0.16.2 the byte literal is the way to go.)
Answered by Mohamed Ali JAMAOUI
As mentioned in the issue discussion, this is not considered a pandas issue but rather a compatibility issue between Python's csv module and Python 2.x.
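The one-character rule can be seen in Python's csv module on its own, without pandas. The following is a minimal sketch for Python 3; the helper name try_delimiter is made up for illustration and is not part of any library. It shows that csv.writer accepts a single-character delimiter and rejects a longer string such as r'\s+' with a TypeError.

```python
import csv
import io


def try_delimiter(delimiter):
    """Write one row with the given delimiter.

    Hypothetical helper for illustration only: returns the written text,
    or a string describing the TypeError raised by the csv module.
    """
    buffer = io.StringIO()
    try:
        # The delimiter is validated when the writer is created.
        writer = csv.writer(buffer, delimiter=delimiter)
        writer.writerow(["a", "A"])
    except TypeError as exc:
        return "TypeError: {}".format(exc)
    return buffer.getvalue()


# A single character is accepted.
print(repr(try_delimiter("\t")))

# A multi-character string is rejected by the csv module itself.
print(try_delimiter(r"\s+"))
```

The exact wording of the error message may differ between Python versions, so the sketch only relies on the exception type.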
The workaround for the Python 2 issue is to wrap the separator in str(...). For example, here is how you can reproduce the problem, and then solve it:
from __future__ import unicode_literals
import pandas as pd
df = pd.DataFrame([['a', 'A'], ['b', 'B']])
df.to_csv(sep=',')
This will raise the following error:
TypeError ....
----> 1 df.to_csv(sep=',')
TypeError: "delimiter" must be an 1-character string
The following, however, produces the expected result:
from __future__ import unicode_literals
import pandas as pd
df = pd.DataFrame([['a', 'A'], ['b', 'B']])
df.to_csv(sep=str(','))
Output:
',0,1\n0,a,A\n1,b,B\n'
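In your case, note that '\s+' is a regular expression and is longer than one character, so wrapping it in str(...) alone will not help. Pass a single-character separator instead, for example a tab:
df.to_csv('/Users/Lab/Desktop/filteredwithheading.txt', sep=str('\t'), header=True)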
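On Python 3, string literals are already text, so the str(...) wrapper should not be needed and only the one-character rule matters. Here is a minimal sketch, assuming a recent pandas on Python 3 and using a small made-up DataFrame in place of the asker's data:

```python
import pandas as pd

# Made-up stand-in for the asker's DataFrame.
df = pd.DataFrame([["a", "A"], ["b", "B"]])

# With no path argument, to_csv returns the CSV text as a string.
# Each separator below is exactly one character long.
space_separated = df.to_csv(sep=" ", header=True)
tab_separated = df.to_csv(sep="\t", header=True)

print(space_separated)
print(tab_separated)
```

To write to a file instead, pass the path as the first argument, as in the original question.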

