pandas: resolving the error '"delimiter" must be an 1-character string' when writing a DataFrame to a CSV file

Question

Asked by Julia

Using this question, "Pandas writing dataframe to CSV file", as a model, I wrote the following code to make a csv file:

df.to_csv('/Users/Lab/Desktop/filteredwithheading.txt', sep='\s+', header=True)

But it returns the following error:

TypeError: "delimiter" must be an 1-character string

I have looked up the documentation for this here http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_csv.html but I can't figure out what I am missing, or what that error means. I also tried using (sep='\s') in the code, but got the same error.

Answer 1

Answered by binarysubstrate

Note that although the solution to this error was to use a single-character string instead of a regex, pandas also raises this error when from __future__ import unicode_literals is in effect, even with valid unicode characters. As of 2015-11-16, release 0.16.2, this is still a known bug in pandas:
"to_csv chokes if not passed sep as a string, even when encoding is set to unicode" #6035

For example, where df is a pandas DataFrame:

# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import pandas as pd

df.to_csv(pdb_seq_fp, sep='\t', encoding='utf-8')

TypeError: "delimiter" must be an 1-character string

Using a byte literal (b'\t') together with the declared source encoding (-*- coding: utf-8 -*-; utf-8 is the default with Python 3) resolves this in pandas 0.16.2. I haven't tested with previous versions or 0.17.0.

# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import pandas as pd

df.to_csv(pdb_seq_fp, sep=b'\t', encoding='utf-8')

(Note that with versions 0.13.0 - ???, it was necessary to use u from pandas.compat, i.e. from pandas.compat import u; but by 0.16.2 the byte literal is the way to go.)
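
Independently of the unicode_literals case, the same kind of message comes from the standard csv module whenever the delimiter is longer than one character, which is what a regex such as '\s+' is. Below is a minimal sketch for Python 3 that uses only the standard library, not pandas itself. The helper name try_delimiter is hypothetical, and the exact wording of the message varies between Python versions.

```python
import csv
import io


def try_delimiter(delimiter):
    """Return None on success, or the message of the TypeError from csv.writer."""
    buffer = io.StringIO()
    try:
        writer = csv.writer(buffer, delimiter=delimiter)
        writer.writerow(["a", "A"])
    except TypeError as exc:
        return str(exc)
    return None


# A regex-like separator is three characters long, so the csv module rejects it.
print(try_delimiter(r"\s+"))
# A single character such as a tab is accepted, so this prints None.
print(try_delimiter("\t"))
```

The first call prints the error text and the second prints None, which suggests the fix for the original question is simply a one-character separator.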

Answer 2

Answered by Mohamed Ali JAMAOUI

As mentioned in the issue discussion (here), this is not considered a pandas issue but rather a compatibility issue between Python's csv module and Python 2.x.

The workaround is to wrap the separator in str(..). For example, here is how you can reproduce the problem and then solve it:

from __future__ import unicode_literals
import pandas as pd 
df = pd.DataFrame([['a', 'A'], ['b', 'B']])
df.to_csv(sep=',')

This will raise the following error:

TypeError ....              
----> 1 df.to_csv(sep=',')
TypeError: "delimiter" must be an 1-character string

The following, however, produces the expected result:

from __future__ import unicode_literals
import pandas as pd 
df = pd.DataFrame([['a', 'A'], ['b', 'B']])
df.to_csv(sep=str(','))

Output:

',0,1\n0,a,A\n1,b,B\n'

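In your case, wrapping the separator in str(..) is not enough on its own: '\s+' is a regular expression three characters long, and to_csv needs a single-character separator. Use one character, such as a tab, for example:

df.to_csv('/Users/Lab/Desktop/filteredwithheading.txt', sep=str('\t'), header=True)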
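
A regex such as '\s+' makes sense when parsing whitespace-separated text, not when writing it, because the writer has to emit one concrete character. Here is a minimal sketch of that asymmetry for Python 3, using the standard csv and re modules rather than pandas. The sample rows are made up for illustration.

```python
import csv
import io
import re

# Made-up sample data: an index column followed by two value columns.
rows = [["0", "a", "A"], ["1", "b", "B"]]

# Writing: the delimiter must be exactly one character (a tab here).
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerows(rows)
text = buffer.getvalue()

# Reading: splitting on the regex \s+ is fine, since parsing may match
# any run of whitespace between fields.
parsed = [re.split(r"\s+", line) for line in text.splitlines()]
print(parsed == rows)
```

The same split applies in pandas: to_csv takes a single-character sep, while read_csv accepts sep='\s+' when reading the file back.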
