pandas: escaping carriage returns in to_csv

Question

Asked by Kamil Sindi

I have a string column that sometimes has carriage returns in the string:

import pandas as pd
from io import StringIO

datastring = StringIO("""\
country  metric           2011   2012
USA      GDP              7      4
USA      Pop.             2      3
GB       GDP              8      7
""")
df = pd.read_table(datastring, sep=r'\s\s+', engine='python')  # regex separator: raw string, python engine
df.metric = df.metric + '\r'  # append carriage return

print(df)
  country  metric  2011  2012
0     USA   GDP\r     7     4
1     USA  Pop.\r     2     3
2      GB   GDP\r     8     7

When writing to and reading from csv, the dataframe gets corrupted:

df.to_csv('data.csv', index=False)

print(pd.read_csv('data.csv'))
  country metric  2011  2012
0     USA    GDP   NaN   NaN
1     NaN      7     4   NaN
2     USA   Pop.   NaN   NaN
3     NaN      2     3   NaN
4      GB    GDP   NaN   NaN
5     NaN      8     7   NaN

Question

What's the best way to fix this? The one obvious method is to just clean the data first:

df.metric = df.metric.str.replace('\r', '')
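If more than one column may contain carriage returns, the clean-first approach can be applied to every string column at once. The following is a minimal sketch, not part of the original question; the helper name strip_carriage_returns is hypothetical, and the small frame is rebuilt by hand so the block runs on its own.

```python
import pandas as pd


def strip_carriage_returns(frame):
    """Return a copy of frame with '\\r' removed from every string column."""
    cleaned = frame.copy()
    for name in cleaned.columns:
        # Only touch columns pandas reports as string-like; numbers are left alone.
        if pd.api.types.is_string_dtype(cleaned[name]):
            # regex=False treats '\r' as a literal character, not a pattern.
            cleaned[name] = cleaned[name].str.replace("\r", "", regex=False)
    return cleaned


# Hand-built stand-in for the frame from the question.
df = pd.DataFrame({
    "country": ["USA", "USA", "GB"],
    "metric": ["GDP\r", "Pop.\r", "GDP\r"],
    "2011": [7, 2, 8],
})
cleaned = strip_carriage_returns(df)
print(cleaned["metric"].tolist())
```

This discards the carriage returns permanently, so it only fits when they carry no meaning in the data. It also assumes each string column holds only strings; in a column that mixes strings with other objects, the .str accessor may turn the non-string entries into NaN.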

Answer 1

Answered by Mike Müller

Specify the line terminator when reading the file back, so that only '\n' ends a row and a bare '\r' stays inside its field:

print(pd.read_csv('data.csv', lineterminator='\n'))

  country  metric  2011  2012
0     USA   GDP\r     7     4
1     USA  Pop.\r     2     3
2      GB   GDP\r     8     7

UPDATE:

The original answer is from 2015 and spelled the argument line_terminator. In more recent versions of pandas the name is lineterminator, as used above.
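To check the round trip without touching the disk, the same idea can be tried against an in-memory buffer. This is a sketch under two assumptions: the installed pandas accepts lineterminator in read_csv, and to_csv ends rows with '\n' (its default follows the platform, so this holds on Linux and macOS but not necessarily on Windows). The frame is rebuilt by hand rather than parsed from the question's text block.

```python
from io import StringIO

import pandas as pd

# Hand-built stand-in for the frame from the question.
df = pd.DataFrame({
    "country": ["USA", "USA", "GB"],
    "metric": ["GDP\r", "Pop.\r", "GDP\r"],
    "2011": [7, 2, 8],
    "2012": [4, 3, 7],
})

# With no path given, to_csv returns the CSV text as a string.
csv_text = df.to_csv(index=False)

# Treat only '\n' as the end of a row, so '\r' stays part of the field value.
round_trip = pd.read_csv(StringIO(csv_text), lineterminator="\n")

print(round_trip["metric"].tolist() == df["metric"].tolist())
```

Comparing the metric column before and after is a quick way to confirm that no rows were split on the embedded carriage returns.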
