Pandas escape carriage return in to_csv
Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/34550120/
Asked by Kamil Sindi
I have a string column that sometimes has carriage returns in the string:
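import pandas as pd
from io import StringIO

# Columns are separated by two or more spaces
datastring = StringIO("""\
country  metric  2011  2012
USA      GDP     7     4
USA      Pop.    2     3
GB       GDP     8     7
""")
# A regex separator needs a raw string and the Python parsing engine
df = pd.read_table(datastring, sep=r'\s\s+', engine='python')
df.metric = df.metric + '\r'  # append carriage return
print(df)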
  country  metric  2011  2012
0     USA   GDP\r     7     4
1     USA  Pop.\r     2     3
2      GB   GDP\r     8     7
When writing to and reading from csv, the dataframe gets corrupted:
df.to_csv('data.csv', index=None)
print(pd.read_csv('data.csv'))
  country metric  2011  2012
0     USA    GDP   NaN   NaN
1     NaN      7     4   NaN
2     USA   Pop.   NaN   NaN
3     NaN      2     3   NaN
4      GB    GDP   NaN   NaN
5     NaN      8     7   NaN
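The extra rows appear because, with default settings, read_csv treats a bare carriage return as a line break, so each record is cut in two at the \r. Here is a minimal sketch of that effect. The hand-written CSV text is an assumption standing in for the contents of data.csv, with the \r left unquoted:

```python
from io import StringIO

import pandas as pd

# Hand-written stand-in for data.csv (assumption): each metric value
# ends in an unquoted '\r'
csv_text = (
    "country,metric,2011,2012\n"
    "USA,GDP\r,7,4\n"
    "USA,Pop.\r,2,3\n"
    "GB,GDP\r,8,7\n"
)

# With default settings the parser ends a line at '\r' as well as at '\n'
broken = pd.read_csv(StringIO(csv_text))
print(len(broken))                     # number of parsed rows
print(broken["country"].isna().sum())  # rows that begin with an empty field
```

Each of the three records should come back as two rows, matching the six-row output shown above.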
Question
What's the best way to fix this? The one obvious method is to just clean the data first:
df.metric = df.metric.str.replace('\r', '')
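As a rough check of the cleaning approach, the sketch below strips the carriage returns and then round-trips the frame through CSV. The in-memory buffer and the hand-built frame are illustrative assumptions, used here in place of data.csv and the parsed table:

```python
from io import StringIO

import pandas as pd

# Hand-built copy of the example frame (assumption), with '\r' appended
df = pd.DataFrame({
    "country": ["USA", "USA", "GB"],
    "metric": ["GDP\r", "Pop.\r", "GDP\r"],
    "2011": [7, 2, 8],
    "2012": [4, 3, 7],
})

# Remove every carriage return from the text column (literal match, no regex)
df["metric"] = df["metric"].str.replace("\r", "", regex=False)

# Round-trip through an in-memory buffer instead of a file on disk
buffer = StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
restored = pd.read_csv(buffer)
print(restored)
```

The trade-off is that the carriage returns are lost, which is fine only if they carry no meaning in the data.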
Answered by Mike Müller
Specify the line_terminator (spelled lineterminator in recent pandas versions):
print(pd.read_csv('data.csv', lineterminator='\n'))
  country  metric  2011  2012
0     USA   GDP\r     7     4
1     USA  Pop.\r     2     3
2      GB   GDP\r     8     7
UPDATE:
In more recent versions of pandas (the original answer is from 2015), the name of the argument changed to lineterminator.
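A small self-contained sketch of the answer, using the lineterminator spelling. The hand-written CSV text is an assumption that mimics a file whose \r characters were written unquoted. The option is documented as valid only with the default C parser:

```python
from io import StringIO

import pandas as pd

# Hand-written stand-in for data.csv (assumption): unquoted '\r' in each record
csv_text = (
    "country,metric,2011,2012\n"
    "USA,GDP\r,7,4\n"
    "USA,Pop.\r,2,3\n"
    "GB,GDP\r,8,7\n"
)

# Only '\n' ends a record now, so '\r' stays inside the metric field
df = pd.read_csv(StringIO(csv_text), lineterminator="\n")
print(df.shape)
print(repr(df.loc[0, "metric"]))
```

With this setting the three records stay intact and the carriage returns survive in the metric column.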