Pandas read_csv quotes issue
Note: this page reproduces a popular Stack Overflow question. It is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors on Stack Overflow (not to this page).
Original question: http://stackoverflow.com/questions/37589795/
Asked by A. Jameel
I have a file that looks like:
'colA'|'colB'
'word"A'|'A'
'word'B'|'B'
I want to use pd.read_csv('input.csv', sep='|', quotechar="'") but I get the following output:
colA colB
word"A A
wordB' B
The last row is not correct; it should be word'B B. How do I get around this? I have tried various iterations, but none of them reads both rows correctly. I need some CSV-reading expertise!
Answered by jezrael
I think you need str.strip with apply:
import pandas as pd
import io
temp=u"""'colA'|'colB'
'word"A'|'A'
'word'B'|'B'"""
# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp), sep='|')
df = df.apply(lambda x: x.str.strip("'"))
df.columns = df.columns.str.strip("'")
print (df)
colA colB
0 word"A A
1 word'B B
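A variation on the same idea, as a minimal sketch rather than part of the original answer: it assumes every field in the file is wrapped in single quotes and that the `|` separator never appears inside a field. The helper name `read_single_quoted` is hypothetical. It passes `quoting=csv.QUOTE_NONE` so the parser treats every quote as ordinary text, then removes exactly one quote from each end of each value, so a value that itself begins or ends with a quote would not lose more than the wrapper.

```python
import csv
import io

import pandas as pd

# Sample data copied from the question.
raw = """'colA'|'colB'
'word"A'|'A'
'word'B'|'B'"""


def read_single_quoted(source):
    """Hypothetical helper: read '|'-separated data whose fields are all wrapped in single quotes."""
    # QUOTE_NONE turns off quote handling, so quotes arrive as plain text.
    # dtype=str keeps every column as text so the .str accessor is available.
    df = pd.read_csv(source, sep="|", quoting=csv.QUOTE_NONE, dtype=str)
    # Match one quote at the start or one quote at the end of a value.
    pattern = r"^'|'$"
    df = df.apply(lambda col: col.str.replace(pattern, "", regex=True))
    df.columns = df.columns.str.replace(pattern, "", regex=True)
    return df


df = read_single_quoted(io.StringIO(raw))
print(df)
```

Because `dtype=str` is used, numeric columns would also come back as text and would need converting afterwards, for example with `pd.to_numeric`.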
Answered by Yaron
The source of the problem is that ' is used both as the quote character and as a regular character inside a field.
You can escape it, e.g.:
'colA'|'colB'
'word"A'|'A'
'word/'B'|'B'
And then use escapechar:
>>> pd.read_csv('input.csv',sep='|',quotechar="'",escapechar="/")
colA colB
0 word"A A
1 word'B B
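If the file cannot be edited by hand, the escaping step above could be automated before parsing. The following is only a sketch under stated assumptions: every field is wrapped in single quotes, neither `|` nor `/` occurs inside a field, and the file fits in memory. The function name `escape_inner_quotes` is hypothetical. It prefixes `/` to any quote that is not next to a separator or a line boundary, then parses the result with `escapechar="/"` as shown above.

```python
import io
import re

import pandas as pd

# Sample data copied from the question (inner quote not yet escaped).
raw = "'colA'|'colB'\n'word\"A'|'A'\n'word'B'|'B'\n"


def escape_inner_quotes(text, escapechar="/"):
    """Hypothetical helper: escape quotes that sit strictly inside a field."""
    # A wrapping quote touches a separator, a line break, or the start/end of
    # the text on one side; an inner quote has ordinary characters on both sides.
    inner_quote = r"(?<=[^|\r\n])'(?=[^|\r\n])"
    return re.sub(inner_quote, lambda m: escapechar + "'", text)


escaped = escape_inner_quotes(raw)
df = pd.read_csv(io.StringIO(escaped), sep="|", quotechar="'", escapechar="/")
print(df)
```

For a real file, the text would first be read with `open('input.csv').read()` in place of the `raw` string.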
You can also use quoting=csv.QUOTE_ALL, but the output will include the quote characters:
>>> import pandas as pd
>>> import csv
>>> pd.read_csv('input.csv',sep='|',quoting=csv.QUOTE_ALL)
'colA' 'colB'
0 'word"A' 'A'
1 'word'B' 'B'