Original source: http://stackoverflow.com/questions/45424414/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Pandas convert dataframe to Utf-8
Asked by Chris Johnson
I have a df that consists of 100 rows and 24 columns. The column type is string. I get the following error when I try to append the data frame to KDB:
UnicodeEncodeError: 'ascii' codec can't encode character '\xd3' in position 9: ordinal not in range(128)
Here is an example of the first row in my df.
                       AnnouncementDate  AuctionDate   BBT  \
_id
00000067  2012-12-11T00:00:00.000+00:00          NaN  FHLB

          CouponDividendRate  DaysToSettle  \
_id
00000067                0.61             1

                 Description  \
_id
00000067  FHLB 0.61 12/28/16

                    FirstSettlementDate           ISN  IsAgency  IsWhenIssued  \
_id
00000067  2012-12-28T00:00:00.000+00:00  US313381K796      True         False

          ...  OnTheRunTreasury  OperationalIndicator  \
_id       ...
00000067  ...               NaN                 False

          OriginalAmountOfPrincipal  OriginalMaturityDate  \
_id
00000067                 13000000.0                   NaN

          PrincipalAmountOutstanding       SCSP      SMCP  \
_id
00000067                         0.0  313381K79  76000000

          SecurityTypeLevel1  SecurityTypeLevel2  TCK
_id
00000067         US-DOMESTIC                 NaN  NaN
My question is: is there an easy way to convert my df to utf-8 format?
Possibly something like df = df.encode('utf-8')
Thanks
Answered by Ricky McMaster
It depends on how you're outputting the data. If you're simply writing csv files, which you then import to KDB, you can specify the encoding when you write the file:
df.to_csv('df_output.csv', encoding='utf-8')
Or, you can set the encoding when you first read the data into pandas, using the same encoding argument (for example, in read_csv).
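As a rough illustration of the two options above, the sketch below writes a small frame to a UTF-8 csv file and reads it back with the same encoding argument. The sample rows, the temporary file location, and the use of dtype=str are assumptions made for this example; they are not taken from the original question.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data: the second description contains 'Ó' (U+00D3),
# the character named in the UnicodeEncodeError above.
df = pd.DataFrame(
    {
        "_id": ["00000067", "00000068"],
        "Description": ["FHLB 0.61 12/28/16", "BONO \u00d3PTIMO"],
    }
)

# Write the frame to disk as UTF-8 encoded text (temporary path, for illustration).
out_path = os.path.join(tempfile.mkdtemp(), "df_output.csv")
df.to_csv(out_path, index=False, encoding="utf-8")

# Look at the raw bytes: in UTF-8, 'Ó' is stored as the two bytes C3 93.
with open(out_path, "rb") as fh:
    raw = fh.read()

# Read the file back with the same encoding. dtype=str keeps values such as
# "00000067" as text instead of parsing them as integers.
df_back = pd.read_csv(out_path, encoding="utf-8", dtype=str)
print(df_back)
```

Whether this is enough depends on how the file is then loaded into KDB; the loader on that side has to read the file as UTF-8 as well.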
If you're connecting directly to KDB using SQLAlchemy or something similar, you should try specifying the encoding in the connection itself. See this question: "Another UnicodeEncodeError when using pandas method to_sql with MySQL".