Original question: http://stackoverflow.com/questions/38114654/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
pandas read_csv column dtype is set to decimal but converts to string
Asked by candleford
I am using pandas (v0.18.1) to import the following data from a file called 'test.csv':
a,b,c,d
1,1,1,1.0
I have set the dtype to 'decimal.Decimal' for columns 'c' and 'd' but instead they return as type 'str'.
import pandas as pd
import decimal as D
df = pd.read_csv('test.csv', dtype={'a': int, 'b': float, 'c': D.Decimal, 'd': D.Decimal})
for i, v in df.iterrows():
    print(type(v.a), type(v.b), type(v.c), type(v.d))
Results:
`<class 'int'> <class 'float'> <class 'str'> <class 'str'>`
I have also tried converting to decimal explicitly after import with no luck (converting to float works but not decimal).
df.c = df.c.astype(float)
df.d = df.d.astype(D.Decimal)
for i, v in df.iterrows():
    print(type(v.a), type(v.b), type(v.c), type(v.d))
Results:
`<class 'int'> <class 'float'> <class 'float'> <class 'str'>`
The following code converts a 'str' to 'decimal.Decimal' so I don't understand why pandas doesn't behave the same way.
x = D.Decimal('1.0')
print(type(x))
Results:
`<class 'decimal.Decimal'>`
Answered by jezrael
I think you need converters:
import pandas as pd
import io
import decimal as D
temp = u"""a,b,c,d
1,1,1,1.0"""
# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp),
                 dtype={'a': int, 'b': float},
                 converters={'c': D.Decimal, 'd': D.Decimal})
print(df)
   a    b  c    d
0  1  1.0  1  1.0
for i, v in df.iterrows():
    print(type(v.a), type(v.b), type(v.c), type(v.d))
`<class 'int'> <class 'float'> <class 'decimal.Decimal'> <class 'decimal.Decimal'>`
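If the conversion has to happen after import rather than during it, one possible alternative is to read the columns as plain strings and build the Decimal objects element by element. The following is only a minimal sketch under that assumption: the name `csv_text` is made up for illustration, and the data is the same one-row sample from the question.

```python
import io
import decimal as D

import pandas as pd

# Hypothetical in-memory copy of test.csv from the question.
csv_text = "a,b,c,d\n1,1,1,1.0\n"

# Read 'c' and 'd' as strings so the original text (e.g. "1.0") is preserved.
df = pd.read_csv(io.StringIO(csv_text),
                 dtype={'a': int, 'b': float, 'c': str, 'd': str})

# Call decimal.Decimal on each string value; the resulting columns
# hold Python Decimal objects.
df['c'] = df['c'].map(D.Decimal)
df['d'] = df['d'].map(D.Decimal)

print(type(df.loc[0, 'c']), type(df.loc[0, 'd']))
```

Building each Decimal from the string form, as `converters` also does, keeps the digits exactly as written in the file, whereas going through float first could introduce binary rounding.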