Mixed types of elements in DataFrame's column
Original question: http://stackoverflow.com/questions/27362234/
Note: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow.
Asked by Dror
Consider the following three DataFrames:
import pandas as pd

df1 = pd.DataFrame([[1, 2], [4, 3]])
df2 = pd.DataFrame([[1, .2], [4, 3]])
df3 = pd.DataFrame([[1, 'a'], [4, 3]])
Here are the types of the second column of each DataFrame:
In [56]: map(type,df1[1])
Out[56]: [numpy.int64, numpy.int64]
In [57]: map(type,df2[1])
Out[57]: [numpy.float64, numpy.float64]
In [58]: map(type,df3[1])
Out[58]: [str, int]
In the first case, all ints are cast to numpy.int64. Fine. In the third case, there is basically no casting. However, in the second case, the integer (3) is cast to numpy.float64, probably because the other number in the column is a float.
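To make the upcast in the second case visible, here is a minimal sketch that looks at the stored values rather than only their types. It assumes a recent pandas release; the variable name second_column is only illustrative.

```python
import pandas as pd

# Same data as df2 in the question: the second column mixes a float and an int.
df2 = pd.DataFrame([[1, .2], [4, 3]])

second_column = df2[1]

# The column gets a single dtype, so the integer 3 is stored as the float 3.0.
print(second_column.dtype)
print(second_column.tolist())  # [0.2, 3.0]
```

The integer is not kept alongside the float; it is converted once, when the column is built.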
How can I control the casting? In the second case, I want to have either [float64, int64] or [float, int] as types.
Workaround:
A callable float-formatting function can serve as a workaround for display purposes, as shown here:
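import numpy as np
import pandas as pd

def printFloat(x):
    # np.modf returns (fractional part, integral part)
    if np.modf(x)[0] == 0:
        return str(int(x))  # whole number: print without a decimal point
    else:
        return str(x)

pd.options.display.float_format = printFloat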
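As a usage sketch for this workaround, the snippet below applies the same formatter only inside a pd.option_context block, so the global display setting is left unchanged. The name rendered is illustrative. The formatter affects only how values are printed; the column itself stays float64.

```python
import numpy as np
import pandas as pd

def printFloat(x):
    # Print whole-number floats without a decimal part, others unchanged.
    if np.modf(x)[0] == 0:
        return str(int(x))
    return str(x)

df2 = pd.DataFrame([[1, .2], [4, 3]])

# Apply the formatter temporarily instead of setting it globally.
with pd.option_context('display.float_format', printFloat):
    rendered = df2.to_string()

print(rendered)
print(df2[1].dtype)  # the stored dtype is unaffected by the display option
```

Scoping the option this way is a design choice for the example only; setting pd.options.display.float_format directly, as in the question, applies the formatter to all subsequent output.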
Accepted answer by joris
The columns of a pandas DataFrame (or a Series) are homogeneously typed. You can inspect this with dtype (or DataFrame.dtypes):
In [14]: df1[1].dtype
Out[14]: dtype('int64')
In [15]: df2[1].dtype
Out[15]: dtype('float64')
In [16]: df3[1].dtype
Out[16]: dtype('O')
Only the generic 'object' dtype can hold any Python object, and in this way it can also contain mixed types:
In [18]: df2 = pd.DataFrame([[1,.2],[4,3]], dtype='object')
In [19]: df2[1].dtype
Out[19]: dtype('O')
In [20]: map(type,df2[1])
Out[20]: [float, int]
But this is really not recommended, as it defeats the purpose (or at least the performance) of pandas.
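The sessions above were recorded under Python 2, where map returns a list. Under Python 3 it returns an iterator, so the same check needs an explicit list or a comprehension. The sketch below repeats the object-dtype example that way and then converts the column back to a numeric dtype with astype. The names df_obj, element_types and numeric_column are illustrative.

```python
import pandas as pd

# An object-dtype frame keeps the original Python objects untouched.
df_obj = pd.DataFrame([[1, .2], [4, 3]], dtype='object')

# Collect the type name of each element in the second column.
element_types = [type(value).__name__ for value in df_obj[1]]
print(element_types)  # ['float', 'int']

# Converting back to a homogeneous numeric column restores fast numeric operations.
numeric_column = df_obj[1].astype('float64')
print(numeric_column.dtype)
```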
Is there a reason you specifically want both ints and floats in the same column?