Original question: http://stackoverflow.com/questions/27362234/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Mixed types of elements in a DataFrame's column
Asked by Dror
Consider the following three DataFrames:
import pandas as pd

df1 = pd.DataFrame([[1,2],[4,3]])
df2 = pd.DataFrame([[1,.2],[4,3]])
df3 = pd.DataFrame([[1,'a'],[4,3]])
Here are the types of the elements in the second column of each DataFrame:
In [56]: map(type,df1[1])
Out[56]: [numpy.int64, numpy.int64]
In [57]: map(type,df2[1])
Out[57]: [numpy.float64, numpy.float64]
In [58]: map(type,df3[1])
Out[58]: [str, int]
In the first case, all ints are cast to numpy.int64. Fine. In the third case, there is basically no casting. However, in the second case, the integer (3) is cast to numpy.float64, probably because the other number in the column is a float.
How can I control the casting? In the second case, I want to have either [float64, int64] or [float, int] as types.
Workaround:
A callable can be set as the float display format, as shown here. This only changes how floats are printed, not how they are stored:
import numpy as np

def printFloat(x):
    # Print floats that have no fractional part as integers (display only).
    if np.modf(x)[0] == 0:
        return str(int(x))
    else:
        return str(x)

pd.options.display.float_format = printFloat
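To make the effect of this workaround concrete, here is a minimal sketch. The name `print_float` and the use of `pd.option_context` (to keep the setting local instead of changing it globally) are choices made for this illustration, not part of the original post. It suggests that only the printed form changes while the column keeps its float64 dtype.

```python
import numpy as np
import pandas as pd


def print_float(x):
    # Same idea as printFloat above: drop the ".0" from whole-number floats.
    if np.modf(x)[0] == 0:
        return str(int(x))
    return str(x)


df = pd.DataFrame([[1, 0.2], [4, 3]])

# Apply the formatter only inside this block (assumption: a local setting
# is preferable to a global one for a demonstration).
with pd.option_context("display.float_format", print_float):
    rendered = df.to_string()

print(rendered)
print(df[1].dtype)  # the stored data is still float64
```

Note that the formatter as written would fail on infinite values, since `int(x)` cannot convert them; a guard would be needed if the data may contain them.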
Accepted answer by joris
Each column of a pandas DataFrame (or a Series) has a single, homogeneous type. You can inspect this with dtype (or DataFrame.dtypes):
In [14]: df1[1].dtype
Out[14]: dtype('int64')
In [15]: df2[1].dtype
Out[15]: dtype('float64')
In [16]: df3[1].dtype
Out[16]: dtype('O')
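As a self-contained sketch of the same inspection (the variable names `dtype_names` and `stored_value` are introduced here for illustration), the following script rebuilds the three frames from the question and shows that the integer 3 in the second frame is stored as a float, because the whole column shares one dtype:

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2], [4, 3]])
df2 = pd.DataFrame([[1, 0.2], [4, 3]])
df3 = pd.DataFrame([[1, "a"], [4, 3]])

# One dtype per column: int64, float64 and object respectively.
dtype_names = [str(df[1].dtype) for df in (df1, df2, df3)]
print(dtype_names)

# In df2 the integer 3 was upcast so the column fits a single float64 dtype.
stored_value = df2.loc[1, 1]
print(repr(stored_value))
```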
Only the generic 'object' dtype can hold any Python object, and in this way it can also contain mixed types:
In [18]: df2 = pd.DataFrame([[1,.2],[4,3]], dtype='object')
In [19]: df2[1].dtype
Out[19]: dtype('O')
In [20]: map(type,df2[1])
Out[20]: [float, int]
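The session above appears to come from Python 2, where map returns a list. In Python 3, map returns an iterator, so a rough equivalent of the same check could look like the sketch below (the name `element_types` is made up for this example):

```python
import pandas as pd

# Force the object dtype so pandas stores the original Python objects.
df = pd.DataFrame([[1, 0.2], [4, 3]], dtype="object")

# A list comprehension replaces the Python 2 style map(type, ...) call.
element_types = [type(value) for value in df[1]]

print(df[1].dtype)
print(element_types)
```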
But this is really not recommended, as it defeats the purpose (or at least the performance) of pandas.
Is there a reason you specifically want both ints and floats in the same column?

