Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/27362234/

Date: 2020-08-19 01:42:41  Source: igfitidea

Mixed types of elements in DataFrame's column

Tags: python, numpy, pandas

Asked by Dror

Consider the following three DataFrames:

import pandas as pd

df1 = pd.DataFrame([[1,2],[4,3]])
df2 = pd.DataFrame([[1,.2],[4,3]])
df3 = pd.DataFrame([[1,'a'],[4,3]])

Here are the element types in the second column of each DataFrame:

In [56]: list(map(type, df1[1]))
Out[56]: [numpy.int64, numpy.int64]

In [57]: list(map(type, df2[1]))
Out[57]: [numpy.float64, numpy.float64]

In [58]: list(map(type, df3[1]))
Out[58]: [str, int]

In the first case, all ints are cast to numpy.int64. Fine. In the third case, there is basically no casting. However, in the second case, the integer (3) is cast to numpy.float64, probably because the other number is a float.

How can I control the casting? In the second case, I want to have either [float64, int64] or [float, int] as types.

Workaround:

For display purposes, a callable float formatter offers a workaround, as shown here:

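import numpy as np
import pandas as pd

def printFloat(x):
    # np.modf returns (fractional part, integer part)
    if np.modf(x)[0] == 0:
        return str(int(x))
    else:
        return str(x)

# Changes only how floats are displayed; the column dtype stays float64
pd.options.display.float_format = printFloat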
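
To make clear what this workaround does and does not change, here is a minimal sketch. It assumes the formatter only needs to apply temporarily, so it uses pd.option_context instead of setting the option globally. The names print_float and rendered are illustrative and not part of the original question.

```python
import numpy as np
import pandas as pd

def print_float(x):
    # Same logic as printFloat above: drop the ".0" from whole-number floats
    if np.modf(x)[0] == 0:
        return str(int(x))
    return str(x)

df2 = pd.DataFrame([[1, .2], [4, 3]])

# option_context restores the previous display setting when the block exits
with pd.option_context("display.float_format", print_float):
    rendered = df2.to_string()

print(rendered)
print(df2[1].dtype)  # the stored dtype is not affected by the formatter
```

The formatter changes only the rendered text. The values in the column are still floats, so this does not answer the casting question itself.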

Accepted answer by joris

Each column of a pandas DataFrame (and each Series) has a single, homogeneous type. You can inspect it with dtype (or DataFrame.dtypes):

In [14]: df1[1].dtype
Out[14]: dtype('int64')

In [15]: df2[1].dtype
Out[15]: dtype('float64')

In [16]: df3[1].dtype
Out[16]: dtype('O')
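
As a rough illustration of why the second column of df2 ends up as float64, the sketch below rebuilds the three frames and compares the result with NumPy's type promotion rule. Using np.result_type here is an illustrative choice to show the common type of int64 and float64. It is not a claim about how pandas infers dtypes internally.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame([[1, 2], [4, 3]])
df2 = pd.DataFrame([[1, .2], [4, 3]])
df3 = pd.DataFrame([[1, 'a'], [4, 3]])

# DataFrame.dtypes reports exactly one dtype per column
for name, df in [("df1", df1), ("df2", df2), ("df3", df3)]:
    print(name, df.dtypes.tolist())

# The smallest NumPy dtype that can represent both an int64 and a float64
common = np.result_type(np.int64, np.float64)
print(common)
```

A column holding both 0.2 and 3 needs one dtype that can represent both values, and float64 is that common type, so 3 is stored as 3.0.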

Only the generic 'object' dtype can hold any Python object, and in this way it can also contain mixed types:

In [18]: df2 = pd.DataFrame([[1,.2],[4,3]], dtype='object')

In [19]: df2[1].dtype
Out[19]: dtype('O')

In [20]: list(map(type, df2[1]))
Out[20]: [float, int]

But this is really not recommended, as it defeats the purpose (or at least the performance) of pandas.
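
If mixed Python types really are needed, the object dtype can be limited to the one column that needs it, leaving the other columns with their numeric dtypes. The following is a minimal sketch under that assumption. The variable names (mixed, late) are illustrative. It also shows that converting with astype(object) after construction is too late, because the integer has already been stored as a float.

```python
import pandas as pd

# Build only the mixed column as object, so its elements are not upcast
mixed = pd.Series([.2, 3], dtype=object)
df = pd.DataFrame({0: [1, 4], 1: mixed})

column_dtypes = df.dtypes.tolist()        # one dtype per column
element_types = [type(x) for x in df[1]]  # Python types kept as given
print(column_dtypes, element_types)

# Converting afterwards does not bring the int back: 3 was stored as 3.0
late = pd.DataFrame([[1, .2], [4, 3]])[1].astype(object)
late_values = late.tolist()
print(late_values)
```

The performance caveat above still applies to the object column, since its values are stored as individual Python objects and not in a numeric array.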

Is there a reason you specifically want both ints and floats in the same column?
