Python: Stop Pandas from converting int to float
Original URL: http://stackoverflow.com/questions/40251948/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Stop Pandas from converting int to float
Asked by user2570465
I have a DataFrame. Two relevant columns are the following: one is a column of int and another is a column of str.
I understand that if I insert NaN into the int column, Pandas will convert all the int values into float because there is no NaN value for an int.
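As a minimal sketch of the upcasting described above (the series and variable names here are illustrative, not from the question), introducing a missing value into a NumPy-backed integer series changes its dtype to float:

```python
import numpy as np
import pandas as pd

# A plain integer series backed by a NumPy integer dtype.
ints = pd.Series([0, 1, 2])
print(ints.dtype)

# Reindexing to a label that does not exist (3) introduces a missing value.
# NumPy integer dtypes cannot represent NaN, so pandas upcasts to float.
with_gap = ints.reindex([0, 1, 2, 3])
print(with_gap.dtype)
print(with_gap.tolist())
```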
However, when I insert None into the str column, Pandas converts all my int values to float as well. This doesn't make sense to me - why does the value I put in column 2 affect column 1?
Here's a simple working example (Python 2):
import pandas as pd
df = pd.DataFrame()
df["int"] = pd.Series([], dtype=int)
df["str"] = pd.Series([], dtype=str)
df.loc[0] = [0, "zero"]
print df
print
df.loc[1] = [1, None]
print df
The output is
   int   str
0    0  zero

   int   str
0  0.0  zero
1  1.0   NaN
Is there any way to make the output the following:
   int   str
0    0  zero

   int   str
0    0  zero
1    1   NaN
without recasting the first column to int.
I prefer using int instead of float because the actual data in that column are integers. If there's no workaround, I'll just use float though.
I prefer not having to recast because in my actual code, I don't store the actual dtype.
I also need the data inserted row-by-row.
Answered by maxymoo
If you set dtype=object, your series will be able to contain arbitrary data types:
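import pandas as pd

df = pd.DataFrame()
# dtype=object lets the column hold arbitrary Python objects, so ints stay ints
df["int"] = pd.Series([], dtype=object)
df["str"] = pd.Series([], dtype=str)
df.loc[0] = [0, "zero"]
print(df)
print()
df.loc[1] = [1, None]
print(df)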
   int   str
0    0  zero
1  NaN   NaN

  int   str
0   0  zero
1   1  None
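To illustrate what dtype=object means in practice, here is a small sketch (the frame below is built directly rather than row-by-row, purely for illustration). The column stores ordinary Python int objects, so nothing is upcast, although object columns are generally slower than native integer columns for numeric work:

```python
import pandas as pd

# Build the "int" column with dtype=object so it stores Python objects as-is.
df = pd.DataFrame({
    "int": pd.Series([0, 1], dtype=object),
    "str": pd.Series(["zero", None], dtype=object),
})

print(df["int"].dtype)          # the column dtype is object
print(type(df.loc[0, "int"]))   # each element is still a plain Python int
print(df["int"].sum())          # arithmetic still works on the stored ints
```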
Answered by fuglede
If you use DataFrame.append to add the data, the dtypes are preserved, and you do not have to recast or rely on object:
In [157]: df
Out[157]:
   int   str
0    0  zero

In [159]: df.append(pd.DataFrame([[1, None]], columns=['int', 'str']), ignore_index=True)
Out[159]:
   int   str
0    0  zero
1    1  None
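Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0. On newer versions, the same idea can be sketched with pd.concat; the starting frame below is constructed directly for illustration and is an assumption, not part of the original answer:

```python
import pandas as pd

# Starting frame with an integer column and a string column.
df = pd.DataFrame({"int": [0], "str": ["zero"]})

# The new row is wrapped in its own one-row DataFrame, as in the answer above.
new_row = pd.DataFrame([[1, None]], columns=["int", "str"])

# pd.concat combines the two frames; both "int" columns are integer,
# so the result keeps an integer dtype for that column.
result = pd.concat([df, new_row], ignore_index=True)
print(result)
print(result.dtypes)
```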
Answered by totalhack
As of pandas 1.0.0 I believe you have another option, which is to first use convert_dtypes. This converts the dataframe columns to dtypes that support pd.NA, avoiding the issues with NaN/None.
...
df = df.convert_dtypes()
df.loc[1] = [1, None]
print(df)
#    int   str
# 0    0  zero
# 1    1   NaN
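As a rough sketch of why this helps (assuming pandas 1.0 or later; the frame and variable names are illustrative), convert_dtypes switches the integer column to the nullable Int64 extension dtype, which can hold pd.NA without falling back to float:

```python
import pandas as pd

df = pd.DataFrame({"int": [0, 1], "str": ["zero", "one"]})

# convert_dtypes picks dtypes that support pd.NA, e.g. Int64 for integers.
converted = df.convert_dtypes()
print(converted.dtypes)

# Putting a missing value into the nullable integer column keeps it Int64.
converted.loc[0, "int"] = pd.NA
print(converted["int"].dtype)
print(converted)
```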

