Python: Convert floats to ints in Pandas?
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/21291259/
Asked by MJP
I've been working with data imported from a CSV. Pandas changed some columns to float, so the numbers in these columns are now displayed as floating-point values. However, I need them to be displayed as integers, or without the decimal comma. Is there a way to convert them to integers, or to not display the comma?
Accepted answer by EdChum
To modify the float output, do this:
import pandas as pd
df = pd.DataFrame(range(5), columns=['a'])
df.a = df.a.astype(float)
df
Out[33]:
a
0 0.0000000
1 1.0000000
2 2.0000000
3 3.0000000
4 4.0000000
pd.options.display.float_format = '{:,.0f}'.format
df
Out[35]:
a
0 0
1 1
2 2
3 3
4 4
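Note that the display option above changes only how floats are printed; the column itself stays float. A minimal sketch illustrating this, assuming a small made-up frame and using pd.option_context so the setting does not leak into the rest of the session:

```python
import pandas as pd

# Hypothetical data, only for illustration.
df = pd.DataFrame({"a": [0.0, 1.0, 2.7]})

# Apply the float format temporarily; it affects rendering only.
with pd.option_context("display.float_format", "{:,.0f}".format):
    rendered = df.to_string()

print(rendered)
print(df["a"].dtype)  # the underlying dtype is unchanged
```

If the values need to be integers in the data itself (for example before writing back to CSV), a dtype conversion such as astype is needed instead.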
Answered by Ryan G
Use the pandas.DataFrame.astype(<type>) function to manipulate column dtypes.
>>> df = pd.DataFrame(np.random.rand(3,4), columns=list("ABCD"))
>>> df
A B C D
0 0.542447 0.949988 0.669239 0.879887
1 0.068542 0.757775 0.891903 0.384542
2 0.021274 0.587504 0.180426 0.574300
>>> df[list("ABCD")] = df[list("ABCD")].astype(int)
>>> df
A B C D
0 0 0 0 0
1 0 0 0 0
2 0 0 0 0
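Every value above becomes 0 because astype(int) truncates toward zero rather than rounding. A minimal sketch of the difference, using a made-up Series; calling round() before astype is one option when nearest-integer results are wanted:

```python
import pandas as pd

# Hypothetical values chosen to show truncation versus rounding.
s = pd.Series([0.4, 0.6, 1.5, -0.6])

truncated = s.astype(int)        # drops the fractional part
rounded = s.round().astype(int)  # rounds first, then converts

print(truncated.tolist())
print(rounded.tolist())
```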
EDIT:
To handle missing values:
>>> df
A B C D
0 0.475103 0.355453 0.66 0.869336
1 0.260395 0.200287 NaN 0.617024
2 0.517692 0.735613 0.18 0.657106
>>> df[list("ABCD")] = df[list("ABCD")].fillna(0.0).astype(int)
>>> df
A B C D
0 0 0 0 0
1 0 0 0 0
2 0 0 0 0
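Filling with 0 makes missing entries indistinguishable from real zeros. A minimal sketch of an alternative, assuming pandas 0.24 or later (where the nullable "Int64" extension dtype is available) and assuming the non-missing values are already whole numbers:

```python
import numpy as np
import pandas as pd

# Hypothetical column with whole-number floats and one missing value.
df = pd.DataFrame({"A": [1.0, 2.0, np.nan]})

# Approach from the answer: replace NaN with 0, then convert.
filled = df["A"].fillna(0.0).astype(int)

# Alternative: nullable integer dtype keeps the missing value as <NA>.
nullable = df["A"].astype("Int64")

print(filled.tolist())
print(nullable)
```

With fractional values, the cast to "Int64" is expected to fail rather than truncate silently, so rounding first may be needed.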
Answered by EdChum
Consider the following data frame:
>>> df = pd.DataFrame(10*np.random.rand(3, 4), columns=list("ABCD"))
>>> print(df)
... A B C D
... 0 8.362940 0.354027 1.916283 6.226750
... 1 1.988232 9.003545 9.277504 8.522808
... 2 1.141432 4.935593 2.700118 7.739108
Using a list of column names, change the type of multiple columns with applymap():
>>> cols = ['A', 'B']
>>> df[cols] = df[cols].applymap(np.int64)
>>> print(df)
... A B C D
... 0 8 0 1.916283 6.226750
... 1 1 9 9.277504 8.522808
... 2 1 4 2.700118 7.739108
Or, for a single column, with apply():
>>> df['C'] = df['C'].apply(np.int64)
>>> print(df)
... A B C D
... 0 8 0 1 6.226750
... 1 1 9 9 8.522808
... 2 1 4 2 7.739108
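As a side note, applymap() was deprecated in pandas 2.1 in favor of DataFrame.map(). A minimal sketch of a vectorized equivalent for the same column list, using astype with a column-to-dtype mapping; the data here is made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical frame; only columns A and B should become integers.
df = pd.DataFrame({
    "A": [8.36, 1.98, -1.14],
    "B": [0.35, 9.00, 4.93],
    "C": [1.91, 9.27, 2.70],
})
cols = ["A", "B"]

# astype accepts a dict, so one call converts just the listed columns.
converted = df.astype({col: np.int64 for col in cols})

print(converted.dtypes)
```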
Answered by user8051244
>>> import pandas as pd
>>> right = pd.DataFrame({'C': [1.002, 2.003], 'D': [1.009, 4.55], 'key': ['K0', 'K1']})
>>> print(right)
C D key
0 1.002 1.009 K0
1 2.003 4.550 K1
>>> right['C'] = right.C.astype(int)
>>> print(right)
C D key
0 1 1.009 K0
1 2 4.550 K1
Answered by enri
This is a quick solution if you want to convert several columns of your pandas.DataFrame from float to integer while also allowing for NaN values.
cols = ['col_1', 'col_2', 'col_3', 'col_4']
for col in cols:
    df[col] = df[col].apply(lambda x: int(x) if x == x else "")
I tried with else x and else None, but the result still contained floats, so I used else "".
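The x == x test works because NaN is the only float that is not equal to itself. A minimal sketch of what the two variants produce, using a made-up Series; note that the "" variant yields an object column that mixes integers and strings, which may matter for later arithmetic:

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing value.
s = pd.Series([1.0, np.nan, 3.0])

# Returning NaN unchanged: pandas infers a float column again.
kept_nan = s.apply(lambda x: int(x) if x == x else x)

# Returning "": the column becomes object dtype (ints and strings).
blank = s.apply(lambda x: int(x) if x == x else "")

print(kept_nan.dtype)
print(blank.dtype, blank.tolist())
```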
Answered by RAHUL KUMAR
>>> df_18['cyl'].value_counts()
... 4.0 365
... 6.0 246
... 8.0 153
>>> df_18['cyl'] = df_18['cyl'].astype(int)
>>> df_18['cyl'].value_counts()
... 4 365
... 6 246
... 8 153
Answered by Suhas_Pote
To convert all float columns to int:
>>> df = pd.DataFrame(np.random.rand(5, 4) * 10, columns=list('PQRS'))
>>> print(df)
... P Q R S
... 0 4.395994 0.844292 8.543430 1.933934
... 1 0.311974 9.519054 6.171577 3.859993
... 2 2.056797 0.836150 5.270513 3.224497
... 3 3.919300 8.562298 6.852941 1.415992
... 4 9.958550 9.013425 8.703142 3.588733
>>> float_col = df.select_dtypes(include=['float64']) # This will select float columns only
>>> # list(float_col.columns.values)
>>> for col in float_col.columns.values:
...     df[col] = df[col].astype('int64')
>>> print(df)
... P Q R S
... 0 4 0 8 1
... 1 0 9 6 3
... 2 2 0 5 3
... 3 3 8 6 1
... 4 9 9 8 3
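The loop above can also be written as a single astype call. A minimal sketch, assuming a made-up frame with one non-numeric column to show that only float columns are touched:

```python
import pandas as pd

# Hypothetical frame: two float columns and one string column.
df = pd.DataFrame({
    "P": [4.39, 0.31],
    "Q": [0.84, 9.51],
    "label": ["x", "y"],
})

# Select the float columns, then convert them all in one call.
float_cols = df.select_dtypes(include=["float64"]).columns
converted = df.astype({col: "int64" for col in float_cols})

print(converted.dtypes)
```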
Answered by aebmad
Expanding on @Ryan G's mention of the pandas.DataFrame.astype(<type>) method, one can use the errors='ignore' argument to convert only those columns that do not produce an error, which notably simplifies the syntax. Obviously, caution should be applied when ignoring errors, but for this task it comes in very handy.
>>> df = pd.DataFrame(np.random.rand(3, 4), columns=list('ABCD'))
>>> df *= 10
>>> print(df)
... A B C D
... 0 2.16861 8.34139 1.83434 6.91706
... 1 5.85938 9.71712 5.53371 4.26542
... 2 0.50112 4.06725 1.99795 4.75698
>>> df['E'] = list('XYZ')
>>> df = df.astype(int, errors='ignore')  # astype returns a new frame, so assign it back
>>> print(df)
... A B C D E
... 0 2 8 1 6 X
... 1 5 9 5 4 Y
... 2 0 4 1 4 Z
From the pandas.DataFrame.astype docs:
errors : {'raise', 'ignore'}, default 'raise'
Control raising of exceptions on invalid data for provided dtype.
- raise : allow exceptions to be raised
- ignore : suppress exceptions. On error return original object
New in version 0.20.0.
Answered by JohnE
Here's a simple function that will downcast floats into the smallest possible integer type that doesn't lose any information. For example:
100.0 can be converted from float to integer, but 99.9 can't (without losing information to rounding or truncation)
Additionally, 1.0 can be downcast all the way to int8 without losing information, but the smallest integer type for 100_000.0 is int32
Code examples:
import numpy as np
import pandas as pd
def float_to_int( s ):
    if ( s.astype(np.int64) == s ).all():
        return pd.to_numeric( s, downcast='integer' )
    else:
        return s
# small integers are downcast into 8-bit integers
float_to_int( np.array([1.0,2.0]) )
Out[1]: array([1, 2], dtype=int8)
# larger integers are downcast into larger integer types
float_to_int( np.array([100_000.,200_000.]) )
Out[2]: array([100000, 200000], dtype=int32)
# if there are values to the right of the decimal
# point, no conversion is made
float_to_int( np.array([1.1,2.2]) )
Out[3]: array([ 1.1, 2.2])
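The same idea can be applied column by column to a DataFrame. The sketch below restates the function with one addition that is not in the answer above: a guard that leaves columns containing NaN untouched, since casting NaN to int64 is not meaningful. The frame and its column names are made up for illustration:

```python
import numpy as np
import pandas as pd

def float_to_int(s):
    # Assumed extra guard (not in the original answer): skip columns with NaN.
    if s.isna().any():
        return s
    # Convert only when every value is a whole number.
    if (s.astype(np.int64) == s).all():
        return pd.to_numeric(s, downcast="integer")
    return s

# Hypothetical data covering each case.
df = pd.DataFrame({
    "small": [1.0, 2.0],
    "large": [100_000.0, 200_000.0],
    "fractional": [1.1, 2.2],
    "missing": [1.0, np.nan],
})

# Rebuild the frame one column at a time so each keeps its own dtype.
result = pd.DataFrame({name: float_to_int(col) for name, col in df.items()})

print(result.dtypes)
```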

