Python: change the data type of a specific column of a pandas DataFrame
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/41590884/
Change data type of a specific column of a pandas dataframe
Asked by DougKruger
I want to sort a dataframe with many columns by a specific column, but first I need to change its type from object to int. How can I change the data type of this specific column while keeping the original column positions?
Accepted answer by jezrael
You can cast the column to int with astype, sort it with sort_values, and then use reindex to reorder the dataframe by the sorted index:
import pandas as pd

df = pd.DataFrame({'A':[1,2,3],
                   'B':[4,5,6],
                   'colname':['7','3','9'],
                   'D':[1,3,5],
                   'E':[5,3,6],
                   'F':[7,4,3]})

print (df)
   A  B  D  E  F colname
0  1  4  1  5  7       7
1  2  5  3  3  4       3
2  3  6  5  6  3       9

print (df.colname.astype(int).sort_values())
1    3
0    7
2    9
Name: colname, dtype: int32

print (df.reindex(df.colname.astype(int).sort_values().index))
   A  B  D  E  F colname
1  2  5  3  3  4       3
0  1  4  1  5  7       7
2  3  6  5  6  3       9

print (df.reindex(df.colname.astype(int).sort_values().index).reset_index(drop=True))
   A  B  D  E  F colname
0  2  5  3  3  4       3
1  1  4  1  5  7       7
2  3  6  5  6  3       9
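A possibly simpler route to the same result is to assign the converted column back and sort the dataframe directly. The following is only a minimal sketch, assuming every value in the column parses cleanly as an integer; the variable names columns_before and result are illustrative. Assigning to an existing column does not move it, so the column positions are preserved.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [4, 5, 6],
                   'colname': ['7', '3', '9'],
                   'D': [1, 3, 5]})

# Remember the column order so it can be compared afterwards.
columns_before = list(df.columns)

# Replace the string column with its integer version; its position is unchanged.
df['colname'] = df['colname'].astype(int)

# Sort the whole dataframe by the converted column and renumber the rows.
result = df.sort_values('colname').reset_index(drop=True)

print(result)
```

Unlike the reindex version, this changes the stored dtype of the column, which may or may not be what you want.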
If the first solution does not work because of None or bad data, use to_numeric with errors='coerce', which turns values that cannot be parsed into NaN:
df = pd.DataFrame({'A':[1,2,3],
                   'B':[4,5,6],
                   'colname':['7','3','None'],
                   'D':[1,3,5],
                   'E':[5,3,6],
                   'F':[7,4,3]})

print (df)
   A  B  D  E  F colname
0  1  4  1  5  7       7
1  2  5  3  3  4       3
2  3  6  5  6  3    None

print (pd.to_numeric(df.colname, errors='coerce').sort_values())
1    3.0
0    7.0
2    NaN
Name: colname, dtype: float64
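To carry the to_numeric idea through to a sorted dataframe, something like the sketch below may work. It assumes that unparseable values should simply sort last, and the final step relies on the nullable 'Int64' dtype, which is only available in reasonably recent pandas versions (0.24 or later); drop that line if you are on an older release.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'colname': ['7', '3', 'None']})

# Values that cannot be parsed become NaN, so the column becomes float64.
df['colname'] = pd.to_numeric(df['colname'], errors='coerce')

# sort_values places NaN rows last by default (na_position='last').
result = df.sort_values('colname').reset_index(drop=True)

# Optional: the nullable integer dtype keeps whole numbers next to missing values.
result['colname'] = result['colname'].astype('Int64')

print(result)
```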
Answer by JimmyOnThePage
df['colname'] = df['colname'].astype(int)
works, at least when changing from float values to int.
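Two details of the float-to-int cast are worth checking before relying on it. The sketch below uses made-up values to illustrate them: astype(int) discards the fractional part rather than rounding, and a column containing NaN cannot be cast to a plain int at all.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.9, -1.9, 2.0])

# astype(int) drops the fractional part (truncates toward zero); it does not round.
truncated = s.astype(int)

# Round first if nearest-integer behaviour is wanted.
rounded = s.round().astype(int)

# NaN has no plain-int representation, so this cast raises a ValueError.
try:
    pd.Series([1.0, np.nan]).astype(int)
    nan_cast_failed = False
except ValueError:
    nan_cast_failed = True

print(truncated.tolist(), rounded.tolist(), nan_cast_failed)
```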
Answer by user19120
I tried the following:
df['column']=df.column.astype('int64')
and it worked for me.
Answer by Kripalu Sar
To change just one column, you can do the following (apply returns a new Series, so assign it back):
df['column_name'] = df.column_name.apply(int)
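You can replace int with the desired data type, e.g. np.int64 or str. For category, use df.column_name.astype('category') instead, because category is a dtype name rather than a function that apply can call.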
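The difference between apply and astype can be sketched as follows, using a made-up single-column dataframe. The point is that apply hands back a new Series and leaves the dataframe alone until the result is assigned, and that dtype names such as 'category' go through astype.

```python
import pandas as pd

df = pd.DataFrame({'column_name': ['7', '3', '9']})

# apply returns a new Series; the dataframe itself is not modified yet.
converted = df.column_name.apply(int)
dtype_before_assignment = df['column_name'].dtype  # still the original string dtype

# Assign the result back to actually change the column.
df['column_name'] = converted

# 'category' is a dtype name, not a callable, so it is passed to astype.
as_category = df['column_name'].astype('category')

print(df.dtypes, as_category.dtype)
```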
For multiple datatype changes, I would recommend the following:
df = pd.read_csv(data, dtype={'Col_A': str, 'Col_B': 'int64'})
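As a runnable illustration of setting several dtypes at once, here is a small sketch. The in-memory CSV and the name df2 are hypothetical stand-ins (the answer does not show what data contains), and the second half assumes you want to convert a dataframe that is already loaded, for which astype accepts the same kind of column-to-dtype mapping.

```python
import io
import pandas as pd

# Hypothetical in-memory CSV standing in for the `data` source in the answer.
data = io.StringIO("Col_A,Col_B\n001,7\n002,3\n")

# Set dtypes while reading: Col_A stays text (leading zeros kept), Col_B is int64.
df = pd.read_csv(data, dtype={'Col_A': str, 'Col_B': 'int64'})

# For a dataframe that already exists, astype takes a similar mapping.
df2 = df.astype({'Col_A': 'category', 'Col_B': 'float64'})

print(df.dtypes)
print(df2.dtypes)
```

Neither call reorders the columns, so the original column positions are kept.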