Pandas: unable to change column data type
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/17778139/
Question
I was following the advice here to change the column data type of a pandas DataFrame. However, it does not seem to work if I reference the columns by index number instead of by column name. Is there a way to do this correctly?
In [49]: df.iloc[:, 4:].astype(int)
Out[49]: 
<class 'pandas.core.frame.DataFrame'>
Int64Index: 5074 entries, 0 to 5073
Data columns (total 3 columns):
5    5074  non-null values
6    5074  non-null values
7    5074  non-null values
dtypes: int64(3) 
In [50]: df.iloc[:, 4:] = df.iloc[:, 4:].astype(int)
In [51]: df
Out[51]: 
<class 'pandas.core.frame.DataFrame'>
Int64Index: 5074 entries, 0 to 5073
Data columns (total 7 columns):
1    5074  non-null values
2    5074  non-null values
3    5074  non-null values
4    5074  non-null values
5    5074  non-null values
6    5074  non-null values
7    5074  non-null values
dtypes: object(7) 
In [52]: 
Accepted answer by Jeff
Do it like this:
In [49]: df = DataFrame([['1','2','3','.4',5,6.,'foo']],columns=list('ABCDEFG'))
In [50]: df
Out[50]: 
   A  B  C   D  E  F    G
0  1  2  3  .4  5  6  foo
In [51]: df.dtypes
Out[51]: 
A     object
B     object
C     object
D     object
E      int64
F    float64
G     object
dtype: object
You need to assign the columns one by one:
In [52]: for k, v in df.iloc[:,0:4].convert_objects(convert_numeric=True).iteritems():
   ....:     df[k] = v
   ....:     
In [53]: df.dtypes
Out[53]: 
A      int64
B      int64
C      int64
D    float64
E      int64
F    float64
G     object
dtype: object
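The loop above relies on convert_objects and iteritems, which are no longer available in current pandas releases. A minimal sketch of the same column-by-column idea follows, assuming a recent pandas and swapping in pd.to_numeric as the conversion step (that substitution is mine, not part of the original answer):

```python
import pandas as pd

# Same sample frame as in the answer above.
df = pd.DataFrame([['1', '2', '3', '.4', 5, 6., 'foo']], columns=list('ABCDEFG'))

# Convert the first four columns, selected by position, one column at a time.
# Assigning a whole column with df[label] = ... replaces that column,
# so its dtype can change.
for label in df.columns[:4]:
    df[label] = pd.to_numeric(df[label])

print(df.dtypes)
```

Selecting the labels with df.columns[:4] keeps the positional addressing the question asked for, while the assignment itself goes through the column label.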
convert_objects usually does the right thing, so the easiest approach is this:
In [54]: df = DataFrame([['1','2','3','.4',5,6.,'foo']],columns=list('ABCDEFG'))
In [55]: df.convert_objects(convert_numeric=True).dtypes
Out[55]: 
A      int64
B      int64
C      int64
D    float64
E      int64
F    float64
G     object
dtype: object
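For readers on a pandas version without convert_objects, a rough whole-frame equivalent might look like the sketch below. The helper name to_numeric_where_possible is hypothetical, and its behaviour is an assumption about what is wanted here: a column is converted only if every value in it parses as a number, otherwise it is left untouched.

```python
import pandas as pd

def to_numeric_where_possible(frame):
    """Hypothetical helper: return a copy with each fully numeric column converted."""
    out = frame.copy()
    for label in out.columns:
        try:
            out[label] = pd.to_numeric(out[label])
        except (ValueError, TypeError):
            # The column holds something non-numeric (such as 'foo'); leave it alone.
            pass
    return out

df = pd.DataFrame([['1', '2', '3', '.4', 5, 6., 'foo']], columns=list('ABCDEFG'))
converted = to_numeric_where_possible(df)
print(converted.dtypes)
```

Working on a copy keeps the original frame unchanged, which makes it easy to compare dtypes before and after.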
Assigning via df.iloc[:,4:] with a Series on the right-hand side copies the data, changing the type as needed, so I think this should work in theory. However, I suspect it is hitting a very obscure bug that prevents the object dtype from changing to a real (meaning int/float) dtype. It should probably raise for now.
Here's the issue to track this: https://github.com/pydata/pandas/issues/4312
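For the question's original case (astype(int) on every column from position 4 onward), one way to sidestep slice assignment is to look up the labels by position and pass them to DataFrame.astype as a mapping. This is only a sketch: the frame below is a made-up stand-in for the question's data, assuming integer column labels 1 to 7 and object dtype throughout.

```python
import pandas as pd

# Hypothetical stand-in for the question's frame: integer column labels 1..7,
# every column stored as object dtype.
df = pd.DataFrame([['a', 'b', 'c', 'd', '10', '20', '30']],
                  columns=range(1, 8), dtype=object)

# Pick the trailing columns by position, then convert them by label.
# astype with a {label: dtype} mapping returns a new frame, so rebind df.
trailing = df.columns[4:]
df = df.astype({label: int for label in trailing})

print(df.dtypes)
```

Because astype returns a new frame instead of writing into the old one, the result does not depend on how iloc assignment treats the existing object columns.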

