转换数据帧的 Pandas dtype

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/28817902/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-13 22:59:58  来源:igfitidea点击:

Convert Pandas dtype of dataframe

pythonnumpypandask-means

提问by conr404

I have a Pandasdataframe which is stored as an 'object', but I need to change the dataframe structure to an 'int' as the 'object' dtype will not process in the kmeans()function of numpylibrary

我有一个存储为“对象”的Pandas数据帧,但我需要将数据帧结构更改为“int”,因为“对象”数据类型不会在numpy库的kmeans()函数中处理

I have managed to convert each column of the dataframe into an float64,based on this example Pandas: change data type of columnsbut I can't change the whole thing into anything else.

我已经设法将数据帧的每一列转换为 float64,基于这个示例Pandas:更改列的数据类型,但我无法将整个内容更改为其他任何内容。

 #create subset of user variables
 user.posts = user.posts.astype('int')
 user.views = user.views.astype('int')
 user.kudos = user.kudos.astype('int')

 Y = user[['posts','views','kudos']]
 #convert dataframe into float
 X.convert_objects(convert_numeric=True).dtypes

Out[205]:
 posts    float64
 views    float64
 kudos    float64
 dtype: object

This then causes issues when I try and run

当我尝试运行时,这会导致问题

K = range(1,10)

# scipy.cluster.vq.kmeans
KM = [kmeans(X,k) for k in K] # apply kmeans 1 to 10

I get the error

我收到错误

  --->KM = [kmeans(X,k) for k in K] # apply kmeans 1 to 10
  ^

  AttributeError: 'DataFrame' object has no attribute 'dtype'

What is the issue kmeans is having with either the K or X dataframe, and how can it be resolved? Thanks

kmeans 对 K 或 X 数据帧有什么问题,如何解决?谢谢

回答by conr404

save it just as the values, not the objects. per this post How to convert a pandas DataFrame subset of columns AND rows into a numpy array?

将它保存为值,而不是对象。根据这篇文章 如何将列和行的 Pandas DataFrame 子集转换为 numpy 数组?

user.posts = user.posts.astype('float')
user.views = user.views.astype('float')
user.kudos = user.kudos.astype('float')

Y = user[['posts','views','kudos']].values