pandas: convert Series to DataFrame
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/38913355/
convert Series to DataFrame
Asked by SHIVAM GOYAL
I created a DataFrame X.
I wanted to create another DataFrame, y, that contains the values of the column 'wheat_type' from X,
so I executed this code:
y=X.loc[:, 'wheat_type']
When I then ran the following command:
y['wheat_type'] = y.wheat_type("category").cat.codes
I got the following error:
'Series' object has no attribute 'wheat_type'
Running type(X) gave:
<class 'pandas.core.frame.DataFrame'>
and running type(y) gave:
<class 'pandas.core.series.Series'>
Is there a way to convert y into a DataFrame? If not, please tell me how to create the required DataFrame y from X.
Accepted answer by jezrael
It looks like you need astype and to_frame:
import pandas as pd

X = pd.DataFrame({'wheat_type':[5,7,3]})
print (X)
   wheat_type
0           5
1           7
2           3
#create a DataFrame by selecting with a list of column names
y=X[['wheat_type']]
#cast to category and get the integer codes
y['wheat_type'] = y.wheat_type.astype("category").cat.codes
print (y)
   wheat_type
0           1
1           2
2           0
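The error in the question comes from how the column was selected. The following is a minimal sketch of that difference, not the asker's actual data: the values in X and the extra 'area' column are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's X.
X = pd.DataFrame({
    'wheat_type': ['kama', 'rosa', 'canadian'],
    'area': [15.3, 14.9, 13.8],
})

as_series = X.loc[:, 'wheat_type']    # single label -> Series
as_frame = X.loc[:, ['wheat_type']]   # list of labels -> DataFrame

print(type(as_series).__name__)
print(type(as_frame).__name__)

# A Series has no columns, so the category codes are taken
# from the Series itself rather than from y.wheat_type.
codes = as_series.astype('category').cat.codes
print(codes.tolist())
```

Selecting with a list of labels keeps the two-dimensional shape, which is why y.wheat_type works on the DataFrame but not on the Series.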
If there are multiple columns, it is better to use to_frame, as Ami pointed out:
X = pd.DataFrame({'wheat_type':[5,7,3], 'z':[4,7,9]})
print (X)
   wheat_type  z
0           5  4
1           7  7
2           3  9
y = X['wheat_type'].to_frame()
#cast to category and get the integer codes
y['wheat_type'] = y.wheat_type.astype("category").cat.codes
print (y)
   wheat_type
0           1
1           2
2           0
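As a small illustrative sketch of the to_frame step above: the resulting column takes the Series name by default, and the name argument can set a different one. The column name 'wheat' here is only an example, not something from the original answer.

```python
import pandas as pd

X = pd.DataFrame({'wheat_type': [5, 7, 3], 'z': [4, 7, 9]})

# By default the new column is named after the Series.
y = X['wheat_type'].to_frame()

# The `name` argument picks a different column name ('wheat' is illustrative).
y_renamed = X['wheat_type'].to_frame(name='wheat')

print(y.columns.tolist())
print(y_renamed.columns.tolist())
```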
Another way to create a new DataFrame is to take a column subset and copy it, which can also avoid a SettingWithCopyWarning in some pandas versions when you assign to y afterwards:
y = X[['wheat_type']].copy()
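A short sketch of what the explicit copy gives you, reusing the sample X from this answer: y can be modified freely while X keeps its original values.

```python
import pandas as pd

X = pd.DataFrame({'wheat_type': [5, 7, 3], 'z': [4, 7, 9]})

# Independent one-column DataFrame.
y = X[['wheat_type']].copy()
y['wheat_type'] = y['wheat_type'].astype('category').cat.codes

print(X['wheat_type'].tolist())  # original values are untouched
print(y['wheat_type'].tolist())  # category codes
```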
Answer by Ami Tavory
There's a special method for that: pd.Series.to_frame
In [1]: import pandas as pd
In [2]: df = pd.DataFrame({'a': range(4)})
In [3]: df.a
Out[3]:
0    0
1    1
2    2
3    3
Name: a, dtype: int64
In [4]: df.a.to_frame()
Out[4]:
   a
0  0
1  1
2  2
3  3
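One caveat worth sketching, as an addition rather than part of the original answer: a Series without a name produces a column labelled 0, so passing name can be useful. The name 'a' below just mirrors the example above.

```python
import pandas as pd

unnamed = pd.Series(range(4))  # this Series has no name

# Without a name, the resulting column is labelled 0.
default_frame = unnamed.to_frame()

# Supplying a name gives the column a readable label.
named_frame = unnamed.to_frame(name='a')

print(default_frame.columns.tolist())
print(named_frame.columns.tolist())
```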