Python Pandas Melt Function
Original URL: http://stackoverflow.com/questions/34830597/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Pandas Melt Function
Asked by slaw
I have a dataframe:
import pandas as pd
df = pd.DataFrame([[2, 4, 7, 8, 1, 3, 2013], [9, 2, 4, 5, 5, 6, 2014]], columns=['Amy', 'Bob', 'Carl', 'Chris', 'Ben', 'Other', 'Year'])
   Amy  Bob  Carl  Chris  Ben  Other  Year
0    2    4     7      8    1      3  2013
1    9    2     4      5    5      6  2014
And a dictionary:
d = {'A': ['Amy'], 'B': ['Bob', 'Ben'], 'C': ['Carl', 'Chris']}
I would like to reshape my dataframe to look like this:
    Group   Name  Year  Value
0       A    Amy  2013      2
1       A    Amy  2014      9
2       B    Bob  2013      4
3       B    Bob  2014      2
4       B    Ben  2013      1
5       B    Ben  2014      5
6       C   Carl  2013      7
7       C   Carl  2014      4
8       C  Chris  2013      8
9       C  Chris  2014      5
10  Other         2013      3
11  Other         2014      6
Note that Other doesn't have any values in the Name column, and the order of the rows does not matter. I think I should be using the melt function, but the examples that I've come across aren't too clear.
Accepted answer by TomAugspurger
melt gets you part way there.
In [29]: m = pd.melt(df, id_vars=['Year'], var_name='Name')
This has everything except Group. To get that, we need to reshape d a bit as well.
In [30]: d2 = {}
In [31]: for k, v in d.items():
   ....:     for item in v:
   ....:         d2[item] = k
   ....:
In [32]: d2
Out[32]: {'Amy': 'A', 'Ben': 'B', 'Bob': 'B', 'Carl': 'C', 'Chris': 'C'}
In [34]: m['Group'] = m['Name'].map(d2)
In [35]: m
Out[35]:
    Year   Name  value  Group
0   2013    Amy      2      A
1   2014    Amy      9      A
2   2013    Bob      4      B
3   2014    Bob      2      B
4   2013   Carl      7      C
..   ...    ...    ...    ...
7   2014  Chris      5      C
8   2013    Ben      1      B
9   2014    Ben      5      B
10  2013  Other      3    NaN
11  2014  Other      6    NaN
[12 rows x 4 columns]
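As a side note, the inverted lookup built by the loop above could also be written as a dict comprehension. The following is a minimal sketch of the same steps as a standalone script rather than an IPython session; it reuses the names df, d, d2 and m from the question and answer, and it is only one possible way to write it. Names that are missing from d2 (here only 'Other') come out of map as NaN.

```python
import pandas as pd

df = pd.DataFrame(
    [[2, 4, 7, 8, 1, 3, 2013], [9, 2, 4, 5, 5, 6, 2014]],
    columns=['Amy', 'Bob', 'Carl', 'Chris', 'Ben', 'Other', 'Year'],
)
d = {'A': ['Amy'], 'B': ['Bob', 'Ben'], 'C': ['Carl', 'Chris']}

# Invert d from group -> [names] to name -> group
d2 = {name: group for group, names in d.items() for name in names}

# Wide to long: one row per (Year, Name) pair; the cell goes into 'value'
m = pd.melt(df, id_vars=['Year'], var_name='Name')

# Look up each name's group; names absent from d2 become NaN
m['Group'] = m['Name'].map(d2)
print(m)
```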
And moving 'Other' from Name to Group:
In [8]: mask = m['Name'] == 'Other'
In [9]: m.loc[mask, 'Name'] = ''
In [10]: m.loc[mask, 'Group'] = 'Other'
In [11]: m
Out[11]:
    Year   Name  value  Group
0   2013    Amy      2      A
1   2014    Amy      9      A
2   2013    Bob      4      B
3   2014    Bob      2      B
4   2013   Carl      7      C
..   ...    ...    ...    ...
7   2014  Chris      5      C
8   2013    Ben      1      B
9   2014    Ben      5      B
10  2013             3  Other
11  2014             6  Other
[12 rows x 4 columns]
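Putting the pieces together, here is a hedged end-to-end sketch that produces the column layout requested in the question (Group, Name, Year, Value). Two things in it are assumptions rather than part of the answer: the variable names name_to_group and result are made up for illustration, and instead of hard-coding 'Other' it treats every column that is not listed in d as its own group. With the question's data this gives the same result, since 'Other' is the only unlisted column.

```python
import pandas as pd

df = pd.DataFrame(
    [[2, 4, 7, 8, 1, 3, 2013], [9, 2, 4, 5, 5, 6, 2014]],
    columns=['Amy', 'Bob', 'Carl', 'Chris', 'Ben', 'Other', 'Year'],
)
d = {'A': ['Amy'], 'B': ['Bob', 'Ben'], 'C': ['Carl', 'Chris']}

# Hypothetical helper name: maps each person to their group label
name_to_group = {name: group for group, names in d.items() for name in names}

# Melt straight into the requested 'Value' column name
result = pd.melt(df, id_vars=['Year'], var_name='Name', value_name='Value')
result['Group'] = result['Name'].map(name_to_group)

# Assumption: any column not listed in d becomes its own group with an empty Name
mask = result['Group'].isna()
result.loc[mask, 'Group'] = result.loc[mask, 'Name']
result.loc[mask, 'Name'] = ''

# Reorder the columns to match the layout asked for in the question
result = result[['Group', 'Name', 'Year', 'Value']]
print(result)
```

The row order differs from the question's example (Ben follows Chris because melt walks the columns in their original order), which the question says does not matter.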
Answer by HeadAndTail
Pandas melt function:
This function is useful to massage a DataFrame into a format where one or more columns are identifier variables (id_vars), while all other columns, considered measured variables (value_vars), are "unpivoted" to the row axis, leaving just two non-identifier columns, 'variable' and 'value'.
For example:
melted = pd.melt(df, id_vars=["weekday"],
                 var_name="Person", value_name="Score")
We use melt to transform wide data into long data.
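The snippet above refers to a "weekday" column that the question's dataframe does not have, so here is a small self-contained sketch with a made-up frame (the name scores and its values are purely illustrative). It shows the default 'variable' and 'value' column names, the effect of var_name and value_name, and how value_vars limits which columns are unpivoted.

```python
import pandas as pd

# Hypothetical wide table: one row per weekday, one column per person
scores = pd.DataFrame({
    'weekday': ['Mon', 'Tue'],
    'Amy': [2, 9],
    'Bob': [4, 2],
})

# Default names for the two new columns are 'variable' and 'value'
default_names = pd.melt(scores, id_vars=['weekday'])

# var_name / value_name rename those two columns
melted = pd.melt(scores, id_vars=['weekday'],
                 var_name='Person', value_name='Score')

# value_vars restricts the unpivot to the listed columns
amy_only = pd.melt(scores, id_vars=['weekday'], value_vars=['Amy'])

print(melted)
```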