Python Pandas melt function

Notice: this page reproduces a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): StackOverflow, original URL: http://stackoverflow.com/questions/34830597/

Date: 2020-08-19 15:34:32  Source: igfitidea

Pandas Melt Function

python pandas

Asked by slaw

I have a dataframe:

import pandas as pd

df = pd.DataFrame([[2, 4, 7, 8, 1, 3, 2013], [9, 2, 4, 5, 5, 6, 2014]],
                  columns=['Amy', 'Bob', 'Carl', 'Chris', 'Ben', 'Other', 'Year'])
   Amy  Bob  Carl  Chris  Ben  Other  Year
0    2    4     7      8    1      3  2013
1    9    2     4      5    5      6  2014

And a dictionary:

d = {'A': ['Amy'], 'B': ['Bob', 'Ben'], 'C': ['Carl', 'Chris']}

I would like to reshape my dataframe to look like this:

    Group   Name  Year  Value
 0      A    Amy  2013      2
 1      A    Amy  2014      9
 2      B    Bob  2013      4
 3      B    Bob  2014      2
 4      B    Ben  2013      1
 5      B    Ben  2014      5
 6      C   Carl  2013      7
 7      C   Carl  2014      4
 8      C  Chris  2013      8
 9      C  Chris  2014      5
10  Other         2013      3
11  Other         2014      6

Note that Other doesn't have any values in the Name column, and the order of the rows does not matter. I think I should be using the melt function, but the examples I've come across aren't very clear.

Accepted answer by TomAugspurger

melt gets you part of the way there.

In [29]: m = pd.melt(df, id_vars=['Year'], var_name='Name')

This has everything except Group. To get that, we need to reshape d a bit as well.

In [30]: d2 = {}

In [31]: for k, v in d.items():
   ....:     for item in v:
   ....:         d2[item] = k
   ....:

In [32]: d2
Out[32]: {'Amy': 'A', 'Ben': 'B', 'Bob': 'B', 'Carl': 'C', 'Chris': 'C'}
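
The same inversion of d can also be written as a dict comprehension. This is only a compact alternative to the loop above, not part of the original answer; it reuses the d from the question:

```python
# Group -> names mapping from the question.
d = {'A': ['Amy'], 'B': ['Bob', 'Ben'], 'C': ['Carl', 'Chris']}

# Invert it into a name -> group lookup, one entry per name.
d2 = {name: group for group, names in d.items() for name in names}
print(d2)
```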

In [34]: m['Group'] = m['Name'].map(d2)

In [35]: m
Out[35]:
    Year   Name  value Group
0   2013    Amy      2     A
1   2014    Amy      9     A
2   2013    Bob      4     B
3   2014    Bob      2     B
4   2013   Carl      7     C
..   ...    ...    ...   ...
7   2014  Chris      5     C
8   2013    Ben      1     B
9   2014    Ben      5     B
10  2013  Other      3   NaN
11  2014  Other      6   NaN

[12 rows x 4 columns]

Then move 'Other' from Name to Group:

In [8]: mask = m['Name'] == 'Other'

In [9]: m.loc[mask, 'Name'] = ''

In [10]: m.loc[mask, 'Group'] = 'Other'

In [11]: m
Out[11]:
    Year   Name  value  Group
0   2013    Amy      2      A
1   2014    Amy      9      A
2   2013    Bob      4      B
3   2014    Bob      2      B
4   2013   Carl      7      C
..   ...    ...    ...    ...
7   2014  Chris      5      C
8   2013    Ben      1      B
9   2014    Ben      5      B
10  2013             3  Other
11  2014             6  Other

[12 rows x 4 columns]
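
For reference, the steps in this answer can be collected into one script. This is a minimal sketch rather than the answerer's exact code: value_name='Value', the name_to_group variable, the isna-based mask (which treats any column missing from d as its own group, not just 'Other'), and the final column reordering are assumptions added here to match the layout requested in the question.

```python
import pandas as pd

df = pd.DataFrame(
    [[2, 4, 7, 8, 1, 3, 2013], [9, 2, 4, 5, 5, 6, 2014]],
    columns=['Amy', 'Bob', 'Carl', 'Chris', 'Ben', 'Other', 'Year'],
)
d = {'A': ['Amy'], 'B': ['Bob', 'Ben'], 'C': ['Carl', 'Chris']}

# Wide -> long: one row per (Year, Name) pair.
m = pd.melt(df, id_vars=['Year'], var_name='Name', value_name='Value')

# Build a name -> group lookup from d and attach it as a new column.
name_to_group = {name: group for group, names in d.items() for name in names}
m['Group'] = m['Name'].map(name_to_group)

# Assumption: any column not listed in d (here only 'Other') becomes
# its own group and gets an empty Name.
mask = m['Group'].isna()
m.loc[mask, 'Group'] = m.loc[mask, 'Name']
m.loc[mask, 'Name'] = ''

# Reorder the columns to match the layout asked for in the question.
result = m[['Group', 'Name', 'Year', 'Value']]
print(result)
```

The row order differs from the table in the question (Ben comes after Chris), which the question says is acceptable.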

Answer by HeadAndTail

Pandas melt function:

This function is useful for massaging a DataFrame into a format where one or more columns are identifier variables (id_vars), while all other columns, considered measured variables (value_vars), are "unpivoted" to the row axis, leaving just two non-identifier columns, 'variable' and 'value'.

For example:

melted = pd.melt(df, id_vars=["weekday"], 
             var_name="Person", value_name="Score")

We use melt to transform wide data into long data.
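
The snippet above does not show the frame it is applied to, so here is a self-contained sketch of the same call. The scores table and its values are hypothetical, made up only to illustrate the wide-to-long reshaping; the weekday, Person, and Score names come from the snippet.

```python
import pandas as pd

# Hypothetical wide table: one row per weekday, one column per person.
scores = pd.DataFrame({
    'weekday': ['Mon', 'Tue'],
    'Amy': [10, 12],
    'Bob': [7, 9],
})

# 'weekday' stays as an identifier column; every other column is unpivoted
# into (Person, Score) pairs, giving one row per weekday and person.
melted = pd.melt(scores, id_vars=['weekday'],
                 var_name='Person', value_name='Score')
print(melted)
```

Two weekdays and two people give four rows in the long form, with the person columns stacked one after another.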
