Note: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/43878942/

Create a DataFrame with unique values of a Column in Pandas

python pandas

Asked by Felipe Amaral Rodrigues

I'm new to Python and Pandas and I'm having trouble solving a problem. I have a DF with multiple variables, as in the example below:

SRC Data1 Data2
AAA  180   122
BBB  168   121
CCC  165   147
DDD  140   156
EEE  152   103
AAA  170   100
CCC  166   112
DDD  116   155
EEE  179   119

And I'm expecting something like:

DF_A

SRC    Data1   Data2
AAA    180     122
AAA    170     100

DF_B

SRC    Data1   Data2
BBB     168     121

What I need is to create a DF for each value in SRC, carrying the respective data in Data1 and Data2.

I have already used pd.DataFrame(Example.SRC.unique()) to get each unique value in SRC, but I don't know if this will help me.

Thank you all!

Answered by Andy Hayden

The neat way to do this is dict(iter(g)):

In [11]: g = df.groupby("SRC", as_index=False)

In [12]: d = dict(iter(g))

In [13]: d
Out[13]:
{'AAA':    SRC  Data1  Data2
 0  AAA    180    122
 5  AAA    170    100, 'BBB':    SRC  Data1  Data2
 1  BBB    168    121, 'CCC':    SRC  Data1  Data2
 2  CCC    165    147
 6  CCC    166    112, 'DDD':    SRC  Data1  Data2
 3  DDD    140    156
 7  DDD    116    155, 'EEE':    SRC  Data1  Data2
 4  EEE    152    103
 8  EEE    179    119}

In [14]: d["AAA"]
Out[14]:
   SRC  Data1  Data2
0  AAA    180    122
5  AAA    170    100
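
For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample rows in the question; the names df_a and df_b are only illustrative, chosen to mirror the DF_A and DF_B the question asks for. It drops the as_index=False argument, which should not matter for this use since each group keeps the SRC column either way.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "SRC":   ["AAA", "BBB", "CCC", "DDD", "EEE", "AAA", "CCC", "DDD", "EEE"],
    "Data1": [180, 168, 165, 140, 152, 170, 166, 116, 179],
    "Data2": [122, 121, 147, 156, 103, 100, 112, 155, 119],
})

g = df.groupby("SRC")

# Iterating over a GroupBy yields (key, sub-DataFrame) pairs,
# which is exactly the shape dict() accepts.
d = dict(iter(g))

# Illustrative names only: one DataFrame per SRC value.
df_a = d["AAA"]
df_b = d["BBB"]

print(df_a)
print(df_b)
```

Each sub-DataFrame keeps the row labels of the original frame (0 and 5 for "AAA"); calling reset_index(drop=True) on a group is one way to renumber them if that matters.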

You can pull out the subgroups without copying:

In [21]: g.get_group("AAA")
Out[21]:
   SRC  Data1  Data2
0  AAA    180    122
5  AAA    170    100

Note: you can get an iterable of the keys with g.groups.keys().
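
As a rough sketch of how the keys and get_group fit together, the loop below visits every SRC value and fetches its rows on demand instead of building the whole dictionary up front. The DataFrame is again rebuilt from the question's sample data, and the sizes dictionary is just an illustrative way to show what each group contains.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "SRC":   ["AAA", "BBB", "CCC", "DDD", "EEE", "AAA", "CCC", "DDD", "EEE"],
    "Data1": [180, 168, 165, 140, 152, 170, 166, 116, 179],
    "Data2": [122, 121, 147, 156, 103, 100, 112, 155, 119],
})

g = df.groupby("SRC")

# g.groups maps each key to the row labels belonging to that group.
keys = list(g.groups.keys())

# Illustrative helper dict: number of rows per SRC value.
sizes = {}
for key in keys:
    sub = g.get_group(key)  # rows whose SRC equals this key
    sizes[key] = len(sub)

print(sizes)
```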

Answered by MaxU

I'd generate a dictionary of DFs:

In [247]: dfs = {n:g for n,g in df.groupby('SRC')}

In [248]: dfs['AAA']
Out[248]:
   SRC  Data1  Data2
0  AAA    180    122
5  AAA    170    100

In [249]: dfs['BBB']
Out[249]:
   SRC  Data1  Data2
1  BBB    168    121

In [253]: dfs.keys()
Out[253]: dict_keys(['EEE', 'DDD', 'CCC', 'BBB', 'AAA'])

A slightly nicer way to achieve the same thing:

dfs = dict(tuple(df.groupby('SRC')))
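
To tie this back to the unique() call tried in the question, here is a small sketch that builds the dictionary both ways and checks that they agree. The boolean-mask version is not from either answer; it is only an assumption about how the unique values could be used directly, and dfs_by_mask is a made-up name.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "SRC":   ["AAA", "BBB", "CCC", "DDD", "EEE", "AAA", "CCC", "DDD", "EEE"],
    "Data1": [180, 168, 165, 140, 152, 170, 166, 116, 179],
    "Data2": [122, 121, 147, 156, 103, 100, 112, 155, 119],
})

# groupby version from the answer: one DataFrame per SRC value.
dfs = dict(tuple(df.groupby("SRC")))

# Hypothetical alternative: filter the frame once per unique SRC value.
dfs_by_mask = {key: df[df["SRC"] == key] for key in df["SRC"].unique()}

# Both dictionaries should hold the same rows for every key.
same = all(dfs[key].equals(dfs_by_mask[key]) for key in dfs)
print(same)
```

The mask version scans the whole column once per unique value, so the groupby form is likely the better choice when SRC has many distinct values.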