Original question: http://stackoverflow.com/questions/43878942/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Create a DataFrame with unique values of a Column in Pandas
Asked by Felipe Amaral Rodrigues
I'm new to Python and Pandas and I'm having trouble solving a problem. I have a DF with multiple variables, as in the example below:
SRC  Data1  Data2
AAA    180    122
BBB    168    121
CCC    165    147
DDD    140    156
EEE    152    103
AAA    170    100
CCC    166    112
DDD    116    155
EEE    179    119
And I'm expecting something like:
DF_A
SRC  Data1  Data2
AAA    180    122
AAA    170    100
DF_B
SRC  Data1  Data2
BBB    168    121
What I need is to create a DF for each value in SRC, carrying its respective data in Data1 and Data2.
I have already used pd.DataFrame(Example.SRC.unique()) to get each unique value in SRC, but I don't know if this will help me.
Thank you all!
Answered by Andy Hayden
The neat way to do this is dict(iter(g)):
In [11]: g = df.groupby("SRC", as_index=False)
In [12]: d = dict(iter(g))
In [13]: d
Out[13]:
{'AAA':    SRC  Data1  Data2
 0  AAA    180    122
 5  AAA    170    100, 'BBB':    SRC  Data1  Data2
 1  BBB    168    121, 'CCC':    SRC  Data1  Data2
 2  CCC    165    147
 6  CCC    166    112, 'DDD':    SRC  Data1  Data2
 3  DDD    140    156
 7  DDD    116    155, 'EEE':    SRC  Data1  Data2
 4  EEE    152    103
 8  EEE    179    119}
In [14]: d["AAA"]
Out[14]:
   SRC  Data1  Data2
0  AAA    180    122
5  AAA    170    100
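To reproduce the session above outside IPython, here is a minimal self-contained sketch. The name df comes from the answer; building the frame by hand from the question's sample rows is an assumption, since the question does not show how the data was loaded.

```python
import pandas as pd

# Sample rows copied from the question; the hand-built constructor is an
# assumption, the asker's actual loading code is not shown.
df = pd.DataFrame({
    "SRC":   ["AAA", "BBB", "CCC", "DDD", "EEE", "AAA", "CCC", "DDD", "EEE"],
    "Data1": [180, 168, 165, 140, 152, 170, 166, 116, 179],
    "Data2": [122, 121, 147, 156, 103, 100, 112, 155, 119],
})

# Iterating a GroupBy yields (key, sub-DataFrame) pairs,
# so dict() can consume the iterator directly.
g = df.groupby("SRC")
d = dict(iter(g))

print(sorted(d))   # the group keys
print(d["AAA"])    # rows whose SRC is "AAA", with their original index labels
```

Each value in d is an ordinary DataFrame, so d["AAA"] plays the role of the DF_A the question asks for.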
You can pull out the subgroups without copying:
In [21]: g.get_group("AAA")
Out[21]:
   SRC  Data1  Data2
0  AAA    180    122
5  AAA    170    100
Note: you can get an iterable of the keys with g.groups.keys().
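As a rough illustration of get_group and g.groups.keys() used together, the sketch below lists the keys and guards the lookup, since get_group raises KeyError for a key that is not in the data. The sample frame is rebuilt by hand from the question, and the name wanted is hypothetical.

```python
import pandas as pd

# Sample rows copied from the question (hand-built here as an assumption).
df = pd.DataFrame({
    "SRC":   ["AAA", "BBB", "CCC", "DDD", "EEE", "AAA", "CCC", "DDD", "EEE"],
    "Data1": [180, 168, 165, 140, 152, 170, 166, 116, 179],
    "Data2": [122, 121, 147, 156, 103, 100, 112, 155, 119],
})

g = df.groupby("SRC")

# g.groups maps each key to the index labels of the rows in that group.
keys = list(g.groups.keys())
print(keys)

# "wanted" is a hypothetical name; check membership before calling get_group,
# because a missing key raises KeyError.
wanted = "BBB"
if wanted in g.groups:
    sub = g.get_group(wanted)
    print(sub)
```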
Answered by MaxU
I'd generate a dictionary of DFs:
In [247]: dfs = {n:g for n,g in df.groupby('SRC')}
In [248]: dfs['AAA']
Out[248]:
   SRC  Data1  Data2
0  AAA    180    122
5  AAA    170    100
In [249]: dfs['BBB']
Out[249]:
   SRC  Data1  Data2
1  BBB    168    121
In [253]: dfs.keys()
Out[253]: dict_keys(['EEE', 'DDD', 'CCC', 'BBB', 'AAA'])
A slightly nicer way to achieve the same thing:
dfs = dict(tuple(df.groupby('SRC')))
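To check that the two spellings above give the same result, and to connect them back to the SRC.unique() call from the question, here is a small sketch. The boolean-mask variant is not from either answer; it is included only as an assumed alternative for comparison, and the names by_comprehension, by_tuple and by_mask are hypothetical.

```python
import pandas as pd

# Sample rows copied from the question (hand-built here as an assumption).
df = pd.DataFrame({
    "SRC":   ["AAA", "BBB", "CCC", "DDD", "EEE", "AAA", "CCC", "DDD", "EEE"],
    "Data1": [180, 168, 165, 140, 152, 170, 166, 116, 179],
    "Data2": [122, 121, 147, 156, 103, 100, 112, 155, 119],
})

# The two forms from the answer.
by_comprehension = {name: group for name, group in df.groupby("SRC")}
by_tuple = dict(tuple(df.groupby("SRC")))

# Not from the answers: a boolean-mask variant built on SRC.unique(),
# which filters the full frame once per unique value.
by_mask = {name: df[df["SRC"] == name] for name in df["SRC"].unique()}

# All three hold the same rows under each key.
for name in by_comprehension:
    assert by_comprehension[name].equals(by_tuple[name])
    assert by_comprehension[name].equals(by_mask[name])

print(by_tuple["DDD"])
```

The mask version rescans the frame for every unique value, whereas groupby splits it in one pass, which is one reason to prefer the groupby forms on larger data.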