原文地址: http://stackoverflow.com/questions/26112785/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
'Where clause' on a list in a pandas Dataframe
Asked by woshitom
I have a pandas DataFrame named df:
email | list
___________________________
[email protected] | [0,1]
[email protected] | [2,1]
[email protected] | [0,3]
[email protected] | [0,0]
[email protected] | [0,1]
I want to retrieve all the rows of df whose list is [0,0].
I'm doing:
df2 = df[df['list'] == [0,0]]
But I'm getting the following error:
ValueError: Arrays were different lengths: 5 vs 2
Answered by firelynx
The reason this is not working:
df2 = df[df['list'] == [0, 0]]
is because df['list'] is a column of 5 elements and [0, 0] is a list of two elements. pandas compares the two element by element, so the mismatched lengths make it fail while evaluating your mask:
df['list'] == [0, 0]
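To make the failure concrete, here is a minimal sketch that rebuilds a frame shaped like the one in the question and triggers the error. The email addresses are hypothetical placeholders, since the real ones are not shown, and the exact wording of the error message depends on the pandas version.

```python
import pandas as pd

# Hypothetical stand-in data: the real addresses are not shown in the question.
df = pd.DataFrame({
    "email": ["a@example.com", "b@example.com", "c@example.com",
              "d@example.com", "e@example.com"],
    "list": [[0, 1], [2, 1], [0, 3], [0, 0], [0, 1]],
})

try:
    # 5-element column compared element-wise with a 2-element list
    mask = df["list"] == [0, 0]
    error_message = None
except ValueError as exc:
    # The message text differs between pandas versions
    error_message = str(exc)

print(error_message)
```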
Updated proper solution
I believe the fastest way of solving this is to create a Series of [0, 0] elements with the same length as your DataFrame, and compare that Series to your column:
df['list'] == pd.Series([[0, 0]] * len(df))
0 False
1 False
2 False
3 True
4 False
This creates a mask by comparing each element of the column to [0, 0], instead of comparing the whole column df['list'] to [0, 0].
Using this mask you can then create your new dataframe
mask = df['list'] == pd.Series([[0, 0]] * len(df))
df2 = df[mask]
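As a self-contained sketch of this approach, the snippet below builds the frame, the comparison Series, and the filtered result in one place. It assumes hypothetical placeholder addresses, and it passes index=df.index when building the comparison Series; that extra argument is not in the answer, but two Series are compared by label, so it keeps the comparison working when df does not use the default 0..n-1 index.

```python
import pandas as pd

# Hypothetical stand-in data with a non-default index, to show label alignment.
df = pd.DataFrame(
    {
        "email": ["a@example.com", "b@example.com", "c@example.com",
                  "d@example.com", "e@example.com"],
        "list": [[0, 1], [2, 1], [0, 3], [0, 0], [0, 1]],
    },
    index=[10, 11, 12, 13, 14],
)

# One [0, 0] per row, labelled like df so the two Series line up.
target = pd.Series([[0, 0]] * len(df), index=df.index)

# Each list in the column is compared with the list at the same label.
mask = df["list"] == target
df2 = df[mask]

print(df2)
```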
Answered by ragingSloth
You're comparing the whole column of lists to an individual entry. You should instead filter df by using iterrows(). iterrows() creates a generator which yields (index, row) tuples whose second entry is a Series holding that row's column values, which you can index like a dictionary. You can iterate through them, match against each one, and then build a new DataFrame.
import pandas

# Collect the matching rows column by column
df2 = {'email': [], 'list': []}
for row in df.iterrows():
    row_dictionary = row[1]  # row is an (index, Series) tuple
    if row_dictionary['list'] == [0, 0]:
        for key in df2.keys():
            df2[key].append(row_dictionary[key])
df2 = pandas.DataFrame.from_dict(df2)
By using the dictionary's keys to populate it you can use this method on any dataframe.
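The same row-by-row test can also be written without an explicit loop. The sketch below is an alternative that neither answer proposes: it uses Series.apply to compare each list with [0, 0] and then filters with the resulting boolean mask. The addresses are hypothetical placeholders.

```python
import pandas as pd

# Hypothetical stand-in data: the real addresses are not shown in the question.
df = pd.DataFrame({
    "email": ["a@example.com", "b@example.com", "c@example.com",
              "d@example.com", "e@example.com"],
    "list": [[0, 1], [2, 1], [0, 3], [0, 0], [0, 1]],
})

# Plain Python list equality, applied to each cell of the column.
mask = df["list"].apply(lambda value: value == [0, 0])
df2 = df[mask]

print(df2)
```

Unlike the loop, this keeps the original index labels and every column of df without naming them.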

