Note: this page is a translation of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/34063779/

Date: 2020-08-19 14:25:25  Source: igfitidea

pandas, how to filter dataframe by column value

python pandas

Asked by GoingMyWay

I have a DataFrame like this:

>>> df
    id      name       score    subject
    0001    'bob'      100      'math'
    0001    'bob'       67      'science'
    0001    'bob'       63      'bio'
    0002    'Hyman'     67      'math'
    0002    'Hyman'     98      'science'
    0002    'Hyman'     90      'bio'
    0003    'Hyman'     60      'math'
    0003    'Hyman'     78      'science'
    0003    'rose'      87      'bio'

I want to filter each id's data into a new DataFrame and write it to an Excel file named after that id. So the df above would be split into 3 DataFrames whose ids are 0001, 0002 and 0003, and each DataFrame would be written to its own Excel file.

Accepted answer by Tasos

First, get a list of the unique ID values:

import numpy as np
uniquevalues = np.unique(df[['id']].values)
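A small sketch of what this step returns, and how it compares with pandas' own `Series.unique`. The sample data here is hypothetical and assumes the IDs are stored as strings so that the leading zeros survive.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; IDs are strings so the leading zeros are kept.
df = pd.DataFrame({"id": ["0002", "0001", "0002", "0003", "0001"]})

# np.unique returns the distinct values in sorted order.
sorted_ids = np.unique(df["id"].to_numpy())

# Series.unique returns the distinct values in order of first appearance.
seen_ids = df["id"].unique()

print(list(sorted_ids))
print(list(seen_ids))
```

Either result can be iterated over; the only difference that matters here is the order in which the IDs are visited.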

Then iterate over it and export the rows for each ID to a CSV file:

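for uid in uniquevalues:  # "uid" avoids shadowing the built-in id()
    # keep only the rows that belong to this ID
    newdf = df[df['id'] == uid]
    # str() keeps the file name concatenation working for non-string IDs
    newdf.to_csv("dataframe " + str(uid) + ".csv", sep='\t')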

If you have only those three IDs, you can skip the for loop and do the same thing manually, like this:

newdf = df[df['id'] == "0001"]
newdf.to_csv("dataframe0001.csv", sep='\t')
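The unique-values loop in this answer could also be written with `DataFrame.groupby`, which yields one sub-DataFrame per ID in a single pass. This is only a sketch of that alternative, not the answer's own method: the sample data is a reduced, hypothetical version of the question's table, and the temporary output directory and the `dataframe_<id>.csv` naming are assumptions.

```python
import os
import tempfile

import pandas as pd

# Reduced, hypothetical version of the question's data; IDs are strings.
df = pd.DataFrame({
    "id": ["0001", "0001", "0002", "0002", "0003"],
    "name": ["bob", "bob", "Hyman", "Hyman", "rose"],
    "score": [100, 67, 67, 98, 87],
    "subject": ["math", "science", "math", "science", "bio"],
})

# Assumed output location: a fresh temporary directory.
out_dir = tempfile.mkdtemp()

written = {}
# groupby("id") yields (id value, rows with that id) pairs.
for uid, group in df.groupby("id"):
    path = os.path.join(out_dir, f"dataframe_{uid}.csv")
    group.to_csv(path, sep="\t", index=False)
    written[uid] = path

print(sorted(written))
```

For the Excel output the question asks about, `group.to_excel(...)` with an `.xlsx` path would be the analogous call; it needs an Excel writer engine such as openpyxl to be installed.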

Answer by Fabio Lamanna

IIUC, in your example you can just filter the DataFrame by id with:

# Assumes the ids are stored as strings; a bare 0001 is a syntax error in Python 3
df1 = df[df['id'] == '0001']

and the same for the other id values.
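To make the boolean-mask filter concrete, here is a minimal sketch on hypothetical data, assuming the IDs are stored as strings. It also shows `Series.isin`, which may be handy when several IDs are wanted at once.

```python
import pandas as pd

# Hypothetical sample data; IDs are strings so "0001" keeps its zeros.
df = pd.DataFrame({
    "id": ["0001", "0001", "0002", "0003"],
    "score": [100, 67, 98, 87],
})

# df["id"] == "0001" builds a boolean mask; indexing with it keeps the True rows.
df1 = df[df["id"] == "0001"]

# isin() builds the same kind of mask for a list of IDs.
subset = df[df["id"].isin(["0001", "0003"])]

print(df1)
print(subset)
```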

Answer by A2Ben415

I needed to convert the df column to str first; otherwise I kept getting dtype errors.

df['sample'] = df['sample'].apply(str)
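A sketch of why the conversion can matter when the ID ends up in a file name. It assumes, hypothetically, that the `id` column was read as integers (so the leading zeros were lost) and that the IDs should be four characters wide. It uses `astype(str)`, which is a common alternative to `apply(str)`.

```python
import pandas as pd

# Hypothetical data: the ids were parsed as integers, so 0001 became 1.
df = pd.DataFrame({"id": [1, 1, 2, 3], "score": [100, 67, 98, 87]})

# Convert to strings, then pad back to the assumed width of 4 characters.
df["id"] = df["id"].astype(str).str.zfill(4)

# String concatenation now works; with integer ids it would raise TypeError.
file_names = ["dataframe" + uid + ".csv" for uid in df["id"].unique()]

print(file_names)
```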