Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/40880279/

Time: 2020-09-14 02:32:42

How to use pandas dataframes from user input

Tags: python, oop, pandas, boolean

Asked by Dave

Using Python 3 and pandas version 0.18.1.

I am trying to make my program more dynamic by giving the user options to filter data from a dataframe.

My questions are:

1) How do I make the user's choices available for filtering the dataframe?

2) Is there a better way to do this? Maybe with functions or classes?

Assume my df is the following:

df.dtypes

PIID    object    
fy      object
zone    object

If fy is grouped:

df.groupby('fy').PIID.count()

fy
2014    38542
2015    33629
2016    32789

If zone is grouped:

df.groupby('zone').PIID.count()

AZW - Acquisition Zone West        3909
NAZ - Northern Acquisition Zone    1167
SAZ - Southern Acquisition Zone    2983

Normally I can just create a new dataframe with filters by doing the following:

year = df['fy'] == '2016'    
zone = df['zone'] == 'AZW - Acquisition Zone West'

newdf = df[year & zone]

But how can I make this more dynamic by providing user options?

At this point I give the user some options and build boolean masks for fy:

print ('What is the interested year?')
print ('1. 2014')
print ('2. 2015')
print ('3. 2016')

year = input('> ')

if year == '1':
    year1 = df['fy'] == '2014'
elif year == '2':
    year2 = df['fy'] == '2015'

And some booleans for zone:

print ('What is the interested zone?')
print ('1. AZW - Acquisition Zone West')
print ('2. NAZ - Northern Acquisition Zone')
print ('3. SAZ - Southern Acquisition Zone')


zone = input('> ')

if zone == '1':
    zone1 = df['zone'] == 'AZW - Acquisition Zone West'
elif zone == '2':
    zone2 = df['zone'] == 'NAZ - Northern Acquisition Zone'

At this point I don't know how to pass the user's choices into the filter:

newdf = df[choice1 & choice2]  

where choice1 is the year and choice2 is the zone.

Thanks in advance for any help!

Accepted answer by semore_1267

Here's my stab at it. You will need to create your own error messages and handlers for incorrect input, though.

import pandas as pd

df = pd.DataFrame({"PIID":[38542,33629,32789], 
                   "fy":["2014","2015","2016"], 
                   "zone":["AZW - Acquisition Zone West", "NAZ - Northern Acquisition Zone", "SAZ - Southern Acquisition Zone"]})


def get_choice(data, column):
    """
    Prints the unique values of `column` as a numbered menu and
    returns the value the user picks.
    """
    # Pair each unique value with a menu number: (0, value0), (1, value1), ...
    choices = list(enumerate(data[column].unique()))
    print("What '%s' would you like?\n" % column)
    for v in choices:
        print("%s.  %s" % v)
    user_input = input("Answer: ")
    # Look up the value whose menu number matches the input
    user_answer = [val[1] for val in choices if val[0] == int(user_input)][0]
    print("'%s' = %s\n" % (column, user_answer))  # Just tells the user what they answered
    return user_answer

def main():

    year_input = get_choice(data=df, column="fy")
    zone_input = get_choice(data=df, column="zone")
    newdf = df.loc[(df["fy"]==year_input)&(df["zone"]==zone_input)]
    print(newdf)

if __name__ == "__main__":
    main()

So if you enter "0" at the first prompt (year) and "0" at the second prompt (zone), your output should look something like:

    PIID    fy                         zone
0  38542  2014  AZW - Acquisition Zone West
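
The answer leaves handling of incorrect input to the reader. One possible way to do it is sketched below. The function name get_valid_choice and its read parameter are illustrative assumptions, not part of the original answer; read defaults to the built-in input and exists only so the prompt can be driven from a script or a test.

```python
import pandas as pd


def get_valid_choice(data, column, read=input):
    """Ask repeatedly until the user enters the number of a listed value.

    `read` is a hypothetical hook (defaults to the built-in input) that
    receives the prompt text and returns the user's answer as a string.
    """
    options = list(data[column].unique())
    print("What '%s' would you like?" % column)
    for num, value in enumerate(options):
        print("%s.  %s" % (num, value))
    while True:
        answer = read("Answer: ")
        try:
            index = int(answer)
        except ValueError:
            index = -1  # not a number at all, treat as out of range
        if 0 <= index < len(options):
            return options[index]
        print("Please enter a number between 0 and %d." % (len(options) - 1))


# Small example frame with the same 'fy' values as in the answer
df = pd.DataFrame({"fy": ["2014", "2015", "2016"]})
```

Calling get_valid_choice(df, "fy") prompts on the console and keeps asking until the answer is one of the listed numbers.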

It should scale, but like I said, you will obviously have to add your own custom tweaks. This should be enough for you to generalize from, and it solves the problem posed in your question. After reading your code, I'd recommend applying the DRY principle (Don't Repeat Yourself) in your work, for example by avoiding long chains of if statements. Hope this helps.
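
To illustrate how this might scale beyond two columns, here is a minimal sketch that combines any number of column filters without writing one boolean expression per column. The helper name filter_by_choices and the dict-of-choices shape are assumptions made for this example; they do not appear in the original answer.

```python
from functools import reduce
import operator

import pandas as pd


def filter_by_choices(data, choices):
    """Return the rows of `data` where every column equals its chosen value.

    `choices` maps column name -> selected value, e.g. {"fy": "2016"}.
    """
    if not choices:
        return data  # nothing selected, so nothing to filter on
    masks = [data[column] == value for column, value in choices.items()]
    # AND all boolean masks together, equivalent to mask1 & mask2 & ...
    return data[reduce(operator.and_, masks)]


df = pd.DataFrame({
    "PIID": [38542, 33629, 32789],
    "fy": ["2014", "2015", "2016"],
    "zone": ["AZW - Acquisition Zone West",
             "NAZ - Northern Acquisition Zone",
             "SAZ - Southern Acquisition Zone"],
})

newdf = filter_by_choices(df, {"fy": "2014", "zone": "AZW - Acquisition Zone West"})
print(newdf)
```

In an interactive program the dictionary could be built from the prompts, for instance {col: get_choice(df, col) for col in ["fy", "zone"]}, so adding another filter column only means adding its name to the list.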