pandas: append a tuple to a dataframe as a row

Notice: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute the content to the original authors. Original: http://stackoverflow.com/questions/32876284/

Date: 2020-09-13 23:57:16  Source: igfitidea


Tags: pandas, append, row, tuples

Asked by Data Enthusiast

I am looking for a way to add rows to a dataframe. Here is the data I have: a grouped object (obtained by grouping a dataframe on month and year, i.e. in this grouped object the key is [month, year] and the value is all the rows/dates in that month and year).

I want to extract all the (month, year) combinations and put them in a new dataframe. Issue: when I iterate over the grouped object, each key is a (month, year) tuple, so I converted the tuple to a list and added it to a dataframe using the append command. Instead of being added as rows:

    1 2014
    2 2014
    3 2014

it was added in a single column:

    0     1
    1  2014
    0     2
    1  2014
    0     3
    1  2014
    ...

I want to store these values in a new dataframe. Here is how I want the new dataframe to look:

    month  year
    1      2014
    2      2014
    3      2014

I tried converting the tuple to a list, and then tried various other things such as pivoting. Any input would be really helpful.

Here is the sample code:

    df=df.groupby(['month','year'])
    df = pd.DataFrame()
    for key, value in df:
            print "type of key is:",type(key)
            print "type of list(key) is:",type(list(key))
            df = df.append(list(key))
    print df

Answered by Andy Hayden

When you do the groupby, the resulting MultiIndex is available as g.grouper.result_index:

In [11]: df = pd.DataFrame([[1, 2014, 42], [1, 2014, 44], [2, 2014, 23]], columns=['month', 'year', 'val'])

In [12]: df
Out[12]:
   month  year  val
0      1  2014   42
1      1  2014   44
2      2  2014   23

In [13]: g = df.groupby(['month', 'year'])

In [14]: g.grouper.result_index
Out[14]:
MultiIndex(levels=[[1, 2], [2014]],
           labels=[[0, 1], [0, 0]],
           names=['month', 'year'])

Often this will be sufficient, and you won't need a DataFrame. If you do need one, one way is the following:

In [21]: pd.DataFrame(index=g.grouper.result_index).reset_index()
Out[21]:
   month  year
0      1  2014
1      2  2014

I thought there was a method to get this, but can't recall it.
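
One possible way to get the same frame using only public methods is sketched below; it is not necessarily the method alluded to above. g.size() returns a Series indexed by the group keys, and MultiIndex.to_frame(index=False) turns the index levels into ordinary columns. The sketch assumes a pandas version where to_frame accepts index=False (0.24 or later, as far as I recall) and rebuilds the sample frame from the example above; the names keys and combos are illustrative.

```python
import pandas as pd

# Sample data copied from the example above.
df = pd.DataFrame(
    [[1, 2014, 42], [1, 2014, 44], [2, 2014, 23]],
    columns=["month", "year", "val"],
)
g = df.groupby(["month", "year"])

# size() yields one entry per group, indexed by the (month, year) keys.
keys = g.size().index

# Convert the MultiIndex levels into regular columns with a fresh 0..n-1 index.
combos = keys.to_frame(index=False)
print(combos)
```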


If you really want the tuples, you can use .values or to_series:

In [31]: g.grouper.result_index.values
Out[31]: array([(1, 2014), (2, 2014)], dtype=object)

In [32]: g.grouper.result_index.to_series()
Out[32]:
month  year
1      2014    (1, 2014)
2      2014    (2, 2014)
dtype: object
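
If a plain Python list of the key tuples is enough, iterating over the groupby object directly may also work, since each iteration yields a (key, group) pair. This is a minimal sketch that assumes the same sample frame as above; key_tuples is an illustrative name.

```python
import pandas as pd

# Sample data copied from the example above.
df = pd.DataFrame(
    [[1, 2014, 42], [1, 2014, 44], [2, 2014, 23]],
    columns=["month", "year", "val"],
)
g = df.groupby(["month", "year"])

# Each iteration yields ((month, year), sub_frame); keep only the key.
key_tuples = [key for key, _ in g]
print(key_tuples)
```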

Answered by chrisb

If all you want are the unique values, you could use drop_duplicates:

In [29]: df[['month','year']].drop_duplicates()
Out[29]: 
   month  year
0      1  2014
2      2  2014
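
Note that drop_duplicates keeps the original row labels (0 and 2 above). If a clean 0..n-1 index and a sorted order are wanted, something along these lines could be used; this is a sketch on the same sample data, and the sort order chosen here is an assumption about what is desired.

```python
import pandas as pd

# Sample data copied from the example above.
df = pd.DataFrame(
    [[1, 2014, 42], [1, 2014, 44], [2, 2014, 23]],
    columns=["month", "year", "val"],
)

combos = (
    df[["month", "year"]]
    .drop_duplicates()               # keep the first row of each (month, year) pair
    .sort_values(["year", "month"])  # assumed ordering: by year, then month
    .reset_index(drop=True)          # replace the leftover labels with 0..n-1
)
print(combos)
```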

Answered by abstract

You had initially declared both the groupby and empty dataframe as df. Here's a modified version of your code that allows you to append a tuple as a dataframe row.


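g = df.groupby(['month', 'year'])
df = pd.DataFrame()
for (key1, key2), value in g:
    # Each group key is a (month, year) tuple; wrap it in a labelled Series
    row_series = pd.Series((key1, key2), index=['month', 'year'])
    # DataFrame.append is only available in pandas < 2.0
    df = df.append(row_series, ignore_index=True)
print(df)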
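
On pandas 2.0 and later, where DataFrame.append is no longer available, a common alternative is to collect the tuples in a list and build the dataframe once at the end. The following is a minimal sketch of that idea, not the answer author's code; it assumes the sample frame used in the earlier answers, and the names rows and result are illustrative.

```python
import pandas as pd

# Sample data copied from the earlier answer.
df = pd.DataFrame(
    [[1, 2014, 42], [1, 2014, 44], [2, 2014, 23]],
    columns=["month", "year", "val"],
)
g = df.groupby(["month", "year"])

# Collect one (month, year) tuple per group.
rows = []
for (month, year), group in g:
    rows.append((month, year))

# A list of tuples is read as a list of rows, so each tuple becomes one row.
result = pd.DataFrame(rows, columns=["month", "year"])
print(result)
```

Building the frame once also avoids copying the whole dataframe on every loop iteration, which is what repeated appends do.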