Python: using merge on a column and an index in Pandas

Question

Asked by user2242044

I have two separate dataframes that share a project number. In type_df, the project number is the index. In time_df, the project number is a column. I would like to count the number of rows in the merged result that have a Project Type of 'Type 2'. I am trying to do this with pandas.merge(). It works great when the key is a column in both dataframes, but not when it is the index in one of them. I'm not sure how to reference the index, or whether merge is even the right way to do this.

import pandas as pd
type_df = pd.DataFrame(data = [['Type 1'], ['Type 2']], 
                       columns=['Project Type'], 
                       index=['Project2', 'Project1'])
time_df = pd.DataFrame(data = [['Project1', 13], ['Project1', 12], 
                               ['Project2', 41]], 
                       columns=['Project', 'Time'])
merged = pd.merge(time_df,type_df, on=[index,'Project'])
print(merged[merged['Project Type'] == 'Type 2']['Project Type'].count())

Error:

NameError: name 'index' is not defined

Desired Output:

Answer 1

Accepted answer by maxymoo

If you want to use an index in your merge, specify left_index=True or right_index=True for the side whose index holds the key, and use left_on or right_on for the key column on the other side. For you it should look something like this:

merged = pd.merge(type_df, time_df, left_index=True, right_on='Project')
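
As a minimal, self-contained sketch of that call on the question's data, the snippet below rebuilds both dataframes, merges the index of type_df against the 'Project' column of time_df, and counts the 'Type 2' rows. The variable name type2_count is only illustrative.

```python
import pandas as pd

# Dataframes copied from the question.
type_df = pd.DataFrame(data=[['Type 1'], ['Type 2']],
                       columns=['Project Type'],
                       index=['Project2', 'Project1'])
time_df = pd.DataFrame(data=[['Project1', 13], ['Project1', 12],
                             ['Project2', 41]],
                       columns=['Project', 'Time'])

# Match type_df's index against time_df's 'Project' column.
# pd.merge performs an inner join unless `how` says otherwise.
merged = pd.merge(type_df, time_df, left_index=True, right_on='Project')

# Count the merged rows whose project type is 'Type 2'.
type2_count = (merged['Project Type'] == 'Type 2').sum()
print(type2_count)  # 2
```

Swapping the arguments works the same way: pd.merge(time_df, type_df, left_on='Project', right_index=True).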

Answer 2

Answer by dermen

To merge with on=, both dataframes need a key column with the same name.

In this case, just make a 'Project' column for type_df from its index, then merge on that:

# Copy the index into a regular column so both frames share a 'Project' key.
type_df['Project'] = type_df.index.values
merged = pd.merge(time_df, type_df, on='Project', how='inner')
merged
#     Project  Time Project Type
# 0  Project1    13       Type 2
# 1  Project1    12       Type 2
# 2  Project2    41       Type 1

print(merged[merged['Project Type'] == 'Type 2']['Project Type'].count())
# 2
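
If you would rather not add a column to type_df itself, one possible variant is to build a temporary copy whose index has been turned into a 'Project' column. This is only a sketch: rename_axis and reset_index are standard DataFrame methods, while the names type_cols and type2_count are illustrative.

```python
import pandas as pd

# Dataframes copied from the question.
type_df = pd.DataFrame(data=[['Type 1'], ['Type 2']],
                       columns=['Project Type'],
                       index=['Project2', 'Project1'])
time_df = pd.DataFrame(data=[['Project1', 13], ['Project1', 12],
                             ['Project2', 41]],
                       columns=['Project', 'Time'])

# Name the index 'Project' and move it into a column on a new frame;
# type_df itself is left unchanged.
type_cols = type_df.rename_axis('Project').reset_index()

merged = pd.merge(time_df, type_cols, on='Project', how='inner')
type2_count = (merged['Project Type'] == 'Type 2').sum()
print(type2_count)  # 2
```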

Answer 3

Answer by jezrael

Another solution is to use DataFrame.join, which matches a column of the calling dataframe against the index of the other one:

df3 = time_df.join(type_df, on='Project')
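
DataFrame.join looks up the on column in the calling dataframe and matches it against the index of the dataframe passed in, so with the question's data the caller has to be time_df. A minimal sketch, with joined and type2_count as illustrative names:

```python
import pandas as pd

# Dataframes copied from the question.
type_df = pd.DataFrame(data=[['Type 1'], ['Type 2']],
                       columns=['Project Type'],
                       index=['Project2', 'Project1'])
time_df = pd.DataFrame(data=[['Project1', 13], ['Project1', 12],
                             ['Project2', 41]],
                       columns=['Project', 'Time'])

# time_df's 'Project' column is matched against type_df's index.
# join is a left join by default, so every row of time_df is kept.
joined = time_df.join(type_df, on='Project')

type2_count = (joined['Project Type'] == 'Type 2').sum()
print(type2_count)  # 2
```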

In pandas 0.23.0+, the on, left_on, and right_on parameters may refer to either column names or index level names:

left_index = pd.Index(['K0', 'K0', 'K1', 'K2'], name='key1')
left = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                     'B': ['B0', 'B1', 'B2', 'B3'],
                     'key2': ['K0', 'K1', 'K0', 'K1']},
                    index=left_index)

right_index = pd.Index(['K0', 'K1', 'K2', 'K2'], name='key1')
right = pd.DataFrame({'C': ['C0', 'C1', 'C2', 'C3'],
                      'D': ['D0', 'D1', 'D2', 'D3'],
                      'key2': ['K0', 'K0', 'K0', 'K1']},
                     index=right_index)

print(left)
       A   B key2
key1             
K0    A0  B0   K0
K0    A1  B1   K1
K1    A2  B2   K0
K2    A3  B3   K1

print(right)
       C   D key2
key1             
K0    C0  D0   K0
K1    C1  D1   K0
K2    C2  D2   K0
K2    C3  D3   K1

# 'key1' is an index level and 'key2' is a column in both frames.
df = left.merge(right, on=['key1', 'key2'])
print(df)
       A   B key2   C   D
key1                     
K0    A0  B0   K0  C0  D0
K1    A2  B2   K0  C1  D1
K2    A3  B3   K1  C3  D3
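
Applied to the question's dataframes, the same feature lets a plain on='Project' work once the index of type_df carries that name. This is a sketch that assumes pandas 0.23.0 or later; type_named and type2_count are illustrative names.

```python
import pandas as pd

# Dataframes copied from the question.
type_df = pd.DataFrame(data=[['Type 1'], ['Type 2']],
                       columns=['Project Type'],
                       index=['Project2', 'Project1'])
time_df = pd.DataFrame(data=[['Project1', 13], ['Project1', 12],
                             ['Project2', 41]],
                       columns=['Project', 'Time'])

# Give the index the same name as the key column in time_df.
type_named = type_df.rename_axis('Project')

# 'Project' resolves to a column in time_df and to the index level
# in type_named (requires pandas 0.23.0+).
merged = pd.merge(time_df, type_named, on='Project')

type2_count = (merged['Project Type'] == 'Type 2').sum()
print(type2_count)  # 2
```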
