Pandas drops the index on merge in Python?

Question

I am merging two dataframes using merge(..., how='left') since I want to retain only entries that match up with the "left" dataframe. The problem is that the merge operation seems to drop the index of my leftmost dataframe, as shown here:

import pandas
df1 = pandas.DataFrame([{"id": 1,
                         "name": "bob"},
                        {"id": 10,
                         "name": "sally"}])
df1 = df1.set_index("id")
df2 = pandas.DataFrame([{"name": "bob",
                         "age": 10},
                        {"name": "sally",
                         "age": 11}])

print("df1 premerge: ")
print(df1)
df1 = df1.merge(df2, on=["name"],
                how="left")
print("merged: ")
print(df1)
# This is not "id"
print(df1.index)
# And there's no "id" field
assert "id" not in df1.columns

Before the merge, df1 was indexed by id. After the merge operation, there's just the default numeric index for the merged dataframe, and the id field was dropped. How can I do this kind of merge operation but retain the index of the leftmost dataframe?

To clarify: I want all the columns of df2 to be added to every entry in df1 that has the matching id value. If an entry in df2 has an id value not in df1, then that shouldn't be merged in (hence the how='left').

Edit: as a hack, I could call df1.reset_index() before merging and then set the index again afterwards, but I'd prefer not to if possible; it seems like merge shouldn't have to drop the index. Thanks.

Answer 1

Answered by Snakes McGee

You've already pointed out doing a reset_index before the merge and a set_index afterwards, which works. The only way I know of to preserve indices across a merge is for the merge to involve an index on at least one of the data frames being merged. So here, you could do:

In [403]: df2 = df2.set_index('name')

In [404]: df1.merge(df2, left_on='name', right_index=True)
Out[404]: 
     name  age
id            
1     bob   10
10  sally   11

This merges df2's index, which we've taken from its 'name' column, against the 'name' column on df1.
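Here is a self-contained sketch of the same idea with how='left' added, since that is what the question asked for. The extra rows for "carol" and "dave" are made up for illustration and are not part of the original data. The DataFrame.join call at the end is an alternative spelling that I'd expect to give the same result here, since join defaults to a left join against the other frame's index.

```python
import pandas as pd

# Left frame, indexed by "id" as in the question ("carol" is a hypothetical extra row).
df1 = pd.DataFrame(
    {"id": [1, 10, 20], "name": ["bob", "sally", "carol"]}
).set_index("id")

# Right frame, indexed by "name" ("dave" is a hypothetical row with no match in df1).
df2 = pd.DataFrame(
    {"name": ["bob", "sally", "dave"], "age": [10, 11, 12]}
).set_index("name")

# Match df1's "name" column against df2's index; keep every row of df1.
merged = df1.merge(df2, left_on="name", right_index=True, how="left")
print(merged)
print(merged.index)

# Alternative: join on a column of the caller against the index of the other frame.
joined = df1.join(df2, on="name")
print(joined)
```

The unmatched "carol" row stays in the result with a missing age, and "dave" is left out because he only appears on the right.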

This makes some sense, because otherwise the index of the resulting dataframe is ambiguous as it could come from either dataframe.
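For completeness, the reset_index / set_index workaround mentioned at the start of this answer can be sketched as follows. This is just one way to write it, using the same two frames as the question: the index is temporarily turned into an ordinary "id" column so that it survives the column-on-column merge, then restored.

```python
import pandas as pd

# Same data as in the question.
df1 = pd.DataFrame({"id": [1, 10], "name": ["bob", "sally"]}).set_index("id")
df2 = pd.DataFrame({"name": ["bob", "sally"], "age": [10, 11]})

merged = (
    df1.reset_index()                    # "id" becomes a regular column
       .merge(df2, on="name", how="left")  # column-on-column merge resets the index
       .set_index("id")                  # put "id" back as the index
)
print(merged)
```

This keeps df2 untouched, at the cost of two extra steps around the merge.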
