pandas drops index on merge in Python?
Note: this page is adapted from a popular StackOverFlow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverFlow
Original question: http://stackoverflow.com/questions/15661455/
Question
I am merging two dataframes using merge(..., how='left') since I want to retain only entries that match up with the "left" dataframe. The problem is that the merge operation seems to drop the index of my leftmost dataframe, as shown here:
import pandas

df1 = pandas.DataFrame([{"id": 1, "name": "bob"},
                        {"id": 10, "name": "sally"}])
df1 = df1.set_index("id")
df2 = pandas.DataFrame([{"name": "bob", "age": 10},
                        {"name": "sally", "age": 11}])
print("df1 premerge: ")
print(df1)
df1 = df1.merge(df2, on=["name"], how="left")
print("merged: ")
print(df1)
# This is not "id"
print(df1.index)
# And there's no "id" field
assert "id" not in df1.columns
Before the merge, df1 was indexed by id. After the merge operation, there's just the default numeric index for the merged dataframe, and the id field was dropped. How can I do this kind of merge operation but retain the index of the leftmost dataframe?
To clarify: I want all the columns of df2 to be added to every entry in df1 that has the matching id value. If an entry in df2 has an id value not in df1, then it shouldn't be merged in (hence the how='left').
Edit: as a hack I could call df1.reset_index() before merging and then set the index again, but I'd prefer not to if possible; it seems like merge shouldn't have to drop the index. Thanks.
Answered by Snakes McGee
You've already pointed out doing a reset_index before the merge and a set_index afterwards, which works. The only way I know of to preserve indices across a merge is for the merge to involve an index on at least one of the data frames being merged. So here, you could do:
In [403]: df2 = df2.set_index('name')
In [404]: df1.merge(df2, left_on='name', right_index=True)
Out[404]:
     name  age
id
1     bob   10
10  sally   11
to merge df2's index, which we've taken from its 'name' column, against the 'name' column on df1.
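As a rough, self-contained sketch of the same approach (not part of the original answer), the snippet below adds the how='left' from the question. It also adds one hypothetical extra row ("alice", id 20) with no match in df2, to illustrate that the left index is kept and the unmatched row gets a missing value:

```python
import pandas

# Same frames as in the question, plus a hypothetical unmatched row ("alice").
df1 = pandas.DataFrame(
    {"id": [1, 10, 20], "name": ["bob", "sally", "alice"]}
).set_index("id")
df2 = pandas.DataFrame({"name": ["bob", "sally"], "age": [10, 11]})

# Move the join key of the right frame into its index, then merge the
# "name" column of df1 against that index using a left join.
merged = df1.merge(df2.set_index("name"), left_on="name",
                   right_index=True, how="left")

print(merged)
print(merged.index.tolist())  # the "id" values of df1 are still the index
```

Because "alice" has no match in df2, the age column contains a missing value and is therefore stored as floats rather than integers.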
This makes some sense, because otherwise the index of the resulting dataframe is ambiguous as it could come from either dataframe.
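For comparison, the reset_index / set_index round trip mentioned at the start of this answer could look roughly like the sketch below. It reuses the frames from the question; the variable name merged is only illustrative:

```python
import pandas

df1 = pandas.DataFrame(
    {"id": [1, 10], "name": ["bob", "sally"]}
).set_index("id")
df2 = pandas.DataFrame({"name": ["bob", "sally"], "age": [10, 11]})

merged = (
    df1.reset_index()                       # move "id" from the index into a column
       .merge(df2, on="name", how="left")   # ordinary column-on-column merge
       .set_index("id")                     # restore "id" as the index
)

print(merged)
```

This keeps the merge itself a plain column-on-column merge, at the cost of two extra calls around it.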

