How do I copy rows in a pandas DataFrame and add an id column
Source: http://stackoverflow.com/questions/23331753/
Note: this page reproduces a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same license and attribute it to the original authors (not me): Stack Overflow
Asked by Racing Tadpole
I have a dataframe such as:
from pandas import DataFrame
import pandas as pd
x = DataFrame.from_dict({'farm' : ['A','B','A','B'],
'fruit':['apple','apple','pear','pear']})
How can I copy it N times with an id, e.g. to output (for N=2):
  farm  fruit  sim
0    A  apple    0
1    B  apple    0
2    A   pear    0
3    B   pear    0
0    A  apple    1
1    B  apple    1
2    A   pear    1
3    B   pear    1
I tried an approach which works on dataframes in R:
from numpy import arange
N = 2
sim_ids = DataFrame(arange(N))
pd.merge(left=x, right=sim_ids, how='left')
but this fails with the error MergeError: No common columns to perform merge on.
Thanks.
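The merge in the question fails because x and sim_ids share no column names, so pandas has no key to join on. The R idiom appears to rely on a cross join, where every row of x is paired with every id. A minimal sketch of that idea, assuming pandas 1.2 or later (where how='cross' is available); the column name sim and the final sort are choices made here for illustration, not part of the original question:

```python
import pandas as pd

x = pd.DataFrame({'farm': ['A', 'B', 'A', 'B'],
                  'fruit': ['apple', 'apple', 'pear', 'pear']})
N = 2

# Give the id column an explicit name (assumed here: 'sim').
sim_ids = pd.DataFrame({'sim': range(N)})

# how='cross' pairs every row of x with every row of sim_ids
# (assumes pandas >= 1.2).
crossed = pd.merge(x, sim_ids, how='cross')

# Group the rows by simulation id. mergesort is stable, so the original
# row order of x is kept within each id.
result = crossed.sort_values('sim', kind='mergesort').reset_index(drop=True)
print(result)
```

Unlike the output requested in the question, this sketch renumbers the index from 0 to N * len(x) - 1 instead of repeating the original labels.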
Answered by Phillip Cloud
Not sure what R is doing there, but here's a way to do what you want (tile, repeat and arange are NumPy functions):
In [150]: x
Out[150]:
  farm  fruit
0    A  apple
1    B  apple
2    A   pear
3    B   pear
[4 rows x 2 columns]
In [151]: N = 2
In [152]: DataFrame(tile(x, (N, 1)), columns=x.columns).join(DataFrame({'sims': repeat(arange(N), len(x))}))
Out[152]:
  farm  fruit  sims
0    A  apple     0
1    B  apple     0
2    A   pear     0
3    B   pear     0
4    A  apple     1
5    B  apple     1
6    A   pear     1
7    B   pear     1
[8 rows x 3 columns]
In [153]: N = 3
In [154]: DataFrame(tile(x, (N, 1)), columns=x.columns).join(DataFrame({'sims': repeat(arange(N), len(x))}))
Out[154]:
   farm  fruit  sims
0     A  apple     0
1     B  apple     0
2     A   pear     0
3     B   pear     0
4     A  apple     1
5     B  apple     1
6     A   pear     1
7     B   pear     1
8     A  apple     2
9     B  apple     2
10    A   pear     2
11    B   pear     2
[12 rows x 3 columns]
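The session above calls tile, repeat and arange without showing their imports. A self-contained sketch of the same tile-and-repeat idea, with the NumPy imports spelled out; assigning the sims column directly instead of joining a second DataFrame is a simplification made here, not the answer's exact code:

```python
import numpy as np
import pandas as pd

x = pd.DataFrame({'farm': ['A', 'B', 'A', 'B'],
                  'fruit': ['apple', 'apple', 'pear', 'pear']})
N = 3

# np.tile stacks N copies of the underlying values on top of each other.
tiled = pd.DataFrame(np.tile(x.to_numpy(), (N, 1)), columns=x.columns)

# np.repeat expands [0, 1, 2] so each id covers one full copy of x.
tiled['sims'] = np.repeat(np.arange(N), len(x))
print(tiled)
```

One caveat with this route: the values pass through a single NumPy array, so a frame with mixed column types may come back with object-typed columns that need converting afterwards.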
Answered by DSM
I might do something like (df here is the question's DataFrame x):
>>> df_new = pd.concat([df]*2)
>>> df_new["id"] = df_new.groupby(level=0).cumcount()
>>> df_new
  farm  fruit  id
0    A  apple   0
1    B  apple   0
2    A   pear   0
3    B   pear   0
0    A  apple   1
1    B  apple   1
2    A   pear   1
3    B   pear   1
[8 rows x 3 columns]
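The concat-and-cumcount approach above generalizes to any N. A minimal sketch, assuming the original index labels are unique (cumcount numbers the repeats of each label, so duplicate labels in the input would skew the ids); the helper name replicate_with_id and its parameters are hypothetical, introduced only for illustration:

```python
import pandas as pd

def replicate_with_id(df, n, id_col='id'):
    """Stack n copies of df and number the copies 0..n-1 in id_col.

    Hypothetical helper; assumes df has a unique index.
    """
    out = pd.concat([df] * n)
    # Each index label now appears n times; cumcount numbers the
    # occurrences of a label in order of appearance, giving the copy id.
    out[id_col] = out.groupby(level=0).cumcount()
    return out

x = pd.DataFrame({'farm': ['A', 'B', 'A', 'B'],
                  'fruit': ['apple', 'apple', 'pear', 'pear']})
df_new = replicate_with_id(x, 2)
print(df_new)
```

The repeated index labels (0 to 3, twice) match the output requested in the question; calling df_new.reset_index(drop=True) afterwards would give a fresh 0 to 7 index instead.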

