Replicating rows in a pandas data frame by a column value
Source: Stack Overflow, http://stackoverflow.com/questions/26777832/
License: CC BY-SA 4.0. You are free to use and share this content, but you must attribute it to the original authors.
Asked by Mersenne Prime
I want to replicate rows in a pandas DataFrame. Each row should be repeated n times, where n is a field of that row.
import pandas as pd
what_i_have = pd.DataFrame(data={
  'id': ['A', 'B', 'C'],
  'n' : [  1,   2,   3],
  'v' : [ 10,  13,   8]
})
what_i_want = pd.DataFrame(data={
  'id': ['A', 'B', 'B', 'C', 'C', 'C'],
  'v' : [ 10,  13,  13,   8,   8,   8]
})
Is this possible?
Answered by DSM
You could use np.repeat to get the repeated index labels and then use them to index into the frame:
>>> import numpy as np
>>> df = what_i_have  # assumed: df is the frame from the question
>>> df2 = df.loc[np.repeat(df.index.values, df.n)]
>>> df2
  id  n   v
0  A  1  10
1  B  2  13
1  B  2  13
2  C  3   8
2  C  3   8
2  C  3   8
After that, there's only a bit of cleaning up to do:
>>> df2 = df2.drop("n",axis=1).reset_index(drop=True)
>>> df2
  id   v
0  A  10
1  B  13
2  B  13
3  C   8
4  C   8
5  C   8
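For reference, the two steps above can be combined into one self-contained script. This is only a minimal sketch: it assumes the question's frame is bound to df, and the helper name replicate_rows is hypothetical, not part of pandas.

```python
import numpy as np
import pandas as pd

# The frame from the question, bound to df as in the answer.
df = pd.DataFrame({
    "id": ["A", "B", "C"],
    "n": [1, 2, 3],
    "v": [10, 13, 8],
})


def replicate_rows(frame, count_col):
    """Repeat each row frame[count_col] times (hypothetical helper)."""
    # Repeat every index label as many times as its row's count.
    labels = np.repeat(frame.index.values, frame[count_col])
    # Label-based lookup returns one row per repeated label
    # (this relies on the index labels being unique).
    repeated = frame.loc[labels]
    # Drop the count column and renumber the rows from 0.
    return repeated.drop(count_col, axis=1).reset_index(drop=True)


result = replicate_rows(df, "n")
print(result)
```

The helper leaves the input frame unchanged and returns a new one, so the original n column remains available afterwards.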
Note that if your index might contain duplicate labels, you could use .iloc instead:
In [86]: df.iloc[np.repeat(np.arange(len(df)), df["n"])].drop("n", axis=1).reset_index(drop=True)
Out[86]: 
  id   v
0  A  10
1  B  13
2  B  13
3  C   8
4  C   8
5  C   8
This version uses the row positions, not the index labels.
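To illustrate why positions are safer here, the sketch below uses a made-up frame (not from the question) whose index label "x" appears twice. With label-based lookup, every repeated "x" matches both "x" rows, so the result has more rows than the n column asks for; the position-based version repeats each row exactly n times.

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose index labels are not unique ("x" appears twice).
df = pd.DataFrame(
    {"id": ["A", "B", "C"], "n": [1, 2, 1], "v": [10, 13, 8]},
    index=["x", "x", "y"],
)

# Label-based: each repeated "x" label matches both "x" rows,
# so rows come back more often than their n values request.
by_label = df.loc[np.repeat(df.index.values, df["n"])]

# Position-based: row i is repeated exactly n[i] times.
by_position = df.iloc[np.repeat(np.arange(len(df)), df["n"])]
by_position = by_position.drop("n", axis=1).reset_index(drop=True)

print(len(by_label), len(by_position))
print(by_position)
```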
Answered by Zero
You could use set_index and repeat:
In [1057]: df.set_index(['id'])['v'].repeat(df['n']).reset_index()
Out[1057]:
  id   v
0  A  10
1  B  13
2  B  13
3  C   8
4  C   8
5  C   8
Details
In [1058]: df
Out[1058]:
  id  n   v
0  A  1  10
1  B  2  13
2  C  3   8
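The one-liner can be easier to follow when split into its three steps. The sketch below rebuilds the question's frame and names each intermediate; the variable names v_by_id and repeated are chosen here for illustration only.

```python
import pandas as pd

# The frame from the question.
df = pd.DataFrame({"id": ["A", "B", "C"], "n": [1, 2, 3], "v": [10, 13, 8]})

# Step 1: move id into the index and keep only the v column (a Series).
v_by_id = df.set_index(["id"])["v"]

# Step 2: repeat each element n times; the id labels repeat along with it.
# The counts are applied by position, not matched by index label.
repeated = v_by_id.repeat(df["n"])

# Step 3: turn the id index back into an ordinary column.
result = repeated.reset_index()
print(result)
```

Because only the v column is selected in step 1, the n column never reaches the result, so no separate drop is needed. The trade-off is that this form carries a single value column.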

