Replicating rows in a Pandas DataFrame by a column value

Question

Asked by Mersenne Prime

I want to replicate rows in a Pandas DataFrame. Each row should be repeated n times, where n is a field of that row.

import pandas as pd

what_i_have = pd.DataFrame(data={
  'id': ['A', 'B', 'C'],
  'n' : [  1,   2,   3],
  'v' : [ 10,  13,   8]
})

what_i_want = pd.DataFrame(data={
  'id': ['A', 'B', 'B', 'C', 'C', 'C'],
  'v' : [ 10,  13,  13,   8,   8,   8]
})

Is this possible?

Answer 1

Answered by DSM

You could use np.repeat to get the repeated indices and then use that to index into the frame:

>>> import numpy as np
>>> df = what_i_have  # the frame from the question
>>> df2 = df.loc[np.repeat(df.index.values, df.n)]
>>> df2
  id  n   v
0  A  1  10
1  B  2  13
1  B  2  13
2  C  3   8
2  C  3   8
2  C  3   8

After which there's only a bit of cleaning up to do:

>>> df2 = df2.drop("n",axis=1).reset_index(drop=True)
>>> df2
  id   v
0  A  10
1  B  13
2  B  13
3  C   8
4  C   8
5  C   8
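
For readers who want to run the two steps above end to end, here is a minimal self-contained sketch. It assumes the question's what_i_have frame; the helper name repeat_rows is made up for illustration and is not part of pandas.

```python
import numpy as np
import pandas as pd


def repeat_rows(df, count_col):
    """Repeat each row df[count_col] times, then drop the count column.

    Hypothetical helper for illustration; assumes unique index labels
    and non-negative integer counts.
    """
    # Each index label appears as many times as its row's count.
    labels = np.repeat(df.index.values, df[count_col])
    # Label-based lookup returns one row per (repeated) label.
    repeated = df.loc[labels]
    # Remove the count column and renumber the rows 0..N-1.
    return repeated.drop(columns=count_col).reset_index(drop=True)


what_i_have = pd.DataFrame(data={
    'id': ['A', 'B', 'C'],
    'n':  [1, 2, 3],
    'v':  [10, 13, 8],
})

result = repeat_rows(what_i_have, 'n')
print(result)
```

The input frame is left unmodified, since both the .loc selection and drop return new objects.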

Note that if you have duplicate index labels to worry about, you could use .iloc instead:

In [86]: df.iloc[np.repeat(np.arange(len(df)), df["n"])].drop("n", axis=1).reset_index(drop=True)
Out[86]: 
  id   v
0  A  10
1  B  13
2  B  13
3  C   8
4  C   8
5  C   8

which uses the positions, and not the index labels.
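
To see why positions matter, consider a frame whose index labels repeat. The sketch below uses a made-up frame (dup, with index labels 0, 0, 1) that is not from the original question; selecting by position repeats each row exactly n times regardless of the labels.

```python
import numpy as np
import pandas as pd

# Illustrative frame (not from the question) with a duplicated index label.
dup = pd.DataFrame(
    {'id': ['A', 'B', 'C'], 'n': [1, 2, 3], 'v': [10, 13, 8]},
    index=[0, 0, 1],
)

# Build row positions 0..len-1 and repeat each one n times.
positions = np.repeat(np.arange(len(dup)), dup['n'])

# iloc selects by position, so duplicate labels cannot cause extra matches.
by_position = dup.iloc[positions].drop(columns='n').reset_index(drop=True)
print(by_position)
```

With label-based .loc, a repeated label would match every row that carries it, so the row counts could come out larger than n.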

Answer 2

Answered by Zero

You could use set_index and repeat:

In [1057]: df.set_index(['id'])['v'].repeat(df['n']).reset_index()
Out[1057]:
  id   v
0  A  10
1  B  13
2  B  13
3  C   8
4  C   8
5  C   8

Details

In [1058]: df
Out[1058]:
  id  n   v
0  A  1  10
1  B  2  13
2  C  3   8
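
The one-liner above can be unpacked into its three steps. This is a sketch assuming the same df; the intermediate names (indexed, repeated, out) are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({'id': ['A', 'B', 'C'], 'n': [1, 2, 3], 'v': [10, 13, 8]})

# 1. Move 'id' into the index and keep only the value column as a Series.
indexed = df.set_index('id')['v']

# 2. Series.repeat repeats each element (and its index label) by the
#    matching count; counts are applied positionally, not by label.
repeated = indexed.repeat(df['n'].to_numpy())

# 3. reset_index turns the 'id' index back into an ordinary column.
out = repeated.reset_index()
print(out)
```

Because only 'v' is selected before repeating, this form carries a single value column; with several value columns, the positional approach from the first answer may be simpler.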
