Python: How to repeat a Pandas DataFrame?

Question

Asked by lsheng

This is my data frame, which should be repeated 5 times:

>>> x = pd.DataFrame({'a':1,'b':2},index = range(1))
>>> x
   a  b
0  1  2

I want a result like this:

>>> x.append(x).append(x).append(x)
   a  b
0  1  2
0  1  2
0  1  2
0  1  2

But there must be a smarter way than appending over and over. The data frame I'm actually working on should be repeated 50 times.

I haven't found anything practical, including things like np.repeat, which just doesn't work on a data frame.

Could anyone help?


Answer 1

Accepted answer by joris

You can use the concat function:

In [13]: pd.concat([x]*5)
Out[13]: 
   a  b
0  1  2
0  1  2
0  1  2
0  1  2
0  1  2

If you only want to repeat the values and not the index, you can do:


In [14]: pd.concat([x]*5, ignore_index=True)
Out[14]: 
   a  b
0  1  2
1  1  2
2  1  2
3  1  2
4  1  2
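
For the 50-repeat case in the question, the same concat call can be wrapped in a small helper. This is only a sketch: `repeat_frame` and its `reset_index` parameter are hypothetical names introduced here for illustration, not part of pandas.

```python
import pandas as pd


def repeat_frame(df: pd.DataFrame, n: int, reset_index: bool = True) -> pd.DataFrame:
    """Stack n copies of df vertically (hypothetical helper, not a pandas API)."""
    # [df] * n builds a list holding n references to the same frame;
    # pd.concat then copies the rows into one new frame.
    return pd.concat([df] * n, ignore_index=reset_index)


x = pd.DataFrame({'a': 1, 'b': 2}, index=range(1))
repeated = repeat_frame(x, 50)
print(repeated.shape)
```

Passing `reset_index=False` keeps the original index labels, so every row of the result is labelled 0, as in the first concat example.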

Answer 2

Answered by FooBar

I would generally not repeat and/or append unless your problem really makes it necessary. It is highly inefficient and typically comes from not understanding the proper way to attack a problem.

I don't know your exact use case, but if you have your values stored in an array, then

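import numpy as np
import pandas as pd

values = np.array([1, 2])  # one value per column
df2 = pd.DataFrame(index=np.arange(0, 50), columns=['a', 'b'])
df2[['a', 'b']] = values  # each value is broadcast down its column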

will do the job. Perhaps you could explain more clearly what you're trying to achieve?
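
When the repeated row consists of known constants, the frame can also be built at its final size in one step, in the same way the question builds `x`. A minimal sketch, assuming the values 1 and 2 from the question and 50 rows:

```python
import pandas as pd

# Scalars in the dict are broadcast to every row of the given index,
# so no existing frame has to be copied or appended.
df2 = pd.DataFrame({'a': 1, 'b': 2}, index=range(50))
print(df2.shape)
```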

Answer 3

Answered by Surya

Append should work too:


In [589]: x = pd.DataFrame({'a':1,'b':2},index = range(1))

In [590]: x
Out[590]: 
   a  b
0  1  2

In [591]: x.append([x]*5, ignore_index=True) #Ignores the index as per your need
Out[591]: 
   a  b
0  1  2
1  1  2
2  1  2
3  1  2
4  1  2
5  1  2

In [592]: x.append([x]*5)
Out[592]: 
   a  b
0  1  2
0  1  2
0  1  2
0  1  2
0  1  2
0  1  2
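
Note that `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, so the calls above only run on older versions. A sketch of the equivalent with `pd.concat` follows; the variable names are illustrative only.

```python
import pandas as pd

x = pd.DataFrame({'a': 1, 'b': 2}, index=range(1))

# x.append([x] * 5) stacked the original plus five copies: six rows in total.
six_rows = pd.concat([x] * 6, ignore_index=True)  # index 0..5
six_rows_same_index = pd.concat([x] * 6)          # every row keeps label 0
print(six_rows)
```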

Answer 4

Answered by Andy Hayden

I think it's cleaner/faster to use iloc nowadays:

In [11]: np.full(3, 0)
Out[11]: array([0, 0, 0])

In [12]: x.iloc[np.full(3, 0)]
Out[12]:
   a  b
0  1  2
0  1  2
0  1  2

More generally, you can use tile or repeat with arange:

In [21]: df = pd.DataFrame([[1, 2], [3, 4]], columns=["A", "B"])

In [22]: df
Out[22]:
   A  B
0  1  2
1  3  4

In [23]: np.tile(np.arange(len(df)), 3)
Out[23]: array([0, 1, 0, 1, 0, 1])

In [24]: np.repeat(np.arange(len(df)), 3)
Out[24]: array([0, 0, 0, 1, 1, 1])

In [25]: df.iloc[np.tile(np.arange(len(df)), 3)]
Out[25]:
   A  B
0  1  2
1  3  4
0  1  2
1  3  4
0  1  2
1  3  4

In [26]: df.iloc[np.repeat(np.arange(len(df)), 3)]
Out[26]:
   A  B
0  1  2
0  1  2
0  1  2
1  3  4
1  3  4
1  3  4

Note: This will work with non-integer indexed DataFrames (and Series).
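
To illustrate the note, here is a small sketch with string labels. The labels "x" and "y" are assumptions added for the example; because iloc selects by position, the labels simply travel along with the rows.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns=["A", "B"], index=["x", "y"])

# Positions 0, 1, 0, 1, 0, 1: the whole frame three times over.
positions = np.tile(np.arange(len(df)), 3)
tiled = df.iloc[positions]

# Optional: drop the repeated labels in favour of a fresh 0..n-1 index.
tiled_fresh = tiled.reset_index(drop=True)
print(tiled)
```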


Answer 5

Answered by U10-Forward

Try using numpy.repeat:


>>> import numpy as np  # the pd.np alias was removed in pandas 2.0
>>> df = pd.DataFrame(np.repeat(x.values, 5, axis=0), columns=x.columns)
>>> df
   a  b
0  1  2
1  1  2
2  1  2
3  1  2
4  1  2
>>>
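
One caveat with going through `.values`: when the columns have different types, the intermediate array has object dtype and the rebuilt frame no longer has the original column dtypes. A possible alternative, sketched below, repeats the index labels instead. The `mixed` frame is a made-up example, and this approach assumes the index labels are unique.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one integer and one string column.
mixed = pd.DataFrame({'a': [1], 'b': ['text']})

# Round trip through a NumPy array: column 'a' comes back as object dtype.
via_values = pd.DataFrame(np.repeat(mixed.values, 5, axis=0), columns=mixed.columns)

# Repeat the index labels and select with loc: column dtypes are kept.
via_index = mixed.loc[mixed.index.repeat(5)].reset_index(drop=True)

print(via_values.dtypes)
print(via_index.dtypes)
```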
