Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/53872905/

Time: 2020-09-14 06:13:30  Source: igfitidea

Iterate over first N rows in pandas

python pandas

Asked by David542

What is the suggested way to iterate over the rows in pandas like you would in a file? For example:

LIMIT = 100
with open('file', 'r') as f:
    for row_num, row in enumerate(f):
        # Check before printing so that exactly LIMIT rows are shown
        if row_num == LIMIT:
            break
        print(row)

I was thinking to do something like:

for n in range(LIMIT):
    # iloc selects by position; loc would look up the index label n instead
    print(df.iloc[n].tolist())

Is there a built-in way to do this though in pandas?

Answered by knh190

Hasn't anyone answered the simple solution?

for row in df.head(5).itertuples():
    print(row)  # do something with each row

Take a peek at this post.
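
To make the head-then-iterate idea concrete, here is a minimal sketch. The DataFrame and its column names (label, score) are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical sample data, not from the original question.
df = pd.DataFrame({"label": ["a", "b", "c", "d"], "score": [10, 20, 30, 40]})

first_rows = []
for row in df.head(2).itertuples():
    # Each row is a namedtuple: row.Index holds the index label,
    # and every column is available as an attribute.
    first_rows.append((row.Index, row.label, row.score))

print(first_rows)
```

Note that head(n) simply returns the whole frame when it has fewer than n rows, so the loop needs no extra bounds check.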

Answered by timgeb

You can islice the iterator that iterrows (or itertuples) produces.

from itertools import islice
LIMIT = 100

# iterrows and unpacking
for idx, data in islice(df.iterrows(), LIMIT):
    pass  # do stuff

# itertuples, no unpacking
for row in islice(df.itertuples(), LIMIT):
    pass  # do stuff
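
As a small runnable sketch of the same idea (the data and the column name x are assumptions for illustration), islice stops pulling rows once the limit is reached and also copes with a frame shorter than the limit:

```python
from itertools import islice

import pandas as pd

# Hypothetical sample data, not from the original question.
df = pd.DataFrame({"x": range(1000)})
LIMIT = 3

# islice stops consuming the iterator after LIMIT items,
# so the remaining rows are never visited.
seen = [row.x for row in islice(df.itertuples(), LIMIT)]
print(seen)

# If LIMIT exceeds the number of rows, islice just stops at the end.
short = pd.DataFrame({"x": [1, 2]})
count = sum(1 for _ in islice(short.iterrows(), LIMIT))
print(count)
```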

Answered by meW

You have values, itertuples and iterrows, of which itertuples performs best, as benchmarked by fast-pandas.
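
If you want to check this on your own data, a rough timing sketch along these lines may help. This is not the fast-pandas benchmark itself; the frame, the function names, and the repeat count are all assumptions, and the numbers will vary with machine, pandas version, and data shape.

```python
import timeit

import pandas as pd

# Hypothetical sample data, not from the original answer.
df = pd.DataFrame({"a": range(1000), "b": range(1000)})

def via_values():
    # Iterate over the underlying 2-D array, one row at a time.
    return sum(a + b for a, b in df.values)

def via_itertuples():
    # Iterate over namedtuples, skipping the index.
    return sum(row.a + row.b for row in df.itertuples(index=False))

def via_iterrows():
    # Iterate over (index, Series) pairs.
    return sum(row["a"] + row["b"] for _, row in df.iterrows())

for fn in (via_values, via_itertuples, via_iterrows):
    seconds = timeit.timeit(fn, number=5)
    print(f"{fn.__name__}: {seconds:.4f}s")
```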

Answered by Joe Halliwell

You can use itertools.islice to take the first n items from iterrows:

import itertools
limit = 5
for index, row in itertools.islice(df.iterrows(), limit):
    ...

Answered by gorjan

Since you said that you want to use something like an if, I would do the following:

import pandas as pd

limit = 2
df = pd.DataFrame({"col1": [1,2,3], "col2": [4,5,6], "col3": [7,8,9]})
df[:limit].loc[df["col3"] == 7]

This selects the first two rows of the data frame, then returns those of them whose col3 value equals 7. The point is that you want to use iterrows only in very specific situations. Otherwise, the solution can be vectorized.

I don't know exactly what you are trying to achieve, so I just threw in a random example.
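
For what it's worth, here is one way the "vectorized on the first N rows" idea might look as a complete script. It reuses the toy frame from this answer; the variable names first, matches, and total are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2, 3], "col2": [4, 5, 6], "col3": [7, 8, 9]})
limit = 2

# Take the first `limit` rows by position, regardless of index labels.
first = df.iloc[:limit]

# Filter with a boolean mask built from the same slice, no Python loop.
matches = first[first["col3"] == 7]

# Column-wise operations on the slice are vectorized as well.
total = first["col1"].sum()

print(matches)
print(total)
```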

Answered by Tim

If you must iterate over the dataframe, you should use the iterrows() method:

for index, row in df.iterrows():
    ...
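
One caveat worth knowing when looping this way: iterrows yields each row as a Series, so a row that mixes integer and float columns is upcast to a single dtype. A small sketch, with made-up data and column names (qty, ratio):

```python
import pandas as pd

# Hypothetical sample data: one integer column, one float column.
df = pd.DataFrame({"qty": [1, 2], "ratio": [0.5, 1.5]})

# iterrows: the row is a Series with one dtype, so qty becomes a float.
_, first = next(df.iterrows())
print(first.dtype)

# itertuples: each value keeps the type of its own column.
tup = next(df.itertuples(index=False))
print(type(tup.qty))
```

If the original column types matter inside the loop, itertuples may be the safer choice.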