Iterate over first N rows in pandas
Original source: http://stackoverflow.com/questions/53872905/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by David542
What is the suggested way to iterate over the rows in pandas like you would in a file? For example:
LIMIT = 100
with open('file', 'r') as f:
    for row_num, row in enumerate(f):
        if row_num >= LIMIT:  # stop once LIMIT rows have been printed
            break
        print(row)
I was thinking of doing something like:
for n in range(LIMIT):
    print(df.loc[n].tolist())
Is there a built-in way to do this in pandas, though?
Answered by knh190
Answered by timgeb
You can islice the iterator that iterrows (or itertuples) produces.
from itertools import islice
LIMIT = 100
# iterrows and unpacking
for idx, data in islice(df.iterrows(), LIMIT):
    ...  # do stuff
# itertuples, no unpacking
for row in islice(df.itertuples(), LIMIT):
    ...  # do stuff
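To make the snippet above concrete, here is a minimal runnable sketch. The DataFrame, its column names (col1, col2) and the small LIMIT are made up for illustration; only islice, iterrows and itertuples come from the answer.

```python
from itertools import islice

import pandas as pd

# Hypothetical sample data; any DataFrame would do.
df = pd.DataFrame({"col1": range(10), "col2": range(10, 20)})
LIMIT = 3

# iterrows yields (index, Series) pairs; islice stops after LIMIT of them.
first_rows = [
    (idx, row["col1"], row["col2"])
    for idx, row in islice(df.iterrows(), LIMIT)
]

# itertuples yields namedtuples whose first field (Index) is the row index.
first_tuples = [
    (row.Index, row.col1, row.col2)
    for row in islice(df.itertuples(), LIMIT)
]

print(first_rows)
print(first_tuples)
```

Because islice consumes the iterator lazily, rows past LIMIT are never produced, which mirrors the early break in the file-reading loop from the question.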
Answered by meW
You have values, itertuples and iterrows, of which itertuples performs best according to the fast-pandas benchmarks.
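If you want to check this on your own data, a rough timing sketch could look like the following. This is not the fast-pandas benchmark itself; the DataFrame size, the column names and the helper function names are assumptions chosen for illustration, and the timings will vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical data set; the size is arbitrary.
df = pd.DataFrame(np.random.rand(10_000, 3), columns=["a", "b", "c"])


def sum_iterrows():
    # Each row is built as a Series.
    return sum(row["a"] for _, row in df.iterrows())


def sum_itertuples():
    # Each row is a namedtuple.
    return sum(row.a for row in df.itertuples())


def sum_values():
    # Iterates over the rows of the underlying NumPy array.
    return sum(row[0] for row in df.values)


for fn in (sum_iterrows, sum_itertuples, sum_values):
    seconds = timeit.timeit(fn, number=3)
    print(f"{fn.__name__}: {seconds:.4f} s for 3 runs")
```

All three helpers compute the same sum of column a, so the only thing being compared is the cost of the iteration method.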
Answered by Joe Halliwell
You can use itertools.islice to take the first n items from iterrows:
import itertools
limit = 5
for index, row in itertools.islice(df.iterrows(), limit):
    ...
Answered by gorjan
Since you said that you want to use something like an if, I would do the following:
import pandas as pd
limit = 2
df = pd.DataFrame({"col1": [1,2,3], "col2": [4,5,6], "col3": [7,8,9]})
df[:limit].loc[df["col3"] == 7]
This would select the first two rows of the data frame, then return those of the first two rows whose col3 value equals 7. The point is that you want to use iterrows only in very, very specific situations. Otherwise, the solution can be vectorized.
I don't know exactly what you are trying to achieve, so I just threw in a random example.
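For comparison, here is a small sketch that runs the same filter both as an explicit loop and in vectorized form. It reuses the example data from above; the variable names (first, looped, vectorized) and the use of head are illustrative choices, not part of the original answer.

```python
import pandas as pd

limit = 2
df = pd.DataFrame({"col1": [1, 2, 3], "col2": [4, 5, 6], "col3": [7, 8, 9]})

# Take the first `limit` rows once, then filter within that slice.
first = df.head(limit)

# Loop version: collect the index labels of matching rows one at a time.
looped = [idx for idx, row in first.iterrows() if row["col3"] == 7]

# Vectorized version: a boolean mask built from the same slice.
vectorized = first[first["col3"] == 7]

print(looped)
print(vectorized)
```

Building the mask from the sliced frame itself (rather than from the full df) keeps the mask and the rows it filters on the same index, so no alignment between differently sized objects is needed.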
Answered by Tim
If you must iterate over the dataframe, you should use the iterrows() method:
for index, row in df.iterrows():
    ...
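To limit this loop to the first N rows without itertools, one option (not mentioned in the answer itself, so treat it as a suggestion) is to slice with head before iterating. The sample DataFrame and the names LIMIT and seen are hypothetical.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"name": ["a", "b", "c", "d"], "score": [1, 2, 3, 4]})
LIMIT = 2

seen = []
# head(LIMIT) returns the first LIMIT rows; iterrows then walks only those.
for index, row in df.head(LIMIT).iterrows():
    seen.append((index, row["name"], row["score"]))

print(seen)
```

Unlike islice, head selects rows by position before iteration starts, so it also works when the index is not a default integer range.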