Iterating over each element in a pandas DataFrame (Python)
Notice: this page is derived from a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under CC BY-SA, cite the original URL, and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/35758620/
Asked by Ali
I have a pandas DataFrame with a single column and a lot of data.
I need to access each element, not to change it (with apply()) but to parse it into another function.
When I loop through the DataFrame, the loop always stops after the first item.
If I convert it to a list first, my numbers all end up in brackets (e.g. [12] instead of 12), which breaks my code.
Does anyone see what I am doing wrong?
import pandas as pd
def go_trough_list(df):
    for number in df:
        print(number)
df = pd.read_csv("my_ids.csv")
go_trough_list(df)
df looks like:
1
0 2
1 3
2 4
dtype: object
[Finished in 1.1s]
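A minimal sketch of why the loop appears to stop after one item: iterating directly over a DataFrame yields its column labels, not its values. The small frame below is made up to mirror the output above (the column label "1" stands in for the first CSV value being read as a header).

```python
import pandas as pd

# Hypothetical stand-in for the CSV: the first value (1) became the column label "1".
df = pd.DataFrame({"1": [2, 3, 4]})

# Iterating a DataFrame walks its column labels, so a one-column frame yields one item.
labels = [label for label in df]

# Converting the whole frame to a list gives one inner list per row, hence [2] instead of 2.
rows = df.values.tolist()

# Selecting the column first gives the plain values.
values = df["1"].tolist()

print(labels)
print(rows)
print(values)
```

Selecting the single column before looping is what the answers further down suggest in different forms.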
Edit: I found one mistake. My first value was being read as the header, so I changed my code to:
df = pd.read_csv("my_ids.csv",header=None)
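As a rough illustration of the header issue, the sketch below reads the same text twice, once with the default header inference and once with header=None. The CSV contents are made up and fed through io.StringIO instead of the real my_ids.csv.

```python
import io
import pandas as pd

# Made-up contents standing in for my_ids.csv (assumption: one number per line).
csv_text = "1\n2\n3\n4\n"

# Default behaviour: the first line is consumed as the column header.
with_header = pd.read_csv(io.StringIO(csv_text))

# header=None keeps the first line as data and labels the column 0.
no_header = pd.read_csv(io.StringIO(csv_text), header=None)

print(with_header.columns.tolist(), len(with_header))
print(no_header.columns.tolist(), len(no_header))
```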
But with
for ix in df.index:
    print(df.loc[ix])
I get:
0    1
Name: 0, dtype: int64
0    2
Name: 1, dtype: int64
0    3
Name: 2, dtype: int64
0    4
Name: 3, dtype: int64
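The output above shows a whole row each time because df.loc[ix] with only a row label returns that row as a Series. A small sketch, assuming a single column labelled 0 as produced by header=None, of getting the bare value instead:

```python
import pandas as pd

# Assumed shape: one column labelled 0, like read_csv(..., header=None) would give.
df = pd.DataFrame([1, 2, 3, 4])

row = df.loc[0]       # row label only: the whole row, returned as a Series
value = df.loc[0, 0]  # row label and column label: a single scalar

# Iterating the column itself yields the plain values one by one.
for number in df[0]:
    print(number)
```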
Edit: Here is my solution, thanks to jezrael and Nick!
First I added header=None because my data has no header. Then I changed my function to:
def go_through_list(df):
    new_list = df[0].apply(my_function, parameter=par1)
    return new_list
And it works perfectly! Thank you again guys, problem solved.
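For readers who want to run the solution, here is a self-contained sketch. The body of my_function and the value of par1 are hypothetical placeholders, since the question does not show them; the point is only that Series.apply forwards extra keyword arguments to the function.

```python
import pandas as pd

def my_function(value, parameter):
    # Hypothetical placeholder: the real function is not shown in the question.
    return value + parameter

def go_through_list(df):
    # Extra keyword arguments given to Series.apply are passed on to my_function.
    return df[0].apply(my_function, parameter=par1)

df = pd.DataFrame([1, 2, 3, 4])  # stands in for read_csv(..., header=None)
par1 = 10                        # hypothetical parameter value

new_list = go_through_list(df)
print(new_list.tolist())
```

Note that, despite its name, new_list is a pandas Series; call .tolist() on it if a plain Python list is needed.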
Answered by Nick Brady
You can use the index as in the other answers, or iterate through the df and access each row like this:
for index, row in df.iterrows():
    print(row['column'])
However, I suggest solving the problem differently if performance is of any concern. Also, if there is only one column, it is more appropriate to use a pandas Series.
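One possible reading of "solving the problem differently" is to operate on the whole column at once instead of row by row. The sketch below compares the two on a made-up frame; squaring is just an example operation, not something the answer prescribes.

```python
import pandas as pd

# Made-up data; the column name 'column' follows the loop above.
df = pd.DataFrame({"column": [20, 21, 12]})

# Row by row: iterrows builds a Series for every row, which is slow on large frames.
row_by_row = [row["column"] ** 2 for _, row in df.iterrows()]

# Whole column at once: selecting one column gives a Series, and arithmetic applies to all values.
series = df["column"]
vectorized = (series ** 2).tolist()

print(row_by_row)
print(vectorized)
```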
What do you mean by "parse it into another function"? Perhaps take the value, do something to it, and store the result in another column? You wrote:
"I need to access each element, not to change it (with apply()) but to parse it into another function."
Perhaps this example will help:
import pandas as pd
df = pd.DataFrame([20, 21, 12])
def square(x):
    return x**2
df['new_col'] = df[0].apply(square)  # can use a lambda here nicely
Answered by jezrael
You can select the column as a Series and convert it to a list with tolist:
for x in df['Colname'].tolist():
    print(x)
Sample:
import pandas as pd
df = pd.DataFrame({'a': pd.Series([1, 2, 3]),
                   'b': pd.Series([4, 5, 6])})
print(df)
   a  b
0  1  4
1  2  5
2  3  6
for x in df['a'].tolist():
    print(x)
1
2
3
If you have only one column, use iloc to select the first column:
for x in df.iloc[:, 0].tolist():
    print(x)
Sample:
import pandas as pd
df = pd.DataFrame({1: pd.Series([2, 3, 4])})
print(df)
   1
0  2
1  3
2  4
for x in df.iloc[:, 0].tolist():
    print(x)
2
3
4
This can work too, but it is not a recommended approach, because the column label 1 can be a number or a string, and using the wrong type raises a KeyError:
for x in df[1].tolist():
    print(x)
2
3
4
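To make the label-type pitfall concrete, here is a small sketch using a made-up frame whose only column is labelled with the integer 1. Selecting by position works regardless of the label type, while the string "1" is a different label and is not found.

```python
import pandas as pd

# Made-up frame: the column label is the integer 1, not the string "1".
df = pd.DataFrame({1: [2, 3, 4]})

# Positional selection does not depend on the label's type.
by_position = df.iloc[:, 0].tolist()

# Label-based selection with the wrong type fails.
try:
    df["1"]
    raised = False
except KeyError:
    raised = True

print(by_position, raised)
```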
Answered by Sam
Say you have one column named 'myColumn', and the DataFrame has an index (which read_csv creates automatically). Try using the .loc indexer:
for ix in df.index:
    print(df.loc[ix, 'myColumn'])