pandas Python: "ValueError: can only convert an array of size 1 to a Python scalar" when looping over rows in pd.DataFrame

Question

Asked by parno

I would like to loop over the rows of a DataFrame, in my case to calculate strength ratings for a number of sports teams.

The DataFrame columns 'home_elo' and 'away_elo' contain the pre-match strength rating (ELO score) of the teams involved. After each match, the value that update_elo(a,b,c) returns is written to the row of the team's next home / away match (each team has two strength ratings at any point in time, one for home and one for away games).

The respective code snippet looks as follows:

for index in df.index:

    counter = counter + 1
    # Calculation of post-match ELO scores for home and away teams
    if df.at[index,'updated'] == 2: # Update next match ELO scores if not yet updated but pre-match ELO scores available

        try:
            all_home_fixtures = df.date_rank[df['localteam_id'] == df.at[index,'localteam_id']]
            next_home_fixture = all_home_fixtures[all_home_fixtures > df.at[index,'date_rank']].min()
            next_home_index = df[(df['date_rank'] == next_home_fixture) & (df['localteam_id'] == df.at[index,'localteam_id'])].index.item()
        except ValueError:
            print('ERROR 1 at' + str(index))
            df.at[index,'updated'] = 4

        try:
            all_away_fixtures = df.date_rank[df['visitorteam_id'] == df.at[index,'visitorteam_id']]
            next_away_fixture = all_away_fixtures[all_away_fixtures > df.at[index,'date_rank']].min()
            next_away_index = df[(df['date_rank'] == next_away_fixture) & (df['visitorteam_id'] == df.at[index,'visitorteam_id'])].index.item()
        except ValueError:
            print('ERROR 2 at' + str(index))
            df.at[index,'updated'] = 4

        # print('Current: ' + str(df.at[index,'fixture_id']) + '; Followed by: ' + str(next_home_fixture))
        # print('Current date rank: ' + str(df.at[index,'date']) + ' ' + str(df.at[index,'date_rank']) + '; Next home date rank: ' + str(df.at[next_home_index,'date_rank']) + '; Next away date rank: ' + str(df.at[next_away_index,'date_rank']))

        df.at[next_home_index, 'home_elo'] = update_elo(df.at[index,'home_elo'],df.at[index,'away_elo'],df.at[index,'actual_score'])
        df.at[next_away_index, 'away_elo'] = update_elo(df.at[index,'away_elo'],df.at[index,'home_elo'],1 - df.at[index,'actual_score']) # Swap function inputs for away team


        df.at[next_home_index, 'updated'] = df.at[next_home_index, 'updated'] + 1
        df.at[next_away_index, 'updated'] = df.at[next_away_index, 'updated'] + 1

        df.at[index,'updated'] = 3

The code works fine for the first couple of rows. I then, however, encounter errors, always for the same rows, even though I cannot see how the rows would differ from others.

If I do not handle the ValueError as shown above, I receive the error message "ValueError: can only convert an array of size 1 to a Python scalar" for the first time after about 250 rows.
If I do handle the ValueError as shown above, I capture four such errors, two for each of the error-handling blocks (the code works fine otherwise), but the code stops updating any further strength ratings after about 18% of all rows, without throwing any error message.

I would very much appreciate it if you could help me (a) understand what causes the errors and (b) learn how to handle them.

Since this is my first post on StackOverflow, I am not yet fully aware of the common posting practices of the forum. Please let me know if there is anything I can improve about my post.

Thank you very much!

Answer 1

Accepted answer by sundance

pd.Series.item requires exactly one item in the Series to return a scalar. If:

df[(df['date_rank'] == next_home_fixture) & (df['localteam_id'] == df.at[index,'localteam_id'])]

is empty (no row matches both conditions), then .index.item() will throw a ValueError; the same happens when more than one row matches.
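
A minimal sketch of the failure and one possible guard, assuming a toy DataFrame with made-up values (only the column names come from the question). The helper single_index_or_none is hypothetical and not part of pandas. One likely trigger in the question's loop, although the data is not shown, is a team's last fixture: no later date_rank exists, so .min() on the empty selection returns NaN and the mask matches nothing.

```python
import pandas as pd

# Toy fixture table; column names follow the question, values are invented.
df = pd.DataFrame({
    "localteam_id": [1, 2, 1],
    "date_rank": [1, 2, 3],
})

def single_index_or_none(frame, mask):
    """Hypothetical helper: return the label of the single matching row,
    or None when zero or several rows match."""
    matches = frame.loc[mask].index
    if len(matches) == 1:
        return matches[0]
    return None

# Team 1 has a later home fixture at date_rank 3 (row label 2).
mask_found = (df["date_rank"] == 3) & (df["localteam_id"] == 1)
next_home_index = single_index_or_none(df, mask_found)

# Team 2 has no later home fixture: the selection is empty, .min() is NaN,
# and NaN never compares equal, so the mask selects zero rows.
later = df.date_rank[(df["localteam_id"] == 2) & (df["date_rank"] > 2)]
mask_empty = (df["date_rank"] == later.min()) & (df["localteam_id"] == 2)
missing_index = single_index_or_none(df, mask_empty)

# The unguarded call from the question raises on the empty selection.
try:
    df[mask_empty].index.item()
    raised = False
except ValueError:
    raised = True
```

Checking the result for None before writing to df.at[...] would also avoid reusing a next_home_index left over from an earlier iteration, which the try/except in the question appears to allow.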

Answer 2

Answered by Wei Chen

FYI,

You will get a similar error if you apply .item() to a NumPy array that does not contain exactly one element.

You can solve it with .tolist() in that case.
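
As a small illustration of the NumPy case, with made-up arrays: .item() succeeds only on a one-element array, while .tolist() converts an array of any size to plain Python values.

```python
import numpy as np

one = np.array([42])
many = np.array([1, 2, 3])

# Exactly one element: .item() returns a Python scalar.
scalar = one.item()

# More than one element: .item() without arguments raises ValueError.
try:
    many.item()
    raised = False
except ValueError:
    raised = True

# .tolist() works regardless of size and yields Python ints.
as_list = many.tolist()
```

Note that .tolist() returns a list rather than a scalar, so the calling code still has to decide which element it wants.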
