pandas: "single positional indexer is out-of-bounds" when iterating over a pandas DataFrame

Question

Asked by branches

I have a dataframe, myDF, one column of which I wish to set to zero using a combination of conditions from other columns and indexing with a second dataframe, criteriaDF.

myDF.head():

       DateTime  GrossPowerMW USDateTime_string  DateTime_timestamp  \
0  01/01/1998 00:00        17.804  01/01/1998 00:00 1998-01-01 00:00:00   
1  01/01/1998 01:00        18.751  01/01/1998 01:00 1998-01-01 01:00:00   
2  01/01/1998 02:00        20.501  01/01/1998 02:00 1998-01-01 02:00:00   
3  01/01/1998 03:00        22.222  01/01/1998 03:00 1998-01-01 03:00:00   
4  01/01/1998 04:00        24.437  01/01/1998 04:00 1998-01-01 04:00:00   

   Month  Day  Hour  GrossPowerMW_Shutdown  
0      1    3     0                 17.804  
1      1    3     1                 18.751  
2      1    3     2                 20.501  
3      1    3     3                 22.222  
4      1    3     4                 24.437

criteriaDF:

       STARTTIME  ENDTIME
Month                    
1            9.0     12.0
2            9.0     14.0
3            9.0     14.0
4            9.0     14.0
5            9.0     13.0
6            9.0     14.0
7            9.0     13.0
8            9.0     12.0
9            9.0     14.0
10           9.0     13.0
11           9.0     13.0
12           9.0     11.0

myDF is then run through the following for loop:

month = 1
for month in range (1, 13):
    shutdown_hours = range(int(criteriaDF.iloc[month]['STARTTIME']), int(criteriaDF.iloc[month]['ENDTIME']))
    myDF.loc[(myDF["Month"].isin([month])) & (myDF["Hour"].isin(shutdown_hours)) & (myDF["Day"].isin(shutdown_days)), "GrossPowerMW_Shutdown"] *= 0
    month = month + 1

This gives the below error:

Traceback (most recent call last):
File "", line 1, in runfile('myscript.py', wdir='C:myscript')
File "C:\ProgramData\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 880, in runfile execfile(filename, namespace)
File "C:\ProgramData\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 87, in execfile exec(compile(scripttext, filename, 'exec'), glob, loc)
File "myscript.py", line 111, in gross_yield, curtailed_yield, shutdown_loss, df_testing = calculate_loss(input_file, input_shutdownbymonth, shutdown_days) #Returning df for testing/interrogation only. Delete once finished.
File "myscript.py", line 79, in calculate_loss shutdown_hours = range(int(criteriaDF.iloc[month]['STARTTIME']), int(criteriaDF.iloc[month]['ENDTIME']))
File "C:\ProgramData\Anaconda2\lib\site-packages\pandas\core\indexing.py", line 1328, in __getitem__ return self._getitem_axis(key, axis=0)
File "C:\ProgramData\Anaconda2\lib\site-packages\pandas\core\indexing.py", line 1749, in _getitem_axis self._is_valid_integer(key, axis)
File "C:\ProgramData\Anaconda2\lib\site-packages\pandas\core\indexing.py", line 1638, in _is_valid_integer raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds

However the script works if I set

month = 0
for month in range (0, 12):

However, this does not fit my dataframe's 'Month' column, which runs from 1 to 12, not 0 to 11.

To confirm, my understanding is that

range (1, 13)

returns

[1,2,3,4,5,6,7,8,9,10,11,12].

I have also tried manually running the code inside the for loop line by line with month = 12. So I am uncertain why using month in range (1, 13) is not working, given that 12 is the highest integer in the list range (1, 13).

What is the error in my code or my approach?

Answer 1

Answered by Deb

You're using iloc, which is "purely integer-location based indexing for selection by position", so it just counts your rows from 0 to 11. You should use loc instead, which looks at the values of your index (so 1 to 12).
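
As a rough illustration of the difference, here is a minimal sketch. It rebuilds criteriaDF from the values shown in the question, but the myDF contents and shutdown_days below are made-up stand-ins (one day of hourly rows per month), so treat it as an assumption-laden example rather than the asker's actual script. The loop itself keeps the question's structure and only swaps iloc for loc.

```python
import pandas as pd

# criteriaDF as shown in the question: index labels 1..12, positions 0..11.
criteriaDF = pd.DataFrame(
    {
        "STARTTIME": [9.0] * 12,
        "ENDTIME": [12.0, 14.0, 14.0, 14.0, 13.0, 14.0,
                    13.0, 12.0, 14.0, 13.0, 13.0, 11.0],
    },
    index=pd.Index(range(1, 13), name="Month"),
)

# Hypothetical stand-in for myDF: 24 hourly rows on day 3 of every month.
myDF = pd.DataFrame(
    [(m, 3, h) for m in range(1, 13) for h in range(24)],
    columns=["Month", "Day", "Hour"],
)
myDF["GrossPowerMW_Shutdown"] = 1.0
shutdown_days = [3]  # assumed value; not given in the question

# .iloc is positional: Month 12 sits at position 11, so position 12 is out of bounds.
try:
    criteriaDF.iloc[12]
    iloc_failed = False
except IndexError:
    iloc_failed = True

# .loc is label-based, so the month number can be used directly.
for month in range(1, 13):
    start = int(criteriaDF.loc[month, "STARTTIME"])
    end = int(criteriaDF.loc[month, "ENDTIME"])
    shutdown_hours = list(range(start, end))
    mask = (
        (myDF["Month"] == month)
        & myDF["Hour"].isin(shutdown_hours)
        & myDF["Day"].isin(shutdown_days)
    )
    myDF.loc[mask, "GrossPowerMW_Shutdown"] = 0
```

Note that the range (0, 12) version from the question avoids the IndexError but then compares myDF["Month"] against 0 to 11, so the month values no longer line up. Looking up by label with loc keeps both sides on the same 1 to 12 numbering.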
