Pandas DataFrame - Get index values based on a condition

Question

Asked by Neal Titus Thomas

I have a text file called data.txt containing tabular data that looks like this:

                        PERIOD
CHANNELS    1      2      3      4       5 
0         1.51   1.61   1.94   2.13   1.95 
5         1.76   1.91   2.29   2.54   2.38 
6         2.02   2.22   2.64   2.96   2.81 
7         2.27   2.52   2.99   3.37   3.24 
8         2.53   2.83   3.35   3.79   3.67 
9         2.78   3.13   3.70   4.21   4.09 
10        3.04   3.44   4.05   4.63   4.53

The CHANNELS column holds the channel numbers of an instrument, and the other five columns hold the maximum energy that each channel can detect in periods 1, 2, 3, 4 and 5, respectively.

I want to write Python code that takes three inputs from the user (period, lower energy and higher energy) and then prints the channel numbers corresponding to the lower energy and the higher energy for the given period.

For example:

Enter the period:
>>1
Enter the Lower energy:
>1.0
Enter the Higher energy:
>2.0
#Output
The lower energy channel is 0
The higher energy channel is 6

This is what I have written so far:

import numpy as np
import pandas as pd

period = int(input('Enter the period: '))
lower_energy = float(input('Enter the lower energy value: '))
higher_energy = float(input('Enter the higher energy value: '))
row_names = [0, 5, 6, 7, 8, 9, 10]
column_names = [1, 2, 3, 4, 5] 
data_list = []
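with open('data.txt') as f:
    # skip the two header lines (PERIOD and CHANNELS)
    lines = f.readlines()[2:]
for line in lines:
    # drop the first field (the channel number) and keep the energies
    arr = [float(num) for num in line.split()[1:]]
    data_list.append(arr)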
df = pd.DataFrame(data_list, columns=column_names, index=row_names)
print (df, '\n')
print (df[period])

Help me add to this.

Answer 1

Answered by Bryce Ramgovind

You can add the following code:

Retrieve the index based on the condition. This assumes the energy increases steadily down the channels.

lower_channel_energy = df[df[period]>lower_energy].index[0]
high_channel_energy =  df[(df[period]<higher_energy).shift(-1)==False].index[0]

Printing the channels that we calculated:

print("The lower energy channel is {}".format(lower_channel_energy))
print("The higher energy channel is {}".format(high_channel_energy))

This solution assumes that, within a period, the energy increases from one channel to the next going down the rows.
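
If that monotonic assumption holds, the lookup can also be done with a binary search. The following is only a sketch, not the answer author's code: the helper name `first_channel_above` is hypothetical, the table is typed in by hand so the example runs without data.txt, and "the channel for an energy" is taken to mean the first channel whose maximum energy exceeds it, which is what the example in the question suggests.

```python
import numpy as np
import pandas as pd

# Table from the question, entered by hand (assumption: same values as data.txt).
channels = [0, 5, 6, 7, 8, 9, 10]
periods = [1, 2, 3, 4, 5]
energies = [
    [1.51, 1.61, 1.94, 2.13, 1.95],
    [1.76, 1.91, 2.29, 2.54, 2.38],
    [2.02, 2.22, 2.64, 2.96, 2.81],
    [2.27, 2.52, 2.99, 3.37, 3.24],
    [2.53, 2.83, 3.35, 3.79, 3.67],
    [2.78, 3.13, 3.70, 4.21, 4.09],
    [3.04, 3.44, 4.05, 4.63, 4.53],
]
df = pd.DataFrame(energies, index=channels, columns=periods)


def first_channel_above(frame, period, energy):
    """Return the first channel whose maximum energy exceeds `energy`, or None.

    Hypothetical helper; relies on the column being sorted in increasing order.
    """
    column = frame[period]
    if not column.is_monotonic_increasing:
        raise ValueError("energies must increase down the channels")
    # side="right" gives the position of the first value strictly greater than `energy`
    pos = np.searchsorted(column.to_numpy(), energy, side="right")
    if pos == len(column):
        return None  # no channel reaches this energy
    return frame.index[pos]


lower_channel = first_channel_above(df, 1, 1.0)
higher_channel = first_channel_above(df, 1, 2.0)
print("The lower energy channel is {}".format(lower_channel))
print("The higher energy channel is {}".format(higher_channel))
```

For period 1 with energies 1.0 and 2.0 this prints channels 0 and 6, as in the question's example. Returning `None` when no channel is high enough is a design choice of this sketch; raising an error would work just as well.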

Answer 2

Answered by ejb

You can actually read your file directly with Pandas to simplify the program. I can reproduce the output you are expecting with:

import pandas as pd

df = pd.read_csv('data.txt', engine='python', header=1, sep=r'\s{2,}')

period = input('Enter the period: ')
lower_energy = float(input('Enter the lower energy value: '))
higher_energy = float(input('Enter the higher energy value: '))

# select the channels within the ranges provided
lo_e_range = (df[period] > lower_energy)
hi_e_range = (df[period] > higher_energy)

# Indices of the lower and higher energy channels
lec = df[period][lo_e_range].index[0]
hec = df[period][hi_e_range].index[0]

print('The lower energy channel is {}'.format(df['CHANNELS'][lec]))
print('The higher energy channel is {}'.format(df['CHANNELS'][hec]))

I have edited the code to take into account your comment.
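
One detail worth guarding against: `.index[0]` raises an `IndexError` when no channel exceeds the requested energy. The sketch below is a variation under stated assumptions, not the code above: it reads the table from an in-memory string instead of data.txt, uses `sep=r'\s+'` with `skiprows=1` and `index_col='CHANNELS'` rather than the arguments shown in the answer, and wraps the lookup in a hypothetical helper named `channel_for`.

```python
import io

import pandas as pd

# Contents of data.txt as shown in the question (assumption: no trailing spaces).
DATA = """                        PERIOD
CHANNELS    1      2      3      4       5
0         1.51   1.61   1.94   2.13   1.95
5         1.76   1.91   2.29   2.54   2.38
6         2.02   2.22   2.64   2.96   2.81
7         2.27   2.52   2.99   3.37   3.24
8         2.53   2.83   3.35   3.79   3.67
9         2.78   3.13   3.70   4.21   4.09
10        3.04   3.44   4.05   4.63   4.53
"""

# Skip the PERIOD banner line, split on runs of whitespace,
# and use the channel numbers as the index.
df = pd.read_csv(io.StringIO(DATA), sep=r"\s+", skiprows=1, index_col="CHANNELS")


def channel_for(frame, period, energy):
    """Return the first channel whose maximum energy exceeds `energy`.

    Hypothetical helper. `period` is a string because the column
    labels come from the file header ('1' to '5').
    """
    above = frame[period] > energy
    if not above.any():
        raise ValueError("no channel detects energies above {}".format(energy))
    # idxmax on a boolean Series returns the label of the first True
    return above.idxmax()


print("The lower energy channel is {}".format(channel_for(df, "1", 1.0)))
print("The higher energy channel is {}".format(channel_for(df, "1", 2.0)))
```

The `any()` check matters because `idxmax` on an all-False mask would silently return the first channel instead of signalling that nothing matched.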
