pandas: inserting values into a pandas DataFrame

Question

Asked by NILESH SUTHAR

I have data in an Excel sheet. I want to check whether the value in one column lies in a range (5000-15000) and, if it does, insert a value (Correct or Flag) into another column.

I have three columns: City, rent, status.

I have tried the append and insert methods, but they didn't work. How should I do this?

Here is my code:

for index, row in df.iterrows():
    if row['city']=='mumbai':
        if 5000<= row['rent']<=15000:
            pd.DataFrame.append({'Status': 'Correct'})

It shows this error:

TypeError: append() missing 1 required positional argument: 'other'

What procedure should I follow to insert data row by row into a column?

Answer 1

Accepted answer by jezrael

I think you can use numpy.where with a boolean mask created by between and a comparison on city:

mask = (df['city']=='mumbai') & df['rent'].between(5000,15000)
df['status'] = np.where(mask, 'Correct', 'Flag')

Sample:

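import numpy as np
import pandas as pd

df = pd.DataFrame({'city':['mumbai','mumbai','mumbai', 'a'],
                   'rent':[1000,6000,10000,10000]})
# True only for mumbai rows whose rent is within 5000-15000 (inclusive)
mask = (df['city']=='mumbai') & df['rent'].between(5000,15000)
df['status'] = np.where(mask, 'Correct', 'Flag')
print(df)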
     city   rent   status
0  mumbai   1000     Flag
1  mumbai   6000  Correct
2  mumbai  10000  Correct
3       a  10000     Flag
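
For reference, between includes both endpoints by default, so the mask above should match one written with explicit comparison operators. A minimal sketch on the same sample data (the names mask_between and mask_ops are only illustrative, not part of the original answer):

```python
import pandas as pd

df = pd.DataFrame({'city': ['mumbai', 'mumbai', 'mumbai', 'a'],
                   'rent': [1000, 6000, 10000, 10000]})

# Mask built with between (both bounds are included by default)
mask_between = (df['city'] == 'mumbai') & df['rent'].between(5000, 15000)

# The same condition spelled out with comparison operators
mask_ops = (df['city'] == 'mumbai') & (df['rent'] >= 5000) & (df['rent'] <= 15000)

# Check that the two masks agree row by row
print(mask_between.equals(mask_ops))
```

The explicit form can be handy if one of the bounds should be exclusive, since only the operator needs to change.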

Another solution with loc:

mask = (df['city']=='mumbai') & df['rent'].between(5000,15000)
df['status'] = 'Flag'
df.loc[mask, 'status'] =  'Correct'
print (df)
     city   rent   status
0  mumbai   1000     Flag
1  mumbai   6000  Correct
2  mumbai  10000  Correct
3       a  10000     Flag
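
On the original error: the loop in the question calls append on the DataFrame class rather than on an instance, and append adds rows rather than setting a cell (it was also removed in pandas 2.0). If a row-by-row loop is really wanted, one possible sketch, assuming the same sample data, sets each cell with df.at. This is usually slower than the vectorized solutions above and is shown only for comparison:

```python
import pandas as pd

df = pd.DataFrame({'city': ['mumbai', 'mumbai', 'mumbai', 'a'],
                   'rent': [1000, 6000, 10000, 10000]})

# Start with the default label for every row
df['status'] = 'Flag'

for index, row in df.iterrows():
    if row['city'] == 'mumbai' and 5000 <= row['rent'] <= 15000:
        # df.at sets a single cell addressed by row label and column name
        df.at[index, 'status'] = 'Correct'

print(df)
```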

To write the result to Excel, use to_excel; add index=False if you need to omit the index column:

df.to_excel('file.xlsx', index=False)

EDIT:

For multiple masks, it is possible to combine them with |, or with reduce over a list of masks:

df = pd.DataFrame({'city':['Mumbai','Mumbai','Delhi', 'Delhi', 'Bangalore', 'Bangalore'],
                   'rent':[1000,6000,10000,1000,4000,5000]})
print (df)
        city   rent
0     Mumbai   1000
1     Mumbai   6000
2      Delhi  10000
3      Delhi   1000
4  Bangalore   4000
5  Bangalore   5000

m1 = (df['city']=='Mumbai') & df['rent'].between(5000,15000)
m2 = (df['city']=='Delhi') & df['rent'].between(1000,5000)
m3 = (df['city']=='Bangalore') & df['rent'].between(3000,5000)

m = m1 | m2 | m3
print (m)
0    False
1     True
2    False
3     True
4     True
5     True
dtype: bool

from functools import reduce
mList = [m1,m2,m3]
m = reduce(lambda x,y: x | y, mList)
print (m)
0    False
1     True
2    False
3     True
4     True
5     True
dtype: bool

print (df[m])
        city  rent
1     Mumbai  6000
3      Delhi  1000
4  Bangalore  4000
5  Bangalore  5000
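
When the number of cities grows, the per-city masks above could also be derived from a lookup table instead of being written out one by one. A rough sketch, assuming a hypothetical dict named ranges that maps each city to its (low, high) bounds; cities missing from the dict get NaN bounds and are therefore flagged:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'city': ['Mumbai', 'Mumbai', 'Delhi', 'Delhi', 'Bangalore', 'Bangalore'],
                   'rent': [1000, 6000, 10000, 1000, 4000, 5000]})

# Hypothetical lookup table: city -> (lowest, highest) acceptable rent
ranges = {'Mumbai': (5000, 15000), 'Delhi': (1000, 5000), 'Bangalore': (3000, 5000)}

# Per-row bounds looked up from the city column (NaN for unknown cities)
low = df['city'].map({city: bounds[0] for city, bounds in ranges.items()})
high = df['city'].map({city: bounds[1] for city, bounds in ranges.items()})

# Inclusive range check; comparisons against NaN evaluate to False
m = (df['rent'] >= low) & (df['rent'] <= high)
df['status'] = np.where(m, 'Correct', 'Flag')
print(df)
```

This keeps the ranges in one place, so adding a city means adding a dict entry rather than another mask.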
