pandas: Insert value in pandas DataFrame

Notice: this page is a bilingual (Chinese/English) translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverFlow URL: http://stackoverflow.com/questions/44500136/

Date: 2020-09-14 03:46:21  Source: igfitidea

Insert a value in a pandas DataFrame

Tags: python, excel, pandas

Asked by NILESH SUTHAR

I have data in an Excel sheet. I want to check whether the value in one column lies in a range (5000-15000) and, if it does, insert a value (Correct or Flag) in another column.

I have three columns: city, rent, status.

I have tried the append and insert methods, but they didn't work. How should I do this?

Here is my code:

for index, row in df.iterrows():
    if row['city']=='mumbai':
        if 5000<= row['rent']<=15000:
            pd.DataFrame.append({'Status': 'Correct'})

It shows this error:

TypeError: append() missing 1 required positional argument: 'other'

How should I insert values into a column row by row?

Accepted answer by jezrael

I think you can use numpy.where with a boolean mask created by combining between with a comparison against city:

mask = (df['city']=='mumbai') & df['rent'].between(5000,15000)
df['status'] = np.where(mask, 'Correct', 'Flag')

Sample:

import numpy as np
import pandas as pd

df = pd.DataFrame({'city':['mumbai','mumbai','mumbai', 'a'],
                   'rent':[1000,6000,10000,10000]})
# True only for Mumbai rows whose rent is within 5000-15000
mask = (df['city']=='mumbai') & df['rent'].between(5000,15000)
df['status'] = np.where(mask, 'Correct', 'Flag')
print (df)
     city   rent   status
0  mumbai   1000     Flag
1  mumbai   6000  Correct
2  mumbai  10000  Correct
3       a  10000     Flag
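
Note that between includes both endpoints by default, so a rent of exactly 5000 or 15000 counts as in range. A minimal sketch checking this against the explicit comparison (the boundary values below are made up for illustration and are not from the question):

```python
import pandas as pd

# Hypothetical rents chosen to sit on and just outside the boundaries
rent = pd.Series([4999, 5000, 15000, 15001])

in_range = rent.between(5000, 15000)          # both endpoints included by default
manual = (rent >= 5000) & (rent <= 15000)     # the same test written out by hand

print(in_range.tolist())
print(in_range.equals(manual))
```

If the range should exclude an endpoint, writing the comparison out by hand with < or > is the most portable option.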

Another solution, using loc:

mask = (df['city']=='mumbai') & df['rent'].between(5000,15000)
df['status'] = 'Flag'
df.loc[mask, 'status'] =  'Correct'
print (df)
     city   rent   status
0  mumbai   1000     Flag
1  mumbai   6000  Correct
2  mumbai  10000  Correct
3       a  10000     Flag
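
If a row-by-row loop like the one in the question is really wanted, one possible sketch is to set each cell with df.at instead of calling append. This is only an illustration of the loop form on the same sample data; the vectorized solutions above are generally faster and are what the answer recommends.

```python
import pandas as pd

df = pd.DataFrame({'city': ['mumbai', 'mumbai', 'mumbai', 'a'],
                   'rent': [1000, 6000, 10000, 10000]})

df['status'] = 'Flag'  # default value for every row
for index, row in df.iterrows():
    if row['city'] == 'mumbai' and 5000 <= row['rent'] <= 15000:
        # df.at sets a single cell, addressed by row label and column name
        df.at[index, 'status'] = 'Correct'

print(df['status'].tolist())
```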

To write the result to Excel, use to_excel; to omit the index column, add index=False:

df.to_excel('file.xlsx', index=False)

EDIT:

For multiple masks, you can combine them with |, or with reduce when the masks are in a list:

df = pd.DataFrame({'city':['Mumbai','Mumbai','Delhi', 'Delhi', 'Bangalore', 'Bangalore'],
                   'rent':[1000,6000,10000,1000,4000,5000]})
print (df)
        city   rent
0     Mumbai   1000
1     Mumbai   6000
2      Delhi  10000
3      Delhi   1000
4  Bangalore   4000
5  Bangalore   5000


m1 = (df['city']=='Mumbai') & df['rent'].between(5000,15000)
m2 = (df['city']=='Delhi') & df['rent'].between(1000,5000)
m3 = (df['city']=='Bangalore') & df['rent'].between(3000,5000)

m = m1 | m2 | m3
print (m)
0    False
1     True
2    False
3     True
4     True
5     True
dtype: bool

from functools import reduce
mList = [m1,m2,m3]
m = reduce(lambda x,y: x | y, mList)
print (m)
0    False
1     True
2    False
3     True
4     True
5     True
dtype: bool

print (df[m])
        city  rent
1     Mumbai  6000
3      Delhi  1000
4  Bangalore  4000
5  Bangalore  5000
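
The combined mask can then be used to fill the status column in the same way as before. A minimal sketch, assuming the per-city ranges are kept in a dictionary (the name ranges and this layout are illustrative, not part of the original answer):

```python
from functools import reduce
import operator

import numpy as np
import pandas as pd

df = pd.DataFrame({'city': ['Mumbai', 'Mumbai', 'Delhi', 'Delhi', 'Bangalore', 'Bangalore'],
                   'rent': [1000, 6000, 10000, 1000, 4000, 5000]})

# Hypothetical lookup of (low, high) rent ranges, same values as m1, m2, m3
ranges = {'Mumbai': (5000, 15000), 'Delhi': (1000, 5000), 'Bangalore': (3000, 5000)}

# One boolean mask per city, then OR them all together
masks = [(df['city'] == city) & df['rent'].between(low, high)
         for city, (low, high) in ranges.items()]
m = reduce(operator.or_, masks)

df['status'] = np.where(m, 'Correct', 'Flag')
print(df)
```

Keeping the ranges in one dictionary means a new city only needs a new entry rather than another hand-written mask.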