Replace cell values in each row of pandas column using for loop
Original question: http://stackoverflow.com/questions/51001165/
This content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Asked by j.stalin
Please help me understand my error.
I'm trying to change one column in my .csv file.
The .csv file looks like this:
sku,name,code
k1,aaa,886
k2,bbb,898
k3,ccc,342
k4,ddd,503
k5,eee,401
I want to replace the "k" symbol with the "_" symbol in the "sku" column.
I wrote this code:
import sys
import pandas as pd
import numpy as np
import datetime
df = pd.read_csv('cat0.csv')
for r in df['sku']:
    r1 = r.replace('k', '_')
    df['sku'] = r1
print (df)
But the code inserts the last value into every row of the "sku" column, so I get:
  sku name  code
0  _5  aaa   886
1  _5  bbb   898
2  _5  ccc   342
3  _5  ddd   503
4  _5  eee   401
I want to get the following:
  sku name  code
0  _1  aaa   886
1  _2  bbb   898
2  _3  ccc   342
3  _4  ddd   503
4  _5  eee   401
Accepted answer by Jan
You can use str.replace on the whole column:
from io import StringIO
import pandas as pd
data = """sku,name,code
k1,aaa,886
k2,bbb,898
k3,ccc,342
k4,ddd,503
k5,eee,401"""
file = StringIO(data)
df = pd.read_csv(file)
df['sku'] = df['sku'].str.replace('k', '_')
print(df)
This yields:
  sku name  code
0  _1  aaa   886
1  _2  bbb   898
2  _3  ccc   342
3  _4  ddd   503
4  _5  eee   401
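One detail worth knowing about the call above: `str.replace` takes a `regex` argument, and its default has differed between pandas versions, so passing it explicitly can avoid surprises. The sketch below is an illustration and not part of the original answer; the extra `kk3` value and the anchored pattern `^k` are assumptions chosen to show the difference between literal and pattern-based replacement.

```python
import pandas as pd

# Hypothetical data; "kk3" is added only to make the two modes differ.
df = pd.DataFrame({"sku": ["k1", "k2", "kk3"], "name": ["aaa", "bbb", "ccc"]})

# Literal replacement: every "k" in each cell becomes "_".
literal = df["sku"].str.replace("k", "_", regex=False)

# Regex replacement: only a "k" at the start of the cell is replaced.
leading_only = df["sku"].str.replace(r"^k", "_", regex=True)

print(literal.tolist())       # ['_1', '_2', '__3']
print(leading_only.tolist())  # ['_1', '_2', '_k3']
```

For the data in the question, where each value contains a single leading "k", both calls give the same result.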
Answer by Zev
As @Jan mentioned, using df['sku'] = df['sku'].str.replace('k', '_') is the best and quickest way to do this.
However, to understand why you are getting those results, and to show a way that stays as close as possible to what you were doing, you could do:
import pandas as pd
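df = pd.DataFrame(
    {
        'sku': ["k1", "k2", "k3", "k4", "k5"],
        'name': ["aaa", "bbb", "ccc", "ddd", "eee"],
        'code': [886, 898, 342, 503, 401]
    }, columns=["sku", "name", "code"]
)
# enumerate() yields positions 0..4, which match the default index labels here
for i, r in enumerate(df['sku']):
    r1 = r.replace('k', '_')
    # write the new value into this row only
    df.at[i, 'sku'] = r1
print(df)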
Which gives:
  sku name  code
0  _1  aaa   886
1  _2  bbb   898
2  _3  ccc   342
3  _4  ddd   503
4  _5  eee   401
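The loop above pairs `enumerate` with `df.at`, which works because the positions 0..4 happen to equal the default index labels. As a hedged variation for frames whose index is not the default one, the sketch below iterates with `Series.items()` so that the actual label is passed to `.at`. The string index `a`, `b`, `c` is a made-up example, not something from the question.

```python
import pandas as pd

# Hypothetical frame with a non-default (string) index.
df = pd.DataFrame(
    {"sku": ["k1", "k2", "k3"], "code": [886, 898, 342]},
    index=["a", "b", "c"],
)

# Series.items() yields (index label, value) pairs, so the label can be
# passed straight to .at regardless of what the index looks like.
for label, value in df["sku"].items():
    df.at[label, "sku"] = value.replace("k", "_")

print(df)
```

This keeps the row-by-row style of the question while updating one cell per iteration.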
In your code...
for r in df['sku']:
    r1 = r.replace('k', '_')
...the issue is here:
df['sku'] = r1
You are broadcasting your result to the entire column rather than just the row you are working on. Every iteration overwrites all rows with the current value, so after the last iteration the whole column holds "_5".
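To see the broadcasting in isolation, the short sketch below assigns a single string to a column and then shows one element-wise alternative. It is an illustration only: the three-row frame is made up, and the list comprehension is just one of several ways to set one value per row, not something taken from the answers above.

```python
import pandas as pd

df = pd.DataFrame({"sku": ["k1", "k2", "k3"]})

# Assigning a scalar to a column writes that one value into every row.
broadcast = df.copy()
broadcast["sku"] = "_3"
print(broadcast["sku"].tolist())  # ['_3', '_3', '_3']

# Assigning a list with one entry per row sets each row separately.
per_row = df.copy()
per_row["sku"] = [s.replace("k", "_") for s in df["sku"]]
print(per_row["sku"].tolist())  # ['_1', '_2', '_3']
```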

