Replace cell values in each row of pandas column using for loop
Notice: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/51001165/
Asked by j.stalin
Please help me understand my error. I'm trying to change one column in my .csv file. The file looks like this:
sku,name,code
k1,aaa,886
k2,bbb,898
k3,ccc,342
k4,ddd,503
k5,eee,401
I want to replace the "k" symbol with the "_" symbol in the "sku" column. I wrote this code:
import sys
import pandas as pd
import numpy as np
import datetime

df = pd.read_csv('cat0.csv')
for r in df['sku']:
    r1 = r.replace('k', '_')
    df['sku'] = r1
print(df)
But the code inserts the last value into every row of the "sku" column, so I get:
  sku name  code
0  _5  aaa   886
1  _5  bbb   898
2  _5  ccc   342
3  _5  ddd   503
4  _5  eee   401
I want to get the following:
  sku name  code
0  _1  aaa   886
1  _2  bbb   898
2  _3  ccc   342
3  _4  ddd   503
4  _5  eee   401
Accepted answer by Jan
You can use str.replace on the whole column:
from io import StringIO
import pandas as pd
data = """sku,name,code
k1,aaa,886
k2,bbb,898
k3,ccc,342
k4,ddd,503
k5,eee,401"""
file = StringIO(data)
df = pd.read_csv(file)
df['sku'] = df['sku'].str.replace('k', '_')
print(df)
This yields:
  sku name  code
0  _1  aaa   886
1  _2  bbb   898
2  _3  ccc   342
3  _4  ddd   503
4  _5  eee   401
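As a small variation on the answer above, the sketch below passes regex=False explicitly. The default value of that argument has differed between pandas versions, so stating it makes the intent (a literal substring replacement) clear. The shortened three-row frame is an assumption made for brevity and is not the asker's file.

```python
import pandas as pd

# Shortened sample data (assumption: same shape as the question's 'sku' column)
df = pd.DataFrame({"sku": ["k1", "k2", "k3"], "name": ["aaa", "bbb", "ccc"]})

# regex=False treats 'k' as a literal substring rather than a regular expression
df["sku"] = df["sku"].str.replace("k", "_", regex=False)

print(df["sku"].tolist())
```

For a plain character such as 'k' both modes give the same result; the explicit flag matters mainly when the text being replaced contains regex metacharacters such as '.' or '('.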
Answer by Zev
As @Jan mentioned, df['sku'] = df['sku'].str.replace('k', '_') is the best and quickest way to do this.
However, to understand why you are getting those results, and to show a fix that stays as close as possible to your original approach, you could do:
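import pandas as pd

df = pd.DataFrame(
    {
        'sku': ["k1", "k2", "k3", "k4", "k5"],
        'name': ["aaa", "bbb", "ccc", "ddd", "eee"],
        'code': [886, 898, 342, 503, 401]
    }, columns=["sku", "name", "code"]
)
# enumerate() supplies the row position, which matches the default 0..n-1 index
for i, r in enumerate(df['sku']):
    r1 = r.replace('k', '_')
    # write the new value to a single cell instead of the whole column
    df.at[i, 'sku'] = r1
print(df)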
Which gives:
  sku name  code
0  _1  aaa   886
1  _2  bbb   898
2  _3  ccc   342
3  _4  ddd   503
4  _5  eee   401
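The loop above relies on the row position from enumerate() being the same as the index label, which holds for the default integer index. As a hedged sketch of one way to keep the loop style when the index is something else, Series.items() yields (label, value) pairs that can be passed straight to df.at. The string index used here is a hypothetical example, not part of the original question.

```python
import pandas as pd

# Hypothetical frame with a non-default index, to show label-based assignment
df = pd.DataFrame(
    {"sku": ["k1", "k2", "k3"], "code": [886, 898, 342]},
    index=["a", "b", "c"],
)

# Series.items() yields (index label, value) pairs
for label, value in df["sku"].items():
    # df.at sets exactly one cell, addressed by row label and column name
    df.at[label, "sku"] = value.replace("k", "_")

print(df)
```

This is still slower than the vectorised str.replace call on large frames; it is shown only to stay close to the loop-based approach.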
In your code...
for r in df['sku']:
    r1 = r.replace('k', '_')
...the issue is here:
    df['sku'] = r1
Assigning a single value to df['sku'] broadcasts it to the entire column rather than just the row you are working on. Each pass of the loop therefore overwrites every row, and only the value from the last pass (_5) remains.
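The broadcasting behaviour can be seen in isolation with a minimal sketch. The three-row frame and the variable names (original, broadcast, per_row) are illustrative assumptions, not taken from the question.

```python
import pandas as pd

df = pd.DataFrame({"sku": ["k1", "k2", "k3"]})
original = df["sku"].tolist()

# Assigning one string to a column fills every row with that string
df["sku"] = "_3"
broadcast = df["sku"].tolist()
print(broadcast)  # ['_3', '_3', '_3']

# Assigning a list of matching length sets one value per row instead
df["sku"] = [s.replace("k", "_") for s in original]
per_row = df["sku"].tolist()
print(per_row)  # ['_1', '_2', '_3']
```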