Replace cell values in every row of a pandas column using a for loop

Question

Asked by j.stalin

Please help me understand my error. I'm trying to change one column in my .csv file. The file looks like this:

sku,name,code  
k1,aaa,886  
k2,bbb,898  
k3,ccc,342  
k4,ddd,503  
k5,eee,401

I want to replace the "k" character with the "_" character in the "sku" column.
I wrote this code:

import sys  
import pandas as pd  
import numpy as np  
import datetime  

df = pd.read_csv('cat0.csv')  

for r in df['sku']:  
    r1 = r.replace('k', '_')  
    df['sku'] = r1  

print (df)

But the code inserts the last value in every row of the "sku" column. So I get:

  sku name  code
0  _5  aaa   886
1  _5  bbb   898
2  _5  ccc   342
3  _5  ddd   503
4  _5  eee   401

I want to get the following:

  sku name  code
0  _1  aaa   886
1  _2  bbb   898
2  _3  ccc   342
3  _4  ddd   503
4  _5  eee   401

Answer 1

Accepted answer by Jan

You can use str.replace on the whole column:

from io import StringIO
import pandas as pd

data = """sku,name,code  
k1,aaa,886  
k2,bbb,898  
k3,ccc,342  
k4,ddd,503  
k5,eee,401"""

file = StringIO(data)

df = pd.read_csv(file)
df['sku'] = df['sku'].str.replace('k', '_')

print(df)

This yields

  sku name  code  
0  _1  aaa     886
1  _2  bbb     898
2  _3  ccc     342
3  _4  ddd     503
4  _5  eee     401
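
One detail worth knowing: Series.str.replace can treat the pattern either as a regular expression or as a literal string, and the default for its regex parameter has differed between pandas versions. A minimal sketch, assuming you want a plain literal replacement, passes regex=False explicitly so the behavior does not depend on the installed version. The small DataFrame here is made up for illustration:

```python
import pandas as pd

# Illustrative data modeled on the question's CSV (not read from a file).
df = pd.DataFrame({"sku": ["k1", "k2", "k3"], "code": [886, 898, 342]})

# regex=False makes "k" a literal string rather than a regular expression.
df["sku"] = df["sku"].str.replace("k", "_", regex=False)

skus = df["sku"].tolist()
print(skus)
```

For a single plain character like "k" both modes give the same result; the explicit flag matters once the pattern contains characters such as "." or "$".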

Answer 2

Answered by Zev

As @Jan mentioned, df['sku'] = df['sku'].str.replace('k', '_') is the best and quickest way to do this.

However, to understand why you are getting these results, and to show an approach as close as possible to your original code, you could do:

import pandas as pd

df = pd.DataFrame(
    {
        'sku':["k1", "k2", "k3", "k4", "k5"], 
        'name': ["aaa", "bbb", "ccc", "ddd", "eee"], 
        'code':[886, 898,342,503,401]
    }, columns =["sku", "name", "code"]
)

for i, r in enumerate(df['sku']):  
    r1 = r.replace('k', '_')
    df.at[i, 'sku'] = r1

Which gives:

  sku name  code
0  _1  aaa   886
1  _2  bbb   898
2  _3  ccc   342
3  _4  ddd   503
4  _5  eee   401

In your code...

for r in df['sku']:  
    r1 = r.replace('k', '_')

...the issue is here:

    df['sku'] = r1

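You are broadcasting your result to the entire column rather than just the row you are working on. Each iteration overwrites every row with the current value, so after the loop finishes only the last value ("_5") remains.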
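
To see both effects side by side, here is a small sketch with made-up data. It first shows that assigning a single string to a column fills every row, then does a per-row loop that writes by index label. The non-default index (10, 20, 30) is an assumption chosen to show that pairing enumerate with .at only lines up when the index is the default 0..n-1; iterating with Series.items() yields the actual labels instead.

```python
import pandas as pd

# Illustrative frame with a non-default index (labels 10, 20, 30).
df = pd.DataFrame({"sku": ["k1", "k2", "k3"]}, index=[10, 20, 30])

# Assigning one scalar to a column broadcasts it to every row.
broadcast = df.copy()
broadcast["sku"] = "_3"
broadcast_values = broadcast["sku"].tolist()

# Per-row update: items() yields (index label, value) pairs,
# so .at always receives a label that exists in the index.
fixed = df.copy()
for label, value in df["sku"].items():
    fixed.at[label, "sku"] = value.replace("k", "_")
fixed_values = fixed["sku"].tolist()

print(broadcast_values)
print(fixed_values)
```

The loop is shown only to mirror the original code; the vectorized str.replace from the accepted answer remains the simpler choice.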
