pandas: How do I remove a substring from the strings in a DataFrame column?

Question

Asked by MEhsan

I have this simplified dataframe:

ID, Date
1 8/24/1995
2 8/1/1899 :00

How can I use pandas to find any date in the dataframe that has an extra :00 and remove it?

Any idea how to solve this problem?

I tried this syntax, but it did not help:

df[df["Date"].str.replace(to_replace="\s:00", value="")]

The output should look like this:

ID, Date
1 8/24/1995
2 8/1/1899

Answer 1

Answered by Psidom

You need to assign the trimmed column back to the original column instead of subsetting. Also, the str.replace method does not take to_replace and value parameters; it takes pat and repl instead:

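# regex=True is needed in pandas 2.0+, where str.replace matches the pattern literally by default
df["Date"] = df["Date"].str.replace(r"\s:00", "", regex=True)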

df
#   ID       Date
#0   1  8/24/1995
#1   2   8/1/1899
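
For reference, here is a minimal self-contained sketch of the assignment approach above. The frame is rebuilt from the two sample rows in the question, and `regex=True` is passed explicitly on the assumption that pandas 2.0 or later is in use, where `str.replace` treats the pattern as a literal string unless told otherwise.

```python
import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({"ID": [1, 2], "Date": ["8/24/1995", "8/1/1899 :00"]})

# Assign the cleaned column back; the result of str.replace is a new Series
df["Date"] = df["Date"].str.replace(r"\s:00", "", regex=True)

print(df["Date"].tolist())
```

The key point is the assignment: `str.replace` does not modify the column in place, so its result has to be stored back in `df["Date"]`.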

Answer 2

Answered by piRSquared

To apply this to an entire dataframe, I'd stack, then unstack:

df.stack().str.replace(r'\s:00', '', regex=True).unstack()
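
As a rough illustration of the stack/unstack idea, the sketch below uses a small two-column frame. The column names `Start` and `End` and their values are hypothetical, chosen only so that more than one column needs cleaning; every cell is assumed to be a string.

```python
import pandas as pd

# Hypothetical two-column frame; both columns hold date strings
df = pd.DataFrame(
    {"Start": ["8/24/1995", "8/1/1899 :00"], "End": ["9/1/1995 :00", "8/2/1899"]},
    index=pd.Index([1, 2], name="ID"),
)

# stack() folds all columns into one Series, so .str.replace touches every cell;
# unstack() then spreads the values back out into columns
cleaned = df.stack().str.replace(r"\s:00", "", regex=True).unstack()

print(cleaned)
```

This relies on the row and column labels being unique, since unstacking needs unambiguous labels to rebuild the frame.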

As a function:

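import pandas as pd

def dfreplace(df, *args, **kwargs):
    # Flatten every cell into a single Series so .str.replace runs once
    s = pd.Series(df.values.flatten())
    s = s.str.replace(*args, **kwargs)
    # Rebuild the frame with the original shape, index, and columns
    return pd.DataFrame(s.values.reshape(df.shape), df.index, df.columns)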

Examples

df = pd.DataFrame(['8/24/1995', '8/1/1899 :00'], pd.Index([1, 2], name='ID'), ['Date'])

dfreplace(df, r'\s:00', '', regex=True)

rng = range(5)
df2 = pd.concat([pd.concat([df for _ in rng]) for _ in rng], axis=1)

df2

dfreplace(df2, r'\s:00', '', regex=True)
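
A related option that neither answer mentions: `DataFrame.replace` does accept the `to_replace` and `value` keywords that the question tried, and with `regex=True` it substitutes matching substrings in every string cell. The sketch below is an alternative to the approaches above, not a restatement of them, and reuses the question's sample data.

```python
import pandas as pd

# Same sample frame as in the examples above
df = pd.DataFrame(
    ["8/24/1995", "8/1/1899 :00"],
    index=pd.Index([1, 2], name="ID"),
    columns=["Date"],
)

# DataFrame.replace (not Series.str.replace) takes to_replace/value;
# regex=True makes it substitute the matching substring in each string cell
cleaned = df.replace(to_replace=r"\s:00", value="", regex=True)

print(cleaned["Date"].tolist())
```

Like `str.replace`, this returns a new object, so the result has to be assigned to a name to be kept.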
