Disclaimer: this page is a translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license and attribute it to the original authors (not me). Original StackOverFlow question: http://stackoverflow.com/questions/37001787/

Date: 2020-08-19 18:40:04  Source: igfitidea

Remove ends of string entries in pandas DataFrame column

Tags: python, pandas, dataframe, string-matching

Asked by ShanZhengYang

I have a pandas DataFrame in which one column, filename, holds a list of file names:

import pandas as pd
df = pd.read_csv('fname.csv')

df.head()

filename    A    B    C
fn1.txt   2    4    5
fn2.txt   1    2    1
fn3.txt   ....
....

I would like to delete the file extension .txt from each entry in filename. How do I accomplish this?

I tried:

df['filename'] = df['filename'].map(lambda x: str(x)[:-4])

but when I look at the column entries afterwards with df.head(), nothing has changed.

How does one do this?

Answered by jezrael

I think you can use str.replace with a regex that ends in $ (which matches the end of the string), so that only a trailing .txt is removed:

import pandas as pd

df = pd.DataFrame({'A': {0: 2, 1: 1}, 
                   'C': {0: 5, 1: 1}, 
                   'B': {0: 4, 1: 2}, 
                   'filename': {0: "txt.txt", 1: "x.txt"}}, 
                columns=['filename','A','B', 'C'])

print(df)
  filename  A  B  C
0  txt.txt  2  4  5
1    x.txt  1  2  1

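# Escape the dot so it matches a literal '.'; regex=True is required since pandas 2.0,
# where str.replace treats the pattern as a literal string by default
df['filename'] = df['filename'].str.replace(r'\.txt$', '', regex=True)
print(df)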
  filename  A  B  C
0      txt  2  4  5
1        x  1  2  1

# Alternative, applied to the original df: drop the last four characters of each entry
df['filename'] = df['filename'].map(lambda x: str(x)[:-4])
print(df)
  filename  A  B  C
0      txt  2  4  5
1        x  1  2  1

# Alternative, applied to the original df: vectorized slicing through the .str accessor
df['filename'] = df['filename'].str[:-4]
print(df)
  filename  A  B  C
0      txt  2  4  5
1        x  1  2  1
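
To illustrate why escaping the dot in the regex matters, here is a minimal sketch. The file names are hypothetical and chosen only to show the difference; it assumes a pandas version in which str.replace accepts the regex keyword.

```python
import pandas as pd

# Hypothetical file names (not from the question), picked to expose the difference.
names = pd.Series(["report.txt", "atxt", "notes.txt.bak"])

# Unescaped dot: "." matches any character, so "atxt" is wiped out entirely.
loose = names.str.replace(r".txt$", "", regex=True)

# Escaped dot: only a literal ".txt" at the very end of the string is removed.
strict = names.str.replace(r"\.txt$", "", regex=True)

print(loose.tolist())
print(strict.tolist())
```

In both cases "notes.txt.bak" is left alone, because the $ anchor only allows a match at the end of the string.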

EDIT:

rstrip can remove more characters than intended: it strips every trailing character that belongs to the given set (in this case ., t and x) rather than the exact suffix, so names that end in those characters lose them as well:

Example:

print(df)
  filename  A  B  C
0  txt.txt  2  4  5
1    x.txt  1  2  1

df['filename'] = df['filename'].str.rstrip('.txt')

print(df)
  filename  A  B  C
0           2  4  5
1           1  2  1
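
As a possible alternative that avoids this pitfall, the sketch below compares rstrip with str.removesuffix. It assumes pandas 1.4 or later, where Series.str.removesuffix is available; the third file name is hypothetical and added only for contrast.

```python
import pandas as pd

# "data.csv" is a hypothetical extra entry, not part of the original example.
names = pd.Series(["txt.txt", "x.txt", "data.csv"])

# rstrip treats ".txt" as the character set {'.', 't', 'x'} and strips
# trailing characters until it meets one outside that set.
stripped = names.str.rstrip(".txt")

# removesuffix (assumes pandas >= 1.4) removes the exact suffix, at most once.
trimmed = names.str.removesuffix(".txt")

print(stripped.tolist())
print(trimmed.tolist())
```

Entries that do not end in the suffix, such as "data.csv", pass through removesuffix unchanged.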

Answered by EdChum

You can use str.rstrip to remove the endings:

df['filename'] = df['filename'].str.rstrip('.txt')

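This should work as long as the part of the name before the extension does not itself end in '.', 't' or 'x', because rstrip strips a set of characters rather than an exact suffix.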

Answered by Paweł Kordek

You may want:

df['filename'] = df.apply(lambda x: x['filename'][:-4], axis = 1)
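
Slicing off the last four characters assumes every entry ends in a four-character extension. A minimal sketch of a guarded variant, which only trims entries that actually end in .txt, might look like this; the sample frame and the names suffix and has_suffix are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; "readme" has no extension at all.
df = pd.DataFrame({"filename": ["fn1.txt", "fn2.txt", "readme"], "A": [2, 1, 7]})

suffix = ".txt"

# Boolean mask: True where the entry really ends with the suffix.
has_suffix = df["filename"].str.endswith(suffix)

# Slice every entry, then keep the sliced value only where the mask is True;
# elsewhere Series.where falls back to the original, untouched name.
df["filename"] = df["filename"].str[: -len(suffix)].where(has_suffix, df["filename"])

print(df)
```

This stays vectorized, so it avoids the row-by-row apply while leaving names without the extension intact.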

Answered by Blue Moon

Use a list comprehension:

df['filename'] = [x[:-4] for x in df['filename']]
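
If the extensions are not all four characters long, the same list-comprehension pattern can be combined with os.path.splitext from the standard library. This is a sketch under the assumption that only the last extension should be dropped; the file names are hypothetical.

```python
import os.path

import pandas as pd

# Hypothetical names with different extension lengths, plus one without any.
df = pd.DataFrame({"filename": ["fn1.txt", "archive.tar.gz", "noext"]})

# os.path.splitext splits off only the final extension and returns (root, ext);
# names without an extension come back unchanged as the root.
df["filename"] = [os.path.splitext(name)[0] for name in df["filename"]]

print(df)
```

Note that "archive.tar.gz" becomes "archive.tar", since only the last extension is removed.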