Replacing part of string in python pandas dataframe

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me): StackOverFlow. Original question: http://stackoverflow.com/questions/14345739/

Date: 2020-08-18 11:08:46  Source: igfitidea

Tags: python, csv, pandas

Asked by joseph_pindi

I have a similar problem to the one posted here:

Pandas DataFrame: remove unwanted parts from strings in a column

I need to remove newline characters from within a string in a DataFrame. Basically, I've accessed an api using python's json module and that's all ok. Creating the DataFrame works amazingly, too. However, when I want to finally output the end result into a csv, I get a bit stuck, because there are newlines that are creating false 'new rows' in the csv file.

So basically I'm trying to turn this:

'...this is a paragraph.

And this is another paragraph...'

into this:

'...this is a paragraph. And this is another paragraph...'

I don't care about preserving any kind of '\n' or any special symbols for the paragraph break. So it can be stripped right out.

I've tried a few variations:

misc['product_desc'] = misc['product_desc'].strip('\n')

AttributeError: 'Series' object has no attribute 'strip'

Here's another attempt:

misc['product_desc'] = misc['product_desc'].str.strip('\n')

TypeError: wrapper() takes exactly 1 argument (2 given)

misc['product_desc'] = misc['product_desc'].map(lambda x: x.strip('\n'))
misc['product_desc'] = misc['product_desc'].map(lambda x: x.strip('\n\t'))

There is no error message, but the newline characters don't go away, either. Same thing with this:

misc = misc.replace('\n', '')

The line that writes to csv is this:

misc_id.to_csv('C:\Users\jlalonde\Desktop\misc_w_id.csv', sep=' ', na_rep='', index=False, encoding='utf-8')

Version of Pandas is 0.9.1

Thanks! :)

Accepted answer by BrenBarn

strip only removes the specified characters at the beginning and end of the string. If you want to remove all \n, you need to use replace.

misc['product_desc'] = misc['product_desc'].str.replace('\n', '')
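A minimal sketch of the difference, assuming a small made-up DataFrame with a product_desc column (the sample text is illustrative, not the asker's data). regex=False is passed explicitly because the default for that parameter has changed across pandas releases; very old versions such as 0.9.1 may not accept the keyword, in which case it can simply be dropped.

```python
import pandas as pd

# Hypothetical sample data: newlines at both ends and in the middle.
misc = pd.DataFrame({
    "product_desc": ["\nthis is a paragraph.\nAnd this is another paragraph\n"]
})

# strip only trims the ends, so the interior newline survives.
stripped = misc["product_desc"].str.strip("\n")

# replace substitutes every occurrence, wherever it appears.
replaced = misc["product_desc"].str.replace("\n", "", regex=False)

print(repr(stripped[0]))
print(repr(replaced[0]))
```

Note that removing the newline outright joins the two sentences with no space between them; passing " " as the replacement instead of "" keeps them separated.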

Answer by Anton Protopopov

You could use the regex parameter of the replace method to achieve that:

misc['product_desc'] = misc['product_desc'].replace(to_replace='\n', value='', regex=True)
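As a sketch of why the regex flag matters, assuming a made-up two-column frame: without regex=True, replace compares whole cell values, so a plain misc.replace('\n', '') only changes cells that are exactly a newline. With regex=True the pattern is applied as a substring substitution. The character-class pattern and the single-space replacement below are illustrative choices rather than part of the original answer; they also cover Windows-style \r\n line endings and avoid gluing sentences together.

```python
import pandas as pd

# Hypothetical sample data, not taken from the question.
misc = pd.DataFrame({
    "product_desc": ["this is a paragraph.\r\n\r\nAnd this is another paragraph"],
    "note": ["\n"],
})

# Whole-value matching: only the cell that is exactly "\n" changes.
whole = misc.replace("\n", "")

# Regex matching: each run of CR/LF characters becomes a single space.
cleaned = misc.replace(to_replace=r"[\r\n]+", value=" ", regex=True)

print(repr(whole.loc[0, "product_desc"]))
print(repr(cleaned.loc[0, "product_desc"]))
```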