Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/37543724/

Time: 2020-08-19 19:32:34  Source: igfitidea

Python regex for finding all words in a string

python regex words sentence

Asked by TNT

Hello, I am new to regex and I'm starting out with Python. I'm stuck at extracting all the words from an English sentence. So far I have:

import re

shop = "hello seattle what have you got"
regex = r'(\w*) '
list1 = re.findall(regex, shop)
print(list1)

This gives the output:

['hello', 'seattle', 'what', 'have', 'you']

If I replace the regex with

regex = r'(\w*)\W*'

then the output is:

['hello', 'seattle', 'what', 'have', 'you', 'got', '']

whereas I want this output:

['hello', 'seattle', 'what', 'have', 'you', 'got']

Please point out where I am going wrong.

Answered by Pranav C Balan

Use the word boundary \b, and \w+ instead of \w* so that each match needs at least one word character:

import re

shop = "hello seattle what have you got"
# \b anchors at word edges; \w+ needs one or more word characters
regex = r'\b\w+\b'
list1 = re.findall(regex, shop)
print(list1)

Output: ['hello', 'seattle', 'what', 'have', 'you', 'got']
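
To see why the second pattern from the question returns a trailing empty string while the word-boundary pattern does not, it can help to print where each match starts and ends. The following is only a sketch for Python 3; the variable names `loose` and `strict` are made up for illustration and are not part of the original question or answer.

```python
import re

shop = "hello seattle what have you got"

# Pattern from the question: \w* and \W* can both match zero characters,
# so after 'got' is consumed the regex still matches the empty string
# at the very end of the input.
loose = [(m.span(), m.group(1)) for m in re.finditer(r'(\w*)\W*', shop)]
for span, word in loose:
    print(span, repr(word))

# \w+ needs at least one word character, so an empty match is impossible.
strict = [m.span() for m in re.finditer(r'\b\w+\b', shop)]
print(strict)
```

The last entry of `loose` is a zero-width match at the end of the string, which is where the extra '' in the question's output comes from. The first pattern, `(\w*) `, fails for a different reason: it requires a space after every word, and 'got' has none.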

Or simply \w+ is enough:

import re

shop = "hello seattle what have you got"
# greedy \w+ already consumes each whole run of word characters
regex = r'\w+'
list1 = re.findall(regex, shop)
print(list1)

Output: ['hello', 'seattle', 'what', 'have', 'you', 'got']
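
As a rough check that the two patterns behave the same, here is a small Python 3 sketch. The helper name `find_words` and the punctuated sample sentence are assumptions added for illustration; they do not come from the original post.

```python
import re

def find_words(text):
    """Return every run of word characters in text (hypothetical helper)."""
    return re.findall(r'\w+', text)

# Sample with punctuation (an assumed input, not from the question).
sample = "Hello, Seattle! What have you got?"
words = find_words(sample)
print(words)

# Greedy \w+ always starts and stops at a word boundary,
# so adding \b on both sides does not change the result.
same = words == re.findall(r'\b\w+\b', sample)
print(same)

# Note that \w also matches digits and the underscore.
print(find_words("item_1 x2"))
```

If "word" should mean letters only, a character class such as [A-Za-z]+ may be closer to what is wanted, since \w also accepts digits and underscores.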