Python: product code looks like abcd2343, how to split by letters and numbers

Question

Asked by Blankman

I have a list of product codes in a text file; on each line is a product code that looks like:

abcd2343 abw34324 abc3243-23A

So it is letters followed by numbers and other characters.

I want to split on the first occurrence of a number.

Answer 1

Accepted answer by unutbu

In [32]: import re

In [33]: s='abcd2343 abw34324 abc3243-23A'

In [34]: re.split('(\d+)',s)
Out[34]: ['abcd', '2343', ' abw', '34324', ' abc', '3243', '-', '23', 'A']

Or, if you want to split on the first occurrence of a digit:

In [43]: re.findall('\d*\D+',s)
Out[43]: ['abcd', '2343 abw', '34324 abc', '3243-', '23A']

\d+ matches 1-or-more digits.
\d*\D+ matches 0-or-more digits followed by 1-or-more non-digits.
\d+|\D+ matches 1-or-more digits or 1-or-more non-digits.

Consult the docs for more about Python's regex syntax.

re.split(pat, s) will split the string s using pat as the delimiter. If pat begins and ends with parentheses (so as to be a "capturing group"), then re.split will return the substrings matched by pat as well. For instance, compare:

In [113]: re.split('\d+', s)
Out[113]: ['abcd', ' abw', ' abc', '-', 'A']   # <-- just the non-matching parts

In [114]: re.split('(\d+)', s)
Out[114]: ['abcd', '2343', ' abw', '34324', ' abc', '3243', '-', '23', 'A']  # <-- both the non-matching parts and the captured groups

In contrast, re.findall(pat, s) returns only the parts of s that match pat:

In [115]: re.findall('\d+', s)
Out[115]: ['2343', '34324', '3243', '23']

Thus, if s ends with a digit, you could avoid ending with an empty string by using re.findall('\d+|\D+', s) instead of re.split('(\d+)', s):

In [118]: s='abcd2343 abw34324 abc3243-23A 123'

In [119]: re.split('(\d+)', s)
Out[119]: ['abcd', '2343', ' abw', '34324', ' abc', '3243', '-', '23', 'A ', '123', '']

In [120]: re.findall('\d+|\D+', s)
Out[120]: ['abcd', '2343', ' abw', '34324', ' abc', '3243', '-', '23', 'A ', '123']
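
The patterns above tokenize the whole string. For the question as asked (one code per line, split once at the first digit), a minimal sketch using the same \D building block might look like the following. The sample list and the variable names are assumptions for illustration, not part of the answer.

```python
import re

# Hypothetical sample data standing in for the lines of the text file.
codes = ['abcd2343', 'abw34324', 'abc3243-23A']

# (\D*) grabs the leading non-digits; (.*) takes everything from the
# first digit to the end of the code.
pattern = re.compile(r'(\D*)(.*)')

pairs = [pattern.match(code).groups() for code in codes]
print(pairs)
# [('abcd', '2343'), ('abw', '34324'), ('abc', '3243-23A')]
```

Because both groups may be empty, this pattern matches any single-line string, so there is no None case to handle; a code without digits simply yields an empty second part.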

Answer 2

Answer by Mike

import re

def firstIntIndex(string):
    # Return the index of the first digit in string, or -1 if there is none.
    result = -1
    for k in range(len(string)):
        if re.match(r'\d', string[k]):
            result = k
            break
    return result
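
The index returned by such a function can then be used to slice a code into its two parts. The sketch below is a regex-free variant of the same idea; the name first_digit_index and the sample code are hypothetical, and it assumes plain ASCII product codes.

```python
def first_digit_index(text):
    """Return the index of the first digit in text, or -1 if there is none."""
    return next((i for i, ch in enumerate(text) if ch.isdigit()), -1)

code = 'abc3243-23A'
k = first_digit_index(code)

# Slice at the index; if no digit was found, keep the whole code as letters.
if k == -1:
    letters, rest = code, ''
else:
    letters, rest = code[:k], code[k:]

print(letters, rest)
# abc 3243-23A
```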

Answer 3

Answer by jwsample

import re

m = re.match(r"(?P<letters>[a-zA-Z]+)(?P<the_rest>.+)$",input)

m.group('letters')
m.group('the_rest')

This covers your corner case of abc3243-23A and will output abc for the letters group and 3243-23A for the_rest.

Since you said they are all on individual lines, you'll need to put one line at a time in input.
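
A minimal sketch of that line-by-line loop follows. The list of lines is hypothetical sample data standing in for the file, and the names CODE_RE and results are illustrative. Note that re.match returns None when a line does not start with a letter, so that case is handled explicitly.

```python
import re

# Same pattern as above, compiled once for reuse.
CODE_RE = re.compile(r"(?P<letters>[a-zA-Z]+)(?P<the_rest>.+)$")

# Stand-in for the lines of the text file (hypothetical sample data).
lines = ["abcd2343\n", "abw34324\n", "abc3243-23A\n", "12345\n"]

results = []
for line in lines:
    code = line.strip()          # drop the trailing newline
    m = CODE_RE.match(code)
    if m is None:
        # The line does not begin with a letter; record it unsplit.
        results.append((code, None))
    else:
        results.append((m.group('letters'), m.group('the_rest')))

print(results)
# [('abcd', '2343'), ('abw', '34324'), ('abc', '3243-23A'), ('12345', None)]
```

One caveat: because the_rest is .+ rather than a digit-led pattern, a letters-only line such as abcd would still match, with the last letter backtracked into the_rest.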

Answer 4

Answer by Muhammad Alkarouri

To partition on the first digit:

parts = re.split('(\d.*)','abcd2343')      # => ['abcd', '2343', '']
parts = re.split('(\d.*)','abc3243-23A')   # => ['abc', '3243-23A', '']

So the two parts are always parts[0] and parts[1].

Of course, you can apply this to multiple codes:

>>> s = "abcd2343 abw34324 abc3243-23A"
>>> results = [re.split('(\d.*)', pcode) for pcode in s.split(' ')]
>>> results
[['abcd', '2343', ''], ['abw', '34324', ''], ['abc', '3243-23A', '']]

If each code is on its own line, then use s.splitlines() instead of s.split(' ').
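
As a rough sketch of the splitlines() variant, assuming the file contents have already been read into a string s (the sample contents and the results name are made up for illustration):

```python
import re

# Hypothetical file contents, one code per line.
s = "abcd2343\nabw34324\nabc3243-23A"

results = []
for pcode in s.splitlines():
    parts = re.split(r'(\d.*)', pcode)
    letters = parts[0]
    # When a digit is found, parts is [letters, rest, '']; without any
    # digit it is just [pcode], so fall back to an empty second part.
    rest = parts[1] if len(parts) > 1 else ''
    results.append((letters, rest))

print(results)
# [('abcd', '2343'), ('abw', '34324'), ('abc', '3243-23A')]
```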

Answer 5

Answer by Basant Rules

Try this code, which splits on the first space that is followed by a digit:

import re
text = "MARIA APARECIDA 99223-2000 / 98450-8026"
parts = re.split(r' (?=\d)', text, maxsplit=1)
print(parts)

Output:

['MARIA APARECIDA', '99223-2000 / 98450-8026']
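
The product codes in the question have no space before the digits, so the same lookahead idea would need the space dropped. The sketch below assumes Python 3.7 or later, where re.split can split on a pattern that matches the empty string; the sample list is taken from the question and the variable names are illustrative.

```python
import re

# Sample codes from the question.
codes = ['abcd2343', 'abw34324', 'abc3243-23A']

# (?=\d) is a zero-width lookahead: it matches the empty position just
# before a digit, so the split consumes no characters.
pairs = [re.split(r'(?=\d)', code, maxsplit=1) for code in codes]
print(pairs)
# [['abcd', '2343'], ['abw', '34324'], ['abc', '3243-23A']]
```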
