pandas ValueError: pattern contains no capture groups

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/54343378/
Asked by Chan
When I use a regular expression directly, I get a match:
import re
string = r'http://www.example.com/abc.html'
result = re.search('^.*com', string).group()  # result == 'http://www.example.com'
In pandas, I write:
df = pd.DataFrame(columns = ['index', 'url'])
df.loc[len(df), :] = [1, 'http://www.example.com/abc.html']
df.loc[len(df), :] = [2, 'http://www.hello.com/def.html']
df.str.extract('^.*com')
ValueError: pattern contains no capture groups
How can I solve this problem?
Thanks.
Answered by cs95
According to the docs, you need to specify a capture group (i.e., parentheses) for str.extract to, well, extract.
Series.str.extract(pat, flags=0, expand=True)
For each subject string in the Series, extract groups from the first match of regular expression pat.
Each capture group constitutes its own column in the output.
df.url.str.extract(r'(.*\.com)')
                        0
0  http://www.example.com
1    http://www.hello.com
# If you need named capture groups,
df.url.str.extract(r'(?P<URL>.*\.com)')
                      URL
0  http://www.example.com
1    http://www.hello.com
Or, if you need a Series,
df.url.str.extract(r'(.*\.com)', expand=False)
0    http://www.example.com
1      http://www.hello.com
Name: url, dtype: object
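To illustrate the line from the docs about each capture group becoming its own column, here is a small sketch (my addition, not part of the original answer) that uses two groups to split the same URLs into a host part and a path part:
df.url.str.extract(r'(.*\.com)(/.*)')
                        0          1
0  http://www.example.com  /abc.html
1    http://www.hello.com  /def.html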
Answered by jezrael
You need to specify the column url and wrap the pattern in () to create a capture group:
df['new'] = df['url'].str.extract(r'(^.*com)')
print(df)
  index                              url                     new
0     1  http://www.example.com/abc.html  http://www.example.com
1     2    http://www.hello.com/def.html    http://www.hello.com
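For completeness, a minimal, self-contained sketch (my addition, built from the question's data) that reproduces the ValueError and then applies the fix:
import pandas as pd

df = pd.DataFrame({'index': [1, 2],
                   'url': ['http://www.example.com/abc.html',
                           'http://www.hello.com/def.html']})

try:
    df['url'].str.extract(r'^.*com')        # no capture group
except ValueError as e:
    print(e)                                # pattern contains no capture groups

# wrap the pattern in () -> works; expand=False returns a Series
df['new'] = df['url'].str.extract(r'(^.*com)', expand=False)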
Answered by anky
Try this Python library; it works well for this purpose:
Using urllib.parse
from urllib.parse import urlparse
df['domain'] = df.url.apply(lambda x: urlparse(x).netloc)
print(df)
  index                              url           domain
0     1  http://www.example.com/abc.html  www.example.com
1     2    http://www.hello.com/def.html    www.hello.com
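As a quick reference (my addition, not part of the original answer), urlparse also exposes the other URL components, so you are not limited to the domain:
from urllib.parse import urlparse

parts = urlparse('http://www.example.com/abc.html')
print(parts.scheme)  # http
print(parts.netloc)  # www.example.com
print(parts.path)    # /abc.html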