bash: Using grep to get text from HTML tags in a local file

Question

Asked by LakeMicrobe

Possible Duplicate:
RegEx match open tags except XHTML self-contained tags

Excerpt From Input File

<TD class="clsTDLabelWeb" width="28%">Municipality:&nbsp;</TD>
<TD style="WIDTH: 394px" class="clsTDLabelSm" colSpan="5">
<span id="DInfo1_Municipality">JUPITER</span></TD>

My Regular Expression

(?<=<span id="DInfo1_Municipality">)([^</span>]*)

I have an HTML file saved to disk. I would like to use grep to search through the file and output the contents of a specific span, though I don't know if this is a proper use of grep. When I run grep on the file with the expression read from another file (so I don't mess up escaping any special characters), it doesn't output anything. I have tested the expression in RegExr, and it matches "JUPITER", which is exactly what I want returned. Thank you so much for your help!

Desired Output

JUPITER

Answer 1

Answered by Paused until further notice.

Give this a try:

sed -n 's|^<span id="DInfo1_Municipality">\([^<]*\)</span></TD>$|\1|p' file

or with GNU grep and your regex:

grep -Po '(?<=<span id="DInfo1_Municipality">)([^</span>]*)' file
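
One caveat about the regex itself: `[^</span>]` is a character class, so it excludes the individual characters `<`, `/`, `s`, `p`, `a`, `n` and `>` rather than the string `</span>`. It happens to work for "JUPITER" because that value contains none of those characters. The sketch below is only an illustration, assuming GNU grep built with PCRE support (`-P`); the sample file and the value "Palm Beach" are made up for the demonstration.

```shell
# Hypothetical sample file; "Palm Beach" is an invented value, not from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<TD class="clsTDLabelWeb" width="28%">Municipality:&nbsp;</TD>
<TD style="WIDTH: 394px" class="clsTDLabelSm" colSpan="5">
<span id="DInfo1_Municipality">Palm Beach</span></TD>
EOF

# Original class: the match stops at the first of < / s p a n >,
# so it is cut short at the lowercase "a".
truncated=$(grep -Po '(?<=<span id="DInfo1_Municipality">)[^</span>]*' "$sample")

# Negating only "<" reads up to the start of the closing tag.
full=$(grep -Po '(?<=<span id="DInfo1_Municipality">)[^<]*' "$sample")

printf 'truncated: %s\nfull: %s\n' "$truncated" "$full"
rm -f "$sample"
```

With this sample the first command yields only "P", while the second yields "Palm Beach".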

Answer 2

Answered by Paul Creasey

grep's default regex syntax doesn't support lookbehind assertions (GNU grep needs -P for those), and grep is a very poor tool for parsing HTML. The pipeline below works for the example given, but it will break in many situations.

grep -io "<span id=\"DInfo1_Municipality\">.*</span>" file.html | grep -io ">[^<]*" | grep -io '[^>]*'

Something crazy like that, which is not a good idea.

Answer 3

Answered by ghostdog74

sed -n '/DInfo1_Municipality/s/<\/span.*//p' file | sed 's/.*>//'
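
The pipeline above does three things: it selects lines that mention the id, deletes everything from `</span` to the end of the line, and then deletes everything up to the last remaining `>`. As a rough sketch, the same steps can be folded into a single sed invocation. The sample file and the variable name `municipality` are assumptions made for the demonstration. Unlike the original pipeline, this variant prints every line matching the address, even if the first substitution did not apply.

```shell
# Hypothetical sample file mirroring the excerpt from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<TD class="clsTDLabelWeb" width="28%">Municipality:&nbsp;</TD>
<TD style="WIDTH: 394px" class="clsTDLabelSm" colSpan="5">
<span id="DInfo1_Municipality">JUPITER</span></TD>
EOF

# For lines containing the id:
#   1. drop "</span" and everything after it
#   2. drop everything up to and including the last remaining ">"
#   3. print what is left
municipality=$(sed -n '/DInfo1_Municipality/{
  s/<\/span.*//
  s/.*>//
  p
}' "$sample")

printf '%s\n' "$municipality"
rm -f "$sample"
```

Like the other answers, this assumes the opening and closing span tags sit on one line; it is not a general HTML parser.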