Replacing HTML tag content using sed

Note: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/7189604/

Tags: html, regex, bash, replace, sed

Asked by Revell

I'm trying to replace the content of some HTML tags in an HTML page using sed in a bash script. For some reason I'm not getting the proper result: it's not replacing anything. It has to be something very simple/stupid I'm overlooking. Anyone care to help me out?

HTML to search/replace in:

Unlocked <span id="unlockedCount"></span>/<span id="totalCount"></span> achievements for <span id="totalPoints"></span> points.

sed command used:

cat index.html | sed -i -e "s/\<span id\=\"unlockedCount\"\>([0-9]\{0,\})\<\/span\>/${unlockedCount}/g" index.html 

The point of this is to parse the HTML page and update the figures according to some external data. On the first run the contents of the tags will be empty; after that they will be filled.

EDIT:

I ended up using a combination of the answers which resulted in the following code:

sed -i -e 's|<span id="unlockedCount">\([0-9]\{0,\}\)</span>|<span id="unlockedCount">'"${unlockedCount}"'</span>|g' index.html

Many thanks to @Sorpigal, @tripleee, @classic for the help!
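
Since the page has three spans to fill (unlockedCount, totalCount, totalPoints), the command above could be wrapped in a small function. This is only a sketch: set_span is a hypothetical helper name, the sample values are made up, and it assumes GNU sed (for -i without a suffix) and purely numeric values.

```shell
# Sketch only: set_span is a hypothetical helper, not part of the original post.
# Assumes GNU sed (-i without a backup suffix) and numeric values.
set_span() {
    local id="$1" value="$2" file="$3"
    # Match the span whether it is empty or already holds digits,
    # and write the tags back around the new value.
    sed -i -e 's|<span id="'"$id"'">[0-9]*</span>|<span id="'"$id"'">'"$value"'</span>|g' "$file"
}

# Build a throwaway copy of the sample line from the question.
page=$(mktemp)
printf '%s\n' 'Unlocked <span id="unlockedCount"></span>/<span id="totalCount"></span> achievements for <span id="totalPoints"></span> points.' > "$page"

set_span unlockedCount 12 "$page"   # first run: the span is empty
set_span totalCount 40 "$page"
set_span totalPoints 350 "$page"
set_span unlockedCount 13 "$page"   # later run: the span already holds a number

result=$(cat "$page")
printf '%s\n' "$result"
rm -f "$page"
```

Because the tags are written back each time, the same call works on the first run and on every later one.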

Accepted answer by classic

Try this:

sed -i -e "s/\(<span id=\"unlockedCount\">\)\(<\/span>\)/\1${unlockedCount}\2/g" index.html

Answer by sorpigal

What you say you want to do is not what you're telling sed to do.

You want to insert a number into a tag, or replace the number if one is already present. What you're telling sed to do is replace a span tag and its contents (nothing, or a number) with the value of a shell variable.

You're also employing a lot of complex, annoying and error-prone escape sequences which are just not necessary.

Here's what you want:

sed -r -i -e 's|<span id="unlockedCount">([0-9]{0,})</span>|<span id="unlockedCount">'"${unlockedCount}"'</span>|g' index.html

Note the differences:

  • Added -r to turn on extended regular expressions, without which your capture pattern would not work.
  • Used | instead of / as the delimiter for the substitution so that escaping / would not be necessary.
  • Single-quoted the sed expression so that escaping things inside it from the shell would not be necessary.
  • Included the matched span tag in the replacement section so that it would not get deleted.
  • In order to expand the unlockedCount variable, closed the single-quoted expression, then re-opened it afterwards.
  • Omitted cat | which was useless here.
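
To make the fourth point concrete, here is a small sketch that runs the same pattern twice on standard input, once with the question's replacement and once with the tags repeated. The sample value 7 and the shortened HTML line are assumptions for illustration, and -r assumes GNU sed.

```shell
# Sketch only: sample value and input line are made up. Assumes GNU sed (-r).
unlockedCount=7
html='Unlocked <span id="unlockedCount"></span> achievements'

# Replacement without the tags: the whole span is swallowed.
without_tags=$(printf '%s\n' "$html" |
    sed -r -e 's|<span id="unlockedCount">([0-9]{0,})</span>|'"${unlockedCount}"'|g')

# Replacement that repeats the tags: the span survives, so a later run can match again.
with_tags=$(printf '%s\n' "$html" |
    sed -r -e 's|<span id="unlockedCount">([0-9]{0,})</span>|<span id="unlockedCount">'"${unlockedCount}"'</span>|g')

printf '%s\n' "$without_tags" "$with_tags"
```

The first form leaves nothing for the next update to find, which is why the tags belong in the replacement.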

I also used double quotes around the shell variable expansion because this is good practice, although it is not really necessary if the value contains no spaces.

It was not, strictly speaking, necessary for me to add -r. Plain old sed will work if you say \([0-9]\{0,\}\), but the idea here was to simplify.
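
The two spellings can be compared side by side. This is a sketch assuming GNU sed; the sample value 42 and the one-line input are made up for illustration.

```shell
# Sketch only: sample value and input are made up. Assumes GNU sed.
unlockedCount=42
html='<span id="unlockedCount">17</span>'

# Basic regular expressions (the default): groups and intervals need backslashes.
bre=$(printf '%s\n' "$html" |
    sed -e 's|<span id="unlockedCount">\([0-9]\{0,\}\)</span>|<span id="unlockedCount">'"${unlockedCount}"'</span>|g')

# Extended regular expressions (-r): the same operators are written bare.
ere=$(printf '%s\n' "$html" |
    sed -r -e 's|<span id="unlockedCount">([0-9]{0,})</span>|<span id="unlockedCount">'"${unlockedCount}"'</span>|g')

# Bare parentheses in basic mode are literal characters, so nothing matches here.
literal=$(printf '%s\n' "$html" |
    sed -e 's|<span id="unlockedCount">([0-9]*)</span>|X|g')

printf '%s\n' "$bre" "$ere" "$literal"
```

The first two commands produce the same line; the third leaves the input unchanged because it looks for actual ( and ) characters.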

Answer by tripleee

sed -i -e 's%<span id="unlockedCount">([0-9]*)</span>%'"${unlockedCount}%g" index.html

I removed the Useless Use of Cat, took out a bunch of unnecessary backslashes, added single quotes around the regex to protect it from shell expansion, and fixed the repetition operator. You might still need to backslash the grouping parentheses; my sed, at least, wants \(...\).

Note the use of single and double quotes next to each other. Single quotes protect against shell expansion, so you can't use them around "${unlockedCount}" where you do want the shell to interpolate the variable.

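
To see what the shell actually hands to sed, the quoted pieces can be printed instead of executed. A minimal sketch, with a made-up value of 5 and hypothetical variable names expr and unexpanded:

```shell
# Sketch only: the value and the variable names expr/unexpanded are made up.
unlockedCount=5

# Three adjacent pieces form one shell word: 'literal' "expanded" 'literal'.
expr='s|<span id="unlockedCount">[0-9]*</span>|<span id="unlockedCount">'"${unlockedCount}"'</span>|g'
printf '%s\n' "$expr"

# Entirely single-quoted: the shell leaves ${unlockedCount} alone.
unexpanded='s|x|${unlockedCount}|'
printf '%s\n' "$unexpanded"
```

sed never sees the quotes themselves, only the single joined string, so the variable has to sit in the double-quoted piece.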