bash: Matching file names using grep

Question

Asked by Jason Zhu

The overarching problem: So I have a file name that comes in the form of JohnSmith14_120325_A10_6.raw and I want to match it using regex. I have a couple of issues in building a working example but unfortunately my issues won't be solved unless I get the basics.

So I have just recently learned about piping and one of the cool things I learned was that I can do the following.

X=ll_paprika.sc   # (don't ask)
VAR=`echo $X | cut -d _ -f 2`
echo $VAR

which gives me paprika.sc. Now when I try the same piping idea with grep, nothing happens.

x=ll_paprika.sc
VAR=`echo $X | grep *.sc`
echo $VAR

Can anyone explain what I am doing wrong?

Second question: How does one match a single underscore using regex?

Here's what I am ultimately trying to do;

VAR=`echo $X | grep -e "^[a-bA-Z][a-bA-Z0-9]*(_){1}[0-9]*(_){1}[a-bA-Z0-9]*(_){1}[0-9](\.){1}(raw)"

So the basic idea of my pattern is this: the file name must start with a letter, and then it can have any number of letters and numbers following it. It must have an _ to delimit a series of numbers, another _ to delimit the next set of numbers and characters, and another _ to delimit the next set of numbers. Then it must have a single period followed by raw. This looks grossly wrong and ugly (because I am not sure about the syntax). So how does one match a file extension? Can someone put up a simple example for something like ll_paprika.sc so that I can figure out how to write my own regex?

Thanks.

Answer 1

Answered by drysdam

x=ll_paprika.sc
VAR=`echo $X | grep *.sc`
echo $VAR

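This doesn't do what you want for two reasons. First, *.sc is a shell glob, not a regular expression: left unquoted, the shell may expand it to file names before grep ever sees it, and as a basic regular expression a leading * is a literal asterisk, so the pattern does not match ll_paprika.sc. Second, grep matches a line and returns the whole line. With a pattern that does match, such as '\.sc$', grep prints all of ll_paprika.sc, and that whole line ends up in $VAR.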
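
A minimal sketch of that behaviour, assuming a bash shell with a POSIX grep; the variable names whole and none are illustrative only. The pattern is quoted so the shell cannot expand it as a glob, and a matching line comes back in full.

```shell
#!/usr/bin/env bash
X=ll_paprika.sc

# Quoted regex: a literal dot followed by "sc" at the end of the line.
# grep prints the entire matching line, not just the matched part.
whole=$(printf '%s\n' "$X" | grep '\.sc$')
echo "$whole"

# A pattern that does not match prints nothing (grep exits non-zero).
none=$(printf '%s\n' "$X" | grep '\.raw$' || true)
echo "[$none]"
```

The first echo shows the whole file name; the second shows empty brackets, which is what an empty $VAR looks like when nothing matched.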

If you just want to get a part of it, the cut line is probably better. There is a grep -o option that returns only the matching portion, but for this you'd basically have to put in the thing you were looking for, at which point why bother?
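
For comparison, here is a small sketch of both approaches, assuming a grep that supports -o (GNU and BSD grep do; it is not required by POSIX). The names by_cut and by_grep are made up for this example.

```shell
#!/usr/bin/env bash
X=ll_paprika.sc

# cut: split on "_" and keep the second field.
by_cut=$(printf '%s\n' "$X" | cut -d _ -f 2)

# grep -o: print only the part of the line that the regex matched.
# [^_][^_]*$ is a non-empty run of non-underscore characters at the end of the line.
by_grep=$(printf '%s\n' "$X" | grep -o '[^_][^_]*$')

echo "$by_cut"
echo "$by_grep"
```

Both lines print paprika.sc for this input. They differ once the name contains more than one underscore: cut -f 2 keeps the second field, while the grep pattern keeps whatever follows the last underscore.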

the file name must start with a letter

`grep -E "^[a-zA-Z]

and then it can have any number of letters and numbers following it

[a-zA-Z0-9]*

and it must have an _ to delimit a series of numbers and another _ to delimit the next set of numbers and characters and another _ to delimit the next set of numbers

(_[0-9]+){3}

and then it must have a single period followed by raw.

\.raw"
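
Put together, the fragments above could look like the sketch below. It rests on a few assumptions that are not in the fragments themselves: grep is run with -E so that ( ){3} and + work unescaped, the three underscore-delimited fields allow letters as well as digits so that a field like A10 matches, the dot is escaped, and $ anchors the extension at the end of the name. The variable names pattern and result are illustrative.

```shell
#!/usr/bin/env bash
X=JohnSmith14_120325_A10_6.raw

# -E enables extended regex syntax, so ( ){3} and + need no backslashes.
# Each of the three fields is "_" followed by one or more letters or digits.
pattern='^[a-zA-Z][a-zA-Z0-9]*(_[a-zA-Z0-9]+){3}\.raw$'

# -q suppresses output; only the exit status says whether the name matched.
if printf '%s\n' "$X" | grep -Eq "$pattern"; then
    result=match
else
    result="no match"
fi
echo "$result"
```

With a digits-only field, [0-9]+, the third field A10 of the sample name would be rejected, which is why the field class is widened here.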

Answer 2

Answered by imm

For the first, use:

VAR=`echo $X | egrep '\.sc$'`

For the second, you can try this alternative instead:

VAR=`echo $X | egrep '^[[:alpha:]][[:alnum:]]*_[[:digit:]]+_[[:alnum:]]+_[[:digit:]]+\.raw'`

Note that your character classes from your expression differ from the description that follows in that they seem to only be permissive of a-b for lower case characters in some places. This example is permissive of all alphanumeric characters in those places.
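
To see the difference, the sketch below runs the opening part of both patterns against the sample file name from the question. It assumes grep -E and the C locale (set explicitly so that ranges such as A-Z mean plain ASCII ranges); the names narrow, wide, narrow_result and wide_result are made up for the example.

```shell
#!/usr/bin/env bash
export LC_ALL=C   # make bracket ranges behave as plain ASCII ranges
X=JohnSmith14_120325_A10_6.raw

# Classes as written in the question: only "a" and "b" in lower case.
narrow='^[a-bA-Z][a-bA-Z0-9]*_'
# Classes as described in the text: any letter, then letters or digits.
wide='^[[:alpha:]][[:alnum:]]*_'

if printf '%s\n' "$X" | grep -Eq "$narrow"; then narrow_result=match; else narrow_result="no match"; fi
if printf '%s\n' "$X" | grep -Eq "$wide"; then wide_result=match; else wide_result="no match"; fi

echo "a-b classes: $narrow_result"
echo "alpha/alnum: $wide_result"
```

The a-b version stops at the "o" in John, never reaches the underscore, and so reports no match, while the alpha/alnum version matches.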
