set thecommandstring to "echo \"" & filename & "\"|sed \"s/[0-9]\\{10\\}/*good*(&)/\"" as string
set sedResult to do shell script thecommandstring
set isgood to sedResult starts with "*good*"
set isMatch to "0" = (do shell script ¬
"egrep -q '^\\d{10}' <<<" & quoted form of filename & "; printf $?")
# SYNOPSIS
# doesMatch(text, regexString) -> Boolean
# DESCRIPTION
# Matches string s against regular expression (string) regex using bash's extended regular expression language *including*
# support for shortcut classes such as `\d`, and assertions such as `\b`, and *returns a Boolean* to indicate if
# there is a match or not.
# - AppleScript's case sensitivity setting is respected; i.e., matching is case-INsensitive by default, unless inside
# a 'considering case' block.
# - The current user's locale is respected.
# EXAMPLE
# my doesMatch("127.0.0.1", "^(\\d{1,3}\\.){3}\\d{1,3}$") # -> true
on doesMatch(s, regex)
    local ignoreCase, extraGrepOption
    set ignoreCase to "a" is "A"
    if ignoreCase then
        set extraGrepOption to "i"
    else
        set extraGrepOption to ""
    end if
    # Note: So that classes such as \w work with different locales, we need to set the shell's locale explicitly to the current user's.
    # Rather than let the shell command fail we return the exit code and test for "0" to avoid having to deal with exception handling in AppleScript.
    tell me to return "0" = (do shell script "export LANG='" & user locale of (system info) & ".UTF-8'; egrep -q" & extraGrepOption & " " & quoted form of regex & " <<< " & quoted form of s & "; printf $?")
end doesMatch
# SYNOPSIS
# getMatch(text, regexString) -> { overallMatch[, captureGroup1Match ...] } or {}
# DESCRIPTION
# Matches string s against regular expression (string) regex using bash's extended regular expression language and
# *returns the matching string and substrings matching capture groups, if any.*
#
# - AppleScript's case sensitivity setting is respected; i.e., matching is case-INsensitive by default, unless this subroutine is called inside
# a 'considering case' block.
# - The current user's locale is respected.
#
# IMPORTANT:
#
# Unlike doesMatch(), this subroutine does NOT support shortcut character classes such as \d.
# Instead, use one of the following POSIX classes (see `man re_format`):
# [[:alpha:]] [[:word:]] [[:lower:]] [[:upper:]] [[:ascii:]]
# [[:alnum:]] [[:digit:]] [[:xdigit:]]
# [[:blank:]] [[:space:]] [[:punct:]] [[:cntrl:]]
# [[:graph:]] [[:print:]]
#
# Also, `\b`, '\B', '\<', and '\>' are not supported; you can use `[[:<:]]` for '\<' and `[[:>:]]` for `\>`
#
# Always returns a *list*:
# - an empty list, if no match is found
# - otherwise, the first list element contains the matching string
# - if regex contains capture groups, additional elements return the strings captured by the capture groups; note that *named* capture groups are NOT supported.
# EXAMPLE
# my getMatch("127.0.0.1", "^([[:digit:]]{1,3})\\.([[:digit:]]{1,3})\\.([[:digit:]]{1,3})\\.([[:digit:]]{1,3})$") # -> { "127.0.0.1", "127", "0", "0", "1" }
on getMatch(s, regex)
    local ignoreCase, extraCommand
    set ignoreCase to "a" is "A"
    if ignoreCase then
        set extraCommand to "shopt -s nocasematch; "
    else
        set extraCommand to ""
    end if
    # Note:
    # So that classes such as [[:alpha:]] work with different locales, we need to set the shell's locale explicitly to the current user's.
    # Since `quoted form of` encloses its argument in single quotes, we must set compatibility option `shopt -s compat31` for the =~ operator to work.
    # Rather than let the shell command fail we return '' in case of non-match to avoid having to deal with exception handling in AppleScript.
    tell me to do shell script "export LANG='" & user locale of (system info) & ".UTF-8'; shopt -s compat31; " & extraCommand & "[[ " & quoted form of s & " =~ " & quoted form of regex & " ]] && printf '%s\\n' \"${BASH_REMATCH[@]}\" || printf ''"
    return paragraphs of result
end getMatch
set filename to "1234567890abcdefghijkl"
return isPrefixGood(filename)
on isPrefixGood(filename) -- returns boolean: true if the first ten characters are all digits
    set legalCharacters to {"1", "2", "3", "4", "5", "6", "7", "8", "9", "0"}
    set thePrefix to (characters 1 thru 10) of filename as text
    set badPrefix to false
    repeat with thisChr from 1 to (get count of characters in thePrefix)
        set theChr to character thisChr of thePrefix
        if theChr is not in legalCharacters then
            set badPrefix to true
            exit repeat -- one illegal character is enough to reject the prefix
        end if
    end repeat
    -- Return a boolean, as the handler name and comment promise, rather than a status string.
    return not badPrefix
end isPrefixGood
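on checkFilename(thisName)
    -- Use the handler's own parameter here; the original referenced an undefined variable.
    set {n, isOk} to {length of thisName, true}
    try
        repeat with i from 1 to 10
            set isOk to (isOk and ((character i of thisName) is in "0123456789"))
        end repeat
        return isOk
    on error
        -- Names shorter than ten characters raise an error and count as a non-match.
        return false
    end try
end checkFilename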
# Returns a list of strings from _subject that match _regex
# _regex in the format of /<value>/<flags>
on match(_subject, _regex)
    set _js to "(new String(`" & _subject & "`)).match(" & _regex & ")"
    set _result to run script _js in "JavaScript"
    if _result is null or _result is missing value then
        return {}
    end if
    return _result
end match
match("file-name.applescript", "/^\\d+/g") #=> {}
match("1234_file.js", "/^\\d+/g") #=> {"1234"}
match("5-for-fighting.mp4", "/^\\d+/g") #=> {"5"}
set mstr to "1234567889Abcdefg"
set isnum to prefixIsOnlyDigits for mstr
to prefixIsOnlyDigits for aText
    set aProbe to text 1 thru 10 of aText
    set isnum to false
    if not ((offset of "," in aProbe) > 0 or (offset of "." in aProbe) > 0 or (offset of "-" in aProbe) > 0) then
        try
            set aNumber to aProbe as number
            set isnum to true
        end try
    end if
    return isnum
end prefixIsOnlyDigits
var app = Application.currentApplication();
app.includeStandardAdditions = true;
var text = "https://www.example.com";
var patt = /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/;
app.displayAlert(text.search(patt));
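8 Answers
vptzau2j1#
Don't despair: since OS X you can also access sed and grep through "do shell script", which is what the thecommandstring snippet does.
My sed skills aren't great, so there may be a more elegant way than adding *good* to any name that matches [0-9]{10} and then looking for *good* at the start of the result. But basically, if filename is "1234567890dfoo.mov", this will run the command echo "1234567890dfoo.mov"|sed "s/[0-9]\{10\}/*good*(&)/".
Note the escaped quotes \" and the escaped backslashes \\ in the AppleScript. If you are escaping something for the shell, you have to escape the escapes: to run a shell script that contains a backslash, you escape it for the shell as \\ and then escape each of those backslashes in AppleScript, giving \\\\. This can get pretty hard to read.
So anything you can do on the command line, you can do by calling it from AppleScript (woohoo!). Anything written to stdout is returned to the script as the result.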
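To make the marker trick above easier to follow, here is a minimal sketch of the same sed command run directly in a shell, with the AppleScript layer of escaping removed. The function name has_ten_digit_marker is a made-up helper for illustration and is not part of the answer; the sed expression and the "*good*" prefix test are the ones described above.

```shell
#!/bin/sh
# Sketch of the sed "marker" technique at the shell level.
# has_ten_digit_marker is a hypothetical helper name, not from the answer.

has_ten_digit_marker() {
    # sed rewrites the first run of ten digits as *good*(<digits>);
    # input without such a run passes through unchanged.
    marked=$(printf '%s\n' "$1" | sed 's/[0-9]\{10\}/*good*(&)/')
    # Equivalent of AppleScript's: sedResult starts with "*good*"
    case $marked in
        '*good*'*) echo yes ;;
        *)         echo no ;;
    esac
}

has_ten_digit_marker "1234567890dfoo.mov"   # prints "yes"
has_ten_digit_marker "holiday.mov"          # prints "no"
```

Because the sed expression sits in single quotes here, only one level of backslash escaping is needed; the doubled backslashes appear only once the command is embedded in an AppleScript string.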
tcomlyy62#
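There is an easier way to use the shell (works with bash 3.2+) for regex matching.
Notes:
- It uses a modern bash test expression, [[ ... ]], with the regex-matching operator =~; leaving the right operand (or at least its special regex characters) unquoted is a must on bash 3.2+, unless you prepend shopt -s compat31;
- The do shell script statement executes the test and returns its exit code via an additional command (thanks, @LauriRanta); "0" indicates success.
- The =~ operator does not support shortcut character classes such as \d or assertions such as \b (true as of OS X 10.9.4, and unlikely to change anytime soon).
- For case-insensitive matching, prepend shopt -s nocasematch; to the command string.
- For locale awareness, prepend export LANG='" & user locale of (system info) & ".UTF-8'; to the command string.
- If the regex contains capture groups, the captured strings are available in the ${BASH_REMATCH[@]} array variable.
- Double quotes and backslashes must be \-escaped.
Here is an alternative using egrep. Although it presumably performs worse, it has two advantages:
- You can use shortcut character classes such as \d and assertions such as \b.
- You can more easily make matching case-insensitive by calling egrep with -i.
You cannot, however, get sub-matches via capture groups; use the [[ ... =~ ... ]] approach if you need them.
Finally, the utility functions doesMatch() and getMatch() package both approaches (the syntax highlighting is off, but they do work).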
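As a shell-level illustration of the [[ ... =~ ... ]] notes in this answer, the sketch below runs the test directly in bash, without the AppleScript wrapper. The function names get_match and matches_ignoring_case are hypothetical and chosen only for this example; it assumes bash 3.2 or later.

```shell
#!/usr/bin/env bash
# Sketch of bash's [[ ... =~ ... ]] test; get_match and
# matches_ignoring_case are hypothetical names for illustration.

get_match() {
    local s=$1 regex=$2
    # The regex comes from an UNQUOTED variable: quoting the right operand
    # would make bash 3.2+ treat it as a literal string.
    if [[ $s =~ $regex ]]; then
        # Element 0 is the overall match; elements 1.. are the capture groups.
        printf '%s\n' "${BASH_REMATCH[@]}"
    fi
}

# The function body is a subshell, so nocasematch does not leak to the caller.
matches_ignoring_case() (
    shopt -s nocasematch
    [[ $1 =~ $2 ]]
)

get_match "1234567890dfoo.mov" '^([[:digit:]]{10})(.*)$'
matches_ignoring_case "IMG_0001.MOV" '\.mov$' && echo "extension matches"
```

Passing the pattern in a variable sidesteps the compat31 setting that getMatch() needs, since nothing on the right-hand side of =~ is quoted.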
klh5stk13#
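I recently needed regular expressions in a script and wanted a scripting addition to handle them, so that the code would be easier to read. I found Satimage.osax, which adds regex-capable commands to AppleScript.
The only downside was that (as of November 8, 2010) it was a 32-bit addition, so it threw errors when called from a 64-bit process. This bit me in a Mail rule on Snow Leopard, because I had to run Mail in 32-bit mode. Called from a standalone script, though, I had no reservations: it is really great, lets you pick whatever regex syntax you want, and supports back-references.
Update, May 28, 2011
Thanks to Mitchell Model's comment below for pointing out that it has been updated to 64-bit, so no more reservations: it does everything I need.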
6l7fqoea4#
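I'm sure there is an AppleScript addition or a shell script that can be called to bring regex into the fold, but I avoid dependencies for simple things. I use this style of pattern all the time (the isPrefixGood handler).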
2nbm6dog5#
Here is another way to check whether the first ten characters of any string are digits (the checkFilename handler).
gmxoilav6#
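I was able to call JavaScript directly from AppleScript (on High Sierra), using the match handler.
It looks like most JavaScript String methods work as expected. I have not found a reference stating which version of ECMAScript JavaScript for macOS Automation is compatible with, so test before you rely on it.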
dy2hfwbg7#
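I have an alternative: I have the bare bones of a regex matcher working in AppleScript, although I have not yet implemented character classes for the Thompson NFA algorithm. If anyone is interested in parsing very basic regular expressions with AppleScript, the code is posted in the CodeExchange at MacScripters; please have a look!
Here is how to determine whether the first ten characters of a text/string are digits (the prefixIsOnlyDigits handler).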
b91juud38#
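As mentioned in other answers, AppleScript has no language-level support for regular expressions, but starting with Yosemite you can switch to the JavaScript for Automation (JXA) engine (see also Apple's docs), which does include a regex engine.
The JXA snippet (the one beginning with var app = Application.currentApplication();) validates a URL against a regex.
Note: to run the script from Script Editor.app, make sure JavaScript is selected in the runtime dropdown.
To run a JXA script in plain-text format from a shell script in the terminal, call osascript with the JavaScript engine specifier.
Alternatively, if you saved the script in binary format via Script Editor.app, with the .scpt extension, it can be run without the engine specifier.
Another option is to add a shebang of #!/usr/bin/osascript -l JavaScript, run chmod +x myJxaScript.js, and run the js script as an executable.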
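The shebang route described above can be sketched as a small shell script. This is an illustration under assumptions, not the answer's own commands: the temporary directory and the one-line JXA body are invented for the example, while the file name myJxaScript.js, the shebang line, and chmod +x come from the answer. Since osascript ships only with macOS, the sketch skips the run when it is not installed.

```shell
#!/bin/sh
# Sketch: save a JXA one-liner as an executable script.
# The temp directory and the JXA body are illustrative assumptions.

workdir=$(mktemp -d)
script="$workdir/myJxaScript.js"

# The shebang names osascript and selects the JavaScript engine.
cat > "$script" <<'EOF'
#!/usr/bin/osascript -l JavaScript
/^\d{10}/.test("1234567890dfoo.mov")
EOF
chmod +x "$script"

if command -v osascript >/dev/null 2>&1; then
    osascript -l JavaScript "$script"   # explicit engine specifier
    "$script"                           # same invocation, via the shebang
else
    echo "osascript not found; created $script without running it"
fi
```

Running the file through its shebang amounts to the same osascript -l JavaScript call with the script path appended, so the two invocations should behave alike.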