PHP: split string on delimiter unless contained in single or double quotes

Posted by wgmfuz8q on 2023-10-15 in PHP


Source: https://stackoverflow.com/questions/69598649/split-string-on-delimiter-unless-contains-in-single-or-double-quotes

4 Answers


hujrc8aj1#

This is easy to do with the (*SKIP)(*FAIL) verbs that PCRE provides:

(['"]).*?\1(*SKIP)(*FAIL)|\s*\|\s*

In PHP, this could be:

<?php

$string = "aa | bb | \"cc | dd\" | 'ee | ff'";

$pattern = '~([\'"]).*?\1(*SKIP)(*FAIL)|\s*\|\s*~';

$splitted = preg_split($pattern, $string);
print_r($splitted);
?>

which yields

Array
(
    [0] => aa
    [1] => bb
    [2] => "cc | dd"
    [3] => 'ee | ff'
)

See a demo on regex101.com and on ideone.com.
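The same idea can be wrapped in a small helper that accepts any delimiter. This is only a sketch: the function name `splitOutsideQuotes` and its parameters are made up for illustration, and it assumes a quote character is never escaped inside a quoted substring.

```php
<?php
// Hypothetical helper (not part of the answer): split on $delimiter,
// but leave delimiters inside '...' or "..." untouched.
function splitOutsideQuotes(string $input, string $delimiter = '|'): array
{
    // preg_quote() escapes regex metacharacters and the ~ pattern delimiter.
    $d = preg_quote($delimiter, '~');

    // A quoted run is matched first and then discarded via (*SKIP)(*FAIL),
    // so only delimiters outside quotes reach the second alternative.
    $pattern = '~([\'"]).*?\1(*SKIP)(*FAIL)|\s*' . $d . '\s*~';

    // preg_split() returns false on error; fall back to an empty array.
    return preg_split($pattern, $input) ?: [];
}

print_r(splitOutsideQuotes("aa ; bb ; \"cc ; dd\" ; 'ee ; ff'", ';'));
```

Passing the delimiter through `preg_quote()` matters for characters such as `|` or `.`, which would otherwise be read as regex syntax.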


qqrboqgw2#

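This is easier if you match the parts instead of splitting. Alternatives in a pattern are tried from left to right, and quantifiers are greedy by default, consuming as many characters as possible. This lets you list the more specific patterns for quoted strings before the pattern for unquoted tokens: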

$subject = '[ aa | bb | "cc | dd" | \'ee | ff\' ]';

$pattern = <<<'PATTERN'
(
    (?:[|[]|^) # after | or [ or string start
    \s*
    (?<token> # name the match
        "[^"]*" # string in double quotes
        |
        '[^']*'  # string in single quotes
        |
        [^\s|]+ # non-whitespace 
    )
    \s*
)x
PATTERN;

preg_match_all($pattern, $subject, $matches);
var_dump($matches['token']);

Output:

array(4) {
  [0]=>
  string(2) "aa"
  [1]=>
  string(2) "bb"
  [2]=>
  string(9) ""cc | dd""
  [3]=>
  string(9) "'ee | ff'"
}

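Hints:

1. <<<'PATTERN' is nowdoc syntax (a heredoc with a single-quoted identifier, so nothing is interpolated), which reduces escaping
2. I use () as the pattern delimiters, so they double as group 0
3. Naming the match makes the code more readable
4. The x modifier allows indenting and commenting the pattern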
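To illustrate the first hint, here is a small comparison of heredoc and nowdoc. The variable names and the sample body are invented for the example; the point is only that nowdoc leaves backslashes and `$` untouched, which is convenient for regex patterns.

```php
<?php
$token = 'ignored';

// Heredoc: escape sequences and $variables are interpreted.
$heredoc = <<<PATTERN
\\d+ $token
PATTERN;

// Nowdoc: the body is taken literally, like a single-quoted string
// but without any escaping at all.
$nowdoc = <<<'PATTERN'
\\d+ $token
PATTERN;

var_dump($heredoc); // backslash pair collapsed, variable expanded
var_dump($nowdoc);  // body kept exactly as written
```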


c86crjj03#

Use

$string = "aa | bb | \"cc | dd\" | 'ee | ff'";
preg_match_all("~(?|\"([^\"]*)\"|'([^']*)'|([^|'\"]+))(?:\s*\|\s*|\z)~", $string, $matches);
print_r(array_map(function($x) {return trim($x);}, $matches[1]));

See the PHP proof.

Results:

Array
(
    [0] => aa
    [1] => bb
    [2] => cc | dd
    [3] => ee | ff
)

Explanation

--------------------------------------------------------------------------------
  (?|                      Branch reset group, does not capture:
--------------------------------------------------------------------------------
    \"                       '"'
--------------------------------------------------------------------------------
    (                        group and capture to \1:
--------------------------------------------------------------------------------
      [^\"]*                   any character except: '\"' (0 or more
                               times (matching the most amount
                               possible))
--------------------------------------------------------------------------------
    )                        end of \1
--------------------------------------------------------------------------------
    \"                       '"'
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    '                        '\''
--------------------------------------------------------------------------------
    (                        group and capture to \1:
--------------------------------------------------------------------------------
      [^']*                    any character except: ''' (0 or more
                               times (matching the most amount
                               possible))
--------------------------------------------------------------------------------
    )                        end of \1
--------------------------------------------------------------------------------
    '                        '\''
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    (                        group and capture to \1:
--------------------------------------------------------------------------------
      [^|'\"]+                 any character except: '|', ''', '\"'
                               (1 or more times (matching the most
                               amount possible))
--------------------------------------------------------------------------------
    )                        end of \1
--------------------------------------------------------------------------------
  )                        end of grouping
--------------------------------------------------------------------------------
  (?:                      group, but do not capture:
--------------------------------------------------------------------------------
    \s*                      whitespace (\n, \r, \t, \f, and " ") (0
                             or more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    \|                       '|'
--------------------------------------------------------------------------------
    \s*                      whitespace (\n, \r, \t, \f, and " ") (0
                             or more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    \z                       the end of the string
--------------------------------------------------------------------------------
  )                        end of grouping
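The effect of the branch reset group can be seen by comparing it with a plain alternation. The following is a reduced sketch with only the two quoted alternatives; the variable names are illustrative.

```php
<?php
$input = "'ee | ff'";

// Plain alternation: each alternative gets its own group number,
// so the single-quoted content lands in group 2 and group 1 is empty.
preg_match('~"([^"]*)"|\'([^\']*)\'~', $input, $plain);

// Branch reset (?|...): numbering restarts in every alternative,
// so the content is always available as group 1.
preg_match('~(?|"([^"]*)"|\'([^\']*)\')~', $input, $reset);

var_dump($plain);
var_dump($reset);
```

This is why the answer can read every token from `$matches[1]` regardless of which alternative matched.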


23c0lvtd4#

Interestingly, there are many ways to construct a regular expression for this problem. Here is another one, similar to @Jan's answer.

(['"]).*?\1\K| *\| *

PCRE Demo

(['"]) # match a single or double quote and save to capture group 1
.*?    # match zero or more characters lazily
\1     # match the content of capture group 1
\K     # reset the starting point of the reported match and discard
       # any previously-consumed characters from the reported match
|      # or
\ *    # match zero or more spaces
\|     # match a pipe character
\ *    # match zero or more spaces

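Note that the part before the pipe character ("or") serves only to move the engine's internal string pointer to just past the closing quote of a quoted substring.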
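The answer gives only the pattern. A minimal sketch of using it with preg_split() might look like the following. The PREG_SPLIT_NO_EMPTY flag is an assumption added here: the `\K` branch produces zero-length matches, and without the flag those can leave empty strings in the result.

```php
<?php
$string = "aa | bb | \"cc | dd\" | 'ee | ff'";

// Same pattern as above, wrapped in ~ delimiters.
$pattern = '~([\'"]).*?\1\K| *\| *~';

// Assumption: PREG_SPLIT_NO_EMPTY drops empty pieces caused by
// the zero-length matches of the \K alternative.
$parts = preg_split($pattern, $string, -1, PREG_SPLIT_NO_EMPTY);

print_r($parts);
```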
