3 arrays, preg_match and merging the results in PHP

dxxyhpgq posted on 2022-10-30 in PHP

I have 3 different multidimensional arrays:

// INPUT DATA WITH HOUSE DESCRIPTION. STRUCTURE: ID, OPTION DESCRIPTION

$input_house_data = array (
array("AAA","Chimney with red bricks"),
array("BBB","Two wide windows in the main floor"),
array("CCC","Roof tiles renewed in 2015")
);

// CATALOGUE WITH ALL HOUSE EQUIPMENT OPTIONS. STRUCTURE: ID, OPTION NAME

$ct_all_house_options = array (
  array("0001","Chimney"),
  array("0002","Garden"),
  array("0003","Roof tiles"),
  array("0004","Windows"),
  array("0005","Garage")
);

// SEARCH STRINGS AS REGULAR EXPRESSIONS. STRUCTURE: ID, EQUIPMENT OPTION NAME, REGULAR EXPRESSION TO SEARCH

$ct_house_options = array (
  array("0001","Chimney","/^Chimney with./"),
  array("0003","Roof tiles","/^Roof tiles./"),
  array("0004","Windows","/.windows./"),
  array("0004","Windows","/.wide windows./")    
);

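I want to search $input_house_data using the regular expressions in the array $ct_house_options to determine which equipment the house has. The result should be a complete list of all possible options with the status "available" or "not available":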

0001 - Chimney - available
0002 - Garden - not available
0003 - Roof tiles - available
0004 - Windows - available
0005 - Garage - not available

I tried to implement it as follows:

$arrlength_input_house_data = count($input_house_data);
$arrlength_ct_all_house_options = count($ct_all_house_options);
$arrlength_ct_house_options = count($ct_house_options);

A for loop using the preg_match function. All results are written to the array $matches (including duplicates):

$matches = array();

for ($row1 = 0; $row1 < $arrlength_input_house_data; $row1++) {
    for ($row2 = 0; $row2 < $arrlength_ct_house_options; $row2++) {
        if (preg_match($ct_house_options[$row2][2], $input_house_data[$row1][1]) === 1) {
            $matches[] = $ct_house_options[$row2][0];
        }
    }
}

Removing the duplicates:

$unique = array_unique($matches);
print_r($unique);

So now I have the unique results:

Array ( [0] => 0001 [1] => 0004 [3] => 0003 )

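The next step should be to merge the array $ct_all_house_options with the unique results from $unique. Unfortunately, I cannot get this to work. Do you have any idea how to do it? Maybe there is an easier way to achieve the whole thing?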
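2022-08-12
Hello everyone! Thank you for your feedback. I checked and tested all the suggestions. In the meantime, the business logic has changed and become a bit more complex:
1. There are 3 different constellations for identifying a product option:

  • only by a regular expression on the product description,
  • by a regular expression on the description + the product family or part of the product family,
  • by a regular expression on the description + product family + product number.

2. The output can differ: TRUE/FALSE or a specific string (e.g. the product color "white", "green", etc.).
So please see how I designed a possible solution: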

$input_product_data = array (
array("AAAA", "9999", "Chimney with red bricks"),
array("CCCC","2234","Two wide windows in the main floor"),
array("RRRR","0022","Roof tiles renewed in 2015"),
array("","2258","Floor has been renovated for two years. Currently it has ground in wood."),
array("","","Beautiful door in green color")

);

// CUSTOMIZING TABLE FOR PRODUCT OPTIONS. STRUCTURE: ID[0], OPTION NAME[1], OPTION CATEGORY[2], OPTION-FAMILY[3], PROD.-NR[4], REG. EXPRESSION[5], PRIORITY[6], OUTPUT[7]

$ct_product_options = array (
  array("0001", "Chimney", "Additional options", "/^AAAA/", "9999", "/^Chimney with./", "0", "TRUE"),
  array("0003", "Roof tiles", "Basic options", "/^RRRR/", "0022", "/^Roof tiles./", "0", "TRUE"),
  array("0004", "Windows", "Basic options", "/^C...$/", "2234", "/.windows./", "0", "TRUE"),
  array("0004", "Windows", "Basic options", "/^C...$/", "2567", "/.wide windows./", "0", "TRUE"), 
  array("0002", "Material of ground floor", "Additional options", "", "/^2.../", "/.wood./", "0", "Wood"),  
  array("0005", "Door color", "Basic options", "", "", "/.green./", "0", "Green") 

);

// IMPORTANT: THE REG. EXPRESSIONS CAN BE DEFINED MANY TIMES (e. g. 10 DIFFERENT REG. EXPRESSIONS FOR OPTION "WINDOWS"). POINTS "." REPRESENTS EMPTY SPACES WITHIN PRODUCT DESCRIPTION AND ARE IMPORTANT TO IDENTIFY EXACTLY AN OPTION. 

// FOR LOOP TO MAKE COMPARISON BETWEEN INPUT PRODUCT DATA AND PREDEFINED CUST. STRINGS

$matches_array = array();

foreach ($input_product_data as [$product_family, $product_number, $product_description]) {
    foreach ($ct_product_options as [$option_id, $option_name, $option_category, $product_family_reg_exp, $product_number_reg_exp, $regular_expression, $priority, $output]) {

        if (preg_match($regular_expression, $product_description) == 1
            && preg_match($product_family_reg_exp, $product_family) == 1
            || preg_match($regular_expression, $product_description) == 1
            && preg_match($product_number_reg_exp, $product_number) == 1) {

            $matches_array[] = array("id" => $option_id, "option_name" => $option_name, "option_category" => $option_category, "output" => $output);

        } else {

            if (empty($product_family) && empty($product_number)) {
                if (preg_match($regular_expression, $product_description) == 1) {
                    $matches_array[] = array("id" => $option_id, "option_name" => $option_name, "option_category" => $option_category, "output" => $output);
                }
            }
        }
    }
}

echo "<pre>";
print_r($matches_array);

// FUNCTION TO DELETE DUPLICATES FROM ARRAY WITH MATCHES

function unique_multidimensional_array($array, $key) {
    $temp_array = array();
    $i = 0;
    $key_array = array();

    foreach ($array as $val) {
        if (!in_array($val[$key], $key_array)) {
            $key_array[$i] = $val[$key];
            $temp_array[$i] = $val;
        }
        $i++;
    }
    return $temp_array;
}

echo "<br><h3>UNIQUE MATCHES</h3>";

// CALL OF THE FUNCTION TO GET UNIQUE MATCHES

$unique_matches = unique_multidimensional_array($matches_array, 'id');
sort($unique_matches);
echo "<pre>";
print_r($unique_matches);

// CALL OF THE FUNCTION TO CREATE LIST/ARRAY WITH ALL AVAILABLE PRODUCT OPTIONS

$list_all_product_options = unique_multidimensional_array($ct_product_options, 0);
$list_all_product_options_short = array();

foreach ($list_all_product_options as $option_item) {
    $list_all_product_options_short[] =  array("option_id" => $option_item[0], "option_name" => $option_item[1], "option_category" => $option_item[2]);
}

sort($list_all_product_options_short);

echo "<h3>LIST WITH ALL PRODUCT OPTIONS (SHORT VERSION)</h3>";
echo "<pre>";
print_r($list_all_product_options_short);

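My questions:
1. How can I combine the two arrays $list_all_product_options_short and $unique_matches into one array in the following format: all values from $list_all_product_options_short plus the single field "output" from $unique_matches?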

// EXAMPLE:

0001 - Chimney - Additional options - TRUE
0005 - Door color - Basic options - Green

// etc.

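2. In addition, the new parameter "priority" should be taken into account during the matching. It should be used to override/prioritize the result in certain cases. For example, when a door has two different colors, "green" (priority = "0") and "red" (priority = "1"), the door should get "output" = "red".
3. I would be very grateful for any tips on better coding and better performance.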
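
Points 1 and 2 could be approached roughly as sketched below. This is only a sketch under assumptions: it presumes each match also stores a numeric "priority" key (the loop above does not store it), that a higher priority wins, and that unmatched options should show "FALSE". The variable names $matches, $allOptions, $bestById and $merged and the sample rows are made up for illustration.

```php
<?php
// Hypothetical sample matches, shaped like $matches_array but with an
// added "priority" key (an assumption, the original loop does not keep it).
$matches = [
    ["id" => "0001", "output" => "TRUE",  "priority" => 0],
    ["id" => "0005", "output" => "Green", "priority" => 0],
    ["id" => "0005", "output" => "Red",   "priority" => 1],
];

// Catalogue of all options, shaped like $list_all_product_options_short.
$allOptions = [
    ["option_id" => "0001", "option_name" => "Chimney",    "option_category" => "Additional options"],
    ["option_id" => "0003", "option_name" => "Roof tiles", "option_category" => "Basic options"],
    ["option_id" => "0005", "option_name" => "Door color", "option_category" => "Basic options"],
];

// Keep only the highest-priority match per option ID.
$bestById = [];
foreach ($matches as $match) {
    $id = $match["id"];
    if (!isset($bestById[$id]) || $match["priority"] > $bestById[$id]["priority"]) {
        $bestById[$id] = $match;
    }
}

// Attach the winning output to every catalogue entry.
// Options without any match get "FALSE" (assumed default).
$merged = [];
foreach ($allOptions as $option) {
    $option["output"] = $bestById[$option["option_id"]]["output"] ?? "FALSE";
    $merged[] = $option;
}

foreach ($merged as $row) {
    echo implode(" - ", $row), "\n";
}
```

Indexing the matches by ID first means the catalogue is walked only once, and the separate de-duplication step is no longer needed because a later match simply replaces an earlier one with lower priority.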


vpfxa7rd1#

Loop over the array with all options. Check whether the ID is in the $unique list; if it is, the option is available.

foreach ($ct_all_house_options as [$id, $name]) {
    if (in_array($id, $unique)) {
        echo "$id - $name - available<br>";
    } else {
        echo "$id - $name - not available<br>";
    }
}

y4ekin9u2#

If performance is very important, you can prepare associative arrays for faster data access. Here is a solution for the whole problem:

// Group the patterns by option ID; one ID (e.g. "0004") may have several patterns.
$patternsById = [];
foreach ($ct_house_options as [$id, , $pattern]) {
    $patternsById[$id][] = $pattern;
}

$availabilityById = [];
foreach ($ct_all_house_options as [$id]) {
    $availabilityById[$id] = false;
    foreach ($patternsById[$id] ?? [] as $pattern) {
        foreach ($input_house_data as $data) {
            if (preg_match($pattern, $data[1])) {
                $availabilityById[$id] = true;
                break 2; // first hit is enough, continue with the next option
            }
        }
    }
}

foreach ($ct_all_house_options as $option) {
    $availabilityString = $availabilityById[$option[0]]
        ? 'available'
        : 'not available';
    echo "$option[0] - $option[1] - $availabilityString\n";
}

Performance:

  • Execution time:
  • Min: 0.00000476837158203 s
  • Max: 0.0004758348388672 s
  • Average: 0.0000226473808289 s
  • Memory usage: 396 KB

However, if you rearrange the foreach loops, you can achieve the same result with less code:

$available = [];
foreach ($input_house_data as $data) {
    foreach ($ct_house_options as $option) {
        if (preg_match($option[2], $data[1])) {
            $available[$option[0]] = true;
        }
    }
}

foreach($ct_all_house_options as $option) {
    $availabilityString = isset($available[$option[0]])
        ? 'available'
        : 'not available';
    echo "$option[0] - $option[1] - $availabilityString\n";
}

Performance:

  • Execution time:
  • Min: 0.00000691413879395 s
  • Max: 0.00018811225891113 s
  • Average: 0.0000229740142822 s
  • Memory usage: 396 KB

If you still want to use $unique, you can prepare an associative array from it for performance reasons:

$indexById = array_flip($unique);
foreach ($ct_all_house_options as $option) {
    $availabilityString = isset($indexById[$option[0]])
        ? 'available'
        : 'not available';
    echo "$option[0] - $option[1] - $availabilityString\n";
}

Performance:

  • Execution time:
  • Min: 0.00000905990600586 s
  • Max: 0.00018906593322754 s
  • Average: 0.0000392961502075 s
  • Memory usage: 397 KB

Using in_array instead:

foreach ($ct_all_house_options as $option) {
    $availabilityString = in_array($option[0], $unique)
        ? 'available'
        : 'not available';
    echo "$option[0] - $option[1] - $availabilityString\n";
}

Performance:

  • Execution time:
  • Min: 0.0000100135803223 s
  • Max: 0.0050561428070068 s
  • Average: 0.0000860285758972 s
  • Memory usage: 396 KB

I used PHP Sandbox to run the tests. Of course, the execution times are not always the same.
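
For readers who want to repeat such measurements locally, here is a minimal sketch of a timing loop. It is not the setup used for the numbers above; the helper name benchmark() and the iteration count are assumptions, and hrtime() requires PHP 7.3 or later.

```php
<?php
// Hypothetical helper: runs $fn repeatedly and reports min / max / average
// wall-clock time per run in seconds.
function benchmark(callable $fn, int $iterations = 1000): array
{
    $times = [];
    for ($i = 0; $i < $iterations; $i++) {
        $start = hrtime(true);                    // monotonic clock, nanoseconds
        $fn();
        $times[] = (hrtime(true) - $start) / 1e9; // convert to seconds
    }
    return [
        'min' => min($times),
        'max' => max($times),
        'avg' => array_sum($times) / count($times),
    ];
}

// Sample data taken from the question.
$unique = ["0001", "0004", "0003"];
$ids = ["0001", "0002", "0003", "0004", "0005"];

// Variant 1: linear search with in_array().
$viaInArray = benchmark(function () use ($unique, $ids) {
    foreach ($ids as $id) {
        in_array($id, $unique);
    }
});

// Variant 2: key lookup in a flipped array.
$indexById = array_flip($unique);
$viaIsset = benchmark(function () use ($indexById, $ids) {
    foreach ($ids as $id) {
        isset($indexById[$id]);
    }
});

printf("in_array: avg %.10f s\n", $viaInArray['avg']);
printf("isset:    avg %.10f s\n", $viaIsset['avg']);
```

With only five options, the difference between the variants is likely to be within the measurement noise; the lookup array mainly pays off when the catalogue grows.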


o4hqfura3#

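1. Only column [1] of $input_house_data holds values that matter. By imploding these values into a single merged haystack string, you can reduce the total number of preg_match() calls required.
2. Each house option should have its own regex pattern. You could make a separate lookup array, but I would simply add a regex column to the "all options" array.
Code: (Demo) (or with a lookup array)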

$dwellingInfo = [
    ["AAA", "Chimney with red bricks"],
    ["BBB", "Two wide windows in the main floor"],
    ["CCC", "Roof tiles renewed in 2015"]
];

$haystack = implode(
    ',',
    array_column($dwellingInfo, 1)
);

$dwellingAttributes = [
    ["0001", 'Chimney', '/\b(?:Chimney with|Big stove)\b/i'],
    ["0002", 'Garden', '/\bgarden\b/i'],
    ["0003", 'Roof tiles', '/\bRoof tiles\b/i'],
    ["0004", 'Windows', '/\bwindows?\b/i'],
    ["0005", 'Garage', '/\b(?:garage|car port|under cover area)\b/i']
];

foreach ($dwellingAttributes as [$id, $term, $pattern]) {
    printf(
        "%s - %s - %s\n",
        $id,
        $term,
        preg_match($pattern, $haystack)
            ? 'available'
            : 'unavailable'
    );
}

Output:

0001 - Chimney - available
0002 - Garden - unavailable
0003 - Roof tiles - available
0004 - Windows - available
0005 - Garage - unavailable
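
The lookup-array alternative mentioned in point 2 might look roughly like the following. This is a sketch, not the linked demo: the array name $patternsById is an assumption, and options without an entry in it are simply reported as unavailable.

```php
<?php
$dwellingInfo = [
    ["AAA", "Chimney with red bricks"],
    ["BBB", "Two wide windows in the main floor"],
    ["CCC", "Roof tiles renewed in 2015"],
];

// One merged haystack, so each pattern is tested only once.
$haystack = implode(',', array_column($dwellingInfo, 1));

// Catalogue from the question: ID, option name.
$ct_all_house_options = [
    ["0001", "Chimney"],
    ["0002", "Garden"],
    ["0003", "Roof tiles"],
    ["0004", "Windows"],
    ["0005", "Garage"],
];

// Hypothetical lookup array: option ID => pattern.
// Keys such as "0001" stay strings in PHP because of the leading zeros.
$patternsById = [
    "0001" => '/\bChimney with\b/i',
    "0003" => '/\bRoof tiles\b/i',
    "0004" => '/\bwindows?\b/i',
];

$lines = [];
foreach ($ct_all_house_options as [$id, $name]) {
    // An option without a pattern can never be detected.
    $isAvailable = isset($patternsById[$id])
        && preg_match($patternsById[$id], $haystack) === 1;
    $lines[] = sprintf("%s - %s - %s", $id, $name, $isAvailable ? 'available' : 'unavailable');
}

echo implode("\n", $lines), "\n";
```

Keeping the patterns in a separate array leaves the catalogue untouched, at the cost of having to keep the two arrays in sync by ID.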
