Parsing JSON with JSONPath in PHP

rkkpypqq  posted on 2023-06-21  in  PHP

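I am trying to parse JSON in PHP using JSONPath.
My JSON comes from here:
https://servizionline.sanita.fvg.it/tempiAttesaService/tempiAttesaPs
(it is too long to cut and paste here, but you can view it in a browser session...)
The JSON is valid (I have validated it with https://jsonlint.com/).
I have tried the JSONPath expression with http://www.jsonquerytool.com/ and everything seems fine, but when I put it all together in the PHP sample below...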

<?php  
    ini_set('display_errors', 'On');
    error_reporting(E_ALL);

    require_once('json.php');      // JSON parser
    require_once('jsonpath-0.8.0.php');  // JSONPath evaluator

    $url = 'https://servizionline.sanita.fvg.it/tempiAttesaService/tempiAttesaPs';

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_AUTOREFERER, TRUE);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
    curl_setopt($ch, CURLOPT_PROXY, '');
    $data = curl_exec($ch);
    curl_close($ch);

    $parser = new Services_JSON(SERVICES_JSON_LOOSE_TYPE);
    $o = $parser->decode($data);

    $xpath_for_parsing = '$..aziende[?(@.descrizione=="A.S.U.I. - Trieste")]..prontoSoccorsi[?(@.descrizione=="Pronto Soccorso e Terapia Urgenza Trieste")]..dipartimenti[?(@.descrizione=="Pronto Soccorso Maggiore")]..codiciColore[?(@.descrizione=="Bianco")]..situazionePazienti..numeroPazientiInAttesa';

    $match1 = jsonPath($o, $xpath_for_parsing);
    //print_r($match1);
    $match1_encoded = $parser->encode($match1);
    print_r($match1_encoded);

    $match1_decoded = json_decode($match1_encoded);

    //print_r($match1_decoded);

    if ($match1_decoded[0] != '') {
     return  $match1_decoded[0];
    }
    else {
     return  "N.D.";
   } 
?>

...no value is printed, only a "false" value.
When I put the JSONPath expression into my PHP code, it produces the following errors:

Warning: Missing argument 3 for JsonPath::evalx(), called in /var/www/html/OpenProntoSoccorso/Test/jsonpath-0.8.0.php on line 84 and defined in /var/www/html/OpenProntoSoccorso/Test/jsonpath-0.8.0.php on line 101

Notice: Use of undefined constant descrizione - assumed 'descrizione' in /var/www/html/OpenProntoSoccorso/Test/jsonpath-0.8.0.php(104) : eval()'d code on line 1

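Maybe I have to escape or quote my JSONPath expression to use it in PHP, but I don't know how... any suggestion is appreciated.
Note: I need to use filter expressions such as ?(@.descrizione=="A.S.U.I. - Trieste"); I cannot use "positional" JSONPath expressions...
I have also tried jsonpath-0.8.3.php from https://github.com/ITS-UofIowa/jsonpath/blob/master/jsonpath.php, but nothing changed...
Any suggestions?
Thank you.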

Answer #1 (chhkpiq4)

<?php // Print the original JSON as-is
define("DIRPATH", dirname($_SERVER["SCRIPT_FILENAME"]) . '/');
define("WEBPATH", 'http://' . $_SERVER['SERVER_ADDR'] . dirname($_SERVER['PHP_SELF']) . '/');
//define("WEBPORT", 'http://' . $_SERVER['SERVER_ADDR'] . ':' . $_SERVER['SERVER_PORT'] . dirname($_SERVER['PHP_SELF']) . '/');
//define("imgpath", DIRPATH . 'image/');
//$png = file_get_contents('iptv.kodi.al/images/');
$jsondata = file_get_contents('https://servizionline.sanita.fvg.it/tempiAttesaService/tempiAttesaPs');
header("Content-type: application/ld+json; charset=utf-8");
    $print = json_decode($jsondata);
    print_r($print);
?>

<?php // Print by category
define("DIRPATH", dirname($_SERVER["SCRIPT_FILENAME"]) . '/');
define("WEBPATH", 'http://' . $_SERVER['SERVER_ADDR'] . dirname($_SERVER['PHP_SELF']) . '/');
//define("WEBPORT", 'http://' . $_SERVER['SERVER_ADDR'] . ':' . $_SERVER['SERVER_PORT'] . dirname($_SERVER['PHP_SELF']) . '/');
//define("imgpath", DIRPATH . 'image/');
//$png = file_get_contents('iptv.kodi.al/images/');
$jsondata = file_get_contents('https://servizionline.sanita.fvg.it/tempiAttesaService/tempiAttesaPs');
header("Content-type: application/ld+json; charset=utf-8");
    $print = json_decode($jsondata);
    //print_r($print);
    $items = '';
    // The list starts here
    foreach ($print->{'aziende'} as $item) {
    $items .= '
' . $item->id . '
' . $item->descrizione . '
';
};
?>
<?php echo $items; ?>

Answer #2 (q7solyqu)

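You can use json_decode to convert it into a native PHP array, then use hhb_xml_encode (from https://stackoverflow.com/a/43697765/1067003) to convert the array to XML, then load that XML into a DOMDocument with DOMDocument::loadHTML, and finally search it with XPath through DOMXPath::query...
Example: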

<?php
declare(strict_types = 1);
header ( "content-type: text/plain;charset=utf8" );
require_once ('hhb_.inc.php');
$json_raw = (new hhb_curl ( '', true ))->exec ( 'https://servizionline.sanita.fvg.it/tempiAttesaService/tempiAttesaPs' )->getStdOut ();
$parsed = json_decode ( $json_raw, true );
// var_dump ( $parsed );
$xml = hhb_xml_encode ( $parsed );
// var_dump($xml);
$dom = new DOMDocument ();
@$dom->loadHTML ( $xml ); // instance call: calling loadHTML statically is an error on PHP 8
$dom->formatOutput = true;
$xp = new DOMXPath ( $dom );
$elements_for_parsing = $xp->query ( '//aziende/descrizione[text()=' . xpath_quote ( 'A.S.U.I. - Trieste' ) . ']|//prontosoccorsi/descrizione[text()=' . xpath_quote ( 'Pronto Soccorso e Terapia Urgenza Trieste' ) . ']|//dipartimenti/descrizione[text()=' . xpath_quote ( 'Pronto Soccorso Maggiore' ) . ']|//codicicolore/descrizione[text()=' . xpath_quote ( 'Bianco' ) . ']|//situazionepazienti|//numeroPazientiInAttesa' );
// var_dump ( $elements_for_parsing,$dom->saveXML() );
foreach ( $elements_for_parsing as $ele ) {
    var_dump ( $ele->textContent );
}

// based on https://stackoverflow.com/a/1352556/1067003
function xpath_quote(string $value): string {
    if (false === strpos ( $value, '"' )) {
        return '"' . $value . '"';
    }
    if (false === strpos ( $value, '\'' )) {
        return '\'' . $value . '\'';
    }
    // if the value contains both single and double quotes, construct an
    // expression that concatenates all non-double-quote substrings with
    // the quotes, e.g.:
    //
    // concat("'foo'", '"', "bar")
    $sb = 'concat(';
    $substrings = explode ( '"', $value );
    for($i = 0; $i < count ( $substrings ); ++ $i) {
        $needComma = ($i > 0);
        if ($substrings [$i] !== '') {
            if ($i > 0) {
                $sb .= ', ';
            }
            $sb .= '"' . $substrings [$i] . '"';
            $needComma = true;
        }
        if ($i < (count ( $substrings ) - 1)) {
            if ($needComma) {
                $sb .= ', ';
            }
            $sb .= "'\"'";
        }
    }
    $sb .= ')';
    return $sb;
}
function hhb_xml_encode(array $arr, string $name_for_numeric_keys = 'val'): string {
    if (empty ( $arr )) {
        // avoid having a special case for <root/> and <root></root> i guess
        return '';
    }
    $is_iterable_compat = function ($v): bool {
        // php 7.0 compat for php7.1+'s is_iterable
        return is_array ( $v ) || ($v instanceof \Traversable);
    };
    $isAssoc = function (array $arr): bool {
        // thanks to Mark Amery for this
        if (array () === $arr)
            return false;
        return array_keys ( $arr ) !== range ( 0, count ( $arr ) - 1 );
    };
    $endsWith = function (string $haystack, string $needle): bool {
        // thanks to MrHus
        $length = strlen ( $needle );
        if ($length == 0) {
            return true;
        }
        return (substr ( $haystack, - $length ) === $needle);
    };
    $formatXML = function (string $xml) use ($endsWith): string {
        // there seems to be a bug with formatOutput on DOMDocuments that have used importNode with $deep=true
        // on PHP 7.0.15...
        $domd = new DOMDocument ( '1.0', 'UTF-8' );
        $domd->preserveWhiteSpace = false;
        $domd->formatOutput = true;
        $domd->loadXML ( '<root>' . $xml . '</root>' );
        $ret = trim ( $domd->saveXML ( $domd->getElementsByTagName ( "root" )->item ( 0 ) ) );
        assert ( 0 === strpos ( $ret, '<root>' ) );
        assert ( $endsWith ( $ret, '</root>' ) );
        $full = trim ( substr ( $ret, strlen ( '<root>' ), - strlen ( '</root>' ) ) );
        $ret = '';
        // ... seems each line except the first line starts with 2 ugly spaces,
        // presumably its the <root> element that starts with no spaces at all.
        foreach ( explode ( "\n", $full ) as $line ) {
            if (substr ( $line, 0, 2 ) === '  ') {
                $ret .= substr ( $line, 2 ) . "\n";
            } else {
                $ret .= $line . "\n";
            }
        }
        $ret = trim ( $ret );
        return $ret;
    };

    // $arr = new RecursiveArrayIterator ( $arr );
    // $iterator = new RecursiveIteratorIterator ( $arr, RecursiveIteratorIterator::SELF_FIRST );
    $iterator = $arr;
    $domd = new DOMDocument ();
    $root = $domd->createElement ( 'root' );
    foreach ( $iterator as $key => $val ) {
        // var_dump ( $key, $val );
        $ele = $domd->createElement ( is_int ( $key ) ? $name_for_numeric_keys : $key );
        if (! empty ( $val ) || $val === '0') {
            if ($is_iterable_compat ( $val )) {
                $asoc = $isAssoc ( $val );
                $tmp = hhb_xml_encode ( $val, is_int ( $key ) ? $name_for_numeric_keys : $key );
                // var_dump ( $tmp );
                // die ();
                $tmpDom = new DOMDocument ();
                @$tmpDom->loadXML ( '<root>' . $tmp . '</root>' ); // instance call, required on PHP 8
                $tmp = $tmpDom;
                foreach ( $tmp->getElementsByTagName ( "root" )->item ( 0 )->childNodes ?? [ ] as $tmp2 ) {
                    $tmp3 = $domd->importNode ( $tmp2, true );
                    if ($asoc) {
                        $ele->appendChild ( $tmp3 );
                    } else {
                        $root->appendChild ( $tmp3 );
                    }
                }
                unset ( $tmp, $tmp2, $tmp3 );
                if (! $asoc) {
                    // echo 'REMOVING';die();
                    // $ele->parentNode->removeChild($ele);
                    continue;
                }
            } else {
                $ele->textContent = $val;
            }
        }
        $root->appendChild ( $ele );
    }
    $domd->preserveWhiteSpace = false;
    $domd->formatOutput = true;
    $ret = trim ( $domd->saveXML ( $root ) );
    assert ( 0 === strpos ( $ret, '<root>' ) );
    assert ( $endsWith ( $ret, '</root>' ) );
    $ret = trim ( substr ( $ret, strlen ( '<root>' ), - strlen ( '</root>' ) ) );
    // seems to be a bug with formatOutput on DOMDocuments that have used importNode with $deep=true..
    $ret = $formatXML ( $ret );
    return $ret;
}

PS: the lines require_once ('hhb_.inc.php'); $json_raw = (new hhb_curl ( '', true ))->exec ( 'https://servizionline.sanita.fvg.it/tempiAttesaService/tempiAttesaPs' )->getStdOut (); only fetch the URL and put the JSON into $json_raw (using gzip-compressed transfer to speed things up). Replace them with whatever you prefer for getting the JSON into $json_raw. The actual curl library I use is from https://github.com/divinity76/hhb_.inc.php/blob/master/hhb_.inc.php#L477
Currently it prints:

string(18) "A.S.U.I. - Trieste"
string(41) "Pronto Soccorso e Terapia Urgenza Trieste"
string(9) "121200:14"
string(10) "181400:254"
string(6) "Bianco"
string(7) "200:292"
string(5) "00:00"
string(24) "Pronto Soccorso Maggiore"
string(7) "3300:15"
string(6) "Bianco"
string(8) "6200:584"
string(5) "00:00"
string(5) "00:00"
string(8) "4100:353"
string(6) "Bianco"
string(7) "100:051"
string(5) "00:00"
string(5) "00:00"
string(7) "1100:15"
string(8) "6402:012"
string(6) "Bianco"
string(7) "402:274"
string(5) "00:00"
string(9) "11900:202"
string(9) "11401:427"
string(6) "Bianco"
string(8) "2102:051"
string(5) "00:00"
string(7) "3300:08"
string(8) "7401:423"
string(6) "Bianco"
string(8) "8402:104"
string(5) "00:00"
string(6) "Bianco"
string(5) "00:00"
string(5) "00:00"
string(5) "00:00"
string(5) "00:00"
string(7) "1100:04"
string(10) "121000:512"
string(6) "Bianco"
string(8) "5400:461"
string(5) "00:00"
string(5) "00:00"
string(5) "00:00"
string(6) "Bianco"
string(5) "00:00"
string(5) "00:00"
string(9) "121200:18"
string(9) "11800:593"
string(6) "Bianco"
string(8) "6401:272"
string(5) "00:00"
string(6) "Bianco"
string(7) "1100:04"
string(5) "00:00"
string(5) "00:00"
string(5) "00:00"
string(7) "2200:05"
string(9) "10801:102"
string(6) "Bianco"
string(8) "8201:166"
string(5) "00:00"
string(8) "3200:071"
string(7) "100:261"
string(6) "Bianco"
string(5) "00:00"
string(5) "00:00"
string(7) "1100:00"
string(9) "151500:26"
string(10) "161301:123"
string(6) "Bianco"
string(8) "9500:434"
string(7) "1100:00"
string(7) "2200:13"
string(6) "Bianco"
string(7) "200:342"
string(5) "00:00"
string(6) "Bianco"
string(7) "1100:24"
string(5) "00:00"
string(5) "00:00"
string(5) "00:00"
string(7) "1100:04"
string(8) "9700:222"
string(10) "171500:582"
string(6) "Bianco"
string(7) "200:512"
string(7) "1100:40"
string(7) "1100:22"
string(6) "Bianco"
string(8) "3100:062"
string(5) "00:00"
string(5) "00:00"
string(5) "00:00"
string(6) "Bianco"
string(5) "00:00"
string(5) "00:00"
string(7) "1100:22"
string(8) "7500:302"
string(6) "Bianco"
string(5) "00:00"
string(5) "00:00"
string(7) "1100:06"
string(6) "Bianco"
string(7) "1100:00"
string(5) "00:00"
string(5) "00:00"

Hopefully this is what you were looking for; I was guessing based on the "xpath" you provided.
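
For readers who do not want to pull in hhb_.inc.php, the fetch step mentioned in the PS could be sketched with plain cURL. This is only a sketch under assumptions: fetch_body and decode_json_array are hypothetical helper names, not part of the answer's library, and the error handling is one possible choice.

```php
<?php
// Fetch a URL body with plain cURL. Setting CURLOPT_ENCODING to '' asks cURL
// to advertise every compression it supports (gzip included) and to decode
// the response transparently.
// NOTE: fetch_body is a hypothetical helper name used only for illustration.
function fetch_body(string $url): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_ENCODING       => '',
    ]);
    $body  = curl_exec($ch);
    $error = curl_error($ch);
    if ($body === false) {
        throw new RuntimeException('cURL request failed: ' . $error);
    }
    return $body;
}

// Decode JSON text into a native PHP array, failing loudly on invalid input.
// NOTE: decode_json_array is also a hypothetical helper name.
function decode_json_array(string $raw): array
{
    $parsed = json_decode($raw, true);
    if (!is_array($parsed)) {
        throw new UnexpectedValueException('Invalid JSON: ' . json_last_error_msg());
    }
    return $parsed;
}

// Usage (performs a network request, so it is left commented out):
// $json_raw = fetch_body('https://servizionline.sanita.fvg.it/tempiAttesaService/tempiAttesaPs');
// $parsed   = decode_json_array($json_raw);
```

The resulting $parsed array can then be passed to hhb_xml_encode exactly as in the example above.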

Answer #3 (pkln4tw6)

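XPath is too complex for your task, and it is usually overkill...
Just use the standard json_decode(), get the equivalent PHP object, and navigate it with standard for/while loops and regular expressions.
Also, I think your question is misleading: your problem is not parsing JSON (json_decode() does that automatically), your problem is extracting some data from it with XPath. I suggest restructuring your question to show exactly what went wrong and what you intend to do.
If you need to descend to an exact JSON node (or a set of nodes), why not use for loops and regular expressions?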
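
A rough sketch of that loop-based approach is shown below. It assumes each list named in the question's JSONPath (aziende, prontoSoccorsi, dipartimenti, codiciColore) is a direct child of the previous match; the question uses recursive descent (..), so the real feed may nest differently. The helper names and the sample data are hypothetical, written by hand for illustration.

```php
<?php
// Return the first element of $items whose 'descrizione' equals $wanted, or null.
function find_by_descrizione(array $items, string $wanted): ?array
{
    foreach ($items as $item) {
        if (is_array($item) && ($item['descrizione'] ?? null) === $wanted) {
            return $item;
        }
    }
    return null;
}

// Walk a chain of [list key, descrizione] steps, then read the waiting count.
// ASSUMPTION: every list is a direct child of the previous match; adjust the
// chain if the real feed nests differently.
function waiting_patients(array $data, array $steps, string $default = 'N.D.')
{
    $node = $data;
    foreach ($steps as [$key, $descrizione]) {
        $list = $node[$key] ?? null;
        $node = is_array($list) ? find_by_descrizione($list, $descrizione) : null;
        if ($node === null) {
            return $default; // one step of the chain did not match
        }
    }
    return $node['situazionePazienti']['numeroPazientiInAttesa'] ?? $default;
}

// Hypothetical sample shaped like the path in the question (not real data).
$sample = json_decode(<<<'JSON'
{"aziende": [{"descrizione": "A.S.U.I. - Trieste", "prontoSoccorsi": [
  {"descrizione": "Pronto Soccorso e Terapia Urgenza Trieste", "dipartimenti": [
    {"descrizione": "Pronto Soccorso Maggiore", "codiciColore": [
      {"descrizione": "Bianco", "situazionePazienti": {"numeroPazientiInAttesa": 2}}
    ]}
  ]}
]}]}
JSON
, true);

$steps = [
    ['aziende', 'A.S.U.I. - Trieste'],
    ['prontoSoccorsi', 'Pronto Soccorso e Terapia Urgenza Trieste'],
    ['dipartimenti', 'Pronto Soccorso Maggiore'],
    ['codiciColore', 'Bianco'],
];

echo waiting_patients($sample, $steps), "\n";
```

Because the lookup matches on descrizione rather than on array position, it keeps working when the order of entries in the feed changes, which was the asker's stated requirement.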

Answer #4 (4uqofj5v)

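I solved it by changing the JSONPath library implementation: I now use the Skyscanner JsonPath implementation (see https://github.com/Skyscanner/JsonPath-PHP).
Installation was a bit troublesome (for me, since I had never used Composer before...), but the Skyscanner team helped me (see https://github.com/Skyscanner/JsonPath-PHP/issues/6), and now I have this PHP code...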

<?php
    ini_set('display_errors', 'On');
    error_reporting(E_ALL);

    include "./tmp/vendor/autoload.php";

    $url = 'https://servizionline.sanita.fvg.it/tempiAttesaService/tempiAttesaPs';

    //#Set CURL parameters: pay attention to the PROXY config !!!!
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_AUTOREFERER, TRUE);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
    curl_setopt($ch, CURLOPT_PROXY, '');
    $data = curl_exec($ch);
    curl_close($ch);

    $jsonObject = new JsonPath\JsonObject($data);

    $jsonPathExpr = "$..aziende[?(@.descrizione==\"A.S.U.I. - Trieste\")]..prontoSoccorsi[?(@.descrizione==\"Pronto Soccorso e Terapia Urgenza Trieste\")]..dipartimenti[?(@.descrizione==\"Pronto Soccorso Maggiore\")]..codiciColore[?(@.descrizione==\"Verde\")]..situazionePazienti..numeroPazientiInAttesa";

    $r = $jsonObject->get($jsonPathExpr);

    //print json_encode($r);

    print json_encode($r[0]);
?>

In ./tmp I have the content obtained from Composer.

This way I can run my JSON queries, possibly without knowing the exact structure.
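
The code above prints $r[0] directly, which breaks when the query matches nothing. A small guard like the following could restore the "N.D." fallback from the original question. It is a sketch only: first_match_or_default is a hypothetical helper, and since it is not confirmed here whether the library returns false or an empty array for no match, the helper handles both.

```php
<?php
// Return the first match from a JSONPath result, or $default when there is none.
// ASSUMPTION: the result is either an array of matches or a non-array value
// (such as false) when nothing matched.
function first_match_or_default($result, string $default = 'N.D.')
{
    if (!is_array($result) || $result === []) {
        return $default;
    }
    return reset($result); // first element, whatever its key
}

echo first_match_or_default([3, 7]), "\n";  // prints the first match
echo first_match_or_default(false), "\n";   // prints the fallback text
```

Unlike a loose comparison against an empty string, this check also keeps a legitimate count of 0 instead of replacing it with the fallback.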
