Azure数据工厂-如何将具有动态键的对象转换为数据流中的数组?

bbuxkriu  于 2023-02-13  发布在  其他
关注(0)|答案(1)|浏览(159)

在花了很多时间阅读文档,遵循一些教程和试错,我只是不能弄清楚;如何使用Azure数据工厂中的数据流将以下包含键对象的复杂对象转换为数组?

    • 输入**
{
  "headers": {
    "Content-Length": 1234
  },
  "body": {
    "00b50a39-8591-3db3-88f7-635e2ec5c65a": {
      "id": "00b50a39-8591-3db3-88f7-635e2ec5c65a",
      "name": "Example 1",
      "date": "2023-02-09"
    },
    "0c206312-2348-391b-99f0-261323a94d95": {
      "id": "0c206312-2348-391b-99f0-261323a94d95",
      "name": "Example 2",
      "date": "2023-02-09"
    },
    "0c82d1e4-a897-32f2-88db-6830a21b0a43": {
      "id": "00b50a39-8591-3db3-88f7-635e2ec5c65a",
      "name": "Example 3",
      "date": "2023-02-09"
    },
  }
}
    • 预期产出**
[
  {
    "id": "00b50a39-8591-3db3-88f7-635e2ec5c65a",
    "name": "Example 1",
    "date": "2023-02-09"
  },
  {
    "id": "0c206312-2348-391b-99f0-261323a94d95",
    "name": "Example 2",
    "date": "2023-02-09"
  },
  {
    "id": "00b50a39-8591-3db3-88f7-635e2ec5c65a",
    "name": "Example 3",
    "date": "2023-02-09"
  }
]
oxcyiej7

oxcyiej71#

    • AFAIK**,您的JSON键是动态的。因此,使用数据流可能无法获得所需的结果。

在这种情况下,你可以尝试下面的方法作为一个解决方案。这将只在你的所有密钥的长度是相同的。

    • 这是我的渠道:**

  • 首先,我使用了一个查找活动来获取JSON文件,并使用下面的表达式. @substring(string(activity('Lookup1').output.value[0].body),2,sub(length(string(activity('Lookup1').output.value[0].body)),4))将查找输出转换为字符串并存储在变量中。
  • 然后我用'},"'在字符串变量上使用了split,并使用下面的表达式将其存储在一个数组变量中。@split(variables('res_str'),'},"')它将给出如下所示的数组。

  • 将该数组赋予ForEach,并在ForEach内使用append variable活动将键存储到具有以下表达式的数组中。@take(item(), 36)
  • 现在,我得到了数组中的键列表,在上面的ForEach之后,使用另一个ForEach活动来获得所需的对象数组。在ForEach中使用append变量活动,并给出下面的表达式。@activity('Lookup1').output.value[0].body[item()]
    • ForEach之后的结果数组将为:**

如果要将上面的JSON存储到文件中,需要使用SQL中的OPENJSON,因为复制活动附加列只支持字符串类型,不支持数组类型。
在拷贝活动源上使用SQL数据集,并在查询中提供以下SQL脚本。

DECLARE @json NVARCHAR(MAX)
SET @json =   
  N'@{variables('json_arr')}'  
   
SELECT * FROM  
 OPENJSON ( @json )  
WITH (   
              id varchar(200) '$.id' ,  
              name varchar(32)     '$.name',  
              date varchar(32) '$.date'
 )

在Sink中,给定一个JSON数据集,并选择Array of Objects作为文件模式。

执行管道,您将在一个文件中获得上面的数组。

    • 这是我的管道JSON:**
{
"name": "pipeline1",
"properties": {
    "activities": [
        {
            "name": "Lookup1",
            "type": "Lookup",
            "dependsOn": [],
            "policy": {
                "timeout": "0.12:00:00",
                "retry": 0,
                "retryIntervalInSeconds": 30,
                "secureOutput": false,
                "secureInput": false
            },
            "userProperties": [],
            "typeProperties": {
                "source": {
                    "type": "JsonSource",
                    "storeSettings": {
                        "type": "AzureBlobFSReadSettings",
                        "recursive": true,
                        "enablePartitionDiscovery": false
                    },
                    "formatSettings": {
                        "type": "JsonReadSettings"
                    }
                },
                "dataset": {
                    "referenceName": "Json1",
                    "type": "DatasetReference"
                },
                "firstRowOnly": false
            }
        },
        {
            "name": "Lookup output to Str",
            "description": "",
            "type": "SetVariable",
            "dependsOn": [
                {
                    "activity": "Lookup1",
                    "dependencyConditions": [
                        "Succeeded"
                    ]
                }
            ],
            "userProperties": [],
            "typeProperties": {
                "variableName": "res_str",
                "value": {
                    "value": "@substring(string(activity('Lookup1').output.value[0].body),2,sub(length(string(activity('Lookup1').output.value[0].body)),4))",
                    "type": "Expression"
                }
            }
        },
        {
            "name": "Split Str to array",
            "type": "SetVariable",
            "dependsOn": [
                {
                    "activity": "Lookup output to Str",
                    "dependencyConditions": [
                        "Succeeded"
                    ]
                }
            ],
            "userProperties": [],
            "typeProperties": {
                "variableName": "split_arr",
                "value": {
                    "value": "@split(variables('res_str'),'},\"')",
                    "type": "Expression"
                }
            }
        },
        {
            "name": "build keys array using split array",
            "type": "ForEach",
            "dependsOn": [
                {
                    "activity": "Split Str to array",
                    "dependencyConditions": [
                        "Succeeded"
                    ]
                }
            ],
            "userProperties": [],
            "typeProperties": {
                "items": {
                    "value": "@variables('split_arr')",
                    "type": "Expression"
                },
                "isSequential": true,
                "activities": [
                    {
                        "name": "take first 36 chars of every item",
                        "type": "AppendVariable",
                        "dependsOn": [],
                        "userProperties": [],
                        "typeProperties": {
                            "variableName": "keys_array",
                            "value": {
                                "value": "@take(item(), 36)",
                                "type": "Expression"
                            }
                        }
                    }
                ]
            }
        },
        {
            "name": "build final array using keys array",
            "type": "ForEach",
            "dependsOn": [
                {
                    "activity": "build keys array using split array",
                    "dependencyConditions": [
                        "Succeeded"
                    ]
                }
            ],
            "userProperties": [],
            "typeProperties": {
                "items": {
                    "value": "@variables('keys_array')",
                    "type": "Expression"
                },
                "isSequential": true,
                "activities": [
                    {
                        "name": "Append variable1",
                        "description": "append every object to array",
                        "type": "AppendVariable",
                        "dependsOn": [],
                        "userProperties": [],
                        "typeProperties": {
                            "variableName": "json_arr",
                            "value": {
                                "value": "@activity('Lookup1').output.value[0].body[item()]",
                                "type": "Expression"
                            }
                        }
                    }
                ]
            }
        },
        {
            "name": "Just for Res show",
            "type": "SetVariable",
            "dependsOn": [
                {
                    "activity": "build final array using keys array",
                    "dependencyConditions": [
                        "Succeeded"
                    ]
                }
            ],
            "userProperties": [],
            "typeProperties": {
                "variableName": "final_res_show",
                "value": {
                    "value": "@variables('json_arr')",
                    "type": "Expression"
                }
            }
        },
        {
            "name": "Copy data1",
            "type": "Copy",
            "dependsOn": [
                {
                    "activity": "Just for Res show",
                    "dependencyConditions": [
                        "Succeeded"
                    ]
                }
            ],
            "policy": {
                "timeout": "0.12:00:00",
                "retry": 0,
                "retryIntervalInSeconds": 30,
                "secureOutput": false,
                "secureInput": false
            },
            "userProperties": [],
            "typeProperties": {
                "source": {
                    "type": "AzureSqlSource",
                    "sqlReaderQuery": "DECLARE @json NVARCHAR(MAX)\nSET @json =   \n  N'@{variables('json_arr')}'  \n   \nSELECT * FROM  \n OPENJSON ( @json )  \nWITH (   \n              id varchar(200) '$.id' ,  \n              name varchar(32)     '$.name',  \n              date varchar(32) '$.date'\n )",
                    "queryTimeout": "02:00:00",
                    "partitionOption": "None"
                },
                "sink": {
                    "type": "JsonSink",
                    "storeSettings": {
                        "type": "AzureBlobFSWriteSettings"
                    },
                    "formatSettings": {
                        "type": "JsonWriteSettings",
                        "filePattern": "arrayOfObjects"
                    }
                },
                "enableStaging": false
            },
            "inputs": [
                {
                    "referenceName": "AzureSqlTable1",
                    "type": "DatasetReference"
                }
            ],
            "outputs": [
                {
                    "referenceName": "Target_JSON",
                    "type": "DatasetReference"
                }
            ]
        }
    ],
    "variables": {
        "res_str": {
            "type": "String"
        },
        "split_arr": {
            "type": "Array"
        },
        "keys_array": {
            "type": "Array"
        },
        "final_res_show": {
            "type": "Array"
        },
        "json_arr": {
            "type": "Array"
        }
    },
    "annotations": []
}
}
    • 结果文件:**

相关问题