通过SQL端点将Zip文件夹转换为Blob和Synapse视图作为数据源将活动复制到Azure SQL DB

7fyelxc5  于 2023-08-07  发布在  其他
关注(0)|答案(1)|浏览(96)

我对azure和数据工厂还很陌生,有两个关于数据源的问题。
在我的管道的开始,我需要连接到一个共享的本地zip文件夹,并复制到azureblob存储。我知道我需要创建一个自我主机和连接到一个文件系统等,但我不知道什么文件格式使用其.zip文件夹和文件类型选项分隔,二进制,Excel等?
我的下一个问题是我需要编写访问blob存储源的synapse视图,我知道这是通过配置SQL端点和创建外部表完成的,但我需要我创建的synapse视图作为数据复制源,将视图的结果传输到Azure SQL DB,并且不知道如何选择视图作为数据复制活动中的源?

ff29svar

ff29svar1#

要将.zip文件从本地复制到Blob存储,您应该在ADF中使用二进制数据集。使用ZipDeflate (.zip)压缩类型的文件系统链接服务创建二进制数据集。


的数据
然后,您应该能够将解压缩的文件复制到blob存储中,如下所述:



这里是将Zip文件从本地复制到blob存储的管道Json,供您参考:

{
    "name": "Pipeline 2",
    "properties": {
        "activities": [
            {
                "name": "Copy data1",
                "type": "Copy",
                "dependsOn": [],
                "policy": {
                    "timeout": "0.12:00:00",
                    "retry": 0,
                    "retryIntervalInSeconds": 30,
                    "secureOutput": false,
                    "secureInput": false
                },
                "userProperties": [],
                "typeProperties": {
                    "source": {
                        "type": "BinarySource",
                        "storeSettings": {
                            "type": "FileServerReadSettings",
                            "recursive": true
                        },
                        "formatSettings": {
                            "type": "BinaryReadSettings",
                            "compressionProperties": {
                                "type": "ZipDeflateReadSettings"
                            }
                        }
                    },
                    "sink": {
                        "type": "BinarySink",
                        "storeSettings": {
                            "type": "AzureBlobStorageWriteSettings"
                        }
                    },
                    "enableStaging": false
                },
                "inputs": [
                    {
                        "referenceName": "Binary2",
                        "type": "DatasetReference"
                    }
                ],
                "outputs": [
                    {
                        "referenceName": "Binary1",
                        "type": "DatasetReference"
                    }
                ]
            }
        ],
        "annotations": []
    }
}

字符串
如果创建的视图在synapse无服务器池中,则创建与无服务器SQL池详细信息链接的Azure SQL数据库,并使用系统分配的托管身份验证,如下所述:



创建一个Azure SQL数据库数据集,选择此数据集的上述链接服务,然后您将能够直接选择类似表的视图。



这里是管道Json,用于将视图从synapse复制到blob Azure SQL数据库:

{
    "name": "Pipeline 1",
    "properties": {
        "activities": [
            {
                "name": "Copy data1",
                "type": "Copy",
                "dependsOn": [],
                "policy": {
                    "timeout": "0.12:00:00",
                    "retry": 0,
                    "retryIntervalInSeconds": 30,
                    "secureOutput": false,
                    "secureInput": false
                },
                "userProperties": [],
                "typeProperties": {
                    "source": {
                        "type": "AzureSqlSource",
                        "queryTimeout": "02:00:00",
                        "partitionOption": "None"
                    },
                    "sink": {
                        "type": "AzureSqlSink",
                        "writeBehavior": "insert",
                        "sqlWriterUseTableLock": false,
                        "tableOption": "autoCreate",
                        "disableMetricsCollection": false
                    },
                    "enableStaging": false,
                    "translator": {
                        "type": "TabularTranslator",
                        "typeConversion": true,
                        "typeConversionSettings": {
                            "allowDataTruncation": true,
                            "treatBooleanAsNumber": false
                        }
                    }
                },
                "inputs": [
                    {
                        "referenceName": "AzureSqlTable1",
                        "type": "DatasetReference"
                    }
                ],
                "outputs": [
                    {
                        "referenceName": "AzureSqlTable2",
                        "type": "DatasetReference"
                    }
                ]
            }
        ],
        "annotations": []
    }
}

相关问题