Python: copying with Lambda fails for files whose names contain spaces

Posted by bsxbgnwa on 2023-08-02 in Python

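I have recently been working with Lambda functions and S3 buckets, and I ran into a problem with copying. I wrote a function that, when you upload a file to a certain bucket, copies it to another bucket and places it in a folder based on its extension. It then deletes the original file. You can see the code below: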

import json
import boto3
import os
    
# boto3 S3 initialization
s3_client = boto3.client("s3")
   
def lambda_handler(event, context):
    destination_bucket_name = 'destination_storage'
    
    # event contains all information about uploaded object
    print("Event :", event)
    
    # Bucket Name where file was uploaded
    source_bucket_name = event['Records'][0]['s3']['bucket']['name']
       
    # Filename of object (with path)
    file_key_name = event['Records'][0]['s3']['object']['key']
       
    # Parse file type
    file_name, file_extension = os.path.splitext(file_key_name)
       
    # Come up with file path
    folder_name = file_extension.replace('.', '')
       
    # New Filename with new Path
    new_file_key_name = folder_name + "/" + file_key_name
    
    # Copy Source Object
    copy_source_object = {'Bucket': source_bucket_name, 'Key': file_key_name}
    
    # S3 copy object operation
    s3_client.copy_object(CopySource=copy_source_object, Bucket=destination_bucket_name, Key=new_file_key_name)
       
    # Delete file in paste bucket
    s3_client.delete_object(Bucket=source_bucket_name, Key=file_key_name)
    
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from S3 events Lambda!')
    }

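However, I noticed a problem. When a file with a space in its name is uploaded, the file stays in the source bucket and is never copied. I suspected os.path.splitext() was the culprit, but my research suggests it works fine. Is there something about spaces in S3 that I am missing? Does anyone have any ideas?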

Answer #1 (8e2ybdfx)

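The documentation for the Amazon S3 event message structure includes the following note:
"The s3 key provides information about the bucket and object involved in the event. The object key name value is URL encoded. For example, "red flower.jpg" becomes "red+flower.jpg" (Amazon S3 returns "application/x-www-form-urlencoded" as the content type in the response)."
This means that spaces, along with some other characters, are not passed through as-is. The encoded string in the event does not match the real object key, so S3 cannot find the object and your copy fails. You need to decode the S3 key with the following code: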

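from urllib.parse import unquote_plus

# Inside lambda_handler, where the key is read from the event:
# Filename of object (with path), still URL-encoded as delivered in the event
file_key_name = event['Records'][0]['s3']['object']['key']
# Decode the URL-encoded key name ('+' becomes a space, %XX escapes are decoded)
file_key_name = unquote_plus(file_key_name)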
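
To see the decoding in isolation, here is a small sketch that runs without AWS. The first key is the example from the documentation note; the second key is a made-up string, used only to show that percent-escapes are decoded as well. It also shows why `unquote_plus` is a better fit here than plain `unquote`.

```python
from urllib.parse import unquote, unquote_plus

# Key as it would arrive in the event for an object named "red flower.jpg"
encoded_key = "red+flower.jpg"

# unquote_plus() turns "+" back into a space and decodes %XX escapes
decoded_key = unquote_plus(encoded_key)

# unquote() decodes %XX escapes but leaves "+" untouched,
# so the result would still not match the real object key
plain_unquote = unquote(encoded_key)

# Hypothetical encoded string, only to illustrate %XX decoding
other_key = unquote_plus("reports/2023+Q1+%28final%29.pdf")

print(repr(decoded_key))
print(repr(plain_unquote))
print(repr(other_key))
```

Only the decoded key should be passed to `copy_object` and `delete_object`.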


Answer #2 (szqfcxe2)

I was struggling with this too, and this worked for me.

import boto3
from urllib.parse import unquote_plus

def lambda_handler(event, context):
    
    try:
        # Bucket names are left blank in this post; fill in your own
        source_bucket = ""
        destination_bucket = ""
    
        s3 = boto3.client("s3")
    
        bucket_name = event["Records"][0]["s3"]["bucket"]["name"]
        object_key = event["Records"][0]["s3"]["object"]["key"]
  
        file_key_name = unquote_plus(object_key)
    
        copy_source = {"Bucket": bucket_name, "Key": file_key_name}
        s3.copy_object(Bucket=destination_bucket, CopySource=copy_source, Key=file_key_name)
        s3.delete_object(Bucket=source_bucket, Key=file_key_name)  # Use the decoded key here
    
        return {
            "statusCode": 200,
            "body": "ok"
        }

    except Exception as e:
        return {
            "statusCode": 500,
            "body": f"error occurred: {str(e)}"
        }
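
For completeness, here is a rough sketch that combines the decoding step with the extension-based folder logic from the question. The helper names `destination_key` and `move_objects` are made up for this illustration, and the S3 client is passed in as a parameter so the logic can be tried without AWS. It also loops over every record in the event instead of reading only the first one, which is a choice of this sketch and not something the original code does.

```python
import os
from urllib.parse import unquote_plus


def destination_key(encoded_key):
    """Decode an S3 event key and build the new key under an extension-named folder.

    Returns (decoded_key, new_key). Hypothetical helper for this sketch.
    """
    key = unquote_plus(encoded_key)
    _, extension = os.path.splitext(key)
    folder = extension.replace(".", "")
    return key, folder + "/" + key


def move_objects(event, s3_client, destination_bucket):
    """Copy each object named in the event to destination_bucket, then delete the original."""
    moved = []
    for record in event["Records"]:
        source_bucket = record["s3"]["bucket"]["name"]
        key, new_key = destination_key(record["s3"]["object"]["key"])
        s3_client.copy_object(
            CopySource={"Bucket": source_bucket, "Key": key},
            Bucket=destination_bucket,
            Key=new_key,
        )
        # Delete using the decoded key, the same one that was copied
        s3_client.delete_object(Bucket=source_bucket, Key=key)
        moved.append(new_key)
    return moved
```

In a Lambda handler this could be called as `move_objects(event, boto3.client("s3"), "destination_storage")`, using the bucket name from the question. As in the question's code, a key with no extension ends up with an empty folder name.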
