Java: How do I parse unescaped JSON that a Lambda function receives as input?

Posted by nwwlzxa7 on 2021-07-09 in Java

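I am using a web scraping tool (ParseHub) to extract data. When an extraction finishes, ParseHub sends information about the data (in JSON format) to AWS Lambda, which I use as a webhook. This JSON is not escaped correctly, so Lambda throws an error ("Could not parse request body into json"). How can I escape the JSON string so that Lambda does not throw this error? I have also tested the function from Eclipse.
I used a simple Java type as the input (https://docs.aws.amazon.com/lambda/latest/dg/java-programming-model-req-resp.html). I also tried a POJO (https://docs.aws.amazon.com/lambda/latest/dg/java-handler-io-type-pojo.html) and the byte-stream implementation (https://docs.aws.amazon.com/lambda/latest/dg/java-handler-io-type-stream.html) as the input, but it still throws a JSON parsing error.
This is part of the Lambda handler code: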

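import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

public class LambdaFunctionHandler implements RequestHandler<Object, String> {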

    @Override
    public String handleRequest(Object input, Context context) {
        System.out.println("input - " + input);
        return "response";
    }
}

This is the JSON that ParseHub sends to the Lambda function:

{
    "run_token": "I have removed this",
    "status": "complete",
    "md5sum": "90dc9753513a248502414e8d5345a6de /phfiles/ty6qie7-ut5C.gz ",
    "custom_proxies": "",
    "data_ready": 1,
    "template_pages": {},
    "start_time": "2019-01-30T11:01:58",
    "owner_email": "I have removed this",
    "webhook": "https://api endpoint of lambda function",
    "is_empty": false,
    "project_token": "I have removed this",
    "end_time": "2019-01-30T11:02:19",
    "start_running_time": "2019-01-30T11:01:59",
    "options_json": "{"recoveryRules": "{}", "rotateIPs": false, "sendEmail": true, "allowPerfectSimulation": false, "ignoreDisabledElements": true, "webhook": "https://api endpoint of lambda function", "outputType": "csv", "customProxies": "", "preserveOrder": false, "startTemplate": "main_template", "allowReselection": false, "proxyDisableAdblock": false, "proxyCustomRotationHybrid": false, "maxWorkers": "0", "loadJs": true, "startUrl": "https://address of the website from which data is extracted", "startValue": "{}", "maxPages": "0", "proxyAllowInsecure": false}",
    "start_value": "{}",
    "start_template": "main_template",
    "pages": 2,
    "start_url": "https://address of the website from which data is extracted"
}

This is the output in my CloudWatch logs:

Lambda invocation failed with status: 400. Lambda request id: eecd695e-61e7-47d9-bc27-04628c99e158
Execution failed: Could not parse request body into json: Unrecognized token 'run_token': was expecting ('true', 'false' or 'null')
at [Source: [B@36f6b2e9; line: 1, column: 11]

This is the output in my Eclipse console:

Invoking function...
==================== INVOCATION ERROR ====================
com.amazonaws.services.lambda.model.InvalidRequestContentException: Could not parse request body into json: Unexpected character ('r' (code 114)): was expecting comma to separate Object entries
at [Source: [B@1ade7b2b; line: 15, column: 21] (Service: AWSLambda; Status Code: 400; Error Code: InvalidRequestContentException; Request ID: b46bf0b4-4bb2-4bc0-aa13-81457349153c)

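We can see that the "options_json": "{"recoveryRules": "{}", ... part of the JSON is not escaped. I cannot change the JSON that ParseHub sends; I can only manipulate the data inside the Lambda function.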

jfgube3f #1

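I may be late to the party, but I ran into this problem, and these are my conclusions:
API Gateway offers two different kinds of API, which it calls REST and HTTP.
An HTTP API has "routes". Each route has an integration and a payload format version.
When you set up a webhook the simplest way, everything is opinionated, so you get a seamless integration between API Gateway and Lambda using the default catch-all route and payload format version 2.0.
As a result, every request is passed straight to the Lambda event as one big JSON object: headers, request context, body, and so on. The body is not deserialized; it is simply the value of the body property of this big JSON object, in escaped string form.
So when the request reaches the Lambda function, you have to deserialize the body yourself to get an object. For a Node.js Lambda, you would do something like this: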

exports.handler = async (bigEvent, context) => {
    // Deserializing just the body
    const event = JSON.parse(bigEvent.body);
    console.log('value1 =', event.key1);
    return event.key1; 
};
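
Because the body reaches the function as a plain string, it can in principle be patched before it is parsed. The following is only a rough sketch for the payload shown in the question: `repairEmbeddedJson` is a hypothetical helper, and it assumes the body is pretty-printed with one key per line and that the named field (here `options_json`) holds unescaped quotes. It would double-escape a field that is already escaped correctly, so treat it as a workaround rather than a general fix.

```javascript
// Hypothetical helper: escapes the quotes inside one string-valued field
// whose value is itself unescaped JSON.
// Assumption: the field and its whole value sit on a single line.
function repairEmbeddedJson(rawBody, fieldName) {
  // fieldName is inserted literally into the pattern, so it should be a
  // plain key such as "options_json" (no regex metacharacters).
  const linePattern = new RegExp(
    `^([ \\t]*"${fieldName}":[ \\t]*")(.*)("[ \\t]*,?[ \\t]*)$`,
    'm'
  );
  return rawBody.replace(linePattern, (match, prefix, inner, suffix) => {
    // Escape backslashes first, then the quotes of the embedded JSON.
    const escaped = inner.replace(/\\/g, '\\\\').replace(/"/g, '\\"');
    return prefix + escaped + suffix;
  });
}

// Usage inside a handler (bigEvent as in the handler above):
//   const event = JSON.parse(repairEmbeddedJson(bigEvent.body, 'options_json'));
//   const options = JSON.parse(event.options_json);
```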

To make this clearer, the bigEvent I am talking about looks like this:

{
  version: '2.0',
  routeKey: 'POST /endpoint',
  rawPath: '/endpoint',
  rawQueryString: '',
  headers: {
    accept: '*/*',
    ...
  },
  requestContext: {
    accountId: '123456789012',
    ....
  },
  body: '{\n    "key1": "importantDatum",\n    "key2": "..."\n}',
  isBase64Encoded: false
}
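
As a worked example of the event above, the sketch below pulls the body out and deserializes it. `parseBody` and `sampleEvent` are illustrative names, not part of any AWS API. It also checks `isBase64Encoded`, on the assumption that a body flagged this way has to be base64-decoded before it can be parsed.

```javascript
// Hypothetical helper: returns the request body as an object,
// or null when the event carries no body at all.
function parseBody(bigEvent) {
  if (typeof bigEvent.body !== 'string') {
    return null;
  }
  // Decode first if the gateway marked the body as base64.
  const text = bigEvent.isBase64Encoded
    ? Buffer.from(bigEvent.body, 'base64').toString('utf8')
    : bigEvent.body;
  // JSON.parse throws a SyntaxError if the text is not valid JSON.
  return JSON.parse(text);
}

// Trimmed-down copy of the event shown above.
const sampleEvent = {
  version: '2.0',
  routeKey: 'POST /endpoint',
  body: '{\n    "key1": "importantDatum",\n    "key2": "..."\n}',
  isBase64Encoded: false
};

console.log(parseBody(sampleEvent).key1); // importantDatum
```

Wrapping the call in try/catch lets the function decide how to answer a malformed body itself.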

If you want to respond with JSON, you should serialize it before sending it (with JSON.stringify(...)).
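
A minimal sketch of such a response, assuming the payload format v2.0 integration described earlier, where the function returns an object with `statusCode`, `headers`, and a string `body`. `jsonResponse` and the `received` field are made-up names for illustration.

```javascript
// Hypothetical helper: wraps any serializable value in a JSON response.
function jsonResponse(statusCode, payload) {
  return {
    statusCode,
    headers: { 'content-type': 'application/json' },
    // The body has to be a string, so the object is serialized here.
    body: JSON.stringify(payload)
  };
}

// Handler that echoes one field of the request back as JSON.
const handler = async (bigEvent) => {
  const event = JSON.parse(bigEvent.body);
  return jsonResponse(200, { received: event.key1 });
};
```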
