Problems encountered when importing data into Mailchimp with curl from Node.js

At work I needed to import user data into Mailchimp by calling curl through Node.js spawn.

1. How child.stdout.on delivers data

The initial code looked like this:

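const spawn = require('child_process').spawn

// BASE_URL, LIST_ID, API_KEY and postData are defined elsewhere.
// Only the options go here: spawn() already supplies the 'curl' command,
// so repeating it in the string would run "curl curl --request ...".
const args = `--request POST \
     --url '${BASE_URL}/3.0/lists/${LIST_ID}' \
     --user 'user:${API_KEY}' \
     --header 'content-type:application/json' \
     --data '${postData}' \
     --include`

const child = spawn('curl', [args], {
    shell: true
})

child.stdout.on('data', (data) => {
    // data is a Buffer; convert it to text before printing
    console.log(data.toString('utf8'))
});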

The output looks like this:

HTTP/1.1 200 OK
Server: openresty
Content-Type: application/json; charset=utf-8
Content-Length: 4530
Vary: Accept-Encoding
X-Request-Id: 19ae5ea8-1f1a-4cb8-a142-074887140753
Link: ; rel="describedBy", ; rel="dashboard"
Date: Thu, 06 Sep 2018 15:25:49 GMT
Connection: keep-alive
Set-Cookie: _AVESTA_ENVIRONMENT=prod; path=/
Set-Cookie: _mcid=1.03c96f827b0fce334cb3f36bf2ec5667; expires=Fri, 06-Sep-2019 15:25:49 GMT; Max-Age=31536000; path=/; domain=.mailchimp.com

{"new_members":[],"updated_members":[],"errors":[{"email_address":"[email protected]_cancel","error":"Please provide a valid email address."}],"total_created":0,"total_updated":0,"error_count":1,"_links":[{"rel":"self","href":"https://us19.api.mailchimp.com/3.0/lists/5b524f3ac0","method":"GET","targetSchema":"https://us19.api.mailchimp.com/schema/3.0/Definitions/Lists/Response.json"},{"rel":"parent","href":"https://us19.api.mailchimp.com/3.0/lists","method":"GET","targetSchema":"https://us19.api.mailchimp.com/schema/3.0/Definitions/Lists/CollectionResponse.json","schema":"https://us19.api.mailchimp.com/schema/3.0/CollectionLinks/Lists.json"},{"rel":"update","href":"https://us19.api.mailchimp.com/3.0/lists/5b524f3ac0","method":"PATCH","targetSchema":"https://us19.api.mailchimp.com/schema/3.0/Definitions/Lists/Response.json","schema":"https://us19.api.mailchimp.com/schema/3.0/Definitions/Lists/PATCH.json"},{"rel":"batch-sub-unsub-members","href":"https://us19.api.mailchimp.com/3.0/lists/5b524f3ac0","method":"POST","targetSchema":"https://us19.api.mailchimp.com/schema/3.0/Definitions/Lists/BatchPOST-Response.json","schema":"https://us19.api.mailchimp.com/schema/3.0/Definitions/Lists/BatchPOST.json"},{"rel":"delete","href":"https://us19.api.mailchimp.com/3.0/lists/5b524f3ac0","method":"DELETE"},{"rel":"abuse-reports","href":"https://us19.api.mailchimp.com/3.0/lists/5b524f3ac0/abuse-reports","method":"GET","targetSchema":"https://us19.api.mailchimp.com/schema/3.0/Definitions/Lists/Abuse/CollectionResponse.json","schema":"https://us19.api.mailchimp.com/schema/3.0/CollectionLinks/Lists/Abuse.json"},{"rel":"activity","href":"https://us19.api.mailchimp.com/3.0/lists/5b524f3ac0/activity","method":"GET","targetSchema":"https://us19.api.mailchimp.com/schema/3.0/Definitions/Lists/Activity/Response.json"},{"rel":"clients","href":"https://us19.api.mailchimp.com/3.0/lists/5b524f3ac0/clients","method":"GET","targetSchema":"https://us19.api.mailchimp.com/schema/3.0/Definitions/Lists/
Clients/Response.json"},{"rel":"growth-history","href":"https://us19.api.mailchimp.com/3.0/lists/5b524f3ac0/growth-history","method":"GET","targetSchema":"https://us19.api.mailchimp.com/schema/3.0/Definitions/Lists/Growth/CollectionResponse.json","schema":"https://us19.api.mailchimp.com/schema/3.0/CollectionLinks/Lists/Growth.json"},{"rel":"interest-categories","href":"https://us19.api.mailchimp.com/3.0/lists/5b524f3ac0/interest-categories","method":"GET","targetSchema":"https://us19.api.mailchimp.com/schema/3.0/Definitions/Lists/InterestCategories/CollectionResponse.json","schema":"https://us19.api.mailchimp.com/schema/3.0/CollectionLinks/Lists/InterestCategories.json"},{"rel":"members","href":"https://us19.api.mailchimp.com/3.0/lists/5b524f3ac0/members","method":"GET","targetSchema":"https://us19.api.mailchimp.com/schema/3.0/Definitions/Lists/Members/CollectionResponse.json","schema":"https://us19.api.mailchimp.com/schema/3.0/CollectionLinks/Lists/Members.json"},{"rel":"merge-fields","href":"https://us19.api.mailchimp.com/3.0/lists/5b524f3ac0/merge-fields","method":"GET","targetSchema":"https://us19.api.mailchimp.com/schema/3.0/Definitions/Lists/MergeFields/CollectionResponse.json","schema":"https://us19.api.mailchimp.com/schema/3.0/CollectionLinks/Lists/MergeFields.json"},{"rel":"segments","href":"https://us19.api.mailchimp.com/3.0/lists/5b524f3ac0/segments","method":"GET","targetSchema":"https://us19.api.mailchimp.com/schema/3.0/Definitions/Lists/Segments/CollectionResponse.json","schema":"https://us19.api.mailchimp.com/schema/3.0/CollectionLinks/Lists/Segments.json"},{"rel":"webhooks","href":"https://us19.api.mailchimp.com/3.0/lists/5b524f3ac0/webhooks","method":"GET","targetSchema":"https://us19.api.mailchimp.com/schema/3.0/Definitions/Lists/Webhooks/CollectionResponse.json","schema":"https://us19.api.mailchimp.com/schema/3.0/CollectionLinks/Lists/Webhooks.json"},{"rel":"signup-forms","href":"https://us19.api.mailchimp.com/3.0/lists/5b524f3ac0/signup-forms","m
ethod":"GET","targetSchema":"https://us19.api.mailchimp.com/schema/3.0/Definitions/Lists/SignupForms/CollectionResponse.json","schema":"https://us19.api.mailchimp.com/schema/3.0/CollectionLinks/Lists/SignupForms.json"},{"rel":"locations","href":"https://us19.api.mailchimp.com/3.0/lists/5b524f3ac0/locations","method":"GET","targetSchema":"https://us19.api.mailchimp.com/schema/3.0/Definitions/Lists/Locations/CollectionResponse.json","schema":"https://us19.api.mailchimp.com/schema/3.0/CollectionLinks/Lists/Locations.json"}]}

What I actually wanted was just the part from {"new_members": onward, so I did the following (with a colleague's help, to be honest):

const spawn = require('child_process').spawn

// Only the options go here: spawn() already supplies the 'curl' command.
const args = `--request POST \
     --url '${BASE_URL}/3.0/lists/${LIST_ID}' \
     --user 'user:${API_KEY}' \
     --header 'content-type:application/json' \
     --data '${postData}' \
     --include`

const child = spawn('curl', [args], {
    shell: true
})

child.stdout.on('data', (data) => {
    // Convert the Buffer to text and split it into lines
    const rData = data.toString('utf8').split('\n')
    // The last line is expected to be the JSON body
    const da = (rData[rData.length - 1])
    let chunksObject = JSON.parse(da)
    console.log(chunksObject)
});

child.on('close', (code) => {
    console.log(`child process close code:${code}`);
});

As you can see, data.toString('utf8') converts the Buffer coming out of stdout into a UTF-8 string, split('\n') breaks that string at line breaks, and the last element, rData[rData.length - 1], is the {"new_members": part I wanted. JSON.parse(da) then turns it into the JSON object I was after.

This worked fine while the amount of data was small. Once it grew, JSON.parse(da) started throwing, complaining that da was not a valid JSON string.

After some digging, I found this in the official Node.js documentation:

stdout <Buffer> | <string> The contents of output[1].

So what stdout emits is a Buffer, and my guess was that the output arrives piece by piece. To check, I added console.log('------------') inside the child.stdout.on handler:

child.stdout.on('data', (data) => {
    console.log('------------')
    const rData = data.toString('utf8').split('\n')
    const da = (rData[rData.length - 1])
    let chunksObject = JSON.parse(da)
    console.log(chunksObject)
});

Running the script printed ------------ four or five times for a single curl request, which confirmed that the output arrives in several chunks.
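A small simulation makes the failure easy to see. The sketch below does not call curl. It takes a made-up JSON payload (the field values are illustrative, not real Mailchimp output), cuts it in two at an arbitrary offset the way a pipe might, and tries to parse each piece and then the joined whole. tryParse is a hypothetical helper name.

```javascript
// Simulated stdout chunks: one JSON document split at an arbitrary offset.
// The payload is invented for illustration.
const body = JSON.stringify({ new_members: [], total_created: 0, error_count: 1 })
const chunks = [Buffer.from(body.slice(0, 20)), Buffer.from(body.slice(20))]

// Hypothetical helper: report whether a piece of text parses as JSON.
function tryParse(text) {
    try {
        return { ok: true, value: JSON.parse(text) }
    } catch (err) {
        return { ok: false, error: err.message }
    }
}

// Each chunk on its own is only a fragment of the JSON text.
const perChunk = chunks.map((chunk) => tryParse(chunk.toString('utf8')).ok)

// Joining the Buffers first gives back the complete document.
const whole = tryParse(Buffer.concat(chunks).toString('utf8'))

console.log(perChunk) // both entries are false
console.log(whole.ok) // true
```

Neither fragment is valid JSON by itself, which is the same error the per-chunk JSON.parse(da) call runs into once the response no longer fits in a single chunk.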

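So the data cannot be parsed inside each stdout 'data' event, because at that point it is still incomplete. It should be parsed in child.on('close'), once all of the output has been emitted.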

So I defined a variable outside the handlers, allChunksString, appended every chunk from stdout to it, and parsed allChunksString in child.on('close'):

const spawn = require('child_process').spawn

// Only the options go here: spawn() already supplies the 'curl' command.
const args = `--request POST \
     --url '${BASE_URL}/3.0/lists/${LIST_ID}' \
     --user 'user:${API_KEY}' \
     --header 'content-type:application/json' \
     --data '${postData}' \
     --include`

const child = spawn('curl', [args], {
    shell: true
})

let allChunksString = '' // accumulates the JSON body across all stdout chunks

child.stdout.on('data', (data) => {
    // Keep only the text after the last line break in each chunk.
    // This relies on the headers arriving in the first chunk and on the
    // JSON body itself containing no line breaks.
    const rData = data.toString('utf8').split('\n')
    const da = (rData[rData.length - 1])
    allChunksString += da
});

child.on('close', (code) => {
    // All output has been emitted, so the JSON text is complete
    let chunksObject = JSON.parse(allChunksString)
    console.log(chunksObject)
    console.log(`child process close code:${code}`);
});

Sure enough, JSON.parse no longer throws.

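To sum up: child.stdout.on('data') delivers the output as a series of Buffer chunks, so you cannot parse each chunk as it arrives. Accumulate the chunks and parse them together in the close event, after all of the output has been emitted.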
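The same idea can be made a little sturdier. Splitting every chunk on '\n' only works if the headers arrive inside the first chunk and the JSON body contains no line breaks. Here is a minimal sketch, assuming the output of curl --include consists of one or more header blocks, each ended by a blank line (\r\n\r\n), followed by a JSON body. collectStdout and parseIncludedResponse are hypothetical helper names, not part of Node.js or curl, and the sample response is invented.

```javascript
const { spawn } = require('child_process')

// Hypothetical helper: run a command, keep every stdout chunk as a Buffer,
// and resolve with the joined output once the 'close' event fires.
function collectStdout(command, args, options) {
    return new Promise((resolve, reject) => {
        const child = spawn(command, args, options)
        const chunks = []
        child.stdout.on('data', (chunk) => chunks.push(chunk))
        child.on('error', reject)
        child.on('close', (code) => resolve({ code, output: Buffer.concat(chunks) }))
    })
}

// Hypothetical helper: split raw `curl --include` output into header blocks
// and a parsed JSON body. Assumes every header block starts with "HTTP/".
function parseIncludedResponse(raw) {
    let rest = raw.toString('utf8')
    const headerBlocks = []
    while (rest.startsWith('HTTP/')) {
        const end = rest.indexOf('\r\n\r\n')
        if (end === -1) throw new Error('unterminated header block')
        headerBlocks.push(rest.slice(0, end).split('\r\n'))
        rest = rest.slice(end + 4)
    }
    return { headerBlocks, body: JSON.parse(rest) }
}

// Simulated chunks: the split falls in the middle of a multi-byte character,
// which is why the Buffers are joined before calling toString().
const raw = Buffer.from('HTTP/1.1 200 OK\r\nContent-Type: application/json\r\n\r\n{"name":"李"}')
const simulatedChunks = [raw.subarray(0, raw.length - 4), raw.subarray(raw.length - 4)]
const parsed = parseIncludedResponse(Buffer.concat(simulatedChunks))

console.log(parsed.headerBlocks[0][0]) // HTTP/1.1 200 OK
console.log(parsed.body.name) // 李
```

Collecting Buffers and calling Buffer.concat once at the end also avoids corrupting a multi-byte UTF-8 character that happens to be split across two chunks, which converting each chunk to a string separately can do. In real use, the output field resolved by collectStdout would be passed to parseIncludedResponse.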

2. The data sent to the Mailchimp API was malformed

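With the above in place, more than 400 user records still could not be imported into Mailchimp. The error reported was: schema expect object but got NULL. After a lot of painful digging with my colleague, the cause turned out to be maddeningly simple: a merge_field value entered by one of the users contained a ' (single quote). Because the --data argument is wrapped in single quotes on the shell command line, that lone ', with an opening quote but no closing one, ends the quoted argument early and scrambles the rest of the command. The data that gets sent is no longer valid JSON, and the request fails. One bad record therefore blocked all 200 users in its batch (I import 200 users per request).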

So I used a regular expression to replace every ' with a space:

postData = postData.replace(/\'/g, " ") // in case some user's field has ' which will mess the shell script

After that, everything imported successfully.

Of course this is not the best solution, because it turns the users' ' into a space.
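One way to keep the users' data intact is to stop going through a shell at all. When spawn is called without shell: true, each element of the args array is handed to curl as a separate argument, so a quote inside postData is not interpreted by anything. A minimal sketch follows. buildCurlArgs and the placeholder values are hypothetical, and dropping --include (so that stdout carries only the response body) is my own simplification rather than what the original script did.

```javascript
// Hypothetical helper: build curl's arguments as an array, one element per
// argument, for use with spawn('curl', args) and no `shell: true`.
function buildCurlArgs({ baseUrl, listId, apiKey, postData }) {
    return [
        '--request', 'POST',
        '--url', `${baseUrl}/3.0/lists/${listId}`,
        '--user', `user:${apiKey}`,
        '--header', 'content-type: application/json',
        '--data', postData, // passed verbatim, no shell quoting involved
    ]
}

// Illustrative payload containing the character that caused the trouble.
const postData = JSON.stringify({
    members: [{ merge_fields: { LNAME: "O'Brien" } }],
})

// Placeholder values, not real credentials or endpoints.
const args = buildCurlArgs({
    baseUrl: 'https://example.invalid',
    listId: 'LIST_ID',
    apiKey: 'API_KEY',
    postData,
})

// The JSON reaches the argument list unchanged, single quote included.
const dataArg = args[args.indexOf('--data') + 1]
console.log(dataArg === postData) // true
console.log(JSON.parse(dataArg).members[0].merge_fields.LNAME) // O'Brien
```

With this, spawn('curl', args) needs no replace() call on the payload. For very large batches, sending the body on stdin with curl's --data-binary @- may be worth considering, since the operating system limits how long a command line can be.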

What a trap!
