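I am running GitLab 10.7.3.
While integrating GitLab with Jenkins, I ran into a problem: the webhook would not trigger, and GitLab's webhook test returned a 500 error.
At first I assumed the problem was on the Jenkins side, but searching Baidu turned up no good answer.
Later I tested the Jenkins endpoint directly and it responded. I then pointed the GitLab webhook at a small local service I had written and found that GitLab never sent the request at all.
I finally found the relevant issue in the GitLab community: https://gitlab.com/gitlab-org/gitlab-ce/issues/44480
The gist seems to be that, to guard against SSRF vulnerabilities, requests to the local (internal/private) network were restricted, but the UI did not surface a helpful message about it. An improvement is expected in 10.8.
Someone in the thread suggested a workaround: change the hidden setting through the API.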
curl -X PUT --header "PRIVATE-TOKEN: XXXXX" 'http://*****/api/v4/application/settings?allow_local_requests_from_hooks_and_services=true'
When the request completes, it returns the current settings as JSON. Check that allow_local_requests_from_hooks_and_services is true.
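PRIVATE-TOKEN is the access token created in your user settings. It must be granted sufficient permissions (the application settings API is only available to administrators).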
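The request and the check above can be sketched as a small shell script. This is only a sketch under assumptions: the variable names GITLAB_URL and GITLAB_TOKEN and both function names are made up for illustration, and the check is a plain grep on the response text rather than real JSON parsing.

```shell
#!/bin/sh
# Hypothetical helper: reads the settings JSON on stdin and succeeds
# only if allow_local_requests_from_hooks_and_services is true.
local_requests_enabled() {
  grep -Eq '"allow_local_requests_from_hooks_and_services"[[:space:]]*:[[:space:]]*true'
}

# Hypothetical wrapper around the curl call from this post.
# GITLAB_URL and GITLAB_TOKEN are assumed environment variables,
# e.g. GITLAB_URL=http://gitlab.example.local (placeholder host).
enable_local_requests() {
  curl -s -X PUT \
    --header "PRIVATE-TOKEN: ${GITLAB_TOKEN:?set GITLAB_TOKEN first}" \
    "${GITLAB_URL:?set GITLAB_URL first}/api/v4/application/settings?allow_local_requests_from_hooks_and_services=true"
}

# Usage (not run automatically, since it changes instance settings):
#   if enable_local_requests | local_requests_enabled; then
#     echo "local requests from hooks and services are allowed"
#   else
#     echo "setting was not applied; check the token's permissions" >&2
#   fi
```

Keeping the check in its own function means it can also be run against a response saved to a file, without sending the PUT request again.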
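After that, pushing to the repository triggers the Jenkins build as expected. (If the webhook log reports a 403, go to the security settings in Jenkins and turn off the proxy setting.)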
The following is quoted from a comment in that issue:
[Douwe Maan @DouweM](https://gitlab.com/DouweM) [commented](https://gitlab.com/gitlab-org/gitlab-ce/issues/44480#note_71666997):
Thanks for the ping, [@brodock](https://gitlab.com/brodock "Gabriel Mazetto"), and my bad for having lost track of this issue.
I'll try to provide a little bit of background on how we screwed up (I agree with that [characterization](https://gitlab.com/gitlab-org/gitlab-ce/issues/44480#note_69556743 "Integrations Webhooks not working - Error 500"), [@tpdownes](https://gitlab.com/tpdownes "Tom Downes")), and what we have done and are doing to improve the situation.
As I mentioned in [#44480 (comment 64361064)](https://gitlab.com/gitlab-org/gitlab-ce/issues/44480#note_64361064 "Integrations Webhooks not working - Error 500") and clarified in [#44480 (comment 65090742)](https://gitlab.com/gitlab-org/gitlab-ce/issues/44480#note_65090742 "Integrations Webhooks not working - Error 500"), the root of the issue lies in the fact that we significantly underestimated the number of "false positives" this SSRF protection would have, and consequently, how many legitimate users it would affect.
When the SSRF vulnerability was discussed in [#15329 (closed)](https://gitlab.com/gitlab-org/gitlab-ce/issues/15329 "Server Side Request Forgery in Services and Web Hooks"), and more recently in [#41642 (closed)](https://gitlab.com/gitlab-org/gitlab-ce/issues/41642 "SSRF vulnerability in gitlab.com webhook"), we naively interpreted the problem as "all requests to the local network are bad", and we decided that the most effective way to protect against these evil requests to the local network would be to block all of them. During implementation, we had the presence of mind to realize that there had to be *some* legitimate use cases of requests to the local network, so we decided to exclude admin-configured system hooks from the protection, and added the "Allow requests to the local network from hooks and services" checkbox I'm sure everyone in this thread is familiar with, but we never stopped to think about how many legitimate use cases there would actually be.
We were locked into our assumption that "all requests to the local network are bad", decided to disable local requests by default because we want GitLab to be secure by default, and called it a day. Because we didn't expect any legitimate users to run into this error, we didn't make any time to consider the user experience aspect of this SSRF protection (how do we surface the error to the user? how should we phrase the error messages?) or how to communicate it (do we need to have a transition period before enabling the protection by default? how do we make sure everyone affected knows about the change?). The discussion in the aforementioned issues and the [MR on our private GitLab instance](https://dev.gitlab.org/gitlab/gitlabhq/merge_requests/2337) revolved almost exclusively around the technical details of the vulnerability and the fix, simply because no one involved realized there was more to it than that. We all assumed this was a security fix like any other, that would be a positive change for everyone. You can clearly see this in the blurb in the ["GitLab Critical Security Release 10.5.6"](https://about.gitlab.com/2018/03/20/critical-security-release-gitlab-10-dot-5-dot-6-released/) blogpost about this issue, which describes the vulnerability, but doesn't give any indication that our "fix" may not actually be welcomed by everyone.
Pretty much immediately after we released the security patch, issues like this one and [omnibus-gitlab#3307 (closed)](https://gitlab.com/gitlab-org/omnibus-gitlab/issues/3307 "Webhook does not work for me when i update to date!") were created, and we realized that our assumption had been wrong, although we didn't (and maybe still haven't) quite realized the impact of it. Our solution at the time was to add an extra note to the "Updating" section of the blogpost, and to actively direct anyone who reported issues with web hooks to the aforementioned "Allow requests to the local network from hooks and services" setting. In [#44480 (comment 64361064)](https://gitlab.com/gitlab-org/gitlab-ce/issues/44480#note_64361064 "Integrations Webhooks not working - Error 500"), I suggested turning SSRF protection off by default in a followup patch release so that we wouldn't break peoples' web hooks, but regrettably I never followed up on this.
Instead, our next actions were to make the error message more descriptive in [!18058 (merged)](https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/18058 "Raise more descriptive errors when URLs are blocked"), and (much later) to improve the documentation in [!18532 (merged)](https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/18532 "Improve documentation of SSRF protection"). We're also adding "is this URL blocked?" validation to web hook and service URLs in [!18686](https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/18686 "WIP: Add validation to webhook and service URLs to ensure they are not blocked because of SSRF"), so that we can get the error to the user quicker, and I just fixed the bug that [@christopherb1](https://gitlab.com/christopherb1 "Christopher Baklid") ran into in [#44480 (comment 71280927)](https://gitlab.com/gitlab-org/gitlab-ce/issues/44480#note_71280927 "Integrations Webhooks not working - Error 500") with [!18746 (merged)](https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/18746 "Ensure web hook 'blocked URL' errors are stored in as web hook logs and properly surfaced to the user"). We also have an issue outstanding to add an outbound requests whitelist for local networks ([#44496](https://gitlab.com/gitlab-org/gitlab-ce/issues/44496 "Outbound requests whitelist for local networks")), but this is not currently scheduled.
In terms of lessons for the future, I think we should spend more time on determining the impact of changes we make, we should challenge our assumptions on whether or not a given change will actually benefit everyone or if there are valid use cases where it wouldn't, and we should be more mindful of the user experience of the unhappy path, even if we think it's unlikely people will get on that path at all. We should also be more careful about communicating the potential negative impact of a change, even if we think it wouldn't affect many people. We should also be quicker to respond when we realize that we did screw up, and we should be more proactive about "making it right"; an issue like this one definitely shouldn't drag on for weeks and weeks without any updates or actual resolution from our side, as has happened here.
I take full responsibility for all of this having gone wrong in this particular case. I hope that with the MRs I've linked to above, all of which have already been released or are planned to be released in GitLab 10.8 on May 22nd, we will have sufficiently addressed the issues that have been discussed in this thread. If there are any other actions we should take to fully resolve the situation, I kindly ask you to create individual issues for them, and to ping me, so we can discuss them in more detail.
I'm happy to continue the conversation about the general situation and our handling of it in this issue, but I'll close it since all action items that are still relevant have been moved to and are being discussed in more specific issues.