Micropackages and Open Source Trust Scaling

Like everybody else this week we had fun with the left-pad disaster. We’re from the Python community and our exposure to the Node ecosystem is primarily on the client side. We’re big fans of the ecosystem that has developed around React, and as such quite a bit of our daily workflow involves npm.

What frustrated me personally about the conversation that took place across the internet over the last few days, however, has nothing to do with npm, the guy who deleted his packages, any potential trademark disputes or the supposed inability of the JavaScript community to write functions to pad strings. It has more to do with how the ecosystem evolving around npm has created the most dangerous and irresponsible environment, which in many ways leaves me scared.

My opinion quickly went from “Oh that’s funny” to “This concerns me”.

Dependency Explosion

When the “left-pad” disaster struck I had a brief look at Sentry’s dependency tree. I should probably have done that before, but as long as things work you don’t really tend to do that. At the time of writing we have 39 dependencies in our package.json. These dependencies are strongly vetted in the sense that we do not include anything there that we did not investigate properly. What we cannot do, however, is investigate every single transitive dependency as well. The reason for this is how these node dependencies explode. While we have 39 direct dependencies, it turns out we have more than a thousand dependencies in total.

To give you a comparison: the Sentry backend (Sentry server) has 45 direct dependencies. If you resolve all dependencies and install them as well you end up with a total of 65 packages, which is significantly fewer. We only get a total of 20 packages on top of what we depend on ourselves. The typical Python project would be similar. For instance the Flask framework depends on three (soon to be four with Click added) other packages: Werkzeug, Jinja2 and itsdangerous. Jinja2 additionally depends on MarkupSafe. All of those packages are written by the same author, however, and are merely split along rough lines of responsibility.
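
The difference between direct and total dependencies is just the transitive closure of the dependency graph. A minimal sketch, using only the Flask relationships named above (the `DIRECT_DEPS` table and the `transitive_deps` helper are illustrative names, not part of any packaging tool):

```python
# Direct dependencies per package, taken from the Flask example in the text.
DIRECT_DEPS = {
    "Flask": ["Werkzeug", "Jinja2", "itsdangerous"],
    "Jinja2": ["MarkupSafe"],
    "Werkzeug": [],
    "itsdangerous": [],
    "MarkupSafe": [],
}


def transitive_deps(package, graph):
    """Return every package reachable from `package`, excluding itself."""
    seen = set()
    stack = list(graph.get(package, []))
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        # Follow the dependencies of this dependency as well.
        stack.extend(graph.get(name, []))
    seen.discard(package)  # guard against cycles pointing back at the root
    return seen


direct = set(DIRECT_DEPS["Flask"])
total = transitive_deps("Flask", DIRECT_DEPS)
print(len(direct), len(total))  # prints "3 4"
```

For Flask the closure adds a single package. For Sentry's UI the same walk turns 39 direct entries into more than a thousand.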

Why is that important?

  • dependencies incur cost.
  • every dependency is a liability.

The Cost of Dependencies

Let’s talk about the cost of dependencies first. There are a few costs associated with every dependency and most of you who have been programming for a few years will have encountered them.

The most obvious cost is that packages need to be downloaded from somewhere. This is a direct cost. The most shocking example I encountered for this is the isarray npm package. It’s currently being downloaded just short of 19 million times a month from npm. The entire contents of that package can fit into a single line:

module.exports = Array.isArray || function (a) { return {}.toString.call(a) == '[object Array]' }

However, in addition to this line there is a bunch of extra content in the package. You actually end up downloading a 2.5KB tarball because of all the extra metadata, readme, license file, travis config, unit tests and makefile. On top of that npm adds 6KB for its own metadata. Let’s round it to 8KB that needs to be downloaded. Multiplied by the total number of downloads last month, the node community downloaded 140GB worth of isarray. Measured by size, that’s half of Flask’s monthly downloads.
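
The estimate is easy to reproduce. A small sketch, assuming decimal units (1KB = 1000 bytes) and the 18 million monthly downloads quoted further down; with the "just short of 19 million" figure the result lands a little higher:

```python
# Assumption: 18 million downloads a month, the figure quoted later in the text.
DOWNLOADS_PER_MONTH = 18_000_000
# 2.5KB tarball plus 6KB of npm metadata, rounded to 8KB as in the text.
BYTES_PER_DOWNLOAD = 8 * 1000


def monthly_transfer_gb(downloads, bytes_per_download):
    """Total transfer in gigabytes (decimal, 1GB = 1000**3 bytes)."""
    return downloads * bytes_per_download / 1000**3


print(monthly_transfer_gb(DOWNLOADS_PER_MONTH, BYTES_PER_DOWNLOAD))  # prints 144.0
```

That is in line with the roughly 140GB mentioned above, for a package whose useful payload is a single line.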

The footprint of Sentry’s server component is big when you add up all the dependencies. Yet the entire installation of Sentry from PyPI takes about 30 seconds, including compiling lxml. Installing the over 1000 dependencies for the UI, though, takes about 5 minutes, I think, even though you end up with a fraction of the code afterwards. Also, the further you are away from the npm CDN node, the higher the price you pay for the network roundtrip. I threw away my node cache for fun and ran npm install on Sentry. It takes about 4.5 minutes. And that’s with good latency to npm, on an above-average network connection and a top-of-the-line MacBook Pro with an SSD. I don’t want to know what the experience is for people on unreliable network connections. Afterwards I end up with 165MB in node_modules. For comparison, the entirety of Sentry’s backend dependencies on the file system, including all metadata, is 60MB.

When we have a thousand different dependencies we have a thousand different licenses and copyright files. It really makes me wonder what the license screen of a node-powered desktop application would look like. But it’s not only a thousand licenses, it’s also a huge number of independent developers.

Trust and Auditing

This leads me to my actual issue with micro-dependencies: we do not have trust solved. Every once in a while people will bring up how we would all be better off if we PGP signed our Python packages. I think what a lot of people miss in the process is that signatures were never a technical problem but a trust and scaling problem.

I want to give you a practical example of what I mean by this. Say you build a program based on the Flask framework. You pull in a total of 4-5 dependencies for Flask alone, which are all signed off by me. The attack vectors to get untrusted code into Flask are:

  • get a backdoor into a pull request and get it merged
  • steal my credentials to PyPI and publish a new release with a backdoor
  • put a backdoor into one of my dependencies

All of those attack vectors I cover. I use my own software and monitor what releases are on PyPI, which is also the only place to install my software from. I use 2FA on all my logins where possible and long randomly generated passwords where I cannot, etc. None of my libraries use a dependency whose developer I do not trust. In essence, if you use Flask you only need to trust me to not be malicious or idiotic. Generally by vetting me as a person (or maybe at a later point an organization that releases my libraries) you can be reasonably sure that what you install is what you expect and not something dangerous. If you develop large scale Python applications you can do this for all your dependencies and you end up with a reasonably short list. More than that: because Python’s import system is very limited you end up with only one version of each library, so when you want to go into detail and sign off on releases you only need to do it once.

Back to Sentry’s use of npm. It turns out we have four different versions of the same query string library because of different version pinning by different libraries. Fun.
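
One rough way to spot this kind of duplication is to walk the installed tree and group the versions found by package name. A sketch, assuming the usual layout where every installed package carries its own package.json with `name` and `version` fields (the helper names are made up for illustration, and nested package.json files that some packages ship as test fixtures will be counted too, so treat the result as an approximation):

```python
import json
import os
from collections import defaultdict


def installed_versions(node_modules_dir):
    """Map package name -> set of versions found anywhere under the tree."""
    versions = defaultdict(set)
    for root, _dirs, files in os.walk(node_modules_dir):
        if "package.json" not in files:
            continue
        try:
            with open(os.path.join(root, "package.json"), encoding="utf-8") as fh:
                meta = json.load(fh)
        except (OSError, ValueError):
            continue  # unreadable or malformed metadata: skip it
        if not isinstance(meta, dict):
            continue
        name, version = meta.get("name"), meta.get("version")
        if name and version:
            versions[name].add(version)
    return versions


def duplicated(versions):
    """Keep only the packages that are installed in more than one version."""
    return {name: sorted(found) for name, found in versions.items() if len(found) > 1}
```

Calling `duplicated(installed_versions("node_modules"))` from a project root would list every package that is present in several versions at once, each of which would need its own sign-off.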

Those dependencies can easily end up being high value targets because of how few people know about them. juliangruber’s “isarray” has 15 stars on GitHub and only two people watch the repository. It’s downloaded 18 million times a month. Sentry depends on it 20 times: 14 times it’s a pin for 0.0.1, once it’s a pin for ^1.0.0 and 5 times for ~1.0.0. Any pin other than a strict version match is a disaster waiting to happen if someone manages to push out a point release by stealing juliangruber’s credentials.
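
To make the difference between those three pins concrete, here is a simplified sketch of how such ranges resolve. It assumes plain three-part versions only, with no pre-release tags, wildcards or compound ranges, so it is an illustration of the caret and tilde rules rather than a replacement for npm's own semver resolver:

```python
def parse(version):
    """Turn '1.0.1' into the comparable tuple (1, 0, 1)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def upper_bound(spec):
    """Exclusive upper bound for a caret or tilde range such as '^1.0.0'."""
    major, minor, patch = parse(spec[1:])
    if spec[0] == "~":
        return (major, minor + 1, 0)  # tilde: patch releases only
    # Caret: the leftmost non-zero component must not change.
    if major > 0:
        return (major + 1, 0, 0)
    if minor > 0:
        return (0, minor + 1, 0)
    return (0, 0, patch + 1)


def satisfies(version, spec):
    """True if `version` would be accepted for the pin `spec`."""
    candidate = parse(version)
    if spec[0] in "^~":
        return parse(spec[1:]) <= candidate < upper_bound(spec)
    return candidate == parse(spec)  # bare version: strict match


RELEASES = ["0.0.1", "1.0.0", "1.0.1", "1.1.0", "2.0.0"]
for spec in ("0.0.1", "^1.0.0", "~1.0.0"):
    print(spec, [v for v in RELEASES if satisfies(v, spec)])
```

With these hypothetical releases, the strict pin accepts only 0.0.1, while `~1.0.0` also accepts 1.0.1 and `^1.0.0` accepts 1.0.1 and 1.1.0. Those extra matches are exactly the window through which a malicious point release would be installed without anyone changing a line of their package.json.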

Now one could argue that the same problem applies if people hack my account and push out a new Flask release. But I can promise you I will notice a release of one of my ~5 libraries because a) I monitor those packages and b) other people would notice a release. I doubt people would notice a new isarray release. Yet isarray is not sandboxed and runs with the same rights as the rest of the code you have.
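
Monitoring a handful of packages can be as simple as comparing what the index lists against what the maintainer knows to have released. A sketch of that idea, not a description of the author's actual setup: it reads the `releases` mapping from PyPI's JSON endpoint, and the `approved` set is a hypothetical hand-maintained list of versions.

```python
import json
from urllib.request import urlopen


def published_versions(package):
    """Fetch the version strings PyPI currently lists for `package`."""
    with urlopen(f"https://pypi.org/pypi/{package}/json") as resp:
        return set(json.load(resp)["releases"])


def unexpected_releases(published, approved):
    """Versions on the index that are not in the approved set (plain string sort)."""
    return sorted(set(published) - set(approved))
```

Running `unexpected_releases(published_versions("Flask"), approved)` on a schedule is cheap for five libraries. Doing the same for hundreds of one-line packages is where the approach stops scaling.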

For instance sindresorhus maintains 827 npm packages, most of which are probably one-liners. I have no idea how good his opsec is, but my assumption is that it’s significantly harder for him to ensure that all of those are actually his releases than it is for me, as I only have to look over a handful.

Signatures

There is common talk that package signatures would solve a lot of those issues, but at the end of the day, because of the trust we already place in PyPI and npm, we get very little extra security from a package signature compared to just trusting the username/password auth on package publish.

Why package signatures are not the holy grail was covered by Donald Stufft aka Mr PyPI. You should definitely read that since he’s describing the overarching issue much better than I could ever do.

Future of Micro-Dependencies

To be perfectly honest: I’m legitimately scared about the integrity of the node ecosystem and this worry does not go away. Among other things I’m using keybase, and keybase uses unpinned node libraries left and right. From a quick look, keybase has 225 node dependencies. Among those are many partially pinned one-liner libraries for which it would be easy enough to roll out a backdoored update if one gets hold of the credentials.

If micro-dependencies want to have a future then something must change in npm. Maybe they would have to get a specific tag so that the system can automatically run automated analysis to spot unexpected updates. Probably they should require a CC0 license to simplify copyright dialogs etc.

But as it stands right now I feel like this entire thing is a huge disaster waiting to happen, and if you are not using npm shrinkwrap yet you had better get started quickly.

Translated from: https://www.pybloggers.com/2016/03/micropackages-and-open-source-trust-scaling/
