云中数据
This article was originally published on mongoDB. Thank you for supporting the partners who make SitePoint possible.
本文最初在mongoDB上发布 。 感谢您支持使SitePoint成为可能的合作伙伴。
Black holes are regions in spacetime with such a strong gravitational pull that nothing can escape. Not entirely destructive as you might have been lead to believe, their gravitational effects help drive the formation and evolution of galaxies. In fact, our own Milky Way galaxy orbits a supermassive black hole with 4.1 million times the mass of the Sun. Some theorize that none of us would be here were it not for a black hole.
黑洞是时空中具有强大引力的区域,没有任何东西可以逃脱。 正如您可能会相信的那样,它们并不完全具有破坏性,它们的引力效应有助于推动星系的形成和演化。 实际上,我们自己的银河系绕着质量为太阳质量四百一十万倍的超大质量黑洞运动。 一些理论认为,如果不是黑洞,我们谁也不会在这里。
On the flip side, black holes can also be found hurtling through the cosmos — often at millions of miles per hour — tearing apart everything in their path. It’s said that anything that makes it into their event horizons, the “point of no return”, will never be seen or heard from again, making black holes some of the most interesting and terrifying objects in space.
另一方面,还可以发现黑洞刺穿整个宇宙,时速通常达数百万英里,将其路径上的所有东西都撕裂了。 有人说,任何进入其事件视界的东西,即“无返回点”,都将再也看不见或听到,这使黑洞成为太空中一些最有趣和最恐怖的物体。
Why are we going on about black holes, gravitational effects, and points of no return? Because something analogous is happening right now in computing.
为什么我们要处理黑洞,引力效应和无归点? 因为现在在计算中正在发生类似的事情。
First coined in 2010 by Dave McCrory, the concept of “data gravity” treats data as if it were a planet or celestial object with mass. As data accumulates in an environment, applications and services that rely on that data will naturally be pulled into the same environment. The larger the “mass” of data there is, the stronger the “gravitational pull” and the faster this happens. Applications and services each have their own gravity but data gravity is by far the strongest, especially as:
“数据引力”的概念由戴夫·麦克罗里(Dave McCrory )于2010年首次提出,将数据视为是质量的行星或天体。 随着数据在环境中累积,依赖于该数据的应用程序和服务自然会被拉到同一环境中。 数据的“质量”越大,“引力”越强,发生的速度越快。 应用程序和服务各自具有自己的重力,但数据重力是最强的,尤其是:
As in the real world, the larger the mass of an object, the harder it is to move, so data gravity also means that once your mass of data gets large enough, it is also harder (and in some cases, near impossible) to move. What makes this relevant now more than ever is the shift to cloud computing. As companies move to the cloud, they need to make a decision that will have massive implications down the line — where and how are they going to store their data? And how do they not let data gravity in the cloud turn into a data black hole?
就像在现实世界中一样,物体的质量越大,移动起来就越困难,因此数据引力也意味着一旦数据量变得足够大,它也就变得更难(在某些情况下几乎是不可能的)移动。 现在比以往任何时候都更重要的是向云计算的转变。 随着公司迁移到云中,他们需要做出一个决策,该决策将对整个流程产生重大影响-他们将在哪里以及如何存储数据? 他们又如何不让云中的数据引力变成数据黑洞 ?
There are several options for organizations moving from building their own IT to consuming it as a service in the cloud.
对于组织,从构建自己的IT到将其作为云中的服务来使用,有多种选择。
The companies behind proprietary tabular databases often penalize their customers for running these technologies on any cloud platform other than their own. This should not surprise any of us. These are the same vendors that for decades have been relying on selling heavy proprietary software with multi-year contracts and annual maintenance fees. Vendor lock-in is nothing new to them.
专有表格数据库背后的公司通常会因在其自身以外的任何云平台上运行这些技术而对客户造成不利影响。 这不应该使我们所有人感到惊讶。 这些供应商几十年来一直依赖销售具有多年合同和年度维护费用的重型专有软件。 供应商锁定对他们来说并不是什么新鲜事物。
Organizations choosing to use proprietary tabular databases in the cloud also carry over all the baggage of those technologies and realize few cloud benefits. These databases scale vertically and often cannot take advantage of cloud-native architectures for scale-out and elasticity without massive compromises. If horizontal scale-out of data across multiple instances is available, it isn’t native to the database and requires complex configurations, app-side changes, and additional software.
选择在云中使用专有表格数据库的组织还承担了这些技术的全部负担,几乎没有云优势。 这些数据库是垂直扩展的,并且在没有大量妥协的情况下,通常无法利用云原生架构进行扩展和弹性。 如果可以在多个实例之间横向扩展数据,则它不是数据库固有的,需要复杂的配置,应用程序侧更改和其他软件。
Lifting and shifting these databases to the cloud does not change the fact that they’re not designed to take advantage of cloud architectures.
将这些数据库提升并转移到云上并不会改变它们并非旨在利用云架构的事实。
Things are a little better with open source tabular databases insofar as there is no vendor enforcing punitive pricing to keep you on their cloud. However, similar to proprietary tabular databases, most of these technologies are designed to scale vertically; scaling out to fully realize cloud elasticity is often managed with fragile configurations or additional software.
在没有供应商执行惩罚性定价以保持您的云计算的范围内,开源表格数据库的情况要好一些。 但是,类似于专有表格数据库,大多数这些技术都是垂直扩展的。 向外扩展以完全实现云弹性通常是通过脆弱的配置或其他软件来管理的。
Many companies running these databases in the cloud rely on a managed service to reduce their operational overhead. However, feature parity across cloud platforms is nonexistent, making migrations complicated and expensive. For example, databases running on Amazon Aurora leverage Aurora-specific features not found on other clouds.
许多在云中运行这些数据库的公司都依赖托管服务来减少其运营开销。 但是,不存在跨云平台的功能奇偶校验,这使迁移变得复杂且昂贵。 例如,在Amazon Aurora上运行的数据库利用了其他云上没有的特定于Aurora的功能。
With proprietary cloud databases, it’s very easy to get into a situation where data goes in and nothing ever comes out. These database services run only in their parent cloud and often provide very limited database functionality, requiring customers to integrate additional cloud services for anything beyond very simple use cases.
使用专有的云数据库,很容易陷入数据流入而又什么也不会流出的情况。 这些数据库服务仅在其父云中运行,并且通常提供非常有限的数据库功能,要求客户针对超出非常简单用例的所有内容集成其他云服务。
For example, many of the proprietary cloud NoSQL services offer little more than key-value functionality; users often need to pipe data into a cloud data warehouse for more complex queries and analytics. They also tend to be operationally immature, requiring additional integrations and services to address data protection and provide adequate performance visibility. And it doesn’t stop there. New features are often introduced in the form of new services, and before users know it, instead of relying on a single cloud database, they’re dependent on an ever-growing network of cloud services. This makes it all the more difficult to ever get data out.
例如,许多专有的云NoSQL服务仅提供键值功能。 用户通常需要将数据通过管道传输到云数据仓库中,以进行更复杂的查询和分析。 它们在操作上还很不成熟,需要额外的集成和服务来解决数据保护问题并提供足够的性能可见性。 而且不止于此。 新功能通常以新服务的形式引入,并且在用户不知不觉中,它们不再依赖单个云数据库,而是依赖于不断增长的云服务网络。 这使得获取数据变得更加困难。
The major cloud providers know that if they’re able to get your data in one of their proprietary database services, they’ve got you right where they want you. And while some may argue that organizations should actually embrace this new, ultimate form of vendor lock-in to get the most out of the cloud, that doesn’t leave customers with many options if their requirements, or if data regulations, change. What if the cloud provider you’re not using releases a game-changing service you need to edge out your competition? What if they open up a data center in a new geographic region you’ve prioritized and yours doesn’t have it on their roadmap? What if your main customer dictates that you should sever ties with your cloud provider? It’s happened before.
主要的云提供商知道,如果他们能够通过自己的专有数据库服务之一获取您的数据,那么他们就可以将您带到想要的地方。 尽管有些人可能认为组织实际上应该采用这种新的,最终的供应商锁定形式,以充分利用云,但如果客户的需求或数据法规发生变化,这并不会给客户留下很多选择。 如果您不使用的云提供商发布了一项改变游戏规则的服务,您需要在竞争中脱颖而出,该怎么办? 如果他们在您优先考虑的新地理区域内开设数据中心,而您的路线图上没有它,该怎么办? 如果您的主要客户指示您应与云提供商断绝关系怎么办? 以前发生过 。
These are all scenarios where you could benefit from using a database that runs the same, everywhere.
在所有这些情况下,您都可以使用在任何地方都运行相同的数据库而受益。
As you move into the cloud, how you prevent data gravity from turning against you and limiting your flexibility is simple — use a database that runs the same in any environment.
当您迁移到云中时,如何防止数据重力逆转并限制灵活性很简单-使用在任何环境中都运行相同数据库的数据库。
One option to consider is MongoDB. As a database, it combines the flexibility of the document data model with sophisticated querying and indexing required by a wide range of use cases, from simple key-value to real-time aggregations powering analytics.
要考虑的一种选择是MongoDB。 作为数据库,它结合了文档数据模型的灵活性以及各种用例所需的复杂查询和索引,这些用例从简单的键值到支持分析的实时聚合。
MongoDB is a distributed database designed for the cloud at its core. Redundancy for resilience, horizontal scaling, and geographic distribution are native to the database and easy to use.
MongoDB是一个专为云计算而设计的分布式数据库。 弹性,水平缩放和地理分布的冗余是数据库固有的,并且易于使用。
And finally, MongoDB delivers a consistent experience regardless of where it is deployed:
最后,无论部署在何处,MongoDB都能提供一致的体验:
For organizations not quite ready to migrate to the cloud, they can deploy MongoDB on premises behind their own firewalls and manage their databases using advanced operational tooling.
对于尚未准备好迁移到云的组织,他们可以在自己的防火墙后面的场所部署MongoDB,并使用高级操作工具管理数据库。
Data gravity will no doubt have a tremendous impact on how your IT resources coalesce and evolve in the cloud. But that doesn’t mean you have to get trapped. Choose a database that delivers a consistent experience across different environments and avoid going past the point of no return.
毫无疑问,数据引力将对您的IT资源在云中如何融合和发展产生巨大影响。 但这并不意味着您必须被困住。 选择一个可在不同环境中提供一致体验的数据库,并避免超过无回报的地步。
翻译自: https://www.sitepoint.com/cloud-data-strategies-preventing-data-black-holes-in-the-cloud/
云中数据