Why Is the Apple M1 Chip So Fast? A Developer Explains

You may have wondered why the Apple M1 chip is so fast. We have shown, with both benchmark tests and our own more practical tests for Pro Tools, Logic and Studio One, that even these first-generation Apple Silicon M1-powered Mac computers are remarkably fast. In this article, we learn from a developer what makes even these entry-level Macs so fast.

In his article, developer Erik Engheim has dug into what makes the M1 powered Macs so much faster than even some of their bigger Intel-powered brothers. Erik starts…

“On YouTube, I watched a Mac user who had bought an iMac last year. It was maxed out with 40 GB of RAM costing him about $4,000. He watched in disbelief how his hyper-expensive iMac was being demolished by his new M1 Mac Mini, which he had paid a measly $700 for.
In real-world test after test, the M1 Macs are not merely inching past top-of-the-line Intel Macs, they are destroying them. In disbelief, people have started asking how on earth this is possible?”

The M1 isn’t just a processor chip; it is what is called a System-on-a-Chip, or SoC for short. Unlike most computers to date, where the components that make up a computer are individual parts mounted on a motherboard, an SoC like the Apple M1 brings together an 8-core CPU, an 8-core GPU (7-core in some MacBook Air models), unified memory, an SSD controller, an image signal processor and the Secure Enclave on one chip.

Another reason the Apple Silicon chips perform so well is that as well as being together on one chip, the M1 is made up of a series of specialised tools. Erik explains…

  • Central processing unit (CPU) — the “brains” of the SoC. Runs most of the code of the operating system and your apps.
  • Graphics processing unit (GPU) — handles graphics-related tasks, such as visualizing an app’s user interface and 2D/3D gaming.
  • Image processing unit (ISP) — can be used to speed up common tasks done by image processing applications.
  • Digital signal processor (DSP) — handles more mathematically intensive functions than a CPU. Includes decompressing music files.
  • Neural processing unit (NPU) — used in high-end smartphones to accelerate machine learning (A.I.) tasks. These include voice recognition and camera processing.
  • Video encoder/decoder — handles the power-efficient conversion of video files and formats.
  • Secure Enclave — encryption, authentication, and security.
  • Unified memory — allows the CPU, GPU, and other cores to quickly exchange information.

Let’s dig into the last point, the memory. With the M1, the memory is mounted in the same package as the SoC. It is described as a ‘unified memory architecture’ (UMA), which allows the CPU, GPU, and other cores to exchange information with one another. With unified memory, the CPU and GPU can access the same memory simultaneously rather than copying data from one area to another. Erik continues…

“For a long time, budget computer systems have had the CPU and GPU integrated into the same chip (same silicon die). In the past saying ‘integrated graphics’ was essentially the same as saying ‘slow graphics’. These were slow for several reasons:
Separate areas of this memory got reserved for the CPU and GPU. If the CPU had a chunk of data it wanted the GPU to use, it couldn’t say “here have some of my memory.” No, the CPU had to explicitly copy the whole chunk of data over the memory area controlled by the GPU.”

Another challenge is that CPUs and GPUs don’t want their memory served the same way. CPUs want their data served ‘little and often’. GPUs, however, want the complete opposite. They are happy to have infrequent, huge portions of data. They can gobble huge amounts of data because they are parallel machines that can chew through lots of data simultaneously.
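
The ‘little and often’ versus ‘infrequent huge portions’ trade-off can be sketched with a simple model in which a transfer costs a fixed latency plus the size divided by the bandwidth. This is only an illustration: the function name and both memory profiles below are hypothetical numbers chosen to show the shape of the trade-off, not measurements of any real CPU, GPU or M1 memory.

```python
def transfer_time_us(size_bytes, latency_us, bandwidth_gb_s):
    """Time in microseconds to fetch size_bytes: fixed latency plus size / bandwidth."""
    bytes_per_us = bandwidth_gb_s * 1e9 / 1e6  # convert GB/s to bytes per microsecond
    return latency_us + size_bytes / bytes_per_us


# Hypothetical profiles (assumed values, for illustration only).
LOW_LATENCY = {"latency_us": 0.1, "bandwidth_gb_s": 20}      # suits "little and often"
HIGH_THROUGHPUT = {"latency_us": 1.0, "bandwidth_gb_s": 200}  # suits "huge portions"

SMALL = 64                 # one small request, e.g. a cache line
LARGE = 64 * 1024 * 1024   # one big request, e.g. a 64 MiB texture

for label, size in (("small", SMALL), ("large", LARGE)):
    for name, profile in (("low-latency", LOW_LATENCY), ("high-throughput", HIGH_THROUGHPUT)):
        print(f"{label:5s} request on {name:15s} memory: "
              f"{transfer_time_us(size, **profile):10.3f} us")
```

In this model the small request is dominated by latency and the large one by bandwidth, which is why a memory system tuned for only one of the two tends to serve the other processor badly.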

If, like me, you are thinking at this point, “So why has Apple put the CPU and GPU on the same chip? Why doesn’t the M1 suffer the same problems as computers with integrated graphics?”, stay with us; we will get there. Back to Erik…

  • “The second problem was that large GPUs produce a lot of heat and thus you cannot integrate them with the CPU without getting problems ridding yourself of the heat produced. Thus discrete graphics cards tend to [be large] beasts with massive cooling fans. They have special dedicated memory designed to serve the greedy cards massive amounts of data.
  • That is why these cards have high performance. But they have an achilles heel: Whenever they have to get data from the memory used by the CPU, this happens over a set of copper traces on the computer motherboard called a PCIe bus. Try chugging water through a super thin straw. It may get to your mouth fast, but the throughput is totally inadequate.”

Apple’s Unified Memory Architecture aims to solve these problems without the restrictions of ‘old school shared memory’ in 3 ways…

  • There is no special area reserved just for the CPU or just the GPU. Memory is allocated to both processors. They can both use the same memory. This means that no copying is needed and so things go faster.
  • Apple uses memory that is designed both to serve large chunks of data and to serve them very quickly. This is described as ‘low latency and high throughput’. It removes the need to have two different types of memory and all the copying of data between them, making the M1 faster.
  • With their iPhone and iPad design experience, Apple has been able to get the GPU power consumption down so that a relatively powerful GPU can be integrated into an SoC without overheating.

The takeaway here is that accessing the same pool of memory without the need for copying speeds up information exchange for faster overall performance.
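
As a loose analogy only, the difference between copying data into a separate area and sharing one pool can be shown with plain Python buffers. Nothing here touches a real GPU or Apple’s APIs; the names `frame`, `gpu_copy` and `gpu_view` are made up for the illustration.

```python
# A buffer "owned" by the CPU side.
frame = bytearray(8)

# Old-school shared memory: the GPU side receives its own copy of the data.
gpu_copy = bytes(frame)

# Unified memory: the GPU side looks at the very same bytes, no copy made.
gpu_view = memoryview(frame)

# The CPU updates the data after handing it over.
frame[0] = 255

print(gpu_copy[0])  # 0   -> the copy is stale and would have to be copied again
print(gpu_view[0])  # 255 -> the shared view sees the update immediately
```

The copy costs time and memory proportional to the size of the buffer every time the data changes, while the shared view costs the same small amount however large the buffer is.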

Another benefit of a system-on-a-chip design is that everything is so much closer together. At the speeds we are talking about, the distance data has to travel, even at the speed of light, can matter. It is going to be quicker to move data over millimetres or even microns within an SoC than over centimetres around a motherboard.
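
A rough back-of-the-envelope calculation shows why distance matters at these speeds. The sketch below assumes a clock of 3.2 GHz (the figure commonly reported for the M1’s performance cores) and, as a second assumption, that electrical signals travel at about half the speed of light; both numbers are illustrative inputs, not taken from the article.

```python
SPEED_OF_LIGHT_M_S = 299_792_458  # metres per second, in a vacuum


def distance_per_cycle_cm(clock_hz, fraction_of_c=1.0):
    """Centimetres a signal can travel during one clock cycle."""
    return SPEED_OF_LIGHT_M_S * fraction_of_c / clock_hz * 100


# Assumed 3.2 GHz clock; 0.5 is an assumed signal speed relative to light.
print(round(distance_per_cycle_cm(3.2e9), 1))       # 9.4
print(round(distance_per_cycle_cm(3.2e9, 0.5), 1))  # 4.7
```

Under these assumptions a signal covers only a few centimetres per clock cycle, so a round trip across a motherboard can cost several cycles, while a hop of a few millimetres inside an SoC costs a small fraction of one.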

Finally, the M1 does use virtual memory. When the system runs out of physical RAM, the operating system moves less-used data out to disk and brings it back when needed. When we were using spinning rust drives that was painfully slow, hence the push to have as much RAM as you could afford. Fast NVMe storage is still considerably slower than RAM, but it is so much faster than a hard disk that, in an M1 system, swapping is far less noticeable even when virtual memory use is significant.
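
To get a feel for how much the speed of the backing storage matters, the standard ‘effective access time’ formula can be evaluated with some order-of-magnitude numbers. The latencies below (100 ns for RAM, 100 µs to service a page fault from an SSD, 10 ms from a hard disk) and the fault rate are assumptions picked for illustration, not measurements of any Mac.

```python
def effective_access_ns(fault_rate, ram_ns, fault_ns):
    """Average memory access time when a fraction of accesses must go to disk."""
    return (1 - fault_rate) * ram_ns + fault_rate * fault_ns


RAM_NS = 100                # assumed RAM access time
SSD_FAULT_NS = 100_000      # assumed cost of a page fault served by an SSD
HDD_FAULT_NS = 10_000_000   # assumed cost of a page fault served by a hard disk
FAULT_RATE = 0.001          # assumed: one access in a thousand hits the disk

print(effective_access_ns(0.0, RAM_NS, 0))                   # no swapping at all
print(effective_access_ns(FAULT_RATE, RAM_NS, SSD_FAULT_NS))  # swapping to SSD
print(effective_access_ns(FAULT_RATE, RAM_NS, HDD_FAULT_NS))  # swapping to hard disk
```

With these assumed figures, occasional swapping to an SSD roughly doubles the average access time, whereas the same amount of swapping to a hard disk makes it around a hundred times worse.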

Taken together, these benefits are what people like us are experiencing, as Erik says…

“This is part of the reason why a lot of people working on images and video editing with the M1 Macs are seeing such speed improvements. A lot of the tasks they do can run directly on specialized hardware. That is what allows a cheap M1 Mac Mini to encode a large video file, without breaking a sweat while an expensive iMac has all its fans going full blast and still cannot keep up.”

Now to be fair, specialised chips are nothing new, but as Erik says, Apple is taking this concept and making a “more radical shift towards this direction.”

Apple has been able to take its 10 years of experience developing phones and tablets, which have become ever faster and more powerful whilst also becoming more power-efficient, and apply it to the Mac. That efficiency is crucial in portable devices, where battery life is so important and where heat is a sign of wasted power.

Greg Joswiak, Apple’s senior vice president of worldwide marketing, has spoken about how Steve Jobs used to push Apple to “make the whole widget.”

“Steve used to say that we make the whole widget,” Joswiak told me. “We’ve been making the whole widget for all of our products, from the iPhone, to the iPads, to the watch. This was the final element to making the whole widget on the Mac.”

Developer Erik Engheim picks up on this point…

“Sure Intel and AMD may simply begin to sell whole finished SoCs. But what are these to contain? PC makers may have different ideas of what they should contain. You potentially get a conflict between Intel, AMD, Microsoft and PC makers about what sort of specialized chips should be included because these will need software support.”

The reality is that there are benefits that Intel and AMD will struggle to offer, even if they are dragged kicking and screaming into the SoC world, whereas Apple is able to offer the full deal because it controls both the hardware and the software.

“They give you, for example, the Core ML library for developers to write machine learning stuff. Whether Core ML runs on Apple’s CPU or the Neural Engine is an implementation detail developers don’t have to care about.”

Johny Srouji, senior vice president of hardware technologies at Apple, said in an interview with Om Malik that bringing the Mac processors in-house gives Apple far more control over the future:

“I believe the Apple model is unique and the best model,” he said. “We’re developing a custom silicon that is perfectly fit for the product and how the software will use it. When we design our chips, which are like three or four years ahead of time, Craig and I are sitting in the same room defining what we want to deliver, and then we work hand in hand. You cannot do this as an Intel or AMD or anyone else.”

Craig Federighi, Apple’s senior vice president of software engineering, echoed those thoughts:

“Being in a position for us to define together the right chip to build the computer we want to build and then build that exact chip at scale is a profound thing,” Federighi said about the symbiotic relationship between hardware and software groups at Apple. Both teams strive to look three years into the future and see what the systems of tomorrow look like. Then they build software and hardware for that future.

Coming back to the processors, another reason why the M1 is so fast is that Apple is using a processor design that is able to execute more instructions in parallel. It does so through what is called ‘out-of-order execution’, a RISC architecture, and some specific tweaks Apple has made. Erik provides an in-depth explanation of these in his article, which we recommend if you want to learn more.
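
The idea behind out-of-order execution can be illustrated with a toy scheduler. This is a deliberately simplified sketch, not a model of the M1 or of any real core: the `Instr` type, the `simulate` function, the two-wide issue width and the latencies are all invented for the example. An in-order core stalls behind the first instruction whose inputs are not ready, while an out-of-order core may issue any later instruction that is ready.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    deps: tuple = ()   # names of instructions whose results this one needs
    latency: int = 1   # cycles until the result is available


def simulate(program, width, out_of_order):
    """Return the cycle at which the last result is ready (toy model)."""
    done_at = {}             # name -> cycle when its result becomes available
    pending = list(program)  # not yet issued, in program order
    cycle = 0
    while pending:
        issued = []
        for instr in pending:
            if len(issued) == width:
                break  # issue slots for this cycle are full
            ready = all(d in done_at and done_at[d] <= cycle for d in instr.deps)
            if ready:
                issued.append(instr)
            elif not out_of_order:
                break  # in-order: everything behind a blocked instruction waits
        for instr in issued:
            done_at[instr.name] = cycle + instr.latency
            pending.remove(instr)
        cycle += 1
    return max(done_at.values())


# Two independent load-then-add chains, followed by a final add.
PROGRAM = [
    Instr("load_a", latency=3),
    Instr("add_b", deps=("load_a",)),
    Instr("load_c", latency=3),
    Instr("add_d", deps=("load_c",)),
    Instr("add_e", deps=("add_b", "add_d")),
]

print(simulate(PROGRAM, width=2, out_of_order=False))  # 8
print(simulate(PROGRAM, width=2, out_of_order=True))   # 5
```

In this toy program the out-of-order core starts the second load while the first is still in flight, so the two slow loads overlap and the whole sequence finishes in 5 cycles instead of 8.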

This Is Fast But The Best Is Yet To Come

There you have it. Hopefully, we have been able to explain, in an accessible way, why the M1-powered, entry-level, first-generation Apple Silicon Macs are so quick, and how they manage it with a relatively small amount of RAM.

Now imagine what the 2nd and 3rd generation Apple Silicon Macs are going to be like with 16, 32 or even 64 cores for the CPU, say 32 cores for the GPU, and 64GB of unified memory instead of conventional RAM. Things are going to be very, very quick and very, very powerful, without getting too hot.
