The NVIDIA V100, discussed below, is currently one of the GPUs that exceed 100 TFLOPS of deep learning performance.

  • One MFLOPS (megaFLOPS) equals one million (10^6) floating-point operations per second,

  • One GFLOPS (gigaFLOPS) equals one billion (10^9) floating-point operations per second,

  • One TFLOPS (teraFLOPS) equals one trillion (10^12) floating-point operations per second,

  • One PFLOPS (petaFLOPS) equals one quadrillion (10^15) floating-point operations per second,

  • One EFLOPS (exaFLOPS) equals one quintillion (10^18) floating-point operations per second.
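These prefixes are just powers of ten. As a small illustration (not from the original article), a hypothetical Python helper can turn a raw operations-per-second count into the prefixed units above:

```python
# Minimal sketch (not from the original article): convert a raw operations-per-
# second figure into the prefixed units listed above.
PREFIXES = [
    ("EFLOPS", 1e18),
    ("PFLOPS", 1e15),
    ("TFLOPS", 1e12),
    ("GFLOPS", 1e9),
    ("MFLOPS", 1e6),
]

def format_flops(flops: float) -> str:
    """Return a human-readable string such as '125.00 PFLOPS'."""
    for name, scale in PREFIXES:
        if flops >= scale:
            return f"{flops / scale:.2f} {name}"
    return f"{flops:.0f} FLOPS"

print(format_flops(300))       # ENIAC              -> '300 FLOPS'
print(format_flops(125e15))    # Sunway TaihuLight  -> '125.00 PFLOPS'
```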

The NVIDIA® V100 Tensor Core is an extremely advanced data center GPU that accelerates AI, high performance computing (HPC), and graphics. Built on the NVIDIA Volta architecture and available in 16 GB and 32 GB configurations, a single V100 can deliver the performance of up to 100 CPUs. Data scientists, researchers, and engineers can now spend less time optimizing memory usage and more time designing the next AI breakthrough.

The V100 has 640 Tensor Cores and was the first GPU to break the 100 teraFLOPS (TFLOPS) barrier for deep learning performance. The next generation of NVIDIA NVLink™ connects multiple V100 GPUs at up to 300 GB/s to create some of the world's most powerful computing servers. AI models that would have consumed weeks of computing resources on earlier systems can now be trained in a few days. With this dramatic reduction in training time, AI can now tackle whole new classes of problems.

Note: NPU compute capability is commonly expressed in TOPS, whereas GPU performance is expressed in TFLOPS.

For reference, here is a question someone raised:

A: The VIM3 is 5 TOPS, but other SBC boards are measured in FLOPS. Does anybody know how many FLOPS one TOPS is?

B: By the way, TOPS is different from FLOPS.
Here, TOPS refers to the NPU, while FLOPS is used for raw CPU/GPU processing power.

How many Flops is one Tops? - General Discussion - Khadas Community
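As answer B notes, TOPS and TFLOPS count different kinds of operations (typically INT8 operations for an NPU versus floating-point operations for a CPU/GPU), so there is no universal conversion factor. The hypothetical sketch below only shows the bookkeeping: both prefixes mean 10^12 operations per second, and any "equivalent TFLOPS" figure depends on a precision ratio you have to assume yourself.

```python
# Hypothetical illustration only: TOPS and TFLOPS are both "1e12 operations per
# second", but they count different kinds of operations (e.g. INT8 vs floating
# point), so any conversion requires an assumption you must supply yourself.
def tops_to_equivalent_tflops(tops: float, int_ops_per_flop: float) -> float:
    """Convert a TOPS rating to an assumed-equivalent TFLOPS figure.

    int_ops_per_flop is your own assumption (e.g. 2.0 if the hardware performs
    two INT8 operations in the time of one floating-point operation); there is
    no standard value, which is exactly why the forum question has no single
    answer.
    """
    return tops / int_ops_per_flop

# Example: the 5 TOPS VIM3 NPU quoted above, under a purely assumed ratio of 2.
print(tops_to_equivalent_tflops(5.0, int_ops_per_flop=2.0))  # -> 2.5 (hypothetical)
```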


What is the difference between FLOPS and OPS?

  • FLOPS is floating-point operations per second
  • OPS is operations per second

The difference should be obvious from the name: one is the number of operations per second, the other is the number of floating-point operations per second.

Why use one over the other?

If you want to know the floating-point performance, you would measure FLOPS, if you want to know the performance over all kinds of operations, you would measure OPS.
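As a toy illustration of that distinction (my own sketch, not a real benchmark; in Python the interpreter overhead dominates, so the absolute numbers are meaningless), one could time a floating-point loop and an integer loop separately:

```python
import time

def measure_rate(op, n=5_000_000):
    """Time n iterations of op's inner loop and return iterations per second."""
    start = time.perf_counter()
    op(n)
    elapsed = time.perf_counter() - start
    return n / elapsed

def float_loop(n):
    acc, x, y = 0.0, 1.000001, 0.999999
    for _ in range(n):
        acc += x * y          # one floating-point multiply and add per iteration
    return acc

def int_loop(n):
    acc, x, y = 0, 3, 7
    for _ in range(n):
        acc += x * y          # one integer multiply and add per iteration
    return acc

print(f"~{measure_rate(float_loop):,.0f} float iterations/s  (toy 'FLOPS')")
print(f"~{measure_rate(int_loop):,.0f} integer iterations/s (toy 'OPS')")
```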

Floating-point operations are just not terribly interesting for most use cases. In fact, in the past, floating-point operations used to be implemented on a separate chip sitting in a separate socket on the motherboard. This was done for two reasons: first, floating-point operations are pretty complex, slow, and power-hungry, so it was simply not physically possible to have the complex Floating-Point Unit (FPU) on the same die as the CPU. And second, only a few people needed high floating-point performance, so this made it possible for people to buy an FPU only if they actually needed it, while everybody else avoided wasting money, complexity, and power on an FPU they would rarely use.

FLOPS are just not a terribly interesting metric for most use cases. Both parts of the metric, actually: the FLO part (floating-point) and the PS part (time).

If you are building a supercomputer for military applications, then yes, FLOPS is interesting to you. However, if you are not building a supercomputer, then it is highly likely that you don't actually care about floating-point operations at all. And even if you are building a supercomputer for a company, then you do care about floating-point operations, but you actually care more about floating-point operations per dollar (cost), per watt (not just energy cost, but also thermal management, cooling, waste heat, etc.), and per cubic meter (rack space, real estate, property taxes, etc.).

Really, only the military cares about brute-force performance with no regard to cost, energy, or size.

For my mobile phone, I care about the performance-per-cost, performance-per-Watt (both battery life and heat), and of course size. For my desktop, size is a little less important, but cost and energy still are. (And who has desktops anymore?) Even extreme gamers care about waste heat and thermal management!

Crypto miners are all about performance per Watt, since energy dominates the cost for mining. That's why regions with lots of wind, solar, hydro, and geothermal energy are popular with miners. (Or, regions with less than strict environmental laws – apparently, miners have bought or leased and reactivated coal and gas plants that were in the process of being shut down in favor of alternative energy sources.)

What is an example of a non-floating point operation?

  • Integer operations
  • Fixed-point operations
  • Rational operations
  • Complex operations
  • Decimal operations
  • Money operations (nobody in their right mind would use floating-point for money)
  • [literally every single kind of number that is not a floating-point number] operations
  • text operations
  • boolean operations
  • binary operations
  • cryptographic operations

Basically, most of the operations we use in our everyday usage of computers.
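The money example from the list above is easy to demonstrate: binary floating point cannot represent many decimal amounts exactly, which is why decimal or fixed-point arithmetic is used for currency. A minimal Python illustration:

```python
from decimal import Decimal

# Binary floating point cannot represent 0.10 exactly, so summing prices drifts.
float_total = 0.1 + 0.1 + 0.1
print(float_total == 0.3)                # False (actually 0.30000000000000004)

# Decimal arithmetic keeps exact cents, which is why it is preferred for money.
decimal_total = Decimal("0.10") + Decimal("0.10") + Decimal("0.10")
print(decimal_total == Decimal("0.30"))  # True
```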

TFLOPS

FLOPS, floating-point operations per second [1] (also called peak speed per second), is short for the number of floating-point operations executed per second (English: Floating-point operations per second; abbreviation: FLOPS). It is used to evaluate computer performance, especially in scientific computing that relies heavily on floating-point arithmetic. Because the trailing S in FLOPS stands for "seconds" rather than a plural, it must not be omitted.

  • Chinese name: 每秒浮点运算次数 (floating-point operations per second)

  • Foreign name: TFLOPS

  • Includes: all operations involving fractional (decimal) numbers

  • Operation count example: ENIAC: 300 FLOPS

  • Benchmark program: measures floating-point operations per second

Contents

  1. Overview
  2. Other information

Overview

Floating-point operations include all arithmetic involving fractional (decimal) numbers. They occur frequently in certain kinds of application software and take longer than integer operations. Most modern processors include a floating-point unit (FPU), so FLOPS effectively measures how fast the FPU executes. One of the benchmarks most commonly used to measure FLOPS is Linpack.
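Linpack reports the rate at which a machine solves a dense linear system. A rough back-of-the-envelope sketch of the same idea (my own illustration, using a NumPy matrix multiply and the standard 2n^3 operation count rather than the actual Linpack solver) looks like this:

```python
# Back-of-the-envelope sketch (my own illustration, not Linpack itself): time a
# dense matrix multiply and divide the counted floating-point operations by the
# elapsed time to get an achieved GFLOPS figure.
import time
import numpy as np

n = 2000
a = np.random.rand(n, n)
b = np.random.rand(n, n)

start = time.perf_counter()
c = a @ b
elapsed = time.perf_counter() - start

flops = 2 * n**3   # standard operation count for an n x n matrix multiply
print(f"{flops / elapsed / 1e9:.1f} GFLOPS achieved")
```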


Other information

The following lists the floating-point performance of a few representative pieces of hardware.

FLOPS

  • ENIAC: 300 FLOPS

MFLOPS

  • CRAY-1: 160 MFLOPS

GFLOPS

  • Intel Xeon 3.6 GHz: <1.8 GFLOPS

  • Intel Pentium 4 HT 3.6 GHz: 7 GFLOPS

  • Intel Core 2 Duo E4300: 14 GFLOPS

  • Intel Core 2 Duo E8400: 24 GFLOPS

  • AMD Phenom 9950: 29.05 GFLOPS

  • Intel Core 2 Quad Q8200: 37 GFLOPS

  • Intel Core 2 QX9770: 39.63 GFLOPS

  • AMD Phenom II X4 955: 42.13 GFLOPS

  • Intel Core i7-965: 69.23 GFLOPS

  • Intel Core i7-980 XE: 107.6 GFLOPS

  • Intel Core i5-2500K @ 4.5 GHz: 123.35 GFLOPS (with the AVX instruction set)

  • IBM POWER7: 264.96 GFLOPS [2]

  • NVIDIA GeForce 8800 Ultra (G80-450 GPU): 393.6 GFLOPS

  • NVIDIA GeForce GTX 280 (G200-300 GPU): 720 GFLOPS

  • AMD Radeon HD 3870 (RV670 GPU): 497 GFLOPS

  • AMD Radeon HD 4870 (RV770 GPU): 1008 GFLOPS

TFLOPS

  • NVIDIA GeForce GTX 580 (GF110-375 GPU): 2.37 TFLOPS

  • AMD Radeon HD 6990 (R900 GPU): 4.98 TFLOPS

  • NVIDIA GeForce GTX 1070: 6.5 TFLOPS

  • NVIDIA GeForce GTX 1080: 9 TFLOPS

  • NVIDIA GeForce GTX 1080 Ti: 10.8 TFLOPS

  • NVIDIA Titan Xp: 12.1 TFLOPS

  • ASCI White: 12.3 TFLOPS

  • AMD Vega Frontier Edition: 13.1 TFLOPS

  • Earth Simulator: 35.61 TFLOPS

  • Blue Gene/L: 135.5 TFLOPS

  • Dawning 5000A (China): 230 TFLOPS

  • HUAWEI Ascend 910: 256 TFLOPS

PFLOPS

  • IBM Roadrunner: 1.026 PFLOPS

  • Jaguar: 1.75 PFLOPS

  • Tianhe-1 (天河一号): 2.566 PFLOPS

  • Folding@home distributed computing platform: 4.769 PFLOPS

  • BOINC distributed computing platform: 6.282 PFLOPS (still growing)

  • IBM Mira: 8.16 PFLOPS

  • K computer (京): 10.51 PFLOPS

  • IBM Sequoia: 16.32 PFLOPS

  • Cray Titan: 17.59 PFLOPS

  • Tianhe-2 (天河二号): 33.86 PFLOPS

  • Sunway TaihuLight (神威·太湖之光): 125 PFLOPS

References:

performance - What is the difference between FLOPS and OPS? - Computer Science Stack Exchange

How many Flops is one Tops? - General Discussion - Khadas Community

V100 Data Center GPU | NVIDIA

TFLOPS - Baidu Baike (百度百科)
