FlashAttention (page 2)

FlashAttention

FlashAttention Sets GPU Memory Ablaze: An Epic Boost to Transformer Context Length

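After FlashAttention, the fast and memory-efficient attention algorithm, became hugely popular, its second-generation upgrade has arrived. FlashAttention-2 is an algorithm rewritten from scratch that speeds up attention and reduces its memory footprint without any approximation.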

qq_41771998·2023-08-23 20:55

Deep Learning Enters the Large-Model Era: Making Transformers Take Off with Flash Attention

FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Aware

·2023-08-21 17:31

The FlashAttention Algorithm Explained in Detail

The purpose of this article is to explain FlashAttention in detail. Why explain FlashAttention?

·2023-08-21 10:17

FlashAttention

1. Paper title (venue, date): FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness. 2. Main direction: a new attention mechanism. 3. Specific task: one with

be_humble·2023-08-12 14:21

FlashAttention

Sources. paper: https://arxiv.org/abs/2205.14135 an informal talk by the author Tri Dao: https://www.youtube.com/watch?v=FThvfkXWqtE code repo: GitHub - HazyResearch/flash-attention: Fast and memory-efficient exact attenti

EverNoob·2023-07-29 12:25

FlashAttention-2

FlashAttention is a fusion trick, which merges multiple operational steps (ops) in the attention layers of transformer networks to achieve better end2end result; the performance gain is mainly from better memory reuse given the vanill

EverNoob·2023-07-29 12:24

Some Ideas for CUDA-Level Acceleration: FlashAttention, PagedAttention, LightSeq, ByteTransformer

FlashAttention. FlashAttention generally refers to FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

taoqick·2023-07-13 18:42

A Summary of Recent Promising Improvements to the Transformer Architecture

Contents: 0 Introduction; 1 Gated Linear Unit (GLU); 1.1 Idea; 2 Gated Attention Unit (GAU); 2.1 Idea; 2.2 Experimental conclusions; 2.3 Mixed attention; 3 FlashAttention; 3.1 Standard Attention

tyhj_sf·2023-06-17 18:58

Comparing FlashAttention, Memory-Efficient Attention, and CausalSelfAttention with PyTorch 2

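This article is mainly a small PyTorch 2.0 experiment: trying out, on a MacBook Pro, the performance of Transformer self-attention after optimizations and improvements, specifically FlashAttention, Memory-Efficient Attention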

写bug的程旭源·2023-04-14 07:51

FlashAttention

FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness https://paperswithcode.com/paper/

Valar_Morghulis·2022-06-07 15:11
