SwiGLU Activation Function
Exploring SwiGLU: The Activation Function Powering Modern LLMs | by ...
SwiGLU activation function · Issue #20403 · huggingface/transformers ...
Discovering SwiGLU: The Activation Function Powering Modern LLMs
Why Do Large Language Models All Use SwiGLU as Their Activation Function? - Zhihu
Swish Activation Function by Google | by Random Nerd | Medium
python - How to implement SwiGLU activation? Why does SwiGLU takes in ...
SwiGLU: The Activation Function Powering Modern LLMs | by Saeed Mehrang ...
Beyond ReLU: Discovering the Power of SwiGLU | by heping_LU | Medium
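Why Do Large Language Models All Use SwiGLU as Their Activation Function?_Tencent News
Why Do Large Language Models All Use SwiGLU as Their Activation Function? - Tencent Cloud Developer Community - Tencent Cloud
SwiGLU with SiLU: The Activation Function Revolution and Architecture Design Essentials in the Era of Large Models_silu paper - CSDN Blog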
Function Notation - Graphs
Linear Layers and Activation Functions in Transformer Models ...
The SwiGLU Activation Function Explained in Detail - Zhihu
Transformer Design Guide (Part 2: Modern Architecture) | Rohit Bandaru
What is SwiGLU? • Carlos Roldán
LLaMA-2 from the Ground Up - by Cameron R. Wolfe, Ph.D.
Understanding LLM through the LLaMA Models - Jie Yu’s Home Page
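llama Source Code Study · model.py[2] SwiGLU Activation Function - CSDN Blog
Large Model Series: How the SwiGLU Activation Function and GLU Gated Linear Units Work - CSDN Blog
[Large Model Architecture Notes] SwiGLU, a Common Activation Function in Large Models - Zhihu
A Detailed Guide to Building Llama 3 from Scratch (with Code)!_llama3 code - CSDN Blog
Large Model Fundamentals | Activation Functions | From ReLU to SwiGLU - Zhihu
The SwiGLU Activation Function in the Llama3 Large Model - Zhihu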
Decoder-Only Transformers: The Workhorse of Generative LLMs
All the Activation Functions
Introducing llama2 | FeedForward with SwiGlu_swiglu mlp - CSDN Blog
Building an Efficient Machine Learning API
The Evolution of Llama: From Llama 1 to Llama 3.1 | Towards Data Science
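Large Model Study Notes: RMS Norm and the SwiGLU Activation Function in the Llama 3 Model Architecture - Tech Stack
Neural Network Activation Functions (5): The Gated Family of GLU, Swish, and SwiGLU - Zhihu
LLaMa-1/2/3 Principles + Source Code: A Breakdown (KV-Cache, RoPE, RMSNorm, GQA, SwiGLU)_llama source code - CSDN Blog
Activation Functions - SwiGLU_silu activation function - CSDN Blog
Reading the SwiGLU Paper - CSDN Blog
[Large Models] Activation Functions: SwiGLU Explained in Detail - CSDN Blog
The Evolution of Activation Functions: From Sigmoid to SwiGLU, the Neural Triggers of Deep Learning_ITPUB Blog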
LLM Inference: From Prompt to Text | DhiraPT's Lab
Four curve diagrams of activation function. (a) Relu activation ...
SwiGLU: The FFN Upgrade I Use to Get Free Performance - DEV Community
Large Model Series: How the SwiGLU Activation Function and GLU Gated Linear Units Work_mb648c186b9844f's Tech Blog_51CTO Blog
Bloom, the model everyone hates... - Ed's Blog
Neural Network Activation Functions: From ReLU to the Cutting-Edge SwiGLU - Tech Stack
SwiGLU, the Activation Function in DeepSeek_ITPUB Blog
Deep Learning 101: Transformer Activation Functions Explainer - Sigmoid ...
An Analysis of Transformer Parameter Counts, Compute, and Activations in the LLM Era | Paul's Blog
Ascend Large Models | Structural Components 2: ReLU, GeLU, SwiGLU, GeGLU - Zhihu
What Is SwiGLU? How to Implement It? And Why Does it Work?
[Large Models] The Evolution of the LLaMa Series and Source Code Analysis_llama model - CSDN Blog
Deepseek
Why Do All Mainstream LLMs Use SwiGLU? - Zhihu
LLaMA Open and Efficient Foundation Language Models - 230528.pdf
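A Brief Summary of the SwiGLU Activation Function - Zhihu
[Frequently Asked NLP Interview Questions - LLM Architecture] What Are the Benefits of Using SwiGLU over ReLU?_dynamic gating mechanism - CSDN Blog
Supplement: The SwiGLU Activation Function - CSDN Blog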
Gated Linear Units: The FFN Architecture Behind Modern LLMs ...
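Large Language Model Technical Encyclopedia: Principles, Architecture, and Engineering Practice, Chapter 8: Key Component Optimization: RMSNorm and SwiGLU - Zhihu
A Summary of Activation Functions_glu and switchglu - CSDN Blog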
SwiGLU: GLU Variants Improve Transformer (2020) – Naoki Shibuya
Transformer Activation Functions and their Details | JoeLogs
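An Introduction to the Swish and SwiGLU Activation Functions_pytorch swish - CSDN Blog
What Does SwiGLU Actually Do in Deep Learning? - Zhihu
An Illustrated Guide to How AF3 Works - Zhihu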
Swiggle Search Engine _ Swiggle Search – VRIMCA
Swiggle Search for Kids
Activation Functions: All You Need To Know | Machine Learning Archive