Quantization in LLMs (Part 1): LLM.int8(), NF4 | TensorTunes
Day 65/75 LLM Quantization Techniques [GPTQ - AWQ - BitsandBytes NF4 ...
[Unsloth Puzzle 2] NF4 4-bit Quantization & Dequantization Explained ...
Flux.1 Quantization Quality: BNB nf4 vs GGUF-Q8 vs FP16 via /r ...
QLoRA: 4-Bit Quantization for Memory-Efficient LLM Fine-Tuning ...
INT4 Quantization: Group-wise Methods & NF4 Format for LLMs ...
4-bit NormalFloat (NF4) Quantization
4-bit LLM training and Primer on Precision, data types & Quantization
Yang Yang | A Primer on Neural Network Quantization
NF4: 4-bit NormalFloat in Neural Quantization
What is the NF4 Lewis Structure?
GitHub - intel/neural-compressor: SOTA low-bit LLM quantization (INT8 ...
Load FP4 or NF4 Quantized Diffusion or UNET Model
Low-Bit LLM Quantization (INT4, NF4, FP4)
[Feature Request] Add GPTQ quantization kernels for 4-bit NormalFloat ...
Quantization Methods for Enabling Efficient Fine-Tuning and Deployment ...
[2306.06965] NF4 Isn’t Information Theoretically Optimal (and that’s Good)
4-bit Quantization with GPTQ | Towards Data Science
How to Draw the Lewis Dot Structure for NF4 + - YouTube
Flux Fusion V2 [4 steps] [GGUF • NF4 • FP8/FP16] - 🆕 v2 fp16 | Flux ...
Deep Neural Network Quantization Framework for Effective Defense ...
FLUX NF4 | ComfyUI Workflow
Making LLMs Lighter: A deep dive into LLM quantization with Code | by ...
Introduction to Quantization cooked in 🤗 with 💗🧑🍳
A Visual Guide to Quantization - by Maarten Grootendorst
Blockwise quantization only supports 16/32-bit floats, but got torch ...
What is Quantization - Lightning AI
Naive Quantization Methods for LLMs — a hands-on
Beyond Static Ranks: The Power of Dynamic Quantization in LLM Fine ...
How to optimize large deep learning models using quantization
Neural Network Model Quantization On Mobile
Practical Guide to LLM Quantization Methods - Cast AI
Quantization for LLM Model Fine-Tuning | FastCampus
Quantization. 4bit-NormalFloat (NF4) consists of… | by Sivanarayana ...
[QLoRA] QLoRA: Efficient Finetuning of Quantized LLMs
[LLM from Scratch: QLoRA] NF4 and Double Quantization: A Source Code Walkthrough - Zhihu
Model Quantization: Principles and Practice – Robot 9
We have created a little guide on how to perform Flux.1 LoRA training ...
Understanding LoRA and QLoRA - The Powerhouses of Efficient Finetuning ...
Neural Network Quantization Principles: NF4 Quantization - CSDN Blog
QLoRA Explained: Fine-Tuning Large Language Models
QLoRA and GPTQ: An Overview of Model Quantization - Zhihu
hoanduy27/bloomz-1b1-quantization-nf4 · Hugging Face
transformer/config.json · priyesh17/FLUX.1-dev_Quantized_nf4 at main
QLoRA: Efficient Finetuning of Quantized LLMs, Paper Explained - Li Li's Blog
Paper review[QLORA: Efficient Finetuning of Quantized LLMs]
p1atdev/Neta-Lumina-bnb-nf4 · Hugging Face
zeux.io - Quantizing floats
[LLM Fine-Tuning] PEFT and QLoRA: Summary Notes on Fine-Tuning LLMs - CSDN Blog
Low-Bit LLM Quantization (INT4, NF4, FP4)
Making LLMs Accessible with bitsandbytes, 4-bit Quantization, and QLoRA - BAAI Community
LoRA and QLoRA - Zhihu
Model Quantization - CSDN Blog
Finetuning Generative AI Large Language Model (LLM) Falcon (40B,7B ...
LLM Quantization Series | A Summary of LLM Quantization Methods - Zhihu
Understanding QLoRA: Quantized Fine-Tuning | AI Tutorial | Next Electronics
any4: Learned 4-bit Numeric Representation for LLMs | AI Research Paper ...
A Brief Analysis of Parameter-Efficient Fine-Tuning (PEFT) for Large Models - Zhihu
For varying B, the 16 values of AF4-B, a code based on minimizing ...
[2402.04902] L4Q: Parameter Efficient Quantization-Aware Training on ...
Model Quantization: Concepts, Methods, and Why It Matters | NVIDIA ...
Optimizing LLMs for Performance and Accuracy with Post-Training ...
GitHub - comfyorg/comfyui_nf4_loader: Nodes for loading both ...
LoRA Fine-Tuning Series (Part 2): QLoRA and Its Fundamentals - Zhihu
Exploring Quantization-Aware Training: From Principles to Practice - Zhihu
The stress–strain curves of NF1, NF2, NF3, and NF4. | Download ...
2x Faster Image Generation: A Complete Guide to Deploying the NF4-Precision Flux Model Locally
Each vertical slice represents a particular code which leads to uniform ...
GPTQ quantization(3 or 4 bit quantization) support for LLaMa · Issue ...