Fine-Tuning gpt-oss for Accuracy and Performance with Quantization ...
4-Bit Quantization Using GPTQ Techniques | PDF | Matrix (Mathematics ...
4-bit Quantization with GPTQ | Towards Data Science
What is Quantization and how to use it with TensorFlow
Introduction to AI Model Quantization Formats | by zhaozhiming | Medium
Quantization-aware training for GPT2 - quantization - PyTorch Forums
GPT OSS - OpenAI Reference Implementation | Kapil Sharma
LLM By Examples — Use GPTQ Quantization | by MB20261 | Medium
SelectQ: Calibration Data Selection for Post-Training Quantization at ...
GPTQ Quantization of LLMs - The Most Simple Explanation
Underline | Quadapter: Adapter for GPT-2 Quantization
GPTQ: Accurate Post-Training Quantization for Generative Pre-trained ...
(PDF) Quadapter: Adapter for GPT-2 Quantization
GPT-2 XL PyTorch Quantization for use on a Cloud Server · Issue #2466 ...
Quadapter: Adapter for GPT-2 Quantization - ACL Anthology
GitHub - Digitous/GPTQ-for-GPT-NeoX: 4 bits quantization of LLaMa using ...
GPT-2 quantization performance when finetuned on F-ID of varying sizes ...
Quantization Explained: Why the Same LLM Gives Better Results on High ...
T2S-GPT: Dynamic Vector Quantization for Autoregressive Sign Language ...
How to Use GPT-2 Indonesia Quantization - fxis.ai
GPT-OSS quantization demystified From the HuggingFace model card ...
A Visual Guide to Quantization - by Maarten Grootendorst
Exploring Bits-and-Bytes, AWQ, GPTQ, EXL2, and GGUF Quantization ...
unsloth/gpt-oss-20b-GGUF · Native FP4 seems to make quantization ...
MXFP4 Quantization and GPT-OSS. GPT-OSS dropped in early August, and ...
LLM Quantization | GPTQ | QAT | AWQ | GGUF | GGML | PTQ | by Siddharth ...
[2210.17323] GPTQ: Accurate Post-Training Quantization for Generative ...
Quantization Bits at Amanda Okane blog
Which Quantization Method is Right for You? (GPTQ vs. GGUF vs. AWQ)
Fast and Accurate GPU Quantization for Transformers | Speechmatics
QLoRA:4-bit level quantization and fine-tuning method for LLM with 33B ...
Optimizing Large Language Models: A Deep Dive into Quantization ...
[PDF] GPTQ: Accurate Post-Training Quantization for Generative Pre ...
Hands-on LLM Quantization using GPTQ
A Visual Guide to Quantization - Maarten Grootendorst
[Paper Review] 38. GPTQ: Accurate Post-Training Quantization for Generative ...
Quantization loading models by different data types | by Sivanarayana ...
Fast and Accurate GPU Quantization for Transformers
Understanding LLM Quantization. With the surge in applications using ...
Yi30/quantized_gpt_oss_20b at main
GPTQ & OBQ: Quantize Your GPT. Quantization: converting floating-point weights to fixed-point and reducing memory access while introducing little error ... - 掘金
Simple Tutorial to Quantize Models using llama.cpp from safetensors to ...
GPTQ: Exploring a Revolutionary Innovation in One-Shot Quantization! - 知乎
Qualcomm AI | Quadapter: Adapter for GPT-2 Quantization ...
How to Contribute to Lit-GPT and Lit-LLaMA - Lightning AI
LLaMa GPTQ 4-Bit Quantization. Billions of Parameters Made Smaller and ...
🚀 Shrinking GPT-2 with Model Quantization: Lower Costs, Faster ...
Quantization. 4bit-NormalFloat (NF4) consists of… | by Sivanarayana ...
How to load quantized gpt-j based models · Issue #171 · LostRuins ...
The Power of Quantization: Shrinking GPT2, Unleashing Speed - DEV Community
Performance of quantized GPT-2 with varying weight bit-widths and 8-bit ...
Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers | BARD AI
GPTQ(GPT Quantization)
Run a Chatgpt-like Chatbot on a Single GPU with ROCm
Quantized Models for openai/gpt-oss-20b – Hugging Face
GitHub - AlpinDale/gptq-gptj: Code for the paper "GPTQ: Accurate Post ...
The Role of Quantization-Aware Fine-Tuning of GPT-OSS-20b in Optimizing ...
GitHub - adrianmarino/gpt4-quantized: gpt4-quantized
Model Quantization - CSDN Blog
GPT-4o Benchmark : r/ChatGPT
GPTQ quantization(3 or 4 bit quantization) support for LLaMa · Issue ...
mNLP-project/gpt2-dpo-quantized16bits at main
frollini/quantized_gpt2_c4_4bits_with-fixed-dataset · Hugging Face
Building a Mixture of Experts Model with GPT-2, BERT, RoBERTa, and 8 ...
σ-GPTs: A New Approach to Autoregressive Models | AI Research ...
Understanding the GPTQ Algorithm for LLM Quantization in One Article! - 知乎
Quantized Models for ytz20/GAD-GPT-5-Chat-Qwen2.5-7B-Instruct – Hugging ...
jquave/gpt4all-lora-unfiltered-quantized at main
LLM Quantization: Quantize Model with GPTQ, AWQ, and Bitsandbytes ...
Irtasam/Quantized-GPT2-Wains · Hugging Face
Jolyne-W/gpt2-quantized-tokenizer · Hugging Face
Fine-Tuning gpt-oss with Quantization-Aware Training for Improved Accuracy and Performance - NVIDIA Technical Blog
LLukas22/gpt4all-lora-quantized-ggjt · Hugging Face
GitHub - mvandermeulen/lit-gpt: Hackable implementation of state-of-the ...