vllm.model_executor.layers.quantization.online

Modules:

Name        Description
base
fp8
int8
moe_base
mxfp8       Online MXFP8 (microscaling FP8, block-32) quantization methods.
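
To illustrate what "microscaling FP8, block-32" means, the following is a minimal NumPy sketch and not the code in the `mxfp8` module. It assumes the common MX convention: every block of 32 values shares one power-of-two scale, and the scaled values are stored as FP8 E4M3. The function names (`round_to_e4m3`, `quantize_mxfp8`, `dequantize_mxfp8`) are hypothetical, and FP8 rounding is simulated in float32 because NumPy has no native FP8 dtype.

```python
import numpy as np

FP8_E4M3_MAX = 448.0   # largest finite E4M3 value (1.75 * 2**8)
FP8_E4M3_EMAX = 8      # exponent of the largest E4M3 binade
FP8_E4M3_EMIN = -6     # smallest normal E4M3 exponent
BLOCK = 32             # elements sharing one scale (assumed block size)


def round_to_e4m3(v):
    """Simulate rounding float32 values onto the E4M3 grid (3 mantissa bits)."""
    v = np.asarray(v, dtype=np.float32)
    # frexp gives |v| = m * 2**ex with m in [0.5, 1), so floor(log2|v|) = ex - 1.
    _, ex = np.frexp(np.abs(v))
    # Clamp the exponent so tiny values land on the subnormal grid.
    e = np.maximum(ex - 1, FP8_E4M3_EMIN)
    step = np.exp2((e - 3).astype(np.float32))
    q = np.round(v / step) * step  # round half to even
    # Saturate instead of overflowing.
    return np.clip(q, -FP8_E4M3_MAX, FP8_E4M3_MAX).astype(np.float32)


def quantize_mxfp8(x, block=BLOCK):
    """Return (q, scale): simulated E4M3 values and one power-of-two scale per block."""
    x = np.asarray(x, dtype=np.float32)
    assert x.size % block == 0, "size must be a multiple of the block size"
    blocks = x.reshape(-1, block)
    amax = np.abs(blocks).max(axis=1, keepdims=True)
    _, ex = np.frexp(amax)
    # Shared exponent: align the block maximum with the top E4M3 binade.
    shared_exp = np.where(amax > 0, (ex - 1) - FP8_E4M3_EMAX, 0)
    scale = np.exp2(shared_exp.astype(np.float32))
    return round_to_e4m3(blocks / scale), scale


def dequantize_mxfp8(q, scale, shape):
    """Undo the per-block scaling and restore the original shape."""
    return (q * scale).reshape(shape)
```

A short usage example: `q, scale = quantize_mxfp8(w)` followed by `dequantize_mxfp8(q, scale, w.shape)` returns an approximation of `w`. "Online" here is taken to mean that quantization happens at load time or run time from higher-precision weights rather than from a pre-quantized checkpoint; that reading is an assumption, not something this page states.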