Z.ai

GLM-5

A 744B-parameter model in the GLM family. Q4_K_M is the recommended quantization for first-pass local usage in V1.

Params: 744B

Context: 200K

License: Custom

Best runtime: vLLM

Recommended hardware

First-pass fit across priority GPUs

Hardware          Fit        Decode     Safe ctx
NVIDIA A10 24GB   Too heavy  2 tok/s    4K
NVIDIA A100 40GB  Too heavy  4.5 tok/s  4K
NVIDIA A100 80GB  Too heavy  6 tok/s    4K
NVIDIA A16 64GB   Too heavy  2 tok/s    4K
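
The "Too heavy" verdicts can be sanity-checked with a rough weight-memory estimate. The sketch below is an illustration, not the calculator this page uses: it assumes Q4_K_M averages about 4.8 bits per weight (an approximate, commonly quoted figure that varies by architecture), counts weights only, and ignores KV cache, activations, and runtime overhead. The names weight_gb and fits are hypothetical helpers introduced here.

```python
# Rough weight-only memory estimate for a quantized model.
# ASSUMPTION: ~4.8 bits per weight for Q4_K_M (approximate; varies by model).
# KV cache, activations, and runtime overhead are NOT included.

PARAMS = 744e9           # parameter count listed on this card
BITS_PER_WEIGHT = 4.8    # assumed average for Q4_K_M

# VRAM capacities taken from the hardware table above.
GPUS_GB = {
    "NVIDIA A10 24GB": 24,
    "NVIDIA A100 40GB": 40,
    "NVIDIA A100 80GB": 80,
    "NVIDIA A16 64GB": 64,
}


def weight_gb(params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in decimal gigabytes."""
    return params * bits_per_weight / 8 / 1e9


def fits(vram_gb: float, needed_gb: float) -> bool:
    """True if the weights alone would fit in the given VRAM."""
    return needed_gb <= vram_gb


needed = weight_gb(PARAMS, BITS_PER_WEIGHT)
for name, vram in GPUS_GB.items():
    verdict = "fits" if fits(vram, needed) else "too heavy"
    print(f"{name}: ~{needed:.0f} GB of weights vs {vram} GB VRAM -> {verdict}")
```

Under these assumptions the weights alone come to several hundred gigabytes, well beyond any single GPU in the table, which is consistent with every row being marked "Too heavy".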