vllm.model_executor.models.granitemoeshared
Inference-only GraniteMoeShared model.
The architecture is the same as granitemoe but with the addition of shared experts.
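
To illustrate what "shared experts" adds on top of a routed mixture-of-experts layer, here is a minimal pure-Python sketch. It is not vLLM's implementation. All names (`moe_with_shared_expert`, `scale`, the toy experts) are hypothetical, and it assumes the shared expert's output is simply added to the weighted sum of the top-k routed experts, with router weights normalised over the selected experts only.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def moe_with_shared_expert(x, router_logits, routed_experts, shared_expert, top_k):
    """Toy MoE layer: top-k routed experts plus an always-active shared expert.

    Assumption (for illustration only): the shared expert output is added
    to the routed output, and router weights are a softmax over the
    selected top-k logits.
    """
    # Indices of the top-k experts by router logit.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i],
                 reverse=True)[:top_k]
    weights = softmax([router_logits[i] for i in top])

    # Weighted sum of the selected routed experts.
    routed = [0.0] * len(x)
    for w, i in zip(weights, top):
        out = routed_experts[i](x)
        routed = [r + w * o for r, o in zip(routed, out)]

    # The shared expert sees every token, regardless of routing.
    shared = shared_expert(x)
    return [r + s for r, s in zip(routed, shared)]


def scale(k):
    """Hypothetical stand-in for an expert MLP: multiplies the input by k."""
    return lambda v: [k * a for a in v]


x = [1.0, 2.0]
routed_experts = [scale(2.0), scale(3.0), scale(-1.0)]
shared_expert = scale(0.5)
router_logits = [0.0, 0.0, -5.0]  # experts 0 and 1 tie; expert 2 is not selected

y = moe_with_shared_expert(x, router_logits, routed_experts, shared_expert, top_k=2)
print(y)  # [3.0, 6.0]
```

In this toy run the two selected experts each get weight 0.5, so the routed part is 2.5 times the input, and the shared expert contributes a further 0.5 times the input for every token.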