Supported Models
| Models | Tensor Parallel | Quantization | Chat API | HF models examples |
|---|---|---|---|---|
| Aquila | Yes | Yes | Yes | |
| Bloom | Yes | Yes | No | |
| Baichuan | Yes | Yes | Yes | |
| ChatGLM3 | Yes | Yes | Yes | |
| Gemma | Yes | Yes | Yes | |
| GPT_j | Yes | Yes | No | |
| GPT_NeoX | Yes | Yes | No | |
| GPT2 | Yes | Yes | No | |
| InternLM | Yes | Yes | Yes | |
| Llama3/2 | Yes | Yes | Yes | meta-llama/Meta-Llama-3.1-8B-Instruct, meta-llama/Meta-Llama-3.1-8B, meta-llama/Llama-2-7b |
| Mistral | Yes | Yes | Yes | |
| MPT | Yes | Yes | Yes | |
| Phi2 | Yes | Yes | No | |
| Qwen | Yes | Yes | Yes | |
| Yi | Yes | Yes | Yes | |

If your model is not in the supported list, we are happy to help. Please open a request for adding a new model on GitHub Issues.