Deploy LLM

POST https://api.monsterapi.ai/v1/deploy/llm

The MonsterAPI Deploy service deploys an open-source or private LLM, along with a LoRA adapter, onto MonsterAPI compute infrastructure with a single request.

The path to the base model can point to a Hugging Face model, and it must be a valid Hugging Face model.
The service is built on vLLM, so any model that vLLM supports is also supported by this service.

Base models we have confirmed to work, with example base model paths, are as follows:

Falcon (tiiuae/falcon-7b, tiiuae/falcon-40b, tiiuae/falcon-rw-7b, etc.)
GPT-2 (gpt2, gpt2-xl, etc.) - Limited to 1xGPU
GPT-J (EleutherAI/gpt-j-6b, nomic-ai/gpt4all-j, etc.) - Limited to 1xGPU
GPT-NeoX (EleutherAI/gpt-neox-20b, databricks/dolly-v2-12b, stabilityai/stablelm-tuned-alpha-7b, etc.)
LLaMA & LLaMA-2 (meta-llama/Llama-2-70b-hf, lmsys/vicuna-13b-v1.3, young-geng/koala, openlm-research/open_llama_13b, etc.)
Mistral (mistralai/Mistral-7B-v0.1, mistralai/Mistral-7B-Instruct-v0.1, etc.)
MPT (mosaicml/mpt-7b, mosaicml/mpt-30b, etc.)
OPT (facebook/opt-66b, facebook/opt-iml-max-30b, etc.)
Qwen (Qwen/Qwen-7B, Qwen/Qwen-7B-Chat, etc.)
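
As a rough illustration of how a deployment request to this endpoint might be assembled, the following Python sketch builds (but does not send) a POST request carrying a Bearer token. This page does not list the request schema, so the JSON field names `basemodel_path` and `loramodel_path`, and the helper `build_deploy_request`, are assumptions made for the example; check the actual parameter list before relying on them.

```python
import json
import urllib.request

DEPLOY_URL = "https://api.monsterapi.ai/v1/deploy/llm"


def build_deploy_request(token, basemodel_path, loramodel_path=None):
    """Build (but do not send) a POST request for the deploy endpoint.

    ASSUMPTION: the JSON field names used here are illustrative only;
    this page does not document the request body schema.
    """
    payload = {"basemodel_path": basemodel_path}
    if loramodel_path is not None:
        # Hypothetical field for the optional LoRA adapter.
        payload["loramodel_path"] = loramodel_path

    return urllib.request.Request(
        DEPLOY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Bearer authentication, as listed under Credentials.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_deploy_request("YOUR_API_TOKEN", "mistralai/Mistral-7B-v0.1")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually submit the request: urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to inspect the URL, headers, and body before any compute is provisioned.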

We are working on adding more models to this list. If you have a specific model request, please reach out to us at
[email protected] or join our Discord server at https://discord.gg/3qXwXVX9

Credentials: Bearer (JWT)