Introducing MonsterAPI's new NextGen LLM Inference API

MonsterAPI is excited to introduce our innovative Large Language Model (LLM) API service, leveraging top-tier open-source models from Hugging Face to cater to a broad spectrum of needs in natural language processing, chatbots, content creation, and more. Our platform is designed to offer a versatile range of LLMs to meet the diverse requirements of your projects.

Check out the QuickStart Notebook to get started with the service.

Supported Models:

Note: The prices listed in this post are introductory beta rates, subject to change as we work towards reducing costs for our production release.

  • google/gemma-2-9b-it: Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models.

  • microsoft/Phi-3-mini-4k-instruct: Developed by Microsoft Research, this is a lightweight, state-of-the-art open model with 3.8B parameters, trained on the Phi-3 datasets, which include both synthetic data and filtered publicly available website data, with a focus on high-quality, reasoning-dense content. It belongs to the Phi-3 family; the Mini version comes in two variants, 4K and 128K, named for the context length (in tokens) each supports.

  • mistralai/Mistral-7B-Instruct-v0.2: An instruction-tuned model in the Mistral-7B series, fine-tuned to follow instructions and building on the capabilities of its predecessor.

  • meta-llama/Meta-Llama-3-8B-Instruct: Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes. The Llama 3 instruction-tuned models are optimized for dialogue use cases and, according to Meta, outperform many of the available open-source chat models on common industry benchmarks. Meta also states that it took great care to optimize helpfulness and safety in developing these models.

Transparent Pricing:

Our clear, token-based pricing ensures you pay only for what you use, without any hidden costs.

Model Name                            Input Token Price (per 1K tokens)   Output Token Price (per 1K tokens)
google/gemma-2-9b-it                  $0.00004559                         $0.00045591
microsoft/Phi-3-mini-4k-instruct      $0.00003483                         $0.00034834
mistralai/Mistral-7B-Instruct-v0.2    $0.00004559                         $0.00045591
meta-llama/Meta-Llama-3-8B-Instruct   $0.00004559                         $0.00045591
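
To make the token-based pricing concrete, here is a minimal Python sketch that estimates the cost of a workload from the beta rates listed above. The `estimate_cost` helper and the `PRICES_PER_1K` dictionary are illustrative names, not part of the MonsterAPI service, and the sketch assumes billing is strictly proportional to token counts with no rounding or minimum charge.

```python
from decimal import Decimal

# Beta prices in USD per 1K tokens, copied from the pricing table:
# model name -> (input price, output price).
PRICES_PER_1K = {
    "google/gemma-2-9b-it": (Decimal("0.00004559"), Decimal("0.00045591")),
    "microsoft/Phi-3-mini-4k-instruct": (Decimal("0.00003483"), Decimal("0.00034834")),
    "mistralai/Mistral-7B-Instruct-v0.2": (Decimal("0.00004559"), Decimal("0.00045591")),
    "meta-llama/Meta-Llama-3-8B-Instruct": (Decimal("0.00004559"), Decimal("0.00045591")),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of a workload (hypothetical helper).

    Assumes cost scales linearly with token counts; actual billing
    rules may differ.
    """
    input_price, output_price = PRICES_PER_1K[model]
    total = Decimal(input_tokens) * input_price + Decimal(output_tokens) * output_price
    return total / 1000  # prices are quoted per 1K tokens


if __name__ == "__main__":
    # 1,000,000 input tokens and 200,000 output tokens on Phi-3 Mini.
    cost = estimate_cost("microsoft/Phi-3-mini-4k-instruct", 1_000_000, 200_000)
    print(f"Estimated cost: ${cost}")  # 0.03483 + 0.069668 = 0.104498
```

`Decimal` is used instead of floats because the per-token rates are very small and binary floating point would introduce rounding noise when summing many requests.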

Why MonsterAPI?

  • Diverse Model Selection: From specialized tasks to general applications, our platform ensures the availability of the right LLM for your project.
  • Transparent Pricing: Our straightforward, token-based pricing model guarantees no surprises with hidden fees.
  • User-Friendly: Our service is designed for ease of access, complemented by detailed documentation and dedicated support for seamless API integration.
  • Commitment to Open-Source: We champion open-source technology, hosting all our models on a scalable, secure, and cost-effective GPU cloud to keep them accessible.

For additional details or assistance, please visit our website or contact our support team at [email protected]. Let MonsterAPI be the driving force behind your applications, harnessing the power of the latest LLM technology.

Beta Access Limitations:

To manage our beta service effectively, we've set specific throttle limits for different MonsterAPI plans:

Plan      Requests per 60 seconds   Daily API Call Limit
Free      10                        14,400
Wolf      20                        28,800
Beast     40                        57,600
Monster   60                        86,400
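
Each daily limit equals the per-minute limit sustained for a full day (for example, 10 × 1,440 minutes = 14,400), so a client that stays under the 60-second limit also stays under the daily one. One way to do that on the client side is a sliding-window throttle, sketched below in Python. The `SlidingWindowThrottle` class is a hypothetical helper, not part of any MonsterAPI SDK, and how the server actually measures its 60-second window (fixed or sliding) is not specified here, so treat this as a conservative approximation.

```python
import time
from collections import deque

# Requests allowed per 60-second window, from the plan table.
PLAN_LIMITS = {"Free": 10, "Wolf": 20, "Beast": 40, "Monster": 60}


class SlidingWindowThrottle:
    """Client-side limiter: at most max_requests per window (hypothetical helper)."""

    def __init__(self, max_requests, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock    # injectable so the class can be tested without waiting
        self.sent = deque()   # timestamps of recent requests, oldest first

    def wait_time(self):
        """Return the seconds to wait before the next request (0.0 if allowed now)."""
        now = self.clock()
        # Forget requests that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.max_requests:
            return 0.0
        # Otherwise wait until the oldest recorded request leaves the window.
        return self.window - (now - self.sent[0])

    def acquire(self, sleep=time.sleep):
        """Block until a request is allowed, then record it."""
        delay = self.wait_time()
        if delay > 0:
            sleep(delay)
        self.sent.append(self.clock())


if __name__ == "__main__":
    throttle = SlidingWindowThrottle(PLAN_LIMITS["Free"])
    for i in range(3):
        throttle.acquire()  # call this right before sending each API request
        print(f"request {i + 1} may be sent now")
```

Calling `acquire()` immediately before each API request keeps a single-process client within its plan's limit; multiple processes sharing one API key would need a shared limiter instead.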

Support: For inquiries, feedback, or suggestions, please reach out to us at [email protected].