LLM Model Suggestions

1. OpenAI Models

GPT-4.1, GPT-4.1 mini, GPT-4.1 nano

  • Key Features:

    • Excellent at following instructions and tool calling.
    • 1M token context window (can handle huge inputs).
    • Low latency, no explicit reasoning step.
  • Differences:

    • Mini and Nano are lighter, faster, and more cost-effective, but may have slightly reduced performance compared to the full version.
  • Examples of how you can use it:

    • GPT-4.1: This is for complex tasks needing high accuracy and a large context, such as analyzing long documents or advanced code generation.
    • Mini/Nano: For faster, cost-sensitive tasks where slightly less accuracy is acceptable, such as chatbots, quick code snippets, and batch processing.
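
As a quick illustration, here is a minimal sketch of calling the GPT-4.1 family through the OpenAI Python SDK; the prompt content is a placeholder, and you can swap the model name for the mini or nano variant when cost and latency matter more than accuracy.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4.1",  # or "gpt-4.1-mini" / "gpt-4.1-nano" for cheaper, faster calls
    messages=[
        {"role": "system", "content": "You are a careful code reviewer."},
        {"role": "user", "content": "Review this function for bugs: def add(a, b): return a - b"},
    ],
)
print(response.choices[0].message.content)
```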

GPT-4o, GPT-4o mini

  • Key Features:

    • GPT-4o is OpenAI’s “omni” model, optimized for chat and API integrations.
    • The mini version is smaller, faster, and more cost-efficient for coding and visual tasks.
    • Both support text and image inputs (multimodal).
  • Examples of how you can use it:

    • GPT-4o: For chat applications, conversational agents, and when you want to test the latest improvements.
    • GPT-4o mini: For fast, affordable tasks like code review, visual data extraction, or when running many parallel requests.
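
Because GPT-4o and GPT-4o mini accept image inputs, a multimodal request can be sketched like this; the image URL is a placeholder.

```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What trend does this chart show?"},
                # Placeholder URL; any publicly reachable image works here.
                {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```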

GPT-5, GPT-5 mini, GPT-5 nano

  • Key Features:

    • GPT-5 is the flagship model for coding, reasoning, and agentic tasks.
    • Mini and Nano are optimized for speed and cost, with Nano being the fastest and cheapest.
  • Examples of how you can use it:

    • GPT-5: For advanced coding, multi-step reasoning, and agent workflows, such as building autonomous agents and complex data analysis.
    • Mini: For well-defined, precise tasks where speed matters; for example: automated testing, structured data extraction.
    • Nano: For high-volume, simple tasks like summarization, classification, or when cost is a primary concern.
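
For agentic or multi-step work, GPT-5 can be called through the OpenAI Responses API; this is only a sketch, and the prompt is illustrative.

```python
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-5",  # "gpt-5-mini" or "gpt-5-nano" for cheaper, high-volume calls
    input="Outline the steps to migrate a REST service to gRPC, then list the risks.",
)
print(response.output_text)
```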

OpenAI o3-mini, o4-mini

  • Key Features:

    • Small, fast models for reasoning and structured outputs.
    • Support function calling and batch API.
  • Examples of how you can use it:

    • For lightweight applications, structured data generation, or when you need to process many requests quickly and cheaply.
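
Since these models support function calling, a structured-output request with o4-mini could be sketched as below; the extract_invoice tool is hypothetical and only shows the shape of a tool definition.

```python
from openai import OpenAI

client = OpenAI()

tools = [
    {
        "type": "function",
        "function": {
            "name": "extract_invoice",  # hypothetical tool for structured extraction
            "description": "Extract structured fields from an invoice.",
            "parameters": {
                "type": "object",
                "properties": {
                    "vendor": {"type": "string"},
                    "total": {"type": "number"},
                },
                "required": ["vendor", "total"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="o4-mini",
    messages=[{"role": "user", "content": "Invoice from ACME Corp, total due $412.50."}],
    tools=tools,
)
print(response.choices[0].message.tool_calls)
```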

For more details, see the OpenAI documentation.

2. Anthropic Claude Models (via Bedrock)

Claude 3.5 Sonnet, Claude 3.7 Sonnet, Claude 4 Sonnet

  • Key Features:

    • Claude 3.5 Sonnet is strong at step-by-step reasoning and careful with controversial topics.
    • Claude 3.7 Sonnet is similarly conversational and careful, but more proactive in dialogue.
    • Claude 4 Sonnet excels at complex, long-running coding and agent tasks, making it the strongest pick here for advanced coding and agentic workflows.

  • Examples of how you can use it:

    • 3.5/3.7: For thoughtful, nuanced conversations, content moderation, or when handling sensitive topics.

    • 4 Sonnet: This is for complex coding, long-running tasks, or when sustained performance is needed in agentic workflows.
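
Since these models are accessed via Bedrock, a minimal sketch using the boto3 Converse API could look like the following; the model ID and region are assumptions, so check the Bedrock console for the exact identifier available to your account.

```python
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock.converse(
    # Assumed model ID; verify the exact Claude Sonnet identifier in your Bedrock console.
    modelId="anthropic.claude-3-7-sonnet-20250219-v1:0",
    messages=[{"role": "user", "content": [{"text": "Summarize the trade-offs of feature flags."}]}],
    inferenceConfig={"maxTokens": 512, "temperature": 0.3},
)
print(response["output"]["message"]["content"][0]["text"])
```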

For more details, see the Anthropic documentation.

3. DeepSeek R1 (via Bedrock)

  • Key Features:

    • Open source, strong at logical inference, math, and real-time decision-making.
  • Examples of how you can use it:

    • For tasks requiring mathematical problem-solving, logic puzzles, or when you want an open-source alternative.
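
The same Bedrock Converse pattern applies here; below is a sketch with a math-style prompt and an assumed model ID that you should verify in the Bedrock console.

```python
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock.converse(
    modelId="us.deepseek.r1-v1:0",  # assumed ID; check your Bedrock console and region
    messages=[{"role": "user", "content": [{"text": "A train leaves at 09:40 and arrives at 11:05. How long is the trip?"}]}],
)
print(response["output"]["message"]["content"][0]["text"])
```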

For more details, see the DeepSeek documentation.

4. Llama 4 Models

Llama 4 Maverick, Llama 4 Scout (via Bedrock)

  • Key Features:

    • Multimodal (text, image), multilingual, and optimized for coding and tool-calling.
    • Scout is the smaller of the two and excels at summarization, parsing, and image grounding.
  • Examples of how you can use it:

    • Maverick: For agentic systems, multilingual applications, and multimodal tasks.
    • Scout: For summarizing large documents, parsing user activity, or answering visual questions.
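
A Scout summarization call via Bedrock could be sketched as below; the model ID is an assumption and the document text is a placeholder.

```python
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

long_document = "..."  # placeholder for the text you want summarized

response = bedrock.converse(
    modelId="us.meta.llama4-scout-17b-instruct-v1:0",  # assumed ID; verify in the Bedrock console
    messages=[{"role": "user", "content": [{"text": f"Summarize the key points:\n\n{long_document}"}]}],
    inferenceConfig={"maxTokens": 400, "temperature": 0.2},
)
print(response["output"]["message"]["content"][0]["text"])
```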

For more details, see the Meta Llama documentation.

5. Mistral Pixtral (via Bedrock)

  • Key Features:

    • Multimodal, excels at understanding documents, charts, and images.
  • Examples of how you can use it:

    • For extracting information from images, analyzing charts, or when you need strong text and image understanding.
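
Because Pixtral is multimodal, the Bedrock Converse API lets you attach image bytes directly; this sketch assumes a local chart.png and uses an assumed model ID.

```python
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

with open("chart.png", "rb") as f:  # placeholder image file
    image_bytes = f.read()

response = bedrock.converse(
    modelId="mistral.pixtral-large-2502-v1:0",  # assumed ID; verify in the Bedrock console
    messages=[
        {
            "role": "user",
            "content": [
                {"text": "Extract the values shown in this chart."},
                {"image": {"format": "png", "source": {"bytes": image_bytes}}},
            ],
        }
    ],
)
print(response["output"]["message"]["content"][0]["text"])
```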

For more details, see the Mistral AI documentation.

6. Gemini 2.5 Pro

  • Key Features:

    • State-of-the-art multimodal model (audio, image, video, text input).
    • Best for complex coding, reasoning, and multimodal understanding.
  • Examples of how you can use it:

    • For advanced data analysis, handling multiple input types, or when you need the highest accuracy in complex tasks.
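
A minimal sketch using the google-genai Python SDK (assuming GEMINI_API_KEY is set in the environment) might look like this; the prompt is illustrative.

```python
from google import genai

client = genai.Client()  # reads GEMINI_API_KEY from the environment

response = client.models.generate_content(
    model="gemini-2.5-pro",
    contents="Given this dataset description, propose three testable hypotheses: ...",
)
print(response.text)
```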

For more details, see the Gemini documentation.

tip

How to choose an LLM? Here are some tips:

  • Do you need high accuracy and a large context? Use GPT-4.1 or GPT-5.
  • Do you need speed and low cost? Use mini or nano versions.
  • Do you need multimodal (image) support? Use GPT-4o, Llama 4, Mistral Pixtral, or Gemini 2.5 Pro.
  • Do you need open source? Use DeepSeek R1.
  • Do you need careful, nuanced conversation? Use Claude models.
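
To make the decision concrete, here is a toy routing helper that mirrors the tips above; the need labels and model names are illustrative, not a prescribed API.

```python
def suggest_model(needs: set[str]) -> str:
    """Toy example mirroring the tips above; labels and model names are illustrative."""
    if "open_source" in needs:
        return "deepseek-r1"
    if "multimodal" in needs:
        return "gemini-2.5-pro"
    if "careful_conversation" in needs:
        return "claude-sonnet-4"
    if "low_cost" in needs or "speed" in needs:
        return "gpt-5-nano"
    return "gpt-5"  # default to high accuracy and a large context


print(suggest_model({"multimodal"}))  # -> gemini-2.5-pro
```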