LLM Model Suggestions
1. OpenAI Models
GPT-4.1, GPT-4.1 mini, GPT-4.1 nano
Key Features:
- Excellent at following instructions and tool calling.
- 1M token context window (can handle huge inputs).
- Low latency, no explicit reasoning step.
Differences:
- Mini and Nano are lighter, faster, and more cost-effective, but may have slightly reduced performance compared to the full version.
Examples of how you can use it:
- GPT-4.1: For complex tasks needing high accuracy and a large context, such as analyzing long documents or advanced code generation.
- Mini/Nano: For faster, cost-sensitive tasks where slightly less accuracy is acceptable, such as chatbots, quick code snippets, and batch processing.
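A minimal sketch of calling these models through the OpenAI Python SDK (assumes the `openai` package and an `OPENAI_API_KEY` in the environment; the placeholder document and prompts are illustrative):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

long_document = "..."  # e.g. the full text of a contract or a large source file

# Full model: long-document analysis where accuracy matters
response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[
        {"role": "system", "content": "You are a careful technical analyst."},
        {"role": "user", "content": f"Summarize the key risks in this document:\n\n{long_document}"},
    ],
)
print(response.choices[0].message.content)

# Same call pointed at the lighter variant for quick, cost-sensitive tasks
snippet = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[{"role": "user", "content": "Write a Python one-liner that deduplicates a list while keeping order."}],
)
print(snippet.choices[0].message.content)
```

Switching between the full model and mini/nano is just a change of the `model` string, which makes it easy to benchmark cost versus quality on your own workload.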
GPT-4o, GPT-4o mini
Key Features:
- GPT-4o is the “omni” model, optimized for chat and API integrations.
- Mini version is smaller, faster, and efficient for coding and visual tasks.
- Both support text and image inputs (multimodal).
Examples of how you can use it:
- GPT-4o: For chat applications, conversational agents, and when you want to test the latest improvements.
- GPT-4o mini: For fast, affordable tasks like code review, visual data extraction, or when running many parallel requests.
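A sketch of a multimodal request, mixing text and an image URL in a single message (same `openai` SDK as above; the image URL is a placeholder you would replace):

```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",  # or "gpt-4o" when you want higher quality
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Extract the column headers and totals from this table screenshot."},
                {"type": "image_url", "image_url": {"url": "https://example.com/table.png"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```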
GPT-5, GPT-5 mini, GPT-5 nano
Key Features:
- GPT-5 is the flagship for coding, reasoning, and agentic tasks.
- Mini and Nano are optimized for speed and cost, with Nano being the fastest and cheapest.
Examples of how you can use it:
- GPT-5: For advanced coding, multi-step reasoning, and agent workflows, such as building autonomous agents and complex data analysis.
- Mini: For well-defined, precise tasks where speed matters; for example: automated testing, structured data extraction.
- Nano: For high-volume, simple tasks like summarization, classification, or when cost is a primary concern.
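A sketch of a single agent step with GPT-5 using tool (function) calling; the `get_open_tickets` tool and its schema are made up purely for illustration:

```python
import json
from openai import OpenAI

client = OpenAI()

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_open_tickets",  # hypothetical tool, defined only for this example
            "description": "Return the currently open support tickets for a customer.",
            "parameters": {
                "type": "object",
                "properties": {"customer_id": {"type": "string"}},
                "required": ["customer_id"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="gpt-5",
    messages=[{"role": "user", "content": "How many tickets does customer 42 have open?"}],
    tools=tools,
)

message = response.choices[0].message
if message.tool_calls:  # the model chose to call the hypothetical tool
    call = message.tool_calls[0]
    print(call.function.name, json.loads(call.function.arguments))
```

In a real agent loop you would execute the tool, append its result as a `tool` message, and call the model again until it produces a final answer.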
o3-mini, o4-mini
Key Features:
- Small, fast models for reasoning and structured outputs.
- Support function calling and batch API.
Examples of how you can use it:
- For lightweight applications, structured data generation, or when you need to process many requests quickly and cheaply.
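A sketch of getting structured JSON back from one of these small models via a JSON Schema `response_format`; the schema and field names are illustrative, and you should confirm structured outputs are available for the exact model you pick:

```python
import json
from openai import OpenAI

client = OpenAI()

schema = {
    "name": "invoice_fields",  # illustrative schema for extracting invoice data
    "schema": {
        "type": "object",
        "properties": {
            "vendor": {"type": "string"},
            "total": {"type": "number"},
            "currency": {"type": "string"},
        },
        "required": ["vendor", "total", "currency"],
        "additionalProperties": False,
    },
    "strict": True,
}

response = client.chat.completions.create(
    model="o4-mini",
    messages=[{"role": "user", "content": "Vendor: Acme Corp. Amount due: 1,250.00 EUR."}],
    response_format={"type": "json_schema", "json_schema": schema},
)
print(json.loads(response.choices[0].message.content))
```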
2. Anthropic Claude Models (via Bedrock)
Claude 3.5 Sonnet, Claude 3.7 Sonnet, Claude 4 Sonnet
Key Features:
- 3.5 Sonnet: Strong at step-by-step reasoning, careful with controversial topics.
- 3.7 Sonnet: Conversational and careful, and more proactive in dialogue than 3.5.
- 4 Sonnet: Excels at complex, long-running coding and agent tasks; the strongest choice for advanced coding and agent workflows.
Examples of how you can use it:
- 3.5/3.7 Sonnet: For thoughtful, nuanced conversations, content moderation, or when handling sensitive topics.
- 4 Sonnet: For complex coding, long-running tasks, or when sustained performance is needed in agentic workflows.
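Since these models are accessed through Amazon Bedrock, a minimal sketch using boto3's Converse API looks like the following (the model ID shown is one published Claude 3.5 Sonnet ID; verify it against the Bedrock catalog in your region, as some Claude versions require an inference-profile ID instead):

```python
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock.converse(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",  # verify against your region's catalog
    messages=[
        {
            "role": "user",
            "content": [{"text": "Explain, step by step, whether this refund request should be escalated: ..."}],
        }
    ],
    inferenceConfig={"maxTokens": 512, "temperature": 0.2},
)
print(response["output"]["message"]["content"][0]["text"])
```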
3. DeepSeek R1 (via Bedrock)
Key Features:
- Open source, strong at logical inference, math, and real-time decision-making.
Examples of how you can use it:
- For tasks requiring mathematical problem-solving, logic puzzles, or when you want an open-source alternative.
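The same Converse call works for DeepSeek R1 on Bedrock; only the model ID changes. The ID below is an assumption (R1 is typically exposed through a cross-region inference profile), so confirm the exact identifier in your console:

```python
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock.converse(
    modelId="us.deepseek.r1-v1:0",  # assumed inference-profile ID; verify before use
    messages=[
        {"role": "user", "content": [{"text": "A train leaves at 09:10 and arrives at 11:45. How long is the trip?"}]}
    ],
)
print(response["output"]["message"]["content"][0]["text"])
```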
4. Llama 4 Models
Llama 4 Maverick, Llama 4 Scout (via Bedrock)
Key Features:
- Multimodal (text, image), multilingual, and optimized for coding and tool calling.
- Scout is the smaller of the two and excels at summarization, parsing, and image grounding.
Examples of how you can use it:
- Maverick: For agentic systems, multilingual applications, and multimodal tasks.
- Scout: For summarizing large documents, parsing user activity, or answering visual questions.
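A sketch of using Llama 4 Scout on Bedrock for summarization; the model ID pattern is an assumption based on Meta's naming on Bedrock, so check the model catalog for the real identifier:

```python
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

long_report = "..."  # the document you want condensed

response = bedrock.converse(
    modelId="us.meta.llama4-scout-17b-instruct-v1:0",  # assumed ID; verify in your region's catalog
    messages=[
        {"role": "user", "content": [{"text": f"Summarize the following report in five bullet points:\n\n{long_report}"}]}
    ],
    inferenceConfig={"maxTokens": 400},
)
print(response["output"]["message"]["content"][0]["text"])
```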
5. Mistral Pixtral (via Bedrock)
Key Features:
- Multimodal, excels at understanding documents, charts, and images.
Examples of how you can use it:
- For extracting information from images, analyzing charts, or when you need strong text and image understanding.
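Because Pixtral is multimodal, a Converse request can attach raw image bytes alongside the text prompt. The sketch below uses Bedrock's standard image content block; the model ID is an assumption to be verified in the catalog, and the PNG path is a placeholder:

```python
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

with open("quarterly_chart.png", "rb") as f:  # placeholder chart image
    chart_bytes = f.read()

response = bedrock.converse(
    modelId="mistral.pixtral-large-2502-v1:0",  # assumed ID; verify in the Bedrock catalog
    messages=[
        {
            "role": "user",
            "content": [
                {"text": "What trend does this chart show, and what is the final value?"},
                {"image": {"format": "png", "source": {"bytes": chart_bytes}}},
            ],
        }
    ],
)
print(response["output"]["message"]["content"][0]["text"])
```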
6. Gemini 2.5 Pro
Key Features:
- State-of-the-art multimodal model (audio, image, video, text input).
- Best for complex coding, reasoning, and multimodal understanding.
Examples of how you can use it:
- For advanced data analysis, handling multiple input types, or when you need the highest accuracy in complex tasks.
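A minimal sketch using Google's `google-genai` SDK (assumes an API key such as GOOGLE_API_KEY in the environment; the prompt is a placeholder, and images, audio, or files can be passed in `contents` for multimodal input):

```python
from google import genai

client = genai.Client()  # picks up the API key from the environment

response = client.models.generate_content(
    model="gemini-2.5-pro",
    contents="Walk through this dataset description and propose three analyses: ...",
)
print(response.text)
```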
How to choose an LLM? Here are some tips:
- Do you need high accuracy and a large context? Use GPT-4.1 or GPT-5.
- Do you need speed and low cost? Use mini or nano versions.
- Do you need multimodal (image) support? Use GPT-4o, Llama 4, Mistral Pixtral, or Gemini 2.5 Pro.
- Do you need open source? Use DeepSeek R1.
- Do you need careful, nuanced conversation? Use Claude models.
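If it helps to encode those tips, here is a small, purely illustrative helper that maps the questions above to a default pick; the priorities and model names simply mirror this list and are not official guidance:

```python
def suggest_model(*, needs_multimodal: bool = False, needs_open_source: bool = False,
                  careful_dialogue: bool = False, cost_sensitive: bool = False,
                  large_context: bool = False) -> str:
    """Map the checklist above to a default model choice (illustrative only)."""
    if needs_open_source:
        return "DeepSeek R1"
    if needs_multimodal:
        return "gpt-4o-mini" if cost_sensitive else "gpt-4o"
    if careful_dialogue:
        return "Claude 3.7 Sonnet"
    if cost_sensitive:
        return "gpt-5-nano"
    if large_context:
        return "gpt-4.1"
    return "gpt-5"


print(suggest_model(needs_multimodal=True))   # -> gpt-4o
print(suggest_model(cost_sensitive=True))     # -> gpt-5-nano
```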