
| Difficulty: Intermediate | Category: AI Tools |
Building a Production-Ready Claude Chatbot After the Fable 5 / Mythos 5 Shutdown
In the past 48 hours, Anthropic terminated access to its Fable 5 and Mythos 5 model families following a government national security directive—a stark reminder that production AI systems can’t rely on single-model dependencies. If your chatbot hardcoded one of these models, your users are staring at error screens right now. By the end of this tutorial, you’ll deploy a resilient Claude-powered chatbot with automatic model fallbacks, compliance checking, and graceful degradation—battle-tested architecture that survives sudden API changes.
Prerequisites
Before you start, ensure you have:
- Python 3.11+ installed locally
- Anthropic API key (get one at console.anthropic.com—free tier includes $5 credit)
- anthropic SDK 0.28.0+ (pip install "anthropic>=0.28.0")
- python-dotenv 1.0+ for secure credential management
- Basic familiarity with async/await syntax in Python
Step-by-Step Guide
Step 1: Install Dependencies and Set Up Your Environment
Create a new project directory and install the required packages:
mkdir claude-chatbot-resilient && cd claude-chatbot-resilient
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install "anthropic>=0.28.0" python-dotenv # Quotes stop the shell from treating >= as a redirect
Create a .env file in your project root:
ANTHROPIC_API_KEY=sk-ant-api03-your-key-here
⚠️ WARNING: Never commit .env files to version control. Add .env to your .gitignore immediately.
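A startup check can make a missing key fail fast instead of surfacing as an authentication error on the first request. The following is a minimal sketch: require_api_key is a hypothetical helper (not part of the SDK), and the "sk-ant-" prefix check is an assumption based on the key format shown above.

```python
import os


def require_api_key(env=None) -> str:
    """Return the Anthropic API key, or raise a clear error if it is unusable.

    `env` defaults to os.environ; passing a plain dict makes the check testable.
    """
    env = os.environ if env is None else env
    key = (env.get("ANTHROPIC_API_KEY") or "").strip()
    if not key:
        raise RuntimeError("ANTHROPIC_API_KEY is not set; check your .env file")
    # Assumption: keys start with "sk-ant-", as in the example above.
    if not key.startswith("sk-ant-"):
        raise RuntimeError("ANTHROPIC_API_KEY does not look like an Anthropic key")
    return key
```

Calling require_api_key() right after load_dotenv() keeps configuration problems separate from API problems.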
Step 2: Define Your Model Fallback Chain
The Fable 5 / Mythos 5 shutdown demonstrates why you need ranked fallback models. Create config.py:
# config.py
MODEL_PRIORITY = [
    "claude-sonnet-4-5",           # Primary: most capable, $3.00/M input tokens
    "claude-haiku-4-5",            # Fallback: fast and cheap, $1.00/M input tokens
    "claude-3-5-sonnet-20240620",  # Legacy fallback for migration period (confirm it is still served)
]
RESTRICTED_MODELS = [
    "fable-5",
    "mythos-5",
]
MAX_TOKENS = 1024
TEMPERATURE = 0.7
Pro tip: Keep MODEL_PRIORITY in a config file or environment variable so you can update it without code changes when new models launch or old ones sunset.
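One way to follow that tip is to read the chain from an environment variable and fall back to built-in defaults. This is a sketch under assumptions: the variable name MODEL_PRIORITY and the comma-separated format are illustrative choices, and load_model_priority is a hypothetical helper.

```python
import os

DEFAULT_MODEL_PRIORITY = ["claude-sonnet-4-5", "claude-haiku-4-5"]


def load_model_priority(env=None) -> list[str]:
    """Read a comma-separated model list from MODEL_PRIORITY, else use defaults."""
    env = os.environ if env is None else env
    raw = env.get("MODEL_PRIORITY", "")
    # Ignore empty entries so a trailing comma or stray spaces are harmless
    models = [m.strip() for m in raw.split(",") if m.strip()]
    return models or list(DEFAULT_MODEL_PRIORITY)
```

With this in place, changing the chain is a matter of editing .env (for example MODEL_PRIORITY=claude-haiku-4-5,claude-sonnet-4-5) and restarting the process.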
Step 3: Build the Core Chatbot Class with Automatic Fallbacks
Create chatbot.py with a resilient client that automatically tries fallback models when the primary fails:
# chatbot.py
import os

from anthropic import Anthropic, APIError
from dotenv import load_dotenv

from config import MODEL_PRIORITY, RESTRICTED_MODELS, MAX_TOKENS, TEMPERATURE

load_dotenv()

class ResilientChatbot:
    def __init__(self):
        self.client = Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))
        self.conversation_history = []

    def send_message(self, user_input: str) -> dict:
        """
        Send a message with automatic model fallback.
        Returns: {"model": str, "content": str, "error": str|None}
        """
        self.conversation_history.append({
            "role": "user",
            "content": user_input
        })
        for model_id in MODEL_PRIORITY:
            try:
                response = self.client.messages.create(
                    model=model_id,
                    max_tokens=MAX_TOKENS,
                    temperature=TEMPERATURE,
                    messages=self.conversation_history
                )
                assistant_message = response.content[0].text
                self.conversation_history.append({
                    "role": "assistant",
                    "content": assistant_message
                })
                return {
                    "model": model_id,
                    "content": assistant_message,
                    "error": None
                }
            except APIError as e:
                # Connection errors carry no status code, so this may be None
                error_code = getattr(e, "status_code", None)
                # 403 = access denied (like Fable/Mythos shutdown),
                # 404 = model not found (a withdrawn or mistyped model ID)
                if error_code in (403, 404):
                    print(f"⚠️ Model {model_id} not accessible ({error_code}), trying fallback...")
                    continue
                # 429 = rate limit, 5xx = server error: try the next model
                if error_code in (429, 500, 502, 503):
                    print(f"⚠️ Model {model_id} unavailable ({error_code}), trying fallback...")
                    continue
                # Other errors (auth, invalid request) shouldn't fall back.
                # Drop the unanswered user turn so history keeps alternating.
                self.conversation_history.pop()
                return {
                    "model": model_id,
                    "content": None,
                    "error": f"API Error: {str(e)}"
                }
        # All models failed: drop the unanswered user turn as well
        self.conversation_history.pop()
        return {
            "model": None,
            "content": None,
            "error": "All models unavailable. Check API status."
        }

    def reset_conversation(self):
        """Clear conversation history for new session."""
        self.conversation_history = []
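Gotcha: The Anthropic SDK raises subclasses of APIError for failed requests. HTTP failures carry a status_code, while connection failures do not, which is why the code reads it with getattr. Always catch APIError specifically; a bare except will mask bugs in your own code.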
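The fallback decision is easier to test when it lives in a small pure function rather than inside the except block. The sketch below is one way to factor it out; classify_status and both code sets are hypothetical names, and including 404 assumes that a withdrawn model ID is reported as "not found". Adjust the sets to match your own send_message().

```python
# Assumption: these sets mirror the status codes handled in this step.
ACCESS_CODES = {403, 404}               # model restricted or no longer served
TRANSIENT_CODES = {429, 500, 502, 503}  # rate limit or server-side trouble


def classify_status(status_code) -> str:
    """Map an HTTP status (or None) to "access", "transient", or "fatal".

    "access" and "transient" both mean: try the next model in the chain.
    "fatal" means: stop and report the error (bad key, invalid request, ...).
    """
    if status_code in ACCESS_CODES:
        return "access"
    if status_code in TRANSIENT_CODES:
        return "transient"
    return "fatal"


def should_fallback(status_code) -> bool:
    return classify_status(status_code) != "fatal"
```

Inside the except block, should_fallback(getattr(e, "status_code", None)) then replaces the two separate if checks.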
Step 4: Add Compliance Checking for Restricted Models
Before the Fable/Mythos shutdown, some developers had hardcoded those model IDs. Add a validation layer:
# chatbot.py (add this method to ResilientChatbot class)
    def validate_model_access(self, requested_model: str) -> tuple[bool, str]:
        """
        Check if a requested model is restricted.
        Returns: (is_allowed: bool, message: str)
        """
        if requested_model.lower() in [m.lower() for m in RESTRICTED_MODELS]:
            return False, (
                f"Model '{requested_model}' is currently restricted by government order. "
                f"Falling back to {MODEL_PRIORITY[0]}."
            )
        return True, ""
Call this in any UI layer where users might specify models directly (admin panels, API endpoints).
Step 5: Implement Conversation Context Management
Real chatbots need memory limits—Claude’s context window is 200K tokens, but you’ll hit rate limits and cost issues before that. Add sliding window memory:
# chatbot.py (add to ResilientChatbot)
    MAX_HISTORY_MESSAGES = 20  # Keep last 10 exchanges (user+assistant pairs)

    def trim_history(self):
        """Keep conversation history under MAX_HISTORY_MESSAGES."""
        if len(self.conversation_history) > self.MAX_HISTORY_MESSAGES:
            # The Messages API takes the system prompt as a separate `system`
            # parameter, so history holds only user/assistant turns.
            trimmed = self.conversation_history[-self.MAX_HISTORY_MESSAGES:]
            # Start on a user turn so the history stays in user/assistant order
            while trimmed and trimmed[0]["role"] != "user":
                trimmed = trimmed[1:]
            self.conversation_history = trimmed
Call self.trim_history() at the end of send_message() before returning.
Pro tip: For production, store conversation history in Redis with a 24-hour TTL. In-memory state doesn’t survive restarts.
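A rough sketch of that idea follows. It assumes a redis-py style client passed in as store (anything exposing setex(name, time, value) and get(name)); the key layout and helper names are hypothetical.

```python
import json

HISTORY_TTL_SECONDS = 24 * 60 * 60  # 24 hours, as suggested above


def history_key(session_id: str) -> str:
    # Hypothetical key layout; pick whatever fits your namespace
    return f"chat:history:{session_id}"


def save_history(store, session_id: str, history: list[dict]) -> None:
    """Serialize the history to JSON and store it with a TTL."""
    store.setex(history_key(session_id), HISTORY_TTL_SECONDS, json.dumps(history))


def load_history(store, session_id: str) -> list[dict]:
    """Return the stored history, or an empty list if it is missing or expired."""
    raw = store.get(history_key(session_id))
    return json.loads(raw) if raw else []
```

With redis-py installed, store would typically be redis.Redis(host="localhost", port=6379). Calling save_history() after each exchange also refreshes the TTL, so only idle sessions expire.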
Step 6: Add Cost Tracking and Model Performance Logging
Track which models you’re actually using post-fallback and what they cost:
# chatbot.py
import time

class ResilientChatbot:
    def __init__(self):
        # ... existing init code ...
        self.usage_stats = {
            "total_requests": 0,
            "model_usage": {},
            "fallback_triggers": 0
        }

    def send_message(self, user_input: str) -> dict:
        start_time = time.time()
        self.usage_stats["total_requests"] += 1
        # ... existing send logic, with its result stored in response_dict
        # instead of being returned directly ...
        # After the fallback loop finishes, count only successful responses:
        model_used = response_dict["model"]
        if model_used and response_dict["error"] is None:
            self.usage_stats["model_usage"][model_used] = \
                self.usage_stats["model_usage"].get(model_used, 0) + 1
            # Any model other than the first in the chain means a fallback fired
            if MODEL_PRIORITY.index(model_used) > 0:
                self.usage_stats["fallback_triggers"] += 1
            print(f"✓ Responded with {model_used} in {time.time() - start_time:.2f}s")
        return response_dict
⚠️ WARNING: Token-level cost tracking requires reading response.usage (input_tokens and output_tokens) from each API response. For a quick input-side estimate: sonnet-4-5 = $3.00/M tokens, haiku-4-5 = $1.00/M tokens.
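To make the arithmetic concrete, here is a minimal sketch of a per-request estimate. estimate_cost_usd is a hypothetical helper; it expects an object shaped like response.usage, and the rates passed in the example (in particular the $15.00/M output rate) are assumptions to check against current pricing.

```python
from types import SimpleNamespace  # only used to fake a usage object below


def estimate_cost_usd(usage, input_per_m: float, output_per_m: float) -> float:
    """Estimate request cost from an object with input_tokens and output_tokens.

    Rates are in US dollars per million tokens.
    """
    total = usage.input_tokens * input_per_m + usage.output_tokens * output_per_m
    return total / 1_000_000


# Stand-in for response.usage so the example runs without an API call.
# Assumed rates: $3.00/M input, $15.00/M output.
fake_usage = SimpleNamespace(input_tokens=2_000, output_tokens=500)
cost = estimate_cost_usd(fake_usage, input_per_m=3.00, output_per_m=15.00)
print(f"Estimated cost: ${cost:.4f}")
```

Summing these estimates per model in usage_stats shows what the fallback chain actually costs over time.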
Step 7: Create a Simple CLI Interface
Make your chatbot testable immediately:
# main.py
from chatbot import ResilientChatbot

def main():
    bot = ResilientChatbot()
    print("Claude Chatbot (Resilient Edition)")
    print("Type 'quit' to exit, 'reset' to clear history, 'stats' for usage\n")
    while True:
        user_input = input("You: ").strip()
        if user_input.lower() == "quit":
            break
        elif user_input.lower() == "reset":
            bot.reset_conversation()
            print("🔄 Conversation reset\n")
            continue
        elif user_input.lower() == "stats":
            print(f"\n📊 Usage Stats: {bot.usage_stats}\n")
            continue
        if not user_input:
            continue
        response = bot.send_message(user_input)
        if response["error"]:
            print(f"❌ Error: {response['error']}\n")
        else:
            print(f"Assistant ({response['model']}): {response['content']}\n")

if __name__ == "__main__":
    main()
Run it: python main.py
Practical Example: Complete Working Chatbot
Here’s a full working example you can copy-paste and run immediately:
# complete_example.py
import os

from anthropic import Anthropic, APIError
from dotenv import load_dotenv

load_dotenv()

MODEL_CHAIN = ["claude-sonnet-4-5", "claude-haiku-4-5"]

def chat_with_fallback(user_message: str):
    client = Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))
    for model in MODEL_CHAIN:
        try:
            response = client.messages.create(
                model=model,
                max_tokens=1024,
                messages=[{"role": "user", "content": user_message}]
            )
            print(f"✓ Using model: {model}")
            return response.content[0].text
        except APIError as e:
            # 403 = access denied, 404 = model not found: try the next model
            if getattr(e, "status_code", None) in (403, 404):
                print(f"⚠️ {model} restricted or missing, trying next...")
                continue
            # Anything else (bad key, invalid request) is re-raised unchanged
            raise
    return "❌ All models unavailable"

# Test it
if __name__ == "__main__":
    result = chat_with_fallback("Explain the Fable 5 shutdown in one sentence.")
    print(f"\nResponse: {result}")
Save this as complete_example.py, run python complete_example.py, and you’ll see the fallback mechanism in action: if sonnet-4-5 is rejected as restricted or missing, the loop switches to haiku-4-5, and any other API error is re-raised.
Debugging Common Issues
Error: anthropic.AuthenticationError: Invalid API key
Cause: Missing or malformed ANTHROPIC_API_KEY in .env
Fix: Regenerate your key at console.anthropic.com/settings/keys and verify it starts with sk-ant-api03-
Error: anthropic.NotFoundError: model 'fable-5' not found
Cause: Requesting a restricted/deprecated model directly
Fix: Remove fable-5 and mythos-5 from any hardcoded model strings. Use the MODEL_PRIORITY list instead.
Error: Rate limit exceeded (429) even with fallbacks
Cause: All models in your chain are rate-limited simultaneously
Fix: Add exponential backoff with time.sleep() between retries, or raise your rate limits by moving to a higher usage tier in the Anthropic Console.
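A minimal sketch of that backoff, using capped exponential delays with random jitter. The helper names, defaults, and the is_retryable callback are all illustrative; sleep is injectable so the logic can be tested without waiting.

```python
import random
import time


def backoff_delays(max_retries: int = 4, base: float = 1.0, cap: float = 30.0):
    """Yield capped exponential delays: base, 2*base, 4*base, ... up to cap."""
    for attempt in range(max_retries):
        yield min(cap, base * (2 ** attempt))


def call_with_backoff(fn, is_retryable, max_retries: int = 4, sleep=time.sleep):
    """Call fn(); on a retryable exception, wait and try again.

    Re-raises the last exception once retries are exhausted, and re-raises
    immediately when is_retryable(exc) is False.
    """
    delays = list(backoff_delays(max_retries))
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception as exc:
            if attempt == max_retries or not is_retryable(exc):
                raise
            # Full jitter spreads out clients that were throttled together
            sleep(random.uniform(0, delays[attempt]))
```

In the chatbot, fn could wrap one pass over the model chain and is_retryable could check for a 429 status code. The SDK client also accepts a max_retries option, which may be enough on its own for short rate-limit spikes.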
Error: Conversation context growing too large (>200K tokens)
Cause: Not trimming conversation_history
Fix: Call trim_history() after every message exchange, or implement token counting with client.messages.count_tokens() (available in recent SDK releases).
Key Takeaways
- Model fallbacks are mandatory for production: The Fable 5 / Mythos 5 shutdown proves that any single model can vanish with zero notice. Always define a ranked fallback chain.
- Cost-optimize with tiered models: Sonnet input tokens cost 3× as much as Haiku's ($3.00 vs. $1.00 per million). Use Haiku for simple queries, Sonnet only when complexity demands it.
- Track which models you’re actually using: Log model selection per request—you might discover 80% of queries work fine on the cheaper fallback.
- Graceful degradation beats hard errors: Return a helpful error message when all models fail; don’t crash the user’s session.
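The tiered-model takeaway can be sketched as a small router that orders the chain per request. Everything here is an assumption made for illustration: the keyword list, the length threshold, and the helper names are hypothetical heuristics, not a tested policy.

```python
CHEAP_MODEL = "claude-haiku-4-5"
CAPABLE_MODEL = "claude-sonnet-4-5"
# Hypothetical hints that a prompt needs the more capable model
COMPLEX_HINTS = ("analyze", "refactor", "prove", "step by step", "debug")


def pick_model(prompt: str, long_prompt_chars: int = 1500) -> str:
    """Choose the capable model for long or complex-looking prompts."""
    text = prompt.lower()
    if len(text) > long_prompt_chars or any(hint in text for hint in COMPLEX_HINTS):
        return CAPABLE_MODEL
    return CHEAP_MODEL


def ordered_chain(prompt: str) -> list[str]:
    """Put the picked model first and keep the other one as its fallback."""
    first = pick_model(prompt)
    second = CAPABLE_MODEL if first == CHEAP_MODEL else CHEAP_MODEL
    return [first, second]
```

Logging which model was picked for each request, alongside user feedback, is what tells you whether heuristics like these are routing sensibly.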
What’s Next
Now that you have a resilient chatbot, explore function calling with Claude tools to let your bot take actions (search databases, call APIs) instead of just answering questions—tutorial coming in tomorrow’s issue.
Key Takeaway: With Anthropic restricting access to certain Claude models under government orders, developers need to architect chatbots with model fallbacks and compliance checks built-in from day one. This tutorial shows you how to build a resilient multi-model chatbot that gracefully handles API restrictions.
This report was produced with AI-assisted research and drafting, curated and reviewed under AtlasSignal’s editorial standards. For corrections or feedback, contact atlassignal.ai@gmail.com.