
How to Integrate OpenAI and Anthropic Claude into Your Web App

A technical guide to integrating GPT-4 and Claude into your Next.js application, including API setup, streaming responses, error handling, and cost optimization.

Let's build a real AI integration. Not a toy example—production-ready code you can ship.

Prerequisites

Before you start:

  • Next.js 14+ application
  • Node.js 18+ environment
  • Basic understanding of API routes
  • OpenAI and/or Anthropic API keys

Part 1: API Setup

Getting Your API Keys

OpenAI:

1. Go to platform.openai.com
2. Create account → API Keys
3. Generate a new key
4. Add $5-10 of credit to start

Anthropic:

1. Visit console.anthropic.com
2. Request API access (usually instant)
3. Generate an API key
4. No minimum balance required

Environment Configuration

Create .env.local:

OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...

Add to .gitignore:

.env*.local

Never commit API keys to version control.
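
To catch a missing key at startup rather than on the first request, you can add a small guard. A minimal sketch (the lib/env.ts path is just a suggestion):

// lib/env.ts: fail fast if a required key is missing
export function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing environment variable: ${name}`);
  }
  return value;
}

// Usage in a route:
// const openai = new OpenAI({ apiKey: requireEnv('OPENAI_API_KEY') });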

Part 2: OpenAI Integration

Install SDK

npm install openai

Basic Implementation

Create app/api/chat/openai/route.ts:

import OpenAI from 'openai';
import { NextRequest, NextResponse } from 'next/server';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

export async function POST(req: NextRequest) {
  try {
    const { messages } = await req.json();

    const completion = await openai.chat.completions.create({
      model: 'gpt-4',
      messages,
      temperature: 0.7,
      max_tokens: 1000,
    });

    return NextResponse.json({
      message: completion.choices[0].message,
      usage: completion.usage,
    });
  } catch (error) {
    console.error('OpenAI Error:', error);
    return NextResponse.json(
      { error: 'AI request failed' },
      { status: 500 }
    );
  }
}

Frontend Component

Create components/ChatOpenAI.tsx:

'use client';
import { useState } from 'react';

export default function ChatOpenAI() {
  const [messages, setMessages] = useState([
    { role: 'system', content: 'You are a helpful assistant.' },
  ]);
  const [input, setInput] = useState('');
  const [loading, setLoading] = useState(false);

  const sendMessage = async () => {
    if (!input.trim()) return;

    const userMessage = { role: 'user', content: input };
    const newMessages = [...messages, userMessage];
    setMessages(newMessages);
    setInput('');
    setLoading(true);

    try {
      const response = await fetch('/api/chat/openai', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ messages: newMessages }),
      });

      const data = await response.json();

      if (data.message) {
        setMessages([...newMessages, data.message]);
      }
    } catch (error) {
      console.error('Error:', error);
    } finally {
      setLoading(false);
    }
  };

  return (
    <div className="flex flex-col h-screen max-w-2xl mx-auto p-4">
      <div className="flex-1 overflow-y-auto space-y-4 mb-4">
        {messages.filter((m) => m.role !== 'system').map((msg, i) => (
          <div
            key={i}
            className={`p-4 rounded-lg ${
              msg.role === 'user' ? 'bg-blue-100 ml-auto' : 'bg-gray-100'
            } max-w-[80%]`}
          >
            {msg.content}
          </div>
        ))}
        {loading && <div className="text-gray-500">Thinking...</div>}
      </div>

      <div className="flex gap-2">
        <input
          value={input}
          onChange={(e) => setInput(e.target.value)}
          onKeyDown={(e) => e.key === 'Enter' && sendMessage()}
          placeholder="Type your message..."
          className="flex-1 p-3 border rounded-lg"
        />
        <button
          onClick={sendMessage}
          disabled={loading}
          className="px-6 py-3 bg-blue-600 text-white rounded-lg disabled:opacity-50"
        >
          Send
        </button>
      </div>
    </div>
  );
}

Part 3: Streaming Responses (Better UX)

Users hate waiting. Streaming shows results immediately.

Streaming API Route

First, install Vercel's AI SDK:

npm install ai

Then update app/api/chat/openai/route.ts:

import OpenAI from 'openai';
import { OpenAIStream, StreamingTextResponse } from 'ai';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

export async function POST(req: Request) {
  const { messages } = await req.json();

  const response = await openai.chat.completions.create({
    model: 'gpt-4',
    stream: true,
    messages,
  });

  const stream = OpenAIStream(response);
  return new StreamingTextResponse(stream);
}

Streaming Frontend

Update component:

'use client';
import { useChat } from 'ai/react';

export default function ChatOpenAI() {
  const { messages, input, handleInputChange, handleSubmit, isLoading } =
    useChat({
      api: '/api/chat/openai',
    });

  return (
    <div className="flex flex-col h-screen max-w-2xl mx-auto p-4">
      <div className="flex-1 overflow-y-auto space-y-4 mb-4">
        {messages.map((msg) => (
          <div
            key={msg.id}
            className={`p-4 rounded-lg ${
              msg.role === 'user' ? 'bg-blue-100 ml-auto' : 'bg-gray-100'
            } max-w-[80%]`}
          >
            {msg.content}
          </div>
        ))}
      </div>

      <form onSubmit={handleSubmit} className="flex gap-2">
        <input
          value={input}
          onChange={handleInputChange}
          placeholder="Type your message..."
          className="flex-1 p-3 border rounded-lg"
        />
        <button
          type="submit"
          disabled={isLoading}
          className="px-6 py-3 bg-blue-600 text-white rounded-lg disabled:opacity-50"
        >
          Send
        </button>
      </form>
    </div>
  );
}

Result: users see the response appear word by word, so perceived latency drops sharply even though total generation time is unchanged.

Part 4: Anthropic Claude Integration

Install SDK

npm install @anthropic-ai/sdk

Claude API Route

Create app/api/chat/claude/route.ts:

import Anthropic from '@anthropic-ai/sdk';
import { AnthropicStream, StreamingTextResponse } from 'ai';

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

export async function POST(req: Request) {
  const { messages } = await req.json();

  const response = await anthropic.messages.create({
    model: 'claude-3-5-sonnet-20241022',
    max_tokens: 1024,
    stream: true,
    messages,
  });

  const stream = AnthropicStream(response);
  return new StreamingTextResponse(stream);
}

Frontend (Same as OpenAI)

Just change the API endpoint:

const { messages, input, handleInputChange, handleSubmit } = useChat({
  api: '/api/chat/claude', // Changed this line
});

Part 5: Advanced Features

System Prompts

Shape AI behavior:

const messages = [
  {
    role: 'system',
    content: 'You are an expert software engineer specializing in React and Next.js. Provide concise, production-ready code examples.'
  },
  ...userMessages
];

Temperature Control

  • 0.0-0.3: Deterministic, factual (code generation, data extraction)
  • 0.4-0.7: Balanced (general chat, Q&A)
  • 0.8-1.0: Creative (content writing, brainstorming)
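
For example, a code-generation call might pin the temperature low (the value here is illustrative):

const completion = await openai.chat.completions.create({
  model: 'gpt-4',
  messages,
  temperature: 0.2, // low temperature for more deterministic output
});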

Token Limits

Control output length and cost:

{
  max_tokens: 500, // Short responses
  // vs
  max_tokens: 2000, // Detailed responses
}

Context Management

Limit conversation history to save costs:

const recentMessages = messages.slice(-10); // Last 10 messages only
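
For longer sessions, a token-aware cutoff works better than a fixed message count. A minimal sketch, assuming roughly 4 characters per token (a common approximation; use a real tokenizer for accuracy):

function trimHistory(
  messages: { role: string; content: string }[],
  maxTokens = 3000
) {
  // Rough heuristic: ~4 characters per token
  const estimateTokens = (text: string) => Math.ceil(text.length / 4);
  const [system, ...rest] = messages; // assumes messages[0] is the system prompt
  const kept: typeof rest = [];
  let used = estimateTokens(system.content);

  // Walk backwards from the newest message until the budget is spent
  for (let i = rest.length - 1; i >= 0; i--) {
    used += estimateTokens(rest[i].content);
    if (used > maxTokens) break;
    kept.unshift(rest[i]);
  }

  return [system, ...kept];
}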

Part 6: Error Handling

Production apps need robust error handling:

export async function POST(req: Request) {
  try {
    const { messages } = await req.json();

    // Validate input
    if (!messages || !Array.isArray(messages)) {
      return NextResponse.json(
        { error: 'Invalid messages format' },
        { status: 400 }
      );
    }

    const response = await openai.chat.completions.create({
      model: 'gpt-4',
      messages,
      stream: true,
    });

    return new StreamingTextResponse(OpenAIStream(response));
  } catch (error: any) {
    // Rate limited by the provider
    if (error.status === 429) {
      return NextResponse.json(
        { error: 'Too many requests. Please try again later.' },
        { status: 429 }
      );
    }

    // Invalid API key
    if (error.status === 401) {
      console.error('Invalid API key');
      return NextResponse.json(
        { error: 'Configuration error' },
        { status: 500 }
      );
    }

    // Generic error
    console.error('AI Error:', error);
    return NextResponse.json(
      { error: 'Failed to process request' },
      { status: 500 }
    );
  }
}

Part 7: Cost Optimization

AI APIs aren't cheap at scale. Optimize:

1. Prompt Caching

Reuse system prompts:

// OpenAI applies prompt caching automatically to long, repeated
// prompt prefixes (roughly 1,024+ tokens)
const systemPrompt = {
  role: 'system',
  content: LONG_SYSTEM_PROMPT, // the stable prefix benefits most
};

2. Model Selection

Use cheaper models when possible:

// For simple tasks
model: 'gpt-3.5-turbo' // roughly an order of magnitude cheaper than GPT-4

// For complex tasks
model: 'gpt-4'

3. Token Limits

Prevent runaway costs:

{
  max_tokens: 500, // Cap response length
}

4. User Rate Limiting

import { Ratelimit } from '@upstash/ratelimit';
import { Redis } from '@upstash/redis';
import { NextResponse } from 'next/server';

const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.slidingWindow(10, '1 m'), // 10 requests per minute
});

export async function POST(req: Request) {
  const ip = req.headers.get('x-forwarded-for') ?? 'unknown';
  const { success } = await ratelimit.limit(ip);

  if (!success) {
    return NextResponse.json(
      { error: 'Rate limit exceeded' },
      { status: 429 }
    );
  }

  // ... rest of handler
}

Part 8: User Usage Tracking

Track per-user AI usage:

// lib/db.ts
export async function trackUsage(userId: string, tokens: number) {
  await db.usage.create({
    data: {
      userId,
      tokens,
      cost: tokens * 0.00003, // approximate GPT-4 output rate per token; adjust per model
      timestamp: new Date(),
    },
  });
}

// In the API route:
const completion = await openai.chat.completions.create({ ... });
// Note: usage is only populated on non-streaming completions
const tokensUsed = completion.usage?.total_tokens || 0;
await trackUsage(userId, tokensUsed);

Part 9: Multi-Model Support

Let users choose their AI:

export async function POST(req: Request) {
  const { messages, model } = await req.json();

  if (model === 'claude') {
    // Use Claude
    const response = await anthropic.messages.create({
      model: 'claude-3-5-sonnet-20241022',
      max_tokens: 1024,
      stream: true,
      messages,
    });
    return new StreamingTextResponse(AnthropicStream(response));
  } else {
    // Use OpenAI
    const response = await openai.chat.completions.create({
      model: 'gpt-4',
      stream: true,
      messages,
    });
    return new StreamingTextResponse(OpenAIStream(response));
  }
}

Part 10: Testing Your Integration

Local Testing

npm run dev

Visit http://localhost:3000 and test:

  • Send messages
  • Check streaming works
  • Verify error handling
  • Test rate limits
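
You can also hit the route directly from the command line as a quick sanity check (-N disables curl's buffering, so streamed chunks print as they arrive):

curl -N -X POST http://localhost:3000/api/chat/openai \
  -H 'Content-Type: application/json' \
  -d '{"messages":[{"role":"user","content":"Hello"}]}'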

Load Testing

Use tools like:

  • Apache Bench (ab)
  • k6
  • Artillery

Example:

ab -n 100 -c 10 -p payload.json -T application/json http://localhost:3000/api/chat/openai

Common Pitfalls

1. API Keys in Frontend

❌ Never:

// DON'T DO THIS
const openai = new OpenAI({
  apiKey: 'sk-...' // Exposed to users!
});

✅ Always use API routes (server-side).

2. No Timeout Handling

AI APIs can hang. Set timeouts:

const controller = new AbortController();
const timeout = setTimeout(() => controller.abort(), 30000); // 30s

// Pass the signal via the SDK's request options (second argument)
const response = await openai.chat.completions.create(
  { model: 'gpt-4', messages },
  { signal: controller.signal }
);

clearTimeout(timeout);

3. Ignoring Token Limits

Models have maximum context windows:

  • GPT-4: 8K-128K tokens, depending on the variant
  • Claude: 200K tokens

Monitor and truncate conversation history (see the trimming sketch in Part 5).

4. No User Feedback Loop

Add feedback buttons (👍/👎) to improve prompts over time.
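
A minimal sketch of such a feedback endpoint (the route path and the db.feedback model are hypothetical placeholders for your own schema):

// app/api/feedback/route.ts (hypothetical route)
import { NextRequest, NextResponse } from 'next/server';
import { db } from '@/lib/db'; // your own persistence layer

export async function POST(req: NextRequest) {
  const { messageId, rating } = await req.json();

  // Accept only a simple thumbs-up / thumbs-down signal
  if (!messageId || !['up', 'down'].includes(rating)) {
    return NextResponse.json({ error: 'Invalid feedback' }, { status: 400 });
  }

  await db.feedback.create({
    data: { messageId, rating, timestamp: new Date() },
  });

  return NextResponse.json({ ok: true });
}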

Production Checklist

Before deploying:

✅ API keys in environment variables
✅ Error handling implemented
✅ Rate limiting in place
✅ Usage tracking enabled
✅ Streaming responses working
✅ Mobile responsive
✅ Loading states clear
✅ Cost monitoring set up

Next Steps

Now that you have working AI integration:

1. Add RAG (Retrieval-Augmented Generation) for knowledge bases
2. Implement function calling for tool use
3. Add conversation memory (database storage)
4. Create prompt templates for common use cases
5. Build analytics to track AI performance

We Handle All of This

Building AI integrations from scratch takes time. We've done it dozens of times and can ship your complete AI app in 7 days.

Includes:

  • Production-ready AI integration
  • Streaming responses
  • Error handling
  • Rate limiting
  • Usage tracking
  • Cost optimization

Ready to skip the learning curve? Start your build →

Ready to Build Your AI Product?

We'll turn your AI idea into a production-ready application in just 7 days. No fluff, no overhead—just clean code that converts.

Start Your Build