Streaming AI Responses in the Browser: SSE + Vue Composables

If you've ever used ChatGPT, you've seen it — text appearing word by word, like someone is typing in real time. It feels fast, responsive, and almost human. But behind that smooth experience there's a simple and elegant protocol: Server-Sent Events (SSE).

In this post, we'll build that exact experience using Vue 3 composables and a Nitro server endpoint. No WebSocket complexity, no polling hacks. Just a clean, one-way stream from server to browser.


Why streaming matters

When you call an LLM API the traditional way (a single request, wait for the full response), the user stares at a loading spinner for seconds. That's a bad experience, especially when the answer is long.

Streaming changes the game: the user starts reading the response immediately, while the model is still thinking. The perceived performance is far better, even if the total time is the same.

And the best part? You don't need WebSockets for this. SSE is built into every browser, it's simple, and it works perfectly for this use case — because AI responses are one-directional.


What are Server-Sent Events?

SSE is a web standard that lets the server push data to the client over a single, long-lived HTTP connection. Think of it as a one-way WebSocket — the server sends, the browser listens. That's it.

The browser opens a connection using EventSource (or a simple fetch with stream reading), and the server sends chunks of data as they become available.

For AI streaming, each chunk is typically a few tokens of text. The frontend appends them in real time.
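If you've never seen it in action, the stream is just plain-text frames separated by blank lines, and the browser even ships a dedicated API for it. Here's a minimal EventSource sketch against a hypothetical GET endpoint (we won't use EventSource in this post, since it can't send a POST body, but it shows how little the protocol demands of the client):

// A hypothetical GET endpoint that emits SSE frames, e.g.:
//   data: Hello
//
//   data:  world
//
const source = new EventSource('/api/ai/events');

source.onmessage = (event) => {
  // event.data is whatever followed "data: " in the frame
  console.log('chunk:', event.data);
};

source.onerror = () => {
  // EventSource reconnects automatically; close it when you're done
  source.close();
};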


The composable: useAIStream

Let's start from the frontend. We want a composable that:

  • Takes a prompt
  • Streams the response token by token
  • Exposes reactive state (the text so far, loading, errors)

// composables/useAIStream.ts
import { ref } from 'vue';

export function useAIStream() {
  const output = ref('');
  const isStreaming = ref(false);
  const error = ref<string | null>(null);

  async function send(prompt: string) {
    output.value = '';
    error.value = null;
    isStreaming.value = true;

    try {
      const response = await fetch('/api/ai/stream', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ prompt }),
      });

      if (!response.ok || !response.body) {
        throw new Error('Stream connection failed');
      }

      const reader = response.body.getReader();
      const decoder = new TextDecoder();
      let buffer = '';

      while (true) {
        const { done, value } = await reader.read();
        if (done) break;

        // Keep any partial line around until the next chunk completes it
        buffer += decoder.decode(value, { stream: true });
        const lines = buffer.split('\n');
        buffer = lines.pop() ?? '';

        for (const line of lines) {
          if (!line.startsWith('data: ')) continue;

          const data = line.slice(6);
          if (data === '[DONE]') return;

          // Each data line carries a JSON-encoded string of tokens
          output.value += JSON.parse(data);
        }
      }
    } catch (err: any) {
      error.value = err.message ?? 'Something went wrong';
    } finally {
      isStreaming.value = false;
    }
  }

  return { output, isStreaming, error, send };
}

A few things to notice here:

  • We're using the Fetch API with response.body.getReader() instead of EventSource. This gives us more control — we can send POST requests with a body, which EventSource doesn't support.
  • The TextDecoder with { stream: true } handles multi-byte characters that get split across chunks, and the buffer holds on to any partial SSE line until the next read completes it.
  • Each SSE line starts with data: and carries a JSON-encoded string of tokens, with [DONE] as the end signal, the same conventions OpenAI's streaming API uses.

The server: streaming from Nitro

Now the backend. In Nuxt, we create a Nitro event handler that connects to an AI provider and streams the response back.

// server/api/ai/stream.post.ts
export default defineEventHandler(async (event) => {
  const { prompt } = await readBody(event);

  const response = await fetch('https://api.anthropic.com/v1/messages', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-api-key': process.env.ANTHROPIC_API_KEY!,
      'anthropic-version': '2023-06-01',
    },
    body: JSON.stringify({
      model: 'claude-sonnet-4-5',
      max_tokens: 1024,
      stream: true,
      messages: [{ role: 'user', content: prompt }],
    }),
  });

  if (!response.ok || !response.body) {
    throw createError({ statusCode: 500, statusMessage: 'AI request failed' });
  }

  // Set SSE headers
  setResponseHeaders(event, {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    Connection: 'keep-alive',
  });

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  const encoder = new TextEncoder();

  const stream = new ReadableStream({
    async start(controller) {
      let buffer = '';

      while (true) {
        const { done, value } = await reader.read();
        if (done) {
          controller.enqueue(encoder.encode('data: [DONE]\n\n'));
          controller.close();
          break;
        }

        // Buffer partial lines: the provider's SSE frames can be split across reads
        buffer += decoder.decode(value, { stream: true });
        const lines = buffer.split('\n');
        buffer = lines.pop() ?? '';

        for (const line of lines) {
          if (!line.startsWith('data: ')) continue;

          try {
            const json = JSON.parse(line.slice(6));
            if (json.type === 'content_block_delta') {
              const text = json.delta?.text ?? '';
              // JSON-encode the tokens so newlines survive the SSE framing
              controller.enqueue(
                encoder.encode(`data: ${JSON.stringify(text)}\n\n`),
              );
            }
          } catch {
            // Ignore lines that aren't valid JSON
          }
        }
      }
    },
  });

  return sendStream(event, stream);
});

This handler acts as a proxy — it takes the prompt, sends it to the AI API with stream: true, and re-streams the response to the browser in SSE format. The client never talks to the AI API directly, which keeps your API key safe on the server.
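Because every request funnels through this handler, it's also the natural place for guards you'd never trust the client with. As a minimal sketch, you could validate the prompt right after readBody before forwarding it (the limits here are just illustrative):

// Right after `const { prompt } = await readBody(event);`
if (typeof prompt !== 'string' || !prompt.trim()) {
  throw createError({ statusCode: 400, statusMessage: 'Prompt is required' });
}

if (prompt.length > 4000) {
  // Illustrative cap: tune it to your model's context window and your budget
  throw createError({ statusCode: 413, statusMessage: 'Prompt is too long' });
}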


The component

Now let's put it all together in a simple chat-like component:

<template>
  <div class="chat">
    <form @submit.prevent="handleSubmit">
      <input
        v-model="prompt"
        placeholder="Ask me anything..."
        :disabled="isStreaming"
      />
      <button type="submit" :disabled="isStreaming || !prompt.trim()">
        Send
      </button>
    </form>

    <div v-if="output" class="response">
      <p>{{ output }}</p>
      <span v-if="isStreaming" class="cursor">|</span>
    </div>

    <p v-if="error" class="error">{{ error }}</p>
  </div>
</template>

<script setup lang="ts">
import { ref } from 'vue';
import { useAIStream } from '~/composables/useAIStream';

const { output, isStreaming, error, send } = useAIStream();
const prompt = ref('');

async function handleSubmit() {
  const text = prompt.value.trim();
  if (!text) return;
  prompt.value = '';
  await send(text);
}
</script>

<style scoped>
.chat {
  max-width: 640px;
  margin: 0 auto;
}

.response {
  margin-top: 1rem;
  padding: 1rem;
  background: var(--bg-secondary);
  border-radius: 8px;
  white-space: pre-wrap;
}

.cursor {
  animation: blink 0.8s step-end infinite;
}

@keyframes blink {
  50% {
    opacity: 0;
  }
}

.error {
  color: #e11d48;
  margin-top: 0.5rem;
}
</style>

That blinking cursor at the end is a nice touch — it gives the user a visual cue that the response is still coming in.


Adding abort support

One thing users expect is the ability to stop a response mid-stream. This is easy to add with AbortController:

// Inside useAIStream.ts
const controller = ref<AbortController | null>(null);

async function send(prompt: string) {
  controller.value = new AbortController();
  // ...
  const response = await fetch('/api/ai/stream', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ prompt }),
    signal: controller.value.signal,
  });
  // ...rest of the logic
}

function abort() {
  controller.value?.abort();
  isStreaming.value = false;
}

return { output, isStreaming, error, send, abort };

Now you can expose a "Stop" button in the UI and call abort() — the stream closes cleanly.
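One detail worth handling: when you abort, the in-flight fetch (or read) rejects with an AbortError, which would otherwise land in the catch block and show up as an error message. A small tweak to the catch block from earlier keeps a user-initiated cancel silent:

    } catch (err: any) {
      // A user-initiated abort isn't a failure; only surface real errors
      if (err.name !== 'AbortError') {
        error.value = err.message ?? 'Something went wrong';
      }
    } finally {
      isStreaming.value = false;
    }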


Key takeaways

  • SSE > WebSockets for AI streaming — it's simpler, one-directional, and perfectly suited for this use case.
  • Composables make the streaming logic reusable across any component. You could use useAIStream in a chat page, a search bar, or a writing assistant — same composable, different UI.
  • Nitro as a proxy keeps your API keys safe and gives you a place to add rate limiting, caching, or prompt validation before hitting the AI.
  • AbortController is essential for a good UX — always let users stop a stream.

The streaming pattern is becoming the standard for AI-powered interfaces. Once you've built it, you'll find yourself using it everywhere.


TL;DR: Use SSE + Vue composables to stream AI responses token by token. It's simpler than WebSockets, feels instant to users, and the composable pattern makes it reusable across your entire app.