AI SDK

The AI SDK by Vercel is the AI Toolkit for TypeScript and JavaScript from the creators of Next.js. It is a free, open-source library that gives you the tools you need to build AI-powered products.

Better still, it works with ByteCompute's LLMs out of the box. For full details, check the AI SDK docs.

Install AI SDK

npm install ai @ai-sdk/ByteCompute

LLM Examples

The examples below show how to use the AI SDK with ByteCompute's large language models. Make sure to get your API key from ByteCompute first: you have to log in and create a token.
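
The examples read the token from an environment variable. Here is a minimal setup sketch, assuming you export the token under the name ByteCompute_TOKEN (the variable name is just a convention; use any name that matches your code):

// In your shell, before running the examples:
//   export ByteCompute_TOKEN="your-api-key"
import { createByteCompute } from "@ai-sdk/ByteCompute";

const ByteCompute = createByteCompute({
  // Read the key from the environment instead of hard-coding it.
  apiKey: process.env.ByteCompute_TOKEN,
});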

Text Generation

import { createByteCompute } from "@ai-sdk/ByteCompute";
import { generateText } from "ai";

const ByteCompute = createByteCompute({
  apiKey: process.env.ByteCompute_TOKEN,
});

const { text, usage, finishReason } = await generateText({
  model: ByteCompute("meta-llama/Llama-3.3-70B-Instruct-Turbo"),
  prompt: "Write a vegetarian lasagna recipe for 4 people.",
});

console.log(text);
console.log(usage);
console.log(finishReason);
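
Because the call goes over the network, it can fail. A plain try/catch around generateText is enough for a first pass (a sketch, reusing the setup from above):

try {
  const { text } = await generateText({
    model: ByteCompute("meta-llama/Llama-3.3-70B-Instruct-Turbo"),
    prompt: "Write a vegetarian lasagna recipe for 4 people.",
  });
  console.log(text);
} catch (error) {
  // Network problems, invalid credentials, or rate limits all end up here.
  console.error("Generation failed:", error);
}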

You can improve the answers further by providing a system message.

import { createByteCompute } from "@ai-sdk/ByteCompute";
import { generateText } from "ai";

const ByteCompute = createByteCompute({
  apiKey: process.env.ByteCompute_TOKEN,
});

const { text, usage, finishReason } = await generateText({
  model: ByteCompute("meta-llama/Llama-3.3-70B-Instruct-Turbo"),
  prompt: "Write a vegetarian lasagna recipe for 4 people.",
  system:
    "You are a professional writer. " +
    "You write simple, clear, and concise content.",
});

console.log(text);
console.log(usage);
console.log(finishReason);

Streaming

Generating text is nice, but your users don't want to wait while a large amount of text is being generated. For those use cases, you can use streaming.

import { createByteCompute } from "@ai-sdk/ByteCompute";
import { streamText } from "ai";

const ByteCompute = createByteCompute({
  apiKey: process.env.ByteCompute_TOKEN,
});

const result = streamText({
  model: ByteCompute("meta-llama/Llama-3.3-70B-Instruct-Turbo"),
  prompt: "Invent a new holiday and describe its traditions.",
  system:
    "You are a professional writer. You write simple, clear, and concise content.",
});

for await (const textPart of result.textStream) {
  console.log(textPart);
}

console.log(await result.usage);
console.log(await result.finishReason);
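
Once the stream has been consumed, the result also exposes the complete text as an awaitable field, just like usage and finishReason above (a small sketch; check the AI SDK docs for your installed version):

// Resolves with the full generated text once streaming has finished.
console.log(await result.text);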

Conversations

To create a longer, chat-like conversation, you have to include each assistant response and each of the user's messages in every request. That way the model has the full context and can give better answers. You can tweak it even further by providing a system message.

import { createByteCompute } from "@ai-sdk/ByteCompute";
import { generateText } from "ai";

const ByteCompute = createByteCompute({
  apiKey: process.env.ByteCompute_TOKEN,
});

const { text, usage, finishReason } = await generateText({
  model: ByteCompute("meta-llama/Llama-3.3-70B-Instruct-Turbo"),
  messages: [
    { role: "system", content: "Respond like a michelin starred chef." },
    {
      role: "user",
      content: "Can you name at least two different techniques to cook lamb?",
    },
    {
      role: "assistant",
      content:
        'Bonjour! Let me tell you, my friend, cooking lamb is an art form, and I\'m more than happy to share with you not two, but three of my favorite techniques to coax out the rich, unctuous flavors and tender textures of this majestic protein. First, we have the classic "Sous Vide" method. Next, we have the ancient art of "Sous le Sable". And finally, we have the more modern technique of "Hot Smoking."',
    },
    { role: "user", content: "Tell me more about the second method." },
  ],
});

console.log(text);
console.log(usage);
console.log(finishReason);
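
In practice you will usually keep the history in a single array and append each turn to it. Here is a minimal sketch of that pattern with the same setup as above (the askChef helper is hypothetical):

import { createByteCompute } from "@ai-sdk/ByteCompute";
import { generateText } from "ai";

const ByteCompute = createByteCompute({
  apiKey: process.env.ByteCompute_TOKEN,
});

const history = [
  { role: "system", content: "Respond like a Michelin-starred chef." },
];

// Hypothetical helper: records the user turn, asks the model,
// then records the assistant turn so the next call has full context.
async function askChef(question) {
  history.push({ role: "user", content: question });
  const { text } = await generateText({
    model: ByteCompute("meta-llama/Llama-3.3-70B-Instruct-Turbo"),
    messages: history,
  });
  history.push({ role: "assistant", content: text });
  return text;
}

console.log(await askChef("Can you name two techniques to cook lamb?"));
console.log(await askChef("Tell me more about the second one."));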

Conversations & Streaming

Of course, a conversation response can also be streamed, and it is very simple to set up.

import { createByteCompute } from "@ai-sdk/ByteCompute";
import { streamText } from "ai";

const ByteCompute = createByteCompute({
  apiKey: process.env.ByteCompute_TOKEN,
});

const result = streamText({
  model: ByteCompute("meta-llama/Llama-3.3-70B-Instruct-Turbo"),
  messages: [
    { role: "system", content: "Respond like a michelin starred chef." },
    {
      role: "user",
      content: "Can you name at least two different techniques to cook lamb?",
    },
    {
      role: "assistant",
      content:
        'Bonjour! Let me tell you, my friend, cooking lamb is an art form, and I\'m more than happy to share with you not two, but three of my favorite techniques to coax out the rich, unctuous flavors and tender textures of this majestic protein. First, we have the classic "Sous Vide" method. Next, we have the ancient art of "Sous le Sable". And finally, we have the more modern technique of "Hot Smoking."',
    },
    { role: "user", content: "Tell me more about the second method." },
  ],
});

for await (const textPart of result.textStream) {
  console.log(textPart);
}

console.log(await result.usage);
console.log(await result.finishReason);

Generating structured data

Getting text back, streamed or not, is great, but when two systems need to work together, a structured approach is even better.

import { createByteCompute } from "@ai-sdk/ByteCompute";
import { generateObject } from "ai";
import { z } from "zod";

const ByteCompute = createByteCompute({
  apiKey: process.env.ByteCompute_TOKEN,
});

const { object, usage, finishReason } = await generateObject({
  model: ByteCompute("meta-llama/Llama-3.3-70B-Instruct-Turbo"),
  schema: z.object({
    recipe: z.object({
      name: z.string(),
      ingredients: z.array(z.object({ name: z.string(), amount: z.string() })),
      steps: z.array(z.string()),
    }),
  }),
  prompt: "Generate a lasagna recipe.",
});

console.log(object.recipe.name);
console.log(object.recipe.ingredients);
console.log(object.recipe.steps);
console.log(usage);
console.log(finishReason);
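
Because the schema is plain zod, you can derive a static TypeScript type from it and reuse the schema across calls (a small sketch):

import { z } from "zod";

const recipeSchema = z.object({
  recipe: z.object({
    name: z.string(),
    ingredients: z.array(z.object({ name: z.string(), amount: z.string() })),
    steps: z.array(z.string()),
  }),
});

// z.infer derives a static type that matches the generated object.
type Recipe = z.infer<typeof recipeSchema>;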

You can ask for more specific things like enums, too.

import { createByteCompute } from "@ai-sdk/ByteCompute";
import { generateObject } from "ai";
import { z } from "zod";

const ByteCompute = createByteCompute({
  apiKey: process.env.ByteCompute_TOKEN,
});

const { object, usage, finishReason } = await generateObject({
  model: ByteCompute("meta-llama/Llama-3.3-70B-Instruct-Turbo"),
  output: "enum",
  enum: ["action", "comedy", "drama", "horror", "sci-fi"],
  prompt:
    "Classify the genre of this movie plot: " +
    '"A group of astronauts travel through a wormhole in search of a ' +
    'new habitable planet for humanity."',
});

console.log(object);
console.log(usage);
console.log(finishReason);

Tool / Function calling

Tool calling allows the model to call external functions you provide and use the results to generate a more comprehensive response to the user's query. It is a very powerful feature.

import { createByteCompute } from "@ai-sdk/ByteCompute";
import { generateText, tool } from "ai";
import { z } from "zod";

const ByteCompute = createByteCompute({
  apiKey: process.env.ByteCompute_TOKEN,
});

const result = await generateText({
  model: ByteCompute("meta-llama/Llama-3.3-70B-Instruct-Turbo"),
  tools: {
    weather: tool({
      description: "Get the weather in a location",
      parameters: z.object({
        location: z.string().describe("The location to get the weather for"),
      }),
      execute: async ({ location }) => ({
        location,
        temperature: 72 + Math.floor(Math.random() * 21) - 10,
      }),
    }),
  },
  prompt: "What is the weather in San Francisco?",
  maxSteps: 2, // without it a text response is not generated, only the tool response
});

console.log(result.text);
console.log(result.usage);
console.log(result.finishReason);
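
Besides the final text, the result also reports which tools were invoked and what they returned, which is handy for debugging (field names as documented in the AI SDK; verify against your installed version):

console.log(result.toolCalls);   // the tool invocations the model made, with their arguments
console.log(result.toolResults); // the values returned by each execute()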

Conversations and tool calling

Let's see how tool calling works when you are having a conversation.

import { createByteCompute } from "@ai-sdk/ByteCompute";
import { generateText, tool } from "ai";
import { z } from "zod";

const ByteCompute = createByteCompute({
  apiKey: process.env.ByteCompute_TOKEN,
});

const messages = [
  { role: "user", content: "What is the weather in San Francisco?" },
];

const first_result = await generateText({
  model: ByteCompute("meta-llama/Llama-3.3-70B-Instruct-Turbo"),
  tools: {
    weather: tool({
      description: "Get the weather in a location",
      parameters: z.object({
        location: z.string().describe("The location to get the weather for"),
      }),
      execute: async ({ location }) => ({
        location,
        temperature: 72 + Math.floor(Math.random() * 21) - 10,
      }),
    }),
  },
  messages: messages,
  maxSteps: 2, // without it a text response is not generated, only the tool response
});

console.log(first_result.text);

// Let's continue our conversation
messages.push(...first_result.response.messages);
messages.push({
  role: "user",
  content: "Is this normal temperature for the summer?",
});

const second_result = await generateText({
  model: ByteCompute("meta-llama/Llama-3.3-70B-Instruct-Turbo"),
  tools: {
    weather: tool({
      description: "Get the weather in a location",
      parameters: z.object({
        location: z.string().describe("The location to get the weather for"),
      }),
      execute: async ({ location }) => ({
        location,
        temperature: 72 + Math.floor(Math.random() * 21) - 10,
      }),
    }),
  },
  messages: messages,
  maxSteps: 2,
});

console.log(second_result.text);
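
Both calls above pass an identical weather tool. To avoid repeating the definition, you can hoist it into a shared constant and pass it to every call (a minimal sketch):

import { createByteCompute } from "@ai-sdk/ByteCompute";
import { generateText, tool } from "ai";
import { z } from "zod";

const ByteCompute = createByteCompute({
  apiKey: process.env.ByteCompute_TOKEN,
});

// Shared tool definition, reused by every generateText call.
const tools = {
  weather: tool({
    description: "Get the weather in a location",
    parameters: z.object({
      location: z.string().describe("The location to get the weather for"),
    }),
    execute: async ({ location }) => ({
      location,
      temperature: 72 + Math.floor(Math.random() * 21) - 10,
    }),
  }),
};

const result = await generateText({
  model: ByteCompute("meta-llama/Llama-3.3-70B-Instruct-Turbo"),
  tools, // the same definition works for every request
  prompt: "What is the weather in San Francisco?",
  maxSteps: 2,
});

console.log(result.text);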