Integrations

Use Bytecompute AI models through partner integrations.

Hugging Face

You can use Bytecompute AI models with Hugging Face Inference.

Install the huggingface_hub library (Python) or the @huggingface/inference package (TypeScript):

Python
pip install "huggingface_hub>=0.29.0"
TypeScript
npm install @huggingface/inference

Chat Completion with the Hugging Face Hub library

Python
from huggingface_hub import InferenceClient

# Initialize the InferenceClient with bytecompute as the provider
client = InferenceClient(
    provider="bytecompute",
    api_key="xxxxxxxxxxxxxxxxxxxxxxxx",  # Replace with your API key (HF or custom)
)

# Define the chat messages
messages = [
    {
        "role": "user",
        "content": "What is the capital of France?"
    }
]

# Generate a chat completion
completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=messages,
    max_tokens=500
)

# Print the response
print(completion.choices[0].message)
TypeScript
import { HfInference } from "@huggingface/inference";

// Initialize the HfInference client with your API key
const client = new HfInference("xxxxxxxxxxxxxxxxxxxxxxxx");

// Generate a chat completion
const chatCompletion = await client.chatCompletion({
    model: "deepseek-ai/DeepSeek-R1",  // Replace with your desired model
    messages: [
        {
            role: "user",
            content: "What is the capital of France?"
        }
    ],
    provider: "bytecompute",  // Route the request through the bytecompute provider
    max_tokens: 500
});

// Log the response
console.log(chatCompletion.choices[0].message);
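
The Python client also supports streaming. A minimal sketch, reusing the client and messages from above; chunks arrive in the OpenAI-style delta format:

Python
# Stream the response token by token (sketch; reuses `client` and `messages` from above)
stream = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=messages,
    max_tokens=500,
    stream=True,
)

for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)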

Vercel AI SDK

The Vercel AI SDK is a powerful TypeScript library designed to help developers build AI-powered applications.

Install both the Vercel AI SDK and its OpenAI-compatible provider package:

Shell
npm i ai @ai-sdk/openai

Instantiate the bytecompute client and call the generateText function with Llama 3.1 8B to generate some text.

TypeScript
import { createOpenAI } from "@ai-sdk/openai";
import { generateText } from "ai";

const bytecompute = createOpenAI({
  apiKey: process.env.BYTECOMPUTE_API_KEY ?? "",
  baseURL: "https://api.bytecompute.xyz/v1",
});

async function main() {
  const { text } = await generateText({
    model: bytecompute("meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo"),
    prompt: "Write a vegetarian lasagna recipe for 4 people.",
  });

  console.log(text);
}

main();

LangChain

LangChain is a framework for developing context-aware, reasoning applications powered by language models.

To install the LangChain x bytecompute library, run:

Shell
pip install --upgrade langchain-bytecompute

Here's sample code to get you started with LangChain + Bytecompute AI:

Python
from langchain_bytecompute import Chatbytecompute

chat = Chatbytecompute(model="meta-llama/Llama-3-70b-chat-hf")

for m in chat.stream("Tell me fun things to do in NYC"):
    print(m.content, end="", flush=True)
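
The model also supports LangChain's standard invoke call when streaming isn't needed; a minimal sketch, reusing the chat model from above:

Python
# Get a single, non-streamed response via the standard runnable interface
reply = chat.invoke("Tell me fun things to do in NYC")
print(reply.content)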

LlamaIndex

LlamaIndex is a simple, flexible data framework for connecting custom data sources to large language models (LLMs).

Install llama-index along with the OpenAI-compatible LLM integration:

Shell
pip install llama-index llama-index-llms-openai-like

Here's sample code to get you started with LlamaIndex + Bytecompute AI:

Python
import os

from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
    api_base="https://api.bytecompute.xyz/v1",
    api_key=os.environ.get("BYTECOMPUTE_API_KEY"),
    is_chat_model=True,
    is_function_calling_model=True,
    temperature=0.1,
)

response = llm.complete("Write an essay of up to 500 words explaining Large Language Models")

print(response)
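
OpenAILike also exposes LlamaIndex's chat interface. A minimal sketch, reusing the llm from above:

Python
from llama_index.core.llms import ChatMessage

# Chat-style call with explicit message roles (sketch; reuses `llm` from above)
response = llm.chat([ChatMessage(role="user", content="What is a vector index?")])
print(response)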

CrewAI

CrewAI is an open source framework for orchestrating AI agent systems.

Install crewai

Shell
pip install crewai
export BYTECOMPUTE_API_KEY=***

Build a multi-agent workflow:

Python
import os
from crewai import LLM, Task, Agent, Crew

llm = LLM(
    model="bytecompute_ai/meta-llama/Llama-3.3-70B-Instruct-Turbo",
    api_key=os.environ.get("BYTECOMPUTE_API_KEY"),
    base_url="https://api.bytecompute.xyz/v1",
)

research_agent = Agent(
    llm=llm,
    role="Research Analyst",
    goal="Find and summarize information about specific topics",
    backstory="You are an experienced researcher with attention to detail",
    verbose=True  # Enable logging for debugging
)

research_task = Task(
    description="Conduct a thorough research about AI Agents.",
    expected_output="A list with 10 bullet points of the most relevant information about AI Agents",
    agent=research_agent
)

# Execute the crew
crew = Crew(
    agents=[research_agent],
    tasks=[research_task],
    verbose=True
)

result = crew.kickoff()

# Access the task output
task_output = research_task.output

print(task_output)

Learn more in our CrewAI guide.

LangGraph

LangGraph is an open-source library for building stateful, multi-actor applications with LLMs.

Install langgraph

Shell
pip install -U langgraph langchain-bytecompute
export BYTECOMPUTE_API_KEY=***

Build a tool-using agent:

Python
import os
from langchain_bytecompute import Chatbytecompute

llm = Chatbytecompute(
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    api_key=os.getenv("BYTECOMPUTE_API_KEY"),
)

# Define a tool (the docstring becomes the tool description)
def multiply(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b

# Augment the LLM with tools
llm_with_tools = llm.bind_tools([multiply])

# Invoke the LLM with input that triggers the tool call
msg = llm_with_tools.invoke("What is 2 times 3?")

# Inspect the resulting tool call
print(msg.tool_calls)
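
The snippet above only produces the tool call; to actually execute tools in a loop, you can wrap the same model and tool in LangGraph's prebuilt ReAct agent. A minimal sketch, assuming the llm and multiply defined above:

Python
from langgraph.prebuilt import create_react_agent

# Prebuilt ReAct-style agent that calls tools until it can answer
agent = create_react_agent(llm, [multiply])

result = agent.invoke({"messages": [("user", "What is 2 times 3?")]})
print(result["messages"][-1].content)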

PydanticAI

PydanticAI is an agent framework created by the Pydantic team to simplify building agent workflows.

Install pydantic-ai

Shell
pip install pydantic-ai
export BYTECOMPUTE_API_KEY=***

Build PydanticAI agents using Bytecompute AI models

Python
import os

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

# Connect PydanticAI to LLMs on bytecompute
model = OpenAIModel(
    'meta-llama/Llama-3.3-70B-Instruct-Turbo',
    provider=OpenAIProvider(
        base_url="https://api.bytecompute.xyz/v1",
        api_key=os.environ.get("BYTECOMPUTE_API_KEY"),
    ),
)

# Set up the agent
agent = Agent(
    model,
    system_prompt='Be concise, reply with one sentence.',
)

result = agent.run_sync('Where does "hello world" come from?')
print(result.data)
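
PydanticAI's main draw is typed, validated output. A minimal sketch of a structured-output agent, reusing the model from above; the CityInfo schema is illustrative:

Python
from pydantic import BaseModel

class CityInfo(BaseModel):
    city: str
    country: str

# Validate the agent's answer against the CityInfo schema (illustrative)
city_agent = Agent(model, result_type=CityInfo)

result = city_agent.run_sync('Which city hosted the 2012 Summer Olympics?')
print(result.data)  # e.g. CityInfo(city='London', country='United Kingdom')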

DSPy

DSPy is a framework that enables you to build modular AI systems with code instead of hand-crafted prompting.

Install dspy

Shell
pip install -U dspy
export BYTECOMPUTE_API_KEY=***

Build a question-answering agent

Python
import os
import dspy

# Configure dspy with an LLM from Bytecompute AI
lm = dspy.LM('bytecompute_ai/bytecomputecomputer/llama-2-70b-chat',
             api_key=os.environ.get("BYTECOMPUTE_API_KEY"),
             api_base="https://api.bytecompute.xyz/v1")

# Configure dspy to use the LLM
dspy.configure(lm=lm)

# Gives the agent access to a Python interpreter
def evaluate_math(expression: str):
    return dspy.PythonInterpreter({}).execute(expression)

# Gives the agent access to a Wikipedia search tool
def search_wikipedia(query: str):
    results = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')(query, k=3)
    return [x['text'] for x in results]

# Set up a ReAct module with a question -> float answer signature
react = dspy.ReAct("question -> answer: float", tools=[evaluate_math, search_wikipedia])

pred = react(question="What is 9362158 divided by the year of birth of David Gregory of Kinnairdy castle?")

print(pred.answer)
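
The same configured LM works with any DSPy module, not just ReAct; a minimal sketch with a plain Predict module:

Python
# One-step predictor using the LM configured above
qa = dspy.Predict("question -> answer")
print(qa(question="What is the capital of France?").answer)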

AutoGen (AG2)

AG2 (formerly AutoGen) is an open-source framework for building and orchestrating AI agents.

Install autogen

Shell
pip install autogen
export BYTECOMPUTE_API_KEY=***

Build a coding agent

Python
import os
from pathlib import Path
from autogen import AssistantAgent, UserProxyAgent
from autogen.coding import LocalCommandLineCodeExecutor

config_list = [
    {
        # Let's choose the Mixtral 8x7B model
        "model": "mistralai/Mixtral-8x7B-Instruct-v0.1",
        # Provide your Bytecompute AI API key here or put it in the BYTECOMPUTE_API_KEY environment variable
        "api_key": os.environ.get("BYTECOMPUTE_API_KEY"),
        # Specify the API type as 'bytecompute' so it uses the Bytecompute AI client class
        "api_type": "bytecompute",
        "stream": False,
    }
]

# Set up the code executor
workdir = Path("coding")
workdir.mkdir(exist_ok=True)
code_executor = LocalCommandLineCodeExecutor(work_dir=workdir)

# Set up the agents

# The UserProxyAgent will execute the code that the AssistantAgent provides
user_proxy_agent = UserProxyAgent(
    name="User",
    code_execution_config={"executor": code_executor},
    is_termination_msg=lambda msg: "FINISH" in (msg.get("content") or ""),
)

system_message = """You are a helpful AI assistant who writes code and the user executes it.
Solve tasks using your coding and language skills.
"""

# The AssistantAgent, using the Mixtral model on Bytecompute AI, will take the coding request and return code
assistant_agent = AssistantAgent(
    name="bytecompute Assistant",
    system_message=system_message,
    llm_config={"config_list": config_list},
)

# Start the chat, with the UserProxyAgent asking the AssistantAgent the message
chat_result = user_proxy_agent.initiate_chat(
    assistant_agent,
    message="Provide code to count the number of prime numbers from 1 to 10000.",
)
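
initiate_chat returns a ChatResult that can be inspected after the run; a short sketch, assuming AG2's standard ChatResult fields:

Python
# Inspect the outcome of the run (sketch)
print(chat_result.summary)                        # final summary of the exchange
print(len(chat_result.chat_history), "messages")  # full message history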

Agno

Agno is an open-source library for creating multimodal agents.

Install agno

Shell
pip install -U agno duckduckgo-search

Build a search and answer agent

Python
from agno.agent import Agent
from agno.models.bytecompute import bytecompute
from agno.tools.duckduckgo import DuckDuckGoTools

agent = Agent(
    model=bytecompute(id="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo"),
    tools=[DuckDuckGoTools()],
    markdown=True
)
agent.print_response("What's happening in New York?", stream=True)

Pinecone

Pinecone is a vector database that helps companies build RAG applications.

Here's some sample code to get you started with Pinecone + Bytecompute AI:

Python
from pinecone import Pinecone, ServerlessSpec
from bytecompute import bytecompute

pc = Pinecone(
    api_key="PINECONE_API_KEY",
    source_tag="bytecompute_AI"
)
client = bytecompute()

# Create an index in Pinecone (m2-bert-80M-8k-retrieval produces 768-dimensional embeddings)
pc.create_index(
    name="serverless-index",
    dimension=768,
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-west-2"),
)
index = pc.Index("serverless-index")

# Create an embedding on Bytecompute AI
text_to_embed = "Our solar system orbits the Milky Way galaxy at about 515,000 mph"
embeddings = client.embeddings.create(
    model="bytecomputecomputer/m2-bert-80M-8k-retrieval",
    input=text_to_embed
)

# Use index.upsert() to insert embeddings and index.query() to query for similar vectors,
# for example (a sketch, assuming an OpenAI-style embeddings response):
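vector = embeddings.data[0].embedding
index.upsert(vectors=[{"id": "vec1", "values": vector, "metadata": {"text": text_to_embed}}])
matches = index.query(vector=vector, top_k=3, include_metadata=True)
print(matches)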

Helicone

Helicone is an open source LLM observability platform.

Here's some sample code to get you started with Helicone + Bytecompute AI:

Python
import os
from bytecompute import bytecompute

client = bytecompute(
    api_key=os.environ.get("BYTECOMPUTE_API_KEY"),
    base_url="https://bytecompute.hconeai.com/v1",
    supplied_headers={
        "Helicone-Auth": f"Bearer {os.environ.get('HELICONE_API_KEY')}",
    },
)

stream = client.chat.completions.create(
    model="meta-llama/Llama-3-8b-chat-hf",
    messages=[
        {"role": "user", "content": "What are some fun things to do in New York?"}
    ],
    stream=True,
)

for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)

Composio

Composio allows developers to integrate external tools and services into their AI applications.

Install composio-bytecomputeai

Shell
pip install bytecompute composio-bytecomputeai
export BYTECOMPUTE_API_KEY=***
export COMPOSIO_API_KEY=***

Get Bytecompute AI models to use integrated tools

Python
from composio_bytecomputeai import ComposioToolSet, App
from bytecompute import bytecompute

client = bytecompute()
toolset = ComposioToolSet()

request = toolset.initiate_connection(app=App.GITHUB)
print(f"Open this URL to authenticate: {request.redirectUrl}")

tools = toolset.get_tools(apps=[App.GITHUB])

response = client.chat.completions.create(
    tools=tools,
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    messages=[
        {
            "role": "user",
            "content": "Star the repo 'bytecomputecomputer/bytecompute-cookbook'",
        }
    ],
)

res = toolset.handle_tool_calls(response)
print(res)