Deploying LoRA adapter model
How to deploy LoRA adapter model
-
Navigate to the dashboard https://byteCompute.com/dash
-
Click on the 'New Deployment' button
-
Click on the 'LoRA Model' tab
-
Fill the form:
- LoRA model name: model name used to reference the deployment
- Hugging Face Model Name: Hugging Face model name
- Hugging Face Token: (optional) Hugging Face token if the LoRA adapter model is private
To use LoRA adapter model, you need
- LoRA adapter model hosted on Hugging Face
- Base model that supports LoRA adapter at ByteCompute (you can see the list of supported base models in upload lora form)
- Hugging Face token if the LoRA adapter model is private
- ByteCompute account, and ByteCompute API key
Example flow:
Prerequisites:
- askarByteCompute/llama-3.1-8B-rank-32-example-lora
- The base model is meta-llama/Meta-LLama-3.1-8B-Instruct which is supported at ByteCompute
- The LoRA adapter model is public, so no need for Hugging Face token
- ByteCompute API key is generated from https://byteCompute.com/dash/api_keys page
Then I'm gonna deploy the model:
-
Navigate to the dashboard https://byteCompute.com/dash
-
Click on the 'New Deployment' button
-
Click on the 'LoRA Model' tab
-
Fill the form:
- LoRA model name: asdf/lora-example
- Hugging Face Model Name: askarByteCompute/llama-3.1-8B-rank-32-example-lora
-
Click on the 'Upload' button
Now the deployment should appear in https://byteCompute.com/dash/deployments page, with a name asdf/lora-example. Initially the state is "Initializing", after a while it should become "Deploying" and then "Running". Once the state is "Running", you can use the model.
Navigate to https://byteCompute.com/asdf/lora-example where you can find all the information about the the model including:
- Pricing
- Precision
- Demo page, where you can test the model
- API reference, where you can find information how to inference the model using REST API
I'll leave example of inference with curl below:
curl "https://api.byteCompute.com/v1/openai/chat/completions" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $ByteCompute_API_KEY" \
-d '{
"model": "asdf/lora-example",
"messages": [
{
"role": "user",
"content": "Hello!"
}
]
}'
