Getting started with Amazon Bedrock Custom Model Import with Sarvam and CodeLlama

5 min read · Nov 5, 2024

Amazon Bedrock is a fully managed, serverless service that offers a choice of high-performing foundation models (FMs) from leading AI companies such as AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon. While this should serve most customer use cases, you also have the option of using SageMaker JumpStart, a machine learning (ML) hub with foundation models, built-in algorithms, and prebuilt ML solutions that you can deploy with just a few clicks and that are then provisioned on servers in AWS.

Apart from these two options, what if you need a Foundation Model that is not yet available on Amazon Bedrock, one that has been customized in another environment like Amazon SageMaker or is available in a model repository like HuggingFace, and you want it as a serverless, managed service similar to Bedrock? Now you can !! With the general availability (GA) of Amazon Bedrock Custom Model Import a few days back, you can import these customized models and use them alongside the existing foundation models (FMs) through the same API !!

This blog is my quick proof of concept showcasing the Amazon Bedrock Custom Model Import feature: I download a couple of Foundation Models from HuggingFace, upload them to Amazon S3, and then import them into Amazon Bedrock !!
I chose two models hosted on HuggingFace: sarvam-2b-v0.5, a small yet powerful language model pre-trained from scratch on 2 trillion tokens and trained to be good at 10 Indic languages plus English, and CodeLlama-7b-Instruct-hf, a model designed for general code synthesis and understanding.

Let us get started ..

Step 0 — Pre-reqs

There are some prerequisites before you get started; please consult the documentation (linked under Resources). Amazon Bedrock Custom Model Import supports a variety of popular model architectures, including Meta Llama, Mistral, Mixtral, Flan and more.

I chose sarvam-2b-v0.5 because it had some great reviews for its support of Indic languages, especially “Hinglish” (where folks use a mix of Hindi and English !!), and CodeLlama because it is a state-of-the-art LLM capable of generating code, and natural language about code, from both code and natural language prompts. Both models are built on Meta AI’s Llama architecture. I also deliberately chose smaller models, as I wanted a quick proof of concept and did not want to spend a lot of time downloading the model files from HuggingFace and uploading them to S3.
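As a rough pre-check before uploading anything, you could read the model's HuggingFace config.json and compare its model_type with the architectures mentioned above. This is only a sketch: the set of model_type strings below is my assumption based on the architectures listed in this post (with "t5" standing in for Flan), and the helper names are hypothetical. The Bedrock documentation remains the source of truth for what is supported.

```python
import json
from pathlib import Path

# Assumption: model_type strings for the architectures named in this post
# (Llama, Mistral, Mixtral, Flan-T5). Verify against the Bedrock documentation.
ASSUMED_SUPPORTED_MODEL_TYPES = {"llama", "mistral", "mixtral", "t5"}


def load_config(model_dir):
    """Read config.json from a downloaded HuggingFace model directory."""
    with (Path(model_dir) / "config.json").open(encoding="utf-8") as fh:
        return json.load(fh)


def looks_importable(config, supported=ASSUMED_SUPPORTED_MODEL_TYPES):
    """Return True if the config's model_type is in the assumed supported set."""
    model_type = str(config.get("model_type", "")).lower()
    return model_type in supported
```

For a cloned repo you would call looks_importable(load_config("sarvam-2b-v0.5")) and treat a False result as a prompt to re-read the documentation, not as a final verdict.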

Step 1 — Downloading the model files and uploading to Amazon S3

The model files are usually large, often several GiB. You can either use git clone or a tool like HuggingFace’s hf_transfer; check out https://huggingface.co/docs/hub/en/models-downloading for more guidance. Alternatively, if you created the model in Amazon SageMaker, you can specify the SageMaker model instead.

I used the following commands to download the model files and upload them to an Amazon S3 bucket for both Sarvam and CodeLlama.

git clone https://huggingface.co/sarvamai/sarvam-2b-v0.5

aws s3 cp sarvam-2b-v0.5/ s3://mani-s3-<<mybucket>>/sarvam-2b-v0.5/ --recursive

The S3 bucket directory looks something similar to this:

Tip: The HuggingFace repo has a .git directory, which I don’t think is needed for Bedrock model import. Consult the documentation for the files that Custom Model Import requires. You can save a lot of time during download/upload if you choose to skip this directory, for example by adding --exclude ".git/*" to the aws s3 cp command !!
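If you prefer to script the upload, the tip above can be sketched in Python roughly as follows. The function names are hypothetical, and the S3 client is passed in by the caller (for example boto3.client("s3")); the only boto3 call used is the S3 client's upload_file. Treat it as an illustration of skipping .git, not as the exact process I followed.

```python
import os
from pathlib import Path


def iter_upload_pairs(local_dir, key_prefix, skip_dirs=(".git",)):
    """Yield (local_path, s3_key) for each file under local_dir, skipping skip_dirs."""
    root = Path(local_dir)
    for current, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into the skipped directories.
        dirnames[:] = sorted(d for d in dirnames if d not in skip_dirs)
        for name in sorted(filenames):
            path = Path(current) / name
            relative = path.relative_to(root).as_posix()
            yield str(path), f"{key_prefix.rstrip('/')}/{relative}"


def upload_model_dir(s3_client, local_dir, bucket, key_prefix):
    """Upload every remaining file and return the list of S3 keys written."""
    uploaded = []
    for local_path, key in iter_upload_pairs(local_dir, key_prefix):
        s3_client.upload_file(local_path, bucket, key)  # boto3 S3 client method
        uploaded.append(key)
    return uploaded
```

Pruning dirnames in place means the walk never enters .git at all, which matters because that directory can hold a second copy of the large model files.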

Step 2 — Import Model into Bedrock

Amazon Bedrock has made the next few steps very easy. Once the model files are in an S3 bucket (or the model is available in SageMaker), you choose “Import the Model”, which kicks off an import job. The job took around 10–13 minutes for me, and voila, the model was available for inferencing.
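I used the console for this step, but the same import can be started from code. Below is a minimal sketch using the boto3 Bedrock client's create_model_import_job and get_model_import_job calls. The helper names, the polling interval, and all argument values are placeholders of mine, and the IAM role you pass must already allow Bedrock to read the S3 location.

```python
import time


def start_import_job(bedrock_client, job_name, model_name, role_arn, s3_uri):
    """Start a Custom Model Import job and return the job ARN."""
    response = bedrock_client.create_model_import_job(
        jobName=job_name,
        importedModelName=model_name,
        roleArn=role_arn,
        modelDataSource={"s3DataSource": {"s3Uri": s3_uri}},
    )
    return response["jobArn"]


def wait_for_import(bedrock_client, job_arn, poll_seconds=60, sleep=time.sleep):
    """Poll until the job is no longer InProgress and return the final status."""
    while True:
        job = bedrock_client.get_model_import_job(jobIdentifier=job_arn)
        if job["status"] != "InProgress":
            return job["status"]  # for example "Completed" or "Failed"
        sleep(poll_seconds)
```

You would pass in boto3.client("bedrock") as bedrock_client. Taking the client as a parameter keeps the helpers easy to exercise with a stub before pointing them at a real account.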

Available Custom Models on Amazon Bedrock — for my POC

Step 3— Testing with sample prompts on sarvam and codellama

With Sarvam, I tested with a few prompts in Hinglish (a question in Hindi written in English), Kannada (summarizing a news article about the drubbing the Indian cricket team is receiving down under!!) and Hindi.

Text summarization — Kannada — sarvam on Amazon Bedrock

I tried a few coding-related tasks with CodeLlama, such as writing and debugging code.

coding task — CodeLlama on Amazon Bedrock

debug code — codellama on Amazon Bedrock

My prompts were in this format:

[INST]
<<SYS>>You are an expert in Indian languages. You can translate the sentence from Indian languages or even Hinglish to English. You can also answer questions in Hindi, Kannada, Tamil and other Indian languages. <</SYS>>

Translate this sentence to english - "Mera order ka status kya hai? Maine 15th october ko order place kya tha apke website me"
[/INST]

[INST]
<<SYS>>You are an expert programmer that helps to review Python code for bugs. <</SYS>>

<<user>>This function should return a list of lambda functions that compute successive powers of their input, but it doesn’t work:
 
def power_funcs(max_pow):
    return [lambda x:x**k for k in range(1, max_pow+1)]
 
the function should be such that [f(2) for f in power_funcs(3)] should give [2, 4, 8], but it currently gives [8,8,8]. What is happening here?<</user>>
[/INST]
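The screenshots in this post come from the console, but prompts in the format above can also be sent through the regular InvokeModel API. Here is a hedged sketch: the helper names are hypothetical, and the request fields (prompt, max_tokens, temperature, top_p) are assumptions of mine, since the exact body and response layout depend on the imported model. Check the documentation for your model before relying on them.

```python
import json


def build_prompt(system, user):
    """Wrap a system and a user message in the [INST] / <<SYS>> layout shown above."""
    return f"[INST]\n<<SYS>>{system}<</SYS>>\n\n{user}\n[/INST]"


def build_request_body(prompt, max_tokens=512, temperature=0.2, top_p=0.9):
    """Serialize the request. Field names are assumptions; adjust to your model."""
    return json.dumps(
        {
            "prompt": prompt,
            "max_tokens": max_tokens,
            "temperature": temperature,
            "top_p": top_p,
        }
    )


def invoke_imported_model(runtime_client, model_arn, prompt, **params):
    """Call the imported model via InvokeModel and return the parsed JSON reply."""
    response = runtime_client.invoke_model(
        modelId=model_arn,  # ARN of the imported model
        body=build_request_body(prompt, **params),
        contentType="application/json",
        accept="application/json",
    )
    # boto3 returns the body as a stream; its JSON layout depends on the model.
    return json.loads(response["body"].read())
```

Here runtime_client would be boto3.client("bedrock-runtime"). The function returns the whole parsed response rather than picking one field, because I have not verified which key each imported model uses for the generated text.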

Final thoughts

My first preference would be to use the wide range of Foundation Models available in Amazon Bedrock, with Custom Model Import as the option when I need a specialized FM that is not yet available on Bedrock. While this was a quick proof of concept, you should also consider the following aspects mentioned in the AWS blog before moving to production:

Custom Model best practices from https://aws.amazon.com/blogs/machine-learning/amazon-bedrock-custom-model-import-now-generally-available/

Custom Model Pricing:

Please refer to https://aws.amazon.com/bedrock/pricing/ for the latest pricing.

Hope this blog was useful. Please do share your feedback via this blog or connect with me on LinkedIn 🙏

Resources:

AWS blog — Amazon Bedrock Custom Model Import
AWS Documentation — https://docs.aws.amazon.com/bedrock/latest/userguide/model-customization-import-model.html
Sarvam on HuggingFace— https://huggingface.co/sarvamai/sarvam-2b-v0.5
CodeLlama on HuggingFace— https://huggingface.co/codellama/CodeLlama-7b-Instruct-hf
CodeLlama — https://ai.meta.com/blog/code-llama-large-language-model-coding/
Sarvam — https://www.sarvam.ai/blogs/sarvam-1


Written by Mani
