C4AI Command R+ (CohereForAI/c4ai-command-r-plus) is the latest large language model to top Hugging Face's Open LLM Leaderboard.
It is an open-weights research release of a 104-billion-parameter model with Retrieval Augmented Generation (RAG) and tool-use capabilities for automating sophisticated tasks. Its performance has been evaluated in 10 languages: English, French, Italian, Spanish, German, Japanese, Korean, Brazilian Portuguese, Arabic, and Simplified Chinese.
The model is optimized for conversational interaction and long-context tasks, with a maximum context length of 128K tokens.
Model Architecture
This is an auto-regressive language model that uses an optimized transformer architecture. After pretraining, the model is aligned with human preferences for safety and helpfulness through supervised fine-tuning and preference training.
Key Points
Here are the key points about Command R+:
- Purpose and Use Cases:
- Command R+ is intended for intricate tasks that require multi-step tool use (agents) and retrieval augmented generation (RAG). It performs best in scenarios that demand complex interactions and longer contexts.
- Capabilities:
- Language Tasks: Command R+ has been trained on a wide range of texts in several languages, enabling it to excel in various text-generation tasks.
- Fine-Tuning Use Cases: It has been specifically fine-tuned to excel in critical domain use cases.
- Multilingual Support: Performs well in 10 languages.
- Cross-Lingual Tasks: It is capable of handling cross-lingual tasks such as content translation and multilingual question answering.
- Grounded Generations:
- Command R+ can ground its generations in English: given a list of provided document snippets, it produces responses with citations indicating the source of the information.
- Fine-Tuning and Safety:
- Its architecture is based on an optimized transformer design, and its outputs have been tuned via supervised and preference training to match human preferences for safety and helpfulness.
- For multi-step tool usage, developers can link Command R+ to external tools (search engines, APIs, etc.).
Evaluations
Below, we present a direct comparison with the strongest state-of-the-art open-weights models currently on Hugging Face. Note that these results are intended solely for comparison with other models submitted to the leaderboard or with self-reported results, which may not be reproducible in the same manner; evaluations for all models should be conducted using standardized methods with publicly available code.
Model | Average | ARC (Challenge) | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K
---|---|---|---|---|---|---|---
CohereForAI/c4ai-command-r-plus | 74.6 | 70.99 | 88.6 | 75.7 | 56.3 | 85.4 | 70.7
DBRX Instruct | 74.5 | 68.9 | 89.0 | 73.7 | 66.9 | 81.8 | 66.9
Mixtral 8x7B-Instruct | 72.7 | 70.1 | 87.6 | 71.4 | 65.0 | 81.1 | 61.1
Mixtral 8x7B Chat | 72.6 | 70.2 | 87.6 | 71.2 | 64.6 | 81.4 | 60.7
CohereForAI/c4ai-command-r-v01 | 68.5 | 65.5 | 87.0 | 68.2 | 52.3 | 81.5 | 56.6
Llama 2 70B | 67.9 | 67.3 | 87.3 | 69.8 | 44.9 | 83.7 | 54.1
How to Use C4AI Command R+
To use this model, you first need to install transformers:
pip install transformers
Now, load the model directly from the Hugging Face Hub:
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "CohereForAI/c4ai-command-r-plus"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
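A checkpoint with 104 billion parameters will not fit on a single consumer GPU at full precision. A common workaround, sketched below under the assumption that accelerate is installed and sufficient GPU memory is available, is to load the weights in half precision and shard them automatically across devices:

import torch

# Assumption: accelerate is installed and the machine has enough GPU memory.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision halves the memory footprint
    device_map="auto",          # shard weights across available devices
)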
Message format for C4AI Command R+:
# Format message with the command-r-plus chat template
## <BOS_TOKEN><|START_OF_TURN_TOKEN|><|USER_TOKEN|>How to Fine Tune C4AI Command R+?<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>
messages = [{"role": "user", "content": "How to Fine Tune C4AI Command R+?"}]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
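If you want to see the rendered template (the string shown in the comment above) rather than token IDs, you can pass tokenize=False; this is just a sanity check and is not required for generation:

# Render the chat template as plain text to inspect the special tokens:
prompt_text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt_text)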
Set the required generation parameters and get the output:
gen_tokens = model.generate(
    input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)
gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
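Note that decode returns the full sequence, prompt and special tokens included. To print only the newly generated reply, you can slice off the prompt tokens first (a small convenience, reusing the input_ids from above):

# Decode only the tokens generated after the prompt:
reply = tokenizer.decode(
    gen_tokens[0][input_ids.shape[-1]:], skip_special_tokens=True
)
print(reply)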
Additionally, you can quantize the model to 8-bit precision with bitsandbytes:
# pip install 'git+https://github.com/huggingface/transformers.git' bitsandbytes accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
bnb_config = BitsAndBytesConfig(load_in_8bit=True)
model_id = "CohereForAI/c4ai-command-r-plus"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)
# Format message with the command-r-plus chat template
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
## <BOS_TOKEN><|START_OF_TURN_TOKEN|><|USER_TOKEN|>Hello, how are you?<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>
gen_tokens = model.generate(
    input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)
gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
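bitsandbytes also supports 4-bit quantization, which cuts the memory footprint further at some cost in output quality. A minimal sketch follows; the NF4 settings below are common defaults, not values prescribed by the model card:

import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 for stability
)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)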
Tool Use & Multihop Capabilities
Command R+ has undergone specialized training in the use of conversational tools. This involved employing a specific prompt template and applying a combination of supervised and preference fine-tuning techniques to integrate these tools into the model. We strongly encourage experimentation, but deviating from this prompt template may result in decreased performance.
Usage: Rendering Tool Use Prompts.
from transformers import AutoTokenizer
model_id = "CohereForAI/c4ai-command-r-plus"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# define conversation input:
conversation = [
    {"role": "user", "content": "Whats the biggest penguin in the world?"}
]
# Define tools available for the model to use:
tools = [
    {
        "name": "internet_search",
        "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet",
        "parameter_definitions": {
            "query": {
                "description": "Query to search the internet with",
                "type": "str",
                "required": True
            }
        }
    },
    {
        "name": "directly_answer",
        "description": "Calls a standard (un-augmented) AI chatbot to generate a response given the conversation history",
        "parameter_definitions": {}
    }
]
# render the tool use prompt as a string:
tool_use_prompt = tokenizer.apply_tool_use_template(
    conversation,
    tools=tools,
    tokenize=False,
    add_generation_prompt=True,
)
print(tool_use_prompt)
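The rendered prompt can then be fed back to the model like any other input; Command R+ is expected to respond with a plan of tool calls expressed as JSON. A minimal sketch, reusing the model loaded earlier (greedy decoding keeps the JSON output deterministic; the rendered prompt already contains the BOS token, so special tokens are not added again):

# Tokenize the rendered prompt and let the model plan its tool calls:
inputs = tokenizer(tool_use_prompt, return_tensors="pt", add_special_tokens=False)
gen_tokens = model.generate(**inputs, max_new_tokens=200, do_sample=False)
print(tokenizer.decode(gen_tokens[0][inputs["input_ids"].shape[-1]:]))
# The model typically answers with an "Action:" block containing JSON such as
# [{"tool_name": "internet_search", "parameters": {"query": "..."}}]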
Grounded Generation and RAG Capabilities
Command R+ has undergone specialized training in grounded generation, enabling it to generate responses using a provided list of document excerpts. These responses include grounding spans (citations) indicating the original source of information. This capability can support behaviors such as the last phase of retrieval augmented generation (RAG) and grounded summarization. Through the application of a specific prompt template and a combination of supervised and preference fine-tuning, this behavior has been ingrained into the model. While we encourage experimentation, it’s important to note that deviating from this prompt template may lead to a decrease in performance.
Usage: Rendering Grounded Generation prompts.
from transformers import AutoTokenizer
model_id = "CohereForAI/c4ai-command-r-plus"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# define conversation input:
conversation = [
    {"role": "user", "content": "Whats the biggest penguin in the world?"}
]
# define documents to ground on:
documents = [
    {"title": "Tall penguins", "text": "Emperor penguins are the tallest growing up to 122 cm in height."},
    {"title": "Penguin habitats", "text": "Emperor penguins only live in Antarctica."}
]
# render the grounded generation prompt as a string:
grounded_generation_prompt = tokenizer.apply_grounded_generation_template(
    conversation,
    documents=documents,
    citation_mode="accurate",  # or "fast"
    tokenize=False,
    add_generation_prompt=True,
)
print(grounded_generation_prompt)
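As with tool use, the rendered prompt can be tokenized and passed to the model to obtain a grounded, cited answer. A minimal sketch, again reusing the model loaded earlier:

# Generate a grounded answer with citations against the provided documents:
inputs = tokenizer(grounded_generation_prompt, return_tensors="pt", add_special_tokens=False)
gen_tokens = model.generate(**inputs, max_new_tokens=200, do_sample=False)
print(tokenizer.decode(gen_tokens[0][inputs["input_ids"].shape[-1]:]))
# In "accurate" citation mode the output lists relevant and cited documents and
# emits a grounded answer with citation spans such as <co: 0>...</co: 0>.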
Code Capabilities
Command R+ has been optimized to interact with your code by answering requests for code snippets, code explanations, or code rewrites. It may not perform optimally for pure code completion without further adjustments.
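For code-related requests, lower randomness tends to help; low temperature or even greedy decoding is a sensible choice. A short illustrative example using the chat template from earlier:

# Ask for a code rewrite with greedy decoding for more deterministic output:
messages = [{
    "role": "user",
    "content": "Rewrite this loop as a list comprehension:\nresult = []\nfor x in data:\n    result.append(x * 2)"
}]
input_ids = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
)
gen_tokens = model.generate(input_ids, max_new_tokens=150, do_sample=False)
print(tokenizer.decode(gen_tokens[0], skip_special_tokens=True))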