Get started with the open-source Grok-1 model

The release of Grok-1 has sparked excitement in the AI community, seen as a milestone towards open-source Artificial General Intelligence (AGI).
By Ateeq Azam
Last updated: March 25, 2024 3:42 pm
xAI Grok-1 open-weights model

On Sunday, Elon Musk’s AI company, xAI, released the base model weights and network architecture of Grok-1, a large language model that competes with OpenAI’s ChatGPT. Releasing the open weights via GitHub and BitTorrent is in line with Musk’s ongoing criticism of, and legal battle against, OpenAI for not making its AI models available to the public.

Table of Contents
  • The Grok-1 model
  • Grok Benchmark
  • Key Technical Specifications
  • How to download Grok
  • To Download this Model
  • Run Model
  • Output
  • Grok-1 PyTorch version for Transformers
  • Usage
  • Implementation of Grok-1 Tokenizer
  • Guidelines to download the model from the Hugging Face Hub
  • Conclusion
Open-source Grok vs. closed-source OpenAI

The Grok-1 model

Grok-1 is a base model checkpoint from the pre-training phase, which ended in October 2023. It is not fine-tuned for any particular application, such as a chatbot. The model uses a Mixture-of-Experts architecture with 314 billion parameters, of which about 25% are active on a given token. xAI trained Grok-1 from scratch using a custom training stack built on JAX and Rust.
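
To make the Mixture-of-Experts idea concrete, here is a minimal top-2 routing sketch in PyTorch. It only illustrates how a router activates 2 of 8 experts per token; it is not Grok-1’s actual implementation, and all names and sizes are made up:

Python
import torch
import torch.nn as nn

# Toy top-2 Mixture-of-Experts layer (illustration only, not Grok-1's code)
class TinyMoELayer(nn.Module):
    def __init__(self, dim=64, n_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)           # scores each expert per token
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
        self.top_k = top_k

    def forward(self, x):                                 # x: (tokens, dim)
        gate = self.router(x).softmax(dim=-1)             # routing probabilities
        weights, idx = gate.topk(self.top_k, dim=-1)      # keep the 2 best experts per token
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                     # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

layer = TinyMoELayer()
print(layer(torch.randn(4, 64)).shape)                    # torch.Size([4, 64])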

Mixture-of-Experts architecture

Grok Benchmark:

Grok-1 scores 73% on MMLU, beating Llama 2 70B (68.9%) and Mixtral 8x7B (71%).

Grok Benchmark test

Key Technical Specifications:

  • 314 billion parameters
  • Mixture-of-Experts architecture with 8 experts (2 active per token)
  • SentencePiece tokenizer with 131,072 tokens
  • Maximum context window of 8,192 tokens
  • Support for 8-bit quantization and activation caching
  • 64 layers
  • 48 attention heads for queries
  • Support for Rotary Positional Embeddings (RoPE)
  • 6,144-dimensional internal embeddings
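
For quick reference, the specifications above can be collected into a small, informal summary; the values simply restate the list and this is not an official configuration file:

Python
# Informal summary of Grok-1's published hyperparameters (not an official config)
grok1_spec = {
    "n_params": 314_000_000_000,
    "num_experts": 8,                 # Mixture-of-Experts
    "experts_per_token": 2,           # active experts per token
    "vocab_size": 131_072,            # SentencePiece tokenizer
    "max_context_length": 8_192,      # tokens
    "num_layers": 64,
    "num_query_heads": 48,
    "embedding_dim": 6_144,
    "positional_embeddings": "rotary (RoPE)",
    "quantization": "8-bit supported, with activation caching",
}
print(f"{grok1_spec['num_experts']} experts, {grok1_spec['experts_per_token']} active per token")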

The sheer size of Grok-1 means substantial hardware is needed to run it locally. Even at 8-bit precision the weights alone are roughly 320 GB, so in practice you need a multi-GPU system such as an NVIDIA DGX H100 (8 GPUs with 80 GB of VRAM each); 4-bit quantization roughly halves that requirement.
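
As a back-of-the-envelope check on those figures, weight memory scales directly with parameter count and precision; the small calculation below ignores activations and KV-cache overhead:

Python
# Rough estimate of Grok-1 weight memory at different precisions
n_params = 314e9
for bits in (16, 8, 4):
    gb = n_params * bits / 8 / 1e9
    print(f"{bits}-bit weights: ~{gb:.0f} GB")
# 16-bit: ~628 GB, 8-bit: ~314 GB, 4-bit: ~157 GB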

This release contains code and weights associated with Grok-1 that are licensed under Apache 2.0.

How to download Grok

Instructions for downloading and running Grok-1 are provided in this GitHub repository. You can clone the repository to your local machine; it contains example JAX code for loading and running the Grok-1 open-weights model.

To Download this Model:

# Clone the repository and move into it
git clone https://github.com/xai-org/grok-1.git && cd grok-1

# Install the Hugging Face Hub client (hf_transfer enables faster downloads;
# turn it on with the environment variable HF_HUB_ENABLE_HF_TRANSFER=1)
pip install huggingface_hub[hf_transfer]
# or simply: pip install -U huggingface_hub

# Download the ckpt-0 checkpoint directory into ./checkpoints
huggingface-cli download xai-org/grok-1 --repo-type model --include ckpt-0/* --local-dir checkpoints --local-dir-use-symlinks False

You can also download this model directly from the Hugging Face 🤗 Hub at xai-org/grok-1:

# You can also clone the model repository directly; it's simple
git lfs install
git clone https://huggingface.co/xai-org/grok-1

Make sure to place the ckpt-0 directory inside the checkpoints folder, as described in the GitHub repository.

Grok GitHub repository

Note: When you clone the model directly from the Hub, make sure the other files, such as run.py, are also downloaded; otherwise install them manually from GitHub. Check the image below: those files are not there because the model does not yet have an HF Transformers format.

You can also download all of the model weights with a torrent client using this magnet link:

magnet:?xt=urn:btih:5f96d43576e3d386c9ba65b883210a393b68210e&tr=https%3A%2F%2Facademictorrents.com%2Fannounce.php&tr=udp%3A%2F%2Ftracker.coppersurfer.tk%3A6969&tr=udp%3A%2F%2Ftracker.opentrackr.org%3A1337%2Fannounce

Run Model

Now your model is ready. To test the code, run the following commands:

pip install -r requirements.txt
python run.py

Output:

You can change your prompt in the run.py file.

Python
# A snippet from the run.py file
inp = "The answer to life the universe and everything is of course"
print(f"Output for prompt: {inp}", sample_from_model(gen, inp, max_len=100, temperature=0.01))

Because of the model’s size (314B parameters), you need a machine with ample GPU memory to test it with the example code.

Grok-1 PyTorch version for Transformers

A PyTorch version of the Grok-1 model and weights is now available on Hugging Face. The original modeling code, written in JAX, was translated into PyTorch. The weights were converted by mapping tensor files to parameter keys, de-quantizing the tensors with their corresponding packed scales, and saving them to a checkpoint file using torch APIs.
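
The conversion script itself is not shown here, but the de-quantization step it describes boils down to multiplying each stored int8 tensor by its scale before saving. A minimal, hypothetical sketch; the tensor names and shapes below are invented for illustration:

Python
import torch

# Hypothetical illustration of de-quantizing a packed int8 weight with its scales
w_int8 = torch.randint(-128, 128, (6144, 6144), dtype=torch.int8)  # stored quantized weight
scales = torch.rand(6144, 1, dtype=torch.bfloat16)                 # one packed scale per row
w_bf16 = w_int8.to(torch.bfloat16) * scales                        # de-quantized bfloat16 weight

# Save the converted tensor to a checkpoint file using torch APIs
torch.save({"example.layer.weight": w_bf16}, "converted_layer.pt")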

Usage

Python
import torch
from transformers import AutoModelForCausalLM
from sentencepiece import SentencePieceProcessor

torch.set_default_dtype(torch.bfloat16)

# Load the community PyTorch port of Grok-1 from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained(
    "hpcai-tech/grok-1",
    trust_remote_code=True,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# tokenizer.model is Grok-1's SentencePiece tokenizer file (place it in the working directory)
sp = SentencePieceProcessor(model_file="tokenizer.model")

text = "Replace this with your text: What is Fine Tuning?"
input_ids = sp.encode(text)
input_ids = torch.tensor([input_ids]).cuda()
attention_mask = torch.ones_like(input_ids)
generate_kwargs = {}  # Add any additional generation args here
inputs = {
    "input_ids": input_ids,
    "attention_mask": attention_mask,
    **generate_kwargs,
}
outputs = model.generate(**inputs)
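
The block above stops at model.generate; to actually read the result, you can decode the returned token ids with the same SentencePiece tokenizer. A short continuation of the code above:

Python
# Decode the generated ids back into text (continues from the snippet above)
output_ids = outputs[0].tolist()
print(sp.decode(output_ids))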

Output:

Implementation of Grok-1 Tokenizer

A Hugging Face version of the Grok-1 tokenizer, adapted from xai-org/grok-1, is now available. Grok-1 can now be utilized with Hugging Face libraries such as Transformers, Tokenizers, and Transformers.js.

Transformers/Tokenizers:

Python
from transformers import LlamaTokenizerFast

tokenizer = LlamaTokenizerFast.from_pretrained('Xenova/grok-1-tokenizer')
assert tokenizer.encode('hello world')  # returns a non-empty list of token ids
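
As a quick sanity check, you can round-trip a string through the tokenizer loaded above and inspect the ids; a small follow-up sketch:

Python
ids = tokenizer.encode('hello world')
print(ids)                    # token ids from Grok-1's 131,072-token vocabulary
print(tokenizer.decode(ids))  # back to text (may include special tokens such as BOS)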

Guidelines to download the model from the Hugging Face Hub

Download a single file; it’s just an example 🤗.

Python
from huggingface_hub import hf_hub_download

# Download a single file from the model repository
hf_hub_download(repo_id="xai-org/grok-1", filename="config.json")

Or an entire repository

Python
from huggingface_hub import snapshot_download

# Download the entire repository
snapshot_download("xai-org/grok-1")

Files will be downloaded to a local cache folder. More details are in this guide.
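
If you would rather place the files in a specific folder (for example the checkpoints directory used earlier) instead of the cache, snapshot_download also accepts a local_dir argument. A minimal sketch, assuming you only want the ckpt-0 shards:

Python
from huggingface_hub import snapshot_download

# Download only the ckpt-0 checkpoint files straight into ./checkpoints
snapshot_download(
    repo_id="xai-org/grok-1",
    allow_patterns=["ckpt-0/*"],
    local_dir="checkpoints",
)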

Conclusion

Limited access hinders progress in AI models: While some fully open-source models exist (e.g., Mistral and Falcon), many leading models are either closed-source or have restricted licenses. Companies like Meta release “free” research models (Llama 2) but restrict commercial use and further development. This limits valuable feedback and innovation from the wider research community.

Grok-1 capabilities on standard LM benchmarks

Paid exclusivity can backfire: The Grok chatbot initially required a paid subscription for access. This aimed to position Grok as a unique, edgy alternative. However, our testing found it underwhelming compared to freely available options like ChatGPT or Gemini. This suggests limited access can hinder a model’s ability to compete and improve.

Facing an issue? Ask in comments or join the discussion on Hugging Face.

TAGGED: Artificial Intelligence, Hugging Face, Programming
SOURCES: x.ai, HuggingFace.com, GitHub.com