Transformers is a state-of-the-art machine learning library by Hugging Face, built on PyTorch, TensorFlow, and JAX as the underlying frameworks. It provides thousands of pretrained models for text, vision, and audio that you can use for different tasks or fine-tune on your own data.
Introduction
Hugging Face is a machine learning community and platform created by Julien Chaumond, Clément Delangue, and Thomas Wolf in 2016. It aims to offer thousands of pretrained models based on the state-of-the-art Transformer architecture to data scientists, AI practitioners, and ML engineers.
The Transformers library enables the creation and use of those shared models. Anyone can download and use thousands of pretrained models from the Model Hub, and you can also upload your own models to the Hub!
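As a rough sketch of how little code that takes, the snippet below downloads a checkpoint from the Hub and shows how a copy could be pushed back; the repository names are placeholders, and pushing requires you to be logged in with a Hugging Face token.
from transformers import AutoModel, AutoTokenizer
# Download a checkpoint and its tokenizer from the Model Hub
model = AutoModel.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# Upload your own copy (requires a Hugging Face login/token); the repo name is a placeholder
# model.push_to_hub("my-username/my-bert-copy")
# tokenizer.push_to_hub("my-username/my-bert-copy")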
These models support common tasks in different modalities, such as:
📝 Natural Language Processing: Text Generation, Text Classification, Summarization, Translation, Named Entity Recognition, Question Answering, Language Modeling and Multiple Choice.
🖼️ Computer Vision: Object Detection, Image Classification, and Segmentation.
🗣️ Audio: Automatic Speech Recognition and Audio Classification.
🐙 Multimodal: Optical Character Recognition, Table Question Answering, Video Classification, Document Question Answering, and Visual Question Answering.
Transformers supports PyTorch, TensorFlow, and JAX. You can train a model in a few lines of code with one framework and load it for inference in another.
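As a hedged sketch of that interoperability (the checkpoint and local directory name below are placeholders, and TensorFlow must be installed for the second half), a model fine-tuned and saved with PyTorch can be reloaded in TensorFlow via from_pt=True:
from transformers import AutoModelForSequenceClassification, TFAutoModelForSequenceClassification
# Fine-tune and save the model in PyTorch...
pt_model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
pt_model.save_pretrained("my-finetuned-model")
# ...then load the same weights for inference in TensorFlow
tf_model = TFAutoModelForSequenceClassification.from_pretrained("my-finetuned-model", from_pt=True)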
Get started with Transformers
In this section, we will explore how to install the Transformers library in Python and use our first tool: the pipeline() function.
Transformers is tested on Python 3.6+, PyTorch 1.1.0+, TensorFlow 2.0+, and Flax. Install Transformers with the following command:
pip install transformers
It’s recommended to install Transformers in a virtual environment. You’ll also need to install your preferred machine learning framework:
pip install torch
pip install tensorflow
Before we explore the functionality of Transformer models under the hood, let’s take a closer look at some interesting NLP problems.
Working with pipelines
The pipeline() function has the following structure:
from transformers import pipeline

# To use the default model & tokenizer for a given task (e.g. question-answering)
pipeline("<task-name>")

# To use a specific model
pipeline("<task-name>", model="<model_name>")

# To use a custom model and tokenizer
pipeline("<task-name>", model="<model_name>", tokenizer="<tokenizer_name>")
The most basic object in the Transformers library is the pipeline() function. By connecting a model with its necessary preprocessing and postprocessing steps, it allows us to input text directly and get an intelligible answer:
from transformers import pipeline
classifier = pipeline("sentiment-analysis")
classifier("This hugging face course is great")
[{'label': 'POSITIVE', 'score': 0.9998687505722046}]
You can even pass more than one sentence! Take a look.
classifier(
    ["A great course by exnrt.com", "The official course is difficult for beginners"]
)
[{'label': 'POSITIVE', 'score': 0.9996315240859985},
{'label': 'NEGATIVE', 'score': 0.9986200332641602}]
By default, the pipeline selects a particular pretrained model that has been fine-tuned for sentiment analysis in English. The model is downloaded and cached when you create the classifier object. If you rerun the command, the cached model is used instead of being downloaded again.
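If you would rather not rely on the default choice, you can name the checkpoint explicitly. The checkpoint below is the commonly used default for English sentiment analysis and is given here as an assumption:
from transformers import pipeline
# Pin the checkpoint explicitly instead of relying on the task default
classifier = pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")
classifier("This hugging face course is great")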
Three primary steps are involved when you pass text to a pipeline (see the sketch after this list):
- First, the text is preprocessed into a format the model can understand.
- Next, the preprocessed inputs are passed to the model.
- Finally, the model’s predictions are post-processed so that you can make sense of them.
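To make these steps concrete, here is a minimal sketch of what the sentiment-analysis pipeline does under the hood, assuming the same DistilBERT checkpoint as above:
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

# Step 1: preprocess the raw text into model inputs
inputs = tokenizer("This hugging face course is great", return_tensors="pt")

# Step 2: run the model
with torch.no_grad():
    logits = model(**inputs).logits

# Step 3: post-process the logits into a human-readable prediction
probs = torch.softmax(logits, dim=-1)
print(model.config.id2label[probs.argmax(dim=-1).item()], probs.max().item())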
Let’s take the example of using the pipeline() for automatic speech recognition.
from transformers import pipeline
transcriber = pipeline(task="automatic-speech-recognition")
#transcriber = pipeline(model="openai/whisper-large-v2")
transcriber("https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac")
{'text': 'I HAVE A DREAM BUT ONE DAY THIS NATION WILL RISE UP LIVE UP THE TRUE MEANING OF ITS TREES'}
Pipeline Parameters
pipeline() supports several parameters; some are task-specific and some are general to all pipelines. In general, you can specify parameters anywhere you want. Let’s check out three important ones:
Device
If you use device=n, the pipeline automatically loads the model onto the specified device. This works whether you are using PyTorch or TensorFlow.
transcriber = pipeline(model="openai/whisper-large-v2", device=0)
Batch size
By default, pipelines do not batch inference. Batching is not necessarily faster and can actually be slower in some cases, but if it works in your use case, you can use:
transcriber = pipeline(model="openai/whisper-large-v2", device=0, batch_size=2)
audio_filenames = [f"https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/{i}.flac" for i in range(1, 5)]
texts = transcriber(audio_filenames)
This runs the pipeline on the 4 provided audio files, but it passes them to the model (which is on a GPU, where batching is more likely to help) in batches of 2 without requiring any additional code from you. The output should always match what you would have received without batching; batching is only a way to help you get more speed out of a pipeline.
Task-specific parameters
All tasks have task-specific parameters that provide additional options and flexibility to help you get the job done. For instance, the transformers.AutomaticSpeechRecognitionPipeline.__call__() method has a return_timestamps parameter, which sounds promising for subtitling videos:
transcriber = pipeline(model="openai/whisper-large-v2", return_timestamps=True)
transcriber("https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac")
As you can see, the model inferred the text and also output when the various sentences were spoken.
{'text': ' I have a dream that one day this nation will rise up and live out the true meaning of its creed.', 'chunks': [{'timestamp': (0.0, 11.88), 'text': ' I have a dream that one day this nation will rise up and live out the true meaning of its'}, {'timestamp': (11.88, 12.38), 'text': ' creed.'}]}
Vision pipeline
Using a pipeline() for vision tasks is practically identical. Specify your task and pass your image to the classifier. The image can be a link, a local path, or a base64-encoded image. For example, what is in the image below?
from transformers import pipeline
vision_classifier = pipeline(model="google/vit-base-patch16-224")
preds = vision_classifier(
    images="https://exnrt.com/wp-content/uploads/2022/03/3-2.jpg"
)
preds = [{"score": round(pred["score"], 3), "label": pred["label"]} for pred in preds]
preds

[{'score': 0.733, 'label': 'notebook, notebook computer'},
{'score': 0.151, 'label': 'laptop, laptop computer'},
{'score': 0.028, 'label': 'desktop computer'},
{'score': 0.015, 'label': 'screen, CRT screen'},
{'score': 0.012, 'label': 'modem'}]
Using pipelines on a dataset
The pipeline can also perform inference on a large dataset. The simplest way to do this is to use an iterator:
from transformers import pipeline

def data():
    for i in range(1000):
        yield f"My example {i}"

pipe = pipeline(model="openai-community/gpt2", device=0)
generated_characters = 0
for out in pipe(data()):
    generated_characters += len(out[0]["generated_text"])
The pipeline recognizes that the input is iterable and starts fetching the data while it continues processing on the GPU (this uses a DataLoader under the hood). This matters because you don’t have to allocate memory for the whole dataset and can feed the GPU as fast as possible. Since batching could speed things up here, it may be useful to tune the batch_size parameter (see the sketch after the 🤗 Datasets example below). The simplest way to iterate over a dataset is to load one from 🤗 Datasets:
# KeyDataset is a util that will just output the item we're interested in.
from transformers import pipeline
from transformers.pipelines.pt_utils import KeyDataset
from datasets import load_dataset

pipe = pipeline(model="hf-internal-testing/tiny-random-wav2vec2", device=0)
dataset = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation[:10]")
for out in pipe(KeyDataset(dataset, "audio")):
    print(out)
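If batching helps for your workload, the generator-based loop shown earlier can be batched as well. The sketch below reuses the data() generator defined above and picks batch_size=8 arbitrarily:
from transformers import pipeline
# Same generator as above, but with batching enabled on the GPU
pipe = pipeline(model="openai-community/gpt2", device=0, batch_size=8)
for out in pipe(data()):
    print(out[0]["generated_text"][:40])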
Using pipeline on large models with 🤗 accelerate
Using Accelerate, you can easily run a pipeline on large models! First, make sure you have installed it with pip install accelerate. As a first step, we will load our model with device_map="auto", using facebook/opt-1.3b as the example.
# pip install accelerate
import torch
from transformers import pipeline
pipe = pipeline(model="facebook/opt-1.3b", torch_dtype=torch.bfloat16, device_map="auto")
output = pipe("This is a cool example!", do_sample=True, top_p=0.95)
If you install bitsandbytes and add the argument load_in_8bit=True, you can also pass models loaded in 8-bit.
# pip install accelerate bitsandbytes
import torch
from transformers import pipeline
pipe = pipeline(model="facebook/opt-1.3b", device_map="auto", model_kwargs={"load_in_8bit": True})
output = pipe("This is a cool machine learning example!", do_sample=True, top_p=0.95)
Note that you can replace the checkpoint with any Hugging Face model that supports large model loading, such as BLOOM!
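The same call pattern applies when you swap checkpoints. As a small hedged example, a compact BLOOM variant (bigscience/bloom-560m, chosen here only for illustration) can be dropped in directly:
# pip install accelerate
import torch
from transformers import pipeline

# Same pipeline setup as above, with the checkpoint swapped for a small BLOOM model
pipe = pipeline(model="bigscience/bloom-560m", torch_dtype=torch.bfloat16, device_map="auto")
output = pipe("This is a cool example!", do_sample=True, top_p=0.95)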