Latest Posts
Finetuning Falcon LLMs More Efficiently With LoRA and Adapters
Finetuning allows us to adapt pretrained LLMs in a cost-efficient manner. But which method should we use? This article compares different parameter-efficient finetuning methods for the latest top-performing open-source LLM, Falcon. Using parameter-efficient finetuning methods outlined in this article, it’s possible to finetune an LLM in 1 hour on a single GPU instead of a…
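As a rough illustration of what a parameter-efficient setup can look like, here is a minimal LoRA sketch using the Hugging Face transformers and peft libraries; the model name, rank, and target module names are assumptions for illustration, not the exact configuration from the article.

```python
# Minimal LoRA sketch with Hugging Face transformers + peft.
# Model name, rank, and target modules are illustrative assumptions.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("tiiuae/falcon-7b")

lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank adapter matrices
    lora_alpha=16,                        # scaling factor for the adapter output
    lora_dropout=0.05,
    target_modules=["query_key_value"],   # Falcon's fused attention projection (assumed)
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()        # only a small fraction of weights are trainable
```

Training then proceeds as usual, but only the small adapter matrices receive gradient updates, which is what makes single-GPU finetuning feasible.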
How to Fine-Tune LLMs with Hugging Face
Large language models (LLMs) have seen a lot of progress in the last year. We went from no ChatGPT competitor to a whole zoo of LLMs, including Meta AI’s Llama 2, Mistral AI’s Mistral and Mixtral models, TII’s Falcon, and many more. These LLMs can be used for a variety of tasks, including chatbots, question…
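For orientation, a minimal supervised finetuning loop with the Hugging Face Trainer might look roughly like the sketch below; the stand-in model, dataset, and hyperparameters are placeholder assumptions rather than the article's recipe.

```python
# Bare-bones causal-LM finetuning sketch with Hugging Face transformers.
# Model name, dataset, and hyperparameters are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "gpt2"  # small stand-in; swap in Llama 2, Mistral, Falcon, etc.
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token    # causal LMs often lack a pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

dataset = load_dataset("imdb", split="train[:1%]")   # placeholder text corpus

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)  # labels = shifted inputs

args = TrainingArguments(output_dir="finetuned-model",
                         per_device_train_batch_size=2,
                         num_train_epochs=1)
trainer = Trainer(model=model, args=args,
                  train_dataset=tokenized, data_collator=collator)
trainer.train()
```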
Understanding and Coding the Self-Attention Mechanism of Large Language Models From Scratch
In this article, we are going to understand how self-attention works from scratch. This means we will code it ourselves one step at a time. Since its introduction via the original transformer paper (Attention Is All You Need), self-attention has become a cornerstone of many state-of-the-art deep learning models, particularly in the field of Natural…
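To make the idea concrete, here is a compact sketch of single-head scaled dot-product self-attention in PyTorch; the toy dimensions and random weights are illustrative and not taken from the article's code.

```python
# Single-head scaled dot-product self-attention on one toy sequence.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
seq_len, d_model, d_k = 4, 16, 8            # toy sizes, chosen for illustration
x = torch.randn(seq_len, d_model)           # token embeddings for one sequence

# projection matrices for queries, keys, and values (learnable in practice)
W_q, W_k, W_v = (torch.randn(d_model, d_k) for _ in range(3))
Q, K, V = x @ W_q, x @ W_k, x @ W_v

scores = Q @ K.T / d_k ** 0.5               # pairwise query-key similarities
attn = F.softmax(scores, dim=-1)            # attention weights, rows sum to 1
context = attn @ V                          # weighted sum of value vectors
print(context.shape)                        # torch.Size([4, 8])
```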
Understanding LLM | A Transformative Reading List
Large language models have taken public attention by storm – no pun intended. In just half a decade, large language models – transformers – have almost completely changed the field of natural language processing. Moreover, they have also begun to revolutionize fields such as computer vision and computational biology. The list below is…
Kolmogorov Smirnov Test: When and Where To Use It
What is the Kolmogorov-Smirnov test (KS test or K-S test)? The Kolmogorov-Smirnov test is used to compare two distributions to determine whether they are drawn from the same underlying distribution. In the typical ML use case, there are two distributions (A & B) that you are trying to compare.…
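As a quick illustration of comparing distribution A against distribution B, a two-sample KS test can be run with SciPy; the synthetic data below is purely illustrative.

```python
# Two-sample Kolmogorov-Smirnov test on two synthetic feature distributions.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
a = rng.normal(loc=0.0, scale=1.0, size=1_000)   # distribution A (reference window)
b = rng.normal(loc=0.3, scale=1.0, size=1_000)   # distribution B (production window)

statistic, p_value = ks_2samp(a, b)
print(f"KS statistic = {statistic:.3f}, p-value = {p_value:.4f}")
# a small p-value is evidence the two samples come from different distributions
```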
Jensen Shannon Divergence: Intuition and Practical Application
What Is JS Divergence (JS Div)? The Jensen-Shannon divergence (JS) metric – also known as information radius (IRad) or total divergence to the average – is a statistical measurement with a basis in information theory that is closely related to Kullback-Leibler divergence (KL Divergence) and population stability index (PSI). The advantage of…
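For a concrete feel, the JS divergence between two binned distributions can be computed with SciPy; note that SciPy returns the JS distance (the square root of the divergence), and the bin values below are illustrative.

```python
# Jensen-Shannon divergence between two discrete (binned) distributions.
import numpy as np
from scipy.spatial.distance import jensenshannon

p = np.array([0.10, 0.40, 0.50])   # reference distribution over three bins
q = np.array([0.20, 0.30, 0.50])   # comparison distribution over the same bins

js_distance = jensenshannon(p, q, base=2)   # SciPy returns the JS *distance*
js_divergence = js_distance ** 2            # square it to get the divergence
print(f"JS divergence = {js_divergence:.4f}")   # 0 = identical, 1 = maximal (base 2)
```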
KL Divergence: When To Use Kullback-Leibler divergence
The basics of KL divergence and how it is used in drift monitoring. What Is KL Divergence? The Kullback-Leibler divergence is a statistical measure from information theory that quantifies how one probability distribution differs from a reference probability distribution. KL divergence is also known as relative entropy. This post covers: KL Divergence Formula, KL…
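As a minimal worked example, the KL divergence between two binned distributions can be computed with scipy.stats.entropy; the bin probabilities below are illustrative.

```python
# KL divergence between two discrete (binned) distributions.
# scipy.stats.entropy(p, q) computes sum(p * log(p / q)) = KL(p || q).
import numpy as np
from scipy.stats import entropy

p = np.array([0.10, 0.40, 0.50])   # observed / production distribution
q = np.array([0.20, 0.30, 0.50])   # reference / baseline distribution

kl_pq = entropy(p, q)   # KL(p || q)
kl_qp = entropy(q, p)   # KL(q || p) -- KL is not symmetric
print(f"KL(p||q) = {kl_pq:.4f}, KL(q||p) = {kl_qp:.4f}")
```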
Population Stability Index (PSI): What You Need To Know
The population stability index (PSI) is a statistical measure with a basis in information theory that quantifies how one probability distribution differs from a reference probability distribution. The advantage of PSI over KL divergence is that it is a symmetric metric. PSI can be thought of as the round-trip loss of entropy – the…
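A minimal sketch of the calculation, assuming pre-binned proportions and a small epsilon to guard against empty bins (both assumptions for illustration):

```python
# Population Stability Index: sum over bins of (actual - expected) * ln(actual / expected).
import numpy as np

def psi(expected, actual, eps=1e-6):
    expected = np.asarray(expected, dtype=float) + eps
    actual = np.asarray(actual, dtype=float) + eps
    expected /= expected.sum()   # renormalize after adding epsilon
    actual /= actual.sum()
    return float(np.sum((actual - expected) * np.log(actual / expected)))

reference = [0.10, 0.40, 0.50]    # baseline bin proportions
production = [0.20, 0.30, 0.50]   # current bin proportions
print(f"PSI = {psi(reference, production):.4f}")
# swapping the arguments gives the same value, which is why PSI is symmetric
```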
Monitoring Text-Based Generative AI Models
Tips for measuring text-based generative models using BLEU, ROUGE, METEOR, and BERTScore, as well as prediction embeddings. In recent years, text-based generative AI models have been making significant strides in natural language processing tasks such as language translation, text summarization, and dialogue generation. These models are capable of generating text that is often indistinguishable from…
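To ground the metric names, here is a small sketch scoring one generated sentence against a reference with BLEU (via NLTK) and ROUGE (via the rouge-score package); the sentences and package choices are illustrative assumptions.

```python
# Score a candidate sentence against a reference with BLEU and ROUGE.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from rouge_score import rouge_scorer

reference = "the cat sat on the mat"
candidate = "the cat is sitting on the mat"

bleu = sentence_bleu([reference.split()], candidate.split(),
                     smoothing_function=SmoothingFunction().method1)
print(f"BLEU: {bleu:.3f}")

scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
scores = scorer.score(reference, candidate)   # score(target, prediction)
print(f"ROUGE-L F1: {scores['rougeL'].fmeasure:.3f}")
```

METEOR and BERTScore can be computed along the same lines with their respective packages.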