Semantic Text Similarity with LLM Embeddings in Python


In this post, we'll briefly learn what Semantic Text Similarity is, how LLM Embeddings enable it, and how to measure the semantic closeness between sentences in Python. The tutorial covers:

  1. What is Semantic Text Similarity?
  2. What are LLM Embeddings?
  3. Installation
  4. Loading an Embedding Model
  5. Cosine Similarity Between Two Sentences
  6. Ranking Sentences by Similarity
  7. Batch Similarity with a Query
  8. Similarity Heatmap for a Sentence Set
  9. Conclusion

Let's get started.

How to Use Hugging Face Transformers Pipeline in Python

In this post, we'll briefly learn what the Hugging Face Transformers pipeline is, how it works, and how to apply it to common NLP tasks in Python. The tutorial covers:

  1. What is the Transformers Pipeline?
  2. Installation
  3. Pipeline Task Overview
  4. Text Classification
  5. Text Generation
  6. Question Answering
  7. Named Entity Recognition
  8. Conclusion
  9. Source Code Listing

Let's get started.

LLM Embeddings – A Practical Introduction in Python

In this post, we'll briefly learn what LLM embeddings are, how they work, and how to generate and use them in Python. The tutorial covers:

  1. What are Embeddings?
  2. How LLMs Generate Embeddings
  3. Types of Embeddings
  4. Generating Embeddings with Sentence Transformers
  5. Generating Embeddings with OpenAI API
  6. Measuring Semantic Similarity
  7. Visualizing Embeddings with t-SNE
  8. Conclusion
  9. Source Code Listing

Let's get started.

Self-Attention Mechanism – A Practical Introduction in Python

In this post, we'll briefly learn what the self-attention mechanism is, how it works, and how to implement it from scratch in Python. The tutorial covers:

  1. What is Self-Attention?
  2. How Does Self-Attention Work?
  3. Query, Key, and Value Explained
  4. Self-Attention Step by Step
  5. Implementing Self-Attention in Python
  6. Self-Attention with PyTorch
  7. Conclusion
  8. Source Code Listing

Let's get started.

What is an LLM? A Practical Introduction in Python

In this post, we'll briefly learn what a Large Language Model (LLM) is, how it works, and how to run your first LLM in Python with just a few lines of code. The tutorial covers:

  • What is an LLM?
  • How does an LLM work?
  • Types of LLM architectures
  • Popular LLMs
  • Running your first LLM in Python
  • Source code listing

Tokenization in LLMs – SentencePiece and Byte-level BPE (Part 2)

In the previous tutorial, we explored LLM tokenization and learned how to use BPE and WordPiece tokenization with the tokenizers library. In this second part, we'll learn how to use the SentencePiece and Byte-level BPE methods.

The tutorial covers:

  1. Introduction to SentencePiece
  2. Implementing SentencePiece Tokenization
  3. Introduction to Byte-level BPE
  4. Implementing Byte-level BPE Tokenization
  5. Conclusion

Let's get started.