
Sentence Transformers: Architecture, Working Principles, and Practical Examples

Introduction

Traditional NLP systems treat text as sequences of words, making it difficult to capture semantic meaning. Sentence Transformers solve this problem by converting sentences, paragraphs, or documents into dense vector representations (embeddings) that preserve semantic relationships.

These embeddings allow machines to perform semantic search, clustering, similarity comparison, recommendation, and retrieval-augmented generation (RAG) efficiently.

What Is a Sentence Transformer?

A Sentence Transformer is a neural network model designed to convert sentences, paragraphs, or documents into dense numerical vectors (embeddings) that capture semantic meaning.

These embeddings allow machines to understand how similar two pieces of text are in meaning, not just in words.

Simple Example

Sentence 1: "How do I reset my password?"
Sentence 2: "What is the process to change my account password?" 

Though the words differ, their meanings are almost the same. A sentence transformer converts both into vectors that are very close in vector space.
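A minimal sketch of that comparison, assuming the all-MiniLM-L6-v2 model used later in this article:

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

embedding1 = model.encode("How do I reset my password?")
embedding2 = model.encode("What is the process to change my account password?")

# A cosine similarity close to 1 means the two sentences are nearly identical in meaning
print(util.cos_sim(embedding1, embedding2))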

How Sentence Transformers Work

Sentence Transformers are typically built using Transformer architectures such as:

  • BERT

  • RoBERTa

  • DistilBERT

  • MiniLM

Key Steps

  1. Tokenization

  2. Transformer Encoding

  3. Pooling (mean/max/CLS pooling)

  4. Fixed-size embedding output
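The sketch below makes these steps explicit by assembling the same pipeline from the library's Transformer and Pooling modules, using all-MiniLM-L6-v2 as an illustrative backbone:

from sentence_transformers import SentenceTransformer, models

# Steps 1-2: tokenization and transformer encoding (one vector per token)
word_embedding = models.Transformer("sentence-transformers/all-MiniLM-L6-v2")

# Step 3: mean pooling collapses the token vectors into one sentence vector
pooling = models.Pooling(
    word_embedding.get_word_embedding_dimension(),
    pooling_mode="mean",
)

# Step 4: the combined model returns one fixed-size embedding per sentence
model = SentenceTransformer(modules=[word_embedding, pooling])
print(model.encode("How do I reset my password?").shape)  # (384,) for this backbone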

Example Output

"How do I reset my password?"
→ [0.021, -0.932, 0.118, ..., 0.441]

This vector can be stored, compared, indexed, and searched efficiently.

Common Use Cases

Use Case                  Description
Semantic Search           Find documents based on meaning
Document Clustering       Group similar documents
Chatbots                  Match user queries to intents
Recommendation Systems    Recommend similar content
Duplicate Detection       Identify near-duplicate text
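As a sketch of the semantic-search use case (the documents and query below are made up for illustration):

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "Reset your password from the account settings page.",
    "Our office is closed on public holidays.",
    "Invoices are emailed at the end of each month.",
]
document_embeddings = model.encode(documents, convert_to_tensor=True)

query_embedding = model.encode("How do I change my password?", convert_to_tensor=True)

# Rank the documents by semantic similarity to the query
hits = util.semantic_search(query_embedding, document_embeddings, top_k=2)[0]
for hit in hits:
    print(documents[hit["corpus_id"]], round(hit["score"], 3))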

Examples

Installing Sentence Transformers

pip install sentence-transformers 

The above command installs the sentence-transformers Python library, originally developed by the UKP Lab at TU Darmstadt.

The library provides:

  • Sentence Transformer class

  • Utilities for encoding text

  • Training, fine-tuning, evaluation helpers

  • Pooling strategies (mean, CLS, max pooling, etc.)

It does NOT download any pretrained sentence embedding model. Models are fetched on demand, for example:

from sentence_transformers import SentenceTransformer
model = SentenceTransformer("all-MiniLM-L6-v2")

At this point:

  • The model is downloaded from Hugging Face Hub

  • Cached locally (usually in ~/.cache/huggingface/)

  • Reused automatically in future runs
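If you want the download to land somewhere other than the default cache, the SentenceTransformer constructor accepts a cache_folder argument; the path below is only an illustration:

from sentence_transformers import SentenceTransformer

# "./models" is an arbitrary example path; any writable directory works
model = SentenceTransformer("all-MiniLM-L6-v2", cache_folder="./models")

# Later constructions with the same name reuse the cached files
# instead of downloading them again
model = SentenceTransformer("all-MiniLM-L6-v2", cache_folder="./models")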

Some popular embedding models include:

OpenAI

  • text-embedding-3-small

  • text-embedding-3-large

Hugging Face / Open Source

  • sentence-transformers/all-MiniLM-L6-v2

  • BAAI/bge-large-en

  • intfloat/e5-large-v2

Local (Ollama)

  • nomic-embed-text

  • mxbai-embed-large
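Of these, the Hugging Face / open-source entries can be loaded directly through the sentence-transformers library by repository id; OpenAI and Ollama models are served through their own APIs instead. A hedged sketch with intfloat/e5-large-v2 (its model card recommends "query: " and "passage: " prefixes on the inputs):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("intfloat/e5-large-v2")

# e5 models expect role prefixes on the text being embedded
query_embedding = model.encode("query: How do I reset my password?")
passage_embedding = model.encode("passage: Reset your password from the account settings page.")

print(query_embedding.shape)  # 1024 dimensions for this model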

Basic Sentence Embedding Example

from sentence_transformers import SentenceTransformer
model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "How do I reset my password?",
    "What is the process to change my account password?",
    "The weather is sunny today"
]
embeddings = model.encode(sentences)
print(embeddings.shape)
print(embeddings[0])

Output

(3, 384)
[ 0.0214, -0.1189, 0.4471, ... , -0.0321 ]

What happens internally:

  1. Text → tokens

  2. Tokens → transformer layers

  3. Contextual representations → pooled

  4. Final fixed-size vectors returned

Result:

  • One vector per sentence

  • Same length for all sentences

In the output shape (3, 384):

  • 3 is the number of sentences

  • 384 is the size of each embedding vector
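Continuing the same example, the embeddings can be compared directly; the two password questions should score much closer to each other than either does to the weather sentence (exact values depend on the model):

from sentence_transformers import util

# Pairwise cosine similarities between the three sentence embeddings above
similarities = util.cos_sim(embeddings, embeddings)
print(similarities)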