Embeddings

Step 3 of 5

Welcome to the exciting world of embeddings in your Advanced Assistant Copilot! This section is crucial for setting up semantic search capabilities within Business Central using AI-powered techniques. Before diving into the technical aspects, let’s gain a foundational understanding of embeddings and their significance.

How LLMs Work

First, spend about 10 minutes exploring the Generative AI website to understand how large language models (LLMs) work, including a focused look at embeddings and vectors. This resource is invaluable for visualizing and comprehending the broader context of AI interactions.

What are Embeddings?

Embeddings are a way to convert text into a numerical form that computers can process. Here's a simple analogy: think of an embedding as a fingerprint for a piece of text. No matter how long the text is, this "fingerprint", or vector, is a fixed-size list of numbers. These numbers capture the meaning of the text, so texts with similar meanings get similar vectors. That is what lets AI models search, sort, and measure similarity between different texts.
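To make "similarity between different texts" concrete: a common way to compare two embeddings is cosine similarity, which is close to 1 when two vectors point in the same direction. The codeunit below is a minimal sketch and not part of the workshop source. The object ID 50100, the codeunit name, and the procedure name are hypothetical.

```al
// Hypothetical helper codeunit: ID and names are assumptions, not workshop code.
codeunit 50100 "Vector Similarity Sketch"
{
    // Returns the cosine similarity of two vectors of equal length.
    procedure CosineSimilarity(VectorA: List of [Decimal]; VectorB: List of [Decimal]): Decimal
    var
        DotProduct: Decimal;
        SumSquaresA: Decimal;
        SumSquaresB: Decimal;
        ValueA: Decimal;
        ValueB: Decimal;
        i: Integer;
        DimensionMismatchErr: Label 'Both vectors must have the same number of dimensions.';
    begin
        if VectorA.Count() <> VectorB.Count() then
            Error(DimensionMismatchErr);

        // AL lists are 1-based.
        for i := 1 to VectorA.Count() do begin
            ValueA := VectorA.Get(i);
            ValueB := VectorB.Get(i);
            DotProduct += ValueA * ValueB;
            SumSquaresA += ValueA * ValueA;
            SumSquaresB += ValueB * ValueB;
        end;

        // A zero vector has no direction, so treat it as "not similar".
        if (SumSquaresA = 0) or (SumSquaresB = 0) then
            exit(0);

        // Power(x, 0.5) is used here as the square root.
        exit(DotProduct / (Power(SumSquaresA, 0.5) * Power(SumSquaresB, 0.5)));
    end;
}
```

Comparing the vector of a user's question with the vector of each session in this way is one plausible basis for ranking sessions by relevance.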

Implementing Embeddings

With your newfound understanding of vectors, it’s time to build the structure to store them in Business Central. We'll focus on creating embeddings for session data, which will be used to enhance our semantic search capabilities.

1. Structure

Here’s how we’ll structure our embeddings:

Payload Text: This is the text that we convert into a vector. It could include session names, descriptions, speaker details, etc. Think of the payload as the searchable content for your copilot.
Vector: A vector is a fixed-size list of decimal numbers representing the payload text. For our use with the model text-embedding-ada-002, it will always be a list of 1536 decimal numbers.

Let's define the necessary tables in AL:

2. Session Embedding Table

This table stores the session code and its corresponding payload. Open the "src\6-AdvancedAssistantCopilot\Embeddings\SessionEmbedding.Table.al" file and add the following code:

fields
{
    field(1; "Session Code"; Code[20]) { }
    field(11; Payload; Text[2048]) { }
    field(12; Vector; Blob)
    {
        Compressed = false;
    }
}

keys
{
    key(PK; "Session Code")
    {
        Clustered = true;
    }
}

We use two main fields in our embeddings that help our Advanced Assistant Copilot search within Business Central:

field(1; "Session Code"; Code[20]) { }
field(11; Payload; Text[2048]) { }

Session Code

What It Is: The Session Code is a unique tag that connects the embedding to a specific session in the Business Central database. In databases that store vectors and payloads, like Azure AI Search or Qdrant, this tag is often called the "source."
How It Works: This tag lets search and chat apps link each result back to the record it came from. You might have seen something similar in apps like CentralQ, where each answer is linked back to its original data for context.
Enrich Assistant: In our Advanced Assistant Copilot, the Session Code is used to get more session details like start time or speaker info. These details aren't in the payload, but users might find them useful.

Payload Text

What It Is: This is the text that the AI looks through when you ask it questions. It's the main information the AI uses to understand each session.
How It Works: At its simplest, it could just be the name of the session. But it’s better to include more details, like session descriptions or information about speakers, to improve the AI's search results.

Adding More Details

For example, take Microsoft’s "Suggest sales lines" AI feature from Business Central 24. It doesn’t use embeddings, but it combines lots of details like item names, descriptions, and categories into one big text field for the AI to search through.

Choose What to Include Carefully

When you decide what to put in your payload text, think about what will make the AI's answers most useful. The more relevant information you include, the better the AI can help users.
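One way to combine several details into a single payload might look like the sketch below. It is not the workshop's implementation. The codeunit ID, its name, the procedure, and the three text parameters are hypothetical. The only thing taken from the table above is the Text[2048] limit of the Payload field.

```al
// Hypothetical helper codeunit: ID, names, and parameters are assumptions.
codeunit 50101 "Session Payload Sketch"
{
    // Combines session details into one searchable text.
    // The result is truncated to 2048 characters to fit the Payload field.
    procedure BuildPayload(SessionName: Text; Description: Text; Speakers: Text): Text[2048]
    var
        PayloadBuilder: TextBuilder;
    begin
        PayloadBuilder.AppendLine('Session: ' + SessionName);

        // Only add optional details when they are filled in.
        if Description <> '' then
            PayloadBuilder.AppendLine('Description: ' + Description);
        if Speakers <> '' then
            PayloadBuilder.AppendLine('Speakers: ' + Speakers);

        exit(CopyStr(PayloadBuilder.ToText(), 1, 2048));
    end;
}
```

Labeling each part ("Session:", "Description:", "Speakers:") is a design choice in this sketch. It keeps the payload readable when you inspect it on the Session Embeddings page.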

3. Vector Table

This table will hold the actual vectors for each session.

Open the "src\6-AdvancedAssistantCopilot\Embeddings\SessionEmbeddingVector.Table.al" file and add the following code:

fields
{
    field(1; "Session Code"; Code[20]) { }
    field(2; VectorId; Integer) { }
    field(10; VectorValue; Decimal) { }
}

keys
{
    key(PK; "Session Code", VectorId)
    {
        Clustered = true;
    }
}

Understanding Structure

The table that holds our vectors is straightforward, but it’s also quite large. Here’s how it works:

What’s Inside: For each session, using the model called text-embedding-ada-002, we store 1536 records in this table. Each record holds one element of the vector (a decimal number). Together, these numbers describe the session’s details in a way that AI models can understand.
Size and Scale: Just to give you an idea of the size, if we have 100 sessions, this table will have 153,600 records (100 × 1536). That's a lot! A table this large can be tough to manage in Business Central. Usually, embeddings are stored in specialized vector databases, but we're keeping it simple for now.
Why It Matters: This structure is not ideal for Business Central, because AL isn’t optimized for handling huge tables efficiently. We hope to have platform support for this in the future, but for now, we’re working with what we have.
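To illustrate the one-record-per-number layout, here is a minimal sketch of saving and loading a vector. It is not the workshop's implementation and rests on several assumptions. The table object name "Session Embedding Vector" is guessed from the file name. The codeunit ID, codeunit name, and procedure names are hypothetical. The fixed length of 1536 applies only to text-embedding-ada-002.

```al
// Hypothetical helper codeunit: ID and names are assumptions.
codeunit 50102 "Session Vector Storage Sketch"
{
    // Writes one record per vector element, replacing any existing vector.
    procedure SaveVector(SessionCode: Code[20]; Vector: List of [Decimal])
    var
        // Assumed table name, derived from SessionEmbeddingVector.Table.al.
        SessionEmbeddingVector: Record "Session Embedding Vector";
        i: Integer;
        UnexpectedLengthErr: Label 'Expected %1 vector values but got %2.', Comment = '%1 = expected count, %2 = actual count';
    begin
        // 1536 is the vector size of text-embedding-ada-002.
        if Vector.Count() <> 1536 then
            Error(UnexpectedLengthErr, 1536, Vector.Count());

        SessionEmbeddingVector.SetRange("Session Code", SessionCode);
        SessionEmbeddingVector.DeleteAll();

        for i := 1 to Vector.Count() do begin
            SessionEmbeddingVector.Init();
            SessionEmbeddingVector."Session Code" := SessionCode;
            SessionEmbeddingVector.VectorId := i;
            SessionEmbeddingVector.VectorValue := Vector.Get(i);
            SessionEmbeddingVector.Insert();
        end;
    end;

    // Reads the records back in primary key order (Session Code, VectorId).
    procedure LoadVector(SessionCode: Code[20]) Vector: List of [Decimal]
    var
        SessionEmbeddingVector: Record "Session Embedding Vector";
    begin
        SessionEmbeddingVector.SetRange("Session Code", SessionCode);
        if SessionEmbeddingVector.FindSet() then
            repeat
                Vector.Add(SessionEmbeddingVector.VectorValue);
            until SessionEmbeddingVector.Next() = 0;
    end;
}
```

Each call to SaveVector performs 1536 inserts. That is the cost behind the 153,600 records for 100 sessions mentioned above.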

AI Hackathon Solution

During a recent AI Hackathon, our team came up with a way to manage these large data sets at the SQL level for Business Central On-Premises environments. You can check out the details in this blog post on how we stored millions of vectors and searched through them in milliseconds.

4. Pages for Embeddings

We'll also create pages to manage and view these embeddings:

Session Embeddings

Open the "src\6-AdvancedAssistantCopilot\Embeddings\SessionEmbeddings.Page.al" file and add the following code:

layout
{
    area(content)
    {
        repeater(General)
        {
            field("Session Code"; Rec."Session Code") { ApplicationArea = All; }
            field(Payload; Rec.Payload) { ApplicationArea = All; }
            field(Vector; VectorText) { ApplicationArea = All; }
        }
    }
}

Vectors

Open the "src\6-AdvancedAssistantCopilot\Embeddings\SessionEmbeddingVectors.Page.al" file and add the following code:

layout
{
    area(content)
    {
        repeater(General)
        {
            field(SessionCode; Rec."Session Code") { ApplicationArea = All; }
            field(VectorId; Rec.VectorId) { ApplicationArea = All; }
            field(VectorValue; Rec.VectorValue) { ApplicationArea = All; DecimalPlaces = 5 : 5; }
        }
    }
}

Next Steps

Prepare to convert your payload texts into vectors. This involves understanding how to effectively gather and process the information you want to be searchable.