Token Counting

Step 7 of 7

Introduction

Token counting is essential for managing interactions with Azure OpenAI services efficiently. It helps you understand the cost implications and operational limits of your requests. In the Event Assistant Copilot, token counting ensures that your queries stay within the limits the AI model can process.

What are Tokens

Tokens are the basic units of input and output for AI models: each token represents a word or a part of a word in the text. The number of tokens used in a request determines the cost of the interaction and limits how many tokens remain available for the AI model's response.

Example:

Text                                                    | Tokens                                                      | Token Count
------------------------------------------------------- | ----------------------------------------------------------- | -----------
Hello AL developers! Welcome to the AI world!!!         | Hello AL developers ! Welcome to the AI world !!!           | 10
Hello AL developers! Welcome to the amazing AI world!!! | Hello AL developers ! Welcome to the ama zing AI world !!!  | 12

You can experiment with an online tokenizer to get a better understanding of how text is split into tokens.

Note

The spacing between the tokens in the example above has been added purely for readability. In reality, the tokens are not separated by spaces.
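
If you want to inspect token counts from AL, the same "AOAI Token" codeunit used later in this step can report them. Here is a minimal sketch (ShowExampleTokenCount is a hypothetical helper name, and exact counts can vary with the tokenizer version):

local procedure ShowExampleTokenCount()
var
    AOAIToken: Codeunit "AOAI Token";
begin
    // Counts tokens the way the GPT-3.5 models tokenize text.
    Message('Token count: %1', AOAIToken.GetGPT35TokenCount('Hello AL developers! Welcome to the AI world!!!'));
end;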

Why Should We Care

Understanding token limitations is crucial when working with Azure OpenAI models. Each model, like the GPT-3.5-Turbo (0125) we're using, has specific limits on how many tokens it can handle for both input and output during an interaction.

Token Limits Explained

For the GPT-3.5-Turbo (0125) model:

  • Maximum Request Tokens: 16,385
  • Maximum Output Tokens: 4,096

These limits apply jointly to a single interaction: input and output together cannot exceed 16,385 tokens, and the output alone is capped at 4,096. In other words, the output budget is the smaller of 4,096 and 16,385 minus the input token count, so if your input uses up the entire token limit, the AI won't be able to generate any response.

Here are some examples to illustrate this:

  • 16,385 tokens used for input: No space left for output.
  • 14,000 tokens used for input: Up to 2,385 tokens available for output.
  • 12,000 tokens used for input: Maximum of 4,096 tokens can be used for output.
  • 1,000 tokens used for input: Maximum of 4,096 tokens can be used for output.

Cost Implications

The cost of using Azure OpenAI services is based on the number of tokens processed, both for input and output:

  • Input Cost: $0.0005 per 1,000 tokens
  • Output Cost: $0.0015 per 1,000 tokens

This pricing structure emphasizes the importance of efficient token management to keep costs manageable, especially when scaling up usage.
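
To make this concrete: a request that uses 14,000 input tokens and receives a 2,000-token response costs (14,000 / 1,000) × $0.0005 + (2,000 / 1,000) × $0.0015 = $0.007 + $0.003 = $0.01.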

Managing Token Usage

By understanding and managing token usage, you can ensure that your interactions with the AI are not only cost-effective but also fit within the operational constraints of the model. This management helps prevent errors due to exceeding token limits and ensures a smooth user experience.

For more details on the models and their specifications, you can refer to the Azure AI Services documentation and pricing.

Implementation

1. Open File

Navigate to the file located at "\src\5-EventAssistantCopilot\Implementation\EventAssistImpl.Codeunit.al". This file is where you'll implement the logic for token counting alongside the AI integration.

2. Define Helper Functions

Add the following procedures to manage the token limits effectively. These functions help control how many tokens are used for input and output, ensuring that the AI's responses are within the model's capabilities.

local procedure MaxInputTokens(): Integer
begin
    exit(MaxModelTokens() - MaxOutputTokens());
end;

local procedure MaxOutputTokens(): Integer
begin
    exit(4096); // Max tokens for output as per the model's specification
end;

local procedure MaxModelTokens(): Integer
begin
    exit(16385); // Total tokens available for the model
end;
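
With these values, MaxInputTokens() evaluates to 16,385 − 4,096 = 12,289 tokens, which is the budget available for everything you send to the model, including the user's question and the tool definitions.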

3. Set Output Max Tokens

In the GetAnswer() function, just after setting the temperature parameter, add the following line to set the maximum tokens that the AI model can use for generating responses.

AOAIChatCompletionParams.SetMaxTokens(MaxOutputTokens());

Note

By default, the model will use its maximum output token limit when generating the response. However, you can set a lower limit to control the length of the response and manage costs effectively.
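
For example, a sketch of a tighter cap (the value 1,000 is purely illustrative):

// Cap responses at 1,000 tokens instead of the model's 4,096 maximum.
AOAIChatCompletionParams.SetMaxTokens(1000);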

4. Check Question Token Count

Implement a function that checks whether the text you want to add (the user's question or a tool's definition) fits within the maximum input token limit. This check is crucial for avoiding errors caused by exceeding token limits.

local procedure CheckIfTextCanBeAdded(var TotalUsedTokens: Integer; NewText: Text): Boolean
var
    AOAIToken: Codeunit "AOAI Token";
    NewTextTokenCount: Integer;
begin
    // Count the new text once and reuse the result.
    NewTextTokenCount := AOAIToken.GetGPT35TokenCount(NewText);
    if TotalUsedTokens + NewTextTokenCount > MaxInputTokens() then
        exit(false);

    TotalUsedTokens += NewTextTokenCount;
    exit(true);
end;
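
Note that TotalUsedTokens is passed by var: every successful check adds the new text's token count to the running total, while a failed check leaves the total untouched, so the same counter can be threaded through each addition to the request.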

Then, in GetAnswer(), right after the SetMaxTokens call you added in step 3, check whether the user's question can be added to the AI request. TotalInputTokenCount is a local Integer variable in GetAnswer() that tracks the input tokens used so far.

if not CheckIfTextCanBeAdded(TotalInputTokenCount, Question) then
    Error('The question is too long. Please try again with a shorter question.');

5. Check Tools Token Count

Before adding each tool to your AI request, use the CheckIfToolCanBeAdded function (defined in step 6) to ensure that adding the tool's definition won't exceed the input token limit.

For Business Central 24.0 or 24.1:

if CheckIfToolCanBeAdded(TotalInputTokenCount, EventAssistantToolsImpl.GetTotalSessionsTool()) then
    AOAIChatMessages.AddTool(EventAssistantToolsImpl.GetTotalSessionsTool());

For Business Central 24.2 or later:

if CheckIfToolCanBeAdded(TotalInputTokenCount, GetSessionDetailsByNameTool.GetPrompt()) then begin
    GetSessionDetailsByNameTool.SetProjectCode(ProjectNo);
    AOAIChatMessages.AddTool(GetSessionDetailsByNameTool);
end;
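
The difference between the two variants: in Business Central 24.0 and 24.1, a tool is a raw JsonObject definition that is added directly, while from 24.2 onward a tool is a codeunit implementing the "AOAI Function" interface, whose GetPrompt() method returns the JsonObject definition that is measured for tokens.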

6. Define Tool Tokens Checker

Create a helper procedure that checks if a tool can be added based on its token count.

local procedure CheckIfToolCanBeAdded(var TotalUsedTokens: Integer; NewTool: JsonObject): Boolean
var
    NewToolText: Text;
begin
    // Serialize the tool definition and reuse the text-based check.
    NewToolText := ConvertJsonToText(NewTool);
    exit(CheckIfTextCanBeAdded(TotalUsedTokens, NewToolText));
end;

local procedure ConvertJsonToText(NewTool: JsonObject) JsonText: Text
begin
    NewTool.WriteTo(JsonText);
end;

7. Check Other Tools

Apply the token checking process for other tools such as GetTotalSessionsByTrackTool, GetSessionDetailsByNameTool, and GetSpeakerScheduleTool. Ensure each tool addition is validated for token count before adding to the chat message.
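
As a sketch for Business Central 24.2 or later, assuming GetSpeakerScheduleTool exposes the same GetPrompt() and SetProjectCode() pattern as the tool in step 5:

if CheckIfToolCanBeAdded(TotalInputTokenCount, GetSpeakerScheduleTool.GetPrompt()) then begin
    GetSpeakerScheduleTool.SetProjectCode(ProjectNo);
    AOAIChatMessages.AddTool(GetSpeakerScheduleTool);
end;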

Debugging and Validation

Set breakpoints and debug your code to see how many tokens each tool uses. This will help you estimate how many tools you can safely include without exceeding the model's token limits.
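
As a lightweight alternative to breakpoints, you can temporarily surface the running total inside GetAnswer(), where TotalInputTokenCount is in scope:

// Temporary debugging aid: show how much of the input budget is consumed.
Message('Input tokens used: %1 of %2', TotalInputTokenCount, MaxInputTokens());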

Conclusion

By following these steps, you've successfully integrated token counting into your Event Assistant Copilot. This will help you manage costs and ensure your interactions with Azure OpenAI are efficient and within expected operational limits.

🎇 Congratulations!

You've successfully implemented your Event Assistant Copilot to assist users with event-related questions.