Token Counting¶
Introduction¶
Token counting is essential for managing interactions with Azure OpenAI services efficiently. It helps you understand the cost implications and operational limits of your requests. In the Event Assistant Copilot, token counting ensures that your queries stay within the limits the AI can process.
What are Tokens¶
Tokens are the basic units of input and output for AI models. Each token represents a word or part of a word in the text. The number of tokens in a request determines the cost of the call and constrains how long the AI model's response can be.
Example:
| Text | Tokens | Token Count |
|---|---|---|
| Hello AL developers! Welcome to the AI world!!! | Hello AL developers ! Welcome to the AI world !!! | 10 |
| Hello AL developers! Welcome to the amazing AI world!!! | Hello AL developers ! Welcome to the ama zing AI world !!! | 12 |
You can experiment here to get a better understanding of how tokens are generated.
Note
The spacing between the tokens in the example above has been added purely for readability. In reality, the tokens are not separated by spaces.
Why Should We Care¶
Understanding token limitations is crucial when working with Azure OpenAI models. Each model, like the GPT-3.5-Turbo (0125) we're using, has specific limits on how many tokens it can handle for both input and output during an interaction.
Token Limits Explained¶
For the GPT-3.5-Turbo (0125) model:
- Maximum Request Tokens: 16,385
- Maximum Output Tokens: 4,096
These numbers represent the total token capacity for a single interaction: input and output share the same budget. If your input consumes the entire limit, no capacity is left for the AI to generate a response.
Here are some examples to illustrate this (the sketch after the list expresses the same arithmetic in code):
- 16,385 tokens used for input: No space left for output.
- 14,000 tokens used for input: Up to 2,385 tokens available for output.
- 12,000 tokens used for input: Maximum of 4,096 tokens can be used for output.
- 1,000 tokens used for input: Maximum of 4,096 tokens can be used for output.
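Expressed in code, the available output budget is the smaller of the model's output cap and whatever the total capacity leaves after the input. Here is a minimal sketch of that arithmetic; it anticipates the MaxModelTokens() and MaxOutputTokens() helpers defined in the Implementation section below:

```al
local procedure AvailableOutputTokens(InputTokens: Integer): Integer
begin
    // Output is capped both by the model's 4,096-token output limit and
    // by whatever the 16,385-token total leaves after the input.
    if MaxModelTokens() - InputTokens < MaxOutputTokens() then
        exit(MaxModelTokens() - InputTokens);
    exit(MaxOutputTokens());
end;
```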
Cost Implications¶
The cost of using Azure OpenAI services is based on the number of tokens processed, both for input and output:
- Input Cost: $0.0005 per 1,000 tokens
- Output Cost: $0.0015 per 1,000 tokens
This pricing structure emphasizes the importance of efficient token management to keep costs manageable, especially when scaling up usage.
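As a worked example, a request that consumes 10,000 input tokens and generates 2,000 output tokens costs 10 × $0.0005 + 2 × $0.0015 = $0.005 + $0.003 = $0.008.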
Managing Token Usage¶
By understanding and managing token usage, you can ensure that your interactions with the AI are not only cost-effective but also fit within the operational constraints of the model. This management helps prevent errors due to exceeding token limits and ensures a smooth user experience.
For more details on the models and their specifications, you can refer to the Azure AI Services documentation and pricing.
Implementation¶
1. Open File¶
Navigate to the file located at "\src\5-EventAssistantCopilot\Implementation\EventAssistImpl.Codeunit.al". This file is where you'll implement the logic for token counting alongside the AI integration.
2. Define Helper Functions¶
Add the following procedures to manage the token limits effectively. These functions help control how many tokens are used for input and output, ensuring that the AI's responses are within the model's capabilities.
```al
local procedure MaxInputTokens(): Integer
begin
    exit(MaxModelTokens() - MaxOutputTokens());
end;

local procedure MaxOutputTokens(): Integer
begin
    exit(4096); // Max tokens for output as per the model's specification
end;

local procedure MaxModelTokens(): Integer
begin
    exit(16385); // Total tokens available for the model
end;
```
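Note the design choice here: deriving MaxInputTokens() by subtraction reserves the full 4,096-token output budget up front, so any request that passes the input check is guaranteed to leave room for a complete response.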
3. Set Output Max Tokens¶
In the GetAnswer() function, just after setting the temperature parameter, add the following line to set the maximum number of tokens the AI model can use for generating responses:

```al
AOAIChatCompletionParams.SetMaxTokens(MaxOutputTokens());
```
Note
By default, the model will use its maximum output token limit when generating the response. However, you can set a lower limit to control the length of the response and manage costs more effectively.
4. Check Question Token Count¶
Implement a function that checks whether the text you want to add (the user's question or a tool's definition) fits within the maximum input token limit. This check is crucial for avoiding errors caused by exceeding the model's limits.
```al
local procedure CheckIfTextCanBeAdded(var TotalUsedTokens: Integer; NewText: Text): Boolean
var
    AOAIToken: Codeunit "AOAI Token";
    NewTextTokens: Integer;
begin
    // Count the tokens once; reject the text if it would push the
    // running total past the input budget.
    NewTextTokens := AOAIToken.GetGPT35TokenCount(NewText);
    if TotalUsedTokens + NewTextTokens > MaxInputTokens() then
        exit(false);
    TotalUsedTokens += NewTextTokens;
    exit(true);
end;
```
Check whether the user's question can be added to the AI request:

```al
if not CheckIfTextCanBeAdded(TotalInputTokenCount, Question) then
    Error('The question is too long. Please try again with a shorter question.');
```
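If your copilot also sets a system prompt, remember that it consumes input tokens too, so count it against the same running total. A minimal sketch, where GetSystemPrompt() is a hypothetical helper standing in for however this codeunit builds its system message:

```al
// Hypothetical helper: GetSystemPrompt() represents whatever text this
// codeunit sends as the system message; count it like any other input.
if not CheckIfTextCanBeAdded(TotalInputTokenCount, GetSystemPrompt()) then
    Error('The system prompt exceeds the input token limit.');
```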
5. Check Tools Token Count¶
Before adding each tool to your AI request, use the CheckIfToolCanBeAdded function (defined in the next step) to ensure that adding the tool's definition won't exceed the input token limit.
For Business Central 24.0 or 24.1:
```al
if CheckIfToolCanBeAdded(TotalInputTokenCount, EventAssistantToolsImpl.GetTotalSessionsTool()) then
    AOAIChatMessages.AddTool(EventAssistantToolsImpl.GetTotalSessionsTool());
```
For Business Central 24.2 or later:
```al
if CheckIfToolCanBeAdded(TotalInputTokenCount, GetSessionDetailsByNameTool.GetPrompt()) then begin
    GetSessionDetailsByNameTool.SetProjectCode(ProjectNo);
    AOAIChatMessages.AddTool(GetSessionDetailsByNameTool);
end;
```
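The difference between the two versions reflects a change in the System Application: from Business Central 24.2 onward, tools are codeunits implementing the AOAI Function interface, so a tool's JSON definition is obtained via its GetPrompt() method rather than passed around as a raw JsonObject.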
6. Define Tool Tokens Checker¶
Create a helper procedure that checks if a tool can be added based on its token count.
```al
local procedure CheckIfToolCanBeAdded(var TotalUsedTokens: Integer; NewTool: JsonObject): Boolean
var
    NewToolText: Text;
begin
    // Serialize the tool definition to text and reuse the generic text check.
    NewToolText := ConvertJsonToText(NewTool);
    exit(CheckIfTextCanBeAdded(TotalUsedTokens, NewToolText));
end;

local procedure ConvertJsonToText(JsonObject: JsonObject) JsonText: Text
begin
    JsonObject.WriteTo(JsonText);
end;
```
7. Check Other Tools¶
Apply the same token check to the remaining tools, such as GetTotalSessionsByTrackTool, GetSessionDetailsByNameTool, and GetSpeakerScheduleTool, validating each tool's token count before adding it to the chat messages, as sketched below.
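Here is an illustrative sketch of the pattern for the 24.0/24.1 style; the getter names are assumed to mirror GetTotalSessionsTool() and depend on your EventAssistantToolsImpl codeunit:

```al
// Illustrative only: getter names assumed to mirror GetTotalSessionsTool().
if CheckIfToolCanBeAdded(TotalInputTokenCount, EventAssistantToolsImpl.GetTotalSessionsByTrackTool()) then
    AOAIChatMessages.AddTool(EventAssistantToolsImpl.GetTotalSessionsByTrackTool());

if CheckIfToolCanBeAdded(TotalInputTokenCount, EventAssistantToolsImpl.GetSpeakerScheduleTool()) then
    AOAIChatMessages.AddTool(EventAssistantToolsImpl.GetSpeakerScheduleTool());
```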
Debugging and Validation¶
Set breakpoints and debug your code to see how many tokens each tool uses. This will help you estimate how many tools you can safely include without exceeding the model's token limits.
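If you prefer a quick readout over stepping through the debugger, a temporary diagnostic message can surface the running total (remove it before shipping):

```al
// Temporary diagnostic: show how much of the input budget is used.
Message('Total input tokens used so far: %1', TotalInputTokenCount);
```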
Conclusion¶
By following these steps, you've successfully integrated token counting into your Event Assistant Copilot. This will help you manage costs and ensure your interactions with Azure OpenAI are efficient and within expected operational limits.
Congratulations!¶
You've successfully implemented your Event Assistant Copilot to assist users with event-related questions.