This post explores the use of the AI Toolkit in conjunction with the Prompt Orchestration Markup Language (POML) for on-premises model usage. It demonstrates how this structured approach addresses the challenges of complex, multi-step tasks, moving beyond unstructured prompts to a more programmatic and reliable way of interacting with language models. The focus is on achieving predictable, accurate results in a controlled environment.
Prompts and system messages for large language models (LLMs) often become unstructured and difficult to manage, especially when they grow lengthy and contain few-shot examples. This frequently leads to incorrect responses, hallucinations, and similar problems. For those who have experienced this challenge, Prompt Orchestration Markup Language (POML) offers a structured solution.
POML can be thought of as the HTML of AI prompts: it uses familiar tag-based components to bring organization and reusability to a set of instructions. Its design addresses common frustrations in prompt engineering, allowing for more efficient and scalable interaction with LLMs.
Why use POML?
The adoption of POML provides several key benefits:
- Structure and Reusability: POML eliminates unstructured, hard-to-edit text prompts. Prompts can be constructed in a way that makes them easier to maintain and reuse, integrating them as a central part of a technical workflow.
- Diverse Data Handling: POML facilitates the seamless integration of various data types directly into a prompt. It is capable of pulling in text from Word documents, tables from CSVs, images, audio files, and even entire file folders, providing the LLM with comprehensive context.
- Enhanced Workflow: A dedicated VS Code extension offers a powerful environment for writing, testing, and managing prompts in a manner similar to code, complete with features such as live preview.
The Core Components of POML
The architecture of POML is built upon the following main types of components:
- Basic Components: These tags provide logical structure and formatting. Examples include <p> for paragraphs and <ul> for lists.
- Intention Components: These are used to clearly define the objectives for the LLM. Tags like <task>, <role>, and <example> help in specifying the AI's intended actions and behavior.
- Data Components: These are crucial for incorporating external information. Tags such as <document>, <table>, and <img> allow for the inclusion of various data sources.
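The three component types compose naturally in a single file. Here is a minimal sketch (the `changelog.txt` file name is hypothetical, used only for illustration):

```xml
<poml>
  <!-- Intention components: define the objective and behavior -->
  <role>You are a concise release-notes writer.</role>
  <task>Summarize the changes listed in the attached document.</task>
  <!-- Basic components: logical structure and formatting -->
  <p>Focus only on user-facing changes.</p>
  <!-- Data components: pull in external information -->
  <document src="changelog.txt" />
</poml>
```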
POML in Action:
Let’s now see POML in action. To begin, we need the POML extension installed in Visual Studio Code: navigate to Extensions, search for “POML”, and click Install.
Once the POML extension is installed, it’s time to configure the LLMs. Follow these steps to configure the LLM for testing POML files:
- Press Ctrl + , (comma) or go to File > Preferences > Settings.
- Navigate to POML Extension Settings
- In the search bar within the Settings tab, type "POML" to filter the settings.
- Locate the settings related to the POML extension.
- Configure Language Model Settings like API Key, API URL, Max Tokens, Provider.
- In this example, we are using GitHub Models to configure the LLMs for the POML toolkit. Obtain a personal access token from: Github PAT. This PAT will be used as the API Key.
Configure the provider with the following parameters:
- Provider: OpenAI (Select from dropdown)
- API_URL: https://models.github.ai/inference
- Max Tokens: 1500
- Model: gpt-4o
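The same settings can be sanity-checked outside VS Code. The sketch below (an assumption, not part of the extension) builds the OpenAI-style request body that would be sent to the GitHub Models endpoint; the `build_payload` helper name and the commented `requests.post` call are illustrative only:

```python
import json

# Base URL from the extension settings above; "/chat/completions" is the
# OpenAI-compatible path (an assumption for a direct test outside VS Code).
API_URL = "https://models.github.ai/inference/chat/completions"

def build_payload(prompt: str, model: str = "gpt-4o", max_tokens: int = 1500) -> str:
    """Serialize an OpenAI-style chat request body matching the settings above."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    })

# With a GitHub PAT in hand, the request could then be sent with, e.g.:
# requests.post(API_URL,
#               headers={"Authorization": f"Bearer {PAT}",
#                        "Content-Type": "application/json"},
#               data=build_payload("Hello"))
```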
To start, we'll create a new .poml file. This file type is specifically designed for defining prompts in POML. In our example, we'll use the POML toolkit to pass an image of the photosynthesis process to an LLM and ask it to explain the concept in a way a 10-year-old can understand.
Open a new directory and make sure the .poml and the image file are in the same directory. Create a new file and name it “example.poml” and paste the following content.
<poml>
  <role>You are a patient teacher explaining concepts to a 10-year-old.</role>
  <task>Explain the concept of photosynthesis using the provided image as a reference.</task>
  <img src="https://github.com/microsoft/poml/raw/HEAD/photosynthesis_diagram.png" alt="Diagram of photosynthesis" />
  <output-format>
    Keep the explanation simple, engaging, and under 100 words.
    Start with "Hey there, future scientist!".
  </output-format>
</poml>
Save the file and then click “Open POML Preview to the Side” in the top-right corner.
A dedicated window shows a preview of the POML. Display settings are available to view it in a rendered format, which illustrates how the prompt will appear when passed to the LLM.
It’s finally time to execute this prompt, and this is perhaps the most interesting part! Click the “Run” option in the example.poml window.
Note: If the Run option is not visible, close the preview tab and it will appear.
In the output window, we can now see the output from the LLM.
Great! POML has been successfully used to pass prompts and generate completions from the LLM.
It is also worth mentioning that POML can be integrated directly into Python code. We will demonstrate this using an on-premises model hosted via the AI Toolkit. If you are unfamiliar with the AI Toolkit, you are encouraged to go through the provided link.
Launch the AI Toolkit extension and choose an offline model, then copy the model’s name by right-clicking it in the list and choosing “Copy model name”. This step is crucial: a wrong model name leads to an invalid request to the language model.
Note: There are a variety of models to choose from in the AI Toolkit; models from Ollama, Azure AI Foundry, and Hugging Face can also be used through the toolkit.
Local models hosted via the AI Toolkit are by default available at http://127.0.0.1:5272/v1/chat/completions. This is the address we will use when sending a request with the requests library in Python.
Note: If you are using it directly from Python in an application, then use the URL http://127.0.0.1:5272/v1/
We will create a POML file to build a Data chatbot that queries a CSV. As this presents a lengthier prompting scenario, using POML is an effective way to provide the language model with clear, step-by-step instructions for successful task completion.
Create a file orders.poml and paste the following content,
<poml>
  <role> You are a helpful chatbot agent answering customer's question in a chat. </role>
  <task> Your task is to answer the customer's question using the data provided in the data section.
    <!-- Use listStyle property to change the style of a list. -->
    <list listStyle="decimal">
      <item> You can access order history in the orders section, including email id and order total with payment summary. </item>
      <item> Refer to orderlines for item-level details within each order in orders. </item>
    </list>
  </task>
  <!-- cp means CaptionedParagraph, which is a paragraph with a customised heading. -->
  <cp caption="Data">
    <cp caption="Orders">
      <!-- Use table to read a csv file. By default, it follows its parents' style (markdown in this case). -->
      <table src="orders.csv" key="order_id" />
    </cp>
    <cp caption="OrderLines">
      <!-- Use syntax to specify its output format. -->
      <table src="orderlines.csv" syntax="tsv" />
    </cp>
  </cp>
  <!-- This can also be stepwise-instructions, and it's case-insensitive. -->
  <StepwiseInstructions>
    <!-- Read a file and save it as instructions -->
    <let src="order_instructions.json" name="instructions" />
    <!-- Use a for loop to iterate over the instructions; use {{ }} to evaluate an expression -->
    <p for="inst in instructions">
      Instruction {{ loop.index + 1 }}: {{ inst }}
    </p>
  </StepwiseInstructions>
  <!-- Specify the speaker of a block -->
  <HumanMessage>
    <qa> How much did I pay for my last order? </qa>
  </HumanMessage>
  <!-- Use stylesheet (a CSS-like JSON) to modify the style in a batch. -->
  <stylesheet>
    {
      "cp": {
        "captionTextTransform": "upper"
      }
    }
  </stylesheet>
</poml>
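The `<let>` tag above loads order_instructions.json into the `instructions` variable, which the `for` loop then iterates over. The actual file is in the linked repository; a hypothetical sketch of its shape, assuming a plain JSON array of instruction strings, might look like:

```json
[
  "Identify the customer's most recent order in the Orders table.",
  "Sum the order total and payment details from the matching row.",
  "Answer using only the exact amounts found in the data."
]
```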
We can also preview it while we are creating the file using the POML preview feature of the POML Extension.
We have also created the CSV files that carry the order data and a JSON file used to specify instructions to the language model. Both of these files can be found here.
The next step is to create the Python file that will use the POML file to interact with the language model. Create a new directory and set up a Python virtual environment: press “Ctrl+Shift+P” and, in the search bar, type “Python: Create Environment”. Make sure Python is installed on your computer before proceeding with this step. Choose “.venv” from the dropdown and select the relevant interpreter path.
Let’s now install the libraries using pip. In the terminal, type the following command:
pip install poml requests
Create a new file app.py and paste the following code in the file,
from poml import poml
import requests
import json

# Load and render the POML file into chat messages
messages = poml("orders.poml", chat=True)

# Combine messages into a single prompt
full_prompt = "\n".join(
    ["\n".join(str(c).strip() for c in m["content"]) if isinstance(m.get("content"), list)
     else str(m["content"]).strip()
     for m in messages if m.get("content")]
)

print("\n---Full Prompt---\n")
print(full_prompt)

# Send the request to the AI Toolkit model
url = "http://127.0.0.1:5272/v1/chat/completions"
payload = json.dumps({
    "model": "Phi-4-cpu-int4-rtn-block-32-acc-level-4",
    "messages": [
        {
            "role": "user",
            "content": full_prompt
        }
    ],
    "temperature": 0.7,
    "top_p": 1,
    "top_k": 10,
    "max_tokens": 100,
    "stream": False
})
headers = {
    "Content-Type": "application/json"
}
response = requests.post(url, headers=headers, data=payload)

# View the completion
print("\n---AI Toolkit Response---\n")
print(response.text)
print(response.json()["choices"][0]["message"]["content"])
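The last line of the script indexes straight into `choices[0]`, which fails with an unhelpful `KeyError` if the server returns an error body. A small defensive helper, not part of the original script and purely a suggested addition, could guard that step:

```python
# Hedged sketch: extract the first completion from an OpenAI-style response
# body, raising a readable error instead of a bare KeyError/IndexError.
def extract_completion(response_json: dict) -> str:
    """Return the first completion's text, or raise with the server's error."""
    if "error" in response_json:
        raise RuntimeError(f"Model server error: {response_json['error']}")
    choices = response_json.get("choices", [])
    if not choices:
        raise RuntimeError("No completions returned by the model server")
    return choices[0]["message"]["content"]
```

In the script above, the final print would then become `print(extract_completion(response.json()))`.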
With the configurations complete, the setup can now be tested. Clicking "RUN" will display the POML-based prompt in the terminal before it is sent to the Phi-4 model.
The clarity of the prompt being sent to the language model is notable. This structured approach is why POML is effective at ensuring a better response, particularly when there is a need to impose order on a complex or lengthy prompting scenario like the one used for this Data chatbot.
POML: Completions
This is the model's response after being prompted with the POML file.
The output demonstrates how the structured prompt successfully instructed the model to extract and organize data from a CSV file. This shows POML's effectiveness in guiding the language model to perform specific, data-driven tasks accurately. Unlike XML or JSON, which require dedicated parsing code, POML handles the parsing automatically. This allows one to focus on the prompt's content rather than the underlying data structure.
This has demonstrated the shift from simple conversational prompts to a more programmatic approach to interacting with language models. POML provides the essential structure required to manage complex scenarios, ensuring that models like Phi-4, hosted on-premises via the AI Toolkit as shown here, can handle tasks like data extraction with remarkable clarity and precision. This method offers a robust path forward for developing more reliable and sophisticated LLM applications.
The full code for this project is available in the AI_Toolkit_Samples GitHub repository.
In our upcoming series of posts, these concepts will be expanded upon as we delve into creating agentic workflows. By leveraging POML, these agents can be given clear, hierarchical instructions, enabling them to make decisions, execute tasks, and interact with various tools in a reliable manner.