
Aditya_vp
Copper Contributor
Apr 06, 2025

Enhancing ESGai: Performance Improvements Through Parallel Processing and Structured Data Management

We're excited to announce significant performance enhancements to our ESGai solution, focusing on reducing response times and improving data organization. These upgrades represent a major step forward in our mission to deliver efficient ESG insights.

https://youtu.be/5-oBdge6Q78?si=Vb9aHx79xk3VGYAh

Key Improvements

Parallel Processing with Multiple Worker Agents

The most impactful change is our shift from sequential to parallel processing. Previously, sub-questions generated by the Manager agent were processed one after another by a single Worker agent, creating a bottleneck:

  • Before: Sub-questions entered a queue, with total response time being the sum of all processing times
  • Now: Three Worker agents operate simultaneously, each powered by its own GPT-4o model deployment
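  • Result: When there are no more sub-questions than Worker agents, total response time is determined by the longest-running sub-question rather than the sum of all processing times; with more sub-questions than Workers, some Workers process several in turn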

This parallel architecture dramatically reduces wait times for end users, as multiple data retrievals and analyses happen concurrently rather than sequentially.
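The fan-out described above can be sketched with Python's asyncio. This is only an illustration under assumptions, not the ESGai implementation: `answer_sub_question` is a hypothetical stand-in for a call to a Worker's GPT-4o deployment, and the delays simulate processing time.

```python
import asyncio

NUM_WORKERS = 3  # the post states that three Worker agents run at once


async def answer_sub_question(worker_id: int, question: str, delay: float) -> str:
    # Hypothetical placeholder for a call to this worker's model deployment.
    await asyncio.sleep(delay)
    return f"worker-{worker_id}: answer to {question!r}"


async def worker(worker_id: int, queue: asyncio.Queue, results: dict) -> None:
    # Each worker keeps taking the next pending sub-question until none remain.
    while True:
        try:
            index, question, delay = queue.get_nowait()
        except asyncio.QueueEmpty:
            return
        results[index] = await answer_sub_question(worker_id, question, delay)


async def run_parallel(sub_questions: list[tuple[str, float]]) -> list[str]:
    queue: asyncio.Queue = asyncio.Queue()
    for index, (question, delay) in enumerate(sub_questions):
        queue.put_nowait((index, question, delay))
    results: dict[int, str] = {}
    await asyncio.gather(*(worker(i, queue, results) for i in range(NUM_WORKERS)))
    # Return answers in the order the Manager produced the sub-questions.
    return [results[i] for i in range(len(sub_questions))]
```

Calling `asyncio.run(run_parallel([("q1", 0.02), ("q2", 0.01)]))` returns the answers in sub-question order, regardless of which worker finished first.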

Mathematical Improvement in Response Time

We can quantify this improvement mathematically:

Let's define:

  • n = number of sub-questions
  • t = average time taken per sub-question
  • s = time taken by the longest sub-question (where t ≤ s ≤ n×t)

Before: Total processing time = n × t (sequential processing)
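Now: Total processing time = s (parallel processing, when every sub-question has its own Worker; with w Workers and n > w, the total is at least the larger of s and n×t/w)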

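When s is much smaller than n×t, this is a substantial improvement in response time. For example, with 5 sub-questions taking an average of 10 seconds each, the previous architecture would require 50 seconds. If every sub-question ran on its own Worker, the new architecture would require only as long as the longest sub-question, say 15 seconds. With three Workers sharing five sub-questions, at least one Worker must process two of them, so the total cannot fall below 50 ÷ 3 ≈ 16.7 seconds and depends on how the sub-questions are distributed; that is still roughly a third of the sequential time.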
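The arithmetic can be checked with a small simulation. The durations below are hypothetical (chosen to total 50 seconds with a longest sub-question of 15 seconds), and the scheduling rule, where each free worker takes the next sub-question in order, is an assumption rather than a description of how ESGai assigns work.

```python
import heapq


def sequential_time(durations: list[float]) -> float:
    # One worker handles every sub-question back to back.
    return sum(durations)


def parallel_time(durations: list[float], workers: int) -> float:
    """Finish time when each free worker takes the next sub-question in order."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    free_at = [0.0] * workers  # time at which each worker next becomes free
    heapq.heapify(free_at)
    finish = 0.0
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest available worker
        end = start + duration
        heapq.heappush(free_at, end)
        finish = max(finish, end)
    return finish


durations = [15.0, 10.0, 10.0, 8.0, 7.0]  # hypothetical: n = 5, t = 10, s = 15
print(sequential_time(durations))      # 50.0
print(parallel_time(durations, 5))     # 15.0 (one worker per sub-question)
print(parallel_time(durations, 3))     # 18.0 (three workers, two take a second task)
```

With five workers the result equals s; with three workers it is somewhat higher because two of the workers each pick up a second sub-question.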

Structured Data Organization

We've implemented a more sophisticated data organization system:

  • Virtual Folders in Blob Storage: Each company now has its dedicated virtual folder containing its three document types (Sustainability Report, XBRL, and BRSR)
  • Enhanced Vector Database Schema: The AI Search index now includes a defined schema that enables filtering capabilities
  • Targeted Retrievals: Index filtering allows for more precise document retrieval, reducing noise in the results
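As a rough sketch of the layout described above: a virtual folder in Blob Storage is simply a shared name prefix ending in "/", and a filterable index field is one marked as such in the index definition. All field names, the vector dimension, and the profile name below are assumptions for illustration; the post does not give the actual schema. The field dictionaries follow the general shape of an Azure AI Search index definition.

```python
DOCUMENT_TYPES = ("Sustainability Report", "XBRL", "BRSR")  # from the post


def blob_name(company: str, filename: str) -> str:
    # A "virtual folder" is just a prefix: <company>/<filename>.
    if "/" in company:
        raise ValueError("company name must not contain '/'")
    return f"{company}/{filename}"


# Hypothetical index fields; names and sizes are illustrative assumptions.
INDEX_FIELDS = [
    {"name": "id", "type": "Edm.String", "key": True},
    {"name": "company_name", "type": "Edm.String", "filterable": True},
    {"name": "document_type", "type": "Edm.String", "filterable": True},
    {"name": "content", "type": "Edm.String", "searchable": True},
    {
        "name": "content_vector",
        "type": "Collection(Edm.Single)",
        "searchable": True,
        "dimensions": 1536,  # assumed embedding size
        "vectorSearchProfile": "default-profile",  # assumed profile name
    },
]


def filterable_fields(fields: list[dict]) -> list[str]:
    # Fields that can appear in a filter expression.
    return [f["name"] for f in fields if f.get("filterable")]
```

Here `blob_name("Contoso", "BRSR.pdf")` yields `Contoso/BRSR.pdf`, where "Contoso" is a placeholder company name.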

Optimized Query Processing with Enhanced Schema

The Manager agent can now pass two variables that narrow the search scope:

  1. Company Name: Identifies which company's documents to search
  2. Document Type: Specifies which of the three document types to examine

This two-stage search process significantly improves efficiency:

  1. First, a keyword search identifies the relevant company and document type
  2. Then, vector search retrieves the most relevant text chunks from that specific document
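One way to picture the first stage is as a filter expression built from the two variables the Manager agent passes. The sketch below assumes hypothetical field names `company_name` and `document_type` and produces an OData-style filter string of the kind Azure AI Search accepts; how ESGai actually builds its filter is not stated in the post.

```python
from dataclasses import dataclass

DOCUMENT_TYPES = ("Sustainability Report", "XBRL", "BRSR")  # from the post


def quote_odata(value: str) -> str:
    # OData string literals use single quotes; embedded quotes are doubled.
    return "'" + value.replace("'", "''") + "'"


@dataclass(frozen=True)
class SearchScope:
    company_name: str
    document_type: str

    def __post_init__(self) -> None:
        # Reject anything other than the three known document types.
        if self.document_type not in DOCUMENT_TYPES:
            raise ValueError(f"unknown document type: {self.document_type!r}")

    def to_filter(self) -> str:
        # Field names are assumptions, not the actual ESGai schema.
        return (
            f"company_name eq {quote_odata(self.company_name)} "
            f"and document_type eq {quote_odata(self.document_type)}"
        )


print(SearchScope("Contoso", "BRSR").to_filter())
# company_name eq 'Contoso' and document_type eq 'BRSR'
```

Validating the document type up front means a misread query fails early instead of silently returning no results.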

Index Schema and Search Process

As shown in the diagram, our enhanced workflow follows a structured path:

  1. The Manager Agent identifies both Company Name and Document Type from the user query
  2. These parameters are passed to AI Search to narrow the scope
  3. AI Search connects to Vector Search with these filters applied
  4. Vector Search retrieves only the most relevant chunks from the specified documents

This targeted approach means our system no longer needs to search through the entire database for each query, resulting in faster and more accurate responses.
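The four steps could translate into a single search request in which the filter is applied before the vector comparison. The sketch below only builds the request body as a plain dictionary, in the general shape of the Azure AI Search REST search request; the field names, the value of `k`, and the choice of pre-filtering are assumptions rather than details from the post.

```python
def build_search_request(
    company_name: str,
    document_type: str,
    query_vector: list[float],
    k: int = 5,
) -> dict:
    def quote(value: str) -> str:
        # Double embedded single quotes for an OData string literal.
        return "'" + value.replace("'", "''") + "'"

    return {
        # Steps 1-2: parameters from the Manager agent narrow the scope.
        "filter": (
            f"company_name eq {quote(company_name)} "
            f"and document_type eq {quote(document_type)}"
        ),
        # Step 3: apply the filter before the vector comparison (assumed mode).
        "vectorFilterMode": "preFilter",
        # Step 4: retrieve the k nearest chunks within the filtered documents.
        "vectorQueries": [
            {
                "kind": "vector",
                "vector": list(query_vector),
                "fields": "content_vector",  # hypothetical vector field name
                "k": k,
            }
        ],
        "select": "content,company_name,document_type",
        "top": k,
    }
```

For instance, `build_search_request("Contoso", "XBRL", embedding)` would restrict the vector comparison to chunks of that one document, where `embedding` is the query vector produced by whatever embedding model the index uses.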

Impact on User Experience

These improvements deliver a significantly enhanced user experience:

  • Faster Response Times: Users receive comprehensive answers in a fraction of the time
  • More Relevant Results: Structured data organization and filtering lead to more precise information retrieval
  • Scalable Architecture: The system is now better positioned to handle additional companies and documents as we expand

As we continue to refine ESGai with additional funding, these architectural improvements provide a solid foundation for scaling our solution to serve more companies and handle increasingly complex ESG queries.

