Blog Post

AI - Machine Learning Blog
4 MIN READ

Introducing Gretel Navigator to Azure AI Foundry Model Catalog.

Dom_Rob's avatar
Dom_Rob
Icon for Microsoft rankMicrosoft
Jan 15, 2025

Meet Gretel Navigator, this cutting-edge compound AI system harnesses advanced AI, intelligent workflows, and pre-trained large language models (LLMs) to turn your ideas into detailed, task-specific data. It effortlessly converts raw concepts into practical, ready-to-use datasets with ease.

Key Features and Applications

  • Synthetic Data Generation: Harness the power of Gretel Navigator API to craft custom question-answer pairs exactly how you need them—be it for specific expertise levels or niche topics.
  • Enterprise-Grade Infrastructure: Seamless scalability, availability, and reliability through Azure’s infrastructure.
  • Streamlined Workflow: Simplified deployment and management and reduced operational overhead.
  • Cost-Effective Synthetic Data Generation: Pay-as-you-go pricing and elimination of manual data collection costs.
  • Enhanced Data Quality: Consistent, well-structured data generation with built-in quality validation.

 

Use Cases:

  • Model Training and Evaluation: Generate high-quality, varied question-answer sets perfect for training and fine-tuning your AI models.
  • Educational Tools: Tailor-made question-answer collections designed to suit various educational levels and subjects, making learning more personalized and engaging.
  • Customer Support: Create detailed FAQs and troubleshooting guides from user manuals and product docs to keep your customers informed and satisfied.
  • QA Pair Scoring: Ensure excellence with scoring that checks for conformance, quality, toxicity, bias, and accuracy.

 

Why Reflection Matters

Reflection guides AI through step-by-step reasoning before delivering a final answer. This method enhances explainability and accuracy, making the AI's thought process clear and reliable. By capturing intermediate reflections, we gain not just answers, but insights into how those answers were reached. Here’s why this is a game-changer:

  • Improved Explainability: See the logic behind the AI’s responses.
  • Enhanced Fine-Tuning: Detailed reasoning steps allow for more effective model refinement.
  • Better Complexity Handling: Models engage in thoughtful, multi-step reasoning rather than relying on rote memorization.

 

Introducing gretelai/synthetic-gsm8k-reflection-405bt

For those keen on training or evaluating language models on complex reasoning tasks, Gretel launched their own synthetic GSM8K-inspired dataset on Hugging Face. Here’s what sets it apart:

  • Enhanced Complexity: Tackles more intricate reasoning problems.
  • Increased Diversity: Features a broader range of demographics and real-world scenarios.
  • Labeled Difficulty Levels: Each problem comes with a difficulty label for nuanced performance evaluation.

 

With the Reflection technique and this groundbreaking synthetic dataset, Gretel is set to revolutionize AI reasoning. By shifting the focus from merely generating answers to fostering thoughtful, step-by-step problem-solving, Gretel is paving the way for AI systems that don't just react but truly reflect. Get ready for an era of AI that's not just smart but wise!

 

Enhancing PII and PHI Detection with Synthetic Data

Gretel understands the challenges of improving Personally Identifiable Information (PII) and Protected Health Information (PHI) entity detection. Accessing real-world sensitive data is often restricted due to privacy concerns. To address this, Gretel has developed synthetic documents enriched with a wide variety of PII and PHI entities, such as social security numbers, medical record numbers, and email addresses. These documents span multiple industries and document types, providing comprehensive coverage for training and fine-tuning models like GLiNER.

 

Fine-Tuning GLiNER Models: Enhancing PII and PHI Detection

The fine-tuned GLiNER models, developed using Gretel-generated synthetic data, are engineered for precise detection of Personally Identifiable Information (PII) and Protected Health Information (PHI) across multiple industries while maintaining privacy compliance. Their capacity to generalize across various domains renders them suitable for applications in healthcare, finance, and other sectors.

Model fine-tuning was conducted on the training split of the dataset, with the validation set providing feedback during training. This approach enabled Gretel to monitor the model's performance and make necessary adjustments. The final performance of the model was assessed using the test set. The Personally Identifiable Information (PII) and Protected Health Information (PHI) entities in the dataset served as ground truth labels, guiding the model to accurately detect entities across various domains and document types.

The model variants are available on Hugging Face:

  • gretelai/gretel-gliner-bi-small-v1.0: Lightweight, suitable for resource-constrained environments while maintaining strong performance.
  • gretelai/gretel-gliner-bi-base-v1.0: Balanced performance, ideal for most use cases requiring efficient resource usage and high accuracy.
  • gretelai/gretel-gliner-bi-large-v1.0: The highest-performing model, recommended for applications where accuracy is paramount.

Gretel Navigator constitutes a substantial advancement in synthetic data generation, providing unparalleled versatility and precision customized to meet your unique requirements. We encourage you to explore this sophisticated tool

 

Getting Started with Gretel Navigator in the Azure AI Foundry Model Catalog

To get started with Azure AI Foundry and deploy your first model, follow these clear steps:

  1. Familiarize Yourself: If you're new to Azure AI Foundry, start by reviewing this documentation to understand the basics and set up your first project.
  2. Access the Model Catalog: Open the model catalog in AI Foundry.
  3. Find the Model: Click the “View models” button on the MaaS announcement card.
  4. Select the Model: Open the Gretel Navigator model from the list.
  5. Deploy the Model: Click on ‘Deploy’ and choose the managed compute option

 

 

FAQ

  • What types of data can I use with Navigator?
    • Gretel Navigator is engineered to support tabular data encompassing any combination of numeric, categorical, and text modalities. This versatility enables seamless operation across diverse datasets, effectively addressing a wide spectrum of data generation and augmentation tasks.
  • What can I do with Navigator?
    • Tabular data can be generated from natural language or SQL prompts, existing datasets can be edited, data can be augmented, missing values can be filled in, interactive experimentation can be conducted in the console, and data can be generated and edited at scale using our batch API and SDK.
  • Can I use Navigator to work with my existing datasets?
    • Yes, Navigator helps edit and enhance datasets. You can fill in missing values, correct errors, or extend your data using natural language prompts.
  • Is Navigator a model or an application?
    • Navigator is a compound AI system that utilizes multiple transformer-based models, including Gretel’s own fine-tuned LLM.
  • How does Gretel Navigator overcome the limitations of traditional LLMs in data generation tasks?
    • Traditional LLMs have limitations due to their context windows and may face challenges with tasks that exceed these limits or require precise mathematical operations. Gretel Navigator addresses these issues by employing an agent-based approach that plans tasks, delegates operations beyond the scope of LLMs, and provides output without added complexity for the user.

 

Updated Jan 14, 2025
Version 1.0
No CommentsBe the first to comment