djsmartberry
1 Topic
Evaluation
Hi there, I tried out the evaluation feature and tested groundedness, relevance, and similarity. My dataset has 94 questions. Both relevance and similarity checked all 94 questions and their respective responses and returned either a pass or a fail for each. Groundedness, however, completed the run with errors: almost 10 of the inputs came back as null. I went through the logs, but I'm not sure where to look to find out what went wrong for those questions. I'd appreciate it if someone could point me in the right direction.