Evaluations Key Concepts
Evaluating the LLM
One of the key features of Big Hummingbird is the ability to incorporate human feedback in evaluating the effectiveness of your AI-generated outputs.
Create an Evaluation Card
In the Human Review section, you can set up review criteria for the prompt. Evaluation cards let you define specific questions about criteria such as "Conciseness", "Relevance", or "Tone", and attach a rating scale (from 1 to 5 stars) or a True/False answer.
For our example, we'll include the following questions:
- Brand Messaging: Does the email align with EcoNest’s mission of creating sustainable home products?
- Subject Line Effectiveness: Does the subject line grab your attention and make you want to open the email? Rate from 1-5.
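Conceptually, an evaluation card is a named set of criteria, each pairing a question with a rating scale. As a rough sketch of the two criteria above (the field names and types here are illustrative, not Big Hummingbird's actual schema):

```python
from dataclasses import dataclass
from typing import Literal

# Hypothetical data model -- Big Hummingbird's real schema may differ.
@dataclass
class Criterion:
    name: str
    question: str
    scale: Literal["stars_1_to_5", "true_false"]  # how reviewers answer

@dataclass
class EvaluationCard:
    prompt_name: str
    criteria: list[Criterion]

card = EvaluationCard(
    prompt_name="econest-launch-email",
    criteria=[
        Criterion(
            name="Brand Messaging",
            question="Does the email align with EcoNest's mission of creating sustainable home products?",
            scale="true_false",
        ),
        Criterion(
            name="Subject Line Effectiveness",
            question="Does the subject line grab your attention and make you want to open the email?",
            scale="stars_1_to_5",
        ),
    ],
)
```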
Invite Human Reviewers
You can invite team members or domain experts to review prompt outputs. Human reviewers score each output against the criteria defined in the evaluation card and provide qualitative feedback.
What your reviewers see.
Review Feedback
After the human reviewers have completed their assessments, you can see their ratings and comments. This feedback is crucial for refining your prompts and ensuring that the outputs meet your project's goals.
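Once the ratings come in, a simple way to interpret them is to aggregate per criterion, for example the average star rating or the share of reviewers answering True. The snippet below is a minimal sketch that assumes the feedback has been exported as a list of records; it is not tied to any particular Big Hummingbird export format.

```python
from collections import defaultdict
from statistics import mean

# Illustrative only: one record per (reviewer, criterion) pair.
reviews = [
    {"reviewer": "dana", "criterion": "Subject Line Effectiveness", "rating": 4},
    {"reviewer": "sam",  "criterion": "Subject Line Effectiveness", "rating": 3},
    {"reviewer": "dana", "criterion": "Brand Messaging", "rating": True},
    {"reviewer": "sam",  "criterion": "Brand Messaging", "rating": True},
]

by_criterion = defaultdict(list)
for r in reviews:
    by_criterion[r["criterion"]].append(r["rating"])

for criterion, ratings in by_criterion.items():
    if all(isinstance(x, bool) for x in ratings):
        # True/False criteria: report the share of reviewers answering True
        agreement = sum(ratings) / len(ratings)
        print(f"{criterion}: {agreement:.0%} of reviewers answered True")
    else:
        # Star-rated criteria: report the average score
        print(f"{criterion}: average rating {mean(ratings):.1f} / 5")
```

A summary like this makes it easier to spot which criteria need prompt revisions before the next review round.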