---
title: Use annotation queues
sidebarTitle: Use annotation queues
---

_Annotation queues_ provide a streamlined, directed view for human annotators to attach feedback to specific [runs](/langsmith/observability-concepts#runs). While you can always annotate [traces](/langsmith/observability-concepts#traces) inline, annotation queues provide a way to group runs together, prescribe rubrics, and track reviewer progress.

LangSmith supports two queue styles:

- [**Single-run annotation queues**](#single-run-annotation-queues) present one run at a time and let reviewers submit any rubric feedback you configure.
- [**Pairwise annotation queues (PAQs)**](#pairwise-annotation-queues) present two runs side-by-side so reviewers can quickly decide which output is better (or if they are equivalent) against the rubric items you define.

## Single-run annotation queues

Single-run queues present one run at a time and let reviewers submit any rubric feedback you configure. They can be created directly from the **Annotation queues** section in the [LangSmith UI](https://smith.langchain.com/).

### Create a single-run queue

1. Navigate to **Annotation queues** in the left navigation.
1. Click **+ New annotation queue** in the top-right corner.

![Create Annotation Queue form with Basic Details, Annotation Rubric, and Feedback sections.](/langsmith/images/create-annotation-queue-new.png)

#### Basic Details

1. Fill in the **Name** and **Description** of the queue.
1. Optionally assign a **default dataset** to streamline exporting reviewed runs into a dataset in your LangSmith [workspace](/langsmith/administration-overview#workspaces).

#### Annotation Rubric

1. Draft some high-level instructions for your annotators, which will be shown in the sidebar on every run.
1. Click **+ Desired Feedback** to add feedback keys to your annotation queue. Annotators will be presented with these feedback keys on each run.

![The rendered rubric for reviewers from the example instructions.](/langsmith/images/rubric-for-annotators.png)

#### Collaborator Settings

When there are multiple annotators for a run:

- **Number of reviewers per run**: This determines the number of reviewers that must mark a run as **Done** for it to be removed from the queue. If you check **All workspace members review each run**, then a run will remain in the queue until all [workspace](/langsmith/administration-overview#workspaces) members have marked their review as **Done**.

  - Reviewers cannot view the feedback left by other reviewers.
  - Comments on runs are visible to all reviewers.

- **Enable reservations on runs**: When a reviewer views a run, the run is reserved for that reviewer for the specified **Reservation length**. If there are multiple reviewers per run as specified above, the run can be reserved by multiple reviewers (up to the number of reviewers per run) at the same time.

<Tip>
We recommend enabling reservations. This will prevent multiple annotators from reviewing the same run at the same time.
</Tip>

If a reviewer has viewed a run and then leaves the run without marking it **Done**, the reservation will expire after the specified **Reservation length**. The run is then released back into the queue and can be reserved by another reviewer.

<Note>
Clicking **Requeue** for a run's annotation will only move the current run to the end of the current user's queue; it won't affect the queue order of any other user. It will also release the reservation that the current user has on that run.
</Note>

Because of these settings, the number of runs visible to each reviewer can differ from the total queue size.

You can revisit the pencil icon <Icon icon="pencil"/> in **Annotation queues** to update any settings later.
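
You can also create a queue from code. The following is a minimal sketch using the LangSmith Python SDK's `Client.create_annotation_queue`; it assumes your LangSmith API key is set in the environment, and the queue name and description are placeholders. Rubric and collaborator settings are still configured in the UI as described above.

```python
from langsmith import Client

client = Client()  # assumes your LangSmith API key is set in the environment

# Create a queue programmatically; rubric and collaborator settings
# can then be edited in the UI.
queue = client.create_annotation_queue(
    name="Thumbs-down triage",  # placeholder name
    description="Runs flagged by end users for human review",
)
print(queue.id)  # use this ID when adding runs to the queue from code
```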

### Assign runs to a single-run queue

There are several ways to populate a single-run queue with work items:

- **From a trace view**: Click **Add to Annotation Queue** in the top-right corner of any [trace](/langsmith/observability-concepts#traces) view. You can add any intermediate [run](/langsmith/observability-concepts#runs), but not the root span.

![Trace view with the Add to Annotation Queue button highlighted at the top of the screen.](/langsmith/images/add-to-annotation-queue.png)

- **From the runs table**: Select multiple runs, then click **Add to Annotation Queue** at the bottom of the page.

![View of the runs table with runs selected. Add to Annotation Queue button at the bottom of the page.](/langsmith/images/multi-select-annotation-queue.png)

- **Automation rules**: [Set up a rule](/langsmith/rules) to automatically assign runs that match a filter (for example, errors or low user scores) into a queue.
- **Datasets & experiments**: Select one or more [experiments](/langsmith/evaluation-concepts#experiment) within a dataset and click **<Icon icon="pencil"/> Annotate**. Choose an existing queue or create a new one, then confirm the (single-run) queue option.

![Selected experiments with the Annotate button at the bottom of the page.](/langsmith/images/annotate-experiment.png)
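
Besides the UI options above, you can also queue runs from code. The sketch below uses the Python SDK's `list_runs` and `add_runs_to_annotation_queue`; the project name, filter criteria, and queue ID are placeholders you would replace with your own.

```python
from langsmith import Client

client = Client()

# Pull some recent errored runs from a (hypothetical) tracing project...
runs = client.list_runs(project_name="my-chat-app", error=True, limit=50)

# ...and append them to an existing annotation queue by ID.
client.add_runs_to_annotation_queue(
    queue_id="<your-queue-id>",
    run_ids=[run.id for run in runs],
)
```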

### Review a single-run queue

1. Navigate to the **Annotation queues** section in the left-hand navigation bar.
1. Click on the queue you want to review. This will take you to a focused, cyclical view of the runs in the queue that require review.
1. For each run, you can attach a comment, attach a score for a particular [feedback](/langsmith/observability-concepts#feedback) criterion, add the run to a dataset, or mark the run as reviewed. You can also remove the run from the queue for all users, regardless of any current reservations or queue settings, by clicking the **Trash** icon <Icon icon="trash"/> next to **View run**.

<Tip>
The keyboard shortcuts that are next to each option can help streamline the review process.
</Tip>

![View of a run with the Annotate side panel. Keyboard shortcuts visible for options.](/langsmith/images/review-runs.png)
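
Feedback left in the queue is attached to the underlying run, just like feedback created any other way. If you ever need to attach the same feedback keys outside the queue UI, a minimal sketch with the Python SDK (using a hypothetical run ID and feedback key) looks like this:

```python
from langsmith import Client

client = Client()

# Attach a score and a comment to a run, mirroring what a reviewer
# would submit from the Annotate side panel.
client.create_feedback(
    run_id="<run-id>",   # hypothetical run ID
    key="correctness",   # should match one of your rubric's feedback keys
    score=1,
    comment="Answer is accurate and cites the right document.",
)
```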

## Pairwise annotation queues

Pairwise annotation queues (PAQs) present two runs side-by-side so reviewers can quickly decide which output is better (or if they are equivalent) against the rubric items you define. They are designed for fast A/B comparisons between two experiments (often a baseline vs. a candidate model) and must be created from the **Datasets & Experiments** pages.

### Create a pairwise queue

1. Navigate to **Datasets & Experiments**, open a dataset, and select **exactly two experiments** you want to compare.
1. Click **Annotate**. In the popover, choose **Add to Pairwise Annotation Queue**. (The button is disabled until exactly two experiments are selected.)

![Popover showing the "Add to Pairwise Annotation Queue" card highlighted after two experiments are selected.](/langsmith/images/pairwise-annotation-queue-popup.png)

1. Decide whether to send the experiments to an existing pairwise queue or create a new one.
1. Provide the queue details:
- **Basic details** (name and description)
- **Instructions & rubrics** tailored to pairwise scoring
- **Collaborator settings** (reviewer count, reservations, reservation length)
1. Submit the form to create the queue. LangSmith immediately pairs runs from the two experiments and populates the queue.

Key differences for PAQs:

- **Experiments**: You must provide two experiment sessions up front. LangSmith automatically pairs their runs in chronological order and populates the queue during creation.
- **Rubric**: Pairwise rubric items only require a feedback key and (optionally) a description. Annotators decide whether Run A, Run B, or both are better for each rubric item.
- **Dataset**: Pairwise queues do not use a default dataset, because comparisons span two experiments.
- **Reservations & reviewers**: The same collaborator controls apply. Reservations help prevent two people from judging the same comparison simultaneously.

### Add more comparisons to a pairwise queue

If you need to add more comparisons later, return to **Datasets & Experiments**, select the two experiments again, and choose **Add to Pairwise Annotation Queue** to append new pairs.

Selecting two experiments and creating a PAQ automatically pairs the runs. When augmenting an existing PAQ, LangSmith preserves historical comparisons and appends new pairs to the queue.

### Review a pairwise queue

1. From **Annotation queues**, select the pairwise queue you want to review.
1. Each queue item displays Run A on the left and Run B on the right, along with your rubric.
1. For every rubric item:
- Choose **A is better**, **B is better**, or **Equal**. The UI records binary feedback on both runs behind the scenes.
- Use hotkeys `A`, `B`, or `E` to lock in your choice.
1. Once you finish all rubric items, press **Done** (or `Enter` on the final rubric item) to advance to the next comparison.
1. Optional actions:
- Leave comments tied to either run.
- Requeue the comparison if you need to revisit it later.
- Open the full trace view for deeper debugging.

Reservations, reviewer thresholds, and comments behave identically to those in single-run queues, enabling teams to use different queue types without modifying their existing workflow.

![Pairwise review screen showing runs side-by-side with the feedback pane containing A/B/Equal buttons and keyboard shortcuts.](/langsmith/images/pairwise-annotation-queue-review-feedback-pane.png)
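
The exact feedback payload the pairwise UI writes isn't spelled out here, but conceptually each A/B/Equal decision becomes binary feedback on both runs. As an illustration only (not the UI's internal schema), you could mirror a decision from code like this, using hypothetical run IDs and a hypothetical feedback key:

```python
from langsmith import Client

client = Client()

def record_preference(run_a_id: str, run_b_id: str, key: str, winner: str) -> None:
    """Illustrative sketch: mirror an A / B / Equal decision as binary feedback on both runs."""
    score_a = 1 if winner in ("A", "Equal") else 0
    score_b = 1 if winner in ("B", "Equal") else 0
    client.create_feedback(run_id=run_a_id, key=key, score=score_a)
    client.create_feedback(run_id=run_b_id, key=key, score=score_b)

record_preference("<run-a-id>", "<run-b-id>", key="helpfulness", winner="A")
```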

<Check>
Consider routing runs that already have user feedback (e.g., thumbs-down) into a single-run queue for triage and a pairwise queue for head-to-head comparisons against a stronger baseline. This helps you identify regressions quickly. To learn more about how to capture user feedback from your LLM application, follow the guide on [attaching user feedback](/langsmith/attach-user-feedback).
</Check>

## Video guide

<iframe
className="w-full aspect-video rounded-xl"
src="https://www.youtube.com/embed/rxKYHA-2KS0?si=V4EnrUmzJaUVJh0m"