Conversation

@YoYoJa
Contributor

@YoYoJa YoYoJa commented Oct 21, 2025

Description

Please add an informative description that covers the changes made by the pull request and link all relevant issues.

If an SDK is being regenerated based on a new API spec, a link to the pull request containing these API spec changes should be included above.

All SDK Contribution checklist:

  • The pull request does not introduce [breaking changes]
  • CHANGELOG is updated for new features, bug fixes or other significant changes.
  • I have read the contribution guidelines.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

YoYoJa and others added 26 commits October 17, 2025 14:01
* add eval result converter

* Add result converter

* update converter params to optional

* add eval meta data

* fix type

* remove useless file

* get eval meta data as input

* fix build errors

* remove useless import

* resolve comments

* update

* update comments

* fix checker failure

* add error msg and error code

* Surface evaluator error msg

* update UT

* fix usage

* make eval_meta_data optional

* remove useless lines

* update param name to add underscore

* parse updated annotation results

* update trace_id

* expose sample data for sdk evaluators

* update

* update

* fix UT

* fix tests

* fix test
* add eval result converter

* Add result converter

* update converter params to optional

* add eval meta data

* fix type

* remove useless file

* get eval meta data as input

* fix build errors

* remove useless import

* resolve comments

* update

* update comments

* fix checker failure

* add error msg and error code

* Surface evaluator error msg

* update UT

* fix usage

* make eval_meta_data optional

* remove useless lines

* update param name to add underscore

* parse updated annotation results

* update trace_id

* expose sample data for sdk evaluators

* update

* Fix column mapping bug for AOAI evaluators with custom data mapping (#43429)

* fix nesting bug for custom data mapping

* address comments

* remove extra code and fix test case

* run formatter

* use dumps

* Modify logic for message body on Microsoft.ApplicationInsights.MessageData to include default message for messages with empty body and export logs (#43091)

* Modify logic in PR (#43060) to include default message for messages with empty body and export logs

* Update CHANGELOG

* Update logic as per updated spec

* Addressed comments

* Set-VcpkgWriteModeCache -- add token timeout param for cmake generate's that exceed 1 hour (this can happen in C++ API View) (#43470)

Co-authored-by: Daniel Jurek <[email protected]>

* update

* fix UT

* fix tests

* Added Tests and Samples for Paginated Queries (#43472)

* added tests and samples for paginated queries

* Apply suggestions from code review

Co-authored-by: Copilot <[email protected]>

* added single partition pagination sample

---------

Co-authored-by: Andrew Mathew <[email protected]>
Co-authored-by: Copilot <[email protected]>

* [Test Proxy] Support AARCH64 platform (#43428)

* Delete doc/dev/how_to_request_a_feature_in_sdk.md (#43415)

this doc is outdated

* fix test

* [AutoRelease] t2-iothub-2025-10-03-03336(can only be merged by SDK owner) (#43230)

* code and test

* update pyproject.toml

---------

Co-authored-by: azure-sdk <PythonSdkPipelines>
Co-authored-by: ChenxiJiang333 <[email protected]>

* [AutoRelease] t2-redisenterprise-2025-10-17-18412(can only be merged by SDK owner) (#43476)

* code and test

* update changelog

* update changelog

* Update CHANGELOG.md

---------

Co-authored-by: azure-sdk <PythonSdkPipelines>
Co-authored-by: ChenxiJiang333 <[email protected]>
Co-authored-by: ChenxiJiang333 <[email protected]>

* Extend basic test for "project_client.agents" to do more operations (#43516)

* Sync eng/common directory with azure-sdk-tools for PR 12478 (#43457)

* Updated validate pkg template to use packageInfo

* Fixed typo

* Fixed the right variable to use

* output debug log

* Fixed errors in expression evaluation

* removed debug code

* Fixed an issue in pipeline

* Updated condition for variable setting step

* Join paths of the script path

* Use join-path

* return from the function rather than exit

---------

Co-authored-by: ray chen <[email protected]>

* Reorder error and warning log line processing (#43456)

Co-authored-by: Wes Haggard <[email protected]>

* [App Configuration] - Release 1.7.2 (#43520)

* release 1.7.2

* update change log

* Modify CODEOWNERS for Azure SDK ownership changes (#43524)

Updated CODEOWNERS to reflect new ownership for Azure SDK components.

* Migrate Confidential Ledger library from swagger to typespec codegen (#42664)

* regen

* add default cert endpoint with tsp

* remove refs to old namespace

* update async operation patch

* fix operations patch

* fix header impl

* more header fixes

* revert receipt directory removal

* cspell

* regen certificates under correct namespace

* regen ledger client

* update namespace name

* revert certificate change

* update shared files after regen

* updates

* delete extra files

* cspell

* match return type to current behavior

* cspell

* mypy

* pylint

* update docs

* regen

* regen

* fix patch

* Revert "mypy"

This reverts commit 6351ead.

* add info in tsp_location.yaml

* regen

* update patch files

* update patch files

* fix patch

* update patch files

* regen

* update tsp-location.yaml

* generate certificate client

* update patch files

* fixes

* regen clients

* update pyproject.toml deps

* update assets

* regen

* revert test change

* nit

* fix test input

* regen with new model

* update tests

* update tests

* apiview props

* regen

* update tests

* update assets

* apiview props

* temp relative package updates

* fix name

* fix ledger ci (#43181)

* remove swagger

* remove extra configs

* wip revert package dep temporarily

* update readme

* fix config files

* Revert "wip revert package dep temporarily"

This reverts commit db553c4.

* move tests

* add identity samples

---------

Co-authored-by: catalinaperalta <[email protected]>

* rm certificate files

* update changelog

* misc fixes

* update shared reqs

* test

* pylint

---------

Co-authored-by: catalinaperalta <[email protected]>

* update scripts (#43527)

Co-authored-by: helen229 <[email protected]>

* [AutoPR azure-mgmt-mongocluster]-generated-from-SDK Generation - Python-5459673 (#43448)

* Configurations:  'specification/mongocluster/resource-manager/Microsoft.DocumentDB/MongoCluster/tspconfig.yaml', API Version: 2025-09-01, SDK Release Type: stable, and CommitSHA: 'c5601446fc65494f18157aecbcc79cebcfbab1fb' in SpecRepo: 'https://github.com/Azure/azure-rest-api-specs' Pipeline run: https://dev.azure.com/azure-sdk/internal/_build/results?buildId=5459673 Refer to https://eng.ms/docs/products/azure-developer-experience/develop/sdk-release/sdk-release-prerequisites to prepare for SDK release.

* update changelog

---------

Co-authored-by: ChenxiJiang333 <[email protected]>

* App Configuration Provider - Key Vault Refresh (#41882)

* Sync refresh changes

* Key Vault Refresh

* adding tests and fixing sync refresh

* Updating Async

* Fixed Async Tests

* Updated tests and change log

* Apply suggestions from code review

Co-authored-by: Copilot <[email protected]>

* Fixing merge issue

* Updating comments

* Updating secret refresh

* Update _azureappconfigurationproviderasync.py

* Fixing Optional Endpoint

* fix mypy issue

* fixing async test

* mixing merge

* fixing test after merge

* Update testcase.py

* Secret Provider Base

* removing unused imports

* updating exception

* updating resolve key vault references

* Review comments

* fixing tests

* tox updates

* Updating Tests

* Updating Async to be the same as sync

* Fixing formatting

* fixing tox and unneeded ""

* fixing tox items

* fix cspell + tests recording

* Update test_async_secret_provider.py

* Post Merge updates

* Move cache to shared code

* removed unneeded disabled

* Update Secret Provider

* Updating usage

* Update assets.json

* Updated to make secret refresh update dictionary

* removing _secret_version_cache

* Update assets.json

* Update _secret_provider_base.py

---------

Co-authored-by: Copilot <[email protected]>

* Increment package version after release of azure-appconfiguration (#43531)

* Patch `azure-template` back to `green`  (#43533)

* Update sdk/template/azure-template/pyproject.toml to use `repository` instead of `source`

* added brackets for sql query keyword value (#43525)

Co-authored-by: Andrew Mathew <[email protected]>

* update changelog (#43532)

Co-authored-by: catalinaperalta <[email protected]>

* App Config Provider - Provider Refactor (#43196)

* Code Cleanup

* Move validation to shared file

* Updating Header Check

* Update test_azureappconfigurationproviderbase.py

* moved async tests to aio folder

* post merge updates

---------

Co-authored-by: Ethan Winters <[email protected]>
Co-authored-by: rads-1996 <[email protected]>
Co-authored-by: Azure SDK Bot <[email protected]>
Co-authored-by: Daniel Jurek <[email protected]>
Co-authored-by: Andrew Mathew <[email protected]>
Co-authored-by: Andrew Mathew <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: McCoy Patiño <[email protected]>
Co-authored-by: Yuchao Yan <[email protected]>
Co-authored-by: ChenxiJiang333 <[email protected]>
Co-authored-by: ChenxiJiang333 <[email protected]>
Co-authored-by: Darren Cohen <[email protected]>
Co-authored-by: ray chen <[email protected]>
Co-authored-by: Wes Haggard <[email protected]>
Co-authored-by: Zhiyuan Liang <[email protected]>
Co-authored-by: Matthew Metcalf <[email protected]>
Co-authored-by: catalinaperalta <[email protected]>
Co-authored-by: catalinaperalta <[email protected]>
Co-authored-by: helen229 <[email protected]>
Co-authored-by: Scott Beddall <[email protected]>
@YoYoJa YoYoJa requested a review from a team as a code owner October 21, 2025 05:21
@Copilot Copilot AI review requested due to automatic review settings October 21, 2025 05:21
@github-actions github-actions bot added the Evaluation Issues related to the client library for Azure AI Evaluation label Oct 21, 2025
Contributor

Copilot AI left a comment

Pull Request Overview

This pull request introduces support for Azure Monitor OpenTelemetry logging and adds token count tracking throughout the evaluation system. The changes enhance evaluation results with detailed token usage information and enable sending evaluation events to Application Insights. The PR appears to be a work-in-progress (WIP) based on the title.

Key Changes:

  • Added comprehensive token count tracking (input/output/total tokens) across evaluators and graders
  • Implemented Application Insights logging via OpenTelemetry with distributed tracing support
  • Modified Prompty response format to return dictionaries containing both LLM output and metadata instead of raw outputs
  • Added new data structures for AOAI evaluation results conversion and summary statistics
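The Prompty response change listed above can be sketched as follows. This is only an illustration, assuming the new response is a dictionary with an `llm_output` key (the key name comes from the per-file summary in this review); the token count key names and the `extract_llm_output` helper are hypothetical and are not taken from the PR.

```python
def extract_llm_output(response):
    """Return the raw LLM output from either response shape.

    Assumption: the new format is a dict carrying "llm_output" plus metadata;
    anything else is treated as the older raw-output format.
    """
    if isinstance(response, dict) and "llm_output" in response:
        return response["llm_output"]
    return response


# Hypothetical new-style response; the token count key names are illustrative.
new_style = {
    "llm_output": "<S0>reasoning</S0><S2>4</S2>",
    "input_token_count": 120,
    "output_token_count": 18,
    "total_token_count": 138,
}
old_style = "<S0>reasoning</S0><S2>4</S2>"

# Both shapes yield the same text for downstream parsing.
assert extract_llm_output(new_style) == extract_llm_output(old_style)
```

A caller written this way keeps working against both response shapes, which is one possible way to stage such a format change.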

Reviewed Changes

Copilot reviewed 29 out of 29 changed files in this pull request and generated 6 comments.

File                          Description
setup.py                      Added opentelemetry dependencies as optional extras
test_evaluate.py              Added comprehensive test for AOAI evaluation results conversion
test_built_in_evaluator.py    Updated mock functions to return dictionaries and adjusted assertions for new token count fields
test_prompty_async.py         Updated tests to handle new dictionary response format from Prompty
test_mass_evaluate.py         Updated assertions to reflect increased key counts from new token metrics
_prompty.py                   Modified to return dict with llm_output and metadata
_utils.py                     Modified format_llm_response to return comprehensive metadata dictionary
_base_prompty_eval.py         Updated to extract llm_output from new dictionary response format
_relevance.py                 Updated to extract llm_output from new dictionary response format
_evaluate.py                  Added AOAI results conversion, App Insights logging, and token count aggregation exclusion
_run_submitter.py             Added error handling for batch results
rai_service.py                Added token count fields to RAI service responses
Grader files                  Added _type class variable for proper type identification
_constants.py                 Added evaluator-to-metric mappings and OpenTelemetry event name constant

)
sample_output = json.dumps(sample_output_list)
input_str = f"{json.dumps(inputs)}" if inputs else ""
if inputs and len(inputs) > 0:

Copilot AI Oct 21, 2025

The condition `len(inputs) > 0` is redundant. In Python, empty dictionaries are falsy, so `if inputs` already checks for a non-empty dictionary. Consider simplifying to `if inputs:`.

Suggested change
if inputs and len(inputs) > 0:
if inputs:
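A quick way to confirm that the two conditions agree is to compare them over the relevant cases. The helper names below are hypothetical and exist only for this check.

```python
def has_inputs_verbose(inputs):
    # Condition as written in the PR.
    return bool(inputs and len(inputs) > 0)


def has_inputs_simple(inputs):
    # Suggested simplification: rely on truthiness alone.
    return bool(inputs)


# None, an empty dict, and a populated dict all give matching results.
for candidate in (None, {}, {"query": "hi"}):
    assert has_inputs_verbose(candidate) == has_inputs_simple(candidate)
```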

content=input_str,
)
sample_input_json.append(msg)
sample_input = json.dumps(sample_input_json)

Copilot AI Oct 21, 2025

The variable `sample_input` is used without being initialized when the `if inputs and len(inputs) > 0` condition is false. This will cause an `UnboundLocalError` when the function tries to use `sample_input` in the return statement. Initialize `sample_input = ""` before the conditional block.
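The failure mode and the fix can be reproduced in isolation. This is a minimal sketch, not the PR's code: the function names and the message dictionary shape are assumptions made for illustration.

```python
import json


def build_sample_input_unsafe(inputs):
    # sample_input is only assigned inside the branch.
    if inputs:
        sample_input = json.dumps([{"content": json.dumps(inputs)}])
    return sample_input  # UnboundLocalError when inputs is empty or None


def build_sample_input(inputs):
    # Initialize first so every code path has a value to return.
    sample_input = ""
    if inputs:
        sample_input = json.dumps([{"content": json.dumps(inputs)}])
    return sample_input


assert build_sample_input({}) == ""

try:
    build_sample_input_unsafe({})
except UnboundLocalError:
    pass  # the uninitialized variant fails on the empty-input path
```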

elif client_type == "pf_client":
batch_run_client = ProxyClient(user_agent=UserAgentSingleton().value)
# Ensure the absolute path is passed to pf.run, as relative path doesn't work with
# Ensure the absolute path is Re to pf.run, as relative path doesn't work with

Copilot AI Oct 21, 2025

Incomplete comment text. The word 'Re' appears to be a fragment. Should likely be 'passed' or similar based on context.

Suggested change
# Ensure the absolute path is Re to pf.run, as relative path doesn't work with
# Ensure the absolute path is passed to pf.run, as relative path doesn't work with

or metric in _EvaluatorMetricMapping.EVALUATOR_NAME_METRICS_MAPPINGS["code_vulnerability"]
or metric in _EvaluatorMetricMapping.EVALUATOR_NAME_METRICS_MAPPINGS["protected_material"]):
copy_label = label
if copy_label is not None and isinstance(copy_label, bool) and copy_label == True:

Copilot AI Oct 21, 2025

Comparing a boolean to `True` using `== True` is redundant. Simplify to `if copy_label is not None and isinstance(copy_label, bool) and copy_label:`.

Suggested change
if copy_label is not None and isinstance(copy_label, bool) and copy_label == True:
if copy_label is not None and isinstance(copy_label, bool) and copy_label:
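Going one step further than the suggestion, the three-part condition appears to be equivalent to a single identity check, since only the `True` singleton is non-`None`, a `bool`, and equal to `True`. The sketch below compares the two forms; the function names are hypothetical.

```python
def is_flagged_original(label):
    # Condition as written in the PR.
    return label is not None and isinstance(label, bool) and label == True  # noqa: E712


def is_flagged_identity(label):
    # Only the True singleton passes all three original checks.
    return label is True


# Truthy non-bool values such as 1 or "True" are rejected by both forms.
for candidate in (None, True, False, 0, 1, 1.0, "True", [True]):
    assert is_flagged_original(candidate) == is_flagged_identity(candidate)
```

Whether `is True` reads better than the explicit chain is a style decision for the PR author; the point is only that behavior is unchanged for these inputs.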

Comment on lines +2253 to +2254
if (isinstance(result_item, dict) and "sample" in result_item and result_item["sample"]
and result_item["metric"] not in dup_usage_list):

Copilot AI Oct 21, 2025

[nitpick] Complex multi-line conditional is difficult to read. Consider extracting into a helper function or assigning intermediate boolean variables with descriptive names to improve readability.

Suggested change
if (isinstance(result_item, dict) and "sample" in result_item and result_item["sample"]
and result_item["metric"] not in dup_usage_list):
is_dict = isinstance(result_item, dict)
has_sample = "sample" in result_item
sample_is_truthy = result_item.get("sample") if has_sample else False
metric_not_in_dup_usage_list = result_item.get("metric") not in dup_usage_list
if is_dict and has_sample and sample_is_truthy and metric_not_in_dup_usage_list:
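One caveat with the intermediate-variable form: each assignment is evaluated eagerly, so `"sample" in result_item` and `result_item.get(...)` run even when `result_item` is not a dict, whereas the original `and` chain short-circuits after the `isinstance` check. A helper function, the other option the comment mentions, keeps the guard order. The sketch below is an assumption-laden illustration: the name `has_usable_sample` is hypothetical, and it uses `.get("metric")` where the original indexes `result_item["metric"]`, so a missing key yields `None` instead of a `KeyError`.

```python
def has_usable_sample(result_item, dup_usage_list):
    """Return True when result_item is a dict with a truthy "sample"
    whose "metric" is not listed in dup_usage_list."""
    if not isinstance(result_item, dict):
        return False  # guard first, so later lookups are safe
    if not result_item.get("sample"):
        return False  # missing or empty sample
    return result_item.get("metric") not in dup_usage_list


# A non-dict item is rejected without raising.
assert has_usable_sample(None, ["relevance"]) is False
```

At the call site the condition then reduces to `if has_usable_sample(result_item, dup_usage_list):`.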

Labels

Evaluation Issues related to the client library for Azure AI Evaluation

2 participants