-
Notifications
You must be signed in to change notification settings - Fork 817
feat(traceloop-sdk): add sampling_rate parameter to control trace volume #3409
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughAdds a sampling_rate parameter to Traceloop.init that validates bounds [0.0–1.0], enforces mutual exclusivity with an explicit sampler, and constructs a TraceIdRatioBased sampler when provided. Adds tests verifying init behavior, validation errors, sampler conflict, and edge cases. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor App as Application
participant Init as Traceloop.init
participant Validate as Param Validation
participant Sampler as TraceIdRatioBased Factory
participant TW as TracerWrapper
App->>Init: init(..., sampler=?, sampling_rate=?)
Init->>Validate: check mutual exclusivity & bounds
alt both provided
Validate-->>App: ValueError (sampler and sampling_rate)
else out of range
Validate-->>App: ValueError (sampling_rate out of [0.0,1.0])
else sampling_rate provided
Init->>Sampler: TraceIdRatioBased(sampling_rate)
Sampler-->>Init: sampler
Init->>TW: construct TracerWrapper with sampler
TW-->>App: return client or None
else no sampling_rate
Init->>TW: construct TracerWrapper (existing/default sampler)
TW-->>App: return client or None
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
Pre-merge checks and finishing touches✅ Passed checks (5 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caution
Changes requested ❌
Reviewed everything up to 51d2efb in 1 minute and 29 seconds. Click for details.
- Reviewed
138lines of code in2files - Skipped
0files when reviewing. - Skipped posting
2draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/traceloop-sdk/traceloop/sdk/__init__.py:140
- Draft comment:
Good mutual exclusion check for sampling_rate and sampler. For improved clarity during debugging, consider including the provided sampler value in the error message. - Reason this comment was not posted:
Confidence changes required:50%<= threshold50%None
2. packages/traceloop-sdk/traceloop/sdk/__init__.py:147
- Draft comment:
The error message for invalid sampling_rate includes the provided value. Consider updating the function's docstring to document this parameter and its expected range for better developer guidance. - Reason this comment was not posted:
Confidence changes required:50%<= threshold50%None
Workflow ID: wflow_fIGED7V1UMylRK49
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
packages/traceloop-sdk/tests/test_sampling_rate.py (1)
25-47: Consider using raw strings or escaping metacharacters in regex patterns.The tests correctly validate that invalid sampling_rate values raise ValueError. However, the
match=patterns contain unescaped dots that are regex metacharacters. While the tests currently pass, it's better practice to use raw strings or escape the dots.Apply this diff to use raw strings for regex patterns:
- with pytest.raises(ValueError, match="sampling_rate must be between 0.0 and 1.0"): + with pytest.raises(ValueError, match=r"sampling_rate must be between 0\.0 and 1\.0"): Traceloop.init( app_name="test-invalid-sampling-rate", sampling_rate=1.5, exporter=exporter, disable_batch=True ) - with pytest.raises(ValueError, match="sampling_rate must be between 0.0 and 1.0"): + with pytest.raises(ValueError, match=r"sampling_rate must be between 0\.0 and 1\.0"): Traceloop.init( app_name="test-invalid-sampling-rate", sampling_rate=-0.1, exporter=exporter, disable_batch=True )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
packages/traceloop-sdk/tests/test_sampling_rate.py(1 hunks)packages/traceloop-sdk/traceloop/sdk/__init__.py(3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules
Files:
packages/traceloop-sdk/traceloop/sdk/__init__.pypackages/traceloop-sdk/tests/test_sampling_rate.py
🧬 Code graph analysis (1)
packages/traceloop-sdk/tests/test_sampling_rate.py (3)
packages/traceloop-sdk/traceloop/sdk/utils/in_memory_span_exporter.py (2)
export(45-51)InMemorySpanExporter(22-61)packages/traceloop-sdk/traceloop/sdk/__init__.py (2)
Traceloop(37-289)init(49-220)packages/traceloop-sdk/traceloop/sdk/tracing/tracing.py (1)
TracerWrapper(61-241)
🪛 Ruff (0.13.3)
packages/traceloop-sdk/traceloop/sdk/__init__.py
141-144: Avoid specifying long messages outside the exception class
(TRY003)
148-150: Avoid specifying long messages outside the exception class
(TRY003)
packages/traceloop-sdk/tests/test_sampling_rate.py
31-31: Pattern passed to match= contains metacharacters but is neither escaped nor raw
(RUF043)
38-38: Pattern passed to match= contains metacharacters but is neither escaped nor raw
(RUF043)
🔇 Additional comments (6)
packages/traceloop-sdk/traceloop/sdk/__init__.py (3)
8-8: LGTM!The import of
TraceIdRatioBasedis correct and necessary for implementing the sampling_rate feature using OpenTelemetry's standard sampling mechanism.
65-65: LGTM!The
sampling_rateparameter is correctly added with an appropriate type hint and default value, maintaining backward compatibility.
140-151: LGTM! Validation logic is correct and well-structured.The implementation correctly:
- Enforces mutual exclusivity between
sampling_rateandsamplerparameters- Validates the
sampling_raterange [0.0, 1.0]- Creates a
TraceIdRatioBasedsampler using the OpenTelemetry standard approach- Places validation at the appropriate point before tracer initialization
Note: The static analysis tool flags TRY003 (long exception messages), but this is a style preference rather than a functional issue. The context-specific error messages provide clear guidance to users and are acceptable here.
packages/traceloop-sdk/tests/test_sampling_rate.py (3)
7-24: LGTM! Test correctly verifies basic sampling_rate initialization.The test appropriately:
- Manages the TracerWrapper singleton state to avoid test interference
- Verifies that initialization with
sampling_rate=0.5creates a TracerWrapper instance- Expects
client is Nonecorrectly (no api_key provided with custom exporter)- Restores the original state in the finally block
48-65: LGTM! Test correctly verifies mutual exclusivity of sampling_rate and sampler.The test appropriately:
- Creates a
TraceIdRatioBasedsampler to test the conflict scenario- Verifies that providing both
sampling_rateandsamplerraises a ValueError- Properly manages TracerWrapper singleton state
66-91: LGTM! Test correctly verifies edge cases.The test appropriately:
- Tests both boundary values:
0.0(never sample) and1.0(always sample)- Correctly deletes and recreates TracerWrapper.instance between the two initializations (necessary due to singleton pattern)
- Verifies that both edge cases successfully create a TracerWrapper instance
Add simplified sampling_rate parameter to Traceloop.init() for easy trace sampling configuration without manually creating sampler objects. Closes traceloop#2109
51d2efb to
fdb5d63
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
packages/traceloop-sdk/tests/test_sampling_rate.py (2)
10-23: Consider using pytest's built-in fixture mechanisms.The fixture correctly manages the global
TracerWrapper.instancestate, but direct manipulation of singleton state can be fragile.Note: This duplicates a past review comment. The current implementation is functional, but future refactoring to make
TracerWrappermore test-friendly (e.g., providing areset()method or dependency injection) would improve maintainability.
29-47: Verify the sampler type and sampling rate value.The test confirms that a sampler was created but doesn't verify it's a
TraceIdRatioBasedsampler with the expected rate of 0.5.Add assertions to verify the sampler type and configuration:
# Verify that a tracer provider was created with a sampler tracer_provider = TracerWrapper.instance._TracerWrapper__tracer_provider assert tracer_provider is not None assert tracer_provider.sampler is not None +assert isinstance(tracer_provider.sampler, TraceIdRatioBased) +# TraceIdRatioBased stores the rate as 'rate' attribute +assert tracer_provider.sampler.rate == 0.5
🧹 Nitpick comments (4)
packages/traceloop-sdk/traceloop/sdk/__init__.py (1)
140-151: LGTM! The validation and sampler construction logic is correct.The implementation properly handles:
- Mutual exclusivity between
samplerandsampling_rate- Range validation for
sampling_rate[0.0, 1.0]- Construction of
TraceIdRatioBasedsamplerThe placement is optimal—after early returns and before
TracerWrapperinitialization.Static analysis flagged TRY003 (long error messages). For user-facing errors, inline messages are often clearer than custom exception classes, so the current approach is acceptable. If you prefer to follow the guideline strictly, you could define custom exception classes:
+class SamplingConfigError(ValueError): + """Raised when both sampler and sampling_rate are provided.""" + def __init__(self): + super().__init__( + "Cannot specify both 'sampler' and 'sampling_rate'. " + "Please use only one of these parameters." + ) + +class SamplingRateError(ValueError): + """Raised when sampling_rate is out of valid range.""" + def __init__(self, rate: float): + super().__init__( + f"sampling_rate must be between 0.0 and 1.0, got {rate}" + ) + if sampling_rate is not None and sampler is not None: - raise ValueError( - "Cannot specify both 'sampler' and 'sampling_rate'. " - "Please use only one of these parameters." - ) + raise SamplingConfigError() if sampling_rate is not None: if not 0.0 <= sampling_rate <= 1.0: - raise ValueError( - f"sampling_rate must be between 0.0 and 1.0, got {sampling_rate}" - ) + raise SamplingRateError(sampling_rate) sampler = TraceIdRatioBased(sampling_rate)packages/traceloop-sdk/tests/test_sampling_rate.py (3)
53-53: Escape regex metacharacters in pytest.raises match patterns.The
.character in the match patterns is a regex metacharacter. Use raw strings or escape the dots for precise matching.Apply this diff:
- with pytest.raises(ValueError, match="sampling_rate must be between 0.0 and 1.0"): + with pytest.raises(ValueError, match=r"sampling_rate must be between 0\.0 and 1\.0"): Traceloop.init( app_name="test-invalid-sampling-rate", sampling_rate=1.5, exporter=exporter, disable_batch=True ) - with pytest.raises(ValueError, match="sampling_rate must be between 0.0 and 1.0"): + with pytest.raises(ValueError, match=r"sampling_rate must be between 0\.0 and 1\.0"): Traceloop.init( app_name="test-invalid-sampling-rate", sampling_rate=-0.1, exporter=exporter, disable_batch=True )Also applies to: 61-61
83-113: Refactor to avoid manual instance deletion within the test.Manually deleting
TracerWrapper.instanceat line 100 between test cases is fragile. Consider splitting into two separate test methods or improving the fixture.Split into two test methods:
- def test_sampling_rate_edge_cases(self, clean_tracer_wrapper): - """Test sampling_rate with edge case values (0.0 and 1.0).""" + def test_sampling_rate_zero(self, clean_tracer_wrapper): + """Test sampling_rate with 0.0 (no sampling).""" exporter = InMemorySpanExporter() - # Test 0.0 (no sampling) client = Traceloop.init( app_name="test-sampling-zero", sampling_rate=0.0, exporter=exporter, disable_batch=True ) assert client is None assert hasattr(TracerWrapper, "instance") tracer_provider = TracerWrapper.instance._TracerWrapper__tracer_provider assert tracer_provider.sampler is not None + assert isinstance(tracer_provider.sampler, TraceIdRatioBased) + assert tracer_provider.sampler.rate == 0.0 - del TracerWrapper.instance - - # Test 1.0 (full sampling) + def test_sampling_rate_one(self, clean_tracer_wrapper): + """Test sampling_rate with 1.0 (full sampling).""" + exporter = InMemorySpanExporter() + client = Traceloop.init( app_name="test-sampling-one", sampling_rate=1.0, exporter=exporter, disable_batch=True ) assert client is None assert hasattr(TracerWrapper, "instance") tracer_provider = TracerWrapper.instance._TracerWrapper__tracer_provider assert tracer_provider.sampler is not None + assert isinstance(tracer_provider.sampler, TraceIdRatioBased) + assert tracer_provider.sampler.rate == 1.0
45-47: Consider adding helper methods to avoid name mangling.Accessing private attributes via name mangling (
_TracerWrapper__tracer_provider) is fragile and breaks encapsulation. If these attributes need to be tested, consider adding test helper methods toTracerWrapper.For example, add to
TracerWrapper:def get_tracer_provider(self): """Test helper to access tracer provider.""" return self.__tracer_providerThen in tests:
-tracer_provider = TracerWrapper.instance._TracerWrapper__tracer_provider +tracer_provider = TracerWrapper.instance.get_tracer_provider()Also applies to: 97-98, 112-113
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
packages/traceloop-sdk/tests/test_sampling_rate.py(1 hunks)packages/traceloop-sdk/traceloop/sdk/__init__.py(3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules
Files:
packages/traceloop-sdk/traceloop/sdk/__init__.pypackages/traceloop-sdk/tests/test_sampling_rate.py
🧬 Code graph analysis (1)
packages/traceloop-sdk/tests/test_sampling_rate.py (3)
packages/traceloop-sdk/traceloop/sdk/utils/in_memory_span_exporter.py (2)
export(45-51)InMemorySpanExporter(22-61)packages/traceloop-sdk/traceloop/sdk/__init__.py (2)
Traceloop(37-289)init(49-220)packages/traceloop-sdk/traceloop/sdk/tracing/tracing.py (1)
TracerWrapper(61-241)
🪛 Ruff (0.13.3)
packages/traceloop-sdk/traceloop/sdk/__init__.py
141-144: Avoid specifying long messages outside the exception class
(TRY003)
148-150: Avoid specifying long messages outside the exception class
(TRY003)
packages/traceloop-sdk/tests/test_sampling_rate.py
29-29: Unused method argument: clean_tracer_wrapper
(ARG002)
49-49: Unused method argument: clean_tracer_wrapper
(ARG002)
53-53: Pattern passed to match= contains metacharacters but is neither escaped nor raw
(RUF043)
61-61: Pattern passed to match= contains metacharacters but is neither escaped nor raw
(RUF043)
69-69: Unused method argument: clean_tracer_wrapper
(ARG002)
83-83: Unused method argument: clean_tracer_wrapper
(ARG002)
🔇 Additional comments (2)
packages/traceloop-sdk/traceloop/sdk/__init__.py (2)
8-8: LGTM!The import of
TraceIdRatioBasedis appropriate and necessary for the new sampling functionality.
65-65: LGTM!The
sampling_rateparameter is correctly typed and follows the existing parameter pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Important
Looks good to me! 👍
Reviewed everything up to fdb5d63 in 1 minute and 26 seconds. Click for details.
- Reviewed
160lines of code in2files - Skipped
0files when reviewing. - Skipped posting
4draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/traceloop-sdk/tests/test_sampling_rate.py:45
- Draft comment:
Consider asserting that the sampler is an instance of TraceIdRatioBased and verify its ratio attribute, not just that it’s non-null. - Reason this comment was not posted:
Confidence changes required:33%<= threshold50%None
2. packages/traceloop-sdk/tests/test_sampling_rate.py:100
- Draft comment:
Directly deleting TracerWrapper.instance with 'del' may be fragile; consider relying on the fixture to manage and reset global state. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% The comment has a point - using direct deletion in the middle of a test when there's a fixture specifically designed for this purpose is not ideal. However, the clean_tracer_wrapper fixture is already being used for the test method overall, and this is just an intermediate cleanup between two test cases in the same method. The current approach, while not perfect, works and isn't likely to cause issues. The comment identifies a real code quality issue, but it may be overstating the fragility - the direct deletion is happening in a controlled test environment where we understand the state. While using the fixture would be cleaner, the current approach is functional and the change would be more about style than fixing a real problem. This comment, while technically correct, suggests a minor style improvement rather than fixing a significant issue. Following our rules about only keeping comments that require clear code changes, we should remove it.
3. packages/traceloop-sdk/traceloop/sdk/__init__.py:65
- Draft comment:
Consider updating the Traceloop.init docstring to document the new sampling_rate parameter and its behavior. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
4. packages/traceloop-sdk/traceloop/sdk/__init__.py:140
- Draft comment:
The mutual exclusion check between 'sampler' and 'sampling_rate' is clear. Optionally, consider using a custom exception for configuration errors to better differentiate these cases. - Reason this comment was not posted:
Confidence changes required:33%<= threshold50%None
Workflow ID: wflow_ATz6l6YKXX5AFcBi
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
|
Hi @nirga, just wanted to check if you could take a look at this PR when you get a chance. |
| raise ValueError( | ||
| f"sampling_rate must be between 0.0 and 1.0, got {sampling_rate}" | ||
| ) | ||
| sampler = TraceIdRatioBased(sampling_rate) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a no-op unless you also set it in the tracer provider:
https://opentelemetry-python.readthedocs.io/en/latest/sdk/trace.sampling.html
| assert TracerWrapper.instance is not None | ||
|
|
||
| # Verify that a tracer provider was created with a sampler | ||
| tracer_provider = TracerWrapper.instance._TracerWrapper__tracer_provider |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use these low-quality autogenerated test - they don't really test anything. You should run a real-world test - like the other ones we have here
|
Thanks for the review! I’ll update the PR with the suggested changes |
|
@nirga I might be missing something, but from my review it seems the sampling_rate does reach the TracerProvider. |
|
@Nikhil172913832 I think you forgot to push packages/traceloop-sdk/traceloop/sdk/tracing/tracing.py |
|
@nirga The lines I am referring to in packages/traceloop-sdk/traceloop/sdk/tracing/tracing.py already exist in the main branch — I haven’t made any edits to them in this PR. |
|
Regarding the tests, I took inspiration from test_sample_initialization to improve coverage and structure. Let me know if there’s anything else I can refine or expand. """Tests for sampling_rate parameter functionality."""
import pytest
from opentelemetry.sdk.trace.sampling import TraceIdRatioBased
from opentelemetry.sdk.trace.export.in_memory_span_exporter import InMemorySpanExporter
from opentelemetry import trace
from traceloop.sdk import Traceloop
from traceloop.sdk.tracing.tracing import TracerWrapper
def test_sampling_rate_validation():
"""Test that sampling_rate validates input range."""
exporter = InMemorySpanExporter()
with pytest.raises(ValueError, match="sampling_rate must be between 0.0 and 1.0"):
Traceloop.init(
app_name="test-invalid-high",
sampling_rate=1.5,
exporter=exporter,
disable_batch=True
)
with pytest.raises(ValueError, match="sampling_rate must be between 0.0 and 1.0"):
Traceloop.init(
app_name="test-invalid-low",
sampling_rate=-0.1,
exporter=exporter,
disable_batch=True
)
def test_sampling_rate_and_sampler_conflict():
"""Test that providing both sampling_rate and sampler raises ValueError."""
exporter = InMemorySpanExporter()
sampler = TraceIdRatioBased(0.5)
with pytest.raises(ValueError, match="Cannot specify both 'sampler' and 'sampling_rate'"):
Traceloop.init(
app_name="test-conflict",
sampling_rate=0.5,
sampler=sampler,
exporter=exporter,
disable_batch=True
)
def test_sampling_rate_creates_spans():
"""Test that sampling_rate parameter works correctly."""
if hasattr(TracerWrapper, "instance"):
_trace_wrapper_instance = TracerWrapper.instance
del TracerWrapper.instance
try:
exporter = InMemorySpanExporter()
Traceloop.init(
app_name="test-sampling",
sampling_rate=1.0,
exporter=exporter,
disable_batch=True
)
tracer = trace.get_tracer(__name__)
for i in range(5):
with tracer.start_as_current_span(f"test_span_{i}"):
pass
spans = exporter.get_finished_spans()
test_spans = [s for s in spans if s.name.startswith("test_span_")]
assert len(test_spans) == 5
finally:
exporter.clear()
if '_trace_wrapper_instance' in locals():
TracerWrapper.instance = _trace_wrapper_instance
''' |
|
@Nikhil172913832 they're not here in the PR |
|
@nirga I have not yet committed the test. I wanted to have your opinion on their coverage and quality. I have tested them locally and they do pass. |
|
@Nikhil172913832 i meant that the code itself is not complete. The tracing.py file is not here |
|
@nirga , I haven’t modified packages/traceloop-sdk/traceloop/sdk/tracing/tracing.py — the lines I referred to already exist in the main branch. My PR only adds the sampling_rate parameter and related logic in init.py along with the new tests. |
Add a simplified sampling_rate parameter to Traceloop.init() for easy trace sampling configuration without manually creating sampler objects.
Closes #2109
feat(instrumentation): ...orfix(instrumentation): ....Important
Add
sampling_rateparameter toTraceloop.init()for trace sampling control, with validation and conflict checks, and corresponding tests.sampling_rateparameter toTraceloop.init()in__init__.pyto control trace volume (0.0–1.0).sampling_rateandsamplerare provided.sampling_rateis between 0.0 and 1.0.test_sampling_rate.pyto test initialization withsampling_rate, invalid values, conflict withsampler, and edge cases (0.0 and 1.0).This description was created by
for fdb5d63. You can customize this summary. It will automatically update as commits are pushed.
Summary by CodeRabbit