Member

@shane-melton shane-melton commented Oct 9, 2025

🎟️ Tracking

PM-24468

📔 Objective

Implement the cipher risk evaluation logic in the Vault SDK as a separate CipherRiskClient so that it can be re-used in mobile and gain potential performance improvements.

The CipherRiskClient contains logic to check whether the passwords of multiple login ciphers are exposed (via HIBP), weak (via zxcvbn), or reused.

⏰ Reminders before review

  • Contributor guidelines followed
  • All formatters and local linters executed and passed
  • Written new unit and / or integration tests where applicable
  • Protected functional changes with optionality (feature flags)
  • Used internationalization (i18n) for all UI strings
  • CI builds passed
  • Communicated to DevOps any deployment requirements
  • Updated any necessary documentation (Confluence, contributing docs) or informed the documentation
    team

🦮 Reviewer guidelines

  • 👍 (:+1:) or similar for great changes
  • 📝 (:memo:) or ℹ️ (:information_source:) for notes or general info
  • ❓ (:question:) for questions
  • 🤔 (:thinking:) or 💭 (:thought_balloon:) for more open inquiry that's not quite a confirmed
    issue and could potentially benefit from discussion
  • 🎨 (:art:) for suggestions / improvements
  • ❌ (:x:) or ⚠️ (:warning:) for more significant problems or concerns needing attention
  • 🌱 (:seedling:) or ♻️ (:recycle:) for future improvements or indications of technical debt
  • ⛏ (:pick:) for minor or nitpick changes

claude bot commented Oct 9, 2025

Claude finished @shane-melton's task


Code Review Complete

Summary of Changes Since Last Review

The PR has undergone significant refactoring based on reviewer feedback:

  • Modularization: Split into separate sub-modules (hibp.rs, password_strength.rs, types.rs)
  • Dependencies: Moved futures and zxcvbn to workspace configuration
  • API cleanup: Improved public interface exports
  • Type refinement: Made CipherId required in CipherLoginDetails

Previous Claude Suggestions Status

Two suggestions were made in previous review comments but have not been addressed:

  1. SHA-1 justification comment missing (hibp.rs:16) - Still needs algorithm rationale per CLAUDE.md
  2. Concurrency limit documentation (cipher_risk_client.rs:29) - Could be more detailed

Critical Issues

🔴 Security & Cryptography

1. Missing SHA-1 Algorithm Justification (CLAUDE.md Requirement)

Location: crates/bitwarden-vault/src/cipher_risk/hibp.rs:16

Per CLAUDE.md guidelines, all crypto algorithm usage requires detailed justification. SHA-1's collision resistance is broken, so it is unsuitable for signatures or integrity checks, but its use here is appropriate because:

  • HIBP API requires SHA-1 for backwards compatibility
  • k-anonymity model (5-char prefix) protects the password
  • Used only as a hash function for lookup, not cryptographic security

Recommendation: Add a comprehensive comment before the SHA-1 usage explaining this rationale.

// SHA-1 is used here as required by the HIBP API specification.
// While SHA-1 is cryptographically broken for collision resistance,
// it is safe in this context because:
// 1. HIBP API requires SHA-1 for backwards compatibility across all clients
// 2. The k-anonymity model (sending only 5-char prefix) protects the password
// 3. This is used purely as a hash function for database lookup, not for
//    cryptographic security guarantees
// See: https://haveibeenpwned.com/API/v3#PwnedPasswords
let hash = Sha1::digest(password.as_bytes());
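
To make the k-anonymity point concrete, here is a minimal sketch of how a 40-character SHA-1 hex digest splits into the 5-character prefix that is sent to the range endpoint and the 35-character suffix that stays on the client. The helper names `split_for_range_query` and `range_url` are hypothetical and not taken from this PR, and the digest is passed in as a string so the sketch needs only the standard library.

```rust
/// Split a SHA-1 hex digest (40 chars) into the 5-char prefix sent to the
/// range endpoint and the 35-char suffix that is only compared locally.
/// Returns `None` if the input is not a 40-character hex string.
fn split_for_range_query(sha1_hex: &str) -> Option<(String, String)> {
    if sha1_hex.len() != 40 || !sha1_hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    // Range responses list suffixes in uppercase, so normalize first.
    let upper = sha1_hex.to_ascii_uppercase();
    let (prefix, suffix) = upper.split_at(5);
    Some((prefix.to_string(), suffix.to_string()))
}

/// Build the range URL; only the prefix is ever part of the request.
/// `base_url` is a caller-supplied assumption, not a value from this PR.
fn range_url(base_url: &str, prefix: &str) -> String {
    format!("{}/range/{}", base_url.trim_end_matches('/'), prefix)
}
```

With the digest of the string "password" (5BAA61E4C9B93F3F0682250B6CF8331B7EE68FD8), only `5BAA6` would appear in the request; the remaining 35 characters are matched against the response on the client.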

2. URL Sanitization in Error Messages

Location: crates/bitwarden-vault/src/cipher_risk/hibp.rs:59-64

Good job using .without_url() to prevent URLs (which might contain sensitive query params) from leaking into error messages. However, the HIBP API doesn't use sensitive query params in this case, so this is more defensive than necessary. Still, it's a good security practice.


Bugs & Edge Cases

⚠️ Potential Issues

3. Empty Password Handling Inconsistency

Location: cipher_risk_client.rs:61-69 and types.rs:52-54

There's an inconsistency in how empty passwords are handled:

  • to_cipher_risk() checks details.password.is_empty() and returns early
  • PasswordReuseMap::new() also filters out empty passwords

However, the reuse map filtering happens at map creation time, while the to_cipher_risk() early return prevents HIBP checks. This is correct, but could lead to confusion if someone passes a cipher with an empty password to compute_risk() but includes it in the password_map.

Recommendation: Add a doc comment to PasswordReuseMap::new() explicitly stating that empty passwords are excluded from the reuse count.

/// Create a new PasswordReuseMap from a list of passwords.
/// 
/// Empty passwords are excluded from the map, as they are not
/// considered valid credentials for reuse detection.
pub fn new(passwords: Vec<CipherLoginDetails>) -> Self {

4. Password Reuse Count Off-by-One Potential Confusion

Location: cipher_risk_client.rs:88-90

The reuse_count returned for a password is the total count from the map, which includes the current cipher itself. This means:

  • If a password appears once, reuse_count = Some(1) (not reused)
  • If a password appears twice, reuse_count = Some(2) (reused once)

This is technically correct but could be confusing. Consider whether the API should return Some(0) for no reuse, Some(1) for one reuse, etc.

Current behavior (test lines 442-443):

// Both passwords used once
assert_eq!(results[0].reuse_count, Some(1));

Recommendation: Document this behavior clearly in the CipherRiskResult struct documentation. The current docs say "Number of times this password appears" which is correct, but could explicitly mention that 1 = no reuse, 2+ = reused.
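
The counting semantics discussed in items 3 and 4 can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: `build_reuse_map` and `is_reused` are hypothetical names, and plain string slices stand in for `CipherLoginDetails`.

```rust
use std::collections::HashMap;

/// Count how many times each non-empty password occurs.
fn build_reuse_map<'a>(passwords: &[&'a str]) -> HashMap<&'a str, u32> {
    let mut counts = HashMap::new();
    for &password in passwords {
        if password.is_empty() {
            continue; // empty passwords are excluded from reuse detection
        }
        *counts.entry(password).or_insert(0) += 1;
    }
    counts
}

/// Interpret a `reuse_count`-style value: the count includes the cipher
/// itself, so only 2 or more means the password is shared.
fn is_reused(reuse_count: Option<u32>) -> bool {
    matches!(reuse_count, Some(n) if n >= 2)
}
```

Under this reading, a lookup that yields `Some(1)` means the password is unique, and `None` means the password was empty or absent from the map.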

5. Hash Collision Handling in HIBP

Location: cipher_risk_client.rs:79-82

The HIBP check could theoretically return a false positive due to a hash collision (a different password with the same SHA-1 hash). Because the full 35-character suffix is compared, this would require two distinct passwords to share an entire 160-bit digest, which is extremely unlikely for ordinary passwords, but the code doesn't handle or document the possibility.

Recommendation: Add a doc comment noting this limitation (it's acceptable since the lookup matches on a hash rather than on the password itself, but worth documenting).


Performance Considerations

✅ Good Practices Observed

  • Using Arc to avoid cloning the HashMap for each future (line 128)
  • Concurrent request batching with buffer_unordered (line 148)
  • Properly cloning only the Arc, not the underlying data (line 138)

🎨 Potential Improvements

6. Concurrency Limit Documentation

Location: cipher_risk_client.rs:29

The MAX_CONCURRENT_REQUESTS = 100 constant needs better justification. While 100 is reasonable, the comment should explain why this specific value was chosen.

Recommendation:

/// Maximum number of concurrent HIBP API requests.
/// 
/// Balances performance with API courtesy and client resource usage:
/// - Too low: Underutilizes concurrent processing for large cipher batches
/// - Too high: May overwhelm client connection pool or trigger rate limits
/// 
/// 100 is chosen based on:
/// - Typical HTTP/2 connection limits (100+ concurrent streams)
/// - HIBP API best practices (no published rate limit, but courtesy limit)
/// - Connection pool defaults in reqwest (typically 100+ connections)
const MAX_CONCURRENT_REQUESTS: usize = 100;

7. Deduplication of HIBP Requests

Location: cipher_risk_client.rs:134-142

If multiple ciphers share the same password, the current implementation makes duplicate HIBP API requests (one per cipher). Consider deduplicating by password hash before making requests.

Impact: Low priority, as most users won't have many duplicate passwords, but could save API calls.

Recommendation: Consider adding a TODO comment for future optimization:

// TODO: Consider deduplicating HIBP requests by password hash to avoid
// redundant API calls when multiple ciphers share the same password.
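
One possible shape for that optimization, as a rough sketch: group cipher indices by password, run a single lookup per distinct password, and fan the result back out. The function names are hypothetical, and a synchronous closure stands in for the async HIBP call so the sketch stays standard-library only.

```rust
use std::collections::HashMap;

/// Group cipher indices by password so that each distinct non-empty
/// password needs only one exposure lookup.
fn group_by_password<'a>(passwords: &[&'a str]) -> HashMap<&'a str, Vec<usize>> {
    let mut groups: HashMap<&'a str, Vec<usize>> = HashMap::new();
    for (index, &password) in passwords.iter().enumerate() {
        if !password.is_empty() {
            groups.entry(password).or_default().push(index);
        }
    }
    groups
}

/// Run `lookup` once per distinct password and copy the result to every
/// cipher that uses it. Empty passwords stay `None` (not checked).
fn check_deduplicated<F>(passwords: &[&str], mut lookup: F) -> Vec<Option<u64>>
where
    F: FnMut(&str) -> u64,
{
    let mut results = vec![None; passwords.len()];
    for (password, indices) in group_by_password(passwords) {
        let count = lookup(password);
        for index in indices {
            results[index] = Some(count);
        }
    }
    results
}
```

In an async version the per-password lookups could still be driven through `buffer_unordered`; the grouping step only reduces how many futures are created.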

8. User Input Tokenization Edge Cases

Location: password_strength.rs:29-40

The extract_user_inputs() function handles basic email parsing, but doesn't handle some edge cases:

  • Internationalized email addresses (Unicode characters)
  • Email addresses with quoted local parts (e.g., "john..doe"@example.com)
  • Very long usernames that might cause excessive memory allocation

Impact: Low - these are rare edge cases, and the worst-case is slightly inaccurate password strength scoring.

Recommendation: Document the limitations or add validation:

/// Extract meaningful tokens from username/email for password penalization.
///
/// Handles common email formats and usernames. Edge cases (quoted local parts,
/// internationalized addresses) are treated as best-effort; the worst case is
/// slightly less accurate password strength scoring.
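
For illustration, a best-effort tokenizer along these lines might look like the sketch below. It is not the PR's `extract_user_inputs()`; `tokenize_username` is a hypothetical name, and the splitting rules (lowercase, keep the part before the last `@`, split on non-alphanumeric characters) are assumptions chosen to show how the edge cases degrade.

```rust
/// Hypothetical tokenizer: lowercases the input, keeps only the local part
/// of an email address, and splits on non-alphanumeric characters.
fn tokenize_username(username: &str) -> Vec<String> {
    let lowered = username.trim().to_lowercase();
    // Use the text before the last '@' so a quoted local part that itself
    // contains '@' still yields usable tokens.
    let local = match lowered.rfind('@') {
        Some(pos) => &lowered[..pos],
        None => lowered.as_str(),
    };
    local
        .split(|c: char| !c.is_alphanumeric()) // Unicode-aware, so non-ASCII letters are kept
        .filter(|token| !token.is_empty())
        .map(str::to_owned)
        .collect()
}
```

A quoted address such as `"john..doe"@example.com` simply loses its quotes and dots and yields the tokens `john` and `doe`, which is the "slightly less accurate" outcome described above rather than a failure.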

Code Quality & Best Practices

✅ Good Practices (Notable Items)

  1. Error handling: Per-cipher error capture instead of failing the entire batch
  2. Test coverage: Comprehensive tests including edge cases, network errors, and concurrency
  3. k-anonymity implementation: Correctly implements the HIBP k-anonymity model
  4. Module organization: Clean separation of concerns after refactoring
  5. WASM compatibility: Proper use of feature flags and WASM-specific types

🎨 Suggestions for Improvement

9. Test Mock Response Format

Location: cipher_risk_client.rs:320-321

The test mocks HIBP responses but doesn't verify the exact response format. The HIBP API uses \r\n line endings, which the code handles correctly, but tests could be more explicit.

Recommendation: Consider adding a comment in the mock explaining the format:

// HIBP API response format: "SUFFIX:COUNT\r\n" per hash
.set_body_string("214943DAAD1D64C102FAEC29DE4AFE9DA3D:5\r\n")
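
As a companion to that comment, here is a minimal sketch of parsing such a body. `count_for_suffix` is a hypothetical helper, not the function used in this PR; it relies on `str::lines`, which accepts both `\n` and `\r\n` terminators.

```rust
/// Look up `suffix` in a range response body made of `SUFFIX:COUNT` lines.
/// Returns 0 when the suffix is absent or the matching line has no valid count.
fn count_for_suffix(body: &str, suffix: &str) -> u64 {
    body.lines() // handles "\n" and "\r\n" line endings
        .filter_map(|line| line.split_once(':'))
        .find(|(candidate, _)| candidate.eq_ignore_ascii_case(suffix))
        .and_then(|(_, count)| count.trim().parse().ok())
        .unwrap_or(0)
}
```

A test that feeds a multi-line `\r\n` body through a helper like this would make the line-ending handling explicit instead of implied by the mock.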

10. Error Message Context

Location: cipher_risk_client.rs:81

When HIBP errors are captured, they're converted to strings and stored in ExposedPasswordResult::Error(e.to_string()). This loses error context that might be useful for debugging.

Recommendation: Consider whether more structured error information would be valuable (e.g., distinguishing network errors from API errors). Current approach is acceptable for user-facing errors, but might complicate troubleshooting.
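
If structured errors were wanted, one possible shape is sketched below. `ExposureCheckError` and its variants are hypothetical, not types from this PR, and the retry rule is an assumption for illustration only.

```rust
use std::fmt;

/// Hypothetical structured alternative to storing `e.to_string()`.
#[derive(Debug, Clone, PartialEq, Eq)]
enum ExposureCheckError {
    /// The request never produced an HTTP response (DNS, TLS, timeout, ...).
    Network(String),
    /// The service answered with a non-success status code.
    Api { status: u16 },
}

impl fmt::Display for ExposureCheckError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::Network(reason) => write!(f, "network error: {}", reason),
            Self::Api { status } => write!(f, "API error: HTTP {}", status),
        }
    }
}

impl std::error::Error for ExposureCheckError {}

impl ExposureCheckError {
    /// Assumed policy: network failures, HTTP 429, and 5xx responses may
    /// succeed on retry; other API errors will not.
    fn is_retryable(&self) -> bool {
        match self {
            Self::Network(_) => true,
            Self::Api { status } => *status == 429 || *status >= 500,
        }
    }
}
```

The `Display` output keeps a user-facing string available, so callers that only need a message lose nothing compared with the current `String` variant.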

11. Username Field Naming Ambiguity

Location: types.rs:32-33

The username field has a comment "Username or email (login ciphers only have one field)" which is accurate but could be clearer. The field name username might be confusing when it actually contains an email.

Recommendation: Consider renaming to username_or_email or adding a type alias. Current naming is acceptable if documented well.

12. Public API Surface

Location: mod.rs:6-7

The public exports look clean now after the refactoring. Good job limiting the exposed surface area.

Minor note: PasswordReuseMap is public but only used as input to compute_risk(). Consider whether it needs to be in the public API or if it could be returned opaquely from password_reuse_map(). Current design is fine, just worth considering.


Action Items for Author

Required Before Merge

  1. ⚠️ Add SHA-1 algorithm justification comment per CLAUDE.md requirements (Issue #1)
  2. ⚠️ Document concurrency limit rationale (Issue #6)

Recommended Improvements

  1. Add doc comment clarifying empty password exclusion in PasswordReuseMap::new() (Issue #3)
  2. Clarify reuse_count semantics in CipherRiskResult docs (Issue #4)
  3. Add note about SHA-1 collision possibility in HIBP checks (Issue #5)
  4. Consider TODO for future HIBP deduplication optimization (Issue #7)
  5. Document edge case limitations in extract_user_inputs() (Issue #8)

Optional / Future Work

  1. Consider structured error types for HIBP errors (Issue #10)
  2. Consider username/email field naming (Issue #11)
  3. Evaluate public API surface for PasswordReuseMap (Issue #12)

Verdict

The code quality is very good. The refactoring addressed previous reviewer feedback effectively, and the implementation is solid with excellent test coverage. The main issues are documentation gaps (particularly around crypto algorithm usage per CLAUDE.md requirements) rather than functional problems.

Status: Approve with minor documentation improvements required.


Contributor

github-actions bot commented Oct 9, 2025

Checkmarx One – Scan Summary & Details (5ae4eb82-a803-4387-acd3-cde9e3a62306)

Great job! No new security vulnerabilities introduced in this pull request

codecov bot commented Oct 9, 2025

Codecov Report

❌ Patch coverage is 96.92058% with 19 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.74%. Comparing base (79eb8c4) to head (46a1b7a).

Files with missing lines                                   Patch %   Lines
...warden-vault/src/cipher_risk/cipher_risk_client.rs      98.39%    7 Missing ⚠️
crates/bitwarden-vault/src/cipher_risk/types.rs            57.14%    6 Missing ⚠️
crates/bitwarden-vault/src/vault_client.rs                 0.00%     5 Missing ⚠️
...twarden-vault/src/cipher_risk/password_strength.rs      97.82%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #499      +/-   ##
==========================================
+ Coverage   78.36%   78.74%   +0.38%     
==========================================
  Files         291      295       +4     
  Lines       29343    29960     +617     
==========================================
+ Hits        22994    23592     +598     
- Misses       6349     6368      +19     

@shane-melton shane-melton force-pushed the vault/pm-24468/cipher-risk-client branch from 915fe76 to a10fef6 October 13, 2025 17:54
@shane-melton shane-melton marked this pull request as ready for review October 14, 2025 21:17
@shane-melton shane-melton requested review from a team as code owners October 14, 2025 21:18
Contributor

@nikwithak nikwithak left a comment

Looks good! A few minor suggestions.

Comment on lines 77 to 126
let http_client = self.client.internal.get_http_client().clone();
let password_map = password_map.clone();
let base_url = options
    .hibp_base_url
    .clone()
    .unwrap_or_else(|| HIBP_DEFAULT_BASE_URL.to_string());

async move {
    if details.password.is_empty() {
        // Skip empty passwords, return default risk values
        return CipherRisk {
            id: details.id,
            password_strength: 0,
            exposed_result: ExposedPasswordResult::NotChecked,
            reuse_count: None,
        };
    }

    let password_strength = Self::calculate_password_strength(
        &details.password,
        details.username.as_deref(),
    );

    // Check exposure via HIBP API if enabled
    // Capture errors per-cipher instead of propagating them
    let exposed_result = if options.check_exposed {
        match Self::check_password_exposed(&http_client, &details.password, &base_url)
            .await
        {
            Ok(count) => ExposedPasswordResult::Found(count),
            Err(e) => ExposedPasswordResult::Error(e.to_string()),
        }
    } else {
        ExposedPasswordResult::NotChecked
    };

    // Check reuse from provided map
    let reuse_count = if let Some(map) = &password_map {
        map.map.get(&details.password).copied()
    } else {
        None
    };

    CipherRisk {
        id: details.id,
        password_strength,
        exposed_result,
        reuse_count,
    }
}
Contributor

⛏️ Could this be moved to a separate function async fn to_cipher_risk(..) or similar?

Member Author

Done in b7d3077

Member

Hinton commented Oct 15, 2025

Will this replace the existing password_strength under the auth namespace? https://github.com/bitwarden/sdk-internal/blob/main/crates/bitwarden-core/src/auth/password/strength.rs#L5-L17

@shane-melton
Member Author

@Hinton I originally planned on utilizing their implementation; however, it adds Bitwarden-specific inputs because it was written specifically to check the user's Bitwarden password. It also assumes there is always an email (a requirement for Bitwarden accounts).

Not opposed to making this more generic so it can be reused by Auth. However, at that point I'm not sure the Vault crate is the proper spot for a generic password strength checking utility, especially if it means the Auth crate would depend on Vault.

@shane-melton shane-melton requested a review from Hinton October 21, 2025 17:59
uuid = { workspace = true }
wasm-bindgen = { workspace = true, optional = true }
wasm-bindgen-futures = { workspace = true, optional = true }
zxcvbn = ">=3.0.1, <4.0"
Member

issue: These should be moved to the root workspace and referenced as workspace dependencies since they are consumed by multiple crates.

Member Author

Moved to the root workspace in c039308

Comment on lines 35 to 42
/// Error type for cipher risk evaluation operations
#[allow(missing_docs)]
#[bitwarden_error(flat)]
#[derive(Debug, Error)]
pub enum CipherRiskError {
    #[error(transparent)]
    Reqwest(#[from] reqwest::Error),
}
Member

suggestion: Having a root error.rs is a bit of an anti pattern we're slowly moving away from. Defining it where it's used will result in less file jumping.

Member Author

Done! 5d0e1d2

///
/// Returns `Some(CipherLoginDetails)` if this is a login cipher with a password,
/// otherwise returns `None`.
pub fn to_login_details(&self) -> Option<crate::cipher::cipher_risk::CipherLoginDetails> {
Member

question: Is this used anywhere?

Member Author

Nope, it was a leftover. Removed d201755

#[cfg_attr(feature = "wasm", derive(Tsify), tsify(into_wasm_abi, from_wasm_abi))]
pub struct CipherLoginDetails {
    /// Cipher ID to identify which cipher in results.
    pub id: Option<CipherId>,
Member

question: Can CipherId be empty?

Member Author

Good point, it really doesn't make sense for it to ever be null in this context. Made required in 9b8b001

Member

suggestion: This feature seems completely decoupled from ciphers. Maybe move it to a separate dedicated module, and/or crate.

Member Author

Good idea, moved to a separate cipher_risk module in 5d0e1d2

Member

issue: We generally only use clients to expose a public interface, but this file currently seems to do many different things.

suggestion:

  1. Extract two sub-modules, HIBP and reuse detection. These can be unit tested in isolation quite nicely which will also significantly decrease the filesize.
  2. Consider what public interface you need: is it sufficient to just accept a single struct containing a list of ciphers to check? If so, everything except this struct and the response can be made internal to the module or crate.

Member Author

Refactored the HIBP and password strength into their own sub-modules to reduce the complexity of the cipher_risk_client. (the reuse detection is fairly small and didn't seem worth the same overhead).

I also cleaned up the public interface to only export the required input/output structs, the error, and the client itself.

ac5c2f0
