Feature addtotals and addcoltotals #4754
Signed-off-by: Asif Bashar <[email protected]>
35e478f to 754f477
Actionable comments posted: 1
♻️ Duplicate comments (5)
docs/user/ppl/cmd/addtotals.rst (1)
14-27: Clarify default behavior and fix minor wording/spacing issues
The first description sentence implies that `addtotals` always “appends a row with the totals”, but from the implementation/tests the summary row is only added when `col=true`; with the defaults (`row=true`, `col=false`) it just adds a per‑row total field. Consider rephrasing to explicitly distinguish row totals vs the optional summary event so users aren’t confused about what happens by default.
Also, a few small text nits:
- Line 25: “If  it specifies” has a double space; make it “If it specifies”.
- Lines 23–24, 27: “and add a new field” → “and adds a new field” for grammatical agreement.
- Line 71: Start the sentence with “If row=true, …” for consistency.
Also applies to: 71-71
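To make the distinction between row totals and the optional summary row concrete, here is a minimal Java sketch of the behavior described in this comment. It is only an illustrative model, not the plugin's implementation: the class `AddTotalsSketch`, the method `addTotals`, and the output field name `Total` are assumptions, and the summary row here carries only per-column sums (no label field).

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative model of the row/col options discussed above (hypothetical names). */
final class AddTotalsSketch {

  /**
   * Returns a copy of {@code rows}. When {@code row} is true, each row gains a "Total" field
   * (assumed name) holding the sum of its numeric values. When {@code col} is true, one extra
   * summary row with per-column sums is appended.
   */
  static List<Map<String, Object>> addTotals(
      List<Map<String, Object>> rows, boolean row, boolean col) {
    List<Map<String, Object>> out = new ArrayList<>();
    Map<String, Object> summary = new LinkedHashMap<>();
    for (Map<String, Object> in : rows) {
      Map<String, Object> copy = new LinkedHashMap<>(in);
      BigDecimal rowTotal = BigDecimal.ZERO;
      for (Map.Entry<String, Object> e : in.entrySet()) {
        // Non-numeric fields are skipped for both kinds of total.
        if (e.getValue() instanceof Number) {
          BigDecimal v = new BigDecimal(e.getValue().toString());
          rowTotal = rowTotal.add(v);
          summary.merge(e.getKey(), v, (a, b) -> ((BigDecimal) a).add((BigDecimal) b));
        }
      }
      if (row) {
        copy.put("Total", rowTotal);
      }
      out.add(copy);
    }
    if (col) {
      out.add(summary); // the optional summary event
    }
    return out;
  }
}
```

With input rows `{a=1, b=2}` and `{a=3, b=4}`, the defaults (`row=true`, `col=false`) yield two rows with `Total` 3 and 7 and no extra row; passing `col=true` appends a third row `{a=4, b=6}`.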
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddTotalsCommandIT.java (3)
19-27: Add JavaDoc for this public integration test class
Given the guidelines that public classes should have JavaDoc, consider adding a brief comment describing that this class runs integration tests for the PPL `addtotals` command with Calcite enabled (and that `init()` enables Calcite and loads ACCOUNT/BANK indices).
For example:
    /**
     * Integration tests for PPL {@code addtotals} with the Calcite engine.
     * Verifies row totals, optional column totals, custom labels, and interaction
     * with other PPL commands on ACCOUNT and BANK indices.
     */
    public class CalciteAddTotalsCommandIT extends PPLIntegTestCase {
104-106: Limit helper visibility and remove fully qualified `isNumeric` calls
`isNumeric`, `compareDataRowTotals`, and `verifyColTotals` are only used inside this class, so they can all be private, and the fully qualified reference to `CalciteAddTotalsCommandIT.isNumeric` in `testAddTotalsRowFieldsNonNumeric` is unnecessary.
Suggested cleanup:
    - public static boolean isNumeric(String str) {
    + private static boolean isNumeric(String str) {
        return str != null && str.matches("-?\\d+(\\.\\d+)?");
      }
    @@
    - private void verifyColTotals(
    + private void verifyColTotals(
          org.json.JSONArray dataRows, List<Integer> field_indexes, String finalSummaryEventLevel) {
    @@
    -        } else if (value instanceof String) {
    -          if (org.opensearch.sql.calcite.remote.CalciteAddTotalsCommandIT.isNumeric(
    -              (String) value)) {
    +        } else if (value instanceof String) {
    +          if (isNumeric((String) value)) {
                 cRowTotal = cRowTotal.add(new BigDecimal((String) (value)));
               }
             }
and similarly in `testAddTotalsRowFieldsNonNumeric`:
    -        } else if (value instanceof String) {
    -          if (org.opensearch.sql.calcite.remote.CalciteAddTotalsCommandIT.isNumeric(
    -              (String) value)) {
    +        } else if (value instanceof String) {
    +          if (isNumeric((String) value)) {
                 cRowTotal = cRowTotal.add(new BigDecimal((String) (value)));
               }
             }
Also applies to: 136-173, 198-212
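For reference, a small self-contained sketch of what the private helper accepts and how the unqualified call reads in a summing loop. The regex is the one quoted in the suggestion; the class `NumericStringSketch` and the method `sumNumeric` are hypothetical names used only for illustration. Note that the pattern rejects inputs such as `"1e3"`, `"+5"`, and `".5"`, even though `BigDecimal` could parse some of them.

```java
import java.math.BigDecimal;
import java.util.List;

/** Illustrative only: mirrors the helper's regex, not the test class itself. */
final class NumericStringSketch {

  // Same pattern as in the review: optional minus, digits, optional fractional part.
  private static boolean isNumeric(String str) {
    return str != null && str.matches("-?\\d+(\\.\\d+)?");
  }

  /** Sums Numbers and numeric-looking Strings; every other value (including null) is skipped. */
  static BigDecimal sumNumeric(List<Object> values) {
    BigDecimal total = BigDecimal.ZERO;
    for (Object value : values) {
      if (value instanceof Number) {
        total = total.add(new BigDecimal(value.toString()));
      } else if (value instanceof String && isNumeric((String) value)) {
        // Direct call: no fully qualified class name is needed inside the class.
        total = total.add(new BigDecimal((String) value));
      }
    }
    return total;
  }
}
```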
248-259: Align `testAddTotalsWithNoData` comment with actual assertion
The comment says:
    // Should still have totals row even with no input data
but the test asserts:
    assertEquals(0, dataRows.length()); // Only totals row
which currently expects zero rows. To avoid confusion, either change the assertion to expect a single totals row (and validate it), or update the comment to match the behavior you actually want to guarantee (e.g., “No rows expected when there is no input data”).
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAddTotalsTest.java (1)
343-355: Uncomment or remove unused `expectedLogical` / `verifyLogical`
In `testAddTotalsWithAllOptionsIncludingFieldname`, `expectedLogical` is built but `verifyLogical(root, expectedLogical)` is commented out. This leaves the logical plan unvalidated while still maintaining the expected string.
Either:
- Uncomment `verifyLogical(root, expectedLogical);`, or
- Remove `expectedLogical` if there’s a known reason not to assert the logical plan (and document that reason with a TODO).
🧹 Nitpick comments (2)
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAddTotalsTest.java (1)
13-17: Add class-level JavaDoc for this public test class
Per project guidelines, public classes should have JavaDoc. Adding a brief class comment describing that this verifies Calcite translation for `addtotals` (logical plan, results, Spark SQL) will make the intent clearer to future readers.
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddColTotalsCommandIT.java (1)
68-70: Tighten helper visibility and simplify `isNumeric` usage
`isNumeric` and `verifyColTotals` are only used inside this test class, so they don’t need to be public. Making them `private` (and `static` for `isNumeric`) will better reflect their scope. Also, the fully qualified reference `CalciteAddColTotalsCommandIT.isNumeric(...)` inside `verifyColTotals` can be simplified to a direct `isNumeric(...)` call for readability.
For example:
    - public static boolean isNumeric(String str) {
    + private static boolean isNumeric(String str) {
        return str != null && str.matches("-?\\d+(\\.\\d+)?");
      }
    - public void verifyColTotals(
    + private void verifyColTotals(
          org.json.JSONArray dataRows, List<Integer> field_indexes, String finalSummaryEventLevel) {
    @@
    -        } else if (value instanceof String) {
    -          if (org.opensearch.sql.calcite.remote.CalciteAddColTotalsCommandIT.isNumeric(
    -              (String) value)) {
    +        } else if (value instanceof String) {
    +          if (isNumeric((String) value)) {
                 cColTotals[j] = cColTotals[j].add(new BigDecimal((String) (value)));
               }
             }
Also applies to: 72-100
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- docs/user/ppl/cmd/addcoltotals.rst (1 hunks)
- docs/user/ppl/cmd/addtotals.rst (1 hunks)
- integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java (1 hunks)
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddColTotalsCommandIT.java (1 hunks)
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddTotalsCommandIT.java (1 hunks)
- ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAddColTotalsTest.java (1 hunks)
- ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAddTotalsTest.java (1 hunks)
- ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- docs/user/ppl/cmd/addcoltotals.rst
- ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
🧰 Additional context used
📓 Path-based instructions (7)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: Use `PascalCase` for class names (e.g., `QueryExecutor`)
Use `camelCase` for method and variable names (e.g., `executeQuery`)
Use `UPPER_SNAKE_CASE` for constants (e.g., `MAX_RETRY_COUNT`)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer `Optional<T>` for nullable returns in Java
Avoid unnecessary object creation in loops
Use `StringBuilder` for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
- integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java
- ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAddColTotalsTest.java
- ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAddTotalsTest.java
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddTotalsCommandIT.java
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddColTotalsCommandIT.java
⚙️ CodeRabbit configuration file
**/*.java: - Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure methods are under 20 lines with single responsibility
- Verify proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
- integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java
- ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAddColTotalsTest.java
- ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAddTotalsTest.java
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddTotalsCommandIT.java
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddColTotalsCommandIT.java
integ-test/**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
End-to-end scenarios need integration tests in `integ-test/` module
Files:
- integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddTotalsCommandIT.java
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddColTotalsCommandIT.java
⚙️ CodeRabbit configuration file
integ-test/**/*IT.java: - Verify integration tests are in correct module (integ-test/)
- Check tests can be run with ./gradlew :integ-test:integTest
- Ensure proper test data setup and teardown
- Validate end-to-end scenario coverage
Files:
- integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddTotalsCommandIT.java
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddColTotalsCommandIT.java
**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
Name integration tests with `*IT.java` suffix in OpenSearch SQL
Files:
- integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddTotalsCommandIT.java
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddColTotalsCommandIT.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify test coverage for new business logic
- Check test naming follows conventions (*Test.java for unit, *IT.java for integration)
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
- integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java
- ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAddColTotalsTest.java
- ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAddTotalsTest.java
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddTotalsCommandIT.java
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddColTotalsCommandIT.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Verify SQL generation and optimization paths
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
- integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java
- ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAddColTotalsTest.java
- ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAddTotalsTest.java
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddTotalsCommandIT.java
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddColTotalsCommandIT.java
**/*Test.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*Test.java: All new business logic requires unit tests
Name unit tests with `*Test.java` suffix in OpenSearch SQL
Files:
- ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAddColTotalsTest.java
- ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAddTotalsTest.java
**/ppl/**/*.java
⚙️ CodeRabbit configuration file
**/ppl/**/*.java: - For PPL parser changes, verify grammar tests with positive/negative cases
- Check AST generation for new syntax
- Ensure corresponding AST builder classes are updated
- Validate edge cases and boundary conditions
Files:
- ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAddColTotalsTest.java
- ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAddTotalsTest.java
🧠 Learnings (6)
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
- integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java
- ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAddColTotalsTest.java
- ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAddTotalsTest.java
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddTotalsCommandIT.java
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddColTotalsCommandIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL
Applied to files:
- integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddTotalsCommandIT.java
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddColTotalsCommandIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code
Applied to files:
- integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java
- ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAddColTotalsTest.java
- ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAddTotalsTest.java
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddTotalsCommandIT.java
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddColTotalsCommandIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to integ-test/**/*IT.java : End-to-end scenarios need integration tests in `integ-test/` module
Applied to files:
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddTotalsCommandIT.java
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddColTotalsCommandIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*Test.java : All new business logic requires unit tests
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddTotalsCommandIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : All public classes and methods must have proper JavaDoc
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddTotalsCommandIT.java
🧬 Code graph analysis (1)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAddColTotalsCommandIT.java (1)
integ-test/src/test/java/org/opensearch/sql/ppl/PPLIntegTestCase.java (1)
`PPLIntegTestCase` (36-411)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: build-linux (25, integration)
- GitHub Check: build-linux (25, doc)
- GitHub Check: bwc-tests-full-restart (25)
- GitHub Check: build-linux (25, unit)
- GitHub Check: build-linux (21, unit)
- GitHub Check: build-linux (21, integration)
- GitHub Check: build-linux (21, doc)
- GitHub Check: bwc-tests-full-restart (21)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: security-it-linux (21)
- GitHub Check: security-it-linux (25)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: test-sql-cli-integration (21)
- GitHub Check: CodeQL-Scan (java)
🔇 Additional comments (1)
integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java (1)
22-25: Suite composition update looks correct
Including `CalciteAddTotalsCommandIT` and `CalciteAddColTotalsCommandIT` in the Calcite no‑pushdown suite is consistent with existing naming and ensures the new commands participate in the standard integration suite.
Actionable comments posted: 0
🧹 Nitpick comments (4)
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAddColTotalsTest.java (4)
13-17: Public test class lacks JavaDoc
Project guidelines call for JavaDoc on public classes/methods, even in tests. Consider adding a brief class‑level JavaDoc describing that this exercises Calcite translation and Spark SQL generation for `addcoltotals`.
19-59: `testAddColTotals` and `testAddColTotalsAllFields` appear to assert the same behavior
Both tests use the same PPL (`fields DEPTNO, SAL, JOB | addcoltotals`), and their expected logical plans, results, and Spark SQL are effectively identical. This looks like redundant coverage.
Either:
- Collapse them into a single test, or
- Adjust one to cover a distinct scenario (e.g., different field lists or upstream pipe operations) so the name matches the behavior.
Also applies to: 104-143
232-263: Label truncation from `GrandTotal` → `GrandTota` is non‑obvious
These tests rely on the label `'GrandTotal'` being stored in a `VARCHAR(9)` column (`JOB`), which truncates it to `'GrandTota'`. The expectations are correct, but the mismatch between the label literal and the expected strings may confuse future readers.
Consider adding a short comment in one of these tests explaining that truncation is due to the underlying column length.
Also applies to: 275-306, 323-363
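The arithmetic behind the expected string can be checked with a tiny sketch: `GrandTotal` is 10 characters, so a 9-character column keeps only the first nine. The class `LabelTruncationSketch` and method `fitToWidth` are hypothetical and only model the effect; in the tests the truncation comes from the `VARCHAR(9)` column type itself.

```java
/** Illustrative model of fixed-width truncation (hypothetical helper, not project code). */
final class LabelTruncationSketch {

  /** Returns {@code label} cut to at most {@code width} characters. */
  static String fitToWidth(String label, int width) {
    return label.length() <= width ? label : label.substring(0, width);
  }
}
```

Here `fitToWidth("GrandTotal", 9)` returns `"GrandTota"`, matching the strings the tests expect.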
19-377: Consider adding scenarios with upstream filters/transformations (if not covered elsewhere)
All tests here start directly from `source=EMP` (with optional `fields`). To better guard Calcite integration, you might add a case where `addcoltotals` follows a `where` or other transforming command, to assert that column totals respect the transformed result set, not just the base table. Only needed if not already exercised in other ITs.
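The property such a test would guard can be sketched in plain Java, assuming rows are modeled as `long[]` pairs of (department, salary). The class `FilteredColTotalsSketch`, the method `colTotal`, and the sample values are all hypothetical; this is not the project's test harness, only a model of "totals over the filtered rows" versus "totals over the base table".

```java
import java.util.List;
import java.util.function.Predicate;

/** Illustrative only: column total computed after an upstream filter. */
final class FilteredColTotalsSketch {

  /** Sums column {@code col} over the rows that pass {@code filter}. */
  static long colTotal(List<long[]> rows, int col, Predicate<long[]> filter) {
    long total = 0;
    for (long[] r : rows) {
      if (filter.test(r)) {
        total += r[col];
      }
    }
    return total;
  }
}
```

For rows `{10, 100}`, `{20, 200}`, `{10, 300}`, filtering on department 10 gives a salary total of 400, while the unfiltered total is 600; a test placed after `where` should expect the former.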
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAddColTotalsTest.java (1 hunks)
🔇 Additional comments (1)
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAddColTotalsTest.java (1)
186-230: Coverage for label/labelfield combinations looks solid
The tests around `label` and `labelfield` (new field vs existing `JOB`, argument order variation, and full‑row aggregation) thoroughly check logical plans, result sets, and Spark SQL, including type/width handling and nulls. This gives good confidence in the Calcite wiring for `addcoltotals`.
Also applies to: 232-273, 275-316, 319-377
Actionable comments posted: 1
🧹 Nitpick comments (2)
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAddColTotalsTest.java (2)
232-273: Consider documenting the label truncation behavior.
The test correctly expects 'GrandTota' (line 242, 262) instead of the full 'GrandTotal' because the JOB field is VARCHAR(9) in the SCOTT schema. While this truncation is expected, adding a brief comment would improve readability and prevent confusion for future maintainers.
For example, add a comment before the test:
    // Tests label truncation when labelfield matches an existing field with shorter width.
    // JOB is VARCHAR(9), so 'GrandTotal' is truncated to 'GrandTota'.
    @Test
    public void testAddColTotalsMatchingLabelFieldWithExisting() throws IOException {
318-376: LGTM with minor style note.
This test correctly validates `addcoltotals` behavior when applied to all table fields with a labeled summary row. The logical plan properly aggregates all numeric columns.
Minor style note: Line 320 has extra spaces in the string concatenation (`" labelfield='JOB' "`), which is inconsistent but doesn't affect functionality.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAddColTotalsTest.java (1 hunks)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: test-sql-cli-integration (21)
- GitHub Check: CodeQL-Scan (java)
🔇 Additional comments (6)
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAddColTotalsTest.java (6)
13-17: LGTM! The test class is properly structured, extending the appropriate base class and initializing with the standard SCOTT schema.
19-59: LGTM! This test properly validates the default `addcoltotals` behavior, verifying that it generates a UNION with aggregated column totals and produces the expected Calcite logical plan and Spark SQL translation.
61-102: LGTM! This test correctly validates field-specific aggregation, ensuring that only the specified field (SAL) is summed while other fields remain null in the summary row.
145-184: LGTM! This test validates multi-field aggregation syntax, ensuring that specifying multiple fields (DEPTNO SAL) produces the expected aggregate behavior in the logical plan and result set.
186-230: LGTM! This test properly validates the label and labelfield options, ensuring that a new field is created to hold the label value in the summary row while remaining null in data rows.
275-316: LGTM! This test validates that parameter order doesn't affect the addcoltotals behavior, ensuring parser flexibility. The label truncation behavior is consistent with the previous test.
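As a rough illustration of the summary-row behavior these tests describe, here is a Python sketch of the semantics only. It is not the Calcite implementation: the function name `add_col_totals`, the default label `"Total"`, the sample rows, and the label values are all hypothetical, and the label truncation mentioned above is not modeled.

```python
def add_col_totals(rows, fields=None, label="Total", labelfield=None):
    """Append one summary row holding the sum of each selected numeric column.

    Fields that are not summed stay None in the summary row. When labelfield
    is given, it holds the label in the summary row and None in data rows.
    (Hypothetical helper; the default label is an assumption.)
    """
    def is_number(value):
        return isinstance(value, (int, float)) and not isinstance(value, bool)

    columns = list(rows[0].keys()) if rows else []
    if fields is None:
        # Assumed default: total every column that contains at least one number.
        fields = [c for c in columns if any(is_number(r.get(c)) for r in rows)]

    summary = {c: None for c in columns}
    for f in fields:
        summary[f] = sum(r[f] for r in rows if is_number(r.get(f)))

    data = [dict(r) for r in rows]  # copy so the input rows are not mutated
    if labelfield is not None:
        for r in data:
            r[labelfield] = None
        summary[labelfield] = label
    return data + [summary]


# Hypothetical sample rows shaped like the SCOTT EMP table.
rows = [
    {"ENAME": "SMITH", "DEPTNO": 20, "SAL": 800},
    {"ENAME": "ALLEN", "DEPTNO": 30, "SAL": 1600},
]
result = add_col_totals(rows, fields=["SAL"], label="GrandTotal", labelfield="all_emp_total")
print(result[-1])
```

With `fields=["SAL"]`, only `SAL` is summed in the appended row, the other original fields are `None`, and the label field is populated only in that last row.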
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAddColTotalsTest.java
LantaoJin
left a comment
LGTM, CI failure is not related.
cc @anasalkouz
Signed-off-by: Asif Bashar <[email protected]>
Actionable comments posted: 0
🧹 Nitpick comments (1)
docs/category.json (1)
13-14: Reorder entries to maintain alphabetical sorting.
The new entries break the alphabetical order of the `ppl_cli_calcite` category. Both `addtotals` and `addcoltotals` should be positioned before `append.md`, and `addcoltotals` should precede `addtotals` alphabetically.
Apply this reordering to maintain consistency:
     "ppl_cli_calcite": [
       "user/ppl/cmd/ad.md",
    +  "user/ppl/cmd/addcoltotals.md",
    +  "user/ppl/cmd/addtotals.md",
       "user/ppl/cmd/append.md",
    -  "user/ppl/cmd/addtotals.md",
    -  "user/ppl/cmd/addcoltotals.md",
       "user/ppl/cmd/bin.md",
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- docs/category.json (1 hunks)
- docs/user/ppl/cmd/addcoltotals.md (1 hunks)
- docs/user/ppl/cmd/addtotals.md (1 hunks)
- docs/user/ppl/index.md (2 hunks)
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (1 hunks)
✅ Files skipped from review due to trivial changes (2)
- docs/user/ppl/cmd/addcoltotals.md
- docs/user/ppl/cmd/addtotals.md
🚧 Files skipped from review as they are similar to previous changes (1)
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: For PPL command PRs, refer docs/dev/ppl-commands.md and verify the PR satisfies the checklist
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: For PPL command PRs, refer docs/dev/ppl-commands.md and verify the PR satisfies the checklist
Applied to files:
docs/user/ppl/index.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: build-linux (25, integration)
- GitHub Check: build-linux (21, unit)
- GitHub Check: build-linux (25, unit)
- GitHub Check: bwc-tests-full-restart (25)
- GitHub Check: bwc-tests-full-restart (21)
- GitHub Check: build-linux (25, doc)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: build-linux (21, doc)
- GitHub Check: build-linux (21, integration)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: security-it-linux (21)
- GitHub Check: security-it-linux (25)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: test-sql-cli-integration (21)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: CodeQL-Scan (java)
- GitHub Check: security-it-windows-macos (macos-14, 25)
🔇 Additional comments (1)
docs/user/ppl/index.md (1)
81-82: Verify status consistency: new commands marked "stable" while other 3.4 features are "experimental".
The newly added `addtotals` and `addcoltotals` commands are marked as `stable (since 3.4)`, while other 3.4-era commands in the same table (`replace`, `chart`, `streamstats`) are marked as `experimental (since 3.4)`. This creates an inconsistency.
Per the PR objectives, companion deliverables (Javadoc, API-spec, and public documentation PRs) remain incomplete, which typically signals experimental status. Please clarify whether:
- These commands should be marked `experimental (since 3.4)` for consistency, or
- The "stable" designation is intentional based on completed test and documentation coverage, and other 3.4 features should eventually transition to stable.
328dccc to
67f41b8
Compare
Actionable comments posted: 1
🧹 Nitpick comments (1)
docs/user/ppl/index.md (1)
81-82: Clarify command descriptions for consistency and precision.
The descriptions for both new commands are somewhat vague. Compare them with the PR objectives: addtotals computes per-event (row) totals and optionally per-column totals; addcoltotals computes per-column totals across the specified fields and appends them as a summary row. Consider more precise wording that better reflects the actual operations performed.
For example:
- addtotals: "Compute and append row totals and optionally column totals."
- addcoltotals: "Compute and append column totals as a summary row."
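To make the row-versus-column distinction for `addtotals` concrete, here is a minimal Python sketch of the semantics. It is an illustration under assumptions, not the plugin's code: the function name `add_totals`, the `fieldname` default of `"Total"`, and the sample rows are hypothetical. The `row=True`, `col=False` defaults follow the earlier review comment on `addtotals.rst`. What the real command puts in the total field of the summary row is not claimed here; the sketch simply leaves it `None`.

```python
def add_totals(rows, fields, row=True, col=False, fieldname="Total"):
    """Sketch of addtotals semantics (hypothetical helper, not the real implementation).

    row=True adds a per-row total field to every event.
    col=True appends one summary row with per-column sums.
    """
    def is_number(value):
        return isinstance(value, (int, float)) and not isinstance(value, bool)

    out = [dict(r) for r in rows]  # copy so the input rows are not mutated
    if row:
        for r in out:
            r[fieldname] = sum(r[f] for f in fields if is_number(r.get(f)))
    if col:
        columns = list(out[0].keys()) if out else list(fields)
        summary = {c: None for c in columns}
        for f in fields:
            summary[f] = sum(r[f] for r in out if is_number(r.get(f)))
        out.append(summary)
    return out


# Hypothetical sample data.
events = [
    {"host": "a", "cpu": 10, "mem": 20},
    {"host": "b", "cpu": 30, "mem": 40},
]
print(add_totals(events, ["cpu", "mem"]))            # per-row totals only
print(add_totals(events, ["cpu", "mem"], col=True))  # per-row totals plus a summary row
```

With the defaults, the result has the same number of rows as the input and one extra field; only `col=True` adds a row.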
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- docs/user/ppl/index.md (2 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: For PPL command PRs, refer docs/dev/ppl-commands.md and verify the PR satisfies the checklist
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: For PPL command PRs, refer docs/dev/ppl-commands.md and verify the PR satisfies the checklist
Applied to files:
docs/user/ppl/index.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (16)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: CodeQL-Scan (java)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: Update draft release notes
- GitHub Check: test-sql-cli-integration (21)
67f41b8 to
453887b
Compare
Actionable comments posted: 1
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
🟡 Minor comments (34)
docs/user/ppl/cmd/lookup.md-20-28 (1)
20-28: Add language identifier to fenced code block.
The code block is missing a language specification. Per the markdown linting rule (MD040), all fenced code blocks must specify a language identifier for syntax highlighting and parsing.
Apply this diff to add the language identifier:
    -```
    +```bash
     source = table1 | lookup table2 id
     source = table1 | lookup table2 id, name
     source = table1 | lookup table2 id as cid, name
     source = table1 | lookup table2 id as cid, name replace dept as department
     source = table1 | lookup table2 id as cid, name replace dept as department, city as location
     source = table1 | lookup table2 id as cid, name append dept as department
     source = table1 | lookup table2 id as cid, name append dept as department, city as location
    -```
    +```
docs/user/ppl/cmd/lookup.md-34-42 (1)
34-42: Use standard markdown syntax for bash code fences.
Code fences use `` ```bash ignore ``, which is non-standard markdown. The `ignore` keyword is not recognized by standard markdown parsers or CommonMark. Either use just `` ```bash `` or handle the "ignore" directive via project-specific tooling or comment syntax.
Apply this diff to use standard markdown syntax:
    -```bash ignore
    +```bash
     curl -H 'Content-Type: application/json' -X POST localhost:9200/_plugins/_ppl -d '{
    -```
    +```
Also applies to: 133-141, 147-155, 246-254
docs/user/ppl/cmd/join.md-63-63 (1)
63-63: Add language specification to fenced code blocks (MD040).
Lines 63 and 82 contain code blocks without a language specification. Based on the established pattern (e.g., line 97 uses `` ```ppl ``), these PPL syntax examples should specify `ppl` as the language.
Apply this diff to add language specifications:
    -```
    +\`\`\`ppl
     source = table1 | inner join left = l right = r on l.a = r.a table2 | fields l.a, r.a, b, c
And similarly for line 82:
    -```
    +\`\`\`ppl
     source = table1 | join type=outer left = l right = r on l.a = r.a table2 | fields l.a, r.a, b, c
Also applies to: 82-82
docs/user/ppl/cmd/explain.md-87-87 (1)
87-87: Add language specifier to code block.
The fenced code block is missing a language identifier. Based on the JSON output shown, specify `json` as the language for proper syntax highlighting.
    -```
    +```json
docs/user/ppl/cmd/explain.md-26-26 (1)
26-26: Standardize section heading format.
Lines 58, 85, 109, and 134 use "Explain" while line 26 uses "Explain:" (with a colon). Standardize across all examples for consistency.
    -Explain
    +Explain:
Also applies to: 58-58, 85-85, 109-109, 134-134
docs/user/ppl/cmd/top.md-102-123 (1)
102-123: Fix duplicate example numbering.
The section starting at line 123 is labeled "Example 5" but should be "Example 6", as "Example 5" already appears at line 102.
    -## Example 5: Specify the usenull field option
    +## Example 6: Specify the usenull field option
docs/user/ppl/cmd/patterns.md-32-43 (1)
32-43: Specify a language identifier for the fenced code block.
The code block containing the cluster settings command should declare a language identifier for proper syntax highlighting.
Apply this diff to add the language identifier:
    -```
    +```json
     PUT _cluster/settings
     {
       "persistent": {
         "plugins.ppl.pattern.method": "brain",
[MD040]
docs/user/ppl/cmd/patterns.md-95-95 (1)
95-95: Use hyphens for compound adjectives.
The phrase "user defined patterns" should use a hyphen to form a compound adjective.
Apply this diff:
    -This example shows how to extract patterns from a raw log field using user defined patterns.
    +This example shows how to extract patterns from a raw log field using user-defined patterns.
docs/user/ppl/cmd/patterns.md-25-26 (1)
25-26: Use hyphens for compound adjectives.
The phrase "low frequency words" should use a hyphen to form a compound adjective modifying the noun.
Apply this diff:
    -This sets the lower bound of frequency to ignore low frequency words. **Default:** 0.3.
    +This sets the lower bound of frequency to ignore low-frequency words. **Default:** 0.3.
docs/user/ppl/admin/connectors/security_lake_connector.md-17-17 (1)
17-17: Fix grammar: "in future" → "in the future".
Line 17 uses non-standard phrasing. Apply this change:
    -We currently only support emr-serverless as spark execution engine and Glue as metadata store. we will add more support in future.
    +We currently only support emr-serverless as spark execution engine and Glue as metadata store. We will add more support in the future.
docs/user/ppl/cmd/describe.md-8-8 (1)
8-8: Fix markdown reference link issue in syntax specification.
Line 8 triggers a markdown linting error because `[schema.]` is parsed as a reference link start, but no reference definition exists. The intent appears to be inline syntax documentation.
Apply this diff to use inline code formatting for the syntax specification:
    -describe [dataSource.][schema.]\<tablename\>
    +describe `[dataSource.][schema.]<tablename>`
Alternatively, if you prefer to keep the syntax unformatted:
    -describe [dataSource.][schema.]\<tablename\>
    +describe \[dataSource.\]\[schema.\]<tablename>
The first approach (inline code) is preferred as it clearly distinguishes the syntax specification from regular text.
docs/user/ppl/admin/connectors/s3glue_connector.md-16-16 (1)
16-16: Correct British English to American English.
"in future" should be "in the future" for consistency with standard American English documentation style.
Apply this diff:
    -We currently only support emr-serverless as spark execution engine and Glue as metadata store. we will add more support in future.
    +We currently only support emr-serverless as spark execution engine and Glue as metadata store. we will add more support in the future.
docs/user/ppl/admin/connectors/s3glue_connector.md-6-6 (1)
6-6: Fix article agreement in sentence.
Line 6 reads awkwardly: "how to query and s3Glue datasource" should use "an" instead of "and".
Apply this diff:
    -This page covers s3Glue datasource configuration and also how to query and s3Glue datasource.
    +This page covers s3Glue datasource configuration and also how to query an s3Glue datasource.
docs/user/ppl/admin/connectors/s3glue_connector.md-77-77 (1)
77-77: Wrap bare URL in Markdown link syntax.
Line 77 contains a bare URL which violates Markdown best practices. Wrap it in proper link syntax or reference format.
Apply this diff:
    -These queries would work only top of async queries. Documentation: [Async Query APIs](../../../interfaces/asyncqueryinterface.rst)
    +These queries would work only top of async queries. Documentation: [Async Query APIs](../../../interfaces/asyncqueryinterface.rst) and [OpenSearch Spark Docs](https://github.com/opensearch-project/opensearch-spark/blob/main/docs/index.md).
Committable suggestion skipped: line range outside the PR's diff.
docs/user/ppl/admin/connectors/s3glue_connector.md-19-30 (1)
19-30: Fix unordered list indentation to comply with Markdown standards.
Nested list items have inconsistent indentation. All sublists should use 2-space indentation relative to their parent, not 4 or 8 spaces.
Apply this diff to fix the list indentation:
    - * `glue.auth.type` [Required]
    - * This parameters provides the authentication type information required for execution engine to connect to glue.
    - * S3 Glue connector currently only supports `iam_role` authentication and the below parameters is required.
    - * `glue.auth.role_arn`
    - * `glue.indexstore.opensearch.*` [Required]
    - * This parameters provides the Opensearch domain host information for glue connector. This opensearch instance is used for writing index data back and also
    - * `glue.indexstore.opensearch.uri` [Required]
    - * `glue.indexstore.opensearch.auth` [Required]
    - * Accepted values include ["noauth", "basicauth", "awssigv4"]
    - * Basic Auth required `glue.indexstore.opensearch.auth.username` and `glue.indexstore.opensearch.auth.password`
    - * AWSSigV4 Auth requires `glue.indexstore.opensearch.auth.region` and `glue.auth.role_arn`
    - * `glue.indexstore.opensearch.region` [Required for awssigv4 auth]
    + * `glue.auth.type` [Required]
    + * This parameters provides the authentication type information required for execution engine to connect to glue.
    + * S3 Glue connector currently only supports `iam_role` authentication and the below parameters is required.
    + * `glue.auth.role_arn`
    + * `glue.indexstore.opensearch.*` [Required]
    + * This parameters provides the Opensearch domain host information for glue connector. This opensearch instance is used for writing index data back and also
    + * `glue.indexstore.opensearch.uri` [Required]
    + * `glue.indexstore.opensearch.auth` [Required]
    + * Accepted values include ["noauth", "basicauth", "awssigv4"]
    + * Basic Auth required `glue.indexstore.opensearch.auth.username` and `glue.indexstore.opensearch.auth.password`
    + * AWSSigV4 Auth requires `glue.indexstore.opensearch.auth.region` and `glue.auth.role_arn`
    + * `glue.indexstore.opensearch.region` [Required for awssigv4 auth]
docs/user/ppl/cmd/regex.md-98-98 (1)
98-98: Fix escaping in Example 4 regex pattern.
The regex pattern in the Example 4 code block uses double backslashes (`\\d{3,4}\\s+`), which differs from Example 3 (line 78: `@pyrami\.com$`). Within Markdown code blocks, backslashes should appear as single characters to represent the actual regex pattern. This inconsistency may confuse users about the correct syntax.
Apply this diff to align the escaping with other examples:
    -source=accounts | regex address="\\d{3,4}\\s+[A-Z][a-z]+\\s+(Street|Lane|Court)" | fields account_number, address
    +source=accounts | regex address="\d{3,4}\s+[A-Z][a-z]+\s+(Street|Lane|Court)" | fields account_number, address
docs/user/ppl/cmd/head.md-8-8 (1)
8-8: Fix Markdown syntax escaping in the syntax definition.
Line 8 uses backslash escapes (`\<size\>` and `\<offset\>`) that appear to be remnants from reStructuredText but will render as literal backslashes in standard Markdown. Use backticks or angle-bracket HTML entities instead.
    -head [\<size\>] [from \<offset\>]
    +head [`size`] [from `offset`]
Alternatively, if you prefer angle brackets:
    -head [\<size\>] [from \<offset\>]
    +head [<size>] [from <offset>]
docs/user/ppl/cmd/search.md-666-666 (1)
666-666: Use markdown headings instead of bold emphasis.
Lines 666 and 705 use bold emphasis (`**text**`) to style section headers, but they should use markdown heading syntax (`##`) for proper document structure and semantic meaning.
    -**Backslash in file paths**
    +## Backslash in file paths
    -**Text with special characters**
    +## Text with special characters
Also applies to: 705-705
docs/user/ppl/cmd/search.md-24-24 (1)
24-24: Fix grammar: use hyphens for compound adjectives.
Lines 24 and 89 have compound adjective formatting issues:
- Line 24: "other PPL commands" → "other-PPL commands" (or restructure to "unlike other PPL commands")
- Line 89: "multi field" → "multi-field"
    -**Full Text Search**: Unlike other PPL commands, search supports both quoted and unquoted strings.
    +**Full Text Search**: Unlike other-PPL commands, search supports both quoted and unquoted strings.
    -* Limitations: No wildcards for partial IP matching. For wildcard search use multi field with keyword:
    +* Limitations: No wildcards for partial IP matching. For wildcard search use multi-field with keyword:
Also applies to: 89-89
docs/user/ppl/cmd/search.md-92-93 (1)
92-93: Fix list indentation formatting.
Lines 92-93 have incorrect indentation for unordered list items. Unordered list items should start at column 0, not be indented by 3 spaces.
     **Field Type Performance Tips**:
    - * Each field type has specific search capabilities and limitations. Using the wrong field type during ingestion impacts performance and accuracy
    - * For wildcard searches on non-keyword fields: Add a keyword field copy for better performance. Example: If you need wildcards on a text field, create `message.keyword` alongside `message`
    +* Each field type has specific search capabilities and limitations. Using the wrong field type during ingestion impacts performance and accuracy
    +* For wildcard searches on non-keyword fields: Add a keyword field copy for better performance. Example: If you need wildcards on a text field, create `message.keyword` alongside `message`
docs/user/ppl/cmd/subquery.md-92-106 (1)
92-106: Add language specification to fenced code block.
Code blocks should declare a language for proper syntax highlighting and linting compliance.
    -```
    +```ppl
     // Assumptions: `a`, `b` are fields of table outer, `c`, `d` are fields of table inner, `e`, `f` are fields of table nested
     source = outer | where exists [ source = inner | where a = c ]
     source = outer | where not exists [ source = inner | where a = c ]
     source = outer | where exists [ source = inner | where a = c and b = d ]
     source = outer | where not exists [ source = inner | where a = c and b = d ]
     source = outer exists [ source = inner | where a = c ] // search filtering with subquery
     source = outer not exists [ source = inner | where a = c ] //search filtering with subquery
     source = table as t1 exists [ source = table as t2 | where t1.a = t2.a ] //table alias is useful in exists subquery
     source = outer | where exists [ source = inner1 | where a = c and exists [ source = nested | where c = e ] ] //nested
     source = outer | where exists [ source = inner1 | where a = c | where exists [ source = nested | where c = e ] ] //nested
     source = outer | where exists [ source = inner | where c > 10 ] //uncorrelated exists
     source = outer | where not exists [ source = inner | where c > 10 ] //uncorrelated exists
     source = outer | where exists [ source = inner ] | eval l = "nonEmpty" | fields l //special uncorrelated exists
    -```
    +```
docs/user/ppl/cmd/subquery.md-110-135 (1)
110-135: Add language specification to fenced code block.
Code blocks should declare a language for proper syntax highlighting and linting compliance.
    -```
    +```ppl
     //Uncorrelated scalar subquery in Select
     source = outer | eval m = [ source = inner | stats max(c) ] | fields m, a
     source = outer | eval m = [ source = inner | stats max(c) ] + b | fields m, a
     //Uncorrelated scalar subquery in Where**
     source = outer | where a > [ source = inner | stats min(c) ] | fields a
     //Uncorrelated scalar subquery in Search filter
     source = outer a > [ source = inner | stats min(c) ] | fields a
     //Correlated scalar subquery in Select
     source = outer | eval m = [ source = inner | where outer.b = inner.d | stats max(c) ] | fields m, a
     source = outer | eval m = [ source = inner | where b = d | stats max(c) ] | fields m, a
     source = outer | eval m = [ source = inner | where outer.b > inner.d | stats max(c) ] | fields m, a
     //Correlated scalar subquery in Where
     source = outer | where a = [ source = inner | where outer.b = inner.d | stats max(c) ]
     source = outer | where a = [ source = inner | where b = d | stats max(c) ]
     source = outer | where [ source = inner | where outer.b = inner.d OR inner.d = 1 | stats count() ] > 0 | fields a
     //Correlated scalar subquery in Search filter
     source = outer a = [ source = inner | where b = d | stats max(c) ]
     source = outer [ source = inner | where outer.b = inner.d OR inner.d = 1 | stats count() ] > 0 | fields a
     //Nested scalar subquery
     source = outer | where a = [ source = inner | stats max(c) | sort c ] OR b = [ source = inner | where c = 1 | stats min(d) | sort d ]
     source = outer | where a = [ source = inner | where c = [ source = nested | stats max(e) by f | sort f ] | stats max(d) by c | sort c | head 1 ]
    -RelationSubquery
    +
    +**RelationSubquery**
    +
    -```
    +```ppl
     source = table1 | join left = l right = r on condition [ source = table2 | where d > 10 | head 5 ] //subquery in join right side
     source = [ source = table1 | join left = l right = r [ source = table2 | where d > 10 | head 5 ] | stats count(a) by b ] as outer | head 1
    -```
    +```
docs/user/ppl/cmd/subquery.md-77-88 (1)
77-88: Add language specification to fenced code block.
Code blocks should declare a language for proper syntax highlighting and linting compliance.
    -```
    +```ppl
     source = outer | where a in [ source = inner | fields b ]
     source = outer | where (a) in [ source = inner | fields b ]
     source = outer | where (a,b,c) in [ source = inner | fields d,e,f ]
     source = outer | where a not in [ source = inner | fields b ]
     source = outer | where (a) not in [ source = inner | fields b ]
     source = outer | where (a,b,c) not in [ source = inner | fields d,e,f ]
     source = outer a in [ source = inner | fields b ] // search filtering with subquery
     source = outer a not in [ source = inner | fields b ] // search filtering with subquery)
     source = outer | where a in [ source = inner1 | where b not in [ source = inner2 | fields c ] | fields b ] // nested
     source = table1 | inner join left = l right = r on l.a = r.a AND r.a in [ source = inner | fields d ] | fields l.a, r.a, b, c //as join filter
    -```
    +```
docs/user/ppl/cmd/subquery.md-196-196 (1)
196-196: Replace hard tab with spaces.
Line 196 contains hard tab characters. Use spaces for consistent indentation across the file.
    - }
    + }
Committable suggestion skipped: line range outside the PR's diff.
docs/user/ppl/cmd/sort.md-14-17 (1)
14-17: Fix malformed block quote formatting.
Lines 16–17 contain bare `>` markers with no content, which creates malformed block quotes. Remove them.
Apply this diff:
     > **Note:**
     > You cannot mix +/- and asc/desc in the same sort command. Choose one approach for all fields in a single sort command.
    ->
    ->
docs/user/ppl/cmd/sort.md-93-93 (1)
93-93: Fix grammar: "document" → "documents".
Apply this diff:
    -This example shows sorting all the document by the age field in descending order using the desc keyword.
    +This example shows sorting all the documents by the age field in descending order using the desc keyword.
docs/user/ppl/cmd/sort.md-8-8 (1)
8-8: Fix unnecessary escape sequences in the syntax line.
The syntax line uses `\|` and `\-` which render as literal backslashes. In Markdown, pipes `|` and hyphens `-` do not need escaping in this context.
Apply this diff:
    -sort [count] <[+\|-] sort-field \| sort-field [asc\|a\|desc\|d]>...
    +sort [count] <[+|-] sort-field | sort-field [asc|a|desc|d]>...
docs/user/ppl/cmd/parse.md-88-129 (1)
88-129: Add language identifiers to fenced code blocks in Limitations section.
The code examples at lines 95, 103, 111, 119, 127 are missing language specifications. Per the Markdown guidelines in `docs/dev/testing-doctest.md`, all fenced code blocks should specify a language. These should be marked as `ppl` (for PPL queries) or `bash` as appropriate.
Apply this diff to add language specifications:
    - ```
    + ```ppl
     source=accounts | parse address '\d+ (?<street>.+)' | parse street '\w+ (?<road>\w+)' ;
    - ```
    + ```
Repeat for the other limitation examples (lines 103, 111, 119, 127).
docs/dev/testing-doctest.md-61-111 (1)
61-111: Markdown documentation guidelines are comprehensive and well-structured. The new section provides clear, actionable guidance for writing PPL documentation in Markdown format with examples of the paired input/output pattern and testing configuration. However, apply a minor capitalization fix.Line 61 should capitalize "Markdown" as a proper noun:
-#### RST Format (SQL docs only. On Deprecation path. Use markdown for PPL) +#### RST Format (SQL docs only. On Deprecation path. Use Markdown for PPL)docs/user/ppl/cmd/eventstats.md-28-29 (1)
28-29: Fix list indentation for bucket_nullable defaults.Lines 28-29 have inconsistent indentation (1 space) relative to the parent list item. They should have 0 spaces to align with sibling list items.
Apply this diff to fix the indentation:
* bucket_nullable: optional. Controls whether the eventstats command consider null buckets as a valid group in group-by aggregations. When set to `false`, it will not treat null group-by values as a distinct group during aggregation. **Default:** Determined by `plugins.ppl.syntax.legacy.preferred`. - * When `plugins.ppl.syntax.legacy.preferred=true`, `bucket_nullable` defaults to `true` - * When `plugins.ppl.syntax.legacy.preferred=false`, `bucket_nullable` defaults to `false` + * When `plugins.ppl.syntax.legacy.preferred=true`, `bucket_nullable` defaults to `true` + * When `plugins.ppl.syntax.legacy.preferred=false`, `bucket_nullable` defaults to `false`docs/user/ppl/cmd/streamstats.md-77-90 (1)
77-90: Specify language for the code block showing command syntax.The fenced code block at line 77 lacks a language identifier. Use
```pplto declare it as PPL syntax (or another appropriate language if these are examples in a different syntax).Apply this diff:
-```
+```ppl
 source = table | streamstats avg(a)
 source = table | streamstats current = false avg(a)
 source = table | streamstats window = 5 sum(b)
 source = table | streamstats current = false window = 2 max(a)
 source = table | streamstats count(c)
 source = table | streamstats min(c), max(c) by b
 source = table | streamstats count(c) as count_by by b | where count_by > 1000
 source = table | streamstats dc(field) as distinct_count
 source = table | streamstats distinct_count(category) by region
 source = table | streamstats current=false window=2 global=false avg(a) by b
 source = table | streamstats window=2 reset_before=a>31 avg(b)
 source = table | streamstats current=false reset_after=a>31 avg(b) by c
docs/user/ppl/cmd/streamstats.md-149-162 (1)
149-162: Convert indented code block to fenced code block in Example 3.
The "original data" table at line 149 uses indented code block syntax. Use fenced code blocks (triple backticks) for consistency with the rest of the document and to satisfy markdown linting standards.
Apply this diff to convert the indented block to fenced:
 * global=true: a global window is applied across all rows, but the calculations inside the window still respect the by groups.
 * global=false: the window itself is created per group, meaning each group gets its own independent window.
 This example shows how to calculate the running average of age across accounts by country, using global argument.
-original data
- +-------+---------+------------+-------+------+-----+
- | name | country | state | month | year | age |
- |-------+---------+------------+-------+------+-----+
- | Jake | USA | California | 4 | 2023 | 70 |
- | Hello | USA | New York | 4 | 2023 | 30 |
- | John | Canada | Ontario | 4 | 2023 | 25 |
- | Jane | Canada | Quebec | 4 | 2023 | 20 |
- | Jim | Canada | B.C | 4 | 2023 | 27 |
- | Peter | Canada | B.C | 4 | 2023 | 57 |
- | Rick | Canada | B.C | 4 | 2023 | 70 |
- | David | USA | Washington | 4 | 2023 | 40 |
- +-------+---------+------------+-------+------+-----+
+original data
+```
+| name | country | state | month | year | age |
+|-------|---------|------------|-------|------|-----|
+| Jake | USA | California | 4 | 2023 | 70 |
+| Hello | USA | New York | 4 | 2023 | 30 |
+| John | Canada | Ontario | 4 | 2023 | 25 |
+| Jane | Canada | Quebec | 4 | 2023 | 20 |
+| Jim | Canada | B.C | 4 | 2023 | 27 |
+| Peter | Canada | B.C | 4 | 2023 | 57 |
+| Rick | Canada | B.C | 4 | 2023 | 70 |
+| David | USA | Washington | 4 | 2023 | 40 |
+```
docs/user/ppl/cmd/streamstats.md-36-37 (1)
36-37: Fix nested list indentation in the bucket_nullable description.
Lines 36–37 use inconsistent indentation for nested bullets. They should align with the parent list.
Apply this diff to fix the indentation:
 * bucket_nullable: optional. Controls whether the streamstats command consider null buckets as a valid group in group-by aggregations. When set to `false`, it will not treat null group-by values as a distinct group during aggregation. **Default:** Determined by `plugins.ppl.syntax.legacy.preferred`.
- * When `plugins.ppl.syntax.legacy.preferred=true`, `bucket_nullable` defaults to `true`
- * When `plugins.ppl.syntax.legacy.preferred=false`, `bucket_nullable` defaults to `false`
+ - When `plugins.ppl.syntax.legacy.preferred=true`, `bucket_nullable` defaults to `true`
+ - When `plugins.ppl.syntax.legacy.preferred=false`, `bucket_nullable` defaults to `false`
.github/workflows/sql-cli-integration-test.yml-12-12 (1)
12-12: Refine path filter patterns to prevent over-triggering.
The path filter patterns are too broad and will trigger unnecessary workflow runs:
- Lines 12, 27: `**gradle*` is ambiguous and will match any file containing "gradle" anywhere in the path, including potentially unrelated nested build artifacts.
- Lines 15-16, 30-31: `**/*.jar` and `**/*.pom` will trigger on any JAR or POM file in the entire repository, including transitive dependencies and build artifacts unrelated to the SQL command changes.
These overly broad patterns may cause the workflow to run excessively on unrelated changes, wasting CI resources.
Consider scoping the patterns more precisely. For example:
  - '**/*.java'
  - '**/*.g4'
  - '!sql-jdbc/**'
- - '**gradle*'
+ - 'gradle/**'
+ - 'gradle.properties'
+ - 'build.gradle'
  - '**lombok*'
  - 'integ-test/**'
- - '**/*.jar'
- - '**/*.pom'
+ - 'gradle/wrapper/**'
Alternatively, if broad dependency monitoring is intended, document that expectation in the workflow comments.
Also applies to: 15-16, 27-27, 30-31
🧹 Nitpick comments (29)
docs/user/ppl/admin/monitoring.md (1)
5-5: Fix grammar and style on line 5.
Three issues detected:
- "able to collect" → "can collect" (cleaner phrasing)
- "node level" → "node-level" (compound modifier requires hyphenation)
- "Cluster level" → "Cluster-level" (same reason)
Apply this diff:
-By a stats endpoint, you are able to collect metrics for the plugin within the interval. Note that only node level statistics collecting is implemented for now. In other words, you only get the metrics for the node you're accessing. Cluster level statistics have yet to be implemented.
+By a stats endpoint, you can collect metrics for the plugin within the interval. Note that only node-level statistics collecting is implemented for now. In other words, you only get the metrics for the node you're accessing. Cluster-level statistics have yet to be implemented.
docs/user/ppl/cmd/replace.md (2)
9-11: Reduce repetitive phrasing in parameter descriptions.
The word "mandatory" is repeated three times across consecutive parameter definitions. Consider varying the phrasing for better readability.
- * pattern: mandatory. The text pattern you want to replace.
- * replacement: mandatory. The text you want to replace with.
- * field-name: mandatory. One or more field names where the replacement should occur.
+ * pattern: The text pattern you want to replace (required).
+ * replacement: The replacement text (required).
+ * field-name: One or more field names where the replacement should occur (required).
111-111: Use hyphen for compound adjective modifying noun.
"Pattern matching" should be hyphenated when it functions as a compound adjective preceding a noun.
- Since replace command only supports plain string literals, you can use LIKE command with replace for pattern matching needs.
+ Since replace command only supports plain string literals, you can use LIKE command with replace for pattern-matching needs.
docs/user/ppl/cmd/rename.md (1)
9-10: Simplify the parameter descriptions to reduce repetition.
Line ~9 introduces "field you want to rename" and line ~10 repeats "name you want to rename to," creating redundant phrasing. Simplify for clarity.
-* source-field: mandatory. The name of the field you want to rename. Supports wildcard patterns using `*`.
-* target-field: mandatory. The name you want to rename to. Must have same number of wildcards as the source.
+* source-field: mandatory. The field to rename. Supports wildcard patterns using `*`.
+* target-field: mandatory. The new field name. Must have the same number of wildcards as the source.
This version removes the repetitive "you want to" phrasing and improves conciseness.
docs/user/ppl/admin/connectors/prometheus_connector.md (5)
10-18: Fix unordered list indentation to match Markdown style guide.
Bullets and nested items use 4-space/8-space indentation, but the project's Markdown style expects 2-space/4-space. This will be flagged by markdownlint.
Apply this diff to fix indentation:
-* `prometheus.uri` [Required].
-    * This parameters provides the URI information to connect to a prometheus instance.
-* `prometheus.auth.type` [Optional]
-    * This parameters provides the authentication type information.
-    * Prometheus connector currently supports `basicauth` and `awssigv4` authentication mechanisms.
-    * If prometheus.auth.type is basicauth, following are required parameters.
-        * `prometheus.auth.username` and `prometheus.auth.password`.
-    * If prometheus.auth.type is awssigv4, following are required parameters.
-        * `prometheus.auth.region`, `prometheus.auth.access_key` and `prometheus.auth.secret_key`
+* `prometheus.uri` [Required].
+  * This parameters provides the URI information to connect to a prometheus instance.
+* `prometheus.auth.type` [Optional]
+  * This parameters provides the authentication type information.
+  * Prometheus connector currently supports `basicauth` and `awssigv4` authentication mechanisms.
+  * If prometheus.auth.type is basicauth, following are required parameters.
+    * `prometheus.auth.username` and `prometheus.auth.password`.
+  * If prometheus.auth.type is awssigv4, following are required parameters.
+    * `prometheus.auth.region`, `prometheus.auth.access_key` and `prometheus.auth.secret_key`
229-230: Standardize indentation in code examples to 2-space bullets.
Lines 229–230 use 4-space indentation for list items; adjust to 2-space for consistency.
-    - `source=my_prometheus.query_range('prometheus_http_requests_total', 1686694425, 1686700130, 14)`
-    - `source=my_prometheus.query_range(query='prometheus_http_requests_total', starttime=1686694425, endtime=1686700130, step=14)`
+  - `source=my_prometheus.query_range('prometheus_http_requests_total', 1686694425, 1686700130, 14)`
+  - `source=my_prometheus.query_range(query='prometheus_http_requests_total', starttime=1686694425, endtime=1686700130, step=14)`
260-261: Standardize indentation in query_exemplars code examples to 2-space bullets.
Lines 260–261 use 4-space indentation; adjust to 2-space for consistency.
-    - `source=my_prometheus.query_exemplars('prometheus_http_requests_total', 1686694425, 1686700130)`
-    - `source=my_prometheus.query_exemplars(query='prometheus_http_requests_total', starttime=1686694425, endtime=1686700130)`
+  - `source=my_prometheus.query_exemplars('prometheus_http_requests_total', 1686694425, 1686700130)`
+  - `source=my_prometheus.query_exemplars(query='prometheus_http_requests_total', starttime=1686694425, endtime=1686700130)`
101-101: Clarify compound word "endtime" for better readability.
Line 101 uses "endtime" as a single word. For clarity in prose (as opposed to code/parameter names), consider "end time" (two words) or verify this matches the API documentation convention.
102-102: Hyphenate "auto-determined" in prose.
Line 102 reads "auto determined"; use the hyphenated form "auto-determined" for adjective modifiers.
-* In case of stats, resolution is auto determined from the time range set.
+* In case of stats, resolution is auto-determined from the time range set.
docs/user/ppl/admin/settings.md (2)
5-5: Simplify phrasing for clarity.
"Able to change" can be replaced with "can" for more direct, concise phrasing.
-When OpenSearch bootstraps, PPL plugin will register a few settings in OpenSearch cluster settings. Most of the settings are able to change dynamically so you can control the behavior of PPL plugin without need to bounce your cluster.
+When OpenSearch bootstraps, PPL plugin will register a few settings in OpenSearch cluster settings. Most of the settings can change dynamically so you can control the behavior of PPL plugin without need to bounce your cluster.
209-209: Add hyphen to compound adjective.
"Performance sensitive" should be hyphenated when used as a compound modifier before a noun.
-Since 3.3.0, join types `inner`, `left`, `outer` (alias of `left`), `semi` and `anti` are supported by default. `right`, `full`, `cross` are performance sensitive join types which are disabled by default. Set config `plugins.calcite.all_join_types.allowed = true` to enable.
+Since 3.3.0, join types `inner`, `left`, `outer` (alias of `left`), `semi` and `anti` are supported by default. `right`, `full`, `cross` are performance-sensitive join types which are disabled by default. Set config `plugins.calcite.all_join_types.allowed = true` to enable.
docs/user/ppl/admin/security.md (3)
6-6: Fix capitalization: use "REST API" instead of "Rest API".
For consistency with standard terminology and other documentation sections (e.g., line 46), capitalize "REST API" properly.
-## Using Rest API
+## Using REST API
10-10: Improve grammar and punctuation.
The sentence structure is unclear. Use a comma instead of a period before "then" and lowercase it, or restructure the sentence for clarity.
-Example: Create the ppl_role for test_user. then test_user could use PPL to query `ppl-security-demo` index.
+Example: Create the ppl_role for test_user, then test_user can use PPL to query the `ppl-security-demo` index.
65-65: Consider the stability and accessibility of the external image link.
The image is hosted on a GitHub user content URL, which could become stale or inaccessible if the user account or repository changes. Consider embedding the image in the repository or using a more stable reference.
docs/user/ppl/cmd/timechart.md (1)
8-8: Consider removing unnecessary backslash escaping in the syntax line.
In Markdown, angle brackets don't need escaping and can be written directly. The backslashes (`\<`, `\>`) are unnecessary and are likely an artifact from reStructuredText migration.
-timechart [timefield=\<field_name\>] [span=\<time_interval\>] [limit=\<number\>] [useother=\<boolean\>] \<aggregation_function\> [by \<field\>]
+timechart [timefield=<field_name>] [span=<time_interval>] [limit=<number>] [useother=<boolean>] <aggregation_function> [by <field>]
docs/user/ppl/cmd/multisearch.md (1)
20-25: Fix Markdown syntax formatting in the syntax specification.
Line 22 uses backslash-escaped angle brackets (`\<subsearch1\>`) which are unnecessary in Markdown and may render as visible backslashes. The syntax specification should use standard placeholder notation without escaping, or alternatively match the square-bracket format shown in actual usage examples.
Consider one of these approaches:
Option 1: Use plain angle brackets without escaping:
-multisearch \<subsearch1\> \<subsearch2\> \<subsearch3\> ...
+multisearch <subsearch1> <subsearch2> <subsearch3> ...
Option 2: Use square bracket notation to match actual usage:
-multisearch \<subsearch1\> \<subsearch2\> \<subsearch3\> ...
+multisearch [subsearch1] [subsearch2] [subsearch3] ...
docs/user/ppl/cmd/chart.md (3)
8-8: Simplify the syntax block for readability.
The syntax line is quite dense. Breaking it into separate lines or adding a visual hierarchy would improve clarity for users trying to understand the command structure at a glance.
Consider reformatting for better readability:
chart [limit=(top|bottom) <number>] [useother=<boolean>] [usenull=<boolean>] [nullstr=<string>] [otherstr=<string>] <aggregation_function> [by <row_split> [<column_split>] | over <row_split> [by <column_split>]]
31-37: Clarify the distinction between `by` and `over...by...` syntax modes.
The two grouping syntaxes (`by` and `over...by...`) are described sequentially but the functional relationship between them isn't immediately clear: specifically, that `over <field>` alone is equivalent to `by <field>`, while both can be combined as `over <row> by <column>`. A concise summary statement would improve scannability.
Consider adding a clarifying statement:
* by: Groups results by one or two fields. When two fields are provided, the first is the row split and the second is the column split.
* over...by...: Alternative syntax for the same grouping capability:
  - `over <row_split>` is equivalent to `by <row_split>`
  - `over <row_split> by <column_split>` is equivalent to `by <row_split> <column_split>`
42-42: Emphasize the null-handling behavior for aggregation functions.
This note explains important behavior (exclusion of null values during aggregation), but its placement and phrasing could make it more discoverable. Users designing queries need to understand this early to avoid surprises.
Consider relocating this note to the main "Notes" section header or emphasizing it more prominently as a distinct caveat. You might also add an example showing the impact, such as:
## Notes on Null Handling and Aggregation
* **Aggregation exclusion:** Documents with null values in fields used by the aggregation function are excluded from aggregation. For example, `chart avg(balance)` will not include documents where balance is null.
docs/user/ppl/cmd/spath.md (1)
18-18: Minor phrasing improvement: clarify "simplest spath".
The phrase "The simplest spath is to extract a single field" reads awkwardly. Consider rephrasing to "A simple spath command extracts a single field" or "The simplest spath example extracts a single field."
docs/user/ppl/admin/datasources.md (7)
38-38: Use hyphens for compound adjectives.
Lines 38 and 48 contain compound adjectives that need hyphenation for correctness:
- "security disabled domains" → "security-disabled domains"
Also, capitalize the sentence on line 48: "we can remove" → "We can remove"
- * In case of security disabled domains, authorization is disbaled.
+ * In case of security-disabled domains, authorization is disabled.
...
- we can remove authorization and other details in case of security disabled domains.
+ We can remove authorization and other details in case of security-disabled domains.
Also applies to: 48-48
38-38: Fix typo: "disbaled" → "disabled".
Line 38 contains a typo: "authorization is disbaled" should be "authorization is disabled".
106-106: Fix formatting in API endpoint headers.
Lines 106 and 115 are missing proper spacing and closing parenthesis before the code block:
- Line 106: `Datasource Read GET API("_plugins/_query/_datasources/{{dataSourceName}}"`
- Line 115: `Datasource Deletion DELETE API("_plugins/_query/_datasources/{{dataSourceName}}"`
Both should close the parenthetical and include a line break before the code block.
- * Datasource Read GET API("_plugins/_query/_datasources/{{dataSourceName}}"
+ * Datasource Read GET API (`_plugins/_query/_datasources/{{dataSourceName}}`)
- * Datasource Deletion DELETE API("_plugins/_query/_datasources/{{dataSourceName}}"
+ * Datasource Deletion DELETE API (`_plugins/_query/_datasources/{{dataSourceName}}`)
Also applies to: 115-115
148-148: Hyphenate numeric compound adjectives.
Line 148: "24 character master key" should use a hyphen: "24-character master key".
- * Sample python script to generate a 24 character master key
+ * Sample python script to generate a 24-character master key
202-202: Consider more concise phrasing.
Line 202 uses "prior to" which is somewhat wordy. Consider: "In versions before 2.7" or "Earlier than version 2.7".
- * In versions prior to 2.7, the plugins.query.federation.datasources.config key store setting was used to configure datasources, but it has been deprecated and will be removed in version 3.0.
+ * In versions before 2.7, the plugins.query.federation.datasources.config key store setting was used to configure datasources, but it has been deprecated and will be removed in version 3.0.
224-224: Use British/American English consistently: "in future" → "in the future".
Line 226: The phrase "in future" is British English. For consistency with the rest of the documentation, use "in the future".
- In the current state, `information_schema` only support metadata of tables.
- This schema will be extended for views, columns and other metadata info in future.
+ In the current state, `information_schema` only supports metadata of tables.
+ This schema will be extended for views, columns and other metadata info in the future.
Also note: Line 225 should use "supports" (singular agreement with `information_schema`) rather than "support".
224-228: Clarify awkward phrasing for better readability.
Line 224: "query tables information under a datasource" is awkward. Suggested revision: "query information about tables in a datasource" or "retrieve table metadata from a datasource".
- Use `information_schema` in source command to query tables information under a datasource.
+ Use `information_schema` in source command to query table information within a datasource.
README.md (1)
91-91: Remove redundant "Language" from "SQL Language Reference Manual".
"SQL" already stands for "Structured Query Language", making "SQL Language" redundant. Simplify to "SQL Reference Manual" or "Language Reference Manual".
-Please refer to the [SQL Language Reference Manual](./docs/user/index.rst), [Piped Processing Language (PPL) Reference Manual](./docs/user/ppl/index.md), [OpenSearch SQL/PPL Engine Development Manual](./docs/dev/index.md) and [Technical Documentation](https://opensearch.org/docs/latest/search-plugins/sql/index/) for detailed information on installing and configuring plugin.
+Please refer to the [SQL Reference Manual](./docs/user/index.rst), [Piped Processing Language (PPL) Reference Manual](./docs/user/ppl/index.md), [OpenSearch SQL/PPL Engine Development Manual](./docs/dev/index.md) and [Technical Documentation](https://opensearch.org/docs/latest/search-plugins/sql/index/) for detailed information on installing and configuring plugin.
core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java (1)
198-244: Custom SPLIT implementation with empty-delimiter handling looks sound; consider clarifying character semantics
The SPLIT wiring and implementation look correct overall:
- Static import and registration under `BuiltinFunctionName.SPLIT` match the new enum constant.
- The two-arg `FunctionImp2` with `PPLTypeChecker.family(SqlTypeFamily.CHARACTER, SqlTypeFamily.CHARACTER)` aligns with `str, delimiter` usage.
- `CASE` over `delimiter = ''` cleanly preserves existing behavior for non-empty delimiters while adding the “split into elements” special case.
One minor semantic point to double-check:
- For the empty-delimiter branch you use `REGEXP_EXTRACT_ALL(str, ".")`. In Calcite-style regex, `"."` does not match newline characters and works on regex “characters” (code units), not necessarily full Unicode code points.
- If the intended behavior is “truly per-character including newlines / full Unicode,” you may want to either:
  - Document this as “regex-based per-character” semantics, or
  - Adjust the pattern (e.g., `(?s).` or an equivalent) if matching newlines is required.
Also, please ensure tests cover:
- `split(field, '')` with strings containing newlines and multi-byte characters.
- Type compatibility of the CASE branches (both returning the same array element type) to avoid dialect-specific surprises.
Overall, the design is good; this is just a small semantics/clarity check.
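The newline point above can be illustrated with a short Python sketch. Python's `re` module is only a stand-in here: it is not the regex engine behind `REGEXP_EXTRACT_ALL`, and its `.` matches whole code points rather than code units, so the sketch shows the newline behavior only. The helper name `split_chars` is hypothetical.

```python
import re


def split_chars(text, dotall=False):
    """Return every single-character regex match in text.

    Hypothetical helper that mimics an empty-delimiter split built on a
    "match one character" pattern.
    """
    # Without (?s), "." does not match "\n"; with (?s), it does.
    pattern = r"(?s)." if dotall else r"."
    return re.findall(pattern, text)


sample = "a\nb"
default_parts = split_chars(sample)               # plain "." skips the newline
dotall_parts = split_chars(sample, dotall=True)   # (?s) keeps the newline
print(default_parts, dotall_parts)
```

With the plain pattern the newline silently disappears from the result, which is the case the suggested `split(field, '')` tests would need to pin down.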
Also applies to: 993-1020
453887b to
4becf12
Compare
Signed-off-by: Asif Bashar <[email protected]>
Signed-off-by: Asif Bashar <[email protected]>
Signed-off-by: Asif Bashar <[email protected]>
Signed-off-by: Asif Bashar <[email protected]>
Signed-off-by: Asif Bashar <[email protected]>
Signed-off-by: Asif Bashar <[email protected]>
@LantaoJin after resolving conflict, the review mark has been reset
Description
Is your feature request related to a problem?
The addtotals command shows the total of all columns of each row as a new column, and optionally shows the total of each column across all rows as a summary row at the end.
Fixes issue #4607
From roadmap #4287
The addcoltotals command shows the total of each column across all rows as a summary row at the end.
From roadmap #4287
What solution would you like?
Commands: addtotals, addcoltotals
addtotals: Adds totals across each row by default, and also calculates totals down each column when col=true.
The addtotals command adds together the numeric fields in each search result.
You may specify which fields to include rather than summing all numeric fields.
The final total is stored in a new field.
The addtotals command's behavior is as follows:
When col=true, it computes the sum for every column and adds a summary row at the end containing those totals.
To label this final summary row, specify a labelfield and assign it a value using the label option.
Alternatively, instead of using the addtotals col=true command, you can use the addcoltotals command to calculate a summary event.
labelfield, if specified, names the field in the summary (last) row that receives the value set by the label option.
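The behavior described above can be sketched in plain Python. This is only an illustrative model of the described semantics, not the plugin's implementation: the function name `add_totals`, its keyword arguments, and the sample rows are hypothetical, and the handling of edge cases (the summary row is computed from the input fields only and does not sum the per-row total field) is an assumption.

```python
from numbers import Number


def add_totals(rows, fields=None, row=True, col=False,
               fieldname="Total", labelfield=None, label="Total"):
    """Illustrative model of the described addtotals semantics (hypothetical)."""

    def is_num(value):
        # bool is a Number subclass in Python, so exclude it from sums.
        return isinstance(value, Number) and not isinstance(value, bool)

    def selected(record):
        # Sum only the listed fields, or every field when no list is given.
        return fields if fields is not None else list(record.keys())

    out = []
    for record in rows:
        new = dict(record)
        if row:
            # Per-row total, stored in a new field (named Total by default).
            new[fieldname] = sum(
                record[f] for f in selected(record) if is_num(record.get(f))
            )
        out.append(new)

    if col:
        # Summary event: per-field totals across all input rows.
        # Assumption: the per-row total field is not summed here.
        summary = {}
        for record in rows:
            for f in selected(record):
                if is_num(record.get(f)):
                    summary[f] = summary.get(f, 0) + record[f]
        if labelfield is not None:
            # The label is written into labelfield on the summary row only.
            summary[labelfield] = label
        out.append(summary)
    return out


rows = [{"name": "a", "x": 1, "y": 2}, {"name": "b", "x": 3, "y": 4}]
result = add_totals(rows, col=True, labelfield="name", label="Sum")
print(result)
```

In this model, calling `add_totals(rows, row=False, col=True)` corresponds to the summary-event-only behavior that the description attributes to addcoltotals.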
Command Syntax:
addtotals [row=<bool>] [col=<bool>] [labelfield=<field>] [label=<string>] [fieldname=<field>] [<field-list>]
Arguments description:
row: Syntax: row=<bool>. Indicates whether to compute the sum of the <field-list> for each event. This works like generating a total for each row in a table. The result is stored in a new field, which is named Total by default. To use a different field name, provide the fieldname argument. Default value is true.
col: Syntax: col=<bool>. Indicates whether to append a new event, called a summary event, to the end of the event list. This summary event shows the total for each field across all events, similar to calculating column totals in a table. Default is false.
fieldname: Syntax: fieldname=<field>. Specifies the name of the field that stores the calculated sum of the field-list for each event. This argument is only applicable when row=true. Default is Total.
field-list: Syntax: <field> .... One or more numeric fields separated by spaces. Only the fields listed in the <field-list> are included in the sum. If no <field-list> is provided, all numeric fields are summed by default.
labelfield: Syntax: labelfield=<field>. Specifies a field to use as the label for the summary event. This argument is only applicable when col=true. To use an existing field from your result set, provide its name as the value for the labelfield argument. For example, if the field is named salary, specify labelfield=salary. If no existing field matches the labelfield value, a new field is created using that value.
label: Syntax: label=<string>. Specifies a row label for the summary event. If the labelfield argument refers to an existing field in the result set, the label value appears in that field for the summary row. If the labelfield argument creates a new field, the label is placed in that new field in the summary event row. Default label is Total.
command addcoltotals: Adds the total of each column across all rows as a summary event at the end of the results.
addcoltotals: options
Optional Arguments
<field-list>: Syntax: <field> .... A space-delimited list of valid field names. addcoltotals calculates sums only for the fields you include in this list. By default, the command calculates the sum for all fields.
labelfield: Syntax: labelfield=<fieldname>. Field name to add to the result set.
label: Syntax: label=<string>. Used together with the labelfield argument to add a label to the summary event. If labelfield is not specified, the label argument has no effect. Default label is Total.
Related Issues
Resolves #4607
Check List
`--signoff` or `-s`.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.