Skip to content

[Task] New Hubble 2.0 version#4

Open
imbajin wants to merge 36 commits intomasterfrom
hubble2
Open

[Task] New Hubble 2.0 version#4
imbajin wants to merge 36 commits intomasterfrom
hubble2

Conversation

@imbajin
Copy link

@imbajin imbajin commented Nov 26, 2025

No description provided.

FrostyHec and others added 30 commits September 13, 2024 02:49
Eliminated a redundant protected client() method definition in BaseApiTest to clean up the code and avoid confusion.
Refactored several code blocks in HugeGraphLoader.java and JDBCFetcher.java to improve readability by adding line breaks, consistent indentation, and clearer formatting. No functional changes were made.
Added Node modules caching to GitHub Actions workflow for faster CI builds. Updated the Maven build to use yarn with --frozen-lockfile and --prefer-offline for more reliable installs. Replaced 'node-sass' with 'sass' in frontend dependencies to ensure compatibility and maintainability.
Resolved a merge conflict in convertStringToJSON.js and removed unrelated Java code. Added a 'resolutions' field for 'pretty-format' in package.json to address dependency versioning. yarn.lock added to track dependencies.
Updated Node.js version requirements and setup from 16.x to 18.20.8 in CI workflow, documentation, and build configuration. Also removed the 'resolutions' field for 'pretty-format' from package.json.
Added CI and NODE_OPTIONS environment variables to both the GitHub Actions workflow and the Maven build configuration. This sets CI to false and increases Node.js memory allocation to 4096MB to improve build stability and provide clearer environment diagnostics.
Introduces the .serena directory with onboarding documentation, project architecture, code style, workflows, and related memory files for HugeGraph Toolchain.

These files provide guidelines and quick start information for new developers, covering project overview, design patterns, code conventions, testing, and development workflows.
Copilot AI review requested due to automatic review settings November 26, 2025 07:36
@github-actions
Copy link

@codecov-ai-reviewer review

@coderabbitai
Copy link

coderabbitai bot commented Nov 26, 2025

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch hubble2

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces Hubble 2.0, a major architectural update that removes database-backed graph connections in favor of direct authentication tokens, adds multi-tenant graphspace support, and introduces new user management and algorithm service features.

Key Changes

  • Replaces database-stored connection management with token-based authentication using HugeClientPoolService
  • Adds graphspace-level organization and multi-tenancy capabilities
  • Introduces new algorithm service endpoints for OLTP/OLAP graph analytics
  • Implements user service with role-based access control and space administration

Reviewed changes

Copilot reviewed 205 out of 957 changed files in this pull request and generated 15 comments.

Show a summary per file
File Description
GraphsService.java New service for multi-graph management within graphspaces including statistics and built-in data loading
GraphService.java Updated to use HugeClient directly instead of connection IDs with new element operations
WhiteIpListService.java New service for IP whitelist management
UserService.java New comprehensive user management service with authentication and authorization
OltpAlgoService.java Major expansion adding 20+ graph traversal algorithm implementations
HugeClientPoolService.java Complete rewrite to support token-based client creation with URL discovery
GraphConnectionService.java Removed - functionality replaced by client pool service
HubbleOptions.java Added configuration options for PD cluster, dashboard, and monitoring services
CustomInterceptor.java Updated to inject HugeClient into requests based on graphspace/graph context
ExecuteHistory.java Updated schema from conn_id to graphspace/graph identifiers

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

boolean isClearSchema, boolean isClearData) {
// TODO client do not support clear Schema field. Check here
if (isClearSchema) {
client.graphs().clearGraph(graph,"I'm sure to delete all data");
Copy link

Copilot AI Nov 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing space after comma in method call argument

Suggested change
client.graphs().clearGraph(graph,"I'm sure to delete all data");
client.graphs().clearGraph(graph, "I'm sure to delete all data");

Copilot uses AI. Check for mistakes.

public void delete(HugeClient client, String graph, String confirmMessage) {
// TODO check if frontend support passing confirm message.
client.graphs().dropGraph(graph,confirmMessage);
Copy link

Copilot AI Nov 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing space after comma in method call argument

Suggested change
client.graphs().dropGraph(graph,confirmMessage);
client.graphs().dropGraph(graph, confirmMessage);

Copilot uses AI. Check for mistakes.
client);
HashMap<String, Object> vertexPropertiesMap= new HashMap<>();
vertexPropertiesMap.put("nonNullableProps", vlEntity.getNonNullableProps());
vertexPropertiesMap.put("NullableProps", vlEntity.getNullableProps());
Copy link

Copilot AI Nov 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent naming - key uses PascalCase "NullableProps" while other keys use camelCase. Should be "nullableProps" for consistency

Copilot uses AI. Check for mistakes.
client);
HashMap<String, Object> edgePropertiesMap= new HashMap<>();
edgePropertiesMap.put("nonNullableProps", elEntity.getNonNullableProps());
edgePropertiesMap.put("NullableProps", elEntity.getNullableProps());
Copy link

Copilot AI Nov 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent naming - key uses PascalCase "NullableProps" while other keys use camelCase. Should be "nullableProps" for consistency

Copilot uses AI. Check for mistakes.
Comment on lines +247 to +249
"common.param.must-be-null", "source_id");
Ex.check(entity.getTargetId() != null,
"common.param.must-be-null", "target_id");
Copy link

Copilot AI Nov 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error message key "common.param.must-be-null" is incorrect - should be "cannot-be-null" since the condition checks for null values

Suggested change
"common.param.must-be-null", "source_id");
Ex.check(entity.getTargetId() != null,
"common.param.must-be-null", "target_id");
"common.param.cannot-be-null", "source_id");
Ex.check(entity.getTargetId() != null,
"common.param.cannot-be-null", "target_id");

Copilot uses AI. Check for mistakes.
.map(GremlinUtil::escapeId)
.collect(Collectors.toList());
String ids = StringUtils.join(escapedIds, ",");
if (ids == ""){
Copy link

Copilot AI Nov 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

String comparison using == operator instead of isEmpty() or equals(). Should use ids.isEmpty() or ids.equals("")

Suggested change
if (ids == ""){
if (StringUtils.isEmpty(ids)){

Copilot uses AI. Check for mistakes.
.map(GremlinUtil::escapeId)
.collect(Collectors.toList());
String ids = StringUtils.join(escapedIds, ",");
if (ids == ""){
Copy link

Copilot AI Nov 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

String comparison using == operator instead of isEmpty() or equals(). Should use ids.isEmpty() or ids.equals("")

Suggested change
if (ids == ""){
if (ids.isEmpty()){

Copilot uses AI. Check for mistakes.
"FROM `load_task` as l LEFT JOIN `file_mapping` as f " +
"ON l.file_id=f.id WHERE l.job_id = #{job_id}")
JobManagerItem computeSizeDuration(@Param("job_id") int jobId);
JobManagerItem computeSizeDuration(@Param("job_id") int job_id);
Copy link

Copilot AI Nov 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parameter name uses snake_case 'job_id' instead of camelCase 'jobId' which is Java convention

Suggested change
JobManagerItem computeSizeDuration(@Param("job_id") int job_id);
JobManagerItem computeSizeDuration(@Param("job_id") int jobId);

Copilot uses AI. Check for mistakes.
HttpServletResponse response,
Object handler) {
String uri = request.getRequestURI();
if ("check".equals(uri) || "/api/v1.3/bill/cron".equals(uri)) { // TODO C rmv required?
Copy link

Copilot AI Nov 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO comment indicates uncertainty about requirement - should be clarified or resolved

Suggested change
if ("check".equals(uri) || "/api/v1.3/bill/cron".equals(uri)) { // TODO C rmv required?
// Allow unauthenticated access to "check" and "/api/v1.3/bill/cron" endpoints (e.g., for health checks or scheduled jobs)
if ("check".equals(uri) || "/api/v1.3/bill/cron".equals(uri)) {

Copilot uses AI. Check for mistakes.
@MergeProperty
@JsonProperty("conn_id")
private Integer connId;
@JsonProperty("graphe")
Copy link

Copilot AI Nov 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Property name misspelled as "graphe" instead of "graph"

Suggested change
@JsonProperty("graphe")
@JsonProperty("graph")

Copilot uses AI. Check for mistakes.
@gemini-code-assist
Copy link

Summary of Changes

Hello @imbajin, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new major version of Hubble, version 2.0, marking a significant evolution in its architecture and feature set. The changes move away from direct graph connection management towards a more robust and scalable model centered around 'graphspaces' and 'graphs'. This upgrade brings a wealth of new functionalities, including advanced graph algorithms, comprehensive authentication and authorization, improved monitoring and logging, and cutting-edge AI integration for natural language querying. The underlying build system and dependencies have also been thoroughly modernized to support these new capabilities.

Highlights

  • Architectural Shift to Graphspace/Graph Model: The core change involves a significant architectural shift from managing direct GraphConnection instances to a more flexible graphspace and graph model. This is evident in the removal of GraphConnectionController and GraphConnectionService, and the update of numerous API endpoints and internal services to use graphspace and graph path variables.
  • Major Dependency and Build Tool Updates: The pom.xml has undergone extensive changes, including upgrading Node.js requirements to 18.20.8, updating various Spring Boot, Elasticsearch, Jersey, and other core dependencies. New Maven plugins for compilation, cleaning, and testing have also been introduced.
  • Enhanced Authentication and Authorization: New controllers and services have been added for comprehensive authentication and authorization management, including user login/logout, user and role management, access control, IP whitelisting, and integration with superadmin and graph space admin roles.
  • Expanded Graph Algorithm Capabilities: Numerous new OLTP and OLAP graph algorithms have been integrated, including shortest path variations, rings, k-out/k-neighbor, Jaccard similarity, personal/neighbor rank, fusiform similarity, Adamic-Adar, resource allocation, and egonet. Dedicated controllers for OLAP and Vermeer algorithms are now present.
  • New Monitoring, Logging, and AI Integrations: Hubble now includes new features for monitoring (Prometheus, Dashboard), centralized logging (ES integration), and AI capabilities via LangChain for natural language to Gremlin/Cypher query generation. Proxy servlet configurations are also introduced for ingestion.
  • Schema and Graph Management Enhancements: New endpoints for managing schema templates, importing graph data from JSON, and detailed graph metrics (schema and element counts, growth rates) have been added. The schema definition now supports Groovy scripts for more flexible management.
Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/hubble-ci.yml
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a significant refactoring for Hubble 2.0, including a more RESTful API structure, improved client management, and new features like LangChain integration. The changes are extensive and move in a good direction. However, the pull request contains several issues that indicate it might be a work-in-progress, such as commented-out critical code (like authentication checks), numerous TODO comments, debug statements, and potential bugs (like a copy-paste error in an enum). These issues, especially the critical ones, should be addressed before this PR is merged to ensure the stability and correctness of the application.

private final String name;

static {
SerialEnum.register(AppName.class);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

There seems to be a copy-paste error here. The enum AppType is trying to register AppName.class with SerialEnum. This should be AppType.class instead. This could lead to incorrect serialization behavior.

Suggested change
SerialEnum.register(AppName.class);
SerialEnum.register(AppType.class);

Comment on lines +37 to +39
// if (request.getSession().getAttribute(Constant.TOKEN_KEY) == null) { // TODO C why anno.?
// throw new UnauthorizedException();
// }

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The authentication check in this interceptor is commented out. This effectively disables authentication for the intercepted API endpoints, which is a major security risk. If authentication is intended, this check must be enabled.

Comment on lines 57 to 58
} catch (Exception ignored) {
LOG.error("hugeConfig exception");
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

This empty catch block swallows exceptions silently. This is a dangerous practice as it can hide underlying problems. At a minimum, the exception should be logged. The original logging statement was removed in this change.

Comment on lines +61 to +63
HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1,
Constant.STATUS_OK,
"{\"status\": 401}");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

When the user is not authenticated (username == null), the servlet returns an HTTP 200 OK response with a JSON body {"status": 401}. The more standard and correct approach is to return an HTTP 401 Unauthorized status code directly. This makes the API behavior more predictable for clients.

Suggested change
HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1,
Constant.STATUS_OK,
"{\"status\": 401}");
HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1,
401,
"Unauthorized");

public static void main(String[] args) {
System.out.println("user.dir ==> " + System.getProperty("user.dir"));
initEnv();
TimeZone.setDefault(TimeZone.getTimeZone(ZoneOffset.of("+8")));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Setting the default timezone for the entire JVM with TimeZone.setDefault() is generally discouraged as it can have unintended side effects on other parts of the application and its dependencies that rely on the system's default timezone. It's better to handle timezones explicitly where needed, for example, when formatting dates.

Comment on lines +25 to +27
// TODO fix import
//import org.apache.hugegraph.client.api.traverser.NeighborRankAPI;
//import org.apache.hugegraph.client.api.traverser.PersonalRankAPI;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

There is a TODO fix import comment here, along with commented-out imports. This indicates incomplete work that should be resolved before merging.

Comment on lines +1 to +82
///*
// *
// * Licensed to the Apache Software Foundation (ASF) under one or more
// * contributor license agreements. See the NOTICE file distributed with this
// * work for additional information regarding copyright ownership. The ASF
// * licenses this file to You under the Apache License, Version 2.0 (the
// * "License"); you may not use this file except in compliance with the License.
// * You may obtain a copy of the License at
// *
// * http://www.apache.org/licenses/LICENSE-2.0
// *
// * Unless required by applicable law or agreed to in writing, software
// * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
// * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
// * License for the specific language governing permissions and limitations
// * under the License.
// */
//
//package org.apache.hugegraph.controller.auth;
//
//import java.util.List;
//
//import org.apache.hugegraph.driver.HugeClient;
//import org.apache.hugegraph.service.auth.AccessService;
//import org.springframework.beans.factory.annotation.Autowired;
//import org.springframework.web.bind.annotation.DeleteMapping;
//import org.springframework.web.bind.annotation.GetMapping;
//import org.springframework.web.bind.annotation.PathVariable;
//import org.springframework.web.bind.annotation.PostMapping;
//import org.springframework.web.bind.annotation.PutMapping;
//import org.springframework.web.bind.annotation.RequestBody;
//import org.springframework.web.bind.annotation.RequestMapping;
//import org.springframework.web.bind.annotation.RequestParam;
//import org.springframework.web.bind.annotation.RestController;
//
//import org.apache.hugegraph.common.Constant;
//import org.apache.hugegraph.entity.auth.AccessEntity;
//
//@RestController
//@RequestMapping(Constant.API_VERSION + "graphspaces/{graphspace}/auth/accesses")
//public class AccessController extends AuthController {
//
// @Autowired
// AccessService accessService;
//
// @GetMapping
// public List<AccessEntity> list(@PathVariable("graphspace") String graphSpace,
// @RequestParam(value="role_id", required = false) String rid,
// @RequestParam(value="target_id", required = false) String tid) {
// HugeClient client = this.authClient(graphSpace, null);
// return this.accessService.list(client, rid, tid);
// }
//
// @GetMapping("{id}")
// public AccessEntity get(@PathVariable("graphspace") String graphSpace,
// @PathVariable("id") String aid) {
// HugeClient client = this.authClient(graphSpace, null);
// return this.accessService.get(client, aid);
// }
//
// @PostMapping
// public AccessEntity add(@PathVariable("graphspace") String graphSpace,
// @RequestBody AccessEntity accessEntity) {
// HugeClient client = this.authClient(graphSpace, null);
// return this.accessService.addOrUpdate(client, accessEntity);
// }
//
// @PutMapping
// public AccessEntity update(@PathVariable("graphspace") String graphSpace,
// @RequestBody AccessEntity accessEntity) {
// HugeClient client = this.authClient(graphSpace, null);
// return this.accessService.addOrUpdate(client, accessEntity);
// }
//
// @DeleteMapping
// public void delete(@PathVariable("graphspace") String graphSpace,
// @RequestParam("role_id") String roleId,
// @RequestParam("target_id") String targetId) {
// HugeClient client = this.authClient(graphSpace, null);
// this.accessService.delete(client, roleId, targetId);
// }
//}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This entire file is commented out. If this feature is not ready, it should not be included in the pull request. Commented-out code reduces readability and should be removed from the codebase.

user.setSuperadmin(false);
userService.update(this.authClient(null, null), user);
} catch (Exception e) {
e.printStackTrace();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using e.printStackTrace() is discouraged in production code. It's better to use a structured logger (like SLF4J) to log the exception with appropriate context. This provides better control over log levels and output formats.

Suggested change
e.printStackTrace();
logger.error("Failed to delete super admin role for user {}", id, e);

<version>1.33</version>
</dependency>
<dependency>
<!-- TODO X FIX HERE-->

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This TODO comment indicates that there's an issue to be fixed with this dependency. Such comments should be resolved, or converted into proper issues in a tracking system, before merging the pull request to ensure code quality and maintainability. There are several other TODO comments in this file that should also be addressed.

Comment on lines +262 to +285
//// TODO REMOVED
//public static final ConfigOption<String> AFS_URI =
// new ConfigOption<>(
// "afs.uri",
// "the uri of afs stored for the olap algorithm's result",
// null,
// "afs://cnn-bd-main.afs.baidu.com:9902"
// );
//
//public static final ConfigOption<String> AFS_USER =
// new ConfigOption<>(
// "afs.user",
// "the user name for accessing afs stored",
// null,
// "user"
// );
//
//public static final ConfigOption<String> AFS_PASSWORD =
// new ConfigOption<>(
// "afs.password",
// "the user password for accessing afs stored",
// null,
// "password"
// );

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

These AFS-related options are commented out with a TODO REMOVED comment. This should be cleaned up by either removing the commented-out code block entirely or clarifying the status of this feature.

kenssa4eedfd and others added 2 commits November 27, 2025 11:58
…y conflict, and remove the null value check for the graph field in the login process.

private void judgeFileExist(String filePath) {
File file = new File(filePath);
if (!file.exists()) {

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression High

This path depends on a
user-provided value
.
This path depends on a
user-provided value
.
This path depends on a
user-provided value
.
model, ernieClientId, ernieClientSecret);

// 构造ProcessBuilder对象
ProcessBuilder pb = new ProcessBuilder(args1);

Check failure

Code scanning / CodeQL

Uncontrolled command line Critical

This command line depends on a
user-provided value
.

Copilot Autofix

AI 3 months ago

The best way to fix this problem is to ensure that the pythonPath value (the Python interpreter executable) is not arbitrarily provided by the user but instead restricted to a trusted set of installed interpreters. This can be achieved in several ways:

  • Preferably, maintain a hardcoded list of allowed interpreter paths, or only allow pythonPath to take on values like "python3", "python", or other known safe paths.
  • Alternatively, validate the provided pythonPath against a list of allowed values (either via direct comparison or regex matching for allowed paths, and reject anything containing unsafe characters).
  • If absolute paths must be supported (for multi-interpreter deployments), sanitize the input to allow only well-formed paths, ensuring no shell metacharacters can be injected.

Changes needed:

  • Implement a private method to validate pythonPath (e.g., validatePythonInterpreterPath). This method should reject any path containing potentially dangerous characters (such as ;, &, |, backticks, or whitespace), or only allow paths within a known set.
  • Call this validation method before constructing the command line in langchainNoSchema or within excutePythonByProcessBuilder.
  • Throw an exception if the validation fails, preventing any uncontrolled execution.

No external dependencies are required; standard Java utilities (such as regex or simple string checks) suffice.

Suggested changeset 1
hugegraph-hubble/hubble-be/src/main/java/org/apache/hugegraph/controller/langchain/LangChainController.java

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/hugegraph-hubble/hubble-be/src/main/java/org/apache/hugegraph/controller/langchain/LangChainController.java b/hugegraph-hubble/hubble-be/src/main/java/org/apache/hugegraph/controller/langchain/LangChainController.java
--- a/hugegraph-hubble/hubble-be/src/main/java/org/apache/hugegraph/controller/langchain/LangChainController.java
+++ b/hugegraph-hubble/hubble-be/src/main/java/org/apache/hugegraph/controller/langchain/LangChainController.java
@@ -207,6 +207,9 @@
             throw new RuntimeException("fileName must not be null");
         }
 
+        // Validate pythonPath before execution
+        validatePythonInterpreterPath(requestLangChainParams.pythonPath);
+
         List<String> result =
                 this.excutePythonByProcessBuilder(
                         requestLangChainParams.pythonPath, filePath,
@@ -548,6 +551,31 @@
         return argsList.toArray(new String[argsList.size()]);
     }
 
+    /**
+     * Validate python interpreter path to prevent command injection.
+     * Only allows trusted absolute paths or interpreter names.
+     * Throws RuntimeException if validation fails.
+     */
+    private void validatePythonInterpreterPath(String pythonPath) {
+        if (pythonPath == null || pythonPath.isEmpty()) {
+            throw new RuntimeException("pythonPath must not be empty");
+        }
+        // Only accept known interpreter names or absolute paths without dangerous characters
+        // E.g., allow only "python3", "python", "/usr/bin/python3", or "/usr/local/bin/python3"
+        // Disallow any path containing whitespace, semicolon, ampersand, pipe, or backtick
+        String allowedBinaryNames = "python python3";
+        // Check for simple binary name
+        if (allowedBinaryNames.contains(pythonPath.trim())) {
+            return;
+        }
+        // Check for well-formed absolute path, only allow word characters, slashes, dots
+        // Disallow dangerous shell metacharacters
+        if (!pythonPath.matches("^/?([\\w.-]+/)*[\\w.-]+$")) {
+            throw new RuntimeException("Invalid pythonPath: " + pythonPath);
+        }
+        // Optionally, you may check if the executable exists: new File(pythonPath).exists()
+    }
+
     private void checkParamsValid(GremlinQuery query) {
         Ex.check(!StringUtils.isEmpty(query.getContent()),
                 "common.param.cannot-be-null-or-empty",
EOF
@@ -207,6 +207,9 @@
throw new RuntimeException("fileName must not be null");
}

// Validate pythonPath before execution
validatePythonInterpreterPath(requestLangChainParams.pythonPath);

List<String> result =
this.excutePythonByProcessBuilder(
requestLangChainParams.pythonPath, filePath,
@@ -548,6 +551,31 @@
return argsList.toArray(new String[argsList.size()]);
}

/**
* Validate python interpreter path to prevent command injection.
* Only allows trusted absolute paths or interpreter names.
* Throws RuntimeException if validation fails.
*/
private void validatePythonInterpreterPath(String pythonPath) {
if (pythonPath == null || pythonPath.isEmpty()) {
throw new RuntimeException("pythonPath must not be empty");
}
// Only accept known interpreter names or absolute paths without dangerous characters
// E.g., allow only "python3", "python", "/usr/bin/python3", or "/usr/local/bin/python3"
// Disallow any path containing whitespace, semicolon, ampersand, pipe, or backtick
String allowedBinaryNames = "python python3";
// Check for simple binary name
if (allowedBinaryNames.contains(pythonPath.trim())) {
return;
}
// Check for well-formed absolute path, only allow word characters, slashes, dots
// Disallow dangerous shell metacharacters
if (!pythonPath.matches("^/?([\\w.-]+/)*[\\w.-]+$")) {
throw new RuntimeException("Invalid pythonPath: " + pythonPath);
}
// Optionally, you may check if the executable exists: new File(pythonPath).exists()
}

private void checkParamsValid(GremlinQuery query) {
Ex.check(!StringUtils.isEmpty(query.getContent()),
"common.param.cannot-be-null-or-empty",
Copilot is powered by AI and may make mistakes. Always verify output.
// 执行Python文件,并传入参数
List<String> lineList = new ArrayList<>();
try {
Process proc = Runtime.getRuntime().exec(args1);

Check failure

Code scanning / CodeQL

Uncontrolled command line Critical

This command line depends on a
user-provided value
.
This command line depends on a
user-provided value
.

Copilot Autofix

AI 3 months ago

The best way to fix this vulnerability is to only allow a fixed, hard-coded list of safe executables (Python interpreters) to be launched, instead of allowing the client to provide an arbitrary pythonPath. For instance, the server could maintain a whitelist such as just "python3", or /usr/bin/python3, and completely ignore/disallow the pythonPath input from the user, or only accept a whitelisted value/alias (for example, "python3" or "python3.10", mapping those to real paths). Similarly, it's important to check that other arguments do not permit command injection, but the main vector here is the user input for executable.

Specific steps:

  • In excutePythonRuntime(...) and associated API paths, do not use pythonPath from user input directly.
  • Instead, maintain a hard-coded list of allowed interpreter paths (could be one path, like "python3").
  • In the handler, either always use the default, or if you want to allow selection, only allow a field like pythonVersion: "python3"/"python3.10" and map that to hard-coded safe paths, not using raw user input.
  • You will likely need to update the getExcuteArgs method and where it's called, passing a safe path instead of a user value.

Regions to change:

  • The definition of excutePythonRuntime and calls to it in your controller, to remove user control of the executable path.
  • Optional: If there's a reason to allow multiple versions, accept only an enumerated, mapped value.
  • Remove or replace pythonPath wherever it comes from an untrusted source.

Suggested changeset 1
hugegraph-hubble/hubble-be/src/main/java/org/apache/hugegraph/controller/langchain/LangChainController.java

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/hugegraph-hubble/hubble-be/src/main/java/org/apache/hugegraph/controller/langchain/LangChainController.java b/hugegraph-hubble/hubble-be/src/main/java/org/apache/hugegraph/controller/langchain/LangChainController.java
--- a/hugegraph-hubble/hubble-be/src/main/java/org/apache/hugegraph/controller/langchain/LangChainController.java
+++ b/hugegraph-hubble/hubble-be/src/main/java/org/apache/hugegraph/controller/langchain/LangChainController.java
@@ -130,7 +130,9 @@
             throw new RuntimeException("fileName must not be null");
         }
 
-        List<String> result = this.excutePythonRuntime(requestLangChainParams.pythonPath,
+        // Use a whitelisted Python interpreter path (hardcoded for security)
+        String safePythonPath = "python3"; // Or use an absolute path like "/usr/bin/python3"
+        List<String> result = this.excutePythonRuntime(safePythonPath,
                 filePath, requestLangChainParams.query, requestLangChainParams.openKey, schema,
                 requestLangChainParams.model, requestLangChainParams.ernieClientId, requestLangChainParams.ernieClientSecret);
         if (CollectionUtils.isEmpty(result)) {
@@ -250,6 +252,9 @@
      * @param graphSchema
      * @return
      */
+    // Use only hard-coded python interpreter path for safety
+    private static final String PYTHON_INTERPRETER = "python3"; // Or absolute path if desired
+
     private List<String> excutePythonRuntime(String pythonPath,
                                              String pythonScriptPath,
                                              String query,
@@ -258,7 +263,8 @@
                                              String model,
                                              String ernieClientId,
                                              String ernieClientSecret) {
-        String[] args1 = this.getExcuteArgs(pythonPath, pythonScriptPath, query, openKey, graphSchema,
+        // pythonPath param is ignored for security; always use the hardcoded interpreter
+        String[] args1 = this.getExcuteArgs(PYTHON_INTERPRETER, pythonScriptPath, query, openKey, graphSchema,
                 model, ernieClientId, ernieClientSecret);
         log.info("lang chain execute python command:\n [{}] \n", String.join(" ", args1));
 
@@ -522,6 +528,7 @@
                                    String ernieClientId,
                                    String ernieClientSecret) {
         List<String> argsList = new ArrayList<>();
+        // Only use hardcoded allowed pythonPath for security
         argsList.add(pythonPath);
         argsList.add(pythonScriptPath);
         argsList.add("--query");
EOF
@@ -130,7 +130,9 @@
throw new RuntimeException("fileName must not be null");
}

List<String> result = this.excutePythonRuntime(requestLangChainParams.pythonPath,
// Use a whitelisted Python interpreter path (hardcoded for security)
String safePythonPath = "python3"; // Or use an absolute path like "/usr/bin/python3"
List<String> result = this.excutePythonRuntime(safePythonPath,
filePath, requestLangChainParams.query, requestLangChainParams.openKey, schema,
requestLangChainParams.model, requestLangChainParams.ernieClientId, requestLangChainParams.ernieClientSecret);
if (CollectionUtils.isEmpty(result)) {
@@ -250,6 +252,9 @@
* @param graphSchema
* @return
*/
// Use only hard-coded python interpreter path for safety
private static final String PYTHON_INTERPRETER = "python3"; // Or absolute path if desired

private List<String> excutePythonRuntime(String pythonPath,
String pythonScriptPath,
String query,
@@ -258,7 +263,8 @@
String model,
String ernieClientId,
String ernieClientSecret) {
String[] args1 = this.getExcuteArgs(pythonPath, pythonScriptPath, query, openKey, graphSchema,
// pythonPath param is ignored for security; always use the hardcoded interpreter
String[] args1 = this.getExcuteArgs(PYTHON_INTERPRETER, pythonScriptPath, query, openKey, graphSchema,
model, ernieClientId, ernieClientSecret);
log.info("lang chain execute python command:\n [{}] \n", String.join(" ", args1));

@@ -522,6 +528,7 @@
String ernieClientId,
String ernieClientSecret) {
List<String> argsList = new ArrayList<>();
// Only use hardcoded allowed pythonPath for security
argsList.add(pythonPath);
argsList.add(pythonScriptPath);
argsList.add("--query");
Copilot is powered by AI and may make mistakes. Always verify output.
response.setCharacterEncoding("UTF-8");
response.setContentType("text/html");
String fileName = String.format("%s_%s.schema", graphSpace, graph);
response.setHeader("Content-Disposition", "attachment;fileName=" + fileName);

Check warning

Code scanning / CodeQL

HTTP response splitting Medium

This header depends on a
user-provided value
, which may cause a response-splitting vulnerability.
This header depends on a
user-provided value
, which may cause a response-splitting vulnerability.

Copilot Autofix

AI 3 months ago

The best fix is to sanitize the user-provided graphSpace and graph variables before writing them into the Content-Disposition header. As recommended in the background section, this should be done by either escaping or filtering out CR (\r), LF (\n), and other special characters that could prematurely end the header or inject new headers. The simplest safe approach is to remove any characters except a limited safe subset (such as alphanumerics, dashes, underscores), or to replace dangerous characters with safe alternatives.

Implement a helper method (for example, sanitizeForHeader) that replaces or removes any characters that are not part of a safe set (such as [A-Za-z0-9-_]) from the input. Call this function on both graphSpace and graph before constructing fileName. Place the sanitization function in this class (since that's all the code provided). Apply the sanitized values to construct fileName. No new external dependencies are required.

All edits take place inside hugegraph-hubble/hubble-be/src/main/java/org/apache/hugegraph/controller/schema/SchemaController.java. Add the helper method for sanitization (above or near usage), and invoke it in the schemaGroovyExport() method on both graphSpace and graph arguments.

Suggested changeset 1
hugegraph-hubble/hubble-be/src/main/java/org/apache/hugegraph/controller/schema/SchemaController.java

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/hugegraph-hubble/hubble-be/src/main/java/org/apache/hugegraph/controller/schema/SchemaController.java b/hugegraph-hubble/hubble-be/src/main/java/org/apache/hugegraph/controller/schema/SchemaController.java
--- a/hugegraph-hubble/hubble-be/src/main/java/org/apache/hugegraph/controller/schema/SchemaController.java
+++ b/hugegraph-hubble/hubble-be/src/main/java/org/apache/hugegraph/controller/schema/SchemaController.java
@@ -72,6 +72,18 @@
         "/{graph}/schema")
 public class SchemaController extends BaseController {
 
+    /**
+     * Sanitizes a string for safe usage in HTTP headers (e.g. Content-Disposition),
+     * removing any CR, LF, or other disallowed/suspicious characters.
+     * Allows only ASCII letters, digits, dash, underscore, dot.
+     */
+    private static String sanitizeForHeader(String value) {
+        if (value == null) {
+            return "";
+        }
+        return value.replaceAll("[^A-Za-z0-9._-]", "");
+    }
+
     public static Logger log = Log.logger(SchemaController.class);
 
     @Autowired
@@ -123,7 +135,9 @@
 
         response.setCharacterEncoding("UTF-8");
         response.setContentType("text/html");
-        String fileName = String.format("%s_%s.schema", graphSpace, graph);
+        String sanitizedGraphSpace = sanitizeForHeader(graphSpace);
+        String sanitizedGraph = sanitizeForHeader(graph);
+        String fileName = String.format("%s_%s.schema", sanitizedGraphSpace, sanitizedGraph);
         response.setHeader("Content-Disposition", "attachment;fileName=" + fileName);
         try {
             OutputStream os = response.getOutputStream();
EOF
@@ -72,6 +72,18 @@
"/{graph}/schema")
public class SchemaController extends BaseController {

/**
* Sanitizes a string for safe usage in HTTP headers (e.g. Content-Disposition),
* removing any CR, LF, or other disallowed/suspicious characters.
* Allows only ASCII letters, digits, dash, underscore, dot.
*/
private static String sanitizeForHeader(String value) {
if (value == null) {
return "";
}
return value.replaceAll("[^A-Za-z0-9._-]", "");
}

public static Logger log = Log.logger(SchemaController.class);

@Autowired
@@ -123,7 +135,9 @@

response.setCharacterEncoding("UTF-8");
response.setContentType("text/html");
String fileName = String.format("%s_%s.schema", graphSpace, graph);
String sanitizedGraphSpace = sanitizeForHeader(graphSpace);
String sanitizedGraph = sanitizeForHeader(graph);
String fileName = String.format("%s_%s.schema", sanitizedGraphSpace, sanitizedGraph);
response.setHeader("Content-Disposition", "attachment;fileName=" + fileName);
try {
OutputStream os = response.getOutputStream();
Copilot is powered by AI and may make mistakes. Always verify output.
@github-actions
Copy link

Due to the lack of activity, the current pr is marked as stale and will be closed after 180 days, any update will remove the stale label

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants