Move files from cloudberry-devops-release to main repo #2

leborchuk · 2025-08-02T12:04:06Z

Fixes #ISSUE_Number

What does this PR do?

Type of Change

Bug fix (non-breaking change)
New feature (non-breaking change)
Breaking change (fix or feature with breaking changes)
Documentation update

Breaking Changes

Test Plan

Unit tests added/updated
Integration tests added/updated
Passed make installcheck
Passed make -C src/test installcheck-cbdb-parallel

Impact

Performance:

User-facing changes:

Dependencies:

Checklist

Followed contribution guide
Added/updated documentation
Reviewed code for security implications
Requested review from cloudberry committers

Additional Context

CI Skip Instructions

apache#1337) * feat: use ColumnEncoding_Kind_DIRECT_DELTA as default in offset stream

Optimize performance of variable-length column offsets by switching from Zstd to delta encoding. This approach better compresses incremental integer sequences, cutting disk space by more than half while maintaining performance.

The following is a comparison of file sizes for different encoding methods on TPC-DS 20G:

| Name | PAX(ZSTD) | AOCS_SIZE | PAX(Delta) | PAX(Delta) / AOCS_SIZE × 100% |
|---|---|---|---|---|
| call_center | 12 kB | 231 kB | 10185 bytes | 4.31% |
| catalog_page | 499 kB | 653 kB | 393 kB | 60.18% |
| catalog_returns | 240 MB | 171 MB | 178 MB | 104.09% |
| catalog_sales | 3033 MB | 1837 MB | 1977 MB | 107.63% |
| customer | 16 MB | 12 MB | 12 MB | 100.00% |
| customer_address | 7008 kB | 3161 kB | 3115 kB | 98.54% |
| customer_demographics | 28 MB | 8164 kB | 9292 kB | 113.82% |
| date_dim | 3193 kB | 1406 kB | 1249 kB | 88.85% |
| household_demographics | 42 kB | 248 kB | 28 kB | 11.29% |
| income_band | 1239 bytes | 225 kB | 1239 bytes | 0.54% |
| inventory | 36 MB | 71 MB | 36 MB | 50.70% |
| item | 3084 kB | 2479 kB | 2227 kB | 89.84% |
| promotion | 27 kB | 239 kB | 18 kB | 7.53% |
| reason | 2730 bytes | 226 kB | 2280 bytes | 0.99% |
| ship_mode | 3894 bytes | 227 kB | 3315 bytes | 1.43% |
| store | 23 kB | 239 kB | 18 kB | 7.53% |
| store_returns | 400 MB | 265 MB | 277 MB | 104.53% |
| store_sales | 4173 MB | 2384 MB | 2554 MB | 107.12% |
| time_dim | 1702 kB | 819 kB | 627 kB | 76.56% |
| warehouse | 5394 bytes | 227 kB | 4698 bytes | 2.02% |
| web_page | 21 kB | 236 kB | 14 kB | 5.93% |
| web_returns | 116 MB | 83 MB | 85 MB | 102.41% |
| web_sales | 1513 MB | 908 MB | 982 MB | 108.15% |
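To illustrate why delta encoding suits offset streams, here is a minimal Python sketch. It is not the PAX on-disk format or the actual ColumnEncoding_Kind_DIRECT_DELTA implementation; the function names `delta_encode` and `delta_decode` are hypothetical. It only shows the general idea: offsets of variable-length values grow monotonically, so storing the gaps between neighbours yields small integers that pack tightly.

```python
from typing import List


def delta_encode(offsets: List[int]) -> List[int]:
    """Keep the first offset, then store each offset as the gap from its predecessor."""
    if not offsets:
        return []
    deltas = [offsets[0]]
    for prev, cur in zip(offsets, offsets[1:]):
        deltas.append(cur - prev)
    return deltas


def delta_decode(deltas: List[int]) -> List[int]:
    """Rebuild the original offsets with a running sum over the gaps."""
    offsets = []
    running = 0
    for gap in deltas:
        running += gap
        offsets.append(running)
    return offsets


# Offsets of six variable-length values (illustrative data, not from the PR).
example_offsets = [0, 5, 12, 12, 20, 31]
example_deltas = delta_encode(example_offsets)
```

The gaps are bounded by the length of a single value, so they usually need far fewer bits than the absolute offsets.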

* PAX: Support LZ4 compression for table columns

PAX only supported zlib and zstd compression for column values. This commit adds lz4 support for PAX table columns.

* map compress level to acceleration for lz4
* restrict acceleration to range [0, 3]
* add macro control

Remove the USE_ORCA ifdef around OptimizerOptions. The struct is required regardless of ORCA support, and the conditional caused compilation failures when configured with --disable-orca.

If QueryFinishPending is set while a query is running in dumptuples, the tuplecontext is reset but the memtuples are not consumed. When the query enters dumptuples again, tuplesort_sort_memtuples accesses these memtuples, whose memory in tuplecontext has already been freed, which causes an invalid memory access. To avoid this, do nothing in dumptuples if QueryFinishPending is set.

We used to lack a clear naming guideline for the existing 'pg_%' system views and their MPP versions. For example, we renamed PG's pg_stat_all_tables and pg_stat_all_indexes to carry an '_internal' suffix, and used their original names to collect aggregated results from all segments (commit e6f9303). However, with the previous commit, all existing PG system views now keep their original names, while corresponding 'gp_%' views provide the non-aggregated results from all segments and 'gp_%_summary' views provide the aggregated results from all segments.

Therefore, we now revert pg_stat_all_tables and pg_stat_all_indexes to their original definitions, which collect stats from a single segment only. Then, we add them to system_views_gp.in to produce gp_stat_all_tables and gp_stat_all_indexes, which collect non-aggregated results from all segments. Finally, we rename the aggregate versions of those views to gp_stat_all_tables_summary and gp_stat_all_indexes_summary.

Because the views pg_stat_user_tables and pg_stat_user_indexes use the above summary views, we have to add _summary views for these two views as well. We will add _summary views for other system views later.

Modify the regress tests accordingly.

Added the following views:

* gp_stat_progress_vacuum_summary
* gp_stat_progress_analyze_summary
* gp_stat_progress_cluster_summary
* gp_stat_progress_create_index_summary

Also replaced pg_stat_progress_* views with gp_stat_progress_* views for existing tests.

These summary views offer basic aggregation of the gp_stat_* views across the Greenplum coordinator and segments. The aggregation logic is applied as follows:

* Time related (last_%): use max()
* Transaction related, not innately summable (number of commits/rollbacks): use max()
* Table specific: sum()/numsegments for replicated tables, sum() for distributed tables
* Innately summable stats, if no particular table is involved: use sum()
* pid: use the coordinator's pid (not used here, but this is the convention in other gp_%_summary views)
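The time-related and table-specific rules above can be sketched in Python as follows. This is only an illustration of the arithmetic, not the SQL used by the real views; the function name `summarize_table_stats`, the row layout, and the column names in the example are assumptions.

```python
from typing import Dict, List, Optional


def summarize_table_stats(
    per_segment: List[Dict[str, Optional[int]]],
    replicated: bool,
    numsegments: int,
) -> Dict[str, Optional[float]]:
    """Collapse one table's per-segment rows into a single summary row."""
    summary: Dict[str, Optional[float]] = {}
    for column in per_segment[0]:
        values = [row[column] for row in per_segment]
        if column.startswith("last_"):
            # Time related: the most recent event on any segment wins.
            seen = [v for v in values if v is not None]
            summary[column] = max(seen) if seen else None
        else:
            # Table specific: a replicated table holds a full copy on every
            # segment, so the sum is divided by the number of segments.
            total = sum(values)
            summary[column] = total / numsegments if replicated else total
    return summary


# Three segments reporting on the same table (illustrative values).
rows = [
    {"last_vacuum": 100, "n_tup_ins": 10},
    {"last_vacuum": None, "n_tup_ins": 10},
    {"last_vacuum": 250, "n_tup_ins": 10},
]
```

Calling `summarize_table_stats(rows, replicated=True, numsegments=3)` reports the inserts once rather than three times, while the same rows for a distributed table are simply added up.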

This commit replaces the -d test with the -e test when checking for the presence of a Git repository. The -e test succeeds for any existing path rather than only for a directory, so the check is more comprehensive and also covers cases where the working directory is part of a Git repository but is not the entire repository.
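A small Python sketch of the difference, assuming the check is made against a `.git` entry (the commit message does not say which path is tested). In linked worktrees and submodules Git stores `.git` as a regular file containing a `gitdir:` pointer, so a directory-only test misses them. The helper names and the `gitdir` path below are hypothetical.

```python
import os
import tempfile
from typing import Tuple


def has_git_directory(path: str) -> bool:
    """Python counterpart of the shell test [ -d "$path/.git" ]."""
    return os.path.isdir(os.path.join(path, ".git"))


def has_git_entry(path: str) -> bool:
    """Python counterpart of the shell test [ -e "$path/.git" ]."""
    return os.path.exists(os.path.join(path, ".git"))


def demo() -> Tuple[bool, bool]:
    """Simulate a worktree-style checkout where .git is a file, not a directory."""
    with tempfile.TemporaryDirectory() as checkout:
        with open(os.path.join(checkout, ".git"), "w") as handle:
            handle.write("gitdir: /hypothetical/main/.git/worktrees/demo\n")
        return has_git_directory(checkout), has_git_entry(checkout)
```

Here `demo()` returns `(False, True)`: the directory test rejects the checkout while the existence test accepts it.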

Fix issue: apache#1240

A Replicated locus could be combined through EXCEPT with a Partitioned locus when there is a writable CTE on replicated tables. We could make them SingleQE or Entry to do the set operation.

    with result as (update r_1240 set a = a +1 where a < 5 returning *)
    select * from result except select * from p1_1240;
                              QUERY PLAN
    ---------------------------------------------------------------
     HashSetOp Except
       ->  Append
             ->  Explicit Gather Motion 3:1  (slice1; segments: 3)
                   ->  Subquery Scan on "*SELECT* 1"
                         ->  Update on r_1240
                               ->  Seq Scan on r_1240
                                     Filter: (a < 5)
             ->  Gather Motion 3:1  (slice2; segments: 3)
                   ->  Subquery Scan on "*SELECT* 2"
                         ->  Seq Scan on p1_1240
     Optimizer: Postgres-based planner
    (11 rows)

Authored-by: Zhang Mingli avamingli@gmail.com

…t cleanup" This reverts commit 65cd966.

Make the resource group io limit tests reproducible. If we retain the objects created during testing, we must clear them before re-running the tests locally, which is inconvenient for developers.

* add function to clear io.max

This PR has several improvements for io limit:

1. Add a function to clear io.max. This function should be used when altering io_limit.
2. Check tablespaces in io_limit when dropping tablespaces. If the tablespace to be dropped is present in the io_limit of any resource group, the DROP TABLESPACE statement is aborted.
3. During InitResGroup and AlterResourceGroup, if parseio raises an error, the error is demoted to a WARNING, so the cluster can still start when some tablespace has been removed.

Fix resource group io limit flaky case. The flakiness was caused by running mkdir on multiple segments on the same host. Catching FileExistsError and ignoring it is sufficient, because the mkdir function only needs the directory to exist.
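The pattern described above can be sketched as follows. The helper name `ensure_dir` is hypothetical and the test code in the repository may differ; the point is only that a concurrent creator winning the race is not an error.

```python
import os


def ensure_dir(path: str) -> None:
    """Create path, tolerating another process having created it first."""
    try:
        os.mkdir(path)
    except FileExistsError:
        # Another segment on the same host won the race; the directory
        # exists, which is all the caller needs.
        pass
```

When intermediate directories may also be missing, `os.makedirs(path, exist_ok=True)` from the standard library gives the same tolerance in one call.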

When io_limit hit a syntax error, the previous log was just "Error: Syntax error". Now io_limit produces a comprehensive message for syntax errors:

```
demo=# create resource group rg1 WITH (cpu_max_percent=10, io_limit='pg_defaultrbps=100, wbps=550,riops=1000,wiops=1000');
ERROR: io limit: syntax error, unexpected '=', expecting ':'
HINT: pg_defaultrbps=100, wbps=550,riops=1000,wiops=1000
                    ^
```

```
demo=# create resource group rg1 WITH (cpu_max_percent=10, io_limit='pg_default: rbps=100wbps=550,riops=1000, wiops=1000');
ERROR: io limit: syntax error, unexpected IO_KEY, expecting end of file or ';'
HINT: pg_default:rbps=100wbps=550,riops=1000,wiops=1000
                         ^
```
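As a rough illustration of how such a message can be produced, the sketch below parses an io_limit-like string and reports the offending token with a caret line. The grammar (`tablespace:key=value,...;tablespace:...`) is inferred from the two examples above and is an assumption; the real parser is a generated grammar in the server, and every name here (`parse_io_limit`, `IoLimitSyntaxError`, and so on) is hypothetical.

```python
import re
from typing import Dict, List, Tuple

IO_KEYS = ("rbps", "wbps", "riops", "wiops")  # key names taken from the examples
_TOKEN = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*|\d+|[:=,;])")
_PUNCT = (":", "=", ",", ";")


class IoLimitSyntaxError(ValueError):
    """Message includes the input and a caret under the offending token."""


def _report(text: str, pos: int, what: str) -> str:
    return f"io limit: syntax error, {what}\n{text}\n{' ' * pos}^"


def _tokenize(text: str) -> List[Tuple[str, int]]:
    tokens, pos = [], 0
    while text[pos:].strip():
        match = _TOKEN.match(text, pos)
        if match is None:
            raise IoLimitSyntaxError(_report(text, pos, "unexpected character"))
        tokens.append((match.group(1), match.start(1)))
        pos = match.end()
    tokens.append(("", len(text)))  # end-of-input marker
    return tokens


def _matches(tok: str, expected: str) -> bool:
    if expected == "name":
        return bool(tok) and (tok[0].isalpha() or tok[0] == "_")
    if expected == "key":
        return tok in IO_KEYS
    if expected == "number":
        return tok.isdigit()
    return tok == expected  # punctuation or the end-of-input marker


def parse_io_limit(text: str) -> Dict[str, Dict[str, int]]:
    tokens = _tokenize(text)
    index = 0

    def take(expected: str) -> str:
        nonlocal index
        tok, pos = tokens[index]
        if not _matches(tok, expected):
            shown = repr(tok) if tok else "end of input"
            wanted = repr(expected) if expected in _PUNCT else (expected or "end of input")
            raise IoLimitSyntaxError(
                _report(text, pos, f"unexpected {shown}, expecting {wanted}"))
        index += 1
        return tok

    result: Dict[str, Dict[str, int]] = {}
    while True:
        tablespace = take("name")
        take(":")
        limits: Dict[str, int] = {}
        while True:
            key = take("key")
            take("=")
            limits[key] = int(take("number"))
            if tokens[index][0] != ",":
                break
            index += 1
        result[tablespace] = limits
        if tokens[index][0] != ";":
            break
        index += 1
    take("")  # anything left over is a syntax error
    return result
```

Recording each token's start offset during tokenization is what makes the caret line cheap to produce: the error path only needs to pad with that many spaces.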

io limit: fix double free. In 'alterResgroupCallback', the io_limit pointers of 'caps' and 'oldCaps' may point to the same location, so a double free is possible. 'oldCaps' is filled in by 'GetResGroupCapabilities' and then assigned to 'caps' via: caps = oldCaps. To resolve this problem, the code frees oldCaps.io_limit and sets it to NIL when the io_limit has not been altered. So, if the io_limit has not been altered, caps.io_limit = oldCaps.io_limit = NIL. If the io_limit has been altered, caps.io_limit != oldCaps.io_limit.

Add one more hierarchy level for resource groups when using cgroup v2. The current leaf node in the gpdb cgroup hierarchy is /sys/fs/cgroup/gpdb/<oid>, which is fine for the gpdb workflow but inconvenient for extensions that want to use the gpdb cgroup hierarchy. Extensions like plcontainer want to create a sub-cgroup under /sys/fs/cgroup/gpdb/<oid> as a new leaf node, which is not possible in the current hierarchy because of the "no internal processes" constraint of cgroup v2. This commit introduces a new hierarchy to accommodate such extensions, and the modification is tiny: move processes from /sys/fs/cgroup/gpdb/<oid>/cgroup.procs to /sys/fs/cgroup/gpdb/<oid>/queries/cgroup.procs, and keep the limits in /sys/fs/cgroup/gpdb/<oid>. With this modification, extensions that want to use the gpdb cgroup hierarchy can create a sub-cgroup under /sys/fs/cgroup/gpdb/<oid>. For example, plcontainer will create a cgroup /sys/fs/cgroup/gpdb/<oid>/docker-12345 and put processes into it.
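The resulting layout can be summarized with a few path helpers. This is a sketch of the directory structure described above, assuming the default /sys/fs/cgroup/gpdb parent; the function names and the example oid are hypothetical, and no cgroup files are touched.

```python
from pathlib import PurePosixPath

GPDB_CGROUP = PurePosixPath("/sys/fs/cgroup/gpdb")


def group_dir(oid: int) -> PurePosixPath:
    """Per-resource-group directory; limits stay here and no process lives here."""
    return GPDB_CGROUP / str(oid)


def query_procs_file(oid: int) -> PurePosixPath:
    """Leaf cgroup that now holds the query processes."""
    return group_dir(oid) / "queries" / "cgroup.procs"


def extension_dir(oid: int, name: str) -> PurePosixPath:
    """Sibling leaf that an extension such as plcontainer may create."""
    return group_dir(oid) / name
```

Because processes only ever sit in leaves such as `queries` or `docker-12345`, the group directory itself stays process-free, which is what the cgroup v2 "no internal processes" rule requires once controllers are enabled for children.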

Delete the cgroup leaf dir only when using group-v2. There is no leaf directory in the gpdb cgroup when cgroup v1 is used, so rmdir(leaf_path) always returns a non-zero value and the subsequent rmdir(path) is skipped. As a result, when a resource group is dropped, the corresponding cgroup dir is not removed because rmdir(path) is never executed, and this causes CI failures. This commit adds a check of the resource group version in deleteDir: when group-v1 is in use, rmdir(leaf_path) is skipped.

Add guc: gp_resource_group_cgroup_parent (only for cgroup v2). Currently gpdb does not support changing the root cgroup path of resource groups, although in some situations this would be useful. For example, on an OS with systemd, a user may want to create a cgroup delegated to gpdb via systemd, but the delegated cgroup name must end with .service, typically /sys/fs/cgroup/gpdb.service. On an OS without systemd, a user may want to use /sys/fs/cgroup/gpdb or another location directly. Adding gp_resource_group_cgroup_parent therefore makes resource groups more flexible.

Fix no response when altering the io_limit of a resource group to '-1'. Previously, ALTER RESOURCE GROUP xxx SET IO_LIMIT '-1' took no action. Now it clears the content of io.max and updates the pg_resgroupcapability relation.

…eter

This commit fixes issues introduced in "Add guc: gp_resource_group_cgroup_parent (#16738)" where the gp_resource_group_cgroup_parent GUC parameter was added but the gpcheckresgroupv2impl script still used hardcoded "gpdb" paths.

Changes:

- Implement get_cgroup_parent() method to dynamically retrieve the gp_resource_group_cgroup_parent value from database
- Replace all hardcoded "gpdb" paths with dynamic cgroup parent value
- Improve error handling in cgroup.c with more descriptive error messages
- Fix test configuration order: set gp_resource_group_cgroup_parent before enabling gp_resource_manager=group-v2 to avoid validation failures

This ensures the cgroup validation script works correctly with custom cgroup parent directories configured via the GUC parameter, making the resource group feature more flexible for different deployment scenarios.

When a seq scan begins, check whether the scanflags of the table am are set to determine whether the runtime filter is pushed down. When the runtime filter is pushed down to the pax am, the pax am converts the min/max scankey in the runtime filter into a PFTNode and performs min/max filtering.

…in repo

The changes here include the original commits:

```
git log --pretty=format:"%H%x09%an%x09%ad%x09%s"
5c1a2ada9a93ab5f930aebd0018a7369fdf61930  Dianjin Wang  Wed Jun 25 18:28:43 2025 +0800  Update gcc/g++ settings for PAX on RockyLinux 8
133a81303555dedc07c36ec16aa686367c47c774  Leonid Borchuk  Wed Jul 16 15:58:52 2025 +0000  Rename greenplum_path to cloudberry-env
eef2516b90bb7b9de2af95cc2a9df5b125533794  Dianjin Wang  Sat Jun 7 08:51:09 2025 +0800  Update `dorny/paths-filter` version tag to commit
e06dd830250ce89184b13a18aa6663ffcb56db4b  Ed Espino  Sun Jun 1 12:27:17 2025 -0700  Initial commit: Apache Cloudberry (Incubating) release script
384202893e571ce06a2224a116019c2ca9a3dce5  Dianjin Wang  Wed Apr 30 16:09:51 2025 +0800  Add the PAX support in the configure
5081920c07096e9e2cf217ad1b7489ae8963a86a  Ed Espino  Tue Apr 1 03:31:45 2025 -0700  Add protobuf-devel to Dockerfiles for Rocky 8 and 9 builds (apache#14)
7a6549cedb84edad516b89ebbd73529b773aaf03  Jianghua Yang  Thu Feb 27 22:31:07 2025 +0800  Add debuginfo package.
5965faabbf15965778ec09bca02960dbc1900a6a  Ed Espino  Thu Feb 13 01:28:47 2025 -0800  Add new Cloudberry dependency (apache#11)
d50af03d7c04341ce86047aa098bfd8e6a914804  Ed Espino  Sun Dec 15 23:27:47 2024 -0800  Add script to analyze core dumps with gdb (#10)
54bbb3d10ad642f98cf97418978319d6d030070c  Ed Espino  Sun Dec 15 21:39:20 2024 -0800  Adding packages (gdb and file) used in core file analysis. (apache#9)
9638c9e3c983e4d9ae7327517c3f88b0f8335614  Ed Espino  Mon Dec 9 18:43:57 2024 -0800  Container - Multi arch support for Rocky 8 & 9 (apache#8)
5249d69825ec34c2363aa3de3391ddc790786ffe  Ed Espino  Wed Nov 27 02:09:46 2024 -0800  Enhance test result parsing for ignored tests (apache#7)
f6fb4296c392b3cfaef0da45da282ca667af1025  Ed Espino  Tue Nov 19 18:02:14 2024 -0800  fix: remove -e from set options in parse-test-results.sh (apache#6)
2c8302c2c2beb2682185cc9c54503342b0bb0351  Ed Espino  Thu Nov 7 22:11:17 2024 -0800  build: add Apache Cloudberry build automation and test framework
6b8f8938196d55902eecf3506a9142811f97d633  Ed Espino  Thu Nov 14 02:42:02 2024 -0800  Update to Apache Cloudberry (incubating) rpm name and add disclaimer
1a5852903579b11ec0bd12d3083cf4299250eb96  Ed Espino  Thu Nov 14 10:44:10 2024 -0800  Update repository names for pushing to official Docker Hub repository
7f82c004c4a106832a2483e509ba152082561838  Ed Espino  Thu Nov 14 00:13:43 2024 -0800  Add initial Dockerfiles, configs, and GitHub workflows for Cloudberry
72e5f06a7cdf6e807b255717ffe5681a122da474  Dianjin Wang  Tue Nov 5 14:21:11 2024 +0800  Add asf-yaml and basic community files
5acab8e2c785c8036c92562017d06f666b42cfd5  Ed Espino  Wed Sep 4 04:16:35 2024 -0700  Fix release names and paramerize pgvector version.
4ec026dea578f397512c3d8686082317338d6d2c  Ed Espino  Tue Sep 3 01:50:34 2024 -0700  Using Cloudberry pgvector 0.5.1
ed928320f640731efd33e6157a315a4d3100efb5  Ed Espino  Mon Sep 2 23:50:39 2024 -0700  Change ownership of symlink.
e8b22b0bffdb44a65528cbc862cfe8d6425302af  Ed Espino  Sun Sep 1 23:52:57 2024 -0700  Update elf scripts
7f295146bc3412894ecbb0027c10754e736fb7ed  Ed Espino  Fri Aug 30 16:23:38 2024 -0700  Change default installation directory.
1a571a8bdeac6eb1ccf4e7d81d7d89da7ae6b0e8  Ed Espino  Fri Aug 30 16:10:44 2024 -0700  Change default installation directory.
8ed243cff42501843f86a8b70925a6b1e34c6681  Ed Espino  Fri Aug 30 03:52:16 2024 -0700  Updates
ab43f527c56b197f5c223672e92fb1d384f40858  Ed Espino  Fri Aug 30 03:36:35 2024 -0700  Add hll and pgvector extensions
a133b6ea1fe5b0cb8c7440218873023be990ddfa  Ed Espino  Wed Aug 28 22:57:20 2024 -0700  Remove Changelog
371e50dafd31282da78ac661aa7f9890f083b3b9  Ed Espino  Wed Aug 28 22:56:26 2024 -0700  Add Group to spec file.
54be4f19ecc15ee477040e169cd7e66246c3034e  Ed Espino  Wed Aug 28 12:43:09 2024 -0700  Fix changing ownership to gpadmin.
357936d3efd9de69b913b736172270781ac0b6f1  Ed Espino  Wed Aug 28 12:02:15 2024 -0700  Adjustment for shipping GO apps (e.g. gpbackup).
4f2cde6f1c97913152498a6cbb54fcc0d9785958  Ed Espino  Mon Aug 26 23:29:03 2024 -0700  Update spec files
44b1425441c482d260f087d0702aef66f379e840  Ed Espino  Mon Aug 26 22:57:05 2024 -0700  Update docs
57faae9ab7600efc3a4bfe0dd7a2b4eb2d82229f  Ed Espino  Mon Aug 26 22:51:41 2024 -0700  minor enhancements
44123c9a7a133b4625ddaa2a06fb0be0db77f6a2  Ed Espino  Thu Aug 22 02:15:00 2024 -0700  Script update.
60a5884b33ef848aa6861bb9537f28b84ce6597a  Ed Espino  Wed Aug 21 22:28:07 2024 -0700  Clean up repo rpm making it noarch and making the repo entry dynamic
e40c74a91ee4dcac727622fdfbef84e41e61969e  Ed Espino  Wed Aug 21 13:09:22 2024 -0700  Add repo tool
16b3dfe3ead0834bd83165b5559f3388f4e55e6e  Ed Espino  Wed Aug 21 00:45:53 2024 -0700  Fix description in repo RPM.
feb096c5d8887dd6b29d382be9b37844169df1ca  Ed Espino  Wed Aug 21 00:42:28 2024 -0700  Fix relocation RPM feature by createing own prefix variable.
fd5017365759572248fd07c39af67f83f3987295  Ed Espino  Wed Aug 21 00:10:05 2024 -0700  In spec file, set version and release variables via script (build-rpm.sh).
b9668e1b86cc631db50ae603cc63cc0ffa35a400  Ed Espino  Tue Aug 20 11:44:05 2024 -0700  EL SPEC file consolidation.
904f2981e8d1217b57bc5490d31e4463399d4551  Ed Espino  Sun Aug 18 00:45:32 2024 -0700  Rename spec file and add additional runtime dependencies.
89f65a6a0902bbe0fd8294c5ad0767419e53e608  Ed Espino  Sat Aug 17 01:04:02 2024 -0700  Add RPM GPG KEY
91ea01f7f0b7ce9e5553d0861ec640b3392bacd2  Ed Espino  Fri Aug 16 22:56:54 2024 -0700  Update Spec file.
c362bdab21ef9e33e885ee839ad9cf8b009122bd  Ed Espino  Fri Aug 16 19:00:41 2024 -0700  Create repo RPM
9b02885090fb9c09dd03202616303afe9c7f93f5  Ed Espino  Fri Aug 16 11:14:36 2024 -0700  feat: Add ELF dependency analyzer script
f3a569e9620c1d643046e92d1387f104a3dbf8cd  Ed Espino  Fri Aug 16 10:30:55 2024 -0700  Initial EL9 spec file
```

leborchuk force-pushed the AddDeployScripts branch from 4a9d188 to b6a5886 Compare August 6, 2025 19:55

leborchuk force-pushed the AddDeployScripts branch 2 times, most recently from 535d5da to f637ea2 Compare August 18, 2025 08:50

leborchuk force-pushed the AddDeployScripts branch 2 times, most recently from 331d799 to 208f386 Compare August 29, 2025 10:23

gongxun0928 and others added 5 commits September 3, 2025 11:29

Fix compilation with --disable-orca

19720e8

Remove the USE_ORCA ifdef around OptimizerOptions. The struct is required regardless of ORCA support, and the conditional caused compilation failures when configured with --disable-orca.

Fix: if tableam implement relation_acquire_sample_rows, then just use it

4a2dd03

tuhaihe force-pushed the AddDeployScripts branch from 208f386 to ec3d63d Compare September 8, 2025 02:20

huansong and others added 19 commits September 8, 2025 15:20

Fix: Adapt system view after cherry-pick

05869c8

gpMgmt: skip downloading existing Python dependency tarballs

25d6490

Revert "Fix double free issue in alterResgroupCallback during io_limi…

1d010b4

…t cleanup" This reverts commit 65cd966.

let resource group io limit testing can be reproduced (#16107)

e5c8ebb

Make the resource group io limit tests reproducible. If we retain the objects created during testing, we must clear them before re-running the tests locally, which is inconvenient for developers.

Fix resource group io limit flaky case (#16386)

0735df4

Fix resource group io limit flaky case. The flakiness was caused by running mkdir on multiple segments on the same host. Catching FileExistsError and ignoring it is sufficient, because the mkdir function only needs the directory to exist.

Leonid Borchuk added 2 commits September 11, 2025 07:38

Reorganize files in main repo and add licence info to all files

753e4ee

tuhaihe force-pushed the AddDeployScripts branch from ec3d63d to 753e4ee Compare September 10, 2025 23:38

14 participants
