Conversation

@msrathore-db (Contributor) commented Oct 3, 2025

What type of PR is this?

  • Refactor
  • Feature
  • Bug Fix
  • Other

Description

Added a workflow to parallelise the E2E tests. Updated the E2E tests to create new table names for each run, to avoid conflicts when the tests run in parallel.

How is this tested?

  • Unit tests
  • E2E Tests
  • Manually
  • N/A

Related Tickets & Documents

github-actions bot commented Oct 3, 2025

Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase (git rebase -i main).


@msrathore-db marked this pull request as ready for review October 7, 2025 07:14
Comment on lines +97 to +105
          key: venv-${{ runner.os }}-${{ steps.setup-python.outputs.python-version }}-${{ github.event.repository.name }}-${{ hashFiles('**/poetry.lock') }}

      - name: Install dependencies
        if: steps.cached-poetry-dependencies.outputs.cache-hit != 'true'
        run: poetry install --no-interaction --no-root

      - name: Install library
        run: poetry install --no-interaction --all-extras


Why is the Poetry setup being done multiple times? Each of these jobs basically repeats the same setup steps.

Comment on lines +170 to +189

      - name: Set up python
        id: setup-python
        uses: actions/setup-python@v5
        with:
          python-version: "3.10"

      - name: Install Poetry
        uses: snok/install-poetry@v1
        with:
          virtualenvs-create: true
          virtualenvs-in-project: true
          installer-parallel: true

      - name: Load cached venv
        id: cached-poetry-dependencies
        uses: actions/cache@v4
        with:
          path: .venv
          key: venv-${{ runner.os }}-${{ steps.setup-python.outputs.python-version }}-${{ github.event.repository.name }}-${{ hashFiles('**/poetry.lock') }}

Again we are creating multiple environments, which is just too much overhead.
It would be simpler to have a first job, setup, that installs Poetry and builds the venv once, then find all the test files and execute them in matrix fashion on top of that shared setup (one possible shape is sketched below).
I don't think it is efficient to recreate the entire venv setup for each test.
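
For illustration, a minimal sketch of such a shared setup job, reusing the actions already present in this workflow. The workflow name, trigger, job names, runner label, needs: wiring, and trimmed cache key are assumptions for the sketch, not code from this PR:

name: e2e-tests          # hypothetical workflow name
on: pull_request         # assumed trigger
jobs:
  setup:
    runs-on: ubuntu-latest   # assumed runner
    steps:
      - uses: actions/checkout@v4
      - name: Set up python
        id: setup-python
        uses: actions/setup-python@v5
        with:
          python-version: "3.10"
      - name: Install Poetry
        uses: snok/install-poetry@v1
        with:
          virtualenvs-create: true
          virtualenvs-in-project: true
      - name: Load cached venv
        id: cached-poetry-dependencies
        uses: actions/cache@v4
        with:
          path: .venv
          key: venv-${{ runner.os }}-${{ steps.setup-python.outputs.python-version }}-${{ hashFiles('**/poetry.lock') }}
      # Build the venv only when the cache missed; downstream jobs restore the same key.
      - name: Install dependencies
        if: steps.cached-poetry-dependencies.outputs.cache-hit != 'true'
        run: poetry install --no-interaction --all-extras

  e2e-tests:
    needs: setup
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # ...repeat the setup-python / install-poetry / cache-restore steps here; they now
      # hit the warm cache, and the tests run against the restored .venv.

The point of the sketch is that the expensive poetry install runs at most once per lock-file change; every later job only pays for a cache restore.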

    if: ${{ needs.discover-tests.outputs.common-test-files != '[]' }}
    strategy:
      matrix:
        test_file: ${{ fromJson(needs.discover-tests.outputs.common-test-files) }}

Why are you running every single test file in parallel like this? It is just too much overhead: with this matrix you are basically spinning up a VM per test file, which is just wrong.

@jprakash-db (Contributor)

The current setup has a lot of boilerplate and is just not efficient.
Problems with the current setup:

  • You are spinning up a new VM for every single test, which is far too much overhead
  • There is too much code duplication; in each test job the bulk of the steps is just installing the library

My suggestion is to have jobs that are focused:

  • The first job just installs Poetry and stores the venv in a cache
  • A second job finds all the tests and segregates them into unit tests, E2E tests, etc.
  • Then start executing the tests; since step 1 has already run, you get a cache hit and can install straight from the cache
  • Execute bulk operations in parallel, e.g. unit tests on one VM and E2E tests on another; don't create a VM per test
  • If you want faster execution, explore something like pytest-xdist, which runs tests in parallel on a single VM (see the sketch after this list)
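
A rough sketch of that layout, building on the setup job sketched earlier. The suite names, the tests/unit and tests/e2e directories, and the assumption that pytest-xdist is declared as a dev dependency in pyproject.toml are all hypothetical:

  run-tests:
    needs: setup
    runs-on: ubuntu-latest
    strategy:
      matrix:
        suite: [unit, e2e]   # one runner per suite, not one per test file
    steps:
      - uses: actions/checkout@v4
      - name: Set up python
        id: setup-python
        uses: actions/setup-python@v5
        with:
          python-version: "3.10"
      - name: Install Poetry
        uses: snok/install-poetry@v1
        with:
          virtualenvs-create: true
          virtualenvs-in-project: true
      - name: Load cached venv
        id: cached-poetry-dependencies
        uses: actions/cache@v4
        with:
          path: .venv
          key: venv-${{ runner.os }}-${{ steps.setup-python.outputs.python-version }}-${{ hashFiles('**/poetry.lock') }}
      - name: Install dependencies (cache-miss fallback)
        if: steps.cached-poetry-dependencies.outputs.cache-hit != 'true'
        run: poetry install --no-interaction --all-extras
      # pytest-xdist (-n auto) fans the suite out across the CPU cores of this single runner
      - name: Run ${{ matrix.suite }} tests
        run: poetry run pytest tests/${{ matrix.suite }} -n auto

This keeps the runner count at one per suite while still getting parallelism inside each runner from pytest-xdist.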

@jprakash-db (Contributor) commented Oct 7, 2025

I can see the tests still took 1 hr to run, so what exactly is the optimisation?

Code Coverage / coverage (pull_request): Successful in 57m
