yenfryherrerafeliz commented Nov 21, 2025

Description of changes:
These changes are based on #3079.
This introduces concurrent downloads and the ability to resume upload and download operations.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

This is an initial phase for the S3 Transfer Manager v2, which includes:
- Progress Tracker with a default Console Progress Bar.
- Dedicated Multipart Download Listener for listening to events specific to multipart downloads.
- Generic Transfer Listener that can be used in either a multipart upload or a multipart download. The progress tracker depends on the Generic Transfer Listener and, when enabled, is provided through the same parameter as the listener. This matters because if you need to listen to transfer-specific events and also track progress, you must write a custom implementation that combines both needs; otherwise only one of the two can be used.
- Single Object Download
- Multipart Object Download
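
The log below mentions a range GET variant of the multipart downloader. As a rough illustration of how an object can be split into byte ranges for such a download, here is a minimal Python sketch (not the SDK's PHP code). The function name `byte_ranges` and its parameters are hypothetical; the only fact relied on is that HTTP `Range` headers use inclusive end offsets.

```python
def byte_ranges(object_size: int, part_size: int) -> list:
    """Return HTTP Range header values that together cover object_size bytes.

    Hypothetical helper for illustration; not part of the SDK.
    """
    if object_size < 0 or part_size <= 0:
        raise ValueError("object_size must be >= 0 and part_size must be > 0")
    ranges = []
    start = 0
    while start < object_size:
        # Range ends are inclusive, so the last byte of a part is start + size - 1.
        end = min(start + part_size, object_size) - 1
        ranges.append(f"bytes={start}-{end}")
        start = end + 1
    return ranges
```

For a 10-byte object and a 4-byte part size this yields three ranges, the last one shorter than the others.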

This initial implementation does not yet include test cases.
- Refactor setting a single argument, even when it does not exist, in the console progress bar.
- Add a dedicated parameter for where the progress is rendered, defaulting to STDOUT.
- Add test cases for ConsoleProgressBar.
- Add test cases for DefaultProgressTracker.
- Add test cases for ObjectProgressTracker.
- Add test cases for TransferListener.
- Add test cases for multipart download listener.
- Add a trait to the MultipartDownloader implementation to keep the main implementation cleaner.
- Add test cases for the multipart downloader, specifically for the part GET and range GET multipart downloaders.
Refactor:
- Move opening braces onto a new line.
- Make requestArgs an optional argument.
- Remove unnecessary traits.
- Use traditional declarations.

Adds:
- Download directory feature.
Refactor:
- Add a message placeholder for progress status, for example in case of errors.

Adds:
- Upload feature, missing multipart functionality.
- Add upload directory feature
- Add a dedicated multipart upload implementation
- Add transfer progress to multipart upload
- Add upload directory with the required options.
- Create specific response models for upload, and upload directory.
- Add multipart upload test cases.
- Fix transfer listener completion evaluation.
Shorten namespace from `Aws\S3\Features\S3Transfer` to `Aws\S3\S3Transfer`.
- Implement progress tracker based on SEP spec.
- Add a default progress bar implementation.
- Add different progress tracker formats:
  - Plain progress format: `[|progress_bar|] |percent|%`
  - Transfer progress format: `[|progress_bar|] |percent|% |transferred|/|tobe_transferred| |unit|`
  - Colored progress format: `|object_name|:\n\033|color_code|[|progress_bar|] |percent|% |transferred|/|tobe_transferred| |unit| |message|\033[0m`
- Add a default single progress tracker implementation.
- Add a default multi progress tracker implementation for tracking directory transfers.
- Include unit tests just for the console progress bar.
- Fix current test cases for:
  - MultipartUploader
  - MultipartDownloader
  - ProgressTracker
- Remove progress bar color enum since the colors were moved into the specific format that requires them.
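
The progress formats listed above are templates whose `|name|` placeholders get replaced with values. A minimal Python sketch of that substitution follows (not the SDK's PHP code). The names `render_bar` and `render`, the bar width, and the fill characters are assumptions made for illustration; only the two template strings come from the list above.

```python
import re

# Templates copied from the format list above.
PLAIN_FORMAT = "[|progress_bar|] |percent|%"
TRANSFER_FORMAT = (
    "[|progress_bar|] |percent|% |transferred|/|tobe_transferred| |unit|"
)


def render_bar(percent: int, width: int = 20) -> str:
    """Build a fixed-width bar; width and characters are illustrative choices."""
    percent = max(0, min(100, percent))
    filled = width * percent // 100
    return "#" * filled + "-" * (width - filled)


def render(template: str, values: dict) -> str:
    """Replace every |name| placeholder; unknown names render as empty text."""
    def substitute(match):
        return str(values.get(match.group(1), ""))

    return re.sub(r"\|(\w+)\|", substitute, template)
```

Rendering unknown placeholders as empty text is one possible design choice; an implementation could equally reject templates with missing arguments.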
TransferListener must be tested from the implementations that extend and use this abstract class.
Add nullable type to listenerNotifier property in the MultipartUploader implementation.
- Tests for MultiProgressTracker
- Tests for SingleProgressTracker
- Tests for ProgressBarFormat
- Tests for TransferProgressSnapshot
- Tests for TransferListenerNotifier
- Refactor code to address some styling related feedback.
- Add upload and uploadDirectory unit tests.
- Fix MultipartUpload tests by increasing the part size from 1024 to 10240000 so that it falls within the allowed part size range of 5 MB to 5 GB.
- Rename `tobe` to `to_be` in the progress formatting.
- Add download tests
- Add download directory tests
- Minor naming refactor
- Add upload integ tests for:
 - Single uploads
 - Multipart uploads
 - Checksum in single uploads
 - Checksum in multipart uploads
- Add download integ tests for:
 - Single downloads
 - Multipart downloads
- Add integ tests for directory uploads
- Add integ tests for directory downloads
- Move some fixed values out of the methods into consts.
- Address a line that exceeded 80 characters.
- Declare keys used across different implementations as consts.
- Fix keys declaration in TransferListener.php
- Make use of DIRECTORY_SEPARATOR const instead of hardcoding `/`
- Some implementations using TransferListener were missing the import statement.
- Add test cases for when a file being downloaded `resolvesOutsideTargetDirectory`.
- Improve how the parts are created in multipart uploader so that it looks cleaner.
- Create single class tests for PartGetMultipartDownloader and RangeGetMultipartDownloader.
- Add tests for TransferListenerNotifier from MultipartUploader and MultipartDownloader implementations.
When a part download fails, we trigger downloadFailed so the failure is propagated to the listeners, and then we rethrow the exception. There is also a global exception handler, so that anything else that fails during a multipart download is caught and propagated to the listeners as well. Together, these caused downloadFailed to be called twice. To prevent this, we check whether the current snapshot already has an error message.
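
The guard described above can be sketched as follows. This is a simplified Python illustration, not the SDK's PHP code; the class names, the `on_failed` callback, and the `error_message` attribute are hypothetical stand-ins for the downloader, its listeners, and the progress snapshot.

```python
class Snapshot:
    """Hypothetical stand-in for the transfer progress snapshot."""

    def __init__(self):
        self.error_message = None


class Downloader:
    def __init__(self, fetch_part, on_failed):
        self.fetch_part = fetch_part  # callable that downloads one part
        self.on_failed = on_failed    # listener notified on failure
        self.snapshot = Snapshot()

    def _download_failed(self, error):
        # Guard: if the snapshot already carries an error, listeners were notified.
        if self.snapshot.error_message is not None:
            return
        self.snapshot.error_message = str(error)
        self.on_failed(error)

    def _download_part(self, part_number):
        try:
            self.fetch_part(part_number)
        except Exception as error:
            self._download_failed(error)  # part-level notification
            raise                         # rethrow to the caller

    def download(self, part_numbers):
        try:
            for part_number in part_numbers:
                self._download_part(part_number)
        except Exception as error:
            self._download_failed(error)  # no-op if the part already reported
            raise
```

With this guard, a failing part reaches both handlers but the listener is notified only once.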
- In MultipartUploader, when a part upload fails, the exception should be thrown, but it was not.
Read `request_checksum_calculation` from the command arguments; this value takes precedence over the one from the client config. This is useful when checksums need to be disabled on a per-operation basis.
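
The precedence described above amounts to a simple lookup order. A minimal Python sketch (not the SDK's PHP code): the function name, the dictionary shapes, and the `when_supported` fallback are assumptions for illustration.

```python
def resolve_checksum_calculation(command_args: dict, client_config: dict,
                                 default: str = "when_supported") -> str:
    """Pick the per-operation value first, then the client value, then a default.

    The default value here is an assumption made for this sketch.
    """
    key = "request_checksum_calculation"
    if key in command_args:
        return command_args[key]  # per-operation override wins
    return client_config.get(key, default)
```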
- Fix abort multipart upload called more than once.
- Add test coverage for multipart uploads with custom checksums.
- Add test coverage for multipart upload abortion, to make sure it is called just once when an upload process fails.
- Add test coverage to make sure the different multipart operations are called.
- Remove full object checksum calculation since it is not recommended.
- Address some styling issues.
- Split some statements across multiple lines.
- Add comments describing functionality.
- Fix exceptions so that they are returned, not thrown, from upload and download directory.
- Returns how many objects were transferred and failed from upload or download directory operations when there are failures.
- Handles circular folder traversal when following symbolic links.
- Add max_concurrency config for upload and download directory APIs.
- Remove s3_delimiter config parameter from download directory API.
- Add modeled test cases runner for upload and download directory operations.
- Avoid allocating memory for each part to be uploaded; instead, each part is read from the file through a new file handle positioned at that part's offset.
- Improves config validation in S3TransferManager
- Renamed MultipartDownloader to AbstractMultipartDownloader
- Add integ tests for multipart upload along with custom checksum algorithm
- Add integ tests for multipart upload along with custom checksum
- Propagate the request arguments for the part request command instead of the general request args.
- Remove the custom assertion method for cases where messages are not in the same order, and instead modify the fixtures so they match what is thrown.
- Move checksum validation off of AbstractMultipartUploader.
  - Make createMultipartOperation and completeMultipartOperation abstract methods, since MultipartUploader requires some checksum validations that are not necessarily useful in the abstract class. Also, there is no need to leave a default implementation that would only be used by the MultipartCopy implementation. The abstract class should be used just for shared code.
- Move abstract method declarations to the top.
- Enhance FileDownloadHandler to support writing at the destination with arbitrary positioning based on part number.
- Enhance AbstractMultipartDownloader to do concurrent downloads.
- Add persisting resuming states for multipart download
- Add resuming a multipart download from a resumable file
- Make final all components for S3Transfer
- Add a bool return type for bytesTransfer in the transfer listener to indicate whether that specific method was handled correctly by the listeners. The download handler needs this to confirm that the part was written to the destination.
- Make the FileDownloadHandler implementation validate a checksum, when available, when writing parts to disk. This ensures integrity when writing.
- Move filter checksums function to a more centralized implementation so it can be consumed by other implementations.
- Add resuming uploads capabilities
- Create a shared/base class for resumable state holders called ResumableTransfer. This class is extended by ResumableUpload and ResumableDownload.
- Added checksum integrity check for resume files, so that we are sure the resume state being loaded was the one persisted.
- Adjust tests to work with changes done for resuming.
- Add unit tests for resuming uploads
- Add unit tests for resuming downloads
- Add unit tests for FileDownloadHandler
- Add integration tests for resuming uploads
- Add integration tests for resuming downloads
- Fix integration testing
  - Full object checksums only support the CRC family of algorithms.
  - When objects are uploaded using a full object checksum and we need to validate that checksum, we can use getObjectAttributes to get the checksum information. Passing ChecksumMode ENABLED in a getObject operation does not let us validate it.
- Make implementation final.
- Add Abstract prefix to abstract classes.
- Integration testing for resuming operations
- Add a rewind condition when writing the part to the stream in the StreamDownloadHandler.
- Add a rewind condition in the download handlers in case other processes consumed the stream previous to writing to its final destination.
- Suppress warning when running integ tests for aborting multipart uploads
- Added return type as bool for bytesTransfer in AbstractTransferListener. This is needed for when we integrate the resume uploads/downloads feature.
- Remove white spaces in AbstractMultipartDownloader constructor
- Add return statement to listeners used in tests for byteTransferred event.
- Avoid using a default region when using the default S3 Client in S3 Transfer Manager. If no region is provided and the default client is used, an exception is thrown.
Pass the default region if provided in the config, otherwise default it to null.
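
The list above mentions concurrent downloads that write each part to its destination at a position derived from the part number. A minimal Python sketch of that idea (not the SDK's PHP code) follows; the function names, the 1-based part numbering, and the fixed part size are assumptions for illustration.

```python
def preallocate(path: str, size: int) -> None:
    """Create the destination file with its final size before parts arrive."""
    with open(path, "wb") as handle:
        handle.truncate(size)


def write_part(path: str, part_number: int, part_size: int, data: bytes) -> None:
    """Write one part at its own offset so parts may finish in any order.

    Assumes parts are numbered from 1 and all but the last have part_size bytes.
    """
    offset = (part_number - 1) * part_size
    with open(path, "r+b") as handle:
        handle.seek(offset)
        handle.write(data)
```

Because every part has a known offset, completion order does not matter, which is what makes writing concurrently downloaded parts safe.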
After combining the main TM branch with this one we got some issues related to renaming classes, etc.
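
The integrity check for resume files mentioned earlier in this log can be illustrated with a checksum stored next to the serialized state. This is a minimal Python sketch (not the SDK's PHP code); the JSON envelope, the SHA-256 algorithm, and the function names are assumptions, since the log only says that a checksum check was added.

```python
import hashlib
import json


def save_resume_state(path: str, state: dict) -> None:
    """Persist the state together with a checksum of its serialized form."""
    payload = json.dumps(state, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    with open(path, "w", encoding="utf-8") as handle:
        json.dump({"checksum": digest, "state": payload}, handle)


def load_resume_state(path: str) -> dict:
    """Load the state, refusing files whose checksum does not match."""
    with open(path, encoding="utf-8") as handle:
        envelope = json.load(handle)
    payload = envelope["state"]
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    if digest != envelope["checksum"]:
        raise ValueError("resume file failed its integrity check")
    return json.loads(payload)
```

A checksum of this kind detects accidental corruption or truncation of the resume file; it is not a defense against deliberate tampering, since the checksum is stored alongside the data.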
yenfryherrerafeliz changed the title from "S3 transfer manager with concurrent download" to "S3 transfer manager with concurrent download and resuming features" Nov 21, 2025
yenfryherrerafeliz changed the title from "S3 transfer manager with concurrent download and resuming features" to "S3 TM with concurrent downloads and resuming feature" Nov 21, 2025