[SVCS-479] Raise exception for copy/move replace folder that orphans itself.#274
[SVCS-479] Raise exception for copy/move replace folder that orphans itself.#274TomBaxter wants to merge 5 commits intoCenterForOpenScience:developfrom TomBaxter:feature/svcs-479
Conversation
1 similar comment
waterbutler/core/provider.py
Outdated
| await self.delete(src_path) | ||
| return data, created | ||
|
|
||
| async def replace_will_orphan(self, |
There was a problem hiding this comment.
This function makes no awaits so it shouldn't need to be async
|
|
||
| This can be an expensive operation. Should be used as last resort. Should be overridden if | ||
| provider has a cheaper option. | ||
| """ |
There was a problem hiding this comment.
Since this can be expensive, is there a possible check you can do here that will quickly tell you if you need to go down the whole chain, ie if
src_path.name != dest_path.name:
return False
Unsure if this works or if this check is already done.
There was a problem hiding this comment.
There is separate logic/exception, from a different ticket for checking for self overwrite. Perhaps they should be combined.
| src_mock._children_metadata.assert_not_called() | ||
| src_mock.validate_v1_path.assert_not_called() | ||
|
|
||
| @pytest.mark.asyncio |
There was a problem hiding this comment.
I know I already asked but putting this here for thought anyway.
Any possibility of moving this to core/test_provider? (feels more complete that way)
If it is too large of a change, go ahead and ignore this and let Longze or Fitz decide.
There was a problem hiding this comment.
Spoke with Fitz. For now going with this for brevity.
|
|
||
| class OrphanSelfError(InvalidParameters): | ||
| def __init__(self, path): | ||
| super().__init__('Unable to move or copy \'{}\'. Moving or copying a folder onto parent ' |
There was a problem hiding this comment.
Nitpick: The error message is a bit hard for a user to understand. Not a deal breaker though.
There was a problem hiding this comment.
Agreed it is a lot to parse, but not sure how to phrase it in a more clear way that still provides all needed information.
There was a problem hiding this comment.
I tried to rephrase this but failed. 👍
|
Forgot to add message to review.... Local testing went fine (just with osfstorage however) |
| if not dest_path.kind == 'folder': | ||
| return False | ||
| if src_path.name != dest_path.name: | ||
| return False |
There was a problem hiding this comment.
This line isn't hit by test coverage
waterbutler/core/provider.py
Outdated
| conflict == 'replace' and | ||
| dest_provider.replace_will_orphan(dest_path, src_path) | ||
| ): | ||
| raise exceptions.OrphanSelfError(src_path) |
There was a problem hiding this comment.
this line isn't hit by test coverage
cslzchen
left a comment
There was a problem hiding this comment.
Looks good and works as expected locally! Please add more tests and add/update the doc str for new methods. Thanks! 🔥 🔥
| 'folder onto itself is not supported.'.format(path)) | ||
|
|
||
|
|
||
| class OrphanSelfError(InvalidParameters): |
There was a problem hiding this comment.
Can you add a doc str for this exception?
| src_path: wb_path.WaterButlerPath) -> bool: | ||
| """Check if copy/move conflict=replace will orphan src_path. | ||
|
|
||
| This can be an expensive operation. Should be used as last resort. Should be overridden if |
|
|
||
| @pytest.mark.asyncio | ||
| @pytest.mark.aiohttpretty | ||
| async def test_intra_copy_folder_orphan(self, provider_and_mock, provider_and_mock2, |
There was a problem hiding this comment.
As mentioned by @AddisonSchiller , this test only covers one situation when it is an invalid copy. Can you add tests to cover the following for both copy and move? So in total 4 * 2 = 8.
- invalid one
- return false if is not folder
- return false if name is not the same
- loop breaks if materialized path is not the same
There was a problem hiding this comment.
Moved tests to core/provider and added move test for invalid.
Added single tests for 'not folder', 'not same name', and 'loop break' since they aren't running through copy/move.
Completed.
|
|
||
| class OrphanSelfError(InvalidParameters): | ||
| def __init__(self, path): | ||
| super().__init__('Unable to move or copy \'{}\'. Moving or copying a folder onto parent ' |
There was a problem hiding this comment.
I tried to rephrase this but failed. 👍
cslzchen
left a comment
There was a problem hiding this comment.
LGTM and move to PCR 🎆 🎆 !
|
There seems to be a lot of overlap between this and #270 so they may end up getting merged into one super-PR. Leaving both open for now for reference until a final PR is merged. |
Fixing an issue where when a child component is hooked up To a project, and they share the same provider, it is possible to silently delete files
[#SVCS-479] Replace folder from inside folder deletes both Add replace_will_orphan(src_path,dest_path) Checks if dest_path overwrite will orphan src_path Call replace_will_orphan in copy/move defs in waterbutler.core.provider
SVCS-479 Move three pieces of overwrite conflict logic from move/copy into handle_naming and rename handle_naming to handle_conflict. Improved logic and reduced code.
|
|
||
| if ( | ||
| self == src_provider and | ||
| conflict == 'replace' and |
There was a problem hiding this comment.
Boolean operators should be at the start of new lines
| if not dest_path.kind == 'folder': | ||
| return False | ||
| if src_path.name != dest_path.name: | ||
| return False |
There was a problem hiding this comment.
I think it would be more readable x != y rather than not x == y; It'll save a few bytes, (I know we're pressed for disk), but we should probably be a little more consistent about this.
There was a problem hiding this comment.
And we have a 4th one, which is used here ... Instead of doing str comparison, we should use one of the other three which is of type bool.
@property
def kind(self) -> str:
""" Returns `folder` if the path represents a folder, otherwise returns `file`. """
return 'folder' if self._is_folder else 'file'There was a problem hiding this comment.
That looks like its a private method because it's prefixed with an underscore. Should this be private, or can consuming code rely on self._is_folder?
There was a problem hiding this comment.
Let's ignore this for now. It's everywhere in the code base, not only for WB but also for OSF and MFR ...
| dest_path = WaterButlerPath('/folder1/', | ||
| _ids=['root','folder'], | ||
| folder=True) | ||
| assert provider1.replace_will_orphan(src_path, dest_path) == False |
There was a problem hiding this comment.
Should there be a test that covers the case the replace will orphan; i.e assert provider1.replace_will_orphan(src_path, dest_path) == True?
| self.shares_storage_root(src_provider) and | ||
| src_path.materialized_path == dest_path.materialized_path | ||
| ): | ||
| raise exceptions.OverwriteSelfError(src_path) |
There was a problem hiding this comment.
It seems that OverwriteSelf is a degenerate case of OrphanSelf, why do we need to check twice and/or can these checks be combined?
There was a problem hiding this comment.
src_path has an is_dir attribute. Does dest_path too? If there isn't there anything other than path or dir that it can be it should be possible for line 405 to just be if dest_path.is_dir:
There was a problem hiding this comment.
- Both
dest_pathandsrc_pathare instances ofWaterButlerPath, which hasis_dir,is_folderandis_file. I don't know why we have three when we only need one. Let's choose one and be consistent at least within each provider. For new providers, I useis_folder.
@property
def is_dir(self) -> bool:
""" Returns `True` if the path represents a folder. """
return self._is_folder
@property
def is_folder(self) -> bool:
return self._is_folder
@property
def is_file(self) -> bool:
""" Returns `True` if the path represents a file. """
return not self._is_folderThere was a problem hiding this comment.
-
For the two exception, my understanding is that they handle different cases.
OverwriteSelfErrorhandles the case when a move/copy'sdest_pathis exactly the same as thesrc_path.OrphanSelfErrorhandles the case whensrc_pathanddest_pathare different butdest_pathhappens to be one of the parent folders of thesrc_pathup in the tree.
-
My suggestion is that we keep the two separate. If necessary, please update the docstr for both class and maybe a few comments on what each checks does if necessary.
| raise exceptions.OverwriteSelfError(src_path) | ||
|
|
||
| # files and folders shouldn't overwrite themselves | ||
| # TODO: how is this different from above will_self_overwrite call? |
There was a problem hiding this comment.
Remove the todo? the former deals specifically with the case the destination is a dir, and the original will be kept?
|
|
||
| if handle_conflicts: | ||
|
|
||
| dest_path = await dest_provider.handle_conflicts( |
| dest_path = await dest_provider.handle_naming( | ||
| if handle_conflicts: | ||
|
|
||
| dest_path = await dest_provider.handle_conflicts( |
There was a problem hiding this comment.
@birdbrained Thanks for the initial review. If you are taking this PR over, here are a few things:
-
Improve the code style, control/condition statement and consistency the way you prefer. I am always 👍 x 💯 for it.
-
For
OverwriteSelfErrorandOrphanSelfError, I suggest keep them separate based on my understanding. However, feel free to change it if you have a good reason that I am not aware of.
| self.shares_storage_root(src_provider) and | ||
| src_path.materialized_path == dest_path.materialized_path | ||
| ): | ||
| raise exceptions.OverwriteSelfError(src_path) |
There was a problem hiding this comment.
- Both
dest_pathandsrc_pathare instances ofWaterButlerPath, which hasis_dir,is_folderandis_file. I don't know why we have three when we only need one. Let's choose one and be consistent at least within each provider. For new providers, I useis_folder.
@property
def is_dir(self) -> bool:
""" Returns `True` if the path represents a folder. """
return self._is_folder
@property
def is_folder(self) -> bool:
return self._is_folder
@property
def is_file(self) -> bool:
""" Returns `True` if the path represents a file. """
return not self._is_folder| if not dest_path.kind == 'folder': | ||
| return False | ||
| if src_path.name != dest_path.name: | ||
| return False |
There was a problem hiding this comment.
And we have a 4th one, which is used here ... Instead of doing str comparison, we should use one of the other three which is of type bool.
@property
def kind(self) -> str:
""" Returns `folder` if the path represents a folder, otherwise returns `file`. """
return 'folder' if self._is_folder else 'file'| self.shares_storage_root(src_provider) and | ||
| src_path.materialized_path == dest_path.materialized_path | ||
| ): | ||
| raise exceptions.OverwriteSelfError(src_path) |
There was a problem hiding this comment.
-
For the two exception, my understanding is that they handle different cases.
OverwriteSelfErrorhandles the case when a move/copy'sdest_pathis exactly the same as thesrc_path.OrphanSelfErrorhandles the case whensrc_pathanddest_pathare different butdest_pathhappens to be one of the parent folders of thesrc_pathup in the tree.
-
My suggestion is that we keep the two separate. If necessary, please update the docstr for both class and maybe a few comments on what each checks does if necessary.
| raise exceptions.OverwriteSelfError(src_path) | ||
|
|
||
| # files and folders shouldn't overwrite themselves | ||
| # TODO: how is this different from above will_self_overwrite call? |
|
New PR #333 |
|
This PR is replaced and closed. |

Purpose:
SVCS-479
Replace folder from inside folder deletes both.
Changes:
Updated waterbutler/core/provider.py
-- Add def replace_will_orphan
-- Update def move calls replace_will_orphan
-- Update def copy calls replace_will_orphan
Update waterbutler/core/exceptions.py
-- Add class OrphanSelfError
Update tests/providers/osfstorage/test_provider.py
-- Add test test_intra_copy_folder_orphan
Side effects
Folder move/copies that use conflict=replace may now be more costly