Skip to content

Conversation

rinu
Copy link
Contributor

@rinu rinu commented Aug 15, 2023

Fixed lots of issues with formatting non-trivial SQL queries. This is NOT backwards compatible, there are many breaking changes. This is more in line with what looks good to me and what prettier is doing.

Highlights include:

  • generalized blocks with indented contents, no longer hard-coded just for parenthesis
  • for all sorts of JOINs the ON part and other conditions are on a new line and indented
  • ON DUPLICATE KEY UPDATE is one token
  • CASE WHEN and the like are blocks with indented contents

@derrabus derrabus changed the base branch from 1.1.x to 1.2.x August 16, 2023 07:18
@derrabus
Copy link
Member

Thank you. This is a feature, so we should target the 1.2.x branch. 1.1.x is for bugfixes.

@derrabus derrabus added the enhancement New feature or request label Aug 16, 2023
@greg0ire
Copy link
Member

Thank you. This is a feature, so we should target the 1.2.x branch. 1.1.x is for bugfixes.

There was a mention of breaking changes in the description. Can you please detail what they are and why they should be considered breaking? Also, please read the contributing guide, especially the part about commit messages.

@derrabus
Copy link
Member

There was a mention of breaking changes in the description.

The public interface of the Token class was changed: methods have been added and removed from it.

@rinu
Copy link
Contributor Author

rinu commented Aug 16, 2023

Conflicts with 1.2.x are now resolved. The main breaking change is the formatted output, especially for all types of JOINs. In my app I rely on the exact output of format.

@greg0ire
Copy link
Member

In my app I rely on the exact output of format.

Can you elaborate on this?

@rinu
Copy link
Contributor Author

rinu commented Aug 16, 2023

I use this formatter to lint SQL in my project. My linter will look at every string in the project and if it looks like SQL it will apply this formatting. There are probably thousands of these so a consistent style is very useful.

@greg0ire
Copy link
Member

I had no idea this project was used that way… in that particular case, any change does seem like a breaking change, but I'm not sure this means that's the case for most consumers of this project.

@rinu
Copy link
Contributor Author

rinu commented Aug 17, 2023

I think this is ready now.

This logic of blocks could be used in the future to give better context for both formatting and highligting. For example in the case of VALUES that is both a function and a keyword, you could one day have an INSERT INTO block and VALUES would only be teated as a keyword if it's in the INSERT INTO block but as a function in the ON DUPLICATE KEY UPDATE block. Then there are lots of reserved words that should not be treated as such in places where field or table names are expected.

@derrabus derrabus changed the base branch from 1.2.x to 1.5.x September 11, 2024 07:59
}

public function withValue(string $value): self
public function isBlockStart(): ?Condition
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think this is fully possible this way. There are ambiguous tokens (and group of tokens) as we never know the full grammar nor parse the full grammar.

With this said, we cannot tell if a token represents a "block start/end" or not by just looking around. In order to implement this and fix #118 (and prevent related issues), I think we need to parse the tokens into some intermediate tree data structure in "try consume" fashion first.

With this approach the tree of parsed blocks will always be correct (otherwise they will be kept unparsed/as tokens) and the formatter can then be hugely simplified.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants