90 changes: 16 additions & 74 deletions src/Import.php
@@ -50,87 +50,29 @@ public function run( $sql_file_path, $args ) {
* @throws Exception
*/
protected function execute_statements( $import_file ) {
foreach ( $this->parse_statements( $import_file ) as $statement ) {
$result = $this->driver->query( $statement );
if ( false === $result ) {
WP_CLI::warning( 'Could not execute statement: ' . $statement );
$raw_queries = file_get_contents( $import_file );
$queries_text = $this->remove_comments( $raw_queries );
$parser = $this->driver->create_parser( $queries_text );
while ( $parser->next_query() ) {
$ast = $parser->get_query_ast();
$statement = substr( $queries_text, $ast->get_start(), $ast->get_length() );
try {
$this->driver->query( $statement );
} catch ( Exception $e ) {
WP_CLI::error( 'SQLite import could not execute statement: ' . $statement );
echo $this->driver->get_error_message();
}
}
}

/**
* Parse SQL statements from an SQL dump file.
* @param string $sql_file_path The path to the SQL dump file.
* Remove comments from the input.
*
* @param string $input
*
* @return Generator A generator that yields SQL statements.
* @return string
*/
public function parse_statements( $sql_file_path ) {

$handle = fopen( $sql_file_path, 'r' );

if ( ! $handle ) {
WP_CLI::error( "Unable to open file: $sql_file_path" );
}

$single_quotes = 0;
$double_quotes = 0;
$in_comment = false;
$buffer = '';

// phpcs:ignore
while ( ( $line = fgets( $handle ) ) !== false ) {
$line = trim( $line );

// Skip empty lines and comments
if ( empty( $line ) || strpos( $line, '--' ) === 0 || strpos( $line, '#' ) === 0 ) {
continue;
}

// Handle multi-line comments
if ( ! $in_comment && strpos( $line, '/*' ) === 0 ) {
$in_comment = true;
}
if ( $in_comment ) {
if ( strpos( $line, '*/' ) !== false ) {
$in_comment = false;
}
continue;
}

$strlen = strlen( $line );
for ( $i = 0; $i < $strlen; $i++ ) {
$ch = $line[ $i ];

// Handle escaped characters
if ( $i > 0 && '\\' === $line[ $i - 1 ] ) {
$buffer .= $ch;
continue;
}

// Handle quotes
if ( "'" === $ch && 0 === $double_quotes ) {
$single_quotes = 1 - $single_quotes;
}
if ( '"' === $ch && 0 === $single_quotes ) {
$double_quotes = 1 - $double_quotes;
}

// Process statement end
if ( ';' === $ch && 0 === $single_quotes && 0 === $double_quotes ) {
yield trim( $buffer );
$buffer = '';
} else {
$buffer .= $ch;
}
}
}

// Handle any remaining buffer content
if ( ! empty( $buffer ) ) {
yield trim( $buffer );
}

fclose( $handle );
protected function remove_comments( $text ) {
return preg_replace( '/\/\*.*?\*\/(;)?/s', '', $text );
Member Author

I had to remove all the comments with a regex because the AST parser was identifying them as queries to execute, which caused the import to fail. For example: Error: SQLite import could not execute statement: SET @saved_cs_client = @@character_set_client */;

Contributor

@sejas Ahh, this is because the /*!<number> ...*/ comments are special MySQL comments that execute their contents conditionally, based on the MySQL version. So /*!40101 SET character_set_client = @saved_cs_client */; means "execute this on all versions >= 4.1.1". The fact that the query inside the comment gets executed is therefore correct.
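To make the version encoding concrete, here is a minimal sketch that splits one such comment into its minimum version and the statement it wraps. The helper name parse_versioned_comment() is hypothetical and not part of this PR or of the SQLite driver; it assumes the common five-digit form written by mysqldump, where 40101 reads as major 4, minor 01, patch 01.

```php
<?php
// Hypothetical helper (an assumption for illustration, not code from this PR).
// Splits a MySQL version-conditional comment into its minimum version and
// the wrapped statement. Returns null for anything that is not such a comment.
function parse_versioned_comment( $comment ) {
    $pattern = '/^\/\*!(\d{5})\s*(.*?)\s*\*\/\s*;?$/s';
    if ( ! preg_match( $pattern, trim( $comment ), $matches ) ) {
        return null;
    }

    // The five digits encode the version as MMmmpp: 40101 is 4.1.1.
    $number = (int) $matches[1];

    return array(
        'version'   => sprintf(
            '%d.%d.%d',
            intdiv( $number, 10000 ),
            intdiv( $number, 100 ) % 100,
            $number % 100
        ),
        'statement' => $matches[2],
    );
}

$parsed = parse_versioned_comment( '/*!40101 SET character_set_client = @saved_cs_client */;' );
echo $parsed['version'], "\n";   // 4.1.1
echo $parsed['statement'], "\n"; // SET character_set_client = @saved_cs_client
```

This only handles a single, already isolated statement; it is not meant as a way to scan a whole dump.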

But somehow, the execution fails... so let's keep a quick fix. Regexes are tricky because they can match any random string in the dump, etc. What about catching the error instead, and if it starts with SQLite import could not execute statement: SET @, then we would skip it? I would then check the root cause of the failure.
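A small sketch of why the regex is risky. It applies the pattern from remove_comments() in this diff to a made-up two-line dump (the table contents are an assumption for illustration). The pattern cannot tell a real comment from the same characters inside a string literal, so it silently edits row data as well.

```php
<?php
// The pattern used by remove_comments() in this diff.
$pattern = '/\/\*.*?\*\/(;)?/s';

// Made-up dump: one versioned comment, one row whose content contains "/* ... */".
$dump = "/*!40101 SET NAMES utf8 */;\n"
    . "INSERT INTO wp_posts (post_content) VALUES ('body { color: red; } /* theme css */ p { margin: 0; }');\n";

$stripped = preg_replace( $pattern, '', $dump );

// The versioned comment is gone, but so is the CSS comment stored in post_content.
echo $stripped;
```

Catching the failure of one specific statement avoids touching anything else in the dump.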

Contributor

One thing I just noticed is that the statement doesn't include the leading /* for some reason 🤔 I tried just $this->assertQuery( '/*!40101 SET character_set_client = @saved_cs_client */;' ); in a test, and it passes. Anyway, we can have a hotfix for now.

Member Author

Yeah, I noticed that too. The leading /* exists in the file, but it is missing from the statement extracted via the AST parser.

Member Author

I removed the regex and used a basic check instead: ca2f090#diff-aea1542aa0e46981e70c6bfb53a15a583242580d74dfb5df25dbfe71d098757bR159-R162
It checks that the failed query starts with SET and contains */.
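The referenced commit is not shown here, so the following is only a sketch of what such a check could look like, based on the description above. The function name is_misparsed_versioned_comment() is hypothetical, and the case-sensitive match on SET is an assumption.

```php
<?php
// Hypothetical helper (not the code from the referenced commit).
// Returns true when a failed statement looks like the tail of a versioned
// comment that lost its leading "/*!NNNNN": it starts with SET and still
// carries the closing "*/".
function is_misparsed_versioned_comment( $statement ) {
    $statement = ltrim( $statement );

    return 0 === strpos( $statement, 'SET' )
        && false !== strpos( $statement, '*/' );
}

$failed_statements = array(
    'SET @saved_cs_client = @@character_set_client */;',
    "INSERT INTO wp_options VALUES (1, 'siteurl')",
);

foreach ( $failed_statements as $statement ) {
    // Only the first statement is treated as a known, skippable failure.
    echo is_misparsed_versioned_comment( $statement ) ? "skip\n" : "report\n";
}
```

Requiring both conditions keeps the check narrow: an ordinary SET statement that fails for another reason would still be reported.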

Contributor

Nice solution! This way it will target exactly these "wrongly parsed" comments.

}
}