PHP Extensions: FrankenPHP vs Traditional C

This repository demonstrates building PHP extensions using two different approaches:

  • FrankenPHP (Go-based) - Modern extension development using Go
  • Traditional C - Classic PHP extension development

This project was created for a talk comparing these two approaches to PHP extension development.

Project Structure

.
├── go-version/     # FrankenPHP extension built with Go
├── c-version/      # Traditional PHP extension built with C
└── compose.yml     # Docker Compose configuration for both versions

What's Included

Both versions implement the same two functions to demonstrate equivalent capabilities. The Go version additionally provides a class.

Functions

  1. repeat_this(string $str, int $count, bool $reverse): string

    • Repeats a string $count times
    • Optionally reverses the result (UTF-8 aware)
    • Example: repeat_this("Hi", 3, false) → "HiHiHi"
    • Example: repeat_this("Hi", 3, true) → "iHiHiH"
  2. matrix_multiply(array $matrix1, array $matrix2): array

    • Multiplies two matrices
    • C version uses CBLAS for optimized performance
    • Go version uses custom matrix multiplication with cache-optimized transposition

Classes

DirectoryScanner (Go version only)

  • setPath(string $path): bool - Set the directory to scan
  • scan(): array - Recursively scan directory and return all file paths
  • scanConcurrently(): array - Same as scan(), but traverses subdirectories concurrently using goroutines

Getting Started

Prerequisites

  • Docker
  • Docker Compose

Building the Extensions

Build the service you want to work with:

# Build the C extension
docker compose build cext

# Build the Go/FrankenPHP extension
docker compose build goext

Running the Extensions

C Version

# Enter the container
docker compose run --rm -it cext bash

# Inside the container, run the test file
php test.php

The C extension is pre-compiled and loaded via php.ini.

Go/FrankenPHP Version

# Enter the container
docker compose run --rm -it goext bash

# Inside the container, run test files using frankenphp
frankenphp php-cli php-tests/repeat.php
frankenphp php-cli php-tests/matrix_multiply.php
frankenphp php-cli php-tests/directory_scan.php

Implementation Details

Go Version (FrankenPHP)

Located in go-version/sdphp/stringext.go

Key Features

Special Comment Directives: FrankenPHP uses special comments to export Go code to PHP:

// export_php:function repeat_this(string $str, int $count, bool $reverse): string
func repeat_this(s *C.zend_string, count int64, reverse bool) unsafe.Pointer {
    // Implementation
}

// export_php:class DirectoryScanner
type DirectoryScanner struct {
    Path string
}

// export_php:method DirectoryScanner::setPath(string $path): bool
func (ds *DirectoryScanner) SetPath(path *C.zend_string) bool {
    // Implementation
}

CGo Integration: The Go code uses CGo to interact with PHP's C API:

/*
#include <Zend/zend_types.h>
#include <Zend/zend_hash.h>
#include <stdlib.h>
#include "stringext.h"
#include "helper.h"

HashTable* create_matrix_array(long rows, long cols, long* data);
HashTable* create_string_array(char** strings, long count);
*/
import "C"

Type Conversions: FrankenPHP provides helper functions to convert between Go and PHP types:

  • frankenphp.GoString() - Convert PHP string to Go string
  • frankenphp.PHPString() - Convert Go string to PHP string
  • frankenphp.GoPackedArray() - Convert PHP array to Go slice

Build Process & Monkey Patching

The build process (build-ext.sh) includes several workarounds for FrankenPHP bugs (as of v1.9.1):

  1. C Preamble Restoration: The code generator strips the C preamble comment, which breaks extensions that reference C functions. The script saves the original preamble and re-inserts it:

# Save our preamble (everything up to and including the import "C" line)
sed -n '1,/^import "C"$/p' "$EXT_PATH/stringext.go" > /tmp/preamble.txt

# Remove the generated preamble and prepend ours
sed '1,/^import "C"$/d' "$EXT_PATH/build/stringext.go" | cat /tmp/preamble.txt - > /tmp/stringext_fixed.go

  2. Type Mismatch Fix: FrankenPHP incorrectly generates an (int)path cast in the SetPath method wrapper, where the argument should be passed through as a zend_string*:

# Drop the incorrect (int) cast so the zend_string* is passed unchanged
sed -i 's/(int)path/path/g' "$EXT_PATH/build/stringext.c"

  3. Missing Import: The generated code relies on runtime/cgo but doesn't import it:

# Automatically add missing imports using goimports
go install golang.org/x/tools/cmd/goimports@latest
goimports -w "$EXT_PATH/build/stringext.go"

  4. Module Setup: Copy the module and helper files into the build directory, then run go mod tidy so all dependencies are resolved:

cp "$EXT_PATH/go.mod" "$EXT_PATH/go.sum" "$EXT_PATH/helper.h" "$EXT_PATH/helper.c" "$EXT_PATH/build/"
cd "$EXT_PATH/build" && go mod tidy

These workarounds are needed because FrankenPHP's extension support is still experimental. Future versions may resolve these issues.

C Version (Traditional)

Located in c-version/sdphp/sdphp.c

Key Features

Function Registration: Functions are registered via stub files and arginfo:

  1. Define the function signature in sdphp.stub.php:

function repeat_this(string $str, int $count, bool $reverse): string {}

  2. Generate arginfo using PHP's gen_stub.php (creates sdphp_arginfo.h)

  3. Implement the function using the PHP_FUNCTION macro:

PHP_FUNCTION(repeat_this)
{
    zend_string *str;
    zend_long count;
    zend_bool reverse;

    ZEND_PARSE_PARAMETERS_START(3, 3)
        Z_PARAM_STR(str)
        Z_PARAM_LONG(count)
        Z_PARAM_BOOL(reverse)
    ZEND_PARSE_PARAMETERS_END();
    
    // Implementation
}

UTF-8 String Handling: The C version includes custom UTF-8 character boundary detection for proper string reversal:

// Detect UTF-8 character length by examining the first byte
static size_t utf8_char_len(const char *s) {
    unsigned char c = (unsigned char)*s;
    
    if (c < 0x80) return 1;      // 0xxxxxxx - ASCII
    if (c < 0xC0) return 1;      // Invalid/continuation byte
    if (c < 0xE0) return 2;      // 110xxxxx - 2 bytes
    if (c < 0xF0) return 3;      // 1110xxxx - 3 bytes
    if (c < 0xF8) return 4;      // 11110xxx - 4 bytes
    return 1;
}

// Two-pass reversal: reverse bytes, then fix multi-byte chars
static void utf8_reverse(char *str, size_t len) {
    // First pass: reverse all bytes
    // Second pass: reverse bytes within each multi-byte character
}

This ensures proper handling of multi-byte UTF-8 characters (emoji, Chinese, etc.) when reversing strings.

CBLAS Integration: The matrix multiplication uses optimized BLAS routines:

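// Computes C = alpha * A * B + beta * C, where A is m x n, B is n x p, C is m x p
cblas_dgemm(
    CblasRowMajor,  // Row-major order
    CblasNoTrans,   // Don't transpose A
    CblasNoTrans,   // Don't transpose B
    m, p, n,        // Rows of A and C, columns of B and C, shared inner dimension
    1.0,            // Alpha scaling
    A_flat, n,      // Matrix A and its leading dimension (row length)
    B_flat, p,      // Matrix B and its leading dimension
    0.0,            // Beta scaling (0 means C is overwritten)
    C_flat, p       // Result matrix C and its leading dimension
);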
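
For readers unfamiliar with the BLAS calling convention, the following sketch spells out what the call above computes using plain loops. It is a reference for the semantics only, not how CBLAS is implemented. The names flatten and dgemmRef are hypothetical, and the sketch assumes rectangular, row-major, untransposed inputs.

```go
package main

// flatten copies a rectangular matrix into one row-major slice,
// so element (i, j) lands at index i*cols + j.
func flatten(mat [][]float64) (flat []float64, rows, cols int) {
	rows = len(mat)
	if rows == 0 {
		return nil, 0, 0
	}
	cols = len(mat[0])
	flat = make([]float64, 0, rows*cols)
	for _, row := range mat {
		flat = append(flat, row...)
	}
	return flat, rows, cols
}

// dgemmRef computes C = alpha*A*B + beta*C for row-major operands.
// A is m x k, B is k x n, C is m x n; lda, ldb and ldc are the row
// lengths (leading dimensions) of the flat slices.
func dgemmRef(m, n, k int, alpha float64, a []float64, lda int,
	b []float64, ldb int, beta float64, c []float64, ldc int) {
	for i := 0; i < m; i++ {
		for j := 0; j < n; j++ {
			sum := 0.0
			for l := 0; l < k; l++ {
				sum += a[i*lda+l] * b[l*ldb+j]
			}
			if beta == 0 {
				// With beta == 0 the previous contents of C are ignored.
				c[i*ldc+j] = alpha * sum
			} else {
				c[i*ldc+j] = alpha*sum + beta*c[i*ldc+j]
			}
		}
	}
}
```

With A of size m x n and B of size n x p as in the C call, the equivalent invocation is dgemmRef(m, p, n, 1.0, aFlat, n, bFlat, p, 0.0, cFlat, p), which mirrors the argument order shown above.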

Comparison: Go vs C

Advantages of Go/FrankenPHP

  • Modern Language: Use Go's language features and standard library (goroutines, channels, etc.)
  • Memory Safety: Go's garbage collector manages Go-side allocations; memory that crosses the CGo boundary still needs manual care
  • Easier Development: Higher-level abstractions, better tooling
  • Concurrency: Built-in goroutines for parallel operations
  • Less Boilerplate: Simpler type conversions and error handling

Advantages of Traditional C

  • Mature Ecosystem: Well-documented, stable API
  • Performance: Direct memory manipulation, no GC overhead
  • Fine-grained Control: Complete control over memory and optimization
  • No Workarounds: Established toolchain without experimental bugs
  • Smaller Binary: No Go runtime overhead

Current State (FrankenPHP 1.9.1)

FrankenPHP extension development is experimental and requires workarounds for:

  • Code generation bugs (C preamble removal)
  • Type system issues (incorrect type hints)
  • Missing imports (runtime/cgo)
  • Documentation gaps

These issues will likely be resolved in future versions as FrankenPHP matures.

Performance Considerations

Memory Management

C Version: Manual memory management with malloc/free. Requires careful cleanup to avoid leaks.

Go Version: Garbage collected. Memory is automatically managed, but GC pauses may occur.

String Reversal

Both versions implement a UTF-8-aware reversal that runs in O(n) time and needs only O(1) extra space beyond the buffer being reversed. The algorithm:

  1. Reverse all bytes
  2. Walk the reversed buffer and reverse the bytes of each multi-byte UTF-8 character again, restoring its internal byte order

This preserves multi-byte characters (like emoji: 😀, Chinese: 世界) correctly.
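
The two passes above can be sketched in Go as follows. This is an illustration that assumes valid UTF-8 input, not the code from either extension; reverseUTF8, reverseBytes, isContinuation and repeatThis are hypothetical names.

```go
package main

import "strings"

// isContinuation reports whether b is a UTF-8 continuation byte (10xxxxxx).
func isContinuation(b byte) bool { return b&0xC0 == 0x80 }

// reverseBytes reverses b in place.
func reverseBytes(b []byte) {
	for i, j := 0, len(b)-1; i < j; i, j = i+1, j-1 {
		b[i], b[j] = b[j], b[i]
	}
}

// reverseUTF8 reverses buf in place, code point by code point.
// It assumes buf holds valid UTF-8.
func reverseUTF8(buf []byte) {
	// Pass 1: reverse every byte. Characters are now in the right order,
	// but each multi-byte character has its own bytes backwards:
	// continuation bytes first, lead byte last.
	reverseBytes(buf)

	// Pass 2: find each run of continuation bytes ending in a lead byte
	// and reverse that run to restore the character.
	for i := 0; i < len(buf); {
		j := i
		for j < len(buf)-1 && isContinuation(buf[j]) {
			j++
		}
		reverseBytes(buf[i : j+1]) // a single ASCII byte is a no-op
		i = j + 1
	}
}

// repeatThis mirrors the PHP-level behaviour of repeat_this in plain Go.
func repeatThis(s string, count int, reverse bool) string {
	if count <= 0 {
		return ""
	}
	buf := []byte(strings.Repeat(s, count))
	if reverse {
		reverseUTF8(buf)
	}
	return string(buf)
}
```

For example, repeatThis("Hi", 3, true) yields "iHiHiH". Note that this approach reverses code points, not grapheme clusters, so sequences built from several code points (such as emoji joined with a zero-width joiner) would still be split apart.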

Matrix Multiplication

C Version: Calls cblas_dgemm through CBLAS, the standard C interface to BLAS; the linked BLAS library supplies the highly optimized implementation.

Go Version: Custom implementation with cache-optimized matrix transposition. Pre-transposes the second matrix for better cache locality during multiplication.
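
As a rough illustration of the pre-transposition idea, the sketch below transposes the second matrix once so that the inner loop reads two contiguous slices. It is not the repository's code: transpose and multiplyTransposed are hypothetical names, and both inputs are assumed to be rectangular.

```go
package main

import "errors"

// transpose returns the transpose of a rectangular matrix.
func transpose(m [][]float64) [][]float64 {
	if len(m) == 0 {
		return nil
	}
	t := make([][]float64, len(m[0]))
	for j := range t {
		t[j] = make([]float64, len(m))
		for i := range m {
			t[j][i] = m[i][j]
		}
	}
	return t
}

// multiplyTransposed computes a*b. It transposes b up front so that every
// dot product walks two contiguous slices (a row of a and a row of bT)
// instead of striding down a column of b.
func multiplyTransposed(a, b [][]float64) ([][]float64, error) {
	if len(a) == 0 || len(b) == 0 || len(a[0]) != len(b) {
		return nil, errors.New("incompatible matrix dimensions")
	}
	bt := transpose(b)
	out := make([][]float64, len(a))
	for i, row := range a {
		out[i] = make([]float64, len(bt))
		for j, col := range bt {
			sum := 0.0
			for k := range row {
				sum += row[k] * col[k]
			}
			out[i][j] = sum
		}
	}
	return out, nil
}
```

The transposition costs one extra pass over the second matrix and a temporary copy of it. That is typically a good trade for larger matrices, where column-wise access would otherwise cause frequent cache misses.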

License

This is demonstration code for educational purposes.