v0.2.0 - MERGE Operations

@michaeloboyle released this 02 Nov 23:39
· 54 commits to main since this release

🚀 Major Features

Cypher-like MERGE Operations

This release adds idempotent upsert operations modeled on Neo4j's Cypher MERGE, so that repeated imports from external sources no longer produce duplicate data.

New Methods

  • mergeNode() - Create if not exists, update if exists

    • Match on single or multiple properties
    • onCreate / onMatch semantics for conditional property setting
    • Returns { node, created }, indicating whether the node was created or matched
    • Automatic conflict detection when multiple nodes match
  • mergeEdge() - Ensure unique relationships

    • Prevents duplicate edges between nodes
    • Property merging on match
    • ON CREATE / ON MATCH support
  • Index Management

    • createPropertyIndex(nodeType, property, unique?) - Create JSON property indexes
    • listIndexes() - View all merge indexes
    • dropIndex(indexName) - Remove indexes
    • Performance warnings in dev mode when indexes are missing
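To make the create-or-match behaviour described above concrete, here is a minimal in-memory sketch of the merge semantics. It is not sqlite-graph's implementation: the `InMemoryGraph` class, the `MergeConflictError` name, and the choice to refresh base properties on a match are all assumptions made for illustration.

```typescript
// Illustrative in-memory model of merge semantics; NOT sqlite-graph's actual code.
type Props = Record<string, unknown>;

interface GraphNode {
  id: number;
  type: string;
  properties: Props;
}

interface MergeOptions {
  onCreate?: Props; // applied only when a new node is created
  onMatch?: Props;  // applied only when an existing node is matched
}

// Hypothetical error type for the "multiple nodes match" case.
class MergeConflictError extends Error {}

class InMemoryGraph {
  private nodes: GraphNode[] = [];
  private nextId = 1;

  mergeNode(
    type: string,
    match: Props,
    base: Props = {},
    options: MergeOptions = {}
  ): { node: GraphNode; created: boolean } {
    // Collect every node of this type whose properties equal all match criteria.
    const hits = this.nodes.filter(
      (n) =>
        n.type === type &&
        Object.entries(match).every(([k, v]) => n.properties[k] === v)
    );

    // More than one hit means the match criteria are ambiguous.
    if (hits.length > 1) {
      throw new MergeConflictError(
        `${hits.length} ${type} nodes match ${JSON.stringify(match)}`
      );
    }

    const existing = hits[0];
    if (existing !== undefined) {
      // Assumption: base properties are refreshed on match, then onMatch is layered on top.
      Object.assign(existing.properties, base, options.onMatch);
      return { node: existing, created: false };
    }

    const node: GraphNode = {
      id: this.nextId++,
      type,
      properties: { ...match, ...base, ...options.onCreate },
    };
    this.nodes.push(node);
    return { node, created: true };
  }
}

// Running the same merge twice creates once, then matches.
const graph = new InMemoryGraph();
const first = graph.mergeNode('Job', { url: 'https://example.com/job/123' });
const second = graph.mergeNode('Job', { url: 'https://example.com/job/123' });
console.log(first.created, second.created); // true false
```

The linear scan in `filter` is the part that a property index would replace in a real store.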

Example Usage

// Daily job scraper - run multiple times safely
db.createPropertyIndex('Job', 'url', true);

const { node: job, created } = db.mergeNode(
  'Job',
  { url: 'https://example.com/job/123' },  // Match on URL
  { title: 'Engineer', status: 'active' },
  {
    onCreate: { discovered: Date.now(), applicationStatus: 'not_applied' },
    onMatch: { lastSeen: Date.now() }
  }
);

console.log(created ? 'New job discovered!' : 'Job updated');

📊 Performance

  • 1.39x faster than the manual SELECT-then-INSERT/UPDATE pattern
  • Atomic transactions ensure data consistency
  • JSON property indexes enable efficient lookups on large datasets
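For readers wondering what a JSON property index can look like in SQLite, the sketch below builds the kind of statement that a call such as createPropertyIndex('Job', 'url', true) might plausibly issue. The schema (a `nodes` table with a `type` column and a JSON `properties` column), the index naming scheme, and the helper `buildPropertyIndexSql` are assumptions for illustration, not the library's documented internals.

```typescript
// Hypothetical helper: builds SQL for a partial expression index on a JSON property.
// Assumed schema: nodes(type TEXT, properties TEXT holding JSON).
function buildPropertyIndexSql(
  nodeType: string,
  property: string,
  unique = false
): string {
  // Identifiers cannot be bound as SQL parameters, so restrict them to a safe character set.
  const ident = /^[A-Za-z_][A-Za-z0-9_]*$/;
  if (!ident.test(nodeType) || !ident.test(property)) {
    throw new Error(`Unsafe identifier: ${nodeType}.${property}`);
  }

  const indexName = `idx_merge_${nodeType}_${property}`;
  const uniqueKw = unique ? 'UNIQUE ' : '';

  // json_extract() is deterministic, so SQLite allows it in an index expression.
  // The WHERE clause makes this a partial index covering a single node type.
  return (
    `CREATE ${uniqueKw}INDEX IF NOT EXISTS ${indexName} ` +
    `ON nodes (json_extract(properties, '$.${property}')) ` +
    `WHERE type = '${nodeType}'`
  );
}

console.log(buildPropertyIndexSql('Job', 'url', true));
```

SQLite only uses an expression index when a query repeats the same expression, and only uses a partial index when the query's WHERE clause implies the index's condition, so a lookup would need to filter on both `type = 'Job'` and `json_extract(properties, '$.url')`.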

📚 Documentation

  • MERGE-DESIGN.md - Complete design specification
  • merge-patterns.ts - 7 comprehensive examples including:
    • Simple upserts
    • ON CREATE / ON MATCH tracking
    • Company deduplication
    • Bulk ETL imports
    • Conflict handling
    • Performance benchmarks

✅ Testing

  • 33 new unit tests with 100% coverage of merge functionality
  • Tests for creation, matching, conflicts, and edge cases
  • Index management test suite

🎯 Use Cases

Perfect for:

  • ETL pipelines that run repeatedly
  • Job scrapers and data importers
  • Distributed systems requiring retry-safe operations
  • Data deduplication (companies, skills, tags)
  • Tracking discovery vs. update timestamps
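The retry-safety these use cases rely on comes from keying each write on identity rather than appending blindly. The sketch below models that idea for edges in memory. The `EdgeStore` class and its `mergeEdge(from, type, to, properties)` signature are hypothetical and chosen only for illustration; they are not taken from the sqlite-graph API.

```typescript
// Illustrative model of a unique-edge merge; names and signature are assumptions.
type EdgeProps = Record<string, unknown>;

interface Edge {
  id: number;
  from: number;
  type: string;
  to: number;
  properties: EdgeProps;
}

class EdgeStore {
  private byKey = new Map<string, Edge>();
  private nextId = 1;

  mergeEdge(
    from: number,
    type: string,
    to: number,
    properties: EdgeProps = {}
  ): { edge: Edge; created: boolean } {
    // The (from, type, to) triple is the identity of the relationship.
    const key = JSON.stringify([from, type, to]);
    const existing = this.byKey.get(key);

    if (existing) {
      // On match, merge the incoming properties into the existing edge.
      Object.assign(existing.properties, properties);
      return { edge: existing, created: false };
    }

    const edge: Edge = {
      id: this.nextId++,
      from,
      type,
      to,
      properties: { ...properties },
    };
    this.byKey.set(key, edge);
    return { edge, created: true };
  }

  get size(): number {
    return this.byKey.size;
  }
}

// A retried write leaves exactly one edge behind.
const store = new EdgeStore();
store.mergeEdge(1, 'POSTED_BY', 2, { source: 'scraper' });
store.mergeEdge(1, 'POSTED_BY', 2, { verified: true });
console.log(store.size); // 1
```

Because the second call matches instead of inserting, a job that crashes and reruns can replay its writes without cleaning up first.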

Breaking Changes

None - this is a backward-compatible feature addition.


Full Changelog: https://github.com/michaeloboyle/sqlite-graph/blob/main/CHANGELOG.md