feat: implement data replication plugin for external database sync #96

Open
wants to merge 2 commits into
base: main
from

Conversation

MAVRICK-1

Closes #72
/claim #72

Purpose

Implement a data replication plugin for StarbaseDB that pulls data from external databases (PostgreSQL, MySQL, or a Hyperdrive-backed connection) into the internal SQLite database, with a configurable sync interval per replication and incremental updates.

This addresses the request in #72 for a "pull mechanism" that replicates data from external sources such as Supabase PostgreSQL into a StarbaseDB SQLite replica that sits close to the edge.

Screencast.from.2025-07-23.13-11-42.mp4

Tasks

  • Create DataReplicationPlugin following StarbaseDB plugin architecture
  • Implement external database connection support (PostgreSQL, MySQL, Hyperdrive)
  • Add incremental sync mechanism using tracking columns (id, created_at, updated_at)
  • Create configurable sync intervals per replication configuration
  • Build comprehensive REST API for replication management
  • Add real-time logging and monitoring system
  • Implement event callback system for external integrations
  • Create internal database tables for configuration and logs storage
  • Add proper error handling and graceful degradation
  • Write unit tests and comprehensive documentation
  • Register plugin in main application with event handling
  • Deploy and test on live Cloudflare Workers environment
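For reviewers, the per-configuration sync interval in the task list can be pictured with a minimal sketch like the one below. It is not taken from plugins/data-replication/index.ts: the `ReplicationState` shape and the `isSyncDue` and `dueReplications` helpers are hypothetical names, and the plugin may schedule syncs differently.

```typescript
// Hypothetical in-memory view of one replication configuration.
// Field names are assumptions made for this sketch.
interface ReplicationState {
  name: string;
  syncIntervalMinutes: number;
  lastSyncAt: number | null; // epoch milliseconds of the last sync, null if never synced
  isRunning: boolean;
}

// A replication is due when it is running and either has never synced
// or its interval has elapsed since the last sync.
function isSyncDue(state: ReplicationState, nowMs: number): boolean {
  if (!state.isRunning) return false;
  if (state.lastSyncAt === null) return true;
  return nowMs - state.lastSyncAt >= state.syncIntervalMinutes * 60 * 1000;
}

// Names of all replications that should be synced at the given time.
function dueReplications(states: ReplicationState[], nowMs: number): string[] {
  return states.filter((s) => isSyncDue(s, nowMs)).map((s) => s.name);
}
```

Keeping the check a pure function of the stored state and the current time makes it straightforward to unit test without timers.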

Verify

🔍 Code Review Checklist

  • Plugin Architecture: Verify adherence to StarbaseDB plugin patterns in plugins/data-replication/index.ts
  • External Database Security: Review credential handling and SQL injection prevention
  • API Endpoints: Test all 7 REST endpoints with proper authentication
  • TypeScript Types: Validate type safety and interface compliance
  • Error Handling: Ensure graceful degradation and comprehensive logging

🧪 Testing Steps

  1. Build & Deploy: npm run build && npx wrangler deploy
  2. Configure Replication:
    curl -X POST https://starbasedb.mavrickrishi.workers.dev/data-replication/configure \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer ABC123" \
      -d '{"name":"test","sourceConfig":{"dialect":"postgresql","host":"demo.supabase.co","port":5432,"user":"postgres","password":"demo-pass","database":"postgres"},"targetTable":"users","sourceTable":"users","syncIntervalMinutes":30,"trackingColumn":"updated_at"}'
  3. Check Status: curl -X GET https://starbasedb.mavrickrishi.workers.dev/data-replication/status -H "Authorization: Bearer ABC123"
  4. Monitor Logs: View Cloudflare Workers dashboard for real-time sync events
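The configure payload in step 2 can also be described as a TypeScript type with a small validator, which may help when reviewing input handling. This is a sketch under assumptions: the field names come from the curl example above, but the `validateReplicationRequest` helper, the exact rules, and `"mysql"` as a dialect string are hypothetical and may not match the plugin.

```typescript
// Dialect strings: "postgresql" appears in the curl example; "mysql" is an assumption.
type Dialect = "postgresql" | "mysql";

interface SourceConfig {
  dialect: Dialect;
  host: string;
  port: number;
  user: string;
  password: string;
  database: string;
}

interface ReplicationRequest {
  name: string;
  sourceConfig: SourceConfig;
  targetTable: string;
  sourceTable: string;
  syncIntervalMinutes: number;
  trackingColumn: string;
}

// Returns a list of human-readable problems; an empty list means the body looks valid.
function validateReplicationRequest(input: unknown): string[] {
  if (typeof input !== "object" || input === null) {
    return ["body must be a JSON object"];
  }
  const body = input as Record<string, unknown>;
  const errors: string[] = [];

  for (const key of ["name", "targetTable", "sourceTable", "trackingColumn"]) {
    if (typeof body[key] !== "string" || body[key] === "") {
      errors.push(`${key} must be a non-empty string`);
    }
  }

  const interval = body["syncIntervalMinutes"];
  if (typeof interval !== "number" || !Number.isInteger(interval) || interval <= 0) {
    errors.push("syncIntervalMinutes must be a positive integer");
  }

  const source = body["sourceConfig"];
  if (typeof source !== "object" || source === null) {
    errors.push("sourceConfig must be an object");
  } else {
    const src = source as Record<string, unknown>;
    if (src["dialect"] !== "postgresql" && src["dialect"] !== "mysql") {
      errors.push("sourceConfig.dialect must be 'postgresql' or 'mysql'");
    }
    for (const key of ["host", "user", "password", "database"]) {
      if (typeof src[key] !== "string" || src[key] === "") {
        errors.push(`sourceConfig.${key} must be a non-empty string`);
      }
    }
    if (typeof src["port"] !== "number" || !Number.isInteger(src["port"])) {
      errors.push("sourceConfig.port must be an integer");
    }
  }
  return errors;
}
```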

📊 Performance Verification

  • Memory: Plugin reuses existing connection infrastructure (minimal overhead)
  • Scalability: Supports multiple concurrent replication configurations
  • Efficiency: Incremental sync with WHERE clause filtering
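To make the "incremental sync with WHERE clause filtering" point concrete, here is a minimal sketch of how such a query could be built. It is an illustration rather than the plugin's code: `buildIncrementalQuery` and `quoteIdentifier` are hypothetical helpers, the `$1`/`$2` placeholders and double-quoted identifiers follow PostgreSQL conventions (MySQL would need `?` and backticks), and the default row limit is an assumption.

```typescript
interface IncrementalQuery {
  sql: string;
  params: Array<string | number>;
}

// Only plain identifiers are accepted, because table and column names
// cannot be passed as bound parameters.
const IDENTIFIER = /^[A-Za-z_][A-Za-z0-9_]*$/;

function quoteIdentifier(name: string): string {
  if (!IDENTIFIER.test(name)) {
    throw new Error(`Invalid identifier: ${name}`);
  }
  return `"${name}"`;
}

// Builds a query that fetches only rows whose tracking column is greater
// than the last value seen. With no previous value, it fetches from the start.
function buildIncrementalQuery(
  sourceTable: string,
  trackingColumn: string,
  lastValue: string | number | null,
  limit: number = 1000
): IncrementalQuery {
  const table = quoteIdentifier(sourceTable);
  const column = quoteIdentifier(trackingColumn);
  if (lastValue === null) {
    return {
      sql: `SELECT * FROM ${table} ORDER BY ${column} ASC LIMIT $1`,
      params: [limit],
    };
  }
  return {
    sql: `SELECT * FROM ${table} WHERE ${column} > $1 ORDER BY ${column} ASC LIMIT $2`,
    params: [lastValue, limit],
  };
}
```

Binding the tracking value as a parameter and validating identifiers separately is one way to address the SQL injection item in the review checklist. Note that a strict `>` comparison can miss rows that share the last seen timestamp, so a tie-breaker on `id` may be worth considering.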

Before

GitHub Issue #72 Request:

  • No data replication functionality
  • No external database sync capability
  • Manual data management required
  • No way to create a close-to-edge replica

StarbaseDB Plugin Ecosystem:

plugins/
├── cdc/           # Change Data Capture
├── cron/          # Scheduled tasks  
├── query-log/     # Query logging
├── stats/         # System statistics
└── [missing data replication]

After

✅ Complete Data Replication System:

🔌 New Plugin Added

plugins/data-replication/
├── index.ts       # Main plugin (475+ lines)
├── index.test.ts  # Unit tests (3 passing)
├── meta.json      # Plugin metadata
└── README.md      # Comprehensive docs

🚀 Live Deployment

📡 7 New API Endpoints

  • POST /data-replication/configure - Set up replication
  • POST /data-replication/start/:name - Start specific replication
  • POST /data-replication/stop/:name - Stop specific replication
  • POST /data-replication/sync/:name - Manual sync trigger
  • GET /data-replication/status - View all configurations
  • GET /data-replication/logs - Sync operation logs
  • DELETE /data-replication/configure/:name - Remove config
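As a quick reference for the seven routes, the sketch below maps a method and path to an action. It is a standalone illustration: the `matchRoute` helper and the action names are hypothetical, and the plugin itself presumably registers these routes through the application's own router rather than through regular expressions.

```typescript
type Action = "configure" | "start" | "stop" | "sync" | "status" | "logs" | "remove";

interface RouteMatch {
  action: Action;
  name?: string; // the :name path parameter, when the route has one
}

// Route table mirroring the endpoint list above.
const ROUTES: Array<{ method: string; pattern: RegExp; action: Action }> = [
  { method: "POST", pattern: /^\/data-replication\/configure$/, action: "configure" },
  { method: "POST", pattern: /^\/data-replication\/start\/([^/]+)$/, action: "start" },
  { method: "POST", pattern: /^\/data-replication\/stop\/([^/]+)$/, action: "stop" },
  { method: "POST", pattern: /^\/data-replication\/sync\/([^/]+)$/, action: "sync" },
  { method: "GET", pattern: /^\/data-replication\/status$/, action: "status" },
  { method: "GET", pattern: /^\/data-replication\/logs$/, action: "logs" },
  { method: "DELETE", pattern: /^\/data-replication\/configure\/([^/]+)$/, action: "remove" },
];

// Returns the matched action (and name, if any), or null for unknown routes.
function matchRoute(method: string, path: string): RouteMatch | null {
  const upper = method.toUpperCase();
  for (const route of ROUTES) {
    if (route.method !== upper) continue;
    const m = route.pattern.exec(path);
    if (!m) continue;
    const name = m[1];
    return name !== undefined
      ? { action: route.action, name: decodeURIComponent(name) }
      : { action: route.action };
  }
  return null;
}
```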

💾 Database Schema

Two new internal tables created:

  • tmp_replication_configs - Store replication settings
  • tmp_replication_logs - Track sync operations and performance

🔄 Working Demo

# ✅ TESTED: Configuration successful
{"result":{"success":true,"message":"Replication configured successfully"}}

# ✅ TESTED: Status shows active replication
{"result":{"configs":[{"name":"supabase-users","is_running":true,"sync_interval_minutes":30}]}}

🎯 Issue Requirements Met

✅ Pull mechanism for external databases
✅ Configurable sync intervals
✅ Table-specific replication
✅ Incremental updates via tracking columns
✅ Supabase PostgreSQL support
✅ Close-to-edge SQLite replica creation
✅ wrangler.toml configuration support

  - Add DataReplicationPlugin following StarbaseDB plugin architecture
  - Support PostgreSQL, MySQL, and Hyperdrive external connections
  - Implement incremental sync with configurable tracking columns
  - Add comprehensive API endpoints for replication management
  - Include real-time logging and event callback system
  - Add unit tests and comprehensive documentation
  - Successfully deployed and tested on Cloudflare Workers

  Closes outerbase#72
@MAVRICK-1
Author

cc @Brayden


Successfully merging this pull request may close these issues.

Replicate data from external source to internal source with a Plugin
1 participant