DuckLake Data Lakehouse

Lakehouse Storage,
Real-Time Speed

Stream from servers (Kafka), browsers (HTTP/2), or any DuckDB client (FlightRPC) to our Rust-based DuckDB server.
Query in real time via the PostgreSQL interface.

~1s
Hot Tier Commit
Configurable, can go much faster
1–10 min
Cold Tier (S3)
Durable storage

DuckLake Two-Tier Architecture

Built on DuckLake with a PostgreSQL catalog and data inlining. DuckDB clients with the ducklake extension simply query tables: fresh data becomes visible in ~1s, and recent data is served at in-instance speed. The boilstream extension configures DuckLake with temporary credentials for secure access.

Clients
BI Tools
Power BI, Tableau
dbt
Transformations
DuckDB
Extension
Apps
Any PG client
PostgreSQL Interface (pgwire)
Ingestion
Kafka
HTTP/2
FlightRPC
BoilStream Server
Single Binary
Hot Tier (~1s visibility)
DuckDB Inlined Data + PostgreSQL Catalog
Cold Tier
S3 / Azure / GCS
DuckLake Parquet
INSTALL boilstream FROM community;
LOAD boilstream;

-- Login with email, password, and MFA code
PRAGMA boilstream_login('https://your-server.com/user@example.com', 'password', '123456');

-- List and use your ducklakes
FROM boilstream_ducklakes();
USE my_catalog;
SELECT * FROM events;

Built for Streaming

Everything you need for a real-time data lakehouse.

🦆

DuckDB Extension

The boilstream extension manages DuckLake for you and vends temporary credentials for seamless hot and cold tier access. Secure OPAQUE PAKE authentication with MFA support.

K

Kafka Protocol + JIT Avro

Confluent Schema Registry-compatible Avro with a JIT-compiled decoder – 3–5x faster than the Apache Arrow Rust decoder published in October 2025. Use standard Kafka clients to stream data.

S3

DuckLake Cold Storage

Automatic S3 Parquet snapshots with DuckLake catalog registration. Remote DuckDB clients with the DuckLake extension work seamlessly.
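As a sketch of what "work seamlessly" means for a stock DuckDB client: the ducklake extension can attach a DuckLake catalog directly, here assuming a PostgreSQL catalog and an S3 data path (the host, database, bucket, and table names below are placeholders, not BoilStream defaults).

```sql
-- Attach the DuckLake catalog from any remote DuckDB client.
-- Placeholder connection values; credentials are vended separately.
INSTALL ducklake;
INSTALL postgres;
ATTACH 'ducklake:postgres:dbname=ducklake_catalog host=your-server.com' AS lake
    (DATA_PATH 's3://your-bucket/ducklake/');

-- Query cold-tier Parquet through the catalog like any other table.
SELECT * FROM lake.events LIMIT 10;
```

With the boilstream extension, the login and catalog listing shown earlier handle this attachment and credential setup for you.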

🛡

Enterprise Ready

SSO with Entra ID (XML metadata upload/download) with automated user provisioning (SCIM), role-based access control (RBAC), audit trails, and user/admin dashboards. Built-in registration as an alternative. Prometheus monitoring. Configure multiple cloud backends and assign BoilStream roles to users.

0

Zero-Copy Pipeline

Envelope recycling and zero-copy Arrow processing eliminate memory allocations. 2.5+ GB/s throughput (16 vCPU) with 10,000 concurrent sessions.

SQL

Materialized Views

DuckLake VIEWs are materialized by continuously running DuckDB queries that transform data in real time. Standard SQL syntax, continuous execution.
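A minimal sketch of the idea: you write an ordinary view, and BoilStream keeps its query running so the materialization stays fresh. The table and column names here are hypothetical, not part of the product's schema.

```sql
-- Standard SQL view; BoilStream runs the underlying query continuously
-- so the materialized result tracks the hot tier in real time.
-- (events, event_time: hypothetical example names)
CREATE VIEW events_per_minute AS
SELECT date_trunc('minute', event_time) AS minute,
       count(*)                         AS event_count
FROM events
GROUP BY minute;
```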

Start querying in minutes

Download the binary, point to your S3 bucket, and start ingesting. No complex setup required.