Here's what I did to get DuckDB/Rust going on my system.
Prerequisites
We need two things from DuckDB: the header files (so we can build duckdb-rs) and the library (so duckdb-rs can call the DuckDB functions). We will get both by building DuckDB from source. At the time of writing, duckdb-rs supports DuckDB version 0.8.1.
We will clone the DuckDB repo, check out version 0.8.1, and build it.
mkdir $HOME/duck81
cd $HOME/duck81
git clone https://github.com/duckdb/duckdb.git
cd duckdb
git checkout tags/v0.8.1
make
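Before moving on, it's worth confirming that the build produced what we need: the shared library and the C header. The paths below assume the directory layout used above.

```shell
# confirm the shared library was built (.dylib on macOS, .so on Linux)
ls $HOME/duck81/duckdb/build/release/src/libduckdb*
# confirm the C header is present
ls $HOME/duck81/duckdb/src/include/duckdb.h
```

If either listing comes up empty, the build did not complete; re-run make and check its output.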
Next, we need a couple of environment variables. Cargo uses these when building the duckdb-rs crate.
export DUCKDB_LIB_DIR=$HOME/duck81/duckdb/build/release/src
export DUCKDB_INCLUDE_DIR=$HOME/duck81/duckdb/src/include
Finally, we need to tell the OS where the dynamic libraries are located.
# macos
export DYLD_LIBRARY_PATH=$DYLD_LIBRARY_PATH:$DUCKDB_LIB_DIR
# linux
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$DUCKDB_LIB_DIR
This setup is fine for development but not for production use. If you ship Rust binaries in production, you already know how to package shared libraries; there are also some packaging notes on the duckdb-rs github page.
Cargo.toml
Add this to the [dependencies] section of your Cargo.toml.
[dependencies]
duckdb = "0.8.1"
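If you would rather not manage a local DuckDB build at all, the duckdb crate also offers a bundled feature that compiles DuckDB from source as part of cargo build, which removes the need for the environment variables above (at the cost of a much longer first build):

```toml
[dependencies]
duckdb = { version = "0.8.1", features = ["bundled"] }
```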
Sample Code
Here's a lightly modified version of the example code on the duckdb-rs github page.
use duckdb::{params, Connection, Result};
use duckdb::arrow::record_batch::RecordBatch;
use duckdb::arrow::util::pretty::print_batches;

#[derive(Debug)]
struct Person {
    id: i32,
    name: String,
    data: Option<Vec<u8>>,
}

fn main() -> Result<()> {
    let conn = Connection::open_in_memory()?;

    conn.execute_batch(r"
        CREATE SEQUENCE seq;
        CREATE TABLE person (
            id INTEGER PRIMARY KEY DEFAULT NEXTVAL('seq'),
            name TEXT NOT NULL,
            data BLOB
        );
    ")?;

    conn.execute("INSERT INTO person (name, data) VALUES (?, ?)",
        params!["able", "aaa"])?;
    conn.execute("INSERT INTO person (name, data) VALUES (?, ?)",
        params!["baker", "bbb"])?;
    conn.execute("INSERT INTO person (name, data) VALUES (?, ?)",
        params!["charlie", "ccc"])?;
    conn.execute("INSERT INTO person (name, data) VALUES (?, ?)",
        params!["dog", "ddd"])?;

    println!("query table by rows");
    let mut stmt = conn.prepare("SELECT id, name, data FROM person")?;
    let person_iter = stmt.query_map([], |row| {
        Ok(Person {
            id: row.get(0)?,
            name: row.get(1)?,
            data: row.get(2)?,
        })
    })?;
    for person in person_iter {
        println!("Found person {:?}", person.unwrap());
    }

    println!("query table by arrow");
    let rbs: Vec<RecordBatch> = stmt.query_arrow([])?.collect();
    print_batches(&rbs).unwrap();
    Ok(())
}
Execution
We can ignore the (incorrect) warning that the Person fields are never read.
% cargo run
Compiling duck1 v0.1.0 (/Users/neologic/biblio/anachron/duck1)
warning: fields `id`, `name`, and `data` are never read
--> src/main.rs:7:5
|
6 | struct Person {
| ------ fields in this struct
7 | id: i32,
| ^^
8 | name: String,
| ^^^^
9 | data: Option<Vec<u8>>,
| ^^^^
|
= note: `Person` has a derived impl for the trait `Debug`, but this is intentionally ignored during dead code analysis
= note: `#[warn(dead_code)]` on by default
warning: `duck1` (bin "duck1") generated 1 warning
Finished dev [unoptimized + debuginfo] target(s) in 1.24s
Running `target/debug/duck1`
query table by rows
Found person Person { id: 1, name: "able", data: Some([97, 97, 97]) }
Found person Person { id: 2, name: "baker", data: Some([98, 98, 98]) }
Found person Person { id: 3, name: "charlie", data: Some([99, 99, 99]) }
Found person Person { id: 4, name: "dog", data: Some([100, 100, 100]) }
query table by arrow
+----+---------+--------+
| id | name | data |
+----+---------+--------+
| 1 | able | 616161 |
| 2 | baker | 626262 |
| 3 | charlie | 636363 |
| 4 | dog | 646464 |
+----+---------+--------+