R package for accessing and visualizing CalCOFI data. Connect directly to the CalCOFI database via DuckDB or use the CalCOFI API.

Install

This package lives on GitHub and is not yet on CRAN, so run the following to install or update it:

remotes::install_github("calcofi/calcofi4r")

Then load the package:

library(calcofi4r)

Quick Start

Connect to CalCOFI Database

Access the CalCOFI integrated database directly via DuckDB:

# connect to latest frozen release
con <- cc_get_db()

# list available tables
cc_list_tables()

# query with SQL
DBI::dbGetQuery(con, "SELECT COUNT(*) FROM larva")

Read Data with Convenience Functions

# read larvae data
larvae <- cc_read_larvae()

# read bottle samples
bottles <- cc_read_bottle()

# read cast data
casts <- cc_read_cast()

# filter while reading (uses dplyr syntax)
engraulis <- cc_read_larvae(species_id == 123)

Version Control

Access specific database versions for reproducibility:

# list available versions
cc_list_versions()
#>    version   is_latest
#> 1 v2026.03       TRUE
#> 2 v2026.02      FALSE

# connect to specific version
con <- cc_get_db(version = "v2026.02")

# get release information
cc_db_info("v2026.02")

# view release notes
cc_release_notes("v2026.02")
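
For a reproducible analysis it can help to resolve "latest" to a concrete version string once and record it. The sketch below works on a stand-in data frame that mirrors the printed output of cc_list_versions() above; it does not call the package, and the helper name pin_version is hypothetical.

```r
# Stand-in for the data frame printed by cc_list_versions() above
# (column names copied from that output; values are illustrative).
versions <- data.frame(
  version   = c("v2026.03", "v2026.02"),
  is_latest = c(TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Hypothetical helper: resolve the release flagged as latest to a fixed string
pin_version <- function(versions) {
  latest <- versions$version[versions$is_latest]
  stopifnot(length(latest) == 1)  # expect exactly one latest release
  latest
}

pinned <- pin_version(versions)
print(pinned)
```

The pinned string could then be passed as cc_get_db(version = pinned) and stored alongside the analysis outputs, so that a later rerun reads the same release.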

Execute Custom Queries

# run SQL queries
results <- cc_query("
  SELECT species_id, COUNT(*) as n
  FROM larva
  GROUP BY species_id
  ORDER BY n DESC
  LIMIT 10")

# describe table schema
cc_describe_table("larva")
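
To see the shape of the result the top-species query returns without touching the remote database, the sketch below runs the same SQL against a tiny in-memory DuckDB table. The table contents are made up, and the example uses DBI and duckdb directly rather than cc_query(), so it only illustrates the query, not the package internals.

```r
library(DBI)

# In-memory DuckDB database; nothing is read from CalCOFI here
con <- dbConnect(duckdb::duckdb())

# Made-up stand-in for the larva table (only the column the query needs)
toy_larva <- data.frame(species_id = c(1L, 1L, 1L, 2L, 2L, 3L))
dbWriteTable(con, "larva", toy_larva)

# Same query as above: row counts per species, largest first
results <- dbGetQuery(con, "
  SELECT species_id, COUNT(*) AS n
  FROM larva
  GROUP BY species_id
  ORDER BY n DESC
  LIMIT 10")
print(results)

# Close the connection and shut down the in-memory database
dbDisconnect(con, shutdown = TRUE)
```

The result is an ordinary data frame with one row per species_id and a count column n, so it can be passed straight to plotting or dplyr code.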

CalCOFI API Functions

The package also provides functions for the CalCOFI API at api.calcofi.io:

# get available variables
get_variables()

# get cruise information
get_cruises()

# get interpolated raster
get_raster(
  variable  = "ctdcast_bottle.t_deg_c",
  cruise_id = "2020-01-05-C-33RL",
  out_tif   = "temperature.tif")

# get time series summary
get_timeseries(
  variable    = "ctdcast_bottle.t_deg_c",
  aoi_wkt     = "POLYGON((-121 33, -119 33, -119 35, -121 35, -121 33))",
  depth_m_min = 0,
  depth_m_max = 100,
  time_step   = "year")
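
The aoi_wkt argument above is a WKT polygon whose ring lists longitude/latitude pairs and closes on its first vertex. As a minimal sketch, a rectangular area of interest can be built from a bounding box in base R; the helper bbox_to_wkt is hypothetical and not part of the package.

```r
# Hypothetical helper: build a closed WKT rectangle from bounding-box corners
bbox_to_wkt <- function(xmin, ymin, xmax, ymax) {
  stopifnot(xmin < xmax, ymin < ymax)
  # counter-clockwise ring, repeating the first vertex to close it
  pts <- c(
    paste(xmin, ymin),
    paste(xmax, ymin),
    paste(xmax, ymax),
    paste(xmin, ymax),
    paste(xmin, ymin)
  )
  paste0("POLYGON((", paste(pts, collapse = ", "), "))")
}

aoi <- bbox_to_wkt(xmin = -121, ymin = 33, xmax = -119, ymax = 35)
print(aoi)
```

With these inputs the helper reproduces the polygon string used in the get_timeseries() call above, so aoi could be supplied as aoi_wkt.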

Package Data

The package includes small lookup and example datasets:

# CalCOFI sampling grid
cc_grid
cc_grid_ctrs
cc_grid_zones

# example bottle data
cc_bottle

# station locations
stations

# geographic places
cc_places

Data Architecture

CalCOFI data is stored in frozen DuckLake releases:

gs://calcofi-db/ducklake/releases/
├── v2026.02/
│   ├── catalog.json
│   ├── RELEASE_NOTES.md
│   └── parquet/
│       ├── bottle.parquet
│       ├── cast.parquet
│       ├── larva.parquet
│       └── ...
├── v2026.03/
└── latest.txt → v2026.03

Data is accessed directly via DuckDB’s httpfs extension, so queries require no download.
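
Given the layout above, the location of a table's Parquet file follows from the release version and the table name. The sketch below only builds the path and a matching DuckDB query string. The helper parquet_path is hypothetical, and whether cc_get_db() resolves files this way or through catalog.json is not stated here. Running such a query yourself would also depend on how DuckDB is configured to reach the bucket.

```r
# Release root taken from the directory tree above
release_root <- "gs://calcofi-db/ducklake/releases"

# Hypothetical helper: path to one table's Parquet file in a given release
parquet_path <- function(version, table, root = release_root) {
  paste(root, version, "parquet", paste0(table, ".parquet"), sep = "/")
}

path <- parquet_path("v2026.02", "larva")

# read_parquet() is DuckDB's table function for Parquet files;
# the query is only assembled here, not executed
sql <- sprintf("SELECT COUNT(*) FROM read_parquet('%s')", path)
cat(path, sql, sep = "\n")
```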

Code of Conduct

This is an open-source project, so your input is very welcome! Please note that the calcofi4r project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.