8 Status

8.1 2025-12-01

Over the past several months, CalCOFI’s software and data systems have advanced significantly toward a unified, reliable, and user‑friendly platform for exploring and publishing CalCOFI data. Work has focused on four main areas: the integrated data app, the underlying database and workflows, publication to OBIS, and organization of CTD data.


8.1.0.1 1. A More Capable, User-Friendly Integrated App

The CalCOFI integrated application (int-app) has evolved into a much richer and more intuitive tool for scientists and partners:

  • Taxonomy-aware exploration
    The app now understands species and their taxonomic relationships. Users can browse taxa hierarchies, see taxonomic ranks, and work with improved species metadata tied to authoritative sources (e.g., WoRMS, ITIS). This makes it easier to find and compare species and groups of species consistently.

  • Better visual experience and theming
    A new dark/light theme toggle has been implemented and refined so that maps, time series, and other plots remain readable and visually consistent. Navigation has been reorganized, with a clearer About page, a guided “tour” of the app, and more intuitive icons and labels, making the app easier to learn and use.

  • Stronger spatial and temporal tools
    Spatial maps now rely on efficient hexagon grids calculated in the database, improving performance and scalability. Default settings for time and depth matching have been tuned to yield better joins between environmental and biological data out of the box.

Overall, the app is moving from a prototype to a polished, guided interface that better supports exploratory analysis and communication.
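
The time and depth matching mentioned above can be pictured as a tolerance-based nearest-neighbor join between biological and environmental records. The following is a minimal sketch of that idea, not the app's actual implementation: the function name `match_nearest`, the record fields, and the default tolerances are all hypothetical.

```python
from datetime import datetime, timedelta


def match_nearest(bio, env, max_dt=timedelta(hours=12), max_dz=10.0):
    """Pair each biological record with the closest environmental record.

    Hypothetical illustration: a candidate qualifies only if it lies within
    max_dt in time and max_dz meters in depth. Among qualifying candidates,
    the one closest in time wins, with depth difference breaking ties.
    """
    pairs = []
    for b in bio:
        best, best_key = None, None
        for e in env:
            dt = abs(b["time"] - e["time"])
            dz = abs(b["depth_m"] - e["depth_m"])
            if dt <= max_dt and dz <= max_dz:
                key = (dt, dz)
                if best_key is None or key < best_key:
                    best, best_key = e, key
        # Unmatched biological records are kept, paired with None.
        pairs.append((b["id"], best["id"] if best else None))
    return pairs


# Made-up records for illustration only.
bio = [
    {"id": "net-1", "time": datetime(2020, 1, 5, 8), "depth_m": 20.0},
    {"id": "net-2", "time": datetime(2020, 1, 9, 8), "depth_m": 20.0},
]
env = [
    {"id": "cast-A", "time": datetime(2020, 1, 5, 6), "depth_m": 25.0},
    {"id": "cast-B", "time": datetime(2020, 1, 5, 7), "depth_m": 60.0},
]
matches = match_nearest(bio, env)
```

Tightening or relaxing `max_dt` and `max_dz` changes how many biological records find an environmental partner, which is why sensible defaults matter for joins that work out of the box.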


8.1.0.2 2. A Stable, Well-Documented Database Foundation

The CalCOFI database package (calcofi4db) has been formalized and versioned, providing a solid foundation for all downstream tools:

  • Two stable releases (versions 1.0 and 1.1) have established a reliable baseline for the database, including bottle-level data.
  • The project now follows a clear strategy for separate development and production databases, reducing risk when making changes and improving reproducibility.
  • Data ingestion from NOAA and other sources has been hardened, with several rounds of fixes to handle edge cases and ensure that raw files are consistently and correctly translated into the database.

In addition, the package’s online documentation site has been refreshed so that developers and analysts have up-to-date guidance on how data flow into and through the database.


8.1.0.3 3. Unified R Tools and Documentation Around DuckDB

Across the toolchain, CalCOFI has standardized on DuckDB as the core data engine:

  • The R package (calcofi4r) now encapsulates key logic originally developed inside the integrated app, so the same data access and processing logic is available in scripts, reports, and analyses, not just in the web interface.
  • Both the app and R package can connect to local or remote DuckDB databases, improving performance and enabling offline or near‑offline workflows.
  • Documentation in the docs repository has been updated to describe the full data creation process (from raw data to ready‑to‑use databases) and to explain the new development/production database strategy. Status documents and helper scripts provide clearer visibility into project progress.

This brings CalCOFI closer to a coherent, documented platform where analysts can move seamlessly between app-based exploration and scripted analysis.


8.1.0.4 4. Standards-Compliant Publication of Biological Data

The workflows repository has seen major progress in turning CalCOFI’s biological datasets into publication-ready products:

  • New workflows now publish larval data to OBIS, the global biodiversity information system, with repeatable recipes that combine biological observations with CTD (oceanographic) data.
  • The underlying data model for events and occurrences has been strengthened to align with international standards (e.g., Darwin Core), including:
    • Clear hierarchies of sampling events,
    • Better handling of life‑stage and size information (e.g., egg and larval stages),
    • Automated generation of metadata files required for data archives.
  • Additional integrity checks and foreign key relationships help ensure that data are correct and consistent before publication.

These advances substantially improve CalCOFI’s ability to share high-quality, well-structured biodiversity data with the broader scientific community.
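
The integrity checks described above can be illustrated with a small validation pass over event and occurrence records. This is a sketch under stated assumptions rather than the workflows' actual code: it uses the Darwin Core terms `eventID`, `parentEventID`, and `occurrenceID`, but the function `check_integrity`, the specific checks, and the sample identifiers are hypothetical.

```python
def check_integrity(events, occurrences):
    """Return human-readable problems found in event/occurrence records.

    Field names follow Darwin Core terms; the checks are illustrative
    stand-ins for foreign key constraints enforced in a database.
    """
    problems = []
    event_ids = {e["eventID"] for e in events}

    # Every non-empty parentEventID must point at a known event.
    for e in events:
        parent = e.get("parentEventID")
        if parent and parent not in event_ids:
            problems.append(f"event {e['eventID']}: unknown parent {parent}")

    # Every occurrence needs a unique ID and a known event.
    seen = set()
    for o in occurrences:
        oid = o["occurrenceID"]
        if oid in seen:
            problems.append(f"occurrence {oid}: duplicate occurrenceID")
        seen.add(oid)
        if o["eventID"] not in event_ids:
            problems.append(f"occurrence {oid}: unknown event {o['eventID']}")
    return problems


# Made-up identifiers for illustration only.
events = [
    {"eventID": "cruise-1", "parentEventID": None},
    {"eventID": "cruise-1:tow-1", "parentEventID": "cruise-1"},
]
occurrences = [
    {"occurrenceID": "occ-1", "eventID": "cruise-1:tow-1"},
    {"occurrenceID": "occ-2", "eventID": "cruise-1:tow-9"},
]
problems = check_integrity(events, occurrences)
```

Running checks like these before publication surfaces orphaned records early, instead of leaving them to be discovered after the archive has been harvested.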


8.1.0.5 5. Improved Public Access and Infrastructure

Finally, several changes improve how external users find and access CalCOFI tools:

  • The public website (CalCOFI.github.io) now highlights key applications, including the integrated app and a pollutants-focused app, making them easier to discover.
  • Server configuration has been updated so that Shiny apps are served from a new dedicated domain, app.calcofi.io, while shiny.calcofi.io continues to work. This clarifies the entry point for interactive tools and simplifies operations.

8.1.1 Overall Impact

Together, these developments move CalCOFI toward a modern, integrated data platform:

  • Scientists and partners gain a more powerful, user‑friendly app and R toolkit for exploring CalCOFI data.
  • The underlying database and workflows are more robust, testable, and clearly documented.
  • CalCOFI’s biological data are better positioned for global visibility and reuse through standards‑compliant publication channels like OBIS.
  • Public‑facing web presence and infrastructure are cleaner and more aligned, making it easier for stakeholders to find and use CalCOFI resources.

8.2 2025-07-01

This report summarizes the key development activities, major accomplishments, and ongoing work for the first six months of 2025 across the CalCOFI GitHub repositories: api, apps, calcofi4db, calcofi4r, docs, server, workflows. The findings are based on issues and commits from January–July 2025.


8.2.1 API Enhancements

8.2.1.1 New Features & Data Integration

  • Expanded API Options
    • Added ability to include bottle data and use relaxed criteria for net-to-cast matching (commit).
    • Supported upcast/downcast data downloads (commit).
    • Added Zooplankton biomass and improved ichthyodata output (commit).
  • Performance & Maintenance
    • Implemented docker compose restart for Plumber API service (commit).
  • Ongoing Work
    • Migration of database contouring functions to API/app level for improved caching and rendering efficiency.
    • Development of a robust, user-friendly API for seamless DB integration (issue).

8.2.2 Apps Development

8.2.2.1 Visualization & User Interface

  • Continuous Improvements
    • Multiple commits indicate ongoing enhancement, likely focused on UI, data visualization, and integration with the API (see recent commit log).
    • Close coordination between API and Apps for improved workflows and data access.

8.2.3 calcofi4db: R Package & Data Management

8.2.3.1 R Package Initialization & Data Ingestion

  • New R Package: calcofi4db
    • Initial commit and setup (commit), including functions for ingesting CSV datasets and metadata.
    • Refined change detection logic for source CSV files, improving tracking of table/field changes (commit).
    • Enhanced documentation and site via pkgdown.
    • Improved function naming and structure for ingestion (commits, commit).
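
Change detection for source CSV files, as mentioned in the list above, can be pictured as a comparison of column headers between two versions of a file. The sketch below is only an illustration in Python, not the calcofi4db implementation (which is an R package): the helper names `read_header` and `detect_field_changes` and the sample columns are hypothetical.

```python
import csv
import io


def read_header(csv_text):
    """Return the column names from the first row of a CSV document."""
    return next(csv.reader(io.StringIO(csv_text)), [])


def detect_field_changes(old_csv, new_csv):
    """Report which fields were added or removed between two CSV versions."""
    old, new = read_header(old_csv), read_header(new_csv)
    return {
        "added": [f for f in new if f not in old],
        "removed": [f for f in old if f not in new],
    }


# Made-up file contents for illustration only.
old_csv = "cruise,station,temp_c\n2001,93.3,15.2\n"
new_csv = "cruise,station,temperature_c,salinity\n2001,93.3,15.2,33.4\n"
changes = detect_field_changes(old_csv, new_csv)
```

In this simple form a renamed column shows up as one removal plus one addition, so a person (or a mapping table) still has to decide whether `temp_c` and `temperature_c` are the same field.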

8.2.4 calcofi4r: Spatial & Ecological Data Tools

8.2.4.1 Data Layers, Analysis, and Bug Fixes

  • Spatial Management Layers
    • Ongoing integration of BOEM Wind Planning Areas, Marine Protected Areas, and SCCWRP management regions (issue, issue).
  • Analysis Functions
    • Improved packages for ecological and spatial analysis, including new dependencies (commit).
  • User Feedback
    • Addressing user-reported bugs such as deprecated function calls (issue).

8.2.5 Documentation (docs)

8.2.5.1 Infrastructure & Environment

  • Documentation Site Updates
    • Added documentation for new packages and ingestion workflows (commit).
    • Improved environment handling for rendering with Quarto and Chromium (multiple commits Jan-Mar 2025).
    • Updated diagrams and edge labels for database documentation.

8.2.6 Server

8.2.6.1 Backend Infrastructure

  • Backend Maintenance
    • Numerous commits for improving server reliability, configuration, and deployment.
    • These commits indicate active backend support for the API and Apps.

8.2.7 Workflows

8.2.7.1 Data Pipeline, Integration, and Registration

  • Workflow Automation
    • Multiple commits show ongoing development of data ingestion, harmonization, and visualization workflows (commit, commit).
  • ODIS Registration
    • Registering datasets with ODIS (using JSON-LD) for broader interoperability (issue).
  • Integration with External Data
    • Ongoing work to load and harmonize diverse ecological datasets (bottle data, larvae, zooplankton, etc.).
  • Spatial Data Management
    • Continued development of AOI (areas of interest), spatial buffer creation, and integration of management regions.
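
The ODIS registration item above relies on JSON-LD metadata. As a rough sketch of what such a record can look like, the example below builds a schema.org `Dataset` description with Python's standard library. It assumes a schema.org-style record is what gets registered; the function `dataset_jsonld`, the chosen properties, and all values (including the example.org URL) are placeholders, not CalCOFI's actual metadata.

```python
import json


def dataset_jsonld(name, description, url, keywords):
    """Build a minimal schema.org Dataset record as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "keywords": list(keywords),
    }


# Placeholder values for illustration only.
record = dataset_jsonld(
    name="Example larval fish dataset",
    description="Placeholder description for illustration only.",
    url="https://example.org/datasets/larvae",
    keywords=["ichthyoplankton", "oceanography"],
)
jsonld_text = json.dumps(record, indent=2)
```

A record like this is typically embedded in or linked from a dataset landing page so that harvesters can discover it; a real registration would carry more properties, such as spatial and temporal coverage.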

8.2.8 Key Themes & Impact

8.2.8.1 Integration & Interoperability

  • Strong focus on connecting API, Apps, R packages, and backend infrastructure for seamless data access and visualization.
  • Enhanced interoperability through ODIS registration and harmonized workflows.

8.2.8.2 Data Accessibility & Usability

  • Improvements to API and Apps make ecological data more accessible to researchers and managers.
  • Expanded support for spatial management areas and ecological datasets.

8.2.8.3 Infrastructure & Sustainability

  • Investments in documentation, backend reliability, and workflow automation contribute to long-term sustainability and reproducibility.

8.2.9 For More Details

  • Some results may be incomplete due to API limits.
  • To view all commits/issues for 2025, visit each repository’s GitHub UI and filter by year.