Compares local CSV files with a GCS archive to detect changes. The md5 hash is the primary comparison when available; file size is the fallback when md5 is unavailable (e.g., gcloud CLI fallback).
Usage
compare_local_vs_archive(
  dir_csv,
  archive_timestamp,
  provider,
  dataset,
  gcs_bucket = "calcofi-files-public",
  archive_prefix = "archive"
)
Value
List with:
matches: Logical, TRUE if local matches archive
local_manifest: Tibble of local files
archive_manifest: Tibble of archive files
added: Files in local but not archive
removed: Files in archive but not local
changed: Files with different content (md5) or size
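The md5-first, size-fallback comparison described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's actual implementation: `compare_manifests()` is a hypothetical helper, and the manifest columns `path`, `md5`, and `size` are assumed names.

```r
# Hypothetical helper: compare two manifests given as data frames with the
# assumed columns `path`, `md5` (NA when unavailable), and `size`.
compare_manifests <- function(local, archive) {
  added   <- setdiff(local$path, archive$path)    # in local, not in archive
  removed <- setdiff(archive$path, local$path)    # in archive, not in local
  common  <- intersect(local$path, archive$path)  # present on both sides

  l <- local[match(common, local$path), , drop = FALSE]
  a <- archive[match(common, archive$path), , drop = FALSE]

  # Compare md5 where both sides have it; otherwise fall back to file size.
  has_md5 <- !is.na(l$md5) & !is.na(a$md5)
  differs <- ifelse(has_md5, l$md5 != a$md5, l$size != a$size)

  # Assumption: if neither md5 nor size can be compared, treat as changed.
  differs[is.na(differs)] <- TRUE
  changed <- common[differs]

  list(
    matches = length(added) == 0 && length(removed) == 0 &&
      length(changed) == 0,
    added   = added,
    removed = removed,
    changed = changed
  )
}

# Made-up manifests for illustration only.
local <- data.frame(
  path = c("a.csv", "b.csv", "c.csv", "e.csv"),
  md5  = c("111", "222", NA, "555"),
  size = c(10, 20, 30, 50),
  stringsAsFactors = FALSE
)
archive <- data.frame(
  path = c("a.csv", "b.csv", "d.csv", "e.csv"),
  md5  = c("111", "999", "444", NA),
  size = c(10, 20, 40, 55),
  stringsAsFactors = FALSE
)

res <- compare_manifests(local, archive)
```

In this made-up data, `b.csv` is flagged through its md5 and `e.csv` through its size, because the archive side has no md5 for it.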
Examples
if (FALSE) { # \dontrun{
comparison <- compare_local_vs_archive(
  dir_csv           = "/path/to/csv",
  archive_timestamp = "2026-02-02_121557",
  provider          = "swfsc.noaa.gov",
  dataset           = "calcofi-db")
if (!comparison$matches) {
  message("Local files have changed since archive")
}
} # }