Assign deterministic UUIDs using DuckDB-native md5
Source: R/wrangle.R
assign_deterministic_uuids_md5.Rd
Generates UUID-style identifiers from a composite key entirely inside
DuckDB using md5(). Unlike assign_deterministic_uuids(), which pulls
data into R and uses UUID v5 (SHA-1), this runs as a single SQL statement
with no data transferred to R, so it can be orders of magnitude faster on large tables.
Details
The 32-character md5 hex digest is formatted as 8-4-4-4-12 to resemble a UUID string.
These are not RFC 4122 UUIDs; they are md5-based internal identifiers.
Use this for internal-only IDs on large tables (e.g., ctd_measurement).
Use assign_deterministic_uuids() when RFC 4122 UUID v5 compatibility
is needed (e.g., ichthyo tables shared with external systems).
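As a rough illustration of the 8-4-4-4-12 formatting described above, the sketch below hashes a composite key with md5() and slices the hex digest inside DuckDB. The table demo, its columns, and the '|' separator are hypothetical choices made for this example; this is not the package's actual implementation, and it assumes the DBI and duckdb packages are installed.

```r
library(DBI)

# In-memory DuckDB database with a hypothetical table (illustration only).
con <- dbConnect(duckdb::duckdb())
dbExecute(con, "CREATE TABLE demo (cruise_key VARCHAR, cast_id INTEGER, depth_m INTEGER)")
dbExecute(con, "INSERT INTO demo VALUES ('A', 1, 10), ('A', 1, 20), ('A', 1, 10)")

# Hash the composite key, then slice the 32-char hex into 8-4-4-4-12.
# The '|' separator is an assumption; it keeps ('ab', 'c') distinct from ('a', 'bc').
ids <- dbGetQuery(con, "
  SELECT cruise_key, cast_id, depth_m,
         substr(h, 1, 8)  || '-' || substr(h, 9, 4)  || '-' ||
         substr(h, 13, 4) || '-' || substr(h, 17, 4) || '-' ||
         substr(h, 21, 12) AS uuid_like
  FROM (
    SELECT *,
           md5(concat_ws('|', cruise_key,
                         CAST(cast_id AS VARCHAR),
                         CAST(depth_m AS VARCHAR))) AS h
    FROM demo
  ) AS t
  ORDER BY depth_m
")

dbDisconnect(con, shutdown = TRUE)
```

Because the identifier depends only on the key columns, the two rows sharing the key ('A', 1, 10) receive the same uuid_like value, while the row with depth_m = 20 receives a different one. All hashing and string work happens in the SQL statement; only the final result set is returned to R here, for inspection.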