Add order option to with_temp_table, integrate LargeObject util#52
Add order option to with_temp_table, integrate LargeObject util#52KirIgor wants to merge 2 commits into
Conversation
with_temp_table/each_record now accept order: :asc|:desc (default :desc, backwards compatible) to control primary key iteration direction. Also add UmbrellioUtils::LargeObject and ::LargeObject::Writer for streaming reads/writes to Postgres large objects. Bump version to 1.14.0. Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
tycooon
left a comment
There was a problem hiding this comment.
Review of the order: option + LargeObject util. The order: change itself is clean and preserves the default :desc behavior. Inline comments cover the specific issues; two PR-level notes:
Test coverage. The new LargeObject/Writer feature ships with no specs, and writer.rb wraps the whole class in # :nocov:, excluding its buffer/seek logic from the coverage gate. The sibling Sequel-backed utilities (database.rb, sql.rb) both have specs — and a single round-trip append test with a UTF-8 string would have caught the byte-offset bug flagged inline. Composite-PK + order: :asc (ORDER BY ROW(...)) is also a new, untested SQL shape (behavior looks correct — it mirrors the existing :desc path).
Not a bug, for the record: the oid = -1 default with create! discarding lo_create's return looks broken but is self-consistent — -1 casts to oid 4294967295, so lo_create/lo_open/lo_put/lo_get/lo_unlink and the exists? lookup all target the same object (verified against Postgres). The only caveats: all default-constructed objects share that one OID (a 2nd create! raises AlreadyExists), and #oid reports -1 rather than the real OID.
|
|
||
| def append!(str) | ||
| run(:lo_put, oid, str_offset, Sequel.blob(str)) | ||
| self.str_offset += str.length |
There was a problem hiding this comment.
str.length counts characters, but large objects are byte-addressed and lo_put writes str.bytesize bytes. For multibyte input the offset drifts and later appends overwrite earlier bytes.
Verified against Postgres: append!("café") then append!("x") yields caf\303x (é's 2nd byte overwritten) instead of caféx.
| self.str_offset += str.length | |
| self.str_offset += str.bytesize |
(Writer escapes this via binmode, but append! is public API.)
There was a problem hiding this comment.
Жесть, все это время был баг?)
| temp_table_name: nil, | ||
| transaction: true | ||
| transaction: true, | ||
| order: :desc |
There was a problem hiding this comment.
order isn't validated until ordered runs inside the batch loop — which is after create_temp_table has already created and populated a persistent UNLOGGED table. With no ensure, an invalid order: raises ArgumentError and skips DB.drop_table, orphaning the table (it survives the session, unlike a real TEMP table); repeated bad calls accumulate orphans.
Consider validating order here (or at the top of each_record) before any DB work, e.g. raise ArgumentError unless %i[asc desc].include?(order).
|
|
||
| def write(*) | ||
| sio.write(*) | ||
| flush if sio.string.size >= MAX_BUFFER_SIZE |
There was a problem hiding this comment.
write returns the value of flush if …, which is nil when the buffer is below MAX_BUFFER_SIZE. Since << aliases write, IO-style chaining writer << a << b becomes nil << b → NoMethodError. StringIO#<< returns self; returning self here would match that contract.
| def each_record(dataset, primary_key: nil, **options, &block) | ||
| primary_key = primary_key_from(dataset, primary_key:) | ||
| eager_tables = Array(options.delete(:eager_load)) | ||
| order = options.fetch(:order, :desc) |
There was a problem hiding this comment.
Minor: the :desc default now lives here and again on with_temp_table's order: param. Both are exercised (this one for the record fetch, the param for the batch pop), so they must be kept in sync — changing one default alone would make batch order and within-batch order diverge.
| end | ||
|
|
||
| def delete! | ||
| run(:lo_unlink, oid) if exists? |
There was a problem hiding this comment.
Minor: delete! issues a separate exists? SELECT before lo_unlink, doubling round-trips. lo_unlink already raises on a missing OID, so rescuing that would avoid the extra query.
There was a problem hiding this comment.
Не хотелось рейзить просто) Могу ловить тогда исключение
with_temp_table/each_record now accept order: :asc|:desc (default :desc, backwards compatible) to control primary key iteration direction.
Also add UmbrellioUtils::LargeObject and ::LargeObject::Writer for streaming reads/writes to Postgres large objects.
Bump version to 1.14.0.