feat: PostgreSQL table driver behind the ACE ABI (builds on #18)#22
feat: PostgreSQL table driver behind the ACE ABI (builds on #18)#22Admnwk wants to merge 9 commits into
Conversation
Wire Mg telemetry + Studio HTTP sessions. See tools/openmonitor/README.md.
libpq backend with postgresql:// URI, parameterized queries, E2E doctest (seed via OPENADS_TEST_PG_URI), and NMake build script. No third-party SQL amalgamation.
Add pgsql:// URI alias and compile-time disabled rejection. Fix doctest tests: default local PG URI, correct OpenIndex array capacity.
Eliminates libgcc_s_seh-1.dll dependency; add run_postgres_tests.bat wrapper.
ctid is not a stable row identifier (it moves on UPDATE/VACUUM), so it is unsafe as the cursor key for write-back. Switch open_table snapshot, load_current_row and seek_index to a primary-key snapshot mirroring the MariaDB backend, binding values with parameterized PQexecParams. PK columns are discovered from information_schema. ABI read+seek e2e green against PostgreSQL 17.10. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Build/test helper scripts no longer hard-code an environment-specific toolchain root fallback; they require OPENADS_TOOLCHAIN_ROOT to point at the directory holding the msvc, winlibs and pgsql/mariadb client dependencies. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
This pull request introduces a PostgreSQL-backed table driver (libpq) to OpenADS, enabling read, navigation, and seek capabilities via the ACE ABI, alongside comprehensive unit tests and build scripts. It also adds openmonitor, a Rust-based TUI and web dashboard for monitoring server telemetry. The review feedback highlights several critical areas for improvement: resolving potential SQL injection and syntax errors by escaping double quotes in quote_ident, replacing SELECT * with explicit column lists to prevent schema mismatches, ensuring deterministic seek behavior on duplicate keys by ordering by primary keys, and optimizing string allocations in field_index_ci. Additionally, for the openmonitor tool, the reviewer recommends enforcing a maximum frame size limit in read_frame to prevent out-of-memory crashes and clamping the selected session index to avoid out-of-bounds errors when the session list shrinks.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| const std::string sql = | ||
| "SELECT * FROM " + quote_ident(tbl->name) + " WHERE " + | ||
| pk_where_clause(*tbl); |
There was a problem hiding this comment.
Using SELECT * is fragile and can lead to data corruption or mismatches if columns are added/dropped concurrently or if physical column ordering differs from ordinal_position. Explicitly listing the columns from tbl->fields guarantees that the returned columns match the cached schema exactly.
| const std::string sql = | |
| "SELECT * FROM " + quote_ident(tbl->name) + " WHERE " + | |
| pk_where_clause(*tbl); | |
| std::string cols_list; | |
| for (std::size_t i = 0; i < tbl->fields.size(); ++i) { | |
| if (i > 0) cols_list += ", "; | |
| cols_list += quote_ident(tbl->fields[i].name); | |
| } | |
| const std::string sql = | |
| "SELECT " + cols_list + " FROM " + quote_ident(tbl->name) + " WHERE " + | |
| pk_where_clause(*tbl); |
| std::string sql; | ||
| if (last_key) { | ||
| sql = soft | ||
| ? "SELECT " + sel + from + qcol + " <= $1 ORDER BY " + qcol + | ||
| " DESC LIMIT 1" | ||
| : "SELECT " + sel + from + qcol + " = $1 ORDER BY " + qcol + | ||
| " DESC LIMIT 1"; | ||
| } else { | ||
| sql = soft | ||
| ? "SELECT " + sel + from + qcol + " >= $1 ORDER BY " + qcol + | ||
| " ASC LIMIT 1" | ||
| : "SELECT " + sel + from + qcol + " = $1 LIMIT 1"; | ||
| } |
There was a problem hiding this comment.
When seeking with duplicate keys, the query can return any matching row non-deterministically if there is no ORDER BY clause or if the ORDER BY clause doesn't break ties. To guarantee correct xBase/ISAM behavior (positioning at the first or last matching record), we must order by the primary key columns as well.
std::string pk_order_asc;
std::string pk_order_desc;
for (std::size_t i = 0; i < tbl->pk_columns.size(); ++i) {
if (i > 0) {
pk_order_asc += ", ";
pk_order_desc += ", ";
}
std::string qpk = quote_ident(tbl->pk_columns[i]);
pk_order_asc += qpk + " ASC";
pk_order_desc += qpk + " DESC";
}
std::string sql;
if (last_key) {
sql = soft
? "SELECT " + sel + from + qcol + " <= $1 ORDER BY " + qcol + " DESC, " + pk_order_desc + " LIMIT 1"
: "SELECT " + sel + from + qcol + " = $1 ORDER BY " + pk_order_desc + " LIMIT 1";
} else {
sql = soft
? "SELECT " + sel + from + qcol + " >= $1 ORDER BY " + qcol + " ASC, " + pk_order_asc + " LIMIT 1"
: "SELECT " + sel + from + qcol + " = $1 ORDER BY " + pk_order_asc + " LIMIT 1";
}| std::string quote_ident(const std::string& name) { | ||
| return '"' + name + '"'; | ||
| } |
There was a problem hiding this comment.
Double quotes inside identifiers are not escaped, which can lead to SQL syntax errors or potential SQL injection if metadata contains double quotes. Doubling the double quotes is the standard way to escape them in PostgreSQL double-quoted identifiers.
| std::string quote_ident(const std::string& name) { | |
| return '"' + name + '"'; | |
| } | |
| std::string quote_ident(const std::string& name) { | |
| std::string escaped; | |
| escaped.reserve(name.size() + 2); | |
| escaped += '"'; | |
| for (char c : name) { | |
| if (c == '"') { | |
| escaped += "\"\""; | |
| } else { | |
| escaped += c; | |
| } | |
| } | |
| escaped += '"'; | |
| return escaped; | |
| } |
| std::size_t field_index_ci(const PostgresTable& tbl, const std::string& name) { | ||
| std::string want = name; | ||
| for (auto& c : want) { | ||
| c = static_cast<char>(std::toupper(static_cast<unsigned char>(c))); | ||
| } | ||
| for (std::size_t i = 0; i < tbl.fields.size(); ++i) { | ||
| std::string have = tbl.fields[i].name; | ||
| for (auto& c : have) { | ||
| c = static_cast<char>(std::toupper(static_cast<unsigned char>(c))); | ||
| } | ||
| if (have == want) return i; | ||
| } | ||
| return static_cast<std::size_t>(-1); | ||
| } |
There was a problem hiding this comment.
Converting tbl.fields[i].name to uppercase inside the loop allocates a new std::string and performs uppercase conversion for every field on every call. Since field_index_ci is called frequently during row reading, this is highly inefficient. We can optimize it to perform zero string allocations inside the loop.
std::size_t field_index_ci(const PostgresTable& tbl, const std::string& name) {
std::string want = name;
for (auto& c : want) {
c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
}
for (std::size_t i = 0; i < tbl.fields.size(); ++i) {
const std::string& have = tbl.fields[i].name;
if (have.size() != want.size()) continue;
bool match = true;
for (std::size_t j = 0; j < have.size(); ++j) {
if (static_cast<char>(std::toupper(static_cast<unsigned char>(have[j]))) != want[j]) {
match = false;
break;
}
}
if (match) return i;
}
return static_cast<std::size_t>(-1);
}| fn read_frame(stream: &mut TcpStream) -> Result<(u8, Vec<u8>)> { | ||
| let mut header = [0u8; 5]; | ||
| stream | ||
| .read_exact(&mut header) | ||
| .context("read frame header")?; | ||
| let len = u32::from_be_bytes([header[0], header[1], header[2], header[3]]) as usize; | ||
| let opcode = header[4]; | ||
| let mut payload = vec![0u8; len]; | ||
| if len > 0 { | ||
| stream | ||
| .read_exact(&mut payload) | ||
| .context("read frame payload")?; | ||
| } | ||
| Ok((opcode, payload)) |
There was a problem hiding this comment.
The frame length len is read directly from the network header and used to allocate a Vec of that size. If a corrupted or malicious packet specifies a huge length (e.g., 4GB), the application will attempt to allocate a massive vector and crash due to Out of Memory (OOM). We should enforce a reasonable maximum frame size limit (e.g., 16MB) before allocating.
fn read_frame(stream: &mut TcpStream) -> Result<(u8, Vec<u8>)> {
let mut header = [0u8; 5];
stream
.read_exact(&mut header)
.context("read frame header")?;
let len = u32::from_be_bytes([header[0], header[1], header[2], header[3]]) as usize;
// Prevent out-of-memory crashes from corrupted or malicious frame lengths
if len > 16 * 1024 * 1024 {
return Err(anyhow!("frame length {} exceeds maximum limit of 16MB", len));
}
let opcode = header[4];
let mut payload = vec![0u8; len];
if len > 0 {
stream
.read_exact(&mut payload)
.context("read frame payload")?;
}
Ok((opcode, payload))
}| impl TuiState { | ||
| fn refresh(&mut self, cfg: &MonitorConfig) { | ||
| self.inner.refresh(cfg); | ||
| self.last_refresh = Instant::now(); | ||
| } | ||
| } |
There was a problem hiding this comment.
If the HTTP sessions list shrinks during a background refresh, app.selected_session can become out of bounds. While the rendering code is safe, subsequent key presses (like Up or k) will operate on an invalid index or show incorrect state. Clamping selected_session to the current sessions list size ensures state consistency.
fn refresh(&mut self, cfg: &MonitorConfig) {
self.inner.refresh(cfg);
self.last_refresh = Instant::now();
if self.inner.http.sessions.is_empty() {
self.selected_session = 0;
} else if self.selected_session >= self.inner.http.sessions.len() {
self.selected_session = self.inner.http.sessions.len() - 1;
}
}- load_current_row selects an explicit column list in schema order instead of SELECT *, so current_row stays aligned with the cached field order even if physical and logical column order differ. - seek_index ORDER BY now appends the primary key as a tie-breaker, making duplicate-key seeks deterministic. - quote_ident doubles any embedded double-quote (identifier safety). - field_index_ci compares case-insensitively without allocating a string per field. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Addressed all four findings on the PostgreSQL backend: explicit column list in schema order (no more |
|
The automated review comments here were generated against an earlier revision — the current HEAD (
|
|
Heads-up: this inline-dispatch version conflicts pairwise with the other backend PRs (each adds its own block to the same ~17 ABI functions, so combining N backends produces ~17xN conflicting hunks in ace_exports.cpp). #31 supersedes that approach: it lifts this driver onto a small pluggable backend-ops registry so each backend becomes one ops struct + one registration line, the 17 ABI functions stay backend-agnostic, and the native / tcp:// path is the unchanged fall-through. #31 bundles SQLite + PostgreSQL + MariaDB + ODBC together, all behavior-preserving and e2e-verified. Suggested path: merge #31 as the combined line, or land the registry first and rebase this PR onto it. (PostgreSQL e2e there: 41/41.) |
Adds an optional PostgreSQL backend to the SQL-backend family started in #18, exposed through the ACE ABI so an existing Harbour/
rddadsapp can open apostgresql://connection and navigate tables (USE/SKIP/GO TOP/GO BOTTOM/SEEK/field reads) without recompiling.libpq.ctid), bound with parameterizedPQexecParams, so positioning stays stable across UPDATE/VACUUM and is safe for future write-back.OPENADS_TEST_PG_URI, defaultpostgresql://postgres@127.0.0.1:5433/postgres); verified green against PostgreSQL 17.Part of the optional SQL-backend series (#18 SQLite, #19 SQLCipher, #21 SQL passthrough). Build:
tools/scripts/build_msvc_x64_postgres.batwith-DOPENADS_WITH_POSTGRESQL=ON.