Current Problem
The Cesium tutorial (parquet_cesium.qmd) works with the OpenContext Parquet file but fails with the full iSamples Zenodo dataset, throwing "RuntimeError: data is not iterable".
Root Cause: Schema Mismatch
The tutorial hardcodes a query against OpenContext's graph-based schema, so it fails on the flattened schema of the full iSamples export:
OpenContext Schema (Working)
- File: oc_isamples_pqg.parquet
- Table: nodes
- Coordinates: latitude, longitude
- Filter: otype='GeospatialCoordLocation'
- Query: SELECT pid, latitude, longitude FROM nodes WHERE otype='GeospatialCoordLocation'
Full iSamples Schema (Failing)
- File: isamples_export_2025_04_21_16_23_46_geo.parquet
- Table: isamples_data (or direct access)
- Coordinates: sample_location_latitude, sample_location_longitude
- Filter: No otype filter needed
- Query: SELECT sample_location_longitude, sample_location_latitude FROM isamples_data
Current Hardcoded Implementation
// This only works for OpenContext schema
const query = `SELECT pid, latitude, longitude FROM nodes WHERE otype='GeospatialCoordLocation'`;
Proposed Solution
Implement schema detection and adaptive querying similar to the zenodo_isamples_analysis.qmd approach:
- Schema Detection: Probe the file to determine available tables and columns
- Adaptive Queries: Use different query patterns based on detected schema
- Unified Interface: Present same functionality regardless of underlying schema
- Error Handling: Graceful fallbacks when schema detection fails
Implementation Approach
// Detect available schema.
// Each probe throws if the table or column is missing; LIMIT 1 on a plain
// SELECT lets the probe stop at the first matching row.
async function detectSchema(db) {
  // Try OpenContext schema first
  try {
    await db.query("SELECT 1 FROM nodes WHERE otype='GeospatialCoordLocation' LIMIT 1");
    return 'opencontext';
  } catch {
    // Try iSamples schema
    try {
      await db.query("SELECT 1 FROM isamples_data WHERE sample_location_latitude IS NOT NULL LIMIT 1");
      return 'isamples';
    } catch {
      return 'unknown';
    }
  }
}
// Adaptive query builder
function buildLocationQuery(schema) {
  switch (schema) {
    case 'opencontext':
      return "SELECT pid, latitude, longitude FROM nodes WHERE otype='GeospatialCoordLocation'";
    case 'isamples':
      return "SELECT sample_identifier as pid, sample_location_latitude as latitude, sample_location_longitude as longitude FROM isamples_data WHERE sample_location_latitude IS NOT NULL";
    default:
      throw new Error('Unsupported schema');
  }
}
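The "Unified Interface" and "Error Handling" points could be covered by a thin wrapper around the two functions above. This is only a sketch under assumptions: loadLocations is a hypothetical name, detect and build are passed in as parameters (standing in for detectSchema and buildLocationQuery), and db.query(sql) is assumed to return a promise that resolves to an array or another iterable of row objects.

```javascript
// Sketch only: `detect` and `build` stand in for detectSchema and
// buildLocationQuery; `db.query(sql)` is assumed to return a promise.
async function loadLocations(db, detect, build) {
  let schema = 'unknown';
  try {
    schema = await detect(db);
    if (schema === 'unknown') {
      // No supported schema: report it instead of throwing mid-render.
      return { schema, rows: [], error: 'No supported location schema found' };
    }
    const result = await db.query(build(schema));
    // Assumption: the result is an array or another iterable of rows.
    // Normalizing here means callers can always iterate over `rows`.
    const rows = Array.isArray(result) ? result : Array.from(result ?? []);
    return { schema, rows, error: null };
  } catch (err) {
    // Graceful fallback: surface the message and return an empty row list.
    const message = err instanceof Error ? err.message : String(err);
    return { schema, rows: [], error: message };
  }
}
```

A caller would then use loadLocations(db, detectSchema, buildLocationQuery) and check error before rendering, so the Cesium code only ever sees rows with pid, latitude, and longitude.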
Benefits
- Universal compatibility with different iSamples Parquet formats
- Better user experience - any valid iSamples Parquet file should work
- Future-proof - easier to add new schema support
- Consistent tutorial behavior across different data sources
Files to Update
- tutorials/parquet_cesium.qmd - Primary implementation
- tutorials/parquet_cesium_split.qmd - Apply same pattern
- Consider extracting common schema detection into shared utility
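One possible shape for the shared utility is a small schema registry that both tutorials import. The sketch below is an assumption, not existing code: SCHEMAS and matchSchema are hypothetical names, and the column list is expected to come from the caller (for example from a DuckDB DESCRIBE query on the table). Table names, column names, and queries are taken from this issue.

```javascript
// Hypothetical shared registry: one entry per supported Parquet layout.
const SCHEMAS = [
  {
    name: 'opencontext',
    table: 'nodes',
    required: ['pid', 'latitude', 'longitude', 'otype'],
    query: "SELECT pid, latitude, longitude FROM nodes WHERE otype='GeospatialCoordLocation'",
  },
  {
    name: 'isamples',
    table: 'isamples_data',
    required: ['sample_identifier', 'sample_location_latitude', 'sample_location_longitude'],
    query: "SELECT sample_identifier AS pid, sample_location_latitude AS latitude, sample_location_longitude AS longitude FROM isamples_data WHERE sample_location_latitude IS NOT NULL",
  },
];

// Return the first registry entry whose table name matches and whose
// required columns are all present, or null when nothing matches.
function matchSchema(table, columns) {
  const have = new Set(columns);
  const hit = SCHEMAS.find(
    (s) => s.table === table && s.required.every((c) => have.has(c))
  );
  return hit ?? null;
}
```

Supporting a new layout would then mean adding one entry to SCHEMAS rather than editing each tutorial.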
Priority
Medium - affects tutorial usability with the primary iSamples dataset sources.