Update ingestion pipeline to handle timeseries table by vish-cs · Pull Request #528 · datacommonsorg/import

vish-cs · 2026-06-08T11:03:30Z

DataCommons Spanner Graph is migrating to a new schema that introduces separate TimeSeries and Observation tables in place of the current Observation table. This allows removing the Observation proto used to store the timeseries. This PR updates the ingestion pipeline to populate the tables in the new schema. To do this, we eliminate the step that builds the optimized graph (timeseries) from the input observations. Instead, we create timeseries and observation mutations directly from each input observation. This avoids the costly group-by operation on the observations.
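A minimal sketch of the per-observation approach described above, written in Python for brevity and not taken from the PR. The `InputObservation` fields, the `series_id` hashing scheme, the row column names, and the assumption that writes behave like upserts (so repeating the parent TimeSeries row is harmless) are all hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List


@dataclass(frozen=True)
class InputObservation:
    # Hypothetical input shape; the real pipeline's fields are not shown here.
    variable: str
    entity: str
    provenance: str
    date: str
    value: str


def series_id(obs: InputObservation) -> str:
    """Derive a deterministic id from the fields assumed to identify a series."""
    key = "|".join([obs.variable, obs.entity, obs.provenance])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]


def to_mutations(obs: InputObservation) -> Iterator[Dict[str, str]]:
    """Yield one TimeSeries row and one Observation row for a single input."""
    sid = series_id(obs)
    metadata = {
        "variable": obs.variable,
        "entity": obs.entity,
        "provenance": obs.provenance,
    }
    # Parent row: series metadata stored as JSON (assumed column layout).
    yield {
        "table": "TimeSeries",
        "id": sid,
        "metadata": json.dumps(metadata, sort_keys=True),
    }
    # Child row: one data point keyed by the parent id plus the date.
    yield {"table": "Observation", "id": sid, "date": obs.date, "value": obs.value}


def ingest(observations: Iterable[InputObservation]) -> List[Dict[str, str]]:
    """Map each observation independently; no grouping by series is needed."""
    rows: List[Dict[str, str]] = []
    for obs in observations:
        rows.extend(to_mutations(obs))
    return rows
```

Because each observation is handled on its own, two points from the same series both emit the same TimeSeries row. In this sketch that is the price of skipping the group-by, and it only works if the storage layer treats the repeated parent write as an idempotent upsert.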

codacy-production · 2026-06-08T11:05:35Z

Not up to standards ⛔

🔴 Issues 3 high · 2 medium · 5 minor

Alerts:
⚠ 10 issues (≤ 0 issues of at least minor severity)

Results:
10 new issues

Category Results

Compatibility 3 high

CodeStyle 5 minor

Complexity 2 medium

🟢 Metrics 70 complexity · 6 duplication

Metric Results

Complexity 70

Duplication 6


vish-cs · 2026-06-09T08:52:19Z

/gemini review

gemini-code-assist

Code Review

This pull request migrates the Spanner storage schema and ingestion pipeline from a protobuf-based Observations column to a relational model featuring a TimeSeries table (storing metadata as JSON) and an interleaved Observation table for individual data points. Feedback focuses on performance and maintainability improvements: optimizing GraphReader.extractObservations by constructing the TimeSeriesKey once outside the loop rather than for every data point, refactoring SpannerClient.toObservationKVMutations to reuse toTimeSeriesMutation to eliminate duplicate mutation-building logic, and simplifying toObservation to accept TimeSeriesKey directly while updating the corresponding tests.
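The first and third suggestions in this review can be illustrated with a short Python sketch. It is an illustration of the general pattern only, not the code under review: `TimeSeriesKey`, `to_observation`, and `extract_observations` here are hypothetical stand-ins whose fields and signatures are assumed.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class TimeSeriesKey:
    # Hypothetical key fields; the real class is not shown in this review.
    variable: str
    entity: str
    provenance: str


def to_observation(key: TimeSeriesKey, date: str, value: str) -> Tuple[TimeSeriesKey, str, str]:
    """Accept the key directly instead of its individual components."""
    return (key, date, value)


def extract_observations(
    variable: str, entity: str, provenance: str, points: Dict[str, str]
) -> List[Tuple[TimeSeriesKey, str, str]]:
    """Build the series key once, then reuse it for every data point."""
    key = TimeSeriesKey(variable, entity, provenance)  # hoisted out of the loop
    return [to_observation(key, date, value) for date, value in sorted(points.items())]
```

Hoisting the key means a series with many data points allocates one key object instead of one per point, and passing the key directly keeps the helper's parameter list short.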


n-h-diaz

Thanks!

I had a few questions, mostly about what the final schema looks like (I was following the schema in datcom-store.dc-graph-staging.dc_graph).


vish-cs force-pushed the schema branch 4 times, most recently from 7058981 to 65a60e7, on June 9, 2026 08:38

vish-cs requested a review from rohitkumarbhagat on June 9, 2026 08:52

gemini-code-assist Bot reviewed on Jun 9, 2026

vish-cs force-pushed the schema branch 5 times, most recently from 29592ac to f8292be, on June 10, 2026 08:29

vish-cs requested a review from n-h-diaz on June 10, 2026 15:53

n-h-diaz reviewed on Jun 10, 2026

rohitkumarbhagat approved these changes on Jun 11, 2026

Commit 626748c: Update ingestion pipeline to handle timeseries table

vish-cs force-pushed the schema branch from f8292be to 2c55ce0 on June 11, 2026 12:54

Commit 3bdfe4d: Address reviewer comments

vish-cs force-pushed the schema branch from 2c55ce0 to 3bdfe4d on June 11, 2026 12:59

vish-cs merged commit 6cb7571 into datacommonsorg:master on Jun 12, 2026
9 of 10 checks passed
