AMR indiv and fungal large files uploads management by anagperal · Pull Request #367 · EyeSeeTea/glass-dev

anagperal · 2025-12-16T11:39:26Z

📌 References

BASED ON CODE FROM #363 by @MatiasArriola

Issue: Closes #8699wradu AMR indiv and fungal large files uploads/deletions management

📝 Implementation

Install papaparse
Read CSV in chunks
Allow importing tracker entities in sync mode
In server uploads for AMR indiv and AMR fungal: use chunks to manage RIS files and save tracker entities in sync mode
Use concurrency when importing in async uploads
Add skipSideEffects true in async uploads
Async upload workflow — one continuous sequence
Add Stop-on-import-error

🔁 Async Uploads Workflow (AMR – Individual / AMR – Fungal)

This document describes, step by step, how the server-side async upload script imports a
primary CSV file into DHIS2. The script processes uploads that were queued in the Datastore,
validating the whole file before importing and sending the data to the server in chunks.

The entry point is the uploadDatasets routine in src/scripts/cliAsyncUploads.ts.

One continuous sequence

Processing the queue

The script takes the pending uploads that still have retry attempts left and processes them
one at a time — there is no parallelism between uploads; all chunking and concurrency
happens inside a single upload.
For each upload it pauses for about half a second, then re-checks that the upload is still
marked pending. If it is not, the upload is skipped.
If still pending, it loads the upload's record, determines whether it is a primary or
secondary file, and finds its module (a missing module is an error).
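
The queue loop described above could look roughly like the sketch below. It is an illustration only, not the code in cliAsyncUploads.ts: the `QueuedUpload` shape and the `isStillPending` and `processUpload` callbacks are hypothetical names standing in for whatever the script actually uses.

```typescript
// Hypothetical shape of a queued item; the real field names may differ.
interface QueuedUpload {
    uploadId: string;
    attempts: number;
}

const sleep = (ms: number): Promise<void> => new Promise(resolve => setTimeout(resolve, ms));

// Processes pending uploads strictly one at a time and returns the IDs it handled.
async function processQueue(
    queue: QueuedUpload[],
    maxAttempts: number,
    isStillPending: (uploadId: string) => Promise<boolean>,
    processUpload: (uploadId: string) => Promise<void>,
    pauseMs = 500
): Promise<string[]> {
    const processed: string[] = [];
    // Only uploads with retry attempts left are considered.
    const candidates = queue.filter(item => item.attempts < maxAttempts);

    for (const item of candidates) {
        await sleep(pauseMs);
        // The status may have changed since the queue was read, so check again.
        if (!(await isStillPending(item.uploadId))) continue;
        await processUpload(item.uploadId);
        processed.push(item.uploadId);
    }
    return processed;
}
```

Awaiting each upload inside the `for` loop is what keeps uploads sequential; any concurrency would live inside `processUpload`.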

Preparing the file

The upload is marked "uploading" and its CSV is downloaded from storage.
The configured upload chunk size is read: the number of rows sent to the server per request.
It comes from the module's Datastore config (default 100; 300 and 500 are currently being tested).
The program's configuration and validation rules are fetched once, up front.
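
As a small illustration of the chunk-size lookup, the sketch below falls back to 100 when the Datastore config has no usable value. The `ModuleConfig` type and its `uploadChunkSize` field are assumed names, not the actual Datastore keys.

```typescript
const DEFAULT_UPLOAD_CHUNK_SIZE = 100;

// Hypothetical shape of the module's Datastore configuration entry.
interface ModuleConfig {
    uploadChunkSize?: number;
}

// Returns the configured chunk size, or the default when it is missing or invalid.
function resolveUploadChunkSize(config: ModuleConfig | undefined): number {
    const size = config?.uploadChunkSize;
    return size !== undefined && Number.isInteger(size) && size > 0 ? size : DEFAULT_UPLOAD_CHUNK_SIZE;
}
```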

Validation pass (whole file, before anything is imported)

The file is read in large blocks of 5,000 rows. Each block is fully validated — custom
checks plus program-rule checks.
This pass is fail-fast: as soon as a blocking error appears, reading stops, an error report
is saved, and the import is skipped entirely. Data is therefore imported only after the whole
file has passed validation.
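
The fail-fast behaviour can be sketched as follows. This is a simplified model under stated assumptions: rows are already in memory (the real script streams the CSV), and `validateBlock` is a hypothetical callback standing in for the custom and program-rule checks.

```typescript
type Row = Record<string, string>;

interface ValidationOutcome {
    canImport: boolean;
    blocksValidated: number;
    blockingErrors: string[];
}

// Validates the file block by block and stops at the first block with blocking errors.
function validateWholeFile(
    rows: Row[],
    blockSize: number,
    validateBlock: (block: Row[], firstRowNumber: number) => string[]
): ValidationOutcome {
    if (blockSize <= 0) throw new Error("blockSize must be positive");

    let blocksValidated = 0;
    for (let start = 0; start < rows.length; start += blockSize) {
        const block = rows.slice(start, start + blockSize);
        const blockingErrors = validateBlock(block, start + 1);
        blocksValidated += 1;
        if (blockingErrors.length > 0) {
            // Fail fast: later blocks are never read and nothing is imported.
            return { canImport: false, blocksValidated, blockingErrors };
        }
    }
    return { canImport: true, blocksValidated, blockingErrors: [] };
}
```

Only a result with `canImport: true` would let the import pass start, which is what keeps a partially valid file from reaching the server.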

Import pass (only if validation passed completely)

The file is read again in the same 5,000-row blocks.
Each block is split into smaller chunks of the configured upload chunk size (300/500).
Those chunks are sent to the server in synchronous mode, 6 at a time, with the server's rule
engine and side effects turned off (the rules were already checked in the validation pass).
The concurrency is batched: six chunks are sent, the script waits for all six to finish, then
sends the next six.
Stop-on-import-error: if any chunk fails, the process stops immediately — no further
chunks or blocks are imported — keeping whatever succeeded so far.
As it goes, it accumulates the results and the IDs of the created records. At the end, if any
records were imported, their ID list is saved to a file and linked to the upload, and the
import summaries are saved.
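
A minimal sketch of the chunking, batched concurrency, and stop-on-error behaviour for one block is shown below. `sendChunk` is a hypothetical callback representing the synchronous tracker import request, and `ChunkResult` is an assumed result shape; neither is taken from the PR's code.

```typescript
interface ChunkResult {
    ok: boolean;
    createdIds: string[];
}

// Splits rows into consecutive chunks of at most `size` rows.
function chunkRows<T>(rows: T[], size: number): T[][] {
    const chunks: T[][] = [];
    for (let start = 0; start < rows.length; start += size) {
        chunks.push(rows.slice(start, start + size));
    }
    return chunks;
}

// Imports one block: sends `concurrency` chunks at once, waits for all of them,
// and stops before the next batch if any chunk in the current batch failed.
async function importBlock<T>(
    block: T[],
    chunkSize: number,
    concurrency: number,
    sendChunk: (chunk: T[]) => Promise<ChunkResult>
): Promise<{ createdIds: string[]; stopped: boolean }> {
    const chunks = chunkRows(block, chunkSize);
    const createdIds: string[] = [];

    for (let i = 0; i < chunks.length; i += concurrency) {
        const batch = chunks.slice(i, i + concurrency);
        // A rejected request counts as a failed chunk so the rest of the batch can settle.
        const results = await Promise.all(
            batch.map(chunk => sendChunk(chunk).catch((): ChunkResult => ({ ok: false, createdIds: [] })))
        );
        for (const result of results) {
            if (result.ok) createdIds.push(...result.createdIds);
        }
        if (results.some(result => !result.ok)) return { createdIds, stopped: true };
    }
    return { createdIds, stopped: false };
}
```

In this sketch the chunks that succeeded in the failing batch are still recorded, matching the "keeping whatever succeeded so far" behaviour; a caller would check `stopped` before reading the next block.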

Finishing the upload

The accumulated results are inspected for blocking errors and whether any rows were actually
imported.
The upload's final status is set accordingly — validated (clean), imported (partial, with
errors), or uploaded (nothing imported) — and the upload is removed from the queue.
If anything failed during the whole sequence, the upload's attempt counter is increased
instead; once it reaches the maximum (3 by default), it is removed from the queue and marked
failed, otherwise it stays for a later run to retry.
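
The final-status and retry rules can be sketched as two small functions. The status strings mirror the names used in this description, but their exact values in the codebase, and the order in which the conditions are checked, are assumptions.

```typescript
type FinalStatus = "validated" | "imported" | "uploaded";

// Maps the accumulated import results to the upload's final status.
function resolveFinalStatus(hasBlockingErrors: boolean, importedRows: number): FinalStatus {
    if (importedRows === 0) return "uploaded"; // nothing imported
    return hasBlockingErrors ? "imported" : "validated"; // partial with errors vs clean
}

interface RetryDecision {
    attempts: number;
    removeFromQueue: boolean;
    markFailed: boolean;
}

// Called when anything failed during the sequence: bump the counter and decide
// whether the upload stays in the queue for a later run.
function registerFailedAttempt(currentAttempts: number, maxAttempts = 3): RetryDecision {
    const attempts = currentAttempts + 1;
    const exhausted = attempts >= maxAttempts;
    return { attempts, removeFromQueue: exhausted, markFailed: exhausted };
}
```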

Secondary files follow the same shape, except AMR – Individual does a trial run first and
only imports for real if the trial finds no blocking errors.

The knobs, in one place

| Knob | Value | Notes |
| --- | --- | --- |
| Uploads | one at a time | No concurrency between uploads. |
| File blocks | 5,000 rows (fixed) | Used for both the validation and import passes. |
| Upload chunk size | 300 / 500 (default 100) | Datastore config; rows per server request. |
| Concurrency | 6 chunks, batched | Currently hardcoded. |
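
To see how these values interact, the sketch below estimates how many requests and batches the import pass needs for a given file. It assumes, as described above, that each block is chunked separately; the function name and the 12,000-row file are purely illustrative.

```typescript
interface ImportEstimate {
    blocks: number;
    requests: number;
    batches: number;
}

// Estimates the work of the import pass: blocks read, requests sent, batches awaited.
function estimateImport(
    totalRows: number,
    blockSize: number,
    chunkSize: number,
    concurrency: number
): ImportEstimate {
    let blocks = 0;
    let requests = 0;
    let batches = 0;
    for (let start = 0; start < totalRows; start += blockSize) {
        const rowsInBlock = Math.min(blockSize, totalRows - start);
        const chunksInBlock = Math.ceil(rowsInBlock / chunkSize);
        blocks += 1;
        requests += chunksInBlock;
        batches += Math.ceil(chunksInBlock / concurrency);
    }
    return { blocks, requests, batches };
}

console.log(estimateImport(12000, 5000, 300, 6));
console.log(estimateImport(12000, 5000, 500, 6));
```

For a hypothetical 12,000-row file, a chunk size of 300 gives 3 blocks, 41 requests, and 8 batches, while 500 gives 3 blocks, 24 requests, and 5 batches.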

📹 Screenshots/Screen capture

🔥 Testing

transform_JPN_INDIV_2023.sh


ifoche · 2026-05-08T10:25:19Z

Task linked: CU-8699wradu BLOCKING AMR-I RIS upload error

anagperal added 2 commits December 16, 2025 12:25

Install paparse

1db8255

Allow sync tracker imports and validate and import AMR indiv and Fung…

1f40d96

…al RIS CSV files in chunks

anagperal changed the title ~~AMR indiv and fungal large files uploads/deletions management~~ AMR indiv and fungal large files uploads management Dec 17, 2025

anagperal added 11 commits December 17, 2025 11:17

Change chunk sizes and add more logs about time

bfc970d

Remove time logs

2dfec99

Fix default date format in SAMPLE_DATE and the number of line when ch…

dd75c95

…unking

Add skipSideEffects true in async uploads

2d5c572

Import TEIs with concurrency instead of sequential for async uploads

6d83376

Add more logs when future error

9e5430b

Get only program metadata once

ec8c845

Create a Map instead of use find

38d571a

Get programRules metadata at the beginning for runProgramRuleValidations

fa37659

Improve code format

9b62582

Add support to node 18 for async upload script for server execution

30b9c0d

Add Stop-on-import-error in async uploads

9cc1924

anagperal wants to merge 14 commits into development from feature/amr-indiv-fungal-uploads-file-chunk
