A template repository for lightweight Python data science analysis projects.
This template is designed for:
- Windows (USACE GFE)
- VS Code
- Miniforge (conda-forge) + mamba
- one environment per repository, defined by `environment.yml`
- notebooks as a first-class analysis workflow
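As an illustration, a minimal `environment.yml` for this kind of project might look like the following. The package list and pinned Python version here are assumptions for the sketch, not the template's actual file:

```yaml
name: analysis
channels:
  - conda-forge
dependencies:
  - python=3.11
  - pandas
  - jupyterlab
```

The environment name at the top is what `conda activate` refers to, so keeping it stable (`analysis`) lets every clone of the template use the same activation command.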
Contents:

- Prerequisites
- Create a new repository from this template
- Clone your new repository
- Create the conda environment
- Open the project in VS Code
- Configure VS Code to use the environment
- Daily workflow
- Project structure
- Managing the environment
Verify Git is installed:
```
git --version
```
If Git is not installed, install it using your organization’s approved method.
Conda manages Python and project dependencies. Miniforge is a minimal, conda-forge–oriented installer that supports per-user installs on Windows.
- Download the Miniforge installer: https://github.com/conda-forge/miniforge/releases
- Run the installer (on Windows, the `.exe`).
- Close and reopen your terminal, then verify:
```
conda --version
mamba --version
```
If mamba is not available, install it into base:
```
conda install -n base mamba -c conda-forge
mamba --version
```
Install VS Code using your organization’s approved method.
Recommended extensions:
- Python (Microsoft)
- Jupyter (Microsoft)
- In GitHub, open this template repository: `MVR-GIS/repo-generic-python`.
- Click **Use this template**.
- Create your new repository:
  - Owner: `MVR-GIS` (or your approved org/user location)
  - Repository name: choose a descriptive name (example: `my-analysis-project`)
  - Visibility: choose per project needs (public or private)
- After GitHub creates your new repository, copy its clone URL (HTTPS).
- Open a terminal (or Miniforge Prompt on Windows if conda is not on your system PATH).
- Navigate to where you store projects:

  ```
  cd ~/Documents
  ```

- Clone your new repository (replace `<your-repo-name>` with your repo name):

  ```
  git clone https://github.com/MVR-GIS/<your-repo-name>.git
  ```

- Move into the project directory:

  ```
  cd <your-repo-name>
  ```
The `environment.yml` file defines the project environment. This template standardizes the environment name as `analysis`.
From the repo root:
```
mamba env create -f environment.yml
conda activate analysis
```
Verify Python runs:
```
python --version
```
From the repo root:
```
code .
```
- Open the Command Palette (Ctrl+Shift+P).
- Run: **Python: Select Interpreter**
- Choose the interpreter for the `analysis` conda environment.
- Pull the latest changes:

  ```
  git pull
  ```

- Activate the environment:

  ```
  conda activate analysis
  ```

- Open VS Code:

  ```
  code .
  ```

If `environment.yml` changed, update your environment:

```
mamba env update -f environment.yml --prune
```
- `notebooks/` — Jupyter notebooks (analysis + narrative)
- `src/` — reusable Python code (helpers, functions, modules)
- `data/` — data (follow your project’s data policy)
- `outputs/` — figures/tables/results produced by analysis
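With this layout, a notebook in `notebooks/` often needs to import helpers from `src/`. One common pattern is to put `src/` on `sys.path` at the top of the notebook. This is an illustrative sketch, not something the template necessarily does for you; the relative layout (notebook running one level below the repo root) and the module name in the comment are assumptions:

```python
import sys
from pathlib import Path

# Assumes the notebook's working directory is notebooks/,
# one level below the repository root.
repo_root = Path.cwd().resolve().parent
src_dir = repo_root / "src"
sys.path.insert(0, str(src_dir))

# Repository-local modules under src/ can now be imported, e.g.:
# from my_helpers import clean_data  # hypothetical module name
```

An alternative is installing `src/` as an editable package, but the `sys.path` approach keeps a lightweight template dependency-free.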
- Add the package to `environment.yml` under `dependencies`.
- Update your local environment:

  ```
  mamba env update -f environment.yml --prune
  ```

- Commit the updated `environment.yml`.
To deactivate the environment:

```
conda deactivate
```

To recreate the environment from scratch:

```
conda deactivate
conda env remove -n analysis
mamba env create -f environment.yml
conda activate analysis
```

To export the exact resolved versions for reference:

```
conda env export > environment-lock.yml
```
This template supports a reproducible “save often” workflow for capturing Foundry Threads chat session history into this repository. The intent is to preserve a transparent audit trail of AI assistance and support continuity across sessions.
Design goals:
- minimal user friction (sensible defaults)
- “save often” updates (rerun throughout the day)
- committed, reviewable artifacts in git
- raw export preserved for audit / re-derivation
- backups created automatically on overwrite
Committed artifacts (in git):

- `dev/sessions/YYYY-MM-DD[_topic].md` — human-readable transcript
  - includes YAML front matter (metadata) at the top of the file
- `dev/sessions/.raw/YYYY-MM-DD[_topic].threads_export.json` — raw Foundry export preserved for audit and re-derivation

Ignored artifacts (not committed):

- `dev/sessions/.backups/` — timestamped backups created automatically when updating an existing transcript
  - this folder is gitignored to prevent repo bloat
In Foundry Threads, use the UI option to export/save the chat as JSON.
Save the export to your OS Downloads folder with the filename `threads_export.json` (the tool assumes that name by default).
From the repository root (the folder containing `environment.yml`), with the `analysis` environment activated:

```
conda activate analysis
```
Run the extractor (default behavior: reads ~/Downloads/threads_export.json):
```
python -m tools.foundry_threads
```
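The default-path behavior described above could be resolved along these lines; this is a hypothetical sketch of the idea, and the actual `tools.foundry_threads` implementation may differ:

```python
from pathlib import Path

DEFAULT_EXPORT_NAME = "threads_export.json"

def default_export_path() -> Path:
    # Resolves to ~/Downloads/threads_export.json regardless of OS,
    # since Path.home() expands the user profile on Windows too.
    return Path.home() / "Downloads" / DEFAULT_EXPORT_NAME

print(default_export_path())
```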
Save-often workflow (recommended):

- export again from Foundry Threads to `threads_export.json`
- rerun the command above
- the tool will overwrite the existing session transcript and create a backup copy
By default, the session transcript file is date-only:
`dev/sessions/YYYY-MM-DD.md`
If you have multiple distinct sessions/topics in the same day, use `--topic`:

```
python -m tools.foundry_threads --topic data_validation
```

This produces:

- `dev/sessions/YYYY-MM-DD_data_validation.md`
- `dev/sessions/.raw/YYYY-MM-DD_data_validation.threads_export.json`
Topic values are sanitized for filenames (special characters converted to underscores).
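The sanitization step can be sketched as a simple character filter. The exact rules the tool applies may differ; this is an illustrative version that replaces anything outside a filename-safe set with underscores:

```python
import re

def sanitize_topic(topic: str) -> str:
    """Replace filename-unsafe characters with underscores
    (hypothetical sketch of the tool's sanitization rule)."""
    return re.sub(r"[^A-Za-z0-9_-]", "_", topic.strip())

print(sanitize_topic("data validation / QA"))  # → data_validation___QA
```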
If the export JSON is not in your Downloads folder, provide an explicit path:
```
python -m tools.foundry_threads --export-path C:\path\to\threads_export.json
```
The transcript .md includes YAML front matter with extracted metadata such as:
- title
- created_at / last_updated_at
- foundry_user_id
- user_display_name (from SENT messages)
- model_display_name (from RECEIVED messages)
- export_sha256 (integrity hash of the raw export JSON)
- message counts
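The `export_sha256` field is an integrity hash of the raw export bytes, so a transcript can later be checked against the `.raw/` file it was derived from. A minimal sketch of how such a hash could be computed (illustrative, not the tool's actual code):

```python
import hashlib
from pathlib import Path

def export_sha256(path: Path) -> str:
    """Hash the raw export file's bytes, suitable for recording
    as an integrity field in the transcript's front matter."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

# Demonstrate on a throwaway file:
tmp = Path("example_export.json")
tmp.write_bytes(b'{"messages": []}')
print(export_sha256(tmp))
tmp.unlink()
```

Re-running the hash on the preserved `.raw/` export and comparing it to the front-matter value confirms the transcript's provenance.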
This repository uses pytest. From the repo root:
```
conda activate analysis
pytest
```
A runbook is available at:
`dev/RUNBOOK_TESTS.md`
Notes:

- `pytest.ini` is included to ensure tests can import repository-local modules consistently.
- Tests validate:
  - transcript generation (YAML front matter + conversation rendering)
  - raw export preservation under `dev/sessions/.raw/`
  - overwrite backups under `dev/sessions/.backups/`
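The kind of check these tests perform can be sketched as a small pytest-style assertion on the transcript's shape. This is a hypothetical example, not the repository's actual test code; the transcript string below is an assumed minimal shape:

```python
def test_front_matter_opens_and_closes():
    # A minimal transcript shape: YAML front matter delimited by "---",
    # followed by the rendered conversation body.
    transcript = "---\ntitle: Example session\n---\n\n# Conversation\n"
    lines = transcript.splitlines()
    assert lines[0] == "---"        # front matter opens the file
    assert "---" in lines[1:]       # and is closed before the body begins

test_front_matter_opens_and_closes()
```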