datacarpentry · ShebMichel · Jun 19, 2026 · May 18, 2026
diff --git a/episodes/03-transform.md b/episodes/03-transform.md
@@ -9,15 +9,15 @@ exercises: 5
 - Select rows and columns from an Astropy `Table`.
 - Use Matplotlib to make a scatter plot.
 - Use Gala to transform coordinates.
-- Make a Pandas `DataFrame` and use a Boolean `Series` to select rows.
+- Make a pandas `DataFrame` and use a Boolean `Series` to select rows.
 - Save a `DataFrame` in an HDF5 file.
 
 ::::::::::::::::::::::::::::::::::::::::::::::::::
 
 :::::::::::::::::::::::::::::::::::::::: questions
 
 - How do we make scatter plots in Matplotlib?
-- How do we store data in a Pandas `DataFrame`?
+- How do we store data in a pandas `DataFrame`?
 
 ::::::::::::::::::::::::::::::::::::::::::::::::::
 
@@ -40,7 +40,7 @@ analysis, identifying stars with the proper motion we expect for GD-1.
 2. Then we will transform the coordinates and proper motion data from
   ICRS back to the coordinate frame of GD-1.
 
-3. We will put those results into a Pandas `DataFrame`.
+3. We will put those results into a pandas `DataFrame`.
 
 
 ::::::::::::::::::::::::::::::::::::::::::::::::::
@@ -471,7 +471,7 @@ We started with a rectangle in the GD-1 frame.  When
 transformed to the ICRS frame, it is a non-rectangular region.  Now,
 transformed back to the GD-1 frame, it is a rectangle again.
 
-## Pandas DataFrame
+## pandas DataFrame
 
 At this point we have two objects containing different sets of the
 data relating to identifying stars in GD-1.  `polygon_results` is the Astropy `Table` we downloaded from Gaia.
@@ -563,33 +563,33 @@ We could have: `proper_motion` contains the same data as
 
 :::::::::::::::::::::::::::::::::::::::::  callout
 
-## Pandas `DataFrame`s versus Astropy `Table`s
+## pandas `DataFrame`s versus Astropy `Table`s
 
-Two common choices are the Pandas `DataFrame` and Astropy `Table`.
-Pandas `DataFrame`s and Astropy `Table`s share many of the same characteristics
+Two common choices are the pandas `DataFrame` and Astropy `Table`.
+pandas `DataFrame`s and Astropy `Table`s share many of the same characteristics
 and most of the manipulations that we do can be done with either.  As you become
 more familiar with each, you will develop a sense of which one you prefer for
 different tasks.  For instance you may choose to use Astropy `Table`s to read
-in data, especially astronomy specific data formats, but Pandas `DataFrame`s to
+in data, especially astronomy-specific data formats, but pandas `DataFrame`s to
 inspect the data. Fortunately, Astropy makes it easy to convert between the
-two data types. We will choose to use Pandas `DataFrame`, for two reasons:
+two data types. We will choose to use pandas `DataFrame`, for two reasons:
 
 1. It provides capabilities that are (almost) a superset of the other data
   structures, so it is the all-in-one solution.
 
-2. Pandas is a general-purpose tool that is useful in many domains,
+2. pandas is a general-purpose tool that is useful in many domains,
   especially data science.  If you are going to develop expertise in one
-  tool, Pandas is a good choice.
+  tool, pandas is a good choice.
 
-However, compared to an Astropy `Table`, Pandas has one big drawback:
+However, compared to an Astropy `Table`, pandas has one big drawback:
 it does not keep the metadata associated with the table, including the
 units for the columns.  Nevertheless, we think it's a useful data type
 to be familiar with.
 
 
 ::::::::::::::::::::::::::::::::::::::::::::::::::
 
-It is straightforward to convert an Astropy `Table` to a Pandas `DataFrame`.
+It is straightforward to convert an Astropy `Table` to a pandas `DataFrame`.
 
 ```python
 import pandas as pd
@@ -642,7 +642,7 @@ and consolidate them into a single function that we can use to take the
 coordinates and proper motion that we get as an Astropy `Table` from our
 Gaia query, add columns representing the reflex corrected
 GD-1 coordinates and proper motions, and transform it into a
-Pandas `DataFrame`.
+pandas `DataFrame`.
 This is a general function that we will use multiple times as we build different
 queries so we want to write it once and then call the function rather than having
 to copy and paste the code over and over again.
@@ -653,7 +653,7 @@ def make_dataframe(table):
 
     table: Astropy Table
 
-    returns: Pandas DataFrame
+    returns: pandas DataFrame
     """
     #Create a SkyCoord object with the coordinates and proper motions
     # in the input table
@@ -696,7 +696,7 @@ results_df = make_dataframe(polygon_results)
 
 At this point we have run a successful query and combined the results into a single `DataFrame`. This is a good time to save the data.
 
-To save a Pandas `DataFrame`, one option is to convert it to an
+To save a pandas `DataFrame`, one option is to convert it to an
 Astropy `Table`, like this:
 
 ```python
@@ -713,7 +713,7 @@ astropy.table.table.Table
 Then we could write the `Table` to a FITS file, as we did in the
 previous lesson.
 
-But, like Astropy, Pandas provides functions to write DataFrames in other formats; to
+But, like Astropy, pandas provides functions to write DataFrames in other formats; to
 see what they are [find the functions here that begin with
 `to_`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html).
 
@@ -733,10 +733,10 @@ And HDF5 stores the metadata associated with the table, including
 column names, row labels, and data types (like FITS).
 
 Finally, HDF5 is a cross-language standard, so if you write an HDF5
-file with Pandas, you can read it back with many other software tools
+file with pandas, you can read it back with many other software tools
 (more than FITS).
 
-We can write a Pandas `DataFrame` to an HDF5 file like this:
+We can write a pandas `DataFrame` to an HDF5 file like this:
 
 ```python
 filename = 'gd1_data.hdf'
@@ -760,19 +760,19 @@ file if it already exists rather than append another dataset to it.
 In this episode, we re-loaded the Gaia data we saved from a previous query.
 
 We transformed the coordinates and proper motion from ICRS to a frame
-aligned with the orbit of GD-1, stored the results in a Pandas
+aligned with the orbit of GD-1, stored the results in a pandas
 `DataFrame`, and visualized them.
 
-We combined all of these steps into a single function that we can reuse in the future to go straight from the output of a query with object coordinates in the ICRS reference frame directly to a Pandas DataFrame that includes object coordinates in the GD-1 reference frame.
+We combined all of these steps into a single function that we can reuse in the future to go straight from the output of a query with object coordinates in the ICRS reference frame directly to a pandas DataFrame that includes object coordinates in the GD-1 reference frame.
 
 We saved our results to an HDF5 file which we can use to restart the analysis from this stage or verify our results at some future time.
 
 :::::::::::::::::::::::::::::::::::::::: keypoints
 
 - When you make a scatter plot, adjust the size of the markers and their transparency so the figure is not overplotted; otherwise it can misrepresent the data badly.
 - For simple scatter plots in Matplotlib, `plot` is faster than `scatter`.
-- An Astropy `Table` and a Pandas `DataFrame` are similar in many ways and they provide many of the same functions.  They have pros and cons, but for many projects, either one would be a reasonable choice.
-- To store data from a Pandas `DataFrame`, a good option is an HDF5 file, which can contain multiple Datasets (we'll dig in more in the Join lesson).
+- An Astropy `Table` and a pandas `DataFrame` are similar in many ways and they provide many of the same functions.  They have pros and cons, but for many projects, either one would be a reasonable choice.
+- To store data from a pandas `DataFrame`, a good option is an HDF5 file, which can contain multiple Datasets (we'll dig in more in the Join lesson).
 
 ::::::::::::::::::::::::::::::::::::::::::::::::::
 

diff --git a/episodes/04-motion.md b/episodes/04-motion.md
@@ -1,12 +1,12 @@
 ---
-title: Plotting and Pandas
+title: Plotting and pandas
 teaching: 50
 exercises: 15
 ---
 
 ::::::::::::::::::::::::::::::::::::::: objectives
 
-- Use a Boolean Pandas `Series` to select rows in a `DataFrame`.
+- Use a Boolean pandas `Series` to select rows in a `DataFrame`.
 - Save multiple `DataFrame`s in an HDF5 file.
 
 ::::::::::::::::::::::::::::::::::::::::::::::::::
@@ -30,7 +30,7 @@ analysis, identifying stars with the proper motion we expect for GD-1.
 
 ## Outline
 
-1. We will put those results into a Pandas `DataFrame`, which we will use
+1. We will put those results into a pandas `DataFrame`, which we will use
   to select stars near the centerline of GD-1.
 
 2. Plotting the proper motion of those stars, we will identify a region
@@ -88,7 +88,7 @@ results_df = pd.read_hdf(filename, 'results_df')
 
 ## Exploring data
 
-One benefit of using Pandas is that it provides functions for
+One benefit of using pandas is that it provides functions for
 exploring the data and checking for problems.
 One of the most useful of these functions is `describe`, which
 computes summary statistics for each column.
@@ -236,7 +236,7 @@ type(phi2)
 pandas.core.series.Series
 ```
 
-The result is a `Series`, which is the structure Pandas uses to
+The result is a `Series`, which is the structure pandas uses to
 represent columns.
 
 We can use a comparison operator, `>`, to compare the values in a
@@ -282,7 +282,7 @@ mask = (phi2 > phi2_min) & (phi2 < phi2_max)
 ## Logical operators
 
 Python's logical operators (`and`, `or`, and `not`)
-do not work with NumPy or Pandas.  Both libraries use the bitwise
+do not work with NumPy or pandas.  Both libraries use the bitwise
 operators (`&`, `|`, and `~`) to do elementwise logical operations
 ([explanation here](https://stackoverflow.com/questions/21415661/logical-operators-for-boolean-indexing-in-pandas)).
 
@@ -433,7 +433,7 @@ plt.plot(pm1_rect, pm2_rect, '-')
 Now that we have identified the bounds of the cluster in proper motion,
 we will use it to select rows from `results_df`.
 
-We will use the following function, which uses Pandas operators to make
+We will use the following function, which uses pandas operators to make
 a mask that selects rows where `series` falls between `low` and
 `high`.
 
@@ -563,7 +563,7 @@ Recall that we chose HDF5 because it is a binary format producing small files th
 
 Additionally, HDF5 files can contain more than one dataset and can store metadata associated with each dataset (such as column names or observatory information, like a FITS header).
 
-We can add to our existing Pandas `DataFrame` to an HDF5 file by omitting the `mode='w'` keyword like this:
+We can add a pandas `DataFrame` to our existing HDF5 file by omitting the `mode='w'` keyword like this:
 
 ```python
 filename = 'gd1_data.hdf'
@@ -662,7 +662,7 @@ the proper motion limits we identified in this lesson, which will allow us to ex
 :::::::::::::::::::::::::::::::::::::::: keypoints
 
 - A workflow is often prototyped on a small set of data which can be explored more easily and used to identify ways to limit a dataset to exactly the data you want.
-- To store data from a Pandas `DataFrame`, a good option is an HDF5 file, which can contain multiple Datasets.
+- To store data from a pandas `DataFrame`, a good option is an HDF5 file, which can contain multiple Datasets.
 
 ::::::::::::::::::::::::::::::::::::::::::::::::::
 

diff --git a/episodes/05-select.md b/episodes/05-select.md
@@ -435,7 +435,7 @@ our analysis at a later date we should save this information to a file.
 There are several ways we could do that, but since we are already
 storing data in an HDF5 file, we will do the same with these variables.
 
-To save them to an HDF5 file we first need to put them in a Pandas object.
+To save them to an HDF5 file we first need to put them in a pandas object.
 We have seen how to create a `Series` from a column in a `DataFrame`.
 Now we will build a `Series` from scratch.
 We do not need the full `DataFrame` format with multiple rows and columns

diff --git a/episodes/06-join.md b/episodes/06-join.md
@@ -882,7 +882,7 @@ that for each candidate star we have identified exactly one source in
 Pan-STARRS that is likely to be the same star.
 
 To check whether there are any values other than `1`, we can convert
-this column to a Pandas `Series` and use `describe`, which we saw
+this column to a pandas `Series` and use `describe`, which we saw
 in episode 3.
 
 ```python
@@ -979,7 +979,7 @@ getsize(filename) / MB
 
 ## Another file format - CSV
 
-Pandas can write a variety of other formats, [which you can read about
+pandas can write a variety of other formats, [which you can read about
 here](https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html).
 We won't cover all of them, but one other important one is
 [CSV](https://en.wikipedia.org/wiki/Comma-separated_values), which
@@ -1064,7 +1064,7 @@ the CSV file also does not.
 However, even if we had written a CSV file from an astropy `Table`, which does contain data type, 
 data type would not appear in the CSV file, highlighting a limitation of this format.
 Additionally, notice that the index in `candidate_df` has become an unnamed column
-in `read_back_csv` and a new index has been created.  The Pandas functions for writing and reading CSV
+in `read_back_csv` and a new index has been created.  The pandas functions for writing and reading CSV
 files provide options to avoid that problem, but this is an example of
 the kind of thing that can go wrong with CSV files.
 

diff --git a/episodes/07-photo.md b/episodes/07-photo.md
@@ -185,7 +185,7 @@ stars.  But the main sequence of GD-1 appears as an overdense region in the lowe
 
 We want to be able to make this plot again, with any selection of PanSTARRs photometry,
 so this is a natural time to put it into a function that accepts as input
-an Astropy `Table` or Pandas `DataFrame`, as long as
+an Astropy `Table` or pandas `DataFrame`, as long as
 it has columns named `g_mean_psf_mag` and `i_mean_psf_mag`. To do this we will change
 our variable name from `candidate_df` to the more generic `dataframe`.
 

diff --git a/index.md b/index.md
@@ -3,7 +3,7 @@ permalink: index.html
 site: sandpaper::sandpaper_site
 ---
 
-The Foundations of Astronomical Data Science curriculum covers a range of core concepts necessary to efficiently study the ever-growing datasets developed in modern astronomy. In particular, this curriculum teaches learners to perform database operations (SQL queries, joins, filtering) and to create publication-quality data visualisations. Learners will use software packages common to the general and astronomy-specific data science communities ([Pandas](https://pandas.pydata.org), [Astropy](https://www.astropy.org), [Astroquery](https://astroquery.readthedocs.io/en/latest/) combined with two astronomical datasets: the large, all-sky, multi-dimensional dataset from the [Gaia satellite](https://sci.esa.int/web/gaia), which measures the positions, motions, and distances of approximately a billion stars in our Milky Way galaxy with unprecedented accuracy and precision; and the [Pan-STARRS photometric survey](https://panstarrs.stsci.edu/), which precisely measures light output and distribution from many stars. Together, the software and datasets are used to reproduce part of the analysis from the article ["Off the beaten path: Gaia reveals GD-1 stars outside of the main stream"](https://arxiv.org/abs/1805.00425) by Drs. Adrian M. Price-Whelan and Ana Bonaca. This lesson shows how to identify and visualize the GD-1 stellar stream, which is a globular cluster that has been tidally stretched by the Milky Way.
+The Foundations of Astronomical Data Science curriculum covers a range of core concepts necessary to efficiently study the ever-growing datasets developed in modern astronomy. In particular, this curriculum teaches learners to perform database operations (SQL queries, joins, filtering) and to create publication-quality data visualisations. Learners will use software packages common to the general and astronomy-specific data science communities ([pandas](https://pandas.pydata.org), [Astropy](https://www.astropy.org), [Astroquery](https://astroquery.readthedocs.io/en/latest/)) combined with two astronomical datasets: the large, all-sky, multi-dimensional dataset from the [Gaia satellite](https://sci.esa.int/web/gaia), which measures the positions, motions, and distances of approximately a billion stars in our Milky Way galaxy with unprecedented accuracy and precision; and the [Pan-STARRS photometric survey](https://panstarrs.stsci.edu/), which precisely measures light output and distribution from many stars. Together, the software and datasets are used to reproduce part of the analysis from the article ["Off the beaten path: Gaia reveals GD-1 stars outside of the main stream"](https://arxiv.org/abs/1805.00425) by Drs. Adrian M. Price-Whelan and Ana Bonaca. This lesson shows how to identify and visualize the GD-1 stellar stream, which is a globular cluster that has been tidally stretched by the Milky Way.
 
 GD-1 is a stellar stream around the Milky Way. This means it is a collection of stars that we believe was once part of a bound clump, but the gravitational influence of the Milky Way has torn it apart and spread it over an arc that traces out its orbit on the sky.  This is interesting, because if the original bound clump was a dwarf galaxy, understanding its orbit with sufficient precision allows us to measure the mass of the Milky Way, which is very important for understanding the future and past of the Milky Way as a whole. But that is much easier to do if we have a coordinate system aligned with the stream because that makes fitting the location of the stars much easier mathematically - it becomes more linear instead of some complicated curve.  Additionally, this stream is especially interesting because it has "gaps", which have a natural interpretation as being caused by the influence of small clumps of dark matter passing near the stream. Knowing the typical rate of these gaps tells you about the typical size and density of these clumps, which turns out to be one of the best probes we have of the fine structure of dark matter.
 
@@ -13,7 +13,7 @@ This lesson can be taught in approximately 10 hours and covers the following top
 - Using Astroquery to query a remote server in Python.
 - Transforming coordinates between common coordinate systems using Astropy units and coordinates.
 - Working with common astronomical file formats, including FITS, HDF5, and CSV.
-- Managing your data with Pandas DataFrames and Astropy Tables.
+- Managing your data with pandas DataFrames and Astropy Tables.
 - Writing functions to make your work less error-prone and more reproducible.
 - Creating a reproducible workflow that brings the computation to the data.
 - Customising all elements of a plot and creating complex, multi-panel, publication-quality graphics.

diff --git a/instructors/calculating_MIST_isochrone.md b/instructors/calculating_MIST_isochrone.md
@@ -185,7 +185,7 @@ expect to find stars in GD-1.
 We will save this result so we can reload it later without repeating the
 steps in this section.
 
-So we can save the data in an HDF5 file, we will put it in a Pandas
+So that we can save the data in an HDF5 file, we will put it in a pandas
 `DataFrame` first:
 
 ```python