Do not modify input astropy table when converting to pandas DataFrame (fixes #268)#273
Open
zonca wants to merge 1 commit into
Open
Do not modify input astropy table when converting to pandas DataFrame (fixes #268)#273zonca wants to merge 1 commit into
zonca wants to merge 1 commit into
Conversation
Fixes datacarpentry#268 Previously the step-by-step walkthrough in episode 03 added GD-1 coordinate columns (phi1, phi2, pm_phi1, pm_phi2) directly to the input astropy Table before calling to_pandas(). This modified the function's input, which is bad practice. Now the episode converts to a pandas DataFrame first, then adds the GD-1 columns using .value to extract plain numerical values without units. The make_dataframe() function already used this approach and is unchanged except for a docstring update. Tested by executing all 30 Python code blocks from the updated episode sequentially — all pass (1 Jupyter magic skipped). See test script: https://gist.github.com/zonca/71774b5e3cd00480558dfc3825c6ed2b
🆗 Pre-flight checks passed 😃This pull request has been checked and contains no modified workflow files, spoofing, or invalid commits. It should be safe to Approve and Run the workflows that need maintainer approval. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes #268
The step-by-step walkthrough in episode 03 was adding GD-1 coordinate columns (
phi1,phi2,pm_phi1,pm_phi2) directly to the input astropyTable(polygon_results) before callingto_pandas(). This modified the function's input, which is bad practice — a function should not silently modify its input unless that is its explicit purpose.The original reason for adding columns to the table was a workaround for a pandas 1.3.0 bug (#74) where adding
Quantitycolumns to a DataFrame causeddescribe()to fail. However, using.valueto extract plain numerical values (without units) when adding columns to the DataFrame avoids this bug entirely.Changes
episodes/03-transform.mdpolygon_results['phi1'] = skycoord_gd1.phi1)to_pandas()is called first (producing a 6-column DataFrame), then GD-1 columns are added with.value(producing the final 10-column DataFrame)shapenow shows(140339, 6)before GD-1 columns and(140339, 10)after.value?" callout explaining why we use.valueto strip unitsproper_motioncallouts to flow naturally with the restructured sectionstudent_download/episode_functions.pymake_dataframe()docstring to match the episode (function code was already correct — it already used.value)Testing
I executed all 30 Python code blocks from the updated episode 03 sequentially and confirmed they all pass (1 block containing
%matplotlib inlineis a Jupyter magic and was skipped as expected).A test script that extracts and runs all code blocks with additional verification asserts is available as a public gist:
https://gist.github.com/zonca/71774b5e3cd00480558dfc3825c6ed2b
The assert statements in the test script (clearly marked "ASSERT: not part of the lesson") verify:
polygon_resultsis NOT modified after the full episode flow (only has original 6 columns)results_dfhas 10 columns with float dtype (not Quantity)make_dataframe()produces identical results to the step-by-step approachdescribe()works correctly (no pandas Quantity bug)