

EDS 217 - Python for Environmental Data Science

Python for Environmental Data Science


This class provides an introduction to the Python programming language, the major libraries in the Python data science stack, and opportunities to apply Python to environmental data science through the analysis of datasets.

The module assumes no prior experience with Python or Jupyter notebooks.

Course Webpage:

https://environmental-data-science.github.io/eds217_2023/

🚦 EDS217 Stoplight - Let us know how things are going in real time!

Course Repository

EDS217_2023 on GitHub

Learning Goals

(what you will be able to do)

  • Set up a Python environment for data science using conda.

  • Conduct reproducible analyses within interactive Jupyter notebook environments.

  • Use the VSCode IDE to write and execute Python notebooks as well as scripts.

  • Read and write basic-to-intermediate scripts and programs in the Python programming language.

  • Perform analyses on structured data using NumPy.

  • Load, explore, aggregate, analyze, and display data using pandas.

  • Visualize data using matplotlib and friends.

  • Apply all of these tools to analyze environmental datasets.

  • Develop a short tutorial on how to use a Python data science library for environmental analysis.

Course Activities

Our time together will be spent in a combination of lecture, interactive sessions, paired practice sessions, and group work.

Lectures

Lectures introduce general concepts and principles of the Python language. We won’t have too many of these, but they provide context for the interactive sessions.

Interactive Sessions

Interactive sessions will be used to demonstrate the use of python syntax, libraries, and other tools essential to the environmental data science workflow. These sessions will generally be conducted in jupyter notebooks, and will be available for you to download and use as a reference. We will also use these sessions to work through example code and computations related to the major course topics.

Practice Sessions

Practice sessions provide opportunities to use the concepts, tools, and libraries presented during interactive sessions to solve more open-ended programming problems. While the goal is to develop independent confidence in Python programming, we will work through these problems in paired/collaborative coding. Example solutions to these practice problems will always be available.

TryPy Sessions

TryPy sessions are designed to allow you to develop your own reproducible workflows based on core data science principles. We will often structure these sessions in a similar way to prior activities from EDS221. This will allow you to get a better understanding of the differences between python and R as well as the strengths and weaknesses of each.

Group Work

Your group project is a “Data Science Show-and-Tell” that will extend the skills you develop during the first week of the course through the creation of collaborative presentations focused on datasets and libraries of your choosing. Working in teams, you will develop a short, reproducible tutorial on how to use a python data science library with examples using datasets relevant to environmental analysis. These tutorials will be presented to the class during the final week of the course and shared with the class via a github repository.

Course Outline

Tuesday, September 5

Day[0] - Ready, Set, Python!

The materials for our first day are designed to introduce the basics of working with Python and to get your local machines set up for the course.

Morning Session

Entry Survey

Intro to Python Data Science

Interactive Session 0-1 - Ready, Set, Python!

Afternoon Session

Getting Help

Interactive Session 0-2 - Hello, Python Data Science

Wednesday, September 6

Day[1] - Do you speak Python?

Today we will explore variables and learn the basic syntax of the python programming language.
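As a small taste of the syntax covered today, the sketch below assigns a few variables, combines them with arithmetic operators, and pulls an item out of a list. The variable names and temperature values are made up for illustration and are not taken from the course notebooks.

```python
# Hypothetical daily high temperatures in degrees Celsius (illustrative values).
temps_c = [18.5, 21.0, 24.5, 19.0]

# A variable is a name bound to a value; operators combine values.
first_temp_c = temps_c[0]                # lists are indexed starting at zero
first_temp_f = first_temp_c * 9 / 5 + 32  # convert Celsius to Fahrenheit

# Built-in functions such as sum() and len() work directly on lists.
mean_temp_c = sum(temps_c) / len(temps_c)

print(first_temp_c, first_temp_f, mean_temp_c)
```

Note that `temps_c[0]` returns the first item, not the second. This zero-based counting is the same convention the day labels in this outline follow.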

Morning Session

Interactive Session 1-1 - Variables & Operators

Practice Session 1-1 - Variables & Operators

Interactive Session 1-2 - Lists and Indexing

Afternoon Session

Practice Session 1-2 - Lists and Indexing

TryPy 01 - St. Louis Lead Data

Thursday, September 7

Day[2] - Going with the flow

Today we complete our quick tour of the core python language as we learn the fundamentals of controlling the flow of programs in python. We will also prepare ourselves for the data science packages to come by learning how to use python to work with structured data.
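To give a sense of what "controlling the flow" means in practice, here is a minimal sketch that loops over a dictionary (one simple form of structured data) and uses `if`/`elif`/`else` to label each entry. The site names, readings, and thresholds are assumptions chosen for illustration, not official air quality categories.

```python
# Hypothetical air-quality index (AQI) readings keyed by site name.
aqi_by_site = {"coast": 42, "valley": 118, "foothills": 75}

labels = {}
for site, aqi in aqi_by_site.items():
    # Exactly one branch runs for each reading.
    if aqi <= 50:
        label = "good"
    elif aqi <= 100:
        label = "moderate"
    else:
        label = "unhealthy"
    labels[site] = label

print(labels)
```

The loop visits each key/value pair once, and the conditional decides which label to store, so the result is a new dictionary with the same keys as the input.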

Morning Session

Interactive Session 2-1 - Ifs or Elses

Practice Session 2-1 - Ifs or Elses

Afternoon Session

TryPy02 - Conditionals and Loops

Friday, September 8

Day[3] - NumPy 🧮 (Hooray for Arrays!)

Morning Session

Interactive Session 2-2 - Structured Data

Practice Session 2-2 - Structured Data

Afternoon Session

Session 3-1 - NumPy

Monday, September 11

Day[4] - Pandas 🐼

Our journey into Python’s data science toolkit began with NumPy, a library designed to perform fast calculations on arrays and matrices.

We end our first week with arguably the most important library in the Python data science ecosystem: pandas.

Once we’ve learned how to import, manage, and analyze data using pandas, it will be time to make some graphs!
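To make the division of labor between the two libraries concrete, here is a minimal sketch, assuming both `numpy` and `pandas` are installed in your environment. The rainfall values and site labels are invented for illustration. NumPy handles element-wise arithmetic on an array, and pandas adds labeled columns and grouped summaries.

```python
import numpy as np
import pandas as pd

# Hypothetical rainfall totals in millimeters (illustrative values only).
rain_mm = np.array([0.0, 12.5, 3.0, 8.5])

# NumPy applies arithmetic to every element at once (vectorization).
rain_in = rain_mm / 25.4
total_mm = rain_mm.sum()

# pandas wraps arrays in a table with named columns.
df = pd.DataFrame({
    "site": ["A", "A", "B", "B"],
    "rain_mm": rain_mm,
})

# Group rows by site and average the rainfall within each group.
mean_by_site = df.groupby("site")["rain_mm"].mean()

print(total_mm)
print(mean_by_site)
```

The `groupby` step returns a pandas Series indexed by site, which is a common starting point for the plots introduced on the following day.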

Morning Session

The Zen of Python

Session 4-1 - Pandas

Afternoon Session

Practice Session 4-1 - Pandas

Tuesday, September 12

Day[5] - Matplotlib 📈

Matplotlib is the primary library used for plotting data in Python (although there are some great alternatives), so we will start there.

Morning Session

Session 5-1 - Matplotlib

Afternoon Session

Debugging

Practice Session 5-1 - Matplotlib

Wednesday, September 13 - Friday, September 15

Day[6:] - Group Project Work ✏️

Our final activity will be a group project in which you work with a team of 3-4 of your classmates to create a brief tutorial introducing one of the many other libraries available to conduct environmental data science in Python.

Morning Session

🐿️ Team Formation & Data Ice-breaker 🐿️

Afternoon Session

Group Project

Group Project Sign-Ups

You will develop your tutorial using the same Jupyter Notebook structures that we’ve been using throughout the class and by incorporating examples using a dataset of your choosing.

The goal of this exercise is to collaboratively develop a set of data, notebooks, and visualizations that are entirely reproducible, shared on GitHub, and usable by others to learn the library you’ve chosen.

🎉 Friday, September 15 🎉

Morning Session

Fill out your ESCI evaluations

Fill out this EDS217 Exit Survey

Next Steps

Afternoon Session

On our last day (Day[-1]), we’ll spend the afternoon conducting a Python Data Science Show and Tell.
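The day labels used throughout this outline (Day[0], Day[6:], Day[-1]) borrow Python's list indexing notation. As a minimal sketch, assuming the nine class dates listed above are stored in a list named `days` (a name invented here for illustration), the labels behave like this:

```python
# The nine class dates from the outline, stored in order.
days = ["Sep 5", "Sep 6", "Sep 7", "Sep 8", "Sep 11",
        "Sep 12", "Sep 13", "Sep 14", "Sep 15"]

first_day = days[0]      # indexing starts at zero, so Day[0] is the first day
project_days = days[6:]  # a slice from index 6 through the end of the list
last_day = days[-1]      # negative indices count backward from the end

print(first_day, project_days, last_day)
```

This is why the group project block is written as Day[6:] and the final day as Day[-1].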