Introduction to Data Science with Pandas
Source: GitHub - stefmolin/pandas-workshop: An introductory workshop on pandas with notebooks and exercises for following along, Slides
Introduction
- See the resources for code and slides; follow the slides through the Jupyter notebooks to see and run the code
- Prerequisite knowledge: Python, Jupyter notebooks
Getting Started
- Introduces the Series, DataFrame, and Index classes: the basics of using the pandas library to create DataFrames, perform operations on them, and inspect and filter data.
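A minimal sketch of how the three classes relate, plus one operation and one filter. The data and column names here are made up for illustration and are not taken from the workshop notebooks:

```python
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame(
    {"city": ["Lima", "Oslo", "Pune"], "temp_c": [19.5, 4.0, 31.2]},
    index=["a", "b", "c"],
)

temps = df["temp_c"]     # selecting a single column returns a Series
row_labels = df.index    # the Index holds the row labels

# Arithmetic on a Series is applied element-wise.
df["temp_f"] = temps * 9 / 5 + 32

# Filter rows with a Boolean mask.
warm = df[df["temp_c"] > 10]
```

Here `warm` keeps only the rows whose `temp_c` exceeds 10, and those rows retain their original labels.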
DataFrame
- Composed of one or more Series and an Index
- Series = a column (the column name is the name of the Series)
- Index = row labels
- DataFrames can be created from sources like:
  - Python objects
  - Files
  - Web scraping
  - APIs
  - Other examples in "IO tools (text, CSV, HDF5, …)", pandas 2.0.3 documentation
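As a rough sketch of the first two sources, the snippet below builds the same small DataFrame from Python objects and from CSV text. The names and values are invented for illustration, and `io.StringIO` stands in for a file on disk so the example runs without external data:

```python
import io

import pandas as pd

# From Python objects: a dict of lists (one key per column)...
from_dict = pd.DataFrame({"name": ["Ada", "Grace"], "year": [1815, 1906]})

# ...or a list of dicts (one dict per row).
from_records = pd.DataFrame(
    [
        {"name": "Ada", "year": 1815},
        {"name": "Grace", "year": 1906},
    ]
)

# From a file: read_csv accepts a path or a file-like object.
# StringIO is used here in place of a real CSV file.
csv_text = "name,year\nAda,1815\nGrace,1906\n"
from_csv = pd.read_csv(io.StringIO(csv_text))
```

For web pages and APIs, `pd.read_html` and `pd.read_json` play a similar role, though they typically need network access, and `read_html` also needs an HTML parser installed.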
Inspect Data
- Check the rows and columns, and the volume of data
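A few common inspection calls, sketched on a synthetic DataFrame (the `id` and `score` columns are placeholders, not workshop data):

```python
import pandas as pd

# Synthetic data, for illustration only.
df = pd.DataFrame({"id": range(1, 101), "score": [x % 7 for x in range(100)]})

n_rows, n_cols = df.shape    # volume of data as (rows, columns)
first_rows = df.head()       # first 5 rows by default
column_types = df.dtypes     # data type of each column
df.info()                    # prints columns, non-null counts, and memory usage
summary = df.describe()      # summary statistics for the numeric columns
```

`shape` is usually the quickest check of how much data there is, while `info()` also reveals missing values through its non-null counts.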
Continue Workshop in Git Repo
See fork of https://github.com/stefmolin/pandas-workshop