14 Sessions Over 7 Weeks
This is a new course in a rapidly evolving field. The syllabus will surely change as we adapt to emerging tools and techniques.
Structured Data • Sessions 1-4
Goal: Build an interactive emissions dashboard in 4 sessions
Concept: Setting up VS Code with AI coding assistants (Continue.dev, GitHub Copilot) and exploring the Google Antigravity IDE. Understanding how to work with AI-augmented development environments.
Tech: VS Code, Copilot, Python Virtual Environments
In-Class Lab: Hello Climate. Download a raw CSV of Global Carbon Budget data. Use natural language prompts to load it, clean headers, and output a basic trend line.
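A first pass at what the assistant might produce from those prompts looks roughly like this; the file name and column names are placeholders, and the actual Global Carbon Budget download will differ:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Load the raw CSV (file name and column names are placeholders --
# adjust them to match the file you actually download).
df = pd.read_csv("global_carbon_budget.csv")

# Clean up headers: strip whitespace, lowercase, replace spaces with underscores.
df.columns = df.columns.str.strip().str.lower().str.replace(" ", "_")

# Plot a basic trend line of fossil emissions over time.
df.plot(x="year", y="fossil_emissions", kind="line", legend=False)
plt.ylabel("Fossil CO2 emissions")
plt.title("Global Carbon Budget: fossil emissions over time")
plt.show()
```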
Concept: Understanding how large language models function as coding agents. Distinguishing between chat interfaces and agents with local code execution capabilities. Using natural language to specify data operations.
Tech: VS Code with GitHub Copilot, Ibis Framework, Altair
In-Class Lab: Emissions Analysis. Work with the OWID CO2 dataset (~50,000 rows) to explore filtering, aggregation, visualization, and independent analysis—all through AI-assisted coding.
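As a rough sketch of the lab's workflow, an Ibis + Altair pipeline over the OWID file might look like this; the file name and column names are assumptions based on the public owid-co2-data.csv release, so check the actual schema first:

```python
import ibis
import altair as alt

# Connect to the OWID CO2 dataset through Ibis' DuckDB backend.
con = ibis.duckdb.connect()
co2 = con.read_csv("owid-co2-data.csv")

# Filter to a handful of countries and aggregate annual CO2 emissions.
subset = (
    co2.filter(co2.country.isin(["United States", "China", "India"]))
       .group_by(["country", "year"])
       .aggregate(total_co2=co2.co2.sum())
)

# Execute the lazily built query and hand the result to Altair.
chart = (
    alt.Chart(subset.to_pandas())
       .mark_line()
       .encode(x="year:Q", y="total_co2:Q", color="country:N")
)
chart.save("emissions_trend.html")
```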
Concept: Working with complex, large-scale environmental databases. Understanding how data format and access patterns constrain automated workflows. Multi-source data integration and methodological reconciliation.
Tech: Cloud-optimized GeoParquet, DuckDB, Multi-Regional Input-Output models
In-Class Lab: Exiobase Analysis. Work with the Exiobase-3 database—an environmentally-extended input-output model covering 163 industries × 49 regions × multiple environmental pressures. Compare results with OWID to understand methodological differences (territorial vs consumption-based accounting).
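A minimal sketch of the querying pattern, assuming a prepared Parquet extract of the Exiobase-3 emissions accounts (the file name and columns are hypothetical); DuckDB can query the file in place without loading it all into memory:

```python
import duckdb

con = duckdb.connect()

# Rank region-sector pairs by CO2 emissions straight from the Parquet file.
# Column names (region, sector, co2_kg) are assumptions about the prepared extract.
top_sectors = con.sql("""
    SELECT region, sector, SUM(co2_kg) / 1e9 AS co2_mt
    FROM read_parquet('exiobase3_emissions.parquet')
    GROUP BY region, sector
    ORDER BY co2_mt DESC
    LIMIT 10
""").df()

print(top_sectors)
```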
Concept: Communicating results effectively. Moving from exploratory analysis to shareable outputs. Understanding when different formats serve different audiences.
Tech: Streamlit dashboards, PDF generation, GitHub Pages
In-Class Lab: Publication Studio. Take your Module 1 analysis and produce multiple output formats: an interactive Streamlit dashboard for exploration, a polished PDF report for stakeholders, and a simple webpage summarizing key findings.
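A minimal Streamlit sketch, assuming a summary CSV produced in Module 1 (file and column names are placeholders):

```python
# app.py -- run with `streamlit run app.py`
import pandas as pd
import streamlit as st

st.title("Emissions Dashboard")

# Placeholder file from your Module 1 analysis; columns assumed:
# country, year, co2.
df = pd.read_csv("emissions_summary.csv")

# Simple interactive control: pick countries, chart their trajectories.
countries = st.multiselect("Countries", sorted(df["country"].unique()))
if countries:
    subset = df[df["country"].isin(countries)]
    st.line_chart(subset, x="year", y="co2", color="country")
    st.dataframe(subset)
```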
Mapping Data Center Impacts • Sessions 5-7
Goal: Map and analyze the environmental and ecological impacts of data center expansion through an environmental justice lens
Concept: Introduction to geospatial analysis. Understanding Coordinate Reference Systems (CRS) and why they matter for accurate spatial analysis. Building on DuckDB skills from Module 1 to work with spatial data.
Tech: DuckDB Spatial extension, Python anymap library (MapLibre wrapper), Point geometries
In-Class Lab: Data Center Atlas. Load a dataset of data center locations across the US. Use DuckDB Spatial to transform coordinates between coordinate reference systems and create an interactive MapLibre visualization showing the distribution of data centers. Explore patterns in their geographic clustering.
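A sketch of the DuckDB Spatial step, assuming a CSV of data center locations with longitude/latitude columns (file and column names are hypothetical); the reprojected points can then be handed to anymap or another MapLibre wrapper for the interactive view:

```python
import duckdb

con = duckdb.connect()
con.install_extension("spatial")
con.load_extension("spatial")

# Reproject data center points from WGS84 lon/lat (EPSG:4326) to a US
# equal-area CRS (EPSG:5070). The final `true` argument asks ST_Transform
# to treat coordinates as x/y (lon/lat) regardless of authority axis order.
centers = con.sql("""
    SELECT
        name,
        ST_Point(longitude, latitude) AS geom_wgs84,
        ST_Transform(ST_Point(longitude, latitude),
                     'EPSG:4326', 'EPSG:5070', true) AS geom_albers
    FROM read_csv_auto('us_data_centers.csv')
""").df()

print(centers.head())
```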
Concept: Spatial joins and overlay analysis. Using census and demographic data to examine environmental justice dimensions of infrastructure siting. Who lives near data centers, and which communities bear the environmental burden?
Tech: DuckDB Spatial joins, Census vector data (shapefiles/GeoJSON), demographic analysis
In-Class Lab: Data Center Environmental Justice Audit. Perform spatial joins between data center locations and census tract boundaries. Analyze demographic characteristics (income, race, education) of communities within buffer zones around data centers. Identify patterns of environmental injustice in data center siting decisions and visualize findings on an interactive map.
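A sketch of the spatial-join step, assuming both layers are in EPSG:4326 and using hypothetical file and column names; buffer-based neighborhoods follow the same pattern with functions such as ST_Buffer or ST_DWithin:

```python
import duckdb

con = duckdb.connect()
con.install_extension("spatial")
con.load_extension("spatial")

# Join data center points to the census tracts that contain them, then
# summarize tract demographics. The GeoJSON of tracts (with ACS attributes
# already attached) and the data center CSV are placeholders.
audit = con.sql("""
    WITH tracts AS (
        SELECT GEOID, median_income, geom
        FROM ST_Read('census_tracts_with_acs.geojson')
    ),
    centers AS (
        SELECT name, ST_Point(longitude, latitude) AS geom
        FROM read_csv_auto('us_data_centers.csv')
    )
    SELECT t.GEOID,
           t.median_income,
           COUNT(c.name) AS n_data_centers
    FROM tracts t
    JOIN centers c ON ST_Contains(t.geom, c.geom)
    GROUP BY t.GEOID, t.median_income
    ORDER BY n_data_centers DESC
""").df()

print(audit.head())
```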
Concept: Working with large-scale raster data for ecological analysis. Understanding the biodiversity impacts of data center expansion on local ecosystems. Extracting values from continuous spatial data layers.
Tech: Rasterio, DuckDB Spatial with raster operations, species richness datasets, cloud-optimized GeoTIFFs
In-Class Lab: Data Centers & Biodiversity Hotspots. Analyze the intersection of data center locations with biodiversity data layers (e.g., species richness rasters from NatureServe). Extract raster values at data center locations to assess ecological sensitivity. Identify data centers located in biodiversity hotspots and quantify potential habitat impacts. Create visualizations showing the ecological footprint of digital infrastructure expansion.
LLM APIs & Document Intelligence • Sessions 8-10
Goal: Extract structured insights from unstructured corporate sustainability documents using modern LLM workflows
Concept: Moving beyond IDE chat assistants to programmatic LLM use. Understanding how to work with LLMs through APIs for reproducible, automated workflows. Introduction to open-source models through OpenRouter.
Tech: LangChain, OpenRouter (accessing open models like gpt-oss, OLMo, Nemotron), OpenAI structured outputs (JSON mode)
In-Class Lab: Your First LLM Pipeline. Build a simple Python script that uses LangChain to send prompts to different open-source models via OpenRouter. Compare responses across models. Experiment with OpenAI's structured output feature to extract specific fields (company name, emission target, baseline year) from a sample text passage about corporate climate commitments.
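A minimal sketch of such a pipeline; the model slug is an assumption, and structured-output support varies by model and provider, so expect to experiment:

```python
import os
from pydantic import BaseModel
from langchain_openai import ChatOpenAI

# The fields we want back, expressed as a Pydantic schema.
class ClimateCommitment(BaseModel):
    company_name: str
    emission_target: str
    baseline_year: int

# OpenRouter exposes an OpenAI-compatible endpoint, so ChatOpenAI can point
# at it directly. Swap the model slug for whichever open model you are testing.
llm = ChatOpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
    model="openai/gpt-oss-20b",  # assumed slug -- check OpenRouter's catalog
)
structured_llm = llm.with_structured_output(ClimateCommitment)

# Illustrative sample passage (fictional company).
passage = (
    "Acme Corp pledges to cut Scope 1 and 2 emissions 50% by 2030 "
    "against a 2019 baseline."
)
result = structured_llm.invoke(f"Extract the climate commitment: {passage}")
print(result)
```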
Concept: AI-era document parsing. Extracting structured information from messy, unstructured corporate documents without traditional web scraping tools. Understanding the types of sustainability and energy disclosures that climate professionals encounter: CDP reports, GRI disclosures, corporate sustainability reports, utility rate filings.
Tech: LangChain document loaders, OpenAI structured outputs with Pydantic schemas, PDF parsing libraries
In-Class Lab: Sustainability Report Parser. Work with real public documents (e.g., Apple's Environmental Progress Report, Microsoft's Sustainability Report, or utility Integrated Resource Plans). Build a pipeline that loads PDFs, chunks them intelligently, and uses LLMs with structured output schemas to extract specific data fields: renewable energy percentages, Scope 1/2/3 emissions, energy consumption metrics, water usage, and waste diversion rates. Output results as clean JSON or CSV for further analysis.
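One possible shape for the pipeline, with hypothetical file names, a placeholder model, and chunking parameters meant as starting points rather than recommendations:

```python
from pydantic import BaseModel
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import ChatOpenAI

# Target schema for extraction; extend with Scope 3, waste diversion, etc.
class SustainabilityMetrics(BaseModel):
    renewable_energy_pct: float | None = None
    scope1_tco2e: float | None = None
    scope2_tco2e: float | None = None
    water_use_megaliters: float | None = None

# Load and chunk the report (placeholder file name; tune chunk sizes).
pages = PyPDFLoader("environmental_progress_report.pdf").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=4000, chunk_overlap=200)
chunks = splitter.split_documents(pages)

# Any structured-output-capable chat model works here; it could equally be
# routed through OpenRouter as in Session 8.
llm = ChatOpenAI(model="gpt-4o-mini").with_structured_output(SustainabilityMetrics)

# Extract chunk by chunk; merging values across chunks (and flagging
# conflicts for human review) is part of the lab.
results = [
    llm.invoke(f"Extract sustainability metrics from this excerpt:\n\n{c.page_content}")
    for c in chunks
]
print(results[:3])
```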
Concept: Introduction to Model Context Protocol (MCP) as a modern approach to giving LLMs access to external data sources and tools. Understanding how MCP servers provide structured interfaces for document processing, database access, and other external capabilities without traditional RAG embeddings.
Tech: Model Context Protocol, MCP servers for PDF/document processing, LangChain integration with MCP
In-Class Lab: Multi-Document ESG Analysis. Use MCP-based tools to analyze multiple sustainability documents simultaneously. Build a workflow that compares climate commitments across several Fortune 500 companies, identifying gaps, inconsistencies, and best practices. Explore how MCP can simplify complex document workflows compared to traditional embedding-based RAG approaches.
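A minimal sketch of talking to an MCP server from Python, assuming a hypothetical stdio document server and tool name; the LLM orchestration layer on top is omitted:

```python
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Hypothetical document-processing MCP server launched over stdio --
# substitute whichever MCP server you actually run for PDF access.
server = StdioServerParameters(command="python", args=["document_mcp_server.py"])

async def main():
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Discover what the server can do.
            tools = await session.list_tools()
            print("Available tools:", [t.name for t in tools.tools])

            # Ask the server to extract text from one report; an LLM can then
            # reason over the returned content. Tool name and argument are assumed.
            result = await session.call_tool(
                "extract_text", {"path": "microsoft_sustainability_report.pdf"}
            )
            print(result.content)

asyncio.run(main())
```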
Build Your MVP • Sessions 11-14
Goal: Build a deployable Minimum Viable Product (MVP)
Activity: Collaborative design session. Teams define their project goals, identify data sources, and sketch their technical approach using AI as a design partner.
Focus: Feasibility and impact. Does the project leverage techniques from across the course? Will it provide actionable insights for climate decision-makers?
Activity: Focused development time. Instructors provide technical guidance and help teams overcome implementation challenges.
Focus: Building core functionality. Whether that's data pipelines, spatial analysis workflows, document extraction systems, or interactive visualizations—teams make substantial progress on their MVP.
Activity: Peer testing and feedback. Teams experience each other's projects and provide constructive feedback on usability and impact.
Focus: User experience and communication. Is the tool intuitive? Are insights clearly communicated? Does the project effectively tell its climate story?
Format: Lightning presentations showcasing live projects.
Evaluation: Does the project demonstrate technical sophistication? Does it address a real climate challenge? Could it influence decision-making in the real world?
This course takes a non-traditional approach. We won't master Python syntax, tidy data principles, Codd's third normal form, the mechanics of filters and joins, or the grammar of graphics—the traditional vocabulary of data science. For instructors and students familiar with conventional data science curricula, this will feel different.
We believe this is the right choice for our audience. Rather than building foundational programming skills from scratch, we focus on what climate professionals can accomplish today with modern AI-augmented tools. This is an authentic experience: these are the tools being used to solve real problems right now.
We acknowledge the risks. AI tools can produce incorrect results, and working at a higher level of abstraction can obscure understanding. But data science has always carried these risks—that's why software developers write unit tests and validation checks. Like any technology, AI coding assistants can be used well or poorly. Our goal is to teach you to use them well.
Our course design reflects core principles of how people learn effectively: