Pangeo is a community effort for big data in the geosciences. As part of this effort, we are bringing together a collection of Python tools, including Xarray, Dask, and Jupyter, to facilitate highly scalable data analytics on large geoscientific datasets.


Xarray is the fusion of Python’s pandas, NumPy, and netCDF4 packages. It offers unparalleled computational ability on labeled N-dimensional arrays. I have been contributing to the Xarray project since May 2014.


Dask is a Python library for scalable computing with dynamic task scheduling. As part of my work with Pangeo and Xarray, I have been contributing to the Dask ecosystem. In particular, I have worked extensively on Dask deployment utilities like Dask-jobqueue and Dask-Kubernetes.


The Community Terrestrial Systems Model (CTSM) project is an effort to unify land modeling efforts across NCAR and, more broadly, across the terrestrial systems modeling community. Work is currently underway to combine the development efforts and software functionality of NCAR’s Community Land Model (CLM), the Noah-MP land surface model, and the Structure for Unifying Multiple Modeling Alternatives (SUMMA). I previously contributed to the development of CTSM’s land-atmosphere coupling infrastructure on a project called LILAC.


From 2012 to 2016, I was part of the core development team working on the Variable Infiltration Capacity (VIC) macro-scale land surface hydrology model. During this time, I acted as the VIC model administrator, responsible for maintaining the VIC GitHub repository, making releases, and issuing bug fixes.


RVIC is a source-to-sink linear routing model used to route streamflows from runoff produced in macro-scale hydrologic models such as VIC. RVIC is written in Python and C and is publicly available here.
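
The idea behind source-to-sink linear routing can be illustrated with a minimal sketch. This is not RVIC's code or API: it assumes each source cell has a precomputed unit hydrograph (the fraction of its runoff reaching the outlet at each lag), and the function name, cell names, and numbers are hypothetical.

```python
def route_to_outlet(runoff_by_source, unit_hydrographs):
    """Convolve each source's runoff with its unit hydrograph and sum at the outlet.

    Illustrative only: both arguments are dicts keyed by a hypothetical
    source-cell name, holding lists with one value per time step (runoff)
    or per lag (unit hydrograph).
    """
    # The outlet series must be long enough to hold the most delayed contribution.
    n_steps = max(
        len(runoff_by_source[source]) + len(unit_hydrographs[source]) - 1
        for source in runoff_by_source
    )
    streamflow = [0.0] * n_steps

    for source, runoff in runoff_by_source.items():
        hydrograph = unit_hydrographs[source]
        # Discrete convolution: runoff generated at step t arrives at step t + lag.
        for t, amount in enumerate(runoff):
            for lag, fraction in enumerate(hydrograph):
                streamflow[t + lag] += amount * fraction

    return streamflow


# Two hypothetical source cells draining to one outlet.
runoff_by_source = {
    "cell_a": [1.0, 0.0, 0.0],  # near the outlet
    "cell_b": [0.0, 2.0, 0.0],  # farther upstream
}
unit_hydrographs = {
    "cell_a": [0.5, 0.5],  # half arrives immediately, half one step later
    "cell_b": [0.0, 1.0],  # everything arrives one step later
}

streamflow = route_to_outlet(runoff_by_source, unit_hydrographs)
print(streamflow)
```

Because each source is routed independently and the contributions are simply summed, the scheme is linear: when every unit hydrograph sums to one, the total routed streamflow equals the total runoff.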


pynco is a set of Python bindings to the popular netCDF Operators (NCO). This package is available on PyPI, Anaconda, and GitHub.