[Image: seagrass bed]

Derisking marine carbon dioxide removal

About

The ocean is the largest carbon sink in the world, absorbing about a quarter of annual emissions. To harness the ocean’s potential to help combat climate change, a variety of marine carbon dioxide removal (mCDR) approaches are being pursued to accelerate carbon sequestration at a gigaton scale. These approaches include ocean alkalinity enhancement, blue carbon restoration, macroalgae cultivation, and iron fertilization, among others. However, these approaches carry substantial risk – both additionality risk, the risk that the additional carbon sequestered cannot be accurately measured, and ecosystem risk, the risk of unintended negative ecological outcomes.

These risk profiles vary across mCDR technologies. For example, seagrass restoration has low ecological risk and a relatively well-known carbon sequestration rate, but its sequestration potential is lower than that of other technologies. On the other hand, estimates for iron fertilization vary widely in both the amount of carbon sequestered and the potential ecological damage.
Since risk is the biggest barrier to investment in mCDR technologies, this project uses an adaptive learning model to optimize the balance between exploring riskier mCDR technologies and exploiting a “safe” mCDR technology. We hope this will ultimately guide mCDR investment to unlock greater carbon sequestration in the ocean.

Approach

To better understand the tradeoffs in mCDR technologies given risk and uncertainty, we are developing an adaptive learning model called a multi-armed bandit. A multi-armed bandit model tackles resource allocation under uncertainty when the cost of gathering information is high and information is imperfect. It provides a framework for optimizing the tradeoff between earning a return now (“exploit”) and learning how to earn a higher return later (“explore”).

The casino analogy

The most common analogy for understanding a multi-armed bandit is to imagine you are at a casino facing a row of slot machines. You have a limited budget and you must pull the slot machine arms one by one. Each slot machine provides a reward at a given probability, but you don’t know the probabilities beforehand. Every time you pull a lever, you are making a choice between two activities: (1) exploiting your current knowledge and pulling the lever you believe has the highest expected payout, which maximizes short-term marginal utility or (2) exploring and investing in information where you pull a lever you know less about, which helps reduce information asymmetry to maximize long-term utility. If you explore too much, you waste resources on "bad" machines. If you exploit too soon, you might get stuck on a "decent" machine while ignoring a "jackpot" machine.
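The explore/exploit tension in the slot-machine story can be sketched with a simple epsilon-greedy strategy: with a small probability the player explores a random machine, and otherwise exploits the machine with the best observed average payout. The simulation below is purely illustrative – the payout probabilities, epsilon value, and function name are made up for this sketch and are not part of the project’s model.

```python
import random

def epsilon_greedy_bandit(true_probs, pulls=10_000, epsilon=0.1, seed=0):
    """Play Bernoulli slot machines with an epsilon-greedy strategy.

    true_probs: hidden payout probability of each machine (unknown to the
    player). With probability epsilon we explore a random arm; otherwise we
    exploit the arm with the best observed average payout so far.
    """
    rng = random.Random(seed)
    counts = [0] * len(true_probs)    # pulls per arm
    totals = [0.0] * len(true_probs)  # total payout per arm
    reward = 0.0
    for _ in range(pulls):
        if rng.random() < epsilon:
            arm = rng.randrange(len(true_probs))  # explore: pick a random arm
        else:
            # exploit: best observed mean payout; untried arms get priority
            arm = max(range(len(true_probs)),
                      key=lambda a: totals[a] / counts[a] if counts[a] else float("inf"))
        payout = 1.0 if rng.random() < true_probs[arm] else 0.0
        counts[arm] += 1
        totals[arm] += payout
        reward += payout
    return reward, counts

reward, counts = epsilon_greedy_bandit([0.2, 0.5, 0.8])
```

With enough pulls, the player’s estimates converge and most of the budget ends up on the best machine, while the epsilon fraction of exploratory pulls guards against getting stuck on a “decent” machine.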

Applying this casino analogy to mCDR, imagine that each arm is now a different mCDR technology, each with its own level of uncertainty in carbon sequestration and ecosystem risk. Each arm is parameterized using the best available science on the distribution of outcomes (i.e. mean and variance) for carbon sequestration and ecosystem risk. The multi-armed bandit model will help evaluate whether it’s worth taking a chance on a high-risk project, and what the value of information is in reducing uncertainty in these investment decisions. From this, we will investigate what market structures are needed to incentivize learning, such as additional investments in research, “exploration” subsidies, or insurance against downside risks.
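As a rough sketch of this setup, the toy simulation below gives each hypothetical technology Gaussian (mean, standard deviation) parameters for carbon sequestered and ecosystem damage, and treats the net reward of a pull as carbon minus damage. It uses a UCB1-style rule – one of several standard bandit algorithms – to balance exploration and exploitation. All numbers here are illustrative placeholders, not actual mCDR science or project parameters.

```python
import math
import random

# Hypothetical arm parameters for illustration only (NOT real mCDR numbers).
# Each pull draws carbon sequestered and ecosystem damage from (mean, std)
# Gaussian pairs; the net reward is carbon minus damage.
ARMS = {
    "seagrass restoration":   {"carbon": (1.0, 0.1), "damage": (0.1, 0.05)},
    "alkalinity enhancement": {"carbon": (2.0, 0.8), "damage": (0.8, 0.4)},
    "iron fertilization":     {"carbon": (3.0, 2.0), "damage": (1.5, 1.5)},
}

def ucb_select(names, counts, means, t, c=2.0):
    """Pick the arm with the highest upper confidence bound (UCB1-style)."""
    for name in names:          # make sure every arm is tried once
        if counts[name] == 0:
            return name
    return max(names,
               key=lambda n: means[n] + math.sqrt(c * math.log(t) / counts[n]))

def simulate(rounds=2000, seed=1):
    rng = random.Random(seed)
    names = list(ARMS)
    counts = {n: 0 for n in names}
    means = {n: 0.0 for n in names}   # running average net reward per arm
    for t in range(1, rounds + 1):
        name = ucb_select(names, counts, means, t)
        params = ARMS[name]
        reward = rng.gauss(*params["carbon"]) - rng.gauss(*params["damage"])
        counts[name] += 1
        means[name] += (reward - means[name]) / counts[name]  # incremental mean
    return counts, means

counts, means = simulate()
```

Under these made-up numbers, the “safe” arm has a low but nearly certain net benefit, so the algorithm should learn to spend most of its budget on the riskier arms whose expected net benefit is higher – while the confidence-bound bonus keeps it sampling under-explored arms often enough to keep learning.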

Team

Christopher Costello (Principal Investigator), Andrew Plantinga (Principal Investigator), Erin O'Reilly* (Project Manager), Claudia Kelsall (Postdoctoral Researcher), Abigail Kirk (Data Scientist)

*Please reach out to Erin (eoreilly@ucsb.edu) if you have questions about this project.

Partners

This project is funded by the Builders Initiative.