This walkthrough uses NMDC skills and Claude Code to explore the biogeochemical parameters recorded for a given study (BioScales). The NMDC skill is still new, so part of the purpose here is also to show how we can ask the agent how to improve API, programmatic, and CLI access to the data.
We start by searching for the study by name. The agent first assumes it can run a wildcard search; this fails because we had not thought to implement that feature. The agent works around the gap by dumping all studies and grepping the output.
We notice this and ask the agent to write up an issue ticket for us, an example of how we can weave together exploration and tool improvement.
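The dump-and-grep workaround amounts to client-side filtering: fetch every study record, then match names locally. A minimal sketch of that idea follows. It is not the skill's actual code. The `id` and `name` field names and the sample records are assumptions made up for illustration, and the network call that would produce the full list is left out.

```python
from fnmatch import fnmatchcase


def find_studies(studies, pattern):
    """Return the studies whose name matches a shell-style wildcard pattern.

    Matching is case-insensitive. `studies` is assumed to be a list of dicts
    that each carry a "name" key (an assumption about the record shape).
    """
    needle = pattern.lower()
    return [
        study
        for study in studies
        if fnmatchcase((study.get("name") or "").lower(), needle)
    ]


# Hypothetical records standing in for a full dump of the study list.
all_studies = [
    {"id": "example:1", "name": "Example BioScales study"},
    {"id": "example:2", "name": "Some unrelated soil study"},
    {"id": "example:3", "name": None},  # records without a name are skipped
]

matches = find_studies(all_studies, "*bioscales*")
print([study["id"] for study in matches])
```

Filtering locally is only practical while the full study list stays small. A server-side wildcard or substring search, which is what the issue ticket asks for, avoids downloading every record.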
Next, the agent queries for all samples, collects all of their parameters, and summarizes them:
Key Findings
416 total samples across three sample types:
199 endosphere samples (47.8%)
109 bulk soil samples (26.2%)
108 rhizosphere samples (26.0%)
Biogeochemical data available for 103 bulk soil samples only (24.8%)
Parameters Measured:
pH: Mean 5.95 (range 4.83-7.18) - moderately acidic to neutral soils
The data shows high spatial variability in nitrogen species (CV 71-97%), suggesting heterogeneous nitrogen cycling across the poplar common garden sites. Nitrite levels are near-zero, indicating active nitrification. The Ca:Mg:K ratio is approximately 10.6:2.0:1.0.
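The percentages and CV figures in the summary above are simple descriptive statistics, and a sketch like the following could be used to recompute or spot-check them. The sample-type counts come from the findings above. The function names and the numeric values passed to `summarize` are made up for illustration, and using the sample standard deviation for the CV is an assumption, since the agent's output does not say which form it used.

```python
from collections import Counter
from statistics import mean, stdev


def tally_percentages(labels):
    """Count each label and give its share of the total, in percent (1 decimal)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: (n, round(100 * n / total, 1)) for label, n in counts.items()}


def summarize(values):
    """Mean, range, and coefficient of variation (sample SD / mean, in percent)."""
    m = mean(values)
    return {
        "n": len(values),
        "mean": m,
        "min": min(values),
        "max": max(values),
        "cv_percent": 100 * stdev(values) / m,
    }


# Sample-type counts as reported in the key findings.
sample_types = ["endosphere"] * 199 + ["bulk soil"] * 109 + ["rhizosphere"] * 108
shares = tally_percentages(sample_types)
print(shares)

# Made-up measurements, only to show the shape of the per-parameter summary.
example_stats = summarize([2.0, 4.0, 6.0])
print(example_stats)
```

A CV well above 50%, like the 71-97% reported for the nitrogen species, means the standard deviation is close to the mean itself, which is why the summary describes those parameters as highly variable.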