feat: improve Soil Data Access metrics for soilDB usage #420

brownag · 2025-11-17T23:00:24Z

This PR adds a more granular "SDA Query application" header comment to SDA queries, following Phil Anzel's proposal for the CART application. Low-level queries via SDA_query() that have no specific application context already include the soilDB R package version in the HTTP User-Agent header when POSTing queries.

To avoid adding boilerplate handling to the many functions that call SDA_query(), I added simple logic to SDA_query() itself to determine the calling environment. This lets us create comments that distinguish bare calls of SDA_query(), user-defined functions, and functions defined in R packages. If we want to provide much more specific information, we may need custom handling, possibly via a new argument that passes query metadata to SDA_query().
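As a rough illustration of the caller detection described above, the sketch below uses a stand-in function rather than the real SDA_query(). The names `classify_origin()`, `query_stub()` and `my_wrapper()` are hypothetical and are not part of soilDB; the actual logic in this PR may differ.

```r
# Hypothetical helper (not part of soilDB): label a function by where it was defined.
classify_origin <- function(fun) {
  env <- environment(fun)                  # NULL for primitives such as `sum`
  if (is.null(env)) return("primitive")
  top <- topenv(env)                       # nearest namespace or global environment
  if (isNamespace(top)) return(paste0("package:", environmentName(top)))
  if (identical(top, globalenv())) return("user-defined")
  "other"
}

# Hypothetical stand-in for SDA_query(): reports its caller instead of sending a query.
query_stub <- function(q) {
  p <- sys.parent()                        # frame number of the caller; 0 at top level
  if (p == 0L) return("bare call")
  caller_call <- sys.call(p)               # the call that created the caller's frame
  caller_fun <- sys.function(p)            # the function object behind that call
  caller_name <- deparse(caller_call[[1]])[1]
  paste0(caller_name, " [", classify_origin(caller_fun), "]")
}

# A user-defined wrapper, to show the non-bare case.
my_wrapper <- function() query_stub("SELECT 1")
my_wrapper()
```

Calls made through `eval()`, `do.call()` or a test harness can change what the parent frame looks like, so a production version would likely need more guarding than this.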

TODO:

Additional data elements (explicit list of: properties, interpretation names, aggregation methods? or simply include the raw function call?)
Truncate to keep comments relatively short (well under 3000 characters); if we include the raw call, explicitly avoid including, e.g., a very large vector of mukeys

Handle highest-level functions that run many queries (e.g. fetchLDM(), fetchSDA()). These generate several queries, each with its own comment. Some queries may then be associated with the top-level fetch* function and others with lower-level get* or other internal functions.
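For the truncation item in the TODO list, one possible approach is to deparse the call and cap its length before it goes into the comment. This is only a sketch: `short_call()`, `get_x()` and the 200-character default are assumptions for illustration, not names or values from the PR.

```r
# Hypothetical helper (not part of soilDB): one-line, length-capped text of a call.
short_call <- function(call, max_chars = 200L) {
  # deparse() may return several strings for long calls; join them into one
  txt <- paste(deparse(call, width.cutoff = 500L), collapse = " ")
  txt <- gsub("\\s+", " ", txt)            # collapse newlines and repeated spaces
  if (nchar(txt) > max_chars) {
    # keep the head of the call and mark the cut with an ellipsis
    txt <- paste0(substr(txt, 1L, max_chars - 3L), "...")
  }
  txt
}

# A call carrying a large vector of (made-up) mukeys
big <- as.call(list(as.name("get_x"), mukeys = 100000 + (1:2000) * 3))
nchar(short_call(big))                     # capped at max_chars
short_call(quote(f(x = 1)))                # short calls pass through unchanged
```

Capping the deparsed text keeps the function name and the first few arguments, which is usually the informative part, while dropping the bulk of a long key vector.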

cc: @jneme910

brownag · 2025-11-21T00:48:39Z

I am debating between automatically generating the "rule" from the call stack and using more purposive hard-coded values.

It would not be difficult to hard-code these; in fact, I fully implemented that before going the current route.

The nice thing about using the call stack is that it could let us display the whole hierarchy of calls, so we could tell the difference between someone calling get_component_from_SDA() on its own vs. from fetchSDA(). The current implementation in the draft PR does no filtering and only returns the top-level call. For production we should probably limit the inspection of calls to just those from soilDB, and otherwise fall back to soilDB::SDA_query() as the low-level operation. I think this would be safer and better behaved given all the different ways the call stack can appear, including when it is manipulated by tools like testthat.
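The filtering idea above could look roughly like the sketch below, which walks the call stack and keeps only calls whose functions live in a given package namespace. `calls_from_package()` is a hypothetical name, and the example uses the base namespace in place of soilDB so that it runs without soilDB installed.

```r
# Hypothetical helper (not part of soilDB): names of calls on the stack whose
# functions are defined in the namespace of `pkg`, outermost first.
calls_from_package <- function(pkg, fallback = paste0(pkg, "::SDA_query()")) {
  n <- sys.nframe() - 1L                   # number of frames below this helper
  found <- character(0)
  for (i in seq_len(n)) {
    fun <- sys.function(i)
    env <- environment(fun)
    if (is.null(env)) next
    top <- topenv(env)                     # namespace or global environment
    if (isNamespace(top) && identical(environmentName(top), pkg)) {
      call_i <- sys.call(i)
      found <- c(found, deparse(call_i[[1]])[1])
    }
  }
  # Nothing from `pkg` on the stack: report the low-level operation instead
  if (length(found) == 0L) fallback else paste(found, collapse = " > ")
}

# lapply() is defined in the base namespace, so it should appear in the result
lapply(1, function(x) calls_from_package("base"))[[1]]

# No frames from this made-up package, so the fallback is returned
calls_from_package("notARealPackage")
```

Because frames from other packages (including test harnesses) are skipped rather than reported, the result depends only on which functions of the target package are active.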

brownag added 2 commits November 17, 2025 14:37

feat(SDA_query): add query comment header with caller info

b620de0

fix: properly handle namespaced calls and special environments

35d778a
