@liamhuber
Member

@liamhuber liamhuber commented Jan 13, 2026

From python functions to recipe data models

With:

  • recipe models added as a simple attribute on the function object; the functions are still just functions
  • output labels automatically scraped when they are (uniformly) available as ast names
  • unpacking modes (including dataclasses) as discussed with @XzzX
  • output labels can be explicitly set (as long as these are consistent with the function analysis and unpacking instructions)

E.g.

from flowrep import parser

@parser.atomic("a", "b", unpack_mode="tuple")
def something(x, y):
    return x, y

>>> something.recipe
AtomicNode(
    type='atomic',
    inputs=['x', 'y'],
    outputs=['a', 'b'],
    fully_qualified_name='__main__.something',
    unpack_mode=<UnpackMode.TUPLE: 'tuple'>
)

I.e. a universal atomic node decorator.
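
To make the "functions are still just functions" point concrete, here is a minimal sketch of a decorator that only attaches metadata. It is a hypothetical stand-in, not flowrep's implementation: the `atomic` name mirrors the example above, but the recipe here is a plain dict rather than an `AtomicNode`, and no consistency checks are done.

```python
import inspect


def atomic(*output_labels, unpack_mode="tuple"):
    """Hypothetical stand-in for a recipe-attaching decorator (not flowrep's code)."""

    def decorator(func):
        # Build a plain-dict "recipe" from what the function object already knows.
        func.recipe = {
            "type": "atomic",
            "inputs": list(inspect.signature(func).parameters),
            "outputs": list(output_labels),
            "fully_qualified_name": f"{func.__module__}.{func.__qualname__}",
            "unpack_mode": unpack_mode,
        }
        # Return the very same function object, now carrying the metadata.
        return func

    return decorator


@atomic("a", "b")
def something(x, y):
    return x, y
```

Because the decorator returns the original object, `something(1, 2)` behaves exactly as it did before decoration; the recipe is just an extra attribute.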

I also spent some time today re-reading @samwaseda's workflow module for parsing Python code as a workflow. This is obviously harder than the simple atomic nodes, but I think what's already there can be massaged to return a model.WorkflowNode. I also think I see a way to allow non-simple function calls even when creating the workflow from a Python def!

@github-actions

Binder 👈 Launch a binder notebook on branch pyiron/flowrep/atomic_parser

@liamhuber liamhuber changed the base branch from main to data_models January 13, 2026 22:09
@codecov

codecov bot commented Jan 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.05%. Comparing base (01e2965) to head (0d16a82).

Additional details and impacted files
@@               Coverage Diff               @@
##           data_models      #73      +/-   ##
===============================================
+ Coverage        96.52%   97.05%   +0.53%     
===============================================
  Files                4        5       +1     
  Lines              805      950     +145     
===============================================
+ Hits               777      922     +145     
  Misses              28       28              


From python functions

Signed-off-by: liamhuber <[email protected]>
Which must be consistent with the function analysis and output unpacking instructions

Signed-off-by: liamhuber <[email protected]>
But still keep the linter happy

Signed-off-by: liamhuber <[email protected]>
`return` and `return None` are handled differently, but actually I'm OK with that right now.

Signed-off-by: liamhuber <[email protected]>
Member

@samwaseda samwaseda left a comment


I still would like to have the possibility of setting the output labels by annotations, i.e.:

def f(x) -> typing.Annotated[float, {"label": "output_label"}]:
    ...

liamhuber and others added 11 commits January 15, 2026 07:51
Co-authored-by: Sam Dareska <[email protected]>
No magic dictionary keys!

Signed-off-by: liamhuber <[email protected]>
When trying to decorate things that aren't function definitions.

Signed-off-by: liamhuber <[email protected]>
@liamhuber
Member Author

liamhuber commented Jan 15, 2026

I still would like to have the possibility of setting the output labels by annotations, i.e.:

def f(x) -> typing.Annotated[float, {"label": "output_label"}]:
    ...

Yes, good call. This is now supported, and plays nicely with the unpacking modes such that we can leverage annotations on either the tuple itself or its components. E.g., for

from typing import Annotated

from flowrep import parser

def compass2cart(direction: str) -> Annotated[
    tuple[
        Annotated[int, {"label": "x"}],
        Annotated[int, {"label": "y"}],
    ],
    {"label": "cartesian"}
]:
    match direction.lower():
        case "e":
            return (1, 0)
        case "ne":
            return (1, 1)
        case "n":
            return (0, 1)
        case "nw":
            return (-1, 1)
        case "w":
            return (-1, 0)
        case "sw":
            return (-1, -1)
        case "s":
            return (0, -1)
        case "se":
            return (1, -1)
        case _:
            raise ValueError(f"Needed a compass direction, got {direction}")

We can unpack the sub-labels:

>>> parser.parse_atomic(compass2cart).model_dump()
{'type': <RecipeElementType.ATOMIC: 'atomic'>,
 'inputs': ['direction'],
 'outputs': ['x', 'y'],
 'fully_qualified_name': '__main__.compass2cart',
 'unpack_mode': <UnpackMode.TUPLE: 'tuple'>}

Or ask for the tuple as a whole:

>>> parser.parse_atomic(compass2cart, unpack_mode="none").model_dump()
{'type': <RecipeElementType.ATOMIC: 'atomic'>,
 'inputs': ['direction'],
 'outputs': ['cartesian'],
 'fully_qualified_name': '__main__.compass2cart',
 'unpack_mode': <UnpackMode.NONE: 'none'>}
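
For readers curious how labels can be pulled out of nested `Annotated` hints, here is a rough sketch using only stdlib `typing` helpers. `annotation_labels` and `_label_of` are hypothetical names, not flowrep functions, and the sketch reads `__annotations__` directly, so it assumes the annotations are not stringified.

```python
from typing import Annotated, get_args, get_origin


def _label_of(hint):
    """Return the "label" entry from an Annotated hint's dict metadata, if any."""
    if get_origin(hint) is Annotated:
        # get_args gives (wrapped_type, *metadata) for Annotated hints.
        for meta in get_args(hint)[1:]:
            if isinstance(meta, dict) and "label" in meta:
                return meta["label"]
    return None


def annotation_labels(func, unpack=True):
    """Labels from the return annotation: per tuple element, or for the whole."""
    hint = func.__annotations__.get("return")
    if hint is None:
        return []
    if unpack:
        # Look through the outer Annotated wrapper to find a tuple, if present.
        inner = get_args(hint)[0] if get_origin(hint) is Annotated else hint
        if get_origin(inner) is tuple:
            return [_label_of(element) for element in get_args(inner)]
    return [_label_of(hint)]


def east() -> Annotated[
    tuple[
        Annotated[int, {"label": "x"}],
        Annotated[int, {"label": "y"}],
    ],
    {"label": "cartesian"},
]:
    return (1, 0)
```

With this sketch, `annotation_labels(east)` yields the component labels and `annotation_labels(east, unpack=False)` yields the label of the tuple as a whole, mirroring the two `parse_atomic` calls above.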

The order of priority for output labels is:

  1. User-defined at parse-time
  2. Annotations
  3. Clever scraping with ast for symbol names
  4. A default label based on output position "output_{n}"

(1) is all-or-nothing; give a label for each output port on the node or go home. (2-4) are merged together to get the prettiest output labels available from the code without user intervention.
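
The priority order can be sketched roughly as follows. Everything here is illustrative: `scrape_return_names` and `merge_labels` are hypothetical helpers, the scraper works on a source string and does not exclude nested functions, and the zero-based `output_{n}` numbering is an assumption.

```python
import ast


def scrape_return_names(source):
    """Per-position names that every `return` in the function agrees on."""
    func_def = ast.parse(source).body[0]
    per_return = []
    for node in ast.walk(func_def):
        if isinstance(node, ast.Return) and node.value is not None:
            value = node.value
            elements = value.elts if isinstance(value, ast.Tuple) else [value]
            # Only plain names count; expressions and literals give None.
            per_return.append(
                [e.id if isinstance(e, ast.Name) else None for e in elements]
            )
    # Give up if there are no returns or they disagree on output count.
    if not per_return or len({len(r) for r in per_return}) != 1:
        return []
    return [col[0] if len(set(col)) == 1 else None for col in zip(*per_return)]


def merge_labels(n_outputs, user=None, annotated=(), scraped=()):
    """User labels are all-or-nothing; otherwise merge position by position."""
    if user is not None:
        if len(user) != n_outputs:
            raise ValueError(f"Expected {n_outputs} labels, got {len(user)}")
        return list(user)
    labels = []
    for i in range(n_outputs):
        candidates = (seq[i] if i < len(seq) else None for seq in (annotated, scraped))
        # First non-None candidate wins; fall back to a positional default.
        labels.append(next((c for c in candidates if c is not None), f"output_{i}"))
    return labels


SOURCE = "def f(a, b):\n    total = a + b\n    return total, a * b\n"
scraped = scrape_return_names(SOURCE)
```

Here `scraped` holds a name for the first output only, so `merge_labels(2, scraped=scraped)` falls back to the positional default for the second one.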

Under the hood, I also added a new data model for the output label metadata -- I get lost quickly when we have magic dictionaries floating around with special keywords I need to memorize. Critically, the model still parses dictionaries and flexibly ignores extra fields. This means that users never need to import (or even know about) the data model and can keep writing their annotations as dictionaries, and that it will play nicely with semantikon by allowing extra metadata to be included in the same dictionary. Further down the road, semantikon can define a new model that inherits from this one (and optionally forbids further extra fields) so we can get rid of magic dictionary keys there too. What's nice is that we should be able to do so step-wise, since this plays perfectly well with semantikon dictionaries in the meantime.
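
As an illustration of the "parse a dictionary, ignore extra fields" behaviour, here is a stdlib-only sketch. The `model_dump()` calls earlier in this thread suggest the real models are pydantic-based; this dataclass is only a stand-in, and `OutputMeta` and `from_dict` are hypothetical names.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class OutputMeta:
    """Hypothetical typed view of the annotation dictionary."""

    label: Optional[str] = None

    @classmethod
    def from_dict(cls, data):
        # Keep only the keys this model knows about; silently drop the rest,
        # so other tools can store their own metadata in the same dictionary.
        known = {f.name for f in fields(cls)}
        return cls(**{key: value for key, value in data.items() if key in known})


meta = OutputMeta.from_dict({"label": "x", "units": "meter"})
```

The `"units"` key is simply ignored here, which is the property that lets plain annotation dictionaries with extra metadata keep working unchanged.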

@liamhuber
Member Author

Aha, I suppose actually that I want this to populate .flowrep_recipe instead of .recipe. I'm on mobile now, but I'll push this later.

@samwaseda, my rough idea is that semantikon parsers/decorators start by running the flowrep parser, and are then free to use its results and do additional parsing to build their own more complex recipe (.semantikon_recipe), i.e. one with sufficiently complex data to run type, unit, and ontological validations. A tool like pyiron_workflow then doesn't do any parsing at all, but relies purely on these recipes to build its management system.
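
A toy version of that layering might look like the following. Both parser functions are hypothetical placeholders rather than flowrep or semantikon APIs; only the attribute names `.flowrep_recipe` and `.semantikon_recipe` come from the discussion above.

```python
import inspect


def flowrep_parse(func):
    """Hypothetical base layer: attach a minimal structural recipe."""
    func.flowrep_recipe = {
        "inputs": list(inspect.signature(func).parameters),
        "fully_qualified_name": f"{func.__module__}.{func.__qualname__}",
    }
    return func


def semantikon_parse(func):
    """Hypothetical richer layer: reuse the base recipe, then add type data."""
    func = flowrep_parse(func)
    func.semantikon_recipe = {
        **func.flowrep_recipe,
        "input_types": {
            name: func.__annotations__.get(name)
            for name in func.flowrep_recipe["inputs"]
        },
    }
    return func


@semantikon_parse
def scale(length: float, factor: int):
    return length * factor
```

A downstream tool would then read `scale.flowrep_recipe` or `scale.semantikon_recipe` and never need to inspect the function itself.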

So we only define a single decorating function

Signed-off-by: liamhuber <[email protected]>
@samwaseda
Member

In the current version of flowrep, the fully qualified name also includes the version from __version__. Not sure if that is to be included.

@liamhuber
Member Author

In the current version of flowrep, the fully qualified name also includes the version from __version__. Not sure if that is to be included.

Honestly, this had slipped off my radar. Yes, I agree that we will need to log the version in the AtomicNode data model. Since this goes back and touches #69 as well, can we agree it needs to be done but leave it out of scope for this PR in particular? I started an issue (#74) to discuss the details.

Base automatically changed from data_models to main January 16, 2026 21:55

3 participants