Atomic parser #73

liamhuber · 2026-01-13T22:09:30Z

From python functions to recipe data models

With:

recipe models added as a simple attribute on the function object; the functions are still just functions
output labels automatically scraped when they are (uniformly) available as ast names
unpacking modes (including dataclasses) as discussed with @XzzX
output labels can be explicitly set (as long as these are consistent with the function analysis and unpacking instructions)

E.g.

from flowrep import parser

@parser.atomic("a", "b", unpack_mode="tuple")
def something(x, y):
    return x, y

something.recipe
>>> AtomicNode(
...    type='atomic', 
...    inputs=['x', 'y'], 
...    outputs=['a', 'b'], 
...    fully_qualified_name='__main__.something', 
...    unpack_mode=<UnpackMode.TUPLE: 'tuple'>
... )

I.e. a universal atomic node decorator.
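To make the "functions are still just functions" point concrete, here is a minimal stdlib-only sketch of the pattern. It is not flowrep's implementation: `RecipeSketch`, `scrape_return_names`, and `atomic_sketch` are hypothetical names, and the fallback label `output_0` is an assumption.

```python
import ast
import inspect
import textwrap
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class RecipeSketch:
    """Hypothetical stand-in for a recipe model (not flowrep's AtomicNode)."""

    inputs: tuple[str, ...]
    outputs: tuple[str, ...]
    fully_qualified_name: str


def scrape_return_names(source: str) -> Optional[tuple[str, ...]]:
    """Names returned by the function in `source`, if all returns agree."""
    func_def = ast.parse(textwrap.dedent(source)).body[0]
    if not isinstance(func_def, (ast.FunctionDef, ast.AsyncFunctionDef)):
        raise TypeError("expected a function definition")
    found = set()
    stack = list(func_def.body)
    while stack:
        node = stack.pop()
        if isinstance(
            node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.Lambda, ast.ClassDef)
        ):
            continue  # returns inside nested scopes belong to those scopes
        if isinstance(node, ast.Return):
            value = node.value
            elements = value.elts if isinstance(value, ast.Tuple) else [value]
            if not all(isinstance(e, ast.Name) for e in elements):
                return None  # something other than plain names is returned
            found.add(tuple(e.id for e in elements))
        stack.extend(ast.iter_child_nodes(node))
    # Only trust the scraped names when every return statement is uniform
    return found.pop() if len(found) == 1 else None


def atomic_sketch(*output_labels: str) -> Callable[[Callable], Callable]:
    """Attach a RecipeSketch as `.recipe`; the function is returned unchanged."""

    def decorator(func: Callable) -> Callable:
        if not inspect.isfunction(func):
            raise TypeError(f"can only decorate function definitions, got {func!r}")
        outputs = output_labels or None
        if outputs is None:
            try:
                outputs = scrape_return_names(inspect.getsource(func))
            except (OSError, TypeError, SyntaxError):
                outputs = None  # source unavailable or not a plain def
        func.recipe = RecipeSketch(
            inputs=tuple(inspect.signature(func).parameters),
            outputs=outputs or ("output_0",),  # assumed default label
            fully_qualified_name=f"{func.__module__}.{func.__qualname__}",
        )
        return func

    return decorator
```

Because the decorator returns the original function object, calling the decorated function behaves exactly as before; the recipe is only extra data riding along as an attribute.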

I also spent some time today re-reading @samwaseda's workflow module for parsing the python as a workflow. This is obviously the harder part compared to the simple atomic nodes, but I think what's already there can be massaged to return a model.WorkflowNode. I also think I see a way to allow non-simple function calls even when creating the workflow from python def!


codecov · 2026-01-13T22:16:53Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.05%. Comparing base (01e2965) to head (0d16a82).

Additional details and impacted files

@@               Coverage Diff               @@
##           data_models      #73      +/-   ##
===============================================
+ Coverage        96.52%   97.05%   +0.53%     
===============================================
  Files                4        5       +1     
  Lines              805      950     +145     
===============================================
+ Hits               777      922     +145     
  Misses              28       28



samwaseda

I still would like to have the possibility of setting the output labels by annotations, i.e.:

def f(x) -> typing.Annotated[float, {"label": "output_label"}]:
    ...


liamhuber · 2026-01-15T19:23:56Z

I still would like to have the possibility of setting the output labels by annotations, i.e.:
def f(x) -> typing.Annotated[float, {"label": "output_label"}]:
    ...

Yes, good call. This is now supported, and plays nicely with the unpacking modes such that we can leverage annotations on either the tuple itself or its components. E.g., for

from typing import Annotated

from flowrep import parser

def compass2cart(direction: str) -> Annotated[
    tuple[
        Annotated[int, {"label": "x"}],
        Annotated[int, {"label": "y"}],
    ],
    {"label": "cartesian"}
]:
    match direction.lower():
        case "e":
            return (1, 0)
        case "ne":
            return (1, 1)
        case "n":
            return (0, 1)
        case "nw":
            return (-1, 1)
        case "w":
            return (-1, 0)
        case "sw":
            return (-1, -1)
        case "s":
            return (0, -1)
        case "se":
            return (1, -1)
        case _:
            raise ValueError(f"Needed a compass direction, got {direction}")

We can unpack the sub-labels:

>>> parser.parse_atomic(compass2cart).model_dump()
{'type': <RecipeElementType.ATOMIC: 'atomic'>,
 'inputs': ['direction'],
 'outputs': ['x', 'y'],
 'fully_qualified_name': '__main__.compass2cart',
 'unpack_mode': <UnpackMode.TUPLE: 'tuple'>}

Or ask for the tuple as a whole:

>>> parser.parse_atomic(compass2cart, unpack_mode="none").model_dump()
{'type': <RecipeElementType.ATOMIC: 'atomic'>,
 'inputs': ['direction'],
 'outputs': ['cartesian'],
 'fully_qualified_name': '__main__.compass2cart',
 'unpack_mode': <UnpackMode.NONE: 'none'>}
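For readers unfamiliar with how labels can be pulled out of `Annotated` hints, the following is a rough sketch using only the `typing` module. The helper names (`label_of`, `strip_annotated`, `annotated_labels`) and the `unpack` flag are hypothetical and only mimic the two behaviours shown above; flowrep's own parser may work differently.

```python
from typing import Annotated, Any, Optional, get_args, get_origin, get_type_hints


def label_of(hint: Any) -> Optional[str]:
    """Return the "label" entry from the first dict in an Annotated hint."""
    if get_origin(hint) is Annotated:
        for meta in hint.__metadata__:
            if isinstance(meta, dict) and "label" in meta:
                return meta["label"]
    return None


def strip_annotated(hint: Any) -> Any:
    """Peel off one Annotated layer, if present."""
    return get_args(hint)[0] if get_origin(hint) is Annotated else hint


def annotated_labels(func, unpack: bool = True) -> list[Optional[str]]:
    """Labels from the return hint: per tuple element, or for the whole."""
    # include_extras=True keeps the Annotated metadata instead of dropping it
    return_hint = get_type_hints(func, include_extras=True).get("return")
    inner = strip_annotated(return_hint)
    if unpack and get_origin(inner) is tuple:
        return [label_of(element) for element in get_args(inner)]
    return [label_of(return_hint)]


def origin() -> Annotated[
    tuple[Annotated[int, {"label": "x"}], Annotated[int, {"label": "y"}]],
    {"label": "cartesian"},
]:
    return (0, 0)


print(annotated_labels(origin))                # ['x', 'y']
print(annotated_labels(origin, unpack=False))  # ['cartesian']
```

A `None` entry in the result means "no annotation label here", leaving room for another source to supply one.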

The order of priority for output labels is:

1. User-defined at parse-time
2. Annotations
3. Clever scraping with ast for symbol names
4. A default label based on output position "output_{n}"

(1) is all-or-nothing; give a label for each output port on the node or go home. (2-4) are merged together to get the prettiest output labels available from the code without user intervention.
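The priority rule can be sketched as a small merge function. This is an illustration under stated assumptions, not the parser's code: `resolve_output_labels` and its parameters are hypothetical, `None` marks "no label from this source", and the default is assumed to be zero-indexed.

```python
from typing import Optional, Sequence


def resolve_output_labels(
    n_outputs: int,
    user: Optional[Sequence[str]] = None,
    annotated: Optional[Sequence[Optional[str]]] = None,
    scraped: Optional[Sequence[Optional[str]]] = None,
) -> list[str]:
    """Pick one label per output port following the priority order above."""
    if user is not None:
        # (1) is all-or-nothing: one label per port, no merging with the rest
        if len(user) != n_outputs:
            raise ValueError(f"expected {n_outputs} labels, got {len(user)}")
        return list(user)

    annotated = [None] * n_outputs if annotated is None else list(annotated)
    scraped = [None] * n_outputs if scraped is None else list(scraped)
    if len(annotated) != n_outputs or len(scraped) != n_outputs:
        raise ValueError("each label source must cover every output port")

    # (2-4) are merged port by port: annotation, then ast name, then default
    return [
        a or s or f"output_{i}"  # zero-based index is an assumption
        for i, (a, s) in enumerate(zip(annotated, scraped))
    ]


print(resolve_output_labels(3, annotated=["x", None, None], scraped=[None, "b", None]))
# ['x', 'b', 'output_2']
```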

Under the hood, I also added a new data model for the output label metadata. I get lost quickly when we have magic dictionaries floating around with special keywords I need to memorize. Critically, the model still parses dictionaries, and it flexibly ignores extra fields. This means that users never need to import (or even know about) the data model and can keep writing their annotations as dictionaries, and that it will play nicely with semantikon by allowing extra metadata to be included in the same dictionary. Further down the road, semantikon can define a new model that inherits from this one (and optionally forbids any further extra fields) so we can get rid of magic dictionary keys there too. What's nice is that we should be able to do so step-wise, since this plays perfectly well with semantikon dictionaries in the meantime.
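As a rough illustration of the idea (a typed model that is fed from plain dictionaries and tolerates unknown keys, plus a stricter subclass), here is a stdlib-only sketch. `OutputMetaSketch`, `StrictMetaSketch`, and the `units` field are hypothetical; the actual model in this PR is not shown here and may be built quite differently.

```python
from dataclasses import dataclass, fields
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class OutputMetaSketch:
    """Hypothetical metadata model: only `label` is known, extras are ignored."""

    label: Optional[str] = None

    @classmethod
    def from_annotation(cls, meta: Any) -> "OutputMetaSketch":
        if isinstance(meta, cls):
            return meta
        if isinstance(meta, Mapping):
            known = {f.name for f in fields(cls)}
            # Silently drop keys this model does not know about
            return cls(**{k: v for k, v in meta.items() if k in known})
        raise TypeError(f"cannot parse metadata from {type(meta).__name__}")


@dataclass(frozen=True)
class StrictMetaSketch(OutputMetaSketch):
    """Hypothetical downstream model: adds a field and forbids anything else."""

    units: Optional[str] = None  # illustrative extra field

    @classmethod
    def from_annotation(cls, meta: Any) -> "StrictMetaSketch":
        if isinstance(meta, Mapping):
            unknown = set(meta) - {f.name for f in fields(cls)}
            if unknown:
                raise ValueError(f"unexpected metadata keys: {sorted(unknown)}")
        return super().from_annotation(meta)


# The same dictionary works for both consumers
meta = {"label": "x", "units": "m"}
print(OutputMetaSketch.from_annotation(meta))  # OutputMetaSketch(label='x')
print(StrictMetaSketch.from_annotation(meta))  # StrictMetaSketch(label='x', units='m')
```

The point of the design is that annotation authors keep writing dictionaries, while each consumer decides how much of the dictionary it understands and how strictly to treat the rest.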

liamhuber · 2026-01-15T22:35:32Z

Aha, I suppose actually that I want this to populate .flowrep_recipe instead of .recipe. I'm on mobile now, but I'll push this later.

@samwaseda, my rough idea is that semantikon parsers/decorators start by running the flowrep parser, and are then free to use its results and do additional parsing to build their own more complex recipe (.semantikon_recipe), i.e. one with sufficiently rich data to run type, unit, and ontological validations. A tool like pyiron_workflow then doesn't do any parsing at all, but relies purely on these recipes to build its management system.
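A toy sketch of that layering, with plain dictionaries standing in for the recipe models. Every name here (`flowrep_like_atomic`, `semantikon_like_atomic`, `describe_node`) is hypothetical and none of it reflects the real flowrep, semantikon, or pyiron_workflow APIs; it only shows the division of labour.

```python
import inspect
from typing import Callable


def flowrep_like_atomic(func: Callable) -> Callable:
    """Base layer: record the structural recipe as `.flowrep_recipe`."""
    func.flowrep_recipe = {
        "type": "atomic",
        "inputs": list(inspect.signature(func).parameters),
    }
    return func


def semantikon_like_atomic(func: Callable) -> Callable:
    """Richer layer: run the base parser first, then add type information."""
    if not hasattr(func, "flowrep_recipe"):
        flowrep_like_atomic(func)
    parameters = inspect.signature(func).parameters
    func.semantikon_recipe = {
        **func.flowrep_recipe,
        "input_types": {
            name: p.annotation
            for name, p in parameters.items()
            if p.annotation is not inspect.Parameter.empty
        },
    }
    return func


def describe_node(func: Callable) -> str:
    """Consumer layer: no parsing at all, it only reads an attached recipe."""
    recipe = getattr(func, "semantikon_recipe", None) or func.flowrep_recipe
    return f"{recipe['type']} node with inputs {recipe['inputs']}"


@semantikon_like_atomic
def add(x: float, y: float):
    return x + y


print(describe_node(add))  # atomic node with inputs ['x', 'y']
```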


samwaseda · 2026-01-16T03:57:06Z

In the current version of flowrep, the fully qualified name also includes the version from __version__. Not sure if that is to be included.

liamhuber · 2026-01-16T16:25:33Z

In the current version of flowrep, the fully qualified name also includes the version from __version__. Not sure if that is to be included.

Honestly, this had slipped off my radar. Yes, I agree that we will need to log the version in the AtomicNode data model. Since this goes back and touches #69 as well, can we agree it needs to be done but leave it out of scope for this PR in particular? I started an issue (#74) to discuss the details.

liamhuber changed the base branch from main to data_models January 13, 2026 22:09

liamhuber added 7 commits January 14, 2026 07:40

Parse atomic recipes

d120466

From python functions Signed-off-by: liamhuber <[email protected]>

Allow explicit output labels

f6f94dd

Which must be consistent with the function analysis and output unpacking instructions Signed-off-by: liamhuber <[email protected]>

Lint

2e7d74a

Signed-off-by: liamhuber <[email protected]>

Re-narrow model typing

13192c0

But still keep the linter happy Signed-off-by: liamhuber <[email protected]>

Test more output possibilities

0948088

`return` and `return None` are handled differently, but actually I'm OK with that right now. Signed-off-by: liamhuber <[email protected]>

Rename tests

dc07e4a

Signed-off-by: liamhuber <[email protected]>

Cover exception with test

cfa6461

Signed-off-by: liamhuber <[email protected]>

liamhuber force-pushed the atomic_parser branch from f7195ad to cfa6461 Compare January 14, 2026 15:41

liamhuber requested review from XzzX and samwaseda January 14, 2026 15:56

samwaseda reviewed Jan 15, 2026


liamhuber and others added 11 commits January 15, 2026 07:51

Update flowrep/parser.py

7bd01ff

Co-authored-by: Sam Dareska <[email protected]>

Update flowrep/parser.py

3bda97e

Co-authored-by: Sam Dareska <[email protected]>

Merge branch 'data_models' into atomic_parser

793db69

Allow output labels to be set by annotation

3c573b9

Per @samwaseda's request Signed-off-by: liamhuber <[email protected]>

Introduce a model for metadata

aaafdd1

No magic dictionary keys! Signed-off-by: liamhuber <[email protected]>

Black

e91522d

Signed-off-by: liamhuber <[email protected]>

Test exception cases

648d0ec

Signed-off-by: liamhuber <[email protected]>

Fix return hint

319d620

Signed-off-by: liamhuber <[email protected]>

Satisfy mypy

5f22a10

Signed-off-by: liamhuber <[email protected]>

Complete hints

5af330a

Signed-off-by: liamhuber <[email protected]>

Fail earlier and cleaner

f3ce0b8

When trying to decorate things that aren't function definitions. Signed-off-by: liamhuber <[email protected]>

liamhuber added 2 commits January 15, 2026 15:10

Rename recipe to flowrep_recipe

263ad54

Signed-off-by: liamhuber <[email protected]>

Refactor atomic decorator

29cb12b

So we only define a single decorating function Signed-off-by: liamhuber <[email protected]>

Merge branch 'data_models' into atomic_parser

0d16a82

liamhuber mentioned this pull request Jan 16, 2026

Questions on storing version data #74

Open

Base automatically changed from data_models to main January 16, 2026 21:55

samwaseda approved these changes Jan 17, 2026
