Skip to content

Added instance method to generate MolBar from Structure#236

Open
crjacinto wants to merge 1 commit into
faccts:mainfrom
crjacinto:molbar
Open

Added instance method to generate MolBar from Structure#236
crjacinto wants to merge 1 commit into
faccts:mainfrom
crjacinto:molbar

Conversation

@crjacinto
Copy link
Copy Markdown
Contributor

@crjacinto crjacinto commented Apr 29, 2026

Closes Issues

Closes #227

Description

  • Added an instance method that generates the Molecular Barcode from Structure.

Release Notes

(Project adheres to Keep a Changelog; Every entry should start in upper-case and end with (#<pr-id>); Remove sections that don't apply)

Changed

  • Modified structure.py.

@crjacinto crjacinto requested a review from a team as a code owner April 29, 2026 13:04
@haneug haneug added enhancement New feature or request side input Concerning writing ORCA input labels Apr 30, 2026
@haneug
Copy link
Copy Markdown
Contributor

haneug commented Apr 30, 2026

@crjacinto please repair the title and the description of the PR

@crjacinto crjacinto changed the title Added class method to generate the Molecular Barcode (Molbar) from St… Added instance method to generate MolBar from Structure Class Apr 30, 2026
@crjacinto crjacinto changed the title Added instance method to generate MolBar from Structure Class Added instance method to generate MolBar from Structure Apr 30, 2026
@crjacinto
Copy link
Copy Markdown
Contributor Author

@crjacinto please repair the title and the description of the PR

Done

Copy link
Copy Markdown
Contributor

@haneug haneug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice PR! Please split the function according to the return types and change the docstring to numpy style. Also an enum for the mode would be a good thing for verification I think.

Comment thread src/opi/input/structures/structure.py
return_data : bool, default False
If ``True``, return the full MolBar data dictionary in addition to
the barcode string.
mode : str, default ``"mb"``
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please set up a string enum to verify that valid string options are passed. Something like this:

class MolBarMode(StrEnum):
    MB = "mb"
    TOPO = "topo"

and then you can do:

  try:
      mode = MolBarMode(mode)
  except ValueError:
      raise ValueError(f"Invalid mode {mode!r}. Must be one of {[m.value for m in MolBarMode]}")

Either use StrEnum or use StringEnum from opi/src/opi/models/string_enum.py if it can be case insensitive.


Returns
-------
str
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please split this into two separate functions, one returning a str, one returning tuple[str, dict[str, Any]. A function that changes its return type based on a keyword argument makes type hints unreliable and is easy to misuse. Maybe make to_molbar return the string and to_molbar_dict or to_molbar_full return the whole data.
Also please document or give the link to the documentation what this whole data even means.

# > iterate self.atoms and call atom.format_xyz_line() on each entry.
# > Each line has the format "SYMBOL x y z" (standard XYZ).
elements: list[str] = []
coords: list[list[float]] = []
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this also be list[tuple[float,float,float]] since we only have x, y, z?

elements.append(parts[0])
coords.append([float(parts[1]), float(parts[2]), float(parts[3])])

if not elements:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could check for if self.atoms before trying to loop over them and then the error message would fit better.

)

# > Build (N, 3) coordinate array for type safety; convert back to list for MolBar
coordinates: list[list[float]] = np.array(coords, dtype=np.float64).tolist()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This step seems redundant to me. You already converted everything to floats?

mode : str, default ``"mb"``
MolBar calculation mode. ``"mb"`` computes the full barcode;
``"topo"`` computes only the topology part.
total_charge : int | None, default None
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would directly go with the default and I do not directly see the advantage of setting this via a keyword argument. One can always just change self.charge?


for atom in self.atoms:
parts = atom.format_xyz_line().split()
elements.append(parts[0])
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be problematic when you have fragments in your structure. Please check that as well and use atom.element instead!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also you should make sure to just include actual atoms and not e.g., point charges do that by checking if it is an instance of Atom

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and use atom.coordinates for x,y,z

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding a Structure.get_coordinates() which returns all coordinates as single numpy array.
This can easily be converted to a list of lists using ndarray.tolist().

If this structure contains no real :class:`~opi.input.structures.atom.Atom`
instances.
"""
try:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good PR. Thanks for adding this MolBar.

Import statements should always be on module level. There are more less only two reasons to do them on a lower level:

  1. For performance, e.g. to speed up CLI auto-completion.
  2. If it would lead to a circular import.

It does not seem to me, that either criteria is met here.

The import could look as follows:

from functools import wraps

try:
  from molbar.barcode import get_molbar_from_coordinates
except ImportError:
  # > Making sure the identifier is the global namespace
  get_molbar_from_coordinates = None

def requires_molbar(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
      if get_molbar_from_coordinates is None:
        raise raise ImportError("MolBar is not installed. Install it with: pip install molbar")
      return func(*args, **kwargs)
    return wrapper

# [...]

@requires_molbar
def to_molbar(...):


for atom in self.atoms:
parts = atom.format_xyz_line().split()
elements.append(parts[0])
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding a Structure.get_coordinates() which returns all coordinates as single numpy array.
This can easily be converted to a list of lists using ndarray.tolist().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request side input Concerning writing ORCA input

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Simple integration of MolBar into OPI´s Structure Object

3 participants