LACAN

LACAN filter: Leveraging adjacent co-occurrence of atomic neighborhoods for molecular filtering

"All sorts of things in the world behave like mirrors" -Jacques Lacan

Some molecular fragments are common, yet they tend not to occur together. For example, alkyloxy radicals are frequent motifs in medicinal chemistry datasets, whereas the linkage of two such radicals into a peroxide is rather uncommon. Likewise, halides and amines are some of the most commonly occurring atomic neighborhoods, and yet their pairing results in the unstable and toxic haloamine motif. We apply this concept using co-occurrences of ECFP2-like atomic neighborhoods at the bond interface, and leverage the co-occurrence patterns to construct a molecular filter that highlights uncommon linkages.
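To make the idea concrete, here is a toy sketch in plain Python. It is not LACAN's actual scoring (this README does not spell out the statistic). The environment labels, the counts, and the `enrichment` helper are all made up for illustration, and the sketch simply compares how often a pair of environments is bonded against what independent pairing would predict.

```python
from collections import Counter

# Toy reference set: every bond is a pair of made-up environment labels.
# These counts are invented for illustration, not taken from ChEMBL.
reference_bonds = (
    [("alkyl-O", "alkyl-C")] * 40
    + [("amine-N", "alkyl-C")] * 40
    + [("halide", "aryl-C")] * 19
    + [("alkyl-O", "alkyl-O")] * 1  # a single peroxide
)


def count_environments(bonds):
    """Count how often each environment and each unordered pair occurs."""
    env_counts = Counter()
    pair_counts = Counter()
    for a, b in bonds:
        env_counts[a] += 1
        env_counts[b] += 1
        pair_counts[frozenset((a, b))] += 1
    return env_counts, pair_counts


def enrichment(a, b, env_counts, pair_counts):
    """Observed pair frequency divided by the frequency expected if bond
    ends paired up independently. Both labels must occur in the reference."""
    n_bonds = sum(pair_counts.values())
    p_a = env_counts[a] / (2 * n_bonds)  # each bond has two ends
    p_b = env_counts[b] / (2 * n_bonds)
    expected = p_a * p_b if a == b else 2 * p_a * p_b
    observed = pair_counts[frozenset((a, b))] / n_bonds
    return observed / expected


env_counts, pair_counts = count_environments(reference_bonds)
ether = enrichment("alkyl-O", "alkyl-C", env_counts, pair_counts)
peroxide = enrichment("alkyl-O", "alkyl-O", env_counts, pair_counts)
haloamine = enrichment("halide", "amine-N", env_counts, pair_counts)
print(f"ether: {ether:.2f}, peroxide: {peroxide:.2f}, haloamine: {haloamine:.2f}")
```

In this toy data the ether linkage is over-represented, the peroxide is under-represented, and the haloamine never occurs even though both of its halves are common, which is the kind of pattern the filter is meant to flag.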

Current version

This is version 0.0.2alpha. It is still experimental, and breaking changes are to be expected. Several changes have been made since 0.0.1alpha, including a switch to hashing the environments manually instead of relying on hacky usage of the RDKit Morgan fingerprint generator. This version also introduces functionality for molecule generation, which is currently unoptimized and subject to change.

Installation

Clone this repo, activate your environment, navigate to the root directory, and run:

pip install .

Example notebooks

Some notebooks with typical use cases are provided in lacan/example_notebooks. Note that these notebooks need Jupyter installed in the Python environment. The molecule generation notebook additionally requires scikit-learn.

Basic usage: Localizing problem bonds

Import lacan and inspect a molecule by running the following commands:

from lacan import lacan
from rdkit import Chem
p = lacan.load_profile("chembl")
m = Chem.MolFromSmiles("c1ccccc1CCN(OCCc1occc1)")
score,info = lacan.score_mol(m,p)
print(info["bad_bonds"])

This outputs a dictionary with an entry for every bond in the molecule. The filter is currently binary: the score is 1 if the molecule passes the filter and 0 if it does not. The reported problem bonds follow RDKit bond numbering, so they can be visualized directly in the molecule:

from rdkit.Chem import Draw
d = Draw.MolToImage(m,highlightBonds=info["bad_bonds"])
display(d)

giving the following result:

This correctly identifies the N-O linkage as problematic.
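Because the score is binary, filtering a batch of molecules reduces to keeping those that score 1. The helper below is a hypothetical sketch, not part of LACAN: `keep_passing` and the stand-in scorer `fake_score_mol` are invented here so the example runs without LACAN or RDKit installed. The only assumption carried over from the usage above is that the scorer is called as `score_fn(mol, profile)` and returns a `(score, info)` pair.

```python
def keep_passing(mols, profile, score_fn):
    """Return the molecules whose binary filter score equals 1.

    score_fn is expected to behave like the score_mol call shown above:
    it takes (mol, profile) and returns (score, info).
    """
    kept = []
    for mol in mols:
        if mol is None:  # RDKit returns None for SMILES it cannot parse
            continue
        score, _info = score_fn(mol, profile)
        if score == 1:
            kept.append(mol)
    return kept


def fake_score_mol(mol, profile):
    """Stand-in scorer for demonstration only: treats a string as a chain of
    atoms and flags any adjacent pair of characters listed in `profile`."""
    bad = [i for i in range(len(mol) - 1) if mol[i:i + 2] in profile]
    return (0 if bad else 1), {"bad_bonds": bad}


demo = keep_passing(["CCO", "CCNOC", None, "COOC"], {"NO", "OO"}, fake_score_mol)
print(demo)
```

With LACAN installed, the equivalent call would presumably be `keep_passing(mols, p, lacan.score_mol)` on a list of RDKit molecules.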

Breeding molecules

This filter enables us to recombine fragments and filter out linkages that are rare in the reference set. LACAN has a "breeding" or crossover functionality in which two molecules are fragmented and recombined. By subjecting the recombinations to the LACAN filter, we can retain only decent-looking "median molecules".

Example:

from lacan import breed
from rdkit import Chem
from rdkit.Chem import Draw

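# Parent molecules: paroxetine (m1) and fluoxetine (m2)
m1 = Chem.MolFromSmiles("c1cc(ccc1[C@@H]2CCNC[C@H]2COc3ccc4c(c3)OCO4)F")
m2 = Chem.MolFromSmiles("CNCCC(C1=CC=CC=C1)OC2=CC=C(C=C2)C(F)(F)F")
# p is the profile loaded earlier with lacan.load_profile("chembl")
median_molecules = breed.breed(m1,m2,p,nmols=9)

This returns molecules that are "in between" their parents, paroxetine and fluoxetine. They can be displayed as a grid: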

d = Draw.MolsToGridImage(median_molecules)
display(d)
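The recombine-then-filter idea can be sketched without RDKit. The toy below is not how `breed.breed` is implemented. It represents each parent as a linear list of made-up fragment labels, joins every head of one parent to every tail of the other, and drops any child whose newly formed junction is on a hypothetical list of rare linkages.

```python
def crossover(parent_a, parent_b, is_rare_linkage):
    """Join each head of parent_a to each tail of parent_b, skipping
    children whose new junction is a rare linkage. Duplicates are dropped."""
    children = []
    for i in range(1, len(parent_a)):
        for j in range(1, len(parent_b)):
            head, tail = parent_a[:i], parent_b[j:]
            if is_rare_linkage(head[-1], tail[0]):
                continue  # only the newly created bond is checked
            child = head + tail
            if child not in children:
                children.append(child)
    return children


# Hypothetical set of linkages treated as rare in the reference data.
RARE = {frozenset(("O", "O")), frozenset(("N", "O"))}


def is_rare(x, y):
    return frozenset((x, y)) in RARE


parent_a = ["aryl", "O", "C"]
parent_b = ["N", "C", "O", "aryl"]
kids = crossover(parent_a, parent_b, is_rare)
for kid in kids:
    print("-".join(kid))
```

Here the aryl-O-O-aryl child is rejected because its junction is a peroxide, while the other recombinations are kept.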

Generating molecules

Random molecules can be generated as follows, reusing the profile p loaded above:

ms = gen.generate_filtered_molecules(n_jobs=-1,
                                     n_molecules=9,
                                     profile=p,
                                     seed=456,
                                     min_atoms=20)

For generation towards a goal, see the example notebooks, which showcase this functionality.

Building a profile

If you want to build a custom profile from your own reference data set, you can do so through the LACAN CLI as follows:

python lacan.py -i your_dataset_here.smi -m profile -p my_new_profile

This will create a pickled profile in the data folder which you can then invoke using:

p = lacan.load_profile("my_new_profile")
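For readers unfamiliar with pickled files, the round trip looks roughly like the sketch below. The dictionary layout, the `.pkl` extension, and the temporary directory are assumptions made for illustration; the README does not document how LACAN structures or names its profile files.

```python
import os
import pickle
import tempfile

# Hypothetical profile contents; LACAN's real profile layout may differ.
profile = {"pair_counts": {("alkyl-O", "alkyl-C"): 40}, "n_bonds": 100}

with tempfile.TemporaryDirectory() as data_dir:
    path = os.path.join(data_dir, "my_new_profile.pkl")
    # Serialize the profile to disk.
    with open(path, "wb") as fh:
        pickle.dump(profile, fh)
    # Load it back, as a load_profile-style helper would.
    with open(path, "rb") as fh:
        restored = pickle.load(fh)

print(restored == profile)
```

Since unpickling can execute arbitrary code, profiles should only be loaded from sources you trust.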
