Try to use new LLM phi3 #memo #LLM
Is Life Worth Living?
by iwatobipen
1d ago
As the name "large language model" implies, these models need a lot of GPU memory to run, so they are not very cost-effective for personal use ;) To overcome this limitation, many technologies have been developed and are still being developed. LLAMA-cpp is one of them. Today I would like to share a new model named phi3, developed by Microsoft. The original article is available on arXiv. https://arxiv.org/pdf/2404.14219 From the abstract, “phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal t ..read more
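The GPU memory point can be made concrete with a little arithmetic. The sketch below estimates the memory needed just to hold the weights of a 3.8 billion parameter model (the size quoted for phi-3-mini) at several precisions. The function name `weight_memory_gib` and the list of precisions are my own assumptions for illustration. Quantization is assumed here as one common way to shrink a model; the estimate ignores activations and the KV cache, and real quantized file formats carry some overhead beyond the nominal bits per weight.

```python
# Rough weights-only memory estimate; names and precisions are illustrative assumptions.

def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Return the memory needed to store the weights alone, in GiB."""
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / 1024**3


PHI3_MINI_PARAMS = 3.8e9  # parameter count quoted in the phi-3 abstract

for label, bits in [("fp32", 32), ("fp16", 16), ("8-bit", 8), ("4-bit", 4)]:
    gib = weight_memory_gib(PHI3_MINI_PARAMS, bits)
    print(f"{label:>5}: about {gib:.1f} GiB for weights only")
```

Under these assumptions, halving the bits per weight halves the footprint, which is why a small model combined with low-precision weights can fit on a personal machine.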
Generate new molecules from fragments with Diffusion model #cheminformatics #rdkit #difflinker #memo
Is Life Worth Living?
by iwatobipen
4d ago
Designing linked molecules from fragments is an important task in drug design, for example in FBDD, scaffold hopping (e.g. replacing a core) and PROTAC molecule design. As readers know, there are lots of solutions for this; for example, BROOD is a well-known commercial package for fragment replacement. I can’t use commercial packages for my hobby, so I like OSS that can be applied to compound design ;) Today I would like to share difflinker, which can design linkers with a diffusion model. The original article is open access, so I would like to share the URL. https://www.nature.com/articles/s42256-024-00815 ..read more
Try to use new version of REINVENT #cheminformatics #memo #rdkit
Is Life Worth Living?
by iwatobipen
6d ago
As many cheminformaticians know (I expect…), REINVENT, developed by the AZ team, is one of the most useful and famous AI-based compound generators in the cheminformatics field. The new version, REINVENT 4, is still actively developed; recently its version was bumped and some useful code was added to the repository. The DL framework has moved from pytorch 1.x to 2.x, and cool example notebooks are available. https://github.com/MolecularAI/REINVENT4 I tried it today ;) If you’ve already installed reinvent, you should update it before running the new code. The way to update is described in README.md. #Updating depe ..read more
Edit atom indices of RDKit Mol object #memo #cheminformatics
Is Life Worth Living?
by iwatobipen
3w ago
Atom indices are unique numbers for each atom, and RDKit assigns them when a mol object is generated. RDKit makes mol objects from SMILES, InChI, molblock and lots of other formats. To make a canonical representation of molecules, the atom indices are always assigned automatically by the same rules. The indices are assigned regardless of scaffold, so two molecules that have the same scaffold but different substituents will not have the same indices on the scaffold. To keep track of atoms, atom map numbers are sometimes used, but I would like to know whether it is possible to do the same thing with atom indices, it m ..read more
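To illustrate the scaffold index problem, here is a minimal sketch with RDKit. It shows that the same benzene scaffold gets different atom indices in two molecules, and then renumbers each molecule so that the scaffold atoms come first. The helper name `scaffold_first` and the example molecules are my own assumptions for illustration; this is one possible approach, not necessarily the one used in the original post.

```python
from rdkit import Chem

# Example molecules are illustrative assumptions: same scaffold, different substituents.
scaffold = Chem.MolFromSmiles("c1ccccc1")
mol_a = Chem.MolFromSmiles("Cc1ccccc1")    # toluene
mol_b = Chem.MolFromSmiles("CCOc1ccccc1")  # ethoxybenzene


def scaffold_first(mol, core):
    """Hypothetical helper: renumber atoms so the matched core atoms get indices 0..n-1."""
    match = mol.GetSubstructMatch(core)
    if not match:
        raise ValueError("scaffold not found in molecule")
    rest = [i for i in range(mol.GetNumAtoms()) if i not in match]
    # RenumberAtoms takes the old index for each new position.
    return Chem.RenumberAtoms(mol, list(match) + rest)


# Indices follow the order of atoms in the input SMILES, so the ring indices differ.
match_a = mol_a.GetSubstructMatch(scaffold)
match_b = mol_b.GetSubstructMatch(scaffold)
print("before:", match_a, match_b)

new_a = scaffold_first(mol_a, scaffold)
new_b = scaffold_first(mol_b, scaffold)
print("after: ", new_a.GetSubstructMatch(scaffold), new_b.GetSubstructMatch(scaffold))
```

After renumbering, the scaffold occupies indices 0 to 5 in both molecules, while the molecules themselves are unchanged (their canonical SMILES stay the same).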
Enumerate molecules with CXSMILES #RDKit #cheminforamtics #memo
Is Life Worth Living?
by iwatobipen
3w ago
Recent versions of RDKit support not only SMILES but also CXSMILES (ChemAxon extended SMILES). As the name shows, CXSMILES can add lots of information to SMILES strings, but it is a little difficult to understand. However, I found that it is useful for enumerating molecules ;) For example, if I would like to enumerate a core and two R-groups from SMILES, there are lots of ways to do that. MolZip is one useful way; however, annotation is required to enumerate molecules. Today I tried to enumerate molecules from CXSMILES with the molecule enumerator. More details of the Enumerator are described in Greg’s ..read more
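As a rough idea of what enumeration from CXSMILES can look like, here is a minimal sketch using `rdMolEnumerator.Enumerate`. The input CXSMILES is my own assumption for illustration: it uses the variable attachment point (`m:`) extension so that a methoxy group can attach to any of three ring atoms of pyridine. It is not the core and R-group example from the original post, and it assumes a recent RDKit release that supports this CXSMILES feature.

```python
from rdkit import Chem
from rdkit.Chem import rdMolEnumerator

# Illustrative input (an assumption): atom 2 (*) may attach to ring atoms 3, 5 or 4.
cxsmiles = "CO*.C1=CC=NC=C1 |m:2:3.5.4|"
mol = Chem.MolFromSmiles(cxsmiles)

# Enumerate returns a MolBundle holding one molecule per attachment position.
bundle = rdMolEnumerator.Enumerate(mol)

products = sorted(
    {Chem.MolToSmiles(bundle.GetMol(i)) for i in range(len(bundle))}
)
for smi in products:
    print(smi)
```

Each attachment position gives a distinct methoxypyridine isomer, so the bundle here should contain three molecules.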
Current status of DMTA cycle in AZ #memo #DDT #publication #AI/ML
Is Life Worth Living?
by iwatobipen
1M ago
March is the end of the fiscal year at most Japanese companies. I have spent lots of time on paperwork these days… ;P As many readers know, the DMTA cycle is key to the drug discovery/optimization process, and lots of computational and high-throughput experimental approaches have recently become available for it. I think AZ is the front runner in this area, and I respect their activity because they not only publish papers but also share the code. And by sharing the code, the code is improved by a good open science community. Today I read a nice open access publication, published by Gian Marco et al. fr ..read more
Predict protein-ligand complex dynamics with python #dynamicbind #cheminformatics #memo
Is Life Worth Living?
by iwatobipen
1M ago
Computer-aided drug design is a powerful approach for drug discovery these days. A docking study of a target protein and ligands is a common procedure to evaluate whether a compound fits the target protein’s pocket or not. However, the method has a limitation. Most docking approaches handle the protein and ligand as rigid bodies, so it’s difficult to consider the dynamics. Molecular dynamics can solve the limitation, but it requires lots of computational resources. With the recent progress of Protein Language Models (PLMs), these models can predict a protein’s 3D structure from its sequence. But these predi ..read more
Pocket-aware structure generation #DiffDec #cheminformatics
Is Life Worth Living?
by iwatobipen
1M ago
Diffusion models are one of the hot areas of generative modeling, not only in computer vision but also in cheminformatics. Diffusion models are interesting because they generate objects from noise. BTW, de novo compound design with target protein structure information is a really attractive but difficult approach in drug design. There are some approaches that do it with a diffusion model, such as DiffSBDD. DiffSBDD generates molecules with pocket information but sometimes generates strange molecules. One reason is that DiffSBDD generates the whole molecule from the diffusion process without any information of ..read more
Update rdkit/shape-it #RDKit #shape-it #cheminformatics
Is Life Worth Living?
by iwatobipen
2M ago
Today I tried to build rdkit/shape-it because I could not build shape-it with the current version of rdkit. So I struggled with the error messages to fix the issue. (I’m not so good at C++ ;P) There are two issues in the current code. 1. The version of C++ in CMakeLists.txt is old, so I changed it from c++14 to c++17. 2. The version of Catch2 is old too. The recent version of catch2 is 3.x. The old version of catch2 causes problems during the make process, so I updated CMakeLists.txt and some cpp files. After the modification, the build process worked well and I could call shape-it from python. Here is the install process. $ g ..read more
Visualize feature importance with marimo #cheminformatics #RDKit #marimo
Is Life Worth Living?
by iwatobipen
2M ago
I recently posted about marimo, a new generation of notebook. It is cool and makes it easy to build an interactive analysis environment with python. I’m interested in the package and am thinking about how to use it in chemoinformatics tasks. In QSAR tasks, chemoinformaticians are often asked for the reason behind a model’s prediction, so XAI (explainable AI) is an attractive area in the field. rkakamilan shared really useful posts about visualizing the feature importance of ML models on his blog, with code. Now I’m writing code to visualize ML weights with compound structures, and most of the code came from his post. There are lots of ..read more