How to highlight a molecule's substructure with thick red lines in RDKit as SVG (high resolution)

Question

I have the following code:

from rdkit import Chem
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import rdDepictor
from rdkit.Chem.Draw import rdMolDraw2D
from IPython.display import SVG


m = Chem.MolFromSmiles('c1cc(C(=O)O)c(OC(=O)C)cc1')
substructure = Chem.MolFromSmarts('C(=O)O')
print(m.GetSubstructMatches(substructure))
m

This produces the following image:

[image]

However, the code above does not produce a high-resolution image. I want SVG output, so I tried this:

drawer = rdMolDraw2D.MolDraw2DSVG(400,200)
drawer.DrawMolecule(m,highlightAtoms=m.GetSubstructMatch(Chem.MolFromSmarts('C(=O)O')))
drawer.FinishDrawing()
svg = drawer.GetDrawingText().replace('svg:','')
SVG(svg)

But I get:

[image]

What is the correct way to do this?

The code can be tested in my Google Colab.

Tags: python, rdkit

Answer


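GetSubstructMatch returns only the first match, which is why only one of the two groups is highlighted. Use GetSubstructMatches instead. There are several options, depending on the RDKit version you have installed. With RDKit 2021.09.2 (the latest release at the time of writing), the following code should work.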

from rdkit import Chem
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import rdDepictor
from rdkit.Chem.Draw import rdMolDraw2D
from IPython.display import SVG
from copy import deepcopy


def increase_resolution(mol, substructure, size=(400, 200)):
    mol = deepcopy(mol)
    substructure = deepcopy(substructure)
    drawer = rdMolDraw2D.MolDraw2DSVG(size[0], size[1])
    
    # highlightAtoms expects only one tuple, not tuple of tuples. So it needs to be merged into a single tuple
    matches = sum(mol.GetSubstructMatches(substructure), ())
    drawer.DrawMolecule(mol, highlightAtoms=matches)
    
    drawer.FinishDrawing()
    svg = drawer.GetDrawingText()
    
    return svg.replace('svg:','')


mol = Chem.MolFromSmiles('c1cc(C(=O)O)c(OC(=O)C)cc1')
substructure = Chem.MolFromSmarts('C(=O)O')
SVG(increase_resolution(mol, substructure))
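
The merging step mentioned in the comment above can be looked at on its own. The sketch below uses a hard-coded tuple of tuples as a stand-in for the return value of GetSubstructMatches (the indices are illustrative only) and shows two equivalent ways to flatten it. itertools.chain is a swapped-in alternative, not what the answer's code uses.

```python
from itertools import chain

# Illustrative stand-in for mol.GetSubstructMatches(substructure):
# one inner tuple of atom indices per match.
matches = ((3, 4, 5), (8, 9, 7))

# Approach used in the answer: start from an empty tuple and
# concatenate every match onto it.
flat_sum = sum(matches, ())

# Alternative: chain the inner tuples lazily, then materialize.
# This avoids building an intermediate tuple per concatenation.
flat_chain = tuple(chain.from_iterable(matches))

print(flat_sum)
print(flat_chain)
```

For a handful of matches the two are interchangeable. chain.from_iterable mainly pays off when a molecule has many matches.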

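If you get a "ValueError: Bad Conformer Id" error, update the rdkit package to the latest version or try the following, which computes 2D coordinates explicitly before drawing: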

from rdkit import Chem
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import rdDepictor
from rdkit.Chem.Draw import rdMolDraw2D
from IPython.display import SVG
from copy import deepcopy


def increase_resolution(mol, substructure, size=(400, 200), kekulize=True):
    mol = deepcopy(mol)
    substructure = deepcopy(substructure)
    rdDepictor.Compute2DCoords(mol)
    if kekulize:
        Chem.Kekulize(mol) # Localize the benzene ring bonds
        
    drawer = rdMolDraw2D.MolDraw2DSVG(size[0], size[1])
    
    # highlightAtoms expects only one tuple, not tuple of tuples. So it needs to be merged into a single tuple
    matches = sum(mol.GetSubstructMatches(substructure), ())
    drawer.DrawMolecule(mol, highlightAtoms=matches)
    
    drawer.FinishDrawing()
    svg = drawer.GetDrawingText()
    return svg.replace('svg:','')


mol = Chem.MolFromSmiles('c1cc(C(=O)O)c(OC(=O)C)cc1')
substructure = Chem.MolFromSmarts('C(=O)O')
SVG(increase_resolution(mol, substructure, kekulize=True))

This may fail in some cases, for example when the SMILES string encodes chirality. In that case, set kekulize=False.
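
Since the question asks for red lines specifically, here is a minimal sketch that passes the bonds and colours explicitly instead of relying on the drawer's defaults. It assumes a reasonably recent RDKit in which rdMolDraw2D.PrepareAndDrawMolecule is available. The function name highlight_matches_svg and the pure-red colour are choices made for this illustration, not part of the answer above.

```python
from rdkit import Chem
from rdkit.Chem.Draw import rdMolDraw2D

RED = (1.0, 0.0, 0.0)  # assumed colour for this sketch (RGB, 0..1)


def highlight_matches_svg(mol, pattern, size=(400, 200)):
    """Return SVG text with all matches of `pattern` highlighted in red."""
    atom_ids, bond_ids = [], []
    for match in mol.GetSubstructMatches(pattern):
        atom_ids.extend(match)
        # match[i] is the molecule atom matched by pattern atom i, so
        # every pattern bond maps onto one bond of the molecule.
        for pbond in pattern.GetBonds():
            a = match[pbond.GetBeginAtomIdx()]
            b = match[pbond.GetEndAtomIdx()]
            bond_ids.append(mol.GetBondBetweenAtoms(a, b).GetIdx())

    drawer = rdMolDraw2D.MolDraw2DSVG(size[0], size[1])
    # PrepareAndDrawMolecule generates 2D coordinates when they are
    # missing and kekulizes a copy of the molecule by default.
    rdMolDraw2D.PrepareAndDrawMolecule(
        drawer,
        mol,
        highlightAtoms=atom_ids,
        highlightBonds=bond_ids,
        highlightAtomColors={i: RED for i in atom_ids},
        highlightBondColors={i: RED for i in bond_ids},
    )
    drawer.FinishDrawing()
    return drawer.GetDrawingText()


mol = Chem.MolFromSmiles('c1cc(C(=O)O)c(OC(=O)C)cc1')
pattern = Chem.MolFromSmarts('C(=O)O')
svg_text = highlight_matches_svg(mol, pattern)
```

In a notebook, SVG(svg_text) from IPython.display shows the result. Listing the bonds per match keeps a bond that merely connects two neighbouring matches from being highlighted, which can happen when only a merged atom list is supplied.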

