Bio.Align local alignment with Smith-Waterman causes a memory leak

Problem description

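I have a list of permutations of DNA sequences, and I compute an alignment score for each pair of sequences. I don't understand why this process leaks memory when the permutation list is large, given that the aligner object is created anew on each iteration. Here is an example of the score calculation: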

for sequence1, sequence2 in sequence_permutation:
   score = self.__calculate_sequence_similarity(sequence1, sequence2)
   alignments[sequence1].append(sequence2)

save_aligments(alignments)

import math

def __calculate_score_alignment(self, sequence1, sequence2):
   from Bio import Align
   from Bio.Align import substitution_matrices

   # A new aligner is built, and BLOSUM62 is reloaded, on every call
   aligner = Align.PairwiseAligner()
   aligner.mode = 'local'
   aligner.substitution_matrix = substitution_matrices.load('BLOSUM62')
   return aligner.score(sequence1, sequence2)


def __calculate_sequence_similarity(self, sequence1: str, sequence2: str) -> float:
   # If either sequence is empty the normalization below is undefined
   if not sequence1 or not sequence2:
      return -1

   score = self.__calculate_score_alignment(sequence1, sequence2)
   score1 = self.__calculate_score_alignment(sequence1, sequence1)
   score2 = self.__calculate_score_alignment(sequence2, sequence2)

   # Normalize by the geometric mean of the two self-alignment scores
   return score / (math.sqrt(score1) * math.sqrt(score2))

Tags: python, bioinformatics, biopython, sequence-alignment

Solution
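The source gives no confirmed answer, so what follows is only a sketch of one restructuring that may be worth trying. It builds the aligner and loads the substitution matrix once, caches each sequence's self-alignment score, and yields results instead of accumulating them. The names `make_biopython_scorer`, `SimilarityCalculator`, and `iter_similarities` are hypothetical and do not come from the question. Visiting each unordered pair once assumes the score is symmetric in its two arguments. Whether this removes the memory growth described in the question is not established.

```python
import math
from itertools import combinations
from typing import Callable, Dict, Iterable, Iterator, Tuple

ScoreFn = Callable[[str, str], float]


def make_biopython_scorer() -> ScoreFn:
    """Hypothetical helper: build ONE aligner and return its score method.

    Biopython is imported lazily so the rest of the sketch runs without it.
    """
    from Bio import Align
    from Bio.Align import substitution_matrices

    aligner = Align.PairwiseAligner()
    aligner.mode = "local"
    # Same matrix as in the question, loaded a single time
    aligner.substitution_matrix = substitution_matrices.load("BLOSUM62")
    return aligner.score


class SimilarityCalculator:
    """Hypothetical class: normalized similarity with cached self-scores."""

    def __init__(self, score_fn: ScoreFn) -> None:
        self._score = score_fn
        self._self_scores: Dict[str, float] = {}

    def _self_score(self, sequence: str) -> float:
        # Each sequence is aligned against itself at most once
        if sequence not in self._self_scores:
            self._self_scores[sequence] = self._score(sequence, sequence)
        return self._self_scores[sequence]

    def similarity(self, sequence1: str, sequence2: str) -> float:
        if not sequence1 or not sequence2:
            return -1.0
        score = self._score(sequence1, sequence2)
        norm = math.sqrt(self._self_score(sequence1)) * math.sqrt(self._self_score(sequence2))
        return score / norm


def iter_similarities(
    sequences: Iterable[str], calculator: SimilarityCalculator
) -> Iterator[Tuple[str, str, float]]:
    # Assumes a symmetric score, so each unordered pair is visited once.
    # Yielding lets the caller write results out instead of holding them all.
    for a, b in combinations(sequences, 2):
        yield a, b, calculator.similarity(a, b)
```

With Biopython installed, `SimilarityCalculator(make_biopython_scorer())` would give a calculator backed by a single `PairwiseAligner`. Note that the self-score cache and any dictionary of results still grow with the input, which is ordinary memory use rather than a leak.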

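Before changing anything, it may help to check where memory actually grows. The sketch below uses the standard library's `tracemalloc` to compare snapshots taken before and after a piece of work. The helper name `top_memory_growth` is hypothetical. One caveat: `tracemalloc` only sees allocations made through Python's memory allocators, so growth inside compiled extension code that calls `malloc` directly would not appear here and would need to be checked through the process's resident memory instead.

```python
import tracemalloc
from typing import Callable, List, Tuple


def top_memory_growth(work: Callable[[], object], limit: int = 5) -> List[Tuple[str, int]]:
    """Hypothetical helper: run work() and list the source lines whose
    traced allocations grew the most, as (location, bytes_gained) pairs."""
    tracemalloc.start()
    try:
        before = tracemalloc.take_snapshot()
        result = work()  # held in a local so live results are still counted
        after = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    # compare_to() sorts by the absolute size difference, largest first
    stats = after.compare_to(before, "lineno")
    return [(str(stat.traceback[0]), stat.size_diff) for stat in stats[:limit]]
```

Passing a function that runs a few thousand iterations of the scoring loop would show whether the growth comes from the accumulated `alignments` dictionary or from somewhere else.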
