首页 > 解决方案 > python优化目标搜索算法

问题描述

我正在优化一些数据搜索,并就所采用的方法和用于数据的最佳结构寻找一些建议。目标是在整数可排序数据中找到一组不同的匹配,其差异是给定的目标值。我已经仔细研究过分别查找匹配计数和匹配列表,但是对于如何最好地结合这两种方法有点迷失。任何建议都会有所帮助!

到目前为止,这是我汇总的内容:

#!/usr/bin/env python2.7

# Evaluator
# given a list of integer-sortable data
# find pairs whose difference is a given number

from collections import Counter


def pairs_diff_target(numbers, target):
  """Find pairs, ignoring duplicates and the 0 target."""
  if len(numbers) < 2 or target == 0:
    return None

  number_map = {}
  for i in numbers:
    number_map[i] = True

  pairs = []
  for i in numbers:
    pair = (i + target)
    if pair in number_map:
      pairs.append((pair, i))
      del number_map[i]

  return pairs

def duplicates_match_count(numbers, target):
  """Return a count of matches"""
  if len(numbers) < 2:
    return None

  # Use a counter to count occurrences
  counter = Counter()
  for number in numbers:
    counter[number] += 1

  if target == 0:
    # Return nC2 combinations
    return ((counter[number] * counter[number] - 1) // 2)
  else:
    matches = 0
    for number in numbers
      target_match = (number + target)
      matches += counter[number] * counter[target_match]
      del counter[number]
    return matches


def duplicates_match_list(numbers, target):
  """Return the actual list of matching elements"""
  if len(numbers) < 2:
    return None

  # Possible implementation better than n^2


if __name__ == "__main__":
  nums = [1,1,1,1,0,1,1,2,2,2,2,2,2,3,6]

  distinct_pairs = duplicates_match_count(nums, 0)
  # all 1,1 and 2,2 pairs
  assert len(distinct_pairs) == 60

  distinct_pairs = duplicates_match_count(nums, 1)
  # all 1,2 pairs and 2,3 pairs
  assert len(distinct_pairs) == 42

  distinct_pairs = duplicates_match_count(nums, 3)
  # (0,3), (3,6)
  assert len(distinct_pairs) == 2

标签: pythonalgorithm

解决方案


您可以建立一个对重复的列表,只计算每对的计数,而不是立即生成所有实例。这应该在 O(n) 时间内工作:

from collections import Counter

def pairsWithDifference(target,numbers):
    counts = Counter(numbers)
    pairs  = [ (n,n+target,(c-0**target)*counts[n+target]) for n,c in counts.items() ]
    return [ pair for pair in pairs if pair[2] > 0 ]

nums = [1,1,1,1,0,1,1,2,2,2,2,2,2,3,6]
pd0 = pairsWithDifference(0,nums) # [(1, 1, 30), (2, 2, 30)] :: 30+30 = 60
pd1 = pairsWithDifference(1,nums) # [(1, 2, 36), (0, 1, 6), (2, 3, 6)] 36+6+6 = 48
pd2 = pairsWithDifference(2,nums) # [(1, 3, 6), (0, 2, 6)] 6+6 = 12
pd3 = pairsWithDifference(3,nums) # [(0, 3, 1), (3, 6, 1)] 1+1 = 2

然后,您可以根据需要使用此数据重新创建对列表。

from functools import reduce
pairs0 = reduce(lambda a,p:a+[(p[0],p[1])]*p[2],pd0,[]) 
pairs1 = reduce(lambda a,p:a+[(p[0],p[1])]*p[2],pd1,[]) 
pairs2 = reduce(lambda a,p:a+[(p[0],p[1])]*p[2],pd2,[]) 
pairs3 = reduce(lambda a,p:a+[(p[0],p[1])]*p[2],pd3,[]) 

如果您只需要对数,该函数可以直接生成它(并且更快):

def countPairDifference(target,numbers):
    counts = Counter(numbers)
    return sum([ (c-0**target)*counts[n+target] for n,c in counts.items() ])

推荐阅读