Alpha-beta pruning makes worse decisions when fully implemented

Problem description

I am writing a basic chess AI using the minimax algorithm. I implemented alpha-beta pruning, and it seems to work correctly. Here is the code:

import random  # module-level import needed by random.choice below

def move(self, board):
    moves = {}

    for move in board.legal_moves:
        board.push(move)
        moves[move] = self.evaluate_move(board, 1, float("-inf"), float("inf"))
        board.pop()

    best_moves = []
    for key in moves.keys():
        if moves[key] == max(moves.values()):
            best_moves.append(key)
    
    chosen_move = random.choice(best_moves)
    return chosen_move

def evaluate_move(self, board, depth, alpha, beta):
    if depth % 2: # if depth is odd ie. minimizing player
        extremepoints = float("inf")
    else:
        extremepoints = float("-inf")

    if depth < self.depth_limit and (not board.is_game_over()):
        for move in board.legal_moves:
            board.push(move)
            if depth % 2: # if depth is odd, i.e. minimizing player
                points = self.evaluate_move(board, depth+1, alpha, beta)
                extremepoints = min(extremepoints, points)
                beta = min(beta, points)
                if alpha >= beta:
                    board.pop()
                    break
            else:
                points = self.evaluate_move(board, depth+1, alpha, beta)
                extremepoints = max(extremepoints, points)
                alpha = max(alpha, points)
                if beta <= alpha:
                    board.pop()
                    break
            board.pop()
    else:
        return self.evaluate_position(board)

    return extremepoints
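
One way to check that a pruning routine like the one above "works" is to compare it against plain minimax on small trees, since alpha-beta started with a full window of (-inf, +inf) should return the same root value. The following is a minimal sketch, not the original chess code: it assumes a toy tree of nested lists with integer leaves, and the names `minimax`, `alphabeta`, and `random_tree` are illustrative only.

```python
import math
import random


def minimax(node, maximizing):
    """Plain minimax over a nested-list tree; leaves are numbers."""
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    """Alpha-beta with the same cutoff test as the code above (alpha >= beta)."""
    if not isinstance(node, list):
        return node
    best = -math.inf if maximizing else math.inf
    for child in node:
        value = alphabeta(child, not maximizing, alpha, beta)
        if maximizing:
            best = max(best, value)
            alpha = max(alpha, value)
        else:
            best = min(best, value)
            beta = min(beta, value)
        if alpha >= beta:
            break  # remaining siblings cannot change the result
    return best


def random_tree(depth, branching, rng):
    """Build a hypothetical game tree with random integer leaf scores."""
    if depth == 0:
        return rng.randint(-10, 10)
    return [random_tree(depth - 1, branching, rng) for _ in range(branching)]


rng = random.Random(0)
for _ in range(200):
    tree = random_tree(depth=4, branching=3, rng=rng)
    # With a full window, pruning must not change the root value.
    assert alphabeta(tree, True) == minimax(tree, True)
print("alpha-beta matched plain minimax on all sampled trees")
```

Note that this only checks the value at the root of a full-window search. Values returned for pruned subtrees are bounds rather than exact scores, which matters as soon as they are compared with each other.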

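However, while watching this video, I realized I might be missing out on potential performance. At that point in the video, alpha is set at the very top of the tree and passed down to all the other first-level moves. My implementation does not do this; instead, it gives every first-level move an alpha of -inf. I tried to fix this as follows: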

def move(self, board):
    alpha = float("-inf")
    beta = float("inf")
    moves = {}

    for move in board.legal_moves:
        board.push(move)
        moves[move] = self.evaluate_move(board, 1, alpha, beta) # Change here
        alpha = max(alpha, moves[move])
        board.pop()

    best_moves = []
    for key in moves.keys():
        if moves[key] == max(moves.values()):
            best_moves.append(key)
    
    chosen_move = random.choice(best_moves)
    return chosen_move

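The problem is that this produces a worse AI. It is faster, but it loses every time to the AI without this "fix". However, while browsing Stack Overflow, I found a link to this implementation, which seems to do the same thing as mine.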

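So, my question is: have I already implemented alpha-beta pruning as fully as possible, so that no changes are needed, or is there something wrong with the way I implemented the fix?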

Tags: python, artificial-intelligence, chess, minimax, alpha-beta-pruning

Solution
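
One plausible explanation, offered as a sketch rather than a confirmed diagnosis: once alpha is shared across the root moves, a search that is cut off by `alpha >= beta` returns only an upper bound on that move's value. That bound can equal the current best score, so the move ties in `max(moves.values())` and `random.choice` may pick it even though its true value is lower. The toy example below reproduces the effect on a hypothetical tree of nested lists with integer leaves. The names `alphabeta`, `root_scores`, and `margin` are illustrative and not part of the original code, and the `margin` workaround assumes integer evaluation scores.

```python
import math


def alphabeta(node, maximizing, alpha, beta):
    """Alpha-beta over a nested-list tree, cutting off when alpha >= beta."""
    if not isinstance(node, list):
        return node
    best = -math.inf if maximizing else math.inf
    for child in node:
        value = alphabeta(child, not maximizing, alpha, beta)
        if maximizing:
            best = max(best, value)
            alpha = max(alpha, value)
        else:
            best = min(best, value)
            beta = min(beta, value)
        if alpha >= beta:
            break  # the returned value is now only a bound, not exact
    return best


def root_scores(tree, share_alpha, margin=0):
    """Score every root move; optionally carry alpha from move to move."""
    alpha = -math.inf
    scores = []
    for child in tree:
        # Root is the maximizing player, so each child is a minimizing node.
        score = alphabeta(child, False, alpha - margin, math.inf)
        scores.append(score)
        if share_alpha:
            alpha = max(alpha, score)
    return scores


# Hypothetical position: move 0 is worth 3, move 1 is really worth 1.
tree = [[3, 5], [3, 1]]

print(root_scores(tree, share_alpha=False))            # [3, 1]
print(root_scores(tree, share_alpha=True))             # [3, 3]  false tie
print(root_scores(tree, share_alpha=True, margin=1))   # [3, 1]
```

In the second call, the search of move 1 stops as soon as it sees the reply worth 3, so it never discovers the reply worth 1 and reports 3. If this is indeed the cause, one option in the original `move` method might be to call `self.evaluate_move(board, 1, alpha - 1, beta)`, provided `evaluate_position` returns integers. Another is to keep a single best move and replace it only when a score is strictly greater than the current alpha, at the cost of losing the random choice among equal moves.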

