首页 > 解决方案 > 为什么我的 Haskell 选择排序实现非常快?

问题描述

我实现了选择排序并将其与 Data.List 的排序进行比较。它比 Data.List 的排序快几个数量级。如果我将它应用于 10,000 个随机生成的数字,结果如下:

 ✓ in 1.22µs: Selection sort
 ✓ in 9.84ms: Merge sort (Data.List)

这不可能。首先我想也许合并排序的中间结果被缓存了,选择排序使用这些结果要快得多。即使我注释掉合并排序和仅时间选择排序,它也是这么快。我还验证了输出并且正确排序。

是什么导致了这种行为?

我使用此代码进行测试:

{-# LANGUAGE BangPatterns #-}

module Lib
    ( testSortingAlgorithms
    ) where

import System.Random (randomRIO)
import Text.Printf
import Control.Exception
import System.CPUTime
import Data.List (sort, sortOn)


selectionSort :: Ord a => [a] -> [a]
selectionSort [] = []
selectionSort nrs = 
  let (smallest, rest) = getSmallest nrs
  in smallest : selectionSort rest
  where getSmallest :: Ord a => [a] -> (a, [a])
        getSmallest [a] = (a, [])
        getSmallest (a:as) = let (smallest, rest) = getSmallest as
                             in if smallest > a then (a, smallest : rest)
                                                else (smallest, a : rest)


main :: IO ()
main = testSortingAlgorithms


testSortingAlgorithms :: IO ()
testSortingAlgorithms = do
    !list' <- list (10000)
    results <- mapM (timeIt list') sorts
    let results' = sortOn fst results
    mapM_ (\(diff, msg) -> printf (msg) (diff::Double)) results'
    return ()


sorts :: Ord a => [(String, [a] -> [a])]
sorts = [
        ("Selection sort", selectionSort)
      , ("Merge sort (Data.List)", sort)
    ]


list :: Int -> IO [Int]
list n = sequence $ replicate n $ randomRIO (-127,127::Int)


timeIt :: (Ord a, Show a) 
       => [a] -> (String, [a] -> [a]) -> IO (Double, [Char])
timeIt vals (name, sorter) = do
    start <- getCPUTime
    --v <- sorter vals `seq` return ()
    let !v = sorter vals
    --putStrLn $ show v
    end   <- getCPUTime
    let (diff, ext) = unit $ (fromIntegral (end - start)) / (10^3)
    let msg = if correct v 
              then (" ✓ in %0.2f" ++ ext ++ ": " ++ name ++ "\n") 
              else (" ✗ in %0.2f" ++ ext ++ ": " ++ name ++ "\n") 
    return (diff, msg)


correct :: (Ord a) => [a] -> Bool
correct [] = True
correct (a:[]) = True
correct (a1:a2:as) = a1 <= a2 && correct (a2:as)


unit :: Double -> (Double, String)
unit v | v < 10^3 = (v, "ns")
       | v < 10^6 = (v / 10^3, "µs")
       | v < 10^9 = (v / 10^6, "ms")
       | otherwise = (v / 10^9, "s")

标签: performancesortinghaskell

解决方案


你写

let !v = sorter vals

这是“严格的”,但仅限于 WHNF。因此,您只计算找到列表中最小元素所需的时间,而不是对整个事物进行排序所需的时间。选择排序正是从这样做开始的,因此对于这个不正确的基准测试它是“最佳”的,而如果您只查看第一个元素,则合并排序会做更多“浪费”的工作。


推荐阅读