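I am trying to create an image from the average of multiple images. I do this by looping through the pixel values of 2 photos, adding them together and dividing by 2. Simple math. However, while this works, it is extremely slow (about 23 seconds to average 2x 10MP photos on a maxed-out MacBook Pro 15" 2016, compared to much less time for similar algorithms using Apple's CIFilter API). The code I'm currently using is this, based on another StackOverflow question here: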

    static func averageImages(primary: CGImage, secondary: CGImage) -> CGImage? {
        guard (primary.width == secondary.width && primary.height == secondary.height) else {
            return nil
        }

        let colorSpace       = CGColorSpaceCreateDeviceRGB()
        let width            = primary.width
        let height           = primary.height
        let bytesPerPixel    = 4
        let bitsPerComponent = 8
        let bytesPerRow      = bytesPerPixel * width
        let bitmapInfo       = RGBA32.bitmapInfo

        guard let context = CGContext(data: nil, width: width, height: height, bitsPerComponent: bitsPerComponent, bytesPerRow: bytesPerRow, space: colorSpace, bitmapInfo: bitmapInfo) else {
            print("unable to create context")
            return nil
        }

        guard let context2 = CGContext(data: nil, width: width, height: height, bitsPerComponent: bitsPerComponent, bytesPerRow: bytesPerRow, space: colorSpace, bitmapInfo: bitmapInfo) else {
            print("unable to create context 2")
            return nil
        }

        context.draw(primary, in: CGRect(x: 0, y: 0, width: width, height: height))

        context2.draw(secondary, in: CGRect(x: 0, y: 0, width: width, height: height))

        guard let buffer = context.data else {
            print("Unable to get context data")
            return nil
        }

        guard let buffer2 = context2.data else {
            print("Unable to get context 2 data")
            return nil
        }

        let pixelBuffer = buffer.bindMemory(to: RGBA32.self, capacity: width * height)
        let pixelBuffer2 = buffer2.bindMemory(to: RGBA32.self, capacity: width * height)

        for row in 0 ..< Int(height) {
            if row % 10 == 0 {
                print("Row: \(row)")
            }

            for column in 0 ..< Int(width) {
                let offset = row * width + column

                let picture1 = pixelBuffer[offset]
                let picture2 = pixelBuffer2[offset]

                let minR = min(255,(UInt32(picture1.redComponent)+UInt32(picture2.redComponent))/2)
                let minG = min(255,(UInt32(picture1.greenComponent)+UInt32(picture2.greenComponent))/2)
                let minB = min(255,(UInt32(picture1.blueComponent)+UInt32(picture2.blueComponent))/2)
                let minA = min(255,(UInt32(picture1.alphaComponent)+UInt32(picture2.alphaComponent))/2)

                pixelBuffer[offset] = RGBA32(red: UInt8(minR), green: UInt8(minG), blue: UInt8(minB), alpha: UInt8(minA))
            }
        }

        let outputImage = context.makeImage()

        return outputImage
    }

    struct RGBA32: Equatable {
        //private var color: UInt32
        var color: UInt32

        var redComponent: UInt8 {
            return UInt8((color >> 24) & 255)
        }

        var greenComponent: UInt8 {
            return UInt8((color >> 16) & 255)
        }

        var blueComponent: UInt8 {
            return UInt8((color >> 8) & 255)
        }

        var alphaComponent: UInt8 {
            return UInt8((color >> 0) & 255)
        }

        init(red: UInt8, green: UInt8, blue: UInt8, alpha: UInt8) {
            let red   = UInt32(red)
            let green = UInt32(green)
            let blue  = UInt32(blue)
            let alpha = UInt32(alpha)
            color = (red << 24) | (green << 16) | (blue << 8) | (alpha << 0)
        }

        init(color: UInt32) {
            self.color = color
        }

        static let red     = RGBA32(red: 255, green: 0,   blue: 0,   alpha: 255)
        static let green   = RGBA32(red: 0,   green: 255, blue: 0,   alpha: 255)
        static let blue    = RGBA32(red: 0,   green: 0,   blue: 255, alpha: 255)
        static let white   = RGBA32(red: 255, green: 255, blue: 255, alpha: 255)
        static let black   = RGBA32(red: 0,   green: 0,   blue: 0,   alpha: 255)
        static let magenta = RGBA32(red: 255, green: 0,   blue: 255, alpha: 255)
        static let yellow  = RGBA32(red: 255, green: 255, blue: 0,   alpha: 255)
        static let cyan    = RGBA32(red: 0,   green: 255, blue: 255, alpha: 255)

        static let bitmapInfo = CGImageAlphaInfo.premultipliedLast.rawValue | CGBitmapInfo.byteOrder32Little.rawValue

        static func ==(lhs: RGBA32, rhs: RGBA32) -> Bool {
            return lhs.color == rhs.color
        }
    }
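I'm not very experienced when it comes to handling raw pixel values, and there is probably a lot of room for optimization. The declaration of RGBA32 might not be required, but again I'm not sure how I would go about simplifying the code. I tried simply replacing that struct with a UInt32, but when I divide by 2 the separation between the four channels gets messed up and I end up with the wrong result (on the positive side, this brings the computing time down to about 6 seconds).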
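
On the UInt32 attempt: dividing the whole packed word by 2 shifts the low bit of each channel into the top bit of the channel below it, which is one way the channels can get mixed up. Below is a minimal sketch of a well-known bit trick that averages all four 8-bit channels in one pass without that bleed. `averagePacked` is a hypothetical helper name, not part of the code above, and the sketch assumes each pixel is four 8-bit channels packed into one UInt32 (the channel order does not matter).

```swift
/// Averages four packed 8-bit channels at once (hypothetical helper).
/// Relies on a + b == 2 * (a & b) + (a ^ b), applied per byte:
/// the mask clears each byte's low bit before the shift, so no bit
/// crosses into the neighbouring channel.
func averagePacked(_ a: UInt32, _ b: UInt32) -> UInt32 {
    let shared = a & b                             // bits set in both pixels
    let halfDiff = ((a ^ b) & 0xFEFE_FEFE) >> 1    // half of the differing bits, per byte
    return shared + halfDiff                       // per-byte floor((a + b) / 2); cannot overflow
}

let p: UInt32 = 0xFF00_FF00
let q: UInt32 = 0x0100_0300
print(String(averagePacked(p, q), radix: 16))      // prints "80008100"
```

Each channel ends up as floor((a + b) / 2), which can differ by one from `a / 2 + b / 2` when both inputs are odd.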

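I have tried dropping the alpha channel (just hardcoding it to 255) and also dropping the safety checks that no value exceeds 255. This reduced the computing time to 19 seconds. However, that is still far from the 6 seconds I was hoping to get close to, and it would also be nice to average the alpha channel.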

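Note: I am aware of CIFilters; however, darkening the image first and then using the CIAdditionCompositing filter does not work, because the API provided by Apple actually uses a more complex algorithm than straight addition. For more details on this, see here for my previous code on the subject and a similar question here, where testing showed that Apple's API is not a straight addition of pixel values.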

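**Edit:** Thanks to all the feedback, I have now been able to make huge improvements. The biggest difference by far came from switching from a debug build to a release build, which cut the time considerably. I was then able to write faster code for modifying the RGBA values, eliminating the need for a separate struct. That changed the time from 23 seconds to around 10 seconds (on top of the debug-to-release improvement). The code now looks like this, also rewritten a bit to be more readable: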

    static func averageImages(primary: CGImage, secondary: CGImage) -> CGImage? {
        guard (primary.width == secondary.width && primary.height == secondary.height) else {
            return nil
        }

        let colorSpace       = CGColorSpaceCreateDeviceRGB()
        let width            = primary.width
        let height           = primary.height
        let bytesPerPixel    = 4
        let bitsPerComponent = 8
        let bytesPerRow      = bytesPerPixel * width
        let bitmapInfo       = CGImageAlphaInfo.premultipliedLast.rawValue | CGBitmapInfo.byteOrder32Little.rawValue

        guard let primaryContext = CGContext(data: nil, width: width, height: height, bitsPerComponent: bitsPerComponent, bytesPerRow: bytesPerRow, space: colorSpace, bitmapInfo: bitmapInfo),
            let secondaryContext = CGContext(data: nil, width: width, height: height, bitsPerComponent: bitsPerComponent, bytesPerRow: bytesPerRow, space: colorSpace, bitmapInfo: bitmapInfo) else {
                print("unable to create context")
                return nil
        }

        primaryContext.draw(primary, in: CGRect(x: 0, y: 0, width: width, height: height))
        secondaryContext.draw(secondary, in: CGRect(x: 0, y: 0, width: width, height: height))

        guard let primaryBuffer = primaryContext.data, let secondaryBuffer = secondaryContext.data else {
            print("Unable to get context data")
            return nil
        }

        let primaryPixelBuffer = primaryBuffer.bindMemory(to: UInt32.self, capacity: width * height)
        let secondaryPixelBuffer = secondaryBuffer.bindMemory(to: UInt32.self, capacity: width * height)

        for row in 0 ..< Int(height) {
            if row % 10 == 0 {
                print("Row: \(row)")
            }

            for column in 0 ..< Int(width) {
                let offset = row * width + column

                let primaryPixel = primaryPixelBuffer[offset]
                let secondaryPixel = secondaryPixelBuffer[offset]

                let red = (((primaryPixel >> 24) & 255)/2 + ((secondaryPixel >> 24) & 255)/2) << 24
                let green = (((primaryPixel >> 16) & 255)/2 + ((secondaryPixel >> 16) & 255)/2) << 16
                let blue = (((primaryPixel >> 8) & 255)/2 + ((secondaryPixel >> 8) & 255)/2) << 8
                let alpha = ((primaryPixel & 255)/2 + (secondaryPixel & 255)/2)

                primaryPixelBuffer[offset] = red | green | blue | alpha
            }
        }

        print("Done looping")
        let outputImage = primaryContext.makeImage()

        return outputImage
    }



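  1. Parallelize the routine:

    You can improve performance by using concurrentPerform to move the processing onto multiple cores. In its simplest form, you just replace the outer for loop with concurrentPerform: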

    import CoreGraphics
    import Foundation

    extension CGImage {
        func average(with secondImage: CGImage) -> CGImage? {
            guard
                width == secondImage.width,
                height == secondImage.height
            else {
                return nil
            }

            let colorSpace       = CGColorSpaceCreateDeviceRGB()
            let bytesPerPixel    = 4
            let bitsPerComponent = 8
            let bytesPerRow      = bytesPerPixel * width
            let bitmapInfo       = RGBA32.bitmapInfo   // from the RGBA32 struct in the question

            guard
                let context1 = CGContext(data: nil, width: width, height: height, bitsPerComponent: bitsPerComponent, bytesPerRow: bytesPerRow, space: colorSpace, bitmapInfo: bitmapInfo),
                let context2 = CGContext(data: nil, width: width, height: height, bitsPerComponent: bitsPerComponent, bytesPerRow: bytesPerRow, space: colorSpace, bitmapInfo: bitmapInfo),
                let buffer1 = context1.data,
                let buffer2 = context2.data
            else {
                return nil
            }

            context1.draw(self,        in: CGRect(x: 0, y: 0, width: width, height: height))
            context2.draw(secondImage, in: CGRect(x: 0, y: 0, width: width, height: height))

            let imageBuffer1 = buffer1.bindMemory(to: UInt8.self, capacity: width * height * 4)
            let imageBuffer2 = buffer2.bindMemory(to: UInt8.self, capacity: width * height * 4)

            DispatchQueue.concurrentPerform(iterations: height) { row in   // i.e. a parallelized version of `for row in 0 ..< height {`
                var offset = row * bytesPerRow
                for _ in 0 ..< bytesPerRow {
                    let byte1 = imageBuffer1[offset]
                    let byte2 = imageBuffer2[offset]

                    imageBuffer1[offset] = byte1 / 2 + byte2 / 2

                    offset += 1   // advance after processing, so the row's first byte is included and the last row stays in bounds
                }
            }

            return context1.makeImage()
        }
    }


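    • Because you are performing the same calculation on every byte, you can simplify this further and get rid of the casts, shifts, masks, etc. I also moved repetitive calculations out of the inner loop.

    • As a result, I use the UInt8 type and iterate through bytesPerRow.

    • FWIW, I have defined this as a CGImage extension, which is invoked as:

      let combinedImage = image1.average(with: image2)

    • Right now, we stride through the pixel array one row per iteration. You might try changing how many pixels each concurrentPerform iteration processes, though I saw no material change when I did that.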

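    I found that concurrentPerform was many times faster than the non-parallel for loop. Unfortunately, the nested for loops account for only a small part of the overall processing time of the whole function (e.g. once you include the overhead of building the two pixel buffers, the overall performance is only 40% faster than the non-optimized rendition). On a well-specced 2018 MBP, it processes 10,000 × 10,000 px images in under half a second.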
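
As a rough illustration of changing the amount of work per concurrentPerform iteration, the sketch below hands each iteration a block of several rows instead of one. To keep it self-contained it works on plain `[UInt8]` arrays rather than CGContext buffers; `averageRows` and `rowsPerChunk` are hypothetical names introduced only for this example.

```swift
import Dispatch

/// Averages `second` into `first` in place, giving each parallel
/// iteration `rowsPerChunk` rows (hypothetical helper and parameter).
func averageRows(into first: inout [UInt8], with second: [UInt8],
                 bytesPerRow: Int, rowsPerChunk: Int) {
    precondition(first.count == second.count)
    precondition(bytesPerRow > 0 && rowsPerChunk > 0)
    precondition(first.count % bytesPerRow == 0)

    let height = first.count / bytesPerRow
    let chunkCount = (height + rowsPerChunk - 1) / rowsPerChunk   // round up

    first.withUnsafeMutableBufferPointer { dst in
        let out = dst   // local copy of the pointer for use in the parallel closure
        second.withUnsafeBufferPointer { src in
            DispatchQueue.concurrentPerform(iterations: chunkCount) { chunk in
                let startRow = chunk * rowsPerChunk
                let endRow = min(startRow + rowsPerChunk, height)
                // Chunks cover disjoint byte ranges, so no two iterations write the same byte.
                for i in (startRow * bytesPerRow) ..< (endRow * bytesPerRow) {
                    out[i] = out[i] / 2 + src[i] / 2
                }
            }
        }
    }
}

var blended: [UInt8] = [10, 20, 30, 40, 50, 60]
let other: [UInt8] = [20, 40, 60, 80, 100, 120]
averageRows(into: &blended, with: other, bytesPerRow: 2, rowsPerChunk: 2)
print(blended)   // prints "[15, 30, 45, 60, 75, 90]"
```

Whether a larger chunk helps depends on the image size and core count, so it is worth measuring in a release build.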

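  2. Another alternative is the Accelerate vImage library.

    This library offers a wide variety of image processing routines and is a good one to get familiar with if you are going to be processing large images. I don't know whether its alpha compositing algorithm is mathematically identical to an "average the byte values" algorithm, but it might be sufficient for your purposes. It has the virtue of reducing your nested for loops to a single API call. It also opens the door to a far wider variety of image compositing and manipulation routines: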

    import Accelerate

    extension CGImage {
        func averageVimage(with secondImage: CGImage) -> CGImage? {
            let bitmapInfo: CGBitmapInfo = [.byteOrder32Little, CGBitmapInfo(rawValue: CGImageAlphaInfo.premultipliedLast.rawValue)]
            let colorSpace = CGColorSpaceCreateDeviceRGB()

            guard
                width == secondImage.width,
                height == secondImage.height,
                let format = vImage_CGImageFormat(bitsPerComponent: 8, bitsPerPixel: 32, colorSpace: colorSpace, bitmapInfo: bitmapInfo)
            else {
                return nil
            }

            // Each buffer is freed when the function returns.
            guard var sourceBuffer = try? vImage_Buffer(cgImage: self, format: format) else { return nil }
            defer { sourceBuffer.free() }

            guard var sourceBuffer2 = try? vImage_Buffer(cgImage: secondImage, format: format) else { return nil }
            defer { sourceBuffer2.free() }

            guard var destinationBuffer = try? vImage_Buffer(width: width, height: height, bitsPerPixel: 32) else { return nil }
            defer { destinationBuffer.free() }

            // Blend the first image over the second with a constant alpha of 127 (roughly 50%).
            guard vImagePremultipliedConstAlphaBlend_ARGB8888(&sourceBuffer, Pixel_8(127), &sourceBuffer2, &destinationBuffer, vImage_Flags(kvImageNoFlags)) == kvImageNoError else {
                return nil
            }

            return try? destinationBuffer.createCGImage(format: format)
        }
    }
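
To get a feel for how far a constant-alpha blend at 127 could stray from a plain average, here is a small arithmetic sketch. It assumes, purely for illustration, that blending two opaque pixels reduces to a rounded weighted sum with weights 127/255 and 128/255. That model is an assumption, not vImage's documented formula, and `plainAverage` and `weightedBlend` are hypothetical names.

```swift
/// Exact floor average of two 8-bit values.
func plainAverage(_ top: UInt8, _ bottom: UInt8) -> UInt8 {
    UInt8((UInt16(top) + UInt16(bottom)) / 2)
}

/// ASSUMPTION: a constant-alpha blend of two opaque pixels modelled as a
/// rounded weighted sum with weights alpha/255 and (255 - alpha)/255.
/// This is an illustrative model, not vImage's documented formula.
func weightedBlend(_ top: UInt8, _ bottom: UInt8, alpha: UInt8 = 127) -> UInt8 {
    let a = Int(alpha)
    return UInt8((Int(top) * a + Int(bottom) * (255 - a) + 127) / 255)
}

// Largest per-channel difference between the two over every input pair.
var maxDifference = 0
for top in 0...255 {
    for bottom in 0...255 {
        let t = UInt8(top), b = UInt8(bottom)
        let d = abs(Int(weightedBlend(t, b)) - Int(plainAverage(t, b)))
        maxDifference = max(maxDifference, d)
    }
}
print(maxDifference)   // prints "1"
```

Under that assumption the two approaches differ by at most one level per channel, which may or may not matter for your use case; comparing actual output pixels is the only way to confirm what vImage does.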


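  3. For giggles and grins, I also tried rendering the images with CGBitmapInfo.floatComponents and using the BLAS catlas_saxpby routine for a one-line call to average the two vectors. It worked well but, unsurprisingly, was slower than the integer-based routines above.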
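
A minimal sketch of that one-line call, assuming the two images have already been rendered into `Float` arrays of equal length (the rendering step is omitted here). `averageFloats` is a hypothetical wrapper; the portable fallback branch is only there so the sketch also compiles on platforms without Accelerate, and recent SDKs may flag the classic BLAS interface as deprecated.

```swift
#if canImport(Accelerate)
import Accelerate
#endif

/// Averages `x` into `y` in place: y = 0.5 * x + 0.5 * y (hypothetical wrapper).
func averageFloats(_ x: [Float], into y: inout [Float]) {
    precondition(x.count == y.count)
    #if canImport(Accelerate)
    // catlas_saxpby computes Y = alpha * X + beta * Y in a single call.
    catlas_saxpby(Int32(x.count), 0.5, x, 1, 0.5, &y, 1)
    #else
    // Portable fallback with the same arithmetic.
    for i in y.indices { y[i] = 0.5 * x[i] + 0.5 * y[i] }
    #endif
}

var components: [Float] = [1, 1, 0]
averageFloats([0, 1, 0.5], into: &components)
print(components)   // prints "[0.5, 1.0, 0.25]"
```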
