首页 > 解决方案 > 在 Fortran 和 MPI 中使用 FFTW3(高级接口)执行多个 FFT

问题描述

我使用 FFTW3 执行傅立叶变换。目前,在我的代码中,由于fftw_mpi_execute_dft_r2c ,我创建了一个真正到复杂的计划,以便执行 3 次傅立叶变换。

我了解到可以在一次调用中折叠三个fftw_mpi_execute_dft_r2c以优化我的代码。如果我使用fftw_mpi_plan_many_dft_c2r创建 3 个数组的计划,这将成为可能。我阅读了文档,并尝试找到示例(互联网上关于这个特定问题的例子很少......)但我正在努力定义正确的参数。

  1. 我不太了解 iblock 和 obblock 参数的含义。我将 iblock 和 obblock 定义为每个处理器上单个复数和实数数组的数组大小。因此我设置:iblock = size(c_out1)oblock=size(out1)。那是对的吗 ?

  2. 当我设置时,integer(C_INTPTR_T) :: nx_fft我在编译过程中出现以下错误:Error: Rank mismatch in argument 'n' at (1) (rank-1 and scalar). 我不明白这个错误,因为nx_fft它是一个标量并且具有文档所需的类型:https ://www.fftw.org/fftw3_doc/MPI-Plan-Creation.html

  3. 为了对我在fftw_mpi_execute_dft_c2r子例程中输入的多个数组执行傅里叶变换,数组:(/c_out1, c_out2, c_out3/)(/c_out1, c_out2, c_out3/)。但是在编译时有一个错误:Error: Non-variable expression in variable definition context (actual argument to INTENT = OUT/INOUT) at (1)。这是否意味着我应该创建一个由这些数组组成的数组,这将是 IN 和 OUT ?如果情况良好,我应该创建一个这样的数组out_totalout_total(:,:,1) = out1(:,:) ; out_total(:,:,2) = out2(:,:) ; out_total(:,:,3) = out3(:,:) 还是这样:out_total(:,:,1) = out1(:,:) ; out_total(:,:,2) = out2(:,:) ; out_total(:,:,3) = out3(:,:)

预先感谢您的帮助。

PS:您可以在下面找到具有 3 个连续 FFT 的代码(旧版本)和我尝试折叠的 FFT 执行(新版本

新版本

use, intrinsic :: iso_c_binding
use mpi
include 'fftw3-mpi.f03'

integer(C_INTPTR_T)     :: alloc_local, local_ny_fft
integer(C_INTPTR_T)     :: local_j_offset
type(C_PTR)             :: fft_plan_c2r
integer(C_INTPTR_T)     :: nx_fft       
integer                 :: nx_global             ! defined elsewhere
integer                 :: ny_global             ! defined elsewhere
integer                 :: MPI_FFTW_COMMUNICATOR ! defined elsewhere
    
! C pointers
type(C_PTR)             :: p_out1, p_out2, p_out3
real(C_DOUBLE), pointer :: out1(:, :), out2(:, :), out3(:, :)
complex(C_DOUBLE_COMPLEX), pointer :: c_out1, c_out2, c_out3

integer(C_INTPTR_T)     :: howmany_fftw
integer(C_INTPTR_T)     :: iblock, oblock
!-------------------------------------------------------------------------------------------

nx_fft             = 2*nx_global 
ny_fft_global      = ny_global
alloc_local        = fftw_mpi_local_size_2d( ny_fft_global , nx_fft/2+1, MPI_FFTW_COMMUNICATOR, &
                                           & local_ny_fft, local_j_offset)
    
!---------------Allocate align memory thanks to external C function ------------- 
    
p_gravity_disk_x_tmp = fftw_alloc_complex(alloc_local)
call c_f_pointer(p_out1,   out1,   [(nx_fft/2 + 1) * 2,    local_ny_fft])
call c_f_pointer(p_out1, c_out1,   [nx_fft/2 + 1,          local_ny_fft])

call c_f_pointer(p_out2,   out2,   [(nx_fft/2 + 1) * 2,    local_ny_fft])
call c_f_pointer(p_out2, c_out2,   [nx_fft/2 + 1,          local_ny_fft])

call c_f_pointer(p_out3,   out3,   [(nx_fft/2 + 1) * 2,    local_ny_fft])
call c_f_pointer(p_out3, c_out3,   [nx_fft/2 + 1,          local_ny_fft])   
        
! --------------- Plan creation for FFTW ----------------- 
iblock = size(c_out1)
oblock = size(out1)
howmany_fftw =3

fft_plan_c2r = fftw_mpi_plan_many_dft_c2r( ny_fft_global, nx_fft,            &
                                         & howmany_fftw, iblock, oblock,     &
                                         & c_out1, out1,                     & 
                                         & MPI_FFTW_COMMUNICATOR, FFTW_EXHAUSTIVE)

        
! Here I fill the c_out1, c_out2 and c_out3 arrays with other subroutines which are not of interest here
    
! ---------- Complex to real transforms -----------------
call fftw_mpi_execute_dft_c2r( fft_plan_c2r, (/c_out1, c_out2, c_out3/), (/p_out1, p_out2, p_out3/))

旧版

use, intrinsic :: iso_c_binding
use mpi
include 'fftw3-mpi.f03'

integer(C_INTPTR_T)     :: alloc_local, local_ny_fft
integer(C_INTPTR_T)     :: local_j_offset
type(C_PTR)             :: fft_plan_c2r
integer(C_INTPTR_T)     :: nx_fft                ! defined elsewhere
integer                 :: ny_global             ! defined elsewhere
integer                 :: MPI_FFTW_COMMUNICATOR ! defined elsewhere
    
! C pointers
type(C_PTR)             :: p_out1, p_out2, p_out3
real(C_DOUBLE), pointer :: out1(:, :), out2(:, :), out3(:, :)
complex(C_DOUBLE_COMPLEX), pointer :: c_out1, c_out2, c_out3
!-------------------------------------------------------------------------------------------

ny_fft_global      = ny_global
alloc_local        = fftw_mpi_local_size_2d( ny_fft_global , nx_fft/2+1, MPI_FFTW_COMMUNICATOR, &
                                           & local_ny_fft, local_j_offset)
    
!---------------Allocate align memory thanks to external C function ------------- 
    
p_gravity_disk_x_tmp = fftw_alloc_complex(alloc_local)
call c_f_pointer(p_out1,   out1,   [(nx_fft/2 + 1) * 2,    local_ny_fft])
call c_f_pointer(p_out1, c_out1,   [nx_fft/2 + 1,          local_ny_fft])

call c_f_pointer(p_out2,   out2,   [(nx_fft/2 + 1) * 2,    local_ny_fft])
call c_f_pointer(p_out2, c_out2,   [nx_fft/2 + 1,          local_ny_fft])

call c_f_pointer(p_out3,   out3,   [(nx_fft/2 + 1) * 2,    local_ny_fft])
call c_f_pointer(p_out3, c_out3,   [nx_fft/2 + 1,          local_ny_fft])   
        
! --------------- Plan creation for FFTW ----------------- 
fft_plan_c2r = fftw_mpi_plan_dft_c2r_2d( ny_fft_global, nx_fft,     &
                                       & c_out1, out1,              & 
                                       & MPI_FFTW_COMMUNICATOR, FFTW_EXHAUSTIVE)
        
! Here I fill the c_out1, c_out2 and c_out3 arrays with other subroutines which are not of interest here
    
! ---------- Complex to real transforms -----------------
call fftw_mpi_execute_dft_c2r( fft_plan_c2r, c_out1, p_out1)
call fftw_mpi_execute_dft_c2r( fft_plan_c2r, c_out2, p_out2)
call fftw_mpi_execute_dft_c2r( fft_plan_c2r, c_out3, p_out3)

标签: fortranmpifftwfortran2003

解决方案


推荐阅读