首页 > 解决方案 > GMP 将最后一位设置为零

问题描述

我正在寻找将正数的最后一位设为零的最快l方法mpz_t。我还没有发现这个功能已经可以了。比如6531489321483应该改成6531489321480.

标签: roundingmathematical-optimizationgmp

解决方案


更新

看来减法和取模是用mpz_t类型清零最后一位数字的最佳方法。正如@MarkDickinson 和@MarcGlisse 指出的那样,渐近行为非常有利于使用( mpz_tdiv_r_uior mpz_fdiv_r_ui) 而不是. 我最初的基准是相对较小的数字(25 位数)。我重新测试了一个 175 位数字,该方法快了近 40%。mpz_div_uimpz-mul_uisub_mod

Test value: 1234567898765432123456789123456789876543212345678912345678987654321234567891234567898765432123456789123456789876543212345678912345678987654321234567891234567898765432123456789
Result with div_mul: 1234567898765432123456789123456789876543212345678912345678987654321234567891234567898765432123456789123456789876543212345678912345678987654321234567891234567898765432123456780
Result with sub_mod: 1234567898765432123456789123456789876543212345678912345678987654321234567891234567898765432123456789123456789876543212345678912345678987654321234567891234567898765432123456780
time with division followed by multiplication: 6.145656
time with subtraction and modulo: 4.413998

使用 350 位数字,我们看到sub_mod速度提高了 85% 左右:

Test value: 12345678987654321234567891234567898765432123456789123456789876543212345678912345678987654321234567891234567898765432123456789123456789876543212345678912345678987654321234567891234567898765432123456789123456789876543212345678912345678987654321234567891234567898765432123456789123456789876543212345678912345678987654321234567891234567898765432123456789
Result with div_mul: 12345678987654321234567891234567898765432123456789123456789876543212345678912345678987654321234567891234567898765432123456789123456789876543212345678912345678987654321234567891234567898765432123456789123456789876543212345678912345678987654321234567891234567898765432123456789123456789876543212345678912345678987654321234567891234567898765432123456780
Result with sub_mod: 12345678987654321234567891234567898765432123456789123456789876543212345678912345678987654321234567891234567898765432123456789123456789876543212345678912345678987654321234567891234567898765432123456789123456789876543212345678912345678987654321234567891234567898765432123456789123456789876543212345678912345678987654321234567891234567898765432123456780
time with division followed by multiplication: 10.256122
time with subtraction and modulo: 5.522990

需要注意的是,无论我们使用mpz_tdiv_r_uimpz_fdiv_r_ui,结果几乎相同。

由于该sub_mod方法在数量较小的情况下仅稍微慢一些,因此仅在所有情况下都使用此方法似乎是合理的。

在不同的编译器上测试它会很好。我目前正在使用clang 5.0.1.

原来的

我机器上的基准测试表明,除法后乘法比通过模运算符和减法找到余数更快。

#include <stdio.h>
#include <time.h>
#include <gmp.h>

void div_mul(mpz_t x) {
    mpz_tdiv_q_ui(x, x, 10u);
    mpz_mul_ui(x, x, 10u);
}

// Maybe this could be simpler?
void sub_mod(mpz_t x, mpz_t y) {
    // N.B. mpz_mod_ui is equivalent to mpz_fdiv_r_ui. Changed to 
    // mpz_tdiv_r_ui for consistency with div_mul.
    mpz_tdiv_r_ui(y, x, 10u);
    mpz_sub(x, x, y);
}

主要的:

int main() {

    mpz_t testVal;
    mpz_init(testVal);
    mpz_set_str(testVal, "1234567898765432123456789", 10);
    gmp_printf("Test value: %Zd\n", testVal);
    
    mpz_t x;
    mpz_t y;
    
    mpz_init(x);
    mpz_init(y);
    
    mpz_set(x, testVal);
    div_mul(x);
    gmp_printf("Result with div_mul: %Zd\n", x);
    
    mpz_set(x, testVal);
    sub_mod(x, y);
    gmp_printf("Result with sub_mod: %Zd\n", x);
    
    const int limit = 100000000;
    const double checkPoint0 = (double) clock() / CLOCKS_PER_SEC;
    
    for (int i = 0; i < limit; ++i) {
        mpz_set(x, testVal);
        div_mul(x);
    }
    
    const double checkPoint1 = (double) clock() / CLOCKS_PER_SEC;
    const double time_div_mul = checkPoint1 - checkPoint0;
    printf("time with division followed by multiplication: %f\n", time_div_mul);
    const double checkPoint2 = (double) clock() / CLOCKS_PER_SEC;
    
    for (int i = 0; i < limit; ++i) {
        mpz_set(x, testVal);
        sub_mod(x, y);
    }
    
    const double checkPoint3 = (double) clock() / CLOCKS_PER_SEC;
    const double time_sub_mod = checkPoint3 - checkPoint2;
    printf("time with subtraction and modulo: %f\n", time_sub_mod);

    mpz_clear(testVal);
    mpz_clear(x);
    mpz_clear(y);
    return 0;
}

输出:

Test value: 1234567898765432123456789
Result with div_mul: 1234567898765432123456780
Result with sub_mod: 1234567898765432123456780
time with division followed by multiplication: 2.941251
time with subtraction and modulo: 3.171949

我怀疑后一种方法稍慢的原因之一是需要 2 个变量,因为在Capi 中无法访问同一行上的复杂多操作。如果我们可以使用gmpxx,我们可以编写x - x % 10

关于为什么第一种方法更快的另一个想法是,它div_mul涉及两个无符号整数的运算,而该sub_mod方法涉及一个无符号整数的运算,然后是一个与 的运算mpz_t

我试图在ideone.com上复制此内容,但无法gmp.h加载。我选择使用 type实现类似的基准测试long long int只是为了好玩。你会注意到存在volatile的限制是十亿,而不是上面看到的一亿。volatile需要防止 for 循环被优化掉。


推荐阅读