
Is 1.0 a valid output from std::generate_canonical?

crosscheck 2020. 7. 9. 08:09



I always thought random numbers would lie between zero and one, excluding 1, i.e. that they are numbers from the half-open interval [0,1). The documentation of std::generate_canonical on cppreference.com confirms this.

However, when I run the following program:

#include <iostream>
#include <limits>
#include <random>

int main()
{
    std::mt19937 rng;

    std::seed_seq sequence{0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    rng.seed(sequence);
    rng.discard(12 * 629143 + 6);

    float random = std::generate_canonical<float,
                   std::numeric_limits<float>::digits>(rng);

    if (random == 1.0f)
    {
        std::cout << "Bug!\n";
    }

    return 0;
}

it gives me the following output:

Bug!

That is, it generates exactly 1, which causes problems in my Monte Carlo (MC) integration. Is this valid behavior, or is there an error on my side? This gives the same output with G++ 4.7.3:

g++ -std=c++11 test.c && ./a.out

and with clang 3.3:

clang++ -stdlib=libc++ -std=c++11 test.c && ./a.out

If this is correct behavior, how can I avoid 1?

Edit 1: G++ from git seems to suffer from the same problem. I am on

commit baf369d7a57fb4d0d5897b02549c3517bb8800fd
Date:   Mon Sep 1 08:26:51 2014 +0000

and compiling with ~/temp/prefix/bin/c++ -std=c++11 -Wl,-rpath,/home/cschwan/temp/prefix/lib64 test.c && ./a.out gives the same output. ldd yields:

linux-vdso.so.1 (0x00007fff39d0d000)
libstdc++.so.6 => /home/cschwan/temp/prefix/lib64/libstdc++.so.6 (0x00007f123d785000)
libm.so.6 => /lib64/libm.so.6 (0x000000317ea00000)
libgcc_s.so.1 => /home/cschwan/temp/prefix/lib64/libgcc_s.so.1 (0x00007f123d54e000)
libc.so.6 => /lib64/libc.so.6 (0x000000317e600000)
/lib64/ld-linux-x86-64.so.2 (0x000000317e200000)

Edit 2: I reported the behavior here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63176

Edit 3: The clang team seems to be aware of the problem: http://llvm.org/bugs/show_bug.cgi?id=18767


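The problem is in the mapping from the codomain of std::mt19937 (std::uint_fast32_t) to float. The algorithm described by the standard gives incorrect results (inconsistent with its own description of the algorithm's output) when loss of precision occurs, if the current IEEE 754 rounding mode is anything other than round-toward-negative-infinity (note that the default is round-to-nearest).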

With your seed, the 7549723rd output of mt19937 is 4294967257 (0xffffffd9u), which gives 0x1p+32 when rounded to a 32-bit float. That equals the maximum value of mt19937, 4294967295 (0xffffffffu), when it too is rounded to a 32-bit float.

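The standard could ensure correct behavior by specifying that the conversion from the URNG's output to the RealType of generate_canonical must round toward negative infinity; that would give a correct result in this case. As a quality-of-implementation (QOI) matter, it would be good for libstdc++ to make this change.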

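With this change, 1.0 will no longer be generated; instead, the boundary values 0x1.fffffep-N for 0 < N <= 8 will be generated more often (approximately 2^(8 - N - 32) per N, depending on the actual distribution of MT19937).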

I would recommend not using float with std::generate_canonical directly; instead, generate the number in double and then round toward negative infinity:

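    // Generate in double, but only ask for as many bits as a float can hold.
    double rd = std::generate_canonical<double,
        std::numeric_limits<float>::digits>(rng);
    float rf = rd;  // narrowing rounds to nearest by default
    if (rf > rd) {
      // The conversion rounded up; step down to the next smaller float.
      rf = std::nextafter(rf, -std::numeric_limits<float>::infinity());
    }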

This problem can also occur with std::uniform_real_distribution<float>; the solution is the same, to specialize the distribution on double and round the result towards negative infinity in float.


According to the standard, 1.0 is not valid.

C++11 §26.5.7.2 Function template generate_canonical

Each function instantiated from the template described in this section 26.5.7.2 maps the result of one or more invocations of a supplied uniform random number generator g to one member of the specified RealType such that, if the values g_i produced by g are uniformly distributed, the instantiation's results t_j, 0 ≤ t_j < 1, are distributed as uniformly as possible as specified below.


I just ran into a similar question with uniform_real_distribution, and here's how I interpret the Standard's parsimonious wording on the subject:

The Standard always defines math functions in terms of math, never in terms of IEEE floating-point (because the Standard still pretends that floating-point might not mean IEEE floating point). So, any time you see mathematical wording in the Standard, it's talking about real math, not IEEE.

The Standard says that both uniform_real_distribution<T>(0,1)(g) and generate_canonical<T,1000>(g) should return values in the half-open range [0,1). But these are mathematical values. When you take a real number in the half-open range [0,1) and represent it as IEEE floating-point, well, a significant fraction of the time it will round up to T(1.0).

When T is float (24 mantissa bits), we expect to see uniform_real_distribution<float>(0,1)(g) == 1.0f about 1 in 2^25 times. My brute-force experimentation with libc++ confirms this expectation.

#include <cstdio>
#include <random>

template<class F>
void test(long long N, const F& get_a_float) {
    int count = 0;
    for (long long i = 0; i < N; ++i) {
        float f = get_a_float();
        if (f == 1.0f) {
            ++count;
        }
    }
    printf("Expected %d '1.0' results; got %d in practice\n", (int)(N >> 25), count);
}

int main() {
    std::mt19937 g(std::random_device{}());
    auto N = (1uLL << 29);
    test(N, [&g]() { return std::uniform_real_distribution<float>(0,1)(g); });
    test(N, [&g]() { return std::generate_canonical<float, 32>(g); });
}

Example output:

Expected 16 '1.0' results; got 19 in practice
Expected 16 '1.0' results; got 11 in practice

When T is double (53 mantissa bits), we expect to see uniform_real_distribution<double>(0,1)(g) == 1.0 about 1 in 2^54 times. I don't have the patience to test this expectation. :)

My understanding is that this behavior is fine. It may offend our sense of "half-open-rangeness" that a distribution claiming to return numbers "less than 1.0" can in fact return numbers that are equal to 1.0; but those are two different meanings of "1.0", see? The first is the mathematical 1.0; the second is the IEEE single-precision floating-point number 1.0. And we've been taught for decades not to compare floating-point numbers for exact equality.

Whatever algorithm you feed the random numbers into isn't going to care if it sometimes gets exactly 1.0. There's nothing you can do with a floating-point number except mathematical operations, and as soon as you do some mathematical operation, your code will have to deal with rounding. Even if you could legitimately assume that generate_canonical<float,1000>(g) != 1.0f, you still wouldn't be able to assume that generate_canonical<float,1000>(g) + 1.0f != 2.0f — because of rounding. You just can't get away from it; so why would we pretend in this single instance that you can?

Source: https://stackoverflow.com/questions/25668600/is-1-0-a-valid-output-from-stdgenerate-canonical
