mysql - MySQL random entry with weight fails
Problem description
I'm trying to display weighted random results from my database, but I can't get results with the expected accuracy. I've followed what I learnt here and here.
This would be my table:
+--------+-------+
| weight | image |
+--------+-------+
| 50     | A     |
| 25     | B     |
| 25     | C     |
+--------+-------+
I need image A to appear 50% of the time, image B 25% of the time, and image C the remaining 25% of the time.
The SQL statement I'm using goes like this:
SELECT image FROM images WHERE weight > 0 ORDER BY -LOG(1.0 - RAND()) / weight LIMIT 10
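For reference, the ordering expression can be reproduced outside the database. The following is only a sketch in Python (not the PHP test script, which is not shown here); it assumes that just the first row of each result set is counted, and the names `WEIGHTS`, `weighted_pick` and `simulate` are hypothetical. Each row gets the key `-log(1 - u) / weight` with `u` uniform on [0, 1), which is an exponential random variable with rate `weight`; the row with the smallest key wins with probability `weight / sum(weights)`.

```python
import math
import random
from collections import Counter

# Weights copied from the table above.
WEIGHTS = {"A": 50, "B": 25, "C": 25}


def weighted_pick(weights, rng):
    """Return the image with the smallest -log(1 - u) / weight key.

    This mirrors taking the first row of
    ORDER BY -LOG(1.0 - RAND()) / weight.
    """
    def sort_key(item):
        _, weight = item
        u = rng.random()  # uniform in [0, 1), so 1 - u is in (0, 1]
        return -math.log(1.0 - u) / weight

    return min(weights.items(), key=sort_key)[0]


def simulate(weights, trials, seed=0):
    """Count how often each image is picked over `trials` independent draws."""
    rng = random.Random(seed)
    return Counter(weighted_pick(weights, rng) for _ in range(trials))


if __name__ == "__main__":
    trials = 10_000
    counts = simulate(WEIGHTS, trials)
    for image in sorted(counts):
        share = 100 * counts[image] / trials
        print(f"{image} total: {counts[image]} - {share:.2f}%")
```

With an unbiased generator, B and C should each come out ahead in roughly half of the runs when the seed is varied, which makes this a useful baseline to compare the MySQL results against.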
To test this properly, I wrote a PHP script that runs the query 10,000 times, counts how many times a, b or c is shown, and prints the totals with percentages, like this:
a total: 4976 - 49,76%
b total: 2538 - 25,38%
c total: 2486 - 24,86%
With only 10,000 results, and considering that RAND() is just a randomization function, I would consider these results accurate enough. The problem is that I ran this script about 100 times and noticed that in 98 of the 100 runs b had a higher count than c.
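To put a number on how unusual 98 out of 100 is, a simple sign test can be used. The sketch below assumes the runs are independent and, ignoring ties, that b and c are each equally likely to come out ahead when there is no bias; `binom_tail` is a hypothetical helper name.

```python
from math import comb


def binom_tail(successes, runs, p=0.5):
    """Return P(X >= successes) for X ~ Binomial(runs, p)."""
    return sum(
        comb(runs, k) * p**k * (1 - p) ** (runs - k)
        for k in range(successes, runs + 1)
    )


# Chance that b beats c in at least 98 of 100 runs if neither is favoured.
p_value = binom_tail(98, 100)
print(f"P(b ahead in >= 98 of 100 runs | no bias) = {p_value:.3e}")
```

The resulting probability is on the order of 10^-27, so under these assumptions the pattern is very unlikely to be plain sampling noise.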
I'm trying to understand what's wrong: both rows (b and c) have the same weight in the table, and I'm not introducing any other ordering factor. I took it up a notch and went for 100,000 iterations of the SQL query. These are the results:
a total: 50185 - 50,185%
b total: 25201 - 25,201%
c total: 24614 - 24,614%
I ran this last test about 50 times (with long wait times between runs). This time b was above c every time, and the accuracy was worse than at 10,000 iterations. You would expect that, as the number of iterations goes up, the percentage variation gets smaller and the results get more accurate. Evidently, either I'm doing something wrong or RAND() is not random enough.
Mathematically speaking, if it were perfectly random, accuracy should improve with more iterations, not get worse.
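That expectation can be made concrete. As a rough sketch, assuming each iteration is an independent draw with probabilities 0.5 / 0.25 / 0.25 (a multinomial model), the gap b - c has mean 0 and variance n * (p_b + p_c) = n / 2, so its standard deviation in percentage points shrinks like 1 / sqrt(n). The helper names `gap_sd_percent` and `gap_z_score` are hypothetical; the counts are the ones reported above.

```python
import math


def gap_variance(trials, p_b=0.25, p_c=0.25):
    """Variance of (b_count - c_count) under a multinomial model."""
    return trials * (p_b + p_c - (p_b - p_c) ** 2)


def gap_sd_percent(trials, p_b=0.25, p_c=0.25):
    """Standard deviation of the b - c gap, in percentage points."""
    return 100 * math.sqrt(gap_variance(trials, p_b, p_c)) / trials


def gap_z_score(b_count, c_count, trials, p_b=0.25, p_c=0.25):
    """How many standard deviations the observed gap is from its expectation."""
    expected_gap = trials * (p_b - p_c)
    return (b_count - c_count - expected_gap) / math.sqrt(
        gap_variance(trials, p_b, p_c)
    )


# (iterations, b total, c total) from the two runs shown above.
RUNS = [(10_000, 2538, 2486), (100_000, 25201, 24614)]

for trials, b_count, c_count in RUNS:
    observed = 100 * (b_count - c_count) / trials
    print(
        f"n={trials}: observed gap {observed:.3f} points, "
        f"expected sd {gap_sd_percent(trials):.3f} points, "
        f"z = {gap_z_score(b_count, c_count, trials):.2f}"
    )
```

Under this model the 10,000-iteration gap is well within one standard deviation, while the 100,000-iteration gap is roughly 2.6 standard deviations out. A single run like that could still be chance, but together with b leading in every run it points to a small systematic bias rather than noise.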
Any explanation/solution is welcome.
Solution