limits to dsb_rand

This forum is for the Lua scriptable clone of DM/CSB called Dungeon Strikes Back by Sophia. Use DSB to build your own highly customised games.

Moderator: Sophia

Forum rules
Please read the Forum rules and policies before posting.
Post Reply
kaypy
Artisan
Posts: 171
Joined: Sun Jan 19, 2014 7:11 am

limits to dsb_rand

Post by kaypy »

Are there any particular limits to how far you can push dsb_rand?

I'm throwing some pretty big numbers at it so I'm wondering if I ought to be breaking it up to something like
dsb_rand(0,9999)*10000 + dsb_rand(0,9999)
where the exact details would depend on the number I am after and what dsb_rand can handle.
Friends don't let friends eat worm round
User avatar
Sophia
Concise and Honest
Posts: 4239
Joined: Thu Sep 12, 2002 9:50 pm
Location: Nowhere in particular
Contact:

Re: limits to dsb_rand

Post by Sophia »

You're fine. The largest range dsb_rand can handle is 2147483647. :mrgreen:
kaypy
Artisan
Posts: 171
Joined: Sun Jan 19, 2014 7:11 am

Re: limits to dsb_rand

Post by kaypy »

OK, I'll admit that's a seriously big number 8-)

Thanks
Friends don't let friends eat worm round
kaypy
Artisan
Posts: 171
Joined: Sun Jan 19, 2014 7:11 am

Re: limits to dsb_rand

Post by kaypy »

A correction: while dsb_rand may not *crash* for very large ranges, I wouldn't call the results of any range above about 3200 really viable.

a) dsb_rand seems to use "generate a randomish number in 0 .. 32767 and then modulus it into the required range" which means that:

b) if the range is greater than 32768, then the higher values have zero probability

c) if the range doesn't perfectly fit the modulus, then lower numbers get an additional chance to be selected. So if you do dsb(1,30000) then you have twice as much change of getting 100 than of 10000.

(Or rather, experimentally determining (b) and (c) leads me to believe (a))

At about 3200ish the difference between high and low probabilities drops below 10%, hence my initial statement
Friends don't let friends eat worm round
User avatar
Sophia
Concise and Honest
Posts: 4239
Joined: Thu Sep 12, 2002 9:50 pm
Location: Nowhere in particular
Contact:

Re: limits to dsb_rand

Post by Sophia »

Interesting. I admit I haven't used DSB's RNG with very large numbers, so there is a possibility there could be some big problems with it.

I'm curious what would happen in your experiments if you replace dsb_rand with Lua's built-in math.random and see if that changes anything. (Don't forget to seed it with math.randomseed first)
I'd also appreciate it if you'd post your actual data, rather than just your conclusions, because it'll make me easier to analyze/debug.
kaypy
Artisan
Posts: 171
Joined: Sun Jan 19, 2014 7:11 am

Re: limits to dsb_rand

Post by kaypy »

If I put

Code: Select all

local dist={}
local temp
for i=0,999999 do
	temp=dsb_rand(0,99999999)
	if (dist[temp] == nil) then
		dist[temp] = 1
	else
		dist[temp] = dist[temp]+1
	end
end
for k,v in pairs(dist) do
	__log(tostring(k) .. " -> " .. tostring(v))
end
in startup.lua I get

Code: Select all

Lua: 1 -> 34
Lua: 2 -> 35
Lua: 3 -> 24
Lua: 4 -> 36
...
lots of about the same number
...
Lua: 32763 -> 33
Lua: 32764 -> 27
Lua: 32765 -> 32
Lua: 32766 -> 21
Lua: 32767 -> 28
Lua: 0 -> 25
(double check: 30 x 32768 is about the number of loops - so I'm not getting random numbers that aren't being printed)
Last edited by kaypy on Sun Aug 09, 2015 3:50 am, edited 1 time in total.
Friends don't let friends eat worm round
kaypy
Artisan
Posts: 171
Joined: Sun Jan 19, 2014 7:11 am

Re: limits to dsb_rand

Post by kaypy »

If I put

Code: Select all

local dist={}
local temp
for i=0,999999 do
	temp=dsb_rand(0,30000)
	if (dist[temp] == nil) then
		dist[temp] = 1
	else
		dist[temp] = dist[temp]+1
	end
end
for k,v in pairs(dist) do
	__log(tostring(k) .. " -> " .. tostring(v))
end
in startup.lua I get

Code: Select all

Lua: 1 -> 61
Lua: 2 -> 60
Lua: 3 -> 62
Lua: 4 -> 70
Lua: 5 -> 54
...
Lua: 2758 -> 57
Lua: 2759 -> 68
Lua: 2760 -> 58
Lua: 2761 -> 62
Lua: 2762 -> 61
Lua: 2763 -> 62
Lua: 2764 -> 69
Lua: 2765 -> 58
Lua: 2766 -> 73
Lua: 2767 -> 25
Lua: 2768 -> 28
Lua: 2769 -> 46
Lua: 2770 -> 38
Lua: 2771 -> 31
Lua: 2772 -> 34
Lua: 2773 -> 24
Lua: 2774 -> 45
Lua: 2775 -> 30
Lua: 2776 -> 29
Lua: 2777 -> 23
...
Lua: 29996 -> 31
Lua: 29997 -> 40
Lua: 29998 -> 24
Lua: 29999 -> 37
Lua: 30000 -> 28
Lua: 0 -> 59
Friends don't let friends eat worm round
kaypy
Artisan
Posts: 171
Joined: Sun Jan 19, 2014 7:11 am

Re: limits to dsb_rand

Post by kaypy »

Lua's random falls through to ANSI C random, which can have a range (depending on the implementation) "as low as 32767"

using

Code: Select all

math.randomseed(34892895636) -- arbitrary seed, for reproducability

local dist={}
local temp
for i=0,999999 do
	temp=math.random(0,9999999)
	if (dist[temp] == nil) then
		dist[temp] = 1
	else
		dist[temp] = dist[temp]+1
	end
end
for k,v in pairs(dist) do
	__log(tostring(k) .. " -> " .. tostring(v))
end
gives me

Code: Select all

Lua: 4846949 -> 32
Lua: 8172856 -> 42
Lua: 9617297 -> 29
Lua: 5109103 -> 28
Lua: 3472090 -> 24
...
Lua: 3209936 -> 35
Lua: 6089053 -> 27
Lua: 7075106 -> 18
Lua: 2226935 -> 26
Lua: 5237281 -> 35
where there are still 32768 different return values, but now spread across the target range instead of being the first values.
Friends don't let friends eat worm round
kaypy
Artisan
Posts: 171
Joined: Sun Jan 19, 2014 7:11 am

Re: limits to dsb_rand

Post by kaypy »

The overlap case for lua random is a bit trickier, so I added a histogram

Code: Select all

math.randomseed(34892895636) -- arbitrary seed, for reproducability

local dist={}
local hist={}
local temp
for i=0,999999 do
	temp=math.random(0,30000)
	if (dist[temp] == nil) then
		dist[temp] = 1
	else
		dist[temp] = dist[temp]+1
	end
end
for k,v in pairs(dist) do
	__log(tostring(k) .. " -> " .. tostring(v))
	if (hist[v] == nil) then
		hist[v] = 1
	else
		hist[v] = hist[v]+1
	end
end
__log("---hist--")
for k,v in pairs(hist) do
	__log(tostring(k) .. " => " .. tostring(v))
end
The initial output is not immediately obviously skewed

Code: Select all

Lua: 1 -> 25
Lua: 2 -> 27
Lua: 3 -> 33
Lua: 4 -> 31
Lua: 5 -> 29
Lua: 6 -> 33
Lua: 7 -> 32
Lua: 8 -> 36
Lua: 9 -> 30
Lua: 10 -> 68
Lua: 11 -> 35
Lua: 12 -> 28
Lua: 13 -> 36
...
although the result for 10 there probably gives you an idea for what is coming:

The histogram shows a distinctive double bell curve. The results with increased probability are just again spread over the range, so 10, 21, 32, etc are double sized

peaks, troughs and midpoints for brevity:

Code: Select all

Lua: 11 => 3
Lua: 20 => 318
Lua: 29 => 1980
Lua: 38 => 750
Lua: 47 => 50
Lua: 53 => 91
Lua: 59 => 155
Lua: 77 => 17
Lua: 96 => 1
Friends don't let friends eat worm round
kaypy
Artisan
Posts: 171
Joined: Sun Jan 19, 2014 7:11 am

Re: limits to dsb_rand

Post by kaypy »

Oh two additional things:

1) Is there sufficient data up there, or would you rather the full reams and reams of it? (I would be looking at putting a zip file somewhere rather than dumping all that to the forum)

2) I'm not sure that any of this actually constitutes a problem with dsb_rand, just a limitation of the system. If you use it within the limits, everything is fine. (eg if you are generating numbers in a sensible 1-100 range then the inaccuracies come out to about 0.3% which really isn't much)

(I would be more worried about "is this falling though to a C LCRNG which means it will inherit the usual issues with LCRNGs?" - but really, that is probably not a big deal either)
Friends don't let friends eat worm round
kaypy
Artisan
Posts: 171
Joined: Sun Jan 19, 2014 7:11 am

Re: limits to dsb_rand

Post by kaypy »

Oh, and in case anyone winds up here looking for larger dsb_rands, I'll leave this here. (nb I know there are statistical issues with this style of algorithm, but I'm not using it for cryptography, so it ought to be OK)

Code: Select all

-- random number generator suitable for ranges up to a couple hundred million
function rand_30bit(minval,maxval)
	-- determine required range
	local randrange = 1 + maxval - minval
	if (randrange == 1) then return minval end
	-- get maximum whole group (above this is 'unfair')
	local randmax = math.floor(32768 * 32768 / randrange) * randrange
	-- get useful random number (ie not in the unfair part of the range)
	local randnum = 32768 * 32768
	while randnum >= randmax do	
		randnum = dsb_rand(0,32767) * 32768 + dsb_rand(0,32767)
	end
	return minval + math.mod(randnum,randrange)
end
Friends don't let friends eat worm round
User avatar
Sophia
Concise and Honest
Posts: 4239
Joined: Thu Sep 12, 2002 9:50 pm
Location: Nowhere in particular
Contact:

Re: limits to dsb_rand

Post by Sophia »

Thanks, that was all very helpful. I don't need the full reams and reams; I think I've managed to figure out what's going on from what you've posted.

First of all, you're exactly right, it is using a C LCRNG (dsb_rand is more or less just a wrapper for stdlib rand) and apparently not a particularly robust implementation of one, either. Since both my code and Lua's built-in code have the same issue, I'm guessing the implementation of rand() on Windows (or, at least, the one you get if you're not using MSVC) only gives the bare minimum of 15 bits.

As you might already know, there are two fair ways to clamp a random number generator to a given range:
1) Use mod, and throw out values that will skew the distribution, like your code does. The disadvantage is this can occasionally slow the program down.
2) Treat the number as a fixed-point number and multiply it by the total range desired, like Lua does. The disadvantage is that for large ranges you might still have some weirdness, and some missing numbers.

There's also the method used by DSB, which is to use mod and not throw out the unfair values-- I wrote this code some 8 years ago and I just didn't think of it back then. This is unnoticeable with small numbers, but becomes problematic with larger ones, especially when it's only generating 15 random bits rather than 32.

Anyway, there are very fast and good implementations of the Mersenne Twister algorithm that didn't exist back when I first started working on DSB, so I think the best solution is probably to just completely swap out DSB's RNG for something more robust, which is what I'll do.
kaypy
Artisan
Posts: 171
Joined: Sun Jan 19, 2014 7:11 am

Re: limits to dsb_rand

Post by kaypy »

Cool. I wasn't sure any change was necessary, but if there's a drop in replacement MT in a suitable license, a better RNG is always better 8-)

thanks
Friends don't let friends eat worm round
Post Reply