Thursday, October 29, 2015

Integer factoring

Factoring of large integers is a great challenge in computational mathematics.
When the numbers are very large, no efficient, non-quantum integer factorization algorithm is known; an effort by several researchers concluded in 2009, factoring a 232-digit number (RSA-768), utilizing hundreds of machines over a span of two years. Wikipedia
However, there are efficient and simple algorithms for cell size numbers (about 19 digit numbers in 64 bit systems) that immediately delivers the prime factorization. One such algorithm is Pollard rho method.

: func ( numb n -- m ) dup um* 1 0 d+ rot um/mod drop ; 


: pollard1 ( numb start -- pfac flag )
\ numb is an odd composite 

  dup dup loc{ numb start alpha beta } 0
  begin drop numb alpha func to alpha
     numb dup beta func func to beta
     alpha beta - abs        \ |alpha-beta|
     numb ugcd dup numb =    \ gcd flag 
     if false exit 
     then dup 1 = 0=         \ gcd<>1 
  until true ;

: sqrtf \ m -- n             \ floor

  0 d>f fsqrt f>s ;

: sqrtfslow \ m -- n

  dup 4 u< if dup
  if drop 1 then exit then
  dup 1- >r 1 rshift
  begin r@ over 0 swap um/mod nip
     over + 1 rshift tuck u> 0=
  until r> drop ;

: sqrtc \ m -- n             \ ceiling

  1- sqrtf 1+ ;

: pollard2 \ numb -- prime numb>1
  dup 1 and 0= if drop 2 exit then
  dup prime if exit then
  dup sqrtf dup * over = 
  if sqrtf exit then -1 2    \ numb 100 0
  do dup i pollard1          \ numb pfac flag
     if leave                \ numb pfac
     then drop               \ numb
  loop nip ;                 \ pfac

: pollard \ n -- p1 ... pk      

  dup pollard2 2dup =        \ n q f=
  if drop exit               \ n
  then dup >r 
  0 swap um/mod nip recurse  \ q n/q
  r> recurse ;

4267640728818788929 pollard .s 1234567891 3456789019  ok

12345678987654321 pollard cr .s 
333667 37 3 3 333667 37 3 3  ok

Sometimes the greatest prime factor is of interest and I guess there is no other way to find it than factoring and sorting the stack.

: lower \ m1 ... mn m n+1 -- m1 ... m ... mi n+1  

\ lower m on stack until next number not is greater
  dup 2 =
  if drop 2dup u>
     if swap
     then 2 exit
  then >r 2dup u> 0= 
  if r> exit
  then r> rot >r 
  1- recurse 1+ r> swap ;

: sort \ m1 ... mn n -- s1 ... sn n o(n²)

  dup 3 <
  if dup 2 =
     if drop 2dup u> 
        if swap 
        then 2 
     then exit
  then dup >r
  1- recurse roll 
  r> lower ;

But to use sort one have to know the number of factors to sort.

: pollard# \ numb -- p1 ... pk k
  depth >r
  pollard depth r> - 1+ ;

12345678987654321 pollard# cr .s
333667 37 3 3 333667 37 3 3 8  ok
sort .s 3 3 3 3 37 37 333667 333667 8  ok
over . 333667  ok

Monday, October 26, 2015

Prime number tests

A simple pseudo prime number test is using Fermat's little theorem
for any prime number p and integer a such that 0<a<p. To select a first there is a need for a pseudo random number generator.

variable rnd base @ hex 

: reset_seed  0ABCDEF1 rnd ! ; reset_seed 

: rand ( -- u )
  rnd @ 08088405 * 1+ dup rnd ! ;

: random ( u1 -- u2 ) 
  rand um* nip ;

base ! 

A good thing with fermat is that flag always is true if p is a prime.

: fermat ( numb -- flag ) 
  dup 2 - random 2 + 
  over 1- rot u**mod 1 = ; 

It could therefore be used as a rather fast first primality test to see whether it's necessary to take a closer look.

Rabin-Miller strong pseudoprime test

is an efficient probabilistic algorithm for determining if a given number is prime.
: 2/mod \ n -- r q
  dup 1 and swap 1 rshift ;

: get-rs 0 loc{ numb r -- r s }
  numb 1-
  begin dup 2/mod swap 0=            \ n qout rest=0
  while nip r 1+ to r
  repeat 2drop 
  r numb 1- r rshift ;

: get-a ( numb -- a )

  2 - random 2 + ;

: rabin-miller1 ( numb -- flag )
  dup dup get-rs rot get-a false loc{ numb  r s a flag }
  a s numb u**mod 1 = 
  if true exit
  then false to flag r 0 
  ?do a s i lshift numb u**mod numb 1- = 
     if true to flag leave then
  loop flag ;

Both fermat and rabin-miller1 depends on pseudo random numbers that varies from test to test and are not even combined accurate enough. A repeated test with all odd numbers between 3 and 1000000 gave the series 12 12 7 17 13 of number of errors. A second test series with fermat plus 2 times rabin-miller gave the series 1 1 2 1 3.

Due to OEIS A014233 it is enough to repeat the Rabin-Miller test a few times, depending of the word length, to be sure. Instead of random numbers one use the first prime numbers: 

2 3 5 7 11 13 17 19 23 29 31 37 ...

Repeated Rabin-Miller will reveal all composite numbers less than:

         2027 if a=2
      1373653 if a=2,3
     25326001 if a=2,3,5
   3215031751 if a=2,3,5,7
2152302898747 if a=2,3,5,7,11

Since 2^32 < 2152302898747 a=2,3,5,7,11 suffice for 32 bit integers. For an accurate test of all 64 bit integers it's enough to test for a=2,3,5,7,11,13,17,19,23,29,31,37.

create pnr 
2 c, 3 c, 5 c, 7 c, 11 c, 13 c, 17 c, 19 c, 23 c, 29 c, 31 c, 37 c,

The word create defines the word pnr which leaves an address on tos, the address where the bytes 2 to 37 are allocated.

: rabin-miller2 loc{ numb a | s r flag -- flag } \ odd numb>a>1
  numb get-rs to s to r
  a s numb u**mod 1 =
  if true exit then
  false to flag r 0
  ?do a s i lshift numb u**mod numb 1- =
      if true to flag leave then
  loop flag ;

: rmx ( numb -- n )   \ 32 bit integers
  dup     2047 u< if drop 1 exit then
  dup  1373653 u< if drop 2 exit then
  dup 25326001 u< if drop 3 exit then
    3215031751 u< if 4 else 5 then ;

: rmx ( numb -- n )   \ 64 bit integers
  dup            2047 u< if drop 1 exit then
  dup         1373653 u< if drop 2 exit then
  dup        25326001 u< if drop 3 exit then
  dup      3215031751 u< if drop 4 exit then
  dup   2152302898747 u< if drop 5 exit then
  dup   3474749660383 u< if drop 6 exit then
  dup 341550071728321 u< if drop 8 exit then
  3825123056546413051 u< if 11 else 12 then ;

: rabin-miller ( numb -- flag )
  dup rmx 0
  do dup i pnr + c@ rabin-miller2 0=
     if 0= leave then
  loop 0= 0= ;

Or faster when composite:

: fermat-rabin-miller ( numb -- flag )   \ numb odd
  dup fermat
  if rabin-miller
  else 0=
  then ;

suitable to define 

: nextprime ( numb -- prime )  \ numb unsigned integer
  dup 3 u< if drop 2 exit then
  1 or

  begin dup fermat-rabin-miller 0=
  while 2 +
  repeat ;

: prime ( numb -- flag )
  dup 3 u< if 2 = exit then
  dup 1 and 0= 
  if drop false exit 
  then rabin-miller ;

Friday, October 23, 2015

Single cell computational arithmetic

To manage single cell computational arithmetic there is a need for words as:

ugcd ( a b -- c )  unsigned-greatest-common-divisor
where
a and b are unsigned single cell numbers and c is their greatest common divisor.

u*mod ( a b m -- c )  unsigned-star-mod
where
a b m are any unsigned single cell numbers and the result is the single cell number c=a*b(mod m).

u**mod ( b a m -- c )  unsigned-power-mod
as above but where
c=b^a(mod m).

invmod ( a m -- c )  signed-invers-mod
here
0<a<m and m is a signed number relative prime to a. The result c is a number 0<c<m such that a*c(mod m)=1.

It should be possible to compute the numbers above for any single number arguments (in the case below in 64bit-systems):

1234567890123456789 987654321098765432 123123123123123123 u**mod  ok
. 1321916066429949  ok

The usual single number arithmetic
+ - * / */ etc. aren't powerful enough but there are standard Forth words which are:

um* ( b  a-- ud )  where a b are unsigned singles and ud is an unsigned double

um/mod ( ud u -- r q )
 where r is the rest and q is the quote

m*/ ( d a b -- d' )  where d'=d*a/b, d d' dubble cell numbers

Greatest common divisor

: ugcd 0 loc{ a b t -- c }     \ Algorithm from Wikipedia
  a b u< if a b to a to b then \ a>=b as unsigned numbers
 
begin b                      \ while b<>0
 
while b to t                 \ t <- b
    
a 0 b um/mod drop to b    \ b <- a(mod b)
    
t to a                    \ a <- t
  
repeat a ;

: ugcd ( a b -- c )     \ a version without local variables
 
2dup u< if swap then  \ smallest of a b on top of stack (tos)
 
?dup 0= if exit then  \ return second value if tos is zero
 
begin tuck            \ y x y  first time b a b
    
0 swap um/mod      \ y x 0 y --> y r q
    
drop dup 0=        \ y r [r=0]
 
until drop ;          \ y

The word
um/mod has the stackdiagram ( ud u -- r q ) and is used above to calculate a(mod b). The single unsigned number a is converted to a double by putting a zero on the top of a. The word ?dup duplicates the top of stack if it isn't zero.

Modular multiplication

: u*mod ( u1 u2 u3 -- u4 )
  >r r@ umod swap r@ umod um*
  r> um/mod drop ;


Here m is stored in the return stack:

>r ( a -- )  tos is popped from stack and pushed to r-stack
r> ( -- a )  r-stack is popped and a is pushed to the parameterstack
r@ ( -- a )  a copy of a is pushed on parameterstack without popping

Modular exponentiation

Below lshift ( m i -- n ) n=m*2^i and rshift ( m i -- n ) n=m/2^i
are standard words and log~ is defined (bits gives the maximal number of bits in one cell):

cell 8 * constant bits

: log~ ( n -- nr )      \ nr = 1+²log n
 
bits here !           \ bits is stored at the address 'here'
 
bits 0                \ do-loop from i=0,...,bits-1
 
do 1 rshift ?dup 0=   \ shift tos at right test if zero
    
if i 1+ here !     \ if zero store i+1 at 'here'
       
leave           \ and leave the loop
    
then
 
loop here @ ;

: u**mod loc{ b a m -- x }
  1                     \
preparation for repeated multiplication
  a log~ 0              \ n 0  n is the number of binary digits in a
  ?do a 1 i lshift and  \ flag  the i-th digit of a is 0/1
     if b m u*mod       \ multiply if flag
     then b dup m       \ square b for each i: b b^2 b^4 b^8 ...
     u*mod to b         \ new squared number to b
  loop ;                \ result of the repeated multiplication

The algorithm is called square-and-multiply:



The ?do-loop differs from do-loop in that there is an initial check with ?do, while with do the calculation between do and loop always is done at least once.

Modular inverse

: invmod ( a m -- a' )  \ a m must be coprime
  dup 1 0 loc{ a m v b c -- a' }
  begin a
  while v a / >r
     c b s>d c s>d r@ 1 m*/ d- d>s to c to b  \
old c become new b
     a v s>d a s>d r> 1 m*/ d- d>s to a to v  \ old a become new v
  repeat b m mod dup to b 0<
  if m b + else b then ;


The words s>d and d>s converts between single and double signed integers.

1234567890123456789 123123123123123133 2dup ugcd . 1  ok
2dup invmod dup . 94365978201029936 ok
swap u*mod . 1 ok