- Intel Community
- Software
- Software Archive
- Cheap random projections


Sean_O_

Beginner

05-15-2017 07:08 AM

348 Views

Cheap random projections

Here is a low computational cost way of doing random projections: https://drive.google.com/open?id=0BwsgMLjV0BnhOGNxOTVITHY1U28

You can then go and do <vector,vector> or <vector,scalar> single-layer networks using that:

https://drive.google.com/open?id=0BwsgMLjV0BnhRFNoTTh3UEhwZlk

Or, if you are willing to use evolution, you can do deep networks with it. For example, this algorithm works well:

https://pdfs.semanticscholar.org/c980/dc8942b4d058be301d463dc3177e8aab850e.pdf
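
The linked notes aren't reproduced here, but the usual cheap construction is a random diagonal sign flip followed by a fast Walsh-Hadamard transform, which costs only O(n log n) additions and subtractions. A minimal sketch in Python (the function names are mine, not from the linked notes):

```python
import random

def fwht(vec):
    """In-place fast Walsh-Hadamard transform; len(vec) must be a power of 2."""
    n = len(vec)
    hs = 1
    while hs < n:
        for i in range(0, n, hs * 2):
            for j in range(i, i + hs):
                a, b = vec[j], vec[j + hs]
                vec[j], vec[j + hs] = a + b, a - b
        hs *= 2
    scale = n ** -0.5          # orthonormal scaling, preserves vector norm
    for i in range(n):
        vec[i] *= scale

def random_projection(vec, seed):
    """Cheap random projection: pseudorandom sign flip, then WHT."""
    rng = random.Random(seed)
    out = [v if rng.random() < 0.5 else -v for v in vec]
    fwht(out)
    return out
```

Because both the sign flip and the orthonormal WHT preserve the vector norm, the projection is norm-preserving as well, unlike a dense Gaussian projection, which only preserves norms approximately.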


16 Replies

Sean_O_

Beginner

05-19-2017 06:41 AM

So the latest nVidia GPU can do 120 TFLOPS of 16-bit floating-point arithmetic. A 65536-point Walsh-Hadamard transform needs about 65536 x 16, or roughly a million, add/subtract operations, so that's about 120 million 65536-point Walsh-Hadamard transforms per second, divided by say about 3 for all the addressing arithmetic. That's about 40 million 65536-point random projections (RP) per second. I won't even try to count the number of 1024-point RPs you could do per second. I guess you could evolve deep neural networks in real time that way.
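
The arithmetic above can be checked directly, assuming the standard n·log2(n) add/subtract count for an n-point transform:

```python
import math

flops = 120e12                          # claimed 16-bit throughput of the GPU
n = 65536
adds_per_wht = n * int(math.log2(n))    # ~1.05 million add/subtracts per transform
wht_per_sec = flops / adds_per_wht      # roughly 120 million transforms per second
rp_per_sec = wht_per_sec / 3            # ~40 million after addressing overhead
```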

Another thing you can do is use single-layer RP networks as soft memory that deep neural networks can learn to exploit more easily than random access memory (RAM), which poses impossible needle-in-a-haystack difficulties for evolution or gradient descent training algorithms. I mentioned you could use a truncation function to gate writes to such soft memory here: https://discourse.numenta.org/t/artificial-life-concept/2308

gaston-hillar

Black Belt

05-22-2017 10:00 PM

Sean,

The forum in which you entered this info is about Big Data with Intel technologies, on the Intel Software Network.

Sean_O_

Beginner

05-22-2017 10:59 PM

Yes, I followed the AI links on your website and this is where I was directed. Thank you.

There is also a related paper that was put on Arxiv today:

https://arxiv.org/pdf/1705.07441.pdf

And I said a little more here:

https://groups.google.com/forum/#!topic/artificial-general-intelligence/K_aBGB3khBk

Sean_O_

Beginner

05-25-2017 04:41 AM

Maybe there are a couple of ideas here that could be useful to Intel if random projections (RP) etc. become important.

The Walsh-Hadamard transform (and RP) could be pipelined in hardware. And if you went to extremes they could even be done in analog, since only patterns of addition and subtraction are required.

There was also talk of multiplier-free neural networks in recent papers. You can do that with single-layer RP networks if you use the signof function as the nonlinear activation function: signof(x) = 1 for x >= 0, signof(x) = -1 for x < 0. Then weight updates during training don't require multiplies either. And actually that proves to be a good nonlinearity to use anyway.

For deep RP nets I think you are restricted to training with evolution strategies (ES) rather than BP (but I could be wrong). However, with super fast evaluation times for RP nets that might not be a problem.
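
A sketch of the multiply-free idea (a Python illustration, not code from the thread): with signof as the activation, every forward term is just +w or -w, and the delta-rule update adds or subtracts one shared error value per weight.

```python
def signof(x):
    # sign nonlinearity: +1 for x >= 0, -1 otherwise
    return 1.0 if x >= 0.0 else -1.0

def forward(weights, proj):
    # y = sum_i weights[i] * signof(proj[i]); each term is just +w or -w,
    # so no per-weight multiply is needed
    return sum(w if p >= 0.0 else -w for w, p in zip(weights, proj))

def train_step(weights, proj, target):
    # delta-rule update: because the activation is +-1, every weight is
    # nudged by +e or -e -- add/subtract only
    e = (target - forward(weights, proj)) / len(weights)
    for i, p in enumerate(proj):
        weights[i] += e if p >= 0.0 else -e
```

With the error scaled by the vector length, a single update fully corrects the output for the training input.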

Sean_O_

Beginner

05-29-2017 06:46 PM

For the "no multiply" networks:

In hardware, all you would need are low-transistor-count, low-power integer add and subtract operations and a few other simple bit operators, avoiding the much more complex and space-consuming multiply logic circuits. It should be quite easy to pipeline the operations on an FPGA. One other thing is that you may not need full-precision integer add/subtract, because 2's complement overflow would simply increase the amount of nonlinearity, but probably not by too much.

You could imagine a chip that had, say, 100 million add/subtract logic units all working merrily away at 1 billion operations per second. That's 100 peta-operations per second; 10 chips would get you to exa-scale. I presume you could train a network in real time at such speeds.
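
The overflow point can be illustrated with a hypothetical helper (mine, not from the thread): a 16-bit two's complement add wraps at the range limits instead of saturating, and that wraparound itself acts as extra nonlinearity.

```python
def add16(a, b):
    # 16-bit two's-complement add: the sum wraps modulo 2^16 instead of
    # saturating, so overflow behaves like a built-in nonlinearity
    s = (a + b) & 0xFFFF
    return s - 0x10000 if s >= 0x8000 else s
```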

Sean_O_

Beginner

05-30-2017 05:25 PM

Code for binarization of continuous vectors, showing the 37 degree effect:

http://www.freebasic.net/forum/viewtopic.php?f=7&t=25710&p=232616#p232616
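
The linked code isn't reproduced here, but assuming the "37 degree effect" refers to the angle between a Gaussian vector and its binarized (sign) version, it follows from E[x*sign(x)] = sqrt(2/pi) ≈ 0.798 for a standard normal, and arccos(0.798) is about 37 degrees. A quick Monte Carlo check:

```python
import math
import random

def angle_after_binarization(n=4096, seed=1):
    # angle in degrees between a Gaussian vector and its +-1 sign vector
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    b = [1.0 if x >= 0.0 else -1.0 for x in v]
    dot = sum(x * y for x, y in zip(v, b))          # = sum of |v[i]|
    cos_t = dot / (math.sqrt(sum(x * x for x in v)) * math.sqrt(n))
    return math.degrees(math.acos(cos_t))
```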

Sean_O_

Beginner

06-04-2017 01:26 AM

A must-have book for this kind of thing is:

http://www.jjj.de/fxt/fxtbook.pdf

There are lots of bit transforms and other useful things in it for random projections, reservoir computing and such. The source code is also available.

Sean_O_

Beginner

06-04-2017 05:56 AM

This has high use in random projection, reservoir and extreme learning machine computing. And in fact it turns a simple locality-sensitive hash, formed by random projection followed by binarization, into a useful single-layer neural network.
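
An illustrative sketch (mine, not from the thread) of that locality-sensitive hash: take the signs of a few random projections as the hash bits. Attaching a trainable weight to each bit, as in the earlier post, is what turns the hash into a single-layer network.

```python
import random

def lsh_bits(vec, seed, nbits):
    # locality-sensitive hash: each bit is the sign of one random projection,
    # so nearby vectors agree on most bits and opposite vectors on none
    rng = random.Random(seed)
    bits = []
    for _ in range(nbits):
        proj = sum(x * rng.choice((-1.0, 1.0)) for x in vec)
        bits.append(1 if proj >= 0.0 else 0)
    return bits
```

(For speed, the dense +-1 projections here would be replaced by the sign-flip + WHT construction from the start of the thread.)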

Sean_O_

Beginner

06-05-2017 06:24 PM

Sean_O_

Beginner

06-07-2017 01:56 AM

This book will tell you how to do error-correcting associative memory using orthogonal projections:

https://archive.org/details/SelfOrganizationAndAssociativeMemory

Sean_O_

Beginner

06-07-2017 09:45 PM

By allowing more ways for information to pass through each layer, I am seeing markedly improved results.

https://drive.google.com/open?id=0BwsgMLjV0BnhSmtURDFCcjJnQTA

or

https://drive.google.com/open?id=0BwsgMLjV0BnhRDdaaUlreWZSVXc

Sean_O_

Beginner

06-09-2017 08:41 PM

Just to show how easy it is to code associative memory, here is some code in FreeBasic:

```freebasic
type associativememory
    veclen as ulong
    density as ulong
    hash as ulong              '32 bit unsigned integer
    weights as single ptr      'pointer to 32 bit floats
    binarize as boolean ptr
    work as single ptr
    declare sub init(veclen as ulong, density as ulong, hash as ulong)
    declare sub free()
    declare sub train(target as single, inVec as single ptr)
    declare function recall(inVec as single ptr) as single
    declare sub signflip(vec as single ptr, h as long)
    declare sub wht(vec as single ptr)
end type

sub associativememory.init(veclen as ulong, density as ulong, hash as ulong)
    this.veclen = veclen
    this.density = density
    this.hash = hash
    weights = callocate(veclen * density, sizeof(single)) 'allocate zeroed memory
    binarize = callocate(veclen * density, sizeof(boolean))
    work = callocate(veclen, sizeof(single))
end sub

sub associativememory.free()
    deallocate(weights)
    deallocate(binarize)
    deallocate(work)
end sub

function associativememory.recall(inVec as single ptr) as single
    dim as ulong i, j
    dim as single x = 0f
    dim as single ptr wtidx = weights
    dim as boolean ptr bidx = binarize
    for i = 0 to veclen - 1
        work[i] = inVec[i]
    next
    for i = 0 to density - 1
        signflip(work, hash + i) 'sign flip and wht = random projection
        wht(work)
        for j = 0 to veclen - 1
            if work[j] > 0f then 'if greater than 0
                x += wtidx[j]
                bidx[j] = true
            else
                x -= wtidx[j]
                bidx[j] = false
            end if
        next
        wtidx += veclen
        bidx += veclen
    next
    return x
end function

sub associativememory.train(target as single, inVec as single ptr)
    dim as ulong i, j
    dim as single ptr wtidx = weights
    dim as boolean ptr bidx = binarize
    dim as single e = target - recall(inVec) 'get the prediction error
    e /= veclen * density 'scale the error correctly so that it will be fully corrected
    for i = 0 to density - 1
        for j = 0 to veclen - 1
            if bidx[j] then
                wtidx[j] += e
            else
                wtidx[j] -= e
            end if
        next
        wtidx += veclen
        bidx += veclen
    next
end sub

'Pseudorandomly flip the signs of the values in vec using h as a seed
'h is a 32 bit signed integer
sub associativememory.signflip(vec as single ptr, h as long)
    dim as ulong i
    for i = 0 to veclen - 1
        h *= 1664525
        h += 1013904223
        if h < 0 then vec[i] = -vec[i]
    next
end sub

'Fast Walsh Hadamard transform. Leaves vector length unchanged
sub associativememory.wht(vec as single ptr)
    dim as ulong i, j, hs = 1
    dim as single a, b, scale = 1.0 / sqr(veclen)
    while hs < veclen
        i = 0
        while i < veclen
            j = i + hs
            while i < j
                a = vec[i]
                b = vec[i + hs]
                vec[i] = a + b
                vec[i + hs] = a - b
                i += 1
            wend
            i += hs
        wend
        hs += hs
    wend
    for i = 0 to veclen - 1
        vec[i] *= scale
    next
end sub

screenres 300, 300, 32
print "Please wait!"
dim as associativememory net
dim as single ptr vec = callocate(256, sizeof(single)) 'allocate zeroed
net.init(256, 3, 1234567)
for i as ulong = 0 to 99
    for j as ulong = 0 to 254
        vec[j] = 1f
        net.train(sin(j * .06) * 100, vec)
        vec[j + 1] = 1f
        net.train(cos(j * .2) * 100, vec)
        vec[j] = 0f
        vec[j + 1] = 0f
    next
next
cls
for i as ulong = 0 to 254
    vec[i] = 1f
    pset (i, 150 - sin(i * .06) * 100), rgb(0, 255, 0)
    pset (i, 150 - net.recall(vec)), rgb(255, 255, 0)
    vec[i] = 0f
next
for i as ulong = 0 to 254
    vec[i] = 1f
    vec[i + 1] = 1f
    pset (i, 150 - cos(i * .2) * 100), rgb(0, 255, 0)
    pset (i, 150 - net.recall(vec)), rgb(255, 0, 255)
    vec[i] = 0f
    vec[i + 1] = 0f
next
deallocate(vec)
net.free()
getkey
```

I've written the code in a way that should be easy to translate into C. The number of elements in the vector must be a power of 2 (16, 32, 64, ...) because of the wht function.

Sean_O_

Beginner

08-31-2017 06:36 PM

Sean_O_

Beginner

08-31-2017 06:38 PM

Anyway let's continue:

Sean_O_

Beginner

09-12-2017 06:29 AM

Circumstances led me to make a comment on the nVidia site about this type of network and associative memory:

Maybe they won't pay any attention. I did present earlier versions of the same ideas to them 10 years ago, and all they did was "clone" the idea for generating the Gaussian distribution fast, without any attribution. Google also tends to "clone" ideas without attribution.

Cloony__Arnold

Beginner

02-03-2020 05:01 AM

Thank you for the tips.

