Intel® ISA Extensions
Use hardware-based isolation and memory encryption to provide more code protection in your solutions.

Optimization of sine function's taylor expansion

Bernard
Valued Contributor I
37,720 Views

Hello!
While coding in assembly various series expansions of many functions i tried to optimize my code mainly by using rcpps instruction instead of divaps.When performing tests oncode which calculates sine function by taylor series i did a few measurements as explained here :"http://software.intel.com/en-us/forums/showthread.php?t=52482" and the result was between 10-12 cycles per first term of sine expansion(i used 14 terms).
I would like to ask you how can i rewrite this code in order to gain speed of execution improvment.
[bash]movups xmm0,argument movups xmm1,argument mulps xmm1,xmm1 mulps xmm1,xmm0 mov ebx,OFFSET coef movups xmm2,[ebx] rcpps xmm3,xmm2 ;rcpps used instead of divaps mulps xmm1,xmm3 subps xmm0,xmm1[/bash]

0 Kudos
1 Solution
bronxzv
New Contributor II
37,529 Views

calls 1e6 times fastsin() the result in millisecond is 63

so it's 63 ns per iteration or ~ 120 clocks on your CPU, it does't match your previous reports IIRC

if you keep only the polynomial (get rid of the strange domain check) you should begin to see timings nearer than mine

View solution in original post

0 Kudos
342 Replies
SergeyKostrov
Valued Contributor II
2,054 Views
Hi Iliya, . Thanks for the results. By the way, you have not included performance numbers for a single-thread case. It would be nice to see a performance improvement of a multi-threaded sorting over single-thread sorting. . >>... five threads(main and 4 spawned threads)without synchronized{} statement:~7.603 miliseconds... . It's the best case.
0 Kudos
SergeyKostrov
Valued Contributor II
2,054 Views
Hi Iliya, . >>...- 2. Intel wanted to make the upgrade before IDF 2012 event in San Francisco... . By the way, if you have some time and interest you could look at some videos from the IDF 2012. Here a couple of links: . http://www.intel.com/content/www/us/en/intel-developer-forum-idf/san-francisco/idf-2012-san-francisco-library.html?eid=idf12cn+em0927_e1 . IDF 2012 San Francisco highlights from day two: . http://www.intel.com/content/www/us/en/intel-developer-forum-idf/san-francisco/idf-2012-san-francisco-keynote-renee-james.html . Renee James in her keynotes announced about the new IDZ website.
0 Kudos
Bernard
Valued Contributor I
2,054 Views
Hi Sergey! >>... five threads(main and 4 spawned threads)without synchronized{} statement:~7.603 miliseconds... . It's the best case. Yes probably because of my CPU(2physical cores +2 logical),but when I looked at sorting results the numbers were not sorted at monotonically increased order over the whole domain(array length).For this I can blame the lack of global synchronization lock in the form of synchronized{} statement.With the synchronized{} statement only one thread was accessing an array and other threads were waiting on global lock.This can explain the better results pointed by you. Soon I will add single-threaded results and integer values array sorting in order to eliminate dependency on FPU. I also plan to test floating-point heavy calculation like a projectile motion. Sergey thanks for posting info about the IDF
0 Kudos
SergeyKostrov
Valued Contributor II
2,054 Views
>>... the numbers were not sorted at monotonically increased order over the whole domain(array length).For this I can blame the lack of global synchronization lock in the form of >>synchronized{} statement... . The Bubble Sorting algorithm is iterative and there has to be a last step in the sorting that uses as input a complete data set. I think only in that case the data set will be sorted completely. Did you consider a recursive Merge Sorting algorithm for your R&D?
0 Kudos
Bernard
Valued Contributor I
2,054 Views
Hi Sergey >>>....Did you consider a recursive Merge Sorting algorithm for your R&D...>>> No.I think that I cannot exclude an unsynchronized array accesses performed by the different threads running on two different physical CPUs. I started recently a new project which involves arbitrary precision arithmetics represented by the Java.Math.BigDecimal class.Today i was able to calculate at very large precision reaching more than 300 decimal places.So I think that I can unleash the power of multithreading to perform calculation at very large arbitrary precision.
0 Kudos
Bernard
Valued Contributor I
2,054 Views
Hi Sergey If you are interested I can post a test-case which involves arbitrary-precision arithmetics and multi-threading. Today I performed a few tests of one of my Special Functions library method.This method (Airy function) is approximated by very large precision coefficients which have 100 decimal places of precision(coefficients were calculated by Mathematica 8) and final calculation is performed by two Horner scheme polynomials multiplications.Every polynomial is constructed from 12 Java.BigDecimal coefficients multiplied by the function argument. The result is accurate up to 15 decimal places were compared to Mathematica 8. But the most interesting part is the time measurement of the method calculation - it is whopping ~2.1 microsecond per one loop iteration and I suppose that this is more than 40-50 time slower than similar method based on double precision arithmetics.
0 Kudos
SergeyKostrov
Valued Contributor II
2,054 Views
Hi Iliya, . >>...Today i was able to calculate at very large precision reaching more than 300 decimal places... . That's interesting and you could calculate your own highly accurate tables of any trigonometric functions. I still don't have time for Intel and Borland Big Decimal Numbers Libraries.
0 Kudos
SergeyKostrov
Valued Contributor II
2,054 Views
>>...This method (Airy function)... . Is it related to Optics? . >>...If you are interested I can post a test-case which involves arbitrary-precision arithmetics and multi-threading... . Did you do / consider any programming for Android? It is 99% Java based and IDZ has a dedicated forum for this.
0 Kudos
Bernard
Valued Contributor I
2,053 Views
Hi Sergey! ...>>>That's interesting and you could calculate your own highly accurate tables of any trigonometric functions. I still don't have time for Intel and Borland Big Decimal Numbers Libraries....>>> Yes this is what I'am doing now.My new library will contain many trigonometric and special functions tables calculated up to 200 decimal places of precision. If you are interested I will post the source code. ...>>>Is it related to Optics?...>>> Yes , but my main reason behind dealing with those function is to simply optimize my Special Functions library. ...>>>Did you do / consider any programming for Android? It is 99% Java based and IDZ has a dedicated forum for this...>>> No mainly because I'm a dedicated Windows "fanboy":) Java is used for those areas of programming where I'am not fluent in(Windows system programming,Windows multithreading).
0 Kudos
Bernard
Valued Contributor I
2,054 Views
@Sergey Here is the code for Airy(x) function written in Java.This program calculates the value of Airy function up to 225 decimal places. package arbitrary_precision_functions_lib; import java.math.BigDecimal; import arbitrary_precision_functions_lib.HelperMethods; public class ArbitraryPrecisionAi implements Runnable{ private BigDecimal[]a; private Thread[]threads; private BigDecimal result; private int size,numofThreads; private int startLoop,currLoop,endLoop; private class AiRange{ public int start,end; } public ArbitraryPrecisionAi(int size,int numofThreads){ this.size = size; this.numofThreads = numofThreads; a = new BigDecimal[size]; threads = new Thread[numofThreads]; startLoop = currLoop = 0; endLoop = size; } private synchronized AiRange getAiRange(){ AiRange ar = new AiRange(); if(currLoop > endLoop) return null; ar.start = currLoop; currLoop += (endLoop - startLoop)/numofThreads+1; ar.end = (currLoop < endLoop)?currLoop:endLoop; return ar; } public void run(){ AiRange ar2; while((ar2 = getAiRange()) != null){ multiThreadedAi(ar2.start,ar2.end,0.001); System.out.println("Current thread = " + Thread.currentThread() + " "); } } public void multiThreadedAi(int start,int end,double inc){ if( inc == 0.0) throw new IllegalArgumentException("arg inc cannot be equal to zero"); double x; x = 0; BigDecimal sum = BigDecimal.ZERO; BigDecimal term1 = BigDecimal.ZERO; BigDecimal term2 = BigDecimal.ZERO; for(int i = start;i < end;i++){ x += inc; BigDecimal x1 = BigDecimal.valueOf(x); BigDecimal x2 = x1; BigDecimal c1 = BigDecimal.valueOf( 0.355028053887817); BigDecimal c2 = BigDecimal.valueOf(0.258819403792807); //double f_part,g_part,f_arg,g_arg,c1,c2; BigDecimal fcoef1 = new BigDecimal("0.1666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666667"); BigDecimal fcoef2 = new BigDecimal("0.005555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555556"); BigDecimal fcoef3 = new BigDecimal("0.00007716049382716049382716049382716049382716049382716049382716049382716049382716049382716049382716049383"); BigDecimal fcoef4 = new BigDecimal("5.845491956603067714178825289936401047512158623269734380845491956603067714178825289936401047512158623e-7"); BigDecimal fcoef5 = new BigDecimal("2.783567598382413197228012042826857641672456487271302086116900931715746530561345376160190975005789821e-9"); BigDecimal fcoef6 = new BigDecimal("9.096626138504618291594810597473390985857700938795104856591179515410936374383481621438532598058136669e-12"); BigDecimal fcoef7 = new BigDecimal("2.165863366310623402760669189874616901394690699713120203950280837002603898662733719390126809061461112e-14"); BigDecimal fcoef8 = new BigDecimal("3.923665518678665584711357228033726270642555615422319210054856588772833149751329201793707987430183173e-17"); BigDecimal fcoef9 = new BigDecimal("5.589267120624879750301078672412715485245805719974813689536832747539648361469129917085054113148409079e-20"); BigDecimal fcoef10 = new BigDecimal("6.424444966235493965863308818865190212926213471235418033950382468436377426976011398948338061090125378e-22"); BigDecimal fcoef11 = new BigDecimal("6.083754702874520801006921229985975580422550635639600410937862186019296805848495642943501951789891456e-25"); BigDecimal fcoef12 = new BigDecimal("4.828376748313111746830889865068234587636944948920317786458620782554997464959123526145636469674517029e-28"); BigDecimal gcoef1 = new BigDecimal("0.08333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333"); BigDecimal gcoef2 = new BigDecimal("0.001984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984"); BigDecimal gcoef3 = new BigDecimal("0.00002204585537918871252204585537918871252204585537918871252204585537918871252204585537918871252204585538"); BigDecimal gcoef4 = new BigDecimal("1.413195857640302084746529190973635418079862524306968751413195857640302084746529190973635418079862524e-7"); BigDecimal gcoef5 = new BigDecimal("5.888316073501258686443871629056814241999427184612369797554982740167925353110538295723480908666093851e-10"); BigDecimal gcoef6 = new BigDecimal("1.737381935562572339700344898525535302663307861488498265626270824451461228589233787414424191552196750e-12"); BigDecimal gcoef7 = new BigDecimal("3.760566960092147921429317962176483339098068964260818756766819966345154174435570968429489592104321971e-15"); BigDecimal gcoef8 = new BigDecimal("6.267611600153579869048863270294138898496781607101364594611366610575256957392618280715815986840536618e-18"); BigDecimal gcoef9 = new BigDecimal("8.290491534594682366466750357531929759916377787171117188639373823512244652635738466555312151905471716e-21"); BigDecimal gcoef10 = new BigDecimal("8.914507026445895017706183180141859956899330953947437837246638519905639411436277921027217367640292168e-24"); BigDecimal gcoef11 = new BigDecimal("7.945193428204897520237239910999875184402255752181317145496112762839250812331798503589320292014520649e-27"); BigDecimal gcoef12 = new BigDecimal("5.964859931084757898076005939189095483785477291427415274396481053182620729978827705397387606617507995e-30"); BigDecimal f_arg = BigDecimal.valueOf(0); BigDecimal g_arg = BigDecimal.valueOf(0); BigDecimal f_part = BigDecimal.valueOf(0); BigDecimal g_part = BigDecimal.valueOf(0); BigDecimal one = BigDecimal.ONE; f_arg = x1.pow(3); g_arg = x1.pow(4); // Example of Horner scheme polynomial multiplication for BigDecimal polynomial f_part = one.add(f_arg.multiply(fcoef1)).add(f_arg.multiply(fcoef2)).add(f_arg.multiply(fcoef3)).add(f_arg.multiply(fcoef4)).add(f_arg.multiply(fcoef5)).add(f_arg.multiply(fcoef6)).add(f_arg.multiply(fcoef7)).add(f_arg.multiply(fcoef8)).add(f_arg.multiply(fcoef9)).add(f_arg.multiply(fcoef10)).add(f_arg.multiply(fcoef11)).add(f_arg.multiply(fcoef12)); g_part = x2.add(g_arg.multiply(gcoef1)).add(g_arg.multiply(gcoef2)).add(g_arg.multiply(gcoef3)).add(g_arg.multiply(gcoef4)).add(g_arg.multiply(gcoef5)).add(g_arg.multiply(gcoef6)).add(g_arg.multiply(gcoef7)).add(g_arg.multiply(gcoef8)).add(g_arg.multiply(gcoef9)).add(g_arg.multiply(gcoef10)).add(g_arg.multiply(gcoef11)).add(g_arg.multiply(gcoef12)); term1 = c1.multiply(f_part); term2 = c2.multiply(g_part); sum = term1.subtract(term2); a = sum; } } public BigDecimal[] getBigDecimalAi(){ for(int i = 0;i < numofThreads;i++){ threads = new Thread(this); threads.setName(Integer.toString(i)); threads.start(); System.out.println("Active threads =" + Thread.activeCount() + " " + "names = " + threads.getName() + " " + " threads status = " + threads.getState() + " "); } for(int i = 0;i < numofThreads;i++){ try{ threads.join(); }catch(InterruptedException e){ e.printStackTrace(); } } return a; } /** * @param args */ public static void main(String[] args) { // TODO Auto-generated method stub BigDecimal[]test; ArbitraryPrecisionAi ap = new ArbitraryPrecisionAi(10,2); long start,end; start = System.nanoTime(); test = ap.getBigDecimalAi(); end = System.nanoTime(); for(int i = 0;i < test.length;i++){ System.out.println("test = " + test + " " + " precision = " + test.precision() + " " ); } System.out.println("running time is = " + (end - start)); } }
0 Kudos
SergeyKostrov
Valued Contributor II
2,054 Views
Hi Iliya, Thank you for the source codes! . >>>>Did you do / consider any programming for Android? It is 99% Java based and IDZ has a dedicated forum for this... >> >>No mainly because I'm a dedicated Windows "fanboy":) >>Java is used for those areas of programming where I'am not fluent in(Windows system programming,Windows multithreading). . Do you know how many Android applications implemented in Java are downloaded from Google Play? Take a look at: . http://en.wikipedia.org/wiki/Google_Play . PS: It would be nice to know how many scientific applications uploaded to / exist on / downloaded from Google Play.
0 Kudos
Bernard
Valued Contributor I
2,054 Views
Hi Sergey! ...>>>Hi Iliya, Thank you for the source codes!...>>> You are welcome. When I will finish my project I will send you all the source code for the dozens of methods. ...>>>Do you know how many Android applications implemented in Java are downloaded from Google Play? Take a look at>>> Thank you for the link.I would like to focus my effort on learning DirectX programming.It would be great to write an application which could be able to visualize in 3D various math objects ie. functions. >>>....PS: It would be nice to know how many scientific applications uploaded to / exist on / downloaded from Google Play.>>> Do you refer to Android based scientific apps?
0 Kudos
SergeyKostrov
Valued Contributor II
2,054 Views
>>...Do you refer to Android based scientific apps? . Yes.
0 Kudos
SergeyKostrov
Valued Contributor II
2,054 Views
Hi Iliya, . >>...Today i was able to calculate at very large precision reaching more than 300 decimal places... . Did you try to calculate PI number? If interested I could give a very simple formula...
0 Kudos
Bernard
Valued Contributor I
2,054 Views
Hi Sergey ...>>>Did you try to calculate PI number? If interested I could give a very simple formula...>>> No I did not,but I calculated Zeta(3) function only by means of an iterative method and precision was more than 200 decimal places. Could you give me your formula? Yesterday I tested fastsin() like function written with the help of arbitrary precision BigDecimal class the result reached almost 1025 decimal places ,but was accurate only to four decimal places when compared to similar calculation performed by Mathematica 8. I also uploaded a text file with the results sadly an accuracy is up to 4 dec. places.I will write another version with 100 dec. places coefficients and singl-threaded to find the error. here is the code for arbitrary precision fastsin(): package arbitrary_precision_functions_lib; import java.math.BigDecimal; public class ArbitraryPrecisionSin implements Runnable{ private BigDecimal[]a; private Thread[]threads; private BigDecimal result; private int size,numofThreads; private int startLoop,currentLoop,endLoop; private int numofCPUs; private class SinRange{ public int start; public int end; } public ArbitraryPrecisionSin(int size,int numofThreads ){ if(numofThreads <= 0) throw new IllegalArgumentException("parameter numofThreads must be greater than 0"); this.size = size; this.numofThreads = numofThreads; a = new BigDecimal[size]; threads = new Thread[numofThreads]; startLoop = currentLoop = 0; endLoop = size; } public ArbitraryPrecisionSin(int size){ numofCPUs = Runtime.getRuntime().availableProcessors(); this.size = size; numofThreads = numofCPUs; a = new BigDecimal[size]; threads = new Thread[numofThreads]; startLoop = currentLoop = 0; endLoop = size; } private synchronized SinRange getSinRange(int c){ if(c < 0 || c > 1) throw new IllegalArgumentException("parameter c must be 0 or 1"); SinRange sr = new SinRange(); if(currentLoop > endLoop) return null; int cpu; cpu = Runtime.getRuntime().availableProcessors(); switch(c){ case 0 : { sr.start = currentLoop; currentLoop += (endLoop - startLoop)/numofThreads; sr.end = (currentLoop < endLoop)?currentLoop:endLoop; } break; case 1 :{ sr.start = currentLoop; currentLoop += (endLoop - startLoop)/cpu; sr.end = (currentLoop < endLoop)?currentLoop:endLoop; } break; default : { System.out.println("default switch option hit for arbitrary number of threads enter 0 for CPU limited thread number enter 1"); } } return sr; } public void run(){ SinRange sr; while((sr = getSinRange(1)) != null){ multiThreadedSin(sr.start,sr.end,0.01); System.out.println("Current thread = " + Thread.currentThread() + " "); } } public void multiThreadedSin(int start,int end,double inc){ if(inc == 0) throw new IllegalArgumentException("parameter inc cannot be 0"); if(size * inc > (Math.PI/2) || size * -inc < (-Math.PI/2)) throw new IllegalArgumentException("parameter inc * size cannot excede PI/2"); double x; x = 0; BigDecimal rad = BigDecimal.ZERO; BigDecimal sum = BigDecimal.valueOf(0); for(int i = start;i < end;i++){ x += inc; BigDecimal arg = BigDecimal.valueOf(x); rad = BigDecimal.valueOf(x); BigDecimal coef1 = new BigDecimal("-0.1666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666667"); BigDecimal coef2 = new BigDecimal("0.008333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333"); BigDecimal coef3 = new BigDecimal("-0.0001984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984126984"); BigDecimal coef4 = new BigDecimal("2.755731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922398589065255731922e-6"); BigDecimal coef5 = new BigDecimal("-2.505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210838544171877505210839e-8"); BigDecimal coef6 = new BigDecimal("1.605904383682161459939237717015494793272571050348828126605904383682161459939237717015494793272571050348828126605904383682161459939237717015494793272571050348828126605904383682161459939237717015494793272571050348828126605904383682161459939237717015494793272571050348828126605904383682161459939237717015494793272571050348828126605904383682161459939237717015494793272571050348828126605904383682161459939237717015494793272571050348828126605904383682161459939237717015494793272571050348828126605904383682161459939237717015494793272571050348828126605904383682161459939237717015494793272571050348828126605904383682161459939237717015494793272571050348828126605904383682161459939237717015494793272571050348828126605904383682161459939237717015494793272571050348828126605904383682161459939237717015494793272571050348828126605904383682161459939237717015494793272571050348828126605904383682161459939237717015494793272571050348828126605904383682161459939237717015494793272571050348828126605904383682161459939237717e-10"); BigDecimal coef7 = new BigDecimal("-7.647163731819816475901131985788070444155100239756324412409068493724578380663036747692832348917005001661086317170973255629340285424941509597594253678909763565848221932878017534102190186846271502356158440814525470610126694782779438864094948751033407118063202719287375372031456687541343625999710655795311879967964624049280133936218592303248387904472560557216641872726528811184895840980497065153149809234465319121403777488433573089657745742401827057911713996370081026165682250338334994419650504306588962673618758274842930927587012243096899181555266211350867435523520179604835689491774147858803943460028116112772197428282084366740451396536052620708705364790020874676959333043989128645213301297957382613467269551925636581721237805893890549975206059862144518229174313830398486483142567798652454737110821766906422991079075735160391245047329703414359499015583671668327752983837639922296006952091608176264260920345576430232514888599544684200768856853512938169022825107481192137276793361449446105530761615417700e-13"); BigDecimal coef8 = new BigDecimal("2.811457254345520763198945583010320016233492735204531033973922240339918522302587039592953069454781250610693498959916638099022163759169672646174357970187413075679493357675741740478746392222893934689764132652399070077252461317198323111799613511409340852229118646796829180893917899831376333088128917571805838223516405900470637476550953052664848494291382557800235982620047357053270529772241568071010959277376955559339624076629990106491818287647730535996953675136059200796206709683211395007224450112716530394712778777515783429259930971726801169689436107114289498354235360148836650548446377889266155683833866217930954936868413370125165954608842875260553442937507674513587990089701885531328419594837273019657084394090307566809278605108047996314413992596376661113667027143528855324684767573033990712173096237833243746720248432044261487149753567431749815814552820466296968008763838206726473144151326535391272397185873687585483414926303192720870903254967991973905450407162202991645879912297590479974544711550625e-15"); BigDecimal coef9 = new BigDecimal("-8.220635246624329716955981236872280749220738991826114134426673217368182813750254501733780904838541668452320172397417070464977087015116001889398707515167874490290916250513864738241948515271619692075333721205845234144013044787129599742104133074296318281371691949698330938286309648629755359906809700502356252115545046492604203147809804247558036533015738473100105212339319757465703303427606924184242570986482326196899485604181257621321106104233130222213314839579120470164347104336875423997732310271100966066411633852385331664502722139552050203770281014954062860684898713885487282305398765758088174514134111748336125546398869503289958931604803728832027610928385013197625702016672179916164969575547581928821884193246513353243504693298385954135713428644376202086745693402131155920130899336356697988810222917641063586901311204807782126168870080209794783083487779141219204703987831013820096912723177004068047944987934759021881330193868984563950009517450268929548100605737435648087368164612837660744282782311769e-18"); BigDecimal coef10 = new BigDecimal("1.95729410633912612308475743735054303552874737900621765105396981365909114613101297660328116781870039725055242199938501677737549690836095283080921607503997011673593244059853922339094012268371897430365088600139172241524120113979276184335812692245150435270754570230912641387769277348327508569209754773865625050370120154585814360662138196370429441262279487454764409817602851368231031033990641052005775499678150623735702038194350262134777717055679139364682024457652274297667564352622608110204924975884258449352929856848990020993931936693273413363225333064167449990221810667554483816097840091669808958500480160042837182135613225614744957591474582503179343691593642472846222788911581994333830177546048126465695269261746751594759240529266105015139738407429716138601468778590526141615019878042652671239215235197860433356407791232620016201891663793990243305078617587853467607145123202444983225738239461294201925563848717342300177210186264707e-20"); BigDecimal coef11 = new BigDecimal("-3.868170170630684037716911931522812323179342646257347136470296074425081316464452522931385707151581812748127316204318214975050389146958404803970782756995988372995913914226362101563122772102211411667294241109469807144745456798009410757624756763738150894678944075709735995805716943642836137731418078534893775699014232304067477483441466331431411882653744811358980431177921963799032233873332827114738646238698628927583044233098653125033458547069984106066871277799322637946709535261093273102640838636881689284025801737429574470404066506470944007044175143494288942539478032131322831877187448596879434648096231765639057757575225627371522177491437854711099007589113971954463439684110756595221611883845088428769943625657120907793856904431764518226855556486154809941062343968629378844405655625991293990593931355938765098297247884814503164958060455585260108734936843187097310702045845573978965232788997272759292276015403142773330194896418682742306610915419851745505411540437340320010995748453245652524130802894678e-23"); BigDecimal coef12 = new BigDecimal("6.446950284384473396194853219204687205298904410428911894117160124041802194107420871552309511919303021246878860340530358291750648578264008006617971261659980621659856523710603502605204620170352352778823735182449678574575761330015684596041261272896918157798240126182893326342861572738060229552363464224822959498357053840112462472402443885719019804422908018931634051963203272998387056455554711857897743731164381545971740388497755208389097578449973510111452129665537729911182558768488788504401397728136148806709669562382624117340110844118240011740291905823814904232463386885538053128645747661465724413493719609398429595958709378952536962485729757851831679315189953257439066140184594325369353139741814047949906042761868179656428174052940863711425927476924683235103906614382298074009426043318823317656552259897941830495413141357505274930100759308766847891561405311828851170076409289964942054648328787932153793359005237955550324827364471237177684859033086242509019234062233866684992914088742754206884671491130e-26"); arg = arg.pow(2); sum = rad.add(rad.multiply(arg)).add(arg.multiply(coef1)).add(arg.multiply(coef2)).add(arg.multiply(coef3)).add(arg.multiply(coef4)).add(arg.multiply(coef5)).add(arg.multiply(coef6)).add(arg.multiply(coef7)).add(arg.multiply(coef8)).add(arg.multiply(coef9)).add(arg.multiply(coef10)).add(arg.multiply(coef11)).add(arg.multiply(coef12)); a = sum; } } public BigDecimal[] getSinValues(int c){ if(c < 0 || c > 1) throw new IllegalArgumentException("parameter c must be 0 or 1"); switch(c){ case 0 : { for(int i = 0;i < numofThreads;i++){ threads = new Thread(this); threads.setName(Integer.toString(i)); threads.start(); System.out.println("Active threads =" + Thread.activeCount() + " " + "names = " + threads.getName() + " " + " threads status = " + threads.getState() + " "); } for(int i = 0;i < numofThreads;i++){ try{ threads.join(); }catch(InterruptedException e){ e.printStackTrace(); } } } break; case 1 : { for(int i = 0;i < numofCPUs;i++){ threads = new Thread(this); threads.setName(Integer.toString(i)); threads.start(); System.out.println("Active threads =" + Thread.activeCount() + " " + "names = " + threads.getName() + " " + " threads status = " + threads.getState() + " "); } for(int i = 0;i < numofCPUs;i++){ try{ threads.join(); }catch(InterruptedException e){ e.printStackTrace(); } } } break; } return a; } /** * @param args */ public static void main(String[] args) { // TODO Auto-generated method stub BigDecimal[]test; ArbitraryPrecisionSin ap = new ArbitraryPrecisionSin(100); test = ap.getSinValues(1); for(int i = 0;i < test.length;i++) System.out.println("test = " + test + " " + "precision = " + test.precision() + " "); } }
0 Kudos
Bernard
Valued Contributor I
2,054 Views
@Sergey Uploaded text file with the results of 1000 decimal precision fastsin() test step 0.01 two threads.
0 Kudos
SergeyKostrov
Valued Contributor II
2,054 Views
>>>>Did you try to calculate PI number? If interested I could give a very simple formula... >> >>No I did not,but I calculated Zeta(3) function only by means of an iterative method and precision was more than 200 decimal places. >>Could you give me your formula? . The value of PI can be calculated as follows: . PI / 4 = 1 - 1/3 + 1/5 - 1/7 + 1/9 - 1/11 + 1/13 - ...
0 Kudos
SergeyKostrov
Valued Contributor II
2,054 Views
>>>>Did you try to calculate PI number? If interested I could give a very simple formula... >> >>No I did not,but I calculated Zeta(3) function only by means of an iterative method and precision was more than 200 decimal places. >>Could you give me your formula? . Here is a piece of C++ codes I tried ( single- and double precision data types ): ... RTfloat fPI = ( 1.0f - 1.0f/ 3.0f + 1.0f/ 5.0f - 1.0f/ 7.0f + 1.0f/ 9.0f - 1.0f/ 11.0f + 1.0f/ 13.0f - 1.0f/ 15.0f + 1.0f/ 17.0f - 1.0f/ 19.0f + 1.0f/ 21.0f - 1.0f/ 23.0f + 1.0f/ 25.0f - 1.0f/ 27.0f + 1.0f/ 29.0f - 1.0f/ 31.0f + 1.0f/ 33.0f - 1.0f/ 35.0f + 1.0f/ 37.0f - 1.0f/ 39.0f + 1.0f/ 41.0f - 1.0f/ 43.0f + 1.0f/ 45.0f - 1.0f/ 47.0f + 1.0f/ 49.0f - 1.0f/ 51.0f + 1.0f/ 53.0f - 1.0f/ 55.0f + 1.0f/ 57.0f - 1.0f/ 59.0f + 1.0f/ 61.0f - 1.0f/ 63.0f + 1.0f/ 65.0f - 1.0f/ 67.0f + 1.0f/ 69.0f - 1.0f/ 71.0f + 1.0f/ 73.0f - 1.0f/ 75.0f + 1.0f/ 77.0f - 1.0f/ 79.0f + 1.0f/ 81.0f - 1.0f/ 83.0f + 1.0f/ 85.0f - 1.0f/ 87.0f + 1.0f/ 89.0f - 1.0f/ 91.0f + 1.0f/ 93.0f - 1.0f/ 95.0f + 1.0f/ 97.0f - 1.0f/ 99.0f + 1.0f/101.0f - 1.0f/103.0f + 1.0f/105.0f - 1.0f/107.0f + 1.0f/109.0f - 1.0f/111.0f + 1.0f/113.0f - 1.0f/115.0f + 1.0f/117.0f - 1.0f/119.0f + 1.0f/121.0f - 1.0f/123.0f + 1.0f/125.0f - 1.0f/127.0f + 1.0f/129.0f - 1.0f/131.0f + 1.0f/133.0f - 1.0f/135.0f + 1.0f/137.0f - 1.0f/139.0f + 1.0f/141.0f - 1.0f/143.0f + 1.0f/145.0f - 1.0f/147.0f + 1.0f/149.0f - 1.0f/151.0f + 1.0f/153.0f - 1.0f/155.0f + 1.0f/157.0f - 1.0f/159.0f + 1.0f/161.0f - 1.0f/163.0f + 1.0f/165.0f - 1.0f/167.0f + 1.0f/169.0f - 1.0f/171.0f + 1.0f/173.0f - 1.0f/175.0f + 1.0f/177.0f - 1.0f/179.0f + 1.0f/181.0f - 1.0f/183.0f + 1.0f/185.0f - 1.0f/187.0f + 1.0f/189.0f - 1.0f/191.0f + 1.0f/193.0f - 1.0f/195.0f + 1.0f/197.0f - 1.0f/199.0f + 1.0f/201.0f - 1.0f/203.0f + 1.0f/205.0f - 1.0f/207.0f + 1.0f/209.0f - 1.0f/211.0f + 1.0f/213.0f - 1.0f/215.0f + 1.0f/217.0f - 1.0f/219.0f + 1.0f/221.0f - 1.0f/223.0f + 1.0f/225.0f - 1.0f/227.0f + 1.0f/229.0f - 1.0f/231.0f + 1.0f/233.0f - 1.0f/235.0f + 1.0f/237.0f - 1.0f/239.0f + 1.0f/241.0f - 1.0f/243.0f + 1.0f/245.0f - 1.0f/247.0f + 1.0f/249.0f - 1.0f/251.0f + 1.0f/253.0f ) * 4.0f; . RTdouble dPI = ( 1.0L - 1.0L/ 3.0L + 1.0L/ 5.0L - 1.0L/ 7.0L + 1.0L/ 9.0L - 1.0L/ 11.0L + 1.0L/ 13.0L - 1.0L/ 15.0L + 1.0L/ 17.0L - 1.0L/ 19.0L + 1.0L/ 21.0L - 1.0L/ 23.0L + 1.0L/ 25.0L - 1.0L/ 27.0L + 1.0L/ 29.0L - 1.0L/ 31.0L + 1.0L/ 33.0L - 1.0L/ 35.0L + 1.0L/ 37.0L - 1.0L/ 39.0f + 1.0L/ 41.0L - 1.0L/ 43.0L + 1.0L/ 45.0L - 1.0L/ 47.0L + 1.0L/ 49.0L - 1.0L/ 51.0L + 1.0L/ 53.0L - 1.0L/ 55.0L + 1.0L/ 57.0L - 1.0L/ 59.0L + 1.0L/ 61.0L - 1.0L/ 63.0L + 1.0L/ 65.0L - 1.0L/ 67.0L + 1.0L/ 69.0L - 1.0L/ 71.0L + 1.0L/ 73.0L - 1.0L/ 75.0L + 1.0L/ 77.0L - 1.0L/ 79.0L + 1.0L/ 81.0L - 1.0L/ 83.0L + 1.0L/ 85.0L - 1.0L/ 87.0L + 1.0L/ 89.0L - 1.0L/ 91.0L + 1.0L/ 93.0L - 1.0L/ 95.0L + 1.0L/ 97.0L - 1.0L/ 99.0L + 1.0L/101.0L - 1.0L/103.0L + 1.0L/105.0L - 1.0L/107.0L + 1.0L/109.0L - 1.0L/111.0L + 1.0L/113.0L - 1.0L/115.0L + 1.0L/117.0L - 1.0L/119.0L + 1.0L/121.0L - 1.0L/123.0L + 1.0L/125.0L - 1.0L/127.0L + 1.0L/129.0L - 1.0L/131.0L + 1.0L/133.0L - 1.0L/135.0L + 1.0L/137.0L - 1.0L/139.0L + 1.0L/141.0L - 1.0L/143.0L + 1.0L/145.0L - 1.0L/147.0L + 1.0L/149.0L - 1.0L/151.0L + 1.0L/153.0L - 1.0L/155.0L + 1.0L/157.0L - 1.0L/159.0L + 1.0L/161.0L - 1.0L/163.0L + 1.0L/165.0L - 1.0L/167.0L + 1.0L/169.0L - 1.0f/171.0L + 1.0f/173.0L - 1.0f/175.0L + 1.0f/177.0L - 1.0f/179.0L + 1.0f/181.0L - 1.0f/183.0L + 1.0f/185.0L - 1.0f/187.0L + 1.0f/189.0L - 1.0f/191.0L + 1.0f/193.0L - 1.0f/195.0L + 1.0f/197.0L - 1.0f/199.0L + 1.0f/201.0L - 1.0f/203.0L + 1.0f/205.0L - 1.0f/207.0L + 1.0f/209.0L - 1.0f/211.0L + 1.0f/213.0L - 1.0f/215.0L + 1.0f/217.0L - 1.0f/219.0L + 1.0f/221.0L - 1.0f/223.0L + 1.0f/225.0L - 1.0f/227.0L + 1.0f/229.0L - 1.0f/231.0L + 1.0f/233.0L - 1.0f/235.0L + 1.0f/237.0L - 1.0f/239.0L + 1.0f/241.0L - 1.0f/243.0L + 1.0f/245.0L - 1.0f/247.0L + 1.0f/249.0L - 1.0f/251.0L + 1.0f/253.0L ) * 4.0L; . CrtPrintf( RTU("PI = %.32f ( Single-precision accuracy / Using Series )\n"), fPI ); CrtPrintf( RTU("PI = %.32f ( Double-precision accuracy / Using Series )\n"), dPI ); ... Of course, accuracy is very limited by constraints of IEEE 754 Standard.
0 Kudos
SergeyKostrov
Valued Contributor II
2,054 Views
>>...Of course, accuracy is very limited by constraints of IEEE 754 Standard... . Or something else... . ... PI = 3.14946579933166500000000000000000 ( Single-precision accuracy / Using Series ) PI = 3.14946654729979380000000000000000 ( Double-precision accuracy / Using Series ) ... Just to two digits! . Here is PI value up to 31 digits ( from Windows Calculator ): . 3.1415926535897932384626433832795
0 Kudos
Bernard
Valued Contributor I
2,054 Views
Hi Sergey! Ok I agree with you.Someone who is interested will look through all our posts and find the relevant info. Regarding the title Yes please change it.I think that proper word is optimization. Soon i will post Pi calculation tests I want to test many Pi-approximations formulas , please follow this link http://en.wikipedia.org/wiki/Numerical_approximations_of_%CF%80 Now I'm working on my new project BigDecimal Complex numbers class.I have a BigComplex Constructor which receives two member fields as a parameter BigDecimal BigRe and BigDecimal BigIm.My problem is that I cannot display properly the numerical values of BigComplex constructor when its members have values larger than maximal double-precision values ie. Double.Max_Value.
0 Kudos
SergeyKostrov
Valued Contributor II
2,048 Views
Hi Iliya, . >>>>...Does your summary project is still relevant? >>. >>I just completed a quick review and the most important piece of information that I wanted to use is removed and this is "A Post Number". . If somebody will be interested in getting more information from our posts he/she will get it. I was thinking about a "Summary" for about one week and I would say let's save our time and forget it. However, I would do just one modification for a subject of the thread. What do you think about a change: . from 'Optimalization of sine function's taylor expansion' to 'Optimization of sine function's Taylor expansion'? . PS: Did you want to use a word 'Optimal' instead of 'Optimization'?
0 Kudos
Reply