Solved: I cannot say anything because - Page 12

Anonymous · ‎09-02-2014

Hello there,

OK, here we go. I have a dream: to make a 3D engine written 100% in Intel assembly, running on the CPU only. For now I only use rotation matrices.

It works, of course, but it's slow when I draw a lot of pixels.

Recently I decided to add voxels to my engine, and it gets slow when I draw >= 8000 voxels (a 20 * 20 * 20 cube). When I saw that NVIDIA can display 32M voxels (fire), I wondered how they do it!

I have a rough idea of the reason: MMU, paging, segmentation, memory.

Am I right?

Another question: is the FPU slower than SSE at floating-point computation, or does it depend on the data being manipulated?

PS: I work without an OS like Windows or Linux. I run on my own kernel + bootloader, also written in assembly with NASM.

Sorry if I don't write good English, I'm French and I use Google Translate ^-^

Bradley_W_Intel · ‎09-02-2014

You clearly are using the processor in a very advanced way. I will do my best to answer your questions:

1) Why is your voxel engine not able to efficiently render as many voxels as you'd like? Voxel engines need to maximize their use of parallelism (both threading and SIMD) and also to store the data efficiently in an octree or some other structure that can handle sparse data. If you are doing all these things and still not getting the performance you expect, it's an optimization problem. Some Intel tools like VTune Performance Analyzer are excellent for performance analysis.

2) Is single data floating point math faster than SIMD (if I understood you)? Typically SIMD will be faster than single data instructions if your data is laid out in a way that supports the SIMD calls. In all cases, the only way for you to know for certain which way is faster is to test it.

3) How can you select between discrete and processor graphics? DirectX has methods of enumerating adapters. In such a case, the processor graphics is listed separately from the discrete graphics. If you are choosing your adapter based on the amount of available memory, you may be favoring the processor graphics when you didn't intend to. Intel has sample code that shows how to properly detect adapters in DirectX at https://software.intel.com/en-us/vcsource/samples/gpu-detect. The process for OpenGL is not well documented.

4) Can I use one processor to control execution of a second processor? Probably not. The details on Intel processors are covered at http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html. It's possible, though unlikely, that you'll be able to find something in there that can help you.
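
As a rough illustration of what "laid out in a way that supports the SIMD calls" in point 2 can mean, here is a minimal C sketch of a structure-of-arrays layout. The names `VoxelsSoA` and `squared_distances` are hypothetical and are not taken from the engine discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure-of-arrays layout: each coordinate is stored
   contiguously, so one SIMD load can fetch several x values at once. */
typedef struct {
    const float *x;
    const float *y;
    const float *z;
    size_t count;
} VoxelsSoA;

/* Squared distance from the point (cx, cy, cz) to every voxel.
   Each iteration is independent of the others, which makes the loop
   a good candidate for SIMD, by hand or by the compiler. */
static void squared_distances(const VoxelsSoA *v,
                              float cx, float cy, float cz,
                              float *out)
{
    assert(v != NULL && out != NULL);
    for (size_t i = 0; i < v->count; ++i) {
        float dx = v->x[i] - cx;
        float dy = v->y[i] - cy;
        float dz = v->z[i] - cz;
        out[i] = dx * dx + dy * dy + dz * dz;
    }
}
```

Whether this layout actually beats an x, y, z, w layout for a given engine still has to be measured, as point 2 says.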


Bernard · ‎02-23-2015

A good intro to quaternions and rotation:

http://en.wikipedia.org/wiki/Quaternions_and_spatial_rotation

Anonymous · ‎02-23-2015

Hello,

Sorry but ... I hate the matrix system :p

Otherwise, I'm close to having a camera. The only problem is that when I move it, my object moves along the Z axis and not along the camera axis. Here's the update: http://forum.nasm.us/index.php?topic=2041.0.

So, here is how I built this camera. First I learned that we can create it without using a perspective projection matrix:

X' = (X * focale) / (Z + distance)
Y' = (Y * focale) / (Z + distance)

Where focale = distance.

Honestly, I don't know what "focale" means, and I don't know if that is exactly the correct name for it.

I chose focale = distance = 1000.
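
A minimal C sketch of this projection, assuming the divisor is meant to be the depth (Z + distance), which is the usual form of this formula. The types `Vec3` and `Vec2` and the function `project` are hypothetical names used only for illustration.

```c
#include <assert.h>

typedef struct { float x, y, z; } Vec3;
typedef struct { float x, y; } Vec2;

/* Project a camera-space point onto the screen plane.
   focale and distance are the two constants from the post; when they
   are equal, a point at z == 0 keeps its x and y unchanged. */
static Vec2 project(Vec3 p, float focale, float distance)
{
    float depth = p.z + distance;
    assert(depth > 0.0f);   /* the point must be in front of the eye */
    Vec2 s = { p.x * focale / depth, p.y * focale / depth };
    return s;
}
```

With focale = distance = 1000, a point at z = 1000 is drawn at half its x and y, so objects shrink as they move away.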

Then we can build our camera:

   Coord = Coord + Position
   Coord = Rotate(Coord, Angle)
   Coord = Coord + Repere
   Show(Coord)

I set Position.z to a positive number, which is the camera-to-object distance.
I set Repere.z to a negative number equal to the distance of the perspective projection.

That's all. Of course, I'm a little stupid not to split those variables to manage the camera, because they are needed to manage the object too, but I will do it later when I'm not too lazy :p.

Anyway, I have this problem: I can rotate/look around with my camera through 360°, but forward/backward movement is wrong.

And there is another problem: when I do a 360° turn with my camera, my object is inverted lol.

And about quaternions: everybody says they are a wonderful tool for rotations, but we always need the sin and cos functions, right? So I don't get what is really faster. Is it true that it's just an unrolled 3D rotation matrix, because it takes fewer operations to do the same thing than a matrix rotation, like I did with:

      ;;                      X-axe                             ;;
      ;;   1y' = (1y * cos(phi_x)) - (1z * sin(phi_x))          ;;
      ;;   1z' = (1z * cos(phi_x)) + (1y * sin(phi_x))          ;;

      ;;                      Y-axe                             ;;
      ;; End.1z  = 1z' = (1z * cos(phi_y)) - (1x * sin(phi_y))  ;;
      ;;     1x' = (1x * cos(phi_y)) + (1z * sin(phi_y))        ;;
 
      ;;                      Z-axe                             ;;
      ;; End.1x  = 1x' = (1x * cos(phi_z)) - (1y * sin(phi_z))  ;;
      ;; End.1y  = 1y' = (1y * cos(phi_z)) + (1x * sin(phi_z))  ;;

Thanks.

Anonymous · ‎02-23-2015

Well, I don't see how quaternions can make these equations any easier than they already are:

      ;;                      X-axe                             ;;
      ;;           1y' = (1y * cos(phi_x)) - (1z * sin(phi_x))  ;;
      ;;           1z' = (1z * cos(phi_x)) + (1y * sin(phi_x))  ;;

      ;;                      Y-axe                             ;;
      ;; End.1z  = 1z' = (1z * cos(phi_y)) - (1x * sin(phi_y))  ;;
      ;;           1x' = (1x * cos(phi_y)) + (1z * sin(phi_y))  ;;
 
      ;;                      Z-axe                             ;;
      ;; End.1x  = 1x' = (1x * cos(phi_z)) - (1y * sin(phi_z))  ;;
      ;; End.1y  = 1y' = (1y * cos(phi_z)) + (1x * sin(phi_z))  ;;

Bernard · ‎02-23-2015

You cannot get rid of matrices in 3D. Regarding the sin and cos function calls, they can take a few dozen cycles when compiled by ICC.

Anonymous · ‎02-24-2015

My goal is precisely to remove the matrix product. For the 3D rotation matrix, a simple unroll and it's done. Other things, like changing the size, the position, or the repere, I can do without a matrix, like this:

        ;===============================================================================.
         ; void     scale_object ()                                                     |
         ; Purpose : Scale the object by multiplying x, y, z by [temp]                  |
         ; Input   : rsi - [temp]                                                       |
         ; Output  : [rsi]                                                              |
         ; Destroy : rcx - rsi - r8 - ymm0 - ymm1                                       |
         ; Data    : None                                                               |
        ;===============================================================================.
         scale_object:
            ; Point rcx to the end of the object
            ; (32-bit add: the upper half of rcx is cleared, so this
            ;  assumes the object lives below 4 GiB)
                mov     rcx, rsi
                lea     r8, [rsi + OBJ_3D_SIZE_PROP]
                add     ecx, [r8 + OBJ_3D_PROP_SIZE]

            ; Build the multiplier (s, s, s, 1.0) in both halves of ymm0,
            ; so the 4th float of each group is left unchanged
                    vbroadcastss    xmm0, [temp]
                    vmovups         [temp +  0], xmm0
                    mov             [temp + 12], f32(1.0)
                vbroadcastf128  ymm0, [temp]

            ; Scale 8 floats (two groups of 4) per iteration
            .loop:
                    vmulps      ymm1, ymm0, [rsi]
                    vmovups     [rsi], ymm1     ; float store (was vmovdqu)
             add    rsi, 32
             cmp    rsi, rcx
            jb      .loop
         ret
        ;===============================================================================.
        ; / scale_object                                                                |
        ;===============================================================================.

		;=================================================================================================.
		; move_position																				      |
		;=================================================================================================.
		 .move_position:
				vaddps		ymm8, ymm6, [rsi]
				vmovups		[coord_x4], ymm8
				vaddps		ymm8, ymm6, [rsi + _3x]
				vmovups		[coord_x4 + _3x], ymm8
		 .end_move_position:
		;=================================================================================================.
		; / end_move_position																	          |
		;=================================================================================================.

		;=================================================================================================.
		; move_repere																				      |
		;=================================================================================================.
		 .move_repere:
				vaddps		ymm8, ymm7, [rdx]
				vmovups		[rdx], ymm8
				vaddps		ymm8, ymm7, [rdx + _3x]
				vmovups		[rdx + _3x], ymm8
		 .end_move_repere:
		;=================================================================================================.
		; / move_repere																				  |
		;=================================================================================================.
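
The same idea written as a minimal C sketch: scaling and moving are plain component-wise operations and need no matrix product. The `Vertex` layout (x, y, z plus a fourth float used as padding) is an assumption inferred from the 1.0 stored at [temp + 12], and `move_object` is a hypothetical name standing in for both the move_position and move_repere blocks.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout: x, y, z plus one padding float, 16 bytes per vertex. */
typedef struct { float x, y, z, w; } Vertex;

/* Multiply x, y, z of every vertex by s. The 4th float is not touched,
   which is what multiplying it by 1.0 achieves in the assembly version. */
static void scale_object(Vertex *v, size_t count, float s)
{
    assert(v != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        v[i].x *= s;
        v[i].y *= s;
        v[i].z *= s;
    }
}

/* Add the same offset to every vertex (a translation). */
static void move_object(Vertex *v, size_t count,
                        float dx, float dy, float dz)
{
    assert(v != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        v[i].x += dx;
        v[i].y += dy;
        v[i].z += dz;
    }
}
```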

PS: About editing comments on this site with IE: I have seen that IE handles whitespace badly, so I have to go through Firefox :/

Do you have any idea how to fix it?

Bernard · ‎02-24-2015

For the world-to-camera transformation, at least the translation can be done without a matrix: simply subtract the camera coordinates from each object vertex coordinate, component-wise. The rotation part will be done with 3 matrices.

Bernard · ‎02-24-2015

>>>And about IE.edit comment of this site, i seen IE have a bad manage of space>>>

Sorry, but I was not able to understand your question.

Bernard · ‎02-24-2015

I think that the next book to read is Dave Eberly's "3D Game Engine Design".

Anonymous · ‎02-24-2015

Hmm, OK, I will look at that later. About Internet Explorer: the problem is that when I edit a comment, IE doesn't handle whitespace like Firefox does when I paste source code into the "add code" tool.

Ex for IE:

  ;============================================================================================================.
   ; void  func(...)                             |
   ; Purpose : None                                 |
   ; Input   : None                                 |
   ; Output  : None                                 |
   ; Destoy  : None                                 |
   ; Data    :                                     |
     [section .data use64]                               ;|
       ; None                                       ;|
     [section .code use64]                               ;|
  ;============================================================================================================.
   func:                                                           ;|
   ; Code                                                       ;|
   ret                                                           ;|
  ;============================================================================================================.
  ; /func                                             |
  ;============================================================================================================.

Ex for firefox:

		;============================================================================================================.
		 ; void		func(...)																				         |
		 ; Purpose : None																				             |
		 ; Input   : None																				             |
		 ; Output  : None																				             |
		 ; Destoy  : None																				             |
		 ; Data    :																				                 |
					[section .data use64]															                ;|
							; None															                        ;|
					[section .code use64]															                ;|
		;============================================================================================================.
		 func:															                                            ;|
			; Code															                                        ;|
		 ret															                                            ;|
		;============================================================================================================.
		; /func																				                         |
		;============================================================================================================.

And I have another question, to help me build my new topic "Factorization of instruction blocks" for NASM.

What do you think about this transformation:

Source:

mov            [instance         ], rcx
mov            [previous_instance], rdx
mov            [cmd_line         ], r8
mov            [cmd_show         ], r9d

Dest:

_mov {[instance], rcx }, {[previous_instance], rdx}, {[cmd_line], r8}, {[cmd_show], r9d}

Is it easier to read? Yes, I would like to use ( ) instead of { }, but NASM doesn't let me do that :p

Bernard · ‎02-24-2015

Regarding the IE formatting problem, I think you need to start a specific thread in the "Catch All" forum. I hope that someone from Intel will respond to it.

I think that the source syntax is more readable than the destination, at least for an assembly programmer. The Dest example reminds me more of the start of a C++ lambda definition.

Bernard · ‎02-24-2015

How was the reading of A. LaMothe's book?

Anonymous · ‎02-25-2015

Hello,

Sorry, I've been a bad student since I left school. I will read it today, I think, and I need to reference all the CPU instructions in one file.

About factorization of instruction blocks and writing multiple instructions on the same line: for the first tool, yes, it's a little too high-level for assembly language, but I will use that system anyway to factorize some blocks. I somewhat agree that it's ugly to read, or is it just lack of habit? I don't know.

As for writing multiple instructions on the same line, hmm, it's like C: one ';' and we can write more C code on the same line. Unfortunately, on the NASM forum I didn't get an answer about that. I hope they will integrate it; it's a little boring to read code only line by line :/

On another subject, do you have any knowledge of operating systems built in assembly code?

Bernard · ‎02-25-2015

I cannot say anything because I do not know NASM syntax. Personally I prefer the old style.

Bernard · ‎02-25-2015

A. LaMothe's book is very good if you are also interested in software renderers, even for embedded systems.

Bernard · ‎03-08-2015

I am answering you here because I cannot reply directly by private message.

Regarding those formulas, they are Taylor approximations of the sine and cosine functions. Do not implement them directly, because of the division involved in the calculation. Use Horner's scheme instead.

https://software.intel.com/pt-br/forums/topic/278083
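
A minimal C sketch of what Horner's scheme looks like for sine. The formulas being discussed were exchanged privately and are not shown in this thread, so this assumes the Taylor series of sine up to the x^9 term and an input already reduced to roughly [-pi/2, pi/2]. The function name `sin_horner` is hypothetical. The reciprocal factorials are stored as constants, so no division happens at run time.

```c
#include <assert.h>
#include <math.h>

/* sin(x) ~ x - x^3/3! + x^5/5! - x^7/7! + x^9/9!
   rewritten as x * (1 + x2*(c3 + x2*(c5 + x2*(c7 + x2*c9)))), x2 = x*x. */
static float sin_horner(float x)
{
    const float c3 = -1.6666667e-1f;   /* -1/3! */
    const float c5 =  8.3333333e-3f;   /*  1/5! */
    const float c7 = -1.9841270e-4f;   /* -1/7! */
    const float c9 =  2.7557319e-6f;   /*  1/9! */

    assert(x >= -1.6f && x <= 1.6f);   /* range reduction is the caller's job */

    float x2 = x * x;
    float p = c9;
    p = c7 + x2 * p;
    p = c5 + x2 * p;
    p = c3 + x2 * p;
    return x * (1.0f + x2 * p);
}
```

Each step is one multiply and one add, which maps directly onto a fused multiply-add instruction where the CPU has one.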

Anonymous · ‎03-10-2015

OK, thanks. I have a little problem with my camera's configuration: when we move the camera, what else do we need to move?

I had a big surprise with this configuration!

    if  i64 [word_param_msg] is SHIFT_WHEELUP
        mov  [cam_repere   + _1z], f32(20.0)
        mov  [obj_repere   + _1z], f32(20.0)
        mov  [obj_position + _1z], f32(20.0)
    endif

I'm able to move the camera's rotation repere along the Z axis. I uploaded the program.

Usage:

Key   binding:  alt + ²                = Exit program.
Mouse binding : MouseMove + Left click = Make a rotation of active object through mouse movement: 
                                          - MouseRight/Left(x) = RotY.
                                          - MouseUp/Down(y)    = RotX. 
                Ctrl + MouseWheel      = Wheel up/camera forward . Wheel down/camera backward

Bernard · ‎03-11-2015

You are rotating the camera coordinates when moving it around.

Bernard · ‎03-13-2015

Which NASM syntax are you using? To me, "f32(20.0)" looks like a function call.

Anonymous · ‎03-14-2015

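It's just a redefinition of a NASM keyword (dword __float32__(nbr)):

  ; redefine some keyword of NASM
  ; (no space is allowed between the macro name and its parameter list)
      %define  f8(nbr)      byte  __float8__(nbr)
      %define  f16(nbr)     word  __float16__(nbr)
      %define  f32(nbr)     dword __float32__(nbr)
      %define  f64(nbr)     qword __float64__(nbr)
 
  ; __float80m__ / __float80e__ give the 64-bit mantissa and 16-bit exponent,
  ; __float128l__ / __float128h__ give the low and high 64-bit halves
      %define  f80m(nbr)    qword __float80m__(nbr)
      %define  f80e(nbr)    word  __float80e__(nbr)
      %define  f128l(nbr)   qword __float128l__(nbr)
      %define  f128h(nbr)   qword __float128h__(nbr)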

      %define  i8    byte
      %define  i16   word
      %define  i32   dword
      %define  i64   qword

(http://forum.nasm.us/index.php?topic=2076.0)

Bernard · ‎03-14-2015

Thanks for the explanation.

>>>Ok thanks, I have a little problem with my camera's configuration, when we move camera, what we need to move too ?>>>

I still cannot understand your question.

Anonymous · ‎03-14-2015

Hello, no problem.

Hmm, OK (it's hard to be French ^^). I mean that I don't understand the part of the book (Andre LaMothe) about the world-to-camera transformation. I have so many variables that I don't know how to manage them -_-

		world_position:		dd	0, 0, 0, 0		; x, y, z
		
		objcam_prop:
			obj_position: 	dd	0, 0, 0, 0		; x, y, z
			obj_angle: 		dd	0, 0, 0, 0		; x, y, z
			obj_repere:		dd	0, 0, 0, 0		; x, y, z
							dd	0, 0, 0, 0
			cam_position: 	dd	0, 0, 0, 0		; x, y, z
			cam_angle: 		dd	0, 0, 0, 0		; x, y, z
			cam_repere:		dd	0, 0, 0, 0		; x, y, z
							dd	0, 0, 0, 0

Do you know the English word for "repere" (French)? It means the center of rotation of the object.

I'm not sure about the variable names for repere and position.