Developing Games on Intel Graphics

3D engine

Anonymous

Hello there,

OK, here we go: I have a dream of making a 3D engine written 100% in Intel assembly, running on the CPU only. For now I only use rotation matrices.

It works, of course, but it gets slow when I draw a lot of pixels.

Recently I decided to add voxels to my engine, and it becomes slow at >= 8000 voxels (a 20 * 20 * 20 cube). When I saw that NVIDIA can display 32M voxels (fire), I wondered how they do it!

I have a rough idea of the reason: MMU, paging, segmentation, memory.

Am I right?

Another question: is the FPU slower than SSE at floating-point computation, or does it depend on the data being manipulated?

PS: I work without an OS like Windows or Linux; I run on my own kernel and bootloader, also written in assembly with NASM.

Sorry if my English is not good; I'm French and use Google Translate ^-^

Bradley_W_Intel
Employee

You clearly are using the processor in a very advanced way. I will do my best to answer your questions:

1) Why is your voxel engine not able to efficiently render as many voxels as you'd like? Voxel engines need to maximize their use of parallelism (both threading and SIMD) and also to store the data efficiently in an octree or some other structure that can handle sparse data. If you are doing all these things and still not getting the performance you expect, it's an optimization problem. Some Intel tools like VTune Performance Analyzer are excellent for performance analysis.

2) Is single data floating point math faster than SIMD (if I understood you)? Typically SIMD will be faster than single data instructions if your data is laid out in a way that supports the SIMD calls. In all cases, the only way for you to know for certain which way is faster is to test it.

3) How can you select between discrete and processor graphics? DirectX has methods of enumerating adapters. In such a case, the processor graphics is listed separately from the discrete graphics. If you are choosing your adapter based on the amount of available memory, you may be favoring the processor graphics when you didn't intend to. Intel has sample code that shows how to properly detect adapters in DirectX at https://software.intel.com/en-us/vcsource/samples/gpu-detect. The process for OpenGL is not well documented.

4) Can I use one processor to control execution of a second processor? Probably not. The details on Intel processors are covered at http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html. It's possible, though unlikely, that you'll be able to find something in there that can help you.
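
A minimal sketch of the sparse-storage idea from point 1, written in C rather than the thread's NASM so it can be compiled and checked anywhere. It is an illustration under assumptions, not Intel's or NVIDIA's method: `OctNode`, `octree_new`, `octree_set`, `octree_get`, `octree_node_count` and `octree_free` are hypothetical names, and the cube edge is assumed to be a power of two. Only octants that actually contain a voxel are allocated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sparse octree: a NULL child means "this octant is empty". */
typedef struct OctNode {
    struct OctNode *child[8];
    int filled;                 /* meaningful only at leaf level (edge 1) */
} OctNode;

OctNode *octree_new(void) {
    /* calloc zero-fills; this sketch assumes zeroed pointers read as NULL. */
    return calloc(1, sizeof(OctNode));
}

/* Set the voxel at (x, y, z) inside a cube of edge `size` (a power of two).
   Returns 1 on success, 0 if an allocation failed. */
int octree_set(OctNode *root, unsigned size, unsigned x, unsigned y, unsigned z) {
    assert(root != NULL && x < size && y < size && z < size);
    OctNode *n = root;
    while (size > 1) {
        unsigned half = size / 2;
        /* One bit per axis selects the octant. */
        int i = (x >= half) | ((y >= half) << 1) | ((z >= half) << 2);
        if (n->child[i] == NULL) {
            n->child[i] = octree_new();
            if (n->child[i] == NULL) return 0;
        }
        /* Make the coordinates relative to the chosen octant. */
        if (x >= half) x -= half;
        if (y >= half) y -= half;
        if (z >= half) z -= half;
        n = n->child[i];
        size = half;
    }
    n->filled = 1;
    return 1;
}

/* Returns 1 if the voxel is set, 0 otherwise. Empty octants end the walk early. */
int octree_get(const OctNode *root, unsigned size, unsigned x, unsigned y, unsigned z) {
    const OctNode *n = root;
    while (n != NULL && size > 1) {
        unsigned half = size / 2;
        int i = (x >= half) | ((y >= half) << 1) | ((z >= half) << 2);
        if (x >= half) x -= half;
        if (y >= half) y -= half;
        if (z >= half) z -= half;
        n = n->child[i];
        size = half;
    }
    return n != NULL && n->filled;
}

/* Number of allocated nodes, a rough measure of memory use. */
size_t octree_node_count(const OctNode *n) {
    if (n == NULL) return 0;
    size_t total = 1;
    for (int i = 0; i < 8; ++i) total += octree_node_count(n->child[i]);
    return total;
}

void octree_free(OctNode *n) {
    if (n == NULL) return;
    for (int i = 0; i < 8; ++i) octree_free(n->child[i]);
    free(n);
}
```

A 20 * 20 * 20 cube like the one in the question would fit in an octree of edge 32. With a single voxel set, only the root plus one node per level is allocated.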

 

Bernard
Valued Contributor I

A good introduction to quaternions and rotation:

http://en.wikipedia.org/wiki/Quaternions_and_spatial_rotation

Anonymous

Hello,

Sorry, but... I hate the matrix system :p

Otherwise, I'm close to having a working camera. The only problem is that when I want to move it, my object moves along the Z axis and not along the camera axis. Here's the update: http://forum.nasm.us/index.php?topic=2041.0.

So, how I built this camera: first I learned that we can do it without using a perspective projection matrix:

X = (X*focal) / (Z+distance)
Y = (Y*focal) / (Z+distance)

where focal = distance.

Honestly, I don't know what "focal" means, or whether it is exactly the correct name for this.

I chose focal = distance = 1000.
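
A minimal sketch of this perspective divide with focal = distance = 1000, written in C so it can be checked outside the engine. It assumes the usual convention in which the divisor is built from the point's depth Z. `Vec3`, `Point2` and `project` are hypothetical names, not part of the engine posted in the thread.

```c
#include <assert.h>

typedef struct { float x, y, z; } Vec3;
typedef struct { float x, y; } Point2;

/* Project a 3D point onto the screen plane.
   Points are assumed to lie in front of the eye, i.e. z + distance > 0. */
Point2 project(Vec3 p, float focal, float distance) {
    float denom = p.z + distance;
    assert(denom > 0.0f);
    float scale = focal / denom;   /* one division shared by X and Y */
    Point2 out = { p.x * scale, p.y * scale };
    return out;
}
```

With focal equal to distance, a point at depth 0 keeps its X and Y, and a point at depth 1000 is drawn at half size.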

Then we can build our camera:

   Coord = Coord + Position
   Coord = Rotate(Coord, Angle)
   Coord = Coord + Repere
   Show(Coord)

I set Position.z to a positive number, which is the camera-to-object distance.
I set Repere.z to a negative number equal to the distance of the perspective projection.

That's all. Of course, it's a bit stupid of me not to split those variables to manage the camera, because they are needed to manage the object too, but I will do it later when I'm less lazy :p.

Anyway, I have this problem: I can rotate and look around with my camera through 360°, but forward/backward movement is wrong.
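
One common cause of this symptom is adding the step to the world Z coordinate instead of moving along the direction the camera is facing. A hedged sketch in C, assuming a camera that only yaws about the Y axis, so that its forward direction is (sin yaw, 0, cos yaw). `Camera`, `camera_forward` and `camera_move_forward` are hypothetical names, and the caller supplies the sine and cosine of the yaw angle.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y, z; } Vec3;

typedef struct {
    Vec3  position;
    float sin_yaw, cos_yaw;   /* precomputed by the caller */
} Camera;

/* Direction the camera looks at, in world space (yaw-only camera). */
Vec3 camera_forward(const Camera *cam) {
    assert(cam != NULL);
    Vec3 f = { cam->sin_yaw, 0.0f, cam->cos_yaw };
    return f;
}

/* Move along the camera's own axis rather than along world Z. */
void camera_move_forward(Camera *cam, float step) {
    Vec3 f = camera_forward(cam);
    cam->position.x += f.x * step;
    cam->position.y += f.y * step;
    cam->position.z += f.z * step;
}
```

With a yaw of 0 the step goes along +Z as before; with a yaw of 90° the same call moves the camera along +X.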

And there is another problem: when I do a 360° turn with my camera, my object is inverted, lol.

About quaternions: everyone says they are a wonderful tool for rotation, but we still always need the sin and cos functions, right? So I don't get what is really faster. Is it true that it is just an unrolling of the 3D rotation matrix, since it takes fewer operations to do the same thing as a matrix rotation, like I did with:

      ;;                      X-axis                            ;;
      ;;           1y' = (1y * cos(phi_x)) - (1z * sin(phi_x))  ;;
      ;;           1z' = (1z * cos(phi_x)) + (1y * sin(phi_x))  ;;

      ;;                      Y-axis                            ;;
      ;; End.1z  = 1z' = (1z * cos(phi_y)) - (1x * sin(phi_y))  ;;
      ;;           1x' = (1x * cos(phi_y)) + (1z * sin(phi_y))  ;;

      ;;                      Z-axis                            ;;
      ;; End.1x  = 1x' = (1x * cos(phi_z)) - (1y * sin(phi_z))  ;;
      ;; End.1y  = 1y' = (1y * cos(phi_z)) + (1x * sin(phi_z))  ;;
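
For readers who prefer C, the same unrolled X, then Y, then Z rotation can be sketched as below. It is a transcription of the formulas above, under the assumption that each step feeds the next and that the sine and cosine of each angle are computed once per object rather than once per point. `Vec3`, `SinCos` and `rotate_xyz` are hypothetical names.

```c
#include <assert.h>

typedef struct { float x, y, z; } Vec3;
typedef struct { float s, c; } SinCos;   /* sin and cos of one angle */

/* Sanity check: s*s + c*c must be (nearly) 1 for a real sin/cos pair. */
static int is_unit(SinCos a) {
    float n = a.s * a.s + a.c * a.c;
    return n > 0.999f && n < 1.001f;
}

Vec3 rotate_xyz(Vec3 p, SinCos rx, SinCos ry, SinCos rz) {
    assert(is_unit(rx) && is_unit(ry) && is_unit(rz));
    /* X axis: y' = y*cos - z*sin, z' = z*cos + y*sin */
    float y1 = p.y * rx.c - p.z * rx.s;
    float z1 = p.z * rx.c + p.y * rx.s;
    /* Y axis: z'' = z'*cos - x*sin, x' = x*cos + z'*sin */
    float z2 = z1 * ry.c - p.x * ry.s;
    float x1 = p.x * ry.c + z1 * ry.s;
    /* Z axis: x'' = x'*cos - y'*sin, y'' = y'*cos + x'*sin */
    Vec3 out = { x1 * rz.c - y1 * rz.s, y1 * rz.c + x1 * rz.s, z2 };
    return out;
}
```

Each point costs 12 multiplies and 6 adds here, and no trigonometric call.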

 

Thanks.

 

Anonymous

Well, I don't see how quaternions could make those equations any simpler than they already are:

      ;;                      X-axis                            ;;
      ;;           1y' = (1y * cos(phi_x)) - (1z * sin(phi_x))  ;;
      ;;           1z' = (1z * cos(phi_x)) + (1y * sin(phi_x))  ;;

      ;;                      Y-axis                            ;;
      ;; End.1z  = 1z' = (1z * cos(phi_y)) - (1x * sin(phi_y))  ;;
      ;;           1x' = (1x * cos(phi_y)) + (1z * sin(phi_y))  ;;

      ;;                      Z-axis                            ;;
      ;; End.1x  = 1x' = (1x * cos(phi_z)) - (1y * sin(phi_z))  ;;
      ;; End.1y  = 1y' = (1y * cos(phi_z)) + (1x * sin(phi_z))  ;;
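
A hedged sketch of what quaternions change, in C. Building a quaternion from an angle still needs one sin and one cos (of the half angle), but composing two rotations and applying the result to many points needs only multiplies and adds. `Vec3`, `Quat`, `quat_mul` and `quat_rotate` are hypothetical names, and the quaternions are assumed to be unit length.

```c
#include <assert.h>

typedef struct { float x, y, z; } Vec3;
/* w = cos(angle/2), (x, y, z) = rotation axis * sin(angle/2) */
typedef struct { float w, x, y, z; } Quat;

static Vec3 cross(Vec3 a, Vec3 b) {
    Vec3 c = { a.y * b.z - a.z * b.y,
               a.z * b.x - a.x * b.z,
               a.x * b.y - a.y * b.x };
    return c;
}

/* Hamilton product: the rotation b followed by the rotation a. */
Quat quat_mul(Quat a, Quat b) {
    Quat q = {
        a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z,
        a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,
        a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x,
        a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w
    };
    return q;
}

/* Rotate v by the unit quaternion q:
   v' = v + w*t + u x t, with u = (x, y, z) and t = 2 * (u x v). */
Vec3 quat_rotate(Quat q, Vec3 v) {
    float n = q.w * q.w + q.x * q.x + q.y * q.y + q.z * q.z;
    assert(n > 0.999f && n < 1.001f);   /* unit length expected */
    Vec3 u = { q.x, q.y, q.z };
    Vec3 t = cross(u, v);
    t.x *= 2.0f; t.y *= 2.0f; t.z *= 2.0f;
    Vec3 c = cross(u, t);
    Vec3 out = { v.x + q.w * t.x + c.x,
                 v.y + q.w * t.y + c.y,
                 v.z + q.w * t.z + c.z };
    return out;
}
```

The per-point cost is in the same range as the unrolled formulas above. The practical gain is in combining and interpolating rotations, not in rotating a single point.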

 

Bernard
Valued Contributor I

You cannot get rid of matrices in 3D. Regarding sin and cos function calls, they can take a few dozen cycles when compiled by ICC.

Anonymous

My goal is precisely to remove the matrix product. In the case of the 3D rotation matrix, a simple unroll and it's done. Other things, like changing the size, the position, or the repere, I can do without a matrix, like this:

        ;===============================================================================
        ; void scale_object()
        ; Purpose  : Scale the object by multiplying x, y, z by [temp]
        ; Input    : rsi - [temp]
        ; Output   : [rsi]
        ; Destroys : rcx - rsi - r8 - ymm0 - ymm1
        ; Data     : None
        ;===============================================================================
        scale_object:
            ; Point rcx to the end of the object
                mov     rcx, rsi
                lea     r8, [rsi + OBJ_3D_SIZE_PROP]
                add     ecx, [r8 + OBJ_3D_PROP_SIZE]    ; 32-bit add zero-extends into rcx: assumes addresses below 4 GiB

            ; Build (s, s, s, 1.0) in [temp], then copy it to both 128-bit lanes of ymm0
                vbroadcastss    xmm0, [temp]
                vmovups         [temp +  0], xmm0
                mov             [temp + 12], f32(1.0)
                vbroadcastf128  ymm0, [temp]

            ; Scale 32 bytes (two 4-float vertices) per iteration
            .loop:
                vmulps      ymm1, ymm0, [rsi]
                vmovups     [rsi], ymm1
                add     rsi, 32
                cmp     rsi, rcx
                jb      .loop
            ret
        ;===============================================================================
        ; / scale_object
        ;===============================================================================

        ;===============================================================================
        ; move_position
        ;===============================================================================
        .move_position:
                vaddps      ymm8, ymm6, [rsi]           ; ymm6: presumably the position offset
                vmovups     [coord_x4], ymm8
                vaddps      ymm8, ymm6, [rsi + _3x]
                vmovups     [coord_x4 + _3x], ymm8
        .end_move_position:
        ;===============================================================================
        ; / move_position
        ;===============================================================================

        ;===============================================================================
        ; move_repere
        ;===============================================================================
        .move_repere:
                vaddps      ymm8, ymm7, [rdx]           ; ymm7: presumably the repere offset
                vmovups     [rdx], ymm8
                vaddps      ymm8, ymm7, [rdx + _3x]
                vmovups     [rdx + _3x], ymm8
        .end_move_repere:
        ;===============================================================================
        ; / move_repere
        ;===============================================================================
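
The same idea in portable C, as a sketch rather than a translation of the routines above: scaling multiplies x, y and z by a factor and leaves w alone (the assembly gets the same effect by writing 1.0 into the fourth lane), and a move is a component-wise add. `Vec4`, `scale_points` and `translate_points` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y, z, w; } Vec4;

/* Scale every point in place; w is left untouched (multiplied by 1.0). */
void scale_points(Vec4 *pts, size_t count, float s) {
    assert(pts != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        pts[i].x *= s;
        pts[i].y *= s;
        pts[i].z *= s;
    }
}

/* Move every point in place by the x, y, z of `offset`; w is left untouched. */
void translate_points(Vec4 *pts, size_t count, Vec4 offset) {
    assert(pts != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        pts[i].x += offset.x;
        pts[i].y += offset.y;
        pts[i].z += offset.z;
    }
}
```

Neither operation needs a matrix product. Both loops are simple enough that an optimizing compiler may vectorize them, though that depends on the compiler and flags.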

 

PS: About editing comments on this site with IE: I've seen that IE handles whitespace badly, so I have to go through Firefox :/

Do you have any idea how to fix it?

Bernard
Valued Contributor I

The world-to-camera transformation, at least the translation part, can be done without a matrix: simply subtract the camera coordinates from the object vertex coordinates, component-wise. The rotation part will be done with 3 matrices.
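
A sketch of that split in C: the translation is a plain subtraction, and the rotation is applied afterwards. For brevity it assumes a camera that only yaws about the Y axis, so one inverse rotation stands in for the three matrices. `Vec3` and `world_to_camera` are hypothetical names, and the sine and cosine of the yaw are passed in by the caller.

```c
#include <assert.h>

typedef struct { float x, y, z; } Vec3;

/* Convert a world-space point to camera space for a yaw-only camera
   whose forward direction in world space is (sin_yaw, 0, cos_yaw). */
Vec3 world_to_camera(Vec3 p, Vec3 cam_pos, float sin_yaw, float cos_yaw) {
    float n = sin_yaw * sin_yaw + cos_yaw * cos_yaw;
    assert(n > 0.999f && n < 1.001f);   /* must be a real sin/cos pair */

    /* 1. Translation: subtract the camera position, component-wise. */
    float rx = p.x - cam_pos.x;
    float ry = p.y - cam_pos.y;
    float rz = p.z - cam_pos.z;

    /* 2. Rotation: undo the camera yaw (rotate by the opposite angle). */
    Vec3 out = { rx * cos_yaw - rz * sin_yaw,
                 ry,
                 rz * cos_yaw + rx * sin_yaw };
    return out;
}
```

A point that lies straight ahead of the camera ends up on the camera-space Z axis, whatever the yaw is.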

Bernard
Valued Contributor I

>>>About editing comments on this site with IE: I've seen that IE handles whitespace badly>>>

Sorry, but I was not able to understand your question.

Bernard
Valued Contributor I

I think that the next book to read is Dave Eberly's "3D Game Engine Design".

Anonymous

Hmm, OK, I will look at that later. About Internet Explorer: the problem is that when I edit a comment, IE doesn't handle whitespace the way Firefox does when I paste source code into the add-code tool.

Example for IE:

  ;============================================================================================================.
   ; void  func(...)                             |
   ; Purpose : None                                 |
   ; Input   : None                                 |
   ; Output  : None                                 |
   ; Destoy  : None                                 |
   ; Data    :                                     |
     [section .data use64]                               ;|
       ; None                                       ;|
     [section .code use64]                               ;|
  ;============================================================================================================.
   func:                                                           ;|
   ; Code                                                       ;|
   ret                                                           ;|
  ;============================================================================================================.
  ; /func                                             |
  ;============================================================================================================.

Example for Firefox:

		;============================================================================================================.
		 ; void		func(...)																				         |
		 ; Purpose : None																				             |
		 ; Input   : None																				             |
		 ; Output  : None																				             |
		 ; Destoy  : None																				             |
		 ; Data    :																				                 |
					[section .data use64]															                ;|
							; None															                        ;|
					[section .code use64]															                ;|
		;============================================================================================================.
		 func:															                                            ;|
			; Code															                                        ;|
		 ret															                                            ;|
		;============================================================================================================.
		; /func																				                         |
		;============================================================================================================.

 

I also have another question, to help me build my new topic "Factorization of instruction blocks" for NASM.

What do you think about this transformation?

Source:

mov            [instance         ], rcx
mov            [previous_instance], rdx
mov            [cmd_line         ], r8
mov            [cmd_show         ], r9d

Dest:

_mov {[instance], rcx }, {[previous_instance], rdx}, {[cmd_line], r8}, {[cmd_show], r9d}

 

Is it easier to read? Yes, I would like to use ( ) instead of { }, but NASM doesn't let me do that :p

Bernard
Valued Contributor I

Regarding the IE formatting problem, I think you need to start a specific thread in the "Catch All" forum. I hope that someone from Intel will respond to it.

I think that the source syntax is more readable than the destination, at least for an assembly programmer. The Dest example reminds me more of the start of a C++ lambda definition.

Bernard
Valued Contributor I

How was the reading of A. LaMothe's book?

Anonymous

Hello,

Sorry, I've been a bad student since I left school. I will read it today, I think, and I also need to reference all the CPU instructions in one file.

About factorizing instruction blocks and writing multiple instructions on the same line: for the first tool, yes, it's a little too high-level for assembly language, but I will use that system anyway to factorize some blocks. I somewhat agree that it is ugly to read, or is it just a lack of habit? I don't know.

As for writing multiple instructions on the same line, hmm, it's like C: one ';' and we can write more C code on the same line. Unfortunately, I didn't get an answer about that on the NASM forum. I hope they will integrate it; it's a little boring to read code only line by line :/

On another subject, do you have any knowledge of operating systems built in assembly code?

Bernard
Valued Contributor I

I cannot say anything because I do not know NASM syntax. Personally I prefer the old style.

Bernard
Valued Contributor I

A. LaMothe's book is very good if you are also interested in software renderers, even for embedded systems.

Bernard
Valued Contributor I

 

I am answering you here because I cannot reply directly by private message.

Regarding those formulas, they are Taylor approximations of the sine and cosine functions. Do not implement them directly, because of the divisions involved in the calculation. Use Horner's scheme instead.

https://software.intel.com/pt-br/forums/topic/278083
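
A minimal sketch of Horner's scheme for the degree-7 Taylor polynomial of sine, in C. The coefficients 1/3!, 1/5! and 1/7! are constant expressions, so the run-time evaluation uses only multiplies and adds. `sin_taylor7` is a hypothetical name, and the input is assumed to have been reduced to a small range around zero beforehand; the polynomial loses accuracy quickly outside it.

```c
#include <assert.h>

/* sin(x) ~= x - x^3/3! + x^5/5! - x^7/7!, evaluated with Horner's scheme:
   x * (1 + x^2 * (c3 + x^2 * (c5 + x^2 * c7))) */
double sin_taylor7(double x) {
    assert(x >= -1.0 && x <= 1.0);      /* range reduction is the caller's job */
    const double c3 = -1.0 / 6.0;       /* constant expressions, normally */
    const double c5 =  1.0 / 120.0;     /* folded at compile time         */
    const double c7 = -1.0 / 5040.0;
    double x2 = x * x;
    return x * (1.0 + x2 * (c3 + x2 * (c5 + x2 * c7)));
}
```

In assembly, the same nesting maps onto one multiply to form x^2 followed by a chain of multiply-add steps, with the three coefficients stored as constants in the data section.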

Anonymous

OK, thanks. I have a little problem with my camera's configuration: when we move the camera, what else do we need to move?

I had a big surprise with this configuration!

    if  i64 [word_param_msg] is SHIFT_WHEELUP
        mov  [cam_repere   + _1z], f32(20.0)
        mov  [obj_repere   + _1z], f32(20.0)
        mov  [obj_position + _1z], f32(20.0)
    endif

I'm able to move the camera's rotation repere along the Z axis. I have uploaded the program.

Usage:

Key binding   : Alt + ²                = Exit program.
Mouse binding : MouseMove + Left click = Rotate the active object through mouse movement:
                                          - Mouse right/left (x) = RotY.
                                          - Mouse up/down (y)    = RotX.
                Ctrl + MouseWheel      = Wheel up: camera forward. Wheel down: camera backward.

 

Bernard
Valued Contributor I

You are rotating the camera coordinates when moving it around.

Bernard
Valued Contributor I

Which NASM syntax are you using? To me, "f32(20.0)" looks like a function call.

Anonymous

It's just a redefinition of a NASM keyword (dword __float32__(nbr)):

  ; redefine some NASM keywords
      ; note: no space is allowed between the macro name and "(" in %define
      %define  f8(nbr)      byte  __float8__(nbr)
      %define  f16(nbr)     word  __float16__(nbr)
      %define  f32(nbr)     dword __float32__(nbr)
      %define  f64(nbr)     qword __float64__(nbr)

      %define  f80m(nbr)    tword __float80m__(nbr)
      %define  f80e(nbr)    tword __float80e__(nbr)
      %define  f128l(nbr)   oword __float128l__(nbr)
      %define  f128h(nbr)   oword __float128h__(nbr)

      %define  i8    byte
      %define  i16   word
      %define  i32   dword
      %define  i64   qword

(http://forum.nasm.us/index.php?topic=2076.0)

Bernard
Valued Contributor I

Thanks for the explanation.

>>>OK, thanks. I have a little problem with my camera's configuration: when we move the camera, what else do we need to move?>>>

I still cannot understand your question.

Anonymous

Hello, no problem.

Hmm, OK (it's hard being French ^^). I mean that I don't understand the part where the book (Andre LaMothe) talks about the world-to-camera transformation. I have so many variables that I don't know how to manage them -_-

		world_position:		dd	0, 0, 0, 0		; x, y, z
		
		objcam_prop:
			obj_position: 	dd	0, 0, 0, 0		; x, y, z
			obj_angle: 		dd	0, 0, 0, 0		; x, y, z
			obj_repere:		dd	0, 0, 0, 0		; x, y, z
							dd	0, 0, 0, 0
			cam_position: 	dd	0, 0, 0, 0		; x, y, z
			cam_angle: 		dd	0, 0, 0, 0		; x, y, z
			cam_repere:		dd	0, 0, 0, 0		; x, y, z
							dd	0, 0, 0, 0

Do you know the English word for "repere" (French)? It means the center of rotation of an object.

I'm not sure about the variable names for repere and position.
