[cpp]
// temp1 = a*x + b*y + c*z + d   (X' row)
ippsMulC_32f(dataset.xpoints, a, temp1, n);
ippsAddProductC_32f(dataset.ypoints, b, temp1, n);
ippsAddProductC_32f(dataset.zpoints, c, temp1, n);
ippsAddC_32f_I(d, temp1, n);

// temp2 = e*x + f*y + g*z + h   (Y' row)
ippsMulC_32f(dataset.xpoints, e, temp2, n);
ippsAddProductC_32f(dataset.ypoints, f, temp2, n);
ippsAddProductC_32f(dataset.zpoints, g, temp2, n);
ippsAddC_32f_I(h, temp2, n);

// temp3 = 1 / (i*x + j*y + k*z + l)   (reciprocal of W)
ippsMulC_32f(dataset.xpoints, i, temp3, n);
ippsAddProductC_32f(dataset.ypoints, j, temp3, n);
ippsAddProductC_32f(dataset.zpoints, k, temp3, n);
ippsAddC_32f_I(l, temp3, n);
ippsDivCRev_32f_I(1.0f, temp3, n);

// Per-point viewport mapping (loop bound and the xtrans name are assumed;
// that part of the original line was garbled)
for (int p = 0; p < n; p++) {
    xtrans[p] = ca + ca*temp1[p]*vd*temp3[p];
    ytrans[p] = cb - cb*temp2[p]*vda*temp3[p];
}
[/cpp]
X' = t11*X + t12*Y + t13*Z + t14
Y' = t21*X + t22*Y + t23*Z + t24
Z' = t31*X + t32*Y + t33*Z + t34
W = t41*X + t42*Y + t43*Z + t44
X' = X'/W
Y' = Y'/W
Z' = Z'/W
But I need
X' = t11*X + t12*Y + t13*Z + t14
Y' = t21*X + t22*Y + t23*Z + t24
W = t41*X + t42*Y + t43*Z + t44,
and then the following vector transformation:
X' = X'/W
Y' = Y'/W
So I guess my question is this: are there any internal optimisations that will flag t31, t32, t33 and t34 as zero and not include them in the calculation?
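The reduced transform described above (X', Y' and W only, then the divide) can be sketched as a single fused loop. This is plain C++ rather than an IPP call: `ProjRows` and `project_xy` are hypothetical names introduced only for illustration, and whether a compiler vectorises the loop well depends on the compiler and flags, so treat it as a baseline to measure against rather than a drop-in replacement.

```cpp
#include <cstddef>

// Hypothetical helper type, not part of IPP: the three matrix rows that are
// actually needed. Names follow the post: (a,b,c,d) -> X', (e,f,g,h) -> Y',
// (i,j,k,l) -> W. The Z' row is never stored or computed.
struct ProjRows {
    float a, b, c, d;
    float e, f, g, h;
    float i, j, k, l;
};

// Projects n points in one pass with no full-length temporaries.
void project_xy(const float* x, const float* y, const float* z,
                float* xout, float* yout, std::size_t n, const ProjRows& m)
{
    for (std::size_t p = 0; p < n; ++p) {
        const float xp = m.a * x[p] + m.b * y[p] + m.c * z[p] + m.d;
        const float yp = m.e * x[p] + m.f * y[p] + m.g * z[p] + m.h;
        const float w  = m.i * x[p] + m.j * y[p] + m.k * z[p] + m.l;
        const float inv_w = 1.0f / w;  // one divide, then two multiplies
        xout[p] = xp * inv_w;
        yout[p] = yp * inv_w;
    }
}
```

The viewport scaling from the final loop in the IPP version could be folded into the same pass, so each input point is read once and each output written once.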
It would be really, really nice to sort this last bit out so I can get balanced parallel (multi socket/core) and full vectorisation.
I am currently getting 2 billion points rendering at 20 frames per second at 9600x4800 resolution (20x1920x1200). The interaction is just a fraction too slow for fluid manipulation. By the end of the year the resolution is expected to increase a further 6x, so milking every bit of performance is essential.
Pointers from the experts?