INT 8 and 16 for precise calculations?

MeeLee
Posts: 1339
Joined: Tue Feb 19, 2019 10:16 pm

Post by MeeLee »

Would it in theory be possible to use INT8 and INT16 shaders (like RT cores and others) to calculate FAH projects with equal precision, either by looping the data 2 to 4x through that shader or by using multiple shaders to perform the duty of a full FP32 (CUDA) core?

And if it is, can it be used to either enable GPUs for certain workloads, or even speed up GPU workloads?
Joe_H
Site Admin
Posts: 7936
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: INT 8 and 16 for precise calculations?

Post by Joe_H »

In theory you can do all kinds of calculations on integer registers in place of floating point. In practice it takes many more cycles than doing the same work on floating point registers, and it adds more levels of complexity to the code and debugging, so it is rarely used these days. It is something that might have been done 30+ years ago, when floating point often was not supported in hardware.
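
For anyone who has not seen integer-only math used this way, here is a minimal sketch of fixed-point arithmetic in Python. The Q16.16 format and the helper names (`to_fixed`, `fixed_mul`, and so on) are assumptions chosen for illustration; nothing here comes from the F@h cores.

```python
# Q16.16 fixed point: 16 integer bits and 16 fractional bits.
# The format is an arbitrary choice for this illustration.
FRAC_BITS = 16
ONE = 1 << FRAC_BITS


def to_fixed(x: float) -> int:
    """Scale a float by 2**16 and round to the nearest integer."""
    return int(round(x * ONE))


def to_float(q: int) -> float:
    """Convert a Q16.16 integer back to a float for display."""
    return q / ONE


def fixed_add(a: int, b: int) -> int:
    """Addition needs no rescaling: both operands share the same scale."""
    return a + b


def fixed_mul(a: int, b: int) -> int:
    """The raw product carries 32 fractional bits, so shift 16 of them away."""
    return (a * b) >> FRAC_BITS


a = to_fixed(3.25)
b = to_fixed(1.5)
print(to_float(fixed_add(a, b)))  # 4.75
print(to_float(fixed_mul(a, b)))  # 4.875

# The scale is fixed, so small values are lost entirely:
print(to_fixed(1e-6))  # 0
```

The last line shows one of the costs: with a fixed scale, a value like 1e-6 rounds to zero, so the programmer has to track ranges and rescale by hand, which floating point hardware does automatically.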

I see absolutely no benefit for F@h in that approach.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
JimboPalmer
Posts: 2522
Joined: Mon Feb 16, 2009 4:12 am
Location: Greenwood MS USA

Re: INT 8 and 16 for precise calculations?

Post by JimboPalmer »

FP32 has a 24-bit mantissa (23 stored bits plus an implicit leading 1), so both INT16 and INT8 are less precise than FP32. If you use 2 INT16 or 3 INT8 registers and a whole lot of slow code, you could achieve the same precision as is built into the FP32 hardware.
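
A rough sketch of what that looks like, in Python: pull the 24-bit mantissa out of an FP32 value, split it into three 8-bit limbs, and multiply two mantissas with schoolbook multiplication. The function names are made up for this example, and it only handles normal (non-denormal) numbers and only the mantissa product; a real emulation would also need exponent handling, normalization, and rounding.

```python
import struct


def fp32_fields(x: float):
    """Return (sign, biased exponent, 24-bit mantissa) of x stored as FP32.

    The implicit leading 1 is added back, which is valid for normal numbers only.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    mantissa = (bits & 0x7FFFFF) | (1 << 23)
    return sign, exponent, mantissa


def to_limbs(value: int, count: int = 3):
    """Split an integer into 8-bit limbs, least significant first."""
    return [(value >> (8 * i)) & 0xFF for i in range(count)]


def limb_multiply(a_limbs, b_limbs):
    """Schoolbook multiply; returns (product, number of 8x8 partial products).

    The accumulation leans on Python's big integers. An INT8 kernel would
    also have to propagate carries between limbs by hand.
    """
    product = 0
    partial_products = 0
    for i, a in enumerate(a_limbs):
        for j, b in enumerate(b_limbs):
            product += (a * b) << (8 * (i + j))
            partial_products += 1
    return product, partial_products


_, _, mant_a = fp32_fields(1.5)
_, _, mant_b = fp32_fields(2.75)
product, count = limb_multiply(to_limbs(mant_a), to_limbs(mant_b))
assert product == mant_a * mant_b
print(count)  # 9
```

One FP32 multiply turns into nine 8-bit multiplies for the mantissa alone, before any of the carry, exponent, and rounding work.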

So you could write that code, at the cost of a WU that runs about 4 times slower. I find that I write more correct code when I take advantage of the computer's built-in abilities (which is why I wrote in PL/SQL for the last decade I was a programmer).

https://en.wikipedia.org/wiki/PL/SQL

You mention using INT code to make up for missing FP support, which is a limitation of old, slow GPUs. Slowing old, slow GPUs even further does not seem ideal.
Last edited by JimboPalmer on Wed Sep 23, 2020 2:28 am, edited 1 time in total.
Tsar of all the Rushers
I tried to remain childlike, all I achieved was childish.
A friend to those who want no friends
MeeLee
Posts: 1339
Joined: Tue Feb 19, 2019 10:16 pm

Re: INT 8 and 16 for precise calculations?

Post by MeeLee »

But despite the slowdown, looking at a 3090, it can calculate up to ~36 TFLOPS, 142 of FP16, and/or 285 Tensor TFLOPS.
That's 36 TFLOPS at 32 bit,
potentially ~70 TFLOPS at 16 bit,
and/or the same 70 TFLOPS at 8 bit.

Not sure if those ray tracing cores can be added to the regular cores.
If optimized, it seems like they could outdo the 32bit cores!
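
A back-of-the-envelope check of that idea, as a Python sketch. The peak figures are the ones quoted in this post (not independently verified), the 4x cost is the rough slowdown estimated earlier in the thread, and `effective_fp32_tflops` is a hypothetical helper; the real emulation overhead is unknown and could well be higher.

```python
def effective_fp32_tflops(narrow_peak_tflops: float, ops_per_fp32_op: float) -> float:
    """Emulated FP32 throughput if each FP32 op costs several narrow ops."""
    return narrow_peak_tflops / ops_per_fp32_op


FP32_PEAK = 36.0   # TFLOPS, figure quoted in the post
FP16_PEAK = 142.0  # TFLOPS, figure quoted in the post
SLOWDOWN = 4.0     # assumed: the "4 times slower" estimate from the thread

emulated = effective_fp32_tflops(FP16_PEAK, SLOWDOWN)
print(emulated)              # 35.5
print(emulated > FP32_PEAK)  # False
```

Under those assumptions the emulated path roughly breaks even with the native FP32 cores rather than outdoing them.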
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am
Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB

Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
Location: Land Of The Long White Cloud

Re: INT 8 and 16 for precise calculations?

Post by PantherX »

I personally think that rather than "going backwards", maybe we should think "forwards": instead of using those Tensor cores for FP32 processing, can we use them for something new and exciting, like AI, ML, or unique algorithms? We are already using FP32 for folding, so why not see how the "additional" hardware can be used to supplement F@H? I have no idea if F@H can even do those things, but dreams are free :)
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time

Welcome To The F@H Support Forum Ӂ Troubleshooting Bad WUs Ӂ Troubleshooting Server Connectivity Issues
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: INT 8 and 16 for precise calculations?

Post by bruce »

MeeLee wrote: But despite the slowdown, looking at a 3090, it can calculate up to ~36 TFLOPS, 142 of FP16, and/or 285 Tensor TFLOPS.
That's 36 TFLOPS at 32 bit,
potentially ~70 TFLOPS at 16 bit,
and/or the same 70 TFLOPS at 8 bit.

Not sure if those ray tracing cores can be added to the regular cores.
If optimized, it seems like they could outdo the 32bit cores!

Yes, in theory it could work, but it would be EXPENSIVE in programming time and debugging time, plus there are the issues of validating entirely new code. It's easier to wait for the next generation of hardware, which will add hardware that enhances FP performance.

I think it's better to use the tensor cores for problems that will benefit from the use of tensor mathematics. (I'm sure that there are FAH scientists already considering such things.) In the meantime, you can temporarily add another GPU that can produce 36 TFLOPS at 32 bit plus ?? TFLOPS at 64 bit, and donate it to some needy FAH donor whenever you do your next upgrade.