Re: Folding@Home Benchmark Beta Testing
Posted: Sat Jan 19, 2013 12:56 am
PS - Anandtech promised to play around with the benchmarks this week too ^_^
Community driven support forum for Folding@home
https://foldingforum.org/
upgraded the ATI/AMD HD57xx drivers from Catalyist 11.12 to Catalyist 12.8 and now OpenCL runs.proteneer wrote:Do you guys have the latest drivers?
That was with Version 0_1, Version 0_3 seems to work fine!proteneer wrote:Is this with Version 0_2?PinHead wrote:It doesn't seem to like Nvidia/Nvidia either:
Vista Ultimate 32 bit, Tesla C2050 and GTX550Ti
FAHBench.exe --display-devices
Output:Tesla C2050 WDDM mode ( OpenCL )Code: Select all
O O P R O T E N E E R C--N \ \ N | C C=O / \-C C / | N-C \ .C-C C/ C C | C / \ O | | / N | C C | | O C C /-C \_N_/ \ N _C_ C | / O / C C-/ \_C/ \N-/ \ N /-C-\ C | | O / | | C-/ \C/ N-/ \_ N\ /C\ -C N | | O | | | \C/ C/ N/ \_C__/ \ C-\ C C O | | | | C-/ N/ \-C \_C C O | O | | \ \-O C C O | \ \ C N Folding@Home C--N C \ | Benchmark (Beta) | | N--C O | \ Yutong Zhao C=O N proteneer@gmail.com / O for official stats, please visit http://www.fahbench.com === 1 OpenCL platform(s) found: === -- 0 -- PROFILE = FULL_PROFILE VERSION = OpenCL 1.1 CUDA 4.2.1 NAME = NVIDIA CUDA VENDOR = NVIDIA Corporation EXTENSIONS = cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll === 1 OpenCL device(s) found on platform: -- 0 -- DEVICE_NAME = Tesla C2050 / C2070 DEVICE_VENDOR = NVIDIA Corporation DEVICE_VERSION = OpenCL 1.1 CUDA DRIVER_VERSION = 306.94 DEVICE_MAX_COMPUTE_UNITS = 14 DEVICE_MAX_CLOCK_FREQUENCY = 1147 DEVICE_GLOBAL_MEM_SIZE = 2818572288 -- 1 -- DEVICE_NAME = GeForce GTX 550 Ti DEVICE_VENDOR = NVIDIA Corporation DEVICE_VERSION = OpenCL 1.1 CUDA DRIVER_VERSION = 306.94 DEVICE_MAX_COMPUTE_UNITS = 4 DEVICE_MAX_CLOCK_FREQUENCY = 1962 DEVICE_GLOBAL_MEM_SIZE = 1073741824 Invalid Platform (please use either OpenCL or CUDA)
FAHBench.exe -deviceId 0 -platform OpenCL -precision single
Explicit:
13.5521 ns/day ( accuracy really beats on the slow drive, factor ? )
Implicit:
61.4903 ns/day
GTX550Ti ( OpenCL )FAHBench.exe -deviceId 1 -platform OpenCL -precision single
Explicit:
8.40868 ns/day
Implicit:
36.6661 ns/day
Tesla C2050 WDDM mode ( CUDA )FAHBench.exe -deviceId 0 -platform CUDA -precision single
Explicit:
15.8791 ns/day
Implicit:
55.0354 ns/day
GTX550Ti ( CUDA )FAHBench.exe -deviceId 1 -platform CUDA -precision single
Explicit:
Error launching CUDA compiler:-1
nvcc : fatal error : Value 'compute_21' is not defined for option
'gpu-architecture'
I'll see about giving my 3 x 580SCs and 2 x 590 Classifieds a run on this either today or tomorrow.proteneer wrote:nice - its a nice list of numbers the you guys are getting. the numbers are going to vary in the upcoming months, as we begin to heavily optimize OpenMM. I trust all of you guys here, but you might want to mention that those numbers are on unofficial - it's pretty easy to hack so we keep an official list on fahbench.com as well.
You can keep both the CUDA/OpenCL numbers, we care more about single precision perf. at this point. Don't really need double (though it is one area in which the K20 shines).
PS - does anyone have overclocked 580? That card is a BEAST.
Fascinating. How similar will they have to be? Exact same card? same family, or any two GPUs that meet base requirements?proteneer wrote:quick heads up - next update (not released yet) will allow MULTIPLE GPUs to WORK TOGETHER
I don't think he meant multiple cards working on the very same WU (like SLI or something).k1wi wrote:Fascinating. How similar will they have to be? Exact same card? same family, or any two GPUs that meet base requirements?proteneer wrote:quick heads up - next update (not released yet) will allow MULTIPLE GPUs to WORK TOGETHER
I think you would have better luck running multiple WUs on the same GPU (utilization).proteneer wrote:no actually i do mean 2 cards working on the same WU - the problem is that kinda don't scale very well when testing internally (and they only work on explicit). I'll probably need to add another flag specifying explicit/implicit - so sorry if I end up breaking all your batch files =P
Code: Select all
C:\Temp\FAHBench_0_4>fahbench -deviceId 0 -platform CUDA -precision single --disable-accuracy-check
O O
P R O T E N E E R C--N \ \ N
| C C=O / \-C
C / | N-C \
.C-C C/ C C | C
/ \ O | | / N |
C C | | O C C /-C
\_N_/ \ N _C_ C | / O / C
C-/ \_C/ \N-/ \ N /-C-\ C | | O /
| | C-/ \C/ N-/ \_ N\ /C\ -C N | |
O | | | \C/ C/ N/ \_C__/ \ C-\ C
C O | | | | C-/ N/ \-C
\_C C O | O | |
\ \-O C C O
| \ \
C N Folding@Home C--N C
\ | Benchmark (Beta) | |
N--C O |
\ Yutong Zhao C=O
N proteneer@gmail.com /
O
for official stats, please visit www.fahbench.com
Explicit:
Accuracy checking disabled.
14.2741 ns/day
Implicit:
Accuracy checking disabled.
56.962 ns/day
C:\Temp\FAHBench_0_4>fahbench -deviceId 0 -platform CUDA -precision double --disable-splash --disable-accuracy-check
Explicit:
Accuracy checking disabled.
3.94604 ns/day
Implicit:
Accuracy checking disabled.
5.61699 ns/day
C:\Temp\FAHBench_0_4>fahbench -deviceId 0 -platform OpenCL -precision single --disable-splash --disable-accuracy-check
Warning: Using OpenCL platform but no platformId specified, setting platformId=0
Explicit:
Accuracy checking disabled.
10.2394 ns/day
Implicit:
Accuracy checking disabled.
52.4224 ns/day
C:\Temp\FAHBench_0_4>fahbench -deviceId 0 -platform OpenCL -precision double --disable-splash --disable-accuracy-check
Warning: Using OpenCL platform but no platformId specified, setting platformId=0
Explicit:
Accuracy checking disabled.
3.40654 ns/day
Implicit:
Accuracy checking disabled.
4.85512 ns/day
C:\Temp\FAHBench_0_4>fahbench -deviceId 0,1 -platform CUDA -precision single --disable-splash --disable-accuracy-check
Explicit:
Accuracy checking disabled.
22.9827 ns/day
Implicit not supported on multiple devices.
C:\Temp\FAHBench_0_4>fahbench -deviceId 0,1 -platform CUDA -precision double --disable-splash --disable-accuracy-check
Explicit:
Accuracy checking disabled.
7.089 ns/day
Implicit not supported on multiple devices.
C:\Temp\FAHBench_0_4>fahbench -deviceId 0,1 -platform OpenCL -precision single --disable-splash --disable-accuracy-check
Warning: Using OpenCL platform but no platformId specified, setting platformId=0
Explicit:
Accuracy checking disabled.
17.0206 ns/day
Implicit not supported on multiple devices.
C:\Temp\FAHBench_0_4>fahbench -deviceId 0,1 -platform OpenCL -precision double --disable-splash --disable-accuracy-check
Warning: Using OpenCL platform but no platformId specified, setting platformId=0
Explicit:
Accuracy checking disabled.
6.16491 ns/day
Implicit not supported on multiple devices.
That may be a bit of an understatement. I did a small check using CUDA and my 590s. Adding one gpu increased the performance by about 33%, adding a third gpu increased the performance about another 25% of the original score, and then a fourth showed no improvement over three. This was done on the same 590s I posted about in the other thread (linked above), but I had a couple of additional things running that may have reduced scores from yesterday. I will have to run this again when I have nothing else to do on this computer to see if the reduction seen on a single gpu from yesterday to today was because of other processes or the change from v0.3 to v0.4.proteneer wrote:no actually i do mean 2 cards working on the same WU - the problem is that kinda don't scale very well when testing internally (and they only work on explicit). I'll probably need to add another flag specifying explicit/implicit - so sorry if I end up breaking all your batch files =P
proteneer wrote:Unfortunately yes. The CUDA platform does a lot of JIT compilation and hence requires the nvcc compiler. We assume the user has the NVIDIA GPU COMPUTING TOOLKIT 5.0 installed (and hence why its able to find CUFF).Napoleon wrote:Do I actually have to install the whole CUDA5 Toolkit to try this? I've got Visual Studio Express already, as well as NVidia 310.90 WHQL driver, which does have CUDA5 support.
Did this ever get resolved? I have 0.4 and without installing the toolkit I ran the http://www.dependencywalker on the OpenMMOpenCL.dll file that came with the benchmark. I don't know if you're still interested in the results.proteneer wrote:Updated the download with some additional libraries - please redownload. Should fix OpenMMOpenCL.dll errors.
Code: Select all
Error: At least one required implicit or forwarded dependency was not found.
Error: At least one module has an unresolved import due to a missing export function in an implicitly dependent module.
Error: Modules with different CPU types were found.
Warning: At least one delay-load dependency module was not found.
Warning: At least one module has an unresolved import due to a missing export function in a delay-load dependent module.
Code: Select all
C:\Users\bruce\FAHBench_0_4>FAHBench.exe --display-devices --disable-splash
[1] compatible platform(s):
-- 0 --
PROFILE = FULL_PROFILE
VERSION = OpenCL 1.1 CUDA 4.2.1
NAME = NVIDIA CUDA
VENDOR = NVIDIA Corporation
(1) device(s) found on platform 0:
-- 0 --
DEVICE_NAME = GeForce GTX 650 Ti
DEVICE_VENDOR = NVIDIA Corporation
DEVICE_VERSION = OpenCL 1.1 CUDA
Invalid Platform (please use either OpenCL or CUDA)
C:\Users\bruce\FAHBench_0_4>FAHBench.exe -platformId 0 -deviceId 0 -platform OpenCL -precision single --disable-splash
Explicit:
Checking for accuracy...done
5.84779 ns/day
Implicit:
Checking for accuracy...done
38.4752 ns/day
C:\Users\bruce\FAHBench_0_4>FAHBench.exe -platformId 0 -deviceId 0 -platform OpenCL -precision double --disable-splash
Explicit:
Checking for accuracy...done
1.80511 ns/day
Implicit:
Checking for accuracy...done
2.57122 ns/day