Page 7 of 13

Re: Folding@Home Benchmark Beta Testing

Posted: Sat Jan 19, 2013 12:56 am
by proteneer
PS - Anandtech promised to play around with the benchmarks this week too ^_^

Re: Folding@Home Benchmark Beta Testing

Posted: Sat Jan 19, 2013 1:05 am
by PinHead
proteneer wrote:Do you guys have the latest drivers?
upgraded the ATI/AMD HD57xx drivers from Catalyist 11.12 to Catalyist 12.8 and now OpenCL runs.

Win 7 64bit
Catalyist 12.8
AMD/ATI HD5700 series
FAHBench v 0.3

Open CL Single Precision
Explicit
5.05426 ns/day

Implicit
42.1176 ns/day

Open CL Double Precision
Explicit
"This device does not support double precision"

Re: Folding@Home Benchmark Beta Testing

Posted: Sat Jan 19, 2013 2:42 am
by proteneer
Ok thanks - that makes sense =)

Re: Folding@Home Benchmark Beta Testing

Posted: Sat Jan 19, 2013 2:49 am
by PinHead
proteneer wrote:
PinHead wrote:It doesn't seem to like Nvidia/Nvidia either:

Vista Ultimate 32 bit, Tesla C2050 and GTX550Ti

FAHBench.exe --display-devices
Output:

Code: Select all

                                                                               
                                          O              O                     
   P R O T E N E E R     C--N              \              \               N    
                         |                  C              C=O           / \-C 
                         C                 /               |          N-C     \
  .C-C                 C/                  C               C           |      C
 /    \          O     |                   |               /           N      |
C     C          |     |           O       C              C                 /-C
 \_N_/ \   N    _C_    C           |      /         O    /                 C   
        C-/ \_C/   \N-/ \    N   /-C-\   C          |    |           O    /    
        |     |           C-/ \C/     N-/ \_   N\  /C\  -C      N    |    |    
        O     |           |    |            \C/  C/   N/  \_C__/ \   C-\  C    
              C           O    |             |   |          |     C-/   N/ \-C
               \_C             C             O   |          O     |          | 
                  \             \-O              C                C          O 
                  |                               \                \           
                  C    N         Folding@Home      C--N             C          
                   \   |      Benchmark  (Beta)    |                |          
                    N--C                           O                |          
                        \        Yutong Zhao                       C=O        
                         N    proteneer@gmail.com                 /           
                                                                 O            
                                                                               
               for official stats, please visit http://www.fahbench.com               

=== 1 OpenCL platform(s) found: ===
  -- 0 --
  PROFILE = FULL_PROFILE
  VERSION = OpenCL 1.1 CUDA 4.2.1
  NAME = NVIDIA CUDA
  VENDOR = NVIDIA Corporation
  EXTENSIONS = cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll 
=== 1 OpenCL device(s) found on platform:
  -- 0 --
  DEVICE_NAME = Tesla C2050 / C2070
  DEVICE_VENDOR = NVIDIA Corporation
  DEVICE_VERSION = OpenCL 1.1 CUDA
  DRIVER_VERSION = 306.94
  DEVICE_MAX_COMPUTE_UNITS = 14
  DEVICE_MAX_CLOCK_FREQUENCY = 1147
  DEVICE_GLOBAL_MEM_SIZE = 2818572288
  -- 1 --
  DEVICE_NAME = GeForce GTX 550 Ti
  DEVICE_VENDOR = NVIDIA Corporation
  DEVICE_VERSION = OpenCL 1.1 CUDA
  DRIVER_VERSION = 306.94
  DEVICE_MAX_COMPUTE_UNITS = 4
  DEVICE_MAX_CLOCK_FREQUENCY = 1962
  DEVICE_GLOBAL_MEM_SIZE = 1073741824
Invalid Platform (please use either OpenCL or CUDA)
Tesla C2050 WDDM mode ( OpenCL )
FAHBench.exe -deviceId 0 -platform OpenCL -precision single
Explicit:
13.5521 ns/day ( accuracy really beats on the slow drive, factor ? )

Implicit:
61.4903 ns/day

GTX550Ti ( OpenCL )FAHBench.exe -deviceId 1 -platform OpenCL -precision single
Explicit:
8.40868 ns/day

Implicit:
36.6661 ns/day

Tesla C2050 WDDM mode ( CUDA )FAHBench.exe -deviceId 0 -platform CUDA -precision single
Explicit:
15.8791 ns/day

Implicit:
55.0354 ns/day

GTX550Ti ( CUDA )FAHBench.exe -deviceId 1 -platform CUDA -precision single
Explicit:
Error launching CUDA compiler:-1
nvcc : fatal error : Value 'compute_21' is not defined for option
'gpu-architecture'
Is this with Version 0_2?
That was with Version 0_1, Version 0_3 seems to work fine!

c:\FAHBench_0_3>FAHBench.exe --display-devices


[1] compatible platform(s):
-- 0 --
PROFILE = FULL_PROFILE
VERSION = OpenCL 1.1 CUDA 4.2.1
NAME = NVIDIA CUDA
VENDOR = NVIDIA Corporation

(2) device(s) found on platform 0:
-- 0 --
DEVICE_NAME = Tesla C2050 / C2070
DEVICE_VENDOR = NVIDIA Corporation
DEVICE_VERSION = OpenCL 1.1 CUDA

-- 1 --
DEVICE_NAME = GeForce GTX 550 Ti
DEVICE_VENDOR = NVIDIA Corporation
DEVICE_VERSION = OpenCL 1.1 CUDA

Invalid Platform (please use either OpenCL or CUDA)


Tesla C2050

c:\FAHBench_0_3>FAHBench.exe -platformId 0 -deviceId 0 -platform OpenCL -precision single --disable-splash
Explicit:
Checking for accuracy...done
13.5529 ns/day
Implicit:
Checking for accuracy...done
61.5779 ns/day


c:\FAHBench_0_3>FAHBench.exe -platformId 0 -deviceId 0 -platform OpenCL -precision double --disable-splash
Explicit:
Checking for accuracy...done
7.24334 ns/day
Implicit:
Checking for accuracy...done
10.7788 ns/day


c:\FAHBench_0_3>FAHBench.exe -deviceId 0 -platform CUDA -precision single --disable-splash
Explicit:
Checking for accuracy...done
15.885 ns/day
Implicit:
Checking for accuracy...done
55.1865 ns/day


c:\FAHBench_0_3>FAHBench.exe -deviceId 0 -platform CUDA -precision double --disable-splash
Explicit:
Checking for accuracy...done
7.16828 ns/day
Implicit:
Checking for accuracy...done
11.0659 ns/day


GTX 550Ti

c:\FAHBench_0_3>FAHBench.exe -platformId 0 -deviceId 1 -platform OpenCL -precision single --disable-splash
Explicit:
Checking for accuracy...done
8.56641 ns/day
Implicit:
Checking for accuracy...done
42.1772 ns/day


c:\FAHBench_0_3>FAHBench.exe -platformId 0 -deviceId 1 -platform OpenCL -precision double --disable-splash
Explicit:
Checking for accuracy...done
2.51918 ns/day
Implicit:
Checking for accuracy...done
3.38533 ns/day


c:\FAHBench_0_3>FAHBench.exe -deviceId 1 -platform CUDA -precision single --disable-splash
Explicit:
Checking for accuracy...done
11.3592 ns/day
Implicit:
Checking for accuracy...done
20.203 ns/day


c:\FAHBench_0_3>FAHBench.exe -deviceId 1 -platform CUDA -precision double --disable-splash
Explicit:
Checking for accuracy...done
2.914 ns/day
Implicit:
Checking for accuracy...done
3.95735 ns/day

Re: Folding@Home Benchmark Beta Testing

Posted: Sat Jan 19, 2013 3:12 pm
by rjbelans
proteneer wrote:nice - its a nice list of numbers the you guys are getting. the numbers are going to vary in the upcoming months, as we begin to heavily optimize OpenMM. I trust all of you guys here, but you might want to mention that those numbers are on unofficial - it's pretty easy to hack so we keep an official list on fahbench.com as well.

You can keep both the CUDA/OpenCL numbers, we care more about single precision perf. at this point. Don't really need double (though it is one area in which the K20 shines).

PS - does anyone have overclocked 580? That card is a BEAST.
I'll see about giving my 3 x 580SCs and 2 x 590 Classifieds a run on this either today or tomorrow.



EDIT 1: 1:30 PM EST January 19, 2013

Added 590 info here: viewtopic.php?f=38&t=23440&start=15#p234172

FAHBench ran fine as far as I can tell.

Re: Folding@Home Benchmark Beta Testing

Posted: Sun Jan 20, 2013 1:48 am
by proteneer
quick heads up - next update (not released yet) will allow MULTIPLE GPUs to WORK TOGETHER

Re: Folding@Home Benchmark Beta Testing

Posted: Sun Jan 20, 2013 2:57 am
by k1wi
proteneer wrote:quick heads up - next update (not released yet) will allow MULTIPLE GPUs to WORK TOGETHER
Fascinating. How similar will they have to be? Exact same card? same family, or any two GPUs that meet base requirements?

Re: Folding@Home Benchmark Beta Testing

Posted: Sun Jan 20, 2013 7:51 am
by Evil Penguin
k1wi wrote:
proteneer wrote:quick heads up - next update (not released yet) will allow MULTIPLE GPUs to WORK TOGETHER
Fascinating. How similar will they have to be? Exact same card? same family, or any two GPUs that meet base requirements?
I don't think he meant multiple cards working on the very same WU (like SLI or something).
Just multiple instances at the same time.

Re: Folding@Home Benchmark Beta Testing

Posted: Sun Jan 20, 2013 9:24 am
by proteneer
no actually i do mean 2 cards working on the same WU - the problem is that kinda don't scale very well when testing internally (and they only work on explicit). I'll probably need to add another flag specifying explicit/implicit

Re: Folding@Home Benchmark Beta Testing

Posted: Sun Jan 20, 2013 10:01 am
by Evil Penguin
proteneer wrote:no actually i do mean 2 cards working on the same WU - the problem is that kinda don't scale very well when testing internally (and they only work on explicit). I'll probably need to add another flag specifying explicit/implicit - so sorry if I end up breaking all your batch files =P
I think you would have better luck running multiple WUs on the same GPU (utilization).
I believe some GPUs can allocate a certain amount of SPUs to different tasks.
Not sure about that...

Also, isn't there a bit of a problem with smaller proteins not scaling too well with more SPUs?

Re: Folding@Home Benchmark Beta (0.4 Latest)

Posted: Sun Jan 20, 2013 10:05 am
by proteneer
Updated - 0.4 released, can now do multiple GPUs on same device.

"I think you would have better luck running multiple WUs on the same GPU (utilization)."

In FAH we highly prefer longer trajectories of a single simulation when possible.

Re: Folding@Home Benchmark Beta (0.4 Latest)

Posted: Sun Jan 20, 2013 3:28 pm
by P5-133XL
Q6600@3.0GHz, 4GB RAM, 2x GTX450@825/1848; x64 Win7 SP1 Nvidia v306.94

This gives single GTX460 data to compare against the dual SMP GTX460 data.

Code: Select all

C:\Temp\FAHBench_0_4>fahbench -deviceId 0 -platform CUDA -precision single --disable-accuracy-check 
                                                                               
                                          O              O                     
   P R O T E N E E R     C--N              \              \               N    
                         |                  C              C=O           / \-C 
                         C                 /               |          N-C     \
  .C-C                 C/                  C               C           |      C
 /    \          O     |                   |               /           N      |
C     C          |     |           O       C              C                 /-C
 \_N_/ \   N    _C_    C           |      /         O    /                 C   
        C-/ \_C/   \N-/ \    N   /-C-\   C          |    |           O    /    
        |     |           C-/ \C/     N-/ \_   N\  /C\  -C      N    |    |    
        O     |           |    |            \C/  C/   N/  \_C__/ \   C-\  C    
              C           O    |             |   |          |     C-/   N/ \-C
               \_C             C             O   |          O     |          | 
                  \             \-O              C                C          O 
                  |                               \                \           
                  C    N         Folding@Home      C--N             C          
                   \   |      Benchmark  (Beta)    |                |          
                    N--C                           O                |          
                        \        Yutong Zhao                       C=O        
                         N    proteneer@gmail.com                 /           
                                                                 O            
                                                                               
               for official stats, please visit www.fahbench.com               

Explicit: 
Accuracy checking disabled.
14.2741 ns/day
Implicit: 
Accuracy checking disabled.
56.962 ns/day

C:\Temp\FAHBench_0_4>fahbench -deviceId 0 -platform CUDA -precision double --disable-splash --disable-accuracy-check 
Explicit: 
Accuracy checking disabled.
3.94604 ns/day
Implicit: 
Accuracy checking disabled.
5.61699 ns/day

C:\Temp\FAHBench_0_4>fahbench -deviceId 0 -platform OpenCL -precision single --disable-splash --disable-accuracy-check 
Warning: Using OpenCL platform but no platformId specified, setting platformId=0
Explicit: 
Accuracy checking disabled.
10.2394 ns/day
Implicit: 
Accuracy checking disabled.
52.4224 ns/day

C:\Temp\FAHBench_0_4>fahbench -deviceId 0 -platform OpenCL -precision double --disable-splash --disable-accuracy-check 
Warning: Using OpenCL platform but no platformId specified, setting platformId=0
Explicit: 
Accuracy checking disabled.
3.40654 ns/day
Implicit: 
Accuracy checking disabled.
4.85512 ns/day

C:\Temp\FAHBench_0_4>fahbench -deviceId 0,1 -platform CUDA -precision single --disable-splash --disable-accuracy-check 
Explicit: 
Accuracy checking disabled.
22.9827 ns/day
Implicit not supported on multiple devices.

C:\Temp\FAHBench_0_4>fahbench -deviceId 0,1 -platform CUDA -precision double --disable-splash --disable-accuracy-check 
Explicit: 
Accuracy checking disabled.
7.089 ns/day
Implicit not supported on multiple devices.

C:\Temp\FAHBench_0_4>fahbench -deviceId 0,1 -platform OpenCL -precision single --disable-splash --disable-accuracy-check 
Warning: Using OpenCL platform but no platformId specified, setting platformId=0
Explicit: 
Accuracy checking disabled.
17.0206 ns/day
Implicit not supported on multiple devices.

C:\Temp\FAHBench_0_4>fahbench -deviceId 0,1 -platform OpenCL -precision double --disable-splash --disable-accuracy-check 
Warning: Using OpenCL platform but no platformId specified, setting platformId=0
Explicit: 
Accuracy checking disabled.
6.16491 ns/day
Implicit not supported on multiple devices.

Re: Folding@Home Benchmark Beta Testing

Posted: Sun Jan 20, 2013 9:57 pm
by rjbelans
proteneer wrote:no actually i do mean 2 cards working on the same WU - the problem is that kinda don't scale very well when testing internally (and they only work on explicit). I'll probably need to add another flag specifying explicit/implicit - so sorry if I end up breaking all your batch files =P
That may be a bit of an understatement. I did a small check using CUDA and my 590s. Adding one gpu increased the performance by about 33%, adding a third gpu increased the performance about another 25% of the original score, and then a fourth showed no improvement over three. This was done on the same 590s I posted about in the other thread (linked above), but I had a couple of additional things running that may have reduced scores from yesterday. I will have to run this again when I have nothing else to do on this computer to see if the reduction seen on a single gpu from yesterday to today was because of other processes or the change from v0.3 to v0.4.

Image

Re: Folding@Home Benchmark Beta Testing

Posted: Mon Jan 21, 2013 6:54 am
by bruce
proteneer wrote:
Napoleon wrote:Do I actually have to install the whole CUDA5 Toolkit to try this? I've got Visual Studio Express already, as well as NVidia 310.90 WHQL driver, which does have CUDA5 support.
Unfortunately yes. The CUDA platform does a lot of JIT compilation and hence requires the nvcc compiler. We assume the user has the NVIDIA GPU COMPUTING TOOLKIT 5.0 installed (and hence why its able to find CUFF).
proteneer wrote:Updated the download with some additional libraries - please redownload. Should fix OpenMMOpenCL.dll errors.
Did this ever get resolved? I have 0.4 and without installing the toolkit I ran the http://www.dependencywalker on the OpenMMOpenCL.dll file that came with the benchmark. I don't know if you're still interested in the results.

Code: Select all

Error: At least one required implicit or forwarded dependency was not found.
Error: At least one module has an unresolved import due to a missing export function in an implicitly dependent module.
Error: Modules with different CPU types were found.
Warning: At least one delay-load dependency module was not found.
Warning: At least one module has an unresolved import due to a missing export function in a delay-load dependent module.
Three files could not be found:
CUFFT32_50-35.DLL
MVSCR90.DLL
IESHIMS.DLL

I'm not sure if you ever want to make FAHBench download all it's dependencies, but if you do, I though this might help.

I'll install the toolkit now.

In the meantime, plain vanilla (no overclock):

Code: Select all

C:\Users\bruce\FAHBench_0_4>FAHBench.exe --display-devices --disable-splash
[1] compatible platform(s):
  -- 0 --
  PROFILE = FULL_PROFILE
  VERSION = OpenCL 1.1 CUDA 4.2.1
  NAME = NVIDIA CUDA
  VENDOR = NVIDIA Corporation

(1) device(s) found on platform 0:
  -- 0 --
  DEVICE_NAME = GeForce GTX 650 Ti
  DEVICE_VENDOR = NVIDIA Corporation
  DEVICE_VERSION = OpenCL 1.1 CUDA

Invalid Platform (please use either OpenCL or CUDA)

C:\Users\bruce\FAHBench_0_4>FAHBench.exe -platformId 0 -deviceId 0 -platform OpenCL -precision single --disable-splash
Explicit:
Checking for accuracy...done
5.84779 ns/day
Implicit:
Checking for accuracy...done
38.4752 ns/day

C:\Users\bruce\FAHBench_0_4>FAHBench.exe -platformId 0 -deviceId 0 -platform OpenCL -precision double --disable-splash
Explicit:
Checking for accuracy...done
1.80511 ns/day
Implicit:
Checking for accuracy...done
2.57122 ns/day

Re: Folding@Home Benchmark Beta (0.4 Latest)

Posted: Mon Jan 21, 2013 7:14 am
by bruce
Do you plan to port FAHBench to Linux or OSX any time soon?