Re: UNSTABLE_MACHINE resets GPU at ~1.51%

Posted: Sat Sep 14, 2013 12:54 pm
by 7im
NaN errors are hardware related. Too much overclocking is the most common cause. Too much heat is another, and so is insufficient power (a PSU that is too small). What are the GPU temps?
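For anyone who wants to log temperatures while a work unit runs, here is a minimal Python sketch. It assumes `nvidia-smi` is installed with the NVIDIA driver and is on the PATH, and that the driver exposes `temperature.gpu` and `power.draw` for the card. On some GeForce cards and older drivers the power field comes back as not supported, so the parser returns `None` for anything that is not a number. The function names are my own, not part of FAHClient.

```python
import csv
import io
import subprocess

# Assumption: nvidia-smi ships with the installed NVIDIA driver and is on PATH.
NVIDIA_SMI_CMD = [
    "nvidia-smi",
    "--query-gpu=index,name,temperature.gpu,power.draw",
    "--format=csv,noheader,nounits",
]


def _to_float(field):
    """Return the field as a float, or None when the driver reports no value."""
    try:
        return float(field)
    except ValueError:
        return None


def parse_gpu_status(text):
    """Parse nvidia-smi CSV rows into one dict per GPU."""
    gpus = []
    for record in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(record) != 4:
            continue  # skip blank or unexpected lines
        index, name, temp, power = (field.strip() for field in record)
        gpus.append({
            "index": int(index),
            "name": name,
            "temp_c": _to_float(temp),
            "power_w": _to_float(power),
        })
    return gpus


def read_gpu_status():
    """Run nvidia-smi once and return the parsed readings."""
    result = subprocess.run(
        NVIDIA_SMI_CMD, capture_output=True, text=True, check=True
    )
    return parse_gpu_status(result.stdout)
```

Calling `read_gpu_status()` every few seconds and writing the result next to a timestamp would let you line the readings up against the time of the next UNSTABLE_MACHINE entry in the FAHClient log.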

Re: UNSTABLE_MACHINE resets GPU at ~1.51%

Posted: Sat Sep 14, 2013 10:03 pm
by ntsarb
Update: Running FaH on the GPU connected to the (more) stable PCI-E slot also resulted in an UNSTABLE_MACHINE error, although it took longer to appear than it does on the "unstable" PCI-E slot. I am attaching the log file in case it helps:

*********************** Log Started 2013-09-14T18:16:38Z ***********************
18:16:38:************************* Folding@home Client *************************
18:16:38:      Website: http://folding.stanford.edu/
18:16:38:    Copyright: (c) 2009-2013 Stanford University
18:16:38:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
18:16:38:         Args: --open-web-control
18:16:38:       Config: <none>
18:16:38:******************************** Build ********************************
18:16:38:      Version: 7.3.6
18:16:38:         Date: Feb 18 2013
18:16:38:         Time: 15:25:17
18:16:38:      SVN Rev: 3923
18:16:38:       Branch: fah/trunk/client
18:16:38:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
18:16:38:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
18:16:38:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
18:16:38:     Platform: win32 XP
18:16:38:         Bits: 32
18:16:38:         Mode: Release
18:16:38:******************************* System ********************************
18:16:38:          CPU: Intel(R) Core(TM) i7 CPU 860 @ 2.80GHz
18:16:38:       CPU ID: GenuineIntel Family 6 Model 30 Stepping 5
18:16:38:         CPUs: 8
18:16:38:       Memory: 16.00GiB
18:16:38:  Free Memory: 12.52GiB
18:16:38:      Threads: WINDOWS_THREADS
18:16:38:  Has Battery: false
18:16:38:   On Battery: false
18:16:38:   UTC offset: 1
18:16:38:          PID: 8576
18:16:38:          CWD: C:/Users/Nikos/AppData/Roaming/FAHClient
18:16:38:           OS: Windows 7 Home Premium
18:16:38:      OS Arch: AMD64
18:16:38:         GPUs: 2
18:16:38:        GPU 0: NVIDIA:3 GK106 [GeForce GTX 660]
18:16:38:        GPU 1: NVIDIA:3 GK106 [GeForce GTX 660]
18:16:38:         CUDA: 3.0
18:16:38:  CUDA Driver: 5050
18:16:38:Win32 Service: false
18:16:38:***********************************************************************
18:16:38:<config>
18:16:38:  <!-- Folding Slots -->
18:16:38:</config>
18:16:38:Connecting to assign-GPU.stanford.edu:80
18:16:40:Connecting to assign-GPU.stanford.edu:8080
18:16:41:Read GPUs.txt
18:16:41:Trying to access database...
18:16:41:Successfully acquired database lock
18:16:41:Enabled folding slot 00: PAUSED gpu:0:GK106 [GeForce GTX 660] (not configured)
18:16:41:Enabled folding slot 01: PAUSED gpu:1:GK106 [GeForce GTX 660] (not configured)
18:16:41:Enabled folding slot 02: PAUSED cpu:7 (not configured)
18:17:47:3:127.0.0.1:New Web connection
18:21:37:Set client configured
18:21:38:WU00:FS02:Connecting to assign-GPU.stanford.edu:80
18:21:38:WU00:FS02:Connecting to assign3.stanford.edu:8080
18:21:39:WU00:FS02:News: Welcome to Folding@Home
18:21:39:WU00:FS02:Assigned to work server 128.143.231.202
18:21:39:WU00:FS02:Requesting new work unit for slot 02: READY cpu:7 from 128.143.231.202
18:21:39:WU00:FS02:Connecting to 128.143.231.202:8080
18:21:40:WU00:FS02:Downloading 3.64MiB
18:21:44:WU00:FS02:Download complete
18:21:44:WU00:FS02:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:6099 run:4 clone:59 gen:303 core:0xa3 unit:0x000001b80a3b1e594e91dc2629515ad6
18:21:44:WU00:FS02:Downloading core from http://www.stanford.edu/~pande/Win32/AMD64/Core_a3.fah
18:21:44:WU00:FS02:Connecting to www.stanford.edu:80
18:21:44:WU00:FS02:FahCore a3: Downloading 2.89MiB
18:21:50:WU00:FS02:FahCore a3: 30.30%
18:21:56:WU00:FS02:FahCore a3: 67.09%
18:22:01:WU00:FS02:FahCore a3: Download complete
18:22:01:WU00:FS02:Valid core signature
18:22:01:WU00:FS02:Unpacked 9.59MiB to cores/www.stanford.edu/~pande/Win32/AMD64/Core_a3.fah/FahCore_a3.exe
18:22:01:WU00:FS02:Starting
18:22:01:WU00:FS02:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Nikos/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/Core_a3.fah/FahCore_a3.exe -dir 00 -suffix 01 -version 703 -lifeline 8576 -checkpoint 15 -np 7
18:22:01:WU00:FS02:Started FahCore on PID 8760
18:22:02:WU00:FS02:Core PID:3672
18:22:02:WU00:FS02:FahCore 0xa3 started
18:22:02:WU00:FS02:0xa3:
18:22:02:WU00:FS02:0xa3:*------------------------------*
18:22:02:WU00:FS02:0xa3:Folding@Home Gromacs SMP Core
18:22:02:WU00:FS02:0xa3:Version 2.27 (Dec. 15, 2010)
18:22:02:WU00:FS02:0xa3:
18:22:02:WU00:FS02:0xa3:Preparing to commence simulation
18:22:02:WU00:FS02:0xa3:- Looking at optimizations...
18:22:02:WU00:FS02:0xa3:- Created dyn
18:22:02:WU00:FS02:0xa3:- Files status OK
18:22:02:WU00:FS02:0xa3:- Expanded 3813569 -> 4169428 (decompressed 109.3 percent)
18:22:02:WU00:FS02:0xa3:Called DecompressByteArray: compressed_data_size=3813569 data_size=4169428, decompressed_data_size=4169428 diff=0
18:22:02:WU00:FS02:0xa3:- Digital signature verified
18:22:02:WU00:FS02:0xa3:
18:22:02:WU00:FS02:0xa3:Project: 6099 (Run 4, Clone 59, Gen 303)
18:22:02:WU00:FS02:0xa3:
18:22:02:WU00:FS02:0xa3:Assembly optimizations on if available.
18:22:02:WU00:FS02:0xa3:Entering M.D.
18:22:08:WU00:FS02:0xa3:Mapping NT from 7 to 7 
18:22:09:WU00:FS02:0xa3:Completed 0 out of 500000 steps  (0%)
18:35:13:WU00:FS02:0xa3:Completed 5000 out of 500000 steps  (1%)
18:41:22:WU01:FS00:Connecting to assign-GPU.stanford.edu:80
18:41:22:WU02:FS01:Connecting to assign-GPU.stanford.edu:80
18:41:23:WU01:FS00:News: Welcome to Folding@Home
18:41:23:WU01:FS00:Assigned to work server 171.64.65.105
18:41:23:WU01:FS00:Requesting new work unit for slot 00: READY gpu:0:GK106 [GeForce GTX 660] from 171.64.65.105
18:41:23:WU01:FS00:Connecting to 171.64.65.105:8080
18:41:23:WU02:FS01:News: Welcome to Folding@Home
18:41:23:WU02:FS01:Assigned to work server 171.64.65.105
18:41:23:WU02:FS01:Requesting new work unit for slot 01: READY gpu:1:GK106 [GeForce GTX 660] (waiting for idle) from 171.64.65.105
18:41:23:WU02:FS01:Connecting to 171.64.65.105:8080
18:41:24:WU02:FS01:Downloading 78.43KiB
18:41:24:WU01:FS00:Downloading 78.06KiB
18:41:24:WU01:FS00:Download complete
18:41:24:WU01:FS00:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:7660 run:1077 clone:0 gen:188 core:0x15 unit:0x00000117664f2dd150f83fbbb95316ab
18:41:25:WU02:FS01:Download complete
18:41:25:WU02:FS01:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:7660 run:645 clone:0 gen:185 core:0x15 unit:0x00000103664f2dd150f83f804e65c4bc
18:43:35:FS02:Shutting core down
18:43:35:WU01:FS00:Downloading core from http://www.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_15.fah
18:43:35:WU01:FS00:Connecting to www.stanford.edu:80
18:43:35:WU01:FS00:FahCore 15: Downloading 1.88MiB
18:43:41:WU01:FS00:FahCore 15: 49.98%
18:43:42:WU00:FS02:0xa3:Client no longer detected. Shutting down core 
18:43:42:WU00:FS02:0xa3:
18:43:42:WU00:FS02:0xa3:Folding@home Core Shutdown: CLIENT_DIED
18:43:43:WU00:FS02:FahCore returned: INTERRUPTED (102 = 0x66)
18:43:43:WU00:FS02:Starting
18:43:43:WARNING:WU00:FS02:Changed SMP threads from 7 to 8 this can cause some work units to fail
18:43:43:WU00:FS02:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Nikos/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/Core_a3.fah/FahCore_a3.exe -dir 00 -suffix 01 -version 703 -lifeline 8576 -checkpoint 15 -np 8
18:43:43:WU00:FS02:Started FahCore on PID 4240
18:43:43:WU00:FS02:Core PID:2388
18:43:43:WU00:FS02:FahCore 0xa3 started
18:43:43:WU00:FS02:0xa3:
18:43:43:WU00:FS02:0xa3:*------------------------------*
18:43:43:WU00:FS02:0xa3:Folding@Home Gromacs SMP Core
18:43:43:WU00:FS02:0xa3:Version 2.27 (Dec. 15, 2010)
18:43:43:WU00:FS02:0xa3:
18:43:43:WU00:FS02:0xa3:Preparing to commence simulation
18:43:43:WU00:FS02:0xa3:- Looking at optimizations...
18:43:43:WU00:FS02:0xa3:- Files status OK
18:43:43:WU00:FS02:0xa3:- Expanded 3813569 -> 4169428 (decompressed 109.3 percent)
18:43:43:WU00:FS02:0xa3:Called DecompressByteArray: compressed_data_size=3813569 data_size=4169428, decompressed_data_size=4169428 diff=0
18:43:44:WU00:FS02:0xa3:- Digital signature verified
18:43:44:WU00:FS02:0xa3:
18:43:44:WU00:FS02:0xa3:Project: 6099 (Run 4, Clone 59, Gen 303)
18:43:44:WU00:FS02:0xa3:
18:43:44:WU00:FS02:0xa3:Assembly optimizations on if available.
18:43:44:WU00:FS02:0xa3:Entering M.D.
18:43:46:WU01:FS00:FahCore 15: Download complete
18:43:46:WU01:FS00:Valid core signature
18:43:46:WU01:FS00:Unpacked 7.71MiB to cores/www.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_15.fah/FahCore_15.exe
18:43:46:WU01:FS00:Starting
18:43:46:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Nikos/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_15.fah/FahCore_15.exe -dir 01 -suffix 01 -version 703 -lifeline 8576 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
18:43:46:WU01:FS00:Started FahCore on PID 5748
18:43:46:WU01:FS00:Core PID:1616
18:43:46:WU01:FS00:FahCore 0x15 started
18:43:46:WU02:FS01:Starting
18:43:46:WU02:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Nikos/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_15.fah/FahCore_15.exe -dir 02 -suffix 01 -version 703 -lifeline 8576 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
18:43:46:WU02:FS01:Started FahCore on PID 6188
18:43:46:WU02:FS01:Core PID:8588
18:43:46:WU02:FS01:FahCore 0x15 started
18:43:50:WU00:FS02:0xa3:Using Gromacs checkpoints
18:43:50:WU00:FS02:0xa3:Mapping NT from 8 to 8 
18:43:50:WU00:FS02:0xa3:Resuming from checkpoint
18:43:51:WU00:FS02:0xa3:Verified 00/wudata_01.log
18:43:51:WU00:FS02:0xa3:Verified 00/wudata_01.trr
18:43:51:WU00:FS02:0xa3:Verified 00/wudata_01.edr
18:43:51:WU01:FS00:0x15:
18:43:51:WU01:FS00:0x15:*------------------------------*
18:43:51:WU01:FS00:0x15:Folding@Home GPU Core
18:43:51:WU01:FS00:0x15:Version                2.25 (Wed May 9 17:03:01 EDT 2012)
18:43:51:WU01:FS00:0x15:Build host             AmoebaRemote
18:43:51:WU01:FS00:0x15:Board Type             NVIDIA/CUDA
18:43:51:WU01:FS00:0x15:Core                   15
18:43:51:WU01:FS00:0x15:
18:43:51:WU01:FS00:0x15:Window's signal control handler registered.
18:43:51:WU01:FS00:0x15:Preparing to commence simulation
18:43:51:WU01:FS00:0x15:- Looking at optimizations...
18:43:51:WU01:FS00:0x15:DeleteFrameFiles: successfully deleted file=01/wudata_01.ckp
18:43:51:WU01:FS00:0x15:- Created dyn
18:43:51:WU01:FS00:0x15:- Files status OK
18:43:51:WU01:FS00:0x15:sizeof(CORE_PACKET_HDR) = 512 file=<>
18:43:51:WU01:FS00:0x15:- Expanded 79426 -> 307810 (decompressed 387.5 percent)
18:43:51:WU01:FS00:0x15:Called DecompressByteArray: compressed_data_size=79426 data_size=307810, decompressed_data_size=307810 diff=0
18:43:51:WU01:FS00:0x15:- Digital signature verified
18:43:51:WU01:FS00:0x15:
18:43:51:WU01:FS00:0x15:Project: 7660 (Run 1077, Clone 0, Gen 188)
18:43:51:WU01:FS00:0x15:
18:43:51:WU01:FS00:0x15:Assembly optimizations on if available.
18:43:51:WU01:FS00:0x15:Entering M.D.
18:43:51:WU00:FS02:0xa3:Completed 5740 out of 500000 steps  (1%)
18:43:51:WU02:FS01:0x15:
18:43:51:WU02:FS01:0x15:*------------------------------*
18:43:51:WU02:FS01:0x15:Folding@Home GPU Core
18:43:51:WU02:FS01:0x15:Version                2.25 (Wed May 9 17:03:01 EDT 2012)
18:43:51:WU02:FS01:0x15:Build host             AmoebaRemote
18:43:51:WU02:FS01:0x15:Board Type             NVIDIA/CUDA
18:43:51:WU02:FS01:0x15:Core                   15
18:43:51:WU02:FS01:0x15:GPU device info vendor=0 device=0 name=NA match=0 deviceId=1
18:43:51:WU02:FS01:0x15:
18:43:51:WU02:FS01:0x15:Window's signal control handler registered.
18:43:51:WU02:FS01:0x15:Preparing to commence simulation
18:43:51:WU02:FS01:0x15:- Looking at optimizations...
18:43:51:WU02:FS01:0x15:DeleteFrameFiles: successfully deleted file=02/wudata_01.ckp
18:43:51:WU02:FS01:0x15:- Created dyn
18:43:51:WU02:FS01:0x15:- Files status OK
18:43:51:WU02:FS01:0x15:sizeof(CORE_PACKET_HDR) = 512 file=<>
18:43:51:WU02:FS01:0x15:- Expanded 79804 -> 307810 (decompressed 385.7 percent)
18:43:51:WU02:FS01:0x15:Called DecompressByteArray: compressed_data_size=79804 data_size=307810, decompressed_data_size=307810 diff=0
18:43:51:WU02:FS01:0x15:- Digital signature verified
18:43:51:WU02:FS01:0x15:
18:43:51:WU02:FS01:0x15:Project: 7660 (Run 645, Clone 0, Gen 185)
18:43:51:WU02:FS01:0x15:
18:43:51:WU02:FS01:0x15:Assembly optimizations on if available.
18:43:51:WU02:FS01:0x15:Entering M.D.
18:43:53:WU02:FS01:0x15:Tpr hash 02/wudata_01.tpr:  3787850416 1496236415 1649889738 1869792425 2509146834
18:43:53:WU02:FS01:0x15:GPU device id=1
18:43:53:WU01:FS00:0x15:Tpr hash 01/wudata_01.tpr:  3486599561 625215593 1251349550 4210270855 2310075965
18:43:53:WU01:FS00:0x15:GPU device id=0
18:43:53:WU01:FS00:0x15:Working on Protein
18:43:53:WU01:FS00:0x15:Client config unavailable.
18:43:53:WU02:FS01:0x15:Working on Protein
18:43:53:WU02:FS01:0x15:Client config unavailable.
18:43:53:WU02:FS01:0x15:Starting GUI Server
18:43:53:WU01:FS00:0x15:Starting GUI Server
18:44:06:Saving configuration to config.xml
18:44:06:<config>
18:44:06:  <!-- Folding Slot Configuration -->
18:44:06:  <power v='full'/>
18:44:06:
18:44:06:  <!-- Network -->
18:44:06:  <proxy v=':8080'/>
18:44:06:
18:44:06:  <!-- User Information -->
18:44:06:  <passkey v='********************************'/>
18:44:06:  <team v='44079'/>
18:44:06:  <user v='ntsarb'/>
18:44:06:
18:44:06:  <!-- Folding Slots -->
18:44:06:  <slot id='0' type='GPU'/>
18:44:06:  <slot id='1' type='GPU'/>
18:44:06:  <slot id='2' type='CPU'>
18:44:06:    <cpus v='4'/>
18:44:06:  </slot>
18:44:06:</config>
18:44:06:FS02:Shutting core down
18:44:14:WU00:FS02:0xa3:Client no longer detected. Shutting down core 
18:44:14:WU00:FS02:0xa3:
18:44:14:WU00:FS02:0xa3:Folding@home Core Shutdown: CLIENT_DIED
18:44:14:WU00:FS02:FahCore returned: INTERRUPTED (102 = 0x66)
18:44:35:Saving configuration to config.xml
18:44:35:<config>
18:44:35:  <!-- Folding Slot Configuration -->
18:44:35:  <power v='full'/>
18:44:35:
18:44:35:  <!-- Network -->
18:44:35:  <proxy v=':8080'/>
18:44:35:
18:44:35:  <!-- User Information -->
18:44:35:  <passkey v='********************************'/>
18:44:35:  <team v='44079'/>
18:44:35:  <user v='ntsarb'/>
18:44:35:
18:44:35:  <!-- Folding Slots -->
18:44:35:  <slot id='0' type='GPU'/>
18:44:35:  <slot id='1' type='GPU'/>
18:44:35:  <slot id='2' type='CPU'>
18:44:35:    <cpus v='4'/>
18:44:35:  </slot>
18:44:35:</config>
18:44:43:WU00:FS02:Starting
18:44:43:WARNING:WU00:FS02:Changed SMP threads from 8 to 4 this can cause some work units to fail
18:44:43:WU00:FS02:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Nikos/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/Core_a3.fah/FahCore_a3.exe -dir 00 -suffix 01 -version 703 -lifeline 8576 -checkpoint 15 -np 4
18:44:43:WU00:FS02:Started FahCore on PID 8616
18:44:43:WU00:FS02:Core PID:8460
18:44:43:WU00:FS02:FahCore 0xa3 started
18:44:44:WU00:FS02:0xa3:
18:44:44:WU00:FS02:0xa3:*------------------------------*
18:44:44:WU00:FS02:0xa3:Folding@Home Gromacs SMP Core
18:44:44:WU00:FS02:0xa3:Version 2.27 (Dec. 15, 2010)
18:44:44:WU00:FS02:0xa3:
18:44:44:WU00:FS02:0xa3:Preparing to commence simulation
18:44:44:WU00:FS02:0xa3:- Looking at optimizations...
18:44:44:WU00:FS02:0xa3:- Files status OK
18:44:44:WU00:FS02:0xa3:- Expanded 3813569 -> 4169428 (decompressed 109.3 percent)
18:44:44:WU00:FS02:0xa3:Called DecompressByteArray: compressed_data_size=3813569 data_size=4169428, decompressed_data_size=4169428 diff=0
18:44:44:WU00:FS02:0xa3:- Digital signature verified
18:44:44:WU00:FS02:0xa3:
18:44:44:WU00:FS02:0xa3:Project: 6099 (Run 4, Clone 59, Gen 303)
18:44:44:WU00:FS02:0xa3:
18:44:44:WU00:FS02:0xa3:Assembly optimizations on if available.
18:44:44:WU00:FS02:0xa3:Entering M.D.
18:44:50:WU00:FS02:0xa3:Using Gromacs checkpoints
18:44:50:WU00:FS02:0xa3:Mapping NT from 4 to 4 
18:44:51:WU00:FS02:0xa3:Resuming from checkpoint
18:44:51:WU00:FS02:0xa3:Verified 00/wudata_01.log
18:44:51:WU00:FS02:0xa3:Verified 00/wudata_01.trr
18:44:51:WU00:FS02:0xa3:Verified 00/wudata_01.edr
18:44:51:WU00:FS02:0xa3:Completed 5740 out of 500000 steps  (1%)
18:44:54:WU02:FS01:0x15:Setting checkpoint frequency: 400000
18:44:54:WU02:FS01:0x15:Completed         3 out of 40000000 steps (0%).
18:44:54:WU01:FS00:0x15:Setting checkpoint frequency: 400000
18:44:54:WU01:FS00:0x15:Completed         3 out of 40000000 steps (0%).
18:47:52:WU01:FS00:0x15:Completed    400000 out of 40000000 steps (1%).
18:47:52:WU01:FS00:0x15:mdrun_gpu returned 52
18:47:52:WU01:FS00:0x15:NANs detected on GPU
18:47:52:WU01:FS00:0x15:
18:47:52:WU01:FS00:0x15:Folding@home Core Shutdown: UNSTABLE_MACHINE
18:47:52:WU02:FS01:0x15:Completed    400000 out of 40000000 steps (1%).
18:47:53:WARNING:WU01:FS00:FahCore returned: UNSTABLE_MACHINE (122 = 0x7a)
18:47:53:WU01:FS00:Starting
18:47:53:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Nikos/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_15.fah/FahCore_15.exe -dir 01 -suffix 01 -version 703 -lifeline 8576 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
18:47:53:WU01:FS00:Started FahCore on PID 3340
18:47:53:WU01:FS00:Core PID:6012
18:47:53:WU01:FS00:FahCore 0x15 started
18:47:53:WU01:FS00:0x15:
18:47:53:WU01:FS00:0x15:*------------------------------*
18:47:53:WU01:FS00:0x15:Folding@Home GPU Core
18:47:53:WU01:FS00:0x15:Version                2.25 (Wed May 9 17:03:01 EDT 2012)
18:47:53:WU01:FS00:0x15:Build host             AmoebaRemote
18:47:53:WU01:FS00:0x15:Board Type             NVIDIA/CUDA
18:47:53:WU01:FS00:0x15:Core                   15
18:47:53:WU01:FS00:0x15:
18:47:53:WU01:FS00:0x15:Window's signal control handler registered.
18:47:53:WU01:FS00:0x15:Preparing to commence simulation
18:47:53:WU01:FS00:0x15:- Looking at optimizations...
18:47:53:WU01:FS00:0x15:DeleteFrameFiles: successfully deleted file=01/wudata_01.ckp
18:47:53:WU01:FS00:0x15:- Created dyn
18:47:53:WU01:FS00:0x15:- Files status OK
18:47:53:WU01:FS00:0x15:sizeof(CORE_PACKET_HDR) = 512 file=<>
18:47:53:WU01:FS00:0x15:- Expanded 79426 -> 307810 (decompressed 387.5 percent)
18:47:53:WU01:FS00:0x15:Called DecompressByteArray: compressed_data_size=79426 data_size=307810, decompressed_data_size=307810 diff=0
18:47:53:WU01:FS00:0x15:- Digital signature verified
18:47:53:WU01:FS00:0x15:
18:47:53:WU01:FS00:0x15:Project: 7660 (Run 1077, Clone 0, Gen 188)
18:47:53:WU01:FS00:0x15:
18:47:53:WU01:FS00:0x15:Assembly optimizations on if available.
18:47:53:WU01:FS00:0x15:Entering M.D.
18:47:55:WU01:FS00:0x15:Tpr hash 01/wudata_01.tpr:  3486599561 625215593 1251349550 4210270855 2310075965
18:47:55:WU01:FS00:0x15:GPU device id=0
18:47:55:WU01:FS00:0x15:Working on Protein
18:47:55:WU01:FS00:0x15:Client config unavailable.
18:47:55:WU01:FS00:0x15:Starting GUI Server
18:48:46:Saving configuration to config.xml
18:48:46:<config>
18:48:46:  <!-- Folding Slot Configuration -->
18:48:46:  <power v='full'/>
18:48:46:
18:48:46:  <!-- Network -->
18:48:46:  <proxy v=':8080'/>
18:48:46:
18:48:46:  <!-- User Information -->
18:48:46:  <passkey v='********************************'/>
18:48:46:  <team v='44079'/>
18:48:46:  <user v='ntsarb'/>
18:48:46:
18:48:46:  <!-- Folding Slots -->
18:48:46:  <slot id='1' type='GPU'/>
18:48:46:  <slot id='2' type='CPU'>
18:48:46:    <cpus v='4'/>
18:48:46:  </slot>
18:48:46:</config>
18:48:46:FS00:Shutting core down
18:48:53:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
18:48:53:WARNING:WU01:Slot ID 0 no longer exists, migrating to FS01
18:48:58:FS01:Shutting core down
18:48:58:FS02:Shutting core down
18:49:01:WU02:FS01:0x15:Client no longer detected. Shutting down core 
18:49:01:WU02:FS01:0x15:
18:49:01:WU02:FS01:0x15:Folding@home Core Shutdown: CLIENT_DIED
18:49:01:WU02:FS01:FahCore returned: INTERRUPTED (102 = 0x66)
18:49:04:WU00:FS02:FahCore returned: INTERRUPTED (102 = 0x66)
18:49:07:WU00:FS02:Starting
18:49:07:WU00:FS02:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Nikos/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/Core_a3.fah/FahCore_a3.exe -dir 00 -suffix 01 -version 703 -lifeline 8576 -checkpoint 15 -np 4
18:49:07:WU00:FS02:Started FahCore on PID 3692
18:49:07:WU00:FS02:Core PID:9188
18:49:07:WU00:FS02:FahCore 0xa3 started
18:49:07:WU01:FS01:Starting
18:49:07:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Nikos/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_15.fah/FahCore_15.exe -dir 01 -suffix 01 -version 703 -lifeline 8576 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
18:49:07:WU01:FS01:Started FahCore on PID 8820
18:49:07:WU01:FS01:Core PID:4596
18:49:07:WU01:FS01:FahCore 0x15 started
18:49:07:WU00:FS02:0xa3:
18:49:07:WU00:FS02:0xa3:*------------------------------*
18:49:07:WU00:FS02:0xa3:Folding@Home Gromacs SMP Core
18:49:07:WU00:FS02:0xa3:Version 2.27 (Dec. 15, 2010)
18:49:07:WU00:FS02:0xa3:
18:49:07:WU00:FS02:0xa3:Preparing to commence simulation
18:49:07:WU00:FS02:0xa3:- Looking at optimizations...
18:49:07:WU00:FS02:0xa3:- Files status OK
18:49:07:WU00:FS02:0xa3:- Expanded 3813569 -> 4169428 (decompressed 109.3 percent)
18:49:07:WU00:FS02:0xa3:Called DecompressByteArray: compressed_data_size=3813569 data_size=4169428, decompressed_data_size=4169428 diff=0
18:49:07:WU01:FS01:0x15:
18:49:07:WU01:FS01:0x15:*------------------------------*
18:49:07:WU01:FS01:0x15:Folding@Home GPU Core
18:49:07:WU01:FS01:0x15:Version                2.25 (Wed May 9 17:03:01 EDT 2012)
18:49:07:WU01:FS01:0x15:Build host             AmoebaRemote
18:49:07:WU01:FS01:0x15:Board Type             NVIDIA/CUDA
18:49:07:WU01:FS01:0x15:Core                   15
18:49:07:WU01:FS01:0x15:GPU device info vendor=0 device=0 name=NA match=0 deviceId=1
18:49:07:WU01:FS01:0x15:
18:49:07:WU01:FS01:0x15:Window's signal control handler registered.
18:49:07:WU01:FS01:0x15:Preparing to commence simulation
18:49:07:WU01:FS01:0x15:- Looking at optimizations...
18:49:07:WU01:FS01:0x15:- Files status OK
18:49:07:WU01:FS01:0x15:sizeof(CORE_PACKET_HDR) = 512 file=<>
18:49:07:WU01:FS01:0x15:- Expanded 79426 -> 307810 (decompressed 387.5 percent)
18:49:07:WU01:FS01:0x15:Called DecompressByteArray: compressed_data_size=79426 data_size=307810, decompressed_data_size=307810 diff=0
18:49:07:WU01:FS01:0x15:- Digital signature verified
18:49:07:WU01:FS01:0x15:
18:49:07:WU01:FS01:0x15:Project: 7660 (Run 1077, Clone 0, Gen 188)
18:49:07:WU01:FS01:0x15:
18:49:07:WU01:FS01:0x15:Assembly optimizations on if available.
18:49:07:WU01:FS01:0x15:Entering M.D.
18:49:08:WU00:FS02:0xa3:- Digital signature verified
18:49:08:WU00:FS02:0xa3:
18:49:08:WU00:FS02:0xa3:Project: 6099 (Run 4, Clone 59, Gen 303)
18:49:08:WU00:FS02:0xa3:
18:49:08:WU00:FS02:0xa3:Assembly optimizations on if available.
18:49:08:WU00:FS02:0xa3:Entering M.D.
18:49:09:WU01:FS01:0x15:Tpr hash 01/wudata_01.tpr:  3486599561 625215593 1251349550 4210270855 2310075965
18:49:09:WU01:FS01:0x15:GPU device id=1
18:49:09:WU01:FS01:0x15:Working on Protein
18:49:09:WU01:FS01:0x15:Client config unavailable.
18:49:09:WU01:FS01:0x15:Starting GUI Server
18:49:10:FS01:Shutting core down
18:49:14:WU00:FS02:0xa3:Using Gromacs checkpoints
18:49:14:WU00:FS02:0xa3:Mapping NT from 4 to 4 
18:49:14:WU00:FS02:0xa3:Resuming from checkpoint
18:49:14:WU00:FS02:0xa3:Verified 00/wudata_01.log
18:49:14:WU00:FS02:0xa3:Verified 00/wudata_01.trr
18:49:14:WU00:FS02:0xa3:Verified 00/wudata_01.edr
18:49:15:WU00:FS02:0xa3:Completed 5740 out of 500000 steps  (1%)
18:49:17:WU01:FS01:FahCore returned: INTERRUPTED (102 = 0x66)
18:49:17:WU02:FS01:Starting
18:49:17:WU02:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Nikos/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_15.fah/FahCore_15.exe -dir 02 -suffix 01 -version 703 -lifeline 8576 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
18:49:17:WU02:FS01:Started FahCore on PID 8952
18:49:17:WU02:FS01:Core PID:2436
18:49:17:WU02:FS01:FahCore 0x15 started
18:49:18:WU02:FS01:0x15:
18:49:18:WU02:FS01:0x15:*------------------------------*
18:49:18:WU02:FS01:0x15:Folding@Home GPU Core
18:49:18:WU02:FS01:0x15:Version                2.25 (Wed May 9 17:03:01 EDT 2012)
18:49:18:WU02:FS01:0x15:Build host             AmoebaRemote
18:49:18:WU02:FS01:0x15:Board Type             NVIDIA/CUDA
18:49:18:WU02:FS01:0x15:Core                   15
18:49:18:WU02:FS01:0x15:GPU device info vendor=0 device=0 name=NA match=0 deviceId=1
18:49:18:WU02:FS01:0x15:
18:49:18:WU02:FS01:0x15:Window's signal control handler registered.
18:49:18:WU02:FS01:0x15:Preparing to commence simulation
18:49:18:WU02:FS01:0x15:- Looking at optimizations...
18:49:18:WU02:FS01:0x15:- Files status OK
18:49:18:WU02:FS01:0x15:sizeof(CORE_PACKET_HDR) = 512 file=<>
18:49:18:WU02:FS01:0x15:- Expanded 79804 -> 307810 (decompressed 385.7 percent)
18:49:18:WU02:FS01:0x15:Called DecompressByteArray: compressed_data_size=79804 data_size=307810, decompressed_data_size=307810 diff=0
18:49:18:WU02:FS01:0x15:- Digital signature verified
18:49:18:WU02:FS01:0x15:
18:49:18:WU02:FS01:0x15:Project: 7660 (Run 645, Clone 0, Gen 185)
18:49:18:WU02:FS01:0x15:
18:49:18:WU02:FS01:0x15:Assembly optimizations on if available.
18:49:18:WU02:FS01:0x15:Entering M.D.
18:49:19:WU02:FS01:0x15:Will resume from checkpoint file 02/wudata_01.ckp
18:49:19:WU02:FS01:0x15:Tpr hash 02/wudata_01.tpr:  3787850416 1496236415 1649889738 1869792425 2509146834
18:49:19:WU02:FS01:0x15:GPU device id=1
18:49:19:WU02:FS01:0x15:Working on Protein
18:49:19:WU02:FS01:0x15:Client config unavailable.
18:49:20:WU02:FS01:0x15:Starting GUI Server
18:49:30:FS01:Shutting core down
18:49:37:WU02:FS01:0x15:Client no longer detected. Shutting down core 
18:49:37:WU02:FS01:0x15:
18:49:37:WU02:FS01:0x15:Folding@home Core Shutdown: CLIENT_DIED
18:49:38:WU02:FS01:FahCore returned: INTERRUPTED (102 = 0x66)
18:50:07:WU01:FS01:Starting
18:50:07:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Nikos/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_15.fah/FahCore_15.exe -dir 01 -suffix 01 -version 703 -lifeline 8576 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
18:50:07:WU01:FS01:Started FahCore on PID 7684
18:50:07:WU01:FS01:Core PID:3620
18:50:07:WU01:FS01:FahCore 0x15 started
18:50:08:WU01:FS01:0x15:
18:50:08:WU01:FS01:0x15:*------------------------------*
18:50:08:WU01:FS01:0x15:Folding@Home GPU Core
18:50:08:WU01:FS01:0x15:Version                2.25 (Wed May 9 17:03:01 EDT 2012)
18:50:08:WU01:FS01:0x15:Build host             AmoebaRemote
18:50:08:WU01:FS01:0x15:Board Type             NVIDIA/CUDA
18:50:08:WU01:FS01:0x15:Core                   15
18:50:08:WU01:FS01:0x15:GPU device info vendor=0 device=0 name=NA match=0 deviceId=1
18:50:08:WU01:FS01:0x15:
18:50:08:WU01:FS01:0x15:Window's signal control handler registered.
18:50:08:WU01:FS01:0x15:Preparing to commence simulation
18:50:08:WU01:FS01:0x15:- Looking at optimizations...
18:50:08:WU01:FS01:0x15:- Files status OK
18:50:08:WU01:FS01:0x15:sizeof(CORE_PACKET_HDR) = 512 file=<>
18:50:08:WU01:FS01:0x15:- Expanded 79426 -> 307810 (decompressed 387.5 percent)
18:50:08:WU01:FS01:0x15:Called DecompressByteArray: compressed_data_size=79426 data_size=307810, decompressed_data_size=307810 diff=0
18:50:08:WU01:FS01:0x15:- Digital signature verified
18:50:08:WU01:FS01:0x15:
18:50:08:WU01:FS01:0x15:Project: 7660 (Run 1077, Clone 0, Gen 188)
18:50:08:WU01:FS01:0x15:
18:50:08:WU01:FS01:0x15:Assembly optimizations on if available.
18:50:08:WU01:FS01:0x15:Entering M.D.
18:50:09:WU01:FS01:0x15:Tpr hash 01/wudata_01.tpr:  3486599561 625215593 1251349550 4210270855 2310075965
18:50:09:WU01:FS01:0x15:GPU device id=1
18:50:09:WU01:FS01:0x15:Working on Protein
18:50:09:WU01:FS01:0x15:Client config unavailable.
18:50:10:WU01:FS01:0x15:Starting GUI Server
18:50:36:Saving configuration to config.xml
18:50:36:<config>
18:50:36:  <!-- Folding Slot Configuration -->
18:50:36:  <power v='full'/>
18:50:36:
18:50:36:  <!-- Network -->
18:50:36:  <proxy v=':8080'/>
18:50:36:
18:50:36:  <!-- User Information -->
18:50:36:  <passkey v='********************************'/>
18:50:36:  <team v='44079'/>
18:50:36:  <user v='ntsarb'/>
18:50:36:
18:50:36:  <!-- Folding Slots -->
18:50:36:  <slot id='1' type='GPU'/>
18:50:36:  <slot id='2' type='CPU'>
18:50:36:    <cpus v='4'/>
18:50:36:  </slot>
18:50:36:</config>
18:51:10:WU01:FS01:0x15:Setting checkpoint frequency: 400000
18:51:10:WU01:FS01:0x15:Completed         3 out of 40000000 steps (0%).
18:54:12:WU01:FS01:0x15:Completed    400000 out of 40000000 steps (1%).
18:57:11:WU01:FS01:0x15:Completed    800000 out of 40000000 steps (2%).
19:00:10:WU01:FS01:0x15:Completed   1200000 out of 40000000 steps (3%).
19:03:07:WU01:FS01:0x15:Completed   1600000 out of 40000000 steps (4%).
19:04:15:WU00:FS02:0xa3:Completed 10000 out of 500000 steps  (2%)
19:06:04:WU01:FS01:0x15:Completed   2000000 out of 40000000 steps (5%).
19:09:01:WU01:FS01:0x15:Completed   2400000 out of 40000000 steps (6%).
19:11:57:WU01:FS01:0x15:Completed   2800000 out of 40000000 steps (7%).
19:14:54:WU01:FS01:0x15:Completed   3200000 out of 40000000 steps (8%).
19:17:51:WU01:FS01:0x15:Completed   3600000 out of 40000000 steps (9%).
19:20:48:WU01:FS01:0x15:Completed   4000000 out of 40000000 steps (10%).
19:21:24:WU00:FS02:0xa3:Completed 15000 out of 500000 steps  (3%)
19:23:45:WU01:FS01:0x15:Completed   4400000 out of 40000000 steps (11%).
19:26:42:WU01:FS01:0x15:Completed   4800000 out of 40000000 steps (12%).
19:29:41:WU01:FS01:0x15:Completed   5200000 out of 40000000 steps (13%).
19:32:41:WU01:FS01:0x15:Completed   5600000 out of 40000000 steps (14%).
19:35:39:WU01:FS01:0x15:Completed   6000000 out of 40000000 steps (15%).
19:38:37:WU01:FS01:0x15:Completed   6400000 out of 40000000 steps (16%).
19:38:50:WU00:FS02:0xa3:Completed 20000 out of 500000 steps  (4%)
19:41:35:WU01:FS01:0x15:Completed   6800000 out of 40000000 steps (17%).
19:44:32:WU01:FS01:0x15:Completed   7200000 out of 40000000 steps (18%).
19:47:30:WU01:FS01:0x15:Completed   7600000 out of 40000000 steps (19%).
19:50:27:WU01:FS01:0x15:Completed   8000000 out of 40000000 steps (20%).
19:53:26:WU01:FS01:0x15:Completed   8400000 out of 40000000 steps (21%).
19:56:06:WU00:FS02:0xa3:Completed 25000 out of 500000 steps  (5%)
19:56:24:WU01:FS01:0x15:Completed   8800000 out of 40000000 steps (22%).
19:59:21:WU01:FS01:0x15:Completed   9200000 out of 40000000 steps (23%).
20:02:18:WU01:FS01:0x15:Completed   9600000 out of 40000000 steps (24%).
20:05:17:WU01:FS01:0x15:Completed  10000000 out of 40000000 steps (25%).
20:08:16:WU01:FS01:0x15:Completed  10400000 out of 40000000 steps (26%).
20:11:15:WU01:FS01:0x15:Completed  10800000 out of 40000000 steps (27%).
20:13:38:WU00:FS02:0xa3:Completed 30000 out of 500000 steps  (6%)
20:14:13:WU01:FS01:0x15:Completed  11200000 out of 40000000 steps (28%).
20:17:12:WU01:FS01:0x15:Completed  11600000 out of 40000000 steps (29%).
20:20:09:WU01:FS01:0x15:Completed  12000000 out of 40000000 steps (30%).
20:23:06:WU01:FS01:0x15:Completed  12400000 out of 40000000 steps (31%).
20:26:05:WU01:FS01:0x15:Completed  12800000 out of 40000000 steps (32%).
20:29:02:WU01:FS01:0x15:Completed  13200000 out of 40000000 steps (33%).
20:31:03:WU00:FS02:0xa3:Completed 35000 out of 500000 steps  (7%)
20:32:00:WU01:FS01:0x15:Completed  13600000 out of 40000000 steps (34%).
20:34:59:WU01:FS01:0x15:Completed  14000000 out of 40000000 steps (35%).
20:37:31:WU01:FS01:0x15:Completed  14400000 out of 40000000 steps (36%).
20:37:31:WU01:FS01:0x15:mdrun_gpu returned 52
20:37:31:WU01:FS01:0x15:NANs detected on GPU
20:37:31:WU01:FS01:0x15:
20:37:31:WU01:FS01:0x15:Folding@home Core Shutdown: UNSTABLE_MACHINE
20:37:31:WARNING:WU01:FS01:FahCore returned: UNSTABLE_MACHINE (122 = 0x7a)
20:37:31:WU02:FS01:Starting
20:37:31:WU02:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Nikos/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_15.fah/FahCore_15.exe -dir 02 -suffix 01 -version 703 -lifeline 8576 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
20:37:31:WU02:FS01:Started FahCore on PID 9776
20:37:31:WU02:FS01:Core PID:2700
20:37:31:WU02:FS01:FahCore 0x15 started
20:37:32:WU02:FS01:0x15:
20:37:32:WU02:FS01:0x15:*------------------------------*
20:37:32:WU02:FS01:0x15:Folding@Home GPU Core
20:37:32:WU02:FS01:0x15:Version                2.25 (Wed May 9 17:03:01 EDT 2012)
20:37:32:WU02:FS01:0x15:Build host             AmoebaRemote
20:37:32:WU02:FS01:0x15:Board Type             NVIDIA/CUDA
20:37:32:WU02:FS01:0x15:Core                   15
20:37:32:WU02:FS01:0x15:GPU device info vendor=0 device=0 name=NA match=0 deviceId=1
20:37:32:WU02:FS01:0x15:
20:37:32:WU02:FS01:0x15:Window's signal control handler registered.
20:37:32:WU02:FS01:0x15:Preparing to commence simulation
20:37:32:WU02:FS01:0x15:- Looking at optimizations...
20:37:32:WU02:FS01:0x15:- Files status OK
20:37:32:WU02:FS01:0x15:sizeof(CORE_PACKET_HDR) = 512 file=<>
20:37:32:WU02:FS01:0x15:- Expanded 79804 -> 307810 (decompressed 385.7 percent)
20:37:32:WU02:FS01:0x15:Called DecompressByteArray: compressed_data_size=79804 data_size=307810, decompressed_data_size=307810 diff=0
20:37:32:WU02:FS01:0x15:- Digital signature verified
20:37:32:WU02:FS01:0x15:
20:37:32:WU02:FS01:0x15:Project: 7660 (Run 645, Clone 0, Gen 185)
20:37:32:WU02:FS01:0x15:
20:37:32:WU02:FS01:0x15:Assembly optimizations on if available.
20:37:32:WU02:FS01:0x15:Entering M.D.
20:37:33:WU02:FS01:0x15:Will resume from checkpoint file 02/wudata_01.ckp
20:37:33:WU02:FS01:0x15:Tpr hash 02/wudata_01.tpr:  3787850416 1496236415 1649889738 1869792425 2509146834
20:37:33:WU02:FS01:0x15:GPU device id=1
20:37:33:WU02:FS01:0x15:Working on Protein
20:37:33:WU02:FS01:0x15:Client config unavailable.
20:37:34:WU02:FS01:0x15:Starting GUI Server
20:38:35:WU02:FS01:0x15:Resuming from checkpoint
20:38:35:WU02:FS01:0x15:fcCheckPointResume: retreived and current tpr file hash:
20:38:35:WU02:FS01:0x15:   0   3787850416   3787850416
20:38:35:WU02:FS01:0x15:   1   1496236415   1496236415
20:38:35:WU02:FS01:0x15:   2   1649889738   1649889738
20:38:35:WU02:FS01:0x15:   3   1869792425   1869792425
20:38:35:WU02:FS01:0x15:   4   2509146834   2509146834
20:38:35:WU02:FS01:0x15:fcCheckPointResume: file hashes same.
20:38:35:WU02:FS01:0x15:fcCheckPointResume: state restored.
20:38:35:WU02:FS01:0x15:fcCheckPointResume: name 02/wudata_01.log Verified 02/wudata_01.log
20:38:35:WU02:FS01:0x15:fcCheckPointResume: name 02/wudata_01.trr Verified 02/wudata_01.trr
20:38:35:WU02:FS01:0x15:fcCheckPointResume: name 02/wudata_01.xtc Verified 02/wudata_01.xtc
20:38:35:WU02:FS01:0x15:fcCheckPointResume: name 02/wudata_01.edr Verified 02/wudata_01.edr
20:38:35:WU02:FS01:0x15:fcCheckPointResume: state restored 2
20:38:35:WU02:FS01:0x15:Resumed from checkpoint
20:38:35:WU02:FS01:0x15:Setting checkpoint frequency: 400000
20:38:35:WU02:FS01:0x15:Completed    400001 out of 40000000 steps (1%).
20:41:34:WU02:FS01:0x15:Completed    800000 out of 40000000 steps (2%).
20:44:32:WU02:FS01:0x15:Completed   1200000 out of 40000000 steps (3%).
20:47:31:WU02:FS01:0x15:Completed   1600000 out of 40000000 steps (4%).
20:48:42:WU00:FS02:0xa3:Completed 40000 out of 500000 steps  (8%)
20:50:28:WU02:FS01:0x15:Completed   2000000 out of 40000000 steps (5%).
20:53:26:WU02:FS01:0x15:Completed   2400000 out of 40000000 steps (6%).
20:56:24:WU02:FS01:0x15:Completed   2800000 out of 40000000 steps (7%).
20:59:26:WU02:FS01:0x15:Completed   3200000 out of 40000000 steps (8%).
21:02:27:WU02:FS01:0x15:Completed   3600000 out of 40000000 steps (9%).
21:05:29:WU02:FS01:0x15:Completed   4000000 out of 40000000 steps (10%).
21:06:39:WU00:FS02:0xa3:Completed 45000 out of 500000 steps  (9%)
21:08:32:WU02:FS01:0x15:Completed   4400000 out of 40000000 steps (11%).
21:11:34:WU02:FS01:0x15:Completed   4800000 out of 40000000 steps (12%).
21:14:36:WU02:FS01:0x15:Completed   5200000 out of 40000000 steps (13%).
21:17:36:WU02:FS01:0x15:Completed   5600000 out of 40000000 steps (14%).
21:20:38:WU02:FS01:0x15:Completed   6000000 out of 40000000 steps (15%).
21:23:39:WU02:FS01:0x15:Completed   6400000 out of 40000000 steps (16%).
21:25:18:WU00:FS02:0xa3:Completed 50000 out of 500000 steps  (10%)
21:26:39:WU02:FS01:0x15:Completed   6800000 out of 40000000 steps (17%).
21:29:40:WU02:FS01:0x15:Completed   7200000 out of 40000000 steps (18%).
21:32:40:WU02:FS01:0x15:Completed   7600000 out of 40000000 steps (19%).

Re: UNSTABLE_MACHINE resets GPU at ~1.51%

Posted: Sun Sep 15, 2013 8:33 am
by bruce
It does seem that Project: 7660 (Run 1077, Clone 0, Gen 188) is an unstable WU. I've reported it and presumably it won't be reassigned after 8am PDT tomorrow.

Unfortunately the software can't distinguish between defective hardware and problems with the data.

Re: UNSTABLE_MACHINE resets GPU at ~1.51%

Posted: Sun Sep 15, 2013 1:00 pm
by ntsarb
Bruce, the FaH client tried to recompute project 7660 (Run 1077, Clone 0, Gen 188) and reached 41% before failing, while the previous attempt failed at 36%. I don't know how FaH operates and I am wondering, if the unit is faulty, is it normal to fail at different points?

7im, my CPU runs at a maximum of 60C (thanks to a Cooler Master Hyper 612s, I presume) and the GPU temp reaches 63C (100% utilisation) while folding. I have configured FaH to use 4 out of 8 CPU threads, so that I can use my computer for lightweight everyday tasks while folding.

I have configured FaH not to use the second GPU, as I have confirmed that the (primary) PCI-E slot it's attached to is causing FaH to fail. To tackle this, I have ordered a new motherboard, the MSI Big Bang Trinergy, and I'm waiting for it to arrive. I was not looking for a "gamer's board", but this was the only affordable board I could find in the UK, and it's considered to be made of high-quality components. I was very surprised how hard it is to find an LGA1156 board on the market nowadays, especially since new CPUs are not so much faster as to justify replacing the whole system (CPU, mobo, memory). I'm hoping that replacing the motherboard will let me keep using this system for another year or more, until Haswell-E systems come to market and become affordable.

Re: UNSTABLE_MACHINE resets GPU at ~1.51%

Posted: Sun Sep 15, 2013 3:05 pm
by 7im
ntsarb wrote:...snip

I don't know how FaH operates and I am wondering, if the unit is faulty, is it normal to fail at different points?
Bad work units typically fail at the same frame number every time, while bad hardware (or bad HW settings) typically fails at different places. We use this in troubleshooting bad hardware vs. bad WUs. It's not guaranteed, though. There's always an exception to prove the rule. ;)
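
If you want to check this against your own log, something like the following rough Python sketch might help. It only assumes the line formats visible in the log posted above ("Completed N out of M steps" and "FahCore returned: UNSTABLE_MACHINE"); other client or core versions may print these differently. The function names are made up for illustration and are not part of FAHClient.

```python
import re

# Patterns taken from the log lines posted in this thread (assumption:
# other client/core versions may format these lines differently).
PROGRESS = re.compile(r"(WU\d+):(FS\d+):0x\w+:Completed\s+(\d+) out of (\d+) steps")
FAILURE = re.compile(r"(WU\d+):(FS\d+):FahCore returned: UNSTABLE_MACHINE")


def failure_points(log_lines):
    """Return (wu, slot, last_step, total_steps) for each UNSTABLE_MACHINE failure."""
    last_progress = {}  # (wu, slot) -> (last completed step, total steps)
    failures = []
    for line in log_lines:
        m = PROGRESS.search(line)
        if m:
            wu, slot, done, total = m.groups()
            last_progress[(wu, slot)] = (int(done), int(total))
            continue
        m = FAILURE.search(line)
        if m:
            key = (m.group(1), m.group(2))
            # (0, 0) means no progress line was seen before the failure.
            done, total = last_progress.get(key, (0, 0))
            failures.append((key[0], key[1], done, total))
    return failures


def same_point(failures):
    """True if every recorded failure happened after the same step count."""
    return len({f[2] for f in failures}) == 1
```

Feed it the lines of the log (for example `open("log.txt").read().splitlines()`, with whatever path your log actually has) and compare only failures that belong to the same work unit. Failures at the same step hint at a bad WU; failures scattered across different steps hint at hardware or settings, subject to the exceptions mentioned above.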

Re: UNSTABLE_MACHINE resets GPU at ~1.51%

Posted: Mon Sep 16, 2013 9:02 pm
by ntsarb
bruce wrote:It does seem that Project: 7660 (Run 1077, Clone 0, Gen 188) is an unstable WU. I've reported it and presumably it won't be reassigned after 8am PDT tomorrow.

Unfortunately the software can't distinguish between defective hardware and problems with the data.
FaH is still trying to compute 7660 (Run 134, Clone 0, Gen 366) and keeps failing.

Re: UNSTABLE_MACHINE resets GPU at ~1.51%

Posted: Wed Sep 18, 2013 6:09 pm
by bruce
Project: 7660 (Run 134, Clone 0, Gen 366) has absolutely nothing to do with the (in-)stability of Project: 7660 (Run 1077, Clone 0, Gen 188). Both have been assigned many times (I don't know why) but Project: 7660 (Run 1077, Clone 0, Gen 188) has no error reports. Project: 7660 (Run 134, Clone 0, Gen 366) does.

The log you posted doesn't show instabilities in Gen 366. What kind of error was it?
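
To work out which work unit each error belongs to, one option is to pair every "FahCore returned" line with the most recent "Project: ... (Run, Clone, Gen)" line for the same WU/slot. The Python below is only a sketch under that assumption, using the two line formats that appear in the posted log; `failures_by_work_unit` is a hypothetical helper name, not something the client provides.

```python
import re

# Both patterns are modelled on lines in the log posted in this thread
# (assumption: other client/core versions may print them differently).
PRCG = re.compile(
    r"(WU\d+):(FS\d+):0x\w+:Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)"
)
RETURNED = re.compile(r"(WU\d+):(FS\d+):FahCore returned: (\w+)")


def failures_by_work_unit(log_lines):
    """Pair each 'FahCore returned' code with the last Project/Run/Clone/Gen
    seen for the same WU and slot. Returns a list of (prcg, code) tuples,
    where prcg is (project, run, clone, gen) or None if none was seen."""
    current = {}  # (wu, slot) -> (project, run, clone, gen)
    results = []
    for line in log_lines:
        m = PRCG.search(line)
        if m:
            key = (m.group(1), m.group(2))
            current[key] = tuple(int(x) for x in m.group(3, 4, 5, 6))
            continue
        m = RETURNED.search(line)
        if m:
            key = (m.group(1), m.group(2))
            results.append((current.get(key), m.group(3)))
    return results
```

Running this over the full log would list each return code next to its Project/Run/Clone/Gen, which makes it easier to see whether the errors really come from Gen 366 or from a different work unit.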