8054 (Run 0, Clone 3617, Gen 12)

Moderators: Site Moderators, FAHC Science Team

wrinvert
Posts: 16
Joined: Sat Oct 22, 2011 8:24 am

8054 (Run 0, Clone 3617, Gen 12)

Post by wrinvert »

Completely different rig than the last one I posted. This has only started happening since the new core17 release and a driver update to 320.00.

Code: Select all

12:45:38:WU02:FS01:Connecting to assign-GPU.stanford.edu:80
12:45:38:WU02:FS01:News: Welcome to Folding@Home
12:45:38:WU02:FS01:Assigned to work server 171.67.108.143
12:45:38:WU02:FS01:Requesting new work unit for slot 01: RUNNING gpu:0:"GF114 [GeForce GTX 680]" from 171.67.108.143
12:45:38:WU02:FS01:Connecting to 171.67.108.143:8080
12:45:39:WU02:FS01:Downloading 59.52KiB
12:45:39:WU02:FS01:Download complete
12:45:39:WU02:FS01:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:8054 run:0 clone:3617 gen:12 core:0x15 unit:0x0000000e6953ee2f50626bb2d931ff01
12:45:40:WU02:FS01:Downloading project 8054 description
12:45:40:WU02:FS01:Connecting to fah-web.stanford.edu:80
12:45:40:WU02:FS01:Project 8054 description downloaded successfully
12:45:40:WU02:FS01:Starting
12:45:40:WU02:FS01:Running FahCore: "E:\V7 folding\FAHClient/FAHCoreWrapper.exe" "E:/V7 folding/cores/www.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_15.fah/FahCore_15.exe" -dir 02 -suffix 01 -version 702 -lifeline 2584 -checkpoint 15 -gpu 0
12:45:40:WU02:FS01:Started FahCore on PID 2528
12:45:40:WU02:FS01:Core PID:1240
12:45:40:WU02:FS01:FahCore 0x15 started
12:45:41:WU02:FS01:0x15:
12:45:41:WU02:FS01:0x15:*------------------------------*
12:45:41:WU02:FS01:0x15:Folding@Home GPU Core
12:45:41:WU02:FS01:0x15:Version                2.25 (Wed May 9 17:03:01 EDT 2012)
12:45:41:WU02:FS01:0x15:Build host             AmoebaRemote
12:45:41:WU02:FS01:0x15:Board Type             NVIDIA/CUDA
12:45:41:WU02:FS01:0x15:Core                   15
12:45:41:WU02:FS01:0x15:
12:45:41:WU02:FS01:0x15:Window's signal control handler registered.
12:45:41:WU02:FS01:0x15:Preparing to commence simulation
12:45:41:WU02:FS01:0x15:- Looking at optimizations...
12:45:41:WU02:FS01:0x15:DeleteFrameFiles: successfully deleted file=02/wudata_01.ckp
12:45:41:WU02:FS01:0x15:- Created dyn
12:45:41:WU02:FS01:0x15:- Files status OK
12:45:41:WU02:FS01:0x15:sizeof(CORE_PACKET_HDR) = 512 file=<>
12:45:41:WU02:FS01:0x15:- Expanded 60435 -> 264278 (decompressed 437.2 percent)
12:45:41:WU02:FS01:0x15:Called DecompressByteArray: compressed_data_size=60435 data_size=264278, decompressed_data_size=264278 diff=0
12:45:41:WU02:FS01:0x15:- Digital signature verified
12:45:41:WU02:FS01:0x15:
12:45:41:WU02:FS01:0x15:Project: 8054 (Run 0, Clone 3617, Gen 12)
12:45:41:WU02:FS01:0x15:
12:45:41:WU02:FS01:0x15:Assembly optimizations on if available.
12:45:41:WU02:FS01:0x15:Entering M.D.
12:45:43:WU02:FS01:0x15:Tpr hash 02/wudata_01.tpr:  49211287 2857462032 905143316 4137867563 3281227665
12:45:43:WU02:FS01:0x15:GPU device id=0
12:45:43:WU02:FS01:0x15:Working on Good ROcking Metal Altar for Chronical Sinners
12:45:43:WU02:FS01:0x15:Client config unavailable.
12:45:43:WU02:FS01:0x15:Starting GUI Server
12:46:56:WU02:FS01:0x15:Setting checkpoint frequency: 500000
12:46:56:WU02:FS01:0x15:Completed         3 out of 50000000 steps (0%).
12:48:23:WU02:FS01:0x15:Completed    500000 out of 50000000 steps (1%).
12:48:23:WU02:FS01:0x15:mdrun_gpu returned 52
12:48:23:WU02:FS01:0x15:NANs detected on GPU
12:48:23:WU02:FS01:0x15:
12:48:23:WU02:FS01:0x15:Folding@home Core Shutdown: UNSTABLE_MACHINE
12:48:24:WARNING:WU02:FS01:FahCore returned: UNSTABLE_MACHINE (122 = 0x7a)
12:48:24:WU02:FS01:Starting
12:48:24:WU02:FS01:Running FahCore: "E:\V7 folding\FAHClient/FAHCoreWrapper.exe" "E:/V7 folding/cores/www.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_15.fah/FahCore_15.exe" -dir 02 -suffix 01 -version 702 -lifeline 2584 -checkpoint 15 -gpu 0
12:48:24:WU02:FS01:Started FahCore on PID 1052
12:48:24:WU02:FS01:Core PID:2716
12:48:24:WU02:FS01:FahCore 0x15 started
12:48:24:WU02:FS01:0x15:
12:48:24:WU02:FS01:0x15:*------------------------------*
12:48:24:WU02:FS01:0x15:Folding@Home GPU Core
12:48:24:WU02:FS01:0x15:Version                2.25 (Wed May 9 17:03:01 EDT 2012)
12:48:24:WU02:FS01:0x15:Build host             AmoebaRemote
12:48:24:WU02:FS01:0x15:Board Type             NVIDIA/CUDA
12:48:24:WU02:FS01:0x15:Core                   15
12:48:24:WU02:FS01:0x15:
12:48:24:WU02:FS01:0x15:Window's signal control handler registered.
12:48:24:WU02:FS01:0x15:Preparing to commence simulation
12:48:24:WU02:FS01:0x15:- Looking at optimizations...
12:48:24:WU02:FS01:0x15:DeleteFrameFiles: successfully deleted file=02/wudata_01.ckp
12:48:24:WU02:FS01:0x15:- Created dyn
12:48:24:WU02:FS01:0x15:- Files status OK
12:48:24:WU02:FS01:0x15:sizeof(CORE_PACKET_HDR) = 512 file=<>
12:48:24:WU02:FS01:0x15:- Expanded 60435 -> 264278 (decompressed 437.2 percent)
12:48:24:WU02:FS01:0x15:Called DecompressByteArray: compressed_data_size=60435 data_size=264278, decompressed_data_size=264278 diff=0
12:48:24:WU02:FS01:0x15:- Digital signature verified
12:48:24:WU02:FS01:0x15:
12:48:24:WU02:FS01:0x15:Project: 8054 (Run 0, Clone 3617, Gen 12)
12:48:24:WU02:FS01:0x15:
12:48:24:WU02:FS01:0x15:Assembly optimizations on if available.
12:48:24:WU02:FS01:0x15:Entering M.D.
12:48:26:WU02:FS01:0x15:Tpr hash 02/wudata_01.tpr:  49211287 2857462032 905143316 4137867563 3281227665
12:48:26:WU02:FS01:0x15:GPU device id=0
12:48:26:WU02:FS01:0x15:Working on Good ROcking Metal Altar for Chronical Sinners
12:48:26:WU02:FS01:0x15:Client config unavailable.
12:48:26:WU02:FS01:0x15:Starting GUI Server
12:49:40:WU02:FS01:0x15:Setting checkpoint frequency: 500000
12:49:40:WU02:FS01:0x15:Completed         3 out of 50000000 steps (0%).
12:49:41:WARNING:WU02:FS01:Detected clock skew (1 mins 17 secs), adjusting time estimates
12:51:19:WU02:FS01:0x15:Completed    500000 out of 50000000 steps (1%).
12:52:57:WU02:FS01:0x15:Completed   1000000 out of 50000000 steps (2%).
12:52:57:WU02:FS01:0x15:mdrun_gpu returned 52
12:52:57:WU02:FS01:0x15:NANs detected on GPU
12:52:57:WU02:FS01:0x15:
12:52:57:WU02:FS01:0x15:Folding@home Core Shutdown: UNSTABLE_MACHINE
12:52:57:WARNING:WU02:FS01:FahCore returned: UNSTABLE_MACHINE (122 = 0x7a)
12:52:57:WU02:FS01:Starting
12:52:57:WU02:FS01:Running FahCore: "E:\V7 folding\FAHClient/FAHCoreWrapper.exe" "E:/V7 folding/cores/www.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_15.fah/FahCore_15.exe" -dir 02 -suffix 01 -version 702 -lifeline 2584 -checkpoint 15 -gpu 0
12:52:57:WU02:FS01:Started FahCore on PID 4912
12:52:57:WU02:FS01:Core PID:4996
12:52:57:WU02:FS01:FahCore 0x15 started
12:52:58:WU02:FS01:0x15:
12:52:58:WU02:FS01:0x15:*------------------------------*
12:52:58:WU02:FS01:0x15:Folding@Home GPU Core
12:52:58:WU02:FS01:0x15:Version                2.25 (Wed May 9 17:03:01 EDT 2012)
12:52:58:WU02:FS01:0x15:Build host             AmoebaRemote
12:52:58:WU02:FS01:0x15:Board Type             NVIDIA/CUDA
12:52:58:WU02:FS01:0x15:Core                   15
12:52:58:WU02:FS01:0x15:
12:52:58:WU02:FS01:0x15:Window's signal control handler registered.
12:52:58:WU02:FS01:0x15:Preparing to commence simulation
12:52:58:WU02:FS01:0x15:- Looking at optimizations...
12:52:58:WU02:FS01:0x15:DeleteFrameFiles: successfully deleted file=02/wudata_01.ckp
12:52:58:WU02:FS01:0x15:- Created dyn
12:52:58:WU02:FS01:0x15:- Files status OK
12:52:58:WU02:FS01:0x15:sizeof(CORE_PACKET_HDR) = 512 file=<>
12:52:58:WU02:FS01:0x15:- Expanded 60435 -> 264278 (decompressed 437.2 percent)
12:52:58:WU02:FS01:0x15:Called DecompressByteArray: compressed_data_size=60435 data_size=264278, decompressed_data_size=264278 diff=0
12:52:58:WU02:FS01:0x15:- Digital signature verified
12:52:58:WU02:FS01:0x15:
12:52:58:WU02:FS01:0x15:Project: 8054 (Run 0, Clone 3617, Gen 12)
12:52:58:WU02:FS01:0x15:
12:52:58:WU02:FS01:0x15:Assembly optimizations on if available.
12:52:58:WU02:FS01:0x15:Entering M.D.
12:52:59:WU02:FS01:0x15:Tpr hash 02/wudata_01.tpr:  49211287 2857462032 905143316 4137867563 3281227665
12:52:59:WU02:FS01:0x15:GPU device id=0
12:52:59:WU02:FS01:0x15:Working on Good ROcking Metal Altar for Chronical Sinners
12:52:59:WU02:FS01:0x15:Client config unavailable.
12:52:59:WU02:FS01:0x15:Starting GUI Server
12:54:13:WU02:FS01:0x15:Setting checkpoint frequency: 500000
12:54:13:WU02:FS01:0x15:Completed         3 out of 50000000 steps (0%).
12:55:40:WU02:FS01:0x15:Completed    500000 out of 50000000 steps (1%).
12:55:40:WU02:FS01:0x15:mdrun_gpu returned 52
12:55:40:WU02:FS01:0x15:NANs detected on GPU
12:55:40:WU02:FS01:0x15:
12:55:40:WU02:FS01:0x15:Folding@home Core Shutdown: UNSTABLE_MACHINE
12:55:40:WARNING:WU02:FS01:FahCore returned: UNSTABLE_MACHINE (122 = 0x7a)
12:55:40:WU02:FS01:Starting
12:55:40:WU02:FS01:Running FahCore: "E:\V7 folding\FAHClient/FAHCoreWrapper.exe" "E:/V7 folding/cores/www.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_15.fah/FahCore_15.exe" -dir 02 -suffix 01 -version 702 -lifeline 2584 -checkpoint 15 -gpu 0
12:55:40:WU02:FS01:Started FahCore on PID 2448
12:55:40:WU02:FS01:Core PID:4144
12:55:40:WU02:FS01:FahCore 0x15 started
12:55:41:WU02:FS01:0x15:
12:55:41:WU02:FS01:0x15:*------------------------------*
12:55:41:WU02:FS01:0x15:Folding@Home GPU Core
12:55:41:WU02:FS01:0x15:Version                2.25 (Wed May 9 17:03:01 EDT 2012)
12:55:41:WU02:FS01:0x15:Build host             AmoebaRemote
12:55:41:WU02:FS01:0x15:Board Type             NVIDIA/CUDA
12:55:41:WU02:FS01:0x15:Core                   15
12:55:41:WU02:FS01:0x15:
12:55:41:WU02:FS01:0x15:Window's signal control handler registered.
12:55:41:WU02:FS01:0x15:Preparing to commence simulation
12:55:41:WU02:FS01:0x15:- Looking at optimizations...
12:55:41:WU02:FS01:0x15:DeleteFrameFiles: successfully deleted file=02/wudata_01.ckp
12:55:41:WU02:FS01:0x15:- Created dyn
12:55:41:WU02:FS01:0x15:- Files status OK
12:55:41:WU02:FS01:0x15:sizeof(CORE_PACKET_HDR) = 512 file=<>
12:55:41:WU02:FS01:0x15:- Expanded 60435 -> 264278 (decompressed 437.2 percent)
12:55:41:WU02:FS01:0x15:Called DecompressByteArray: compressed_data_size=60435 data_size=264278, decompressed_data_size=264278 diff=0
12:55:41:WU02:FS01:0x15:- Digital signature verified
12:55:41:WU02:FS01:0x15:
12:55:41:WU02:FS01:0x15:Project: 8054 (Run 0, Clone 3617, Gen 12)
12:55:41:WU02:FS01:0x15:
12:55:41:WU02:FS01:0x15:Assembly optimizations on if available.
12:55:41:WU02:FS01:0x15:Entering M.D.
12:55:43:WU02:FS01:0x15:Tpr hash 02/wudata_01.tpr:  49211287 2857462032 905143316 4137867563 3281227665
12:55:43:WU02:FS01:0x15:GPU device id=0
12:55:43:WU02:FS01:0x15:Working on Good ROcking Metal Altar for Chronical Sinners
12:55:43:WU02:FS01:0x15:Client config unavailable.
12:55:43:WU02:FS01:0x15:Starting GUI Server
12:56:56:WU02:FS01:0x15:Setting checkpoint frequency: 500000
12:56:56:WU02:FS01:0x15:Completed         3 out of 50000000 steps (0%).
12:56:57:WARNING:WU02:FS01:Detected clock skew (1 mins 17 secs), adjusting time estimates
12:58:23:WU02:FS01:0x15:Completed    500000 out of 50000000 steps (1%).
12:58:23:WU02:FS01:0x15:mdrun_gpu returned 52
12:58:23:WU02:FS01:0x15:NANs detected on GPU
12:58:23:WU02:FS01:0x15:
12:58:23:WU02:FS01:0x15:Folding@home Core Shutdown: UNSTABLE_MACHINE
12:58:24:WARNING:WU02:FS01:FahCore returned: UNSTABLE_MACHINE (122 = 0x7a)
12:58:24:WU02:FS01:Starting
12:58:24:WU02:FS01:Running FahCore: "E:\V7 folding\FAHClient/FAHCoreWrapper.exe" "E:/V7 folding/cores/www.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_15.fah/FahCore_15.exe" -dir 02 -suffix 01 -version 702 -lifeline 2584 -checkpoint 15 -gpu 0
12:58:24:WU02:FS01:Started FahCore on PID 260
12:58:24:WU02:FS01:Core PID:4900
12:58:24:WU02:FS01:FahCore 0x15 started
12:58:24:WU02:FS01:0x15:
12:58:24:WU02:FS01:0x15:*------------------------------*
12:58:24:WU02:FS01:0x15:Folding@Home GPU Core
12:58:24:WU02:FS01:0x15:Version                2.25 (Wed May 9 17:03:01 EDT 2012)
12:58:24:WU02:FS01:0x15:Build host             AmoebaRemote
12:58:24:WU02:FS01:0x15:Board Type             NVIDIA/CUDA
12:58:24:WU02:FS01:0x15:Core                   15
12:58:24:WU02:FS01:0x15:
12:58:24:WU02:FS01:0x15:Window's signal control handler registered.
12:58:24:WU02:FS01:0x15:Preparing to commence simulation
12:58:24:WU02:FS01:0x15:- Looking at optimizations...
12:58:24:WU02:FS01:0x15:DeleteFrameFiles: successfully deleted file=02/wudata_01.ckp
12:58:24:WU02:FS01:0x15:- Created dyn
12:58:24:WU02:FS01:0x15:- Files status OK
12:58:24:WU02:FS01:0x15:sizeof(CORE_PACKET_HDR) = 512 file=<>
12:58:24:WU02:FS01:0x15:- Expanded 60435 -> 264278 (decompressed 437.2 percent)
12:58:24:WU02:FS01:0x15:Called DecompressByteArray: compressed_data_size=60435 data_size=264278, decompressed_data_size=264278 diff=0
12:58:24:WU02:FS01:0x15:- Digital signature verified
12:58:24:WU02:FS01:0x15:
12:58:24:WU02:FS01:0x15:Project: 8054 (Run 0, Clone 3617, Gen 12)
12:58:24:WU02:FS01:0x15:
12:58:24:WU02:FS01:0x15:Assembly optimizations on if available.
12:58:24:WU02:FS01:0x15:Entering M.D.
12:58:26:WU02:FS01:0x15:Tpr hash 02/wudata_01.tpr:  49211287 2857462032 905143316 4137867563 3281227665
12:58:26:WU02:FS01:0x15:GPU device id=0
12:58:26:WU02:FS01:0x15:Working on Good ROcking Metal Altar for Chronical Sinners
12:58:26:WU02:FS01:0x15:Client config unavailable.
12:58:26:WU02:FS01:0x15:Starting GUI Server
12:59:39:WU02:FS01:0x15:Setting checkpoint frequency: 500000
12:59:39:WU02:FS01:0x15:Completed         3 out of 50000000 steps (0%).
12:59:40:WARNING:WU02:FS01:Detected clock skew (1 mins 16 secs), adjusting time estimates
13:01:06:WU02:FS01:0x15:Completed    500000 out of 50000000 steps (1%).
13:01:06:WU02:FS01:0x15:mdrun_gpu returned 52
13:01:06:WU02:FS01:0x15:NANs detected on GPU
13:01:06:WU02:FS01:0x15:
13:01:06:WU02:FS01:0x15:Folding@home Core Shutdown: UNSTABLE_MACHINE
13:01:07:WARNING:WU02:FS01:FahCore returned: UNSTABLE_MACHINE (122 = 0x7a)
13:01:07:WARNING:WU02:FS01:Too many errors, failing
13:01:07:WU02:FS01:Sending unit results: id:02 state:SEND error:FAILED project:8054 run:0 clone:3617 gen:12 core:0x15 unit:0x0000000e6953ee2f50626bb2d931ff01
13:01:07:WU02:FS01:Connecting to 171.67.108.143:8080
13:01:07:WU02:FS01:Server responded WORK_QUIT (404)
13:01:07:WARNING:WU02:FS01:Server did not like results, dumping
13:01:07:WU02:FS01:Cleaning up
rickoic
Posts: 320
Joined: Sat May 23, 2009 4:49 pm
Hardware configuration: eVga x299 DARK 2070 Super, eVGA 2080, eVga 1070, eVga 2080 Super
MSI x399 eVga 2080, eVga 1070, eVga 1070, GT970
Location: Mississippi near Memphis, Tn

Re: 8054 (Run 0, Clone 3617, Gen 12)

Post by rickoic »

Have you tried uninstalling the driver update and reverting to the older drivers?

It also kinda looks like a bad WU, as it's failing at virtually the same spot.

Rick
I'm folding because Dec 2005 I had radical prostate surgery.
Lost brother to spinal cancer, brother-in-law to prostate cancer.
Several 1st cousins lost and a few who have survived.
wrinvert

Re: 8054 (Run 0, Clone 3617, Gen 12)

Post by wrinvert »

Not yet, but it looks like I'm going to have to try it. core17 seems to be issue free, but every time it grabs a core15 WU I get this error, on 2 different systems with different versions of the 680.
rickoic

Re: 8054 (Run 0, Clone 3617, Gen 12)

Post by rickoic »

Perhaps someone more up to date than I am can answer this question, but I seem to remember something about negative numbers causing problems back in the dark ages of folding. Is my memory faulty or not?
And if it tries to pass a negative number, might it consider that a NaN?
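
On the negative-number question, a minimal sketch in plain Python (not the FahCore's code; how the core actually detects NaNs is not shown in this thread). It assumes ordinary IEEE 754 floating point, where a negative number is a perfectly valid value and a NaN only appears when an operation has no defined result:

```python
import math

# A negative number is an ordinary floating-point value, not a NaN.
negative = -3.5
print(math.isnan(negative))          # False

# A NaN comes from an operation with no defined result, e.g. inf - inf.
inf = float("inf")
nan = inf - inf
print(math.isnan(nan))               # True

# NaN is the only value that does not compare equal to itself.
print(nan == nan)                    # False

# Once a NaN appears, it propagates through later arithmetic.
print(math.isnan(nan * 2.0 + 1.0))   # True
```

The propagation in the last line is why a single bad value, whatever produced it, can spoil every result computed after it.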

Tks
Rick
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am
Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB

Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
Location: Land Of The Long White Cloud

Re: 8054 (Run 0, Clone 3617, Gen 12)

Post by PantherX »

The WU isn't a bad one since it was successfully completed by another donor:
Your WU (P8054 R0 C3617 G12) was added to the stats database on 2013-05-20 12:13:20 for 3874 points of credit.
Are your GPUs overclocked? If so, you could try lowering the frequencies to stock and see what happens. FahCore_15 uses CUDA, so it is more stressful on Nvidia GPUs than FahCore_17, which uses OpenCL. Note that we generally don't recommend Beta drivers, only the WHQL ones, so is there a reason you are using the Beta ones?

A NaN error has many possible causes and is generally caused on the hardware side: overheating, overclocking, etc. The WU mostly fails at 1% but did manage to get up to 2% once, so it is likely caused by hardware issues. A bad WU tends to always fail at the exact same spot regardless of the hardware being used.
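
To check the "same spot" reasoning against a log, here is a rough Python sketch. The function name `failure_points` and the two patterns are assumptions based only on the log lines quoted in this thread; they are not part of FAHClient. It records the last reported step before each `FahCore returned:` line:

```python
import re

# Patterns inferred from the log lines quoted in this thread (an assumption,
# not an official log format).
STEP_RE = re.compile(r"Completed\s+(\d+) out of (\d+) steps")
FAIL_RE = re.compile(r"FahCore returned: (\w+)")


def failure_points(log_lines):
    """Return (last_reported_step, total_steps, error_name) per failed attempt."""
    failures = []
    last_step, total = None, None
    for line in log_lines:
        step = STEP_RE.search(line)
        if step:
            last_step, total = int(step.group(1)), int(step.group(2))
            continue
        fail = FAIL_RE.search(line)
        if fail:
            failures.append((last_step, total, fail.group(1)))
            last_step, total = None, None  # the next attempt starts over
    return failures


# A few lines copied from the log above: the first and second attempts.
sample = [
    "12:48:23:WU02:FS01:0x15:Completed    500000 out of 50000000 steps (1%).",
    "12:48:24:WARNING:WU02:FS01:FahCore returned: UNSTABLE_MACHINE (122 = 0x7a)",
    "12:51:19:WU02:FS01:0x15:Completed    500000 out of 50000000 steps (1%).",
    "12:52:57:WU02:FS01:0x15:Completed   1000000 out of 50000000 steps (2%).",
    "12:52:57:WARNING:WU02:FS01:FahCore returned: UNSTABLE_MACHINE (122 = 0x7a)",
]

points = failure_points(sample)
for last, total, err in points:
    print(f"{err}: last reported step {last} of {total}")

distinct = {last for last, _, _ in points}
print("same spot every time" if len(distinct) == 1 else "failure point varies")
```

Keep in mind that the log only reports progress every 500000 steps, so this shows the last reported step, not the exact step where the NaN appeared.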
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time
