project 7624 (183 0 48) NAN at 1%

Moderators: Site Moderators, FAHC Science Team

kiore
Posts: 921
Joined: Fri Jan 16, 2009 5:45 pm
Location: USA

Post by kiore »

I have just had this failure on one of my GTX 560s: a NAN at 1%. These things happen, and I would not normally be too concerned to see one every once in a while, but this one happened while I was watching. The card was not OCed at the time and the temps were very low for here: ambient 20 C and the card reading 64 C.
I got the same unit reissued to me straight away and it is folding happily now, although only at 8% so far. I haven't changed the config, and the reissued unit is running at 65 C.
I understand NANs can happen for different reasons, but this one occurred in the middle of an upload, while the card was neither OCed nor hot, and the same unit is now running well.

Code:

18:05:27:WU00:FS00:0x15:
18:05:27:WU00:FS00:0x15:Project: 7624 (Run 183, Clone 0, Gen 48)
18:05:27:WU00:FS00:0x15:
18:05:27:WU00:FS00:0x15:Assembly optimizations on if available.
18:05:27:WU00:FS00:0x15:Entering M.D.
18:05:28:WU00:FS00:0x15:Tpr hash 00/wudata_01.tpr:  3552458257 441301 4217539330 1061998400 853868277
18:05:28:WU00:FS00:0x15:GPU device info: vendor=0 device=0 name=<NA> match=0
18:05:28:WU00:FS00:0x15:Working on Protein
18:05:28:WU00:FS00:0x15:Client config unavailable.
18:05:29:WU00:FS00:0x15:Starting GUI Server
18:05:58:WU02:FS01:0x15:Completed   2800000 out of 40000000 steps (7%).
18:06:32:WU00:FS00:0x15:Setting checkpoint frequency: 400000
18:06:32:WU00:FS00:0x15:Completed         3 out of 40000000 steps (0%).
18:06:37:WU03:FS00:Upload 7.91%
18:07:55:WU01:FS02:0xa4:Completed 430000 out of 500000 steps  (86%)
18:08:02:WU03:FS00:Upload 15.83%
18:08:25:WU03:FS00:Upload 23.74%
18:08:41:WU03:FS00:Upload 31.65%
18:09:54:WU03:FS00:Upload 39.57%
18:10:51:WU01:FS02:0xa4:Completed 435000 out of 500000 steps  (87%)
18:11:29:WU02:FS01:0x15:Completed   3200000 out of 40000000 steps (8%).
18:11:52:WU03:FS00:Upload 47.48%
18:12:05:WU00:FS00:0x15:Completed    400000 out of 40000000 steps (1%).
18:12:05:WU00:FS00:0x15:mdrun_gpu returned 52
18:12:05:WU00:FS00:0x15:NANs detected on GPU
18:12:05:WU00:FS00:0x15:
18:12:05:WU00:FS00:0x15:Folding@home Core Shutdown: UNSTABLE_MACHINE
18:12:05:WU00:FS00:FahCore returned: UNSTABLE_MACHINE (122 = 0x7a)
18:12:05:WU00:FS00:Starting
18:12:05:WU00:FS00:Running FahCore: "C:\Program Files\FAHClient/FAHCoreWrapper.exe" "C:/Documents and Settings/Administrator/Application Data/FAHClient/cores/www.stanford.edu/~pande/Win32/x86/NVIDIA/Fermi/Core_15.fah/FahCore_15.exe" -dir 00 -suffix 01 -version 701 -lifeline 1536 -checkpoint 15 -gpu 0
18:12:05:WU00:FS00:Started FahCore on PID 2024
18:12:05:WU00:FS00:Core PID:1488
18:12:05:WU00:FS00:FahCore 0x15 started
18:12:06:WU00:FS00:0x15:
18:12:06:WU00:FS00:0x15:*------------------------------*
18:12:06:WU00:FS00:0x15:Folding@Home GPU Core
18:12:06:WU00:FS00:0x15:Version                2.22 (Thu Dec 8 17:08:05 PST 2011)
18:12:06:WU00:FS00:0x15:Build host             SimbiosNvdWin7
18:12:06:WU00:FS00:0x15:Board Type             NVIDIA/CUDA
18:12:06:WU00:FS00:0x15:Core                   15
18:12:06:WU00:FS00:0x15:
18:12:06:WU00:FS00:0x15:Wind
Running WinXP Pro 32-bit on an AMD Phenom 955 with 4 GB RAM.
i7 7800x RTX 3070 OS= win10. AMD 3700x RTX 2080ti OS= win10 .

Team page: https://www.rationalskepticism.org/viewtopic.php?t=616
P5-133XL
Posts: 2948
Joined: Sun Dec 02, 2007 4:36 am
Hardware configuration: Machine #1:

Intel Q9450; 2x2GB=8GB Ram; Gigabyte GA-X48-DS4 Motherboard; PC Power and Cooling Q750 PS; 2x GTX 460; Windows Server 2008 X64 (SP1).

Machine #2:

Intel Q6600; 2x2GB=4GB Ram; Gigabyte GA-X48-DS4 Motherboard; PC Power and Cooling Q750 PS; 2x GTX 460 video card; Windows 7 X64.

Machine 3:

Dell Dimension 8400, 3.2GHz P4, 4x512MB Ram, Video card GTX 460, Windows 7 X32

I am currently folding just on the 5x GTX 460s for approx. 70K PPD
Location: Salem, OR, USA

Re: project 7624 (183 0 48) NAN at 1%

Post by P5-133XL »

With the v7 client, uploads and downloads are asynchronous. Folding still continues while those transfers are in progress, so an upload running at the time of the failure is not by itself a sign that the two are connected.

P7624 (183,0,48) does not look like a bad WU: it has been folded successfully 45 times with only one failure. I don't know why the WU was issued to 46 people, not including you, but I counted them off the database.
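Because the v7 log interleaves lines from every work unit and slot, it can help to split it up before reading it. Here is a minimal Python sketch that groups lines by their `WUnn:FSnn` prefix. The line pattern is only inferred from the log quoted in the first post, not taken from any official format description, and the function name `group_by_work_unit` is made up for illustration.

```python
import re
from collections import defaultdict

# Assumed line shape, inferred from the log quoted in this thread:
# HH:MM:SS:WUnn:FSnn:<message>
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}):(WU\d+):(FS\d+):(.*)$")


def group_by_work_unit(lines):
    """Group log messages by (work unit, slot) so each WU can be read on its own."""
    groups = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # lines that do not fit the assumed shape are skipped
            timestamp, wu, slot, message = match.groups()
            groups[(wu, slot)].append((timestamp, message))
    return dict(groups)


# A few lines copied from the log in the first post.
sample = [
    "18:11:29:WU02:FS01:0x15:Completed   3200000 out of 40000000 steps (8%).",
    "18:11:52:WU03:FS00:Upload 47.48%",
    "18:12:05:WU00:FS00:0x15:Completed    400000 out of 40000000 steps (1%).",
    "18:12:05:WU00:FS00:0x15:NANs detected on GPU",
    "18:12:05:WU00:FS00:FahCore returned: UNSTABLE_MACHINE (122 = 0x7a)",
]

groups = group_by_work_unit(sample)
for (wu, slot), entries in sorted(groups.items()):
    print(wu, slot)
    for timestamp, message in entries:
        print("   ", timestamp, message)
```

Read this way, the upload belongs to WU03 while the NAN belongs to WU00. They share slot FS00, but they are separate work units.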
tjlane
Pande Group Member
Posts: 161
Joined: Wed Jun 01, 2011 11:19 pm
Location: Stanford, CA

Re: project 7624 (183 0 48) NAN at 1%

Post by tjlane »

Thanks for the report. I have manually shut down this WU - feel free to dump it. Please let me know if other issues crop up.
kiore
Posts: 921
Joined: Fri Jan 16, 2009 5:45 pm
Location: USA

Re: project 7624 (183 0 48) NAN at 1%

Post by kiore »

tjlane wrote:Thanks for the report. I have manually shut down this WU - feel free to dump it. Please let me know if other issues crop up.

Thank you. The second time it downloaded, it ran fine to 45%, but I dumped it when I saw the above, as it seems to have been done often enough.