Project: 6801 (Run 8950, Clone 1, Gen 3) EUE @ 1%
Posted: Sun May 08, 2011 3:20 am
by CCTCHFUN
I've been getting this one all day. Did anyone complete this WU? I'm not sure whether this is a bad WU or just me. The log is below.
[01:24:54] Project: 6801 (Run 8950, Clone 1, Gen 3)
[01:24:54]
[01:24:54] Assembly optimizations on if available.
[01:24:54] Entering M.D.
[01:24:56] Tpr hash work/wudata_08.tpr: 3607300280 4160975690 466456219 2840422234 798880461
[01:24:56] Working on ALZHEIMER'S DISEASE AMYLOID
[01:24:56] Client config found, loading data.
[01:24:57] Setting checkpoint frequency: 500000
[01:24:57] Setting checkpoint frequency: 500000
[01:24:57] Starting GUI Server
[01:26:15] Completed 500000 out of 50000000 steps (1%).
[01:26:15] mdrun_gpu returned 52
[01:26:15] NANs detected on GPU
[01:26:15]
[01:26:15] Folding@home Core Shutdown: UNSTABLE_MACHINE
[01:26:18] CoreStatus = 7A (122)
[01:26:18] Sending work to server
[01:26:18] Project: 6801 (Run 8950, Clone 1, Gen 3)
[01:26:18] - Read packet limit of 540015616... Set to 524286976.
[01:26:18] - Error: Could not get length of results file work/wuresults_08.dat
[01:26:18] - Error: Could not read unit 08 file. Removing from queue.
[01:26:18] EUE limit exceeded. Pausing 24 hours.
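For anyone comparing logs, the failure signature above can be pulled out with a short script. This is only a rough sketch: the function name `summarize_failure` and the regular expressions are my own assumptions based on the lines in this log, not anything provided by the client.

```python
import re

# Patterns are assumptions modelled on the log lines posted above.
WU_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
STATUS_RE = re.compile(r"CoreStatus = ([0-9A-Fa-f]+) \((\d+)\)")


def summarize_failure(log_text):
    """Return the WU identity and failure markers found in a log excerpt."""
    wu = WU_RE.search(log_text)
    status = STATUS_RE.search(log_text)
    return {
        # (project, run, clone, gen) or None if no Project line was found
        "wu": tuple(int(g) for g in wu.groups()) if wu else None,
        "nans": "NANs detected on GPU" in log_text,
        "unstable": "UNSTABLE_MACHINE" in log_text,
        # The hex code is parsed to an int; the log prints the decimal in parentheses
        "core_status": int(status.group(1), 16) if status else None,
    }


sample = """\
[01:24:54] Project: 6801 (Run 8950, Clone 1, Gen 3)
[01:26:15] mdrun_gpu returned 52
[01:26:15] NANs detected on GPU
[01:26:15] Folding@home Core Shutdown: UNSTABLE_MACHINE
[01:26:18] CoreStatus = 7A (122)
"""

print(summarize_failure(sample))
```

Running the same function over each log in a thread makes it easy to see whether everyone is failing on the same Project/Run/Clone/Gen.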
Re: Project: 6801 (Run 8950, Clone 1, Gen 3) EUE @ 1%
Posted: Sun May 08, 2011 9:34 pm
by CCTCHFUN
Deleting the queue and work folder didn't help either. After reconfiguring the client with a different ID number, it picked up a different WU and is folding OK....fingers crossed.
Re: Project: 6801 (Run 8950, Clone 1, Gen 3) EUE @ 1%
Posted: Mon May 09, 2011 12:01 am
by PantherX
There is a single (failure) report in the WU Database but it doesn't match your Forum username:
Your WU (P6801 R8950 C1 G3) was added to the stats database on 2011-04-12 00:06:34 for 0 points of credit.
I have marked it for a follow-up.
Re: Project: 6801 (Run 8950, Clone 1, Gen 3) EUE @ 1%
Posted: Mon May 09, 2011 5:31 am
by speedy6635
I'm getting the same thing here too, with a GTX 470 and a GTX 560 Ti.
Arguments: -oneunit -forcegpu nvidia_fermi -advmethods -gpu 0
[05:10:41] - Ask before connecting: No
[05:10:41] - User name: Speedy6635 (Team 111065)
[05:10:41] - User ID: 2CBEC193014C91A
[05:10:41] - Machine ID: 3
[05:10:41]
[05:10:41] Gpu type=3 species=30.
[05:10:41] Work directory not found. Creating...
[05:10:41] Could not open work queue, generating new queue...
[05:10:41] - Preparing to get new work unit...
[05:10:41] Cleaning up work directory
[05:10:41] + Attempting to get work packet
[05:10:41] Passkey found
[05:10:41] Gpu type=3 species=30.
[05:10:41] - Connecting to assignment server
[05:10:42] - Successful: assigned to (171.64.65.64).
[05:10:42] + News From Folding@Home: Welcome to Folding@Home
[05:10:42] Loaded queue successfully.
[05:10:42] Gpu type=3 species=30.
[05:10:42] + Closed connections
[05:10:42]
[05:10:42] + Processing work unit
[05:10:42] Core required: FahCore_15.exe
[05:10:42] Core found.
[05:10:42] Working on queue slot 01 [May 9 05:10:42 UTC]
[05:10:42] + Working ...
[05:10:42]
[05:10:42] *------------------------------*
[05:10:42] Folding@Home GPU Core
[05:10:42] Version 2.15 (Tue Nov 16 09:05:18 PST 2010)
[05:10:42]
[05:10:42] Build host: SimbiosNvdWin7
[05:10:42] Board Type: NVIDIA/CUDA
[05:10:42] Project: 6801 (Run 8950, Clone 1, Gen 3)
[05:10:42]
[05:10:42] Assembly optimizations on if available.
[05:10:42] Entering M.D.
[05:10:44] Tpr hash work/wudata_01.tpr: 3607300280 4160975690 466456219 2840422234 798880461
[05:10:44] Working on ALZHEIMER'S DISEASE AMYLOID
[05:10:44] Client config found, loading data.
[05:10:44] Starting GUI Server
[05:10:45] Setting checkpoint frequency: 500000
[05:10:45] Setting checkpoint frequency: 500000
[05:12:05] Completed 500000 out of 50000000 steps (1%).
[05:12:05] mdrun_gpu returned 52
[05:12:05] NANs detected on GPU
[05:12:05]
[05:12:05] Folding@home Core Shutdown: UNSTABLE_MACHINE
[05:12:07] CoreStatus = 7A (122)
[05:12:07] Sending work to server
[05:12:07] Project: 6801 (Run 8950, Clone 1, Gen 3)
[05:12:07] - Read packet limit of 540015616... Set to 524286976.
[05:12:07] - Error: Could not get length of results file work/wuresults_01.dat
[05:12:07] - Error: Could not read unit 01 file. Removing from queue.
The only way to keep folding is to remove the -advmethods flag.
Mod Edit: Added Code Tags - PantherX
Re: Project: 6801 (Run 8950, Clone 1, Gen 3) EUE @ 1%
Posted: Mon May 09, 2011 9:52 pm
by CCTCHFUN
PantherX wrote: There is a single (failure) report in the WU Database but it doesn't match your Forum username:
Your WU (P6801 R8950 C1 G3) was added to the stats database on 2011-04-12 00:06:34 for 0 points of credit.
I have marked it for a follow-up.
5/7/11 is my first time getting this WU.
Re: Project: 6801 (Run 8950, Clone 1, Gen 3) EUE @ 1%
Posted: Sun May 15, 2011 4:35 am
by Bill1024
I am getting the same WU and the same result.
I tried deleting the work folder and queue 10 times and keep getting the same WU.
Project: 6801 (Run 8950, Clone 1, Gen 3)
[04:30:03]
[04:30:03] Assembly optimizations on if available.
[04:30:03] Entering M.D.
[04:30:05] Tpr hash work/wudata_01.tpr: 3607300280 4160975690 466456219 2840422234 798880461
[04:30:05] Working on ALZHEIMER'S DISEASE AMYLOID
[04:30:05] Client config found, loading data.
[04:30:05] Starting GUI Server
[04:30:06] Setting checkpoint frequency: 500000
[04:30:06] Setting checkpoint frequency: 500000
[04:32:01] Completed 500000 out of 50000000 steps (1%).
[04:32:02] mdrun_gpu returned 52
[04:32:02] NANs detected on GPU
[04:32:02]
[04:32:02] Folding@home Core Shutdown: UNSTABLE_MACHINE
[04:32:04] CoreStatus = 7A (122)
Re: Project: 6801 (Run 8950, Clone 1, Gen 3) EUE @ 1%
Posted: Sun May 15, 2011 7:23 am
by GreyWhiskers
There was a huge thread on the NaNs Detected issue that you might want to peruse. I've quoted Bruce's last post, edited by PantherX, from that thread.
Have you looked at any of these troubleshooting steps on your hardware, or on your GPU drivers?
Re: [Please read] NaNs detected on GPU - UNSTABLE_MACHINE er
by bruce » Tue Mar 22, 2011 2:30 pm
I'm closing this topic. It has become a catch-all for several DIFFERENT types of problems and each one has its own signature even though all may give you the NaNs detected message.
1) It may be a bad WU. The post, above, by "drbricks" is an excellent example of that problem. The same WU failed three times but there's no indication of a problem with other WUs. "HendricksSA" answered that one accurately. I'll copy the information from drbricks' post into a new post in that forum, though an extract from a log would be helpful. (It's not clear if he had an UNSTABLE_MACHINE error or not.)
2) Be sure your software has been upgraded to the latest version. Golden Dragoon's log shows that, and again, HendricksSA suggested the proper next step.
3) There's also the possibility of a hardware problem. GPUs do sometimes fail. They can overheat (particularly the single slot variety which leaves all the heat inside the case). They can be installed in systems which do not provide enough power. All of these options should be considered if you're still seeing UNSTABLE_MACHINE after eliminating (1) and (2).
EDIT By PantherX-> A good place to start troubleshooting is by reading this post ->
15 - Troubleshooting The GPU3 BETA Client
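The three cases in Bruce's post can be turned into a rough first-pass check. This is a heuristic sketch under my own assumptions, not an official diagnostic: the function name `classify_machine`, the result labels, and the second WU name in the example are all hypothetical.

```python
def classify_machine(results):
    """Rough triage for one machine.

    results: list of (wu_id, succeeded) pairs, e.g. collected by hand
    from the client log. Names and labels here are illustrative only.
    """
    failed = {wu for wu, ok in results if not ok}
    passed = {wu for wu, ok in results if ok}

    if not failed:
        return "no failures"
    # One WU keeps failing while other WUs complete: case (1), a bad WU.
    if len(failed) == 1 and passed:
        return "suspect bad WU"
    # Several different WUs fail, or nothing has ever completed:
    # work through cases (2) and (3) instead.
    return "check software version, then hardware"


# The failing WU from this thread plus a hypothetical WU that completed.
history = [
    ("P6801 R8950 C1 G3", False),
    ("P6801 R8950 C1 G3", False),
    ("some-other-WU", True),  # placeholder name, not a real WU
]
print(classify_machine(history))
```

A result of "suspect bad WU" only suggests where to look first; a failing GPU can still pass some WUs, so the hardware checks in (3) remain worth doing if failures come back.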
Re: Project: 6801 (Run 8950, Clone 1, Gen 3) EUE @ 1%
Posted: Sun May 15, 2011 7:01 pm
by Bill1024
I had to run config and say NO to advmethods, then delete the work folder, unit info, etc., to get a new WU.
Now I am back to folding just fine.
Not sure why it would not give me a new WU after trying more than 15 times the normal way.