7870 / Core 17 / Project 8900 not folding

Posted: Fri Aug 09, 2013 3:37 pm
by aluiaam
I tried the "client-type = advanced" option this morning. I had set both CPU and GPU slots to finish overnight so there were no WUs being worked on. I configured the GPU slot with the option and restarted FAHControl, then manually commanded just the GPU slot to start folding. See the log below (CPU slot is filtered out).

Code:

******************************* Date: 2013-08-09 *******************************
13:25:33:WU00:FS00:Connecting to assign-GPU.stanford.edu:80
13:25:33:WU00:FS00:News: Welcome to Folding@Home
13:25:33:WU00:FS00:Assigned to work server 171.64.65.69
13:25:33:WU00:FS00:Requesting new work unit for slot 00: READY gpu:0:PITCAIRN [Radeon HD 7800] from 171.64.65.69
13:25:33:WU00:FS00:Connecting to 171.64.65.69:8080
13:25:34:WU00:FS00:Downloading 4.18MiB
13:25:37:WU00:FS00:Download complete
13:25:37:WU00:FS00:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:8900 run:22 clone:1 gen:90 core:0x17 unit:0x0000006e028c1266519a63e81627fb8c
13:25:37:WU00:FS00:Downloading core from http://www.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_17.fah
13:25:37:WU00:FS00:Connecting to www.stanford.edu:80
13:25:38:WU00:FS00:FahCore 17: Downloading 2.12MiB
13:25:44:WU00:FS00:FahCore 17: 67.66%
13:25:46:WU00:FS00:FahCore 17: Download complete
13:25:46:WU00:FS00:Valid core signature
13:25:46:WU00:FS00:Unpacked 7.34MiB to cores/www.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_17.fah/FahCore_17.exe
13:25:46:WU00:FS00:Starting
13:25:46:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_17.fah/FahCore_17.exe -dir 00 -suffix 01 -version 703 -lifeline 4704 -checkpoint 15 -gpu 0 -gpu-vendor ati
13:25:46:WU00:FS00:Started FahCore on PID 3112
13:25:46:WU00:FS00:Core PID:2200
13:25:46:WU00:FS00:FahCore 0x17 started
13:25:47:WU00:FS00:0x17:*********************** Log Started 2013-08-09T13:25:46Z ***********************
13:25:47:WU00:FS00:0x17:Project: 8900 (Run 22, Clone 1, Gen 90)
13:25:47:WU00:FS00:0x17:Unit: 0x0000006e028c1266519a63e81627fb8c
13:25:47:WU00:FS00:0x17:CPU: 0x00000000000000000000000000000000
13:25:47:WU00:FS00:0x17:Machine: 0
13:25:47:WU00:FS00:0x17:Reading tar file state.xml
13:25:47:WU00:FS00:0x17:Reading tar file system.xml
13:25:48:WU00:FS00:0x17:Reading tar file integrator.xml
13:25:48:WU00:FS00:0x17:Reading tar file core.xml
13:25:48:WU00:FS00:0x17:Digital signatures verified
14:13:44:FS00:Paused
14:13:44:FS00:Shutting core down
14:13:44:WU00:FS00:0x17:WARNING:Console control signal 1 on PID 2200
14:13:44:WU00:FS00:0x17:Exiting, please wait. . .
14:14:45:WARNING:FS00:Killing WU00
14:14:45:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
14:18:06:FS00:Unpaused
14:18:06:WU00:FS00:Starting
14:18:06:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_17.fah/FahCore_17.exe -dir 00 -suffix 01 -version 703 -lifeline 4704 -checkpoint 15 -gpu 0 -gpu-vendor ati
14:18:06:WU00:FS00:Started FahCore on PID 3248
14:18:06:WU00:FS00:Core PID:5332
14:18:06:WU00:FS00:FahCore 0x17 started
14:18:06:WU00:FS00:0x17:*********************** Log Started 2013-08-09T14:18:06Z ***********************
14:18:06:WU00:FS00:0x17:Project: 8900 (Run 22, Clone 1, Gen 90)
14:18:06:WU00:FS00:0x17:Unit: 0x0000006e028c1266519a63e81627fb8c
14:18:06:WU00:FS00:0x17:CPU: 0x00000000000000000000000000000000
14:18:06:WU00:FS00:0x17:Machine: 0
14:18:06:WU00:FS00:0x17:Reading tar file state.xml
14:18:07:WU00:FS00:0x17:Reading tar file system.xml
14:18:08:WU00:FS00:0x17:Reading tar file integrator.xml
14:18:08:WU00:FS00:0x17:Reading tar file core.xml
14:18:08:WU00:FS00:0x17:Digital signatures verified
******************************* Date: 2013-08-09 *******************************
A project 8900 WU and Core 17 were downloaded and the CPU came up to 25% load (one core of four), but the GPU load stayed at 0% and the GPU core and memory clocks stayed at their idle values (all according to GPU-Z). After a few minutes the FahCore_17 process in Task Manager stopped, and the GPU was still idling. At 14:13:44 I paused the GPU slot and at 14:18:06 I restarted it, and the same thing happened again: the FahCore_17 process showed 25% CPU usage for a few minutes, then abruptly stopped, all while the GPU idled.

I suspect a driver issue because there was a driver reset five minutes after each time I started the GPU slot (according to Windows Event Viewer). I am running Catalyst 13.4 drivers, and I have had problems with driver resets before (see my other forum post). I was just wondering if anyone else has had this problem and how to successfully fold these post-beta, pre-release GPU projects. Hardware specs are in the signature.

Re: 7870 / Core 17 / Project 8900 not folding

Posted: Fri Aug 09, 2013 4:08 pm
by bruce
Download the latest driver directly from ati.com unless you already have it. Several problems have been reported and they've been gradually fixing them in new versions. I see there was a new Linux version yesterday (I haven't checked Windows yet).

Is your GPU overclocked? Driver resets SHOULD NOT HAPPEN.

Re: 7870 / Core 17 / Project 8900 not folding

Posted: Fri Aug 09, 2013 4:22 pm
by aluiaam
I've checked for the latest Catalyst drivers and 13.4 is the latest non-beta version. If it's just a problem with driver compatibility that will be worked out over time, I'll go back to only folding full-release projects.

Re: 7870 / Core 17 / Project 8900 not folding

Posted: Fri Aug 09, 2013 4:40 pm
by bruce
My HD 7790 is folding a P8900 with Catalyst 12.191.2.1. I have not yet upgraded, partly because I sometimes get older projects for FahCore_16.

Re: 7870 / Core 17 / Project 8900 not folding

Posted: Fri Aug 09, 2013 4:46 pm
by 7im
The new OpenCL code in fahcore_17 pushes the GPU harder than previous fahcores, so previously stable overclocks may not be stable with these newer work units. Try folding at stock speeds, and work your way up from there.
GPU: AMD Radeon HD 7870 @ 1100 MHz core / 1350 MHz VRAM (2 GiB)

Re: 7870 / Core 17 / Project 8900 not folding

Posted: Fri Aug 09, 2013 5:44 pm
by aluiaam
bruce wrote: My HD 7790 is folding a P8900 with Catalyst 12.191.2.1. I have not yet upgraded, partly because I sometimes get older projects for FahCore_16.
This is also my gaming rig so I'd like to keep the latest drivers. I was folding fahcore_16 projects with only the occasional driver reset issue, so I'll probably keep doing that until the major fahcore_17 bugs are worked out.
7im wrote: The new OpenCL code in fahcore_17 pushes the GPU harder than previous fahcores, so previously stable overclocks may not be stable with these newer work units. Try folding at stock speeds, and work your way up from there.
I just reverted to 1000/1200 MHz clocks and restarted the GPU slot. Initially it was the same problem - fahcore_17.exe was using 25% of the CPU, but the GPU stayed at idle clocks and 0% usage. Nothing about the state of the GPU changed that I could see, so I don't think it's a stability issue. Then at 16:56:17 there was an error. See log below.

Code:

16:51:47:FS00:Unpaused
16:51:48:WU00:FS00:Starting
16:51:48:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_17.fah/FahCore_17.exe -dir 00 -suffix 01 -version 703 -lifeline 4704 -checkpoint 15 -gpu 0 -gpu-vendor ati
16:51:48:WU00:FS00:Started FahCore on PID 1856
16:51:48:WU00:FS00:Core PID:948
16:51:48:WU00:FS00:FahCore 0x17 started
16:51:48:WU00:FS00:0x17:*********************** Log Started 2013-08-09T16:51:48Z ***********************
16:51:48:WU00:FS00:0x17:Project: 8900 (Run 22, Clone 1, Gen 90)
16:51:48:WU00:FS00:0x17:Unit: 0x0000006e028c1266519a63e81627fb8c
16:51:48:WU00:FS00:0x17:CPU: 0x00000000000000000000000000000000
16:51:48:WU00:FS00:0x17:Machine: 0
16:51:48:WU00:FS00:0x17:Reading tar file state.xml
16:51:49:WU00:FS00:0x17:Reading tar file system.xml
16:51:50:WU00:FS00:0x17:Reading tar file integrator.xml
16:51:50:WU00:FS00:0x17:Reading tar file core.xml
16:51:50:WU00:FS00:0x17:Digital signatures verified
16:56:17:WU00:FS00:0x17:ERROR:exception: Force RMSE error of 240.443 with threshold of 5
16:56:17:WU00:FS00:0x17:Saving result file logfile_01.txt
16:56:17:WU00:FS00:0x17:Saving result file badStateCheckpoint_41
16:56:18:WU00:FS00:0x17:Saving result file badStateForceGroup0_41Core.xml
16:56:20:WU00:FS00:0x17:Saving result file badStateForceGroup0_41Ref.xml
16:56:24:WU00:FS00:0x17:Saving result file badStateForceGroup1_41Core.xml
16:56:27:WU00:FS00:0x17:Saving result file badStateForceGroup1_41Ref.xml
16:56:30:WU00:FS00:0x17:Saving result file badStateForceGroup2_41Core.xml
16:56:33:WU00:FS00:0x17:Saving result file badStateForceGroup2_41Ref.xml
16:56:35:WU00:FS00:0x17:Saving result file log.txt
16:56:35:WU00:FS00:0x17:Folding@home Core Shutdown: BAD_WORK_UNIT
16:56:36:WARNING:WU00:FS00:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
16:56:36:WU00:FS00:Sending unit results: id:00 state:SEND error:FAULTY project:8900 run:22 clone:1 gen:90 core:0x17 unit:0x0000006e028c1266519a63e81627fb8c
16:56:36:WU00:FS00:Uploading 19.75MiB to 171.64.65.69
16:56:36:WU00:FS00:Connecting to 171.64.65.69:8080
16:56:36:WU02:FS00:Connecting to assign-GPU.stanford.edu:80
16:56:37:WU02:FS00:News: Welcome to Folding@Home
16:56:37:WU02:FS00:Assigned to work server 171.64.65.69
16:56:37:WU02:FS00:Requesting new work unit for slot 00: READY gpu:0:PITCAIRN [Radeon HD 7800] from 171.64.65.69
16:56:37:WU02:FS00:Connecting to 171.64.65.69:8080
16:56:37:WU02:FS00:Downloading 4.17MiB
16:56:41:WU02:FS00:Download complete
16:56:41:WU02:FS00:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:8900 run:22 clone:7 gen:86 core:0x17 unit:0x00000066028c1266519a640b3bd24eb6
16:56:41:WU02:FS00:Starting
16:56:41:WU02:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_17.fah/FahCore_17.exe -dir 02 -suffix 01 -version 703 -lifeline 4704 -checkpoint 15 -gpu 0 -gpu-vendor ati
16:56:41:WU02:FS00:Started FahCore on PID 3572
16:56:41:WU02:FS00:Core PID:124
16:56:41:WU02:FS00:FahCore 0x17 started
16:56:41:WU02:FS00:0x17:*********************** Log Started 2013-08-09T16:56:41Z ***********************
16:56:41:WU02:FS00:0x17:Project: 8900 (Run 22, Clone 7, Gen 86)
16:56:41:WU02:FS00:0x17:Unit: 0x00000066028c1266519a640b3bd24eb6
16:56:41:WU02:FS00:0x17:CPU: 0x00000000000000000000000000000000
16:56:41:WU02:FS00:0x17:Machine: 0
16:56:41:WU02:FS00:0x17:Reading tar file state.xml
16:56:42:WU00:FS00:Upload 3.48%
16:56:42:WU02:FS00:0x17:Reading tar file system.xml
16:56:43:WU02:FS00:0x17:Reading tar file integrator.xml
16:56:43:WU02:FS00:0x17:Reading tar file core.xml
16:56:43:WU02:FS00:0x17:Digital signatures verified
16:56:48:WU00:FS00:Upload 6.96%
16:56:54:WU00:FS00:Upload 10.44%
16:57:00:WU00:FS00:Upload 13.93%
16:57:06:WU00:FS00:Upload 17.41%
16:57:12:WU00:FS00:Upload 21.20%
16:57:18:WU00:FS00:Upload 24.69%
16:57:24:WU00:FS00:Upload 28.17%
16:57:30:WU00:FS00:Upload 31.65%
16:57:36:WU00:FS00:Upload 35.13%
16:57:42:WU00:FS00:Upload 38.93%
16:57:48:WU00:FS00:Upload 42.41%
16:57:54:WU00:FS00:Upload 45.89%
16:58:00:WU00:FS00:Upload 49.37%
16:58:06:WU00:FS00:Upload 52.85%
16:58:12:WU00:FS00:Upload 56.33%
16:58:18:WU00:FS00:Upload 59.82%
16:58:24:WU00:FS00:Upload 63.30%
16:58:30:WU00:FS00:Upload 66.78%
16:58:36:WU00:FS00:Upload 70.26%
16:58:42:WU00:FS00:Upload 73.74%
16:58:48:WU00:FS00:Upload 77.22%
16:58:54:WU00:FS00:Upload 80.70%
16:59:00:WU00:FS00:Upload 84.18%
16:59:06:WU00:FS00:Upload 87.67%
16:59:12:WU00:FS00:Upload 90.83%
16:59:18:WU00:FS00:Upload 94.63%
16:59:24:WU00:FS00:Upload 98.11%
16:59:28:WU00:FS00:Upload complete
16:59:28:WU00:FS00:Server responded WORK_ACK (400)
16:59:28:WU00:FS00:Cleaning up
A new work unit was downloaded and 19.75 MiB of something were uploaded. Fahcore_17.exe continued to use 25% of the CPU, and the GPU continued to remain idle until 17:01:20 when there was a display driver reset and the core process died.
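
For anyone who wants to pull just the core exit codes out of a long log like the one above, here is a rough Python sketch. It assumes the line layout seen in this log (HH:MM:SS, an optional WARNING tag, then the WU and slot IDs); the function name `core_returns` is made up for illustration and is not part of FAHClient.

```python
import re

# Pattern based only on the two "FahCore returned" lines in this thread;
# other client versions may format the line differently (assumption).
RETURN_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}):(?:WARNING:)?"
    r"(?P<wu>WU\d+):(?P<slot>FS\d+):"
    r"FahCore returned: (?P<name>\w+) \((?P<code>\d+) = (?P<hex>0x[0-9a-fA-F]+)\)"
)

def core_returns(log_lines):
    """Collect (time, wu, slot, status name, numeric code) for each core exit."""
    results = []
    for line in log_lines:
        m = RETURN_RE.match(line.strip())
        if m:
            results.append(
                (m["time"], m["wu"], m["slot"], m["name"], int(m["code"]))
            )
    return results

# Lines copied from the logs in this thread.
sample = [
    "14:14:45:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)",
    "16:56:36:WARNING:WU00:FS00:FahCore returned: BAD_WORK_UNIT (114 = 0x72)",
    "16:56:37:WU02:FS00:News: Welcome to Folding@Home",
]

for row in core_returns(sample):
    print(row)
```

Run against the sample, this lists the INTERRUPTED exit from the manual pause and the BAD_WORK_UNIT exit that triggered the 19.75 MiB upload, and skips the unrelated line.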

Re: 7870 / Core 17 / Project 8900 not folding

Posted: Fri Aug 09, 2013 8:36 pm
by bruce
The message "exception: Force RMSE error..." sounds like a calculation error to me, as do the messages after that point. Whether the errors can be "cured" by reducing clock rates, by changing drivers, or by getting a new WU remains to be seen.

After a WU fails, it is assigned to someone else to be completed. If it's successfully completed, we have to assume something is wrong with your hardware. If it fails repeatedly, we have to assume the data in the WU is corrupt and cannot be calculated. Nobody else has returned this WU yet, so I'll flag it for follow-up.
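
The rule of thumb above can be written out as a small Python sketch. This is only an illustration of the reasoning, not how the work servers actually decide anything; the function name and the `min_failures` cutoff for "fails repeatedly" are assumptions.

```python
def triage_failed_wu(other_outcomes, min_failures=2):
    """Apply the rule of thumb to results other donors returned for the same WU.

    other_outcomes: list of booleans, True if that donor completed the WU.
    min_failures: hypothetical cutoff for "fails repeatedly" (an assumption).
    """
    if any(other_outcomes):
        # Someone else finished it, so the WU data itself was computable.
        return "suspect the original donor's hardware"
    if len(other_outcomes) >= min_failures:
        # Nobody can finish it, so the data is probably bad.
        return "suspect corrupt WU data"
    return "undetermined, wait for more results"

print(triage_failed_wu([]))              # nobody else has returned it yet
print(triage_failed_wu([True]))          # another donor completed it
print(triage_failed_wu([False, False]))  # it failed for others too
```

With no other results in yet, the sketch returns "undetermined", which is why the WU gets flagged for follow-up rather than judged right away.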

Re: 7870 / Core 17 / Project 8900 not folding

Posted: Sat Aug 10, 2013 1:11 am
by folding_hoomer
The delay you describe is the normal behavior of Core17.
First the WU uses one full CPU core to prepare the data (the faster the CPU, the shorter the delay before "switching to the GPU"), and once the data are ready the GPU starts calculating.
At that point CPU utilization drops to a few percent for the rest of the WU.
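
To measure how long that quiet CPU-only phase lasted in a given log, a rough Python sketch like the following could help. It assumes the timestamp layout seen in the logs posted in this thread and treats "Digital signatures verified" as the start of the quiet phase, which is my reading of the log rather than anything documented; the helper names are hypothetical.

```python
from datetime import datetime, timedelta

def log_time(line):
    """Parse the leading HH:MM:SS stamp of a FAHClient log line."""
    return datetime.strptime(line[:8], "%H:%M:%S")

def quiet_seconds_after_verify(log_lines, wu):
    """Seconds from 'Digital signatures verified' to the next line for `wu`.

    Returns None if either line is missing.
    """
    tag = f":{wu}:"
    verified_at = None
    for raw in log_lines:
        line = raw.strip()
        if tag not in line:
            continue  # skip other work units and slot-level messages
        if verified_at is None:
            if line.endswith("Digital signatures verified"):
                verified_at = log_time(line)
            continue
        gap = log_time(line) - verified_at
        if gap < timedelta(0):
            gap += timedelta(days=1)  # log rolled past midnight
        return gap.total_seconds()
    return None

# Lines copied from the second log in this thread.
sample = [
    "16:51:50:WU00:FS00:0x17:Reading tar file core.xml",
    "16:51:50:WU00:FS00:0x17:Digital signatures verified",
    "16:56:17:WU00:FS00:0x17:ERROR:exception: Force RMSE error of 240.443 with threshold of 5",
]

print(quiet_seconds_after_verify(sample, "WU00"))
```

For the sample lines this gives 267 seconds, i.e. the core spent about four and a half minutes on the CPU before the Force RMSE error appeared.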

Re: 7870 / Core 17 / Project 8900 not folding

Posted: Sun Aug 11, 2013 10:56 pm
by aluiaam
I just want to provide a bit of closure to this thread, though not really a solution. Friday evening I tried again to start folding that P8900 WU. After restarting the slot I saw the same symptoms described previously; then after about four and a half minutes my screen went crazy and the system rebooted. When the desktop came up, FAHControl started, the CPU started folding, and according to GPU-Z the GPU was folding. As of writing this it has completed two P8900 WUs without a hitch. To address the overclock stability issue, it has been folding for almost 48 hours at the same clocks I've been gaming with (1100/1350 MHz). For the sake of diagnosis, I'm sorry to say I didn't explicitly change anything. The only thing different was that the computer rebooted. Go figure.

Re: 7870 / Core 17 / Project 8900 not folding

Posted: Mon Aug 12, 2013 3:22 am
by 7im
Since you had already rebooted before then, really nothing changed... and yet it did change.

Re: 7870 / Core 17 / Project 8900 not folding

Posted: Mon Aug 12, 2013 4:17 am
by bruce
The reassigned WU was completed by someone else. The fact that your computer was unable to perform the same calculation seems to be due to what we call "unstable" hardware.

Hi aluiaamPC (team 211754),
Your WU (P8900 R22 C1 G90) was added to the stats database on 2013-08-09 10:00:36 for 0 points of credit.
Hi xxx (team xxx),
Your WU (P8900 R22 C1 G90) was added to the stats database on 2013-08-11 05:01:32 for 11532 points of credit.