Project: 6099 (Run 8, Clone 25, Gen 6)

Posted: Thu Jan 26, 2012 12:05 pm
by VFR
Hiya People!

First post here so please be gentle.

I'm not sure about this WU. I have been using the Windows 6.34 SMP client with advmethods, a passkey, etc. without any issues, but starting Jan 24/25 I received Project: 6099 (Run 8, Clone 25, Gen 6), and ever since then I have been getting:

Code:

Client-core communications error: ERROR 0xc0000005

Code:

[09:34:36] Project: 6099 (Run 8, Clone 25, Gen 6)
[09:34:36] 
[09:34:36] Assembly optimizations on if available.
[09:34:36] Entering M.D.
[09:34:42] Mapping NT from 7 to 7 
[09:34:43] Completed 0 out of 500000 steps  (0%)
[09:34:43] Gromacs cannot continue further.
[09:34:43] Going to send back what have done -- stepsTotalG=500000
[09:34:43] Work fraction=0.0000 steps=500000.
[09:34:46] CoreStatus = C0000005 (-1073741819)
[09:34:46] Client-core communications error: ERROR 0xc0000005
[09:34:46] Deleting current work unit & continuing...
[09:35:00] Trying to send all finished work units
After checking FahWiki and restarting the client and the machine a few times, I still get the same error. After several retries, the client tries to download FahCore_a3.exe:

Code:

[09:35:21] Project: 6099 (Run 8, Clone 25, Gen 6)
[09:35:21] 
[09:35:21] Assembly optimizations on if available.
[09:35:21] Entering M.D.
[09:35:28] Mapping NT from 7 to 7 
[09:35:28] Completed 0 out of 500000 steps  (0%)
[09:35:31] CoreStatus = C0000005 (-1073741819)
[09:35:31] Client-core communications error: ERROR 0xc0000005
[09:35:31] Deleting current work unit & continuing...
[09:35:44] Trying to send all finished work units
[09:35:44] + No unsent completed units remaining.
[09:35:44] - Preparing to get new work unit...
[09:35:44] Cleaning up work directory
[09:35:44] + Attempting to get work packet
[09:35:44] Passkey found
[09:35:44] - Will indicate memory of 8148 MB
[09:35:44] - Connecting to assignment server
[09:35:44] Connecting to http://assign.stanford.edu:8080/
[09:35:47] Posted data.
[09:35:47] Initial: 8F80; - Successful: assigned to (128.143.231.202).
[09:35:47] + News From Folding@Home: Welcome to Folding@Home
[09:35:48] Loaded queue successfully.
[09:35:48] Sent data
[09:35:48] Connecting to http://128.143.231.202:8080/
[09:35:49] Posted data.
[09:35:49] Initial: 0000; - Receiving payload (expected size: 3817097)
[09:35:53] - Downloaded at ~931 kB/s
[09:35:53] - Averaged speed for that direction ~948 kB/s
[09:35:53] + Received work.
[09:35:53] + Closed connections
[09:35:58] 
[09:35:58] + Processing work unit
[09:35:58] Core required: FahCore_a3.exe
[09:35:58] Core found.
[09:35:58] Working on queue slot 04 [January 25 09:35:58 UTC]
[09:35:58] + Working ...
[09:35:58] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 04 -np 7 -checkpoint 15 -forceasm -verbose -lifeline 4500 -version 634'

[09:35:58] 
[09:35:58] *------------------------------*
[09:35:58] Folding@Home Gromacs SMP Core
[09:35:58] Version 2.27 (Dec. 15, 2010)
[09:35:58] 
[09:35:58] Preparing to commence simulation
[09:35:58] - Assembly optimizations manually forced on.
[09:35:58] - Not checking prior termination.
[09:35:59] - Expanded 3816585 -> 4169428 (decompressed 109.2 percent)
[09:35:59] Called DecompressByteArray: compressed_data_size=3816585 data_size=4169428, decompressed_data_size=4169428 diff=0
[09:35:59] - Digital signature verified
[09:35:59] 
[09:35:59] Project: 6099 (Run 8, Clone 25, Gen 6)
[09:35:59] 
[09:35:59] Assembly optimizations on if available.
[09:35:59] Entering M.D.
[09:36:05] Mapping NT from 7 to 7 
[09:36:05] Completed 0 out of 500000 steps  (0%)
[09:36:09] CoreStatus = C0000005 (-1073741819)
[09:36:09] Client-core communications error: ERROR 0xc0000005
[09:36:09] - Attempting to download new core...
[09:36:09] + Downloading new core: FahCore_a3.exe
Even after manually downloading the FahCore_a3 core, this keeps happening. I don't think it's hardware, although the FahWiki suggests that it might be, so I cleaned up the work folder and changed the machine ID so that it gets another project:

Code:

[11:35:28] Project: 6097 (Run 0, Clone 6, Gen 85)
[11:35:28] 
[11:35:28] Assembly optimizations on if available.
[11:35:28] Entering M.D.
[11:35:34] Mapping NT from 7 to 7 
[11:35:35] Completed 0 out of 500000 steps  (0%)
[11:45:57] Completed 5000 out of 500000 steps  (1%)
[11:56:09] Completed 10000 out of 500000 steps  (2%)
So this is more of a heads up to say that maybe this WU does not like this client :?
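
For anyone comparing the two forms of the status code in the logs above: the hexadecimal C0000005 and the decimal -1073741819 are the same 32-bit value, read as unsigned and as signed respectively. On Windows this value is the access violation status (STATUS_ACCESS_VIOLATION). A minimal Python sketch of the conversion follows; the function name is made up for illustration and is not part of any FAH tool.

```python
def to_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit integer as a signed (two's complement) one."""
    value &= 0xFFFFFFFF  # keep only the low 32 bits
    # If the top bit is set, the signed reading is negative.
    return value - 0x100000000 if value & 0x80000000 else value


core_status = int("C0000005", 16)  # hex form as printed in the log
print(hex(core_status), to_signed32(core_status))
```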

Re: Project: 6099 (Run 8, Clone 25, Gen 6)

Posted: Fri Jan 27, 2012 7:23 pm
by PantherX
Welcome to the F@H Forum VFR,

Please note that there isn't any data in the WU Database for Project: 6099 (Run 8, Clone 25, Gen 6) so I have marked it for a follow up.

Re: Project: 6099 (Run 8, Clone 25, Gen 6)

Posted: Fri Jan 27, 2012 7:53 pm
by Grandpa_01
It is the new beta WU and you should not be getting it on advanced. Are you running a beta flag? viewtopic.php?f=66&t=20619

Never mind, it is not a beta; I misread the thread. :oops:

Re: Project: 6099 (Run 8, Clone 25, Gen 6)

Posted: Fri Jan 27, 2012 8:16 pm
by rwh202
Have you managed any other 6099 WUs?

Just an idea, but you're running -smp 7 and a few projects don't play nice with that.

Re: Project: 6099 (Run 8, Clone 25, Gen 6)

Posted: Fri Jan 27, 2012 9:37 pm
by kasson
This project is in full release (not beta), but in general we discourage running with -smp 7. Six or eight should be much more stable.

Re: Project: 6099 (Run 8, Clone 25, Gen 6)

Posted: Sun Jan 29, 2012 11:35 am
by VFR
I had checked that this WU was NOT in beta as per http://foldingforum.org/viewtopic.php?f=24&t=19786, but unfortunately I do change the SMP number, as I need some processing power because it's my work PC :wink:

During the evenings and/or weekends, I do run with -SMP 8 but by the time I saw these instabilities, the log files had already rotated so I couldn't check to see if I had worked on this WU before.

Why does an odd number of SMP cause an issue?

Re: Project: 6099 (Run 8, Clone 25, Gen 6)

Posted: Sun Jan 29, 2012 1:05 pm
by kasson
It's the prime part rather than the odd part that's an issue. 9 should work fine. FAH cuts the simulation into pieces of work for each thread like cutting a cake; 8 = 2x2x2 or 4x2x1, and 6 = 3x2x1, but 7 = 7x1x1. The smaller the slice, the more likely the simulation will be unstable.

If you can't do 8 during daytime hours, I'd recommend swapping to 6 for that period.
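
The cake-cutting arithmetic above can be sketched in a few lines of Python. This only enumerates the ways to write a thread count as a product of three whole numbers and picks the most even one; it is an illustration of the analogy, not the method the core actually uses to split the simulation, and the function names are invented for the example.

```python
def grid_shapes(n):
    """Return every (a, b, c) with a >= b >= c >= 1 and a * b * c == n."""
    shapes = []
    c = 1
    while c * c * c <= n:
        if n % c == 0:
            rest = n // c
            b = c
            while b * b <= rest:
                if rest % b == 0:
                    shapes.append((rest // b, b, c))
                b += 1
        c += 1
    return shapes


def most_balanced(n):
    """Pick the shape whose longest side is shortest (the least 'sliced' cut)."""
    return min(grid_shapes(n), key=lambda shape: shape[0])


for threads in (6, 7, 8, 9):
    print(threads, grid_shapes(threads), "->", most_balanced(threads))
```

For a prime count such as 7 the only shape is 7x1x1, which is the thin-slice case described above, while 6, 8 and 9 each have a more even option.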

Re: Project: 6099 (Run 8, Clone 25, Gen 6)

Posted: Sun Jan 29, 2012 2:17 pm
by bruce
VFR wrote:...but by the time I saw these instabilities, the log files had already rotated so I couldn't check to see if I had worked on this WU before.
The log file rotates when FAH is restarted, but the previous log is still there.

V6 only rotates the log if it's larger than 50K, so often it may not rotate at all. When it does rotate, look in FAHlog-Prev.txt.
V7 rotates every time the client is restarted, but it keeps many more previous logs in a folder called "logs"
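
To check whether a given WU shows up in any of those logs, something like the following Python sketch could be used. It assumes the file names mentioned above (FAHlog.txt and FAHlog-Prev.txt in the client folder for V6, and a "logs" subfolder for V7); treating every .txt file in "logs" as a log is an assumption, and the function name is hypothetical.

```python
from pathlib import Path


def find_wu_in_logs(client_dir, wu_text):
    """Return (file name, line number, line) for each log line containing wu_text."""
    client_dir = Path(client_dir)
    # V6 keeps the current and previous log next to the client.
    candidates = [client_dir / "FAHlog.txt", client_dir / "FAHlog-Prev.txt"]
    # V7 keeps older logs in a "logs" folder; the *.txt pattern is an assumption.
    logs_dir = client_dir / "logs"
    if logs_dir.is_dir():
        candidates += sorted(logs_dir.glob("*.txt"))

    hits = []
    for path in candidates:
        if not path.is_file():
            continue
        with path.open(errors="replace") as handle:
            for line_no, line in enumerate(handle, start=1):
                if wu_text in line:
                    hits.append((path.name, line_no, line.rstrip()))
    return hits
```

For example, find_wu_in_logs(".", "Project: 6099 (Run 8, Clone 25, Gen 6)") run from the client folder would list every line that mentions the WU, or return an empty list if the relevant log has already been overwritten.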

Re: Project: 6099 (Run 8, Clone 25, Gen 6)

Posted: Sun Jan 29, 2012 8:15 pm
by PantherX
bruce wrote:...V7 rotates every time the client is restarted, but it keeps many more previous logs in a folder called "logs"
With default settings, there will be 16 log files in the log folder plus the current one in the FAHClient directory so there will be a total of 17 log files.

Re: Project: 6099 (Run 8, Clone 25, Gen 6)

Posted: Tue Jan 31, 2012 4:00 pm
by VFR
bruce wrote:The log file rotates when FAH is restarted, but the previous log is still there.
V6 only rotates the log if it's larger than 50K, so often it may not rotate at all. When it does rotate, look in FAHlog-Prev.txt
I did look in all the log files. I only had FAHlog.txt, FAHlog2.txt (the next to-be-rotated log file) and FAHlog-Prev.txt, as it's a v6 client, but I had verbosity set to 9 :egeek: and it was nearly a day or so before I noticed the anomaly, so it was gone :(

Anyhow, the client is now folding with no issues with SMP set to 7. As for setting it to SMP 6, I think I prefer the faster grinding power of the additional core :wink:

Re: Project: 6099 (Run 8, Clone 25, Gen 6)

Posted: Fri Feb 03, 2012 10:19 pm
by Mytheroo
Hi, I am having the same issue, have been running a few different projects but I believe this was the first 6099 (6098's were fine), and I had about 5 tries at a 6099 with the a3 core crashing each time just moments into the run.

I am running the following settings on an i7-2500k @3.8 (motherboard stock (over)clock) with 8 GB RAM (2133 running at 1600). I have done about 20 WUs with this setup with no previous issues.
[settings]
username=Myth!
team=31574
passkey=xxxxxxxxxxxxxx
asknet=no
machineid=1
bigpackets=big
extra_parms=-smp
local=24

[http]
active=no
host=localhost
port=8080
usereg=no

[clienttype]
type=3

[core]
cpuusage=100
addr=192.168.xxxx

Re: Project: 6099 (Run 8, Clone 25, Gen 6)

Posted: Sat Feb 04, 2012 1:08 am
by bruce
Several people have had the same troubles, so ...
The WU (P6099,R8,C25,G6) has been reported as a bad WU. Note that the WUs on the reported list are stopped daily at 8am Pacific time.


Mytheroo: Welcome to foldingforum.org.
Was this Project: 6099 (Run 8, Clone 25, Gen 6) or was it some other assignment? If so, my previous statement applies to everyone. If it was some other assignment, we need to see FAHlog.txt. Post the top page or so and then a representative section or two showing a WU downloading, starting, and failing.

Re: Project: 6099 (Run 8, Clone 25, Gen 6)

Posted: Sat Feb 04, 2012 1:05 pm
by Mytheroo
Thanks for the welcome, Bruce. Yes, it is Run 8, Clone 25, Gen 6. OK, so it is bad and has been reported...

How do I stop my client trying to run it?

I tried blocking that assignment server in two different ways, hoping the client would try a different server, but it just fails to contact that server over and over. Deleting the work folder and the a3 core has no effect; it just downloads them both again.

Re: Project: 6099 (Run 8, Clone 25, Gen 6)

Posted: Sat Feb 04, 2012 7:15 pm
by bruce
There was a report that internet connections between Stanford's servers and the internet were having a problem not long ago but it has been resolved. Perhaps that explains your problem. See viewtopic.php?f=18&t=20693

Re: Project: 6099 (Run 8, Clone 25, Gen 6)

Posted: Sun Feb 05, 2012 5:49 am
by Mytheroo
Having blocked the server in the outgoing firewall, I was expecting it to fail to contact it, but I thought it might then try a different server (and hence a different WU). Instead it just kept trying the same server (and failing). There was no issue with internet connections.