
Project: 6050 (Run 0, Clone 119, Gen 406)

Posted: Sat Jun 04, 2011 10:51 am
by bollix47
FYI


[10:21:42] Connecting to http://171.64.65.54:8080/
[10:21:42] Posted data.
[10:21:42] Initial: 0000; - Receiving payload (expected size: 42776)
[10:21:43] - Downloaded at ~41 kB/s
[10:21:43] - Averaged speed for that direction ~446 kB/s
[10:21:43] + Received work.
[10:21:43] Trying to send all finished work units
[10:21:43] + No unsent completed units remaining.
[10:21:43] + Closed connections
[10:21:43]
[10:21:43] + Processing work unit
[10:21:43] Core required: FahCore_a3.exe
[10:21:43] Core found.
[10:21:43] Working on queue slot 03 [June 4 10:21:43 UTC]
[10:21:43] + Working ...
[10:21:43] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 03 -np
iority 96 -checkpoint 30 -verbose -lifeline 2812 -version 634'

[10:21:43]
[10:21:43] *------------------------------*
[10:21:43] Folding@Home Gromacs SMP Core
[10:21:43] Version 2.27 (Dec. 15, 2010)
[10:21:43]
[10:21:43] Preparing to commence simulation
[10:21:43] - Looking at optimizations...
[10:21:43] - Created dyn
[10:21:43] - Files status OK
[10:21:43] - Expanded 42264 -> 115421 (decompressed 273.0 percent)
[10:21:43] Called DecompressByteArray: compressed_data_size=42264 data_siz
21, decompressed_data_size=115421 diff=0
[10:21:43] - Digital signature verified
[10:21:43]
[10:21:43] Project: 6050 (Run 0, Clone 119, Gen 406)
[10:21:43]
[10:21:43] Assembly optimizations on if available.
[10:21:43] Entering M.D.
[10:21:49] Mapping NT from 15 to 15
[10:41:29] CoreStatus = C0000417 (-1073740777)
[10:41:29] Client-core communications error: ERROR 0xc0000417
[10:41:29] Deleting current work unit & continuing...
Windows 7 Pro 64-bit pop-up ... FahCore_a3.exe has stopped running.


Problem Event Name:	BEX
  Application Name:	FahCore_a3.exe
  Application Version:	0.0.0.0
  Application Timestamp:	4d4720af
  Fault Module Name:	FahCore_a3.exe
  Fault Module Version:	0.0.0.0
  Fault Module Timestamp:	4d4720af
  Exception Offset:	0008816d
  Exception Code:	c0000417
  Exception Data:	00000000
  OS Version:	6.1.7601.2.1.0.256.48
  Locale ID:	4105
  Additional Information 1:	88f7
  Additional Information 2:	88f70b5904b84d8cc95e82b3c6f7647f
  Additional Information 3:	fa46
  Additional Information 4:	fa46fc87ddfc5983a29bd91072f459f4
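The log prints the same failure code two ways, `C0000417` and `-1073740777`, and the Windows report repeats it as Exception Code `c0000417`. A minimal sketch of how the hex value maps to the negative decimal when read as a signed 32-bit integer follows; the helper name `to_signed32` is made up for illustration and is not part of the client. As far as I know, this value corresponds to the Windows status `STATUS_INVALID_CRUNTIME_PARAMETER`, but treat that mapping as an assumption and check it against Microsoft's NTSTATUS list.

```python
def to_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF                 # keep only the low 32 bits
    if value & 0x80000000:              # top bit set means negative in two's complement
        return value - (1 << 32)
    return value


core_status = 0xC0000417                # value taken from the log above
print(hex(core_status), to_signed32(core_status))
```

So both numbers in the `CoreStatus` line describe one error, not two.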

Re: Project: 6050 (Run 0, Clone 119, Gen 406)

Posted: Sat Jun 04, 2011 1:56 pm
by Sleepee
Running into the same problem here.

i5-760 @ Stock
4GB DDR3-1600

Deleting the WU from the queue and/or deleting the work folder has no effect; the exact same unit redownloads. It's currently stuck on that rig, so to prevent further WU failures I've turned it off for now.

Re: Project: 6050 (Run 0, Clone 119, Gen 406)

Posted: Sat Jun 04, 2011 2:35 pm
by PantherX
The WU has been reported as a bad one:
The WU (P6050,R0,C119,G406) has been reported as a bad WU.
Thanks for the report.

Sleepee -> Welcome to the F@H Forum Sleepee,
Please read this post to resolve your issue (viewtopic.php?f=19&t=16526#p164322).

Re: Project: 6050 (Run 0, Clone 119, Gen 406)

Posted: Sat Jun 04, 2011 2:41 pm
by Grandpa_01
PantherX wrote:The WU has been reported as a bad one:
The WU (P6050,R0,C119,G406) has been reported as a bad WU.
Thanks for the report.

Sleepee -> Welcome to the F@H Forum Sleepee,
Please read this post to resolve your issue (viewtopic.php?f=19&t=16526#p164322).
I do not think it is a bad WU. bollix47 is running -smp 15 and it is failing instantly. bollix47, try running -smp 16 or 14.

[10:21:43] Project: 6050 (Run 0, Clone 119, Gen 406)
[10:21:43]
[10:21:43] Assembly optimizations on if available.
[10:21:43] Entering M.D.
[10:21:49] Mapping NT from 15 to 15
[10:41:29] CoreStatus = C0000417 (-1073740777)
[10:41:29] Client-core communications error: ERROR 0xc0000417
[10:41:29] Deleting current work unit & continuing...

Re: Project: 6050 (Run 0, Clone 119, Gen 406)

Posted: Sat Jun 04, 2011 2:45 pm
by bollix47
Can't rerun that WU as the client deleted it and moved on to something else. This client has completed well over 500 WUs (both smp2 regular and bigadv) with the -smp 15 setting without a problem. I'm using the other core to feed a GTX 480, and I have tried both -smp 14 and -smp 16, but -smp 15 works best.

Re: Project: 6050 (Run 0, Clone 119, Gen 406)

Posted: Sat Jun 04, 2011 3:01 pm
by PantherX
Thanks for the catch, Grandpa_01. I have asked around; let's see what happens.

Re: Project: 6050 (Run 0, Clone 119, Gen 406)

Posted: Sat Jun 04, 2011 3:10 pm
by bollix47
Why would using 15 be a problem? I thought it was only prime numbers that caused a problem with the exceptions of 3 and maybe 5:

viewtopic.php?f=66&t=18549&p=186053#p186053

Re: Project: 6050 (Run 0, Clone 119, Gen 406)

Posted: Sat Jun 04, 2011 7:22 pm
by Grandpa_01
bollix47 wrote:Why would using 15 be a problem? I thought it was only prime numbers that caused a problem with the exceptions of 3 and maybe 5:

viewtopic.php?f=66&t=18549&p=186053#p186053
In the link you provided there is a post from Kasson that explains what the problem might be. 15 would be 15x1
by kasson » Mon May 09, 2011 3:48 pm

-smp 3 should be fine, -smp 5 probably ok. The problem is that Gromacs does a 2D decomposition based on factoring the number of threads you give it. So if you give it a number like 7 or 13, the best it can do is 7x1 or 13x1, whereas 12 can yield 4x3 and 8 can yield 4x2. The more thinly the system gets broken up, the more likely it is to fail. Hence easily factorable numbers are better...
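To make the factoring point concrete, here is a minimal sketch that finds the most evenly balanced two-factor split of a thread count. This is not Gromacs' actual decomposition code; the function name `most_balanced_grid` and the "pick the closest pair of factors" rule are assumptions used only to illustrate the idea in kasson's post.

```python
import math


def most_balanced_grid(n_threads: int) -> tuple[int, int]:
    """Return (a, b) with a * b == n_threads and a >= b as close together as possible."""
    if n_threads < 1:
        raise ValueError("n_threads must be positive")
    b = math.isqrt(n_threads)           # start at the square root and walk down
    while n_threads % b != 0:           # first divisor found gives the closest pair
        b -= 1
    return n_threads // b, b


for n in (7, 8, 12, 13, 14, 15, 16):
    a, b = most_balanced_grid(n)
    print(f"-smp {n}: {a}x{b}")
```

Under this rule the primes 7 and 13 can only split as 7x1 and 13x1, while 12 gives 4x3 and 8 gives 4x2, matching the examples in the quoted post. By the same rule 15 is not prime and can split as 5x3.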

Re: Project: 6050 (Run 0, Clone 119, Gen 406)

Posted: Sat Jun 04, 2011 8:27 pm
by bruce
Grandpa_01 wrote:In the link you provided there is a post from Kasson that explains what the problem might be. 15 would be 15x1

Not necessarily. Why not 15=5x3?

Re: Project: 6050 (Run 0, Clone 119, Gen 406)

Posted: Sat Jun 04, 2011 9:09 pm
by Sleepee
Thanks Bruce. I'm crunching on a different WU now.

To others: I don't think -smp x would be the problem. I was failing units with -smp 4, my maximum amount.

Re: Project: 6050 (Run 0, Clone 119, Gen 406)

Posted: Sat Jun 04, 2011 10:36 pm
by Grandpa_01
bruce wrote:
Grandpa_01 wrote:In the link you provided there is a post from Kasson that explains what the problem might be. 15 would be 15x1

Not necessarily. Why not 15=5x3?
Must have been the drugs. I had just got out of surgery when I posted that. :lol: