Project: 5750 (Run 13, Clone 133, Gen 15)

Moderators: Site Moderators, FAHC Science Team

ItsLasher
Posts: 9
Joined: Thu Jun 19, 2008 1:13 am
Hardware configuration: Intel E8400 @ 4.3
x2 eVGA GTX 260
XfX 780i 3-way SLI
4 Gig Corsair Dominator 4-4-4-13 2T
Location: My Forum
Contact:

Project: 5750 (Run 13, Clone 133, Gen 15)

Post by ItsLasher »

I'm getting this on all the 57xx series WUs, no matter which GPU or which system they are running on.

http://foldingforum.org/viewtopic.php?f=19&t=7184
I've also posted about this in the thread linked above, for the 5752 WU. When these WUs fail like this, the entire system locks up and I have to do a hard reset.

I'm running Vista x64 with 4 GB of RAM, two GTX 260s (NOT in SLI), and the 177.83 drivers.
This also happens on the 8600GT and 9800GTX systems.
I have no problems at all with other WUs, only with the 57xx series.





[05:01:29] *------------------------------*
[05:01:29] Folding@Home GPU Core - Beta
[05:01:29] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[05:01:29] 
[05:01:29] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[05:01:29] Build host: amoeba
[05:01:29] Board Type: Nvidia
[05:01:29] Core      : 
[05:01:29] Preparing to commence simulation
[05:01:29] - Looking at optimizations...
[05:01:29] - Created dyn
[05:01:29] - Files status OK
[05:01:29] - Expanded 98708 -> 492276 (decompressed 498.7 percent)
[05:01:29] Called DecompressByteArray: compressed_data_size=98708 data_size=492276, decompressed_data_size=492276 diff=0
[05:01:29] - Digital signature verified
[05:01:29] 
[05:01:29] Project: 5750 (Run 13, Clone 133, Gen 15)
[05:01:29] 
[05:01:29] Assembly optimizations on if available.
[05:01:29] Entering M.D.
[05:01:36] Working on Protein
[05:01:38] Client config found, loading data.
[05:01:38] Starting GUI Server
[05:03:02] Completed 1%
[05:04:25] Completed 2%
[05:05:47] Completed 3%
[05:07:11] Completed 4%
[05:08:34] Completed 5%
[05:09:57] Completed 6%
[05:11:20] Completed 7%
[05:12:43] Completed 8%
[05:14:08] Completed 9%
[05:15:31] Completed 10%
[05:16:54] Completed 11%
[05:18:17] Completed 12%
[05:19:40] Completed 13%
[05:21:03] Completed 14%
[05:22:27] Completed 15%
[05:23:52] Completed 16%
[05:25:15] Completed 17%
[05:26:38] Completed 18%
[05:28:01] Completed 19%
[05:29:25] Completed 20%
[05:30:48] Completed 21%
[05:32:11] Completed 22%
[05:33:34] Completed 23%
[05:34:58] Completed 24%
[05:36:20] Completed 25%
[05:37:44] Completed 26%
[05:39:07] Completed 27%
[05:40:30] Completed 28%
[05:41:53] Completed 29%
[05:43:16] Completed 30%
[05:44:39] Completed 31%
[05:46:02] Completed 32%
[05:47:25] Completed 33%
[05:48:48] Completed 34%
[05:50:12] Completed 35%
[05:51:35] Completed 36%
[05:52:58] Completed 37%
[05:54:22] Completed 38%
[05:55:45] Completed 39%
[05:57:07] Completed 40%
[05:58:31] Completed 41%
[05:59:54] Completed 42%
[06:01:17] Completed 43%
[06:02:40] Completed 44%
[06:04:03] Completed 45%
[06:05:26] Completed 46%
[06:06:49] Completed 47%
[06:08:13] Completed 48%
[06:09:35] Completed 49%
[06:10:58] Completed 50%
[06:12:22] Completed 51%
[06:13:45] Completed 52%
[06:15:08] Completed 53%
[06:16:31] Completed 54%
[06:17:55] Completed 55%
[06:19:18] Completed 56%
[06:20:40] Completed 57%
[06:22:04] Completed 58%
[06:23:27] Completed 59%
[06:24:50] Completed 60%
[06:26:13] Completed 61%
[06:27:36] Completed 62%
[06:28:59] Completed 63%
[06:30:22] Completed 64%
[06:31:46] Completed 65%
[06:33:08] Completed 66%
[06:34:31] Completed 67%
[06:35:55] Completed 68%
[06:37:18] Completed 69%
[06:38:40] Completed 70%
[06:40:04] Completed 71%
[06:41:27] Completed 72%
[06:42:50] Completed 73%
[06:44:13] Completed 74%
[06:45:36] Completed 75%
[06:46:59] Completed 76%
[06:48:22] Completed 77%
[06:49:45] Completed 78%
[06:51:08] Completed 79%
[06:52:32] Completed 80%
[06:53:55] Completed 81%
[06:55:18] Completed 82%
[06:56:41] Completed 83%
[06:57:47] Run: exception thrown during GuardedRun
[06:57:47] Run: exception thrown in GuardedRun -- Gromacs cannot continue further.
[06:57:47] Going to send back what have done -- stepsTotalG=10000000
[06:57:47] Work fraction=0.8359 steps=10000000.
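
The progress lines in a log like the one above can be turned into numbers, which makes it easier to compare a failing WU against a healthy one. Below is a minimal Python sketch, not part of the original post. It assumes the `[HH:MM:SS] message` layout shown in the excerpt and that the run does not cross midnight (the log carries no date); the function names `parse_log` and `seconds_per_percent` are made up for illustration.

```python
import re
from datetime import datetime

# Matches "[HH:MM:SS] message", the layout used in the log excerpt.
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s?(.*)$")
PCT_RE = re.compile(r"^Completed (\d+)%$")


def parse_log(lines):
    """Return (progress, errors).

    progress: list of (timestamp, percent) for each "Completed N%" line.
    errors:   list of (timestamp, message) for each "exception thrown" line.
    """
    progress, errors = [], []
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # skip anything that is not a timestamped log line
        stamp = datetime.strptime(match.group(1), "%H:%M:%S")
        message = match.group(2).strip()
        pct = PCT_RE.match(message)
        if pct:
            progress.append((stamp, int(pct.group(1))))
        elif "exception thrown" in message:
            errors.append((stamp, message))
    return progress, errors


def seconds_per_percent(progress):
    """Average wall-clock seconds per percent between first and last progress line."""
    if len(progress) < 2:
        return None
    (t0, p0), (t1, p1) = progress[0], progress[-1]
    if p1 == p0:
        return None
    return (t1 - t0).total_seconds() / (p1 - p0)


# The last three relevant lines of the excerpt above.
sample = [
    "[06:55:18] Completed 82%",
    "[06:56:41] Completed 83%",
    "[06:57:47] Run: exception thrown during GuardedRun",
]
progress, errors = parse_log(sample)
pace = seconds_per_percent(progress)
stall = (errors[0][0] - progress[-1][0]).total_seconds()
print(pace, stall)  # 83.0 66.0
```

On this excerpt the pace is 83 seconds per percent and the exception arrives 66 seconds after the 83% line. Running the same functions over the full log file would give the average pace for the whole run.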
toTOW
Site Moderator
Posts: 6359
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France
Contact:

Re: Project: 5750 (Run 13, Clone 133, Gen 15)

Post by toTOW »

There's no report for this WU in the DB yet.

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
ItsLasher

Re: Project: 5750 (Run 13, Clone 133, Gen 15)

Post by ItsLasher »

toTOW wrote:There's no report for this WU in the DB yet.
It's not just this one; all the 57xx WUs (the 511-pointers) are causing issues on my systems.
Looking through the forum, a number of other people are reporting the same issues with these WUs.
sortofageek
Site Admin
Posts: 3110
Joined: Fri Nov 30, 2007 8:06 pm
Location: Team Helix
Contact:

Re: Project: 5750 (Run 13, Clone 133, Gen 15)

Post by sortofageek »

Project: 5750 (Run 13, Clone 133, Gen 15)

Two other donors have now completed this WU and returned it for full credit. It isn't a bad WU.
ItsLasher

Re: Project: 5750 (Run 13, Clone 133, Gen 15)

Post by ItsLasher »

sortofageek wrote:Project: 5750 (Run 13, Clone 133, Gen 15)

Two other donors have now completed this WU and returned it for full credit. It isn't a bad WU.
Two donors, out of how many who have received this WU from the servers, and that makes it OK?

How many were sent out, and how many were returned 100% complete? I'd like to see how 2 returned WUs justify there being no problem.

I'll just start deleting these WUs as they come in, to avoid issues on all my systems.
sortofageek

Re: Project: 5750 (Run 13, Clone 133, Gen 15)

Post by sortofageek »

If one donor is able to complete a WU successfully, then the problem cannot be with the WU. It must be some other cause. The WU isn't bad or nobody could complete it.

The cause could be one or more of any number of issues, such as:

1. The GPU card itself. It may be that certain makes or models of GPU cards are having trouble, which is impossible for us to determine from this end unless those having trouble with these WUs post detailed information about their systems.
2. A driver
3. Bad RAM
4. Bad disk
5. File corruption
6. Other software
7. Heat
8. Overclocking

And the list goes on. There are many who post here who are adept at narrowing down such issues and helping to find the culprit, but they can only help if you provide enough detail about your hardware, software, drivers, error messages in logs, etc. and what you are doing at the time the system crashes.

If it crashes when you are running particular software, try it without running that software. If it then completes successfully, you have a start toward isolating the problem. If you update a driver and are then able to complete one of these WUs, you can be fairly confident you have found the issue. If you run memtest and find the memory is bad, that could be it. Like that ...

All we are saying is that the problem is not a bad work unit.
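
One way to gather the kind of detail asked for above is to tally outcomes per project from the client logs, so a claim such as "only the 57xx series fails" can be backed with counts. The following is a rough Python sketch, not something posted in the thread. It assumes each WU contributes exactly one `Project: N (Run r, Clone c, Gen g)` line, that `exception thrown` marks a failure, and that `Completed 100%` marks a finished unit; the function name `tally_outcomes` and project number 9999 are invented for the example.

```python
import re
from collections import Counter

# Same "Project: N (Run r, Clone c, Gen g)" layout as the log excerpt.
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")


def tally_outcomes(log_text):
    """Count (project, outcome) pairs, one per work unit found in the log."""
    tally = Counter()
    project, outcome = None, None
    for line in log_text.splitlines():
        match = PROJECT_RE.search(line)
        if match:
            # A new WU starts: record the previous one first.
            if project is not None:
                tally[(project, outcome)] += 1
            project, outcome = match.group(1), "unfinished"
        elif project is not None:
            if "exception thrown" in line:
                outcome = "failed"
            elif "Completed 100%" in line and outcome != "failed":
                outcome = "completed"
    if project is not None:
        tally[(project, outcome)] += 1
    return tally


# Synthetic log: the 5750 lines mirror the excerpt, the 9999 unit is made up.
log = "\n".join([
    "[05:01:29] Project: 5750 (Run 13, Clone 133, Gen 15)",
    "[06:56:41] Completed 83%",
    "[06:57:47] Run: exception thrown during GuardedRun",
    "[07:00:00] Project: 9999 (Run 0, Clone 0, Gen 0)",
    "[08:00:00] Completed 100%",
])
result = tally_outcomes(log)
print(dict(result))
```

Posting such counts alongside the hardware, OS, and driver details gives others a concrete starting point for narrowing down the cause.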
ItsLasher

Re: Project: 5750 (Run 13, Clone 133, Gen 15)

Post by ItsLasher »

sortofageek wrote:If one donor is able to complete a WU successfully, then the problem cannot be with the WU. It must be some other cause. The WU isn't bad or nobody could complete it.

I am also quite adept at tracking down a problem and fixing it.
This happens on 3 different systems, even when they are completely idle apart from crunching, and only on the 57xx WUs that are worth 511 points.
This ONLY happens with these WUs and no others. If my systems were at fault, why would it not happen with all WUs instead of just a few?
All I'm saying is that 2 good results out of 10,000 doesn't make sense to me. Who's to say there aren't 1,000 failed WUs that were never reported? When this happens to me and I restart the client, it does not continue; it gets another WU from the servers and starts fresh.
anko1
Posts: 438
Joined: Mon Dec 03, 2007 1:31 am
Hardware configuration: Old Faithful CPU: Windows Graphical 5.03; Intel Pentium 4 Processor 540
(3.2GHz) HT;Windows XP
Big Red: Windows SMP Console 6.29; Windows GPU console 6.20r1; Intel Q9450 2.66G; ASUS P5Q 775 P45; [BFG 9800GTX+ old graphics card] NVidia GeForce 8800 GTX [as of 5/9/09]; Windows XP Pro SP3
Lenovo Think Pad: Windows 6.29 w/ SMP; Windows GPU Console 6.20r1 systray; Intel QX9300; NVIDIA Quadro FX-3700M; Windows XP Professional
Location: SF Peninsula

Re: Project: 5750 (Run 13, Clone 133, Gen 15)

Post by anko1 »

I keep fairly good track of my EUEs and I've had about 160 units in the 57xx series (according to the stats page), but only reported about 5 bad ones (maybe a few more on my other computer, I'll check later), and those were 5753s and 5754s. It might be your driver. I've seen some reports of people upgrading the drivers and having problems. I'll provide info on my cards later. One difference is that I'm using Windows XP.

Edit: checked my other machine and I don't have any EUEs in the 57xx series on it. My two cards are NVIDIA Quadro FX-3700 and BFG 9800GTX+.
Last edited by anko1 on Mon Dec 15, 2008 10:40 pm, edited 1 time in total.
toTOW

Re: Project: 5750 (Run 13, Clone 133, Gen 15)

Post by toTOW »

ItsLasher wrote: This happens on 3 different systems, even when they are completely idle apart from crunching, and only on the 57xx WUs that are worth 511 points.
What do your 3 systems have in common? OS? Drivers? Hardware parts?
ItsLasher

Re: Project: 5750 (Run 13, Clone 133, Gen 15)

Post by ItsLasher »

toTOW wrote:
ItsLasher wrote: This happens on 3 different systems, even when they are completely idle apart from crunching, and only on the 57xx WUs that are worth 511 points.
What do your 3 systems have in common? OS? Drivers? Hardware parts?
The only thing common to all 3 is the OS, Vista x64, and each of the 3 installs came from its own separate disc.
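
The "what is common" question amounts to a set intersection over each system's attributes. Here is a small Python sketch of that idea, added for illustration. The system labels and the `common_factors` helper are hypothetical; the OS and GPU values are the ones mentioned in this thread, and drivers are left out because only one system's driver version was stated.

```python
def common_factors(systems):
    """Return the attribute/value pairs shared by every system."""
    shared = None
    for attrs in systems.values():
        pairs = set(attrs.items())
        shared = pairs if shared is None else shared & pairs
    return dict(shared or {})


# Labels are placeholders; values come from the posts in this thread.
systems = {
    "system_a": {"os": "Vista x64", "gpu": "GTX 260"},
    "system_b": {"os": "Vista x64", "gpu": "8600GT"},
    "system_c": {"os": "Vista x64", "gpu": "9800GTX"},
}
shared = common_factors(systems)
print(shared)  # {'os': 'Vista x64'}
```

Adding more attributes per system (driver version, client version, overclock settings) would narrow or widen the shared set and show which factors are worth testing first.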
Sophocles
Posts: 4
Joined: Sat Nov 15, 2008 1:11 pm

Re: Project: 5750 (Run 13, Clone 133, Gen 15)

Post by Sophocles »

Two other donors have now completed this WU and returned it for full credit. It isn't a bad WU.

I've successfully run the 511-point series on both XP and Vista with overclocked GPUs and returned them all many times over, but that doesn't mean there isn't an issue with these WUs. They use far more processing resources than a WU should if it is to be considered truly stable. They push one's GPUs to their very limit, and if one's system or GPU isn't exceptionally well cooled, a problem is likely to occur.