[11:49:33] Project: 2975 (Run 184, Clone 0, Gen 3)
[11:49:33]
[11:49:33] Assembly optimizations on if available.
[11:49:33] Entering M.D.
[11:49:39] Using Gromacs checkpoints
[11:49:39] Mapping NT from 1 to 1
[11:49:40] Resuming from checkpoint
[11:49:40] Verified work/wudata_06.log
[11:49:40] Verified work/wudata_06.trr
[11:49:40] Verified work/wudata_06.xtc
[11:49:40] Verified work/wudata_06.edr
[11:49:43] Completed 16510 out of 2500000 steps (0%)
[12:51:38] Completed 25000 out of 2500000 steps (1%)
[15:53:39] Completed 50000 out of 2500000 steps (2%)
[16:47:08] ***** Got a SIGTERM signal (2)
[16:47:08] Killing all core threads
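For reference, the time per frame (TPF) implied by this log can be read off the progress lines. Below is a minimal sketch, assuming one frame is 1% of the WU (25000 steps here) and that both timestamps fall on the same day, since the log carries no dates. The helper names `parse_progress` and `time_per_frame` are made up for illustration and are not part of the client. The 0% to 1% interval is skipped because that frame was resumed from a checkpoint partway through.

```python
from datetime import datetime

# Progress lines copied from the log above.
LOG_LINES = [
    "[12:51:38] Completed 25000 out of 2500000 steps (1%)",
    "[15:53:39] Completed 50000 out of 2500000 steps (2%)",
]


def parse_progress(line):
    """Return (time of day, percent complete) from one progress line."""
    stamp = datetime.strptime(line[1:9], "%H:%M:%S")
    percent = int(line[line.rindex("(") + 1 : line.rindex("%")])
    return stamp, percent


def time_per_frame(first_line, second_line):
    """Average wall-clock time per 1% frame between two progress lines.

    Assumes both lines were logged on the same calendar day.
    """
    t0, p0 = parse_progress(first_line)
    t1, p1 = parse_progress(second_line)
    return (t1 - t0) / (p1 - p0)


tpf = time_per_frame(*LOG_LINES)
print(tpf)  # 3:02:01
```

At roughly 3 hours per frame, the full 100 frames would take about 12.6 days of uninterrupted folding.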
Hmm, it seems unusual to me that a bad WU would manifest itself as taking an exceptional amount of time to complete. My first thought is that there is something else going on here. What's your hardware? Are there background processes running? It just seems to me that the WU is fine and it's only taking a long time on your computer because there's something going on at your end. I'd like to figure it out, so more information please.
Jesse, thanks for your reply. I always check the things that you mentioned before making a report, but I probably should have said so in my OP. As for a bad unit not manifesting itself in this manner, there have been other reports of work units behaving in exactly this way. Hence my report. I've moved on to a different WU and it's progressing normally.
There's a chance that it's a bad WU, but that's really difficult to determine.
At this point, nobody has completed that WU. Unfortunately we have no way to determine when WUs are assigned (or reissued) and that would be very useful information if everybody who processes the WU sees the same characteristics that you're reporting.
What was the date-time that the WU was assigned to you?
OK. November 1 09:39:09 UTC makes sense ... or in Stanford's timezone: November 1 02:39:09 PDT. I note that the previous WU, Gen 2, was returned to the server twice:
WU assigned to donor1 at: 2011-10-03 08:33:53 PDT
Past the Preferred deadline 13.30 days later at 2011-10-16 15:45:53 PDT
WU assigned to donor2 at: 2011-10-16 16:08:46 PDT <--- 23 minutes later.
Logged as returned by donor1 at: 2011-10-18 19:03:48 PDT
Logged as returned by donor2 at: 2011-10-22 09:04:57 PDT
I don't know how long after a Gen is returned before the next Gen is assigned, but the earliest that Gen 3 could have been assigned was 2011-10-18 19:03:48 PDT. That WU would be scheduled to be reissued 13.30 days later at 2011-11-01 02:15:31 PDT, which is 2011-11-01 09:15:31 UTC. That leaves about 24 minutes of slack for the time the WU spent waiting on the server to be assigned or reassigned. I conclude that the WU was assigned to someone else before you got it and either they're still working on it or they've dumped it.
[BTW, these calculations are not easy, even when I have access to much of the data. I learned something by doing it and it's also instructive because it shows that both Donor1 and Donor2 got credit for the same WU. Reissuing Gen 2 made sense at 2011-10-16 15:45:53 PDT because the WU was presumed lost and that made sure it would be returned by 2011-10-22 09:04:57 PDT. The server didn't know that Donor1 was still working on it. The fact that donor1 did return it at 2011-10-18 19:03:48 PDT helped the project along since that was still 3+ days before donor2's result was turned in.]
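The deadline arithmetic above can be sketched in a few lines of Python. This is only an illustration: it assumes the preferred deadline is exactly 13.30 days (13 days, 7 hours, 12 minutes), which matches the donor1 timestamps, and it uses a fixed UTC-7 offset for PDT. The function name `reissue_time` is made up here. The result comes out within a few seconds of the figure quoted above, which does not change the conclusion.

```python
from datetime import datetime, timedelta, timezone

# PDT is UTC-7. A fixed offset is enough here because every timestamp
# falls before the November 2011 switch back to standard time.
PDT = timezone(timedelta(hours=-7), "PDT")

# Assumption: the "13.30 days" preferred deadline is exactly 13d 7h 12m.
PREFERRED_DEADLINE = timedelta(days=13, hours=7, minutes=12)


def reissue_time(assigned):
    """Earliest time the server would reissue a WU assigned at `assigned`."""
    return assigned + PREFERRED_DEADLINE


# Check against the Gen 2 record for donor1.
gen2_assigned = datetime(2011, 10, 3, 8, 33, 53, tzinfo=PDT)
print(reissue_time(gen2_assigned))  # 2011-10-16 15:45:53-07:00

# Earliest possible first assignment of Gen 3 (when donor1 returned Gen 2).
gen3_earliest = datetime(2011, 10, 18, 19, 3, 48, tzinfo=PDT)
gen3_reissue = reissue_time(gen3_earliest)
print(gen3_reissue.astimezone(timezone.utc))  # 2011-11-01 09:15:48+00:00

# Time the reporter's copy of Gen 3 was actually assigned.
reported_assignment = datetime(2011, 11, 1, 9, 39, 9, tzinfo=timezone.utc)
slack = reported_assignment - gen3_reissue
print(slack)  # 0:23:21
```

The slack of roughly 24 minutes is the total time the WU could have spent waiting on the server, first to be assigned and later to be reassigned.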
The WU was running on one core of an i7 940 at its stock 2.93 GHz with 6 GB of RAM.
The computer also has an SMP client using -smp 6 and a GPU (GTX 285) client.
All clients are using v6 console running on Windows Vista 64-bit.
If I understand what you're saying about Gen 2 being done twice, does that mean that two Gen 3s would be created, or is the system set to disallow the second creation?
Sorry for the delay in answering Bruce ... you must have added that question after I read your initial response.
When a WU is completed, it is processed to create Gen (N+1) and that WU is issued. Extra copies are avoided whenever possible, and the duplication of Gen 2 only happened because it was assumed to be lost. That assumption later proved to be incorrect but the whole point of the duplication of the WU is to get a result from Gen 2 so that Gen 3 can be generated.
My calculations seem to imply that Gen 3 was issued "immediately" (well, within maybe 12 minutes) and it expired, too. You got the second copy of Gen 3 but only after 13.3 days (plus maybe 12 minutes).
The server-based process of creating Gen 3 from Gen 2 would not have run at 2011-10-22 09:04:57 PDT because Gen 3 already existed at that time.
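As a rough illustration of the behavior described above, here is a toy model in Python. It is not the actual FAH server code. The class `Trajectory` and its method names are invented for this sketch, and the only rule it encodes is the one stated in this thread: every returned result is credited, but the next Gen is created only if it does not already exist.

```python
class Trajectory:
    """Toy model of one Run/Clone trajectory and its chain of generations."""

    def __init__(self, latest_gen=0):
        # Generations that already exist on the (hypothetical) server.
        self.existing_gens = set(range(latest_gen + 1))
        # (donor, gen) pairs that received credit.
        self.credited = []

    def return_result(self, donor, gen):
        """Record a returned result; create Gen N+1 only if it is missing.

        Returns True when this return triggered creation of the next Gen.
        """
        self.credited.append((donor, gen))
        next_gen = gen + 1
        if next_gen in self.existing_gens:
            return False  # duplicate return: the next Gen already exists
        self.existing_gens.add(next_gen)
        return True


traj = Trajectory(latest_gen=2)
first = traj.return_result("donor1", 2)   # creates Gen 3
second = traj.return_result("donor2", 2)  # credited, but no second Gen 3
print(first, second)  # True False
```

Both donors appear in `credited`, which mirrors the point that donor1 and donor2 each got credit for Gen 2 while only one Gen 3 was generated.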
For anyone familiar with BOINC projects, FAH is very, very different in this respect.
Just a followup on this work unit. Database info now indicates three other folders have completed Project: 2975 (Run 184, Clone 0, Gen 3) successfully.
Folder 1.
Your WU (P2975 R184 C0 G3) was added to the stats database on 2011-11-12 13:08:56 for 1681.7 points of credit.
Folder 2.
Your WU (P2975 R184 C0 G3) was added to the stats database on 2011-11-13 03:06:38 for 895 points of credit.
Folder 3.
Your WU (P2975 R184 C0 G3) was added to the stats database on 2011-11-13 23:06:38 for 1823.11 points of credit.
It would be of interest to know how long they took to complete this WU. Were they also looking at TPFs of over 3 hours, or did they complete it with sub-1-hour TPFs? The interval between download and upload should be enough to tell which TPF they were closer to, assuming they ran 24/7.
Interesting ... 2 got similar TPF to my previous experience with the project (different WU where I was getting ~36 minutes TPF using a single core) and one got similar to my experience on the same WU of ~3 hours TPF. It's almost as if there are two different WUs with the same PRCG but if I understood what Bruce said earlier that's not possible. If I get another from this project I'll just let it run no matter how long it takes.
I think the assumption that Donor #2 runs 24x7 is false. We have no data about how many hours (s)he was actually folding during those 18.62 days, and without that information, guessing what TPF (s)he got during those processing hours is meaningless.
... and then, too, there's the probability of different hardware.
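To put numbers on that point, here is a small sketch of how the implied TPF depends on the assumed fraction of time spent folding. It assumes 100 frames per WU (the log reports progress in 1% steps). The function name `average_tpf` and the duty-cycle values are hypothetical, chosen only to show how wide the range is.

```python
from datetime import timedelta

FRAMES_PER_WU = 100  # assumption: one frame per 1% of progress


def average_tpf(elapsed_days, duty_cycle=1.0):
    """Average TPF if the client folded for `duty_cycle` of the elapsed time.

    With duty_cycle=1.0 (24/7 folding) this is an upper bound on the TPF.
    """
    if not 0 < duty_cycle <= 1:
        raise ValueError("duty_cycle must be in (0, 1]")
    return timedelta(days=elapsed_days * duty_cycle) / FRAMES_PER_WU


# Donor #2 held the WU for 18.62 days; the duty cycles are guesses.
for duty in (1.0, 0.5, 0.15):
    print(duty, average_tpf(18.62, duty))
```

Folding 24/7 would imply a TPF of about 4.5 hours, while a machine that folded only 15% of the time would have needed a TPF of about 40 minutes. The same 18.62 days is consistent with either of the TPFs discussed in this thread.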