Project: 2484
Posted: Sun Oct 26, 2008 2:51 am
by PeterA
Is this a new project? I just downloaded it, but it is not listed on the Projects page. I just finished a 2485, so maybe it's similar?
Re: Project: 2484
Posted: Thu Oct 30, 2008 2:38 pm
by Columbus-Hilliard,Oh
Still missing from the Projects page. Looks like I have five days of processing left, at about 40% of my PPD for 2485.
Re: Project: 2484
Posted: Thu Oct 30, 2008 2:46 pm
by Flathead74
Project : 2484
Core : Gromacs
Frames : 100
Credit : 905
Re: Project: 2484
Posted: Thu Oct 30, 2008 3:57 pm
by Foxery
The whole range of projects 2482-2496 uses the same FAHCore, and all are worth 905 points. PPD should be nearly identical.
Re: Project: 2484
Posted: Mon Nov 17, 2008 4:52 pm
by Aardvark
I have encountered an EUE at 56% on Proj 2484 (Run 89 Clone 6 Gen 1). First EUE that I have ever had.
Details from FAHlog.txt are as follows:
Code:
[12:35:39] Completed 140000 out of 250000 steps (56)
[12:50:41] Timered checkpoint triggered.
[13:05:43] Timered checkpoint triggered.
[13:20:46] Timered checkpoint triggered.
[13:35:46] Timered checkpoint triggered.
[13:50:50] Timered checkpoint triggered.
[14:05:50] Timered checkpoint triggered.
[14:20:53] Timered checkpoint triggered.
[14:23:44] Quit 101 - Fatal error:
[14:23:44] Step 142241, time 784.482 (ps) LINCS WARNING
[14:23:44] relative constraint deviation after LINCS:
[14:23:44] max 0.175649 (between atoms 7699 and 7701) rms 0.001293
[14:23:44]
[14:23:44] Simulation instability has been encountered. The run has entered a
[14:23:44] state from which no further progress can be made.
[14:23:44] This may be the correct result of the simulation, however if you
[14:23:44] often see other project units terminating early like this
[14:23:44] too, you may wish to check the stability of your computer (issues
[14:23:44] such as high temperature, overclocking, etc.).
[14:23:44] Going to send back what have done.
[14:23:44] logfile size: 51231
[14:23:44] - Writing 51915 bytes of core data to disk...
[14:23:44] ... Done.
[14:23:44]
[14:23:44] Folding@home Core Shutdown: EARLY_UNIT_END
[14:23:46] CoreStatus = 72 (114)
[14:23:46] Sending work to server
[14:23:46] + Attempting to send results
[14:23:46] - Reading file work/wuresults_03.dat from core
[14:23:46] (Read 51915 bytes from disk)
[14:23:46] Connecting to http://171.65.103.162:8080/
[14:23:46] - Couldn't send HTTP request to server
[14:23:46] + Could not connect to Work Server (results)
[14:23:46] (171.65.103.162:8080)
[14:23:46] - Error: Could not transmit unit 03 (completed November 17) to work server.
[14:23:46] - 1 failed uploads of this unit.
[14:23:46] Keeping unit 03 in queue.
I am attempting to upload the EUE remains, but 171.65.103.162 is not accepting them, and neither is the CS. Will keep trying as long as necessary.
I assume that this will get to Paula.
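For anyone who wants to pull the key facts out of a FAHlog.txt excerpt like the one above, here is a minimal sketch in Python. It assumes the line formats are exactly as they appear in this log; the function name and dictionary keys are made up for illustration and are not part of any Folding@home tool.

```python
import re

# Patterns matching the three line types seen in the log excerpt above.
COMPLETED_RE = re.compile(r"Completed (\d+) out of (\d+) steps")
LINCS_RE = re.compile(r"Step (\d+), time ([\d.]+) \(ps\)\s+LINCS WARNING")
SHUTDOWN_RE = re.compile(r"Folding@home Core Shutdown: (\w+)")


def summarize_log(lines):
    """Collect last progress report, LINCS failure step, and shutdown status."""
    summary = {"last_completed": None, "total_steps": None,
               "lincs_step": None, "shutdown": None}
    for line in lines:
        m = COMPLETED_RE.search(line)
        if m:
            summary["last_completed"] = int(m.group(1))
            summary["total_steps"] = int(m.group(2))
        m = LINCS_RE.search(line)
        if m:
            summary["lincs_step"] = int(m.group(1))
        m = SHUTDOWN_RE.search(line)
        if m:
            summary["shutdown"] = m.group(1)
    return summary


# Three lines copied from the log excerpt above.
SAMPLE = [
    "[12:35:39] Completed 140000 out of 250000 steps (56)",
    "[14:23:44] Step 142241, time 784.482 (ps) LINCS WARNING",
    "[14:23:44] Folding@home Core Shutdown: EARLY_UNIT_END",
]

summary = summarize_log(SAMPLE)
# Percentage of the WU reached when the LINCS warning was raised.
percent_at_failure = 100.0 * summary["lincs_step"] / summary["total_steps"]
print(summary, round(percent_at_failure, 1))
```

The step counts put the failure a little under 57% of the way through the unit, which matches the 56% reported in the last progress line.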
Re: Project: 2484
Posted: Wed Nov 19, 2008 5:46 pm
by ppetrone
Thanks for your report. Sorry about the EUE. I have stopped all the Clones that seem to have trouble, including the one you reported. I am not sure yet that the entire run is in trouble. I will be keeping an eye open.
Thanks
Paula
Re: Project: 2484
Posted: Wed Nov 19, 2008 6:35 pm
by Aardvark
@ppetrone,
Nothing to be sorry about, Paula. The System did give me ~500 points for the time spent up to the EUE event. These are "tough" WUs. They take about twice as long per % as earlier typical WUs on my G4 Mac.
EDIT: Is there any known reason that I should be concerned about the Proj 2484 (R 236, C 8, G 1) WU that I am currently folding?
Re: Project: 2484 and others
Posted: Sun Nov 23, 2008 1:54 am
by Mizzou_Engineer
I cannot complete very many of the more recent WUs on my machine, the latest being 2484s. It has successfully completed only two WUs in the past few weeks:
1. 2483 (R104, C27, G0): NaN (ener[13]) at 5%
2. 2483 (R199, C19, G0): Error 0x0 at 30%
3. 2483 (R199, C19, G0): Error 0x0 at 19%
4. 2483 (R199, C19, G0): Error 0x0 at 15%
[downloaded new Core_78]
5. 1487 (R0, C114, G39): Error 0x0 15 minutes into simulation (0%)
[downloaded Core_81]
6. 4100 (R62, C7, G17): NaN (ener[17]) at 4%
7. 4608 (R20, C40, G32): successful
8. 1487 (R0, C776, G33): Unknown error at 2% (client-core communication error 0x79)
9. 2483 (R199, C19, G0): NaN (ener[13]) at 5%
10. 1487 (R0, C776, G33): Error 0x0 at 1%
11. 4626 (R38, C5, G12): successful
And now the 2484s:
12. 2484 (R84, C17, G1): NaN (ener[13]) at 7%
13. 2484 (R148, C12, G1): Error 0x0 at 11%
14. 2484 (R111, C25, G0): Error 0x0 at 2%
15. 2484 (R111, C35, G0): Error 0x0 at 75 minutes (0%)
[downloaded new Core_78]
16. 2484 (R111, C35, G0): Error 0x0 at 13%
17. 2484 (R111, C35, G0): Error 0x0 at 5%
18. 2484 (R111, C35, G0): Error 0x0 at 17%
[downloaded new Core_78]
19. 2484 (R111, C35, G0): NaN (ener[13]) at 26%
20. 2484 (R41, C19, G2): Error 0x0 at 23%
21. 2484 (R41, C19, G2): NaN (ener[17]) at 1%
22. 1487 (R0, C369, G31): Error 0x0 at 190 minutes (0%)
23. 2484 (R168, C9, G2): NaN (ener[13]) at 12%
24. 1487 (R0, C369, G31): Unknown error (client-core communication error 0x79) at 3%
25. 2484 (R29, C5, G3): Error 0x0 at 12%
26. 2484 (R29, C5, G3): Error 0x0 at 4%
[downloaded new Core_78]
27. 2484 (R29, C5, G3): Error 0x0 at 4%
28. 1487 (R0, C223, G30): Error 0x0 at 240 minutes (0%)
29. 2484 (R29, C5, G3): currently processing
So here's what I see:
1. All of the project 2483s (which use Core_78) failed.
2. All of the project 1487s (which use Core_a0) failed.
3. All of the project 2484s (which use Core_78) failed.
4. The one project 4608 that I got (which used Core_82) was successful.
5. The one project 4100 that I got (which used Core_81) failed.
6. The one project 4626 I got (which used Core_82) was successful.
So as far as I can gather from this data, cores a0, 78, and 81 have not successfully returned a single WU, while the two WUs that used Core_82 both completed successfully.
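The per-core tally above can be reproduced with a short script. This is a minimal sketch in Python: the project-to-core mapping is the one stated in the post, the failure and success counts were taken by hand from the numbered list (item 29, still processing, is left out), and the function and variable names are made up for illustration.

```python
from collections import Counter

# Project -> FahCore mapping as stated in the post above.
CORE_BY_PROJECT = {
    2483: "78", 2484: "78", 1487: "a0",
    4100: "81", 4608: "82", 4626: "82",
}

# (project, failed attempts, successful attempts), counted from items 1-28.
ATTEMPTS = [
    (2483, 5, 0),   # items 1-4, 9
    (1487, 6, 0),   # items 5, 8, 10, 22, 24, 28
    (2484, 14, 0),  # items 12-21, 23, 25-27
    (4100, 1, 0),   # item 6
    (4608, 0, 1),   # item 7
    (4626, 0, 1),   # item 11
]


def tally_by_core(attempts, core_by_project):
    """Return {core: (failed, successful)} summed over all projects."""
    failed, succeeded = Counter(), Counter()
    for project, n_failed, n_ok in attempts:
        core = core_by_project[project]
        failed[core] += n_failed
        succeeded[core] += n_ok
    cores = sorted(set(core_by_project.values()))
    return {core: (failed[core], succeeded[core]) for core in cores}


result = tally_by_core(ATTEMPTS, CORE_BY_PROJECT)
for core, (n_failed, n_ok) in result.items():
    print(f"Core_{core}: {n_failed} failed, {n_ok} successful")
```

Grouping by core rather than by project makes it easier to see that the failures are not limited to one project, although the sample for cores 81 and 82 is very small.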
Re: Project: 2484
Posted: Sun Nov 23, 2008 10:49 pm
by toTOW
Is your machine overclocked?
Re: Project: 2484
Posted: Mon Nov 24, 2008 3:42 am
by anko1
I love these WUs. At 905 points and just about 3 days to complete, they are much better than the 110 PPD benchmark. Btw, I don't think I've had any EUEs on these units. I'm using Windows XP and the Windows graphical 5.03 client.
Edit: upon further examination, the 3 days applies to my newer computers. The older ones take 4 1/2 - 5 days.
Re: Project: 2484
Posted: Mon Nov 24, 2008 4:39 am
by divery4eyes
WOW, three days? I thought I was doing good @ six.
I am running four low end celery's and each one has been chewing on these for about two weeks now with no problems.
Re: Project: 2484
Posted: Mon Nov 24, 2008 1:49 pm
by Mizzou_Engineer
toTOW wrote: Is your machine overclocked?
Nope. It is running completely stock. The CPU frequency and voltage are set at default, and the RAM timings, voltages, and speed are set at SPD select. It's a 1.6 Duron Tbred-B on an ASUS A7N8X-E board with 2x 1 GB sticks of DDR-400, in case you were wondering.
It kicked a few more 2484s since that last post as well:
- The 2484 (R29, C5, G3) it was working on hit an NaN (ener[13]) at 8%.
- It got a 2484 (R121, C5, G4) that got an Error 0x0 at 7%.
- The next WU was another 2484 (R121, C5, G4) that died at 3% of an Error 0x0.
- It just got a 1487 (R0, C223, G30) that it has put away about 1% on and is still running.
I looked through the dmesg and I see a FahCore_78.exe or FahCore_a0.exe segfault whenever a WU gets an Error 0x0. For example, the 2484 (R121, C5, G4) failed with
Code:
[1801885.798907] FahCore_78.exe[7924]: segfault at a8b5a70 ip 0818de0c sp bf5fde40 error 4 in FahCore_78.exe[8048000+322000]
shown in the dmesg.
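To compare several of these dmesg lines, it can help to split them into fields. Below is a minimal sketch in Python that parses the line quoted above. It assumes the 32-bit x86 message format shown in this post and the standard x86 page-fault error-code bits (bit 0: page was present, bit 1: write access, bit 2: user mode); the function name and dictionary keys are made up for illustration.

```python
import re

# Pattern for the segfault line format seen in the post above.
SEGFAULT_RE = re.compile(
    r"(?P<proc>[^\s\[]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<error>\d+)"
    r"(?: in (?P<image>[^\s\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def parse_segfault(line):
    """Return a dict of fields from a dmesg segfault line, or None."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    error = int(m.group("error"))
    info = {
        "process": m.group("proc"),
        "pid": int(m.group("pid")),
        "fault_address": int(m.group("addr"), 16),
        "ip": int(m.group("ip"), 16),
        "error": error,
        "page_present": bool(error & 1),  # 0 means the page was not mapped
        "write_access": bool(error & 2),  # 0 means the access was a read
        "user_mode": bool(error & 4),     # 1 means the fault came from user code
        "ip_offset": None,
    }
    if m.group("base") is not None:
        base = int(m.group("base"), 16)
        size = int(m.group("size"), 16)
        # Offset of the faulting instruction inside the mapped image.
        if base <= info["ip"] < base + size:
            info["ip_offset"] = info["ip"] - base
    return info


LINE = ("[1801885.798907] FahCore_78.exe[7924]: segfault at a8b5a70 "
        "ip 0818de0c sp bf5fde40 error 4 in FahCore_78.exe[8048000+322000]")
info = parse_segfault(LINE)
print(info)
```

Under those assumptions, error 4 reads as a user-mode read of an unmapped address, with the faulting instruction inside the FahCore_78.exe image itself. That alone does not say whether the cause is hardware or software.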
Re: Project: 2484
Posted: Mon Nov 24, 2008 1:52 pm
by toTOW
So many issues look like a hardware problem... have you tested your machine with Memtest and StressCPU v2?
Re: Project: 2484
Posted: Tue Nov 25, 2008 1:47 pm
by Mizzou_Engineer
toTOW wrote: So many issues look like a hardware problem... have you tested your machine with Memtest and StressCPU v2?
It checks out with Memtest. I hadn't heard of StressCPU before, so I haven't run it. I'll run it and see what it says.
Re: Project: 2484
Posted: Tue Nov 25, 2008 1:54 pm
by toTOW
You can find it in this thread: viewtopic.php?f=14&t=430