Project: 2492 (Run 99, Clone 25, Gen 0) EUE

_r2w_ben
Posts: 285
Joined: Wed Apr 23, 2008 3:11 pm

Project: 2492 (Run 99, Clone 25, Gen 0) EUE

Post by _r2w_ben »

LINCS WARNING EUE at 45%

Code:

[11:44:35] Working on Unit 03 [October 14 11:44:35]
[11:44:35] + Working ...
[11:44:35] - Calling 'FahCore_78.exe -dir work/ -suffix 03 -checkpoint 15 -service -forceasm -verbose -lifeline 1720 -version 504'

[11:44:35] 
[11:44:35] *------------------------------*
[11:44:35] Folding@Home Gromacs Core
[11:44:35] Version 1.90 (March 8, 2006)
[11:44:35] 
[11:44:35] Preparing to commence simulation
[11:44:35] - Assembly optimizations manually forced on.
[11:44:35] - Not checking prior termination.
[11:44:39] - Expanded 2205621 -> 15088829 (decompressed 684.1 percent)
[11:44:39] - Starting from initial work packet
[11:44:39] 
[11:44:39] Project: 2492 (Run 99, Clone 25, Gen 0)
[11:44:39] 
[11:44:40] Assembly optimizations on if available.
[11:44:40] Entering M.D.
[11:44:47] Protein: system
[11:44:47] 
[11:44:47] Writing local files
[11:44:51] Extra SSE boost OK.
[11:44:52] Writing local files
[11:44:52] Completed 0 out of 250000 steps  (0)
[11:59:53] Timered checkpoint triggered.
[12:14:55] Timered checkpoint triggered.
[12:29:57] Timered checkpoint triggered.
[12:38:09] Writing local files
[12:38:09] Completed 2500 out of 250000 steps  (1)
[12:53:11] Timered checkpoint triggered.
[13:08:12] Timered checkpoint triggered.
[13:23:13] Timered checkpoint triggered.
[13:31:25] Writing local files
[13:31:25] Completed 5000 out of 250000 steps  (2)
[13:46:27] Timered checkpoint triggered.
[14:01:28] Timered checkpoint triggered.
[14:16:29] Timered checkpoint triggered.
[14:24:41] Writing local files
[14:24:41] Completed 7500 out of 250000 steps  (3)
...
[05:15:04] Writing local files
[05:15:04] Completed 110000 out of 250000 steps  (44)
[05:30:06] Timered checkpoint triggered.
[05:45:07] Timered checkpoint triggered.
[06:00:08] Timered checkpoint triggered.
[06:08:36] Writing local files
[06:08:36] Completed 112500 out of 250000 steps  (45)
[06:23:38] Timered checkpoint triggered.
[06:24:31] Quit 101 - Fatal error: 
[06:24:31] Step 113244, time 226.488 (ps)  LINCS WARNING
[06:24:31] relative constraint deviation after LINCS:
[06:24:31] max 7164557524992.000000 (between atoms 18524 and 18528) rms 53824798720.000000
[06:24:31] 
[06:24:31] Simulation instability has been encountered. The run has entered a
[06:24:31]   state from which no further progress can be made.
[06:24:31] This may be the correct result of the simulation, however if you
[06:24:31]   often see other project units terminating early like this
[06:24:31]   too, you may wish to check the stability of your computer (issues
[06:24:31]   such as high temperature, overclocking, etc.).
[06:24:31] Going to send back what have done.
[06:24:31] logfile size: 18386
[06:24:31] - Writing 19094 bytes of core data to disk...
[06:24:31]   ... Done.
[06:24:32] 
[06:24:32] Folding@home Core Shutdown: EARLY_UNIT_END
[06:24:36] CoreStatus = 72 (114)
[06:24:36] Sending work to server
toTOW
Site Moderator
Posts: 6349
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: Project: 2492 (Run 99, Clone 25, Gen 0) EUE

Post by toTOW »

Hi _r2w_ben (team 11108),
Your WU (P2492 R99 C25 G0) was added to the stats database on 2008-10-16 00:22:48 for 409.94 points of credit.

This is still the only report for this WU.

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
_r2w_ben
Posts: 285
Joined: Wed Apr 23, 2008 3:11 pm

Re: Project: 2492 (Run 99, Clone 25, Gen 0) EUE

Post by _r2w_ben »

I have some more p2492 LINCS WARNINGs to report from teammates. Someone might want to take a look at these work units.

Project: 2492 (Run 77, Clone 38, Gen 0)

Code:

[13:07:16] Completed 107500 out of 250000 steps  (43)
[13:44:09] Quit 101 - Fatal error: 
[13:44:09] Step 109987, time 219.974 (ps)  LINCS WARNING
[13:44:09] relative constraint deviation after LINCS:
[13:44:09] max 0.040513 (between atoms 18518 and 18521) rms 0.000345
[13:44:09] 
[13:44:09] Simulation instability has been encountered. The run has entered a
[13:44:09]   state from which no further progress can be made.
[13:44:09] This may be the correct result of the simulation, however if you
[13:44:09]   often see other project units terminating early like this
[13:44:09]   too, you may wish to check the stability of your computer (issues
[13:44:09]   such as high temperature, overclocking, etc.).
[13:44:09] Going to send back what have done.
[13:44:09] logfile size: 26198
[13:44:09] - Writing 26884 bytes of core data to disk...
[13:44:09]   ... Done.
[13:44:09] 
[13:44:09] Folding@home Core Shutdown: EARLY_UNIT_END
[13:44:11] CoreStatus = 72 (114)
[13:44:11] Sending work to server
Project: 2492 (Run 85, Clone 28, Gen 0)

Code:

[05:41:29] Completed 117500 out of 250000 steps  (47)
[06:06:59] Quit 101 - Fatal error: 
[06:06:59] Step 119018, time 238.036 (ps)  LINCS WARNING
[06:06:59] relative constraint deviation after LINCS:
[06:06:59] max 1.082720 (between atoms 5954 and 5956) rms 1.#QNAN0
[06:06:59] 
[06:06:59] Simulation instability has been encountered. The run has entered a
[06:06:59]   state from which no further progress can be made.
[06:06:59] This may be the correct result of the simulation, however if you
[06:06:59]   often see other project units terminating early like this
[06:06:59]   too, you may wish to check the stability of your computer (issues
[06:06:59]   such as high temperature, overclocking, etc.).
[06:06:59] Going to send back what have done.
[06:06:59] logfile size: 27437
[06:06:59] - Writing 28121 bytes of core data to disk...
[06:06:59]   ... Done.
[06:06:59] 
[06:06:59] Folding@home Core Shutdown: EARLY_UNIT_END
[06:07:01] CoreStatus = 72 (114)
[06:07:01] Sending work to server
Project: 2492 (Run 126, Clone 12, Gen 0)

Code:

[08:06:17] Completed 195000 out of 250000 steps  (78)
[08:10:54] Quit 101 - Fatal error: 
[08:10:54] Step 195367, time 390.734 (ps)  LINCS WARNING
[08:10:54] relative constraint deviation after LINCS:
[08:10:54] max 4379309603037601900000000000000.000000 (between atoms 18530 and 18531) rms 34326768627183862000000000000.000000
[08:10:54] 
[08:10:54] Simulation instability has been encountered. The run has entered a
[08:10:54]   state from which no further progress can be made.
[08:10:54] This may be the correct result of the simulation, however if you
[08:10:54]   often see other project units terminating early like this
[08:10:54]   too, you may wish to check the stability of your computer (issues
[08:10:54]   such as high temperature, overclocking, etc.).
[08:10:54] Going to send back what have done.
[08:10:54] logfile size: 37482
[08:10:54] - Writing 38226 bytes of core data to disk...
[08:10:54]   ... Done.
[08:10:54] 
[08:10:54] Folding@home Core Shutdown: EARLY_UNIT_END
[08:10:57] CoreStatus = 72 (114)
[08:10:57] Sending work to server
Project: 2492 (Run 162, Clone 0, Gen 0)

Code:

[12:25:50] Completed 130000 out of 250000 steps  (52%)
[12:40:02] Quit 101 - Fatal error: 
[12:40:02] Step 131111, time 262.222 (ps)  LINCS WARNING
[12:40:02] relative constraint deviation after LINCS:
[12:40:02] max 190943951252789130000.000000 (between atoms 18524 and 18525) rms 1510665204889813000.000000
[12:40:02] 
[12:40:02] Simulation instability has been encountered. The run has entered a
[12:40:02]   state from which no further progress can be made.
[12:40:02] This may be the correct result of the simulation, however if you
[12:40:02]   often see other project units terminating early like this
[12:40:02]   too, you may wish to check the stability of your computer (issues
[12:40:02]   such as high temperature, overclocking, etc.).
[12:40:02] Going to send back what have done.
[12:40:02] logfile size: 39457
[12:40:02] - Writing 40181 bytes of core data to disk...
[12:40:02]   ... Done.
[12:40:02] 
[12:40:02] Folding@home Core Shutdown: EARLY_UNIT_END
[12:40:05] CoreStatus = 72 (114)
[12:40:05] Sending work to server
toTOW
Site Moderator
Posts: 6349
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: Project: 2492 (Run 99, Clone 25, Gen 0) EUE

Post by toTOW »

Project: 2492 (Run 77, Clone 38, Gen 0)
There is a report for partial credit (not yours).

Project: 2492 (Run 85, Clone 28, Gen 0)
Same here ... the same guy as the previous one.

Project: 2492 (Run 126, Clone 12, Gen 0)
Same here.

Project: 2492 (Run 162, Clone 0, Gen 0)
Same here, but from another guy.
_r2w_ben
Posts: 285
Joined: Wed Apr 23, 2008 3:11 pm

Re: Project: 2492 (Run 99, Clone 25, Gen 0) EUE

Post by _r2w_ben »

Thanks toTOW. I'm more concerned that there might be something wrong with the simulations.
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Project: 2492 (Run 99, Clone 25, Gen 0) EUE

Post by bruce »

I think distribution of Project 2492 has been suspended until something can be corrected. Are you still getting them?
CDG
Posts: 1
Joined: Fri Aug 15, 2008 1:13 pm

Re: Project: 2492 (Run 99, Clone 25, Gen 0) EUE

Post by CDG »

Hopefully this is the last one.

Code:

[20:42:50] Completed 132500 out of 250000 steps  (53%)
[21:27:33] Quit 101 - Fatal error: 
[21:27:33] Step 134380, time 268.76 (ps)  LINCS WARNING
[21:27:33] relative constraint deviation after LINCS:
[21:27:33] max 502561579368185860.000000 (between atoms 5619 and 5621) rms 4041417659777024.000000
[21:27:33] 
[21:27:33] Simulation instability has been encountered. The run has entered a
[21:27:33]   state from which no further progress can be made.
[21:27:33] This may be the correct result of the simulation, however if you
[21:27:33]   often see other project units terminating early like this
[21:27:33]   too, you may wish to check the stability of your computer (issues
[21:27:33]   such as high temperature, overclocking, etc.).
[21:27:33] Going to send back what have done.
[21:27:33] logfile size: 45538
[21:27:33] - Writing 46253 bytes of core data to disk...
[21:27:33]   ... Done.
[21:27:33] 
[21:27:33] Folding@home Core Shutdown: EARLY_UNIT_END
[21:27:36] CoreStatus = 72 (114)
[21:27:36] Sending work to server
[21:27:36] Project: 2492 (Run 163, Clone 6, Gen 0)
[21:27:36] - Read packet limit of 540015616... Set to 524286976.


[21:27:36] + Attempting to send results [October 17 21:27:36 UTC]
[21:27:43] + Results successfully sent
[21:27:43] Thank you for your contribution to Folding@Home.
[21:27:47] - Preparing to get new work unit...
Sahkuhnder
Posts: 43
Joined: Sun Dec 02, 2007 5:28 am
Location: Vegas Baby! Yeah!

Re: Project: 2492 (Run 99, Clone 25, Gen 0) EUE

Post by Sahkuhnder »

bruce wrote: I think distribution of Project 2492 has been suspended until something can be corrected. Are you still getting them?
I haven't received any new 2492s in the last few days but if it helps any here are the FAHlogs from the most recent two.

Project: 2492 (Run 18, Clone 18, Gen 0)

Code:

[15:23:50] Completed 110000 out of 250000 steps  (44)
[16:09:16] Writing local files
[16:09:16] Completed 112500 out of 250000 steps  (45)
[16:54:42] Writing local files
[16:54:43] Completed 115000 out of 250000 steps  (46)
[17:28:14] Quit 101 - Fatal error: 
[17:28:14] Step 116845, time 233.69 (ps)  LINCS WARNING
[17:28:14] relative constraint deviation after LINCS:
[17:28:14] max 46795360.000000 (between atoms 3728 and 3729) rms 434021.093750
[17:28:14] 
[17:28:14] Simulation instability has been encountered. The run has entered a
[17:28:14]   state from which no further progress can be made.
[17:28:14] This may be the correct result of the simulation, however if you
[17:28:14]   often see other project units terminating early like this
[17:28:14]   too, you may wish to check the stability of your computer (issues
[17:28:14]   such as high temperature, overclocking, etc.).
[17:28:14] Going to send back what have done.
[17:28:14] logfile size: 37533
[17:28:14] - Writing 38228 bytes of core data to disk...
[17:28:14]   ... Done.
[17:28:14] 
[17:28:14] Folding@home Core Shutdown: EARLY_UNIT_END
[17:28:18] CoreStatus = 72 (114)
[17:28:18] Sending work to server


[17:28:18] + Attempting to send results
Project: 2492 (Run 3, Clone 14, Gen 0)

Code:

[22:43:11] Completed 112500 out of 250000 steps  (45)
[23:28:34] Writing local files
[23:28:34] Completed 115000 out of 250000 steps  (46)
[00:13:56] Writing local files
[00:13:57] Completed 117500 out of 250000 steps  (47)
[00:57:39] Quit 101 - Fatal error: 
[00:57:39] Step 119909, time 239.818 (ps)  LINCS WARNING
[00:57:39] relative constraint deviation after LINCS:
[00:57:39] max 10583853056.000000 (between atoms 18554 and 18558) rms 93461000.000000
[00:57:39] 
[00:57:39] Simulation instability has been encountered. The run has entered a
[00:57:39]   state from which no further progress can be made.
[00:57:39] This may be the correct result of the simulation, however if you
[00:57:39]   often see other project units terminating early like this
[00:57:39]   too, you may wish to check the stability of your computer (issues
[00:57:39]   such as high temperature, overclocking, etc.).
[00:57:39] Going to send back what have done.
[00:57:39] logfile size: 29331
[00:57:39] - Writing 30034 bytes of core data to disk...
[00:57:39]   ... Done.
[00:57:39] 
[00:57:39] Folding@home Core Shutdown: EARLY_UNIT_END
[00:57:43] CoreStatus = 72 (114)
[00:57:43] Sending work to server


[00:57:43] + Attempting to send results
brityank
Posts: 161
Joined: Wed Dec 05, 2007 9:16 pm
Location: SE Pennsylvania

Re: Project: 2492 (Run 32, Clone 30, Gen 0) EUE

Post by brityank »

Just got my first EUE on this box since I've had it live, and had to check if the :twisted: Gremlins :twisted: had invaded for the winter. :wink:

Code:

[10:35:06] Completed 172500 out of 250000 steps  (69)
[10:45:55] Quit 101 - Fatal error: 
[10:45:55] Step 173183, time 346.366 (ps)  LINCS WARNING
[10:45:55] relative constraint deviation after LINCS:
[10:45:55] max 22244861809039081000000000.000000 (between atoms 18554 and 18557) rms 170078998374000580000000.000000
[10:45:55] 
[10:45:55] Simulation instability has been encountered. The run has entered a
[10:45:55]   state from which no further progress can be made.
Let me know if you need the full log, as I see there are many similar endings for this Run.
Hope they get this sim corrected; it has a great PPD return! Thanks.
... ... Free Republic Folders - A Tribute to Ronald Reagan ... ...
John_Weatherman
Posts: 289
Joined: Sun Dec 02, 2007 4:31 am
Location: Carrizo Plain National Monument, California

Re: Project: 2492 (Run 99, Clone 25, Gen 0) EUE

Post by John_Weatherman »

One more EUE - Project: 2492 (Run 103, Clone 34, Gen 0)
EUE LINCS WARNING after completing 87%

Code:

[15:58:23] - User name: John_Weatherman (Team 48913)
[15:58:23] - User ID: BC627F80FF0BC0A
[15:58:23] - Machine ID: 2
[15:58:23] 
[15:58:23] Loaded queue successfully.
[15:58:23] + Benchmarking ...
[15:58:26] The benchmark result is 6168
[15:58:26] 
[15:58:26] + Processing work unit
[15:58:26] Core required: FahCore_78.exe
[15:58:26] Core found.
[15:58:26] - Autosending finished units...
[15:58:26] Trying to send all finished work units
[15:58:26] + No unsent completed units remaining.
[15:58:26] - Autosend completed
[15:58:26] Working on Unit 00 [October 27 15:58:26]
[15:58:26] + Working ...
[15:58:26] - Calling 'FahCore_78.exe -dir work/ -suffix 00 -checkpoint 10 -service -forceasm -verbose -lifeline 1860 -version 504'

[15:58:26] 
[15:58:26] *------------------------------*
[15:58:26] Folding@Home Gromacs Core
[15:58:26] Version 1.90 (March 8, 2006)
[15:58:26] 
[15:58:26] Preparing to commence simulation
[15:58:26] - Assembly optimizations manually forced on.
[15:58:26] - Not checking prior termination.
[15:58:34] - Expanded 2211037 -> 15087213 (decompressed 682.3 percent)
[15:58:35] 
[15:58:35] Project: 2492 (Run 103, Clone 34, Gen 0)
[15:58:35] 
[15:58:39] Assembly optimizations on if available.
[15:58:39] Entering M.D.
[15:59:03] (Starting from checkpoint)
[15:59:03] Protein: system
[15:59:03] 
[15:59:03] Writing local files
[15:59:03] Completed 217044 out of 250000 steps  (87)
[15:59:08] Extra SSE boost OK.
[16:09:08] Timered checkpoint triggered.
[16:15:29] Writing local files
[16:15:30] Completed 217500 out of 250000 steps  (87)
[16:25:33] Timered checkpoint triggered.
[16:36:32] Timered checkpoint triggered.
[16:46:34] Timered checkpoint triggered.
[16:56:36] Timered checkpoint triggered.
[17:07:34] Timered checkpoint triggered.
[17:17:33] Timered checkpoint triggered.
[17:27:34] Timered checkpoint triggered.
[17:33:13] Quit 101 - Fatal error: 
[17:33:13] Step 219906, time 439.812 (ps)  LINCS WARNING
[17:33:13] relative constraint deviation after LINCS:
[17:33:13] max 143039883296076990000.000000 (between atoms 18524 and 18528) rms 1219981818296533000.000000
[17:33:13] 
[17:33:13] Simulation instability has been encountered. The run has entered a
[17:33:13]   state from which no further progress can be made.
[17:33:13] This may be the correct result of the simulation, however if you
[17:33:13]   often see other project units terminating early like this
[17:33:13]   too, you may wish to check the stability of your computer (issues
[17:33:13]   such as high temperature, overclocking, etc.).
[17:33:13] Going to send back what have done.
[17:33:13] logfile size: 270500
[17:33:13] - Writing 271224 bytes of core data to disk...
[17:33:13]   ... Done.
[17:33:13] 
[17:33:13] Folding@home Core Shutdown: EARLY_UNIT_END
[17:33:16] CoreStatus = 72 (114)
[17:33:16] Sending work to server


[17:33:16] + Attempting to send results
[17:33:16] - Reading file work/wuresults_00.dat from core
[17:33:16]   (Read 271224 bytes from disk)
[17:33:16] Connecting to http://171.65.103.160:8080/
[17:33:31] Posted data.
[17:33:32] Initial: 0000; - Uploaded at ~16 kB/s
[17:33:32] - Averaged speed for that direction ~38 kB/s
[17:33:32] + Results successfully sent
[17:33:32] Thank you for your contribution to Folding@Home.
I've got a new Project 2492 WU (Run 61, Clone 5, Gen 0) so I'll keep my fingers crossed!
Mizzou_Engineer
Posts: 13
Joined: Tue Dec 18, 2007 3:30 pm

Re: Project: 2492 (Run 99, Clone 25, Gen 0) EUE

Post by Mizzou_Engineer »

I have also had trouble with 2492s. 2492 (Run 22, Clone 6, Gen 0) has NaN'd:

Code:

Launch directory: /home/phillip/nfs/FAH
Executable: ./fah6
Arguments: -verbosity 9 

[14:10:08] - Ask before connecting: No
[14:10:08] - User name: Mizzou_Engineer (Team 34106)
[14:10:08] - User ID: 3BAC6955549B0C2A
[14:10:08] - Machine ID: 2
[14:10:08] 
[14:10:08] Loaded queue successfully.
[14:10:08] - Autosending finished units...
[14:10:08] Trying to send all finished work units
[14:10:08] + No unsent completed units remaining.
[14:10:08] - Autosend completed
[14:10:08] 
[14:10:08] + Processing work unit
[14:10:08] Core required: FahCore_78.exe
[14:10:08] Core found.
[14:10:08] Working on Unit 01 [October 26 14:10:08]
[14:10:08] + Working ...
[14:10:08] - Calling './FahCore_78.exe -dir work/ -suffix 01 -checkpoint 15 -verbose -lifeline 7914 -version 602'

[14:10:08] 
[14:10:08] *------------------------------*
[14:10:08] Folding@Home Gromacs Core
[14:10:08] Version 1.90 (March 8, 2006)
[14:10:08] 
[14:10:08] Preparing to commence simulation
[14:10:08] - Looking at optimizations...
[14:10:08] - Files status OK
[14:10:10] - Expanded 2213678 -> 15095697 (decompressed 681.9 percent)
[14:10:10] 
[14:10:10] Project: 2492 (Run 22, Clone 6, Gen 0)
[14:10:10] 
[14:10:10] Assembly optimizations on if available.
[14:10:10] Entering M.D.
[14:10:33] (Starting from checkpoint)
[14:10:33] Protein: system
[14:10:33] 
[14:10:33] Writing local files
[14:10:33] Completed 1568 out of 250000 steps  (1%)
[14:10:36] Extra SSE boost OK.
[14:25:35] Timered checkpoint triggered.
[14:40:35] Timered checkpoint triggered.
[14:46:34] Writing local files
[14:46:34] Completed 2500 out of 250000 steps  (1%)
[15:01:35] Timered checkpoint triggered.
[15:16:36] Timered checkpoint triggered.
[15:31:38] Timered checkpoint triggered.
[15:47:37] Timered checkpoint triggered.
[16:02:38] Timered checkpoint triggered.
[16:17:39] Timered checkpoint triggered.
[16:23:11] Writing local files
[16:23:11] Completed 5000 out of 250000 steps  (2%)
[16:38:12] Timered checkpoint triggered.
[16:53:12] Timered checkpoint triggered.
[17:08:13] Timered checkpoint triggered.
[17:24:13] Timered checkpoint triggered.
[17:39:16] Timered checkpoint triggered.
[17:54:16] Timered checkpoint triggered.
[18:09:18] Timered checkpoint triggered.
[18:12:59] Writing local files
[18:12:59] Completed 7500 out of 250000 steps  (3%)
[18:28:00] Timered checkpoint triggered.
[18:43:01] Timered checkpoint triggered.
[18:58:02] Timered checkpoint triggered.
[19:13:03] Timered checkpoint triggered.
[19:28:03] Timered checkpoint triggered.
[19:43:05] Timered checkpoint triggered.
[19:49:12] Writing local files
[19:49:13] Completed 10000 out of 250000 steps  (4%)
[20:04:14] Timered checkpoint triggered.
[20:10:08] - Autosending finished units...
[20:10:08] Trying to send all finished work units
[20:10:08] + No unsent completed units remaining.
[20:10:08] - Autosend completed
[20:19:15] Timered checkpoint triggered.
[20:34:16] Timered checkpoint triggered.
[20:49:17] Timered checkpoint triggered.
[21:04:18] Timered checkpoint triggered.
[21:19:19] Timered checkpoint triggered.
[21:25:29] Writing local files
[21:25:29] Completed 12500 out of 250000 steps  (5%)
[21:40:30] Timered checkpoint triggered.
[21:55:31] Timered checkpoint triggered.
[22:08:15] Quit 101 - Fatal error: NaN detected: (ener[13])
[22:08:15] 
[22:08:15] Simulation instability has been encountered. The run has entered a
[22:08:15]   state from which no further progress can be made.
[22:08:15] This may be the correct result of the simulation, however if you
[22:08:15]   often see other project units terminating early like this
[22:08:15]   too, you may wish to check the stability of your computer (issues
[22:08:15]   such as high temperature, overclocking, etc.).
[22:08:15] Going to send back what have done.
[22:08:15] logfile size: 18057
[22:08:15] - Writing 18620 bytes of core data to disk...
[22:08:15]   ... Done.
[22:08:16] 
[22:08:16] Folding@home Core Shutdown: EARLY_UNIT_END
[22:08:17] CoreStatus = 72 (114)
[22:08:17] Sending work to server
[22:08:17] - Read packet limit of 540015616... Set to 524286976.
The same machine started having problems with 2485s as well:
- 2485 (Run 69, Clone 15, Gen 0): NaN detected (ener[17]) at 5% completion
- 2485 (Run 69, Clone 17, Gen 0): NaN detected (ener[13]) at 4% completion
- 2485 (Run 94, Clone 7, Gen 0): ERROR 0x0 at 59% completion, then again at 12% completion, then NaN detected (ener[13]) right at the start of the third one.
- 2484 (Run 115, Clone 8, Gen 0): NaN detected (ener[13]) at 4% completion.

The machine is running at stock speeds, and it is currently working on a new 2492 (Run 68, Clone 9, Gen 0). It has successfully returned 2493s, 4595s, and 4622s, and I haven't touched the machine since then, so I doubt it is unstable.
brityank
Posts: 161
Joined: Wed Dec 05, 2007 9:16 pm
Location: SE Pennsylvania

Re: Project: 2492 (Run 29, Clone 37, Gen 0) EUE

Post by brityank »

Got another EUE/LINCS error --

Code:

Project: 2492 (Run 29, Clone 37, Gen 0)
==================================================================
[04:59:40] + Processing work unit
[04:59:40] Core required: FahCore_78.exe
[04:59:40] Core found.
[04:59:40] Working on Unit 07 [October 28 04:59:40]
[04:59:40] + Working ...
[04:59:40] - Calling 'FahCore_78.exe -dir work/ -suffix 07 -priority 96 -checkpoint 15 -verbose -lifeline 3584 -version 504'

[04:59:40] 
[04:59:40] *------------------------------*
[04:59:40] Folding@Home Gromacs Core
[04:59:40] Version 1.90 (March 8, 2006)
[04:59:40] 
[04:59:40] Preparing to commence simulation
[04:59:40] - Looking at optimizations...
[04:59:40] - Created dyn
[04:59:40] - Files status OK
[04:59:43] - Expanded 2217905 -> 15098121 (decompressed 680.7 percent)
[04:59:43] - Starting from initial work packet
[04:59:43] 
[04:59:43] Project: 2492 (Run 29, Clone 37, Gen 0)
[04:59:43] 
[04:59:43] Assembly optimizations on if available.
[04:59:43] Entering M.D.
[04:59:50] Protein: system
[04:59:50] 
[04:59:51] Writing local files
[04:59:53] Extra SSE boost OK.
[04:59:54] Writing local files
[04:59:54] Completed 0 out of 250000 steps  (0)

~ ~ ~ ~ ~ snip ~ ~ ~ ~ ~ ~

[01:05:15] Completed 190000 out of 250000 steps  (76)
[01:21:16] Timered checkpoint triggered.
[01:32:48] Quit 101 - Fatal error: 
[01:32:48] Step 192014, time 384.028 (ps)  LINCS WARNING
[01:32:48] relative constraint deviation after LINCS:
[01:32:48] max 11672236220074689000.000000 (between atoms 18536 and 18538) rms 128784852068597760.000000
[01:32:48] 
[01:32:48] Simulation instability has been encountered. The run has entered a
[01:32:48]   state from which no further progress can be made.
[01:32:48] This may be the correct result of the simulation, however if you
[01:32:48]   often see other project units terminating early like this
[01:32:48]   too, you may wish to check the stability of your computer (issues
[01:32:48]   such as high temperature, overclocking, etc.).
[01:32:48] Going to send back what have done.
[01:32:48] logfile size: 29192
[01:32:48] - Writing 29914 bytes of core data to disk...
[01:32:48]   ... Done.
[01:32:48] 
[01:32:48] Folding@home Core Shutdown: EARLY_UNIT_END
[01:32:52] CoreStatus = 72 (114)
[01:32:52] Sending work to server


[01:32:52] + Attempting to send results
[01:32:52] - Reading file work/wuresults_07.dat from core
[01:32:52]   (Read 29914 bytes from disk)
[01:32:52] Connecting to http://171.65.103.160:8080/
[01:32:53] Posted data.
[01:32:53] Initial: 0000; - Uploaded at ~30 kB/s
[01:32:53] - Averaged speed for that direction ~38 kB/s
[01:32:53] + Results successfully sent
[01:32:53] Thank you for your contribution to Folding@Home.
==================================================================
toTOW
Site Moderator
Posts: 6349
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: Project: 2492 (Run 99, Clone 25, Gen 0) EUE

Post by toTOW »

I wonder if anyone was able to complete a single WU from this project :?
Mizzou_Engineer
Posts: 13
Joined: Tue Dec 18, 2007 3:30 pm

Re: Project: 2492 (Run 99, Clone 25, Gen 0) EUE

Post by Mizzou_Engineer »

I haven't had one successfully complete, and I have had more since my last post. 2492 (Run 68, Clone 9, Gen 0) got a NaN error (ener[17]) at 4% and 2492 (Run 2, Clone 23, Gen 0) got an ERROR 0x0 at 8%. The machine also kicked a 2484 (Run 176, Clone 27, Gen 0) WU at 1% with an ERROR 0x0.

However, there *might* be hope. I am currently working on another 2494 (Run 2, Clone 23, Gen 0) that has gotten to 19% and is still running. We'll see if it decides to finish, but it'll take about five and a half more days to see.
brityank
Posts: 161
Joined: Wed Dec 05, 2007 9:16 pm
Location: SE Pennsylvania

Re: Project: 2492 (Run 99, Clone 25, Gen 0) EUE

Post by brityank »

Mizzou_Engineer wrote: I haven't had one successfully complete, and I have had more since my last post. 2492 (Run 68, Clone 9, Gen 0) got a NaN error (ener[17]) at 4% and 2492 (Run 2, Clone 23, Gen 0) got an ERROR 0x0 at 8%. The machine also kicked a 2484 (Run 176, Clone 27, Gen 0) WU at 1% with an ERROR 0x0.

However, there *might* be hope. I am currently working on another 2494 (Run 2, Clone 23, Gen 0) that has gotten to 19% and is still running. We'll see if it decides to finish, but it'll take about five and a half more days to see.
I hit 76% on my last one, so you're not safe yet. :(

toTOW -- I also checked my logs -- I've done three and believe all three failed; two are listed above, but I don't have the RCG for my first one.
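
For anyone who wants to tally the reports in this thread, here is a minimal sketch (not an official tool) that pulls the step number and the atom pair out of pasted FAHlog text. The function name `parse_lincs` and the constant `TOTAL_STEPS` are made up for illustration; the regular expressions assume the exact line layout shown in the logs above, and 250000 is taken from the "Completed N out of 250000 steps" lines.

```python
import re
from collections import Counter

# Assumed line layouts, copied from the FAHlog excerpts in this thread.
STEP_RE = re.compile(r"Step (\d+), time ([\d.]+) \(ps\)\s+LINCS WARNING")
MAX_RE = re.compile(r"max (\S+) \(between atoms (\d+) and (\d+)\) rms (\S+)")

TOTAL_STEPS = 250000  # hypothetical constant; matches "out of 250000 steps"


def parse_lincs(log_text):
    """Return one dict per LINCS WARNING found in log_text."""
    failures = []
    step = None
    for line in log_text.splitlines():
        m = STEP_RE.search(line)
        if m:
            # Remember the step until the matching "max ..." line shows up.
            step = int(m.group(1))
            continue
        m = MAX_RE.search(line)
        if m and step is not None:
            failures.append({
                "step": step,
                "percent": 100.0 * step / TOTAL_STEPS,
                "atoms": (int(m.group(2)), int(m.group(3))),
                # Kept as text: the value can be huge, and rms can be 1.#QNAN0.
                "max_dev": m.group(1),
            })
            step = None
    return failures


def atom_counts(failures):
    """Count how often each atom index appears across all failures."""
    return Counter(a for f in failures for a in f["atoms"])


if __name__ == "__main__":
    sample = (
        "[06:24:31] Step 113244, time 226.488 (ps)  LINCS WARNING\n"
        "[06:24:31] relative constraint deviation after LINCS:\n"
        "[06:24:31] max 7164557524992.000000 (between atoms 18524 and 18528)"
        " rms 53824798720.000000\n"
    )
    for failure in parse_lincs(sample):
        print(failure)
```

Feeding it all the logs posted here and looking at `atom_counts` would show whether the same atoms keep turning up, which is the kind of pattern the project owners may want to see.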