Protomol 10000 - shocking!

Posted: Tue Dec 29, 2009 9:47 pm
by endrik
Seeing something new, I went on to see whether ProtoMol would work on my venerable rig (OS: Win 2k SP4, CPU: Athlon 64 Clawhammer Engineering Sample @ 2.4 GHz, 1024 MB RAM).
First came a 10001 (Run 300, Clone 3, Gen 6). With a step averaging 5 minutes it would make somewhere over 100 PPD, were it not for stalls in the core. They were very deceptive: no EUEs, everything seemed to run fine, so I checked only several hours later and saw with dismay that the last two hours had been wasted because the core had simply stalled or choked, showing 100% CPU the whole time yet no progress for two hours. So I started to supervise it more closely, and indeed it kept quietly hanging every 3-4 steps, requiring a manual restart (to be precise, the client shut down all right, but the core kept running at 100% CPU and had to be killed as a process). Well, such is life.
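A stall of this kind (core busy, but no new progress lines) can in principle be spotted from the log alone. Below is a minimal sketch, assuming the "Completed N out of M steps" log format shown in this thread. The function names and the 20-minute threshold are made up for illustration and are not part of the FAH client.

```python
import re
from datetime import datetime, timedelta

# Matches progress lines such as:
# [19:22:22] Completed 40000 out of 1000000 steps (4)
PROGRESS = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\] Completed (\d+) out of (\d+) steps")

def last_progress_time(log_lines):
    """Return the time of day of the last progress line, or None if there is none."""
    last = None
    for line in log_lines:
        m = PROGRESS.match(line)
        if m:
            last = datetime.strptime(m.group(1), "%H:%M:%S")
    return last

def looks_stalled(log_lines, now, limit=timedelta(minutes=20)):
    """True if the last progress line is older than `limit` (hypothetical threshold)."""
    last = last_progress_time(log_lines)
    if last is None:
        return False
    # Log timestamps carry no date, so wrap around midnight.
    gap = (now - last) % timedelta(days=1)
    return gap > limit

sample = [
    "[19:17:29] Completed 30000 out of 1000000 steps (3)",
    "[19:22:22] Completed 40000 out of 1000000 steps (4)",
]
now = datetime.strptime("21:30:00", "%H:%M:%S")
print(looks_stalled(sample, now))  # True: over two hours without progress
```

With 5-minute steps, a threshold of a few step lengths should avoid false alarms, but the right value is a judgment call.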
Displeased, I almost changed advmethods back to "no", but decided to give it another try and was ASTOUNDED:

--- Opening Log file [December 29 19:02:49] 


# Windows Console Edition #####################################################
###############################################################################
                       Folding@Home Client Version 5.04beta
                          http://folding.stanford.edu
###############################################################################
###############################################################################

Launch directory: C:\Documents and Settings\JJR\Pulpit\plupit\FAH\FAHfut4
Executable: C:\Documents and Settings\JJR\Pulpit\plupit\FAH\FAHfut4\FAH504-Console.exe
Arguments: -config 

[19:02:49] - Ask before connecting: Yes
[19:02:49] - User name: endrik (Team 276)
[19:02:49] - User ID: 216BCD1453BB7B92
[19:02:49] - Machine ID: 1
[19:02:49] 
[19:02:49] Configuring Folding@Home...


[19:03:10] - Ask before connecting: Yes
[19:03:10] - User name: endrik (Team 276)
[19:03:10] - User ID: 216BCD1453BB7B92
[19:03:10] - Machine ID: 1
[19:03:10] 
[19:03:11] Loaded queue successfully.
[19:03:11] + Benchmarking ...
[19:03:13] 
[19:03:13] + Processing work unit
[19:03:13] Core required: FahCore_b4.exe
[19:03:13] Core found.
[19:03:13] Working on Unit 05 [December 29 19:03:13]
[19:03:13] + Working ...
[19:03:13] *********************** Log Started 29/Dec/2009 19:03:13 ***********************
[19:03:13] ************************** ProtoMol Folding@Home Core **************************
[19:03:13]   Version: 21
[19:03:13]      Type: 180
[19:03:13]      Core: ProtoMol
[19:03:13]   Website: http://folding.stanford.edu/
[19:03:13] Copyright: (c) 2009 Stanford University
[19:03:13]    Author: Joseph Coffland <joseph@cauldrondevelopment.com>
[19:03:13]      Args: -dir work/ -suffix 05 -checkpoint 4 -lifeline 396 -version 504
[19:03:13] 
[19:03:13] ************************************ Build *************************************
[19:03:13]      Date: Dec 24 2009
[19:03:13]      Time: 14:36:31
[19:03:13]  Revision: 1748
[19:03:13]  Compiler: Intel(R) C++ MSVC 1500 mode 1110
[19:03:13]   Options: /TP /nologo /EHsc /wd4297 /wd4103 /wd1786 /arch:IA32 /Ox
[19:03:13]            /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qrestrict /MT
[19:03:13]  Platform: Windows XP
[19:03:13]      Bits: 32
[19:03:13] ************************************ System ************************************
[19:03:13]        OS: Microsoft Windows 2000 Service Pack 4
[19:03:13]       CPU: AMD Engineering Sample
[19:03:13]    CPU ID: AuthenticAMD Family 15 Model 4 Stepping 10
[19:03:13]      CPUs: 1 Logical, 1 Physical
[19:03:13]    Memory: 1023 MB
[19:03:13] ********************************************************************************
[19:03:13] Project: 10000 (Run 629, Clone 0, Gen 6)
[19:03:13] Digital signatures verified
[19:03:14] Completed 300 out of 1000000 steps (0)
[19:07:36] Completed 10000 out of 1000000 steps (1)
[19:12:32] Completed 20000 out of 1000000 steps (2)
[19:17:29] Completed 30000 out of 1000000 steps (3)
[19:22:22] Completed 40000 out of 1000000 steps (4)
[19:25:39] Completed 50000 out of 1000000 steps (5)
[19:25:52] Completed 60000 out of 1000000 steps (6)
[19:26:05] Completed 70000 out of 1000000 steps (7)
[19:26:18] Completed 80000 out of 1000000 steps (8)
[19:26:31] Completed 90000 out of 1000000 steps (9)
[19:26:44] Completed 100000 out of 1000000 steps (10)
[19:26:57] Completed 110000 out of 1000000 steps (11)
[19:27:10] Completed 120000 out of 1000000 steps (12)
[19:27:23] Completed 130000 out of 1000000 steps (13)
[19:27:36] Completed 140000 out of 1000000 steps (14)
[19:27:49] Completed 150000 out of 1000000 steps (15)
[19:28:02] Completed 160000 out of 1000000 steps (16)
[19:28:15] Completed 170000 out of 1000000 steps (17)
[19:28:28] Completed 180000 out of 1000000 steps (18)
[19:28:41] Completed 190000 out of 1000000 steps (19)
[19:28:54] Completed 200000 out of 1000000 steps (20)
[19:29:07] Completed 210000 out of 1000000 steps (21)
[19:29:20] Completed 220000 out of 1000000 steps (22)
[19:29:33] Completed 230000 out of 1000000 steps (23)
[19:29:46] Completed 240000 out of 1000000 steps (24)
[19:29:59] Completed 250000 out of 1000000 steps (25)
[19:30:12] Completed 260000 out of 1000000 steps (26)
[19:30:25] Completed 270000 out of 1000000 steps (27)
[19:30:38] Completed 280000 out of 1000000 steps (28)
[19:30:51] Completed 290000 out of 1000000 steps (29)
[19:31:06] Completed 300000 out of 1000000 steps (30)
[19:31:19] Completed 310000 out of 1000000 steps (31)
[19:31:32] Completed 320000 out of 1000000 steps (32)
[19:31:46] Completed 330000 out of 1000000 steps (33)
[19:32:00] Completed 340000 out of 1000000 steps (34)
[19:32:13] Completed 350000 out of 1000000 steps (35)
[19:32:27] Completed 360000 out of 1000000 steps (36)
[19:32:40] Completed 370000 out of 1000000 steps (37)
[19:32:53] Completed 380000 out of 1000000 steps (38)
[19:33:06] Completed 390000 out of 1000000 steps (39)
[19:33:19] Completed 400000 out of 1000000 steps (40)
[19:33:32] Completed 410000 out of 1000000 steps (41)
[19:33:45] Completed 420000 out of 1000000 steps (42)
[19:33:58] Completed 430000 out of 1000000 steps (43)
[19:34:11] Completed 440000 out of 1000000 steps (44)
[19:34:24] Completed 450000 out of 1000000 steps (45)
[19:34:37] Completed 460000 out of 1000000 steps (46)
[19:34:50] Completed 470000 out of 1000000 steps (47)
[19:35:03] Completed 480000 out of 1000000 steps (48)
[19:35:16] Completed 490000 out of 1000000 steps (49)
[19:35:29] Completed 500000 out of 1000000 steps (50)
[19:35:42] Completed 510000 out of 1000000 steps (51)
[19:35:55] Completed 520000 out of 1000000 steps (52)
[19:36:09] Completed 530000 out of 1000000 steps (53)
[19:36:23] Completed 540000 out of 1000000 steps (54)
[19:36:35] Completed 550000 out of 1000000 steps (55)
[19:36:49] Completed 560000 out of 1000000 steps (56)
[19:37:02] Completed 570000 out of 1000000 steps (57)
[19:37:15] Completed 580000 out of 1000000 steps (58)
[19:37:28] Completed 590000 out of 1000000 steps (59)
[19:37:42] Completed 600000 out of 1000000 steps (60)
[19:37:55] Completed 610000 out of 1000000 steps (61)
[19:38:08] Completed 620000 out of 1000000 steps (62)
[19:38:21] Completed 630000 out of 1000000 steps (63)
[19:38:34] Completed 640000 out of 1000000 steps (64)
[19:38:47] Completed 650000 out of 1000000 steps (65)
[19:39:01] Completed 660000 out of 1000000 steps (66)
[19:39:14] Completed 670000 out of 1000000 steps (67)
[19:39:28] Completed 680000 out of 1000000 steps (68)
[19:39:42] Completed 690000 out of 1000000 steps (69)
[19:39:56] Completed 700000 out of 1000000 steps (70)
[19:40:12] Completed 710000 out of 1000000 steps (71)
[19:40:25] Completed 720000 out of 1000000 steps (72)
[19:40:39] Completed 730000 out of 1000000 steps (73)
[19:40:53] Completed 740000 out of 1000000 steps (74)
[19:41:07] Completed 750000 out of 1000000 steps (75)
[19:41:21] Completed 760000 out of 1000000 steps (76)
[19:41:34] Completed 770000 out of 1000000 steps (77)
[19:41:48] Completed 780000 out of 1000000 steps (78)
[19:42:02] Completed 790000 out of 1000000 steps (79)
[19:42:16] Completed 800000 out of 1000000 steps (80)
[19:42:30] Completed 810000 out of 1000000 steps (81)
[19:42:43] Completed 820000 out of 1000000 steps (82)
[19:42:57] Completed 830000 out of 1000000 steps (83)
[19:43:11] Completed 840000 out of 1000000 steps (84)
[19:43:25] Completed 850000 out of 1000000 steps (85)
[19:43:40] Completed 860000 out of 1000000 steps (86)
[19:43:54] Completed 870000 out of 1000000 steps (87)
[19:44:08] Completed 880000 out of 1000000 steps (88)
[19:44:22] Completed 890000 out of 1000000 steps (89)
[19:44:37] Completed 900000 out of 1000000 steps (90)
[19:44:51] Completed 910000 out of 1000000 steps (91)
[19:45:05] Completed 920000 out of 1000000 steps (92)
[19:45:21] Completed 930000 out of 1000000 steps (93)
[19:45:35] Completed 940000 out of 1000000 steps (94)
[19:45:49] Completed 950000 out of 1000000 steps (95)
[19:46:03] Completed 960000 out of 1000000 steps (96)
[19:46:17] Completed 970000 out of 1000000 steps (97)
[19:46:32] Completed 980000 out of 1000000 steps (98)
Can you see that? No change to my rig, and all of a sudden it takes only some 15 seconds per step, with even more points than a 10001. That would be well over 2500 PPD on my aged single-core Athlon :!: :!: :!:
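For anyone who wants to check the step times rather than eyeball them, here is a small sketch that turns the progress lines into seconds per step. It assumes the log format shown above; `frame_seconds` is a made-up helper name, not a FAH tool.

```python
import re
from datetime import datetime, timedelta

LINE = re.compile(
    r"^\[(\d{2}:\d{2}:\d{2})\] Completed (\d+) out of (\d+) steps \((\d+)\)"
)

def frame_seconds(log_lines):
    """Return [(percent, seconds since the previous progress line), ...]."""
    out = []
    prev = None
    for line in log_lines:
        m = LINE.match(line)
        if not m:
            continue
        t = datetime.strptime(m.group(1), "%H:%M:%S")
        if prev is not None:
            # Timestamps have no date, so wrap around midnight.
            delta = (t - prev) % timedelta(days=1)
            out.append((int(m.group(4)), int(delta.total_seconds())))
        prev = t
    return out

# Five lines copied from the log above, around the speed-up.
sample = [
    "[19:17:29] Completed 30000 out of 1000000 steps (3)",
    "[19:22:22] Completed 40000 out of 1000000 steps (4)",
    "[19:25:39] Completed 50000 out of 1000000 steps (5)",
    "[19:25:52] Completed 60000 out of 1000000 steps (6)",
    "[19:26:05] Completed 70000 out of 1000000 steps (7)",
]
print(frame_seconds(sample))
# [(4, 293), (5, 197), (6, 13), (7, 13)]
```

The 197-second interval for 5% suggests the speed-up kicked in partway through that step rather than exactly on a step boundary.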

So, disbelieving and agitated as hell, I went for another one and luckily got a 10000 again, but nothing extraordinary this time :( - the usual 5-minute step with a regular 105 PPD. Same compiler, same rig, same settings, yet no miracles:
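As a rough cross-check of the PPD figures: if the credit per WU is fixed, PPD should scale inversely with the time per step. The sketch below rests on that assumption plus the 105 PPD at roughly 300 seconds per step quoted here. `scaled_ppd` is a hypothetical helper, and the result ignores the slow first few steps of the fast WU.

```python
def scaled_ppd(base_ppd, base_frame_s, new_frame_s):
    """Scale a known PPD to a different step time, assuming fixed credit per WU."""
    return base_ppd * base_frame_s / new_frame_s

print(round(scaled_ppd(105, 300, 13)))  # 2423, using the ~13 s steps seen in the log
print(round(scaled_ppd(105, 300, 15)))  # 2100, using the rounder 15 s figure
```

So the fast WU lands in the low-to-mid 2000s by this estimate, the same ballpark as the figure quoted above, give or take the rounding of the step times.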

--- Opening Log file [December 29 20:22:23] 


# Windows Console Edition #####################################################
###############################################################################

                       Folding@Home Client Version 5.04beta

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\Documents and Settings\JJR\Pulpit\plupit\FAH\FAHfut4
Executable: C:\Documents and Settings\JJR\Pulpit\plupit\FAH\FAHfut4\FAH504-Console.exe


[20:22:23] - Ask before connecting: Yes
[20:22:23] - User name: endrik (Team 276)
[20:22:23] - User ID: 216BCD1453BB7B92
[20:22:23] - Machine ID: 1
[20:22:23] 
[20:22:24] Loaded queue successfully.
[20:22:24] + Benchmarking ...
[20:22:26] 
[20:22:26] + Processing work unit
[20:22:26] Core required: FahCore_b4.exe
[20:22:26] Core found.
[20:22:26] Working on Unit 06 [December 29 20:22:26]
[20:22:26] + Working ...
[20:22:26] *********************** Log Started 29/Dec/2009 20:22:26 ***********************
[20:22:26] ************************** ProtoMol Folding@Home Core **************************
[20:22:26]   Version: 21
[20:22:26]      Type: 180
[20:22:26]      Core: ProtoMol
[20:22:26]   Website: http://folding.stanford.edu/
[20:22:26] Copyright: (c) 2009 Stanford University
[20:22:26]    Author: Joseph Coffland <joseph@cauldrondevelopment.com>
[20:22:26]      Args: -dir work/ -suffix 06 -checkpoint 4 -lifeline 1032 -version 504
[20:22:26] 
[20:22:26] ************************************ Build *************************************
[20:22:26]      Date: Dec 24 2009
[20:22:26]      Time: 14:36:31
[20:22:26]  Revision: 1748
[20:22:26]  Compiler: Intel(R) C++ MSVC 1500 mode 1110
[20:22:26]   Options: /TP /nologo /EHsc /wd4297 /wd4103 /wd1786 /arch:IA32 /Ox
[20:22:26]            /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qrestrict /MT
[20:22:26]  Platform: Windows XP
[20:22:26]      Bits: 32
[20:22:26] ************************************ System ************************************
[20:22:26]        OS: Microsoft Windows 2000 Service Pack 4
[20:22:26]       CPU: AMD Engineering Sample
[20:22:26]    CPU ID: AuthenticAMD Family 15 Model 4 Stepping 10
[20:22:26]      CPUs: 1 Logical, 1 Physical
[20:22:26]    Memory: 1023 MB
[20:22:26] ********************************************************************************
[20:22:26] Project: 10000 (Run 3319, Clone 0, Gen 5)
[20:22:26] Digital signatures verified
[20:22:27] Completed 71600 out of 1000000 steps (7)
[20:26:33] Completed 80000 out of 1000000 steps (8)
[20:31:29] Completed 90000 out of 1000000 steps (9)
[20:36:26] Completed 100000 out of 1000000 steps (10)
[20:41:28] Completed 110000 out of 1000000 steps (11)
[20:46:46] Completed 120000 out of 1000000 steps (12)
[20:51:55] Completed 130000 out of 1000000 steps (13)
[20:56:51] Completed 140000 out of 1000000 steps (14)
[21:01:51] Completed 150000 out of 1000000 steps (15)
[21:06:52] Completed 160000 out of 1000000 steps (16)
[21:11:55] Completed 170000 out of 1000000 steps (17)
[21:16:55] Completed 180000 out of 1000000 steps (18)


Well, at least it isn't hanging so far, but that remains to be seen :?

Now, does anybody have an idea what's going on? Did the core miraculously start using SSE4 on my CPU, or what?
To be sure, I checked the logs again and found one interesting instance in that first, troublesome WU:

[20:19:56] *********************** Log Started 28/Dec/2009 20:19:56 ***********************
[20:19:56] ************************** ProtoMol Folding@Home Core **************************
[20:19:56]   Version: 21
[20:19:56]      Type: 180
[20:19:56]      Core: ProtoMol
[20:19:56]   Website: http://folding.stanford.edu/
[20:19:56] Copyright: (c) 2009 Stanford University
[20:19:56]    Author: Joseph Coffland <joseph@cauldrondevelopment.com>
[20:19:56]      Args: -dir work/ -suffix 00 -checkpoint 20 -lifeline 1512 -version 504
[20:19:56] 
[20:19:56] ************************************ Build *************************************
[20:19:56]      Date: Dec 24 2009
[20:19:56]      Time: 14:36:31
[20:19:56]  Revision: 1748
[20:19:56]  Compiler: Intel(R) C++ MSVC 1500 mode 1110
[20:19:56]   Options: /TP /nologo /EHsc /wd4297 /wd4103 /wd1786 /arch:IA32 /Ox
[20:19:56]            /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qrestrict /MT
[20:19:56]  Platform: Windows XP
[20:19:56]      Bits: 32
[20:19:56] ************************************ System ************************************
[20:19:56]        OS: Microsoft Windows 2000 Service Pack 4
[20:19:56]       CPU: AMD Engineering Sample
[20:19:56]    CPU ID: AuthenticAMD Family 15 Model 4 Stepping 10
[20:19:56]      CPUs: 1 Logical, 1 Physical
[20:19:56]    Memory: 1023 MB
[20:19:56] ********************************************************************************
[20:19:56] Project: 10001 (Run 300, Clone 3, Gen 6)
[20:19:56] Digital signatures verified
[20:19:57] Completed 7900 out of 200000 steps (3)
[20:20:15] Completed 8000 out of 200000 steps (4)
[20:25:15] Completed 10000 out of 200000 steps (5)
[20:31:05] Completed 12000 out of 200000 steps (6)


Looks like it started promisingly with a 15-second step, but then returned to the normal 5 minutes and stayed like that till the end.
Again - any explanation?

Re: Protomol 10000 - shocking!

Posted: Wed Dec 30, 2009 12:17 am
by Pette Broad
Can't explain the very fast WU, could do with a few of those myself. :) However, if you look closely at the last log you will see

[20:19:57] Completed 7900 out of 200000 steps (3)
[20:20:15] Completed 8000 out of 200000 steps (4)
[20:25:15] Completed 10000 out of 200000 steps (5)

From 7900 to 8000 isn't a full step; that would be from 6000 to 8000...

Pete

Re: Protomol 10000 - shocking!

Posted: Wed Dec 30, 2009 12:44 am
by endrik
Oops, you are right - I was too excited to look closely. Still, the other thing I witnessed live on screen, sitting with my mouth wide open :)

Re: Protomol 10000 - shocking!

Posted: Wed Dec 30, 2009 5:45 am
by bruce
Pette Broad wrote:Can't explain the very fast WU, could do with a few of those myself. :) However, if you look closely at the last log you will see

[20:19:57] Completed 7900 out of 200000 steps (3)
[20:20:15] Completed 8000 out of 200000 steps (4)
[20:25:15] Completed 10000 out of 200000 steps (5)

From 7900 to 8000 isn't a full step; that would be from 6000 to 8000...

Pete
This also happens with most other FAHcores. If a checkpoint is taken at some point other than precisely at an integer percentage and the client is then shut down, the first interval after a restart has already been partially processed, so it will be shorter than normal.
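The numbers in the quoted log fit this. A quick sketch of the arithmetic, assuming 100 progress intervals per WU (so 2000 steps per interval for a 200000-step WU) and a constant rate per step; `full_frame_estimate` is just an illustrative name:

```python
def full_frame_estimate(steps_done, seconds, total_steps, frames=100):
    """Extrapolate the time for a whole progress interval from a partial one."""
    steps_per_frame = total_steps / frames
    return seconds * steps_per_frame / steps_done

# 7900 -> 8000 took 18 s (20:19:57 to 20:20:15) out of 200000 total steps.
est = full_frame_estimate(steps_done=8000 - 7900, seconds=18, total_steps=200000)
print(est)  # 360.0
```

That is about 6 minutes for a full interval, in line with the 5 to 6 minutes the following intervals actually took, so the short first interval is just the restart effect.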

Re: Protomol 10000 - shocking!

Posted: Wed Dec 30, 2009 1:34 pm
by endrik
Right, but this is not the main question. Bruce, could you in any way explain the first log please?

Re: Protomol 10000 - shocking!

Posted: Wed Dec 30, 2009 2:47 pm
by Rattledagger
endrik wrote:Right, but this is not the main question. Bruce, could you in any way explain the first log please?
Well, I'm not Bruce, but let's see if I can't find a more or less "good" explanation...

Hmm, how to explain it... :idea:

Well, this has nothing to do with FAH, but you've probably heard about the LHC, the big, circular particle accelerator.
Let's say you're running a simulator of the LHC and want to track 1000 particles through 1 million circulations. But, due to your start conditions, after 50000 circulations or so, 999 of the particles crash into the wall, and you've got only a single particle left. Simulating a single particle obviously needs less CPU power than tracking 1000 particles simultaneously, so the remaining circulations complete much faster than the initial 50k. So you'll have a slow start, then suddenly a large speed-up.


In FAH the atoms don't circulate round and round, but you've also got many atoms. Now, I don't know how many atoms you've got, but let's say you've got 1000 atoms, and these atoms can bend and vibrate in various ways. Since the atoms are connected together, they can't move in just any way, and you can get various cross-links between atoms. After more or less "random", free movement, it's possible the atoms have folded into a "rigid" structure and don't want to bend away from it again, so the movements of the various atoms are more or less fixed...

Well, no idea how good this explanation is... :oops:

Hmm, another way of trying to explain it...
Let's say the 1000 atoms are just like 1000 cars, and at the start they're more or less randomly accelerating and braking along the highway, and there's no queue or anything. But, after driving for some time, there's suddenly a full stop, and the 1000 cars stand still bumper-to-bumper in a long queue. In this position, it's really only the last car in the queue that can move: it can back away, turn around and drive the opposite way, while the 999 other cars can't move until the last car has done something. So, instead of 1000 cars moving individually, only 1 car is moving.

In FAH, if the atoms have managed to fold in such a way that basically only one of the atoms can move, it's much easier to simulate, since the 999 other atoms are "locked away" and can't really do anything at the moment.... Well, they can still vibrate, but it's still easier if they can't also move in other ways...
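To put numbers on the LHC analogy, here is a toy calculation. It assumes the cost of one circulation is simply proportional to the number of particles still being tracked, and it says nothing about how ProtoMol actually works; `work_units` is a made-up name.

```python
def work_units(total_laps, crash_lap, n_start, n_after):
    """Toy cost model: one unit of work per tracked particle per lap."""
    before = crash_lap * n_start                 # all particles still in play
    after = (total_laps - crash_lap) * n_after   # only the survivors
    return before, after

before, after = work_units(1_000_000, 50_000, 1000, 1)
print(before, after)  # 50000000 950000
```

In this toy model the first 5% of the circulations account for about 98% of the total work, which gives the same slow-start-then-sprint picture as the log.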



Bloody forum, wtf isn't it possible to write without the text scrolling out of the box. :(

Re: Protomol 10000 - shocking!

Posted: Wed Dec 30, 2009 4:02 pm
by John Naylor
I don't know if it's scientifically accurate, but the first explanation seems plausible for random frame speedups :P and the second for frame slowdowns.... nice

Re: Protomol 10000 - shocking!

Posted: Wed Dec 30, 2009 6:53 pm
by bruce
endrik wrote:Right, but this is not the main question. Bruce, could you in any way explain the first log please?
No, I can't. The ProtoMol core is new to FAH. I might offer a guess or two, but I really don't know enough to comment. Rattledagger's explanation sounds as good as or better than anything I might offer.

We do know that ProtoMol is quite different than the other software that I'm familiar with.

Re: Protomol 10000 - shocking!

Posted: Wed Dec 30, 2009 7:02 pm
by bruce
Rattledagger wrote:Bloody forum, wtf isn't it possible to write without the text scrolling out of the box. :(
You must be using IE8.

My current assumption is that MS has introduced a new browser bug. Other browsers and other versions of IE don't act like that . . . . but since they're the elephant (lion?) in the jungle, the whole world will have to pay homage to their dominance and reprogram their text boxes to avoid whatever it is that is causing this problem.

Re: Protomol 10000 - shocking!

Posted: Wed Dec 30, 2009 7:40 pm
by endrik
Thanks for the answers. Rattledagger's hypothesis of 1 atom in 1000 is surely plausible, especially as I am approaching 1000 WUs done, so it was about time for a bonus like that ;) - a pity it didn't happen with a 2498 or an SMP unit though :)
(BTW that WU has 544 atoms - I've just checked)

On the other hand, I've never witnessed anything like that before, with all WUs folding at their own regular pace, so maybe that's another ProtoMol feature. One way or another, it's a great pity that it was just an unusual glitch and not something you can expect from a WU or a core :(

Re: Protomol 10000 - shocking!

Posted: Wed Dec 30, 2009 11:26 pm
by Rattledagger
bruce wrote:You must be using IE8.

My current assumption is that MS has introduced a new browser bug. Other browsers and other versions of IE don't act like that . . . . but since they're the elephant (lion?) in the jungle, the whole world will have to pay homage to their dominance and reprogram their text boxes to avoid whatever it is that is causing this problem.
Maybe not so likely, but it's also possible they've fixed a bug that the text box had been relying on...

In any case, I didn't think of it before, but using "Compatibility View" seems to have fixed the problem. :)