Re: New AS testing


Joe_H
Site Admin
Posts: 7927
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Re: New AS testing

Post by Joe_H »

Some of the projects on 128.143.199.97 have been restricted for assignment due to issues with WUs being created with too many steps. There are topics on Projects 7520 & 7528 connected to that problem. Dr. Kasson is aware of the problem, but has not been able to get it fixed yet.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
arkaine23
Posts: 4
Joined: Thu Apr 17, 2008 9:28 pm

Re: New AS testing

Post by arkaine23 »

Have over a hundred i5s running 6.34 SMP. Seeing a production drop of ~20% over the last few days. Not able to babysit all of these clients, but I know it's affecting many/most. Figured it was just low/no SMP WU availability, then saw this thread.

Log from one:

[22:01:47] Folding@home Core Shutdown: FINISHED_UNIT
[22:01:50] CoreStatus = 64 (100)
[22:01:50] Sending work to server
[22:01:50] Project: 9752 (Run 2021, Clone 0, Gen 246)


[22:01:50] + Attempting to send results [October 5 22:01:50 UTC]
[22:02:03] + Results successfully sent
[22:02:03] Thank you for your contribution to Folding@Home.
[22:02:03] + Number of Units Completed: 1044

[22:02:07] - Preparing to get new work unit...
[22:02:07] Cleaning up work directory
[22:02:07] + Attempting to get work packet
[22:02:07] Passkey found
[22:02:07] - Connecting to assignment server
[22:02:08] + No appropriate work server was available; will try again in a bit.
[22:02:08] + Couldn't get work instructions.
[22:02:08] - Attempt #1 to get work failed, and no other work to do.
Waiting before retry.
[22:02:23] + Attempting to get work packet
[22:02:23] Passkey found
[22:02:23] - Connecting to assignment server
[22:02:23] + No appropriate work server was available; will try again in a bit.
[22:02:23] + Couldn't get work instructions.
[22:02:23] - Attempt #2 to get work failed, and no other work to do.
Waiting before retry.
[22:02:36] + Attempting to get work packet
[22:02:36] Passkey found
[22:02:36] - Connecting to assignment server
[22:02:37] + No appropriate work server was available; will try again in a bit.
[22:02:37] + Couldn't get work instructions.
[22:02:37] - Attempt #3 to get work failed, and no other work to do.
Waiting before retry.


It's up to 67 retries.
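
With over a hundred clients it may be easier to scan the logs than to watch each one. A minimal sketch, assuming the log lines look exactly like the excerpt above; the function name `failed_attempts` and the sample text are only illustrative and are not part of any FAH tool:

```python
import re

# Matches lines such as:
# "[22:02:08] - Attempt #1 to get work failed, and no other work to do."
ATTEMPT_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\] - Attempt #(\d+) to get work failed")


def failed_attempts(log_text):
    """Return (timestamp, attempt_number) pairs for each failed work request."""
    results = []
    for line in log_text.splitlines():
        match = ATTEMPT_RE.match(line.strip())
        if match:
            results.append((match.group(1), int(match.group(2))))
    return results


# Sample lines copied from the log excerpt above.
sample = """\
[22:02:08] + Couldn't get work instructions.
[22:02:08] - Attempt #1 to get work failed, and no other work to do.
Waiting before retry.
[22:02:23] - Attempt #2 to get work failed, and no other work to do.
Waiting before retry.
[22:02:37] - Attempt #3 to get work failed, and no other work to do.
"""

attempts = failed_attempts(sample)
print(attempts[-1])  # the most recent failed attempt in the sample
```

The last pair returned gives the current retry count and when it happened, which is enough to spot clients that are stuck waiting on the assignment server.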
Joe_H
Site Admin
Posts: 7927
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Re: New AS testing

Post by Joe_H »

Joe Coffland did post - viewtopic.php?f=24&p=279775#p279772 - that there were some issues with a new AS working with V6 clients and that he hoped that would be resolved soon. Possibly additional issues still exist; I will ask that he check on that.

P.S. Depending on your systems, they may also have been affected by one WS improperly handling returns for part of the day yesterday - viewtopic.php?f=18&t=28169. That server does have a 6.34 minimum version allowed for assignment.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Re: New AS testing

Post by bruce »

Let's assume that WUs for CPUs with higher numbers of threads are currently in limited circulation for any of the reasons given above. Let's also assume that there are plenty of WUs that can run on 12 threads (a semi-arbitrary number -- choose your own value).

It's not difficult to reconfigure a 48-way system into 4 slots using CPU:12 -- but that's assuming V7. It's quite a bit more challenging if you're running V6.

Ordinarily we do not recommend splitting a CPU up to run concurrent WUs with fewer threads, but this may be the exception.

All I can say is that the high-thread-count projects will come back online soon™ and you'll be able to switch back.
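
The arithmetic behind the 48-way into CPU:12 split mentioned above can be sketched as follows. This is only an illustration of the slot sizes; `split_into_slots` is a hypothetical helper and it does not read or write any client configuration:

```python
def split_into_slots(total_threads, threads_per_slot):
    """Split a machine's threads into equal CPU slots.

    Returns (list of slot sizes, number of threads left unassigned).
    """
    if total_threads <= 0 or threads_per_slot <= 0:
        raise ValueError("thread counts must be positive")
    full_slots, leftover = divmod(total_threads, threads_per_slot)
    return [threads_per_slot] * full_slots, leftover


slots, leftover = split_into_slots(48, 12)
print(slots, leftover)  # [12, 12, 12, 12] 0
```

On a box whose thread count is not a multiple of the slot size, the leftover value shows how many threads would sit idle unless one slot is made larger.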
toTOW
Site Moderator
Posts: 6349
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France
Contact:

Re: Re: New AS testing

Post by toTOW »

I don't know if it is related to the recent AS upgrade, or to WS updates, but the psummary ( http://fah-web.stanford.edu/new/psummaryC.html ) is broken ... projects with blank fields and a "NaN" string instead of a deadline value.

Joe, can you look at this ?

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
Dead Things
Posts: 18
Joined: Wed Jun 18, 2008 5:45 pm

Re: New AS testing

Post by Dead Things »

Just wanted to ask if someone would kindly make an announcement when SMP projects are available again. No point keeping the machines going doing nothing, so I've shut them all down.
toTOW
Site Moderator
Posts: 6349
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France
Contact:

Re: Re: New AS testing

Post by toTOW »

The quickest fix is to update your client with v7, many SMP projects are available for it ... and it's the safest way to keep contributing.

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
Dead Things
Posts: 18
Joined: Wed Jun 18, 2008 5:45 pm

Re: Re: New AS testing

Post by Dead Things »

Thanks - it's a holiday weekend here, so if we're still running dry come Tuesday, I'll look into upgrading the clients.
Grandpa_01
Posts: 1122
Joined: Wed Mar 04, 2009 7:36 am
Hardware configuration: 3 - Supermicro H8QGi-F AMD MC 6174=144 cores 2.5Ghz, 96GB G.Skill DDR3 1333Mhz Ubuntu 10.10
2 - Asus P6X58D-E i7 980X 4.4Ghz 6GB DDR3 2000 A-Data 64GB SSD Ubuntu 10.10
1 - Asus Rampage Gene III 17 970 4.3Ghz DDR3 2000 2-500GB Segate 7200.11 0-Raid Ubuntu 10.10
1 - Asus G73JH Laptop i7 740QM 1.86Ghz ATI 5870M

Re: Re: New AS testing

Post by Grandpa_01 »

toTOW wrote:The quickest fix is to update your client with v7, many SMP projects are available for it ... and it's the safest way to keep contributing.
That is not necessarily true. For those of us who have multi-socket, multi-core rigs, v7 is not the answer, since there is a limited supply of SMP WUs that will run on more than 24 cores. We can do as bruce suggested and run multiple WUs at 24 cores or fewer, but even if you do that you will still be assigned to the same server, which has a limited supply of WUs, and if you do get 3 or more you will face a very large deficit in PPD. The only viable option I have found for multi-socket rigs with 48 or more cores is v6 running the bigadv flag.
2 - SM H8QGi-F AMD 6xxx=112 cores @ 3.2 & 3.9Ghz
5 - SM X9QRI-f+ Intel 4650 = 320 cores @ 3.15Ghz
2 - I7 980X 4.4Ghz 2-GTX680
1 - 2700k 4.4Ghz GTX680
Total = 464 cores folding
Nathan_P
Posts: 1164
Joined: Wed Apr 01, 2009 9:22 pm
Hardware configuration: Asus Z8NA D6C, 2 x5670@3.2 Ghz, , 12gb Ram, GTX 980ti, AX650 PSU, win 10 (daily use)

Asus Z87 WS, Xeon E3-1230L v3, 8gb ram, KFA GTX 1080, EVGA 750ti , AX760 PSU, Mint 18.2 OS

Not currently folding
Asus Z9PE- D8 WS, 2 E5-2665@2.3 Ghz, 16Gb 1.35v Ram, Ubuntu (Fold only)
Asus Z9PA, 2 Ivy 12 core, 16gb Ram, H folding appliance (fold only)
Location: Jersey, Channel islands

Re: Re: New AS testing

Post by Nathan_P »

toTOW wrote:The quickest fix is to update your client with v7, many SMP projects are available for it ... and it's the safest way to keep contributing.
This "upgrade to v7" solution is starting to get boring. v6 is perfectly good for CPU work; no core updates have been sent out in years. The problem is a lack of SMP work for anything over 24 cores - that is what needs fixing, and the bigadv replacement WUs are not and should not be the only answer. People with multi-socket rigs are starting to get interested in FAH again; let's not send all that hardware back to BOINC or WCG.
7im
Posts: 10179
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona
Contact:

Re: New AS testing

Post by 7im »

No core updates in years, yet AVX was just recently mentioned again. But there have been assignment server updates, and V6 will never work with those newer servers.

Bollix proved V7 could run just as fast as V6 on multi socket multi core servers. So not upgrading is what's getting boring.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Re: New AS testing

Post by bruce »

Nathan_P wrote:...the problem is a lack of SMP work for anything over 24 cores - that is what needs fixing...
Large proteins don't run well on machines with a few CPUs and small proteins simply cannot run on machines with lots of cores. Nobody disputes that ">24" needs fixing, but that's an issue for the Pande Group, not the support forum. We can't do anything about the corruption that occurred or about the amount of time involved in fixing it. We can only suggest ways that YOU might get around the problem until it's fixed. If you don't choose to accept any of those suggestions, that's on you.

... or, you could be helpful and come up with some other suggestions of things that are within your power to fix. This is a community support forum, and you're knowledgeable enough to offer constructive support, too.
Grandpa_01
Posts: 1122
Joined: Wed Mar 04, 2009 7:36 am
Hardware configuration: 3 - Supermicro H8QGi-F AMD MC 6174=144 cores 2.5Ghz, 96GB G.Skill DDR3 1333Mhz Ubuntu 10.10
2 - Asus P6X58D-E i7 980X 4.4Ghz 6GB DDR3 2000 A-Data 64GB SSD Ubuntu 10.10
1 - Asus Rampage Gene III 17 970 4.3Ghz DDR3 2000 2-500GB Segate 7200.11 0-Raid Ubuntu 10.10
1 - Asus G73JH Laptop i7 740QM 1.86Ghz ATI 5870M

Re: New AS testing

Post by Grandpa_01 »

7im wrote:No core updates in years, yet AVX was just recently mentioned again. But there have been assignment server updates, and V6 will never work with those newer servers.

Bollix proved V7 could run just as fast as V6 on multi socket multi core servers. So not upgrading is what's getting boring.
Below are the HFM logs from both the {H} v6 and the PG v7; they are not even close.


 Project ID: 8106
 Core: GRO_A5
 Credit: 5856
 Frames: 100


 Name: Core32 Slot 00
 Path: 10.0.0.10-36330  (v7 of FAH)
 Number of Frames Observed: 209

 Min. Time / Frame : 00:08:41 - 162,365.0 PPD
 Avg. Time / Frame : 00:09:09 - 150,103.4 PPD


 Name: Core321 (v6 of FAH)
 Path: \\CORE32\fah\  (v6 of FAH)
 Number of Frames Observed: 300

 Min. Time / Frame : 00:05:20 - 337,305.9 PPD
 Avg. Time / Frame : 00:05:36 - 313,501.8 PPD
 Cur. Time / Frame : 00:05:41 - 308,110.1 PPD
 R3F. Time / Frame : 00:05:38 - 310,969.1 PPD
 All  Time / Frame : 00:05:37 - 311,933.6 PPD
 Eff. Time / Frame : 00:05:37 - 311,933.6 PPD


 Name: Musky1
 Path: \\SCOTTY\fah\  (v6 of FAH)
 Number of Frames Observed: 300

 Min. Time / Frame : 00:05:04 - 364,282.7 PPD
 Avg. Time / Frame : 00:05:26 - 328,036.7 PPD
 Cur. Time / Frame : 00:05:29 - 322,385.0 PPD
 R3F. Time / Frame : 00:05:26 - 326,625.6 PPD
 All  Time / Frame : 00:05:26 - 326,625.6 PPD
 Eff. Time / Frame : 00:05:46 - 299,999.4 PPD


 Name: Patriot Slot 00
 Path: 10.0.0.17-36330  (v7 of FAH)
 Number of Frames Observed: 108

 Min. Time / Frame : 00:08:23 - 171,157.9 PPD
 Avg. Time / Frame : 00:08:36 - 164,730.6 PPD


 Name: Patriot1
 Path: \\PATRIOT\fah\  (v6 of FAH)
 Number of Frames Observed: 300

 Min. Time / Frame : 00:05:15 - 345,368.8 PPD
 Avg. Time / Frame : 00:05:28 - 325,041.1 PPD
 Cur. Time / Frame : 00:05:33 - 313,440.6 PPD
 R3F. Time / Frame : 00:05:51 - 293,306.5 PPD
 All  Time / Frame : 00:05:46 - 298,672.2 PPD
 Eff. Time / Frame : 00:05:52 - 292,253.2 PPD


 Name: tear1
 Path: \\TEAR\fah\ (v6 of FAH)
 Number of Frames Observed: 210

 Min. Time / Frame : 00:05:23 - 332,617.5 PPD
 Avg. Time / Frame : 00:05:30 - 322,090.5 PPD
 Cur. Time / Frame : 00:05:30 - 320,818.1 PPD
 R3F. Time / Frame : 00:05:29 - 322,219.7 PPD
 All  Time / Frame : 00:05:29 - 322,219.7 PPD
 Eff. Time / Frame : 00:05:51 - 293,578.7 PPD


Project ID: 8108
 Core: GRO_A5
 Credit: 7349
 Frames: 100


 Name: Core321
 Path: \\CORE32\fah\ (v6 of FAH)
 Number of Frames Observed: 300

 Min. Time / Frame : 00:07:13 - 300,990.1 PPD
 Avg. Time / Frame : 00:07:27 - 286,961.0 PPD


 Name: Musky Slot 00
 Path: 10.0.0.11-36330  (v7 of FAH)
 Number of Frames Observed: 200

 Min. Time / Frame : 00:11:19 - 153,277.9 PPD
 Avg. Time / Frame : 00:11:40 - 146,432.4 PPD


 Name: Musky Slot 02
 Path: 10.0.0.11-36330  (v7 of FAH)
 Number of Frames Observed: 2

 Min. Time / Frame : 00:12:10 - 137,499.1 PPD
 Avg. Time / Frame : 00:12:14 - 136,376.6 PPD


 Name: Musky1
 Path: \\SCOTTY\fah\  (v6 of FAH)
 Number of Frames Observed: 300

 Min. Time / Frame : 00:06:45 - 332,737.3 PPD
 Avg. Time / Frame : 00:07:16 - 297,888.9 PPD


 Name: Patriot Slot 00
 Path: 10.0.0.17-36330  (v7 of FAH)
 Number of Frames Observed: 100

 Min. Time / Frame : 00:11:25 - 151,268.4 PPD
 Avg. Time / Frame : 00:11:33 - 148,656.7 PPD


 Name: Patriot1
 Path: \\PATRIOT\fah\  (v6 of FAH)
 Number of Frames Observed: 300

 Min. Time / Frame : 00:06:47 - 330,287.7 PPD
 Avg. Time / Frame : 00:07:07 - 307,356.5 PPD


 Name: tear1
 Path: \\TEAR\fah\  (v6 of FAH) 
 Number of Frames Observed: 300

 Min. Time / Frame : 00:07:04 - 310,624.2 PPD
 Avg. Time / Frame : 00:07:12 - 302,035.8 PPD
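
To put a number on "not even close", the average figures for one machine can be compared directly. A rough sketch using only the Project 8106 averages for Core32 from the first log above; `tpf_seconds` is an illustrative helper, and the logs do not say how many threads each slot was using, so this is a comparison of the posted numbers and nothing more:

```python
def tpf_seconds(hms):
    """Convert an HFM 'HH:MM:SS' time-per-frame string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Average figures copied from the Project 8106 log above.
v7_tpf, v7_ppd = tpf_seconds("00:09:09"), 150103.4  # Core32 Slot 00 (v7)
v6_tpf, v6_ppd = tpf_seconds("00:05:36"), 313501.8  # Core321 (v6)

tpf_ratio = v7_tpf / v6_tpf  # how much longer v7 takes per frame
ppd_ratio = v6_ppd / v7_ppd  # how much more PPD v6 earns

print(f"v7 TPF is {tpf_ratio:.2f}x the v6 TPF")
print(f"v6 PPD is {ppd_ratio:.2f}x the v7 PPD")
```

With these figures the v7 slot takes about 1.63 times as long per frame, while the v6 client earns about 2.09 times the PPD, so the points gap is wider than the speed gap.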
2 - SM H8QGi-F AMD 6xxx=112 cores @ 3.2 & 3.9Ghz
5 - SM X9QRI-f+ Intel 4650 = 320 cores @ 3.15Ghz
2 - I7 980X 4.4Ghz 2-GTX680
1 - 2700k 4.4Ghz GTX680
Total = 464 cores folding
Grandpa_01
Posts: 1122
Joined: Wed Mar 04, 2009 7:36 am
Hardware configuration: 3 - Supermicro H8QGi-F AMD MC 6174=144 cores 2.5Ghz, 96GB G.Skill DDR3 1333Mhz Ubuntu 10.10
2 - Asus P6X58D-E i7 980X 4.4Ghz 6GB DDR3 2000 A-Data 64GB SSD Ubuntu 10.10
1 - Asus Rampage Gene III 17 970 4.3Ghz DDR3 2000 2-500GB Segate 7200.11 0-Raid Ubuntu 10.10
1 - Asus G73JH Laptop i7 740QM 1.86Ghz ATI 5870M

Re: Re: New AS testing

Post by Grandpa_01 »

bruce wrote:
Nathan_P wrote:...the problem is a lack of SMP work for anything over 24 cores - that is what needs fixing...
Large proteins don't run well on machines with a few CPUs and small proteins simply cannot run on machines with lots of cores. Nobody disputes that ">24" needs fixing, but that's an issue for the Pande Group, not the support forum. We can't do anything about the corruption that occurred or about the amount of time involved in fixing it. We can only suggest ways that YOU might get around the problem until it's fixed. If you don't choose to accept any of those suggestions, that's on you.

... or, you could be helpful and come up with some other suggestions of things that are within your power to fix. This is a community support forum, and you're knowledgeable enough to offer constructive support, too.
bruce, there is no cure at this time. If you run 12 cores on a 64-core box you still get assigned to the same server the large proteins are on, so there is still a shortage of work. That does not even address the points deficit that comes with running them at a slower TPF: on the large boxes using v7, 2 SMP WUs = more PPD than 1, but running 3 SMP WUs = around 30% less and 4 = around 1/2. The only viable option at this time is v6 with the bigadv flag.
2 - SM H8QGi-F AMD 6xxx=112 cores @ 3.2 & 3.9Ghz
5 - SM X9QRI-f+ Intel 4650 = 320 cores @ 3.15Ghz
2 - I7 980X 4.4Ghz 2-GTX680
1 - 2700k 4.4Ghz GTX680
Total = 464 cores folding
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Re: New AS testing

Post by bruce »

Grandpa_01 wrote:bruce, there is no cure at this time. If you run 12 cores on a 64-core box you still get assigned to the same server the large proteins are on, so there is still a shortage of work. That does not even address the points deficit that comes with running them at a slower TPF: on the large boxes using v7, 2 SMP WUs = more PPD than 1, but running 3 SMP WUs = around 30% less and 4 = around 1/2. The only viable option at this time is v6 with the bigadv flag.
No doubt, but as I just said, it's not a community support issue. Only PG can do anything about it.

Once again, discussing it here will not get you anywhere because the PG members almost never read topics in this forum. They have moved their support to reddit.

I can't manufacture new projects or fix broken ones and neither can anybody else on this forum.