Re: New AS testing
Moderators: Site Moderators, FAHC Science Team
-
- Site Admin
- Posts: 7929
- Joined: Tue Apr 21, 2009 4:41 pm
- Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
- Location: W. MA
Re: Re: New AS testing
Some of the projects on 128.143.199.97 have been restricted for assignment because WUs were being created with too many steps. There are topics on Projects 7520 & 7528 connected to that problem. Dr. Kasson is aware of the problem, but has not been able to get it fixed yet.
iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Re: New AS testing
Have over a hundred i5s running 6.34 SMP. Seeing a production drop of ~20% over the last few days. Not able to babysit all of these clients, but I know it's affecting many/most.... Figured it was just low/no SMP WU availability, then saw this thread.
Log from one:
[22:01:47] Folding@home Core Shutdown: FINISHED_UNIT
[22:01:50] CoreStatus = 64 (100)
[22:01:50] Sending work to server
[22:01:50] Project: 9752 (Run 2021, Clone 0, Gen 246)
[22:01:50] + Attempting to send results [October 5 22:01:50 UTC]
[22:02:03] + Results successfully sent
[22:02:03] Thank you for your contribution to Folding@Home.
[22:02:03] + Number of Units Completed: 1044
[22:02:07] - Preparing to get new work unit...
[22:02:07] Cleaning up work directory
[22:02:07] + Attempting to get work packet
[22:02:07] Passkey found
[22:02:07] - Connecting to assignment server
[22:02:08] + No appropriate work server was available; will try again in a bit.
[22:02:08] + Couldn't get work instructions.
[22:02:08] - Attempt #1 to get work failed, and no other work to do.
Waiting before retry.
[22:02:23] + Attempting to get work packet
[22:02:23] Passkey found
[22:02:23] - Connecting to assignment server
[22:02:23] + No appropriate work server was available; will try again in a bit.
[22:02:23] + Couldn't get work instructions.
[22:02:23] - Attempt #2 to get work failed, and no other work to do.
Waiting before retry.
[22:02:36] + Attempting to get work packet
[22:02:36] Passkey found
[22:02:36] - Connecting to assignment server
[22:02:37] + No appropriate work server was available; will try again in a bit.
[22:02:37] + Couldn't get work instructions.
[22:02:37] - Attempt #3 to get work failed, and no other work to do.
Waiting before retry.
It's up to 67 retries.
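With over a hundred clients, reading each log by hand is impractical. A minimal sketch of how the failed requests could be counted, assuming the v6 log lines look exactly like the ones above; the function name `failed_attempts` and the sample text are illustrative, not part of any FAH tool:

```python
import re

# Matches the failure lines shown in the log above, e.g.
# "[22:02:08] - Attempt #1 to get work failed, and no other work to do."
ATTEMPT_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\] - Attempt #(\d+) to get work failed")


def failed_attempts(log_text):
    """Return (timestamp, attempt_number) for each failed work request."""
    results = []
    for line in log_text.splitlines():
        match = ATTEMPT_RE.match(line.strip())
        if match:
            results.append((match.group(1), int(match.group(2))))
    return results


# Shortened copy of the log excerpt above (hypothetical test input).
SAMPLE_LOG = """\
[22:02:07] - Connecting to assignment server
[22:02:08] + No appropriate work server was available; will try again in a bit.
[22:02:08] + Couldn't get work instructions.
[22:02:08] - Attempt #1 to get work failed, and no other work to do.
Waiting before retry.
[22:02:23] - Attempt #2 to get work failed, and no other work to do.
Waiting before retry.
[22:02:37] - Attempt #3 to get work failed, and no other work to do.
"""

if __name__ == "__main__":
    attempts = failed_attempts(SAMPLE_LOG)
    print(f"{len(attempts)} failed attempts, latest: #{attempts[-1][1]} at {attempts[-1][0]}")
```

Run against each client's log file, the highest attempt number gives a quick indication of which machines are sitting idle.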
-
- Site Admin
- Posts: 7929
- Joined: Tue Apr 21, 2009 4:41 pm
- Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
- Location: W. MA
Re: Re: New AS testing
Joe Coffland did post (viewtopic.php?f=24&p=279775#p279772) that there were some issues with a new AS working with V6 clients and that he hoped that would be resolved soon. Possibly additional issues still exist; I will ask that he check on that.
P.S. Depending on your systems, they may also have been affected by one WS improperly handling returns part of the day yesterday - viewtopic.php?f=18&t=28169. That server does have a 6.34 minimum version allowed for assignment.
iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Re: Re: New AS testing
Let's assume that WUs for CPUs with higher numbers of threads are currently in limited circulation for any of the reasons given above. Let's also assume that there are plenty of WUs that can run on 12 threads (a semi-arbitrary number -- choose your own value).
It's not difficult to reconfigure a 48-way system into 4 slots using CPU:12 -- but that's assuming V7. It's quite a bit more challenging if you're running V6.
Ordinarily we do not recommend splitting a CPU up to run concurrent WUs with fewer threads, but this may be the exception.
All I can say is that the high-thread-count projects will come back online soon™ and you'll be able to switch back.
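A rough sketch of the arithmetic behind that split, for anyone working out slot counts on other machines. The helper names are hypothetical, and the `config.xml` element names in `render_slots` are an assumption about the V7 layout, so check them against your own client's config before using the output:

```python
def split_threads(total_threads, threads_per_slot):
    """Split a machine's threads into equal CPU slots.

    Returns (slot_sizes, leftover_threads).
    """
    if total_threads <= 0 or threads_per_slot <= 0:
        raise ValueError("thread counts must be positive")
    full_slots, leftover = divmod(total_threads, threads_per_slot)
    return [threads_per_slot] * full_slots, leftover


def render_slots(slot_sizes):
    """Render the slots as a config fragment.

    ASSUMPTION: the tag and attribute names below follow the V7
    config.xml layout; verify against a config written by your client.
    """
    lines = ["<config>"]
    for slot_id, cpus in enumerate(slot_sizes):
        lines.append(f"  <slot id='{slot_id}' type='CPU'>")
        lines.append(f"    <cpus v='{cpus}'/>")
        lines.append("  </slot>")
    lines.append("</config>")
    return "\n".join(lines)


if __name__ == "__main__":
    sizes, leftover = split_threads(48, 12)
    print(render_slots(sizes))
    print(f"threads left unassigned: {leftover}")
```

For a 48-thread box and 12 threads per slot this yields four slots with nothing left over; a 64-thread box would get five slots with 4 threads unassigned.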
Posting FAH's log:
How to provide enough info to get helpful support.
-
- Site Moderator
- Posts: 6349
- Joined: Sun Dec 02, 2007 10:38 am
- Location: Bordeaux, France
Re: Re: New AS testing
I don't know if it is related to the recent AS upgrade or to WS updates, but the psummary ( http://fah-web.stanford.edu/new/psummaryC.html ) is broken ... projects with blank fields and a "NaN" string instead of the deadline value.
Joe, can you look at this ?
-
- Posts: 18
- Joined: Wed Jun 18, 2008 5:45 pm
Re: New AS testing
Just wanted to ask if someone would kindly make an announcement when SMP projects are available again. No point keeping the machines going doing nothing, so I've shut them all down.
-
- Site Moderator
- Posts: 6349
- Joined: Sun Dec 02, 2007 10:38 am
- Location: Bordeaux, France
Re: Re: New AS testing
The quickest fix is to update your client with v7, many SMP projects are available for it ... and it's the safest way to keep contributing.
-
- Posts: 18
- Joined: Wed Jun 18, 2008 5:45 pm
Re: Re: New AS testing
Thanks - it's a holiday weekend here, so if we're still running dry come Tuesday, I'll look into upgrading the clients.
-
- Posts: 1122
- Joined: Wed Mar 04, 2009 7:36 am
- Hardware configuration: 3 - Supermicro H8QGi-F AMD MC 6174=144 cores 2.5Ghz, 96GB G.Skill DDR3 1333Mhz Ubuntu 10.10
2 - Asus P6X58D-E i7 980X 4.4Ghz 6GB DDR3 2000 A-Data 64GB SSD Ubuntu 10.10
1 - Asus Rampage Gene III 17 970 4.3Ghz DDR3 2000 2-500GB Segate 7200.11 0-Raid Ubuntu 10.10
1 - Asus G73JH Laptop i7 740QM 1.86Ghz ATI 5870M
Re: Re: New AS testing
toTOW wrote: The quickest fix is to update your client with v7, many SMP projects are available for it ... and it's the safest way to keep contributing.
That is not necessarily true for those of us who have multi-socket, multi-core rigs. v7 is not the answer, since there is a limited supply of SMP WUs that will run on more than 24 cores. We can do as bruce suggested and run multiple WUs at 24 cores or fewer, but even then you will still be assigned to the same server, which has a limited supply of WUs, and if you do get 3 or more you will face a very large deficit in PPD. The only viable option I have found for multi-socket rigs with 48 or more cores is v6 running the bigadv flag.
2 - SM H8QGi-F AMD 6xxx=112 cores @ 3.2 & 3.9Ghz
5 - SM X9QRI-f+ Intel 4650 = 320 cores @ 3.15Ghz
2 - I7 980X 4.4Ghz 2-GTX680
1 - 2700k 4.4Ghz GTX680
Total = 464 cores folding
-
- Posts: 1164
- Joined: Wed Apr 01, 2009 9:22 pm
- Hardware configuration: Asus Z8NA D6C, 2 x5670@3.2 Ghz, , 12gb Ram, GTX 980ti, AX650 PSU, win 10 (daily use)
Asus Z87 WS, Xeon E3-1230L v3, 8gb ram, KFA GTX 1080, EVGA 750ti , AX760 PSU, Mint 18.2 OS
Not currently folding
Asus Z9PE- D8 WS, 2 E5-2665@2.3 Ghz, 16Gb 1.35v Ram, Ubuntu (Fold only)
Asus Z9PA, 2 Ivy 12 core, 16gb Ram, H folding appliance (fold only)
- Location: Jersey, Channel islands
Re: Re: New AS testing
toTOW wrote: The quickest fix is to update your client with v7, many SMP projects are available for it ... and it's the safest way to keep contributing.
This "upgrade to v7" solution is starting to get boring. v6 is perfectly good for CPU work; no core updates have been sent out in years. The problem is a lack of SMP work for anything over 24 cores. That is what needs fixing, and the bigadv replacement WUs are not and should not be the only answer. People with multi-socket rigs are starting to get interested in FAH again; let's not send all that hardware back to BOINC or WCG.
-
- Posts: 10179
- Joined: Thu Nov 29, 2007 4:30 pm
- Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
- Location: Arizona
Re: New AS testing
No core updates in years, yet AVX was just recently mentioned again. But there have been assignment server updates, and V6 will never work with those newer servers.
Bollix proved V7 could run just as fast as V6 on multi socket multi core servers. So not upgrading is what's getting boring.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
Re: Re: New AS testing
Nathan_P wrote: ...the problem is a lack of SMP work for anything over 24 cores - that is what needs fixing...
Large proteins don't run well on machines with a few CPUs and small proteins simply cannot run on machines with lots of cores. Nobody disputes that ">24" needs fixing, but that's an issue for the Pande Group, not the support forum. We can't do anything about the corruption that occurred or about the amount of time involved in fixing it. We can only suggest ways that YOU might get around the problem until it's fixed. If you don't choose to accept any of those suggestions, that's on you.
... or, you could be helpful and come up with some other suggestions of things that are within your power to fix. This is a community support forum, and you're knowledgeable enough to offer constructive support, too.
Posting FAH's log:
How to provide enough info to get helpful support.
-
- Posts: 1122
- Joined: Wed Mar 04, 2009 7:36 am
- Hardware configuration: 3 - Supermicro H8QGi-F AMD MC 6174=144 cores 2.5Ghz, 96GB G.Skill DDR3 1333Mhz Ubuntu 10.10
2 - Asus P6X58D-E i7 980X 4.4Ghz 6GB DDR3 2000 A-Data 64GB SSD Ubuntu 10.10
1 - Asus Rampage Gene III 17 970 4.3Ghz DDR3 2000 2-500GB Segate 7200.11 0-Raid Ubuntu 10.10
1 - Asus G73JH Laptop i7 740QM 1.86Ghz ATI 5870M
Re: New AS testing
7im wrote: No core updates in years, yet AVX was just recently mentioned again. But there have been assignment server updates, and V6 will never work with those newer servers.
Bollix proved V7 could run just as fast as V6 on multi socket multi core servers. So not upgrading is what's getting boring.
Below are the HFM logs from both the {H} v6 and the PG v7. They are not even close.
Code:
Project ID: 8106
Core: GRO_A5
Credit: 5856
Frames: 100
Name: Core32 Slot 00
Path: 10.0.0.10-36330 (v7 of FAH)
Number of Frames Observed: 209
Min. Time / Frame : 00:08:41 - 162,365.0 PPD
Avg. Time / Frame : 00:09:09 - 150,103.4 PPD
Name: Core321 (v6 of FAH)
Path: \\CORE32\fah\ (v6 of FAH)
Number of Frames Observed: 300
Min. Time / Frame : 00:05:20 - 337,305.9 PPD
Avg. Time / Frame : 00:05:36 - 313,501.8 PPD
Cur. Time / Frame : 00:05:41 - 308,110.1 PPD
R3F. Time / Frame : 00:05:38 - 310,969.1 PPD
All Time / Frame : 00:05:37 - 311,933.6 PPD
Eff. Time / Frame : 00:05:37 - 311,933.6 PPD
Name: Musky1
Path: \\SCOTTY\fah\ (v6 of FAH)
Number of Frames Observed: 300
Min. Time / Frame : 00:05:04 - 364,282.7 PPD
Avg. Time / Frame : 00:05:26 - 328,036.7 PPD
Cur. Time / Frame : 00:05:29 - 322,385.0 PPD
R3F. Time / Frame : 00:05:26 - 326,625.6 PPD
All Time / Frame : 00:05:26 - 326,625.6 PPD
Eff. Time / Frame : 00:05:46 - 299,999.4 PPD
Name: Patriot Slot 00
Path: 10.0.0.17-36330 (v7 of FAH)
Number of Frames Observed: 108
Min. Time / Frame : 00:08:23 - 171,157.9 PPD
Avg. Time / Frame : 00:08:36 - 164,730.6 PPD
Name: Patriot1
Path: \\PATRIOT\fah\ (v6 of FAH)
Number of Frames Observed: 300
Min. Time / Frame : 00:05:15 - 345,368.8 PPD
Avg. Time / Frame : 00:05:28 - 325,041.1 PPD
Cur. Time / Frame : 00:05:33 - 313,440.6 PPD
R3F. Time / Frame : 00:05:51 - 293,306.5 PPD
All Time / Frame : 00:05:46 - 298,672.2 PPD
Eff. Time / Frame : 00:05:52 - 292,253.2 PPD
Name: tear1
Path: \\TEAR\fah\ (v6 of FAH)
Number of Frames Observed: 210
Min. Time / Frame : 00:05:23 - 332,617.5 PPD
Avg. Time / Frame : 00:05:30 - 322,090.5 PPD
Cur. Time / Frame : 00:05:30 - 320,818.1 PPD
R3F. Time / Frame : 00:05:29 - 322,219.7 PPD
All Time / Frame : 00:05:29 - 322,219.7 PPD
Eff. Time / Frame : 00:05:51 - 293,578.7 PPD
Code:
Project ID: 8108
Core: GRO_A5
Credit: 7349
Frames: 100
Name: Core321
Path: \\CORE32\fah\ (v6 of FAH)
Number of Frames Observed: 300
Min. Time / Frame : 00:07:13 - 300,990.1 PPD
Avg. Time / Frame : 00:07:27 - 286,961.0 PPD
Name: Musky Slot 00
Path: 10.0.0.11-36330 (v7 of FAH)
Number of Frames Observed: 200
Min. Time / Frame : 00:11:19 - 153,277.9 PPD
Avg. Time / Frame : 00:11:40 - 146,432.4 PPD
Name: Musky Slot 02
Path: 10.0.0.11-36330 (v7 of FAH)
Number of Frames Observed: 2
Min. Time / Frame : 00:12:10 - 137,499.1 PPD
Avg. Time / Frame : 00:12:14 - 136,376.6 PPD
Name: Musky1
Path: \\SCOTTY\fah\ (v6 of FAH)
Number of Frames Observed: 300
Min. Time / Frame : 00:06:45 - 332,737.3 PPD
Avg. Time / Frame : 00:07:16 - 297,888.9 PPD
Name: Patriot Slot 00
Path: 10.0.0.17-36330 (v7 of FAH)
Number of Frames Observed: 100
Min. Time / Frame : 00:11:25 - 151,268.4 PPD
Avg. Time / Frame : 00:11:33 - 148,656.7 PPD
Name: Patriot1
Path: \\PATRIOT\fah\ (v6 of FAH)
Number of Frames Observed: 300
Min. Time / Frame : 00:06:47 - 330,287.7 PPD
Avg. Time / Frame : 00:07:07 - 307,356.5 PPD
Name: tear1
Path: \\TEAR\fah\ (v6 of FAH)
Number of Frames Observed: 300
Min. Time / Frame : 00:07:04 - 310,624.2 PPD
Avg. Time / Frame : 00:07:12 - 302,035.8 PPD
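To put a number on "not even close", here is a small sketch that compares one pair of figures from the Project 8106 listing above (the Core32 machine, average values). The function names are illustrative. Note that the listing does not state how many threads each v7 slot was given, so this compares only the reported numbers, not equal configurations:

```python
def tpf_seconds(tpf):
    """Convert an HFM-style 'HH:MM:SS' time-per-frame string to seconds."""
    hours, minutes, seconds = (int(part) for part in tpf.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Average TPF and PPD copied from the Project 8106 listing above
# (Core32 machine: v7 slot vs. v6 client).
V7 = {"tpf": "00:09:09", "ppd": 150103.4}
V6 = {"tpf": "00:05:36", "ppd": 313501.8}


def compare(v6, v7):
    """Return (tpf_ratio, ppd_ratio) of the v7 figures relative to v6."""
    tpf_ratio = tpf_seconds(v7["tpf"]) / tpf_seconds(v6["tpf"])
    ppd_ratio = v7["ppd"] / v6["ppd"]
    return tpf_ratio, ppd_ratio


if __name__ == "__main__":
    tpf_ratio, ppd_ratio = compare(V6, V7)
    print(f"v7 frame time is {tpf_ratio:.2f}x the v6 frame time")
    print(f"v7 PPD is {ppd_ratio:.0%} of v6 PPD")
```

On these figures the v7 slot takes roughly 1.6 times as long per frame and earns a little under half the PPD of the v6 client on the same machine.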
2 - SM H8QGi-F AMD 6xxx=112 cores @ 3.2 & 3.9Ghz
5 - SM X9QRI-f+ Intel 4650 = 320 cores @ 3.15Ghz
2 - I7 980X 4.4Ghz 2-GTX680
1 - 2700k 4.4Ghz GTX680
Total = 464 cores folding
-
- Posts: 1122
- Joined: Wed Mar 04, 2009 7:36 am
- Hardware configuration: 3 - Supermicro H8QGi-F AMD MC 6174=144 cores 2.5Ghz, 96GB G.Skill DDR3 1333Mhz Ubuntu 10.10
2 - Asus P6X58D-E i7 980X 4.4Ghz 6GB DDR3 2000 A-Data 64GB SSD Ubuntu 10.10
1 - Asus Rampage Gene III 17 970 4.3Ghz DDR3 2000 2-500GB Segate 7200.11 0-Raid Ubuntu 10.10
1 - Asus G73JH Laptop i7 740QM 1.86Ghz ATI 5870M
Re: Re: New AS testing
bruce wrote:
Nathan_P wrote: ...the problem is a lack of SMP work for anything over 24 cores - that is what needs fixing...
Large proteins don't run well on machines with a few CPUs and small proteins simply cannot run on machines with lots of cores. Nobody disputes that ">24" needs fixing, but that's an issue for the Pande Group, not the support forum. We can't do anything about the corruption that occurred or about the amount of time involved in fixing it. We can only suggest ways that YOU might get around the problem until it's fixed. If you don't choose to accept any of those suggestions, that's on you.
... or, you could be helpful and come up with some other suggestions of things that are within your power to fix. This is a community support forum, and you're knowledgeable enough to offer constructive support, too.
bruce, there is no cure at this time. If you run 12 cores on a 64-core box, you still get assigned to the same server that the large proteins are on, so there is still a shortage of work. And that does not even address the points deficit that comes with running them at a slower TPF: 2 SMP WUs = more PPD than 1 on the large boxes using v7, running 3 SMP WUs = around 30% less, and 4 = around 1/2. The only viable option at this time is v6 with the bigadv flag.
2 - SM H8QGi-F AMD 6xxx=112 cores @ 3.2 & 3.9Ghz
5 - SM X9QRI-f+ Intel 4650 = 320 cores @ 3.15Ghz
2 - I7 980X 4.4Ghz 2-GTX680
1 - 2700k 4.4Ghz GTX680
Total = 464 cores folding
Re: Re: New AS testing
Grandpa_01 wrote: bruce, there is no cure at this time. If you run 12 cores on a 64-core box, you still get assigned to the same server that the large proteins are on, so there is still a shortage of work. And that does not even address the points deficit that comes with running them at a slower TPF: 2 SMP WUs = more PPD than 1 on the large boxes using v7, running 3 SMP WUs = around 30% less, and 4 = around 1/2. The only viable option at this time is v6 with the bigadv flag.
No doubt, but as I just said, it's not a community support issue. Only PG can do anything about it.
Once again, discussing it here will not get you anywhere because the PG members almost never read topics in this forum. They have moved their support to reddit.
I can't manufacture new projects or fix broken ones and neither can anybody else on this forum.
Posting FAH's log:
How to provide enough info to get helpful support.