Project: 8080 (Run 51, Clone 15, Gen 6)

Moderators: Site Moderators, FAHC Science Team

HendricksSA
Posts: 339
Joined: Fri Jun 26, 2009 4:34 am

Project: 8080 (Run 51, Clone 15, Gen 6)

Post by HendricksSA »

Not sure what is going on with this WU. I downloaded it at about 18z and it simply did not start processing. The v7.3.6 client appeared to keep running, and I did not notice the problem until hours later. I rebooted my computer and unpaused the slot (I run with pause-on-start set to true). Fedora 18's ABRT (the Automatic Bug Reporting Tool) detects some sort of error and the WU never begins processing. I was not able to collect and process a backtrace because this system is a minimal install without all the debugging tools. I was able to repeat the "crash" (not sure that is the right term) several times. I finally ended up deleting the work unit to force a dump. I have never had any other problems with 80xx WUs. The machine downloaded a new 8083 WU and is happily crunching away. I sure hope this WU stays away. Log extract:

06:59:56:WU00:FS00:Sending unit results: id:00 state:SEND error:DUMPED project:8080 run:51 clone:15 gen:6 core:0xa4 unit:0x000000306652edcc51229daa78cd98fd
06:59:56:WARNING:WU00:FS00:Missing original Unit data, cannot send dump report
06:59:56:WU00:FS00:Cleaning up
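
For anyone who wants to pull the Project/Run/Clone/Gen numbers out of a client log before reporting a WU, here is a minimal Python sketch. It assumes the `project:... run:... clone:... gen:...` field layout seen in the log line above; the function name `extract_prcg` is made up for illustration and is not part of the client.

```python
import re

# Field layout taken from the "Sending unit results" log line quoted above.
PRCG_RE = re.compile(r"project:(\d+) run:(\d+) clone:(\d+) gen:(\d+)")

def extract_prcg(line):
    """Return (project, run, clone, gen) as ints, or None if the fields are absent."""
    m = PRCG_RE.search(line)
    return tuple(int(g) for g in m.groups()) if m else None

line = ("06:59:56:WU00:FS00:Sending unit results: id:00 state:SEND error:DUMPED "
        "project:8080 run:51 clone:15 gen:6 core:0xa4 "
        "unit:0x000000306652edcc51229daa78cd98fd")
prcg = extract_prcg(line)
print("P%d,R%d,C%d,G%d" % prcg)  # prints P8080,R51,C15,G6
```

Lines without those fields (for example the "Cleaning up" line) simply return None.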
bollix47
Posts: 2982
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Re: Project: 8080 (Run 51, Clone 15, Gen 6)

Post by bollix47 »

Thank you for your report.

There are numerous failures in the database and no successful completions.

The WU (P8080,R51,C15,G6) has been reported as a bad WU. Note that the list of reported WUs is stopped daily at 8am Pacific time.
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am
Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
§
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB

Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
Location: Land Of The Long White Cloud

Re: Project: 8080 (Run 51, Clone 15, Gen 6)

Post by PantherX »

By chance, do you have the previous log file? Also, how were you able to reproduce the "crash"?
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time

Welcome To The F@H Support Forum Ӂ Troubleshooting Bad WUs Ӂ Troubleshooting Server Connectivity Issues
HendricksSA
Posts: 339
Joined: Fri Jun 26, 2009 4:34 am

Re: Project: 8080 (Run 51, Clone 15, Gen 6)

Post by HendricksSA »

Here are the supporting logs. First, the download and startup of the offending WU. My guess is that the first attempt to run the core caused a crash that ABRT caught (I did not see the popup because I only found it hours later).

Code:

18:08:19:WU01:FS00:0xa4:Completed 495000 out of 500000 steps  (99%)
18:08:20:WU00:FS00:Connecting to assign3.stanford.edu:8080
18:08:20:WU00:FS00:News: Welcome to Folding@Home
18:08:20:WU00:FS00:Assigned to work server 171.67.108.60
18:08:20:WU00:FS00:Requesting new work unit for slot 00: RUNNING cpu:6 from 171.67.108.60
18:08:20:WU00:FS00:Connecting to 171.67.108.60:8080
18:08:21:WU00:FS00:Downloading 83.29KiB
18:08:21:WU00:FS00:Download complete
18:08:21:WU00:FS00:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:8080 run:51 clone:15 gen:6 core:0xa4 unit:0x000000306652edcc51229daa78cd98fd
18:11:12:WU01:FS00:0xa4:Completed 500000 out of 500000 steps  (100%)
18:11:12:WU01:FS00:0xa4:DynamicWrapper: Finished Work Unit: sleep=10000
18:11:22:WU01:FS00:0xa4:
18:11:22:WU01:FS00:0xa4:Finished Work Unit:
18:11:22:WU01:FS00:0xa4:- Reading up to 1353084 from "01/wudata_01.trr": Read 1353084
18:11:22:WU01:FS00:0xa4:trr file hash check passed.
18:11:22:WU01:FS00:0xa4:- Reading up to 1509092 from "01/wudata_01.xtc": Read 1509092
18:11:22:WU01:FS00:0xa4:xtc file hash check passed.
18:11:22:WU01:FS00:0xa4:edr file hash check passed.
18:11:22:WU01:FS00:0xa4:logfile size: 27033
18:11:22:WU01:FS00:0xa4:Leaving Run
18:11:25:WU01:FS00:0xa4:- Writing 2898033 bytes of core data to disk...
18:11:26:WU01:FS00:0xa4:Done: 2897521 -> 2809679 (compressed to 96.9 percent)
18:11:26:WU01:FS00:0xa4:  ... Done.
18:12:53:WU01:FS00:0xa4:- Shutting down core
18:12:53:WU01:FS00:0xa4:
18:12:53:WU01:FS00:0xa4:Folding@home Core Shutdown: FINISHED_UNIT
18:13:03:WU01:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
18:13:03:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:8082 run:28 clone:50 gen:3 core:0xa4 unit:0x000000066652edb3512a0df0b36b2f0d
18:13:03:WU01:FS00:Uploading 2.68MiB to 171.67.108.35
18:13:03:WU00:FS00:Starting
18:13:03:WU01:FS00:Connecting to 171.67.108.35:8080
18:13:03:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/www.stanford.edu/~pande/Linux/AMD64/Core_a4.fah/FahCore_a4 -dir 00 -suffix 01 -version 703 -lifeline 1231 -checkpoint 15 -np 6
18:13:03:WU00:FS00:Started FahCore on PID 25348
18:13:03:WU00:FS00:Core PID:25352
18:13:03:WU00:FS00:FahCore 0xa4 started
18:13:03:WU00:FS00:0xa4:
18:13:03:WU00:FS00:0xa4:*------------------------------*
18:13:03:WU00:FS00:0xa4:Folding@Home Gromacs GB Core
18:13:03:WU00:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
18:13:03:WU00:FS00:0xa4:
18:13:03:WU00:FS00:0xa4:Preparing to commence simulation
18:13:03:WU00:FS00:0xa4:- Looking at optimizations...
18:13:03:WU00:FS00:0xa4:- Created dyn
18:13:03:WU00:FS00:0xa4:- Files status OK
18:13:03:WU00:FS00:0xa4:- Expanded 84773 -> 507904 (decompressed 599.1 percent)
18:13:03:WU00:FS00:0xa4:Called DecompressByteArray: compressed_data_size=84773 data_size=507904, decompressed_data_size=507904 diff=0
18:13:03:WU00:FS00:0xa4:- Digital signature verified
18:13:03:WU00:FS00:0xa4:
18:13:03:WU00:FS00:0xa4:Project: 8080 (Run 51, Clone 15, Gen 6)
18:13:03:WU00:FS00:0xa4:
18:13:03:WU00:FS00:0xa4:Assembly optimizations on if available.
18:13:03:WU00:FS00:0xa4:Entering M.D.
18:13:09:WU01:FS00:Upload 51.31%
18:13:16:WU01:FS00:Upload complete
18:13:16:WU01:FS00:Server responded WORK_ACK (400)
18:13:16:WU01:FS00:Final credit estimate, 3954.00 points
18:13:16:WU01:FS00:Cleaning up
And then nothing happens until about 03z. It looks like the client restarted itself (I did not do it) and came up in the paused state (I have pause-on-start set to true). Then I noticed it was doing nothing, so I shut down the service and restarted it.

Code:

18:13:25:Trying to access database...
18:13:25:Successfully acquired database lock
18:13:25:Enabled folding slot 00:  PAUSED cpu:6 (paused)
******************************* Date: 2013-04-03 *******************************
03:14:04:Caught signal SIGINT(2) on PID 25358
03:14:04:Exiting, please wait. . .
03:14:06:Clean exit
On each restart I unpaused the client to start it processing the 8080 WU. It would start (Entering M.D.) and then about 5 seconds later ABRT would detect a crash in what it said was the client. All the logs looked the same until I dumped the WU.

Code:

06:42:01:Successfully acquired database lock
06:42:01:Enabled folding slot 00: PAUSED cpu:6 (paused)
06:42:39:FS00:Unpaused
06:42:39:WU00:FS00:Starting
06:42:39:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/www.stanford.edu/~pande/Linux/AMD64/Core_a4.fah/FahCore_a4 -dir 00 -suffix 01 -version 703 -lifeline 1272 -checkpoint 15 -np 6
06:42:39:WU00:FS00:Started FahCore on PID 1929
06:42:39:WU00:FS00:Core PID:1933
06:42:39:WU00:FS00:FahCore 0xa4 started
06:42:40:WU00:FS00:0xa4:
06:42:40:WU00:FS00:0xa4:*------------------------------*
06:42:40:WU00:FS00:0xa4:Folding@Home Gromacs GB Core
06:42:40:WU00:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
06:42:40:WU00:FS00:0xa4:
06:42:40:WU00:FS00:0xa4:Preparing to commence simulation
06:42:40:WU00:FS00:0xa4:- Looking at optimizations...
06:42:40:WU00:FS00:0xa4:- Files status OK
06:42:40:WU00:FS00:0xa4:- Expanded 84773 -> 507904 (decompressed 599.1 percent)
06:42:40:WU00:FS00:0xa4:Called DecompressByteArray: compressed_data_size=84773 data_size=507904, decompressed_data_size=507904 diff=0
06:42:40:WU00:FS00:0xa4:- Digital signature verified
06:42:40:WU00:FS00:0xa4:
06:42:40:WU00:FS00:0xa4:Project: 8080 (Run 51, Clone 15, Gen 6)
06:42:40:WU00:FS00:0xa4:
06:42:40:WU00:FS00:0xa4:Assembly optimizations on if available.
06:42:40:WU00:FS00:0xa4:Entering M.D.
HendricksSA
Posts: 339
Joined: Fri Jun 26, 2009 4:34 am

Re: Project: 8080 (Run 51, Clone 15, Gen 6)

Post by HendricksSA »

Here is some of the output from ABRT. Perhaps it will help.

Code:

[root@tinalinux ccpp-2013-04-03-01:51:44-2104]# cat var_log_messages
Mar 31 19:46:46 tinalinux FAHClient[1228]: Starting fahclient ... OK
Mar 31 19:57:31 tinalinux FAHClient[1213]: Starting fahclient ... OK
Apr  2 13:13:24 tinalinux kernel: [148386.230095] FAHClient[1254]: segfault at ffffffff88000ec0 ip 00007f1195c020f1 sp 00007f1195169718 error 5 in libc-2.16.so (deleted)[7f1195b7c000+1ad000]
Apr  2 13:13:24 tinalinux abrt[25356]: File '/usr/bin/FAHClient' seems to be deleted
Apr  2 13:13:25 tinalinux abrt[25356]: Saved core dump of pid 1231 (/usr/bin/FAHClient) to /var/tmp/abrt/ccpp-2013-04-02-13:13:24-1231 (92794880 bytes)
Apr  3 01:42:01 tinalinux FAHClient[1217]: Starting fahclient ... OK
Apr  3 01:43:01 tinalinux kernel: [  121.512760] FAHClient[1289]: segfault at ffffffffdd3ee010 ip 000000311d0860f1 sp 00007f852a7d5718 error 5 in libc-2.16.so[311d000000+1ad000]
Apr  3 01:43:13 tinalinux abrt[1944]: Saved core dump of pid 1272 (/usr/bin/FAHClient) to /var/tmp/abrt/ccpp-2013-04-03-01:43:01-1272 (9437605888 bytes)
Apr  3 01:45:55 tinalinux kernel: [  295.492235] FAHClient[1951]: segfault at fffffffffc000c00 ip 000000311d0860f1 sp 00007f4c0d081718 error 5 in libc-2.16.so[311d000000+1ad000]
Apr  3 01:45:55 tinalinux abrt[2085]: Saved core dump of pid 1945 (/usr/bin/FAHClient) to /var/tmp/abrt/ccpp-2013-04-03-01:45:55-1945 (63143936 bytes)
Apr  3 01:51:44 tinalinux kernel: [  644.645475] FAHClient[2109]: segfault at ffffffff813f1010 ip 000000311d0860f1 sp 00007fec1edec718 error 5 in libc-2.16.so[311d000000+1ad000]
Apr  3 01:51:57 tinalinux abrt[2370]: Saved core dump of pid 2104 (/usr/bin/FAHClient) to /var/tmp/abrt/ccpp-2013-04-03-01:51:44-2104 (10779652096 bytes)
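
The kernel "segfault at" lines above can be decoded a little further. The sketch below is only an illustration in Python (the helper name `parse_segfault` is hypothetical): it subtracts the library base address from the instruction pointer to get the offset inside libc, and splits the x86 page-fault error code into its three low bits (bit 0 = protection violation rather than page-not-present, bit 1 = write access, bit 2 = user mode).

```python
import re

# Matches the kernel "segfault at ..." lines quoted in var_log_messages above.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<lib>[^\s\[]+).*?"
    r"\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

def parse_segfault(line):
    """Decode one kernel segfault line into a dict, or return None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    err = int(m.group("err"), 16)
    return {
        "fault_addr": int(m.group("addr"), 16),
        "library": m.group("lib"),
        "offset_in_library": ip - base,           # where in the library the fault hit
        "protection_violation": bool(err & 1),    # bit 0 of the page-fault error code
        "write_access": bool(err & 2),            # bit 1: 0 means a read
        "user_mode": bool(err & 4),               # bit 2
    }

line = ("Apr  3 01:51:44 tinalinux kernel: [  644.645475] FAHClient[2109]: "
        "segfault at ffffffff813f1010 ip 000000311d0860f1 sp 00007fec1edec718 "
        "error 5 in libc-2.16.so[311d000000+1ad000]")
info = parse_segfault(line)
print(info["library"], hex(info["offset_in_library"]))  # libc-2.16.so 0x860f1
```

With "error 5" this reads as a user-mode read that hit a protection fault, and all four crashes in the log land at the same libc offset (0x860f1), which at least suggests the same code path each time.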

Code:

[root@tinalinux ccpp-2013-04-03-01:51:44-2104]# cat proc_pid_status
Name:	FAHClient
State:	D (disk sleep)
Tgid:	2104
Pid:	2104
PPid:	1236
TracerPid:	0
Uid:	991	991	991	991
Gid:	0	0	0	0
FDSize:	64
Groups:	
VmPeak:	11015180 kB
VmSize:	11015180 kB
VmLck:	       0 kB
VmPin:	       0 kB
VmHWM:	    7108 kB
VmRSS:	    7108 kB
VmData:	10983708 kB
VmStk:	     136 kB
VmExe:	    8540 kB
VmLib:	    4240 kB
VmPTE:	     188 kB
VmSwap:	       0 kB
Threads:	7
SigQ:	0/63171
SigPnd:	0000000000000000
ShdPnd:	0000000000000000
SigBlk:	0000000000005207
SigIgn:	0000000000000000
SigCgt:	0000000180005007
CapInh:	0000000000000000
CapPrm:	0000000000000000
CapEff:	0000000000000000
CapBnd:	0000003fffffffff
Seccomp:	0
Cpus_allowed:	ff
Cpus_allowed_list:	0-7
Mems_allowed:	00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
Mems_allowed_list:	0
voluntary_ctxt_switches:	1121
nonvoluntary_ctxt_switches:	32
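
To put the memory figures from proc_pid_status side by side, here is a small Python sketch (the helper `parse_kb_fields` is a made-up name, and only four of the fields are copied in). It just converts the kB values so the gap between virtual size and resident size is easier to see.

```python
def parse_kb_fields(text):
    """Collect 'Name: <number> kB' lines from /proc/<pid>/status text into a dict of ints."""
    fields = {}
    for line in text.splitlines():
        name, _, value = line.partition(":")
        parts = value.split()
        if len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields

# Values copied from the proc_pid_status dump above.
status = """\
VmPeak: 11015180 kB
VmSize: 11015180 kB
VmRSS:      7108 kB
VmData: 10983708 kB
"""

vm = parse_kb_fields(status)
size_gib = vm["VmSize"] / (1024 * 1024)   # kB -> GiB
rss_mib = vm["VmRSS"] / 1024              # kB -> MiB
print(f"VmSize = {size_gib:.2f} GiB, VmRSS = {rss_mib:.1f} MiB")
```

A virtual size of roughly 10.5 GiB against a resident set of about 7 MiB is in line with the multi-gigabyte core dumps ABRT saved, though the dump alone does not say what reserved that much address space.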

Code:

[root@tinalinux ccpp-2013-04-03-01:51:44-2104]# cat backtrace
[New LWP 2109]
[New LWP 2108]
[New LWP 2110]
[New LWP 2104]
[New LWP 2348]
[New LWP 2107]
[New LWP 2106]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/bin/FAHClient --child --lifeline 1236 /etc/fahclient/config.xml --run-as f'.
Program terminated with signal 11, Segmentation fault.
#0  __strlen_sse2 () at ../sysdeps/x86_64/strlen.S:31
31	../sysdeps/x86_64/strlen.S: No such file or directory.

Thread 7 (Thread 0x7fec24864700 (LWP 2106)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:217
No locals.
#1  0x000000000067c846 in ?? ()
No symbol table info available.
#2  0x000000000067f055 in ?? ()
No symbol table info available.
#3  0x0000000000673f56 in ?? ()
No symbol table info available.
#4  0x000000000067343a in ?? ()
No symbol table info available.
#5  0x000000311d807d15 in start_thread (arg=0x7fec24864700) at pthread_create.c:308
        __res = <optimized out>
        pd = 0x7fec24864700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140652201789184, 764956812980080379, 0, 29708512, 140652201789184, 16, -772959803758522629, 792518274064535291}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = 0
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
#6  0x000000311d0f248d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114
No locals.

Thread 6 (Thread 0x7fec1fdf2700 (LWP 2107)):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
No locals.
#1  0x000000311d809cc1 in _L_lock_885 () from /lib64/libpthread.so.0
No symbol table info available.
#2  0x000000311d809bda in __GI___pthread_mutex_lock (mutex=0x1cc33a0) at pthread_mutex_lock.c:85
        type = 30159776
        id = 2107
#3  0x0000000000679f25 in ?? ()
No symbol table info available.
#4  0x000000000042acdb in ?? ()
No symbol table info available.
#5  0x0000000000416f21 in ?? ()
No symbol table info available.
#6  0x0000000000673f56 in ?? ()
No symbol table info available.
#7  0x000000000067343a in ?? ()
No symbol table info available.
#8  0x000000311d807d15 in start_thread (arg=0x7fec1fdf2700) at pthread_create.c:308
        __res = <optimized out>
        pd = 0x7fec1fdf2700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140652123727616, 764956812980080379, 0, 210937974784, 140652123727616, 0, -773049492339339525, 792518274064535291}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = 0
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
#9  0x000000311d0f248d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114
No locals.

Thread 5 (Thread 0x7fec1ddee700 (LWP 2348)):
#0  0x000000311d80e86d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
No locals.
#1  0x00000000007153e3 in ?? ()
No symbol table info available.
#2  0x00000000006e9913 in ?? ()
No symbol table info available.
#3  0x0000000000673f56 in ?? ()
No symbol table info available.
#4  0x000000000067343a in ?? ()
No symbol table info available.
#5  0x000000311d807d15 in start_thread (arg=0x7fec1ddee700) at pthread_create.c:308
        __res = <optimized out>
        pd = 0x7fec1ddee700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140652090156800, 764956812980080379, 0, 210937974784, 140652090156800, 140651857448096, -773045087850377477, 792518274064535291}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = 0
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
#6  0x000000311d0f248d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114
No locals.

Thread 4 (Thread 0x7fec24865740 (LWP 2104)):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
No locals.
#1  0x000000311d809cc1 in _L_lock_885 () from /lib64/libpthread.so.0
No symbol table info available.
#2  0x000000311d809bda in __GI___pthread_mutex_lock (mutex=0x1cc33a0) at pthread_mutex_lock.c:85
        type = 30159776
        id = 2104
#3  0x0000000000679f25 in ?? ()
No symbol table info available.
#4  0x000000000042acdb in ?? ()
No symbol table info available.
#5  0x0000000000416f21 in ?? ()
No symbol table info available.
#6  0x000000000041b9e3 in ?? ()
No symbol table info available.
#7  0x000000000040a4c8 in ?? ()
No symbol table info available.
#8  0x000000311d021a05 in __libc_start_main (main=0x40a480, argc=9, ubp_av=0x7fff22656128, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fff22656118) at libc-start.c:225
        result = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {0, 764956812980080379, 4236096, 140733770457376, 0, 0, -764511454856392965, 792517188338274043}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x311cc0eff0 <_dl_init+160>, 0x317f93}, data = {prev = 0x0, cleanup = 0x0, canceltype = 482406384}}}
        not_first_call = <optimized out>
#9  0x000000000040a369 in ?? ()
No symbol table info available.
#10 0x00007fff22656118 in ?? ()
No symbol table info available.
#11 0x000000000000001c in ?? ()
No symbol table info available.
#12 0x0000000000000009 in ?? ()
No symbol table info available.
#13 0x00007fff22656eb3 in ?? ()
No symbol table info available.
#14 0x00007fff22656ec6 in ?? ()
No symbol table info available.
#15 0x00007fff22656ece in ?? ()
No symbol table info available.
#16 0x00007fff22656ed9 in ?? ()
No symbol table info available.
#17 0x00007fff22656ede in ?? ()
No symbol table info available.
#18 0x00007fff22656ef8 in ?? ()
No symbol table info available.
#19 0x00007fff22656f01 in ?? ()
No symbol table info available.
#20 0x00007fff22656f0b in ?? ()
No symbol table info available.
#21 0x00007fff22656f2d in ?? ()
No symbol table info available.
#22 0x0000000000000000 in ?? ()
No symbol table info available.

Thread 3 (Thread 0x7fec165ef700 (LWP 2110)):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
No locals.
#1  0x000000311d809cc1 in _L_lock_885 () from /lib64/libpthread.so.0
No symbol table info available.
#2  0x000000311d809bda in __GI___pthread_mutex_lock (mutex=0x1cc33a0) at pthread_mutex_lock.c:85
        type = 30159776
        id = 2110
#3  0x0000000000679f25 in ?? ()
No symbol table info available.
#4  0x000000000042acdb in ?? ()
No symbol table info available.
#5  0x0000000000416f21 in ?? ()
No symbol table info available.
#6  0x0000000000673f56 in ?? ()
No symbol table info available.
#7  0x000000000067343a in ?? ()
No symbol table info available.
#8  0x000000311d807d15 in start_thread (arg=0x7fec165ef700) at pthread_create.c:308
        __res = <optimized out>
        pd = 0x7fec165ef700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140651964331776, 764956812980080379, 0, 210937974784, 140651964331776, 0, -773070375007203589, 792518274064535291}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = 0
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
#9  0x000000311d0f248d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114
No locals.

Thread 2 (Thread 0x7fec1f5f1700 (LWP 2108)):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
No locals.
#1  0x000000311d809cc1 in _L_lock_885 () from /lib64/libpthread.so.0
No symbol table info available.
#2  0x000000311d809bda in __GI___pthread_mutex_lock (mutex=0x1cc33a0) at pthread_mutex_lock.c:85
        type = 30159776
        id = 2108
#3  0x0000000000679f25 in ?? ()
No symbol table info available.
#4  0x000000000042acdb in ?? ()
No symbol table info available.
#5  0x0000000000416f21 in ?? ()
No symbol table info available.
#6  0x0000000000673f56 in ?? ()
No symbol table info available.
#7  0x000000000067343a in ?? ()
No symbol table info available.
#8  0x000000311d807d15 in start_thread (arg=0x7fec1f5f1700) at pthread_create.c:308
        __res = <optimized out>
        pd = 0x7fec1f5f1700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140652115334912, 764956812980080379, 0, 210937974784, 140652115334912, 0, -773050591314096389, 792518274064535291}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = 0
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
#9  0x000000311d0f248d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114
No locals.

Thread 1 (Thread 0x7fec1edf0700 (LWP 2109)):
#0  __strlen_sse2 () at ../sysdeps/x86_64/strlen.S:31
No locals.
#1  0x0000000000570c72 in ?? ()
No symbol table info available.
#2  0x0000000000571927 in ?? ()
No symbol table info available.
#3  0x00000000005347c8 in ?? ()
No symbol table info available.
#4  0x00000000004f8a2d in ?? ()
No symbol table info available.
#5  0x00000000004ff396 in ?? ()
No symbol table info available.
#6  0x00000000004fff86 in ?? ()
No symbol table info available.
#7  0x00000000004ea490 in ?? ()
No symbol table info available.
#8  0x00000000004e612c in ?? ()
No symbol table info available.
#9  0x00000000004a0b2a in ?? ()
No symbol table info available.
#10 0x00000000004a1068 in ?? ()
No symbol table info available.
#11 0x000000000045a632 in ?? ()
No symbol table info available.
#12 0x0000000000463484 in ?? ()
No symbol table info available.
#13 0x0000000000416fcb in ?? ()
No symbol table info available.
#14 0x0000000000673f56 in ?? ()
No symbol table info available.
#15 0x000000000067343a in ?? ()
No symbol table info available.
#16 0x000000311d807d15 in start_thread (arg=0x7fec1edf0700) at pthread_create.c:308
        __res = <optimized out>
        pd = 0x7fec1edf0700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140652106942208, 764956812980080379, 0, 210937974784, 140652106942208, 0, -773051683846402309, 792518274064535291}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = 0
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
#17 0x000000311d0f248d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114
No locals.
From                To                  Syms Read   Shared Object Library
0x000000311d805790  0x000000311d8104b4  Yes         /lib64/libpthread.so.0
0x000000311d400ed0  0x000000311d4019f0  Yes         /lib64/libdl.so.2
0x0000003135801760  0x000000313580d3c0  Yes         /lib64/libbz2.so.1
0x000000311e002190  0x000000311e00e640  Yes         /lib64/libz.so.1
0x000000312105bb80  0x00000031210c10bb  Yes         /lib64/libstdc++.so.6
0x000000311e4055b0  0x000000311e46fd68  Yes         /lib64/libm.so.6
0x000000311f002a40  0x000000311f012168  Yes         /lib64/libgcc_s.so.1
0x000000311d01f1a0  0x000000311d160960  Yes         /lib64/libc.so.6
0x000000311cc00b20  0x000000311cc1a3d9  Yes         /lib64/ld-linux-x86-64.so.2
0x00007fec1fdf51e0  0x00007fec1fdfc67c  Yes         /lib64/libnss_files.so.2
$1 = 0x0
No symbol "__glib_assert_msg" in current context.
rax            0x0	0
rbx            0xffffffff813f1010	-2126573552
rcx            0x10	16
rdx            0x7fec1eded840	140652106930240
rsi            0x311d164560	210941396320
rdi            0xffffffff813f1010	-2126573552
rbp            0x0	0x0
rsp            0x7fec1edec718	0x7fec1edec718
r8             0x73	115
r9             0x8	8
r10            0x0	0
r11            0x0	0
r12            0xab09f4	11209204
r13            0x3f	63
r14            0x7fec1edec800	140652106926080
r15            0x7fec1eded800	140652106930176
rip            0x311d0860f1	0x311d0860f1 <__strlen_sse2+17>
eflags         0x10283	[ CF SF IF RF ]
cs             0x33	51
ss             0x2b	43
ds             0x0	0
es             0x0	0
fs             0x0	0
gs             0x0	0
Dump of assembler code for function __strlen_sse2:
   0x000000311d0860e0 <+0>:	xor    %rax,%rax
   0x000000311d0860e3 <+3>:	mov    %edi,%ecx
   0x000000311d0860e5 <+5>:	and    $0x3f,%ecx
   0x000000311d0860e8 <+8>:	pxor   %xmm0,%xmm0
   0x000000311d0860ec <+12>:	cmp    $0x30,%ecx
   0x000000311d0860ef <+15>:	ja     0x311d08610a <__strlen_sse2+42>
=> 0x000000311d0860f1 <+17>:	movdqu (%rdi),%xmm1
   0x000000311d0860f5 <+21>:	pcmpeqb %xmm1,%xmm0
   0x000000311d0860f9 <+25>:	pmovmskb %xmm0,%edx
   0x000000311d0860fd <+29>:	test   %edx,%edx
   0x000000311d0860ff <+31>:	jne    0x311d08617b <__strlen_sse2+155>
   0x000000311d086101 <+33>:	mov    %rdi,%rax
   0x000000311d086104 <+36>:	and    $0xfffffffffffffff0,%rax
   0x000000311d086108 <+40>:	jmp    0x311d086127 <__strlen_sse2+71>
   0x000000311d08610a <+42>:	mov    %rdi,%rax
   0x000000311d08610d <+45>:	and    $0xfffffffffffffff0,%rax
   0x000000311d086111 <+49>:	pcmpeqb (%rax),%xmm0
   0x000000311d086115 <+53>:	mov    $0xffffffff,%esi
   0x000000311d08611a <+58>:	sub    %rax,%rcx
   0x000000311d08611d <+61>:	shl    %cl,%esi
   0x000000311d08611f <+63>:	pmovmskb %xmm0,%edx
   0x000000311d086123 <+67>:	and    %esi,%edx
   0x000000311d086125 <+69>:	jne    0x311d086178 <__strlen_sse2+152>
   0x000000311d086127 <+71>:	pxor   %xmm0,%xmm0
   0x000000311d08612b <+75>:	pxor   %xmm1,%xmm1
   0x000000311d08612f <+79>:	pxor   %xmm2,%xmm2
   0x000000311d086133 <+83>:	pxor   %xmm3,%xmm3
   0x000000311d086137 <+87>:	nopw   0x0(%rax,%rax,1)
   0x000000311d086140 <+96>:	pcmpeqb 0x10(%rax),%xmm0
   0x000000311d086145 <+101>:	pmovmskb %xmm0,%edx
   0x000000311d086149 <+105>:	test   %edx,%edx
   0x000000311d08614b <+107>:	jne    0x311d086190 <__strlen_sse2+176>
   0x000000311d08614d <+109>:	pcmpeqb 0x20(%rax),%xmm1
   0x000000311d086152 <+114>:	pmovmskb %xmm1,%edx
   0x000000311d086156 <+118>:	test   %edx,%edx
   0x000000311d086158 <+120>:	jne    0x311d0861a0 <__strlen_sse2+192>
   0x000000311d08615a <+122>:	pcmpeqb 0x30(%rax),%xmm2
   0x000000311d08615f <+127>:	pmovmskb %xmm2,%edx
   0x000000311d086163 <+131>:	test   %edx,%edx
   0x000000311d086165 <+133>:	jne    0x311d0861b0 <__strlen_sse2+208>
   0x000000311d086167 <+135>:	pcmpeqb 0x40(%rax),%xmm3
   0x000000311d08616c <+140>:	pmovmskb %xmm3,%edx
   0x000000311d086170 <+144>:	lea    0x40(%rax),%rax
   0x000000311d086174 <+148>:	test   %edx,%edx
   0x000000311d086176 <+150>:	je     0x311d086140 <__strlen_sse2+96>
   0x000000311d086178 <+152>:	sub    %rdi,%rax
   0x000000311d08617b <+155>:	bsf    %rdx,%rdx
   0x000000311d08617f <+159>:	add    %rdx,%rax
   0x000000311d086182 <+162>:	retq   
   0x000000311d086183 <+163>:	data32 data32 data32 nopw %cs:0x0(%rax,%rax,1)
   0x000000311d086190 <+176>:	sub    %rdi,%rax
   0x000000311d086193 <+179>:	bsf    %rdx,%rdx
   0x000000311d086197 <+183>:	lea    0x10(%rdx,%rax,1),%rax
   0x000000311d08619c <+188>:	retq   
   0x000000311d08619d <+189>:	nopl   (%rax)
   0x000000311d0861a0 <+192>:	sub    %rdi,%rax
   0x000000311d0861a3 <+195>:	bsf    %rdx,%rdx
   0x000000311d0861a7 <+199>:	lea    0x20(%rdx,%rax,1),%rax
   0x000000311d0861ac <+204>:	retq   
   0x000000311d0861ad <+205>:	nopl   (%rax)
   0x000000311d0861b0 <+208>:	sub    %rdi,%rax
   0x000000311d0861b3 <+211>:	bsf    %rdx,%rdx
   0x000000311d0861b7 <+215>:	lea    0x30(%rdx,%rax,1),%rax
   0x000000311d0861bc <+220>:	retq   
End of assembler dump.
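
One pattern in the numbers above may be worth noting. The faulting instruction is `movdqu (%rdi),%xmm1` at `__strlen_sse2+17`, and `rdi` holds 0xffffffff813f1010, the same value the kernel reported as the fault address. The Python sketch below (helper name is hypothetical) checks whether each of the four fault addresses from var_log_messages equals the sign extension of its own low 32 bits.

```python
def is_sign_extended_32(addr):
    """True if a 64-bit value equals the sign extension of its low 32 bits."""
    low = addr & 0xFFFFFFFF
    if low & 0x80000000:
        extended = low | 0xFFFFFFFF00000000
    else:
        extended = low
    return extended == addr

# Fault addresses from the four kernel segfault lines above.
fault_addrs = [
    0xffffffff88000ec0,
    0xffffffffdd3ee010,
    0xfffffffffc000c00,
    0xffffffff813f1010,
]
results = [is_sign_extended_32(a) for a in fault_addrs]
for addr, ok in zip(fault_addrs, results):
    print(hex(addr), ok)

# Offset of the faulting instruction inside __strlen_sse2 (rip minus function start).
strlen_offset = 0x311d0860f1 - 0x311d0860e0
print(strlen_offset)  # 17
```

All four addresses pass the check. One possible reading, and it is only a guess that nothing in this thread confirms, is that a pointer was squeezed through a signed 32-bit value somewhere before being handed to strlen. A normal heap or stack pointer such as the 0x7fec1eded840 in `rdx` does not pass the same check.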
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am

Re: Project: 8080 (Run 51, Clone 15, Gen 6)

Post by PantherX »

There were multiple failures for this WU, so I have marked it as a bad WU:
The WU (P8080,R51,C15,G6) has been reported as a bad WU. Note that the list of reported WUs is stopped daily at 8am Pacific time.
BTW, were you running this as a service on the system or not?
HendricksSA
Posts: 339
Joined: Fri Jun 26, 2009 4:34 am

Re: Project: 8080 (Run 51, Clone 15, Gen 6)

Post by HendricksSA »

PantherX, yes it was running as a service. The unusual aspect for my system may be the pause on start. I do have a question for you ... can 7.3.6 restart itself? The second log seems to indicate the client detected a failure and gave the WU another attempt. The time gap between 18z and 03z would come from it starting paused until I came along about 03z. That is why I am not sure if the client crashed as ABRT seems to indicate (signal 11) or core a4 actually crashed and 7.3.6 noted it and performed some sort of unattended restart.
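
One way to check which process actually died is to compare the `-lifeline` PID on the "Running FahCore" log lines with the PID ABRT says it dumped. The Python sketch below does that for the two runs logged above. It assumes `-lifeline` names the client process the core is tied to, which the logs suggest but do not state, and the helper names are invented for illustration.

```python
import re

def lifeline_pid(fahcore_line):
    """PID passed to the core via -lifeline, or None if the option is absent."""
    m = re.search(r"-lifeline (\d+)", fahcore_line)
    return int(m.group(1)) if m else None

def dumped_process(abrt_line):
    """(pid, executable) from an ABRT 'Saved core dump' line, or None."""
    m = re.search(r"Saved core dump of pid (\d+) \(([^)]+)\)", abrt_line)
    return (int(m.group(1)), m.group(2)) if m else None

# Shortened copies of the lines quoted earlier in the thread.
runs = [
    ("18:13:03:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper ... -lifeline 1231 -checkpoint 15 -np 6",
     "Apr  2 13:13:25 tinalinux abrt[25356]: Saved core dump of pid 1231 (/usr/bin/FAHClient) to /var/tmp/abrt/ccpp-2013-04-02-13:13:24-1231 (92794880 bytes)"),
    ("06:42:39:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper ... -lifeline 1272 -checkpoint 15 -np 6",
     "Apr  3 01:43:13 tinalinux abrt[1944]: Saved core dump of pid 1272 (/usr/bin/FAHClient) to /var/tmp/abrt/ccpp-2013-04-03-01:43:01-1272 (9437605888 bytes)"),
]

matches = []
for core_line, abrt_line in runs:
    pid, exe = dumped_process(abrt_line)
    matches.append(lifeline_pid(core_line) == pid)
    print(lifeline_pid(core_line), pid, exe)
```

In both runs the dumped PID equals the lifeline PID and the executable is /usr/bin/FAHClient, which points at the client process itself segfaulting rather than FahCore_a4. The syslog timestamps are in local time while the client log is in UTC, so the two only line up once that offset is taken into account.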
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am

Re: Project: 8080 (Run 51, Clone 15, Gen 6)

Post by PantherX »

I'm not sure about the client per se. However, if something is configured as a service and something "happens" to it while it is running, it will automatically restart itself. That's what I know about Windows services, and I would think that F@H follows the same behavior.

I have 3 systems with SMP set as a service via V7.2.6 and I monitor them from a different system. I did encounter an issue where the client was stuck downloading the WU (never finished as the internet connection dropped and F@H never detected it) so I simply restarted the system and the issue corrected itself.