July 25, 2008

freebsd-stable Digest, Vol 262, Issue 8

Send freebsd-stable mailing list submissions to
freebsd-stable@freebsd.org

To subscribe or unsubscribe via the World Wide Web, visit
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
or, via email, send a message with subject or body 'help' to
freebsd-stable-request@freebsd.org

You can reach the person managing the list at
freebsd-stable-owner@freebsd.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of freebsd-stable digest..."


Today's Topics:

1. Re: cvs commit: src/contrib/pf/pfctl parse.y src/lib/libc/sys
Symbol.map getsockopt.2 src/sbin/ipfw ipfw.8 ipfw2.c src/sys/conf
NOTES options src/sys/contrib/ipfilter/netinet ip_fil_freebsd.c
src/sys/contrib/pf/net pf.c pf_ioctl.c src/sys/kern init_sysent.c
... (Mike Tancsa)
2. Re: "sleeping without queue" ? (Robert Watson)
3. RE: Fresh 7.0 Install: Fatal Trap 12 panic when put under
load (John Sullivan)
4. Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under
load (Kris Kennaway)
5. Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under
load (Chuck Swiger)
6. Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under
load (Michael Grant)
7. Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under
load (Kris Kennaway)
8. Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under
load (john@basicnets.co.uk)
9. CARP state changes and devd.conf (Sven Willenberger)
10. zfs, raidz, spare and jbod (Claus Guttesen)
11. Re: zfs, raidz, spare and jbod (Kris Kennaway)
12. Re: zfs, raidz, spare and jbod (Claus Guttesen)
13. Re: zfs, raidz, spare and jbod (Jeremy Chadwick)
14. Re: zfs, raidz, spare and jbod (Kris Kennaway)


----------------------------------------------------------------------

Message: 1
Date: Thu, 24 Jul 2008 08:50:37 -0400
From: Mike Tancsa <mike@sentex.net>
Subject: Re: cvs commit: src/contrib/pf/pfctl parse.y src/lib/libc/sys
Symbol.map getsockopt.2 src/sbin/ipfw ipfw.8 ipfw2.c src/sys/conf
NOTES options src/sys/contrib/ipfilter/netinet ip_fil_freebsd.c
src/sys/contrib/pf/net pf.c pf_ioctl.c src/sys/kern init_sysent.c ...
To: Julian Elischer <julian@freebsd.org>, freebsd-stable@freebsd.org
Message-ID: <200807241250.m6OCoa5n014019@lava.sentex.ca>
Content-Type: text/plain; charset="us-ascii"; format=flowed


This looks like a very cool feature addition to RELENG_7! Are there
any performance penalties that you know of with this built in?

---Mike

At 09:13 PM 7/23/2008, Julian Elischer wrote:
>julian 2008-07-24 01:13:22 UTC
>
> FreeBSD src repository
>
> Modified files: (Branch: RELENG_7)
> contrib/pf/pfctl parse.y
> lib/libc/sys Symbol.map getsockopt.2
> sbin/ipfw ipfw.8 ipfw2.c
> sys/conf NOTES options
> sys/contrib/ipfilter/netinet ip_fil_freebsd.c
> sys/contrib/pf/net pf.c pf_ioctl.c
> sys/kern init_sysent.c sys_socket.c syscalls.c
> syscalls.master systrace_args.c
> uipc_socket.c vfs_export.c
> sys/net if.c if_atmsubr.c if_fwsubr.c if_gif.c
> if_gif.h if_gre.c if_gre.h
> if_iso88025subr.c if_stf.c if_var.h
> route.c route.h rtsock.c
> sys/netatalk at_extern.h at_proto.c
> sys/netgraph/netflow netflow.c
> sys/netinet if_atm.c if_ether.c in_gif.c in_mcast.c
> in_pcb.c in_pcb.h in_rmx.c in_var.h
> ip_fastfwd.c ip_fw.h ip_fw2.c ip_icmp.c
> ip_input.c ip_mroute.c ip_mroute.h
> ip_options.c ip_output.c ip_var.h
> raw_ip.c sctp_os_bsd.h tcp_input.c
> tcp_subr.c tcp_syncache.c
> sys/netinet6 in6.c in6_ifattach.c in6_rmx.c nd6_rtr.c
> sys/netipx ipx_proto.c
> sys/nfs4client nfs4_vfsops.c
> sys/nfsclient bootp_subr.c nfs_vfsops.c
> sys/sys domain.h mbuf.h proc.h socket.h
> socketvar.h syscall.h syscall.mk
> sysproto.h
> usr.bin/netstat route.c
> Added files: (Branch: RELENG_7)
> usr.sbin/setfib Makefile setfib.1 setfib.c
> Log:
> SVN rev 180774 on 2008-07-24 01:13:22Z by julian
>
> MFC an ABI compatible implementation of Multiple routing tables.
> See the commit message for
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/net/route.c
> version 1.129 (svn change # 178888) for more info.
>
> Obtained from: Ironport (Cisco Systems)
>
> Revision Changes Path
> 1.8.2.1 +6 -15 src/contrib/pf/pfctl/parse.y
> 1.9.2.3 +1 -0 src/lib/libc/sys/Symbol.map
> 1.38.2.1 +7 -0 src/lib/libc/sys/getsockopt.2
> 1.203.2.6 +12 -0 src/sbin/ipfw/ipfw.8
> 1.108.2.8 +21 -2 src/sbin/ipfw/ipfw2.c
> 1.1454.2.14 +2 -0 src/sys/conf/NOTES
> 1.608.2.6 +1 -0 src/sys/conf/options
> 1.6.2.3 +3 -3 src/sys/contrib/ipfilter/netinet/ip_fil_freebsd.c
> 1.46.2.2 +33 -5 src/sys/contrib/pf/net/pf.c
> 1.28.2.3 +3 -3 src/sys/contrib/pf/net/pf_ioctl.c
> 1.230.2.2 +1 -1 src/sys/kern/init_sysent.c
> 1.73.2.1 +1 -1 src/sys/kern/sys_socket.c
> 1.214.2.2 +1 -1 src/sys/kern/syscalls.c
> 1.233.2.2 +1 -1 src/sys/kern/syscalls.master
> 1.14.2.2 +7 -0 src/sys/kern/systrace_args.c
> 1.302.2.4 +20 -0 src/sys/kern/uipc_socket.c
> 1.341.2.2 +16 -3 src/sys/kern/vfs_export.c
> 1.273.2.3 +6 -3 src/sys/net/if.c
> 1.45.2.1 +2 -1 src/sys/net/if_atmsubr.c
> 1.24.2.1 +1 -1 src/sys/net/if_fwsubr.c
> 1.66.2.2 +3 -0 src/sys/net/if_gif.c
> 1.19.2.1 +1 -0 src/sys/net/if_gif.h
> 1.46.2.2 +5 -1 src/sys/net/if_gre.c
> 1.13.10.1 +1 -0 src/sys/net/if_gre.h
> 1.75.2.1 +2 -1 src/sys/net/if_iso88025subr.c
> 1.60.2.1 +7 -2 src/sys/net/if_stf.c
> 1.115.2.2 +2 -0 src/sys/net/if_var.h
> 1.120.2.4 +355 -95 src/sys/net/route.c
> 1.65.2.2 +31 -4 src/sys/net/route.h
> 1.143.2.2 +9 -5 src/sys/net/rtsock.c
> 1.18.2.1 +1 -0 src/sys/netatalk/at_extern.h
> 1.13.2.1 +1 -1 src/sys/netatalk/at_proto.c
> 1.25.2.2 +3 -2 src/sys/netgraph/netflow/netflow.c
> 1.21.2.1 +1 -1 src/sys/netinet/if_atm.c
> 1.162.2.1 +185 -116 src/sys/netinet/if_ether.c
> 1.38.2.1 +6 -2 src/sys/netinet/in_gif.c
> 1.3.2.2 +2 -1 src/sys/netinet/in_mcast.c
> 1.196.2.4 +2 -1 src/sys/netinet/in_pcb.c
> 1.100.2.2 +1 -1 src/sys/netinet/in_pcb.h
> 1.57.2.1 +126 -28 src/sys/netinet/in_rmx.c
> 1.61.2.1 +16 -0 src/sys/netinet/in_var.h
> 1.41.2.1 +1 -1 src/sys/netinet/ip_fastfwd.c
> 1.110.2.4 +4 -0 src/sys/netinet/ip_fw.h
> 1.175.2.7 +48 -5 src/sys/netinet/ip_fw2.c
> 1.118.2.1 +12 -5 src/sys/netinet/ip_icmp.c
> 1.332.2.4 +4 -4 src/sys/netinet/ip_input.c
> 1.138.2.1 +2 -2 src/sys/netinet/ip_mroute.c
> 1.31.2.1 +1 -1 src/sys/netinet/ip_mroute.h
> 1.6.2.2 +3 -2 src/sys/netinet/ip_options.c
> 1.276.2.2 +2 -1 src/sys/netinet/ip_output.c
> 1.101.2.1 +1 -1 src/sys/netinet/ip_var.h
> 1.180.2.2 +1 -1 src/sys/netinet/raw_ip.c
> 1.33.2.1 +1 -1 src/sys/netinet/sctp_os_bsd.h
> 1.370.2.3 +1 -0 src/sys/netinet/tcp_input.c
> 1.300.2.3 +7 -1 src/sys/netinet/tcp_subr.c
> 1.130.2.8 +4 -0 src/sys/netinet/tcp_syncache.c
> 1.73.2.2 +2 -1 src/sys/netinet6/in6.c
> 1.39.2.1 +3 -3 src/sys/netinet6/in6_ifattach.c
> 1.18.2.1 +8 -4 src/sys/netinet6/in6_rmx.c
> 1.36.2.1 +2 -1 src/sys/netinet6/nd6_rtr.c
> 1.22.2.1 +11 -1 src/sys/netipx/ipx_proto.c
> 1.27.2.1 +2 -1 src/sys/nfs4client/nfs4_vfsops.c
> 1.70.2.1 +3 -2 src/sys/nfsclient/bootp_subr.c
> 1.193.2.2 +1 -0 src/sys/nfsclient/nfs_vfsops.c
> 1.22.2.1 +6 -0 src/sys/sys/domain.h
> 1.217.2.3 +20 -2 src/sys/sys/mbuf.h
> 1.491.2.5 +1 -0 src/sys/sys/proc.h
> 1.95.2.2 +1 -0 src/sys/sys/socket.h
> 1.158.2.3 +1 -0 src/sys/sys/socketvar.h
> 1.211.2.2 +1 -0 src/sys/sys/syscall.h
> 1.166.2.2 +1 -0 src/sys/sys/syscall.mk
> 1.215.2.2 +5 -0 src/sys/sys/sysproto.h
> 1.82.2.6 +25 -4 src/usr.bin/netstat/route.c
> 1.1.2.1 +6 -0 src/usr.sbin/setfib/Makefile (new)
> 1.2.2.1 +97 -0 src/usr.sbin/setfib/setfib.1 (new)
> 1.3.2.1 +103 -0 src/usr.sbin/setfib/setfib.c (new)
>_______________________________________________
>cvs-all@freebsd.org mailing list
>http://lists.freebsd.org/mailman/listinfo/cvs-all
>To unsubscribe, send any mail to "cvs-all-unsubscribe@freebsd.org"
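
For readers who want to try the merged feature, here is a minimal sketch of
exercising a second routing table. It assumes a kernel built with
"options ROUTETABLES=2" (which appears to be the option this commit adds to
sys/conf/options). The addresses, the interface name em1 and the rule number
are placeholders. The run helper is a hypothetical dry-run wrapper, not part
of FreeBSD: it only prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Kernel configuration line (not a shell command), assumed for this sketch:
#   options ROUTETABLES=2

run sysctl net.fibs                        # how many routing tables the kernel has
run setfib 1 route add default 192.0.2.1   # default route in table 1 (placeholder address)
run setfib 1 netstat -rn                   # show routes as seen from table 1
run setfib 1 ping -c 3 198.51.100.10       # run a program against table 1

# Tag packets arriving on one interface so forwarding uses table 1.
run ipfw add 100 setfib 1 ip from any to any via em1
```

Run as shown, the script only lists the commands. Setting DRY_RUN=0 on a test
machine executes them.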

------------------------------

Message: 2
Date: Thu, 24 Jul 2008 13:37:21 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
Subject: Re: "sleeping without queue" ?
To: Mikhail Teterin <mi+mill@aldan.algebra.com>
Cc: Kris Kennaway <kris@FreeBSD.org>, questions@FreeBSD.org, Jeremy
Chadwick <koitsu@FreeBSD.org>, stable@FreeBSD.org
Message-ID: <20080724133555.P63347@fledge.watson.org>
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed


On Tue, 22 Jul 2008, Mikhail Teterin wrote:

> Kris Kennaway wrote:
>> Mikhail Teterin wrote:
>>> Kris Kennaway wrote:
>>>> Well, I mean kernel backtrace.
>>> Can I obtain that remotely and without restarting/panicking the box?
>>> Thanks,
>> kgdb on /dev/mem or procstat
> root@aldan:~ (107) kgdb /boot/kernel/kernel /dev/mem
> [...]
> (kgdb) bt
> #0 0x0000000000000000 in ?? ()
> Error accessing memory address 0x0: Bad address.
>
> Even less luck with procstat:
>
> root@aldan:~ (108) locate procstat
> root@aldan:~ (109) procstat
> procstat: Command not found.
> root@aldan:~ (110) man procstat
> No manual entry for procstat
>
> I'm sorry, but you'll need to be more specific. What should I type? Thanks,

Assuming you're using 7.0 or an older 7-STABLE: procstat(1) appeared after 7.0
was released, but should be there if you slide forward on 7-STABLE. You can
use "procstat -k pid" to see kernel stack traces for kernel threads working on
behalf of the process. Depending on the level of detail you require, you can
use -kk to also list function offsets inside the kernel, but the results are a
bit harder to read.
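
A small sketch of the invocation described above, wrapped so that it fails
cleanly on a system that predates procstat(1). The function name kstacks and
the PROCSTAT override variable are hypothetical conveniences. Only the -k and
-kk flags come from the text.

```shell
#!/bin/sh
# Print kernel stack traces for the threads of one process.
# Usage: kstacks PID [k|kk]   ("kk" also lists function offsets)
kstacks() {
    pid=$1
    detail=${2:-k}
    tool=${PROCSTAT:-procstat}   # override only for testing
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "$tool not found; procstat(1) appeared after 7.0-RELEASE" >&2
        return 127
    fi
    "$tool" "-$detail" "$pid"
}

# Example, run as root on the affected machine (the PID is a placeholder):
#   kstacks 1234
#   kstacks 1234 kk
```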

Robert N M Watson
Computer Laboratory
University of Cambridge


------------------------------

Message: 3
Date: Thu, 24 Jul 2008 17:15:56 +0100
From: "John Sullivan" <john@basicnets.co.uk>
Subject: RE: Fresh 7.0 Install: Fatal Trap 12 panic when put under
load
To: "'Kris Kennaway'" <kris@FreeBSD.org>, <freebsd-stable@freebsd.org>
Message-ID: <A403B8D27BE048E79A94B09C0C520854@emea.hubersuhner.net>
Content-Type: text/plain; charset="iso-8859-1"


>> Removing KDB_UNATTENDED from your kernel will allow you
>> to interact with the debugger and obtain backtraces etc,
>> which is useful when dumps are not being saved.
>
> Easier said than done, this cause a few panics - no dumps
> though ...grrrr!!
>
> Still the same result ... the system seems to panic twice
> then hang. I will keep trying unless you have some other ideas??

Right, after trying for a number of days, the system still just hung
without letting me get a dump or debug interactively in the failed
state. I reverted to the GENERIC kernel, removed half the memory (2 of
the 4 1GB sticks) and the system became stable. I inserted 1 of the 2
removed sticks and all was fine. I swapped that stick with the remaining
stick and all was fine. I put them both back in and I started to see the
crashes again. The first of these gave me this dump:

server251# kgdb /boot/kernel/kernel /var/crash/vmcore.1
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd".

Unread portion of the kernel message buffer:


Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address = 0xb0
fault code = supervisor read data, page not present
instruction pointer = 0x8:0xffffffff8068d4bd
stack pointer = 0x10:0xffffffffb20738e0
frame pointer = 0x10:0x0
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 72836 (objdump)
trap number = 12
panic: page fault
cpuid = 1
Uptime: 28m4s
Physical memory: 4082 MB
Dumping 518 MB: 503 487 471 455 439 423 407 391 375 359 343 327 311 295 279 263 247 231 215 199 183 167 151 135 119 103 87 71 55 39
23 7

#0 doadump () at pcpu.h:194
194 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) backtrace
#0 doadump () at pcpu.h:194
#1 0x0000000000000004 in ?? ()
#2 0xffffffff80477699 in boot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:409
#3 0xffffffff80477a9d in panic (fmt=0x104 <Address 0x104 out of bounds>)
at /usr/src/sys/kern/kern_shutdown.c:563
#4 0xffffffff8072ed44 in trap_fatal (frame=0xffffff003c39c000,
eva=18446742974629017808) at /usr/src/sys/amd64/amd64/trap.c:724
#5 0xffffffff8072f115 in trap_pfault (frame=0xffffffffb2073830, usermode=0)
at /usr/src/sys/amd64/amd64/trap.c:641
#6 0xffffffff8072fa58 in trap (frame=0xffffffffb2073830)
at /usr/src/sys/amd64/amd64/trap.c:410
#7 0xffffffff807156be in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:169
#8 0xffffffff8068d4bd in vm_page_cache_remove (m=0xffffff00da9ec3b8)
at /usr/src/sys/vm/vm_page.c:896
#9 0xffffffff8068e1b5 in vm_page_alloc (object=0xffffff00374ffc30, pindex=14,
req=64) at /usr/src/sys/vm/vm_page.c:1080
#10 0xffffffff8067fa77 in vm_fault (map=0xffffff0005f23d00, vaddr=34365804544,
fault_type=1 '\001', fault_flags=0) at /usr/src/sys/vm/vm_fault.c:432
#11 0xffffffff8072efaf in trap_pfault (frame=0xffffffffb2073c70, usermode=1)
at /usr/src/sys/amd64/amd64/trap.c:618
#12 0xffffffff8072fbf8 in trap (frame=0xffffffffb2073c70)
at /usr/src/sys/amd64/amd64/trap.c:309
#13 0xffffffff807156be in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:169
#14 0x000000080059c54f in ?? ()
Previous frame inner to this frame (corrupt stack?)

So to answer your question (are the backtraces always the same?): no, they are not. But I am still confused as to what this means.

I would appreciate any further insight anyone can give.

Thanks

John

------------------------------

Message: 4
Date: Thu, 24 Jul 2008 18:24:55 +0200
From: Kris Kennaway <kris@FreeBSD.org>
Subject: Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under
load
To: John Sullivan <john@basicnets.co.uk>
Cc: freebsd-stable@freebsd.org
Message-ID: <4888ACD7.6010803@FreeBSD.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

John Sullivan wrote:
>
>>> Removing KDB_UNATTENDED from your kernel will allow you
>>> to interact with the debugger and obtain backtraces etc,
>>> which is useful when dumps are not being saved.
>> Easier said than done, this cause a few panics - no dumps
>> though ...grrrr!!
>>
>> Still the same result ... the system seems to panic twice
>> then hang. I will keep trying unless you have some other ideas??
>
> Right, after trying for a number of days the system still just hung without letting me get either a dump or to interactively debug
> in the failed state, I reverted back to the Generic kernel, removed half the memory (2 of the 4 1GB sticks) and the system became
> stable. I inserted 1 of the 2 removed sticks and all was fine. I swapped that stick with the remaining stick and all was fine. I
> put them both back in and I started to see the crashes again - the first of which, gave me this dump -->
>
> server251# kgdb /boot/kernel/kernel /var/crash/vmcore.1
> [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd".
>
> Unread portion of the kernel message buffer:
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; apic id = 01
> fault virtual address = 0xb0
> fault code = supervisor read data, page not present
> instruction pointer = 0x8:0xffffffff8068d4bd
> stack pointer = 0x10:0xffffffffb20738e0
> frame pointer = 0x10:0x0
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 72836 (objdump)
> trap number = 12
> panic: page fault
> cpuid = 1
> Uptime: 28m4s
> Physical memory: 4082 MB
> Dumping 518 MB: 503 487 471 455 439 423 407 391 375 359 343 327 311 295 279 263 247 231 215 199 183 167 151 135 119 103 87 71 55 39
> 23 7
>
> #0 doadump () at pcpu.h:194
> 194 pcpu.h: No such file or directory.
> in pcpu.h
> (kgdb) backtrace
> #0 doadump () at pcpu.h:194
> #1 0x0000000000000004 in ?? ()
> #2 0xffffffff80477699 in boot (howto=260)
> at /usr/src/sys/kern/kern_shutdown.c:409
> #3 0xffffffff80477a9d in panic (fmt=0x104 <Address 0x104 out of bounds>)
> at /usr/src/sys/kern/kern_shutdown.c:563
> #4 0xffffffff8072ed44 in trap_fatal (frame=0xffffff003c39c000,
> eva=18446742974629017808) at /usr/src/sys/amd64/amd64/trap.c:724
> #5 0xffffffff8072f115 in trap_pfault (frame=0xffffffffb2073830, usermode=0)
> at /usr/src/sys/amd64/amd64/trap.c:641
> #6 0xffffffff8072fa58 in trap (frame=0xffffffffb2073830)
> at /usr/src/sys/amd64/amd64/trap.c:410
> #7 0xffffffff807156be in calltrap ()
> at /usr/src/sys/amd64/amd64/exception.S:169
> #8 0xffffffff8068d4bd in vm_page_cache_remove (m=0xffffff00da9ec3b8)
> at /usr/src/sys/vm/vm_page.c:896
> #9 0xffffffff8068e1b5 in vm_page_alloc (object=0xffffff00374ffc30, pindex=14,
> req=64) at /usr/src/sys/vm/vm_page.c:1080
> #10 0xffffffff8067fa77 in vm_fault (map=0xffffff0005f23d00, vaddr=34365804544,
> fault_type=1 '\001', fault_flags=0) at /usr/src/sys/vm/vm_fault.c:432
> #11 0xffffffff8072efaf in trap_pfault (frame=0xffffffffb2073c70, usermode=1)
> at /usr/src/sys/amd64/amd64/trap.c:618
> #12 0xffffffff8072fbf8 in trap (frame=0xffffffffb2073c70)
> at /usr/src/sys/amd64/amd64/trap.c:309
> #13 0xffffffff807156be in calltrap ()
> at /usr/src/sys/amd64/amd64/exception.S:169
> #14 0x000000080059c54f in ?? ()
> Previous frame inner to this frame (corrupt stack?)
>
> So to answer your question are the backtraces always the same, no, they are not. But I am still confused as to what this means??
>
> I would appreciate any further insight anyone can give.

That's another corrupted backtrace that doesn't point to an actual
software problem. Still sounds like bad RAM, or bad hardware.

Kris


------------------------------

Message: 5
Date: Thu, 24 Jul 2008 11:16:48 -0700
From: Chuck Swiger <cswiger@mac.com>
Subject: Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under
load
To: John Sullivan <john@basicnets.co.uk>
Cc: FreeBSD Stable List <freebsd-stable@freebsd.org>
Message-ID: <B4E29257-B805-4597-9024-E042F34243D1@mac.com>
Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes

On Jul 24, 2008, at 9:15 AM, John Sullivan wrote:
> Right, after trying for a number of days the system still just hung
> without letting me get either a dump or to interactively debug
> in the failed state, I reverted back to the Generic kernel, removed
> half the memory (2 of the 4 1GB sticks) and the system became
> stable. I inserted 1 of the 2 removed sticks and all was fine. I
> swapped that stick with the remaining stick and all was fine. I
> put them both back in and I started to see the crashes again - the
> first of which, gave me this dump -->

You might want to double-check the detailed documentation about your
motherboard.

There are a fair number of consumer-grade motherboards that can't
reliably handle 4 double-sided DIMMs at full speed. Some of them
require you to downgrade the memory clock from, say, PC3200 (aka
200MHz DDR) down to PC2700 speed (aka 166MHz DDR); others may work,
but only if you install the more expensive buffered type of RAM (which
also tends to include ECC) rather than generic unbuffered RAM.
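
The module names above follow from simple arithmetic: peak bandwidth in MB/s
is the memory clock in MHz, times two transfers per cycle for DDR, times 8
bytes for a 64-bit module. The following sketch works that out. The function
name is made up for illustration.

```shell
#!/bin/sh
# Peak bandwidth of one 64-bit DDR module in MB/s:
# clock (MHz) x 2 transfers per cycle x 8 bytes per transfer.
ddr_peak_mbs() {
    echo $(( $1 * 2 * 8 ))
}

ddr_peak_mbs 200   # prints 3200, hence "PC3200"
ddr_peak_mbs 166   # prints 2656; the nominal 166.67 MHz gives about 2667, sold as "PC2700"
```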

Regards,
--
-Chuck

------------------------------

Message: 6
Date: Thu, 24 Jul 2008 22:09:16 +0200
From: "Michael Grant" <mgrant@grant.org>
Subject: Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under
load
To: "FreeBSD Stable List" <freebsd-stable@freebsd.org>
Cc: John Sullivan <john@basicnets.co.uk>
Message-ID:
<62b856460807241309k3cea60dbh24eea677cd6751f7@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

I have been having what seem like similar panics. I too cannot
manage to get a crash dump, neither classic style nor minidump. Nor
can I get it to work with DDB; there seems to be a problem with DDB
and my geom mirror.

Kris recommended I up kmem_size which I have done (twice now) and
since the last time I upped it, the machine has not crashed again
(yet?). For the moment, I'm hoping things are stable.

In /boot/loader.conf, I currently have the following:

vm.kmem_size=1G
vm.kmem_size_max=1G
vm.kmem_size_scale=2

and in my kernel conf file I have:

options KVA_PAGES=512
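
As a rough check on these numbers, and assuming this is an i386 kernel
without PAE (where each KVA_PAGES unit covers 4 MB of kernel virtual address
space), the setting can be converted to megabytes as below. The function
name is hypothetical.

```shell
#!/bin/sh
# Kernel virtual address space in MB for a given KVA_PAGES value,
# assuming i386 without PAE (4 MB per unit).
kva_mb() {
    echo $(( $1 * 4 ))
}

kva_mb 256   # prints 1024 (the usual i386 default)
kva_mb 512   # prints 2048, leaving room for a 1G vm.kmem_size
```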

Here's what top says currently:

last pid: 57367; load averages: 0.56, 0.54, 0.61
up 2+10:16:57 15:50:55
407 processes: 6 running, 378 sleeping, 2 zombie, 21 waiting
CPU states: 0.1% user, 0.0% nice, 2.3% system, 0.7% interrupt, 97.0% idle
Mem: 1309M Active, 1291M Inact, 497M Wired, 155M Cache, 199M Buf, 7408K Free
Swap: 9541M Total, 1628K Used, 9540M Free

Is this a heavily loaded machine? It's using a lot of memory, but
it's mostly idle.

I have 2 sticks of double-sided memory (4gig total) in the box. The
SuperMicro documentation recommends using single sided sticks for 6 or
more sticks.

I feel for you, John. I've lost many nights' sleep in the last couple
of weeks trying to understand why this production box was crashing. I
was really surprised to see this start happening; normally my FreeBSD
boxes have uptimes in terms of years, not hours.

Michael Grant


------------------------------

Message: 7
Date: Thu, 24 Jul 2008 22:11:51 +0200
From: Kris Kennaway <kris@FreeBSD.org>
Subject: Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under
load
To: Michael Grant <mgrant@grant.org>
Cc: FreeBSD Stable List <freebsd-stable@freebsd.org>, John Sullivan
<john@basicnets.co.uk>
Message-ID: <4888E207.4020606@FreeBSD.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Michael Grant wrote:
> I have been having what seems like similar panics. I too cannot
> manage to get a crash dump, neither classic style nor minidump. Nor
> can I get it to work with DDB, there seems to be a problem with DDB
> and my Geom mirror.

They're not at all similar, please don't confuse the issue :)

Kris

------------------------------

Message: 8
Date: Thu, 24 Jul 2008 21:41:47 +0100
From: john@basicnets.co.uk
Subject: Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under
load
To: freebsd-stable@freebsd.org
Message-ID: <20080724214147.d1uz3iuv44g4o4g4@mail.basicnets.co.uk>
Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes";
format="flowed"

> I feel for you John, I've lost many nights sleep in the last couple
> weeks trying to understand why this production box was crashing. I
> was really surprised to see this start happening, normally my freebsd
> boxes have uptimes in terms of years, not hours.

Thanks for the sentiment, at last I have been able to smile about
this problem - maybe we should start a support group ... I'll start
... Hi, I'm John and I'm a failing sys admin, I haven't had a panic
for 2 hours now and I'm taking it just 1 tick at a time ;-)

Just to share with the group: I had an email from Kris off-list
that made a lot of sense. I'm beginning to agree with him that it is
probably a hardware issue. I'll go quiet now and spend some money on
different hardware. For anyone who finds this thread on Google, I can
only echo Michael's comments - the thing that makes these panics so
infuriating is that even with dodgy old hardware FreeBSD has always
proven to be a very stable OS for me and as you can see, the community
is always willing to help.

Thanks to all that have spent time on this issue for me.

John

------------------------------

Message: 9
Date: Thu, 24 Jul 2008 16:20:46 -0400
From: Sven Willenberger <sven@dmv.com>
Subject: CARP state changes and devd.conf
To: stable@FreeBSD.org
Message-ID: <1216930846.6489.21.camel@lanshark.dmv.com>
Content-Type: text/plain; charset="us-ascii"

I see mention of CARP as a device-type in the devd.conf documentation
but for the life of me cannot manage to get devd to recognize *any*
changes in the CARP interface.

I have set
sysctl net.inet.carp.log=2
and I see messages in /var/log/messages when the interface goes
INIT -> BACKUP and BACKUP -> MASTER, but I still cannot get
devd to "see" these changes.

I have tried something even as simple as:
notify 100 {
    action "logger -p kern.notice '$device-name interface has changed'";
};

and then bringing the CARP interfaces up and down on either box to
change INIT and BACKUP/MASTER states, but *nothing* is noted. Does CARP
simply not work that way with devd (i.e. only the creation of the CARP
device, not any subsequent state change, works)?
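
One way to narrow this down is to watch the raw event stream that devd
relays and check whether anything mentioning a carp interface shows up
during a state change. The sketch below filters such a stream. The
carp_events name and the sample lines are illustrative, and reading
/var/run/devd.pipe with cat is an assumption about this FreeBSD version.
If nothing appears while the interface changes state, that would suggest
devd never receives those transitions.

```shell
#!/bin/sh
# Keep only event lines that mention a carp interface (carp0, carp1, ...).
carp_events() {
    grep -E 'carp[0-9]+'
}

# On the FreeBSD box this would run until interrupted:
#   cat /var/run/devd.pipe | carp_events

# Illustrative input in the "!key=value" form that devd relays.
printf '%s\n' \
    '!system=IFNET subsystem=em0 type=LINK_UP' \
    '!system=IFNET subsystem=carp0 type=LINK_DOWN' | carp_events
```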

Sven

------------------------------

Message: 10
Date: Fri, 25 Jul 2008 09:46:34 +0200
From: "Claus Guttesen" <kometen@gmail.com>
Subject: zfs, raidz, spare and jbod
To: "FreeBSD Stable" <freebsd-stable@freebsd.org>
Message-ID:
<b41c75520807250046y4ba061a2i63d3a40b7fc76170@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

Hi.

I installed FreeBSD 7 a few days ago and upgraded to the latest stable
release using GENERIC kernel. I also added these entries to
/boot/loader.conf:

vm.kmem_size="1536M"
vm.kmem_size_max="1536M"
vfs.zfs.prefetch_disable=1

Initially prefetch was enabled and I would experience hangs, but after
disabling prefetch, copying large amounts of data went along
without problems. To see if FreeBSD 8 (current) had better (copy)
performance I upgraded to current as of yesterday. After upgrading and
rebooting the server responded fine.

The server is a Supermicro with a quad-core Harpertown E5405, two
internal SATA drives and 8 GB of RAM. I installed an Areca ARC-1680
SAS controller and configured it in JBOD mode. I attached an external
SAS cabinet with 16 SAS disks at 1 TB (931 binary GB).
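
The "931 binary GB" figure is the marketed decimal terabyte expressed in
units of 2^30 bytes. A quick sketch of that conversion follows. The
variable names are only for illustration.

```shell
#!/bin/sh
bytes=1000000000000              # 1 TB as marketed (10^12 bytes)
gib=$(( bytes / 1073741824 ))    # 2^30 bytes per binary GB, integer division
echo "$gib"                      # prints 931
```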

I created a raidz2 pool with 10 disks and added one spare. I copied
approx. 1 TB of small files (each approx. 1 MB) and during the copy I
simulated a disk-crash by pulling one of the disks out of the cabinet.
ZFS did not activate the spare and the copying stopped until I
rebooted after 5-10 minutes. When I performed a 'zpool status' the
command would not complete. I did not see any messages in
/var/log/messages. State in top showed 'ufs-'.

A similar test on Solaris Express Developer Edition b79 activated the
spare after ZFS tried to write to the missing disk enough times and
then marked it as faulted. Has anyone else tried to simulate a
disk-crash in raidz(2) and succeeded?
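
For anyone wanting to reproduce the test, a setup like the one described
might look roughly like the sketch below. The pool name tank and the device
names da0 through da10 are assumptions, since the actual names were not
given. The run helper is a hypothetical dry-run wrapper that only prints
commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

POOL=tank   # assumed pool name

# Ten-disk raidz2 vdev plus one hot spare (device names are placeholders).
run zpool create "$POOL" raidz2 da0 da1 da2 da3 da4 da5 da6 da7 da8 da9 spare da10
run zpool status "$POOL"

# If the spare is not pulled in automatically after a disk is lost,
# it can be swapped in by hand (da3 stands for the pulled disk).
run zpool replace "$POOL" da3 da10
```

The manual replace step only helps if the zpool command itself still
responds, which was not the case in the hang described above.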

--
regards
Claus

When lenity and cruelty play for a kingdom,
the gentlest gamester is the soonest winner.

Shakespeare


------------------------------

Message: 11
Date: Fri, 25 Jul 2008 11:18:41 +0200
From: Kris Kennaway <kris@FreeBSD.org>
Subject: Re: zfs, raidz, spare and jbod
To: Claus Guttesen <kometen@gmail.com>
Cc: FreeBSD Stable <freebsd-stable@freebsd.org>
Message-ID: <48899A71.4040508@FreeBSD.org>
Content-Type: text/plain; charset=windows-1252; format=flowed

Claus Guttesen wrote:
> Hi.
>
> I installed FreeBSD 7 a few days ago and upgraded to the latest stable
> release using GENERIC kernel. I also added these entries to
> /boot/loader.conf:
>
> vm.kmem_size="1536M"
> vm.kmem_size_max="1536M"
> vfs.zfs.prefetch_disable=1
>
> Initially prefetch was enabled and I would experience hangs but after
> disabling prefetch copying large amounts of data would go along
> without problems. To see if FreeBSD 8 (current) had better (copy)
> performance I upgraded to current as of yesterday. After upgrading and
> rebooting the server responded fine.
>
> The server is a supermicro with a quad-core harpertown e5405 with two
> internal sata-drives and 8 GB of ram. I installed an areca arc-1680
> sas-controller and configured it in jbod-mode. I attached an external
> sas-cabinet with 16 sas-disks at 1 TB (931 binary GB).
>
> I created a raidz2 pool with 10 disks and added one spare. I copied
> approx. 1 TB of small files (each approx. 1 MB) and during the copy I
> simulated a disk-crash by pulling one of the disks out of the cabinet.
> Zfs did not activate the spare and the copying stopped until I
> rebooted after 5-10 minutes. When I performed a 'zpool status' the
> command would not complete. I did not see any messages in
> /var/log/message. State in top showed 'ufs-'.

That means that it was UFS that hung, not ZFS. What was the process
backtrace, and what role does UFS play on this system?

Kris

> A similar test on solaris express developer edition b79 activated the
> spare after zfs tried to write to the missing disk enough times and
> then marked it as faulted. Has any one else tried to simulate a
> disk-crash in raidz(2) and succeeded?
>

------------------------------

Message: 12
Date: Fri, 25 Jul 2008 11:24:47 +0200
From: "Claus Guttesen" <kometen@gmail.com>
Subject: Re: zfs, raidz, spare and jbod
To: "Kris Kennaway" <kris@freebsd.org>
Cc: FreeBSD Stable <freebsd-stable@freebsd.org>
Message-ID:
<b41c75520807250224k3f1dc44alb4cb0fbd84f4c5cc@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

>>
>> I installed FreeBSD 7 a few days ago and upgraded to the latest stable
>> release using GENERIC kernel. I also added these entries to
>> /boot/loader.conf:
>>
>> vm.kmem_size="1536M"
>> vm.kmem_size_max="1536M"
>> vfs.zfs.prefetch_disable=1
>>
>> Initially prefetch was enabled and I would experience hangs but after
>> disabling prefetch copying large amounts of data would go along
>> without problems. To see if FreeBSD 8 (current) had better (copy)
>> performance I upgraded to current as of yesterday. After upgrading and
>> rebooting the server responded fine.
>>
>> The server is a supermicro with a quad-core harpertown e5405 with two
>> internal sata-drives and 8 GB of ram. I installed an areca arc-1680
>> sas-controller and configured it in jbod-mode. I attached an external
>> sas-cabinet with 16 sas-disks at 1 TB (931 binary GB).
>>
>> I created a raidz2 pool with 10 disks and added one spare. I copied
>> approx. 1 TB of small files (each approx. 1 MB) and during the copy I
>> simulated a disk-crash by pulling one of the disks out of the cabinet.
>> Zfs did not activate the spare and the copying stopped until I
>> rebooted after 5-10 minutes. When I performed a 'zpool status' the
>> command would not complete. I did not see any messages in
>> /var/log/message. State in top showed 'ufs-'.
>
> That means that it was UFS that hung, not ZFS. What was the process
> backtrace, and what role does UFS play on this system?

Arghh.. Typo, I meant 'zfs-'. Pardon.

My boot-disk is plain ufs2 but the disk I pulled out was in the raidz2-pool.

--
regards
Claus



------------------------------

Message: 13
Date: Fri, 25 Jul 2008 02:45:16 -0700
From: Jeremy Chadwick <koitsu@FreeBSD.org>
Subject: Re: zfs, raidz, spare and jbod
To: Claus Guttesen <kometen@gmail.com>
Cc: FreeBSD Stable <freebsd-stable@freebsd.org>
Message-ID: <20080725094516.GA71385@eos.sc1.parodius.com>
Content-Type: text/plain; charset=us-ascii

On Fri, Jul 25, 2008 at 09:46:34AM +0200, Claus Guttesen wrote:
> Hi.
>
> I installed FreeBSD 7 a few days ago and upgraded to the latest stable
> release using GENERIC kernel. I also added these entries to
> /boot/loader.conf:
>
> vm.kmem_size="1536M"
> vm.kmem_size_max="1536M"
> vfs.zfs.prefetch_disable=1
>
> Initially prefetch was enabled and I would experience hangs but after
> disabling prefetch copying large amounts of data would go along
> without problems. To see if FreeBSD 8 (current) had better (copy)
> performance I upgraded to current as of yesterday. After upgrading and
> rebooting the server responded fine.

With regards to RELENG_7, I completely agree with disabling prefetch.
The overall performance (of the system and disk I/O) appears significantly
"smoother" when prefetch is disabled, e.g. fewer hard lock-ups and
stalls.
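
A quick way to confirm what a loader.conf-style file actually sets is sketched below. The helper name `loader_tunable` is made up for this illustration, and it assumes simple `name=value` lines with no whitespace around the `=`, as in the entries quoted above.

```shell
#!/bin/sh
# Print the value of a tunable from a loader.conf-style file, with any
# double quotes stripped. Prints nothing if the tunable is not set.
# Assumes plain name=value lines; the last matching line wins.
loader_tunable() {
    # $1 = path to the file, $2 = tunable name
    awk -F= -v name="$2" '
        $1 == name { v = $2; gsub(/"/, "", v); val = v }
        END { if (val != "") print val }
    ' "$1"
}
```

For example, `loader_tunable /boot/loader.conf vfs.zfs.prefetch_disable` should print `1` with the settings quoted above. On a running system, `sysctl vfs.zfs.prefetch_disable` reports the value the kernel is actually using.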

I have not tried CURRENT. I'm told the ZFS code in CURRENT is the same
as RELENG_7, so I'm not sure what you were trying to test by switching
from RELENG_7 to CURRENT.

> The server is a supermicro with a quad-core harpertown e5405 with two
> internal sata-drives and 8 GB of ram. I installed an areca arc-1680
> sas-controller and configured it in jbod-mode. I attached an external
> sas-cabinet with 16 sas-disks at 1 TB (931 binary GB).
>
> I created a raidz2 pool with 10 disks and added one spare. I copied
> approx. 1 TB of small files (each approx. 1 MB) and during the copy I
> simulated a disk-crash by pulling one of the disks out of the cabinet.
> Zfs did not activate the spare and the copying stopped until I
> rebooted after 5-10 minutes. When I performed a 'zpool status' the
> command would not complete. I did not see any messages in
> /var/log/messages. State in top showed 'ufs-'.
>
> A similar test on solaris express developer edition b79 activated the
> spare after zfs tried to write to the missing disk enough times and
> then marked it as faulted. Has any one else tried to simulate a
> disk-crash in raidz(2) and succeeded?

Is there any way to confirm the behaviour is specific to raidz2, or
would it affect raidz1 as well? I have a raidz1 pool at home (3 disks
though; pulling one will probably result in bad things) which I can
pull a disk from, though it's off of an ICHx controller.

I have no experience with Areca controllers or their driver, but I do
have experience with standard onboard Intel ICHx chips. WRT those
chips, "pulling disks" without administratively downing the ATA channel
will cause a kernel panic. If the Areca controller/driver handles
things better, great.
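
As a rough sketch of what "administratively downing" looks like on the ata(4) driver, the helper below prints the commands rather than running them. The function name `plan_disk_pull` is made up for this illustration, and the assumption that the disk should be offlined in ZFS before its channel is detached is mine. Whether this sequence avoids the panic on any particular board is not something this thread confirms.

```shell
#!/bin/sh
# Print (do not run) the commands for taking a disk out of service
# before physically removing it. Arguments are supplied by the caller.
plan_disk_pull() {
    pool=$1      # ZFS pool the disk belongs to
    disk=$2      # disk device name as shown by zpool status
    channel=$3   # ATA channel the disk is attached to
    # Tell ZFS to stop using the disk; the pool continues in degraded mode.
    echo "zpool offline $pool $disk"
    # Detach the ATA channel so the driver no longer expects the device.
    echo "atacontrol detach $channel"
}
```

For a disk `ad8` on channel `ata4` in a pool named `storage`, `plan_disk_pull storage ad8 ata4` prints the two commands for review. After the disk is reinserted, the reverse would be `atacontrol attach ata4` followed by `zpool online storage ad8`.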

I'm trying to say that I can offer to help with raidz1, but not on Areca
controllers. The hardware is similar to yours; Supermicro PDSMi+, Intel
E6600 (C2D), 4GB RAM, running RELENG_7 amd64. System contains 4 disks,
ad6,8,10 are in a ZFS pool, ad4 is the OS disk:

ad4: 190782MB <WDC WD2000JD-00HBB0 08.02D08> at ata2-master SATA150
ad6: 476940MB <WDC WD5000AAKS-00YGA0 12.01C02> at ata3-master SATA300
ad8: 476940MB <WDC WD5000AAKS-00TMA0 12.01C01> at ata4-master SATA300
ad10: 476940MB <WDC WD5000AAKS-00TMA0 12.01C01> at ata5-master SATA300

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ad6     ONLINE       0     0     0
            ad8     ONLINE       0     0     0
            ad10    ONLINE       0     0     0

--
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |

------------------------------

Message: 14
Date: Fri, 25 Jul 2008 11:58:53 +0200
From: Kris Kennaway <kris@FreeBSD.org>
Subject: Re: zfs, raidz, spare and jbod
To: Jeremy Chadwick <koitsu@FreeBSD.org>
Cc: FreeBSD Stable <freebsd-stable@freebsd.org>, Claus Guttesen
<kometen@gmail.com>
Message-ID: <4889A3DD.8030801@FreeBSD.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Jeremy Chadwick wrote:
> On Fri, Jul 25, 2008 at 09:46:34AM +0200, Claus Guttesen wrote:
>> Hi.
>>
>> I installed FreeBSD 7 a few days ago and upgraded to the latest stable
>> release using GENERIC kernel. I also added these entries to
>> /boot/loader.conf:
>>
>> vm.kmem_size="1536M"
>> vm.kmem_size_max="1536M"
>> vfs.zfs.prefetch_disable=1
>>
>> Initially prefetch was enabled and I would experience hangs but after
>> disabling prefetch copying large amounts of data would go along
>> without problems. To see if FreeBSD 8 (current) had better (copy)
>> performance I upgraded to current as of yesterday. After upgrading and
>> rebooting the server responded fine.
>
> With regards to RELENG_7, I completely agree with disabling prefetch.
> The overall performance (of the system and disk I/O) appears significantly
> "smoother" when prefetch is disabled, e.g. fewer hard lock-ups and
> stalls.

FYI I do not get "lock-ups" when running with prefetch. It is supposed
to just affect performance, i.e. if you have few disks or they have low
bandwidth or high seek times (e.g. crappy ATA) then it can saturate them
and you will have poor response times. However if your hardware is more
capable then it is a performance optimization.

Someone needs to obtain the usual debugging information.

Kris

------------------------------

_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"

End of freebsd-stable Digest, Vol 262, Issue 8
**********************************************