[WireGuard] mips32 crash

k at vodka.home.kg k at vodka.home.kg
Sun Nov 6 08:02:58 CET 2016


Hi!

I'm experimenting with a WireGuard tunnel between two devices running
OpenWrt/LEDE.

R1 - Banana Pi, kernel 4.1.16, ARM, 2 cores, SMP PREEMPT
R2 - D-Link DIR-825 B1, kernel 4.4.30, MIPS32r2 big-endian, 1 core, PREEMPT

W1-R1 (mtu 1500) - inet - (mtu 1456) R2-W2
WireGuard MTU: 1370
WireGuard versions tested: 20161103, 20161105

I copy files over SMB from a Windows machine connected to R1 (W1) to a
Windows machine connected to R2 (W2). Further experiments show that it
does not matter whether the hosts run Windows or Linux: an iperf upload
from W1 to W2 is enough to trigger the crash.

The ARM device has never crashed, but the MIPS device crashes every time.
It takes anywhere from 5 minutes to 2 hours to crash.
I have crash logs.
I enabled debug output in the wireguard module with:
echo "module wireguard +p" >/sys/kernel/debug/dynamic_debug/control

Typical crash log:

---------------------
<7>[13785.407900] wireguard: Sending handshake initiation to peer 1 (x.x.x.x:16)
<7>[13785.514312] wireguard: Receiving handshake response from peer 1 ((invalid address))
<7>[13785.532044] wireguard: Keypair 106 created for peer 1
<7>[13785.537164] wireguard: Sending keepalive packet to peer 1 (x.x.x.x:16)
<7>[13785.550835] wireguard: Keypair 104 destroyed for peer 1
<7>[13905.531148] wireguard: Sending handshake initiation to peer 1 (x.x.x.x:16)
<4>[13905.629622] ------------[ cut here ]------------
<1>[13905.634339] CPU 0 Unable to handle kernel paging request at virtual address 000100d7, epc == 800a6a40, ra == 800c0470
<4>[13905.634349] Oops[#1]:
<4>[13905.634360] CPU: 0 PID: 41189632 Comm:  Not tainted 4.4.30 #0
<4>[13905.634369] task: 810000ce ti: 82bca000 task.ti: 00018100
<4>[13905.634381] $ 0   : 00000000 00000001 02f40000 00000003
<4>[13905.634392] $ 4   : 810000ce 00010000 0000ffff 02f40001
<4>[13905.634402] $ 8   : 810000ce fffe6d57 00000002 00000001
<4>[13905.634412] $12   : 003d08ff c781e3dc 00000000 00000000
<4>[13905.634423] $16   : 00000001 810000ce 00000002 8049f4f0
<4>[13905.634434] $20   : ad4f6c42 00000ca5 804a01e0 82bcbd90
<4>[13905.634444] $24   : 00000000 8023b14c                  
<4>[13905.634455] $28   : 82bca000 82bcbb88 003d0900 800c0470
<4>[13905.634457] Hi    : 00000ca5
<4>[13905.634460] Lo    : 8295ea00
<4>[13905.634487] epc   : 800a6a40 account_system_time+0x158/0x1e0
<4>[13905.634497] ra    : 800c0470 update_process_times+0x24/0x70
<4>[13905.634504] Status: 10007c02      KERNEL EXL 
<4>[13905.634507] Cause : 00800008 (ExcCode 02)
<4>[13905.634510] BadVA : 000100d7
<4>[13905.634514] PrId  : 00019374 (MIPS 24Kc)
<4>[13905.634666] Modules linked in: ath9k ath9k_common pppoe ppp_async l2tp_ppp iptable_nat ath9k_hw ath pptp pppox ppp_mppe ppp_generic nf_nat_pptp nf_nat_ipv4 nf_nat_amanda nf_conntrack_pptp nf_conntrack_ipv6 nf_conntrack_ipv4 nf_conntrack_amanda mac80211 ipt_REJECT ipt_MASQUERADE cfg80211 xt_u32 xt_time xt_tcpudp xt_tcpmss xt_string xt_statistic xt_state xt_recent xt_quota xt_pkttype xt_physdev xt_owner xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length xt_id xt_hl xt_helper xt_hashlimit xt_ecn xt_dscp xt_conntrack xt_connmark xt_connlimit xt_connbytes xt_comment xt_addrtype xt_TCPMSS xt_REDIRECT xt_NFQUEUE xt_NFLOG xt_NETMAP xt_LOG xt_IPMARK xt_HL xt_DSCP xt_CT xt_CLASSIFY ts_kmp ts_fsm ts_bm slhc nfnetlink_queue nfnetlink_log nf_reject_ipv4 nf_nat_tftp nf_nat_snmp_basic nf_nat_sip nf_nat_redirect nf_nat_proto_gre nf_nat_masquerade_ipv4 nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat nf_log_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_tftp nf_conntrack_snmp nf_conntrack_sip nf_conntrack_rtcache nf_conntrack_proto_gre nf_conntrack_netlink nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp nf_conntrack_broadcast iptable_raw iptable_mangle iptable_filter ipt_ECN ip_tables crc_ccitt compat_xtables compat br_netfilter em_cmp sch_teql em_nbyte sch_dsmark sch_pie act_ipt sch_codel sch_gred sch_htb cls_basic sch_prio em_text em_meta act_police sch_red sch_tbf sch_sfq sch_fq act_connmark nf_conntrack act_skbedit act_mirred em_u32 cls_u32 cls_tcindex cls_flow cls_route cls_fw sch_hfsc sch_ingress sg ledtrig_usbport xt_set ip_set_list_set ip_set_hash_netiface ip_set_hash_netport ip_set_hash_netnet ip_set_hash_net ip_set_hash_netportnet ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_hash_ipport ip_set_hash_ipmark ip_set_hash_ip ip_set_bitmap_port ip_set_bitmap_ipmac ip_set_bitmap_ip ip_set nfnetlink ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 nf_log_common ip6table_raw ip6table_mangle ip6table_filter ip6_tables ip_gre gre ifb wireguard x_tables l2tp_ip6 
l2tp_ip sit l2tp_netlink l2tp_core udp_tunnel ip6_udp_tunnel tunnel4 ip_tunnel tun nls_utf8 sha1_generic ecb usb_storage ehci_platform ehci_hcd sd_mod scsi_mod rndis_host cdc_ether usbnet gpio_button_hotplug ext4 jbd2 mbcache usbcore nls_base usb_common crc16 mii cryptomgr aead crypto_null crc32c_generic crypto_hash
<4>[13905.634933] Process  (pid: 41189632, threadinfo=82bca000, task=810000ce, tls=8100cea5)
<4>[13905.635014] Stack : 00000244 000001b1 000001b2 00000245 00000000 810000ce 00000000 80530000
<4>[13905.635014]         80530000 800c0470 80530000 80530000 ad4f6c42 00000ca5 804a01e0 80530000
<4>[13905.635014]         00000000 800cef5c 00000000 00000000 0000a7b2 0000a7b0 804a0080 804a0040
<4>[13905.635014]         00000ca5 ad4f6c42 804a0080 804a0000 804a01e0 804a0040 00000001 00000ca5
<4>[13905.635014]         ad4f61a1 ad4f61a1 804a0000 800c1300 00000000 00000000 00000000 00000000
<4>[13905.635014]         ...
<4>[13905.635017] Call Trace:
<4>[13905.635030] [<800a6a40>] account_system_time+0x158/0x1e0
<4>[13905.635034] 
<4>[13905.635059] 
<4>[13905.635059] Code: 8e22022c  00473821  ae27022c <90c200d8> 304200ff  10400005  001210c0  8e2202c0  14400010 
<4>[13905.635064] ---[ end trace d0d8153e9e58d19b ]---
---------------------

What is common to every crash log is that the crash happens about 100
msec after the message "wireguard: Sending handshake initiation to peer 1
(x.x.x.x:16)".

Under normal circumstances, the message "wireguard: Receiving handshake
response from peer 1 ((invalid address))" is logged about 100 msec after
the initiation.

So I suppose the crash is somehow connected to receiving the handshake
response.
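
The timing claim can be checked mechanically across several saved logs. The following is a minimal Python sketch, not part of WireGuard or of my test setup: the function name handshake_to_oops_delay and the list of crash markers are assumptions made for illustration. It reports the time between the last "Sending handshake initiation" message and the first crash-related line.

```python
import re

# Kernel log lines look like "<7>[13785.407900] message".
LINE_RE = re.compile(r"^<\d>\[\s*(\d+\.\d+)\]\s?(.*)$")

# Assumed crash markers, taken from the oops messages quoted in this post.
CRASH_MARKERS = (
    "cut here",
    "Unable to handle kernel",
    "Unhandled kernel unaligned access",
)

def handshake_to_oops_delay(log_text):
    """Return seconds from the last handshake initiation to the first
    crash marker, or None if the log does not contain both."""
    last_init = None
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        ts, msg = float(m.group(1)), m.group(2)
        if "Sending handshake initiation" in msg:
            last_init = ts
        elif any(marker in msg for marker in CRASH_MARKERS):
            return None if last_init is None else ts - last_init
    return None

# Lines copied verbatim from the crash log in this post.
SAMPLE = """\
<7>[13785.537164] wireguard: Sending keepalive packet to peer 1 (x.x.x.x:16)
<7>[13785.550835] wireguard: Keypair 104 destroyed for peer 1
<7>[13905.531148] wireguard: Sending handshake initiation to peer 1 (x.x.x.x:16)
<4>[13905.629622] ------------[ cut here ]------------
<1>[13905.634339] CPU 0 Unable to handle kernel paging request at virtual address 000100d7, epc == 800a6a40, ra == 800c0470
"""

delay = handshake_to_oops_delay(SAMPLE)
print("delay: %.3f s" % delay)
```

For the log quoted above this gives a delay of roughly 0.098 s, which matches the ~100 msec observation.
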
The crash most often occurs in "account_system_time" and is related to
accessing a bad memory location. But sometimes the call trace points to:
<4>[ 4511.098305] [<8007a018>] __do_page_fault+0x5c/0x518
or
<4>[ 1138.193952] [<800be79c>] profile_tick+0x8/0x48
Sometimes a different exception is triggered:
<4>[  309.518201] Unhandled kernel unaligned access[#1]:


This is likely caused by memory corruption.
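
As a cross-check of the bad memory access, the faulting instruction in the "Code:" line (the word in angle brackets, 90c200d8) can be decoded by hand. The Python sketch below is my own reading of the dump, assuming the standard MIPS32 I-type layout (6-bit opcode, 5-bit base register, 5-bit target register, 16-bit signed offset); the helper name decode_itype is hypothetical.

```python
def decode_itype(word):
    """Split a 32-bit MIPS I-type instruction into its fields."""
    opcode = (word >> 26) & 0x3F
    base = (word >> 21) & 0x1F   # rs: base register number
    rt = (word >> 16) & 0x1F     # rt: destination register for loads
    offset = word & 0xFFFF
    if offset & 0x8000:          # sign-extend the 16-bit offset
        offset -= 0x10000
    return opcode, base, rt, offset

OPCODE_LBU = 0x24                # "load byte unsigned"

# Faulting instruction, from the "Code:" line of the oops.
word = 0x90C200D8
opcode, base, rt, offset = decode_itype(word)

# Value of $6 at the time of the fault, from the register dump ("$ 4" row).
regs = {6: 0x0000FFFF}

fault_addr = (regs[base] + offset) & 0xFFFFFFFF
print("opcode=0x%02x base=$%d rt=$%d offset=%d" % (opcode, base, rt, offset))
print("computed address: %08x" % fault_addr)
```

The instruction decodes as lbu $2, 216($6), and 0x0000ffff + 0xd8 = 0x000100d7, which is exactly the reported BadVA. So the fault itself is an ordinary load through a register holding a value that is not a valid kernel pointer, which fits the memory corruption theory.
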


