
Segfault with captagent version 6.4.1 on Debian 11 #274

Open · sivagurudialpad opened this issue Oct 30, 2023 · 18 comments

@sivagurudialpad

Hi,

I recently upgraded my OS from Debian 10 to Debian 11 and started noticing coredumps from captagent after the upgrade. I am using captagent version 6.4.1, running inside a Kubernetes pod (mentioning it here in case it makes a difference).
I have included the details below. Please let me know if you require further information.

Version info

# /usr/local/captagent/sbin/captagent -v
version: 6.4.1

OS info

root@prober-phase3-kube-api-production-eqx-sjc-6684c49d9f-7gw5l:/usr/local/prober# cat /etc/os-release 
PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
NAME="Debian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

Backtrace full

(gdb) bt full
#0  0x00007fbfa57c86d8 in callback_proto (arg=0x7fbfa4bc2e64 "", pkthdr=0x7fbfa4bc2ba0, packet=0x7fbfa5304524 "") at socket_pcap.c:555
        _msg = {data = 0x7fbfa5304340, profile_name = 0x559ff07abab0 "hepsocket", len = 396, hdr_len = 44, tcpflag = 0 '\000', sctp_ppid = 0, 
          rcinfo = {ip_family = 2 '\002', ip_proto = 17 '\021', proto_type = 1 '\001', src_mac = 0x7fbfa4bc1830 "02-55-17-E5-D1-72", 
            dst_mac = 0x7fbfa4bc1810 "", src_ip = 0x7fbfa4bc1880 "170.10.200.68", dst_ip = 0x7fbfa4bc1850 "10.32.149.51", src_port = 5060, 
            dst_port = 9638, time_sec = 1698460986, time_usec = 372569, liid = 0, cval1 = 0, cval2 = 0, sessionid = 0, direction = 0 '\000', 
            uuid = 0x0, correlation_id = {s = 0x0, len = 0}, tags = {s = '\000' <repeats 127 times>, len = 0}, socket = 0x0}, 
          parse_it = 1 '\001', parsed_data = 0x0, sip = {responseCode = 0, isRequest = true, validMessage = true, methodType = ACK, 
            methodString = {s = 0x7fbfa5304340 "", len = 3}, method_len = 0, callId = {s = 0x7fbfa5304450 "", len = 36}, reason = {s = 0x0, 
              len = 0}, hasSdp = false, cdm = {{name = '\000' <repeats 119 times>, id = 0, rate = 0, next = 0x0} <repeats 20 times>}, mrp = {{
                media_ip = {s = 0x0, len = 0}, media_port = 0, rtcp_ip = {s = 0x0, len = 0}, rtcp_port = 0, prio_codec = 0} <repeats 20 times>}, 
            cdm_count = 0, mrp_size = 0, contentLength = 0, len = 0, cSeqNumber = 74676698, hasVqRtcpXR = false, rtcpxr_callid = {s = 0x0, 
              len = 0}, cSeqMethodString = {s = 0x7fbfa5304485 "", len = 3}, cSeqMethod = ACK, cSeq = {s = 0x7fbfa530447c "", len = 12}, via = {
              s = 0x0, len = 0}, contactURI = {s = 0x0, len = 0}, ruriUser = {s = 0x7fbfa5304348 "", len = 0}, ruriDomain = {
              s = 0x7fbfa5304348 "", len = 13}, fromUser = {s = 0x7fbfa53043d8 "", len = 12}, fromDomain = {s = 0x7fbfa53043e5 "", len = 13}, 
            toUser = {s = 0x7fbfa5304410 "", len = 12}, toDomain = {s = 0x7fbfa530441d "", len = 13}, userAgent = {s = 0x0, len = 0}, paiUser = {
              s = 0x0, len = 0}, paiDomain = {s = 0x0, len = 0}, requestURI = {s = 0x7fbfa5304344 "", len = 36}, customHeader = {s = 0x0, 
              len = 0}, hasCustomHeader = false, pidURI = {s = 0x0, len = 0}, hasPid = false, fromURI = {s = 0x7fbfa53043cc "", len = 57}, 
            hasFrom = true, toURI = {s = 0x7fbfa530440b "", len = 58}, hasTo = true, ruriURI = {s = 0x0, len = 0}, hasRuri = false, toTag = {
              s = 0x7fbfa5304435 "", len = 13}, hasToTag = true, fromTag = {s = 0x7fbfa53043f8 "", len = 10}, hasFromTag = true}, 
          cap_packet = 0x7fbfa5304314, cap_header = 0x7fbfa4bc2ba0, var = 0x0, corrdata = 0x0, mfree = 0 '\000', flag = {0, 0, 0, 0, 0, 0, 0, 0, 
            0, 0}}
        eth = 0x0
        sll = 0x7fbfa5304524
        ip4_pkt = 0x0
        ip6_pkt = 0x0
        ctx = {route_rec_lev = 0, rec_lev = 0, run_flags = 0, last_retcode = 0}
        ip_src = "170.10.200.68", '\000' <repeats 33 times>
        ip_dst = "10.32.149.51", '\000' <repeats 34 times>
        mac_src = "02-55-17-E5-D1-72\000\000"
        mac_dst = '\000' <repeats 19 times>
        ip_ver = 4
        ipip_offset = 0
        action_idx = 0
        type_ip = 0
        hdr_preset = 0 '\000'
        hdr_offset = 4 '\004'
        vlan = 2 '\002'
        ip_proto = 0 '\000'
        erspan_offset = 0 '\000'
        tmp_ip_proto = 0 '\000'
        tmp_ip_len = 0 '\000'
        is_only_gre = 0 '\000'
        ethaddr = 0x81 <error: Cannot access memory at address 0x81>
        mplsaddr = 0x45 <error: Cannot access memory at address 0x45>
        loc_index = 0 '\000'
        len = 1183
        ip_hl = 0
        ip_off = 0
        frag_offset = 0
        fragmented = 0 '\000'
        psh = 0 '\000'
        data = 0x7fbfa5304340 ""
        datatcp = 0x1600000028 <error: Cannot access memory at address 0x1600000028>
        pack = 0x0
#1  0x00007fbfa6e58c05 in ?? () from /usr/lib/x86_64-linux-gnu/libpcap.so.0.8
No symbol table info available.
#2  0x00007fbfa6e59074 in ?? () from /usr/lib/x86_64-linux-gnu/libpcap.so.0.8
No symbol table info available.
#3  0x00007fbfa6e5fb0e in pcap_loop () from /usr/lib/x86_64-linux-gnu/libpcap.so.0.8
No symbol table info available.
#4  0x00007fbfa57cab75 in proto_collect (arg=0x559ff07a4130) at socket_pcap.c:1267
        loc_idx = 0
        ret = 0
        is_file = 0
#5  0x00007fbfa6dffea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
        ret = <optimized out>
        pd = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140461079279360, -749651851201127914, 140725320596318, 140725320596319, 140461079277376, 
                8396800, 785872520026872342, 785876086468336150}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, 
              cleanup = 0x0, canceltype = 0}}}
        not_first_call = 0
#6  0x00007fbfa6d1fa2f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
No locals.
(gdb)
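
A note on what the locals above suggest: ethaddr and mplsaddr hold small integer values (0x81, 0x45) that look like raw packet bytes used as pointers (0x45 is the usual first byte of an IPv4 header, 0x81 the first byte of an 802.1Q TPID), which is what you would expect if the link-layer offsets were computed for the wrong framing. Capturing on the "any" pseudo-device delivers Linux "cooked" frames rather than Ethernet, and the libpcap shipped with Debian 11 (1.10) switched "any" captures from DLT_LINUX_SLL (16-byte header) to the newer DLT_LINUX_SLL2 (20-byte header), while Debian 10's libpcap (1.8) still used SLL. That would fit the symptom of the same configuration working on Deb10 and crashing on Deb11, but it is a hypothesis, not a confirmed diagnosis. A minimal sketch of a datalink-aware callback, using only the standard libpcap API (everything besides that API is illustrative):

#include <pcap/pcap.h>
#include <pcap/sll.h>      /* SLL_HDR_LEN, SLL2_HDR_LEN */
#include <stddef.h>

/* Filled once after the handle is activated, via pcap_datalink(handle).
 * captagent keeps comparable per-socket state; a global here is illustrative. */
static int link_type;

static size_t link_hdr_len(int dlt)
{
    switch (dlt) {
    case DLT_EN10MB:     return 14;            /* plain Ethernet (an 802.1Q tag adds 4 more) */
    case DLT_LINUX_SLL:  return SLL_HDR_LEN;   /* "any" device, libpcap < 1.10: 16 bytes */
#ifdef DLT_LINUX_SLL2
    case DLT_LINUX_SLL2: return SLL2_HDR_LEN;  /* "any" device, libpcap >= 1.10: 20 bytes */
#endif
    default:             return 0;             /* unknown framing */
    }
}

/* Would be registered via pcap_loop(handle, -1, callback, NULL). */
static void callback(u_char *arg, const struct pcap_pkthdr *h, const u_char *pkt)
{
    size_t off = link_hdr_len(link_type);

    if (off == 0 || h->caplen < off)
        return;                     /* refuse to parse unknown or truncated frames */

    const u_char *l3 = pkt + off;   /* the IP header starts here, not at a fixed offset */
    (void)arg; (void)l3;            /* hand l3 to the IPv4/IPv6 parser from here on */
}

pcap_datalink() is the authoritative source for a handle's framing; hard-coding Ethernet or SLL offsets is exactly what breaks when libpcap changes the default encapsulation for "any".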

Thread information

(gdb) info thr
  Id   Target Id                      Frame 
* 1    Thread 0x7fbfa4bc3700 (LWP 47) 0x00007fbfa57c86d8 in callback_proto (arg=0x7fbfa4bc2e64 "", pkthdr=0x7fbfa4bc2ba0, 
    packet=0x7fbfa5304524 "") at socket_pcap.c:555
  2    Thread 0x7fbfa6860700 (LWP 44) 0x00007fbfa6d1fd56 in epoll_wait (epfd=3, events=0x7fbfa685cda0, maxevents=1024, timeout=-1)
    at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
  3    Thread 0x7fbfa2fc2700 (LWP 48) 0x00007fbfa6d1396f in __GI___poll (fds=0x7fbfa2fc1d30, nfds=2, timeout=-1)
    at ../sysdeps/unix/sysv/linux/poll.c:29
  4    Thread 0x7fbfa689f000 (LWP 42) 0x00007fbfa6d15e23 in __GI___select (nfds=0, readfds=0x0, writefds=0x0, exceptfds=0x0, timeout=0x0)
    at ../sysdeps/unix/sysv/linux/select.c:41
  5    Thread 0x7fbfa5fd6700 (LWP 45) 0x00007fbfa6ce61a1 in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, 
    req=req@entry=0x7fbfa5fd5de0, rem=rem@entry=0x7fbfa5fd5de0) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:48

Configuration

<?xml version="1.0"?>
<document type="captagent/xml">
        <configuration name="core.conf" description="CORE Settings" serial="2014024212">
            <settings>
                <param name="debug" value="3"/>
                <param name="version" value="2"/>
                <param name="serial" value="2014056501"/>
                <param name="uuid" value="00781a4a-5b69-11e4-9522-bb79a8fcf0f3"/>
                <param name="daemon" value="false"/>
                <param name="syslog" value="false"/>
                <param name="pid_file" value="/var/run/captagent.pid"/>
                <!-- Configure using installation path if different from default -->
                <param name="module_path" value="/usr/local/captagent/lib/captagent/modules"/>
                <param name="config_path" value="/usr/local/captagent/etc/captagent/"/>
                <param name="capture_plans_path" value="/usr/local/captagent/etc/captagent/captureplans"/>
                <param name="backup" value="/usr/local/captagent/etc/captagent/backup"/>
                <param name="chroot" value="/usr/local/captagent/etc/captagent"/>
            </settings>
        </configuration>
        <configuration name="modules.conf" description="Modules">
            <modules>

                <load module="transport_hep" register="local"/>
                <load module="protocol_sip" register="local"/>
                <load module="database_hash" register="local"/>
                <load module="protocol_rtcp" register="local"/>
                <load module="socket_pcap" register="local"/>

                <!-- NOTE: Block required for RTCPXR socket + RTCPXR protocol -->
                <!-- 
                        <load module="protocol_rtcpxr" register="local"/>
                        <load module="socket_collector" register="local"/> 
                -->

                <!--
                <load module="socket_tzsp" register="local"/>
                <load module="protocol_ss7" register="local"/>
                <load module="protocol_diameter" register="local"/>
                <load module="protocol_tls" register="local"/>
                <load module="output_json" register="local"/>
                <load module="interface_http" register="local"/>
                <load module="database_redis" register="local"/>
                -->
            </modules>
        </configuration>
</document>

Corefile
captagent.corefile.sig11.42.zip
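
For anyone wanting to inspect the attachment, the core must be loaded against the exact binary (and libraries) that produced it. The binary path below matches this report; <corefile> stands for whatever the zip extracts to:

gdb /usr/local/captagent/sbin/captagent <corefile>
(gdb) bt full
(gdb) info threads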

@sivagurudialpad (Author) commented Oct 30, 2023

I found the following issues that seem to be related.

My socket_pcap.xml file sets <param name="dev" value="any"/> in all the modules. However, I use the same configuration on Debian 10, and it did not coredump there.
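
If the "any" pseudo-device turns out to be the trigger, one experiment worth trying (untested here; "eth0" is a placeholder for the pod's actual interface) is pinning socket_pcap to a concrete device, so libpcap delivers plain Ethernet frames instead of cooked capture:

<!-- socket_pcap.xml fragment; replace eth0 with the real interface name -->
<param name="dev" value="eth0"/>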

@lmangani (Member)

@sivagurudialpad thanks for the report; our devs will take a look, but I would suggest considering heplify instead, since it's lighter and more portable.
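
For anyone evaluating that suggestion, a typical heplify invocation looks roughly like this (flags as documented in the sipcapture/heplify README; the interface name and HEP endpoint are placeholders):

./heplify -i eth0 -hs 10.0.0.1:9060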

@sivagurudialpad (Author)

@lmangani Thank you for your quick response. I will certainly take a look at heplify and see if it can be used instead of captagent.

@sivagurudialpad (Author)

@lmangani I wanted to check with you if there are any updates regarding this ticket.

@kYroL01 (Collaborator) commented Nov 9, 2023

Hi @sivagurudialpad, not yet. I will check ASAP.
Thank you

kYroL01 self-assigned this Nov 9, 2023
@sivagurudialpad (Author)

Hi @kYroL01. Thank you for taking a look at this issue. I wanted to check with you if there are any updates regarding this ticket.

@anupamdialpad

I see "Testing needed" label has been added. @kYroL01 We can try deploying the build if it is available

@kYroL01 (Collaborator) commented Jan 4, 2024

Hi @anupamdialpad, not yet, but I'll manage it.

@sivagurudialpad (Author)

Hi @kYroL01. Thank you for taking a look at this issue. I wanted to check with you if there are any updates regarding this ticket.

@kYroL01 (Collaborator) commented Jan 30, 2024

Hi @sivagurudialpad, we're looking into it. I was able to reproduce it and will work on a fix.
Thank you

@sivagurudialpad (Author)

Hi @kYroL01. Thank you very much for the update. It is good to know that it was reproducible.

@sivagurudialpad (Author)

Hi @kYroL01. I wanted to check with you if there are any updates regarding this ticket.

sipcapture deleted a comment from sivagurudialpad Mar 1, 2024
@sivagurudialpad (Author)

Hi @kYroL01. I wanted to check with you if there are any updates regarding this ticket.

kYroL01 added the bug label Apr 30, 2024
@sivagurudialpad (Author)

Hi @kYroL01. I wanted to check with you if there are any updates regarding this ticket. We have been following up on it since Oct 2023 and have not been able to upgrade to the latest version because of this issue. Could we please get a fix for it?

@kYroL01 (Collaborator) commented May 6, 2024

Hi @sivagurudialpad, I was very busy with other higher-priority tasks, but I will take a look and get a fix out in the next few weeks.

@sivagurudialpad (Author)

Hi @kYroL01. I wanted to check with you if you got a chance to look into this ticket.

@sivagurudialpad (Author)

Hi @kYroL01, @lmangani. I wanted to check whether you have had a chance to look into this ticket. It was opened in Oct 2023 ... one year has passed. We would really appreciate it if we could get a fix for this issue.

@lmangani (Member) commented Oct 4, 2024

> Hi @kYroL01, @lmangani. I wanted to check whether you have had a chance to look into this ticket. It was opened in Oct 2023 ... one year has passed. We would really appreciate it if we could get a fix for this issue.

Thanks for the nudge @sivagurudialpad. This is not for lack of interest, but our team can only realistically work on issues affecting multiple users. Until that happens, as suggested when this was first reported, I would invite you to consider heplify instead, since it's lighter, maintained, and more portable.
