Forum Discussion
High Latency and Packet loss
For the past couple of days I have been seeing ping latencies of over 200 ms and packet loss of 85%-100% when pinging. Is anyone else experiencing this?
- iTinkeralot (Bandwidth Buff)
I have the Nokia gateway, and the excessive ping latency is common to all three of the T-Mobile gateways, so a gateway replacement is a waste of time, effort & money. The problem has been reported by users pretty much coast to coast. You can do an ICMP ping over IPv4 or IPv6 and the result is the same. I have tested from Apple & Linux clients and the results are pretty poor: it is common to see 70-80% packet loss when running pings. Sure, ICMP has low priority & can be ignored, but this is a recent behavioral change.
--- 8.8.8.8 ping statistics ---
100 packets transmitted, 19 packets received, 81.0% packet loss
round-trip min/avg/max/stddev = 75.896/110.618/150.142/18.738 ms
from netstat info:
Input histogram:
echo reply: 56
destination unreachable: 6175
time exceeded: 104
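For anyone who wants to compare their own numbers against the summary above, here is a minimal Python sketch that pulls the loss percentage and round-trip times out of a ping summary. The function name `parse_ping_summary` and the dictionary keys are my own illustrative choices, and the regular expressions assume the macOS/BSD or Linux summary layout; other ping implementations may print something different.

```python
import re

def parse_ping_summary(text):
    """Extract packet counts and RTT statistics from a ping summary."""
    counts = re.search(
        r"(\d+) packets transmitted, (\d+) (?:packets )?received", text
    )
    if counts is None:
        raise ValueError("no ping summary found")
    sent, received = int(counts.group(1)), int(counts.group(2))
    # Recompute the loss from the raw counts instead of trusting the text.
    loss_pct = 100.0 * (sent - received) / sent if sent else 0.0
    result = {"sent": sent, "received": received, "loss_pct": loss_pct}

    # Matches "round-trip min/avg/max/stddev = a/b/c/d ms" (macOS/BSD)
    # and "rtt min/avg/max/mdev = a/b/c/d ms" (Linux).
    rtt = re.search(r"= ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms", text)
    if rtt:
        keys = ("min_ms", "avg_ms", "max_ms", "dev_ms")
        result.update(zip(keys, map(float, rtt.groups())))
    return result

# The summary quoted in this post.
SAMPLE = """--- 8.8.8.8 ping statistics ---
100 packets transmitted, 19 packets received, 81.0% packet loss
round-trip min/avg/max/stddev = 75.896/110.618/150.142/18.738 ms"""

stats = parse_ping_summary(SAMPLE)
print(stats["loss_pct"], stats["avg_ms"])
```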
Regarding the speedtest.net operation:
When a speedtest.net test is run against the server with a Wireshark capture going, you can see it open a TCP session to the target test server. The TCP session (in my test run) is set up between TCP ports 8080 & 52830 for the packet exchange between the local client and the destination server. The local client's TCP source port changes depending upon the packets. There are also some UDP exchanges from time to time between the test server and the test client.
82.83.133.40.in-addr.arpa name = charlotte02.speedtest.windstream.net. < Target Server
Just prior to the session setup the local client, my MacBook Pro, repeatedly sends echo requests to 82.83.133.40. Over the course of the test there are 22 of these ICMP packets, but all fail.
Result: all fail to reach the server. Reason: no response found.
Curious behavior: after the gateway, the traceroute shows 192.0.0.1 answering at four consecutive hops (2 through 5):
traceroute to 40.133.83.82 (40.133.83.82), 64 hops max, 52 byte packets
1 www.webgui.nokiawifi.com (192.168.12.1) 1.042 ms 0.393 ms 0.330 ms
2 192.0.0.1 (192.0.0.1) 0.533 ms 0.560 ms 0.454 ms
3 * 192.0.0.1 (192.0.0.1) 27.806 ms *
4 * 192.0.0.1 (192.0.0.1) 42.273 ms 30.483 ms
5 192.0.0.1 (192.0.0.1) 27.970 ms 36.486 ms 28.959 ms
6 * * *
7 * * *
8 * * *
9 * * *
10 * 10.164.165.59 (10.164.165.59) 495.730 ms *
Non-authoritative answer:
10.164.165.59.in-addr.arpa name = 59.165.164.10.man-static.vsnl.net.in.
So it appears the traffic goes out the gateway, but even a traceroute using port 8080 or 443 has issues. I have no clear idea where 192.0.0.1 sits.
Is anyone else seeing a traceroute where 192.0.0.1 is the next hop after the gateway?
(00:50:b6:88:1a:f8 is the MAC address associated with the 192.0.0.1 IPv4 address from the Wireshark packet cap)
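One hedged pointer on the 192.0.0.1 question: 192.0.0.0/29 is the block RFC 7335 reserves as the IPv4 service continuity prefix, which is what a 464XLAT client-side translator normally uses, and another reply in this thread says T-Mobile uses 464XLAT. So that hop is plausibly the translation layer in or near the gateway rather than a router out on the internet, though that is an inference and not something the capture proves. Likewise 10.164.165.59 is RFC 1918 private space, so its reverse DNS name says little about who actually operates that hop. A small sketch using Python's standard `ipaddress` module to label the hops from the traceroute above (the function name `classify_hop` and the label strings are illustrative):

```python
import ipaddress

# 192.0.0.0/29: IPv4 service continuity prefix (RFC 7335).
SERVICE_CONTINUITY = ipaddress.ip_network("192.0.0.0/29")

def classify_hop(addr):
    """Label a traceroute hop by the kind of address space it uses."""
    ip = ipaddress.ip_address(addr)
    # Check the /29 first: Python also reports it as private space.
    if ip in SERVICE_CONTINUITY:
        return "service-continuity"
    if ip.is_private:
        return "private"
    return "public"

for hop in ("192.168.12.1", "192.0.0.1", "10.164.165.59", "40.133.83.82"):
    print(hop, classify_hop(hop))
```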
I am still picking the packet capture apart, as there are a variety of odd issues, but the speed test completes regardless of the problems with the packet exchange.
- iTinkeralot (Bandwidth Buff)
If you want to know more about ICMP packet propagation and latency, I found a very good resource. Using ICMP for the mechanics of the Netgate failover is probably a poor choice. The document covers traceroute and its mechanics extremely well. Having worked in the industry for 22 years, I know ICMP does receive low priority but is used all the time.
Based on that article, it would be interesting to know if Ookla uses an MPLS core. If they are doing ICMP tunneling, that would explain why their ping latency results differ from pings sent by a client through the normal T-Mobile path. I sort of think they do.
https://archive.nanog.org/meetings/nanog45/presentations/Sunday/RAS_traceroute_N45.pdf
- iTinkeralot (Bandwidth Buff)
With the n41 metrics it should be working well. The problem is probably in the backhaul, at a router in the network path to the internet. If they could dedicate a knowledgeable engineer to investigate the behavior in more detail, they could probably find it. From what I have seen examining traffic flows from here in the past, the external IPv4 addresses rotate in the Atlanta area. I don't have intimate knowledge of the hardware in place, so I'm speculating on some aspects of their network architecture. They do use 464XLAT, so they offer no port forwarding to the end user.
Since the focus is on mobile cellular service, the phones get priority over the fixed cellular gateways. In some places the service is really good, in others not so much. I am pretty sure the controls they use for bandwidth distribution are also a big factor, as it appears they will compromise on service levels to expand subscriptions. They may be planning to build out service in a given location to better handle the load, but in some places the cart seems to be before the horse. That's just my impression, as I have seen a good number of subscribers in denser urban areas complaining of the same behavior. They are making an aggressive push to be the bigger dog in the fight.
- Mark_h_ (Newbie Caller)
The short answer is yes, others have seen it. There's another thread on here about it, but a few days ago something broke ping on TMHI. For me, around 1 in 20 pings gets a reply, and today the latency is over 200 ms (usually 40 ms).
I'm running PingInfoView from NirSoft, which lets you set the port it uses. Web servers usually respond on port 80, DNS servers on port 53. This lets me do a periodic ping to check my connection. My connection has been more stable since they broke the ping, so there is a bright side.
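If you don't have a tool that can ping a port, the same idea can be sketched in a few lines of Python: time a TCP connect instead of an ICMP echo. This is only an approximation of what a port-based ping does, not a reimplementation of PingInfoView, and the names `tcp_ping` and `probe` plus the default count and timeout are illustrative choices of mine.

```python
import socket
import time

def tcp_ping(host, port, timeout=2.0):
    """Return the TCP connect time in milliseconds, or None on failure."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return None

def probe(host, port, count=5, timeout=2.0):
    """Run several connect attempts; return (loss_pct, avg_ms or None)."""
    samples = [tcp_ping(host, port, timeout) for _ in range(count)]
    ok = [s for s in samples if s is not None]
    loss_pct = 100.0 * (count - len(ok)) / count
    avg_ms = sum(ok) / len(ok) if ok else None
    return loss_pct, avg_ms

# Example (needs network access): DNS over TCP on port 53.
# loss, avg = probe("8.8.8.8", 53)
# print(f"loss {loss:.0f}%  avg {avg} ms")
```

Because this measures a full TCP handshake, the numbers are not directly comparable to ICMP round-trip times, but it does show whether the path is usable for real traffic.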
- Skull52 (Network Novice)
Yeah, that's a problem. I am using a Netgate 4100 firewall with Starlink and T-Mobile in a failover configuration, with Starlink as primary and TMHI as secondary. If Starlink goes offline, which it does in heavy rain, the Netgate switches to TMHI until Starlink comes back online, then switches back. This worked well until a couple of days ago. The problem is that the Netgate uses ping latency and packet loss to determine an offline condition, and with the excessive latency and packet loss on TMHI it thinks TMHI is offline all the time and won't switch to it, so there is no internet access when Starlink goes offline.
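To make the failure mode concrete, here is a rough sketch of the kind of threshold check a gateway monitor applies. It is not the Netgate's actual code, and the threshold values below are placeholders I picked for illustration, not the firewall's defaults; `MonitorThresholds` and `gateway_is_down` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class MonitorThresholds:
    # Placeholder values for illustration only, not real firewall defaults.
    loss_pct: float = 20.0
    latency_ms: float = 500.0

def gateway_is_down(loss_pct, avg_latency_ms, thresholds=MonitorThresholds()):
    """Declare the gateway down when loss or latency crosses a threshold."""
    if loss_pct >= thresholds.loss_pct:
        return True
    if avg_latency_ms is None:
        # No replies at all, so there is no latency to compare.
        return True
    return avg_latency_ms >= thresholds.latency_ms

# Numbers of the kind reported in this thread: heavy ICMP loss, modest latency.
print(gateway_is_down(81.0, 110.6))   # True: loss alone trips the check
# A healthy link for comparison.
print(gateway_is_down(0.0, 40.0))     # False
```

With ICMP loss at 80% or more, any loss threshold in a sensible range is exceeded no matter how well TCP traffic is flowing, which matches the behavior described above.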
- Mark_h_ (Newbie Caller)
I tried to find out how to set the Netgate to use a different port for its ping, but had no luck. Another user mentioned the same thing yesterday. It's funny, because when I run a speed test the pings seem normal. I don't know what's different, but both the app and www.speedtest.net still show a normal ping response.
- JMS (Roaming Rookie)
IMO ping to an open internet host is not a reliable gauge of connectivity.
I have been running tailscale to another host (also behind TMHI at another house) to determine if we are both connected. "tailscale ping <host>" runs every 5 minutes and has been working very reliably. You might consider that or a similar VPN solution to determine when you want to failover.
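A minimal sketch of that check in Python, in case it helps someone wire it into a script. It shells out to the `tailscale ping` command mentioned above and assumes the CLI exits with a non-zero status when no reply comes back; verify that on your own install. The function name `peer_reachable`, the `run` parameter (there only so the check can be exercised without Tailscale installed) and the 30 second cap are my own choices.

```python
import subprocess

def peer_reachable(host, run=subprocess.run, timeout_s=30):
    """Return True if `tailscale ping <host>` reports success.

    Assumption: the tailscale CLI exits non-zero when no pong arrives.
    """
    try:
        result = run(
            ["tailscale", "ping", host],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except (OSError, subprocess.TimeoutExpired):
        # CLI not installed, not runnable, or it hung past our own cap.
        return False
    return result.returncode == 0

# Example: call this from cron or a loop every 5 minutes.
# if not peer_reachable("other-house"):   # hypothetical host name
#     print("peer unreachable, consider failing over")
```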
- Mark_h_ (Newbie Caller)
It is correct that a ping is not a reliable gauge of connectivity, but it is often used. It seems to be what is used by the Netgate firewall Skull52 has.
- Skull52 (Network Novice)
Mark,
You are correct, ping is the only way to detect a down condition with pfSense. This was working for 2 months until just a couple of days ago. I have the Arcadyan KVD21 gateway. I contacted support today and told them about the ridiculous latency, the 1 in 20 ping replies, the 85%-100% packet loss, and that others were complaining about the same issue. The tech said they were aware of an issue but that the Arcadyan was not affected, so they went through the same old process of re-provisioning the gateway, which of course didn't fix it. Now they are sending me a replacement Arcadyan; we will see if that fixes it.
- Mark_h_ (Newbie Caller)
I can tell you it probably won't fix the ping. I have the Sagemcom and see the same thing: over 90% of pings fail. Something has broken ICMP on TMHI. I would expect them to fix it eventually, since way too many people and devices use ping to monitor connectivity. It is technically not a correct gauge, but it is often the only one available.