Next Hop-based routing with fallback to flooding #2856

Open
wants to merge 48 commits into
base: master
Changes from 39 commits
Commits
0d6729b
Initial version of NextHopRouter
GUVWAF Oct 2, 2023
3776064
Merge branch 'master' into NextHopRouter
thebentern Oct 3, 2023
44dc270
Set original hop limit in header flags
GUVWAF Oct 13, 2023
9b1dd75
Short-circuit to FloodingRouter for broadcasts
GUVWAF Oct 14, 2023
27a492a
If packet traveled 1 hop, set `relay_node` as `next_hop` for the orig…
GUVWAF Oct 14, 2023
c7293cf
Merge branch 'master' into NextHopRouter
caveman99 Oct 31, 2023
42757d8
Merge branch 'master' into NextHopRouter
caveman99 Nov 16, 2023
81f57b6
Merge branch 'master' into NextHopRouter
thebentern Nov 26, 2023
ef2c6ee
Set last byte to 0xFF if it ended at 0x00
GUVWAF Nov 27, 2023
3ba9ecb
Also update next-hop based on received DM for us
GUVWAF Nov 28, 2023
25ec051
Resolve conflicts (needs testing)
GUVWAF Apr 20, 2024
b456e34
Merge branch 'master' into NextHopRouter
caveman99 Apr 23, 2024
d4ef0cd
Merge branch 'master' into NextHopRouter
caveman99 May 3, 2024
b8e01b4
Merge branch 'master' into NextHopRouter
caveman99 Jun 19, 2024
e91dcb4
Merge branch 'master' into NextHopRouter
caveman99 Sep 4, 2024
913268b
temp
GUVWAF Aug 9, 2024
2e303a3
Add 1 retransmission for intermediate hops when using NextHopRouter
GUVWAF Aug 10, 2024
6fe42ed
Add next_hop and relayed_by in PacketHistory for setting next-hop and…
GUVWAF Nov 1, 2024
aae4443
Merge remote-tracking branch 'origin/master' into NextHopRouter
GUVWAF Nov 1, 2024
ba4220f
Update protos, store multiple relayers
GUVWAF Nov 1, 2024
9de8d5a
Remove next-hop update logic from NeighborInfoModule
GUVWAF Nov 1, 2024
0134483
Fix retransmissions
GUVWAF Nov 1, 2024
e4c9818
Improve ACKs for repeated packets and responses
GUVWAF Nov 2, 2024
aab973e
Stop retransmission even if there's not relay node
GUVWAF Nov 2, 2024
28944ad
Merge remote-tracking branch 'origin/master' into NextHopRouter
GUVWAF Nov 5, 2024
790801f
Revert perhapsRebroadcast()
GUVWAF Nov 5, 2024
bb64b14
Remove relayer if we cancel a transmission
GUVWAF Nov 5, 2024
24ff7c0
Better checking for fallback to flooding
GUVWAF Nov 5, 2024
69f88b9
Fix newlines in traceroute print logs
GUVWAF Nov 5, 2024
fbefce7
Merge remote-tracking branch 'origin/master' into NextHopRouter
GUVWAF Nov 8, 2024
70aa28c
Stop retransmission for original packet
GUVWAF Nov 8, 2024
78bf1e1
Use relayID
GUVWAF Nov 8, 2024
f37abe8
Also when want_ack is set, we should try to retransmit
GUVWAF Nov 8, 2024
be73b09
Fix cppcheck error
GUVWAF Nov 8, 2024
71a90b3
Fix 'router' not in scope error
GUVWAF Nov 8, 2024
93bcee3
Fix another cppcheck error
GUVWAF Nov 9, 2024
17495e7
Check for hop_limit and also update next hop when `hop_start == hop_l…
GUVWAF Nov 11, 2024
3725319
Merge remote-tracking branch 'origin/master' into NextHopRouter
GUVWAF Nov 11, 2024
42d17b3
Formatting and correct NUM_RETRANSMISSIONS
GUVWAF Nov 11, 2024
dbe520c
Merge remote-tracking branch 'origin/master' into NextHopRouter
GUVWAF Nov 14, 2024
b229abc
Update protos
GUVWAF Nov 16, 2024
3ea2918
Merge remote-tracking branch 'origin/master' into NextHopRouter
GUVWAF Nov 16, 2024
360637c
Start retransmissions in NextHopRouter if ReliableRouter didn't do it
GUVWAF Nov 16, 2024
bfc6a19
Handle repeated/fallback to flooding packets properly
GUVWAF Nov 16, 2024
98719e4
Merge branch 'master' into NextHopRouter
fifieldt Nov 17, 2024
47116f6
Guard against clients setting `next_hop`/`relay_node`
GUVWAF Nov 18, 2024
6a29793
Merge remote-tracking branch 'origin/master' into NextHopRouter
GUVWAF Nov 18, 2024
e593d54
Merge remote-tracking branch 'origin/master' into NextHopRouter
GUVWAF Nov 19, 2024
4 changes: 2 additions & 2 deletions src/main.cpp
@@ -653,9 +653,9 @@ void setup()
// but we need to do this after main cpu init (esp32setup), because we need the random seed set
nodeDB = new NodeDB;

// If we're taking on the repeater role, use flood router and turn off 3V3_S rail because peripherals are not needed
// If we're taking on the repeater role, use NextHopRouter and turn off 3V3_S rail because peripherals are not needed
if (config.device.role == meshtastic_Config_DeviceConfig_Role_REPEATER) {
router = new FloodingRouter();
router = new NextHopRouter();
#ifdef PIN_3V3_EN
digitalWrite(PIN_3V3_EN, LOW);
#endif
3 changes: 2 additions & 1 deletion src/mesh/FloodingRouter.cpp
@@ -13,7 +13,8 @@ FloodingRouter::FloodingRouter() {}
ErrorCode FloodingRouter::send(meshtastic_MeshPacket *p)
{
// Add any messages _we_ send to the seen message list (so we will ignore all retransmissions we see)
wasSeenRecently(p); // FIXME, move this to a sniffSent method
p->relay_node = nodeDB->getLastByteOfNodeNum(getNodeNum()); // First set the relayer to us
wasSeenRecently(p); // FIXME, move this to a sniffSent method

return Router::send(p);
}
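
Note on `relay_node`: the field holds a single byte of the sender's node number. The implementation of `nodeDB->getLastByteOfNodeNum()` is not part of this diff, so the following is only a minimal sketch of what it might look like, assuming (from the commit titled "Set last byte to 0xFF if it ended at 0x00") that a low byte of 0x00 is remapped so it cannot be confused with "no preference". The name `lastByteOfNodeNum` is hypothetical.

```cpp
#include <cstdint>

// Hypothetical stand-in for NodeDB::getLastByteOfNodeNum(); not the firmware's code.
// Keeps the low byte of the node number and remaps 0x00 to 0xFF, so the result
// never equals 0, the value this PR uses for "no next-hop preference".
constexpr uint8_t lastByteOfNodeNum(uint32_t nodeNum)
{
    return (nodeNum & 0xFF) == 0 ? static_cast<uint8_t>(0xFF)
                                 : static_cast<uint8_t>(nodeNum & 0xFF);
}
```

Because only one byte is kept, two nodes whose numbers share the same low byte get the same relay ID, which is one reason a fallback to flooding is still needed.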
8 changes: 4 additions & 4 deletions src/mesh/FloodingRouter.h
@@ -1,6 +1,5 @@
#pragma once

#include "PacketHistory.h"
#include "Router.h"

/**
@@ -26,11 +25,9 @@
Any entries in recentBroadcasts that are older than X seconds (longer than the
max time a flood can take) will be discarded.
*/
class FloodingRouter : public Router, protected PacketHistory
class FloodingRouter : public Router
{
private:
bool isRebroadcaster();

public:
/**
* Constructor
@@ -58,4 +55,7 @@ class FloodingRouter : public Router, protected PacketHistory
* Look for broadcasts we need to rebroadcast
*/
virtual void sniffReceived(const meshtastic_MeshPacket *p, const meshtastic_Routing *c) override;

// Return true if we are a rebroadcaster
bool isRebroadcaster();
};
3 changes: 3 additions & 0 deletions src/mesh/MeshTypes.h
@@ -40,6 +40,9 @@ enum RxSource {
/// We normally just use max 3 hops for sending reliable messages
#define HOP_RELIABLE 3

// For old firmware or when falling back to flooding, there is no next-hop preference
#define NO_NEXT_HOP_PREFERENCE 0

typedef int ErrorCode;

/// Alloc and free packets to our global, ISR safe pool
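
Note on `NO_NEXT_HOP_PREFERENCE`: the value 0 doubles as "flood this packet". The selection rule that `NextHopRouter::getNextHop()` applies further down in this PR can be sketched in isolation as below. This is a simplified illustration, not the firmware code: `NextHopTable`, `chooseNextHop` and `BROADCAST_ADDR` are hypothetical names, and a plain map stands in for the node database.

```cpp
#include <cstdint>
#include <unordered_map>

constexpr uint8_t NO_NEXT_HOP_PREFERENCE = 0;   // same value as the new define
constexpr uint32_t BROADCAST_ADDR = 0xFFFFFFFF; // assumed broadcast address

// Hypothetical routing table: destination node number -> one-byte ID of the learned next hop.
using NextHopTable = std::unordered_map<uint32_t, uint8_t>;

// Returns the learned next hop for `to`, or NO_NEXT_HOP_PREFERENCE to request flooding.
uint8_t chooseNextHop(const NextHopTable &table, uint32_t to, uint8_t relayNode)
{
    if (to == BROADCAST_ADDR)
        return NO_NEXT_HOP_PREFERENCE; // broadcasts are always flooded

    auto it = table.find(to);
    if (it == table.end() || it->second == NO_NEXT_HOP_PREFERENCE)
        return NO_NEXT_HOP_PREFERENCE; // nothing learned for this destination yet

    if (it->second == relayNode)
        return NO_NEXT_HOP_PREFERENCE; // never hand the packet back to the node that relayed it

    return it->second;
}
```

Nodes running old firmware leave the field at 0, so they are treated the same way as a deliberate fallback to flooding.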
247 changes: 247 additions & 0 deletions src/mesh/NextHopRouter.cpp
@@ -0,0 +1,247 @@
#include "NextHopRouter.h"

NextHopRouter::NextHopRouter() {}

PendingPacket::PendingPacket(meshtastic_MeshPacket *p, uint8_t numRetransmissions)
{
packet = p;
this->numRetransmissions = numRetransmissions - 1; // We subtract one, because we assume the user just did the first send
}

/**
* Send a packet
*/
ErrorCode NextHopRouter::send(meshtastic_MeshPacket *p)
{
// Add any messages _we_ send to the seen message list (so we will ignore all retransmissions we see)
p->relay_node = nodeDB->getLastByteOfNodeNum(getNodeNum()); // First set the relayer to us
wasSeenRecently(p); // FIXME, move this to a sniffSent method

p->next_hop = getNextHop(p->to, p->relay_node); // set the next hop
LOG_DEBUG("Setting next hop for packet with dest %x to %x", p->to, p->next_hop);

// If it's from us, ReliableRouter already handles retransmissions. If a next hop is set and hop limit is not 0 or want_ack is
// set, start retransmissions
if (!isFromUs(p) && p->next_hop != NO_NEXT_HOP_PREFERENCE && (p->hop_limit > 0 || p->want_ack))
startRetransmission(packetPool.allocCopy(*p)); // start retransmission for relayed packet

return Router::send(p);
}

bool NextHopRouter::shouldFilterReceived(const meshtastic_MeshPacket *p)
{
if (wasSeenRecently(p)) { // Note: this will return false for a fallback to flooding
printPacket("Already seen, try stop re-Tx and cancel sending", p);
rxDupe++;
stopRetransmission(p->from, p->id);
if (config.device.role != meshtastic_Config_DeviceConfig_Role_ROUTER &&
config.device.role != meshtastic_Config_DeviceConfig_Role_REPEATER) {
// cancel rebroadcast of this message *if* there was already one, unless we're a router/repeater!
if (Router::cancelSending(p->from, p->id))
txRelayCanceled++;
}
return true;
}

return Router::shouldFilterReceived(p);
}

void NextHopRouter::sniffReceived(const meshtastic_MeshPacket *p, const meshtastic_Routing *c)
{
NodeNum ourNodeNum = getNodeNum();
uint8_t ourRelayID = nodeDB->getLastByteOfNodeNum(ourNodeNum);
bool isAckorReply = (p->which_payload_variant == meshtastic_MeshPacket_decoded_tag) && (p->decoded.request_id != 0);
if (isAckorReply) {
// Update next-hop for the original transmitter of this successful transmission to the relay node, but ONLY if "from" is
// not 0 (means implicit ACK) and original packet was also relayed by this node, or we sent it directly to the destination
if (p->from != 0) {
meshtastic_NodeInfoLite *origTx = nodeDB->getMeshNode(p->from);
if (origTx) {
// Either relayer of ACK was also a relayer of the packet, or we were the relayer and the ACK came directly from
// the destination
if (wasRelayer(p->relay_node, p->decoded.request_id, p->to) ||
(wasRelayer(ourRelayID, p->decoded.request_id, p->to) && p->hop_start != 0 && p->hop_start == p->hop_limit)) {
if (origTx->next_hop != p->relay_node) { // Not already set
LOG_INFO("Update next hop of 0x%x to 0x%x based on ACK/reply", p->from, p->relay_node);
origTx->next_hop = p->relay_node;
}
}
}
}
if (!isToUs(p)) {
Router::cancelSending(p->to, p->decoded.request_id); // cancel rebroadcast for this DM
// stop retransmission for the original packet
stopRetransmission(p->to, p->decoded.request_id); // for original packet, from = to and id = request_id
}
}

if (!isToUs(p) && !isFromUs(p) && p->hop_limit > 0) {
if (p->next_hop == NO_NEXT_HOP_PREFERENCE || p->next_hop == ourRelayID) {
if (isRebroadcaster()) {
meshtastic_MeshPacket *tosend = packetPool.allocCopy(*p); // keep a copy because we will be sending it
LOG_INFO("Relaying received message coming from %x", p->relay_node);

tosend->hop_limit--; // bump down the hop count
NextHopRouter::send(tosend);
} else {
LOG_DEBUG("Not rebroadcasting: Role = CLIENT_MUTE or Rebroadcast Mode = NONE");
}
}
}
// handle the packet as normal
Router::sniffReceived(p, c);
}

/**
* Get the next hop for a destination, given the relay node
* @return the last byte of the node number of the next hop, NO_NEXT_HOP_PREFERENCE (0) if no preference (fallback to flooding)
*/
uint8_t NextHopRouter::getNextHop(NodeNum to, uint8_t relay_node)
{
// When we're a repeater router->sniffReceived will call NextHopRouter directly without checking for broadcast
if (isBroadcast(to))
return NO_NEXT_HOP_PREFERENCE;

meshtastic_NodeInfoLite *node = nodeDB->getMeshNode(to);
if (node && node->next_hop) {
// We are careful not to return the relay node as the next hop
if (node->next_hop != relay_node) {
// LOG_DEBUG("Next hop for 0x%x is 0x%x", to, node->next_hop);
return node->next_hop;
} else
LOG_WARN("Next hop for 0x%x is 0x%x, same as relayer; set no pref", to, node->next_hop);
}
return NO_NEXT_HOP_PREFERENCE;
}

PendingPacket *NextHopRouter::findPendingPacket(GlobalPacketId key)
{
auto old = pending.find(key); // If we have an old record, someone messed up because id got reused
if (old != pending.end()) {
return &old->second;
} else
return NULL;
}

/**
* Stop any retransmissions we are doing of the specified node/packet ID pair
*/
bool NextHopRouter::stopRetransmission(NodeNum from, PacketId id)
{
auto key = GlobalPacketId(from, id);
return stopRetransmission(key);
}

bool NextHopRouter::stopRetransmission(GlobalPacketId key)
{
auto old = findPendingPacket(key);
if (old) {
auto p = old->packet;
/* Only when we already transmitted a packet via LoRa, we will cancel the packet in the Tx queue
to avoid canceling a transmission if it was ACKed super fast via MQTT */
if (old->numRetransmissions < NUM_RELIABLE_RETX - 1) {
// remove the 'original' (identified by originator and packet->id) from the txqueue and free it
cancelSending(getFrom(p), p->id);
// now free the pooled copy for retransmission too
packetPool.release(p);
}
auto numErased = pending.erase(key);
assert(numErased == 1);
return true;
} else
return false;
}

/**
* Add p to the list of packets to retransmit occasionally. We will free it once we stop retransmitting.
*/
PendingPacket *NextHopRouter::startRetransmission(meshtastic_MeshPacket *p, uint8_t numReTx)
{
auto id = GlobalPacketId(p);
auto rec = PendingPacket(p, numReTx);

stopRetransmission(getFrom(p), p->id);

setNextTx(&rec);
pending[id] = rec;

return &pending[id];
}

/**
* Do any retransmissions that are scheduled (FIXME - for the time being called from loop)
*/
int32_t NextHopRouter::doRetransmissions()
{
uint32_t now = millis();
int32_t d = INT32_MAX;

// FIXME, we should use a better datastructure rather than walking through this map.
// for(auto el: pending) {
for (auto it = pending.begin(), nextIt = it; it != pending.end(); it = nextIt) {
++nextIt; // we use this odd pattern because we might be deleting it...
auto &p = it->second;

bool stillValid = true; // assume we'll keep this record around

// FIXME, handle 51 day rollover here!!!
if (p.nextTxMsec <= now) {
if (p.numRetransmissions == 0) {
if (isFromUs(p.packet)) {
LOG_DEBUG("Reliable send failed, returning a nak for fr=0x%x,to=0x%x,id=0x%x", p.packet->from, p.packet->to,
p.packet->id);
sendAckNak(meshtastic_Routing_Error_MAX_RETRANSMIT, getFrom(p.packet), p.packet->id, p.packet->channel);
}
// All retransmissions are used up: stop tracking this packet and drop its record
stopRetransmission(it->first);
stillValid = false; // just deleted it
} else {
LOG_DEBUG("Sending retransmission fr=0x%x,to=0x%x,id=0x%x, tries left=%d", p.packet->from, p.packet->to,
p.packet->id, p.numRetransmissions);

if (!isBroadcast(p.packet->to)) {
if (p.numRetransmissions == 1) {
// Last retransmission, reset next_hop (fallback to FloodingRouter)
p.packet->next_hop = NO_NEXT_HOP_PREFERENCE;
// Also reset it in the nodeDB
meshtastic_NodeInfoLite *sentTo = nodeDB->getMeshNode(p.packet->to);
if (sentTo) {
LOG_INFO("Resetting next hop for packet with dest 0x%x\n", p.packet->to);
sentTo->next_hop = NO_NEXT_HOP_PREFERENCE;
}
FloodingRouter::send(packetPool.allocCopy(*p.packet));
} else {
NextHopRouter::send(packetPool.allocCopy(*p.packet));
}
} else {
// Note: we call the superclass version because we don't want to have our version of send() add a new
// retransmission record
FloodingRouter::send(packetPool.allocCopy(*p.packet));
}

// Queue again
--p.numRetransmissions;
setNextTx(&p);
}
}

if (stillValid) {
// Update our desired sleep delay
int32_t t = p.nextTxMsec - now;

d = min(t, d);
}
}

return d;
}

void NextHopRouter::setNextTx(PendingPacket *pending)
{
assert(iface);
auto d = iface->getRetransmissionMsec(pending->packet);
pending->nextTxMsec = millis() + d;
LOG_DEBUG("Setting next retransmission in %u msecs: ", d);
printPacket("", pending->packet);
setReceivedMessage(); // Run ASAP, so we can figure out our correct sleep time
}
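
Note on `doRetransmissions()`: two points in the loop above may be easier to review in isolation. The last retry drops the next-hop preference and floods, and the `p.nextTxMsec <= now` comparison carries a FIXME about the 32-bit `millis()` counter rolling over. The sketch below shows one way both could be expressed. It is a simplified model, not a proposed patch: `PendingSketch`, `RetxAction`, `isDue` and `stepRetransmission` are hypothetical names, and unlike the PR it treats an already-empty next hop as a flooded send instead of looking the hop up again.

```cpp
#include <cstdint>

constexpr uint8_t NO_NEXT_HOP_PREFERENCE = 0;

// Hypothetical, reduced stand-in for PendingPacket.
struct PendingSketch {
    uint8_t nextHop;             // preferred next hop, 0 means flood
    uint8_t retransmissionsLeft; // attempts still allowed
    uint32_t nextTxMsec;         // millisecond timestamp of the next attempt
};

enum class RetxAction { Wait, SendToNextHop, SendFlooded, GiveUp };

// Rollover-tolerant deadline check. The unsigned difference wraps modulo 2^32,
// so the result is correct as long as now and deadline are less than 2^31 ms
// (about 24.8 days) apart.
inline bool isDue(uint32_t now, uint32_t deadline)
{
    return static_cast<uint32_t>(now - deadline) < 0x80000000u;
}

// Decide what to do with one pending packet at time `now` and update its record.
RetxAction stepRetransmission(PendingSketch &p, uint32_t now, uint32_t retryDelayMsec)
{
    if (!isDue(now, p.nextTxMsec))
        return RetxAction::Wait;
    if (p.retransmissionsLeft == 0)
        return RetxAction::GiveUp; // caller would send a NAK (if ours) and erase the record

    RetxAction action = RetxAction::SendToNextHop;
    if (p.retransmissionsLeft == 1 || p.nextHop == NO_NEXT_HOP_PREFERENCE) {
        p.nextHop = NO_NEXT_HOP_PREFERENCE; // last attempt: drop the preference and flood
        action = RetxAction::SendFlooded;
    }
    --p.retransmissionsLeft;
    p.nextTxMsec = now + retryDelayMsec; // may wrap around, which isDue() tolerates
    return action;
}
```

Returning an action instead of sending directly keeps the policy testable without a radio interface; in the firmware the two send branches would correspond to `NextHopRouter::send()` and `FloodingRouter::send()`.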