Forum: 1st Choice Core

Netmail Looping

From Avon@21:1/101 to g00r00 on Sunday, December 10, 2017 21:49:56

Nick (and others) you're more than welcome to chip in with any experiential thoughts here also...

Hi g00r00

While reviewing some old notes I did come across this recorded issue I think
I had sent you (back in the day :)) but as far as I know we never wrestled
with it... I'm pasting my notes below for your consideration / feedback .. as if there wasn't enough going on! :)

[snip]

Netmail infinite looping...and how to stop it?

It's possible for netmail to get into an infinite loop when a node
addresses netmail to a nonexistent system (21:2/201). The node routes all traffic to
its HUB 21:2/100 in NET 2 using 21:*

As the HUB does not have that echonode set up it looks to the routing and
finds the HUB 21:1/100 in NET 1 using 21:*

Netmail is routed to 1/100 which only has 2/100 defined with routing of 21:2/* so back it goes to 2/100 which in turn sends it back to 1/100 and on it goes.
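
The ping-pong described above can be reproduced with a small simulation. This is only a sketch under assumptions: the table layout, the `next_hop` and `trace` helpers, and the direct links listed for each hub are made up for illustration, and real tossers (Mystic, HPT) have their own matching rules. Only the two route masks (21:2/* on 1/100 and 21:* on 2/100) come from the notes.

```python
from fnmatch import fnmatchcase

# Hypothetical tables modelled on the notes above.
# "links" are nodes the hub knows directly; "routes" are (mask, forward-to) pairs.
HUBS = {
    "21:1/100": {
        "links": {"21:1/101", "21:2/100"},
        "routes": [("21:2/*", "21:2/100")],
    },
    "21:2/100": {
        "links": {"21:2/104", "21:1/100"},
        "routes": [("21:*", "21:1/100")],  # the zone-wide catchall
    },
}

def next_hop(hub, dest):
    """Return where `hub` sends netmail for `dest`, or None if it has no route."""
    cfg = HUBS[hub]
    if dest in cfg["links"]:
        return dest  # known node: deliver directly
    for mask, via in cfg["routes"]:
        if fnmatchcase(dest, mask):
            return via
    return None

def trace(start, dest, max_hops=8):
    """Follow the netmail from `start` until delivered, held, or max_hops is hit."""
    path = [start]
    while path[-1] != dest and len(path) <= max_hops:
        hop = next_hop(path[-1], dest)
        if hop is None:
            break  # no route: the mail is held at this hub
        path.append(hop)
    return path

# 21:2/201 does not exist, so the two hubs hand it back and forth.
print(trace("21:2/100", "21:2/201"))
```

With these tables the trace for 21:2/201 alternates between 2/100 and 1/100 until the hop limit cuts it off, while a known node such as 21:2/104 is delivered in two hops from 1/100.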

[snip]

10 Jan 17 15:46:32 Shade Digital Shade Test INTL 21:2/201 21:2/104
TID: Mystic BBS 1.12 A31
MSGID: 21:2/104 00edbc41
TZUTC: 1100
Test from Sade to Shade

--- Mystic BBS v1.12 A31 (Raspberry Pi)
* Origin: Gryphus BBS (21:2/104)
Via 21:2/0 @20170109.234815.UTC Mystic 1.12 A31
Via 21:1/100 @20170110.174849.UTC Mystic 1.12 A31
Via 21:2/0 @20170109.235022.UTC Mystic 1.12 A31
Via 21:1/100 @20170110.182101.UTC Mystic 1.12 A31
Via 21:2/0 @20170110.002323.UTC Mystic 1.12 A31
Via 21:1/100 @20170110.182325.UTC Mystic 1.12 A31
Via 21:2/0 @20170110.002526.UTC Mystic 1.12 A31
Via 21:1/100 @20170110.182545.UTC Mystic 1.12 A31
Via 21:2/0 @20170110.002729.UTC Mystic 1.12 A31
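
A trail like the one above can be checked mechanically. The following is a rough sketch, not how Mystic or any tracker actually does it: the `via_hops` and `looks_like_loop` names and the `max_visits` threshold are assumptions, and the regular expression only covers the Via layout seen in this log.

```python
import re
from collections import Counter

# Matches lines such as: Via 21:2/0 @20170109.234815.UTC Mystic 1.12 A31
VIA_RE = re.compile(r"^Via (\d+:\d+/\d+(?:\.\d+)?) @(\d{8}\.\d{6})\.UTC")

def via_hops(message_text):
    """Return the addresses found in Via lines, in the order they appear."""
    hops = []
    for line in message_text.splitlines():
        # Kludge lines may start with a ^A (0x01) byte in the raw message.
        m = VIA_RE.match(line.strip().lstrip("\x01"))
        if m:
            hops.append(m.group(1))
    return hops

def looks_like_loop(message_text, max_visits=2):
    """True if any system appears in the Via trail more than max_visits times."""
    counts = Counter(via_hops(message_text))
    return any(n > max_visits for n in counts.values())

sample = """\
Via 21:2/0 @20170109.234815.UTC Mystic 1.12 A31
Via 21:1/100 @20170110.174849.UTC Mystic 1.12 A31
Via 21:2/0 @20170109.235022.UTC Mystic 1.12 A31
Via 21:1/100 @20170110.182101.UTC Mystic 1.12 A31
Via 21:2/0 @20170110.002323.UTC Mystic 1.12 A31
"""

print(via_hops(sample))
print(looks_like_loop(sample))
```

Counting repeat visits avoids relying on the timestamps, which in this log do not run in order between the two systems.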

[snip]

How can we set something up at a HUB to ensure this loop does not
continue? Perhaps if the Via is X lines long and/or it's been X hours
since sender sends the netmail ... then the HUB could do 1 or more of
the following

1. generate a not delivered message back to the sender and include
the original message and a reason why

2. send a similar alert to the HUB admin so he/she knows what's going on
and can follow up as well

3. include the 'not delivered' stats in a
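
The idea above (a limit on Via lines and/or hours in transit, then one or more follow-up actions) could look something like this. It is a sketch of the proposal only: the function name, the action labels and the default limits of 6 Via lines and 24 hours are all assumptions, not an existing Mystic feature.

```python
from datetime import datetime, timedelta, timezone

def undeliverable_actions(via_count, sent_at, now,
                          max_vias=6, max_age=timedelta(hours=24)):
    """Return the actions a HUB might take, or [] to keep routing the netmail."""
    reasons = []
    if via_count > max_vias:
        reasons.append(f"{via_count} Via lines (limit {max_vias})")
    if now - sent_at > max_age:
        reasons.append(f"in transit longer than {max_age}")
    if not reasons:
        return []
    reason = "; ".join(reasons)
    return [
        ("bounce_to_sender", reason),          # item 1: reply with original + reason
        ("notify_hub_sysop", reason),          # item 2: alert the HUB admin
        ("record_not_delivered_stat", reason), # item 3: keep a count
    ]

now = datetime(2017, 1, 10, 18, 0, tzinfo=timezone.utc)
sent = datetime(2017, 1, 10, 15, 46, tzinfo=timezone.utc)
for action, why in undeliverable_actions(9, sent, now):
    print(action, "-", why)
```

Either limit on its own is enough to trigger the actions, which matches the "and/or" in the proposal.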

From Accession@21:1/200 to Avon on Sunday, December 10, 2017 22:02:40

On 12/10/17, Avon said the following...

Nick (and others) you're more than welcome to chip in with any experiential thoughts here also...

Hi! <waves>

Netmail infinite looping...and how to stop it?

Before I read on, there are some things you just won't be able to fix.
However, I run HPT here as my tosser on my hub, and I don't see any infinite looping. If said mail has no destination it just stays here doing nothing
until I find it and delete it (this is what I meant by you can't fix that -
but the looping can definitely be fixed).

It's possible for netmail to get into an infinite loop when a node addresses netmail to a nonexistent system (21:2/201). The node routes all traffic to
its HUB 21:2/100 in NET 2 using 21:*

As the HUB does not have that echonode set up it looks to the routing and finds
the HUB 21:1/100 in NET 1 using 21:*

So I take it you have multiple hubs in this network? If I'm understanding
what I'm reading, something is definitely amiss. Are your hubs parsing nodelists? If so, that first hub should realize it's not a valid node number, and stop it right there as undeliverable. However, since they route everything to you after their direct links, you should stop it as the main SPOF with the most current nodelist (since in the end it trickles down from you).

That makes me wonder, do you have any kind of 21:* route statement as a catchall on your hub system? If you do, you shouldn't.

Netmail is routed to 1/100 which only has 2/100 defined with routing of 21:2/* so back it goes to 2/100 which in turn sends it back to 1/100 and on it goes.

Hmm, this seems like it may be bad routing configs. You as the main hub shouldn't be "routing" anything, except for points through their bossnode.
You should have direct connections setup with everyone you connect to (on
your end).

The hubs below you should have direct connections to all of their links, and then the catchall 21:* going to you as the last option.

You shouldn't need node wildcards at all, unless you yourself are routing
past one of your lower hubs directly to a downlink of theirs, in which case that
hub should be told about it so they don't make the same route statement.

If you don't understand what I'm trying to get at (I know, I feel like I'm beginning to ramble), I can try to simplify things a bit..

Regards,
Nick

--- Mystic BBS v1.12 A37 2017/12/09 (Linux/64)
* Origin: _thePharcyde telnet://bbs.pharcyde.org (Wisconsin) (21:1/200)

From Avon@21:1/101 to Accession on Monday, December 11, 2017 19:31:54

On 12/10/17, Accession pondered and said...

Hi! <waves>

** waves back **

So I take it you have multiple hubs in this network? If I'm understanding

Yes, check the nodelist and you will find three of them.

what I'm reading, something is definitely amiss. Are your hubs parsing nodelists? If so, that first hub should realize it's not a valid node

You mean building an internal nodelist from a MUTIL function? Yes I think so but will check.

That makes me wonder, do you have any kind of 21:* route statement as a catchall on your hub system? If you do, you shouldn't.

In 1/100 there are the following settings

21:3/100 has Route Info | 21:3/*
21:2/100 has Route Info | 21:2/*

The 2/100 HUB will send all netmail that can't be delivered to 21:2/*
echonodes on to 1/100 via a 21:* route info statement.

The same happens on the 3/100 HUB for messages unable to be delivered to
21:3/* nodes.

Hmm, this seems like it may be bad routing configs. You as the main hub shouldn't be "routing" anything, except for points through their

I disagree, it needs to route 21:2/* traffic to 21:2/100 for onward distribution and 21:3/* traffic to 21:3/100 for the same reasons.

The hubs below you should have direct connections to all of their links, and then the catchall 21:* going to you as the last option.

That is what is happening...

You shouldn't need node wildcards at all, unless you yourself are routing past one of your lower hubs directly to a downlink of theirs, in which case that hub should be told about it so they don't make the same route statement.

The only routing is between HUBS and at each HUB they in turn will send
netmail to known echonodes directly. It's the unknown nodes where it gets interesting :)

If you don't understand what I'm trying to get at (I know, I feel like
I'm beginning to ramble), I can try to simplify things a bit..

No it's all good I'm following along :)

Best, Paul

--- Mystic BBS v1.12 A37 2017/12/09 (Windows/32)
* Origin: Agency BBS | telnet://agency.bbs.geek.nz (21:1/101)

From Bill McGarrity@21:2/141 to Accession on Monday, December 11, 2017 10:48:00

Accession wrote to Avon on 12-10-17 22:02 <=-

@MSGID: <5A2E0692.116.19fsx_mys@tequilamockingbirdonline.net>

Netmail infinite looping...and how to stop it?

Before I read on, there are some things you just won't be able to fix. However, I run HPT here as my tosser on my hub, and I don't see any infinite looping. If said mail has no destination it just stays here
doing nothing until I find it and delete it (this is what I meant by
you can't fix that - but the looping can definitely be fixed).

Deja Vu... :)

--

Bill

Telnet: tequilamockingbirdonline.net
Web: bbs.tequilamockingbirdonline.net
FTP: ftp.tequilamockingbirdonline.net:2121
IRC: irc.tequilamockingbirdonline.net Ports: 6661-6670 SSL: +6697
Radio: radio.tequilamockingbirdonline.net:8010/live

... Look Twice... Save a Life!!! Motorcycles are Everywhere!!!
--- MultiMail/Win32 v0.50
* Origin: TequilaMockingbird Online - Badlands of NJ (21:2/141)

From Accession@21:1/200 to Avon on Monday, December 11, 2017 19:17:43

On 12/11/17, Avon said the following...

In 1/100 there are the following settings

21:3/100 has Route Info | 21:3/*
21:2/100 has Route Info | 21:2/*

That should be fine. You should then probably check the other two hubs to
make sure they don't have anything that could cause a loop.

Otherwise, *something* should be stopping that message from going any further. That something should be your system once the message gets up to it, provided you're
not in a 3-way polygon and you're still at the top of the chain, that is.

If an unknown netmail came up to you with net 1 in it, you definitely shouldn't be sending it on to the other two hubs, since you don't have any route statement for net 1. It should stop in your outbound with no known destination.

If it was an unknown net 2 address, it should stop on 2/100's system as there is no routing for net 2 there. Same goes for net 3.

However, with HUBs using that 21:* catchall, it will cause loops because even if it is net 2 and is on 2/100's system, but is unknown.. it will send it to you anyways.

You have a couple options.

1) setup a 3-way polygon

1/100
route 2/* to 2/100
route 3/* to 3/100

2/100
route 1/* to 1/100
route 3/* to 3/100

3/100
route 1/* to 1/100
route 2/* to 2/100

2) If you're still top of the chain and the single point of failure (assuming 1/100 is the top level hub):

1/100
route 2/* to 2/100
route 3/* to 3/100

2/100
route 1/* to 1/100
route 3/* to 1/100

3/100
route 1/* to 1/100
route 2/* to 1/100

With #2, there will always be a place for the netmail to stop. 1/100 wouldn't have route info for 1/*, and so on.
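
To see why option 2 always gives the netmail a place to stop, here is a small sketch that walks the three tables listed above. The `ROUTES` layout and the `route` and `trace` helpers are made up for illustration, direct links to real nodes are left out, and the shorthand "2/*" is written out as 21:2/*.

```python
from fnmatch import fnmatchcase

# Option 2 as (mask, forward-to) pairs. Only hub-to-hub routing is modelled.
ROUTES = {
    "21:1/100": [("21:2/*", "21:2/100"), ("21:3/*", "21:3/100")],
    "21:2/100": [("21:1/*", "21:1/100"), ("21:3/*", "21:1/100")],
    "21:3/100": [("21:1/*", "21:1/100"), ("21:2/*", "21:1/100")],
}

def route(hub, dest):
    """Return the hub that `hub` forwards `dest` to, or None if no mask matches."""
    for mask, via in ROUTES[hub]:
        if fnmatchcase(dest, mask):
            return via
    return None

def trace(start, dest, max_hops=8):
    """List the hubs an unknown address passes through before it is held."""
    path = [start]
    while len(path) <= max_hops:
        hop = route(path[-1], dest)
        if hop is None:
            break  # no matching route: the netmail stops here
        path.append(hop)
    return path

print(trace("21:2/100", "21:2/201"))  # unknown net 2 node entered at 2/100
print(trace("21:3/100", "21:2/201"))  # same address entered at 3/100
print(trace("21:2/100", "21:1/999"))  # unknown net 1 node
```

In each case the trace ends at the hub for the destination's own net, because no hub carries a route mask for its own net.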

When you add the 21:* catchall in there (in my opinion this should *only* be used for nodes with one uplink), 2/100 will still send an undefined 2/* node
to 1/100, when it shouldn't. 3/100 will still send an undefined 3/* node to you, when it shouldn't.

See what I'm getting at? 21:* shouldn't be used on any of the 3 hub systems
at all.

Regards,
Nick

--- Mystic BBS v1.12 A37 2017/12/09 (Linux/64)
* Origin: _thePharcyde telnet://bbs.pharcyde.org (Wisconsin) (21:1/200)

From Accession@21:1/200 to Bill McGarrity on Monday, December 11, 2017 19:19:42

On 12/11/17, Bill McGarrity said the following...

you can't fix that - but the looping can definitely be fixed).

Deja Vu... :)

Very much so, now that I've mentioned that damn catchall that seems to be rearing its ugly head on hub systems these days. ;)

Regards,
Nick

--- Mystic BBS v1.12 A37 2017/12/09 (Linux/64)
* Origin: _thePharcyde telnet://bbs.pharcyde.org (Wisconsin) (21:1/200)

From Bill McGarrity@21:2/141 to Accession on Tuesday, December 12, 2017 10:40:00

Accession wrote to Bill McGarrity on 12-11-17 19:19 <=-

On 12/11/17, Bill McGarrity said the following...

you can't fix that - but the looping can definitely be fixed).

Deja Vu... :)

Very much so, now that I've mentioned that damn catchall that seems to
be rearing its ugly head on hub systems these days. ;)

LOL... hey, now that the truth is out there... :)

BTW, working well here..

--

Bill

Telnet: tequilamockingbirdonline.net
Web: bbs.tequilamockingbirdonline.net
FTP: ftp.tequilamockingbirdonline.net:2121
IRC: irc.tequilamockingbirdonline.net Ports: 6661-6670 SSL: +6697
Radio: radio.tequilamockingbirdonline.net:8010/live

... Look Twice... Save a Life!!! Motorcycles are Everywhere!!!
--- MultiMail/Win32 v0.50
* Origin: TequilaMockingbird Online - Badlands of NJ (21:2/141)

From Avon@21:1/101 to Accession on Thursday, December 14, 2017 12:50:40

On 12/11/17, Accession pondered and said...

In 1/100 there are the following settings
21:3/100 has Route Info | 21:3/*
21:2/100 has Route Info | 21:2/*

That should be fine. You should then probably check the other two hubs to make sure they don't have anything that could cause a loop.

At the moment both 2/100 and 3/100 have a 1/100 echonode defined, and its entry has the 21:* catchall route info statement for any netmail. This means if any netmail can't be delivered within the originating node/HUB it hands it on to 1/100 to sort.

Mystic will deliver to known echonodes set up within the HUB first then start looking at the routing info of each defined echonode thereafter to see which node it can route to if it needs to.

I could set up links between all three to create a three way polygon so
netmail is routed more directly between each HUB instead of NET 2 and NET 3 routing it via NET 1

However, with HUBs using that 21:* catchall, it will cause loops because even if it is net 2 and is on 2/100's system, but is unknown.. it will send it to you anyways.

Yep.. so the key is to change the routing info for 1/100 to 21:1/* on the
NET 2 HUB instead of 21:* , that will as you say stop a rogue 21:2/xxx addressed netmail leaving NET2 if it is unknown.

But as a feature for development I think the HUB still needs to be able to
send a reply netmail back to the originating system (and a copy to the HUB sysop) that an error in delivery occurred because of [insert reason here]

You have a couple options.

1) setup a 3-way polygon

1/100
route 2/* to 2/100
route 3/* to 3/100

2/100
route 1/* to 1/100
route 3/* to 3/100

3/100
route 1/* to 1/100
route 2/* to 2/100

I may well do this but will need to establish echonode setups for all HUBS in each HUB. At present NET 2 and NET 3 just poll NET 1 for echomail and/or any netmail that needs to be sent in/out

2) If you're still top of the chain and the single point of failure (assuming 1/100 is the top level hub):

1/100
route 2/* to 2/100
route 3/* to 3/100

That's the way things are at the moment.

2/100
route 1/* to 1/100
route 3/* to 1/100

3/100
route 1/* to 1/100
route 2/* to 1/100

I thought I had this in place but the 21:* was not the right way to go as it
is being used at present. So to speed up things I will opt for option 1..

See what I'm getting at? 21:* shouldn't be used on any of the 3 hub systems at all.

Gotcha :)

Regards,
Nick

Thanks appreciated!

Best, Paul

--- Mystic BBS v1.12 A37 2017/12/09 (Windows/32)
* Origin: Agency BBS | telnet://agency.bbs.geek.nz (21:1/101)

From Accession@21:1/200 to Avon on Wednesday, December 13, 2017 22:08:29

On 12/14/17, Avon said the following...

At the moment both 2/100 and 3/100 have a 1/100 echonode defined and
that has the 21:* catchall routing any netmail statement in its entry. This means if any netmail can't be delivered within the originating node/HUB it hands it on to 1/100 to sort.

That's not going to work. 1/100 has routing statements for both of the other nets, so anything from their own net that is sent to you will go right back
to their net, causing said loop.

I could set up links between all three to create a three way polygon so netmail is routed more directly between each HUB instead of NET 2 and
NET 3 routing it via NET 1

You don't have to do it this way.

Yep.. so the key is to change the routing info for 1/100 to 21:1/* on the NET 2 HUB instead of 21:* , that will as you say stop a rogue 21:2/xxx addressed netmail leaving NET2 if it is unknown.

Exactly. Same goes for 3/100.

But as a feature for development I think the HUB still needs to be able
to send a reply netmail back to the originating system (and a copy to
the HUB sysop) that an error in delivery occurred because of [insert reason here]

Sure, that's what some netmail tracker programs do, however.. in this case, it's just not-so-good routing.

Just remember, hubs shouldn't use full zone catchall route statements. While that may be my own opinion and I don't speak for anyone else, this is why you're having the looping problem. I've helped others in the past with this same issue.

While a regular node would be fine with 21:*, as soon as said node starts forwarding mail to another system, that 21:* statement can and will cause problems.

I may well do this but will need to establish echonode setups for all
HUBS in each HUB. At present NET 2 and NET 3 just poll NET 1 for
echomail and/or any netmail that needs to be sent in/out

Which is fine, and you can keep it that way.

I thought I had this in place but the 21:* was not the right way to go
as it is being used at present. So to speed up things I will opt for option 1..

Speed things up? Not sure how that would speed things up any. Your other hubs should ditch the catchall and go with more direct route statements. That is
the main problem.

Thanks appreciated!

Not a problem, mi amigo!

Regards,
Nick

--- Mystic BBS v1.12 A37 2017/12/13 (Linux/64)
* Origin: _thePharcyde telnet://bbs.pharcyde.org (Wisconsin) (21:1/200)
