• 1/100 *.BSY probs

    From Avon@21:1/100 to g00r00 on Thursday, November 16, 2017 07:19:47

    Woke up to find the HUB not processing packets as another case of *.bsy files had occured.

    Check logs are in fsxnet-hub-logs-ver-20171114b-busy-nodes-again.zip waiting for your next poll. :)

    Have cleared files and processed waiting packets in echomai\in

    Best, Paul

    --- Mystic BBS v1.12 A36 2017/11/09 (Windows/32)
    * Origin: fsxNet HUB (21:1/100)
  • From g00r00@21:1/108 to Avon on Wednesday, November 15, 2017 14:28:24
    Woke up to find the HUB not processing packets as another case of *.bsy files had occured.

    Check logs are in fsxnet-hub-logs-ver-20171114b-busy-nodes-again.zip waiting for your next poll. :)

    Have cleared files and processed waiting packets in echomai\in

    We need to get in sync with the different copies. I have updated the latest as of a few minutes ago and I will hold off now until I get feedback from you just so we're working with the same stuff.

    If/when MIS2/FidoPoll crashes it will not be able to remove any BSY files so they can add up that way. fidopoll killbusy should probably be ran before putting anything back up (if you aren't already) after a crash.

    The latest MIS2 should have logs that say when its created and removing BUSY files so we can see if its happening in MIS2 or not. I haven't had time to test this before I put it on the FTP server because I have to work so for all I know they won't even run.

    MUTIL also makes heavy usage of the BUSY files as well but I don't believe it has good logging in that area.

    --- Mystic BBS v1.12 A36 2017/11/15 (Windows/32)
    * Origin: Sector 7 [Mystic BBS WHQ] (21:1/108)
  • From Avon@21:1/101 to g00r00 on Thursday, November 16, 2017 10:27:14
    On 11/15/17, g00r00 pondered and said...

    We need to get in sync with the different copies. I have updated the latest as of a few minutes ago and I will hold off now until I get feedback from you just so we're working with the same stuff.

    OK sounds good. I agree also. It's mid morning here I won't be able to do
    this until tonight when I get home but will post an update to this post when
    I have upgraded. Then with upgraded logging overnight (my time) I can zip up and send you the latest logging along with any other reported findings
    tomorrow morning (my Friday) before I head to work :)

    If/when MIS2/FidoPoll crashes it will not be able to remove any BSY
    files so they can add up that way. fidopoll killbusy should probably be ran before putting anything back up (if you aren't already) after a
    crash.

    I think what is happening is that Fidopoll is working to deliver packets, a
    new packet comes in via BinkP while this happens, events system fires off
    mutil to import it, mutil sits waiting for *.bsy flags to clear and timesout after 60 odd seconds as fidopoll is still working, then it seems fidpoll also then skips a bunch of nodes while polling with the *.bsy error displayed and then things jam up.

    The bsy flags have not cleared, mutil keeps getting called as new packets arrive, it never runs as it's waiting on *bsy files not cleared and on it goes...

    I think the issue is related to the interaction between the two. Ideally
    mutil should not be allowed to be run at all if fidopoll is running. perhaps
    a better way is for the code to not run if it sees fidopoll.bsy semaphore is present rather than allow the events system to call it?

    when I run a mailin process its, mutil process incoming echomail and files followed by fidopoll send all in the same shell line. mailout is the same
    thing the mutil outbound processing then a fidopoll send.

    If I separated out the processing of packets in/out of the hub from the fidopoll process then I could probably avoid what I think is happening in my description above. It would allow all traffic in/out of the HUB to be ready
    for nodes polling it via BinkP in the most efficient way possible... but the downside is nodes that want traffic crashed to them would not get things as quickly.

    Best, Paul

    --- Mystic BBS v1.12 A36 2017/11/09 (Windows/32)
    * Origin: Agency BBS | telnet://agency.bbs.geek.nz (21:1/101)
  • From Avon@21:1/101 to g00r00 on Thursday, November 16, 2017 20:53:31
    On 11/16/17, Avon pondered and said...

    On 11/15/17, g00r00 pondered and said...

    We need to get in sync with the different copies. I have updated the latest as of a few minutes ago and I will hold off now until I get feedback from you just so we're working with the same stuff.

    OK sounds good. I agree also. It's mid morning here I won't be able to do this until tonight when I get home but will post an update to this post when I have upgraded. Then with upgraded logging overnight (my time) I
    can zip up and send you the latest logging along with any other reported findings tomorrow morning (my Friday) before I head to work :)

    8.52pm Thursday here, have just updated Agency and the fsxNet 1/100 HUB with 2017/11/15 from the freshly minted installer... will leave this all running overnight and send you logs in the morning. Hopefully it all helps and hopefully it all keeps running overnight :)

    Best, Paul

    --- Mystic BBS v1.12 A36 2017/11/15 (Windows/32)
    * Origin: Agency BBS | telnet://agency.bbs.geek.nz (21:1/101)
  • From Avon@21:1/101 to g00r00 on Friday, November 17, 2017 07:02:19
    On 11/16/17, Avon pondered and said...

    8.52pm Thursday here, have just updated Agency and the fsxNet 1/100 HUB with 2017/11/15 from the freshly minted installer... will leave this all running overnight and send you logs in the morning. Hopefully it all
    helps and hopefully it all keeps running overnight :)


    I woke up early, coming up 7am Friday so have zipped logs and dropped them in your file box... So far it seems there are no major problems. Will leave
    things running and check after work tonight :)

    I did find this Fidopoll issue with one node I may need to chase up.

    Nov 16 20:54:44 Connecting to vistabbs.ianthurston.com
    Nov 16 20:54:44 Connected
    Nov 16 20:54:44 Frame wait
    Nov 16 20:54:44 Auth State:SendWelcome Have:False Need:True
    Nov 16 20:54:44 Sent Frame: NUL SYS fsxHUB [fsxNet WHQ]
    Nov 16 20:54:44 Sent Frame: NUL ZYZ Paul Hayton
    Nov 16 20:54:44 Sent Frame: NUL VER Mystic/1.12A36 binkp/1.0
    Nov 16 20:54:44 Sent Frame: ADR 21:1/100@fsxnet 21:1/3@fsxnet 21:1/2@fsxnet 21:1/0@fsxnet 21:0/0@fsxnet
    Nov 16 20:54:44 Frame wait
    Nov 16 20:54:45 Auth State:SendPW Have:False Need:True
    Nov 16 20:54:45 New frame detected
    Nov 16 20:54:45 Begin receive data frame (Size=20334)
    Nov 16 20:54:45 Error receiving data (RB Socket read buffer result 0 Size=29/20334)
    Nov 16 20:54:45 Data Frame: 29/20,334
    Nov 16 20:54:45 Connection lost
    Nov 16 20:54:45 Authorization failed


    oh and there's this one also..

    Nov 16 21:35:12 Polling BINKP node 21:1/128
    Nov 16 21:35:12 Connecting to bbs.ouijabrd.net
    Nov 16 21:35:13 Connected
    Nov 16 21:35:13 Frame wait
    Nov 16 21:35:13 Auth State:SendWelcome Have:False Need:True
    Nov 16 21:35:13 Sent Frame: NUL SYS fsxHUB [fsxNet WHQ]
    Nov 16 21:35:13 Sent Frame: NUL ZYZ Paul Hayton
    Nov 16 21:35:13 Sent Frame: NUL VER Mystic/1.12A36 binkp/1.0
    Nov 16 21:35:13 Sent Frame: ADR 21:1/100@fsxnet 21:1/3@fsxnet 21:1/2@fsxnet 21:1/0@fsxnet 21:0/0@fsxnet
    Nov 16 21:35:13 Frame wait
    Nov 16 21:35:13 Auth State:SendPW Have:False Need:True
    Nov 16 21:35:13 Frame wait
    Nov 16 21:35:13 Frame wait
    Nov 16 21:35:13 New frame detected
    Nov 16 21:35:13 Begin receive data frame (Size=2148)
    Nov 16 21:35:13 Error receiving data (RB Socket read buffer result 0 Size=0/2148)
    Nov 16 21:35:13 Data Frame: 0/2,148
    Nov 16 21:35:13 Connection lost
    Nov 16 21:35:13 Authorization failed

    Best, Paul

    --- Mystic BBS v1.12 A36 2017/11/15 (Windows/32)
    * Origin: Agency BBS | telnet://agency.bbs.geek.nz (21:1/101)
  • From g00r00@21:1/108 to Avon on Thursday, November 16, 2017 15:31:43
    I did find this Fidopoll issue with one node I may need to chase up.

    This might actually just be a legitimate loss of carrier...

    But I did change this most recent version to be very "aggressive" in returning a carrier loss on almost all errors... so it could also be related.

    --- Mystic BBS v1.12 A36 2017/11/15 (Windows/32)
    * Origin: Sector 7 [Mystic BBS WHQ] (21:1/108)
  • From Avon@21:1/101 to g00r00 on Friday, November 17, 2017 12:17:44
    On 11/16/17, g00r00 pondered and said...

    This might actually just be a legitimate loss of carrier...

    But I did change this most recent version to be very "aggressive" in returning a carrier loss on almost all errors... so it could also be related.

    It seemed to be a consistent thing for those two nodes for the period of logging.. I'll keep an eye on it.

    --- Mystic BBS v1.12 A36 2017/11/15 (Windows/32)
    * Origin: Agency BBS | telnet://agency.bbs.geek.nz (21:1/101)
  • From Avon@21:1/101 to g00r00 on Saturday, November 18, 2017 07:39:07
    On 11/17/17, Avon pondered and said...

    I woke up early, coming up 7am Friday so have zipped logs and dropped
    them in your file box... So far it seems there are no major problems.
    Will leave things running and check after work tonight :)


    An update as of Saturday morning here - 18 November :)

    Have left things running since last update and there have been no significant failures that I can spot.

    Fidopoll continues to return this sort of error for 2-3 nodes but it seems mostly harmless. This one occurs for the same node..

    Nov 17 02:16:41 Error receiving data (RB Socket read buffer result 0 Size=29/20334)

    Then there's this from a poll to Jeff's BBBS system that failed.

    [snip]

    Nov 18 00:33:33 Polling BINKP node 21:1/128
    Nov 18 00:33:33 Connecting to bbs.ouijabrd.net
    Nov 18 00:33:34 Connected
    Nov 18 00:33:34 Frame wait
    Nov 18 00:33:34 Auth State:SendWelcome Have:False Need:True
    Nov 18 00:33:34 Sent Frame: NUL SYS fsxHUB [fsxNet WHQ]
    Nov 18 00:33:34 Sent Frame: NUL ZYZ Paul Hayton
    Nov 18 00:33:34 Sent Frame: NUL VER Mystic/1.12A36 binkp/1.0
    Nov 18 00:33:34 Sent Frame: ADR 21:1/100@fsxnet 21:1/3@fsxnet 21:1/2@fsxnet 21:1/0@fsxnet 21:0/0@fsxnet
    Nov 18 00:33:34 Frame wait
    Nov 18 00:33:34 Auth State:SendPW Have:False Need:True
    Nov 18 00:33:34 Frame wait
    Nov 18 00:33:34 Frame wait
    Nov 18 00:33:34 New frame detected
    Nov 18 00:33:34 Begin receive data frame (Size=2148)
    Nov 18 00:33:34 Error receiving data (RB Socket read buffer result 0 Size=0/2148)
    Nov 18 00:33:34 Data Frame: 0/2,148
    Nov 18 00:33:34 Connection lost
    Nov 18 00:33:34 Authorization failed

    [snip]

    and one to Dan's the also failed

    [snip]

    Nov 17 10:18:11 Scanning 21:1/186
    Nov 17 10:18:11 Queued 1 files (1003 bytes) to 21:1/186
    Nov 17 10:18:11 Polling BINKP node 21:1/186
    Nov 17 10:18:11 Connecting to castlerockbbs.com
    Nov 17 10:18:11 Connected
    Nov 17 10:18:11 Frame wait
    Nov 17 10:18:11 Auth State:SendWelcome Have:False Need:True
    Nov 17 10:18:11 Sent Frame: NUL SYS fsxHUB [fsxNet WHQ]
    Nov 17 10:18:11 Sent Frame: NUL ZYZ Paul Hayton
    Nov 17 10:18:11 Sent Frame: NUL VER Mystic/1.12A36 binkp/1.0
    Nov 17 10:18:11 Sent Frame: ADR 21:1/100@fsxnet 21:1/3@fsxnet 21:1/2@fsxnet 21:1/0@fsxnet 21:0/0@fsxnet
    Nov 17 10:18:12 Frame wait
    Nov 17 10:18:12 Auth State:SendPW Have:False Need:True
    Nov 17 10:18:12 Frame wait
    Nov 17 10:18:12 Frame wait
    Nov 17 10:18:12 New frame detected
    Nov 17 10:18:12 Begin receive data frame (Size=16972)
    Nov 17 10:18:12 Error receiving data (RB Socket read buffer result 0 Size=7/16972)
    Nov 17 10:18:12 Data Frame: 7/16,972
    Nov 17 10:18:12 Connection lost
    Nov 17 10:18:12 Authorization failed

    [snip]

    But for the post part, packets are being pushed out OK and so far not major *.bsy issues as reported earlier.

    The only other thing I note is the logs created by MIS2 seem far smaller than MUTIL and Fidopoll. The latter date back to the last upgrade Fidopoll starts
    16 Nov 8pm, and MUTIL is the same. But MIS2 logs seem to be very incomplete
    and only span perhaps not quite 24 hours as I look at them today... that's weird.

    Have zipped up all logs and they wait for your next poll. :)

    Best, Paul

    --- Mystic BBS v1.12 A36 2017/11/15 (Windows/32)
    * Origin: Agency BBS | telnet://agency.bbs.geek.nz (21:1/101)
  • From g00r00@21:1/108 to Avon on Saturday, November 18, 2017 00:15:20
    Have left things running since last update and there have been no significant failures that I can spot.

    Great! But I am sure this is only because I added in that BUSY logging. As soon as I turn that off it will happen again!

    Fidopoll continues to return this sort of error for 2-3 nodes but it
    seems mostly harmless. This one occurs for the same node..

    Thank you. I am pretty sure this was just a result of my aggressive changes I made in the most recent version with detecting "connection loss". I'll tweak it again and see if I can clean that up.

    --- Mystic BBS v1.12 A36 2017/11/16 (Windows/32)
    * Origin: Sector 7 [Mystic BBS WHQ] (21:1/108)
  • From Avon@21:1/101 to g00r00 on Sunday, November 19, 2017 08:13:29
    On 11/18/17, g00r00 pondered and said...

    Have left things running since last update and there have been no significant failures that I can spot.

    Great! But I am sure this is only because I added in that BUSY logging. As soon as I turn that off it will happen again!

    I spoke too soon...

    [snip]

    ----------------- MUTIL v1.12 A36 2017/11/15 Sun, Nov 19 2017 (loglevel 3)
    + Nov 19 04:16:22 Startup using mailin.ini
    - Nov 19 04:16:22 EXEC ImportEchoMail
    - Nov 19 04:16:22 EXEC FileToss
    + Nov 19 04:16:22 Process: Toss FDN/TIC Files
    + Nov 19 04:16:22 Scanning Hatches
    + Nov 19 04:16:22 Results: 0 import, 0 toss, 0 hatch, 0 bad in 0.00s
    + Nov 19 04:16:22 Process: Importing EchoMail
    + Nov 19 04:16:22 Waiting for BUSY nodes
    + Nov 19 04:17:23 Results: Cannot import. Some nodes are BUSY in 60.83s
    ! Nov 19 04:17:23 Status: FATAL
    + Nov 19 04:17:23 Shutdown Normal (0)

    ----------------- MUTIL v1.12 A36 2017/11/15 Sun, Nov 19 2017 (loglevel 3)
    + Nov 19 04:29:24 Startup using mailin.ini
    - Nov 19 04:29:24 EXEC ImportEchoMail
    - Nov 19 04:29:24 EXEC FileToss
    + Nov 19 04:29:24 Process: Toss FDN/TIC Files
    + Nov 19 04:29:24 Scanning Hatches
    + Nov 19 04:29:24 Results: 0 import, 0 toss, 0 hatch, 0 bad in 0.00s
    + Nov 19 04:29:24 Process: Importing EchoMail
    + Nov 19 04:29:24 Waiting for BUSY nodes
    + Nov 19 04:30:25 Results: Cannot import. Some nodes are BUSY in 60.84s
    ! Nov 19 04:30:25 Status: FATAL
    + Nov 19 04:30:25 Shutdown Normal (0)

    ----------------- MUTIL v1.12 A36 2017/11/15 Sun, Nov 19 2017 (loglevel 3)
    + Nov 19 04:30:26 Startup using mailin.ini
    - Nov 19 04:30:26 EXEC ImportEchoMail
    - Nov 19 04:30:26 EXEC FileToss
    + Nov 19 04:30:26 Process: Toss FDN/TIC Files
    + Nov 19 04:30:26 Scanning Hatches
    + Nov 19 04:30:26 Results: 0 import, 0 toss, 0 hatch, 0 bad in 0.00s
    + Nov 19 04:30:26 Process: Importing EchoMail
    + Nov 19 04:30:26 Waiting for BUSY nodes
    + Nov 19 04:31:27 Results: Cannot import. Some nodes are BUSY in 60.84s
    ! Nov 19 04:31:27 Status: FATAL
    + Nov 19 04:31:27 Shutdown Normal (0)

    [snip]

    ov 19 04:31:27 Scanning 21:1/165
    Nov 19 04:31:27 21:1/165 is busy; skipping
    Nov 19 04:31:27 Scanning 21:1/166
    Nov 19 04:31:27 21:1/166 is busy; skipping
    Nov 19 04:31:27 Scanning 21:1/167
    Nov 19 04:31:27 21:1/167 is busy; skipping
    Nov 19 04:31:27 Scanning 21:1/168
    Nov 19 04:31:27 21:1/168 is busy; skipping
    Nov 19 04:31:27 Scanning 21:1/169
    Nov 19 04:31:27 21:1/169 is busy; skipping
    Nov 19 04:31:27 Scanning 21:1/171
    Nov 19 04:31:27 21:1/171 is busy; skipping
    Nov 19 04:31:27 Scanning 21:1/170
    Nov 19 04:31:27 21:1/170 is busy; skipping
    Nov 19 04:31:27 Scanning 21:1/172
    Nov 19 04:31:27 21:1/172 is busy; skipping
    Nov 19 04:31:27 Scanning 21:1/173
    Nov 19 04:31:27 21:1/173 is busy; skipping
    Nov 19 04:31:27 Scanning 21:1/174
    Nov 19 04:31:27 21:1/174 is busy; skipping
    Nov 19 04:31:27 Scanning 21:1/175
    Nov 19 04:31:27 21:1/175 is busy; skipping
    Nov 19 04:31:27 Scanning 21:1/176

    [snip]

    I'll zip up the current snapshot of all logs so you can review what I have. Look for fsxnet-hub-busy-20171119.zip

    Best, Paul

    --- Mystic BBS v1.12 A36 2017/11/15 (Windows/32)
    * Origin: Agency BBS | telnet://agency.bbs.geek.nz (21:1/101)
  • From g00r00@21:1/108 to Avon on Saturday, November 18, 2017 02:53:48
    Nov 17 02:16:41 Error receiving data (RB Socket read buffer result 0 Size=29/20334)

    I have a new MIS2 and FIDOPOLL with some adjustments in this area in the prealpha directory if you'd like to test it.

    --- Mystic BBS v1.12 A36 2017/11/18 (Windows/32)
    * Origin: Sector 7 [Mystic BBS WHQ] (21:1/108)
  • From Avon@21:1/101 to g00r00 on Sunday, November 19, 2017 18:10:26
    On 11/18/17, g00r00 pondered and said...

    I have a new MIS2 and FIDOPOLL with some adjustments in this area in the prealpha directory if you'd like to test it.

    Thank you... will update in the coming hours and send logs in the morning..

    --- Mystic BBS v1.12 A36 2017/11/15 (Windows/32)
    * Origin: Agency BBS | telnet://agency.bbs.geek.nz (21:1/101)
  • From Avon@21:1/100 to g00r00 on Sunday, November 19, 2017 19:49:33
    On 11/18/17, g00r00 said the following...

    I have a new MIS2 and FIDOPOLL with some adjustments in this area in the prealpha directory if you'd like to test it.

    OK have updated 1/100 with these now, should ahve 12 hours logs to send you
    in the morning if you would like them? Just realised I'd better check instead of assuming you want all this logging I keep zipping up and sending you :)

    Best, Paul

    --- Mystic BBS v1.12 A36 2017/11/15 (Windows/32)
    * Origin: fsxNet HUB (21:1/100)
  • From g00r00@21:1/108 to Avon on Sunday, November 19, 2017 11:48:10
    I have a new MIS2 and FIDOPOLL with some adjustments in this area in prealpha directory if you'd like to test it.

    Thank you... will update in the coming hours and send logs in the morning..

    One thing I noticed about the recent BUSY issue is that there is not any logging. I thought I added in some logging so we can line up the BUSY create/delete to see what happens, but it appears it isn't actually showing up in the logs. Doh!

    I'll try to make a new version with better BUSY debugging soon but right now everything is in shambles.

    --- Mystic BBS v1.12 A36 2017/11/18 (Windows/32)
    * Origin: Sector 7 [Mystic BBS WHQ] (21:1/108)
  • From g00r00@21:1/108 to Avon on Sunday, November 19, 2017 11:49:28
    OK have updated 1/100 with these now, should ahve 12 hours logs to send you in the morning if you would like them? Just realised I'd better
    check instead of assuming you want all this logging I keep zipping up
    and sending you :)

    Up to you! I'll take a look through them as you send them. I'll pay more attention if we know there has been a BUSY problem, otherwise ill probably
    just search for a few key words.

    --- Mystic BBS v1.12 A36 2017/11/18 (Windows/32)
    * Origin: Sector 7 [Mystic BBS WHQ] (21:1/108)
  • From Caphood@21:1/112 to g00r00 on Sunday, November 19, 2017 13:10:05
    I'll try to make a new version with better BUSY debugging soon but right now everything is in shambles.

    Wow "shambles" Thats strong language. It can mean many things but Merriam-Webster has some zingers for it.

    Definition of shambles. 1 archaic :a meat market. 2 :slaughterhouse.
    3 :a place of mass slaughter or bloodshed. the battlefield became a shambles.

    ²ß±ß° {tHE.pIRATE.kING} ÜßÛßÜ
    ß²Ü°ß HOIST THE COLOURS ßÛÜÛß
    þþþ NEVER SHALL WE DIE. ÝÝÞ

    --- Mystic BBS v1.12 A35 (Windows/32)
    * Origin: ACiD Telnet HQ / blackflag.acid.org (21:1/112)
  • From Avon@21:1/101 to g00r00 on Monday, November 20, 2017 07:38:24
    On 11/19/17, g00r00 pondered and said...

    Up to you! I'll take a look through them as you send them. I'll pay
    more attention if we know there has been a BUSY problem, otherwise ill probably just search for a few key words.

    All good, just got up so sending first 12 hours of data now... will send updated logs if error occurs or again in 24 hours..

    --- Mystic BBS v1.12 A36 2017/11/15 (Windows/32)
    * Origin: Agency BBS | telnet://agency.bbs.geek.nz (21:1/101)