I have an experimental fork of Synchronet v3.17 which uses SMB for the filebases. Not only does this resolve the 8.3 filename storage format limitation, but it solves a lot of capacity (e.g. description and extended description lenghts) and performance problems with the current filebase format and potentially enables msgbase-like features, like discussion threads, voting, etc.
My plan (always subject to change) is to merge that fork in after a v3.17c release and then make that version (with SMB filebases) a v3.18 release, hopefully sometime this year.
After that, SMB itself is possibly on the chopping block. I've been looking at sqlite and mentally planning out a database schema for the basis of a new SMB format. Utilizing sqlite should, at least in theory, resolve current capacity (e.g. 2+GB msgbase), concurrency/corruption and performance issues. In addition, anyone familiar with sqlite should be able to write their own msgbase tools in a wide variety of languages.
Any SQL database experts out there? This would be my first programming project utilizing one.
On 04-08-19 15:56, Digital Man wrote to All <=-
I have an experimental fork of Synchronet v3.17 which uses SMB for the filebases. Not only does this resolve the 8.3 filename storage format
After that, SMB itself is possibly on the chopping block. I've been looking at sqlite and mentally planning out a database schema for the basis of a new SMB format. Utilizing sqlite should, at least in theory, resolve current
Any SQL database experts out there? This would be my first programming project utilizing one.
That could be interesting.. Though, I'm not sure how much people would want could upvote or downvote files depending on how valuable/interesting or how useless/bad they think the file is.
I see some downloads from my BBS occasionally, but these days I don't think people download from BBSes very often. And, although I could see there being some sort of reward system for users (i.e., file ratios changing) based on file upvotes/dovnvotes, that might not be as relevant these days as it once was back in the 80s/90s.
My plan (always subject to change) is to merge that fork in after a v3.17c release and then make that version (with SMB filebases) a v3.18 release, hopefully sometime this year.
After that, SMB itself is possibly on the chopping block. I've been looking at sqlite and mentally planning out a database schema for the basis of a new SMB format. Utilizing sqlite should, at least in theory, resolve current capacity (e.g. 2+GB msgbase), concurrency/corruption and performance issues. In addition, anyone familiar with sqlite should be able to write their own msgbase tools in a wide variety of languages.
Do you think the JavaScript interface in regards to the MessageBase class, voting functions, message header objects, etc. will change significantly?
Any SQL database experts out there? This would be my first programming project utilizing one.
I wouldn't consider myself an 'expert' per se, but I've done some SQL work in the past with C++. It's been about 12 years, but I worked on a project where we used MySQL and PostgreSQL and interfaced to it with C++ (and Perl), and we had a heirarchy of database classes in C++ to interface with either one (a parent class, and child classes derived from it to interface specifically with MySQL and PostgreSQL using their respective C/C++ libraries).
Any SQL database experts out there? This would be my first programming project utilizing one.
looking at sqlite and mentally planning out a database schema for the basis of a new SMB format. Utilizing sqlite should, at least in theory, resolve current capacity (e.g. 2+GB msgbase), concurrency/corruption and performance issues. In addition, anyone familiar with sqlite should be
Re: The future of SMB (Synchronet Message Base)
By: Digital Man to All on Mon Apr 08 2019 15:56:20
Any SQL database experts out there? This would be my first programming project utilizing one.
Not an expert by any means, but I use SQL in many projects and interact with SQL DBs just about every day.
looking at sqlite and mentally planning out a database schema for the basis of a new SMB format. Utilizing sqlite should, at least in theory, resolve current capacity (e.g. 2+GB msgbase), concurrency/corruption and
performance issues. In addition, anyone familiar with sqlite should be
Would like to hear more about your plans for the schema. I can see this making
it much easier to get messages-by-thread, and that alone would be great.
If, while you're at it, you can add some sqlite stuff to the JS environment (not just abstracted SMB-related things) that would be handy. Even if it's just the bare minimum, stuff written in JS could be layered on top of that.
I have an experimental fork of Synchronet v3.17 which uses SMB for the filebases. Not only does this resolve the 8.3 filename storage format limitation, but it solves a lot of capacity (e.g. description and extended description lenghts) and performance problems with the current filebaseformat
and potentially enables msgbase-like features, like discussion threads,voting,
etc.at
My plan (always subject to change) is to merge that fork in after a v3.17c release and then make that version (with SMB filebases) a v3.18 release, hopefully sometime this year.
After that, SMB itself is possibly on the chopping block. I've been looking
sqlite and mentally planning out a database schema for the basis of a new SMB format. Utilizing sqlite should, at least in theory, resolve current capacity (e.g. 2+GB msgbase), concurrency/corruption and performance issues. In addition, anyone familiar with sqlite should be able to write their ownmsgbase
tools in a wide variety of languages.project
Any SQL database experts out there? This would be my first programming
utilizing one.
Synchronet "Real Fact" #44:
Synchronet added JavaScript suppport with v3.10a (2001).
---
Well I'm mainly interested in input from folks who've designed database schemas. I want to try to do it right the first time (naturally).
First, i think that you need to avoid the Libraries->Directories->files
(two level limited) and migrate to Tree data (more like filesystem)
Oh, definitely. Ragnarok already did that and has a patch set available somewhere for it... I haven't really looked at his implementation yet, but I will.that patch was a simple prof of concept only and very minial funcionality.
digital man
El 9/4/19 a las 02:03, Digital Man escribió:
Well I'm mainly interested in input from folks who've designed database schemas. I want to try to do it right the first time (naturally).
First, i think that you need to avoid the Libraries->Directories->files
(two level limited) and migrate to Tree data (more like filesystem)
files table
------------
id
name
description
directory_id
upload_by
upload_date
etc...
directories table
---------------
id
name
descripcion
parent_directory_id
etc....
I think we're talking about 2 different things. I'm talking about the database schema for a single message area.
Re: Re: The future of SMB (Synchronet Message Base)
By: Digital Man to Ragnarok on Wed Apr 10 2019 11:22 am
I think we're talking about 2 different things. I'm talking about the database schema for a single message area.
Msg areas should be in one table; messages in another.
Message Area
------------
ID
Name
OtherField1
OtherField2
...
Messages
------------
ID
Message Area ID
Subject
....
Msg areas should be in one table; messages in another.
I'm not talking about message area configuration / organization. That's a completely different matter and not something I considering using sqlite for at this time.
Re: Re: The future of SMB (Synchronet Message Base)
By: Digital Man to dmxrob on Wed Apr 10 2019 03:47 pm
Msg areas should be in one table; messages in another.
I'm not talking about message area configuration / organization. That's a completely different matter and not something I considering using sqlite for at this time.
<shrug> Well you asked for input.
If you are considering a 1:1 relationship for tables to store messages to message areas I would say that is something I wouldn't do.
If, while you're at it, you can add some sqlite stuff to the JS environment (not just abstracted SMB-related things) that would be handy. Even if it's just the bare minimum, stuff written in JS could be layered on top of that.
I'm inquiring if anyone here has experience constructing SQL database schemas for the purpose of a new messagebase format. Message base organization is a different matter.
I'm not really sure what you're saying. What I'm planning is a single database per message area, just like it is done today with SMB and most other "modern" message base formats (e.g. JAM and Squish). I think Hudson MB stores multiple message areas in a single database file, but I could be wrong.
I'm not really sure what you're saying. What I'm planning is a
single database per message area, just like it is done today with
SMB and most other "modern" message base formats (e.g. JAM and
Not sure you would want a single database for each message area... thinking that is a little overboard. You need a single database, let's call it 'SynchroDB'. In that database, you would have a table called 'msgbase' where everything is stored, headers and messages. SQLlite should be able to handle multiple queries and filters when requesting data from any table from the database. Not sure if there would be a need to have a seperate 'headers', 'index' and 'messages' tables.
Re: Re: The future of SMB (Synchronet Message Base)
By: Digital Man to dmxrob on Wed Apr 10 2019 06:41 pm
I'm inquiring if anyone here has experience constructing SQL database schemas for the purpose of a new messagebase format. Message base organization is a different matter.
The files that SQLlite will store in will be it's own format if that is what you are asking... other than that, you just need the table layouts with data formats.
I think dmxrob(?) was suggesting a possible layout for the table(s) structure to use.
I've got lots of experience with different SQL databases if you need any help.
that is a little overboard. You need a single database, let's call it 'SynchroDB'. In that database, you would have a table called 'msgbase' where everything is stored, headers and messages. SQLlite should be able to handle multiple queries and filters when requesting data from any table from the
Re: Re: The future of SMB (Synchronet Message Base)
By: Digital Man to dmxrob on Wed Apr 10 2019 06:41 pm
I'm not really sure what you're saying. What I'm planning is a single database per message area, just like it is done today with SMB and most other "modern" message base formats (e.g. JAM and Squish). I think Hudson MB stores multiple message areas in a single database file, but I could be wrong.
Not sure you would want a single database for each message area...
thinking that is a little overboard. You need a single database, let's call it
'SynchroDB'. In that database, you would have a table called 'msgbase' where everything is stored, headers and messages.
SQLlite should be able to handle
multiple queries and filters when requesting data from any table from the database. Not sure if there would be a need to have a seperate 'headers', 'index' and 'messages' tables.
Re: Re: The future of SMB (Synchronet Message Base)
By: High Spirit to Digital Man on Thu Apr 11 2019 03:58 pm
that is a little overboard. You need a single database, let's call it 'SynchroDB'. In that database, you would have a table called 'msgbase' where everything is stored, headers and messages. SQLlite should be able to handle multiple queries and filters when requesting data from any table from the
EXACTLY! The thought of having multiple database files makes me cringe.
However, for something like a messagebase, I wonder if it would make sense to have a separate one for each message area. If there's only a single table where all headers and messages are stored, I'd wonder if it could potentially take some time to filter the messages to show only the ones relevant for the message area the user is reading. Having a separate
I was wondering about that.. From what I've seen, a single SQL database would contain multiple tables organizing the data.
However, for something like a messagebase, I wonder if it would make sense to have a separate one for each message area. If there's only a single table where all headers and messages are stored, I'd wonder if it could potentially take some time to filter the messages to show only the ones relevant for the message area the user is reading. Having a separate database for each message area could eliminate that issue. Unless perhaps there's a way to have a 'msgbase' table like you describe and be able to have a separate instance of that table for each message area.. I'm not sure that's possible with an SQL database.
I'm asking if anyone has experience defining SQL database schemas. I'm not talking about the actual sequence of bytes in the database file.
Sure: how did you handle data de-duplication?
That's how it's traditionaly done. And I think it's a good design.
It sounds like you're suggestion is that all message areas are stored in a single shared database. I think there are lot of very good reasons not do that.
That's how all modern BBS software works, including Synchronet. Each message base (and filebase, for that matter) is a separate file (or set of files). What exactly about that makes you "cringe"?
Re: Re: The future of SMB (Synchronet Message Base)
By: Digital Man to High Spirit on Thu Apr 11 2019 03:51 pm
I'm asking if anyone has experience defining SQL database schemas. I'm not talking about the actual sequence of bytes in the database file.
I am not sure what schemas were before databases of today, but with today's databases it refers to as a blueprint of how the database tables are laid out.
Sure: how did you handle data de-duplication?
You either are doing this for eliminating duplicate messages getting added to the database (either by posting or importing by QWK network or something) or to preserve storage space. I am going to eliminate storage space as we do not have storage limitations when it comes to text messages these days...
so
I am assuming preventing duplicate messages.
Re: Re: The future of SMB (Synchronet Message Base)
By: Digital Man to High Spirit on Thu Apr 11 2019 03:57 pm
That's how it's traditionaly done. And I think it's a good design.
It sounds like you're suggestion is that all message areas are stored in a single shared database. I think there are lot of very good reasons not do that.
Back then it probably was the best way to do it, but databases are quite sophisticated these days and you can do pretty much anything you want with them.
Re: Re: The future of SMB (Synchronet Message Base)
By: Digital Man to dmxrob on Thu Apr 11 2019 03:59 pm
That's how all modern BBS software works, including Synchronet. Each message base (and filebase, for that matter) is a separate file (or set of files). What exactly about that makes you "cringe"?
But all "modern" bbs's are running from code derived from the time before databases of today.
If you give a 15 year old today (with experience in
programming) and ask him to write a BBS from scratch (if he knew what that was), he would store the data in SQLite or MySQL and make it efficient as possible. Jumping from database to database (each time, disconnecting and reconnecting)
and then from table to table (if need be) just to find a
message that is sent to you would take up system resources and will probably take a lot longer than having everything in one location and asking the database to get "all new messages addressed to me" from one database call. I can search for a single word and it will come back with all the records with that word.
Not sure you would want a single database for each message area...
That's how it's traditionaly done. And I think it's a good design.
that patch was a simple prof of concept only and very minial funcionality.
This use libsqlite and export js object that represent the database.
https://bbs.docksud.com.ar/~ragnarok/sync/js_sqlite/
I'm assuming indexes are not tables.
Also, like I mentioned before, it's perfectly reasonable and expected to have multiple messages share the same body/text data (e.g. sending an email to 100 users on the BBS should NOT result in 100 copies of the message text), so that dictates that a separate table for message data/text is needed and a reference schema (separate cross-reference table) be implemented.
Yeah, but lets say I want to be able to have all my fidonet messages stored on a different file system (e.g. disk drive) than all my dovenet messages. How do I achieve that if all message areas are stored in the same database?
Let's say I just want to delete all my usenet message bases quickly and easily. How do I do that if they're all stored in the same database as all of my other message areas?
I'm not saying you couldn't store all the message bases in a single database. I'm saying the advantages of storing each message area in a unique database outweighs any disadvantages. You likely don't know, but if you run Synchronet, you already have a unigue database for each message and file area. It works pretty good, yeah? There are many things that are easy for a sysop to do *because* of that design. I don't know of any advantage to having *all* the message areas stored in a single database. Sounds like a single-point-of-failure waiting to happen.
SQL databases first appeared in the 1970's. And plenty of BBS programs were developed much more recently and had all the "modern database of today" to choose to use (or not).
SQLite has no concept of "connect" or "disconnect". And its certainly possible for an application to open multiple databases (or "connect" if that were the case) concurrently.
When you design your own BBS software, go for it. With Synchronet, not every user has access to every message base, and each user can choose which message bases to include or exclude (from those they have access to) in any searches or scans. The separate base per area model fits the access model well and provides a lot of other management benefits that sysops enjoy.
it is a good design. if there's an issue with the msg bases it makes backup/recovery very easy. lets say i dont want a msg network anymore. i can just delete it in the setup program and the data isnt being removed from some huge communial data file.
EXACTLY! The thought of having multiple database files makes me cringe.That's how all modern BBS software works, including Synchronet. Each message base (and filebase, for that matter) is a separate file (or set of files). What exactly about that makes you "cringe"?
Yeah, but lets say I want to be able to have all my fidonet messages stored on a different file system (e.g. disk drive) than all my dovenet messages. How do I achieve that if all message areas are stored in the same database?
Let's say I just want to delete all my usenet message bases quickly and easily. How do I do that if they're all stored in the same database as all of my other message areas?
I'm not saying you couldn't store all the message bases in a single database. I'm saying the advantages of storing each message area in a unique database outweighs any disadvantages. You likely don't know, but if you run
Not sure you would want a single database for each message area...
That's how it's traditionaly done. And I think it's a good design.
it is a good design. if there's an issue with the msg bases it makes backup/recovery very easy. lets say i dont want a msg network anymore. i can just delete it in the setup program and the data isnt being removed from some huge communial data file.
Re: Re: The future of SMB (Synchronet Message Base)
By: Digital Man to High Spirit on Thu Apr 11 2019 03:57 pm
Somehow I missed this part of the message... ;)
I'm assuming indexes are not tables.
No, indexes are not tables, but they are part of the internal database schema. You can set a field in the table to be indexed. In the backend of the database when you search a field for a value (number, string, etc) you are looking for, it searches the index for what you are looking for rather then the entire database hence making results return much faster. If you search a field that is not indexed, it will search the entire field in the database.
Also, like I mentioned before, it's perfectly reasonable and expected to have multiple messages share the same body/text data (e.g. sending an email to 100 users on the BBS should NOT result in 100 copies of the message text), so that dictates that a separate table for message data/text is needed and a reference schema (separate cross-reference table) be implemented.
With Gigabytes and Tarabytes of storage these days, do you need to worry about having a 100 copies of a screen of text?
Let's do a standard 80x25
screen, that's 2000 bytes. Lets give a little bit of overhead for header information and maybe color codes, etc so lets say 250 bytes... 100 copies of the same message is only going to use 250,000 bytes. A drop in the bucket in today's storage limitations.
But... de-duplication could be done if you wanted too... but this would have to be handled manually and there are a few ways to do this. Here is how I would have a multi-recipient message laid out:
The multi-recipient message would be stored in the table like any other message, but a field in the table could refer to another table of listed users who should get this message. This might be a little sloppy as the new table would require a record for each user's id and message id. But it will not take up as much space as a copy of the message itself.
Another option would be to do break up the header and message body, kinda like you do now with the files. Each recipient would get a copy of the header part, but and it would link to a single body. Like so:
Table: MSGHeaders
id (each record contains a unique record id)
user_id (user id #)
msgbody_id (# linking to the id field in the MSGBodies table)
headerdata1
headerdata2
headerdata3
...
Table: MSGBodies
id (unique record id)
body (body text)
...
Just out of curiosity, how does Synchronet handle open message base files now?
I'm not an expert at that sort of thing, though, but I think if you came at it from the other angle (not the "this is the way BBSes do it" end of it), you may find that few (if anyone?) separates their application out in to multiple databases.
Re: Re: The future of SMB (Synchronet Message Base)
By: Digital Man to High Spirit on Thu Apr 11 2019 08:05 pm
I'm replying mostly for the exercise... there could well be better ways of doing the things I'm suggesting.
Yeah, but lets say I want to be able to have all my fidonet messages stored
on a different file system (e.g. disk drive) than all my dovenet messages.
How do I achieve that if all message areas are stored in the same database?
I don't think that would fall in to the design, generally. You'd build storage to hold the database. Maybe you store message attachments on-disk and separate that out.
Let's say I just want to delete all my usenet message bases quickly and easily. How do I do that if they're all stored in the same database as all
of my other message areas?
delete from messages where msg_base like 'usenet%';
Or, enhance scfg to take on those tasks?
I'm not saying you couldn't store all the message bases in a single database. I'm saying the advantages of storing each message area in a unique
database outweighs any disadvantages. You likely don't know, but if you run
I think some of this has just been answered by others (as I believe there's pretty strong technical reasons for not trying to connect to many different databases), so I'm looking forward to seeing how that discussion proceeds.
After that, SMB itself is possibly on the chopping block. I've been looking at sqlite and mentally planning out a database schema for the basis of a new SMB format. Utilizing sqlite should, at least in
theory, resolve current capacity (e.g. 2+GB msgbase), concurrency/corruption and performance issues. In addition, anyone familiar with sqlite should be able to write their own msgbase tools
in a wide variety of languages.
Any SQL database experts out there? This would be my first programming project utilizing one.
Table: MSGHeaders
id (each record contains a unique record id)
user_id (user id #)
msgbody_id (# linking to the id field in the MSGBodies table)
headerdata1
headerdata2
headerdata3
...
Table: MSGBodies
id (unique record id)
body (body text)
...
That's similar to the table approach I already posted here.
What I don't know is how does a "MSGBodies" table row get automatically deleted when the last referencing msg header is deleted?
Just out of curiosity, how does Synchronet handle open message base
files now?
Multiple readers, single writer. For details, see http://synchro.net/docs/smb.html
I think you'd find opposite. Unix NNTP servers store users and messages in multiple databases. Windows Exchange servers store users and messages in multiple databases.
I'm not really sure what you're saying. What I'm planning is a single
database per message area, just like it is done today with SMB and
most other "modern" message base formats (e.g. JAM and Squish). I
think Hudson MB stores multiple message areas in a single database
file, but I could be wrong.
Not sure you would want a single database for each message area... thinking that is a little overboard. You need a single database, let's call it 'SynchroDB'. In that database, you would have a table called 'msgbase' where everything is stored, headers and messages.
SQLlite should be able to handle multiple queries and filters when requesting data from any table from the database.
Not sure if there would be a need to have a seperate 'headers',
'index' and 'messages' tables.
A simple SQL call like the following would get you all the headers in message base 5:
SELECT id, from, to, subject, date FROM `msgbase` WHERE msgbase_id = 5;
Just need to parse the data it returns.
If the indexes are setup correctly it can be rather effeciant.
that is a little overboard. You need a single database, let's call it
'SynchroDB'. In that database, you would have a table called 'msgbase'
where everything is stored, headers and messages. SQLlite should be
able to handle multiple queries and filters when requesting data from
any table from the
EXACTLY! The thought of having multiple database files makes me cringe.
SQLlite should be able to handle multiple queries and filters when
requesting data from any table from the database. Not sure if there
would be a need to have a seperate 'headers', 'index' and 'messages'
tables.
I'm assuming indexes are not tables.
Also, like I mentioned before, it's perfectly reasonable and expected
to have multiple messages share the same body/text data (e.g. sending
an email to 100 users on the BBS should NOT result in 100 copies of
the message text), so that dictates that a separate table for message data/text is needed and a reference schema (separate cross-reference table) be implemented.
Databases are very effecient these days. With the computing power and memory available to them now you can have a huge database
and as I said in another message, as long as the index pointers are
set correctly, searching for messages in a certian message base or filtering by messages from/to users or even searching keywords in a database with thousands of messages wouldn't take much time if it was
even noticeable.
The problem with having a database for each message board is not
really how a database interface like SQLite was meant to be used and managing it would not be as easy. You _COULD_ have a table for each message board but even that is a little overkill. I single field in
the table to designate what subboard the message belongs to is all it would take.
so I am assuming preventing duplicate messages. One way I have
prevented duplication in databases is to create a hash of the text and insert it into the record with the message.
An MD5 hash of the text: "This is a test" generates an MD5 value of: ce114e4501d2f4e2dcea3e17b546f339. Now if we wanted to check for
duplicate messages, we can now search the database for anything with
that hash. If we come back with a record we have a duplicate message
in the table...
and here is where having all messages in one table will come in handy
as it will only need to check the one table in the database to find a duplicate. If we have a database for each message base, each database
will need to be checked if we are doing a entire database check. Even
a single database with message boards in multiple tables would require
a database call for each.
Jumping from database to database (each time, disconnecting and reconnecting) and then from table to table (if need be) just to find a message that is sent to you would take up system resources and will probably take a lot longer than having everything in one location and asking the database to get "all new messages addressed to me" from one database call.
If you give a 15 year old today (with experience in programming) and
ask him to write a BBS from scratch (if he knew what that was), he
would store the data in SQLite or MySQL and make it efficient as
possible. Jumping from database to database (each time, disconnecting
and reconnecting)
SQLite has no concept of "connect" or "disconnect". And its certainly possible for an application to open multiple databases (or "connect"
if that were the case) concurrently.
With Gigabytes and Tarabytes of storage these days, do you need to worry
Another option would be to do break up the header and message body, kinda like you do now with the files. Each recipient would get a copy of the header part, but and it would link to a single body. Like so:
Let's say I just want to delete all my usenet message bases quickly
and easily. How do I do that if they're all stored in the same
database as all of my other message areas?
Let's say everything is stored in the same data base. Your usenet
message base id # is 5. "DELETE FROM messages WHERE msgbaseid = 5;"
This will delete all the messages in the table that belong to message
base 5. Run for each usenet message base you are deleting.
it is a good design. if there's an issue with the msg bases it makes
backup/recovery very easy. lets say i dont want a msg network
anymore. i can just delete it in the setup program and the data isnt
being removed from some huge communial data file.
That is a good point. SQLite stores everything in one file. Other
database programs (MySQL, etc) will store a database in a folder and
each table has it's own couple files.
As simplistic as SQLite is, it could be a good way to go... I have
used SQLite a little bit in the past years, but all my clients
required something with more power. :)
you may find that few (if anyone?) separates their application out in
to multiple databases.
What I don't know is how does a "MSGBodies" table row get
automatically deleted when the last referencing msg header is deleted?
That's how all modern BBS software works, including Synchronet. Each message base (and filebase, for that matter) is a separate file (or set of files). What exactly about that makes you "cringe"?
The problem with having a database for each message board is not really how a database interface like SQLite was meant to be used and managing it would not be as easy. You _COULD_ have a table for each message board but even
But all "modern" bbs's are running from code derived from the time before databases of today. If you give a 15 year old today (with experience in programming) and ask him to write a BBS from scratch (if he knew what that
sysop to do *because* of that design. I don't know of any advantage to having *all* the message areas stored in a single database. Sounds like a single-point-of-failure waiting to happen.
searches or scans. The separate base per area model fits the access model well and provides a lot of other management benefits that sysops enjoy.
I think you'd find opposite. Unix NNTP servers store users and messages in multiple databases. Windows Exchange servers store users and messages in multiple databases.
That is because Active Directory (users) and Exchange (email) are two different things. You cannot have an Exchange server without having an Active Directory.
Message attachments are encoded within the message text. That's how message attachments work on the Internet (whether it's SMTP, IMAP, POP3, or NNTP).
I haven't seen any "strong technical reasons" for a single monolithic database that you seem to be advocating for.
will it also compress the data files and regain that storage space that is now freed? when i/we delete stuff, we expect to see the available drive space go up
and the file sizes of the database get smaller... we call that "pack and purge"
even though it kinda works the other way... purge (aka delete) the old messages
and then pack the message base to reduce the size...
As mentioned elsewhere, when one can slap a multi-terabyte system
together cheaply and easily these days, what is the concern, exactly?
will it also compress the data files and regain that storage space
that is now freed? when i/we delete stuff, we expect to see the
available drive space go up and the file sizes of the database get
smaller... we call that "pack and purge" even though it kinda works
the other way... purge (aka delete) the old messages and then pack
the message base to reduce the size...
Not sure if SQLite has the capability for pack and purge.. as I said
in other messages, I have not used SQLite enough to see how everything works. All my projects, I have used MySQL and MSSQL.
The current SMB format does not have any built-in capiabilities to
pack the database and in order to pack the database it would have to
go through each record one at a time and write to new SMB files and
then delete the old and rename the new...
not sure if this is actually done with the current format or not...
but something simlar could be done with the SQLite database (if there
is no compact feature) and read through each record and then write to
a new database, delete and rename when complete.
read the above... being able to recreate lost data is essential... if you lose one table, to you really want to have lost thousands and thousands of historical messages?
but can it scale? that's the real question... remember, right now we can
will it also compress the data files and regain that storage space that is now freed? when i/we delete stuff, we expect to see the available drive space go up
and the file sizes of the database get smaller... we call that "pack and purge"
even though it kinda works the other way... purge (aka delete) the old messages
and then pack the message base to reduce the size...
that depends on the storage engine being used... MySQL's MyISAM does that, IIRC, but their InnoDB does not... it uses one or two files for
you may find that few (if anyone?) separates their application out in to multiple databases.
remember us old folks and old-school terminology... today's tables are our databases from yesteryear ;)
Databases are very effecient these days. With the computing power
and memory available to them now you can have a huge database
do you really want to have to have 64+Gig of RAM just to run a BBS using a database format? look at what we have today using proprietary databases and running in much much less memory...
and as I said in another message, as long as the index pointers are
set correctly, searching for messages in a certian message base or
filtering by messages from/to users or even searching keywords in a
database with thousands of messages wouldn't take much time if it
was even noticeable.
true... at the expense of memory and scalability...
so I am assuming preventing duplicate messages. One way I have
prevented duplication in databases is to create a hash of the text
and insert it into the record with the message.
what do you do about hash collisions? the MD5 table space is not infinite and we can easily create messages with different message bodies having the same MD5
checksum... heck, i just saw, the other day, a malware that has the same MD5 for multiple types of files used to deliver it...
With Gigabytes and Tarabytes of storage these days, do you need to
worry
*terabytes... i don't think Tara's bites are storage... maybe kinky but certainly not storage ;)
Management is an issue, but when you start breaking things down into their own tables or even databases (as has been suggested) then getting any type of useful information out of them becomes near impossible.
If I am trying to run a query about how many messages a user has posted, for example, I either need to go through all these seperate tables or I need to run it against each and every database out there to gather the data. Something that should be a 2-second query turns into an entire project.
As mentioned elsewhere, when one can slap a multi-terabyte system together cheaply and easily these days, what is the concern, exactly?
efficiency... why throw storage space and gobs of memory at an inefficiency problem when it can be written to be faster and smaller which makes a lot more sense...
read the above... being able to recreate lost data is essential... if
you lose one table, to you really want to have lost thousands and
thousands of historical messages?
Isn't that a problem for appropriately backing up one's system,
regardless of the database mechanism?
but can it scale? that's the real question... remember, right now we
can
How many millions of queries per second is enough?
http://tinyurl.com/y6cbvc72
will it also compress the data files and regain that storage space that
is now freed? when i/we delete stuff, we expect to see the available
drive space go up and the file sizes of the database get smaller... we
call that "pack and purge" even though it kinda works the other way...
purge (aka delete) the old messages and then pack the message base to
reduce the size...
Is that actually true?
It seems to me that Synchro has to run an external utility to do that work, it surely isn't packing the message DB on every message
deletion?
In SQLite objects are just marked for deletion and one is apparently expected to run a vacuum command on the db periodically. Same thing occurs in MySQL, you optimize the tables periodically.
I'm not sure how this is a detriment to using SQL though. Just a (potential) difference in how the DB storage is handled.
that depends on the storage engine being used... MySQL's MyISAM does
that, IIRC, but their InnoDB does not... it uses one or two files for
InnoDB has the innodb_file_per_table option,
which has become default in MySQL and MariaDB.
you may find that few (if anyone?) separates their application out
in to multiple databases.
remember us old folks and old-school terminology... today's tables
are our databases from yesteryear ;)
Perhaps, yes! I consider myself an old fart these days, but I wasn't playing with RDBMSes prior to maybe 1998.
Databases are very effecient these days. With the computing power
and memory available to them now you can have a huge database
do you really want to have to have 64+Gig of RAM just to run a BBS
using a database format? look at what we have today using proprietary
databases and running in much much less memory...
Challenge accepted! LOL JK.
No way 64GB of ram is needed,
back when we had a few megabytes to work with, or even less, things
were written to work with the memory space available at the time.
My RPi with Sycnrhonet uses 160MB of ram with 375MB free, so I have
some room to run a small database program like SQLite.
[...] filtering by messages from/to users or even searching keywords
in a database with thousands of messages wouldn't take much time if
it was even noticeable.
true... at the expense of memory and scalability...
I am not sure if indexing the message bodies would be a good idea, but header information would help.
so I am assuming preventing duplicate messages. One way I have
prevented duplication in databases is to create a hash of the text
and insert it into the record with the message.
what do you do about hash collisions? the MD5 table space is not
infinite and we can easily create messages with different message
bodies having the same MD5 checksum... heck, i just saw, the other
day, a malware that has the same MD5 for multiple types of files used
to deliver it...
Synchronet is currently using a CRC32 hash.
With Gigabytes and Tarabytes of storage these days, do you need to
worry
*terabytes... i don't think Tara's bites are storage... maybe kinky
but certainly not storage ;)
I know a Tara... I could picture her being kinky...
uhh give me 5 mins please.
but can it scale? that's the real question... remember, right now we
can
How many millions of queries per second is enough?
http://tinyurl.com/y6cbvc72
The little SQL Engine that could. ;) But I do not see Synchronet ever needing to do millions of queries per second...
Another option would be to do break up the header and message body, kinda like you do now with the files. Each recipient would get a copy of the header part, but and it would link to a single body. Like so:
i think this is exactly what he was talking about... it could be done the same way in the current proprietary format, too...
Re: Re: The future of SMB (Synchronet Message Base)
By: Digital Man to High Spirit on Thu Apr 11 2019 08:11 pm
searches or scans. The separate base per area model fits the access model well and provides a lot of other management benefits that sysops enjoy.
Again, you asked for input - and two people (and I am sure more) who have worked with databases for many years are giving you our input and you just continue to either tell us we are wrong or tell us why it can't be done, or some combination thereof.
I don't see what you want at this point, other than a "you are right!" and so I am just going to remove myself from this conversation. It is clear that you have your mind made up, best of wishes with it.
yeah, probably not... i've stayed away from MSSQL ever since i saw folks struggling with it and trying to do something that was a two second job in other SQL databases...
I have actually been looking in to the SMB format and the Synchronet source code and it would work. Technically it is how it is doing it now. Open MSGBASE, gather data, Close MSGBASE, repeat. SQLite is just adding an interface to store and retrieve data.
works? how many people have a duplicate system they can test restorations on and be sure everything is correct? even in the corporate world, i've not seen any backups tested like this to make sure things are working right... not even the large multi-million dollar corps i've worked with wanted to purchase a second machineNot sure where you are working, but we have this in all our critical systems. Have for the 18+ years I've been here.
specifically and dedicated to restoration testing...
How many millions of queries per second is enough?
queries is one thing... i'm thinking more of writes... like when newslink runs and imports 10000 new news group posts... does everything else grind to a halt until all those new messages are imported and then the user sending him mother an email finally gets notice that his email was sent?
The little SQL Engine that could. ;) But I do not see Synchronet ever needing to do millions of queries per second...
How many millions of queries per second is enough?
http://tinyurl.com/y6cbvc72
The little SQL Engine that could. ;) But I do not see Synchronet ever needing to do millions of queries per second...
The little SQL Engine that could. ;) But I do not see Synchronet ever
needing to do millions of queries per second...
You never know.. What if there's a BBS that's very active, with many users online simultaneously? :P
Another option would be to do break up the header and message body,
kinda like you do now with the files. Each recipient would get a
copy of the header part, but and it would link to a single body.
Like so:
i think this is exactly what he was talking about... it could be done
the same way in the current proprietary format, too...
And that's exactly how SMB has worked in 1993. It's not new,
I'm just trying to figure out the best way to achieve the same thing
(many to one, msg headers to msg data) in a SQL database.
I guess I should probably just find a good book on the subject
(creating SQL databases).
I'm kind of sorry I brought it up here to be honest.
yeah, probably not... i've stayed away from MSSQL ever since i saw
folks struggling with it and trying to do something that was a two
second job in other SQL databases...
I've had MSSQL running for over 7 years now on one instance. Hasn't rebooted or been shut down once. Not. One. Time. It's solid and
almost up to snuff with Oracle in many areas nowadays.
works? how many people have a duplicate system they can test
restorations on and be sure everything is correct? even in the
corporate world, i've not seen any backups tested like this to make
sure things are working right... not even the large multi-million
dollar corps i've worked with wanted to purchase a second machine
specifically and dedicated to restoration testing...
Not sure where you are working,
but we have this in all our critical systems. Have for the 18+ years
I've been here.
How many millions of queries per second is enough?
queries is one thing... i'm thinking more of writes... like when
newslink runs and imports 10000 new news group posts... does
everything else grind to a halt until all those new messages are
imported and then the user sending him mother an email finally gets
notice that his email was sent?
Writes can be handled right alongside queries in any RDMS worth its salt.
please dont feel insulted because people don't agree with your idea.
one might look at it like that but let's take reality into account, ok?
in all your years, how many times have you taken a fresh backup of your system and immediately restored it to a duplicate system to make sure it works? how many people have a duplicate system they can test restorations on and be sure everything is correct? even in the corporate world, i've not seen any backups tested like this to make sure things are working right... not even the large multi-million dollar corps i've worked with wanted to purchase a second machine
queries is one thing... i'm thinking more of writes... like when newslink runs and imports 10000 new news group posts... does everything else grind to a halt until all those new messages are imported and then the user sending him mother an email finally gets notice that his email was sent?
http://tinyurl.com/y6cbvc72they're using a customized transaction layer, though...
expected to run a vacuum command on the db periodically. Same thing occurs in MySQL, you optimize the tables periodicallynever ever did that in 15 or 20 years of running phpBB on mysql... it may have done it automatically but i don't know...
i haven't been trying to point out detriments... only things that one might expect to do with a BBS system... i ran RemoteAccess BBS with FrontDoor and...
to do... especially back then when we only had the HMB (Hudson Message Base) format to work with... then along came JAM and the restrictions of the HMB were lifted overnight :)
i wasn't aware of that... the last time i set something up, i was freaked out that the database files were not where i was expecting to find them... then i discovered that InnoDB was being used by default instead of MyISAM... this was after spending hours making sure the data was imported
yeah, i read that MariaDB came from MySQL when oracle snarfed it up... i think i would have left, too ;)
they weren't even called relational databases at that time... much less (R)DBMSes... at least not in dBase and Foxbase* but you could easily relate two
databases together by an indexed field and yank the needed data as needed... none of this weird SQL muckity-muck :lol:
did 240 hours (!!) straight coding on that project... then went home and slept for four
days straight much to the chagrin of my bosses... c0ffee, cigs and pizza can only take you so far ;)
I am not sure if indexing the message bodies would be a good idea,
but header information would help.
there has to be an index in the message bodies for the headers to be linked to ;)
but no comment on hash collisions with different data giving the same hash? :) :) :)
and nothing about the malware using hash collisions specifically to hide itself? ;) ;) ;)
please dont feel insulted because people don't agree with your idea.
It's not about hurt feelings. It is what is called asking for input
and then shooting down everything that anyone mentions. You don't
want input, you want "attaboys!" at that point.
http://tinyurl.com/y6cbvc72they're using a customized transaction layer, though...
Yes, I know, but even if Synchro's method of using it could only produce, what, 100 write options per second, how many systems would that introduce any noticeable slowdown on - especially if batch queueing is done in appropriate priority?
doexpected to run a vacuum command on the db periodically. Samething
occurs in MySQL, you optimize the tables periodicallynever ever did that in 15 or 20 years of running phpBB on mysql... it
may have done it automatically but i don't know...
I believe it's only offered as an extension for PHPBB unless you want to
it by hand.
It's one of those maintenance things that should be done, but if you
have the storage space, you may never need to.
toi haven't been trying to point out detriments... only things that one...
might expect to do with a BBS system... i ran RemoteAccess BBS with
FrontDoor and
to do... especially back then when we only had the HMB (Hudson Message
Base) format to work with... then along came JAM and the restrictions
of the HMB were lifted overnight :)
Sure, fair enough... I'm just going through the discussion on what seems
be the sane thing to do with an SQL DB or not,
since I think some information needs to be disseminated if the DB
stuff in Synchro is going to go down that road.
It's obviously very preliminary, and I would hate to see some really obvious design flaws be developed in that would need to be corrected
in short order as the SQL experience level came up.
Actually, it wouldn't really matter all that much, but I've seen what
I've seen with SQL DBs over the years, just want to be sure some of
the concepts are understood where I can help with that.
I also think it's interesting that the idea of redesigning or re-back-ending the message base DBs has come forth anyway, as the
system seems to be pretty damn speedy all around. I don't have a ton
of messages on here being processed yet, but it's tough to complain
with a small system anyway. Would be interesting to see what kind of
load Vertrauen's under at times, if the load is even a thing.
they weren't even called relational databases at that time... much less
(R)DBMSes... at least not in dBase and Foxbase* but you could easily
relate two databases together by an indexed field and yank the needed
data as needed... none of this weird SQL muckity-muck :lol:
Huh, I wonder if that's why things evolved to be contained in a single database with a bunch of tables... interesting.
did 240 hours (!!) straight coding on that project... then went home
and slept for four days straight much to the chagrin of my bosses...
c0ffee, cigs and pizza can only take you so far ;)
240 hours? I'm sorry, I'm a wuss, don't think I could do that sort of thing without falling dead - even when I was on a coffee, smokes and pizza diet too! lol.
I am not sure if indexing the message bodies would be a good idea,
but header information would help.
there has to be an index in the message bodies for the headers to be
linked to ;)
I was refering to the indexing in the database to make data searches faster... :)
but no comment on hash collisions with different data giving the same
hash? :) :) :)
Did there need to be? I know there are collisions with MD5.
and nothing about the malware using hash collisions specifically to
hide itself? ;) ;) ;)
I am trying to figure out how a hash generated and stored in a
database could have anything to do with malware.
Are you saying malware detection programs may detect the hash
collision as a false-positive in memory?
Re: The future of SMB (Synchronet Message Base)
By: High Spirit to Va7aqd on Fri Apr 12 2019 04:12 pm
The little SQL Engine that could. ;) But I do not see Synchronet ever needing to do millions of queries per second...
I've had the opportunity to see how Wal-Mart does their register lookups. Amazing. Based on a parent-key database system and it rocks. The number of transactions they are doing per second was mind blowing. The response time was incredible.
Re: Re: The future of SMB (Synchronet Message Base)
By: MRO to dmxrob on Fri Apr 12 2019 04:35 pm
please dont feel insulted because people don't agree with your idea.
It's not about hurt feelings. It is what is called asking for input and then shooting down everything that anyone mentions. You don't want input, you want "attaboys!" at that point.
back room were always fighting something in the code... the few times i looked over their shoulders and asked about the problem, they were shocked when i told them how easy it was to do in some other language/RDBMS... i
unlock between each one... how long would it take to import 10000 messages into a database? 5 seconds? 30 seconds? 5 minutes? brings to mind my previous system that took 45+ minutes every night to perform the midnight
The little SQL Engine that could. ;) But I do not see Synchronet ever needing to do millions of queries per second...
Yes, I know, but even if Synchro's method of using it could only produce, what, 100 write options per second, how many systems would that introduce any noticeable slowdown on - especially if batch queueing is done in appropriate priority?100 writes per second??? that slow, hunh? i've got code here that was tested
true... i understand there is the sheer number of messages in storage going back decades...
insulting the real ones that fit such words... so anyway, spending days on a system was great... it didn't bitch, gripe or complain... it didn't beat on me or throw me around... the best part was that if it didn't do what i told it to do, i shut it off and rewrote it... it was a lot more satisfying than being around other humans... then i met a girl, got pregnant, joined the USAF and everything fell apart... i'm surprised i made it this far, TBH, but i'm, not the same person i was back then...
with a small system anyway. Would be interesting to see what kind of load Vertrauen's under at times, if the load is even a thing.true... i understand there is the sheer number of messages in storage going back decades...
with a small system anyway. Would be interesting to see what kind
of load Vertrauen's under at times, if the load is even a thing.
true... i understand there is the sheer number of messages in storage
going back decades...
I logged on to Vert earlier today to follow up on this, as I was
curious. I don't see many messages hanging around - are they archived somewhere?
dmxrob wrote to Digital Man <=-
Not to mention backup and restore considerations, etc.
They're just ideas. Performance, reliability, and extensibility. That's the main thing I'm after in any changes to any file formats (where it matters).
Well I'm mainly interested in input from folks who've designed database schemas. I want to try to do it right the first time (naturally).
Re: The future of SMB (Synchr
By: Digital Man to Nightfox on Mon Apr 08 2019 05:03 pm
They're just ideas. Performance, reliability, and extensibility. That's the main thing I'm after in any changes to any file formats (where it matters).
Are you thinking of using 1 sqlite database for all the message areas or are you thinking of having a sqlite file for each area specifically?
Well I'm mainly interested in input from folks who've designed database schemas. I want to try to do it right the first time (naturally).
I've been working with SQL for several years and have built out entire online CRM systems from scratch using PHP, MySQL, and more recently using Laravel and sqlite database files. I have also built Windows based client/server apps using Microsoft SQL Server on the backend, but that is way overkill compared to SQLite.
Needless to say, I would be happy to try and help in any way I can.
Re: The future of SMB (Synchr
By: Digital Man to Nightfox on Mon Apr 08 2019 05:03 pm
They're just ideas. Performance, reliability, and extensibility. That's the main thing I'm after in any changes to any file formats (where it matters).
Are you thinking of using 1 sqlite database for all the message areas or are you thinking of having a sqlite file for each area specifically?
Well I'm mainly interested in input from folks who've designed database schemas. I want to try to do it right the first time (naturally).
I've been working with SQL for several years and have built out entire online CRM systems from scratch using PHP, MySQL, and more recently using Laravel and sqlite database files. I have also built Windows based client/server apps using Microsoft SQL Server on the backend, but that is way overkill compared to SQLite.
Needless to say, I would be happy to try and help in any way I can.
Thanks,
Jason
I plan to do a lot more research before designing the database schema. You can see the other discussions here already on this topic. If you have anything valuable to add in regards to data de-duplication, I'm all ears.
Sysop: | sneaky |
---|---|
Location: | Ashburton,NZ |
Users: | 25 |
Nodes: | 8 (0 / 8) |
Uptime: | 143:27:13 |
Calls: | 1,905 |
Calls today: | 1 |
Files: | 11,079 |
Messages: | 935,116 |