Software design and considerations

The Central Information system is envisioned as a repository of information and experience: library books for general reading, a computer database to map topics and indices of the available materials, and interactive computer programs and instructional “how-to” videos that relate what we have learned in sufficient detail that the system can become almost self-instructive.
Tulan
Cellarius
Posts: 453
Joined: Wed Aug 31, 2005 8:04 pm
Location: Austin, TX

Post by Tulan » Mon Oct 12, 2009 1:13 pm

I wanted to write up a post on some thoughts I've had about software selection, design, and structure of the overall system. Please respond with your thoughts as I am gearing up for starting this project in earnest and it would be nice to have a solid idea of where we are going so we can avoid any serious overhauls in the future.

First consideration: relational database vs. graph database.

I'm presently arguing for a graph database, in particular 4Store. It is geared specifically for storing RDF/OWL triples, which would be in line with our intention of building a knowledge system and not just a data store.

RDF+OWL can be used to create topic maps and there is a larger selection of tools, specs, and community reading material for the vocabularies. This would achieve the goal of being able to create a knowledge base with technology that is more accessible and easier to open up as a web service (RDF is much easier to consume than Topic Maps because there are more tools for parsing it in existence).

Using 4Store would allow us to begin small, as there is a slight learning curve, but scale as the project gets larger. Structuring our knowledge in an RDF+OWL graph would make it more portable if there is a need to move to a different structure (relational, object, or otherwise).
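
To make the idea of "storing triples" concrete, here is a minimal sketch of a triple store in plain Python. It is not 4Store's API and does no RDF parsing or OWL reasoning. The class name, the `ci:` prefix, and the sample resources are all hypothetical, made up only to show the (subject, predicate, object) shape and a wildcard pattern match.

```python
from typing import Iterator, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class TripleStore:
    """Tiny in-memory stand-in for a triple store (illustration only)."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        """Record one statement; duplicates are ignored."""
        self._triples.add((subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> Iterator[Triple]:
        """Yield matching triples in sorted order; None is a wildcard."""
        for s, p, o in sorted(self._triples):
            if subject is not None and s != subject:
                continue
            if predicate is not None and p != predicate:
                continue
            if obj is not None and o != obj:
                continue
            yield (s, p, o)


def build_example_store() -> TripleStore:
    """Load a few hypothetical statements about library materials."""
    store = TripleStore()
    store.add("book:1", "rdf:type", "ci:Book")
    store.add("book:1", "ci:topic", "topic:gardening")
    store.add("video:7", "rdf:type", "ci:Video")
    store.add("video:7", "ci:topic", "topic:gardening")
    store.add("book:2", "ci:topic", "topic:astronomy")
    return store


def resources_about(store: TripleStore, topic: str) -> List[str]:
    """Return every subject linked to the given topic."""
    return [s for s, _, _ in store.match(predicate="ci:topic", obj=topic)]
```

Calling `resources_about(build_example_store(), "topic:gardening")` returns the book and the video together, regardless of media type. That is the kind of topic-map lookup a graph structure makes cheap.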

Another upside to 4Store is its distributed nature: it is designed for sharding and can do so with less effort on the programmer's part than MySQL or other relational DB technology. This would scale well for us if and when things get big enough to need that distribution.
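
For readers new to the term, sharding means splitting one logical dataset across several storage nodes so that each node holds only a slice. The sketch below shows one common approach, hashing each triple's subject to pick a shard. It is an assumption-laden illustration and not a description of how 4Store partitions data internally. The function names and the shard count are hypothetical.

```python
import hashlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def shard_for(subject: str, shard_count: int) -> int:
    """Map a subject to a shard index with a stable hash.

    hashlib is used instead of the built-in hash() because the latter
    is randomized per process for strings.
    """
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    digest = hashlib.sha1(subject.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def distribute(triples: Iterable[Triple], shard_count: int) -> Dict[int, List[Triple]]:
    """Group triples by shard; all triples for one subject land together."""
    shards: Dict[int, List[Triple]] = defaultdict(list)
    for triple in triples:
        shards[shard_for(triple[0], shard_count)].append(triple)
    return dict(shards)
```

Keying on the subject keeps every statement about a given resource on the same node, so a lookup for that resource touches only one shard. The trade-off is that changing `shard_count` moves most keys, which is why real systems often use more elaborate schemes.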

Second consideration: the language stack. PHP is great for building dynamic websites, as it has every possible function under the sun and a lot of community support, so I don't think we should move away from it for web development.

I do think, however, PHP lacks in many ways for more general purpose programming; I think PHP would work well as a tool for displaying results from a backend running 4Store+Python or 4Store+Scheme or 4Store+Erlang.

Granted, a lot of work will have to go into this but I am interested in building it as orthogonally as possible from the get go to make it easier to extend in the future on a solid platform.

Third consideration: servers. I think a dedicated server is close to necessary as this project picks up momentum; there are too many limitations on hosted servers to justify long-term dependence on that stack.

I have quite a bit of experience building jailed FreeBSD servers for companies, and I have my own running at home. Once I get a bit more contract work and can pay for a dedicated IP address, I will be able to run it with a dedicated domain (without port 80 being blocked). For Central Information I think it would be best to run it on a box we have full access to.

I also really like the idea of mixing the Central Information system with some sort of distributed peer network/reticulum (à la Freenet). I know getting Central Information up first will be the priority, as we don't have enough 'peers' to support a fully functioning Reticulum, but it would be wise to design the system so that when we do begin work on the Reticulum, it can plug into Central Information with ease and little restructuring (in this case, Central Information could act as a high-trust peer).

The Reticulum is a good name for the peer-distributed version of Central Information.

Money.

At some point, money for hardware infrastructure will be needed for Central Information: servers, backup media, etc. (The Reticulum won't need this, as it will run off of peers.) I wonder if the software we write for Central Information is something we could sell? Maybe spawn a company to handle the sale and support of the software?

I'm in the process of cleaning up the codebase I have to begin with the project. We might consider using my code base to rebuild Antiquatis and depart from dependence on Drupal. I have solid code built on top of the KohanaPHP framework (of which I am also a core developer); both are secure, efficient, and well documented/tested. Check it out here. Either way, it will be up on the server tonight or tomorrow for those of you interested to look at.
Ah, you seek meaning? Then listen to the music, not the song. - Kosh Naranek

LoneBear
Legatus Legionis
Posts: 3586
Joined: Thu Jul 22, 2004 12:38 am
Location: Utah

Re: Software design and considerations

Post by LoneBear » Mon Oct 12, 2009 7:54 pm

Tulan wrote: Using 4Store would allow us to begin small, as there is a slight learning curve, but scale as the project gets larger. Structuring our knowledge in an RDF+OWL graph would make it more portable if there is a need to move to a different structure (relational, object, or otherwise).
I'll take a look at installing 4Store this weekend. We run CentOS 5, so it looks like an easy fit.
Tulan wrote: 4Store is designed for sharding ...
Well, seems somebody has been studying the Buzzword book! You may want to define industry-specific terms, as most people only know a "Shard" as a chip off the Dark Crystal.
Tulan wrote: I do think, however, PHP lacks in many ways for more general purpose programming; I think PHP would work well as a tool for displaying results from a backend running 4Store+Python or 4Store+Scheme or 4Store+Erlang.
All languages have pros and cons... I would be more concerned with portability between operating systems (Mac, Amiga, PC, Unix, Linux, et al.) than with efficiency at the start. My preference would be a highly structured language like Java. It's always easy to simplify a more complicated language, but there are usually difficulties trying to take something like Basic and convert it to C++.
Tulan wrote: The Reticulum is a good name for the peer-distributed version of Central Information.
Hanging out with the Zetas, eh? Somebody keep an eye on Tulan, and make sure he doesn't turn Grey!
Tulan wrote: I wonder if the software we write for Central Information is something we could sell? Maybe spawn a company to handle the sale of and support of the software?
Already "in progress"; I'll discuss this with you later.
Tulan wrote: the KohanaPHP framework
Everybody has their favorite language and utilities. Believe me... after 37 years of programming, I can fill a couple of pages on a Resume with languages I've used. The system needs to be flexible enough so that others can contribute in whatever form they can... I'd prefer a flexible framework, so that creativity can flourish rather than hard rules.

Remember that the Sanctuary project is about pushing humans forward--intuition and creativity coupled with intelligence and execution. All aspects need to promote growth.

But at the same time, I don't want another "Unix", where command names are created from the latest gurgle the programmer's baby made. Flexibility, but well documented and descriptive, so it shows a clear path for others to follow.

BTW, where is the Kohana bug reporting system? You can usually judge a program by the errors it produces (and the support given to fix it).

Re: Software design and considerations

Post by Tulan » Tue Oct 13, 2009 9:10 am

I agree with your points and will say that Java and Python have the largest number of mature RDF libraries, including reasoners... Kohana's bug tracker can be found here: bug tracker.

Python

Post by LoneBear » Sun Oct 09, 2011 12:16 pm

I've loaded Python on my PC (Win7), but am a bit surprised at the lack of support... mod_python for Apache is no longer supported; the Python plugin for the NetBeans IDE is no longer supported... it seems to be difficult to get a working development environment on Windows, at least with the tools I've been using for many years.

What development environment and frameworks are you using?

Re: Software design and considerations

Post by Tulan » Tue Oct 11, 2011 10:07 pm

I use Emacs with python-mode; mod_python is really old, don't use that!

I would suggest getting the Pyramid framework installed. There are some good basic tutorials on it, and with Pyramid comes an HTTP middleware called "Paste" that has its own HTTP server built in, so you launch it and it runs the Pyramid application.

Otherwise, you should install mod_wsgi (the Apache module for hosting Python WSGI applications, which fills the role mod_python used to), although that's more of a pain than just using Paster. I always use the paster tool chain for development, plus it will auto-reload the app when code changes are made.

On my production box I use NGINX (instead of Apache), and it just proxies over to uWSGI (an application server written in C that does the same job as Paster but much faster).
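
What the servers mentioned above (Paste, mod_wsgi, uWSGI) have in common is that they host WSGI applications, and a Pyramid app is one of these. As a rough sketch of what that interface looks like without any framework, here is a bare WSGI callable using only the standard library. The greeting text, the `call_app` helper, and the port are hypothetical choices for illustration.

```python
from wsgiref.simple_server import make_server
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A bare WSGI app: takes the request environ, returns body chunks."""
    body = b"Hello from Central Information\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(path="/"):
    """Invoke the app in-process, the way a WSGI server would."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal valid environ
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(application(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    # Development only: the stdlib reference server, no external install.
    with make_server("127.0.0.1", 8000, application) as server:
        server.serve_forever()
```

Because the server only ever sees the `application` callable, the same code can run under the reference server during development and behind a proxy with a faster WSGI server in production.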