The 3 main elements in the software are the database, the 3D visualization, and the machine learning. I'm currently outsourcing my software to a team of programmers to build a functional prototype ASAP. They are considering MySQL, three.js, OpenGL, etc. We have finalized the data relations and visualization, but they don't have the capabilities to incorporate the requirements for future scaling and machine learning into the design.
Therefore, I want to consult on the architecture of my software, including choices of programming languages, databases, APIs, etc. to prepare for future scaling and minimize potential costs of database migration.
Since I don't have a technical background, I would really appreciate it if your consultation could be as descriptive, detailed, and documented as possible.
In order to set out confidently with the final destination in mind, I'm going to describe the basics of the software, the purposes, and my vision for its full-fledged version (though I'm totally aware of the slim chance that I can reach that point).
----
Pichutz is an online "multidimensional knowledge mapping platform" where each unit of knowledge (a "room") is mapped on 4 dimensions of knowledge. Each dimension has a unique meaning and together they guide users toward creating more "important" knowledge rather than random knowledge - conceptually high, conceptually deep, "possibility-complete," resource-efficient, machine-relevant, multidisciplinary, algorithmic, machine-readable knowledge.
The first 3 dimensions (Conceptual, Human, Scopic) correspond with 3 spatial dimensions (x-, y-, z-axes) in 3D visualization, while the 4th ("Language") dimension can be navigated by picking a level of machine-readability (natural language -> pseudo-code -> code -> even "rawer languages" which are discussed later).
The interface resembles Google Earth the most with the blue globe replaced by a tetrahedron (all edges are equal) filled with rooms. A user can "fly" through the rooms the same way one flies through Wikipedia articles in WikiGalaxy. This is the Pichutz Home with commands arranged on the top and tools at the bottom.
The most important dimension to understand is the 1st (Conceptual) one on the vertical axis. It's best to imagine Pichutz as a pyramid where each room is supported or "propped up" by lower rooms. "B props A" means A is either based on, inspired by, made possible with, invoked in, incorporates or hints at B. The propping room B may contribute either a high probability (a proving prop) or a low probability (a disproving prop) to A. Without a direct relation, a higher room A always implies a higher physical impact in reality over a lower room B, e.g. an international treaty over the behaviors of an isolated subatomic particle at low temperature.
----
The main guideline in room-opening is to open a room containing a piece of knowledge that you want to prove, so that other users can prove or disprove it with their knowledge, according to their credentials, by "propping" that room with rooms containing their knowledge. This practice helps arrive at an overall probability (calculated from all contributed probabilities) for any absolute statement such as "NoSQL databases are more scalable than SQL and provide superior performance" or "climate change will cause human extinction within 15 years." When such statements turn out to be equally proved and disproved, they suggest more advanced/encompassing ("Higher") technologies/truths. "Backprop" is prop's opposite concept which requires further contemplation.
Two important purposes of Pichutz (among others) are to address existential risks in order to divert investment into businesses or organizations that amass more "important" knowledge such as SpaceX or CERN, and to promote the "theory of weakness" (with weakness abstractly referring to any higher room's failure to incorporate a possible lower room) which addresses insecurities in the age of exponential technological advancement, answers philosophical questions regarding war and peace, ultimately proves humans' "inherent weaknesses," and hopefully spawns a science of "weakness management."
A room can be forked and can pull changes to incorporate its propping rooms and improve its probability. This way, sentences of knowledge will evolve into full theories (e.g. root of all existential risks), and eventually into coded programs (e.g. plans to survive by secretly dominating the world) to feed to future AIs. This is where the "Language" dimension bears fruit - a pipeline refining "human knowledge" into AIs' executable knowledge. This is the top-down approach, meant to complement the conventional bottom-up approach in AI development.
There is supposed to be a Cambrian explosion of rooms at first, which are mostly short sentences. Bots must be developed to brutally prune and crunch them and, in the process, guide and assist users in systematically learning and collaboratively creating more "important" knowledge as explained above.
----
Bots must be developed to perform many kinds of operations on the rooms, which increasingly require higher levels of machine learning, including:
1. stitching (putting rooms visualized in 3D together into Wikipedia-styled 2D articles for traditional reading),
2. validating (automatically propping rooms, merging similar rooms, migrating current rooms' probabilities to new rooms),
3. critical thinking (flagging contrasting rooms, pruning low probability rooms, killing spammers),
4. leveling (suggesting higher, unifying rooms based on contrasting lower rooms),
5. questioning (if not able to suggest unifying rooms, generating questions on Quora/StackOverflow),
6. crossing (linking problems with solutions in different fields, identifying similar solutions. See "the idea behind"),
7. translating (translating "human knowledge" into more machine-readable knowledge),
8. analytics (these tools will possibly be the main revenue stream, besides credentials accreditation, Private Pichutz, planning tool, Pyrofile search, headhunting, "room as a commodity," etc.). This includes summarizing common elements of successful products (e.g. ninja, wizard in games, movies), untapped combinations of product elements (e.g. ninja + wizard in the same movie), or untapped markets (e.g. big data experts in North Korea, baby clothes in Vietnam, or a photo-disappearing app for exceptionally unproductive teens).
To enable such tools, data must be mined to generate rooms, and rooms must be designed to be inputted by users in special ways such as "multidimensional checklist," "graphic room," "programmable rooms," or "meta-room," which require further contemplation.
The "Language" dimension will be developed last. It will allow knowledge to be inputted in more machine-readable language (e.g. programming code), or even "rawer languages." Imagine that molecules "talk" with each other in physical interactions and their knowledge exists in physical forms, or imagine that each AI with a different setup and set of parameters stores knowledge in a different "language" (after learning from datasets) that can "magically," for example, recognize human speech. In my daydream, Pichutz can become a Wikipedia+Github of the future, or more than that.
----
For philosophical details, you can check out this long document bit.ly/ReadTheSP
For how I described Pichutz to programmers, please refer below, which is mainly on data relations and visualization.
Please don't mind minor differences between the document and the description below. The latter is more realistic.
----
Admin Section:
- Admin Management
- User Management
- Taxonomic Tree, Scopic Tree Management
- Rooms Management
Front End:
- Open Room
- Create Prop
- Create Scopon
- Pyrofiles (user profiles)
- Visualization
- More functionality to be added
(Visualization)
The full shape of Pichutz is a tetrahedron. All edges are equal to X.
Before any room is opened, it's all black (dark). Only the edges are visible as thin white lines.
Each room when opened looks like a square window with light coming out of it. As more rooms are opened, the SP will light up and its full shape will become gradually visible.
All rooms appear only on the front triangle (the front face of Pichutz).
There are 6 invisible lines on this front face (not to visualize, just to keep in mind):
(1) the perpendicular bisector that divides the face into the left side and the right side
(2) the Floor 0 line that is about 0.6X away from the apex (measured along the perpendicular bisector), so that the area of the triangle above it ("High knowledge") and the area of the trapezium below it ("Deep knowledge") are equal.
(3-6) two pairs of Soft lines and Hard lines, dividing the face into two Soft shells, two Hard shells, and a Super Hard core.
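The position of the Floor 0 line in (2) can be checked with a short calculation. This is only a sketch, assuming the front face is an equilateral triangle with edge X and that the distance is measured from the apex along the perpendicular bisector; the function name is made up for illustration.

```python
import math


def floor0_distance_from_apex(edge: float) -> float:
    """Distance from the apex to the line that halves the face's area.

    Assumes an equilateral face; the distance is measured along the
    perpendicular bisector (the height of the triangle).
    """
    height = edge * math.sqrt(3) / 2
    # The triangle above the line is similar to the whole face. Its area
    # scales with the square of its height, so half the area means a
    # height ratio of 1 / sqrt(2).
    return height / math.sqrt(2)


print(round(floor0_distance_from_apex(1.0), 3))
```

For X = 1 this gives roughly 0.612, which is consistent with the rounded 0.6X figure.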
(Left or right) If the new room is related to an existing room, it appears on its same side. Otherwise, it appears on the opposite side of the last created room.
(Floor number)
(Floor 0) In theory, rooms on Floor 0 should be the easiest knowledge that humans can acquire. The 1st room we create is not likely to fit this description. So, at any time, the admin can pick a room and set it to Floor 0, and all rooms will be moved accordingly. For example, if a current room on 5th is set to be on 0th, all rooms will go down by 5 floors.
When we "Open room", we will input the room(s) that this new room props. The new room will always be placed below all the rooms it props.
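The Floor 0 reset described above could look something like the following. It is a rough sketch, assuming floors are stored as a plain mapping from room ID to floor number; the name rebase_floors and the sample room IDs are hypothetical.

```python
def rebase_floors(floors: dict[str, int], new_zero_room: str) -> dict[str, int]:
    """Return a new mapping where `new_zero_room` sits on Floor 0.

    Every room is shifted by the same offset, so relative positions
    between rooms are preserved.
    """
    offset = floors[new_zero_room]
    return {room: floor - offset for room, floor in floors.items()}


# A room on the 5th floor is set to Floor 0, so every room goes down by 5.
print(rebase_floors({"r1": 5, "r2": 7, "r3": 2}, "r1"))
```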
- When a new room is opened, if no prop is inputted, it will be on Floor 0.
- When a new room is opened and one prop (or many) is inputted, it will be 1 floor lower than the lowest propped room, and no existing room will change position. For example, if the new room props rooms on -2, -1, 0, 1, 2, then it will be on -3.
When we "Create a Prop" between two existing rooms, we input the probability that the propping room contributes to the propped room. Following this, existing rooms' floors may or may not change.
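The two "Open room" rules above can be written as one small function. This is a sketch under the assumption that floors are kept in a mapping from room ID to floor number; floor_for_new_room and the room IDs are illustrative names only.

```python
def floor_for_new_room(floors: dict[str, int], propped_rooms: list[str]) -> int:
    """Floor for a newly opened room, given the rooms it props."""
    if not propped_rooms:
        # No prop inputted: the new room goes on Floor 0.
        return 0
    # Otherwise it sits one floor below the lowest room it props.
    return min(floors[room] for room in propped_rooms) - 1


existing = {"a": -2, "b": -1, "c": 0, "d": 1, "e": 2}
print(floor_for_new_room(existing, ["a", "b", "c", "d", "e"]))  # -3
print(floor_for_new_room(existing, []))                         # 0
```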
- If the propping room is lower than the propped room, then no floor changes.
- If the propping room is higher than the propped room, check this example (create a prop from A to B) in which each arrow is a prop. There are 2 cases:
a) If there is a prop from C to D (C props D), then a relational conflict (a loop in this case) can be identified immediately: A --> B --> C --> D --> A. The prop from A to B thus can't be created.
b) If there is NO prop from C to D, then B and every room above it will move up above A. It means B will move up 5 floors from F0 to F5 (A's floor + 1). All rooms that B props will move up 5 floors, and all rooms that each of them props will also move up 5 floors, and so on. This is recursion. Only rooms that need to move move (colored black). Rooms that don't need to move don't move.
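Below is a rough sketch of the "Create a Prop" rules, covering both the loop check in (a) and the recursive move in (b). It assumes props are stored as a mapping from each propping room to the set of rooms it props, and that two rooms on the same floor are treated like the "higher" case (the description above does not cover that case). All names are hypothetical.

```python
def reachable_upward(props: dict[str, set[str]], start: str) -> set[str]:
    """All rooms reached from `start` by following props (start included)."""
    seen = {start}
    stack = [start]
    while stack:
        room = stack.pop()
        for propped in props.get(room, ()):
            if propped not in seen:
                seen.add(propped)
                stack.append(propped)
    return seen


def create_prop(floors: dict[str, int], props: dict[str, set[str]],
                propping: str, propped: str) -> None:
    """Add the prop `propping` --> `propped`, moving rooms up if needed."""
    upward = reachable_upward(props, propped)
    if propping in upward:
        # Case (a): the propped room already leads back to the propping room.
        raise ValueError("relational conflict: this prop would create a loop")
    if floors[propping] >= floors[propped]:
        # Case (b): lift the propped room to one floor above the propping
        # room, and shift every room it (transitively) props by the same amount.
        shift = floors[propping] + 1 - floors[propped]
        for room in upward:
            floors[room] += shift
    props.setdefault(propping, set()).add(propped)


floors = {"A": 4, "B": 0, "C": 1, "D": 3}
props = {"B": {"C"}, "D": {"A"}}
create_prop(floors, props, "A", "B")
print(floors)  # B and C each move up 5 floors; A and D stay put
```

A plain stack is used instead of recursive function calls so that very long chains of props do not hit Python's recursion limit; the effect is the same as the recursion described above.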
(Taxa/Scopa)
All taxa and scopa will appear inside the bulk of Pichutz. I don't put rooms in there. This space is definitely bigger than the area of the front face where rooms are populated, because eventually there will be more taxa/scopa (groups of rooms) than rooms. This is similar to there being always more synapses than neurons in the brain.
Each taxon/scopon looks like a round circle with light coming out of it.
Trees:
1. Each node on Taxonomic Tree is a taxon (in plural: taxa). Taxa are created by the Admin.
2. Each node on Scopic Tree is a scopon (in plural: scopa). Scopa are created by users.
I'm not sure yet where to download structured data of an accurate Taxonomic Tree but these can be a good start for reference (use whichever you can extract data from, or something else):
https://en.wikipedia.org/wiki/Outline_of_academic_disciplines
http://dewey.info/
http://www.loc.gov/catdir/cpso/lcco/
Authority:
1. Taxonomic Tree is viewed by everyone, and edited by Admins.
2. Scopic Tree is viewed by everyone. Users can only edit the ones they created. Admins can edit anything.
Nesting:
1. A scopon can be nested under any scopon/taxon and/or nest any scopon/taxon under it (as long as there's no hierarchical conflict).
2. There are two special taxa: "Facts" and "Data." They are included in the Taxonomic Tree.
View and management:
1. Taxonomic Tree and Scopic Tree are viewed like label view in Gmail (note: "Facts" and "Data" are separate on top).
2. When a room is opened, selecting a taxon/scopon is like selecting a label in Gmail.
3. Creating a scopon is like creating a label in Gmail.
4. Taxon/Scopon management is like label management (note "Facts" and "Data" on the top and, in this example, 2 taxa on the bottom). When taxa and scopa are nested under a scopon, only the scopa can be edited, the taxa can only be removed.
5. Scopa can be created/edited as per https://www.dropbox.com/s/u5wypu4l8o6sogw/Create-Edit%20Scopon.jpg?dl=0
Visualization of Taxonomic Tree and Scopic Tree:
1. The 2 trees are managed separately but, in visualization, they are "blended" together.
2. In short, the fixed Taxonomic Tree will be created first and serve as the backbone, equally spaced between the front face ABC and farthest vertex O. The lowest ranked taxa are positioned closest to the front face, and the root(s) furthest from the front face (or closest to the furthest vertex).
3. When a scopon is added, it will simply appear above all the taxa/scopa it contains, and below those it is nested under.
4. If the added scopon is on top of the highest taxon (as in the example in the photo above), the spacing will be adjusted and, in effect, all taxa/scopa will move closer to each other. At first, there are only taxa, which are sparsely positioned. Over time, when a lot of scopa have been added, taxa/scopa will be very dense.
(Pyrofile - user profile)
A user can:
- Create a room.
- Create a prop between room A and room B and set the probability that room A contributes to room B.
- Create a scopon, nest it under other scopon/taxon, and/or nest other taxon/scopon/room under it.
This is when you view your own Pyrofile or any user's Pyrofile: https://www.dropbox.com/s/gsb91wwreujhejq/Pyrofile%20%28user%20profile%29.jpg?dl=0
The "Manage list" hyperlinks link to lists displayed as per https://www.dropbox.com/s/y06oajqdsrlthpd/List%20display.jpg?dl=0
Lists can be sorted in ascending or descending order. When you view another user's Pyrofile, "Manage list" becomes "View list."
Beta users are those whom I absolutely trust and allow to set their own creds for each taxon. Refer below.
(Cred)
Each user has a different cred for each taxon (which means a different qualification for each discipline). The default cred for each taxon of each user is 1%. Beta users (whom I trust) can set their own creds up to 50% (no one can be more than 50% sure about almost anything). Normal users in the future are not allowed to set their own creds; we do.
If a room is under different taxa on the same line, for example, the room "Schrodinger's equation indicates that the future is predetermined" is under "Quantum Mechanics" which is under "Theoretical Physics" which is under "Physics", then a user's cred associated with that room is his cred for the closest taxon of that room ("Quantum Mechanics") rather than the remote "Physics" because, statistically speaking, a self-proclaimed physicist hardly knows enough QM to draw such a conclusion.
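The cred rules above might be sketched as follows. This assumes creds are stored per user as a mapping from taxon name to a fraction between 0 and 1, and that each room records its closest taxon directly; the constants and function names are made up for illustration.

```python
DEFAULT_CRED = 0.01   # 1% default for every taxon
BETA_MAX_CRED = 0.50  # beta users may go up to 50%


def set_own_cred(creds: dict[str, float], taxon: str, value: float,
                 is_beta: bool) -> None:
    """Let a beta user set their own cred for one taxon."""
    if not is_beta:
        raise PermissionError("only beta users can set their own creds")
    if not 0.0 <= value <= BETA_MAX_CRED:
        raise ValueError("cred must be between 0% and 50%")
    creds[taxon] = value


def cred_for_room(creds: dict[str, float], closest_taxon: str) -> float:
    """Cred used for a room: the cred for the room's closest taxon only.

    Creds for ancestor taxa (e.g. "Physics") are deliberately ignored.
    """
    return creds.get(closest_taxon, DEFAULT_CRED)


creds: dict[str, float] = {}
set_own_cred(creds, "Physics", 0.4, is_beta=True)
print(cred_for_room(creds, "Quantum Mechanics"))  # falls back to the 1% default
```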
(Room)
Each room has a unique room ID and a permanent link.
When a user clicks on a room, he can also see all the probabilities contributed by all rooms that prop or backprop that room, presented in the form of a probability distribution of probabilities.
(Overall Probability)
For a room, a probability distribution of probabilities can be informative (e.g. divided opinions), but a single Overall Probability, that takes into account the creds of the users who contributed probabilities to this room, would be useful in most cases. To calculate room A’s OP, the probabilities of room A contributed by each of its propping rooms are weighted by [each propping room’s OP] and [the average cred of the props' users associated with both room A and the propping rooms]. This is the simple formula.
It is because a user needs qualifications/credentials in both fields to make a connection between them. For example, for your prop between a room under “Quantum Mechanics” and a room under “Quantum computing” to matter, you need high creds in both fields.
Since a room's OP takes into account its related rooms' OPs, I think we should employ an iterative algorithm similar to PageRank (section 2.1.1), which means iterating the calculation of all OPs until their values converge.
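One possible reading of the simple formula, iterated until the values settle, is sketched below. Several things here are assumptions rather than part of the description: the OP is taken as a weighted average with weight = [propping room's OP] x [average cred], rooms with no props keep a starting OP of 0.5, and the names Prop and overall_probabilities are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prop:
    propping: str       # room ID of the propping room
    propped: str        # room ID of the propped room
    probability: float  # probability contributed to the propped room
    cred: float         # prop author's average cred across both rooms' taxa


def overall_probabilities(rooms: list[str], props: list[Prop],
                          base_op: float = 0.5, tol: float = 1e-9,
                          max_iter: int = 1000) -> dict[str, float]:
    """Iterate all OPs together until no value changes by more than `tol`."""
    op = {room: base_op for room in rooms}
    for _ in range(max_iter):
        new_op = {}
        for room in rooms:
            incoming = [p for p in props if p.propped == room]
            total_weight = sum(op[p.propping] * p.cred for p in incoming)
            if total_weight == 0:
                new_op[room] = op[room]  # no usable props: keep current value
            else:
                weighted = sum(p.probability * op[p.propping] * p.cred
                               for p in incoming)
                new_op[room] = weighted / total_weight
        change = max((abs(new_op[r] - op[r]) for r in rooms), default=0.0)
        op = new_op
        if change < tol:
            break
    return op


props = [Prop("B", "A", 0.9, 0.5), Prop("C", "A", 0.1, 0.1)]
print(overall_probabilities(["A", "B", "C"], props))
```

In this example the proving prop from B carries more weight than the disproving prop from C because its author has the higher cred, so A's OP ends up well above 0.5.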
(Color)
Taxa: light blue
Scopa: light green
Soft rooms: pink
Hard rooms: orange
Super Hard rooms: red
Normally, all rooms glow. However, when we click on a taxon or scopon, only rooms contained by (associated with) that taxon/scopon glow while other rooms dim. When we click outside, all rooms glow again.