Most persistent data in Deme is stored in “items”. An item is an instance of a particular “item type”. This is in parallel to object-oriented programming where instances (items) are defined by classes (item types), and in parallel to filesystems where files (items) are defined by file types (item types). The Deme item types form a hierarchy through inheritance, so if the Person item type inherits from the Agent item type, then any item that is a person is also an agent. Every item type inherits from the Item item type (which corresponds to the Object class in many programming languages). We allow multiple inheritance, and use it occasionally (e.g., TextComment inherits from both Comment and TextDocument).
We use ORM with multi-table inheritance. There is a database table for every item type, and a row in that table for every item of that item type. For example, if our entire item type hierarchy is Item -> Agent -> Person, and our items are Mike[Person] and Robot[Agent], then there will be one row in the Person table (for Mike), two rows in the Agent table (for Mike and Robot), and two rows in the Item table (for Mike and Robot).
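As a rough illustration of the row counts above, here is a plain-Python sketch. It is not Deme's ORM code: the classes simply mirror the example hierarchy, and `rows_per_table` is a hypothetical helper that counts one row per item in its own table and in every ancestor table.

```python
from collections import Counter

# Hypothetical stand-ins for the item types in the example above.
class Item: pass
class Agent(Item): pass
class Person(Agent): pass

def rows_per_table(items):
    """Count rows per table under multi-table inheritance.

    Each item contributes one row to the table of its own item type
    and one row to the table of every ancestor item type.
    """
    counts = Counter()
    for item in items:
        for cls in type(item).__mro__:
            if cls is not object:  # `object` is Python's root, not an item type
                counts[cls.__name__] += 1
    return dict(counts)

mike, robot = Person(), Agent()
table_rows = rows_per_table([mike, robot])
print(table_rows)
```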
Every item type defines the “fields” relevant for its items, and item types inherit fields from their parents. As a simple example, imagine Item defines the “description” field, Agent defines no new fields, and Person defines the “first_name” field. Therefore every person has a description and a first_name. The columns in each table correspond to the fields in its item type. So if the Mike item has description="a programmer" and first_name="Mike", then his row in the Item table will just have description="a programmer", his row in the Agent table will have no fields (because Agent did not define any new fields), and his row in the Person table will have first_name="Mike".
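The way Mike's fields are spread across the three tables can be sketched as follows. This is only an illustration under assumptions: `FIELDS_BY_TYPE` and `split_into_rows` are hypothetical names, and each row also carries the shared id that is described further on in this document.

```python
# Hypothetical field declarations per item type, mirroring the example above.
FIELDS_BY_TYPE = {
    "Item": ["description"],
    "Agent": [],               # Agent defines no new fields
    "Person": ["first_name"],
}

def split_into_rows(item_id, values, type_chain):
    """Return {table_name: row} where each row holds only the fields
    that the corresponding item type itself defines, plus the id."""
    rows = {}
    for type_name in type_chain:
        row = {"id": item_id}
        for field in FIELDS_BY_TYPE[type_name]:
            row[field] = values[field]
        rows[type_name] = row
    return rows

mike_rows = split_into_rows(
    1,
    {"description": "a programmer", "first_name": "Mike"},
    ["Item", "Agent", "Person"],
)
print(mike_rows)
```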
Every field has a type that corresponds to the types we can store in our database. The basic types are things like String, Integer, and Boolean. It is important to realize that fields are not items. So if Mike’s first_name field is of type String, it cannot be referred to as an item itself. You cannot store entire items as fields, but you can have fields that point to other items (foreign keys in database-speak, pointers/references in programming, links in filesystems). If we wanted to “itemize” the first_name field, we could make a new FirstName item type and have the Person’s first_name field be a pointer to a FirstName item. In the case of first_name, however, this is not particularly useful, and it just adds more overhead (and makes versioning difficult, as we’ll see later on). Pointer fields are more useful for defining relationships between legitimate items. For example, the Item item type has a “creator” field pointing to the agent that created the item.
Pointers do not represent an exclusive “ownership” relationship. I.e., just because an item pointed to the agent that wrote it, this does not prevent other items from pointing to that agent. Multiple items can point to a common item.
Fields cannot store data structures like lists. If you want to express X has many Y’s, rather than storing all the Y’s in the X row, you should itemize the Y’s, and have each Y point to the X that it belongs to. For example, an Agent has many ContactMethods. So rather than storing the ContactMethods as fields inside each Agent, we make ContactMethod an item type, and give it an “Agent pointer” field. So the contact methods for agent 123 are represented by all of the ContactMethods that have agent_pointer=123.
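A minimal sketch of this "many Y's point at one X" pattern, using an in-memory list instead of a database table. The `ContactMethod` dataclass and `contact_methods_for` helper are illustrative assumptions, not Deme's model code.

```python
from dataclasses import dataclass

@dataclass
class ContactMethod:
    id: int
    agent_pointer: int  # id of the Agent this contact method belongs to

contact_methods = [
    ContactMethod(id=10, agent_pointer=123),
    ContactMethod(id=11, agent_pointer=456),
    ContactMethod(id=12, agent_pointer=123),
]

def contact_methods_for(agent_id, all_contact_methods):
    """The agent stores no list; we find its contact methods by
    selecting every ContactMethod whose pointer refers to it."""
    return [cm for cm in all_contact_methods if cm.agent_pointer == agent_id]

found = contact_methods_for(123, contact_methods)
print([cm.id for cm in found])
```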
The most important field is the id field (primary key in database-speak, memory address of the object in programming, inode number in filesystems). Every item has a unique id, an auto-incrementing integer starting at 1. Items share the same id with their parent-item-type versions (so Mike’s row in the Person table has the same id as Mike’s row in the Agent table and Item table). Pointer fields are effectively references to the id of the pointee. It is important that the id field never change so that there is always a single reliable way to refer to a particular item. No other field is guaranteed to be unique among all items (although some item types define unique fields within that item type, such as DemeAccount’s unique username).
Some fields are specified as immutable, which means once they are set, they cannot be changed. The id field is a prime example of an immutable field, but other fields like creator and created_at are immutable as well.
For every item type table, there is a dual “revisions” table. So in our previous example, in addition to the Item, Agent, and Person tables, we now have ItemVersion, AgentVersion, and PersonVersion tables. These tables store the exact same fields as their original non-versioned tables, with a few exceptions:
So apart from these differences, each itemversion stores the exact same fields as the regular item. Every time a change is made to the regular item, a snapshot is taken and a new itemversion is created with an increased version number. So when the Mike item is changed, Mike’s rows in the Item, Agent, and Person table are updated, and those updates are copied over to the ItemVersion, AgentVersion, and PersonVersion tables, so that we can refer to the past.
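The snapshot-on-edit behaviour can be sketched with a toy class. This is an assumption-laden illustration rather than Deme's implementation: `VersionedItem` is a hypothetical name, the versions are kept in a Python list instead of the version tables, and we assume the first version is numbered 1.

```python
import copy

class VersionedItem:
    """Toy item that snapshots its fields on every edit (illustrative only)."""

    def __init__(self, item_id, **fields):
        self.id = item_id
        self.version_number = 1  # assumption: numbering starts at 1
        self.fields = dict(fields)
        self.versions = [(1, copy.deepcopy(self.fields))]

    def edit(self, **changes):
        # Update the live item, then record a snapshot under the next number.
        self.fields.update(changes)
        self.version_number += 1
        self.versions.append((self.version_number, copy.deepcopy(self.fields)))

    def fields_at(self, version_number):
        # Look up the snapshot stored for a past version.
        for number, snapshot in self.versions:
            if number == version_number:
                return snapshot
        raise KeyError(version_number)

mike = VersionedItem(1, first_name="Mike", description="a programmer")
mike.edit(first_name="Michael")
print(mike.version_number, mike.fields_at(1)["first_name"])
```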
Here is the major caveat. Imagine there is a student and a class. We must represent their relationship with a third item type, the ClassMembership, since classes cannot store arrays of students, and students cannot store arrays of classes. If the student joins the class, a ClassMembership is created. Ideally, we’d like to be able to look at the previous roster of the class, but since the class roster is just composed of ClassMemberships that point to the class, there is no way to refer to a previous version. A possible solution is to refer to the entire state of items at a particular time step, which is possible since we can compute what versions were around at any given moment, but that gets convoluted. For now, just assume that versioning only plays well with regular fields, and does not work on data structures created via relationship tables.
There are two ways of deleting items: deactivating and destroying. Neither of these methods removes any rows from the database. Deactivating is recoverable (by reactivating); destroying is not. The user interface ensures that deactivating happens before destroying.
Deactivating: If an agent deactivates an item, it sets the active field to false. An agent can recover the item by reactivating it, which sets the active field back to true. Each time this happens, the version does not change, but a DeactivateActionNotice or ReactivateActionNotice is automatically generated to log when the item was active. Inactive items can still be viewed and edited as normal. The major difference between an active and an inactive item is that when an item is inactive, it will not be returned as the result of queries (unless the query specifically requests inactive items). For example, when you look at the list of students in a class, it will only show students that are active with ClassMemberships that are active.
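The roster example can be sketched like this. The dictionaries stand in for database rows, and the `active_roster` function and its `include_inactive` flag are hypothetical names chosen for illustration.

```python
def active_roster(students, memberships, class_id, include_inactive=False):
    """Return names of students in a class, hiding inactive rows by default."""
    names = []
    for m in memberships:
        if m["class_id"] != class_id:
            continue
        student = students[m["student_id"]]
        # Both the membership and the student must be active to be listed,
        # unless the caller explicitly asks for inactive items too.
        if include_inactive or (m["active"] and student["active"]):
            names.append(student["name"])
    return names

students = {
    1: {"name": "Ann", "active": True},
    2: {"name": "Bob", "active": False},  # deactivated student
    3: {"name": "Cy", "active": True},
}
memberships = [
    {"student_id": 1, "class_id": 7, "active": True},
    {"student_id": 2, "class_id": 7, "active": True},
    {"student_id": 3, "class_id": 7, "active": False},  # deactivated membership
]
roster = active_roster(students, memberships, 7)
print(roster)
```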
Destroying: After an item is deactivated, you can permanently nullify all of its fields (and/or the fields in its versions) so that it is impossible to recover (but keep active=false). A DestroyActionNotice is automatically generated to log when the item was around.
Our solution is as follows. We allow any field to have the special NULL value from SQL. The application (not the database) ensures that fields only take on these values when the item is destroyed, and never otherwise (I haven’t finished making sure this happens yet). Thus, to destroy an item is to set every field to NULL, and set destroyed=True (leaving id, item_type, active, and version_number alone). Destroying an item also removes all permissions and versions of the item. After an item is destroyed, nobody can make changes (in particular, it cannot be reactivated or edited).
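A small sketch of what destroying does to a single row, under stated assumptions: the row is a plain dictionary, `None` plays the role of SQL NULL, and `destroy` and `PRESERVED_FIELDS` are hypothetical names. The check that the item is already inactive reflects the deactivate-before-destroy order that the user interface enforces.

```python
# Fields that survive destruction, per the description above.
PRESERVED_FIELDS = {"id", "item_type", "active", "version_number"}

def destroy(row):
    """Return a copy of `row` with every non-preserved field nullified."""
    if row["active"]:
        raise ValueError("item must be deactivated before it is destroyed")
    destroyed = {
        name: (value if name in PRESERVED_FIELDS else None)
        for name, value in row.items()
    }
    destroyed["destroyed"] = True
    return destroyed

row = {
    "id": 5, "item_type": "Person", "active": False, "version_number": 3,
    "destroyed": False, "description": "a programmer", "first_name": "Mike",
}
destroyed_row = destroy(row)
print(destroyed_row)
```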
Normally, having NULL values makes the code much more complex and prone to bugs, since the developer has to write a lot of checks for NULL. For example, to display the name of the creator of an item, the developer would have to write something like if (item.creator != NULL && item.creator.name != NULL) .... Since we already do all of this up-front error checking in the permission system (to ensure that the logged in agent has permission to view the creator of the item and the name of the creator), all we have to do is modify the permission code so that users cannot view fields (or take any actions) for destroyed items. So if an item’s creator was destroyed, a simple viewer will just display the creator’s name in the same way it would display something it does not have permission to view (a more advanced viewer could check to see if it was destroyed).
It will also be possible to destroy specific versions of an item (not yet implemented). You can destroy any version except for the latest version (if you want to destroy the latest version, just edit the item to make a new version so that the version you want to destroy is now the second-latest). Destroying a version will permanently NULLify all fields in the version.
Not every bit of persistent data is stored in the database in item fields. Here are the exceptions so far:
Below are the core item types and the role they play (see the full ontology at http://deme.stanford.edu/viewing/codegraph).
Agents and related item types
Agent: This item type represents an agent that can “do” things. Often this will be a person (see the Person subclass), but actions can also be performed by other agents, such as bots and anonymous agents. Agents are unique in the following ways:
There is only one field defined by this item type, last_online_at, which stores the date and time when the agent last accessed a viewer.
AnonymousAgent: This item type is the agent that users of Deme authenticate as by default. Because every action must be associated with a responsible Agent (e.g., updating an item), we require that users are authenticated as some Agent at all times. So if a user never bothers logging in at the website, they will automatically be logged in as an AnonymousAgent, even if the website says “not logged in”. There should be exactly one AnonymousAgent at all times.
This item type does not define any new fields.
GroupAgent: This item type is an Agent that acts on behalf of an entire group. It can’t do anything that other agents can’t do. Its significance is just symbolic: by being associated with a group, the actions taken by the group agent are seen as collective action of the group members. In general, permission to login_as the group agent will be limited to powerful members of the group. There should be exactly one GroupAgent for every group.
This item type defines one field, a unique group pointer that points to the group it represents.
AuthenticationMethod: This item type represents an Agent’s credentials to log in. For example, there might be an AuthenticationMethod representing my Facebook account, an AuthenticationMethod representing my WebAuth account, and an AuthenticationMethod representing my OpenID account. Rather than storing the login credentials directly in a particular Agent, we allow agents to have multiple authentication methods, so that they can log in in different ways. In theory, AuthenticationMethods can also be used to sync profile information through APIs. There are subclasses of AuthenticationMethod for each different way of authenticating.
This item type defines one field, an agent pointer that points to the agent that holds this authentication method.
DemeAccount: This is an AuthenticationMethod that allows a user to log on with a username and a password. The username must be unique across the entire Deme installation. The password field is formatted the same as in the User model of the Django admin app (algo$salt$hash), and is thus not stored in plain text.
This item type defines four fields: username, password, password_question, and password_answer (the last two can be used to reset the password and send it to the Agent via one of its ContactMethods).
Person: A Person is an Agent that represents a person in real life. It defines four user-editable fields about the person’s name: first_name, middle_names, last_name, and suffix.
ContactMethod: A ContactMethod belongs to an Agent and contains details on how to contact them. ContactMethod is meant to be abstract, so developers should always create subclasses rather than creating raw ContactMethods.
This item type defines one field, an agent pointer that points to the agent that holds this contact method.
Currently, the following concrete subclasses of ContactMethod are defined (with the fields in parentheses):
Subscription: A Subscription is a relationship between an Item and a ContactMethod, indicating that all action notices on the item should be sent to the contact method as notifications. This item type defines the following fields:
Collections and related item types
Collection: A Collection is an Item that represents an unordered set of other items. Collections just use pointers from Memberships to represent their contents, so multiple Collections can point to the same contained items. Since Collections are just pointed to, they do not define any new fields.
Collections “directly” contain items via Memberships, but they also “indirectly” contain items via chained Memberships. If Collection 1 directly contains Collection 2 which directly contains Item 3, then Collection 1 indirectly contains Item 3, even though there may be no explicit Membership item specifying the indirect relationship between Collection 1 and Item 3. (In the actual implementation, a special database table called RecursiveMembership is used to store all indirect membership tuples, but it does not inherit from Item.)
It is possible for there to be circular memberships. Collection 1 might contain Collection 2 and Collection 2 might contain Collection 1. This will not cause any errors: it simply means that Collection 1 indirectly contains itself. It is even possible that Collection 1 directly contains itself via a Membership to itself.
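To make the direct/indirect distinction and the handling of cycles concrete, here is a sketch that computes everything a collection contains from a list of direct memberships. It is an in-memory illustration only; `all_members` is a hypothetical helper and does not reflect how the RecursiveMembership table is actually maintained.

```python
def all_members(memberships, collection_id):
    """Return ids contained directly or indirectly in a collection.

    `memberships` is an iterable of (collection_id, item_id) pairs, one
    per direct Membership. The `seen` set makes cycles harmless.
    """
    children = {}
    for coll, item in memberships:
        children.setdefault(coll, set()).add(item)

    seen = set()
    stack = list(children.get(collection_id, ()))
    while stack:
        item = stack.pop()
        if item in seen:
            continue  # already visited, so circular memberships terminate
        seen.add(item)
        stack.extend(children.get(item, ()))
    return seen

chained = all_members([(1, 2), (2, 3)], 1)
circular = all_members([(1, 2), (2, 1)], 1)
print(chained, circular)
```

In the circular case the result for Collection 1 includes 1 itself, matching the statement that it indirectly contains itself.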
Group: A group is a collection of Agents. A group has a folio that is used for collaboration among members. This item type does not define any new fields, since it just inherits from Collection and is pointed to by Folio.
Folio: A folio is a special collection that belongs to a group. It has one field, the group pointer, which must be unique (no two folios can share a group).
Membership: A Membership is a relationship between a collection and one of its items. It defines two basic fields, an item pointer and a collection pointer. It also defines a permission_enabled boolean, which allows permissions to propagate through the containing collection to the member item (explained more in the Permissions section).
Documents
Annotations (Transclusions, Comments, and Excerpts)
Transclusion: A Transclusion is an embedded reference from a location in a specific version of a TextDocument to another Item. This item type defines the following fields:
Comment: A Comment is a unit of discussion about an Item. Each comment specifies the commented item and version number (in the item and item_version_number fields). Comment is meant to be abstract, so developers should always create subclasses rather than creating raw Comments. Currently, users can only create TextComments.
If somebody creates Item 1, someone creates Comment 2 about Item 1, and someone responds to Comment 2 with Comment 3, then one would say that Comment 3 is a direct comment on Comment 2, and Comment 3 is an indirect comment on Item 1. The Comment item type only stores information about direct comments, but behind the scenes, the RecursiveComment table (which does not inherit from Item) keeps track of all of the indirect commenting so that viewers can efficiently render entire threads.
A Comment also specifies a from_contact_method field, which points to a ContactMethod that was used to generate this comment. Often this will be null, but in cases where people send emails to generate comments, this will point to the EmailContactMethod, and is used to set an appropriate reply address.
TextComment: A TextComment is a Comment and a TextDocument combined. It is currently the only form of user-generated comments. It defines no new fields.
Excerpt: An Excerpt is an Item that refers to a portion of another Item (or an external resource, such as a webpage). Excerpt is meant to be abstract, so developers should always create subclasses rather than creating raw Excerpts.
TextDocumentExcerpt: A TextDocumentExcerpt refers to a contiguous region of text in a version of another TextDocument in Deme. The body field contains the excerpted region, and the following fields are introduced:
Viewer aliases
In order to allow vanity URLs (i.e., things other than /viewing/item/5), we have a system of hierarchical URLs. In the future, we’ll need to make sure URL aliases cannot start with /viewing/ (our base URL for viewers), /static/ (our base URL for static content like stylesheets), or /meta/ (our base URL for Deme framework things like authentication). Right now, if someone makes a vanity URL with one of those prefixes, you just cannot reach it (it does not shadow the important URLs).
ViewerRequest: A ViewerRequest represents a particular action at a particular viewer (basically a URL, although it’s stored more explicitly). A ViewerRequest is supposed to be abstract, so users can only create Sites and CustomUrls. It specifies the following fields:
Site: A Site is a ViewerRequest that represents a logical website with URLs. Multiple Sites on the same Deme installation share the same Items with the same unique ids, but they resolve URLs differently so each Site can have a different page for /mike. If you go to the base URL of a site (like http://example.com/), you see the ViewerRequest that this Site inherits from. This item type specifies the following fields:
CustomUrl: A CustomUrl is a ViewerRequest that represents a specific path.
Each CustomUrl has a parent_url field pointing to the parent ViewerRequest (it will be the Site if this CustomUrl is the first path component) and a path field. So when a user visits http://example.com/abc/def, Deme looks for a CustomUrl with name “def” with a parent with name “abc” with a parent Site with hostname “example.com”. In other words, we need to find something that looks like this:
CustomUrl(name="def", parent_url=CustomUrl(name="abc", parent_url=Site(hostname="example.com")))
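The lookup can be sketched as a walk from the Site down through one CustomUrl per path component. This is a simplified in-memory illustration: the `Site` and `CustomUrl` classes here are stand-ins that use `name` as in the expression above, and `resolve` is a hypothetical helper rather than Deme's URL resolver.

```python
class Site:
    def __init__(self, hostname):
        self.hostname = hostname

class CustomUrl:
    def __init__(self, name, parent_url):
        self.name = name
        self.parent_url = parent_url  # a Site or another CustomUrl

def resolve(hostname, path, sites, custom_urls):
    """Return the Site or CustomUrl matching hostname + path, or None."""
    current = next((s for s in sites if s.hostname == hostname), None)
    if current is None:
        return None
    for component in [p for p in path.split("/") if p]:
        # Descend one level: find the child of `current` with this name.
        current = next(
            (u for u in custom_urls
             if u.parent_url is current and u.name == component),
            None,
        )
        if current is None:
            return None
    return current

site = Site("example.com")
abc = CustomUrl("abc", site)
abc_def = CustomUrl("def", abc)
match = resolve("example.com", "/abc/def", [site], [abc, abc_def])
print(match.name)
```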
Misc item types
ActionNotices keep records of every action that occurs in Deme. ActionNotices are not items themselves, but they exist in the database and point to items.
Every ActionNotice keeps the following fields
There are currently 6 types of ActionNotices: DeactivateActionNotices, ReactivateActionNotices, DestroyActionNotices, CreateActionNotices, EditActionNotices, and RelationActionNotices. The first 5 are self-explanatory: when an agent deactivates, reactivates, destroys, creates, or edits an item, this automatically generates an ActionNotice. None of these 5 ActionNotices define new fields. Although it seems like the CreateActionNotices and EditActionNotices should define fields to specify what changed, this information can be inferred from the item itself (and its revisions).
RelationActionNotices are more interesting: when an agent modifies an item (the from item) that points to another item (the to item), a RelationActionNotice is generated about the to item. These notices are only generated when the pointer changes, either from something else to the to item, or from the to item to something else. RelationActionNotices define new fields to specify the from item and its version at the time of the action, and the field in the from item that points to the to item.
A good example of a RelationActionNotice is a membership that points to a collection. If I’m viewing the ActionNotices for the collection, I will see a RelationActionNotice saying that at some date, some user set the membership to point to this collection. Or in other words, an item was added to this collection.
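The rule for when RelationActionNotices are generated can be sketched as a comparison of the old and new pointer values. The function name and the dictionary keys below (`from_item`, `from_item_version_number`, `from_field_name`, `relation_added`) are assumptions made for illustration, not confirmed field names.

```python
def relation_notices(from_item_id, from_item_version, field_name,
                     old_target, new_target):
    """Return the notices produced when a pointer field is saved.

    Nothing is generated if the pointer did not change. Otherwise the
    old target (if any) gets a "relation removed" notice and the new
    target (if any) gets a "relation added" notice.
    """
    if old_target == new_target:
        return []
    notices = []
    for target, added in ((old_target, False), (new_target, True)):
        if target is not None:
            notices.append({
                "action_item": target,  # the "to" item the notice is about
                "from_item": from_item_id,
                "from_item_version_number": from_item_version,
                "from_field_name": field_name,
                "relation_added": added,
            })
    return notices

# Membership 50 is created pointing at collection 9.
added = relation_notices(50, 1, "collection", None, 9)
print(added)
```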
In order to view ActionNotices, an agent must have the view action_notices permission with respect to the action item. For RelationActionNotices, an agent must also have permission to view the pointing field in the from item.
If you are subscribed to an item (via the Subscription item type), and you have permission to view ActionNotices on that item, you will receive notifications by email every time an ActionNotice is generated.
The ActionNotices about an agent include ActionNotices whose action_agent field points to the agent, in addition to ActionNotices whose action_item field points to the agent. Thus, if you subscribe to an agent, you will get emails about things they do, in addition to things done to them. For this reason, RelationActionNotices are not generated for the action_agent field of an item, or else there would be redundant ActionNotices on the same item.
Permissions define what actions Agents can and cannot do. Similar to ActionNotices, permissions are not items themselves, but they exist in the database and point to items (it used to be that permissions were items, but for simplicity and efficiency, we now keep them separate).
There are 9 types of permissions, divided along 2 axes: the source axis and the to axis. Along the source axis, permissions can be given at 3 levels: to a single Agent, to the members of a Collection of Agents, or to all Agents. Along the to axis, permissions can be applied at 3 levels: to a single Item, to the members of a Collection of Items, or to all Items. For both axes, we refer to these three levels as “one”, “some”, and “all”. The 9 possible permissions are shown in the table below:
| From \ To | One                     | Some                     | All                     |
|-----------|-------------------------|--------------------------|-------------------------|
| One       | OneToOnePermission (1)  | OneToSomePermission (2)  | OneToAllPermission (3)  |
| Some      | SomeToOnePermission (4) | SomeToSomePermission (5) | SomeToAllPermission (6) |
| All       | AllToOnePermission (7)  | AllToSomePermission (8)  | AllToAllPermission (9)  |
Although we could accomplish anything using only OneToOnePermissions, the other permission types allow us to more concisely express permissions. For example, if our site was a wiki and we wanted any user to be able to edit any document, we would create a single AllToAllPermission, rather than a new OneToOnePermission for every Agent/Item pair.
Each permission, in addition to specifying the source and the to axes, specifies an ability string and an is_allowed boolean. When there are multiple permissions with the same ability, the permissions at a level with a lower number (shown in parentheses after each permission type in the table above) take precedence. When there are multiple permissions at the same level, the negative (is_allowed=False) permissions take precedence over the positive permissions.
On both axes, when we refer to all agents or items in a collection (i.e., [X]ToSome or SomeTo[X]), we refer to both direct and indirect members. Thus, the permission code checks the RecursiveMembership table to determine whether an agent or an item is affected by the permission.
There are two types of abilities: item abilities and global abilities. Item abilities can apply to a particular item (or collection of items), such as “can edit the name of the item”, while global abilities do not apply to any particular item, such as “can create new documents”. Each item type defines the item abilities that are relevant to it, and the global abilities it introduces.
An agent has an ability if there exists a relevant permission with is_allowed=True at some level without any relevant permissions with is_allowed=False at any levels with the same or lower number.
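The precedence rule can be written down directly. In this sketch each permission is reduced to a `(level, ability, is_allowed)` tuple, where the level is the number from 1 to 9 shown in the permission table, and we assume the list has already been filtered to permissions relevant to the agent and item in question. `has_ability` is a hypothetical helper, not Deme's permission code.

```python
def has_ability(permissions, ability):
    """True if some positive permission for `ability` exists at a level
    with no negative permission at the same or a lower-numbered level."""
    relevant = [(level, ok) for level, ab, ok in permissions if ab == ability]
    allowed_levels = [level for level, ok in relevant if ok]
    denied_levels = [level for level, ok in relevant if not ok]
    return any(
        all(denied > level for denied in denied_levels)
        for level in allowed_levels
    )

# A specific OneToOne denial (level 1) beats a blanket AllToAll grant (level 9).
wiki_permissions = [(9, "edit", True), (1, "edit", False)]
print(has_ability(wiki_permissions, "edit"))
```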
Below is a list of all possible global abilities:
Below is a list of item types and the item abilities they introduce:
In order to implement permissions, Deme takes the currently authenticated Agent (anonymous or not), and decides whether it has the required ability to complete the requested action (or display some part of the view). Abilities are not just checked before doing actions, but they can also be used to filter out items on database lookups. For example, if my viewer is supposed to display a list of items I am allowed to see (because I have the view name ability), it will need to use permissions to filter out inappropriate results.
To modify a [X]ToOne permission, one must have the do_anything ability with respect to the target item. Similarly, to modify a [X]ToSome permission, one must have the do_anything ability with respect to the target collection. Finally, to modify a [X]ToAll permission, one must have the global do_anything permission.
However, there is a loophole in the setup described above. A user could simply create a collection, add a private item to that collection (because they have do_anything with respect to that collection), create a [X]ToSome permission for that collection (because they have do_anything with respect to that collection), and thus gain full access to the private item. In order to resolve this, we use the permission_enabled field in Membership. [X]ToSome permissions only propagate to members of the collection through memberships with permission_enabled=True, and agents can only modify the permission_enabled field of a membership if they have the do_anything ability with respect to the member item. By enforcing this, we guarantee that when a user modifies a [X]ToSome permission, it only affects items in the collection that were added to that collection with permission_enabled=True by a user that has power over that item. Since [X]ToSome permissions recursively traverse Memberships, we have a permission_enabled field in RecursiveMembership that is set to true if and only if there exists a path of memberships from the parent collection to the child item all with permission_enabled=True.
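The "if and only if there exists a fully enabled path" rule can be sketched as a graph traversal. This is an in-memory illustration under assumptions: memberships are `(collection, item, permission_enabled)` tuples and `recursive_memberships` is a hypothetical helper, not the code that maintains the RecursiveMembership table.

```python
def recursive_memberships(memberships):
    """Return {(ancestor, descendant): permission_enabled}.

    The flag is True only if at least one chain of memberships from
    ancestor to descendant has permission_enabled=True on every link.
    """
    edges = {}
    for coll, item, enabled in memberships:
        edges.setdefault(coll, []).append((item, enabled))

    result = {}
    for root in edges:
        stack = list(edges[root])
        while stack:
            node, enabled = stack.pop()
            key = (root, node)
            # Skip pairs already recorded at least as strongly; an enabled
            # path may still upgrade a pair first reached via a disabled one.
            if key in result and (result[key] or not enabled):
                continue
            result[key] = enabled
            for child, child_enabled in edges.get(node, []):
                stack.append((child, enabled and child_enabled))
    return result

# Item 3 is reachable from collection 1 via a disabled link (through 2)
# and via a fully enabled chain (through 4).
closure = recursive_memberships([
    (1, 2, True), (2, 3, False),
    (1, 4, True), (4, 3, True),
])
print(closure[(1, 3)])
```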
A viewer is a Python class that processes browser or API requests. Any URL that starts with /viewing/ is routed to a viewer (vanity URLs are also routed to viewers via ViewerRequests, but /static/ URLs and invalid URLs are not). Each viewer defines the item type it can accept, and multiple viewers can accept the same item type (you could have ItemViewer and SuperItemViewer which both handle items). There should be a default viewer for every item type with the same name as the item type (in lowercase), and if there is none, then the default viewer of the superclass should be used. Viewers that handle item type X always handle items that are in subclasses of X.
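The default-viewer fallback can be sketched by walking up the item type's ancestors until a registered viewer name is found. The classes and `default_viewer_name` below are hypothetical; in particular, using Python's method resolution order to pick among several parents is our assumption, since the text does not say how ties are broken under multiple inheritance.

```python
# Hypothetical stand-ins for item types.
class Item: pass
class Agent(Item): pass
class Person(Agent): pass

def default_viewer_name(item_type, registered_viewers):
    """Return the lowercase name of the nearest ancestor (or the type
    itself) that has a registered viewer, or None if there is none."""
    for cls in item_type.__mro__:
        name = cls.__name__.lower()
        if name in registered_viewers:
            return name
    return None

registered = {"item", "agent"}  # no dedicated "person" viewer registered
print(default_viewer_name(Person, registered))
```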
Our URLs are RESTful. Every URL defines a viewer, an action, a noun (or none for actions on the entire item type), a format, and optional parameters in the query string. Here are some example URLs:
Every viewer defines a set of actions it responds to. Actions are divided into two groups: those that take nouns (which are always item ids) called item actions, and those that do not take nouns called item type actions. In order to make URLs unambiguous, item ids must be numbers, and action names can only be letters (although we may later decide to allow other characters, such as underscores and dashes, or even numbers that do not appear at the beginning).
An action corresponds to a single Python function. If you visit /viewing/item/list, Deme will call the type_list method of the ItemViewer class. If you visit /viewing/person/5/show, Deme will call the item_show method of the PersonViewer class. Actions return the HTTP response to go back to the browser. Actions can call other actions from other viewers to embed views in other views (for example, the DocumentViewer could embed a view from the PersonViewer to show a little profile of the author at the top).
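The mapping from a URL to a method name can be sketched with a regular expression. This covers only the two URL shapes mentioned above and ignores the format and query string; the pattern, the restriction of viewer names to lowercase letters, and the `method_for` helper are all assumptions for illustration.

```python
import re

# viewer / optional numeric noun / action (letters only, as described above)
VIEWING_URL = re.compile(
    r"^/viewing/(?P<viewer>[a-z]+)/(?:(?P<noun>\d+)/)?(?P<action>[a-z]+)/?$"
)

def method_for(url):
    """Return (viewer, noun, method_name), or None if the URL does not match."""
    match = VIEWING_URL.match(url)
    if match is None:
        return None
    viewer, noun, action = match.group("viewer", "noun", "action")
    if noun is None:
        # No noun: an item type action, e.g. type_list
        return viewer, None, "type_" + action
    # Numeric noun: an item action on that item id, e.g. item_show
    return viewer, int(noun), "item_" + action

print(method_for("/viewing/item/list"))
print(method_for("/viewing/person/5/show"))
```

Because nouns are digits and action names are letters, the optional middle component is never ambiguous.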
Item actions take in a noun in the URL, which is the unique id of the item it acts upon. If viewers need more information (say I submitted a form that specified multiple people I wanted to add to a group), the data is passed in the query string or the HTTP post data, and the data required is up to the specific viewer. The only query string parameters that are reserved right now by convention are “version” (which specifies a specific version of the item the viewer is acting on) and “redirect” (which specifies the URL to return to after submitting the form on this page).
An additional parameter is passed in defining the response format, like HTML or XML. The default is HTML. Each action specifies a different behavior for each format it accepts. For example, in the “show” action, the “html” format will display a page showing everything about the item, while the “rss” format will render an RSS document with the latest action notices. Note that the format only specifies the response format. The request format (what the browser sends to the server) is always the same: all parameters encoded in the URL or the HTTP post data. We will only be using HTTP as the transport for viewers (although we can define things that accept emails and SSH and other protocols, they just won’t be called viewers).
Whenever a visitor (or another web service or bot) is at an action of a viewer, they have an authenticated AuthenticationMethod and, through that AuthenticationMethod, act as an Agent. If a visitor has not authenticated, they’ll be using AnonymousAgent. We will support various ways of authenticating via the different subclasses of AuthenticationMethod.
There is a DjangoTemplateDocument viewer right now, which accepts DjangoTemplateDocuments, and when viewed with the “render” action, it renders the DjangoTemplateDocument as HTML (or whatever format) straight back to the browser. This allows users to add web content that is not really tied to a viewer, so they can fully customize the user experience. By using DjangoTemplateDocuments and vanity URLs, a webmaster can use Deme to create a completely customized site that has no sign of Deme (unless a visitor specifically types in a /viewing/ or /static/ URL).
However, DjangoTemplateDocuments only allow the content to be customized, and not the things that a view does. For example, one cannot write a DjangoTemplateDocument to create a new record in the database, or to send out an email when visited, or more importantly, to do unauthorized things like execute UNIX commands.
Also, every HTML response from a viewer is rendered by inheriting from the default layout from the given site, so by modifying DjangoTemplateDocuments, one can change the look and feel of ordinary viewers to some extent.
Modules are self-contained collections of item types and viewers (and arbitrary Django code) that can be imported into any Deme project. They work just like Django apps, except by virtue of being in the modules/ directory they are registered into the Deme viewer framework. All of the item types discussed in this document are part of the Deme “core” (the cms/ directory). Modules cannot generally override or change functionality of existing parts of code (so you cannot add a button to a page rendered by ItemViewer). They can only add new functionality.
As described in the section on Subscriptions, Deme will email notifications for every action notice made on items that are subscribed to (in the future we will support other ContactMethods, like sending SMS notifications). The communication also goes the other way: if someone responds to a notification email (or sends an email to the address corresponding to a particular item), that will become a comment on Deme.
Deme uses the Bootstrap 3 front-end framework (http://getbootstrap.com).
Stylesheets (CSS) for Deme are generated from LESS files (http://lesscss.org). There are many ways to compile LESS files but the one used was CodeKit (http://incident57.com/codekit), a commercial software package that simplifies the process.
In order to allow easy upgrade of the Bootstrap framework files, files in /static/less/bootstrap should not be customized for Deme as any customization would be overwritten when upgrading Bootstrap. Instead, changes should be made to files in /static/less/deme.
The Bootstrap JavaScript is simply included as a minified JS file.