#?zml0.7 /setup.py install^^
This installs:
* a package named "rx" in the Python ^^site-packages^^ directory
* shell scripts or .bat files for running Raccoon and ZML in the Python scripts directory
* a directory named "rx4rdf" containing documentation, Rhizome pages, and other ancillary files in the Python share directory
!Running
Raccoon is an application server that runs a given application and is always run in conjunction with a config file that defines that application. (Currently, each Raccoon instance only supports one application.) To specify that application's config file, invoke Raccoon with the ^^-a^^ argument, e.g.:

^^/run-raccoon -a your-app-config.py^^

^^/run-raccoon -a your-app-config.py --app-specific-arg --another-app-specific-arg^^

^^run-raccoon^^ is a simple shell script or batch file that just invokes ^^python /site-packages/rx/raccoon.py^^.

This will launch Raccoon's built-in http server, which runs on port 8000 by default. You can change this by editing ^^server.cfg^^ (see below in Configuration). If you don't want to run the built-in http server, use the ^^-x^^ option and Raccoon will exit after it processes the application's start-up actions and command line arguments. (Note that any options directed towards Raccoon should appear before the ^^-a^^ argument.)

^^/run-raccoon -x -a your-app-config.py --app-specific-arg^^
!!Running multiple applications
A single instance of Raccoon can be used to run multiple applications with different base URLs or different hostnames. The functionality is provided by a simple Raccoon application (called ^^root-config^^) that dispatches incoming HTTP requests to the appropriate application. To use it, Raccoon is launched with the ^^root-config^^ application, which in turn can take an XML configuration file as a command-line argument.
A typical invocation would look like:

^^run-raccoon -a root-config.py -s server.xml^^

Below is an example of a ^^server.xml^^ configuration file:
p""" foo.com foo.org foo.bar.org """
For more information, see the comments in ^^root-config.py^^ found in the rhizome directory.
!!Running with Apache
Although Raccoon comes with a built-in web server, it can be configured to run behind an Apache web server. There are several different ways to do this. The Raccoon HTTP server is derived from the CherryPy server and a how-to on running with Apache can be found on its website [here|http://cp1.cherrypy.org/static/html/howto/node3.html].

For example, this script kicks off Rhizome when invoked from an Apache fastcgi script:
p'''#!/bin/tcsh
python2.2 /lib/python2.2/site-packages/rx/raccoon.py -s /home/rhizome/site/RootServer.cfg -l -a /home/rhizome/site/site-config.py >& /dev/null &'''
!Configuration
There are two config files Raccoon depends on: the application config file and its built-in http server's config file. In addition, it optionally uses a config file for controlling logging.

The application config is specified using the '-a' command line argument. If that is missing, Raccoon tries to load a file called 'Raccoon-default-config.py' in the current directory. For the application config file settings, see [RaccoonConfig] for complete documentation. The Rhizome config file found at ^^rhizome/rhizome-config.py^^ is also commented.

The http server config is specified using the '-s' command line argument. If that is missing, Raccoon tries to load a file called 'server.cfg' in the current directory. The Raccoon HTTP server is derived from the CherryPy server and documentation on its configuration settings can be found on its website [here|http://cp1.cherrypy.org/static/html/tut/node17.html#SECTION0017200000000000000000]. In addition, Raccoon adds a few settings, which are documented in the sample ^^server.cfg^^ found in the ^^rhizome^^ directory.
The logging config file is specified using the '-l' command line argument. If '-l' isn't followed by a file path, Raccoon tries to load a file called 'log.config' in the current directory. Raccoon uses the standard log facilities available in Python 2.3 (and uses a back-ported version of the logging library when running on Python 2.2) and the config file format is documented in the Python 2.3 library documentation. The sample log.config file that ships with Raccoon lists the names of the available loggers.
!Command Line Arguments
p'''raccoon.py [Raccoon options] -a config.py [app. config specific options]'''
Raccoon options:
p'''
-h prints help message
-s [server.cfg] specify an alternative server.cfg
-l [log.config] specify a config file for logging
-r record requests (ctrl-c to stop recording)
-d [debug.pkl]: debug mode (replay the requests saved in debug.pkl)
-x exit after executing the application specific command-line arguments
-p specify the path (overrides RACCOONPATH env. variable)
-m [store.nt] load the RDF model (default model supports .rdf, .nt, .mk)
-a [config.py] run the application specified
'''
!Architecture
!!Abstract Processing model
Let's start with a 10,000 foot view of how Raccoon works. Fundamentally, Raccoon responds to requests. A //Request// consists of a dictionary of arbitrary name-value pairs: the //request metadata//. Raccoon passes each request through a sequence of //Actions//. An //Action// has two parts: one or more //match expressions// and an //action function// that is invoked if the request metadata matches one of the match expressions. The action function returns a value which is passed on to the next Action in the sequence. The final return value is the response to the request. An Action may also modify request metadata.
!!A Little Less abstract
Once the abstract model is clear, let's descend a few thousand feet. The core Raccoon class that handles requests is the //RequestProcessor//.
Each RequestProcessor instance is associated with an RDF or XML data store. This data store is presented to Raccoon as an [rx4rdf:RxPath] DOM (Document Object Model) for RDF, or a standard DOM for XML (but note that this manual assumes the data store is RDF). Raccoon maps the request metadata to XPath variables, and an Action's match expressions are RxPath expressions that are evaluated against the data model. Besides executing a function that returns a value, an Action may instead change the current context node used when evaluating relative RxPath expressions.

Each RequestProcessor is associated with an application config file that defines the action sequences for each type of request. Raccoon supports a few built-in types of requests, such as HTTP, XML-RPC, and command-line, each of which defines how that type of request should be mapped to the request metadata. The application can also define its own request types. The application config file is simply a Python script that gets executed when the RequestProcessor is created.

Raccoon provides some building blocks for defining an application's actions. In particular, it provides a set of //ContentProcessors//, which are Action functions for transforming content, along with a framework for adding additional ContentProcessors. Built-in content processors include XSLT, [rx4rdf:RxSLT], Python, and [rx4rdf:RxUpdate], among others. In addition, Raccoon has a sophisticated framework for caching request processing, including content processor actions.

Raccoon also provides various facilities for building applications, including:
* transactional updates and merging of the model
* an authorization framework for processing content and for accessing the model and request metadata
* XPath extension functions for accessing application facilities from contexts such as XSLT, RxPath, RxUpdate, etc.
* request invocation through URL resolution or Python function calls
* maintaining session state
!An Example Application
Here's a simple, crude, but complete application config file:
p"""
actions = { 'http-request' : [
        Action( ["string(/*[rdfs:label=$_name]/rdfs:comment)", "'not found!'"],
          lambda matchResult, requestMetadata, contextNode, retVal: ''+matchResult+'' ),
    ]
}

APPLICATION_MODEL=''' "foo" . "hello!" . '''
"""
If you run Raccoon with this config file (e.g. ^^/run-raccoon -a example-config.py^^) and type ^^http://localhost:8000/foo^^ in your browser, you'll see "hello!"; any other path will give you a page that says "not found!".

This config defines two variables: ^^actions^^ and ^^APPLICATION_MODEL^^.

^^APPLICATION_MODEL^^ is a string of [NTriples|http://www.w3.org/TR/rdf-testcases/#ntriples] containing statements that are added to the application RDF model but are read-only and not saved in the model store. Typically it is used alongside the writable model for structural components such as the schema, but in this simple example it will be the entire model.

^^actions^^ is a dictionary that is the heart of an application running on Raccoon. Here we define an action sequence for the 'http-request' trigger that consists of just one Action. The action has two match expressions; the first finds a resource that has an "rdfs:label" property that matches the path of the request URL. If no match is found, the second one will be evaluated, which always returns the string "not found!" (actually this isn't really needed, since if no action is invoked Raccoon will invoke a default "not found" handler). The Action's function is simple: it just returns the result of the match expression, wrapping it in an HTML template.
!Component Reference
!!Requests
A request consists of a dictionary whose keys are strings that can be used as Python identifiers.
Its values can be arbitrary but generally should consist of strings, numbers, and lists of strings or unique DOM nodes (XPath nodesets).

When evaluating a //match expression// or a //ContentProcessor// that relies on XPath (such as the XSLT, RxSLT, or RxUpdate content processors), this dictionary is mapped to a set of XPath variables. For simple data types that correspond to XPath data types the mapping is straightforward. Values of other types are mapped to XPath variables in ways specific to the type of request trigger; for example, dictionaries may be mapped to variables in a particular namespace and other values excluded from the mapping.

When mapping strings the following rules apply: first, lists of strings are converted to a nodeset of Text nodes, and, second, since XPath and the RDF model both treat strings as unicode, non-unicode strings are converted to unicode with the assumption that they are either UTF-8 encoded or 7-bit ASCII. It is the responsibility of the request processor to ensure that strings are encoded properly. However, the http request processor doesn't currently check the character encoding of the request, so the application must make sure that request form data is sent as UTF-8. The easiest way to ensure this is to add an "accept-charset='UTF-8'" attribute to form elements.

For all requests, Raccoon will add these variables to the request metadata:
+\__server\__=the ^^RequestProcessor^^ object that is processing the current request.
+\__requestor\__=a ^^Requestor^^ object that can invoke requests as function calls.
In addition, when a request dictionary is mapped to XPath variables, the following XPath variables are added:
+__STOP=When an action's match expression evaluates to this variable, the RequestProcessor will stop evaluating the rest of the action's match expressions.
+_APP_BASE=The value of the ^^appBase^^ config setting
+BASE_MODEL_URI=The value of the ^^BASE_MODEL_URI^^ config setting
+__store=A reference to the application store's root Document node. Using this variable allows RxPath expressions to be executed when the store isn't part of the XPath context, for example when executing a regular XSLT stylesheet.

__Request Triggers__

Raccoon invokes the following request triggers (or request types). In addition, an application can invoke its own triggers using the ^^runActions^^ method -- the application just needs to add the trigger name as a key in its ^^actions^^ dictionary. Requestor objects and the "site:" URL resolvers default to invoking the 'http-request' trigger, but this can be changed globally by setting ^^DEFAULT_TRIGGER^^ (for an application not designed to respond to HTTP requests, for example) or in a specific context by assigning the (thread-local) ^^RequestProcessor.currentRequestTrigger^^ property.
+^^load-model^^=The load-model request is invoked when a model is first loaded, usually when the Raccoon application is initialized. No specific request metadata is set.
+^^run-cmds^^=This is invoked after ^^load-model^^ using the command line arguments specified after the '-a' argument. Each argument that starts with '-' is added to the request metadata (with the leading '-'s stripped). If the next argument doesn't also start with a '-' it will be treated as the value of that variable (as a string). If it does, the value of the prior variable is set to the boolean True. In addition, the whole command line is assigned to the variable named '_cmdline'.
+^^http-request^^=HTTP requests are mapped to request metadata as follows:
* URL query parameters and form variables (with HTTP POST requests) are mapped to request keys of the same name. There is no distinction between form variables and query parameters. If a parameter or form variable appears multiple times in a request, its values are combined into a list.
If the body of an HTTP POST request is an XML-RPC payload then each XML-RPC parameter will be mapped to request keys named '0', '1', '2', etc. (starting with '0').
:In addition, these special keys will be added:
*^^_name^^ is the name or path of the request; the part of the request path after the base of the URL that the application is running on (as set by the 'appBase' config setting). If the resulting path is empty or "/", _name will be set to the value of the ^^defaultPageName^^ server setting (default 'index').
*The ^^_request^^ object has the following properties:
++browserUrl=The absolute URL of the request. This is mapped to an XPath variable named "_url".
++browserBase=This is the part of the browserUrl which is excluded from the "_name". This is mapped to an XPath variable named "_base-url".
++browserPath=This is the rest of the browserUrl after browserBase up to the query part; usually the same as '_name', except when the path has a trailing slash or the URL maps to the default name (e.g. 'index') or a subdomain (if "virtual" domains are configured). This is mapped to an XPath variable named "_path".
++browserQuery=This is the part of the browserUrl after the "?". This is mapped to an XPath variable named "_url-query".
++method=The request's HTTP method (e.g. "GET", "POST"). This is mapped to an XPath variable named "_method".
++headerMap=This is a dictionary of all the HTTP request headers. Each header is mapped to an XPath variable in the namespace "http://rx4rdf.sf.net/ns/raccoon/http-request-header#" (default prefix: "request-header").
++simpleCookie=This is a dictionary (actually a Cookie.SimpleCookie object) of all the cookies in the request. Each cookie is mapped to an XPath variable in the namespace "http://rx4rdf.sf.net/ns/raccoon/request-cookie#" (default prefix: "request-cookie").
*The ^^_response^^ object has the following properties:
++headerMap=This is a dictionary of all the HTTP response headers. Each header is mapped to an XPath variable in the namespace
"http://rx4rdf.sf.net/ns/raccoon/http-response-header#" (default prefix: "response-header").
++simpleCookie=This is a dictionary (actually a Cookie.SimpleCookie object) of all the cookies in the response. Each cookie is mapped to an XPath variable in the namespace "http://rx4rdf.sf.net/ns/raccoon/response-cookie#" (default prefix: "response-cookie").
*^^_session^^ is a dictionary that is associated with a session cookie; when that cookie is part of the request headers, the associated session values are added to the _session dictionary. The _session dictionary is present with every http request; however, a session cookie is not set until the first time a value is assigned to the dictionary. Each item in the session dictionary is mapped to an XPath variable in the namespace "http://rx4rdf.sf.net/ns/raccoon/session#" (default prefix: "session").
:The content-type, content-length, and status response headers will be set automatically if not set by the request handler. Also, if the "useEtags" config option is set, an etag based on an MD5 digest of the response will be set for every response and the "If-None-Match" request headers honored. If no Actions are invoked or the result of processing the request is None, the ^^RequestProcessor.default_not_found^^ method will be used to generate the response.
+//error triggers//=When an exception is raised while processing a request, Raccoon will search for a request trigger named //trigger name//^^-error^^ (e.g. ^^http-request-error^^). If found, the associated request Action sequence is invoked with a dictionary named '_errorInfo' added to the request metadata. Each item in this dictionary is mapped to an XPath variable in the namespace "http://rx4rdf.sf.net/ns/raccoon/error#" (default prefix: "error"). It will contain the following items:
++message=The message associated with the exception.
++name=The name of the exception (class).
++module=The name of the module the exception class was defined in.
++errorCode=The error code associated with the exception or an empty string (checks if the exception has an attribute named "errorCode").
++fileName=The file path to the source that raised the exception.
++lineNumber=The line number in the source file that raised the exception.
++functionName=The name of the function that raised the exception.
++text=The line in the source file that raised the exception.
++details=A multi-line string containing the full stack trace of the raised exception.
:Also in the '_errorInfo' dictionary, but not available as XPath variables, are "type", "value", and "traceback", which are the objects returned by the standard Python function ^^sys.exc_info()^^.
:The error handler will not be invoked if an exception occurs while already in an error request, or if the request metadata contains a variable named '_noErrorHandling' with a non-zero value.

The following triggers are invoked when updating the data store and committing a transaction:
+^^before-add^^=This trigger is invoked before a statement is to be added to the data store. The following variables will be available:
++_added=An XPath nodeset containing the predicate element node representing the statement that is about to be added.
++_isnew=A boolean indicating whether the subject resource of the statement was not present in the data store prior to the current transaction.
++_newResources=An XPath nodeset containing resource element nodes representing all the new resources added during this transaction so far.
+^^before-remove^^=This trigger is invoked before a statement is to be removed from the data store. The following variables will be available:
++_removed=An XPath nodeset containing the predicate element node representing the statement that is about to be removed.
++_isnew=A boolean indicating whether the subject resource of the statement was not present in the data store prior to the current transaction.
++_newResources=An XPath nodeset containing resource element nodes representing all the new resources added during this transaction so far.
+^^before-new^^=This trigger is invoked whenever a statement is added to the data store and its subject is a resource not previously present in the data model. The following variables will be available:
++_newResources=An XPath nodeset containing the resource element node representing the new resource.
+^^before-prepare^^=This trigger is invoked immediately prior to attempting to commit a transaction. At this point, the application can still make modifications to the data store within this transaction. The following variables will be available:
++_added=An XPath nodeset containing predicate element nodes representing all statements added in this transaction.
++_removed=An XPath nodeset containing predicate element nodes representing all statements removed in this transaction.
++_newResources=An XPath nodeset containing the new resource element nodes added in this transaction.
+^^before-commit^^=This trigger is invoked after all the transaction participants have voted to commit the transaction. The request metadata contains the final set of changes made during the transaction, as the transaction cannot be further modified. However, the transaction can still be aborted at this point, so this is a good time to execute any data validation routines. The request metadata will contain the same variables as the ^^before-prepare^^ trigger.
+^^after-commit^^=This trigger is invoked after a transaction has been committed successfully. The request metadata will contain the same variables as the ^^before-prepare^^ trigger.
!!XPath extension functions
Raccoon makes available a few useful XPath extension functions that can be used in Action match expressions, XSLT and RxSLT stylesheets, and RxUpdate scripts. They live in the 'http://rx4rdf.sf.net/ns/raccoon/xpath-ext#' namespace (default prefix: 'wf').
In addition, your application can add its own extension functions by setting the ^^extFunctions^^ config variable.

The most crucial of Raccoon's extension functions are the ones for retrieving and setting request metadata. Although request metadata is mapped to XPath variables, XPath is side-effect free, so its variables can't be modified. Furthermore, it is a static language, so it is an error to reference variables that don't exist (though with Action match expressions such errors just mean the expression didn't match) and there is no way to delete a variable. Because of these limits, Raccoon provides the following XPath functions:
+//boolean// wf:has-metadata(//string//)= This function returns true if the request metadata contains a property with the given QName; otherwise it returns false.
+//object// wf:get-metadata(//string//, //object//?)= This function returns the value of the request metadata property that matches the given QName; if there is no match, it returns the value of the second parameter, or false if that parameter is omitted.
+//object// wf:assign-metadata(//string//, //object//)= This function adds or replaces the given request metadata property's value with the second parameter. It returns the second parameter. Note that this will not modify any XPath variables -- if replacing an existing property, the equivalent XPath variable will continue to have the old value, and no new XPath variables will be added. However, the context of the next Action invoked will reflect this modification.
+//boolean// wf:remove-metadata(//string//)= This function removes the given property from the request metadata. It returns true if the property was present; otherwise it returns false.
See raccoon.py for documentation on the rest of the XPath extension functions.
!!Content Processors
Raccoon provides a set of //ContentProcessors//, which are Action functions for transforming content, along with a framework for adding additional ContentProcessors.
Built-in content processors include XSLT, [rx4rdf:RxSLT], Python, and [rx4rdf:RxUpdate], among others. In addition, Raccoon has a sophisticated framework for caching request processing, including content processor actions.

Your application uses Content Processors by creating an Action that uses the RequestProcessor.processContents method as its action function. For example:
p"""
actions = { 'http-request' : [
        Action( ["string(/*[rdfs:label=$_name]/rdfs:comment)"],
          #result, kw, contextNode, contents
          lambda matchResult, requestMetadata, contextNode, retVal: matchResult ),
        Action(['/*[rdfs:label=$_name]/wiki:format/*'], __server__.processContents)
    ]
}
"""
^^\__server\__^^ is a config variable which references the application's RequestProcessor. ^^processContents^^ assumes that the result of the match expression can be converted to a string that names the Content Processor to be invoked. The RequestProcessor has a dictionary of Content Processors and the application can add its own by setting the ^^contentProcessors^^ config variable.

The contentProcessors functions have the same signature as Action functions and act on the return value of the previous Action. They return either a string or a file-like object that is the result of processing, or a tuple containing the result and the name of the next Content Processor to be invoked. Returning a tuple allows a content processing pipeline to be dynamically constructed.

Raccoon supports the following content processors:
+Python (http://rx4rdf.sf.net/ns/wiki#item-format-python)=The content is treated as a [Python|www.python.org] script that is executed when the page is requested. Anything the code writes to ^^stdout^^ is captured and sent as the response. The following are exposed as local variables to the script:
++^^\__server\__^^=(see Requests above)
++^^\__requestor\__^^=(see Requests above)
++^^\__kw\__^^=The request metadata dictionary.
The script can modify the request metadata by assigning and deleting items from this dictionary.
:The script can set another Content Processor to invoke on the result by assigning "_nextFormat" in ^^\__kw\__^^ to the name of that content processor.
+Base64 (http://www.w3.org/2000/09/xmldsig#base64)=The content is treated as base64 encoded text and decoded.
+RxSLT (http://rx4rdf.sf.net/ns/wiki#item-format-rxslt)=The content is treated as an RxSLT stylesheet and invoked using the application's model as its source DOM. If the stylesheet's output method is HTML or XML, it will use the HTML/XML content processor as the next content processor unless the stylesheet overrides this by using ^^wf:assign-metadata^^ to set the '_nextFormat' variable to the name of the desired content processor.
+RxUpdate (http://rx4rdf.sf.net/ns/wiki#item-format-rxupdate)=The content is treated as an [rx4rdf:RxUpdate] document that is invoked to update the application's model.
+Text (http://rx4rdf.sf.net/ns/wiki#item-format-text)=Just returns the content unmodified
+Binary (http://rx4rdf.sf.net/ns/wiki#item-format-binary)=Just returns the content unmodified
+HTML/XML (http://rx4rdf.sf.net/ns/wiki#item-format-xml)=This content processor replaces any attribute that contains a URL that starts with 'site:' with a live URL based on the request context. Using 'site:' links allows links on a page to work even if the page moves to another directory or is transformed in other contexts (such as statically exported). The 'site:' URL scheme is similar to the 'file' scheme: URLs that start with 'site:/\//' will be relative to the root of where the Rhizome site is running from, while URLs that start only with 'site:' will be relative to the current location of the page.
:This content processor will return the value of the last encountered ^^raccoon-format^^ XML processing instruction, if any, as the next content processor, unless it encountered the ^^raccoon-ignore^^ XML processing instruction, in which case none will be returned.

Raccoon also provides a couple more ContentProcessors that aren't enabled by default -- these are not included because they require some application-specific information. For example, there's an XSLT content processor which is nearly the same as the RxSLT content processor except it also needs a source document to transform in addition to the stylesheet contents.
!!Authentication and Authorization
Raccoon doesn't supply any facilities for authentication. It is expected that the application will provide that, for example, by placing a "user" variable in the session dictionary after authentication has occurred or by examining HTTP request headers in the request pipeline.

Raccoon provides several hooks to enable an application to authorize various operations before they occur, including when updating the model, processing content, executing an XPath extension function, and modifying request metadata. See the [configuration settings|RaccoonConfig] for more info.
!!Caching
Raccoon uses several MRU (most recently used) caches:
+XPath parser cache=This caches XPath expression strings so they don't need to be repeatedly parsed.
+XPath processing cache=This caches the result of evaluating an XPath expression. Certain XPath extension functions have side effects or cannot be analyzed for dependencies, and so any XPath expression that references such a function is not cacheable. You can declare additional XPath functions as not cacheable by setting the NOT_CACHEABLE_FUNCTIONS config setting.
+stylesheet parser cache=This caches XSLT and RxSLT stylesheets so they don't need to be repeatedly parsed.
+Action cache=This caches the result of executing an Action.
For an action to be cacheable you must assign it a CacheKeyPredicate. Raccoon provides cache predicates for caching RxSLT and XSLT actions. See the documentation on the Action class for more details.
+file cache=This caches files retrieved by using ^^file^^ and ^^path^^ URLs. You can configure it to skip caching files that exceed a certain size. (Disabled by default.)
Both the Action cache and XPath processing cache are invalidated when the underlying model is loaded (or reloaded). The other caches last for the lifetime of the Python process.

The Raccoon caching model is a little unusual in that it doesn't rely on explicit or proactive cache invalidation. Instead it works on the principle that each time we do a cache lookup we can generate a key based on the aspects of the current state of the system that uniquely determine the cache value. Thus when the relevant system state changes, the lookup will fail. For example, for the XPath processing cache, the lookup value will be an XPath expression, but the key stored in the cache will be a tuple consisting of the XPath expression, the values of any variables referenced by the expression, and a revision counter representing the state of the model. When the model is updated the revision counter changes and subsequent cache lookups for that expression will result in a cache miss, as the lookup key will now include the new revision counter. Eventually the old cache entry, with the old revision counter as part of its key, will be flushed out as the MRU cache fills up.

Raccoon's caches have a mechanism for handling side effects that may occur when generating a value that will be cached. For example, the XPath processing cache and the XSLT/RxSLT processing caches keep track of calls to ^^wf:assign-metadata^^ and ^^wf:remove-metadata^^ so that the changes they make to the request metadata can be repeated when subsequent requests result in the value being retrieved from the cache.
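The state-keyed lookup described above can be illustrated with a minimal sketch. This is not Raccoon's cache code: the class ^^StateKeyedCache^^, the helper ^^make_key^^, and the eviction policy (dropping the least recently used entry once a fixed capacity is exceeded) are assumptions made for illustration only.

```python
from collections import OrderedDict


class StateKeyedCache:
    """Hypothetical bounded cache; entries are never invalidated explicitly."""

    def __init__(self, capacity=2):
        self.capacity = capacity
        self._entries = OrderedDict()

    def get_or_compute(self, key, compute):
        if key in self._entries:
            self._entries.move_to_end(key)  # mark the entry as recently used
            return self._entries[key]
        value = compute()
        self._entries[key] = value
        if len(self._entries) > self.capacity:
            # Stale entries (old revisions) are eventually pushed out here.
            self._entries.popitem(last=False)
        return value


def make_key(expression, variables, revision):
    # The key combines everything the result depends on: the expression text,
    # the values of the variables it references, and the model revision.
    return (expression, tuple(sorted(variables.items())), revision)


evaluations = []


def evaluate():
    # Stand-in for actually evaluating the XPath expression.
    evaluations.append(1)
    return "hello!"


cache = StateKeyedCache(capacity=2)
expr = "string(/*[rdfs:label=$_name]/rdfs:comment)"

key_rev1 = make_key(expr, {"_name": "foo"}, revision=1)
cache.get_or_compute(key_rev1, evaluate)  # miss: evaluates the expression
cache.get_or_compute(key_rev1, evaluate)  # hit: same expression, variables, revision

# After a model update the revision changes, so the same expression misses.
key_rev2 = make_key(expr, {"_name": "foo"}, revision=2)
cache.get_or_compute(key_rev2, evaluate)
```

Because the revision counter is part of the key, nothing has to track which entries an update affects; the cost is that stale entries occupy space until they are evicted.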
!!Loading and updating models
When Raccoon starts, after it has read the application's config file, it attempts to load the application's model. If the ^^-m^^ command line argument is present, Raccoon will attempt to load that file into the RxPath DOM. Otherwise it will attempt to load from the file path specified by the ^^STORAGE_PATH^^ config variable. If neither file path was specified, or if the given file path does not exist, it will load the model from the config variable ^^STORAGE_TEMPLATE^^ (which is a string of [NTriples|http://www.w3.org/TR/rdf-testcases/#ntriples]). In this case, the next time the model is saved, the file will be created at the given path location unless no path was specified, in which case the application will run in read-only mode and the model will not be updated.

After the model is loaded, any statements in the ^^APPLICATION_MODEL^^ config variable are added to the model. ^^APPLICATION_MODEL^^ is a string of [NTriples|http://www.w3.org/TR/rdf-testcases/#ntriples] containing statements that are read-only and not saved with the model. Typically it is used alongside the writable model for structural components such as the schema.

Raccoon loads the model using the class or factory function specified by the ^^domStoreFactory^^ config variable. The function must return an object that conforms to the ^^DomStore^^ interface. Raccoon provides two ^^DomStore^^ implementations: ^^XMLDomStore^^ for using an XML file as the datastore and ^^RxPathDomStore^^ (the default), which exposes an RDF model as an RxPath DOM. Once the DomStore is created, its ^^loadDom^^ method is called, passing both the value of the ^^-m^^ argument or ^^STORAGE_PATH^^ and (as the default model) a stream containing the NTriples stored in STORAGE_TEMPLATE.

^^RxPathDomStore^^ can rely on different RDF libraries for its underlying RDF store, depending on the class or factory function passed to its constructor.
The default is ^^RxPath.IncrementalNTriplesFileModel^^, which can load an RDF file in any of the supported RDF formats, but always saves the model as a special NTriples file that acts as a transaction log. This allows the model to be incrementally updated with appends. Also available are ^^RxPath.RedlandHashBdbModel^^, which loads and saves to Berkeley DB files created by Redland (requires [Redland|http://www.redland.opensource.ac.uk/] to be installed), and ^^RxPath.RDFLibFileModel^^, which loads and saves to the specified RDF/XML file (requires [RDFLib|http://rdflib.net]).

Below is an example from an application config file demonstrating how to use one of these alternative RDF libraries:

p"""
def createRedlandDomStore():
    from rx import RxPath, DomStore
    return DomStore.RxPathDomStore(RxPath.initRedlandHashBdbModel)

domStoreFactory = createRedlandDomStore
"""

4Suite, Redland, and RDFLib each, in turn, support several other RDF stores (including various SQL databases), and alternative ^^domStoreFactory^^ functions can be written to use these fairly easily.

!!Transactions
Each top-level request (that is, a request not invoked within another request; in practice, an external request such as an HTTP request or a command line invocation) has a separate transaction context associated with it. Any modifications to the data store or other transactional object will be part of that transaction context. This transaction context is managed by a transaction coordinator that implements a two-phase commit protocol, so if any modification to an object participating in the transaction fails, all changes to all transaction participants will be rolled back. Any object that implements the ^^TransactionParticipant^^ interface can participate in a transaction; the ^^DomStore^^ interface is derived from this, and Raccoon also provides an implementation for transactionally manipulating the file system (^^TxnFileFactory^^).
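The all-or-nothing behaviour of a two-phase commit can be sketched as follows. This is a generic illustration, not Raccoon's coordinator: the ^^Participant^^ class and its ^^vote^^, ^^commit^^ and ^^abort^^ methods are hypothetical names and do not reproduce the actual ^^TransactionParticipant^^ interface.

```python
class Participant:
    """Hypothetical stand-in for an object taking part in a transaction."""

    def __init__(self, name, fail_on_vote=False):
        self.name = name
        self.fail_on_vote = fail_on_vote
        self.pending = []     # changes made inside the current transaction
        self.committed = []   # changes that survived a commit

    def modify(self, change):
        self.pending.append(change)

    def vote(self):
        # Phase 1: a participant raises an exception to veto the commit.
        if self.fail_on_vote:
            raise RuntimeError(f"{self.name} cannot commit")

    def commit(self):
        # Phase 2: make the pending changes permanent.
        self.committed.extend(self.pending)
        self.pending.clear()

    def abort(self):
        # Roll back by discarding the pending changes.
        self.pending.clear()


def run_transaction(participants):
    """Commit every participant or none of them; return True on commit."""
    try:
        for participant in participants:
            participant.vote()
    except Exception:
        for participant in participants:
            participant.abort()
        return False
    for participant in participants:
        participant.commit()
    return True


store = Participant("store")
files = Participant("files", fail_on_vote=True)
store.modify("add statement")
files.modify("write file")
failed_result = run_transaction([store, files])

files.fail_on_vote = False
store.modify("add statement")
files.modify("write file")
ok_result = run_transaction([store, files])
```

In the first run the veto from ^^files^^ also discards the change made to ^^store^^, even though ^^store^^ itself was willing to commit.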
After the actions associated with a request are run, the transaction is committed if any of the transaction's participants have been modified. Application code can mark a transaction as read-only by adding a variable named ^^__readOnly^^ to the request metadata; if it is present and evaluates to a non-zero value, the transaction will be aborted if any modification occurred.

During the commit process three triggers will be invoked: one before the commit process begins (^^before-prepare^^), another after all the transaction participants have voted to commit (^^before-commit^^), and a third after the commit has succeeded (^^after-commit^^). See the above section on request triggers for more details. During the first trigger, modifications can still be made to the transaction; this trigger could be used to, for example, transactionally update a full text index based on changes to the data store. By the time the second trigger is invoked, the changes to the transaction are finalized but the transaction can still be aborted; at this point schema validation could occur. With the third trigger, the transaction has completed successfully; this could be used to implement notifications about changes to the application.

Besides allowing an application to modify the data store's DOM directly, Raccoon provides a few methods for updating the data store. At the lowest level is ^^updateDOM^^, which takes a list of statements to add and remove. Also available is ^^updateStoreWithRDF^^, which parses RDF and updates the model either by adding new statements or, if a list of resources to replace is specified, by merging in the difference between the specified RDF and the resource list. Finally, there is the ^^xupdateRDFDom^^ method, which processes an [rx4rdf:RxUpdate] document and is used by the RxUpdate content processor.
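One plausible reading of "merging in the difference" for a list of replaced resources is a set difference over statements, sketched below. The function ^^diff_for_resources^^ is hypothetical and is not part of Raccoon; statements are modelled as plain (subject, predicate, object) tuples, and the exact semantics of ^^updateStoreWithRDF^^ may differ.

```python
def diff_for_resources(current, incoming, resources):
    """Return (to_add, to_remove) for replacing the given subjects.

    current and incoming are sets of (subject, predicate, object) tuples;
    resources lists the subjects whose statements are being replaced.
    """
    resources = set(resources)
    # New statements are added whether or not their subject is being replaced.
    to_add = incoming - current
    # Only statements about replaced subjects that are absent from the
    # incoming RDF are removed; other resources are left untouched.
    to_remove = {s for s in current if s[0] in resources} - incoming
    return to_add, to_remove


current = {
    ("a", "name", "old"),
    ("a", "type", "Page"),
    ("b", "name", "b"),
}
incoming = {
    ("a", "name", "new"),
    ("a", "type", "Page"),
}
to_add, to_remove = diff_for_resources(current, incoming, ["a"])
updated = (current - to_remove) | to_add
```

The resulting add and remove sets are the kind of input a low-level update call that takes statements to add and remove would expect.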
#todo: describe global process lock, update statement
#todo!!request invocation: site: urls, requestor, call-actions
#todo!!Security

!!Request invocation step-by-step
To illustrate how Raccoon works, let's walk through the steps it takes to serve a page in Rhizome.

1. Run ^^/run-raccoon -a rhizome-config.py^^
11. Raccoon executes the rhizome-config.py file, setting all the variables declared there, including:
111. ^^PATH^^, which specifies the directory search path for Raccoon.
111. ^^STORAGE_TEMPLATE^^, which specifies (in [NTriples|http://www.w3.org/TR/rdf-testcases/#ntriples]) the site structure for a new site. There's a helper function ^^\__addItem\__^^ to update this variable and the line ~~ ^^\__addItem\__('index',loc='path:index.txt', format='zml', disposition='entry', owner=None)^^ ~~ adds a page named 'index' with the assertion that its contents are stored in "path:index.txt".
111. Perhaps most importantly, the config file must set a dictionary called ^^actions^^ that maps trigger names to a pipeline of Actions that are invoked in order (see Request Triggers above).
11. Raccoon next tries to load the model store specified by ^^STORAGE_PATH^^ or with the ^^-m^^ option. If it is not found, it creates a new store using ^^STORAGE_TEMPLATE^^.
11. If there were any command line arguments listed after the '-a', they are mapped to a dictionary of XPath variables and the 'run-cmd' Action pipeline is invoked with them.
11. Raccoon starts its HTTP server, listening on port 8000 by default.
1. Retrieve http://localhost:8000/index with your browser.
11. Raccoon maps the HTTP request to XPath variables, using different namespaces for the request and response headers, query parameters, cookies, etc.
11. Raccoon then invokes the Action pipeline associated with 'handle-request'. For each Action, Raccoon evaluates each of its RxPath queries until it finds a match and then passes the result on to the next Action. The result of the final Action is returned to the browser.
111. The first Action (^^findResourceAction^^) will match this expression ^^/a:NamedContent\[wiki:name=$_name]^^, selecting the resource added to the template by the above ^^\__addItem\__^^ call.
111. Skipping ahead to the fifth Action (^^findContentAction^^), our request will match ^^wf:openurl(.\//a:contents/a:ContentLocation)^^. Since ^^.\//a:contents/a:ContentLocation^^ points to the "path:index.txt" a:ContentLocation resource, ^^wf:openurl^^ will retrieve that URL.
111. To resolve the URL, Raccoon uses its custom URL resolver, which supports the internal 'path:' scheme. This scheme resolves to the first matching file found in the directories listed in Raccoon's ^^PATH^^ config variable. For example, if you are running site-config.py, Raccoon will find the index.txt in the site directory before it finds the index.txt in the rx/rhizome directory. Using path: URLs provides both security (by restricting access to the file system) and modularity: you don't need to modify Rhizome directly, which makes it easy to upgrade Rhizome.
111. The next Action transforms the content based on its format (ZML, XSLT, etc.) and the final Action looks for any template pages that may be applied to it. This Action may invoke another series of Actions, and so on. For example, if the page isn't HTML it will try to find an appropriate stylesheet, and those results may be further transformed by the site-template XSL script.
111. Finally, we've processed all the Actions and the results are delivered to the browser.

!Debugging Raccoon
Raccoon provides a simple debugging facility that is useful for debugging or creating test scripts: launch it with the ^^-r^^ option and it will record all the incoming requests until you hit control-c, which triggers it to write them to a file named debug.pkl. Once you have one of these files, you can launch Raccoon using the ^^-d^^ option, which causes Raccoon to replay the requests stored in that file instead of launching its HTTP server.
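The record-and-replay idea can be sketched in a few lines. This is only an illustration of the technique: the ^^RequestRecorder^^ class, the ^^replay^^ function, and the dictionary request format are hypothetical, and the use of Python's ^^pickle^^ module is an assumption based solely on the ^^.pkl^^ file extension; the actual layout of Raccoon's debug.pkl is not documented here.

```python
import os
import pickle
import tempfile


class RequestRecorder:
    """Hypothetical wrapper that logs each request before handling it."""

    def __init__(self, handler):
        self.handler = handler
        self.recorded = []

    def handle(self, request):
        self.recorded.append(request)
        return self.handler(request)

    def save(self, path):
        # Write the recorded requests out, as would happen on control-c.
        with open(path, "wb") as f:
            pickle.dump(self.recorded, f)


def replay(path, handler):
    """Feed previously recorded requests to the handler without a server."""
    with open(path, "rb") as f:
        requests = pickle.load(f)
    return [handler(request) for request in requests]


def handler(request):
    # Stand-in for the application's request processing.
    return f"served {request['path']}"


recorder = RequestRecorder(handler)
recorder.handle({"path": "/index"})
recorder.handle({"path": "/about"})

log_path = os.path.join(tempfile.mkdtemp(), "debug.pkl")
recorder.save(log_path)
replayed = replay(log_path, handler)
```

Because the replay runs the same handler over the same inputs, a saved request log can double as a regression test script.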
!More information
For more details, read the class and function level documentation found in ^^raccoon.py^^. For an example of a complex application running on Raccoon, see the Rhizome config file "rhizome-config.py".

!Appendix: config settings
Complete documentation on all the application config settings is available [here|RaccoonConfig].