Ticket #72 (closed defect: fixed)

Opened 6 years ago

Last modified 5 years ago

Article ingest should insert RDF into kowari/mulgara instead of creating DC/RELS-EXT datastreams

Reported by: ronald Assigned to: ronald
Priority: critical Milestone:
Component: topaz Version:
Keywords: article ingest Cc:
Blocking: Blocked By:

Description

Article ingest currently creates DC and RELS-EXT datastreams containing all the RDF. This causes problems because changes to those datastreams don't appear immediately in kowari/mulgara, leading to race conditions. Instead, ingest should talk to kowari/mulgara directly and insert the RDF that way.

Depends on #4 and #10.

Change History

07/31/06 00:04:41 changed by ronald

  • status changed from new to assigned.

With [145], this change is not strictly necessary. The question becomes more of an architectural/philosophical one: should we assume a fedora-kowari exchange of RDF data or not? If the former, then ingesting everything as fedora objects only is more efficient (since they'll get propagated to kowari anyway) (*); if the latter, then we should make this change, and also change the ingester/stylesheets to take/create a list of fedora objects and a list of RDF statements.

(*) If we adopt the currently proposed solution in #10, then this isn't true and hence we have to write all RDF to kowari.

08/17/06 02:32:37 changed by ronald

(In [468]) Addresses #72 and #99: support inserting RDF directly into the triplestore. For this, the RDF element in the object has been renamed to the more accurate RELS-EXT, and a new element, rdf:RDF, which takes an RDF/XML fragment, is supported as a child of ObjectList.

Conversion from RDF/XML to triples is done via a stylesheet taken from http://www.semanticplanet.com/library/Main/RdfToTriplesStylesheet and slightly modified to generate iTQL triples instead of N3 triples.
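For illustration only, the conversion step can be sketched in Python. This is not the stylesheet from [468] (which is XSLT); it is a minimal sketch that handles only flat rdf:Description elements with rdf:about. The function name, the model URI used in the usage note, and the literal-escaping rule are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def rdfxml_to_itql(fragment, model_uri):
    """Return a list of iTQL insert statements, one per triple.

    Hypothetical helper: only rdf:Description elements with rdf:about are
    handled, and properties are either rdf:resource references or plain
    literals (no datatypes, language tags, or blank nodes).
    """
    root = ET.fromstring(fragment)
    statements = []
    for desc in root.iter("{%s}Description" % RDF_NS):
        subject = desc.get("{%s}about" % RDF_NS)
        if subject is None:
            continue
        for prop in desc:
            # ElementTree spells qualified names as "{namespace}local".
            if not prop.tag.startswith("{"):
                continue
            ns, local = prop.tag[1:].split("}", 1)
            resource = prop.get("{%s}resource" % RDF_NS)
            if resource is not None:
                obj = "<%s>" % resource
            else:
                # Assumed escaping: backslash-escape backslashes and quotes.
                text = (prop.text or "").replace("\\", "\\\\").replace("'", "\\'")
                obj = "'%s'" % text
            statements.append(
                "insert <%s> <%s%s> %s into <%s>;"
                % (subject, ns, local, obj, model_uri)
            )
    return statements
```

Called with a fragment describing `info:fedora/demo:1` and a placeholder model URI such as `rmi://localhost/fedora#ri`, this yields one `insert ... into <model>;` statement per property, which could then be sent to the triplestore.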

08/17/06 02:51:12 changed by ronald

(In [469]) Addresses #10 and #72 (indirectly): disable all triple-manipulation by the resource-indexer. Note that this is a temporary "fix", to be replaced with the full fix described in #72.

Also note that we disable the triple updates this way, instead of disabling the whole RI outright via fedora.fcfg, because we still want Fedora to start up and initialize the triple-store for us.

08/17/06 02:59:53 changed by ronald

(In [470]) Addresses #72: insert all meta-data directly into the triple-store.

However, due to the incompleteness of the fix for #10 in rev [469] we also store that data into the DC and RELS-EXT streams of Fedora, i.e. we explicitly store the meta-data in two places. Once #10 is fixed properly we can (and should) stop populating the DC and RELS-EXT streams, at which point #72 can be closed.

09/11/06 19:17:03 changed by ebrown

  • milestone changed from TBD to september24.

10/02/06 21:18:01 changed by ronald

(In [728]) Addresses #4, #10 and #72: multiple fixes and updates to FilterResolver:

  • Handle multiple models. The datastream a model's statements should be written to is part of the model definition, so as to avoid having to change the configuration and restart the server for new models.
  • Flush outstanding queued items on server shutdown.
  • Make various hardcoded params configurable.
  • Made fedora-updater transaction-aware so that only committed changes trigger an update.
  • Delete empty datastreams
  • Avoid creating fedora objects unless really necessary.
  • Updated fedora-updater for Fedora 2.1.1 (error message for no-such-datastream changed).
  • Remove obsolete/dead code
  • Added configurable filters for fedora-update to control which statements are written and whether any URI-rewriting should occur. Provided three different implementations: a very simple one that only handles Fedora URIs; a filter that tries to put things into Fedora in their proper places with the fewest modifications, but falls back to shortening and escaping URIs where necessary; and lastly a filter that writes all statements to separate Fedora objects (i.e. not the object indicated by the subject-uri). None of these is satisfactory, though, as they either don't fully fix the original problems in #10 or they invalidate the purpose of the updater (#4). They are being checked in here for historical purposes.
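
To make the transaction-aware update and filter ideas from the list above concrete, here is a rough Python sketch. It is not the FilterResolver code from [728]; the class name, the method names, and the filter signature are all hypothetical. Statements are buffered per transaction, passed through a filter that may keep, rewrite, or drop them, and forwarded only on commit.

```python
class TransactionAwareUpdater:
    """Buffers statements per transaction and forwards them only on commit."""

    def __init__(self, statement_filter, sink):
        # statement_filter: returns a (possibly rewritten) statement, or None to drop it.
        self._filter = statement_filter
        # sink: callable that receives the list of statements to write out.
        self._sink = sink
        # transaction id -> statements seen so far in that transaction
        self._pending = {}

    def add(self, txn_id, statement):
        self._pending.setdefault(txn_id, []).append(statement)

    def commit(self, txn_id):
        kept = []
        for statement in self._pending.pop(txn_id, []):
            result = self._filter(statement)
            if result is not None:
                kept.append(result)
        if kept:
            self._sink(kept)

    def rollback(self, txn_id):
        # Rolled-back statements are discarded and never reach the sink.
        self._pending.pop(txn_id, None)


def fedora_only_filter(statement):
    """Hypothetical 'very simple' filter: keep only Fedora subject URIs."""
    subject, _predicate, _obj = statement
    return statement if subject.startswith("info:fedora/") else None
```

Keeping the filter as a plain callable means a different strategy (for example, one that rewrites URIs) could be swapped in without touching the buffering logic.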

10/02/06 22:15:15 changed by ronald

  • status changed from assigned to closed.
  • resolution set to fixed.

Since #4 and #10 have been closed as wontfix, ingest will stay as is: meta-data will be explicitly duplicated into DC/RELS-EXT and the triplestore.

10/29/07 21:13:03 changed by

  • milestone deleted.

Milestone september24 deleted