Ticket #1243 (new defect)

Opened 3 years ago

Last modified 3 years ago

Relative URI's not properly handled

Reported by: russ
Assigned to: pag
Priority: low
Milestone:
Component: topaz-mulgara
Version: 0.9.1_rc1
Keywords:
Cc:
Blocking:
Blocked By:

Description

In 0.9.1 it was possible for admin users to enter a comma-separated list of issue URIs into the "issues" field of a volume object on the manageVolumesIssues.action page.

When a volume is created, the "issues" field is empty, and there is no statement in Mulgara with dcterms:issueList for the volume.

Once "issues" is populated, a triple with the volume id, dcterms:issueList, and an empty node is inserted.

If an admin user subsequently tries to remove all issues from the volume by deleting the contents of the "issues" field, Ambra throws a site error, and Topaz inserts garbage into Mulgara for the issue list.

Specifically, the rmi URL of the Mulgara server is inserted, as a literal, into the issue list.

Once this happens, the volume is damaged and cannot be updated or deleted.

While this particular failure is no longer reproducible in 0.9.2, due to changes in the volume management UI, Topaz needs to do a better job of escaping user input to avoid this bug recurring in the future.

Change History

03/26/09 18:01:58 changed by pradeep

(In [7562]) Additional test to see if setting a collection to null works as expected. Addresses #1243.

(follow-up: ↓ 4 ) 03/27/09 12:07:30 changed by dragisak

This is the sequence of TQL statements that caused it.

  1. Creating a new volume:
    insert 
    <info:doi/10.1371/volumeX> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.plos.org/RDF/Volume> 
    <info:doi/10.1371/volumeX> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/dcmitype/Collection> 
    <info:doi/10.1371/volumeX> <http://purl.org/dc/terms/created> '2009-03-27'^^<http://www.w3.org/2001/XMLSchema#date> 
    <info:doi/10.1371/volumeX> <http://rdf.plos.org/RDF/displayName> 'volumeX' 
    <info:doi/10.1371/volumeX> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.plos.org/RDF/Volume> 
    <info:doi/10.1371/volumeX> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/dcmitype/Collection> 
    <info:doi/10.1371/aggregation/002a94c6-d458-48f5-871e-1dc487f5e54e> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.plos.org/RDF/Journal> 
    <info:doi/10.1371/aggregation/002a94c6-d458-48f5-871e-1dc487f5e54e> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/dcmitype/Collection>
    <info:doi/10.1371/aggregation/002a94c6-d458-48f5-871e-1dc487f5e54e> <http://rdf.plos.org/RDF/Journal/volumes> <info:doi/10.1371/volumeX> 
    into <local:///topazproject#filter:graph=ri>;
    
  2. Adding "asdsadsad" to the issue list of the new volume:
    insert 
    <info:doi/10.1371/volumeX> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.plos.org/RDF/Volume> 
    <info:doi/10.1371/volumeX> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/dcmitype/Collection> 
    <info:doi/10.1371/volumeX> <http://purl.org/dc/terms/issueList> $s261i 
    $s261i <rdf:type> <rdf:Seq> 
    $s261i <rdf:_1> <asdsadsad> 
    into <local:///topazproject#filter:graph=ri>;
    
  3. Updating issue list to blank:
    delete select $s $p $o from <local:///topazproject#filter:graph=ri> 
    where ($s $p $o and $s <mulgara:is> <info:doi/10.1371/volumeX> 
    and ($p <mulgara:is> <http://purl.org/dc/terms/issueList>)) 
    or ($s $p $o and <info:doi/10.1371/volumeX> $x $s  
    and ($s <rdf:type> <rdf:Bag> or $s <rdf:type> <rdf:Seq> or $s <rdf:type> <rdf:Alt>) 
    and ($x <mulgara:is> <http://purl.org/dc/terms/issueList>)) 
    from <local:///topazproject#filter:graph=ri>;
    
  4. Previous delete seems to be executed again:
    delete select $s $p $o from <local:///topazproject#filter:graph=ri> 
    where ($s $p $o and $s <mulgara:is> <info:doi/10.1371/volumeX> 
    and ($p <mulgara:is> <http://purl.org/dc/terms/issueList>)) 
    or ($s $p $o and <info:doi/10.1371/volumeX> $x $s 
    and ($s <rdf:type> <rdf:Bag> or $s <rdf:type> <rdf:Seq> or $s <rdf:type> <rdf:Alt>) 
    and ($x <mulgara:is> <http://purl.org/dc/terms/issueList>)) 
    from <local:///topazproject#filter:graph=ri>;
    

Last delete will cause:

Caused by: java.io.IOException: Error running update 'delete select $s $p $o from <local:///topazproject#filter:graph=ri> where ($s $p $o and $s <mulgara:is> <info:doi/10.1371/volumeX> and ($p <mulgara:is> <http://purl.org/dc/terms/issueList>) ) or ($s $p $o and <info:doi/10.1371/volumeX> $x $s  and ($s <rdf:type> <rdf:Bag> or $s <rdf:type> <rdf:Seq> or $s <rdf:type> <rdf:Alt>) and ($x <mulgara:is> <http://purl.org/dc/terms/issueList>) ) from <local:///topazproject#filter:graph=ri>;'
        at org.topazproject.mulgara.itql.TIClient.doUpdate(TIClient.java:117)
        at org.topazproject.otm.stores.ItqlStore.doDelete(ItqlStore.java:479)
        ... 169 more
Caused by: org.mulgara.query.QueryException: Unable to modify local:///topazproject#filter:graph=ri: Attempt to remove a triple with node number out of range: 3878891 357 -3 55
        at org.mulgara.resolver.DatabaseSession.execute(DatabaseSession.java:755)
        at org.mulgara.resolver.DatabaseSession.modify(DatabaseSession.java:738)
        at org.mulgara.resolver.DatabaseSession.delete(DatabaseSession.java:354)
        at org.mulgara.server.rmi.SessionWrapperRemoteSession.delete(SessionWrapperRemoteSession.java:176)
        at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
        at sun.rmi.transport.Transport$1.run(Transport.java:159)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
        at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
        at java.lang.Thread.run(Thread.java:619)
        at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:255)
        at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:233)
        at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:142)
        at org.mulgara.server.rmi.RemoteSessionImpl_Stub.delete(Unknown Source)
        at org.mulgara.server.rmi.RemoteSessionWrapperSession.delete(RemoteSessionWrapperSession.java:266)
        at org.mulgara.query.operation.Deletion.execute(Deletion.java:55)
        at org.topazproject.mulgara.itql.TIClient.doUpdate(TIClient.java:111)

After this, the following query:

select $o $p from <local:///topazproject#filter:graph=ri> 
where (($s $p $o and <info:doi/10.1371/volumeX> <http://purl.org/dc/terms/issueList> $s) 
minus ($s <rdf:type> $o 
and <info:doi/10.1371/volumeX> <http://purl.org/dc/terms/issueList> $s));

... will return:

o                                                       p
------------------------------------------------------- --------
<rmi://ip-10-251-81-156.ec2.internal:8111/topazproject> <rdf:_1>

(in reply to: ↑ 3 ) 03/27/09 13:19:07 changed by pradeep

  • owner changed from dragisak to pradeep.

Replying to dragisak:

  2. Adding "asdsadsad" to the issue list of the new volume:
    insert
    <info:doi/10.1371/volumeX> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.plos.org/RDF/Volume>
    <info:doi/10.1371/volumeX> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/dcmitype/Collection>
    <info:doi/10.1371/volumeX> <http://purl.org/dc/terms/issueList> $s261i
    $s261i <rdf:type> <rdf:Seq>
    $s261i <rdf:_1> <asdsadsad>
    into <local:///topazproject#filter:graph=ri>;

This operation should have failed. <asdsadsad> is a valid URI, but it is not an absolute URI. The RDF spec states that URI references must be absolute. It looks like both Topaz and Mulgara were in violation of the RDF spec in the 0.9.1 version. Mulgara had these weird errors on retrieval for such URIs, and that was the source of the garbage.

Subsequent to the 0.9.1 release, Mulgara seems to have fixed this and throws an exception during insert. Topaz is yet to fix this. Will add a check to ensure RDF resources are always absolute URIs.

Looks like the older versions of Mulgara accepted these but failed on retrieval. The newer versions of Mulgara prevent it at insert time by throwing an exception.
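
For illustration only, the kind of check described above could look roughly like the following. This is a minimal sketch built on the standard java.net.URI class, not the actual Topaz change; the class name AbsoluteUriCheck and the method isAbsoluteUri are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of Topaz: rejects anything that is not an absolute URI.
final class AbsoluteUriCheck {
    private AbsoluteUriCheck() {}

    /** Returns true only if the text parses as a URI and carries a scheme. */
    static boolean isAbsoluteUri(String text) {
        if (text == null) {
            return false;
        }
        try {
            // java.net.URI.isAbsolute() is true exactly when the URI has a scheme component.
            return new URI(text).isAbsolute();
        } catch (URISyntaxException e) {
            // Not a syntactically valid URI at all.
            return false;
        }
    }

    public static void main(String[] args) {
        // The volume id from the TQL above has a scheme; the test input "asdsadsad" does not.
        System.out.println(isAbsoluteUri("info:doi/10.1371/volumeX"));
        System.out.println(isAbsoluteUri("asdsadsad"));
    }
}
```

A caller would run such a check before generating the insert statement, so that a value like <asdsadsad> is refused before it reaches the triple-store.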

(follow-up: ↓ 6 ) 03/27/09 13:24:07 changed by pradeep

  • priority changed from critical to low.
  • version set to 0.9.1_rc1.
  • component changed from ambra to topaz.

Lowered the priority since new versions of Mulgara throw an exception during insert. Also updated the component and the version where this is reported.

(in reply to: ↑ 5 ; follow-up: ↓ 7 ) 03/27/09 14:02:17 changed by dragisak

Replying to pradeep:

Lowered the priority since new versions of Mulgara throw an exception during insert.

Does this mean we need to upgrade to a newer version of Mulgara? We have Ambra-specific hacks in the version of Mulgara that is in the Topaz repository.

(in reply to: ↑ 6 ) 03/27/09 17:10:27 changed by pradeep

Replying to dragisak:

Replying to pradeep:

Lowered the priority since new versions of Mulgara throw an exception during insert.

Does this mean we need to upgrade to a newer version of Mulgara? We have Ambra-specific hacks in the version of Mulgara that is in the Topaz repository.

No. The current version of Mulgara disallows relative URIs at insert time.

Had a brief chat with Paul. He wants Mulgara to handle relative URIs in URI references, even though that is a violation of the RDF spec. This is mainly because he wants to treat graphs as a fragment, i.e. 'select $s $p $o from <#> where $s $p $o;' should select all statements from the system graph. Even though <#> is not strict RDF, he wants the convenience there. Since graphs can appear anywhere in a query, it follows that Mulgara has to handle relative URIs properly in all URI references. He is aware of the 'delete select' failure and of the subsequent query returning garbage, and would have to fix both as part of proper relative-URI support.

This also brings up the interesting issue of how Topaz should handle these. If Topaz prevents relative URIs, the Mulgara feature addition will effectively get nullified. So I am tempted to say that, even though the spec says absolute URIs, Topaz can leave that enforcement, or lack thereof, to the triple-store.

But from an Ambra perspective, it is always good to validate URIs using the RdfUtils.validate() method.

03/31/09 10:42:05 changed by pradeep

  • owner changed from pradeep to pag.
  • component changed from topaz to topaz-mulgara.

Leaving it on Paul's plate since there isn't anything that needs to be done on Topaz regarding this.

03/31/09 15:57:26 changed by ronald

  • summary changed from topaz inserts garbage into mulgara when user inputs unexpected data to Relative URI's not properly handled.

I think there are two issues here that need fixing:

  1. mulgara is not behaving correctly when relative URI's are inserted
  2. ambra's admin page handling should probably be validating the input data and enforcing absolute URI's (I don't think there is any need for a user to be able to create/enter relative URI's)

And I agree that there isn't anything to fix here on Topaz.
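
As a rough sketch of the second point, the admin page could parse the comma-separated "issues" field and refuse relative entries before anything is handed to Topaz. The code below is an assumption for illustration, not Ambra's actual handling; the class IssueListField and its parse method are hypothetical, and only java.net.URI from the standard library is used.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Ambra: parses the comma-separated "issues" field.
final class IssueListField {
    private IssueListField() {}

    /**
     * Returns the issue URIs in the order entered. A blank field yields an empty
     * list, so clearing the field is an ordinary case rather than an error.
     * Throws IllegalArgumentException for an invalid or relative entry.
     */
    static List<URI> parse(String field) {
        List<URI> issues = new ArrayList<>();
        if (field == null || field.trim().isEmpty()) {
            return issues;
        }
        for (String part : field.split(",")) {
            String text = part.trim();
            if (text.isEmpty()) {
                continue; // tolerate stray commas such as "a,,b"
            }
            URI uri;
            try {
                uri = new URI(text);
            } catch (URISyntaxException e) {
                throw new IllegalArgumentException("Not a valid URI: " + text, e);
            }
            if (!uri.isAbsolute()) {
                throw new IllegalArgumentException("Issue URI must be absolute: " + text);
            }
            issues.add(uri);
        }
        return issues;
    }
}
```

With a check of this shape, the input "asdsadsad" from the reproduction steps would be rejected on the admin page with an error message, and an emptied field would simply produce an empty issue list.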

04/14/09 10:25:20 changed by pag

I believe that the original error came about because the relative URIs were being translated into absolute URIs in some circumstances, but not in others.

Mulgara has now been updated to handle relative URIs more consistently. They can still be inserted relatively, and they are stored that way. However, when retrieved they will always be shown as absolute, according to the current context. Selection (in a query) can refer to the URI absolutely or relatively, though the output will always be absolute.

This will be present in the next release of Mulgara (2.1.0).
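
To make the "stored relatively, shown absolutely" behaviour concrete, here is a small sketch using the generic resolution rules of java.net.URI. It is not Mulgara's code and may differ from what Mulgara does in detail; the class RelativeUriDisplay, the method toAbsolute, and the base URI rmi://localhost:8111/topazproject are assumptions chosen for illustration.

```java
import java.net.URI;

// Hypothetical illustration, not Mulgara code: a stored relative URI is made
// absolute against a context URI only at the moment it is displayed.
final class RelativeUriDisplay {
    private RelativeUriDisplay() {}

    /** Returns the stored URI unchanged if it is absolute, otherwise resolves it against the context. */
    static URI toAbsolute(URI context, URI stored) {
        return stored.isAbsolute() ? stored : context.resolve(stored);
    }

    public static void main(String[] args) {
        URI context = URI.create("rmi://localhost:8111/topazproject"); // assumed server URI
        URI stored = URI.create("asdsadsad");                          // relative, as inserted

        // The relative path replaces the last path segment of the context.
        System.out.println(toAbsolute(context, stored)); // prints rmi://localhost:8111/asdsadsad
    }
}
```

The point of the sketch is that the same stored value can come back under a different absolute form if the context changes, which is why output is always reported as absolute.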