Ticket #729 (closed clarification: fixed)

Opened 4 years ago

Last modified 3 years ago

article is missing from feed, but shows up in browse list

Reported by: russ
Assigned to: russ
Priority: high
Milestone:
Component: ambra
Version: 0.8.1
Keywords: feed
Cc:
Blocking:
Blocked By:

Attachments

plosone-browse (6.8 kB) - added by russ on 12/20/07 13:04:55.
plosone.log from clinicaltrials-stage.plos.org for browse.action
plosone-feed (35.3 kB) - added by russ on 12/20/07 13:05:21.
plosone.log from clinicaltrials-stage.plos.org for feed

Change History

12/20/07 13:04:55 changed by russ

  • attachment plosone-browse added.

plosone.log from clinicaltrials-stage.plos.org for browse.action

12/20/07 13:05:21 changed by russ

  • attachment plosone-feed added.

plosone.log from clinicaltrials-stage.plos.org for feed

12/20/07 13:09:18 changed by russ

uploaded some logs at jeff's request.

i used the same URLs as listed in the description, but with the clinicaltrials-stage.plos.org host.

the mulgara logs are too long to attach. it probably makes the most sense for you to log in to plosstage01.plos.org as topazdev (IM or call me for the password if you need it) and look directly at the mulgara log, using timestamps from the plosone log for reference.

the 'failed to parse date' error is interesting, since the date range is, in fact, being respected despite the error. i believe that at some point eric wrote code to allow the feed to accept various date formats. perhaps this error is a leftover, or the feed looks for a standard date first, throws an error, and then moves on to whatever other date parsing is available?

2007-12-20 13:03:35,235 DEBUG ArticleOtmService(PLoSClinicalTrials)> failed to parse date '2007-12-12' use Date - trying iso8601 format [TP-Processor5 org.plos.article.service.ArticleOtmService]
java.lang.IllegalArgumentException

12/20/07 14:12:59 changed by jsuttor

  • owner changed from alex to jsuttor.
  • milestone set to 0.8.2.

12/20/07 14:33:51 changed by jsuttor

the XML is the same for all articles:

  <pub-date pub-type="epub">
    <day>12</day>
    <month>12</month>
    <year>2007</year>
  </pub-date>

the date parsing does fall back so the Exception isn't relevant.
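
A minimal sketch of the fallback pattern described here, written in Python for brevity (the Ambra code itself is Java, and this is not its implementation). The primary format `%m/%d/%Y` and the name `parse_feed_date` are assumptions for illustration; the ticket does not say which format the real code tries first.

```python
import logging
from datetime import date, datetime

log = logging.getLogger("feed.dates")

# Hypothetical first-choice format; the ticket does not name the real one.
PRIMARY_FORMAT = "%m/%d/%Y"


def parse_feed_date(text: str) -> date:
    """Try the primary format, then fall back to ISO 8601 (YYYY-MM-DD)."""
    try:
        return datetime.strptime(text, PRIMARY_FORMAT).date()
    except ValueError:
        # Expected for ISO input, so it is only logged at DEBUG and not re-raised.
        log.debug("failed to parse date %r - trying iso8601 format", text)
    # Raises ValueError if the text is not a valid ISO date either.
    return date.fromisoformat(text)
```

With this shape, a value such as '2007-12-12' produces a DEBUG message from the first attempt and is then parsed by the second, which matches a log entry appearing even though the date range is respected.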

Browse and Feeds query the database differently:

  • Browse gets a list of all Article metadata and builds a map
  • Feed does an explicit OTM query

looking at the Mulgara logs & indexes is probably fruitful.

12/20/07 16:33:56 changed by rich

  • milestone changed from 0.8.2 to pubApp_0.8.3.

12/28/07 12:35:50 changed by russ

  • priority changed from medium to high.

for ntds 12/26 pub date:

The following articles were included: 32 (frontmatter) and 88, 103, 111 (research)

The following articles were missing: 149, 159, 161, and 168 (all frontmatter)

i'm wondering if there's something specific that the feed is looking for that is missing in the XML (and thus mulgara) for the articles that don't show up in the feed?

01/08/08 15:13:55 changed by rich

  • milestone changed from pubApp_0.8.3 to pubApp_0.8.2.1.

01/29/08 14:03:23 changed by russ

this is not the same as the bug in #762

02/04/08 17:45:55 changed by rich

  • priority changed from high to low.

02/19/08 16:06:48 changed by rich

  • priority changed from low to high.

02/20/08 14:31:55 changed by rich

  • milestone deleted.

Related to #745 ehcache cleanup.

07/31/08 23:42:32 changed by amit

  • type changed from defect to clarification.
  • blocking changed.
  • blockedby changed.
  • milestone set to 0.9.0.

Is this still a problem?

07/31/08 23:43:07 changed by amit

  • owner changed from jsuttor to russ.

08/01/08 11:29:25 changed by russ

  • owner changed from russ to amit.

yes. it's still an issue in 0.9.

example. the following articles were published, appear in browse, but do not appear in the feed:

  • pntd.0000270
  • pntd.0000268
  • pntd.0000267

http://www.plosntds.org/article/browse.action?field=date
http://feeds.feedburner.com/plosntds/NewArticles

anecdotally, this almost always, or perhaps always, happens with front matter (non-research) articles.
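
A check like this can be scripted so it is repeatable. The sketch below is in Python and only assumes that each article ID appears somewhere in the feed text; `missing_from_feed` is a hypothetical helper, the sample feed body is made up, and fetching the real feed is left out.

```python
def missing_from_feed(expected_ids, feed_body):
    """Return the expected article IDs that never appear in the feed text."""
    return [article_id for article_id in expected_ids if article_id not in feed_body]


# IDs taken from the list above; the feed body is an invented stand-in.
expected = ["pntd.0000270", "pntd.0000268", "pntd.0000267"]
sample_feed = "<feed><entry><id>info:doi/10.1371/journal.pntd.0000268</id></entry></feed>"

for article_id in missing_from_feed(expected, sample_feed):
    print("not in feed:", article_id)
```

Because the function works on plain text, the same check applies to the feedburner copy and to the feed served directly by the application.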

08/01/08 11:40:36 changed by amit

  • owner changed from amit to russ.

Hang on. These are not research articles, but editorials?

08/01/08 12:09:57 changed by russ

  • owner changed from russ to amit.

yes. the feed should include all articles published. currently it includes *most* articles published.

sometimes research articles are missed as well, but it's more rare.

i expect there's something in xml that almost all research articles have, but many front matter articles don't, which is causing them to elude the feed.
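
One way to test that hypothesis is to compare metadata fields between articles that reach the feed and articles that do not. This is a rough Python sketch under the assumption that each article's metadata is available as a dictionary; `candidate_fields` and all field names in the usage example are hypothetical, and the ticket never identifies such a field.

```python
def candidate_fields(included, missing):
    """Fields every included record has but at least one missing record lacks."""
    if not included:
        return []
    common = set.intersection(*(set(record) for record in included))
    return sorted(
        field for field in common
        if any(field not in record for record in missing)
    )
```

For example, with invented records where every included article carries an "abstract" key and one missing article does not, the function returns ["abstract"], which would then be the first field to inspect in the query.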

08/01/08 12:36:52 changed by amit

  • owner changed from amit to jkirton.

Jon, please take a look at this too.

08/01/08 12:40:30 changed by amit

I would test the 'internal' URLs before testing it via feed burner.

08/11/08 18:41:53 changed by amit

  • milestone changed from 0.9.0 to 0.9.1.

Moving to next milestone.

09/03/08 12:20:06 changed by rich

  • keywords set to feed.

09/08/08 16:13:33 changed by dragisak

  • owner changed from jkirton to dragisak.

09/10/08 16:58:29 changed by amit

  • type changed from clarification to defect.

09/19/08 17:38:47 changed by dragisak

  • status changed from new to assigned.

09/29/08 10:24:00 changed by dragisak

I tested with these three articles and they are showing in feed now.

  • pntd.0000270
  • pntd.0000268
  • pntd.0000267

Probably fixed with #745

09/29/08 11:45:04 changed by amit

  • owner changed from dragisak to russ.
  • status changed from assigned to new.
  • type changed from defect to clarification.

Please confirm.

(follow-up: ↓ 27 ) 11/05/08 17:13:40 changed by russ

  • owner changed from russ to dragisak.

it's hard to confirm since it's an intermittent problem, and i've never understood why this was happening.

can you explain what you think the bug was and how #745 fixed it?

perhaps that will help me come up with a test case.

if not, we can wait until after deployment to see if this is still a problem.

(in reply to: ↑ 26 ) 11/11/08 11:21:32 changed by dragisak

  • owner changed from dragisak to russ.
  • description changed.

#745 fixed feed cache invalidation. My impression was that this problem might have been caused by feeds not being properly invalidated on new article publish or virtual journal changes.

I agree that we should wait until deployment to see if this problem is still present.

11/18/08 16:34:07 changed by russ

there is no feed invalidation in 0.9 - there's just a longish time out.

however, iirc this problem was reproducible even after emptying the feed cache, or when creating a feed with parameters that did not match any keys in the cache...
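
To make the distinction concrete, here is a small Python sketch of a cache that only expires entries after a timeout and is never invalidated on publish. It is an illustration, not the 0.9 feed cache: `TimeoutFeedCache`, its parameters, and the injected clock are all assumptions.

```python
import time


class TimeoutFeedCache:
    """Feed cache with expiry only: nothing is invalidated when an article is published."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}  # key -> (stored_at, feed_body)

    def get(self, key, build_feed):
        now = self.clock()
        hit = self.entries.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            # Possibly stale: the cached body may predate a new article.
            return hit[1]
        body = build_feed()  # cache miss or expired entry: rebuild from the store
        self.entries[key] = (now, body)
        return body
```

In this model a stale entry can hide a new article only for a key that is already cached and not yet expired. A request whose parameters match no key always rebuilds the feed, so an article missing from such a request points at the query, not the cache.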

i'll confirm with susanne...

12/01/08 12:31:48 changed by russ

  • status changed from new to closed.
  • resolution set to fixed.

apparently this has been fixed in 0.9 - james and susanne report no recent occurrence of this bug, and the feed for dates which definitely had this issue in the past is now showing all articles correctly.

02/25/09 14:46:46 changed by

  • milestone deleted.

Milestone 0.9.1 deleted