fusion's previous behavior was to immediately retry requesting a feed when the request fails. This made more sense before we added failure recovery (df412f17d3). Now, immediately retrying on failure complicates the implementation and risks getting the client banned if the server is responding with HTTP 429 errors and we just keep spamming the same requests.
This changes the polling behavior so that we only request each feed once per polling interval. If the request fails, we'll try again at the next polling interval.
* Recover after feed fetch failure with exponential backoff
The current implementation stops attempting to fetch a feed if fusion encounters any error fetching it. The only way to continue fetching the feed is if the user manually forces a refresh.
This allows fusion to recover from feed fetch errors by tracking the number of consecutive failures and slowing down requests for consistent failure. If a feed always fails, we eventually slow to only checking it once per week.
Fixes#67
* Add comment
I realized it's simpler to just create the actual error type in the testcase than to define the error message and wait until the test body to convert it to an error type.
FusionRequest currently specifies model.FeedRequestOptions as a pointer rather than as a regular parameter. This is unnecessary, as it's easy for us to treat model.FeedRequestOptions{} as the 'default options' value. With it as a pointer, we clutter our code with extra != nil checks.
This change updates FusionRequest to just take a model.FeedRequestOptions rather than a *model.FeedRequestOptions.
Co-authored-by: rook1e <rook1e404@outlook.com>
* Add FeedClient.FetchDeclaredLink method
The sniff package uses redundant RSS parsing logic that duplicates logic that's in the service/pull/client package now.
This change adds a FetchDeclaredLink method that exposes functionality that the sniff package needs and switches the sniff.parseRSSUrl to use the client package instead of duplicating RSS parsing functionality.
* Reorder imports
The feed ID doesn't really belong in ParseGoFeedItems because its job is to convert gofeed objects to fusion objects, but the feed ID is not a concept in gofeed.
The only reason we need to store the feed ID in each feed item is so that each feed item references its parent feed in the database, so we should handle populating feed ID with the database logic, not with the gofeed parsing logic.
Puller.do is pretty complex, so I'm trying to pull responsibilities into helper functions.
One responsibility that lifts out pretty cleanly is parsing the gofeed.Item array into a fusion model.Item array, so I added a function for that with accompanying unit tests.
In Puller.do, we decide isLatestBuild by comparing the feed's stored LastBuild field to the fetched UpdatedParsed value, but then at the end of the function, we store PublishedParsed as the LastBuild time for the feed.
This appears to be an error, so this change makes it so that we consistently use the fetched feed's UpdateParsed field as the value of the LastBuild.
There's some complexity to deciding whether or not to update a feed, and there will be even more if we add support for things like honoring the Retry-After response header. I thought it would be helpful to extract the decision to a dedicated function so that we can unit test it.
Note that this change adjusts the decision logic slightly. Previously, we'd skip a suspended feed even if the force parameter was set to true. Now, we'll check a suspended feed if the force parameter was true. I don't think this actually changes anything because it doesn't seem like the user can force a refresh on a suspended feed, but I just wanted to call attention to the change.
I thought it would be helpful to get unit test coverage for the feed fetch logic.
Currently, our logic is pretty simple, but I'd like fusion to eventually be able to support the If-Modified-Since HTTP header, so we don't force servers to send the same data over and over. At that point, extra testing will be especially helpful, but I thought it would be a good idea to get testing infrastructure in place now.
The verb 'Creates' is a bit inconsistent with the other verbs we're using like List, Get, and Update. Those are all imperative, whereas Creates is simple present tense. I think 'Insert' is the more appropriate name, as it's imperative and describes the action. 'Create' sounds as though it creates a single thing, whereas 'Insert' properly communicates that it can insert one or many.