fusion

Author	SHA1	Message	Date
Yuan	de223e6991	fix: use update time if item does not have a publish time (#159)	2025-04-28 20:37:30 +08:00
Yuan	bc8109fe39	refactor: replace zap log with slog (#150) * refactor: replace zap log with slog * fix	2025-04-25 17:18:25 +08:00
Michael Lynch	e4e08942a9	Only pull feed once per polling interval (#121) fusion's previous behavior was to immediately retry requesting a feed when the request fails. This made more sense before we added failure recovery (`df412f17d3`). Now, immediately retrying on failure complicates the implementation and risks getting the client banned if the server is responding with HTTP 429 errors and we just keep spamming the same requests. This changes the polling behavior so that we only request each feed once per polling interval. If the request fails, we'll try again at the next polling interval.	2025-03-28 10:52:39 +08:00
Michael Lynch	f09b990533	Handle feeds that use relative paths for links (#116) * Handle feeds that use relative paths for links Fixes #94 * Use url.ResolveReference	2025-03-28 10:51:34 +08:00
Michael Lynch	df412f17d3	Recover after feed fetch failure with exponential backoff (#108) * Recover after feed fetch failure with exponential backoff The current implementation stops attempting to fetch a feed if fusion encounters any error fetching it. The only way to continue fetching the feed is if the user manually forces a refresh. This allows fusion to recover from feed fetch errors by tracking the number of consecutive failures and slowing down requests for consistent failure. If a feed always fails, we eventually slow to only checking it once per week. Fixes #67 * Add comment	2025-03-24 11:11:18 +08:00
Michael Lynch	50a6652d6c	Switch to more obviously invalid XML (#104) Strangely, my LLM coding assistant gets confused when it encounters the	2025-03-23 12:21:00 +08:00
Michael Lynch	d36ef67037	Convert httpErrMsg to httpErr (#103) I realized it's simpler to just create the actual error type in the testcase than to define the error message and wait until the test body to convert it to an error type.	2025-03-23 12:18:53 +08:00
Michael Lynch	c0eaea70de	Refactor Puller to create a separate SingleFeedPuller (#102)	2025-03-23 12:16:58 +08:00
Michael Lynch	8de93295b6	Change FusionRequest to use a non-pointer type parameter (#106) FusionRequest currently specifies model.FeedRequestOptions as a pointer rather than as a regular parameter. This is unnecessary, as it's easy for us to treat model.FeedRequestOptions{} as the 'default options' value. With it as a pointer, we clutter our code with extra != nil checks. This change updates FusionRequest to just take a model.FeedRequestOptions rather than a *model.FeedRequestOptions. Co-authored-by: rook1e <rook1e404@outlook.com>	2025-03-23 11:59:24 +08:00
Michael Lynch	797f270178	Add FeedClient.FetchDeclaredLink method (#98) * Add FeedClient.FetchDeclaredLink method The sniff package uses redundant RSS parsing logic that duplicates logic that's in the service/pull/client package now. This change adds a FetchDeclaredLink method that exposes functionality that the sniff package needs and switches the sniff.parseRSSUrl to use the client package instead of duplicating RSS parsing functionality. * Reorder imports	2025-03-22 11:48:46 +08:00
Michael Lynch	68760f2ce6	Move RSS parsing code from pull to a dedicated package (#96) * Create a client package * Bring back TestDecideFeedUpdateAction, which was removed accidentally * Fix import order	2025-03-21 21:13:20 +08:00
Michael Lynch	d2cb870574	Remove feed ID from ParseGoFeedItems (#95) The feed ID doesn't really belong in ParseGoFeedItems because its job is to convert gofeed objects to fusion objects, but the feed ID is not a concept in gofeed. The only reason we need to store the feed ID in each feed item is so that each feed item references its parent feed in the database, so we should handle populating feed ID with the database logic, not with the gofeed parsing logic.	2025-03-20 10:41:40 +08:00
Michael Lynch	d956996be4	Refactor gofeed.Item parsing (#84) Puller.do is pretty complex, so I'm trying to pull responsibilities into helper functions. One responsibility that lifts out pretty cleanly is parsing the gofeed.Item array into a fusion model.Item array, so I added a function for that with accompanying unit tests.	2025-03-16 10:43:30 -04:00
Michael Lynch	a926192777	Treat update time as feed build time consistently (#78) In Puller.do, we decide isLatestBuild by comparing the feed's stored LastBuild field to the fetched UpdatedParsed value, but then at the end of the function, we store PublishedParsed as the LastBuild time for the feed. This appears to be an error, so this change makes it so that we consistently use the fetched feed's UpdatedParsed field as the value of the LastBuild.	2025-03-15 15:17:54 +08:00
Michael Lynch	cfc5d8dfb1	Get rid of failure variable in Puller.do Eliminating the failure variable simplifies this function a bit and makes the values in the function calls more obvious to the reader.	2025-03-09 21:01:01 -04:00
rook1e	c77ec889ca	fix: correct pointer dereference for suspended feed skip reason	2025-03-08 16:07:03 +08:00
rook1e	45623915eb	fix: force refreshing should skip suspended feeds	2025-03-07 12:53:34 +08:00
rook1e	3ad3c3ab56	refactor: add a pointer helper function	2025-03-02 22:20:44 +08:00
Michael Lynch	68ead651f2	Add tests for deciding whether to update a feed There's some complexity to deciding whether or not to update a feed, and there will be even more if we add support for things like honoring the Retry-After response header. I thought it would be helpful to extract the decision to a dedicated function so that we can unit test it. Note that this change adjusts the decision logic slightly. Previously, we'd skip a suspended feed even if the force parameter was set to true. Now, we'll check a suspended feed if the force parameter was true. I don't think this actually changes anything because it doesn't seem like the user can force a refresh on a suspended feed, but I just wanted to call attention to the change.	2025-02-28 20:28:17 -05:00
Michael Lynch	60e75ef33e	Add unit tests for pull.FetchFeed I thought it would be helpful to get unit test coverage for the feed fetch logic. Currently, our logic is pretty simple, but I'd like fusion to eventually be able to support the If-Modified-Since HTTP header, so we don't force servers to send the same data over and over. At that point, extra testing will be especially helpful, but I thought it would be a good idea to get testing infrastructure in place now.	2025-02-27 16:28:58 -05:00
Yuan	09a409a5de	Merge pull request #62 from mtlynch/creates-insert Rename Item.Creates to Item.Insert	2025-02-25 10:46:27 +08:00
Michael Lynch	235e93b0fd	Rename pull.FetchFeeds to FetchFeed If I understand correctly, the function fetches a single feed, so the name should be FetchFeed to reflect the function's purpose.	2025-02-24 20:57:21 -05:00
Michael Lynch	2a0200d955	Rename Item.Creates to Item.Insert The verb 'Creates' is a bit inconsistent with the other verbs we're using like List, Get, and Update. Those are all imperative, whereas Creates is simple present tense. I think 'Insert' is the more appropriate name, as it's imperative and describes the action. 'Create' sounds as though it creates a single thing, whereas 'Insert' properly communicates that it can insert one or many.	2025-02-24 20:42:14 -05:00
rook1e	8a25ffa154	refactor: list only the feeds that match the condition	2024-08-04 17:35:31 +08:00
rook1e	9133dc65c3	feat(#3): support fetching feed through the proxy	2024-04-21 22:24:59 +08:00
rook1e	bedd8ae21f	fix: add retry and try to fix network errors such as EOF	2024-04-03 22:14:38 +08:00
rook1e	4c7889deff	refactor: use uber-go/zap	2024-03-18 22:10:57 +08:00
rook1e	39072600d3	refactor: derive context from user's request	2024-03-18 18:17:43 +08:00
rook1e	8f08cbb22d	fix: item list order	2024-03-16 23:59:16 +08:00
rook1e	d684247b42	refactor: optimize refresh skipping	2024-03-16 23:14:07 +08:00
rook1e	fbb04b3a84	fix: refresh failed feeds when user triggers refreshAll	2024-03-15 12:49:35 +08:00
rook1e	340f3be641	refactor: combine guid, feed_id and deletedAt as unique identity	2024-03-13 00:29:24 +08:00
rook1e	d91a5a5e3c	refactor: error handling	2024-03-12 22:57:53 +08:00
rook1e	5dee1de132	feat: add support for suspending/resuming a feed	2024-03-12 01:00:35 +08:00
rook1e	c217a2adf6	fix: invalid timeout context in puller	2024-03-11 12:51:55 +08:00
rook1e	e9b065e9fb	init	2024-03-06 16:54:13 +08:00

36 commits