[New Feature] Smart duplicates

hugbug
Developer & Admin
Posts: 7645
Joined: 09 Sep 2008, 11:58
Location: Germany

[New Feature] Smart duplicates

Post by hugbug » 09 Sep 2013, 20:40

The feature is designed mainly as a companion for RSS but can also be used standalone.

Purposes:
  1. avoid downloading duplicates;
  2. if a download fails, automatically choose another release (duplicate).
The feature consists of two parts:
  • duplicate detection;
  • duplicate handling.
Duplicate detection:
  • when adding items from newznab RSS feeds, the fields imdbid or rageid/season/episode are used, if available, to detect duplicates automatically;
  • when adding items from RSS feeds that lack these fields, or nzb-files from other sources (uploaded via the web interface or placed into NzbDir), a simple title match is used;
  • manually: select items in the download queue, click edit to open the multiple-items edit dialog and use the UI action "mark as duplicates".

Duplicate handling:
  • before an item marked as duplicate is added to the queue, a check is performed whether a similar item was already downloaded. If the history contains the title with status "success", the duplicate is skipped (not added to the queue);
  • if the title was never successfully downloaded, it is added to the download queue;
  • if the download queue contains the same title, the existing item and the newly added item are marked as duplicates of each other. The newly added item is paused;
  • if the download of an item completes successfully and there are duplicates of this item in the queue, they are all deleted from the queue;
  • if the download of an item fails and there are (paused) duplicates of this item in the queue, the first of them is unpaused.
This feature has been available since r811.
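
The handling rules above can be sketched in Python roughly as follows. This is only an illustration of the described behaviour, not nzbget's actual code: the names Item, Downloader, add and finish are hypothetical, and the sketch assumes that the title alone serves as the duplicate key.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare items by identity, not by field values
class Item:
    title: str          # acts as the duplicate key in this sketch (assumption)
    paused: bool = False


@dataclass
class Downloader:
    queue: list = field(default_factory=list)
    succeeded: set = field(default_factory=set)  # titles with status "success" in history

    def add(self, item):
        """Add an item; return False if it is skipped as a duplicate."""
        if item.title in self.succeeded:
            return False  # already downloaded successfully: skip
        # A queued item with the same title turns the new one into a paused backup.
        if any(q.title == item.title for q in self.queue):
            item.paused = True
        self.queue.append(item)
        return True

    def finish(self, item, success):
        """Handle the end of a download."""
        self.queue.remove(item)
        if success:
            self.succeeded.add(item.title)
            # The remaining duplicates are no longer needed.
            self.queue = [q for q in self.queue if q.title != item.title]
        else:
            # Unpause the first paused duplicate, if there is one.
            for q in self.queue:
                if q.title == item.title and q.paused:
                    q.paused = False
                    break
```

Adding the same title three times leaves one active item and two paused backups; a failure promotes the first backup, and a success clears the rest.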

prinz2311
Posts: 466
Joined: 08 Dec 2012, 00:03

Re: [New Feature] Smart duplicates

Post by prinz2311 » 10 Sep 2013, 08:58

hugbug wrote:Duplicate detection:
  • when adding items from newznab RSS feeds, the fields imdbid or rageid/season/episode are used, if available, to detect duplicates automatically;
So those are the only things checked?

If so, it's a BIG problem, because the different qualities of releases are not checked. Example: I always download the HDTV 720p version of an episode first, but what I really want is the WebDL 720p or, better, the WebDL 1080p. This is because the WebDLs are not always available. If the duplicate detection only checks rageid/season/episode, it won't download the better quality later. Similar things can happen with movies if only the imdbid is checked.

So if this check ignores quality and the user cannot deactivate it, download setups like the one in my example are no longer possible.

Just look at how many basic qualities exist today.

TV show qualities:
  • SD TV
  • DVDRip
  • DVD-R
  • HD 720p
  • RawHD TV
  • HD 1080p
  • Webrip SD
  • Webrip 720p
  • Webrip 1080p
  • WebDL 720p
  • WebDL 1080p
  • BluRay 720p
  • BluRay 1080p
Movies have additionally:
  • Cam
  • TeleSync
  • TeleCine
  • R5
  • R6
  • Screener

I have probably forgotten some qualities, and new ones will surely come in the future.


Re: [New Feature] Smart duplicates

Post by hugbug » 10 Sep 2013, 10:24

Forgot to mention: this can be deactivated via the option DupeCheck.

As for different qualities: that is the whole idea of the duplicate check, to avoid downloading the same movie or episode multiple times. You use filters to define the qualities you accept and only one file will be downloaded. Why would you need another version if you already watched the movie? (It's a rhetorical question).

Still, I have an idea how to extend the system:
  • queue items get a new field DupeScore;
  • DupeScore is set to 0 by default but can be changed when the item is being added to the queue;
  • when a new item is added, the duplicate check compares its DupeScore against other items, including items in the history;
  • if the new item has a higher DupeScore than the items in the history, the new item will be added and downloaded;
  • if the new item has a higher DupeScore than the items in the download queue, the new item will be added and unpaused; the previously unpaused item with the lower DupeScore will be paused (and used as a backup if necessary).
How to set DupeScore?
Use the command "Options" in the RSS filter. Something like this:

Code:

Options(dupescore:100): 720p
Options(dupescore:200): 1080p
In this example:
  • if the feed has only a 720p item, that item will be downloaded. If a 1080p item appears in the feed later, it will be downloaded too.
  • if a 1080p item appears in the feed first, it will be downloaded. If a 720p item appears in the feed later, it will not be downloaded.
In short: dupe scores allow you to define quality grades. A higher-quality release is downloaded even if lower-quality releases were downloaded before.
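
The proposed score comparison could look roughly like the Python sketch below. It is a guess at the logic, not the implementation: should_download, enqueue and the SCORES table are hypothetical names, and treating an equal score like a lower one is an assumption the proposal does not spell out.

```python
# Hypothetical scores mirroring the filter example above.
SCORES = {"720p": 100, "1080p": 200}


def should_download(new_score, history_scores):
    """Compare a new item against successful downloads with the same key.

    history_scores holds the DupeScore of each such history item.
    """
    best = max(history_scores, default=None)
    # Download if nothing was downloaded yet, or if the new item scores higher.
    return best is None or new_score > best


def enqueue(queue, name, score):
    """Add an item to a list of queued duplicates (dicts sharing one key).

    The highest-scoring item stays unpaused; the others wait as backups.
    """
    new = {"name": name, "score": score, "paused": False}
    active = next((q for q in queue if not q["paused"]), None)
    if active is not None:
        if score > active["score"]:
            active["paused"] = True  # lower score becomes the backup
        else:
            new["paused"] = True     # assumption: ties keep the existing item active
    queue.append(new)
    return new
```

With these scores, a 1080p item arriving after a downloaded 720p item passes the check, while a 720p item arriving after a 1080p one does not.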

Dupe scores don't help in another case: if you always want to download multiple qualities. Is this real? If it is, I have another idea which we can discuss later.


Re: [New Feature] Smart duplicates

Post by prinz2311 » 10 Sep 2013, 10:49

hugbug wrote:Why would you need another version if you already watched the movie? (It's a rhetorical question).
Simple answer from my example: HDTV 720p always comes out, but WebDLs are unreliable... they come out hours or days later, or not at all. So I first download the HDTV version as a minimum and then download the ones I really want if they come out. It's not as if people watch a download straight after it finishes. But when I watch it, I want the best quality that is available, not the first that was available.

Apps like Sickbeard (for TV shows) and Couchpotato (for movies) all have such a function: they keep downloading until the best quality the user defined has been fetched, or until the user manually stops the search for better qualities.

I think wanting more than one quality is the normal case, not the exception, since not all qualities come out at the same time, or at all. One simply never knows before the download what the maximum quality uploaded will be. Movies rarely have that problem, but for episodes it is commonplace.

hugbug wrote:Still, I have an idea how to extend the system: [...] In short: dupe scores allow you to define quality grades. A higher-quality release is downloaded even if lower-quality releases were downloaded before.
Yes, that would be exactly what Sickbeard/Couchpotato do, and it would solve this problem.


Re: [New Feature] Smart duplicates

Post by hugbug » 10 Sep 2013, 11:00

What about this part:
hugbug wrote:Dupe scores don't help in another case: if you always want to download multiple qualities. Is this real? If it is, I have another idea which we can discuss later.
Is this necessary?

This could be solved via the RSS filter command Options too. Each nzb-item has a field DupeKey identifying the title. For movies it is set to

Code:

imdb=12345667
for series:

Code:

rageid=12345,2,3
A new filter command could allow modification of DupeKey, like this:

Code:

Options(adddupekey:,720p): 720p
Options(adddupekey:,1080p): 1080p
As a result the internal DupeKey would be set to:

Code:

imdb=12345667,720p
imdb=12345667,1080p
rageid=12345,2,3,720p
rageid=12345,2,3,1080p
And the 720p and 1080p items will be considered different.
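
A small Python sketch of how such keys could be built, for illustration only. The function names base_dupe_key and add_dupe_key are hypothetical, as is the "title=" fallback format; only the imdb= and rageid= formats come from the examples above.

```python
def base_dupe_key(imdbid=None, rageid=None, season=None, episode=None, title=""):
    """Build the key that identifies a title, preferring feed metadata."""
    if imdbid is not None:
        return f"imdb={imdbid}"
    if rageid is not None and season is not None and episode is not None:
        return f"rageid={rageid},{season},{episode}"
    # Hypothetical fallback: the thread only says a simple title match is used.
    return f"title={title.lower()}"


def add_dupe_key(key, suffix):
    """Append the text given in the proposed adddupekey option."""
    return key + suffix


movie = base_dupe_key(imdbid=12345667)
episode = base_dupe_key(rageid=12345, season=2, episode=3)

# Two qualities of the same movie now get different keys,
# so the duplicate check would no longer match them.
key_720p = add_dupe_key(movie, ",720p")
key_1080p = add_dupe_key(movie, ",1080p")
```

Because the comparison is a plain string match, any suffix that differs between two releases is enough to keep them apart.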


Re: [New Feature] Smart duplicates

Post by prinz2311 » 10 Sep 2013, 11:19

I don't think anyone wants that (multiple qualities). Sickbeard and Couchpotato can't do that (they overwrite lower qualities in post-processing), and I have never heard of such a feature request. I can't even think of a valid reason for someone to want that.


Re: [New Feature] Smart duplicates

Post by prinz2311 » 10 Sep 2013, 11:26

I just thought of a reason someone may want to do that: :oops:

Download the 3D and the 2D Version of a Movie.


Re: [New Feature] Smart duplicates

Post by hugbug » 10 Sep 2013, 11:45

OK, then no multiple qualities in the first version. If someone asks, this can be added later.


Re: [New Feature] Smart duplicates

Post by prinz2311 » 11 Sep 2013, 10:06

Thanks. Now I'm wondering whether I really need Sickbeard anymore once this is integrated.

This applies only to newznab RSS feeds.

Here is a list that may help others see what works better with Sickbeard and what with nzbget. It covers only new shows/episodes; for old shows Sickbeard has a clear advantage.

Better with nzbget:
- Failure handling

- No problems with scene numbering

Better with Sickbeard:
- Long-term duplicate check: Sickbeard keeps a list of all episodes and their quality. In nzbget the duplicate check only works as long as the episode is still in the history. There are only two ways to handle this:
1. Keep changing the filters after new episodes come out and before they are removed from the nzbget history (set the new minimum season/episode). This is not very practical.
2. Set the History to a very high setting, but that is not a real option on resource-limited devices because of the memory needed.

- Post-processing: Sickbeard does not have to guess which episode is being post-processed, because it knows that from the snatched nzb. In nzbget I could use Videosort, but it identifies episodes by guessing, which can result in misidentification. (This could perhaps be changed if the script used the rageid/season/episode info from nzbget.)

So in the end I don't know yet what I will do.


Re: [New Feature] Smart duplicates

Post by hugbug » 11 Sep 2013, 13:16

prinz2311 wrote:Set the History to a very high setting, but that is not a real option on resource limited devices because of the memory needed
One history item takes about 2 KB of memory, which is not much even on small devices.
It remains to be seen how the program behaves with a big history (thousands of items).

Another aspect: if the download queue (with history) becomes damaged, this has fatal consequences for the duplicate check. A mechanism for queue backup is needed (not necessarily as a program feature; a start script or a post-processing script could do that).
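
As a rough illustration of the start-script idea, a backup could be as simple as the Python sketch below. The function name backup_queue, its parameters and the folder naming are hypothetical; the directory that actually holds the queue and history depends on the nzbget configuration and must be supplied by the user.

```python
import shutil
import time
from pathlib import Path


def backup_queue(queue_dir, backup_root, keep=5, stamp=None):
    """Copy queue_dir into a timestamped folder and prune old copies.

    keep is the number of newest backups to retain (at least 1).
    stamp overrides the timestamp, which is mainly useful for testing.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    backup_root = Path(backup_root)
    backup_root.mkdir(parents=True, exist_ok=True)
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    target = backup_root / f"queue-{stamp}"
    shutil.copytree(queue_dir, target)  # fails if this backup already exists
    # Timestamped names sort chronologically, so the oldest come first.
    backups = sorted(p for p in backup_root.iterdir() if p.name.startswith("queue-"))
    for old in backups[:-keep]:
        shutil.rmtree(old)
    return target
```

Called before the program starts, for example as backup_queue(queue_dir, backup_dir) with the user's own paths, this keeps the last few copies of the queue so that the duplicate history can be restored after damage.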

The purpose of smart duplicates is to make RSS usable. I don't see how one can use RSS (other than newznab bookmarks) without a duplicate check.
