Ticket #1694 (closed Bug: fixed)

Opened 19 months ago

Last modified 18 months ago

Duplicate finder should find similar files

Reported by: silversurffer@…
Owned by:
Priority: 7, high
Milestone: 1.9 "Killer Queen"
Component: Any (Default Component)
Version: 1.9RC5
Keywords: ratings; covers; duplicates
Cc:

Description

nice player, but ...

1 Did you call it "killer queen" because it resets ratings if you make a deep refresh? You should mention this because it's annoying! Moreover, it cannot be helped if one adds extra tags.

2 'albums catalog' doesn't show covers now, in contrast to RC4.

3 'duplicate finder' doesn't list very similar files like 'Paffendorf - Self Control.mp3' and 'Paffendorf - Self Control_2.mp3'.

hth

Change History

Changed 19 months ago by bflorat

  • summary changed from ratings & album covers are gone; duplicate finder should find similar files to Duplicate finder should find similar files
  • milestone changed from To Be Qualified by Jajuk Team to 1.9 "Killer Queen"

1 : hum :-/ , we're sorry about it. This is a bug we are already aware of, but we have not been able to reproduce it so far. See [1] to restore a previous backup if available. Downgrading to 1.8.4 should preserve your ratings (not 100% sure, but it is worth a try). It will probably be fixed in the next RC.

2 : this has already been reported too. It seems to be an RC5 regression.

3 : Please provide the duration, size, and tag values of these two files so we can tell whether they should be seen as duplicates.

[1] http://www.jajuk.info/index.php/Using_devices#Collection_backup_and_restore

Changed 19 months ago by silversurffer@…

Well, the problem arises with web recordings made by 'AudialsOne'. The tags are identical; the only differences are in the file name (an additional suffix "_2", "_3", ...) and, very often, in the file length.

At least 'duplicate finder' should find files with the same tags but different file length.

Rgds, Marc

PS. Besides, I think it would be helpful if the 'file date' (in config/advanced/discovery date) used the file's creation date rather than its modification date.

Changed 18 months ago by bflorat

  • On the duplicate finder issue : jajuk sees two tracks as identical if and only if they have the same :
    • genre name (case sensitive)
    • artist name (case sensitive)
    • album name (case sensitive)
    • album CDDB disc id (computed by jajuk from the durations of the files in the same directory)
    • track name (case sensitive)
    • Year
    • Duration (in secs)
    • Track#
    • Type (mp3, ogg, wma...)
    • Disk number

The file name is not taken into account.

If you feel that this contract is not honored in the case of your files, please send me a sample of two identical files at bertra{x}d@… (x=n)

Note also that even if two files have the same tags and the same duration, they are seen as different if they are located in two directories containing different numbers of files (see the CDDB disc id criterion above).
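
To make the contract above concrete, here is a minimal Java sketch of such a strict comparison. It is not Jajuk's actual code: the `StrictDuplicateCheck` and `TrackInfo` names, the field types, and the representation of the disc id as a string are all assumptions made for illustration.

```java
import java.util.Objects;

/** Sketch of the strict duplicate test; every name here is hypothetical, not Jajuk's API. */
final class StrictDuplicateCheck {

    /** Minimal stand-in for the tag data of one track. */
    static final class TrackInfo {
        final String genre;
        final String artist;
        final String album;
        final String discId;    // CDDB disc id, which depends on the other files in the directory
        final String name;
        final int year;
        final int durationSecs;
        final int trackNumber;
        final String type;      // "mp3", "ogg", "wma", ...
        final int discNumber;

        TrackInfo(String genre, String artist, String album, String discId, String name,
                  int year, int durationSecs, int trackNumber, String type, int discNumber) {
            this.genre = genre;
            this.artist = artist;
            this.album = album;
            this.discId = discId;
            this.name = name;
            this.year = year;
            this.durationSecs = durationSecs;
            this.trackNumber = trackNumber;
            this.type = type;
            this.discNumber = discNumber;
        }
    }

    /** True only when all ten criteria match; the file name is never consulted. */
    static boolean isDuplicate(TrackInfo a, TrackInfo b) {
        return Objects.equals(a.genre, b.genre)      // String equality is case sensitive
            && Objects.equals(a.artist, b.artist)
            && Objects.equals(a.album, b.album)
            && Objects.equals(a.discId, b.discId)
            && Objects.equals(a.name, b.name)
            && a.year == b.year
            && a.durationSecs == b.durationSecs
            && a.trackNumber == b.trackNumber
            && Objects.equals(a.type, b.type)
            && a.discNumber == b.discNumber;
    }
}
```

Under these assumptions, two tracks with identical tags and duration still fail the test as soon as their disc ids differ, which is what happens when the files sit in directories with different numbers of files.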

I agree the current disambiguation algorithm is probably too restrictive. I created this task for future releases : #1698 ([Duplicate finder] Fuzzy search).

  • On the ratings resets : another user reported it (see #1630), but we cannot reproduce it so far. It would be very kind of you to provide more details. Did it occur during a 1.8.4 to 1.9RC5 migration, or after using 1.9RC5 for a while and performing a deep refresh ? Were all your ratings lost or only some ? Did you add extra tags before this happened ?
  • On your "file date" suggestion : I totally agree but I'm afraid it can't be done because the "creation date" concept is MS Windows specific AFAIK and Java doesn't provide any way to retrieve it.

Changed 18 months ago by bflorat

  • status changed from new to closed
  • resolution set to fixed

About the duplicate finder : after more thought, we found the current duplicate criteria so restrictive that it was almost a bug to ignore two exactly identical tracks located in different directories with different numbers of files.

This is why we checked in a fix. We now compare the album name, not the album CDDB disc id. We now consider two tracks duplicates if they have the same :

  • genre name (case sensitive), if provided
  • artist name (case sensitive), if provided
  • album name (case sensitive), if provided
  • track name (case sensitive)
  • Year, if provided
  • Duration (in secs)
  • Track#, if provided
  • Type (mp3, ogg, wma...)
  • Album-artist (case sensitive) , if provided
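
The revised criteria can be sketched in Java as follows. This is an illustration, not the checked-in fix: the `RelaxedDuplicateCheck` and `TrackInfo` names are hypothetical, a missing tag is modelled as `null`, and "if provided" is interpreted here as "compared only when both tracks provide the value", which is an assumption about the intended semantics.

```java
/** Sketch of the relaxed duplicate test; every name here is hypothetical, not Jajuk's API. */
final class RelaxedDuplicateCheck {

    /** Minimal stand-in for the tag data of one track; null means "not provided". */
    static final class TrackInfo {
        final String genre;         // optional
        final String artist;        // optional
        final String album;         // optional
        final String albumArtist;   // optional
        final Integer year;         // optional
        final Integer trackNumber;  // optional
        final String name;          // always compared
        final int durationSecs;     // always compared
        final String type;          // always compared: "mp3", "ogg", "wma", ...

        TrackInfo(String genre, String artist, String album, String albumArtist,
                  Integer year, Integer trackNumber, String name, int durationSecs, String type) {
            this.genre = genre;
            this.artist = artist;
            this.album = album;
            this.albumArtist = albumArtist;
            this.year = year;
            this.trackNumber = trackNumber;
            this.name = name;
            this.durationSecs = durationSecs;
            this.type = type;
        }
    }

    /** An optional value rules out a match only when both sides provide it and the values differ. */
    private static boolean optionalMatches(Object x, Object y) {
        return x == null || y == null || x.equals(y);
    }

    /** Mandatory criteria must be equal; optional ones are skipped when missing on either side. */
    static boolean isDuplicate(TrackInfo a, TrackInfo b) {
        return a.name.equals(b.name)                 // case sensitive
            && a.durationSecs == b.durationSecs
            && a.type.equals(b.type)
            && optionalMatches(a.genre, b.genre)
            && optionalMatches(a.artist, b.artist)
            && optionalMatches(a.album, b.album)
            && optionalMatches(a.albumArtist, b.albumArtist)
            && optionalMatches(a.year, b.year)
            && optionalMatches(a.trackNumber, b.trackNumber);
    }
}
```

Note that the duration remains a mandatory criterion, so under this sketch two recordings with identical tags but different lengths are still not reported as duplicates; that case is left to the fuzzy search tracked in #1698.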

We close this ticket, feel free to reopen if required.

Feedback about the deep refresh issue is still appreciated in ticket #1630.

Changed 18 months ago by bflorat

I forgot : the fix will be available in the 10 Aug daily build, in 1.9 RC6, and in 1.9 final.
