PSA: Reporting Duplicates

Posts: 27 · Views: 1269
  • 8815

    About half of all reports are for duplicates and a lot of these are reported wrongly these days, so to hopefully get us all on the same page and save people some time and effort...

    Wallpapers are merged if the resolutions are the same or almost the same (to a margin of about 5-10%, so a 1920x1280 version would usually be merged with a 1920x1200 one). This is really just to weed out versions like 1920x1081 which are technically not identical to 1080p ones, but are utterly unnecessary.
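
    As a rough illustration of the margin rule above, here is a minimal sketch. The function name `within_margin` and the default of 10% are assumptions for the example only; actual merging is a judgment call by staff, not a formula.

```python
def within_margin(res_a, res_b, margin=0.10):
    """Return True if width and height each differ by at most `margin`.

    `res_a` and `res_b` are (width, height) tuples. The relative difference
    is measured against the larger of the two values. Hypothetical helper,
    not part of the site.
    """
    (wa, ha), (wb, hb) = res_a, res_b

    def close(x, y):
        return abs(x - y) / max(x, y) <= margin

    return close(wa, wb) and close(ha, hb)


# 1920x1280 vs 1920x1200: heights differ by about 6%, so a merge candidate
print(within_margin((1920, 1280), (1920, 1200)))
# 1920x1080 vs 3840x2160: far outside the margin, so a grouping candidate
print(within_margin((1920, 1080), (3840, 2160)))
```

    Pairs that fail this check but show the same image are the ones to report under Other for grouping.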

    When merging, the oldest wall that's the same or almost the same resolution is what other walls get merged with. There are exceptions to this if

    1. the older wall is of visibly lower quality
    2. the older wall has had its original author's watermark removed (wallpaperswide.com and user marks don't count, kids).

    In these cases, better quality or being watermarked takes precedence over age. If an upload of yours was deleted and you don't understand why, it might be because of this.
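
    The precedence described above could be sketched roughly as follows. The `Wall` class, its fields and `pick_merge_target` are hypothetical names for illustration, and reducing "visibly lower quality" to a yes/no flag is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class Wall:
    wall_id: str
    uploaded: int                    # upload order; smaller means older
    low_quality: bool = False        # assumed flag: visibly lower quality
    watermark_removed: bool = False  # assumed flag: author's watermark removed


def pick_merge_target(walls):
    """Pick the wall that the others get merged into.

    Walls that are neither low quality nor stripped of the author's
    watermark take precedence; among those, the oldest wins. If every
    wall is flagged, fall back to the oldest overall.
    """
    preferred = [w for w in walls if not (w.low_quality or w.watermark_removed)]
    candidates = preferred or walls
    return min(candidates, key=lambda w: w.uploaded)


walls = [
    Wall("a", uploaded=1, watermark_removed=True),
    Wall("b", uploaded=2),
    Wall("c", uploaded=3),
]
print(pick_merge_target(walls).wall_id)  # "b": oldest wall with the watermark intact
```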

    Otherwise, walls should be grouped. When this needs to be done, just report under Other and put the URL of the original wall in the info field. Walls that are already grouped shouldn't be reported as dupes without good reason.

    Please refer back to this if you're unsure when reporting and discuss as well - we realise it's not fun taking the time to upload walls and having them taken down for being dupes, it's not an exact science, it's work in progress, etc...

  • 8825

    First, if I see two identical wallpapers and one of them has 1920x1080, and the other let's say 3840x2160, then I should report for 'Other' and provide a link to a different sized wallpaper? Second, if two or more wallpapers are grouped because of different resolutions, where can I see other sizes? What if I see a wallpaper with Full HD size, and I know it's grouped with 2K/4K and I want bigger sized image on my desktop?

  • 8828

    @dwemer, if it's grouped then right under the resolution it has a link for "1 more size"

  • 8834

    sannukas0016 thank you, now it's clear

    Added 2016-06-07 16:04:31

    I have another question related to groups of wallpapers. Let's say I have a wallpaper in many different resolutions: 4K, 2K, Full HD and so on. Should I upload all of them, or is the biggest one enough?

  • 8847

    The biggest one would be the best option, but you could upload all of them and just group them (which I believe is automatic, though I'm not entirely sure).

  • 8857

    You should not upload all resolutions but instead only the largest and, if that is not already a standard resolution, one standard resolution (like 1920x1080). If you upload more than that, we will have to delete them manually and will likely become annoyed.
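
    A small sketch of that rule, for anyone scripting their uploads. The `STANDARD` set and the function name are assumptions for the example; only 1920x1080 is named as standard in this thread.

```python
# Assumed list of "standard" resolutions; only 1920x1080 comes from the thread.
STANDARD = {(1920, 1080), (2560, 1440), (3840, 2160)}


def resolutions_to_upload(available):
    """Keep the largest version, plus one standard one if the largest is not standard."""
    largest = max(available, key=lambda r: r[0] * r[1])
    if largest in STANDARD:
        return [largest]
    standard_ones = [r for r in available if r in STANDARD]
    if not standard_ones:
        return [largest]
    # Choosing the biggest available standard size is an arbitrary tie-break.
    return [largest, max(standard_ones, key=lambda r: r[0] * r[1])]


print(resolutions_to_upload([(1920, 1080), (2560, 1440), (5000, 3000)]))
print(resolutions_to_upload([(3840, 2160), (1920, 1080)]))
```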

  • 8861

    I found these two Wallpapers:

    [image: 1920 x 1080]

    [lq3o52]

    They differ only minimally in resolution. So is this a case of a duplicate? Shall I report it?

  • 8942

    @WallpaperManiac, there's an example of two walls of similar enough resolutions, so that's the kind of thing we'd like folks to report. I've already merged the second with the first.

  • 8958

    What about these?

    [image: 4500 x 3168]
    [5d75j5] They differ in size and are almost the same, but the details differ. Should I report something like these to be grouped, or should they be treated as completely different walls?

  • 8959

    @dwemer, the differences are big enough that you can't count them as 1 image.

  • 8978

    What about low effort collages like this that are just other users' uploads in non-standard resolutions? [eoxvoo]

    [image: 2960 x 1850, PNG]
    [image: 1653 x 2338]
    [image: 2052 x 3507]

  • 8987

    KrimzinZV, these aren't dupes but we usually consider this kind of thing as a low-quality edit, so against the rules.

  • 8989

    Please look in the Best of the Worst thread; there you will find some really good examples of this kind of low-quality edit.

  • 9444

    You don't need the public for this. You could easily create an algorithm in Python that compares images pixel by pixel to search for duplicates.

  • 9447

    JCarlin6 said:

    You don't need the public for this. You could easily create an algorithm in Python that compares images pixel by pixel to search for duplicates.

    There's just so much wrong about that…

  • 9459

    Gandalf JCarlin6

    Pixel-by-pixel or byte-by-byte is a really really bad idea.

    Wallhaven currently hosts more than 400K wallpapers, with an average of almost 700 MB per 1K wallpapers. Not to mention that you'd have to retrieve each uploaded image, read its data, and compare it with all the other wallpapers?! Also, one could simply convert the image. So you would have to spend more than 6-7 days comparing a single wallpaper with the other 400K, and this would have a huge impact on both disk (reading from disk) and memory (running the application).

    The best you can do is to categorize every uploaded image by uploader / size / author (png) / type / tags / date created / dimensions.

    Long story short, it is tooooo late!

  • 9470

    Look, there are ways to find duplicates quickly. They are just a little more complicated than "compare all the pixels". We're going to be using IQDB, which should be able to find similar wallpapers quickly enough. It's just a bit annoying to implement because IQDB is a standalone program (not a library) and doesn't come with a handy PHP Interface. But we'll get there. ^^
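
    To give a feel for what "a little more complicated than compare all the pixels" can mean, here is a minimal sketch of a different, much simpler technique: an average hash. This is not IQDB's algorithm and not what the site runs; the function names are made up for the example, and it assumes each image has already been shrunk to a tiny grayscale thumbnail of fixed size.

```python
def average_hash(pixels):
    """Turn a flat list of grayscale values into a list of bits.

    Each bit says whether that pixel is brighter than the thumbnail's mean,
    so small re-encoding or resizing noise rarely flips a bit.
    """
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]


def hamming(a, b):
    """Count the positions where two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))


original = [10, 200, 30, 220]    # toy 2x2 "thumbnail"
reencoded = [12, 198, 33, 215]   # same picture, slightly different values
unrelated = [200, 10, 220, 30]   # a different picture

print(hamming(average_hash(original), average_hash(reencoded)))  # 0
print(hamming(average_hash(original), average_hash(unrelated)))  # 4
```

    The point of this kind of approach is that each wallpaper is reduced to a short signature once, at upload time, so a lookup compares signatures instead of re-reading every stored image.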

  • 9488

    Gandalf, it doesn't have to be complicated. Besides, you can implement it after the alpha phase is over.

  • 9490

    Better dupe detection will hit this year for sure. We won't leave alpha without it being implemented.

  • 9492

    Holy, nothing has to be complicated, but it often ends up being that way when you want to achieve several things within a specific system. We never came to a firm conclusion amongst ourselves about the extent to which we should tolerate dupes, and this thread is evidence that we don't all have the same ideas about what constitutes a dupe in the first place. That's a separate issue but informs what's implemented.

    Solid dupe detection is one of the niggles we'd like sorted in plenty time before alpha is over (and has actually been an issue since pretty much the first week). Like you said yourself, it's too late to go about this via certain methods, which is part of the reason we'll be starting fresh again in whatever capacity.

    Anyway, thanks to you guys for chipping in so far...

  • 9499

    AksumkA That's amazing! byebye alpha!

    cfunk

    it often ends up being that way when you want to achieve several things within a specific system.

    I wouldn't know.

    we don't all have the same ideas about what constitutes a dupe in the first place

    1) Not searching if an image already exists. 2) Increasing the uploaded images count.

    we'll be starting fresh again in whatever capacity.

    Mwahah!!! More png-24!

  • 9508

    I think that not all of these should be deleted, or, at least two should be preserved. [4ymyd7][4y5mwl]

    [image: 2500 x 1527]
    [492o3w]

    100216 (last one) is probably the worst copy, doesn't have the watermark and is of the smallest size.

    Maybe mark the other three as "other size"; that would be my suggestion.

    I even had this as a wallpaper for some time.

  • 9549

    Vozho, this is the kind of situation where we'd group, which is what I've done (but #2 and #3 are just barely different enough).
