Wallhaven is easy to use for external programs/bots

Posts: 3 · Views: 139
  • 13234

    What do you guys think of external applications using Wallhaven infrastructure?

    Using my program, I have already gone through the first 20k wallpapers by hand, glancing at each one and saving it if it catches my eye. (I can go through about 1k in 10 minutes.)

    an example: https://www.youtube.com/watch?v=JfMAkDZqs9I

    In case you guys are against it: the code only exists on my computer, nowhere online, and the video is unlisted. Also pardon my bad video skillz; this is literally the first video I have ever "produced".

    Off topic, but some interesting things: 1) From a (non-random) sample of about 10k, the average file size seems to be about 500 KB. 2) Of all the pictures, about 60% seem to be softcore porn. The website filters it, but my program doesn't. :| 3) The total size of all images has to be somewhere around 275 GB, excluding metadata and whatnot.

  • 13237

    If I understand your video correctly, that's a scraper, though I'm not quite sure whether it's interactive. If you use it responsibly that's fine, but if you run it on a larger scale it may impact our server performance, in which case we'd have to do something about it.

    As you have already noticed, it's not particularly difficult to write a scraper for wallhaven (I have one myself to synchronize my collections). We may do a bit of work in the future to make this a little more difficult: not because we want people to "only use wallhaven via browser", which would be silly, but to prevent large-scale scraping. For example, it's considered a matter of politeness not to parallelize a scraper (for any website, not just here).
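
    To illustrate, here is a minimal Python sketch of what a polite, strictly sequential scraper loop can look like. The URL pattern and the delay are placeholders, not our actual scheme or an official limit:

        import time
        import requests

        BASE_URL = "https://example.com/wallpaper/{}"  # placeholder URL scheme
        DELAY_SECONDS = 1.0  # pause between requests; tune to taste

        def fetch_page(wallpaper_id):
            """Fetch one wallpaper page; return None if the ID doesn't exist."""
            resp = requests.get(BASE_URL.format(wallpaper_id), timeout=10)
            if resp.status_code == 404:
                return None
            resp.raise_for_status()
            return resp.text

        def scrape_range(start_id, end_id):
            """Walk IDs one at a time (never in parallel), sleeping between requests."""
            for wallpaper_id in range(start_id, end_id + 1):
                page = fetch_page(wallpaper_id)
                if page is not None:
                    yield wallpaper_id, page
                time.sleep(DELAY_SECONDS)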

    Meanwhile we're also hoping to one day (no ETA, sorry) offer an API. That should make it easier to create tools like this while allowing us to exert some control over the load individual tools may cause.

    As for your off-topic mentions: 2) That stat is a lot lower for the website overall. 3) Slightly over 300 GB at the moment.

  • 13239

    You know, I never thought to call it a scraper. This is the first one I've written, and while I know the term, I just didn't think of it.

    It's slightly interactive: I input the starting parameters (besides the max, which is usually determined by finding the highest number that doesn't give a 404; I gave it a constant value for the demonstration). I had never heard that it was polite not to parallelize scrapers, but that makes sense, and it doesn't even improve the timing by that much, so I'll remove it. As for extensive use, I usually do it in 10k bursts, which I then spend the next few days going over. Is that alright?
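
    For the curious, finding the max can be done with an exponential probe followed by a binary search on the first 404. Here's a rough Python sketch; the URL is a placeholder, and it assumes IDs are sequential with no gaps near the top, which isn't quite true if wallpapers get deleted:

        import requests

        BASE_URL = "https://example.com/wallpaper/{}"  # placeholder URL scheme

        def exists(wallpaper_id):
            # HEAD keeps each probe cheap; anything but a 404 counts as a hit.
            resp = requests.head(BASE_URL.format(wallpaper_id), timeout=10)
            return resp.status_code != 404

        def find_max_id():
            """Highest ID that exists, found with O(log n) probes."""
            hi = 1
            while exists(hi):      # double until the first miss
                hi *= 2
            lo = hi // 2           # last known hit (0 if even ID 1 missed)
            while lo + 1 < hi:     # binary search between last hit and first miss
                mid = (lo + hi) // 2
                if exists(mid):
                    lo = mid
                else:
                    hi = mid
            return lo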

    As easy as it is to do this, I suspect the site could become a popular target for scrapers. I would suggest using base-64 IDs like YouTube does, instead of sequential numbers. (If you are interested, I know of a nice video talking about it, though I'm sure you know a lot more than I do.)
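
    A sketch of what I mean, assuming YouTube-style 11-character IDs over the URL-safe base-64 alphabet. With 64**11 (roughly 7.4e19) possibilities, walking IDs by counting up stops being feasible:

        import secrets

        # URL-safe base-64 alphabet, the same 64 characters YouTube draws from
        ALPHABET = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                    "abcdefghijklmnopqrstuvwxyz"
                    "0123456789-_")

        def random_id(length=11):
            """Unguessable ID: 64**11 is about 7.4e19 possibilities."""
            return "".join(secrets.choice(ALPHABET) for _ in range(length))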

    An API would be nice.

    I have another idea: after the website reset, everything will be removed. I could set up a program that every once in a while would check for new wallpapers and scrape them. Since it would be staying up to date, I don't think it would be considered to have an extensive impact; more like a user who downloads every new wallpaper, say, hourly or daily.
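
    Something like this, reusing find_max_id from my earlier sketch; the interval and the download callback are just placeholders:

        import time

        CHECK_INTERVAL = 3600  # seconds, i.e. hourly; daily would be 86400

        def sync_forever(last_seen_id, download):
            """Poll for IDs newer than last_seen_id and fetch only those,
            so the steady-state cost is a handful of probes per interval."""
            while True:
                newest = find_max_id()  # from the probe sketch above
                for wallpaper_id in range(last_seen_id + 1, newest + 1):
                    download(wallpaper_id)   # placeholder callback
                    time.sleep(1)            # stay polite between downloads
                last_seen_id = newest
                time.sleep(CHECK_INTERVAL)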

    To prevent scraping abuse, I will not be releasing the code anywhere, and if you would like, I can edit the post and remove the link.

    On second thought, I did give an old copy of the exe to a friend. It's not parallelized, and I doubt he will abuse it, but I will ask him to delete it.
