Add unsplash search terms

Bug #1527998 reported by jpeg729
This bug affects 2 people
Affects: Variety
Status: Triaged
Importance: Wishlist
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

For example, it would be nice to be able to download only "landscape" images from unsplash.

jpeg729 (jpeg729) wrote :

I just noticed that the Unsplash source currently doesn't work, and that they have added an API.

https://source.unsplash.com/

If I'm not too busy, I might try hacking it myself in a day or two.

jpeg729 (jpeg729) wrote :

Done.

I copied the Flickr code and hacked it into shape for Unsplash.

I'm more familiar with git than bazaar, so I tried following the directions I found in http://doc.bazaar.canonical.com/latest/en/tutorials/using_bazaar_with_launchpad.html

I pushed it to a personal branch using "bzr push lp:~jpeg729/variety/unsplash". I hope that is alright.

It doesn't queue downloads because the API doesn't return a list of potential downloads; it just does an HTTP redirect to a random image matching your search terms.

I didn't put anything in the Test file because I wasn't sure what to put. None of the checks done for the Flickr downloader seemed relevant.

I haven't done anything about getting it ready for translation, other than using _("...") where that seemed appropriate. I hope that is enough.

jpeg729 (jpeg729) wrote :

I have just pushed a bugfix and a new feature.

I hadn't properly understood the difference between origin_url and image_url, so the wrong URLs were getting banned. Now origin_url is the image URL without the various options present in image_url, so the same image is banned whatever options are given.
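A minimal sketch of that idea, assuming the options are carried in the URL's query string (the thread does not show the actual URL layout, and the helper name and example URLs below are hypothetical):

```python
from urllib.parse import urlsplit, urlunsplit


def origin_url_from_image_url(image_url: str) -> str:
    """Drop the query string and fragment so every variant of an image
    (crop, width, height, ...) maps to the same URL for banning."""
    parts = urlsplit(image_url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# Hypothetical URLs, for illustration only.
cropped = origin_url_from_image_url(
    "https://images.example.com/photo-123?fit=crop&w=1920&h=1080"
)
scaled = origin_url_from_image_url("https://images.example.com/photo-123?w=1280")
```

Both calls yield the same string, so banning that string bans the image regardless of the options it was downloaded with.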

The new feature is the ability to choose whether to crop the image to your screen size or to scale the image down. This is an Unsplash feature that I discovered by messing with the image URLs it gave. Crop, the default option, keeps the most interesting part of the image.

jpeg729 (jpeg729) wrote :

I have reverted the crop/scale options because they make Variety download images that don't fit the screen. The default crop based on image entropy works really well.

jpeg729 (jpeg729) wrote :

I've had it running for a few days, downloading a new wallpaper from Unsplash every 10 minutes, and no obvious problems have cropped up.

I shall wait for your feedback. No rush though, I am happy with what I have.

Peter Levi (peterlevi) wrote :

@jpeg729
Hi, sorry for replying late, not enough time for Variety lately...
First, thanks a lot for the input. I checked it out, gave it a try, and briefly looked at the code. I have some concerns, most of them related to the Source API:

1. This one is major. I prefer to credit the image authors (name and URL). It seems the Source API does not provide them, and if this is the case, then it's a no-go and we should use the full version. The old version of the downloader had author info and I definitely prefer to keep it this way for Unsplash. Look for this code in the previous version of the downloader to see how to do it on Variety's side:

                extra_metadata = {
                    'sourceType': 'unsplash',
                    'sfwRating': 100,
                    'author': item.find_all('a')[1].contents[0],
                    'authorURL': self.location + item.find_all('a')[1]['href'],
                }

But if the Source API gives us no way to do this, we'll be better off using the JSON API. I will need to research the APIs some more, as I probably have to create an API application account. The problem with the JSON API is that it has rate limiting: 5000 requests per hour. If we limit ourselves to search operations, use a queue, and do not make additional API requests to fetch individual photos, these 5000 requests/hour should be enough.

2. Do not rely on the user's screen size in the downloader. It can change at any time: some users have multiple monitors, or switch between laptop and external monitor, etc. Consider it very, very unreliable. The usual best approach is to always fetch the highest possible resolution and let the cropping/scaling be handled by the wallpaper logic of the OS (the user also has control over this via the Appearance settings). But it seems the Source API does not give us good control here: if we don't specify a size, we get a very small image, otherwise we get a cropped image.

3. The source type "unsplash" collides with the previous version of the UnsplashDownloader, so the existing "unsplash" source misbehaves (the one with location "High-resolution photos from Unsplash.com"). I guess the new UnsplashDownloader tries to parse this location unsuccessfully. Since all users have this source in their current configurations, we need to handle this better. One way is to catch such location parsing errors and default to the most sensible filter, "Use all featured content", and to also use the same special location "High-resolution photos from Unsplash.com" when the user configures an "All featured content" unsplash source, so that the label is kept if they click Edit and we don't change anything in their existing source when they do such a non-edit.

4. Quality vs. quantity. The vast majority of Variety users don't really care about the customization aspects of the image sources; most users prefer high-quality sources configured out-of-the-box and never tinker with them. This is why the old Unsplash source was using the featured images. We need to make sure the default included source is the same high-quality source it used to be. Also the GUI should be tailored so that it is very easy to just add this...

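The backwards-compatibility fallback in point 3 could look roughly like the sketch below. It is not Variety's code: the "featured;term" location format is only inferred from examples elsewhere in this thread, and the function names and filter dictionary are hypothetical.

```python
import re

# The legacy label quoted in point 3; existing configurations contain it verbatim.
LEGACY_LOCATION = "High-resolution photos from Unsplash.com"

# Assumed token syntax: lowercase words, digits, spaces, '_' and '-'.
_TOKEN = re.compile(r"^[a-z0-9 _-]+$")


def parse_location(location):
    """Parse an assumed 'featured;term;term' location, raising ValueError if malformed."""
    tokens = [t.strip() for t in location.split(";")]
    if any(not _TOKEN.match(t) for t in tokens):
        raise ValueError("unparseable location: %r" % location)
    featured = tokens[0] == "featured"
    return {"featured": featured, "terms": tokens[1:] if featured else tokens}


def load_filter(location):
    """Fall back to 'all featured content' when the location cannot be parsed."""
    try:
        return parse_location(location)
    except ValueError:
        return {"featured": True, "terms": []}


def location_for_filter(flt):
    """Keep the legacy label for an unfiltered featured source so a no-op Edit changes nothing."""
    if flt["featured"] and not flt["terms"]:
        return LEGACY_LOCATION
    return ";".join((["featured"] if flt["featured"] else []) + flt["terms"])
```

With this shape, loading the legacy location and saving it again round-trips to the same label, which is the "non-edit" behaviour asked for above.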

jpeg729 (jpeg729) wrote :

I can't say I'll have much time soon, and neither will you, I suppose. But yes, I'd be happy to look into it.

1. I see your point, even if the images are provided under a do-what-you-like-with-them license. I checked a few of the downloaded images to see if their metadata included the author, and none do. Even the full images don't. Maybe we could ask them to see about including it automatically. That would make things easier for us, as we'd only have to extract it from the jpg.

The api docs warn constantly that the photo info is truncated, but if we do a search, the information returned does include the author, just not the exif info, the location and a ton of other info. So that should be doable.

2. I see your point about screen size. However, the full size images (obtained by removing the crop and size parameters from the image URL) can be HUGE; besides, the entropy crop seems to work pretty well. I would suggest leaving the crop option available for those who might want it, and maybe adding optional maximum dimensions.

There is also the image shape problem. Some images are very far from being anywhere near square.

Maybe we should simply try a range of standard screen sizes. As 1920x1080 won't return images that are smaller, we could try 1600x1200 if there were none bigger, and so on until we reach the current screen size.

3. Good point, I'll see what I can do about backwards compatibility.

4. Maybe we could add a few searches for curated photos to the default list of image sources. For example: "featured;mountains" and "featured;landscapes".
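The size fallback suggested in point 2 might be sketched as follows. The list of standard sizes and the `fetch` callback are assumptions made for illustration; `fetch` stands in for whatever request asks Unsplash for an image of a given size and returns None when nothing that large is available.

```python
from typing import Callable, List, Optional, Tuple

Size = Tuple[int, int]

# Assumed list of common sizes, largest first; not taken from Variety.
STANDARD_SIZES: List[Size] = [(1920, 1080), (1600, 1200), (1366, 768), (1280, 720)]


def candidate_sizes(screen: Size, sizes: List[Size] = STANDARD_SIZES) -> List[Size]:
    """Standard sizes at least as large as the screen, ending with the screen size itself."""
    fits = [s for s in sizes if s[0] >= screen[0] and s[1] >= screen[1]]
    if screen not in fits:
        fits.append(screen)
    return fits


def first_match(
    screen: Size, fetch: Callable[[Size], Optional[str]]
) -> Optional[Tuple[Size, str]]:
    """Try each candidate size in order and return the first one that yields an image."""
    for size in candidate_sizes(screen):
        url = fetch(size)
        if url is not None:
            return size, url
    return None
```

This still depends on the current screen size as the final fallback, so it does not by itself answer the reliability concern raised earlier in the thread.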

jpeg729 (jpeg729) wrote :

I'm going to suggest a hybrid approach - use the api for some stuff, and use html parsing for other stuff.

For example, we can browse the curated_batches at https://unsplash.com/collections/curated, so when we find a good one, we can add it as a default source. Unfortunately, none of them seem to be that large.

I'd be happy to write some html parsing stuff for the featured content. Unfortunately, I still see no way of searching for curated content, other than by the source.unsplash.com api. An alternative would be to accept only images with a minimum number of likes.

I'm still concerned by the image size question. I tried using full size images from Unsplash on my not-so-slow laptop with an SSD, and the image list that shows along the bottom of the screen took ages to load. Some images were over 5 MB in size. I guess most users would be happy with a sensible default maximum that they can customize if needed. How about 1920x1920 as a default maximum image size, or, if you prefer, as an option that can be enabled by those who choose to?
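For illustration, a maximum-dimension cap like the proposed 1920x1920 could be computed as below. This is only a sketch: the function name is hypothetical, and how the resulting dimensions would be passed to Unsplash is not covered here.

```python
def fit_within(width: int, height: int, max_side: int = 1920) -> tuple:
    """Scale (width, height) down to fit inside max_side x max_side,
    preserving aspect ratio and never upscaling."""
    scale = min(1.0, max_side / width, max_side / height)
    return max(1, round(width * scale)), max(1, round(height * scale))


landscape = fit_within(6000, 4000)  # large photo gets scaled down
small = fit_within(1280, 720)       # already within the cap, left unchanged
```

Capping the longer side rather than cropping keeps very wide or very tall images intact, which leaves the final crop to the desktop's wallpaper settings.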

James Lu (jlu5)
Changed in variety:
importance: Undecided → Wishlist
James Lu (jlu5)
Changed in variety:
status: New → Triaged