
Making art from art

At the start of the year I scraped over 800,000 FOI requests from WhatDoTheyKnow to look for patterns in the data. This sent me down a number of rabbit holes, including experiments with automatic request classification. I have spent a lot of time thinking about the text of the requests and responses, but not about the images.

Alaveteli spits these out and lists them as attachments. They’re mainly logos and promotional stuff that is unrelated to the information being asked for, so I’ve never really looked at them.

I decided I’d take a subset of 45,000ish images, and pass them to Gemma 3 12B to analyse. I started at request_id 1,000,000 and worked forward in time from there. To avoid sending the same image to the model multiple times, I recorded a SHA-256 hash of the image to skip ones that were already in the database. This was not foolproof, but it removed a lot of the noise.
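The skip logic was along these lines (a rough sketch rather than the actual pipeline code; the SQLite table and column names are assumptions, not the real schema):

```python
import hashlib
import sqlite3

# Sketch of the dedup step: hash each attachment's bytes and skip images
# already recorded. The database layout here is assumed for illustration.
conn = sqlite3.connect("images.db")
conn.execute("CREATE TABLE IF NOT EXISTS seen (sha256 TEXT PRIMARY KEY)")

def is_new_image(image_bytes: bytes) -> bool:
    """Return True the first time a given image is seen, False for repeats."""
    digest = hashlib.sha256(image_bytes).hexdigest()
    try:
        conn.execute("INSERT INTO seen (sha256) VALUES (?)", (digest,))
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        return False  # identical bytes already processed, skip this one
```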

I asked the model to output JSON with the following fields (a sketch of what a per-image call might look like follows the list):

  • altText: An alt text description
  • subjectType: The main subject of the image.
  • tags: A list of relevant keywords.
  • visibleText: Any text found within the image.
  • is_photo & is_logo: Booleans indicating whether the image is a photo or a logo.
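Something like the following would do the job, assuming Gemma 3 12B is served locally through Ollama's REST API (the hosting setup, prompt wording and endpoint are illustrative assumptions, not necessarily what was actually run):

```python
import base64
import json
import requests

# Sketch of a per-image analysis call against a local Ollama server.
PROMPT = (
    "Describe this image as JSON with the keys: altText, subjectType, "
    "tags (a list of keywords), visibleText, is_photo and is_logo."
)

def analyse_image(path: str) -> dict:
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "gemma3:12b",
            "prompt": PROMPT,
            "images": [encoded],
            "format": "json",   # ask Ollama to constrain the output to valid JSON
            "stream": False,
        },
        timeout=300,
    )
    resp.raise_for_status()
    return json.loads(resp.json()["response"])
```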

This was also a chance to grab some new desktop wallpaper to replace the one linked too closely to my last job. I decided I’d record the top 5 dominant colours for each image in the dataset. Once I removed the blacks, whites and greys that dominated, I was happy with the result.
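The colour step can be sketched with Pillow along these lines (the thresholds and counting method here are illustrative rather than the exact ones used for the wallpaper):

```python
from PIL import Image

# Rough sketch: shrink the image, count pixel colours, drop near-black,
# near-white and low-saturation greys, then keep the five most common colours.
def dominant_colours(path: str, top_n: int = 5) -> list[tuple[int, int, int]]:
    img = Image.open(path).convert("RGB")
    img.thumbnail((100, 100))  # downsample so counting stays cheap
    counts: dict[tuple[int, int, int], int] = {}
    for colour in img.getdata():
        counts[colour] = counts.get(colour, 0) + 1

    def is_greyish(rgb: tuple[int, int, int]) -> bool:
        too_dark = max(rgb) < 30
        too_light = min(rgb) > 225
        low_saturation = max(rgb) - min(rgb) < 20
        return too_dark or too_light or low_saturation

    coloured = {c: n for c, n in counts.items() if not is_greyish(c)}
    return sorted(coloured, key=coloured.get, reverse=True)[:top_n]
```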

A horizontal spectrum wallpaper generated from public data

Other interesting finds included:

Having had a quick scan of the outputs, the model seems to have performed well, and if anything was overkill for this kind of project. I’d eventually like to process all the images, so I’m hoping to find as lightweight a solution as possible that is still accurate enough to be useful.
