Artificial intelligence: do artists leave a visual signature?
- As a fan of art and machine learning, I trained a set of convolutional neural networks to classify artwork and explore which styles and artists are most similar.
- Identifying the subject of a picture is relatively easy - the model achieves 81% accuracy with <20 epochs. Artistic style can be identified with an accuracy of c.76% - symbolic pieces proved hardest to distinguish. Detecting the artist proved challenging with an accuracy of c.74% on ~100 artists - I expect that this still exceeds non-expert human performance.
- This has been a really fun project to work on, allowing me to discover new styles and artists relating to my own work. I'm keen to take this work further.
- Tools used: fast.ai libraries, PyTorch, Google Colab, github here
Preparing the dataset
I obtained a large Kaggle dataset of around 100,000 artworks, sourced chiefly from wikiart and covering a wide variety of artists, styles and genres. I wanted to focus on popular artworks and preserve reasonable training times so applied a few filters:
- Filtered for 19th, 20th and 21st century art and selected the top artists, ranked by number of images. This yields around 100 artists with an average of 210 images per artist.
- Within this group, I filtered the top 15 subjects and top 15 styles. I removed genres which relate to the medium rather than the subject, like "illustration" or "sketch and study". I also collapsed similar categories which are visually very hard to differentiate, like "impressionism" and "post-impressionism", into "impressionistic".
- I used some image processing and limited augmentation - the images are resized to have a consistent smallest dimension of 256 pixels. I also used small rotations of <5 degrees and magnification factors of 1 to 1.2 to add some variety between training epochs.
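The filtering and preprocessing steps above could be sketched roughly as follows. This is a minimal illustration in plain Python, not the code used for the project: the record fields (`artist`, `style`), the `COLLAPSE` mapping and all helper names are assumptions, and the toy records stand in for the Kaggle metadata.

```python
import random
from collections import Counter

# Hypothetical mapping that merges visually similar styles into one label.
COLLAPSE = {
    "impressionism": "impressionistic",
    "post-impressionism": "impressionistic",
}

def collapse_style(style):
    """Normalise a style name and merge near-duplicate categories."""
    key = style.strip().lower()
    return COLLAPSE.get(key, key)

def keep_top(records, field, n):
    """Keep only records whose `field` value is among the n most frequent."""
    counts = Counter(r[field] for r in records)
    top = {value for value, _ in counts.most_common(n)}
    return [r for r in records if r[field] in top]

def resize_to_smallest_side(width, height, target=256):
    """Return a new (width, height) whose smaller side equals `target`,
    preserving the aspect ratio."""
    scale = target / min(width, height)
    return round(width * scale), round(height * scale)

def sample_augmentation(rng):
    """Draw one random augmentation: a small rotation and a mild zoom."""
    return {"rotate_deg": rng.uniform(-5, 5), "zoom": rng.uniform(1.0, 1.2)}

# Toy metadata (made up for illustration).
records = [
    {"artist": "A", "style": "Impressionism"},
    {"artist": "A", "style": "Post-Impressionism"},
    {"artist": "A", "style": "Realism"},
    {"artist": "B", "style": "Realism"},
    {"artist": "B", "style": "Symbolism"},
    {"artist": "C", "style": "Cubism"},
]
for r in records:
    r["style"] = collapse_style(r["style"])

filtered = keep_top(records, "artist", 2)   # top 2 artists by image count
filtered = keep_top(filtered, "style", 2)   # then top 2 styles within that group

print(filtered)
print(resize_to_smallest_side(1024, 768))
print(sample_augmentation(random.Random(0)))
```

Filtering artists first and styles second mirrors the order described above; swapping the order would generally keep a different subset.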
This creates a dataset of around 20,000 images with a good mix of modern genres, styles and artists. I then split this 80:20 for training and validation. I made no attempt to stratify labels across the two sets, but each label is likely to appear in both. Note that there are significant class imbalances, but these did not cause training issues.
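A plain random 80:20 split of this kind can be sketched as below. The function name, the seed and the toy labels are assumptions for illustration; the split is deliberately not stratified, so printing the per-class counts is a quick way to check that every label lands in both sets.

```python
import random
from collections import Counter

def split_train_valid(items, valid_fraction=0.2, seed=42):
    """Shuffle a copy of `items` and split it into (train, valid)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_fraction)
    return shuffled[n_valid:], shuffled[:n_valid]

# Toy labels with a deliberate class imbalance (made up for illustration).
labels = ["landscape"] * 60 + ["portrait"] * 30 + ["still life"] * 10
train, valid = split_train_valid(labels)

print(len(train), len(valid))
print("train:", Counter(train))
print("valid:", Counter(valid))
```

If a rare label turned out to be missing from the validation set, a stratified split (sampling 20% within each label) would be the usual alternative.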
Interestingly, the problem set does not reduce to identifying the artists - many of the artists in the collection produce work spanning across genres and styles, often through different periods of their working lives.
Building the model
I used the fast.ai tools, which sit on top of the PyTorch library, to apply a transfer learning approach. As a starting point I used a pre-trained ResNet-50 architecture (initially outlined here), which uses skip connections for strong performance on varied image recognition tasks.
I trained each problem (genre, style, artist) independently, although I am experimenting with transferring weights between tasks for efficiency. I employed a few tricks to obtain good results:
- Freezing all except the final layer for the first few epochs.
- Using differential learning rates across layers to focus learning in later layers.
- Using variable learning rate multipliers within epochs.
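To make the three tricks above concrete, here is a library-free sketch of the ideas rather than the fast.ai calls themselves. All function names and default values (`pct_warmup`, `start`, `end`) are assumptions chosen for illustration, and the cosine-shaped schedule is just one common form of a within-epoch learning rate multiplier.

```python
import math

def trainable_groups(n_groups, frozen):
    """Per layer group, whether its weights are updated.
    When frozen, only the final group (the new head) is trained."""
    if frozen:
        return [False] * (n_groups - 1) + [True]
    return [True] * n_groups

def differential_lrs(lr_min, lr_max, n_groups):
    """Geometrically spaced learning rates: smallest for the earliest
    layer group, largest for the final one."""
    if n_groups == 1:
        return [lr_max]
    ratio = (lr_max / lr_min) ** (1 / (n_groups - 1))
    return [lr_min * ratio ** i for i in range(n_groups)]

def one_cycle_multiplier(step, total_steps, pct_warmup=0.25, start=0.04, end=1e-4):
    """Multiplier applied to the base learning rate at a given step:
    cosine ramp from `start` up to 1.0, then cosine decay down to `end`."""
    warmup_steps = pct_warmup * total_steps
    if step < warmup_steps:
        pos = step / warmup_steps
        return start + (1.0 - start) * (1 - math.cos(math.pi * pos)) / 2
    pos = (step - warmup_steps) / (total_steps - warmup_steps)
    return end + (1.0 - end) * (1 + math.cos(math.pi * pos)) / 2

# Stage 1: frozen body, train only the head.
print(trainable_groups(3, frozen=True))
# Stage 2: everything trainable, with smaller rates for earlier groups.
print(trainable_groups(3, frozen=False))
print(differential_lrs(1e-5, 1e-3, 3))
# Learning rate multiplier at a few points in a 100-step run.
print([round(one_cycle_multiplier(s, 100), 4) for s in (0, 10, 25, 60, 100)])
```

The intuition is that early layers of a pre-trained network already encode generic features such as edges and textures, so they need only small updates, while the later layers and the new head have the most to learn.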
Given the ResNet model is already trained to recognise objects, detecting the subject of a painting might be expected to be the easiest task. Within 15 epochs I achieved a validation error rate of 19%, meaning around 1 in 5 images are misclassified. The confusion matrix below shows the pattern of errors between predictions and actual labels - cityscapes, landscapes and people and portraits are frequently confused and symbolic paintings are hard to differentiate.
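For readers unfamiliar with the metric, the error rate and the most-confused label pairs can be derived from predictions as in this small sketch. The labels and predictions are invented toy data, and the helper names are assumptions rather than anything from the project code.

```python
from collections import Counter

def confusion_counts(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))

def error_rate(actual, predicted):
    """Fraction of samples whose prediction differs from the label."""
    wrong = sum(a != p for a, p in zip(actual, predicted))
    return wrong / len(actual)

def most_confused(actual, predicted, k=3):
    """The k most frequent off-diagonal (actual, predicted) pairs."""
    counts = confusion_counts(actual, predicted)
    off_diagonal = Counter(
        {pair: n for pair, n in counts.items() if pair[0] != pair[1]}
    )
    return off_diagonal.most_common(k)

# Toy data (made up for illustration).
actual    = ["cityscape", "cityscape", "cityscape", "landscape", "portrait"]
predicted = ["landscape", "landscape", "cityscape", "landscape", "portrait"]

print(error_rate(actual, predicted))
print(most_confused(actual, predicted, k=1))
```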
Inspection of the largest losses - that is the cases where the model misclassified images with most certainty - is revealing. In several of these cases I would select the prediction ahead of the actual label; clearly there is judgement in many cases and multiple labels may be applicable to a single image (like a landscape view featuring buildings and people). Other errors are unsurprising like "religious paintings" being misclassified as "people and portraits". So the accuracy is probably underestimated and the model could plausibly be used to help enrich or correct labels.
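The idea of "largest losses" can be illustrated with per-image cross-entropy: the loss is the negative log of the probability the model assigned to the labelled class, so a confident wrong answer scores highest. The sample names, labels and probabilities below are invented, and the helpers are a sketch rather than the tooling used here.

```python
import math

def cross_entropy(probs, true_label):
    """Loss for one sample: -log of the probability given to the true label."""
    return -math.log(probs[true_label])

def top_losses(samples, k=2):
    """Return the k samples with the largest loss as (loss, name) pairs."""
    scored = [(cross_entropy(s["probs"], s["label"]), s["name"]) for s in samples]
    return sorted(scored, reverse=True)[:k]

# Toy predictions (made up for illustration).
samples = [
    {"name": "img_a", "label": "landscape",
     "probs": {"landscape": 0.90, "cityscape": 0.10}},
    {"name": "img_b", "label": "religious painting",
     "probs": {"religious painting": 0.05, "people and portraits": 0.95}},
    {"name": "img_c", "label": "people and portraits",
     "probs": {"people and portraits": 0.60, "landscape": 0.40}},
]

for loss, name in top_losses(samples):
    print(f"{name}: loss={loss:.3f}")
```

Sorting a validation set this way surfaces the images most worth reviewing by hand, which is also where questionable ground truth labels tend to show up.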
Artistic style would be expected to be harder to detect - this captures subtle use of colour or brushstrokes and there are no clear boundaries between styles. Nevertheless, after 15 tailored training epochs I achieved an accuracy of 76% with a few interesting observations:
- Realism can be difficult to distinguish from impressionistic art or romanticism but magic realism does appear distinguishable. Inspection of some of the errors suggests that the ground truth labelling may be incorrect in some cases.
- Like the symbolic art genre, the style of symbolism is especially hard to identify correctly - images labelled symbolic were frequently misclassified as impressionistic, art nouveau, realism, romanticism or surrealism. This is unsurprising given the cultural understanding required to identify symbolism.
For artist detection, with around 15 epochs the model yields an error rate of 24%, which I consider fairly impressive since there are c.100 artists in the dataset. However, there is clearly room for further improvement, as the Kaggle competition winner achieved an accuracy of above 90% on a broader dataset.
This suggests that artists do leave a clear signature in their pictures - although I'm investigating whether in some cases the model might be picking up the actual signatures!
Testing on specific paintings
I couldn't resist the temptation to try out the model on my own (mediocre) artwork - the piece below yields similarities to Martiros Saryan, whom I had not previously discovered but who does use vivid colours in a similar manner.
This work is ongoing and I'm exploring a few avenues:
- Grouping the label categories to align better with the clusters used in current art marketplaces.
- Testing other architectures and training approaches.
- Looking at activations for different artists and pieces to identify similar work.
- Based on visual inspection of a handful of test cases, it does appear that the style and artist detection networks (over)emphasise the use of colour. So I'm exploring options to use black-and-white images to reduce reliance on colour.
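As a rough sketch of the black-and-white idea in the last point, an RGB image can be reduced to luminance while keeping three identical channels, so that a network pre-trained on colour images still receives the input shape it expects. The weights below are the widely used Rec. 601 luma coefficients; their use here, and the nested-list image representation, are assumptions for illustration.

```python
def to_grayscale(pixels):
    """Convert rows of (r, g, b) tuples to gray, keeping three equal channels."""
    gray_rows = []
    for row in pixels:
        gray_row = []
        for r, g, b in row:
            # Weighted sum approximating perceived brightness (0-255).
            y = round(0.299 * r + 0.587 * g + 0.114 * b)
            gray_row.append((y, y, y))
        gray_rows.append(gray_row)
    return gray_rows

# A tiny 2x2 "image" (made up for illustration).
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
print(to_grayscale(image))
```

Comparing accuracy on colour and grayscale versions of the same validation images would give a direct measure of how much the networks lean on colour.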