During Apple’s recent announcement of new products and software versions, an apparently tiny detail caught my attention. It was about the photo-processing software: Craig Federighi, the head of software at Apple, presented a seemingly ordinary photo of a kid sitting on a bench on the roof of a building. He commented, “Notice that the boy is not at the center of the picture, but of course, you want to have him at the center!”, adding “Now you can move him to the center of the picture!” while dragging the boy’s image to the intended place.
“Wait a minute, are you altering the picture?” I asked. And things didn’t stop there: with the photo software included in the promised next iteration of iOS, using Generative AI, the photo editing dad extended the picture to present a wider area. The photo looked better and more compelling. It was a better image for the photo album.
Or was it?
The resulting photo, lovely as it may be, is not an image of what happened on the rooftop the day it was taken. It may be similar, but it’s not what was there in real life.
“In real life” is the key phrase here. Is it essential to collect images of what happened IRL?
At least that was what people of the older generation did when they gathered photos in those beautifully decorated photo albums that we don’t use anymore (digital photography is to blame). The “souvenir” quality of photo albums was meant to let us relive our memories, in moments of shared joy and longing, while we browsed the old photos and identified people we knew.
But now that photos can be manipulated in ways that those old photos couldn’t, are they still memories or souvenirs?
“Testimonial” and “artistic” photos
I’m a big photography buff, and for decades I’ve taken photos of my family and friends and pictures of people I don’t know just because they looked interesting. By the way, there is a kind of “etiquette” for respectfully taking photographs of strangers.
Somebody asked me if I was interested in “artistic” photography. “Artistic” sounds to me like a big word; I wouldn’t pretend to be an “artist.” But I noticed that I took photos for one of two possible reasons:
Sometimes I wanted to have a picture of people I know (mainly family and friends), an event, a special place, etc. I call them “testimonial pictures” because the identity of the people in them is crucial. If you remove the identity of the people in the picture (for instance, by showing it to a stranger), all value is gone.
Other times I took pictures of a compelling image, such as a sunset, an expressive face, etc. I call them “expressive” or “artistic” pictures. The value is in the beauty of the sunset or the emotion the picture evokes, even in those who don’t know who’s in the photo.
It’s not a dichotomy: many photographs of family or friends could be expressive in some way, not just a testimony that “we were there.”
Selfies, currently the most common type of photo, are testimonial pictures: we want to make the point that we were there together at that moment. Without the identity of the people in it, a selfie has no value.
My vacation pictures usually take the middle ground, sometimes leaning toward the testimonial side and other times toward the expressive side.
Adjusting an expressive or artistic photo is not a big deal because the picture was not testimonial in the first place. It feels OK, as what we want to maximize is the beauty or the emotional expression.
What about AI images?
In principle, fully AI-created images shouldn’t pose much of a problem because they are invented; they are not “testimonials” in any way. You won’t put synthetic images in a family photo album, even a digital one. At least, this is how we see fake images now. But what if people start wanting fake pictures of their loved ones?
I don’t know about you, but in my experience as a photography buff, I have the impression that my best photos are those I couldn’t take. The fleeting moment for the perfect image just slipped away while I was busy doing something else: I thought, “Wow, this would have been an amazing photo.” And then it was gone. But what if I could, in a way, reproduce what I saw but failed to capture as a picture? Soon this could be done with Generative AI. I could be interested.
The fusion between photographic software and Generative AI software is being done in one of the following two ways:
Photo editing software adds Generative AI capabilities, such as a generative fill that extends a photo with invented content. Photoshop is doing this; Apple apparently will also do it in the next iteration of its photo software.
Generative AI software adds capabilities to import a picture and do something with text commands. This is already done in several products like DALL-E 2 and others. The new version of Midjourney adds a “zoom-out” feature to extend the canvas of an image beyond its original boundaries.
Tampering with the evidence
Photo editing has changed a lot since the arrival of Generative AI, and this trend will continue for the foreseeable future.
A few years ago, we had a very clear idea of “photo editing,” to the point that “Photoshop” became a verb. We got used to the fact that many celebrities’ pictures were “adjusted” with Photoshop: bigger eyes and lips, smaller waists, and so on. Yes, that’s “tampering with the evidence,” but it’s accepted for celebs.
But the rest of us didn’t get access to a Photoshop expert to fix our family pictures –and perhaps we didn’t even want it. We wanted to have Aunt Maggie just how she looks, belly and all.
Even if we have Photoshop, using it effectively is more complex than we’d wish: mastering Photoshop is almost a profession. I bought the “Affinity Photo” editor, which is very feature-rich and much cheaper than Photoshop, but then struggled to learn all the details to use it efficiently.
But now that Generative AI is being incorporated into photo editors, you don’t need to get involved in messy details, such as masking a person in a photo to isolate them from the background. You can make all-encompassing changes such as “I want this picture in winter!” and voilà, the whole scene is changed.
Imagine, for instance, that someone significant to the family wasn’t there when a family photo was taken, and you want to include that person for a “what if” version. As many know, this could be done with Photoshop, but Generative AI is superior, at least for the non-specialist, because:
Photoshop requires learning many skills. Even selecting a person in a photo can be tricky.
In Photoshop, adjusting other people’s positions to make space for the added person is almost impossible. With Generative AI, though, even several people’s placement can be adjusted by asking the system to do so.
We have to consider that Generative AI is in its infancy. Later on, we’ll have, for instance, the option to aesthetically “fix” every single photo we take, as has already been suggested.
Lying cameras
But hey, it’s not as if photos were truthful before and only later became a lie. Some modifications are made even before we start editing a picture.
For instance, some cameras use “bracketing,” which essentially takes several pictures in a burst and combines them. If you want to take a group photo, one common situation is that somebody closes their eyes, which could potentially ruin the image. With bracketing, this is no longer a problem (unless somebody deliberately keeps their eyes closed) because a person’s open eyes can be taken from another image in the bracket.
Another instance, which became a bit of a scandal, is that some Samsung smartphones were found to enhance night photos of the moon with detail the camera had not actually captured, producing a nicer but partly synthetic moon. The behavior was not widely known until a user demonstrated it, and Samsung later acknowledged that its camera applies AI-based enhancement to moon shots.
It’s not only photos
Real-life alteration or forgery is not limited to photos: in recorded music, we can make the same distinction between testimonial recordings and artistic ones. As I wrote in my Newsletter:
There was an old cassette demo recording made by John Lennon and given later to Paul McCartney by Yoko Ono as a souvenir because on the tape there was a label “For Paul.” But the tape had only souvenir value because of the awful sound quality of the recording, to the point that conventional “sound cleaning” made by electronic filters was not enough to salvage the material. But apparently, with AI, it will be entirely possible to produce a pristine recording of the songs contained in the demo cassette because, with AI, it is possible to isolate each voice and also each instrument from the surrounding sound.
At first, it seems like a simple musical project allowing us to continue enjoying the Beatles’ songs. But if we think about it, there was initially “testimonial value” in the old tape because it was recorded by the one and only John Lennon. The AI-generated version, instead, could sound a lot better but was not recorded by Lennon: it was produced by an AI algorithm designed to sound plausible. If we take a “forensic” approach and compare the sound waves at the micro level, we’d see that the original and generated versions are unrelated.
In the end, the question that matters is: even if the AI version was generated and not recorded by Lennon, do we care or not? Once again, the answer to this depends on whether we give the recording a testimonial or an artistic value: in the first case, the AI-generated version would have no value whatsoever.
Final thoughts
I believe a radical testimonial position will be harder to hold each day. Take the example of the John Lennon recording, and assume that what matters to us is that it was recorded IRL with the actual voice of the real John Lennon. But don’t forget that what is put into a recording is not the voice itself, but an analog or digital representation of electronic signals detected by a recording device so that it can be played again on a suitable player.
Old real photos (such as the one in the illustration at the top) could be more testimonial than modern images fixed with AI. Still, I discovered that professional photographers in the early 20th century had manual procedures for removing blemishes and more. AI mostly democratizes the capability for “adjusting” our photos to our wishes.