Artificial intelligence. Neural networks. Machine learning. Have you heard terms like these flying around the science fiction sections of the film/TV world? Have you ever wondered just how accurately these films portray real science? Well, my friends, today is your lucky day: this column, Fantasy Science & Coffee, aims to bridge the gap between science and science fiction in films and popular culture. My hope is to explain things in a fun way – like we’re chatting over coffee.
You may be thinking: who is this person, why does she think she can explain science, and why the heck would I want to have coffee with her? Well, I’m Radha, a researcher in India, who recently submitted a PhD thesis in theoretical quantum physics. I quite like hot beverages. I’ll also pay.
In this twenty-third part of the series, published on the second and fourth Tuesdays of every month, we are going to chat about seeing people’s thoughts, Black Mirror style.
Mental Image Reconstruction in Black Mirror
Imagine if you could tap into someone’s memories of an event. If you could tap into someone’s mind and see what they are visualizing. This is exactly the theme explored in Black Mirror’s third episode of season four, Crocodile. If you are unfamiliar with this show, it’s a thought-provoking anthology series based on near-future technology that hits close to home, often in uncomfortable ways.
In Crocodile, an insurance agent investigates an accident for potential claims by looking at the memories of eyewitnesses. A device placed on a witness’s temple relays their memories to a monitor for the agent to view.
Note that this device does not tap into any sort of ocular memory (unlike the tech in season 1, episode 3: The Entire History of You), that is, it doesn’t have anything to do with a recording eyepiece. The images seen on the monitor are lifted directly from brain activity, and therefore may be tainted with incorrect or biased recollections — it’s a literal depiction of a person’s memory.
This sounds like the stuff of good science fiction, right? Turns out, we may be closer to this future than you think.
Mental Image Reconstruction in Real Life
Scientists are recreating images from measured brain activity, which means artificial intelligence is one step closer to showing us what a person imagines. Yes, you read that correctly.
This work by scientists from Kamitani Lab at Kyoto University was first released in 2017 as a pre-print, and was published in January 2019 in PLOS Computational Biology as a paper: Deep image reconstruction from human brain activity. In the paper, they describe the use of deep neural networks to reproduce images that a subject looks at or recalls from memory.
Here’s a glimpse of one set of their results:
The image to the left is the original image, and the ones to the right are reconstructions made with the help of a deep neural network, by studying the subject’s brain activity. The images may not seem super accurate at first glance, but remember that these images were reconstructed from brain activity — the original image was unknown to the artificial neural network. It’s quite remarkable!
Not too long before, a group at the Gallant Lab at UC Berkeley published their work on something I reckon you would find particularly fascinating: they reconstructed movie clips from brain activity.
From the FAQ about this work:
The goal of movie reconstruction is to use the evoked activity to recreate the movie you observed. To do this, we create encoding models that describe how movies are transformed into brain activity, and then we use those models to decode brain activity and reconstruct the stimulus.
This work was apparently the first demonstration of reconstructing dynamic visual experiences, which is essentially what the technology depicted in Crocodile does.
The researchers at Gallant Lab first created predictive computational models trained on both thousands of hours of video footage and the brain activity the footage evoked in subjects. The brain activity was captured with functional magnetic resonance imaging (fMRI), which images the brain by measuring changes in blood flow. When a subject viewed a new film clip, the predictive model matched the subject’s brain activity to the activity patterns corresponding to clips in its library, and the video output was an average over the top few matches.
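To make the matching idea a little more concrete, here is a minimal sketch in Python. It is not the Gallant Lab’s actual code: the clip features, the brain activity, the tiny 8×8 frames and every name in it (such as encoding_weights and reconstruct) are made up for illustration, and a plain least-squares fit stands in for their far more sophisticated encoding models.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_voxels, n_train, n_library = 10, 200, 300, 50

# Hypothetical ground truth: how clip features drive each voxel's response.
true_weights = rng.normal(size=(n_features, n_voxels))

# Step 1: fit an encoding model on training clips and the activity they evoked.
train_features = rng.normal(size=(n_train, n_features))
train_activity = train_features @ true_weights + 0.1 * rng.normal(size=(n_train, n_voxels))
encoding_weights, *_ = np.linalg.lstsq(train_features, train_activity, rcond=None)

# Step 2: predict the activity each library clip would evoke.
library_features = rng.normal(size=(n_library, n_features))
library_frames = rng.random((n_library, 8, 8))  # one tiny made-up frame per clip
predicted_activity = library_features @ encoding_weights


def reconstruct(observed, predicted, frames, top_k=3):
    """Average the frames of the clips whose predicted activity best matches."""
    # Score every library clip by correlating its predicted pattern with the observed one.
    obs = observed - observed.mean()
    pred = predicted - predicted.mean(axis=1, keepdims=True)
    scores = (pred @ obs) / (np.linalg.norm(pred, axis=1) * np.linalg.norm(obs))
    best = np.argsort(scores)[::-1][:top_k]  # indices of the top matches, best first
    return frames[best].mean(axis=0), best


# Step 3: the subject watches something very close to library clip 7.
observed = library_features[7] @ true_weights + 0.1 * rng.normal(size=n_voxels)
frame, best = reconstruct(observed, predicted_activity, library_frames)
print(best[0], frame.shape)
```

Averaging over the top few matches, rather than picking only the single best one, is what gives this kind of reconstruction its blurry, dreamlike look.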
The predictive models used in studies like these are artificial intelligence models, that is, models that mimic human cognition. One such model is the artificial neural network.
A Brief Overview: Artificial Neural Networks
Let’s take a quick moment to chat about artificial neural networks (ANN’s). Without getting into too much detail, an ANN is an artificial intelligence framework that mimics a real biological neural network, and can be trained to perform certain tasks like classification and pattern recognition. Take, for example, this image of three objects:
You likely immediately recognized that this is an image of three chairs. Each of these chairs is structured uniquely, yet you knew without a doubt that these are chairs. This simple classification task is a no-brainer for us, but for machines it is tricky. How can a machine recognize a chair, when chairs come in all shapes and sizes? There’s no one way to define a chair, so simply providing a machine with a definition is no good.
That’s where the elegance of artificial intelligence comes into the picture, particularly ANN’s, which are widely used for machine learning. The structure of an ANN is a simplified representation of the networks of neurons in our brain. Just like our own biological neurons send signals to one another through connections, the artificial neurons pass information to one another through artificial connections. An ANN consists of layers of interconnected nodes, which act as these artificial neurons. Each node performs a certain mathematical function on its input, and passes its output to the next layer of nodes. The final output layer gives us the answer we seek.
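If you would like to see that layer-by-layer flow in code, here is a small sketch in Python. The layer sizes, the random connection numbers and the choice of a sigmoid as each node’s function are all assumptions made for illustration; real networks come in many shapes.

```python
import numpy as np


def sigmoid(z):
    """A common node function: squashes any number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def forward(x, layers):
    """Pass the input through each layer in turn and return the final output."""
    activation = x
    for weights, biases in layers:
        # Each node sums its weighted inputs, adds a bias, and applies its function.
        activation = sigmoid(weights @ activation + biases)
    return activation


rng = np.random.default_rng(1)
layer_sizes = [4, 3, 1]  # 4 input nodes, 3 hidden nodes, 1 output node (arbitrary)
layers = [
    (rng.normal(size=(n_out, n_in)), np.zeros(n_out))
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]

x = np.array([0.2, 0.9, 0.1, 0.5])  # a made-up input
output = forward(x, layers)
print(output)
```

Because the connection numbers here are random, the output is meaningless until the network has been trained.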
Artificial Intelligence, Machine Learning, and Training an ANN
ANN’s don’t magically know the answer to a question; they need to be trained on accurate data. You can’t simply create an ANN for the purpose of chair classification, and say “Here’s an image Ms. ANN, is this a chair?” You need to make it understand what a chair is first.
This is the process of machine learning. The terms “machine learning” and “artificial intelligence” are often used interchangeably, but this isn’t entirely accurate. A quote from Data Science Central sums up the distinction nicely:
Artificial intelligence is a broader concept than machine learning, which addresses the use of computers to mimic the cognitive functions of humans. When machines carry out tasks based on algorithms in an “intelligent” manner, that is AI. Machine learning is a subset of AI and focuses on the ability of machines to receive a set of data and learn for themselves, changing algorithms as they learn more about the information they are processing.
Thus, simply using an algorithm to calculate a solution isn’t artificial intelligence. The algorithm needs to mimic how we, as humans, process data. And machine learning occurs when the algorithm adapts according to new data, which is what happens when we train ANN’s.
Let’s pretend we are creating a simple ANN with a single output that indicates whether or not an object is a chair. The output will be a simple ‘yes’ or ‘no’, perhaps represented by 1 and 0, respectively.
In order to train the network, we take a picture of something we know is a chair, break it up into bits in a manner we won’t get into here, and feed those bits into the first, input layer of the ANN. The nodes perform functions on the data, pass their outputs to the next layer and so on, following the directions of the arrows. The final output layer gives us a 0, say, which means the network didn’t recognize the input as a chair. Since we know the object is a chair, we tell the network that the output layer should have given us a 1, and it adjusts its own parameters with this information. Note that had it given us a 1 at first, it would have been a fluke, because it had not been trained yet.
We do this hundreds of times for different varieties of chairs, so that the network will eventually be able to handle all sorts of different chair images. This is the training process: each time a new image is introduced and the correct answer is provided, the network adjusts its parameters. These parameters are mainly the numbers associated with each connection, along with the node function parameters. Design choices such as the number of nodes in the hidden layers (the layers between the input and the output) are usually fixed before training begins.
A deep artificial neural network is one with multiple hidden layers, and corresponds to a concept called “deep learning”. The structure of deep models mimics how we can process information in a hierarchical manner, that is, at multiple levels of abstraction. They learn by being fed tons of data that correspond to different levels of abstraction. To put this into perspective, using deep learning, Google Brain was shown ten million extremely varied images taken from YouTube videos before it began to recognize cats.
Coming back to our own chair classifying ANN, once the training process is over, an unknown object that is fed into the network should, ideally, be classified correctly. To be a chair or to not be a chair, that is the question.
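Here is a rough sketch of that adjust-as-you-go loop in Python. To keep it short, it trains a single artificial neuron rather than a full network, and the two “features” per object and the rule deciding what counts as a chair are invented purely for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


rng = np.random.default_rng(42)

# Made-up data: 200 objects with two features each; 1 means chair, 0 means not a chair.
features = rng.random((200, 2))
labels = (features.sum(axis=1) > 1.0).astype(float)

weights = np.zeros(2)  # one number per connection, adjusted during training
bias = 0.0
learning_rate = 0.5    # how strongly each correction nudges the parameters


def mean_loss(w, b):
    """Average cross-entropy between the neuron's outputs and the right answers."""
    p = np.clip(sigmoid(features @ w + b), 1e-12, 1 - 1e-12)
    return float(-np.mean(labels * np.log(p) + (1 - labels) * np.log(1 - p)))


initial_loss = mean_loss(weights, bias)

for epoch in range(50):
    for x, target in zip(features, labels):
        prediction = sigmoid(weights @ x + bias)
        error = prediction - target  # how far off the answer was
        # Nudge the parameters in the direction that shrinks the error.
        weights -= learning_rate * error * x
        bias -= learning_rate * error

final_loss = mean_loss(weights, bias)
predicted_chair = sigmoid(features @ weights + bias) > 0.5
accuracy = float(np.mean(predicted_chair == (labels == 1.0)))
print(initial_loss, final_loss, accuracy)
```

Note that only the connection numbers and the bias change during the loop; the structure of the model is decided before training starts.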
Mental Image Reconstruction and ANN’s
In the context of mental image reconstruction, the scientists at Kamitani Lab used deep neural networks because of their hierarchical nature with multiple layers. From an interview with CNBC’s MakeIt:
We have been studying methods to reconstruct or recreate an image a person is seeing just by looking at the person’s brain activity. Our previous method was to assume that an image consists of pixels or simple shapes. But it’s known that our brain processes visual information hierarchically extracting different levels of features or components of different complexities. These neural networks or AI models can be used as a proxy for the hierarchical structure of the human brain.
This research is remarkable. As the technology advances and we get closer to peeking into each other’s heads, I believe it could have Black Mirror potential, which means it won’t be long before insurance agents come a-knocking for eyewitness memories.
If we had this Black Mirror-esque technology today, what would you use it for?
More to Explore
Articles
CNBC: Japanese scientists just used AI to read minds and it’s amazing (2018)
Data Science Central: Artificial Intelligence vs. Machine Learning vs. Deep Learning (2017)
Wired: Google’s Artificial Brain Learns to Find Cat Videos (2012)
Papers
PLOS Computational Biology: Deep image reconstruction from human brain activity (2019)
Frontiers in Neuroscience: Decoding the Semantic Content of Natural Movies from Human Brain Activity (2016)
Current Biology: Reconstructing Visual Experiences from Brain Activity Evoked by Natural Movies (2011)
Resources
Kamitani Lab’s Deep Image Reconstruction code on GitHub
Py Image Search: A simple neural network with Python and Keras
Books
Neural Networks: A Classroom Approach by Satish Kumar — The best book out there for ANN’s!
Neural Network Textbook by Michael Nielsen