What is Deepfake technology? Here is everything you need to know.

Not every video on the internet is real, and the fake ones are multiplying. In 2017, researchers at the University of Washington released a paper describing how they were able to create a fake video of President Barack Obama; that was the first time Deepfake technology hit the news headlines. There is a fear in the industry that false conspiracy theories and propaganda can be promoted using such fake videos. While many are likely intended to be humorous, others could be harmful to individuals and society. We are also starting to see commercialization of this technology in apps like Face Swap, Snapchat, and FaceTime (Animoji). Let’s find out everything you need to know about Deepfake technology.


What is Deepfake technology?

Deepfakes are the next generation of video and audio manipulation. They are based on artificial intelligence, and they make a range of manipulations much easier. The thing Deepfakes are best known for is the face swap: you take the face of one person and transfer it onto another. There are also other forms of synthetic media manipulation, like the ability to manipulate someone’s lips and sync them up with a fake or real audio track. There is also the possibility of making someone’s body move, or appear to move, in a way that looks realistic but is in fact computer generated.

At its core, a Deepfake is the result of training a computer, through pattern and image recognition, to understand the structure of a human face. One byproduct of this technology is that you can accurately replicate one person’s face onto another person’s face.

Even so, a Deepfake is still just a mask. You still need the craft, the inflection, and the mannerisms, so a Deepfake will only work if you can get someone to act like the character you are trying to impersonate. Convincing Deepfake voices are still hard to produce, so you also need a good vocal impression to elevate the visuals of a Deepfake video.

For example, you can see Deepfake technology at work in the video below with Bill Hader. The video shows Bill Hader’s face being transformed into Tom Cruise’s face. The transition and fade between the faces is practically unnoticeable, and with Bill Hader’s facial movements and his resemblance to Tom Cruise, the Deepfake reaches a very realistic level. It is so convincing that, for a second, you will think Bill Hader really does look like Tom Cruise!

All of this is driven by advances in artificial intelligence, particularly generative adversarial networks, which pit two neural networks against each other: one produces forgeries while the other tries to detect them. As the detector gets better at spotting fakes, the forger gets better at producing them, and the forgeries improve through this competition. In short, Deepfakes are videos that have been altered using machine learning, a form of artificial intelligence, to show someone saying or doing something they did not say or do.
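To make the “two competing networks” idea concrete, here is a minimal, heavily simplified GAN training loop in PyTorch. It trains on random stand-in data purely to show the adversarial back-and-forth; the layer sizes and data are illustrative placeholders, not the architecture any real Deepfake tool ships with.

# A toy generative adversarial network: the generator learns to produce
# forgeries, the discriminator learns to spot them, and each improves
# by competing against the other.
import torch
import torch.nn as nn

IMG_DIM, NOISE_DIM = 64, 16

generator = nn.Sequential(
    nn.Linear(NOISE_DIM, 128), nn.ReLU(),
    nn.Linear(128, IMG_DIM), nn.Tanh(),      # outputs a fake "image"
)
discriminator = nn.Sequential(
    nn.Linear(IMG_DIM, 128), nn.LeakyReLU(0.2),
    nn.Linear(128, 1), nn.Sigmoid(),         # probability the input is real
)

loss_fn = nn.BCELoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

for step in range(1000):
    real = torch.randn(32, IMG_DIM)          # stand-in for real training images
    noise = torch.randn(32, NOISE_DIM)
    fake = generator(noise)

    # Discriminator step: label real samples 1 and forgeries 0.
    d_loss = loss_fn(discriminator(real), torch.ones(32, 1)) + \
             loss_fn(discriminator(fake.detach()), torch.zeros(32, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator step: try to make the discriminator call forgeries real.
    g_loss = loss_fn(discriminator(fake), torch.ones(32, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()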


How is it being used in the real world? 

For the moment, Deepfake technology is primarily used to create non-consensual sexual images. Probably up to 95% of Deepfakes are images of celebrities, or non-consensual images of ordinary people, shared on porn sites or over instant messaging. We have also started to see Deepfake technology being used in other contexts, targeting journalists and civic activists by showing them in sexual situations. Another trend is people using the “it’s a deepfake” excuse to get out of a situation: in the small number of political cases where a deepfake was potentially involved, you see people deploying the phrase much like “fake news”.


Can anyone use Deepfake technology? How can you do it?

Deepfake technology is getting more accessible because it is being commercialized and commoditized. The technology is still not at a point where anyone can produce a really convincing face-swap Deepfake, but there is code available online and there are websites that will let you create a Deepfake.

For instance, there is a program called DeepFaceLab, which is free on GitHub. If you want to create a Deepfake yourself, you will need a decent graphics card and a lot of video RAM to produce Deepfake videos in high resolution. The first thing you need is source video of the face you are trying to use. The more structure and variation in the source material, the more the program can learn, which is why it is crucial to have a very large collection of reference material (different poses, profiles, expressions, and perspectives) for the program to draw on when creating a Deepfake video. There are many examples where the Deepfaked person doesn’t blink in the video, which can happen when the program has no reference frames with the eyes closed.
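To give a feel for what gathering reference material looks like in practice, here is a rough sketch of a face-extraction step in Python using OpenCV’s bundled Haar cascade detector. The file names (“source.mp4”, “faces/”) are placeholders, and DeepFaceLab’s own extractor is far more capable; this only illustrates the idea of pulling face crops out of a source video, frame by frame.

# Walk through a source video, detect the face in each frame,
# and save resized crops that could later be used for training.
import os
import cv2

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

os.makedirs("faces", exist_ok=True)
video = cv2.VideoCapture("source.mp4")      # placeholder path
frame_idx = 0
while True:
    ok, frame = video.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in detector.detectMultiScale(gray, 1.1, 5):
        crop = cv2.resize(frame[y:y + h, x:x + w], (256, 256))
        cv2.imwrite(f"faces/{frame_idx:06d}.jpg", crop)
    frame_idx += 1
video.release()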

Once you have your reference clips and the video clip you are trying to alter, the program can start the Deepfake process. It runs facial recognition to determine the placement of the eyes, nose, and mouth, then isolates and aligns the face. Once the faces are extracted, you need to train the program on how they work: it creates ground-truth and synthesized versions of your reference and target faces. With all of this data, the program tries to recreate the face from your reference material on the structure of the face you are trying to alter.
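Under the hood, many face-swap tools train a shared encoder with one decoder per identity; swapping then means feeding one person’s face through the other person’s decoder. The sketch below shows that idea in heavily simplified PyTorch, with random tensors standing in for the aligned face crops; the layer sizes, loss, and data loading are illustrative, not DeepFaceLab’s real pipeline.

# One shared encoder learns a common face representation; a separate
# decoder per identity learns to reconstruct that person's face from it.
import torch
import torch.nn as nn

def conv_encoder():
    return nn.Sequential(
        nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),   # 256 -> 128
        nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # 128 -> 64
    )

def conv_decoder():
    return nn.Sequential(
        nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
        nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
    )

encoder = conv_encoder()                  # shared between both identities
decoder_a, decoder_b = conv_decoder(), conv_decoder()
opt = torch.optim.Adam(
    list(encoder.parameters()) + list(decoder_a.parameters()) +
    list(decoder_b.parameters()), lr=1e-4)
loss_fn = nn.L1Loss()

for step in range(1000):
    faces_a = torch.rand(8, 3, 256, 256)  # stand-in for aligned crops of person A
    faces_b = torch.rand(8, 3, 256, 256)  # stand-in for aligned crops of person B

    # Each decoder learns to reconstruct its own person from the shared code.
    loss = loss_fn(decoder_a(encoder(faces_a)), faces_a) + \
           loss_fn(decoder_b(encoder(faces_b)), faces_b)
    opt.zero_grad(); loss.backward(); opt.step()

# After training, the swap: run person B's face through person A's decoder.
with torch.no_grad():
    swapped = decoder_a(encoder(torch.rand(1, 3, 256, 256)))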

Some of these Deepfakes will be imperfect, but we know some of them will cause harm. There is also a vulnerability for political campaigns, as there might be last-minute surprises involving compromising video.


Why do people worry about Deepfake technology?

The reason people worry about advances in Deepfake technology is that we have seen quite significant progress in the last six to twelve months. The amount of training data needed has declined, down to a few images for some facial expression modifications. People are also trying to combine video manipulations, like the lips, with simulated audio. What has become clear is that Deepfakes and synthetic audio generation are getting better and better while requiring less training data (fewer examples needed to generate the Deepfake). All of this means we are going to see more and more Deepfake content, and it is going to get better over time. Detection methods are being developed to protect high-profile people from Deepfake technology.

Image caption: actors pictured (top) with an example deepfake (bottom), which can be a subtle or drastic change depending on the other actor used to create it.

Earlier this month, Facebook announced it had set up a $10m (£8.1m) fund to find better ways to detect deepfakes. Google has also released a database of 3,000 deepfakes, created using publicly available deepfake generation methods and by working with paid, consenting actors to record hundreds of videos. There is definitely potential for misuse of synthetic media, and the tech giants are hoping to mitigate the harm by helping the research community develop synthetic video detection methods. Only time will tell whether Deepfake technology will bring more humor or harm to our society.