Inside the World of Deepfake Technology

‘Deepfakes’ need little introduction today, and no single definition fully captures the term. Broadly, deepfakes are synthetic, computer-generated images, videos and audio that impersonate a person doing or saying something that never actually happened. The term is a portmanteau of ‘deep learning’ and ‘fake’.

The technology may have started with good intentions, as in David Beckham’s 2019 malaria campaign, in which virtual alteration let him appear to speak multiple languages. But with the breakthroughs in GenAI content creation, AI-powered deepfakes have become easier and cheaper to produce.

Today, the same kind of deepfake videos are used by bad actors to tarnish reputations, commit fraud, mislead, and manipulate identities for malicious gain. Affordable cloud computing and the capabilities of Large Language Models (LLMs) have contributed to the ease of production. Moreover, GenAI content blends easily into social media feeds, compounding the already precise demographic targeting. As a result, it gains torrential traction on social networking platforms, helped along by frivolous sharing.

How Deepfake Videos are Created

Deepfake creation was sparked by Generative Adversarial Networks (GANs), in which two neural networks compete against each other. One is the generator and the other is its adversary, usually called the discriminator. That is, one generates content while the other tries to tell it apart from real content, in effect pointing out the glitches the generator must fix. However, many tools today do not rely on GANs for deepfakes but use encoder and decoder layers or first-order motion models.
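
To make the adversarial set-up concrete, here is a minimal sketch of the two competing objectives. It assumes the second network (commonly called the discriminator) outputs a probability that a sample is real, and it uses the common non-saturating loss formulation; the function names and the sample scores are illustrative, not taken from any particular deepfake tool.

```python
import math

def discriminator_loss(scores_real, scores_fake):
    """Low when real samples score near 1 and generated samples score near 0."""
    real_term = -sum(math.log(s) for s in scores_real) / len(scores_real)
    fake_term = -sum(math.log(1.0 - s) for s in scores_fake) / len(scores_fake)
    return real_term + fake_term

def generator_loss(scores_fake):
    """Low when generated samples score near 1, i.e. the critic is fooled."""
    return -sum(math.log(s) for s in scores_fake) / len(scores_fake)

# Hypothetical critic outputs (probability that a sample is real).
early = {"real": [0.9, 0.8], "fake": [0.1, 0.2]}  # fakes are easy to spot
late = {"real": [0.5, 0.5], "fake": [0.5, 0.5]}   # fakes are indistinguishable

print(generator_loss(early["fake"]), generator_loss(late["fake"]))
print(discriminator_loss(early["real"], early["fake"]),
      discriminator_loss(late["real"], late["fake"]))
```

As the generated samples become harder to tell apart, the generator's loss falls while the discriminator's loss rises, which is the tug-of-war described above.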

To understand what these are and how this layered approach works, we need to look at deep learning and AI in media manipulation. Deep learning models consist of many layers, each a mathematical abstraction built on the one before it. The output of these intermediate layers is known as a latent representation; in deep neural networks it captures an ‘inferred’ reality rather than the directly observable one, shaped by the mathematical framework of the layers preceding it.
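
The idea of each layer building on the previous one can be sketched with a tiny hand-wired network. The weights below are made up for illustration (a real model learns them from data): the first layer summarises the left and right halves of a four-pixel input, and the second layer turns those summaries into a single latent value describing the contrast between them.

```python
def dense(weights, bias, inputs):
    """One fully connected layer: a weighted sum per output unit, plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]

def relu(values):
    """Simple non-linearity: negative values are clipped to zero."""
    return [max(0.0, v) for v in values]

def forward(layers, inputs):
    """Run the input through every layer, keeping each intermediate output."""
    activations = []
    current = inputs
    for weights, bias in layers:
        current = relu(dense(weights, bias, current))
        activations.append(current)
    return activations

# Illustrative, hand-picked weights (assumption: not from a trained model).
layers = [
    ([[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]], [0.0, 0.0]),  # half averages
    ([[-1.0, 1.0]], [0.0]),                                      # right minus left
]

pixels = [1.0, 2.0, 3.0, 4.0]
activations = forward(layers, pixels)
print(activations)  # [[1.5, 3.5], [2.0]]
```

The final value is not a pixel anyone observed; it is a quantity inferred by the stacked layers, which is what the term latent representation refers to.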

A deepfake generator works on an encoder-decoder paradigm.

Encoding: The process in which an original image is compressed into a latent image that captures the details of the given image.

Decoding: The process that translates the latent image back into an image, ‘re-sketching’ the given image from the latent image generated during encoding.
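
The encode and decode steps can be pictured with a toy, hand-written stand-in. In a real deepfake generator both mappings are neural networks learned from data; the block-averaging below is only an assumption chosen to show the compress-then-re-sketch shape of the process.

```python
def encode(pixels, block=2):
    """Compress: keep one average per block of pixels (the 'latent image').

    Assumes len(pixels) is a multiple of block.
    """
    return [sum(pixels[i:i + block]) / block
            for i in range(0, len(pixels), block)]

def decode(latent, block=2):
    """Re-sketch: expand each latent value back into a block of pixels."""
    return [value for value in latent for _ in range(block)]

row = [10, 12, 200, 202]   # one row of grayscale pixel values
latent = encode(row)       # [11.0, 201.0]
rebuilt = decode(latent)   # [11.0, 11.0, 201.0, 201.0]
print(latent, rebuilt)
```

The rebuilt row is close to the original but not identical: the latent image keeps the broad structure and drops fine detail, and the decoder has to fill that detail back in.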

On the other hand, first-order motion models allow a video to be edited frame by frame, using techniques familiar from AI filters. After being trained on hours of real-life footage, the neural network learns how to restructure the actual footage.
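
One ingredient of such motion models is transferring movement from a driving video onto a still source image by tracking keypoints. The sketch below shows only that step, heavily simplified: the keypoint coordinates are invented, the function name is hypothetical, and a real model detects keypoints and renders the final pixels with neural networks.

```python
def transfer_motion(source_keypoints, driving_frames):
    """Move the source keypoints by the displacement seen in the driving video.

    driving_frames[0] is treated as the reference pose; every frame
    contributes its offset from that reference.
    """
    reference = driving_frames[0]
    animated = []
    for frame in driving_frames:
        moved = [
            (sx + (dx - rx), sy + (dy - ry))
            for (sx, sy), (dx, dy), (rx, ry)
            in zip(source_keypoints, frame, reference)
        ]
        animated.append(moved)
    return animated

# Illustrative (x, y) keypoints, e.g. two mouth corners (made-up values).
source = [(10.0, 10.0), (20.0, 10.0)]
driving = [
    [(50.0, 50.0), (60.0, 50.0)],  # reference frame
    [(51.0, 50.0), (60.0, 52.0)],  # the face has moved slightly
]

result = transfer_motion(source, driving)
print(result[1])  # [(11.0, 10.0), (20.0, 12.0)]
```

The source keypoints follow the driving video's movement frame by frame while starting from their own positions, which is why the animated face keeps the source identity.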

But this isn’t all: new and improved processes are around the corner and are already tampering with the media fabric.

Ethical Concerns Surrounding Deepfakes

Deepfakes will become a community-wide threat if they keep getting easier to spread. They spare no one, be it actors, politicians, or people like Duwaraka Prabhakaran, daughter of the Tamil Tiger militant chief, who, despite having been dead for almost 15 years, appeared in a digitally fabricated speech in the UK addressed to Tamilians. It was only Mr. Chinnadurai’s keen fact-checking eye that revealed the video to be a deepfake.

The technology’s malicious uses become especially critical at a time when polls are in full swing in the country. Visual alterations, compounded by altered speech and language, are being skillfully crafted to spread misinformation.

Arrests are being made, but no proper regulation is in place to act against bad actors. A personal moral compass is all that keeps creators either at bay or in hot water. Putting up disclaimers is indeed important, but not everyone uses them. People blatantly circulate AI-generated content, adding fuel to the fire.

As quoted by the BBC, Mr. Chinnadurai, who runs a media watchdog in Tamil Nadu, said, “Information travels at the speed of 100km per hour. The debunked information we disseminate will go at 20km per hour.”

The government has sprung into action, asking tech organizations to inform users of the unreliability and fallibility of generated output before launching any GenAI model that is still under testing or not very reliable. In addition, it suggests incorporating a ‘consent pop-up’ or a similar mechanism to make users aware of the output’s unreliability.

That said, there is something we can do as individuals to avoid becoming part of this misinformation chain: run a few checks before a piece of content becomes yet another thing we forward.

Look for:

  • Inconsistencies in facial features
  • Oddity in the audio
  • Unnatural movements
  • Glitches in the video’s background or lighting
  • Skin discoloration
  • Watermarks, if any, that indicate the use of AI

Also, report suspicious content on the platform where you find it.

Finally, pause and reflect before you share, and pay close attention to the context to avoid being swept along. You can also head to our exploratory note on ‘Deepfakes and Policy Considerations’ to gain insights on how to deal with the subject better.

Sources:
https://www.bbc.com/news/world-asia-india-68918330

https://www.forbes.com/sites/lutzfinger/2022/09/08/overview-of-how-to-create-deepfakesits-scarily-simple/?sh=6619602bf164

https://www.forbes.com/sites/forbestechcouncil/2024/05/02/deepfakes-a-prime-vector-of-civic-abuse/?sh=782582301141

https://www.meity.gov.in/writereaddata/files/Advisory%2015March%202024.pdf