Tuesday, June 18, 2024

Google AI Introduced Frame Interpolation for Large Motion (FILM)

Google AI has been working on frame interpolation and has introduced a new neural network, Frame Interpolation for Large Motion (FILM). Frame interpolation is the process of synthesizing in-between images from a given set of existing images. The technique is frequently used for temporal up-sampling, either to increase a video's frame rate or to produce slow-motion effects.

Google published “FILM: Frame Interpolation for Large Motion” at ECCV 2022, presenting a new technique to generate high-quality slow-motion videos from near-duplicate photos. FILM handles both large and small motions and achieves state-of-the-art results.

At inference time, the model is invoked recursively to output additional in-between images.
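The recursive scheme can be sketched in a few lines of Python. This is a minimal illustration, not Google's code: the function names are hypothetical, frames are toy lists of numbers, and `average_midpoint` is a plain per-pixel average standing in for the actual FILM network.

```python
from typing import Callable, List

Frame = List[float]  # a toy "image": a flat list of pixel intensities


def average_midpoint(a: Frame, b: Frame) -> Frame:
    """Stand-in for the FILM network: a per-pixel average (NOT the real model)."""
    return [(x + y) / 2.0 for x, y in zip(a, b)]


def interpolate_recursively(
    first: Frame,
    last: Frame,
    depth: int,
    midpoint: Callable[[Frame, Frame], Frame] = average_midpoint,
) -> List[Frame]:
    """Return the frames strictly between `first` and `last`.

    Each recursion level halves the time step, so `depth` levels
    yield 2**depth - 1 in-between frames.
    """
    if depth == 0:
        return []
    mid = midpoint(first, last)
    left = interpolate_recursively(first, mid, depth - 1, midpoint)
    right = interpolate_recursively(mid, last, depth - 1, midpoint)
    return left + [mid] + right


frames = interpolate_recursively([0.0, 0.0], [8.0, 16.0], depth=3)
print(len(frames))  # 7 in-between frames
```

Swapping `average_midpoint` for a call to a real interpolation model would leave the recursion unchanged; only the midpoint function differs.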

The FILM model generates a middle image from two input images. There are three parts to the FILM model: 

  1. A feature extractor summarizes each input image with deep multi-scale (pyramid) features.
  2. A bi-directional motion estimator calculates pixel-wise motion (i.e., flows) at each pyramid level.
  3. A fusion module generates the final interpolated image.


Typically, multi-resolution feature pyramids and hierarchical motion estimation are used to accommodate large motion. Small, fast-moving objects challenge this approach because they tend to vanish at the deepest (coarsest) pyramid levels.
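A toy example shows why small objects fade as the resolution drops. The sketch below assumes the pyramid is built by 2x average pooling, which is an illustrative choice rather than a confirmed detail of FILM, and uses a one-dimensional row of pixels instead of an image. The helper names are hypothetical.

```python
from typing import List


def downsample_by_two(signal: List[float]) -> List[float]:
    """Halve the resolution by averaging neighbouring pixel pairs."""
    return [(signal[i] + signal[i + 1]) / 2.0 for i in range(0, len(signal) - 1, 2)]


def build_pyramid(signal: List[float], levels: int) -> List[List[float]]:
    """Level 0 is full resolution; each further level is half the size."""
    pyramid = [signal]
    for _ in range(levels - 1):
        pyramid.append(downsample_by_two(pyramid[-1]))
    return pyramid


# A 16-pixel row containing one bright, single-pixel object.
row = [0.0] * 16
row[5] = 1.0

pyramid = build_pyramid(row, levels=4)
peaks = [max(level) for level in pyramid]
print(peaks)  # [1.0, 0.5, 0.25, 0.125]
```

The object's peak intensity halves at every level, so by the coarsest level it is hard to distinguish from the background.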

FILM's components address this problem by sharing weights across pyramid levels, which allows a single motion estimator to be reused at every level and yields a network with fewer weights. With shared weights, small motions at deeper levels can be interpreted in the same way as large motions at shallow levels, which increases the number of pixels available for large-motion supervision.
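The equivalence between large and small motions across levels is simple arithmetic. The sketch below assumes each pyramid level halves the resolution (an assumption made for illustration), so a displacement measured in pixels is halved at every deeper level. The function name is hypothetical.

```python
def displacement_at_level(pixels_at_full_res: float, level: int) -> float:
    """Apparent displacement, in pixels, after `level` halvings of resolution."""
    return pixels_at_full_res / (2 ** level)


# A large 32-pixel motion and a small 4-pixel motion, seen at levels 0..3.
large = [displacement_at_level(32.0, k) for k in range(4)]
small = [displacement_at_level(4.0, k) for k in range(4)]

print(large)  # [32.0, 16.0, 8.0, 4.0]
print(small)  # [4.0, 2.0, 1.0, 0.5]
```

The 32-pixel motion at level 0 appears as a 4-pixel motion at level 3, the same magnitude as the small motion at full resolution. An estimator whose weights are shared across levels therefore sees both cases as the same problem.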

Following feature extraction, FILM uses pyramid-based residual flow estimation to determine the flows from the middle image (which has not yet been predicted) to the two inputs. After estimating the bi-directional flows, the model warps the two feature pyramids into alignment. Stacking the two aligned feature maps, the bi-directional flows, and the input images at each pyramid level creates a concatenated feature pyramid.
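The align-and-stack step can be illustrated on a single pyramid level with one-dimensional features. This is a simplified sketch, not FILM's implementation: the flows are hard-coded rather than estimated, `backward_warp` is a hypothetical helper that uses linear interpolation with border clamping, and the input images are left out of the stack for brevity.

```python
import math
from typing import List


def backward_warp(feature: List[float], flow: List[float]) -> List[float]:
    """Sample `feature` at position i + flow[i] for every output pixel i,
    using linear interpolation and clamping at the borders."""
    last = len(feature) - 1
    warped = []
    for i, displacement in enumerate(flow):
        pos = min(max(i + displacement, 0.0), float(last))
        lo = int(math.floor(pos))
        hi = min(lo + 1, last)
        frac = pos - lo
        warped.append((1.0 - frac) * feature[lo] + frac * feature[hi])
    return warped


# Toy 1D features: an "object" at index 1 in input A and index 5 in input B.
features_a = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
features_b = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0]

# Assumed flows from the unseen middle frame to each input (constant here).
flow_mid_to_a = [-2.0] * 7
flow_mid_to_b = [2.0] * 7

aligned_a = backward_warp(features_a, flow_mid_to_a)
aligned_b = backward_warp(features_b, flow_mid_to_b)

# Per-pixel stack of aligned features and both flows for this level.
stacked = [
    list(channels)
    for channels in zip(aligned_a, aligned_b, flow_mid_to_a, flow_mid_to_b)
]
print(stacked[3])  # [1.0, 1.0, -2.0, 2.0]
```

After warping, both feature maps place the object at index 3, the midpoint of its two observed positions, which is what lets a fusion stage combine them into one in-between frame.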

Disha Chopra

