On Monday, November 14, Intel unveiled new software that is reportedly capable of instantly recognizing deepfake videos. The company claims that its “FakeCatcher” real-time deepfake detector is the first of its kind in the world, with a 96% accuracy rate and a millisecond response time.
Ilke Demir, an Intel researcher, and Umur Ciftci from the State University of New York at Binghamton created FakeCatcher, which utilizes Intel hardware and software, runs on a server, and communicates via a web-based platform. The software utilizes specialized tools, such as the OpenVINO open-source toolkit for deep learning model optimization and OpenCV for processing real-time photos and videos, to run AI models for face and landmark detection. The developer teams also provided a comprehensive software stack for Intel’s Xeon Scalable CPUs using the Open Visual Cloud platform. The FakeCatcher software can run up to 72 different scanning streams simultaneously on 3rd Gen Xeon Scalable processors.
According to Intel, while the majority of deep learning-based detectors look for signs of inauthenticity in raw data, FakeCatcher adopts a different strategy: it searches for genuine biological cues in real videos, since those cues are neither spatially nor temporally preserved in fake content. Based on photoplethysmography (PPG), it examines the minute blood-flow signal in video pixels. According to Intel, the color of our veins changes as our hearts pump blood. These blood-flow signals are gathered from various parts of the face, and algorithms turn them into spatiotemporal maps. Then, using deep learning, FakeCatcher can instantaneously determine whether a video is real or fake.
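Intel has not published FakeCatcher’s implementation, but the PPG idea can be sketched simply: average a color channel over a facial region frame by frame, then look for a periodic component in the heart-rate band. The following toy illustration uses synthetic frames with a simulated 72-beats-per-minute pulse; the region-of-interest coordinates and signal strength are made up for the example.

```python
import numpy as np

def ppg_signal(frames, roi):
    """Mean green-channel intensity of a face region, one value per frame.

    frames: array of shape (T, H, W, 3); roi: (y0, y1, x0, x1).
    """
    y0, y1, x0, x1 = roi
    return frames[:, y0:y1, x0:x1, 1].mean(axis=(1, 2))

def dominant_frequency_hz(signal, fps):
    """Strongest periodic component of the signal, in Hz."""
    centered = signal - signal.mean()
    spectrum = np.abs(np.fft.rfft(centered))
    freqs = np.fft.rfftfreq(len(centered), d=1.0 / fps)
    spectrum[freqs < 0.7] = 0.0  # ignore slow drift below ~42 bpm
    return freqs[np.argmax(spectrum)]

# Synthetic 10-second clip: a 1.2 Hz (72 bpm) pulse faintly modulating
# the green channel of otherwise constant frames.
fps, seconds = 30, 10
t = np.arange(fps * seconds) / fps
frames = np.full((len(t), 64, 64, 3), 128.0)
frames[..., 1] += 2.0 * np.sin(2 * np.pi * 1.2 * t)[:, None, None]

f = dominant_frequency_hz(ppg_signal(frames, (16, 48, 16, 48)), fps)
print(round(f, 1))  # 1.2  (Hz, i.e. about 72 beats per minute)
```

A real detector would of course work on detected face landmarks rather than a fixed crop, and would feed the resulting spatiotemporal maps to a classifier rather than a single frequency estimate.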
When evaluated against different datasets, FakeCatcher showed accuracies of 96%, 94.65%, 91.50%, and 91.07% on Face Forensics, Face Forensics++, CelebDF, and a new Deep Fakes Dataset, respectively.
There are a number of possible applications for FakeCatcher, says Intel, including preventing users from posting malicious deepfake videos to social media and assisting news organizations in avoiding airing misleading content.
As deepfake threats proliferate, deepfake detection has become more crucial. These threats include compositional deepfakes, where malicious actors produce several deepfakes to assemble a “synthetic history,” and interactive deepfakes, which give the impression that you are speaking to a real person. The FBI’s Internet Crime Complaint Center reported this summer that it had received a growing number of complaints about people using deepfakes, particularly voice spoofing, to apply for remote jobs. Some even pretend to be job applicants in order to acquire private corporate data. Additionally, deepfakes have been exploited to make provocative statements by posing as well-known political figures.
Generative adversarial networks, or GANs, are frameworks for generative modeling. Generative modeling is an unsupervised learning approach that involves uncovering and studying patterns in data, then using them to generate new outputs. GANs train these generative models by framing the problem as a supervised learning problem split into two sub-models: the generator and the discriminator. While the former generates new instances, the latter classifies instances as either fake (generated) or real (from the domain). The two models are trained in a zero-sum game, where both improve by competing against each other. Many resources are available to kick-start your knowledge of GANs and generative modeling. While books give you the deepest insight into the subject, finishing one is a big commitment. You can always start with some generative adversarial network videos to get the general idea.
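The two-model game described above can be sketched in a few lines of PyTorch. This is a minimal illustration, not any particular video's code: the “domain” is just 1-D samples from a Gaussian, and all layer sizes and hyperparameters are arbitrary.

```python
import torch
import torch.nn as nn

# Toy "real" domain: 1-D samples from N(3, 0.5).
def real_batch(n):
    return torch.randn(n, 1) * 0.5 + 3.0

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(500):
    # Discriminator step: label real samples 1, generated samples 0.
    real, noise = real_batch(64), torch.randn(64, 8)
    fake = G(noise).detach()
    loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator step: try to make D label its samples 1 (the zero-sum game).
    fake = G(torch.randn(64, 8))
    loss_g = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

samples = G(torch.randn(1000, 8))
print(samples.mean().item())  # drifts toward the real mean (3.0) as training progresses
```

The alternating updates are the "competition": D gets better at telling real from fake, which sharpens the gradient signal that pushes G's samples toward the real distribution.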
We have compiled some of the top generative adversarial network videos; have a look.
Top Generative Adversarial Networks Videos
Let us start with some introductory videos that will introduce Generative Adversarial Networks.
Introductory videos about GANs
What are GANs? By IBM Technology
What are GANs is a short introductory video on generative networks. Produced by IBM Technology and presented by Martin Keen, it walks you through the split of GANs into generator and discriminator models. Keen begins by explaining the models, how they function, and their outputs. He then explains how the models compete against each other and how this competition improves both of them. After watching the video, you will be able to define a GAN and understand how it works.
What are Generative Adversarial Networks?
This introductory YouTube tutorial on GANs is a good place for beginners to learn what is meant by “generative,” “adversarial,” and “network.” It is posted by DigitalSreeni, a YouTube channel covering several Python and AI-related topics. The video discusses deep learning architectures with two neural networks: a generator and a discriminator. It is short: Sreeni only briefs you on the concept and gives an overview of how to implement it via code snippets. He ends the video by mentioning several applications of GANs, specifically SRGAN for generating high-resolution images.
The Math Behind Generative Adversarial Networks Clearly Explained! By Normalized Nerd
The above two videos introduce the fundamentals of generative adversarial networks; this one teaches the core mathematics behind these models. You will need a background in statistics and advanced mathematics, as the video begins by defining the generator and discriminator using probability concepts.
The video explains the computation a generative model performs to produce results and a discriminator model performs to classify them. It spends little time on abstract theory and stays practical, walking through all the formulas used. For more information, you can refer to the original paper on which the content is based.
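For orientation, the central formula such a derivation builds up to is the minimax objective from Goodfellow et al.'s original GAN paper: the discriminator $D$ maximizes, and the generator $G$ minimizes, the value function

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

Here $p_{\text{data}}$ is the real-data distribution and $p_z$ is the noise prior the generator samples from; the first term rewards $D$ for recognizing real samples, the second for rejecting generated ones.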
Some more exciting videos about GANs
Now that you know the basics, you can check out some more detailed videos on GANs.
Generative Adversarial Networks and TF-GAN by TensorFlow
This generative adversarial network video is part of the Machine Learning Tech Talks hosted by TensorFlow. Research engineer Joel Shor talks about GANs as a recent development in machine learning and introduces TF-GAN, an open-source library for training and evaluating GANs.
Shor begins by describing GANs and their applications, then delves into the metrics. You need to have a statistical and mathematical background to understand the metrics. Lastly, he discusses how to develop a self-attention GAN and get started working with these networks.
Ian Goodfellow: Generative Adversarial Networks, NIPS 2016 Tutorial
This video session, delivered by Dr. Ian Goodfellow, is an insightful discussion even for those without prior experience with generative adversarial networks. Dr. Goodfellow is the man behind this class of machine learning frameworks, and he aims to help a wider audience understand and utilize GANs to improve on other core algorithms. He describes GANs as “universal approximators” of probability distributions that, unlike many earlier generative approaches, require no Markov chains or variational bounds to produce samples.
While watching the video, you will learn about the entire learning process of the adversarial game between the generator and the discriminator. The video also covers the Jensen-Shannon divergence as an extended GAN framework, applications of GANs, research frontiers, and several improved model architectures.
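For reference, the Jensen-Shannon divergence discussed in the tutorial, taken between the data distribution $p_{\text{data}}$ and the generator distribution $p_g$, is

```latex
\mathrm{JSD}(p_{\text{data}} \,\|\, p_g) =
  \tfrac{1}{2}\,\mathrm{KL}\!\left(p_{\text{data}} \,\middle\|\, m\right)
  + \tfrac{1}{2}\,\mathrm{KL}\!\left(p_g \,\middle\|\, m\right),
\qquad m = \tfrac{1}{2}\left(p_{\text{data}} + p_g\right)
```

At the optimal discriminator, the generator's objective reduces to $-\log 4 + 2\,\mathrm{JSD}(p_{\text{data}} \,\|\, p_g)$, which is why minimizing it drives the generated distribution toward the real one.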
Conditional GANs and their applications by DigitalSreeni
All the generative adversarial network videos on this list cover standard GANs, except this one, which focuses on conditional GANs, or cGANs. The video begins with standard GANs and their usefulness in generating random images from the domain.
You will learn that standard GANs can be conditioned on specific image modalities and on the methods that generate them. This conditioning is done by feeding class labels into both adversarial models. The video also discusses applications like image-to-image translation, CycleGAN, super-resolution, and text-to-image synthesis.
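The "feeding class labels into both models" idea can be sketched concretely: concatenate a one-hot label with the generator's noise input and with the discriminator's sample input. This is a minimal illustration with made-up dimensions, not the video's code.

```python
import torch
import torch.nn as nn

NOISE_DIM, N_CLASSES, DATA_DIM = 16, 10, 32  # illustrative sizes

# Generator input: noise concatenated with a one-hot class label.
gen = nn.Sequential(nn.Linear(NOISE_DIM + N_CLASSES, 64), nn.ReLU(),
                    nn.Linear(64, DATA_DIM))
# Discriminator input: a sample concatenated with the same label.
disc = nn.Sequential(nn.Linear(DATA_DIM + N_CLASSES, 64), nn.ReLU(),
                     nn.Linear(64, 1), nn.Sigmoid())

labels = torch.randint(0, N_CLASSES, (8,))
one_hot = nn.functional.one_hot(labels, N_CLASSES).float()

# Condition the generator: it learns to produce samples *of that class*.
fake = gen(torch.cat([torch.randn(8, NOISE_DIM), one_hot], dim=1))
# Condition the discriminator: it judges (sample, label) pairs.
score = disc(torch.cat([fake, one_hot], dim=1))
print(fake.shape, score.shape)  # torch.Size([8, 32]) torch.Size([8, 1])
```

Training then proceeds exactly as for a standard GAN; the only change is that both networks see the label, so at inference time you can request a sample of a specific class.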
Generative Adversarial Networks by Coursera
Coursera offers several courses on generative adversarial networks, or GANs. These courses contain video lectures about fundamental concepts, applications, and challenges in training and deploying GANs. The course “Build Basic Generative Adversarial Networks,” part of Coursera’s GAN specialization, introduces the intuition behind the concept, helps you build conditional GANs, trains models using PyTorch, and also covers the social implications of using such networks.
This video specialization offers a straightforward route for learners of all skill levels who want to explore GANs or use GANs in their projects, even if they have no prior knowledge of advanced mathematics or machine learning research.
GANs and Autoencoders by Argonne Leadership Computing Facility
This generative adversarial network video features a session of ALCF AI for Science Training and introduces the application of GANs and autoencoders in scientific research. Presented by Corey Adams, an assistant computer scientist at the Argonne Leadership Computing Facility, it is a detailed video discussing an ongoing ALCF research project: the study problem, the theoretical solution, and the code.
Technically, you will learn how these frameworks work and how their learning processes differ. It is one of the most appropriate generative adversarial network videos if you are interested in autoencoders applied to a semi-supervised learning problem and GANs applied to an unsupervised learning problem.
Improved Consistency Regularization for GANs
This is one of those generative adversarial network videos that introduces a new technique to enhance consistency regularization, a model-training technique that is invariant to data augmentations in semi-supervised learning. It discusses bringing the regularization methods used in self- and semi-supervised learning approaches like SimCLR and FixMatch into the GAN setting. You will learn that doing so significantly improves FID scores (Fréchet Inception Distance, a quality-evaluation metric for generated images).
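To make the metric concrete: FID is the Fréchet distance between Gaussian fits to the feature activations of real and generated images, where lower is better. Below is a simplified sketch that assumes diagonal covariances (the full metric uses a matrix square root of the covariance product) and random vectors standing in for Inception activations.

```python
import numpy as np

def fid_diagonal(acts_a, acts_b):
    """Fréchet distance between two activation sets, simplified by
    assuming diagonal covariances (the real FID uses full covariance
    matrices and a matrix square root)."""
    mu_a, mu_b = acts_a.mean(0), acts_b.mean(0)
    var_a, var_b = acts_a.var(0), acts_b.var(0)
    return (np.sum((mu_a - mu_b) ** 2)
            + np.sum(var_a + var_b - 2 * np.sqrt(var_a * var_b)))

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(5000, 64))   # stand-in "real" activations
close = rng.normal(0.1, 1.0, size=(5000, 64))  # similar distribution
far = rng.normal(2.0, 1.0, size=(5000, 64))    # dissimilar distribution

print(fid_diagonal(real, close) < fid_diagonal(real, far))  # True
```

The comparison shows the property the video relies on: generators whose sample distribution sits closer to the real one score a lower FID.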
Business Applications of GANs and Reinforcement Learning by Dataiku
If you want to learn about real-life applications of GANs, this video posted by Dataiku is a great one. In it, Alex Combessie, a data scientist at Dataiku, talks about the business applications of GAN technologies. GANs have succeeded in synthetic image generation, but can they be applied to forecasting option prices?
Combessie shares the story of two data scientists who deployed a GAN for option pricing. Specifically, he discusses real-time option pricing, explaining the Gaussian assumption in the Black-Scholes formula. If you are interested in trading options contracts, learning about pricing is wise, and this video is a great place to start if you have a background in AI, adversarial networks, or related technologies and wish to learn about their application in options trading.
Galileo, a machine learning data intelligence platform, has recently announced Galileo Community Edition, which allows data scientists working on natural language processing to build high-performance ML models with better-quality training data. The free edition was showcased during the Galileo demo hour on November 15.
You can instantly fix, track, curate, and optimize your machine-learning data with Galileo. With it, you can carry out many tasks, such as text classification, named entity recognition, multi-label text classification, and natural language inference, using different platforms such as Hugging Face, PyTorch, TensorFlow, and Keras.
Vikram Chatterji, the CEO of Galileo, said, “While data powers ML, debugging unstructured data is very manual and time intensive.” The co-founders, Atindriyo Sanyal and Yash Sheth, had also noticed the absence of data tools for unstructured data in ML, even at companies like Apple, Google, and Uber AI. Galileo was therefore developed to help data scientists handle unstructured data effectively.
With Galileo, data scientists can integrate with various labeling tools, such as Labelbox, Scale AI, and Label Studio, and with cloud providers like GCP, AWS, and Azure. Galileo also lets users integrate with machine learning platforms and services like AzureML, Vertex AI, SageMaker, and Databricks.
With Galileo, data scientists can reduce the time needed to train on large and complex datasets from weeks to minutes by eliminating data mistakes. Recently, Galileo announced that it had raised an $18 million Series A funding round led by Battery Ventures and others.
Cerebras Systems, an American artificial intelligence company, has recently announced its new AI supercomputer, Andromeda, which is now available for commercial and academic use.
The 13.5-million-core AI supercomputer, Andromeda, is built by linking 16 Cerebras CS-2 systems. The company claims that Andromeda has more cores than 1,953 NVIDIA A100 GPUs, and 1.6 times as many cores as the largest supercomputer in the world, Frontier, which has 8.7 million cores.
As per Cerebras, multiple users can use Andromeda simultaneously and specify, within seconds, how many of Andromeda’s CS-2s they want to use. Andromeda can serve as a single 16-CS-2 supercomputer cluster for one user working on a single job, or as 16 individual CS-2 systems for sixteen different users with sixteen different jobs.
The company claims that Andromeda can deliver more than one exaflop of AI computing and 120 petaflops of dense computing at 16-bit half precision. Andromeda is the only supercomputer to demonstrate near-perfect scaling on large language model workloads relying on simple data parallelism. Near-perfect scaling means that as more CS-2s are used, training time is reduced in near-perfect proportion.
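Near-perfect scaling can be quantified as the fraction of the ideal N-fold speedup actually achieved. A quick sketch, using made-up timings rather than Cerebras measurements:

```python
# With n systems, the ideal training time is t1 / n; efficiency is the
# fraction of that ideal speedup actually achieved (1.0 = perfect scaling).
def scaling_efficiency(t1, tn, n):
    """t1: time on one system; tn: measured time on n systems."""
    return (t1 / tn) / n

# Hypothetical timings for one job on 4 and 16 systems (illustrative
# numbers only), against a 100-hour single-system baseline.
for n, t in [(4, 25.6), (16, 6.6)]:
    print(n, round(scaling_efficiency(100.0, t, n), 2))
```

Running this prints efficiencies of 0.98 and 0.95: close to 1.0 at both scales, which is what "training time reduced in near-perfect proportion" means in practice.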
According to Cerebras, Andromeda is built at Colovore, a high-performance data center in Santa Clara, California. Organizations and researchers from US national labs can access Andromeda remotely.
Udacity and Bertelsmann are partnering to award learners 50,000 scholarships through the Next Generation Tech Booster program. It aims to open new career opportunities for students in the lucrative fields of data science, cybersecurity, and front-end web development.
Applicants must be 18 years or older and able to comprehend English. The program is suitable for developing job-ready skills in data science, cybersecurity, or front-end web development. The last date to apply is November 28.
Learners who apply to the program will select one of three tracks: Programming for Data Science, Front End Web Developer, or Introduction to Cybersecurity.
In phase 1, 17,000 accepted applicants will enroll in a challenge course for their selected track, in which they will learn the foundational elements of their chosen topic. In phase 2, the 500 top-performing learners from their respective challenge courses will receive a Nanodegree program scholarship.
This scholarship program aims to allow learners worldwide to develop lucrative digital skills regardless of social status or cultural background.
Many organizations, like Meta, OpenAI, and Google, are working in the language domain and expanding language models to tackle more complex tasks. Language models transform and generate qualitative information much as humans do. Fundamentally, these models interpret data using algorithms that process information in the context of natural language, and once trained, they can produce new content. Working in the same domain, Google AI has proposed a new artificial intelligence-driven approach, “ReAct,” for large language models. In this research, the researchers combine advances in reasoning and acting to enhance the efficiency of language models.
Existing language models usually rely on one of two main techniques: chain-of-thought prompting or action generation with pre-trained models. Models that work via chain-of-thought, a standard prompting method that enables a model to decompose a problem into numerous intermediate steps, are very efficient; with this prompting technique, language models of sufficient scale (roughly 100B parameters) can effectively solve reasoning problems. However, the technique leaves these reason-only models cut off from external environments and limits their ability to explore.
On the other hand, models that use pre-trained language models to map text contexts directly to actions, drawing on the model’s internal knowledge, are known as act-only models. However, these models cannot reason or remain consistent in their actions, since they learn from whatever they are fed: if the input is not sound and consistent, the model will learn from it and reproduce it. Consequently, language models are known to exhibit more social bias than human-written text.
With ReAct, the researchers show that the Reason+Act (ReAct) paradigm outperforms reason-only and act-only models. ReAct is particularly effective at large scale, at optimizing smaller models, and at improving interpretability. To set up ReAct prompting, the PaLM-540B language model was prompted with in-context, domain-specific examples. While executing reasoning-based tasks like navigation, reasoning and acting steps alternate. For instance, if the prompt is a “go to” command in a room-navigation task, that command requires a task-solving trajectory comprising multiple reasoning-action-observation stages.
What sets the ReAct approach apart is that reasoning traces only need to be sparsely placed throughout the trajectory of tasks with a large number of actions; the ReAct model itself determines when and how reasoning and action responses occur, asynchronously. The PaLM-540B model was used to generate successful trajectories, which were later used to fine-tune smaller models like PaLM-8B and PaLM-62B.
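The reasoning-action-observation loop can be sketched as a simple controller. This is a toy illustration of the trajectory format only; `call_llm` and `environment_step` are hypothetical stand-ins, not the paper's API, and the scripted "model" below exists purely to show the flow.

```python
# A minimal ReAct-style control loop: the model interleaves a reasoning
# trace ("thought") with an action; the environment returns an observation
# that is appended to the prompt for the next step.
def react_episode(question, call_llm, environment_step, max_steps=5):
    prompt = f"Question: {question}\n"
    for _ in range(max_steps):
        thought, action = call_llm(prompt)       # e.g. ("need location", "search[X]")
        if action.startswith("finish["):
            return action[len("finish["):-1]     # final answer inside finish[...]
        observation = environment_step(action)   # e.g. a search result
        prompt += (f"Thought: {thought}\nAction: {action}\n"
                   f"Observation: {observation}\n")
    return None

# Scripted toy model and environment, just to show the trajectory shape.
steps = iter([("I should look up X", "search[X]"),
              ("The observation answers it", "finish[42]")])
answer = react_episode("What is X?", lambda p: next(steps), lambda a: "X is 42")
print(answer)  # 42
```

The key design point mirrors the paper's description: reasoning is free-form text that shapes later actions, while actions are the only steps that touch the external environment.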
The researchers evaluated ReAct against four benchmarks to see if it could reduce the extensive need for human annotation: HotPotQA (question answering), Fever (fact-checking), ALFWorld (text-based gaming), and WebShop (web page navigation). On HotPotQA and Fever, the model overcomes common errors and hallucinations in chain-of-thought reasoning; on ALFWorld and WebShop, ReAct surpasses reinforcement learning techniques.
ReAct was also investigated with a human inspector controlling its reasoning traces, so that the researchers could evaluate human interactions with the model. ReAct successfully altered its behavior in line with the revisions provided by the inspector: even when the model starts down a hallucinatory trajectory, an improvised human revision sets it right, making it highly effective for human-machine interaction with negligible human involvement. Google has been actively working on language models, and this new model is yet another stride in that direction. As the paper shows, ReAct makes it feasible to describe behavior or feedback within a model while it flexibly handles input and calls for action. Be it multiple-choice question answering, fact verification, or interactive decision-making, ReAct exhibits commendable performance.
Meta introduced a new large language model, “Galactica,” to generate original academic papers from simple prompts. But as quickly as it was introduced, several people criticized it as “dangerous,” after which Meta took down the demo.
Earlier, on visiting the website, users could see an option to “Generate” content, as seen in the image below.
However, as more and more people reported that it was full of “statistical nonsense” and generating “wrong” content, Meta withdrew the option to experiment with it.
Grady Booch, a software engineer, described Galactica as “little more than statistical nonsense at scale.” He said it was amusing but, in his humble opinion, unethical.
Gary Marcus tweeted his concern after reviewing Galactica and said that it has “jumped the AI shark.”
He also added that Galactica “prevaricates” a lot, implying that it is evasive when it comes to the exact truth. He went on to say that students would love to use such a model to intimidate their teachers, while others, aware of the risks, should be terrified.
Michael Black, director of the Max Planck Institute for Intelligent Systems, tweeted:
I asked #Galactica about some things I know about and I'm troubled. In all cases, it was wrong or biased but sounded right and authoritative. I think it's dangerous. Here are a few of my experiments and my analysis of my concerns. (1/9)
He said the work is an interesting advancement, yet it is not useful and safe for doing scientific work. He used the word “dangerous” and explained that Galactica outputs grammatically coherent content, but there is no certainty of it being unbiased and scientifically correct. In such ambiguity, if these results slip into scientific submissions, it would be potentially distorting.
He feared that models like Galactica could usher in an era of scientific deep fakes. He said, “Alldieck and Pumarola will get citations for papers they didn’t write. These papers will then be cited by others in real papers. What a mess this will be.”
Keenan Crane, Associate Professor of Computer Science and Robotics at CMU, also expressed distrust in Galactica. He said that none of the deep language models can be trusted completely, as they sound intuitive and authoritative while merely imitating reliability.
So badly—but subtly—wrong that I not only distrust #Galactica: I don’t trust *any* deep language model to provide reliable answers in situations that matter.
Meta announced on Thursday it has appointed Sandhya Devanathan as the new vice president of its India unit. She will be the head of the company’s business in the country.
Meta said Devanathan’s journey as the Vice President of Meta India would begin on January 1, and she will relocate to India to head the organization. She is currently based in Singapore.
Devanathan is succeeding Ajit Mohan, who resigned on November 3. Meta India’s director and head of partnerships, Manish Chopra, has taken over Mohan’s position on an interim basis.
She joined Meta in January 2016 as the Group Director of SEA – Travel, FinServ, and E-commerce, in Singapore. She became the tech giant’s Business Head for Vietnam and Managing Director for Singapore in August. She is also Meta’s Vice President for Gaming in the Asia-Pacific region, a role she assumed in April 2020.
Devanathan also serves, or has served, as a board member of several organizations. These include the National Library Board (Singapore), Singapore Management University, Pepper Financial Services Group, Women’s Forum for the Economy and Society, and Ministry of Information and Communications (Singapore).
Motional, the joint venture between Hyundai and Aptiv for autonomous vehicles, is bringing its robotaxis to Los Angeles, where riders can request them using the Lyft app. The service will be provided by Motional’s fleet of electric Hyundai IONIQ 5 cars, all of which will operate fully autonomously from launch, without a human safety driver. However, neither company has stated when the service will be available.
After Las Vegas, Los Angeles is the second location in Lyft and Motional’s multi-city collaboration, where the two businesses are offering shared trips using Motional’s IONIQ 5 AVs. The IONIQ 5 cars, like those in Las Vegas, will be incorporated into Lyft’s Los Angeles network. Riders in Los Angeles will have access to a rideshare network with various transportation options, including autonomous cars and conventional ridesharing, to get them where they need to go. People can use the Lyft app to open the doors when the car arrives, and there is a passenger display in every car that can be used to contact a remote agent anytime.
The IONIQ 5 is powered by Motional’s AI-first autonomous vehicle stack, which includes over 30 sensors, including cameras, radar, and LIDAR, to reliably identify objects at extremely long distances. An onboard ‘compute’ system analyzes all the data coming in from the robotaxi cameras and sensors.
Motional claims to have provided over 100,000 trips in Las Vegas through Lyft, with more than 95% of rated rides receiving five stars. Additionally, the company has signed a 10-year partnership agreement with Uber, announcing that it would begin providing passenger trips later this year and that its vehicles will be “strategically placed” in US cities.
Motional has had an office in Los Angeles since 2016. Last year, it announced plans to expand its foothold in California by establishing a new operations center in Los Angeles to facilitate testing on public roads, recruiting additional engineers, and creating an office in Silicon Valley. However, this will be the first time LA residents can use the Lyft app to book an autonomous vehicle, which means Motional must obtain all necessary permits before accepting passengers. AV companies must get a series of permits from the Department of Motor Vehicles and the Public Utilities Commission to legitimately shuttle passengers and receive ride payments.
Despite the mandatory permits, California has been one of the most supportive and legally friendly states when it comes to robotaxis and fully autonomous vehicles. A month ago, Alphabet’s Waymo unit revealed plans to launch a self-driving robotaxi service in key Los Angeles districts. In September, Uber Eats announced the introduction of Nuro’s driverless delivery robots in California.
Earlier this year, Meta revealed its plans to host a hyperrealistic avatar of The Notorious B.I.G, a late rapper popularly known as “Biggie,” in a VR concert called The Brook. The concert was meant to accompany other upcoming NFTs that were due to launch by the end of 2022.
Now, Meta has updated those plans, announcing that its Horizon Worlds will host ‘The Notorious B.I.G. Sky’s The Limit: A VR Concert Experience.’ As per the announcement, there will be a ‘hyperrealistic’ avatar of Biggie, with legs and a human-like face.
Biggie, or Christopher Wallace, was a well-known hip-hop rapper who rose to stardom in the early 90s. He was mysteriously killed in a drive-by shooting in March 1997.
The VR concert will be organized in collaboration with the late icon’s estate managers and will take place on December 16, 2022. The rapper’s avatar will perform his famous raps, such as ‘Hypnotize,’ and will take concert attendees on a “narrative journey” through Biggie’s life in Brooklyn.
The iconic rapper has received such homage before. In May 2022, on what would have been his 50th birthday, the Empire State Building changed its colors to red and white, topped with a spinning crown. A special memorial, attended by the likes of Lil’ Kim, Lil’ Cease, and Voletta Wallace (his mother), was also held in New York.