
Baidu unveils plans to work in generative AI, the metaverse, and quantum computing


Chinese technology giant Baidu has unveiled plans to expand its field of work to emerging areas such as generative AI, the metaverse, and quantum computing.

Baidu CEO Robin Li announced this at the company’s annual AI developer conference, Baidu Create, which was held virtually on its XiRang metaverse platform. It was co-hosted by humans and robots for the first time.

Among the deep learning products unveiled at Baidu Create were video content generation and editing models, including the large model ERNIE 3.0 Zeus. Baidu said these new models would boost AI-generated content.

Read More: Meta Sues Voyager Labs In A Lawsuit For Allegedly Creating Fake Accounts To Scrape Data

Baidu also debuted its own metaverse solution, XiRang MetaStack. According to the company, the platform enables brands to build their own metaverse space in only 40 days, compared with the industry average of 6 to 12 months.

The Chinese tech company also showcased an AI-powered software tool to cancel echoes during smartphone calls. Baidu said this development would “enable a smoother and more intelligent human-AI interaction for improved navigation.” 

Baidu plans to increase research and infrastructure construction to foray into the metaverse and quantum computing. The company also expressed plans to integrate its quantum technology into numerous industries.


PicsArt announces its new AI tool SketchAI


The world’s leading digital creation platform, PicsArt, has announced a new standalone app called SketchAI, which gives its users the ability to turn a sketch or image into a stunning piece of AI art.

With the rapid rise of generative AI in the past few months, millions of people are using this technology to make new breathtaking visuals. SketchAI expands these possibilities even further by offering users a fun and quick way to create art from sketches or photos. 

Apart from creating a sketch or uploading an existing image, users can also add text describing the image to enhance results. SketchAI features various artistic styles which users can apply to their creations, including pencil sketching, ink drawing, da Vinci, van Gogh and more.

Read More: Shutterstock Joins Hands With Meta To Boost Generative AI Plans

“The evolution of AI technology from then until now is incredible. The ability to not only draw anything you want on a mobile phone, but now turn that into a completely new work of art is something I never would have thought was possible,” said Picsart’s VP of Product Lusine Harutyunyan. 

SketchAI is the latest in Picsart’s rapid innovation in the generative AI space, coming just after the launch of AI Avatar, which provides custom AI-generated avatars from a user’s selfies. 


South Korea’s Ambitions to Lead AV Industry: Transport Minister Has New Plans


The South Korean Ministry of Transport is planning to restructure the country’s transportation networks and establish safety regulations and insurance schemes to introduce Level 4 autonomous cars by the end of next year. The government will dramatically relax rules governing autonomous vehicles, Land, Infrastructure, and Transport Minister Won Hee-ryong announced on Sunday at the CES 2023 trade show in Las Vegas.

This comes after the Ministry of Land, Infrastructure and Transport unveiled its Mobility Innovation Roadmap last year. In September 2022, the Korea Times reported that, as part of this roadmap, the nation plans for Level 4 autonomous driving technology to be standard on 50% of all new automobiles by 2035. By 2030, the South Korean government hopes to establish real-time communications between vehicles and the road, along with installing a connectivity system across 110,000 km of roads. It even plans to bring autonomy to bus services in the future.

Self-driving technology has been one of South Korea’s most widely adopted megatrends over the past ten years, particularly in connected and autonomous vehicles (CAVs), which enable V2X connection (communication between cars, infrastructure, and other road users), according to a recent news release from ResearchAndMarkets.com.

Read More: Latest Research Solves Freeway Ramp Merging problem of Autonomous Vehicles

In light of this, autonomous cars are viewed as an important development in the automotive industry, with the potential to improve driving comfort and road safety in South Korea. Furthermore, South Korea has made significant advancements in autonomous vehicle technology, piquing the interest of automakers, technology companies, policymakers, and the general public.

It will be interesting to see how the nation fulfills its autonomous vehicle ambitions and delivers on its futuristic vision of leading the industry. 


Shutterstock Joins Hands with Meta to Boost Generative AI Plans


Shutterstock recently announced a collaboration with Meta in an attempt to foster innovations in generative AI. This partnership, according to Shutterstock, highlights both companies’ dedication to being at the forefront of AI innovation, as well as the potential of Shutterstock’s expansive content library.

Shutterstock claims that its growing collaboration with Meta is part of a larger strategic objective to be at the core of technology, design, content, and innovation. It also claims to be one of the first companies to compensate artists for their work in developing machine learning models, and that by assuring ethical content production and licensing through a transparent IP transfer, it has “proved to be a trusted partner” to those joining the market.

Paul Hennessy, CEO of Shutterstock, thinks AI could encourage more creativity. In addition to the company’s alliances with OpenAI and LG AI Research, announced last year, he said it intends to deepen its long-standing relationship with Meta.

Read Also: Meta acquires 3D smart glass maker Luxexcel

The latest Meta and Shutterstock collaboration will focus on three core goals:

  • Introduce innovative products to the market
  • Expand the Shutterstock ecosystem, which currently compensates contributors and connects them with artists.
  • Meta will leverage Shutterstock’s collection of millions of videos, photos, and music tracks to create, train, and analyze its machine-learning models. In exchange, Meta will allow Shutterstock to use its generative AI models, such as Make-A-Scene, Make-A-Video, AudioGen, and others.

Shutterstock has been acquiring many competitors over the past couple of years to expand its content library and gain a dominating influence in the generative AI industry. Over the past two years, it has acquired the leading video-focused stock agency Pond5, the 3D-rendering stock agency TurboSquid, the online image editing and design platform PicMonkey, and the celebrity news agency Splash News. In October of last year, the company also added DALL-E generative AI technology to its platform.


List of Bug Bounty Platforms for Cyber Security


In the era of digital communication, reducing the consequences of an exploit on software or web services must be a primary objective. Many companies, however, pay too little attention to the need for security experts and professionals who advise on security, leaving themselves vulnerable to cyber attacks. A good starting point in preparing an organization for cyber security is to tap into the experience of security professionals and understand the benefits of bug bounty programs and platforms.

What is a bug bounty program, and why have one?

Bug bounty programs, or vulnerability reward programs, enable ethical hackers to use their technical know-how to find vulnerabilities in a company’s network and receive compensation based on their severity. Running or joining a bug bounty program gives organizations access to security professionals and lets them step beyond their own testing constraints, helping them find vulnerabilities they might otherwise overlook. Since these programs are often continuous, organizations can keep running them for as long as their services are live. With this approach, enterprises don’t have to wait for the next testing cycle to find new vulnerabilities.

What are bug bounty platforms?

Simply put, bug bounties are the rewards a company gives white-hat hackers (ethical hackers who identify vulnerabilities in networks, hardware, or software). Bug bounty platforms are dedicated to creating and managing bug bounty programs while offering discussion communities to facilitate better security practices. Businesses use these platforms to reward seasoned users who test and identify product flaws. Most companies supplement their in-house QA and issue-finding efforts with these platforms’ bug bounty services. Businesses that can test for vulnerabilities without disclosing sensitive information benefit most from bug bounty programs.

  1. Open Bug Bounty

Open Bug Bounty is an independently established bug bounty platform that surfaced in 2014. It is a non-profit project developed by security researchers to connect website owners and security administrators and make the web safer. Any security researcher can disclose a vulnerability on a website using Open Bug Bounty’s coordinated vulnerability disclosure platform, as long as the flaw was discovered without intrusive testing methods and was submitted per responsible disclosure standards. The platform follows ISO standard guidelines to ensure the ethical and thoughtful disclosure of any vulnerability.

Open Bug Bounty is only responsible for independent verifications of detected vulnerabilities and notifying website owners. Upon being reported, it is up to the website owner to decide on a suitable remedy and coordinate its disclosure. 

  2. Redstorm

Redstorm is a bug bounty platform that helps organizations build a team of ethical hackers and security experts as part of their infosec teams. Using Redstorm’s platform, organizations can conveniently publish websites and applications to independent security researchers and ethical hackers, who will try to find vulnerabilities in their products.

Redstorm also offers vulnerability disclosure assistance by helping organizations determine the target scope (of what needs to be tested) and the validity of the spotted vulnerabilities.

  3. YesWeHack

YesWeHack is an emerging European bug bounty platform and vulnerability management company. The platform offers a large community of security professionals and white-hat hackers who optimize vulnerability testing. Clients can choose the experts relevant to their security needs, describe their requirements, and have the ‘hunters’ find vulnerabilities. Once done, users receive protected vulnerability reports, ensuring data privacy and disclosure compliance.

YesWeHack also hosts several contests and hackathons to attract people (or ‘hunters’ as they call them) to hone their hacking skills with its DOJO platform. The platform also offers introductory ethical hacking courses and training modules for those who want to learn ethical hacking. 

  4. BugCrowd

Founded by cybersecurity expert Casey Ellis, BugCrowd is one of the most creative and inventive bug bounty platforms. BugCrowd is known for pushing beyond standard crowdsourced security testing and attack surface management, with a wide range of penetration testing offerings for IoT, APIs, and even networks. The platform also promotes various software development life cycle (SDLC) integration capabilities to speed up and simplify the DevSecOps workflow.

BugCrowd also runs a university where security research, webinars, and ethical hacking training are offered for those who want to learn and participate in bug bounty programs. The renowned (ISC)² cybersecurity education group and business behemoths like Amazon, VISA, and eBay have hosted numerous bug bounty programs on BugCrowd.

  5. Immunefi

Immunefi is a web3 bug bounty platform that operates some of the most significant bounties worldwide and was among the first operational bug bounty programs for blockchain projects. It is a unique, chain-agnostic platform, i.e., it hosts bug bounties for projects across different blockchains. Immunefi has a white-hat army of security experts who perform continuous code reviews and check for vulnerabilities.

Since its inception in 2020, Immunefi has become an industry leader with a team of over 50 experts. It protects over US$25 billion in user funds spread across multiple projects like Chainlink, Compound, and Cream Finance.

Read More: Experience Amazing Video Personalizations with Myna by Gan.ai

  6. Bugv

Bugv is a bug bounty platform that helps with vulnerability coordination through robust penetration testers and a team of security researchers. It was founded by Naresh LamGade, an independent security researcher and web enthusiast with a vision to make infrastructures more secure and better prepared to tackle exploits. After working as a security analyst in the cybersecurity domain, he wanted to help other organizations stay up to date on vulnerabilities and exploits.

  7. HackerOne

HackerOne is one of the leading bug bounty platforms, specializing in attack resistance management (ARM). It was founded in 2012 by ethical hackers and security experts to bridge the gap between organizations’ assets and their protection. HackerOne’s ARM identifies relative weaknesses in a constantly changing digital attack surface and combines the security expertise of ethical hackers with asset discovery, ongoing assessment, and process improvement. With HackerOne’s bug bounty platform, organizations can monitor their bug bounty program in real time and get access to multiple remediation methods.

  8. Bugbase

Bugbase is one of the first Indian bug bounty platforms and the largest cybersecurity marketplace in the country. Bugbase harnesses a massive pool of ethical hacking talent to secure businesses with an all-in-one platform. It provides one-click integration so that security testing solutions can be used within minutes, with bugs reported within a few hours. Bugbase supports compliance standards and regulations such as CERT-In, PCI, NIST, and GDPR to ensure data privacy and security.

  9. Synack

Synack is one of the most highly valued commercial bug bounty platforms, founded by security visionaries Jay Kaplan and Mark Kuhr. Synack offers a robust testing service with end-to-end vulnerability management through Synack365 and a specialized Synack Red Team (SRT) for bug bounty operations. The SRT is an exclusive group of cybersecurity experts comprising security specialists with verified backgrounds and respectable industry expertise. By conducting thorough due diligence on its Red Team and documenting every action for later analysis or review, Synack has established itself as a market leader in trusted crowdsourced security testing services.

  10. Inspectiv

Inspectiv is a well-known vulnerability management and bug bounty platform that crowdsources security testing of web applications and scans for vulnerabilities. It was founded in 2018 to provide world-class cybersecurity intelligence and consulting services. Veritone’s FedRAMP certification validates the platform for third-party assessment controls. It effectively reduces cybersecurity threats through easy identification and a hands-on triage team that validates security concerns.


Top Deep Learning Frameworks


Deep learning, a vital element of data science, is a class of machine learning that works similarly to the human brain. Put simply, it can be considered a proxy for predictive modeling, a statistical technique for predicting future states. The top deep learning frameworks make this process quicker and more straightforward, which benefits data scientists who gather, analyze, and interpret massive amounts of data.

Deep learning frameworks differ from traditional ML frameworks in linearity. While traditional pipelines are linear, deep learning frameworks stack algorithms in complex hierarchies (or layers). Each algorithm in the hierarchy performs a nonlinear transformation on its input and outputs a statistical model based on what it has learned. Iterations continue until the output is accurate enough. The word “deep” thus refers to the multiple processing layers the data must pass through.
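
The stacked nonlinear transformations described above can be sketched in a few lines of NumPy. This is an illustrative toy, not any particular framework; the layer sizes and random weights are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, w, b):
    """One hierarchy level: a linear map followed by a nonlinearity."""
    return np.tanh(x @ w + b)

# Three stacked layers: 4 features -> 8 -> 8 -> 2 outputs
shapes = [(4, 8), (8, 8), (8, 2)]
weights = [rng.normal(size=s) for s in shapes]
biases = [np.zeros(s[1]) for s in shapes]

x = rng.normal(size=(5, 4))  # a batch of 5 inputs with 4 features each
for w, b in zip(weights, biases):
    x = layer(x, w, b)       # data passes through each processing layer

print(x.shape)  # (5, 2): every input went through 3 nonlinear layers
```

Each pass through `layer` is one of the "multiple processing layers"; chaining them is what makes the model deep rather than a single linear transform.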

Convolutional neural networks (CNNs), recurrent neural networks (RNNs), long short-term memory (LSTM) networks, and deep belief networks (DBNs) are popular deep learning architectures commonly used in commercial applications and use cases like NLP and computer vision. This article lists some of the top frameworks for building them.

Top deep learning frameworks

  1. Sonnet

Sonnet is an advanced deep learning framework built on top of TensorFlow by Google’s DeepMind. Its versatility and adaptability make it a foundation for creating even higher-level frameworks. Sonnet does not replace TensorFlow for DL-based tasks but makes it easier for users to construct neural networks. Previously, developers had to be proficient with the underlying TensorFlow graphs; now all they have to do is construct Python objects and connect them to form a TensorFlow computation graph.

It features multiple pre-defined modules such as snt.Linear, snt.Conv2D, and snt.BatchNorm. Users can also write custom modules and submodules to create models. These models can be readily integrated with raw TensorFlow code or code written in other high-level libraries.
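
The "construct Python objects and connect them" pattern can be sketched without TensorFlow installed. The `Linear` and `Sequential` classes below only mimic the shape of Sonnet modules such as `snt.Linear` and `snt.Sequential`; they are not Sonnet’s actual implementation:

```python
import numpy as np

class Linear:
    """A module object holding its own parameters, Sonnet-style."""
    def __init__(self, in_size, out_size, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(size=(in_size, out_size)) * 0.1
        self.b = np.zeros(out_size)

    def __call__(self, x):
        return x @ self.w + self.b

class Sequential:
    """Connects module objects into one model by chaining their calls."""
    def __init__(self, layers):
        self.layers = layers

    def __call__(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

# Build a model by composing objects rather than wiring a graph by hand.
mlp = Sequential([Linear(16, 32), np.tanh, Linear(32, 10)])
out = mlp(np.ones((2, 16)))
print(out.shape)  # (2, 10)
```

In real Sonnet, calling such modules inside a TensorFlow function is what produces the underlying computation graph; the user only ever touches the Python objects.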

  2. Eclipse Deeplearning4j

Deeplearning4j is a comprehensive suite of tools for deep learning. The framework is highly compatible with the JVM and allows you to train models from Java while incorporating the Python ecosystem. It offers broad model-import support, cpython bindings, and interoperability with multiple runtimes such as ONNX Runtime.

Deeplearning4j comes with numerous submodules like samediff (for complex graphs), Nd4j (numpy ++ for Java), Libnd4j (a C++ library), Python4j (for easy deployment of Python scripts), and others. You may use Deeplearning4j either as an addition to your current Python and C++ workflows or as a standalone library to create and deploy models.

Several renowned enterprises, including U.S. Bank, Nuix, and Teladoc Health, are among the hundreds of organizations that use Deeplearning4j.

  3. TensorFlow

TensorFlow is an open-source framework developed by the Google Brain team for internal research and production. The end-to-end platform allows developers to build and deploy machine learning models with built-in neural network execution. The framework offers scalability by training across multiple resources simultaneously and seamless support for transitioning from shared memory to distributed memory.

TensorFlow supports a variety of programming languages. It is most frequently used from Python, its most stable API. There is also support for other languages, including JavaScript, C++, and Java. This language flexibility enables a broader range of industrial applications.

Enterprises like Airbnb, eBay, AirBus, DropBox, Snapchat, and others actively utilize TensorFlow for text-based services, image recognition, voice search, etc.

  4. PyTorch

PyTorch is an open-source and one of the top deep learning frameworks based on Torch, a package containing data structures for multi-dimensional tensors. The framework, developed by Facebook’s AI Research Lab, accelerates the process from prototyping to production deployment by focusing on computer vision and NLP tasks. 

PyTorch is similar to NumPy in its computations, but instead of plain arrays it uses tensors, n-dimensional arrays that can be moved to GPUs for massive computation speedups. It also provides an autograd package for automatic gradient computation, a set of modules for neural network functionality, and many more features.
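
What an autograd package automates can be illustrated with a toy scalar version in pure Python. This is a teaching sketch of the idea (each operation records how to propagate gradients backwards through the graph), not PyTorch’s actual implementation:

```python
class Value:
    """A scalar that remembers how it was computed, for backpropagation."""
    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def backward():          # d(a+b)/da = 1, d(a+b)/db = 1
            self.grad += out.grad
            other.grad += out.grad
        out._backward = backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def backward():          # d(a*b)/da = b, d(a*b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = backward
        return out

    def backward(self):
        # Topologically order the graph, then apply the chain rule in reverse.
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            v._backward()

x = Value(3.0)
y = Value(4.0)
z = x * y + x   # z = x*y + x
z.backward()
print(x.grad, y.grad)  # dz/dx = y + 1 = 5.0, dz/dy = x = 3.0
```

PyTorch does the same bookkeeping for tensor operations, which is why calling `.backward()` on a loss fills in `.grad` on every parameter that contributed to it.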

AMD, Intel, Pfizer, NVIDIA, Intuitive Surgical, and many other established enterprises use PyTorch for CRM, marketing, campaign management, and other applications.

  5. Caffe2

Caffe, created by Berkeley AI Research (BAIR), is one of the top deep learning frameworks promoting modularity, performance, and expression. Caffe enables automation, image processing, statistical analysis, and other tasks when dealing with massive data sets. Meta’s research division developed Caffe2 as an advanced version of the original framework, improving flexibility, robustness, and scalability to make deep learning more straightforward.

Caffe2 is compatible with native Python and C++ APIs that are interchangeable, allowing for rapid prototyping and simple optimization. It can also integrate with Android Studio, XCode, or Microsoft Visual Studio for mobile development.

Many organizations like Snap Inc, Qualcomm, Meridian Health, Thermo Fisher Scientific, and others use Caffe2 for its deep learning capabilities. 

  6. Kaldi

Kaldi is an open-source deep learning toolkit written in C++ and designed explicitly for speech recognition. The framework is highly flexible and efficient at training acoustic models (statistical representations of acoustic information). Kaldi compiles against the OpenFst toolkit and provides several “recipes” for model training.

The framework has bindings for Python, MATLAB, Java, and other programming languages. Companies such as Microsoft, Google, IBM, and Apple use Kaldi, which is released under the Apache 2.0 license.

  7. Theano

Theano is a Python-based library built on NumPy that enables you to define, optimize, and evaluate mathematical expressions for deep learning with multi-dimensional arrays. Named after Theano, a Greek mathematician, the library was introduced by the Montreal Institute for Learning Algorithms in 2007. It features transparent use of GPUs for data-intensive computations, avoids nasty bugs in complex computations, supports dynamic C code generation, and includes tools for detecting potential problems.

Theano works in three stages: first, it defines the objects or variables; next, it builds the mathematical expressions over them; and finally, it evaluates the expressions once input values are supplied.
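
The three stages can be sketched with a minimal deferred-evaluation expression tree in pure Python. This only illustrates the define/build/evaluate flow; real Theano additionally optimizes the expression graph and compiles it to fast C or GPU code:

```python
class Var:
    """Stage 1: a symbolic variable with no value yet."""
    def __init__(self, name):
        self.name = name

    def eval(self, env):
        return env[self.name]

    # Building expressions (stage 2) just constructs a tree of Ops.
    def __add__(self, other):
        return Op(lambda a, b: a + b, self, other)

    def __mul__(self, other):
        return Op(lambda a, b: a * b, self, other)

class Op(Var):
    """An operation node combining two sub-expressions."""
    def __init__(self, fn, left, right):
        self.fn, self.left, self.right = fn, left, right

    def eval(self, env):
        return self.fn(self.left.eval(env), self.right.eval(env))

x, y = Var("x"), Var("y")   # Stage 1: define variables
expr = x * y + x            # Stage 2: build the expression symbolically
result = expr.eval({"x": 3, "y": 4})  # Stage 3: evaluate with inputs
print(result)  # 3*4 + 3 = 15
```

Nothing is computed until `eval` runs, which is exactly what lets a system like Theano inspect and optimize the whole expression before execution.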

Companies like Vizual.ai, Vuclip, Zetaops, and others actively use Theano in their technology stacks. 

  8. Scikit-learn

Next on our list of top deep learning frameworks is Scikit-learn, a Python-based toolkit that began in 2007 as a Google Summer of Code project by David Cournapeau. It was designed to facilitate machine learning and artificial intelligence algorithms and works in conjunction with libraries like NumPy, SciPy, Matplotlib, and Pandas.

Though better known for classical machine learning than deep learning, it is widely used for statistical modeling with regression, classification (e.g., K-nearest neighbors), clustering (K-means and K-means++), preprocessing, and dimensionality reduction.
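
The K-nearest-neighbors classification mentioned above is simple enough to sketch by hand; this toy version shows the algorithm that scikit-learn packages (with far more options and speed) behind its KNeighborsClassifier. The data points and k=3 are illustrative choices:

```python
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest neighbors."""
    # Rank training points by squared Euclidean distance to the query.
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(p, query)), label)
        for p, label in zip(train_x, train_y)
    )
    # Majority vote among the k closest labels.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Two well-separated clusters of toy points.
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_x, train_y, (0.5, 0.5)))  # "a"
print(knn_predict(train_x, train_y, (5.5, 5.5)))  # "b"
```

The brute-force distance scan here is O(n) per query; scikit-learn’s version can switch to KD-trees or ball trees for larger datasets.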

The toolkit is being actively used by organizations like Spotify, Inria, and J.P. Morgan to enhance their linear algebra and statistical analysis.

  9. Apache MXNet

Apache MXNet is an open-source, efficient, and versatile library for deep learning. The library features hybrid front-end support and seamless transitions from other frameworks. MXNet can be extended with Apache’s thriving ecosystem of tools and libraries to enable more real-world use cases like NLP, computer vision, etc. It also features scalable and distributed training and performance optimization with its dual Parameter Server support. 

MXNet offers bindings for eight languages, including C++, Java, Julia, Python, Perl, R, and Go. You can follow Apache’s guide to build and install MXNet from the official website.

Many companies, such as Intel, Amazon, Baidu, and Wolfram Research, along with Carnegie Mellon University, use and contribute to MXNet, though its community remains smaller than those of other major frameworks.

  10. Microsoft Cognitive Toolkit

The Microsoft Cognitive Toolkit (CNTK) is an open-source toolkit for distributed deep learning that describes neural networks as a series of computational steps via a directed graph. With CNTK, users can combine and work with multiple model types, such as convolutional neural networks (CNNs), deep neural networks (DNNs), long short-term memory (LSTM) networks, and recurrent neural networks (RNNs).

CNTK is accessible as a library from Python, C++, or C#, and can also be used as a standalone tool with its own language, BrainScript. The toolkit supports 64-bit Windows and Linux operating systems. You can install the pre-compiled binary package or compile the toolkit from source on GitHub.

Famous companies like Delta Air Lines, General Electric, Bain & Company, and many others use CNTK for personalized analytics.


Top Machine Learning (ML) Research Papers Released in 2022


Machine learning (ML) has gained much traction in recent years owing to the disruption and development it brings to existing technologies. Every month, hundreds of ML papers from various organizations and universities are uploaded to the internet to share the latest breakthroughs in the domain. As the year ends, we bring you the top 22 ML research papers of 2022 that created a huge impact on the industry. The list does not reflect a ranking; the papers were selected on the basis of the recognitions and awards they received at international machine learning conferences.

  1. Bootstrapped Meta-Learning

Meta-learning is a promising field that investigates ways to enable machine learners or RL agents (including their hyperparameters) to learn how to learn in a quicker and more robust manner, and it is a crucial area of study for enhancing the efficiency of AI agents.

This 2022 ML paper presents an algorithm that teaches the meta-learner how to overcome the meta-optimization challenge and myopic meta-objectives. The algorithm’s primary objective is meta-learning using gradients, which ensures improved performance. The paper also examines the potential benefits of bootstrapping. The authors highlight several interesting theoretical properties of the algorithm, and the empirical results achieve a new state of the art (SOTA) on the Atari ALE benchmark as well as increased efficiency in multitask learning.

  2. Competition-level code generation with AlphaCode

One of the most exciting uses for deep learning and large language models is programming. The rising need for coders has sparked a race to build tools that can increase developer productivity and give non-developers tools to create software. However, these models still perform badly when tested on more challenging, unforeseen problems that require more than just converting instructions into code.

This popular ML paper of 2022 introduces AlphaCode, a code generation system that achieved an average ranking in the top 54.3% in simulated evaluations of programming contests on the Codeforces platform. The paper describes the architecture, training, and testing of the deep learning model.

  3. Restoring and attributing ancient texts using deep neural networks

The epigraphic evidence of the ancient Greek era (inscriptions made on durable materials such as stone and pottery) was often already damaged by the time it was discovered, rendering the inscribed texts incomprehensible. Machine learning can help restore such inscriptions and identify their chronological and geographical origins, helping us better understand our past.

This ML paper proposed Ithaca, a machine learning model built by DeepMind for the textual restoration and geographical and chronological attribution of ancient Greek inscriptions. Ithaca was trained on a database of just under 80,000 inscriptions from the Packard Humanities Institute. It achieved 62% accuracy, compared with historians, who averaged 25% accuracy working alone. When historians used Ithaca, their accuracy rose to 72%.

  4. Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer

Large neural networks consume enormous resources in hyperparameter tuning, since each candidate configuration must be evaluated at full scale. This groundbreaking 2022 ML paper suggests a novel zero-shot hyperparameter tuning paradigm for tuning massive neural networks more effectively. The research, co-authored by Microsoft Research and OpenAI, describes a method called µTransfer that leverages µP to zero-shot-transfer hyperparameters from small models, producing nearly optimal hyperparameters on large models without tuning them directly.

This method has been found to reduce the amount of trial and error necessary in the costly process of training large neural networks. By drastically lowering the need to predict which training hyperparameters to use, this approach speeds up research on massive neural networks like GPT-3 and perhaps its successors in the future.

  5. PaLM: Scaling Language Modeling with Pathways

Large neural networks trained for language generation and understanding have demonstrated outstanding results across various tasks in recent years. This trending 2022 ML paper introduced the Pathways Language Model (PaLM), a 540-billion-parameter, dense decoder-only autoregressive transformer trained on 780 billion tokens of high-quality text.

Although PaLM uses only a decoder and makes changes such as SwiGLU activations, parallel layers, multi-query attention, RoPE embeddings, shared input-output embeddings, and no bias terms, it is based on a standard transformer model architecture. The paper describes Google’s flagship model surpassing several human baselines while achieving state-of-the-art results on numerous zero-, one-, and few-shot NLP tasks.

  6. Robust Speech Recognition via Large-Scale Weak Supervision

Machine learning developers have found it challenging to build speech-processing systems trained to predict transcripts for the vast volume of audio on the internet. This year, OpenAI released Whisper, a new state-of-the-art (SOTA) speech-to-text model that can transcribe audio and translate it into English. It was trained on 680,000 hours of voice data gathered from the internet. According to OpenAI, the model is robust to accents, background noise, and technical terminology. It supports transcription in 99 different languages and translation from those languages into English.

The OpenAI ML paper mentions that the authors ensured about one-third of the audio data was non-English. Maintaining this diversified dataset helped the team outperform other supervised state-of-the-art models.

  7. OPT: Open Pre-trained Transformer Language Models

Large language models have demonstrated extraordinary performance on numerous tasks (e.g., zero- and few-shot learning). However, these models are difficult to replicate without considerable funding due to their high computing costs. Even where the public can interact with these models through paid APIs, full research access remains available only to a select group of well-funded labs. This limited access has hindered researchers’ ability to understand how and why these language models work, stalling progress on efforts to improve their robustness and reduce ethical drawbacks like bias and toxicity.

This popular 2022 ML paper introduces Open Pre-trained Transformers (OPT), a suite of decoder-only pre-trained transformers with 125 million to 175 billion parameters, which the authors share freely and responsibly with interested researchers. The biggest model, OPT-175B (not included in the code repository but accessible upon request), performs comparably to GPT-3 (which also has 175 billion parameters) while using just 15% of GPT-3’s carbon footprint during development and training.

  1. A Path Towards Autonomous Machine Intelligence

Yann LeCun is a prominent and respected researcher in the field of artificial intelligence and machine learning. In June, his much-anticipated paper “A Path Towards Autonomous Machine Intelligence” was published on OpenReview. In it, LeCun offers a number of approaches and architectures that might be combined to create self-supervised autonomous machines.

He presents a modular architecture for autonomous machine intelligence that combines various models operating as distinct elements of a machine’s brain, mirroring the animal brain. Because all the modules are differentiable, they can be interconnected to power brain-like activities such as identification and environmental response. The architecture incorporates ideas like a configurable predictive world model, behavior driven by intrinsic motivation, and hierarchical joint embedding architectures trained with self-supervised learning.
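The core of the joint-embedding idea is energy-based: a predictor maps the embedding of an observed context toward the embedding of a target, and compatible pairs should receive low energy. A minimal sketch of that notion, with a toy linear encoder and an identity predictor standing in for trained modules (all names and shapes here are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
enc = rng.normal(size=(6, 3))           # toy shared linear encoder

def energy(x, y, predictor=lambda s: s):
    # energy of a (context, target) pair: distance between the predicted
    # target embedding and the actual target embedding; compatible pairs
    # should score low
    diff = predictor(x @ enc) - y @ enc
    return float(np.dot(diff, diff))

x = rng.normal(size=6)
noise = rng.normal(scale=2.0, size=6)
print(energy(x, x))           # identical pair -> energy 0.0
print(energy(x, x + noise))   # mismatched pair -> strictly positive energy
```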

  1. LaMDA: Language Models for Dialog Applications 

Despite tremendous advances in text generation, many available chatbots remain rather irritating and unhelpful. This 2022 ML paper from Google describes LaMDA — short for “Language Model for Dialogue Applications” — the system that caused an uproar this summer when former Google engineer Blake Lemoine alleged that it was sentient. LaMDA is a family of large language models for dialog applications built on Google’s Transformer architecture, known for its efficiency and speed in language tasks such as translation. The model’s most intriguing features are its ability to be fine-tuned on human-annotated data and its capability to consult external sources.

The model, which has 137 billion parameters, was pre-trained on 1.56 trillion words from publicly accessible conversation data and web documents. It is then fine-tuned against three metrics: quality, safety, and groundedness.
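The candidate-selection step those metrics enable can be illustrated with a toy ranker: generate several responses, discard those below a safety threshold, then rank the survivors by quality and groundedness. The scores and threshold below are hard-coded stand-ins for LaMDA's fine-tuned classifiers, purely for illustration:

```python
# hypothetical per-candidate scores in [0, 1]; in LaMDA these come from
# fine-tuned classifier heads, here they are hard-coded
candidates = [
    {"text": "It might rain later today.",
     "quality": 0.9, "safety": 0.95, "groundedness": 0.8},
    {"text": "You should definitely skydive without training!",
     "quality": 0.7, "safety": 0.2, "groundedness": 0.5},
    {"text": "Weather exists.",
     "quality": 0.3, "safety": 0.99, "groundedness": 0.9},
]

SAFETY_THRESHOLD = 0.8

def pick_response(cands):
    # filter out unsafe candidates first, then rank the rest
    safe = [c for c in cands if c["safety"] >= SAFETY_THRESHOLD]
    return max(safe, key=lambda c: c["quality"] + c["groundedness"])

print(pick_response(candidates)["text"])  # -> "It might rain later today."
```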

  1. Privacy for Free: How does Dataset Condensation Help Privacy?

One of the primary proposals in the award-winning ML paper is to use dataset condensation methods to retain data efficiency during model training while also providing membership privacy. The authors argue that dataset condensation, which was initially created to increase training effectiveness, is a better alternative to data generators for producing private data since it offers privacy for free. 

Though existing data generators are used to produce differentially private data for model training to minimize unintended data leakage, they result in high training costs or subpar generalization performance for the sake of data privacy. This study was published by Sony AI and received the Outstanding Paper Award at ICML 2022. 
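A stripped-down sketch of the condensation idea, with distribution matching reduced to its simplest possible form — aligning only the first moment of a single class in raw feature space, rather than matching gradients or network embeddings as real condensation methods do:

```python
import numpy as np

rng = np.random.default_rng(0)
real = rng.normal(loc=3.0, size=(500, 16))   # "real" features of one class
syn = rng.normal(size=(10, 16))              # 10 learnable synthetic examples
lr = 0.5

for _ in range(200):
    # move the synthetic set so its mean matches the real data's mean
    diff = syn.mean(axis=0) - real.mean(axis=0)
    syn -= lr * 2.0 * diff / len(syn)        # gradient of ||diff||^2 per point

gap = float(np.sum((syn.mean(axis=0) - real.mean(axis=0)) ** 2))
print(gap)  # ~0: the 10 synthetic points now match the real first moment
```

A model trained on the 10 synthetic points never touches any individual real example, which is the intuition behind the membership-privacy argument.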

  1. TranAD: Deep Transformer Networks for Anomaly Detection in Multivariate Time Series Data

The use of a model that converts time series into anomaly scores at each time step is essential in any system for detecting time series anomalies. Recognizing and diagnosing anomalies in multivariate time series data is critical for modern industrial applications. Unfortunately, developing a system capable of promptly and reliably identifying abnormal observations is challenging. This is attributed to a shortage of anomaly labels, excessive data volatility, and the expectations of modern applications for ultra-low inference times. 

In this study, the authors present TranAD, a deep transformer-network-based anomaly detection and diagnosis model that leverages attention-based sequence encoders to perform inference quickly while remaining aware of broader temporal patterns in the data. TranAD employs adversarial training to achieve stability and focus score-based self-conditioning to enable robust multi-modal feature extraction. Extensive experiments on six publicly available datasets show that TranAD outperforms state-of-the-art baselines in detection and diagnosis while training in a data- and time-efficient manner.
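The score-and-threshold step such a detector performs at each time step can be sketched as follows, with a moving average standing in for TranAD's learned transformer reconstruction and a simple quantile standing in for a tuned threshold:

```python
import numpy as np

rng = np.random.default_rng(0)
series = rng.normal(size=200)
series[120] += 8.0                         # inject one obvious anomaly

def reconstruct(x, k=5):
    # stand-in for the learned model: a centred per-step moving average
    pad = np.pad(x, k, mode="edge")
    return np.array([pad[i:i + 2 * k + 1].mean() for i in range(len(x))])

scores = (series - reconstruct(series)) ** 2   # anomaly score per time step
threshold = np.quantile(scores, 0.99)          # simple quantile threshold
flagged = np.where(scores > threshold)[0]
print(flagged)  # includes index 120
```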

  1. Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding 

In the last few years, generative models called “diffusion models” have become increasingly popular. This year, these models captured the excitement of AI enthusiasts around the world.

Going beyond existing text-to-image systems, this outstanding 2022 ML paper introduced Google’s viral text-to-image diffusion model, Imagen. The model achieves a new state-of-the-art FID score of 7.27 on the COCO dataset by combining the deep language understanding of transformer-based large language models with the photorealistic image-generation capabilities of diffusion models. A frozen, text-only language model provides the text representation, and a diffusion model with two super-resolution upsampling stages, up to 1024×1024, produces the images. It employs several training approaches, including classifier-free guidance, for conditional and unconditional generation. Another important feature of Imagen is dynamic thresholding, which stops the diffusion process from saturating specific areas of the image, a behavior that reduces image quality, particularly when the weight placed on text-conditional generation is large.
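Dynamic thresholding itself is easy to state: at each sampling step, take the p-th percentile s of the predicted image's absolute pixel values, and whenever s exceeds 1, clip to [-s, s] and rescale by s so values stay in [-1, 1]. A sketch, with the percentile chosen for illustration:

```python
import numpy as np

def dynamic_threshold(x0, p=99.5):
    # s: p-th percentile of absolute pixel values; only rescale when s > 1
    s = max(np.percentile(np.abs(x0), p), 1.0)
    return np.clip(x0, -s, s) / s

x = np.random.default_rng(0).normal(scale=3.0, size=(64, 64))
y = dynamic_threshold(x)
print(float(np.abs(y).max()))  # <= 1.0 by construction
```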

  1. No Language Left Behind: Scaling Human-Centered Machine Translation

This ML paper introduced one of the most popular Meta projects of 2022: NLLB-200. The paper describes how Meta’s FAIR lab built and open-sourced this state-of-the-art AI model, which can translate between 200 languages. It covers every aspect of the technology: linguistic analysis, ethical considerations, impact analysis, and benchmarking.

No matter what language a person speaks, accessibility via language ensures that everyone can benefit from the growth of technology. Meta claims that several languages that NLLB-200 translates, such as Kamba and Lao, are not currently supported by any translation systems in use. The tech behemoth also created a dataset called “FLORES-200” to evaluate the effectiveness of the NLLB-200 and show that accurate translations are offered. According to Meta, NLLB-200 offers an average of 44% higher-quality translations than its prior model.

  1. A Generalist Agent

AI pundits believe that multimodality will play a huge role in the future of Artificial General Intelligence (AGI). One of the most talked-about ML papers of 2022, by DeepMind, introduces Gato – a generalist agent. This agent is a multi-modal, multi-task, multi-embodiment network, meaning that the same neural network (i.e., a single architecture with a single set of weights) can perform many different tasks while integrating inherently diverse types of inputs and outputs.

DeepMind claims that the generalist agent can be improved with new data to perform even better on a wider range of tasks. The authors argue that a general-purpose agent reduces the need for hand-crafting policy models for each domain, increases the volume and diversity of training data, and enables continual improvement at the data, compute, and model scales. A general-purpose agent can also be viewed as a first step toward artificial general intelligence.

Gato demonstrates the versatility of transformer-based machine learning architectures by exhibiting their use in a variety of applications. Unlike previous neural network systems tailored to playing games, stacking blocks with a real robot arm, reading words, or captioning images, Gato is versatile enough to perform all of these tasks on its own, using only a single set of weights and a relatively simple architecture.
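One way to picture the single-sequence design: text is already discrete, while continuous observations and actions are discretized into a fixed number of bins and offset past the text vocabulary so token ids never collide. The sketch below uses uniform binning for simplicity (Gato itself applies mu-law companding before binning), and both vocabulary sizes are illustrative:

```python
import numpy as np

VOCAB_TEXT = 32000      # illustrative text-vocabulary size
NUM_BINS = 1024         # continuous values are discretised into 1024 bins

def tokenize_continuous(values, low=-1.0, high=1.0):
    # clip each float to [low, high], map it to one of NUM_BINS bins, and
    # offset past the text vocabulary so ids never collide with text tokens
    v = np.clip(np.asarray(values, dtype=float), low, high)
    bins = np.round((v - low) / (high - low) * (NUM_BINS - 1)).astype(int)
    return (VOCAB_TEXT + bins).tolist()

def build_sequence(text_ids, observation, action):
    # one flat token sequence: text, then observation, then action
    return (list(text_ids)
            + tokenize_continuous(observation)
            + tokenize_continuous(action))

seq = build_sequence([17, 512], observation=[0.0, -1.0, 1.0], action=[0.25])
print(seq)
```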

  1. The Forward-Forward Algorithm: Some Preliminary Investigations 

AI pioneer Geoffrey Hinton is known for his foundational work on deep convolutional neural networks and backpropagation. In his latest paper, presented at NeurIPS 2022, Hinton proposed the “forward-forward algorithm,” a new learning algorithm for artificial neural networks based on our understanding of neural activations in the brain. The approach draws inspiration from Boltzmann machines (Hinton and Sejnowski, 1986) and noise-contrastive estimation (Gutmann and Hyvärinen, 2010). According to Hinton, forward-forward, which is still experimental, replaces the forward and backward passes of backpropagation with two forward passes: one with positive data and the other with negative data that the network itself could generate. Further, the algorithm could map more naturally onto low-power hardware and may better explain the brain’s cortical learning process.

Without employing complicated regularizers, the algorithm achieved a 1.4 percent test error rate on the MNIST dataset in an empirical study, showing that it can be nearly as effective as backpropagation.

The paper also suggests a novel “mortal computation” model that could enable the forward-forward algorithm and help explain the brain’s energy-efficient processes.
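A single forward-forward layer can be sketched directly: its “goodness” is the sum of squared activations, pushed above a threshold θ on positive data and below it on negative data, using only locally computed gradients and no backward pass through other layers. The layer sizes, learning rate, and synthetic positive/negative data below are all illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(4, 8))   # one linear layer, ReLU activations
theta, lr = 2.0, 0.03

def goodness(x):
    h = np.maximum(x @ W, 0.0)
    return (h ** 2).sum(axis=1), h       # per-sample sum of squared activities

# positive data and (synthetic) negative data
x_pos = rng.normal(loc=0.5, size=(32, 4))
x_neg = rng.normal(loc=-0.5, size=(32, 4))

for _ in range(200):
    for x, sign in ((x_pos, 1.0), (x_neg, -1.0)):
        g, h = goodness(x)
        p = 1.0 / (1.0 + np.exp(-sign * (g - theta)))  # P(correct side of theta)
        dg = -sign * (1.0 - p)                         # d(-log p)/d(goodness)
        # local gradient step on this layer only -- no backward pass
        W -= lr * (x.T @ (dg[:, None] * 2.0 * h)) / len(x)

print(goodness(x_pos)[0].mean(), goodness(x_neg)[0].mean())
```

After training, positive inputs produce markedly higher goodness than negative ones, which is the layer-local separation the algorithm relies on.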

  1. Focal Modulation Networks

In humans, the ciliary muscles alter the shape of the eye’s lens, and hence its radius of curvature, to focus on near or distant objects. Changing the shape of the lens changes its focal length. Mimicking this focal modulation behavior in computer vision systems can be tricky.

This machine learning paper introduces FocalNet, an architecture that replaces self-attention with a focal modulation mechanism for modeling token interactions in vision. Its attention-free design outperforms SotA self-attention (SA) techniques across a wide range of visual benchmarks. According to the paper, focal modulation consists of three parts: 

a. hierarchical contextualization, implemented using a stack of depth-wise convolutional layers, to encode visual contexts from close-up to a great distance; 

b. gated aggregation to selectively gather contexts for each query token based on its content; and  

c. element-wise modulation or affine modification to inject the gathered context into the query.
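A toy 1-D rendition of those three parts, with a per-channel moving average standing in for the depth-wise convolutions and randomly initialized projections (all shapes illustrative):

```python
import numpy as np

def depthwise_avg(x, k):
    # depth-wise convolution simplified to a per-channel moving average
    n = len(x)
    return np.array([x[max(0, i - k):i + k + 1].mean(axis=0) for i in range(n)])

def focal_modulation(x, levels=(1, 2, 4), seed=0):
    n, c = x.shape
    rng = np.random.default_rng(seed)
    q_proj = rng.normal(scale=0.1, size=(c, c))           # query projection
    gate_proj = rng.normal(scale=0.1, size=(c, len(levels)))
    # 1) hierarchical contextualization: contexts at growing receptive fields
    ctx, contexts = x, []
    for k in levels:
        ctx = np.maximum(depthwise_avg(ctx, k), 0.0)
        contexts.append(ctx)
    # 2) gated aggregation: per-token gates decide how much of each level to keep
    gates = 1.0 / (1.0 + np.exp(-(x @ gate_proj)))        # shape (n, levels)
    agg = sum(g[:, None] * c_ for g, c_ in zip(gates.T, contexts))
    # 3) element-wise modulation of the projected query
    return (x @ q_proj) * agg

y = focal_modulation(np.random.default_rng(1).normal(size=(6, 8)))
print(y.shape)  # (6, 8): same shape as the input token map
```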

  1. Learning inverse folding from millions of predicted structures

The field of structural biology is being fundamentally changed by cutting-edge machine learning technologies, protein structure prediction, and innovative ultrafast structural aligners. Time and money are no longer obstacles to obtaining precise protein models and extensively annotating their functions. However, determining a protein sequence from its backbone atom coordinates has remained a challenge. To date, machine learning approaches to this problem have been constrained by the number of experimentally determined protein structures available.

In this ICML Outstanding Paper (Runner Up), the authors tackle this problem by expanding the training data by almost three orders of magnitude, using AlphaFold2 to predict structures for 12 million protein sequences. With this additional data, a sequence-to-sequence transformer with invariant geometric input processing layers recovers the native sequence on structurally held-out backbones in 51% of cases and recovers buried residues in 72% of cases, an improvement of over 10% over previous techniques. The approach also generalizes to a range of more difficult tasks, including designing protein complexes, partially masked structures, binding interfaces, and multiple conformational states.

  1. MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge

Within the AI research community, video games have become a popular training medium for AI. Autonomous agents have had great success in Atari games, StarCraft, Dota, and Go. Although these developments have gained popularity in artificial intelligence research, the agents do not generalize beyond a narrow range of activities, in contrast to humans, who continually learn from open-ended tasks.

This thought-provoking 2022 ML paper proposes MineDojo, a unique framework for embodied-agent research built on the popular game Minecraft. MineDojo offers a simulation suite with thousands of open-ended tasks, along with an internet-scale knowledge base of Minecraft videos, tutorials, wiki pages, and forum discussions. Using MineDojo’s data, the authors propose a novel agent-learning methodology that employs large pre-trained video-language models as a learned reward function. Without requiring an explicitly designed dense shaping reward, the MineDojo autonomous agent can perform a wide range of open-ended tasks specified in free-form language.

  1. Is Out-of-Distribution Detection Learnable?

Supervised machine learning models are frequently trained under the closed-world assumption that the distribution of the test data resembles that of the training data. This assumption often fails in real-world deployments, causing a considerable decline in performance. While this performance loss is acceptable for applications like product recommendations, developing an out-of-distribution (OOD) detection algorithm is crucial to preventing ML systems from making inaccurate predictions in settings where the data distribution drifts over time (e.g., self-driving cars).

In this paper, the authors explore the probably approximately correct (PAC) learning theory of OOD detection, previously posed as an open problem, to study when OOD detection can be learned. They first identify a necessary condition for the learnability of OOD detection. They then prove several impossibility theorems for the learnability of OOD detection in a handful of different scenarios.
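The paper itself is theoretical, but a concrete sense of what an OOD detector computes can be had from the classic maximum-softmax-probability baseline of Hendrycks and Gimpel (a standard baseline, not the paper's own construction): score each input by the classifier's largest softmax probability and flag low-confidence inputs as OOD:

```python
import numpy as np

def msp_score(logits):
    # maximum softmax probability: high for confident in-distribution
    # predictions, lower for uncertain (potentially OOD) inputs
    z = logits - logits.max()          # stabilise the exponentials
    p = np.exp(z) / np.exp(z).sum()
    return float(p.max())

in_dist = np.array([8.0, 0.1, -1.0, 0.3])   # one class clearly dominates
ood = np.array([0.2, 0.1, 0.15, 0.25])      # near-uniform logits

tau = 0.5                                   # detection threshold (illustrative)
print(msp_score(in_dist) >= tau)   # True  -> treated as in-distribution
print(msp_score(ood) >= tau)       # False -> flagged as OOD
```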

  1. Gradient Descent: The Ultimate Optimizer 

Gradient descent is a popular optimization approach for training machine learning models and neural networks. The ultimate aim of any machine learning (neural network) method is to optimize parameters, but selecting the ideal step size for an optimizer is difficult since it entails lengthy and error-prone manual work. Many strategies exist for automated hyperparameter optimization; however, they often incorporate additional hyperparameters to govern the hyperparameter optimization process. In this study, MIT CSAIL and Meta researchers offer a unique approach that allows gradient descent optimizers like SGD and Adam to tweak their hyperparameters automatically.

They propose learning the hyperparameters themselves by gradient descent, learning the hyper-hyperparameters by gradient descent as well, and so on ad infinitum. The paper describes an efficient approach for allowing gradient descent optimizers to autonomously adjust their own hyperparameters, which can be stacked recursively to many levels. As these gradient-based optimizer towers grow in size, they become substantially less sensitive to the choice of top-level hyperparameters, reducing the burden on the user to search for optimal values.
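On a toy quadratic loss, the idea reduces to a few lines: alongside the usual parameter update, take a gradient step on the learning rate itself, using the fact that the hypergradient is (minus) the dot product of consecutive gradients. This is a hand-rolled scalar sketch of the general recipe, not the paper's implementation:

```python
def grad(w):
    return 2.0 * w                 # gradient of the toy loss f(w) = w**2

w, lr, hyper_lr = 5.0, 0.01, 0.001
g_prev = grad(w)
for _ in range(100):
    g = grad(w)
    # hypergradient step: d f(w_t)/d lr = -g_t * g_{t-1}, so increase lr
    # while consecutive gradients agree and shrink it when they disagree
    lr += hyper_lr * g * g_prev
    w -= lr * g
    g_prev = g

print(w, lr)   # w converges toward 0; lr has adapted away from its initial 0.01
```

Starting from a deliberately tiny learning rate of 0.01, the hypergradient steps grow it to a workable value on their own, which is exactly the reduced sensitivity to the top-level choice the paper reports.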

  1. ProcTHOR: Large-Scale Embodied AI Using Procedural Generation 

Embodied AI, a growing research field influenced by recent advances in artificial intelligence, machine learning, and computer vision, studies agents that learn by perceiving and acting within an environment. The paper proposes ProcTHOR, a framework for the procedural generation of Embodied AI environments. ProcTHOR lets researchers sample arbitrarily large datasets of diverse, interactive, customizable, and performant virtual environments in order to train and evaluate embodied agents on navigation, interaction, and manipulation tasks.

According to the authors, models trained on ProcTHOR using only RGB images, without any explicit mapping or human task supervision, achieve state-of-the-art results on six embodied AI benchmarks for navigation, rearrangement, and arm manipulation, including the ongoing Habitat 2022, AI2-THOR Rearrangement 2022, and RoboTHOR challenges. The paper received an Outstanding Paper award at NeurIPS 2022.

  1. A Commonsense Knowledge Enhanced Network with Retrospective Loss for Emotion Recognition in Spoken Dialog

Emotion Recognition in Spoken Dialog (ERSD) has recently attracted a lot of attention due to the growth of open conversational data, as integrating emotional states into intelligent spoken human-computer interaction has produced excellent speech recognition systems. It has also been demonstrated that recognizing emotions makes it possible to track the development of human-computer interactions, allowing conversational strategies to be adjusted dynamically and influencing outcomes (e.g., customer feedback). However, the volume of current ERSD datasets restricts model development. 

This ML paper proposes a Commonsense Knowledge Enhanced Network (CKE-Net) with a retrospective loss to carry out dialog modeling, external knowledge integration, and historical state retrospect hierarchically. 


Meta Sues Voyager Labs in a Lawsuit for Allegedly Creating Fake Accounts to Scrape Data


Meta has filed a lawsuit against Voyager Labs, claiming that the company set up bogus Facebook accounts to gather data from real Facebook users, which it later used for its own business needs. As per the filing in the District Court for the Northern District of California, Meta claims there were over 38,000 fake Facebook accounts.

The complaint said, “Meta seeks damages and injunctive relief to stop Defendant’s use of its platforms and services.”

Voyager Labs specializes in investigative tools and services that assist law enforcement and businesses in learning more about suspects. Meta claimed that Voyager Labs improperly collected data not only from Facebook but also from Instagram, Twitter, YouTube, Telegram, and other websites to fuel its software.

Read More: Neeva AI Restructuring Consumer Search Practices with its AI-Powered Engine like You.com

Over 60,000 Voyager Labs-related Facebook and Instagram identities and pages, including the 38,000 fraudulent accounts, were eventually removed by Meta.
Meta’s complaint is similar to a data-scraping case between Microsoft-owned LinkedIn and the analytics firm hiQ, which was settled in December 2022 for US$500,000. Moreover, in September 2022, Meta settled another case with BrandTotal and Unimania, which ultimately stopped “using and scraping Instagram and Facebook.”


Neeva AI Restructuring Consumer Search Practices with its AI-Powered Engine like You.com


Neeva AI, a consumer search engine, has launched an AI-powered engine that uses large language models and an independent search stack to leverage search capabilities. Neeva’s interface works somewhat similarly to Google’s Featured Snippets.

Neeva AI takes inspiration from You.com’s recently viral YouChat, where users simply type their queries and the chatbot generates replies through a ChatGPT-like interface. Neeva AI differs in that it replies, “AI can’t answer,” when it does not know the answer, instead of generating factually incorrect responses.

To get started, you must create a Neeva account; without one, it continues to function as a standard search engine. However, not all queries generate an answer from Neeva’s AI. Generally, you only receive AI search results for questions that could benefit from AI-generated responses.

Read More: Uncovering The Twitter Files

Essentially, Neeva AI currently responds only to certain inquiries; however, co-founder Sridhar Ramaswamy states that Neeva AI will respond to additional searches “in the coming months.”

Neeva was founded by Sridhar Ramaswamy and Vivek Raghunathan with a vision to provide a search engine that keeps users’ information safe. Neeva AI search engine looks for information on the web and personal files like emails. The consumer search engine will not show any advertisements or collect any profit from user data.


The Labour Commissioner’s office in Pune summons Amazon over layoffs

The Labour Commissioner’s office in Pune has sent a summons to e-commerce giant Amazon and the Nascent Information Technology Employees Senate (NITES) regarding the company’s massive layoffs and its voluntary separation policy.

As per the summons, the Assistant Labour Commissioner requested Amazon and its union representatives to be present at the commissioner’s office on January 17 at 3 PM. The commissioner will take necessary action following an inquiry into Amazon’s allegedly unethical and illegal layoffs.

Harpreet Singh Saluja, President of NITES, stated that the livelihoods of 1,000 employees and their families have now been made vulnerable. He also mentioned that, under the procedures of the Industrial Disputes Act, an employee can be laid off only with prior permission from the appropriate government.

Read more: Hyundai Motors India unveils Hyundai Pavilion on metaverse space

Amazon issued a voluntary separation policy to its employees in November 2022, enabling them to resign voluntarily. Employees who did not apply for the policy were included in the workforce optimization announced by Amazon.

As per Saluja, Amazon violated Indian labour laws because its voluntary separation policy was never submitted to the Labour Ministry for review. An employee who has served continuously in a company for a year can be laid off only if given three months’ notice in advance.
