
Top Deep Learning Frameworks


Deep learning, a vital element of data science, is a class of machine learning that works similarly to the human brain. Put simply, it can be viewed as a form of predictive modeling, a statistical technique for predicting future states. Many of the top deep learning frameworks make this process quicker and more straightforward, which benefits data scientists who gather, analyze, and interpret massive amounts of data.

Deep learning models differ from traditional machine learning models in their linearity. While traditional models are linear, deep learning models stack algorithms in complex hierarchies (layers). Each algorithm in the hierarchy applies a nonlinear transformation to its input and uses what it learns to produce a statistical model as output. Iterations continue until the output reaches an acceptable level of accuracy. The word "deep" thus refers to the multiple processing layers the data must pass through.
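
The stacked-layers idea can be sketched in a few lines of NumPy (an illustrative toy, not any particular framework): each layer applies a linear map followed by a nonlinear transformation, and the layers are composed into a hierarchy.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, w, b):
    """One processing layer: a linear map followed by a nonlinear (ReLU) transformation."""
    return np.maximum(0.0, x @ w + b)

# Three stacked layers: the input is transformed nonlinearly at each level of the hierarchy.
x = rng.normal(size=(4, 8))                      # a batch of 4 samples with 8 features
w1, b1 = rng.normal(size=(8, 16)), np.zeros(16)
w2, b2 = rng.normal(size=(16, 16)), np.zeros(16)
w3, b3 = rng.normal(size=(16, 2)), np.zeros(2)

h = layer(layer(layer(x, w1, b1), w2, b2), w3, b3)
print(h.shape)  # (4, 2)
```

In a real framework, training would iteratively adjust the weights until the output is accurate enough; here the weights are random and only the forward pass is shown.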

Convolutional neural networks (CNNs), recurrent neural networks (RNNs), long short-term memory (LSTM) networks, and deep belief networks (DBNs) are popular deep learning architectures used across commercial applications and use cases such as NLP and computer vision. This article lists some of the top deep learning frameworks for building them.

Top deep learning frameworks

  1. Sonnet

Sonnet is an advanced deep learning framework built on top of TensorFlow by Google's DeepMind. It is a foundational framework thanks to its versatility and adaptability for creating even higher-level frameworks. Sonnet does not replace TensorFlow for DL tasks but makes it easier to construct neural networks. Previously, developers had to be proficient with the underlying TensorFlow graphs; now all they have to do is construct Python objects and connect them to form a TensorFlow computation graph.

It features multiple pre-defined modules such as snt.Linear, snt.Conv2D, and snt.BatchNorm. Users can also write custom modules and submodules to create models. These models integrate readily with raw TensorFlow code or code written in other high-level libraries.

  2. Eclipse Deeplearning4j

Deeplearning4j is a comprehensive suite of tools for deep learning. The framework is highly compatible with the JVM and lets you train models from Java while incorporating the Python ecosystem. It offers broad model support, CPython bindings, and interoperability with multiple runtimes such as ONNX Runtime.

Deeplearning4j comes with numerous submodules, such as SameDiff (for complex graphs), ND4J (NumPy++ for Java), LibND4J (a C++ library), and Python4J (for easy deployment of Python scripts). You may use Deeplearning4j either as an addition to your current Python and C++ workflows or as a standalone library to create and deploy models.

Several renowned enterprises, including U.S. Bank, Nuix, and Teladoc Health, are among the hundreds of organizations that use Deeplearning4j.

  3. TensorFlow

TensorFlow is an open-source framework developed by the Google Brain team for internal research and production. The end-to-end platform lets developers build and deploy machine learning models with built-in support for neural network execution. The framework scales by training across multiple resources simultaneously and offers seamless support for transitioning from shared memory to distributed memory.

TensorFlow supports a variety of programming languages. It is most frequently used from Python, its most stable API, with support for other languages including JavaScript, C++, and Java. This flexibility enables a broader range of industrial applications.

Enterprises like Airbnb, eBay, AirBus, DropBox, Snapchat, and others actively utilize TensorFlow for text-based services, image recognition, voice search, etc.

  4. PyTorch

PyTorch is an open-source framework, and one of the top deep learning frameworks, based on Torch, a package containing data structures for multi-dimensional tensors. The framework, developed by Facebook's AI Research lab, accelerates the path from prototyping to production deployment, with a particular focus on computer vision and NLP tasks.

PyTorch computations resemble NumPy's, but instead of NumPy arrays it uses tensors, n-dimensional arrays that can also live on the GPU for massive speedups. It provides an autograd package for automatic gradient computation, a set of modules for neural network functionality, and many other features.
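
As a small example of the autograd package at work, PyTorch tracks operations on tensors and computes gradients automatically:

```python
import torch

# A tensor with gradient tracking enabled
x = torch.tensor([2.0, 3.0], requires_grad=True)

# A simple computation: y = sum(x^2)
y = (x ** 2).sum()

# autograd computes dy/dx = 2x
y.backward()
print(x.grad)  # tensor([4., 6.])
```

The same mechanism scales to full neural networks: every parameter's gradient is computed with one `backward()` call.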

AMD, Intel, Pfizer, NVIDIA, Intuitive Surgical, and many other established enterprises use PyTorch for CRM, marketing, campaign management, and other applications.

  5. Caffe2

Caffe, created by Berkeley AI Research (BAIR), is one of the top deep learning frameworks, prized for its modularity, performance, and expressiveness. Caffe enables automation, image processing, statistical analysis, and other tasks on massive data sets. Meta's research division developed Caffe2 as an advanced version of the earlier framework, improving flexibility, robustness, and scalability to make deep learning more straightforward.

Caffe2 is compatible with native Python and C++ APIs that are interchangeable, allowing for rapid prototyping and simple optimization. It can also integrate with Android Studio, XCode, or Microsoft Visual Studio for mobile development.

Many organizations like Snap Inc, Qualcomm, Meridian Health, Thermo Fisher Scientific, and others use Caffe2 for its deep learning capabilities. 

  6. Kaldi

Kaldi is an open-source, C++-based toolkit designed specifically for speech recognition. The framework is highly flexible and efficient at training acoustic models (statistical representations of audio). Kaldi compiles against the OpenFst toolkit and provides several "recipes" for model training.

The framework has bindings for Python, MATLAB, Java, and other programming languages. Many companies like Microsoft, Google, IBM, Apple, and others use Kaldi under the Apache 2.0 license.

  7. Theano

Theano is a Python-based library built on NumPy that enables you to define, optimize, and evaluate mathematical expressions for deep learning with multi-dimensional arrays. Named after Theano, an ancient Greek mathematician, the library was introduced by the Montreal Institute for Learning Algorithms in 2007. It features transparent use of the GPU for data-intensive computations, avoids subtle bugs in complex computations, supports dynamic C code generation, and includes tools for detecting potential problems.

Theano functions in three stages: first, it defines the objects or variables; next, it progresses through phases to determine the mathematical expressions; and finally, it evaluates expressions by receiving input values.

Companies like Vizual.ai, Vuclip, Zetaops, and others actively use Theano in their technology stacks. 

  8. Scikit-learn

Next on our list of top deep learning frameworks is Scikit-learn, a Python-based toolkit that began in 2007 as a Google Summer of Code project by David Cournapeau. It was designed to support machine learning and artificial intelligence algorithms and works in conjunction with libraries such as NumPy, SciPy, Matplotlib, and Pandas.

While its deep learning support is limited to basic neural networks, it is widely used for statistical modeling with regression, classification (e.g., K-nearest neighbors), clustering (K-means and K-means++), preprocessing, and dimensionality reduction.
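
A K-nearest-neighbors classifier, one of the classification methods mentioned above, takes only a few lines with scikit-learn's estimator API:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load a built-in dataset and hold out a test split
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit a 3-nearest-neighbors classifier and evaluate it
clf = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(f"accuracy: {accuracy:.2f}")
```

Every scikit-learn estimator follows the same fit/predict/score pattern, which is a large part of the library's appeal.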

The toolkit is being actively used by organizations like Spotify, Inria, and J.P. Morgan to enhance their linear algebra and statistical analysis.

  9. Apache MXNet

Apache MXNet is an open-source, efficient, and versatile library for deep learning. The library features hybrid front-end support and seamless transitions from other frameworks. MXNet can be extended with Apache’s thriving ecosystem of tools and libraries to enable more real-world use cases like NLP, computer vision, etc. It also features scalable and distributed training and performance optimization with its dual Parameter Server support. 

MXNet offers eight language bindings, including C++, Java, Julia, Python, Perl, R, and Go. You can follow Apache's guide on the official website to build and install MXNet.

Many companies, including Intel, Amazon, Baidu, Carnegie Mellon, and Wolfram Research, use MXNet and contribute to its community, though that community remains relatively small.

  10. Microsoft Cognitive Toolkit

Microsoft Cognitive Toolkit (CNTK) is an open-source toolkit for distributed deep learning that describes neural networks as a series of computational steps via a directed graph. With CNTK, users can combine and work with multiple model types, such as convolutional neural networks (CNNs), deep neural networks (DNNs), long short-term memory (LSTM) networks, and recurrent neural networks (RNNs).

CNTK is available as a library in Python, C++, and C#, and can also be used as a standalone tool with its own description language (BrainScript). The toolkit supports 64-bit Windows and Linux operating systems. You can install a pre-compiled binary package or compile the toolkit from source on GitHub.

Many famous companies like Delta Air, General Electric, Bain & Company, and many others use CNTK for personalized analytics.


Top Machine Learning (ML) Research Papers Released in 2022


Machine learning (ML) has gained much traction in recent years owing to the disruption and development it brings to existing technologies. Every month, hundreds of ML papers from various organizations and universities are uploaded to the internet, sharing the latest breakthroughs in the domain. As the year ends, we bring you the top 22 ML research papers of 2022 that made a huge impact on the industry. The list does not reflect a ranking; the papers were selected on the basis of the recognition and awards they received at international machine learning conferences.

  1. Bootstrapped Meta-Learning

Meta-learning is a promising field that investigates how to enable machine learners or RL agents (including their hyperparameters) to learn how to learn more quickly and robustly, and it is a crucial study area for enhancing the efficiency of AI agents.

This 2022 ML paper presents an algorithm that teaches the meta-learner how to overcome the meta-optimization challenge and myopic meta-objectives. The algorithm's primary mechanism is meta-learning using gradients, which ensures improved performance. The paper also examines the potential benefits of bootstrapping. The authors highlight several interesting theoretical properties of the algorithm, and the empirical results achieve a new state of the art (SOTA) on the ATARI ALE benchmark as well as improved efficiency in multitask learning.

  2. Competition-level code generation with AlphaCode

One of the most exciting uses for deep learning and large language models is programming. The rising need for coders has sparked a race to build tools that increase developer productivity and give non-developers the means to create software. However, these models still perform poorly when tested on more challenging, unforeseen problems that require more than simply converting instructions into code.

This popular 2022 ML paper introduces AlphaCode, a code generation system that achieved an average ranking in the top 54.3% in simulated evaluations of programming contests on the Codeforces platform. The paper describes the architecture, training, and evaluation of the deep learning model.

  3. Restoring and attributing ancient texts using deep neural networks

The epigraphic evidence of ancient Greece — inscriptions on durable materials such as stone and pottery — was often already damaged when discovered, rendering the inscribed texts incomprehensible. Machine learning can help restore damaged inscriptions and identify their chronological and geographical origins, helping us better understand our past.

This ML paper proposes Ithaca, a machine learning model built by DeepMind for the textual restoration and geographical and chronological attribution of ancient Greek inscriptions. Ithaca was trained on a database of just under 80,000 inscriptions from the Packard Humanities Institute. It achieved 62% accuracy, compared with 25% on average for historians working alone; when historians worked with Ithaca, their accuracy quickly rose to 72%.

  4. Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer

Large neural networks are costly to tune because each experiment requires estimating anew which hyperparameters to use. This groundbreaking 2022 ML paper suggests a novel zero-shot hyperparameter tuning paradigm for tuning massive neural networks more effectively. The research, co-authored by Microsoft Research and OpenAI, describes a method called µTransfer that leverages µP to zero-shot transfer hyperparameters from small models, yielding nearly optimal hyperparameters on large models without tuning them directly.

This method has been found to reduce the amount of trial and error necessary in the costly process of training large neural networks. By drastically lowering the need to predict which training hyperparameters to use, this approach speeds up research on massive neural networks like GPT-3 and perhaps its successors in the future.

  5. PaLM: Scaling Language Modeling with Pathways

Large neural networks trained for language generation and understanding have demonstrated outstanding results on various tasks in recent years. This trending 2022 ML paper introduced the Pathways Language Model (PaLM), a 540-billion-parameter dense decoder-only autoregressive transformer trained on 780 billion tokens of high-quality text.

Although PaLM is based on a typical transformer architecture, it is decoder-only and makes changes such as SwiGLU activations, parallel layers, multi-query attention, RoPE embeddings, shared input-output embeddings, and the removal of biases. The paper describes Google's flagship model surpassing several human baselines while achieving state-of-the-art results on numerous zero-, one-, and few-shot NLP tasks.

  6. Robust Speech Recognition via Large-Scale Weak Supervision

Machine learning developers have found it challenging to build speech-processing systems trained on the vast volume of audio transcripts on the internet. This year, OpenAI released Whisper, a new state-of-the-art (SotA) speech-to-text model that can transcribe audio and translate it across several languages. It was trained on 680,000 hours of voice data gathered from the internet. According to OpenAI, the model is robust to accents, background noise, and technical terminology, and it supports transcription in 99 languages as well as translation from those languages into English.

The OpenAI ML paper mentions that the authors ensured about one-third of the audio data was non-English. Maintaining a diversified dataset helped the team outperform other supervised state-of-the-art models.

  7. OPT: Open Pre-trained Transformer Language Models

Large language models have demonstrated extraordinary performance on numerous tasks (e.g., zero- and few-shot learning). However, these models are difficult to replicate without considerable funding due to their high computing costs. Even where the public can interact with them through paid APIs, complete research access is still only available to a select group of well-funded labs. This limited access has hindered researchers' ability to understand how and why these language models work, stalling progress on efforts to improve their robustness and reduce ethical drawbacks like bias and toxicity.

The popular 2022 ML paper introduces Open Pre-trained Transformers (OPT), a suite of decoder-only pre-trained transformers ranging from 125 million to 175 billion parameters that the authors share freely and responsibly with interested researchers. The biggest model, OPT-175B (not included in the code repository but accessible upon request), performs comparably to GPT-3 (which also has 175 billion parameters) while using just 15% of GPT-3's carbon footprint during development and training.

  8. A Path Towards Autonomous Machine Intelligence

Yann LeCun is a prominent and respected researcher in artificial intelligence and machine learning. In June, his much-anticipated paper "A Path Towards Autonomous Machine Intelligence" was published on OpenReview. In it, LeCun offers a number of approaches and architectures that might be combined to create self-supervised autonomous machines.

He presented a modular architecture for autonomous machine intelligence that combines various models to operate as distinct elements of a machine’s brain and mirror the animal brain. Due to the differentiability of all the models, they are all interconnected to power certain brain-like activities, such as identification and environmental response. It incorporates ideas like a configurable predictive world model, behavior-driven through intrinsic motivation, and hierarchical joint embedding architectures trained with self-supervised learning. 

  9. LaMDA: Language Models for Dialog Applications

Despite tremendous advances in text generation, many available chatbots remain irritating and unhelpful. This 2022 ML paper from Google describes LaMDA — short for "Language Model for Dialogue Applications" — the system that caused an uproar this summer when a former Google engineer, Blake Lemoine, alleged that it was sentient. LaMDA is a family of large language models for dialog applications built on Google's Transformer architecture, known for its efficiency and speed on language tasks such as translation. The model's most intriguing features are its ability to be fine-tuned on human-annotated data and its capability to consult external sources.

The model, which has 137 billion parameters, was pre-trained on 1.56 trillion words from publicly accessible conversation data and web documents. It is also fine-tuned along three metrics: quality, safety, and groundedness.

  10. Privacy for Free: How does Dataset Condensation Help Privacy?

One of the primary proposals in the award-winning ML paper is to use dataset condensation methods to retain data efficiency during model training while also providing membership privacy. The authors argue that dataset condensation, which was initially created to increase training effectiveness, is a better alternative to data generators for producing private data since it offers privacy for free. 

Existing data generators used to produce differentially private data for model training (to minimize unintended data leakage) incur high training costs or subpar generalization performance for the sake of data privacy. The study was published by Sony AI and received the Outstanding Paper Award at ICML 2022.

  11. TranAD: Deep Transformer Networks for Anomaly Detection in Multivariate Time Series Data

The use of a model that converts time series into anomaly scores at each time step is essential in any system for detecting time series anomalies. Recognizing and diagnosing anomalies in multivariate time series data is critical for modern industrial applications. Unfortunately, developing a system capable of promptly and reliably identifying abnormal observations is challenging. This is attributed to a shortage of anomaly labels, excessive data volatility, and the expectations of modern applications for ultra-low inference times. 

In this study, the authors present TranAD, a deep transformer network-based anomaly detection and diagnosis model that uses attention-based sequence encoders to perform inference quickly while attending to broader temporal patterns in the data. TranAD employs adversarial training for stability and score-based self-conditioning for robust multi-modal feature extraction. The paper reports extensive empirical experiments on six publicly available datasets showing that TranAD outperforms state-of-the-art baselines in detection and diagnosis, with data- and time-efficient training.
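
To make the "time series to anomaly scores" idea concrete, here is a deliberately simple rolling z-score baseline (not TranAD; TranAD replaces this with a transformer's reconstruction error). Each time step is scored by its deviation from recent behavior:

```python
import numpy as np

def anomaly_scores(x, window=20):
    """Score each time step by its deviation from the rolling mean, in rolling std units."""
    scores = np.zeros_like(x, dtype=float)
    for t in range(window, len(x)):
        hist = x[t - window:t]
        mu, sigma = hist.mean(), hist.std() + 1e-8
        scores[t] = abs(x[t] - mu) / sigma
    return scores

rng = np.random.default_rng(0)
x = rng.normal(size=200)   # a synthetic univariate series
x[150] += 10.0             # inject an obvious anomaly
s = anomaly_scores(x)
print(int(s.argmax()))     # the highest score lands on the injected anomaly
```

Thresholding the score at each step turns this into a detector; deep models like TranAD learn far richer notions of "expected behavior" for multivariate data.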

  12. Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding

In the last few years, generative models called “diffusion models” have been increasingly popular. This year saw these models capture the excitement of AI enthusiasts around the world. 

Going beyond recent text-to-image systems, this outstanding 2022 ML paper introduced Imagen, Google's viral text-to-image diffusion model. Imagen achieves a new state-of-the-art FID score of 7.27 on the COCO dataset by combining the deep language understanding of transformer-based large language models with the photorealistic image generation of diffusion models. A frozen text-only language model provides the text representation, and a diffusion model with two super-resolution upsampling stages, up to 1024×1024, produces the images. It employs several training techniques, including classifier-free guidance, to learn both conditional and unconditional generation. Another key feature of Imagen is dynamic thresholding, which prevents the diffusion process from saturating in parts of the image, a behavior that reduces image quality, particularly when the weight placed on text conditioning is large.

  13. No Language Left Behind: Scaling Human-Centered Machine Translation

This ML paper introduced one of the most popular Meta projects of 2022: NLLB-200. It describes how Meta built and open-sourced this state-of-the-art AI model at FAIR, which is capable of translating between 200 languages. The paper covers every aspect of the technology: language analysis, ethical issues, impact analysis, and benchmarking.

No matter what language a person speaks, accessibility via language ensures that everyone can benefit from the growth of technology. Meta claims that several languages that NLLB-200 translates, such as Kamba and Lao, are not currently supported by any translation systems in use. The tech behemoth also created a dataset called “FLORES-200” to evaluate the effectiveness of the NLLB-200 and show that accurate translations are offered. According to Meta, NLLB-200 offers an average of 44% higher-quality translations than its prior model.

  14. A Generalist Agent

AI pundits believe that multimodality will play a huge role in the future of artificial general intelligence (AGI). One of the most talked-about ML papers of 2022, by DeepMind, introduces Gato, a generalist agent. Gato is a multi-modal, multi-task, multi-embodiment network, meaning the same neural network (a single architecture with a single set of weights) can do many tasks while integrating inherently diverse types of inputs and outputs.

DeepMind claims that the generalist agent can be improved with new data to perform even better on a wider range of tasks. The authors argue that a general-purpose agent reduces the need for hand-crafting policy models for each domain, increases the volume and diversity of training data, and enables continuous advances in data, compute, and model scale. A general-purpose agent can also be viewed as a first step toward artificial general intelligence.

Gato demonstrates the versatility of transformer-based machine learning architectures by exhibiting their use in a variety of applications. Unlike previous neural network systems tailored to play games, stack blocks with a real robot arm, read words, or caption images, Gato is versatile enough to perform all of these tasks on its own, using a single set of weights and a relatively simple architecture.

  15. The Forward-Forward Algorithm: Some Preliminary Investigations

AI pioneer Geoffrey Hinton is known for his foundational work on backpropagation and deep convolutional neural networks. In his latest paper, presented at NeurIPS 2022, Hinton proposed the "forward-forward algorithm," a new learning algorithm for artificial neural networks inspired by our understanding of neural activations in the brain. The approach draws on Boltzmann machines (Hinton and Sejnowski, 1986) and noise-contrastive estimation (Gutmann and Hyvärinen, 2010). According to Hinton, forward-forward, which is still experimental, can substitute the forward and backward passes of backpropagation with two forward passes: one with positive data and the other with negative data that the network itself could generate. Further, the algorithm could run more efficiently on hardware and may better explain the brain's cortical learning process.

Without employing complicated regularizers, the algorithm achieved a 1.4 percent test error rate on the MNIST dataset in an empirical study, suggesting it can be roughly as effective as backpropagation.
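
A toy NumPy sketch of the core idea (not Hinton's implementation; the data, layer size, and constants are illustrative): each layer has a local "goodness" — the sum of its squared activations — pushed above a threshold on positive data and below it on negative data, with no backward pass through the network.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One layer trained with a forward-forward-style rule: a purely local objective.
W = rng.normal(scale=0.1, size=(10, 32))
theta, lr = 2.0, 0.1

def forward(x):
    return np.maximum(0.0, x @ W)          # ReLU activations

def goodness(h):
    return (h ** 2).sum(axis=1)            # sum of squared activations

def ff_step(x, positive):
    """One local update: raise goodness above theta on positive data, lower it on negative."""
    global W
    h = forward(x)
    p = sigmoid(goodness(h) - theta)       # P(layer believes the input is positive)
    coeff = (p - 1.0) if positive else p   # gradient of the logistic loss w.r.t. goodness
    W -= lr * (x.T @ (2.0 * h * coeff[:, None])) / len(x)

pos = rng.normal(loc=0.5, size=(64, 10))   # stand-in "positive" data
neg = rng.normal(loc=-0.5, size=(64, 10))  # stand-in "negative" data

for _ in range(100):
    ff_step(pos, positive=True)
    ff_step(neg, positive=False)

p_pos = sigmoid(goodness(forward(pos)) - theta).mean()
p_neg = sigmoid(goodness(forward(neg)) - theta).mean()
print(p_pos > p_neg)  # the layer separates positive from negative data
```

In the full algorithm, every layer learns this way independently, which is what removes the need for backpropagation.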

The paper also suggests a novel “mortal computing” model that can enable the forward-forward algorithm and understand our brain’s energy-efficient processes.

  16. Focal Modulation Networks

In humans, the ciliary muscles alter the shape of the eye's lens, and hence its radius of curvature, to focus on near or distant objects. Changing the shape of the lens changes its focal length. Mimicking this focal modulation behavior in computer vision systems can be tricky.

This machine learning paper introduces FocalNet, which replaces self-attention with a focal modulation mechanism inspired by foveal attention in human vision. Its attention-free design outperforms SoTA self-attention (SA) techniques on a wide range of visual benchmarks. According to the paper, focal modulation consists of three parts:

a. hierarchical contextualization, implemented using a stack of depth-wise convolutional layers, to encode visual contexts from close-up to a great distance; 

b. gated aggregation to selectively gather contexts for each query token based on its content; and  

c. element-wise modulation or affine modification to inject the gathered context into the query.
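
The three parts can be sketched as a small PyTorch module. This is a simplified illustration, not the official FocalNet code; the class name, level count, and projection layout are assumptions made for clarity.

```python
import torch
import torch.nn as nn

class FocalModulationSketch(nn.Module):
    """A simplified sketch of focal modulation's three parts."""
    def __init__(self, dim, focal_levels=3):
        super().__init__()
        # One projection produces the query, the initial context, and per-level gates.
        self.proj_in = nn.Linear(dim, 2 * dim + focal_levels + 1)
        # (a) hierarchical contextualization: a stack of depth-wise convolutions
        # with growing kernels, encoding context from close-up to far away.
        self.focal_convs = nn.ModuleList(
            nn.Conv2d(dim, dim, kernel_size=3 + 2 * k, padding=1 + k, groups=dim)
            for k in range(focal_levels)
        )
        self.proj_out = nn.Linear(dim, dim)
        self.focal_levels = focal_levels

    def forward(self, x):                      # x: (B, H, W, C)
        q, ctx, gates = torch.split(
            self.proj_in(x), [x.shape[-1], x.shape[-1], self.focal_levels + 1], dim=-1
        )
        ctx = ctx.permute(0, 3, 1, 2)          # to (B, C, H, W) for the convolutions
        modulator = 0
        for k, conv in enumerate(self.focal_convs):
            ctx = torch.nn.functional.gelu(conv(ctx))
            # (b) gated aggregation: each level's context is weighted by a learned gate
            modulator = modulator + ctx * gates[..., k].unsqueeze(1)
        # a global context term at the coarsest level
        modulator = modulator + ctx.mean(dim=(2, 3), keepdim=True) * gates[..., -1].unsqueeze(1)
        # (c) element-wise modulation: inject the gathered context into the query
        out = q * modulator.permute(0, 2, 3, 1)
        return self.proj_out(out)

m = FocalModulationSketch(dim=16)
y = m(torch.randn(2, 8, 8, 16))
print(y.shape)  # torch.Size([2, 8, 8, 16])
```

Note how no token attends to any other token directly: context is gathered by convolutions and injected multiplicatively, which is what makes the design attention-free.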

  17. Learning inverse folding from millions of predicted structures

The field of structural biology is being fundamentally changed by cutting-edge machine learning, protein structure prediction, and ultrafast structural aligners. Time and money are no longer obstacles to obtaining precise protein models and extensively annotating their functions. However, determining a protein sequence from its backbone atom coordinates has remained a challenge. To date, machine learning approaches to this problem have been constrained by the number of empirically determined protein structures available.

In this ICML Outstanding Paper (Runner Up), the authors tackle this problem by expanding training data by almost three orders of magnitude, using AlphaFold2 to predict structures for 12 million protein sequences. With this additional data, a sequence-to-sequence transformer with invariant geometric input processing layers recovers the native sequence on structurally held-out backbones in 51% of cases and recovers buried residues in 72% of cases, an improvement of over 10% over previous techniques. The approach also generalises to more difficult tasks, such as designing protein complexes, partially masked structures, binding interfaces, and multiple states.

  18. MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge

Within the AI research community, using video games as a training medium for AI has gained popularity. These autonomous agents have had great success in Atari games, Starcraft, Dota, and Go. Although these developments have gained popularity in the field of artificial intelligence research, the agents do not generalize beyond a narrow range of activities, in contrast to humans, who continually learn from open-ended tasks.

This thought-provoking 2022 ML paper proposes MineDojo, a unique framework for embodied agent research built on the popular game Minecraft. In addition to an internet-scale knowledge base of Minecraft videos, tutorials, wiki pages, and forum discussions, MineDojo provides a simulation suite with thousands of open-ended tasks. Using MineDojo data, the authors propose a novel agent learning methodology that employs large pre-trained video-language models as a learned reward function. Without requiring an explicitly engineered dense shaping reward, the MineDojo agent can perform a wide range of open-ended tasks specified in free-form language.

  19. Is Out-of-Distribution Detection Learnable?

Supervised machine learning models are frequently trained under the closed-world assumption, i.e., that the distribution of the test data resembles that of the training data. This assumption often fails in real-world settings, causing a considerable decline in performance. While this performance loss is acceptable for applications like product recommendation, an out-of-distribution (OOD) detection algorithm is crucial to prevent ML systems from making inaccurate predictions in settings where the data distribution drifts over time (e.g., self-driving cars).

In this paper, the authors explore the probably approximately correct (PAC) learning theory of OOD detection, previously posed as an open problem, to study when OOD detection is learnable. They first identify a necessary condition for the learnability of OOD detection, then prove several impossibility theorems about its learnability in a handful of different scenarios.

  20. Gradient Descent: The Ultimate Optimizer

Gradient descent is a popular optimization approach for training machine learning models and neural networks. The ultimate aim of any such method is to optimize parameters, but selecting the ideal step size for an optimizer is difficult because it entails lengthy, error-prone manual work. Many strategies exist for automated hyperparameter optimization, but they often introduce additional hyperparameters to govern the optimization process itself. In this study, MIT CSAIL and Meta researchers offer an approach that allows gradient descent optimizers such as SGD and Adam to tune their hyperparameters automatically.

They propose learning the hyperparameters themselves by gradient descent, and the hyper-hyperparameters by gradient descent as well, and so on recursively. The paper describes an efficient way for gradient descent optimizers to adjust their own hyperparameters, stacked recursively to many levels. As these towers of gradient-based optimizers grow, they become substantially less sensitive to the choice of top-level hyperparameters, reducing the burden on the user to search for optimal values.
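
The flavor of the idea can be shown with a minimal hypergradient-style sketch in NumPy (an illustrative toy, not the paper's implementation): the learning rate alpha is itself updated by gradient descent, using the gradient of the loss with respect to alpha obtained by the chain rule through the previous parameter update. The quadratic loss and constants are assumptions for the example.

```python
import numpy as np

def hypergradient_descent(w, alpha=0.01, kappa=0.001, steps=100):
    """Gradient descent on w whose step size alpha is also learned by gradient descent.

    Loss: f(w) = 0.5 * ||w||^2, so grad f(w) = w.
    Since w_t = w_{t-1} - alpha * g_{t-1}, the chain rule gives
    d f(w_t) / d alpha = -g_t . g_{t-1}, so descending on alpha means
    alpha += kappa * g_t . g_{t-1}.
    """
    g_prev = np.zeros_like(w)
    for _ in range(steps):
        g = w                                  # gradient of 0.5 * ||w||^2
        alpha = alpha + kappa * g.dot(g_prev)  # update the learning rate itself
        w = w - alpha * g                      # ordinary gradient step
        g_prev = g
    return w, alpha

w0 = np.array([5.0, -3.0])
w, alpha = hypergradient_descent(w0)
print(np.linalg.norm(w) < np.linalg.norm(w0))  # True: the loss decreased
```

Because alpha grows while gradients stay aligned and stops changing as they shrink, a poor initial choice of alpha matters much less, which is exactly the paper's point about reduced sensitivity to top-level hyperparameters.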

  21. ProcTHOR: Large-Scale Embodied AI Using Procedural Generation

Embodied AI is a developing research field influenced by recent advances in artificial intelligence, machine learning, and computer vision; it attempts to translate the connection between body, environment, and learning to artificial systems. The paper proposes ProcTHOR, a framework for procedural generation of embodied AI environments. ProcTHOR allows researchers to sample arbitrarily large datasets of diverse, interactive, customisable, and performant virtual environments to train and evaluate embodied agents across navigation, interaction, and manipulation tasks.

According to the authors, models trained on ProcTHOR using only RGB images, without any explicit mapping or human task supervision, achieve state-of-the-art results on 6 embodied AI benchmarks for navigation, rearrangement, and arm manipulation, including the ongoing Habitat 2022, AI2-THOR Rearrangement 2022, and RoboTHOR challenges. The paper received the Outstanding Paper award at NeurIPS 2022.

  1. A Commonsense Knowledge Enhanced Network with Retrospective Loss for Emotion Recognition in Spoken Dialog

Emotion Recognition in Spoken Dialog (ERSD) has recently attracted a lot of attention due to the growth of open conversational data. Integrating emotional states into intelligent spoken human-computer interaction has produced markedly better speech recognition systems. Recognizing emotions has also been shown to make it possible to track the development of human-computer interactions, allowing conversational strategies to be adjusted dynamically and influencing outcomes (e.g., customer feedback). However, the limited volume of current ERSD datasets restricts model development.

This ML paper proposes a Commonsense Knowledge Enhanced Network (CKE-Net) with a retrospective loss to carry out dialog modeling, external knowledge integration, and historical state retrospect hierarchically. 


Meta Sues Voyager Labs in a Lawsuit for Allegedly Creating Fake Accounts to Scrape Data


Meta filed a lawsuit against Voyager Labs, claiming that the company set up bogus Facebook profiles to gather data from real Facebook users, which it later used for its own business needs. As per the filing in the District Court for the Northern District of California, Meta claimed that there were over 38,000 fake Facebook accounts.

The complaint said, “Meta seeks damages and injunctive relief to stop Defendant’s use of its platforms and services.”

Voyager Labs specializes in investigative tools and services that assist law enforcement agencies and businesses in learning more about suspects. Meta claimed that Voyager Labs inappropriately collected data not only from Facebook but also from Instagram, Twitter, YouTube, Telegram, and other websites to fuel its software.

Read More: Neeva AI Restructuring Consumer Search Practices with its AI-Powered Engine like You.com

Over 60,000 Voyager Labs-related Facebook and Instagram identities and pages, including the 38,000 fraudulent accounts, were eventually removed by Meta.

Meta’s complaint is similar to the data-scraping court case between Microsoft-owned LinkedIn and the data-analytics firm hiQ Labs, which ended in December 2022 with a settlement worth US$500,000. Moreover, in September 2022, Meta settled another case with BrandTotal and Unimania, which ultimately stopped them from “using and scraping Instagram and Facebook.”


Neeva AI Restructuring Consumer Search Practices with its AI-Powered Engine like You.com


Neeva, a consumer search engine, has launched an AI-powered engine that uses large language models and an independent search stack to enhance its search capabilities. Neeva’s interface works somewhat similarly to Google’s Featured Snippets.

Neeva AI has taken inspiration from the recently viral YouChat by You.com, where users simply type their inputs and the chatbot uses a ChatGPT-like interface to generate replies. Neeva AI differs in that it credibly replies “AI can’t answer” when it does not know or understand a query, instead of generating factually incorrect responses.

To get started, you must create a Neeva account; without one, it will continue to function as a standard search engine. However, not all queries generate an answer from Neeva’s AI. Generally, you only receive AI search results for questions that could benefit from AI-created responses.

Read More: Uncovering The Twitter Files

Essentially, Neeva AI only appears to respond to certain inquiries; however, co-founder Sridhar Ramaswamy does state that Neeva AI will respond to additional searches “in the coming months.”

Neeva was founded by Sridhar Ramaswamy and Vivek Raghunathan with a vision to provide a search engine that keeps users’ information safe. Neeva AI search engine looks for information on the web and personal files like emails. The consumer search engine will not show any advertisements or collect any profit from user data.



The Labour Commissioner’s office in Pune summons Amazon over layoffs

The Labour Commissioner’s office in Pune has sent a summons to e-commerce giant Amazon and the Nascent Information Technology Employees Senate (NITES) regarding the company’s massive layoffs and voluntary separation policy.

As per the summons, the assistant labour commissioner requested Amazon and its union representatives to be present at the commissioner’s office on January 17 at 3 PM. The commissioner will take necessary action following an inquiry into Amazon’s allegedly unethical and illegal layoffs.

Harpreet Singh Saluja, President of NITES, stated that the livelihoods of 1,000 employees and their families have now been made vulnerable. He also mentioned that, under the procedures of the Industrial Disputes Act, an employee can be laid off only with prior permission from the appropriate government.

Read more: Hyundai Motors India unveils Hyundai Pavilion on metaverse space

Amazon issued a voluntary separation policy to its employees in November 2022, enabling them to resign voluntarily. Employees who did not apply for the policy were included in the workforce optimization announced by Amazon.

As per Saluja, Amazon violated Indian labour laws, as its voluntary separation policy was never submitted to the Labour Ministry for review. An employee who has served a company continuously for a year can be laid off only if served three months’ notice in advance.


PhysicsWallah acquires iNeuron Intelligence for ₹250 crore to expand its upskilling offerings 


On Thursday, Westbridge Capital-backed edtech unicorn PhysicsWallah (PW) said that it had acquired iNeuron Intelligence to expand its offerings in the upskilling category.

The deal is valued at around ₹250 crore and will provide an exit to investors, including publishing house S Chand & Co. The iNeuron team will drive the company’s tech-skilling plan under PW Skills, said Alakh Pandey, co-founder and chief executive of PhysicsWallah.

PhysicsWallah offers upskilling courses under the PW Skills brand, which includes business analytics programs and programming languages such as Java and C++.

Read More: Brain Chip By Inner Cosmos Enters 2nd Human Trial To Cure Depression, A Competition To Neuralink

Launched in 2020, PW helps students prepare for engineering and medical entrance examinations. It turned unicorn earlier this year after it raised $100 million in its Series A funding round from marquee investors Westbridge and GSV Ventures.

According to co-founder Prateek Maheshwari, there is a skills gap between what has been taught at institutes and what the industry demands. “We were looking for skilling startups that had strong fundamentals and have actually helped students bag their dream jobs.”

“We saw such capabilities in iNeuron, which helped a chef turn into a coder and a UPSC aspirant with a gap of five years land a job at Amazon,” Maheshwari said, explaining the rationale for the merger.


Uncovering the Twitter Files

Elon has co-authored a series of Twitter threads, called Twitter Files, on the platform’s internal documents

Since Elon Musk disclosed his plans to acquire the social media platform Twitter in April, Twitter has hit multiple hurdles along the way. Musk decided to take over Twitter as he believed the workforce to be “lazy and politically biased.” After the US$44 billion acquisition, the platform has experienced several policy changes and bumps in the road: millions of users deactivated their accounts, and several high-profile executives, including CEO Parag Agrawal, Chief Financial Officer Ned Segal, and General Counsel Sean Edgett, were fired. Additionally, as a self-described “free speech absolutist,” Musk adopted a significantly different approach to content moderation, unbanning well-known users, including Donald Trump. He claimed the policy changes were in the platform’s favor. In the most recent development, Musk has co-authored a series of Twitter threads, called the Twitter Files, on the platform’s internal documents, with the freelance journalists Matt Taibbi and Bari Weiss.

He wrote, “The Twitter Files on free speech suppression soon to be published on Twitter. The public deserves to know what really happened…” Musk provided Weiss and Taibbi with internal documents such as chat logs, executive emails, and screenshots taken in 2020 around the US presidential elections, and asked them to build stories around them. It is a bold move that is attracting considerable controversy worldwide.

Let’s look at why these revelations are being billed as a bombshell. 

We have elaborated on each installment sequentially; have a look.

1. Twitter Files #1: The Hunter Biden Laptop story.


Interspersed with his own commentary, Taibbi published a Twitter thread on December 2 that included internal Twitter emails. Some of these documents revealed Twitter’s internal discussions surrounding the decision to suppress the New York Post’s reporting on the findings from Hunter Biden’s laptop. The suppressed tweets supposedly contained nude pictures and videos of Hunter Biden and were termed “revenge porn.”

Moreover, some of the tweets in this thread contained information on tweets flagged for removal by the Trump White House and the Biden campaign team, which the platform acted on. According to Taibbi, Twitter “received and honored” deletion requests from both the Trump White House and the Biden campaign team; however, he provided evidence of the latter but not the former.

When the decision to suppress the content was made, then-CEO Jack Dorsey was not informed; days later, he called the decision a “mistake.” Simultaneously, Twitter’s hacked-materials policy was updated to allow such content to be published with a contextual warning.

Matt posted a snippet of an exchange between Vijaya Gadde, the former legal head of policy, and Yoel Roth, Trust & Safety chief. It appears that the decision was made at a high level in the company, without the knowledge of Jack Dorsey but with the involvement of Gadde.

2. Twitter Files #2: Visibility filtering

Concerning a practice referred to as “visibility filtering” (VF) by the previous Twitter management, Bari Weiss published the second installment of the Twitter Files on December 9. Weiss reported that a high-level team called Site Integrity Policy-Policy Escalation Support (SIP-PES) was responsible for making decisions about “politically sensitive” accounts and controlling the extent of their visibility. Allegedly, Twitter utilized VF to prevent searches for specific individuals, restrict the audience that could find a given tweet, keep certain users’ posts off the “trending” tab, and exclude them from hashtag searches.

As evidence, Bari shared screenshots of the internal system with accounts tagged as “Trends Blacklist,” “Search Blacklist,” “Do Not Amplify,” and “Do Not Take Action on User Without Consulting with SIP-PES.” Some accounts whose snippets were shared by Bari:

  • Stanford’s Dr. Jay Bhattacharya’s account was placed on the “Trends Blacklist.” 
  • Dan Bongino, a reputed right-wing talk show host, was on the “Search Blacklist.”
  • Charlie Kirk, a conservative activist, was placed on “Do Not Amplify.”

The controversial part is that Twitter denied these allegations back in 2018. Vijaya Gadde and Kayvon Beykpour (Head of Product) had then said, “We do not shadow ban,” adding that “we certainly don’t shadow ban based on political viewpoints or ideology.” Evidently, the images show otherwise.

The revelations do not end here. Chaya Raichik’s Twitter account “libsoftiktok” became the talk of the town as it rose to the highest level of scrutiny. It was on the Trend Blacklist and the “Do Not Take Action on User Without Consulting With SIP-PES” list.

In 2022 alone, the account was suspended six times, and each time the posts were blocked under violation of Twitter’s “hateful conduct” policy. However, the committee noted in an internal SIP-PES report from October 2022, following her sixth ban, that “LTT has not directly engaged in behavior violative of the Hateful Conduct policy.” Again, this contradicts what was publicly informed.

Later, Raichik’s home address and pictures were leaked on Twitter, and ironically, the platform denied support by saying that it did not find the content to violate Twitter rules.

3. Twitter Files #3: Donald Trump’s Suspension

Both Matt and Bari posted the third installment on December 9. Donald Trump’s tweets have been controversial for years, yet Twitter long resisted internal and external calls to ban him, on the grounds that doing so would block a world leader from sharing vital information. But following the riots at the Capitol on January 6, 2021, Trump’s Twitter account was suspended on January 8. While the ban finally came during the three days between January 6 and 8, the theoretical groundwork had been established months before.

According to Taibbi, Yoel Roth often met with representatives from the FBI, Department of Homeland Security, and Office of the Director of National Intelligence (DNI). Roth said the meeting’s purpose was to coordinate efforts to thwart foreign interference and domestic disinformation in the elections.

4. Twitter Files #4: The removal of Donald Trump, continued

Blogger Michael Shellenberger posted internal conversations between Yoel Roth and some of his colleagues. These conversations show that Jack Dorsey was on vacation while Twitter’s senior executives and Roth were moving decisively toward banning conservatives. Following an email from Jack about maintaining Twitter’s consistency in policies, including an account’s right to return after a temporary suspension, Roth had a conversation with an employee. Soon after, as Michael shares, Roth sent a message saying that Jack had approved a policy called “repeat offender for civic integrity,” under which any five violations would result in a permanent suspension.

However, as the subsequent tweets reveal, the policy was for “everything else,” and “Trump continues to just have his one strike.” But, to everyone’s surprise, Twitter suspended Donald Trump’s account due to a risk of further incitement of violence.

The following events are crucial to comprehending why Twitter decided to ban Trump. When a sales executive asked whether the company was dropping the public interest policy, Roth said they were just changing the public interest approach for Trump’s account. This policy allows select officials/election contenders’ content, even if it violates Twitter’s content policy.


More Layoffs in 2023, Alphabet Verily Lays Off 15% of its Staff


Verily Life Sciences, an Alphabet Inc. division specializing in health sciences, stated that it has laid off over 200 workers, or 15% of its workforce. This is the first time in at least six years that Alphabet or one of its affiliates has announced employment cutbacks. 

Verily emerged from the Google X research program in 2015; in September of last year, it received a $1 billion investment from Alphabet. At the time, Stephen Gillett was announced as Chief Executive Officer (CEO), and co-founder Andy Conrad became executive chairman.

In his emails to employees, Gillett wrote that the company is striving for financial independence from its parent company, Alphabet. According to the emails, the cuts were due to “redundancy” on the team and canceled programs.

Read More: Amazon Confirms to Lay Off 18,000 Employees Under an Uncertain Economy

The layoff comes after comparable layoffs in the corporate sector, primarily in banks and technology industries, as businesses try to cut costs in a challenging economic climate. Many companies like Meta, Salesforce, Tesla, and others have hinted at more layoffs in 2023. 

Foreseeing tougher economic situations, Verily said in a blog post, “We will advance fewer initiatives with greater resources.” Verily also stopped developments on its analytics tool Value Suite and some other products, adding that it will also remove employees from those segments.


Brain Chip by Inner Cosmos Enters 2nd Human Trial to Cure Depression, A Competition to Neuralink


California-based Inner Cosmos has successfully developed and implanted a penny-sized brain chip in a patient to cure depression and will be starting the 2nd human trial in February. As per Inner Cosmos, depression could soon be treated using the “digital pill” as a brain implantation that delivers electrical pulses to the regions of the brain affected by mental illness.

CEO of Inner Cosmos, Meron Gribetz, was diagnosed with ADHD at age 10. The diagnosis was the reason for his interest in neuroscience. He said, “The world is in a state of severe disorder, and the effects are being felt by millions leading to surging levels of depression.”

The Inner Cosmos brain chip has two parts: an electrode and an external prescription pod. The electrode is implanted under the scalp, and a prescription pod is attached to the head to power the device. In March 2022, the company gained permission from the US Food and Drug Administration to carry out patient trials with their technology.

Read More: Hockey India enters the metaverse through Hockeyverse

A fairly simple outpatient operation can implant the tiny electrode into the cranium, and the procedure takes only 30 minutes. The implant stimulates the left dorsolateral prefrontal cortex, the part of the brain responsible for processes like working memory, abstract reasoning, and motor skills, for 15 minutes each day.

The second human trial, if successful, can pave the way for more advanced treatments in neurosciences.


SIBM Pune, Swansea University to host an online seminar series on metaverse 


A seminar series called “The Digital Future for Business & Society: Emerging Perspectives on the Metaverse” will be jointly hosted by Professor Yogesh K. Dwivedi and Dr. Laurie Hughes of Swansea University, UK, and Professor Ramakrishnan Raman, Director of the Symbiosis Institute of Business Management (SIBM), Pune. The latter announced the series in a LinkedIn post.

This seminar series will run for twelve months, from January 2023 to December 2023. A couple of online sessions a month, led by professors from universities worldwide, have been planned, and registration is free for all interested.

The series will be overseen by Dr. Anabel Gutierrez, co-chair of the Digital Marketing and Analytics SIG at the Academy of Marketing, and Dr. Vinod Kumar, Associate Professor of Marketing, SIBM, Pune. 

Read More: Seek AI Raises US$7.5M In Pre-Seed And Seed Funding For Its Generative AI Platform

The seminar series will offer thought-provoking insights into the metaverse and its impact on the future, presenting ideas from a number of expert speakers to highlight the opportunities and challenges posed by the metaverse’s rapid emergence. 

The seminar topics include Engaging Users in the Metaverse and Surfing the Internet and Diving in the Metaverse: A Status Quo Analysis. It will also discuss Metaverse and Tourism Marketing, the Pollution-reducing and Pollution-generating Effects of the Metaverse, and Unlocking the Metaverse in Operations and Manufacturing Management.

Some interesting discussions on topics such as Advertising and Media Planning on the Metaverse, Individual and its Property in the Virtuality, the Legal Aspects of Virtual and Augmented Reality, Metaverse Retail: Reflections on the Opportunities and Challenges Ahead, Opportunities and Challenges of Metaverse in Marketing will be held.
