
IIT Kanpur Offers Free 8-Week Computational Science Course, Enrollment Ends 15th Feb

IIT computational science course
Source - IIT Kanpur Gallery

IIT Kanpur has opened enrollment for an eight-week online course on computational science on the SWAYAM platform. Dr. Ashoke De, a Professor in the Aerospace Engineering department and an Alexander von Humboldt Fellow with over 50 publications to his name, will teach the course.

The course is ideal for undergraduate and postgraduate students of Aerospace, Mechanical, Chemical, and Civil Engineering; however, basic knowledge of mathematics and programming is a prerequisite.

After taking this course, learners will be able to leverage computation to solve problems common to both pure and applied sciences, and to develop new methodologies and tools for carrying out numerical simulations, a significant part of the scientific computing paradigm. The course offers a basic overview of all these aspects that is easy to digest for beginners and faculty alike.

Also Read: AWS Will Host Free Virtual Classes On ML, Blockchain, Big Data, And More

When it comes to modeling natural phenomena, knowledge of Linear Algebra, ODEs, and PDEs is a must. From the course layout, it is evident that the focus is on linear algebra for the first two weeks and Ordinary Differential Equations (ODEs) for the next two, with Partial Differential Equations (PDEs) addressed in the fifth week.

In addition, the course sharpens skills like mathematical modeling and numerical analysis: the sixth-week module covers numerical analysis, and the remaining weeks focus on implementing eigenvalue problems and ODE solutions. The course also provides necessary insights into efficient algorithms, computer architecture, software design, implementation, validation, and results visualization.
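To give a flavor of the numerical methods such a course covers (this sketch is an illustration of the general technique, not material from the course itself), here is a minimal forward-Euler integrator for an ordinary differential equation:

```python
import math

def euler(f, y0, t0, t1, n):
    """Integrate dy/dt = f(t, y) from t0 to t1 with n forward-Euler steps."""
    h = (t1 - t0) / n
    t, y = t0, y0
    for _ in range(n):
        y += h * f(t, y)  # step along the local slope
        t += h
    return y

# dy/dt = -y with y(0) = 1 has the exact solution y(t) = exp(-t).
approx = euler(lambda t, y: -y, 1.0, 0.0, 1.0, 1000)
print(abs(approx - math.exp(-1.0)) < 1e-3)  # True: close to exp(-1)
```

Shrinking the step size `h` reduces the error, which is exactly the accuracy/cost trade-off numerical analysis studies.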


This computational science course is offered for free, so you can finish the whole course without paying a penny to IIT Kanpur. Optional certification requires only a 1000-rupee fee for the exam, which will be conducted on 24th April. You can join the course here before the enrollment deadline of 15th February.


Dealing With Racially-Biased Hate-Speech Detection Models

biased hate speech models

Hate-speech detection models are a glaring example of biased models, as shown by researchers from the Allen Institute for Artificial Intelligence in their linguistic study. A recent post highlighted the effects of statistical bias in machine translations; this post looks at how dataset bias affects models. The researchers studied the behavior of hate-speech detectors using lexical markers (swear words, slurs, identity mentions) and dialectal markers (specifically African-American English). They also proposed an automated dialect-aware data correction method, which uses synthetic labels to reduce dialectal associations with toxicity scores.

The dataset creation process always captures biases inherent to humans. This dataset bias consists of spurious correlations between surface patterns and annotated toxicity labels, which give rise to two types of bias: lexical and dialectal. Lexical bias associates toxicity with identity mentions and with certain words considered profane, while dialectal bias correlates toxicity with the dialects of minorities. All these biases proliferate freely during the training phase of hate-speech models.

Researchers have proposed numerous debiasing techniques in the past, some applied by internet giants like Google, Facebook, and Twitter in their systems. In this study, the researchers found that these techniques are not good enough: the so-called "debiased" models still disproportionately flag text in particular dialects as toxic. The researchers noted, "mitigating dialectal bias through current debiasing methods does not mitigate a model's propensity to label tweets by black authors as more toxic than by white authors."

The Allen researchers proposed a proof-of-concept solution to ward off the problem. The idea is to translate the flagged hate speech into the majority's dialect, which the classifier deems non-toxic. This accounts for the speech's dialectal context, giving the model common ground to predict toxicity scores reasonably and making it less prone to dialectal and racial biases.
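The "translate, then re-score" idea can be sketched in a few lines. Everything below is a toy stand-in: the tiny dialect lexicon and the keyword "classifier" are invented for illustration and bear no relation to the researchers' actual models.

```python
# Toy sketch of dialect-aware re-scoring. AAE_TO_GAE and PROFANITY are
# made-up stand-ins, not the researchers' lexicons or classifier.
AAE_TO_GAE = {"finna": "about to", "ain't": "is not"}
PROFANITY = {"damn"}

def normalize_dialect(text):
    """Map dialect-specific tokens to a general-American-English form."""
    return " ".join(AAE_TO_GAE.get(tok, tok) for tok in text.lower().split())

def toxicity_score(text):
    """Naive lexical scorer: fraction of tokens on the profanity list."""
    toks = text.lower().split()
    return sum(t in PROFANITY for t in toks) / len(toks)

def dialect_aware_score(text):
    """Score the dialect-normalized text instead of the raw text."""
    return toxicity_score(normalize_dialect(text))

print(dialect_aware_score("i'm finna go home"))  # 0.0
```

A purely lexical scorer would never be deployed, but the pipeline shape (normalize dialect first, then score) is the point of the proof of concept.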


AutoML Made Easy With Symbolic Programming Using PyGlove

Symbolic programing AutoML Pyglove

Google AI researchers have released PyGlove, a library providing a symbolic implementation of Automated Machine Learning (AutoML) that lets developers experiment with the search spaces, search algorithms, and search flows of an AutoML system in only a few lines of code. Developers can now make Python classes and functions mutable through brief annotations, making it much easier to write AutoML programs.

Previously, developers had data and outputs; they fed them into a machine learning algorithm, which automated the learning of the rules mapping input to output. Researchers later automated the selection and hyper-parameter tuning of those machine learning algorithms as well. Neural networks, one sub-class of machine learning algorithms, are highly sensitive to architecture and hyper-parameters.

The possible combinations of architecture and hyper-parameter choices become enormous as researchers aim to build larger and larger neural models, and they can spend months hand-crafting neural network architectures and selecting the right hyper-parameters. AutoML automates these aspects by formulating the problem as a search problem.

Also Read: What Is Liquid Machine Learning?

A search space is defined to represent all possible choices, and a search algorithm is used to find the best ones. Neural Architecture Search (NAS) algorithms like ENAS and DARTS come under the purview of AutoML. But current implementations of NAS algorithms do not offer modularity between components like the search space and the search algorithm, so researchers have struggled to modify the search space, search algorithm, or search flow independently.

The Google researchers built AutoML on symbolic programming, a paradigm in which programs can mutate themselves by manipulating their own components, which keeps those components decoupled. This decoupling makes it easy for practitioners to change the search space and search algorithm (with or without weight sharing), add search capabilities to existing code, and implement complex search flows.
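The decoupling can be illustrated in plain Python (this is a conceptual sketch of a search space separated from a search algorithm, not PyGlove's actual API):

```python
import random

# A search space is just data: each hyper-parameter declares its choices.
def one_of(choices):
    return {"kind": "one_of", "choices": choices}

search_space = {
    "layers": one_of([2, 4, 8]),
    "activation": one_of(["relu", "tanh"]),
}

def random_search(space, evaluate, trials=100, seed=0):
    """A search algorithm that knows nothing about the space's meaning."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(trials):
        candidate = {k: rng.choice(v["choices"]) for k, v in space.items()}
        score = evaluate(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best

# Dummy objective: prefer deeper networks with relu.
best = random_search(
    search_space,
    lambda c: c["layers"] + (1 if c["activation"] == "relu" else 0),
)
print(best)
```

Because the space and the algorithm only meet through the `evaluate` callback, either can be swapped without touching the other, which is the modularity the article describes.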

On ImageNet and NAS-Bench-101 benchmarks, they showed that PyGlove can convert a static program into a search space, iterate quickly on search spaces and search algorithms, and craft complex search flows to achieve better results. PyGlove also allows easy plug-and-play of AutoML techniques in existing ML pipelines while also benefiting open-ended research.


Microsoft Introduces Viva To Help People Work From Home

Microsoft VIVA employee platform

Microsoft has unveiled a new employee experience platform, Viva, that will act as an integrated platform to manage employee well-being, learning, engagement, and knowledge discovery in the workflow. With close integration with Teams and Office 365 technologies, Microsoft wants to be the market leader in employee engagement.

Microsoft is betting on remote work being the culture of the future and is targeting organizations and employees of all kinds. During the pandemic, almost all companies have struggled with a patchwork of platforms for onboarding and training employees. The new platform promises to smooth the journey for employees and companies alike.

Also Read: AWS Will Host Free Virtual Classes On ML, Blockchain, Big Data, And More

Currently, the platform has four modules, Viva Connections, Viva Insights, Viva Learning, and Viva Topics, each representing a different aspect of the employee workflow inside or outside the company. Viva Connections provides a personalized gateway for every employee to access internal communications and company resources. It also helps employees participate in communities like employee resource groups, all from a single customizable app in Microsoft Teams. Viva Insights helps executives identify where teams struggle, especially in balancing productivity and well-being.

Viva Learning gathers the learning resources available to the company, like courses and guided projects from edX, Coursera, and many more, into one platform, and helps employees manage all their training and micro-courses along with their accomplishments. Viva Topics enables AI-driven knowledge discovery from various third-party sources, across documents in Microsoft 365 and conversations in Teams.

Microsoft has partnered with Accenture, Avanade, PwC, and EY to help other companies adopt the homegrown employee experience environment, providing consulting and advisory services. Microsoft CEO Satya Nadella put the benefits of Viva into a much-needed vision statement for the new initiative. He said, "We have participated in the largest at-scale remote work experiment the world has seen, and it has had a dramatic impact on the employee experience. Every organization will require a unified employee experience from onboarding and collaboration to continuous learning and growth. Viva brings together everything an employee needs to be successful, from day one, in a single, integrated experience directly in Teams."


AWS Will Host Free Virtual Classes On ML, Blockchain, Big Data, And More

AWS virtual classes

AWS will offer free virtual classes for learners who want to gain in-demand skills in machine learning, blockchain, big data, containers, and more. The AWS virtual classes are an ideal way to get started with the latest technologies on AWS.

The webinar-based online classes run about 90 minutes and are mostly aimed at beginners and professionals who want to explore new technologies.

Learners from across the world can register for sessions at their convenience, as the lessons are delivered across time zones. AWS keeps hosting training through virtual classes to ensure the world has a workforce that can work with cutting-edge technologies. Last month, AWS conducted the free AWS AI Conclave, delivering 20+ breakthrough sessions from industry experts.

In February 2021, AWS included topics like Blockchain, Containers, Kubernetes, Machine Learning, and Big Data, among others. With these webinars, aspirants can discover new interests and get to know the fundamentals of these technologies.

As organizations now expect fundamental knowledge of product development from data scientists, an understanding of containers can set candidates apart. In addition to learning new technologies, familiarity with the AWS platform can also help beginners streamline their workflow when they start working at organizations.

Every month, AWS delivers webinars on the latest technologies, and in the coming months, AWS will also focus on Cloud Security, Data Analytics, and more.

To know more about the upcoming AWS Virtual Classes click here.


Measuring Weirdness In AI-Based Language-Translations

machine translation linguistic analysis

AI-based language translations were long an object of ridicule whenever they coughed up something funny. Consequently, AI researchers focused on translation accuracy and fluency to set aside the embarrassment of faulty translations. The situation gradually improved, especially with better and larger language models that surpassed humans on various benchmarks.

But these language models still amplify the statistical biases found in their training data, and the biases affect not only the translations but also their linguistic richness. Researchers from the University of Maryland and Tilburg University have studied this effect quantitatively through grammatical and linguistic analysis of machine translations.

A translated work differs from the original due to intentional factors like explicitation and normalization, and unintentional ones like the unconscious effects of the source language on the target language produced. Linguists study these elements, the translator's unique additions, under the label Translationese; the analogous elements introduced by a machine translator are studied as Machine Translationese.

Also Read: Language Models Exhibits Larger Social Bias Than Human-Written Texts

In the study, the researchers performed linguistic analysis of sequential neural models like LSTMs and Transformers, as well as phrase-based statistical translation models, to highlight the above factors. These models were tasked with translating between English, French, and Spanish. The researchers found that the statistical distribution of terms in the training data dictates the loss of morphological variety in machine translations.

The translation systems do not distinguish between synonymous and grammatical variants, which directly reduces the number of grammatically correct but diverse options. In layman's terms, the diversity of words and sentence structures was drastically lower in the translations because of consistency and simplification.

The authors also investigated the social-linguistic impacts of this loss, because machine translations affect language usage among the masses. No solution has been proposed to the problem yet. The authors believe that metrics such as language acquisition measures for lexical sophistication, and Shannon entropy and Simpson diversity for morphological diversity, could guide further investigation.
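The two diversity metrics named above are easy to compute over token counts. This minimal sketch (the example sentences are invented) shows how a more repetitive "translation" scores lower on both:

```python
import math
from collections import Counter

def shannon_entropy(tokens):
    """Shannon entropy (bits) of the token frequency distribution."""
    counts = Counter(tokens)
    n = len(tokens)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def simpson_diversity(tokens):
    """1 - Simpson index: probability two random tokens differ."""
    counts = Counter(tokens)
    n = len(tokens)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

original = "the cat sat on the mat while the dog ran".split()
translated = "the cat sat on the mat the the dog the".split()  # more repetitive

print(shannon_entropy(original) > shannon_entropy(translated))      # True
print(simpson_diversity(original) > simpson_diversity(translated))  # True
```

Both metrics drop as the token distribution becomes more skewed, which is the "loss of variety" effect the study measures.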


Google Introduces Interpretable Ranking Via Generalized Additive Models

google interpretable ranking GAM

We are building ever more complex AI models to get our predictions right. In the end, we have very accurate predictions but no interpretation of the models' internal workings. These AI models have been introduced, in a controlled manner, into sensitive areas like determining bail or parole, assessing loan eligibility, targeting advertisements, and guiding medical treatment decisions.

But the lack of interpretability has made model maintenance difficult and allowed social bias to prevail in predictions, so the models' participation in high-stakes decision processes remains limited. Google researchers are trying to change this accuracy-versus-interpretability trade-off. They have introduced interpretable ranking based on Generalized Additive Models (Neural RankGAMs), which explain their decisions and outperform previous ranking methods.

The research ecosystem around explainability is still in its infancy. Most research has focused on post-hoc analysis: analyzing a black-box model's decisions after prediction. Even these post-hoc analyses are not perfect; they offer limited interpretations for out-of-dataset instances and in some cases fail to explain model behavior. The other way to solve the interpretability problem is to build intrinsically interpretable models with a transparent and self-explanatory structure, in which every feature's effect on the prediction is visible and understandable, ensuring the decisions' explainability.

Also Read: Data Labeling And The Hidden Costs In Machine Learning

Generalized Additive Models (GAMs) seem to fit the bill. They are interpretable models that have been tried and tested on both regression and classification tasks. A GAM outputs the sum of multiple sub-models' predictions, where each sub-model takes only one feature as input, so each sub-model reflects the contribution of its feature to the final prediction. The Google researchers are the first to use them for ranking tasks, where the goal is to rank a list of items given some objective. They instantiate the ranking GAMs with neural networks and propose two architectures: context-free ranking and context-present ranking.
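The additive structure is what makes a GAM interpretable, and it fits in a few lines. The per-feature sub-model shapes below are made up for illustration; in Neural RankGAMs each would be a small neural network:

```python
import math

# Each sub-model sees exactly one feature; the score is their sum.
sub_models = {
    "page_rank": lambda x: 2.0 * x,          # toy contribution from PageRank
    "query_match": lambda x: math.log1p(x),  # diminishing returns on matches
    "load_time": lambda x: -0.5 * x,         # slower pages rank lower
}

def gam_score(features):
    """Ranking score = sum of independent single-feature sub-models."""
    return sum(f(features[name]) for name, f in sub_models.items())

def explain(features):
    """Per-feature contributions: the interpretability payoff of a GAM."""
    return {name: f(features[name]) for name, f in sub_models.items()}

doc = {"page_rank": 0.8, "query_match": 3.0, "load_time": 1.2}
contributions = explain(doc)
print(contributions)
print(abs(gam_score(doc) - sum(contributions.values())) < 1e-12)  # True
```

Because the score decomposes exactly into per-feature terms, one can read off how much each feature pushed a document up or down the ranking.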

Each sub-model was individually distilled to produce smaller models with higher inference speed, a lower memory footprint, and a more straightforward structure. The central intuition is to train a smaller, simpler model by minimizing the loss between its output and that of a larger, complex model.

Neural RankGAMs outperformed various other ranking models by a considerable margin on the YAHOO and Chrome Web Service benchmarks, and the researchers showed that the performance boost lies in the models' capacity to learn item-level features and list-level contexts.


Language Models Exhibit Larger Social Bias Than Human-Written Texts

Language models social bias

Current language models can produce convincing open-ended sentences from a short prompt, but they are riddled with controversies, from questionable correlations, such as tying Islam to terrorism, to propagating social bias. There was no benchmark for studying these harms, nor measures of the different social biases exhibited by language models.

A recent paper from Amazon Alexa and UC Santa Barbara researchers, published at the prestigious Association for Computational Linguistics (ACL), proposed BOLD (Bias in Open-Ended Language Generation Dataset), a standard benchmark for studies of bias and fairness in Natural Language Generation (NLG). The researchers are also the first to develop new automated metrics for toxicity, psycholinguistic norms, and text gender polarity.

The intuitive idea is to present the language models with carefully selected human-written natural prompts that elicit the biases reinforced in them. The BOLD dataset therefore contains 23,679 English prompts spread across five domains (profession, gender, race, religion, and political ideology) spanning 43 sub-groups. The prompts are taken from the naturally diverse writings of various authors on Wikipedia.

The researchers also automated the measurement of various biases and prejudices. Disrespectful, abusive, unpleasant, and harmful sentences generated from the prompts are considered toxic; a BERT model was trained separately on the Jigsaw toxic-comment dataset to predict the toxicity score of generated sentences.

Also Read: The Facebook MUPPET Show

For sentiment scores, they used the Valence Aware Dictionary and Sentiment Reasoner (VADER); scores greater than 0.5 and less than -0.5 convey positive and negative sentiment, respectively. A trained multitask feed-forward neural network predicts psycholinguistic norms at the word level, measuring each word's affective meaning along various dimensions.

Regard was defined as a human-annotated measure of bias capturing polarity towards a demographic rather than overall language polarity; a numeric Regard score was computed with ewsheng's bias classifier, trained on a biased dataset curated via GPT-2. To ascertain the gender polarity of a generated text, they used hard-debiased word2vec embeddings, re-weighting gender-polar words so they are not overshadowed by the many gender-neutral terms in the text.
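Embedding-based gender polarity is usually computed by projecting a word vector onto a "gender direction" such as he − she. The 3-d vectors below are invented stand-ins, not real word2vec embeddings, so this sketches the projection idea only:

```python
# Toy gender-polarity sketch: project word vectors onto the he - she axis.
# The embeddings are made up for illustration.
def sub(a, b): return [x - y for x, y in zip(a, b)]
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def norm(a): return dot(a, a) ** 0.5

EMB = {
    "he":    [1.0, 0.2, 0.0],
    "she":   [-1.0, 0.2, 0.0],
    "uncle": [0.8, 0.5, 0.1],
    "aunt":  [-0.7, 0.5, 0.1],
}

gender_dir = sub(EMB["he"], EMB["she"])  # axis pointing toward "male"

def gender_polarity(word):
    """Cosine similarity between a word vector and the gender axis."""
    v = EMB[word]
    return dot(v, gender_dir) / (norm(v) * norm(gender_dir))

print(gender_polarity("uncle") > 0)  # True: male-polar
print(gender_polarity("aunt") < 0)   # True: female-polar
```

Averaging such per-word scores over a generated text (with the re-weighting mentioned above) yields a text-level gender-polarity signal.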

Experiments on three popular language models (GPT-2, BERT, and CTRL) found that most professions, such as writing, science, art, and engineering, are skewed towards the male gender; only nursing is skewed towards the female gender. Negative sentiments were found to correlate more with males and positive ones with females, and darker races were associated with lower regard than their fair-skinned counterparts.

Christianity correlated with the lowest toxicity, while Islam and atheism were painted as highly toxic. The researchers concluded that most language models exhibit larger social bias than human-written Wikipedia text across all domains. They also note that the benchmark is not perfect: it covers limited disciplines and specific sub-groups, and considers only binary genders and a limited set of races.


Microsoft’s Gooseberry Treat For Quantum Computing

Microsoft Gooseberry Quantum Chip

In collaboration with the University of Sydney, Microsoft has built a cryogenic quantum controller chip, Gooseberry, for controlling thousands of qubits. They placed the whole control structure near the qubits themselves in a near absolute-zero environment, a first in the field. The work was featured in the prestigious journal Nature Electronics.

Quantum computing is in its infancy right now, comparable to the early days of classical computers. It promises a considerable deal of computing power and an entirely novel set of algorithms for some of the most troubling problems in computing, spanning cryptography, chemistry, weather forecasting, and many more fields. Its basic computing unit, the qubit, can encode much more information via superposition of 0 and 1, but has a terrible reputation for reacting to any perturbation. Since information is still encoded in and read from qubits via electrical signals, manipulating them is a matter of delicacy, calling for a controlling chip to reduce the error margins in information handling.

Also Read: IBM And Daimler Simulates Materials With Fewer Qubits

It is common practice in the quantum industry to place the controlling structures away from the qubits, to safeguard the information stored in them from electronic noise. The Microsoft researchers instead designed their chip interface to allow the control chip to sit with the qubits themselves: rather than a rack of room-temperature electronics generating electrical pulses for qubits housed in 0.3-kelvin refrigerators, the Gooseberry chip is placed in the refrigerator with the qubits. This arrangement results in a tightly regulated and stable environment.

The Microsoft researchers have also built a cryogenic compute core that operates at much warmer temperatures and performs the classical calculations essential for determining the instructions fed to the Gooseberry chip, which then delivers the electrical signals to the qubits directly. With room to generate more heat and perform more computation, the core enables general computing like any other CPU.


The Facebook MUPPET Show

Facebook Muppet pre-fine tuning

Facebook researchers have scaled up a relatively new technique, pre-finetuning (PFT), in their paper MUPPET, applying multi-task learning over 50 tasks on a vast scale of 4.8 million instances. They showed that PFT increases both the performance and sample efficiency of fine-tuned models like BERT, RoBERTa, and more, and even set new records on the RTE and HellaSWAG benchmarks.

The usual workflow in large-scale language modeling is pre-training via self-supervision over massive unlabeled datasets, then fine-tuning to suit the task at hand with relatively little labeled data. This arrangement works fine as long as the datasets and tasks are related; but for low-resource languages or individual tasks with very little labeled data, the training scheme leaves language models starved.

Also Read: Data Labeling And The Hidden Costs In Machine Learning

In 2019, a group of researchers introduced a pre-finetuning (PFT) stage, in a paper named 'Tri-Train,' that sits between pre-training and fine-tuning to overcome this problem. They constructed a small corpus by selecting sentences from the unlabeled pre-training data relevant to the labeled training data, then fine-tuned the pre-trained model on just two tasks: predicting the next word in sentences from the small corpus, and predicting the start and end words of those sentences.

Facebook's MUPPET (Massive Multi-task Representations with Pre-Finetuning) extends that work to new levels. The researchers used 50 diverse tasks spanning classification, summarization, question answering, and commonsense reasoning. Their investigation showed that naive multi-task learning schemes fail to learn useful representations and are unstable, but also that scale plays a significant role in multi-task learning.

Pre-finetuning with too few tasks degrades representation quality below the pre-trained baseline; beyond a critical point, usually above 15 tasks, performance improves linearly with the number of tasks.

The researchers used loss scaling and task-heterogeneous batches so that learning remains balanced across competing tasks, significantly improving training stability and overall performance. To train on several tasks, the model has task-specific heads, each optimizing a task-specific loss. They scaled each data point's loss so that, if the class distribution were uniformly distributed along with the model's predictions, all of the task-specific losses would have equivalent values.
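One way to realize that scaling (this is an interpretation of the description above, not MUPPET's actual code) follows from the fact that cross-entropy under a uniform prediction equals log(n_classes), so dividing each task's loss by log(n_classes) equalizes tasks with different label counts:

```python
import math

# Sketch of uniform-equalizing loss scaling: cross-entropy of a uniform
# prediction over n classes is log(n), so dividing by log(n) makes the
# "uniform baseline" loss equal to 1.0 for every task.
def scaled_loss(raw_ce_loss, n_classes):
    return raw_ce_loss / math.log(n_classes)

# Uniform predictions on a 2-class and a 100-class task...
uniform_ce_2 = math.log(2)      # CE of uniform prediction, 2 classes
uniform_ce_100 = math.log(100)  # CE of uniform prediction, 100 classes

# ...yield identical scaled losses, so neither task dominates training.
print(scaled_loss(uniform_ce_2, 2) == scaled_loss(uniform_ce_100, 100))  # True
```

Without the scaling, the 100-class task's raw loss would be several times larger and would dominate the shared gradient.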

Similarly, the researchers proposed task-heterogeneous batches to optimize several potentially competing objectives and create a global representation across the training tasks. During gradient descent, moving along the gradient of a single task may not be the optimal direction for learning a single unified representation across tasks. To overcome this, the model optimizes over batches consisting of several tasks: each worker samples a random batch from the set of tasks and computes a gradient, which is accumulated for the final update.
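The sampling-and-accumulation step can be sketched as follows. The scalar "gradients" and the task/batch names here are toy stand-ins for illustration, not MUPPET's implementation:

```python
import random

# Toy sketch of a task-heterogeneous update: gradients from batches drawn
# across several tasks are accumulated into one shared update.
def heterogeneous_step(task_batches, grad_fn, rng):
    tasks = list(task_batches)
    accumulated = 0.0
    for _ in range(len(tasks)):       # one sampled batch per "worker"
        task = rng.choice(tasks)      # tasks are mixed, not round-robin
        batch = task_batches[task]
        accumulated += grad_fn(task, batch)
    return accumulated / len(tasks)   # averaged into a single update

rng = random.Random(0)
batches = {"qa": [1, 2], "summarization": [3], "classification": [4, 5]}
# Stand-in "gradient": just the batch size, to keep the sketch runnable.
update = heterogeneous_step(batches, lambda t, b: float(len(b)), rng)
print(1.0 <= update <= 2.0)  # True: average of per-task batch sizes
```

The point of mixing tasks inside one accumulated update is that no single task's gradient direction dictates the step taken on the shared representation.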

The model also learns better representations than standard RoBERTa by leveraging representations from pre-finetuned models with 34-40 tasks. The scale factor is evident: the more tasks, the greater the data efficiency.
