
How do I start learning data science?


Back in the early 2010s, the shiny job titles belonged to web designers and programmers. Today, however, data scientists have taken their place.

The primary reason demand for data scientists has overtaken other professions is the need to make data-driven decisions. After all, to run a successful company in the 21st century, you need data that represents your target audience and your target market.

Over the last decade, a majority of companies have adopted data science and found that it helps drive substantial business growth. Unsurprisingly, the role attracts both young graduates and mid-career professionals.

So, if you wish to become a data scientist, this guide is curated just for you. Let’s look at how to start learning data science in this competitive environment.

What is Data Science?

Data science is the study of understanding and analyzing data, using modern tools and techniques to extract meaningful information and inform big business decisions.

To put it simply, data science is an interdisciplinary domain that uses scientific methods, processes, algorithms, and systems to extract insights from structured and unstructured data and apply that actionable knowledge across a wide variety of application domains.

Data processing, deep learning, and big data are also essential aspects of data science. It is mainly used to make decisions and forecasts with the help of predictive causal analytics, prescriptive analytics, and machine learning.

Key Skills to Be a Data Scientist

To become a data scientist, you must master some basic skills for collecting, understanding, and analyzing data, such as:

  • You must know the tools used to work with databases, such as Oracle® Database, MySQL®, Microsoft® SQL Server, and Teradata®.
  • You should know basic mathematical analysis, probability, and statistics. Statistics deals with collecting and describing data and calculating outcomes, while probability is the calculation of how likely events are. Mathematical analysis covers differentiation, integration, variables, measures, limits, vectors, and series (see the short sketch after this list).
  • Expertise in at least one programming language is the most important skill a data scientist must have. Data science relies on programming tools, which are the foundation of the discipline. You can consider learning R, Python, SAS, or any other language that meets the requirement.
  • Data wrangling is another requirement to become a data scientist as it helps in cleaning, manipulating, and organizing the data.
  • Being a data scientist, you must be able to interpret results. Data visualization is the integration of different data sets to generate visual representations of the data using diagrams, charts, and graphs. You can also take up a data science and business analytics course in India to equip yourself with adequate knowledge.
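To make the mathematics point concrete, here is a minimal Python sketch of the descriptive statistics and empirical probability a beginner should be comfortable with; the sales figures are made up for illustration:

import statistics

# Hypothetical daily sales figures
sales = [120, 135, 150, 110, 160, 145, 138]

mean = statistics.mean(sales)    # central tendency
stdev = statistics.stdev(sales)  # spread of the sample

# Empirical probability that a day's sales exceed the mean
p_above = sum(s > mean for s in sales) / len(sales)

print(f"mean={mean:.1f}, stdev={stdev:.1f}, P(sales > mean)={p_above:.2f}")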

How to start learning data science?

Data science is a challenging career choice, but a creative and fascinating field at the same time. To learn data science, you first need to acquire the key skills shared above. Once done, here is how you can begin.

As mentioned above, basic mathematics is required to enter the field of data science, so learning the required mathematical concepts should be the first step. Equations, differentiation, integration, calculus, and databases are some of the important concepts here.

You will also have to pick one programming language and master it. Data science involves a great deal of programming, which is why you must know at least one of the commonly used languages. Python and R are great choices to begin with.

Once done with basic mathematics and a programming language, move on to the pandas library. Understand what it is, how it works, and its advantages. pandas provides high-performance data frames that give you easy access to and understanding of your data, presenting it in tabular form. It includes tools for reading and writing data, handling missing data, cleaning messy data, and more.
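For a first taste, here is a minimal pandas sketch; the sales.csv file and its revenue and region columns are hypothetical:

import pandas as pd

# Read a CSV file into a DataFrame (tabular form)
df = pd.read_csv("sales.csv")

# Inspect the first rows and summary statistics
print(df.head())
print(df.describe())

# Handle missing and messy data
df = df.dropna(subset=["revenue"])       # drop rows with missing revenue
df["region"] = df["region"].str.strip()  # strip stray whitespace from text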

Further, you will need to learn machine learning and practice it. Machine learning is a complex field, so once you have completed a course, make sure you keep practicing on real datasets.
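A typical first practice exercise looks like the following scikit-learn sketch, which trains and evaluates a baseline classifier on a small built-in dataset:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load a small practice dataset and hold out a test split
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Fit a baseline model and check how well it generalizes
model = LogisticRegression(max_iter=200).fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))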

Data Science Certificate

To become a data scientist, you need at least a bachelor’s degree in data science or a computer-related field, and some roles require a master’s degree, so check the requirements before starting.

Project certifications, internship certificates, and qualification certificates are useful additions. If you graduated in a different field, you can also pursue a diploma through an online platform; many short data science courses are available online that you can pursue right away.

Current Scenario in India

The data scientist role has become one of the most sought-after professions, second only to that of the machine learning engineer, a closely related profession.

With emerging tech companies and educational institutions in India, data science is already booming, resulting in a surge of data science job openings. These opportunities are not just in the private sector but also in government, as nearly all organizations make the digital transition.

Moreover, during the pandemic, data scientists emerged as guides for shifting business operations online using big data, cloud computing, machine learning, and more advanced Artificial Intelligence (AI).

So far, data scientists have been able to make significant strides in client data processing, Robotic Process Automation (RPA), cybersecurity, banking, healthcare, manufacturing, logistical supply chain AI, retail, workplace connectivity, and e-commerce as a result of increased demand.

In the private sector, entry-level data scientists in India can earn around 5 lakh rupees per annum, while experienced data scientists earn upwards of 6 lakh rupees per annum. According to LinkedIn’s 2020 report, data scientist jobs grew at a rate of over 37% that year.

Current Scenario in Other Countries

There are six countries where data science expertise is in notably short supply. France stands at the top of the list, with high demand for data scientists driven by the startups that have emerged in recent years.

Germany sits at the other end of the list and, despite its reliance on technology, could face a shortage of 3 million skilled workers by 2030. Other countries such as Sweden, the UK, Finland, and Ireland face similar shortages.

The Bottom Line

Today, this field is considered a vast and growing one in which to build a career. In summary, data science is the science of analyzing data efficiently with modern techniques and of turning complex data into user-friendly insights.

Many institutes in India and across the world provide free beginner data science courses. Mathematics is the foundation on which a data science career is built.

Further, knowing a programming language or two is a must for anyone choosing data science. The path to your destination in this field runs through at least a degree or a diploma in a computing field.

Through Artificial Intelligence and automation, data science is poised to transform many industries, including health care, transportation, business, finance, and manufacturing.


Microsoft Releases A Library For PowerBI In Jupyter Notebook


Microsoft has released a library that brings PowerBI to Jupyter notebooks, letting developers leverage reports from one of the most widely used business analytics tools. For years, the Jupyter notebook has been the go-to tool for data scientists, but creating reports with Matplotlib, Seaborn, and other libraries is not as straightforward as one might think. What beats drag-and-drop plotting in analytics platforms like PowerBI, Qlik, and Tableau?

However, business analytics platforms are not hugely popular with data scientists, since considerable data preprocessing is required to improve quality and make information ready for visualization. Although data visualization and report generation require numerous lines of code, data scientists still rely on Jupyter notebooks; switching to a business analytics platform just to visualize is too much to ask.

In addition, machine learning practitioners prefer to showcase everything within Jupyter Notebooks as it is easier to have everything in one place.

At Microsoft Build, the company removed a major pain point for data scientists visualizing in Jupyter notebooks: PowerBI analytics can now be accessed through the new powerbiclient Python package. The Python package and its associated TypeScript widget are available on GitHub.

You can install the package with:

pip install powerbiclient

The ability to bring in PowerBI reports makes the workflow easier for data scientists, as they can collaborate with other developers with ease. All they have to do is import reports developed by other professionals, such as data analysts or business analysts, with PowerBI. Data scientists can focus on the advanced work while easily importing the reports of others.

To embed a report in a Jupyter notebook, you authenticate against PowerBI using Azure AD, provide the workspace ID and report ID of the report to embed, and then load the report into the output cell.

from powerbiclient import Report, models
# Import the DeviceCodeLoginAuthentication class to authenticate against Power BI
from powerbiclient.authentication import DeviceCodeLoginAuthentication

# Initiate device authentication
device_auth = DeviceCodeLoginAuthentication()

# Workspace (group) ID and report ID of the report to embed, from the Power BI service
group_id = ""
report_id = ""

report = Report(group_id=group_id, report_id=report_id, auth=device_auth)

# Render the report in the notebook output cell
report

Check out the demo here.


LinkedIn Releases Greykite, A Library For Time Series Forecasting


LinkedIn has released a time-series forecasting library, Greykite, to simplify prediction for data scientists. The library’s primary forecasting algorithm is Silverkite, which automates forecasting. LinkedIn developed Greykite to help its teams make effective decisions based on time-series forecasting models. Since the library also helps interpret outputs, it can become a go-to tool for most time-series forecasting. Last year, LinkedIn also released a Fairness Toolkit for explainability in machine learning.

Over the years, LinkedIn has been using the Greykite library to provide sufficient infrastructure to handle peak traffic, set business targets, and optimize budget decisions. 

Figure 1: Silverkite algorithm architecture

According to LinkedIn, the Silverkite algorithm architecture is shown in Figure 1. The green parallelograms represent model inputs, and the orange ovals represent model outputs. The user provides the input time series and any known anomalies, events, regressors, or changepoint dates. The model returns forecasts, prediction intervals, and diagnostics.

Often, time-series models fail to account for seasonality and other infrequent events, making precise predictions difficult. This is where LinkedIn’s Greykite library assists data scientists working with seasonality and holidays. Users can fit models to their requirements and work effectively with changepoints and seasonality.

Since the models not only forecast but also provide exploratory plots, templates for tuning, and explainable forecasts, the Greykite library can be used for quick prototyping and deployment at scale. 

To benchmark the performance of LinkedIn’s Greykite library, the researchers used several datasets: the Peyton Manning dataset, the Daily Australia Temperature dataset, and the Beijing PM2.5 dataset. Silverkite outperformed Prophet, Facebook’s open-source forecasting algorithm, and ran four times faster.

Currently, the LinkedIn Greykite library also supports Prophet and will add more open-source algorithms to allow data scientists to work on diverse forecasting requirements. 
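As a rough illustration of the workflow, here is a minimal sketch based on Greykite’s documented Forecaster API; it assumes a pandas DataFrame df with a timestamp column ts and a value column y, which you would load yourself:

from greykite.framework.templates.autogen.forecast_config import ForecastConfig, MetadataParam
from greykite.framework.templates.forecaster import Forecaster

# Forecast 30 future points with the Silverkite model template
forecaster = Forecaster()
result = forecaster.run_forecast_config(
    df=df,  # assumed: your time series with "ts" and "y" columns
    config=ForecastConfig(
        model_template="SILVERKITE",
        forecast_horizon=30,
        metadata_param=MetadataParam(time_col="ts", value_col="y"),
    ),
)
print(result.forecast.df.tail())  # forecasts with prediction intervals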

Check out LinkedIn’s Greykite library on GitHub, along with its documentation.


Alphabet Rejects DeepMind’s Request For Autonomy


DeepMind’s founders have for years shown interest in making DeepMind an independent artificial intelligence research center. The idea behind autonomy was to prevent a single entity from controlling advanced artificial intelligence research. However, according to sources, Alphabet has rejected the request for complete independence.

Blue-chip companies are keen to acquire innovative artificial intelligence solutions and thereby control groundbreaking technologies. A similar instance occurred when Microsoft exclusively licensed OpenAI’s GPT-3, a move criticized by the likes of Elon Musk, one of OpenAI’s original backers.

But why shouldn’t companies that fund these research centers get a say in how technological advancements are handled?

Bought by Google for $500 million in 2014, DeepMind has been doing groundbreaking research to advance artificial intelligence. Most recently, DeepMind solved a 50-year-old grand challenge in biology by predicting protein structures. But DeepMind incurs losses of around $600 million a year, and in December 2020 Alphabet waived a debt of $1.5 billion. Supporting such a research center for nothing would mean a substantial long-term loss for its backers.

GPT-3 reportedly cost $12 million for a single training run, making such research extremely cost-intensive. If a single training run takes a sizable chunk of your funding, only a few big tech companies can invest in such projects, and they can do so only because they are profit-oriented. Although this concentrates major innovation in a few companies, the resulting work genuinely furthers the development of artificial intelligence.

In addition, when big tech companies invest in these artificial intelligence research centers, they do so on terms that grant them intellectual property rights; Microsoft’s investment in OpenAI allowed it to license GPT-3 exclusively. Blue-chip companies invest to gain profit, and so does Alphabet with DeepMind.


Top 15 Datasets For Sentiment Analysis With Significant Citations


Sentiment analysis is one of the most common tasks machine learning enthusiasts perform to understand tone, opinions, and other sentiments. Over the years, sentiment analysis datasets were mostly created by extracting information from social media platforms. But with the increase in unstructured data within organizations, companies have been actively leveraging natural language processing techniques to gain unique insights and make quick decisions. Today, sentiment analysis lets organizations monitor brand and product sentiment among their customers. Consequently, working with sentiment analysis datasets helps job seekers gain expertise in handling unstructured data and helps companies make effective decisions.

Sentiment analysis datasets are not limited to organizations; researchers have used rule-based models, automated models, and combinations of both to gauge the sentiment behind text and advance artificial intelligence techniques. Neural network models are prevalent in the field for their sheer performance. But all of these models need data to be trained on, especially clean, well-annotated data. This is where benchmark sentiment analysis datasets come in.

Among all the available sentiment analysis datasets, here are some of the most highly cited:

1. General Language Understanding Evaluation (GLUE) Benchmark

Based on a paper on multi-task benchmarking and analysis for Natural Language Understanding (NLU), the GLUE benchmark offers a binary sentiment classification task, SST-2, along with eight other NLU tasks. Current state-of-the-art models are trained and tested on it because of its variety of divergent tasks, and a wide range of models can be evaluated for the linguistic phenomena found in natural language.

Download: python download_glue_data.py --data_dir glue_data --tasks all

Source: Wang et al. GLUE 

2. IMDb Movie Reviews

Hosted by Stanford, this beginner-friendly binary sentiment analysis dataset consists of 50,000 reviews from the Internet Movie Database (IMDb). A review scoring 7 or higher is labeled positive, and a review scoring 4 or lower is labeled negative; only highly polarizing reviews are included. The dataset contains equal numbers of positive and negative reviews, with no more than 30 reviews per movie.

Download: Link

Source: Andrew L. Maas et al.
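As an illustration of the task this dataset benchmarks, here is a minimal scikit-learn sketch of binary sentiment classification; the four reviews are made-up stand-ins for the real 50,000:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Made-up stand-ins for IMDb reviews; 1 = positive, 0 = negative
reviews = [
    "a masterpiece, gripping from start to finish",
    "dull plot and wooden acting, a waste of time",
    "wonderful performances and a moving story",
    "terrible pacing, I nearly walked out",
]
labels = [1, 0, 1, 0]

# TF-IDF features plus logistic regression: a common baseline for this task
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(reviews, labels)
print(clf.predict(["an absolute delight to watch"]))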

3. DynaSent

DynaSent is an English-language sentiment analysis dataset with positive, negative, and neutral labels. It combines naturally occurring sentences with sentences created using the open-source Dynabench Platform, which facilitates human-and-model-in-the-loop dataset creation. DynaSent has a total of 121,634 sentences, each validated by five crowd workers. The dataset also contains the Stanford Sentiment Treebank dev set with labels.

Download: Link

Source: Potts et al.

Also Read: Microsoft Announces The Support Of Hindi For Sentiment Analysis

4. MPQA Opinion Corpus (Multi-Perspective Question Answering)

The MPQA Opinion Corpus contains 535 news articles from a wide variety of sources, manually annotated for opinions, beliefs, emotions, sentiments, speculations, and more. The data is strictly for research and academic purposes only.

Download: Link

Source: Janyce Wiebe et al. 

5. ReDial

ReDial (Recommendation Dialogues) is an annotated dataset of dialogues for sentiment analysis in which users recommend movies to each other. The dataset consists of over 10,000 conversations centered around the theme of providing movie recommendations. Several example conversations from the validation set can be useful for getting started.

Download: Link

Source: Li et al.

6. AG’s Corpus 

Antonio Gulli’s corpus of news articles is a collection of more than 1 million news articles, gathered from more than 2,000 news sources by ComeToMyHead over more than a year. The dataset can be used for non-commercial purposes only, and you cannot redistribute it under a different name.

Download: Link

Source: Gulli in AG’s corpus of news articles

7. Amazon Fine Foods

The paper From Amateurs to Connoisseurs: Modeling the Evolution of User Expertise through Online Reviews, which uses the Amazon Fine Foods dataset, has been cited over 400 times. The dataset consists of roughly 500,000 reviews up to October 2012 by 256,059 users. A total of 74,258 products have been reviewed, with a median of 56 words per review.

Download: Link  

Source: McAuley et al. 

8. SPOT (Sentiment Polarity Annotations Dataset)

Collected from Yelp’13 and IMDb, the SPOT sentiment analysis dataset contains 197 reviews annotated with segment-level polarity labels (positive/neutral/negative). Annotations were gathered at two levels of granularity: sentences and Elementary Discourse Units (EDUs). The dataset is ideal for evaluating methods that predict sentiment at a fine-grained, segment level.

Download: Link

Source: Angelidis et al.

9. Youtubean

Youtubean is a dataset created from the closed captions of YouTube product review videos. It can be used for a range of sentiment analysis tasks such as aspect extraction and sentiment classification. The dataset was used for the paper ‘Mining fine-grained opinions on closed captions of YouTube videos with an attention-RNN.’

Download: GitHub

Source: Marrese-Taylor et al.

10. ReviewQA

ReviewQA is a question-answering dataset built on hotel reviews and proposed for sentiment analysis tasks. Its questions are linked to a set of relational understanding competencies, and each question comes with a type that characterizes the required competency.

Download: GitHub

Source: Grail et al. 

11. iSarcasm

Twitter datasets are among the go-to sources for sentiment analysis. iSarcasm is a dataset of tweets labeled for intended sarcasm. Each tweet is labeled as either sarcastic or non_sarcastic, and sarcastic tweets are further labeled with the type of ironic speech: sarcasm, irony, satire, understatement, overstatement, or rhetorical question.

Download: GitHub

Source: Oprea et al.

12. PHINC

PHINC is a parallel corpus of 13,738 code-mixed English-Hindi sentences and their English translations. According to the researchers, the translations were manually annotated. This is one of the best sentiment analysis datasets for the language mixing that is highly common in India.

Download: Link

Source: Srivastava et al.

13. XED

XED is a multilingual, fine-grained emotion dataset for sentiment analysis. It consists of human-annotated Finnish (25k) and English (30k) sentences, along with projected annotations for 30 additional languages, providing new resources for many low-resource languages.

Download: GitHub

Source: Öhman et al.

14. MultiSenti

MultiSenti offers code-switched informal short texts on which a deep learning-based model can be trained for sentiment classification. The developers provide a pre-trained word2vec model, which can be accessed here.

Download: Link

Source: Shakeel et al.

15. PerSenT

The PerSenT dataset contains crowd-sourced annotations of the sentiment an author expresses towards the main entities in a news article. It also includes paragraph-level sentiment annotations to provide more fine-grained supervision for the task.

Download: Link

Source: Bastan et al.


Google I/O Introduces LaMDA, A Breakthrough Conversational AI Technology

Sundar Pichai, CEO of Alphabet, demonstrated LaMDA (Language Model for Dialogue Applications), a breakthrough conversational AI technology. Like other language models, LaMDA is a Transformer-based neural network, but it is trained on dialogue. Since Google Research released the Transformer in 2017, several large-scale language models such as GPT-3, DeBERTa, and RoBERTa have revolutionized the artificial intelligence industry. Today, language models can generate code, summarize articles, and more.

However, Transformer-based models can be heavily limited to specific tasks, or require retraining pre-built models on new information to perform effectively across a wide range of tasks.

To make models topic- and task-agnostic, Google blazed a trail and trained LaMDA on dialogue, the domain of chatbots. “During its training, it picked up on several of the nuances that distinguish open-ended conversation from other forms of language. One of those nuances is sensibleness. Basically: Does the response to a given conversational context make sense?” mentions Google.

Google has seen superior performance from LaMDA when posing questions to it. The video below demonstrates LaMDA’s capability for open-ended conversation.

LaMDA resulted from Google’s earlier research, published in 2020, which showed that Transformer-based language models could be trained on dialogue to improve performance on numerous tasks. Since it is still early in the research, Google is committed to further developing conversational AI technologies.

For now, Google will focus on the sensibleness and satisfyingness of LaMDA’s responses in order to enhance them further.


Facebook AI Introduces Expire-Span To Make Artificial Intelligence More Human


Humans have the ability to forget unnecessary information, making space for the new patterns that matter when making decisions. Facebook AI is moving the needle toward this for machine learning models with Expire-Span. As machine learning models learn new patterns, they keep collecting information, making them computation-intensive. As a workaround, researchers embraced compression techniques, where less relevant data is compressed; but this resulted in blurry memories for tasks that require models to look a long way back for accuracy.

To eliminate this challenge, Facebook AI introduced a novel approach — Expire-Span — to set the expiration time of data. According to the researchers, Expire-Span is a first-of-its-kind operation that enables neural networks to forget at scale. It allows machine learning-based systems to make space for more information while reducing the computational requirements.


For instance, if a machine learning model is tasked with finding a yellow door, it stores all the patterns collected while iterating toward the right path. Even after finding the correct patterns, it remembers other unnecessary details that may never help it achieve its goal again. This is where Facebook AI’s Expire-Span approach makes strides toward human-like abilities by deleting nonessential data.

Also Read: Brain Storage Scheme Can Solve Artificial Networks’ Memory Woes

“Expire-Span calculates the information’s expiration value for each hidden state each time a new piece of information is presented, and determines how long that information is preserved as a memory,” the researchers note. Expire-Span determines the span based on context learned from the data and influenced by surrounding memories, and the span size can be adjusted later to retain information for a longer period.
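The core mechanism can be sketched in a few lines of PyTorch; this is a simplified illustration of the idea described above, not Facebook AI’s actual implementation, and the function and parameter names are made up:

import torch

def expire_span_mask(memories, t, w, b, max_span, ramp=16.0):
    # memories: (n, d) tensor of past hidden states, oldest first
    # Each memory predicts its own expiration span from its hidden state
    spans = max_span * torch.sigmoid(memories @ w + b)  # shape (n,)
    ages = t - torch.arange(memories.size(0), dtype=torch.float32)
    # Soft mask: 1 while a memory is younger than its span, decaying to 0 over the ramp
    mask = ((spans - ages) / ramp).clamp(0.0, 1.0)
    return mask  # memories with mask 0 have expired and can be dropped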


Facebook AI researchers evaluated models equipped with Expire-Span against state-of-the-art models; the Expire-Span models required less computation while delivering comparable performance. “The impressive scalability and efficiency of Expire-Span has exciting implications for one day achieving a wide range of difficult, human-like AI capabilities that otherwise would not be possible,” wrote the researchers.

The research is in its early stages, and Facebook AI is committed to further enhancing the capabilities of the approach. Nevertheless, the researchers believe that Expire-Span can go beyond research and help in real-world applications.

Read the complete research paper here.


DeepLearning.AI Launches MLOps Specialization


DeepLearning.AI has launched the Machine Learning Engineering for Production (MLOps) Specialization to help learners become industry-ready. MLOps has become essential for organizations building robust machine learning-based solutions, yet there is a dearth of courses that teach learners to build end-to-end AI solutions.

Taught by Andrew Ng and instructors from Google and Pulsar, the DeepLearning.AI MLOps specialization comprises four courses: introduction to machine learning in production, machine learning data lifecycle in production, machine learning modeling pipelines in production, and deploying machine learning models in production.

Andrew Ng announced on LinkedIn the release of the MLOps specialization by DeepLearning.AI on Coursera. The specialization covers the designing of an ML production system, modeling strategies, development requirements, establishing a model baseline, building data pipelines, and more.

Also Read: Google Launches Professional ML Engineering Certification

“Being able to train ML models is essential. And, to build an effective AI career, you need production engineering skills as well, to build and deploy ML systems. With this specialization, you can grow your knowledge of ML into production-ready skills,” wrote Andrew Ng on LinkedIn.

Since the specialization is categorized as advanced level, it has prerequisites: knowledge of Python and familiarity with deep learning frameworks like PyTorch, Keras, or TensorFlow.


Coursera Is Offering Free Machine Learning Courses With Certificates In India


As India navigates an unprecedented second wave of COVID-19, various organizations, including Coursera, have come together to assist the country. Coursera has curated a special collection of courses to offer for free, with certificates, in India.

The curated collection includes not only artificial intelligence, cloud, and application development courses but also personal development and public health. The offer is available until June 30.

The discount is applied automatically at checkout. According to Coursera, you can enroll in only one course during the offer period; however, we were able to enroll in multiple courses.

Coursera had earlier come up with a similar offer on its 9th anniversary, where it offered numerous courses for free.

Some of the popular courses offered in this initiative are Getting Started with AWS Machine Learning, Version Control with Git, Introduction to Programming with MATLAB, Google Cloud Platform Fundamentals for AWS Professionals, and more.

You can also enroll in other courses on resume building, Android application development, and more.

You can check the complete list of free courses from Coursera here.


Microsoft Open-Sources Counterfit, A Tool To Automate Security Testing In Machine Learning Models


Microsoft’s Counterfit release will allow the artificial intelligence community to quickly find security flaws in machine learning-based business applications. As the adoption of AI applications proliferates in both business and consumer markets, the need to keep personal information from leaking out of ML models grows with it.

According to a Microsoft survey, 25 out of 28 businesses do not have the right tools to secure their AI systems. Unlike other applications, AI-based software is prone to a wide range of security attacks, including adversarial attacks and data leaks. Such attacks not only damage an organization’s brand but also lead to monetary losses under the stringent data privacy laws now in place.

Since machine learning applications vary widely in their algorithms and architecture, companies usually have to address each application’s security shortcomings individually. To assist organizations, Microsoft has released Counterfit, which can be leveraged with most machine learning systems.

Counterfit was born out of Microsoft’s internal need to pinpoint vulnerabilities in its own AI systems. Over the years, the company enhanced Counterfit into a generic automation tool that can evaluate multiple AI systems at scale. Today, Counterfit is environment-, model-, and data-agnostic, making it an ideal tool for numerous use cases.

“Under the hood, Counterfit is a command-line tool that provides a generic automation layer for adversarial AI frameworks such as Adversarial Robustness Toolbox and TextAttack,” mentions Microsoft.

Users can leverage Counterfit for penetration testing and red-teaming of AI systems, vulnerability scanning, and logging.
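Counterfit itself is driven from the command line, but since it wraps frameworks such as the Adversarial Robustness Toolbox (ART), a minimal sketch of the kind of black-box evasion attack it automates can be written directly against ART; the toy model and data below are illustrative only:

import numpy as np
from sklearn.linear_model import LogisticRegression
from art.estimators.classification import SklearnClassifier
from art.attacks.evasion import HopSkipJump

# Train a toy model to attack (illustrative data)
X = np.random.rand(200, 4).astype(np.float32)
y = (X[:, 0] > 0.5).astype(int)
model = LogisticRegression().fit(X, y)

# Wrap the model for ART and run a black-box evasion attack
classifier = SklearnClassifier(model=model)
attack = HopSkipJump(classifier=classifier, targeted=False, max_iter=10)
X_adv = attack.generate(x=X[:5])
print("clean:", model.predict(X[:5]), "adversarial:", model.predict(X_adv))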

Microsoft relies heavily on Counterfit to make its own artificial intelligence applications robust before shipping them to market. The tool is also being piloted to find AI-specific vulnerabilities during development, before models and applications hit production.
