
IIT Madras Offers 2-year AI Fellowship that Pays ₹40,000 Per Month

Applications have been reopened for the well-known artificial intelligence fellowship offered by the Robert Bosch Centre for Data Science and AI (RBCDSAI) at IIT Madras.

Graduates and postgraduates with an extraordinary academic record who are interested in working on the newly emerging fields of artificial intelligence and data science are eligible for the two-year fellowship. Throughout the course of the fellowship, those chosen will receive a monthly stipend of ₹40,000.

The primary goal of the IIT Madras artificial intelligence fellowship is to provide a platform for developing research abilities, giving the chosen applicants ample opportunity to study extensively, collaborate with colleagues, and develop creative solutions. The programme also aims to instill ethical AI practices, ensuring that participants leave the fellowship with the ethical competence their future work will demand.

Read More: OpenAI’s Sam Altman Launches Cryptocurrency Project Worldcoin

Candidates with a four-year bachelor’s degree or a master’s degree in a field related to RBCDSAI are eligible for the fellowship. The candidate must be younger than 27 on March 31, 2023. In addition to having a stellar academic record, applicants should be interested in fields relevant to data science and artificial intelligence. The candidate must accept the offer right away, be available full-time, and propose a joining date within six weeks of receiving the offer letter.

Participants will gain knowledge of artificial intelligence and data science, interact with leading experts from across the world, and get the chance to collaborate on projects with illustrious organizations and businesses including Google, NASA, CMU, Walmart, JHU, MIT, The Ohio State University, and Harvard.

Research participants will also have access to cutting-edge high-performance CPU and GPU computing infrastructure. Fellows may also earn course credits while receiving guidance from IIT Madras faculty.

Candidates who are interested and qualified must submit their applications through this Google Form, along with information like their selection of three RBCDSAI faculty members they would like to collaborate with. During the fellowship, chosen fellows may also apply for a PhD programme. However, they will need to give one month’s notice if they decide to leave. 


Important Topics in Machine Learning that Every Data Scientist Must Know

Image Credits: Shutterstock

With the advent of Artificial Intelligence, organizations are now more inclined towards digitalization and automation of operations. The functions of a data scientist have become central to decision-making in all types of businesses. A well-rounded pedagogy for machine learning and data science courses typically includes the following elements:

  • Theoretical foundations: Covering the mathematical concepts and principles behind different machine learning algorithms, such as probability, statistics, optimization, and generalization.
  • Hands-on practice: Implementing and experimenting with various machine learning algorithms on different data types using popular programming tools such as Python, R, and Matlab.
  • Data preparation and pre-processing: Teaching students the importance of data quality, cleaning, feature engineering, and data preparation and pre-processing techniques.
  • Model evaluation and selection: Emphasizing the importance of evaluating and comparing different models and selecting the most appropriate model for a given problem.
  • Real-world application: Providing examples of machine learning applications in various domains such as computer vision, natural language processing, and recommender systems.
  • Communication and interpretation: Emphasizing the importance of effectively communicating the results of machine learning models and understanding and interpreting the outputs of these models.
  • Ethics and safety: Teaching the ethical, societal, and safety considerations that arise with machine learning, such as bias, fairness, and explainability.
  • Continuous learning: Advising students to stay updated with the latest developments and advancements in the field, and to keep learning and experimenting with machine learning.

The best data science and machine learning course would include the following topics:

1. Data structures: 

They play a crucial role in machine learning as they provide a way to organize and manipulate data efficiently. Here are some commonly used data structures in machine learning:

•        Arrays: An ordered collection of elements, often stored in a contiguous memory block.

•        Lists: Dynamic data structures that can grow or shrink in size, storing elements as separate nodes in memory.

•        Tuples: An ordered, immutable collection of elements that can store different data types.

•        Dictionaries: A key-value mapping data structure with unique keys used to look up values.

•        Sets: A collection of unique elements, often used for set operations like union, intersection, etc.

•        Matrices: Two-dimensional data structures widely used in linear algebra and numerical computations in machine learning.

•        Trees: Hierarchical data structures in which each node has a parent and zero or more children, used for decision-making and data classification tasks.

•        Graphs: Data structures that represent a set of vertices and the edges that connect them. Applications include recommendation systems and social network analysis.

Additionally, specialized data structures like heaps, hash tables, and Bloom filters are helpful in specific scenarios in machine learning. Understanding these data structures and their operations helps select the proper structure for the task and improves the algorithms’ performance.
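In Python, the most widely used language for machine learning, most of these structures are built in. A minimal sketch (the values are illustrative only):

```python
# Plain-Python versions of the core data structures above.
features = [0.5, 1.2, 3.4]              # list: ordered, mutable, resizable
point = (1.0, 2.0)                      # tuple: ordered and immutable
label_map = {"cat": 0, "dog": 1}        # dictionary: unique keys mapped to values
vocab = {"the", "a", "an"}              # set: unique elements, fast membership tests

# A matrix as a list of rows, and a small matrix-vector product:
# the kind of operation libraries like NumPy perform efficiently at scale.
X = [[1, 2], [3, 4]]
w = [0.5, -0.5]
scores = [sum(x * wi for x, wi in zip(row, w)) for row in X]
print(scores)  # [-0.5, -0.5]
```

In practice, matrices would be NumPy arrays rather than nested lists, but the choice of structure still drives how efficiently an algorithm can access and transform the data.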

2. Machine Learning life-cycle: 

It consists of six stages:

  • Problem Definition: Clearly define the problem and determine the goal of the model.
  • Data Collection: Gather and pre-process relevant data to train the model.
  • Data Preparation: Clean, format, and split the data into training and testing sets.
  • Model Selection: Choose an appropriate algorithm and fine-tune the hyperparameters.
  • Model Training: Train the model using the prepared data.
  • Model Evaluation: Evaluate the model’s performance using accuracy, precision, recall, etc.

It is important to note that the machine learning life cycle is an iterative process, with each stage influencing the next. For example, if the data is not of high quality, it may be necessary to go back to the data collection stage to gather more data or improve the pre-processing steps. Similarly, if the model is not performing well, it may be necessary to return to the model selection stage to choose a different algorithm or fine-tune the hyperparameters. If the model performs well, it is deployed in the production environment and monitored for performance and accuracy. Models are also periodically retrained to incorporate new data and maintain performance.
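The six stages can be sketched end to end with a toy example. The dataset and the 1-nearest-neighbour model below are invented for illustration; a real project would use a proper library and far more data:

```python
# 1. Problem definition: classify 1-D points as class 0 or class 1.

# 2. Data collection: a tiny hand-made dataset of (feature, label) pairs.
data = [(0.1, 0), (0.4, 0), (0.35, 0), (0.9, 1), (1.1, 1), (0.95, 1)]

# 3. Data preparation: split into training and testing sets.
train, test = data[:4], data[4:]

# 4./5. Model selection and training: 1-nearest-neighbour simply
# memorises the training set and predicts the label of the closest point.
def predict(x):
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]

# 6. Model evaluation: accuracy on the held-out test set.
correct = sum(predict(x) == y for x, y in test)
accuracy = correct / len(test)
print(f"accuracy = {accuracy:.2f}")
```

If the accuracy were poor, the iterative nature of the life cycle would send us back to earlier stages, collecting more data or choosing a different model.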

3. Languages: 

Machine learning can be executed using several programming languages, including:

  • Python: The most widely used language for machine learning due to its simplicity, vast libraries (e.g., TensorFlow, PyTorch, Scikit-learn), and strong community support.
  • R: A statistical programming language widely used in academic and research settings. It offers several packages for machine learning, such as caret and mlr.
  • Java: A popular language for building enterprise-level applications, with strong support for machine learning through libraries such as Weka and Deeplearning4j.
  • Julia: A high-level programming language designed for numerical and scientific computing, with strong support for machine learning through packages such as Flux.jl.
  • Scala: A statically typed programming language that runs on the Java Virtual Machine and supports machine learning through libraries such as Spark MLlib.

The choice of language for machine learning depends on the specific project requirements and the expertise of the data scientists and developers involved. Python and R are the most widely used languages, while Java, Julia, and Scala are used for more specific projects.

4. Data visualization: 

There are several platforms for data visualization in machine learning, including:

  • Matplotlib: A plotting library in Python that provides functionality for creating a variety of static, animated, and interactive visualizations.
  • Seaborn: A Python library based on Matplotlib that provides advanced visualization capabilities, including heatmaps, violin plots, and box plots.
  • Tableau: A data visualization and BI tool providing interactive dashboards and visualization capabilities.
  • ggplot2: A plotting library in R that provides a flexible and intuitive syntax for creating static, animated, and interactive visualizations.
  • Plotly: A cloud-based platform with advanced visualization capabilities, including interactive dashboards and 3D visualizations.

These platforms provide a range of options for visualizing and exploring data, from simple bar and line charts to more advanced visualizations such as heatmaps and interactive dashboards. The choice of platform depends on the project’s specific requirements, the data scientists’ skills, and the available resources.
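As a minimal sketch of the first of these platforms, the snippet below uses Matplotlib to draw a line chart of training loss and a bar chart of per-class accuracy (both datasets are invented) and save them to a PNG file:

```python
import matplotlib
matplotlib.use("Agg")          # non-interactive backend, suitable for scripts
import matplotlib.pyplot as plt

# Illustrative metrics from a hypothetical training run.
epochs = [1, 2, 3, 4, 5]
loss = [0.9, 0.6, 0.45, 0.38, 0.35]
classes = ["cat", "dog", "bird"]
accuracy = [0.92, 0.88, 0.75]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.plot(epochs, loss, marker="o")
ax1.set(title="Training loss", xlabel="epoch", ylabel="loss")
ax2.bar(classes, accuracy)
ax2.set(title="Per-class accuracy", ylim=(0, 1))
fig.tight_layout()
fig.savefig("training_report.png")
```

Seaborn, ggplot2, and Plotly expose similar chart types behind higher-level or interactive interfaces, so the same exploration translates across platforms.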

5. Machine learning in various industries: 

ML has been applied in multiple industries, including:

  • Healthcare: It is used for diagnosis, prognosis, and personalized treatment plans.
  • Finance: It is used for fraud detection, risk management, and algorithmic trading.
  • E-commerce: It is used for personalized recommendations, customer segmentation, and pricing optimization.
  • Transportation: It is used for route optimization, predictive maintenance, and autonomous vehicles.
  • Manufacturing: It is used for quality control, predictive maintenance, and supply chain optimization.
  • Agriculture: It is used for yield prediction, soil analysis, and precision farming.
  • Education: It is used for personalized learning, student assessment, and educational data analysis.

Industries use machine learning to automate processes, make predictions, and gain insights from data. The applications are diverse and continue to grow as machine learning advances. Since the field is ever-evolving, data scientists should be up-to-date with all new developments to create the most value.

Conclusion

In data science and machine learning, essential knowledge is pivotal. As industries embrace digital transformation, data scientists play crucial roles in decision-making. A comprehensive curriculum covers theoretical foundations, practical implementation, data preprocessing, model assessment, real-world applications, effective communication, and ethics. A standout course includes understanding data structures, the machine learning life cycle, programming languages, data visualization, and industry applications. With the field’s constant evolution, staying updated is critical for sustained success.


Researchers Propose to Avoid Harmful Content by Having LLMs Filter Their Own Responses

Image Credits: Shutterstock

Researchers from the Georgia Institute of Technology have proposed a simple defense against harmful content generation by large language models: having a large language model filter its own responses. Their results show that even if a model is not fine-tuned to be aligned with human values, it is possible to stop it from presenting harmful content to users by validating the content with a language model.

LLMs have been shown to have the potential to generate harmful content in response to user prompting. There has been a focus on mitigating these risks through methods like aligning models with human values via reinforcement learning. However, it has been shown that even aligned language models are susceptible to adversarial attacks that bypass their restrictions on generating harmful text. This is where the newly proposed method comes in.

The approach for filtering out harmful LLM-generated content works by feeding the output of a model into an independent LLM, which validates whether or not the content is harmful. Because only the generated content, and not the user’s prompt, is passed to the validator, the approach potentially makes it harder for an adversarial prompt to influence the validation model.
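A rough sketch of this filtering loop is shown below. The function names, prompt wording, and the stub validator are assumptions for illustration, not the paper’s actual implementation; `ask_llm` stands in for a call to any real LLM API:

```python
def is_harmful(generated_text, ask_llm):
    """Ask a separate validator LLM whether generated_text is harmful.

    Only the generated content is validated, never the user's prompt,
    so an adversarial prompt cannot address the filter directly.
    """
    question = (
        "Does the following text contain harmful content? "
        "Answer yes or no.\n\n" + generated_text
    )
    answer = ask_llm(question)
    return answer.strip().lower().startswith("yes")

def filtered_response(generated_text, ask_llm):
    """Block the response if the validator flags it as harmful."""
    if is_harmful(generated_text, ask_llm):
        return "Sorry, I can't share that response."
    return generated_text

# Demo with a stub validator that flags text mentioning "explosives".
stub = lambda q: "yes" if "explosives" in q else "no"
print(filtered_response("Here is a cake recipe.", stub))          # passes through
print(filtered_response("Instructions for explosives ...", stub)) # blocked
```

In the paper’s experiments the role of the stub is played by a full LLM, whose yes/no answers act as a binary classifier over the generated content.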


First, the researchers conducted preliminary experiments to test the ability of the approach to detect harmful LLM-generated content. They randomly sampled 20 harmful prompts and 20 harmless prompts, generating responses to each. They used an uncensored variant of the Vicuña model to produce responses to each prompt. The researchers manually verified that the LLM-generated responses were indeed relevant to the prompts, meaning harmful prompts produce harmful content and harmless prompts produce harmless content. 

They then instantiated their harm filter using several widely used large language models, specifically GPT 3.5, Bard, Claude, and Llama-2 7B. They presented the Vicuña-generated content to each of the LLM harm filters, which produced a “yes” or “no” response. These responses act as classifier outputs, which were then used to compute various quantitative evaluation metrics.

According to the experimental results, Claude, Bard, and GPT 3.5 performed similarly well at identifying and flagging harmful content, reaching 97.5%, 95%, and 97.5% accuracy respectively. Llama 2 had the lowest performance on the sampled data with an accuracy of 80.9%. According to the paper, this approach has the potential to offer strong robustness against attacks on LLMs.


Amazon Removes AI-generated Books that Falsely List Jane Friedman as Author

Image Credits: Blog

After author Jane Friedman protested that five books listed on Amazon under her name were not actually written by her, Amazon pulled the titles from sale. The books were also listed on Goodreads, which is owned by Amazon. Friedman believes the books were written by AI.

Friedman said, “It feels like a violation since it’s extremely poor material with my name on it.” The author, who is from Ohio, has written a number of books about the publishing industry, and the fake titles imitated her legitimate work. 

Among the listed titles were “How to Write and Publish an eBook Quickly and Make Money” and “A Step-by-Step Guide to Crafting Compelling eBooks, Building a Thriving Author Platform, and Maximizing Profitability.” Friedman’s real books include “Publishing 101” and “The Business of Being a Writer.”


Friedman first learned about the phony titles when a reader who suspected the listings were fake discovered them on Amazon and emailed her. After reading the first few pages, Friedman assumed the books were produced by AI, since she was familiar with AI tools like ChatGPT.

According to Friedman, the books were “if not entirely generated by AI, then at least mostly generated by AI.” She immediately started looking for ways to have the books removed from Amazon and filled out a claim form. Friedman claims that Amazon informed her it would not take the books down because she had not registered a trademark for her identity.

By Tuesday, however, the books had been removed from both Amazon and Goodreads, and Friedman believes this was a result of her addressing the problem on social media. 

Friedman said, “This will continue; it won’t end with me, unless Amazon puts some sort of policy in place to stop anyone from just uploading whatever book they want and applying whatever name they want. They don’t have a process in place for reporting this kind of conduct, where someone is trying to cash in on someone else’s name.” She urged the websites to create a way to verify authorship.


OpenAI Won’t Go Bankrupt by 2024, AIM Opinion Mere Sensationalism


A recent opinion piece by Analytics India Magazine (AIM) claimed that Sam Altman’s OpenAI, creator of the popular ChatGPT, will go bankrupt by 2024 due to a declining user base, astronomical operational costs, and unrealistic revenue expectations. On closer analysis, however, the claims lack expert insight, rendering the article little more than sensationalism. 

The article cited significant daily expenditures, notably around $700,000 per day (approximately ₹5.8 crore per day), dedicated solely to ChatGPT. However, the writer fails to consider that such huge expenses are quite common for early-stage startups. The report also mentions that ChatGPT’s user base has declined as users adopt LLM APIs in their workflows. This assertion overlooks the fact that the APIs are themselves a major source of revenue for the startup. 

Microsoft-backed OpenAI has projected an annual revenue of $200 million in 2023. According to the AIM article, OpenAI “expects to reach $1 billion in 2024, which seems to be a long shot since the losses are only mounting.” We must consider that OpenAI is still in its initial operational phase and has enough projects, resources, and backing to stay afloat. 

After the AIM opinion gained some traction, Ather Energy CEO Tarun Mehta took to Twitter to explain how the ChatGPT maker won’t go bankrupt despite various claims. 


Tarun Mehta of Ather Energy voiced confidence on Sunday that OpenAI will easily manage its predicament, despite the claims of AIM. As evidence, Mehta cited well-known Indian startups that went on to become major corporations, such as Flipkart, Meesho, Ola, Paytm, and Swiggy. These companies, he noted, also endured extended periods of significant cash burn. 

In addition, he pointed out that many Indian companies have seen a comparable level of capital consumption during their peak moments, and many of them have been able to maintain stability. According to him, Uber, at its height, consumed ten times the capital for an extended period of time. “They will be fine folks,” he added. 

OpenAI is funded by multiple large companies and is at the forefront of Large Language Models as of now. Microsoft has invested about $10 billion in OpenAI, and there is every possibility that it will continue to invest more. For years, Microsoft has been in a tug-of-war with Google. Now, partnering with and investing in OpenAI gives Microsoft a once-in-a-lifetime opportunity to make Google eat dirt, and we doubt that it will let OpenAI go bankrupt. 

To add to the narrative, OpenAI today announced that it has acquired a New York-based AI design studio called Global Illumination. It is only sensible to ask why a startup on the very precipice of bankruptcy would spend such crucial capital on an acquisition. It implies one of two things: either Sam Altman is out of his senses, or OpenAI is not going bankrupt. 

Considering all these facts, it is safe to say that OpenAI will not be going bankrupt, at least not for the reasons cited in the AIM article, and certainly not as early as 2024.


Union Cabinet Approves ₹14,903 Crore Booster for Expansion of Digital India Programme

Image Credits: NDTV

The Union Cabinet on Wednesday approved an expansion of the Digital India Programme, including a ₹14,903 crore boost for e-governance services, cybersecurity, and the use of artificial intelligence.

Initiatives under the expanded Digital India programme will prioritize cybersecurity. The Information Security Education and Awareness (ISEA) programme will provide training in information security to about 265,000 citizens.

The Indian Computer Emergency Response Team (CERT-In), the government’s official organization for cyber forensics, emergency response, and cyber diagnostics, would be greatly expanded, according to Ashwini Vaishnaw, Union Minister for Communication, Electronics, and Information Technology. Along with developing cybersecurity technologies, the plan will also integrate more than 200 locations with the National Cyber Coordination Centre.


As previously stated, the government intends to set up three Centres of Excellence (CoE) under this programme for the growth of the nation’s AI research and innovation ecosystem. These centers will concentrate on sustainable cities, agriculture, and health. Moreover, Bhashini, the AI-enabled multi-language translation tool currently offered in 10 languages, will be extended to support all 22 official Indian languages.

Under the National Supercomputer Mission, the government will also install nine additional supercomputers for AI modeling and weather forecasting. This will be in addition to the existing 18 supercomputers.

To enable digital delivery of services to residents, the Digital India programme was introduced in July 2015. The programme will now run for a total of five years, from 2021–2022 to 2025–2026. Over 1,200 startups from Tier-II and Tier-III cities will receive help from the government throughout the extended time.

Approximately 625,000 IT employees will receive new training and up-skilling for next-generation technologies like the Internet of Things (IoT), machine learning, data analytics, and more, as part of the second phase of the government’s digital push. 


OpenAI Acquires Global Illumination, a New York-based AI Design Studio 

OpenAI, the AI company behind the popular AI-powered chatbot ChatGPT, has acquired Global Illumination, a New York-based startup that uses artificial intelligence to create innovative tools, infrastructure, and digital experiences. 

In a short blog post on its official site, OpenAI stated that the entire team from Global Illumination has joined OpenAI to work on its flagship products, including ChatGPT. “We are very excited for the impact they’ll have here at OpenAI,” the company said. This is OpenAI’s first public acquisition in its almost seven-year history. The agreement’s terms weren’t made public.

Thomas Dimson, Taylor Gordon, and Joey Flynn founded Global Illumination in 2021, and since then, they have worked on a variety of initiatives. With the support of venture capital firms Paradigm, Benchmark, and Slow, Global Illumination’s team planned and built products for Instagram, YouTube, Google, Pixar, Facebook, and Riot Games early on.


Dimson played a key role in improving Instagram’s search algorithms while serving as the company’s director of engineering. He participated in the establishment of the teams in charge of IGTV, feed and Stories ranking, Instagram’s Explore tab, and general data engineering.

Biomes, an open-source sandbox massively multiplayer online role-playing game (MMORPG) designed for the web that resembles Minecraft, is Global Illumination’s most recent project. It is unknown what will happen to the game after the acquisition, although it is assumed that the team’s work at OpenAI will be less focused on entertainment.

Despite the fact that OpenAI has resisted acquisitions up until now, the organization has been running funds and grant programmes for several years to support investments in start-up AI businesses and organizations. The company is backed by billions in venture capital from Microsoft and other significant VCs. 


What Are CAT Tools? 6 Features Every Translator Needs

Image Credits: Lokalise

In the rapidly evolving field of translation, computer-assisted translation (CAT) tools are indispensable for professional translators. These software applications help streamline and enhance the translation process, improving efficiency and ensuring consistency. 

However, with a wide range of CAT tools available on the market, it might be challenging to pick the right one. Translators should do their research and pinpoint the essential features that their CAT tool must have to significantly enhance their productivity. This article explores CAT tools in detail and highlights six critical features that every translator should consider when choosing the right tool for their needs. 

What are CAT tools? 

Computer-assisted translation tools are software applications specifically designed to assist professional translators in their work. These tools provide a range of features and functionalities that streamline and enhance the translation process. 

CAT tools typically incorporate a translation memory (TM) to store and reuse previously translated segments, improving efficiency and consistency. They also offer style guide and collaboration features, as well as support for a wide range of file formats. 

By leveraging these tools, translators can work more effectively, save time, ensure accuracy, and promptly deliver high-quality translations. As a result, CAT tools have become indispensable in translation, revolutionizing how translators approach their work.

6 essential CAT tools features that every translator needs

Translation memory 

One of the fundamental features of CAT tools is translation memory (TM). TM stores previously translated segments, allowing translators to reuse them in future projects. This not only saves time but also promotes consistency in terminology across different translations. 

A good CAT tool should have a robust TM database that is easily searchable and editable, enabling translators to locate and modify previous translations quickly. This feature is particularly beneficial for translators working on large projects or those who frequently translate content in the same domain.
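As a toy illustration of the idea (not how any particular CAT tool implements it), a translation memory can be modelled as a store of source-target segment pairs with fuzzy lookup; the segments below are invented:

```python
import difflib

# A tiny English-to-French translation memory: source -> target segments.
tm = {
    "Click the Save button.": "Cliquez sur le bouton Enregistrer.",
    "The file could not be opened.": "Le fichier n'a pas pu être ouvert.",
}

def tm_lookup(segment, threshold=0.8):
    """Return (match_source, target, score) for the best fuzzy TM hit, or None."""
    best, best_score = None, 0.0
    for source, target in tm.items():
        score = difflib.SequenceMatcher(None, segment, source).ratio()
        if score > best_score:
            best, best_score = (source, target), score
    if best_score >= threshold:
        return best[0], best[1], best_score
    return None

# A near-exact match (only the final period differs) still scores high.
hit = tm_lookup("Click the Save button")
print(hit)
```

Real CAT tools use far richer segmentation, scoring, and storage than this sketch, but the principle is the same: high-scoring matches surface previous translations so they can be reused or lightly edited.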

Termbase

The termbase, alternatively referred to as a translation glossary, serves as a repository of definitions and specific guidelines for using pre-approved, translated terms. It works like a dictionary employed alongside the translation memory, allowing translators to look up terms that are significant to the organization they are translating for.

Termbases play a crucial role in upholding translation precision across various projects when utilizing a CAT tool by facilitating the consistent application of shared or specialized terminology pertinent to your project. They can ensure accuracy throughout your translations and contribute to maintaining linguistic consistency within your business context.

Style guide

Translation style guides encompass a collection of directives that serve as a handbook for faithfully translating your content into each target language while preserving its inherent meaning and purpose. Style guides are valuable in guaranteeing consistent communication of your brand’s distinct characteristics across different languages, cultures, and markets.

By outlining specific guidelines, a CAT tool with a translation style guide assists in upholding brand consistency throughout different languages. It ensures the precise translation of content while retaining its original essence, helping to maintain a cohesive brand identity across linguistic boundaries.

Collaboration and project management features

CAT tools with collaboration and project management features enable translators to work seamlessly with clients, project managers, and other translators. These tools often include real-time collaboration, version control, and task assignment. 

As a result, translators can easily share files, communicate with team members, and track project progress. In addition, effective collaboration and project management capabilities ensure efficient workflow, minimize errors, and promote effective communication between all stakeholders involved in the translation process.

File format support

Translators often work with various file formats, from standard text documents to complex design files. This is why a CAT tool should support multiple file formats, including Microsoft Office documents, PDFs, HTML, XML, and more. 

This ensures that translators can seamlessly import and export files without the need for manual formatting, preserving the original layout and structure. A CAT tool with comprehensive file format support simplifies the translation process and saves translators valuable time, enabling them to focus on the linguistic aspects rather than technical issues.

Linguistic QA capabilities

Translation quality assurance (QA), similar to the spellcheck and grammar-check tools found in most text-editing software, safeguards against errors slipping into your translations while you work in the tool. These QA features can detect missing text or tags, deviations from approved terminology, numeric inconsistencies, and more.

The QA process can start before submitting a project for translation, persist throughout the translation and editing stages, and culminate in final checks even after completing the ultimate translation version.
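One of these checks, numeric consistency, is easy to sketch. The function below flags numbers that appear in the source segment but are missing from the translation; the segments are invented, and real QA engines are far more sophisticated:

```python
import re

def numeric_qa(source, target):
    """Return the list of numbers present in source but missing from target."""
    src_numbers = re.findall(r"\d+(?:[.,]\d+)?", source)
    tgt_numbers = re.findall(r"\d+(?:[.,]\d+)?", target)
    return [n for n in src_numbers if n not in tgt_numbers]

# A consistent pair passes; a mistyped figure is flagged.
print(numeric_qa("The warranty lasts 24 months.", "La garantie dure 24 mois."))  # []
print(numeric_qa("Heat to 180 degrees.", "Chauffer à 18 degrés."))  # ['180']
```

Checks for missing tags or unapproved terminology follow the same pattern: extract the relevant items from both segments and report anything that fails to line up.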

By employing a CAT tool with built-in linguistic quality assurance, you can be confident that your translated content is free of errors and maintains the highest quality every time. 

A Proper CAT Tool Is a Translator’s Best Friend

CAT tools have revolutionized the translation industry by providing translators with powerful features that enhance their efficiency and quality of work. Translation memory, termbases, style guides, collaboration and project management, file format support, and quality assurance are essential features that every translator should consider when selecting a CAT tool. By leveraging these features, translators can streamline their workflow, maintain consistency, improve accuracy, and deliver high-quality, timely translations.


Wipro Establishes Generative AI Center of Excellence at IIT Delhi

Image Credits: Wipro

Wipro on Wednesday announced the establishment of a generative artificial intelligence center of excellence (CoE) at the Indian Institute of Technology (IIT), Delhi. The teams at Wipro’s center of excellence will work on solutions based on artificial intelligence, machine learning, and other technologies. 

The center will concentrate on research and development (R&D) projects and evaluate the commercial viability of research-based ventures undertaken by Yardi School of AI students at the institute. Wipro will provide financial assistance through the CoE to IIT Delhi’s generative AI research initiatives, including both fundamental and applied research. 

According to a joint statement released by Wipro and IIT Delhi, the company’s $1 billion ambition to create an ecosystem of services in the field of AI, known as the “Wipro ai360” ecosystem, includes the formation of the generative AI CoE at the institute.


Professor Mausam, Dean of the Yardi School of AI at IIT Delhi, said, “Students will gain valuable insight into problems of relevance to industry and will learn first-hand how their technical know-how transfers to commercial environments with the help of the facility.”

The move is being taken as experiments and investments in generative AI continue to rise at every IT services company in the nation. During the company’s June quarter post-earnings press conference on July 12, K. Krithivasan, the recently appointed chief executive of Tata Consultancy Services, stated that the company is currently working on more than 50 proof-of-concept (PoC) projects and about 100 opportunities in the generative AI field.


OpenAI Never Trains on Anything Submitted to the APIs, says Sam Altman

Image Credits: AD

OpenAI has declared that it does not use client data submitted via its APIs to train its large language models, such as GPT-4. Sam Altman, the CEO of OpenAI, took to Twitter to reiterate the point amid confusion surrounding the decision. On March 1, 2023, OpenAI modified its terms of service to reflect this new commitment to user privacy, putting the policy shift into effect.

Altman said, “Customers clearly want us not to train on their data, so we’ve changed our plans. We will not do that.” Altman claimed that OpenAI hasn’t been using API data for model training for a while, implying that this official statement just formalizes an already-accepted practice.

The decision made by OpenAI has broad ramifications, especially for the companies that it serves as clients, including Microsoft, Salesforce, and Snapchat. Because these businesses are more likely to use OpenAI’s API capabilities for their operations, the shift in privacy and data protection is more important to them.


The new data protection rules, however, only apply to clients that use the company’s API services. According to the most recent version of OpenAI’s terms of service, the company may “use Content from Services other than their API”. So, unless the data is shared via the API, OpenAI may still use other types of input, such as text typed into ChatGPT.

OpenAI’s decision to forgo using customer data submitted via the API for training marks a turning point in the ongoing discussion concerning data privacy and AI. Ensuring user privacy and upholding trust will likely remain central as OpenAI pushes the limits of AI technology.
