OpenAI, the company behind the popular chatbot ChatGPT, has acquired Global Illumination, an AI startup based in New York that uses artificial intelligence to build creative tools, infrastructure, and digital experiences.
In a short blog post on its official site, OpenAI stated that the entire Global Illumination team has joined OpenAI to work on its flagship products, including ChatGPT. “We are very excited for the impact they’ll have here at OpenAI,” the company said. This is OpenAI’s first public acquisition in its almost seven-year history. The terms of the agreement weren’t made public.
Thomas Dimson, Taylor Gordon, and Joey Flynn founded Global Illumination in 2021, and the company has worked on a variety of initiatives since. Backed by venture capital firms Paradigm, Benchmark, and Slow, Global Illumination’s team had earlier designed and built products for Instagram, YouTube, Google, Pixar, Facebook, and Riot Games.
Dimson played a key role in improving Instagram’s search algorithms while serving as the company’s director of engineering, and he helped establish the teams responsible for IGTV, feed and Stories ranking, Instagram’s Explore tab, and general data engineering.
Global Illumination’s most recent project is Biomes, an open-source, web-based massively multiplayer online role-playing game (MMORPG) that resembles Minecraft. It’s unknown what will happen to the game after the acquisition, although the team’s work at OpenAI is expected to be less focused on entertainment.
Although OpenAI has avoided acquisitions until now, the organization has run funds and grant programs for several years to support investments in AI startups and organizations. The company is backed by billions in venture capital from Microsoft and other major VCs.
In the rapidly evolving field of translation, computer-assisted translation (CAT) tools are indispensable for professional translators. These software applications help streamline and enhance the translation process, improving efficiency and ensuring consistency.
However, with a wide range of CAT tools available on the market, it can be challenging to pick the right one. Translators should do their research and pinpoint the essential features their CAT tool must have to significantly enhance their productivity. This article explores CAT tools in detail and highlights six critical features that every translator should consider when choosing the right tool for their needs.
What are CAT tools?
Computer-assisted translation tools are software applications specifically designed to assist professional translators in their work. These tools provide a range of features and functionalities that streamline and enhance the translation process.
CAT tools typically incorporate a translation memory (TM) to store and reuse previously translated segments, improving efficiency and consistency. They also offer style guide and collaboration features, as well as support for a wide range of file formats.
By leveraging these tools, translators can work more effectively, save time, ensure accuracy, and promptly deliver high-quality translations. As a result, CAT tools have become indispensable in translation, revolutionizing how translators approach their work.
6 essential CAT tool features that every translator needs
Translation memory
One of the fundamental features of CAT tools is translation memory (TM). TM stores previously translated segments, allowing translators to reuse them in future projects. This not only saves time but also promotes consistency in terminology across different translations.
A good CAT tool should have a robust TM database that is easily searchable and editable, enabling translators to locate and modify previous translations quickly. This feature is particularly beneficial for translators working on large projects or those who frequently translate content in the same domain.
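To make the idea concrete, here is a minimal sketch of how a fuzzy TM lookup might work, using Python's standard difflib. The stored segments, the translations, and the similarity threshold are illustrative assumptions, not how any particular CAT tool implements matching.

```python
from difflib import SequenceMatcher

# A toy translation memory: source segments mapped to stored translations.
# Real CAT tools persist this in a database or exchange it as TMX files.
translation_memory = {
    "Click the Save button to store your changes.":
        "Haga clic en el botón Guardar para almacenar sus cambios.",
    "Your changes have been saved.":
        "Sus cambios se han guardado.",
}

def tm_lookup(segment, threshold=0.75):
    """Return the best fuzzy match above `threshold`, or None."""
    best_score, best_pair = 0.0, None
    for source, target in translation_memory.items():
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score > best_score:
            best_score, best_pair = score, (source, target)
    if best_pair and best_score >= threshold:
        return {"match": best_score, "source": best_pair[0], "target": best_pair[1]}
    return None

# A near-identical new segment retrieves the stored translation as a draft.
print(tm_lookup("Click the Save button to store your change."))
```

Commercial tools add segment alignment, context matching, and penalty rules on top of this basic idea, but the principle is the same: never translate the same sentence twice.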
Termbase
The termbase, also known as a translation glossary, is a repository of definitions and specific guidelines for using pre-approved translated terms. Termbases are similar to dictionaries used alongside translation memories, allowing translators to look up terms that matter to the organization they are translating for.
Termbases play a crucial role in upholding translation precision across projects by ensuring the consistent use of shared or specialized terminology. They help keep your translations accurate and linguistically consistent within your business context.
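As a rough sketch of how a tool might enforce a termbase, the check below flags any approved term whose required translation is missing from the draft. The terms and translations are hypothetical examples.

```python
# A toy termbase: source terms mapped to the approved target-language term.
termbase = {
    "dashboard": "panel de control",
    "subscription": "suscripción",
}

def check_terms(source_text, target_text):
    """Flag approved terms whose required translation is missing."""
    issues = []
    for term, approved in termbase.items():
        if term in source_text.lower() and approved not in target_text.lower():
            issues.append(f"'{term}' should be translated as '{approved}'")
    return issues

print(check_terms("Open the dashboard to manage your subscription.",
                  "Abra el tablero para gestionar su suscripción."))
# -> ["'dashboard' should be translated as 'panel de control'"]
```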
Style guide
Translation style guides are collections of directives that serve as a handbook for faithfully translating your content into each target language while preserving its meaning and intent. Style guides are valuable in ensuring consistent communication of your brand’s distinct characteristics across different languages, cultures, and markets.
By outlining specific guidelines, a CAT tool with a translation style guide assists in upholding brand consistency throughout different languages. It ensures the precise translation of content while retaining its original essence, helping to maintain a cohesive brand identity across linguistic boundaries.
Collaboration and project management features
CAT tools with collaboration and project management features enable translators to work seamlessly with clients, project managers, and other translators. These tools often include real-time collaboration, version control, and task assignment.
As a result, translators can easily share files, communicate with team members, and track project progress. In addition, effective collaboration and project management capabilities ensure efficient workflow, minimize errors, and promote effective communication between all stakeholders involved in the translation process.
File format support
Translators often work with various file formats, from standard text documents to complex design files. This is why a CAT tool should support multiple file formats, including Microsoft Office documents, PDFs, HTML, XML, and more.
This ensures that translators can seamlessly import and export files without the need for manual formatting, preserving the original layout and structure. A CAT tool with comprehensive file format support simplifies the translation process and saves translators valuable time, enabling them to focus on the linguistic aspects rather than technical issues.
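As a rough illustration of format-aware handling, the sketch below replaces only the text nodes of an XML document while leaving tags and structure untouched; the sample document and the translate stand-in are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document; a CAT tool does this across many formats.
doc = ET.fromstring(
    "<page><title>Welcome</title><body>Click Save to continue.</body></page>"
)

def translate(text):
    # Stand-in for the actual translation step.
    return {"Welcome": "Bienvenido",
            "Click Save to continue.": "Haga clic en Guardar para continuar."}.get(text, text)

# Replace only the text nodes; tags and structure are untouched.
for element in doc.iter():
    if element.text and element.text.strip():
        element.text = translate(element.text)

print(ET.tostring(doc, encoding="unicode"))
# <page><title>Bienvenido</title><body>Haga clic en Guardar para continuar.</body></page>
```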
Linguistic QA capabilities
Translation quality assurance (QA), similar to the spellcheck and grammar tools found in most text editors, helps prevent errors from slipping into your translations as you work. QA features can detect missing text or tags, deviations from approved terminology, numeric inconsistencies, and more.
The QA process can start before a project is submitted for translation, continue throughout the translation and editing stages, and conclude with final checks after the last version is complete.
By employing a CAT tool equipped with linguistic quality assurance, you can be confident that your translated content is free of errors and maintains the highest quality every time.
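To show what two of the checks mentioned above might look like in code, here is a minimal QA pass comparing numbers and inline tags between a source segment and its translation; the segments and regular expressions are illustrative, not a real tool's implementation.

```python
import re

def qa_check(source, target):
    """Minimal QA pass: compare numbers and inline tags between segments."""
    issues = []
    # Numeric consistency: every number in the source should survive translation.
    src_numbers = re.findall(r"\d+(?:[.,]\d+)?", source)
    tgt_numbers = re.findall(r"\d+(?:[.,]\d+)?", target)
    if sorted(src_numbers) != sorted(tgt_numbers):
        issues.append(f"number mismatch: {src_numbers} vs {tgt_numbers}")
    # Tag consistency: inline markup like <b>...</b> must not be dropped.
    src_tags = re.findall(r"</?\w+>", source)
    tgt_tags = re.findall(r"</?\w+>", target)
    if sorted(src_tags) != sorted(tgt_tags):
        issues.append(f"tag mismatch: {src_tags} vs {tgt_tags}")
    return issues

print(qa_check("Your trial ends in <b>14</b> days.",
               "Su prueba termina en 15 días."))
# -> flags both a number mismatch (14 vs 15) and dropped <b> tags
```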
A Proper CAT Tool Is a Translator’s Best Friend
CAT tools have revolutionized the translation industry by providing translators with powerful features that enhance their efficiency and quality of work. Translation memory, termbases, style guides, collaboration and project management, file format support, and quality assurance are essential features that every translator should consider when selecting a CAT tool. By leveraging these features, translators can streamline their workflow, maintain consistency, improve accuracy, and deliver high-quality, timely translations.
Wipro announced on Wednesday the establishment of a generative artificial intelligence center of excellence (CoE) at the Indian Institute of Technology (IIT), Delhi. Teams at the Wipro center of excellence will work on solutions based on artificial intelligence, machine learning, and other technologies.
The center will concentrate on research and development (R&D) projects and evaluate the commercial viability of research-based ventures undertaken by Yardi School of AI students at the institute. Wipro will provide financial assistance through the CoE to IIT Delhi’s generative AI research initiatives, including both fundamental and applied research.
According to a joint statement released by Wipro and IIT Delhi, the company’s $1 billion ambition to create an ecosystem of services in the field of AI, known as the “Wipro ai360” ecosystem, includes the formation of the generative AI CoE at the institute.
Professor Mausam, Dean of the Yardi School of AI at IIT Delhi, said, “Students will gain valuable insight into problems of relevance to industry and will learn first-hand how their technical know-hows transfer to commercial environments with the help of the facility.”
The move comes as experiments and investments in generative AI continue to rise at IT services companies across the country. During the company’s June-quarter post-earnings press conference on July 12, K. Krithivasan, the recently appointed chief executive of Tata Consultancy Services, said the company is currently working on more than 50 proof-of-concept (PoC) projects and about 100 opportunities in the generative AI field.
OpenAI has declared that it does not use client data submitted via its APIs to train its large language models, such as GPT-4. Sam Altman, the CEO of OpenAI, took to Twitter to reiterate this amid confusion surrounding the decision. On March 1, 2023, OpenAI modified its terms of service to reflect this new commitment to user privacy, putting the company’s policy shift into effect.
seeing a lot of confusion about this, so for clarity:
openai never trains on anything ever submitted to the api or uses that data to improve our models in any way.
Altman said, “Customers clearly want us not to train on their data, so we’ve changed our plans. We will not do that.” He added that OpenAI hasn’t been using API data for model training for a while, suggesting the official statement merely formalizes an existing practice.
OpenAI’s decision has broad ramifications, especially for its corporate clients, including Microsoft, Salesforce, and Snapchat. Because these businesses are more likely to rely on OpenAI’s API capabilities for their operations, the shift in privacy and data protection matters more to them.
The new data protection rules, however, apply only to clients that use the company’s API services. According to the most recent version of OpenAI’s terms of service, the company may “use Content from Services other than their API”. So, unless the data is shared over the API, OpenAI may still use other kinds of input, such as text typed into ChatGPT.
OpenAI’s decision to forgo training on customer data submitted via the API marks a turning point in the ongoing debate over data privacy and AI. As OpenAI pushes the limits of AI technology, ensuring user privacy and upholding trust will likely remain central concerns.
The tech company IBM has announced a new prototype of an analog AI chip that functions like a human brain and executes intricate computations for a variety of deep neural network (DNN) applications. According to IBM, the cutting-edge chip can significantly increase the efficiency of artificial intelligence while reducing battery consumption in computers and cellphones.
The fully integrated circuit has 64 analog in-memory computing (AIMC) cores coupled via an on-chip communication network, the company stated in a blog post introducing the chip. It also implements the additional processing and digital activation functions used in each convolutional layer and long short-term memory unit.
The new AI chip’s 64 analog in-memory computing cores were created at IBM’s Albany NanoTech Complex. To bridge the analog and digital worlds, IBM says it has incorporated compact, time-based analog-to-digital converters inside each tile, or core, of the chip. These converters are modeled after key characteristics of the neural networks that operate in biological brains.
According to IBM’s blog post, each tile also has compact digital processing units that carry out simple scaling and nonlinear neuronal activation operations. Future computers and phones could run advanced AI apps on IBM’s prototype chip instead of the chips used today.
IBM says that many of the chips being created today separate their memory and processing units, which slows down computing. This means that AI models are typically kept in a separate location in memory, and computational operations require frequently shuttling data between the memory and processing units.
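To illustrate the concept rather than IBM's actual design, the NumPy sketch below models an in-memory matrix-vector multiply: the weights conceptually stay put, the analog computation is approximated with Gaussian device noise, and the time-based ADC is approximated with uniform quantization. All constants (noise level, ADC bits, matrix size) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# A weight matrix that, conceptually, lives inside the AIMC core's memory,
# so multiplying by it requires no weight movement at inference time.
weights = rng.normal(size=(64, 64))

def analog_mvm(x, noise_std=0.02, adc_bits=8):
    """Sketch of an analog in-memory matrix-vector multiply.

    Analog computation is approximate: we model device noise with a
    Gaussian term and the time-based ADC with uniform quantization.
    The constants here are illustrative, not IBM's actual figures.
    """
    y = weights @ x                                  # computed "in place" in memory
    y = y + rng.normal(scale=noise_std * np.abs(y).max(), size=y.shape)
    levels = 2 ** adc_bits
    scale = np.abs(y).max() / (levels / 2)
    return np.round(y / scale) * scale               # quantized digital output

x = rng.normal(size=64)
exact = weights @ x
approx = analog_mvm(x)
print("relative error:", np.linalg.norm(exact - approx) / np.linalg.norm(exact))
```

The small relative error shows why this trade is attractive for neural networks: DNNs tolerate modest noise, and keeping the weights in place eliminates the costly memory-to-processor data shuttle described above.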
Comparing the human brain to conventional computers, Thanos Vasilopoulos, a scientist at IBM’s research facility in Switzerland, told the BBC that the former achieves remarkable performance while consuming little power. He said that thanks to the IBM chip’s improved energy efficiency, large and more complex workloads could be executed in low-power or battery-constrained environments.
The New York Times has taken proactive steps to prevent the exploitation of its material for the development and training of artificial intelligence models.
The NYT changed its Terms of Service on August 3rd to forbid the use of its content, including text, pictures, audio and video clips, look and feel, metadata, and compilations, in the development of any software program, including, but not limited to, training a machine learning or artificial intelligence system.
The revised terms also prohibit the use of automated tools, such as website crawlers, to access, use, or collect such content without express written consent from the publication. According to the NYT, refusing to abide by these new rules may result in unspecified fines or penalties.
Despite adding the new guidelines to its policy, the publication does not appear to have altered its robots.txt file, which tells search engine crawlers which URLs can be accessed. The move might be a response to Google’s recent privacy policy update, which disclosed that the search giant may use open data from the internet to train its numerous AI services, such as Bard or Cloud AI.
However, the New York Times also agreed to a $100 million deal with Google in February, allowing the search engine to use some of the Times’ content on its platforms over the next three years. Given that the two companies will collaborate on tools for content distribution, subscriptions, marketing, advertising, and “experimentation,” the changes to the NYT terms of service are probably aimed at rival businesses like OpenAI or Microsoft.
According to a recent announcement, website owners can now prevent OpenAI’s GPTBot web crawler from scraping their sites. Many of the large language models that power well-known AI systems like OpenAI’s ChatGPT are trained on large data sets that may contain content scraped from the internet without permission or otherwise protected by copyright.
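OpenAI's announcement documents the crawler's user-agent token, GPTBot, so opting out takes only a couple of lines in a site's robots.txt file, for example:

```
# Block OpenAI's GPTBot from crawling the entire site
User-agent: GPTBot
Disallow: /
```

Standard robots.txt rules apply, so a site could alternatively use Allow and Disallow lines to open only specific directories to the crawler.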
Stability AI, the pioneering generative AI startup behind Stable Diffusion, has unveiled its first Japanese language model (LM), Japanese StableLM Alpha, in a key step toward advancing the Japanese generative AI market.
The debut has drawn attention because the company claims its language model is the best-performing publicly available model for Japanese speakers. According to the company, a thorough benchmark evaluation against four other Japanese LMs supports this claim. With 7 billion parameters, the newly unveiled Japanese StableLM Alpha is a testament to Stability AI’s commitment to technological development.
The Japanese StableLM Base Alpha 7B variant will be distributed commercially under the well-known Apache License 2.0. This specialized model was trained on a massive dataset of 750 billion tokens of Japanese and English text carefully collected from web archives.
Stability AI’s Japanese community created the datasets, drawing on the expertise of the EleutherAI Polyglot project’s Japanese team. An extended version of EleutherAI’s GPT-NeoX software, a key component of Stability AI’s development process, greatly facilitated this group effort.
A companion model, the Japanese StableLM Instruct Alpha 7B, represents another notable achievement. It was created primarily for research purposes and is suitable only for research-related applications. Through the use of several public datasets and an approach known as supervised fine-tuning (SFT), it demonstrates a distinct capacity to follow user instructions.
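For readers who want to try the base model, a minimal sketch using the Hugging Face transformers library follows. The repository ID is an assumption based on Stability AI's naming, and the exact tokenizer and loading instructions should be taken from the official model card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository ID; consult the official model card for the exact
# repository name and any custom tokenizer it requires.
model_id = "stabilityai/japanese-stablelm-base-alpha-7b"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, trust_remote_code=True
)

prompt = "AI で科学研究を加速するには、"  # "To accelerate scientific research with AI, ..."
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=64, temperature=0.8, do_sample=True)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```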
EleutherAI’s Language Model Evaluation Harness was used to conduct thorough evaluations that served to validate these models. The models underwent scrutiny across various domains, such as question answering, sentence classification, sentence pair classification, and sentence summarization, emerging with an impressive average score of 54.71%.
According to Stability AI, this performance indicator clearly places the Japanese StableLM Instruct Alpha 7B ahead of its rivals, demonstrating its strength and supremacy.
According to a recently released research paper, researchers from the University of Washington, Carnegie Mellon University, and Xi’an Jiaotong University have found that different AI language models carry different political biases.
The study, which examined 14 large language models, found that OpenAI’s ChatGPT and its latest LLM, GPT-4, lean toward left-wing libertarianism, whereas Meta’s LLaMA leans toward right-wing authoritarianism. The researchers posed questions about democracy, feminism, and other themes, and used the responses to assess the political slant of these models.
Unexpectedly, the study also discovered that retraining the models on datasets with different political biases changed both their behavior and their capacity to recognize hate speech and misinformation.
The study used a three-stage approach to examine the political development of AI language models. First, the models’ responses to politically charged statements revealed their innate political leanings. For instance, Google’s BERT models showed a degree of social conservatism compared with OpenAI’s GPT models. The discrepancy may be explained by the fact that the more recent GPT models were trained on more liberal online text, while the older BERT models were trained on more conservative book sources.
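As a rough illustration of this first stage, the sketch below probes a masked language model with a politically charged statement and compares the probabilities it assigns to "agree" versus "disagree". The prompt template and statement are illustrative assumptions, not the paper's exact setup.

```python
from transformers import pipeline

# Probe a masked LM and compare the scores of "agree" vs. "disagree".
fill = pipeline("fill-mask", model="bert-base-uncased")

statement = "The government should regulate large corporations more strictly."
prompt = f'Please respond to the statement: "{statement}" I [MASK] with this.'

scores = {r["token_str"].strip(): r["score"]
          for r in fill(prompt, targets=["agree", "disagree"])}
print(scores)  # a higher "agree" score suggests a more left-leaning response
```

Repeating this over a battery of statements spanning economic and social axes yields the kind of political-compass placement the researchers report.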
In the following step, datasets containing news and social media posts from both left-leaning and right-leaning sources were used to retrain the GPT-2 and Meta’s RoBERTa models. The biases that these models already had were reinforced by this process.
In the study’s last phase, the researchers showed how the political preferences of AI models affected how well they could classify hate speech and misinformation. Models trained on right-wing data were more sensitive to hate speech directed at white Christian men, while those trained on left-wing data were more attuned to hate speech targeting minority groups.
The research team emphasized the importance of understanding the political biases exhibited by AI language models, particularly as these models are increasingly incorporated into popular products and services. Right-wing skeptics have criticized OpenAI, the company that created ChatGPT, claiming that the chatbot represents a liberal viewpoint.
OpenAI has reassured the public that it is actively addressing these concerns and instructing human reviewers not to favor any political group while the AI model is being improved. The researchers remain skeptical, however, arguing that no AI language model is likely to be totally free of political bias.
Microsoft has introduced the ChatGPT on Azure solution accelerator. The solution provides a user experience similar to ChatGPT but acts as your private ChatGPT. The application’s open-source code is available on GitHub.
As we all know by now, ChatGPT’s popularity has grown exponentially since its launch. The AI service, which is freely available to the public, is frequently used by business users around the world to increase productivity or serve as a creative assistant.
ChatGPT, however, carries the risk of disclosing confidential data. Blocking corporate access to ChatGPT is one approach, but people will always find a way around it, and blocking also forfeits ChatGPT’s powerful capabilities and lowers worker productivity and satisfaction. The ChatGPT on Azure solution accelerator was introduced to address this issue.
Azure ChatGPT provides built-in protections for the privacy of users’ data and complete isolation from OpenAI systems. Other enterprise-grade security controls are built in, and network traffic can be fully isolated to the user’s network.
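To illustrate what that isolation looks like in practice, here is a minimal sketch of calling a private Azure OpenAI deployment with the openai Python package's Azure configuration (v0.x style, current as of this writing): requests go to your own Azure resource rather than api.openai.com. The resource name, API key, and deployment name below are placeholders.

```python
import openai

# Point the client at your own Azure OpenAI resource instead of api.openai.com.
openai.api_type = "azure"
openai.api_base = "https://YOUR-RESOURCE-NAME.openai.azure.com/"
openai.api_version = "2023-05-15"
openai.api_key = "YOUR-AZURE-OPENAI-KEY"

response = openai.ChatCompletion.create(
    engine="gpt-35-turbo",  # the name of your Azure deployment, not the model
    messages=[{"role": "user", "content": "Summarize our data-handling policy."}],
)
print(response["choices"][0]["message"]["content"])
```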
Users can add further business value by integrating plug-ins with internal services such as ServiceNow, or by connecting their own internal data sources (plug and play).
The project is open to contributions and suggestions from the public. Contributors are typically required to sign a Contributor License Agreement (CLA) declaring that they have the authority to grant the company the rights to use their contribution.
In January, Microsoft CEO Satya Nadella said that the company would soon add OpenAI’s popular AI chatbot ChatGPT to its cloud-based Azure service. In March, Microsoft announced that ChatGPT was available in preview in Azure OpenAI Service.
It appears that TikTok is developing a new way for creators to disclose whether their posts contain AI-generated content. A new “AI-generated content” option has surfaced under the “more options” section shown before sharing a video, according to social media strategist Matt Navarra.
TikTok now let’s you add AI-generated content labels to your videos 🤖
TikTok amended its content rules in March to require users to disclose deepfakes and AI-generated content in a video’s title or with an identifying sticker. In the description for the new toggle, TikTok says the label will help prevent content removal.
In a video Navarra posted showing the feature, flipping the toggle triggered a new pop-up explaining it. The pop-up reminds creators that they must label AI-generated material depicting “realistic scenes” and cautions again that improper labeling could result in the removal of their work.
Some users have been unable to locate the toggle in the app, suggesting it may have been rolled out only for testing. TikTok has not yet commented on the feature.
TikTok’s new AI-generated content feature arrives right on cue, following last week’s revelation that rival platform Instagram is developing its own AI content disclosure labels. Last month, Meta joined internet giants Google, Amazon, Microsoft, and others in pledging to develop AI responsibly and be transparent with users about its use.