Adobe Stock will accept images generated by AI on its service, the company said in a blog post on Monday.
Unlike stock image services like Getty Images, which have prohibited AI-generated illustrations on their platforms, Adobe is embracing content created with generators like Stable Diffusion and DALL-E, which is now open to everyone. These generators turn text prompts into art and otherworldly images.
“Adobe Stock contributors are using AI technologies to increase their earning potential, diversify their portfolios, and expand their creativity,” Sarah Casillas, senior director at Adobe, said in the blog post. Adobe Stock will accept art created with such models on the condition that submissions are labeled as AI-generated.
Leading up to Monday’s announcement, Adobe had been quietly testing AI-generated images, and Casillas said the company was pleasantly surprised by the results. “It meets our quality standards and has been performing well,” she said.
Citing possible copyright issues, Getty Images said in September that it would not accept AI-generated images on its service. Adobe, however, has drafted submission terms intended to avoid such risks.
Creators must hold the rights to their work before submitting it to Adobe Stock, and they must read the terms and conditions of the AI tools they use. They cannot submit images depicting logos, recognizable people, famous characters, or real places. Artists who comply with these terms may earn royalties on their AI-generated content.
Recently, OpenAI unveiled a prototype general-purpose chatbot that exhibits a remarkable range of text-generation capabilities. The company’s conversational interface, known as ChatGPT, has become extremely popular online as users speculate about its potential to replace everything from playwrights to Google Search to the college essay. Meanwhile, in an unexpected turn of events, Twitter CEO Elon Musk said on Sunday that he had ‘paused’ OpenAI’s access to Twitter’s database for training after learning about it.
Musk stated in a tweet that he would like to learn more about ChatGPT’s governance and future revenue plans. He also noted that OpenAI, which he co-founded, started as a non-profit, open-source project, neither of which remains true.
Musk’s tweet sparked wide discussion, and Sam Altman, CEO of OpenAI, weighed in, tweeting: “Interesting to me how many of the ChatGPT takes are either ‘this is AGI’ (obviously not close, lol) or this approach can’t really go that much further.”
Musk responded that ChatGPT is “scary good,” adding, “We are not far from dangerously strong AI.”
Altman replied that he agrees we are approaching dangerously powerful AI, in the sense that an AI could pose significant cybersecurity risks, and that true AGI could arrive within the next decade, so those risks must be taken very seriously too.
Today, Altman tweeted that ChatGPT has crossed 1 million users in less than a week since its launch last Wednesday.
Machine learning algorithms are fueled by data. Gathering relevant data is the most crucial and challenging step in building a robust machine-learning model that can execute tasks like image classification. Unfortunately, data becoming more abundant does not mean everyone can use it: collecting diverse real-world data is complex, error-prone, time-consuming, and can cost millions of dollars. The resulting shortage of credible training data often puts reliable outcomes out of reach. This is where synthetic data comes to the rescue!
Synthetic data is created by a computer, often using 3D models of environments, objects, and humans to quickly produce varied clips of specific behaviors. It is becoming increasingly valuable because it comes without the copyright constraints and ethical ambiguity that accompany real data, and it fills in the gaps when real data is scarce or when existing image data fails to capture the nuances of the physical world.
By bridging the gap between reality and its representation, synthetic data helps machine learning avoid errors a person would never make. There is a significant bottleneck, however: synthesis starts out simple but becomes harder as the quantitative and qualitative demands on the image data grow, and building an image-generation system that yields useful training data requires expert domain knowledge.
To address these issues, MIT researchers at the MIT-IBM Watson AI Lab, rather than building bespoke image-generating programs for a specific training task, collected 21,000 publicly available programs from the internet. These include procedural models, statistical image models, GAN-style models, feature visualizations, and dead leaves image models, each generating graphics with simple color and texture patterns from only a few lines of code. The researchers then trained a computer vision model on this extensive collection, without editing or modifying the programs.
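As an illustration of how little code such a program needs, here is a toy “dead leaves”-style generator, one of the model families the dataset includes. This is a hypothetical sketch in the spirit of the study, not the researchers’ actual code; all sizes and counts are illustrative assumptions.

```python
import numpy as np

def dead_leaves(size=64, n_disks=200, rng=None):
    """Layer random colored disks onto a canvas, newest on top (illustrative sketch)."""
    if rng is None:
        rng = np.random.default_rng()
    img = np.zeros((size, size, 3))
    yy, xx = np.mgrid[0:size, 0:size]
    for _ in range(n_disks):
        cx, cy = rng.integers(0, size, 2)
        r = int(rng.integers(2, size // 4))
        color = rng.random(3)                                    # random RGB
        img[(xx - cx) ** 2 + (yy - cy) ** 2 <= r ** 2] = color   # newer disks occlude older ones
    return img

sample = dead_leaves()  # one 64x64 synthetic training image
```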
According to the researchers, good synthetic data for training vision systems has two essential characteristics: naturalism and diversity. Interestingly, the most naturalistic data is not necessarily the best, because naturalism can come at the cost of diversity; the goal is synthetic data that captures the key structural properties of real data while remaining varied.
Because these simple programs ran so efficiently, the researchers did not need to render images in advance: they could generate images and train the model simultaneously, which sped up the process.
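A rough sketch of what generating data during training can look like, reusing the toy dead_leaves generator above; treating the generator’s seed as the “label” is a simplifying assumption for illustration, not the study’s setup.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, IterableDataset

class OnTheFlyImages(IterableDataset):
    """Yield synthetic images lazily instead of rendering a dataset up front."""

    def __iter__(self):
        while True:
            seed = int(torch.randint(0, 1_000, ()).item())
            img = dead_leaves(rng=np.random.default_rng(seed))   # (64, 64, 3)
            yield torch.from_numpy(img).permute(2, 0, 1).float(), seed

loader = DataLoader(OnTheFlyImages(), batch_size=32)
images, seeds = next(iter(loader))  # each batch is synthesized on demand
```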
The researchers used their enormous dataset of image-generating programs to pre-train computer vision models for both supervised and unsupervised image classification tasks. In supervised learning the images are labeled; in unsupervised learning the model learns to categorize images without labels.
The models trained with this large dataset of programs classified images more accurately than previous synthetically trained models. The researchers also showed that adding more image programs to the dataset steadily improved model performance, suggesting a straightforward way to increase accuracy.
The accuracy levels were still below those of models trained on real data, but the method narrowed the performance gap between real-data and synthetic-data training by an impressive 38%.
The researchers also pre-trained with each image-generating program individually to identify the factors that influence model accuracy. They found that a model performs better when a program produces a more varied set of images, and that vibrant images whose scenes fill the whole canvas improve performance the most.
Through this research, the team emphasizes that their findings raise questions about the true complexity of the computer vision problem: if very short programs can generate the data to train a high-performing computer vision system, then building such a system may be simpler than previously thought and may not require enormous data-driven pipelines. Their methods also enable training image classification systems when image datasets are unavailable, sidestepping the cost, bias, privacy, and ethical concerns of data collection. The researchers clarified that they are not advocating for eliminating datasets from computer vision entirely (real data may still be needed for evaluation), but rather exploring what can be done in the absence of data.
Flipkart is investigating how Web3 can reshape the future of commerce, consumption, and value creation, and transform shopping experiences for millions of people, through its partnership with Polygon and their new Blockchain-eCommerce Centre of Excellence.
The collaboration follows a number of recent forays into Web3 by the leading e-commerce company. Flipkart Labs, its innovation arm introduced earlier this year to incubate ideas for the Indian e-commerce sector, has spent the past year exploring NFTs, virtual immersive stores, and other blockchain use cases related to Web3 and metaverse commerce.
Before its recent festival-season sale, Flipkart collaborated with the Ethereum scaling protocol on Flipverse, its interactive virtual shopping platform on the metaverse. Flipverse was created by eDAO, a company established by Polygon, in partnership with 23 teams from the tech, design, Web3, and brand industries. By encouraging new forms of engagement and NFT-driven community exploration, Flipverse reoriented the traditionally top-down relationship between customers and brands.
Jeyandran Venugopal, Chief Product and Technology Officer, Flipkart, said, “With the COE, we look forward to working with Polygon and leveraging their expertise and technical know-how to successfully onboard users not just to the value proposition of Web3 or Metaverse commerce but also Web3 in general.”
This partnership aims to bolster research and development at the nexus of Web3 and experiential retail, driving adoption and impact in India and around the world, according to Sandeep Nailwal, co-founder of Polygon. He added that the Blockchain-eCommerce Centre of Excellence “will be a driving force in the future development of e-commerce.”
The State Bank of India, ICICI Bank, Yes Bank, and IDFC First Bank are the first four banks to participate in the Reserve Bank of India’s testing of its retail central bank digital currency (CBDC), the digital rupee (e₹-R), in Mumbai, New Delhi, Bengaluru, and Bhubaneswar.
This pilot program will ultimately include participation from four other banks: Bank of Baroda, Union Bank of India, HDFC Bank, and Kotak Mahindra Bank. It will also be introduced in more cities, including Ahmedabad, Gangtok, Guwahati, Hyderabad, Indore, Kochi, Lucknow, Patna, and Shimla.
The introduction of a digital currency by the RBI this fiscal year is intended to advance the digital economy and facilitate effective currency management, as per Finance Minister Nirmala Sitharaman’s Budget 2022–23 Speech from earlier this year.
The RBI describes the CBDC as legal tender issued in digital form by a central bank. It carries the same value as fiat money and can be exchanged for it one-to-one. CBDC can be transacted via blockchain-backed wallets, which make payments final and reduce settlement risk.
Because the CBDC is freely convertible against physical money, it can be exchanged for the equivalent in paper notes. And unlike UPI, using e-rupees does not require a bank account.
In contrast to cryptocurrencies, the digital rupee has the key benefit of being centralized: administered by a single entity, it avoids the volatility associated with the likes of Bitcoin and Ethereum. It can also help prevent fraud. With inherent programmability and controlled traceability, a CBDC could combat fraud proactively, whereas the existing system relies on post-facto inspections to do so.
A report describing the Reserve Bank of India’s ambitions for the digital rupee, or “e-rupee,” was published earlier in October. It also outlined the reasons for the implementation of a CBDC and how it would be tested in distinct phases.
As per the official announcement by the central bank on Tuesday, the digital rupee will be distributed through intermediaries like banks and will be produced in the same denominations as present paper money and coins.
Users will be able to transact in e₹-R through a digital wallet provided by the participating banks and stored on mobile devices, the central bank explained, adding that both person-to-person (P2P) and person-to-merchant (P2M) transactions will be possible. A customer can pay a vendor by scanning a QR code displayed at the point of sale. The digital currency can be converted into other forms of money, such as bank deposits, as needed, but will not accrue any interest.
The Reserve Bank of India stated that the pilot would evaluate, in real time, the robustness of the entire process of creating, distributing, and using digital rupees at retail. Based on the insights from this pilot, it will eventually test more e-rupee features and applications.
To get started, download the CBDC app and provide a phone number associated with a bank account. After you successfully register, you are given a digital wallet with a unique ID, which you can top up by transferring funds from your bank account. The app then lets you choose digital currency in whatever denominations you like: to load ₹20,000, for example, you might request 20 units of ₹500, 50 units of ₹100, and 100 units of ₹50. Once you confirm, digital cash in those denominations appears in your wallet.
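As a quick illustration of that arithmetic, here is a hypothetical check that a requested denomination mix adds up to the amount being loaded; the function and its names are invented for illustration and are not part of any CBDC app.

```python
def validate_load(amount: int, request: dict[int, int]) -> bool:
    """request maps denomination -> units, e.g. {500: 20} for twenty ₹500 notes."""
    return sum(denom * units for denom, units in request.items()) == amount

# 20 x ₹500 + 50 x ₹100 + 100 x ₹50 = 10,000 + 5,000 + 5,000 = ₹20,000
assert validate_load(20_000, {500: 20, 100: 50, 50: 100})
```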
Several nations are exploring centralized digital currencies. While some are carrying out research, others have launched trial programs or formally implemented digital money.
In 2020, the Bahamas introduced the Sand Dollar, one of the first digital currencies issued by a central bank. In order to test integrating their domestic CBDCs, the central banks of Sweden, Norway, and Israel have started a project with the Bank for International Settlements. In October this year, the Central Bank of Nigeria celebrated the first anniversary of the launch of Africa’s first digital currency, the e-Naira.
This month in the United States, a coalition of banking institutions led by the Federal Reserve Bank of New York, HSBC, Mastercard, and Wells Fargo announced the launch of the Regulated Liability Network, a proof-of-concept digital money network. Through the Venus Initiative, France and Luxembourg settled a bond for 100 million euros (US$104 million) using an experimental CBDC. And on Monday, the National Bank of Ukraine unveiled plans for a possible electronic hryvnia that could serve a variety of purposes, including the issuance and exchange of virtual assets.
A team of researchers at DeepMind has developed an AI agent called DeepNash that plays the board game Stratego at an expert level.
Stratego is a two-player board game that is notoriously difficult to master. Each player’s goal is to capture the opponent’s flag, hidden among their initial 40 game pieces. Every piece carries a power ranking, and the higher-ranked piece wins a face-off; crucially, players cannot see the rankings of the opponent’s pieces until those pieces meet in a face-off.
DeepNash first learned to play Stratego against itself many times. DeepMind’s researchers devised a game-theoretic algorithm that drives the agent toward an optimal, hard-to-exploit strategy for every move. They describe the full system in the paper ‘Mastering the game of Stratego with model-free multiagent reinforcement learning.’
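DeepNash’s actual algorithm is far more sophisticated, but the underlying game-theoretic idea, converging on an unexploitable equilibrium strategy through repeated self-play, can be illustrated with regret matching on rock-paper-scissors. The sketch below is a standard toy demonstration, not DeepMind’s method.

```python
import numpy as np

# Two regret-matching players in rock-paper-scissors converge, on average,
# to the Nash equilibrium (play each action 1/3 of the time).
PAYOFF = np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]])  # row player's payoff

def strategy(regrets):
    pos = np.maximum(regrets, 0)
    return pos / pos.sum() if pos.sum() > 0 else np.full(3, 1 / 3)

regrets, avg = np.zeros((2, 3)), np.zeros((2, 3))
rng = np.random.default_rng(0)
for _ in range(100_000):
    s = [strategy(regrets[p]) for p in (0, 1)]
    a = [rng.choice(3, p=s[p]) for p in (0, 1)]
    # Regret: how much better each alternative action would have scored.
    regrets[0] += PAYOFF[:, a[1]] - PAYOFF[a[0], a[1]]
    regrets[1] += -PAYOFF[a[0], :] + PAYOFF[a[0], a[1]]
    avg += np.stack(s)

print(avg / avg.sum(axis=1, keepdims=True))  # both rows approach [1/3, 1/3, 1/3]
```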
Testing revealed that DeepNash achieved an 84% win rate against expert human players on the Gravon games platform, placing it among the top three players. Gravon is a virtual world that lets users play board and card games together. The researchers did not tell the players on Gravon that they were competing against a computer.
Google has announced that it has shut down Duplex on the Web, a service that enables Google Assistant to automate specific user tasks for site visitors.
According to the Google support page, Duplex on the Web is “deprecated” and, as of this month, will no longer be supported. Any automation features facilitated by Duplex on the Web will no longer be available, the page says.
“As we continue to enhance the Duplex experience, we are responding to feedback from developers and users about how to make it even better,” a Google spokesperson said.
“By the end of this year, we will turn down Duplex on the Web and focus fully on making AI advancements to the Duplex voice technology that assists people most every day,” they added.
The company launched Duplex on the Web at its 2019 Google I/O developer conference. The feature lets Google Assistant perform actions on websites on the user’s behalf, all under the user’s full supervision; the user can terminate the process and take back control at any time.
There has been a lot of hype around generative AI since the beginning of 2022. Social media platforms such as Reddit and Twitter are full of images created with generative machine-learning models such as Stable Diffusion and DALL-E. Startups building products on generative models are attracting massive funding despite the market downturn, and large tech companies have begun integrating generative models into their mainstream products.
The concept of generative AI is not new. With a few exceptions, most of the advancements we are witnessing today have existed for several years. However, the emergence of several trends has made it possible to make the most out of the generative models and bring them to everyday applications. The field still has several challenges to overcome, but there is no doubt that the generative AI market is bound to grow in 2023.
Advancements in Generative AI
Generative AI rose to fame in 2014 with generative adversarial networks (GANs), a deep learning architecture that can create realistic images, such as faces, from noise maps. Scientists later built GAN variants for other tasks, like transferring the style of one image onto another. GANs, along with variational autoencoders (VAEs), another deep learning architecture, later ushered in the era of deepfakes: AI-manipulated videos and images that swap one person’s face for another.
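To make the adversarial setup concrete, here is a minimal PyTorch sketch of the GAN idea: a generator maps noise to images while a discriminator learns to tell real from generated. The layer sizes and training details are illustrative assumptions, not any specific published model.

```python
import torch
import torch.nn as nn

# Tiny generator and discriminator over flattened 28x28 images.
G = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 28 * 28), nn.Tanh())
D = nn.Sequential(nn.Linear(28 * 28, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def gan_step(real):                    # real: (batch, 784) images scaled to [-1, 1]
    z = torch.randn(real.size(0), 64)  # a batch of noise maps
    fake = G(z)
    # Discriminator: push real toward label 1, generated toward label 0.
    d_loss = (bce(D(real), torch.ones(real.size(0), 1))
              + bce(D(fake.detach()), torch.zeros(real.size(0), 1)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator: fool the discriminator into predicting 1 on fakes.
    g_loss = bce(D(fake), torch.ones(real.size(0), 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

gan_step(torch.rand(32, 784) * 2 - 1)  # one adversarial update on dummy data
```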
The year 2017 ushered in the transformer, the deep learning architecture that underlies large language models (LLMs) such as GPT-3, LaMDA, and Gopher. Transformers generate text, software code, and even protein structures. A variant called the vision transformer is also used for visual tasks such as image classification, and the original version of OpenAI’s DALL-E used a transformer to create images from text.
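For readers unfamiliar with the architecture, here is a minimal sketch of scaled dot-product attention, the core operation of the transformer; it omits the multi-head projections, masking, and positional encodings that real LLMs add.

```python
import torch
import torch.nn.functional as F

def attention(q, k, v):
    """Scaled dot-product attention: mix values by query-key similarity."""
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    return F.softmax(scores, dim=-1) @ v

out = attention(*(torch.randn(1, 5, 64) for _ in range(3)))  # (1, 5, 64)
```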
Contrastive Language-Image Pre-training (CLIP), a technique OpenAI introduced in 2021, became crucial to text-to-image generators. CLIP learns a shared embedding space for text and images from image-caption pairs collected from the internet. DALL-E 2 combined CLIP with diffusion, another deep learning technique that generates images from noise, to create high-resolution images of stunning quality and detail.
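CLIP’s contrastive objective can be sketched in a few lines. The function below assumes the image and text encoders have already produced fixed-size vectors (random embeddings stand in for them here); real CLIP pairs a vision transformer with a text transformer.

```python
import torch
import torch.nn.functional as F

def clip_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of matched image-caption pairs."""
    img = F.normalize(image_emb, dim=-1)      # unit-length embeddings
    txt = F.normalize(text_emb, dim=-1)
    logits = img @ txt.t() / temperature      # pairwise cosine similarities
    targets = torch.arange(len(img))          # i-th image matches i-th caption
    return (F.cross_entropy(logits, targets)
            + F.cross_entropy(logits.t(), targets)) / 2

loss = clip_loss(torch.randn(8, 512), torch.randn(8, 512))
```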
Moving into 2022, larger models, better algorithms, and more extensive datasets improved the output of generative models, which now create superior images, generate long stretches of (mostly) coherent text, and write high-quality software code. Several models also became available to the general public, fueling their popularity: in September, OpenAI removed the waitlist for DALL-E 2, opening its text-to-image generator to everyone.
“More than 1.5 million users are now actively creating over 2 million images a day with DALL-E, from artists and creative directors to authors and architects, with about 100,000 users sharing their creations and feedback in our Discord community,” said an OpenAI spokesperson, elaborating on the popularity of their generative AI tool.
Newer Applications
Generative models were first presented as systems that could produce large chunks of creative work on their own. GANs became popular for generating complete images from minimal input, and LLMs like GPT-3 were in the spotlight for writing full articles.
But as the field has evolved, it has become evident that generative AI models are quite unreliable when left to their own devices. Many scientists believe that current deep learning models, no matter how large, lack some essential components of intelligence, which makes them prone to unpredictable mistakes. Meta’s recent experience with Galactica, a large language model meant to generate academic papers from simple prompts, is a case in point: after users reported that it produced “statistical nonsense” and confidently wrong content, Meta withdrew the public demo.
Product teams are finding that generative models perform best when implemented in ways that facilitate greater user control. The past year witnessed several products that use generative models in clever, human-centric ways. For instance, Copy AI, a tool that uses GPT-3 to create blog posts, has an interactive interface where the writer and the LLM create the outline of the article and build it up together. Applications developed with DALL-E 2 and Stable Diffusion also facilitate user control with features that allow for regenerating, configuring, or editing the output of the generative AI model.
As the principal scientist at Google Research, Douglas Eck, said at a recent AI conference, “It is no longer about a generative AI model that creates a realistic picture. It is about making something that you created yourself. Technology should serve our need for agency and creative control over our actions.”
Conclusion
The generative AI industry still has many challenges to overcome, including copyright and ethical complications. Nevertheless, it is exciting to watch the field thrive. As major generative AI models become accessible to the general public, these powerful tools are reaching a far wider audience. Meanwhile, big companies like Microsoft are leveraging their exclusive access to OpenAI’s technology, their cloud infrastructure, and the huge market for creativity tools to bring generative models to their users.
However, down the road, the real potential of generative AI might manifest itself in unexpected markets. Who knows, perhaps generative AI will give birth to a new era of applications that we have never thought of before.
Reddit, the American social news and content aggregator, has hit an all-time high in NFT avatar minting, with over 255,000 avatars minted in a single day, roughly 55,000 more than the previous record set on August 30–31.
In July, Reddit launched its limited-edition NFT avatars, created by independent artists. Initially, Reddit avoided using cryptocurrency for avatar purchases and referred to the avatars as digital “collectibles” rather than NFTs, so the collection was widely viewed as a strategy to encourage mainstream adoption of blockchain technology.
Within a few months, Reddit’s NFT avatar trading volume reached US$1.5 million, as per a Dune Analytics and Polygon report. The surge in avatar trading accounted for more than a third of total volume (US$4.1 million), while daily sales of the digital collectibles skyrocketed to 3,780.
On secondary NFT marketplaces like OpenSea, some of the most expensive Reddit NFTs have sold for over US$300, even though the platform’s own marketplace lists values of only around US$50.
Researchers from the Kyoto University Institute for the Future of Human Society have demonstrated AI’s ability to produce literary art such as haiku, a Japanese poetic form.
A study led by Yoshiyuki Ueda, one of the researchers at Kyoto University, compared haiku generated by AI without human intervention, known as ‘human out of the loop’ (HOTL), with haiku produced through the opposite approach, ‘human in the loop’ (HITL).
The research involved 385 participants who evaluated 80 haiku: 40 AI-generated poems (20 each of HITL and HOTL) and 40 composed by professional haiku writers. “It was interesting that the evaluators found it difficult to differentiate between human-generated haiku and AI-generated haiku,” Ueda said.
In the results, HITL haiku earned the most praise for their poetic quality, whereas HOTL and human-composed haiku scored similarly. However, the researchers observed algorithm aversion among the evaluators: although they were supposed to rate without bias, they tended to give lower scores to poems they believed were AI-generated.
According to the researchers, AI’s capability in haiku creation is an essential first step toward AI collaborating with humans to produce more creative work.