
Stability AI, creator of Stable Diffusion, gets US$101 million in funding

Stability AI, creator of Stable Diffusion and DreamStudio, recently announced that it had secured US$101 million in a fundraising round to support the creation of open-source systems. The investment round was led by O’Shaughnessy Ventures LLC, Coatue, and Lightspeed Venture Partners. The London-based company will use the funds to accelerate the development of open AI models for language, image, audio, 3D, and video, and for consumer and global enterprise use cases.

Unveiled in August, Stable Diffusion is an open-source text-to-image generator similar to OpenAI’s DALL-E. Like most of its contemporaries, it promises to let billions of people produce beautiful art instantly. The model builds on the widely used latent diffusion work of CompVis and Runway ML (a video editing company), on the conditional diffusion models developed by Katherine Crowson, lead generative AI developer at Stability AI, and on ideas from OpenAI’s DALL-E 2, Google’s Imagen, and academics at Ludwig Maximilian University of Munich.

With the debut of independent research lab Midjourney’s self-titled product in July and OpenAI’s DALL-E 2 in April, AI image generators have become increasingly popular this year. In May, Google also unveiled Imagen, a text-to-image technology that is not yet accessible to the general public.

Unlike systems such as the original DALL-E, Stable Diffusion does not use an auto-regressive approach, in which visual output is generated step by step from learned probability distributions. Instead, it creates images with latent diffusion models (LDMs). An autoencoder first compresses images into a latent representation space, the compact information needed to represent the data; the diffusion model then generates an image by iteratively denoising a sample in that latent space, and the decoder turns the resulting representation back into a full image.
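To make the latent diffusion description above concrete, here is a minimal, purely illustrative PyTorch sketch of the inference loop. The stand-in networks and the simplified update rule are assumptions for illustration; Stable Diffusion’s real pipeline uses a trained VAE, a text-conditioned U-Net, and a proper noise schedule.

```python
# Conceptual latent-diffusion sketch with untrained stand-in modules;
# not Stable Diffusion's actual implementation.
import torch
import torch.nn as nn

class ToyDenoiser(nn.Module):
    """Stand-in for the U-Net that predicts noise in latent space."""
    def __init__(self, channels=4):
        super().__init__()
        self.net = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, latents, t):
        return self.net(latents)  # untrained placeholder for the noise prediction

class ToyDecoder(nn.Module):
    """Stand-in for the autoencoder's decoder that maps latents back to pixels."""
    def __init__(self, channels=4):
        super().__init__()
        self.net = nn.ConvTranspose2d(channels, 3, kernel_size=8, stride=8)

    def forward(self, latents):
        return self.net(latents)

denoiser, decoder = ToyDenoiser(), ToyDecoder()
latents = torch.randn(1, 4, 64, 64)        # start from pure noise in latent space
for t in reversed(range(50)):              # iterative denoising loop
    noise_pred = denoiser(latents, t)
    latents = latents - 0.02 * noise_pred  # simplified step; real samplers follow a schedule
image = decoder(latents)                   # decode the 64x64 latents into a 512x512 RGB image
print(image.shape)                         # torch.Size([1, 3, 512, 512])
```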

Because Stable Diffusion is open source, users can get around any restrictions that are in place, unlike DALL-E and Midjourney, which have measures to block the creation of graphic or pornographic images. The open-source release distinguishes it from its competitors: Stability AI has made all the details of its AI model, including the model’s weights, available for anyone to inspect and use.

The model was trained on Stability AI’s Ezra-1 AI ultracluster of 4,000 A100 GPUs. The company has been putting the model through extensive testing, with more than 10,000 beta testers producing 1.7 million images every day.

The main model was trained on LAION-Aesthetics, a subset of LAION-5B (LAION stands for Large-scale Artificial Intelligence Open Network) constructed with a new CLIP-based filter that scored LAION-5B images according to how “beautiful” Stable Diffusion’s alpha testers rated them. Stable Diffusion can quickly produce 512×512-pixel images on consumer GPUs with less than 10 GB of VRAM. This revolutionizes image production by allowing researchers and, eventually, the general public to use the tool in a wide range of settings.
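As an illustration of how accessible this makes the model, the sketch below generates a 512×512 image with the Hugging Face diffusers library in half precision, which typically fits in under 10 GB of VRAM. The checkpoint name and memory figures are assumptions and may differ from Stability AI’s own release.

```python
# Minimal sketch of running Stable Diffusion on a consumer GPU, assuming the
# `diffusers` library and the runwayml/stable-diffusion-v1-5 checkpoint.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,      # half precision to reduce VRAM usage
)
pipe = pipe.to("cuda")
pipe.enable_attention_slicing()     # trades a little speed for lower peak memory

image = pipe("a watercolor painting of a lighthouse at dawn",
             height=512, width=512).images[0]
image.save("lighthouse.png")
```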

Read More: Microsoft introduces DALL-E 2 with Designer and Image Creator

On the grimmer side, private medical information and copyrighted works were both included in the dataset used to train Stable Diffusion. Fearing copyright infringement lawsuits, Getty Images prohibited the submission of content created by systems like Stable Diffusion. U.S. House Representative Anna G. Eshoo even criticized Stability AI recently in a letter to the National Security Advisor (NSA) and the Office of Science and Technology Policy (OSTP), urging them to address the release of “unsafe AI models” that do not filter the content posted on their platforms.

Stability AI’s other consumer-facing product is DreamStudio, a new suite of generative media tools built to give everyone the power of infinite imagination and the seamless simplicity of visual expression through a mix of natural language processing and novel input controls for rapid creation. Stability AI also offers financial support to an organization called Harmonai. In late September, Harmonai unveiled Dance Diffusion, an algorithm and collection of tools that can create musical clips by learning from hundreds of hours of pre-existing music.


Meta builds first AI-powered translation system for Hokkien language


AI translation primarily focuses on written languages. However, around half of the world’s 7,000+ living languages are mainly oral, i.e., without a standard or widely used writing system. As a result, machine translation tools cannot be built for them using standard techniques, which require large amounts of written text to train the AI models.

To address this challenge, Meta has built the first-ever AI-powered translation system for a primarily oral language, Hokkien, which is widely spoken within the Chinese diaspora. Meta’s technology allows Hokkien speakers to converse with English speakers.

The open-sourced AI translation system is part of Meta’s Universal Speech Translator (UST) project. The project is developing new AI methods that will eventually allow real-time speech-to-speech translation for all extant languages. Meta believes that spoken communication can help break down barriers and bring people closer wherever they are. Recently, Mark Zuckerberg announced that the company plans to build a universal language translator for the metaverse.

Read More: Meta AI’s New AI Model Can Translate 200 Languages With Enhanced Quality

To develop the new system, Meta’s AI researchers had to overcome many of the complex challenges that face traditional machine translation systems, including data gathering, evaluation, and model design. Meta is open-sourcing not just its Hokkien translation models but also the evaluation datasets, so that others can reproduce and build on its work.

Moreover, the techniques can be extended to other written and unwritten languages. Meta is also releasing SpeechMatrix, a large corpus of speech-to-speech translations mined with its LASER data mining technique. Researchers will be able to create their own speech-to-speech translation (S2ST) systems and build on Meta’s work.
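As a rough illustration of similarity-based mining in the spirit of LASER, the sketch below pairs utterances from two languages by the cosine similarity of their embeddings. The random vectors stand in for real speech or sentence embeddings, and the simple argmax matching is a simplification of the margin-based criteria typically used in such pipelines.

```python
# Toy mining sketch: pair source and target utterances whose embeddings are
# most similar. Placeholder random embeddings, not real LASER output.
import numpy as np

rng = np.random.default_rng(0)
src_emb = rng.normal(size=(5, 1024))   # e.g., Hokkien utterance embeddings
tgt_emb = rng.normal(size=(7, 1024))   # e.g., English utterance embeddings

# Normalise so that dot products equal cosine similarity.
src_emb /= np.linalg.norm(src_emb, axis=1, keepdims=True)
tgt_emb /= np.linalg.norm(tgt_emb, axis=1, keepdims=True)

sim = src_emb @ tgt_emb.T              # (5, 7) similarity matrix
best = sim.argmax(axis=1)              # best target match per source utterance
for i, j in enumerate(best):
    if sim[i, j] > 0.0:                # real pipelines apply stricter thresholds
        print(f"pair source[{i}] with target[{j}] (score={sim[i, j]:.2f})")
```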


Interpol launches the first-ever metaverse for global law enforcement agencies


Interpol on Thursday launched the first-ever metaverse specifically designed for global law enforcement agencies, during its 90th General Assembly in Delhi.

The Interpol Metaverse enables registered users to take a virtual tour of the Interpol General Secretariat headquarters in Lyon without any physical boundaries. It also allows officers to interact with one another through avatars and to take immersive training courses in forensic investigation and other policing skills.

According to Interpol, as the number of metaverse users grows and the technology develops further, the list of possible crimes will expand to potentially include crimes against children, counterfeiting, ransomware, phishing, data theft, money laundering, financial fraud, and sexual harassment. In May this year, the UAE’s AI Minister called for laws and action against crimes in the metaverse.

Read More: Clearview AI Hit With Fine In France For GDPR Breaches

For law enforcement, some of these threats may present significant challenges, as not all acts criminalized in the physical world are considered crimes when committed in the virtual world.

According to Madan Oberoi, Interpol’s Executive Director of Technology and Innovation, identifying these risks from the outset makes it possible to work with stakeholders to build the necessary governance frameworks and cut off future crimes before they are fully formed.

During an interactive session and a follow-up panel discussion on Thursday in Delhi, Interpol also announced the creation of an expert group on the metaverse to represent the concerns of law enforcement on the global stage, thus ensuring the new virtual world is secure by design.

According to Interpol’s Global Crime Trend report, crime has increasingly moved online as digitalization has spread.


Totality Corp to take Diwali celebrations to the metaverse


NFT gaming company Totality Corp announced that it will take Diwali celebrations to the metaverse for its users and community members this year.

Totality Corp’s platform creates NFTs and tokens based on Indian culture and mythology. It will organize a Lakshmi Puja in the metaverse through its Zionverse app, allowing users to celebrate Diwali and experience the puja virtually.

The puja has been scheduled across five days, from 21st October to 26th October, and will run 24×7 in the metaverse. Each puja will last about five to seven minutes, and users can attend with their friends and family.

Read More: Meta’s Horizon Worlds Struggling To Gain New Users

Zionverse will offer two types of rooms for celebrations, public and private, which users can explore based on their preferences. In a private room, they can invite family and friends through a QR code, while in the public room they can celebrate with the entire community.

Users will also have the chance to enter a raffle game, in which fifty lucky winners will receive digital gold worth Rs 2,00,000.

Zionverse is a digital ecosystem with many opportunities for web3 enthusiasts, game developers, gamers, and artists. The company has been undertaking such initiatives in the metaverse to bring its community into a world of futuristic possibilities.


Meta Releases the SpeechMatrix Dataset for Speech-to-Speech Translation


Meta has released the SpeechMatrix dataset, a vast collection of multilingual parallel speech mined from VoxPopuli in seventeen languages, enabling researchers to build their own speech-to-speech (S2S) translation systems.

The Hokkien S2S system was developed under Meta’s Universal Speech Translator (UST) project. Hokkien, one of Taiwan’s official languages, is widely spoken in the Chinese diaspora but does not have a standard written form, and the company stated that its AI researchers developed translation tools specifically for this language.

Meta said AI translation has been around for the past few years, mainly for written languages. However, more than 40% of the world’s 7,000+ languages are primarily oral and do not have a written standard.

Read More: Meta AI’s New AI Model can Translate 200 Languages with Enhanced Quality.

The company wrote, “We plan to use our Hokkien translation system as part of a universal speech translator and will open source our model, code, and training data for the AI community to enable other researchers to build on this work.”

Hokkien speakers can now communicate with English speakers using Meta’s latest S2S translation technology. More than 8,000 hours of Hokkien speech have been mined, along with the corresponding English translations, Meta claimed, adding that the technology may be applied to other unwritten languages and will eventually work in real time.

Even though the model is still under development and can only translate one full sentence at a time, Meta said, “It is a step towards a future where simultaneous translation between languages is achievable.”


Synchrotron X-ray Microdiffraction Image Screening enabled by Federated Learning


A new synchrotron X-ray microdiffraction (μXRD) image screening method based on federated learning (FL) has recently been proposed by a research team led by Prof. Zhu Yongxin from the Shanghai Advanced Research Institute (SARI) of the Chinese Academy of Sciences, to improve screening while safeguarding data privacy.

Synchrotron μXRD exploits the dual wave-particle nature of X-rays, like traditional XRD, to probe the structure of crystalline materials. In traditional XRD analysis, scattered waves interfere with one another; where they are in phase, the constructive interference produces bright (high-intensity) spots or peaks in the diffraction pattern recorded on an area detector. Unlike traditional XRD, which typically has a spatial resolution of several hundred micrometers to several millimeters, synchrotron μXRD uses X-ray optics to focus the excitation beam onto a small spot on the sample surface, allowing minute features of the sample to be analyzed. The high flux, tunable and well-defined wavelength, and superior collimation of synchrotron radiation consequently give synchrotron μXRD better sensitivity and diffraction-peak resolution than traditional laboratory XRD.
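The positions of those peaks follow the standard Bragg condition (a textbook relation, not quoted from the study itself):

```latex
% Constructive interference occurs when the path difference between waves
% scattered by adjacent lattice planes is an integer number of wavelengths.
\[
  n\lambda = 2d\sin\theta, \qquad n = 1, 2, 3, \dots
\]
% \lambda: X-ray wavelength, d: lattice-plane spacing,
% \theta: angle between the incident beam and the lattice planes.
```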

The micro-diffraction technique is often applied to smaller or non-homogeneous samples with different compositions, lattice strains, or crystallite orientations.

Industrial minerals are examined with synchrotron X-ray microdiffraction to characterize their crystallinity and potential impurities. The enormous number of images that μXRD facilities produce must be screened before being processed and stored. However, a synchrotron μXRD facility cannot cope with a massive inflow of images in a short period of time, and having humans annotate every image would be challenging and expensive.

At the same time, because service users are reluctant to share their original experimental images, there are not enough labeled examples to train a screening model efficiently. Industrial users’ privacy concerns about using μXRD services are a further barrier to the development of precise μXRD image screening.

In principle, the data contributed by several organizations could be compiled into a single large, coherent database and used to train a big data model. But industrial imagery can include sensitive and private information that is generally not authorized to leave the establishment where it was created, particularly when effective de-identification cannot be assured. Because of competing interests, each institution may also be unwilling or unable to share its own data with others. Without enough varied data, it is hard to build reliable synchrotron μXRD image screening, and isolated or scant resources can lead to misclassified results. For industrial material testing on commercial data, bias or a lack of variety in the images therefore creates the need for a shared technology that does not require data centralization, and that prevents the parameters contributed by one institution from being misused to infer the data or models of another, with participants forming an alliance under a protocol agreed by all sides. Federated learning among industrial users is one way to address this problem.

Federated learning takes machine learning models to the data source rather than bringing the data to the model. This approach, often referred to as collaborative learning, enables large-scale model training on data that remains scattered across the devices where it was originally collected. Federated learning unites multiple computing devices into a decentralized system in which each data-collecting device helps train the model. This is advantageous because conventional machine learning methods for image classification at device interfaces carry a risk of privacy breaches, whereas federated learning mitigates such issues to some extent by keeping device data local and training a local model.

Read More: FedLTN: A Novel Federated learning-based system by MIT Researchers

Using its local data, each device trains its own copy of the model and then sends the resulting parameters or weights to a master device, or server, which aggregates them and updates the global model. This training procedure is repeated until the required accuracy is reached. In a nutshell, the idea behind federated learning is that only model updates are ever transferred between devices or parties, never any training data.
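The procedure described above corresponds to the classic federated averaging (FedAvg) scheme. The toy Python sketch below, using synthetic client data and simple linear models, illustrates that pattern; it is not the authors’ actual μXRD screening code.

```python
# Toy FedAvg sketch: each client trains locally, only weights are shared,
# and the server averages them into a global model.
import numpy as np

rng = np.random.default_rng(42)
n_clients, n_features, rounds = 3, 8, 5
global_w = np.zeros(n_features)

# Each client holds its own private data; the data never leaves the client.
client_data = [
    (rng.normal(size=(20, n_features)), rng.normal(size=20)) for _ in range(n_clients)
]

def local_update(w, X, y, lr=0.1, epochs=10):
    """Train a local copy of a linear model with plain gradient descent."""
    w = w.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

for r in range(rounds):
    local_weights = [local_update(global_w, X, y) for X, y in client_data]
    global_w = np.mean(local_weights, axis=0)   # server aggregates the parameters
    loss = np.mean([np.mean((X @ global_w - y) ** 2) for X, y in client_data])
    print(f"round {r + 1}: mean client loss = {loss:.3f}")
```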

To increase the accuracy of federated learning, the researchers incorporated domain-specific physical information. They then implemented new client sampling algorithms to account for the uneven data distributions found in the real world. Finally, a hybrid training architecture was developed to cope with the erratic communication environment between federated learning clients and servers.

Extensive experiments showed that the accuracy of the machine learning models increased from 14% to 25%, and that data characteristics can be shared across users or applications without compromising commercially sensitive information.

This federated learning-powered synchrotron X-ray microdiffraction image screening technology will help remove non-technical barriers to data sharing. It saves the expense of training specialists with domain knowledge, saves experts’ working time without compromising the efficiency of intelligent classification, preserves the privacy of local clients, and makes use of sample information from different clients and organizations. It also encourages the use of unsupervised machine learning, which, unlike supervised machine learning, does not require vast troves of annotated image data. The researchers say that with their methodology, edge devices on the client side can be equipped with federated learning software packages or even deployed with customized hardware. Once the software (or hardware) is ready, rather than depending on experts for annotations, users can have their images of industrial samples labeled intelligently and automatically by the federated learning pipeline as data flows in, without human intervention.

The researchers published their findings on federated learning-based synchrotron X-ray microdiffraction screening and inference in IEEE Transactions on Industrial Informatics.


Clearview AI hit with fine in France for GDPR breaches


Clearview AI, the controversial facial recognition firm, has been hit with another fine in Europe. Clearview AI scrapes selfies and other personal information off the Internet without consent and feeds them into an AI-powered identity-matching service that the company sells to law enforcement and other organizations.

This fine comes after Clearview AI failed to respond to an order issued last year by the CNIL, France’s privacy watchdog, to stop its unlawful processing of citizens’ information and delete their data.

Clearview AI responded to that order by ignoring the regulator, thereby adding a third GDPR breach, non-cooperation, to its earlier tally. Italy’s privacy watchdog fined Clearview AI €20 million in March for similar breaches.

Read More: Clearview AI Fined In the UK For Illegally Storing Images

Here’s the CNIL’s summary of Clearview’s breaches:

  • Breach of Articles 12, 15, and 17 of the GDPR: individuals’ rights not respected.
  • Breach of Article 6 of the GDPR: unlawful processing of personal data.
  • Breach of Article 31 of the GDPR: lack of cooperation with the CNIL.

According to the CNIL, Clearview AI had been given two months to comply with the injunctions in the formal notice and to justify its compliance to the CNIL. However, it did not respond to the formal notice.

As a result, the chair of the CNIL decided to refer the matter to the restricted committee, which is in charge of issuing sanctions. Based on the information brought to its attention, the restricted committee imposed the maximum financial penalty of 20 million euros under Article 83 of the General Data Protection Regulation (GDPR).


Microsoft fires 1,000 employees across divisions to contain slowdown


Technology giant Microsoft has reportedly fired almost 1,000 employees across multiple divisions. It has now joined the likes of Flipboard and Snap, which have also resorted to job cuts to contain the slowdown.

In a statement to the US-based website Axios, Microsoft said that, like all companies, it evaluates business priorities and makes structural adjustments accordingly. Microsoft added that it would continue to invest in its business and hire in key growth areas.

In July, the tech giant said that a small number of roles had been eliminated and that it would increase its headcount down the line. Several big tech companies have opted for hiring freezes or job cuts amid the slowdown.

Read More: Meta Launches New Ad Campaign To Target Apple’s IMessage Platform

Last month, during a weekly Q&A session with employees, Meta’s chief executive officer Mark Zuckerberg announced that the company would cut budgets across most of its teams. He said the company would freeze hiring and restructure some teams to trim expenses, Bloomberg reported.

In August this year, iPhone maker Apple let go of about 100 contract-based recruiters as part of its push to rein in hiring and spending, Bloomberg reported. In the same month, Snapchat parent Snap decided to cut jobs to refocus the business on growing ad revenue.

In a letter to employees, Snap CEO Evan Spiegel said it had become clear that the company must reduce its cost structure to avoid incurring significant ongoing losses.

“As a result, we have made the hard decision to reduce the size of our team by about 20%. These changes differ from team to team, depending upon investment needed and the level of prioritization to execute against our strategic priorities”, Spiegel’s letter read.


Tesla tops registrations of battery-electric vehicles in Germany, beats Volkswagen


According to federal data, Tesla has topped registrations of battery-electric vehicles in the first nine months of this year in Germany at nearly 38,500, beating Volkswagen by around 6,000 registrations.

Tesla’s battery-electric registrations jumped nearly 50% from last year’s January-September period. In comparison, Volkswagen’s dropped 40% to almost 32,300, in line with a broader decline across most Volkswagen Group brands.

According to the data from the federal motor transport authority, Audi and Seat were the only Volkswagen Group brands to see a rise in the number of battery-electric cars registered in Germany. Globally, Volkswagen Group’s total deliveries of battery-electric vehicles increased by 25% in January-September from a year earlier.

Read More: Meta India Reports Gross Advertising Revenue Of $2 Bn For FY22

But supply chain bottlenecks have hit the carmaker especially hard in Europe, where inflation and rising energy costs also weigh on demand. Across all vehicle types, including combustion-engine, hybrid, and battery-electric, deliveries of Volkswagen Group vehicles fell 12.9% globally this year, the carmaker reported last week, with Europe the hardest-hit region.

A Volkswagen spokesperson said, “The tense situation of component supply has continuously led to adjustments in production. We expect a stabilization of supply over the coming year.” Tesla has seen record deliveries worldwide but also faced logistical challenges and delivered less in the third quarter than analysts had expected.


Tesla launches home charging station that works with other EV brands too


Electric vehicle manufacturer Tesla has launched a home charging station, or Wall Connector as it calls it, that works with other electric car brands, too, not just Tesla vehicles.

According to the auto-tech website Electrek, Tesla has launched a brand-new version of its J1772 Wall Connector, priced at $550 on its official website.

The company said that the J1772 Wall Connector is a convenient and easy charging solution for Tesla and non-Tesla electric vehicles alike. It is ideal for houses, apartments, hospitality properties, and workplaces.

Read More: Meta Launches New Ad Campaign To Target Apple’s IMessage Platform

The report said that Tesla’s description suggests the automaker might be going after the commercial charging market. With multiple power settings, up to 44 miles of range added per hour of charging, a versatile indoor/outdoor design, and a 24-foot cable, the J1772 Wall Connector provides unparalleled convenience.

It can also power-share to maximize existing electrical capacity, automatically distributing power to enable the charging of multiple vehicles simultaneously. Tesla’s own electric vehicles can also use the station, but they will need an adapter provided with the vehicle.

According to data compiled by Finbold, Tesla has installed 3,971 Supercharger stations globally so far, a growth of 33.88% from the 2,966 recorded during the same period in 2021.
