
Tesla recalls nearly 1.1 million cars in the US

Tesla is recalling nearly 1.1 million cars in the US because the windows might close too fast and pinch people’s fingers. Documents filed with American regulators explain that the windows may not react correctly after detecting an obstruction.

The National Highway Traffic Safety Administration said the defect violates federal safety standards. Tesla says a software update will fix the problem. The world’s largest electric-vehicle manufacturer has had repeated run-ins with federal safety regulators.

Previous recalls have been due to rear-view cameras, bonnet latches, seat-belt reminders, and sound-system software. Tesla chief executive Elon Musk criticized the term “recall,” tweeting: “The terminology is outdated & inaccurate. This is a small over-the-air software update. To the best of our knowledge, there have been no injuries.”

Read More: Germany’s KBA Finds Abnormalities In Tesla’s Autopilot Function 

The latest recall includes all four Tesla models, specifically 2017-22 Model 3 sedans and some 2020-21 Model Y SUVs, Model X SUVs, and Model S sedans. Tesla detected the problem with the automatic windows during production testing in August.

Owners will be notified by letter from 15th November. Company documents indicate vehicles made after 13th September already have the updated software to remedy the issue. Tesla said it was unaware of any warranty claims, deaths, crashes, or injuries related to the recall.


Whisper: OpenAI’s Latest Bet On Multilingual Automatic Speech Recognition

Image Credit: Analytics Drift

OpenAI has released Whisper, an open-source automatic speech recognition system that the company says allows for “robust” transcription in various languages as well as translation from those languages into English. 

With the increasing use of smartphones and voice assistant devices, multilingual speech recognition has become essential. The demand for multilingual automatic speech recognition that can handle linguistic and dialectal differences is growing as globalization progresses. While most speech recognition tools cater to English-speaking users, English is not the most spoken language in the world, so the lack of language coverage can be a barrier to adoption.

In addition, mixing more than one language in conversation is common in cultures where individuals are bilingual or trilingual, which makes a strong case for developing multilingual models. It is also quite likely that many languages in a multilingual setting share a cultural heritage and therefore have similar phonetic and semantic characteristics. Moreover, the absence of a well-known multilingual voice recognition system draws attention to a fascinating area of speech recognition research that monolingual systems have long dominated.

The researchers trained Whisper using 680,000 hours of multilingual and multitask supervised data acquired from the web. According to OpenAI’s blog post, using such a large and diverse dataset improves the system’s ability to adapt to accents, background noise, and technical language.   

While the variance in audio quality can aid in the robust training of a model, variability in transcript quality is not as advantageous. Initial examination of the raw information revealed a large number of substandard transcripts. This is why OpenAI created several automatic filtering techniques to enhance the quality of transcripts. The company also noted that many online transcripts were produced by other automatic speech recognition systems rather than by actual humans. A recent study has demonstrated that training on datasets containing both human- and machine-generated data can considerably harm translation system performance. Therefore, OpenAI developed numerous heuristics to find and exclude machine-generated transcripts from the training dataset to prevent the system from picking up “transcript-ese.”
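The filtering step can be pictured with a toy heuristic. The predicate below is an illustrative sketch of the kind of signal OpenAI describes (other ASR systems often emit text without punctuation or with uniform casing), not the company’s actual filter:

```python
# A hedged sketch of one heuristic for flagging machine-generated transcripts:
# raw ASR output often lacks punctuation and uses uniform casing. Illustrative
# only -- OpenAI's real pipeline combined many heuristics at dataset scale.

def looks_machine_generated(transcript: str) -> bool:
    has_punct = any(ch in transcript for ch in ".,?!")
    all_lower = transcript == transcript.lower()
    all_upper = transcript == transcript.upper()
    # No punctuation combined with uniform casing suggests raw ASR output.
    return not has_punct and (all_lower or all_upper)

print(looks_machine_generated("hello world how are you"))     # True
print(looks_machine_generated("Hello, world! How are you?"))  # False
```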

To validate that the spoken language matched the language of the transcript according to CLD2, OpenAI also employed an audio language detector, created by fine-tuning a prototype model (trained on a prototype version of the dataset) on VoxLingua107. If the two languages did not match, the (audio, transcript) pair was excluded from the speech recognition training examples.

Read More: OpenAI’s DALL-E now offers Outpainting Feature to Extend Existing Images and Artworks

OpenAI selected an encoder-decoder Transformer for the Whisper model’s architecture because it has been proven to scale efficiently. All input audio is divided into 30-second chunks and re-sampled to 16,000 Hz, from which an 80-channel log-magnitude Mel spectrogram is computed over 25-millisecond windows with a stride of 10 milliseconds. The encoder processes this input representation with a ‘small stem’ of two convolution layers with a filter width of 3 and the GELU activation function. Sinusoidal position embeddings are added to the stem’s output, after which the encoder’s Transformer blocks are applied.
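The framing parameters above pin down the input shape. The arithmetic below is derived from the figures in this paragraph (a sketch, not Whisper’s actual preprocessing code):

```python
# Back-of-the-envelope framing math from the stated parameters: 16 kHz audio,
# 30-second chunks, 25 ms windows, 10 ms stride. Constants are derived from
# the article's figures, not taken from Whisper's source code.

SAMPLE_RATE = 16_000          # Hz
CHUNK_SECONDS = 30
WIN_MS, HOP_MS = 25, 10

n_samples = SAMPLE_RATE * CHUNK_SECONDS      # samples per 30 s chunk
win = SAMPLE_RATE * WIN_MS // 1000           # samples per 25 ms window
hop = SAMPLE_RATE * HOP_MS // 1000           # samples per 10 ms hop

# Number of full windows that fit in one chunk without padding:
n_frames = 1 + (n_samples - win) // hop

print(n_samples, win, hop, n_frames)  # 480000 400 160 2998
```

An implementation that pads the final windows would produce a round number of frames, one per 10 ms of audio.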

Source: OpenAI

The transformer uses pre-activation residual blocks, and a final layer normalization is applied to the encoder’s output. The decoder predicts the corresponding text caption using learned position embeddings, tied input-output token representations, and special tokens that instruct the single model to carry out various tasks. These tasks include language identification, phrase-level timestamps, multilingual speech transcription, and to-English speech translation. The encoder and decoder have the same width and number of transformer blocks.
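The multitask interface can be pictured as a short prefix of control tokens that steers the single decoder. The token names below follow those in OpenAI’s open-source release, but treat the exact sequences as an illustrative sketch rather than the definitive format:

```python
# Illustrative sketch of Whisper's multitask control tokens: the same decoder
# performs different tasks depending on the special tokens it is conditioned on.
# Token names follow OpenAI's open-source release; sequences are a sketch.

transcribe_en = ["<|startoftranscript|>", "<|en|>", "<|transcribe|>", "<|notimestamps|>"]
translate_fr = ["<|startoftranscript|>", "<|fr|>", "<|translate|>"]  # French audio -> English text

print(" ".join(transcribe_en))
```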

Whisper aims to provide an integrated, resilient speech processing system that works consistently, without dataset-specific fine-tuning, to deliver high-quality results on particular distributions. OpenAI examined Whisper’s capacity to generalize across domains, tasks, and languages using an extensive collection of existing speech processing datasets. Rather than following the conventional evaluation protocol for these datasets, which includes both a train and a test split, the researchers tested Whisper’s zero-shot performance and found it far more resilient, with an average relative error reduction of 55% when evaluated on other speech recognition datasets. It also outperforms the supervised SOTA on zero-shot CoVoST2 to-English translation, and it can transcribe speech with 50% fewer mistakes than prior models. However, it does not outperform models that specialize in LibriSpeech, a competitive benchmark in speech recognition, because it was trained on a broad and diverse dataset rather than being tailored to any particular one.

OpenAI claims that about a third of Whisper’s dataset is non-English. Even so, there is a strong possibility of data imbalance, since the amount of transcribed data available differs by language because of the disproportionate distribution of speakers across languages. As a result, languages over-represented in the training dataset could have a greater effect on a multilingual automatic speech recognition system. Given that the majority of languages have fewer than 1,000 hours of training data, OpenAI hopes to increase the amount of data for these rarer languages, expecting a significant improvement in average speech recognition performance from only a modest increase in the size of the training dataset.

The OpenAI team speculates that optimizing Whisper models more directly for decoding performance via reinforcement learning, and fine-tuning them on high-quality supervised datasets, could also reduce long-form transcription errors. The researchers examined only Whisper’s zero-shot transfer performance in this work, since they were primarily interested in the robustness properties of speech processing systems. Although this setting is essential for research because it reflects overall dependability, OpenAI believes the results could be improved further in the many areas where high-quality supervised speech data are available.

For now, Whisper’s multilingual capabilities could be a huge asset in international trade, healthcare, education, and diplomacy. The company released Whisper and its inference code as open-source software so that they can be used to build practical applications and to support further research on effective speech processing.


NVIDIA announces next-generation automotive-grade chip Drive Thor


Nvidia is gearing up to launch Drive Thor, its next-generation automotive-grade chip that the company says will be able to bring together a wide range of in-car technologies, from driver monitoring systems and automated driving features to streaming Netflix in the back for kids.

Thor, which goes into production in 2025, is notable not just because it is a step up from the Drive Orin chip; it is also taking Drive Atlan’s spot in the lineup.

Founder and CEO Jensen Huang said on Tuesday at the company’s GTC event that Nvidia is scrapping the Drive Atlan system on chip ahead of schedule for Thor. In a race to develop bigger and better chips, Nvidia is going with Thor, which, according to the company, will deliver twice the compute and throughput at 2,000 teraflops of performance.

Read More: NVIDIA Unveils New GeForce Series Of Graphics Chip That Uses Enhanced AI

Nvidia’s vice president of automotive, Danny Shapiro, said that in a car today, advanced driver-assistance systems, driver monitoring, camera mirrors, the digital instrument cluster, parking, and infotainment all run on different computers distributed throughout the vehicle. He added that in 2025 these functions will no longer be separate computers; rather, Drive Thor will enable manufacturers to efficiently consolidate them into a single system, reducing overall system cost.

Nvidia already has several automotive customers building software-defined fleets using Drive chips. For example, Volvo announced in January at the annual CES tech conference that Drive Orin would power its new automated driving features. 

Volvo also said it would power its infotainment system with Qualcomm’s Snapdragon chip. It is precisely this space-sharing with competitors that likely drove Nvidia to create a more powerful chip.


NVIDIA launches GeForce RTX 4080 and 4090 desktop GPUs


Nvidia has revealed its brand-new RTX 40 series GPUs at GTC 2022, two years after the RTX 30 series. The Nvidia GeForce RTX 40 series features RTX 4080 and RTX 4090 desktop GPUs based on Nvidia’s new Ada Lovelace architecture. 

The new graphics cards will deliver a massive leap in performance, a more immersive gaming experience, AI-powered graphics, and fast content creation workflow. The Nvidia GeForce RTX 4080 and 4090 GPUs will significantly improve the gameplay experience, thanks to improved DLSS support and powerful hardware.

The Nvidia GeForce RTX 4090 GPU comes with 24GB of GDDR6X memory, and the company claims it is two to four times faster than its previous-gen flagship, the RTX 3090 Ti. Nvidia also says the RTX 4090 can deliver up to 100 FPS in 4K games while consuming 450 W of power, the same as the RTX 3090 Ti. The GPU will support Nvidia’s new deep-learning super-sampling technique, DLSS 3, which will improve performance even further.

Read More: Nvidia Announces Omniverse Cloud For Metaverse At GTC 2022

The Nvidia GeForce RTX 4080 will be launched in two configurations. The first one features 16GB GDDR6X memory, 9,728 CUDA cores, and 76 RT cores. The second comes with 12GB GDDR6X memory, 7,680 CUDA cores, and 60 RT cores. 

The 12GB variant also has slower memory, with 21 Gbps throughput over a 192-bit bus, compared to 22 Gbps over a 256-bit bus on the 16GB variant. Nvidia says the 16GB RTX 4080 is up to three times faster than the RTX 3080 Ti when using DLSS 3. Meanwhile, the 12GB RTX 4080 is faster than the RTX 3090 Ti and consumes less power when using DLSS 3.
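Those throughput figures imply peak memory bandwidths that are easy to check. The helper below is a back-of-the-envelope sketch using the standard formula (per-pin data rate × bus width ÷ 8 bits per byte); the inputs are the figures quoted above:

```python
# Hypothetical sanity check of the two RTX 4080 variants' peak memory bandwidth
# from the quoted figures: per-pin rate (Gbps) times bus width (bits) over 8.

def mem_bandwidth_gbs(gbps_per_pin: float, bus_bits: int) -> float:
    """Peak memory bandwidth in GB/s."""
    return gbps_per_pin * bus_bits / 8

print(mem_bandwidth_gbs(21, 192))  # 12GB variant -> 504.0 GB/s
print(mem_bandwidth_gbs(22, 256))  # 16GB variant -> 704.0 GB/s
```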

The Nvidia GeForce RTX 4090 is set to launch on October 12th, and the RTX 4080 will follow in November. After some confusion over Indian pricing, Nvidia has finally shared the official prices of these GPUs for the country: GeForce RTX 4090 – Rs 1,55,000; GeForce RTX 4080 (16GB) – Rs 1,16,000; GeForce RTX 4080 (12GB) – Rs 87,000.


NVIDIA unveils new GeForce series of graphics chip that uses enhanced AI


Nvidia, the leading semiconductor maker in the US, unveiled a new type of graphics chip at the GPU Technology Conference (GTC) 2022 that uses enhanced artificial intelligence to create more realistic images in games. Earlier this year, NVIDIA announced the launch of its new range of RTX professional GPUs during its Spring 2022 GTC.

Codenamed Ada Lovelace, the new architecture underpins Nvidia’s GeForce RTX 40 series of graphics cards, which was unveiled by Chief Executive Officer Jensen Huang at an event on Tuesday. The top-of-the-line RTX 4090 will cost $1,599 and go on sale around October 12. Other versions launching in November will retail for $899 and $1,199.

The high-end version of the chip will consist of 76 billion transistors and be accompanied by 24GB of onboard memory on the RTX 4090, making it one of the most advanced in the industry. Nvidia relies on Taiwan Semiconductor Manufacturing Co. to produce the processor with its 4N technology, while Micron Technology supplies the memory. Nvidia had used Samsung Electronics to make Ada’s predecessor.

Read More: Nvidia Announces Omniverse Cloud For Metaverse At GTC 2022

The new technology promises to speed up the rate at which cards produce images by combining the traditional procedure of calculating where pixels sit on the screen with AI that simulates other pixels simultaneously. It continues a shift Nvidia is pioneering that makes computer images appear more natural by generating them from calculations of the paths of individual light rays.

The approach could give customers a new reason to upgrade their technology — which Nvidia could use now. The chipmaker is facing a steep slowdown in demand for PC components. Last month, Nvidia reported much lower quarterly sales than it originally predicted and gave a disappointing forecast.

Nvidia has been forced to deliberately slow down shipments to make sure its customers — primarily makers of graphics cards sold as add-ins for high-end computers — work through their stockpiles of unused inventory. Huang has said that process should be completed by the end of the year.


Visual Search Is One of the Biggest Trends – Know its Importance

Image Credit: Canva

Everything is temporary except evolution. In the last few decades, however, the world has evolved at an unprecedented pace. There was a time when only written text was used for communication and information; now pictures are replacing words as a way for people to communicate. This shift has forced data scientists and IT experts to adapt technologies to the popularity of images and the needs of users.

One of the most valuable techniques developed was the visual search method. Although it was introduced a few years ago, it gained popularity only recently, and it has now become a common way to search for information.

Most people rely on picture search to find images on the web because, in most cases, it is more accurate and efficient than word-based searches. The primary reason is that it ranks results by relevance rather than by SEO algorithms.

This article is written to help those still using old methods of digging for information. It can help internet users know the importance of this extraordinary technology and why they should prefer it over text inputs.

The image search method gained massive popularity because of the benefits it provides. As a result, reverse picture search has become an essential tool in many circumstances. The factors below explain why it is so vital these days.

Provide Better Results

As mentioned above, text-based queries sometimes fail to help people find what they are looking for, so they must try multiple queries to reach the most accurate and relevant results. The reverse photo lookup technique, by contrast, is much better than text input.

Most image search engines use machine learning and artificial intelligence to analyze visual inputs, providing exactly the results people need. These technologies make the search results far more relevant than other input types. Because of this precision, many people avoid text and audio queries unless they have no other option.

Find Out Who Is Using Your Pictures

Besides its many other benefits, reverse image lookup can also help people discover who is using their images illegally or unethically. Once you upload a picture to a social media platform or the web, you no longer have any control over it. Anyone can save it to their device and use it for their own benefit.

Keeping a check on your pictures is therefore essential. There is a chance that scammers could use them to harm your reputation or for propaganda purposes. Visual search can help you find where else your images have been uploaded on the web, and you can also learn for what purpose they are being used. Once you know, you can protect your reputation before it is damaged or take legal action against those using your pictorial data without consent.

Save People from Scams

Scammers have found many ways to loot or trap people in this modern age, and catfishing is the most common among them. In this scam, fraudsters use other people’s pictures to create social media profiles, contact their friends and family members, and ask for money. Every year, many people lose money to this type of fraud. Unfortunately, it is hard to tell whether a catfisher or the real person is talking to you from the other end.

Similarly, some eCommerce stores use images of other companies’ products to sell low-quality or fake goods. People should use photo search engines to avoid this issue: these engines show the web pages and social media profiles where the same images have been uploaded. By analyzing those images and their upload dates, you can quickly tell whether a fraudster is targeting you.

Acquire Information about Different Objects

Sometimes we receive different images on WhatsApp or other platforms with unknown objects. Those objects could be anything like places, people, animals, birds, herbs, etc. Getting information about those images could be tricky using the text as an input query. That’s where image search comes in handy.

With the help of a visual search, you can quickly find out every piece of information regarding those unknown objects. This method allows people to discover other similar objects as well.

Detect Fake News

Social media and other sources of information are full of fake news. The propagandists frequently use fake news to spread misinformation for multiple purposes. For example, they can use it to malign their competitors, gain sympathies, or lead people to make unlawful decisions. Avoiding fake news is essential for people, but many don’t know how to differentiate between fake and real news. However, reverse image search engines and techniques can assist people in this case.

To check the authenticity of a news item, people should save the supporting pictures and run them through a photo search tool, which will show the web pages containing the same picture. By checking those sources, you can identify whether the original source of the news is credible.

Locate Royalty-Free Images

Images are sometimes essential for social media posts, blogs, and eCommerce stores. However, finding or creating relevant pictures is difficult for many people as copyright laws protect some pictures. Therefore, if you use them, you may have to face severe fines. That’s why avoiding them and finding royalty-free images is the only option you may have.

The reverse image search can help people find royalty-free images they can use on social media and blogging websites to support their arguments or any other purpose. In that case, a single image input is better than multiple text queries. 

To Conclude

Due to the massive increase in the use of images across the web, the need for visual search methods has grown. To meet this need, many reverse image search engines have been developed, and they now provide users with a range of benefits.


Salesforce Announces Genie, a Real-time Data Integration Platform


Salesforce, a global cloud solutions provider, has announced Genie at the Dreamforce customer conference, a real-time data integration platform using which enterprises can deliver seamless services across sales, marketing, and commerce. 

David Schmaier, Chief Product Officer and President at Salesforce, said the company built Genie to automate every service provided by Customer 360, Salesforce’s customer relationship management (CRM) platform. Salesforce Genie forms the core of real-time Customer 360; it collects, stores, and integrates real-time data streams with Salesforce transactional data.

Genie underpins the Salesforce platform by smoothing data movement wherever it is required. Patrick Stokes, GM and EVP of Salesforce, said, “So we’re announcing that our Customer 360 applications now have access to an entirely new way of bringing data into Salesforce in real time at scale that we’ve never been able to achieve before.”

Read More: NVIDIA announces Omniverse Cloud for metaverse at GTC 2022

Stokes highlighted that Genie is a lakehouse architecture and a modern equivalent of the company’s previous attempts to integrate transactional data in the CRM database. However, Genie is more than just an integration layer added to the platform. 

Genie offers Sales Cloud, Service Cloud, Marketing Cloud, and Commerce Cloud services separately. It also features services for Tableau, MuleSoft, and Slack. A part of its ability to offer such capabilities is developed on Salesforce’s cloud infrastructure, Hyperforce, which offers data security, privacy, and regulatory compliance controls. This ensures customers’ trust and reliance on the platform. 


NVIDIA Releases Maxine to Deliver Breakthrough Audio and Video Quality at Scale


NVIDIA releases Maxine, a suite of GPU-driven software development kits (SDKs) to deliver breakthrough audio and video quality. Maxine enables clear communications via its cloud-native microservices for augmented-reality effects and audio-video enhancement. 

With the early-access release of Maxine’s audio effects, the company said that Maxine would be re-architected for cloud-native microservices. Additionally, new SDK capabilities, including Speaker Focus and Face Expression Estimation, were announced, along with the availability of Eye Contact to all users. Updated versions of existing SDK functionalities are also included in NVIDIA Maxine.

Maxine provides three updated GPU-accelerated SDKs for audio, video, and AR effects that revolutionize real-time communications with AI. A new feature called Speaker Focus isolates the audio tracks of foreground and background speakers to make each voice more audible. Additionally, the Audio Super Resolution SDK function has received a quality upgrade.

Read More: New NVIDIA DGX System Software and Infrastructure Solutions Supercharge Enterprise AI

The video effects SDK uses a regular webcam to produce AI-based video effects. Enhancements to temporal stability have been made to the Virtual Background function, which divides a person’s profile into sections and uses AI-powered background removal, replacement, or blur.

Additionally, the AR SDK offers typical web camera feed-based, real-time 3D face tracking and body pose estimation driven by AI.

Other cloud-native microservices offered by Maxine will enable developers to create real-time AI applications. These services may be autonomously managed and deployed on the cloud, speeding up implementation time. Some of these microservices are:

  • Background Noise Removal
  • Room Echo Removal
  • Audio Super Resolution
  • Acoustic Echo Cancellation

Maxine is a part of the NVIDIA Omniverse Avatar Cloud Engine, a set of cloud-based AI models and services that developers may use to create, personalize, and use interactive avatars. You can refer to the GTC keynote for more information. 


New NVIDIA DGX System Software and Infrastructure Solutions Supercharge Enterprise AI


During the GTC event, NVIDIA announced its new DGX system software and infrastructure to power innovation in enterprise AI development. The company announced that NVIDIA DGX H100 systems are now available for order. Based on the latest GPU chips, these systems will form the building blocks for NVIDIA’s full-stack AI solutions. 

The company launched the new NVIDIA Base Command software to simplify and accelerate AI developments by powering the DGX systems. The software will enable enterprises to tap the potential of their investment in NVIDIA’s DGX systems for orchestration and network infrastructure.

NVIDIA unveiled the DGX BasePOD to make AI deployments simpler and faster. The BasePOD provides an architectural framework for all DGX computing, storage, network, and software systems. 

Read More: Harvard and Stanford developed self-supervised AI to detect disease using NLP-based reports

The company has also created an advanced version of the BasePOD, the NVIDIA DGX SuperPOD. The DGX SuperPOD is a comprehensive hardware, software, and services package that removes the guesswork from developing and deploying AI infrastructure in any enterprise, making it the fastest route to AI innovation.

The GTC event also unveiled the NVIDIA Partner Network, a network of fully integrated and readily deployable offerings provided to valued partners. The program is intended for business models, including value-added reselling, solutions integration, system design or manufacture, hosting services, consultancy, or NVIDIA products and solutions.


NVIDIA Ramps up the Hopper Architecture and Pushes H100 Chips to Production


NVIDIA is driving more and more architectural decisions and modifications in its CPU and GPU accelerator engines with each new generation. Jensen Huang, CEO of NVIDIA, announced that the company would ramp up Hopper, an architecture supporting AI workloads. The Hopper architecture is intended to scale diverse workloads for data centers. 

NVIDIA unveiled Hopper in March, along with other advancements like the NVIDIA Grace CPU. This month, the company released benchmark results for the chip in the MLPerf suite of machine learning tasks.

Hopper is built with approximately 80 billion transistors using the cutting-edge TSMC 4N process and features multiple innovations that enhance the performance of NVIDIA H100 Tensor Core GPUs.

Read More: NeMo LLM Service: NVIDIA’s cloud service to make AI less complicated

The company has pushed the H100 Tensor Core GPUs into full-volume production. The GPU chips will be shipped to companies including Hewlett Packard, Dell, and Cisco Systems. NVIDIA systems with the H100 GPU will enter the market in the first quarter of next year.

When the company launched the first H100 GPU chip, Huang said the chips would be “the next generation of accelerated computing.” The H100 chip is designed to handle artificial intelligence tasks for data centers, and the company claims it “dramatically” reduces deployment costs for AI-based programs. For instance, the performance of 320 top-of-the-line A100 GPUs can be matched by only 64 H100s.
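That claim is a simple ratio worth making explicit:

```python
# The consolidation claim above, made explicit: Nvidia says 64 H100s match
# 320 A100s for this workload -- a 5x reduction in GPU count.
a100_count, h100_count = 320, 64
consolidation = a100_count / h100_count
print(consolidation)  # 5.0
```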
