
Optimus Gen-2, Second Generation Humanoid Robot Unveiled by Tesla

Tesla Optimus
Source: X

Elon Musk, CEO of Tesla, unveiled the second generation of the company’s humanoid robot, Optimus, in a video posted on X. Tesla’s accompanying statement highlighted that Optimus Gen-2 boasts a 30% increase in speed compared to the May prototype and weighs 10 kg less without any compromises. 

Tesla also asserted that Optimus had undergone several technical enhancements, including refined torque sensing, articulated toe sections, and improved human geometry. 

In the later segment of the video, Optimus Gen-2 demonstrates its capabilities by performing squats in a gym, attributed by Tesla to the humanoid’s enhanced balance and full-body control. 

Read More: Is Grok the First-Ever Politically Incorrect AI Chatbot?

Additionally, the footage shows Optimus Gen-2 delicately transferring eggs from a carton to an egg boiler, made feasible by the humanoid’s new hands equipped with “tactile sensing on all fingers.” 

Elon Musk first unveiled two Optimus prototypes back in 2022, during Tesla’s AI Day. He was confident that the robots would serve humanity well and likely reshape socio-economic structures. Now that a more advanced version of Optimus has been unveiled, it remains to be seen whether his predictions come true. 


Microsoft Unveils Phi-2, a Small Language Model Which Can Operate on Mobile Devices

Microsoft Phi-2

Microsoft has launched Phi-2, a small language model (SLM) specialized in text-to-text tasks. Microsoft’s official account on X states that the model is compact enough to operate seamlessly on laptops or mobile devices. 

Phi-2, equipped with 2.7 billion parameters (connections between artificial neurons), showcases performance akin to significantly larger models such as Meta’s Llama 2-7B and Mistral AI’s Mistral-7B, each of which contains 7 billion parameters. 

Figure 1

In Microsoft’s official blog post, Phi-2 is framed as part of an effort to build smaller language models that match the capabilities of larger ones. The researchers’ key strategies include prioritizing high-quality, textbook-quality training data and synthetic datasets for common-sense reasoning and general knowledge. They also enrich the dataset with meticulously selected web content chosen for its educational value. 

Read More: Ranjani Mani is Microsoft’s New AI Director

Moreover, Microsoft leveraged knowledge transfer from Phi-1.5, a 1.3-billion-parameter model, embedding its knowledge into the 2.7-billion-parameter Phi-2. This technique not only expedites training but also substantially elevates Phi-2’s benchmark performance. 

On the technical side, Phi-2 is trained with a next-word prediction objective on a massive 1.4 trillion tokens sourced from web datasets focused on NLP and coding. Training took 14 days on 96 A100 GPUs. 
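Concretely, a next-word prediction objective scores the model’s probability for the token that actually comes next at each position and averages the negative log-likelihood. Below is a minimal NumPy sketch of that loss with toy numbers; it is illustrative only, not Microsoft’s training code:

```python
import numpy as np

def next_token_loss(logits, target_ids):
    """Average cross-entropy of predicting each next token.

    logits: (seq_len, vocab_size) raw scores for the next token at each position.
    target_ids: (seq_len,) the token id that actually came next.
    """
    # Numerically stable softmax over the vocabulary.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=-1, keepdims=True)
    # Negative log-likelihood of the true next token at each position.
    nll = -np.log(probs[np.arange(len(target_ids)), target_ids])
    return nll.mean()

# Toy example: a 4-token vocabulary and a 3-position sequence.
logits = np.array([[2.0, 0.1, 0.1, 0.1],
                   [0.1, 2.0, 0.1, 0.1],
                   [0.1, 0.1, 2.0, 0.1]])
targets = np.array([0, 1, 2])
loss = next_token_loss(logits, targets)  # low loss: the right token gets the top score
```

Training then amounts to adjusting the model’s parameters to push this loss down, position by position, across all 1.4 trillion tokens.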

Fig 2: Safety scores computed on 13 demographics from ToxiGen. A subset of 6,541 sentences is selected and scored between 0 and 1 based on scaled perplexity and sentence toxicity. A higher score indicates the model is less likely to produce toxic sentences compared to benign ones. (Source: Microsoft) 

This model, notably a base version, did not undergo alignment through reinforcement learning from human feedback or instruction fine-tuning. Despite the absence of this additional refinement, researchers observed Phi-2 displaying better behavior on toxicity and bias than existing open-source models that did undergo alignment. (See Figure 2)


Mistral AI’s New LLM Outperforms OpenAI’s GPT-3.5

Mistral Mixtral 8x7B
Source: MISTRAL AI

Mistral released its latest model, Mixtral 8x7B, last week. The model is named after its “mixture of experts” architecture, which combines several specialized expert networks, each focusing on different categories of tasks. 
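In broad strokes, a mixture-of-experts layer scores every expert for an input, runs only the top-scoring experts, and blends their outputs using the router’s weights. Here is a minimal, hypothetical NumPy sketch of top-2 routing; the shapes and the simple linear “experts” are illustrative assumptions, not Mistral’s actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_layer(x, experts, gate_weights, top_k=2):
    """Route a token vector to its top_k experts and mix their outputs.

    x: (d,) token representation.
    experts: list of (d, d) matrices, each acting as one simple linear "expert".
    gate_weights: (num_experts, d) router that scores every expert for x.
    """
    scores = gate_weights @ x                       # one routing score per expert
    top = np.argsort(scores)[-top_k:]               # indices of the best-scoring experts
    gate = np.exp(scores[top] - scores[top].max())  # softmax over the chosen experts only
    gate /= gate.sum()
    # Weighted sum of the selected experts' outputs; unselected experts never run,
    # which is why only a fraction of the parameters is active per token.
    return sum(g * (experts[i] @ x) for g, i in zip(gate, top))

d, num_experts = 8, 8
experts = [rng.normal(size=(d, d)) for _ in range(num_experts)]
gates = rng.normal(size=(num_experts, d))
y = moe_layer(rng.normal(size=d), experts, gates)  # output keeps the input's shape
```

The sparsity is the point: the model can hold many experts’ worth of parameters while paying only the compute of the few experts chosen per token.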

Surprisingly, Mistral made it available online as a torrent link, without any accompanying explanation, blog post, or demo video showcasing its capabilities. 

Mistral later published a blog post that delved deeper into the model, showcasing benchmarks in which Mixtral 8x7B matched or even surpassed the performance of OpenAI’s GPT-3.5 and Meta’s Llama 2.

Read More: Datasaur Launches LLM LAB Through Which Enterprises can Create their Own Generative AI Application

Mistral acknowledged collaboration with CoreWeave and Scaleway for technical support during training, and also confirmed that the Mixtral 8x7B model is open for commercial use under the Apache 2.0 license.

Ethan Mollick, an AI influencer and professor at the University of Pennsylvania Wharton School of Business, pointed out on X that Mixtral 8x7B appears to lack “safety guardrails.” This means users who are dissatisfied with OpenAI’s stricter content policies have access to a model with similar performance that can generate content considered unsafe. On the flip side, this absence of safety measures could pose a challenge for policymakers and regulators.


OpenAI is Aware of ChatGPT’s Laziness

ChatGPT lazy OpenAI

OpenAI is currently investigating reports regarding user dissatisfaction with the latest iteration of ChatGPT, based on the GPT-4 model. Users have voiced concerns that the chatbot appears uncooperative and lazy in addressing their queries. 

For example, when asked for a piece of code, ChatGPT might provide minimal information and prompt users to complete the task themselves. Some users found its responses to be rather sassy, with ChatGPT implying that users are fully capable of completing the task on their own. 

As user complaints kept rising across social media platforms, OpenAI responded on X (formerly Twitter) through the official ChatGPT account, saying, “We’ve heard all your feedback about GPT4 getting lazier! we haven’t updated the model since Nov 11th, and this certainly isn’t intentional. Model behavior can be unpredictable, and we’re looking into fixing it.” 

Read More: OpenAI Establishes New Team to Combat “Catastrophic Risks” Associated with AI

The response didn’t convince everyone. One user on X replied in the same thread where ChatGPT posted its “awareness of the issue” tweet, raising an important question: even if the model hasn’t been updated, how can it change or get lazy when it’s just a file? 

Although OpenAI did not pinpoint the reason for ChatGPT’s laziness, they did reply to the user, saying, “To be clear, the idea is not that the model has somehow changed itself since Nov 11th. It’s just that differences in model behavior can be subtle — only a subset of prompts may be degraded, and it may take a long time for customers and employees to notice and fix these patterns.” 

However, user complaints about ChatGPT’s declining accuracy aren’t new. Reports of the model becoming less capable and less accurate date back as far as six months, claims that OpenAI refuted at the time, unlike now. 

What happens next remains to be seen. As the AI race heats up, with highly capable LLMs like Grok and the Gemini-powered Bard catching up to ChatGPT, even a slight lapse in focus could cost OpenAI a lot of users, given the abundance of alternatives.


ISRO Releases its Datasets that can be Freely Accessed by All

ISRO free datasets
Source: Study IQ

India’s space research organization, ISRO, has made its datasets freely accessible to all. ISRO is granting access to an extensive archive of remote sensing data sourced from 44 satellites, encompassing both Indian and foreign remote sensors, accumulated since 1986. Additionally, it facilitates the regional distribution of Sentinel and Landsat data within India. 

Out of the datasets released by ISRO, one of them is Cartosat DEM. This dataset gives you free access to a detailed elevation model derived from Cartosat-1 satellite data. It provides 30-meter resolution coverage for all of India’s land. You can use this data for many things, like making maps, studying water systems, and handling disasters better.  

Another dataset released for free access is Resourcesat-2, which offers orthorectified satellite images. The images have a 23-meter resolution and cover the entire land area of India. Captured by the satellite’s high-resolution LISS-3 and LISS-4 cameras, they are incredibly useful for tasks like land-use mapping, infrastructure planning, environmental monitoring, agricultural monitoring, and disaster management. 

Read More: AI Played Key Role in Historic Moon Landing of Chandrayaan-3, says ISRO 

ISRO also released Oceansat datasets. The Ocean Color Monitor (OCM) and Scatterometer are instruments aboard the Oceansat series of satellites. OCM provides valuable details on chlorophyll concentration, which matters greatly for understanding phytoplankton growth. In simple terms, it helps us understand how healthy and productive our oceans are. With this data, you can study marine life and fisheries, spot harmful algal blooms, and even grasp how climate change affects our oceans. 

A scatterometer, on the other hand, measures wind speed and direction over the ocean surface. It’s key for predicting weather, understanding how our climate works, ensuring ships take the safest routes, assessing the potential for offshore wind energy, and even tracking how the ocean circulates. 

Lastly, ISRO has also released the IRS-1A, IRS-1C, and IRS-1D datasets, which support environmental studies such as agricultural and crop monitoring. 


Audio Generation Tool, Audiobox, Introduced by Meta

Meta Audiobox

Meta introduced Audiobox as its latest foundational research model for audio generation. Within this family of models are specialized versions such as Audiobox Speech and Audiobox Sound. 

These models enable the creation of voices and sound effects by amalgamating voice inputs with natural language prompts, catering to diverse audio needs. Underlying these variants is the Audiobox SSL, a self-supervised model forming the common foundation for all Audiobox iterations. 

Audiobox further permits users to merge an audio voice input alongside a textual style prompt, facilitating the synthesis of speech in various environments or emotional tones, such as speaking in a cathedral or expressing sadness at a slower pace. 

Read More: Google’s NotebookLM Helps You Take Online Notes 

The inclusion of text and voice inputs significantly amplifies Audiobox’s controllability in contrast to other Meta inventions like Voicebox. Audiobox empowers users to utilize text description prompts to specify and manipulate sound effects, expanding the range of controllable features. When combined, the voice input establishes the fundamental timbre, while the text prompt becomes a tool for altering other attributes. 

Audiobox inherits Voicebox’s guided audio generation training objective and flow-matching modeling method, enabling audio infilling. This capability permits users to refine sound effects, such as incorporating diverse thunder sounds into a rain soundscape, enhancing the model’s versatility. 


Google’s NotebookLM Helps You Take Online Notes

NotebookLM Google

Google has unveiled NotebookLM, its experimental AI-powered online note-taking application, previously accessible only to select users via a waitlist. It is now open to all users across the United States on an opt-in basis and is offered free of charge. 

NotebookLM, a collaboration with author Steven Johnson, enables users to consolidate multiple documents from Google Drive into a unified digital notepad. Users can then ask Google’s AI questions, which it answers using the uploaded documents. 

The tool has also received some recent updates, one of which integrates the notebook with Gemini, Google’s most advanced AI model. Beyond this, NotebookLM has undergone various other enhancements, allowing it to analyze and reference up to 20 documents simultaneously, with a capacity of 200,000 words per document. 

Read More: Google to Protect its Generative AI Users against Copyright Lawsuits

NotebookLM faces one significant limitation: it lacks the capability to analyze or explore web links, even when included by the user in their notes. This omission appears substantial, particularly for a company like Google, known for its extensive crawling and indexing of the entire web. 

Rather than allowing direct analysis of web content, users must manually save and upload webpage PDFs or copy and paste the text into a Google Doc within their Google Drive for NotebookLM to access and reference that information.


Air Space Intelligence, an AI Startup, Receives Funding worth $34 Million

Air Space Intelligence AI Funding
Source: Andreessen Horowitz

Air Space Intelligence Inc., an AI startup focusing on air travel, secured $34 million in funding from Andreessen Horowitz to ramp up its engagement with the US Department of Defense. 

Recognized for its technology likened to “Waze for air travel,” the company has historically provided its tools to commercial carriers like Alaska Airlines. 

Their primary offering, Flyways, assists flight dispatchers in selecting optimal routes for aircraft, considering variables such as air traffic, weather conditions, and airport statuses. 

Read More: Cloud-Only Approach is the New Philosophy of Air India

With the recent funding injection, Air Space intends to bolster its workforce, potentially doubling its current staff to 160 people, with a focus on expanding its Washington presence. Despite its San Francisco roots, the company relocated to DC last year and anticipates significant growth for the office there. 

The recent funding round has reportedly lifted Air Space’s valuation to around $300 million, according to sources familiar with the matter who requested anonymity given the confidential nature of the details. The startup itself has not commented on its current valuation. 


Is Grok the First-Ever Politically Incorrect AI Chatbot?

Grok AI X
Source: Cloudbooklet

Grok, an AI bot similar to OpenAI’s ChatGPT, was launched by xAI, Elon Musk’s AI startup, on X (previously known as Twitter). 

At its core, Grok relies on a generative model named Grok-1. This model was trained using data from the web (up to Q3 2023) and insights gathered from human assistants. Distinguishing itself from other chatbots, Grok can integrate real-time information from X’s posts into its responses, potentially providing the latest updates when answering questions. 

Grok operates conversationally, leveraging a knowledge base akin to ChatGPT and Google Bard. It resides in the X side menu across web, iOS, and Android platforms and can be easily accessed by adding it to the bottom menu on X’s mobile apps for faster use. 

Read More: OpenAI GPT Store Opening Delayed Till Next Year

Elon Musk has said that Grok possesses a witty and rebellious nature, being more open to addressing controversial questions than other AI systems. When asked to be less formal, Grok used explicit language, a stark departure from the politically correct responses of Bard or ChatGPT.

Embracing its edgy persona, X introduced a roast feature on Grok’s home screen, where Grok humorously critiques a user based on their recent X post history. Judging by the latest user reactions, Grok’s responses are more nuanced than those of other existing chatbots. 

Grok is initially available to U.S. users subscribed to X Premium Plus. The plan costs $16 per month for an ad-free social network experience. Priority access is granted to longstanding subscribers. 


The EU Reached an Agreement on the AI Act for the Responsible Use of AI

AI Act European Union
Source: CAIML

European Union policymakers reached an agreement on the AI Act, a comprehensive law aimed at regulating Artificial Intelligence (AI). This groundbreaking legislation serves as a global standard, balancing the potential benefits of AI with efforts to mitigate associated risks such as job automation, online misinformation, and threats to national security. 

European policymakers focus on regulating the riskiest applications of AI in both corporate and government realms, particularly in law enforcement and essential services like water and energy. 

The AI Act introduces new transparency requirements for creators of major general-purpose AI systems. Additionally, the guidelines specify that any content created by AI, such as deepfakes, must be clearly labeled as AI-generated. 

Read More: What is the AI Action Plan Established by New York Officials to Promote Responsible Use of AI?

The law also imposes restrictions on the use of facial recognition technology (a pressing concern for the EU) by law enforcement and governments, except in specific safety and national security scenarios. Companies found violating these regulations could face fines of up to 7 percent of their global sales. 

After three days of negotiations in Brussels, which included a 22-hour session starting Wednesday afternoon and stretching out into Thursday, the final agreement was not immediately disclosed. Further discussions were anticipated to finalize technical details, potentially causing delays in the ultimate approval process. 

The legislation requires votes in both the Parliament and the European Council, which represents the 27 countries in the union. Regulating the responsible use of AI has been a longstanding goal of the EU, which first announced its decision to draft such a policy back in August 2022. 
