AI Innovations in June 2024.
Welcome to our blog series “AI Innovations in June 2024”! This is an evolving article that will be updated daily throughout the month of June 2024 to bring you the latest news and developments in the world of artificial intelligence. As we move further into the 21st century, the pace of AI innovation continues to accelerate at a breathtaking rate. Each day, researchers, engineers, and visionaries are pushing the boundaries of what’s possible, unlocking new capabilities and applications that are transforming industries, enhancing our lives, and shaping the future. In this blog, we’ll dive deep into the most exciting AI breakthroughs, advancements, and milestones happening in June 2024. From groundbreaking AI-powered technologies and cutting-edge research to the societal and ethical implications of these innovations, we’ll provide you with a comprehensive and insightful look at the rapidly evolving world of artificial intelligence. Whether you’re an AI enthusiast, a tech-savvy professional, or simply someone curious about the future, this blog will keep you informed, inspired, and engaged. So, join us on this journey of discovery as we explore the frontiers of AI and uncover the innovations that are shaping our world. Stay tuned for daily updates, and get ready to be amazed by the incredible advancements that are happening in the world of AI!
Experience the transformative capabilities of AI with “Read Aloud For Me – AI Dashboard – AI Tools Catalog – AI Tools Recommender” – your ultimate AI Dashboard and Hub. Seamlessly access a comprehensive suite of top-tier AI tools within a single app, meticulously crafted to enhance your efficiency and streamline your digital interactions. Now available on the web at readaloudforme.com and across popular app platforms including Apple, Google, and Microsoft, “Read Aloud For Me – AI Dashboard” places the future of AI at your fingertips, blending convenience with cutting-edge innovation. Whether for professional endeavors, educational pursuits, or personal enrichment, our app serves as your portal to the forefront of AI technologies. Embrace the future today by downloading our app and revolutionize your engagement with AI tools.
Top AI Statistics 2024 Roundup.
The global AI market is valued at over $196 billion.
AI industry value is projected to increase by over 13x over the next 7 years.
The US AI market is forecast to reach $299.64 billion by 2026.
The AI market is expanding at a CAGR of 38.1% between 2022 to 2030.
By 2025, as many as 97 million people will work in the AI space.
AI market size is expected to grow by at least 120% year-over-year.
83% of companies claim that AI is a top priority in their business plans.
Netflix makes $1 billion annually from automated personalized recommendations.
48% of businesses use some form of AI to utilize big data effectively.
38% of medical providers use computers as part of their diagnosis.
A Daily chronicle of AI Innovations June 28th 2024:
Google releases Gemma 2, a set of lightweight but powerful open LLMs
OpenAI’s CriticGPT finds GPT-4’s mistakes with GPT-4
Google partners with Moody’s, Thomson Reuters & more for AI data
OpenAI’s critics train smarter AI systems
TIME and OpenAI announced a multi-year content deal
Google is using AI to add 110 new languages to Google Translate
YouTube is in talks with major record labels for an AI music deal
Meta to start testing user-created AI chatbots on Instagram
Character.AI now allows users to talk with AI avatars over calls
OpenAI’s critics train smarter AI systems
OpenAI just published new research exploring how to improve model training using AI ‘critics’ that help human trainers spot errors in AI outputs more effectively.
OpenAI developed a model called CriticGPT to assist human trainers in evaluating and catching errors in code written by AI systems like ChatGPT.
In tests, CriticGPT caught 85% of inserted bugs in AI-generated code, compared to just 50% for human reviewers alone.
Human trainers also preferred CriticGPT’s code critiques over their own 63% of the time on naturally occurring bugs.
OpenAI said that CriticGPT-type models are already being integrated into the company’s training workflow, helping prepare for advancing AI systems.
As OpenAI states in the research paper, both LLMs and AI critics will continue to scale — human intelligence will not. The limitations of our own puny mortal brains look to be no match for evaluating systems that are continuing to soar, meaning AI is quickly becoming its own trainer.
Google releases Gemma 2, a set of lightweight but powerful open LLMs
Google has released Gemma 2, a set of models that punch above their weight class. Available in 9B and 27B parameter sizes, these models are:
Higher performing and more efficient at inference than the first-generation
Have significant safety advancements built in
Optimized to run at incredible speed across a range of hardware and easily integrate with other AI tools
Trained on 13 trillion tokens for 27B, 8 trillion for 9B, and 2 trillion for 2.6B model (en route)
The 27B performs better than Llama3-70B and Nemotron-340B on the LMSYS Chatbot Arena, making it the best in its size class and stronger than some larger models, while the 9B outperforms the likes of Mistral-large and Qwen1.5-110B.
The 27B Gemma 2 model is designed to run inference efficiently at full precision on a single Google Cloud TPU host, NVIDIA A100 80GB Tensor Core GPU, or NVIDIA H100 Tensor Core GPU. Moreover, this is an open weights model line, currently only available to researchers and developers.
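If you want to try the models locally, here is a minimal sketch using the Hugging Face transformers library. It assumes the released instruction-tuned checkpoint google/gemma-2-9b-it (gated behind a license acceptance on the Hub) and the accelerate package for device placement; this is an illustration, not an official setup.

```python
# Minimal sketch of running Gemma 2 locally with Hugging Face transformers, assuming
# the released instruction-tuned checkpoint (license acceptance required on the Hub)
# and the accelerate package for device placement. Illustrative, not an official setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-9b-it"  # the 27B variant targets a single A100/H100 GPU or TPU host
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

inputs = tokenizer("Explain what an open-weights model is in one sentence.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```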
Why does it matter?
These models are built for practical deployment: they come in manageable sizes that are easy to run while delivering best-in-class quality for their scale.
OpenAI’s CriticGPT finds GPT-4’s mistakes with GPT-4
OpenAI trained a model based on GPT-4, called CriticGPT, to catch errors in ChatGPT’s code output. It found that when users get help from CriticGPT to review ChatGPT code, they outperform those without help 60% of the time.
OpenAI aligns GPT-4 models to be more helpful and interactive through Reinforcement Learning from Human Feedback (RLHF). A key part of RLHF is collecting comparisons in which people, called AI trainers, rate different ChatGPT responses against each other.
OpenAI is beginning to integrate CriticGPT-like models into its RLHF labeling pipeline, providing trainers with explicit AI assistance.
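For readers unfamiliar with how such pairwise comparisons feed back into training, here is a minimal sketch of the standard reward-model objective used in RLHF generally (a Bradley-Terry style loss). It is illustrative only and not OpenAI’s internal code.

```python
# Minimal sketch of how pairwise comparisons like these typically become a training
# signal for a reward model: the standard Bradley-Terry style objective used in RLHF.
# Illustrative only; this is not OpenAI's internal code.
import torch
import torch.nn.functional as F

def reward_model_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    # r_chosen / r_rejected: scalar rewards the model assigns to the preferred and
    # non-preferred response in each comparison collected from AI trainers.
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# Two comparisons in which the trainer preferred response A over response B.
r_a = torch.tensor([1.3, 0.2])
r_b = torch.tensor([0.9, -0.5])
print(reward_model_loss(r_a, r_b))  # minimizing this pushes chosen scores above rejected ones
```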
Why does it matter?
With more advances in reasoning and model behavior, AI models’ mistakes can become more subtle for AI trainers to spot. CriticGPT is a step towards addressing this fundamental limitation of RLHF.
Google partners with Moody’s, Thomson Reuters & more for AI data
Google is partnering with reputable third-party services, such as Moody’s, MSCI, Thomson Reuters, and Zoominfo, to ground its AI with real-world data. These four will be available within Vertex AI starting next quarter. They will offer developers qualified data to backstop their model outputs and ensure responses are factually accurate.
Google is also announcing high-fidelity grounding. Available through an experimental preview, it’s designed to help AI systems work better with a given set of specific information.
Why does it matter?
Earlier, Google announced efforts to ground Vertex AI results using web data and a plan to allow companies to ground AI systems in their own internal data.
Now, it is grounding these systems in known factual data from third parties, which could significantly lessen hallucinations and make AI more trustworthy for enterprise customers.
TIME and OpenAI announced a multi-year content deal
OpenAI will gain access to current and historic content from TIME’s extensive archives from the last 101 years to enhance its products. It will also enable TIME to gain access to OpenAI’s technology to develop new products for its audiences.
Google is using AI to add 110 new languages to Google Translate
It is Google’s largest expansion ever, thanks to its PaLM 2 LLM. It includes languages like Cantonese, NKo, and Tamazight, representing more than 614 million speakers and opening up translations for around 8% of the world’s population.
YouTube is in talks with major record labels for an AI music deal
It is offering to pay Universal Music Group (UMG), Sony Music Entertainment, and Warner Records “lump sums of cash” in exchange for legally licensing their songs to train new AI music tools. These will likely be one-off payments, not royalty-based arrangements. Source: https://www.theverge.com/2024/6/27/24187151/youtube-ai-music-deals-licensing-record-labels-sony-umg-warner
Meta to start testing user-created AI chatbots on Instagram
CEO Mark Zuckerberg announced yesterday that Meta will begin to surface AI characters made by creators through Meta AI studio on Instagram, starting in the U.S. These will primarily show up in messaging for now and will be clearly labeled as AI.
Character.AI now allows users to talk with AI avatars over calls
Users can initiate calls with a user-generated AI character directly with a button tap. Users can also switch between calling and texting seamlessly and stop the AI from talking through a “Tap to interrupt” option. The feature currently supports only a few languages.
With a simple prompt, ChatGPT can analyze documents, answer questions, perform calculations, and create a downloadable spreadsheet — all in one conversation!
Log in to ChatGPT and upload your document to the chat (remember to remove any sensitive data before submitting).
Ask questions about it, e.g., “How much is the security deposit?”
Request calculations, like the total first month’s costs, including deposits and fees.
Simply prompt ChatGPT to “create a downloadable budget spreadsheet”, specifying the time frame and desired columns.
Note: Only share documents you’re comfortable with. Be cautious with sensitive information and consider redacting critical details before uploading.
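The steps above use the ChatGPT web interface. If you prefer to script the same workflow, here is a minimal sketch with the OpenAI Python SDK; the model name, file path, and questions are placeholders, and, just as with the web UI, redact sensitive details before sending anything.

```python
# Minimal sketch of scripting the same workflow with the OpenAI Python SDK. The model
# name, file path, and questions are placeholders; redact sensitive details before
# sending any document, just as with the web UI.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

with open("lease.txt") as f:          # pre-redacted document text
    document = f.read()

questions = [
    "How much is the security deposit?",
    "What is the total first month's cost, including deposits and fees?",
    "Create a 12-month budget table as CSV with columns: month, rent, utilities, fees, total.",
]

for q in questions:
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; any chat-capable model works
        messages=[
            {"role": "system", "content": "Answer questions about the document below and show your calculations."},
            {"role": "user", "content": f"Document:\n{document}\n\nQuestion: {q}"},
        ],
    )
    print(q, response.choices[0].message.content, sep="\n", end="\n\n")
```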
Alibaba’s Qwen-72B tops the Hugging Face leaderboard
Hugging Face updates Open LLM Leaderboard
NBC to use AI voice for Olympics recaps!
Sonia’s ‘AI Therapist’ sparks debate
Anthropic extends Claude’s access to government
YouTube wants to pay record labels to use their songs for AI training
YouTube is reportedly negotiating with major music labels like Sony Music Entertainment, Universal Music Group, and Warner Records to pay for licensing their songs for AI training.
Despite these negotiations, the ultimate decision to allow the use of songs for AI training will likely fall to individual artists rather than the labels themselves.
In April 2024, more than 200 artists voiced their opposition to AI using their work, emphasizing the potential threat to creators’ rights and the music ecosystem.
Figma has launched AI-powered design features to speed up workflows and aid teams in creating superior software, positioning it as a product development platform to compete with Adobe.
The new Figma AI features include design creation with text prompts, visual search capabilities, layer renaming, and automating repetitive tasks, and are currently available in a limited beta.
While beta access to Figma AI is free, later availability will require subscriptions to specific plans, and admins must enable AI features, differentiating Figma’s approach from Adobe’s controversial AI terms.
Character.AI announced that users can now make calls to AI characters, supporting multiple languages including English, Spanish, Portuguese, Russian, Korean, Japanese, and Chinese.
The company reported that during testing, over 3 million users made 20 million calls, and noted that AI character calls can help with language practice, mock interviews, or role-playing games.
With a button tap, users can start calls with AI characters, seamlessly switch between calling and texting, and use a “Tap to interrupt” feature to stop the AI from talking.
Amazon is reportedly working on a new consumer-focused chatbot codenamed “Metis,” planned for release around September. Here’s what we know about it:
The chatbot is powered by a new model, Olympus, and can be accessed via a web browser.
It uses a retrieval-augmented generation (RAG) technique to provide up-to-date information and automate tasks (a minimal sketch of the RAG pattern follows this list).
The model conversationally provides text and image-based outputs, suggesting follow-ups to queries. It also shares links to sources and supports image generation.
It uses an infrastructure similar to Amazon’s upcoming voice assistant, Remarkable Alexa.
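As referenced above, here is a minimal, self-contained sketch of the RAG pattern: retrieve the snippet most relevant to a query and hand it to the model alongside the question. It is purely illustrative (toy embeddings, a two-document corpus, a stand-in generate function) and says nothing about how Amazon actually implements Metis.

```python
# Minimal, self-contained sketch of retrieval-augmented generation (RAG).
# Illustrative only: toy character-count embeddings, a two-document corpus, and a
# stand-in `generate` function; this says nothing about Amazon's implementation.
import numpy as np

def embed(text: str) -> np.ndarray:
    # Toy embedding (bag of characters); real systems use a learned embedding model.
    vec = np.zeros(256)
    for ch in text.lower():
        vec[ord(ch) % 256] += 1
    return vec / (np.linalg.norm(vec) + 1e-9)

corpus = [
    "Metis is reportedly accessible via a web browser and suggests follow-up queries.",
    "Remarkable Alexa is Amazon's upcoming voice assistant.",
]
corpus_vecs = np.stack([embed(doc) for doc in corpus])

def rag_answer(question: str, generate) -> str:
    scores = corpus_vecs @ embed(question)       # similarity of each document to the question
    context = corpus[int(scores.argmax())]       # retrieve the best-matching document
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return generate(prompt)                      # any LLM call can stand in here

# Dummy "LLM" that just echoes the grounded prompt, to show the flow end to end.
print(rag_answer("How will users access Metis?", generate=lambda p: p))
```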
Why does it matter?
The AI assistant market is already crowded, making Metis look like a late entrant. Rivals OpenAI and Google launched their assistants back in 2023 and will give Amazon tough competition. If Amazon wishes to win this AI race, Metis must showcase exceptional capabilities.
Figma’s new AI features stir competition with Adobe
Figma announced a range of new features at the 2024 Config conference. Significant ones include a UI redesign, generative AI tools, new icons and toolbar, AI-enhanced asset search, and auto-generated texts in designs.
https://x.com/figma/status/1806016455609286847
For instance, by typing a simple prompt into the textbox, users can create an entire app design mock-up for a restaurant. Figma will connect the design pages and even write suggested content!
Figma has also added a few designer-specific features to allow users to tweak designs in real-time. It features a developer mode with a “ready-for-dev” task list. The upgrade also boasts Figma slides, a Google slides-like tool for building and sharing presentations.
Why does it matter?
Figma’s AI tools will enable newer users on the platform to test out ideas easily. At the same time, tools like “ready-for-dev” will offer quick iterations to users who are more familiar with the platform.
With Adobe having abandoned its plan to acquire Figma, this upgrade positions Figma as a more direct challenger to Adobe.
Source: https://www.figma.com/whats-new/
Alibaba’s Qwen-72B tops the Hugging Face leaderboard
Hugging Face’s latest open large language model leaderboard ranks and evaluates open LLMs on benchmarks like MMLU-Pro, testing them on high-school- and college-level problems.
The platform used 300 NVIDIA H100 GPUs to re-evaluate major open LLMs to obtain updated rankings. Chinese company Alibaba’s Qwen-72B dominated the leaderboard, becoming a top performer overall.
Not just that, the leaderboard was mainly dominated by Chinese companies, highlighting their headway into the open LLM space.
Why does it matter?
The open LLM leaderboard is essential for transparent and comprehensive LLM evaluation. It also proves that a model’s size doesn’t guarantee its superior performance. As Chinese models deliver exceptional performance, will well-known models from the USA, like Meta’s Llama, be replaced in the coming years?
Anthropic is making Claude 3 Haiku and Claude 3 Sonnet available for government use through the AWS marketplace. This will improve citizen services, streamline document reviews, support data-based policymaking, and generate realistic training scenarios.
Opera announces new themes and features for Opera One R2
Major features include AI-based image and voice generation, image understanding capabilities, floating windows, new themes, design elements, and split window tabs.
The AI will offer daily, personalized coverage of the 2024 Paris Olympics on the Peacock app in the voice of legendary sports commentator Al Michaels. According to NBC, the voice AI is trained on data from Michaels’ past appearances.
The feature adds custom graphics, zooms, music, sound effects, transitions, and motion backgrounds to an unedited video. Users can also use Caption’s AI avatars to create videos and fully edit them with the AI edit feature.
Source: https://www.captions.ai/
Sonia’s ‘AI Therapist’ sparks debate
The personalized AI therapist can deal with depression, stress, anxiety, poor sleep, and more by analyzing what users say during therapy sessions. The release has sparked a debate around user data privacy, its susceptibility to bias, and the accuracy of its diagnosis. Source: https://www.soniahealth.com/
Hugging Face updates Open LLM Leaderboard
Hugging Face just introduced a new upgrade to its Open LLM Leaderboard, adding new benchmarks and evaluation methods to help address the recent plateau in LLM performance gains.
The leaderboard now features six new benchmarks designed to be more challenging and less prone to contamination.
Initial rankings show Qwen2-72B-Instruct leading the pack, followed by Meta’s Llama-3-70B-Instruct and Mixtral 8×22b.
A new normalized scoring system adjusts for baseline performance, providing a fairer comparison across different evaluation types (a small worked example follows this list).
The upgrade also introduces a ‘maintainer’s highlight’ category and community voting system to prioritize the most relevant models.
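As referenced above, here is a small worked example of baseline-normalized scoring. The assumption is simply that the random-guessing baseline maps to 0 and a perfect score maps to 100; the exact per-benchmark constants are the leaderboard’s own.

```python
# Worked example of baseline-normalized scoring: raw accuracy is rescaled so the
# random-guessing baseline maps to 0 and a perfect score maps to 100. The exact
# per-benchmark baselines are defined by the leaderboard, not here.
def normalize(raw: float, random_baseline: float, max_score: float = 1.0) -> float:
    return 100 * (raw - random_baseline) / (max_score - random_baseline)

# 60% accuracy on a 4-choice benchmark (random baseline = 25%) is far less impressive
# than 60% on a benchmark whose random baseline is 0%.
print(normalize(0.60, 0.25))  # ~46.7
print(normalize(0.60, 0.00))  # 60.0
```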
As LLMs approach human-level performance on most tasks, finding new ways to evaluate them is becoming more difficult — and more crucial. This revamp helps guide researchers and developers towards more targeted improvements, providing a more nuanced assessment of model capabilities.
A group of developers just discovered a major vulnerability in Rabbit’s R1 AI assistant device, potentially exposing user’s private data and chat responses.
A community-led group called Rabbitude uncovered hardcoded API keys in Rabbit’s codebase, which allowed access to all R1 responses.
The group gained access to the codebase in mid-May, saying the Rabbit team was aware of the issue but failed to take action.
Rabbitude said the vulnerability could allow bad actors to disable all r1 devices, alter voices and responses, and access private messages.
Rabbit acknowledged an ‘alleged data breach’ via a Discord post, but claims no customer data was leaked.
Despite massive hype in the first wave of consumer AI standalone devices, the Rabbit r1 has been nothing short of a disaster so far. Already facing major criticism over the companion’s limited capabilities, this security breach only furthers the skepticism surrounding the early AI hardware market entrants.
A Daily chronicle of AI Innovations June 26th 2024:
Anthropic rolls out Claude’s cutting-edge collaborative features
Google experiments with celebrity-inspired AI Chatbots
OpenAI postpones the launch of ChatGPT voice mode
New AI simulates 500M years of evolution
OpenAI delays ChatGPT voice upgrade
Anthropic rolls out Claude’s cutting-edge collaborative features
Anthropic has introduced new collaboration features for Claude. These features include:
Projects: Projects in Claude allow integration of internal resources like style guides or codebases, enhancing Claude’s ability to deliver tailored assistance across various tasks. Users can set custom instructions for each Project to modify Claude’s tone or perspective for a specific role or industry.
Artifacts: It allows users to generate and edit various content types like code, documents, and graphics within a dedicated window. This benefits developers by offering larger code windows and live previews for easier front-end reviews.
Sharing Features: Claude Team users can share snapshots of their best conversations with Claude in their team’s shared project activity feed.
Additionally, any data or chats shared within Projects will not be used to train Anthropic’s generative models without a user’s explicit consent.
Why does it matter?
By offering advanced AI collaboration tools, Anthropic meets critical enterprise needs for flexibility and team efficiency, where many of its competitors fall short. This approach could broaden its user base, increasing user retention and market penetration.
Source: https://www.anthropic.com/news/projects
Google experiments with celebrity-inspired AI Chatbots
These chatbots will be powered by Google’s Gemini family of LLMs. The company aims to strike partnerships with influencers and celebrities and is also working on a feature that allows people to create their own chatbots by describing their personalities and appearances.
The project is led by Ryan Germick, a longtime executive at Google and a team of ten. These chatbots could be an experiment and may only appear on Google Labs rather than being widely available.
Why does it matter?
Google’s AI chatbots featuring celebrities and influencers enhance user engagement but raise ethical concerns about consent, boundaries, and legal and privacy issues as AI evolves.
Originally planned for late June, the Voice Mode aims to provide a more naturalistic and conversational experience with the AI chatbot, complete with emotional inflection and the ability to handle interruptions.
However, it will now be available only to a small group of users in late July or early August. OpenAI is working on improving content detection and user experience before wider rollout. GPT-4o’s real-time voice and vision capabilities are also expected to roll out to ChatGPT Plus users soon.
Why does it matter?
Delays can give competitors like Anthropic a chance to gain an edge. If rivals manage to launch similar or superior voice capabilities sooner, OpenAI risks losing potential users and market share.
Biotech startup EvolutionaryScale just introduced ESM3, a new AI model capable of designing and simulating new proteins in ways that mimic millions of years of natural evolution.
ESM3 is a frontier language model that can reason over protein sequences, structures, and functions simultaneously.
In tests, ESM3 generated a new fluorescent protein 58% different from other known structures — a leap that typically takes nature 500M years of evolution.
The largest version of ESM3 has 98B parameters, and was trained on approximately 1.8T tokens of biological data.
EvolutionaryScale was created by former Meta researchers, and has already secured $142M in seed funding from tech giants like Amazon and Nvidia.
ESM3 is bringing us closer to a future where we can ‘program’ biology as easily as we write computer code — potentially revolutionizing fields from healthcare to environmental science and unlocking completely new areas of knowledge across fields.
Ex-Meta engineers launch protein-generating AI biology model
EvolutionaryScale, launched by ex-Meta engineers, introduced ESM3, a gen AI model for designing novel proteins. It uses AI to simulate evolutionary processes, aiming to revolutionize industries like medicine and biofuels. It is available via API to selected customers and non-commercial users.
Reddit sets new barriers for AI crawlers to protect its content
Reddit will rate-limit or block unknown bots and crawlers that violate its Public Content Policy or lack an agreement with the platform. These changes aim to deter AI companies from training LLMs on Reddit content without a license; Reddit’s $60 million deal with Google is unaffected.
Toys ‘R’ Us, the iconic toy store chain, collaborated with OpenAI’s Sora to produce a brand film called “The Origin of Toys ‘R’ Us.” The film premiered at the 2024 Cannes Lions International Festival of Creativity.
Mozilla’s Firefox Nightly now lets users pick from leading AI chatbots
It allows users to choose from multiple AI chatbots, such as ChatGPT, Google Gemini, HuggingChat, and Le Chat Mistral. The feature lets users select their preferred AI to perform tasks such as summarizing web content, simplifying text, or conducting quizzes.
Google extends Gemini AI to Gmail to streamline email tasks
Google has integrated its Gemini AI into Gmail for Workspace subscribers, enhancing email functionality. It assists users by automatically drafting emails, summarizing threads, crafting responses, and retrieving information from emails or Google Drive.
A Daily chronicle of AI Innovations June 25th 2024:
Amazon is working on a ChatGPT competitor
Google is building AI chatbots modeled after celebrities, YouTube influencers
Nvidia’s market value plummets drastically
Apple rejected Meta’s AI integration over privacy issues
OpenAI buys a remote collaboration platform
Synthesia 2.0: World’s 1st AI video communication platform
OpenAI is on an acquiring spree, buying Rockset and Multi
Record labels sue AI music startups over copyright infringement
Amazon is working on a ChatGPT competitor
Amazon is developing an AI chatbot, code-named “Metis,” to compete with OpenAI’s ChatGPT, and it could be launched later this year, based on a report from Business Insider.
The upcoming chatbot will be accessed through a web browser and rely on a new foundational model, as mentioned by sources and documents cited by Business Insider.
Amazon’s CEO Andy Jassy is reportedly directly involved with “Metis,” reflecting Amazon’s broader strategy to intensify its AI efforts across various applications and services.
Google is building AI chatbots modeled after celebrities, YouTube influencers
Google is reportedly developing new AI chatbots modeled after celebrities and YouTube influencers, aiming to create specialized interactive experiences.
The initiative will use Google’s Gemini family of large language models and seeks partnerships with influencers and celebrities, allowing users to design their own chatbots by describing traits and appearance.
The project, led by executive Ryan Germick and a team of ten, might initially be available only on Google Labs, keeping the bots experimental and not widely accessible at launch.
Nvidia’s shares dropped nearly 13% in a week, causing a loss of $430 billion from its market value and dropping its market capitalization to $2.91 trillion.
This decline has moved Nvidia from the top position to third place among the world’s most valuable companies, now trailing behind Microsoft and Apple.
Investor enthusiasm for Nvidia’s role in AI is waning after significant stock gains, leading to a 6.7% drop on Monday, the third consecutive day of declines for the company.
Apple rejected Meta’s AI integration over privacy issues
Apple declined a potential AI partnership with Meta concerning the integration of Meta’s large language model into iOS, primarily due to privacy concerns highlighted in a Bloomberg report.
While initial discussions between Apple and Meta took place, they did not advance, with Apple choosing not to pursue formal talks or integration due to Meta’s insufficient privacy practices.
Despite rejecting Meta, Apple has engaged in collaborations with other major AI entities, finalizing a deal with OpenAI to integrate ChatGPT into iOS 18 and discussing potential integrations with Google and Anthropic for diverse AI model options.
OpenAI is advancing its AI technology by integrating it into desktop computing, moving beyond browser-based interactions with the acquisition of startup Multi.
By purchasing Multi, OpenAI may aim to develop a new AI-based operating system, or at least an interaction layer for current systems, as suggested by recent product releases including the ChatGPT desktop app.
OpenAI’s future AI assistant could replace traditional browsers, altering the role of search engines and websites, and revolutionizing how we interact with online content.
Synthesia 2.0: World’s 1st AI video communication platform
Synthesia is launching Synthesia 2.0 – the world’s first AI video communications platform for businesses. It reinvents the entire video production process, allowing companies to create and share AI-generated videos at scale easily.
The key new features and capabilities of Synthesia 2.0 include:
2 Personal AI Avatars: Expressive Avatars shot in a studio and Custom Avatars created using your webcam.
AI Video Assistant: Converts text, documents, or websites into high-quality videos, with options to customize the branding, tone, and length.
Intuitive Video Editing: Editing simplified with “Triggers” that let you control animations and edits from the script.
Translation and Dynamic Video Player: Videos can now be translated into over 120 languages. Synthesia is also building a new video player with interactive features.
AI Safety Focus: Synthesia is pursuing ISO/IEC 42001 certification, the first standard for responsible AI management, to ensure its AI technologies are ethical.
Why does it matter?
By making video creation and distribution more accessible and automated, Synthesia aims to help businesses transition to a “video-first” approach to communicate more effectively, improve employee training, customer service, and marketing efforts, and streamline various business operations.
OpenAI is on an acquiring spree, buying Rockset and Multi
Last week, OpenAI acquired Rockset, a startup that develops tools for real-time data search and analytics. OpenAI said it would integrate Rockset’s technology to power its infrastructure and offerings across products.
This week, OpenAI acquired Multi, a startup focused on building remote collaboration tools and software. Technically, the deal is an acqui-hire as the entire Multi team, including its co-founders, will join OpenAI to work on the company’s ChatGPT desktop application.
Why does it matter?
With over 600,000 paid business users, including most Fortune 500 companies, OpenAI’s back-to-back acquisitions strategically position it to expand its enterprise-focused AI products and make its powerful language models more useful for businesses.
Record labels sue AI music startups over copyright infringement
The world’s major record labels, including Universal Music Group, Sony Music, and Warner Music, have filed twin lawsuits against the AI music generation startups Suno and Udio. The lawsuits accuse the companies of unlawfully training their AI models on massive amounts of copyrighted music, which, according to the complaints, allows the startups to generate similar-sounding music without permission.
The record labels allege Suno and Udio have effectively copied artists’ styles and specific musical characteristics. The labels claim the AI-generated music is so close to the original that it is eerily similar when transcribed into sheet music. The lawsuits also accuse the startups of making it easy for people to distribute AI-created samples that mimic copyrighted recordings on platforms like Spotify.
Why does it matter?
This legal battle is just the latest of many collisions between AI technology and the creative industries. Any resolution of these disputes will likely have far-reaching implications for the music industry and the development of AI-powered music tools.
Apple withdraws AI collaboration deal with Meta over privacy concerns
Contrary to the rumors about a potential AI collaboration, Apple is not planning a partnership to integrate Meta’s AI models into its products due to privacy concerns. Instead, Apple is focusing on partnerships with OpenAI and Google that align with its commitment to user privacy.
Shopify expands its AI capabilities to attract more businesses
Shopify has expanded access to its AI-powered features, including the Sidekick assistant and image-generation tool. It is making these AI tools available to more users, as it aims to provide merchants with an integrated and seamless platform to scale their businesses.
Google is bringing Gemini access to teens through school accounts
Google now makes its AI assistant Gemini available to teenage students through school accounts. This move is aimed at helping prepare students for a future where generative AI is more prevalent. Google says Gemini will provide real-time feedback and support to students while including safeguards to prevent inappropriate responses.
NVIDIA deploys its AI tech in the Middle East amid U.S. AI export curbs
NVIDIA has struck a deal with Qatari telecom provider Ooredoo to deploy its AI technology in the Middle East. This move comes amid U.S. export restrictions on advanced AI chips to some countries in the region. Ooredoo will be the first company in the Middle East to offer Nvidia’s AI and graphics technology to its clients across multiple countries.
Meta has been incorrectly labeling real photos as “Made by AI.” Multiple photographers have reported that Meta has added this label to their genuine photographs, even when they have only made minor edits using tools like Adobe’s Generative Fill. Meta says it relies on industry standards to detect AI-generated content, leading to false positives.
A Daily chronicle of AI Innovations June 24th 2024:
Apple and Meta are discussing gen AI partnership
ByteDance and Broadcom collaborate on AI chip development
Researchers developed a new method to detect hallucinations
Apple in talks with Meta for potential AI integration
Record labels sue AI music companies
AI blood test for Parkinson’s detection
Researchers just developed an AI-powered blood test that can predict Parkinson’s disease up to 7 years before symptoms appear, potentially revolutionizing early diagnosis and treatment.
The AI analyzes key markers in blood samples linked to inflammation and protein degradation, attempting to predict which patients would develop the disease.
In a 10-year study, the test correctly identified 79% of patients who later developed Parkinson’s.
The AI model predicted Parkinson’s up to 7 years before clinical symptoms emerged in some cases.
Researchers hope to develop a simple finger-prick version for widespread screenings.
AI’s ability to find subtle patterns in previously undetectable biological data is unlocking new diagnostic capabilities. By catching a debilitating disease years before the onset of symptoms, patients can get earlier access to treatment aimed at prevention.
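For readers curious what this kind of workflow looks like in code, here is a generic sketch of training and cross-validating a classifier on biomarker features. The data is synthetic and the model is a stock scikit-learn classifier; the study’s actual markers, model, and pipeline are not detailed in this article, so everything below is illustrative only.

```python
# Generic sketch of the kind of workflow described above: train and cross-validate a
# classifier on blood-marker features. Synthetic data and a stock scikit-learn model;
# the study's actual markers, model, and pipeline are not detailed here.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_patients, n_markers = 200, 8                 # hypothetical: 8 blood protein markers per patient
X = rng.normal(size=(n_patients, n_markers))   # marker measurements (synthetic)
y = rng.integers(0, 2, size=n_patients)        # 1 = later developed Parkinson's, 0 = did not

clf = RandomForestClassifier(n_estimators=200, random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())  # near 0.5 on random data; the study reports ~79% on real markers
```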
Apple in talks with Meta for potential AI integration
Apple is reportedly negotiating with Meta to integrate Meta’s generative AI model into Apple’s new AI system, Apple Intelligence, according to The Wall Street Journal.
Apple is seeking partnerships with multiple AI companies, including Meta, to enhance its AI capabilities and catch up in the competitive AI race.
A potential collaboration between Apple and Meta would be significant due to their history of disagreements, and it could greatly impact the AI industry if successful.
Universal Music Group, Sony Music Entertainment, and Warner Records are suing AI companies Suno and Udio for massive copyright infringement related to the creation of original songs using generative AI technology.
The music labels claim that Suno and Udio used copyrighted works without permission, and are seeking damages of up to $150,000 per violated work in lawsuits filed in Boston and New York federal courts.
This legal action is part of a broader conflict between the music industry and tech companies offering AI tools, highlighting concerns about AI-generated music impacting the control and revenue of human artists.
Apple is reportedly in talks with its longtime rival Meta to integrate the latter’s Llama 3 AI model into Apple Intelligence. This move comes as Apple prepares to roll out its AI features across iPhones, iPads, and Macs later this year.
The potential partnership follows Apple’s existing deal with OpenAI, suggesting a collaboration strategy rather than solo development in the AI race. In Apple’s arrangement with OpenAI, there’s no direct payment. Instead, OpenAI can offer premium subscriptions through Apple Intelligence, with Apple taking a percentage. It’s unclear if Meta would agree to a similar business model, given that Llama 3 is open-source and free to access.
Why does it matter?
Despite their historical rivalry, especially after Apple’s 2021 privacy changes impacted Meta’s ad revenue, this potential partnership suggests a willingness to cooperate in the AI race. However, not everyone is thrilled about it. Users and investors are voicing concerns about privacy, with Meta’s track record being a major sticking point. Some see this as a desperate move by Apple.
ByteDance and Broadcom collaborate on AI chip development
ByteDance is collaborating with U.S. chip designer Broadcom to develop an advanced AI processor. This partnership aims to secure a stable supply of high-end chips amid ongoing U.S.-China tensions. The project centers on creating a 5-nanometre, customized Application-Specific Integrated Chip (ASIC) that complies with U.S. export restrictions.
This chip’s manufacturing is set to be outsourced to Taiwan Semiconductor Manufacturing Company (TSMC), though production is not expected to begin this year. While the design work is currently underway, the critical “tape out” phase has yet to commence.
Why does it matter?
This partnership shows how Chinese tech companies are getting creative to secure advanced AI chips. As ByteDance has launched a major push in generative AI, this new initiative brings dual benefits: reducing procurement costs and ensuring a stable supply of higher-end chips.
Researchers developed a new method to detect hallucinations
ChatGPT and Gemini can produce impressive results but often “hallucinate” false or unsubstantiated information. This research focuses on a subset of hallucinations called “confabulations,” where LLMs generate answers that are both wrong and arbitrary. Researchers have developed new methods to detect confabulations using entropy-based uncertainty estimators. They introduce the concept of “semantic entropy” to measure the uncertainty of LLM generations at the meaning level.
High semantic entropy corresponds to high uncertainty and indicates a higher likelihood of confabulation. The method computes uncertainty at the level of meaning rather than specific word sequences, addressing the fact that one idea can be expressed in many ways. The method provides scalable oversight by detecting confabulations that people might otherwise find plausible.
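A minimal sketch of the idea, under simplifying assumptions: sample several answers to the same question, group answers that express the same meaning, and compute entropy over those meaning clusters. The same_meaning check below is a naive placeholder; the researchers cluster answers with bidirectional entailment rather than string matching.

```python
# Minimal sketch of semantic entropy: sample several answers, group the ones that
# express the same meaning, then compute entropy over the meaning clusters.
# `same_meaning` is a naive placeholder; the paper clusters answers with
# bidirectional entailment, not string matching.
import math

def same_meaning(a: str, b: str) -> bool:
    return a.strip().lower() == b.strip().lower()  # placeholder for an entailment check

def semantic_entropy(samples: list[str]) -> float:
    clusters: list[list[str]] = []
    for s in samples:                      # greedy clustering by meaning
        for cluster in clusters:
            if same_meaning(s, cluster[0]):
                cluster.append(s)
                break
        else:
            clusters.append([s])
    probs = [len(c) / len(samples) for c in clusters]
    return -sum(p * math.log(p) for p in probs)

# Scattered meanings -> high entropy -> likely confabulation; agreement -> low entropy.
print(semantic_entropy(["Paris", "paris", "Lyon", "Marseille"]))  # high
print(semantic_entropy(["Paris", "paris", "Paris", " PARIS"]))    # 0.0
```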
Why does it matter?
Detecting confabulations can improve question-answering accuracy for state-of-the-art LLMs. The method helps users understand when to be cautious with LLM outputs and opens up new possibilities for using LLMs that were previously limited by unreliability.
Stability AI has named Prem Akkaraju, former CEO of Weta Digital, as its new CEO. The change comes after founder Emad Mostaque resigned in March amid financial concerns. Akkaraju is part of a new investor group, including former Facebook President Sean Parker, that is providing a cash infusion to Stability AI. Source: https://www.theinformation.com/articles/stability-ai-gets-new-ceo-and-a-bailout-from-investor-group-led-by-sean-parker
Coders spotted Midjourney references in Grok’s standalone version
It hints at a possible collaboration between xAI and Midjourney for image generation. Grok might use Midjourney models via API to generate images. This could mean that the standalone Grok might let users upload images. All this could be part of the Grok 1.5 rollout. Source: https://www.bolnews.com/technology/2024/06/grok-1-5-potential-use-of-midjourney-for-image-generation-gains-evidence
YouTube fights AI fakes with new policy
YouTube expanded its privacy policy to cover AI-generated content. Now, users can request the removal of synthetic media that imitates their face or voice. However, privacy complaints alone may not guarantee content removal. While reviewing requests, YouTube will consider the potential for confusion and intent (parody/satire). Source: https://support.google.com/youtube/thread/281049221
OmniAI transforms unstructured enterprise data for AI and analytics use
Founded by Y Combinator alums, OmniAI syncs with your existing storage, preps the data, and lets you run your choice of AI models on it. It offers integrations with popular models like Meta’s Llama 3 and Anthropic’s Claude. Source: https://techcrunch.com/2024/06/22/omniai-transforms-business-data-for-ai
The new Dot AI chatbot wants to be your companion
Dot chatbot aims to personalize interactions based on users’ thoughts and feelings. It uses multiple AI models from OpenAI and Google. The app uses intensive Q&A to learn about users’ interests and experiences. Dot offers a subscription model at $11.99/month for unlimited conversations. Source: https://techcrunch.com/2024/06/21/dots-ai-really-really-wants-to-get-to-know-you
Y Combinator joined over 140 AI startup founders in signing a letter opposing California’s Senate Bill 1047 on AI regulation, arguing it harms the ability to retain talent and stifles innovation. Source: https://www.politico.com/newsletters/california-playbook/2024/06/21/little-tech-brings-a-big-flex-to-sacramento-00164369
Formula 1 and AWS unveiled ‘Statbot’, an AI-powered system that provides real-time context, trivia, and predictions for personalized race broadcasts. Source: https://www.bloomberg.com/news/articles/2024-06-23/formula-1-and-amazon-aim-for-ai-powered-personalized-race-viewing
OpenAI announced the acquisition of Rockset, with the search and database analytics startup’s tech set to boost retrieval infrastructure for enterprise products. https://openai.com/index/openai-acquires-rockset
Axel Peytavin demonstrated a small, autonomous robot prototype powered by GPT-4o that can see, think and act based on prompts, with capabilities like picking up trash and shaking hands. Source: https://x.com/ax_pey/status/1804209628680720746
OpenAI CTO Mira Murati said in an interview that the next generation of GPT will be ‘PhD-level’ compared to GPT-3 (toddler) and GPT-4 (high school), and will be released in a year and a half. Source: https://x.com/tsarnick/status/1803901130130497952
Unitree posted new demos of its Go2 robotic dog performing handstands, flips, and withstanding and adjusting to force, showing extreme precision and agility. Source: https://x.com/UnitreeRobotics/status/1803728664213012520
A Daily chronicle of AI Innovations June 20th 2024:
Microsoft debuts a vision-foundational model for diverse tasks
Ex-OpenAI co-founder launches own AI company
Can AI read minds? New model can predict anxiety levels
😱 Sutskever launches new AI venture
🚧 Tech giants team up on AI factory for xAI
🚨 Perplexity faces legal action
OpenAI co-founder announces new AI company
Anthropic announces its most powerful AI yet
EU Council withdraws Chat Control vote
Apple Intelligence plans face legal challenges in China
Microsoft debuts a vision-foundational model for diverse tasks
Florence 2, Microsoft’s vision AI model, can perform diverse tasks such as object detection, captioning, visual grounding, visual question answering, and segmentation via combined image and text prompts.
The model comes in 232M and 771M parameter sizes and uses a sequence-to-sequence architecture, enabling multiple vision tasks without needing a task-specific architecture modification.
When fine-tuned on publicly available human-annotated data, Florence 2 showed impressive results, rivaling much larger vision models like Flamingo despite its compact size.
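Here is a minimal sketch of switching Florence 2 between tasks via task-prompt strings, assuming the publicly released Hugging Face checkpoint microsoft/Florence-2-large and the transformers library with remote code enabled; the image path is a placeholder and exact processor details may differ.

```python
# Minimal sketch of switching Florence 2 between tasks via task-prompt strings,
# assuming the released Hugging Face checkpoint and the transformers library with
# remote code enabled. The image path is a placeholder; exact details may differ.
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Florence-2-large"  # assumed public checkpoint name
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("photo.jpg")  # placeholder image path

# The same weights handle captioning, detection, etc. by changing the task prompt.
for task in ["<CAPTION>", "<OD>"]:
    inputs = processor(text=task, images=image, return_tensors="pt")
    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=256,
    )
    text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    print(task, processor.post_process_generation(text, task=task, image_size=image.size))
```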
Why does it matter?
The model will give enterprises a single, standard approach to a variety of vision applications, saving the resources otherwise spent on separate task-specific vision models that each need fine-tuning. It may also help developers by eliminating the need for separate vision models for smaller tasks, significantly reducing compute costs.
Just a month after leaving OpenAI, ex-cofounder Ilya Sutskever has launched his own AI company, Safe Superintelligence Inc. (SSI), alongside former Y Combinator partner Daniel Gross and ex-OpenAI engineer Daniel Levy as co-founders.
According to SSI’s launch statement on X, the company will prioritize safety, progress, and security. Sutskever also emphasizes that the company’s “singular focus” on a joint approach to safety and capabilities will keep it from being distracted by management overhead or product cycles, unlike companies such as OpenAI or Google.
Why does it matter?
SSI’s launch clearly marks the emergence of a new key player in the race to build safe, powerful AI. Its mission statement emphasizes safety and the potential for groundbreaking developments that may shape the future of AI research and development. It would be interesting to see whether the startup will uphold its mission statement in the coming days.
Source: https://ssi.inc/
Can AI read minds? New model can predict anxiety levels
Researchers at the University of Cincinnati have developed an AI model that can identify people with an urgent risk of anxiety. The AI model uses minimal computational resources, a short picture rating task, and a small set of variables to make the prediction. The approach named “Comp Cog AI” integrates computational cognition and AI.
Participants rated 48 pictures with mildly emotional subject matter based on the degree to which they liked or disliked those pictures. The response data was then used to quantify the mathematical features of their judgments. Finally, the data was combined with ML algorithms to identify their anxiety levels.
Since the technology doesn’t rely on a native language, it is accessible to a wider audience and diverse settings to assess anxiety.
Why does it matter?
The picture rating feature can provide unbiased data to medical professionals on a person’s mental health status without subjecting them to direct questions that may trigger negative emotions. Given its 81% accuracy rate, the tool can become a useful app for detecting individuals with high anxiety risks.
Anthropic announces its most powerful AI yet
Anthropic has launched Claude 3.5 Sonnet, a new AI model that aims to be on par with, or superior to, OpenAI’s GPT-4o and Google’s Gemini across various tasks.
Anthropic says Claude 3.5 Sonnet is significantly faster than its predecessor and outperforms it, even surpassing other leading models on multiple benchmarks.
Alongside the new model, Anthropic introduced the Artifacts feature, which allows users to interact with and edit Claude’s outputs directly within the app, enhancing its functionality beyond a typical chatbot.
Apple Intelligence plans face legal challenges in China
Apple’s push to enhance its artificial intelligence capabilities faces significant challenges in China due to the country’s stringent AI regulations and censorship laws.
China’s strict regulation of AI, including rules about language models and data protection, means Apple must navigate approvals and potentially partner with local tech giants like Baidu and Alibaba.
Apple’s focus on user privacy and the need for localized AI experiences may help it gain regulatory approval but also present significant hurdles in ensuring compliance and acceptance in the Chinese market.
Tech giants team up on AI factory for xAI
Dell, Nvidia, and Super Micro Computer are collaborating to build an AI factory for xAI’s massive supercomputer, which will train and scale its Grok AI chatbot.
Dell is assembling half of the server racks for xAI’s planned supercomputer, with Super Micro handling the other half.
The system aims to use up to 100,000 Nvidia H100 GPUs, potentially making it 4x larger than the biggest existing AI clusters.
Musk wants the supercomputer to be fully operational by Fall 2025 to accelerate the development of advanced Grok models.
This powerhouse partnership shows the staggering scale of compute power required to push the boundaries of AI. With huge capital flowing in, this massive cluster could provide a critical edge in evolving Grok to be a serious competitor to the likes of OpenAI and Anthropic.
Perplexity faces legal action
News publisher Forbes is reportedly threatening legal action against AI search startup Perplexity, alleging the company’s chatbot stole and republished original reporting without proper attribution.
Forbes claims Perplexity’s chatbot summarized one of its investigative articles, copying it with minimal attribution.
The AI-generated version was then pushed to Perplexity subscribers and turned into a YouTube video that outranked Forbes’ own content.
Perplexity CEO Aravind Srinivas responded that the product still has “rough edges”, and the company is working on ways to align with publishers.
A WIRED investigation found Perplexity scraping sites that actively blocked its crawlers, while also hallucinating information and attributing it incorrectly.
Tensions between AI firms and media companies continue to escalate. While legal grounds surrounding scraping and repurposing with AI are still murky, publishers are in a bind when it comes to protecting IP. The only viable options right now seem to be cashing out with licensing deals or taking legal action.
Deloitte, HPE, and NVIDIA partner up! The alliance combines Deloitte’s deep industry expertise and AI capabilities with the newly released HPE NVIDIA AI Compute solutions suite. The collaboration further seeks to advance industry-specific gen AI applications and assist clients across sectors in modernizing their data strategies by accessing innovative insights.
NVIDIA’s AI can turbocharge deployment of self-driving cars: NVIDIA has unveiled a new AI software, Omniverse Cloud Sensor RTX, that will likely accelerate the development of self-driving cars and robots.
The software combines real-world data with synthetic data, making it easy to test sensor perception in realistic virtual environments before deployment into the actual world.
YouTube trials AI-powered “Ask for music” feature: The feature seeks to make searching music more conversational for users. Users can search for music using voice commands and verbal prompts. It is speculated that the feature may be an addition to the AI-generated playlist cover available on YouTube.
Luma adds an “extend video” feature to Dream Machine: In contrast to Luma’s previous five-second limit for videos, the “extend video” feature will allow videos to be extended based on prompts. The AI model will consider the new context while extending the video. Additionally, the upgrade will enable Standard, Pro, and Premier users to remove watermarks.
Apple releases 20 core ML models on Hugging Face: The release includes 4 major datasets in addition to 20 core models as a part of Apple’s efforts to equip developers with advanced on-device AI capabilities.
These core ML models have been optimized to run exclusively on users’ devices and can be used for various applications like image classification, depth estimation, and semantic segmentation.
Source: https://huggingface.co/apple
A Daily chronicle of AI Innovations June 19th 2024:
Nvidia is now the world’s most valuable company
Perplexity AI is lying about their user agent
Meta releases new AI models for text, image and audio
Ex-Snap engineer launches social network where AIs and humans coexist
OpenAI and Color Health join forces to accelerate cancer treatment
Meta unveils new AI models for audio, text, and watermarking
Notion introduces AI Connectors for Slack & Google Drive integration
🎺 TikTok brings AI to content creation
Nvidia is now the world’s most valuable company
Nvidia surpassed Microsoft to become the world’s most valuable company, reaching a market cap of $3.34 trillion, driven by its focus on artificial intelligence and expanded robotics research.
The company’s stock price increased by 3.7%, putting it ahead of Microsoft’s $3.33 trillion market cap, while Apple ranks third with a market cap of $3.28 trillion.
Nvidia’s rise is fueled by partnerships and advancements in AI and robotics, including a new collaboration with Hewlett Packard Enterprise on the NVIDIA AI Computing by HPE portfolio and HPE Private Cloud AI, along with significant projects with major AI developers like Google, Meta, Microsoft, and OpenAI.
Perplexity AI is lying about their user agent
WIRED and developer Robb Knight found that Perplexity’s crawler ignores the Robots Exclusion Protocol, accessing content from sites that have blocked it, including Condé Nast properties.
Perplexity CEO Aravind Srinivas describes the product as an “answer engine,” but the chatbot has been found to paraphrase articles without proper attribution and sometimes fabricate information entirely.
An experiment by WIRED demonstrated that Perplexity sometimes generates content without accessing the actual webpage, raising concerns about the accuracy and ethics of its information-gathering methods.
Meta releases new AI models for text, image and audio
Meta’s Fundamental AI Research team introduced new models such as Chameleon, a multimodal model for text and images, and JASCO, a text-to-music model accepting various inputs like chords or beats.
Meta’s Chameleon can handle and generate text and images, with its 7B and 34B variants available for research under a non-commercial license.
The company also launched AudioSeal, an audio watermarking technology that can identify AI-generated speech within lengthy audio segments, released under a commercial license.
Ex-Snap engineer launches social network where AIs and humans coexist
Butterflies, founded by former Snap engineer Vu Tran, is a social network allowing humans and AI personas (called Butterflies) to interact through posts, comments, and direct messages, launching publicly on iOS and Android after five months in beta.
Users can easily create Butterfly personas with backstories, emotions, and opinions, which then autonomously generate content that other users and AIs can engage with, aiming to provide a more immersive AI experience compared to traditional chatbots.
The app received $4.8 million in seed funding led by Coatue and has mostly been used for entertainment, with potential plans for subscription models and brand interactions in the future.
OpenAI and Color Health join forces to accelerate cancer treatment
In collaboration with OpenAI, Color Health has developed a copilot application that leverages OpenAI’s GPT-4o model to assist clinicians in accelerating cancer treatment. Here’s how it works:
Integration of medical data: The copilot integrates patient medical data with clinical knowledge using OpenAI’s APIs. It processes and normalizes vast unstructured data such as family history, individual risk factors, and clinical guidelines.
Tailored treatment plans: Based on this data, the copilot generates customized, comprehensive treatment plans. These plans answer critical questions like “What screenings should the patient be doing?” and identify missing diagnostics.
Clinician-in-the-loop workflow: A clinician evaluates the copilot’s output at each step. They can modify it if necessary before presenting it to the patient.
Assured privacy: Color Health ensures HIPAA compliance and maintains rigorous privacy standards while providing clinician oversight to ensure high-quality care.
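Neither company has published the copilot’s code, so the following is only a minimal sketch of the clinician-in-the-loop pattern described above: the prompt wording, patient fields, and review step are illustrative assumptions, while the chat-completions call is standard OpenAI SDK usage.

```python
# Minimal sketch of a clinician-in-the-loop screening workflow (illustrative assumptions;
# only the OpenAI SDK call reflects the real API).
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def draft_screening_plan(patient_record: dict, guidelines: str) -> str:
    """Ask the model to draft a screening/work-up plan and flag missing diagnostics."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system",
             "content": "Draft a cancer screening plan strictly from the supplied guidelines. "
                        "List any missing diagnostics. A clinician reviews every output."},
            {"role": "user",
             "content": f"Guidelines:\n{guidelines}\n\nPatient record:\n{patient_record}"},
        ],
    )
    return response.choices[0].message.content

def clinician_review(draft: str) -> str:
    """Placeholder for the human step: a clinician edits or approves the draft."""
    print("--- DRAFT FOR CLINICIAN REVIEW ---\n", draft)
    edited = input("Edit the plan (or press Enter to approve): ")
    return edited or draft

if __name__ == "__main__":
    record = {"age": 52, "family_history": ["colon cancer"], "last_colonoscopy": None}
    plan = clinician_review(draft_screening_plan(record, "Example guideline text..."))
    print("Approved plan:\n", plan)
```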
Why does it matter?
The tool identifies missing diagnostics and expedites the analysis of complex medical records – a process that can now be completed in just 5 minutes rather than hours or weeks. This not only improves access to critical expertise but also has the potential to catch cancer or pre-cancerous conditions earlier, enabling faster treatment and better patient outcomes.
Source: https://openai.com/index/color-health
Meta unveils new AI models for audio, text, and watermarking
Meta’s Fundamental AI Research (FAIR) team has recently released several new AI models and tools for researchers to use. Here are the highlights:
JASCO: This AI model, short for “Joint Audio and Symbolic Conditioning for Temporally Controlled Text-to-Music Generation,” allows users to adjust features of generated sound (such as chords, drums, and melodies) through text inputs. FAIR plans to release the JASCO inference code under an MIT license and the pre-trained model under a non-commercial Creative Commons license.
AudioSeal: This tool adds watermarks to AI-generated speech. It’s designed specifically for localized detection of AI-generated segments within longer audio snippets and will be released with a commercial license.
Chameleon: FAIR will release two sizes of its multimodal text model, Chameleon (7B and 34B), under a research-only license. These models can handle tasks requiring visual and textual understanding, such as image captioning.
Why does it matter?
By making these tools publicly available, Meta encourages collaboration within the AI community, which could lead to groundbreaking applications in various fields, such as music creation, sound design, and content generation. Moreover, the introduction of AudioSeal addresses the growing need to identify and attribute AI-generated content, promoting transparency and accountability in an increasingly AI-driven world.
Notion introduces AI Connectors for Slack & Google Drive integration
With AI Connectors, users can query Notion AI and receive responses drawn from connected apps. The integration allows selecting specific public channels, or all public channels, for information retrieval, excluding private channels and direct messages.
Notion AI can access up to a year of historical Slack content, though the initial sync may take up to 36 hours. Additionally, users can set up various interactions between Notion and Slack, such as sending Slack messages to a Notion database, creating Notion tasks directly from Slack, and receiving notifications in Slack for specific events in Notion.
Why does it matter?
This development demonstrates the growing trend of incorporating AI capabilities into mainstream productivity and collaboration platforms. The ability to intelligently pull in and surface relevant information from external platforms helps streamline workflows, enhance productivity, and improve the overall efficiency of the team.
Researchers at the University of Cincinnati just developed a new AI system that can predict anxiety levels using a short picture rating task and a small set of demographic and psychological variables.
Comp Cog AI combines principles from psychology with machine learning to measure human decision-making patterns related to anxiety.
Participants rate a series of emotion-evoking pictures and answer a few contextual questions about their age, income, employment, etc.
The model extracts 15 ‘judgment variables’ that capture patterns in how users weigh positive and negative outcomes in decision-making.
In tests, the model predicted whether participants had higher or lower anxiety with up to 81% accuracy.
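The study’s exact feature definitions aren’t reproduced here, but the reported pipeline (judgment variables from the rating task plus a few contextual variables, fed to a standard classifier) can be sketched with scikit-learn as below; the features, labels, and classifier choice are placeholders, not the researchers’ implementation.

```python
# Illustrative sketch of the reported pipeline: judgment variables from picture ratings
# plus contextual variables, fed to a standard classifier. All data here is synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_participants = 200

judgment = rng.normal(size=(n_participants, 15))   # 15 'judgment variables' from ratings
context = rng.normal(size=(n_participants, 3))     # e.g., age, income, employment (encoded)
X = np.hstack([judgment, context])
y = rng.integers(0, 2, size=n_participants)        # 1 = higher anxiety, 0 = lower (synthetic)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)
print(f"cross-validated accuracy: {scores.mean():.2f}")  # the study reports ~81% on real data
```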
While anxiety disorders are extremely prevalent, they’re also often challenging and time-consuming to diagnose. With AI’s ability to connect data patterns and achieve reliable predictions, the tech could soon enable more efficient screenings to streamline assessment and treatment time.
Google transitions DeepMind from research to AI product development
This transition involves merging Google Research’s Brain team with DeepMind, creating a unified group called Google DeepMind. The aim is to enhance the development of advanced AI technologies responsibly. However, this move has led to frustration among some researchers due to new guidelines and the pressure to commercialize AI technologies.
ElevenLabs’ new open-source AI tool adds sound effects to videos
The app extracts four frames from the video and sends them with a prompt to OpenAI’s GPT-4 to generate a text-to-sound effects prompt. This prompt is used by ElevenLabs’s Sound Effects API to create the sound effect, which is then combined with the video into a single file up to 22 seconds long, ready for download.
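ElevenLabs released the tool as open source; the sketch below only approximates the flow described above (sample frames, ask GPT-4o for a sound-effect prompt, synthesize audio). The frame-sampling helper, prompt wording, and the ElevenLabs endpoint path are assumptions, so check the project’s repository and the current API reference for the real implementation.

```python
# Rough sketch of the described flow: sample frames -> GPT-4o writes an SFX prompt ->
# ElevenLabs synthesizes audio. The endpoint path and prompt wording are assumptions.
import base64
import cv2  # OpenCV for frame extraction
import requests
from openai import OpenAI

def sample_frames(video_path: str, n: int = 4) -> list[str]:
    """Return n evenly spaced frames as base64-encoded JPEGs."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    frames = []
    for i in range(n):
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(i * total / n))
        ok, frame = cap.read()
        if ok:
            _, buf = cv2.imencode(".jpg", frame)
            frames.append(base64.b64encode(buf.tobytes()).decode())
    cap.release()
    return frames

client = OpenAI()
frames = sample_frames("clip.mp4")
content = [{"type": "text", "text": "Write a short text-to-sound-effects prompt for this video."}]
content += [{"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{f}"}} for f in frames]
sfx_prompt = client.chat.completions.create(
    model="gpt-4o", messages=[{"role": "user", "content": content}]
).choices[0].message.content

# Assumed ElevenLabs sound-generation endpoint; verify against the current API docs.
audio = requests.post(
    "https://api.elevenlabs.io/v1/sound-generation",
    headers={"xi-api-key": "YOUR_KEY"},
    json={"text": sfx_prompt, "duration_seconds": 10},
)
open("sfx.mp3", "wb").write(audio.content)
```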
Snap unveils advanced gen AI for enhanced AR effects
Snap’s latest iteration of gen AI allows users to see more realistic special effects. It plans to create full-body AR experiences rather than just facial AR experiences, such as generating a new outfit. It also announced an upgraded version of its developer program, Lens Studio, which lets artists and developers create AR features for Snapchat or other websites and apps.
Hedra Labs launches Character-1, a multimodal creation studio preview
Hedra Labs has launched a preview of Character-1, a model for creating expressive characters that talk, sing, and rap. It offers infinite duration (30s in open preview) and generates 90s content per 60s of input, hardware permitting. This is part of Hedra’s mission to give creators control over dialogue, movement, and virtual worlds.
Ex-Snap engineer launches AI-human social network, Butterflies
The platform allows users to create AI avatars, called “Butterflies,” that can engage in conversations, generate images, and participate in social activities like human users. The app offers a range of features, including creating and customizing AI characters and exploring a feed filled with AI-generated and human-generated content.
TikTok brings AI to content creation
TikTok just launched Symphony, a new suite of AI features to help brands and creators produce content more efficiently on the platform — including digital avatars, translation tools, an AI assistant, and more.
Symphony Digital Avatars enable brands to generate AI spokespeople to star in their TikTok ads and branded content across multiple languages.
Brands can choose from a selection of ‘stock avatars’ based on real actors or create custom avatars to serve as virtual brand reps.
A new AI Dubbing tool automatically transcribes, translates, and dubs videos into 10+ languages, helping brands scale content globally.
TikTok’s Creative Assistant has been rebranded to Symphony Assistant, a chatbot to help with brainstorming, writing scripts, and optimizing content for the platform.
The creator economy is about to be completely remade with AI, and these tools will unlock new opportunities for optimized content, reach, and creation — but the rise of digital ‘spokespeople’ will also raise new questions on authenticity and consumer trust that brands will need to grapple with.
A Daily chronicle of AI Innovations June 18th 2024:
OpenAI expands into healthcare with AI tool for cancer screening
NATO invests $1.1 billion in AI, space, and robotics defence tech
Apple halts work on Vision Pro
Google DeepMind’s new AI can generate soundtracks for videos
Runway launches new model Gen-3 Alpha
China’s DeepSeek Coder V2 beats GPT-4 Turbo
🔊 DeepMind creates sound for videos
Apple halts work on Vision Pro
Apple has paused development on the second-generation Vision Pro headset to concentrate on a more affordable version, according to The Information.
The company reportedly began working on this cheaper Vision device, codenamed “N109,” in 2022, aiming to sell it for a price similar to a high-end iPhone.
Even though Apple plans to use the high-end display components from the Vision Pro in the budget model, it still faces challenges in reducing costs without removing too many features.
OpenAI expands into healthcare with AI tool for cancer screening
OpenAI has partnered with Color Health to integrate artificial intelligence into cancer screening and treatment procedures using their GPT-4o model.
The AI assistant developed by Color Health processes patient data to create personalized cancer screening strategies and informs doctors of missing diagnostic tests.
The AI tool also helps in assembling a cancer pretreatment work-up, expediting the process that can take weeks or months, and thus potentially reducing delays in treatment.
NATO invests $1.1 billion in AI, space, and robotics defence tech
NATO has started its €1 billion investment in defence technology, focusing on AI, space, and robotics, by funding four European tech firms and four venture capital funds.
The NATO Innovation Fund, established in 2022, aims to address contemporary defence and security challenges by matching government buyers with start-ups to develop new technologies.
Initial recipients of the fund include Germany’s ARX Robotics, UK’s Fractile AI, iComat, and Space Forge, with the broader aim of promoting NATO’s technological sovereignty and innovation ecosystems.
Google DeepMind’s new AI can generate soundtracks for videos
DeepMind is developing video-to-audio (V2A) technology to generate rich soundtracks for silent videos generated by AI models. V2A combines video pixels with natural language text prompts to create synchronized audiovisual content. The technology offers enhanced creative control, allowing users to guide the audio output using positive and negative prompts.
What sets DeepMind’s V2A apart is its ability to work from raw pixels and generate audio without manual alignment. However, V2A struggles with artifacts or distortions in videos, and the generated audio is not yet fully convincing. As DeepMind continues to gather feedback from creators and filmmakers, it remains committed to developing this technology responsibly.
Why does it matter?
The technology could help revive and enhance historical footage, silent films, and other archival material. However, generative AI tools like V2A also threaten to disrupt the film and TV industry, potentially eliminating jobs without strong labor protections.
Runway launches new model Gen-3 Alpha
Runway launched Gen-3 Alpha, its latest AI model for generating video clips from text descriptions and still images. Gen-3 Alpha excels at generating expressive human characters with a wide range of actions, gestures, and emotions and can interpret various styles and cinematic terminology. However, it has limitations, including a maximum video length of 10 seconds, and struggles with complex character and object interactions and with following the laws of physics precisely.
Runway partnered with entertainment and media organizations to create custom versions of Gen-3 for more stylistically controlled and consistent characters, targeting specific artistic and narrative requirements. They also have implemented safeguards, such as a moderation system to block attempts to generate videos from copyrighted images and a provenance system to identify videos coming from Gen-3.
Why does it matter?
As competition in AI video generation heats up, Runway’s Gen-3 Alpha empowers artists and filmmakers to create high-quality, controllable videos with ease, pushing the boundaries of storytelling and creative possibilities.
China’s DeepSeek Coder V2 beats GPT-4 Turbo
Chinese AI startup DeepSeek has announced the release of DeepSeek Coder V2, an open-source code language model. It is built upon the DeepSeek-V2 MoE model and excels at coding and math tasks, supporting over 300 programming languages. It outperforms state-of-the-art closed-source models like GPT-4 Turbo, Claude 3 Opus, and Gemini 1.5 Pro, making it the first open-source model to achieve this feat. DeepSeek Coder V2 also maintains comparable performance in general reasoning and language capabilities.
The model is being offered under an MIT license, which allows for research and unrestricted commercial use. It can be downloaded or accessed via API on DeepSeek’s platform.
Why does it matter?
DeepSeek aims to “unravel the mystery of AGI with curiosity” and has quickly emerged as a notable Chinese player in the AI race. At $0.14 per 1M input tokens and $0.28 per 1M output tokens, it will give notable models like GPT-4 Turbo intense competition on price.
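At those listed prices, estimating the cost of a workload is simple arithmetic; the token counts below are arbitrary examples.

```python
# Back-of-the-envelope cost at the listed DeepSeek Coder V2 API prices.
input_tokens, output_tokens = 2_000_000, 500_000
cost = (input_tokens / 1e6) * 0.14 + (output_tokens / 1e6) * 0.28
print(f"${cost:.2f}")  # $0.42 for 2M input + 0.5M output tokens
```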
Findem – Data Analyst: https://jobs.therundown.ai/jobs/63102512-data-analyst
What Else Is Happening in AI on June 18th 2024
Perplexity now displays weather, currency conversion, and simple math directly through cards
This move aims to keep users from going to Google for such results. Perplexity’s CEO, Aravind Srinivas, acknowledged that Google handles basic queries like weather, time, and live sports scores well, and his company had work to do in that area.
U.S. government and private sector ran the first AI attack simulation
Federal officials, AI model operators, and cybersecurity companies ran the first joint simulation of a cyberattack on a critical AI system. It also involved experts from private sector companies like Microsoft, Nvidia, and OpenAI. It helped identify potential new threats and establish communication channels between the government and the private sector.
Adobe Acrobat got a major upgrade, bringing AI to PDFs and more
Adobe Firefly generative AI enables image generation and editing directly within Acrobat for the first time. Acrobat AI Assistant’s new features, “insights across documents” and “enhanced meeting transcripts,” help users extract insights and share information from various document types. Adobe is offering free, unlimited access to Acrobat AI Assistant from June 18 to June 28.
TikTok introduces gen AI avatars of creators and stock actors for ads
“Custom Avatars” allow creators to scale their likeness for multilingual avatars and brand collaborations, while brands can use pre-built “Stock Avatars” to add a human touch. Plus, the new “AI Dubbing” tool translates content into ten languages, helping creators and brands increase their global reach.
Pixelbot 3000 builds Lego art using simple AI prompts
YouTuber Creative Mindstorms designed and built the Pixelbot 3000, a Lego printer that automates the assembly of brick-built mosaics. It uses OpenAI’s DALL-E 3 to generate images based on simple text prompts. First it generates a simplified cartoon-style image, then it is divided into a 32 x 32 grid, and the color of the center pixel in each square is sampled to create a high-contrast scaled image for the mosaic.
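The video’s own code isn’t reproduced here, but the grid-sampling step it describes is easy to sketch with Pillow; the grid size matches the description, while the file names are placeholders.

```python
# Sketch of the mosaic step described above: downscale a DALL-E 3 image to a 32x32
# grid by sampling the center pixel of each cell. Illustrative, not the project's code.
from PIL import Image

GRID = 32
img = Image.open("dalle3_output.png").convert("RGB")
cell_w, cell_h = img.width // GRID, img.height // GRID

mosaic = Image.new("RGB", (GRID, GRID))
for gy in range(GRID):
    for gx in range(GRID):
        # the center pixel of each grid cell decides the brick color for that tile
        cx = gx * cell_w + cell_w // 2
        cy = gy * cell_h + cell_h // 2
        mosaic.putpixel((gx, gy), img.getpixel((cx, cy)))

mosaic.resize((320, 320), Image.NEAREST).save("mosaic_preview.png")
```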
Adobe integrated new Firefly AI capabilities into Acrobat, allowing users to create and edit images within PDFs using text prompts — also adding the ability to access an AI assistant for insights, content creation, and more. https://news.adobe.com/news/news-details/2024/Adobe-Reimagines-Acrobat-Bringing-Firefly-AI-to-PDFs-and-Expanding-Use-Across-More-Document-Types/default.aspx
Reuters Institute of Journalism published a new report finding growing public wariness of AI-generated news content, with many expressing discomfort about its potential impact on content reliability and trust. https://www.reuters.com/technology/artificial-intelligence/global-audiences-suspicious-ai-powered-newsrooms-report-finds-2024-06-16
The U.S. Navy is deploying AI-powered underwater drones to better detect threats, with plans to expand the tech’s use in identifying enemy ships and aircrafts. https://finance.yahoo.com/news/ai-equipped-underwater-drones-helping-153947268.html
Luma teased new control features coming to its Dream Machine video model, including the ability to quickly change scenes and precisely edit characters — also launching the ability to extend video and remove watermarks. https://x.com/LumaLabsAI/status/1802678788490494128
Anthropic published new research showing that AI models can engage in ‘reward tampering’, learning to cheat the system and grant higher rewards even without specific training. https://www.anthropic.com/research/reward-tampering
A Daily chronicle of AI Innovations June 17th 2024:
McDonald’s pauses AI drive-thru tests
Chinese startup creates realistic robots
Photographer wins AI-image contest with real picture, gets disqualified
NVIDIA’s AI model for synthetic data generation rivals GPT-4
Meta pauses AI model training in EU due to regulatory pushback
Spotify launches ‘Creative Labs’ to test Gen AI voiceover ads
Tiny LLM matches GPT-4 on math
McDonald’s pauses AI drive-thru tests
McDonald’s is ending its artificial intelligence drive-thru order-taking program and will shut it off in all participating restaurants by July 26, 2024.
The decision to end the AI program came after a thorough review, and McDonald’s Chief Restaurant Officer stated the company will seek broader voice ordering solutions.
Although the AI drive-thru program showed successes, McDonald’s aims to simplify operations and speed up service, with plans to decide on a new voice ordering solution by the end of the year.
Photographer wins AI-image contest with real picture, gets disqualified
A photographer named Miles Astray won an AI-image competition with a real photograph of a head-scratching flamingo, resulting in his disqualification.
His entry, titled F L A M I N G O N E, was mistakenly awarded third place by judges from notable institutions like the New York Times and Christie’s.
Astray aimed to demonstrate that human creativity and natural beauty surpass AI-generated content, sparking a discussion about the ethical implications of using AI in art.
NVIDIA’s AI model for synthetic data generation rivals GPT-4
NVIDIA has released Nemotron-4 340B, an open-source pipeline for generating high-quality synthetic data. It includes a base model trained on 9 trillion tokens, an instruct model, and a reward model.
The instruction model can generate diverse synthetic data that mimics real-world data.
The reward model then evaluates the generated data to filter out high-quality responses.
This interaction between the two models produces better training data over time.
Note: 98% of the training data used to fine-tune the Instruct model is synthetic and was created using NVIDIA’s pipeline.
In benchmarks such as MT-Bench, MMLU, GSM8K, HumanEval, and IFEval, the Instruct model generally performs better than other open-source models such as Llama-3-70B-Instruct, Mixtral-8x22B-Instruct-v0.1, and Qwen-2-72B-Instruct, and in some tests, it even outperforms GPT-4o.
It also performs comparably to or better than OpenAI’s GPT-4-1106 in human evaluations of various text tasks, such as summaries and brainstorming. The technical report provides detailed benchmarks.
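NVIDIA ships the models and pipeline through its own tooling, so the following is only an abstract sketch of the generate-then-score loop described above, with both model calls left as placeholders.

```python
# Abstract sketch of the generate-then-filter loop described above. The two model
# calls are placeholders; NVIDIA serves Nemotron-4 340B through its own endpoints.
from typing import Callable

def build_synthetic_dataset(
    prompts: list[str],
    generate: Callable[[str], str],      # Nemotron-4 340B Instruct (placeholder)
    score: Callable[[str, str], float],  # Nemotron-4 340B Reward (placeholder)
    threshold: float = 0.8,
    samples_per_prompt: int = 4,
) -> list[dict]:
    """Generate candidate responses and keep only those the reward model rates highly."""
    dataset = []
    for prompt in prompts:
        candidates = [generate(prompt) for _ in range(samples_per_prompt)]
        scored = [(score(prompt, c), c) for c in candidates]
        best_score, best = max(scored)
        if best_score >= threshold:
            dataset.append({"prompt": prompt, "response": best, "reward": best_score})
    return dataset
```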
Why does it matter?
This development allows businesses to create powerful, domain-specific LLMs without the need for extensive, costly real-world datasets. It has significant potential impacts across various industries, such as healthcare (drug discovery, personalized medicine, medical imaging), finance (fraud detection, risk assessment, customer service), manufacturing (predictive maintenance, supply chain optimization), and retail (personalized customer experiences).
Meta pauses AI model training in EU due to regulatory pushback
In response to the regulatory pressure from the Irish Data Protection Commission and the UK’s Information Commissioner’s Office, Meta has decided to pause its plans to train its large language model, Llama, using public content shared by Facebook and Instagram users in the European Union and the UK.
The regulators expressed concerns about Meta’s plan to use this user-generated content to train its AI systems without obtaining explicit user consent. Meta relied on a GDPR provision called “legitimate interests” to justify this data usage, but the regulators felt this was insufficient. Meta has decided to delay the launch of its AI chatbot in Europe until it can address the regulators’ concerns and establish a more transparent user consent process.
Why does it matter?
Meta’s inability to use EU user data for AI training is a setback for its regional AI ambitions. It could disadvantage Meta against competitors who can leverage such data. This situation highlights the ongoing tensions between tech companies’ desire to utilize consumer data for AI development and regulators’ efforts to protect user privacy. Striking the right balance between innovation and privacy will be a major challenge as the AI race intensifies.
Spotify launches ‘Creative Labs’ to test Gen AI voiceover ads
Spotify has launched a new in-house creative agency called “Creative Lab.” This agency will help brands and advertisers create custom campaigns for Spotify’s platform. Creative Lab teams in different markets will provide local insights and collaborate with brands to develop campaigns through workshops, inspiration sessions, and collaborative ideation.
In addition, Spotify is also testing a new AI tool called “Quick Audio” that will allow brands to create scripts and voiceovers using generative AI technology. This new capability will be integrated into Spotify’s ad manager platform, giving advertisers more options to produce audio ads for Spotify’s audience of over 615 million listeners.
Why does it matter?
This move emphasizes Spotify’s ambition to become a full-service advertising platform. Marketers and advertisers will have new creative and production capabilities available through Spotify to better reach the platform’s large and engaged user base in unique ways, including potentially using AI-generated audio ads. This could disrupt traditional advertising models and open new possibilities for how brands connect with consumers on audio platforms.
Chinese startup creates realistic robots
A Chinese startup called Ex-Robots is developing ultra-realistic humanoid robots that can closely mimic human facial expressions and emotions.
Ex-Robots’ lifelike humanoid robots can smile, frown, and replicate subtle facial movements using small motors embedded in the silicone faces.
The startup uses multimodal AI and algorithms to enable the robots to recognize and express emotions based on perceiving their environment.
Creating one humanoid robot currently takes 2-4 weeks, with costs as high as $275k per unit.
The company said near-term use cases include museum exhibits, with eventual roles in healthcare, therapy, and child services.
The advancement of ultra-realistic robots combined with the power of AI is about to create some bizarre new frontiers. Will humans be creeped out by an actual human-looking humanoid over mechanical options like Tesla’s Optimus? It’s unclear if the world is ready… but we’ll find out soon.
Tiny LLM matches GPT-4 on math
Shanghai AI Laboratory researchers just developed a new algorithm called MCT Self-Refine, which allowed a relatively small 8B-parameter model to achieve GPT-4 level performance on complex math.
MCT Self-Refine combines an 8B LLaMa model with Monte Carlo Tree Search, an AI technique famously used by Google DeepMind to master the game of Go.
The algorithm constructs a search tree, iteratively refining answers through expansion and self-evaluation, as sketched below.
On the GSM-Hard math word problem benchmark, infusing the Monte Carlo Tree Search took the 8B model from 25.5% accuracy to 45.5%.
The model also achieved 96.7% on the GSM8K benchmark, outperforming significantly larger models like GPT-4, Claude, and Gemini.
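The paper’s exact reward and selection formulas aren’t reproduced here; the sketch below only captures the overall select-expand-evaluate loop, with llm() and critique() as placeholder calls and a generic UCB-style selection score standing in for the paper’s formulation.

```python
# Schematic sketch of an MCTS-style self-refine loop over LLM answers, in the spirit
# of MCT Self-Refine. llm() and critique() are placeholders; selection uses a simple
# UCB-style score rather than the paper's exact formulation.
import math

def mct_self_refine(question, llm, critique, iterations=8, c=1.4):
    """llm(prompt) -> answer string; critique(question, answer) -> (score, feedback)."""
    root = llm(question)
    nodes = [{"answer": root, "score": critique(question, root)[0], "visits": 1}]
    for _ in range(iterations):
        total = sum(n["visits"] for n in nodes)
        # selection: pick the node with the best exploration/exploitation trade-off
        node = max(nodes, key=lambda n: n["score"] + c * math.sqrt(math.log(total) / n["visits"]))
        node["visits"] += 1
        # expansion: ask the model to rewrite its answer using the critic's feedback
        _, feedback = critique(question, node["answer"])
        refined = llm(f"{question}\nPrevious answer: {node['answer']}\n"
                      f"Feedback: {feedback}\nRewrite an improved answer.")
        nodes.append({"answer": refined, "score": critique(question, refined)[0], "visits": 1})
    return max(nodes, key=lambda n: n["score"])["answer"]
```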
The research shows it’s possible to boost math capabilities without massive scale — and reaching GPT-4 level performance with a model roughly 200x smaller is an impressive feat. If the approach proves to be a more efficient path to advanced reasoning, we could be on the cusp of a new wave of model acceleration.
Palantir Technologies – Software Engineer, New Grad: https://jobs.therundown.ai/jobs/63248966-software-engineer-new-grad-us-government
Fiddler AI – Staff AI Scientist: https://jobs.therundown.ai/jobs/63187617-staff-ai-scientist
Meta – Software Engineer – ML Systems: https://jobs.therundown.ai/jobs/63275347-software-engineer-ml-systems-technical-leadership
What Else Is Happening in AI on June 17th 2024
Apple enters the AI icon race to find a logo that makes sense
Apple has joined other tech giants like Google, OpenAI, Anthropic, and Meta in the race to find an iconic visual representation for AI. No company has yet created an unambiguous “AI logo” that conveys the concept to users. AI’s lack of a clear visual identity reflects the difficulty of representing such a broad and evolving technology in a simple icon.
Niloom.AI launches gen AI content creation platform for spatial computing
Without extensive technical expertise, the platform allows users to create, prototype, edit, and instantly publish sophisticated AR/VR content using text or speech prompts. It consolidates the entire creative process, from ideation to publishing, and integrates with various third-party tools to provide a one-stop solution for spatial computing content creation.
AI to delete abusive posts against athletes during the 2024 Paris Olympics
The International Olympic Committee (IOC) will deploy AI at the 2024 Paris Olympics to automatically detect and erase abusive social media posts directed at athletes and officials. The AI tool will monitor posts about 15,000 athletes and officials and immediately remove any content involving hate speech, bullying, or political attacks.
Picsart and Getty team up to counter Adobe’s “commercially-safe” AI
Picsart has partnered with Getty Images to develop a “responsible, commercially-safe” AI image generator tool. The AI model will be trained exclusively on Getty’s licensed stock content to address concerns about AI-generated content violating copyright laws. Picsart hopes to provide a viable alternative to Adobe’s Firefly by leveraging Getty’s library of licensed images. Source: https://www.theverge.com/2024/6/14/24178333/picsart-getty-commercially-safe-ai-image-tool-adobe
Yahoo News gets an AI-powered revamp with Artifacts integration
Yahoo has acquired the technology behind the Artifact news aggregation app and is launching a new AI-powered Yahoo News app. The app will feature a personalized news feed based on user interests and a “Key Takeaways” feature that provides bullet-point summaries of articles. Users can also flag problematic content, which the AI will then try to rewrite.
A Daily chronicle of AI Innovations June 14th 2024:
Google’s PH-LLM reads your wearables’ data for personalized insights
Ex-OpenAI researcher on what to expect from AI in next decade
DeepMind built ‘a virtual rodent’ with AI to understand brain activity
AI chatbots run for office
LinkedIn is rolling out new AI-powered features for premium users
Synthflow’s AI voice assistants are now multilingual!
Picsart is partnering with Getty Images to develop a custom model for AI imagery
Former head of NSA joins OpenAI’s Safety and Security Committee
Google’s PH-LLM reads your wearables’ data for personalized insights
Building on the next-gen capabilities of Gemini models, Google has presented research that highlights two complementary approaches to providing accurate personal health and wellness information with LLMs.
The first introduces PH-LLM, a version of Gemini fine-tuned to understand and reason on time-series personal health data from wearables such as smartwatches and heart rate monitors. The model answered questions and made predictions noticeably better than experts with years of experience in the health and fitness fields.
In the second paper, Google introduces an agent system that leverages state-of-the-art code generation and information retrieval tools to analyze and interpret behavioral health data from wearables. Combining these two ideas will be critical for developing truly personalized health assistants.
Why does it matter?
Wearables generate a wealth of personal health data that is rarely utilized in clinical settings. Integrating this data with advanced AI models could revolutionize personal health management and preventative care by putting an “expert health assistant” on everyone’s wrist.
Ex-OpenAI researcher on what to expect from AI in next decade
A researcher fired from OpenAI, Leopold Aschenbrenner, published a 165-page essay on what to expect from AI in the next decade. And GPT-4 has summarized it! Here are some key takeaways from the essay:
By 2027, AI models could reach the capabilities of human AI researchers and engineers, potentially leading to AI surpassing human intelligence
Trillions of dollars are being invested into developing the infrastructure needed to support these AI systems
Controlling AI systems smarter than humans (the ‘superalignment’ problem) will be crucial to prevent catastrophic outcomes
Only a few hundred people truly understand the scale of change AI is about to bring
Why does it matter?
The essay provides a rare insider’s perspective on the rapid progression of AI. Coming from someone deeply involved in cutting-edge AI development, the insights highlight the urgency to get ahead of managing risks before AI’s capabilities outpace our defenses.
DeepMind built ‘a virtual rodent’ with AI to understand brain activity
Researchers from Google DeepMind and Harvard built a ‘virtual rodent’ powered by AI to help them better understand how the brain controls movement. With deep reinforcement learning (RL), it learned to operate a biomechanically accurate rat model, allowing researchers to compare real and virtual neural activity.
Why does it matter?
Understanding how the brain controls movement and modeling neural activity could exponentially advance fields like neuroscience and brain-computer interfaces, with the help of AI.
AI chatbots run for office
An AI-powered candidate named ‘AI Steve’ is running for U.K. Parliament in next month’s general election — creating polarizing questions around AI’s use in government affairs.
AI Steve is represented by businessman Steve Endacott and will appear as an independent candidate in the upcoming election.
Voters can interact with AI Steve online to ask policy questions and raise concerns or suggestions, which the AI will incorporate based on feedback.
If elected, Endacott will serve as AI Steve’s human proxy in Parliament, attending meetings and casting votes based on the AI’s constituent-driven platform.
The idea of an AI running for office might sound like a joke, but the tech behind it could actually help make our politicians more independent and (ironically) autonomous. AI-assisted governance is likely coming someday, but it’s probably still a bit too early to be taken seriously.
OpenAI CTO Mira Murati participated in a Q&A with Fortune Magazine, addressing topics including criticism from Elon Musk, internal AI model strength, the 2023 board drama and more.
Murati said both Apple and OpenAI believe strongly in privacy and will not log data through Apple accounts or train models on user data.
Murati responded to Musk calling the Apple partnership ‘creepy spyware’, saying OpenAI is trying to be as transparent as possible with its approach to safety.
When asked about the board drama, Murati said the previous board structure ‘didn’t have accountability to anyone but themselves’.
The CTO also curiously said that OpenAI’s internal models ‘aren’t that far ahead’ of what the public has for free right now.
This had to be a harder-hitting Q&A than Murati expected — but her responses certainly evoked similar vibes to the infamous Sora interview. The commentary on internal models was particularly interesting, as it would imply that OpenAI doesn’t have some GPT-5 level model waiting in the wings.
Researchers from Google DeepMind and Harvard just created a virtual rodent powered by an AI neural network, capable of mimicking agile movements and neural activity of real-life rats with extreme accuracy.
The researchers created a bio-mechanically realistic digital rat model and trained an AI ‘brain’ to control its body in a physics simulator.
The AI’s brain activity patterns closely matched real rodents engaged in similar behaviors, showing it can capture principles of muscle control and movement.
This ‘virtual neuroscience’ approach could open new research avenues through testing on AI animals, and could also transfer to the engineering of advanced robotics systems.
By bridging the gap between motor control and the complexity of brains, these virtual rodents offer a new window into how neural activity generates movement. Our understanding of the brain and neural activity is about to expand exponentially with the increasing power of AI.
Meta – GenAI Content Manager: https://jobs.therundown.ai/jobs/60547830-genai-content-manager
What Else Is Happening in AI on June 14th 2024
Former head of NSA joins OpenAI’s Safety and Security Committee
Paul M. Nakasone, a retired US Army general and a former head of the National Security Agency (NSA), will also join OpenAI’s board of directors. He will contribute to OpenAI’s efforts to better understand how AI can be used to strengthen cybersecurity by quickly detecting and responding to cybersecurity threats.
Former Meta engineers launch Jace, your new autonomous AI employee
Jace uses Zeta Labs’ proprietary web-interaction model, Autonomous Web Agent-1, to use a browser to interact with websites like any human would. It allows it to handle real-world tasks like booking flights, handling hiring, or even setting up a company.
LinkedIn is rolling out new AI-powered features for premium users
The features include searching for jobs by prompting in natural language, building a cover letter from scratch, reviewing your résumé with personalized suggestions for improving it for a specific job post, and making edits interactively with AI.
Synthflow’s AI voice assistants are now multilingual!
They can fluently communicate in Spanish, German, Portuguese, French, and English. Synthflow also added corresponding voices for each language to ensure authentic and natural-sounding interactions, so businesses can engage a global audience and offer personalized experiences.
Picsart is partnering with Getty Images to develop a custom model for AI imagery
The model will be built from scratch and trained exclusively on Getty Images’ licensed creative content. It will bring responsible AI imagery to creators, marketers, and small businesses that use Picsart, enabling them to generate unique images with full commercial rights. Source: https://techcrunch.com/2024/06/13/picsart-partners-with-getty-images-to-develop-a-custom-ai-model/
A Daily chronicle of AI Innovations June 13th 2024:
Apple to ‘pay’ OpenAI through distribution, not cash
New free rival to OpenAI Sora unveiled
⌚️ Google AI gets personal with wearables
Perplexity was planning revenue-sharing deals with publishers when it came under media fire
Amazon ‘dropped the ball’ with Alexa, former employee says
Tim Cook says it’s a ‘significant possibility’ people use their iPhones less because of AI
Luma AI’s dream machine debut sparks AI media wars
Stability AI’s Diffusion 3 Shakes Up Image Creation
No Cash Involved! Apple will pay OpenAI in distribution
NVIDIA tops MLPerf benchmarks, dominates AI training
Databricks Boosts Mosaic AI for Enterprise LLMs
ChromeOS gets GenAI boost via Android tech
YouTube trials AI comment summaries for shorts
Luma AI’s dream machine debut sparks AI media wars
Luma AI has launched Dream Machine, an AI-powered video generation system capable of producing high-quality videos from simple text prompts. The open-for-all tool takes about two minutes to generate a realistic five-second clip, rendering the specified objects and environments without compromising coherence.
Why does it matter?
The tool’s open availability highlights Luma AI’s approach to democratizing generative AI and puts rival systems in the spotlight, including OpenAI’s Sora and Kuaishou’s Kling, which are only accessible to select users.
Additionally, Luma’s open approach will enable creators and companies to produce original content at a rapid speed and scale, giving it a competitive edge.
Source: https://lumalabs.ai/dream-machine
⌚️ Google AI gets personal with wearables
Google just published two new research papers, introducing an AI model that can interpret personal health data from wearable devices and an AI agent that provides detailed insights, answer health questions, and more.
Google’s PH-LLM is a version of Gemini fine-tuned to reason using data from wearables and generate coaching insights for sleep and fitness.
PH-LLM performs comparably to human experts on health insights for fitness and sleep, also achieving expert performance on certification exams.
Google also revealed PHIA, an AI agent that combines Gemini’s language skills with code and search capabilities to analyze wearable health data.
PHIA scored 84% on health insight questions, demonstrating strong reasoning and data analysis capabilities on complex queries.
Google’s research showcases AI’s potential to take health wearable tracking to the next level — going beyond generic insights to enable personalized support. Once ingrained into popular devices, these types of AI tools will put an expert fitness and nutrition coach on the wrists of every user.
Stability AI’s Diffusion 3 Shakes Up Image Creation
Stability AI has launched its most sophisticated text-to-image AI open model, optimized for consumer PCs and enterprise-tier GPUs. The model delivers several noteworthy features:
Enables photorealistic and high-quality outputs in exceptional style by providing images with extraordinary color, lighting, and details.
Can comprehend long, complex prompts that involve spatial reasoning, compositional elements, actions, and styles.
Leverages Diffusion Transformer architecture to achieve exceptional text quality, reducing spelling, character spacing, and letter-forming errors.
Displays the ability to absorb minute details from small datasets, making it an ideal model for customizing and fine-tuning.
Why does it matter?
These features make Stable Diffusion 3 one of the most advanced open text-to-image models. As Stability AI continues to push the boundaries of generative AI, it will be interesting to see whether the model stands out against its competitors and keeps Stability at the forefront of image generation.
No Cash Involved! Apple will pay OpenAI in distribution
The financial terms of the landmark arrangement between OpenAI and Apple to integrate ChatGPT into iPhone, iPad, and Mac were initially unclear. People briefed on the matter suggest that the agreement isn’t likely to generate revenue for either party; it simply gives OpenAI brand exposure to Apple’s user base while giving Apple an advanced chatbot feature.
Why does it matter?
Apple integrating ChatGPT into its devices without offering any compensation to OpenAI raises concerns among users about whether Apple intends to share their data in return.
Apple to ‘pay’ OpenAI through distribution, not cash
Apple and OpenAI announced a partnership at WWDC, but according to Bloomberg, neither side is paying the other for the integration of ChatGPT into Apple’s devices.
The agreement between Apple and OpenAI is non-exclusive, and Apple is also negotiating with Anthropic and Google to integrate their chatbots, with a deal for Google’s Gemini expected later this year.
Apple plans to generate revenue from these partnerships through revenue-sharing agreements, taking a cut from AI providers that monetize their chatbot services, such as ChatGPT Plus’ $20-per-month subscription plan.
New free rival to OpenAI Sora unveiled
Dream Machine is a new text-to-video tool developed by Luma AI, offering a free tier you can use immediately with a Google account, though there might be waiting times due to high demand.
The tool creates five-second video clips at 1360×752 resolution based on user prompts, and while it has some processing delays, the outputs are impressive and close to user expectations.
Dream Machine offers various subscription plans, with the free option limited to 30 generations a month, and higher tiers providing more generations at different price points, up to 2,000 generations for $499.99 monthly.
Perplexity was planning revenue-sharing deals with publishers when it came under media fire
Perplexity, an AI search engine startup, was working on revenue-sharing deals with high-quality publishers before facing accusations from Forbes of content misuse.
The company aims to announce the details of these partnerships soon and has already updated its user interface to make citation of sources more prominent.
Perplexity’s planned revenue-sharing framework will offer media companies a recurring income alternative, contrasting with the one-off data usage deals currently offered by entities like OpenAI.
Amazon ‘dropped the ball’ with Alexa, former employee says
A former Amazon employee, Mihail Eric, criticized the company’s handling of Alexa, citing technical and bureaucratic problems as major hindrances to its development.
Eric stated that despite having ample resources and talent, Amazon mishandled its opportunity to make Alexa the leading product in conversational AI.
Organizational issues, such as fragmented teams and inadequate computing resources, were mentioned by Eric as significant factors that prevented Alexa from achieving its full potential.
Tim Cook says it’s a ‘significant possibility’ people use their iPhones less because of AI
Apple CEO Tim Cook said Apple Intelligence may reduce iPhone usage in an interview with Marques Brownlee.
Cook highlighted that AI will simplify previously time-consuming tasks, allowing users to complete them more efficiently and quickly.
Apple has added several features to the iPhone that raise awareness of how much time consumers spend on their devices and which help make their usage more intentional.
YouTube trials AI comment summaries for shorts: YouTube is trying out a new AI-powered comment summarizer feature on mobile apps. The feature breaks down comments into keyword-based topics, simplifying users’ engagement with Shorts with large comment sections. Source: https://support.google.com/youtube/thread/18138167
NVIDIA tops MLPerf benchmarks, dominates AI training: NVIDIA has set new performance records in MLPerf Training v4.0, showcasing its dominance in generative AI and LLM training. Key achievements include tripling GPT-3 175B training performance, fine-tuning Llama 2 70B model in just 1.5 minutes using 1,024 H100 GPUs, and accelerating Stable Diffusion v2 training performance by up to 80%. Source: https://blogs.nvidia.com/blog/mlperf-training-benchmarks
Databricks Boosts Mosaic AI for Enterprise LLMs: Databricks has announced enhancements to Mosaic AI to accelerate its efforts in enterprise gen AI development. The focus lies on the development of compound AI systems, their evaluation across metrics, and governance. Releases include Mosaic AI Model Training and Agent Framework, strengthening the company’s offering against Snowflake. Source: https://venturebeat.com/data-infrastructure/databricks-bolsters-mosaic-ai-with-tools-to-build-and-evaluate-compound-ai-systems
No more vagueness – Adobe to update its AI terms: Adobe has promised to update its terms of service amidst user backlash, assuring that it will not train generative AI models on customer content. The company clarified that it does not own customer content and will refrain from using it to train AI models. Source: https://blog.adobe.com/en/publish/2024/06/10/updating-adobes-terms-of-use
ChromeOS gets GenAI boost via Android tech: Google plans to release Android-based tech stack into ChromeOS to accelerate AI innovation. The release will simplify engineering efforts, helping phones and accessories work better with Chromebooks. Through this release, Google seeks to continue rolling out its AI features to users at a faster and larger scale. Source: https://blog.chromium.org/2024/06/building-faster-smarter-chromebook.html
A Daily chronicle of AI Innovations June 12th 2024:
Elon Musk drops OpenAI lawsuit with no explanation
Microsoft kills off Copilot GPT Builder after just 3 months
Stability AI launches its ‘most sophisticated’ image generator yet
Microsoft to discontinue GPT Builder from Copilot Pro
Musk drops lawsuit alleging OpenAI strayed from its mission
OpenAI leverages Oracle Cloud to extend the Microsoft Azure Al platform
Google to launch anti-theft AI features for android phones in Brazil
Google expands Gemini Nano AI capabilities to Pixel 8 & 8a
Yahoo Mail integrates AI for a streamlined desktop experience
Samsung unites North American AI teams, appoints ex-Apple Siri executive
Elon Musk drops OpenAI lawsuit with no explanation
Elon Musk has dropped his lawsuit against OpenAI and its CEO, Sam Altman, which accused the company of breaching a contract and deviating from its mission to benefit humanity.
The case was dismissed without prejudice, meaning Musk can refile it in the future; this decision came just a day before a scheduled hearing on OpenAI’s request to dismiss the case.
Musk’s lawsuit alleged a breached “Founding Agreement” with OpenAI, but the company denied this, asserting there is no binding agreement and claiming Musk sought control by merging it with Tesla.
Microsoft kills off Copilot GPT Builder after just 3 months
Microsoft has announced the discontinuation of GPT Builder and is giving users one month to save their data before deletion.
GPT Builder, part of Microsoft’s Copilot Pro, allowed users to create and customize various GPTs for $20 per month but will no longer be available after July 10.
Despite retiring GPT Builder, Microsoft remains focused on enterprise GPT applications and will not adjust Copilot Pro subscription fees or hand the project over to the community.
Stability AI launches its ‘most sophisticated’ image generator yet
Stability AI launched Stable Diffusion 3 Medium, described as its “most advanced text-to-image open model yet,” featuring 2 billion parameters for photorealistic output on consumer systems.
The new model overcomes common artefacts in hands and faces, accurately understands complex prompts, and enhances typography through Diffusion Transformer architecture.
Users can access SD3 Medium via Stability’s API under an open non-commercial license, while commercial usage requires contacting the startup for licensing details.
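The weights are also distributed on Hugging Face; assuming that route, a minimal diffusers sketch looks like the following. The model ID follows Stability AI’s model card at release, the weights are license-gated (authenticate and accept the terms first), and a CUDA GPU is assumed.

```python
# Minimal sketch: running Stable Diffusion 3 Medium locally with diffusers.
# Model ID follows Stability AI's Hugging Face card at release; the weights are
# license-gated, so accept the terms and authenticate with huggingface-cli first.
import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    prompt="a photorealistic photo of a red fox reading a newspaper, soft morning light",
    negative_prompt="blurry, low quality",
    num_inference_steps=28,
    guidance_scale=7.0,
).images[0]
image.save("sd3_fox.png")
```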
Autonomous driving by musculoskeletal humanoids: A study
The research paper discusses the development of a musculoskeletal humanoid robot, Musashi, designed to perform autonomous driving tasks. It mimics the human body in detail, with redundant sensors and a flexible body structure suited to motions involving complex environmental contact.
The robot is expected to sit on the car seat, step on the acceleration and brake pedals, and operate the steering wheel with both arms.
The paper extensively discusses the hardware and learning-based software necessary for this unique approach to autonomous driving.
Why does it matter?
The learning-based software discussed in the paper highlights the adaptability of musculoskeletal humanoids. AI models could learn from both structured data and unstructured sensory inputs, potentially improving generalization and handling novel scenarios.
Google to launch anti-theft AI features for android phones in Brazil
The features include three lock modes: AI Movement Detection (recognizes common theft-related movements), Remote Lock, and Internet Access Timeout (automatically locks the screen if the device lacks internet access for an extended period). Starting in July 2024, Brazilian users with Android phones running version 10 or higher will have access to these features. Source: https://www.reuters.com/technology/artificial-intelligence/google-test-anti-theft-ai-feature-phones-brazil-2024-06-11
Google expands Gemini Nano AI capabilities to Pixel 8 & 8a
This update includes features like “Summarize in Recorder” and “Smart Reply.” Users need to access developer options to enable these features. Additionally, the update enhances Find My Device functionality and allows Display Port output via USB-C. Older Pixel devices, such as the Pixel Fold and Pixel 6, also receive new features. https://www.theverge.com/2024/6/11/24176127/google-pixel-feature-drop-june-gemini-nano
Yahoo Mail integrates AI for a streamlined desktop experience
The enhancements include AI-generated email summaries, a “Priority” inbox tab for urgent messages, quick action buttons, and the ability to link other email accounts within Yahoo Mail. Existing users can opt in, while new users have immediate access. Source: https://www.theverge.com/2024/6/11/24175280/yahoo-mail-generative-ai-gmail
Samsung unites North American AI teams, appoints ex-Apple Siri executive
Samsung merged its AI research centers in Toronto and California into a single unit, the North America AI Center. To lead this new North American AI Center, Samsung recruited Murat Akbacak, a former Siri executive at Apple. Source: https://www.bloomberg.com/news/articles/2024-06-11/samsung-taps-apple-siri-veteran-to-lead-north-american-ai-group
A Daily chronicle of AI Innovations June 11th 2024:
Apple brings ChatGPT to Siri
Apple reveals macOS Sequoia with iPhone mirroring, new Passwords app, and more
Elon Musk threatens Apple device ban after OpenAI partnership
Apple introduces Apple Intelligence, its new Gen AI tech
Musk threatens to ban Apple devices over ChatGPT integrations
OpenAI hires Nextdoor and Instagram veterans as CFO and CPO
Apple brings ChatGPT to Siri
Apple introduced AI updates for Siri at WWDC 2024, enabling it to perform actions across apps, manage notifications, automatically write and summarize text, and understand user context with “onscreen awareness.”
Siri’s new capabilities include referencing one app to perform actions in another, such as playing a podcast your partner sent, changing the tone of written text with a prompt, and allowing users to interact by typing as well as speaking.
Apple emphasizes privacy with on-device processing for many features, while more complex requests are handled by “Private Cloud Compute” on Apple Silicon-powered servers, with independent experts verifying the code running on its servers.
Apple reveals macOS Sequoia with iPhone mirroring, new Passwords app, and more
Apple announced macOS Sequoia, featuring major updates such as iPhone mirroring that allows users to control their iPhone from their Mac.
Sequoia introduces a stand-alone Passwords app, syncable with other Apple devices and Windows PCs via iCloud, along with new Safari improvements like Highlights for extracting useful information.
Apple also revealed Game Porting Toolkit 2 for better Windows game compatibility, and the macOS Sequoia beta will be available this month with a full release expected in the fall.
Elon Musk threatens Apple device ban after OpenAI partnership
Elon Musk has threatened to ban Apple devices at his companies because Apple plans to integrate OpenAI’s ChatGPT into its products.
Musk argues that allowing OpenAI access through Apple devices poses a security risk and criticizes Apple for not developing its own AI technology.
Musk’s discontent also stems from his ongoing disputes with OpenAI, including past legal actions and competitive tensions with his own AI ventures like xAI and its Grok chatbot.
Apple introduces Apple Intelligence, its new Gen AI tech
Apple unveiled Apple Intelligence at WWDC 2024, a personal intelligence system deeply integrated into iOS 18, iPadOS 18, and macOS Sequoia. Leveraging Apple silicon’s power, it understands and creates language/images, takes actions across apps, and simplifies tasks using personal context. With Private Cloud Compute, Apple sets a new privacy standard in AI by flexibly processing between on-device and dedicated Apple silicon servers.
Apple Intelligence will power new features like AI-generated Bitmojis, natural language images, and video search, as well as enhanced integration with apps and Siri.
Apple also partnered with OpenAI to integrate ChatGPT with its apps, including Siri. This will allow users to directly access GPT-4 through Apple’s platforms. Additionally, Apple confirmed plans to work with other models in the future, including Google’s Gemini. The iPhone 15 Pro and devices with M1 or newer chips will be the first to receive the full suite of Apple Intelligence features.
Siri Upgrades
A next-gen Siri will converse more naturally, remember context across requests, and accomplish more complex tasks by better understanding both voice and text.
Siri also gains ‘onscreen awareness’, with the ability to take actions and utilize on-device info to better tailor requests to the individual user.
New AI Features
New AI writing tools built into apps like Mail, Messages, and Notes will allow users to auto-generate and edit text.
Mail will utilize AI to better organize and surface content in inboxes, while Notes and Phone gain new audio transcription and summarization capabilities.
AI-crafted ‘Genmojis’ enable personalized text-to-image emojis, and a new “Image Playground” feature introduces an image generation tool from prompts.
Photos get more conversational search abilities, the ability to create photo ‘stories’, and new editing tools.
Privacy
A focus of the AI reveal was privacy — with new features leveraging on-device processing when possible and Private Cloud Compute for more complex tasks.
Private Cloud Compute (PCC) is Apple’s new intelligence system specifically for private AI processing on the cloud.
The new AI features will be opt-in, so users will not be forced to adopt them.
OpenAI Integration
The OpenAI partnership will allow Siri to leverage ChatGPT/GPT-4o when needed for more complex questions.
OpenAI’s blog also outlined additional ChatGPT tools like image generation and document understanding embedded into the new OS.
Why does it matter?
This is generative AI as only Apple can deliver it: placed at the core of its products, accessing users' personal data for truly helpful assistance while keeping it secure through on-device processing and Private Cloud Compute. Apple's long-standing focus on user privacy and exceptional UX could inspire a new era of AI development.
Musk threatens to ban Apple devices over ChatGPT integrations
Elon Musk threatens to ban iPhones and other Apple devices from his companies, such as Tesla, SpaceX, and xAI, over Apple’s announcement of ChatGPT integration into its OS. Musk thinks this integration will allow ChatGPT to access too much private user data through Apple devices. He claims Apple doesn’t understand OpenAI’s tech and is putting user privacy at risk.
However, Apple and OpenAI stated that users will be asked permission before any data is sent to ChatGPT.
Why does it matter?
This highlights Musk's skepticism about the privacy implications of Apple's AI strategy. It could also significantly impact Apple users in the future, as they may no longer be able to use iPhones and other Apple devices at Musk's companies.
OpenAI hires Nextdoor and Instagram veterans as CFO and CPO
OpenAI has hired two experienced executives to fill important leadership roles. Sarah Friar, previously the CEO of the neighborhood app Nextdoor, was appointed OpenAI's new chief financial officer (CFO). Kevin Weil, a former Instagram and Twitter executive, has been named OpenAI's chief product officer (CPO) and will oversee its product teams.
As CFO, Friar will manage OpenAI’s financial operations and investments as the company expands globally. As CPO, Weil will lead product development utilizing OpenAI’s AI research to create new products for consumers and enterprise customers.
Why does it matter?
This indicates OpenAI’s ambition to scale as it commercializes its AI technology. It also suggests that OpenAI is solidifying its position as a major force in the AI industry while navigating the recently surfaced challenges around AI safety.
A new study from the University of Michigan leveraged AI models trained on human speech to decode the meaning behind dog noises — identifying details like breed, age, gender, and emotional state with 70% accuracy.
Researchers gathered vocalizations from 74 dogs of varying breeds, ages, and situational contexts.
The noises were inputted into an AI model originally designed to analyze human voices, trained on 960 hours of speech, and fine-tuned for dogs.
The AI was able to predict individual dogs from barks, distinguish dog breed and gender, and match barks to emotional context like play and aggression with 70% accuracy.
AI is not only bridging the language gap for humans across the globe — but also potentially across species as well. Communicating with other intelligent animals (or at least better understanding them) seems like a skill issue that’s going to be solved sooner rather than later.
Source: https://arxiv.org/pdf/2404.18739
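The approach is essentially transfer learning from human speech to dog vocalizations. The sketch below is not the authors' code; it only shows, with an assumed checkpoint and an illustrative label set, how a speech model could be repurposed as a bark classifier using Hugging Face transformers (the classification head would still need fine-tuning on labeled bark recordings).

```python
# A minimal sketch (not the study's code) of repurposing a human-speech model
# for dog vocalizations. The checkpoint and label set below are illustrative,
# and the new classification head is randomly initialized until fine-tuned.
import torch
from transformers import AutoFeatureExtractor, Wav2Vec2ForSequenceClassification

labels = ["play", "aggression", "alert", "whine"]  # hypothetical emotional contexts
extractor = AutoFeatureExtractor.from_pretrained("facebook/wav2vec2-base")
model = Wav2Vec2ForSequenceClassification.from_pretrained(
    "facebook/wav2vec2-base", num_labels=len(labels)
)

def classify_bark(waveform, sampling_rate=16_000):
    """Return the predicted context for a single mono bark recording."""
    inputs = extractor(waveform, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return labels[int(logits.argmax(dim=-1))]
```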
New AI Job Opportunities on June 11th 2024
Fiddler AI – Technical Project Manager, Customer Success: https://jobs.therundown.ai/jobs/62351203-technical-project-manager-customer-success-remote-east-coast
Cohere – Research Intern (PhD): https://jobs.therundown.ai/jobs/62351142-research-intern-phd
Meta – Research Engineer, Conversational AI (Reality Labs): https://jobs.therundown.ai/jobs/62419977-research-engineer-conversational-ai-reality-labs
Palantir Technologies – Product Designer, US Government: https://jobs.therundown.ai/jobs/62398950-product-designer-us-government
What Else Is Happening in AI on June 11th 2024
AI could save financial advisors 10-15 hours per week
Morgan Stanley is using an AI tool to transcribe and categorize client conversations, saving financial advisors 10-15 hours per week. It allows advisors to better prepare for client meetings by reminding clients of past discussions and anticipating their needs. The AI tool can automatically summarize meetings and generate follow-up emails.
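Morgan Stanley's tool is internal, so the snippet below is only a minimal sketch of the general transcribe-then-summarize pattern it describes, using publicly documented OpenAI endpoints; the model names are illustrative, not what the bank actually uses.

```python
# Not Morgan Stanley's tool; a minimal sketch of the transcribe-then-summarize
# pattern described above, using the public OpenAI API.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def summarize_meeting(audio_path: str) -> str:
    # 1) Transcribe the recorded client conversation.
    with open(audio_path, "rb") as f:
        transcript = client.audio.transcriptions.create(model="whisper-1", file=f)
    # 2) Summarize the transcript and draft a follow-up email.
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You summarize advisor-client meetings."},
            {"role": "user", "content": "Summarize the key points and draft a short "
             "follow-up email:\n\n" + transcript.text},
        ],
    )
    return response.choices[0].message.content
```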
Anthropic reveals Claude 3’s character training process
Anthropic has developed a “character training” process for its AI assistant, Claude, to instill traits like curiosity, open-mindedness, and thoughtfulness. It aims to make Claude more discerning in its interactions, able to express its own views while remaining open to other perspectives.
FanVue introduces the world’s first AI beauty pageant contest
Influencer marketing platform FanVue hosted the world’s first AI-powered beauty pageant, “Miss AI.” Over 1,500 AI-generated digital models from around the world are competing for the title and a $20,000 prize pool. The contest aims to showcase the capabilities of AI in creating photorealistic digital avatars that can function as influencers and brand ambassadors.
Perplexica provides an open-source AI search alternative to Perplexity
Perplexica uses LLMs, machine learning algorithms, and embedding models to provide refined search results and clear answers from cited sources. It also offers several “Focus Modes” tailored to specific question types.
Source: https://github.com/ItzCrazyKns/Perplexica
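Perplexica's actual implementation is in the repo above; the following is only a generic sketch of the embed, retrieve, and answer-with-citations loop that tools like it are built around. The embedding model and the `ask_llm` callable are placeholders, not Perplexica's components.

```python
# Not Perplexica's code (see the repo above); a generic sketch of the
# embed -> retrieve -> answer-with-citations loop such tools implement.
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

def answer(query: str, documents: list[str], ask_llm) -> str:
    """Rank documents by cosine similarity, then ask an LLM to answer with citations.

    `ask_llm` is a placeholder callable (prompt -> str) for whatever model you use.
    """
    doc_vecs = embedder.encode(documents, normalize_embeddings=True)
    q_vec = embedder.encode([query], normalize_embeddings=True)[0]
    top = np.argsort(doc_vecs @ q_vec)[::-1][:3]          # three best matches
    context = "\n\n".join(f"[{i}] {documents[i]}" for i in top)
    prompt = (f"Answer the question using only the sources below and cite them "
              f"by index.\n\nSources:\n{context}\n\nQuestion: {query}")
    return ask_llm(prompt)
```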
AI tools secretly use real images of children for training
Over 170 images and personal details of Brazilian children have been repurposed by an open-source AI training dataset called LAION-5B without the children’s consent. The images were originally posted on mommy blogs and personal social media. Researchers are concerned these images could be used to generate deepfakes or reveal sensitive information.
How does Apple send your data to its cloud AI servers? Very carefully, it claims. https://www.engadget.com/how-does-apple-send-your-data-to-its-cloud-ai-servers-very-carefully-it-claims-233312425.html
A Daily chronicle of AI Innovations June 10th 2024:
Meta faces legal complaints over AI data plans
Alibaba’s Qwen2 AI models outperform GPT-4 & Llama-3
SAP & Nvidia are developing applications with AI & digital twins
Chinese tech giants exploit Nvidia AI chip loophole
Apple launches “Apple Intelligence” at WWDC 2024 for iPhone, iPad, and Mac
TCS launches TCS AI WisdomNext™, an industry-first GenAI aggregation platform
Human Native AI is building a marketplace for AI training licensing deals
Hugging Face and Pollen Robotics launched an open-source robot for household chores
Alibaba’s Qwen2 AI models outperform GPT-4 & Llama-3
Alibaba launched Qwen2 in five sizes ranging from 0.5B to 72B parameters. The models are trained on data in 27 additional languages besides English and Chinese and show state-of-the-art performance across benchmarks, with significantly improved results in coding and mathematics and context length support extended up to 128K tokens. Despite having fewer parameters, Qwen2-72B outperforms leading models like Llama-3-70B and its predecessor Qwen1.5-110B.
Qwen2-72B-Instruct performs comparably to GPT-4 in terms of safety and significantly outperforms Mistral-8x22B. The models are released under Apache 2.0 and Qianwen License on Hugging Face and ModelScope.
Why does it matter?
Qwen2 beats Meta’s model despite being trained on relatively fewer tokens. The researchers attribute this to greater effort put into data cleaning and training, implying innovative approaches on their end.
However, it also signals a slow shift in how LLMs are developed: from relying solely on the quantity of data to prioritizing data quality and training techniques.
Source: https://qwenlm.github.io/blog/qwen2
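For readers who want to try the release, the smaller instruct checkpoints can be loaded with Hugging Face transformers. The repo id below follows the naming used in the Qwen2 announcement; verify it on the Hub before running, as the exact id is an assumption here.

```python
# A minimal sketch of loading one of the smaller Qwen2 instruct checkpoints
# with Hugging Face transformers. The repo id follows the release's naming;
# confirm it on the Hub before running.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2-7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Write a short haiku about long context windows."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids=inputs, max_new_tokens=64)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```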
SAP & Nvidia are developing applications with AI & digital twins
At SAP’s Sapphire event in Orlando, Florida, SAP and NVIDIA announced their collaboration to enhance SAP’s generative AI copilot, Joule, with two new capabilities: SAP Consulting and ABAP Developer. These new features are powered by NVIDIA AI Enterprise software.
Additionally, SAP is integrating NVIDIA Omniverse Cloud APIs into its Intelligent Product Recommendation solution to simplify the buying and selling process for complex products. This integration will allow salespeople to visualize 3D product digital twins directly within the SAP Intelligent Product Recommendation interface, making it easier to understand the products.
Why does it matter?
Using NVIDIA Omniverse Cloud APIs in SAP’s Intelligent Product Recommendation solution accelerates the quote generation process and increases sales and customer satisfaction by enabling sales representatives to provide more accurate, tailored recommendations.
Chinese tech giants exploit Nvidia AI chip loophole
The U.S. government prohibits Nvidia from selling AI chips directly to Chinese companies due to national security concerns. Still, ByteDance is accessing Nvidia’s AI chips for its U.S. operations by leasing them from Oracle, as the current U.S. rules do not explicitly prohibit Chinese companies from accessing the chips if they are used within the U.S.
Other Chinese tech giants like Alibaba, Tencent, and China Telecom seek similar arrangements with U.S. cloud providers. The U.S. Commerce Department proposed a rule to tighten controls, but it faced opposition from cloud providers and remains in limbo.
Why does it matter?
Even if the loophole is closed, Alibaba and Tencent have discussed obtaining Nvidia chips for their U.S.-based data centers. It could further escalate the AI “arms race” and rivalry between the USA and China as both nations seek to outpace each other in developing advanced AI systems for economic and military advantages.
Apple will launch “Apple Intelligence” at WWDC 2024 for iPhone, iPad, and Mac
Leaks suggest Apple will reveal “Apple Intelligence”, aka AI, at the WWDC event this week. These AI features will focus on broad appeal and privacy, and will be opt-in rather than mandatory. Apple will use its own tech and OpenAI tools to power the new features.
TCS launches TCS AI WisdomNext™, an industry-first GenAI aggregation platform
The platform allows organizations to compare and experiment with GenAI models across cloud services in a single interface. It offers ready-to-deploy business solution blueprints with built-in guardrails for quick adoption.
A study by Harvard, MIT, and Wharton reveals junior staff is not reliable for AI training
Junior consultants who participated in a GPT-4 experiment struggled with AI risk mitigation, with their tactics lacking a deep understanding of the technology and focusing on changing human behavior rather than AI system design. The findings highlight the need for top-down AI governance, expert input, and upskilling across all levels of the organization.
Human Native AI is building a marketplace for AI training licensing deals
The platform helps AI companies find data to train their models while ensuring rights holders are compensated. Rights holders upload content for free and connect with AI companies for revenue share or subscription deals. Human Native AI helps prepare and price content, monitors for copyright infringements, and takes a cut of each deal.
Hugging Face and Pollen Robotics launched an open-source robot for household chores
The humanoid Reachy2 was initially controlled by a human wearing a VR headset. Then, a machine learning algorithm studied the teleoperation sessions to learn how to perform the tasks independently. The dataset and trained model used for the demo are open-sourced on Hugging Face, allowing anyone to replicate the process on smaller robots at home.
The World AI Creator Awards and creator platform FanVue are currently running the inaugural ‘Miss AI’ contest, with over 1,500 AI-generated models competing for the world’s first AI beauty pageant.
10 finalists were selected from a pool of 1,500 AI-generated contestants, with the winner set to be announced at the end of June.
The AI models ‘hail’ from countries across the globe, showcasing various causes and personalities in addition to their photorealistic images.
Judges will also evaluate the AI tech behind the avatars, including prompts/image outputs and the creator’s ability to engage audiences on social media.
The prize pool includes $20,000 in awards and access to PR and mentorship programs.
With how good AI image generation capabilities have become, we’re likely already being exposed to an influx of generated brand ambassadors and models without even realizing it. While the idea of an AI beauty pageant may sound strange… it’s probably only going to get weirder from here.
Microsoft is making changes to its new Recall on-device AI feature following security concerns, with the tool now switched off by default and gaining new encryption protections ahead of its release on June 18.
The Copilot+ Recall feature, which runs locally and takes constant screenshots to remember user actions, was unveiled at the Microsoft Build event in May.
The company faced backlash after experts warned of the tool’s ‘privacy nightmare’ with the potential to expose screenshots and activity to hackers.
The feature will now be off by default, with users having to opt in manually — with new encryption and authentication being added for extra safety measures.
The tool will roll out in ‘preview’ on the new Copilot + PCs set to launch on June 18.
While Recall was one of the flashiest features revealed at Build, it is also one of the most controversial. The screenshots enable memory and get us a step closer to the AI agent dream, but they are also a tough hurdle for increasingly privacy-focused consumers.
🔊 AI TRAINING: Using ‘Background Conversations’ in ChatGPT
ChatGPT’s latest update introduced the “Background Conversations” feature, allowing you to continue your chat even when using other apps or when your screen is off.
Install the latest ChatGPT app update on your phone.
Open the app and go to Settings.
Scroll to “Voice Mode” and toggle “Background Conversations” on.
Use ChatGPT hands-free while multitasking or with your screen off.
Note: If you still don’t see the option, the feature may not have rolled out to your account yet.
🧠 AI RESEARCH: Concise prompting cuts AI costs by 20%
Researchers just found that adding a simple ‘be concise’ instruction to chain-of-thought prompts can reduce the length of AI responses by 50% with minimal impact on accuracy — leading to over 20% cost savings on API calls.
Chain-of-thought (CoT) prompting has the AI explain its reasoning step-by-step before giving a final answer, improving accuracy on complex problems.
Researchers tested adding ‘be concise’ to CoT prompts on GPT 3.5 and 4, using 1,000 multiple-choice questions across 10 topics.
The instruction made responses about 50% shorter on average vs. normal CoT prompting, with no significant accuracy impact.
The approach also reduced per-query API costs by over 20% for both models, with the more concise prompts generating fewer tokens.
Next time your AI chatbot is getting a little too wordy, this one simple prompt trick is all you need! Despite the massive complexities and uncertainties surrounding how LLMs work, sometimes these strange, seemingly basic quirks make all the difference in getting the optimal outputs.
Source: https://arxiv.org/pdf/2401.05618
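As a rough illustration of the trick, the sketch below appends “Be concise.” to a chain-of-thought prompt and compares completion-token counts via the OpenAI API. The question and model name are arbitrary examples for demonstration, not the paper's benchmark setup.

```python
# A minimal sketch of the experiment's core idea: append "Be concise." to a
# chain-of-thought prompt and compare how many completion tokens come back.
from openai import OpenAI

client = OpenAI()
question = ("A train travels 120 km in 1.5 hours. What is its average speed in km/h? "
            "Answer choices: (A) 60 (B) 80 (C) 90 (D) 100")

def ask(instruction: str) -> int:
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user",
                   "content": f"{question}\nLet's think step by step. {instruction}"}],
    )
    print(resp.choices[0].message.content)
    return resp.usage.completion_tokens

verbose_tokens = ask("")             # standard chain-of-thought
concise_tokens = ask("Be concise.")  # same prompt plus the two-word instruction
print(f"Token reduction: {1 - concise_tokens / verbose_tokens:.0%}")
```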
New AI Job Opportunities on June 10th 2024:
🔬 Meta – Research Engineer, Conversational AI (Reality Labs): https://jobs.therundown.ai/jobs/62419977-research-engineer-conversational-ai-reality-labs
🎨 Palantir Technologies – Product Designer, US Government: https://jobs.therundown.ai/jobs/62398950-product-designer-us-government
A Daily chronicle of AI Innovations June 07th 2024:
OpenAI reverse engineers the workings of AI models
New Chinese video generation model beats OpenAI’s Sora
Nvidia is now the second-most valuable company, overtaking Apple
Adobe launches AEP AI Assistant to help brands master customer data and outreach
The most comprehensive benchmarking & leaderboard for image models is here!
AI used to predict potential new antibiotics in groundbreaking study
Meta gets EU complaints from 11 countries over use of personal data to train AI
The Federal Trade Commission is investigating Microsoft-Inflection AI deal
OpenAI reverse engineers the workings of AI models
In new research, OpenAI has shared improved methods for finding a large number of “features”—patterns of activity in AI models that are human interpretable. They developed new state-of-the-art methodologies that allow scaling sparse autoencoders to tens of millions of features on frontier AI models.
It demonstrated smooth and predictable scaling, with better returns to scale than prior techniques. And they could find 16 million features in GPT-4. The research also introduces several new metrics for evaluating feature quality.
OpenAI has shared the paper, code, and feature visualizations to foster further exploration.
Why does it matter?
It could bring us closer to understanding the inner workings of neural networks. This interpretability will eventually increase model trustworthiness and steerability.
While OpenAI is being criticized for taking unnecessary risks with AI technology, this shows it is also keen on tackling the risk by making models more explainable.
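OpenAI has open-sourced its code, so the toy below is not its implementation; it is just a minimal PyTorch sparse autoencoder with a top-k bottleneck, included to illustrate the core idea of decomposing model activations into a large dictionary of sparsely active “features.”

```python
# Not OpenAI's implementation (their code is open source); a toy PyTorch sparse
# autoencoder with a top-k bottleneck, to illustrate decomposing model
# activations into a large dictionary of sparsely active features.
import torch
import torch.nn as nn

class TopKSparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, n_features: int, k: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, n_features)
        self.decoder = nn.Linear(n_features, d_model, bias=False)
        self.k = k  # number of features allowed to fire per activation vector

    def forward(self, activations: torch.Tensor):
        latents = torch.relu(self.encoder(activations))
        # keep only the k largest feature activations, zero out the rest
        topk = torch.topk(latents, self.k, dim=-1)
        sparse = torch.zeros_like(latents).scatter_(-1, topk.indices, topk.values)
        return self.decoder(sparse), sparse

# toy usage: reconstruct a batch of residual-stream activations
sae = TopKSparseAutoencoder(d_model=768, n_features=16_384, k=32)
x = torch.randn(8, 768)
recon, features = sae(x)
loss = torch.nn.functional.mse_loss(recon, x)  # training objective: reconstruction error
```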
New Chinese video generation model beats OpenAI’s Sora
Kuaishou, a Chinese tech company, has introduced Kling, an AI model for video generation. It can make videos up to two minutes long at 1080p resolution and 30 frames per second, vs. Sora’s one-minute videos.
Kuaishou claims Kling correctly simulates the physical properties of the real world, including complex motion sequences. Using a diffusion transformer, it can also combine concepts and create fictional scenes, such as a cat driving a car through a busy city.
The model is currently available as a public demo in China.
Why does it matter?
Move over, Sora. This Chinese alternative is blowing everyone’s minds. It is becoming increasingly clear that China is catching up fast with the USA in AI.
Source: https://kling.kuaishou.com/
Nvidia is now the second-most valuable company, overtaking Apple
Nvidia rallied to record highs on Wednesday, with its stock market valuation hitting $3 trillion and overtaking Apple to become the world’s second most valuable company. This comes after Nvidia made a series of major announcements in the past week.
Nvidia’s stock has surged 147% so far in 2024, with demand for its top-of-the-line processors far outstripping supply as Big Tech races to build out its AI computing capabilities and dominate the emerging technology.
Microsoft remains the world’s most valuable company, with a market value of approximately $3.15 trillion.
Why does it matter?
Nvidia is now seen as a key company to watch to see how fast AI-powered tech is spreading across the business world, a shift that Nvidia’s founder, Jensen Huang, has declared as the dawn of the “next industrial revolution”.
Chinese tech firm Kuaishou just introduced KLING, a new text-to-video AI model capable of generating high-quality videos up to 2 minutes long with outputs that appear to rival OpenAI’s still-unreleased Sora.
The details:
KLING can produce videos at 1080p resolution with a maximum length of 2 minutes, surpassing the 1-minute Sora videos demoed by OpenAI.
KLING’s demos include realistic outputs like a man eating noodles and scenic shots, as well as surreal clips like animals in clothes.
The model uses a 3D space-time attention system to simulate complex motion and physical interactions that better mimic the real world.
The model is currently available to Chinese-based users as a public demo on the KWAI iOS app.
Why it matters: These generations are even more mind-blowing when you consider that Will Smith’s spaghetti-eating abomination was barely a year ago. With users still anxiously waiting for the public release of Sora, other competitors are stepping in — and the AI video landscape looks like it’s about to heat up in a major way.
🇨🇳 ByteDance’s AI chip loophole
TikTok parent company ByteDance is renting advanced Nvidia AI chips and using them on U.S. soil, exploiting a loophole to sidestep restrictions on China’s AI chip exports.
The details:
Due to national security concerns, the U.S. government prohibits Nvidia from selling AI chips like the A100 and H100 directly to Chinese companies.
The restrictions don’t prevent Chinese firms from renting chips for use within the U.S. — ByteDance is allegedly leasing servers with chips from Oracle.
ByteDance reportedly had access to over 1,500 H100 chips and several thousand A100s last month through the Oracle deal.
Other Chinese giants like Alibaba and Tencent are also reportedly exploring similar options, either renting from U.S. providers or setting up US data centers.
Why it matters: The AI race between the U.S. and China is only escalating — and it appears major players are going to get AI chips by any means necessary. While the U.S. tries to stall its rival’s progress with restrictions, it feels like a game of whack-a-mole that won’t stop China from reaching its AI goals.
🔍 AI RESEARCH: OpenAI probes GPT’s inner workings
OpenAI just released a new paper detailing a method for reverse engineering concepts learned by AI models and better understanding ChatGPT’s inner workings.
The details:
The paper was authored by members of the recently disbanded superalignment team, including Ilya Sutskever and Jan Leike.
‘Scaling and Evaluating Sparse Autoencoders’ outlines a technique to ID patterns representing specific concepts inside GPT-4.
By using an additional model to probe the larger model, researchers found a way to extract millions of activity patterns for further exploration.
OpenAI released open-source code and a visualization tool, allowing others to explore how different words and phrases activate concepts within models.
Why it matters: Much like Anthropic’s recent “Golden Gate Claude” and corresponding research, AI firms are still working to understand what’s truly going on underneath the hood. Cracking AI’s black box would be a big step towards better safety, tuning, and controllability of rapidly advancing models.
New AI Job Opportunities on June 07th 2024:
📊 Harvey – Data Analyst
🧠 Meta – Applied AI Research Scientist – Reinforcement Learning
📣 DeepL – Corporate Communications Lead
🤝 Palantir Technologies – Enablement Partner
What Else Is Happening in AI on June 07th 2024
Adobe launches AEP AI Assistant to help brands master customer data and outreach
Adobe’s Adobe Experience Platform (AEP) is a real-time customer experience management (CXM) solution designed for enterprises. AEP AI Assistant is a gen AI chatbot hooked up to the AEP and an enterprise’s storehouse of advertising and customer data, brand assets, and content collateral (at their direction).
The most comprehensive benchmarking & leaderboard for image models is here!
AI image models have achieved near-photographic quality. But how do they compare? Are the open-source alternatives on par with their proprietary counterparts? The Artificial Analysis Text to Image Leaderboard aims to answer these questions with human preference-based rankings.
AI used to predict potential new antibiotics in groundbreaking study
Scientists used an algorithm to mine the entirety of microbial diversity on Earth, or at least a huge representation of it, and found almost 1 million new molecules encoded or hidden within that microbial dark matter.
Meta gets EU complaints from 11 countries over use of personal data to train AI
Meta rejected the criticism and referred to a May 22 blog in which it said it uses publicly available online and licensed information to train AI as well as information that people have shared publicly on its products and services.
The Federal Trade Commission is investigating Microsoft-Inflection AI deal
The FTC has sent subpoenas to the tech giant and the startup, asking whether their partnership evaded the required government antitrust review of the transaction. Microsoft picked off Inflection’s specialized workforce of AI researchers but didn’t purchase the company outright.
A Daily chronicle of AI Innovations June 06th 2024:
Stability AI’s sound generator creates drum beats and instrument riffs
xAI to build the gigafactory of compute
New study reveals key findings on young people’s use of Gen AI
💊 AI predicts nearly 1M new antibiotics
🔊 Stability AI releases open audio model
Mistral’s new SDK simplifies customization for users
Salesforce inaugurates its first AI Centre in London
eBay unveils new AI-tool to enhance product image backdrops
Asana introduces AI teammates
Writer launches a suite of No-Code development tools
Stability AI’s sound generator creates drum beats and instrument riffs
Stability AI’s Stable Audio Open can generate audio samples up to 47 seconds long from text descriptions. The open model is trained on 486,000 royalty-free music samples. The tool enables users to generate drum beats, instrument riffs, and ambient sounds.
However, the AI model has its limitations.
It is unable to produce full songs, melodies, or vocals.
Its terms of service prohibit users from using Stable Audio Open commercially.
Its training data is biased toward the English language and specific music styles.
Why does it matter?
The tool will enable creative users like sound designers, musicians, and developers to fine-tune the model to their own custom audio data, which is a major leap in the realm of generative AI’s audio generation capabilities.
xAI to build the gigafactory of compute
xAI seeks to build the world’s largest supercomputer in a multi-billion dollar project. The company plans to use this supercomputer to develop AI products, including its chatbot, Grok. The facility will be powered by Nvidia’s H100 GPUs, and the project aims to be operational by fall 2025.
The project will reportedly use Nvidia AI chips originally intended for Tesla, raising concerns about conflicts of interest. Moreover, Musk hasn’t yet delivered Grok 2, an advanced AI model that he had promised in May.
Why does it matter?
The supercomputer’s support for the development of Grok could become a groundbreaking advancement in the field of AI technology. Moreover, the project represents a multi-billion dollar investment that could create numerous job opportunities and boost the economy.
New study reveals key findings on young people’s use of Gen AI
The study directly involved young people and examined the use of generative AI across ethnicity, age, gender, and LGBTQ+ identity. Key findings include:
50% of the survey respondents (aged 14-22) have used generative AI. However, only 4% use it daily.
For 53% of respondents, the use case for generative AI was obtaining information, while for 51%, it was brainstorming.
Black young people are more likely to use generative AI compared to their white peers. Reasons include getting information, brainstorming ideas, and assistance with schoolwork.
Young people of Latin origin are more likely than white people to use generative AI for multiple purposes, including image generation and getting help with their jobs.
Out of respondents who have never used generative AI, 34% believed it would not be helpful.
Among respondents who have never used generative AI, LGBTQ+ young people are more likely than cisgender and straight respondents to say they are unlikely ever to use it.
41% of respondents believe that generative AI will have a positive as well as negative impact on their lives in the next 10 years.
Why does it matter?
Understanding the diverse ways in which younger users engage with generative AI enables businesses to uncover trends and data-driven insights to improve their AI models and deliver user experiences that are responsive and inclusive.
Researchers just published a new study detailing the use of AI to predict close to 1M new antibiotics hidden within tiny microbes all over the world, uncovering new potential treatments against bacteria and superbugs.
Researchers used AI to analyze publicly available data on over 100,000 different genomes and meta-genomes.
The AI then predicted which parts of the microbial genomes could potentially produce antibiotic compounds, generating a list of nearly one million candidates.
100 of the AI-predicted drug candidates were tested in the lab, with 79 of them proving to be potential antibiotics.
The paper’s author Cesar de la Fuente said the findings are “the largest antibiotic discovery ever”, accelerating the process from years to just hours.
As the world faces growing threats from antibiotic-resistant bacteria, AI’s ability to unlock millions of new potential treatments could be a lifeline toward staying ahead in the race to outsmart superbugs responsible for millions of deaths every year.
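The authors' pipeline and models are described in the paper; the toy below only illustrates the mine-then-rank step in spirit, with a hypothetical scoring function standing in for a trained antimicrobial-activity model.

```python
# A toy sketch of the mine-then-rank step described above, not the authors'
# pipeline: `predict_antimicrobial_score` stands in for a trained model that
# scores candidate peptides mined from microbial genomes.
def predict_antimicrobial_score(peptide: str) -> float:
    """Hypothetical placeholder; a real model would score antimicrobial activity."""
    # crude proxy: fraction of positively charged residues (K, R, H)
    return sum(peptide.count(aa) for aa in "KRH") / max(len(peptide), 1)

candidates = [
    "GIGKFLHSAKKFGKAFVGEIMNS",
    "MKTAYIAKQR",
    "KWKLFKKIEKVGQNIRDGIIKAGPAVAVVGQATQIAK",
]
ranked = sorted(candidates, key=predict_antimicrobial_score, reverse=True)
shortlist = ranked[:100]  # in the study, roughly 100 top candidates went on to lab testing
```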
Stability AI just unveiled Stable Audio Open, a new model enabling users to create short audio samples and sound effects from text prompts.
Stable Audio Open can create up to 47 seconds of high-quality audio from text prompts, focused primarily on sound effects and production elements.
The model was trained on data from FreeSound and Free Music Archives to avoid copyright issues.
Stable Audio Open can also be fine-tuned on custom audio, allowing artists to train the model on their own samples.
While the AI advances in text-to-image models have been the most visible (literally), both video and audio are about to take the same leap. Putting these tools in the hands of creatives will redefine traditional workflows — from musicians brainstorming new beats to directors crafting sound effects for film and TV.
Mistral’s new SDK simplifies customization for users
Mistral’s SDK, Mistral-Finetune, will allow users to modify its models on workstations, servers, and small data center nodes. While optimized for multi-GPU setups, the SDK can also scale down to a single Nvidia A100 or H100 GPU. It enables developers to fine-tune data sets such as UltraChat within half an hour, says Mistral. (https://techcrunch.com/2024/06/05/mistral-launches-new-services-sdk-to-let-customers-fine-tune-its-models)
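Mistral-Finetune itself is a repo-based toolkit rather than a pip API, so the snippet below is not that SDK; it sketches the generic LoRA fine-tuning recipe such tooling wraps, using Hugging Face transformers and peft with an illustrative model id.

```python
# Not the Mistral-Finetune SDK itself; a generic sketch of the LoRA fine-tuning
# recipe such tooling wraps, using Hugging Face transformers + peft. The model
# id and target dataset below are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "mistralai/Mistral-7B-Instruct-v0.2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

lora = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)   # only the small adapter matrices are trained
model.print_trainable_parameters()    # typically well under 1% of the full model
# ...then train with your preferred Trainer on a chat dataset such as UltraChat.
```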
Salesforce inaugurates its first AI Centre in London
The goal of the centre is to foster collaboration, AI training, and upskilling, and to create AI-related jobs for developers, industry experts, and customers. This step is part of Salesforce’s £4bn five-year investment in the UK, solidifying London’s position as a global AI hub. (https://www.reuters.com/technology/salesforce-open-first-ai-centre-london-2024-06-06)
eBay unveils new AI-tool to enhance product image backdrops
The tool, currently available for iOS users in the UK, USA, and Germany, automatically removes image backgrounds and allows users to replace them with AI-generated backdrops. The AI feature powered by Stable Diffusion will enable sellers to upload high-quality photos easily. (https://techcrunch.com/2024/06/05/ebay-ai-powered-background-enhancement-tool)
Asana introduces AI teammates
The platform leverages historical data based on parameters like skill sets and missing information, allowing teams to define tasks and get an overview of who is assigned which responsibility. The tool allows users to interact with it using an AI chat interface. (https://www.theverge.com/2024/6/5/24170480/asana-ai-teammate-workflow-assistant-chatbot)
Writer launches a suite of No-Code development tools
The Gen AI platform announced major expansions that included Writer Framework – a drag-and-drop tool that enables developers to connect pre-built UI components with generative AI to create custom AI applications. The tool additionally provides full flexibility to the underlying code. (https://venturebeat.com/ai/writer-launches-no-code-platform-and-framework-for-custom-enterprise-ai-applications)
A Daily chronicle of AI Innovations June 05th 2024:
Apple could crown OpenAI as King of the Valley
🚨 AI researchers demand ‘Right to Warn’
New ‘AI teammate’ can tell people what to do at work
Elon Musk explains reasoning for Nvidia chip re-route from Tesla to X
OpenAI seeks ‘vast quantities’ of nuclear fusion energy to power superhuman AI
Cisco has unveiled HyperFabric AI Clusters in collaboration with NVIDIA
Tesla’s AI ambitions on hold? Musk diverts chips to X & xAI
OpenAI insiders raise concerns over oversight and safety
ChatGPT, Claude, and Perplexity experienced outages at the same time
Raspberry Pi 5 gets AI boost with Hailo extension module
TECNO CAMON 30 series launches Ella-GPT AI assistant
Snowflake empowers enterprise AI with new No-Code studio
Zoom’s CEO envisions AI clones in meetings
Apple could crown OpenAI as King of the Valley
Apple and OpenAI are reportedly partnering to integrate ChatGPT into the iPhone’s operating system, potentially making it a core feature on every device.
The partnership could mark the end of Google’s competition with OpenAI for being the main AI chatbot on iPhones, putting OpenAI in front of Apple’s massive user base and enhancing its long-term influence in Silicon Valley.
While it’s unclear exactly how Apple will use ChatGPT, possibilities include powering Siri or offering it as a standalone app, with new AI features likely to be introduced as opt-in services during the Worldwide Developers Conference next week.
Current and former employees from top AI labs, including OpenAI, Anthropic, and DeepMind just published an open letter calling for companies to expand whistleblower protections so workers can raise the alarm about potential AI dangers without fear of retaliation.
The ‘Right to Warn AI’ petition was crafted by current and former employees of OpenAI, Anthropic, and Google DeepMind.
The open letter was also endorsed by AI visionaries Yoshua Bengio, Geoffrey Hinton, and Stuart Russell.
The statement pushes for AI firms to agree to several principles:
Eliminating non-disparagement clauses concerning AI risk
Establishing and facilitating anonymous channels for raising concerns
Expanding whistleblower protections and anti-retaliation measures
Several researchers posted threads on their experiences, with Daniel Kokotajlo revealing he quit OpenAI after ‘losing hope’ the company would act responsibly.
The AI safety discourse is reaching a boiling point, and there is clearly a major industry divide that transcends any one AI firm or researcher. The proposed principles seem reasonable and necessary — but it remains to be seen if the top AI leaders will actually listen.
Source: https://righttowarn.ai/
New ‘AI teammate’ can tell people what to do at work
Asana has introduced an “AI teammate” that not only assigns tasks but also fills in missing information, aiming to enhance team collaboration and overall efficiency within the platform.
The AI model leverages historical data and team relationships to match tasks with employees’ specific skill sets, ensuring that the right people are assigned to the right projects for optimal performance, and proactively gathering any missing information needed for tasks.
Asana’s AI teammate includes a chat interface that allows users to ask questions about their projects, similar to other AI tools from companies like Google, Amazon, and Microsoft, which are also focused on automating workflow tasks.
Elon Musk explains reasoning for Nvidia chip re-route from Tesla to X
Elon Musk has acknowledged diverting Nvidia AI chips intended for Tesla to his other companies, xAI and social media firm X, claiming the chips would have been unused otherwise.
This move comes as Musk repositions Tesla from solely a car manufacturer to an AI and robotics company, with plans to significantly increase its chip procurement for AI training by the year’s end.
The timing of Musk’s confession could affect the upcoming vote on his substantial pay package, raising concerns among Tesla shareholders about potential resource misallocation to his other ventures.
OpenAI seeks ‘vast quantities’ of nuclear fusion energy to power superhuman AI
OpenAI is looking to secure a deal with Helion Energy to obtain “vast quantities” of nuclear fusion energy to create superhuman artificial intelligence, despite the technology not yet being commercially available.
Helion Energy, supported by significant investments including $375 million from OpenAI’s CEO Sam Altman, aims to produce commercial-scale nuclear fusion energy by 2028.
Nuclear fusion is considered the “holy grail of clean energy” and is crucial for meeting the energy needs of future AI, but achieving it at scale remains highly challenging.
Cisco has unveiled HyperFabric AI Clusters in collaboration with NVIDIA
Cisco and NVIDIA announced Cisco Nexus HyperFabric AI Clusters, an end-to-end infrastructure solution for scaling generative AI workloads in the data center. It combines Cisco’s AI-native networking with NVIDIA’s accelerated computing AI software and VAST’s data storage platform.
It is designed to simplify the deployment and management of generative AI applications for enterprise customers, providing centralized control across the entire AI infrastructure stack.
The Nexus HyperFabric AI cluster will be available for early customer trials in Q4 2024, with general availability expected shortly after.
Why does this matter?
This breakthrough solution aims to provide IT visibility and analytics across the entire AI infrastructure stack, allowing enterprises to focus on AI-driven revenue opportunities rather than spending excessive time on IT management.
Tesla’s AI ambitions on hold? Musk diverts chips to X & xAI
Elon Musk instructed Nvidia to prioritize shipments of AI chips to X and xAI over Tesla, diverting over $500 million worth of Nvidia’s flagship H100 AI chips that were initially reserved for Tesla.
This decision could delay Tesla’s plans to significantly increase its acquisition of H100 chips from 35,000 to 85,000 by the end of 2024, a crucial part of Musk’s vision for transforming Tesla into “a leader in AI and robotics.”
Consequently, this move could frustrate Tesla investors who are counting on Musk to deliver on his promises regarding autonomous driving and Tesla’s AI capabilities.
Why does this matter?
Musk’s decision to prioritize chip shipments to xAI could give it a technological edge in the race to develop advanced generative AI models, potentially outpacing competitors like OpenAI, Google, and others.
OpenAI insiders raise concerns over oversight and safety
OpenAI researchers are concerned about the lack of proper oversight, the influence of profit motives, and the suppression of whistleblowers working on advanced AI technologies. They warn of risks ranging “from the further entrenchment of existing inequalities to manipulation and misinformation, to the loss of control of autonomous AI systems potentially resulting in human extinction.”
They want AI companies to agree to four principles: refraining from enforcing non-disparagement agreements, establishing anonymous channels to raise concerns, allowing employees to share risk-related information publicly while protecting trade secrets, and not retaliating against whistleblowers.
Why does this matter?
Amid ongoing OpenAI controversies, the letter, coupled with the high-profile names endorsing it, will place even greater scrutiny on its practices and decision-making. This could pressure the company to be more transparent and accountable.
ChatGPT, Claude, and Perplexity experienced outages at the same time
This unusual occurrence could indicate a systemic problem rather than individual issues, possibly signaling a broader infrastructure or internet-scale issue affecting these providers. (https://techcrunch.com/2024/06/04/ai-apocalypse-chatgpt-claude-and-perplexity-are-all-down-at-the-same-time)
Raspberry Pi 5 gets AI boost with Hailo extension module
Raspberry Pi launched a $70 AI Kit, an extension for the Raspberry Pi 5. It includes a neural network inference accelerator, the Hailo-8L, powered by Hailo’s AI chip. With it, the Raspberry Pi 5 can perform inferencing at 13 tera-ops per sec, facilitating tasks like object detection, semantic segmentation, and facial landmarking for camera applications. (https://techcrunch.com/2024/06/04/raspberry-pi-partners-with-hailo-for-its-ai-extension-kit)
TECNO CAMON 30 series launches Ella-GPT AI assistant
It supports over 70 languages, helps with daily tasks and content creation, and improves user interaction with features like real-time translations, voice commands, and personalized assistance. Additional capabilities include Ask AI for text editing and grammar checks and AI Generate for turning sketches into images. (https://www.businesstoday.in/technology/news/story/tecno-camon-30-series-introduces-new-ai-assistant-ella-gpt-432156-2024-06-05)
Snowflake empowers enterprise AI with new No-Code studio
It announced several updates to its Cortex AI service and Snowflake ML. The introduction of No-Code AI & ML Studio stands out among these enhancements, enabling every enterprise user to construct AI applications tailored to their specific use cases without requiring coding expertise. (https://venturebeat.com/ai/snowflake-doubles-down-on-enterprise-ai-with-no-code-studio-and-more)
Zoom’s CEO envisions AI clones in meetings
Zoom’s CEO, Eric Yuan, envisions AI-driven digital avatars, or “digital twins,” representing humans in meetings, potentially reducing the workweek to three or four days. He argues AI can efficiently manage tasks like Zoom calls, chats, and emails, allowing people to reclaim time spent in meetings. This initiative forms part of Zoom’s 2.0 journey, aiming to evolve beyond a mere videoconferencing tool. (https://www.theverge.com/2024/6/3/24168733/zoom-ceo-ai-clones-digital-twins-videoconferencing-decoder-interview)
Google’s AI Overviews appear on 70% fewer Search results pages now: https://www.zdnet.com/article/googles-ai-overviews-appear-on-70-fewer-search-results-pages-now
📶 AI Research: OpenAI researcher: AGI coming by 2027
Former OpenAI researcher Leopold Aschenbrenner just released a new essay series detailing his view on AGI, arguing that the coming acceleration in AI progress will be among the most intense and volatile events in human history.
Aschenbrenner says that ‘nobody is pricing in’ what is coming in AI, and to expect another GPT-2 to GPT-4 level jump by 2027 (that would take us to AGI).
The researcher predicts that hundreds of millions of AGI would then rapidly accelerate progress, compressing decades of progress into a year.
He also discussed the economic and military advantages that will come with AGI, calling it a national security issue that needs its own ‘Manhattan Project’.
Aschenbrenner reiterated these views on the Dwarkesh Podcast, also revealing he was fired from OpenAI after raising AI security concerns.
As an insider at OpenAI, Aschenbrenner’s analysis carries weight — and his projections paint a striking picture of how radically AGI will reshape the world. His commentary on the firing also raises even more drama, given OpenAI’s current media battle with former researchers and the board over safety concerns.
New AI Job Opportunities on June 05th 2024:
Siena – Head of Design: https://jobs.therundown.ai/jobs/61293280-head-of-design
Figure AI – Software Engineer, Developer Tools and Productivity: https://jobs.therundown.ai/jobs/59968286-software-engineer-developer-tools-and-productivity
Snorkel – ML Enablement Lead: https://jobs.therundown.ai/jobs/61783157-ml-enablement-lead
A Daily chronicle of AI Innovations June 04th 2024:
Intel’s new data center chips handle demanding AI workloads
Amazon’s Project PI detects defective products before shipping
Microsoft’s Aurora AI could transform weather forecasting
High-quality education data key to AI performance: Research
LeCun criticizes Musk for mistreating scientists and spreading misinformation
Microsoft to invest $3.2 billion in Sweden to expand AI and cloud infrastructure
Microsoft identifies few AI deep fakes in the EU election
☕️ Robots serve up coffee at Starbucks
📦 Amazon’s AI ‘private investigator’
Intel’s new data center chips handle demanding AI workloads
Intel has announced next-generation Xeon 6 server processors to regain the data center market share it had been losing to AMD. They come in two varieties. The larger, more powerful version is designed to run the computations necessary to generate responses from complex AI models and other tasks requiring increased horsepower. Intel plans to help companies modernize their aging data center systems with Xeon 6 chips so they can generate new digital capabilities.
Intel also revealed that its Gaudi 3 AI accelerator chips would be priced much lower than its rivals’ products.
Why does it matter?
As more companies have started to deploy AI apps and models, the AI hardware space is getting heated with competition. Intel seems to be one of the only companies innovating across the full spectrum of the AI market opportunity– from semiconductor manufacturing to PCs and data center systems.
Amazon’s Project PI detects defective products before shipping
Amazon has launched Project PI, which uses AI to scan products for defects before shipping them to customers. This AI system combines computer vision to visually inspect items with generative AI models that can understand things like text on packages.
As products go through a scanning tunnel, the AI checks for damage, incorrect colors/sizes, or expired dates. If it finds a problem, that item is isolated to evaluate the defect. Project PI already operates in several of Amazon’s warehouses across North America. The system catches millions of defective products daily before they reach customers.
Why does it matter?
Using innovative AI systems, retailers can avoid dealing with returns and reshipments, reducing costs and inefficiencies. By cutting down unnecessary shipping, retailers minimize environmental impact and carbon emissions, contributing to sustainability goals.
Microsoft’s Aurora AI could transform weather forecasting
Microsoft has developed a powerful new AI foundation model called Aurora that can make highly accurate weather predictions. It is trained on over a million diverse weather and climate data hours. This allows it to develop a comprehensive understanding of atmospheric dynamics and excel at forecasting various weather variables like temperature, wind speed, air pollution levels, and greenhouse gas concentrations.
What sets Aurora apart is its ability to capture intricate details at high spatial resolution (around 11km) while being much faster and more computationally efficient than traditional numerical weather prediction systems. Aurora’s flexible architecture and training on heterogeneous datasets enable it to adapt to different forecasting tasks and resolutions.
Why does it matter?
This major advancement in AI-based weather forecasting could help communities prepare for extreme weather events like storms. AI will also play a bigger role in predicting the impacts of climate change. We may be nearing the day when the weather forecast is accurate every time.
South Korean search giant Naver shared a video of the company’s in-office autonomous Starbucks location — with 100 robots delivering coffee and other items throughout the building.
Naver’s autonomous wheeled “Rookie” robots navigate the building’s 36 floors to bring packages, coffee, and lunch to employees.
Rookie bots are assisted by dual-armed “Ambidex” robots, which are lightweight and dexterous for safer human interactions.
Both robots connect to the company’s ARC system, which enables control of navigation, planning, and processing for the entire fleet through cloud computing.
Naver also developed RoboPort, a dedicated elevator system that allows robots to move quickly and efficiently between floors.
Naver’s putting its robots to work in the ‘real world’ — and while the location is just a testing ground for now, it’s also a glimpse into how service robotics will be integrated into the mainstream sooner than many expect.
📦 Amazon’s AI ‘private investigator’
Amazon just unveiled Project P.I., an AI system that scans products in the company’s fulfillment centers to detect damaged or incorrect items before they ship to reduce returns and waste.
Project P.I. uses AI and computer vision to spot defects like damaged products as well as wrong colors/sizes as items pass through imaging tunnels.
The system is already in place across the company’s North American fulfillment centers, with plans to expand globally throughout the year.
Amazon also utilizes a multimodal LLM to investigate issues further, combining customer feedback with Project P.I. images to identify the source problem.
Amazon’s no stranger to adopting AI across its operations, and Project P.I. is yet another innovative approach to supercharge efficiency. Soon, the company will likely combine these talents with advanced warehouse robots — taking humans out of the fulfillment process altogether.
⛈️ AI RESEARCH: Microsoft’s AI weather forecasting model
Microsoft researchers just introduced Aurora, a new AI foundation model trained on over 1M hours of weather and climate data that can generate accurate weather forecasting insights.
Aurora produces accurate forecasts across a variety of weather scenarios, including extreme events or areas with limited data.
The 1.3B parameter model can generate a 5-day global air pollution prediction in under 60 seconds.
Aurora also produces 10-day global forecasts at high resolutions, beating both the top models and specialized AI weather systems.
Aurora’s success could usher in a huge shift in the way we do weather forecasting. With the recent progress of AI models, combined with massive amounts of atmospheric data — the day that the weatherman is truly right 100% of the time might be coming sooner than most people probably think.
New AI Job Opportunities on June 04th 2024:
🎨 Adept AI – Product Designer: https://jobs.therundown.ai/jobs/61589310-product-designer
⚙️ Luma AI – Senior Distributed Systems Engineer: https://jobs.therundown.ai/jobs/61614624-senior-distributed-systems-engineer
🚗 Scale AI – Account Executive, Automotive: https://jobs.therundown.ai/jobs/61614640-account-executive-automotive
🧑💻 Lambda – Technical Program Manager: https://jobs.therundown.ai/jobs/61589251-technical-program-manager
What Else Is Happening in AI on June 04th 2024
Hugging Face detects ‘unauthorized access’ to its AI model hosting platform
Last week, Hugging Face detected unauthorized access to Spaces, its platform for creating, sharing, and hosting AI models. It suspects some Spaces secrets (private pieces of info that act as keys to unlock protected resources like accounts, tools, and dev environments) have leaked. It has taken steps to remediate this. (https://huggingface.co/blog/space-secrets-disclosure)
High-quality education data key to AI performance: Research
Researchers created a high-quality dataset called FineWeb-Edu by filtering an existing web dataset for educational content. Language models trained on FineWeb-Edu significantly outperformed models trained on unfiltered datasets. The research shows that data quality and diversity are more important than dataset size for training effective AI models. (https://x.com/gui_penedo/status/1797173053123916036)
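The actual FineWeb-Edu classifier is described in the researchers' write-up; the toy below only illustrates the filtering idea, with a hypothetical scoring function standing in for the trained educational-quality model.

```python
# A toy sketch of the filtering idea, not the FineWeb-Edu pipeline:
# `educational_score` is a hypothetical stand-in for the quality classifier the
# researchers trained; documents above a threshold are kept for pretraining.
def educational_score(text: str) -> float:
    """Hypothetical placeholder; a real classifier would score 'educational value'."""
    cues = ("theorem", "experiment", "explain", "chapter", "definition")
    return sum(text.lower().count(c) for c in cues) / max(len(text.split()), 1)

def filter_corpus(documents, threshold=0.01):
    return [doc for doc in documents if educational_score(doc) >= threshold]

web_pages = ["Chapter 1: We explain the definition of a limit...",
             "BUY NOW!!! limited offer, click here"]
print(filter_corpus(web_pages))  # keeps the educational page, drops the spam
```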
LeCun criticizes Musk for mistreating scientists and spreading misinformation
LeCun has again rebuked Musk on X after their heated feud last week. This time, LeCun accused Musk of forcing researchers to work in secrecy instead of allowing them to publish their work, which slows scientific progress. He also accused Musk of making false predictions about AI and autonomous vehicles. (https://venturebeat.com/ai/yann-lecun-ai-pioneer-sharply-criticizes-elon-musk-over-treatment-of-scientists-and-spreading-of-misinformation)
Microsoft to invest $3.2 billion in Sweden to expand AI and cloud infrastructure
Microsoft will invest $3.2 billion over two years to expand its cloud and AI infrastructure in Sweden. Microsoft’s biggest investment to date in Sweden includes a pledge to help train some 250,000 people with AI skills, corresponding to 2.4% of the population, which will help boost the Nordic country’s competitiveness in generative AI. (https://www.reuters.com/technology/microsoft-invest-32-bln-swedish-cloud-ai-2024-06-03)
Microsoft identifies few AI deepfakes in the EU election
As the European Union prepares for its elections in June 2024, the threat of AI-generated deepfakes has become a significant concern. Microsoft President Brad Smith highlighted this burning issue, emphasizing its potential impact on the democratic process and the steps Microsoft is taking to mitigate these risks. (https://www.reuters.com/technology/few-ai-deepfakes-identified-eu-elections-microsoft-president-says-2024-06-03)
A Daily Chronicle of AI Innovations: June 01-03, 2024
Windows AI feature is a security ‘disaster’
Zoom CEO wants AI clones in meetings
Nvidia and AMD unveil next generation AI chips as competition heats up
Nvidia announces an AI gaming assistant
Nvidia CEO drops a series of AI announcements
AMD outlined new chip architecture strategy for AI data centers
ElevenLabs’ Text to Sound AI wows creators
Windows AI feature is a security ‘disaster’
Microsoft’s new AI-powered Recall feature, which captures and stores screenshots of all user activity on a PC, has been criticized as a cybersecurity “disaster” due to significant security vulnerabilities discovered during testing.
Cybersecurity expert Kevin Beaumont found that Recall stores screenshots in an unencrypted plain text database, making it easy for malware and attackers to access sensitive data, despite Microsoft’s claims of a secure, encrypted experience.
Privacy advocates and the UK’s ICO (Information Commissioner’s Office) have raised concerns about Recall’s potential to expose personal information, with criticism focusing on its default-enabled status and the lack of comprehensive content moderation to protect sensitive data like passwords and financial information.
Zoom CEO wants AI clones in meetings
Zoom CEO Eric Yuan envisions using AI-powered digital twins to attend meetings on behalf of users, allowing people to manage their time more effectively.
Yuan aims to expand Zoom beyond videoconferencing into the broader enterprise software market, competing with giants like Microsoft and Google by integrating productivity tools such as email and chat.
Zoom is heavily investing in AI technology to facilitate this transition, although Yuan acknowledges challenges like AI hallucination that need to be addressed before fully realizing his vision.
Nvidia and AMD unveil next generation AI chips as competition heats up
Nvidia and AMD showcased their latest AI chips at a major computing conference, heightening their rivalry in the AI chip market.
Nvidia CEO Jensen Huang emphasized his company’s dominance and highlighted a new chip, Rubin, set for release in 2026, while AMD CEO Lisa Su focused on teamwork and collaborative presentations with industry partners.
The competition between the two companies is significant for AI development, with Nvidia promoting a proprietary ecosystem and AMD advocating for open standards and interoperability.
Nvidia announces an AI gaming assistant
Nvidia recently demonstrated its AI assistant, G-Assist, showcasing its evolution from an April Fools’ prank to a powerful tool for game developers and RTX GPU owners.
G-Assist can respond to voice queries, understand in-game activities, provide guidance based on player’s skill points, and offer performance-enhancing recommendations for a smooth gaming experience.
The AI assistant optimizes PC settings, monitors latency and frame rates, detects refresh rate discrepancies, and suggests ways to boost performance, including overclocking the GPU if necessary.
Nvidia CEO drops a series of AI announcements
Nvidia CEO Jensen Huang revealed the company’s ambitious plans for annual AI accelerator upgrades, targeting a broader range of industries to expand its customer base.
Nvidia will release the Blackwell Ultra chip in 2025 and the next-generation Rubin platform in 2026.
It is also releasing a new server design, MGX, to help companies like HPE and Dell bring products to market faster.
The company is promoting the use of digital twins in its Omniverse virtual world, showcasing a digital twin of Earth for sophisticated modeling tasks.
Nvidia also introduced Project G-Assist, an RTX-powered AI assistant technology that provides context-aware help for PC games and apps.
G-Assist uses voice or text inputs and game window snapshots to provide personalized responses based on in-game context.
Developers can customize the AI models for specific games or apps, and they can run on the cloud or locally on GeForce RTX AI PCs and laptops.
Nvidia partnered with Studio Wildcard for a tech demo using ARK: Survival Ascended, showcasing how G-Assist can help with quests, items, lore, and challenging bosses; the full keynote walks through the demo in more detail.
These announcements show how eager Nvidia is to retain its position as a leader in the AI hardware market. In addition to pushing the acceleration of AI chips, Nvidia is developing new tools to shape AI’s implementation in multiple sectors.
AMD outlined new chip architecture strategy for AI data centers
AMD CEO Lisa Su introduced new AI processors at Computex and outlined the company’s plan to develop AI chips over the next two years.
She announced the MI325X accelerator, which will be released in Q4 2024.
Introduced the MI350 series, expected in 2025, which promises a 35x improvement in inference performance compared to the current MI300 series.
The company also teased the MI400 series, slated for 2026, based on the mysterious “Next” architecture.
With AMD and Nvidia moving to annual release cycles, the competition is heating up to meet the soaring demand for AI semiconductors.
Why does it matter?
AMD’s aggressive push to challenge Nvidia’s market leadership could lead to increased innovation, lower prices, and more widespread adoption of AI across various sectors.
ElevenLabs’ Text to Sound AI wows creators
ElevenLabs introduces Text to Sound, an AI model that generates sound effects, instrumental tracks, soundscapes, and character voices from text prompts. The tool aims to help film, TV, video game, and social media creators produce high-quality audio content quickly and affordably.
They have partnered with Shutterstock to fine-tune the model using their diverse audio library of licensed tracks. Users can generate sound effects by logging in, describing the desired sound, and downloading the best results.
Note: This tool doesn’t have a content filter and can generate any raw content through conditional prompting.
Why does it matter?
It could significantly reduce production costs and timelines by simplifying the development of high-quality sound effects, music, and voices, encouraging smaller studios and individual creators to compete with larger players.
Meta’s AI generates bizarre summaries of Facebook comments on various posts
The AI picks up on both serious and lighthearted comments, often highlighting the more outlandish ones. It’s unclear how Meta chooses which posts to display these summaries on. However, the summaries have raised privacy concerns, as Meta feeds user comments into its AI system. (https://www.theverge.com/2024/5/31/24168802/meta-ai-facebook-comments-summaries)
Finnish startup Binit is developing an AI household waste tracker
The gadget, designed to be mounted in the kitchen, has cameras and sensors to scan items before they are thrown away. Binit uses OpenAI’s GPT for image recognition, achieving nearly 98% accuracy in trash recognition. The app provides analytics, feedback, and gamification to encourage users to reduce waste, with tests showing a 40% reduction in mixed bin waste. (https://techcrunch.com/2024/06/03/binit-is-bringing-ai-to-trash)
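As an illustration of how an app like this might lean on a multimodal GPT model for image recognition, here is a minimal sketch using the OpenAI Python SDK. The prompt, bin categories, and choice of gpt-4o are assumptions made for this example; Binit has not published its actual implementation.

```python
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def classify_trash(image_path: str) -> str:
    # Encode the photo so it can be sent inline as a data URL.
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()

    response = client.chat.completions.create(
        model="gpt-4o",  # assumed; Binit has not disclosed which GPT model it uses
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Classify this household item into exactly one bin: "
                         "recycling, organic, mixed waste, or hazardous. "
                         "Answer with the bin name only."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content.strip()

print(classify_trash("milk_carton.jpg"))  # hypothetical example image
```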
Latest AI Tools on June 03rd 2024:
Consistent Character Model – Create consistent character images in different poses: https://supertools.therundown.ai/content/consistent-character-mode
Perplexity Pages – Turn your research into shareable articles with AI: https://supertools.therundown.ai/content/paper-clipserplexity-ai
PixVerse Magic Brush – Brush areas, mark directions, and watch images come to life: https://supertools.therundown.ai/content/pixverse
Glaze – Protect artists from generative AI: https://supertools.therundown.ai/content/glaze
Fontjoy – Helps designers find perfect font pairings in one click: https://supertools.therundown.ai/content/fontjoy
Eva Coach – An AI assistant to help manage work-related stress: https://supertools.therundown.ai/content/eve-coach
Latest AI Jobs on June 03rd 2024:
Figure AI – Solutions Lead: https://jobs.therundown.ai/jobs/59875235-solutions-lead
So it has begun! There is not a lot of usage you can get out of this thing, so you have to use your prompts very sparingly; the rate limiting is measured in days, not hours. 🙁
Let’s start off with the media. Just one little dig at them, because on CNBC they said “the model is a smaller model.” The notion seems to be that this model was distilled from a larger model, so they simply repeated that. I don’t think this is a smaller model. It could be that the core model is smaller, but what is going on behind the scenes with the thinking involves a lot of throughput to one or more models. The implication is important to understand: on one hand there is an insanely low rate limit (when I say low, I mean 30 messages per week low), and on the other hand the thinking is clearly firing a lot of tokens to work through a process and reach a conclusion. The reason I say it’s a concert of models firing at each other is that something has to be doing the thinking, and another call (which could be the same model) has to be checking the steps and handling other things. In my mind, you would have a collection of experts, each doing its own job. Ingenious, really.
Plausibility model
Think of the plausibility model as the prime cerebral model. When humans think, the smartest among them understand when they are headed down the right path and when they are not. You see this in Einstein’s determination to prove the theory of relativity: his claim to fame came when observations taken during a solar eclipse captured light bending around our star, showing that the fabric of space was indeed curved. Einstein’s intuition here cannot be overstated. From Newton’s intuition about gravity and mass to Einstein challenging that basic notion and pushing further toward a new understanding of the how and why, it all starts with a sense of plausibility about where one is going in the quest for knowledge. With my thoughts, am I headed down the right path? Does my intuition make sense, should I change course, or should I abandon the thought altogether? This is what happens in the mind of an intelligent, even genius-level, being: not only the quest for knowledge, but the ability to recognize correctness wherever the path has led.
In this respect, LLMs were at a distinct disadvantage because they are static capsules of knowledge frozen in time (and in a neural network). In many ways they still are. However, OpenAI has done something truly ingenious to start dealing with this limitation. First, you have to understand why being static rather than dynamic is such a bad thing. If I ask you a question and tell you that you can only answer with the first thing that comes to mind, without thinking, you will sometimes give the wrong answer, and the harder the question, the more likely that becomes. But human beings don’t operate under such a constraint; they think things through in proportion to the difficulty of the question.
One initial criticism is that this model overthinks all of the time. Case in point: it took 6 seconds to process “hello.”
https://preview.redd.it/aih5umfz4iod1.png?width=1459&format=png&auto=webp&s=65bef59c6f7cdb52e9bef56c6d65e1a64b32f0d3
Eventually, I am sure OpenAI will figure this out. Perhaps a gate/orchestrator model?
Some things don’t require much thought; just saying. But back to the plausibility-model concept. I don’t know for certain that this is what is really going on, but I surmise. What I imagine is that smaller models (or the same model) quickly bring information to a plausibility model. The mystery is how the plausibility model “knows” when it has achieved a qualitative output. Sam said something in an interview that leads me to believe that what’s interesting about models since GPT-4 is that if you run something 10,000 times, somewhere in there is the correct answer. Getting the model to give you that answer consistently and reliably is the issue; hence, hallucinations. But what if you could generate responses and have a model check each response for viability? It’s the classic chicken-and-egg problem: does the correct answer come first, or the wrong one? Going even further, what if you present the model with many different candidate answers? Choosing the one that makes the most sense makes the problem solving a little easier. It all becomes recursively probabilistic at this point: of all of these incoming results, keep checking whether the path we’re heading down is logical.
Memory
With this kind of approach, a person would keep track of where they were in the problem-solving process. It is fine to get to a certain point and pause for a moment to plan where to go next. Memory here is vital: you must keep the proper context of where you are in your train of thought, or it is easy to lose track or get confused. Apparently OpenAI has figured out decent ways to do this. Memory, frankly, is poor in all LLMs, including GPT-4; building up a context window is still a major issue for me, and the way the model refers back to it is terrible. In GPT-o1-preview you can tell there are major strides in how memory is used, not necessarily in the browser but probably in backend services we humans never see. Again, this would stem from the coordinating models firing thoughts back and forth; memory on the backend is probably keeping track of all of that, which is probably the main reason the chain of thought won’t be spilling out to your browser, among many other reasons (such as entities stealing it). I digress. In the case of GPT-o1, memory seems to play a much bigger role and is actually used very well for the purpose of thinking.
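To make the plausibility-model and memory ideas above concrete, here is a toy sketch of the generate-then-verify pattern being described: a proposer drafts several candidate answers, a separate checker call scores each for plausibility, and a scratchpad string stands in for the backend memory of the reasoning so far. This is purely speculation about the shape of such a system; the model names, prompts, and scoring scheme are placeholders, not OpenAI’s actual architecture.

```python
from openai import OpenAI

client = OpenAI()

def propose(question: str, scratchpad: str, n: int = 5) -> list[str]:
    # Draft several independent candidate answers (the "thinking" calls).
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder proposer model
        n=n,
        temperature=1.0,
        messages=[
            {"role": "system", "content": "Reason step by step, then answer."},
            {"role": "user", "content": f"Notes so far:\n{scratchpad}\n\nQuestion: {question}"},
        ],
    )
    return [choice.message.content for choice in resp.choices]

def plausibility(question: str, candidate: str) -> float:
    # A separate "checker" call scores how plausible each candidate looks.
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder checker model
        messages=[{
            "role": "user",
            "content": f"Question: {question}\n\nProposed answer:\n{candidate}\n\n"
                       "On a scale of 0 to 10, how plausible is this answer? Reply with a number only.",
        }],
    )
    try:
        return float(resp.choices[0].message.content.strip())
    except ValueError:
        return 0.0

def answer(question: str) -> str:
    scratchpad = ""  # stands in for the backend "memory" of the reasoning so far
    candidates = propose(question, scratchpad)
    return max(candidates, key=lambda c: plausibility(question, c))
```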
Clarity
I am blown away by the totality of this. The promise of what this is could not be clearer. Something is new here. The model feels and acts different: more confident and clear. In fact, the model will ask you for clarity when you are conversing with it. Amazingly, it feels the need to establish clarity about the input you are giving it.
https://preview.redd.it/dr8zsc235iod1.png?width=1201&format=png&auto=webp&s=9f76caa2efe0251c414162faabc389132f4310e8
Whoa. That’s just wild! It’s refreshing, too. It “knows” it’s about to head into a situation and says, wait a minute, let me get a better understanding here before we begin.
Results and Reasoning
The results are spectacular. It’s not perfect, and for the sake of not posting too many images I had to clean up my prompt so the model wouldn’t be tripped up by something it had actually asked me to clarify in the first place. So while it isn’t perfect, it sure as hell is a major advancement in artificial intelligence. Here is a one-shot prompt that GPT-4 and 4o continually fail at.
The reason I like this prompt is that it comes from something I saw in a movie, and as soon as I saw the character write down the date the other man dictated, I knew right away what was about to happen. Living in the US and travelling abroad, you notice oddities that are just the way things are outside of one’s bubble — the metric system, for example. Italy is notorious for giving Americans speeding tickets, and to me the reason is that they have no clue how fast they are going with that speedometer in km/h. I digress. The point is, you have to “know” certain things about culture and likelihood to get the answer immediately. You have to reason through the information quickly to reach the correct conclusion. There is a degree of obviousness, but it comes not just from being smart; it comes from having experienced things in the world. Here is GPT-o1-preview one-shotting the hell out of this story puzzle.
https://preview.redd.it/z6vdhal55iod1.png?width=1057&format=png&auto=webp&s=17d6499286d671449ca9a62fe44eba2ed37f9112
https://preview.redd.it/grphx9q65iod1.png?width=616&format=png&auto=webp&s=52457b4bd11c230590c2583aac6660b3d6b65e92
https://preview.redd.it/j0g5wm575iod1.png?width=796&format=png&auto=webp&s=cb258066c771c35ef5826ce7b37287dfc8ac712a
As I said, GPT-4 and 4o could not do this in one shot, no way, no how. I am truly amazed.
The Bad
Not everything is perfect here. The fact that this model can’t skip the thinking for trivial responses is a fault OpenAI needs to address, because we will want to use this model all of the time instead of 4o, and its not knowing when to think and when to just come out with an answer will be a peculiar thing. With that said, perhaps they are imagining a time when there are acres and acres of Nvidia Blackwell GPUs that will run this in near real time no matter the thought process. Also, the amount of safety embedded in this is remarkable. I would have written a section on a safety model, but that is probably coordinating here too; you get the point — checks upon checks. The model seems a little stiff on personality, and I am unclear about the verbosity of the answers. You wouldn’t believe it from my long posts, but when I am learning something or interacting, I am looking for the shortest and clearest answer you can give; I can’t really tell whether that has been achieved here, and conversing and waiting multiple seconds each turn is not something I am going to do just to find out. Which brings me to the main complaint as of right now: the rate limit is absurd. I mean, 30 messages per week — how can you even imagine using that? For months people will be screaming about this, and rightly so. Jensen can’t get those GPUs to OpenAI fast enough, I tell you. Here again, two years later, and we are going to be capability-starved by latency and throughput. I am just being greedy.
Final Thoughts
In the words of Wes Roth, “I am stunned.” When the limitations are removed, throughput and latency are solved, and this beast is let loose, I have a feeling this will be the dawn of a new era of intelligence. In this way, humanity has truly arrived at the dawn of a man-made and plausibly sentient intelligence. There are many engineering feats left to overcome, but we are at a place where, on this date, 9/12/2024, the world will be forever changed. The thing is, though, this is only showcasing knowledge retrieval and reasoning.
It will be interesting to see what can be done with vision, hearing, long-term memory, and true learning. The things that will be built with this may be truly amazing. The enterprise implications here are going to be profound. Great job, OpenAI!
Does anyone feel like AGI is a hoax and AI will just end up being a convenient reference tool? I just don’t see how people think AI is going to be able to make scientific breakthroughs when all it does is try to predict the next word based on the vast amount of data it’s trained on. It just doesn’t seem fundamentally right to expect a bunch of 0s and 1s to think.
OpenAI, Nvidia executives discuss AI infrastructure needs with Biden officials. [1]
Google unlists misleading Gemini video. [2]
Google’s ALOHA Unleashed AI robot arm can now tie shoes autonomously. [3]
Meta is making its AI info label less visible on content edited or modified by AI tools. [4]
Sources:
[1] https://www.bloomberg.com/news/articles/2024-09-12/openai-nvidia-executives-discuss-ai-infrastructure-needs-with-biden-officials
[2] https://www.theverge.com/2024/9/12/24242897/google-gemini-unlists-misleading-video-ai
[3] https://www.techeblog.com/google-aloha-unleashed-robot-arm-tie-shoes/
[4] https://techcrunch.com/2024/09/12/meta-is-making-its-ai-info-label-less-visible-on-content-edited-or-modified-by-ai-tools/
AI Revolution in Healthcare: ChatGPT & Google Bard’s Breakthroughs – Diagnosis, mRNA Tech, Cancer Detection & More.
AI Revolution in Healthcare: Intro
Dive into the latest AI breakthroughs transforming healthcare since ChatGPT and Google Bard’s inception. Discover GPT-4’s rapid diagnostics, Moderna & IBM’s mRNA tech advancements, cutting-edge cancer detection methods, and more. Stay ahead in AI healthcare news with our comprehensive coverage on AI-powered drug discovery, early Alzheimer’s detection, and groundbreaking AI tools in medicine. Join us as we explore each major AI development that’s reshaping healthcare.
AI Revolution in Healthcare: Topics
GPT-4 diagnosed a 1 in 100,000 condition in seconds
Moderna, IBM partner to advance mRNA technology using GenAI
AI model detects cancer, outperforms traditional methods
AI can detect Alzheimer’s signs even before they begin to show
Google Cloud launches AI tools for drug discovery & precision medicine
BiomedGPT: The most sophisticated AI medical model?
Google & Microsoft battle to lead healthcare AI
MedPerf makes AI better for healthcare
Google DeepMind advances biomedical AI with ‘Med-PaLM M’
Scientists train a neural network to identify PC users’ fatigue
Microsoft & Paige to build largest image-based model to fight cancer
DeepMind’s new AI can predict genetic diseases
Google Cloud launches new generative AI capabilities for healthcare
New AI tool can predict viral variants before they emerge
ChatGPT outperforms doctors in depression treatment
AI algorithms are powering the search for cells
Google releases MedLM, generative AI fine-tuned for healthcare
Google’s new medical AI, AMIE, beats doctors
Are you eager to expand your understanding of artificial intelligence? Look no further than the essential book “AI Unraveled: Master GPT-4, Gemini, Generative AI & LLMs – Simplified Guide for Everyday Users: Demystifying Artificial Intelligence – OpenAI, ChatGPT, Google Bard, AI ML Quiz, AI Certifications Prep, Prompt Engineering,” available at Etsy, Shopify, Apple, Google, or Amazon
AI Revolution in Healthcare: Podcast Transcript
Welcome to “AI Unraveled: Demystifying Frequently Asked Questions on Artificial Intelligence, Latest AI Trends,” where we dive deep into the complexities of AI and bring forth the latest developments in an easy-to-understand format. Today, we’re tackling a series of compelling updates from the AI frontier in the medical field and beyond. In a remarkable medical application, GPT-4, OpenAI’s newest language model, has been put to the test by Dr. Isaac Kohane of Harvard. Impressively, GPT-4 has been reported to perform better than many human doctors, correctly answering medical exam questions over 90% of the time. But what’s truly astonishing is its ability to diagnose a rare 1 in 100,000 condition in just seconds, a task that draws upon the depth of a seasoned physician’s experience. Despite these advances, Dr. Kohane’s book, ‘The AI Revolution in Medicine,’ brings us back to earth, reminding us that GPT-4 is not infallible, presenting a balanced view with examples of the model’s errors ranging from minor clerical issues to math mistakes.
Shifting gears, we look at how pharmaceutical giant Moderna and tech behemoth IBM are joining forces to push the boundaries of mRNA technology. Their collaboration intends to combine generative AI and quantum computing, potentially accelerating the discovery of new therapies and vaccines. This is underpinned by using IBM’s MoLFormer, which is expected to enhance Moderna’s understanding of mRNA medicines. In a leap toward precision medicine, Google Cloud has recently launched two AI-powered tools geared at revolutionizing drug discovery. These innovative tools focus on predicting protein structures and managing vast amounts of genomic data, potentially shaving off years in drug development time. We also witness the rise of BiomedGPT, touted as one of the most sophisticated AI medical models, outperforming predecessors across multiple biomedical modalities. This model appears to be a game-changer with its multi-modal and multi-task learning capabilities.
The competition intensifies in the healthcare AI space with Google’s Med-PaLM 2 going through testing at the Mayo Clinic, while Microsoft swiftly incorporates AI advances into patient care by deploying GPT algorithms via cloud services. Furthermore, MedPerf emerges as a new beacon, an open benchmarking platform introduced by MLCommons, aimed to evaluate medical AI models on diverse datasets, prioritizing patient privacy and aiming to enhance AI’s generalizability in healthcare. Adding to an already impressive array of advancements, we have AlphaMissense by Google DeepMind, which is honing the ability to predict genetic diseases, and Google Cloud briefing the healthcare sector with new capabilities to sift through clinical data more efficiently. And finally, EVEscape, a new AI tool with the potential to predict future viral variants—imagine its profound implications had it been available at the onset of the COVID-19 pandemic!
To cap off, studies suggest that AI models like ChatGPT can outdo doctors in providing unbiased treatment recommendations for depression and that AI algorithms are increasingly crucial in cellular research, changing the landscape of biological imaging experiments. Before we conclude, let’s not forget about AMIE, Google’s Articulate Medical Intelligence Explorer, an AI system optimized for diagnostic reasoning that is giving medical professionals a run for their money. For those seeking a deeper understanding of these advancements, the book “AI Unraveled: Master GPT-4, Gemini, Generative AI & LLMs – Simplified Guide for Everyday Users: Demystifying Artificial Intelligence – OpenAI, ChatGPT, Google Bard, AI ML Quiz, AI Certifications Prep, Prompt Engineering,” is available on various platforms including Etsy, Shopify, Apple, Google, and Amazon. That brings us to the end of today’s episode. We hope you’ve gained new insights into the dynamic and revolutionary world of AI, especially its influence on healthcare. Join us next time on “AI Unraveled” as we continue to explore cutting-edge AI trends that are transforming our lives. Till then, this is your host signing off. Keep questioning, keep learning, and remember—the future is AI.
GPT-4 diagnosed a 1 in 100,000 condition in seconds
Dr. Isaac Kohane, a physician and computer scientist at Harvard, has tested the newest AI model, GPT-4, in a medical setting. According to his findings, GPT-4 performs better than many doctors, as it can answer medical exam licensing questions correctly more than 90% of the time, translate information for patients, and give doctors helpful suggestions about bedside manner.
Kohane tested GPT-4 on a real-life case and found that it could correctly diagnose a rare condition just as he would with all his years of experience. However, GPT-4 isn’t always reliable, and his latest book ‘The AI Revolution in Medicine’ is filled with examples of its blunders, ranging from clerical errors to math mistakes.
Moderna, IBM to explore Generative AI and quantum computing for mRNA vaccines
Moderna and IBM are partnering to advance mRNA technology using generative AI and quantum computing, which could speed up Moderna’s discovery and creation of new messenger RNA vaccines and therapies. Moderna’s scientists will have access to IBM’s generative AI model known as MoLFormer, which will help understand the characteristics of potential mRNA medicines and design a new class of vaccines and therapies.
This agreement comes as Moderna is trying to harness its mRNA technology to target other diseases, while IBM is ramping up its investment in AI with new partnerships, largely driven by the release of OpenAI’s ChatGPT.
The use of quantum computing and AI could help Moderna accelerate the discovery and creation of these new vaccines and therapies by solving problems too complex for traditional computers. The development of these new medicines could potentially benefit the general public by providing more treatment options for a range of diseases.
AI model outperforms traditional methods in identifying cancerous nodules
An AI model developed by experts at the Royal Marsden NHS foundation trust, the Institute of Cancer Research, London, and Imperial College London can accurately identify cancer, potentially speeding up diagnosis and treatment. The algorithm, which analyzes CT scans to determine if abnormal growths are cancerous, reportedly performs more efficiently and effectively than current methods.
Why does this matter?
The AI tool may help doctors make faster decisions about patients with abnormal growths that are currently deemed medium-risk. The model, which is still in its early stages, will require further testing before it can be introduced in healthcare systems. However, researchers hope the AI tool will eventually speed up cancer detection by fast-tracking patients to treatment.
AI can detect signs of Alzheimer’s even before symptoms begin to show
Researchers at UT Southwestern Medical Center have found that AI-powered voice analysis can help diagnose Alzheimer’s and cognitive impairment in early stages. If confirmed by larger studies, these findings could give primary care providers an easy-to-perform screening tool for at-risk individuals.
The research used advanced ML and natural language processing (NLP) to identify even the subtlest changes in language and audio that individuals may not easily recognize.
Why does this matter?
Before ML and NLP, detailed speech studies were often unsuccessful as early changes were often undetectable to human ears. However, with advancements in AI, such novel testing methods have performed significantly better than standard cognitive assessments in detecting even mild impairments. Also, it took less than 10 minutes to capture a patient’s voice, outdoing the traditional tests, which took hours to administer.
Only a few days ago, researchers developed an AI model that outperformed traditional methods in identifying cancer. Does this indicate AI leading the charge in reducing overall healthcare costs with improved patient outcomes?
Google Cloud launches AI tools for drug discovery and precision medicine
Google Cloud has launched two AI-powered tools to help biotech and pharmaceutical companies accelerate drug discovery and advance precision medicine. The Target and Lead Identification Suite aims to streamline the process of identifying a biological target and predicting protein structures, while the Multiomics Suite assists researchers in ingesting, storing, analyzing, and sharing large amounts of genomic data. Both tools aim to significantly reduce the time and cost associated with drug development.
Several companies, including Pfizer, Cerevel Therapeutics, and Colossal Biosciences, have already been using these products. Cerevel Therapeutics estimates that it will save at least three years on average by using the Target and Lead Identification Suite to discover new drugs.
AI seems to benefit humanity the most through its use in medicine and diagnostics. This launch from Google and the subsequent adoption by a pharma giant like Pfizer indicate the swift mainstreaming of the tech.
BiomedGPT: The most sophisticated AI medical model?
BiomedGPT is a unified and generalist Biomedical Generative Pre-trained Transformer model. BiomedGPT utilizes self-supervision on diverse datasets to handle multi-modal inputs and perform various downstream tasks.
Extensive experiments show that BiomedGPT surpasses most previous state-of-the-art models in performance across 5 distinct tasks with 20 public datasets spanning over 15 biomedical modalities.
The study also demonstrates the effectiveness of the multi-modal and multi-task pretraining approach in transferring knowledge to previously unseen data.
Why does this matter?
This research represents a significant advancement in developing unified and generalist models for biomedicine, holding promising implications for enhancing healthcare outcomes, and it could lead to discoveries in biomedical research.
Google & Microsoft battle to lead healthcare AI
Reportedly, Google’s Med-PaLM 2 (an LLM for the medical domain) has been in testing at the Mayo Clinic research hospital. In April, Google announced limited access for select Google Cloud customers to explore use cases and share feedback to investigate safe, responsible, and meaningful ways to use it.
Meanwhile, Google’s rivals moved quickly to incorporate AI advances into patient interactions. Hospitals are beginning to test OpenAI’s GPT algorithms through Microsoft’s cloud service in several tasks. Google’s Med-PaLM 2 and OpenAI’s GPT-4 each scored similarly on medical exam questions, according to independent research released by the companies.
It seems Google and Microsoft are racing to translate recent AI advances into products that clinicians would use widely. The AI field has seen rapid advancements and research in diverse domains. But such a competitive landscape accelerates translating them into widely available, impactful AI products (which is sometimes slow and challenging due to the complexity of real-world applications).
MedPerf makes AI better for healthcare
MLCommons, an open global engineering consortium, has announced the launch of MedPerf, an open benchmarking platform for evaluating the performance of medical AI models on diverse real-world datasets. The platform aims to improve medical AI’s generalizability and clinical impact by making data easily and safely accessible to researchers while prioritizing patient privacy and mitigating legal and regulatory risks.
MedPerf utilizes federated evaluation, allowing AI models to be assessed without accessing patient data, and offers orchestration capabilities to streamline research. The platform has already been successfully used in pilot studies and challenges involving brain tumor segmentation, pancreas segmentation, and surgical workflow phase recognition.
Why does this matter?
With MedPerf, researchers can evaluate the performance of medical AI models using diverse real-world datasets without compromising patient privacy. This platform’s implementation in pilot studies and challenges for various medical tasks further demonstrates its potential to improve medical AI’s generalizability, clinical impact, and advancements in healthcare technology.
Google DeepMind advances biomedical AI with ‘Med-PaLM M’
Google and DeepMind have introduced Med-PaLM M, a multimodal biomedical AI system that can interpret diverse types of medical data, including text, images, and genomics. The researchers curated a benchmark dataset called MultiMedBench, which covers 14 biomedical tasks, to train and evaluate Med-PaLM M.
The AI system achieved state-of-the-art performance across all tasks, surpassing specialized models optimized for individual tasks. Med-PaLM M represents a paradigm shift in biomedical AI, as it can incorporate multimodal patient information, improve diagnostic accuracy, and transfer knowledge across medical tasks. Preliminary evidence suggests that Med-PaLM M can generalize to novel tasks and concepts and perform zero-shot multimodal reasoning.
Why does this matter?
It brings us closer to creating advanced AI systems to understand and analyze various medical data types. Google DeepMind’s MultiMedBench and Med-PaLM M show promising performance and potential in healthcare applications. It means better healthcare tools that can handle different types of medical information, ultimately benefiting patients and healthcare providers.
Scientists train a neural network to identify PC users’ fatigue
Scientists from St. Petersburg University and other organizations have created a database of eye movement strategies of PC users in different states of fatigue. They plan to use this data to train neural network models that can accurately track the functional state of operators, ensuring safety in various industries. The database includes a comprehensive set of indicators collected through sensors such as video cameras, eye trackers, heart rate monitors, and electroencephalographs.
An example of human fatigue analysis using video recording.
The scientists believe that this approach will allow for remote assessment of fatigue severity, and the database will be accessible to software developers for testing their products.
Microsoft and Paige to build the largest image-based AI model to fight cancer
Paige, a technology disruptor in healthcare, has joined forces with Microsoft to build the world’s largest image-based AI models for digital pathology and oncology.
Paige developed the first Large Foundation Model using over one billion images from half a million pathology slides across multiple cancer types. Now, it is developing a new AI model with Microsoft that is orders-of-magnitude larger than any other image-based AI model existing today, configured with billions of parameters.
Paige will utilize Microsoft’s advanced supercomputing infrastructure to train the technology at scale and ultimately deploy it to hospitals and laboratories across the globe using Azure.
Why does this matter?
This will help realize the potential of generative AI at an unprecedented scale, introduce completely novel capabilities of AI, and serve as the cornerstone for the next generation of clinical/healthcare applications built with AI.
DeepMind’s new AI can predict genetic diseases
Google DeepMind’s new system, called AlphaMissense, can assess whether a letter change in DNA will still produce a protein with the correct shape. If not, the change is listed as potentially disease-causing.
Currently, genetic disease hunters have fairly limited knowledge of which areas of human DNA can lead to disease and have to search across billions of chemical building blocks that make up DNA. They have classified 0.1% of letter changes, or mutations, as either benign or disease-causing. DeepMind’s new model pushed that percentage up to 89%.
Why does this matter?
AI is changing nearly everything we do at the moment and might revolutionize molecular biology and life sciences, too. This development is expected to speed up diagnosis and help search for better genetic disease treatments.
Google Cloud launches new generative AI capabilities for healthcare
Google Cloud introduced new Vertex AI Search features for healthcare and life science companies. It will allow users to find accurate clinical information much more efficiently and to search a broad spectrum of data from clinical sources, such as FHIR data, clinical notes, and medical data in electronic health records (EHRs). Life-science organizations can use these features to enhance scientific communications and streamline processes.
Why does this matter?
Given how siloed medical data is currently, this is a significant boon to healthcare organizations. With this, Google is also enabling them to leverage the power of AI to improve healthcare facility management, patient care delivery, and more.
New AI tool can predict viral variants before they emerge
A new AI tool named EVEscape, developed by researchers at Harvard Medical School and the University of Oxford, can make predictions about new viral variants before they actually emerge and also how they would evolve.
In the study, researchers show that had it been deployed at the start of the COVID-19 pandemic, EVEscape would have predicted the most frequent mutations and identified the most concerning variants for SARS-CoV-2. The tool also made accurate predictions about other viruses, including HIV and influenza.
Why does this matter?
The information from this AI tool will help scientists develop more effective, future-proof vaccines and therapies. Had this AI boom happened a little earlier, tools like EVEscape might have helped blunt the impact of the COVID-19 pandemic, and they could prove valuable in heading off future outbreaks.
ChatGPT outperforms doctors in depression treatment
According to a new study, ChatGPT makes unbiased, evidence-based treatment recommendations for depression that are consistent with clinical guidelines and outperform those of human primary care physicians. The study compared the evaluations and treatment recommendations for depression generated by ChatGPT-3 and ChatGPT-4 with those of primary care physicians.
Vignettes describing patients with different attributes and depression severity were input into the chatbot interfaces.
Why does this matter?
Compared with primary care physicians, ChatGPT showed no bias in recommendations based on patient gender or socioeconomic status. This means the chatbot was aligned well with accepted guidelines for managing mild and severe depression.
AI algorithms are powering the search for cells
A new paper in Nature details how AI-powered image analysis tools are changing the game for microscopy data. It highlights the evolution from early, labor-intensive methods to machine learning-based tools like CellProfiler, ilastik, and newer frameworks such as U-Net. These advancements enable more accurate and faster segmentation of cells, essential for various biological imaging experiments.
Cancer-cell nuclei (green boxes) picked out by software using deep learning.
Why does this matter?
The short study highlights the potential for AI-driven tools to revolutionize further biological analyses. The advancement is crucial for understanding diseases, drug development, and gaining insights into cellular behavior, enabling faster scientific discoveries in various fields like medicine and biology.
Google releases MedLM: Generative AI fine-tuned for healthcare
MedLM is a family of foundation models fine-tuned for the healthcare industry, generally available (via allowlist) to Google Cloud customers in the U.S. through Vertex AI. MedLM builds on Med-PaLM 2. Google will soon add Gemini-based models into the MedLM suite to offer even more capabilities.
Why does this matter?
Google isn’t done yet. While its impressive Gemini demo from last week may have been staged, Google is looking to fine-tune and improve Gemini based on developers’ feedback. In addition, it is also racing with rivals to push the boundaries of AI in various fields.
Google’s new medical AI, AMIE, beats doctors
Google developed Articulate Medical Intelligence Explorer (AMIE), an LLM-based research AI system optimized for diagnostic reasoning and conversations.
AMIE’s performance was compared to that of primary care physicians (PCPs) in a randomized, double-blind crossover study of text-based consultations with validated patient actors in the style of an Objective Structured Clinical Examination (OSCE). AMIE demonstrated greater diagnostic accuracy and superior performance on 28 of 32 axes according to specialist physicians and 24 of 26 axes according to patient actors.
Why does this matter?
While further research is required before AMIE can be translated to real-world settings, it represents a milestone towards conversational diagnostic AI. If successful, AI systems such as AMIE can be at the core of next-generation learning health systems that help scale world-class healthcare to everyone.
A Daily Chronicle of AI Innovations in January 2024.
Welcome to ‘Navigating the Future,’ a premier portal for insightful and up-to-the-minute commentary on the evolving world of Artificial Intelligence in January 2024. In an age where technology outpaces our expectations, we delve deep into the AI cosmos, offering daily snapshots of revolutionary breakthroughs, pivotal industry transitions, and the ingenious minds shaping our digital destiny. Join us on this exhilarating journey as we explore the marvels and pivotal milestones in AI, day by day. Stay informed, stay inspired, and witness the chronicle of AI as it unfolds in real-time.
Are you eager to expand your understanding of artificial intelligence? Look no further than the essential book “AI Unraveled: Master GPT-4, Gemini, Generative AI & LLMs – Simplified Guide for Everyday Users: Demystifying Artificial Intelligence – OpenAI, ChatGPT, Google Bard, AI ML Quiz, AI Certifications Prep, Prompt Engineering,” available at Etsy, Shopify, Apple, Google, or Amazon.
A Daily Chronicle of AI Innovations in January 2024 – Day 31: AI Daily News – January 31st, 2024
Microsoft CEO responds to AI-generated Taylor Swift fake nude images
Microsoft CEO Satya Nadella addresses the issue of AI-generated fake nude images of Taylor Swift, emphasizing the need for safety and guardrails in AI technology.
Microsoft CEO Satya Nadella acknowledges the need to act swiftly against nonconsensual deepfake images.
The AI-generated fake nude pictures of Taylor Swift have gained over 27 million views.
Microsoft, a major AI player, emphasizes the importance of online safety for both content creators and consumers.
Microsoft’s AI Code of Conduct prohibits creating adult or non-consensual intimate content. This policy is a part of the company’s commitment to ethical AI use and responsible content creation.
The deepfake images were reportedly created using Microsoft’s AI tool, Designer, which the company is investigating.
Microsoft is committed to enhancing content safety filters and addressing misuse of their services.
Elon Musk’s $56 billion pay package cancelled in court
A Delaware judge ruled against Elon Musk’s $56 billion pay package from Tesla, necessitating a new compensation proposal by the board.
The ruling, which could impact Musk’s wealth ranking, was based on the argument that shareholders were misled about the plan’s formulation and the board’s independence.
The case highlighted the extent of Musk’s influence over Tesla and its board, with key witnesses admitting they were cooperating with Musk rather than negotiating against him.
Google spent billions of dollars to lay people off
Google spent $2.1 billion on severance and other expenses for laying off over 12,000 employees in 2023, with an additional $700 million spent in early 2024 for further layoffs.
In 2023, Google achieved a 13 percent revenue increase year over year, amounting to $86 billion, with significant growth in its core digital ads, cloud computing businesses, and investments in generative AI.
The company also incurred a $1.8 billion cost for closing physical offices in 2023, and anticipates more layoffs in 2024 as it continues investing in AI technology under its “Gemini era”.
ChatGPT now lets you pull other GPTs into the chat
OpenAI introduced a feature allowing custom ChatGPT-powered chatbots to be tagged with an ‘@’ in the prompt, enabling easier switching between bots.
The ability to build and train custom GPT-powered chatbots was initially offered to OpenAI’s premium ChatGPT Plus subscribers in November 2023.
Despite the new feature and the GPT Store, custom GPTs currently account for only about 2.7% of ChatGPT’s worldwide web traffic, with a month-over-month decline in custom GPT traffic since November.
The NYT is building a team to explore AI in the newsroom
The New York Times is starting a team to investigate how generative AI can be used in its newsroom, led by newly appointed AI initiatives head Zach Seward.
This new team will comprise machine learning engineers, software engineers, designers, and editors to prototype AI applications for reporting and presentation of news.
Despite its complicated past with generative AI, including a lawsuit against OpenAI, the Times emphasizes that its journalism will continue to be created by human journalists.
The tiny Caribbean island making a fortune from AI
The AI boom has led to a significant increase in interest and sales of .ai domains, contributing approximately $3 million per month to Anguilla’s budget due to its association with artificial intelligence.
Vince Cate, a key figure in managing the .ai domain for Anguilla, highlights the surge in domain registrations following the release of ChatGPT, boosting the island’s revenue and making a substantial impact on its economy.
Unlike Tuvalu with its .tv domain, Anguilla manages its domain registrations locally, allowing the government to retain most of the revenue, which has been used for financial improvements such as paying down debt and eliminating property taxes on residential buildings.
A Daily Chronicle of AI Innovations in January 2024 – Day 30: AI Daily News – January 30th, 2024
Meta released Code Llama 70B, rivals GPT-4
Meta released Code Llama 70B, a new, more performant version of its LLM for code generation. It is available under the same license as previous Code Llama models.
CodeLlama-70B-Instruct achieves 67.8 on HumanEval, making it one of the highest-performing open models available today. CodeLlama-70B is the most performant base for fine-tuning code generation models.
Why does this matter?
This makes Code Llama 70B one of the best-performing open-source models for code generation, with a HumanEval score that edges past the results originally reported for GPT-4 and Gemini Pro. This can have a significant impact on code generation and the software development industry, as it offers a powerful and accessible tool for creating and improving code.
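For those who want to experiment, here is a hedged sketch of loading the instruct variant with Hugging Face transformers. The repository id and bundled chat template are assumed from how earlier Code Llama checkpoints were published, and a 70B model in bfloat16 needs roughly 140 GB of accelerator memory, so device_map="auto" is used to shard it across whatever GPUs are available.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hugging Face repo id for the instruct variant; confirm against Meta's release notes.
model_id = "codellama/CodeLlama-70b-Instruct-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # shard the 70B weights across available GPUs
)

messages = [{"role": "user",
             "content": "Write a Python function that checks whether a string is a palindrome."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256, do_sample=False)
# Strip the prompt tokens and print only the newly generated code.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```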
Neuralink implants its brain chip in the first human
In a first, Elon Musk’s brain-machine interface startup, Neuralink, has successfully implanted its brain chip in a human. In a post on X, Musk said “promising” brain activity had been detected after the procedure and the patient was “recovering well”.
The company’s goal is to connect human brains to computers to help tackle complex neurological conditions. It was given permission to test the chip on humans by the FDA in May 2023.
As Mr. Musk put it well, imagine if Stephen Hawking could communicate faster than a speed typist or auctioneer. That is the goal. This product will enable control of your phone or computer and, through them almost any device, just by thinking. Initial users will be those who have lost the use of their limbs.
Alibaba announces Qwen-VL; beats GPT-4V and Gemini
Alibaba’s Qwen-VL series has undergone a significant upgrade with the launch of two enhanced versions, Qwen-VL-Plus and Qwen-VL-Max. The key technical advancements in these versions include
Substantial boost in image-related reasoning capabilities;
Considerable enhancement in recognizing, extracting, and analyzing details within images and texts contained therein;
Support for high-definition images with resolutions above one million pixels and images of various aspect ratios.
Compared to the open-source version of Qwen-VL, these two models perform on par with Gemini Ultra and GPT-4V in multiple text-image multimodal tasks, significantly surpassing the previous best results from open-source models.
This sets new standards in the field of multimodal AI research and application. These models match the performance of GPT-4V and Gemini, outperforming all other open-source and proprietary models in many tasks.
What Else Is Happening in AI on January 30th, 2024
OpenAI partners with Common Sense Media to collaborate on AI guidelines.
OpenAI will work with Common Sense Media, the nonprofit organization that reviews and ranks the suitability of various media and tech for kids, to collaborate on AI guidelines and education materials for parents, educators, and young adults. It will curate “family-friendly” GPTs based on Common Sense’s rating and evaluation standards. (Link)
Apple’s ‘biggest’ iOS update may bring a lot of AI to iPhones.
Apple’s upcoming iOS 18 update is expected to be one of the biggest in the company’s history. It will leverage generative AI to provide a smarter Siri and enhance the Messages app. Apple Music, iWork apps, and Xcode will also incorporate AI-powered features. (Link)
Shortwave email client will show AI-powered summaries automatically.
Shortwave, an email client built by former Google engineers, is launching new AI-powered features such as instant summaries that will show up atop an email, a writing assistant to echo your writing and extending its AI assistant function to iOS and Android, and multi-select AI actions. All these features are rolling out starting this week. (Link)
OpenAI CEO Sam Altman explores AI chip collaboration with Samsung and SK Group.
Sam Altman has traveled to South Korea to meet with Samsung Electronics and SK Group to discuss the formation of an AI semiconductor alliance and investment opportunities. He is also said to have expressed a willingness to purchase HBM (High Bandwidth Memory) technology from them. (Link)
Generative AI is seen as helping to identify M&A targets, Bain says.
Deal makers are turning to AI and generative AI tools to source data, screen targets, and conduct due diligence at a time of heightened regulatory concerns around mergers and acquisitions, Bain & Co. said in its annual report on the industry. In the survey, 80% of respondents plan to use AI for deal-making. (Link)
Neuralink has implanted its first brain chip in human LINK
Elon Musk’s company Neuralink has successfully implanted its first device into a human.
The initial application of Neuralink’s technology is focused on helping people with quadriplegia control devices with their thoughts, using a fully-implantable, wireless brain-computer interface.
Neuralink’s broader vision includes facilitating human interaction with artificial intelligence via thought, though immediate efforts are targeted towards aiding individuals with specific neurological conditions.
OpenAI partners with Common Sense Media to collaborate on AI guidelines LINK
OpenAI announced a partnership with Common Sense Media to develop AI guidelines and create educational materials for parents, educators, and teens, including curating family-friendly GPTs in the GPT store.
The partnership was announced by OpenAI CEO Sam Altman and Common Sense Media CEO James Steyer at the Common Sense Summit for America’s Kids and Families in San Francisco.
Common Sense Media, which has started reviewing AI assistants including OpenAI’s ChatGPT, aims to guide safe and responsible AI use among families and educators without showing favoritism towards OpenAI.
New test detects ovarian cancer earlier thanks to AI LINK
Scientists have developed a 93% accurate early screening test for ovarian cancer using artificial intelligence and machine learning, promising improved early detection for this and potentially other cancers.
The test analyzes a woman’s metabolic profile to accurately assess the likelihood of having ovarian cancer, providing a more informative and precise diagnostic approach compared to traditional methods.
Georgia Tech researchers utilized machine learning and mass spectrometry to detect unique metabolite characteristics in the blood, enabling the early and accurate diagnosis of ovarian cancer, with optimism for application in other cancer types.
A Daily Chronicle of AI Innovations in January 2024 – Day 29: AI Daily News – January 29th, 2024
OpenAI reveals new models, drop prices, and fixes ‘lazy’ GPT-4
OpenAI announced a new generation of embedding models, new GPT-4 Turbo and moderation models, new API usage management tools, and lower pricing on GPT-3.5 Turbo.
Introducing new ways for developers to manage API keys and understand API usage
Quietly implemented a new ‘GPT mentions’ feature to ChatGPT (no official announcement yet). The feature allows users to integrate GPTs into a conversation by tagging them with an ‘@.’
The new embedding models and GPT-4 Turbo will likely enable more natural conversations and fluent text generation. Lower pricing and easier API management also open up access and usability for more developers.
Moreover, The updated GPT-4 Turbo preview model, gpt-4-0125-preview, can better complete tasks such as code generation compared to the previous model. The GPT-4 Turbo has been the object of many complaints about its performance, including claims that it was acting lazy. OpenAI has addressed that issue this time.
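For developers who want to try the new embedding models, here is a minimal sketch using the OpenAI Python SDK. The model name text-embedding-3-small matches what OpenAI announced; treat the optional dimensions parameter and the exact defaults as things to confirm in the current API documentation.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

resp = client.embeddings.create(
    model="text-embedding-3-small",  # one of the newly announced embedding models
    input=[
        "OpenAI cut GPT-3.5 Turbo prices",
        "New embedding models were released",
    ],
    dimensions=256,  # optional: the new models can return shortened embeddings
)

for item in resp.data:
    # Each item carries the input index and its embedding vector.
    print(item.index, len(item.embedding))  # e.g. "0 256"
```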
Prophetic – This company wants AI to enter your dreams
Prophetic introduces Morpheus-1, the world’s 1st ‘multimodal generative ultrasonic transformer’. This innovative AI device is crafted with the purpose of exploring human consciousness through controlling lucid dreams. Morpheus-1 monitors sleep phases and gathers dream data to enhance its AI model.
Morpheus-1 is not prompted with words and sentences but rather brain states. It generates ultrasonic holograms for neurostimulation to bring one to a lucid state.
Its 103M-parameter transformer model was trained on 8 GPUs for 2 days
Engineered from scratch with the provisional utility patent application
The device is set to be accessible to beta users in the spring of 2024.
Prophetic is pioneering new techniques for AI to understand and interface with the human mind by exploring human consciousness and dreams through neurostimulation and multimodal learning. This pushes boundaries to understand consciousness itself.
If Morpheus-1 succeeds, it could enable transformative applications of AI for expanding human potential and treating neurological conditions.
Also, this is the first model that can fully utilize the capabilities offered by multi-element ultrasonic transducers, orchestrating them to create ‘symphonies’ of neurostimulation.
This paper ‘MM-LLMs’ discusses recent advancements in MultiModal LLMs which combine language understanding with multimodal inputs or outputs. The authors provide an overview of the design and training of MM-LLMs, introduce 26 existing models, and review their performance on various benchmarks.
(The paper includes a timeline of existing MM-LLMs.)
They also share key training techniques to improve MM-LLMs and suggest future research directions. Additionally, they maintain a real-time tracking website for the latest developments in the field. This survey aims to facilitate further research and advancement in the MM-LLMs domain.
Why does this matter?
The overview of models, benchmarks, and techniques will accelerate research in this critical area. By integrating multiple modalities like image, video, and audio, these models can understand the world more comprehensively.
What Else Is Happening in AI on January 29th, 2024
Update from the LMSYS Chatbot Arena leaderboard on Hugging Face
Google’s Bard surpasses GPT-4, climbing to the second spot on the leaderboard! (Link)
Google Cloud has partnered with Hugging Face to advance Gen AI development
The partnership aims to meet the growing demand for AI tools and models that are optimized for specific tasks. Hugging Face’s repository of open-source AI software will be accessible to developers using Google Cloud’s infrastructure. The partnership reflects a trend of companies wanting to modify or build their own AI models rather than using off-the-shelf options. (Link)
Arc Search combines a browser, search engine, and AI for a unique browsing experience
Instead of returning a list of search queries, Arc Search builds a webpage with relevant information based on the search query. The app, developed by The Browser Company, is part of a bigger shift for their Arc browser, which is also introducing a cross-platform syncing system called Arc Anywhere. (Link)
PayPal is set to launch new AI-based products
The new products will use AI to enable merchants to reach new customers based on their shopping history and recommend personalized items in email receipts. (Link)
Apple Podcasts in iOS 17.4 now offers AI transcripts for almost every podcast
This is made possible by advancements in automatic speech recognition, which can convert spoken words into text. Users testing the beta version of iOS 17.4 have discovered that most podcasts in their library now come with transcripts. However, there are some exceptions, such as podcasts added from external sources. As this feature is still in beta, there is no information available regarding its implementation or accuracy. (Link)
Google’s Gemini Pro beats GPT-4
Google’s Gemini Pro has surpassed OpenAI’s GPT-4 on the Hugging Face Chatbot Arena leaderboard, securing the second position.
Gemini Pro is only the middle tier of Google’s planned models, with the top-tier Ultra expected to be released sometime soon.
Competition is heating up with Meta’s upcoming Llama 3, which is speculated to outperform GPT-4.
iOS 18 could be the ‘biggest’ software update in iPhone history
iOS 18 is predicted to be one of the most significant updates in iPhone history, with Apple planning major new AI-driven features and designs.
Apple is investing over $1 billion annually in AI development, aiming for an extensive overhaul of features like Siri, Messages, and Apple Music with AI improvements in 2024.
The update will introduce RCS messaging support, enhancing messaging between iPhones and Android devices by providing features like read receipts and higher-resolution media sharing.
Nvidia’s tech rivals are racing to cut their dependence on its chips
Amazon, Google, Meta, and Microsoft are developing their own AI chips to reduce dependence on Nvidia, which dominates the AI chip market and accounts for more than 70% of sales.
These tech giants are investing heavily in AI chip development to control costs, avoid shortages, and potentially sell access to their chips through their cloud services, while balancing their competition and partnership with Nvidia.
Nvidia sold 2.5 million chips last year, and its sales increased by 206% over the past year, adding about a trillion dollars in market value.
Amazon abandons $1.4 billion deal to buy Roomba maker iRobot
Amazon’s planned $1.4 billion acquisition of Roomba maker iRobot has been canceled due to lack of regulatory approval in the European Union, leading Amazon to pay a $94 million termination fee to iRobot.
iRobot announced a restructuring plan that includes laying off about 350 employees, which is roughly 31 percent of its workforce, and a shift in leadership with Glen Weinstein serving as interim CEO.
The European Commission’s concerns over potential restrictions on competition in the robot vacuum cleaner market led to the deal’s termination, emphasizing fears that Amazon could limit the visibility of competing products.
Arc Search combines browser, search engine, and AI into something new and different
Arc Search, developed by The Browser Company, unveiled an iOS app that combines browsing, searching, and AI to deliver comprehensive web page summaries based on user queries.
The app represents a shift towards integrating browser functionality with AI capabilities, offering features like “Browse for me” that automatically gathers and presents information from across the web.
While still in development, Arc Search aims to redefine web browsing by compiling websites into single, informative pages.
AlphaGeometry: An Olympiad Level AI System for Geometry by Google Deepmind
One of the signs of intelligence is the ability to solve mathematical problems, and that is exactly what Google has achieved with its new AlphaGeometry system. And not some basic maths problems, but problems from the International Mathematical Olympiad, one of the hardest maths competitions in the world. In today’s post, we take a deep dive into how Google pulls off this seemingly impossible task and try to answer whether we have truly created an AGI or not.
1. Problem Generation and Initial Analysis
Creation of a Geometric Diagram: AlphaGeometry starts by generating a geometric diagram. This could be a triangle with various lines and points marked, each with specific geometric properties.
Initial Feature Identification: Using its neural language model, AlphaGeometry identifies and labels basic geometric features like points, lines, angles, circles, etc.
2. Exhaustive Relationship Derivation
Pattern Recognition: The language model, trained on geometric data, recognizes patterns and potential relationships in the diagram, such as parallel lines, angle bisectors, or congruent triangles.
Formal Geometric Relationships: The symbolic deduction engine takes these initial observations and deduces formal geometric relationships, applying theorems and axioms of geometry.
3. Algebraic Translation and Gaussian Elimination
Translation to Algebraic Equations: Where necessary, geometric conditions are translated into algebraic equations. For instance, the properties of a triangle might be represented as a set of equations.
Applying Gaussian Elimination: In cases where solving a system of linear equations becomes essential, AlphaGeometry implicitly uses Gaussian elimination. This involves manipulating the rows of the equation matrix to derive solutions (a toy sketch of this step appears after this list).
Integration of Algebraic Solutions: The solutions from Gaussian elimination are then integrated back into the geometric context, aiding in further deductions or the completion of proofs.
4. Deductive Reasoning and Proof Construction
Further Deductions: The symbolic deduction engine continues to apply geometric logic to the problem, integrating the algebraic solutions and deriving new geometric properties or relationships.
Proof Construction: The system constructs a proof by logically arranging the deduced geometric properties and relationships. This is an iterative process, where the system might add auxiliary constructs or explore different reasoning paths.
5. Iterative Refinement and Traceback
Adding Constructs: If the current information is insufficient to reach a conclusion, the language model suggests adding new constructs (like a new line or point) to the diagram.
Traceback for Additional Constructs: In this iterative process, AlphaGeometry analyzes how these additional elements might lead to a solution, continuously refining its approach.
6. Verification and Readability Improvement
Solution Verification: Once a solution is found, it is verified for accuracy against the rules of geometry.
Improving Readability: Given that steps involving Gaussian elimination are not explicitly detailed, a current challenge and area for improvement is enhancing the readability of these solutions, possibly through higher-level abstraction or more detailed step-by-step explanation.
7. Learning and Data Generation
Synthetic Data Generation: Each problem solved contributes to a vast dataset of synthetic geometric problems and solutions, enriching AlphaGeometry’s learning base.
Training on Synthetic Data: This dataset allows the system to learn from a wide variety of geometric problems, enhancing its pattern recognition and deductive reasoning capabilities.
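To make step 3 above more concrete, here is a toy Python sketch of Gaussian elimination solving a small linear system of the kind that angle conditions can produce. It illustrates the technique only; it is not DeepMind's implementation.

```python
# Toy illustration (not DeepMind's code) of the Gaussian-elimination step
# described in point 3: once geometric conditions are written as linear
# equations (here, three unknown angles x, y, z), row reduction solves them.
def gaussian_elimination(A, b):
    """Solve A x = b with partial pivoting; A is n x n, b has length n."""
    n = len(A)
    # Build the augmented matrix [A | b].
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        # Pivot: swap in the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate this column from all rows below.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [mr - factor * mc for mr, mc in zip(M[r], M[col])]
    # Back-substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

# Example: x + y + z = 180 (angle sum), x - y = 20, y - z = 10.
print(gaussian_elimination([[1, 1, 1], [1, -1, 0], [0, 1, -1]], [180, 20, 10]))
```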
A Daily Chronicle of AI Innovations in January 2024 – Day 27: AI Daily News – January 27th, 2024
Taylor Swift deepfakes spark calls for new laws
US politicians have advocated for new legislation in response to the circulation of explicit deepfake images of Taylor Swift on social media, which were viewed millions of times.
X is actively removing the fake images of Taylor Swift and enforcing actions against the violators under its ‘zero-tolerance policy’ for such content.
Deepfakes have seen a 550% increase since 2019, with 99% of these targeting women, leading to growing concerns about their impact on emotional, financial, and reputational harm.
Spotify accuses Apple of ‘extortion’ with new App Store tax
Spotify criticizes Apple’s new app installation fee, calling it “extortion” and arguing it will hurt developers, especially those offering free apps.
The fee requires developers using third-party app stores to pay €0.50 for each annual app install after 1 million downloads, a cost Spotify says could significantly increase customer acquisition costs.
Apple defends the new fee structure, claiming it offers developers choice and maintains that more than 99% of developers would pay the same or less, despite widespread criticism.
Netflix co-CEO says Apple’s Vision Pro isn’t worth their time yet
Netflix co-CEO Greg Peters described the Apple Vision Pro as too “subscale” for the company to invest in, noting it’s not relevant for most Netflix members at this point.
Netflix has decided not to launch a dedicated app for the Vision Pro, suggesting users access Netflix through a web browser on the device instead.
The Vision Pro, priced at $3,499 and going on sale February 2, will offer native apps for several streaming services but not for Netflix, which also hasn’t updated its app for Meta’s Quest line in a while.
Scientists design a two-legged robot powered by muscle tissue
Scientists from Japan have developed a two-legged biohybrid robot powered by muscle tissues, enabling it to mimic human gait and perform tasks like walking and pivoting.
The robot, designed to operate underwater, combines lab-grown skeletal muscle tissues and silicone rubber materials to achieve movements through electrical stimulation.
The research, published in the journal Matter, marks progress in the field of biohybrid robotics, with future plans to enhance movement capabilities and sustain living tissues for air operation.
OpenAI and other tech giants will have to warn the US government when they start new AI projects
The Biden administration will require tech companies like OpenAI, Google, and Amazon to inform the US government about new AI projects employing substantial computing resources.
This government notification requirement is designed to provide insights into sensitive AI developments, including details on computing power usage and safety testing.
The mandate, stemming from a broader executive order from October, aims to enhance oversight over powerful AI model training, including those developed by foreign companies using US cloud computing services.
Stability AI introduces Stable LM 2 1.6B Nightshade, the data poisoning tool, is now available in v1 AlphaCodium: A code generation tool that beats human competitors Meta’s novel AI advances creative 3D applications ElevenLabs announces new AI products + Raised $80M TikTok’s Depth Anything sets new standards for Depth Estimation Google Chrome and Ads are getting new AI features Google Research presents Lumiere for SoTA video generation Binoculars can detect over 90% of ChatGPT-generated text Meta introduces guide on ‘Prompt Engineering with Llama 2′ NVIDIA’s AI RTX Video HDR transforms video to HDR quality Google introduces a model for orchestrating robotic agents
A Daily Chronicle of AI Innovations in January 2024 – Day 26: AI Daily News – January 26th, 2024
Tech Layoffs Surge to over 24,000 so far in 2024
The tech industry has seen nearly 24,000 layoffs in early 2024, more than doubling in one week. As giants cut staff, many are expanding in AI – raising concerns about automation’s impact. (Source)
Mass Job Cuts
Microsoft eliminated 1,900 gaming roles months after a $69B Activision buy.
Layoffs.fyi logs over 23,600 tech job cuts so far this year.
Morale suffers at Apple, Meta, Microsoft and more as layoffs mount.
AI Advances as Jobs Decline
Google, Amazon, Dataminr and Spotify made cuts while promoting new AI tools.
Neil C. Hughes: “Celebrating AI while slashing jobs raises questions.”
Firms shift resources toward generative AI like ChatGPT.
Concentrated Pain
Nearly 24,000 of the losses stemmed from just 82 companies.
In 2023, roughly 99 firms made cuts each month, so the pain was more widely distributed.
This year’s concentrated layoffs inflict severe damage on a smaller set of firms.
When everyone moves to AI-powered search, Google will have to change its monetization model; otherwise, $1.1 trillion disappears from the world economy every year
I was thinking recently that everything on the internet right now is there because someone wants to make money (ad revenue, subscriptions, affiliate marketing, SEO, etc.). If everyone uses AI-powered search, how exactly will this monetization model work? Nobody gets paid anymore.
The WordPress ecosystem is worth roughly $600B, Google Ads $200B, Shopify $220B, and affiliate marketing $17B – not to mention the infrastructure costs that will wobble until this gets fixed.
What kind of ad revenue or incentives can Google come up with to keep everyone happy once it rolls AI out to its search engine?
AI rolled out in India declares people dead, denies food to thousands
The deployment of AI in India’s welfare systems has mistakenly declared thousands of people dead, denying them access to subsidized food and welfare benefits.
Recap of what happened:
AI algorithms in Indian welfare systems have led to the removal of eligible beneficiaries, particularly affecting those dependent on food security and pension schemes.
The algorithms have made significant errors, such as falsely declaring people dead, resulting in the suspension of their welfare benefits.
The transition from manual identification and verification by government officials to AI algorithms has led to the removal of 1.9 million claimant cards in Telangana.
If AI models violate copyright, US federal courts could order them to be destroyed
TLDR: Under copyright law, courts do have the power to issue destruction orders. Copyright law has never been used to destroy AI models specifically, but the law has been increasingly open to the idea of targeting AI. It’s probably not going to happen to OpenAI but might possibly happen to other generative AI models in the future.
Microsoft, Amazon and Google face FTC inquiry over AI deals LINK
The FTC is investigating investments by big tech companies like Microsoft, Amazon, and Alphabet into AI firms OpenAI and Anthropic to assess their impact on competition in generative AI.
The FTC’s inquiry focuses on how these investments influence the competitive dynamics, product releases, and oversight within the AI sector, requesting detailed information from the involved companies.
Microsoft, Amazon, and Google have made significant investments in OpenAI and Anthropic, establishing partnerships that potentially affect market share, competition, and innovation in artificial intelligence.
OpenAI cures GPT-4 ‘laziness’ with new updates LINK
OpenAI updated GPT-4 Turbo to more thoroughly complete tasks like code generation, aiming to reduce its ‘laziness’ in task completion.
GPT-4 Turbo, distinct from the widely used GPT-4, benefits from data up to April 2023, while standard GPT-4 uses data until September 2021.
Future updates for GPT-4 Turbo will include general availability with vision capabilities and the launch of more efficient AI models, such as embeddings to enhance content relationship understanding.
A Daily Chronicle of AI Innovations in January 2024 – Day 25: AI Daily News – January 25th, 2024
Meta introduces guide on ‘Prompt Engineering with Llama 2′
Meta has introduced ‘Prompt Engineering with Llama 2’, an interactive guide created by Meta’s research teams that covers prompt engineering and best practices for developers, researchers, and enthusiasts working with LLMs to produce stronger outputs. It is a new resource created for the Llama community.
Having these resources helps the LLM community learn how to craft better prompts that lead to more useful model responses. Overall, it enables people to get more value from LLMs like Llama.
NVIDIA’s AI RTX Video HDR transforms video to HDR quality
NVIDIA has released AI RTX Video HDR, which transforms video to HDR quality and works alongside RTX Video Super Resolution. The HDR feature requires an HDR10-compliant monitor.
RTX Video HDR is available in Chromium-based browsers, including Google Chrome and Microsoft Edge. To enable the feature, users must download and install the January Studio driver, enable Windows HDR capabilities, and enable HDR in the NVIDIA Control Panel under “RTX Video Enhancement.”
Why does this matter?
AI RTX Video HDR provides a new way for people to enhance the Video viewing experience. Using AI to transform standard video into HDR quality makes the content look much more vivid and realistic. It also allows users to experience cinematic-quality video through commonly used web browsers.
Google introduces a model for orchestrating robotic agents
Google introduces AutoRT, a model for orchestrating large-scale robotic agents. It’s a system that uses existing foundation models to deploy robots in new scenarios with minimal human supervision. AutoRT leverages vision-language models for scene understanding and grounding and LLMs for proposing instructions to a fleet of robots.
By tapping into the knowledge of foundation models, AutoRT can reason about autonomy and safety while scaling up data collection for robot learning. The system successfully collects diverse data from over 20 robots in multiple buildings, demonstrating its ability to align with human preferences.
Why does this matter?
This allows for large-scale data collection and training of robotic systems while also reasoning about key factors like safety and human preferences. AutoRT represents a scalable approach to real-world robot learning that taps into the knowledge within foundation models. This could enable faster deployment of capable and safe robots across many industries.
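As a rough mental model of the orchestration loop described above (a VLM describes the scene, an LLM proposes and filters tasks, and execution is routed to a policy), here is a hedged sketch. Every function name in it is a placeholder invented for illustration; none corresponds to a published Google API.

```python
# Rough sketch of the AutoRT-style orchestration loop described above. Every
# function here is a placeholder for illustration; none is a real Google API.
from dataclasses import dataclass

@dataclass
class Task:
    instruction: str
    safe: bool
    feasible: bool

def describe_scene(image) -> str:
    """Stand-in for a vision-language model describing what the robot sees."""
    return "a table with a cup, a sponge, and a closed drawer"

def propose_tasks(scene: str) -> list[str]:
    """Stand-in for an LLM proposing candidate instructions for the scene."""
    return [
        "pick up the cup and place it in the drawer",
        "wipe the table with the sponge",
        "throw the cup across the room",   # should be rejected by the filter
    ]

def filter_task(instruction: str) -> Task:
    """Stand-in for the LLM critic that checks safety and affordance."""
    safe = "throw" not in instruction
    feasible = "fly" not in instruction
    return Task(instruction, safe, feasible)

def run_episode(image, execute_policy) -> None:
    scene = describe_scene(image)
    for candidate in propose_tasks(scene):
        task = filter_task(candidate)
        if task.safe and task.feasible:
            execute_policy(task.instruction)   # hand off to a learned or scripted policy

run_episode(image=None, execute_policy=print)
```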
January 2024 – Week 4 in AI: all the Major AI developments in a nutshell
Amazon presents Diffuse to Choose, a diffusion-based image-conditioned inpainting model that allows users to virtually place any e-commerce item in any setting, ensuring detailed, semantically coherent blending with realistic lighting and shadows. Code and demo will be released soon [Details].
OpenAI announced two new embedding models, new GPT-4 Turbo and moderation models, new API usage management tools, and lower pricing on GPT-3.5 Turbo. The updated GPT-4 Turbo preview model reduces cases of “laziness” where the model doesn’t complete a task. The new embedding models include a smaller and highly efficient text-embedding-3-small model, and a larger and more powerful text-embedding-3-large model. [Details].
Hugging Face and Google partner to support developers building AI applications [Details].
Adept introduced Adept Fuyu-Heavy, a new multimodal model designed specifically for digital agents. Fuyu-Heavy scores higher on the MMMU benchmark than Gemini Pro [Details].
Fireworks.ai has open-sourced FireLLaVA, a LLaVA multi-modality model trained on instruction-following data generated by open-source LLMs, released under a commercially permissive license. Fireworks.ai is also providing both the completions API and chat completions API to developers [Details].
01.AI released Yi Vision Language (Yi-VL) model, an open-source, multimodal version of the Yi Large Language Model (LLM) series, enabling content comprehension, recognition, and multi-round conversations about images. Yi-VL adopts the LLaVA architecture and is free for commercial use. Yi-VL-34B is the first open-source 34B vision language model worldwide [Details].
Tencent AI Lab introduced WebVoyager, an innovative Large Multimodal Model (LMM) powered web agent that can complete user instructions end-to-end by interacting with real-world websites [Paper].
Prophetic introduced MORPHEUS-1, a multi-modal generative ultrasonic transformer model designed to induce and stabilize lucid dreams from brain states. Instead of generating words, Morpheus-1 generates ultrasonic holograms for neurostimulation to bring one to a lucid state [Details].
Google Research presented Lumiere – a space-time video diffusion model for text-to-video, image-to-video, stylized generation, inpainting and cinemagraphs [Details].
TikTok released Depth Anything, an image-based depth estimation method trained on 1.5M labeled images and 62M+ unlabeled images jointly [Details].
Nightshade, the free tool that ‘poisons’ AI models, is now available for artists to use [Details].
Stability AI released Stable LM 2 1.6B, 1.6 billion parameter small language model trained on multilingual data in English, Spanish, German, Italian, French, Portuguese, and Dutch. Stable LM 2 1.6B can be used now both commercially and non-commercially with a Stability AI Membership [Details].
Etsy launched ‘Gift Mode,’ an AI-powered feature designed to match users with tailored gift ideas based on specific preferences [Details].
Google DeepMind presented AutoRT, a framework that uses foundation models to scale up the deployment of operational robots in completely unseen scenarios with minimal human supervision. In AutoRT, a VLM describes the scene, an LLM generates robot goals and filters for affordance and safety, then routes execution to policies [Details].
Google Chrome gains AI features, including a writing helper, theme creator, and tab organizer [Details].
Tencent AI Lab released VideoCrafter2 for high-quality text-to-video generation, featuring major improvements in visual quality, motion, and concept composition compared to VideoCrafter1 [Details | Demo]
Google opens beta access to the conversational experience, a new chat-based feature in Google Ads, for English language advertisers in the U.S. & U.K. It will let advertisers create optimized Search campaigns from their website URL by generating relevant ad content, including creatives and keywords [Details].
What Else Is Happening in AI on January 25th, 2024
Google’s Gradient invests $2.4M in Send AI for enterprise data extraction
Dutch startup Send AI has secured €2.2m ($2.4M) in funding from Google’s Gradient Ventures and Keen Venture Partners to develop its document processing platform. The company uses small, open-source AI models to help enterprises extract data from complex documents, such as PDFs and paper files. (Link)
Google Arts & Culture has launched Art Selfie 2
A feature that uses Gen AI to create stylized images around users’ selfies. With over 25 styles, users can see themselves as an explorer, a muse, or a medieval knight. It also provides topical facts and allows users to explore related stories and artifacts. (Link)
Google announced new AI features for education @ Bett ed-tech event in the UK
These features include AI suggestions for questions at different timestamps in YouTube videos and the ability to turn a Google Form into a practice set with AI-generated answers and hints. Google is also introducing the Duet AI tool to assist teachers in creating lesson plans. (Link)
Etsy has launched a new AI feature, “Gift Mode”
Which generates over 200 gift guides based on specific preferences. Users can take an online quiz to provide information about who they are shopping for, the occasion, and the recipient’s interests. The feature then generates personalized gift guides from the millions of items listed on the platform. The feature leverages machine learning and OpenAI’s GPT-4. (Link)
Three Google DeepMind researchers have left the company to start their own AI startup named ‘Uncharted Labs’
The team, consisting of David Ding, Charlie Nash, and Yaroslav Ganin, previously worked on generative AI systems for images and music at Google. They have already raised $8.5M of their $10M goal. (Link)
Apple’s plans to bring gen AI to iPhones
Apple is intensifying its AI efforts, acquiring 21 AI start-ups since 2017, including WaveOne for AI-powered video compression, and hiring top AI talent.
The company’s approach includes developing AI technologies for mobile devices, aiming to run AI chatbots and apps directly on iPhones rather than relying on cloud services, with significant job postings in deep learning and large language models.
Apple is also enhancing its hardware, like the M3 Max processor and A17 Pro chip, to support generative AI, and has made advancements in running large language models on-device using Flash memory. Source
OpenAI went back on a promise to make key documents public
OpenAI, initially committed to transparency, has backed away from making key documents public, as evidenced by WIRED’s unsuccessful attempt to access governing documents and financial statements.
The company’s reduced transparency conceals internal issues, including CEO Sam Altman’s controversial firing and reinstatement, and the restructuring of its board.
Since creating a for-profit subsidiary in 2019, OpenAI’s shift from openness has sparked criticism, including from co-founder Elon Musk, and raised concerns about its governance and conflict of interest policies. Source
Google unveils AI video generator Lumiere
Google introduces Lumiere, a new AI video generator that uses an innovative “space-time diffusion model” to create highly realistic and imaginative five-second videos.
Lumiere stands out for its ability to efficiently synthesize entire videos in one seamless process, showcasing features like transforming text prompts into videos and animating still images.
The unveiling of Lumiere highlights the ongoing advancements in AI video generation technology and the potential challenges in ensuring its ethical and responsible use. Source
Ring will no longer allow police to request doorbell camera footage from users. Source
Amazon’s Ring is discontinuing its Request for Assistance program, stopping police from soliciting doorbell camera footage via the Neighbors app.
Authorities must now file formal legal requests to access Ring surveillance videos, instead of directly asking users within the app.
Privacy advocates recognize Ring’s decision as a progressive move, but also note that it doesn’t fully address broader concerns about surveillance and user privacy.
AI rolled out in India declares people dead, denies food to thousands
In India, AI has mistakenly declared thousands of people dead, leading to the denial of essential food and pension benefits.
The algorithm, designed to find welfare fraud, removed 1.9 million from the beneficiary list, but later analysis showed about 7% were wrongfully cut.
Out of 66,000 stopped pensions in Haryana due to an algorithmic error, 70% were found to be incorrect, placing the burden of proof on beneficiaries to reinstate their status. Source
A Daily Chronicle of AI Innovations in January 2024 – Day 24: AI Daily News – January 24th, 2024
Google Chrome and Ads are getting new AI features
Google Chrome is getting 3 new experimental generative AI features:
Smartly organize your tabs: With Tab Organizer, Chrome will automatically suggest and create tab groups based on your open tabs.
Create your own themes with AI: You’ll be able to quickly generate custom themes based on a subject, mood, visual style and color that you choose – no need to become an AI prompt expert!
Get help drafting things on the web: A new feature will help you write with more confidence on the web – whether you want to leave a well-written review for a restaurant, craft a friendly RSVP for a party, or make a formal inquiry about an apartment rental.
In addition, Gemini will now power the conversational experience within the Google Ads platform. With this new update, it will be easier for advertisers to quickly build and scale Search ad campaigns.
Google Research presents Lumiere for SoTA video generation
Lumiere is a text-to-video (T2V) diffusion model designed for synthesizing videos that portray realistic, diverse, and coherent motion– a pivotal challenge in video synthesis. It demonstrates state-of-the-art T2V generation results and shows that the design easily facilitates a wide range of content creation tasks and video editing applications.
The approach introduces a new T2V diffusion framework that generates the full temporal duration of the video at once. This is achieved by using a Space-Time U-Net (STUNet) architecture that learns to downsample the signal in both space and time, and performs the majority of its computation in a compact space-time representation.
Why does this matter?
Despite tremendous progress, training large-scale T2V foundation models remains an open challenge due to the added complexities that motion introduces. Existing T2V models often use cascaded designs but face limitations in generating globally coherent motion. This new approach aims to overcome the limitations associated with cascaded training regimens and improve the overall quality of motion synthesis.
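To illustrate what "downsampling in both space and time" means in practice, here is a toy PyTorch block that halves a video tensor's temporal and spatial resolution with a single strided 3D convolution. It is a minimal sketch of the idea only, not Google's STUNet code.

```python
# Toy PyTorch block (not Google's code) illustrating the space-time downsampling
# idea behind a Space-Time U-Net: one strided 3D convolution halves the temporal
# and spatial resolution of a video tensor at the same time.
import torch
import torch.nn as nn

class SpaceTimeDownsample(nn.Module):
    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        # Kernel and stride cover (time, height, width); stride 2 halves all three.
        self.conv = nn.Conv3d(in_channels, out_channels,
                              kernel_size=3, stride=2, padding=1)
        self.act = nn.SiLU()

    def forward(self, video: torch.Tensor) -> torch.Tensor:
        # video: (batch, channels, frames, height, width)
        return self.act(self.conv(video))

block = SpaceTimeDownsample(3, 64)
clip = torch.randn(1, 3, 16, 64, 64)   # 16 frames of 64x64 RGB
print(block(clip).shape)                # torch.Size([1, 64, 8, 32, 32])
```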
Binoculars can detect over 90% of ChatGPT-generated text
Researchers have introduced a novel LLM detector that only requires simple calculations using a pair of pre-trained LLMs. The method, called Binoculars, achieves state-of-the-art accuracy without any training data.
It is capable of spotting machine text from a range of modern LLMs without any model-specific modifications. Researchers comprehensively evaluated Binoculars on a number of text sources and in varied situations. Over a wide range of document types, Binoculars detects over 90% of generated samples from ChatGPT (and other LLMs) at a false positive rate of 0.01%, despite not being trained on any ChatGPT data.
Why does this matter?
A common first step in harm reduction for generative AI is detection. Binoculars excel in zero-shot settings where no data from the model being detected is available. This is particularly advantageous as the number of LLMs grows rapidly. Binoculars’ ability to detect multiple LLMs using a single detector proves valuable in practical applications, such as platform moderation.
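For a rough sense of the calculation, the sketch below pairs two small causal language models with Hugging Face transformers and compares an observer model's log-perplexity on a text with a cross-entropy between the two models' next-token distributions. The model choice (gpt2/distilgpt2) and the exact scoring details are illustrative assumptions, not the Binoculars authors' released code.

```python
# Rough sketch of the Binoculars idea (not the authors' code): compare an observer
# model's log-perplexity on a text with the cross-entropy between the observer's
# and a performer's next-token distributions. The paper pairs two closely related
# LLMs; gpt2/distilgpt2 are stand-ins here purely for illustration.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

observer = AutoModelForCausalLM.from_pretrained("gpt2").eval()
performer = AutoModelForCausalLM.from_pretrained("distilgpt2").eval()
tokenizer = AutoTokenizer.from_pretrained("gpt2")  # both models share this vocab

@torch.no_grad()
def binoculars_like_score(text: str) -> float:
    ids = tokenizer(text, return_tensors="pt").input_ids
    obs_logits = observer(ids).logits[:, :-1]    # predictions for tokens 1..n-1
    perf_logits = performer(ids).logits[:, :-1]
    targets = ids[:, 1:]

    # Log-perplexity: observer's average negative log-likelihood of the text.
    log_ppl = F.cross_entropy(obs_logits.reshape(-1, obs_logits.size(-1)),
                              targets.reshape(-1))

    # Cross-entropy between the performer's and observer's token distributions.
    perf_probs = F.softmax(perf_logits, dim=-1)
    log_x_ppl = -(perf_probs * F.log_softmax(obs_logits, dim=-1)).sum(-1).mean()

    return (log_ppl / log_x_ppl).item()   # lower values suggest machine-generated text

print(binoculars_like_score("The quick brown fox jumps over the lazy dog."))
```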
What Else Is Happening in AI on January 24th, 2024
Microsoft forms a team to make generative AI cheaper.
Microsoft has formed a new team to develop conversational AI that requires less computing power compared to the software it is using from OpenAI. It has moved several top AI developers from its research group to the new GenAI team. (Link)
Sevilla FC transforms the player recruitment process with IBM WatsonX.
Sevilla FC introduced Scout Advisor, an innovative generative AI tool that it will use to provide its scouting team with a comprehensive, data-driven identification and evaluation of potential recruits. Built on watsonx, Sevilla FC’s Scout Advisor will integrate with their existing suite of self-developed data-intensive applications. (Link)
SAP will restructure 8,000 roles in a push towards AI.
SAP unveiled a $2.2 billion restructuring program for 2024 that will affect 8,000 roles, as it seeks to better focus on growth in AI-driven business areas. It would be implemented primarily through voluntary leave programs and internal re-skilling measures. SAP expects to exit 2024 with a headcount “similar to the current levels”. (Link)
Kin.art launches a free tool to prevent GenAI models from training on artwork.
Kin.art uses image segmentation (i.e., concealing parts of artwork) and tag randomization (swapping an art piece’s image metatags) to interfere with the model training process. While the tool is free, artists have to upload their artwork to Kin.art’s portfolio platform in order to use it. (Link)
Google cancels contract with an AI data firm that’s helped train Bard.
Google ended its contract with Appen, an Australian data company involved in training its LLM AI tools used in Bard, Search, and other products. The decision was made as part of its ongoing effort to evaluate and adjust many supplier partnerships across Alphabet to ensure vendor operations are as efficient as possible. (Link)
A Daily Chronicle of AI Innovations in January 2024 – Day 23: AI Daily News – January 23rd, 2024
Meta’s novel AI advances creative 3D applications
The paper introduces a new shape representation called Mosaic-SDF (M-SDF) for 3D generative models. M-SDF approximates a shape’s Signed Distance Function (SDF) using local grids near the shape’s boundary.
This representation is:
Fast to compute
Parameter efficient
Compatible with Transformer-based architectures
The efficacy of M-SDF is demonstrated by training a 3D generative flow model with the 3D Warehouse dataset and text-to-3D generation using caption-shape pairs.
M-SDF provides an efficient 3D shape representation for unlocking AI’s generative potential in the area, which could significantly advance creative 3D applications. Overall, M-SDF opens up new possibilities for deep 3D learning by bringing the representational power of transformers to 3D shape modeling and generation.
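To illustrate the local-grid idea, here is a toy NumPy sketch that stores signed-distance values on small grids centred at points near a shape's surface, using an analytic sphere SDF as a stand-in for a learned shape. It is not Meta's implementation, only a sketch of the representation the paper describes.

```python
# Toy illustration (not Meta's code) of the local-grid idea behind Mosaic-SDF:
# instead of one dense volume, store many small grids of signed-distance values
# centred on points near the shape's surface. A unit sphere's analytic SDF
# stands in for a real learned shape here.
import numpy as np

def sphere_sdf(points: np.ndarray, radius: float = 1.0) -> np.ndarray:
    return np.linalg.norm(points, axis=-1) - radius

def local_sdf_grid(center: np.ndarray, size: float = 0.2, k: int = 7) -> np.ndarray:
    """Sample the SDF on a k x k x k grid of side `size` centred at `center`."""
    axis = np.linspace(-size / 2, size / 2, k)
    offsets = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)
    return sphere_sdf(center + offsets)

# Pick surface points (random directions projected onto the sphere) and attach a
# small SDF grid to each, giving a 'mosaic' of local volumes near the boundary.
rng = np.random.default_rng(0)
directions = rng.normal(size=(32, 3))
surface_points = directions / np.linalg.norm(directions, axis=1, keepdims=True)
mosaic = np.stack([local_sdf_grid(p) for p in surface_points])

print(mosaic.shape)                   # (32, 7, 7, 7): 32 local grids of SDF values
print(float(np.abs(mosaic).max()))    # values stay small because grids hug the surface
```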
ElevenLabs announces new AI products + Raised $80M
ElevenLabs has raised $80 million in a Series B funding round co-led by Andreessen Horowitz, Nat Friedman, and Daniel Gross. The funding will strengthen the company’s position as a voice AI research and product development leader.
ElevenLabs has also announced the release of new AI products, including a Dubbing Studio, a Voice Library marketplace, and a Mobile Reader App.