Download the Ace AWS DEA-C01 Exam App: iOS - Android
AI Innovations in August 2024.
The AI revolution continues to blaze through 2024. June was a month of monumental strides, marked by breakthroughs in quantum AI, autonomous medical drones, and natural language processing. But the AI landscape is a dynamic one, and July has already proven to be no exception.
This month, we’re diving deep into the latest AI developments, from groundbreaking research to real-world applications. We’ll explore how AI is reshaping industries, addressing global challenges, and redefining what’s possible. Join us as we uncover the stories behind the headlines and analyze the implications of these innovations for society.
Whether you’re an AI expert or just curious about the future, this blog is your go-to source for the most up-to-date insights. Stay tuned for daily updates as we navigate the exciting world of artificial intelligence together.
A Daily Chronicle of AI Innovations on August 30th 2024
Apple and Nvidia may invest in OpenAI
Amazon’s new Alexa voice assistant will use Claude AI
OpenAI and Anthropic will share their models with the US government
Google is working on AI that can hear signs of sickness
OpenAI and Anthropic partner with US gov
China’s new Qwen2 beats GPT-4o
AI startup reaches 100M token context
China’s new Qwen2 beats GPT-4o
Alibaba just unveiled Qwen2-VL, a new vision-language AI model that outperforms GPT-4o in several benchmarks — particularly excelling in document comprehension and multilingual text-image understanding.
Qwen2-VL can understand images of various resolutions and ratios, as well as videos over 20 minutes long.
The model excels particularly at complex tasks such as college-level problem-solving, mathematical reasoning, and document analysis.
It also supports multilingual text understanding in images, including most European languages, Japanese, Korean, Arabic, and Vietnamese.
You can try Qwen2-VL on Hugging Face, with more information on the official announcement blog.
There’s yet another new contender in the state-of-the-art AI model arena, and it comes from China’s Alibaba. Qwen2-VL’s ability to understand diverse visual inputs and multilingual requests could lead to more sophisticated, globally accessible AI applications.
Apple and Nvidia are reportedly in talks to participate in a significant funding round for OpenAI, with Apple planning to integrate ChatGPT into iOS and Nvidia being a key supplier of the chips that power OpenAI’s AI services.
Apple, which had earlier considered appointing Phil Schiller to OpenAI’s board before abandoning the plan, is looking to deepen its involvement with OpenAI as it prepares to enhance Siri with ChatGPT capabilities later this year.
Nvidia, whose hardware is essential for OpenAI’s operations, is also considering investing in this funding round, joining Microsoft, which has been a major investor in OpenAI since 2019 and made another substantial investment in 2023.
OpenAI and Anthropic just signed a groundbreaking agreement with the U.S. Artificial Intelligence Safety Institute to allow government access and testing of their AI models before public release.
The U.S. AI Safety Institute will have access to major new models from both companies prior to and after their public release.
This collaboration is a step toward AI regulation and safety efforts, with the U.S. government evaluating AI models’ capabilities and associated risks.
The institute will provide feedback to OpenAI and Anthropic on potential safety improvements that should be made.
These agreements come as AI companies face increasing regulatory scrutiny, with California legislators recently passing a broad AI regulation bill earlier today.
The two most popular AI companies in the world are granting the U.S. government access to unreleased models before release. This could reshape how AI is developed, tested, and deployed worldwide, with major implications around innovation, safety, and international competition in the AI space, for better or worse.
Amazon’s new Alexa voice assistant will use Claude AI
Amazon’s new voice assistant, “Remarkable Alexa,” will launch in October and be powered by Anthropic’s Claude AI, offering a subscription-based service.
The existing Alexa model struggled with accuracy, leading Amazon to invest in Anthropic’s AI technology after facing internal technical and bureaucratic issues.
Remarkable Alexa is set to feature daily AI-generated news summaries, a child-focused chatbot, and conversational shopping tools, with a demo planned for Amazon’s September event.
Magic just developed LTM-2-mini, a model capable of processing 100 million tokens of context — equivalent to about 10 million lines of code or 750 novels — and partnered with Google Cloud to build advanced AI supercomputers.
LTM-2-mini can process and understand 100 million tokens of context given during inference, surpassing current models by 50x.
The model’s innovative algorithm processes long sequences of data 1000x more efficiently than the current top-performing AI models.
Magic is also partnering with Google Cloud to build supercomputers powered by Nvidia’s newest and most advanced GPUs.
The company has raised more than $450 million in total funding, including a recent $320 million investment round.
This breakthrough in context length allows AI agents to process and reason over dense and complicated codebases, vast databases, and years of conversation history in a single inference. It’s a significant step toward creating AI assistants with near-perfect recall and memory.
Google is working on AI that can hear signs of sickness
Google is developing artificial intelligence technology that can detect early signs of illness by analyzing sound signals like coughs and sniffles.
The AI model is trained with 300 million audio samples and can identify diseases such as tuberculosis by recognizing specific audio patterns of labored breathing.
Google has partnered with Salcit Technologies, an AI startup in India, to integrate this technology into smartphones to assist high-risk populations in areas with limited healthcare access.
Anthropic’s Prompt Engineering Interactive Tutorial: a digital platform designed to teach users how to effectively craft prompts for AI applications, enhancing user interaction and efficiency.
Documents reveal state-linked Chinese entities are using cloud services from AWS or its rivals to access advanced US chips and AI models they cannot acquire otherwise.
California lawmakersapproved a bill proposing sweeping AI regulations, including safety testing requirements and potential legal consequences for harmful AI systems.
A Daily Chronicle of AI Innovations on August 29th 2024
AI creates DOOM video game in real-time
OpenAI raises at $100B valuation
AI spots cancer earlier than ever
Nvidia just showed how hard it is to be the AI king
Google researchers run Doom on a self-generating AI model
Midjourney says it’s ‘getting into hardware’
OpenAI aims for $100B+ valuation in new funding round
Major websites reject Apple AI data scraping
AI creates DOOM video game in real-time
Google researchers just developed GameNGen, an AI system that can simulate the classic game DOOM in real-time, running at over 20 frames per second and producing visuals nearly indistinguishable from the original game.
GameNGen produces playable gameplay at 20 frames per second on a single chip, with each frame predicted by a diffusion model.
The AI was trained on 900M frames of gameplay data, resulting in 3-second clips almost indistinguishable from the actual game by playtesters.
Running on a single TPU, GameNGen handles Doom’s 3D environments and fast-paced action without traditional game engine components.
In tests, human raters could barely distinguish between short clips of the AI simulation and the actual game.
GameNGen is the first AI model that can generate a complex and playable video game in real-time without any underlying real game engine. We’re at the fascinating time where soon, AI will be able to create entire games on the fly, personalized to each player.
OpenAI is reportedly in talks to raise a new funding round at a valuation exceeding $100 billion, led by Thrive Capital, with Microsoft also expected to participate.
The potential valuation of over $100 billion would be significantly higher than OpenAI’s previous $86 billion valuation.
Thrive Capital is expected to invest around $1 billion in this round.
OpenAI’s annualized revenue reportedly surpassed $3.4 billion earlier this year.
The company is still, however, projected to lose nearly $5 billion by the end of the year and has already spent $8.5 billion on AI training and staffing.
Building AI is expensive, and raising billions of dollars at a $100B+ valuation would silence OpenAI’s critics who insist that the company is on its downfall. The increased valuation also suggests that the company has potential hidden breakthroughs behind the scenes, such as Project Strawberry and Orion.
Researchers recently developed an AI tool called AINU that can differentiate cancer cells from normal cells and detect early stages of viral infection, by analyzing high-resolution images of cell nuclei.
AINU uses a convolutional neural network to analyze images captured by STORM microscopy, which offers nanoscale resolution.
The AI can detect structural changes in cells as small as 20 nanometers, 5,000 times smaller than a human hair’s width.
AINU also detected viral infections (herpes simplex virus type-1) just one hour after infection by observing subtle changes in DNA packing.
The tool can accurately identify stem cells too, which could accelerate stem cell research without relying on animal testing.
Yesterday, researchers revealed an AI tool to help with early dementia detection, and now AI is detecting cancer cells at a nanoscale level. Clinical applications may be years away, but AI healthcare breakthroughs like AINU are only accelerating — and will dramatically revolutionize scientific research in the coming years.
Nvidia just showed how hard it is to be the AI king
Nvidia achieved strong second-quarter results by more than doubling its revenue compared to the same period last year, but industry experts anticipated these outcomes due to ongoing investments in AI by tech companies.
Despite reporting $30.04 billion in revenue, which surpassed analyst expectations, Nvidia’s stock fell 6.9% after hours due to investor concerns and sky-high expectations.
Issues like shipment delays for Nvidia’s upcoming Blackwell GPUs and slightly lower-than-expected revenue projections for the next quarter also contributed to investor unease, as noted by multiple analysts.
Midjourney, known for its AI image-generation tool, announced it is entering the hardware market and invited job seekers to join its new division.
The announcement was made on Midjourney’s official X account, revealing that founder David Holz and new hire Ahmad Abbas, a former Apple hardware manager, will lead the hardware efforts.
Midjourney hinted at multiple ongoing projects and the possibility of new form factors, though no specific timeline or further details have been provided yet.
OpenAI aims for $100B+ valuation in new funding round
OpenAI is reportedly negotiating with venture capital firms to raise a large sum of money, potentially valuing the company at over $100 billion.
Thrive Capital plans to invest $1 billion in this funding round, and Microsoft is also expected to contribute additional funds, as reported by The Wall Street Journal.
If successful, this would be the most substantial new capital for OpenAI since Microsoft’s $10 billion investment in January 2023, with OpenAI’s valuation potentially exceeding $103 billion based on recent negotiations.
Many of the largest websites, such as Facebook, Instagram, and The New York Times, have opted out of Apple’s AI training by using the Applebot-Extended tag to exclude their content.
Apple allows publishers to easily opt out of content scraping for Apple Intelligence training through a publicly-accessible robots.txt file, ensuring their data is not used for AI purposes.
Apple’s use of Applebot for AI training is designed to be ethical, with mechanisms to filter out personal data and a system for web publishers to prevent their data from being utilized.
A Daily Chronicle of AI Innovations on August 28th 2024
OpenAI prepares ‘Project Strawberry’
Google launches trio of new models
😯Google AI-Powered Interview Warmup
Create an AI prompt optimizer GPT
AI tools help early dementia detection
📈 Nvidia earnings to test AI boom
Google Meet will now take notes for you
OpenAI prepares ‘Project Strawberry’
OpenAI researchers are preparing to launch a new AI model, code-named Strawberry (previously Q*), that demonstrates superior reasoning capabilities in solving complex problems, according to a new report via The Information.
Project Strawberry could be integrated into ChatGPT as soon as this fall, marking a significant leap in AI intelligence.
Given extra “thinking” time, Strawberry can tackle subjective topics and solve complex puzzles like the New York Times Connections.
OpenAI is using Strawberry to generate high-quality training data for another secretive upcoming LLM, reportedly code-named Orion.
The new AI model could enhance OpenAI’s development of AI agents, potentially automating multi-step tasks more effectively.
If Strawberry lives up to the leaks, it could mark a significant leap in AI reasoning capabilities, potentially advancing OpenAI towards Stage 2 of its five-level roadmap to AGI. With ChatGPT reported to gain these capabilities this fall, we’re likely on the verge of seeing the next major wave of AI disruption.
Google Meet’s new AI-powered feature, “take notes for me,” has started rolling out today, summarizing meetings for Google Workspace customers with specific add-ons and initially announced at the 2023 Cloud Next conference.
This feature automatically generates a Google Doc with meeting notes, attaches it to the calendar event, and sends it to the meeting organizer and participants who activated the tool, although it currently supports only spoken English.
Google predicts the feature will be available to all Google Workspace customers by September 10th, 2024, but there are concerns about its accuracy, given the performance of similar transcription tools in the past.
Google just released three new experimental Gemini 1.5 models, including a compact 8B parameter version, an improved Pro model, and an enhanced Flash model — all available for developers on Google AI Studio.
Gemini 1.5 Flash-8B is a smaller, faster model that can handle text, images, and other data types efficiently for super quick responses while processing a lot of information.
The updated Gemini 1.5 Pro model is now better at writing code and understanding complex instructions.
An improved Gemini 1.5 Flash model offers overall enhancements, performing better on Google’s internal tests across various tasks.
The upgraded Gemini 1.5 Pro model now ranks as #2, and the new Gemini 1.5 Flash ranks as #6 on the Chatbot Arena leaderboard.
While OpenAI is leaving everyone waiting, Google has been shipping out constant upgrades and new features to its AI offerings. These new enhancements give Gemini 1.5 Flash big improvements overall and Gemini 1.5 Pro new upgrades in math, coding, and responding to longer prompts.
Google actually runs this tasty thing called “Interview Warmup.” It’s an AI-powered training tool for your next big interview. It throws real questions based on your discipline: UX, data and analytics, cybersecurity, etc. Then, the magic kicks in, evaluating your audio answers and sending back recommendations on things like framing your qualifications to supporting your impact.
5 questions. Get some analysis. Build some confidence. Easy, right? 🌟
Oh. And for the tech-oriented: Also make sure you check this site out, too. Videos, former (real) interview questions, the works. Interview Prep – Google Tech Dev Guide
OpenAI’s Custom GPTs allow premium users to create AI assistants that can optimize prompts for other AI creative tools such as Midjourney for AI image generation or Gen-3 for AI video generation.
Log into your ChatGPT Plus account and click “Explore GPTs”, then click “Create”.
Name your GPT and add a brief description.
In the Instructions, paste: “User is using an AI video generator called [Tool Name]. You need to craft a perfect prompt for the topic they ask by following the prompting guide below. The prompt needs to follow the format provided in the guide.”
Test your GPT in the preview panel, then click “Create” to finalize and choose sharing options.
Hot tip: Add a complete prompting guide for your chosen AI tool (e.g. Runway’s Gen-3 prompting guide)
Scientists from the Universities of Edinburgh and Dundee are launching a massive AI-driven study of over 1.6 million brain scans to develop tools for early dementia prediction and diagnosis.
The project, called NEURii, will use AI and machine learning to analyze CT and MRI scans from Scottish patients over the past decade.
Researchers aim to create digital tools for radiologists to assess dementia risk during routine scans.
The study will match image data with linked health records to identify patterns associated with dementia risk.
With global dementia cases projected to reach 153 million by 2050, this research could significantly impact early intervention and treatment development.
This week alone, we’ve seen AI developing new cancer drugs, 3D printing lifelike human organs, and now creating tools for early dementia detection. As AI rapidly advances in healthcare, we’re accelerating into a new era of personalized medicine and preventative care.
There have been several negative reports ahead of Nvidia’s earnings, ranging from supply chain/design challenges to concerns about use cases and applications. However, one thing we learned from discussions with customers is that demand is still extremely constrained.
Key topics ahead of the results:
1. Will the Hopper architecture stay stronger for longer? 2. Is Blackwell really delayed? 3. What is the upside if the company can deliver on the systems orders?
Here are some thoughts on each:
1. Key players like Microsoft, Snowflake, and Tesla highlighted tight capacity for GPUs and more demand than available supply. Snowflake particularly called out H100 (un)availability. This makes us believe that the Hopper cycle may extend beyond ’23/24
2. There were several reports pointing to Blackwell delays, the new generation GPU. Analysts have now taken it out of estimates for this year (C24). However, our research indicates that the delays are mainly on the systems side, which were not supposed to be delivered until (C25). Meanwhile, Nvidia’s CEO noted that we can expect significant revenues from Blackwell this year … key will be to find out if this is still the case.
3. Systems – namely the GB200 NVL36/72 is where the delays are. But our intel suggests that the order book for these is through the roof due to the TCO (total cost of ownership) they offer. If Nvidia is in fact able to deliver these in ’25 revenue from systems alone can exceed >$100BN with total DC revenue >$200BN.
What Else is happening in AI on August 28th 2024!
Apple announced a September 9 event where it’s expected to debut the iPhone 16 with new generative AI features.
Elon Muskendorsed California’s Senate Bill 1047, which would require safety testing for large AI models, breaking with other tech leaders who oppose the regulation.
Amazonplans to launch a delayed AI-powered Alexa subscription in October, featuring “Smart Briefing” AI-generated news summaries.
Anthropicannounced the full release of its Artifacts feature for all Claude users, including mobile apps, after millions were created in its test phase.
A Daily Chronicle of AI Innovations on August 27th 2024
AI can 3D print lifelike human organs
Anthropic reveals Claude’s secret sauce
Amazon aims to launch delayed AI Alexa subscription in October
OpenAI, Adobe, Microsoft want all companies to label AI-generated content
ChatGPT teams up with ASU
Discovering new drugs with AI
How to use Midjourney ‘Erase‘
AI can 3D print lifelike human organs
Researchers at Washington State University recently developed an AI technique called Bayesian Optimization that dramatically improves the speed and efficiency of 3D printing lifelike human organs.
The AI balances geometric precision, density, and printing time to create organ models that look and feel authentic.
In tests, it printed 60 continually improving versions of kidney and prostate organ models.
This approach significantly reduces the time and materials needed to find optimal 3D printing settings for complex objects.
The technology also has potential applications beyond medicine — for example, in the computer science, automotive, and aviation industries.
With cheaper, lifelike 3D-printed human organs, medical students could better practice for surgery before operating on actual patients. Beyond medicine, this AI technique could help reduce manufacturing costs for a variety of things like smartphones, car parts, and even airplane components.
Scientists from China and the U.S. just developed ActFound, a new AI model that outperforms existing methods in predicting drug bioactivity, potentially accelerating and reducing costs in drug development.
ActFound combines meta-learning and pairwise learning to overcome common limitations in AI drug discovery, like small datasets and incompatible measurements.
The model was trained on 35,000+ assays (metal ore breakdowns) and 1.6 million experimentally measured bioactivities from a popular chemical database.
In tests, ActFound outperformed nine competing models and showed strong performance in predicting cancer drug bioactivity.
ActFound could significantly speed up drug development by accurately predicting compound properties with less data and lower costs than traditional methods. While still in early stages, AI breakthroughs like this are the lesser-talked about developments that could end up saving millions of lives.
OpenAI’s ChatGPT is headed to Arizona State University (ASU), where the university is integrating the AI assistant into over 200 projects across teaching, research, and operations.
ASU is using ChatGPT Edu, a version designed for universities with enhanced privacy and security features.
The university also launched an ‘AI Innovation Challenge’ for faculty and staff, receiving an overwhelming demand for using ChatGPT to maximize teaching, research, and ops.
Key projects include an AI writing companion for scholarly work, ‘Sam’ (a chatbot for med students to practice patient interactions), and AI-assisted research recruitment.
The partnership has inspired other institutions like Oxford and Wharton to pursue similar collaborations.
While some schools are attempting to resist AI, ASU is embracing ChatGPT to make learning more personalized and to prepare students for an increasingly AI-driven job market. As education continues to change in the age of AI, case studies like this will be instrumental in shaping the future of academia.
Source: https://openai.com/index/asu/
Anthropic reveals Claude’s secret sauce
Anthropic has published the system prompts for its latest AI models, including Claude 3 Opus, Claude 3.5 Sonnet, and Claude 3.5 Haiku, to demonstrate transparency and ethical practices.
The system prompts reveal specific behaviors and capabilities of the Claude models, such as the inability to open URLs or recognize faces, aiming to ensure ethical interactions.
Anthropic plans to continue updating and disclosing these system prompts to promote transparency, potentially pressuring other AI vendors to follow suit.
Amazon aims to launch delayed AI Alexa subscription in October
The new Alexa AI, set to launch around mid-October, will feature a “Smart Briefing” that provides daily, AI-generated news summaries based on user preferences.
A more personalized experience is expected, with Alexa AI learning user preferences through interactive and tailored responses, such as dietary requirements for recipe suggestions.
Alexa AI will also introduce a “Shopping Scout” feature to help users find deals and track prices, alongside a kid-friendly “Explore with Alexa 2.0” for safe, moderated conversations.
OpenAI, Adobe, Microsoft want all companies to label AI-generated content
OpenAI, Adobe, and Microsoft now back a California bill that mandates tech companies to add watermarks to AI-generated content, with the bill set for a final vote in August.
AB 3211 requires AI-generated photos, videos, and audio clips to have watermarks in their metadata and mandates large online platforms to label AI content clearly for average viewers.
Initially opposed by a trade group representing major software companies, the bill gained support from OpenAI, Adobe, and Microsoft after amendments addressed concerns about its practicality.
Inflection AI partnered with Data Transfer Initiative, enabling Pi users to export conversations and announced plans to cap free usage while focusing on enterprise AI.
Source: https://inflection.ai/the-future-of-pi
Phariareleased Pharia-1-LLM-7B, an open-source model optimized for German, French, and Spanish that excels in domain-specific applications.
IBMpreviewed Spyre, a new AI accelerator chip for IBM Z mainframes, designed to scale enterprise AI workloads with clustering capabilities.
Source: https://research.ibm.com/blog/spyre-for-z
Hugging FaceandGoogle Cloud just partnered up to release optimized Deep Learning Containers for building AI with open models on Google Cloud infrastructure.
SPONSOR US: Get your product in front of over 1 million+ AI enthusiasts
Our Daily AI Chronicle Blog, newsletter and podcast is read by thousands of Redditors, Quorans, Linkedin professionals, tech executives, investors, engineers, managers, and business owners around the world. Get in touch today.
A Daily Chronicle of AI Innovations on August 26th 2024
Amazon is telling its salespeople to trash talk Google, Microsoft, and OpenAI
Apple may be working on an AI ‘personality’ to replace Siri on its robots
Chinese companies showcased 27 humanoid robots alongside Tesla’s Optimus
AI learns to plan better without humans
How to use Ideogram for generating images
️ Grok-2 improves speed, accuracy, transparency
AI learns to plan better without humans
IBM Research and Cornell University recently created AutoToS, a system that teaches AI to solve complex planning problems at 100% accuracy — without needing a human to check its work.
AutoToS is like a smart tutor for AI, helping it learn how to break down and solve tricky problems step-by-step.
The system uses clever tests to check the AI’s work, pointing out mistakes and showing examples of how to do better without human interferance.
This approach seems to work equally as well for smaller and larger models.
AutoToS succeeded in teaching AI to solve complex puzzles, including classic problems like arranging blocks and solving Sokoban, a box-pushing game.
Right now, it’s difficult to trust AI agents to completely autonomously perform actions on your behalf, but AutoToS is solving complex tasks at a 100% accuracy. If this system works in the real world, it’s the next big step in creating more reliable AI assistants.
Apple may be working on an AI ‘personality’ to replace Siri on its robots
Apple is developing a new AI-based ‘personality’ for use in upcoming robotic devices, aiming to enhance interactions similar to how Siri functions on existing Apple products.
Bloomberg’s Mark Gurman reports that Apple’s futuristic AI assistant will be more humanlike and could operate on a tabletop product and other future robots, potentially costing under $1,000.
The project is in early development stages with no guarantees of release, while Apple continues to integrate generative AI features into its devices, like iPhones, iPads, and Macs, later this year.
Chinese companies showcased 27 humanoid robots alongside Tesla’s Optimus
At the Beijing World Robot Conference, Tesla’s Optimus humanoid was displayed motionless inside a clear box, facing tough competition from Chinese robots demonstrated by various companies.
The event saw 27 new humanoid robots debut, with significant financial investments in China’s robotics industry surpassing 100 billion yuan over the past decade.
Chinese startups like Agibot and Stardust Intelligence showcased robots capable of performing complex tasks, while experts believe Tesla’s and other U.S. companies’ robot technology leads by about one to two years.
xAI’s Grok-2 and Grok-2 mini just made major improvements — doubling the model’s speed in the mini version and showing increased accuracy in both models, just days after its beta launch.
Grok-2 mini is now twice as fast as it was previously, thanks to a rewritten inference stack using SGLang.
Both Grok-2 and its mini version have become slightly more accurate due to reduced quantization error, according to one xAI employee.
Additionally, both Grok-2 models are now part of the LMSYS Chatbot Arena leaderboard for increased transparency, with Grok-2’s larger model ranking #2 and surpassing Claude 3.5 Sonnet.
Grok-2 excels particularly in math, where it ranks #1 and performs at a state-of-the-art level in hard prompts, coding, and instruction-following.
From being founded only ~18 months ago, to creating an LLM ranked third in the world, it’s safe to say that xAI has the entire AI community mind blown. This not only makes Grok-2 a top contender in the AI race but also intensifies competition, potentially accelerating advancements across the industry.
At the 2024 World Robot Conference in Beijing, Chinese companies showcased 27 humanoid robots alongside Tesla’s Optimus, signalling China’s ambition to dominate the industry.
Chinese tech firms unveiled 27 humanoid robots at the expo, with Tesla’s Optimus being the only foreign competitor present.
AGIBOT, founded by a Huawei alumnus, presented robots powered by large language models (LLMs) for industrial use and customer service.
Other notable entries included Astribot’s S1 robot assistant capable of writing calligraphy and playing musical instruments, and Galbot’s wheeled robots for food delivery and retail tasks.
Despite the impressive showcase, experts note that technological hurdles and high costs still create challenges for Chinese manufacturers.
China may be slightly behind in the AI race against the U.S., but it’s clear the country is committed to dominating the humanoid robotics race. With a whopping 27 China-based humanoid robots demonstrating a wide-range of use cases at the event, commercially available humanoids may be coming sooner than most expect.
Ideogram 2.0, the latest state-of-the-art AI image generator, excels at creating images that include text — opening new possibilities for use cases like thumbnails, posters, newsletter graphics, memes, and more.
Head over to Ideogram’s website and Sign up. You’ll get free credits to try the image generator without a credit card.
Click “Describe what you want to see” and enter a detailed text prompt for your desired image.
Customize settings like aspect ratio, AI model (choose 2.0), and style (Realistic, Design, 3D, or Anime).
Click “Generate” to create four AI-generated images based on your prompt!
Pro tip: Experiment with different prompts and settings to discover its full potential and create unique visuals for your projects!
What Else is Happening in AI on August 26th 2024!
Scientists to use AI and 1.6 million brain scans for earlier and more accurate dementia diagnoses.
Anthropic supported California’s AI regulation bill after changes were made, saying its benefits likely outweigh its costs for advanced AI development.
A Daily Chronicle of AI Innovations on August 23rd 2024
Nvidia and Mistral make laptop-ready AI
Amazon’s AI assistant saves 4,500 years of development time
Slack AI could be tricked into leaking login details and more
Cruise’s robotaxis are coming on Uber
Google DeepMind workers urge the company to end ties with military organizations
Salesforce unveils AI agents for sales
Nvidia and Mistral make laptop-ready AI
Nvidia and Mistral just released Mistral-NeMo-Minitron 8B, a highly accurate small language model that can run efficiently on laptops and PCs.
The model uses optimization techniques like pruning (removing certain weights) and distillation (retraining the pruned model on a small dataset) to achieve high accuracy with a smaller footprint.
These optimizations resulted in up to 40x cost savings in terms of raw compute during training.
Laptops and PCs can run the model locally for faster and more secure interactions with AI.
Minitron 8B leads nine language-driven AI benchmarks for similarly sized models from language understanding to reasoning and coding.
AI models that are small enough to run locally on laptops and PCs means less reliance on cloud services, improved data privacy, and faster responses. As this tech evolves, we could soon see advanced AI in everything from smartphones and watches to home appliances.
Amazon’s AI assistant saves 4,500 years of development time
Amazon CEO Andy Jassy stated that their AI assistant, Amazon Q, has significantly reduced software upgrade times, saving the company thousands of work hours.
Jassy mentioned that implementing Amazon Q resulted in estimated savings equivalent to 4,500 developer-years and $260 million in annual efficiency gains.
The AI-generated code reviews were so accurate that 79% of them were shipped without any additional changes, demonstrating the tool’s effectiveness in streamlining tedious tasks.
Researchers just developed a new AI-based method called NES-VMC that can accurately calculate the excited states of atoms and molecules, a challenge in physics and chemistry that previously delayed improvements in solar tech.
NES-VMC (natural excited states variational Monte Carlo) accurately predicted quantum excited states on systems ranging from single atoms to benzene-sized molecules.
The method outperforms leading computational chemistry techniques, often achieving chemical accuracy.
Excited states are crucial for understanding light-matter interactions, key to improving solar cells, LEDs, lasers, and more.
NES-VMC overcomes long-standing challenges in physics and chemistry that have hindered progress in these fields.
This AI-driven breakthrough could lead to more efficient solar cells, brighter LEDs, and more powerful lasers. The ripple effects could be dramatic: lower electricity costs, improvements in phone and laptop battery life and displays, faster fiber-optic internet, and so much more.
Salesforce just introduced two fully autonomous, AI-powered sales agents, Einstein SDR Agent and Einstein Sales Coach Agent, designed to help sales teams accelerate growth through automation and personalization.
Einstein SDR Agent engages with inbound leads 24/7 to answer questions, handle objections, and book meetings.
Einstein Sales Coach Agent helps salespeople rehearse pitches and offers real-time suggestions during calls.
The agents both leverage Salesforce’s CRM data and external data uploaded via Data Cloud to generate accurate, contextually relevant responses.
The agents will be generally available in October, with more details expected to be released at Dreamforce conference in September.
By integrating AI agents into existing platforms, Salesforce is lowering the barrier for AI adoption in business processes. These agents offer 24/7 support and automate repetitive tasks like qualifying leads and booking meetings, freeing human sales teams to focus on high-value tasks and potentially close more deals.
Slack AI could be tricked into leaking login details and more
Security experts found that Slack’s AI assistant can be misled into disclosing sensitive information, like API keys, to unauthorized users through carefully crafted prompts.
Hackers can exploit this vulnerability by creating a public Slack channel, inputting a malicious command that causes the AI to leak private data via clickable URLs.
Salesforce fixed the issue for private channels but public ones remain exposed, allowing attackers to use social engineering tactics to get workspace members to upload malicious documents.
Google DeepMind workers urge the company to end ties with military organizations
In May 2024, approximately 200 Google DeepMind employees signed a letter urging the company to cease its contracts with military organizations due to concerns over the use of AI technology in warfare, according to Time magazine.
The letter highlights internal tensions between Google’s AI division and its cloud business, referencing Google’s defense contract with the Israeli military and the use of AI for mass surveillance and targeting in Gaza.
The letter calls for Google to investigate claims of its cloud services being used by militaries, cut off such access, and establish a new governance body to prevent future military use of DeepMind’s AI technology.
A Daily Chronicle of AI Innovations on August 22nd 2024
Neuralink’s second patient is already playing video games with brain implant
Apple’s first foldable MacBook might see big delays
OpenAI joins Silicon Valley companies lobbying against California’s AI bill
Ideogram 2.0 launches with major upgrades
xAI releases Grok 2 in early beta
Create your own AI Clone
Disney AI brings robots to life
Ideogram 2.0 launches with major upgrades
Ideogram just released version 2.0 of its advanced text-to-image model with major upgrades and new features, including five new image styles, an iOS app, a beta API, and over 1 billion public Ideogram images.
Ideogram 2.0 offers five image styles: General, Realistic, Design, 3D, and Anime.
The Realistic style convincingly resembles photographs with dramatically improved textures for human features like hands and hair, a pain point for previous image generation models.
The Design style also significantly improves text rendering, allowing users to create greeting cards, t-shirt designs and more.
Ideogram offers a free tier that allows users to generate around 40 images, or 10 prompts a day at no charge.
Ideogram 2.0 consistently renders high-quality images with near perfect human hands and text — which is an instant ‘AI giveaway’ in other AI image generators. This makes the model the new gold standard for use cases like memes, newsletter images, YouTube thumbnails, posters, and more.
xAI has begun rolling out early beta access for Grok 2, a powerful new AI model that leverages real-time data from X and uses Flux.1 to generate relatively unfiltered AI images.
Grok 2 is now available to a select group of premium X users in early beta mode.
The model can access and use real-time information from X, setting it apart from ChatGPT and other LLMs.
Grok 2 offers two modes: regular and “fun” mode, with the latter providing a more distinctive and entertaining personality.
When gathering and summarizing news, Grok 2 can reference specific tweets, a capability that cannot be found in ChatGPT or Claude.
Grok 2’s biggest advantage against other top-tier AI chatbots like ChatGPT is its ability to access real-time information from X and provide unfiltered responses. And with Grok 3 rumoured to be coming at the end of 2024, xAI has proven itself as a serious competitor in the LLM race — in a very short period of time.
ETH Zurich and Disney Research scientists have developed an AI system that can generate realistic, physics-based movements for virtual characters and robots from simple text or image inputs.
The system uses a two-stage approach: first, it learns a latent representation of motion from a large dataset, then trains a control policy using reinforcement learning.
It can handle a diverse range of motions, from simple walking to complex acrobatics, outperforming previous methods in accuracy and generalization.
The AI adapts to physical constraints, allowing it to transfer motions to real robots while maintaining balance and style.
Disney released a video showcasing one robot trained on the new two-stage AI technique dancing and getting pushed around while staying on its feet.
This AI system bridges the gap between animation and robotics, helping humanoids move more naturally and adapt better to new situations. With personal robots coming as soon as 2025 and the rapid pace of AI and robotics advancements, we might be coexisting with robots sooner than most people realize.
Neuralink’s second patient is already playing video games with brain implant
Elon Musk’s company Neuralink has implanted a brain chip in a second human patient named Alex, who is now using it to play video games and design 3D objects.
Alex’s recovery from the procedure has been smooth, and he has successfully used computer-aided design software to create a custom mount for his Neuralink charger.
The core technology of Neuralink involves a small, implantable chip with flexible electrode threads that capture and transmit brain activity to external devices like computers.
OpenAI joins Silicon Valley companies lobbying against California’s AI bill
OpenAI’s chief strategy officer Jason Kwon argues that AI regulations should be managed by the federal government, not individual states, to avoid hindering progress and causing businesses to relocate from California.
Kwon states that a consistent, nation-wide set of AI policies will promote innovation, allowing the U.S. to become a leader in global AI standards, and thus opposes California’s SB 1047 bill.
The proposed California AI safety bill, designed by Senator Scott Wiener, includes measures like pre-deployment safety testing and whistleblower protections, and awaits its final vote before potentially being signed by Governor Gavin Newsom.
California and Google drafted a $300 million, 5-year partnership to fund in-state newsrooms and AI initiatives, including a $40 million annual “AI Innovation Accelerator”.
A Daily Chronicle of AI Innovations on August 21st 2024
OpenAI signs landmark agreement with Condé Nast
Microsoft releases new Phi-3.5 models, beating Google, OpenAI and more
AWS CEO tells employees that most developers could stop coding soon as AI takes over
OpenAI adds free fine-tuning to GPT-4o
Claude sued for copyright infringement
Create AI images in real-time on WhatsApp
Microsoft’s new AI beats larger models
Microsoft just released Phi-3.5-MoE, an advanced AI model that rivals the reasoning capabilities of much larger models while maintaining a compact and efficient architecture.
Phi-3.5-MoE uses a new mixture-of-experts (MoE) approach, which selectively activates only the most relevant parts of the model for each task to save compute power.
The new model excels at understanding and following complex instructions and can handle up to ~125,000 words in a single prompt.
In head-to-head benchmarks, Phi-3.5-MoE outperformed popular models like Meta’s Llama 3 8B and Google’s Gemma 2 9B, but fell short against OpenAI’s GPT-4o mini.
Microsoft made the model available under an open-source MIT license on Hugging Face.
While the mainstream media focuses on the most advanced large language model, there’s also another race amongst tech giants for the smartest, fastest, and smallest AI. Breakthroughs like Phi-3.5-MoE are paving the way for advanced AI models to run directly and privately on our mobile devices.
OpenAI signs landmark agreement with Condé Nast
OpenAI announced a new media partnership with Condé Nast to enhance search features using their SearchGPT prototype, aiming to make finding information and reliable content sources faster and more intuitive.
The partnership has raised transparency issues, particularly among Condé Nast’s unionized workers, who are worried about the impact on journalism and the lack of clear details on the agreement.
This deal occurs as Wall Street expresses growing concern over a potential AI bubble, with investors questioning the monetization and viability of AI technologies in the current market.
Microsoft releases new Phi-3.5 models, beating Google, OpenAI and more
Microsoft introduced three new open-source AI models, named mini-instruct, MoE-instruct, and vision-instruct, which excel in logical reasoning and support multiple languages but face challenges in factual accuracy and safety.
The Phi series aims to deliver highly efficient AI models for commercial and scientific purposes using quality training data, though specifics of the Phi-3.5 training process remain undisclosed by Microsoft.
All the new Phi 3.5 models are accessible under the MIT license on Hugging Face and Microsoft’s Azure AI Studio, but they require specialized GPU hardware like NVIDIA A100, A6000, or H100 for optimal performance.
AWS CEO tells employees that most developers could stop coding soon as AI takes over
A leaked recording revealed that AWS CEO Matt Garman believes software developers may soon stop coding as artificial intelligence takes over many of their tasks.
Garman’s remarks, shared during an internal chat in June, were intended as a positive forecast rather than a dire warning for software engineers, emphasizing new opportunities and skills.
Garman highlighted that developers should focus more on understanding customer needs and innovation, rather than just writing code, as AI tools increasingly manage the technical aspects.
Meta deploys new web crawlers that bypass scraping blocks
Meta has introduced new web crawling bots designed to collect data for training its AI models and related products without being easily blocked by website owners.
These new bots, Meta-ExternalAgent and Meta-ExternalFetcher, have features that potentially bypass the traditional robots.txt file, making website owners’ efforts to block them less effective.
Meta’s bots, launched in July, have shown low block rates compared to older versions, with only 1.5% blocking Meta-ExternalAgent and less than 1% blocking Meta-ExternalFetcher, according to Originality.ai.
OpenAI just launched free fine-tuning (up to 1 million tokens per day through September 23) for GPT-4o, allowing developers to customize the model for higher performance and accuracy.
Developers can now, for the first time ever, fine-tune GPT-4o to improve the model’s structure, tone, and domain-specific instructions for their AI applications.
Fine-tuning is available on all paid usage tiers with training costs of $25 per million tokens, but it is completely free until September 23.
OpenAI suggests that developers should see strong results from fine-tuning with only a few dozen training examples.
Additionally, Google’s Gemini API is giving developers 1.5 billion tokens for free every day on its Gemini 1.5 Flash model and 1.6 million tokens on its Gemini 1.5 Pro model.
Just last week, a company that was granted early access to fine-tune GPT-4o, produced Genie and achieved state-of-the-art scores on both SWE-bench Verified (43.8%) and Full (30.1%) benchmarks. With free fine-tuning now available to all developers, get ready for a new wave of smarter, faster and more capable AI bots.
A group of authors filed a lawsuit against AI startup Anthropic, alleging the company committed “large-scale theft” by training its Claude chatbot on pirated copies of copyrighted books.
This is the first lawsuit from writers targeting Anthropic and Claude, but similar lawsuits have been filed against competitor OpenAI and ChatGPT.
The lawsuit accuses Anthropic of using a dataset called The Pile, which includes numerous pirated books.
Anthropic and others, including OpenAI, have argued that training AI models is protected under the “fair use” doctrine of U.S. laws, which permits the limited use of copyrighted materials.
This is not the first time an AI company has been sued over copyright infringement, but it resurfaces an important debate about AI training data. While similar cases have been largely dismissed in the past, courts have yet to definitively address the core issue of using unauthorized internet-scraped material for AI training.
International Data Corporation (IDC)forecasted that worldwide AI spending is expected to reach $632 billion by 2028, with generative AI accounting for 32% of that.
LTX Studio opened to the public and launched five new features, including character animation and dialogue, face motion capture, and generation and keyframe control.
A Daily Chronicle of AI Innovations on August 20th 2024
AGIBOT reveals new humanoid robot family
ChatGPT runs for mayor in Wyoming
Luma Labs launches Dream Machine 1.5
Tesla’s humanoid robot has a new competitor
Waymo now giving 100,000 weekly robotaxi rides
Fortune 500 companies are getting increasingly worried about AI
Anthropic gets sued on allegations of ‘large-scale theft’
Nvidia’s new AI predicts thunderstorms with kilometer-scale precision
Luma Labs launches Dream Machine 1.5
Luma Labs just released Dream Machine 1.5, a major upgrade to their current AI video generation model, with higher quality text-to-video, smarter prompt understanding, and better image-to-video capabilities.
Dream Machine 1.5 builds on the original model’s ability to generate high-quality, realistic 5-second video clips from text and image prompts.
The upgraded model showcases better natural language processing, interpreting and executing prompts at a higher accuracy.
It excels in creating smooth motion, cinematography, and dramatic shots, turning static concepts into dynamic stories, but lags in morphing, movement, and text.
Dream Machine 1.5 is available to try for free here.
With text-to-image AI generation nearly indistinguishable from reality, the next big frontier is text-to-video — and Dream Machine 1.5 is another big leap forward for realism. While AI video still has some catching up to do, expect fast-moving startups like Luma Labs to close that gap for AI video, fast.
Victor Miller, a mayoral candidate in Wyoming’s capital city, just vowed to let his customized ChatGPT GPT named Vic (Virtual Integrated Citizen) help run the local government if elected.
Miller created VIC using ChatGPT, feeding it city ordinances and related documents to make municipal decisions.
Miller filed for him and VIC to run for mayor, proposing that the ChatGPT GPT provides data-driven insights and solutions while Miller ensures legal execution.
OpenAI has shut down Miller’s account twice, citing policies against using its products for campaigning.
Wyoming’s Secretary of State raised concerns, but local officials allowed Miller’s candidacy with his human name on the ballot.
While Miller’s chances of winning seem slim, and his grasp of data privacy and LLMs seem slimmer, this marks the first time a political candidate has openly advocated for AI in governance. Whether Cheyenne, Wyoming is ready for an AI co-pilot in City Hall is debatable, but AI will certainly infiltrate politics in the coming years.
AGIBOT, a China-based robotics startup, just unveiled a family of five advanced humanoid robots, directly challenging Elon Musk and Tesla’s upcoming Optimus bot.
AGIBOT’s five new models are both wheeled and biped humanoid robots specifically designed for diverse tasks — from household chores to industrial operations.
The flagship model, Yuanzheng A2, stands 5’9″ (175cm), weighs 121 lbs (55kg), and can perform delicate tasks like needle threading.
The company aims to start shipping 300 units by the end of 2024, claiming better commercialization and cost-control abilities than Tesla.
Unitree, another high-performance robot manufacturer from China, also showcased its new G1 mass production-ready robot with better functionality and appearance.
The humanoid robotics and AI race between the US and China is intensifying. While it’s been months since Tesla unveiled its Optimus 2 prototype, four Chinese startups, including AGIBOT revealing five new humanoid robots, have showcased major technical progress in just a few days.
Unitree Robotics has launched the production version of its G1 humanoid robot, priced at $16,000, just three months after its initial announcement.
The G1 is 90% cheaper than Unitree’s previous humanoid model, the H1, offering advanced features such as 23 degrees of freedom and a 3D vision system for real-time navigation.
While the G1 is not ready for consumer use, it is envisioned as an affordable platform for research and development, likely appealing to institutions and businesses exploring robotic automation.
Waymo disclosed it is now giving more than 100,000 paid robotaxi rides every week across Los Angeles, San Francisco, and Phoenix, doubling its previously stated figures.
This milestone was shared by Waymo co-CEO Tekedra Mawakana and reflects a significant increase from the over 50,000 weekly rides reported by Alphabet CEO Sundar Pichai earlier this year.
Waymo’s fleet consists of hundreds of fully autonomous Jaguar I-Pace vehicles, with 778 robotaxis deployed in California, and it has recently expanded its service to operate 24/7 in San Francisco and parts of Los Angeles.
Fortune 500 companies are getting increasingly worried about AI
Fortune 500 companies reporting AI as a risk factor saw a surge of 473.5% in the past year, according to a report by Arize AI, with 281 companies now flagging such risks.
Arize AI’s analysis revealed that 56.2% of Fortune 500 companies now include AI risks in their latest annual reports, a substantial jump from the previous year’s 49 companies.
The software and technology sectors lead the mentions of generative AI, while advertising, media, and entertainment industries report the highest percentage, 91.7%, of AI as a risk factor.
Anthropic gets sued on allegations of ‘large-scale theft’
A group of authors has filed a lawsuit against AI startup Anthropic, alleging “large-scale theft” for using pirated copies of copyrighted books to train its chatbot, Claude.
This marks the first lawsuit by writers specifically targeting Anthropic, although similar cases have been brought against OpenAI, the maker of ChatGPT, for the same reasons.
The lawsuit accuses Anthropic, which markets itself as a responsible AI developer, of contradicting its goals by using unauthorized works, and it adds to the increasing legal challenges faced by AI developers.
Nvidia’s new AI predicts thunderstorms with kilometer-scale precision
Nvidia Research has introduced StormCast, a new AI model for high-precision atmospheric dynamics to enhance mesoscale weather prediction, which is critical for disaster preparedness and mitigation.
Integrated into Nvidia’s Earth-2 platform, StormCast provides hourly autoregressive forecasts that are more accurate than current US operational models by 10%, improving early warning systems for severe weather events.
Trained on NOAA climate data, StormCast predicts over 100 weather variables and allows scientists to observe storm evolution in three dimensions, marking significant advancements in AI-driven weather forecasting by Nvidia.
A Daily Chronicle of AI Innovations on August 19th 2024
You can now rent ‘living computers’ made from human neurons
Start-up failures up by 60% as founders face hangover from boom years
AMD is going after Nvidia with a $5 billion acquisition
Tesla will pay you to pretend to be a robot
You can now rent ‘living computers’ made from human neurons
Researchers and companies like FinalSpark are creating computers from lab-grown human brain organoids, which can be rented for $500 a month.
These biocomputers use human neurons to form pathways mimicking human brain learning processes, potentially consuming significantly less energy than current AI technologies.
While challenges remain, such as limited organoid lifespans and lack of standardized manufacturing, FinalSpark and other researchers are exploring various biocomputing approaches, including cellular and fungal computing.
AMD is going after Nvidia with a $5 billion acquisition
AMD is set to buy ZT Systems for $4.9 billion in cash and stock, aiming to strengthen its AI ecosystem and offer better support to companies building large AI computing businesses.
The acquisition will integrate ZT Systems’ computing infrastructure design business into AMD, although AMD plans to sell the data center infrastructure manufacturing arm to a strategic partner.
ZT Systems’ CEO Frank Zhang and President Doug Huang will lead roles within AMD’s Data Center Solutions Business Group, with the deal expected to conclude in the first half of 2025.
Tesla is offering up to $48 per hour for Data Collection Operators to wear motion-capture suits and VR headsets to help train its humanoid Optimus robot.
Workers wearing these suits perform and analyze tasks to gather extensive data, aiding in the robot’s development for various roles, from factory work to caregiving.
Tesla’s initiative involves collecting potentially millions of hours of data, aiming to overcome the challenges of producing versatile robots at scale and ensuring their success in diverse tasks.
Swiss startup FinalSpark just launched a service allowing scientists to rent cloud access to “biocomputers” made of human brain cells for $500 a month, in an effort to create AI that uses 100,000x less energy than current systems.
The system uses organoids (clumps of human brain cells) that can “live” and compute for up to 100 days.
AI models are trained using dopamine for positive reinforcement and electrical signals for negative reinforcement, mimicking natural neural processes.
FinalSpark claims these biocomputers could be up to 100,000 times more efficient for AI training than traditional silicon-based technology.
The organoids and their behaviour are live streamed 24/7, which you can access here.
AI is an energy-hungry industry, and alleviating its dependence on CPUs and GPUs is generally a step in the right direction. That said, using brain organoids for biocomputing is completely uncharted territory and is bound to raise ethical concerns — such as the sci-fi possibility that cell masses somehow achieve consciousness.
California’s SB 1047, an aggressive AI safety bill aimed at preventing AI disasters, just got significantly revised to address concerns raised by AI companies like Anthropic and open-source developers.
The bill no longer allows California’s attorney general to sue AI companies for negligent safety practices before a catastrophic event occurs.
AI labs are now only required to submit public “statements” about their safety practices vs certifications “under penalty of perjury.”
Likewise, developers must now provide “reasonable care” vs “reasonable assurance” that AI models do not pose significant risks.
The bill is headed to California’s Assembly floor for a final vote.
There’s a fine line between advancing technological progress and mitigating potential existential risks that governments are navigating — and California is showing that regulation can be practical and adaptive. These changes are a big step towards fostering responsible AI development through collaborative governance.
Researchers just developed a new technique to find shorter solutions to scrambled Rubik’s Cubes by cleverly analyzing the puzzle’s structure and identifying the best moves more quickly.
The Rubik’s Cube has an enormous number of possible configurations, over 43 quintillion, making it challenging for AI to solve in the fewest moves possible.
Researchers represented the Rubik’s Cube as a complex network or “graph” and used a new technique to pass useful information, like the moves required to solve the puzzle, between connected nodes.
The AI then considers which next moves are most likely to lead to a quick solution, using the probabilities as weights, and focuses on the most promising paths.
When tested, the new technique found solutions to the puzzle faster than current state-of-the-art Rubik’s Cube solving AI systems.
As companies like Sakana build AIs that can completely automate scientific research, it’s important to make sure they’re solving highly complex problems efficiently. This technique, coupled with Sakana’s processes, could be massively beneficial in areas like optimizing supply chains and advanced drug discovery.
Free event: Navigating AI Data Privacy. Join Section CEO Greg Shove to learn how to protect your data, write a team or company AI data policy, and lead your company on safe AI. RSVP here.*Source: https://www.sectionschool.com/events/live-events/ai-data-privacy-in-large-organizations
Claudehttps://x.com/alexalbert__/status/1824483452802175082 a new screenshot capture button, allowing users to easily include images from their screen in prompts.Source: https://x.com/alexalbert__/status/1824483452802175082
Midjourneyreleased a new unified web-based AI image editor with advanced tools for seamlessly modifying and extending generated images.Source: https://venturebeat.com/ai/midjourney-releases-new-unified-ai-image-editor-on-the-web
Rebellions and Sapeon, South Korean AI chip makers, signed a definitive merger agreement to challenge global leaders like Nvidia.Source: https://www.reuters.com/technology/artificial-intelligence/south-korean-ai-chip-makers-rebellions-sapeon-agree-merge-2024-08-18
Bzigo launched Iris, an AI-powered mosquito detector that tracks and marks mosquitoes with a laser pointer for easy swatting.Source: https://www.foxnews.com/tech/ai-technology-can-help-you-win-battle-over-mosquitoes
Coinbasestarted a $15,000 accelerator grant program for projects combining AI with crypto wallets to enable economic participation.Source: https://cointelegraph.com/news/coinbase-ceo-brian-armstrong-ai-should-have-crypto-wallets
Microsoftunveiled PowerToys Workspaces, a new feature to auto-arrange apps, plus an AI-powered copy-paste tool with OpenAI API integration.Source: https://www.theverge.com/2024/8/16/24221639/microsoft-powertoys-workspaces-feature-demo
A Daily Chronicle of AI Innovations on August 16th 2024
AI makes Walmart 100x more productive
SoftBank’s AI chip faces setback
Create a Siri-like voice AI with Llama 3.1
Hermes 3 is the newest open-source model
AI makes Walmart 100x more productive
Walmart’s CEO Doug McMillon just reported that the company is using generative AI to increase its productivity, updating 850 million product catalog entries 100 times faster than human-led methods.
The report came during the company’s Q2 financial earnings call, where McMillon also announced AI improvements to customer search and seller support.
Customers can now use AI-powered search and a new shopping assistant on Walmart’s app and website — it even provides advice for questions like “Which TV is best for watching sports?”.
Walmart is also testing a completely new AI-driven experience for U.S. based marketplace sellers, but the details are not yet available.
McMillon said the company plans to continue experimenting with AI globally across all parts of its business.
Another multibillion dollar company is using AI to increase productivity, but most notably, Walmart is exploring the tech in all areas of its business ops. Whether people should be excited about the endless possibilities ahead or concerned about the relevance of their jobs is a question that’s not going away any time soon.
SoftBank’s ambitious Project Izanagi initiative, aimed at developing AI processors to rival Nvidia, is reportedly facing a major setback after Intel failed to meet volume and speed requirements.
SoftBank had been working with Intel to develop AI processors for Project Izanagi because it lacks in-house chip design expertise, but Intel failed to meet SoftBank’s demands.
In an effort to keep Project Izanagi on track, SoftBank is considering a new partnership with TSMC, the world’s largest chipmaker.
TSMC has its own issues, however, failing to meet its current chipmaking demands, which has stalled the negotiations.
Despite the complications, SoftBank CEO Masayoshi Son remains committed to the company’s ambitious plan and is seeking investments from Saudi Arabia, UAE, and major tech companies.
Nvidia is currently dominating the AI chip space, which propelled the company to its current $3 trillion dollar market capitalization. But with recent delays of Nvidia’s next-gen Blackwell AI chip, it could be time for competitors to strike.
Nous Research just released Hermes 3, a new open-source model with significant improvements in roleplaying, agentic tasks, function calling, multi-turn chats, and long context coherence.
Hermes 3 is available in three sizes (8B, 70B, and 405B) with the 405B parameter model achieving state-of-the-art performance relative to other open models.
The model is instruct tuned, or trained, to faithfully respond to user requests and closely follow provided system prompts, unlike base or foundation models.
It achieves similar or better performance to Meta’s Llama-3.1 405B in judgement, reward modeling, interpretable problem-solving, code generation, and tool use.
Hermes 3 is available now for free via Lambda Chat or in the Nous Research Discord server.
Meta has been the leader in open-source AI for a while, but companies like Nous Research and Mistral are catching up with their latest Hermes 3 and Large 2 models. And the more free, customizable and state-of-the-art AIs available to the public, the more transparency the world has.
Elon Muskrevealed that xAI is developing an in-house image generation system to replace the current Flux model in Grok 2 but it’s currently months away from release.
The U.S. Consumer Financial Protection Bureauhighlighted risks of AI in finance, saying existing laws apply and innovation requires consistent regulatory treatment.
Apptronik, an automation company that makes humanoid robots, recently reported that the company is preparing for a commercial launch by the end of 2025.
A Daily Chronicle of AI Innovations on August 15th 2024
Apple’s iPad is getting a robotic arm
Google’s Imagen 3 tops Midjourney, DALL-E
Apple’s next big thing is a $1000 home robot
Grok-2 reaches state-of-the-art status
Creating sound effects with text
X’s AI image generator allows users to create uncensored images
Ex-Google CEO says successful AI startups can steal IP and hire lawyers to ‘clean up the mess’
FTC finalizes rule banning fake reviews, including those made with AI
Apple’s next big thing is a $1000 home robot
Apple is reportedly working on a new smart home project featuring an iPad attached to a robotic arm that can twist and rotate, designed as a home “command center” with AI capabilities.
The initiative, backed by CEO Tim Cook and head of hardware engineering John Ternus, has involved hundreds of staff and follows the cancelled Apple-brand electric car project.
According to Bloomberg, the device is expected to be released around 2026 or 2027, potentially costing about $1,000, and will use a modified version of iPadOS.
xAI’s newest AI model, Grok-2, is now available in beta for users on the X platform — achieving state-of-the-art status and outperforming versions of Anthropic’s Claude and OpenAI’s GPT-4.
In addition to Grok-2, Grok-2 mini is also now available to users on the X platform in beta with an enterprise API release planned for later this month.
Both Grok-2 and Grok-2 mini show significant improvements in reasoning with retrieved content, tool use capabilities, and performance across all academic benchmarks.
Grok-2 can now create and publish images directly on the X platform, powered by Black Forest Lab’s Flux 1 AI model.
Grok-2 surpasses OpenAI’s latest GPT-4o and Anthropic’s Claude 3.5 Sonnet in some categories, making it one of the best models currently available to the public if based purely on benchmarks.
Grok-1 debuted as a niche, no-filter chatbot, but Grok-2’s newly achieved state-of-the-art status has catapulted xAI into a legitimate competitor in the AI race. The startup is looking to have a bright future with its new Supercluster, Elon’s ability to attract talent, and vast amounts of real-time training data available on X.
Apple is reportedly ramping up development on a high-end tabletop smart home device with a robotic arm, an iPad-like display, and Siri voice command to operate its AI features.
The project, codenamed J595, reportedly involves a team of several hundred people and could launch as early as 2026 or 2027.
The device combines an iPad-like display with a thin robotic arm that can tilt, spin 360 degrees, and move the screen around.
It is expected to run a modified version of iPadOS making it a familiar smart home command center, videoconferencing tool, and remote-controlled home security device.
Apple is targeting a price point of around $1,000 for the product.
Apple is doubling down on its commitment to artificial intelligence by ramping up the development of a strange new Siri-powered, countertop robotic arm. With Apple Intelligence launching later this year, the tech giant seemingly has big plans for implementing AI into its hardware.
X’s AI image generator allows users to create uncensored images
X’s new AI image generator, Grok, allows users to create and share highly controversial images, including those of public figures in inappropriate scenarios, raising concerns about the lack of content moderation.
Despite claiming to have restrictions, Grok often generates offensive or misleading images, with many users easily bypassing its few safeguards, leading to further scrutiny from regulators.
The chaotic rollout of Grok’s image generation feature aligns with Elon Musk’s relaxed approach to content moderation, potentially driving away advertisers and inviting regulatory action.
ElevenLabs now offers a text-to-sound feature that allows users to generate sound effects by writing a simple description of the noise they want.
Visit ElevenLabs and log in or create an account. You can try this feature for free.
Select “Sound Effects” from the left sidebar.
Describe your desired sound effect in the text box.
Adjust settings for duration and prompt influence.
Click “Generate Sound Effects” to create your sounds.
Source: https://elevenlabs.io/
Google’s Imagen 3 tops Midjourney, DALL-E
Google DeepMind recently published the paper for it’s new state-of-the-art AI image generation model, Imagen 3, flexing that it beat DALL-E 3, Midjourney v6, and Stable Diffusion 3 in human performance evaluations.
The human evaluations asked participants to rank their preferred models for overall quality and adherence to detailed prompts.
Imagen 3 excelled particularly in generating high-quality, realistic images that closely match long and complex text descriptions.
Despite its capability to accurately generate photorealistic images, it struggles with certain tasks requiring numerical reasoning, understanding scale, and depicting actions.
Ex-Google CEO says successful AI startups can steal IP and hire lawyers to ‘clean up the mess’
Former Google CEO Eric Schmidt suggested that successful AI startups can initially steal intellectual property and later hire lawyers to resolve legal issues if their product gains traction.
Schmidt used a hypothetical example of copying TikTok to illustrate how Silicon Valley entrepreneurs might prioritize rapid growth over legal considerations.
Schmidt’s comments, made during a talk at Stanford, were later removed from the university’s YouTube channel after drawing media attention.
FTC finalizes rule banning fake reviews, including those made with AI
The FTC has introduced a final rule prohibiting companies from producing or selling fake reviews, including AI-generated ones, and can now penalize companies that ignore the regulation.
The rule targets deceptive practices such as incentivizing feedback, undisclosed insider reviews, company-controlled review sites, intimidation to remove negative feedback, and the trade of fake followers or views.
Although the FTC first proposed the fake review ban last year, there are concerns about enforcing it on global marketplaces like Amazon, where numerous fraudulent reviews come from businesses outside the U.S.
Free eBook: The AI Proficiency Report from Section. 7% of the workforce is getting all the benefits of AI. Download the report to see what they do differently.*
A Daily Chronicle of AI Innovations on August 14th 2024
Google beats OpenAI in voice mode race
OpenAI redesigns coding benchmark
Bring images to life with Kling AI
Become a tennis pro with AI
Android phones get an AI upgrade
xAI releases Grok-2, adds image generation on X
New ‘AI Scientist’ conducts research autonomously
Android phones get an AI upgrade
Google is replacing Google Assistant with its new AI model, Gemini, on Android phones, introducing generative AI capabilities like automating calendar invites and creating playlists based on user input.
Gemini will operate through cloud-based services, allowing for advanced AI processing, while Apple plans to run its AI models directly on devices for better privacy and latency.
The introduction of Gemini marks a significant shift in smartphone functionality, offering the potential to automate day-to-day tasks, but there are risks of errors as AI assistants become more integrated into daily life.
Google just launched Gemini Live, a mobile conversational AI with advanced voice capabilities, while OpenAI’s ChatGPT voice mode remains in its “limited alpha phase” and is not yet available to everyone.
Gemini Live, Google’s answer to OpenAI’s Advanced Voice Mode, is capable of “in-depth“ hands-free conversations and has 10 different human-like voice options.
Users can interrupt and ask follow-up questions mid-response, mimicking natural conversation flow — however Gemini Live’s ability to see and respond to your camera view is planned later this year.
Similar to Apple’s upcoming Intelligence features, Gemini integrates directly with Google to provide context-aware answers without switching apps.
Gemini Live is now the default assistant on Google’s Pixel 9 and is available today to all Gemini Advanced subscribers on Android (coming to iOS soon).
Real-time voice is slowly shifting AI from a tool we text/prompt with, to an intelligence that we collaborate, learn, consult, and grow with. As the world’s anticipation for OpenAI’s unreleased products grows, Google has swooped in to steal the spotlight as the first to lead widespread advanced AI voice rollouts.
xAI has launched upgraded Grok-2 and Grok-2 mini chatbots with new image-generation capabilities, which are powered by Black Forest Lab’s Flux 1 AI model and allow users to publish images to X with few restrictions.
Both Grok-2 models are currently in beta, available to Premium and Premium Plus subscribers on X, and will be accessible via xAI’s enterprise API later this month.
Early examples of Grok-generated images, depicting figures like Donald Trump and Barack Obama, indicate minimal content restrictions, raising concerns about the spread of false information on the platform.
OpenAI and the authors of SWE-bench collaborated to redesign the popular software engineering benchmark and release ‘SWE-bench Verified’, a human-validated subset of the original benchmark.
SWE-bench Verified addresses issues in the original benchmark, such as overly specific unit tests and unreliable development environments that leads to incorrect assessments of AI performance.
The new subset includes 500 samples verified by human professional software developers to make evaluating models on SWE-bench easier and more reliable.
On SWE-bench Verified, GPT-4o figures out 33.2% of samples, and the best open-source scaffold, Agentless, doubles its previous score to 16%.
The leaderboard for SWE-bench Verified does not include Cosine’s Genie we wrote about yesterday, which shattered the high score on the old benchmark by over 10%.
Accurate benchmarking of AI in human-level tasks like coding is crucial for transparency and assessing AI risk. However, OpenAI’s collab with SWE-bench is a double-edged sword — while it improves the benchmark, it also raises questions about potential conflicts of interest, especially with ‘Project Strawberry’ rumors heating up.
Tokyo-based R&D company Sakana AI introduced “The AI Scientist,” an AI designed to fully automate research, claiming it’s the first system of its kind to independently handle numerous scientific tasks.
The AI Scientist generates innovative research ideas, conducts experiments, writes code, and produces scientific papers while using a simulated review process to evaluate its own findings, mimicking human scientific collaboration.
A rival AI startup, Omniscience, contested Sakana AI’s originality, asserting their AI model, Omni, was released months earlier and offers similar capabilities for aiding users in scientific writing and research tasks.
Kling AI’s new image-to-video feature allows users to take static images, and turn them into dynamic videos, offering a new dimension to the AI video generator’s character consistency.
Click “AI Videos” on the dashboard, then select “Image to Video” on the top bar.
Upload your chosen image and write a prompt describing how you want the image animated.
Hit “Generate” and watch your image come to life!
Source: https://klingai.com/
Become a tennis pro with AI
Researchers just created Match Point AI, a groundbreaking tennis simulation that pits AI agents against virtual pros, giving players data-driven tennis strategies and tools to help improve their game.
Match Point AI realistically models the complexities and uncertainties of real tennis, allowing AI to test new strategies in virtual games.
Early experiments show the AI rediscovering time-tested tennis strategies, like making opponents run, validating the framework’s ability to understand the sport.
By watching Match Point’s AI agents that mimic tennis legends like Novak Djokovic, players can learn the perfect strategies to optimize their game quickly and efficiently.
AI has long been trained to compete in games, but researchers usually focus on board and video games with straightforward mechanics. Match Point AI learns to make decisions in a real-world, complex sport, similar to how Google’s newest AI robot can play ping pong against intermediate players.
What else is happening in AI on August 14th 2024!
Google unveiled Pixel Buds Pro 2 with a custom Tensor A1 chip, enhanced noise cancellation, and Gemini AI integration.
A Daily Chronicle of AI Innovations on August 13th 2024
New AI can diagnose stroke via tongue color
Sakana reveals an autonomous AI scientist
New AI model sparks rumors about OpenAI’s Q* New AI model can listen while speaking Gemini 1.5 Flash cuts usage fees by 78% OpenAI releases GPT-4o System Card, revealing safety measures SingularityNet’s supercomputer network: A step closer to AGI
New AI model sparks rumors about OpenAI’s Q*
A mysterious new AI model has appeared in the LMSYS Chatbot Arena, sparking rumors that it could be OpenAI’s highly anticipated Q* AI breakthrough or its evolution, codenamed ‘Strawberry.’
Testers report that this “anonymous-chatbot” displays more advanced reasoning capabilities than the current state-of-the-art GPT-4o model. To add to the speculation, OpenAI CEO Sam Altman has tweeted a picture of a strawberry, which is believed to be the codename for OpenAI’s secret new AI model.
Why does it matter?
If this mystery model is indeed Q*, it could represent another significant leap forward in AI capabilities as OpenAI’s competitors like Anthropic and Meta start to catch up to GPT-4o. This could be a massive paradigm shift that could significantly reshape the landscape of AI.
Tokyo-based Sakana AI just introduced “The AI Scientist,” the world’s first AI system capable of autonomously conducting scientific research — potentially revolutionizing the scientific process.
The system generates new research ideas, writes code, runs experiments, writes papers, and performs its own peer review with near-human accuracy.
Sakana AI envisions a future where we won’t just see an autonomous AI researcher but also autonomous reviewers, area chairs, and entire conferences.
The AI Scientist has already produced papers with novel contributions in machine learning domains like language modeling and diffusion models.
Each paper only costs approximately $15 to produce, which could potentially democratize research capabilities.
This breakthrough could dramatically accelerate scientific progress by allowing researchers to collaborate with AI agents and automate time-consuming tasks. We’re entering a new era where academia could soon be powered by a tireless community of AI agents, working round-the-clock on any problem they’re directed to.
Cosine just showed off Genie, its new fully autonomous AI software engineer that broke the high score on a benchmark for evaluating the coding abilities of large language models (LLMs), by over 10%.
Cosine trained Genie on a dataset that emulates how human software engineers actually work from incremental knowledge discovery to step-by-step decision making.
When it makes a mistake, Genie iterates, re-plans, and re-executes until it fixes the problem, something that foundational models struggle with.
Genie scored 30.08% on SWE-Bench, a 57% improvement over previous top performers like Amazon’s Q and Code Factory at 19% (GPT-4 scores 1.31%).
The waitlist is currently open, but Genie has not yet been released to the general public.
Cosine completely rethinks the way that AI is trained, teaching it to be more human-like during its training rather than focusing on post-training prompt design — and it works! With its recent SWE-Bench success, more companies are likely to adopt the process and build smarter AIs, a win-win for everyone.
Researchers have developed a new Listening-While-Speaking Language Model (LSLM) that can listen and speak simultaneously. This allows for more natural and responsive conversations with AI systems. The LSLM uses a token-based decoder-only text-to-speech model for speech generation and a streaming self-supervised learning encoder for real-time audio input.
This enables the model to detect turn-taking and respond to interruptions, a key feature of natural conversation. In addition, the LSLM has demonstrated robustness to noise and sensitivity to diverse instructions in experiments.
Why does it matter?
While OpenAI’s advanced voice mode for ChatGPT pushes us towards realistic AI conversations, LSLM takes that to the next level, where it could revolutionize human-AI interactions, making conversations with machines feel natural and responsive.
Google has announced significant updates and improvements to its Gemini API and Google AI Studio. The biggest news is a significant reduction in the usage fees for Gemini 1.5 Flash. The input token costs have decreased by 78% to $0.075 per 1 million tokens, and the output token costs have decreased by 71% to $0.3 per 1 million.
This makes Gemini 1.5 Flash a popular and affordable summarization and multi-modal understanding model. Google has also completed the Gemini 1.5 Flash tuning rollout, allowing developers to customize the base model and improve its performance.
Why does it matter?
The extended language support, model tuning options, and improvements to the Gemini API will enable more developers and researchers to build innovative AI-powered products and services using advanced NLP capabilities.
SingularityNet’s supercomputer network: A step closer to AGI
SingularityNET is launching a network of powerful supercomputers to accelerate the development of AGI. The first of these supercomputers is expected to come online in Sep 2024. The network will use cutting-edge hardware like Nvidia GPUs and AMD processors to create a “multi-level cognitive computing network” for hosting and training complex AGI systems.
The company uses an open-source software framework called OpenCog Hyperon to manage the distributed computing power. Users will access the network through a tokenized system, allowing them to contribute data and test AGI concepts.
Why does it matter?
Major AI companies such as OpenAI, Anthropic, and Google currently dominate the race to AGI development. However, SingularityNET’s novel decentralized approach could disrupt this, democratizing AI research for a broader range of contributors and innovators.
An AI developed by researchers at Middle Technical University and the University of South Australia can diagnose stroke by analyzing the color of a person’s tongue.
The advanced algorithm, which boasts a 98% accuracy rate, can also detect conditions such as anaemia, asthma, diabetes, liver, and gallbladder issues, COVID-19, and various gastrointestinal diseases.
This innovative system uses tongue color analysis, an ancient technique from traditional Chinese medicine, and could potentially be adapted for use with smartphones for real-time health assessments.
Reddit is testing AI-powered search result pages that provide summaries and recommendations to help users “dig deep” into content and discover new communities.
According to leaked documents, Nvidia has been scraping video content from sources like YouTube and Netflix to train its AI models for its upcoming Cosmos project.
Automattic has launched a newtool called “Write Brief with AI.” This helps WordPress bloggers write concisely and improve the readability of their content.
Anthropic is expanding its safety bug bounty program to focus on finding flaws in its AI safeguarding systems. The company is offering bounty rewards of up to $15,000.
OpenAI allows free ChatGPT users to generate up to two images per day using its DALL-E 3 model. This was previously available only to ChatGPT Plus subscribers.
Google Researchers developed a robot to play competitive table tennis at an amateur human level. It can also adapt its game to play vs. unseen human opponents.
Alibaba has released a new LLM called Qwen2-Math that scored 84% on the MATH Benchmark, surpassing OpenAI’s GPT-4o and other leading math-focused AI models.
Google Meet is rolling out a new AI-powered feature, “Take notes for me,” which can automatically take notes during video calls,boosting productivity and efficiency.
A Daily Chronicle of AI Innovations on August 12th 2024
AI search is gaining momentum
ChatGPT unexpectedly began speaking in a user’s cloned voice during testing
Meta and UMG struck an agreement to ‘protect’ artists from AI
Google Meet adds new note-taking AI
FCC cracks down on AI voice calls
Google Meet adds new note-taking AI
Google is rolling out a new “Take notes for me” feature powered by its Gemini AI for it’s Google Meet feature, allowing users to focus on the meeting while the AI automatically captures key points.
The AI-powered tool will automatically take notes during Google Meet calls, reducing the need for manual note-taking.
The feature is powered by Google’s Gemini AI and will be available to Workspace customers with specific add-ons.
“Take notes for me” is part of the AI Meetings and Messaging add-on, which costs $10 per user/month across most Google Workspace plans.
Admins can configure the feature’s availability through the Google Workspace Admin console.
Taking notes during meetings will soon be a thing from our prehistoric, non-AI past — with Google pushing for a more practical, AI-assisted future of work. Alongside this, the tech giant is directly competing against smaller AI startups such as Otter AI and Fireflies who’ve thrived by selling a nearly identical features to users.
The U.S. Federal Communications Commission (FCC) just proposed new regulations requiring AI-generated voice calls to disclose the use of artificial intelligence.
The proposal aims to combat the rise of AI-generated voices in unwanted and potentially fraudulent ‘robocalls’.
AI voices would be required to explicitly state they are artificial at the beginning of calls.
The FCC is also exploring tools to alert people when they receive AI-generated calls and texts, including enhanced call filters, AI-based detection algorithms, and improved caller ID flagging.
As AI voices become indistinguishable from human speech, these regulations are crucial in combating highly targeted scams. But with enforcement likely to be a cat-and-mouse game against scammers, the best defence is education—especially for those most vulnerable to AI deception.
Perplexity’s AI search engine experienced substantial growth, answering 250 million queries last month, signaling a rising demand for AI-driven search technologies. In contrast, 500 million queries were processed throughout 2023, Shevelenko told the Financial Times
Despite this growth, Perplexity remains significantly behind Google, which dominates the market with over 90 percent share and processes around 8.5 billion queries daily.
The rise of AI in search, exemplified by Perplexity and other players, suggests a potential shift in user behavior and challenges to the traditional search engine business models.
ChatGPT unexpectedly began speaking in a user’s cloned voice during testing
During testing, ChatGPT’s Advanced Voice Mode accidentally mimicked users’ voices without their consent, as highlighted in OpenAI’s new GPT-4o system card released on Thursday.
OpenAI has implemented safeguards to prevent unauthorized voice imitation, although rare episodes during testing showcased the model’s ability to unintentionally generate user-like voices.
The GPT-4o AI model can synthesize almost any sound, and OpenAI directs this capability by using authorized voice samples and employing an output classifier to ensure only selected voices are generated.
Meta and UMG struck an agreement to ‘protect’ artists from AI
Meta and Universal Music Group (UMG) updated their licensing agreements to extend UMG’s content use across more Meta platforms, now including Threads and WhatsApp alongside Facebook, Instagram, Messenger, and Meta Horizon.
This multiyear agreement aims to explore new collaboration opportunities on WhatsApp and other Meta platforms, addressing issues like unauthorized AI-generated content that could impact artists and songwriters.
Meta’s collaboration with UMG dates back to 2017, allowing users to use UMG music in content and addressing copyright issues, a challenge shared by TikTok in its recent dealings with UMG.
Delphi unveiled an AI clone feature that creates lifelike digital replicas of individuals, demonstrating its capabilities in a TV interview on FOX Business.
A Daily Chronicle of AI Innovations on August 09th 2024
OpenAI fears users will become emotionally dependent on its ChatGPT voice mode
Google’s new robot can play table tennis like humans
GPT-4 tackles top-secret tasks
AI speeds up schizophrenia cure
OpenAI fears users will become emotionally dependent on its ChatGPT voice mode
OpenAI is concerned that users may become emotionally dependent on ChatGPT due to its new, human-sounding voice mode, which could affect relationships and social interactions.
The company observed users expressing shared bonds with ChatGPT’s voice mode, raising fears that prolonged use could reduce the need for human interaction and lead to unhealthy trust in AI-supplied information.
OpenAI plans to continue studying the potential for emotional reliance on its tools and aims to navigate the ethical and social implications responsibly while ensuring AI safety.
Google’s new robot can play table tennis like humans
Google’s DeepMind team has developed a table tennis robot that performs at a “solidly amateur” human level, successfully competing against beginner and intermediate players while struggling against advanced ones.
During testing, the robot achieved a 55% win rate against intermediate players, winning 45% of the 29 games it played in total, but it failed to win any matches against advanced players.
DeepMind identifies the robot’s main weaknesses as reacting to fast balls and dealing with system latency, suggesting improvements like advanced control algorithms and predictive models for better performance.
Researchers at Uppsala University recently used AI to accurately predict 3D structures of receptors linked to schizophrenia and depression treatments and speed up possible treatment strategies.
The AI model predicted the structure of TAAR1, a receptor linked to schizophrenia and depression treatments.
Then, supercomputers screened millions of molecules to find those fitting the AI-generated model.
Experimental testing confirmed many AI-predicted molecules activated TAAR1, and one potent molecule showed promising positive effects in animal experiments.
Researchers reported on a new model that can predict major diseases early enough to treat them, and now AI is working on curing schizophrenia and depression. As the tech continues to improve, we’re going to see a complete transformation in healthcare that will likely save millions, if not billions, of lives.
Microsoft and Palantir just partnered to deliver advanced AI, including GPT-4, and analytics capabilities to U.S. Defense and Intelligence agencies through classified cloud environments.
The partnership integrates Palantir’s AI Platforms with Microsoft’s Azure OpenAI Service in classified clouds.
The aim is to safely and securely enable AI-driven operational workloads across defense and intelligence sectors.
OpenAI’s models, including GPT-4, will be leveraged by the U.S. government to develop innovations for national security missions.
AI being trusted with classified documents is a big leap in its acceptance as a useful tool for humanity. However, it does feel a bit unsettling knowing that OpenAI’s models are being used at the government level, with the safety team completely dissolving last month and the still uncovered mysteries sorrounding Q*.
Galileo*: Our latest LLM Hallucination Index ranks 22 of the leading models on their performance across 3 different RAG tasks, evaluating the correctness of their responses and propensity to hallucinate.Read the report
A Daily Chronicle of AI Innovations on August 08th 2024
Humane’s AI Pin daily returns are outpacing sales
Sam Altman teases ‘Project Strawberry‘
AI breakthrough accurately predicts diseases
OpenAI bets $60M on webcams
Humane’s AI Pin daily returns are outpacing sales
Humane has faced considerable challenges with the AI Pin, seeing more returns than purchases between May and August, with current customer holdings near 7,000 units.
The AI Pin received negative reviews at launch, leading to efforts by Humane to stabilize operations and look for potential buyers or additional funding from investors.
Humane’s total sales of the AI Pin and accessories have only reached $9 million, which is significantly lower than the $200 million investment from prominent Silicon Valley executives.
OpenAI is reportedly leading a $60 million Series B funding round for Opal, a company known for high-end webcams, with plans to develop AI-powered consumer devices.
Opal plans to expand beyond high-end webcams and develop creative tools powered by OpenAI’s AI models.
The startup will work closely with OpenAI researchers to prototype various device ideas.
OpenAI executives are reportedly most interested in integrating their new voice AI models into Opal’s devices.
OpenAI’s $60 million bet on Opal and Sam Altman’s personal investments in AI hardware startups signals a major push from the AI giant to bring advanced AI from the cloud directly into users’ hands.
A new unknown AI model has appeared in the LMSYS Chatbot Arena, igniting rumors that it could be OpenAI’s highly anticipated Q* AI breakthrough or its evolution — codenamed ‘Strawberry’.
A new ‘anonymous-chatbot’ appeared in the LMSYS Chatbot Arena — an open-source platform where AI startups often test upcoming releases.
Previously, OpenAI tested GPT-4o with gpt2-chatbot two weeks before releasing it to the public, which put the arena on high alert for new AI models.
Testers of “anonymous-chatbot” report that it shows more advanced reasoning than GPT-4o and any other frontier model.
To add fuel to the speculation, Sam Altman tweeted a picture of a Strawberry on X, which is the codename of OpenAI’s reported secret AI model.
As competitors like Anthropic and Meta start to catch up to GPT-4o, the Internet has been eagerly awaiting OpenAI’s next move. If this mystery model is indeed Q*/Strawberry, then we could be on the cusp of another seismic shift in AI capabilities.
Researchers have just developed an AI model that can predict major diseases like heart conditions, diabetes, and cancer — significantly outperforming existing methods.
The new model analyzes patient data using statistics and deep learning to spot disease indicators more accurately.
It employs a smart algorithm (SEV-EB) to identify crucial health markers, helping doctors prioritize the most relevant patient information.
This achieves 95% accuracy in predicting specific diseases like coronary artery disease, type 2 diabetes, and breast cancer.
It also leverages patients’ digital health records for personalized risk assessment and earlier healthcare interventions.
Remember when AlphaFold cracked the protein folding problem? This could be healthcare’s next big AI moment. By significantly improving disease prediction accuracy, this model could transform early diagnosis and treatment planning to help save millions of lives across the globe
Intel reportedly declined an opportunity to invest in OpenAI in 2017, missing early entry into the AI market due to doubts about AI’s near-term potential.
A Daily Chronicle of AI Innovations on August 07th 2024
Reddit to test AI-powered search result pages
Robot dentist performs first automated procedure
AI robot helps assemble a BMW
New AI can listen while speaking
Reddit to test AI-powered search result pages
Reddit CEO Steve Huffman announced plans to test AI-powered search results later this year, aiming to help users explore products, shows, games, and new communities on the platform.
Huffman indicated that the company might explore monetizing through paywalled subreddits, which could offer exclusive content or private areas while still maintaining the traditional free version of Reddit.
As Reddit seeks to diversify revenue sources, Huffman emphasized that the company has blocked certain entities from accessing Reddit content to ensure transparency and protect user privacy.
A Boston-based tech company, backed by Mark Zuckerberg’s dentist father, completed the world’s first all-robotic dental procedure, marking a significant advancement in medical technology.
The robot, operated by Perceptive, independently performed a process called “cutting,” which involves drilling into and shaving down a tooth, demonstrating its capabilities in Barranquilla, Colombia.
This breakthrough aims to use autonomous machines for procedures like crown placements in as little as 15 minutes, enhancing precision, efficiency, and patient care.
OpenAI-backed startup Figure AI just showed off Figure 02, its next-generation AI-powered humanoid robot — capable of completely autonomous work in complex environments like a BMW factory.
Figure 02 uses OpenAI’s AI models for speech-to-speech reasoning, allowing the humanoid robot to have full conversations with humans.
A Vision Language Model (VLM) enables the robot to make quick, common-sense decisions based on visual input and self-correct errors.
Six RGB cameras provide the robot with 360-degree vision to help it navigate the real world.
The robot stands 5’6″and weighs 132 lbs, with a 44 lb lifting capacity and a 20-hour runtime thanks to a custom 2.25 KWh battery pack.
The humanoid robot race is intensifying, withFigure CEO Brett Adcock claiming that Figure 02 is now the “most advanced humanoid on the planet” — a direct challenge toward Elon Musk and Tesla Optimus. While the world now waits for Elon’s response, Figure has one ace up its sleeve: its OpenAI partnership.
ByteDance, the parent company of TikTok, just launched Jimeng AI for Chinese users, a text-to-video AI app that directly competes with OpenAI’s (unreleased) Sora AI video model.
Jimeng AI is available on the Apple App Store and Android for Chinese users.
ByteDance’s entry into the AI video generation market follows similar launches by other Chinese tech firms, including Kuaishou’s Kling AI.
The subscription, priced at 79 yuan ($11) monthly or 659 yuan ($92) annually allows for the creation of ~2,050 images or 168 AI videos per month.
Unlike OpenAI’s Sora, which isn’t yet publicly available, these models by Jimeng AI are already accessible to users (in China).
China’s AI video generation race is accelerating, with Kling AI’s public release just weeks ago and now ByteDance’s Jimeng AI launching while the world anxiously waits for Sora’s public release. With Jimeng AI being backed by TikTok, it will have plenty of training data and deep pockets to compete against other AI giants.
AI researchers just developed a new Listening-While-Speaking Language Model (LSLM) that can listen and speak simultaneously — advancing real-time, interactive speech-based AI conversations.
The new model, called the Listening-while-Speaking Language Model (LSLM), enables full-duplex modeling in interactive speech-language models.
LSLM uses a token-based decoder-only TTS for speech generation and a streaming self-supervised learning encoder for real-time audio input.
The system can detect turn-taking in real-time and respond to interruptions, a key feature of natural conversation.
The model demonstrated robustness to noise and sensitivity to diverse instructions in experiments.
While OpenAI’s recent Her-like advanced voice mode for ChatGPT inches us toward realistic AI conversations, LSLM leaps even further by enabling AI to process incoming speech WHILE talking. This could revolutionize human-AI interactions — making conversations with machines feel truly natural and responsive.
Reddit announced plans to test AI-generated summaries at the top of search result pages, using a combination of first-party and third-party technology to enhance content discovery.
A Daily Chronicle of AI Innovations on August 06th 2024
Figure unveils new sleeker and smarter humanoid robot
Nvidia used ‘a lifetime’ of videos everyday to train AI
Leaked code reveals Apple Intelligence’s plan to prevent hallucinations
Nvidia trains video model ‘Cosmos’
OpenAI co-founder leaves for Anthropic
Nvidia AI powers robots with Apple Vision Pro OpenAI has a secretive tool to detect AI-generated text Tesla’s AI gives robots human-like vision Nvidia delays new AI chip launch Google’s Gemini 1.5 Pro leads AI chatbot rankings AI turns brain cancer cells into immune cells
Nvidia AI powers robots with Apple Vision Pro
Nvidia introduced a new tool suite for developers to control and monitor robots using Apple’s Vision Pro headset. The MimicGen NIM microservice translates user movements captured by the Vision Pro into robot actions, enabling intuitive control of robotic limbs.
Additionally, Nvidia’s Isaac Sim can generate synthetic datasets from these captured movements, which reduces the time and cost of collecting real-world data for robot training.
Why does it matter?
This advancement is a practical application of teleoperation. It can lead to more intuitive and effective ways for humans to interact with and control robots and improve their usability in various fields such as manufacturing, healthcare, and service industries.
Leaked documents obtained by 404 media report Nvidia has been scraping millions of videos daily from YouTube, Netflix, and other sources to train its unreleased foundational AI model.
Nvidia’s project, codenamed Cosmos, aims to process “a human lifetime visual experience worth of training data per day.”
The company used open-source tools and virtual machines to download videos, including full-length movies and TV shows.
Employees raised concerns about copyright and ethics, but were told there was “umbrella approval” from executives.
Nvidia claims its practices are “in full compliance with the letter and spirit of copyright law.”
Project Cosmos appears to be Nvidia’s big move into video-based AI, which could revolutionize everything from 3D world generation to self-driving cars, digital humans, and more. However, this harsh introduction is not a good look for the company, especially as the industry’s practices are coming under intense scrutiny.
OpenAI has a secretive tool to detect AI-generated text
OpenAI has been sitting on a tool that can detect AI-assisted cheating for nearly a year. Using an invisible watermarking technique, the company has developed a tool that can detect ChatGPT-generated text with 99.9% accuracy. However, internal debates about user retention, potential bias, and distribution methods have kept this technology under wraps.
Meanwhile, educators are desperately seeking ways to detect AI misuse in schools. A recent survey found that 59% of middle- and high-school teachers were confident some students had used AI for schoolwork, up 17 points from the previous year.
Why does it matter?
This tool could preserve the value of original thought in education. However, OpenAI’s hesitation shows there are complex ethical considerations about AI detection and unintended consequences in language communities.
Three key leaders at OpenAI are departing or taking leave, including co-founder John Schulman, co-founder Greg Brockman, and Peter Deng — another major shakeup for the AI powerhouse.
John Schulman, co-founder and a key leader at OpenAI, has left to join rival AI startup Anthropic — one of OpenAI’s biggest competitors.
Greg Brockman, OpenAI’s president and co-founder, is taking an extended leave of absence until the end of the year.
Peter Deng, a product leader who joined last year from Meta, has reportedly also departed.
These moves follow other recent high-profile exits, including co-founders Ilya Sutskever and Andrej Karpathy.
OpenAI has struggled to regain its footing after Sam Altman’s departure and eventual return as CEO in November 2023. Brockman, one of Altman’s biggest supporters during the ousting, mysteriously takes a leave of absence at a crucial time as OpenAI sees increased competition from Anthropic and Meta AI.
Tesla’s latest patent introduces a vision system for autonomous robots, particularly its humanoid robot Optimus. The end-to-end AI model uses only camera inputs to create a detailed 3D understanding of the environment, without using expensive sensors like LiDAR.
By dividing the space into voxels (3D pixels), the system can predict each spatial unit’s occupancy, shape, semantics, and motion in real-time. It has already been implemented, with Tesla’s manufacturing team training and deploying the neural network in Optimus for tasks like picking up battery cells on a conveyor belt.
Why does it matter?
The development of such AI-driven perception technologies could lead to progress in autonomous systems for more sophisticated and reliable operations.
The Information reports that design flaws could delay the launch of Nvidia’s next-gen AI chips by three months or more. This setback could affect giants like Microsoft, Google, and Meta, who have collectively placed orders worth tens of billions of dollars for these chips.
Despite the rumored delay, Nvidia maintains that production of its new Blackwell chip series is on track. The company also reports strong demand for its Hopper chips and says a broad sampling of Blackwell has already begun. However, sources claim that Microsoft and another major cloud provider were informed of production delays just this week.
Why does it matter?
A slowdown in chip availability could hamper the development and deployment of new AI technologies, affecting everything from cloud services to generative AI applications. It also highlights the delicate balance and vulnerabilities in the AI supply chain.
Google has launched Gemini 1.5 Pro, an experimental version available for early testing. It quickly claimed the top spot on the LMSYS Chatbot Arena leaderboard, outperforming OpenAI’s GPT-4o and Anthropic’s Claude-3.5 Sonnet. With an impressive Elo score of 1300, Gemini 1.5 Pro excels in multilingual tasks, technical areas, and multimodal capabilities.
The model builds on the foundation of Gemini 1.5, boasting a massive context window of up to two million tokens.
Why does it matter?
Google’s decision to make the model available for early testing reflects a growing trend of open development and community engagement in the AI industry. The company’s focus on community feedback also reflects its move toward responsible AI development.
Researchers at the Keck School of Medicine of USC used AI to reprogram glioblastoma cells into cancer-fighting dendritic cells. It increased survival chances by up to 75% in mouse models of glioblastoma, the deadliest form of brain cancer in adults. The technique cleverly bypasses the blood-brain barrier by converting cancer cells within the tumor itself, a major hurdle in traditional glioblastoma treatments.
The approach greatly improved survival rates in animal models when combined with existing treatments like immune checkpoint therapy or DC vaccines. The research team aims to begin clinical trials in patients within the next few years
Why does it matter?
The technique offers new hope for patients facing this aggressive disease. Moreover, the approach’s application to other cancer types suggests a broader impact on cancer immunotherapy, transforming how we approach cancer treatment in the future.
Figure unveils new sleeker and smarter humanoid robot
Figure has introduced its new humanoid robot, the Figure 02, which features improved hardware and software, including six RGB cameras and enhanced CPU/GPU computing capabilities.
Leveraging a longstanding partnership with OpenAI, the Figure 02 is equipped for natural speech conversations, featuring speakers and microphones to facilitate communication with human co-workers.
Figure 02’s advanced AI and language processing aim to make interactions transparent and safe, which is crucial given the robot’s potential use alongside humans in factory and commercial environments.
Nvidia used ‘a lifetime’ of videos everyday to train AI
Nvidia collected videos from YouTube and other sites to create training data for its AI products, as shown by internal documents and communications obtained by 404 Media.
Nvidia asserted that their data collection practices align with both the letter and spirit of copyright law when questioned about legal and ethical concerns regarding the use of copyrighted material.
A former Nvidia employee revealed that workers were directed to gather videos from sources like Netflix and YouTube to train AI for the company’s 3D world generator project, internally referred to as Cosmos.
Leaked code reveals Apple Intelligence’s plan to prevent hallucinations
Leaked code for macOS Sequoia 15.1 has revealed pre-prompt instructions for Apple Intelligence to minimize hallucinations and improve accuracy in responses.
These pre-prompt instructions include directives for Apple Intelligence to ensure questions and answers in mail assistance are concise and relevant to avoid false information.
Instructions also specify limitations for creating photo memories, prohibiting religious, political, harmful, or provocative content to maintain a positive user experience.
OpenAI’s co-founder John Schulman has left for rival Anthropic and wants to focus on AI alignment research. Meanwhile, another co-founder and president of OpenAI Greg Brockman, is taking a sabbatical.
Meta is offering Judi Dench, Awkwafina, and Keegan-Michael Key millions for AI voice projects. While some stars are intrigued by the pay, others disagree over voice usage terms.
YouTube creator David Millette sued OpenAI for allegedly transcribing millions of videos without permission, claiming copyright infringement and seeking over $5 million in damages.
Google hired Character.AI’s co-founders Noam Shazeer and Daniel De Freitas for the DeepMind team, and secured a licensing deal for their large language model tech.
Black Forest Labs, an AI startup, has launched a suite of text-to-image models in three variants: [pro], [dev], and [schnell], which outperforms competitors like Midjourney v6.0 and DALL·E 3.
OpenAI has rolled out an advanced voice mode for ChatGPT to a select Plus subscribers. It has singing, accent imitation, language pronunciation, and storytelling capabilities.
Google’s latest Gemini ad shows a dad using Gemini to help his daughter write a fan letter to an Olympian. Critics argue it promotes lazy parenting and undermines human skills like writing. Google claims the ad aims to show Gemini as a source of initial inspiration.
Stability AI has introduced Stable Fast 3D which turns 2D images into detailed 3D assets in 0.5 seconds. It is significantly faster than previous models while maintaining high quality.
Google’s “About this image” tool is now accessible through Circle to Search and Google Lens. With a simple gesture, you can now check if an image is AI-generated, how it’s used across the web, and even see its metadata.
Karpathy/Nano-Llama31: a minimal, dependency-free version of the Llama 3.1 model architecture, enabling simple training, finetuning, and inference with significantly lighter dependencies compared to the official Meta and Hugging Face implementations.
Secretaries of state from five U.S. statesurged Elon Musk to address misinformation spread by X’s AI chatbot Grok regarding the upcoming November election.
A Daily Chronicle of AI Innovations on August 05th 2024
Neuralink successfully implants brain chip in second patient
OpenAI has a ‘highly accurate’ ChatGPT text detector, but won’t release it for now
Elon Musk is suing OpenAI and Sam Altman again
Meta AI’s new Hollywood hires
Google absorbs Character AI talent
Tesla unveils new AI vision for robots
Google takes another startup out of the AI race
Google pulls AI Olympics ad after backlash
Nvidia delays next AI chip due to design flaw
Meta AI’s new Hollywood hires
Meta is reportedly offering millions to celebrities like Awkwafina, Judi Dench, and Keegan-Michael Key to use their voices in upcoming AI projects.
The AI voices would be used across Meta’s platforms, including Facebook, Instagram, and Meta Ray-Ban smart glasses.
Meta is reportedly rushing to secure deals before its Meta Connect conference in September.
Contracts are reportedly temporary, with actors having the option to renew.
Meta has previously experimented with celebrity-inspired chatbots, though that program has ended.
In our exclusive interview with Mark Zuckerberg, he predicted that “we’re going to live in a world where there are going to be hundreds of millions or billions of different AI agents”. If it holds true, celebrity voice-powered AI could be part of Meta’s next big play to drive user engagement and growth on the platform.
Google has signed a non-exclusive licensing agreement with AI startup Character AI for its large language model technology, while also reabsorbing the startup’s co-founders and key talent back into its AI team.
Character AI co-founders Noam Shazeer and Daniel De Freitas return to Google, their former employer.
Google gains a non-exclusive license to Character AI’s language model technology.
About 30 of Character AI’s 130 employees, mainly those working on model training and voice AI, will join Google’s Gemini AI efforts.
Character AI will switch to open-source models like Meta’s Llama 3.1 for its products, moving away from in-house models.
This deal highlights the intensifying race to secure top AI talent, mirroring Microsoft’s recent deal with Inflection and Amazon’s deal with Adept. As AI becomes increasingly critical to tech companies’ futures, these talent grabs could reshape the landscape, while raising antitrust concerns.
Tesla just filed a patent for an AI-powered vision system that could transform how autonomous robots perceive and navigate their environment using only camera inputs.
The system uses a single neural network to process camera data and output detailed 3D environment information without LiDAR or radar.
It divides space into 3D voxels, predicting occupancy, shape, semantic data, and motion for each in real time.
The tech is designed to run on a robot’s onboard computer, enabling immediate decision-making.
This system could be implemented in both Tesla’s vehicles and humanoid robots like Optimus.
By relying solely on camera inputs and onboard processing, Tesla’s new vision system could enable robots to navigate diverse environments more efficiently and adapt to changes in real time. This would eliminate the need for extensive pre-mapping and accelerate the arrival of affordable, autonomous robots.
Neuralink successfully implants brain chip in second patient
Elon Musk’s brain-computer interface startup, Neuralink, has commenced its second human trial, revealing that the implant is successfully functioning with about 400 electrodes providing signals.
Musk claimed that Neuralink could bestow exceptional abilities such as thermal and eagle vision, and potentially restore blindness and cure neurological disorders in humans.
Despite some initial problems and federal investigations into animal testing practices, Neuralink has over 1,000 volunteers for further trials and plans to implant chips in up to eight more patients by the end of 2024.
OpenAI has a ‘highly accurate’ ChatGPT text detector, but won’t release it for now
OpenAI has an AI-detection tool that is highly effective at identifying AI-generated text, but the company hesitates to release it to avoid upsetting its user base.
The tool, reportedly 99.9% effective, is much more accurate than previous detection algorithms and utilizes a proprietary watermarking system to identify AI-created content.
Despite its potential to aid educators in spotting AI-generated homework, OpenAI is concerned about potential deciphering of their technique and biases against non-native English speakers.
Elon Musk has filed a new lawsuit against OpenAI, Sam Altman, and Greg Brockman, accusing them of breaching the company’s founding mission to benefit humanity with artificial intelligence.
The lawsuit alleges that Altman and Brockman manipulated Musk into co-founding OpenAI by promising it would be safer and more transparent than profit-driven alternatives.
Musk previously withdrew a similar lawsuit in June, but the new suit claims that OpenAI violated federal racketeering laws and manipulated its contract with Microsoft.
Founders of Character.AI, Noam Shazeer and Daniel De Freitas, along with other team members, are rejoining Google’s AI unit DeepMind, the companies announced on Friday.
Character.AI reached a $1 billion valuation last year and plans to offer a nonexclusive license of its large language models to Google, which will help fund its growth and the development of personalized AI products.
The founders, who left Google in 2021 due to disagreements about advancing chatbot technologies, are now returning amid a competitive AI landscape and will contribute to DeepMind’s research team.
Google has withdrawn its “Dear Sydney” ad from the Olympics after receiving significant backlash from viewers and negative feedback on social media.
The controversial advertisement featured a father using the Gemini AI to write a fan letter to Olympic track star Sydney McLaughlin-Levrone on behalf of his daughter, instead of composing it together.
Critics argued that the ad missed the essence of writing a personal fan letter and feared it promoted AI as a substitute for genuine human expression.
The production of Nvidia’s “Blackwell” B200 AI chips has been delayed by at least three months due to a late-discovered design flaw, according to sources.
The B200 chips are successors to the highly sought-after H100 chips and were expected to power many AI cloud infrastructures, but now face production setbacks.
Nvidia is collaborating with Taiwan Semiconductor Manufacturing Company to address the issue, with large-scale shipments now anticipated in the first quarter of next year.
For the first time ever, Google DeepMind’s experimental Gemini 1.5 Pro has claimed the top spot on the AI Chatbot Arena leaderboard, surpassing OpenAI’s GPT-4o and Anthropic’s Claude-3.5 with an impressive score of 1300.
Gemini 1.5 Pro (experimental 0801) gathered over 12K community votes during a week of testing on the LMSYS Chatbot Arena.
The new experimental model achieved the #1 position on both the overall and vision leaderboards.
The experimental version is available for early testing in Google AI Studio, the Gemini API, and the LMSYS Chatbot Arena.
Google DeepMind hasn’t disclosed specific improvements, but promises more updates soon.
Without any announcement, Gemini 1.5 Pro unexpectedly rose to the top of the overall AI chatbot leaderboard — by a whopping 14 points. The leap means that either Google just quietly established itself as the new leader in the LLM space, or we’re on the cusp of major competitive responses from industry rivals.
Meta’s Llama 3.1 allows users to search the internet and train the AI to write in their personal style, saving you time on content creation and research processes.
Access Llama 3.1 through Meta AI and log in with your Facebook or Instagram account.
Use the internet search feature by asking questions like “Summarize the Olympics highlights this week.”
Train Llama 3.1 in your voice by providing a sample of your best content and instructing it to mimic your style.
Generate content by asking Llama 3.1 to create posts on your desired topics.
Pro tip: The more examples and feedback you provide, the better Llama 3.1 will become at emulating your unique writing style!
Stability AI just introduced Stable Fast 3D, an AI model that generates high-quality 3D assets from a single image in just 0.5 seconds — potentially reshaping industries from gaming to e-commerce.
The model creates complete 3D assets, including UV unwrapped mesh, material parameters, and albedo colors with reduced illumination bake-in.
It outperforms previous models, reducing generation time from 10 minutes to 0.5 seconds while maintaining high-quality output.
Stable Fast 3D is available on Hugging Face and through Stability AI’s API, under Stability AI’s Community License.
The leap from 10 minutes to 0.5 seconds for high-quality 3D asset generation is nothing short of insane. We’re entering a world where video games will soon feature infinite, dynamically generated assets, e-commerce will have instant 3D product previews, architects will see designs in real-time, and so much more.
🔍 Gemma Scope: helping the safety community shed light on the inner workings of language models.
Explainable AI: One of the most requested feature for LLMs is to understand how to take internal decisions. This is a big step towards interpretability “This is a barebones tutorial on how to use Gemma Scope, Google DeepMind’s suite of Sparse Autoencoders (SAEs) on every layer and sublayer of Gemma 2 2B and 9B. Sparse Autoencoders are an interpretability tool that act like a “microscope” on language model activations. They let us zoom in on dense, compressed activations, and expand them to a larger but sparser and seemingly more interpretable form, which can be a very useful tool when doing interpretability research!”
AI systems can be powerful but opaque “black boxes” – even to researchers who train them. ⬛
Enter Gemma Scope: a set of open tools made up of sparse autoencoders to help decode the inner workings of Gemma 2 models, and better address safety issues.
What else is happening in AI on August 02nd 2024
Google introduced three new AI features for Chrome, including Google Lens for desktop, Tab compare for product comparisons, and an improved browsing history search.
GitHub launched GitHub Models, a new platform allowing developers to access and experiment with various AI models directly on GitHub, including a playground, Codespaces integration, and deployment.
Healx, an AI-enabled drug discovery startup,raised $47 million in Series C funding and received regulatory clearance to start Phase 2 clinical trials for a new rare disease treatment in the U.S.
Google is facing backlash over its Gemini AI Olympics-themed ad, with critics arguing it promotes overreliance on AI tools at the expense of children’s learning and creativity.
Microsoft officially listed OpenAI as a competitor in AI offerings and search advertising in its annual report, despite their long-term partnership and Microsoft’s significant investment in the company.
Character AI open-sourced Prompt Poet, their innovative approach to prompt design, aiming to revolutionize how AI interactions are built and managed in production environments.
A Daily Chronicle of AI Innovations on August 01st 2024
Microsoft declares OpenAI as competitor
Meta is proving there’s still big AI hype on Wall Street
Reddit CEO says Microsoft needs to pay to search the site
Google launches three ‘open’ AI models prioritizing safety and transparency
Google’s tiny AI model bests GPT-3.5
Taco Bell’s AI drive-thru
AI reprograms brain cancer cells
Google’s tiny AI model bests GPT-3.5
Taco Bell’s AI drive-thru
Microsoft declares OpenAI as competitor
Microsoft has officially listed OpenAI as a competitor in AI, search, and news advertising in its latest annual report, signalling a shift in their relationship.
Despite Microsoft being the largest investor and exclusive cloud provider for OpenAI, both companies are now encroaching on each other’s market territories.
An OpenAI spokesperson indicated that their competitive dynamic was always expected as part of their partnership, and Microsoft still remains a strong partner for OpenAI.
Meta is proving there’s still big AI hype on Wall Street
Meta’s shares surged by about 7% in extended trading after surpassing Wall Street’s revenue and profit expectations and providing an optimistic forecast for the current period.
The company reported a 22% increase in second-quarter revenue to $39.07 billion and a 73% rise in net income, attributing the growth to gains in the digital ad market and cost-cutting measures.
Meta continues to invest heavily in AI and VR technologies, with plans for significant capital expenditure growth in 2025 to support AI research and development, despite a broader downsizing effort.
Google launches three ‘open’ AI models prioritizing safety and transparency
Google has unveiled three new models to the Gemma 2 lineup, building on the original models released in June 2024, focusing on performance and safety enhancements.
The first addition, Gemma 2 2B, provides improved capabilities and is adaptable for various devices, while ShieldGemma and Gemma Scope focus on content safety and model interpretability, respectively.
These new tools and models are available on platforms like Kaggle and Hugging Face, promoting broader use and development within the AI community with a focus on responsible innovation.
Researchers at USC made a breakthrough using AI to reprogram glioblastoma cells into immune-activating dendritic cells in mouse models, potentially revolutionizing treatment for the deadly brain cancer.
Glioblastoma is the deadliest adult brain cancer, with less than 10% of patients surviving five years after diagnosis.
AI identified genes that can convert glioblastoma cells into dendritic cells (DCs), which sample cancer antigens and activate other immune cells to attack the tumor.
In mouse models, this approach increased survival chances by up to 75% when combined with immune checkpoint therapy.
Researchers have also identified human genes that could potentially reprogram human glioblastoma cells, paving the way for future clinical trials.
By turning cancer cells against themselves, this new research offers a novel way to fight tumors from within. If the 75% increased survival chances in mice translate to humans, this could not only revolutionize glioblastoma treatment but potentially open doors for similar approaches in other hard-to-treat cancers.
Taco Bell’s parent company, Yum Brands, just announced plans to roll out AI-powered drive-thru ordering at hundreds of restaurants in the U.S. by the end of 2024, with ambitions for global implementation.
The AI understands orders, auto-inputs them into the system, and even suggests additional items — potentially increasing sales through upselling.
Over 100 Taco Bell restaurants in the U.S. already use voice AI in drive-thrus.
The company has been testing the AI for over two years and claims it has outperformed humans in accuracy, reduced wait times, and decreased employee workload.
Rivals like Wendy’s and White Castle are also experimenting with AI ordering, while McDonald’s recently ended its IBM partnership for similar tech.
IfTaco Bell’s positive results on their two-year test are any indication, this large-scale AI implementation could change the way fast-food chain businesses operate and how we order food at drive-thrus. However, the success (or failure) of this rollout could set the tone for the entire industry’s adoption.
Google just unveiled Gemma 2 2B, a lightweight AI model with just 2B parameters that outperforms much larger models like GPT-3.5 and Mixtral 8x7B on key benchmarks.
Gemma 2 2B boasts just 2.6B parameters, but was trained on a massive 2 trillion token dataset.
It scores 1130 on the LMSYS Chatbot Arena, matching GPT-3.5-Turbo-0613 (1117) and Mixtral-8x7b (1114) — models 10x its size.
Other notable key benchmark scores include 56.1 on MMLU and 36.6 on MBPP, beating its predecessor by over 10%.
The model is open-source, and developers can download the model’s weights from Google’s announcement page.
As we enter a new era of on-device, local AI, lightweight and efficient models are crucial for running AI directly on our phones and laptops. With Gemma 2 beating GPT-3.5 Turbo at just 1/10th the size, Google isn’t just showing what’s possible — they’re cementing their position as the leader in the small model space.
Google expanded access to its “About this image” tool, making it available through Circle to Search and Google Lens, allowing users to quickly get context on images they encounter online or via messaging.
NEURA, a German robotics company, released a new video showcasing their humanoid robot 4NE-1 performing tasks like chopping vegetables, ironing cloths, solving puzzles, and more.Source: https://x.com/TheHumanoidHub/status/1818726046633804184
Synthesia introduced “Personal Avatars,” AI-generated lifelike avatars created from brief webcam or phone footage, allowing users to create short-form videos for social media in multiple languages.Source: https://www.synthesia.io/features/custom-avatar/persona
Enjoying these FREE AI updates without the clutter, Set yourself up for promotion or get a better job by Acing the AWS Certify Data Engineer Associate Exam (DEA-C01) with the book or App below:
Welcome to our blog series “AI Innovations in July 2024”! As we continue to ride the wave of extraordinary developments from June, the momentum in artificial intelligence shows no signs of slowing down. Last month, we witnessed groundbreaking achievements such as the unveiling of the first quantum AI chip, the successful deployment of autonomous medical drones in remote areas, and significant advancements in natural language understanding that have set new benchmarks for AI-human interaction.
July promises to be just as exhilarating, with researchers, engineers, and visionaries pushing the boundaries of what’s possible even further. In this evolving article, updated daily throughout the month, we’ll dive deep into the latest AI breakthroughs, advancements, and milestones shaping the future.
From revolutionary AI-powered technologies and cutting-edge research to the societal and ethical implications of these innovations, we provide you with a comprehensive and insightful look at the rapidly evolving world of artificial intelligence. Whether you’re an AI enthusiast, a tech-savvy professional, or simply someone curious about the future, this blog will keep you informed, inspired, and engaged.
Join us on this journey of discovery as we explore the frontiers of AI, uncovering the innovations that are transforming industries, enhancing our lives, and shaping our future. Stay tuned for daily updates, and get ready to be amazed by the incredible advancements happening in the world of AI!
OpenAI has begun a limited rollout of its hotly anticipated ‘Advanced Voice Mode’ for paying ChatGPT Plus users, offering natural, real-time conversations and the ability for the AI to detect and respond to emotions.
The feature will initially be available to a small group of ChatGPT Plus users, with plans to give all Plus users access by fall 2024.
Advanced Voice Mode uses GPT-4o and can sense emotions in users’ voices, including sadness, excitement, or singing.
Video and screen-sharing capabilities, previously showcased in OpenAI’s early demo, will launch at a ‘later’ date.
OpenAI has sent email instructions to the initial ‘Alpha‘ group selected for early access.
AI is slowly shifting from a tool we text/prompt with, to an intelligence that we collaborate, learn, and grow with. Advanced Voice Mode’s ability to understand and respond to emotions in real-time convos could also have huge use cases in everything from customer service to mental health support.
Google cracks down on explicit deepfakes in search results
Google is introducing new online safety features designed to remove explicit deepfakes from Search, making it harder for such content to appear prominently in search results.
When users request the removal of explicit nonconsensual fake images of themselves, Google’s systems will now filter out similar explicit results and remove duplicate images from related search queries.
Google’s updates also include demoting sites with extensive removals for fake explicit imagery in Search rankings and ensuring that searches for deepfake images yield high-quality, non-explicit content instead.
The “Friend” AI necklace, created by Avi Schiffmann, is designed to provide personal companionship through support and encouragement, connecting to an iPhone via Bluetooth.
Unlike other AI wearables that failed, Friend listens to interactions around the wearer and sends supportive messages, storing all data locally on the device.
Schiffmann described the device as an expression of loneliness and emphasized its role as a supportive and validating companion, useful for brainstorming and discussing relationships.
Perplexity just introduced a “Publishers’ Program” to share ad revenue with media partners, following recent plagiarism accusations and aiming to support quality journalism in the age of AI-powered search.
The program includes cash advances on future revenue as Perplexity builds its advertising model, set to launch in September.
Initial partners include Time, Der Spiegel, Fortune, WordPress.com, and more, who will receive a “double-digit percentage” of ad revenue.
Partners also get free access to Perplexity’s Enterprise Pro tier, developer tools, and insights through Scalepost AI.
Despite constant pushback on AI firms and their training data, media companies are finding few available paths forward other than accepting partnership deals. Perplexity’s initiative is a good step toward fairness, but it likely won’t be the end of the growing pains with publishers.
A Daily chronicle of AI Innovations July 30th 2024:
Instagram now lets you create an AI chatbot of yourself
Perplexity’s new revenue sharing plan
Nvidia announces new support for humanoid robots
Meta’s new open-source model could be the ‘GPT-4 moment’ for computer vision
Zuck and Huang envision AI’s future
Runway releases image-to-video AI
Apple says its AI models were trained on Google’s custom chips
Meta released world’s largest open-source LLM to date Mistral AI released its Llama 3.1 rival, Mistral Large 2 US lawmakers are requesting OpenAI for government access DeepMind’s new AI is a silver medalist in the IMO math Olympiad OpenAI announced SearchGPT, an AI-powered search engine Apple revealed AI models powering Apple Intelligence
Instagram now lets you create an AI chatbot of yourself
Meta has released a new tool called AI Studio, enabling users in the US to create AI characters on Instagram or the web to interact with followers on their behalf.
These AI profiles can engage in direct chat threads, respond to comments, and are customizable based on the creator’s Instagram content and specified interaction guidelines.
In addition to creating personalized AI, users can also design entirely new characters to use across Meta’s platforms, with Meta ensuring these AI profiles are clearly labeled to avoid confusion.
Perplexity has started a program to share advertising revenue with publishers after facing plagiarism accusations from several media outlets.
The “Publishers’ Program” includes partners like Time, Der Spiegel, and Automattic, who will receive a portion of ad revenue for their content used by Perplexity.
This initiative follows investigations by Forbes and Wired, which reported Perplexity’s AI misusing and paraphrasing their articles without proper attribution.
Nvidia has introduced a new suite of services, including the NIM microservices platform and the OSMO orchestration service, to aid in the development, simulation, and training of humanoid robots.
CEO Jensen Huang emphasized that Nvidia is advancing its robotics stack to support global humanoid developers, offering platforms, acceleration libraries, and AI models tailored for their needs.
At the SIGGRAPH conference, Nvidia showcased an AI-enabled teleoperation workflow and detailed three robotics development platforms: Nvidia AI supercomputers, Nvidia Isaac Sim, and Nvidia Jetson Thor humanoid robot computers.
Meta’s new open-source model could be the ‘GPT-4 moment’ for computer vision
Meta has introduced SAM 2, a cutting-edge open-source model for segmenting both images and videos, marking a significant advancement in computer vision similar to OpenAI’s GPT-4 in natural language processing.
While the original SAM focused solely on images, SAM 2 excels in video segmentation, effectively handling lower-quality footage and partially obscured objects, thanks to training on a vast new video dataset.
SAM 2’s improved accuracy, enhanced memory module for better object tracking, and faster processing speed positions it as a groundbreaking tool in the fields of video editing, robotics, and generative AI, despite some limitations.
During a fireside chat at SIGGRAPH 2024, Meta CEO Mark Zuckerberg and NVIDIA CEO Jensen Huang spoke about their shared vision for the AI-powered future.
Both CEOs emphasized the importance of open-source AI, with Zuckerberg highlighting Llama 3.1’s release as an “inflection point.”
Zuckerberg outlined a possible future for social media to evolve from recommending content to AI generating personalized content on the fly.
Huang predicted a shift from turn-based AI interactions to more fluid, multi-option simulations.
The leaders also discussed AI’s potential to transform education, entertainment, and work through smart glasses.
The emphasis on open-source and personalized AI signals a potential shift in how AI will be integrated into everyday life and business. With Meta and NVIDIA’s combined influence, the shared vision could significantly shape the future of AI and its applications across different industries.
Runway just announced that Gen-3 Alpha, the startup’s popular AI text-to-video generation model, can now create high-quality videos from still images.
According to Runway, image-to-video greatly improves the artistic control and consistency of video generations.
Image-to-video generations are either 5 or 10 seconds in length and take up “credits,“ which you have to pay for through Runway’s subscription tiers.
To use the tool, head to Runway’s website, click “try Gen-3 Alpha”, and upload an image to watch it come to life.
The highly anticipated image-to-video generation model opens up a whole new suite of creativity, allowing users to bring any image to life. However, while the increased artistic control and improvements to consistency are notable, Gen-3 Alpha does not come at a cheap price tag.
Apple says its AI models were trained on Google’s custom chips
Apple used Google’s tensor processing units (TPUs) to train two artificial intelligence models, according to a recent research paper.
To train its AI models, the company employed 2,048 TPUv5p chips for devices like iPhones and 8,192 TPUv4 processors for server-based models.
Unlike Nvidia’s GPUs, Google’s TPUs are accessible only via Google Cloud Platform, requiring customers to build software through this platform to utilize the chips.
On July 23rd, Meta officially released the biggest version of its open-source LLM, Llama, a 405 billion-parameter version called Llama-3.1. It also released Llama 3.1 70B and 8B models.
Llama 3.1’s context window has been expanded to 128,000 tokens, meaning users can feed it as much text as in a 400-page novel. It will be multilingual and support English, Portuguese, Spanish, Italian, German, French, Hindi, and Thai.
The 405B model is competitive with leading foundation models across a range of tasks, including GPT-4, GPT-4o, and Claude 3.5 Sonnet. The smaller models also performed similarly.
Users can access Llama 3.1 through AWS, Nvidia, Groq, Dell, Databricks, Microsoft Azure, Google Cloud, and other model libraries. Llama 3.1 405B will also be available on WhatsApp and Meta AI.
Why does it matter?
The move directly challenges industry leaders like OpenAI and Anthropic, particularly OpenAI’s market-leading position. It also underscores Meta’s commitment to open-source development, marking a major escalation in the AI competition.
Mistral AI has announced the next generation of its flagship open-source model with 123 billion parameters, Mistral Large 2. Compared to its predecessor, the model is significantly more capable in code generation, mathematics, and reasoning. It also provides much stronger multilingual support and advanced function-calling capabilities.
However, the model is only licensed as “open” for non-commercial research uses, including open weights, allowing third parties to fine-tune it to their liking. Those seeking to use it for commercial/enterprise-grade applications will need to obtain a separate license and usage agreement from Mistral.
Why does it matter?
Following Meta’s launch of Llama 3.1 as a highly competitive alternative to leading closed-source “frontier” models, the French AI startup entered the fray. The AI race is picking up pace like never before.
Perhaps the most significant portion of the letter was item 9: “Will OpenAI commit to making its next foundation model available to U.S. Government agencies for pre-deployment testing, review, analysis, and assessment?”
The letter outlined 11 additional points to be addressed, including OpenAI’s commitment to dedicating 20% of its computing power to fuel safety research and protocols to prevent malicious actors or foreign adversaries from stealing OpenAI’s products or IP.
Why does it matter?
Regulatory scrutiny is nothing new for OpenAI and the broader AI sector. However, now OpenAI is facing heightened scrutiny, and following developments could drive stringent government oversight and set new industry standards.
Google DeepMind presented AlphaProof, a new reinforcement-learning based system for formal math reasoning, and AlphaGeometry 2, an improved version of its geometry-solving system.
Together, these systems solved four out of six problems from this year’s International Mathematical Olympiad (IMO), achieving the same level as a silver medalist for the first time. Here’s a graph showing the AI system’s performance relative to human competitors at IMO 2024.
Why does it matter?
Solving complex math problems in step-by-step proofs has been a grand challenge for AI. Breakthroughs like these demonstrate AI’s growing ability to match top human minds, with far-reaching implications across various fields.
OpenAI is testing SearchGPT, a prototype combining the strength of its AI models with information from the web. It will quickly and directly respond to your questions with up-to-date information while providing clear links to relevant sources. You’ll also be able to ask follow-up questions.
It is launching to a small group of users and publishers to get feedback. While this prototype is temporary, OpenAI plans to integrate the best of its features directly into ChatGPT in the future.
Why does it matter?
This directly challenges Google’s dominance in the online search market. It also signals a significant escalation in AI search wars, which are already reshaping how users find and interact with information on the web.
Apple published a research paper describing two new foundation language models that form the backbone of Apple Intelligence, its new AI system.
AFM-on-device (AFM stands for Apple Foundation Model), a ∼3 billion parameter language 1 model, and
AFM-server, a larger server-based language model
The models are designed to be fast and run efficiently on iPhone, iPad, and Mac as well as on Apple silicon servers via Private Cloud Compute. They are part of a larger family of generative models created by Apple to support users and developers.
Why does it matter?
Apple Intelligence is designed with Apple’s core values at every step and a foundation of industry-lead privacy protection, showing Apple’s commitment to providing secure, powerful, personalized AI experiences.
OpenAI intensified the AI arms race by announcing free fine-tuning for its GPT-4o Mini model, just hours after Meta launched its open-source Llama 3.1 model.
Stability AI released Stable Video 4D, its first video-to-video AI model that turns a single object video into multiple novel-view videos with eight different angles/views.
A new study found indiscriminate use of AI-generated data in training leads to irreversible defects, termed “model collapse,” where the models plateau and become incoherent.
Kling AI has gone global with an International Version 1.0 to take on OpenAI’s yet-to-be-released video generator, Sora. It is now accessible to all at KlingAI.com, where registration requires only an email address.
Google introduced 1.5 Flash in the unpaid version of Gemini for faster and better responses. It also introduced a new feature to further address hallucinations and expanded Gemini for Teens and mobile apps.
X now automatically activates a setting that allows it to train its Grok AI on user data, including posts, user interactions, inputs, and results. Find out how you can switch it off!
Meta launched AI Studio, a platform built on Llama 3.1 that lets anyone create share, and discover AI characters and allows creators to build an AI as an extension of themselves to reach more fans.
Amazon has reportedly unveiled a new AI chip, boasting 40-50% higher performance than NVIDIA’s at half the cost, aiming to reduce reliance on expensive external chips.
Hugging Face is offering developers an inference-as-a-service powered by Nvidia NIM microservices. It will improve token efficiency by up to 5x with popular AI models.
A Daily chronicle of AI Innovations July 29th 2024:
Apple’s AI features will be late, report claims
AI revolutionizes the 2024 Olympics
Amazon paid $1B for Twitch 10 years ago, it’s still unprofitable
Neuralink-rival integrates ChatGPT into brain implant
Turn text into Sora-like AI videos
Apple’s AI features will be late, report claims
Apple’s AI features, including an improved Siri and ChatGPT integration, are expected to launch with iOS 18.1 in October, not with the initial release of iOS 18 in September.
These artificial intelligence improvements were first introduced at the Worldwide Developer Conference in June and might not be available immediately for new iPhone 16 devices at launch.
Some features will be available in developer betas starting this week, allowing testing before public release, but full functionality for certain enhancements may not be seen until spring 2025.
Neuralink-rival integrates ChatGPT into brain implant
Synchron, a competitor to Neuralink, has integrated OpenAI’s ChatGPT into its brain-computer interface (BCI) to help people with paralysis more easily control digital devices.
The AI addition assists users like Mark, an ALS patient, by predicting and suggesting responses during communication, which they can select using brain signals.
The company’s CEO, Tom Oxley, highlighted the potential of ChatGPT to enhance BCI capabilities, while the cost of Synchron’s implant is estimated to be between $50,000 and $100,000, similar to other medical implants.
The Paris 2024 Summer Olympic Games is showcasing an unexpectedly extensive amount of AI, changing experiences for athletes, spectators, and organizers — potentially signaling a new era in the way that we watch sports.
AthleteGPT, an AI chatbot, is providing 24/7 assistance to athletes through the Athlete365 mobile app.
An AI-powered 3D athlete tracking (3DAT) technology is offering detailed biomechanical insights for performance enhancement.
AI is being used in talent scouting, as demonstrated by a recent IOC pilot program in Senegal.
NBC is also using AI to provide personalized highlights and enhanced real-time statistics for viewers.
The use of AI at a major worldwide sporting event such as the Olympics marks a major moment for AI adoption, moving from previous reluctance to embrace it. As AI continues to become normalized globally, it could pave the way for a new era in sports viewing and management.
Kling AI’s text-to-video feature allows users to create stunning Sora-like videos from simple text prompts, opening up new ways you can produce high-quality visuals.
X (Twitter) automatically enabled a setting allowing user data, including user interactions, posts, inputs, and results, to be used for training and fine-tuning purposes for its Grok AI.
Morgan Stanley deployed its second in-house generative AI application, AI @ Morgan Stanley Debrief, which summarizes video meetings and generates follow-up email drafts.
The National Institute of Standards and Technology (NIST) released Dioptra, an open-source tool for testing AI model risk and measuring the impact of malicious attacks on AI system performance.
Reddit intensified its crackdown on web crawlers by blocking major search engines from surfacing recent posts unless they pay, with Google currently being the only mainstream search engine showing recent results.
Suno introduced a new feature for Pro & Premier users to separate vocals and instrumentals from AI-generated songs, allowing for more control and creative possibilities in music production.
Stanford Engineering and Toyota Research achieved a milestone in autonomous driving by creating the world’s first AI-directed, driverless tandem drift, aiming to advance the safety of automated driving in complex scenarios.
A Daily chronicle of AI Innovations July 26th 2024:
🏅AI: The New Gold Medalist in Empowering Athletes at the Olympics
OpenAI challenges Google with AI search engine SearchGPT
Google DeepMind’s AI takes home silver medal in complex math competition
Video game actors strike over AI concerns
Who will control the future of AI?
🏅AI: The New Gold Medalist in Empowering Athletes at the Olympics
AI as a Catalyst for Inclusion
Kevin Piette, paralyzed for 11 years, recently achieved a remarkable milestone by carrying the Olympic flame while walking. This extraordinary feat was made possible by the Atalante X, an AI-powered exoskeleton developed by French company Wandercraft. 🚀
The Olympics have always been a stage for human excellence, a platform where athletes push the boundaries of physical ability. However, the Games are also evolving into a showcase of technological innovation. Artificial intelligence (AI) is rapidly transforming sports, and its impact extends far beyond performance enhancement.
OpenAI challenges Google with AI search engine SearchGPT
OpenAI announced a new search product called “SearchGPT,” which is currently in the testing phase and aims to compete directly with Google’s Search Generative Experience.
SearchGPT, designed for a limited group of users, offers concise answers and relevant sources, with the intention of making search faster and easier through real-time information.
With this move, OpenAI targets Google’s dominant position in the search market, where Google holds approximately 90% market share, highlighting OpenAI’s significant ambition in the search engine space.
Google DeepMind’s AI takes home silver medal in complex math competition
Google DeepMind has developed an AI system named AlphaProof that achieved 28 points in the International Mathematical Olympiad, equivalent to a silver medalist’s score for the first time.
AlphaProof has managed to solve 83% of all IMO geometry problems over the past 25 years, significantly improving on its predecessor AlphaGeometry, which had a success rate of 53%.
AlphaProof generates solutions by searching and testing various mathematical steps, unlike human participants who rely on theorem knowledge and intuition to solve problems more efficiently.
The Screen Actors Guild-American Federation of Television and Radio Artists (SAG-AFTRA) has decided to strike all video game work under the union’s Interactive Media Agreement starting July 26th.
The strike affects all union actors, voice actors, and motion capture performers, targeting companies such as Activision Blizzard, EA, Insomniac Games, and WB Games, with disagreements over AI protections cited as the main issue.
Despite finding common ground on numerous proposals and the video game producers offering AI consent and fair compensation, SAG-AFTRA and the companies failed to reach a full agreement, leading to the strike.
Sam Altman, CEO of OpenAI, just wrote an op-ed outlining a strategy for ensuring a vision for AI prevails in the United States and allied nations over authoritarian alternatives.
Altman emphasizes the urgent need for a U.S.-led global coalition to advance AI that spreads its benefits and maintains open access.
He proposes four key actions: robust security measures, infrastructure investment, coherent commercial diplomacy, and new models for global AI governance.
The strategy aims to maintain the U.S. lead in AI development while countering efforts by authoritarian regimes to dominate the technology.
Altman suggests creating an international body for AI oversight, similar to the IAEA or ICANN.
Altman’s surprisingly urgent tone in this op-ed highlights the growing risks of AI development in the US. He believes “there is no third option,” either democratic nations lead AI development or authoritarian regimes will — raising a serious call to action for the race of AI dominance.
Google upgraded Gemini with 1.5 Flash, offering faster responses, a 4x larger context window, and expanded access in over 40 languages and 230 countries.
SAG-AFTRA announced a strike for video game performers starting July 26, citing concerns over AI protections in negotiations with major gaming studios, despite progress on wages and job safety.
Sam Altman revealed in a tweet reply that the GPT-4o-Voice Alpha rollout will begin next week for Plus subscribers, expanding OpenAI’s voice generation capabilities.
Udio released version 1.5 of its AI music model, featuring improved audio quality, key control, and new features like stem downloads and audio-to-audio remixing.
Runway’s AI video generator reportedly trained on thousands of YouTube videos without permission, according to a leaked document obtained by 404 Media.
Anthropic’s web crawler allegedly violated website terms of use, with iFixit reporting nearly a million hits in 24 hours, raising concerns about AI companies’ data collection practices.
A Daily chronicle of AI Innovations July 25th 2024:
OpenAI could lose $5B this year and run out of cash in 12 months
Kling AI’s video generation goes global
Apple Maps launches on the web to take on Google
Mistral’s Large 2 is its answer to Meta and OpenAI’s latest models
CrowdStrike offers $10 Uber Eats gift cards as an apology for the outage
Reddit blocking all search engines except Google, as it implements AI paywall
Mistral’s Large 2 takes on AI giants
OpenAI could lose $5B this year and run out of cash in 12 months
OpenAI could lose up to $5 billion in 2024, risking running out of cash within 12 months, according to an analysis by The Information.
The AI company is set to spend $7 billion on artificial intelligence training and $1.5 billion on staffing this year, far exceeding the expenses of rivals.
OpenAI may need to raise more funds within the next year to sustain its operations, despite having already raised over $11 billion through multiple funding rounds.
Mistral’s Large 2 is its answer to Meta and OpenAI’s latest models
French AI company Mistral AI launched its Mistral Large 2 language model just one day after Meta’s release of Llama 3, highlighting the intensifying competition in the large language model (LLM) market.
Mistral Large 2 aims to set new standards in performance and efficiency, boasting significant improvements in logic, code generation, and multi-language support, with a particular focus on minimizing hallucinations and improving reasoning capabilities.
The model, available on multiple platforms including Azure AI Studio and Amazon Bedrock, outperforms its predecessor with 123 billion parameters and supports extensive applications, signaling a red ocean of competition in the AI landscape.
Reddit blocking all search engines except Google, as it implements AI paywall
Reddit has begun blocking search engines from accessing recent posts and comments, except for Google, which has a $60 million agreement to train its AI models using Reddit’s content.
This move is part of Reddit’s strategy to monetize its data and protect it from being freely used by popular search engines like Bing and DuckDuckGo.
To enforce this policy, Reddit updated its robots.txt file, signaling to web crawlers without agreements that they should not access Reddit’s data.
Kling AI, developed by Chinese tech giant Kuaishou Technology, has released its impressive AI video model globally, offering high-quality AI generations that rival OpenAI’s (unreleased) Sora.
Kling can generate videos up to two minutes long, surpassing OpenAI’s Sora’s one-minute limit, however, the global version is limited to five-second generations.
The global version offers 66 free credits daily, with each generation costing 10 credits.
According to Kuaishou, Kling utilizes advanced 3D reconstruction technology for more natural movements.
The platform accepts prompts of up to 2,000 characters, allowing for detailed video descriptions.
When KLING launched a little over a month ago, it was only accessible if you had a Chinese phone number. While global users are still limited to 5-second generations, anyone can now generate their own high-quality videos — putting even more pressure on OpenAI to release its beloved Sora.
OpenAI, whose ChatGPT assistant kicked off an artificial intelligence arms race, is now pursuing a slice of the search industry. The company has unveiled a prototype of SearchGPT, an AI-powered search engine that is widely viewed as a play for rival Google’s $175 billion-per-year search business. But while Google’s use of AI in search results has been met with concern and resistance from publishers, SearchGPT touts its heavy use of citations and was developed alongside publishing partners, including Axel-Springer and the Financial Times. After seeing results to their queries, users will be able to ask follow-up questions in interactions that resemble those with ChatGPT.
A 10,000 person wait list was opened Thursday for a those wanting to test a prototype of the SearchGPT service.
Though currently distinct, SearchGPT will eventually be integrated into ChatGPT.
A Daily chronicle of AI Innovations July 24th 2024:
Google search is thriving despite AI shift
Google is pouring billions into self-driving taxis as Tesla prepares to reveal its rival
Senators demand answers on OpenAI’s practices
Meta’s Llama 3.1 takes on GPT-4o
Adobe’s new AI features for Photoshop
Google search is thriving despite AI shift
Despite concerns from online publishers, Google’s introduction of AI features generating conversational responses to search queries has attracted advertisers and propelled Alphabet’s success.
Alphabet’s revenue for the April-June quarter rose by 14% from last year to $84.74 billion, surpassing analyst expectations and boosting stock prices by 2% in extended trading.
Google’s cloud-computing division, its fastest-growing segment, generated $10.3 billion in revenue in the past quarter, marking its first time surpassing the $10 billion threshold in a single quarter.
Five U.S. Senators have just sent a letter to OpenAI CEO Sam Altman, demanding details about the company’s efforts to ensure AI safety following reports of rushed safety testing for GPT-4 Omni.
Senators question OpenAI’s safety protocols, citing reports that the company rushed safety testing of GPT-4 Omni to meet a May release date.
The letter requests OpenAI to make its next foundation model available to U.S. Government agencies for deployment testing, review, analysis, and assessment.
Lawmakers ask if OpenAI will commit 20% of computing resources to AI safety research, a promise made in July 2023 when announcing the now disbanded “Superalignment team”.
With allegations of rushed safety testing, potential retaliation against whistleblowers, and the disbanding of the “Superalignment team,” OpenAI is under intense scrutiny. This letter also marks a critical moment for the entire AI industry — with the potential to lead to stricter government oversight and new industry standards.
In case you missed our exclusive deep dive with Mark Zuckerberg yesterday, Meta released Llama 3.1, including it’s long awaited 405B paramater model — the first open sourced frontier model that beats top closed models like GPT-4o across several benchmarks.
The 405B parameter version of Llama 3.1 matches or exceeds top closed models on several benchmarks.
Meta is offering open and free weights and code, with a license enabling fine-tuning, distillation into other models, and deployment anywhere.
Llama 3.1 features a 128k context length, multi-lingual abilities, strong code generation performance, and complex reasoning capabilities.
For exclusive insights on Llama 3.1, open source, AI agents, and more, read our full deep dive with Mark Zuckerberg here, or watch the full interview here.
Meta’s release of Llama 3.1 405b is a significant moment in AI history because it’s the first time an open-source AI model matches or outperforms top closed AI models like OpenAI’s GPT-4o. By offering a private, customizable alternative to closed AI systems, Meta is enabling anyone to create their own tailored AI.
Adobe just unveiled major AI-powered updates to Illustrator and Photoshop, leveraging its Firefly AI model to accelerate creative workflows and introduce new generative design capabilities.
Illustrator introduces Generative Shape Fill using Firefly Vector AI to add detailed vectors to shapes and create scalable patterns via text prompts.
Text to Pattern in Illustrator creates scalable, customized vector patterns for designs like wallpapers.
Photoshop’s new AI-powered Selection Brush Tool and Generate Image function are now generally available.
Photoshop also gets an enhanced version of its popular Generative Fill for improved sharpness in large images.
These updates could dramatically increase designers’ productivity by automating tedious, time-consuming tasks. We’ve always preached that the best AI products are those embedded into everyday workflows — and Adobe is doing just that by putting powerful tech directly into designers’ everyday tools.
A Daily chronicle of AI Innovations July 23rd 2024:
Meta releases its most powerful AI model yet
Alexa is losing Amazon billions of dollars
The “world’s most powerful” supercomputer
Google’s AI-powered weather model
MIT’s AI identifies breast cancer risk
Musk unveils the world’s most powerful AI training cluster Robotics won’t have a ChatGPT-like explosion: New Research NeuralGCM predicts weather faster than SOTA climate models
Robotics won’t have a ChatGPT-like explosion: New Research
Coatue Management has released a report on AI humanoids and robotics’s current and future state. It says robotics will unlikely have a ChatGPT-like moment where a single technology radically transforms our work. While robots have been used for physical labor for over 50 years, they have grown linearly and faced challenges operating across different environments.
The path to broad adoption of general-purpose robots will be more gradual as capabilities improve and costs come down. Robotics faces challenges like data scarcity and hardware limitations that digital AI technologies like ChatGPT do not face. But investors are still pouring billions, hoping software innovations could help drive value on top of physical robotics hardware.
Why does it matter?
We’re on the cusp of a gradual yet profound transformation. While robotics may not suddenly become ubiquitous, the ongoing progress in artificial intelligence and robotics will dramatically alter the landscape of numerous fields, including manufacturing and healthcare.
NeuralGCM predicts weather faster than SOTA climate models
Google researchers have developed a new climate modeling tool called NeuralGCM. This tool uses a combination of traditional physics-based modeling and machine learning. This hybrid approach allows NeuralGCM to generate accurate weather and climate predictions faster and more efficiently than conventional climate models.
NeuralGCM’s weather forecasts match the accuracy of current state-of-the-art (SOTA) models for up to 5 days, and its ensemble forecasts for 5-15 day predictions outperform the previous best models. Additionally, NeuralGCM’s long-term climate modeling is one-third as error-prone as existing atmosphere-only models when predicting temperatures over 40 years.
Why does it matter?
NeuralGCM presents a new approach to building climate models that could be faster, less computationally costly, and more accurate than existing models. This breakthrough could lead to accessible and actionable climate modeling tools.
Elon Musk and xAI just announced the Memphis Supercluster — “the most powerful AI training cluster in the world“, also revealing that Grok 3.0 is planned to be released in December and should be the most powerful AI in the world.
Musk tweeted that xAI just launched the “Memphis Supercluster,” using 100,000 Nvidia H100 GPUs, making it “the most powerful AI training cluster in the world.”
The xAI founder also revealed that Grok 2.0 is done training and will be released soon.
The supercluster aims to create the “world’s most powerful AI by every metric”, Grok 3.0, by December 2024.
In a separate tweet yesterday, Musk also revealed that Tesla plans to have humanoid robots in “low production” for internal use next year.
Love him or hate him, the speed at which Elon and the team at xAI operate has been wild to witness. If estimates are accurate, xAI might be on track to create the most powerful AI systems in the world by year’s end — solidifying its position as one of the top competitors in the space and not just another AI startup.
Google researchers have developed a new AI-powered weather and climate model called ‘NeuralGCM’ by combining methods of machine learning and neural networks with traditional physics-based modeling.
NeuralGCM has proven more accurate than purely machine learning-based models for 1-10 day forecasts and top extended-range models.
NeuralGCM is up to 100,000 times more efficient than other models for simulating the atmosphere.
The model is open-source and can run relatively quickly on a laptop, unlike traditional models that require supercomputers.
At up to 100,000 times more efficient than traditional models — NeuralGCM could dramatically enhance our ability to simulate complex climate scenarios quickly and accurately. While still a ton of adoption challenges ahead, it’s a big leap forward for more informed climate action and resilience planning.
The Rundown: Researchers from MIT and ETH Zurich have developed an AI model that can identify different stages of ductal carcinoma in situ (DCIS), a type of preinvasive breast tumor, using simple tissue images.
The model analyzes chromatin images from 560 tissue samples (122 patients), identifying 8 distinct cell states across DCIS stages.
It considers both cellular composition and spatial arrangement, revealing that tissue organization is crucial in predicting disease progression.
Surprisingly, cell states associated with invasive cancer were detected even in seemingly normal tissue.
This AI model could democratize advanced breast cancer diagnostics, offering a cheaper, faster way to assess DCIS risk. While clinical validation is still needed, AI is likely going to work hand-in-hand with pathologists in the near future to catch cancer earlier and more accurately.
Meta has released Llama 3.1 405B, its largest open-source AI model to date, featuring 405 billion parameters which enhance its problem-solving abilities.
Trained with 16,000 Nvidia H100 GPUs, Llama 3.1 405B is competitive with leading AI models like OpenAI’s GPT-4 and Anthropic’s Claude 3.5 Sonnet, though it has specific strengths and weaknesses.
Meta’s new AI model is available for download or cloud usage and powers chatbots on platforms like WhatsApp and Meta.ai, showcasing capabilities in coding, mathematical queries, and multilingual document summarization.
Amazon plans to launch a paid version of Alexa to address the over $25 billion losses incurred by its devices business from 2017 to 2021, as reported by The Wall Street Journal.
The enhanced Alexa, which may cost up to $10 per month, is expected to be released soon, though employees have concerns about whether the technology is ready.
The new Alexa, featuring generative AI for improved conversational abilities, faces technical delays and competition from free AI assistants, raising doubts about customers’ willingness to pay for it.
VeriSIM Life’s AI platform can accelerate drug discovery
VeriSIM Life has developed an AI platform, BIOiSIM, to help speed up drug discovery and reduce animal testing. The platform contains data on millions of compounds and uses AI models to predict how potential new drugs will work in different species, including humans.
Anthropic is working on a new screenshot tool for Claude
This tool will allow users to capture and share screenshots from their desktop or browser directly within the Claude chat interface. It will streamline the sharing of visual information and code snippets when asking Claude for assistance on tasks like coding or troubleshooting.
Luma’s “Loops” feature in Dream Machine transforms digital marketing
The “Loops” feature allows users to create continuous video loops from text descriptions or images. It does so without visible cuts or transitions, opening up new possibilities for engaging content creation and advertising.
Tesla will use humanoid robots internally by next year
Elon Musk has announced that Tesla will use humanoid robots at its factories by next year. These robots, called Optimus, were expected to be ready by the end of 2024. Tesla aims to mass produce robots for $20,000 each and sell them to other companies starting in 2026.
Perplexity launches Voice Mode for its AI assistant on iOS
Perplexity has introduced a new feature for its iOS app called Voice Mode. It allows subscribers with Pro accounts to interact verbally with the AI-powered search engine. Users can now engage in voice-based conversations and pose questions using various voice options.
A Daily chronicle of AI Innovations July 22nd 2024:
Apple released two open-source AI language models OpenAI is in talks with Broadcom to develop an AI chip Nvidia is developing an AI chip series for China
The state of AI humanoids and robotics
Apple’s new 7B open-source AI model
Tesla to have humanoid robots for internal use next year
Nvidia preparing new flagship AI chip for Chinese market
️ Musk’s xAI turns on ‘world’s most powerful’ AI training cluster
Study reveals rapid increase in web domains blocking AI models
How to test and customize GPT-4o mini
Apple released two open-source AI language models
Apple has released two new open AI models called DCLM (DataComp for Language Models) on Hugging Face: one with 7 billion parameters and another with 1.4 billion parameters. The 7B model outperforms Mistral-7B and is comparable to other leading open models, such as Llama 3 and Gemma. They’ve released – model weights, training code, and even the pretraining dataset. The models were trained using a standardized framework to determine the best data curation strategy.
The 7B model was trained on 2.5 trillion tokens and has a 2K context window, achieving 63.7% 5-shot accuracy on MMLU. The 1.4B model, trained on 2.6 trillion tokens, outperforms other models in its category on MMLU with a score of 41.9%. These models are not intended for Apple devices.
Why does it matter?
By open-sourcing high-performing models and sharing data curation strategies, Apple is helping to solve some of AI’s toughest challenges for developers and researchers. This could lead to more efficient AI applications across various industries, from healthcare to education.
OpenAI is in talks with Broadcom to develop an AI chip
The company is in talks with Broadcom and other chip designers to build custom silicon, aiming to reduce dependence on Nvidia’s GPUs and boost its AI infrastructure capacity. OpenAI is hiring ex-Google employees with AI chip experience and has decided to develop an AI server chip.
The company is researching various chip packaging and memory components to optimize performance. However, the new chip is not expected to be produced until 2026 at the earliest.
Why does it matter?
Sam Altman’s vision for AI infrastructure is evolving from a separate venture into an in-house project at OpenAI. By bringing chip design in-house, OpenAI could potentially accelerate its AI research, reduce dependencies on external suppliers, and gain a competitive edge in the race of advanced AI.
Nvidia is developing a special version of its Blackwell AI chip for the Chinese market. Tentatively named “B20,” this chip aims to bridge the gap between U.S. export controls and China’s AI tech. Despite facing a revenue dip from 26% to 17% in China due to sanctions, Nvidia is not backing down. They’re partnering with local distributor Inspur to launch this new chip.
As Nvidia tries to reclaim its Chinese market share, competitors like Huawei are gaining ground. Meanwhile, the U.S. government is making even tighter controls on AI exports.
Why does it matter?
If Nvidia pulls off, it could maintain its dominance in the Chinese market while complying with U.S. regulations. But if regulators clamp down further, we could see a more fragmented global AI ecosystem, potentially slowing innovation. It’s a high-stakes game of technological cat-and-mouse, with Nvidia trying to stay ahead of regulators and rivals.
Tesla to have humanoid robots for internal use next year
Elon Musk announced that Tesla’s Optimus robots will begin “low production” for internal tasks in 2025, with mass production for other firms starting in 2026.
Musk initially stated the Optimus robot would be ready to perform tasks in Tesla’s EV factories by the end of this year.
Musk’s plans for Optimus and AI products come as Tesla faces reduced demand for electric vehicles and anticipates low profit margins in upcoming quarterly results.
️ Musk’s xAI turns on ‘world’s most powerful’ AI training cluster
Elon Musk’s xAI has started training its AI models using over 100,000 Nvidia H100 GPUs at a new supercomputing facility in Memphis, Tennessee, described as the most powerful AI training cluster globally.
This facility, known as the “Gigafactory of Compute,” is built in a former manufacturing site, and xAI secured $6 billion in funding, creating jobs for roles like fiber foreman, network engineer, and project manager.
The Memphis supercomputing site’s large energy and water demands have raised concerns among local environmental groups and residents, who fear its significant impact on water supplies and electrical consumption.
Study reveals rapid increase in web domains blocking AI models
A new study finds that more websites are blocking AI models from accessing their training data, potentially leading to less accurate and more biased AI systems.
The Data Provenance Initiative conducted the study, analyzing 14,000 web domains and discovering an increase in blocked tokens from 1% to up to 7% from April 2023 to April 2024.
News websites, social media platforms, and forums are the primary sources of these restrictions, with blocked tokens on news sites rising dramatically from 3% to 45% within a year.
The Reuters Institute released a study on public attitudes about AI in the news
It indicates that news consumers aren’t gloomy about AI in journalism. While initial reactions tend to be skeptical, attitudes become more nuanced as people learn about different AI applications. The comfort level varies based on where AI is used in the news process, with human oversight remaining a top priority.
California pushes bill requiring tech giants to test AI for “catastrophic” risks
While Republicans pledge a hands-off approach nationally, California’s move has sparked fierce debate. Tech leaders oppose the bill, citing potential harm to innovation and startups, while supporters argue it’s crucial for public safety.
Figma pulled its “Make Designs” AI tool after it generated designs similar to Apple’s weather app
The design platform admits it rushed new components without proper vetting, leading to uncanny similarities. While Figma didn’t train the AI on copyrighted designs, it’s back to the drawing board to polish its QA process.
OpenAI’s GPT-4o Mini has a safety feature called “instruction hierarchy”
This new feature prevents users from tricking the AI with sneaky commands like “ignore all previous instructions.” By prioritizing the developer’s original prompts, OpenAI aims to make its AI more trustworthy and safer for future applications, like running your digital life.
Google is the “official AI sponsor for Team USA” for the 2024 Paris Games
NBCUniversal’s broadcast will feature Google’s tech, from 3D venue tours to AI-assisted commentary. Moreover, Five Olympic and Paralympic athletes will appear in promos using Google’s AI tools.
A Daily chronicle of AI Innovations July 20th 2024:
OpenAI is working on an AI codenamed “Strawberry” Meta researchers developed “System 2 distillation” for LLMs Amazon’s Rufus AI is now available in the US AMD amps up AI PCs with next-gen laptop chips YT Music tests AI-generated radio, rolls out sound search 3 mysterious AI models appear in the LMSYS arena Meta’s Llama 3 400B drops next week Mistral AI adds two new models to its growing family of LLMs FlashAttention-3 enhances computation power of NVIDIA GPUs DeepL’s new LLM crushes GPT-4, Google, and Microsoft Salesforce debuts Einstein service agent Ex-OpenAI researcher launches AI education company OpenAI introduces GPT-4o mini, its most affordable model Mistral AI and NVIDIA collaborate to release a new model TTT models might be the next frontier in generative AI
CrowdStrike fixes start at “reboot up to 15 times” and get more complex from there
Apple releases the “best-performing” open-source models out there
Google in talks with Ray-Ban for AI smart glasses
Loophole that helps you identify any bot blocked by OpenAI
Apple releases the “best-performing” open-source models out there
Apple’s research team has released open DCLM models on Hugging Face, featuring 7 billion and 1.4 billion parameters, outperforming Mistral and approaching the performance of Llama 3 and other leading models.
The larger 7B model achieved a 6.6 percentage point improvement on the MMLU benchmark compared to previous state-of-the-art models while using 40% less compute for training, matching closely with top models like Google’s Gemma and Microsoft’s Phi-3.
Currently, the larger model is available under Apple’s Sample Code License, while the smaller one has been released under Apache 2.0, allowing for commercial use, distribution and modification.
Google is in discussions with EssilorLuxottica, the parent company of Ray-Ban, to develop AI-powered Gemini smart glasses and integrate their Gemini AI assistant.
EssilorLuxottica is also collaborating with Meta on the Ray-Ban Meta Smart Glasses, and Meta may acquire a minority stake in EssilorLuxottica, which could affect Google’s plans.
Google’s Gemini smart glasses are expected to feature a microphone, speaker, and camera without displays, aligning with the prototypes shown at I/O 2024 for Project Astra.
Loophole that helps you identify any bot blocked by OpenAI
OpenAI developed a technique called “instruction hierarchy” to prevent misuse of AI by ensuring the model follows the developer’s original instructions rather than user-injected prompts.
The first model to include this new safety feature is GPT-4o Mini, which aims to block the “ignore all previous instructions” loophole that could be used to exploit the AI.
This update is part of OpenAI’s efforts to enhance safety and regain trust, as the company faces ongoing concerns and criticisms about its safety practices and transparency.
A Daily chronicle of AI Innovations July 19th 2024:
OpenAI discusses new AI chip with Broadcom
Mistral AI and Nvidia launch NeMo 12B
Tech giants form Coalition for Secure AI
OpenAI debuts new GPT-4o mini model
Mistral AI and NVIDIA collaborate to release a new model TTT models might be the next frontier in generative AI
OpenAI gives customers more control over ChatGPT Enterprise
AI industry leaders have teamed up to promote AI security
DeepSeek open-sources its LLM ranking #1 on the LMSYS leaderboard
Groq’s open-source Llama AI model tops GPT-4o and Claude
Apple, Salesforce break silence on claims they used YouTube videos to train AI
OpenAI debuts new GPT-4o mini model
OpenAI just announced the launch of GPT-4o mini, a cost-efficient and compact version of its flagship GPT-4o model — aimed at expanding AI accessibility for developers and businesses.
GPT-4o mini is priced at 15 cents per million input tokens and 60 cents per million output tokens, over 60% cheaper than GPT-3.5 Turbo.
The model scores 82% on the MMLU benchmark, outperforming Google’s Gemini Flash (77.9%) and Anthropic’s Claude Haiku (73.8%).
GPT-4o mini is replacing GPT-3.5 Turbo in ChatGPT for Free, Plus, and Team users starting today.
The model supports a 128K token context window and handles text and vision inputs, with audio and video capabilities planned for future updates.
While it’s not GPT-5, the price and capabilities of this mini-release significantly lower the barrier to entry for AI integrations — and marks a massive leap over GPT 3.5 Turbo. With models getting cheaper, faster, and more intelligent with each release, the perfect storm for AI acceleration is forming.
Mistral AI and Nvidia just unveiled Mistral NeMo, a new open-source, 12B parameter small language model that surpasses competitors like Gemma 2 9B and Llama 3 8B on key benchmarks alongside a massive context window increase.
NeMo features a 128k token context window, and offers SOTA performance in reasoning, world knowledge, and coding accuracy for its size category.
The model also excels in multi-turn conversations, math, and common sense reasoning, making it versatile for various enterprise applications.
Mistral also introduced ‘Tekken’, a tokenizer that represents text more efficiently across 100+ languages, allowing for 30% more content within the context window.
NeMo is designed to run on a single NVIDIA L40S, GeForce RTX 4090, or RTX 4500 GPU, bringing powerful AI capabilities to standard business hardware.
Small language models are having a moment — and we’re quickly entering a new shift toward AI releases that don’t sacrifice power for size and speed. Mistral also continues its impressive week of releases, continuing to flex the open-source muscle and compete with the industry’s giants.
AI startup Groq just released two new open-source AI models specializing in tool use, surpassing heavyweights like GPT-4 Turbo, Claude 3.5 Sonnet, and Gemini 1.5 Pro on key function calling benchmarks.
Groq’s two models, Llama 3 Groq Tool Use 8B and 70B, are both fine-tuned versions of Meta’s Llama 3.
The 70B achieved 90.76% accuracy on the BFCL Leaderboard, securing the top position for all proprietary and open-source models.
The smaller 8B model was not far behind, coming in at No. 3 on the leaderboard with 89.06% accuracy.
The models were trained exclusively on synthetic data, and are available through the Groq API and on Hugging Face.
Groq made waves earlier this year with its blazing-fast AI speeds — and now its pairing those capabilities with top-end specialized models. Near real-time speeds and highly-advanced tool use opens the door for a near endless supply of new innovations and user applications.
OpenAI introduces GPT-4o mini, its most affordable model
OpenAI has introduced GPT-4o mini, its most intelligent, cost-efficient small model. It supports text and vision in the API, with support for text, image, video and audio inputs and outputs coming in the future. The model has a context window of 128K tokens, supports up to 16K output tokens per request, and has knowledge up to October 2023.
GPT-4o mini scores 82% on MMLU and currently outperforms GPT-4 on chat preferences in the LMSYS leaderboard. It is more affordable than previous frontier models and more than 60% cheaper than GPT-3.5 Turbo.
Why does it matter?
It has been a huge week for small language models (SLMs), with GPT-4o mini, Hugging Face’s SmolLM, and NeMO, Mathstral, and Codestral Mamba from Mistral. GPT-4o mini should significantly expand the range of applications built with AI by making intelligence much more affordable.
Mistral AI and NVIDIA collaborate to release a new model
Mistral releases Mistral NeMo, its new best small model with a large context window of up to 128k tokens. It was built in collaboration with NVIDIA and released under the Apache 2.0 license.
Its reasoning, world knowledge, and coding accuracy are state-of-the-art in its size category. Relying on standard architecture, Mistral NeMo is easy to use and a drop-in replacement for any system using Mistral 7B. It is also on function calling and is particularly strong in English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, and Hindi.
Why does it matter?
The model is designed for global, multilingual applications with excellence in many languages. This could be a new step toward bringing frontier AI models to everyone’s hands in all languages that form human culture.
TTT models might be the next frontier in generative AI
Transformers have long been the dominant architecture for AI, powering OpenAI’s Sora, GPT-4o, Claude, and Gemini. But they aren’t especially efficient at processing and analyzing vast amounts of data, at least on off-the-shelf hardware.
Researchers at Stanford, UC San Diego, UC Berkeley, and Meta proposed a promising new architecture this month. The team claims that Test-Time Training (TTT) models can not only process far more data than transformers but that they can do so without consuming nearly as much compute power. Here is the full research paper.
Why does it matter?
On average, a ChatGPT query needs nearly 10x as much electricity to process as a Google search. It may be too early to claim if TTT models will eventually supersede transformers. But if they do, it could allow AI capabilities to grow sustainably.
OpenAI gives customers more control over ChatGPT Enterprise
OpenAI is launching tools to support enterprise customers with managing their compliance programs, enhancing data security, and securely scaling user access. It includes new Enterprise Compliance API, SCIM (System for Cross-domain Identity Management), expanded GPT controls, and more.
AI industry leaders have teamed up to promote AI security
Google, OpenAI, Microsoft, Anthropic, Nvidia, and other big names in AI have formed the Coalition for Secure AI (CoSAI). The initiative aims to address a “fragmented landscape of AI security” by providing access to open-source methodologies, frameworks, and tools.
DeepSeek open-sources its LLM ranking #1 on the LMSYS leaderboard
DeepSeek has open-sourced DeepSeek-V2-0628, the No.1 open-source model on the LMSYS Chatbot Arena Leaderboard. It ranks #11, outperforming all other open-source models.
Groq’s open-source Llama AI model tops GPT-4o and Claude
Groq released two open-source models specifically designed for tool use, built with Meta Llama-3. The Llama-3-Groq-70B-Tool-Use model tops the Berkeley Function Calling Leaderboard (BFCL), outperforming offerings from OpenAI, Google, and Anthropic.
Apple, Salesforce break silence on claims they used YouTube videos to train AI
Apple clarified that its OpenELM language model used the dataset for research purposes only and will not be used in any Apple products/services. Salesforce commented that the dataset was publicly available and released under a permissive license.
A Daily chronicle of AI Innovations July 18th 2024:
DeepL’s new LLM crushes GPT-4, Google, and Microsoft Salesforce debuts Einstein service agent Ex-OpenAI researcher launches AI education company
Trump allies draft AI order
Google is going open-source with AI agent Oscar!
Microsoft’s AI designer releases for iOS and Android
Tencent’s new AI app turns photos into 3D characters
OpenAI makes AI models fight for accuracy
Can AI solve real-world problems by predicting tipping points?
OpenAI unveils GPT-4o mini
Apple denies using YouTube data for AI training
The ‘godmother of AI’ has a new startup already worth $1 billion
Microsoft’s AI-powered Designer app is now available
Trump allies draft AI order
Former U.S. President Donald Trump’s allies are reportedly drafting an AI executive order aimed at boosting military AI development, rolling back current regulations, and more — signaling a potential shift in the country’s AI policy if the party returns to the White House.
The doc obtained by the Washington Post includes a ‘Make America First in AI’ section, calling for “Manhattan Projects” to advance military AI capabilities.
It also proposes creating ‘industry-led’ agencies to evaluate models and protect systems from foreign threats.
The plan would immediately review and eliminate ‘burdensome regulations’ on AI development, and repeal Pres. Biden’s AI executive order.
Senator J.D. Vance was recently named as Trump’s running mate, who has previously indicated support for open-source AI and hands-off regulation.
Given how quickly AI is accelerating, it’s not surprising that it has become a political issue — and the views of Trump’s camp are a stark contrast to the current administration’s slower, safety-focused approach. The upcoming 2024 election could mark a pivotal moment for the future of AI regulation in the U.S.
Apple clarified it does not use YouTube transcription data for training its AI systems, specifically highlighting the usage of high-quality licensed data from publishers, stock images, and publicly available web data for its models.
OpenELM, Apple’s research tool for understanding language models, was trained on Pile data but is used solely for research purposes without powering any AI features in Apple devices like iPhones, iPads, or Macs.
Apple has no plans to develop future versions of OpenELM and insists that any data from YouTube will not be used in Apple Intelligence, which is set to debut in iOS 18.
The ‘godmother of AI’ has a new startup already worth $1 billion
Fei-Fei Li, called the “godmother of AI,” has founded World Labs, a startup valued at over $1 billion after just four months, according to the Financial Times.
World Labs aims to develop AI with human-like visual processing for advanced reasoning, a research area similar to what ChatGPT is working on with generative AI.
Li, famous for her work in computer vision and her role at Google Cloud, founded World Labs while partially on leave from Stanford, backed by investors like Andreessen Horowitz and Radical Ventures.
DeepL’s new LLM crushes GPT-4, Google, and Microsoft
The next-generational language model for DeepL translator specializes in translating and editing texts. Blind tests showed that language professionals preferred its natural translations 1.3 times more often than Google Translate and 1.7 times more often than ChatGPT-4.
Here’s what makes it stand out:
While Google’s translations need 2x edits, and ChatGPT-4 needs 3x more edits, DeepL’s new LLM requires much fewer edits to achieve the same translation quality, efficiently outperforming other models.
The model uses DeepL’s proprietary training data, specifically fine-tuned for translation and content generation.
To train the model, a combination of AI expertise, language specialists, and high-quality linguistic data is used, which helps it produce more human-like translations and reduces hallucinations and miscommunication.
Why does it matter?
DeepL AI’s exceptional translation quality will significantly impact global communications for enterprises operating across multiple languages. As the AI model raises the bar for AI translation tools everywhere, it begs the question: Will Google, ChatGPT, and Microsoft’s translational models be replaced entirely?
The new Einstein service agent offers customers a conversational AI interface, takes actions on their behalf, and integrates with existing customer data and workflows.
The Einstein 1 platform’s service AI agent offers diverse capabilities, including autonomous customer service, generative AI responses, and multi-channel availability. It processes various inputs, enables quick setup, and provides customization while ensuring data protection.
Salesforce demonstrated the AI’s abilities through a simulated interaction with Pacifica AI Assistant. The AI helped a customer troubleshoot an air fryer issue, showcasing its practical problem-solving skills in customer service scenarios.
Why does it matter?
Einstein Service Agent’s features, like 24×7 availability, sophisticated reasoning, natural responses, and cross-channel support, could significantly reduce wait times, improve first-contact resolution rates, and enhance customer service delivery.
Ex-OpenAI researcher launches AI education company
In a Twitter post, ex-Tesla director and former OpenAI co-founder Andrej Karpathy announced the launch of EurekaLabs, an AI+ education startup.
EurekaLabs will be a native AI company using generative AI as a core part of its platform. The startup shall build on-demand AI teaching assistants for students by expanding on course materials designed by human teachers.
Karpathy states that the company’s first product would be an undergraduate-level class, empowering students to train their own AI systems modeled after EurekaLabs’ teaching assistant.
Why does it matter?
This venture could potentially democratize education, making it easier for anyone to learn complex subjects. Moreover, the teacher-AI symbiosis could reshape how we think about curriculum design and personalized learning experiences.
The platform will enable developers to create AI agents that work across various SDLC stages, such as development, planning, runtime, and support. Oscar might also be released for closed-source projects in the future. (Link)
Microsoft’s AI designer releases for iOS and Android
Microsoft Designer is now available as a free mobile app. It supports 80 languages and offers prompt templates, enabling users to create stickers, greeting cards, invitations, collages, and more via text prompts.
Tencent’s new AI app turns photos into 3D characters
The 3D Avatar Dream Factory app uses 3D head swapping, geometric sculpting, and PBR material texture mapping to let users create realistic, detailed 3D models from single images that can be shared, modified, and printed.
It uses a “prover-verifier” training method, where a stronger GPT-4 model is a “prover” offering solutions to problems, and a weaker GPT-4 model is a “verifier” that checks those solutions. OpenAI aims to train its prover models to produce easily understandable solutions for the verifier, furthering transparency.
OpenAI just published new research detailing a method to make large language models produce more understandable and verifiable outputs, using a game played between two AIs to make generations more ‘legible’ to humans.
The technique uses a “Prover-Verifier Game” where a stronger AI model (the prover) tries to convince a weaker model (the verifier) that its answers are correct.
Through multiple rounds of the game, the prover learns to generate solutions that are not only correct, but also easier to verify.
While the method only boosted accuracy by about 50% compared to optimizing solely for correctness, its solutions were easily checkable by humans.
OpenAI tested the approach on grade-school math problems, with plans to expand to more complex domains in the future.
AI will likely surpass humans in almost all capabilities in the future — so ensuring outputs remain interpretable to lesser intelligence is crucial for safety and trust. This research offers a scalable way to potentially keep systems ‘honest’, but the performance trade-off shows the challenge in balancing capability with explainability.