The AI revolution continues to blaze through 2024. Recent months brought monumental strides in frontier models, humanoid robotics, and AI policy. But the AI landscape is a dynamic one, and August has already proven to be no exception.
This month, we’re diving deep into the latest AI developments, from groundbreaking research to real-world applications. We’ll explore how AI is reshaping industries, addressing global challenges, and redefining what’s possible. Join us as we uncover the stories behind the headlines and analyze the implications of these innovations for society.
Whether you’re an AI expert or just curious about the future, this blog is your go-to source for the most up-to-date insights. Stay tuned for daily updates as we navigate the exciting world of artificial intelligence together.
A Daily Chronicle of AI Innovations on August 30th 2024
Apple and Nvidia may invest in OpenAI
Amazon’s new Alexa voice assistant will use Claude AI
OpenAI and Anthropic will share their models with the US government
Google is working on AI that can hear signs of sickness
China’s new Qwen2-VL beats GPT-4o
AI startup reaches 100M token context
China’s new Qwen2-VL beats GPT-4o
Alibaba just unveiled Qwen2-VL, a new vision-language AI model that outperforms GPT-4o in several benchmarks — particularly excelling in document comprehension and multilingual text-image understanding.
Qwen2-VL can understand images of various resolutions and ratios, as well as videos over 20 minutes long.
The model excels particularly at complex tasks such as college-level problem-solving, mathematical reasoning, and document analysis.
It also supports multilingual text understanding in images, including most European languages, Japanese, Korean, Arabic, and Vietnamese.
You can try Qwen2-VL on Hugging Face, with more information on the official announcement blog.
There’s yet another new contender in the state-of-the-art AI model arena, and it comes from China’s Alibaba. Qwen2-VL’s ability to understand diverse visual inputs and multilingual requests could lead to more sophisticated, globally accessible AI applications.
Apple and Nvidia are reportedly in talks to participate in a significant funding round for OpenAI, with Apple planning to integrate ChatGPT into iOS and Nvidia being a key supplier of the chips that power OpenAI’s AI services.
Apple, which had earlier considered appointing Phil Schiller to OpenAI’s board before abandoning the plan, is looking to deepen its involvement with OpenAI as it prepares to enhance Siri with ChatGPT capabilities later this year.
Nvidia, whose hardware is essential for OpenAI’s operations, is also considering investing in this funding round, joining Microsoft, which has been a major investor in OpenAI since 2019 and made another substantial investment in 2023.
OpenAI and Anthropic just signed a groundbreaking agreement with the U.S. Artificial Intelligence Safety Institute to allow government access and testing of their AI models before public release.
The U.S. AI Safety Institute will have access to major new models from both companies prior to and after their public release.
This collaboration is a step toward AI regulation and safety efforts, with the U.S. government evaluating AI models’ capabilities and associated risks.
The institute will provide feedback to OpenAI and Anthropic on potential safety improvements that should be made.
These agreements come as AI companies face increasing regulatory scrutiny, with California legislators passing a broad AI regulation bill earlier today.
The two most popular AI companies in the world are granting the U.S. government access to unreleased models before release. This could reshape how AI is developed, tested, and deployed worldwide, with major implications around innovation, safety, and international competition in the AI space, for better or worse.
Amazon’s new Alexa voice assistant will use Claude AI
Amazon’s new voice assistant, “Remarkable Alexa,” will launch in October and be powered by Anthropic’s Claude AI, offering a subscription-based service.
The existing Alexa model struggled with accuracy, leading Amazon to invest in Anthropic’s AI technology after facing internal technical and bureaucratic issues.
Remarkable Alexa is set to feature daily AI-generated news summaries, a child-focused chatbot, and conversational shopping tools, with a demo planned for Amazon’s September event.
Magic just developed LTM-2-mini, a model capable of processing 100 million tokens of context — equivalent to about 10 million lines of code or 750 novels — and partnered with Google Cloud to build advanced AI supercomputers.
LTM-2-mini can process and understand 100 million tokens of context given during inference, surpassing current models by 50x.
The model’s innovative algorithm processes long sequences of data 1000x more efficiently than the current top-performing AI models.
Magic is also partnering with Google Cloud to build supercomputers powered by Nvidia’s newest and most advanced GPUs.
The company has raised more than $450 million in total funding, including a recent $320 million investment round.
This breakthrough in context length allows AI agents to process and reason over dense and complicated codebases, vast databases, and years of conversation history in a single inference. It’s a significant step toward creating AI assistants with near-perfect recall and memory.
Google is working on AI that can hear signs of sickness
Google is developing artificial intelligence technology that can detect early signs of illness by analyzing sound signals like coughs and sniffles.
The AI model is trained with 300 million audio samples and can identify diseases such as tuberculosis by recognizing specific audio patterns of labored breathing.
Google has partnered with Salcit Technologies, an AI startup in India, to integrate this technology into smartphones to assist high-risk populations in areas with limited healthcare access.
Anthropic’s Prompt Engineering Interactive Tutorial: a digital platform designed to teach users how to effectively craft prompts for AI applications, enhancing user interaction and efficiency.
Documents reveal state-linked Chinese entities are using cloud services from AWS or its rivals to access advanced US chips and AI models they cannot acquire otherwise.
California lawmakers approved a bill proposing sweeping AI regulations, including safety testing requirements and potential legal consequences for harmful AI systems.
A Daily Chronicle of AI Innovations on August 29th 2024
AI creates DOOM video game in real-time
OpenAI raises at $100B valuation
AI spots cancer earlier than ever
Nvidia just showed how hard it is to be the AI king
Google researchers run Doom on a self-generating AI model
Midjourney says it’s ‘getting into hardware’
OpenAI aims for $100B+ valuation in new funding round
Major websites reject Apple AI data scraping
AI creates DOOM video game in real-time
Google researchers just developed GameNGen, an AI system that can simulate the classic game DOOM in real-time, running at over 20 frames per second and producing visuals nearly indistinguishable from the original game.
GameNGen produces playable gameplay at 20 frames per second on a single chip, with each frame predicted by a diffusion model.
The AI was trained on 900M frames of gameplay data, resulting in 3-second clips almost indistinguishable from the actual game by playtesters.
Running on a single TPU, GameNGen handles Doom’s 3D environments and fast-paced action without traditional game engine components.
In tests, human raters could barely distinguish between short clips of the AI simulation and the actual game.
GameNGen is the first AI model that can generate a complex and playable video game in real-time without any underlying real game engine. We’re at the fascinating time where soon, AI will be able to create entire games on the fly, personalized to each player.
OpenAI is reportedly in talks to raise a new funding round at a valuation exceeding $100 billion, led by Thrive Capital, with Microsoft also expected to participate.
The potential valuation of over $100 billion would be significantly higher than OpenAI’s previous $86 billion valuation.
Thrive Capital is expected to invest around $1 billion in this round.
OpenAI’s annualized revenue reportedly surpassed $3.4 billion earlier this year.
The company is still, however, projected to lose nearly $5 billion by the end of the year and has already spent $8.5 billion on AI training and staffing.
Building AI is expensive, and raising billions of dollars at a $100B+ valuation would silence critics who insist the company is in decline. The increased valuation also suggests OpenAI may have unreleased breakthroughs behind the scenes, such as the rumored Project Strawberry and Orion.
Researchers recently developed an AI tool called AINU that can differentiate cancer cells from normal cells and detect early stages of viral infection, by analyzing high-resolution images of cell nuclei.
AINU uses a convolutional neural network to analyze images captured by STORM microscopy, which offers nanoscale resolution (see the toy sketch below).
The AI can detect structural changes in cells as small as 20 nanometers, 5,000 times smaller than a human hair’s width.
AINU also detected viral infections (herpes simplex virus type-1) just one hour after infection by observing subtle changes in DNA packing.
The tool can accurately identify stem cells too, which could accelerate stem cell research without relying on animal testing.
Yesterday, researchers revealed an AI tool to help with early dementia detection, and now AI is detecting cancer cells at a nanoscale level. Clinical applications may be years away, but AI healthcare breakthroughs like AINU are only accelerating — and will dramatically revolutionize scientific research in the coming years.
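For readers curious what “a convolutional neural network analyzing nucleus images” amounts to in code, here is a toy PyTorch classifier. This is a minimal sketch, not AINU’s actual architecture; the layer sizes, input format, and class count are illustrative assumptions.

```python
import torch.nn as nn

class NucleusCNN(nn.Module):
    """Toy CNN in the spirit of a nucleus-image classifier (not AINU itself):
    input is a single-channel microscopy crop, output is a cell-state class."""
    def __init__(self, num_classes=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # pool to one value per channel
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):  # x: (batch, 1, height, width)
        return self.classifier(self.features(x).flatten(1))
```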
Nvidia just showed how hard it is to be the AI king
Nvidia achieved strong second-quarter results by more than doubling its revenue compared to the same period last year, but industry experts anticipated these outcomes due to ongoing investments in AI by tech companies.
Despite reporting $30.04 billion in revenue, which surpassed analyst expectations, Nvidia’s stock fell 6.9% after hours due to investor concerns and sky-high expectations.
Issues like shipment delays for Nvidia’s upcoming Blackwell GPUs and slightly lower-than-expected revenue projections for the next quarter also contributed to investor unease, as noted by multiple analysts.
Midjourney, known for its AI image-generation tool, announced it is entering the hardware market and invited job seekers to join its new division.
The announcement was made on Midjourney’s official X account, revealing that founder David Holz and new hire Ahmad Abbas, a former Apple hardware manager, will lead the hardware efforts.
Midjourney hinted at multiple ongoing projects and the possibility of new form factors, though no specific timeline or further details have been provided yet.
OpenAI aims for $100B+ valuation in new funding round
OpenAI is reportedly negotiating with venture capital firms to raise a large sum of money, potentially valuing the company at over $100 billion.
Thrive Capital plans to invest $1 billion in this funding round, and Microsoft is also expected to contribute additional funds, as reported by The Wall Street Journal.
If successful, this would be the most substantial new capital for OpenAI since Microsoft’s $10 billion investment in January 2023, with OpenAI’s valuation potentially exceeding $103 billion based on recent negotiations.
Many of the largest websites, such as Facebook, Instagram, and The New York Times, have opted out of Apple’s AI training by using the Applebot-Extended tag to exclude their content.
Apple allows publishers to easily opt out of content scraping for Apple Intelligence training through their publicly accessible robots.txt file (see the example below), ensuring their data is not used for AI purposes.
Apple’s use of Applebot for AI training is designed to be ethical, with mechanisms to filter out personal data and a system for web publishers to prevent their data from being utilized.
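For publishers, the opt-out itself is a short robots.txt entry. A minimal sketch using Apple’s published crawler name; whether to block everything or only certain paths is up to the site owner.

```
# robots.txt: exclude this site from Apple Intelligence training,
# while leaving the standard Applebot (Siri/Spotlight search) unaffected.
User-agent: Applebot-Extended
Disallow: /
```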
A Daily Chronicle of AI Innovations on August 28th 2024
OpenAI prepares ‘Project Strawberry’
Google launches trio of new models
😯 Google AI-Powered Interview Warmup
Create an AI prompt optimizer GPT
AI tools help early dementia detection
📈 Nvidia earnings to test AI boom
Google Meet will now take notes for you
OpenAI prepares ‘Project Strawberry’
OpenAI researchers are preparing to launch a new AI model, code-named Strawberry (previously Q*), that demonstrates superior reasoning capabilities in solving complex problems, according to a new report via The Information.
Project Strawberry could be integrated into ChatGPT as soon as this fall, marking a significant leap in AI intelligence.
Given extra “thinking” time, Strawberry can tackle subjective topics and solve complex puzzles like the New York Times Connections.
OpenAI is using Strawberry to generate high-quality training data for another secretive upcoming LLM, reportedly code-named Orion.
The new AI model could enhance OpenAI’s development of AI agents, potentially automating multi-step tasks more effectively.
If Strawberry lives up to the leaks, it could mark a significant leap in AI reasoning capabilities, potentially advancing OpenAI towards Stage 2 of its five-level roadmap to AGI. With ChatGPT reported to gain these capabilities this fall, we’re likely on the verge of seeing the next major wave of AI disruption.
Google Meet’s new AI-powered feature, “take notes for me,” started rolling out today; first announced at the 2023 Cloud Next conference, it summarizes meetings for Google Workspace customers with specific add-ons.
This feature automatically generates a Google Doc with meeting notes, attaches it to the calendar event, and sends it to the meeting organizer and participants who activated the tool, although it currently supports only spoken English.
Google predicts the feature will be available to all Google Workspace customers by September 10th, 2024, but there are concerns about its accuracy, given the performance of similar transcription tools in the past.
Google just released three new experimental Gemini 1.5 models, including a compact 8B parameter version, an improved Pro model, and an enhanced Flash model — all available for developers on Google AI Studio.
Gemini 1.5 Flash-8B is a smaller, faster model that can handle text, images, and other data types efficiently for super quick responses while processing a lot of information.
The updated Gemini 1.5 Pro model is now better at writing code and understanding complex instructions.
An improved Gemini 1.5 Flash model offers overall enhancements, performing better on Google’s internal tests across various tasks.
The upgraded Gemini 1.5 Pro model now ranks as #2, and the new Gemini 1.5 Flash ranks as #6 on the Chatbot Arena leaderboard.
While OpenAI is leaving everyone waiting, Google has been shipping out constant upgrades and new features to its AI offerings. These new enhancements give Gemini 1.5 Flash big improvements overall and Gemini 1.5 Pro new upgrades in math, coding, and responding to longer prompts.
Google actually runs this tasty thing called “Interview Warmup.” It’s an AI-powered training tool for your next big interview. It throws real questions based on your discipline: UX, data and analytics, cybersecurity, etc. Then the magic kicks in, evaluating your spoken answers and sending back recommendations on everything from framing your qualifications to communicating your impact.
5 questions. Get some analysis. Build some confidence. Easy, right? 🌟
Oh, and for the tech-oriented: check out Interview Prep – Google Tech Dev Guide, too. Videos, former (real) interview questions, the works.
OpenAI’s Custom GPTs allow premium users to create AI assistants that can optimize prompts for other AI creative tools such as Midjourney for AI image generation or Gen-3 for AI video generation.
Log into your ChatGPT Plus account and click “Explore GPTs”, then click “Create”.
Name your GPT and add a brief description.
In the Instructions, paste: “User is using an AI video generator called [Tool Name]. You need to craft a perfect prompt for the topic they ask by following the prompting guide below. The prompt needs to follow the format provided in the guide.”
Test your GPT in the preview panel, then click “Create” to finalize and choose sharing options.
Hot tip: Add a complete prompting guide for your chosen AI tool (e.g. Runway’s Gen-3 prompting guide)
Scientists from the Universities of Edinburgh and Dundee are launching a massive AI-driven study of over 1.6 million brain scans to develop tools for early dementia prediction and diagnosis.
The project, called NEURii, will use AI and machine learning to analyze CT and MRI scans from Scottish patients over the past decade.
Researchers aim to create digital tools for radiologists to assess dementia risk during routine scans.
The study will match image data with linked health records to identify patterns associated with dementia risk.
With global dementia cases projected to reach 153 million by 2050, this research could significantly impact early intervention and treatment development.
This week alone, we’ve seen AI developing new cancer drugs, 3D printing lifelike human organs, and now creating tools for early dementia detection. As AI rapidly advances in healthcare, we’re accelerating into a new era of personalized medicine and preventative care.
There have been several negative reports ahead of Nvidia’s earnings, ranging from supply chain/design challenges to concerns about use cases and applications. However, one thing we learned from discussions with customers is that demand is still extremely constrained.
Key topics ahead of the results:
1. Will the Hopper architecture stay stronger for longer?
2. Is Blackwell really delayed?
3. What is the upside if the company can deliver on the systems orders?
Here are some thoughts on each:
1. Key players like Microsoft, Snowflake, and Tesla highlighted tight capacity for GPUs and more demand than available supply. Snowflake particularly called out H100 (un)availability. This makes us believe that the Hopper cycle may extend beyond ’23/’24.
2. There were several reports pointing to delays of Blackwell, the new-generation GPU. Analysts have now taken it out of estimates for this year (C24). However, our research indicates that the delays are mainly on the systems side, which were not supposed to be delivered until C25. Meanwhile, Nvidia’s CEO noted that we can expect significant revenues from Blackwell this year … key will be to find out if this is still the case.
3. Systems – namely the GB200 NVL36/72 – are where the delays sit. But our intel suggests that the order book for these is through the roof due to the TCO (total cost of ownership) they offer. If Nvidia is in fact able to deliver these in ’25, revenue from systems alone can exceed $100B, with total data center revenue above $200B.
What Else is Happening in AI on August 28th 2024!
Apple announced a September 9 event where it’s expected to debut the iPhone 16 with new generative AI features.
Elon Musk endorsed California’s Senate Bill 1047, which would require safety testing for large AI models, breaking with other tech leaders who oppose the regulation.
Amazon plans to launch a delayed AI-powered Alexa subscription in October, featuring “Smart Briefing” AI-generated news summaries.
Anthropic announced the full release of its Artifacts feature for all Claude users, including mobile apps, after millions were created in its test phase.
A Daily Chronicle of AI Innovations on August 27th 2024
AI can 3D print lifelike human organs
Anthropic reveals Claude’s secret sauce
Amazon aims to launch delayed AI Alexa subscription in October
OpenAI, Adobe, Microsoft want all companies to label AI-generated content
ChatGPT teams up with ASU
Discovering new drugs with AI
How to use Midjourney ‘Erase‘
AI can 3D print lifelike human organs
Researchers at Washington State University recently applied an AI technique called Bayesian optimization that dramatically improves the speed and efficiency of 3D printing lifelike human organs (a minimal sketch of the optimization loop follows below).
The AI balances geometric precision, density, and printing time to create organ models that look and feel authentic.
In tests, it printed 60 continually improving versions of kidney and prostate organ models.
This approach significantly reduces the time and materials needed to find optimal 3D printing settings for complex objects.
The technology also has potential applications beyond medicine — for example, in the computer science, automotive, and aviation industries.
With cheaper, lifelike 3D-printed human organs, medical students could better practice for surgery before operating on actual patients. Beyond medicine, this AI technique could help reduce manufacturing costs for a variety of things like smartphones, car parts, and even airplane components.
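To make the optimization loop concrete, here is a minimal sketch using scikit-optimize’s Gaussian-process optimizer. The parameter ranges and the quality objective are hypothetical stand-ins; in the researchers’ workflow, the objective would come from printing a test part and scoring its geometric precision, density, and print time.

```python
from skopt import gp_minimize
from skopt.space import Real

# Hypothetical printing parameters to tune.
space = [
    Real(0.1, 0.4, name="layer_height_mm"),
    Real(20.0, 100.0, name="infill_density_pct"),
    Real(10.0, 60.0, name="print_speed_mm_s"),
]

def print_quality_cost(params):
    layer_height, infill, speed = params
    # Stand-in objective: in practice, print a part with these settings and
    # return a combined score of geometric error, density mismatch, and time.
    return (layer_height - 0.2) ** 2 + (infill - 60.0) ** 2 / 1000.0 + speed / 100.0

# Bayesian optimization: fit a surrogate model of the objective, then pick
# the most promising settings to try next, for 30 trial "prints".
result = gp_minimize(print_quality_cost, space, n_calls=30, random_state=0)
print("best settings:", result.x, "cost:", result.fun)
```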
Scientists from China and the U.S. just developed ActFound, a new AI model that outperforms existing methods in predicting drug bioactivity, potentially accelerating and reducing costs in drug development.
ActFound combines meta-learning and pairwise learning to overcome common limitations in AI drug discovery, like small datasets and incompatible measurements.
The model was trained on 35,000+ assays (standardized bioactivity experiments) and 1.6 million experimentally measured bioactivities from a popular chemical database.
In tests, ActFound outperformed nine competing models and showed strong performance in predicting cancer drug bioactivity.
ActFound could significantly speed up drug development by accurately predicting compound properties with less data and lower costs than traditional methods. While still in early stages, AI breakthroughs like this are the lesser-talked about developments that could end up saving millions of lives.
OpenAI’s ChatGPT is headed to Arizona State University (ASU), where the university is integrating the AI assistant into over 200 projects across teaching, research, and operations.
ASU is using ChatGPT Edu, a version designed for universities with enhanced privacy and security features.
The university also launched an ‘AI Innovation Challenge’ for faculty and staff, which received overwhelming demand for using ChatGPT across teaching, research, and operations.
Key projects include an AI writing companion for scholarly work, ‘Sam’ (a chatbot for med students to practice patient interactions), and AI-assisted research recruitment.
The partnership has inspired other institutions like Oxford and Wharton to pursue similar collaborations.
While some schools are attempting to resist AI, ASU is embracing ChatGPT to make learning more personalized and to prepare students for an increasingly AI-driven job market. As education continues to change in the age of AI, case studies like this will be instrumental in shaping the future of academia.
Source: https://openai.com/index/asu/
Anthropic reveals Claude’s secret sauce
Anthropic has published the system prompts for its latest AI models, including Claude 3 Opus, Claude 3.5 Sonnet, and Claude 3 Haiku, to demonstrate transparency and ethical practices.
The system prompts reveal specific behaviors and capabilities of the Claude models, such as the inability to open URLs or recognize faces, aiming to ensure ethical interactions.
Anthropic plans to continue updating and disclosing these system prompts to promote transparency, potentially pressuring other AI vendors to follow suit.
Amazon aims to launch delayed AI Alexa subscription in October
The new Alexa AI, set to launch around mid-October, will feature a “Smart Briefing” that provides daily, AI-generated news summaries based on user preferences.
A more personalized experience is expected, with Alexa AI learning user preferences through interactive and tailored responses, such as dietary requirements for recipe suggestions.
Alexa AI will also introduce a “Shopping Scout” feature to help users find deals and track prices, alongside a kid-friendly “Explore with Alexa 2.0” for safe, moderated conversations.
OpenAI, Adobe, Microsoft want all companies to label AI-generated content
OpenAI, Adobe, and Microsoft now back a California bill that mandates tech companies to add watermarks to AI-generated content, with the bill set for a final vote in August.
AB 3211 requires AI-generated photos, videos, and audio clips to have watermarks in their metadata and mandates large online platforms to label AI content clearly for average viewers.
Initially opposed by a trade group representing major software companies, the bill gained support from OpenAI, Adobe, and Microsoft after amendments addressed concerns about its practicality.
Inflection AI partnered with Data Transfer Initiative, enabling Pi users to export conversations and announced plans to cap free usage while focusing on enterprise AI.
Source: https://inflection.ai/the-future-of-pi
Aleph Alpha released Pharia-1-LLM-7B, an open-source model optimized for German, French, and Spanish that excels in domain-specific applications.
IBM previewed Spyre, a new AI accelerator chip for IBM Z mainframes, designed to scale enterprise AI workloads with clustering capabilities.
Source: https://research.ibm.com/blog/spyre-for-z
Hugging Face and Google Cloud just partnered to release optimized Deep Learning Containers for building AI with open models on Google Cloud infrastructure.
A Daily Chronicle of AI Innovations on August 26th 2024
Amazon is telling its salespeople to trash talk Google, Microsoft, and OpenAI
Apple may be working on an AI ‘personality’ to replace Siri on its robots
Chinese companies showcased 27 humanoid robots alongside Tesla’s Optimus
AI learns to plan better without humans
How to use Ideogram for generating images
Grok-2 improves speed, accuracy, transparency
AI learns to plan better without humans
IBM Research and Cornell University recently created AutoToS, a system that teaches AI to solve complex planning problems at 100% accuracy — without needing a human to check its work.
AutoToS is like a smart tutor for AI, helping it learn how to break down and solve tricky problems step-by-step.
The system uses clever tests to check the AI’s work, pointing out mistakes and showing examples of how to do better without human intervention (a generic sketch of this loop follows below).
This approach seems to work equally well for smaller and larger models.
AutoToS succeeded in teaching AI to solve complex puzzles, including classic problems like arranging blocks and solving Sokoban, a box-pushing game.
Right now, it’s difficult to trust AI agents to autonomously perform actions on your behalf, but AutoToS is solving complex tasks at 100% accuracy. If this system works in the real world, it’s the next big step in creating more reliable AI assistants.
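The core pattern, generating a candidate, checking it automatically, and feeding the error back, can be sketched generically. This is a minimal illustration in the spirit of AutoToS, not its actual implementation; `llm` and `check` are assumed callables.

```python
def refine_with_feedback(llm, task_prompt, check, max_rounds=5):
    """Generate-check-revise loop: the checker's error message stands in
    for a human reviewer. `llm(prompt) -> str` produces a candidate solution;
    `check(candidate) -> (ok, error)` validates it automatically."""
    prompt = task_prompt
    for _ in range(max_rounds):
        candidate = llm(prompt)
        ok, error = check(candidate)
        if ok:
            return candidate
        # Append the automatic feedback and try again.
        prompt = f"{task_prompt}\n\nYour previous attempt failed:\n{error}\nPlease fix it."
    return None
```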
Apple may be working on an AI ‘personality’ to replace Siri on its robots
Apple is developing a new AI-based ‘personality’ for use in upcoming robotic devices, aiming to enhance interactions similar to how Siri functions on existing Apple products.
Bloomberg’s Mark Gurman reports that Apple’s futuristic AI assistant will be more humanlike and could operate on a tabletop product and other future robots, potentially costing under $1,000.
The project is in early development stages with no guarantees of release, while Apple continues to integrate generative AI features into its devices, like iPhones, iPads, and Macs, later this year.
Chinese companies showcased 27 humanoid robots alongside Tesla’s Optimus
At the Beijing World Robot Conference, Tesla’s Optimus humanoid was displayed motionless inside a clear box, facing tough competition from Chinese robots demonstrated by various companies.
The event saw 27 new humanoid robots debut, with significant financial investments in China’s robotics industry surpassing 100 billion yuan over the past decade.
Chinese startups like Agibot and Stardust Intelligence showcased robots capable of performing complex tasks, while experts believe Tesla’s and other U.S. companies’ robot technology leads by about one to two years.
xAI’s Grok-2 and Grok-2 mini just made major improvements — doubling the model’s speed in the mini version and showing increased accuracy in both models, just days after its beta launch.
Grok-2 mini is now twice as fast as it was previously, thanks to a rewritten inference stack using SGLang.
Both Grok-2 and its mini version have become slightly more accurate due to reduced quantization error, according to one xAI employee.
Additionally, both Grok-2 models are now part of the LMSYS Chatbot Arena leaderboard for increased transparency, with Grok-2’s larger model ranking #2 and surpassing Claude 3.5 Sonnet.
Grok-2 excels particularly in math, where it ranks #1 and performs at a state-of-the-art level in hard prompts, coding, and instruction-following.
From being founded only ~18 months ago to creating an LLM ranked #2 in the world, it’s safe to say that xAI has stunned the entire AI community. This not only makes Grok-2 a top contender in the AI race but also intensifies competition, potentially accelerating advancements across the industry.
At the 2024 World Robot Conference in Beijing, Chinese companies showcased 27 humanoid robots alongside Tesla’s Optimus, signalling China’s ambition to dominate the industry.
Chinese tech firms unveiled 27 humanoid robots at the expo, with Tesla’s Optimus being the only foreign competitor present.
AGIBOT, founded by a Huawei alumnus, presented robots powered by large language models (LLMs) for industrial use and customer service.
Other notable entries included Astribot’s S1 robot assistant capable of writing calligraphy and playing musical instruments, and Galbot’s wheeled robots for food delivery and retail tasks.
Despite the impressive showcase, experts note that technological hurdles and high costs still create challenges for Chinese manufacturers.
China may be slightly behind in the AI race against the U.S., but it’s clear the country is committed to dominating the humanoid robotics race. With a whopping 27 China-based humanoid robots demonstrating a wide range of use cases at the event, commercially available humanoids may be coming sooner than most expect.
Ideogram 2.0, the latest state-of-the-art AI image generator, excels at creating images that include text — opening new possibilities for use cases like thumbnails, posters, newsletter graphics, memes, and more.
Head over to Ideogram’s website and sign up. You’ll get free credits to try the image generator without a credit card.
Click “Describe what you want to see” and enter a detailed text prompt for your desired image.
Customize settings like aspect ratio, AI model (choose 2.0), and style (Realistic, Design, 3D, or Anime).
Click “Generate” to create four AI-generated images based on your prompt!
Pro tip: Experiment with different prompts and settings to discover its full potential and create unique visuals for your projects!
What Else is Happening in AI on August 26th 2024!
Scientists to use AI and 1.6 million brain scans for earlier and more accurate dementia diagnoses.
Anthropic supported California’s AI regulation bill after changes were made, saying its benefits likely outweigh its costs for advanced AI development.
A Daily Chronicle of AI Innovations on August 23rd 2024
Nvidia and Mistral make laptop-ready AI
Amazon’s AI assistant saves 4,500 years of development time
Slack AI could be tricked into leaking login details and more
Cruise’s robotaxis are coming on Uber
Google DeepMind workers urge the company to end ties with military organizations
Salesforce unveils AI agents for sales
Nvidia and Mistral make laptop-ready AI
Nvidia and Mistral just released Mistral-NeMo-Minitron 8B, a highly accurate small language model that can run efficiently on laptops and PCs.
The model uses optimization techniques like pruning (removing certain weights) and distillation (retraining the pruned model on a small dataset; see the loss sketch below) to achieve high accuracy with a smaller footprint.
These optimizations resulted in up to 40x cost savings in terms of raw compute during training.
Laptops and PCs can run the model locally for faster and more secure interactions with AI.
Minitron 8B leads nine language-driven AI benchmarks for similarly sized models, spanning language understanding, reasoning, and coding.
AI models that are small enough to run locally on laptops and PCs means less reliance on cloud services, improved data privacy, and faster responses. As this tech evolves, we could soon see advanced AI in everything from smartphones and watches to home appliances.
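As a reference point for the distillation step mentioned above, student models are typically trained to match the teacher’s softened output distribution. A minimal PyTorch sketch of that loss (the temperature value is illustrative):

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions."""
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(student_log_probs, soft_targets,
                    reduction="batchmean") * temperature ** 2
```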
Amazon’s AI assistant saves 4,500 years of development time
Amazon CEO Andy Jassy stated that their AI assistant, Amazon Q, has significantly reduced software upgrade times, saving the company thousands of work hours.
Jassy mentioned that implementing Amazon Q resulted in estimated savings equivalent to 4,500 developer-years and $260 million in annual efficiency gains.
The AI-generated code reviews were so accurate that 79% of them were shipped without any additional changes, demonstrating the tool’s effectiveness in streamlining tedious tasks.
Researchers just developed a new AI-based method called NES-VMC that can accurately calculate the excited states of atoms and molecules, a challenge in physics and chemistry that previously delayed improvements in solar tech.
NES-VMC (natural excited states variational Monte Carlo) accurately predicted quantum excited states on systems ranging from single atoms to benzene-sized molecules.
The method outperforms leading computational chemistry techniques, often achieving chemical accuracy.
Excited states are crucial for understanding light-matter interactions, key to improving solar cells, LEDs, lasers, and more.
NES-VMC overcomes long-standing challenges in physics and chemistry that have hindered progress in these fields.
This AI-driven breakthrough could lead to more efficient solar cells, brighter LEDs, and more powerful lasers. The ripple effects could be dramatic: lower electricity costs, improvements in phone and laptop battery life and displays, faster fiber-optic internet, and so much more.
Salesforce just introduced two fully autonomous, AI-powered sales agents, Einstein SDR Agent and Einstein Sales Coach Agent, designed to help sales teams accelerate growth through automation and personalization.
Einstein SDR Agent engages with inbound leads 24/7 to answer questions, handle objections, and book meetings.
Einstein Sales Coach Agent helps salespeople rehearse pitches and offers real-time suggestions during calls.
The agents both leverage Salesforce’s CRM data and external data uploaded via Data Cloud to generate accurate, contextually relevant responses.
The agents will be generally available in October, with more details expected at the Dreamforce conference in September.
By integrating AI agents into existing platforms, Salesforce is lowering the barrier for AI adoption in business processes. These agents offer 24/7 support and automate repetitive tasks like qualifying leads and booking meetings, freeing human sales teams to focus on high-value tasks and potentially close more deals.
Slack AI could be tricked into leaking login details and more
Security experts found that Slack’s AI assistant can be misled into disclosing sensitive information, like API keys, to unauthorized users through carefully crafted prompts.
Hackers can exploit this vulnerability by creating a public Slack channel, inputting a malicious command that causes the AI to leak private data via clickable URLs.
Salesforce fixed the issue for private channels but public ones remain exposed, allowing attackers to use social engineering tactics to get workspace members to upload malicious documents.
Google DeepMind workers urge the company to end ties with military organizations
In May 2024, approximately 200 Google DeepMind employees signed a letter urging the company to cease its contracts with military organizations due to concerns over the use of AI technology in warfare, according to Time magazine.
The letter highlights internal tensions between Google’s AI division and its cloud business, referencing Google’s defense contract with the Israeli military and the use of AI for mass surveillance and targeting in Gaza.
The letter calls for Google to investigate claims of its cloud services being used by militaries, cut off such access, and establish a new governance body to prevent future military use of DeepMind’s AI technology.
A Daily Chronicle of AI Innovations on August 22nd 2024
Neuralink’s second patient is already playing video games with brain implant
Apple’s first foldable MacBook might see big delays
OpenAI joins Silicon Valley companies lobbying against California’s AI bill
Ideogram 2.0 launches with major upgrades
xAI releases Grok 2 in early beta
Create your own AI Clone
Disney AI brings robots to life
Ideogram 2.0 launches with major upgrades
Ideogram just released version 2.0 of its advanced text-to-image model with major upgrades and new features, including five new image styles, an iOS app, a beta API, and over 1 billion public Ideogram images.
Ideogram 2.0 offers five image styles: General, Realistic, Design, 3D, and Anime.
The Realistic style convincingly resembles photographs with dramatically improved textures for human features like hands and hair, a pain point for previous image generation models.
The Design style also significantly improves text rendering, allowing users to create greeting cards, t-shirt designs and more.
Ideogram offers a free tier that allows users to generate around 40 images, or 10 prompts a day at no charge.
Ideogram 2.0 consistently renders high-quality images with near-perfect human hands and text, flaws that remain an instant ‘AI giveaway’ in other image generators. This makes the model the new gold standard for use cases like memes, newsletter images, YouTube thumbnails, posters, and more.
xAI has begun rolling out early beta access for Grok 2, a powerful new AI model that leverages real-time data from X and uses Flux.1 to generate relatively unfiltered AI images.
Grok 2 is now available to a select group of premium X users in early beta mode.
The model can access and use real-time information from X, setting it apart from ChatGPT and other LLMs.
Grok 2 offers two modes: regular and “fun” mode, with the latter providing a more distinctive and entertaining personality.
When gathering and summarizing news, Grok 2 can reference specific tweets, a capability that cannot be found in ChatGPT or Claude.
Grok 2’s biggest advantage against other top-tier AI chatbots like ChatGPT is its ability to access real-time information from X and provide unfiltered responses. And with Grok 3 rumoured to be coming at the end of 2024, xAI has proven itself as a serious competitor in the LLM race — in a very short period of time.
ETH Zurich and Disney Research scientists have developed an AI system that can generate realistic, physics-based movements for virtual characters and robots from simple text or image inputs.
The system uses a two-stage approach: first, it learns a latent representation of motion from a large dataset, then trains a control policy using reinforcement learning.
It can handle a diverse range of motions, from simple walking to complex acrobatics, outperforming previous methods in accuracy and generalization.
The AI adapts to physical constraints, allowing it to transfer motions to real robots while maintaining balance and style.
Disney released a video showcasing one robot trained on the new two-stage AI technique dancing and getting pushed around while staying on its feet.
This AI system bridges the gap between animation and robotics, helping humanoids move more naturally and adapt better to new situations. With personal robots coming as soon as 2025 and the rapid pace of AI and robotics advancements, we might be coexisting with robots sooner than most people realize.
Neuralink’s second patient is already playing video games with brain implant
Elon Musk’s company Neuralink has implanted a brain chip in a second human patient named Alex, who is now using it to play video games and design 3D objects.
Alex’s recovery from the procedure has been smooth, and he has successfully used computer-aided design software to create a custom mount for his Neuralink charger.
The core technology of Neuralink involves a small, implantable chip with flexible electrode threads that capture and transmit brain activity to external devices like computers.
OpenAI joins Silicon Valley companies lobbying against California’s AI bill
OpenAI’s chief strategy officer Jason Kwon argues that AI regulations should be managed by the federal government, not individual states, to avoid hindering progress and causing businesses to relocate from California.
Kwon states that a consistent, nationwide set of AI policies will promote innovation, allowing the U.S. to become a leader in global AI standards, and thus opposes California’s SB 1047 bill.
The proposed California AI safety bill, designed by Senator Scott Wiener, includes measures like pre-deployment safety testing and whistleblower protections, and awaits its final vote before potentially being signed by Governor Gavin Newsom.
California and Google drafted a $300 million, 5-year partnership to fund in-state newsrooms and AI initiatives, including a $40 million annual “AI Innovation Accelerator”.
A Daily Chronicle of AI Innovations on August 21st 2024
OpenAI signs landmark agreement with Condé Nast
Microsoft releases new Phi-3.5 models, beating Google, OpenAI and more
AWS CEO tells employees that most developers could stop coding soon as AI takes over
OpenAI adds free fine-tuning to GPT-4o
Claude sued for copyright infringement
Create AI images in real-time on WhatsApp
Microsoft’s new AI beats larger models
Microsoft just released Phi-3.5-MoE, an advanced AI model that rivals the reasoning capabilities of much larger models while maintaining a compact and efficient architecture.
Phi-3.5-MoE uses a mixture-of-experts (MoE) architecture, which selectively activates only the most relevant parts of the model for each task to save compute power (see the toy routing sketch below).
The new model excels at understanding and following complex instructions and can handle up to ~125,000 words in a single prompt.
In head-to-head benchmarks, Phi-3.5-MoE outperformed popular models like Meta’s Llama 3 8B and Google’s Gemma 2 9B, but fell short against OpenAI’s GPT-4o mini.
Microsoft made the model available under an open-source MIT license on Hugging Face.
While the mainstream media focuses on the most advanced large language model, there’s also another race amongst tech giants for the smartest, fastest, and smallest AI. Breakthroughs like Phi-3.5-MoE are paving the way for advanced AI models to run directly and privately on our mobile devices.
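To see where the compute savings come from, here is a toy top-k mixture-of-experts layer in PyTorch. This is a generic sketch of the technique, not Phi-3.5-MoE’s actual implementation; the expert count, k, and dimensions are illustrative.

```python
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    """Toy MoE layer: a router scores all experts per token, but only the
    top-k experts actually run, so most parameters stay idle per token."""
    def __init__(self, dim=512, num_experts=16, k=2):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )
        self.k = k

    def forward(self, x):  # x: (tokens, dim)
        scores = self.router(x)                     # (tokens, num_experts)
        weights, idx = scores.topk(self.k, dim=-1)  # keep only top-k experts
        weights = torch.softmax(weights, dim=-1)    # renormalize their weights
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e            # tokens routed here
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(1) * expert(x[mask])
        return out
```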
OpenAI signs landmark agreement with Condé Nast
OpenAI announced a new media partnership with Condé Nast to enhance search features using their SearchGPT prototype, aiming to make finding information and reliable content sources faster and more intuitive.
The partnership has raised transparency issues, particularly among Condé Nast’s unionized workers, who are worried about the impact on journalism and the lack of clear details on the agreement.
This deal occurs as Wall Street expresses growing concern over a potential AI bubble, with investors questioning the monetization and viability of AI technologies in the current market.
Microsoft releases new Phi-3.5 models, beating Google, OpenAI and more
Microsoft introduced three new open-source AI models, Phi-3.5-mini-instruct, Phi-3.5-MoE-instruct, and Phi-3.5-vision-instruct, which excel in logical reasoning and support multiple languages but face challenges in factual accuracy and safety.
The Phi series aims to deliver highly efficient AI models for commercial and scientific purposes using quality training data, though specifics of the Phi-3.5 training process remain undisclosed by Microsoft.
All the new Phi 3.5 models are accessible under the MIT license on Hugging Face and Microsoft’s Azure AI Studio, but they require specialized GPU hardware like NVIDIA A100, A6000, or H100 for optimal performance.
AWS CEO tells employees that most developers could stop coding soon as AI takes over
A leaked recording revealed that AWS CEO Matt Garman believes software developers may soon stop coding as artificial intelligence takes over many of their tasks.
Garman’s remarks, shared during an internal chat in June, were intended as a positive forecast rather than a dire warning for software engineers, emphasizing new opportunities and skills.
Garman highlighted that developers should focus more on understanding customer needs and innovation, rather than just writing code, as AI tools increasingly manage the technical aspects.
Meta deploys new web crawlers that bypass scraping blocks
Meta has introduced new web crawling bots designed to collect data for training its AI models and related products without being easily blocked by website owners.
These new bots, Meta-ExternalAgent and Meta-ExternalFetcher, have features that potentially bypass the traditional robots.txt file, making website owners’ efforts to block them less effective.
Meta’s bots, launched in July, have shown low block rates compared to older versions, with only 1.5% blocking Meta-ExternalAgent and less than 1% blocking Meta-ExternalFetcher, according to Originality.ai.
OpenAI just launched free fine-tuning (up to 1 million tokens per day through September 23) for GPT-4o, allowing developers to customize the model for higher performance and accuracy.
Developers can now, for the first time, fine-tune GPT-4o to customize the response structure, tone, and domain-specific behavior of the model for their AI applications (see the sketch below).
Fine-tuning is available on all paid usage tiers with training costs of $25 per million tokens, but it is completely free until September 23.
OpenAI suggests that developers should see strong results from fine-tuning with only a few dozen training examples.
Additionally, Google’s Gemini API is giving developers 1.5 billion tokens for free every day on its Gemini 1.5 Flash model and 1.6 million tokens on its Gemini 1.5 Pro model.
Just last week, a company that was granted early access to fine-tune GPT-4o, produced Genie and achieved state-of-the-art scores on both SWE-bench Verified (43.8%) and Full (30.1%) benchmarks. With free fine-tuning now available to all developers, get ready for a new wave of smarter, faster and more capable AI bots.
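For developers who want to try it, starting a job takes two calls with the official openai Python client. A minimal sketch, assuming a prepared JSONL file of chat-formatted examples; the snapshot name was the fine-tunable GPT-4o model at launch, so verify it against OpenAI’s current fine-tuning docs.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Upload training data: one JSON object per line, e.g.
# {"messages": [{"role": "user", "content": "..."},
#               {"role": "assistant", "content": "..."}]}
training_file = client.files.create(
    file=open("training_examples.jsonl", "rb"),
    purpose="fine-tune",
)

# Kick off the fine-tuning job against the GPT-4o snapshot.
job = client.fine_tuning.jobs.create(
    model="gpt-4o-2024-08-06",
    training_file=training_file.id,
)
print(job.id, job.status)
```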
A group of authors filed a lawsuit against AI startup Anthropic, alleging the company committed “large-scale theft” by training its Claude chatbot on pirated copies of copyrighted books.
This is the first lawsuit from writers targeting Anthropic and Claude, but similar lawsuits have been filed against competitor OpenAI and ChatGPT.
The lawsuit accuses Anthropic of using a dataset called The Pile, which includes numerous pirated books.
Anthropic and others, including OpenAI, have argued that training AI models is protected under the “fair use” doctrine of U.S. copyright law, which permits the limited use of copyrighted materials.
This is not the first time an AI company has been sued over copyright infringement, but it resurfaces an important debate about AI training data. While similar cases have been largely dismissed in the past, courts have yet to definitively address the core issue of using unauthorized internet-scraped material for AI training.
International Data Corporation (IDC) forecast that worldwide AI spending will reach $632 billion by 2028, with generative AI accounting for 32% of that.
LTX Studio opened to the public and launched five new features, including character animation and dialogue, face motion capture, and generation and keyframe control.
A Daily Chronicle of AI Innovations on August 20th 2024
AGIBOT reveals new humanoid robot family
ChatGPT runs for mayor in Wyoming
Luma Labs launches Dream Machine 1.5
Tesla’s humanoid robot has a new competitor
Waymo now giving 100,000 weekly robotaxi rides
Fortune 500 companies are getting increasingly worried about AI
Anthropic gets sued on allegations of ‘large-scale theft’
Nvidia’s new AI predicts thunderstorms with kilometer-scale precision
Luma Labs launches Dream Machine 1.5
Luma Labs just released Dream Machine 1.5, a major upgrade to their current AI video generation model, with higher quality text-to-video, smarter prompt understanding, and better image-to-video capabilities.
Dream Machine 1.5 builds on the original model’s ability to generate high-quality, realistic 5-second video clips from text and image prompts.
The upgraded model showcases better natural language processing, interpreting and executing prompts at a higher accuracy.
It excels in creating smooth motion, cinematography, and dramatic shots, turning static concepts into dynamic stories, but lags in morphing, movement, and text.
Dream Machine 1.5 is available to try for free here.
With text-to-image AI generation nearly indistinguishable from reality, the next big frontier is text-to-video — and Dream Machine 1.5 is another big leap forward for realism. While AI video still has some catching up to do, expect fast-moving startups like Luma Labs to close that gap for AI video, fast.
Victor Miller, a mayoral candidate in Wyoming’s capital city, just vowed to let his customized ChatGPT GPT named VIC (Virtual Integrated Citizen) help run the local government if elected.
Miller created VIC using ChatGPT, feeding it city ordinances and related documents to make municipal decisions.
Miller filed for himself and VIC to run for mayor, proposing that the GPT provide data-driven insights and solutions while Miller ensures legal execution.
OpenAI has shut down Miller’s account twice, citing policies against using its products for campaigning.
Wyoming’s Secretary of State raised concerns, but local officials allowed Miller’s candidacy with his human name on the ballot.
While Miller’s chances of winning seem slim, and his grasp of data privacy and LLMs seem slimmer, this marks the first time a political candidate has openly advocated for AI in governance. Whether Cheyenne, Wyoming is ready for an AI co-pilot in City Hall is debatable, but AI will certainly infiltrate politics in the coming years.
AGIBOT, a China-based robotics startup, just unveiled a family of five advanced humanoid robots, directly challenging Elon Musk and Tesla’s upcoming Optimus bot.
AGIBOT’s five new models are both wheeled and biped humanoid robots specifically designed for diverse tasks — from household chores to industrial operations.
The flagship model, Yuanzheng A2, stands 5’9″ (175cm), weighs 121 lbs (55kg), and can perform delicate tasks like needle threading.
The company aims to start shipping 300 units by the end of 2024, claiming better commercialization and cost-control abilities than Tesla.
Unitree, another high-performance robot manufacturer from China, also showcased its new G1 mass production-ready robot with better functionality and appearance.
The humanoid robotics and AI race between the US and China is intensifying. While it’s been months since Tesla unveiled its Optimus 2 prototype, four Chinese startups, including AGIBOT revealing five new humanoid robots, have showcased major technical progress in just a few days.
Unitree Robotics has launched the production version of its G1 humanoid robot, priced at $16,000, just three months after its initial announcement.
The G1 is 90% cheaper than Unitree’s previous humanoid model, the H1, offering advanced features such as 23 degrees of freedom and a 3D vision system for real-time navigation.
While the G1 is not ready for consumer use, it is envisioned as an affordable platform for research and development, likely appealing to institutions and businesses exploring robotic automation.
Waymo disclosed it is now giving more than 100,000 paid robotaxi rides every week across Los Angeles, San Francisco, and Phoenix, doubling its previously stated figures.
This milestone was shared by Waymo co-CEO Tekedra Mawakana and reflects a significant increase from the over 50,000 weekly rides reported by Alphabet CEO Sundar Pichai earlier this year.
Waymo’s fleet consists of hundreds of fully autonomous Jaguar I-Pace vehicles, with 778 robotaxis deployed in California, and it has recently expanded its service to operate 24/7 in San Francisco and parts of Los Angeles.
Fortune 500 companies are getting increasingly worried about AI
Fortune 500 companies reporting AI as a risk factor saw a surge of 473.5% in the past year, according to a report by Arize AI, with 281 companies now flagging such risks.
Arize AI’s analysis revealed that 56.2% of Fortune 500 companies now include AI risks in their latest annual reports, a substantial jump from the previous year’s 49 companies.
The software and technology sectors lead the mentions of generative AI, while advertising, media, and entertainment industries report the highest percentage, 91.7%, of AI as a risk factor.
Anthropic gets sued on allegations of ‘large-scale theft’
A group of authors has filed a lawsuit against AI startup Anthropic, alleging “large-scale theft” for using pirated copies of copyrighted books to train its chatbot, Claude.
This marks the first lawsuit by writers specifically targeting Anthropic, although similar cases have been brought against OpenAI, the maker of ChatGPT, for the same reasons.
The lawsuit accuses Anthropic, which markets itself as a responsible AI developer, of contradicting its goals by using unauthorized works, and it adds to the increasing legal challenges faced by AI developers.
Nvidia’s new AI predicts thunderstorms with kilometer-scale precision
Nvidia Research has introduced StormCast, a new AI model for high-precision atmospheric dynamics to enhance mesoscale weather prediction, which is critical for disaster preparedness and mitigation.
Integrated into Nvidia’s Earth-2 platform, StormCast provides hourly autoregressive forecasts that are 10% more accurate than current U.S. operational models, improving early warning systems for severe weather events.
Trained on NOAA climate data, StormCast predicts over 100 weather variables and allows scientists to observe storm evolution in three dimensions, marking significant advancements in AI-driven weather forecasting by Nvidia.
A Daily Chronicle of AI Innovations on August 19th 2024
You can now rent ‘living computers’ made from human neurons
Start-up failures up by 60% as founders face hangover from boom years
AMD is going after Nvidia with a $5 billion acquisition
Tesla will pay you to pretend to be a robot
You can now rent ‘living computers’ made from human neurons
Researchers and companies like FinalSpark are creating computers from lab-grown human brain organoids, which can be rented for $500 a month.
These biocomputers use human neurons to form pathways mimicking human brain learning processes, potentially consuming significantly less energy than current AI technologies.
While challenges remain, such as limited organoid lifespans and lack of standardized manufacturing, FinalSpark and other researchers are exploring various biocomputing approaches, including cellular and fungal computing.
AMD is going after Nvidia with a $5 billion acquisition
AMD is set to buy ZT Systems for $4.9 billion in cash and stock, aiming to strengthen its AI ecosystem and offer better support to companies building large AI computing businesses.
The acquisition will integrate ZT Systems’ computing infrastructure design business into AMD, although AMD plans to sell the data center infrastructure manufacturing arm to a strategic partner.
ZT Systems’ CEO Frank Zhang and President Doug Huang will lead roles within AMD’s Data Center Solutions Business Group, with the deal expected to conclude in the first half of 2025.
Tesla is offering up to $48 per hour for Data Collection Operators to wear motion-capture suits and VR headsets to help train its humanoid Optimus robot.
Workers wearing these suits perform and analyze tasks to gather extensive data, aiding in the robot’s development for various roles, from factory work to caregiving.
Tesla’s initiative involves collecting potentially millions of hours of data, aiming to overcome the challenges of producing versatile robots at scale and ensuring their success in diverse tasks.
Swiss startup FinalSpark just launched a service allowing scientists to rent cloud access to “biocomputers” made of human brain cells for $500 a month, in an effort to create AI that uses 100,000x less energy than current systems.
The system uses organoids (clumps of human brain cells) that can “live” and compute for up to 100 days.
The organoids are trained using dopamine for positive reinforcement and electrical signals for negative reinforcement, mimicking natural neural processes.
FinalSpark claims these biocomputers could be up to 100,000 times more efficient for AI training than traditional silicon-based technology.
The organoids and their behaviour are live streamed 24/7, which you can access here.
AI is an energy-hungry industry, and alleviating its dependence on CPUs and GPUs is generally a step in the right direction. That said, using brain organoids for biocomputing is completely uncharted territory and is bound to raise ethical concerns — such as the sci-fi possibility that cell masses somehow achieve consciousness.
California’s SB 1047, an aggressive AI safety bill aimed at preventing AI disasters, just got significantly revised to address concerns raised by AI companies like Anthropic and open-source developers.
The bill no longer allows California’s attorney general to sue AI companies for negligent safety practices before a catastrophic event occurs.
AI labs are now only required to submit public “statements” about their safety practices rather than certifications “under penalty of perjury.”
Likewise, developers must now provide “reasonable care” rather than “reasonable assurance” that AI models do not pose significant risks.
The bill is headed to California’s Assembly floor for a final vote.
Governments are navigating a fine line between advancing technological progress and mitigating potential existential risks — and California is showing that regulation can be practical and adaptive. These changes are a big step toward fostering responsible AI development through collaborative governance.
Researchers just developed a new technique to find shorter solutions to scrambled Rubik’s Cubes by cleverly analyzing the puzzle’s structure and identifying the best moves more quickly.
The Rubik’s Cube has an enormous number of possible configurations, over 43 quintillion, making it challenging for AI to solve in the fewest moves possible.
Researchers represented the Rubik’s Cube as a complex network or “graph” and used a new technique to pass useful information, like the moves required to solve the puzzle, between connected nodes.
The AI then considers which next moves are most likely to lead to a quick solution, using the probabilities as weights, and focuses on the most promising paths.
When tested, the new technique found solutions to the puzzle faster than current state-of-the-art Rubik’s Cube solving AI systems.
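The paper’s exact method isn’t reproduced in this summary, but the core idea (graph nodes as cube states, with predicted move probabilities steering the search toward promising paths) can be sketched as a best-first search. Everything below is a hypothetical stand-in; in particular, move_probabilities plays the role of the learned model:

import heapq, itertools, math

def best_first_solve(start, is_solved, apply_move, move_probabilities, max_nodes=100_000):
    # Priority is the accumulated -log(probability) of the move sequence,
    # so likely solution paths are expanded before unlikely ones.
    tie = itertools.count()  # breaks ties so states never get compared directly
    frontier = [(0.0, next(tie), start, [])]
    seen = {start}
    while frontier and max_nodes:
        cost, _, state, path = heapq.heappop(frontier)
        if is_solved(state):
            return path
        max_nodes -= 1
        for move, p in move_probabilities(state):
            nxt = apply_move(state, move)
            if p > 0 and nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (cost - math.log(p), next(tie), nxt, path + [move]))
    return None  # search budget exhausted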
As companies like Sakana build AIs that can completely automate scientific research, it’s important to make sure they’re solving highly complex problems efficiently. This technique, coupled with Sakana’s processes, could be massively beneficial in areas like optimizing supply chains and advanced drug discovery.
Free event: Navigating AI Data Privacy. Join Section CEO Greg Shove to learn how to protect your data, write a team or company AI data policy, and lead your company on safe AI. RSVP here. Source: https://www.sectionschool.com/events/live-events/ai-data-privacy-in-large-organizations
Claude added a new screenshot capture button, allowing users to easily include images from their screen in prompts. Source: https://x.com/alexalbert__/status/1824483452802175082
Midjourney released a new unified web-based AI image editor with advanced tools for seamlessly modifying and extending generated images. Source: https://venturebeat.com/ai/midjourney-releases-new-unified-ai-image-editor-on-the-web
Rebellions and Sapeon, South Korean AI chip makers, signed a definitive merger agreement to challenge global leaders like Nvidia. Source: https://www.reuters.com/technology/artificial-intelligence/south-korean-ai-chip-makers-rebellions-sapeon-agree-merge-2024-08-18
Bzigo launched Iris, an AI-powered mosquito detector that tracks and marks mosquitoes with a laser pointer for easy swatting. Source: https://www.foxnews.com/tech/ai-technology-can-help-you-win-battle-over-mosquitoes
Coinbase started a $15,000 accelerator grant program for projects combining AI with crypto wallets to enable economic participation. Source: https://cointelegraph.com/news/coinbase-ceo-brian-armstrong-ai-should-have-crypto-wallets
Microsoft unveiled PowerToys Workspaces, a new feature to auto-arrange apps, plus an AI-powered copy-paste tool with OpenAI API integration. Source: https://www.theverge.com/2024/8/16/24221639/microsoft-powertoys-workspaces-feature-demo
A Daily Chronicle of AI Innovations on August 16th 2024
AI makes Walmart 100x more productive
SoftBank’s AI chip faces setback
Create a Siri-like voice AI with Llama 3.1
Hermes 3 is the newest open-source model
AI makes Walmart 100x more productive
Walmart’s CEO Doug McMillon just reported that the company is using generative AI to increase its productivity, updating 850 million product catalog entries 100 times faster than human-led methods.
The report came during the company’s Q2 financial earnings call, where McMillon also announced AI improvements to customer search and seller support.
Customers can now use AI-powered search and a new shopping assistant on Walmart’s app and website — it even provides advice for questions like “Which TV is best for watching sports?”
Walmart is also testing a completely new AI-driven experience for U.S.-based marketplace sellers, but the details are not yet available.
McMillon said the company plans to continue experimenting with AI globally across all parts of its business.
Another multibillion-dollar company is using AI to increase productivity, but most notably, Walmart is exploring the tech in all areas of its business operations. Whether people should be excited about the endless possibilities ahead or concerned about the relevance of their jobs is a question that’s not going away any time soon.
SoftBank’s ambitious Project Izanagi initiative, aimed at developing AI processors to rival Nvidia, is reportedly facing a major setback after Intel failed to meet volume and speed requirements.
SoftBank had been working with Intel to develop AI processors for Project Izanagi because SoftBank lacks in-house chip design expertise, but Intel could not meet its volume and speed requirements.
In an effort to keep Project Izanagi on track, SoftBank is considering a new partnership with TSMC, the world’s largest chipmaker.
TSMC has its own issues, however: it is struggling to keep up with existing chipmaking demand, which has stalled the negotiations.
Despite the complications, SoftBank CEO Masayoshi Son remains committed to the company’s ambitious plan and is seeking investments from Saudi Arabia, UAE, and major tech companies.
Nvidia is currently dominating the AI chip space, which propelled the company to its current $3 trillion market capitalization. But with recent delays of Nvidia’s next-gen Blackwell AI chip, it could be time for competitors to strike.
Nous Research just released Hermes 3, a new open-source model with significant improvements in roleplaying, agentic tasks, function calling, multi-turn chats, and long context coherence.
Hermes 3 is available in three sizes (8B, 70B, and 405B) with the 405B parameter model achieving state-of-the-art performance relative to other open models.
The model is instruct-tuned, meaning it is trained to faithfully respond to user requests and closely follow provided system prompts, unlike base or foundation models.
It achieves similar or better performance to Meta’s Llama-3.1 405B in judgment, reward modeling, interpretable problem-solving, code generation, and tool use.
Hermes 3 is available now for free via Lambda Chat or in the Nous Research Discord server.
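For readers who’d rather self-host than use Lambda Chat, the weights are on Hugging Face. Below is a minimal sketch using the transformers library; the repo ID follows Nous Research’s usual naming and should be verified on their Hugging Face page before running:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NousResearch/Hermes-3-Llama-3.1-8B"  # assumed repo ID; verify on Hugging Face
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Hermes 3 is instruct-tuned, so use the chat template rather than raw text.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain instruction tuning in one sentence."},
]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True,
                                       return_tensors="pt").to(model.device)
output = model.generate(inputs, max_new_tokens=100)
print(tokenizer.decode(output[0], skip_special_tokens=True))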
Meta has been the leader in open-source AI for a while, but companies like Nous Research and Mistral are catching up with their latest Hermes 3 and Large 2 models. And the more free, customizable and state-of-the-art AIs available to the public, the more transparency the world has.
Elon Musk revealed that xAI is developing an in-house image generation system to replace the current Flux model in Grok 2, but it’s currently months away from release.
The U.S. Consumer Financial Protection Bureau highlighted risks of AI in finance, saying existing laws apply and innovation requires consistent regulatory treatment.
Apptronik, an automation company that makes humanoid robots, recently reported that the company is preparing for a commercial launch by the end of 2025.
A Daily Chronicle of AI Innovations on August 15th 2024
Apple’s iPad is getting a robotic arm
Google’s Imagen 3 tops Midjourney, DALL-E
Apple’s next big thing is a $1000 home robot
Grok-2 reaches state-of-the-art status
Creating sound effects with text
X’s AI image generator allows users to create uncensored images
Ex-Google CEO says successful AI startups can steal IP and hire lawyers to ‘clean up the mess’
FTC finalizes rule banning fake reviews, including those made with AI
Apple’s next big thing is a $1000 home robot
Apple is reportedly working on a new smart home project featuring an iPad attached to a robotic arm that can twist and rotate, designed as a home “command center” with AI capabilities.
The initiative, backed by CEO Tim Cook and head of hardware engineering John Ternus, has involved hundreds of staff and follows the cancelled Apple-brand electric car project.
According to Bloomberg, the device is expected to be released around 2026 or 2027, potentially costing about $1,000, and will use a modified version of iPadOS.
xAI’s newest AI model, Grok-2, is now available in beta for users on the X platform — achieving state-of-the-art status and outperforming versions of Anthropic’s Claude and OpenAI’s GPT-4.
In addition to Grok-2, Grok-2 mini is also now available to users on the X platform in beta with an enterprise API release planned for later this month.
Both Grok-2 and Grok-2 mini show significant improvements in reasoning with retrieved content, tool use capabilities, and performance across all academic benchmarks.
Grok-2 can now create and publish images directly on the X platform, powered by Black Forest Lab’s Flux 1 AI model.
Grok-2 surpasses OpenAI’s latest GPT-4o and Anthropic’s Claude 3.5 Sonnet in some categories, making it one of the best models currently available to the public if based purely on benchmarks.
Grok-1 debuted as a niche, no-filter chatbot, but Grok-2’s newly achieved state-of-the-art status has catapulted xAI into a legitimate competitor in the AI race. The startup is looking to have a bright future with its new Supercluster, Elon’s ability to attract talent, and vast amounts of real-time training data available on X.
Apple is reportedly ramping up development on a high-end tabletop smart home device with a robotic arm, an iPad-like display, and Siri voice command to operate its AI features.
The project, codenamed J595, reportedly involves a team of several hundred people and could launch as early as 2026 or 2027.
The device combines an iPad-like display with a thin robotic arm that can tilt, spin 360 degrees, and move the screen around.
It is expected to run a modified version of iPadOS, making it a familiar smart home command center, videoconferencing tool, and remote-controlled home security device.
Apple is targeting a price point of around $1,000 for the product.
Apple is doubling down on its commitment to artificial intelligence by ramping up the development of a strange new Siri-powered, countertop robotic arm. With Apple Intelligence launching later this year, the tech giant seemingly has big plans for implementing AI into its hardware.
X’s AI image generator allows users to create uncensored images
X’s new AI image generator, Grok, allows users to create and share highly controversial images, including those of public figures in inappropriate scenarios, raising concerns about the lack of content moderation.
Despite claiming to have restrictions, Grok often generates offensive or misleading images, with many users easily bypassing its few safeguards, leading to further scrutiny from regulators.
The chaotic rollout of Grok’s image generation feature aligns with Elon Musk’s relaxed approach to content moderation, potentially driving away advertisers and inviting regulatory action.
ElevenLabs now offers a text-to-sound feature that allows users to generate sound effects by writing a simple description of the noise they want.
Visit ElevenLabs and log in or create an account. You can try this feature for free.
Select “Sound Effects” from the left sidebar.
Describe your desired sound effect in the text box.
Adjust settings for duration and prompt influence.
Click “Generate Sound Effects” to create your sounds.
Source: https://elevenlabs.io/
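The steps above use the web UI; ElevenLabs also exposes sound generation through its REST API. The sketch below uses Python’s requests library; the endpoint path and JSON fields are stated here as assumptions based on the public docs, so double-check them before relying on this:

import requests

API_KEY = "your-elevenlabs-api-key"  # from your account settings

resp = requests.post(
    "https://api.elevenlabs.io/v1/sound-generation",  # assumed endpoint; verify in the docs
    headers={"xi-api-key": API_KEY},
    json={
        "text": "rain on a tin roof with distant thunder",
        "duration_seconds": 5,    # mirrors the UI's duration setting
        "prompt_influence": 0.5,  # 0-1, mirrors the UI's prompt influence slider
    },
)
resp.raise_for_status()
with open("rain.mp3", "wb") as f:
    f.write(resp.content)  # the endpoint returns raw audio bytes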
Google’s Imagen 3 tops Midjourney, DALL-E
Google DeepMind recently published the paper for its new state-of-the-art AI image generation model, Imagen 3, flexing that it beat DALL-E 3, Midjourney v6, and Stable Diffusion 3 in human performance evaluations.
The human evaluations asked participants to rank their preferred models for overall quality and adherence to detailed prompts.
Imagen 3 excelled particularly in generating high-quality, realistic images that closely match long and complex text descriptions.
Despite its capability to accurately generate photorealistic images, it struggles with certain tasks requiring numerical reasoning, understanding scale, and depicting actions.
Ex-Google CEO says successful AI startups can steal IP and hire lawyers to ‘clean up the mess’
Former Google CEO Eric Schmidt suggested that successful AI startups can initially steal intellectual property and later hire lawyers to resolve legal issues if their product gains traction.
Schmidt used a hypothetical example of copying TikTok to illustrate how Silicon Valley entrepreneurs might prioritize rapid growth over legal considerations.
Schmidt’s comments, made during a talk at Stanford, were later removed from the university’s YouTube channel after drawing media attention.
FTC finalizes rule banning fake reviews, including those made with AI
The FTC has introduced a final rule prohibiting companies from producing or selling fake reviews, including AI-generated ones, and can now penalize companies that ignore the regulation.
The rule targets deceptive practices such as incentivizing feedback, undisclosed insider reviews, company-controlled review sites, intimidation to remove negative feedback, and the trade of fake followers or views.
Although the FTC first proposed the fake review ban last year, there are concerns about enforcing it on global marketplaces like Amazon, where numerous fraudulent reviews come from businesses outside the U.S.
Free eBook: The AI Proficiency Report from Section. 7% of the workforce is getting all the benefits of AI. Download the report to see what they do differently.
A Daily Chronicle of AI Innovations on August 14th 2024
Google beats OpenAI in voice mode race
OpenAI redesigns coding benchmark
Bring images to life with Kling AI
Become a tennis pro with AI
Android phones get an AI upgrade
xAI releases Grok-2, adds image generation on X
New ‘AI Scientist’ conducts research autonomously
Android phones get an AI upgrade
Google is replacing Google Assistant with its new AI model, Gemini, on Android phones, introducing generative AI capabilities like automating calendar invites and creating playlists based on user input.
Gemini will operate through cloud-based services, allowing for advanced AI processing, while Apple plans to run its AI models directly on devices for better privacy and latency.
The introduction of Gemini marks a significant shift in smartphone functionality, offering the potential to automate day-to-day tasks, but there are risks of errors as AI assistants become more integrated into daily life.
Google just launched Gemini Live, a mobile conversational AI with advanced voice capabilities, while OpenAI’s ChatGPT voice mode remains in its “limited alpha phase” and is not yet available to everyone.
Gemini Live, Google’s answer to OpenAI’s Advanced Voice Mode, is capable of “in-depth” hands-free conversations and has 10 different human-like voice options.
Users can interrupt and ask follow-up questions mid-response, mimicking natural conversation flow — however, Gemini Live’s ability to see and respond to your camera view is planned for later this year.
Similar to Apple’s upcoming Intelligence features, Gemini integrates directly with Google to provide context-aware answers without switching apps.
Gemini Live is now the default assistant on Google’s Pixel 9 and is available today to all Gemini Advanced subscribers on Android (coming to iOS soon).
Real-time voice is slowly shifting AI from a tool we text/prompt with, to an intelligence that we collaborate, learn, consult, and grow with. As the world’s anticipation for OpenAI’s unreleased products grows, Google has swooped in to steal the spotlight as the first to lead widespread advanced AI voice rollouts.
xAI has launched upgraded Grok-2 and Grok-2 mini chatbots with new image-generation capabilities, which are powered by Black Forest Lab’s Flux 1 AI model and allow users to publish images to X with few restrictions.
Both Grok-2 models are currently in beta, available to Premium and Premium Plus subscribers on X, and will be accessible via xAI’s enterprise API later this month.
Early examples of Grok-generated images, depicting figures like Donald Trump and Barack Obama, indicate minimal content restrictions, raising concerns about the spread of false information on the platform.
OpenAI and the authors of SWE-bench collaborated to redesign the popular software engineering benchmark and release ‘SWE-bench Verified’, a human-validated subset of the original benchmark.
SWE-bench Verified addresses issues in the original benchmark, such as overly specific unit tests and unreliable development environments that lead to incorrect assessments of AI performance.
The new subset includes 500 samples verified by human professional software developers to make evaluating models on SWE-bench easier and more reliable.
On SWE-bench Verified, GPT-4o solves 33.2% of samples, and the best open-source scaffold, Agentless, doubles its previous score to 16%.
The leaderboard for SWE-bench Verified does not include Cosine’s Genie we wrote about yesterday, which shattered the high score on the old benchmark by over 10%.
Accurate benchmarking of AI in human-level tasks like coding is crucial for transparency and assessing AI risk. However, OpenAI’s collab with SWE-bench is a double-edged sword — while it improves the benchmark, it also raises questions about potential conflicts of interest, especially with ‘Project Strawberry’ rumors heating up.
Tokyo-based R&D company Sakana AI introduced “The AI Scientist,” an AI designed to fully automate research, claiming it’s the first system of its kind to independently handle numerous scientific tasks.
The AI Scientist generates innovative research ideas, conducts experiments, writes code, and produces scientific papers while using a simulated review process to evaluate its own findings, mimicking human scientific collaboration.
A rival AI startup, Omniscience, contested Sakana AI’s originality, asserting their AI model, Omni, was released months earlier and offers similar capabilities for aiding users in scientific writing and research tasks.
Kling AI’s new image-to-video feature allows users to take static images and turn them into dynamic videos, offering a new dimension to the AI video generator’s character consistency.
Click “AI Videos” on the dashboard, then select “Image to Video” on the top bar.
Upload your chosen image and write a prompt describing how you want the image animated.
Hit “Generate” and watch your image come to life!
Source: https://klingai.com/
Become a tennis pro with AI
Researchers just created Match Point AI, a groundbreaking tennis simulation that pits AI agents against virtual pros, giving players data-driven tennis strategies and tools to help improve their game.
Match Point AI realistically models the complexities and uncertainties of real tennis, allowing AI to test new strategies in virtual games.
Early experiments show the AI rediscovering time-tested tennis strategies, like making opponents run, validating the framework’s ability to understand the sport.
By watching Match Point’s AI agents that mimic tennis legends like Novak Djokovic, players can learn the perfect strategies to optimize their game quickly and efficiently.
AI has long been trained to compete in games, but researchers usually focus on board and video games with straightforward mechanics. Match Point AI learns to make decisions in a real-world, complex sport, similar to how Google’s newest AI robot can play ping pong against intermediate players.
What else is happening in AI on August 14th 2024!
Google unveiled Pixel Buds Pro 2 with a custom Tensor A1 chip, enhanced noise cancellation, and Gemini AI integration.
A Daily Chronicle of AI Innovations on August 13th 2024
New AI can diagnose stroke via tongue color
Sakana reveals an autonomous AI scientist
New AI model sparks rumors about OpenAI’s Q*
New AI model can listen while speaking
Gemini 1.5 Flash cuts usage fees by 78%
OpenAI releases GPT-4o System Card, revealing safety measures
SingularityNet’s supercomputer network: A step closer to AGI
New AI model sparks rumors about OpenAI’s Q*
A mysterious new AI model has appeared in the LMSYS Chatbot Arena, sparking rumors that it could be OpenAI’s highly anticipated Q* AI breakthrough or its evolution, codenamed ‘Strawberry.’
Testers report that this “anonymous-chatbot” displays more advanced reasoning capabilities than the current state-of-the-art GPT-4o model. To add to the speculation, OpenAI CEO Sam Altman has tweeted a picture of a strawberry, which is believed to be the codename for OpenAI’s secret new AI model.
Why does it matter?
If this mystery model is indeed Q*, it could represent another significant leap forward in AI capabilities as OpenAI’s competitors like Anthropic and Meta start to catch up to GPT-4o. This could be a massive paradigm shift that could significantly reshape the landscape of AI.
Tokyo-based Sakana AI just introduced “The AI Scientist,” the world’s first AI system capable of autonomously conducting scientific research — potentially revolutionizing the scientific process.
The system generates new research ideas, writes code, runs experiments, writes papers, and performs its own peer review with near-human accuracy.
Sakana AI envisions a future where we won’t just see an autonomous AI researcher but also autonomous reviewers, area chairs, and entire conferences.
The AI Scientist has already produced papers with novel contributions in machine learning domains like language modeling and diffusion models.
Each paper only costs approximately $15 to produce, which could potentially democratize research capabilities.
This breakthrough could dramatically accelerate scientific progress by allowing researchers to collaborate with AI agents and automate time-consuming tasks. We’re entering a new era where academia could soon be powered by a tireless community of AI agents, working round-the-clock on any problem they’re directed to.
Cosine just showed off Genie, its new fully autonomous AI software engineer that broke the high score on a benchmark for evaluating the coding abilities of large language models (LLMs), by over 10%.
Cosine trained Genie on a dataset that emulates how human software engineers actually work, from incremental knowledge discovery to step-by-step decision-making.
When it makes a mistake, Genie iterates, re-plans, and re-executes until it fixes the problem, something that foundational models struggle with.
Genie scored 30.08% on SWE-Bench, a 57% improvement over previous top performers like Amazon’s Q and Code Factory at 19% (GPT-4 scores 1.31%).
The waitlist is currently open, but Genie has not yet been released to the general public.
Cosine completely rethinks the way that AI is trained, teaching it to be more human-like during its training rather than focusing on post-training prompt design — and it works! With its recent SWE-Bench success, more companies are likely to adopt the process and build smarter AIs, a win-win for everyone.
Researchers have developed a new Listening-While-Speaking Language Model (LSLM) that can listen and speak simultaneously. This allows for more natural and responsive conversations with AI systems. The LSLM uses a token-based decoder-only text-to-speech model for speech generation and a streaming self-supervised learning encoder for real-time audio input.
This enables the model to detect turn-taking and respond to interruptions, a key feature of natural conversation. In addition, the LSLM has demonstrated robustness to noise and sensitivity to diverse instructions in experiments.
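The implementation isn’t public in this summary, so the control flow below is only a toy sketch of what “listening while speaking” means in code: every decoding step both ingests a chunk of live audio and decides whether to keep talking. Here, listen_encoder and speak_decoder are hypothetical stand-ins:

def full_duplex_loop(listen_encoder, speak_decoder, mic_stream, max_steps=1000):
    # Unlike half-duplex assistants, listening never pauses while speaking.
    context = []
    for _, audio_chunk in zip(range(max_steps), mic_stream):
        context.append(listen_encoder(audio_chunk))   # streaming SSL encoder
        token, keep_talking = speak_decoder(context)  # token-based TTS decoder
        if not keep_talking:
            break  # turn-taking detected: the listener interrupted
        yield token

# Toy stand-ins: stop speaking once the "listener" gets loud.
spoken = list(full_duplex_loop(lambda c: c,
                               lambda ctx: ("tok", ctx[-1] < 0.5),
                               iter([0.1, 0.2, 0.9, 0.1])))
print(spoken)  # two tokens, then the simulated interruption cuts speech off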
Why does it matter?
While OpenAI’s advanced voice mode for ChatGPT pushes us towards realistic AI conversations, LSLM takes that to the next level, where it could revolutionize human-AI interactions, making conversations with machines feel natural and responsive.
Google has announced significant updates and improvements to its Gemini API and Google AI Studio. The biggest news is a major reduction in usage fees for Gemini 1.5 Flash. Input token costs have decreased by 78% to $0.075 per 1 million tokens, and output token costs have decreased by 71% to $0.30 per 1 million tokens.
This makes Gemini 1.5 Flash a popular and affordable summarization and multi-modal understanding model. Google has also completed the Gemini 1.5 Flash tuning rollout, allowing developers to customize the base model and improve its performance.
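At those rates, typical calls cost fractions of a cent. A quick sanity check in Python, using the prices from the announcement and made-up token counts for illustration:

INPUT_PRICE = 0.075 / 1_000_000   # dollars per input token after the 78% cut
OUTPUT_PRICE = 0.30 / 1_000_000   # dollars per output token after the 71% cut

def call_cost(input_tokens, output_tokens):
    return input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE

# e.g., summarizing a 10,000-token document into 500 tokens:
print(f"${call_cost(10_000, 500):.6f}")  # ~$0.000900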
Why does it matter?
The extended language support, model tuning options, and improvements to the Gemini API will enable more developers and researchers to build innovative AI-powered products and services using advanced NLP capabilities.
SingularityNet’s supercomputer network: A step closer to AGI
SingularityNET is launching a network of powerful supercomputers to accelerate the development of AGI. The first of these supercomputers is expected to come online in September 2024. The network will use cutting-edge hardware like Nvidia GPUs and AMD processors to create a “multi-level cognitive computing network” for hosting and training complex AGI systems.
The company uses an open-source software framework called OpenCog Hyperon to manage the distributed computing power. Users will access the network through a tokenized system, allowing them to contribute data and test AGI concepts.
Why does it matter?
Major AI companies such as OpenAI, Anthropic, and Google currently dominate the race to AGI development. However, SingularityNET’s novel decentralized approach could disrupt this, democratizing AI research for a broader range of contributors and innovators.
An AI developed by researchers at Middle Technical University and the University of South Australia can diagnose stroke by analyzing the color of a person’s tongue.
The advanced algorithm, which boasts a 98% accuracy rate, can also detect conditions such as anaemia, asthma, diabetes, liver, and gallbladder issues, COVID-19, and various gastrointestinal diseases.
This innovative system uses tongue color analysis, an ancient technique from traditional Chinese medicine, and could potentially be adapted for use with smartphones for real-time health assessments.
Reddit is testing AI-powered search result pages that provide summaries and recommendations to help users “dig deep” into content and discover new communities.
According to leaked documents, Nvidia has been scraping video content from sources like YouTube and Netflix to train its AI models for its upcoming Cosmos project.
Automattic has launched a new tool called “Write Brief with AI.” This helps WordPress bloggers write concisely and improve the readability of their content.
Anthropic is expanding its safety bug bounty program to focus on finding flaws in its AI safeguarding systems. The company is offering bounty rewards of up to $15,000.
OpenAI allows free ChatGPT users to generate up to two images per day using its DALL-E 3 model. This was previously available only to ChatGPT Plus subscribers.
Google Researchers developed a robot to play competitive table tennis at an amateur human level. It can also adapt its game to play vs. unseen human opponents.
Alibaba has released a new LLM called Qwen2-Math that scored 84% on the MATH Benchmark, surpassing OpenAI’s GPT-4o and other leading math-focused AI models.
Google Meet is rolling out a new AI-powered feature, “Take notes for me,” which can automatically take notes during video calls, boosting productivity and efficiency.
A Daily Chronicle of AI Innovations on August 12th 2024
AI search is gaining momentum
ChatGPT unexpectedly began speaking in a user’s cloned voice during testing
Meta and UMG struck an agreement to ‘protect’ artists from AI
Google Meet adds new note-taking AI
FCC cracks down on AI voice calls
Google Meet adds new note-taking AI
Google is rolling out a new “Take notes for me” feature for Google Meet, powered by its Gemini AI, allowing users to focus on the meeting while the AI automatically captures key points.
The AI-powered tool will automatically take notes during Google Meet calls, reducing the need for manual note-taking.
The feature is powered by Google’s Gemini AI and will be available to Workspace customers with specific add-ons.
“Take notes for me” is part of the AI Meetings and Messaging add-on, which costs $10 per user/month across most Google Workspace plans.
Admins can configure the feature’s availability through the Google Workspace Admin console.
Taking notes during meetings will soon be a thing from our prehistoric, non-AI past — with Google pushing for a more practical, AI-assisted future of work. Alongside this, the tech giant is directly competing against smaller AI startups such as Otter AI and Fireflies, which have thrived by selling nearly identical features to users.
The U.S. Federal Communications Commission (FCC) just proposed new regulations requiring AI-generated voice calls to disclose the use of artificial intelligence.
The proposal aims to combat the rise of AI-generated voices in unwanted and potentially fraudulent ‘robocalls’.
AI voices would be required to explicitly state they are artificial at the beginning of calls.
The FCC is also exploring tools to alert people when they receive AI-generated calls and texts, including enhanced call filters, AI-based detection algorithms, and improved caller ID flagging.
As AI voices become indistinguishable from human speech, these regulations are crucial in combating highly targeted scams. But with enforcement likely to be a cat-and-mouse game against scammers, the best defence is education—especially for those most vulnerable to AI deception.
Perplexity’s AI search engine experienced substantial growth, answering 250 million queries last month, signaling a rising demand for AI-driven search technologies. For comparison, the company processed 500 million queries throughout all of 2023, Shevelenko told the Financial Times.
Despite this growth, Perplexity remains significantly behind Google, which dominates the market with over 90 percent share and processes around 8.5 billion queries daily.
The rise of AI in search, exemplified by Perplexity and other players, suggests a potential shift in user behavior and challenges to the traditional search engine business models.
ChatGPT unexpectedly began speaking in a user’s cloned voice during testing
During testing, ChatGPT’s Advanced Voice Mode accidentally mimicked users’ voices without their consent, as highlighted in OpenAI’s new GPT-4o system card released on Thursday.
OpenAI has implemented safeguards to prevent unauthorized voice imitation, although rare episodes during testing showcased the model’s ability to unintentionally generate user-like voices.
The GPT-4o AI model can synthesize almost any sound, and OpenAI directs this capability by using authorized voice samples and employing an output classifier to ensure only selected voices are generated.
Meta and UMG struck an agreement to ‘protect’ artists from AI
Meta and Universal Music Group (UMG) updated their licensing agreements to extend UMG’s content use across more Meta platforms, now including Threads and WhatsApp alongside Facebook, Instagram, Messenger, and Meta Horizon.
This multiyear agreement aims to explore new collaboration opportunities on WhatsApp and other Meta platforms, addressing issues like unauthorized AI-generated content that could impact artists and songwriters.
Meta’s collaboration with UMG dates back to 2017, allowing users to use UMG music in content and addressing copyright issues, a challenge shared by TikTok in its recent dealings with UMG.
Delphi unveiled an AI clone feature that creates lifelike digital replicas of individuals, demonstrating its capabilities in a TV interview on FOX Business.
A Daily Chronicle of AI Innovations on August 09th 2024
OpenAI fears users will become emotionally dependent on its ChatGPT voice mode
Google’s new robot can play table tennis like humans
GPT-4 tackles top-secret tasks
AI speeds up schizophrenia cure
OpenAI fears users will become emotionally dependent on its ChatGPT voice mode
OpenAI is concerned that users may become emotionally dependent on ChatGPT due to its new, human-sounding voice mode, which could affect relationships and social interactions.
The company observed users expressing shared bonds with ChatGPT’s voice mode, raising fears that prolonged use could reduce the need for human interaction and lead to unhealthy trust in AI-supplied information.
OpenAI plans to continue studying the potential for emotional reliance on its tools and aims to navigate the ethical and social implications responsibly while ensuring AI safety.
Google’s new robot can play table tennis like humans
Google’s DeepMind team has developed a table tennis robot that performs at a “solidly amateur” human level, successfully competing against beginner and intermediate players while struggling against advanced ones.
During testing, the robot achieved a 55% win rate against intermediate players, winning 45% of the 29 games it played in total, but it failed to win any matches against advanced players.
DeepMind identifies the robot’s main weaknesses as reacting to fast balls and dealing with system latency, suggesting improvements like advanced control algorithms and predictive models for better performance.
Researchers at Uppsala University recently used AI to accurately predict 3D structures of receptors linked to schizophrenia and depression treatments and speed up possible treatment strategies.
The AI model predicted the structure of TAAR1, a receptor linked to schizophrenia and depression treatments.
Then, supercomputers screened millions of molecules to find those fitting the AI-generated model.
Experimental testing confirmed many AI-predicted molecules activated TAAR1, and one potent molecule showed promising positive effects in animal experiments.
Researchers reported on a new model that can predict major diseases early enough to treat them, and now AI is working on curing schizophrenia and depression. As the tech continues to improve, we’re going to see a complete transformation in healthcare that will likely save millions, if not billions, of lives.
Microsoft and Palantir just partnered to deliver advanced AI, including GPT-4, and analytics capabilities to U.S. Defense and Intelligence agencies through classified cloud environments.
The partnership integrates Palantir’s AI Platforms with Microsoft’s Azure OpenAI Service in classified clouds.
The aim is to safely and securely enable AI-driven operational workloads across defense and intelligence sectors.
OpenAI’s models, including GPT-4, will be leveraged by the U.S. government to develop innovations for national security missions.
AI being trusted with classified documents is a big leap in its acceptance as a useful tool for humanity. However, it does feel a bit unsettling knowing that OpenAI’s models are being used at the government level, with the safety team dissolving last month and the mysteries surrounding Q* still unresolved.
Galileo: Our latest LLM Hallucination Index ranks 22 of the leading models on their performance across 3 different RAG tasks, evaluating the correctness of their responses and propensity to hallucinate. Read the report.
A Daily Chronicle of AI Innovations on August 08th 2024
Humane’s AI Pin daily returns are outpacing sales
Sam Altman teases ‘Project Strawberry‘
AI breakthrough accurately predicts diseases
OpenAI bets $60M on webcams
Humane’s AI Pin daily returns are outpacing sales
Humane has faced considerable challenges with the AI Pin, seeing more returns than purchases between May and August, with current customer holdings near 7,000 units.
The AI Pin received negative reviews at launch, leading to efforts by Humane to stabilize operations and look for potential buyers or additional funding from investors.
Humane’s total sales of the AI Pin and accessories have only reached $9 million, which is significantly lower than the $200 million investment from prominent Silicon Valley executives.
OpenAI is reportedly leading a $60 million Series B funding round for Opal, a company known for high-end webcams, with plans to develop AI-powered consumer devices.
Opal plans to expand beyond high-end webcams and develop creative tools powered by OpenAI’s AI models.
The startup will work closely with OpenAI researchers to prototype various device ideas.
OpenAI executives are reportedly most interested in integrating their new voice AI models into Opal’s devices.
OpenAI’s $60 million bet on Opal and Sam Altman’s personal investments in AI hardware startups signals a major push from the AI giant to bring advanced AI from the cloud directly into users’ hands.
A new unknown AI model has appeared in the LMSYS Chatbot Arena, igniting rumors that it could be OpenAI’s highly anticipated Q* AI breakthrough or its evolution — codenamed ‘Strawberry’.
A new ‘anonymous-chatbot’ appeared in the LMSYS Chatbot Arena — an open-source platform where AI startups often test upcoming releases.
Previously, OpenAI tested GPT-4o with gpt2-chatbot two weeks before releasing it to the public, which put the arena on high alert for new AI models.
Testers of “anonymous-chatbot” report that it shows more advanced reasoning than GPT-4o and any other frontier model.
To add fuel to the speculation, Sam Altman tweeted a picture of a strawberry on X, referencing the reported codename of OpenAI’s secret AI model.
As competitors like Anthropic and Meta start to catch up to GPT-4o, the Internet has been eagerly awaiting OpenAI’s next move. If this mystery model is indeed Q*/Strawberry, then we could be on the cusp of another seismic shift in AI capabilities.
Researchers have just developed an AI model that can predict major diseases like heart conditions, diabetes, and cancer — significantly outperforming existing methods.
The new model analyzes patient data using statistics and deep learning to spot disease indicators more accurately.
It employs a smart algorithm (SEV-EB) to identify crucial health markers, helping doctors prioritize the most relevant patient information.
The model achieves 95% accuracy in predicting specific diseases like coronary artery disease, type 2 diabetes, and breast cancer.
It also leverages patients’ digital health records for personalized risk assessment and earlier healthcare interventions.
Remember when AlphaFold cracked the protein folding problem? This could be healthcare’s next big AI moment. By significantly improving disease prediction accuracy, this model could transform early diagnosis and treatment planning to help save millions of lives across the globe.
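The SEV-EB algorithm itself isn’t specified in this summary, so the sketch below is a generic stand-in for the described pipeline (rank health markers, keep the strongest, train a classifier on patient records) using scikit-learn, not the researchers’ actual method:

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Public dataset standing in for digital health records.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Marker selection + classifier; the paper's SEV-EB step would replace SelectKBest.
model = make_pipeline(SelectKBest(f_classif, k=10),
                      GradientBoostingClassifier(random_state=0))
model.fit(X_train, y_train)
print(f"held-out accuracy: {model.score(X_test, y_test):.2%}")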
Intel reportedly declined an opportunity to invest in OpenAI in 2017, missing early entry into the AI market due to doubts about AI’s near-term potential.
A Daily Chronicle of AI Innovations on August 07th 2024
Reddit to test AI-powered search result pages
Robot dentist performs first automated procedure
AI robot helps assemble a BMW
New AI can listen while speaking
Reddit to test AI-powered search result pages
Reddit CEO Steve Huffman announced plans to test AI-powered search results later this year, aiming to help users explore products, shows, games, and new communities on the platform.
Huffman indicated that the company might explore monetizing through paywalled subreddits, which could offer exclusive content or private areas while still maintaining the traditional free version of Reddit.
As Reddit seeks to diversify revenue sources, Huffman emphasized that the company has blocked certain entities from accessing Reddit content to ensure transparency and protect user privacy.
A Boston-based tech company, backed by Mark Zuckerberg’s dentist father, completed the world’s first all-robotic dental procedure, marking a significant advancement in medical technology.
The robot, operated by Perceptive, independently performed a process called “cutting,” which involves drilling into and shaving down a tooth, demonstrating its capabilities in Barranquilla, Colombia.
This breakthrough aims to use autonomous machines for procedures like crown placements in as little as 15 minutes, enhancing precision, efficiency, and patient care.
OpenAI-backed startup Figure AI just showed off Figure 02, its next-generation AI-powered humanoid robot — capable of completely autonomous work in complex environments like a BMW factory.
Figure 02 uses OpenAI’s AI models for speech-to-speech reasoning, allowing the humanoid robot to have full conversations with humans.
A Vision Language Model (VLM) enables the robot to make quick, common-sense decisions based on visual input and self-correct errors.
Six RGB cameras provide the robot with 360-degree vision to help it navigate the real world.
The robot stands 5’6″ and weighs 132 lbs, with a 44 lb lifting capacity and a 20-hour runtime thanks to a custom 2.25 KWh battery pack.
The humanoid robot race is intensifying, with Figure CEO Brett Adcock claiming that Figure 02 is now the “most advanced humanoid on the planet” — a direct challenge toward Elon Musk and Tesla Optimus. While the world now waits for Elon’s response, Figure has one ace up its sleeve: its OpenAI partnership.
ByteDance, the parent company of TikTok, just launched Jimeng AI for Chinese users, a text-to-video AI app that directly competes with OpenAI’s (unreleased) Sora AI video model.
Jimeng AI is available on the Apple App Store and Android for Chinese users.
ByteDance’s entry into the AI video generation market follows similar launches by other Chinese tech firms, including Kuaishou’s Kling AI.
The subscription, priced at 79 yuan ($11) monthly or 659 yuan ($92) annually, allows for the creation of ~2,050 images or 168 AI videos per month.
Unlike OpenAI’s Sora, which isn’t yet publicly available, these models by Jimeng AI are already accessible to users (in China).
China’s AI video generation race is accelerating, with Kling AI’s public release just weeks ago and now ByteDance’s Jimeng AI launching while the world anxiously waits for Sora’s public release. With Jimeng AI being backed by TikTok, it will have plenty of training data and deep pockets to compete against other AI giants.
AI researchers just developed a new Listening-While-Speaking Language Model (LSLM) that can listen and speak simultaneously — advancing real-time, interactive speech-based AI conversations.
The new model, called the Listening-while-Speaking Language Model (LSLM), enables full-duplex modeling in interactive speech-language models.
LSLM uses a token-based decoder-only TTS for speech generation and a streaming self-supervised learning encoder for real-time audio input.
The system can detect turn-taking in real-time and respond to interruptions, a key feature of natural conversation.
The model demonstrated robustness to noise and sensitivity to diverse instructions in experiments.
While OpenAI’s recent Her-like advanced voice mode for ChatGPT inches us toward realistic AI conversations, LSLM leaps even further by enabling AI to process incoming speech WHILE talking. This could revolutionize human-AI interactions — making conversations with machines feel truly natural and responsive.
Reddit announced plans to test AI-generated summaries at the top of search result pages, using a combination of first-party and third-party technology to enhance content discovery.
A Daily Chronicle of AI Innovations on August 06th 2024
Figure unveils new sleeker and smarter humanoid robot
Nvidia used ‘a lifetime’ of videos every day to train AI
Leaked code reveals Apple Intelligence’s plan to prevent hallucinations
Nvidia trains video model ‘Cosmos’
OpenAI co-founder leaves for Anthropic
Nvidia AI powers robots with Apple Vision Pro
OpenAI has a secretive tool to detect AI-generated text
Tesla’s AI gives robots human-like vision
Nvidia delays new AI chip launch
Google’s Gemini 1.5 Pro leads AI chatbot rankings
AI turns brain cancer cells into immune cells
Nvidia AI powers robots with Apple Vision Pro
Nvidia introduced a new tool suite for developers to control and monitor robots using Apple’s Vision Pro headset. The MimicGen NIM microservice translates user movements captured by the Vision Pro into robot actions, enabling intuitive control of robotic limbs.
Additionally, Nvidia’s Isaac Sim can generate synthetic datasets from these captured movements, which reduces the time and cost of collecting real-world data for robot training.
Why does it matter?
This advancement is a practical application of teleoperation. It can lead to more intuitive and effective ways for humans to interact with and control robots and improve their usability in various fields such as manufacturing, healthcare, and service industries.
Leaked documents obtained by 404 Media report that Nvidia has been scraping millions of videos daily from YouTube, Netflix, and other sources to train its unreleased foundational AI model.
Nvidia’s project, codenamed Cosmos, aims to process “a human lifetime visual experience worth of training data per day.”
The company used open-source tools and virtual machines to download videos, including full-length movies and TV shows.
Employees raised concerns about copyright and ethics, but were told there was “umbrella approval” from executives.
Nvidia claims its practices are “in full compliance with the letter and spirit of copyright law.”
Project Cosmos appears to be Nvidia’s big move into video-based AI, which could revolutionize everything from 3D world generation to self-driving cars, digital humans, and more. However, this harsh introduction is not a good look for the company, especially as the industry’s practices are coming under intense scrutiny.
OpenAI has a secretive tool to detect AI-generated text
OpenAI has been sitting on a tool that can detect AI-assisted cheating for nearly a year. Using an invisible watermarking technique, the company has developed a tool that can detect ChatGPT-generated text with 99.9% accuracy. However, internal debates about user retention, potential bias, and distribution methods have kept this technology under wraps.
Meanwhile, educators are desperately seeking ways to detect AI misuse in schools. A recent survey found that 59% of middle- and high-school teachers were confident some students had used AI for schoolwork, up 17 points from the previous year.
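OpenAI hasn’t disclosed how its watermark works. A well-known approach from the research literature is “green-list” token biasing: a PRNG seeded by the previous token marks part of the vocabulary as green, generation slightly favors green tokens, and a detector checks whether green tokens appear more often than chance. The following is a toy sketch of that published technique, not OpenAI’s method:

import hashlib, math, random

VOCAB_SIZE = 50_000
GREEN_FRACTION = 0.5

def green_list(prev_token):
    # Seeding on the previous token makes the list reproducible at
    # detection time without storing any per-document state.
    seed = int(hashlib.sha256(str(prev_token).encode()).hexdigest(), 16)
    rng = random.Random(seed)
    return set(rng.sample(range(VOCAB_SIZE), int(VOCAB_SIZE * GREEN_FRACTION)))

def detect(tokens):
    # z-score of green-token hits; unwatermarked text stays near 0.
    hits = sum(tok in green_list(prev) for prev, tok in zip(tokens, tokens[1:]))
    n = len(tokens) - 1
    mean = n * GREEN_FRACTION
    std = math.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))
    return (hits - mean) / std

def fake_generate(n, bias=0.9, seed=0):
    # Stand-in "watermarked model" that picks green tokens with high probability.
    rng = random.Random(seed)
    tokens = [rng.randrange(VOCAB_SIZE)]
    for _ in range(n):
        greens = green_list(tokens[-1])
        tokens.append(rng.choice(sorted(greens)) if rng.random() < bias
                      else rng.randrange(VOCAB_SIZE))
    return tokens

print(detect(fake_generate(200)))  # large positive z-score: flagged as watermarked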
Why does it matter?
This tool could preserve the value of original thought in education. However, OpenAI’s hesitation shows there are complex ethical considerations about AI detection and unintended consequences in language communities.
Three key leaders at OpenAI are departing or taking leave, including co-founder John Schulman, co-founder Greg Brockman, and Peter Deng — another major shakeup for the AI powerhouse.
John Schulman, co-founder and a key leader at OpenAI, has left to join rival AI startup Anthropic — one of OpenAI’s biggest competitors.
Greg Brockman, OpenAI’s president and co-founder, is taking an extended leave of absence until the end of the year.
Peter Deng, a product leader who joined last year from Meta, has reportedly also departed.
These moves follow other recent high-profile exits, including co-founders Ilya Sutskever and Andrej Karpathy.
OpenAI has struggled to regain its footing after Sam Altman’s departure and eventual return as CEO in November 2023. Brockman, one of Altman’s biggest supporters during the ousting, mysteriously takes a leave of absence at a crucial time as OpenAI sees increased competition from Anthropic and Meta AI.
Tesla’s latest patent introduces a vision system for autonomous robots, particularly its humanoid robot Optimus. The end-to-end AI model uses only camera inputs to create a detailed 3D understanding of the environment, without using expensive sensors like LiDAR.
By dividing the space into voxels (3D pixels), the system can predict each spatial unit’s occupancy, shape, semantics, and motion in real-time. It has already been implemented, with Tesla’s manufacturing team training and deploying the neural network in Optimus for tasks like picking up battery cells on a conveyor belt.
Why does it matter?
The development of such AI-driven perception technologies could lead to progress in autonomous systems for more sophisticated and reliable operations.
The Information reports that design flaws could delay the launch of Nvidia’s next-gen AI chips by three months or more. This setback could affect giants like Microsoft, Google, and Meta, who have collectively placed orders worth tens of billions of dollars for these chips.
Despite the rumored delay, Nvidia maintains that production of its new Blackwell chip series is on track. The company also reports strong demand for its Hopper chips and says a broad sampling of Blackwell has already begun. However, sources claim that Microsoft and another major cloud provider were informed of production delays just this week.
Why does it matter?
A slowdown in chip availability could hamper the development and deployment of new AI technologies, affecting everything from cloud services to generative AI applications. It also highlights the delicate balance and vulnerabilities in the AI supply chain.
Google has launched Gemini 1.5 Pro, an experimental version available for early testing. It quickly claimed the top spot on the LMSYS Chatbot Arena leaderboard, outperforming OpenAI’s GPT-4o and Anthropic’s Claude-3.5 Sonnet. With an impressive Elo score of 1300, Gemini 1.5 Pro excels in multilingual tasks, technical areas, and multimodal capabilities.
The model builds on the foundation of Gemini 1.5, boasting a massive context window of up to two million tokens.
Why does it matter?
Google’s decision to make the model available for early testing reflects a growing trend of open development and community engagement in the AI industry. The company’s focus on community feedback also reflects its move toward responsible AI development.
Researchers at the Keck School of Medicine of USC used AI to reprogram glioblastoma cells into cancer-fighting dendritic cells. It increased survival chances by up to 75% in mouse models of glioblastoma, the deadliest form of brain cancer in adults. The technique cleverly bypasses the blood-brain barrier by converting cancer cells within the tumor itself, a major hurdle in traditional glioblastoma treatments.
The approach greatly improved survival rates in animal models when combined with existing treatments like immune checkpoint therapy or DC vaccines. The research team aims to begin clinical trials in patients within the next few years.
Why does it matter?
The technique offers new hope for patients facing this aggressive disease. Moreover, the approach’s application to other cancer types suggests a broader impact on cancer immunotherapy, transforming how we approach cancer treatment in the future.
Figure unveils new sleeker and smarter humanoid robot
Figure has introduced its new humanoid robot, the Figure 02, which features improved hardware and software, including six RGB cameras and enhanced CPU/GPU computing capabilities.
Leveraging a longstanding partnership with OpenAI, the Figure 02 is equipped for natural speech conversations, featuring speakers and microphones to facilitate communication with human co-workers.
Figure 02’s advanced AI and language processing aim to make interactions transparent and safe, which is crucial given the robot’s potential use alongside humans in factory and commercial environments.
Nvidia used ‘a lifetime’ of videos every day to train AI
Nvidia collected videos from YouTube and other sites to create training data for its AI products, as shown by internal documents and communications obtained by 404 Media.
Nvidia asserted that their data collection practices align with both the letter and spirit of copyright law when questioned about legal and ethical concerns regarding the use of copyrighted material.
A former Nvidia employee revealed that workers were directed to gather videos from sources like Netflix and YouTube to train AI for the company’s 3D world generator project, internally referred to as Cosmos.
Leaked code reveals Apple Intelligence’s plan to prevent hallucinations
Leaked code for macOS Sequoia 15.1 has revealed pre-prompt instructions for Apple Intelligence to minimize hallucinations and improve accuracy in responses.
These pre-prompt instructions include directives for Apple Intelligence to ensure questions and answers in mail assistance are concise and relevant to avoid false information.
Instructions also specify limitations for creating photo memories, prohibiting religious, political, harmful, or provocative content to maintain a positive user experience.
OpenAI’s co-founder John Schulman has left for rival Anthropic and wants to focus on AI alignment research. Meanwhile, another co-founder and the president of OpenAI, Greg Brockman, is taking a sabbatical.
Meta is offering Judi Dench, Awkwafina, and Keegan-Michael Key millions for AI voice projects. While some stars are intrigued by the pay, others disagree over voice usage terms.
YouTube creator David Millette sued OpenAI for allegedly transcribing millions of videos without permission, claiming copyright infringement and seeking over $5 million in damages.
Google hired Character.AI’s co-founders Noam Shazeer and Daniel De Freitas for the DeepMind team, and secured a licensing deal for their large language model tech.
Black Forest Labs, an AI startup, has launched a suite of text-to-image models in three variants: [pro], [dev], and [schnell], which outperforms competitors like Midjourney v6.0 and DALL·E 3.
OpenAI has rolled out an advanced voice mode for ChatGPT to select Plus subscribers. It has singing, accent imitation, language pronunciation, and storytelling capabilities.
Google’s latest Gemini ad shows a dad using Gemini to help his daughter write a fan letter to an Olympian. Critics argue it promotes lazy parenting and undermines human skills like writing. Google claims the ad aims to show Gemini as a source of initial inspiration.
Stability AI has introduced Stable Fast 3D which turns 2D images into detailed 3D assets in 0.5 seconds. It is significantly faster than previous models while maintaining high quality.
Google’s “About this image” tool is now accessible through Circle to Search and Google Lens. With a simple gesture, you can now check if an image is AI-generated, how it’s used across the web, and even see its metadata.
Karpathy/Nano-Llama31: a minimal, dependency-free version of the Llama 3.1 model architecture, enabling simple training, finetuning, and inference with significantly lighter dependencies compared to the official Meta and Hugging Face implementations.
Secretaries of state from five U.S. states urged Elon Musk to address misinformation spread by X’s AI chatbot Grok regarding the upcoming November election.
A Daily Chronicle of AI Innovations on August 05th 2024
Neuralink successfully implants brain chip in second patient
OpenAI has a ‘highly accurate’ ChatGPT text detector, but won’t release it for now
Elon Musk is suing OpenAI and Sam Altman again
Meta AI’s new Hollywood hires
Google absorbs Character AI talent
Tesla unveils new AI vision for robots
Google takes another startup out of the AI race
Google pulls AI Olympics ad after backlash
Nvidia delays next AI chip due to design flaw
Meta AI’s new Hollywood hires
Meta is reportedly offering millions to celebrities like Awkwafina, Judi Dench, and Keegan-Michael Key to use their voices in upcoming AI projects.
The AI voices would be used across Meta’s platforms, including Facebook, Instagram, and Meta Ray-Ban smart glasses.
Meta is reportedly rushing to secure deals before its Meta Connect conference in September.
Contracts are reportedly temporary, with actors having the option to renew.
Meta has previously experimented with celebrity-inspired chatbots, though that program has ended.
In our exclusive interview with Mark Zuckerberg, he predicted that "we're going to live in a world where there are going to be hundreds of millions or billions of different AI agents". If that prediction holds true, celebrity voice-powered AI could be part of Meta's next big play to drive user engagement and growth on the platform.
Google has signed a non-exclusive licensing agreement with AI startup Character AI for its large language model technology, while also reabsorbing the startup’s co-founders and key talent back into its AI team.
Character AI co-founders Noam Shazeer and Daniel De Freitas return to Google, their former employer.
Google gains a non-exclusive license to Character AI’s language model technology.
About 30 of Character AI’s 130 employees, mainly those working on model training and voice AI, will join Google’s Gemini AI efforts.
Character AI will switch to open-source models like Meta’s Llama 3.1 for its products, moving away from in-house models.
This deal highlights the intensifying race to secure top AI talent, mirroring Microsoft’s recent deal with Inflection and Amazon’s deal with Adept. As AI becomes increasingly critical to tech companies’ futures, these talent grabs could reshape the landscape, while raising antitrust concerns.
Tesla just filed a patent for an AI-powered vision system that could transform how autonomous robots perceive and navigate their environment using only camera inputs.
The system uses a single neural network to process camera data and output detailed 3D environment information without LiDAR or radar.
It divides space into 3D voxels, predicting occupancy, shape, semantic data, and motion for each in real time.
The tech is designed to run on a robot’s onboard computer, enabling immediate decision-making.
This system could be implemented in both Tesla’s vehicles and humanoid robots like Optimus.
By relying solely on camera inputs and onboard processing, Tesla’s new vision system could enable robots to navigate diverse environments more efficiently and adapt to changes in real time. This would eliminate the need for extensive pre-mapping and accelerate the arrival of affordable, autonomous robots.
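To make the voxel idea concrete, here is a hypothetical sketch of an occupancy head like the one the patent describes: fused camera features in, per-voxel occupancy, semantics, and motion out. The names, sizes, and channel layout are our assumptions; the patent describes the concept, not this code:

```python
import torch
import torch.nn as nn

class OccupancyHead(nn.Module):
    """Maps a fused voxel feature grid to per-voxel predictions:
    1 occupancy logit + 8 semantic class logits + 3 motion components."""
    def __init__(self, feat_dim: int = 64, n_classes: int = 8):
        super().__init__()
        self.head = nn.Conv3d(feat_dim, 1 + n_classes + 3, kernel_size=1)

    def forward(self, voxel_feats: torch.Tensor) -> dict:
        out = self.head(voxel_feats)                 # [B, 12, Z, Y, X]
        return {
            "occupancy": torch.sigmoid(out[:, :1]),  # probability each voxel is filled
            "semantics": out[:, 1:9],                # class logits per voxel
            "motion":    out[:, 9:12],               # per-voxel velocity estimate
        }

feats = torch.randn(1, 64, 16, 64, 64)  # from a camera-fusion backbone (assumed)
preds = OccupancyHead()(feats)
print(preds["occupancy"].shape)          # torch.Size([1, 1, 16, 64, 64])
```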
Neuralink successfully implants brain chip in second patient
Elon Musk’s brain-computer interface startup, Neuralink, has commenced its second human trial, revealing that the implant is successfully functioning with about 400 electrodes providing signals.
Musk claimed that Neuralink could bestow exceptional abilities such as thermal and eagle-like vision, and potentially restore sight to the blind and cure neurological disorders.
Despite some initial problems and federal investigations into animal testing practices, Neuralink has over 1,000 volunteers for further trials and plans to implant chips in up to eight more patients by the end of 2024.
OpenAI has a ‘highly accurate’ ChatGPT text detector, but won’t release it for now
OpenAI has an AI-detection tool that is highly effective at identifying AI-generated text, but the company hesitates to release it to avoid upsetting its user base.
The tool, reportedly 99.9% effective, is much more accurate than previous detection algorithms and utilizes a proprietary watermarking system to identify AI-created content.
Despite its potential to aid educators in spotting AI-generated homework, OpenAI is concerned about potential deciphering of their technique and biases against non-native English speakers.
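OpenAI hasn't published how its proprietary watermark works. For intuition, here is a toy sketch of one well-known academic scheme ("green list" watermarking, Kirchenbauer et al., 2023): the generator biases sampling toward a pseudorandom subset of tokens, and the detector simply counts how many tokens landed in that subset:

```python
# Toy detector for green-list watermarking; vocabulary size and keying
# scheme are illustrative, not OpenAI's method.
import hashlib

def is_green(prev_token: int, token: int, key: str = "secret") -> bool:
    # Hash the previous token with a secret key so the "green" half of the
    # vocabulary changes at every generation step.
    h = hashlib.sha256(f"{key}:{prev_token}:{token}".encode()).digest()
    return h[0] % 2 == 0

def green_fraction(token_ids: list[int]) -> float:
    hits = sum(is_green(a, b) for a, b in zip(token_ids, token_ids[1:]))
    return hits / max(1, len(token_ids) - 1)

# Unwatermarked text hovers near 0.5; watermarked generations score far
# higher, so a threshold (a z-test in practice) flags AI-written text.
print(green_fraction([17, 42, 993, 5, 77, 1234]))
```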
Elon Musk has filed a new lawsuit against OpenAI, Sam Altman, and Greg Brockman, accusing them of breaching the company’s founding mission to benefit humanity with artificial intelligence.
The lawsuit alleges that Altman and Brockman manipulated Musk into co-founding OpenAI by promising it would be safer and more transparent than profit-driven alternatives.
Musk previously withdrew a similar lawsuit in June, but the new suit claims that OpenAI violated federal racketeering laws and manipulated its contract with Microsoft.
Founders of Character.AI, Noam Shazeer and Daniel De Freitas, along with other team members, are rejoining Google’s AI unit DeepMind, the companies announced on Friday.
Character.AI reached a $1 billion valuation last year and plans to offer a nonexclusive license of its large language models to Google, which will help fund its growth and the development of personalized AI products.
The founders, who left Google in 2021 due to disagreements about advancing chatbot technologies, are now returning amid a competitive AI landscape and will contribute to DeepMind’s research team.
Google has withdrawn its “Dear Sydney” ad from the Olympics after receiving significant backlash from viewers and negative feedback on social media.
The controversial advertisement featured a father using the Gemini AI to write a fan letter to Olympic track star Sydney McLaughlin-Levrone on behalf of his daughter, instead of composing it together.
Critics argued that the ad missed the essence of writing a personal fan letter and feared it promoted AI as a substitute for genuine human expression.
The production of Nvidia’s “Blackwell” B200 AI chips has been delayed by at least three months due to a late-discovered design flaw, according to sources.
The B200 chips are successors to the highly sought-after H100 chips and were expected to power many AI cloud infrastructures, but now face production setbacks.
Nvidia is collaborating with Taiwan Semiconductor Manufacturing Company to address the issue, with large-scale shipments now anticipated in the first quarter of next year.
For the first time ever, Google DeepMind’s experimental Gemini 1.5 Pro has claimed the top spot on the AI Chatbot Arena leaderboard, surpassing OpenAI’s GPT-4o and Anthropic’s Claude-3.5 with an impressive score of 1300.
Gemini 1.5 Pro (experimental 0801) gathered over 12K community votes during a week of testing on the LMSYS Chatbot Arena.
The new experimental model achieved the #1 position on both the overall and vision leaderboards.
The experimental version is available for early testing in Google AI Studio, the Gemini API, and the LMSYS Chatbot Arena.
Google DeepMind hasn’t disclosed specific improvements, but promises more updates soon.
Without any announcement, Gemini 1.5 Pro unexpectedly rose to the top of the overall AI chatbot leaderboard — by a whopping 14 points. The leap means that either Google just quietly established itself as the new leader in the LLM space, or we’re on the cusp of major competitive responses from industry rivals.
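If you want to try it via the Gemini API, a minimal sketch using Google's google-generativeai Python SDK looks like the following. The experimental model id is our assumption based on the "0801" naming and may have changed since:

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
# Assumed id for the experimental Arena-topping checkpoint.
model = genai.GenerativeModel("gemini-1.5-pro-exp-0801")
resp = model.generate_content("Summarize why chatbot leaderboard wins can be noisy.")
print(resp.text)
```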
Meta's Llama 3.1 lets users search the internet and train the AI to write in their personal style, saving time on content creation and research.
Access Llama 3.1 through Meta AI and log in with your Facebook or Instagram account.
Use the internet search feature by asking questions like “Summarize the Olympics highlights this week.”
Train Llama 3.1 in your voice by providing a sample of your best content and instructing it to mimic your style.
Generate content by asking Llama 3.1 to create posts on your desired topics.
Pro tip: The more examples and feedback you provide, the better Llama 3.1 will become at emulating your unique writing style!
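For instance, a style-training prompt might look like the sketch below. The wording is our example, not Meta's template; paste it into the Meta AI chat along with your samples:

```python
# Hypothetical style-training prompt; adapt the bracketed placeholders.
STYLE_PROMPT = """Here are three posts I wrote:
[paste your samples here]
Study the tone, sentence length, and vocabulary, then write a new post
about [topic] in exactly this style. Don't mention that you're imitating me."""
print(STYLE_PROMPT)
```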
Stability AI just introduced Stable Fast 3D, an AI model that generates high-quality 3D assets from a single image in just 0.5 seconds — potentially reshaping industries from gaming to e-commerce.
The model creates complete 3D assets, including UV unwrapped mesh, material parameters, and albedo colors with reduced illumination bake-in.
It outperforms previous models, reducing generation time from 10 minutes to 0.5 seconds while maintaining high-quality output.
Stable Fast 3D is available on Hugging Face and through Stability AI’s API, under Stability AI’s Community License.
The leap from 10 minutes to 0.5 seconds for high-quality 3D asset generation is nothing short of insane. We’re entering a world where video games will soon feature infinite, dynamically generated assets, e-commerce will have instant 3D product previews, architects will see designs in real-time, and so much more.
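For developers, a hedged sketch of calling Stable Fast 3D over Stability AI's REST API follows. The endpoint path and field names are assumptions based on the service's v2beta conventions; check the official docs before relying on them:

```python
import requests

resp = requests.post(
    "https://api.stability.ai/v2beta/3d/stable-fast-3d",  # assumed endpoint
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    files={"image": open("product_photo.png", "rb")},      # single 2D input image
    timeout=60,
)
resp.raise_for_status()
with open("asset.glb", "wb") as f:  # assumed: the API returns a textured mesh
    f.write(resp.content)
```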
🔍 Gemma Scope: helping the safety community shed light on the inner workings of language models.
Explainable AI: one of the most requested capabilities for LLMs is visibility into how they reach internal decisions, and Gemma Scope is a big step toward that kind of interpretability. From the official tutorial: "This is a barebones tutorial on how to use Gemma Scope, Google DeepMind's suite of Sparse Autoencoders (SAEs) on every layer and sublayer of Gemma 2 2B and 9B. Sparse Autoencoders are an interpretability tool that act like a 'microscope' on language model activations. They let us zoom in on dense, compressed activations, and expand them to a larger but sparser and seemingly more interpretable form, which can be a very useful tool when doing interpretability research!"
AI systems can be powerful but opaque “black boxes” – even to researchers who train them. ⬛
Enter Gemma Scope: a set of open tools made up of sparse autoencoders to help decode the inner workings of Gemma 2 models, and better address safety issues.
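At its core, a sparse autoencoder of this kind expands a dense activation vector into a much wider, mostly-zero feature vector and then reconstructs the original. A minimal sketch of the idea (the sizes here are illustrative, not Gemma Scope's):

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int = 2304, d_features: int = 16384):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)
        self.decoder = nn.Linear(d_features, d_model)

    def forward(self, acts: torch.Tensor):
        features = torch.relu(self.encoder(acts))  # wide, sparse, more interpretable
        recon = self.decoder(features)             # reconstruction of the activation
        return features, recon

acts = torch.randn(4, 2304)  # residual-stream activations (illustrative shape)
features, recon = SparseAutoencoder()(acts)
# Training balances faithful reconstruction (L2) against sparsity (L1).
loss = (recon - acts).pow(2).mean() + 1e-3 * features.abs().mean()
print(features.shape, loss.item())
```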
What else is happening in AI on August 02nd 2024
Google introduced three new AI features for Chrome, including Google Lens for desktop, Tab compare for product comparisons, and an improved browsing history search.
GitHub launched GitHub Models, a new platform allowing developers to access and experiment with various AI models directly on GitHub, including a playground, Codespaces integration, and deployment.
Healx, an AI-enabled drug discovery startup, raised $47 million in Series C funding and received regulatory clearance to start Phase 2 clinical trials for a new rare disease treatment in the U.S.
Google is facing backlash over its Gemini AI Olympics-themed ad, with critics arguing it promotes overreliance on AI tools at the expense of children’s learning and creativity.
Microsoft officially listed OpenAI as a competitor in AI offerings and search advertising in its annual report, despite their long-term partnership and Microsoft’s significant investment in the company.
Character AI open-sourced Prompt Poet, their innovative approach to prompt design, aiming to revolutionize how AI interactions are built and managed in production environments.
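Prompt Poet's core idea is declaring prompts as YAML plus Jinja2 templates instead of string concatenation. The sketch below re-creates that idea with the generic jinja2 and yaml packages, rather than Prompt Poet's own API:

```python
import yaml
from jinja2 import Template

# A prompt declared as data: a YAML list of parts, templated with Jinja2.
raw = """
- name: system
  content: You are {{ persona }}, answering briefly.
- name: user
  content: "{{ question }}"
"""
rendered = Template(raw).render(persona="a helpful travel agent",
                                question="Best month to visit Kyoto?")
messages = [
    {"role": part["name"], "content": part["content"]}
    for part in yaml.safe_load(rendered)
]
print(messages)
```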
A Daily Chronicle of AI Innovations on August 01st 2024
Microsoft declares OpenAI as competitor
Meta is proving there’s still big AI hype on Wall Street
Reddit CEO says Microsoft needs to pay to search the site
Google launches three ‘open’ AI models prioritizing safety and transparency
Google’s tiny AI model bests GPT-3.5
Taco Bell’s AI drive-thru
AI reprograms brain cancer cells
Microsoft declares OpenAI as competitor
Microsoft has officially listed OpenAI as a competitor in AI, search, and news advertising in its latest annual report, signalling a shift in their relationship.
Despite Microsoft being the largest investor and exclusive cloud provider for OpenAI, both companies are now encroaching on each other’s market territories.
An OpenAI spokesperson indicated that their competitive dynamic was always expected as part of their partnership, and Microsoft still remains a strong partner for OpenAI.
Meta is proving there’s still big AI hype on Wall Street
Meta’s shares surged by about 7% in extended trading after surpassing Wall Street’s revenue and profit expectations and providing an optimistic forecast for the current period.
The company reported a 22% increase in second-quarter revenue to $39.07 billion and a 73% rise in net income, attributing the growth to gains in the digital ad market and cost-cutting measures.
Meta continues to invest heavily in AI and VR technologies, with plans for significant capital expenditure growth in 2025 to support AI research and development, despite a broader downsizing effort.
Google launches three ‘open’ AI models prioritizing safety and transparency
Google has unveiled three new models to the Gemma 2 lineup, building on the original models released in June 2024, focusing on performance and safety enhancements.
The first addition, Gemma 2 2B, provides improved capabilities and is adaptable for various devices, while ShieldGemma and Gemma Scope focus on content safety and model interpretability, respectively.
These new tools and models are available on platforms like Kaggle and Hugging Face, promoting broader use and development within the AI community with a focus on responsible innovation.
Researchers at USC made a breakthrough using AI to reprogram glioblastoma cells into immune-activating dendritic cells in mouse models, potentially revolutionizing treatment for the deadly brain cancer.
Glioblastoma is the deadliest adult brain cancer, with less than 10% of patients surviving five years after diagnosis.
AI identified genes that can convert glioblastoma cells into dendritic cells (DCs), which sample cancer antigens and activate other immune cells to attack the tumor.
In mouse models, this approach increased survival chances by up to 75% when combined with immune checkpoint therapy.
Researchers have also identified human genes that could potentially reprogram human glioblastoma cells, paving the way for future clinical trials.
By turning cancer cells against themselves, this new research offers a novel way to fight tumors from within. If the 75% increased survival chances in mice translate to humans, this could not only revolutionize glioblastoma treatment but potentially open doors for similar approaches in other hard-to-treat cancers.
Taco Bell's AI drive-thru
Taco Bell's parent company, Yum Brands, just announced plans to roll out AI-powered drive-thru ordering at hundreds of restaurants in the U.S. by the end of 2024, with ambitions for global implementation.
The AI understands orders, auto-inputs them into the system, and even suggests additional items — potentially increasing sales through upselling.
Over 100 Taco Bell restaurants in the U.S. already use voice AI in drive-thrus.
The company has been testing the AI for over two years and claims it has outperformed humans in accuracy, reduced wait times, and decreased employee workload.
Rivals like Wendy’s and White Castle are also experimenting with AI ordering, while McDonald’s recently ended its IBM partnership for similar tech.
If Taco Bell's positive results from its two-year test are any indication, this large-scale AI implementation could change the way fast-food chains operate and how we order food at drive-thrus. However, the success (or failure) of this rollout could set the tone for the entire industry's adoption.
Google's tiny AI model bests GPT-3.5
Google just unveiled Gemma 2 2B, a lightweight AI model in the 2B-parameter class that outperforms much larger models like GPT-3.5 and Mixtral 8x7B on key benchmarks.
Gemma 2 2B boasts just 2.6B parameters, but was trained on a massive 2 trillion token dataset.
It scores 1130 on the LMSYS Chatbot Arena, matching GPT-3.5-Turbo-0613 (1117) and Mixtral-8x7b (1114) — models 10x its size.
Other notable key benchmark scores include 56.1 on MMLU and 36.6 on MBPP, beating its predecessor by over 10%.
The model is open-source, and developers can download the model’s weights from Google’s announcement page.
As we enter a new era of on-device, local AI, lightweight and efficient models are crucial for running AI directly on our phones and laptops. With Gemma 2 beating GPT-3.5 Turbo at just 1/10th the size, Google isn’t just showing what’s possible — they’re cementing their position as the leader in the small model space.
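A minimal sketch of running the open weights locally with Hugging Face transformers follows; "google/gemma-2-2b-it" is the instruction-tuned checkpoint id on the Hub (gated behind a license acceptance):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-2b-it"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tok("Explain rotary embeddings in one sentence.",
             return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=60)
print(tok.decode(out[0], skip_special_tokens=True))
```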
Google expanded access to its “About this image” tool, making it available through Circle to Search and Google Lens, allowing users to quickly get context on images they encounter online or via messaging.
NEURA, a German robotics company, released a new video showcasing its humanoid robot 4NE-1 performing tasks like chopping vegetables, ironing clothes, solving puzzles, and more. Source: https://x.com/TheHumanoidHub/status/1818726046633804184
Synthesia introduced "Personal Avatars," AI-generated lifelike avatars created from brief webcam or phone footage, allowing users to create short-form videos for social media in multiple languages. Source: https://www.synthesia.io/features/custom-avatar/persona
Enjoying these free AI updates without the clutter? Set yourself up for a promotion or a better job by acing the AWS Certified Data Engineer Associate exam (DEA-C01) with the book or app below: