Welcome to our blog series “AI Innovations in May 2024”! This is an evolving article that will be updated daily throughout the month of May 2024 to bring you the latest news and developments in the world of artificial intelligence. As we move further into the 21st century, the pace of AI innovation continues to accelerate at a breathtaking rate. Each day, researchers, engineers, and visionaries are pushing the boundaries of what’s possible, unlocking new capabilities and applications that are transforming industries, enhancing our lives, and shaping the future. In this blog, we’ll dive deep into the most exciting AI breakthroughs, advancements, and milestones happening in May 2024. From groundbreaking AI-powered technologies and cutting-edge research to the societal and ethical implications of these innovations, we’ll provide you with a comprehensive and insightful look at the rapidly evolving world of artificial intelligence. Whether you’re an AI enthusiast, a tech-savvy professional, or simply someone curious about the future, this blog will keep you informed, inspired, and engaged. So, join us on this journey of discovery as we explore the frontiers of AI and uncover the innovations that are shaping our world. Stay tuned for daily updates, and get ready to be amazed by the incredible advancements that are happening in the world of AI!
Experience the transformative capabilities of AI with “Read Aloud For Me – AI Dashboard – AI Tools Catalog – AI Tools Recommender” – your ultimate AI Dashboard and Hub. Seamlessly access a comprehensive suite of top-tier AI tools within a single app, meticulously crafted to enhance your efficiency and streamline your digital interactions. Now available on the web at readaloudforme.com and across popular app platforms including Apple, Google, and Microsoft, “Read Aloud For Me – AI Dashboard” places the future of AI at your fingertips, blending convenience with cutting-edge innovation. Whether for professional endeavors, educational pursuits, or personal enrichment, our app serves as your portal to the forefront of AI technologies. Embrace the future today by downloading our app and revolutionize your engagement with AI tools.
A Daily chronicle of AI Innovations May 31st 2024:
OpenAI reports misuse of its AI for ‘deceptive activity’
OpenAI introduces ChatGPT Edu for universities
Tech giants form industry group to develop next-gen AI chip
Google mistakenly erased $135 billion from customer account
Apple planning AI-driven overhaul for Siri to manage individual apps
World’s first tooth-regrowing drug approved for human trials
Perplexity launches AI publishing platform
OpenAI launches ChatGPT Edu for universities
OpenAI reports misuse of its AI for ‘deceptive activity’
OpenAI identified and disrupted five covert influence operations that used its generative AI models for “deceptive activity” across the internet.
OpenAI said the threat actors used its AI models to generate short comments, longer articles in various languages, and made-up names and bios for social media accounts over the last three months. The campaigns, run by threat actors from Russia, China, Iran, and Israel, focused on issues including Russia’s invasion of Ukraine, the conflict in Gaza, the Indian elections, and politics in Europe and the United States, among others.
OpenAI has also published a trend analysis report that describes the behavior of these malicious actors in detail.
This provides concrete evidence of generative AI being used to spread misinformation and manipulate global affairs. While there were no novel attacks this time, defending against them in the future will require continued investment in monitoring, safeguards, and public awareness from AI developers and platforms.
Tech giants form industry group to develop next-gen AI chip
Intel, Google, Microsoft, Meta, and other tech heavyweights are establishing a new industry group called the Ultra Accelerator Link (UALink) Promoter Group to guide the development of the components that link together AI accelerator chips in data centers. The group also counts AMD, Hewlett Packard Enterprise, Broadcom, and Cisco among its members.
It is proposing a new industry standard to connect the AI accelerator chips found within a growing number of servers. Its first version, UALink 1.0, will connect up to 1,024 AI accelerators — GPUs only — across a single computing “pod.” It is set to arrive in Q4 2024.
Why does it matter?
Nvidia, currently the biggest player in the AI chip market with an 80% share, is not in the group. Tech giants have been keen to reduce their dependence on Nvidia by working on their own custom chips. This seems to be the latest effort to break Nvidia’s dominance.
Apple planning AI-driven overhaul for Siri to manage individual apps
Apple is planning a major AI upgrade for Siri in iOS 18, enabling users to control specific features within iPhone apps using voice commands, initially limited to Apple-developed apps.
The update will eventually allow Siri to handle multiple tasks in a single request, such as summarizing a recorded meeting and sending it to a friend, as part of Apple’s larger AI strategy to be unveiled at WWDC on June 10.
New AI-driven capabilities will enable Siri to control iPhones more precisely, including opening files, moving notes, managing emails, and summarizing articles, enhancing device navigation and functionality.
Perplexity introduced ‘Pages’, an AI-driven feature that creates customizable webpages for research and writing based on user prompts.
The ‘Pages’ tool gathers information using Perplexity’s AI models, organizes it into sections with citations, and tailors content for different audience levels but requires new prompts to correct errors.
The Verge tested ‘Pages’ on a complex topic and noted its difficulty with intricate subjects, showing it might struggle with in-depth research despite effectively explaining fundamental concepts.
OpenAI has introduced ChatGPT Edu, a new version of ChatGPT, aimed at improving the academic and operational efficiency of universities with advanced text interpretation and data analysis capabilities, and support for over 50 languages.
This version was created following the success of ChatGPT Enterprise in academic institutions such as Oxford, Wharton, and ASU, where it has been used for tasks ranging from large dataset analysis to personalized language practice.
ChatGPT Edu provides affordable access to the GPT-4o model, enhanced message limits, robust security measures, and assures that data and conversations will not be used for training OpenAI models, ensuring a safe environment for educational use.
AI entertainment startup The Simulation (formerly Fable Studio) just launched Showrunner, a platform that allows users to generate and watch AI-powered TV shows set in virtual simulated worlds.
Showrunner combines multi-agent simulations with LLMs to create interactive content aligned with specific virtual worlds.
Users can watch, direct, and star in shows set in virtual environments populated by AI characters, with the ability to craft episodes from prompts.
The platform is launching with 10 original shows, with users able to generate new episodes and edit deeper into scripts, shots, and voices.
Showrunner is opening in alpha to a limited number of users, with select user-created episodes receiving payment, rev share, and IMDB credits.
The studio went viral last year after the release of its South Park episode demos and research paper on its SHOW-1 model.
Showrunner is a wild step towards the merging of AI, gaming, and traditional entertainment, enabling users to not only consume but create stories within simulated worlds. The lines are blurring fast between creators and audiences — and the traditional Hollywood media model may never be the same.
Perplexity goes beyond AI search, launches publishing platform ‘Pages’
It is a tool to transform disorganized AI knowledge and research into easily digestible articles and reports for sharing. Users just describe the topic and select the target audience, and it produces an in-depth article with a clear title, subheadings, associated media, and relevant citations. (https://x.com/perplexity_ai/status/1796203494401040846)
Google poaches AWS and Microsoft executives to strengthen AI offerings for cloud customers
Google hired two executives from these rivals as it builds out its AI ranks with the goal of boosting its cloud business. Saurabh Tiwary, a former corporate VP at Microsoft, will join a newly created role as general manager of cloud AI. Raj Pai, a former VP at AWS, will oversee product management of the cloud AI team. (https://www.theinformation.com/articles/google-poaches-aws-microsoft-executives-for-cloud-unit)
Scale AI introduces SEAL Leaderboards, a new evaluation metric for frontier AI models
Trusted third-party evaluations are a missing part of the AI ecosystem, which is why Scale AI built them. These evals are private, built by domain experts, designed to be impossible to overfit and unexploitable, and continuously updated with new data and models. (https://x.com/alexandr_wang/status/1795857651592491281)
OpenAI offers nonprofit organizations discounts on corporate ChatGPT subscriptions
A Daily chronicle of AI Innovations May 30th 2024:
Microsoft is worried about OpenAI’s deal with Apple
Mistral announces Codestral, a code-generation LLM it says outperforms all others
FBI dismantles the ‘largest botnet ever’
OpenAI’s news deals continue, with Vox and the Atlantic signing on
Mistral AI introduces Codestral, a code-generation AI model
OpenAI licenses content from Vox Media and The Atlantic
Google infuses Chromebook Plus with powerful AI features
Mistral AI introduces Codestral, a code-generation AI model
Codestral is a 22B-parameter generative AI model designed specifically for code generation tasks. It is trained on over 80 programming languages, including popular ones like Python, Java, C++, and JavaScript. It excels at code completion, writing tests, and filling in partial code, thereby improving developer productivity and reducing errors.
The model sets a new performance/latency standard for code generation tasks compared to previous models. It can be downloaded on HuggingFace and is available on Mistral’s API platform through instruct and fill-in-the-middle endpoints. It can be easily integrated into VS Code plugins and used for free on Le Chat.
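As a hedged illustration of the fill-in-the-middle endpoint mentioned above, the sketch below builds a request body for it. The endpoint URL and model identifier are assumptions based on Mistral’s launch announcement, not verified against current docs, and no request is actually sent here.

```python
# Sketch of a fill-in-the-middle (FIM) request body for Codestral.
# The endpoint URL and model name below are assumptions based on
# Mistral's launch announcement; check the live API docs before use.
import json

CODESTRAL_FIM_URL = "https://codestral.mistral.ai/v1/fim/completions"  # assumed

def build_fim_payload(prefix: str, suffix: str, max_tokens: int = 64) -> dict:
    """Build the JSON body for a FIM completion: the model is asked to
    fill in the code between `prefix` and `suffix`."""
    return {
        "model": "codestral-latest",  # assumed model identifier
        "prompt": prefix,
        "suffix": suffix,
        "max_tokens": max_tokens,
    }

payload = build_fim_payload(
    prefix="def add(a, b):\n    return ",
    suffix="\n\nprint(add(2, 3))",
)
body = json.dumps(payload)
```

POSTing `body` to the FIM endpoint with an `Authorization: Bearer <api key>` header would, under these assumptions, return the code that belongs between the prefix and suffix.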
Why does it matter?
Codestral, being only 22B in size and faster than GPT-4o, has massive implications. It is code-centric, runs locally, includes specific languages like Swift and Fortran, and is open for research/testing purposes—all of which make a powerful AI coding tool more widely available.
OpenAI licenses content from Vox Media and The Atlantic
OpenAI has formed content and product partnerships with Vox Media and The Atlantic. OpenAI will license content from these media powerhouses for inclusion in the chatbot’s responses.
In turn, The Atlantic’s product team will have privileged access to OpenAI tech, give feedback, and share use cases to shape and improve future news experiences in ChatGPT and other OpenAI products. Vox and OpenAI will also collaborate using OpenAI’s technology to develop innovative products for Vox Media’s consumers and advertising partners.
Why does it matter?
There’s a growing list of publishers and platforms allowing OpenAI to access their valuable content in mutually beneficial partnerships. It could help OpenAI avoid further legal disputes (like the ongoing lawsuit with The New York Times over copyright infringement) and get quality data to train its LLMs and multimedia models.
Google infuses Chromebook Plus with powerful AI features
Google’s Chromebook Plus has new built-in Google AI and gaming features, including Help me write, Chat with Gemini, Magic Editor, and more.
“Help me write” allows users to get AI-generated text suggestions, tone changes, rewriting, and more right within their writing apps on the Chromebook.
Generative AI wallpaper and video call backgrounds are built into the OS for customization, no matter what video conferencing app you’re using.
Magic Editor on Google Photos for laptops exclusively on Chromebook Plus will reimagine your photos with a few easy clicks.
Chat with Gemini on the home screen helps plan, write, learn, and more.
Chromebook Plus laptops are starting at $350 USD. Google is also launching new features that integrate more of Google across all Chromebooks to help users work better and get things done faster.
Why does it matter?
It indicates that the next wave of computers will come equipped with advanced AI capabilities. Such widespread availability of AI can improve productivity and enhance user experience while making AI a standard part of everyday life.
Microsoft is worried about OpenAI’s deal with Apple
Microsoft is reportedly worried about Apple collaborating with OpenAI to integrate AI technology into upcoming versions of iOS and macOS, potentially impacting Microsoft’s services and Azure cloud platform.
Microsoft has invested billions in OpenAI, securing an exclusive license for GPT-4 and other models, and uses its Azure cloud as the primary platform for OpenAI’s large language model development.
OpenAI CEO Sam Altman recently met with Microsoft CEO Satya Nadella to discuss the concerns, as Microsoft aims to maintain its competitive edge given Apple’s vast user base and plans for new AI-powered features.
Mistral announces Codestral, a code-generation LLM it says outperforms all others
Mistral introduces Codestral, a 22B parameter AI code assistant trained on more than 80 programming languages, including Swift and Python, designed to help developers write and interact with code effectively.
Codestral offers a larger 32K context window and performs exceptionally well in the RepoBench evaluation for long-range code generation, outpacing existing models in various benchmarks and languages.
Available for free via Mistral’s conversational AI platform Le Chat and through an API, Codestral is integrated with popular developer tools and licensed under the Mistral AI Non-Production License, restricting its use to research and testing only.
OpenAI’s news deals continue, with Vox and the Atlantic signing on
The Atlantic and Vox Media have entered into agreements with OpenAI, allowing the AI company to scrape and use their content in exchange for citations and links back to their original sources.
The Atlantic will serve as a “premium news source” for OpenAI, while also launching an experimental section called Atlantic Labs to showcase new journalism-related products and features.
Vox Media will integrate OpenAI data into its internal operations and public-facing tools, enhancing content such as the Strategist Gift Scout tool and their in-house advertising platform for better-targeted ads.
SambaNova sets new Llama 3 speed record with 1,000 tokens per second
The feat was accomplished using their SN40L chip, a reconfigurable dataflow unit (RDU), and the Samba-1 model, a 1-trillion parameter model also known as Samba-CoE. Such Gen AI performance speed can potentially lead to significant business benefits, such as faster response times, better hardware utilization, and lower costs. (https://venturebeat.com/ai/sambanova-breaks-llama-3-speed-record-with-1000-tokens-per-second)
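To make the response-time benefit concrete, here is a quick back-of-the-envelope sketch of what 1,000 tokens per second means for generation latency. The token counts chosen are illustrative, not benchmark figures.

```python
# Back-of-the-envelope generation latency from decode throughput.
# Token counts below are illustrative examples, not benchmark figures.
def generation_time(tokens: int, tokens_per_second: float) -> float:
    """Seconds needed to generate `tokens` at a given decode throughput."""
    return tokens / tokens_per_second

short_reply = generation_time(250, 1000)   # a chat-sized answer: 0.25 s
long_report = generation_time(2000, 1000)  # a long document: 2.0 s
print(short_reply, long_report)
```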
Apple will process data from AI applications in a virtual black box
Apple is set to introduce Apple Chips in Data Centers (ACDC). It involves processing AI data within a virtual black box, preventing employee access to ensure utmost privacy. Apple gains greater control over system design and develops more secure AI platforms by using its custom chipsets for servers and devices. (https://in.mashable.com/tech/76158/apple-plans-to-protect-user-data-by-putting-it-into-virtual-black-box-report)
Free ChatGPT users can access several advanced features previously reserved for paid subscribers, such as custom GPTs, data analytics, chart creation, vision capabilities, and Memory. However, they can’t create custom GPTs of their own. (https://www.theverge.com/2024/5/29/24167436/chatgpt-4o-custom-gpts-free)
ARM’s new chip designs and software for AI on smartphones
AI Training: 📊 Unlock ChatGPT’s interactive charts and tables
OpenAI recently introduced a game-changing feature in ChatGPT that lets you analyze, visualize, and interact with your data without the need for complex formulas or coding.
Head over to ChatGPT and select GPT-4o.
Upload data by clicking the button and select specific columns or rows for a clear focus.
Prompt for interactive charts for powerful visualizations (e.g., “Create a pie chart for X distribution”).
Hover over the different sections to see the exact values, change the chart type, and explore the legend for better understanding.
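Behind the scenes, ChatGPT’s data analysis runs ordinary Python on your uploaded file. As a minimal standard-library sketch of the aggregation a prompt like “Create a pie chart for X distribution” triggers (real sessions typically use pandas and a plotting library; the column name and rows here are made up for illustration):

```python
# Minimal sketch of the counting step behind a "pie chart of X
# distribution" prompt. The `region` column and rows are made-up
# example data; ChatGPT would read yours from the uploaded file.
from collections import Counter

rows = [
    {"region": "EMEA"}, {"region": "APAC"}, {"region": "EMEA"},
    {"region": "AMER"}, {"region": "EMEA"},
]

counts = Counter(row["region"] for row in rows)
total = sum(counts.values())
# Convert raw counts into the percentage slices a pie chart displays.
slices = {region: round(100 * n / total, 1) for region, n in counts.items()}
print(slices)  # e.g. {'EMEA': 60.0, 'APAC': 20.0, 'AMER': 20.0}
```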
🧠 AI RESEARCH: AI brain implant language breakthrough
Researchers at UC San Francisco just developed a brain implant that utilizes AI to help a stroke survivor communicate in both Spanish and English, switching between languages seamlessly via brain activity.
The bilingual implant was tested on a patient who lost his ability to speak after suffering a stroke at the age of 20.
An AI-powered decoding system was trained to recognize the patient’s brain activity patterns when articulating words in both languages.
The system determined the patient’s intended language with 88% accuracy and the correct sentence 75% of the time.
The implant allows the patient to participate in bilingual conversations and switch between languages, despite not learning English until after his stroke.
This research is another example of AI’s increasing ability to interpret our brainwaves — potentially unlocking an endless supply of new learnings, treatments, and technology. It’s also yet another massive leap for unlocking communication for stroke victims while breaking language barriers in the process.
Meta – Partner Engineer, Gen AI: https://jobs.therundown.ai/jobs/60962787-partner-engineer-gen-ai
A Daily chronicle of AI Innovations May 29th 2024:
Google faces major search algorithm leak
Former OpenAI board member explains why they fired Sam Altman
Anthropic recruits ex-OpenAI safety chief to lead new ‘Superalignment’ team
OpenAI forms an AI safety committee
👀 OpenAI begins training the next model
All ChatGPT Free users can now use browse, vision, data analysis, file uploads, and GPTs.
Google faces major search algorithm leak
A leak involving 2,500 pages of internal documentation from Google has surfaced, offering a detailed and unprecedented look into the inner workings of its search algorithm, which is one of the most influential systems on the internet.
The leaked documents suggest discrepancies between Google’s public statements and its internal practices, specifically highlighting the use of Chrome data and the importance of author bylines in search rankings, contrary to what Google has previously disclosed.
Despite multiple requests for comment, Google has not confirmed the authenticity of the leaked documents, which has led to increased scrutiny and calls for greater transparency.
Former OpenAI board member explains why they fired Sam Altman
Former board member Helen Toner revealed on a podcast that the board lost trust in Altman due to his secret ownership of the OpenAI Startup Fund, providing inaccurate safety information, and personal retaliation against her.
Toner explained that after two executives reported a toxic atmosphere and psychological abuse by Altman, the board decided to act; as an example of their lack of oversight, she noted that the board learned of ChatGPT’s launch only through Twitter, with no prior notice.
She believes pressure to reinstate Altman stemmed from limited options presented to employees, fear of retaliation, and Altman’s troubled history with previous jobs.
Anthropic recruits ex-OpenAI safety chief to lead new ‘Superalignment’ team
Jan Leike, a prominent AI researcher who recently resigned from OpenAI, has joined Anthropic to lead a new “superalignment” team focused on AI safety and security.
Leike’s team will work on scalable oversight, weak-to-strong generalization, and automated alignment research, reporting directly to Chief Science Officer Jared Kaplan.
Anthropic aims to distinguish itself as more safety-focused than OpenAI, with Leike’s new team echoing the mission of OpenAI’s dissolved Superalignment team.
OpenAI Board formed a Safety and Security Committee led by directors Bret Taylor, Adam D’Angelo, Nicole Seligman, and Sam Altman. Over the next 90 days, the committee will evaluate and further develop OpenAI’s processes and safeguards. Then, the committee will share its recommendations with the full Board.
Following the full Board’s review, OpenAI will publicly share an update on adopted recommendations. OpenAI has recently begun training its next frontier model, and the committee’s recommendations are meant to bring the resulting systems safely to the next level of capabilities on the path to AGI.
Why does it matter?
OpenAI has drawn a lot of criticism for putting AI safety in the backseat after its superalignment team was dissolved a few days ago. However, it seems that OpenAI is trying to rise to the occasion and address these concerns while developing AI more responsibly.
OpenAI just announced the formation of a new Safety and Security Committee to oversee the development of its next frontier AI model, which the company also said it has recently started training.
The new committee is led by CEO Sam Altman and includes board members Bret Taylor, Adam D’Angelo, and Nicole Seligman.
The group will provide evaluation of safety and security processes over the next 90 days, with plans to share adopted recommendations publicly.
OpenAI said the new model will ‘bring us to the next level of capabilities on our path to AGI’.
Google just unveiled a suite of new AI-powered features for its $350 Chromebook Plus laptops, aimed at enhancing productivity, creativity, and collaboration for users.
Google’s Gemini assistant is integrated directly into the Chromebook Plus home screen, allowing quick access to AI tools.
The “Help Me Write” feature brings AI capabilities to all text entry fields, providing suggestions, changing tone, and rewriting text.
Google Photos’ Magic Editor gives users advanced AI-powered image editing capabilities directly within the Chromebook Plus.
Generative features like AI wallpapers and video call backgrounds also offer users new customization options.
All ChatGPT Free users can now use browse, vision, data analysis, file uploads, and GPTs.
We’re opening up access to our new flagship model, GPT-4o, and features like browse, data analysis, and memory to everyone for free (with limits). Plus users will get up to 5x higher limits, and earliest access to features like our new macOS desktop app and next-generation voice and video capabilities.
Introducing GPT-4o and more tools to ChatGPT free users
Abridge – Assistant General Counsel: https://jobs.therundown.ai/jobs/61108336-assistant-general-counsel
Writer – Enterprise Business Development Rep: https://jobs.therundown.ai/jobs/61154629-enterprise-business-development-rep
What Else Is Happening in AI on May 29th 2024
PwC set to become the first reseller of ChatGPT Enterprise
Companies no longer need to buy a ChatGPT Enterprise license directly from OpenAI. They can purchase the Gen AI service through PwC. Until today, businesses could only subscribe to the enterprise option by contacting an OpenAI salesperson. This also allows PwC to upsell its services to those who want to use ChatGPT to optimize their workloads. (https://venturebeat.com/ai/pwc-strikes-openai-deal-to-become-the-first-reseller-of-chatgpt-enterprise)
Microsoft brings Copilot AI chatbot to Telegram users
Microsoft has added an official Copilot bot within the messaging app Telegram, which lets users search, ask questions, and converse with the AI chatbot. Copilot for Telegram is currently in beta but is free for Telegram users on mobile or desktop. (https://www.theverge.com/2024/5/28/24166451/telegram-copilot-microsoft-ai-chatbot)
Opera is integrating Google’s Gemini models into its Aria browser AI
Opera announced a collaboration with Google Cloud to integrate Gemini models into its Aria browser AI. Aria is powered by Opera’s multi-LLM Composer AI engine, allowing it to curate the best user experiences based on their requirements. Thanks to this integration, Opera can now provide its users with the most current information at high performance. (https://press.opera.com/2024/05/28/opera-google-cloud-aria-gemini/)
GitHub Accelerator program empowers the open-source AI revolution
A Daily chronicle of AI Innovations May 28th 2024:
Google AI Overview gives dangerous and wrong answers
GPT-4 is a better financial analyst than humans
Canva redesigned its platform with new AI features
The EU’s data protection task force has released preliminary findings on ChatGPT’s GDPR compliance
French startup AniML has launched Doly, an iPhone app simplifying 3D product video creation
Iyo, an Alphabet X spinout, will release the Iyo One, a pair of gen AI earbuds
Mastercard integrates AI to catch compromised cards faster
Peter Thiel, former PayPal CEO, believes AI will be worse for math minds than for writers
Google AI Overview gives dangerous and wrong answers
Google’s AI Overviews feature, which generates AI-powered responses to user queries, has been providing incorrect and sometimes bizarre answers. From suggesting putting glue on pizza and staring at the sun for health benefits to claiming that former US President Barack Obama is Muslim, the feature has left users questioning the reliability of AI-generated search results.
While Google maintains that these mistakes result from uncommon queries and are being used to refine the product, the technology’s widespread deployment has highlighted the challenges and risks of integrating AI into search engines.
Why does this matter?
Other companies, such as OpenAI, Meta, and Perplexity, have also experienced issues with AI hallucinations and mistakes. Companies must prioritize the development of robust safeguards and rigorous testing to ensure that AI-powered search results meet the high standards users expect from traditional search engines.
In a groundbreaking study, researchers at the University of Chicago have discovered that GPT-4, a large language model, can analyze financial statements and predict future earnings direction with remarkable accuracy. Even without narrative context, GPT-4 outperforms human financial analysts and achieves prediction accuracy on par with state-of-the-art machine learning models.
The AI model’s success is not simply a result of its training memory but rather its ability to generate valuable narrative insights about a company’s future performance. Notably, trading strategies based on GPT-4’s predictions yield higher returns and risk-adjusted performance than those based on other models, especially for small companies.
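The study’s setup can be sketched as prompting the model with anonymized, standardized statement figures and asking for a step-by-step earnings prediction. The helper below is a hypothetical illustration: the field names, figures, and prompt wording are mine, not the paper’s actual protocol.

```python
# Hypothetical sketch of prompting an LLM with standardized financial
# figures, in the spirit of the University of Chicago study. The line
# items, values, and prompt wording are illustrative, not the paper's.
def build_analysis_prompt(statement: dict) -> str:
    """Format anonymized statement line items into an analysis prompt."""
    lines = [f"{item}: {value:,}" for item, value in statement.items()]
    return (
        "You are a financial analyst. Given the anonymized, standardized "
        "financial statement below, predict whether earnings will "
        "increase or decrease next period, and explain your reasoning "
        "step by step.\n\n" + "\n".join(lines)
    )

prompt = build_analysis_prompt({
    "Revenue": 120_000,
    "Cost of goods sold": 70_000,
    "Net income": 15_000,
})
```

The resulting string would then be sent to the model as a chat message; the study’s finding is that such narrative, chain-of-thought prompting is what lifts the model past human analysts.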
Why does this matter?
This study shows that AI is more than just a support tool in financial decision-making. It can also be central to financial analysis and enable non-experts to make informed decisions, which may change how financial markets operate.
Canva redesigned its platform with new AI features
Canva is launching a redesigned platform with new AI features and tools for professional teams and workspaces. The company has introduced Canva Enterprise, a specialized tier offering more control over collaboration, brand management, and security for larger organizations.
Ad Tech integrations with Google, Meta, and Amazon streamline the ad creation process within Canva. Data autofill automates the creation of data-driven designs by integrating with sources like Salesforce and MLS.
New features include AI style matching for brands, customizable folder displays, and the ability to “star” designs and templates for easier access. Canva Docs now has a suggestion mode for editors and colorful highlight blocks for text emphasis. New “Magic Studio” AI tools include:
Automatic clip highlighting.
Background noise reduction for video editing.
A text-to-graphic image generator.
Why does it matter?
Canva aims to reduce organizational complexity and “app sprawl” by offering a comprehensive creative platform. This update addresses the growing need for extensive, user-friendly solutions that can replace multiple design, AI, and workflow apps.
French startup AniML has launched Doly, an iPhone app simplifying 3D product video creation
Doly uses AI to estimate the 3D shape of an object from regular 2D photos, and then it applies a technique called Gaussian splatting to create a high-quality 3D model suitable for use in product videos. Users can capture a 3D model, choose a template from the library, and integrate their object into a 3D scene. (Link: https://techcrunch.com/2024/05/27/doly-lets-you-generate-3d-product-videos-from-your-phone)
Iyo, an Alphabet X spinout, will release the Iyo One, a pair of gen AI earbuds
The Iyo One will be released this winter, starting at $599, and aims to succeed where competitors like Humane’s Ai Pin and Rabbit’s R1 have struggled. It integrates LLM-based models for a more sophisticated AI experience. Iyo’s founder, Jason Rugolo, believes the Iyo One will provide value through its sound isolation, comfort, and music quality, in addition to its AI features. (Link: https://techcrunch.com/2024/05/27/iyo-thinks-its-gen-ai-earbuds-can-succeed-where-humane-and-rabbit-stumbled)
Mastercard integrates AI to catch compromised cards faster
Mastercard is rolling out an AI update to its fraud-prediction technology to identify compromised credit and debit cards before criminals use them. By analyzing patterns and contextual data, the AI can proactively flag stolen cards, allowing banks to replace them and protect customers from fraudulent transactions. (https://apnews.com/article/mastercard-visa-ai-credit-card-fraud-detection-0c348818087a57b13bfac66c761e03b4)
Peter Thiel, former PayPal CEO, believes AI will be worse for math minds than for writers
AI researcher Kai-Fu Lee doubled down on his 2017 prediction that AI would displace 50% of jobs by 2027, saying white collar jobs will be eliminated faster than blue collar work.
A Daily chronicle of AI Innovations May 27th 2024:
Elon Musk’s xAI raises $6B to build ‘Gigafactory of Compute’
Apple bets that its giant user base will help it win in AI
China invests $47 billion in largest ever chip fund
GPT-4 surpasses humans in financial analysis
Microsoft’s Recall AI: Efficiency upgrade or privacy nightmare?
AI-powered headphones let you listen to one person in a crowd
xAI’s supercomputer that’s 4x larger than existing GPU clusters
Elon Musk’s xAI raises $6B to build ‘Gigafactory of Compute’
Elon Musk’s xAI has successfully raised $6 billion in a Series B funding round to construct a supercomputer known as the “Gigafactory of Compute,” which will be powered by 100,000 Nvidia H100 GPUs, making it at least four times larger than the largest existing GPU clusters.
This funding will enable xAI to advance its product offerings, develop cutting-edge infrastructure, and accelerate research and development, with investors including Andreessen Horowitz, Sequoia Capital, and Saudi Prince Alwaleed bin Talal.
The supercomputer will support the next iteration of xAI’s chatbot, as xAI aims to create advanced AI systems that are truthful, competent, and maximally beneficial for humanity, continuing Musk’s vision of a “maximum truth-seeking AI” called TruthGPT.
Apple bets that its giant user base will help it win in AI
Apple is betting on its vast user base to give it an edge in the AI market, despite its first set of AI features not being as advanced as those from other competitors like Microsoft, Google, and OpenAI.
The company plans to introduce AI tools integrated into its core apps and operating systems, focusing on practical, everyday uses for consumers, with much of the AI processing done on-device and more intensive tasks handled via the cloud.
Apple’s collaboration with OpenAI and potential agreements with Google indicate it is relying on partnerships to compete in the AI space while its own AI developments are still maturing, leveraging its extensive user base to rapidly scale the use of new AI features.
China invests $47 billion in largest ever chip fund
China has invested more than $47 billion into its largest-ever chip investment fund to pursue self-sufficiency in semiconductor manufacturing.
This significant investment reflects China’s broader strategy to develop homegrown chipmakers amid rising technological competition with the United States.
The fund’s third phase, supported by China’s finance ministry and state-owned banks, marks the largest investment of its kind, surpassing the combined total of the previous two phases.
GPT-4 surpasses humans in financial analysis
New research from the University of Chicago shows that GPT-4 can analyze financial statements more accurately than humans, achieving a 60% accuracy rate compared to human analysts’ 53-57% range.
The study indicates significant implications for the future of financial analysis, highlighting GPT-4’s versatility in performing tasks usually reserved for specialized tools and stating its performance is on par with leading machine learning models.
While GPT-4 excels in quantitative analysis, human analysts provide valuable contextual insights, suggesting a complementary relationship between AI and humans for achieving optimal financial analysis results.
Microsoft’s Recall AI: Efficiency upgrade or privacy nightmare?
Microsoft has recently released Recall, a controversial AI-powered tool that logs everything you see and do on your computer, including app usage, live meeting communications, websites visited, and more. By simply performing a “Recall” action, users can retrieve any information they’ve interacted with on their device, presented in the context of a specific time period.
Microsoft assures users that the Recall index remains local and private on-device. Users can pause, stop, or delete captured content and can choose to exclude specific apps or websites. And Recall automatically excludes InPrivate web browsing sessions in Microsoft Edge and DRM-protected content.
Currently, Recall is exclusively compatible with new “Copilot Plus PCs” equipped with Qualcomm’s Snapdragon X Elite chips, which feature the necessary neural processing unit (NPU).
Why does this matter?
The Recall feature has sparked concern among cybersecurity experts and government authorities due to its extensive data collection and privacy risks. Recall continuously captures screenshots of users’ PC activities and creates a searchable index using AI, which may accidentally expose sensitive information to unauthorized users.
AI-powered headphones let you listen to one person in a crowd
Researchers at the University of Washington have developed a groundbreaking artificial intelligence system called “Target Speech Hearing” that allows headphone wearers to isolate and listen to a single speaker in a noisy environment. By simply looking at the desired speaker for three to five seconds, the user can “enroll” them, and the system will cancel out all other sounds, playing only the enrolled speaker’s voice in real-time, even as the listener moves around.
The proof-of-concept device, which uses off-the-shelf headphones fitted with microphones and an on-board embedded computer, builds upon the team’s previous “semantic hearing” research. The system’s ability to focus on the enrolled voice improves as the speaker continues talking, providing more training data. While currently limited to enrolling one speaker at a time and requiring a clear line of sight, the researchers are working to expand the system to earbuds and hearing aids in the future.
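The enroll-then-filter idea behind Target Speech Hearing can be sketched at a very high level. The toy Python below is purely illustrative: the `embed` function is a stand-in for a real speaker-embedding network, and the actual system performs real-time source separation rather than dropping whole frames.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

rng = np.random.default_rng(1)
embed = lambda frame: frame[:16]  # stand-in for a real speaker-embedding net

# "Enrollment": look at the target speaker for a few seconds and embed them.
enrollment_clip = rng.normal(size=64)
target = embed(enrollment_clip)

# Incoming audio: one frame from the enrolled speaker, one from an interferer.
frames = [enrollment_clip + 0.1 * rng.normal(size=64),  # target speaker
          rng.normal(size=64)]                          # interfering speaker

# Keep only frames whose speaker embedding matches the enrolled voice.
kept = [f for f in frames if cosine(embed(f), target) > 0.8]
print(len(kept))
```

In the real system this matching happens inside a neural network that reconstructs only the target speaker's waveform, which is why the output improves as the speaker keeps talking and supplies more enrollment data.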
Why does this matter?
The Target Speech Hearing system may have applications in various settings, like business meetings, conferences, noisy public spaces, or for people with hearing difficulties. While the current system is a proof-of-concept, the researchers’ plan to expand it to earbuds and hearing aids indicates the potential for commercialization, which could lead to a new-age consumer product that enhances our sensory experiences in previously unimaginable ways.
xAI’s supercomputer that’s 4x larger than existing GPU clusters
Elon Musk has recently disclosed plans to build a groundbreaking supercomputer to power the next iteration of xAI’s chatbot, Grok. According to a presentation made to investors in May, as reported by The Information, Musk aims to have this computing system operational by the fall of 2025. The ambitious project may involve a collaboration between xAI and Oracle to develop this massive computing infrastructure.
The proposed supercomputer will consist of interconnected clusters of Nvidia’s state-of-the-art H100 graphics processing units (GPUs). Musk revealed that upon completion, the scale of this system will surpass the largest existing GPU clusters by at least fourfold!
Why does it matter?
As major tech giants and startups compete for dominance in the AI space, Musk’s proposed “gigafactory of compute” could potentially set new benchmarks for AI computing power. Also, Musk’s financial resources and personal commitment to the project position him as a formidable challenger to established AI powerhouses like OpenAI, Google, and Meta.
YouTube Music has introduced a new AI feature that lets users search for songs by humming or singing a melody. This feature, similar to the “Hum to Search” functionality in Google Search, uses machine learning to match the user’s audio input to the original song recording. The feature is currently available in its Android app version 7.02 and is being gradually rolled out to users. (Link)
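Melody matching of this kind is often explained in terms of key-invariant pitch contours: the hummed melody's shape (the intervals between notes) is compared against reference songs, so it doesn't matter which key you hum in. The sketch below is a hypothetical illustration of that idea, not YouTube's actual algorithm.

```python
# Compare melodies by interval contour, making the match key-invariant.
def contour(pitches):
    """Sequence of semitone intervals between consecutive notes."""
    return [b - a for a, b in zip(pitches, pitches[1:])]

def distance(hum, ref):
    c1, c2 = contour(hum), contour(ref)
    n = min(len(c1), len(c2))
    return sum(abs(a - b) for a, b in zip(c1[:n], c2[:n]))

# Hypothetical reference melodies as MIDI note numbers.
refs = {"song_a": [60, 62, 64, 65, 67], "song_b": [60, 59, 57, 55, 53]}
hummed = [62, 64, 66, 67, 69]  # same shape as song_a, sung in a higher key
best = min(refs, key=lambda s: distance(hummed, refs[s]))
print(best)  # song_a
```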
iOS 18 may introduce AI-powered custom emoji tool and smart recaps
Apple is reportedly working on a gen AI-powered custom emoji tool for iOS 18 that lets users create personalized emojis. They’re also developing a smart recaps feature that will provide users with summaries of missed notifications, messages, and documents. (Link)
Suno.ai’s music generator now offers 4-minute songs
AI music model Suno.ai has released its version 3.5 which lets users create songs up to four minutes long with improved structure. The update also introduces a sound-to-song feature, combining audio and text prompts to generate music. (Link)
Meta to use EU user data for AI training, offers opt-out
Meta is notifying Facebook and Instagram users in Europe about changes to its privacy policy, effective June 26, which allows the company to use user data for training AI models under a “legitimate interest” claim. Users can opt out of this data usage, but they must provide an explanation of how it personally affects them, with a reference to the GDPR being sufficient. (Link)
Google launches Gemini AI assistant in Messages app
Google has started rolling out its Gemini AI assistant feature in Google Messages to help users with tasks like drafting messages, brainstorming, event planning, and engaging in conversations. Gemini supports extensions like Workspace, YouTube, and Google Maps, and also lets users provide feedback on responses. (Link)
A Daily chronicle of AI Innovations May 24th 2024:
Google AI tells users to glue pizza and eat rocks
OpenAI scraps controversial nondisparagement agreement with employees
Musk’s xAI nears deal valuing startup at $24B
Cohere releases multilingual AI model, Aya 23
Arc introduces “Call Arc” for quick voice answers
Elon Musk envisions AI era, new work norms, life on Mars
OpenAI controversy: ‘Sky’ speaks out
Google AI tells users to glue pizza and eat rocks
Google’s new AI Overviews feature is generating incorrect and sometimes absurd answers, like adding non-toxic glue to pizza as a solution to prevent cheese from falling off.
The feature has made several other mistakes, such as stating that former US President James Madison graduated from the University of Wisconsin 21 times and claiming that Batman is a cop.
Google acknowledges these are “isolated examples” and emphasizes the AI is experimental, but the errors highlight significant issues with the current state of AI-generated information.
OpenAI scraps controversial nondisparagement agreement with employees
OpenAI will not enforce any previously signed nondisparagement agreements with former employees and will remove such language from its exit paperwork, the company told Bloomberg.
Previously, employees had to choose between speaking against the company or keeping their vested equity, potentially losing millions if they refused to sign the agreement.
Sam Altman, OpenAI’s CEO, expressed embarrassment over the existence of the clause and pledged to rectify the paperwork, while Chief Strategy Officer Jason Kwon apologized for the distress caused.
Musk’s xAI nears deal valuing startup at $24B
Elon Musk’s AI startup, xAI, is expected to complete a funding round in June that could value the company at over $24 billion.
The company originally aimed to raise around $6 billion earlier this month but is now seeking to gather as much as $6.5 billion, with final deal closure still pending.
xAI has been marketed to Silicon Valley investors using a pitch highlighting Musk’s success with Tesla and SpaceX, and plans to leverage data from his social platform X (formerly Twitter) to build its AI chatbot, Grok.
Cohere releases multilingual AI model, Aya 23
Cohere for AI (C4AI), the non-profit research group, has launched open-weight Aya 23, a new family of multilingual language models. Available in 8B and 35B parameter variants, Aya 23 supports 23 languages, including Arabic, Chinese, English, French, German, Hindi, Japanese, Spanish, and more.
Here’s a quick breakdown:
Aya 23 focuses on depth over breadth, performing better across fewer languages than its predecessor, Aya 101 (which covered 101 languages).
The 8B parameter model balances efficiency and accessibility, while the 35B parameter variant delivers higher performance at the cost of increased computational demand.
Aya 23 outperforms existing models like Google’s Gemma on various tasks across the languages it covers.
Researchers can access and fine-tune Aya 23 for their needs, with the model available for free trial on Cohere Playground.
Why does this matter?
Most AI models struggle with languages besides English. The development of multilingual models can serve a much wider audience, bridging the language divide and making AI more accessible and inclusive for users around the world.
Arc introduces “Call Arc” for quick voice answers
Arc Search, an AI-powered search app, just launched a new feature called Call Arc. This lets users ask questions by holding their phone to their ear, mimicking a phone call. It provides instant voice answers, similar to voice search, but designed to be more convenient and quicker.
The app is designed to answer short, immediate questions. For example, you can ask it how long it takes to cook spaghetti or why to reserve pasta water, all while making dinner.
Call Arc complements Arc Search’s existing “Browse for me” function that generates webpages with information based on your search query.
Why does this matter?
As AI continues to simplify our lives, innovative features like Call Arc could provide a new approach to voice search, making it intuitive and accessible for users on the go. This could also lead to more nuanced search results and a more engaging experience compared to simple keyword searches.
Elon Musk envisions AI era, new work norms, life on Mars
Musk emphasized SpaceX’s goal of making life multi-planetary, with Mars as a key focus. He discussed the importance of reusable spacecraft and highlighted the necessity of space exploration for humanity’s long-term survival.
Regarding AI, Musk stressed the importance of honesty in AI development, criticizing approaches that prioritize political correctness over truthfulness. He also touched on AI’s potential to revolutionize education, though he expressed concerns about the impact of social media on children.
Musk envisioned a future where automation leads to a job-free society, with a universal basic income ensuring people’s needs are met.
Why does this matter?
Elon Musk’s bold predictions about Mars, the Moon, and a job-free future driven by AI are both exciting and thought-provoking. While his vision for Mars could redefine human civilization, his stance on AI emphasizes the need for integrity in technology.
It’s crucial that we engage in open and honest dialogue about AI’s implications for society.
OpenAI controversy: ‘Sky’ speaks out
There has been a wave of updates on several drama-filled OpenAI topics, including leaked documents related to the company’s controversial NDA clauses, new info surrounding the ‘Sky’ voice model, another resignation, and more.
The voice actress behind ChatGPT’s ‘Sky’ voice has come forward, saying she was hired months before Altman contacted Scarlett Johansson.
Leaked documents with Sam Altman’s signature contradict claims he was unaware of NDA clauses threatening to claw back former employees’ equity.
OpenAI policy researcher Gretchen Krueger resigned, with a post citing safety concerns about the company.
Teasers are coming out of presentations at the VivaTech conference, showing a potential move away from the ‘GPT-5’ name and a new Sora demo.
YouTube has introduced Dream Track, an AI-powered tool designed for creating music within YouTube Shorts. This experimental feature allows users to generate musical pieces based on input commands. While it is an innovative addition, it has sparked a debate over AI’s impact on the authenticity of creative work. (Link)
Ideogram launches Ideogram Tile, an AI-powered pattern generator
Ideogram has launched Ideogram Tile, a feature that creates seamless, repeating patterns from text prompts. This tool empowers designers to craft unique wallpapers, fabrics, and textures effortlessly. Following an $80 million funding round, Ideogram continues to lead in AI-driven design innovation, blending creativity with advanced technology. (Link)
Spotify tests Spanish-speaking AI DJ, “DJ Livi”
Spotify is testing a Spanish-speaking iteration of its AI DJ, named “DJ Livi.” The app code suggests a language switch option for this new feature. Although Spotify has not confirmed launch plans, this move could cater to Spanish-speaking audiences globally and in the U.S., marking a significant expansion in AI language support. (Link)
Amazon plans paid subscriptions for Alexa AI features
Amazon is reportedly gearing up to launch paid subscription plans for select Alexa AI features, aiming to enhance conversational capabilities. This move seeks to compete with advanced AI chatbots like Google’s Gemini and OpenAI’s ChatGPT. Notably, the subscription won’t be bundled with Amazon Prime, offering a standalone service. (Link)
Inworld AI launches AI-driven voice generator
Inworld AI, known for its innovative AI-driven virtual characters, unveils Inworld Voice, a dynamic voice generator for game developers. Boasting 58 diverse voices powered by advanced machine learning models, it promises expressive, cost-efficient, and high-quality voice experiences. (Link)
A Daily chronicle of AI Innovations May 23rd 2024:
Anthropic uncovers millions of concepts in Claude Sonnet’s AI model
Truecaller’s AI assistant gets a voice upgrade, thanks to Microsoft
TikTok makes ad creation easy with AI! 🤯
Scientists plan a head transplant system
Chameleon: Meta’s new multimodal LLM
OpenAI didn’t copy Scarlett Johansson’s voice, records show
Nvidia shows no signs of AI slowdown
Global outages hit Microsoft Bing, DuckDuckGo, ChatGPT Search and Copilot
OpenAI signs $250M landmark deal with News Corp
Apple may develop foldable iPhone with self-healing screen
Anthropic uncovers millions of concepts in Claude Sonnet’s AI model
Anthropic has made a breakthrough in understanding the inner workings of their AI model, Claude Sonnet, by identifying how millions of concepts are represented within it.
Using a technique called “dictionary learning,” they were able to map out these concepts, providing the first-ever detailed look inside a modern, production-grade large language model.
Key findings:
Features linked to concepts: These concepts are linked to features, which are groups of neurons that activate together in response to specific ideas.
Features can be manipulated: By manipulating these features, the researchers were able to influence Claude’s outputs, demonstrating a causal link between features and behavior.
Features reveal potential risks: The research identified features corresponding to biases, potential misuse of the model, and even sycophantic behavior.
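As rough intuition for what "dictionary learning" means here, the toy sketch below expresses a fake activation vector as a sparse combination of candidate feature directions using ISTA-style sparse coding. Everything here is made up for illustration; Anthropic's actual work trains sparse autoencoders on Claude's internal activations at vastly larger scale.

```python
import numpy as np

# A toy illustration of sparse coding on fake "activations".
rng = np.random.default_rng(0)
D = rng.normal(size=(8, 32))                      # 8 candidate "feature" directions
D /= np.linalg.norm(D, axis=1, keepdims=True)

# One synthetic activation vector built from features 2 and 5 plus noise.
x = 3.0 * D[2] + 1.5 * D[5] + 0.01 * rng.normal(size=32)

# ISTA: find a sparse code c with x ≈ c @ D.
c = np.zeros(8)
step, lam = 0.1, 0.05
for _ in range(500):
    grad = (c @ D - x) @ D.T
    c = c - step * grad
    c = np.sign(c) * np.maximum(np.abs(c) - step * lam, 0.0)  # soft-threshold

active = np.where(np.abs(c) > 0.5)[0]
print(active)  # recovers features 2 and 5
```

The point of the exercise is that a dense, hard-to-read activation vector becomes a short list of interpretable features that fire, which is what makes the "features can be manipulated" result possible.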
Why does this matter?
Anthropic’s research is a big step toward making AI models more transparent and trustworthy. By understanding how these models work, researchers can spot and reduce harmful biases, prevent misuse, and steer AI systems toward better outcomes, leading to safer and more reliable AI applications.
Truecaller’s AI assistant gets a voice upgrade, thanks to Microsoft
Truecaller is partnering with Microsoft to let users create an AI version of their own voice for their AI Assistant to use when answering calls.
This feature, currently limited to users with access to Truecaller’s AI Assistant, requires them to record a voice clip that Microsoft’s Azure AI Speech technology will use to create a personalized AI voice.
Truecaller says this will allow for a more personalized experience and highlights the potential of AI in communication. However, it’s important to note that Microsoft limits the use of personal voice for specific purposes and requires users to obtain consent before recording someone’s voice.
Why does this matter?
Truecaller’s AI voice assistant offers a glimpse into the future of call management, promising a more personalized experience. However, it must ensure transparency and address performance comparisons with automated voicemail systems.
The idea of speaking with an AI while calling someone may seem unusual, and it remains to be seen how well this feature will be received.
TikTok makes ad creation easy with AI
TikTok has introduced “TikTok Symphony,” a suite of generative AI tools designed to help marketers create and optimize ad campaigns. The suite includes an AI video generator called “Symphony Creative Studio,” which can produce TikTok-ready videos with minimal input from advertisers, and an AI assistant named “Symphony Assistant” that helps refine scripts and provides best practice recommendations.
The company has also introduced “TikTok One,” a centralized hub for marketers to access creators, agency partners, and creative tools. Additionally, TikTok is leveraging predictive AI to drive more sales for advertisers by determining the best creative assets and target audiences based on budgets and goals.
Why does this matter?
TikTok’s integration of generative AI into its ads business signifies the growing importance of AI in digital marketing. This shift could make ad campaigns more efficient and effective, opening new opportunities for tech companies and startups.
Chameleon: Meta’s new multimodal LLM
Meta’s AI research lab just introduced Chameleon, a new family of ‘early-fusion token-based’ AI models that can both understand and generate text and images in any order.
Unlike other models that process image and text separately and then combine them later, Chameleon works with linked sequences of both.
Chameleon outperformed all rival models on image captioning and visual question answering, while matching their performance on text-only tasks.
The 34B parameter Chameleon also matched or outperformed top models like Gemini Pro and GPT-4V on tests of long-form mixed-modal generations.
Chameleon shows the potential for a different type of architecture for multimodal AI models, with its early-fusion approach enabling more seamless reasoning and generation across modalities and setting new performance bars.
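The "early fusion" idea can be illustrated schematically: images are tokenized into discrete codes that share a single sequence (and a single offset vocabulary) with text tokens, so one transformer sees both modalities interleaved. The vocabulary sizes and helper functions below are made up for illustration.

```python
# Schematic of early fusion with a toy shared vocabulary.
TEXT_VOCAB = 1000    # hypothetical number of text token ids
IMAGE_CODES = 8192   # hypothetical number of discrete image patch codes

def text_token(i):
    return i                  # text ids occupy [0, TEXT_VOCAB)

def image_token(code):
    return TEXT_VOCAB + code  # image ids are offset past the text ids

# One interleaved stream: some text, then image patches, then more text.
sequence = [text_token(i) for i in (5, 17, 42)] \
         + [image_token(c) for c in (7, 301, 4096)] \
         + [text_token(99)]
print(sequence)
```

Because both modalities live in one token stream, the model can reason across them and emit text or image tokens in any order, which is the contrast with late-fusion designs that encode each modality separately.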
OpenAI signs $250M landmark deal with News Corp
OpenAI has announced a multi-year partnership with media giant News Corp to enhance ChatGPT with its high-quality journalism. The collaboration aims to provide reliable information to users, with News Corp’s mastheads displayed in response to user queries. OpenAI CEO Sam Altman called it a “proud moment for journalism and technology.” (Link)
AWS and Hugging Face team up on AI model development
AWS has partnered with AI startup Hugging Face to enable developers to run thousands of AI models on Amazon’s custom Inferentia2 chips. The collaboration aims to provide a cost-effective solution for deploying AI models, with AWS aiming to compete in the inference market currently dominated by Nvidia. (Link)
Granola’s introduces AI-powered note-taking app
Granola, a new AI startup, has launched a notepad app that combines user-written notes with AI-generated details from meeting transcripts. The app aims to make note-taking more collaborative and efficient, using OpenAI’s GPT-4 to augment and clean up your notes. (Link)
BrainBridge launches head transplant system using AI and robotics
Neuroscience startup BrainBridge has revealed its groundbreaking head transplant system that aims to provide hope for patients with untreatable conditions.
The CGI demo proposes a system that uses advanced robotics, AI, and real-time molecular imaging to transfer a patient’s head or face onto a donor body.
The company hopes to conduct the first surgery within eight years, and is releasing the concept early to attract top scientists.
A headband with a brain-computer interface (like Neuralink) would help patients communicate and execute tasks during recovery using their minds.
The system seems highly speculative, more sci-fi than reality, but so did Neuralink’s brain chips at first. And while we’re likely a long way from this becoming possible, the rate of advancement in robotics and AI makes it tough to rule anything out over the next decade.
The procedure uses advanced AI, robotics, and a proprietary adhesive to transplant a patient’s head onto a donor body. While it’s still early, BrainBridge expects to conduct the first surgery within eight years, pending successful feasibility studies. (Link)
Meta establishes AI advisory group
Meta CEO Mark Zuckerberg has created the Meta Advisory Group, consisting of Stripe’s Patrick Collison, Shopify’s Tobi Lütke, and others, to guide the company’s AI and technology advancements. The group will offer insights and recommendations as Meta pushes harder into AI development across various products. (Link)
A Daily chronicle of AI Innovations May 22nd 2024: Microsoft’s first SoTA SLM to be shipped with Windows Google unveils new AI tools for branding and product marketing Adobe introduces Firefly AI-powered Generative Remove to Lightroom
Microsoft’s first SoTA SLM to be shipped with Windows
Microsoft announced a new small language model called Phi Silica. It has 3.3 billion parameters, which makes it the smallest model in Microsoft’s Phi family of models. Phi Silica is designed specifically for the Neural Processing Units (NPUs) in Microsoft’s new Copilot+ PCs. Despite its small size, Phi Silica can generate 650 tokens per second using only 1.5 Watts of power. This allows the PC’s main processors to be free for other tasks.
Developers can access Phi Silica through the Windows App SDK and other AI-powered features like OCR, Studio Effects, Live Captions, and Recall User Activity APIs. Microsoft plans to release additional APIs, including Vector Embedding, RAG API, and Text Summarization. These AI-powered PCs will have dedicated AI chips for running LLMs and other AI workloads.
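Taking Microsoft's stated figures at face value, a quick back-of-envelope calculation puts the NPU's energy cost at roughly 2.3 millijoules per generated token:

```python
# Back-of-envelope from the stated figures: 650 tokens/s at 1.5 W.
tokens_per_second = 650
power_watts = 1.5

# Energy per token = power / throughput, converted to millijoules.
energy_per_token_mj = power_watts / tokens_per_second * 1000
print(round(energy_per_token_mj, 2))  # ~2.31 mJ per token
```

That efficiency is what lets the NPU handle generation continuously while leaving the CPU and GPU free for other work.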
Why does it matter?
As Microsoft continues to invest in developing small language models and integrating AI into its Windows platform, Phi Silica represents a significant step forward in making advanced AI capabilities more accessible to developers and end-users. Also, as major PC manufacturers want to introduce AI-powered laptops this summer, Microsoft might lead the way by introducing CoPilot PC+ and now Phi Silica.
Google unveils new AI tools for branding and product marketing
Google has introduced several new AI-powered features to help retailers and brands better connect with shoppers. First, Google has created a new visual brand profile that will appear in Google Search results. This profile uses information from Google Merchant Center and Google’s Shopping Graph to showcase a brand’s identity, products, and offerings.
Additionally, Google is expanding its AI-powered tools to help brands create more engaging content and ads. This includes new features in Google’s Product Studio, allowing brands to generate images matching their unique style.
Google is also launching immersive ad formats powered by generative AI, such as the ability to include short product videos, virtual try-on experiences, and 3D product views directly in search ads. These new AI-driven tools aim to help brands forge stronger, more personalized connections with consumers throughout the shopping journey.
Why does it matter?
As AI continues advancing, such tools could shape the future of marketing by delivering more personalized, hyper-relevant, and visually compelling experiences that better connect consumers with brands and products.
Adobe introduces Firefly AI-powered Generative Remove to Lightroom
Adobe has added a new AI-powered feature called Generative Remove to its Lightroom photo editing software. Generative Remove uses Adobe’s Firefly generative AI model to let users seamlessly remove objects from photos, even objects with complex backgrounds. The feature can remove stains, wrinkles, reflections, and more from images.
Adobe has been integrating Firefly’s capabilities across its Creative Cloud apps to generate images, apply styles, fill areas, and remove objects through the new Generative Remove tool in Lightroom. The company works closely with photographers to keep improving and expanding this object-removal capability, and it also announced a new Lens Blur effect that uses AI to add realistic depth-of-field blur to photos.
Why does it matter?
The Generative Remove feature will make it easier for photographers, designers, and other creatives to edit their images, saving time and effort. Looking ahead, we can expect Adobe and other creative software companies to explore new ways to harness Gen AI to automate tedious tasks, provide intelligent assistance, and enable entirely new creative possibilities.
Anthropic just published new research that successfully identified and mapped millions of human-interpretable concepts, called “features”, within the neural networks of Claude.
The details:
Researchers used a technique called ‘dictionary learning’ to isolate patterns that corresponded to concepts, from objects to abstract ideas.
By tweaking the patterns, the researchers showed the ability to change Claude’s outputs, potentially leading to more controllable systems.
The team mapped concepts related to AI safety concerns, like deception and power-seeking — providing glimpses into how models understand these issues.
Why it matters: Despite how fast AI is accelerating, we still don’t have a strong understanding of what’s going on beneath the hood of LLMs. This research is a major step towards making AI more transparent — enabling better understanding, control, and safeguarding of these powerful tools.
According to public developer documents, Elon Musk’s AI company, xAI, is making progress on adding multimodal inputs to the Grok chatbot. This means soon, users may be able to upload photos to Grok and receive text-based answers. This was first teased in a blog post last month from xAI, which said Grok-1.5V will offer “multimodal models in a number of domains.” (Link)
Microsoft’s new Copilot AI agents to work like virtual employees
Microsoft will soon allow businesses and developers to build AI-powered Copilots that can work like virtual employees and perform tasks automatically. Instead of Copilot sitting idle waiting for queries, it will be able to monitor email inboxes and automate tasks or data entry that employees normally have to do manually. (Link)
Microsoft Edge introduces real-time AI translation and dubbing for YouTube
Microsoft Edge is set to introduce real-time translation and dubbing for videos on platforms like YouTube, LinkedIn, and Coursera. This new AI-powered feature will translate spoken content live, offering dubbing and subtitles. Currently, the feature supports translations from Spanish to English and English to German, Hindi, Italian, Russian, and Spanish. (Link)
WitnessAI builds guardrails for Gen AI models
WitnessAI is developing tools to make Gen AI models safer for businesses. The company’s platform monitors employee interactions and custom AI models, applying policies to reduce risks like data leaks and biased outputs. The platform also offers modules to enforce usage rules and protect sensitive information. Lastly, it encrypts and isolates data for each customer. (Link)
Microsoft’s Azure AI Studio supports GPT-4o
Microsoft has announced that Azure AI Studio is now generally available and supports OpenAI’s GPT-4o model, which joins over 1,600 other models, including Mistral, Meta, Nvidia, etc. Developers can use this multimodal foundational model to incorporate text, image, and audio processing into their apps to provide generative and conversational AI experiences. (Link)
A Daily chronicle of AI Innovations May 21st 2024:
Microsoft announces AI-powered MacBook competitors
Microsoft’s AI chatbot will ‘recall’ everything you do on a PC
Scarlett Johansson told OpenAI not to use her voice
TSMC and ASML machines are equipped with remote self-destruct in case of invasion
Volvo introduces first autonomous truck
Microsoft’s New AI PCs Rival Apple’s MacBooks
Scarlett Johansson sues OpenAI for using her voice in ChatGPT
DINO 1.5 is smarter and faster at object detection
Microsoft announces AI-powered MacBook competitors
Microsoft introduces “Copilot Plus PCs,” new Windows laptops with built-in AI hardware and features, launching June 18th with Qualcomm processors.
These AI-enabled laptops, from major partners like Dell and Lenovo, will include a neural processor, at least 16GB RAM, and 256GB SSD, supporting over 40 AI models and OpenAI’s GPT-4o model.
Promising significant performance and battery life improvements, Microsoft expects to sell 50 million Copilot Plus PCs in the next year, marking a new era for Windows laptops.
Microsoft’s New AI PCs Rival Apple’s MacBooks
Microsoft revealed Copilot+ PCs, a new category of Windows PCs designed for AI. These PCs boast powerful processors, all-day battery life, and AI features like Recall for instant memory, Cocreator for image creation, Live Captions for real-time translations, and Auto Super Resolution for games.
The Recall feature, which lets users search and recall anything they’ve seen and interacted with on their computer screens using natural language, is especially impressive. The new PCs feature an all-new system architecture with the CPU, GPU, and a high-performance Neural Processing Unit (NPU) working together. Starting at $999, Copilot+ PCs come equipped with OpenAI’s GPT-4o models.
Why does this matter?
Microsoft claims its new Arm-powered Copilot Plus PCs will outperform the MacBook Air with M3 by over 50% on sustained performance. The tech giant is advancing Arm-based computing by using AI at every level and partnering with Qualcomm. If these machines live up to the hype, they could upset the dominance of Intel-based laptops and challenge Apple’s M-series processors.
TSMC and ASML machines are equipped with remote self-destruct in case of invasion
ASML and TSMC have the capability to disable their chipmaking machines if China invades Taiwan, according to sources.
ASML assured officials it can remotely shut down these machines, including its EUV units, which are crucial for producing the world’s smallest microchip transistors used in both AI and military applications.
US government officials, concerned about the potential impact of a Chinese invasion on the global chip supply, have discussed these security measures with both Dutch and Taiwanese counterparts.
Microsoft’s AI chatbot will ‘recall’ everything you do on a PC
Microsoft introduced an upgraded version of its AI assistant, Copilot, that will remember users’ activities on their PCs to predict their next actions.
The new AI features, including Windows Recall, aim to enhance user experience by providing a virtual “photographic memory,” while allowing users to control their privacy settings.
It’s a step toward machines that “instantly see us, hear, reason about our intent and our surroundings,” said CEO Satya Nadella.
Scarlett Johansson told OpenAI not to use her voice
Scarlett Johansson stated that OpenAI approached her in September about supplying her voice for GPT-4o, but she declined the offer.
After hearing a demo of the ChatGPT system’s “Sky” voice resembling her own, Johansson expressed shock and mentioned hiring legal counsel to address the issue with OpenAI.
OpenAI, while denying that Sky’s voice was meant to mimic Johansson’s, paused the use of the voice out of respect and announced plans to introduce additional voices in the future.
Scarlett Johansson hires legal counsel over OpenAI’s use of a similar voice in ChatGPT
Scarlett Johansson claims OpenAI asked her to voice ChatGPT, but she declined. Later, OpenAI released a voice named “Sky” that sounded eerily similar to her. Johansson was shocked and angered by the similarity and has hired legal counsel to investigate how the “Sky” voice was created.
OpenAI denies that the “Sky” voice was intended to resemble Johansson, has paused using it in its products, and apologizes for not communicating better. Johansson seeks transparency from OpenAI and believes that individual rights must be protected in the era of deep fakes and AI content.
Why does this matter?
Another day, another OpenAI drama. OpenAI already faces lawsuits from authors and news organizations over copyright infringement in training its AI models. Johansson’s case shows the limited legal protection actors have against AI mimicking their likeness, and some lawmakers are proposing bills to address the issue.
DINO 1.5 is smarter and faster at object detection
IDEA Research launched the Grounding DINO 1.5 open-world object detection model series, with Grounding DINO 1.5 Pro for high-performance detection and Grounding DINO 1.5 Edge for efficient edge computing. Grounding DINO 1.5 Pro achieves state-of-the-art zero-shot transfer performance on several academic benchmarks, surpassing its predecessor.
The model shows strong detection capabilities across various scenarios, including common objects, long-tailed categories, dense objects, and caption phrase grounding. Grounding DINO 1.5 Pro uses a larger Vision Transformer backbone and is pretrained on the high-quality Grounding-20M dataset.
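The open-set matching idea behind grounding-style detection can be sketched as a toy in plain NumPy: instead of scoring candidate regions against a fixed, closed list of classes, each region embedding is matched against free-text phrase embeddings. The vectors, names, and threshold below are illustrative assumptions, not Grounding DINO’s actual features or API:

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def ground_phrases(region_feats, phrase_feats, threshold=0.5):
    # Open-set detection as similarity matching: a region is kept for a
    # phrase whenever its embedding similarity clears the threshold,
    # so new phrases need no retraining of a classification head.
    matches = {}
    for phrase, p_vec in phrase_feats.items():
        scores = [cosine(r, p_vec) for r in region_feats]
        matches[phrase] = [i for i, s in enumerate(scores) if s >= threshold]
    return matches
```

In the real model, region and phrase embeddings come from the vision backbone and text encoder respectively; the toy above only shows why a single model can handle common objects and long-tailed categories alike.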
Why does this matter?
Grounding DINO 1.5 isn’t just about raw performance; it’s also remarkably versatile. Fine-tuning the model on downstream datasets yields even stronger results, suggesting it could enable more reliable robotics and automated systems while requiring less training time and data.
Volvo has unveiled its first production-ready self-driving truck, developed in partnership with the autonomous driving technology company Aurora.
This autonomous semi truck, based on Volvo’s VNL Class 8 model, is equipped with an array of sensors and cameras to support Aurora’s Level 4 autonomous driving system, which does not require a human driver.
Volvo and Aurora started collaborating in 2018, and their autonomous trucks have already logged 1.5 million miles on commercial roads; Aurora aims to deploy 20 fully autonomous trucks this year and plans to scale up to 100 trucks by 2025.
HP is introducing simplified laptop “AI PCs” lineup
HP’s new “AI PCs,” the OmniBook X AI and EliteBook Ultra AI, feature Snapdragon X Elite 12-core CPUs. HP’s AI Companion software provides access to ChatGPT (powered by GPT-3.5), while Poly Camera Pro offers AI camera controls. The OmniBook X AI starts at $1,199.99 and the EliteBook Ultra AI at $1,699.99. (Link)
Google is investing 1 billion Euros to expand its data center in Finland
This move aims to drive AI business growth in Europe. Nordic countries are popular for data centers due to cooler climate, tax breaks, and abundant renewable power. Moreover, the heat from this data center will be used to warm up local homes and buildings. Google’s Hamina data center in Finland already operates with 97% carbon-free energy and aims for net zero emissions by 2030. (Link)
Dell Technologies expands its generative AI capabilities with the Dell AI Factory
The Dell AI Factory with NVIDIA integrates hardware and software to support advanced use cases like RAG and digital assistants. Dell is also deepening its partnerships with Hugging Face, Meta, and Microsoft to bring open models and Azure AI services to on-premises deployments. The goal is to make it easy for enterprises to assess and implement generative AI on-premises, close to their data. (Link)
Intel’s Lunar Lake laptop processors are set to launch in Q3 2024
These processors are designed for AI Copilot Plus PCs. They offer 3x the AI performance of its predecessor, Meteor Lake. Intel claims Lunar Lake is 1.4 times faster in Stable Diffusion 1.5 compared to Qualcomm’s Snapdragon X Elite. The company aims to ship 40 million AI PC processors by the end of the year, with Lunar Lake featured in over 80 new laptop designs. (Link)
OpenAI CEO Sam Altman apologized for the company’s strict off-boarding agreement
This off-boarding agreement could potentially cancel departing employees’ vested equity. Meanwhile, departing execs are speaking out about the need for serious AGI safety measures. Altman and co-founder Greg Brockman defended OpenAI’s launch procedures, emphasizing their commitment to safety. (Link)
A Daily chronicle of AI Innovations May 20th 2024: Google’s AI panic looks like Google+ fiasco OpenAI pauses Scarlett Johansson-like voice for ChatGPT Snapchat focuses on AI with $1.5 billion yearly investment Apple and OpenAI plan major announcement at WWDC OpenAI’s “superalignment team,” focused on the AI risks, is no more Sony Music warns over 700 AI companies not to steal its content Meta’s Chameleon AI sets a new bar in mixed-modal reasoning
Google’s AI panic looks like Google+ fiasco
Scott Jenson, a former Google employee, criticizes the company’s AI projects as poorly motivated and driven by panic, comparing the situation to the Google+ fiasco.
According to Jenson, Google aims to create a Jarvis-like assistant to keep users within its ecosystem, driven by fear that competitors might get there first, similar to Apple’s strategy with Siri and OpenAI.
Many of Google’s AI projects revealed at I/O 2024 are still experimental or limited in availability, with uncertain value, and features like “AI Overviews” in Google Search have received criticism for potential copyright issues and misinformation.
OpenAI pauses Scarlett Johansson-like voice for ChatGPT
OpenAI is pulling the ChatGPT voice known as Sky, which sounds similar to Scarlett Johansson, due to concerns about mimicking celebrities’ voices.
The company asserts that Sky’s voice is the natural voice of a different professional actress and not an intentional imitation of Johansson.
The decision to pause Sky’s use follows recent enhancements to ChatGPT’s voice mode, part of the new GPT-4o model, which aims to make the assistant more expressive and capable of reading facial expressions and translating spoken language in real-time.
Snapchat focuses on AI with $1.5 billion yearly investment
Snap’s CEO, Evan Spiegel, announced a major investment of $1.5 billion annually in AI and machine learning to improve Snapchat’s features and competitiveness.
After successfully revamping Snapchat’s advertising model, the company will now focus on collaborating with tech giants like Amazon and Google for cloud partnerships to enhance AI product innovations.
With growing ad revenue and new successful ad campaigns, Snapchat plans to expand content offerings and augmented reality experiences, enhancing user engagement and competing with platforms like TikTok.
Apple and OpenAI plan major announcement at WWDC
Apple and OpenAI are planning a major joint announcement at WWDC, focusing on integrating OpenAI’s technology into iOS 18, according to Mark Gurman of Bloomberg.
Apple aims to improve its AI capabilities with this integration, including enhancements to Siri and features like automatic summaries of notifications and transcription of voice memos.
While Apple acknowledges it is behind in AI and lacks a chatbot like ChatGPT or Google Gemini, the company believes the collaboration with OpenAI is a sufficient, though temporary, solution.
OpenAI’s “superalignment team,” focused on the AI risks, is no more
The team’s co-leads, Ilya Sutskever and Jan Leike, have resigned from OpenAI. Several other researchers from the team and those working on AI policy and governance have also left the company. Leike cited disagreements with OpenAI’s leadership about the company’s priorities and resource allocation as reasons for his departure.
The team’s work will be absorbed into OpenAI’s other research efforts, with John Schulman leading research on risks associated with more powerful models.
Why does this matter?
The “superalignment” team was tasked with ensuring that the artificial general intelligence (AGI) OpenAI claims to be working on doesn’t turn on humankind. Its dismantling raises questions about the company’s commitment to AI safety and ethical standards.
Sony Music warns over 700 AI companies not to steal its content
Sony Music, home to superstars like Billy Joel and Doja Cat, sent letters to over 700 AI companies and streaming platforms, warning them against using its content without permission. The label called out the “training, development, or commercialization of AI systems” that use copyrighted material, including music, art, and lyrics.
SMG recognizes AI’s potential but stresses the need to respect songwriters’ and artists’ rights. The letter asks companies to confirm they haven’t used SMG content without permission or provide details if they have.
Why does this matter?
The battle over music copyright and AI has intensified across various platforms, from YouTube’s strict rules for AI-generated music to the recent standoff between Universal Music Group and TikTok. As AI voice clones and music generation tools become more sophisticated, artists are questioning control, compensation, and what recourse they have against copyright infringement.
Meta’s Chameleon AI sets a new bar in mixed-modal reasoning
Meta AI introduces Chameleon, a family of early-fusion, token-based mixed-modal models that understand and generate images and text in any order. Unlike recent foundation models that process text and images separately, Chameleon’s unified token space allows it to process interleaved image and text sequences. This approach enables seamless reasoning and generation across modalities.
Meta researchers introduced architectural enhancements and training techniques to tackle the optimization challenges posed by this early fusion approach, including a novel image tokenizer, QK-Norm, dropout, and z-loss regularization. Remarkably, Chameleon achieves competitive or superior performance across various tasks, outperforming larger models like Flamingo-80B and IDEFICS-80B in image captioning and visual question answering despite its smaller model size.
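Two of the stabilization tricks named above can be sketched in a few lines. This is a conceptual NumPy illustration of QK-Norm and z-loss, not Meta’s implementation; the shapes and the z-loss weight are assumptions:

```python
import numpy as np

def rms_norm(x, eps=1e-6):
    # RMSNorm: rescale by the root-mean-square, with no mean subtraction.
    return x / np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)

def qk_norm_scores(q, k):
    # QK-Norm: normalize queries and keys *before* the dot product, which
    # bounds attention-logit magnitude and helps stabilize training when
    # image and text tokens share one sequence.
    q, k = rms_norm(q), rms_norm(k)
    return (q @ k.T) / np.sqrt(q.shape[-1])

def z_loss(logits, weight=1e-4):
    # z-loss penalizes the squared log-partition-function log(Z),
    # discouraging softmax logits from drifting to extreme magnitudes.
    z = np.log(np.exp(logits).sum(axis=-1))
    return weight * float(np.mean(z ** 2))
```

After QK-Norm, every query and key row has unit RMS, so attention scores are bounded regardless of how large the raw activations grow; the z-loss term is simply added to the training objective.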
Why does this matter?
Chameleon opens up new possibilities for more natural and intuitive human-machine interactions, similar to how we effortlessly communicate using both modalities in the real world.
Google launched open-source Model Explorer to visualize and debug complex AI models
It uses advanced graphics rendering techniques from the gaming industry to handle massive models. The tool offers a graphical user interface and a Python API for integration into machine learning workflows. Model Explorer lets developers identify and resolve issues quickly, especially for AI deployed on edge devices. (Link)
The UK’s AI Safety Institute is opening an office in San Francisco
The institute aims to be closer to the epicenter of AI development and to companies like OpenAI and Google that are building foundation models. The new office will open this summer, giving the UK access to Silicon Valley’s tech talent and strengthening ties with the US. (Link)
The EU demands that Microsoft provide internal documents on Bing’s gen AI risks
The Commission suspects Bing may have breached the Digital Services Act (DSA) due to risks like AI “hallucinations,” deep fakes, and potential voter manipulation. Microsoft has until May 27 to comply with the legally binding request for information. Failure to do so could result in fines of up to 1% of Microsoft’s total annual income or worldwide turnover. (Link)
Snapchat CEO Evan Spiegel focuses on AI and ML for better UX and personalization
As its ad revenue increases, Snap plans to expand content offerings, improve recommendation algorithms, and integrate Stories with Spotlight. The company is also investing in augmented reality and sees it as a way to bring people together in shared physical environments. (Link)
Researchers in the Netherlands have developed an AI sarcasm detector
The AI was trained on text, audio, and emotional content from US sitcoms, including Friends and The Big Bang Theory. The AI could detect sarcasm in unlabeled exchanges nearly 75% of the time. Further improvements could come from adding visual cues to the AI’s training data. (Link)
AI Weekly Rundown May 11-May 18th 2024: Major Breaking News from OpenAI ChatGPT-4o, Google AI Veo, Apple AI-enhanced eye tracking, Meta AI-assisted earphones, Stability AI, Humane AI
A Daily chronicle of AI Innovations May 17th 2024: OpenAI introduces improvements to data analysis in ChatGPT GenZ and millennials optimistic about GenAI use at work: Deloitte Survey OpenAI strikes Reddit deal to train its AI on your posts Sony Music warns 700+ tech companies over ‘unauthorized’ use of its content to train AI The first search engine has been resurrected Deepfake scam costs firm behind Sydney Opera $25 million
OpenAI introduces improvements to data analysis in ChatGPT
OpenAI is rolling out enhancements to data analysis:
Upload the latest file versions directly from Google Drive and Microsoft OneDrive. This allows ChatGPT to understand your Google Sheets, Docs, Slides, and Microsoft Excel, Word, and PowerPoint files more quickly
Interact with tables and charts in a new expandable view
Customize and interact with bar, line, pie, and scatter plot charts in the conversation.
Work on tables in real-time
These improvements will be available in OpenAI’s new flagship model, GPT-4o, for ChatGPT Plus, Team, and Enterprise users over the coming weeks.
Why does this matter?
These are exciting updates for those who use ChatGPT for data analysis on documents. They should save users time, give them more options and ease of use, and make data-driven writing easier.
GenZ and millennials optimistic about GenAI use at work: Deloitte Survey
Deloitte’s 2024 Gen Z and Millennial Survey connected with more than 22,800 respondents in 44 countries to explore their attitudes about work and the world around them. One of the key findings was that Gen Zs and millennials are feeling uncertain about GenAI and its potential impact on their careers.
However, respondents who frequently use GenAI at work are more likely to say they feel excitement about, and trust in the technology. Frequent users of GenAI are also more likely to believe it will free up their time, improve the way they work, and improve their work/life balance.
But, conversely, the more a respondent uses GenAI the more likely they are to have some concerns as well.
Why does this matter?
GenAI is about to bring a huge transformation in the way we work and live. Despite concerns about it, Gen Zs and millennials are starting to think about how to adapt, whether or not their employers do.
OpenAI strikes Reddit deal to train its AI on your posts
OpenAI has signed a deal with Reddit to access real-time content from Reddit’s data API, similar to Reddit’s earlier $60 million deal with Google.
The partnership will allow Reddit to implement new AI-powered features and applications while also making OpenAI an advertising partner on the platform.
Details on financial terms or training data were not disclosed, and Reddit users have previously shown resistance to such changes, protesting API pricing alterations in June 2023.
Sony Music warns 700+ tech companies over ‘unauthorized’ use of its content to train AI
Sony Music has sent letters to over 700 AI companies and streaming platforms, warning them against the unauthorized use of its content for AI systems, which it claims denies artists and the label control and compensation.
The music industry has been notably proactive in controlling the usage of its copyrighted material in AI tools, with platforms like YouTube enforcing strict rules for music content while introducing AI tools like Dream Track.
Universal Music Group similarly faced a dispute with TikTok over licensing, leading to a temporary removal of music by major artists like Taylor Swift and Ariana Grande until a new deal was brokered to protect artists’ rights and explore monetization opportunities involving AI.
Deepfake scam costs firm behind Sydney Opera $25 million
A British design firm, Arup, which is known for its work on the Sydney Opera House, lost $25.6 million in a deepfake scam involving a fake CFO and other staff members.
An employee in Arup’s Hong Kong office fell victim to the fraud after realistic deepfake videos convinced them to transfer the money in 15 transactions; the scam was discovered only after cross-checking with the UK office.
Despite the significant financial loss, Arup stated that their financial stability and business operations were unaffected, and the case is currently under police investigation.
Slack AI is training with customer data, but claims it is privacy-driven
To develop AI/ML models, Slack analyzes Customer Data (e.g., messages, content, and files) submitted to Slack, as well as Other Information (including usage information), as defined in its privacy policy and customer agreement. The company assures users that their data won’t be shared across workspaces and offers options to opt out of contributing to global models. (Link)
Hugging Face is committing $10 million worth of compute to help beat big AIs
Hugging Face aims to level the playing field with AI giants like OpenAI by donating shared GPUs to the community through a new program called ZeroGPU. The shared GPUs are accessible to multiple users or applications concurrently, eliminating the need for each user or application to have a dedicated GPU. (Link)
Microsoft to offer its cloud customers an AMD alternative to Nvidia AI processors
Microsoft will give its cloud customers a platform of AMD AI chips that will compete with components made by Nvidia, with details to be given at its Build developer conference next week. It will also launch a preview of its new Cobalt 100 custom processors at the conference. (Link)
Ampere teams up with Qualcomm to launch an Arm-based AI server
The two companies are combining forces to offer an AI-focused server that pairs Ampere’s CPUs with Qualcomm’s Cloud AI 100 Ultra chips for running (not training) models. (Link)
Gannett is adding AI-generated summaries to the top of its articles
Gannett, the media company that owns hundreds of newspapers in the US, is launching a new program that adds AI-generated bullet points to the top of news stories. The summaries appear to be live already and are aimed at streamlining the reporting process and elevating the audience experience. (Link)
A Daily chronicle of AI Innovations May 16th 2024: Microsoft’s AI ambitions lead to 30% rise in emissions Stability AI reportedly runs out of money and in negotiations to sell Apple unveils AI-enhanced eye tracking, music haptics, voice shortcuts Humane’s AI Pin upgraded with GPT-4o Stability AI seeks buyer amid financial woes Mozilla Firefox incorporates NVIDIA RTX video to improve video quality on Windows Musk’s AI firm xAI In talks with Oracle for $10 billion cloud deal Snowflake invests in Metaplane to enhance data quality 😱 Ilya Sutskever and Jan Leike leave OpenAI 📱 Android phones enter the AI era
Microsoft’s AI ambitions lead to 30% rise in emissions
Microsoft’s commitment to being carbon-negative by 2030 is threatened by a 30% increase in its carbon emissions, largely due to its growing focus on AI technologies.
The company’s expansion of AI operations necessitates building more data centers, which rely on energy-intensive materials like steel, concrete, and microchips, further complicating their carbon reduction goals.
Despite its renewable energy claims, Microsoft is known to rely on instruments like renewable energy credits (RECs), which can mask actual carbon emissions, indicating a challenging path ahead to meet its sustainability targets.
Stability AI reportedly runs out of money and in negotiations to sell
Stability AI, known for its open-source image generator Stable Diffusion, is negotiating a sale due to financial struggles, despite a valuation of $1 billion.
The company raised over $100 million in 2022 but struggled to turn its free software into revenue while facing stiff competition from rivals like OpenAI and Midjourney.
Founder and CEO Emad Mostaque resigned amid management issues, and the company faces significant expenses and liabilities, projecting $153 million in costs against only $11 million in revenue for 2023.
Apple unveils AI-enhanced eye tracking, music haptics, voice shortcuts
Apple recently unveiled new accessibility features that will launch later this year. One of these features is Eye Tracking, which enables users with physical disabilities to navigate their iPhones and iPads using only their eyes.
By leveraging the front-facing camera and AI, Eye Tracking allows precise control through a “Dwell Control” feature, eliminating the need for additional hardware.
Another noteworthy addition is Music Haptics, designed for users who are deaf or hard of hearing. When enabled, the Taptic Engine in iPhones plays taps, textures, and refined vibrations synchronized with music audio.
Additionally, Apple has introduced Voice Shortcuts, which allow users to create custom voice commands for specific actions on their devices. With Voice Shortcuts, users can perform tasks hands-free simply by speaking their personalized commands.
Why does this matter?
This would encourage the development and advancement of AI technologies in the accessibility domain, ultimately benefiting individuals with disabilities across different platforms and devices.
Humane’s AI Pin upgraded with GPT-4o
Humane has upgraded its AI Pin with the GPT-4o model. The new version runs faster, provides more accurate answers, and offers richer information.
Why does this matter?
As more sophisticated AI models become accessible, we can expect further improvements in various applications, from digital assistants to natural language understanding and generation.
😱 Ilya Sutskever and Jan Leike leave OpenAI
OpenAI co-founder and chief scientist Ilya Sutskever announced that he is leaving the company, following months of speculation about his role after the November 2023 ousting of Sam Altman.
Sutskever said he is confident that OpenAI will ‘build AGI that is both safe and beneficial’ under the current leadership.
Also leaving is Sutskever’s superalignment group co-lead Jan Leike, announcing his departure with a cryptic resignation post.
The news follows months of departures largely coming from OpenAI’s superalignment and safety teams, fueling speculation.
OpenAI CEO Sam Altman named Jakub Pachocki as the new chief scientist, a key researcher on the creation of GPT-4.
After months of tensions, the speculation around Ilya’s future with Sam is finally put to rest. But questions surrounding the safety team departures still remain. Also important to watch is where Sutskever and Leike land next, with two of AI’s brightest minds now officially on the market.
📱 Android phones enter the AI era
Google announced a host of new AI integrations coming to Android phones at its I/O Developer Conference, bringing its powerful Gemini model on-device to enable upgraded smartphone experiences.
Google’s Gemini Nano model will be integrated into the Pixel later this year, allowing for enhanced multimodal capabilities.
Gemini features will be easily accessible through a new overlay that understands on-screen context to provide dynamic suggestions.
A Circle to Search feature, allowing users to query anything on screen, gains homework help features via a LearnLM model.
Google is also bringing a new AI security feature soon, providing real-time alerts on calls that appear to be scams.
While everyone awaits the iPhone AI announcements, Google’s Android AI era is rolling. With Gemini coming directly on-board, the potential is there — but if Apple integrates ChatGPT, it may remain tough sledding against the dominant market leader.
What Else Is Happening in AI on May 16th 2024
Instagram co-founder joins Anthropic as head of product
Anthropic has appointed Mike Krieger, the co-founder of Instagram and Artifact, as head of product. In this pivotal role, Krieger will steer the company’s product engineering, management, and design efforts as Anthropic expands its suite of AI apps and aims to bring Claude to a wider audience. (Link)
IBM plans 800 new jobs in Ireland
IBM plans to hire up to 800 more people in Ireland to help build advanced software underpinned by AI technology. This is the biggest job announcement in the country’s multinational sector this year. The roles will be in research and development, digital sales, and consulting. (Link)
Mozilla Firefox incorporates NVIDIA RTX video to improve video quality on Windows
Mozilla Firefox has incorporated NVIDIA RTX Video, a technology that leverages AI to improve video quality on Windows PCs and workstations. This brings a host of AI-powered enhancements to online videos, including upscaling and HDR, for a more immersive viewing experience. Users can easily access these features by updating GeForce RTX drivers and activating Windows HDR. (Link)
Musk’s AI firm xAI in talks with Oracle for $10 billion cloud deal
Elon Musk’s AI startup, xAI, is in talks with Oracle executives about a $10 billion deal to rent cloud servers from Oracle over several years. This move comes as Musk’s company seeks to keep pace with rivals like OpenAI, Microsoft, and Google to dominate the AI field. (Link)
Snowflake invests in Metaplane to enhance data quality
Snowflake, a data cloud platform, has invested in Metaplane, a Boston-based startup. Metaplane focuses on helping enterprises identify and rectify data quality issues using an end-to-end AI-powered platform. The investment aims to enhance integration between Metaplane’s data observability offering and the Snowflake data cloud. (Link)
A Daily chronicle of AI Innovations May 15th 2024: Google unveiled the ‘future of AI’ at I/O event OpenAI chief scientist Ilya Sutskever is leaving Former Cruise CEO starts robotics firm with ex-Tesla AI manager Google announced a wide array of updates across its AI ecosystem Ilya Sutskever, OpenAI’s co-founder and chief scientist, has left the company Expedia is launching AI features, including a bot named Romi NVIDIA and Recursion have collaborated to build BioHive-2, an AI supercomputer for drug discovery NASA appoints David Salvagnini, its first chief AI officer SoftBank plans to invest $75-$150 million in Indian data center and industrial robotics Microsoft announces €4 billion investment in France to accelerate AI adoption 🎥 Google’s Gemini updates and Sora competitor🔎 Google’s AI agents and AI search upgrades
🎥 Google’s Gemini updates and Sora competitor
Google just kicked off its I/O Developer’s Conference, announcing a wide array of updates across its AI ecosystem — including enhancements across its flagship Gemini model family and a new video generation model to rival OpenAI’s Sora.
Gemini model updates:
New updates to 1.5 Pro include a massive 2M context window extension and enhanced performance in code, logic, and image understanding.
Gemini 1.5 Pro can also use the long context to analyze a range of media types, including documents, videos, audio, and codebases.
Google announced Gemini 1.5 Flash, a new model optimized for speed and efficiency with a context window of 1M tokens.
Gemma 2, the next generation of Google’s open-source models, is launching in the coming weeks, along with a new vision-language model called PaliGemma.
Gemini Advanced subscribers can soon create customized personas called ‘Gems’ from a simple text description, similar to ChatGPT GPTs.
Video and image model upgrades:
Google revealed a new video model called Veo, capable of generating 1080p videos over 60 seconds long from text, image, and video prompts.
The new Imagen 3 text-to-image model was also unveiled, with better detail, text rendering, and natural language understanding than its predecessor.
Google also previewed the VideoFX text-to-video tool, featuring storyboard scene-by-scene creation and the ability to add music to generations.
VideoFX is launching in a ‘private preview’ in the U.S. for select creators, while ImageFX (with Imagen 3) is available to try via a waitlist.
Why it matters: Gemini’s already industry-leading context window gets a 2x boost, enabling endless new opportunities to utilize AI with massive amounts of information. Additionally, Sora officially has competition with the impressive Veo demo — but which one will make it to public access first?
🔎 Google’s AI agents and AI search upgrades
Google just showcased its new AI agent project ‘Project Astra’, alongside a slew of updates to infuse AI across search and enable Gemini to reason and take more advanced actions for users.
Progress on AI agents:
Google announced Project Astra, a real-time AI agent prototype that can see, hear, and take actions on a user’s behalf.
The demo showcased a voice assistant responding to what it sees and hears, including code, images, and video — capable of advanced reasoning and recall.
Public access for Astra is expected through the Gemini app later this year.
Google also showed off ‘AI teammates’, agents that can answer questions on emails, meetings, and other data within Workspace.
Gemini Live is also rolling out in the coming months, allowing users to converse with Gemini by voice in near real-time.
Search upgrades:
Google Search now features expanded AI Overviews, advanced planning capabilities, and AI-organized search results.
Gemini will be able to execute more complex planning, such as creating, maintaining, and updating trip itineraries.
Search will also receive ‘multi-step reasoning’ capabilities, allowing Gemini to break down questions and speed up research.
Users can also now ask questions with video, allowing Search to analyze visual content and provide helpful AI Overviews.
Why it matters: We officially have a new voice assistant battle — with OpenAI and Google both showcasing mind-blowing new capabilities in just the last two days alone. Also, despite rumblings of an OpenAI search product and excitement over platforms like Perplexity, it’s going to be difficult to unseat the king of search. Especially as they integrate advanced AI across the entire ecosystem in an impressive fashion.
Former Cruise CEO starts robotics firm with ex-Tesla AI manager
Former Cruise CEO Kyle Vogt, who resigned in November following a self-driving car accident, has launched a new robotics firm called The Bot Company with former Tesla AI Manager Paril Jain.
The Bot Company aims to develop robots that handle everyday chores to give people more free time, and has already raised $150 million from investors.
The announcement of The Bot Company comes amid significant challenges for Cruise, which had its self-driving operations suspended by the California DMV following a pedestrian accident.
Google announced a wide array of updates across its AI ecosystem
Major headlines were Project Astra, an AI agent that can see and hear users in real time, and AI teammates that can answer questions and organize data within Workspace.
Google also introduced Veo, a high-quality video generation model, and significant updates to Google Search, including expanded AI Overviews, advanced planning capabilities, and AI-organized search results.
The company launched Gemini 1.5 Pro, with its context window expanded to a massive 2 million tokens, and Imagen 3, the next version of its text-to-image model. Other notable announcements included:
Gemma 2 updates: New 27-billion-parameter model launching in June
Google Play: New app discovery feature and developer tools
Android feature to detect potential scams during calls using Gemini Nano
Ask Photos: AI-powered search in Google Photos using natural language queries
Gemini in Gmail for searching, summarizing, and drafting emails
Gemini 1.5 Pro: Increased input capacity to 2 million tokens
Gemini Live: In-depth voice chats with Gemini on smartphones
Gemini Nano: Smallest AI model built into Chrome desktop client
Gemini on Android: Deep integration with Android OS and Google apps
Gemini on Google Maps: Generative AI summaries for places and areas
Tensor Processing Units (TPU): Sixth-generation Trillium chips with 4.7x performance boost
Project IDX: AI-centric browser-based development environment in open beta
Circle to Search: AI-powered feature for instant answers using gestures on Android
Pixel 8a: New smartphone with Tensor G3 chip, starting at $499
Pixel Tablet: Google’s tablet is now available with or without the charging dock
With 22 announcements, Google delivered an overwhelming amount of news. By integrating AI across its vast ecosystem, Google aims to provide users with more personalized and innovative experiences. But Project Astra and Veo are not yet available to the public. That may frustrate users, especially compared with OpenAI, which launches new products that customers can try immediately.
Ilya Sutskever, OpenAI’s co-founder and chief scientist, has left the company
Ilya Sutskever, OpenAI’s co-founder and longtime chief scientist, has left the company. OpenAI CEO Sam Altman announced the news on X, expressing his sadness and gratitude for Sutskever’s contributions. Jakub Pachocki, OpenAI’s research director, will step up as chief scientist.
Sutskever’s departure comes amidst reports of disagreements with Altman over OpenAI’s direction, mainly concerns about rushing AI-powered product launches at the expense of safety. The situation escalated last November when Sutskever and OpenAI CTO Mira Murati approached the company’s previous board to express their concerns, leading to a brief attempt to fire Altman. However, Altman was swiftly reinstated, and much of the old board resigned.
Why does it matter?
As one of the most accomplished minds in AI, Sutskever’s departure raises questions about the future of OpenAI’s approach to AI development and safety. The incident also highlights the growing tensions within the AI industry between rapidly commercializing AI technologies and the need for responsible development and deployment practices.
Expedia is launching AI features, including a bot named Romi
It helps users search for hotels, build itineraries, and make changes via iMessage and WhatsApp. The company is also introducing smart search, allowing travelers to find hotels based on desired features. Expedia pulls data from AccuWeather and Yelp to tailor search results and provide real-time updates. (Link)
NVIDIA and Recursion have collaborated to build BioHive-2, an AI supercomputer for drug discovery
BioHive-2 features 504 NVIDIA H100 GPUs, delivering 2 exaflops of AI performance, nearly 5 times faster than its predecessor. Powered by Recursion’s massive 50-petabyte dataset and AI models like Phenom, these companies aim to simulate biology and fast-track the identification of promising drug candidates. (Link)
NASA appoints David Salvagnini, its first chief AI officer
The move aims to ensure NASA stays ahead in AI advancement and responsible use. Salvagnini will lead NASA’s efforts in guiding the responsible use of AI and collaborating with other agencies, academia, and industry partners. (Link)
SoftBank plans to invest $75-$150 million in Indian data center and industrial robotics
The move aligns with the Japanese tech giant’s global strategy to capitalize on the power of artificial intelligence as it shifts focus from e-commerce and fintech to high-growth industries like logistics and robotics. The potential investments mark the end of a two-year hiatus in SoftBank’s deal-making activities in India. (Link)
Microsoft announces €4 billion investment in France to accelerate AI adoption
The tech giant will expand its cloud and AI infrastructure, launch skilling programs, and accelerate French startups through its new Microsoft GenAI Studio, positioning France as a leader in the AI revolution. It aims to train 1 million people and support 2,500 startups by 2027. (Link)
AI TRAINING: 🤖 Use ChatGPT’s highlighting for context
ChatGPT now allows you to highlight parts of its responses for quick follow-up questions, partial response rewrites, reusing old context, and more.
Step-by-step:
Prompt ChatGPT and generate a response.
Highlight relevant parts of the response you want to follow up on and click the double quote icon above the highlighted text.
The highlighted text will be automatically added in the next prompt so you can ask for clarification, rewrites, counterpoints, and more.
Google I/O just ended, and a lot of announcements were made. Gemini 1.5 Pro will double its context window from one million to two million tokens, and a new model called Gemini Flash was announced, optimized for speed and efficiency.
The company launched Astra, a multimodal AI assistant for everyday life. It can process text, video, and audio in real time. In a video, Google showed Astra identifying speakers, crayons and other objects in response to a camera image and voice commands.
Google also unveiled its latest AI models for creating media content: Veo, for creating 1080p videos, and Imagen 3, for generating images from text descriptions.
Ilya Sutskever, OpenAI’s co-founder and chief scientist, is officially leaving the company. He was involved in the failed attempt to remove CEO Sam Altman, a stance he subsequently reversed.
Sam Altman announced that Jakub Pachocki, who has led significant projects such as GPT-4 and OpenAI Five, will take over as the new Chief Scientist at OpenAI, ensuring the company’s continued progress towards its mission.
Jan Leike, who has been leading the Superalignment team aimed at controlling more powerful AI, has also resigned, with his responsibilities now being taken over by OpenAI co-founder John Schulman.
A Daily chronicle of AI Innovations May 14th 2024: GPT-4o: 2x faster, 50% cheaper, 5x rate limits than GPT-4 Turbo 🗣️ OpenAI unveils GPT-4o and new voice capabilities TikTok tests ChatGPT-powered search results Meta explores AI-assisted earphones with built-in cameras Anthropic launches Claude in Europe in the hope of raising more money UAE launches a new AI model to compete with big tech Japan team builds Gen AI foundation model using supercomputer Fugaku Microsoft Places finds the best time for your next office day U.S. and China to meet in Geneva to discuss AI risks 🎧 Meta developing AI-powered ‘Camerabuds’ Meta’s AI system ‘Cicero’ learning how to lie, deceive humans
🗣️ OpenAI unveils GPT-4o and new voice capabilities
OpenAI just unveiled GPT-4o, a new advanced multimodal model that integrates text, vision and audio processing, setting new benchmarks for performance – alongside a slew of new features.
The new model:
GPT-4o provides improved performance across text, vision, audio, coding, and non-English generations, smashing GPT-4T’s performance.
The new model is 50% cheaper to use, has 5x higher rate limits than GPT-4T, and boasts 2x the generation speed of previous models.
The new model was also revealed to be the mysterious ‘im-also-a-good-gpt2-chatbot’ found in the LMSYS Chatbot Arena last week.
Voice and other upgrades:
New voice capabilities include real-time responses, detecting and responding with emotion, and combining voice with text and vision.
The demo showcased feats like real-time translation, two AI models analyzing a live video, and using voice and vision for tutoring and coding assistance.
OpenAI’s blog also detailed advances like 3D generation, font creation, huge improvements to text generation within images, sound effect synthesis, and more.
OpenAI also announced a new ChatGPT desktop app for macOS with a refreshed UI, integrating directly into computer workflows.
Free for everyone:
GPT-4o, GPTs, and features like memory and data analysis are now available to all users, bringing advanced capabilities to the free tier for the first time.
The GPT-4o model is currently rolling out to all users in ChatGPT and via the API, with the new voice capabilities expected to arrive over the coming weeks.
OpenAI unveiled its new flagship model, GPT-4o (“o” stands for “omni”). It provides GPT-4-level intelligence but is 2x faster, 50% cheaper, has 5x higher rate limits, and offers enhanced text, voice, and vision capabilities compared with GPT-4 Turbo. It also matches GPT-4 Turbo performance on text in English and code, with significant improvements for text in non-English languages.
GPT-4o can respond to audio inputs in as little as 232 ms, with an average of 320 ms, which is similar to human response time in a conversation. It is available in the ChatGPT free tier and to Plus users.
Why does it matter?
GPT-4o is a step towards much more natural human-computer interaction. It accepts any combination of text, audio, and image as input and generates any combination of text, audio, and image outputs. GPT-4o will set a new benchmark for AI capabilities and pave the way for more intelligent and accessible AI systems.
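To make the “any combination of text and image as input” idea concrete, here is a minimal sketch of how a combined text-and-image request to GPT-4o can be structured via OpenAI’s Chat Completions message format. The code only builds the request payload (sending it would require the `openai` client and an API key); the example URL is a placeholder.

```python
# Sketch: building a multimodal (text + image) request body for GPT-4o.
# This only constructs the payload; to send it, pass it to the official
# OpenAI client, e.g. client.chat.completions.create(**payload).

def build_gpt4o_request(question: str, image_url: str) -> dict:
    """Combine a text question and an image in a single chat message."""
    return {
        "model": "gpt-4o",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }

payload = build_gpt4o_request(
    "What is shown in this chart?",
    "https://example.com/chart.png",  # placeholder image URL
)
print(payload["model"])  # gpt-4o
```

The key point is that text and image parts travel together in one message’s `content` list, which is what lets the model reason over both modalities in a single turn.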
TikTok is testing a new search feature that uses AI to generate results for some queries. For certain search terms, like recipes or product recommendations, TikTok will display AI-generated answers at the top of the results. These answers are powered by ChatGPT, and when you click on one of these AI results, it takes you to a page with a full response.
TikTok is also testing “search highlights,” which appear at the top of some searches. However, it’s unclear whether these highlights are AI-generated.
Why does it matter?
It could make search on TikTok more authoritative and reliable by providing factual answers from AI models like ChatGPT. This could greatly impact how people use TikTok for search and information gathering, thereby positioning TikTok more directly as a search engine competitor to Google, catering to younger users’ preferences.
Meta explores AI-assisted earphones with built-in cameras
Meta is currently exploring AI-powered earphones with built-in cameras. The project is known internally as “Camerabuds,” though it’s uncertain whether the final product will be in-ear earbuds or over-ear headphones. The cameras let the AI take in surrounding information visually, allowing it to translate foreign languages, identify objects, and more.
CEO Mark Zuckerberg has reportedly seen several possible designs but has not been satisfied with them. Outside of design, the major hurdle is fitting a camera and batteries in a tiny device like an earbud.
Why does it matter?
For the AI wearable space, Meta’s development hints at a growing interest in AI-infused devices, potentially paving the way for more sophisticated and interactive wearables. Regarding earphones, this development suggests a potential shift towards more intelligent and feature-rich earphone designs, offering users enhanced functionalities like real-time translation and object identification.
Meta is reportedly in the early stages of developing AI-powered earphones, known internally as “Camerabuds,” — aiming to compete with OpenAI and Apple as tech giants rush to infuse AI into wearable devices.
‘Camerabuds’ would map user surroundings, capable of identifying objects and translating foreign languages using built-in cameras.
Meta already has its AI-powered Ray Ban smart glasses, while OpenAI and Apple are also exploring similar AI wearable earbud tech.
Potential challenges include bulkiness, heat generation, and privacy concerns, especially for users with long hair that might obstruct the cameras.
Despite Meta’s shaky track record with hardware ventures, Mark Zuckerberg is investing heavily in a future that he believes includes AI embedded into every device. But will standalone devices like this be able to win over users if and when a fully AI-integrated phone hits the market?
Meta’s AI system ‘Cicero’ learning how to lie, deceive humans
Meta’s AI, Cicero, has learned to lie and deceive humans, excelling in the strategy game Diplomacy by betraying alliances.
The study, published in the journal Patterns, indicated that several AI systems, including Cicero, AlphaStar, and Pluribus, leverage deception to improve performance in competitive tasks and games.
Experts warned that advanced AI’s deceptive abilities pose increasing risks to society, suggesting that deceptive AI systems should be classified as high risk if outright banning them is not feasible.
Anthropic launches Claude in Europe in the hope of raising more money
Anthropic launches Claude in Europe with support for “multiple languages,” including French, German, Italian, and Spanish across Claude.ai, its iOS app, and its business plan for teams. The launch comes after Anthropic extended its API to Europe for developers. Both moves are part of its bigger push to raise more money for rapid growth. (Link)
UAE launches a new AI model to compete with big tech
Abu Dhabi’s Technology Innovation Institute (TII) released Falcon 2 series: Falcon 2 11B, a text-based model, and Falcon 2 11B VLM, a vision-to-language model that can generate a text description of an uploaded image. This Gen AI model could compete with some of the biggest technology companies and make UAE a strong candidate in the Gen AI space. (Link)
Japan team builds Gen AI foundation model using supercomputer Fugaku
Tokyo Institute of Technology researchers and others have announced the development of an LLM that can serve as a foundation for Gen AI using the Japanese supercomputer Fugaku. Trained extensively on Japanese data, which accounts for 60% of the total training data, the Fugaku-LLM model is expected to lead research on Gen AI tailored to domestic needs. (Link)
Microsoft Places finds the best time for your next office day
Microsoft has launched a new app called Microsoft Places that uses AI to help employees and managers coordinate their in-office days and book meeting rooms. The application integrates with Outlook and Microsoft Teams, allowing employees to view and share their office plans and get AI-powered suggestions on which days they should come in. (Link)
U.S. and China to meet in Geneva to discuss AI risks
The U.S. and China will hold their first formal government-to-government discussion on the risks of artificial intelligence in Geneva. The U.S. side will be led by officials from the National Security Council and State Department, while the Chinese side will be led by officials from the Foreign Ministry and the National Development and Reform Commission. (Link)
OpenAI is set to demo new features and updates to ChatGPT and GPT-4 today at 10 AM PT, with new speculation including a ‘Her’ style voice assistant with both audio and visual capabilities.
According to The Information, OpenAI’s demo will include a virtual assistant with visual AND audio understanding.
The report also claims the new reveal might have the ability to make ‘existing voice assistants like Siri more useful.’
CEO Sam Altman shot down rumors of a new search engine competitor and GPT-5, but said the reveal is something that ‘feels like magic’.
Additional speculation includes the ability to initiate and receive phone calls inside of ChatGPT.
Apple and OpenAI are also reportedly ‘closing in’ on a deal to incorporate ChatGPT into iOS 18.
OpenAI releases GPT-4o, a faster model that’s free for all
OpenAI announced the launch of GPT-4o, an improved version of its GPT-4 model, promising faster performance and enhanced capabilities in text, vision, and audio.
GPT-4o is freely available to all ChatGPT users, with paid users benefiting from up to five times the capacity limits of free users. In OpenAI’s API, GPT-4o is twice as fast as GPT-4 Turbo, 50% cheaper, and has higher rate limits.
The model is described as “natively multimodal,” capable of generating content or interpreting commands across voice, text, or images.
Microsoft Places uses AI to find the best time for your next office day
Microsoft has launched a new AI-powered app called Microsoft Places, designed to facilitate better in-office coordination for hybrid and flexible work setups by integrating with Microsoft 365.
The app allows users to plan and share their office days, integrates location data into Outlook calendars for scheduling, and is set to include features in Microsoft Copilot for automated scheduling recommendations and adjustments.
Microsoft Places will also offer a space analytics feature for real estate management, improve room booking capabilities in Outlook, and feed into Teams to enhance real-time location awareness among colleagues.
Wikipedia’s new Chrome extension fact-checks the web with ChatGPT
The Wikimedia Foundation has developed an experimental Chrome extension named “Citation Needed” that utilizes ChatGPT and Wikipedia to check the veracity of online content.
By integrating ChatGPT, the extension cross-references user-selected text with Wikipedia entries and citations, offering feedback on the accuracy and reliability of the information.
This initiative aims to gauge public interest in factual verification using Wikipedia, explore Wikipedia’s credibility as a source, and assess the potential of AI in enhancing content verification processes.
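The general pattern behind a tool like this — pair a user-selected claim with a retrieved Wikipedia passage and ask an LLM to judge support — can be sketched as follows. The prompt wording and function names below are illustrative assumptions, not taken from the actual “Citation Needed” extension, and the LLM call itself is left out.

```python
# Illustrative sketch of the fact-check pattern described above: combine a
# user-selected claim with a Wikipedia passage into a prompt that an LLM
# (e.g., ChatGPT) could answer. Names and wording are hypothetical.

def build_fact_check_prompt(claim: str, wikipedia_passage: str) -> str:
    """Assemble a verification prompt from a claim and a reference passage."""
    return (
        "You are a fact-checking assistant.\n"
        f"Claim: {claim}\n"
        f"Wikipedia passage: {wikipedia_passage}\n"
        "Answer with SUPPORTED, CONTRADICTED, or NOT ENOUGH INFO, "
        "and cite the sentence you relied on."
    )

prompt = build_fact_check_prompt(
    "Mount Everest is the tallest mountain on Earth.",
    "Mount Everest is Earth's highest mountain above sea level.",
)
print("Claim:" in prompt)  # True
```

In a real extension, the passage would come from a Wikipedia lookup keyed to the selected text, and the model’s verdict plus cited sentence would be rendered back in the browser.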
Anthropic just launched a new Prompt Generator tool for its business and API users, helping to automatically craft optimal prompts via natural language when completing tasks with its Claude models.
The generator leverages advanced prompt techniques like chain-of-thought reasoning for more ‘effective, precise, and reliable’ outputs.
Console users can also test prompt performance via dynamic variable insertion, optimizing prompts based on various situations.
Anthropic released a Prompt Library earlier this year, featuring a range of optimized prompts that users can copy and paste.
While ‘Prompt Engineer’ was a popular term thrown around as a potential future job, the reality is that AI can help simplify the task with optimal prompts that it creates on its own. While Anthropic’s tool is only on the API side for now, it’s only a matter of time before similar features make their way to the full consumer side.
Sam Altman quashes search engine launch rumors
OpenAI’s CEO, Sam Altman, denied rumors of launching a search engine ahead of Google I/O, stating that the upcoming announcement is neither related to GPT-5 nor a search engine. He described the forthcoming revelation as something that “feels like magic.”
However, there are reports that OpenAI is developing an AI voice assistant.
Why does this matter?
OpenAI has a track record of pushing the boundaries of AI capabilities, so any new development from them is likely to attract attention and disrupt the status quo.
As for reports of OpenAI entering into the voice assistant domain, if true, we might see OpenAI disrupt one more niche. With companies like Amazon (Alexa), Google (Google Assistant), and Apple (Siri) dominating the market, the competition might be stiff this time around.
OpenAI is reportedly close to a deal to bring ChatGPT to iPhones. The integration is expected to arrive with the upcoming iOS 18. This could significantly enhance the user experience and expand the capabilities of Apple’s virtual assistant, Siri.
Parallel to negotiations with OpenAI, the tech giant is also engaging with Google’s parent company, Alphabet, to integrate the Gemini chatbot into Apple’s devices. These ongoing talks, however, haven’t led to an agreement yet.
Why does this matter?
Integrating ChatGPT into iOS 18 iPhones could bolster Siri’s capabilities, offering users a more versatile AI assistant experience. With ChatGPT handling various queries directly within iOS, users won’t need to rely on third-party apps for assistance, streamlining their interactions and enhancing their overall experience.
SoftBank’s Arm Holdings set to launch AI chips in 2025
SoftBank’s subsidiary Arm plans to develop AI chips and launch them in 2025. Arm is establishing a dedicated AI chip division, targeting a prototype by spring 2025 and mass production by autumn. SoftBank is expected to help fund the initiative, part of a broader investment plan reportedly worth up to 10 trillion yen ($64 billion).
The AI chip business could be spun off to SoftBank. Negotiations with Taiwan Semiconductor Manufacturing Corp (TSMC) and others for production capacity are underway. The move reflects rising demand for AI chips, which has boosted Arm’s share price since its IPO in September.
Why does this matter?
As AI applications expand across industries, there’s a demand for efficient and optimized hardware solutions. Arm’s entry into the AI chip market could disrupt the current landscape dominated by companies like NVIDIA.
Head over to HeyGen’s website and sign up for free.
Click where it says ‘AI Studio’ on the left bar of the Dashboard.
Choose a given template by selecting ‘Templates’ or create one from scratch by pressing ‘Create with AI Studio’.
In the Studio, you can add and modify any part of your presentation. For example, you can add text and images, change the background, select an AI avatar and its voice, create a script using AI, and more.
When your video is ready, press “Submit” and check out the final video!
MIT researchers just published a new study detailing AI’s ability to engage in deceptive behaviors like lying, bluffing, and disguising true capabilities in testing.
The analysis found numerous examples of AI deception, such as Meta’s Cicero Diplomacy bot telling premeditated lies to players.
Other cases included negotiation systems misrepresenting preferences, AI bluffing in poker, and ‘playing dead’ to fool safety tests.
Growing deceptive capabilities could influence societal dangers like election tampering, fraud, and general control over AI systems.
The combination of AI’s superhuman persuasiveness and its growing ability to engage in deception is a reminder of a potential dark side to the tech’s advancement. As these qualities emerge, the importance of reliably aligning and better understanding model behavior only grows.
Anthropic now allows minors to use its generative AI systems through third-party apps. Developers must implement safety measures, including age verification and content moderation. It also plans to periodically audit these apps for compliance with child safety and data privacy regulations, including the Children’s Online Privacy Protection Act (COPPA). (Link)
SoundHound AI and Perplexity partner to enhance voice assistants for cars and IoT devices
Perplexity’s online LLM capabilities have been integrated into SoundHound’s Chat AI, enabling conversational responses with real-time knowledge sourced from the web. This collaboration aims to provide more accurate and comprehensive answers to user queries. (Link)
U.K. agency releases tools for testing AI model safety
The toolset, called Inspect, is an open-source platform that assists industry, research organizations, and academia in evaluating the capabilities of AI models, such as core knowledge and reasoning abilities. It aims to strengthen AI safety and is the first platform of its kind backed by a state-supported body. (Link)
IBM Open-Sources Granite Code Models for Easier Coding
IBM Research has recently released a family of Granite code models to the open-source community. These models aim to make coding easier for developers by assisting in writing, testing, debugging, and shipping reliable software. They find use in diverse scenarios, from application modernization to memory-constrained environments like IBM’s WCA for Z. Performance evaluations show superiority over alternatives like Mistral-7B and LLama-3-8B. (Link)
Hong Kong University brings Albert Einstein to life using AI as a virtual lecturer
The Hong Kong University of Science and Technology (HKUST) is experimenting with AI-generated lecturers, featuring personalities like Albert Einstein and even popular anime characters. The goal is to address the global teacher shortage while embracing AI’s transformative power in education. (Link)
🧠 Google’s new Med-Gemini-3D model can read 3D scans like head CTs, answer questions about them, and even write radiology reports.
Google Health’s Med-Gemini-2D model demonstrates improved performance in writing chest X-ray radiology reports, exceeding previous state-of-the-art results by up to 12%. 🔬 The models cover a range of 2D medical images, including X-rays, histopathology slides, and dermatology photos. 🔮 Med-Gemini-Polygenic is Google Health’s first language model capable of predicting health outcomes based on genomic data.
Importantly, the models are trained using free-text reports that are freely available for most medical imaging, almost eliminating the need for costly expert labeling. Google hopes this will make it easier, faster, and cheaper to train high-quality medical AI models than has been possible in the past.
Google believes this research is a promising step forward in medical AI: the ability to interpret complex multimodal medical data is a crucial step toward enabling high-quality personalized care at scale. These are early results, and further research is required before safe clinical use. Those interested in joining the effort can get in touch by filling in the form here: https://shorturl.at/aejpE
New model. It’s called GPT-4o (terrible name) and has GPT-4-level intelligence. It’s 2x faster. 50% cheaper. 5x higher rate limits (compared to GPT-4-Turbo). 💡 What everyone is focused on: it’s cheaper! And faster! And with API access! 👀 What I am focused on: free users just got a mega performance boost which will likely reduce churn and the model architecture is a WINNER and worthy of attention – they went from three models to one; according to Andrej Karpathy, former OpenAI, they have built a “combined text-audio-vision model that processes all three modalities in one single neural network.”
——————————————
GPTs for all. Now every user, even free users, can access the “mini task bot” GPTs. 💡 What everyone is focused on: yay now everyone can use the GPT I built! 👀 What I am focused on: completely new user base, tens of millions new users testing and breaking capabilities
——————————————
More Voice. Now way more real-time (previously there was a 2-3 second lag). You can interrupt it mid-sentence. The voice assistant “picks up on emotion” (like fast breathing). I hate that phrase and would rather call it speech nuances. Fast multi-language translation. Performance improved for 50 languages (97% of the world’s population). 💡 What everyone is focused on: wow it sounds like Scarlett Johansson! 👀 What I am focused on: I’m already talking to ChatGPT Voice every morning. This is going to massively increase voice-first experiences. I think office spaces need to think about this asap. Think about the acoustics and EVERYONE talking to an AI assistant at once. It’s already an issue on my team.
——————————————
Vision on desktop. Now the desktop version can “see” your screen—only when you permission it to, not all the time. Sort of like generative AI alt text + chat. So you can ask it to describe a graph on your screen or presumably ask it questions about an article on your screen without a big lift. 💡 What everyone is focused on: lots of privacy concerns (I agree) and why do we need voice for code 👀 What I am focused on: HOLY MOLY THIS IS THE WINNING FEATURE. It’s basically a coworker on screen share with you 24/7, with no fatigue. I can imagine people working for hours straight with this on.
——————————————
Rollouts over the next few weeks.
If you like voice features (like talking to Siri but smarter), upgrade to Plus when it releases.
♻️ repost so others are in the know 🤖 follow Allie K. Miller to stay on top of AI
A Daily chronicle of AI Innovations May 11th 2024: Weekly Summary – DrEureka can automate robot training using LLMs Free AI model rivals GPT-4 in language model evaluation X introduces Stories feature powered by Grok AI Apple is developing its own AI chip for data center servers Stack Overflow and OpenAI have announced an API partnership Microsoft is developing a new AI language model OpenAI’s new tool detects 98% of DALL-E 3 generated images Meta expands AI-powered creativity tools for advertisers OpenAI’s ‘Media Manager’ will let creators opt out of AI training Microsoft developed a secretive AI service for US spies Google DeepMind and Isomorphic Labs introduce AlphaFold 3 OpenAI’s Model Spec shares how it teaches its models to behave Microsoft-LinkedIn study reveals rapid AI adoption in workplace & hiring Stability AI launches Stable Artisan, a Discord bot for image & video ElevenLabs develops an AI model to generate song lyrics
DrEureka can automate robot training using LLMs
In robotics, one of the biggest challenges is transferring skills learned in simulation to real-world environments. NVIDIA researchers have developed a groundbreaking algorithm called DrEureka that uses LLMs to automate the design of reward functions and domain randomization parameters—key components in the sim-to-real transfer process.
The algorithm works in three stages: first, it creates reward functions with built-in safety instructions; then, it runs simulations to determine the best range of physics parameters; finally, it generates domain randomization configurations based on the data gathered in the previous stages.
When tested on various robots, including quadrupeds and dexterous manipulators, DrEureka-trained policies outperformed those designed by human experts.
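The three stages above can be sketched as a simple pipeline. This is a high-level illustration only: the LLM calls are stubbed out with placeholder returns, and the function names and parameter ranges are assumptions for the example, not drawn from the actual DrEureka codebase.

```python
# High-level sketch of a DrEureka-style three-stage pipeline.
# Each stage is a stub standing in for LLM-driven design or simulation.

def design_reward(task: str) -> str:
    # Stage 1: an LLM drafts a reward function with safety instructions baked in.
    return f"reward_fn_for({task})_with_safety_terms"

def probe_physics_ranges(reward_fn: str) -> dict:
    # Stage 2: run simulations under the reward to find workable physics ranges
    # (illustrative numbers only).
    return {"friction": (0.3, 1.2), "motor_strength": (0.8, 1.1)}

def generate_domain_randomization(ranges: dict) -> dict:
    # Stage 3: turn the probed ranges into a domain-randomization config.
    return {param: {"low": lo, "high": hi} for param, (lo, hi) in ranges.items()}

reward = design_reward("quadruped_walking")
ranges = probe_physics_ranges(reward)
dr_config = generate_domain_randomization(ranges)
print(sorted(dr_config))  # ['friction', 'motor_strength']
```

The value of this structure is that each stage’s output feeds the next, so the whole sim-to-real configuration process can run without hand-tuning by a human expert.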
Free AI model rivals GPT-4 in language model evaluation
Prometheus 2, a free and open-source language model developed by KAIST AI, has shown impressive capabilities in evaluating other language models, approaching the performance of commercial models like GPT-4.
The model was trained on a new pairwise comparison dataset called the “Preference Collection,” which includes over 1,000 evaluation criteria beyond basic characteristics. By combining two separate models – one for direct ratings and another for pairwise comparisons – the researchers achieved the best results.
In tests across eight datasets, Prometheus 2 showed the highest agreement with human judgments and commercial language models among all freely available rating models, significantly closing the gap with proprietary models.
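The two evaluation modes the researchers combined — direct ratings (a score per response) and pairwise comparison (which of two responses is better) — can be illustrated with a toy example. The heuristics below are stand-ins invented for the sketch; real Prometheus 2 judgments come from trained language models, not word counts.

```python
# Toy illustration of the two judging modes combined in Prometheus 2:
# direct rating and pairwise comparison. The scoring heuristic is a
# deliberately crude placeholder for a learned judge model.

def direct_rating(response: str) -> int:
    # Stand-in for a rating model: score 1-5 via a word-count heuristic.
    return min(5, max(1, len(response.split()) // 3))

def pairwise_winner(a: str, b: str) -> str:
    # Stand-in for a pairwise judge: here it defers to the direct ratings.
    return a if direct_rating(a) >= direct_rating(b) else b

resp_a = "Paris is the capital of France."
resp_b = "Paris."
print(pairwise_winner(resp_a, resp_b) == resp_a)  # True
```

The design point is that the two modes answer different questions — “how good is this?” versus “which is better?” — and an evaluator that handles both agrees more closely with human preferences.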
X (formerly Twitter) has launched a new feature, Stories, that provides AI-generated summaries of trending news on the platform. Powered by Elon Musk’s chatbot Grok, Stories offers Premium subscribers brief overviews of the most popular posts and conversations happening on X.
With Stories, users can quickly catch up on the day’s trending topics without having to scroll through countless posts. Grok generates these summaries based solely on the conversations happening on X about each news story rather than analyzing the original news articles themselves. While this approach is controversial, X believes it will pique users’ curiosity and potentially drive them deeper into the source material.
Apple is developing its own AI chip for data center servers
Apple is developing its own AI chip for data center servers, known internally as Project ACDC (Apple Chips in Data Center). The chip will likely focus on running AI models (inference) rather than training them, which is where Nvidia currently dominates.
The company is working closely with TSMC (Taiwan Semiconductor Manufacturing Co) to design and produce these chips, although the timeline for launch is uncertain. With this move, the company aims to keep up with rivals like Microsoft and Meta, who have made significant investments in generative AI.
Stack Overflow and OpenAI have announced an API partnership
OpenAI will use OverflowAPI to improve model performance and provide attribution to the Stack Overflow community within ChatGPT. Stack Overflow will use OpenAI models to develop OverflowAI and to maximize model performance.
The partnership aims to improve the user and developer experience on both platforms. The first set of integrations and capabilities will be available in the first half of 2024, and the partnership will enable Stack Overflow to reinvest in community-driven features.
Microsoft is building MAI-1 to compete with Google and OpenAI
Microsoft is developing a new, large-scale AI language model called MAI-1 to compete with Google and OpenAI. The effort is overseen by Mustafa Suleyman, the DeepMind co-founder whom Microsoft recently hired.
MAI-1 will be larger and more expensive than Microsoft’s previous smaller, open-source models, with roughly 500 billion parameters. Microsoft could preview the new model as soon as its Build developer conference later this month.
OpenAI’s new tool detects 98% DALL-E 3 generated images
OpenAI has developed a new tool to detect if an image was created by DALL-E 3, its AI image generator. The tool can detect DALL-E 3 images with around 98% accuracy, even if the image has been cropped, compressed, or had its saturation changed. However, the tool is not as effective at detecting images generated by other AI models, only flagging 5-10% of images.
This image detection classifier is only available to a group of testers, including research labs and research-oriented journalism nonprofits through OpenAI’s Research Access Program.
OpenAI has also added watermarking to Voice Engine, its text-to-speech platform, which is currently in limited research preview.
Meta expands AI-powered creativity tools for advertisers
Meta has expanded its generative AI tools for advertisers. Advertisers can request AI to generate entirely new images, including product variations in different colors, angles, and scenarios. The AI tools can add text overlays with different fonts, expand images to fit different aspect ratios like Reels and Feed, and generate ad headlines that match the brand’s voice.
The AI features will roll out globally to advertisers by the end of 2024.
Meta is also expanding its paid Meta Verified service for businesses to more countries. Different pricing tiers offer features like account support, profile enhancements, and better customer service access.
OpenAI’s ‘Media Manager’ will let creators opt out of AI training
OpenAI is developing Media Manager, a tool that will enable creators and content owners to declare what they own and specify how they want their works included in or excluded from machine learning research and training. A first-of-its-kind tool, it will help OpenAI identify copyrighted text, images, audio, and video across multiple sources and reflect creator preferences.
OpenAI aims to have the tool in place by 2025 and set a standard across the AI industry with it.
A Daily chronicle of AI Innovations May 10th 2024:💥 OpenAI to unveil Google competitor on Monday 🎵 ElevenLabs unveils new AI music generator 🇺🇸 US lawmakers unveil bill to make it easier to restrict exports of AI models 📈 AI demand soars in the workplace 🤖 Stability AI introduces AI bot for Discord users 🎤 ElevenLabs debuts AI model for generating lyrics 💰 Anthropic CEO defends dual funding from Google and Amazon 🎥 Krea AI introduces video generation for paid subscribers 🎥 Synthflow launches AI voice agent teams for streamlined customer support
OpenAI to unveil Google competitor on Monday
OpenAI is reportedly set to launch an AI-powered search product on Monday, strategically timed just before Google’s annual I/O conference.
This new AI search tool from OpenAI may be integrated with its ChatGPT chatbot and feature enhancements such as internet search capabilities and possibly Bing integration, building on the foundation laid by previous versions of ChatGPT.
Amidst Google’s ongoing challenges, including an antitrust case and competition from other AI startups like Perplexity, OpenAI’s move is seen as a significant threat to Google’s longstanding dominance in the search industry.
ElevenLabs unveils new AI music generator
ElevenLabs, a voice AI startup, has debuted a new model that creates songs from user prompts, leveraging its expertise in voice cloning and synthesis.
The company is showcasing this new capability by transforming user-generated ideas into music, akin to Microsoft’s strategy with its AI chatbot Copilot, and is exploring its application in areas like lullabies, jingles, and podcast intros.
While the AI music generation sector is growing, with ElevenLabs planning a marketplace for AI-generated songs, there are rising concerns about copyright infringement and the impact on artists’ jobs.
US lawmakers unveil bill to make it easier to restrict exports of AI models
US lawmakers introduced a bipartisan bill aimed at making it easier for the Biden administration to impose export controls on AI models to protect against foreign threats.
The proposed legislation would allow the Commerce Department to prevent Americans from collaborating with foreign entities on AI developments that could endanger U.S. national security.
The bill seeks to overcome legal obstacles that currently make it difficult to regulate the export of open source AI models, amidst concerns over their misuse for cyber attacks or creating biological weapons.
AI demand soars in the workplace
Microsoft and LinkedIn have published their ‘2024 Work Trend Index Annual Report’, revealing the rapid adoption of AI tools by employees, with 75% of knowledge workers using AI and nearly half starting within the last six months.
Here are the key points:
78% of AI users are bringing their own AI tools to work, and 52% are reluctant to admit to using AI for their most important tasks.
66% of leaders say they wouldn’t hire someone without AI skills, and 71% prefer less experienced candidates with AI skills over more experienced ones without.
Power users who use AI extensively are reaping benefits in productivity, creativity, and job satisfaction.
The skills needed for jobs are projected to change by 68% by 2030, accelerated by generative AI.
Why does this matter?
The study serves as a wake-up call for organizations to move beyond experimentation and develop comprehensive strategies for AI implementation. As AI permeates all sectors, generations, and skill sets, early adopters will likely emerge as the leaders, while those hesitant to explore could risk falling behind.
Stability AI introduces AI bot for Discord users
Stability AI has launched Stable Artisan, a multimodal generative AI Discord bot that enables users to create images and videos using the Stable Diffusion 3 (SD3) and Stable Video Diffusion (SVD) models.
Stable Artisan incorporates several editing and customization features, including Search and Replace, Remove Background, Creative Upscale, Outpaint, Control Sketch, and Control Structure. The service is available through a paid subscription, with monthly plans ranging from $9 to $99, and a 3-day free trial.
Stability AI is also working on a larger conversational chatbot called Stable Assistant, which will incorporate the company’s text-to-image and LLM technologies to assist users with various tasks through natural language conversations. While Stable Artisan currently does not include access to Stable Audio, Stable Code, or Stable LM, these features may be added in the future as the service continues to evolve.
Why does this matter?
Stable Artisan could empower creators lacking experience with complex AI models to generate high-quality content directly within their familiar Discord environment.
However, the paid subscription model could limit access, and the missing features hint at a future with a complete AI creative suite.
ElevenLabs debuts AI model for generating lyrics
ElevenLabs, a company that specializes in AI-powered voice cloning and synthesis, has revealed a new model that creates song lyrics based on user prompts.
With this new model, ElevenLabs aims to impact the music industry by allowing users to generate custom lullabies, jingles, podcast intros, and potentially even popular songs. The company also plans to launch a marketplace where users can sell their AI-generated music.
While ElevenLabs has not yet shared details about the maximum length of songs the AI can generate, an example posted by the company’s Head of Design suggests that it will likely produce lyrics for a standard three-minute song.
Why does this matter?
This AI tool has the potential to democratize songwriting, allowing even those without musical expertise to craft lyrics. This could be particularly impactful for budget-conscious creators or those with specific lyrical needs. However, it remains to be seen if it will integrate with composing melodies like Udio or Suno, which offer a more complete song creation experience.
Also, the AI lyrics’ originality and the tool’s training data on copyrighted music might raise uncertainties.
TikTok introduces “AI-generated” labels for third-party content
TikTok will automatically label AI-generated content on its platform and on third-party platforms, becoming the first social media platform to support Content Credentials metadata for AI transparency. (Link)
Anthropic CEO defends dual funding from Google and Amazon
Anthropic’s CEO says partnering with Google and Amazon ensures more independence than OpenAI’s Microsoft reliance. However, regulators are examining the impact on AI competition as Anthropic’s future training costs could reach $100 billion. (Link)
Krea AI introduces video generation for paid subscribers
Krea AI, a generative AI startup, has launched video generation capabilities for its highest-tier subscribers. The new feature allows users to create videos using a combination of key frame images and text prompts, with a timeline-based interface reminiscent of traditional video editing software. (Link)
Synthflow launches AI voice agent teams for streamlined customer support
Synthflow launches “Conversational AI Teams,” a feature that allows businesses to create multiple AI voice assistants to interact with customers and each other, all through a single phone number. These intelligent agents can handle tasks like scheduling, updating CRMs, and more, providing a seamless and efficient customer support experience. (Link)
AI TRAINING: Add shortcuts using ChatGPT’s Memory
A lesser-known feature of ChatGPT’s new Memory feature is that it can be programmed to store shortcuts, which can save you a lot of time in chat conversations when used effectively.
Go to ChatGPT’s website. You’ll need to have a Plus subscription to access Memory.
Click on your account name (bottom-left), select ‘Settings’, ‘Personalization’, and enable ‘Memory’.
Teach ChatGPT a list of shortcuts by adding: “Store in your memory that when I start a prompt with ‘!si’, I want you to search on the Internet; with ‘!ci’, I want you to create an image; and with ‘!da’, I want you to use your Advanced Data Analysis feature.”
That’s it! Try the shortcuts in your conversations and let us know how much time you save.
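Outside ChatGPT, the same trick can be reproduced with a few lines of code; this hypothetical expander rewrites the ‘!’ prefixes into full instructions before a prompt is sent to any model:

```python
# Hypothetical local analogue of the Memory shortcuts above: expand
# "!" prefixes into full instructions before sending a prompt onward.

SHORTCUTS = {
    "!si": "Search on the Internet: ",
    "!ci": "Create an image: ",
    "!da": "Use Advanced Data Analysis: ",
}

def expand_shortcut(prompt):
    # If the prompt starts with a known shortcut, swap it for the
    # full instruction; otherwise leave the prompt untouched.
    for key, instruction in SHORTCUTS.items():
        if prompt.startswith(key):
            return instruction + prompt[len(key):].lstrip()
    return prompt

print(expand_shortcut("!si latest AlphaFold 3 news"))
# Search on the Internet: latest AlphaFold 3 news
```

The Memory feature effectively stores this mapping for you, so the expansion happens implicitly inside the conversation.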
Lmsys just published a new deep dive into its Chatbot Arena data, comparing Meta’s open-source Llama 3 70B model against top models like GPT-4 — revealing surprising strengths in the open-source leader.
Llama-3 is the top open-source model on the Lmsys Leaderboard, featuring over 70,000 votes from users in Arena Battle testing.
Meta’s model shines in battles involving brainstorming and writing prompts, falling short in math, coding, and translation compared to top competitors.
Lmsys noted that Llama-3’s tone is friendlier and more conversational than its rivals’, a quality that shows up in the battles it wins in the Arena.
Llama-3’s ability to compete with top-tier models in certain areas is a testament to the rapid progress of open-source — and that’s with Meta’s largest model still pending. The more granular comparison also provides useful details often lost in more general model benchmarking.
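Leaderboards built on pairwise battles typically convert votes into ratings with an Elo-style update; a minimal version of the standard formula, with made-up ratings rather than real Arena data:

```python
# Minimal Elo-style update, the classic mechanism behind pairwise-battle
# leaderboards. Ratings here are illustrative, not real Arena numbers.

def elo_update(r_winner, r_loser, k=32):
    # Expected win probability from the rating gap, then shift both
    # ratings by k times the surprise of the outcome.
    expected_win = 1 / (1 + 10 ** ((r_loser - r_winner) / 400))
    r_winner += k * (1 - expected_win)
    r_loser -= k * (1 - expected_win)
    return r_winner, r_loser

llama3, rival = 1000.0, 1000.0
llama3, rival = elo_update(llama3, rival)  # Llama-3 wins one battle
print(round(llama3), round(rival))  # 1016 984
```

Aggregated over tens of thousands of votes, updates like this produce the rankings the leaderboard reports, which is why per-category breakdowns (writing vs. coding) can differ so much from the overall score.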
A Daily chronicle of AI Innovations May 09th 2024: OpenAI posts Model Spec revealing how it wants AI to behave Google DeepMind unveils AlphaFold 3, the next generation of its protein prediction model Neuralink faces setback as first human brain implant encounters problem Microsoft developed a secretive AI service for US spies Generate images on Midjourney Alpha Copilot for Microsoft 365 to get auto-complete and rewrite to improve prompts New AI data center to be built at the failed Foxconn project site in Wisconsin Sam Altman says we are not taking AI’s impact on the economy seriously Typeface Arc replaces prompts; uses AI agent approach to ease marketing workflows Altera’s gaming AI agents get backed by Eric Schmidt, Former Google CEO
OpenAI posts Model Spec revealing how it wants AI to behave
OpenAI has introduced the first draft of Model Spec, a proposed framework aiming to shape how AI models respond, emphasizing assistance, humanity’s benefit, and adherence to social norms and laws.
The framework suggests specific rules for AI behavior, including compliance with laws, protection of privacy, and avoidance of NSFW content, with options to adjust settings like allowing NSFW content in certain contexts.
While the Model Spec seeks public feedback for future adjustments and doesn’t immediately affect existing models like GPT-4 or DALL-E 3, it’s envisioned as a living document to guide AI behavior improvement over time.
Google DeepMind unveils AlphaFold 3, the next generation of its protein prediction model
Google DeepMind and Isomorphic Labs have released AlphaFold 3, a new AI model for predicting protein structures, including their interactions with various molecules such as DNA, RNA, and small molecules, thereby enhancing drug discovery possibilities.
This new version is more precise in mapping out complex groupings of molecules, significantly enhancing our ability to understand and predict molecular behavior compared to its earlier version.
Google will not open-source this version but has launched AlphaFold Server for non-commercial research use, aiming to balance intellectual property concerns with accessibility for scientific progress.
Neuralink faces setback as first human brain implant encounters problem
Neuralink admitted that some of the micro-thin threads from their N1 brain chip retracted after implantation in the first human patient, possibly due to air trapped in the skull during surgery, which affected the device’s data transmission rate.
Despite the retraction of several threads, Neuralink managed to increase the data transmission speed over time by optimizing their recording algorithm and improving signal translation into cursor movements.
The company is planning further implants, with goals to implant two more patients in the coming months and ten in total this year, while continuing to refine their technology and reporting developments to the FDA.
Microsoft developed a secretive AI service for US spies
Microsoft has developed a top-secret generative AI model, entirely disconnected from the internet, so US intelligence agencies can safely harness the powerful technology to analyze top-secret information. The GPT-4-based model is now live, answering questions, and will also write code.
Microsoft spent 18 months developing the model, which is “air-gapped” to ensure it is secure. This is the first time a model has been fully isolated, meaning it is not connected to the internet and instead runs on a special network accessible only to the U.S. government.
It can read and analyze files but cannot learn from them to stop sensitive information from entering the platform. It is yet to be tested and accredited by the intelligence agencies.
Why does this matter?
Intelligence agencies all over the world have been racing to be the first to harness generative AI. I guess we know who’s going to be the winner. If this AI tool is successful, it will fundamentally change the way intelligence agencies operate.
Copilot for Microsoft 365 to get auto-complete and rewrite to improve prompts
In the coming months, Microsoft Copilot will be updated with new features like auto-complete and ‘elaborate your prompt’ that offer suggestions to improve AI prompts. It aims to solve the problem of coming up with good prompts for generative AI. (Link)
New AI data center to be built at the failed Foxconn project site in Wisconsin
President Joe Biden announced an AI data center to be built on the same site as the failed Foxconn project in Racine, Wisconsin. According to a White House press release, Microsoft is investing $3.3B in the project, creating up to 2,000 permanent jobs. (Link)
Sam Altman says we are not taking AI’s impact on the economy seriously
At a Brookings Institution panel on AI and geopolitics on Tuesday, Altman said discussion of AI’s effect on the economy, such as the potential for mass job replacement, has died down this year compared to last. He warned that if these concerns aren’t taken seriously enough going forward, it could become a massive issue. (Link)
Typeface Arc replaces prompts; uses AI agent approach to ease marketing workflows
Typeface is launching its Arc technology, which lets a user state a high-level marketing objective and then has the system automatically plan and generate all the connected assets, including emails, images, and notifications. (Link)
Altera’s gaming AI agents get backed by Eric Schmidt, Former Google CEO
Altera is the newest startup joining the fray to build a new guard of AI agents. It raised $9 million in an oversubscribed seed round co-led by First Spark Ventures, Eric Schmidt’s deep-tech fund, and Patron, the seed-stage fund co-founded by Riot Games alums. (Link)
AI TRAINING May 09th 2024
Generate images on Midjourney Alpha
Midjourney’s website is now accessible to anyone who has generated more than 100 images, offering a smoother prompting experience than its standard Discord workflow.
Check that you’ve generated more than 100 images by typing /info in the Midjourney Discord group. If you have, head over to Midjourney Alpha.
In the main menu, you can explore other creations and search prompts.
Select where it says “imagine” and enter your prompt to generate an image.
Add a reference image by selecting “+”, or play with different parameters such as image size, stylization, or even weirdness by pressing the “slider control” button.
AI RESEARCH on May 09th 2024
AI usage surges in the workplace
Microsoft and LinkedIn just published their Work Trend Index Annual Report, revealing that AI adoption is surging in the workplace — calling 2024 the ‘year AI at work gets real’.
The report found that use of GenAI has doubled in the last six months, with 75% of knowledge workers using the tech in some capacity.
78% of AI users are bringing their own AI to work, with 52% reluctant to admit to using it for their most important tasks.
66% of leaders wouldn’t hire someone without AI skills, and 71% prefer less experienced candidates with AI aptitude over more experienced ones without it.
AI power users reported enhanced productivity, creativity, and job satisfaction compared to skeptical peers.
Why it matters: Employees are adopting AI at a rapid pace, regardless of whether their organizations are ready for the shift. As AI spreads across all sectors, generations, and skill sets, early adopters are rising to the top, while those not at least exploring the tech are quickly running out of time.
A Daily chronicle of AI Innovations May 08th 2024: Google built medical AI that can train itself and outperform real doctors Microsoft launches AI chatbot for spies OpenAI’s new tool detects 98% DALL-E 3 generated images Meta expands AI-powered creativity tools for advertisers OpenAI’s ‘Media Manager’ will let creators opt out of AI training Apple releases M4 chip at the ‘Let Loose’ event with powerful AI capabilities OpenAI strikes licensing deal with People magazine publisher Amazon announces Bedrock Studio to simplify Gen AI app development Oracle introduces Code Assistant to accelerate enterprise software development Red Hat launches RHEL AI and InstructLab to democratize enterprise AI Apple unveils iPad AI upgrades
Google built medical AI that can train itself and outperform real doctors
Google DeepMind has developed Med-Gemini, a self-training medical AI that outperforms doctors in accuracy, achieving a 91.1% score on medical benchmarks.
Med-Gemini uses an uncertainty-guided search functionality and has bested GPT-4 models and actual doctors on 10 out of 14 medical benchmarks.
Despite its capabilities, Google’s medical AI aims to assist rather than replace doctors, helping to alleviate their workload.
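DeepMind describes the “uncertainty-guided search” only at a high level; one plausible reading is: sample several candidate answers, and if they disagree too much, fall back to a retrieval step. A toy sketch of that control flow, with all names hypothetical:

```python
from collections import Counter

# Toy sketch of an uncertainty-guided search loop: if sampled answers
# disagree too much, consult an external search step before answering.
# All functions are hypothetical stand-ins, not DeepMind's method.

def uncertainty(samples):
    # Fraction of samples that do NOT agree with the majority answer.
    top_count = Counter(samples).most_common(1)[0][1]
    return 1 - top_count / len(samples)

def answer_with_search(samples, search_fn, threshold=0.3):
    if uncertainty(samples) > threshold:
        return search_fn()  # high disagreement: defer to retrieval
    return Counter(samples).most_common(1)[0][0]

print(answer_with_search(["A", "A", "A", "B"], lambda: "searched"))  # A
print(answer_with_search(["A", "B", "C", "B"], lambda: "searched"))  # searched
```

The key idea is that the model spends extra compute (search) only on the questions it is least sure about, rather than on every query.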
Microsoft launches AI chatbot for spies
Microsoft has launched a GPT-4-based generative AI model tailored for US intelligence agencies that operates offline to analyze top-secret information securely.
The AI chatbot aims to facilitate secure conversations among spy agencies without the internet, addressing data breach and hacking concerns.
This initiative represents Microsoft’s first deployment of a major language model in a high-security environment, with a focus on mitigating the risks of data leaks while processing classified data
OpenAI’s new tool detects 98% DALL-E 3 generated images
OpenAI has developed a new tool to detect if an image was created by DALL-E 3, its AI image generator. The tool can detect DALL-E 3 images with around 98% accuracy, even if the image has been cropped, compressed, or had its saturation changed. However, the tool is not as effective at detecting images generated by other AI models, only flagging 5-10% of images.
This image detection classifier is only available to a group of testers, including research labs and research-oriented journalism nonprofits through OpenAI’s Research Access Program.
OpenAI has also added watermarking to Voice Engine, its text-to-speech platform, which is currently in limited research preview.
Why does it matter?
Early experiences have shown that AI detectors don’t work. In fact, if they have high error rates, they could lead to false accusations. In 2023, OpenAI had to shut down its own AI detection software for text because of its poor accuracy.
So, if this detector is as good as OpenAI claims, we may be on the precipice of a revolutionary new capability to reliably detect AI-generated content, with huge implications across domains.
Meta expands AI-powered creativity tools for advertisers
Meta has expanded its generative AI tools for advertisers. Advertisers can request AI to generate entirely new images, including product variations in different colors, angles, and scenarios. The AI tools can add text overlays with different fonts, expand images to fit different aspect ratios like Reels and Feed, and generate ad headlines that match the brand’s voice.
The AI features will roll out globally to advertisers by the end of 2024.
Meta is also expanding its paid Meta Verified service for businesses to more countries. Different pricing tiers offer features like account support, profile enhancements, and better customer service access.
Why does it matter?
Integrating such powerful AI features could lead to more effective advertising campaigns and improved customer engagement with targeted marketing and personalized ads. However, it could also raise questions about transparency and potential misuse.
OpenAI’s ‘Media Manager’ will let creators opt out of AI training
OpenAI is developing Media Manager, a tool that will let creators and content owners declare what they own and specify how they want their works included in or excluded from machine learning research and training. The first tool of its kind, it will help OpenAI identify copyrighted text, images, audio, and video across multiple sources and reflect creator preferences.
OpenAI aims to have the tool in place by 2025 and set a standard across the AI industry with it.
Why does it matter?
Media Manager seems to be OpenAI’s response to growing criticism of its approach to developing AI models, which relies heavily on scraping publicly available web data for training. Recently, eight prominent U.S. newspapers sued OpenAI for copyright infringement.
On the other hand, OpenAI has formed mutually beneficial partnerships with platforms like Stack Overflow, Shutterstock, The Financial Times, and more to use their content.
So, OpenAI may be trying to meet creators in the middle, but its positioning as a fully ethical actor here should be taken with a grain of salt.
Apple just revealed its new line of iPads at a company event in Cupertino, CA — featuring a custom M4 chip that enables advanced AI capabilities and a slew of new AI-powered features.
The bigger iPad Pro now features the M4 chip with an upgraded Neural Engine, which CEO Tim Cook calls “an outrageously powerful chip for AI”.
The M4 is capable of 38 trillion operations per second, 4x the performance of previous models, enabling it to run advanced AI workloads on-device.
New AI features on the Pro include a True Tone Flash for document scanning and new video, image, and music editing tools.
Prior to the event, a new report revealed that Apple is developing its own AI chips for data centers under the code name “Project ACDC”.
Apple releases M4 chip at the ‘Let Loose’ event with powerful AI capabilities
Apple released its much-anticipated M4 chip at the “Let Loose” event. The M4 will spearhead Apple’s next generation of devices, starting with the forthcoming OLED iPad Pro, which it powers. (Link)
OpenAI strikes licensing deal with People magazine publisher
OpenAI has inked a licensing deal with Dotdash Meredith to bring the People magazine publisher’s content to ChatGPT and help train its AI models. Under the partnership, OpenAI will be able to display lifestyle and entertainment content in its chatbot from the many websites of one of the US’s largest digital and print publishers. (Link)
Amazon announces Bedrock Studio to simplify Gen AI app development
Amazon is launching a new tool, Bedrock Studio, designed to let organizations experiment with generative AI models, collaborate on those models, and ultimately build generative AI-powered apps. Bedrock Studio is a “rapid prototyping environment” for generative AI. It also guides developers in evaluating, analyzing, fine-tuning, and sharing generative AI models. (Link)
Oracle introduces Code Assistant to accelerate enterprise software development
Oracle has announced Code Assistant, an AI-powered service to help developers rapidly program apps based on Java, SQL, and the Oracle Cloud infrastructure. It will join tools like GitHub Copilot and Amazon CodeWhisperer in accelerating the app development lifecycle. However, Oracle hasn’t yet specified when this feature will be released. (Link)
Red Hat launches RHEL AI and InstructLab to democratize enterprise AI
At Red Hat Summit 2024, Red Hat announced two major initiatives to bring the power of generative AI to the enterprise: Red Hat Enterprise Linux AI (RHEL AI), a foundation model platform for developing and running open-source language models, and InstructLab, a community project that lets domain experts enhance AI models with their knowledge. (Link)
AI TRAINING: Chat with YouTube videos using Gemini
Google Gemini’s new “Extensions” feature allows users to access external tools such as YouTube to chat with videos and get answers for free.
Step-by-step:
Visit Google’s Gemini website. If Gemini is not available in your country, you’ll need to use a US-based VPN.
Click on the gear icon located on the bottom-left, select Extensions, and turn on the YouTube one.
Go back to the Chat interface and start your prompt using the following format: “@youtube Summarize the following video [Youtube URL]”
Pro tip: Try asking Gemini to explain advanced concepts discussed in a video, generate concrete examples, create practice questions, or even provide code snippets
A Daily chronicle of AI Innovations May 07th 2024: Apple is developing its own AI chip for data center servers; Stack Overflow and OpenAI have announced an API partnership; Microsoft is developing a new AI language model to rival GPT-4 and Google Gemini; Study: Teachers can’t detect AI writing; Apple’s new M4 chip is focused on AI; Hugging Face has launched LeRobot, an open-source robotics toolkit; Apple is testing a new “Clean Up” feature in its Photos app; Google has launched Google Threat Intelligence; US invests $285M in AI ‘Digital Twin’ technology; Anduril Industries introduces Pulsar: AI modular electromagnetic warfare (EW) systems
Apple’s new M4 chip is focused on AI
Apple’s new M4 chip, designed for the latest OLED iPad Pro models, focuses on enhancing AI performance with capabilities like handling 38 trillion operations per second, which is significantly higher than its predecessors but still behind Qualcomm’s Snapdragon X Elite.
The M4 chip utilizes advanced second-generation 3nm technology, featuring a 10-core CPU with a mix of performance and efficiency cores, and supports technologies like mesh shading and ray tracing for improved graphical rendering.
Apple’s announcement of the M4 chip sets the stage for a comparison with Microsoft’s upcoming Arm-powered hardware and Windows on Arm enhancements, scheduled for release later this month, which aim to surpass Apple’s MacBook Air M3 in terms of CPU and AI performance.
Apple is developing its own AI chip for data center servers
Apple is developing its own AI chip for data center servers, known internally as Project ACDC (Apple Chips in Data Center). The chip will likely focus on running AI models (inference) rather than training them, which is where Nvidia currently dominates.
The company is working closely with TSMC (Taiwan Semiconductor Manufacturing Co) to design and produce these chips, although the timeline for launch is uncertain. With this move, the company aims to keep up with rivals like Microsoft and Meta, who have made significant investments in generative AI.
Why does it matter?
Apple has a long history of designing custom chips for its devices like iPhones, iPads, and Macs, and that vertical integration is a key differentiator. Custom AI chips could give the tech giant more control over its “AI destiny” versus relying on suppliers like Nvidia.
Stack Overflow and OpenAI have announced an API partnership
OpenAI will use OverflowAPI to improve model performance and provide attribution to the Stack Overflow community within ChatGPT. Stack Overflow will use OpenAI models to develop OverflowAI and to maximize model performance.
The partnership aims to improve the user and developer experience on both platforms. The first set of integrations and capabilities will be available in the first half of 2024, and the partnership will enable Stack Overflow to reinvest in community-driven features.
Why does this matter?
Stack Overflow partnered with Google Cloud to develop the Overflow API and give Google’s Gemini models access to its knowledge communities. Now it is forming a similar partnership with OpenAI. Despite concerns about copyright breaches, such partnerships seem to be trending, with both parties having much to gain; it reaffirms that the big AI players remain hungry for data.
Microsoft is developing a new AI language model to rival GPT-4 and Google Gemini
Microsoft is developing a new, large-scale AI language model called MAI-1 to compete with Google and OpenAI. The effort is overseen by Mustafa Suleyman, the recently hired co-founder of Google DeepMind.
MAI-1 will be larger and more expensive than Microsoft’s previous smaller, open-source models, with roughly 500 billion parameters. Microsoft could preview the new model as soon as its Build developer conference later this month.
Why does this matter?
Microsoft’s development of MAI-1 shows that it is not relying entirely on its OpenAI investment to go big in AI. It has now truly entered the AI race, competing with state-of-the-art models from Google, Anthropic, OpenAI itself, and even Meta’s Llama 400B, which is in training.
Hugging Face has launched LeRobot, an open-source robotics toolkit
It is a comprehensive platform for developers, researchers, and hobbyists to train AI models, share data, and simulate environments, all while seamlessly integrating with various robotic hardware. The toolkit offers pre-trained models and integrates with physics simulators for testing without physical robots. Hugging Face is also collaborating with diverse partners to build the largest crowdsourced robotics dataset. (Link)
Apple is testing a new “Clean Up” feature in its Photos app
By using gen AI for advanced image editing, this feature will allow you to effortlessly remove unwanted objects from your photos using a simple brush. Apple may preview this new feature during its upcoming “Let Loose” iPad event or at WWDC in June. (Link)
Google has launched Google Threat Intelligence
It is a combination of Mandiant’s expertise, VirusTotal’s community insights, and Google’s vast threat visibility. Google Threat Intelligence assists with external threat monitoring, attack surface management, digital risk protection, IoC analysis, and expertise. With Gemini, organizations can now quickly search through vast amounts of threat data to protect against cyber threats. (Link)
US invests $285M in AI ‘Digital Twin’ technology
The Biden administration is investing $285 million for a new “CHIPS Manufacturing USA institute” focused on digital twins for the semiconductor industry. This approach uses AI to create virtual chip replicas, accelerating the production of next-gen processors. Intel and Micron are also set to receive funding to boost the development of new processors. (Link)
Anduril Industries introduces Pulsar: AI modular electromagnetic warfare (EW) systems
Pulsar uses AI to quickly identify and counter current and future threats across the electromagnetic spectrum, including small and medium-size drones. With its integration of software-defined radio, GPUs, and diverse compute capabilities, Pulsar is changing how we defend against rapidly evolving threats in an increasingly complex battlefield. (Link)
Adobe’s AI-powered ‘Enhance Speech’ tool dramatically improves the quality of audio voice recordings with just a few clicks.
Step-by-step:
Access Adobe’s Enhance Speech website. Sign in or sign up for free.
Upload your audio by selecting “Choose file” or simply drag and drop the file.
Wait a few seconds while Adobe’s AI analyzes and removes unwanted noises from your file.
Download your studio-quality audio and hear the difference!
Pro tip: If you have a video file, you can extract the audio using free websites that extract audio from video and add the enhanced audio back to your video using free video editors like CapCut
AI RESEARCH on May 07th 2024
Study: Teachers can’t detect AI writing
A series of studies from several German universities found that both novice and experienced teachers struggled to accurately distinguish between student-written and AI-generated texts.
The details:
The first study tested pre-service teachers on a mix of AI-generated and student-written texts; they correctly identified only 45.1% of AI-generated submissions.
The second study tested experienced teachers, who correctly identified just 37.8% of AI-generated submissions.
The study revealed that teachers were overconfident in their ability to spot AI-generated text.
AI detection software also often falsely flags student-written texts as potentially AI-generated, which further complicates the issue.
Why it matters: AI’s writing capabilities are only getting better — and relying on teacher intuition or unreliable tools may be no more effective than guessing. Unless better tools become available, it may be time to pivot to enabling students to work with AI instead of penalizing them for it.
Trending AI Tools May 07th 2024
Osum – Perform deep market research in seconds: Try now (Use checkout code RUNDOWN for 25% off)*
Coachvox – The ultimate AI marketing tool for coaches with content
Pressmaster – All-in-one platform simplifies press planning and distribution
Lamini – Enterprise LLM platform that you can own
Postman v11 – Supercharging development with AI and APIs
Anthropic – Risk Manager, Responsible Scaling Team
OpenAI – Tech Lead Manager – Supercomputing Scheduling
A Daily chronicle of AI Innovations May 06th 2024: DrEureka can automate robot training using LLMs; Free AI model rivals GPT-4 in language model evaluation; X introduces Stories feature powered by Grok AI; Privacy complaint filed against OpenAI; JPMorgan launches an AI-powered tool for thematic investing; YouTube Premium introduces AI-powered “Jump ahead” feature; AI is now set to transform the drug discovery industry; AI helps bring back Randy Travis’ voice in new song; Jack Dorsey has left Bluesky, calls X a ‘freedom technology’; Why Warren Buffett isn’t investing in AI; Massive prompts can outperform fine-tuning for LLMs, researchers find
DrEureka can automate robot training using LLMs
In robotics, one of the biggest challenges is transferring skills learned in simulation to real-world environments. NVIDIA researchers have developed a groundbreaking algorithm called DrEureka that uses LLMs to automate the design of reward functions and domain randomization parameters—key components in the sim-to-real transfer process.
The algorithm works in three stages: first, it creates reward functions with built-in safety instructions; then, it runs simulations to determine the best range of physics parameters; finally, it generates domain randomization configurations based on the data gathered in the previous stages.
When tested on various robots, including quadrupeds and dexterous manipulators, DrEureka-trained policies outperformed those designed by human experts.
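The three-stage pipeline above can be sketched in outline. Everything below is a hypothetical stand-in for illustration, not NVIDIA’s actual API: in the real system an LLM writes the reward code and proposes the physics parameter ranges.

```python
def generate_reward_function(task: str) -> str:
    # Stage 1: the LLM drafts a reward function with safety terms built in.
    return f"reward({task}) = progress - safety_penalty"

def probe_physics_ranges(reward: str) -> dict:
    # Stage 2: run simulations to find physics parameter ranges the
    # policy can tolerate (friction, mass, motor strength, ...).
    return {"friction": (0.5, 1.2), "mass_scale": (0.8, 1.1)}

def build_domain_randomization(ranges: dict) -> dict:
    # Stage 3: turn the probed ranges into a randomization config used
    # during training so the policy transfers to the real robot.
    return {param: {"low": lo, "high": hi} for param, (lo, hi) in ranges.items()}

reward = generate_reward_function("quadruped_walk")
config = build_domain_randomization(probe_physics_ranges(reward))
print(config)
```

The key design idea is that each stage feeds the next, so no human has to hand-tune rewards or randomization ranges.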
Why does it matter?
DrEureka makes robot training accessible and cost-effective for businesses and researchers alike. We may witness increased adoption of robotics in industries that have previously been hesitant to invest in the technology due to the complexity and cost of training robots for real-world applications.
Free AI model rivals GPT-4 in language model evaluation
Prometheus 2, a free and open-source language model developed by KAIST AI, has shown impressive capabilities in evaluating other language models, approaching the performance of commercial models like GPT-4.
The model was trained on a new pairwise comparison dataset called the “Preference Collection,” which includes over 1,000 evaluation criteria beyond basic characteristics. By combining two separate models – one for direct ratings and another for pairwise comparisons – the researchers achieved the best results.
In tests across eight datasets, Prometheus 2 showed the highest agreement with human judgments and commercial language models among all freely available rating models, significantly closing the gap with proprietary models.
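One common way to combine an absolute-rating model with a pairwise-comparison model of the same architecture is linear weight merging. The sketch below illustrates that general technique on plain dicts; it is an assumption for illustration, not the actual Prometheus 2 code.

```python
def merge_weights(model_a: dict, model_b: dict, alpha: float = 0.5) -> dict:
    """Linearly interpolate two checkpoints that share parameter names."""
    assert model_a.keys() == model_b.keys()
    return {name: alpha * model_a[name] + (1 - alpha) * model_b[name]
            for name in model_a}

# Toy "checkpoints" with two scalar parameters each.
direct_rater = {"layer0.w": 0.2, "layer1.w": -0.4}
pairwise_ranker = {"layer0.w": 0.6, "layer1.w": 0.0}
merged = merge_weights(direct_rater, pairwise_ranker, alpha=0.5)
print(merged)  # {'layer0.w': 0.4, 'layer1.w': -0.2}
```

In practice the same interpolation runs over every tensor in two full checkpoints, yielding one model that handles both evaluation formats.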
Why does this matter?
By enabling user-defined evaluation criteria, Prometheus 2 can be tailored to assess language models based on specific preferences and real-life scenarios, opening up new possibilities for developing specialized AI applications across various domains. It’s also an opportunity to create niche models that are culturally sensitive and relevant.
X introduces Stories feature powered by Grok AI
X (formerly Twitter) has launched a new feature, Stories, that provides AI-generated summaries of trending news on the platform. Powered by Elon Musk’s chatbot Grok, Stories offers Premium subscribers brief overviews of the most popular posts and conversations happening on X.
With Stories, users can quickly catch up on the day’s trending topics without having to scroll through countless posts. Grok generates these summaries based solely on the conversations happening on X about each news story rather than analyzing the original news articles themselves. While this approach is controversial, X believes it will pique users’ curiosity and potentially drive them deeper into the source material.
Why does this matter?
X’s Grok-powered Stories feature may reshape the way we consume news. As more platforms integrate AI news summarization tools, traditional media outlets may face challenges in maintaining reader engagement and revenue. However, the reliance on platform-specific conversations for generating summaries raises concerns about the potential spread of misinformation and the creation of echo chambers.
Why Warren Buffett isn’t investing in AI
Warren Buffett predicts AI scamming will become a major growth industry, highlighting its capacity to generate realistic and misleading content to deceive individuals.
Buffett acknowledges AI’s potential for both beneficial and detrimental impacts, admitting his own lack of understanding of the technology.
Despite the hype around AI in the investment world, Buffett exercises caution, likening its unpredictable potential impact to that of nuclear weapons.
Massive prompts can outperform fine-tuning for LLMs, researchers find
Researchers have discovered that using large prompts with numerous examples in LLMs can be more effective than traditional fine-tuning techniques.
“In-context learning” (ICL) becomes increasingly efficient with larger context windows, allowing for the effective use of hundreds to thousands of examples without the need for task-specific training.
The study indicates that while fine-tuning requires more initial data and resources, ICL with extensive prompts presents a viable and sometimes superior alternative, especially as models improve in processing long inputs.
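The mechanics of in-context learning at this scale are simple: rather than updating any weights, you concatenate hundreds of labeled examples into one long prompt and append the new query. A minimal sketch with made-up arithmetic examples (the prompt format is an illustrative assumption):

```python
def build_many_shot_prompt(examples: list[tuple[str, str]], query: str) -> str:
    """Pack (input, output) pairs into one long few-shot prompt."""
    shots = "\n\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return f"{shots}\n\nInput: {query}\nOutput:"

examples = [("2 + 2", "4"), ("10 - 3", "7")] * 250   # 500 in-context shots
prompt = build_many_shot_prompt(examples, "6 * 7")
print(len(examples), "shots,", len(prompt), "characters")
```

A long-context model then completes the final `Output:` line; the trade-off versus fine-tuning is paying for the long prompt at every inference call instead of training once up front.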
Privacy complaint filed against OpenAI
The maker of ChatGPT is facing a privacy complaint in the European Union (EU) over its “hallucination problem.” The complaint alleges violations of GDPR, including misinformation generation and lack of transparency about data sources. It highlights concerns about accuracy, data access, and ChatGPT’s inability to correct inaccurate information. (Link)
JPMorgan launches an AI-powered tool for thematic investing
IndexGPT is a new range of thematic investment baskets created using OpenAI’s GPT-4 model. The tool generates keywords associated with a theme, which are then used to identify relevant companies through natural language processing of news articles. IndexGPT aims to improve the selection of stocks for thematic indexes, going beyond obvious choices and potentially enhancing trend-following strategies. (Link)
YouTube Premium introduces AI-powered “Jump ahead” feature
The AI-powered feature allows users to skip past commonly skipped sections of a video and jump to the next best point. It is currently available in the YouTube Android app in the US for English videos and can be enabled through the experiments page. (Link)
AI is now set to transform the drug discovery industry
Generative AI is now rapidly generating novel molecules and proteins that humans may not have considered. AI models, such as Google’s AlphaFold, are accelerating the drug development process from years to months while increasing success rates. Experts predict that AI-designed drugs will become the norm in the near future, but they will still need to prove their efficacy in human trials. (Link)
AI helps bring back Randy Travis’ voice in new song
Country singer Randy Travis has released a new song, “Where That Came From,” his first since losing his voice to a stroke in 2013.
The vocals were created using AI software and a surrogate singer under the supervision of Travis and his producer. The result is a gentle tune that captures Travis’ relaxed style, reinforcing the potential of AI voice cloning in the right hands. (Link)
A Daily chronicle of AI Innovations May 04th 2024: World’s first AI-generated music video was just released; Air Force launches an AI-controlled F-16 fighter jet; Microsoft ties executive pay to security following multiple failures and breaches; X is using Grok to publish AI-generated news summaries
World’s first AI-generated music video was just released
A professional music video created with OpenAI’s Sora, a text-to-video AI model, has been released by writer-director Paul Trillo, marking the first of its kind.
The video for “The Hardest Part” by Washed Out utilized 700 clips generated by Sora, with 55 selected for the final edit, showcasing the AI’s ability to transform written descriptions into video clips.
Despite not being publicly available yet, Sora’s early access was granted to a select group of artists, including Trillo, sparking interest and debate about the future of AI in creative processes.
Air Force launches an AI-controlled F-16 fighter jet
An experimental AI-powered F-16 fighter jet, carrying Air Force Secretary Frank Kendall, performed advanced combat maneuvers, demonstrating significant advancements in military aviation technology.
The Air Force plans to have over 1,000 AI-enabled unmanned warplanes by 2028, with the AI-controlled F-16, known as Vista, marking a historic step in air combat by executing a dogfight against a human-piloted F-16.
Despite concerns over AI’s autonomous decision-making in warfare, the technology is seen as essential for future combat scenarios, aiming to reduce pilot risk and lower costs compared to traditional manned fighters.
X is using Grok to publish AI-generated news summaries
X is using Grok to create AI-generated summaries for news and discussions trending on its platform, offering them under the feature “Stories on X” exclusively to premium subscribers.
Grok generates summaries based on users’ posts, covering a wide range of topics including news events and platform discussions, with a user highlighting its capability by showcasing stories on diverse subjects such as Apple’s earnings and debates on national debt.
Despite its innovative approach, Grok is noted for potential inaccuracies in its summaries, as evidenced by a past error where it misinterpreted a basketball term, highlighting the need for users to verify its outputs.
iOS 18 may have OpenAI-powered gen AI capabilities; China’s Vidu generates 16-second 1080P videos, matching OpenAI’s Sora; New S1 robot mimics human-like movements, speed, and precision; Gradient AI releases Llama-3 8B with 1M context; Mysterious “gpt2-chatbot” AI model bemuses experts; GitHub’s Copilot Workspace turns ideas into AI-powered software; Amazon launches Amazon Q, the world’s most capable Gen AI assistant; Google’s Med-Gemini models outperform doctors; Apple has set up a secretive AI lab in Switzerland; Better and faster LLMs via multi-token prediction: new research; Anthropic launches an iOS app and a new plan for teams; Google’s AI advancements urged Microsoft’s billion-dollar OpenAI investment; Scale AI’s study finds popular LLMs overfit public benchmarks; Ukraine debuts the world’s first AI diplomat, Victoria Shi; Sam Altman is ready to spend $50 billion a year to build AGI
A Daily chronicle of AI Innovations May 03rd 2024: Bill Gates is still pulling the strings at Microsoft; AI engineers report burnout as ‘rat race’ to stay competitive hits tech industry; Ukraine introduces AI spokesperson for state affairs; How much do LLMs overfit public benchmarks?; Sam Altman’s stance on the future of AI; NVIDIA Research introduces VILA, a vision language model that can reason across multiple images, learn in context, and understand videos
How much do LLMs overfit public benchmarks?
A new study by Scale AI raises concerns about the reliability of LLM benchmark tests. It uncovers LLM overfitting by evaluating them on a new (designed from scratch) dataset, GSM1k that mimics a popular benchmark, GSM8k.
Key findings:
Overfitting: Many LLMs performed significantly worse on GSM1k compared to GSM8k, with some models dropping by as much as 13%. This suggests they’ve simply memorized the answers to benchmark problems rather than learning true reasoning skills.
Family Trends: Certain LLM families, particularly Mistral and Phi, showed consistent overfitting across different model sizes.
Frontier Models Shine: Newer, more advanced LLMs showed minimal signs of overfitting, suggesting they may be achieving genuine reasoning abilities.
Data Contamination Suspected: Analysis suggests data contamination from benchmark sets may be one factor contributing to overfitting.
Reasoning Still Present: Even overfitting models exhibited some capability to solve novel problems, although not at the level their benchmark scores suggested.
Overall, the study highlights the need for more robust and reliable methods for evaluating LLM reasoning abilities.
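The study’s core measurement can be reproduced in miniature: score each model on the public benchmark and on the freshly written held-out set, then look at the drop. The accuracy numbers below are invented for illustration, not taken from the paper.

```python
def overfit_gap(public_acc: float, heldout_acc: float) -> float:
    """Percentage-point drop from the public benchmark to the held-out set."""
    return (public_acc - heldout_acc) * 100

# Hypothetical (GSM8k accuracy, GSM1k accuracy) pairs for two models.
models = {"model_a": (0.80, 0.67), "model_b": (0.75, 0.74)}
for name, (gsm8k, gsm1k) in models.items():
    gap = overfit_gap(gsm8k, gsm1k)
    flag = "likely overfit" if gap > 5 else "looks genuine"
    print(f"{name}: drop of {gap:.1f} points -> {flag}")
```

A large gap suggests the model memorized benchmark answers rather than learning to reason; a small gap on the fresh set is evidence of genuine capability.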
Why does it matter?
The dataset proves that overfitting may be creating major false impressions of model performance. As AI capabilities continue to advance, it is crucial to develop evaluation approaches that can keep pace and provide a more accurate picture of a model’s real-world potential.
Sam Altman’s stance on the future of AI
During a recent appearance at Stanford University, Altman talked about the future of AI, calling GPT-4, a currently impressive AI model, the “dumbest model” compared to future iterations. According to Altman, the future is dominated by “intelligent agents”: AI companions that can not only follow instructions but also solve problems, brainstorm solutions, and even ask clarifying questions.
OpenAI isn’t just talking about the future, they’re actively building it. Their next-generation model, GPT-5, is rumored for a mid-2024 release and might boast video generation capabilities alongside text and image.
But the real moonshot is their active participation in developing AGI.
Despite the significant costs involved, Altman remains undeterred. He believes that the potential benefits, such as solving complex problems across various industries, outweigh the financial burden.
Altman’s bold comments on GPT-4 being the dumbest model suggest that OpenAI is aiming for something even grander, and GPT-5 could be a stepping stone toward it (the next-gen AI framework).
Bill Gates is still pulling the strings at Microsoft
Bill Gates predicted the revolution in computing through AI agents in a memo before Microsoft’s partnership with OpenAI, and despite public appearances, has been heavily involved in guiding Microsoft’s AI strategy.
Under Gates’ influence, Microsoft developed Copilot, an AI tool based on OpenAI’s GPT, transforming the company into the world’s most valuable public entity by enhancing productivity in Microsoft products.
Gates has continued to play a pivotal role behind the scenes at Microsoft, advising on strategy, product development, and fostering key relationships, notably with OpenAI, even after stepping down from the board amid misconduct allegations.
AI engineers report burnout as ‘rat race’ to stay competitive hits tech industry
AI engineers at Microsoft and other companies are reporting burnout due to the pressure to quickly release AI products, amidst an industry-wide “AI rat race.”
Microsoft’s focus on speed over ethics and safety in AI development has been criticized, even as the company forms a Responsible AI Council to address safety concerns.
An Amazon AI engineer’s account of being forced to work over a weekend, only for the project to be canceled, highlights a similar disregard for worker well-being and a preference for speed over accuracy and testing in AI product development.
Ukraine introduces AI spokesperson for state affairs
Ukraine’s Foreign Ministry introduced an artificial intelligence-powered spokesperson, Victoria Shi, to deliver official statements and interact with the press, marking a historic use of digital personnel for governmental communications.
Victoria Shi, created with the likeness and voice of Ukrainian singer Rosalie Nombre, will have her statements written and confirmed by humans, using AI for her visual presentation.
The ministry aims to utilize AI, including measures like QR codes for statement verification, to save time and money, particularly in the context of its ongoing conflict with Russia.
NVIDIA Research has introduced VILA, a vision language model that can reason across multiple images, learn in context, and even understand videos. Vision language models have historically struggled with in-context learning and reasoning across multiple images. Read the technical deep dive: https://nvda.ws/3QtMrsM
What Else Is Happening in AI on May 03rd, 2024
OpenAI prepares to challenge Google with ChatGPT-powered search: OpenAI is building a search engine, search.chatgpt.com, potentially powered by Microsoft Bing. This leverages their existing web crawler and Bing’s custom GPT-4 for search, posing a serious threat to Google’s dominance. (Link)
Microsoft bans U.S. police use of Azure OpenAI for facial recognition
Microsoft has banned U.S. police from using Azure OpenAI Service for facial recognition, including integrations with OpenAI’s image-analyzing models. The move follows Axon’s controversial GPT-4-powered tool to summarize audio from the body camera. However, the ban has exceptions and doesn’t cover Microsoft’s other AI law enforcement contracts. (Link)
IBM expands AI and data software on AWS marketplace
IBM has significantly expanded its software offerings on the AWS Marketplace, making 44 products accessible to customers in 92 countries, up from just five. The move, part of a strategic collaboration with AWS, focuses on AI and data technologies like watsonx.data, watsonx.ai, and the upcoming watsonx.governance. (Link)
Google Cloud supports Azure and AWS; integrates AI for security
Google Cloud now supports Azure and AWS, enabling enterprises to manage security across multi-cloud platforms. AI integration with existing solutions streamlines user experience and addresses the security talent gap. The AI-powered design manages risks efficiently amid increasing cyber threats, while extensive support simplifies tasks for enterprises. (Link)
Microsoft invests $2.2B in Malaysia’s cloud and AI transformation
Microsoft is investing $2.2 billion over the next four years to support Malaysia’s digital transformation, its largest investment in the country’s 32-year history. The investment includes building cloud and AI infrastructure, creating AI skilling opportunities for 200,000 people, establishing a national AI Centre of Excellence, enhancing cybersecurity capabilities, and supporting the growth of Malaysia’s developer community. (Link)
Trending AI Tools on May 03rd 2024
Waxaing AI – Project management tool for growth marketers
Amazon Q – Generative AI-powered assistant from AWS
Soundry AI – AI sound sample VST for music creation and DJing
NVIDIA ChatRTX – New features including multi-language voice support and more LLMs
This week in AI – all the Major AI developments in a nutshell
AI21 Labs introduced Jamba, a production-grade Mamba-based model. By enhancing Mamba Structured State Space model (SSM) technology with elements of the traditional Transformer architecture, Jamba compensates for the inherent limitations of a pure SSM model. Jamba optimizes for memory, throughput, and performance all at once. It outperforms or matches other state-of-the-art models in its size class. Jamba has been released with open weights, licensed under Apache 2.0. Available on Hugging Face and coming soon to the NVIDIA API catalog [Details].
Databricks introduced DBRX, an open, general-purpose LLM that uses a fine-grained mixture-of-experts (MoE) architecture with 132B total parameters of which 36B parameters are active on any input. Across a range of standard benchmarks, DBRX outperforms open LLMs like Mixtral, LLaMA2-70B and Grok-1. It surpasses GPT-3.5, and it is competitive with Gemini 1.0 Pro. It is an especially capable code model, surpassing specialized models like CodeLLaMA-70B on programming, in addition to its strength as a general-purpose LLM. The model is also available on the Databricks API [Details].
Hume AI released Empathic Voice Interface (EVI), a first-of-its-kind conversational AI with emotional intelligence. EVI uses a new form of multimodal generative AI that integrates large language models (LLMs) with expression measures, which Hume refers to as an empathic large language model (eLLM). The eLLM enables EVI to adjust the words it uses and its tone of voice based on the context and the user’s emotional expressions [Demo | Details| wait list].
Tencent introduced AniPortrait, a novel framework for generating high-quality animation driven by audio and a reference portrait image. Code and model weights have been released [Paper | GitHub]
X announced an update to its AI chatbot Grok-1.5, with improvements in performance in coding and math-related tasks, and a context length of 128,000 tokens. Grok-1.5 will soon be available to early testers. Earlier Elon Musk had announced that all Premium subscribers on X will gain access to Grok this week, not just those on Premium+, as before [Details]
Qwen (Alibaba Cloud) released Qwen1.5-MoE-A2.7B, a small MoE model with only 2.7 billion activated parameters that matches the performance of state-of-the-art 7B models like Mistral 7B and Qwen1.5-7B. Compared to Qwen1.5-7B, which contains 6.5 billion non-embedding parameters, it achieves a 75% decrease in training expenses and accelerates inference speed by a factor of 1.74 [Details].
Claude 3 models dominate the LMSYS Chatbot Arena Leaderboard. Claude 3 Opus tops the list, beating GPT-4 Turbo, while Claude 3 Sonnet outperforms older GPT-4 models and Claude 3 Haiku beats Mistral Large [Link].
Adobe introduced a structure reference feature for Firefly AI and GenStudio for brands. It lets users take one image and generate new ones that may be completely different stylistically but whose internal elements are arranged and sized like those of the reference image [Details].
Meta AI introduced OPT2I, a training-free text-to-image (T2I) optimization-by-prompting framework that provides refined prompts for a T2I model to improve prompt-image consistency. The framework starts from a user prompt and iteratively generates revised prompts with the goal of maximizing a consistency score, boosting prompt-image consistency by up to 24.9% [Paper]
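OPT2I's outer loop is straightforward to sketch: propose revised prompts, score them for prompt-image consistency, and keep the best one seen so far. Below is a hypothetical minimal version in Python, with toy stand-ins for the revision LLM and the consistency scorer (the real framework uses an LLM to propose revisions conditioned on past attempts and a consistency metric such as CLIPScore-style scoring).

```python
def optimize_prompt(user_prompt, revise, score, n_iters=5):
    """Keep the highest-scoring prompt seen across revision rounds."""
    best, best_score = user_prompt, score(user_prompt)
    history = [(best, best_score)]
    for _ in range(n_iters):
        candidate = revise(history)   # LLM proposes a revision given past prompts and scores
        s = score(candidate)          # consistency of generated image with the prompt
        history.append((candidate, s))
        if s > best_score:
            best, best_score = candidate, s
    return best

# Toy stand-ins: "consistency" rewards longer prompts; revision appends detail.
toy_score = len
toy_revise = lambda hist: hist[-1][0] + ", highly detailed"
print(optimize_prompt("a red fox", toy_revise, toy_score, n_iters=3))
# → "a red fox, highly detailed, highly detailed, highly detailed"
```

Because the loop only reads scores, it needs no gradients or fine-tuning, which is what makes the approach "training-free".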
OpenAI has started testing usage-based GPT earnings by partnering with a small group of US builders [Details].
Adobe introduced Firefly Services and Custom Models. Firefly Services makes over 20 new generative and creative APIs available to developers. Custom Models, allows businesses to fine tune Firefly models based on their assets [Details].
Nvidia announced a collaboration with Hippocratic AI, a healthcare company offering generative AI nurses that range in specialty from “Colonoscopy Screening” to “Breast Cancer Care Manager” and work for $9 an hour [Details].
Worldcoin Foundation open-sourced the core components of its iris-scanning Orb’s software [Details].
Emad Mostaque resigned from his role as CEO of Stability AI and from his position on the company’s Board of Directors to pursue decentralized AI [Details]
Stability AI released Stable Code Instruct 3B, an instruction-tuned Code LM based on Stable Code 3B. With natural language prompting, this model can handle a variety of tasks such as code generation, math and other software development related queries [Details].
Mistral AI released the Mistral-7B-v0.2 base model. This is the base model behind Mistral-7B-Instruct-v0.2, released in December 2023 [Details]
OpenAI shared new examples of Sora generations by visual artists, designers, creative directors, and filmmakers [Details].
A Daily chronicle of AI Innovations May 02nd 2024: Sam Altman: “GPT-4 is the dumbest AI you’ll ever have to use” TikTok and UMG strike new licensing deal including AI protections Anthropic’s ChatGPT rival Claude is now available on iOS Atlassian launches Rovo, its new AI teammate Better and faster LLMs via multi-token prediction: New research Google’s AI advancements spurred Microsoft’s billion-dollar OpenAI investment Sanctuary AI teams up with Microsoft to advance general-purpose robot AI Nvidia’s ChatRTX now supports voice queries and Google’s Gemma model Atlassian launches Rovo: An AI assistant for enhanced teamwork MongoDB launches an AI app-building toolkit to help businesses use gen AI Ideogram introduces Pro Tier: 12,000 fast AI image generations monthly
Sam Altman: “GPT-4 is the dumbest AI you’ll ever have to use”
OpenAI CEO Sam Altman described GPT-4 as the “dumbest” AI model users will have to interact with, anticipating significantly smarter versions in the future.
Altman emphasized that OpenAI’s AI models, including GPT-4, have considerable potential for improvement, with successive models like GPT-5 and GPT-6 expected to be significantly smarter.
Despite the high cost of development, which could range from $500 million to $50 billion annually, Altman is confident in the future value these advanced AI systems will bring to society.
TikTok and UMG strike new licensing deal including AI protections
UMG has reached a “multi-dimensional” agreement with TikTok to bring back its artists, including Taylor Swift, Drake, and Olivia Rodrigo, to the platform’s over one billion users.
The deal includes provisions to collaborate on ensuring AI development in the music industry protects artists’ rights and addresses unauthorized AI-generated music on TikTok.
Additionally, the agreement will introduce new monetization opportunities for artists through TikTok’s expanding e-commerce capabilities and enhance tools for artist promotion, including analytics and integrated ticketing.
Anthropic’s ChatGPT rival Claude is now available on iOS
Anthropic has launched an iOS app for its AI model Claude 3, which has been anticipated due to the company’s recent hiring of iOS engineers.
The app offers both free and premium access to Claude 3’s features, including its vision capabilities for real-time image analysis and summarization.
Anthropic also launched a new “Team” plan, priced at $30 per user per month, which offers advanced administrative controls and collaboration features, aiming to secure a stronger foothold in the enterprise AI market.
Atlassian launched Rovo, an AI assistant designed to integrate with its products, making data from various tools easily accessible and enabling automation of workflows in applications like Jira and Confluence.
Rovo is based on Atlassian’s “cloud teamwork graph” and aims to simplify tasks for knowledge workers by assisting in finding, understanding, and taking action on work-related information.
The AI features Rovo Agents that can automate complex tasks and workflows, offering customization through a natural language interface without the need for programming skills.
Better and faster LLMs via multi-token prediction: New research
New research, apparently from Meta, has proposed a novel approach to training language models (LMs). It suggests that training LMs to predict multiple future tokens at once, instead of only the next token in a sequence, results in higher sample efficiency. The architecture is simple, with no training-time or memory overhead.
Figure: Overview of multi-token prediction
The research also provides experimental evidence that this training paradigm is increasingly useful for larger models and in particular, shows strong improvements for code tasks. Multi-token prediction also enables self-speculative decoding, making models up to 3 times faster at inference time across a wide range of batch sizes.
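The objective described above can be sketched minimally: one shared trunk produces a hidden state per position, and n independent output heads each predict the token k steps ahead, with the per-head cross-entropies combined into one loss. This is a hypothetical NumPy illustration with random weights standing in for a trained transformer; names and sizes are illustrative, not from the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB, HIDDEN, N_FUTURE = 50, 16, 4  # toy sizes

# Stand-in for a transformer trunk: one hidden state per position
def trunk(token_ids):
    return rng.standard_normal((len(token_ids), HIDDEN))

# One independent output head per future offset, all reading the same trunk state
heads = [rng.standard_normal((HIDDEN, VOCAB)) for _ in range(N_FUTURE)]

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def multi_token_loss(token_ids):
    """Mean cross-entropy for predicting tokens t+1 .. t+N_FUTURE at each position t."""
    h = trunk(token_ids)
    total, count = 0.0, 0
    for k, head in enumerate(heads, start=1):
        for t in range(len(token_ids) - k):   # positions that still have a target k steps ahead
            probs = softmax(h[t] @ head)
            total -= np.log(probs[token_ids[t + k]])
            count += 1
    return total / count

seq = rng.integers(0, VOCAB, size=12).tolist()
print(round(multi_token_loss(seq), 3))
```

At inference time the extra heads can either be dropped (recovering an ordinary next-token model) or used to draft several tokens at once, which is what enables the self-speculative decoding speedup.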
Why does it matter?
LLMs such as GPT and Llama rely on next-token prediction. Despite their recent impressive achievements, next-token prediction remains an inefficient way of acquiring language, world knowledge, and reasoning capabilities. It latches onto local patterns and overlooks “hard” decisions.
Perhaps, multi-token prediction could bring a shift in how LMs learn. It could equip LLMs with deeper understanding and complex problem-solving capabilities. (or Meta just wasted their compute.)
Anthropic launches an iOS app and a new plan for teams
Anthropic, the creator of the Claude 3 AI models, released a new iOS app named Claude. The app enables users to access AI models, chat with them, and analyze images by uploading them.
Anthropic also introduced a paid team plan, offering enhanced features like more chat queries and admin control for groups of five or more. The app is free for all users of Claude AI models, including free users, Claude Pro subscribers, and team plan members. The company will also roll out an Android version soon.
Why does it matter?
Though a little late with its mobile app, Anthropic has caught up with competitors like OpenAI and Google, whose apps have been available for quite a while. The company decided to offer an app because many users were already accessing its AI models through the web.
Google’s AI advancements may have spurred Microsoft’s billion-dollar OpenAI investment
Internal emails have revealed that Microsoft invested $1 billion in OpenAI in 2019 out of fear that Google was significantly ahead in its AI efforts.
Microsoft CTO Kevin Scott sent a lengthy email to CEO Satya Nadella and Bill Gates stating that Google’s AI-powered “auto complete in Gmail” was getting “scarily good” and that Microsoft was years behind in terms of ML scale.
The emails, with the subject line “Thoughts on OpenAI,” were made public on Tuesday as part of the Department of Justice’s antitrust case against Google. A large section of Scott’s email was redacted. Check out the email here.
Why does it matter?
While some might call it paranoia, the well-timed move has undeniably paid off: the initial $1 billion has since grown into a multi-billion-dollar partnership with OpenAI.
While the surfacing of these emails highlights the growing scrutiny of competition in the tech industry, it also raises the question of whether Microsoft’s investment in OpenAI has influenced the overall direction of AI research and development.
Sanctuary AI teams up with Microsoft to advance general-purpose robot AI
Sanctuary AI has announced a collaboration with Microsoft to develop AI models for general-purpose humanoid robots. The partnership will leverage Microsoft’s Azure cloud computing platform and AI technologies to enhance the capabilities of Sanctuary AI’s robots. (Link)
Nvidia’s ChatRTX now supports voice queries and Google’s Gemma model
Nvidia has updated its ChatRTX chatbot to support Google’s Gemma model, voice queries, and additional AI models. The chatbot, which runs locally on a PC, enables users to search personal documents and YouTube videos using various AI models, including ChatGLM3 and OpenAI’s CLIP model. (Link)
Atlassian launches Rovo: An AI assistant for enhanced teamwork
Atlassian has launched Rovo, an AI assistant designed to improve teamwork and productivity. Rovo integrates with Atlassian’s products and offers features such as AI-powered search, workflow automation, and integration with third-party tools like Google Drive, Microsoft SharePoint, and Slack. (Link)
MongoDB launches an AI app-building toolkit to help businesses use gen AI
MongoDB has launched the MongoDB AI Applications Program (MAAP) to help companies accelerate the building and deployment of AI-powered applications. It brings together consultancies, foundation model providers, cloud infrastructure, generative AI frameworks, and model hosting with MongoDB Atlas to develop solutions for business problems. (Link)
Ideogram introduces Pro Tier: 12,000 fast AI image generations monthly
Ideogram has launched a paid Pro tier for its AI image generation platform, allowing users to generate up to 12,000 images per month at faster speeds. The platform utilizes AI algorithms to create high-quality images for various applications, including design, marketing, and content creation. (Link)
Users will be able to type ‘@gemini’ followed by a prompt in Chrome’s desktop address bar to get responses from the AI without navigating to a separate site.
The feature builds on existing shortcuts like ‘@bookmarks’ and ‘@history’, making AI chat as seamless as other browser functions.
Gemini’s Extensions feature now also connects the chatbot to Maps, YouTube, Hotels, and Workspace.
The features are only being rolled out to supported languages and countries.
Why it matters: Gemini just got a whole lot more accessible — with the shortcut and integrations not only boosting the chatbot’s reach, but also introducing a wave of non-AI users to the tech. Subtle but impactful changes like these are what drive serious shifts in user habits.
Ohio State University researchers just developed CURE, an AI model that can accurately estimate drug treatment effects and effectiveness without clinical trials.
The details:
The model is trained on de-identified health records of over 3M patients, allowing it to gain a deep understanding of patient characteristics.
CURE outperformed seven other leading AI models in treatment effectiveness predictions, with improvements of 7-8% on key benchmarks.
The AI predictions are closely aligned with clinical trial findings in tests, showcasing the potential to generate insights that streamline drug testing.
Why it matters: With the ability to crunch massive medical datasets, CURE represents a significant step towards systems that can reliably estimate real-world drug effectiveness — potentially accelerating the discovery of new treatments without the cost and long timeframes of traditional clinical trials.
Trending AI Tools
Claude iOS App – The powerful AI assistant by Anthropic, now on your iPhone
Udio New Features – Generate AI music longer than 2 minutes and extend tracks up to 15 minutes
Loom AI workflow – Turn any loom video into share-ready docs in a click
Atlassian Rovo – AI Teammate for data-driven business decisions
Kratful – AI-driven feedback analysis for product optimization
Stability Matrix – Simplify Stable Diffusion UIs and models management
A Daily chronicle of AI Innovations May 01st 2024: Amazon has launched Amazon Q, a Gen AI assistant for businesses and developers Google’s Med-Gemini models outperform doctors Apple has set up a secretive AI lab in Switzerland Google to pay News Corp $5-6 million per year to develop AI content and products Yelp is launching an AI chatbot to help consumers connect with relevant businesses Apple is testing Safari 18 with new features: Intelligent Search and Web Eraser Eight US newspapers have sued Microsoft and OpenAI for copyright infringement A study of 16,000 patients reveals that AI ECG alert systems significantly lower all-cause mortality
Amazon has launched Amazon Q, a Gen AI assistant for businesses and developers
The details:
Q Developer offers industry-leading code generation, testing, debugging, reasoning, and agents for step-by-step planning.
Q Business connects to company data repositories, enabling users to easily get answers, summarize info, analyze trends, and interact with enterprise data.
A new Q Apps feature allows non-technical users to create custom AI applications using natural language prompts from company data.
AWS VP Dr. Swami Sivasubramanian said Q can help employees become ‘over 80% more productive’ at their jobs.
Amazon has launched Amazon Q, a generative AI assistant designed for developers and businesses. It comes in three distinct offerings:
Amazon Q Developer frees up precious time by handling tedious tasks like testing, debugging, and optimizing AWS resources so developers can focus on core coding and innovation.
Amazon Q Business connects to 40+ enterprise data sources and equips employees with a data-driven digital assistant to answer questions, create reports, and provide insights based on enterprise data repositories.
Amazon Q Apps allows non-technical employees to build generative AI applications using natural language prompts.
Amazon is driving real-world impact by offering a free tier for Q Developer and reporting early customer productivity gains of over 80%. Amazon Q Developer Pro is available for $19/user/month and Amazon Q Business Pro for $20/user/month. A free trial of both Pro tiers is available until June 30, 2024.
Why does it matter?
By introducing a free tier for Q Developer and the user-friendly nature of Q Apps, Amazon could accelerate innovation across the software development lifecycle and business workflows. This could empower domain experts and business leaders to use AI to solve their specific challenges directly, leading to more tailored AI applications across various industries.
Researchers from Google and DeepMind have introduced Med-Gemini, a family of highly capable multimodal AI models specialized in medicine. Based on the strengths of the Gemini models, Med-Gemini shows significant improvements in clinical reasoning, multimodal understanding, and long-context understanding. Models can be customized to fit novel medical modalities through specialized encoders, and web searches can be used for up-to-date information.
Med-Gemini has shown state-of-the-art performance on 10 of 14 medical benchmarks, including text, multimodal, and long-context applications. Moreover, the models achieved 91.1% accuracy on the MedQA (USMLE) benchmark, exceeding the previous best models by 4.6%. Its strong performance in summarizing medical notes, generating clinical referral letters, and answering electronic health record questions confirms Med-Gemini’s potential real-world use.
Why does it matter?
These models can reduce the administrative burden on healthcare professionals by outperforming human experts in tasks like medical text summarization and referral letter generation. Moreover, Med-Gemini’s ability to engage in multimodal medical dialogues and explain its reasoning can lead to more personalized and transparent care, reduce misdiagnosis due to lack of physician knowledge, and save lives and money.
Apple has set up a secretive AI lab in Switzerland
Since 2018, the company has quietly hired 36 AI experts from Google, including notable figures like Samy Bengio and Ruoming Pang, for its secretive “Vision Lab.” The lab focuses on building advanced AI models and products, and it is particularly interested in text- and visual-based AI systems akin to OpenAI’s ChatGPT. Apple has also acquired AI startups FaceShift and Fashwall, which are likely contributing to the establishment of the new lab.
Why does it matter?
Apple may have been fashionably late to AI development, but quietly setting up the Zurich lab and primary AI development centers in California and Seattle signifies the company’s AI ambitions.
Google to pay News Corp $5-6 million per year to develop AI content and products
While News Corp denies any specific AI licensing deal, the arrangement highlights a growing trend of tech giants licensing news archives for language model training. Similar deals were inked between OpenAI and the Financial Times, showing the importance of quality data. (Link)
Yelp is launching an AI chatbot to help consumers connect with relevant businesses
The chatbot uses OpenAI’s LLMs and Yelp’s data to understand user problems and provide relevant professional suggestions. Yelp also introduces a “Project Ideas” section for personalized recommendations and checklists. Meanwhile, restaurants are getting a revamped guest management system for better staff utilization, real-time table status, and customer updates. (Link)
Apple is testing Safari 18 with new features: Intelligent Search and Web Eraser
Intelligent Search uses Apple’s on-device AI to identify topics and key phrases for summarization. Web Eraser allows users to persistently remove unwanted content from web pages. Apple is also working on an AI Visual Lookup feature for 2025, allowing users to obtain product information from images. These AI enhancements will debut with iOS 18 and macOS 15 at WWDC in June. (Link)
Eight US newspapers have sued Microsoft and OpenAI for copyright infringement
These newspapers, owned by Alden Global Capital’s MediaNews Group, allege that the companies misused their articles to train Copilot and ChatGPT without permission or payment. The New York Times, The Intercept, Raw Story, and AlterNet have filed similar lawsuits. The newspapers claim that the AI systems reproduce their content verbatim and generate fake articles that damage their reputation. (Link)
A study of 16,000 patients reveals that AI ECG alert systems significantly lower all-cause mortality
The AI was trained on over 450,000 ECG tests and survival data to predict a patient’s risk of death. Physicians were alerted when a patient’s ECG indicated they were in the top 5% risk category. The AI reduced overall deaths by 17% and cardiac deaths by 93% for high-risk patients. (Link)
Apple poached 30+ Google experts to open a secret AI lab
Apple has reportedly opened a secret AI research lab in Zurich, known as the “Vision Lab,” after hiring at least 36 AI experts from Google.
The Zurich-based “Vision Lab,” led by former Google AI head John Giannandrea, has already produced significant research in generative AI, focusing on models that interpret text and imagery to deliver precise results.
Despite Apple’s silent approach in AI research, leading to perceptions of its lateness in the AI race, the company has been discreetly advancing cutting-edge AI technology and maintaining a low profile in recruitment and product development.
Mysterious “gpt2-chatbot” AI model appears suddenly, confuses experts
A new chatbot named “gpt2-chatbot” has appeared on the LMSYS Chatbot Arena, sparking speculation that it might be a secret test of OpenAI’s upcoming models, such as GPT-4.5 or GPT-5, although its performance has not significantly surpassed that of existing models like GPT-4 Turbo.
Early user reports praise the mysterious model for its impressive reasoning and ability to answer challenging AI questions effectively, but detailed testing is limited due to a rate restriction of eight queries per day.
Despite ongoing speculation and hints by OpenAI’s CEO, the exact nature and capability of the “gpt2-chatbot” remain unclear, with some suggesting it could be an OpenAI preview.
Apple reportedly working on AI-enabled Safari
Apple is developing Safari 18 with features like a personalized user interface, “Web Eraser,” and AI-powered “Intelligent Search,” set to release alongside iOS 18 and macOS 15.
“Intelligent Search” in Safari 18 will use Apple’s Ajax language-learning model for on-device AI technology to summarize webpage content.
Safari 18’s “Web Eraser” feature will allow users to delete specific parts of web pages, enhancing privacy by remembering these changes even after closing the tab or window.