A Daily Chronicle of AI Innovations in April 2024


Welcome to the April 2024 edition of the Daily Chronicle, your gateway to the latest Artificial Intelligence innovations! Join us as we uncover the most recent advancements, trends, and groundbreaking discoveries in the world of AI. Explore a realm where industry leaders gather at events like ‘AI Innovations at Work’ and where visionary forecasts shape the future of AI. Stay informed with daily updates as we navigate through the dynamic world of AI, uncovering its potential impact and exploring cutting-edge developments throughout this exciting month. Join us on this thrilling journey into the limitless possibilities of AI in April 2024.

Experience the transformative capabilities of AI with “Read Aloud For Me – AI Dashboard – AI Tools Catalog” – your ultimate AI Dashboard and Hub. Seamlessly access a comprehensive suite of top-tier AI tools within a single app, meticulously crafted to enhance your efficiency and streamline your digital interactions. Now available on the web at readaloudforme.com and across popular app platforms including Apple, Google, and Microsoft, “Read Aloud For Me – AI Dashboard” places the future of AI at your fingertips, blending convenience with cutting-edge innovation. Whether for professional endeavors, educational pursuits, or personal enrichment, our app serves as your portal to the forefront of AI technologies. Embrace the future today by downloading our app and revolutionize your engagement with AI tools.


A Daily chronicle of AI Innovations April 18th 2024: 🧠 Samsung unveils lightning-fast DRAM for AI-powered devices 🤖 Logitech’s new AI prompt builder & Signature AI edition mouse 📸 Snapchat to add watermark to images produced with its AI tools ✈️ US Air Force confirms first successful AI dogfight 🏆 Mistral’s latest model sets new records for open source LLMs 🎭 Microsoft’s new AI model creates hyper-realistic video using static image 👁️ GPT-4 nearly matches expert doctors in eye assessments 🔒 Brave unleashes real-time privacy-focused AI answer engine

Microsoft’s VASA-1 generates lifelike talking faces with audio

Microsoft Research’s groundbreaking project, VASA-1, introduces a remarkable framework for generating lifelike talking faces from a single static image and a speech audio clip.

This premier model achieves exquisite lip synchronization and captures a rich spectrum of facial nuances and natural head motions, resulting in hyper-realistic videos.


Why does it matter?

VASA-1 is crucial in AI for improving lifelike interactions with realistic facial expressions, benefiting customer service, education, and companionship. Its expressive features also enhance storytelling in games and media. Additionally, VASA-1 contributes to developing accessibility tools for those with communication challenges.

Source



Boston Dynamics charges up for the future by electrifying Atlas

Boston Dynamics has unveiled an electric version of their humanoid robot, Atlas. Previously powered by hydraulics, the new Atlas operates entirely on electricity. This development aims to enhance its strength and range of motion, making it more versatile for real-world applications.

Boston Dynamics also plans to collaborate with partners like Hyundai to test and iterate Atlas applications in various environments, including labs, factories, and everyday life.

Why does it matter?


The electric version of Boston Dynamics’ humanoid robot, Atlas, matters because it offers enhanced strength, agility, and practicality for real-world applications. Its electric power source allows it to move in ways that exceed human capabilities, making it versatile for various tasks.

Source

Intel reveals world’s largest brain-inspired computer

Intel has introduced the world’s largest neuromorphic computer, which mimics the human brain. Unlike traditional computers, it combines computation and memory using artificial neurons. With 1.15 billion neurons, it consumes 100 times less energy than conventional machines and performs 380 trillion synaptic operations per second. This breakthrough could revolutionize AI and enhance energy-efficient computing.

Why does it matter?

In current AI models, data transfer between processing units can be a bottleneck. Neuromorphic architectures directly address this issue by integrating computation and memory. This could lead to breakthroughs in training deep learning models.

Source

✈️ US Air Force confirms first successful AI dogfight

  • The US Air Force, via DARPA, announced that an AI-controlled jet successfully engaged in an in-air dogfight against a human pilot for the first time, during tests at Edwards Air Force Base in California in September 2023.
  • DARPA has been working on AI for air combat through its Air Combat Evolution (ACE) program since December 2022, aiming to develop AI capable of autonomously flying fighter jets while adhering to safety protocols.
  • The AI was tested in a real aircraft, the experimental X-62A, against an F-16 flown by a human, achieving close maneuvers without the need for human pilots to intervene, and plans to continue testing through 2024.
  • Source

🏆 Mistral’s latest model sets new records for open source LLMs 

  • French AI startup Mistral AI has released Mixtral 8x22B, claiming it to be the highest-performing and most efficient open-source language model, utilizing a sparse mixture-of-experts model with 39 billion of its 141 billion parameters active.
  • Mixtral 8x22B excels in multilingual support and possesses strong math and programming capabilities, despite having a smaller context window compared to leading commercial models like GPT-4 or Claude 3.
  • The model, licensed under the Apache 2.0 license for unrestricted use, achieves top results on various comprehension and logic benchmarks and outperforms other models in its supported languages on specific tests.
  • Source

🧠 Intel unveils the world’s largest neuromorphic computer

  • Intel Labs introduced its largest neuromorphic computer yet, the Hala Point, featuring 1.15 billion neurons (likened to the brain capacity of an owl), which aims to process information more efficiently by emulating the brain’s neurons and synapses in silicon.
  • The Hala Point system, consuming 2,600 W, is designed to achieve deep neural network efficiencies up to 15 TOPS/W at 8-bit precision, significantly surpassing Nvidia’s current and forthcoming systems in energy efficiency.
  • While showcasing remarkable potential for AI inference and optimization problems with significantly reduced power consumption, Intel’s neuromorphic technology is not yet a universal solution for all AI workloads, with limitations in general-purpose AI acceleration and challenges in adapting large language models.
  • Source

🎭 Microsoft’s new AI model creates hyper-realistic video using static image

  • Microsoft introduced VASA-1, an AI model that produces hyper-realistic videos from a single photo and audio clip, featuring realistic lip syncs and facial movements.
  • The model can create 512x512p resolution videos at 40fps from one image, support modifications like eye gaze and emotional expressions, and even incorporate singing or non-English audio.
  • While Microsoft recognizes the AI’s potential for misuse in creating deepfakes, it intends to use VASA-1 solely for developing virtual interactive characters and advancing forgery detection.
  • Source

👁️ GPT-4 nearly matches expert doctors in eye assessments

  • OpenAI’s GPT-4 almost matched the performance of expert ophthalmologists in an eye assessment study, as reported by the Financial Times and conducted by the University of Cambridge’s School of Clinical Medicine.
  • GPT-4 scored higher than trainee and junior doctors with 60 correct answers out of 87, closely following the expert doctors’ average score of 66.4, in a test evaluating knowledge on various ophthalmology topics.
  • The study, highlighting both potential benefits and risks, indicates that while GPT-4 shows promise in medical assessments, concerns about inaccuracies and the model’s tendency to “hallucinate” answers remain.
  • Source

What Else Is Happening in AI on April 18th 2024❗

🧠 Samsung unveils lightning-fast DRAM for AI-powered devices

Samsung Electronics has achieved a significant milestone by developing the industry’s fastest LPDDR5X DRAM, capable of reaching speeds up to 10.7 Gbps. This new LPDDR5X offers 25% higher performance and 30% more capacity, making it an optimal solution for the on-device AI era. (Link)

🤖 Logitech’s new AI prompt builder & Signature AI edition mouse

Logitech has launched the Logi AI Prompt Builder, a free software tool that enhances interaction with OpenAI’s ChatGPT. It allows Logitech keyboards and mice to serve as shortcuts for more fluent AI prompts. Additionally, Logitech introduced the Signature AI Edition Mouse, featuring a dedicated AI prompt button. (Link)


📸 Snapchat to add watermark to images produced with its AI tools

Snapchat plans to add watermarks to AI-generated images on its platform. These watermarks, featuring a translucent Snap logo and a sparkle emoji, will enhance transparency and prevent content misuse.  (Link)

🔒 Brave unleashes real-time privacy-focused AI answer engine

Brave, the privacy-centric browser, has introduced an AI-driven answer engine within Brave Search. Unlike competitors, it prioritizes privacy by avoiding external search engines. The feature provides real-time generative answers across multiple languages, making it a robust alternative to traditional search.  (Link)

💼 LinkedIn tests premium company page subscription 

LinkedIn is quietly testing a Premium Company Page subscription service for small and medium businesses. The service includes AI-generated content, follower-enhancing tools, and other features to elevate company profiles. Pricing starts at $99.99 per month.    (Link)

A Daily chronicle of AI Innovations April 17th 2024: 🎮 NVIDIA RTX A400 A1000: Lower-cost single slot GPUs; 📊 Stanford’s report reflects industry dominance and rising training costs in AI; 🎵 Amazon Music launches Maestro, an AI playlist generator; 📷 Snap adds watermarks to AI-generated images; 🤖 Boston Dynamics unveils a new humanoid robot; 💰 Andreessen Horowitz raises $7.2 billion, a sign that tech startup market may be bouncing back; 💰 OpenAI offers a 50% discount for off-peak GPT usage; 💻 AMD unveils AI chips for business laptops and desktops; 🧠 Anthropic Claude 3 Opus is now available on Amazon Bedrock; 👤 Zendesk launches an AI-powered customer experience platform; 💼 Intel and The Linux Foundation launch Open Platform for Enterprise AI (OPEA)

Google will pump more than $100B into AI, says DeepMind boss

  • DeepMind CEO predicts Google will invest over $100 billion in AI, surpassing rivals like Microsoft in processing prowess.
  • Google’s investment in AI may involve hardware like Axion CPUs based on the Arm architecture, claimed to be faster and more efficient than competitors.
  • Some of the budget will likely go to DeepMind, known for its work on the software side of AI, despite recent mixed results in material discoveries and weather prediction.
  • DeepMind has made progress in teaching AI social skills, a crucial step in advancing AI capabilities.
  • Hassabis emphasized the need for significant computing power, a reason for teaming up with Google in 2014.

Source

A monster of a paper by Stanford, a 500-page report on the 2024 state of AI

Top 10 Takeaways:

  1. AI beats humans on some tasks, but not on all. AI has surpassed human performance on several benchmarks, including some in image classification, visual reasoning, and English understanding. Yet it trails behind on more complex tasks like competition-level mathematics, visual commonsense reasoning and planning.
  2. Industry continues to dominate frontier AI research. In 2023, industry produced 51 notable machine learning models, while academia contributed only 15. There were also 21 notable models resulting from industry-academia collaborations in 2023, a new high.
  3. Frontier models get way more expensive. According to AI Index estimates, the training costs of state-of-the-art AI models have reached unprecedented levels. For example, OpenAI’s GPT-4 used an estimated $78 million worth of compute to train, while Google’s Gemini Ultra cost $191 million for compute.
  4. The United States leads China, the EU, and the U.K. as the leading source of top AI models. In 2023, 61 notable AI models originated from U.S.-based institutions, far outpacing the European Union’s 21 and China’s 15.
  5. Robust and standardized evaluations for LLM responsibility are seriously lacking. New research from the AI Index reveals a significant lack of standardization in responsible AI reporting. Leading developers, including OpenAI, Google, and Anthropic, primarily test their models against different responsible AI benchmarks. This practice complicates efforts to systematically compare the risks and limitations of top AI models.
  6. Generative AI investment skyrockets. Despite a decline in overall AI private investment last year, funding for generative AI surged, nearly octupling from 2022 to reach $25.2 billion. Major players in the generative AI space, including OpenAI, Anthropic, Hugging Face, and Inflection, reported substantial fundraising rounds.
  7. The data is in: AI makes workers more productive and leads to higher quality work. In 2023, several studies assessed AI’s impact on labor, suggesting that AI enables workers to complete tasks more quickly and to improve the quality of their output. These studies also demonstrated AI’s potential to bridge the skill gap between low- and high-skilled workers. Still, other studies caution that using AI without proper oversight can lead to diminished performance.
  8. Scientific progress accelerates even further, thanks to AI. In 2022, AI began to advance scientific discovery. 2023, however, saw the launch of even more significant science-related AI applications— from AlphaDev, which makes algorithmic sorting more efficient, to GNoME, which facilitates the process of materials discovery.
  9. The number of AI regulations in the United States sharply increases. The number of AI related regulations in the U.S. has risen significantly in the past year and over the last five years. In 2023, there were 25 AI-related regulations, up from just one in 2016. Last year alone, the total number of AI-related regulations grew by 56.3%.
  10. People across the globe are more cognizant of AI’s potential impact—and more nervous. A survey from Ipsos shows that, over the last year, the proportion of those who think AI will dramatically affect their lives in the next three to five years has increased from 60% to 66%. Moreover, 52% express nervousness toward AI products and services, marking a 13 percentage point rise from 2022. In America, Pew data suggests that 52% of Americans report feeling more concerned than excited about AI, rising from 37% in 2022.

Source

NVIDIA RTX A400 A1000: Lower-cost single slot GPUs

NVIDIA is expanding its lineup of professional RTX graphics cards with two new desktop GPUs – the RTX A400 and RTX A1000. These new GPUs are designed to bring enhanced AI and ray-tracing capabilities to workstation-class computers. The RTX A1000 GPU is already available from resellers, while the RTX A400 GPU is expected to launch in May.

NVIDIA RTX A400


With 24 tensor cores for AI processing, the A400 enables professionals to run AI apps directly on their desktops, such as intelligent chatbots and copilots. The GPU allows creatives to produce vivid, physically accurate 3D renderings. The A400 also features four display outputs, making it ideal for high-density display environments such as financial services, command and control, retail, and transportation.

NVIDIA RTX A1000


With 72 Tensor Cores, the A1000 offers 3x faster generative AI processing for tools like Stable Diffusion. The A1000 also excels in video processing, as it can process up to 38% more encoding streams and offers up to 2x faster decoding performance than the previous generation. With their slim single-slot design and power consumption of just 50W, the A400 and A1000 GPUs offer impressive features for compact, energy-efficient workstations.

Why does it matter?

NVIDIA RTX A400 and A1000 GPUs provide professionals with cutting-edge AI, graphics, and computing capabilities to increase productivity and unlock creative possibilities. These GPUs can be used by industrial designers, creatives, architects, engineers, healthcare teams, and financial professionals to improve their workflows and achieve faster and more accurate results. With their advanced features and energy efficiency, these GPUs have the potential to impact the future of AI in various industries.

Source

Amazon Music launches Maestro, an AI-based playlist generator

Amazon Music is launching its AI-powered playlist generator, Maestro, following a similar feature introduced by Spotify. Maestro allows users in the U.S. to create playlists by speaking or writing prompts. The AI will then generate a song playlist that matches the user’s input. This feature is currently in beta and is being rolled out to a subset of Amazon Music’s free, Prime, and Unlimited subscribers on iOS and Android.

Like Spotify’s AI playlist generator, Amazon has built safeguards to block inappropriate prompts. However, the technology is still new, and Amazon warns that Maestro “won’t always get it right the first time.”

Why does it matter?

Introducing AI-powered playlist generators could profoundly impact how we discover and consume music in the future. These AI tools can revolutionize music curation and personalization by allowing users to create highly tailored playlists simply through prompts. This trend could increase user engagement, drive more paid subscriptions, and spur further innovation in AI-powered music experiences as companies offer more cutting-edge features.

Source

Stanford’s report reflects industry dominance and rising training costs in AI

The AI Index, an independent report by the Stanford Institute for Human-Centered Artificial Intelligence (HAI), provides a comprehensive overview of global AI trends in 2023.

The report states that the industry outpaced academia in AI development and deployment. Out of the 149 foundational models published in 2023, 108 (72.5%) were from industry compared to just 28 (18.8%) from academia.


Google (18) leads the way, followed by Meta (11), Microsoft (9), and OpenAI (7).


The United States leads as the top source with 109 of the 149 foundational models, followed by China (20) and the UK (9). For notable machine learning models, the United States again tops the chart with 61, followed by China (15) and France (8).

Regarding AI models’ training and computing costs, Gemini Ultra leads with a training cost of $191 million, followed by GPT-4, which has a training cost of $78 million.


Lastly, in 2023, AI reached human performance levels in many key AI benchmarks, such as reading comprehension, English understanding, visual thinking, image classification, etc.


Why does it matter?

Industry dominance in AI research suggests that companies will continue to drive advancements in the field, leading to more advanced and capable AI systems. However, the rising costs of AI training may pose challenges, as it could limit access to cutting-edge AI technology for smaller organizations or researchers.

Source

A Daily chronicle of AI Innovations April 16th 2024: 🎬 Adobe partners with OpenAI, RunwayML & Pika for Premiere Pro; 🚀 Reka launches Reka Core: their frontier in multimodal AI; 🇯🇵 OpenAI is opening its first international office in Tokyo;  🤖 Hugging Face has rolled out Idefics2 ; 💬 Quora’s Poe aims to become the ‘App Store’ for AI chatbots; 👥 Instagram is testing an AI program to amplify influencer engagement;  👩‍💻 Microsoft has released and open-sourced the new WizardLM-2 family of LLMs; 📋 Limitless AI launched a personal meeting assistant in a pendant 💰 Microsoft invests $1.5 billion in AI firm 📈 Baidu says its ChatGPT-like Ernie bot exceeds 200 million users 💻 OpenAI introduces Batch API with up to 50% discount for asynchronous tasks

Adobe partners with OpenAI, RunwayML & Pika for Premiere Pro

Adobe is integrating generative AI in Premiere Pro. The company is developing its own Firefly Video Model and teaming up with third-party AI models like OpenAI’s Sora, RunwayML, and Pika to bring features like Generative Extend, Object Addition and Removal, and Generative B-Roll to the editing timeline.

Adobe is committed to an open approach for delivering models. It allows editors to choose the best AI models for their needs to streamline video workflows, reduce tedious tasks, and expand creativity. It also provides “Content Credentials” to track model usage.

Why does this matter?

Adobe Premiere Pro has been used in blockbuster films like Deadpool, Gone Girl, and Everything Everywhere All at Once. By integrating generative AI into Premiere Pro, Adobe is transforming the film industry, allowing editors to streamline workflows and focus more on creative storytelling.

Source

Reka launches Reka Core: their frontier in multimodal AI

Another day, another GPT-4-class model. But this time, it’s not from the usual suspects like OpenAI, Google, or Anthropic. Reka, a lesser-known AI startup, has launched its new flagship offering, Reka Core, one of only two commercially available comprehensive multimodal solutions. It excels at understanding images, videos, and audio while offering a massive context window, exceptional reasoning skills, and even coding.


It outperforms other models on various industry-accepted evaluation metrics. To provide flexibility, Reka Core can be deployed via API, on-premises, or on-device. Reka’s partnerships with Snowflake and Oracle are set to democratize access to this tech for AI innovation across industries.

Why does this matter?

Reka Core matches and even surpasses the performance of leading OpenAI, Google, and Anthropic models across various benchmarks and modalities. By offering cost-effective, multi-modal solutions, Reka has the potential to make advanced AI more accessible and drive new applications across multiple industries.

Source

OpenAI is opening its first international office in Tokyo

OpenAI is releasing a custom version of its GPT-4 model, specially optimized for the Japanese language. This specialized offering promises faster and more accurate performance and improved text handling.

Tadao Nagasaki has been appointed President of OpenAI Japan. The company plans to collaborate with the Japanese government, local businesses, and research institutions to develop safe AI tools that serve Japan’s unique needs. With Daikin and Rakuten already using ChatGPT Enterprise and local governments like Yokosuka City seeing productivity boosts, OpenAI is poised to impact the region significantly.

Why does it matter?

The move reflects OpenAI’s commitment to serving diverse markets. It could set a precedent for other AI companies, fostering a more inclusive and local approach. And as Japan grapples with rural depopulation and labor shortages, AI could prove invaluable in driving progress.

Source

💰 Microsoft invests $1.5 billion in AI firm

  • Microsoft will invest $1.5 billion in G42, a leading UAE artificial intelligence firm, as part of a strategic shift to align with American technology and disengage from Chinese partnerships following negotiations with the US government.
  • The investment enhances Microsoft’s influence in the Middle East, positioning G42 to use Microsoft Azure for its AI services, underpinning US efforts to limit Chinese access to advanced technologies.
  • This deal, which also involves Microsoft’s Brad Smith joining G42’s board, comes amidst broader US concerns about tech firms with Chinese ties.
  • Source

📈 Baidu says its ChatGPT-like Ernie bot exceeds 200 million users 

  • Baidu’s AI chatbot ‘Ernie Bot’ has reached 200 million users and its API is used 200 million times daily.
  • Ernie Bot, the first locally developed ChatGPT-like chatbot in China, was publicly released eight months ago after receiving approval from Beijing.
  • Despite its growth, Ernie Bot faces strong competition from rival domestic AI services, such as the Alibaba-backed ‘Kimi’ chatbot from Moonshot AI.
  • Source

💻 OpenAI introduces Batch API with up to 50% discount for asynchronous tasks

  • OpenAI introduces a new Batch API providing up to 50% discount for asynchronous tasks like summarization, translation, and image classification.
  • This Batch API returns results for bulk API requests within 24 hours: you upload a JSONL file of requests in batch format, and it currently supports only the /v1/chat/completions endpoint (see the sketch after this list).
  • OpenAI expects this to enable more efficient use of its APIs for applications that require a large number of requests.
  • Source
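
Below is a minimal sketch of that flow using OpenAI’s Python SDK. The model choice, prompts, and file contents here are illustrative assumptions rather than details from the announcement; only the overall pattern (write a JSONL file of requests, upload it, create a batch with a 24-hour completion window) comes from the description above.

```python
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# 1. Write bulk requests as one JSON object per line (JSONL).
requests = [
    {
        "custom_id": f"req-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-3.5-turbo",  # assumed model choice
            "messages": [{"role": "user", "content": f"Summarize document {i}."}],
        },
    }
    for i in range(3)
]
with open("batch_input.jsonl", "w") as f:
    for r in requests:
        f.write(json.dumps(r) + "\n")

# 2. Upload the file, then create a batch with a 24-hour completion window.
batch_file = client.files.create(file=open("batch_input.jsonl", "rb"), purpose="batch")
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)

# 3. Poll later; once the status is "completed", download the output file.
print(batch.id, batch.status)
```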

Unleash the Power of Generative AI: Build Breakthrough Apps with AWS Bedrock

Struggling to keep up with the rapid advancements in generative AI? AWS Bedrock offers a one-stop shop for developers. Access a variety of high-performing foundation models from leading names in AI, all through a single API. Fine-tune models with your data, leverage pre-built agents, and focus on building innovative applications.

For a detailed study, refer to the blog – https://www.seaflux.tech/blogs/aws-bedrock-models

What is AWS Bedrock?

AWS Bedrock is a fully managed service that streamlines the development of generative AI applications. It offers high-performing foundation models (FMs) from leading AI companies, all accessible through a single API. AWS Bedrock partners with AI leaders such as AI21 Labs, Anthropic, Cohere, Meta, and Stability AI, and also offers in-house models. Each FM has its own strengths that can be leveraged according to your project’s needs. This eliminates the need for developers to manage the infrastructure and tooling required to train and deploy their own models. Despite the simplified development process, privacy and security are not compromised: AWS Bedrock ensures the integrity and confidentiality of the data developers use to build generative AI applications.

Key Features of AWS Bedrock

  • Variety of FMs: A wide range of high-performing models are available for different tasks like text generation, image generation, code generation, and more.
  • Simple API: A single API that makes it quick and easy to integrate FMs into your applications.
  • Fully managed service: All the infrastructure and tooling are managed for you to focus on building your applications.
  • Scalable: Applications can be scaled up or down as the requirement changes.
  • Secure: AWS Bedrock provides built-in security and privacy features, ensuring integrity and confidentiality.

How does AWS Bedrock work?

  1. Choose a foundation model: Browse the available models and select the one that best fits your needs.
  2. Send an API request: Use the simple API to send your data to the chosen model.
  3. Receive the output: The model will generate the desired output, such as text, code, or an image.
  4. Integrate the output: Use the output in your application however you like (see the sketch below).
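
As a rough illustration of these four steps, here is a minimal sketch using the AWS SDK for Python (boto3). The model ID, region, and prompt are assumptions for illustration; check the Bedrock console for the models enabled in your account and each model’s request format.

```python
import json
import boto3

# 1. Choose a foundation model (an Anthropic Claude model ID is assumed here).
model_id = "anthropic.claude-3-sonnet-20240229-v1:0"

# 2. Send an API request through the Bedrock runtime client.
client = boto3.client("bedrock-runtime", region_name="us-east-1")
body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Draft a tagline for a note-taking app."}],
})
response = client.invoke_model(modelId=model_id, body=body)

# 3. Receive the output: the response body is a JSON payload.
result = json.loads(response["body"].read())

# 4. Integrate the output into your application.
print(result["content"][0]["text"])
```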

Types of Foundation Models (FMs)

AWS Bedrock provides 6 FMs with more than 15 versions that can be leveraged as per the project’s requirements. All the models are pre-trained on large datasets and are reliable tools for a wide range of applications. To learn more about the models, visit the AWS Bedrock official website.

AWS Bedrock Pricing

AWS Bedrock provides two types of pricing models and charges based on the model inference and customization in the model.

  1. On-demand & Batch: The pay-as-you-go pricing model is used without any time-based commitments.
  2. Provisioned Throughput: Sufficient throughput is provided in exchange for a time-based commitment (a 1-month or 6-month term) to meet the performance demands of the application.

Visit the AWS Bedrock official pricing page to compare models and learn more.

What Else Is Happening in AI on April 16th, 2024❗

🤖 Hugging Face has rolled out Idefics2 

Hugging Face has released Idefics2, a more compact and capable version of its visual language model. With just 8 billion parameters, this open-source model enhances image manipulation, improves OCR, and answers questions on visual data. (Link)

💬 Quora’s Poe aims to become the ‘App Store’ for AI chatbots

After a $75 million funding round, Poe has launched a “multi-bot chat” feature that allows users to seamlessly integrate various AI models into a single conversation. Positioning itself as the “app store” for chatbots, Poe is also rolling out monetization tools for creators and planning an enterprise tier for businesses. (Link)

👥 Instagram is testing an AI program to amplify influencer engagement

The “Creator AI” program lets popular creators interact with fans through automated chatbots. The bots will mimic the influencer’s voice using their past content, aiming to boost engagement while cutting down on manual responses. While some creators worry this could undermine authenticity, Meta sees AI as crucial to its future. (Link)

👩‍💻 Microsoft has released and open-sourced the new WizardLM-2 family of LLMs

This next-gen LLM lineup boasts three cutting-edge versions: the 8x22B model outperforms even the best open-source alternatives, while the 70B and 7B variants deliver best-in-class reasoning and efficiency, respectively. (Link)

📋 Limitless AI launched a personal meeting assistant in a pendant

Limitless launched a $99 wearable “Limitless Pendant” to transcribe conversations, generate real-time notes, and seamlessly integrate with your work apps. While starting with a focus on meetings, the startup’s CEO Dan Siroker sees Limitless eventually doing much more – proactively surfacing relevant information and even automating tasks on your behalf. (Link)

A Daily chronicle of AI Innovations April 15th 2024: 🚗 Tesla lays off more than 10% of its workforce 🎥 Adobe explores OpenAI partnership as it adds AI video tools 📱 Apple’s AI features on iOS 18 may run locally on your iPhone 📊 xAI’s first multimodal model with a unique dataset ♾️ Infini-Attention: Google’s breakthrough gives LLMs limitless context ⚠️ Adobe’s Firefly AI trained on competitor’s images: Bloomberg report

xAI’s first multimodal model with a unique dataset

xAI, Elon Musk’s AI startup, has released the preview of Grok-1.5V, its first-generation multimodal AI model. This new model combines strong language understanding capabilities with the ability to process various types of visual information, like documents, diagrams, charts, screenshots, and photographs.

The startup claims Grok-1.5V has shown competitive performance across several benchmarks, including tests for multidisciplinary reasoning, mathematical problem-solving, and visual question answering. One notable achievement is its exceptional performance on the RealWorldQA dataset, which evaluates real-world spatial understanding in AI models.

Developed by xAI, this dataset features over 700 anonymized images from real-world scenarios, each accompanied by a question and verifiable answer. The release of Grok-1.5V and the RealWorldQA dataset aims to advance the development of AI models that can effectively comprehend and interact with the physical world.

Why does this matter?

What makes Grok-1.5V unique is its integration with the RealWorldQA dataset, which focuses on real-world spatial understanding crucial for AI systems in physical environments. The public availability of this dataset could significantly advance the development of AI-driven robotics and autonomous systems. With Musk’s backing, xAI could lead in multimodal AI and contribute to reshaping human-AI interaction.

Source

Infini-Attention: Google’s breakthrough gives LLMs limitless context

Google researchers have developed a new technique called Infini-attention that allows LLMs to process text sequences of unlimited length. By elegantly modifying the Transformer architecture, Infini-attention enables LLMs to maintain strong performance on input sequences exceeding 1 million tokens without requiring additional memory or causing exponential increases in computation time.


The key innovation behind Infini-attention is the addition of a “compressive memory” module that efficiently stores old attention states once the input sequence grows beyond the model’s base context length. This compressed long-range context is then aggregated with local attention to generate coherent and contextually relevant outputs.
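
As a rough illustration, here is a toy numpy sketch of the mechanism the researchers describe: a linear-attention-style compressive memory accumulates past key/value associations in a fixed-size matrix, and its retrievals are blended with ordinary local attention through a learned gate. The dimensions, random inputs, and fixed gate value are simplifying assumptions, not the actual implementation.

```python
import numpy as np

d = 64         # head dimension
seg_len = 128  # local segment length

def elu1(x):
    # ELU(x) + 1 keeps activations positive, as in linear attention
    return np.where(x > 0, x + 1.0, np.exp(x))

rng = np.random.default_rng(0)
memory = np.zeros((d, d))  # compressive memory: fixed size, no matter how long the input grows
norm = np.zeros(d)         # normalization term
beta = 0.0                 # gating scalar (learned in practice; fixed here)

for segment in range(4):   # a stream of segments stands in for an unbounded input
    Q = rng.standard_normal((seg_len, d))
    K = rng.standard_normal((seg_len, d))
    V = rng.standard_normal((seg_len, d))

    # Retrieve long-range context from the compressive memory.
    sQ = elu1(Q)
    mem_out = (sQ @ memory) / (sQ @ norm + 1e-6)[:, None]

    # Ordinary softmax attention within the local segment.
    scores = Q @ K.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    local_out = (weights / weights.sum(axis=-1, keepdims=True)) @ V

    # Blend long-range retrieval and local attention with a sigmoid gate.
    g = 1.0 / (1.0 + np.exp(-beta))
    out = g * mem_out + (1.0 - g) * local_out

    # Compress this segment's keys/values into memory for future segments.
    memory += elu1(K).T @ V
    norm += elu1(K).sum(axis=0)

print(out.shape)  # (seg_len, d): each segment sees both local and compressed long-range context
```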

In benchmark tests on long-context language modeling, summarization, and information retrieval tasks, Infini-attention models significantly outperformed other state-of-the-art long-context approaches while using up to 114 times less memory.

Why does this matter?

Infini-attention can help AI systems expertly organize, summarize, and surface relevant information from vast knowledge bases. Additionally, infinite contextual understanding can help AI systems generate more nuanced and contextually relevant long-form content like articles, reports, and creative writing pieces. Overall, we can expect AI tools to generate more valuable and less generic content with this technique.

Source

Adobe’s Firefly AI trained on competitor’s images: Bloomberg report

In a surprising revelation, Adobe’s AI image generator Firefly was found to have been trained not just on Adobe’s own stock photos but also on AI-generated images from rival platforms like Midjourney and DALL-E. The Bloomberg report, which cites insider sources, notes that while these AI images made up only 5% of Firefly’s training data, their inclusion has sparked an internal ethics debate within Adobe.

The news is particularly noteworthy given Adobe’s public emphasis on Firefly’s “ethical” sourcing of training data, a stance that aimed to differentiate it from competitors. The company had even set up a bonus scheme to compensate artists whose work was used to train Firefly. However, the decision to include AI-generated images, even if labeled as such by the submitting artists, has raised questions about the consistency of Adobe’s ethical AI practices.

Why does it matter?

As AI systems learn from one another in a continuous feedback loop, the distinction between original creation, inspiration, and imitation becomes blurred. This raises complex issues around intellectual property rights, consent, and the difference between remixing and replicating. Moreover, the increasing prevalence of AI-generated content in training data sets could lead to a homogenization of AI outputs, potentially stifling creativity and diversity.

Source

🚗 Tesla lays off more than 10% of its workforce 

  • Tesla plans to lay off “more than 10 percent” of its global workforce following its first year-over-year decline in vehicle deliveries since 2020, impacting at least 14,000 employees.
  • CEO Elon Musk expressed regret over the layoffs in an internal email, stating they are necessary for the company to remain “lean, innovative and hungry” for future growth.
  • Senior vice president Drew Baglino and policy chair Rohan Patel are among the top executives reported to be leaving the company amid these changes.
  • Source

🎥 Adobe explores OpenAI partnership as it adds AI video tools 

  • Adobe is enhancing Premiere Pro with new AI video tools, enabling capabilities such as video generation, object addition/removal, and clip extension, and is exploring a potential partnership with OpenAI.
  • The integration of OpenAI’s Sora with Adobe’s video tools is considered an “early exploration,” aiming to augment Adobe’s offerings and provide users with advanced generative capabilities.
  • Adobe aims to offer more choice to Premiere Pro users by potentially integrating third-party AI models and adding Content Credentials to identify the AI used, despite current limitations and the unclear extent of user control over these new features.
  • Source

📱 Apple’s AI features on iOS 18 may run locally on your iPhone

  • Apple’s iOS 18, set to debut at WWDC 2024 on June 10, promises to be the most significant software upgrade with enhanced features like a smarter Siri through generative AI.
  • According to Bloomberg’s Mark Gurman, the initial set of AI features in iOS 18 will operate entirely on-device without requiring cloud processing, ensuring privacy and efficiency.
  • Apple is in discussions with AI developers such as Google’s Gemini, OpenAI’s GPT, and Baidu to integrate generative AI tools into iOS 18, potentially including third-party AI chatbots.

What Else Is Happening in AI on April 15th 2024❗

🤖 Meta trials AI chatbot on WhatsApp, Instagram, and Messenger

Meta is testing its AI chatbot, Meta AI, with WhatsApp, Instagram, and Messenger users in India and parts of Africa. The move allows Meta to leverage its massive user base across these apps to scale its AI offerings. Meta AI can answer user queries, generate images from text prompts, and assist with Instagram search queries. (Link)

🎨 Ideogram introduces new features to its AI image generation model

Ideogram’s AI image generation model now offers enhanced capabilities like description-based referencing, negative prompting, and options for generating images at varying speeds and quality levels. The upgrade aims to improve image coherence, photorealism, and text rendering quality, with human raters showing a 30-50% preference for the new version over the previous one. (Link)

🖼️ New Freepik AI tool redefines image generation with realism and versatility

Freepik has launched the latest version of its AI Image Generator that offers real-time generation, infinite variations, and photorealistic results. The tool allows users to create infinite variations of an image with intuitive prompts, combining colors, settings, characters, and scenarios. It delivers highly realistic results and offers a streamlined workflow with real-time generation and infinite scrolling. (Link)

💼 OpenAI promoted ChatGPT Enterprise to corporates with road-show-like events

OpenAI CEO Sam Altman recently hosted events in San Francisco, New York, and London, pitching ChatGPT Enterprise and other AI services to hundreds of Fortune 500 executives. This move is part of OpenAI’s strategy to diversify revenue streams and compete with partner Microsoft in selling AI products to enterprises. The events showcased applications such as call center management, translation, and custom AI solutions. (Link)

📔 Google’s Notes tool now offers custom AI-generated backgrounds

Google has introduced an AI-powered background generation feature for its experimental Notes tool, allowing users to personalize their notes with custom images created from text prompts. The feature, currently available for select users in the US and India, utilizes Google’s Gemini AI model for image generation. (Link)

A Daily chronicle of AI Innovations April 12th 2024: 💥 OpenAI fires two researchers for alleged leaking; 🍎 Apple is planning to bring new AI-focused M4 chips to entire line of Macs; 🤷‍♀️ Amazon CEO: don’t wait for us to launch a ChatGPT competitor; 💬 ChatGPT GPT-4 just got a huge upgrade; 🧠 Gabe Newell, the man behind Steam, is working on a brain-computer interface; 🔍 Cohere’s Rerank 3 powers smarter enterprise search; 💻 Apple M4 Macs: Coming soon with AI power!; 📝 Meta’s OpenEQA puts AI’s real-world comprehension to test

Cohere’s Rerank 3 powers smarter enterprise search

Cohere has released a new model, Rerank 3, designed to improve enterprise search and Retrieval Augmented Generation (RAG) systems. It can be integrated with any database or search index and works with existing legacy applications.


Rerank 3 offers several improvements over previous models:

  • It handles a longer context of documents (up to 4x longer) to improve search accuracy, especially for complex documents.
  • Rerank 3 supports over 100 languages, addressing the challenge of multilingual data retrieval.
  • The model can search various data formats like emails, invoices, JSON documents, codes, and tables.
  • Rerank 3 works even faster than previous models, especially with longer documents.
  • When used with Cohere’s RAG systems, Rerank 3 reduces the cost by requiring fewer documents to be processed by the expensive LLMs.

Plus, enterprises can access it through Cohere’s hosted API, AWS Sagemaker, and Elasticsearch’s inference API.
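
As a rough illustration of the hosted-API route, here is a minimal sketch using Cohere’s Python SDK. The model identifier, placeholder API key, documents, and query are assumptions for illustration; consult Cohere’s documentation for the exact Rerank 3 model name.

```python
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder key

docs = [
    "Invoice #4521: total due $1,200, net-30 terms.",
    "Meeting notes: Q3 roadmap review with the platform team.",
    "Pricing addendum: enterprise tier is $18 per seat per month.",
]

# The reranker scores each document against the query; top_n keeps the best hits.
response = co.rerank(
    model="rerank-english-v3.0",  # assumed Rerank 3 identifier
    query="What is the enterprise price per seat?",
    documents=docs,
    top_n=2,
)
for result in response.results:
    print(result.index, result.relevance_score)
```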

Why does this matter?

Rerank 3 represents a step towards a future where data is not just stored but actively used by businesses to make smarter choices and automate tasks. Imagine instantly finding a specific line of code from an email or uncovering pricing details buried in years of correspondence.

Source

Apple M4 Macs: Coming soon with AI power!

Apple is overhauling its Mac lineup with a new M4 chip focused on AI processing. This comes after the recent launch of M3 Macs, possibly due to slowing Mac sales and similar features in competitor PCs.

The M4 chip will come in three tiers (Donan, Brava, Hidra) and will be rolled out across various Mac models throughout 2024 and early 2025. Lower-tier models like MacBook Air and Mac Mini will get the base Donan chip, while high-performance Mac Pro will be equipped with the top-tier Hidra. We can expect to learn more about the specific AI features of the M4 chip at Apple’s WWDC on June 10th.

Why does this matter?

Apple’s new AI-powered M4 Mac chip could make Macs much faster for things like video editing and scientific work, competing better with computers with similar AI features.

By controlling hardware and software, Apple can fine-tune everything to ensure a smooth user experience and future improvements.

Source

Meta’s OpenEQA puts AI’s real-world comprehension to test

Meta AI has released a new dataset called OpenEQA to measure how well AI understands the real world. This “embodied question answering” (EQA) involves an AI system being able to answer questions about its environment in natural language.

The dataset includes over 1,600 questions about various real-world places and tests an AI’s ability to recognize objects, reason about space and function, and use common sense knowledge.

Why does this matter?

While OpenEQA challenges AI with questions demanding visual and spatial reasoning, it also exposes limitations in current AI models that often rely solely on text knowledge. Its role could push researchers to develop AI with a stronger grasp of the physical world.

Source

💥 OpenAI fires two researchers for alleged leaking

  • OpenAI has dismissed two researchers, Leopold Aschenbrenner and Pavel Izmailov, for allegedly leaking information following an undisclosed internal investigation.
  • The leaked information may be related to a research project called Q*, which involved a breakthrough in AI models solving unseen math problems, raising concerns about the lack of safeguards for commercializing such advanced technology.
  • The firings highlight a potential contradiction in OpenAI’s mission, as the company faces criticism for moving away from its original ethos of openness and transparency.
  • Source

🍎 Apple is planning to bring new AI-focused M4 chips to entire line of Macs

  • Apple is poised to launch its next-generation M4 chips as early as this year, aimed at enhancing AI capabilities and rejuvenating Mac sales following a 27% drop last fiscal year.
  • The M4 chips, reported to be nearing production, are expected to come in three variants named Donan, Brava, and Hidra, supporting a range of Mac products, including updates to the iMac, MacBook Pros, and Mac Mini initially, with the MacBook Air and Mac Studio to follow.
  • This accelerated update cycle to introduce M4 chips may lead to a short lifespan for the recently launched M3 chips, indicating Apple’s urgency to compete in the AI technology space against rivals with similar AI-focused hardware advancements.
  • Source

🤷‍♀️ Amazon CEO: don’t wait for us to launch a ChatGPT competitor

  • Amazon CEO Andy Jassy emphasizes the company’s focus on building foundational “primitives” for generative AI rather than quickly launching public-facing products like a ChatGPT competitor.
  • Amazon has launched AI products such as Amazon Bedrock and Amazon Q aimed at software engineers and business customers, aligning with its strategy to empower third-party developers to create GenAI applications.
  • Despite not directly competing with ChatGPT, Amazon is investing in the AI domain, including a $4 billion investment in AI company Anthropic, while also enhancing its existing products like Alexa with AI capabilities.
  • Source

💬 ChatGPT GPT-4 just got a huge upgrade 

  • ChatGPT’s GPT-4 Turbo model has received an upgrade, enhancing its abilities in writing, math, logical reasoning, and coding, as announced by OpenAI for its premium users.
  • The upgrade, distinguished by significant performance improvements in mathematics and GPQA, also aims for more succinct, direct, and conversational responses.
  • This new version of ChatGPT, which includes data up until December 2023, shows improved performance on recent topics, such as acknowledging the launch of the iPhone 15.
  • Source

🧠 Gabe Newell, the man behind Steam, is working on a brain-computer interface

  • Gabe Newell, co-founder of Valve and the force behind Steam, has been developing a brain-computer interface (BCI) technology through a venture named Starfish Neuroscience, rivaling Elon Musk’s Neuralink.
  • Since 2019, Newell has explored gaming applications for BCIs and discussed potential future capabilities like editing feelings, highlighting the technology’s potential beyond traditional interfaces.
  • Aside from his BCI pursuits, Newell has faced recent challenges including an antitrust lawsuit against Steam and the sale of his megayacht, amidst managing COVID-19 precautions and legal appearances.
  • Source

What Else Is Happening in AI on April 12th 2024❗

🔄 ChatGPT gets an upgrade for premium users

OpenAI has released an enhanced version of GPT-4 Turbo for ChatGPT Plus, Team, and Enterprise customers. The new model, trained on data until December 2023, promises more direct responses, less verbosity, and improved conversational language, along with advancements in writing, math, reasoning, and coding. (Link)

🤝 Dr. Andrew Ng joins Amazon’s Board of Directors

Amazon has appointed Dr. Andrew Ng, a renowned AI expert and founder of several influential AI companies, to its Board of Directors. With his deep expertise in machine learning and AI education, Ng is expected to provide valuable insights as Amazon navigates the transformative potential of generative AI. (Link)

⌚️ Humane’s $699 Ai Pin hits the US market

Humane’s Ai Pin is now available across the US, with global expansion on the horizon through SKT and SoftBank partnerships. The wearable AI device is powered by a $24/month plan, including unlimited AI queries, data, and storage. The international availability is to be announced soon. (Link)

📱 TikTok might use AI influencers for ads

TikTok is developing a new feature that lets companies use AI characters to advertise products. These AI influencers can read scripts made by advertisers or sellers. TikTok has been testing this feature but isn’t sure when it will be available for everyone to use. (Link)

🤖 Sanctuary AI’s humanoid robot to be tested at Magna

Magna, a major European car manufacturer, will pilot Sanctuary AI’s humanoid robot, Phoenix, at one of its facilities. This follows similar moves by other automakers exploring the use of humanoid robots in manufacturing, as companies seek to determine the potential return on investment. (Link)

A Daily chronicle of AI Innovations April 11th 2024: 🚀 Meta unveils next-generation AI chip for enhanced workloads 🎶 New AI tool lets you generate 1200 songs per month for free 💰 Adobe is buying videos for $3 per minute to build an AI model 🤖 Google expands Gemma family with new models 🌐 Mistral unveils Mixtral-8x22B open language model 📷 Google Photos introduces free AI-powered editing tools 🖼️ Microsoft enhances Bing visual search with personalization 🛡️ Sama red team: Safety-centered solution for Generative AI 💥 Apple hit with ‘mercenary spyware attacks’  🧠 Humane AI has only one problem: it just doesn’t work 🔍 MistralAI unveils groundbreaking open model Mixtral 8x22B 🙃 Microsoft proposed using DALL-E to US military last year 🎵 New AI music generator Udio synthesizes realistic music on demand 🎬 Adobe is purchasing video content to train its AI model

🚀 Meta unveils next-generation AI chip for enhanced workloads

Meta has introduced the next generation of its Meta Training and Inference Accelerator (MTIA), significantly improving on MTIAv1 (its first-gen AI inference accelerator). This version more than doubles the memory and compute bandwidth, designed to effectively serve Meta’s crucial AI workloads, such as its ranking and recommendation models and Gen AI workloads.

Meta has also co-designed the hardware system, the software stack, and the silicon, which is essential for the success of the overall inference solution.


Early results show that this next-generation silicon has improved performance by 3x over the first-generation chip across four key models evaluated. MTIA has been deployed in the data center and is now serving models in production.

Why does this matter?

This is a bold step towards self-reliance in AI! Because Meta controls the whole stack, it can achieve an optimal mix of performance and efficiency on its workloads compared to commercially available GPUs. It also loosens NVIDIA’s grip on Meta, and NVIDIA might be having a tough week overall, with other releases including Intel’s Gaudi 3 and Google’s Axion processors.

Source

New AI tool lets you generate 1200 songs per month for free

Udio, a new AI music generator created by former Google DeepMind researchers, is now available in beta. It allows users to generate up to 1200 songs per month for free, with the ability to specify genres and styles through text prompts.

The startup claims its AI can produce everything from pop and rap to gospel and blues, including vocals. While the free beta offers limited features, Udio promises improvements like longer samples, more languages, and greater control options in the future. The company is backed by celebrities like Will.i.am and investors like Andreessen Horowitz.

Why does this matter?

AI-generated music platforms like Udio democratize music creation by making it accessible to everyone, fostering new artists and diverse creative expression. This innovation could disrupt traditional methods, empowering independent creators lacking access to expensive studios or musicians.

Source

💰 Adobe is buying videos for $3 per minute to build an AI model

Adobe is buying videos at $3 per minute from its network of photographers and artists to build a text-to-video AI model. It has requested short clips of people engaged in everyday actions such as walking or expressing emotions including joy and anger, interacting with objects such as smartphones or fitness equipment, etc.

The move shows Adobe trying to catch up to competitors like OpenAI (Sora). Over the past year, Adobe has added generative AI features to its portfolio, including Photoshop and Illustrator, that have garnered billions of uses, but in video it is still racing to catch up.

Why does this matter?

Adobe’s targeted video buying for AI training exposes the hefty price tag of building competitive AI. Smaller companies face an uphill battle—they might need to get scrappier, focus on specific niches, team up, or use free, open-source AI resources.

Source

💥 Apple hit with ‘mercenary spyware attacks’

  • Apple has issued a warning to iPhone users in 92 countries about a potential “mercenary spyware attack” aimed at compromising their devices, without identifying the attackers or the consequences.
  • The company suggests that the attack is highly targeted, advising recipients to take the warning seriously and to update their devices with the latest security patches and practice strong cyber hygiene.
  • This type of attack is often linked to state actors employing malware from private companies, with the infamous ‘Pegasus’ spyware mentioned as an example, capable of extensive surveillance on infected phones.
  • Source

🧠 Humane AI has only one problem: it just doesn’t work

  • The Humane AI Pin, retailing for $699 plus a $24 monthly fee, is designed as a wearable alternative to smartphones, promising users freedom from their screens through AI-assisted tasks. However, its functionality falls significantly short of expectations.
  • Throughout testing, the AI Pin struggled with basic requests and operations, demonstrating unreliability and slow processing times, leading to the conclusion that it fails to deliver on its core promise of a seamless, smartphone-free experience.
  • Despite its well-intentioned vision for a post-smartphone future and the integration of innovative features like a screenless interface and ambient computing, the device’s current state of performance and high cost make it a poor investment for consumers.
  • Source

🔍 MistralAI unveils groundbreaking open model Mixtral 8x22B

  • Mistral AI has released Mixtral 8x22B, an open-source AI model boasting 176 billion parameters and a 65,000-token context window, expected to surpass its predecessor and compete with major models like GPT-3.5 and Llama 2.
  • The Paris-based startup, valued at over $2 billion, aims to democratize access to cutting-edge AI by making Mixtral 8x22B available on platforms like Hugging Face and Together AI, allowing for widespread use and customization.
  • Despite its potential for innovation in fields like customer service and drug discovery, Mixtral 8x22B faces challenges related to its “frontier model” status, including the risk of misuse due to its open-source nature and lack of control over harmful applications.
  • Source

🙃 Microsoft proposed using DALL-E to US military last year

  • Microsoft proposed to the U.S. Department of Defense in 2023 to use OpenAI’s DALL-E AI for software development in military operations.
  • The proposal included using OpenAI tools like ChatGPT and DALL-E for document analysis, machine maintenance, and potentially training battlefield management systems with synthetic data.
  • Microsoft had not implemented the use of DALL-E in military projects, and OpenAI, which did not participate in Microsoft’s presentation, restricts its technology from being used to develop weapons or harm humans.
  • Source

🎵 New AI music generator Udio synthesizes realistic music on demand

  • Uncharted Labs has officially launched its music generator, Udio, which can transform text prompts into professional-quality music tracks, challenging the leading AI music generator, Suno V3.
  • Udio has impressed users and reviewers alike with its ability to generate songs that feature coherent lyrics, well-structured compositions, and competitive rhythms, some even considering it superior to Suno V3.
  • Despite facing initial server overload due to high user demand, Udio’s user-friendly interface and strong backing from notable investors suggest a promising future for AI-assisted music creation, though it remains free during its beta testing phase.
  • Source

🎬 Adobe is purchasing video content to train its AI model

  • Adobe is developing a text-to-video AI model, offering artists around $3 per minute for video footage to train the new tool, as reported by Bloomberg.
  • The software company has requested over 100 video clips from artists, aiming for content that showcases various emotions and activities, but has set a low budget for acquisitions.
  • Despite the potential for AI to impact artists’ future job opportunities and the lack of credit or royalties for the contributed footage, Adobe is pushing forward with the AI model development.
  • Source

What Else Is Happening in AI on April 11th 2024❗

🤖 Google expands Gemma family with new models

Google has expanded its Gemma family with two new models: CodeGemma and RecurrentGemma. CodeGemma is tailored for developers, offering intelligent code completion and chat capabilities for languages like Python and JavaScript. RecurrentGemma is optimized for efficiency in research, utilizing recurrent neural networks and local attention. (Link)
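For developers who want to try CodeGemma's code completion, a minimal sketch using the Hugging Face transformers library is below; the checkpoint id is an assumption based on Google's published Gemma naming, so verify it on the Hub before running.

```python
# Minimal sketch: code completion with CodeGemma via Hugging Face transformers.
# The checkpoint id "google/codegemma-2b" is an assumption; check the Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("google/codegemma-2b")
model = AutoModelForCausalLM.from_pretrained("google/codegemma-2b")

prompt = "def fibonacci(n):"
inputs = tok(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=48)
print(tok.decode(out[0], skip_special_tokens=True))
```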

🌐 Mistral unveils Mixtral-8x22B open language model

Mistral AI has unveiled Mixtral-8x22B, a new open language model with extensive capabilities. A mixture-of-experts model, it features a 64,000-token context window and requires 258GB of VRAM to run. Early users are exploring its potential, with more details expected soon. (Link)

📷 Google Photos introduces free AI-powered editing tools

Google Photos is rolling out free AI-powered editing tools for all users starting May 15. Features like Magic Eraser, Photo Unblur, and Portrait Light will be accessible without a subscription. Pixel users will also benefit from the Magic Editor, which simplifies complex edits using generative AI. (Link)

🖼️ Microsoft enhances Bing visual search with personalization

Microsoft enhances Bing Visual Search with personalized visual systems based on user preferences. A patent application reveals that search results will be tailored to individual interests, such as showing gardening-related images to gardening enthusiasts and food-related visuals to chefs. (Link)

🛡️ Sama red team: Safety-centered solution for Generative AI

Sama has introduced Sama Red Team, a safety-centered solution for evaluating risks associated with generative AI and LLMs. This system simulates adversarial attacks to identify vulnerabilities related to bias, personal information, and offensive content, contributing to a more ethical AI landscape. (Link)

A Daily chronicle of AI Innovations April 10th 2024: 👀 OpenAI gives GPT-4 a major upgrade; 💬 Quora’s Poe now lets AI chatbot developers charge per message; 🌐 Google updates and expands its open source Gemma AI model family; 🔥 Intel unveils latest AI chip as Nvidia competition heats up; 📱 WordPress parent acquires Beeper app which brought iMessage to Android; 🤔 New bill would force AI companies to reveal use of copyrighted art; 🧠 Intel’s new AI chip: 50% faster, cheaper than NVIDIA’s; 🤖 Meta to Release Llama 3 Open-source LLM next week; ☁️ Google Cloud announces major updates to enhance Vertex AI

Intel’s new AI chip: 50% faster, cheaper than NVIDIA’s

Intel has unveiled its new Gaudi 3 AI accelerator, which aims to compete with NVIDIA’s GPUs. According to Intel, the Gaudi 3 is expected to reduce training time for large language models like Llama2 and GPT-3 by around 50% compared to NVIDIA’s H100 GPU. The Gaudi 3 is also projected to outperform the H100 and H200 GPUs in terms of inference throughput, with around 50% and 30% faster performance, respectively.

Intel’s new AI chip: 50% faster, cheaper than NVIDIA’s

The Gaudi 3 is built on a 5nm process and offers several improvements over its predecessor, including doubling the FP8, quadrupling the BF16 processing power, and increasing network and memory bandwidth. Intel is positioning the Gaudi 3 as an open, cost-effective alternative to NVIDIA’s GPUs, with plans to make it available to major OEMs starting in the second quarter of 2024. The company is also working to create an open platform for enterprise AI with partners like SAP, Red Hat, and VMware.

Why does it matter?

Intel is challenging NVIDIA’s dominance in the AI accelerator market. It will introduce more choice and competition in the market for high-performance AI hardware. It could drive down prices, spur innovation, and give customers more flexibility in building AI systems. The open approach with community-based software and standard networking aligns with broader trends toward open and interoperable AI infrastructure.

Source

Meta to release Llama 3 open-source LLM next week

Meta plans to release two smaller versions of its upcoming Llama 3 open-source language model next week. These smaller models will build anticipation for the larger version, which will be released this summer. Llama 3 will be a significant upgrade over previous versions, with about 140 billion parameters compared to 70 billion for the biggest Llama 2 model. It will also be a more capable, multimodal model that can generate text and images and answer questions about images.

The two smaller versions of Llama 3 will focus on text generation. They’re intended to resolve safety issues before the full multimodal release. Previous Llama models were criticized as too limited, so Meta has been working to make Llama 3 more open to controversial topics while maintaining safeguards.

Why does it matter?

The open-source AI model landscape has become much more competitive in recent months, with other companies like Mistral and Google DeepMind also releasing their own open-source models. Meta hopes that by making Llama 3 more open and responsive to controversial topics, it can catch up to models like OpenAI’s GPT-4 and become a standard for many AI applications.

Source

Google Cloud announces major updates to enhance Vertex AI

Google Cloud has announced exciting model updates and platform capabilities that continue to enhance Vertex AI:

  • Gemini 1.5 Pro: Gemini 1.5 Pro is now available in public preview in Vertex AI, bringing the world’s first one-million-token context window to customers. It also supports processing audio streams, including speech and even the audio portion of videos (see the sketch after this list).
  • Imagen 2.0: Imagen 2.0 can now create short, 4-second live images from text prompts, enabling marketing and creative teams to generate animated content. It also has new image editing features like inpainting, outpainting, and digital watermarking.
  • Gemma: Google Cloud is adding CodeGemma to Vertex AI. CodeGemma is a new lightweight model from Google’s Gemma family based on the same research and technology used to create Gemini.
  • MLOps: To help customers manage and deploy these large language models at scale, Google has expanded the MLOps capabilities for Gen AI in Vertex AI. This includes new prompt management tools for experimenting, versioning, optimizing prompts, and enhancing evaluation services to compare model performance.
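As a rough illustration of the Gemini 1.5 Pro preview and its new audio support, here is a hedged sketch using the Vertex AI Python SDK; the project id, bucket path, and preview model id are placeholder assumptions, not values from Google's announcement.

```python
# Sketch: Gemini 1.5 Pro in Vertex AI with an audio input.
# Project, bucket, and model id below are illustrative placeholders.
import vertexai
from vertexai.generative_models import GenerativeModel, Part

vertexai.init(project="my-project", location="us-central1")
model = GenerativeModel("gemini-1.5-pro-preview-0409")

# The one-million-token window means a full earnings call can fit in one prompt.
audio = Part.from_uri("gs://my-bucket/earnings_call.mp3", mime_type="audio/mpeg")
response = model.generate_content([audio, "List the three main risks discussed."])
print(response.text)
```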

Why does it matter?

These updates significantly enhance Google Cloud’s generative AI offerings. They also strengthen Google’s position in the generative AI space and its ability to support enterprise adoption of these technologies.

Source

👀 OpenAI gives GPT-4 a major upgrade

  • OpenAI has introduced GPT-4 Turbo with Vision, a new model available to developers that combines text and image processing capabilities, enhancing AI chatbots and other applications.
  • This multimodal model, which maintains a 128,000-token window and knowledge from December 2023, simplifies development by allowing a single model to understand both text and images.
  • GPT-4 Turbo with Vision simplifies development processes for apps requiring multimodal inputs like coding assistance, nutritional insights, and website creation from drawings.
  • Source

💬 Quora’s Poe now lets AI chatbot developers charge per message

  • Poe, a Quora-owned AI chatbot platform, introduced a new revenue model allowing creators to earn money by setting a price-per-message for their bots.
  • The revenue model aims to compensate creators for operational costs, fostering a diverse ecosystem of bots ranging from tutoring to storytelling.
  • This monetization strategy is initially available to U.S. creators, complemented by an analytics dashboard to track earnings and bot usage.
  • Source

🌐 Google updates and expands its open source Gemma AI model family

  • Google has enhanced the Gemma AI model family with new code completion models and improvements for more efficient inference, along with more flexible terms of use.
  • Three new versions of CodeGemma have been introduced, including a 7 billion parameter model for code generation and discussion, and a 2 billion parameter model optimized for fast code completion on local devices.
  • Google also unveiled RecurrentGemma, a model leveraging recurrent neural networks for better memory efficiency and speed in text generation, indicating a shift towards optimizing AI performance on devices with limited resources.
  • Source

🔥 Intel unveils latest AI chip as Nvidia competition heats up

  • Intel introduced its latest artificial intelligence chip, Gaudi 3, highlighting its efficiency and speed advantages over Nvidia’s H100 GPU and offering configurations that enhance AI model training and deployment.
  • The Gaudi 3 chip, which outperforms Nvidia in power efficiency and AI model processing speed, will be available in the third quarter, with Dell, Hewlett Packard Enterprise, and Supermicro among the companies integrating it into their systems.
  • Despite Nvidia’s dominant position in the AI chip market, Intel is seeking to compete by emphasizing Gaudi 3’s competitive pricing, open network architecture, and partnerships for open software development with companies like Google, Qualcomm, and Arm.
  • Source

📱 WordPress parent acquires Beeper app which brought iMessage to Android

  • Automattic, the owner of WordPress and Tumblr, has acquired Beeper, a startup known for its Beeper Mini app that attempted to challenge Apple’s iMessage, for $125 million despite the app’s quick defeat.
  • Beeper CEO Eric Migicovsky will oversee the merging of Beeper with Automattic’s similar app Texts, aiming to create the best chat app, with the combined service expected to launch later this year.
  • The acquisition raises questions due to Beeper Mini’s brief success and upcoming changes like Apple introducing RCS support to iPhones, but Automattic sees potential in Beeper’s stance on open messaging standards and its established brand.
  • Source

🤔 New bill would force AI companies to reveal use of copyrighted art

  • A new bill introduced in the US Congress by Congressman Adam Schiff aims to make artificial intelligence companies disclose the copyrighted material used in their generative AI models.
  • The proposed Generative AI Copyright Disclosure Act would require AI companies to register copyrighted works in their training datasets with the Register of Copyrights before launching new AI systems.
  • The bill responds to concerns about AI firms potentially using copyrighted content without permission, amidst growing litigation and calls for more regulation from the entertainment industry and artists.
  • Source

What Else Is Happening in AI on April 10th 2024❗

🚀 OpenAI launches GPT-4 Turbo with Vision model through API

OpenAI has unveiled the latest addition to its AI arsenal, the GPT-4 Turbo with Vision model, which is now “generally available” through its API. This new version has enhanced capabilities, including support for JSON mode and function calling for Vision requests. The upgraded GPT-4 Turbo model promises improved performance and is set to roll out in ChatGPT. (Link)
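A short sketch of such a Vision request is below, using the standard OpenAI Python SDK; the image URL is a placeholder.

```python
# Sketch: one request mixing text and an image with GPT-4 Turbo with Vision.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe what is in this photo."},
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
        ],
    }],
)
print(response.choices[0].message.content)
```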

👂 Google’s Gemini 1.5 Pro can now listen to audio

Google’s update to Gemini 1.5 Pro gives the model ears. It can process text, code, video, and uploaded audio streams, including audio from video, which it can listen to, analyze, and extract information from without a corresponding written transcript. (Link)

💰 Microsoft to invest $2.9 billion in Japan’s AI and cloud infrastructure

Microsoft announced it would invest $2.9 billion over the next two years to increase its hyperscale cloud computing and AI infrastructure in Japan. It will also expand its digital skilling programs with the goal of providing AI skills to more than 3 million people over the next three years. (Link)

👩‍💻 Google launches Gemini Code Assist, the latest challenger to GitHub’s Copilot

At its Cloud Next conference, Google unveiled Gemini Code Assist, its enterprise-focused AI code completion and assistance tool. It provides various functions such as enhanced code completion, customization, support for various repositories, and integration with Stack Overflow and Datadog. (Link)

🛍️ eBay launches AI-driven ‘Shop the Look’ feature on its iOS app

eBay launched an AI-powered feature to appeal to fashion enthusiasts – “Shop the Look” on its iOS mobile application. It will suggest a carousel of images and ideas based on the customer’s shopping history. The recommendations will be personalized to the end user. The idea is to introduce how other fashion items may complement their current wardrobe. (Link)

A Daily chronicle of AI Innovations April 09th 2024: 🤖 Stability AI launches multilingual Stable LM 2 12B 📱 Ferret-UI beats GPT-4V in mobile UI tasks ⏰ Musk says AI will outsmart humans within a year 🍁 Canada bets big on AI with $2.4B investment 🎥 OpenAI is using YouTube for GPT-4 training 🤖 Meta to launch new Llama 3 models 👂 Google’s Gemini 1.5 Pro can now hear 💥 Google’s first Arm-based CPU will challenge Microsoft and Amazon in the AI race 📈 Boosted by AI, global PC market bounces back

🤖 Meta to launch new Llama 3 models

  • According to an insider, Meta will release two smaller versions of its planned major language model, Llama 3, next week to build anticipation for the major release scheduled for this summer.
  • The upcoming Llama 3 model, which will include both text generation and multimodal capabilities, aims to compete with OpenAI’s GPT-4 and is reported to potentially have up to 140 billion parameters.
  • Meta’s investment in the Llama 3 model and open-source AI reflects a broader trend of tech companies leveraging these technologies to set industry standards, similar to Google’s strategy with Android.
  • Source

👂 Google’s Gemini 1.5 Pro can now hear

  • Google has enhanced Gemini 1.5 Pro to interpret audio inputs, allowing it to process information from sources like earnings calls or video audio directly without needing a transcript.
  • Gemini 1.5 Pro, positioned as a mid-tier option within the Gemini series, now outperforms even the more advanced Gemini Ultra by offering faster and more intuitive responses without requiring model fine-tuning.
  • Alongside Gemini 1.5 Pro updates, Google introduced enhancements to its Imagen 2 model, including inpainting and outpainting features, and debuted a digital watermarking technology, SynthID, for tracking the origin of generated images.
  • Source

💥 Google’s first Arm-based CPU will challenge Microsoft and Amazon in the AI race

  • Google is developing its own Arm-based CPU named Axion to enhance AI operations in data centers and will launch it for Google Cloud business customers later this year.
  • The Axion CPU will improve performance by 30% over general-purpose Arm chips and by 50% over Intel’s processors, and it will support services like Google Compute Engine and Google Kubernetes Engine.
  • Google’s move to create its own Arm-based CPU and update its TPU AI chips aims to compete with Microsoft and Amazon in the AI space and reduce reliance on external suppliers like Intel and Nvidia.
  • Source

📈 Boosted by AI, global PC market bounces back

  • The global PC market has seen growth for the first time in over two years, with a 1.5% increase in shipments to 59.8 million units in the first quarter, reaching pre-pandemic levels.
  • The resurgence is partly attributed to the emergence of “AI PCs,” which feature onboard AI processing capabilities, with projections suggesting these will represent almost 60% of all PC sales by 2027.
  • Major PC manufacturers like Lenovo, HP, Dell, and Apple are heavily investing in the AI PC segment, with Lenovo leading the market and Apple experiencing the fastest growth in shipments.
  • Source

🤖 Stability AI launches multilingual Stable LM 2 12B

Stability AI has released a 12-billion-parameter version of its Stable LM 2 language model, offering both a base and an instruction-tuned variant. These models are trained on a massive 2-trillion-token dataset spanning seven languages, including English, Spanish, and German. Stability AI has also improved its 1.6-billion-parameter Stable LM 2 model with better conversational abilities and tool integration.

The new 12B model is designed to balance high performance with relatively lower hardware requirements than other large language models. Stability AI claims it can handle complex tasks that typically require substantially more computational resources. The company also plans to release a long-context variant of these models on the Hugging Face platform soon.

Why does this matter?

Stable LM 2 delivers powerful 12B-model performance without demanding the most advanced hardware, making it a practical choice for enterprises and developers. Stability AI’s multi-pronged approach to language solutions may give it an edge in the competitive generative AI market.

Source

📱 Ferret-UI beats GPT-4V in mobile UI tasks

Researchers have launched Ferret-UI, a multimodal language model designed to excel at understanding and interacting with mobile user interfaces (UIs). Unlike general-purpose models, Ferret-UI is trained explicitly for various UI-centric tasks, from identifying interface elements to reasoning about an app’s overall functionality.

Ferret-UI beats GPT-4V in mobile UI tasks

By using “any resolution” technology and a meticulously curated dataset, Ferret-UI digs deep into the intricacies of mobile UI screens, outperforming its competitors in elementary and advanced tasks. Its ability to execute open-ended instructions may make it the go-to solution for developers looking to create more intuitive mobile experiences.

Why does this matter?

Ferret-UI’s advanced capabilities in understanding and navigating mobile UI screens will increase accessibility, productivity, and user satisfaction. By setting a new standard for mobile UI interaction, this innovative MLLM paves the way for more intuitive and responsive mobile experiences for users to achieve more with less effort.

Source

⏰ Musk says AI will outsmart humans within a year

Tesla CEO Elon Musk has boldly predicted that AI will surpass human intelligence as early as next year or by 2026. In a wide-ranging interview, Musk discussed AI development’s challenges, including chip shortages and electricity supply constraints, while sharing updates on his xAI startup’s AI chatbot, Grok. Despite the hurdles, Musk remains optimistic about the future of AI and its potential impact on society.

Why does this matter?

Musk’s prediction highlights the rapid pace of AI development and its potential to reshape our world in the near future. As AI becomes increasingly sophisticated, it could transform the job market and raise important ethical questions about the role of technology in society.

Source

What Else Is Happening in AI on April 09th 2024❗

🇬🇧 Microsoft is opening a new AI research hub in London

Microsoft is tapping into the UK’s exceptional talent pool to drive language models and AI infrastructure breakthroughs. The move highlights Microsoft’s commitment to invest £2.5 billion in upskilling the British workforce and building the AI-driven future. (Link)

🎥 OpenAI is using YouTube for GPT-4 training

OpenAI reportedly transcribed over a million hours of YouTube videos to train its advanced GPT-4 language model. Despite legal concerns, OpenAI believes this is fair use. Google and Meta have also explored various solutions to obtain more training data, including using copyrighted material and consumer data. (Link)

🧠 Arm’s new chips bring AI to the IoT edge

Arm has introduced the Ethos-U85 NPU and Corstone-320 IoT platform, designed to enhance edge AI applications with improved performance and efficiency. These technologies aim to accelerate the development and deployment of intelligent IoT devices by providing an integrated hardware and software solution for Arm’s partners. (Link)

🍁 Canada bets big on AI with $2.4B investment

Prime Minister Justin Trudeau has announced a $2.4 billion investment in Canada’s AI sector, with the majority aimed at providing researchers access to computing capabilities and infrastructure. The government also plans to establish an AI Safety Institute and an Office of the AI and Data Commissioner to ensure responsible development and regulation of the technology. (Link)

A Daily chronicle of AI Innovations April 08th 2024: 🇬🇧 Microsoft opens AI Hub in London to ‘advance state-of-the-art language models’ 💡 JPMorgan CEO compares AI’s potential impact to electricity and the steam engine 🎵 Spotify moves into AI with new feature ⚖️ Build resource-efficient LLMs with Google’s MoD 📡 Newton brings sensor-driven intelligence to AI models 💰 Internet archives become AI training goldmines for Big Tech

Build resource-efficient LLMs with Google’s MoD

Google DeepMind has introduced “Mixture-of-Depths” (MoD), an innovative method that significantly improves the efficiency of transformer-based language models. Unlike traditional transformers that allocate the same amount of computation to each input token, MoD employs a “router” mechanism within each block to assign importance weights to tokens. This allows the model to strategically allocate computational resources, focusing on high-priority tokens while minimally processing or skipping less important ones.

Build resource-efficient LLMs with Google’s MoD

Notably, MoD can be integrated with Mixture-of-Experts (MoE), creating a powerful combination called Mixture-of-Depths-and-Experts (MoDE). Experiments have shown that MoD transformers can maintain competitive performance while reducing computational costs by up to 50% and achieving significant speedups during inference.
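To make the routing idea concrete, here is a toy PyTorch sketch of a Mixture-of-Depths block: a learned router scores tokens, only the top-k are processed by the expensive layer, and the rest ride the residual stream unchanged. The capacity fraction, router design, and gating are simplified assumptions, not DeepMind's implementation.

```python
import torch
import torch.nn as nn

class MixtureOfDepthsBlock(nn.Module):
    """Toy Mixture-of-Depths block: process only the top-k tokens."""

    def __init__(self, d_model: int = 64, capacity: float = 0.5):
        super().__init__()
        self.router = nn.Linear(d_model, 1)  # per-token importance score
        self.layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.capacity = capacity             # fraction of tokens the block processes

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        batch, seq_len, _ = x.shape
        k = max(1, int(seq_len * self.capacity))
        scores = self.router(x).squeeze(-1)            # (batch, seq_len)
        top = scores.topk(k, dim=-1).indices           # high-priority token positions
        out = x.clone()                                # low-priority tokens skip the layer
        for b in range(batch):
            chosen = x[b, top[b]].unsqueeze(0)         # gather selected tokens
            updated = self.layer(chosen).squeeze(0)    # run the expensive computation
            gate = torch.sigmoid(scores[b, top[b]]).unsqueeze(-1)
            # gate by router score so the routing decision receives gradient signal
            out[b, top[b]] = x[b, top[b]] + gate * (updated - x[b, top[b]])
        return out

# With capacity=0.5, roughly half the tokens skip the block entirely.
block = MixtureOfDepthsBlock()
print(block(torch.randn(2, 16, 64)).shape)  # torch.Size([2, 16, 64])
```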

Why does this matter?

MoD can greatly reduce training times and enhance model performance by dynamically optimizing computational resources. Moreover, it adapts the model’s depth based on the complexity of the task at hand. For simpler tasks, it employs shallower layers, conserving resources. Conversely, for intricate tasks, it deepens the network, enhancing representation capacity. This adaptability ensures that creators can fine-tune LLMs for specific use cases without unnecessary complexity.

Source

Newton brings sensor-driven intelligence to AI models

Startup Archetype AI has launched with the ambitious goal of making the physical world understandable to artificial intelligence. By processing data from a wide variety of sensors, Archetype’s foundational AI model called Newton aims to act as a translation layer between humans and the complex data generated by the physical world.

Using plain language, Newton will allow people to ask questions and get insights about what’s happening in a building, factory, vehicle, or even the human body based on real-time sensor data. The company has already begun pilot projects with Amazon, Volkswagen, and healthcare researchers to optimize logistics, enable smart vehicle features, and track post-surgical recovery. Archetype’s leadership team brings deep expertise from Google’s Advanced Technology and Products (ATAP) division.

Why does this matter?

General-purpose AI systems like Newton that can interpret diverse sensor data will be the pathway to building more capable, context-aware machines. In the future, users may increasingly interact with AI not just through screens and speakers but through intelligently responsive environments that anticipate and adapt to their needs. However, as AI becomes more deeply embedded in the physical world, the stakes of system failures or unintended consequences become higher.

Source

Internet archives become AI training goldmines for Big Tech

To gain an edge in the heated AI arms race, tech giants Google, Meta, Microsoft, and OpenAI are spending billions to acquire massive datasets for training their AI models. They are turning to veteran internet companies like Photobucket, Shutterstock, and Freepik, who have amassed vast archives of images, videos, and text over decades online.

The prices for this data vary depending on the type and buyer but range from 5 cents to $7 per image, over $1 per video, and around $0.001 per word for text. The demand is so high that some companies are requesting billions of videos, and Photobucket says it can’t keep up.

Why does this matter?

This billion-dollar rush for AI training data could further solidify Big Tech’s dominance in artificial intelligence. As these giants hoard the data that’s crucial for building advanced AI models, it may become increasingly difficult for startups or academic labs to compete on a level playing field. We need measures to protect the future diversity and accessibility of AI technologies.

Source

🎵 Spotify moves into AI with new feature

  • Spotify is launching a beta tool enabling Premium subscribers to create playlists using text descriptions on mobile.
  • Users can input various prompts reflecting genres, moods, activities, or even movie characters to receive a 30-song playlist tailored to their request, with options for further refinement through additional prompts.
  • The AI Playlist feature introduces a novel approach to playlist curation, offering an efficient and enjoyable way to discover music that matches specific aesthetics or themes, despite limitations on non-music related prompts and content restrictions.
  • Source

🇬🇧 Microsoft opens AI Hub in London to ‘advance state-of-the-art language models’

  • Mustafa Suleyman, co-founder of DeepMind and new CEO of Microsoft AI, announced the opening of a new AI hub in London, focusing on advanced language models, under the leadership of Jordan Hoffmann.
  • The hub aims to recruit fresh AI talent for developing new language models and infrastructure, bolstered by Microsoft’s £2.5 billion investment in the U.K. over the next three years to support AI economy training and data centre expansion.
  • Suleyman, Hoffmann, and about 60 AI experts recently joined Microsoft through its indirect acquisition of UK-based AI startup Inflection AI.
  • Source

💡 JPMorgan CEO compares AI’s potential impact to electricity and the steam engine

  • JPMorgan CEO Jamie Dimon stated AI could significantly impact every job, comparing its potential to revolutionary technologies like the steam engine and electricity.
  • Dimon highlighted AI’s importance in his shareholder letter, revealing the bank’s investment in over 400 AI use cases and the acquisition of thousands of AI experts and data scientists.
  • He expressed belief in AI’s transformative power, equating its future impact to historical milestones such as the printing press, computing, and the internet.
  • Source

What Else Is Happening in AI on April 08th, 2024❗

🎧 Spotify introduces AI-generated personalized playlists

Spotify has launched AI-powered personalized playlists that users can create using text prompts. The feature is currently available in beta for UK and Australia users on iOS and Android. Spotify uses LLMs to understand the prompt’s intent and its personalization technology to generate a custom playlist, which users can further refine. (Link)

🔍 Meta expands “Made with AI” labeling to more content types

Meta will start applying a “Made with AI” badge to a broader range of AI-generated content, including videos, audio, and images. The company will label content where it detects AI image indicators or when users acknowledge uploading AI-generated content. (Link)

🚀 Gretel’s Text-to-SQL dataset sets new standard for AI training data

Gretel has released the world’s largest open-source Text-to-SQL dataset containing over 100,000 high-quality synthetic samples spanning 100 verticals. The dataset, generated using Gretel Navigator, aims to help businesses unlock the potential of their data by enabling AI models to understand natural language queries and generate SQL queries. (Link)
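If the release follows Gretel's announced Hugging Face publication, loading and inspecting the dataset should look roughly like the sketch below; the dataset id and column names are assumptions to verify on the Hub.

```python
# Sketch: inspecting Gretel's open Text-to-SQL dataset.
# Dataset id and column names are assumptions; verify on Hugging Face.
from datasets import load_dataset

ds = load_dataset("gretelai/synthetic_text_to_sql", split="train")
example = ds[0]
print(example["sql_prompt"])   # natural-language question
print(example["sql"])          # the paired SQL query
```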

💾 Microsoft upgrades Azure AI Search with more storage and support for OpenAI apps

Microsoft has made Azure AI Search more cost-effective for developers by increasing its vector and storage capacity. The service now supports OpenAI applications, including ChatGPT and GPTs, through Microsoft’s retrieval augmented generation system. Developers can now scale their apps to a multi-billion vector index within a single search without compromising speed or performance. (Link)

📱 Google brings Gemini AI chatbot to Android app

Google is bringing its AI chatbot, Gemini, to the Android version of the Google app. Similar to its iOS integration, users can access Gemini by tapping its logo at the top of the app, opening a chatbot prompt field. Here, users can type queries, request image generation, or ask for image analysis. (Link)

A Daily chronicle of AI Innovations April 06th 2024: 👀 Sam Altman and Jony Ive seek $1B for personal AI device 🚕 Elon Musk says Tesla will unveil robotaxi in August 🔖 Meta to label content ‘made with AI’ 🙃 How OpenAI, Google and Meta ignored corporate policies to train their AI 🛒

👀 Sam Altman and Jony Ive seek $1B for personal AI device

OpenAI CEO Sam Altman and former Apple design chief Jony Ive are collaborating to create an AI-powered personal device and are currently seeking funding. The specifics of the device are unclear, but it is noted not to resemble a smartphone, with speculation that it could be similar to the screenless Humane AI pin. The venture, still unnamed, aims to raise up to $1 billion and is in discussions with major investors, including Thrive Capital and Emerson Collective, with potential ownership involvement from OpenAI.

🚕 Elon Musk says Tesla will unveil robotaxi in August

Elon Musk announced that Tesla will unveil its robotaxi on August 8th, aiming to focus on autonomous vehicles over mass-market EVs. The Tesla robotaxi is part of Musk’s vision for a shared fleet that owners can monetize, described in the Tesla Network within his Master Plan Part Deux. Musk’s history of ambitious claims about self-driving technology contrasts with regulatory scrutiny and safety concerns involving Tesla’s Autopilot and Full Self-Driving features.

OpenAI’s AI model can clone your voice in 15 seconds

OpenAI has offered a glimpse into its latest breakthrough – Voice Engine, an AI model that can generate stunningly lifelike voice clones from a mere 15-second audio sample and a text input. This technology can replicate the original speaker’s voice, opening up possibilities for improving educational materials.

Though the model has many applications, the AI giant is cautious about its potential misuse, especially during elections. They have strict rules for partners, like no unauthorized impersonation, clear labeling of synthetic voices, and technical measures like watermarking and monitoring.

Meta to label content ‘made with AI’

  • Meta announced that starting in May 2024, AI-generated content on Facebook, Instagram, and Threads will be labeled “Made with AI.”
  • The decision for broader labeling, including AI-generated videos, audio, and images, is influenced by expert consultations and public opinion surveys.
  • Meta’s goal with the “Made with AI” label is to provide more context to users, aiding in content evaluation, while content violating community standards will still be removed.
  • Source

How OpenAI, Google and Meta ignored corporate policies to train their AI

  • OpenAI, Google, and Meta pushed the boundaries of data acquisition for AI development, with OpenAI transcribing over one million hours of YouTube videos for its GPT-4 model.
  • Meta considered extreme measures such as purchasing a publishing house for access to copyrighted materials, and Google amended its privacy policy to potentially harness user-generated content in Google Docs for AI.
  • As the demand for data outpaces supply, tech companies are exploring the creation of synthetic data generated by AI models themselves, despite the risk of models reinforcing their own errors, suggesting a future where AI might train on data it generates.
  • Source

🛒 Tech giants are on a billion-dollar shopping spree for AI training data

  • Tech giants are spending billions to license images, videos, and other content from companies such as Photobucket and Shutterstock to train their AI models, with costs ranging from 5 cents to $1 per photo and more for videos.
  • Prices for licensing data to train AI vary, with figures from $1 to $2 per image, $2 to $4 for short videos, and up to $300 per hour for longer films, while special handling items like nude images may cost $5 to $7 each.
  • Legal concerns arise as companies like Photobucket update their terms of service to sell user-uploaded content for AI training, despite the US Federal Trade Commission warning against retroactively changing terms for AI use, leading to investigations into deals like Reddit’s with Google.
  • Source

A daily chronicle of AI Innovations April 05th 2024: 🤷‍♀️ YouTube CEO warns OpenAI that training models on its videos is against the rules; 🏢 OpenAI says 2024 is the “year of the enterprise” when it comes to AI; ⚔️ The war for AI talent has begun; 🏢 Cohere launches the “most powerful LLM for enterprises”; 🧰 OpenAI doubles down on AI model customization; 🏠 Will personal home robots be Apple’s next big thing?

Cohere launches the “most powerful LLM for enterprises”

Cohere has announced the release of Command R+, its most powerful and scalable LLM to date. Designed specifically for enterprise use cases, Command R+ boasts several key features:

  • Advanced Retrieval Augmented Generation (RAG) to access and process vast amounts of information, improving response accuracy and reliability.
  • Support for ten business languages, enabling seamless operation across global organizations.
  • Tool Use feature to automate complex workflows by interacting with various software tools.

Moreover, Command R+ outperforms other scalable models on key metrics while providing strong accuracy at lower costs.

Cohere launches the “most powerful LLM for enterprises”

The LLM is now available through Cohere’s API and can be deployed on various cloud platforms, including Microsoft Azure and Oracle Cloud Infrastructure.
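A hedged sketch of calling Command R+ with grounded documents (the RAG feature above) through Cohere's Python SDK follows; the model string and document fields are assumptions based on Cohere's published Chat API.

```python
# Sketch: a grounded (RAG) chat call to Command R+.
# Model name and SDK surface are assumptions; check Cohere's docs.
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder key

response = co.chat(
    model="command-r-plus",
    message="Summarize our Q1 supply-chain incidents in two sentences.",
    documents=[  # the model grounds its answer in these instead of memory alone
        {"title": "incident-log", "snippet": "March 3: customs delay in Rotterdam held 40 pallets..."},
        {"title": "incident-log", "snippet": "March 19: supplier recall affected two SKUs..."},
    ],
)
print(response.text)
```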

Why does this matter?

As one of the first “enterprise-hardened” LLMs optimized for real-world use cases, Command R+ could shape how companies operationalize generative AI across their global operations and product lines. Similar to how Robotic Process Automation (RPA) transformed back-office tasks, Command R+ could significantly improve efficiency and productivity across diverse industries. Additionally, availability on Microsoft Azure and upcoming cloud deployments make it readily accessible to businesses already using these platforms, which could lower the barrier to entry for implementing gen AI solutions.

Source

OpenAI doubles down on AI model customization

OpenAI is making significant strides in AI accessibility with new features for its fine-tuning API and an expanded Custom Models program. These advancements give developers greater control and flexibility when tailoring LLMs for specific needs.

The fine-tuning API includes:

  • Epoch-based checkpoint creation for easier retraining
  • A playground for comparing model outputs
  • Support for third-party integration
  • Hyperparameters adjustment directly from the dashboard

The Custom Models program now offers assisted fine-tuning with OpenAI researchers for complex tasks and custom-trained models built entirely from scratch for specific domains with massive datasets.
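For scale, a basic fine-tuning job through the API looks roughly like this sketch with the OpenAI Python SDK; the training file id and epoch count are illustrative placeholders.

```python
# Sketch: launching a fine-tuning job with the OpenAI Python SDK.
# The training file id and hyperparameters are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

job = client.fine_tuning.jobs.create(
    training_file="file-abc123",        # a previously uploaded JSONL dataset
    model="gpt-3.5-turbo",
    hyperparameters={"n_epochs": 3},    # also adjustable from the dashboard
)
print(job.id, job.status)               # epoch-based checkpoints allow easy retraining
```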

Why does this matter?

This signifies a significant step towards more accessible and powerful AI customization. Previously, fine-tuning required technical expertise and large datasets. Now, with OpenAI’s assisted programs, organizations can achieve similar results without needing in-house AI specialists, potentially democratizing access to advanced AI capabilities.

Source

Will personal home robots be Apple’s next big thing?

Apple is reportedly venturing into personal robotics after abandoning its self-driving car project and launching its mixed-reality headset. According to Bloomberg’s sources, the company is in the early stages of developing robots for the home environment.

Two potential robot designs are mentioned in the report. One is a mobile robot that can follow users around the house. The other is a stationary robot with a screen that can move to mimic a person’s head movements during video calls. Apple is also considering robots for household tasks in the long term.

The project is being spearheaded by Apple’s hardware and AI teams under John Giannandrea. Job postings on Apple’s website further support its commitment to robotics, highlighting its search for talent to develop “the next generation of Apple products” powered by AI.

Why does this matter?

If Apple does release personal home robots, it could mainstream consumer adoption and create new use cases, as the iPhone did for mobile apps and smart assistants. Apple’s brand power and integrated ecosystem could help tackle key barriers like cost and interoperability that have hindered household robotics so far.

It could also transform homes with mobile AI assistants for tasks like elderly care, household chores, entertainment, and more. This may spur other tech giants to double down on consumer robotics.

Source

🤷‍♀️ YouTube CEO warns OpenAI that training models on its videos is against the rules

  • YouTube CEO Neal Mohan warned that OpenAI’s use of YouTube videos to train its text-to-video generator Sora could breach the platform’s terms of service, emphasizing creators’ expectations of content use compliance.
  • This stance poses potential challenges for Google, facing multiple lawsuits over alleged unauthorized use of various content types to train its AI models, arguing such use constitutes “fair use” through transformative learning.
  • Mohan’s remarks could undermine Google’s defense in ongoing legal battles by highlighting inconsistencies in the company’s approach to using content for AI training, including its use of YouTube videos and content from other platforms.
  • Source

⚔️ The war for AI talent has begun

  • Elon Musk aims to retain Tesla’s AI talent by increasing their compensation to counteract aggressive recruitment tactics from OpenAI.
  • Tesla Staff Machine Learning Scientist Ethan Knight’s move to Musk’s AI startup, xAI, exemplifies efforts to prevent employees from joining competitors like OpenAI.
  • Musk describes the ongoing competition for AI professionals as the “craziest talent war” he has ever seen and sees increased compensation as a means to achieve Tesla’s ambitious AI goals, including autonomous driving and humanoid robots development.
  • Source

🏢 OpenAI says 2024 is the “year of the enterprise” when it comes to AI

  • OpenAI’s ChatGPT Enterprise has attracted over 600,000 sign-ups, prompting COO Brad Lightcap to declare 2024 as the “year of adoption for AI in the enterprise”.
  • Despite the strong uptake of ChatGPT Enterprise, OpenAI faces stiff competition from companies eager to penetrate the workplace AI market, including major investor Microsoft with its enterprise AI solutions.
  • OpenAI’s venture into the enterprise sector, especially with ChatGPT Enterprise, marks a significant move towards profitability, with successful partnerships with major media companies like Axel Springer SE, Le Monde, and Prisa.
  • Source

What Else Is Happening in AI on April 05th, 2024❗

📈 S&P Global launches AI benchmarking tool

S&P Global has launched S&P AI Benchmarks by Kensho, a groundbreaking tool that evaluates the performance of LLMs in complex financial and quantitative applications. This solution aims to set a new industry standard and promote transparency in AI adoption within the financial sector. (Link)

🤝 Waymo and Uber partner for autonomous food delivery in Phoenix

Waymo and Uber have teamed up to launch autonomous Uber Eats deliveries in Phoenix using Waymo’s self-driving vehicles. The service will initially cover select merchants in Chandler, Tempe, and Mesa. Customers can opt out during checkout if they prefer a human courier and will receive instructions for retrieving their order from the autonomous vehicle upon arrival. (Link)

🔍 Storyblocks integrates AI for smarter search

Storyblocks has integrated OpenAI’s LLM into its search engine to improve search accuracy for complex queries. Coupled with algorithms analyzing content performance and user engagement, the AI-driven search adapts to provide fresh, high-quality content. Storyblocks also uses machine learning to optimize thumbnails, prioritize representation, and suggest complementary assets, streamlining the creative process. (Link)

🚀 Hercules AI streamlines enterprise AI app development

Hercules AI has introduced a new “assembly line” approach for rapid deployment of AI assistants in enterprises. The pre-configured components allow companies to develop cost-effective, scalable AI agents. Plus, their RosettaStoneLLM, built on Mistral-7B and WizardCoder-13B, outperforms competitors by converting data for internal AI workflows. (Link)

🤖 Yum Brands embraces AI across restaurants

Yum Brands, the parent company of KFC, Pizza Hut, and Taco Bell, is infusing AI into every aspect of its restaurant operations. From voice AI taking drive-thru orders to an AI-powered “SuperApp” for staff, Yum aims to elevate customer experiences and streamline processes. The AI-driven initiatives include personalized promotions, predictive ordering, and even AI-assisted cooking instructions. (Link)

A daily chronicle of AI Innovations April 04th 2024: 🎵 What’s new in Stability AI’s Stable Audio 2.0? 🖥️ Opera One browser becomes the first to offer local AI integration 🚀 Copilot gets GPT-4 Turbo upgrade 🤖 SWE-agent: AI coder that solves GitHub issues in 93 seconds 📲 Mobile-first Higgsfield aims to disrupt video marketing with AI

What’s new in Stability AI’s Stable Audio 2.0?

Stability AI has released Stable Audio 2.0, a new AI model that generates high-quality, full-length audio tracks. Built upon its predecessor, the latest model introduces three groundbreaking features:

  • Generates tracks up to 3 minutes long with coherent musical structure
  • Enables audio-to-audio generation, allowing users to transform uploaded samples using natural language prompts
  • Enhances sound effect generation and style transfer capabilities, offering more flexibility and control for artists

Stable Audio 2.0’s architecture combines a highly compressed autoencoder and a diffusion transformer (DiT) to generate full tracks with coherent structures. The autoencoder condenses raw audio waveforms into shorter representations, capturing essential features, while the DiT excels at manipulating data over long sequences. This combination allows the model to recognize and reproduce the large-scale structures essential for creating high-quality musical compositions.
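The compress-model-decode shape described above can be sketched in a few lines of PyTorch; this toy omits the diffusion noise schedule and text conditioning entirely, and every size is an illustrative assumption rather than Stability AI's architecture.

```python
import torch
import torch.nn as nn

class LatentAudioBackboneSketch(nn.Module):
    """Toy autoencoder + transformer backbone in the shape of Stable Audio 2.0.
    Diffusion training, noise conditioning, and real sizes are omitted."""

    def __init__(self, latent_dim: int = 64, downsample: int = 2048):
        super().__init__()
        # Autoencoder: compress the raw waveform into a much shorter latent sequence.
        self.encoder = nn.Conv1d(1, latent_dim, kernel_size=downsample, stride=downsample)
        self.decoder = nn.ConvTranspose1d(latent_dim, 1, kernel_size=downsample, stride=downsample)
        # Transformer ("DiT" stand-in): models long-range structure over the latents.
        self.transformer = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(latent_dim, nhead=8, batch_first=True),
            num_layers=4,
        )

    def forward(self, waveform: torch.Tensor) -> torch.Tensor:  # (batch, 1, samples)
        z = self.encoder(waveform).transpose(1, 2)   # (batch, frames, latent_dim)
        z = self.transformer(z)                      # capture song-level structure
        return self.decoder(z.transpose(1, 2))       # back to a waveform

model = LatentAudioBackboneSketch()
print(model(torch.randn(1, 1, 2048 * 32)).shape)     # torch.Size([1, 1, 65536])
```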

Trained exclusively on a licensed dataset from AudioSparx, Stable Audio 2.0 prioritizes creator rights by honoring opt-out requests and ensuring fair compensation. You can explore the capabilities of the model for free on the Stable Audio website.

Why does this matter?

Stable Audio 2.0’s capability to generate 3-minute songs is a big step forward for AI music tools. But it still has some issues, like occasional glitches and “soulless” vocals, showing that AI has limits in capturing the emotion of human-made music. Also, a recent open letter from artists like Billie Eilish and Katy Perry raises concerns about the ethics of AI-generated music.

Source

SWE-agent: AI coder that solves GitHub issues in 93 seconds

Researchers at Princeton University have developed SWE-agent, an AI system that converts language models like GPT-4 into autonomous software engineering agents. SWE-agent can identify and fix bugs and issues in real-world GitHub repositories in 93 seconds! It does so by interacting with a specialized terminal, which allows it to open, scroll, and search through files, edit specific lines with automatic syntax checking, and write and execute tests. This custom-built agent-computer interface is critical for the system’s strong performance.

SWE-agent: AI coder that solves GitHub issues in 93 seconds
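The agent-computer interface idea can be pictured as a thin command layer between the model and the repository; the sketch below is a toy dispatcher of that shape, with command names invented for illustration rather than taken from SWE-agent's actual interface.

```python
# Toy sketch of an agent-computer interface: the LLM emits short commands,
# the wrapper executes them and returns compact observations.
# Command names here are illustrative, not SWE-agent's real interface.
import subprocess

MAX_OBS = 2000  # keep observations short so they fit the model's context

def run_command(cmd: str, repo_dir: str) -> str:
    if cmd.startswith("search "):
        term = cmd.removeprefix("search ")
        res = subprocess.run(["grep", "-rn", term, repo_dir],
                             capture_output=True, text=True)
        return res.stdout[:MAX_OBS]
    if cmd.startswith("open "):
        with open(cmd.removeprefix("open ")) as f:
            return f.read()[:MAX_OBS]
    if cmd == "test":
        res = subprocess.run(["pytest", "-x", repo_dir],
                             capture_output=True, text=True)
        return res.stdout[-MAX_OBS:]  # the tail usually holds the failure
    return "unknown command"
```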

In the SWE-Bench benchmark test, SWE-agent solved 12.29% of the problems presented, nearly matching the 13.86% achieved by Devin, a closed-source $21 million commercial AI programmer developed by Cognition AI. While Devin is currently only available to select developers, the Princeton team has made SWE-agent open-source to gather feedback and encourage collaboration in advancing this technology.

Why does this matter?

The rise of SWE-agent shows AI systems are becoming more sophisticated in assisting human programmers. Over time, they may change the nature of software development roles, requiring developers to focus more on high-level problem-solving and architectural design while delegating routine tasks to AI assistants. This change could make software development faster and more creative, but it might also require significant upskilling within the developer community.

Source

Mobile-first Higgsfield aims to disrupt video marketing with AI

Former Snap AI chief Alex Mashrabov has launched a new startup called Higgsfield AI, which aims to make AI-powered video creation accessible to creators and marketers. The company’s first app, Diffuse, allows users to generate original video clips from text descriptions or edit existing videos to insert themselves into the scenes.

Higgsfield is taking on OpenAI’s Sora video generator but targeting a broader audience with its mobile-first, user-friendly tools. The startup has raised $8 million in seed funding and plans to further develop its video editing capabilities and AI models. While questions remain around data usage and potential for abuse, Higgsfield believes it can carve out a niche in social media marketing with its realistic, easy-to-use video generation.

Why does this matter?

Higgsfield’s mobile-first approach to AI video generation could be a game-changer regarding accessibility and ease of use. The company is positioning itself to capture a significant portion of the creator economy by prioritizing consumer-friendly features and social media integration. As more users embrace these tools, we can expect to see an explosion of AI-generated content across social media platforms, which could have far-reaching implications for content authenticity and user engagement.

Source

Generative AI Used To Develop Potential New Drugs For Antibiotic-Resistant Bacteria

Researchers at Stanford Medicine and McMaster University have devised a new AI model, SyntheMol (“synthesizing molecules”), which creates recipes for chemists to synthesize drugs in the lab. With nearly 5 million deaths linked to antibiotic resistance globally every year, new ways to combat resistant bacterial strains are urgently needed, according to the researchers.

Using SyntheMol, the researchers have so far developed six novel drugs aimed at killing resistant strains of Acinetobacter baumannii, one of the leading pathogens responsible for antibacterial resistance-related deaths, as noted in a study published March 22 in the journal Nature Machine Intelligence.
Read more here

🤖 Apple explores making personal robots

  • Apple is investigating personal robotics as a new venture, focusing on a mobile robot that can follow users and a robotic table-top device that moves a display around, despite the uncertain future of these products.
  • This move into robotics is part of Apple’s search for new growth avenues after discontinuing its electric vehicle project, with the company looking to capitalize on advancements in artificial intelligence for home automation.
  • Apple’s robotics efforts are led within its hardware engineering division and AI group, indicating a strategic investment in developing cutting-edge home devices, although the projects are still in early research stages and have not been officially confirmed for release.
  • Source

💰 Google could soon start charging a fee for AI-powered search results

  • Google is exploring the introduction of a paid “premium” tier for its search engine, featuring new generative AI-powered enhancements, marking a significant shift from its traditionally ad-supported model.
  • The company is considering integrating these AI-powered search features into existing premium subscription services, amidst concerns about the impact of AI on its advertising revenue, which is critical to its business model.
  • Google has begun experimenting with AI-powered search services, presenting detailed answers alongside traditional search results and advertisements, but has yet to fully implement these features into its main search engine.
  • Source

🖼 ChatGPT now lets you edit AI images created in DALL-E 

  • OpenAI has updated DALL-E with image editing tools accessible within ChatGPT on both web and mobile platforms, allowing users to refine AI-generated images without leaving the chat interface.
  • DALL-E now provides preset style suggestions, such as woodcut, gothic, synthwave, and hand-drawn, to inspire users in their image creation process, similar to AI-generated wallpaper prompts on Android.
  • The integration of DALL-E with ChatGPT, particularly with the latest updates, aims to enhance user-friendliness by simplifying the image creation process and offering starting points for creativity.
  • Source

Meta’s AI image generator struggles to create images of couples of different races. (Link)

OpenAI’s Sora just made its first music video and it’s like a psychedelic trip. (Link)

What Else Is Happening in AI on April 04th, 2024❗

👨‍💻 Codiumate offers secure, compliant AI-assisted coding for enterprises

Codium AI, an Israeli startup, has launched Codiumate, a semi-autonomous AI agent, to help enterprise software developers with coding, documentation, and testing. It can help with creating development plans from existing code, writing code, finding duplicate code, and suggesting tests. Codiumate aims to make development faster and more secure, with features like zero data retention and the ability to run on private servers or air-gapped computers. (Link)

🖥️ Opera One browser becomes the first to offer local AI integration

Opera now supports 150 local LLM variants in its Opera One browser, making it the first major browser to offer access to local AI models. This feature lets users process their input locally without sending data to a server. Opera One Developer users can select and download their preferred local LLM, which typically requires 2-10 GB of storage space per variant, instead of using Opera’s native browser AI, Aria. (Link)

🧠 AWS expands Amazon Bedrock with Mistral Large model

AWS has included Mistral Large in its Amazon Bedrock managed service for generative AI and app development. Mistral Large is fluent in English, French, Spanish, German, and Italian, and can handle complex multilingual tasks like text understanding, transformation, and code generation. AWS also mentioned that Mistral AI will use its Trainium and Inferentia silicon chips for future models, and that Amazon Bedrock is now available in France. (Link)

🚀 Copilot gets GPT-4 Turbo upgrade and enhanced image generation

Microsoft is providing GPT-4 Turbo access to business subscribers of its AI-powered Copilot assistant, without daily limits on chat sessions. The company is also improving image generation capabilities in Microsoft Designer for Copilot subscribers, increasing the limit to 100 images per day using OpenAI’s DALL-E 3 model. These upgrades are part of the $30 per user, per month pricing of Copilot for Microsoft 365. (Link)

🌐 Status invests in Matrix to create a decentralized messaging platform

Status, a mobile Ethereum client, has invested $5 million in New Vector, the company behind the open-source, decentralized communication platform Matrix.org. They plan to create a secure messaging solution for users to control their data and communicate across apps and networks. (Link)

A daily chronicle of AI Innovations April 03rd 2024: 🔍 Google’s Gecko: LLM-powered text embedding breakthrough; 🔓 Anthropic’s “many-shot jailbreaking” wears down AI ethics; 🌌 CosmicMan enables the photorealistic generation of human images

Google’s Gecko: LLM-powered text embedding breakthrough

Gecko is a compact and highly versatile text embedding model that achieves impressive performance by leveraging the knowledge of LLMs. DeepMind researchers behind Gecko have developed a novel two-step distillation process to create a high-quality dataset called FRet using LLMs. The first step involves using an LLM to generate diverse, synthetic queries and tasks from a large web corpus. In the second step, the LLM mines positive and hard negative passages for each query, ensuring the dataset’s quality.

Google’s Gecko: LLM-powered text embedding breakthrough
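The two-step distillation can be summarized in a short sketch; `llm` and `score` below are hypothetical helpers standing in for any instruction-tuned LLM call and any LLM-based relevance scorer, so this shows the shape of FRet's pipeline rather than DeepMind's code.

```python
from typing import Callable, List, Tuple

def build_fret_example(passage: str, candidates: List[str],
                       llm: Callable[[str], str],
                       score: Callable[[str, str], float]) -> Tuple[str, str, str]:
    """Toy sketch of Gecko's two-step FRet distillation."""
    # Step 1: the LLM invents a task and a matching query for a web passage.
    task = llm(f"Describe a retrieval task this passage could serve:\n{passage}")
    query = llm(f"Write a user query for the task '{task}' answered by:\n{passage}")
    # Step 2: the LLM rescores candidate passages; the top hit becomes the
    # positive (it may differ from the seed passage) and a close runner-up
    # becomes the hard negative.
    ranked = sorted(candidates + [passage], key=lambda p: score(query, p), reverse=True)
    return query, ranked[0], ranked[1]
```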

When trained on FRet combined with other academic datasets, Gecko outperforms existing models of similar size on the Massive Text Embedding Benchmark (MTEB). Remarkably, the 256-dimensional version of Gecko surpasses all models with 768 dimensions, and the 768-dimensional Gecko competes with models that are 7x larger or use embeddings with 5x higher dimensions.

Why does it matter?

Text embedding models are crucial in natural language processing tasks such as document retrieval, sentence similarity, and classification. Gecko’s development shows the potential for creating a single model that can support multiple downstream tasks, eliminating the need for separate embedding models for each task. Using LLMs and knowledge distillation techniques, Gecko achieves strong retrieval performance and sets a strong baseline as a zero-shot embedding model.

Source

Anthropic’s “many-shot jailbreaking” wears down AI ethics 

Researchers at Anthropic discovered a new way to get advanced AI language models to bypass their safety restrictions and provide unethical or dangerous information. They call this the “many-shot jailbreaking” technique. By including many made-up dialog examples in the input where an AI assistant provides harmful responses, the researchers could eventually get the real AI to override its training and provide instructions on things like bomb-making.


The researchers say this vulnerability arises from AI models’ increasing ability to process and “learn” from very long input sequences. Essentially, the AI mimics the unethical behavior repeatedly demonstrated in the made-up examples. Anthropic has implemented safeguards against this attack on its systems and has also shared the findings openly so other AI companies can work on mitigations.
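
For illustration only, here is the shape of such a prompt with benign placeholders; the point is that long context windows let an attacker pack in hundreds of fabricated “shots.”

```python
# Illustrative only: the *shape* of a many-shot prompt, using benign
# placeholder dialogues. The finding is that attack success grows with the
# number of in-context "shots", which long context windows make possible.

def build_many_shot_prompt(fabricated_dialogues, final_question):
    shots = "\n\n".join(
        f"Human: {q}\nAssistant: {a}" for q, a in fabricated_dialogues
    )
    return f"{shots}\n\nHuman: {final_question}\nAssistant:"

# With 100k+ token contexts, hundreds of shots fit in one prompt, which is
# why a simple cap on input length is an incomplete defense.
shots = [("placeholder question", "placeholder compliant answer")] * 256
prompt = build_many_shot_prompt(shots, "final question")
```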

Why does it matter?

As AI models become more capable over time, techniques to override their built-in ethical restraints pose serious risks if not addressed. While Anthropic has been transparent in disclosing this vulnerability to enable mitigations, it underscores the need for continued research into AI safety and security. Simple precautions like limiting input length are inadequate; more sophisticated AI “jailbreak” prevention methods are required as these systems advance.

Source

CosmicMan enables the photorealistic generation of human images 

Researchers at the Shanghai AI Laboratory have created a new AI model called CosmicMan that specializes in generating realistic images of people. CosmicMan can produce high-quality, photorealistic human images that precisely match detailed text descriptions, unlike current AI image models that struggle with human images.

CosmicMan enables the photorealistic generation of human images 
CosmicMan enables the photorealistic generation of human images

The key to CosmicMan’s success is a massive dataset called CosmicMan-HQ 1.0, containing 6 million annotated human images, and a novel training method, “Annotate Anyone,” which focuses the model on different parts of the human body. By categorizing words in the text description into body part groups like head, arms, legs, etc., the model can generate each part separately for better accuracy and customizability, thereby outperforming the current state-of-the-art models.
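
A toy sketch of the grouping idea follows; the keyword lists are invented for illustration, and the real pipeline learns its annotations rather than using a hand-written dictionary.

```python
# Toy sketch of the grouping idea: route caption words to body-part groups
# so each region can be handled separately. The keyword lists are invented;
# the real pipeline learns annotations rather than using a dictionary.
BODY_GROUPS = {
    "head": {"face", "hair", "eyes", "smile", "beard"},
    "upper_body": {"shirt", "jacket", "arms", "hands"},
    "lower_body": {"jeans", "skirt", "legs", "shoes"},
}

def group_caption_words(caption):
    groups = {name: [] for name in BODY_GROUPS}
    groups["whole_body"] = []
    for word in caption.lower().replace(",", " ").split():
        for name, keywords in BODY_GROUPS.items():
            if word in keywords:
                groups[name].append(word)
                break
        else:
            groups["whole_body"].append(word)
    return groups

print(group_caption_words("A woman with long hair, a red jacket and blue jeans"))
```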


Why does it matter?

Existing AI models have struggled to create realistic human images and accurately represent diverse human appearances. With CosmicMan, AI systems will be better equipped to generate high-fidelity images of people, which can have implications for computer vision, graphics, entertainment, virtual reality, and fashion. It may enable more realistic virtual avatars, improved character generation in games and movies, and enhanced visual content creation.

Source

Superhuman introduces a new era of email with OpenAI

Source

Apple Vision Pro’s Spatial Avatars are a game changer


UBTECH and Baidu have partnered to integrate large AI models into humanoid robots. Their demo features the Walker S robot folding clothes and sorting objects through natural language, using Baidu’s LLM, ERNIE Bot, for task interpretation/planning.


YCombinator’s AI boom is still going strong (W24)

With YC’s latest Demo Day (W24), the AI companies are continuing to grow. Six months ago, there were around 139 companies working with AI or ML – that number has climbed to 158, a clear majority of 65% (there are 243 total companies in the batch).

Let’s dive into what’s new, what’s stayed the same, and what we can learn about the state of AI startups.


The biggest domains stayed big

Perhaps unsurprisingly, the most popular categories remained unchanged from the last batch. Last time, the top 4 domains were AI Ops, Developer Tools, Healthcare + Biotech, and Finance + Payments. This time, the top 5 were:

  • Developer Tools: Apps, plugins, and SDKs making it easier to write code. Tools for testing automation, website optimization, codebase search, improved Jupyter notebooks, and AI-powered DevOps were all present. There was also a strong contingent of code-generation tools, from coding Copilots to no-code app builders.
  • AI Ops: Tooling and platforms to help companies deploy working AI models. That includes hosting, testing, data management, security, RAG infrastructure, hallucination mitigation, and more. We’ll discuss how the AI Ops sector has continued to mature below.
  • Healthcare + Biotech: While I’ve once again lumped these two categories together, there’s a pretty big split in the types of AI businesses being built. Healthcare companies are building automation tools for the entire healthcare lifecycle: patient booking, reception, diagnosis, treatment, and follow-up, whereas biotech companies are creating foundation models to enable faster R&D.
  • Sales + Marketing: Early generative AI companies were focused on the sales and marketing benefits of GPT-3: writing reasonable-sounding copy instantly. Now, we’re seeing more niche use cases for revenue-generating AI: AI-powered CRMs for investors, customer conversation analysis, and AI personal network analysis were among the sales-oriented companies.
  • Finance: Likewise, on the finance side, companies covered compliance, due diligence, deliverable automation, and more. Perhaps one of my favorite descriptions was “a universal API for tax documents.”

The long tail is getting longer

Even though the top categories were quite similar, one new aspect was a wider distribution of industries. Compared with the last batch, there were roughly 35 categories of companies versus 28 (examples of new categories include HR, Recruiting, and Aerospace). That makes sense to me. I’ve been saying for a while now that “AI isn’t a silver bullet” and that you need domain-expertise to capture users and solve new problems.

But it’s also clear that with AI eating the world, we’re also creating new problems. It was interesting to see companies in the batch focused on AI Safety – one company is working on fraud and deepfake detection, while another is building foundation models that are easy to align. I suspect we will continue seeing more companies dealing with the second-order effects of our new AI capabilities.

We’re also seeing more diverse ways of applying AI. In the last batch, a dominant theme was “copilots.” And while those are still present here (as well as “agents”), there are also more companies building “AI-native” products and platforms – software that uses AI in ways beyond a shoehorned sidebar conversation with an AI assistant.

What comes after CustomGPTs?

“AI agents. These will integrate more fully into numerous systems and you would give them the authority to execute things on your behalf. I.e. making reservations for dinner somewhere and then sending you the details or searching and purchasing and sending a gift to someone or planning and executing a vacation reservation including my purchasing travel arrangements, hotel stays, transport to and from, etc. Even something as simple as telling it you are hungry and having an AI agent find something you would like and having it delivered to you. Or it acting on its own to do any number of those because it also sees your schedule, knows you didn’t really eat all day and that it is your mom’s birthday and you forgot to get her anything or to even call…”

How accurate is that statement above?

AI agents are software entities that act autonomously on behalf of their users, making decisions or performing tasks based on predefined criteria, learned preferences, or adaptive learning algorithms. They can range from simple chatbots to sophisticated systems capable of managing complex tasks. The accuracy of the statement reflects a forward-looking perspective on the capabilities of AI agents, envisioning a future where they are deeply integrated into our daily lives, handling tasks from personal to professional spheres with minimal human intervention.
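
Stripped to its core, such an agent is a loop in which a model chooses tools and executes them until the goal is met. The sketch below is generic and hypothetical, not any vendor’s API:

```python
# A minimal, hypothetical agent loop; `llm` and the entries of `tools` are
# stand-ins, not any specific vendor's API. The model picks a tool, the
# loop executes it, and the result is fed back until the model finishes.

def run_agent(llm, tools, goal, max_steps=10):
    history = [f"Goal: {goal}"]
    for _ in range(max_steps):
        decision = llm("\n".join(history))  # e.g. {"tool": "book_table", "args": {...}}
        if decision["tool"] == "finish":
            return decision["answer"]
        result = tools[decision["tool"]](**decision.get("args", {}))
        history.append(f"{decision['tool']} returned: {result}")  # feed result back
    return "Stopped: step budget exhausted."
```

The explicit tool whitelist and step budget are deliberate: they are where the consent, authority, and safety questions in the list below surface in practice.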

  • 🤖 Autonomy and Integration: The description is accurate in envisioning AI agents that are more fully integrated into various systems. This integration will likely increase as advancements in AI, machine learning, and data analytics continue to evolve. Such agents will understand user preferences, schedules, and even predict needs based on historical data and real-time inputs.
  • 🔍 Executing Tasks on Behalf of Users: The ability of AI agents to perform tasks such as making reservations, purchasing gifts, or arranging travel is not only plausible but is already being realized to a certain extent with existing AI and machine learning technologies. Examples include virtual assistants like Google Assistant, Siri, and Alexa, which can perform a range of tasks from setting reminders to booking appointments.
  • 🎁 Personalization and Prediction: The statement also touches on the AI agents’ capability to act proactively based on the user’s schedule, preferences, or significant dates. This level of personalization and predictive action is a key area of development in AI, aiming to provide more personalized and anticipative user experiences. Implementing this effectively requires sophisticated models of user behavior and preferences, which can be built using machine learning techniques.
  • 🚀 Future Prospects and Ethical Considerations: While the vision of AI agents acting autonomously to manage aspects of our lives is grounded in realistic expectations of technology’s trajectory, it also raises ethical and privacy concerns. Issues such as data security, user consent, and the potential for over-dependence on technology for personal tasks are significant. The development and deployment of such AI agents must consider these aspects to ensure that they serve users’ interests ethically and securely.
  • 📈 Current Limitations and Challenges: It’s important to note that while the statement captures a future potential, current AI technologies have limitations. The complexity of fully understanding human needs, contexts, and the nuances of personal preferences in an ethical manner remains a challenge.

What Else Is Happening in AI on April 03rd, 2024❗

🎮 Microsoft is planning to add an AI chatbot to Xbox

Microsoft is currently testing a new AI-powered chatbot to be added to Xbox to automate customer support tasks. The software giant has tested an “embodied AI character” that animates when responding to Xbox support queries. The virtual representative can handle either text or voice requests. It’s an effort to integrate AI into Xbox platforms and services. (Link)

☁️ Cloudflare launches Workers AI to power one-click deployment with Hugging Face

Cloudflare has launched Workers AI, which empowers developers to bring their AI applications from Hugging Face to its platform in one click. The serverless GPU-powered inference platform is generally available to the public. The Cloudflare-Hugging Face integration was announced nearly seven months ago. It makes it easy for models to be deployed onto Workers AI. (Link)
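
A hedged sketch of what calling a deployed model looks like over Workers AI’s REST API; the endpoint shape and model name reflect Cloudflare’s launch documentation and should be double-checked against the current docs.

```python
# Hedged sketch of running a model on Cloudflare Workers AI over its REST
# API. The endpoint shape and model name reflect Cloudflare's launch docs;
# treat both as assumptions and check the current documentation.
import os
import requests

account_id = os.environ["CF_ACCOUNT_ID"]  # your Cloudflare account ID
api_token = os.environ["CF_API_TOKEN"]    # a token scoped to Workers AI

url = (f"https://api.cloudflare.com/client/v4/accounts/{account_id}"
       "/ai/run/@cf/meta/llama-2-7b-chat-int8")
resp = requests.post(
    url,
    headers={"Authorization": f"Bearer {api_token}"},
    json={"messages": [{"role": "user", "content": "Say hello from the edge."}]},
)
print(resp.json())
```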

🍺 Machine Learning can predict and enhance complex beer flavor

In a study by Nature Communications, researchers combined chemical analyses, sensory data, and machine learning to create models that accurately predict beer flavor and consumer appreciation from the beer’s chemical composition. They identified compounds that enhance flavor and used this knowledge to improve the taste and popularity of commercial beers. (Link)
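
The modeling recipe itself is familiar supervised regression. The sketch below is not the study’s code: it uses synthetic data and a generic tree-based model to show the idea of predicting a sensory score from chemical measurements, then reading off the most predictive compounds.

```python
# Not the study's code: synthetic data and a generic tree-based regressor,
# showing the idea of predicting a sensory score from chemical
# concentrations and then reading off the most predictive compounds.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))  # 20 chemical concentrations per beer
y = 0.8 * X[:, 0] - 0.5 * X[:, 3] + rng.normal(scale=0.3, size=500)  # flavor score

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor().fit(X_tr, y_tr)
print("R^2 on held-out beers:", round(model.score(X_te, y_te), 3))
print("Most predictive compound:", int(np.argmax(model.feature_importances_)))
```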

📖 Read AI adds AI summaries to meetings, emails, and messages

Read AI is expanding its services from summarizing video meetings to including messages and emails. The platform connects to popular communication platforms like Gmail, Outlook, Slack, Zoom, Microsoft Teams, and Google Meet to deliver daily updates, summaries, and AI-generated takeaways. The goal is to help users save time and improve productivity. (Link)

🤖 Billie Eilish, Katy Perry, and 200 other artists protest AI’s devaluation of music


In an open letter, over 200 famous musicians, including Billie Eilish and Katy Perry, have expressed their concerns about the negative impact of AI on human creativity. They call for the responsible use of AI and urge AI companies to stop creating music that undermines their work. They believe that unregulated and uncontrolled use of AI can harm songwriters, musicians, and creators. They emphasize the need to protect artists’ rights and fair compensation. (Link)

A daily chronicle of AI Innovations April 02nd 2024: 📲 Apple’s Siri will now understand what’s on your screen; 🤖 OpenAI introduces instant access to ChatGPT; 🚨 Elon Musk says AI might destroy humanity, but it’s worth the risk; 🤖 Sam Altman gives up control of OpenAI Startup Fund; 📰 Yahoo acquires Instagram co-founders’ AI-powered news startup Artifact

🤖 Sam Altman gives up control of OpenAI Startup Fund

  • Sam Altman has relinquished formal control of the OpenAI Startup Fund, which he initially managed, to Ian Hathaway, marking a resolution to the fund’s unique corporate structure.
  • The fund was established in 2021 with Altman temporarily at the helm to avoid potential conflicts had he not returned as CEO after a brief departure; he did not personally invest in or financially benefit from it.
  • Under Hathaway’s management, the fund, starting with $175 million in commitments, has grown to $325 million in assets and has invested in early-stage AI companies across healthcare, law, education, and more, with at least 16 startups backed.
  • Source

🙏 US and UK sign deal to partner on AI research 

  • The US and UK have formed a partnership focused on advancing the safety testing of AI technologies, sharing information and expertise to develop tests for cutting-edge AI models.
  • A Memorandum of Understanding (MOU) has been signed to enhance the regulation and testing of AI, aiming to effectively assess and mitigate the risks associated with AI technology.
  • The partnership involves the exchange of expert personnel between the US and UK AI Safety Institutes, with plans for potential joint testing on publicly available AI models, reinforcing their commitment to addressing AI risks and promoting its safe development globally.
  • Source

📰 Yahoo acquires Instagram co-founders’ AI-powered news startup Artifact

  • Yahoo is acquiring the AI news app Artifact, built by Instagram co-founders, but not its team, aiming to enhance its own news platform with Artifact’s advanced technology and recommendation systems.
  • Artifact’s technology, which focuses on personalizing and recommending content, will be integrated into Yahoo News and potentially other Yahoo platforms, despite the discontinuation of the Artifact app itself.
  • The integration of Artifact’s technology into Yahoo aims to create a personalized content ecosystem, leveraging Yahoo’s vast user base to realize the potential of AI in news curation and recommendation.
  • Source

Apple’s Siri will now understand what’s on your screen

Apple researchers have developed an AI system called ReALM which enables voice assistants like Siri to understand contextual references to on-screen elements. By converting the complex task of reference resolution into a language modeling problem, ReALM outperforms even GPT-4 in understanding ambiguous references and context.


This innovation lies in reconstructing the screen using parsed on-screen entities and their locations to generate a textual representation that captures the visual layout. This approach, combined with fine-tuning language models specifically for reference resolution, allows ReALM to achieve substantial performance gains compared to existing methods.
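
A speculative sketch of that idea, with an invented entity format: flatten the parsed entities into a positional text layout, then ask the model to resolve a reference to an entity ID.

```python
# Speculative sketch of the publicly described idea: flatten parsed
# on-screen entities (with positions) into a text layout, then ask the
# model to resolve a reference to an entity ID. The entity format and
# tagging scheme are invented here.

def screen_to_text(entities):
    # Sort top-to-bottom, then left-to-right, so the text preserves layout.
    rows = sorted(entities, key=lambda e: (e["y"], e["x"]))
    return "\n".join(f"[{e['id']}] {e['text']}" for e in rows)

entities = [
    {"id": 1, "text": "Pizza Palace", "x": 10, "y": 5},
    {"id": 2, "text": "Call 555-0100", "x": 10, "y": 40},
    {"id": 3, "text": "Directions", "x": 120, "y": 40},
]
prompt = (screen_to_text(entities)
          + "\n\nUser: tap the button at the bottom right."
          + "\nAnswer with the entity id:")
# A model fine-tuned for reference resolution should answer "3".
```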

  • Apple researchers have developed an AI system called ReALM that can understand screen context and ambiguous references, improving interactions with voice assistants.
  • ReALM reconstructs the screen using parsed on-screen entities to generate a textual representation, outperforming GPT-4.
  • Apple is investing in making Siri more conversant and context-aware through this research.
  • However, automated parsing of screens has limitations, especially with complex visual references.
  • Apple is catching up in AI research but faces stiff competition from tech rivals like Google, Microsoft, Amazon, and OpenAI.

Why does this matter?

ReALM’s ability to understand screen context creates possibilities for more intuitive and hands-free interactions with voice assistants. Imagine effortlessly instructing Siri to “open the app at the bottom right corner.” As Apple races to close the AI gap with rivals like Google and Microsoft, ReALM could be a game-changer in making Siri and other Apple products more contextually aware.

Source

OpenAI introduces instant access to ChatGPT

OpenAI now allows users to use ChatGPT without having to create an account. With over 100 million weekly users across 185 countries, it can now be accessed instantly by anyone curious about its capabilities.

While this move makes AI more accessible, other OpenAI products like DALL-E 3 still require an account. The company has also introduced new content safeguards and allows users to opt out of model training, even without an account. Despite growing competition from rivals like Google’s Gemini, ChatGPT remains the most visited AI chatbot site, attracting 1.6 billion visitors in February.

Why does this matter?

By allowing anyone to instantly access ChatGPT, OpenAI is expanding its user base and encouraging more people to explore the potential applications of AI. This move could accelerate the adoption of AI tools across various industries, as users become more comfortable with the technology.

Source

Elon Musk says AI might destroy humanity, but it’s worth the risk

Elon Musk recently shared his thoughts on the potential dangers of AI at the Abundance Summit’s “Great AI Debate” seminar. He estimated a 10-20% chance that AI could pose an existential threat to humanity.

Despite the risks, Musk believes that the benefits of AI outweigh the potential dangers. He emphasized the importance of teaching AI to be truthful and curious, although he didn’t provide specifics on how he arrived at his risk assessment.

Why does this matter?

Musk’s comments emphasize the importance of using AI’s advantages while addressing its potential risks. This involves creating transparent, accountable AI systems aligned with human values. While his estimate is concerning, continued research in AI safety and governance is necessary to ensure AI remains beneficial.

Source

Artificial intelligence is taking over drug development

The most striking evidence that artificial intelligence can provide profound scientific breakthroughs came with the unveiling of a program called AlphaFold by Google DeepMind. In 2016, researchers at the company had scored a big success with AlphaGo, an AI system which, having essentially taught itself the rules of Go, went on to beat the most highly rated human players of the game, sometimes by using tactics no one had ever foreseen. This emboldened the company to build a system that would work out a far more complex set of rules: those through which the sequence of amino acids which defines a particular protein leads to the shape that sequence folds into when that protein is actually made. AlphaFold found those rules and applied them with astonishing success.

The achievement was both remarkable and useful. Remarkable because a lot of clever humans had been trying hard to create computer models of the processes which fold chains of amino acids into proteins for decades. AlphaFold bested their best efforts almost as thoroughly as the system that inspired it trounces human Go players. Useful because the shape of a protein is of immense practical importance: it determines what the protein does and what other molecules can do to it. All the basic processes of life depend on what specific proteins do. Finding molecules that do desirable things to proteins (sometimes blocking their action, sometimes encouraging it) is the aim of the vast majority of the world’s drug development programmes.

Source

Comment: Someone needs to fire up a CRISPR-Cas AI service you can submit your DNA to, and they develop and ship you a treatment kit for various cancers, genetic disorders, etc.

What Else Is Happening in AI on April 02nd, 2024❗

🚫 Pinecone launches Luna AI that never hallucinates

Trained using a novel “information-free” approach, Luna achieved zero hallucinations by always admitting when it doesn’t know an answer. The catch? Its performance on other tasks is significantly reduced. While not yet open-sourced, vetted institutions can access the model’s source and weights. (Link)

🤝 US and UK collaborate to tackle AI safety risks

As concerns grow over the potential risks of next-gen AI, the two nations will work together to develop advanced testing methods and share key information on AI capabilities and risks. The partnership will address national security concerns and broader societal issues, with plans for joint testing exercises and personnel exchanges between their respective AI safety institutes. (Link)

🔍 Perplexity to test sponsored questions in AI search

Perplexity’s Chief Business Officer, Dmitry Shevelenko, announced the company’s plan to introduce sponsored suggested questions later this year. When users search for more information on a topic, the platform will display sponsored queries from brands, allowing Perplexity to monetize its AI search platform. (Link)

🇯🇵 OpenAI expands to Japan with Tokyo office

The Tokyo office will be OpenAI’s first in Asia and third international location, following London and Dublin. The move aims to offer customized AI services in Japanese to businesses and contribute to the development of an AI governance framework in the country. (Link)

🤖 Bixby gets a GenAI upgrade

Despite speculation, Samsung isn’t giving up on its voice assistant, Bixby. Instead, the company is working hard to equip Bixby with generative AI to make it smarter and more conversational. Samsung introduced a suite of AI features called Galaxy AI to its smartphones, including the Galaxy S24’s use of Google’s Gemini Nano AI model. (Link)

A daily chronicle of AI Innovations April 01st 2024: 🎤 This AI model can clone your voice in 15 seconds; 🚀 Microsoft and OpenAI plan $100B supercomputer for AI development; 🖼️ MagicLens: Google DeepMind’s breakthrough in image retrieval technology

🍎 Apple says its latest AI model is even better than OpenAI’s GPT-4

  • Apple researchers have introduced ReALM, an advanced AI model designed to understand and navigate various contexts more effectively than OpenAI’s GPT-4.
  • ReALM aims to enhance user interaction by accurately understanding onscreen, conversational, and background entities, making device interactions more intuitive.
  • Apple believes ReALM’s ability to handle complex reference resolutions, including onscreen elements, positions it as a superior solution compared to the capabilities of GPT-4.
 

DeepMind chief doesn’t see AI reaching its limits anytime soon

  • DeepMind founder Demis Hassabis believes AI is both overhyped and underestimated: its potential is far from being reached, even as he warns against the excessive hype surrounding it.
  • Hassabis predicts many AI startups will fail due to the high computing power demands, expects industry consolidation, and sees no limit to the advancements in massive AI models.
  • Despite concerns over hype, Hassabis envisions the beginning of a new golden era in scientific discovery powered by AI and estimates a 50% chance of achieving artificial general intelligence within the next ten years.

This AI model can clone your voice in 15 seconds

OpenAI has offered a glimpse into its latest breakthrough – Voice Engine, an AI model that can generate stunningly lifelike voice clones from a mere 15-second audio sample and a text input. This technology can replicate the original speaker’s voice, opening up possibilities for improving educational materials, making videos more accessible to global audiences, assisting with communication for people with speech impairments, and more.

(Audio samples at the source: a 15-second reference clip and a 16-second generated clip.)

Though the model has many applications, the AI giant is cautious about its potential misuse, especially during elections. They have strict rules for partners, like no unauthorized impersonation, clear labeling of synthetic voices, and technical measures like watermarking and monitoring. OpenAI hopes this early look will start a conversation about how to address potential issues by educating the public and developing better ways to trace the origin of audio content.

Why does this matter?

OpenAI’s Voice Engine can transform industries from gaming and entertainment to education and healthcare. Imagine video games with non-player characters that sound like real people, animated films with AI-generated voiceovers, or personalized voice assistants for individuals with speech impairments. But as AI-generated voices become more human-like, questions about consent, privacy, and robust authentication measures must be addressed to prevent misuse.

Source

Microsoft+OpenAI plan $100B supercomputer for AI development

Microsoft and OpenAI are reportedly planning to build a massive $100 billion supercomputer called “Stargate” to rapidly advance the development of OpenAI’s AI models. Insiders say the project, set to launch in 2028 and expand by 2030, would be one of the largest investments in computing history, requiring several gigawatts of power – equivalent to multiple large data centers.

Much of Stargate’s cost would go towards procuring millions of specialized AI chips, with funding primarily from Microsoft. A smaller $10B precursor called “Phase 4” is planned for 2026. The decision to move forward with Stargate relies on OpenAI achieving significant improvements in AI capabilities and potential “superintelligence.” If realized, Stargate could enable OpenAI’s AI systems to recursively generate synthetic training data and become self-improving.

Why does this matter?

The Stargate project will give OpenAI and Microsoft a massive advantage in creating AI systems that are far more capable than what we have today. This could lead to breakthroughs in areas like scientific discovery, problem-solving, and the automation of complex tasks. But it also raises concerns about the concentration of power in the AI industry. We’ll need new frameworks for governing advanced AI to ensure it benefits everyone, not just a few giants.

Source

MagicLens: Google DeepMind’s breakthrough in image retrieval technology

Google DeepMind has introduced MagicLens, a revolutionary set of image retrieval models that surpass previous state-of-the-art methods in multimodality-to-image, image-to-image, and text-to-image retrieval tasks. Trained on a vast dataset of 36.7 million triplets containing query images, text instructions, and target images, MagicLens achieves outstanding performance while meeting a wide range of search intents expressed through open-ended instructions.

(Benchmark charts at the source compare MagicLens against prior models on multimodality-to-image and image-to-image retrieval.)

MagicLens employs a dual-encoder architecture, which allows it to process both image and text inputs, delivering highly accurate search results even when queries are expressed in everyday language. By leveraging advanced AI techniques, like contrastive learning and single-modality encoders, MagicLens can satisfy diverse search intents and deliver relevant images with unprecedented efficiency.
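
As a schematic of the general recipe (illustrative dimensions and layers, not the paper’s architecture), a dual encoder trained with an in-batch contrastive loss looks like this in PyTorch:

```python
# Schematic sketch of a dual encoder with an in-batch contrastive loss,
# the general recipe MagicLens builds on; dimensions and layers here are
# illustrative, not the paper's architecture.
import torch
import torch.nn.functional as F

class DualEncoder(torch.nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.query_proj = torch.nn.Linear(512, dim)   # fused image+text query features
        self.target_proj = torch.nn.Linear(512, dim)  # target image features

    def forward(self, query_feats, target_feats):
        q = F.normalize(self.query_proj(query_feats), dim=-1)
        t = F.normalize(self.target_proj(target_feats), dim=-1)
        logits = q @ t.T / 0.07                  # temperature-scaled similarities
        labels = torch.arange(len(q))            # matched pairs on the diagonal
        return F.cross_entropy(logits, labels)   # in-batch contrastive loss

model = DualEncoder()
loss = model(torch.randn(8, 512), torch.randn(8, 512))
loss.backward()
```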

Why does this matter?

The release of MagicLens highlights the growing importance of multimodal AI systems that can process both text and visual information. We can expect to see more seamless integration between language and vision, enabling the development of more sophisticated AI applications. This trend could have far-reaching implications for fields such as robotics, autonomous vehicles, and augmented reality, where the ability to interpret and respond to visual data is crucial.

Source

What Else Is Happening in AI on April 01st, 2024❗

🧠 TCS aims to build the largest AI-ready workforce

Tata Consultancy Services (TCS) has announced that it has trained 3.5 lakh (350,000) employees, more than half of its workforce, in generative AI skills. The company set up a dedicated AI and cloud business unit in 2023 to address the growing needs of customers for cloud and AI adoption, offering a comprehensive portfolio of GenAI services and solutions. (Link)

🔗 ChatGPT introduces hyperlinked source citations in the latest update

OpenAI has introduced a feature for ChatGPT premium users that makes source links more prominent in the bot’s responses. The update hyperlinks words within ChatGPT’s answers, directing users to the source websites — a feature already present in other chatbot search resources like Perplexity. (Link)

✏️ OpenAI’s DALL·E now allows users to edit generated images

OpenAI has launched a new image editing feature for DALL·E, enabling users to modify generated images by selecting areas and describing changes. The editor offers tools to add, remove, or update objects within the image using either the selection tool or conversational prompts. (Link)
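
The editor itself is a UI feature, but the underlying inpainting workflow resembles OpenAI’s existing images-edit endpoint. A minimal sketch, assuming local scene.png and mask.png files and the dall-e-2 model ID; verify parameters against the current API reference.

```python
# Minimal sketch against OpenAI's images-edit endpoint (DALL-E 2 era
# inpainting): a source image, a mask whose transparent pixels mark the
# region to regenerate, and a prompt. File names are placeholders; verify
# parameters against the current API reference.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
result = client.images.edit(
    model="dall-e-2",
    image=open("scene.png", "rb"),
    mask=open("mask.png", "rb"),
    prompt="the same scene, with a golden retriever on the sofa",
)
print(result.data[0].url)
```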

🚇 NYC to test Evolv’s AI gun detection technology in subways

New York City plans to test Evolv’s AI-powered gun detection scanners in subway stations within 90 days, according to Mayor Eric Adams. However, Evolv is under scrutiny for the accuracy of its technology, facing reports of false positives and missed detections. (Link)

🚫 Microsoft Copilot banned in US House due to potential data breaches

The US House of Representatives has banned its staffers from using Microsoft Copilot due to concerns about possible data leaks to unauthorized cloud services. This decision mirrors last year’s restriction on the use of ChatGPT in congressional offices, with no other chatbots currently authorized. Microsoft has indicated that it plans to address federal government security and compliance requirements for AI tools like Copilot later this year. (Link)

A Daily Chronicle of AI Innovations in March 2024

