AI Innovations in July 2024
Welcome to our blog series “AI Innovations in July 2024”! As we continue to ride the wave of extraordinary developments from June, the momentum in artificial intelligence shows no signs of slowing down. Last month, we witnessed groundbreaking achievements such as the unveiling of the first quantum AI chip, the successful deployment of autonomous medical drones in remote areas, and significant advancements in natural language understanding that have set new benchmarks for AI-human interaction.
July promises to be just as exhilarating, with researchers, engineers, and visionaries pushing the boundaries of what’s possible even further. In this evolving article, updated daily throughout the month, we’ll dive deep into the latest AI breakthroughs, advancements, and milestones shaping the future.
From revolutionary AI-powered technologies and cutting-edge research to the societal and ethical implications of these innovations, we provide you with a comprehensive and insightful look at the rapidly evolving world of artificial intelligence. Whether you’re an AI enthusiast, a tech-savvy professional, or simply someone curious about the future, this blog will keep you informed, inspired, and engaged.
Join us on this journey of discovery as we explore the frontiers of AI, uncovering the innovations that are transforming industries, enhancing our lives, and shaping our future. Stay tuned for daily updates, and get ready to be amazed by the incredible advancements happening in the world of AI!
Meta has introduced Meta 3D Gen, a new state-of-the-art, fast pipeline for text-to-3D asset generation. It offers 3D asset creation with high prompt fidelity and high-quality 3D shapes and textures in less than a minute.
According to Meta, the process is three to 10 times faster than existing solutions. The research paper adds that, when assessed by professional 3D artists, the output of 3DGen is preferred a majority of the time over industry alternatives, particularly for complex prompts, while being from 3× to 60× faster.
A significant feature of 3D Gen is its support for physically based rendering (PBR), which is necessary for relighting 3D assets in real-world applications.
Why does it matter?
3D Gen’s implications extend far beyond Meta’s sphere. In gaming, it could speed up the creation of expansive virtual worlds, allowing rapid prototyping. In architecture and industrial design, it could facilitate quick concept visualization, expediting the design process.
Source: https://ai.meta.com/research/publications/meta-3d-gen/
Perplexity AI has improved Pro Search to tackle more complex queries, perform advanced math and programming computations, and deliver even more thoroughly researched answers. Everyone can use Pro Search five times every four hours for free, and Pro subscribers have unlimited access.
Perplexity suggests the upgraded Pro Search “can pinpoint case laws for attorneys, summarize trend analysis for marketers, and debug code for developers—and that’s just the start”. It can empower all professions to make more informed decisions.
Why does it matter?
This showcases AI’s potential to assist professionals in specialized fields. Such advancements also push the boundaries of AI’s practical applications in research and decision-making processes.
Source: https://www.perplexity.ai/hub/blog/pro-search-upgraded-for-more-advanced-problem-solving
Edgeless Systems introduced Continuum AI, the first generative AI framework that keeps prompts encrypted at all times with confidential computing by combining confidential VMs with NVIDIA H100 GPUs and secure sandboxing.
The Continuum technology has two main security goals: to protect user data and to protect AI model weights against the infrastructure, the service provider, and others. Edgeless Systems is also collaborating with NVIDIA to empower businesses across sectors to confidently integrate AI into their operations.
Why does it matter?
This greatly advances security for LLMs. The technology could be pivotal for a future where organizations can securely utilize AI, even for the most sensitive data.
Source: https://developer.nvidia.com/blog/advancing-security-for-large-language-models-with-nvidia-gpus-and-edgeless-systems
Announced a few weeks ago, Gen-3 is Runway’s latest frontier model and a big upgrade from Gen-1 and Gen-2. It allows users to produce hyper-realistic videos from text, image, or video prompts. Users must upgrade to a paid plan to use the model.
Source: https://venturebeat.com/ai/runways-gen-3-alpha-ai-video-model-now-available-but-theres-a-catch
In a job listing, Meta mentioned it is seeking to research and prototype “new consumer experiences” with new types of gameplay driven by Gen AI. It is also planning to build Gen AI-powered tools that could “improve workflow and time-to-market” for games.
Source: https://techcrunch.com/2024/07/02/meta-plans-to-bring-generative-ai-to-metaverse-games
As a part of its AI agreement with OpenAI, Apple will get an observer role on OpenAI’s board. Apple chose Phil Schiller, the head of Apple’s App Store and its former marketing chief, for the position.
Source: https://www.theverge.com/2024/7/2/24191105/apple-phil-schiller-join-openai-board
Figma’s Make Design feature generates UI layouts and components from text prompts. It repeatedly reproduced Apple’s Weather app when used as a design aid, drawing accusations that Figma’s AI seems heavily trained on existing apps.
Source: https://techcrunch.com/2024/07/02/figma-disables-its-ai-design-feature-that-appeared-to-be-ripping-off-apples-weather-app
According to the World Intellectual Property Organization (WIPO), more than 50,000 patent applications were filed in the past decade for Gen AI. More than 38,000 GenAI inventions were filed by China between 2014 and 2023 vs. only 6,276 by the U.S.
Source: https://www.reuters.com/technology/artificial-intelligence/china-leading-generative-ai-patents-race-un-report-says-2024-07-03
Source: https://www.techradar.com/pro/google-says-its-emissions-have-grown-nearly-50-due-to-ai-data-center-boom-and-heres-what-it-plans-to-do-about-it
Source: https://www.maginative.com/article/meta-unveils-3d-gen-ai-that-creates-detailed-3d-assets-in-under-a-minute/
Runway just announced that its AI video generator, Gen-3 Alpha, is now available to all users following weeks of impressive, viral outputs after the model’s release in mid-June.
Despite impressive recent releases from KLING and Luma Labs, Runway’s Gen-3 Alpha model feels like the biggest leap AI video has taken since Sora. However, the tiny generation limits for non-unlimited plans might be a hurdle for power users.
Source: https://x.com/runwayml/status/1807822396415467686
Motorola just launched its ‘Styled By Moto’ ad campaign, an entirely AI-generated fashion spot promoting its new line of Razr folding smartphones — created using nine different AI tools, including Sora and Midjourney.
This is a fascinating look at the AI-powered stack used by a major brand, and a glimpse at how tools can (and will) be combined to open new creative avenues. It’s also another example of the shift in discourse surrounding AI’s use in marketing — potentially paving the way for wider acceptance and integration.
Elon Musk has announced the release dates for two new AI assistants from xAI. The first, Grok 2, will be launched in August. Musk says Grok 2 is inspired by JARVIS from Iron Man and The Hitchhiker’s Guide to the Galaxy and aims to answer virtually any user query. This ambitious goal is fueled by xAI’s focus on “purging” LLM datasets used for training.
Musk also revealed that an even more powerful version, Grok 3, is planned for release by the end of the year. Grok 3 will leverage the processing power of 100,000 Nvidia H100 GPUs, potentially pushing the boundaries of AI performance even further.
Why does it matter?
These advanced AI assistants from xAI are intended to compete with and outperform AI chatbots like OpenAI’s ChatGPT by focusing on data quality, user experience, and raw processing power. This will significantly advance the state of AI and transform how people interact with and leverage AI assistants.
Source: https://www.coinspeaker.com/xai-grok-2-elon-musk-jarvis-ai-assistant/
Apple and the Swiss Federal Institute of Technology Lausanne (EPFL) have released a public demo of the ‘4M’ AI model on Hugging Face. The 4M (Massively Multimodal Masked Modeling) model can process and generate content across multiple modalities, such as creating images from text, detecting objects, and manipulating 3D scenes using natural language inputs.
While companies like Microsoft and Google have been making headlines with their AI partnerships and offerings, Apple has been steadily advancing its AI capabilities. The public demo of the 4M model suggests that Apple is now positioning itself as a significant player in the AI industry.
Why does it matter?
By making the 4M model publicly accessible, Apple is seeking to engage developers to build an ecosystem. It could lead to more coherent and versatile experiences, such as enhanced Siri capabilities and advancements in Apple’s augmented reality efforts.
Source: https://venturebeat.com/ai/apple-just-launched-a-public-demo-of-its-4m-ai-model-heres-why-its-a-big-deal
Amazon is hiring the co-founders, including the CEO, and several other key employees from the AI startup Adept. CEO David Luan will join Amazon’s AGI autonomy group, which is led by Rohit Prasad, who is spearheading a unified push to accelerate Amazon’s AI progress across different divisions like Alexa and AWS.
Amazon is consolidating its AI projects to develop a more advanced LLM to compete with OpenAI and Google’s top offerings. This unified approach leverages the company’s collective resources to accelerate progress in AI capabilities.
Why does it matter?
This hiring move indicates Amazon’s intent to strengthen its position in the competitive AI landscape. By bringing the Adept team on board, Amazon gains the team’s expertise and specialized knowledge to advance its AGI aspirations.
Source: https://www.bloomberg.com/news/articles/2024-06-28/amazon-hires-top-executives-from-ai-startup-adept-for-agi-team
YouTube lets people request the removal of AI-generated content that simulates their face or voice. Under YouTube’s privacy request process, the requests will be reviewed based on whether the content is synthetic, whether it identifies the person, and whether it shows the person in sensitive behavior.
Source: https://techcrunch.com/2024/07/01/youtube-now-lets-you-request-removal-of-ai-generated-content-that-simulates-your-face-or-voice
Meta is refining its AI photo labeling on Instagram and Facebook. The “Made with AI” label will be replaced with “AI info” to more accurately reflect the extent of AI use in images, from minor edits to full AI generation. It addresses photographers’ concerns about the mislabeling of their photos.
Source: https://techcrunch.com/2024/07/01/meta-changes-its-label-from-made-with-ai-to-ai-info-to-indicate-use-of-ai-in-photos
Deepfake-related losses will increase from $12.3 billion in 2023 to $40 billion by 2027, growing at 32% annually. There was a 3,000% increase in incidents last year alone. Enterprises are not well-prepared to defend against deepfake attacks, with one in three having no strategy.
Source: https://venturebeat.com/security/deepfakes-will-cost-40-billion-by-2027-as-adversarial-ai-gains-momentum
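The growth figures quoted above can be sanity-checked with simple compound-growth arithmetic. The sketch below is only an illustration: the function names are ours, and it assumes the 32% rate compounds once per year over the four years from 2023 to 2027. Under that assumption, $12.3 billion grows to roughly $37 billion, and reaching $40 billion would imply a rate a little over 34%, so the quoted numbers are consistent only approximately.

```python
def project(start: float, rate: float, years: int) -> float:
    """Compound `start` forward by `rate` per year for `years` years."""
    return start * (1 + rate) ** years


def implied_cagr(start: float, end: float, years: int) -> float:
    """Annual growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1 / years) - 1


# Figures quoted in the article, in billions of US dollars.
LOSSES_2023 = 12.3
LOSSES_2027 = 40.0
YEARS = 2027 - 2023  # assumption: four annual compounding steps

projected = project(LOSSES_2023, 0.32, YEARS)
rate = implied_cagr(LOSSES_2023, LOSSES_2027, YEARS)

print(f"$12.3B compounded at 32% for {YEARS} years: ${projected:.1f}B")
print(f"Annual rate implied by $12.3B -> $40B: {rate:.1%}")
```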
Anthropic is launching a program to fund new AI benchmarks. The aim is to create more comprehensive evaluations of AI models, including assessments of capabilities in cyberattacks and weapons as well as beneficial applications like scientific research and bias mitigation.
Source: https://techcrunch.com/2024/07/01/anthropic-looks-to-fund-a-new-more-comprehensive-generation-of-ai-benchmarks
China has criticized the US approach to regulating and restricting investments in AI. Chinese officials stated that US actions targeting AI are not helpful for AI’s healthy and sustainable development. They argued that the US measures will be divisive when it comes to global governance of AI.
Source: https://www.reuters.com/technology/artificial-intelligence/china-says-us-targeting-ai-not-helpful-healthy-development-2024-07-01
Source: https://www.independent.co.uk/tech/robot-human-brain-china-b2571978.html
Source: https://finance.yahoo.com/news/french-antitrust-regulators-set-charge-151406034.html?
Google has rolled out significant improvements to its Vertex AI platform, including the general availability of Gemini 1.5 Flash with a massive 1 million-token context window. Also, Gemini 1.5 Pro now offers an industry-leading 2 million-token context capability. Google is introducing context caching for these Gemini models, slashing input costs by 75%.
Moreover, Google launched Imagen 3 in preview and added third-party models like Anthropic’s Claude 3.5 Sonnet on Vertex AI.
They’ve also made Grounding with Google Search generally available and announced a new service for grounding AI agents with specialized third-party data. Plus, they’ve expanded data residency guarantees to 23 countries, addressing growing data sovereignty concerns.
Why does it matter?
Google is positioning Vertex AI as the most “enterprise-ready” generative AI platform. With expanded context windows and improved grounding capabilities, this move also addresses concerns about the accuracy of Google’s AI-based search features.
Source: https://cloud.google.com/blog/products/ai-machine-learning/vertex-ai-offers-enterprise-ready-generative-ai
LMSYS Org added image recognition to Chatbot Arena to compare vision language models (VLMs), collecting over 17,000 user preferences in just two weeks. OpenAI’s GPT-4o and Anthropic’s Claude 3.5 Sonnet outperformed other models in image recognition. Also, the open-source LLaVA-v1.6-34B performed comparably to some proprietary models.
These AI models tackle diverse tasks, from deciphering memes to solving math problems with visual aids. However, the examples provided show that even top models can stumble when interpreting complex visual information or handling nuanced queries.
Why does it matter?
This leaderboard isn’t just a tech popularity contest—it shows how advanced AI models can decode images. However, the varying performance also serves as a reality check, reminding us that while AI can recognize a cat in a photo, it might struggle to interpret your latest sales graph.
Source: https://lmsys.org/blog/2024-06-27-multimodal
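For readers wondering how pairwise user preferences can become a leaderboard, here is a minimal sketch of an Elo-style rating update. This is an assumption for illustration: the item above does not say how Chatbot Arena computes its rankings, and the model names, votes, starting rating, and `K` value below are all hypothetical.

```python
from collections import defaultdict

K = 32                 # update step size (illustrative choice)
START_RATING = 1000.0  # rating given to every model before any votes


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the Elo model."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))


def record_vote(ratings: dict, winner: str, loser: str, k: float = K) -> None:
    """Shift rating points from the loser to the winner of one comparison."""
    delta = k * (1 - expected_score(ratings[winner], ratings[loser]))
    ratings[winner] += delta
    ratings[loser] -= delta


# Hypothetical votes: each tuple is (preferred model, other model).
votes = [
    ("model_a", "model_b"),
    ("model_a", "model_c"),
    ("model_b", "model_c"),
    ("model_a", "model_b"),
]

ratings = defaultdict(lambda: START_RATING)
for winner, loser in votes:
    record_vote(ratings, winner, loser)

# Highest rating first.
leaderboard = sorted(ratings.items(), key=lambda item: item[1], reverse=True)
for name, rating in leaderboard:
    print(f"{name}: {rating:.1f}")
```

An upset win against a higher-rated model moves more points than an expected win, which is why a modest number of votes can already separate models.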
Apple is reportedly working to bring its Apple Intelligence features to the Vision Pro headset, though not this year. Meanwhile, Apple is tweaking its in-store Vision Pro demos, allowing potential buyers to view personal media and try a more comfortable headband. Apple’s main challenge is adapting its AI features to a mixed-reality environment.
The company is tweaking its retail strategy for Vision Pro demos, hoping to boost sales of the pricey headset. Apple is also exploring the possibility of monetizing AI features through subscription services like “Apple Intelligence+.”
Why does it matter?
Apple’s Vision Pro, with its 16GB RAM and M2 chip, can handle advanced AI tasks. However, cloud infrastructure limitations are causing a delay in launch. It’s a classic case of “good things come to those who wait.”
Source: https://www.bloomberg.com/news/newsletters/2024-06-30/apple-s-longer-lasting-devices-ios-19-and-apple-intelligence-on-the-vision-pro-ly1jnrw4
Agility Robotics just signed a multi-year deal with GXO Logistics to bring the company’s Digit humanoid robots to warehouses, following a successful pilot in Spanx facilities in 2023.
Is RaaS the new SaaS? Soon, every company will be looking to adopt advanced robotics into their workforce — and subscription services could help lower the financial and technical barriers needed to scale without the massive upfront costs.
Source: https://agilityrobotics.com/content/gxo-signs-industry-first-multi-year-agreement-with-agility-robotics
Google just announced its largest-ever expansion of Google Translate, adding support for 110 new languages enabled by the company’s PaLM 2 large language model.
We’ve talked frequently about AI’s coming power to break down language barriers with its translation capabilities — but the technology is also playing a very active role in both uncovering and preserving languages from lost and endangered cultures.
Source: https://blog.google/products/translate/google-translate-new-languages-2024
Amazon’s update to its AI assistant Q provides real-time, step-by-step guides for customer issues in call centers. It aims to reduce the “toggle tax” – time wasted switching between applications. The system listens to calls in real time and automatically provides relevant information.
Source: https://venturebeat.com/ai/amazon-upgrades-ai-assistant-q-to-make-call-centers-way-more-efficient
In a WhatsApp beta for Android, users will be able to choose between two options: faster responses with Llama 3-70B (default) or more complex queries with Llama 3-405B (advanced). Llama 3-405B will be limited to a certain number of prompts per week. This feature aims to give users more control over their AI interactions.
Source: https://wabetainfo.com/whatsapp-beta-for-android-2-24-14-7-whats-new/
Bill Gates claims that while data centers may consume up to 6% of global electricity, AI will ultimately drive greater energy efficiency. Gates believes tech companies will invest in green energy to power their AI operations, potentially offsetting the increased demand.
Source: https://www.theregister.com/2024/06/28/bill_gates_ai_power_consumption
Perplexity appears to be scraping websites that have forbidden access through robots.txt. AWS prohibits customers from violating the robots.txt standard. Perplexity uses an unpublished IP address to access websites that block its official crawler. The company claims a third party performs web crawling for them.
Source: https://www.wired.com/story/aws-perplexity-bot-scraping-investigation
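As background on what honoring robots.txt looks like in practice, the sketch below uses Python's standard `urllib.robotparser` to check whether a crawler may fetch a URL. The robots.txt content, the `ExampleBot` and `OtherBot` user agents, and the example.com URLs are hypothetical and are not taken from the reporting above. Note that robots.txt is advisory: a crawler has to choose to run a check like this.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks ExampleBot everywhere and keeps
# every other crawler out of /private/ only.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A well-behaved crawler asks before every fetch.
examplebot_ok = parser.can_fetch("ExampleBot", "https://example.com/articles/1")
otherbot_ok = parser.can_fetch("OtherBot", "https://example.com/articles/1")
otherbot_private_ok = parser.can_fetch("OtherBot", "https://example.com/private/notes")

print(examplebot_ok, otherbot_ok, otherbot_private_ok)
```

The check depends on the user agent string the crawler declares, so a crawler that identifies itself differently would be evaluated against different rules.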
Mustafa Suleyman claimed that anything published online becomes “freeware” and fair game for AI training. This stance, however, contradicts basic copyright principles and ignores the legal complexities of fair use. He suggests that robots.txt might protect content from scraping.
Source: https://www.theverge.com/2024/6/28/24188391/microsoft-ai-suleyman-social-contract-freeware
Source: https://www.techradar.com/computing/is-apple-intelligence-the-new-icloud-ai-platform-tipped-to-get-new-subscription-tier
Source: https://the-decoder.com/metas-new-hot3d-dataset-could-enable-robots-to-learn-manual-skills-from-human-experts/
RGB color is what is used by VAE encoding and decoding as far as I know, but has anyone tried creating a VAE that takes CMYK or even arbitrary spectrum telemetry as input and can decode into other color spaces? CMYK is perhaps not a great example, as I believe that CMYK is a proper subset of RGB (though the reverse is not true). But there are many color spaces used by a large variety of processes out there, and I would think that being able to train models on their use would be helpful. Then again, CMYK decoding could potentially be more useful than trying to convert from latent space to RGB to CMYK (e.g. for print). I did a few quick searches on Google Scholar and basic web search and didn't see anything, but this is not my area of expertise.
submitted by /u/Tyler_Zoro
Hey, everyone, I want to create a bot that can respond in my style using my chat history. What should I do, and what approaches can I take to achieve this? Thanks in advance!
submitted by /u/particle4dev
Have been working with TensorFlow since the year it was made available, following the progression that got us to LLMs, going back to the "Attention Is All You Need" paper... What a ride. Here are a handful of fun predictions, some based on knowledge shared with me internally about active proofs of concept, some speculative.

Entertainment and the future of popular music

Consumption

The music industry missed the boat on streaming and has to share huge profits with Spotify, Apple Music, etc. for stream distribution. But they will unite behind a new streaming service that gives content creators the ability to use AI-generated vocals, instrumental parts, etc. to create custom mashups and even new tracks using the likeness of artists. Record labels have been buying up artist catalogs; recently KISS and Queen were sold, and these contracts included the right to use their "likeness", enabling future derived AI works to be owned by the labels. The vision here is a new platform combining the addictive nature of TikTok with the on-demand streaming of Spotify, where "creators" use the AI and IP resources of the record labels to make AI-generated works popular, at first focused on using assets from established artists from the past and present, but eventually fostering creators who contribute the vocals, instrument parts, etc. to be fed into the model for entirely new possibilities.

Live performance

Studio plug-ins like Auto-Tune will be taken to the next level for both studio use and live applications. Want to sound like Frank Sinatra? There will be a plug-in for that.

Workplace

Future employment contracts will come with a clause giving employers the right to create AI agents of your likeness; employees will be required to sign over the rights and allow their employer to use these agents indefinitely. The employer will also retain the right to train this AI bot, which can be used as a substitute for you, on all work-related communications. "Training days" will become common, where corporations will have employees take part in knowledge dumps, providing answers to questions to aid in the training of their bot likeness. On nights and weekends your bot can be used in place of you for some scenarios, and even once employment has been terminated. This is a coming battle between workers and employers, a fight over the fair use of their likeness; expect substantial abuse and legal drama before it resolves.

Everyday consumer use

AI agents working for you to save you time is the holy grail big tech is after. The problem is that, to achieve it, big tech needs your personal behavioral data to fine-tune these models into behaving as you would. Considering about 70% of the online behavior a model would mimic occurs on smartphones, the key players in a position to build these sorts of agents are Google and Apple. Microsoft's huge investment in OpenAI was wise, but it's a rather small cohort using Windows PCs for all the things an agent would need to be built on, and even in those cases Microsoft has to get consent to collect that data. Apple doesn't have their own LLM, so I would say Google is probably best positioned to innovate in this area and create something like the Rabbit R1, but real and just embedded into Android, Chrome, etc. I think this will manifest in the device being able to recognize common patterns and trigger actions. Sort of like today, when we get in our car and our phone tells us how far away we are from its best guess at where we're going, it will be able to respond to much more sensory input, text messages, social posts, e-mails, phone calls, etc. and just act to save us time. Saving time is the point of most technology, as "time is money", so the use cases around that aren't hard to predict.

Without keeping this too long, those are three things I think we'll see in the next 3-6 years. What are your predictions?
submitted by /u/rmz76
This model dramatically lowers the energy cost of LLMs and approaches the same performance at large model sizes.
submitted by /u/Metabolical