AI Innovations in July 2024

Welcome to our blog series “AI Innovations in July 2024”! As we continue to ride the wave of extraordinary developments from June, the momentum in artificial intelligence shows no signs of slowing down. Last month, we witnessed groundbreaking achievements such as the unveiling of the first quantum AI chip, the successful deployment of autonomous medical drones in remote areas, and significant advancements in natural language understanding that have set new benchmarks for AI-human interaction.

July promises to be just as exhilarating, with researchers, engineers, and visionaries pushing the boundaries of what’s possible even further. In this evolving article, updated daily throughout the month, we’ll dive deep into the latest AI breakthroughs, advancements, and milestones shaping the future.

From revolutionary AI-powered technologies and cutting-edge research to the societal and ethical implications of these innovations, we provide you with a comprehensive and insightful look at the rapidly evolving world of artificial intelligence. Whether you’re an AI enthusiast, a tech-savvy professional, or simply someone curious about the future, this blog will keep you informed, inspired, and engaged.

Join us on this journey of discovery as we explore the frontiers of AI, uncovering the innovations that are transforming industries, enhancing our lives, and shaping our future. Stay tuned for daily updates, and get ready to be amazed by the incredible advancements happening in the world of AI!

A Daily chronicle of AI Innovations July 03rd 2024:

Apple joins OpenAI board

Google’s emissions spiked by almost 50% due to AI boom

Meta’s new AI can create 3D objects from text in under a minute

Meta’s 3D Gen creates 3D assets at lightning speed
Perplexity AI upgrades Pro Search with more advanced problem-solving
The first Gen AI framework that keeps your prompts always encrypted

ElevenLabs launches ‘Iconic Voices’

Leaks reveal Google Pixel AI upgrades

Meta’s new text-to-3D AI

Meta’s 3D Gen creates 3D assets at lightning speed

Meta has introduced Meta 3D Gen, a new state-of-the-art, fast pipeline for text-to-3D asset generation. It offers 3D asset creation with high prompt fidelity and high-quality 3D shapes and textures in less than a minute.

According to Meta, the process is three to 10 times faster than existing solutions. The research paper adds that professional 3D artists preferred 3D Gen's output a majority of the time over industry alternatives, particularly for complex prompts, and reports speedups ranging from 3× to 60×.

A significant feature of 3D Gen is its support for physically based rendering (PBR), which is necessary for relighting 3D assets in real-world applications.

Why does it matter?

3D Gen’s implications extend far beyond Meta’s sphere. In gaming, it could speed up the creation of expansive virtual worlds, allowing rapid prototyping. In architecture and industrial design, it could facilitate quick concept visualization, expediting the design process.

Source: https://ai.meta.com/research/publications/meta-3d-gen/

Perplexity AI upgrades Pro Search with more advanced problem-solving

Perplexity AI has improved Pro Search to tackle more complex queries, perform advanced math and programming computations, and deliver even more thoroughly researched answers. Everyone can use Pro Search five times every four hours for free, and Pro subscribers have unlimited access.

Perplexity suggests the upgraded Pro Search “can pinpoint case laws for attorneys, summarize trend analysis for marketers, and debug code for developers—and that’s just the start”. It can empower all professions to make more informed decisions.

Why does it matter?

This showcases AI’s potential to assist professionals in specialized fields. Such advancements also push the boundaries of AI’s practical applications in research and decision-making processes.

Source: https://www.perplexity.ai/hub/blog/pro-search-upgraded-for-more-advanced-problem-solving

The first Gen AI framework that keeps your prompts always encrypted

Edgeless Systems introduced Continuum AI, the first generative AI framework that keeps prompts encrypted at all times with confidential computing by combining confidential VMs with NVIDIA H100 GPUs and secure sandboxing.

The Continuum technology has two main security goals: protecting user data, and protecting AI model weights against the infrastructure, the service provider, and others. Edgeless Systems is also collaborating with NVIDIA to help businesses across sectors integrate AI into their operations with confidence.

Why does it matter?

This greatly advances security for LLMs. The technology could be pivotal for a future where organizations can securely utilize AI, even for the most sensitive data.

Source: https://developer.nvidia.com/blog/advancing-security-for-large-language-models-with-nvidia-gpus-and-edgeless-systems

RunwayML’s Gen-3 Alpha model is now generally available

Announced a few weeks ago, Gen-3 is Runway’s latest frontier model and a big upgrade from Gen-1 and Gen-2. It allows users to produce hyper-realistic videos from text, image, or video prompts. Users must upgrade to a paid plan to use the model.

Source: https://venturebeat.com/ai/runways-gen-3-alpha-ai-video-model-now-available-but-theres-a-catch

Meta might be bringing generative AI to metaverse games

In a job listing, Meta mentioned it is seeking to research and prototype “new consumer experiences” with new types of gameplay driven by Gen AI. It is also planning to build Gen AI-powered tools that could “improve workflow and time-to-market” for games.

Source: https://techcrunch.com/2024/07/02/meta-plans-to-bring-generative-ai-to-metaverse-games

Apple gets a non-voting seat on OpenAI’s board

As a part of its AI agreement with OpenAI, Apple will get an observer role on OpenAI’s board. Apple chose Phil Schiller, the head of Apple’s App Store and its former marketing chief, for the position.

Source: https://www.theverge.com/2024/7/2/24191105/apple-phil-schiller-join-openai-board

Figma disabled AI tool after being criticised for ripping off Apple’s design

Figma’s Make Design feature generates UI layouts and components from text prompts. When used as a design aid, it repeatedly reproduced Apple’s Weather app, drawing accusations that Figma’s AI was heavily trained on existing apps.

Source: https://techcrunch.com/2024/07/02/figma-disables-its-ai-design-feature-that-appeared-to-be-ripping-off-apples-weather-app

China is far ahead of other countries in generative AI inventions

According to the World Intellectual Property Organization (WIPO), more than 50,000 patent applications for Gen AI were filed in the past decade. China filed more than 38,000 GenAI inventions between 2014 and 2023, compared with only 6,276 from the U.S.

Source: https://www.reuters.com/technology/artificial-intelligence/china-leading-generative-ai-patents-race-un-report-says-2024-07-03

Apple joins OpenAI board

Phil Schiller, Apple’s former marketing head and App Store chief, will reportedly join OpenAI’s board as a non-voting observer, according to Bloomberg.
This role will allow Schiller to understand OpenAI better, as Apple aims to integrate ChatGPT into iOS and macOS later this year to enhance Siri’s capabilities.
Microsoft also took a non-voting observer position on OpenAI’s board last year, so Apple and Microsoft will now both be involved in this capacity, a rare and significant arrangement for two such large companies.

Source: https://www.theverge.com/2024/7/2/24191105/apple-phil-schiller-join-openai-board

Google’s emissions spiked by almost 50% due to AI boom

Google reported a 48% increase in greenhouse gas emissions over the past five years due to the high energy demands of its AI data centers.
Despite achieving seven years of renewable energy matching, Google faces significant challenges in meeting its goal of net zero emissions by 2030, highlighting the uncertainties surrounding AI’s environmental impact.
To address water consumption concerns, Google has committed to replenishing 120% of the water it uses by 2030, although in 2023, it only managed to replenish 18%.

Source: https://www.techradar.com/pro/google-says-its-emissions-have-grown-nearly-50-due-to-ai-data-center-boom-and-heres-what-it-plans-to-do-about-it

Meta’s new AI can create 3D objects from text in under a minute

Meta has introduced 3D Gen, an AI system that creates high-quality 3D assets from text descriptions in under a minute, significantly advancing 3D content generation.
The system uses a two-stage process, starting with AssetGen to generate a 3D mesh with PBR materials and followed by TextureGen to refine the textures, producing detailed and professional-grade 3D models.
3D Gen has shown superior performance and visual quality compared to other industry solutions, with potential applications in game development, architectural visualization, and virtual/augmented reality.

Source: https://www.maginative.com/article/meta-unveils-3d-gen-ai-that-creates-detailed-3d-assets-in-under-a-minute/

A Daily chronicle of AI Innovations July 02nd 2024:

JARVIS-inspired Grok 2 aims to answer any user query
Apple unveils a public demo of its ‘4M’ AI model
Amazon hires Adept’s top executives to build an AGI team

YouTube lets you remove AI-generated content resembling face or voice

Runway opens Gen-3 Alpha access

Motorola hits the AI runway

Meta swaps ‘Made with AI’ with ‘AI info’ to indicate AI photos

Deepfakes to cost $40 billion by 2027: Deloitte survey

Anthropic launches a program to fund the creation of reliable AI benchmarks

US’s targeting of AI not helpful for healthy development: China

New robot controlled by human brain cells

Figma to temporarily disable AI feature amid plagiarism concerns

Runway opens Gen-3 Alpha access

Runway just announced that its AI video generator, Gen-3 Alpha, is now available to all users following weeks of impressive, viral outputs after the model’s release in mid-June.

Runway unveiled Gen-3 Alpha last month, the first model in its next-gen series trained for learning ‘general world models’.

Gen-3 Alpha upgrades key features, including character and scene consistency, camera motion and techniques, and transitions between scenes.

Gen-3 Alpha is available behind Runway’s ‘Standard’ $12/mo access plan, which gives users 63 seconds of generations a month.

Despite impressive recent releases from KLING and Luma Labs, Runway’s Gen-3 Alpha model feels like the biggest leap AI video has taken since Sora. However, the tiny generation limits for non-unlimited plans might be a hurdle for power users.

Source: https://x.com/runwayml/status/1807822396415467686

Motorola hits the AI runway

Motorola just launched its ‘Styled By Moto’ ad campaign, an entirely AI-generated fashion spot promoting its new line of Razr folding smartphones — created using nine different AI tools, including Sora and Midjourney.

The 30-second video features AI-generated models wearing outfits inspired by Motorola’s iconic ‘batwing’ logo in settings like runways and photo shoots.
Each look was created from thousands of AI-generated images, incorporating the brand’s logo and colors of the new Razr phone line.
Tools used include OpenAI’s Sora, Adobe Firefly, Midjourney, Krea, Magnific, Luma, and more — reportedly taking over four months of research.
The 30-second spot is also set to an AI-generated soundtrack incorporating the ‘Hello Moto’ jingle, created using Udio.

This is a fascinating look at the AI-powered stack used by a major brand, and a glimpse at how tools can (and will) be combined to open new creative avenues. It’s also another example of the shift in discourse surrounding AI’s use in marketing — potentially paving the way for wider acceptance and integration.

JARVIS-inspired Grok 2 aims to answer any user query

Elon Musk has announced the release dates for two new AI assistants from xAI. The first, Grok 2, will be launched in August. Musk says Grok 2 is inspired by JARVIS from Iron Man and The Hitchhiker’s Guide to the Galaxy and aims to answer virtually any user query. This ambitious goal is fueled by xAI’s focus on “purging” LLM datasets used for training.

Musk also revealed that an even more powerful version, Grok 3, is planned for release by the end of the year. Grok 3 will leverage the processing power of 100,000 Nvidia H100 GPUs, potentially pushing the boundaries of AI performance even further.

Why does it matter?

These AI assistants from xAI are intended to compete with and outperform chatbots like OpenAI’s ChatGPT by focusing on data quality, user experience, and raw processing power. If they deliver, they could advance the state of AI and change how people interact with AI assistants.

Source: https://www.coinspeaker.com/xai-grok-2-elon-musk-jarvis-ai-assistant/

Apple unveils a public demo of its ‘4M’ AI model

Apple and the Swiss Federal Institute of Technology Lausanne (EPFL) have released a public demo of the ‘4M’ AI model on Hugging Face. The 4M (Massively Multimodal Masked Modeling) model can process and generate content across multiple modalities, such as creating images from text, detecting objects, and manipulating 3D scenes using natural language inputs.

While companies like Microsoft and Google have been making headlines with their AI partnerships and offerings, Apple has been steadily advancing its AI capabilities. The public demo of the 4M model suggests that Apple is now positioning itself as a significant player in the AI industry.

Why does it matter?

By making the 4M model publicly accessible, Apple is seeking to engage developers to build an ecosystem. It could lead to more coherent and versatile experiences, such as enhanced Siri capabilities and advancements in Apple’s augmented reality efforts.

Source: https://venturebeat.com/ai/apple-just-launched-a-public-demo-of-its-4m-ai-model-heres-why-its-a-big-deal

Amazon hires Adept’s top executives to build an AGI team

Amazon is hiring the co-founders of the AI startup Adept, including its CEO, along with several other key employees. CEO David Luan will join Amazon’s AGI autonomy group, which is led by Rohit Prasad, who is spearheading a unified push to accelerate Amazon’s AI progress across divisions like Alexa and AWS.

Amazon is consolidating its AI projects to develop a more advanced LLM to compete with OpenAI and Google’s top offerings. This unified approach leverages the company’s collective resources to accelerate progress in AI capabilities.

Why does it matter?

These hires indicate Amazon’s intent to strengthen its position in the competitive AI landscape. By bringing the Adept team on board, Amazon gains the team’s expertise and specialized knowledge to advance its AGI aspirations.

Source: https://www.bloomberg.com/news/articles/2024-06-28/amazon-hires-top-executives-from-ai-startup-adept-for-agi-team

YouTube lets you remove AI-generated content resembling face or voice

YouTube lets people request the removal of AI-generated content that simulates their face or voice. Under YouTube’s privacy request process, requests will be reviewed based on whether the content is synthetic, whether it identifies the person, and whether it shows the person in sensitive behavior.

Source: https://techcrunch.com/2024/07/01/youtube-now-lets-you-request-removal-of-ai-generated-content-that-simulates-your-face-or-voice

Meta swaps ‘Made with AI’ with ‘AI info’ to indicate AI photos

Meta is refining its AI photo labeling on Instagram and Facebook. The “Made with AI” label will be replaced with “AI info” to more accurately reflect the extent of AI use in images, from minor edits to full AI generation. The change addresses photographers’ concerns about the mislabeling of their photos.

Source: https://techcrunch.com/2024/07/01/meta-changes-its-label-from-made-with-ai-to-ai-info-to-indicate-use-of-ai-in-photos

Deepfakes to cost $40 billion by 2027: Deloitte survey

Deepfake-related losses will increase from $12.3 billion in 2023 to $40 billion by 2027, growing at 32% annually. There was a 3,000% increase in incidents last year alone. Enterprises are not well-prepared to defend against deepfake attacks, with one in three having no strategy.
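
As a rough sanity check on these figures, the short Python sketch below computes the compound annual growth rate implied by going from $12.3 billion in 2023 to $40 billion in 2027, and what 32% annual growth would give over the same four years. The function names are illustrative and do not come from the Deloitte report; only the dollar amounts, the years, and the 32% rate are taken from the text above.

```python
def implied_annual_growth(start: float, end: float, years: int) -> float:
    """Compound annual growth rate that takes `start` to `end` in `years` steps."""
    return (end / start) ** (1 / years) - 1


def project(start: float, rate: float, years: int) -> float:
    """Value of `start` after compounding at `rate` for `years` steps."""
    return start * (1 + rate) ** years


# Figures quoted above, in billions of US dollars.
losses_2023 = 12.3
losses_2027 = 40.0
years = 2027 - 2023

# Growth rate implied by the two endpoints.
rate = implied_annual_growth(losses_2023, losses_2027, years)

# Where the quoted 32% annual rate would land after the same period.
projected = project(losses_2023, 0.32, years)

print(f"Implied annual growth: {rate:.1%}")
print(f"2027 losses at 32% per year: ${projected:.1f}B")
```

The implied rate comes out at roughly 34%, and compounding 32% for four years lands near $37 billion, so the headline numbers agree only approximately. That gap may reflect rounding or a different base period in the underlying report.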

Source: https://venturebeat.com/security/deepfakes-will-cost-40-billion-by-2027-as-adversarial-ai-gains-momentum

Anthropic launches a program to fund the creation of reliable AI benchmarks

Anthropic is launching a program to fund new AI benchmarks. The aim is to create more comprehensive evaluations of AI models, including assessing capabilities in cyberattacks and weapons as well as beneficial applications like scientific research and bias mitigation.

Source: https://techcrunch.com/2024/07/01/anthropic-looks-to-fund-a-new-more-comprehensive-generation-of-ai-benchmarks

US’s targeting of AI not helpful for healthy development: China

China has criticized the US approach to regulating and restricting investments in AI. Chinese officials stated that US actions targeting AI are not helpful for AI’s healthy and sustainable development. They argued that the US measures will be divisive when it comes to global governance of AI.

Source: https://www.reuters.com/technology/artificial-intelligence/china-says-us-targeting-ai-not-helpful-healthy-development-2024-07-01

New robot controlled by human brain cells

Scientists in China have developed a robot with an artificial brain grown from human stem cells, which can perform basic tasks such as moving limbs, avoiding obstacles, and grasping objects, showcasing some intelligence functions of a biological brain.
The brain-on-chip utilizes a brain-computer interface to facilitate communication with the external environment through encoding, decoding, and stimulation-feedback mechanisms.
This pioneering brain-on-chip technology, which needs conditions similar to those of a human brain to stay alive, is expected to advance the field of hybrid intelligence by merging biological and artificial systems.

Source: https://www.independent.co.uk/tech/robot-human-brain-china-b2571978.html

Figma to temporarily disable AI feature amid plagiarism concerns

Figma has temporarily disabled its “Make Design” AI feature after accusations that it was replicating Apple’s Weather app designs.
Andy Allen, founder of NotBoring Software, discovered that the feature consistently reproduced the layout of Apple’s Weather app, leading to community concerns.
CEO Dylan Field acknowledged the issue and stated the feature would be disabled until they can ensure its reliability and originality through comprehensive quality assurance checks.

Source: https://techcrunch.com/2024/07/02/figma-disables-its-ai-design-feature-that-appeared-to-be-ripping-off-apples-weather-app/

Nvidia faces first antitrust charges

French antitrust enforcers plan to charge Nvidia with alleged anticompetitive practices, becoming the first to take such action, according to Reuters.
Nvidia’s offices in France were raided last year as part of an investigation into possible abuses of dominance in the graphics cards sector.
Regulatory bodies in the US, EU, China, and the UK are also examining Nvidia’s business practices due to its significant presence in the AI chip market.

Source: https://finance.yahoo.com/news/french-antitrust-regulators-set-charge-151406034.html?

A Daily chronicle of AI Innovations July 01st 2024:

Some Apple Intelligence features may be put behind a paywall

Meta’s new dataset could enable robots to learn manual skills from human experts

Google announces advancements in Vertex AI models
LMSYS’s new Multimodal Arena compares top AI models’ visual processing abilities
Apple’s Vision Pro gets an AI upgrade

Humanoid robots head to the warehouse

Google Translate adds 110 languages

Google announces advancements in Vertex AI models

Google has rolled out significant improvements to its Vertex AI platform, including the general availability of Gemini 1.5 Flash with a massive 1 million-token context window. Also, Gemini 1.5 Pro now offers an industry-leading 2 million-token context capability. Google is introducing context caching for these Gemini models, slashing input costs by 75%.

Moreover, Google launched Imagen 3 in preview and added third-party models like Anthropic’s Claude 3.5 Sonnet on Vertex AI.

They’ve also made Grounding with Google Search generally available and announced a new service for grounding AI agents with specialized third-party data. Plus, they’ve expanded data residency guarantees to 23 countries, addressing growing data sovereignty concerns.

Why does it matter?

Google is positioning Vertex AI as the most “enterprise-ready” generative AI platform. With expanded context windows and improved grounding capabilities, this move also addresses concerns about the accuracy of Google’s AI-based search features.

Source: https://cloud.google.com/blog/products/ai-machine-learning/vertex-ai-offers-enterprise-ready-generative-ai

LMSYS’s new Multimodal Arena compares top AI models’ visual processing abilities

LMSYS Org added image recognition to Chatbot Arena to compare vision language models (VLMs), collecting over 17,000 user preferences in just two weeks. OpenAI’s GPT-4o and Anthropic’s Claude 3.5 Sonnet outperformed other models in image recognition. Also, the open-source LLaVA-v1.6-34B performed comparably to some proprietary models.

These AI models tackle diverse tasks, from deciphering memes to solving math problems with visual aids. However, the examples provided show that even top models can stumble when interpreting complex visual information or handling nuanced queries.

Why does it matter?

This leaderboard isn’t just a tech popularity contest—it shows how advanced AI models can decode images. However, the varying performance also serves as a reality check, reminding us that while AI can recognize a cat in a photo, it might struggle to interpret your latest sales graph.

Source: https://lmsys.org/blog/2024-06-27-multimodal

Apple’s Vision Pro gets an AI upgrade

Apple is reportedly working to bring its Apple Intelligence features to the Vision Pro headset, though not this year. Meanwhile, Apple is tweaking its in-store Vision Pro demos, allowing potential buyers to view personal media and try a more comfortable headband. Apple’s main challenge is adapting its AI features to a mixed-reality environment.

The retail changes are meant to boost sales of the pricey headset. Apple is also exploring the possibility of monetizing AI features through subscription services like “Apple Intelligence+.”

Why does it matter?

Apple’s Vision Pro, with its 16GB RAM and M2 chip, can handle advanced AI tasks. However, cloud infrastructure limitations are causing a delay in launch. It’s a classic case of “good things come to those who wait.”

Source: https://www.bloomberg.com/news/newsletters/2024-06-30/apple-s-longer-lasting-devices-ios-19-and-apple-intelligence-on-the-vision-pro-ly1jnrw4

Humanoid robots head to the warehouse

Agility Robotics just signed a multi-year deal with GXO Logistics to bring the company’s Digit humanoid robots to warehouses, following a successful pilot in Spanx facilities in 2023.

The agreement is being hailed as the first Robots-as-a-Service (RaaS) deal and ‘formal commercial deployment’ of the humanoid robots.
Agility’s Digit robots will be integrated into GXO’s logistics operations at a Spanx facility in Connecticut, handling repetitive tasks and logistics work.
The 5’9″ tall Digit can lift up to 35 pounds and integrates with the cloud-based Agility Arc platform to control full fleets and optimize facility workflows.
Digit completed a proof-of-concept trial with Spanx in 2023, and Amazon has also tested the robots at its own warehouses.

Is RaaS the new SaaS? Soon, every company will be looking to adopt advanced robotics into their workforce — and subscription services could help lower the financial and technical barriers needed to scale without the massive upfront costs.

Source: https://agilityrobotics.com/content/gxo-signs-industry-first-multi-year-agreement-with-agility-robotics

Google Translate adds 110 languages

Google just announced its largest-ever expansion of Google Translate, adding support for 110 new languages enabled by the company’s PaLM 2 large language model.

The new languages represent over 614M speakers, covering about 8% of the global population.
Google’s PaLM 2 model was the driving force behind the expansion, helping unlock translations for closely related languages.
The expansion also includes some languages with no current native speakers, showing how AI models can help preserve ‘lost’ dialects.
The additions are part of Google’s ‘1,000 Languages Initiative,’ which aims to build AI that supports all of the world’s spoken languages.

We’ve talked frequently about AI’s coming power to break down language barriers with its translation capabilities — but the technology is also playing a very active role in both uncovering and preserving languages from lost and endangered cultures.

Source: https://blog.google/products/translate/google-translate-new-languages-2024

Amazon’s Q AI assistant for enterprises gets an update for call centers

The update provides real-time, step-by-step guides for customer issues. It aims to reduce the “toggle tax” – time wasted switching between applications. The system listens to calls in real-time and automatically provides relevant information.

Source: https://venturebeat.com/ai/amazon-upgrades-ai-assistant-q-to-make-call-centers-way-more-efficient

WhatsApp is developing a feature to choose Meta AI Llama models

Users will be able to choose between two options: faster responses with Llama 3-70B (default) or more complex queries with Llama 3-405B (advanced). Llama 3-405B will be limited to a certain number of prompts per week. This feature aims to give users more control over their AI interactions.

Source: https://wabetainfo.com/whatsapp-beta-for-android-2-24-14-7-whats-new/

Bill Gates says AI’s energy consumption isn’t a major concern

He claims that while data centers may consume up to 6% of global electricity, AI will ultimately drive greater energy efficiency. Gates believes tech companies will invest in green energy to power their AI operations, potentially offsetting the increased demand.

Source: https://www.theregister.com/2024/06/28/bill_gates_ai_power_consumption

Amazon is investigating Perplexity AI for possible scraping abuse

Perplexity appears to be scraping websites that have forbidden access through robots.txt. AWS prohibits customers from violating the robots.txt standard. Perplexity uses an unpublished IP address to access websites that block its official crawler. The company claims a third party performs web crawling for them.
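
For readers unfamiliar with how robots.txt works, here is a minimal sketch of the check a well-behaved crawler performs before fetching a page, using Python's standard `urllib.robotparser` module. The robots.txt content, the `ExampleBot` and `OtherBot` user agents, and the example.com URLs are all hypothetical and chosen only for illustration; this is not a description of how Perplexity's or any other real crawler is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download it
# from https://<site>/robots.txt before requesting any other page.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a parser that can answer can_fetch() queries."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# ExampleBot is disallowed everywhere on this hypothetical site.
blocked_bot = parser.can_fetch("ExampleBot", "https://example.com/articles/1")

# Other crawlers fall under the wildcard rules: public pages are allowed,
# anything under /private/ is not.
other_public = parser.can_fetch("OtherBot", "https://example.com/articles/1")
other_private = parser.can_fetch("OtherBot", "https://example.com/private/data")

print(blocked_bot, other_public, other_private)
```

Nothing in the protocol enforces these rules: the rules are matched against the user agent name a crawler reports, and compliance is voluntary, which is why a crawler arriving under a different identity or from an unpublished IP address can slip past a site's stated preferences.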

Source: https://www.wired.com/story/aws-perplexity-bot-scraping-investigation

Microsoft AI chief claims content on the open web is “freeware”

Mustafa Suleyman claimed that anything published online becomes “freeware” and fair game for AI training. This stance, however, contradicts basic copyright principles and ignores the legal complexities of fair use. He suggests that robots.txt might protect content from scraping.

Source: https://www.theverge.com/2024/6/28/24188391/microsoft-ai-suleyman-social-contract-freeware

Some Apple Intelligence features may be put behind a paywall

Apple Intelligence, initially free, is expected to introduce a premium “Apple Intelligence+” subscription tier with additional features, similar to iCloud, according to Bloomberg’s Mark Gurman.
Apple plans to monetize Apple Intelligence not only through direct subscriptions but also by taking a share of revenue from partner AI services like OpenAI and potentially Google Gemini.
Apple Intelligence will be integrated into multiple devices, excluding the HomePod due to hardware limitations, and may include a new robotic device, making it comparable to iCloud in its broad application and frequent updates.

Source: https://www.techradar.com/computing/is-apple-intelligence-the-new-icloud-ai-platform-tipped-to-get-new-subscription-tier

Meta’s new dataset could enable robots to learn manual skills from human experts

Meta has introduced a new benchmark dataset named HOT3D to advance AI research in 3D hand-object interactions, containing over one million frames from various perspectives.
This dataset aims to enhance the understanding of human hand manipulation of objects, addressing a significant challenge in computer vision research according to Meta.
HOT3D includes over 800 minutes of egocentric video recordings, multiple perspectives, detailed 3D pose annotations, and 3D object models, which could help robots and XR devices learn manual skills from human experts.

Source: https://the-decoder.com/metas-new-hot3d-dataset-could-enable-robots-to-learn-manual-skills-from-human-experts/

[AI vs. Real Cost]: How Much Would This Cost To Shoot and Composite For Real? (Details in the first comment)
by /u/Kulimar (Artificial Intelligence) on July 3, 2024 at 10:03 pm
submitted by /u/Kulimar [link] [comments]
Has anyone worked on cross-attention encoding for alternative color spaces?
by /u/Tyler_Zoro (Artificial Intelligence) on July 3, 2024 at 3:43 pm
RGB color is what is used by VAE encoding and decoding as far as I know, but has anyone tried creating a VAE that takes CMYK or even arbitrary spectrum telemetry as input and can decode into other color spaces? CMYK is perhaps not a great example, as I believe that CMYK is a proper subset of RGB (though the reverse is not true). But there are many color spaces used by a large variety of processes out there and I would think that being able to train models on their use would be helpful. Then again, CMYK decoding could potentially be more useful than trying to covert from latent space to RGB to CMYK (e.g. for print). I did a few quick searches on Google Scholar and basic web search, and didn't see anything, but this is not my area of expertise. submitted by /u/Tyler_Zoro [link] [comments]
The Physics of Associative Memory
by /u/DataPhreak (Artificial Intelligence) on July 3, 2024 at 1:08 pm
submitted by /u/DataPhreak [link] [comments]
You can turn a sketch into a 3D environment
by /u/Dung3onlord (Artificial Intelligence) on July 3, 2024 at 9:28 am
submitted by /u/Dung3onlord [link] [comments]
Geoffrey Hinton says AI has passed the Turing Test. This happened when it could explain why a joke is funny.
by /u/Maxie445 (Artificial Intelligence) on July 3, 2024 at 8:26 am
submitted by /u/Maxie445 [link] [comments]
How to create a bot based on my chat data?
by /u/particle4dev (Artificial Intelligence) on July 3, 2024 at 4:22 am
Hey, everyone, I want to create a bot that can respond in my style using my chat history. What should I do, and what approaches can I take to achieve this? Thanks in advance! submitted by /u/particle4dev [link] [comments]
Predictions for the near future.
by /u/rmz76 (Artificial Intelligence) on July 2, 2024 at 10:30 pm
I have been working with TensorFlow since the year it was made available, following the progression that got us to LLMs, going back to the "Attention Is All You Need" paper. What a ride. Here are a handful of fun predictions, some based on knowledge shared with me internally about active proofs of concept, some speculative.
Entertainment and the future of popular music
Consumption: The music industry missed the boat on streaming and has to share huge profits with Spotify, Apple Music, etc. for stream distribution. But the industry will unite behind a new streaming service that lets content creators use AI-generated vocals, instrumental parts, etc. to create custom mashups and even new tracks using the likeness of artists. Record labels have been buying up artist catalogs; recently KISS and Queen were sold, and these contracts included the right to use their "likeness", enabling future derived AI works to be owned by the labels. The vision here is a new platform combining the addictive nature of TikTok with the on-demand streaming of Spotify, where "creators" use the AI and IP resources of the record labels to make AI-generated works popular, at first focused on using assets from established artists from the past and present, but eventually fostering creators who contribute the vocals, instrument parts, etc. to be fed into the model for entirely new possibilities.
Live performance: Studio plug-ins like Auto-Tune will be taken to the next level for both studio use and live applications. Want to sound like Frank Sinatra? There will be a plug-in for that.
Workplace
Future employment contracts will come with a clause giving employers the right to create AI agents of your likeness; employees will be required to sign over the rights and allow their employer to use these agents indefinitely. The employer will also retain the right to train this AI bot, which can be used as a substitute for you, on all work-related communications. "Training days" will become common, where corporations have employees take part in knowledge dumps, providing answers to questions to aid in the training of their bot likeness. On nights and weekends your bot can be used in place of you for some scenarios, and even once employment has been terminated. This is a coming battle between workers and employers, a fight over the fair use of workers' likeness; expect substantial abuse and legal drama before it resolves.
Everyday consumer use
AI agents working for you to save you time is the holy grail big tech is after. The problem is that, to achieve that, big tech needs your personal behavioral data to fine-tune these models into behaving as you would. Considering that about 70% of the online behavior a model would mimic occurs on smartphones, the key players in position to build these sorts of agents are Google and Apple. Microsoft's huge investment in OpenAI was wise, but only a rather small cohort uses Windows PCs for all the things an agent would need to be built on, and even in those cases Microsoft has to get consent to collect that data. Apple doesn't have its own LLM, so I would say Google is probably best positioned to innovate in this area and create something like the Rabbit R1, but real and just embedded into Android, Chrome, etc. I think this will manifest in the device being able to recognize common patterns and trigger actions. Sort of like today, when we get in our car and our phone tells us how far away we are from its best guess at where we're going, it will be able to respond to much more sensory input, text messages, social posts, e-mails, phone calls, etc. and just act to save us time. Saving time is the point of most technology, as "time is money", so the use cases around that aren't hard to predict.
Without keeping this too long, those are three things I think we'll see in the next 3-6 years. What are your predictions?
Google Search Ranks AI Spam Above Original Reporting in News Results
by /u/wiredmagazine (Artificial Intelligence) on July 2, 2024 at 6:49 pm
Researchers upend AI status quo by eliminating matrix multiplication in LLMs
by /u/Metabolical (Artificial Intelligence) on July 2, 2024 at 2:15 pm
This model dramatically lowers the energy cost of LLMs and approaches the same performance at large model sizes.
State-of-the-art LLMs are 4 to 6 orders of magnitude less efficient than the human brain. A dramatically better architecture is needed to get to AGI.
by /u/adeno_gothilla (Artificial Intelligence) on July 2, 2024 at 12:16 pm

Etienne Noumen

Sports Lover, Linux guru, Engineer, Entrepreneur & Family Man.

Published by

Etienne Noumen

Tags: AI advancements, AI Breakthroughs, AI Innovations, AI news, AI Research, AI Technologies, Artificial Intelligence, Cutting-edge AI, future of AI, July 2024

A Daily chronicle of AI Innovations July 03rd 2024:

Apple joins OpenAI board

Google’s emissions spiked by almost 50% due to AI boom

Meta’s new AI can create 3D objects from text in under a minute

ElevenLabs launches ‘Iconic Voices’

Leaks reveal Google Pixel AI upgrades

Meta’s new text-to-3D AI

Meta’s 3D Gen creates 3D assets at lightning speed

Perplexity AI upgrades Pro Search with more advanced problem-solving

The first Gen AI framework that keeps your prompts always encrypted

RunwayML’s Gen-3 Alpha model is now generally available

Meta might be bringing generative AI to metaverse games

Apple gets a non-voting seat on OpenAI’s board

Figma disabled AI tool after being criticised for ripping off Apple’s design

China is far ahead of other countries in generative AI inventions

A Daily chronicle of AI Innovations July 02nd 2024:

YouTube lets you remove AI-generated content resembling face or voice

Runway opens Gen-3 Alpha access

Motorola hits the AI runway

Deepfakes to cost $40 billion by 2027: Deloitte survey

Anthropic launches a program to fund the creation of reliable AI benchmarks

US’s targeting of AI not helpful for healthy development: China

New robot controlled by human brain cells

Figma to temporarily disable AI feature amid plagiarism concerns

JARVIS-inspired Grok 2 aims to answer any user query

Apple unveils a public demo of its ‘4M’ AI model

Amazon hires Adept’s top executives to build an AGI team

Nvidia faces first antitrust charges

A Daily chronicle of AI Innovations July 01st 2024:

Some Apple Intelligence features may be put behind a paywall

Meta’s new dataset could enable robots to learn manual skills from human experts

Humanoid robots head to the warehouse

Google Translate adds 110 languages

Google announces advancements in Vertex AI models

LMSYS’s new Multimodal Arena compares top AI models’ visual processing abilities

Apple’s Vision Pro gets an AI upgrade

WhatsApp is developing a feature to choose Meta AI Llama models

Bill Gates says AI’s energy consumption isn’t a major concern

Amazon is investigating Perplexity AI for possible scraping abuse

Microsoft AI chief claims content on the open web is “freeware”
