Decoding GPTs & LLMs: Training, Memory & Advanced Architectures Explained

Unlock the secrets of GPTs and Large Language Models (LLMs) in our comprehensive guide!


🤖🚀 Dive deep into the world of AI as we explore ‘GPTs and LLMs: Pre-Training, Fine-Tuning, Memory, and More!’ Understand the intricacies of how these AI models learn through pre-training and fine-tuning, their operational scope within a context window, and the intriguing aspect of their lack of long-term memory.


🧠 In this article, we demystify:

  • Pre-Training & Fine-Tuning Methods: Learn how GPTs and LLMs are trained on vast datasets to grasp language patterns and how fine-tuning tailors them for specific tasks.
  • Context Window in AI: Explore the concept of the context window, which acts as a short-term memory for LLMs, influencing how they process and respond to information.
  • Lack of Long-Term Memory: Understand the limitations of GPTs and LLMs in retaining information over extended periods and how this impacts their functionality.
  • Database-Querying Architectures: Discover how some advanced AI models interact with external databases to enhance information retrieval and processing.
  • PDF Apps & Near-Real-Time Fine-Tuning: See how "chat with your PDF" applications bring document content into a model's responses.

Drop your questions and thoughts in the comments below and let’s discuss the future of AI! #GPTsExplained #LLMs #AITraining #MachineLearning #AIContextWindow #AILongTermMemory #AIDatabases #PDFAppsAI

Subscribe for weekly updates and deep dives into artificial intelligence innovations.



AI Unraveled: Demystifying Frequently Asked Questions on Artificial Intelligence (OpenAI, ChatGPT, Google Gemini, Generative AI, Discriminative AI, xAI, LLMs, GPUs, Machine Learning, NLP, Prompt Engineering)

📌 Check out our playlist for more AI insights

📖 Read along with the podcast below:

Welcome to AI Unraveled, the podcast that demystifies frequently asked questions on artificial intelligence and keeps you up to date with the latest AI trends. Join us as we delve into groundbreaking research, innovative applications, and emerging technologies that are pushing the boundaries of AI. From the latest trends in ChatGPT and the recent merger of Google Brain and DeepMind, to the exciting developments in generative AI, we’ve got you covered with a comprehensive update on the ever-evolving AI landscape. In today’s episode, we’ll cover GPTs and LLMs, their pre-training and fine-tuning methods, their context window and lack of long-term memory, architectures that query databases, PDF apps’ use of near-realtime fine-tuning, and the book “AI Unraveled,” which answers FAQs about AI.

GPTs, or Generative Pre-trained Transformers, work by being trained on a large amount of text data and then using that training to generate output based on input. So, when you give a GPT a specific input, it will produce the output that best matches the patterns it learned during training.


The way GPTs do this is by processing the input token by token, without actually understanding the entire output. The model simply learns that certain tokens are often followed by certain other tokens. This knowledge is gained during the training process, where the large language model (LLM) processes vast amounts of text and learns numerical representations called embeddings, which can be thought of as its “knowledge.”
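To make the token-by-token idea concrete, here is a toy sketch in Python. The probability table is invented for illustration; it stands in for the statistical patterns a real model learns from its training data.

```python
# Toy illustration (not a real GPT): generate text token by token by always
# picking the most likely next token from a hand-built probability table.
next_token_probs = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "sat": {"down": 0.9, "up": 0.1},
    "down": {"<end>": 1.0},
}

def generate(prompt_token, max_tokens=10):
    """Greedy decoding: repeatedly append the single most probable next token."""
    tokens = [prompt_token]
    for _ in range(max_tokens):
        probs = next_token_probs.get(tokens[-1])
        if probs is None:
            break
        best = max(probs, key=probs.get)
        if best == "<end>":
            break
        tokens.append(best)
    return " ".join(tokens)
```

Starting from "the", this loop walks the table one token at a time and produces "the cat sat down" — each word chosen only because it most often followed the previous one, with no grasp of the sentence as a whole.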

After the training stage, an LLM can be fine-tuned to improve its accuracy for a particular domain. This is done by providing it with domain-specific labeled data and adjusting its parameters until it reaches the desired accuracy on that data.
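Fine-tuning is, at its core, a loop that nudges parameters toward labeled examples. As a minimal sketch, here is a single-weight classifier tuned by gradient descent on invented "domain" data — real fine-tuning updates billions of weights, but the basic mechanism is the same.

```python
import math

# Invented labeled data: (feature, label) pairs for a toy domain task.
data = [(0.0, 0), (1.0, 0), (3.0, 1), (4.0, 1)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

w, b = 0.0, 0.0          # the "pre-trained" starting point
lr = 0.5                 # learning rate
for _ in range(200):     # the fine-tuning loop
    for x, y in data:
        p = sigmoid(w * x + b)
        # gradient of the cross-entropy loss with respect to w and b
        w -= lr * (p - y) * x
        b -= lr * (p - y)

accuracy = sum((sigmoid(w * x + b) > 0.5) == bool(y) for x, y in data) / len(data)
```

After enough passes, the parameters settle so that every labeled example is classified correctly — the same "adjust parameters to match the data" idea described above, shrunk to one weight.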

Now, let’s talk about “memory” in these models. LLMs do not have a long-term memory in the same way humans do. If you were to tell an LLM that you have a 6-year-old son, it wouldn’t retain that information like a human would. However, these models can still answer related follow-up questions in a conversation.

For example, if you ask the model to tell you a story and then ask it to make the story shorter, it can generate a shorter version of the story. This is possible because the previous Q&A is passed along in the context window of the conversation. The context window keeps track of the conversation history, allowing the model to maintain some context and generate appropriate responses.
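How the context window simulates memory can be sketched in a few lines: the application, not the model, keeps the history and re-sends all of it with every new message. The role labels and messages below are invented for illustration.

```python
# Sketch of chat "memory": every turn, the full conversation history is
# re-sent to the model inside the prompt. The model stores nothing itself.
history = []

def add_turn(role, text):
    history.append((role, text))

def build_prompt(new_user_message):
    """Concatenate all prior turns plus the new message into one prompt."""
    add_turn("user", new_user_message)
    return "\n".join(f"{role}: {text}" for role, text in history)

add_turn("user", "Tell me a story.")
add_turn("assistant", "Once upon a time, a robot learned to paint.")
prompt = build_prompt("Make it shorter.")
```

When the model sees this prompt, the earlier story is right there in the text, which is why "make it shorter" works even though the model remembers nothing between calls.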

As the conversation continues, the context window and the number of tokens required will keep growing. This can become a challenge, as there are limitations on the maximum length of input that the model can handle. If a conversation becomes too long, the model may start truncating or forgetting earlier parts of the conversation.

Regarding architectures and databases, some models may query a database before providing an answer. For example, a system could run a query like “select * from user_history” to retrieve relevant information before generating a response. Vector databases are a related approach: instead of matching rows with SQL, they store embeddings and retrieve the entries most similar to the query.
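Here is a minimal sketch of the vector-database idea. The documents and their three-dimensional vectors are hand-made stand-ins; a real system would produce high-dimensional embeddings with an embedding model and store them in a dedicated vector store.

```python
import math

# Each "document" is stored alongside an embedding vector (invented here).
documents = {
    "user bought a laptop last week": [0.9, 0.1, 0.0],
    "user asked about return policy": [0.1, 0.8, 0.1],
    "user likes hiking":              [0.0, 0.1, 0.9],
}

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def retrieve(query_vector):
    """Return the stored document whose embedding best matches the query."""
    return max(documents, key=lambda doc: cosine(documents[doc], query_vector))
```

A query embedded near the "laptop" direction pulls back the laptop record, which can then be placed into the model's prompt before it generates a response.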

There are also architectures where the model undergoes near-realtime fine-tuning when a chat begins, meaning it is tuned on data specific to that session to generate more context-aware responses. “Speak with your PDF” apps are sometimes described this way, although in practice most of them retrieve relevant passages from the PDF rather than actually retraining the model.

In summary, GPTs and LLMs work by being pre-trained on a large amount of text data and then using that training to generate output based on input. They do this token by token, without truly understanding the complete output. LLMs can be fine-tuned to improve accuracy for specific domains by providing them with domain-specific labeled data. While LLMs don’t have long-term memory like humans, they can still generate responses in a conversation by using the context window to keep track of the conversation history. Some architectures may query databases before generating responses, and others may undergo near-realtime fine-tuning to provide more context-aware answers.

GPTs and Large Language Models (LLMs) are fascinating tools that have revolutionized natural language processing. It seems like you have a good grasp of how these models function, but I’ll take a moment to provide some clarification and expand on a few points for a more comprehensive understanding.

When it comes to GPTs and LLMs, pre-training and token prediction play a crucial role. During the pre-training phase, these models are exposed to massive amounts of text data. This helps them learn to predict the next token (word or part of a word) in a sequence based on the statistical likelihood of that token following the given context. It’s important to note that while the model can recognize patterns in language use, it doesn’t truly “understand” the text in a human sense.
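"Statistical likelihood of the next token" can be demonstrated with the simplest possible language model: count, in a tiny corpus, how often each token follows each other token. The corpus below is invented; a real LLM learns vastly richer statistics over trillions of tokens, but the principle is the same.

```python
# Build next-token statistics from a toy corpus by counting token pairs.
corpus = "the cat sat on the mat the cat ran".split()

counts = {}
for prev, nxt in zip(corpus, corpus[1:]):
    counts.setdefault(prev, {}).setdefault(nxt, 0)
    counts[prev][nxt] += 1

def next_token_probability(prev, nxt):
    """P(nxt | prev): how often nxt followed prev, out of all followers of prev."""
    total = sum(counts.get(prev, {}).values())
    return counts.get(prev, {}).get(nxt, 0) / total if total else 0.0
```

In this corpus, "the" is followed by "cat" two times out of three, so the model would predict "cat" with probability 2/3 — pattern recognition, not understanding.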


During the training process, the model becomes familiar with these large datasets and learns embeddings. Embeddings are representations of tokens in a high-dimensional space, and they capture relationships and context around each token. These embeddings allow the model to generate coherent and contextually appropriate responses.

However, pre-training is just the beginning. Fine-tuning is a subsequent step that tailors the model to specific domains or tasks. It involves training the model further on a smaller, domain-specific dataset. This process adjusts the model’s parameters, enabling it to generate responses that are more relevant to the specialized domain.

Now, let’s discuss memory and the context window. LLMs like GPT do not possess long-term memory in the same way humans do. Instead, they operate within what we call a context window. The context window determines the amount of text (measured in tokens) that the model can consider when making predictions. It provides the model with a form of “short-term memory.”

For follow-up questions, the model relies on this context window. So, when you ask a follow-up question, the model factors in the previous interaction (the original story and the request to shorten it) within its context window. It then generates a response based on that context. However, it’s crucial to note that the context window has a fixed size, which means it can only hold a certain number of tokens. If the conversation exceeds this limit, the oldest tokens are discarded, and the model loses track of that part of the dialogue.
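The fixed-size window and the discarding of the oldest tokens can be sketched in one function. The eight-token budget here is invented for illustration; production models allow thousands of tokens, but the truncation behavior is the same.

```python
MAX_TOKENS = 8  # toy budget; real context windows hold thousands of tokens

def fit_to_window(tokens, max_tokens=MAX_TOKENS):
    """Keep only the most recent tokens that fit in the context window."""
    return tokens[-max_tokens:]

conversation = "my son is six years old please tell me a bedtime story".split()
visible = fit_to_window(conversation)
```

Once the conversation exceeds the budget, the opening tokens — here, the fact that the son is six — fall out of the window, and the model can no longer use them. This is exactly the "forgetting earlier parts of the conversation" described above.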

It’s also worth mentioning that there is no real-time fine-tuning happening with each interaction. The model responds based on its pre-training and any fine-tuning that occurred prior to its deployment. This means that the model does not learn or adapt during real-time conversation but rather relies on the knowledge it has gained from pre-training and fine-tuning.

While standard LLMs like GPT do not typically utilize external memory systems or databases, some advanced models and applications may incorporate these features. External memory systems can store information beyond the limits of the context window. However, it’s important to understand that these features are not inherent to the base LLM architecture like GPT. In some systems, vector databases might be used to enhance the retrieval of relevant information based on queries, but this is separate from the internal processing of the LLM.

In relation to the “speak with your PDF” applications you mentioned, they generally employ a combination of text extraction and LLMs. The purpose is to interpret and respond to queries about the content of a PDF. These applications do not engage in real-time fine-tuning, but instead use the existing capabilities of the model to interpret and interact with the newly extracted text.
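The extract-then-retrieve pipeline behind these PDF apps can be sketched as follows. The document text is invented, and the word-overlap score is a deliberately crude stand-in for the embedding similarity real apps use — but the flow (extract, chunk, retrieve, then hand the best chunk to the LLM) is the same.

```python
# Sketch of a "chat with your PDF" pipeline: extract text, split into chunks,
# and answer a question from the chunk that best matches it.
pdf_text = (
    "Chapter 1 covers invoicing procedures. "
    "Chapter 2 explains the refund policy in detail. "
    "Chapter 3 lists office locations."
)

# Step 1: chunk the extracted text (here, one sentence per chunk).
chunks = [c.strip() + "." for c in pdf_text.split(".") if c.strip()]

def best_chunk(question):
    """Step 2: score each chunk by how many question words it shares."""
    q_words = set(question.lower().split())
    return max(chunks, key=lambda c: len(q_words & set(c.lower().split())))

# Step 3: the winning chunk would be inserted into the LLM's prompt.
answer_source = best_chunk("what is the refund policy")
```

No fine-tuning happens anywhere in this flow: the model's weights are untouched, and the PDF content reaches it only through the prompt, inside the ordinary context window.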

To summarize, LLMs like GPT operate within a context window and utilize patterns learned during pre-training and fine-tuning to generate responses. They do not possess long-term memory or real-time learning capabilities during interactions, but they can handle follow-up questions within the confines of their context window. It’s important to remember that while some advanced implementations might leverage external memory or databases, these features are not inherently built into the foundational architecture of the standard LLM.

Are you ready to dive into the fascinating world of artificial intelligence? Well, I’ve got just the thing for you! It’s an incredible book called “AI Unraveled: Demystifying Frequently Asked Questions on Artificial Intelligence.” Trust me, this book is an absolute gem!


Now, you might be wondering where you can get your hands on this treasure trove of knowledge. Look no further, my friend. You can find “AI Unraveled” at popular online platforms like Etsy, Shopify, Apple, Google, and of course, our old faithful, Amazon.

This book is a must-have for anyone eager to expand their understanding of AI. It takes those complicated concepts and breaks them down into easily digestible chunks. No more scratching your head in confusion or getting lost in a sea of technical terms. With “AI Unraveled,” you’ll gain a clear and concise understanding of artificial intelligence.

So, if you’re ready to embark on this incredible journey of unraveling the mysteries of AI, go ahead and grab your copy of “AI Unraveled” today. Trust me, you won’t regret it!

On today’s episode, we explored the power of GPTs and LLMs, discussing their ability to generate outputs, be fine-tuned for specific domains, and utilize a context window for related follow-up questions. We also learned about their limitations in terms of long-term memory and real-time updates. Lastly, we shared information about the book “AI Unraveled,” which provides valuable insights into the world of artificial intelligence. Join us next time on AI Unraveled as we continue to demystify frequently asked questions on artificial intelligence and bring you the latest trends in AI, including ChatGPT advancements and the exciting collaboration between Google Brain and DeepMind. Stay informed, stay curious, and don’t forget to subscribe for more!






  • Could this PC Build Handle Local Models?
    by /u/Xianimus (Artificial Intelligence Gateway) on May 9, 2024 at 2:45 am

    I'm very passionate about entering a new domain, but I'm also pretty nervous. Would the parts list I've thrown together work to run local LLMs / derivatives? Any glaringly stupid mistakes in my part selection? I'd add links, but I'm not sure if that's allowed.. • GPU: ($1,900)MSI Suprim Liquid X 24G GeForce RTX 4090 Graphics Card PCIe 4.0 24GB • CPU:($330) AMD Ryzen 7 8700G - Ryzen 7 8000-G Series 8-Core 4.2 GHz Socket AM5 65W AMD Radeon 780M Processor - 100-100001236BOX • Mobo: ($430)ASRock X670E Taichi Carrara AM5 EATX Mainboard. 4xDDR5 slots, 2 x PCIe 5.0 x16 Slots , AMD Cross Fire, Quad M.2 slots, 2.5GB LAN, WIFI-6E, 5,1 HD audio, HDMI 2.1, DP 1.4 ports, USB4 Type-C • RAM: ($500)NEMIX RAM 128GB (2 x 64GB) DDR5 5600MHz PC5-44800 ECC RDIMM Compatible with ASRock TRX50 WS Workstation Motherboard • Storage: ($450)Crucial T700 GEN5 NMVE M.2 Heatsink M.2 SSD 2280 4TB PCI-Express 5.0 x4 TLC NAND² Internal Solid State Drive (SSD) CT4000T700SSD5 • PSU: ($360) CORSAIR HX1500i Fully Modular Ultra-Low Noise ATX Power Supply - ATX 3.0 & PCIe 5.0 Compliant - Fluid Dynamic Bearing Fan - CORSAIR iCUE Software Compatible - 80 PLUS Platinum Efficiency • Cooling - Liquid: ($340) ASUS ROG Ryujin III 360 ARGB all-in-one liquid CPU cooler with 360mm radiator. Asetek 8th gen pump, 3x magnetic 120mm ARGB fans (Daisy Chain design), 3.5” LCD display • OS: ($FREE)Enterprise Open Source and Linux | Ubuntu submitted by /u/Xianimus [link] [comments]

  • Sam Altman: we are introducing the Model Spec, which specifies how our models should behave. we will listen, debate, and adapt this over time, but i think it will be very useful to be clear when something is a bug vs. a decision.
    by /u/GrantFranzuela (Artificial Intelligence) on May 9, 2024 at 1:46 am

    submitted by /u/GrantFranzuela [link] [comments]

  • Adaptable and Intelligent Generative AI through Advanced Information Lifecycle (AIL)
    by /u/siphonfilter79 (Artificial Intelligence) on May 9, 2024 at 1:07 am

    Video: Husky AI: An Ensemble Learning Architecture for Dynamic Context-Aware Retrieval and Generation (youtube.com) Pleases excuse my video, I will make a improved one. I would like to do a live event. Abstract: Husky AI represents a groundbreaking advancement in generative AI, leveraging the power of Advanced Information Lifecycle (AIL) management to achieve unparalleled adaptability, accuracy, and context-aware intelligence. This paper delves into the core components of Husky AI's architecture, showcasing how AIL enables intelligent data manipulation, dynamic knowledge evolution, and iterative learning. By integrating the innovative classes developed entirely in python, using open source tools , Husky AI dynamically incorporates real-time data from the web and its local ElasticSearchDocument DB, significantly expanding its knowledge base and contextual understanding. The system's ability to continuously learn and refine its response generation capabilities through user interactions sets a new standard in the development of generative AI systems. Husky AI's superior performance, real-time knowledge integration, and generalizability across applications position it as a paradigm shift in the field, paving the way for the future of intelligent systems. Husky AI Architecture: A Symphony of AIL Components At the heart of Husky AI's success lies its innovative architecture, which seamlessly integrates various AIL components to achieve its cutting-edge capabilities. Let's dive into the core elements that make Husky AI a game-changer: 2.1. Intelligent Data Manipulation: Streamlining Information Processing Husky AI's foundation is built upon intelligent data manipulation techniques that ensure efficient storage, retrieval, and processing of information. The system employs state-of-the-art sentence transformers to convert unstructured textual data into dense vector representations, known as embeddings. 
These embeddings capture the semantic meaning and relationships within the data, enabling precise similarity searches during information retrieval. Under the hood, the preprocess_and_write_data function works its magic. It ingests raw data, encodes it as a text string, and feeds it to the sentence transformer model. The resulting embeddings are then stored alongside the data within a Document object, which is subsequently committed to the document store for efficient retrieval. 2.2. Dynamic Context-Aware Retrieval: The Mastermind of Relevance Husky AI takes information retrieval to the next level with its dynamic context-aware retrieval mechanism. The MultiModalRetriever class, in seamless integration with Elasticsearch (ESDB), serves as the mastermind behind this operation, ensuring lightning-fast indexing and retrieval. When a user query arrives, the MultiModalRetriever springs into action. It generates a query embedding and performs a similarity search against the document embeddings stored within Elasticsearch. The similarity function meticulously calculates the semantic proximity between the query and document embeddings, identifying the most relevant documents based on their similarity scores. This approach ensures that Husky AI stays in sync with the evolving conversation context, retrieving the most pertinent information at each turn. The result is a system that generates responses that are not only accurate but also exhibit remarkable coherence and contextual relevance. 2.3. Ensemble of Specialized Language Models: A Symphony of Expertise Husky AI takes response generation to new heights by employing an ensemble of specialized language models, orchestrated by the MultiModelAgent class. Each model within the ensemble is meticulously trained for specific tasks or domains, contributing its unique expertise to the response generation process. 
When a user query is received, the MultiModelAgent leverages the retrieved documents and conversation context to generate responses from each language model in the ensemble. These individual responses are then carefully combined and processed to select the optimal response, taking into account factors such as relevance, coherence, and factual accuracy. By harnessing the strengths of specialized models like BlenderbotConversationalAgent, HFConversationalModel, and MyConversationalAgent, Husky AI can handle a wide range of topics and generate responses tailored to specific domains or tasks. 2.4. Integration of CustomWebRetriever: The Game Changer Husky AI takes adaptability and knowledge expansion to new heights with the integration of the CustomWebRetriever class. This powerful tool enables the system to dynamically retrieve and incorporate external data from the web, significantly expanding Husky AI's knowledge base and enhancing its contextual understanding by providing access to real-time information. Under the hood, the CustomWebRetriever class leverages the Serper API to conduct web searches and retrieve relevant documents based on user queries. It generates query embeddings using sentence transformers and utilizes these embeddings to ensure that the retrieved information aligns closely with the user's intent. The impact of the CustomWebRetriever on Husky AI's knowledge acquisition is profound. By incorporating this component into its pipeline, Husky AI gains access to a vast reservoir of external knowledge. It can retrieve up-to-date information from the web and dynamically adapt to new domains and topics. This dynamic knowledge evolution empowers Husky AI to handle a broader spectrum of information needs and provide accurate and relevant responses, even for niche or evolving topics. Iterative Learning: The Continuous Improvement Engine One of the key strengths of Husky AI lies in its ability to learn and improve over time through iterative learning. 
The system's knowledge base and response generation capabilities are continuously refined based on user interactions, ensuring a constantly evolving and adapting AI. 3.1. Learning from Interactions With every user interaction, Husky AI diligently analyzes the conversation history, user feedback (implicit or explicit), and the effectiveness of the chosen response. This analysis provides invaluable insights that help the system refine its understanding of user intent, identify areas for improvement, and strengthen its knowledge base. 3.2. Refining Response Generation The insights gleaned from user interactions are then used to refine the response generation process. Husky AI can dynamically adjust the weights assigned to different language models within the ensemble, prioritize specific information retrieval strategies, and optimize the response selection criteria based on user feedback. This continuous learning cycle ensures that Husky AI's responses become progressively more accurate, coherent, and user-centric over time. 3.3. Adaptability Across Applications The iterative learning mechanism in Husky AI fosters generalizability, enabling the system to adapt to diverse applications. As Husky AI encounters new domains, topics, and user interaction patterns, it can refine its knowledge and response generation strategies accordingly. This adaptability makes Husky AI a valuable tool for a wide range of use cases, from customer support and virtual assistants to content generation and knowledge management. Experimental Results and Analysis While traditional evaluation metrics provide valuable insights into the performance of generative AI systems, they may not fully capture the unique strengths and capabilities of Husky AI's AIL-powered architecture. The system's ability to dynamically acquire knowledge, continuously learn through user interactions, and leverage the synergy of its components presents challenges for conventional evaluation methods. 4.1. 
The Limitations of Traditional Metrics Traditional evaluation metrics, such as precision, recall, and F1 score, are designed to assess the performance of individual components or specific tasks. However, Husky AI's true potential lies in the seamless integration and collaboration of its various modules. Attempting to evaluate Husky AI using isolated metrics would be like judging a symphony by focusing on individual instruments rather than appreciating the harmonious performance of the entire orchestra. Moreover, traditional metrics may not adequately account for Husky AI's ability to continuously learn and update its knowledge base through the `CustomWebRetriever`. The system's dynamic knowledge acquisition capabilities enable it to adapt to new domains and provide accurate responses to previously unseen topics. This ongoing learning process, driven by user interactions, is a progressive feature that may not be fully reflected in conventional evaluation methods. 4.2. Showcasing Husky AI's Strengths through Real-World Scenarios To truly showcase Husky AI's superior capabilities, it is essential to evaluate the system in real-world scenarios that highlight its adaptability, contextual relevance, and continuous learning. By engaging Husky AI in diverse conversational contexts and assessing its performance over time, we can gain a more comprehensive understanding of its strengths and potential. 4.2.1. Dynamic Knowledge Acquisition and Adaptation To demonstrate Husky AI's dynamic knowledge acquisition capabilities, the system can be exposed to new domains and topics in real-time. By observing how quickly and effectively Husky AI retrieves and incorporates relevant information from the web, we can assess its ability to adapt to evolving knowledge landscapes. This showcases the power of the `CustomWebRetriever` in expanding Husky AI's knowledge base and enhancing its contextual understanding. 4.2.2. 
Continuous Learning through User Interactions Husky AI's continuous learning capabilities can be evaluated by engaging the system in extended conversational sessions with users. By analyzing how Husky AI refines its responses, improves its understanding of user intent, and adapts to individual preferences over time, we can demonstrate the effectiveness of its iterative learning mechanism. This highlights the system's ability to learn from user feedback and deliver increasingly personalized and relevant responses. 4.2.3. Contextual Relevance and Coherence To assess Husky AI's contextual relevance and coherence, the system can be evaluated in real-world conversational scenarios that require a deep understanding of context and the ability to maintain a coherent dialogue. By engaging Husky AI in multi-turn conversations spanning various topics and domains, we can demonstrate its ability to generate accurate, contextually relevant, and coherent responses. This showcases the power of the ensemble model and the synergy between the system's components. Husky AI sets a new standard for intelligent, adaptable, and user-centric systems. Its AIL-powered architecture paves the way for the development of AI systems that can seamlessly integrate with the dynamic nature of real-world knowledge and meet the diverse needs of users. With its continuous learning capabilities and real-time knowledge acquisition, Husky AI represents a significant step forward in the quest for truly intelligent and responsive AI systems. Samples of outputs and debug logs showcasing its abilities. I would be happy to show more examples. 
https://preview.redd.it/hpfqkg6arazc1.png?width=1920&format=png&auto=webp&s=c332d26dc0144842ff30c1ba0a1c1d435f14e6b3 https://preview.redd.it/lgq7agebrazc1.png?width=1904&format=png&auto=webp&s=8cc15dd15fe3e480161819dd9614b15ad114ad37 https://preview.redd.it/476a0n20vazc1.png?width=2548&format=png&auto=webp&s=837870eff7b51eef932f46498a662b1846f0591e submitted by /u/siphonfilter79 [link] [comments]

  • The AI Risk Matrix: A Strategy for Risk Mitigation
    by /u/superc0w (Artificial Intelligence) on May 9, 2024 at 12:05 am

    submitted by /u/superc0w [link] [comments]

  • Are you comfortable telling your boss that you use AI in the workplace? 52% say no
    by /u/Similar_Diver9558 (Artificial Intelligence Gateway) on May 8, 2024 at 11:01 pm

    https://www.forbes.com.au/news/leadership/workers-dont-want-bosses-knowing-they-use-ai/ View Poll submitted by /u/Similar_Diver9558 [link] [comments]

  • Looking to build a chat bot - where should I go?
    by /u/Modern_chemistry (Artificial Intelligence Gateway) on May 8, 2024 at 10:37 pm

    So I’m a teacher and I was able to use playlab.io and build some pretty awesome things, but im interested in building personal bots in the same manner. For example: some sort of daily journal, health monitor, life coach, etc etc to help me with everyday things like career advice or random ideas. a philosopher steeped in XYZ and I want help philosophizing over some random thought I had which my girlfriend has already heard enough of. a school teacher helping to lesson plan and brainstorm ideas and connecting different concepts. a personal teaching assistant to help keep me on track and knows my scope and sequence AND a personal assistant to help me with my scheduling and what not. Where can I build these that’s safe, reliable, and free? Can I do this all on chatGPT+? My only experience building bots is with playlab.ai. submitted by /u/Modern_chemistry [link] [comments]

  • Best way to get better at AI
    by /u/Personal_Concept8169 (Artificial Intelligence Gateway) on May 8, 2024 at 10:34 pm

    Hello everyone! Hope yall having a good day. I'll just jump straight into it. I'm a computer science student (second year), I do have experience with python. I do understand linear algebra, statistics, and calculus, I do have experience coding AI, i've built a NN from scratch, trying to learn pytorch now... I completed the deep learning specialization on coursera and have dabbled with the stanford AI courses. Still despite this I feel like I know nothing. I mean I can barely complete basic kaggle problems, I've done two so far, titanic and MNIST. MNIST i coded without anything but numpy, and that took me 3 weeks of debugging and all that. With titanic, I decided to try to learn pytorch but it shit on me and It's been a month and I'm just going to start another problem from scratch again I think... I'm incredibly slow with these problems, data wrangling is fine for me, i don't really know what my issue is everything just breaks all the time. I know it's probably normal, but is there any other materials I should be picking up? Or should I just keep pushing through these problems? I'm completely self taught by the way. (no ai courses in my uni) And this is just with basic networks, I mean i haven't even started on GANS and convolutional networks and recursive and yadda yadda. It's a weird state where I'm not a beginner but I'm not an intermediate and just tryna get yall's input on the best way forward. tysm in advance for any/all advice. submitted by /u/Personal_Concept8169 [link] [comments]

  • Google DeepMind’s new AI can model DNA, RNA, and ‘all life’s molecules’
    by /u/Rare_Adhesiveness518 (Artificial Intelligence Gateway) on May 8, 2024 at 10:04 pm

    Google DeepMind is introducing an improved version of its AI model that predicts not just the structure of proteins but also the structure of “all life’s molecules.” The work from the new model, AlphaFold 3, will help researchers in medicine, agriculture, materials science, and drug development test potential discoveries. Want to stay ahead of the curve in AI and tech? take a look here. Key points: AlphaFold 3, can predict the structure of all life's molecules, not just proteins, including DNA, RNA and smaller molecules. This is a significant improvement from previous versions. AlphaFold 3 uses a similar method to AI image generators to predict how different molecules fit together. This method is called diffusion. DeepMind is making AlphaFold 3 and the AlphaFold Server available to some researchers for free, with a focus on non-commercial uses. They are also working on responsible deployment of the model considering biosecurity risks. Source (The Verge) PS: If you enjoyed this post, you’ll love my ML-powered newsletter that summarizes the best AI/tech news from 50+ media sources. It’s already being read by hundreds of professionals from OpenAI, HuggingFace, Apple… submitted by /u/Rare_Adhesiveness518 [link] [comments]

  • Fine-tuning
    by /u/pelatho (Artificial Intelligence Gateway) on May 8, 2024 at 8:55 pm

    I've been developing an AI companion app for a few months. It's coming along nicely, but the complexity of my prompts has reached a point where some models struggle to understand them, get lost, and start producing output I don't like. I've experimented with different models, and the majority fail to comprehend the prompts (they are fairly long and include personality traits, recent events, active memory entries, instructions for writing messages, lists of available commands, etc.). One model in particular, Cohere's Command R Plus, is astonishingly good; however, I may not be able to use it since my project is NSFW in nature. The only alternative I've found that is somewhat usable is Mistral's "Mistral Large" model; however, it often loses track of context and ends up asking the same question over and over. So, my question is: could I take a slightly smaller model, like a 70B model, and fine-tune it? Fine-tuning and hosting of models seem fairly straightforward and easy enough for me to accomplish via together.ai. But how much training data do I really need? 30k lines? 200k? 1 million? I have no idea. I imagine my dataset needs to contain snippets of likely phrases or instructions in the system prompt, paired with a user message and an assistant response. For example ("pseudocode"): system prompt: Commands: web_browse (browse the web, arguments: query) user: {"message": "What's the latest news in AI?", "time": "just now"} assistant: {"message": "Let me see...", "command": {"name": "web_browse", "arguments": {"query": "Artificial Intelligence news"}}} If anyone has experience in this area, I'd be happy to learn! Thanks in advance. submitted by /u/pelatho [link] [comments]
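    The pseudocode above maps naturally onto the chat-style JSONL layout many fine-tuning services accept: one JSON object per line, each holding a list of system/user/assistant messages. The field names here ("messages", "role", "content") follow a common convention, but the exact schema depends on the provider, so treat this as an illustrative sketch rather than together.ai's specific format.

    ```python
    import json

    # One training record mirroring the pseudocode in the post.
    record = {
        "messages": [
            {
                "role": "system",
                "content": "Commands: web_browse (browse the web, arguments: query)",
            },
            {
                "role": "user",
                "content": json.dumps(
                    {"message": "What's the latest news in AI?", "time": "just now"}
                ),
            },
            {
                "role": "assistant",
                "content": json.dumps({
                    "message": "Let me see...",
                    "command": {
                        "name": "web_browse",
                        "arguments": {"query": "Artificial Intelligence news"},
                    },
                }),
            },
        ]
    }

    # JSONL: serialize each record to a single line of the training file.
    line = json.dumps(record)
    print(line[:60] + "...")
    ```

    On the "how much data" question, reported numbers vary widely; for shaping output format and tone, coverage of every prompt variant and command the app actually uses tends to matter more than raw line count.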

  • OpenAI Is ‘Exploring’ How to Responsibly Generate AI Porn
    by /u/wiredmagazine (Artificial Intelligence) on May 8, 2024 at 8:07 pm

    submitted by /u/wiredmagazine [link] [comments]
