What Are the Best Machine Learning Algorithms for Imbalanced Datasets?
In machine learning, imbalanced datasets are those where one class heavily outnumbers the others. This can be due to the nature of the problem or simply because more data is available for one class than the others. Either way, imbalanced datasets can pose a challenge for machine learning algorithms. In this blog post, we’ll take a look at which machine learning algorithms are best suited for imbalanced datasets and why they tend to perform better than others.
For example, in a binary classification problem, if there are 100 observations and only 10 of them are positive (the remaining 90 are negative), then we say that the dataset is imbalanced. The ratio of positive to negative cases is 1:9.
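To make the arithmetic concrete, here is a minimal sketch that counts the classes for the 100-observation example and prints the imbalance ratio. The helper name `class_ratio` and the 0/1 label encoding are assumptions made for illustration.

```python
from collections import Counter


def class_ratio(labels):
    """Return (minority_count, majority_count) for a binary label list."""
    counts = Counter(labels)
    # most_common(2) lists the larger class first, then the smaller one.
    (_, majority), (_, minority) = counts.most_common(2)
    return minority, majority


# 10 positive observations (1) and 90 negative observations (0).
labels = [1] * 10 + [0] * 90
minority, majority = class_ratio(labels)

print(f"positives: {minority}, negatives: {majority}")
print(f"positive-to-negative ratio is 1:{majority / minority:g}")
print(f"minority share: {minority / len(labels):.0%}")
```

Checking the counts like this before training is a cheap way to find out how severe the imbalance actually is.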
There are a few reasons why some machine learning algorithms tend to perform better on imbalanced datasets than others. First, certain algorithms are designed to handle imbalanced datasets. Second, some algorithms are more robust to outliers, which can be more common in imbalanced datasets. And third, some algorithms are better able to learn from a limited amount of data, which can be an issue when one class is heavily outnumbered by the others.
Some of the best machine learning algorithms for imbalanced datasets include:
– Support Vector Machines (SVMs),
– Decision Trees,
– Random Forests,
– Naive Bayes Classifiers,
– k-Nearest Neighbors (kNN).
Of these, SVMs tend to be the most popular choice. They are not designed specifically for imbalanced data, but many implementations let you weight the classes so that errors on the minority class cost more. SVMs work by finding a hyperplane that maximizes the margin between the two classes, which helps to reduce overfitting and improve generalization. Decision trees and random forests are also popular choices, as they are less sensitive to outliers than linear models such as logistic regression. Naive Bayes classifiers are another good choice, as they are able to learn from a limited amount of data. kNN is also a good choice, as it can learn from a limited amount of data and, with a reasonably large k, is fairly robust to individual outliers. However, it can be computationally intensive for large datasets.
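One common way to make classifiers like the ones listed above pay more attention to the minority class is to weight each class inversely to its frequency. The sketch below computes such weights in plain Python. The function name `balanced_class_weights` is hypothetical, and the formula `n_samples / (n_classes * class_count)` is the inverse-frequency heuristic that, to my understanding, scikit-learn applies for `class_weight="balanced"`.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Same 90/10 split as the earlier example: 0 = negative, 1 = positive.
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)

for cls, weight in sorted(weights.items()):
    print(f"class {cls}: weight {weight:.3f}")
```

With these weights, a mistake on a positive example costs nine times as much as a mistake on a negative one, which offsets the 1:9 imbalance during training.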
There are two main types of machine learning algorithms: supervised and unsupervised. Supervised algorithms tend to perform better on imbalanced datasets than unsupervised algorithms. In this blog post, we will discuss why this is so and look at some examples.
Supervised Algorithms
Supervised algorithms are those where the target variable is known. In other words, we have training data where the correct answers are already given. The algorithm then learns from this data and is able to generalize to new data. Some examples of supervised algorithms are regression and classification.
Unsupervised Algorithms
Unsupervised algorithms are those where the target variable is not known. With unsupervised algorithms, we only have input data, without any corresponding output labels. The algorithm has to learn from the data itself without any guidance. Some examples of unsupervised algorithms are clustering and dimensionality reduction.
Why Supervised Algorithms Perform Better on Imbalanced Datasets
Supervised algorithms perform better on imbalanced datasets because the labels in the training data tell them which cases belong to the minority class, so they can learn which cases are more important. With unsupervised algorithms, all data points are treated equally, regardless of whether they are in the minority or majority class.
For example, in a binary classification problem with an imbalanced dataset, let’s say that we want to predict whether a customer will default on their loan payment or not. We have a training dataset of 1000 customers, out of which only 100 (10%) have defaulted on their loan in the past.
If we use a supervised algorithm like logistic regression, the algorithm will learn from the training data that defaulting on a loan is rare (since only 10% of cases in the training data are positive). This means that it will be more likely to predict correctly that a new customer will not default on their loan (since this is the majority class in the training data).
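It is worth seeing how far the majority class alone can carry a model on this dataset. The sketch below is an illustration, not a trained logistic regression: it scores a hypothetical baseline that always predicts "no default" on the 1000-customer example and reports both accuracy and recall for the defaulting class.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that were predicted positive."""
    true_pos = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == positive for t in y_true)
    return true_pos / actual_pos if actual_pos else 0.0


# 1000 customers: 100 defaulted (1), 900 did not (0).
y_true = [1] * 100 + [0] * 900
# Baseline that ignores the features and always predicts "no default".
y_pred = [0] * 1000

print("accuracy:", accuracy(y_true, y_pred))  # 0.9
print("recall on defaulters:", recall(y_true, y_pred))  # 0.0
```

The baseline is right 90% of the time yet never identifies a defaulter, which is why metrics such as recall on the minority class matter when comparing models on imbalanced data.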
However, if we use an unsupervised algorithm like k-means clustering, all data points will be treated equally since there is no target variable to guide the algorithm. This means that it might incorrectly cluster together customers who have defaulted on their loans with those who haven’t since there is no guidance provided by a target variable.
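The clustering behaviour described above can be sketched with a tiny one-dimensional k-means. Everything in the example is made up for illustration: the single feature (a credit-utilisation percentage), the ten customers, and the starting centers. The labels are only used afterwards to inspect the clusters; the algorithm itself never sees them.

```python
from collections import Counter


def kmeans_1d(values, centers, iterations=10):
    """Plain k-means (Lloyd's algorithm) on 1-D data."""
    centers = list(centers)
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        assignments = [
            min(range(len(centers)), key=lambda k: abs(v - centers[k]))
            for v in values
        ]
        # Update step: each center moves to the mean of its members.
        for k in range(len(centers)):
            members = [v for v, a in zip(values, assignments) if a == k]
            if members:
                centers[k] = sum(members) / len(members)
    return centers, assignments


# Hypothetical credit-utilisation percentages for ten customers.
values = [10, 15, 20, 25, 32, 35, 40, 45, 42, 50]
# True outcomes (1 = defaulted); never passed to kmeans_1d.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

centers, assignments = kmeans_1d(values, centers=[10, 50])

# Inspect which true labels ended up in each cluster.
for k in range(len(centers)):
    composition = Counter(l for l, a in zip(labels, assignments) if a == k)
    print(f"cluster {k}: {dict(composition)}")
```

On this toy data, both defaulters land in a cluster that also contains four customers who did not default, because the grouping is driven purely by distance in feature space.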
Conclusion:
In conclusion, supervised machine learning algorithms tend to perform better on imbalanced datasets than unsupervised machine learning algorithms because they can learn from the training data which cases are more important.
Some machine learning algorithms tend to perform better on highly imbalanced datasets because they are designed to deal with imbalance or because they can learn from both classes simultaneously. If you are working with a highly imbalanced dataset, then you should consider using one of these algorithms.
Thanks for reading!
How are machine learning techniques being used to address unstructured data challenges?
Machine learning techniques are being used to address unstructured data challenges in a number of ways:
- Natural language processing (NLP): NLP algorithms can be used to extract meaningful information from unstructured text data, such as emails, documents, and social media posts. NLP algorithms can be trained to classify text data, identify key terms and concepts, and extract structured data from unstructured text.
- Image recognition: Machine learning algorithms can be used to analyze and classify images, enabling the automatic identification and classification of objects, people, and other elements in images. This can be useful for tasks such as image tagging and search, as well as for applications such as security and surveillance.
- Audio and speech recognition: Machine learning algorithms can be used to analyze and classify audio data, enabling the automatic transcription and translation of spoken language. This can be useful for tasks such as speech-to-text transcription, as well as for applications such as call center automation and language translation.
- Video analysis: Machine learning algorithms can be used to analyze and classify video data, enabling the automatic detection and classification of objects, people, and other elements in video. This can be useful for tasks such as video tagging and search, as well as for applications such as security and surveillance.
Overall, machine learning techniques are being used in a wide range of applications to extract meaningful information from unstructured data, and to enable the automatic classification and analysis of data in a variety of formats.
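As a small illustration of turning unstructured text into structured data, the sketch below builds a bag-of-words count matrix from two made-up messages. The tokenizer regex and the sample sentences are assumptions chosen for brevity; real NLP pipelines typically use more careful tokenization.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count word tokens."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(tokens)


docs = [
    "Payment failed, please retry the payment.",
    "Thanks, the refund arrived today!",
]

# Shared vocabulary across all documents, in a stable order.
vocab = sorted(set().union(*(bag_of_words(d) for d in docs)))
# One row per document, one column per vocabulary word.
matrix = [[bag_of_words(d)[w] for w in vocab] for d in docs]

print(vocab)
for row in matrix:
    print(row)
```

Each row is now a fixed-length numeric vector, which is the kind of input that classifiers such as Naive Bayes can be trained on.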
How are AI and machine learning impacting application development today?
Artificial intelligence (AI) and machine learning are having a significant impact on application development today in a number of ways:
- Enabling new capabilities: AI and machine learning algorithms can be used to enable applications to perform tasks that would be difficult or impossible for humans to do. For example, AI-powered applications can be used to analyze and classify large amounts of data, or to automate complex decision-making processes.
- Improving performance: AI and machine learning algorithms can be used to optimize the performance of applications, making them faster, more efficient, and more accurate. For example, machine learning algorithms can be used to improve the accuracy of predictive models, or to optimize the performance of search algorithms.
- Streamlining development: AI and machine learning algorithms can be used to automate various aspects of application development, such as testing, debugging, and deployment. This can help to streamline the development process and reduce the time and resources needed to build and maintain applications.
- Enhancing user experiences: AI and machine learning algorithms can be used to enhance the user experience of applications, by providing personalized recommendations or by enabling applications to anticipate and respond to the needs and preferences of users.
Overall, AI and machine learning are having a significant impact on application development today, and they are likely to continue to shape the way applications are built and used in the future.
How will advancements in artificial intelligence and machine learning shape the future of work and society?
Advancements in artificial intelligence (AI) and machine learning are likely to shape the future of work and society in a number of ways. Some potential impacts include:
- Automation: AI and machine learning algorithms can be used to automate tasks that are currently performed by humans, such as data entry, customer service, and manufacturing. This could lead to changes in the types of jobs that are available and the skills that are in demand, as well as to increased productivity and efficiency.
- Job displacement: While automation may create new job opportunities, it could also lead to job displacement, particularly for workers in industries that are more susceptible to automation. This could lead to social and economic challenges, including unemployment and income inequality.
- Increased efficiency: AI and machine learning algorithms can be used to optimize and streamline business processes, leading to increased efficiency and productivity. This could lead to economic growth and innovation, and could also help to reduce costs for businesses and consumers.
- Enhanced decision-making: AI and machine learning algorithms can be used to analyze large amounts of data and make more informed and accurate decisions. This could lead to improved outcomes in fields such as healthcare, finance, and education, and could also help to reduce bias and improve fairness.
Overall, the impact of AI and machine learning on the future of work and society is likely to be significant and complex, with both potential benefits and challenges. It will be important to consider and address these impacts as these technologies continue to advance and become more widely adopted.
List of freely available programming books - What is the single most influential book every programmer should read?
- Bjarne Stroustrup - The C++ Programming Language
- Brian W. Kernighan, Rob Pike - The Practice of Programming
- Donald Knuth - The Art of Computer Programming
- Ellen Ullman - Close to the Machine
- Ellis Horowitz - Fundamentals of Computer Algorithms
- Eric Raymond - The Art of Unix Programming
- Gerald M. Weinberg - The Psychology of Computer Programming
- James Gosling - The Java Programming Language
- Joel Spolsky - The Best Software Writing I
- Keith Curtis - After the Software Wars
- Richard M. Stallman - Free Software, Free Society
- Richard P. Gabriel - Patterns of Software
- Richard P. Gabriel - Innovation Happens Elsewhere
- Code Complete (2nd edition) by Steve McConnell
- The Pragmatic Programmer
- Structure and Interpretation of Computer Programs
- The C Programming Language by Kernighan and Ritchie
- Introduction to Algorithms by Cormen, Leiserson, Rivest & Stein
- Design Patterns by the Gang of Four
- Refactoring: Improving the Design of Existing Code
- The Mythical Man Month
- The Art of Computer Programming by Donald Knuth
- Compilers: Principles, Techniques and Tools by Alfred V. Aho, Ravi Sethi and Jeffrey D. Ullman
- Gödel, Escher, Bach by Douglas Hofstadter
- Clean Code: A Handbook of Agile Software Craftsmanship by Robert C. Martin
- Effective C++
- More Effective C++
- CODE by Charles Petzold
- Programming Pearls by Jon Bentley
- Working Effectively with Legacy Code by Michael C. Feathers
- Peopleware by DeMarco and Lister
- Coders at Work by Peter Seibel
- Surely You're Joking, Mr. Feynman!
- Effective Java 2nd edition
- Patterns of Enterprise Application Architecture by Martin Fowler
- The Little Schemer
- The Seasoned Schemer
- Why's (Poignant) Guide to Ruby
- The Inmates Are Running The Asylum: Why High Tech Products Drive Us Crazy and How to Restore the Sanity
- Test-Driven Development: By Example by Kent Beck
- Practices of an Agile Developer
- Don't Make Me Think
- Agile Software Development, Principles, Patterns, and Practices by Robert C. Martin
- Domain-Driven Design by Eric Evans
- The Design of Everyday Things by Donald Norman
- Modern C++ Design by Andrei Alexandrescu
- Pragmatic Thinking and Learning: Refactor Your Wetware by Andy Hunt
- Software Estimation: Demystifying the Black Art by Steve McConnell
- The Passionate Programmer (My Job Went To India) by Chad Fowler
- Hackers: Heroes of the Computer Revolution
- Algorithms + Data Structures = Programs
- JavaScript - The Good Parts
- Getting Real by 37 Signals
- Foundations of Programming by Karl Seguin
- Computer Graphics: Principles and Practice in C (2nd Edition)
- Thinking in Java by Bruce Eckel
- The Elements of Computing Systems
- Refactoring to Patterns by Joshua Kerievsky
- Modern Operating Systems by Andrew S. Tanenbaum
- The Annotated Turing
- Things That Make Us Smart by Donald Norman
- The Timeless Way of Building by Christopher Alexander
- The Deadline: A Novel About Project Management by Tom DeMarco
- The C++ Programming Language (3rd edition) by Stroustrup
- Computer Systems - A Programmer's Perspective
- Agile Principles, Patterns, and Practices in C# by Robert C. Martin
- Growing Object-Oriented Software, Guided by Tests
- Framework Design Guidelines by Brad Abrams
- Object Thinking by Dr. David West
- Advanced Programming in the UNIX Environment by W. Richard Stevens
- Hackers and Painters: Big Ideas from the Computer Age
- The Soul of a New Machine by Tracy Kidder
- CLR via C# by Jeffrey Richter
- Design Patterns in C# by Steve Metsker
- Alice in Wonderland by Lewis Carroll
- Zen and the Art of Motorcycle Maintenance by Robert M. Pirsig
- About Face - The Essentials of Interaction Design
- Here Comes Everybody: The Power of Organizing Without Organizations by Clay Shirky
- The Tao of Programming
- Computational Beauty of Nature
- Writing Solid Code by Steve Maguire
- Philip and Alex's Guide to Web Publishing
- Object-Oriented Analysis and Design with Applications by Grady Booch
- Effective Java by Joshua Bloch
- Computability by N. J. Cutland
- Masterminds of Programming
- The Tao Te Ching
- The Productive Programmer
- The Art of Deception by Kevin Mitnick
- The Career Programmer: Guerilla Tactics for an Imperfect World by Christopher Duncan
- Paradigms of Artificial Intelligence Programming: Case studies in Common Lisp
- Masters of Doom
- Pragmatic Unit Testing in C# with NUnit by Andy Hunt and Dave Thomas with Matt Hargett
- How To Solve It by George Polya
- The Alchemist by Paulo Coelho
- Smalltalk-80: The Language and its Implementation
- Writing Secure Code (2nd Edition) by Michael Howard
- Introduction to Functional Programming by Philip Wadler and Richard Bird
- No Bugs! by David Thielen
- Rework by Jason Fried and DHH
- JUnit in Action