What Are the Best Machine Learning Algorithms for Imbalanced Datasets?
In machine learning, imbalanced datasets are those where one class heavily outnumbers the others. This can be due to the nature of the problem or simply because more data is available for one class than the others. Either way, imbalanced datasets can pose a challenge for machine learning algorithms. In this blog post, we’ll take a look at which machine learning algorithms are best suited for imbalanced datasets and why they tend to perform better than others.
For example, in a binary classification problem, if there are 100 observations and only 10 of them are positive (the other 90 are negative), then we say that the dataset is imbalanced. The ratio of positive to negative cases is 1:9.
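Checking the class distribution is usually the first step. The snippet below is a minimal sketch using pandas; the labels are made-up placeholders rather than data from any particular problem:

```python
import pandas as pd

# Toy labels: 90 negatives, 10 positives (a 1:9 positive-to-negative ratio)
labels = pd.Series([0] * 90 + [1] * 10, name="label")

# Class counts and proportions reveal how skewed the dataset is
print(labels.value_counts())                # 0 -> 90, 1 -> 10
print(labels.value_counts(normalize=True))  # 0 -> 0.9, 1 -> 0.1
```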

There are a few reasons why some machine learning algorithms tend to perform better on imbalanced datasets than others. First, certain algorithms are designed to handle imbalanced datasets. Second, some algorithms are more robust to outliers, which can be more common in imbalanced datasets. And third, some algorithms are better able to learn from a limited amount of data, which can be an issue when one class is heavily outnumbered by the others.
Some of the best machine learning algorithms for imbalanced datasets include:
– Support Vector Machines (SVMs),
– Decision Trees,
– Random Forests,
– Naive Bayes Classifiers,
– k-Nearest Neighbors (kNN).
Of these, SVMs are a popular choice because they can be adapted to imbalance fairly directly: by assigning a larger misclassification penalty (class weight) to the minority class, the hyperplane that maximizes the margin between the two classes is prevented from simply ignoring the rare class, which helps reduce overfitting to the majority class and improves generalization. Decision trees and random forests are also popular choices, as they are less sensitive to outliers than algorithms such as linear regression. Naive Bayes classifiers are another good option because they are able to learn from a limited amount of data. kNN is also a reasonable choice, as it is not very sensitive to outliers and can learn from a limited amount of data; however, it can be computationally intensive for large datasets. The class-weighting idea is sketched in the example below.
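As a concrete illustration of class weighting, here is a minimal sketch. scikit-learn is an assumed choice of library (the post does not prescribe one), and the synthetic data and parameter values are purely illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import classification_report

# Synthetic binary dataset with roughly a 9:1 majority/minority split
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=42)

# class_weight="balanced" raises the penalty for misclassifying the minority class
clf = SVC(kernel="rbf", class_weight="balanced")
clf.fit(X_train, y_train)

# Per-class precision/recall is more informative than plain accuracy on imbalanced data
print(classification_report(y_test, clf.predict(X_test)))
```

Random forests and logistic regression in scikit-learn accept the same class_weight argument, so the pattern carries over to several of the other algorithms listed above.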
There are two main types of machine learning algorithms: supervised and unsupervised. Supervised algorithms tend to perform better on imbalanced datasets than unsupervised algorithms. Let's look at why this is the case, along with some examples.
Supervised Algorithms
Supervised algorithms are those where the target variable is known. In other words, we have training data where the correct answers are already given. The algorithm then learns from this data and is able to generalize to new data. Some examples of supervised algorithms are regression and classification.
Unsupervised Algorithms
Unsupervised algorithms are those where the target variable is not known. With unsupervised algorithms, we only have input data, without any corresponding output labels. The algorithm has to learn from the data itself without any guidance. Some examples of unsupervised algorithms are clustering and dimensionality reduction.
Why Supervised Algorithms Perform Better on Imbalanced Datasets
Supervised algorithms perform better on imbalanced datasets because they can learn from the training data which cases are more important. With unsupervised algorithms, all data points are treated equally, regardless of whether they belong to the minority or the majority class.
For example, in a binary classification problem with an imbalanced dataset, let’s say that we want to predict whether a customer will default on their loan payment or not. We have a training dataset of 1000 customers, out of which only 100 (10%) have defaulted on their loan in the past.
If we use a supervised algorithm like logistic regression, the algorithm will learn from the training data that defaulting on a loan is rare (since only 10% of cases in the training data are positive). This means it will be more likely to correctly predict that a new customer will not default on their loan, since this is the majority class in the training data.
However, if we use an unsupervised algorithm like k-means clustering, all data points will be treated equally since there is no target variable to guide the algorithm. This means that it might incorrectly cluster together customers who have defaulted on their loans with those who haven’t since there is no guidance provided by a target variable.
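To make the contrast concrete, here is a small sketch in the same spirit, again assuming scikit-learn and a synthetic stand-in for the loan data (all numbers are illustrative):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans
from sklearn.metrics import recall_score

# Stand-in for the loan data: 1000 customers, roughly 10% defaulters (label 1)
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)

# Supervised: the labels tell the model which (rare) cases matter
log_reg = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X, y)
# Evaluated on the training data here only to keep the sketch short
print("Recall on defaulters:", recall_score(y, log_reg.predict(X)))

# Unsupervised: k-means never sees the labels, so its two clusters
# need not correspond to "default" vs. "no default" at all
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("Cluster sizes:", np.bincount(kmeans.labels_))
```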
Conclusion:
In conclusion, supervised machine learning algorithms tend to perform better on imbalanced datasets than unsupervised machine learning algorithms because they can learn from the training data which cases are more important.
Some machine learning algorithms tend to perform better on highly imbalanced datasets because they can be configured to deal with the imbalance (for example, through class weights) or because they can learn effectively from the limited minority-class data. If you are working with a highly imbalanced dataset, you should consider using one of these algorithms.
Thanks for reading!
How are machine learning techniques being used to address unstructured data challenges?
Machine learning techniques are being used to address unstructured data challenges in a number of ways:
- Natural language processing (NLP): NLP algorithms can be used to extract meaningful information from unstructured text data, such as emails, documents, and social media posts. These algorithms can be trained to classify text data, identify key terms and concepts, and extract structured data from unstructured text. A short text-classification sketch follows after this list.
- Image recognition: Machine learning algorithms can be used to analyze and classify images, enabling the automatic identification and classification of objects, people, and other elements in images. This can be useful for tasks such as image tagging and search, as well as for applications such as security and surveillance.
- Audio and speech recognition: Machine learning algorithms can be used to analyze and classify audio data, enabling the automatic transcription and translation of spoken language. This can be useful for tasks such as speech-to-text transcription, as well as for applications such as call center automation and language translation.
- Video analysis: Machine learning algorithms can be used to analyze and classify video data, enabling the automatic detection and classification of objects, people, and other elements in video. This can be useful for tasks such as video tagging and search, as well as for applications such as security and surveillance.
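To ground the natural language processing item above, here is a minimal text-classification sketch, assuming scikit-learn; the tiny corpus and labels are invented purely for illustration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy corpus of "unstructured" text with illustrative labels
texts = [
    "Invoice attached, payment due Friday",
    "Lunch tomorrow?",
    "Your account statement is ready",
    "Are we still on for the game tonight?",
]
labels = ["work", "personal", "work", "personal"]

# TF-IDF features + logistic regression: a simple text classifier
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

print(model.predict(["Please see the attached invoice"]))  # likely "work"
```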
Overall, machine learning techniques are being used in a wide range of applications to extract meaningful information from unstructured data, and to enable the automatic classification and analysis of data in a variety of formats.
How is AI and machine learning impacting application development today?
Artificial intelligence (AI) and machine learning are having a significant impact on application development today in a number of ways:
- Enabling new capabilities: AI and machine learning algorithms can be used to enable applications to perform tasks that would be difficult or impossible for humans to do. For example, AI-powered applications can be used to analyze and classify large amounts of data, or to automate complex decision-making processes.
- Improving performance: AI and machine learning algorithms can be used to optimize the performance of applications, making them faster, more efficient, and more accurate. For example, machine learning algorithms can be used to improve the accuracy of predictive models, or to optimize the performance of search algorithms.
- Streamlining development: AI and machine learning algorithms can be used to automate various aspects of application development, such as testing, debugging, and deployment. This can help to streamline the development process and reduce the time and resources needed to build and maintain applications.
- Enhancing user experiences: AI and machine learning algorithms can be used to enhance the user experience of applications by providing personalized recommendations or by enabling applications to anticipate and respond to the needs and preferences of users.
Overall, AI and machine learning are having a significant impact on application development today, and they are likely to continue to shape the way applications are built and used in the future.
How will advancements in artificial intelligence and machine learning shape the future of work and society?
Advancements in artificial intelligence (AI) and machine learning are likely to shape the future of work and society in a number of ways. Some potential impacts include:
- Automation: AI and machine learning algorithms can be used to automate tasks that are currently performed by humans, such as data entry, customer service, and manufacturing. This could lead to changes in the types of jobs that are available and the skills that are in demand, as well as to increased productivity and efficiency.
- Job displacement: While automation may create new job opportunities, it could also lead to job displacement, particularly for workers in industries that are more susceptible to automation. This could lead to social and economic challenges, including unemployment and income inequality.
- Increased efficiency: AI and machine learning algorithms can be used to optimize and streamline business processes, leading to increased efficiency and productivity. This could lead to economic growth and innovation, and could also help to reduce costs for businesses and consumers.
- Enhanced decision-making: AI and machine learning algorithms can be used to analyze large amounts of data and make more informed and accurate decisions. This could lead to improved outcomes in fields such as healthcare, finance, and education, and could also help to reduce bias and improve fairness.
Overall, the impact of AI and machine learning on the future of work and society is likely to be significant and complex, with both potential benefits and challenges. It will be important to consider and address these impacts as these technologies continue to advance and become more widely adopted.