What are some ways to boost precision and recall in machine learning?
Sensitivity vs. Specificity?
In machine learning, recall is the ability of a model to find all relevant instances in the data, while precision is its ability to return only relevant instances. High recall means that most of the relevant results are retrieved; high precision means that most of the returned results are relevant. Ideally, you want a model with both high recall and high precision, but there is often a trade-off between the two. In this blog post, we will explore some ways to increase recall or precision in machine learning.
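In confusion-matrix terms, precision = TP / (TP + FP) and recall = TP / (TP + FN), where TP, FP, and FN are the counts of true positives, false positives, and false negatives. Keeping these two formulas in mind makes the threshold adjustments below easier to follow, since each adjustment moves these counts in opposite directions.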
The main way to increase recall is to decrease the number of false negatives.
Because recall is the fraction of actual positives that the model catches, it rises when fewer positives are missed. The usual lever is to lower the threshold for what constitutes a positive prediction. For example, if you are trying to predict whether or not an email is spam, you might lower the threshold for what constitutes spam so that more emails are classified as spam. More of the actual spam gets caught, so recall goes up, but you also produce more false positives (emails that are not actually spam being classified as spam), which hurts precision.
Conversely, if you raise the threshold for what constitutes spam, fewer emails are classified as spam. This cuts down on false positives, but more of the actual spam slips through as false negatives, so recall falls.
The main way to increase precision is to decrease the number of false positives.
Because precision is the fraction of positive predictions that are correct, it rises when fewer negatives are mislabeled as positive. Here the threshold works in the other direction: using the spam email example again, raising the threshold means fewer emails are classified as spam, and the ones that are flagged are more likely to really be spam. Precision goes up, but more actual spam is missed, so recall goes down.
Lowering the threshold has the opposite effect on precision:
more emails are classified as spam, including more legitimate ones, so the number of false positives rises and precision drops even as recall improves.
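To make the threshold lever concrete, here is a minimal sketch using scikit-learn. The synthetic dataset, logistic-regression model, and the three threshold values are illustrative assumptions rather than anything specific to the spam example above, but the pattern (precision and recall moving in opposite directions as the threshold changes) is the general one:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score

# Imbalanced synthetic data standing in for a spam/not-spam problem.
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
probs = clf.predict_proba(X_test)[:, 1]  # predicted probability of the positive class

# Sweep the decision threshold: lower -> higher recall, higher -> higher precision.
for threshold in (0.2, 0.5, 0.8):
    preds = (probs >= threshold).astype(int)
    print(f"threshold={threshold:.1f}  "
          f"precision={precision_score(y_test, preds, zero_division=0):.2f}  "
          f"recall={recall_score(y_test, preds):.2f}")
```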
To summarize,
there are a few ways to manage precision and recall in machine learning. One is to pick an evaluation metric that reflects what you actually care about: the F1 score balances precision and recall, and the more general F-beta score lets you weight one of them more heavily than the other. Another is to adjust the classification threshold, either by moving the decision boundary directly or by switching to a different algorithm whose scores separate the classes better.
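As a small illustration of the metric point (the labels below are made up purely for the example), scikit-learn's fbeta_score generalizes F1: beta > 1 weights recall more heavily, beta < 1 weights precision more heavily.

```python
from sklearn.metrics import f1_score, fbeta_score, precision_score, recall_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]  # 2 TP, 2 FN, 1 FP, 5 TN

print("precision:", precision_score(y_true, y_pred))              # 2 / 3
print("recall:   ", recall_score(y_true, y_pred))                  # 2 / 4
print("F1:       ", f1_score(y_true, y_pred))                      # harmonic mean of the two
print("F2   (favors recall):   ", fbeta_score(y_true, y_pred, beta=2))
print("F0.5 (favors precision):", fbeta_score(y_true, y_pred, beta=0.5))
```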
Sensitivity vs Specificity
In machine learning, sensitivity and specificity are two measures of a classifier's performance. Sensitivity (another name for recall) is the proportion of actual positives that the model correctly identifies, while specificity is the proportion of actual negatives that it correctly identifies.
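Neither metric has a dedicated one-liner in scikit-learn, but both fall out of the confusion matrix. A minimal sketch with made-up labels:

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]

# For binary labels, ravel() yields the four cells in this fixed order.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)  # recall on the positive class
specificity = tn / (tn + fp)  # recall on the negative class
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```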
Google introduced computing units, which you can purchase just like any other cloud compute from AWS, Azure, etc. With Pro you get 100 units, and with Pro+ you get 500. The GPU or TPU you pick and the high-RAM option affect how many computing units you use per hour. If you don't have any computing units, you can't use the "Premium" tier GPUs (A100, V100), and even the P100 is non-viable.
Google Colab Pro+ comes with the Premium-tier GPU option, while in Pro, if you have computing units, you are randomly connected to a P100 or T4. After you use up all of your computing units, you can buy more, or you can use a T4 GPU for half or most of the time (there can be long stretches of the day when you can't get a T4, or any GPU at all). In the free tier, the offered GPUs are most of the time a K80 or P4, which perform similarly to a 750 Ti (an entry-level GPU from 2014) but with more VRAM.
For your consideration, a T4 uses around 2 computing units per hour and an A100 around 15. Based on current knowledge, the computing-unit cost of each GPU also tends to fluctuate based on some unknown factor.
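Assuming those hourly rates hold, a quick back-of-the-envelope calculation: Pro's 100 units buy roughly 100 / 2 ≈ 50 hours of T4 time or 100 / 15 ≈ 6-7 hours of A100 time, while Pro+'s 500 units stretch to roughly 250 hours of T4 or about 33 hours of A100.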
Considering all of this:
For hobbyists and (under)graduate school duties, it is better to use your own GPU if you have something with more than 4 GB of VRAM and faster than a 750 Ti, or at least purchase Colab Pro so you can reach a T4 even if you have no computing units remaining.
For small research companies, non-trivial research at universities, and probably most people, Colab is now probably not a good option.
Colab Pro+ can be considered if you want Pro but don't sit in front of your computer, since Pro disconnects after 90 minutes of inactivity. This can be worked around to some extent with scripts, though, so most of the time Colab Pro+ is not a good option either.
If you have anything more to add, please let me know so I can edit this post. Thanks!
In machine learning, precision and recall trade off against each other; increasing one often decreases the other. There is no single silver-bullet solution for increasing either metric: which one matters more, and which methods will work best for boosting it, depend on your specific use case. In this blog post, we explored some methods for increasing either precision or recall; hopefully this gives you a starting point for improving your own models!
Does anybody have any tips on how to handle a reorganization/restructuring? I'm still employed as a DA, and tbh I have avenues to stay at my company, but it seems like they're moving to a more centralized data structure and will probably give primary access to tech-hub office employees. I am remote. I do Power BI, VBA, and data processing, right now mostly ETL stuff. Any tips would be appreciated! submitted by /u/pintora0318
I'm starting a Master's program in the fall and I'd like to improve my programming skills. My undergrad was in Math, so programming wasn't really much of a focus. I took one actual CS course, which mostly used Python and just a little bit of C. I encountered R in college in my Stats courses and I use it regularly in my current role (DBA/Analyst at a small nonprofit). I've also kept up with Python and I'm fairly comfortable with it still. I've never actually learned about OOP or data structures and algorithms, but I'd like to. I've read a bunch about Java being a more rigorous language which forces you to code in an object-oriented way. I guess my question is: is there enough of a benefit to using Java for OOP, or should I just use resources designed for Python? submitted by /u/orndoda
If anyone has any experiences, I'd love to hear them. And if it's not a thing, what are the blockers in your opinion? submitted by /u/trashed_culture
I made this post after getting an offer a couple months ago. A couple weeks after the offer, it was rescinded. Probably for the best, as I realized the original description did not match the actual role. After the offer was rescinded, I took a couple weeks off the job hunt before getting back at it: cleaned up the resume, started being more selective with where I applied, and ground through SQL problems online. About a month in I was interviewing with 3 companies. I don't feel like making another Sankey, but it's pretty much identical to the last, except I got 3 first-round interviews rather than the 1 last time. The companies were 1 mid-sized tech company and 2 pre-IPO unicorns. I was ghosted by one unicorn after a screening round and am still interviewing with the other after 2 rounds, though after 5 rounds with the mid-sized tech company I accepted a DS manager position. My advice: 1) Stop following this subreddit; it's 90% doom posting and 10% circle jerk. It doesn't feel like anyone here is actually interested in data science beyond getting a job. 2) Mass-send an easy-to-parse resume everywhere. 3) Keep your head up; it's a grind. Don't forget to exercise, eat well, and have a social outlet. 4) Referrals aren't worth what they once were. None of my dozen or so referrals resulted in even a screening interview. I was rejected for roles I thought I was a shoo-in for and interviewed for roles I thought were a reach. There's a lot of luck (preparation + opportunity) involved that's often out of your control. Good luck. submitted by /u/JeepMan831
Hello Everyone! This Friday, I have an upcoming final-round interview with the Director of the division I'd be interning under if I got the position. Per the recruiter, this is just to sort of solidify me as the right candidate for this role. I know that there won't be any sort of technical/coding aspect as it is just a 30-minute call. If anyone has any advice on how to approach it, it would be greatly appreciated! This is my first ever final round interview so any advice would be great! Thanks and have a great day! submitted by /u/deeht0xdagod
https://arstechnica.com/tech-policy/2024/04/feds-appoint-ai-doomer-to-run-us-ai-safety-institute/ Article intro: Appointed as head of AI safety is Paul Christiano, a former OpenAI researcher who pioneered a foundational AI safety technique called reinforcement learning from human feedback (RLHF), but is also known for predicting that "there's a 50 percent chance AI development could end in 'doom.'" While Christiano's research background is impressive, some fear that by appointing a so-called "AI doomer," NIST may be risking encouraging non-scientific thinking that many critics view as sheer speculation. submitted by /u/bregav
Hey All, I think many of us can agree that defining the hardware we want to use right next to the piece of code we are running is objectively a much better developer experience. I have always loved the idea of lowering the barrier when it comes to running code in the cloud. As more cloud abstractions hit the market, I was honestly really surprised by the lack of adoption. There aren't any unicorns (I don't think any actually) in this space yet, just series A businesses. After speaking with a handful of Data Scientists, Machine Learning Engineers, and DevOps Engineers, it started to dawn on me that risk aversion is causing most of the friction. Using a fully managed service can definitely have some upsides, and in many cases, I prefer using them, but convincing your boss to pipe petabytes of data to another company's cloud and incur 3-5x compute costs probably isn't going to sit well. There are also some open source alternatives but they are intentionally difficult to configure so you pay for their premium offerings that reduce config setup. Would love to hear everyone's thoughts, especially those who work at lean startups and global 5,000 companies. submitted by /u/Ok_Post_149
I am curious to know the job opportunities in Banking & Insurance for someone pursuing a PhD in Statistics, given the current market conditions. submitted by /u/SpiritualCellist4303
Is there a way to determine the degree of sphericity or hyperbolicity of the embeddings a feature extractor learns for a set of examples it has been trained on / will be tested on? I am new to geometry in deep learning. It would be amazing if anyone could also point me to a paper or a book to get started on this. Thanks in advance. submitted by /u/Mad_Scientist2027
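One concrete quantity that is often used for the hyperbolic side of this question is the Gromov δ-hyperbolicity of the embeddings' pairwise-distance matrix (a small δ relative to the diameter suggests tree-like / hyperbolic structure). The sketch below follows the standard Gromov-product / max-min-product estimate and uses random vectors as a stand-in for real learned embeddings; treat it as an assumption about what "degree of hyperbolicity" could mean, not a definitive recipe.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def delta_hyperbolicity(points):
    """Estimate Gromov delta-hyperbolicity of a point cloud (n_points, dim)."""
    D = squareform(pdist(points))            # pairwise Euclidean distances
    # Gromov products (x|y)_w with respect to the base point w = index 0.
    G = 0.5 * (D[0][np.newaxis, :] + D[0][:, np.newaxis] - D)
    # Max-min "matrix product": maxmin[x, y] = max_z min(G[x, z], G[z, y]).
    maxmin = np.max(np.minimum(G[:, :, np.newaxis], G[np.newaxis, :, :]), axis=1)
    return float(np.max(maxmin - G)), float(D.max())

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 64))      # placeholder for learned features
delta, diameter = delta_hyperbolicity(embeddings)
print(f"delta = {delta:.3f}, relative delta (2*delta/diam) = {2 * delta / diameter:.3f}")
```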
Paper: https://arxiv.org/abs/2104.10751 Package: https://github.com/sametcopur/ruleopt Documentation: https://ruleopt.readthedocs.io/
RuleOpt is an optimization-based rule learning algorithm designed for classification problems. Focusing on scalability and interpretability, RuleOpt utilizes linear programming for rule generation and extraction. The Python library ruleopt is capable of extracting rules from ensemble models, and it also implements a novel rule generation scheme. The library ensures compatibility with existing machine learning pipelines, and it is especially efficient for tackling large-scale problems. Here are a few highlights of ruleopt:
- Efficient Rule Generation and Extraction: Leverages linear programming for scalable rule generation (stand-alone machine learning method) and rule extraction from trained random forest and boosting models.
- Interpretability: Prioritizes model transparency by assigning costs to rules in order to achieve a desirable balance with accuracy.
- Integration with Machine Learning Libraries: Facilitates smooth integration with well-known Python libraries scikit-learn, LightGBM, and XGBoost, and existing machine learning pipelines.
- Extensive Solver Support: Supports a wide array of solvers, including Gurobi, CPLEX and OR-Tools.
submitted by /u/zedeleyici3401
I've been using LSTM models for time series forecasting and have noticed they perform well for predicting the immediate next step. However, when attempting multi-step predictions to forecast one week ahead (168 periods, with hourly data), the performance drops significantly. Currently, I'm using a recursive approach: feeding back the prediction as the next input (closed loop). This method isn't yielding good results, although open-loop predictions are much more accurate. Is there a better technique for enhancing LSTM's multi-step prediction accuracy? Are LSTMs not useful for multi-step forecasting? Any links or resources to articles explaining multi-step forecasting with LSTMs would be appreciated. (The original post included two forecast plots.) submitted by /u/StressAccomplished26
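One standard alternative worth naming here is "direct" multi-horizon forecasting: instead of recursively feeding each prediction back in (which compounds errors over 168 steps), the network is trained to emit the whole horizon in one shot. A minimal, hedged PyTorch sketch with assumed shapes and sizes, not the poster's actual setup:

```python
import torch
import torch.nn as nn

class DirectLSTMForecaster(nn.Module):
    def __init__(self, n_features: int = 1, hidden: int = 64, horizon: int = 168):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, horizon)    # all 168 hourly steps at once

    def forward(self, x):                          # x: (batch, lookback, n_features)
        _, (h, _) = self.lstm(x)                   # h: (num_layers, batch, hidden)
        return self.head(h[-1])                    # (batch, horizon)

model = DirectLSTMForecaster()
lookback = 336                                     # e.g. two weeks of hourly history
x = torch.randn(8, lookback, 1)                    # dummy input batch
y = torch.randn(8, 168)                            # dummy 168-step targets
loss = nn.MSELoss()(model(x), y)
loss.backward()                                    # trains just like the one-step model
print(model(x).shape)                              # torch.Size([8, 168])
```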
Hi, I'm one year into my first DS job at a big German company. I want to decide in which direction I want to develop myself career-wise and would like your opinion on that. Right now I do basic things like building ML models, big data analysis in PySpark, dashboards in Power BI, and I have also built small chatbots with Streamlit, LangChain, and some Azure resources. I know functional programming in Python but I never really learned object-oriented programming; is this maybe something I should go for? I don't really have a senior colleague right now who could create a plan for me, it's a bit of a weird hierarchy there, so I'm super thankful for any input 🙂 Thank you! submitted by /u/Florida-Rolf
Dive into the transformative power of data science in the world of American football with Eric Eager, PhD's "Using Data Science to Better Evaluate American Football Players." In this presentation, Dr. Eager showcases how the sport is evolving through advanced analytics. 🏈💻 From play-by-play and charting data to the revolutionary potential of player tracking data, discover the cutting-edge techniques that are setting the stage for a new era in football analysis. https://www.youtube.com/watch?v=8lwFUO_yj7c submitted by /u/Data_Nerd1979
Hi, I'm a student learning data science. I see a few of my mates making projects with generative AI tools like LangChain or the OpenAI API, etc. But this is what I think, and I want to know whether what I think is correct or not. Knowing how to use generative AI frameworks does not demonstrate that you know deep learning or even basic machine learning. I think building projects with generative AI frameworks only demonstrates that you know how to code by reading some docs. I think anyone who knows basic programming can make an "AI summarizer" or "AI chatbot" using LangChain. I don't feel that making such projects can make me stand out in any way for machine learning jobs. I would rather make a basic data science project which at least tries to solve some real business problem. submitted by /u/Medium_Alternative50
Paper: https://arxiv.org/abs/2404.07738 Abstract: Scientific Research, vital for improving human life, is hindered by its inherent complexity, slow pace, and the need for specialized experts. To enhance its productivity, we propose a ResearchAgent, a large language model-powered research idea writing agent, which automatically generates problems, methods, and experiment designs while iteratively refining them based on scientific literature. Specifically, starting with a core paper as the primary focus to generate ideas, our ResearchAgent is augmented not only with relevant publications through connecting information over an academic graph but also entities retrieved from an entity-centric knowledge store based on their underlying concepts, mined and shared across numerous papers. In addition, mirroring the human approach to iteratively improving ideas with peer discussions, we leverage multiple ReviewingAgents that provide reviews and feedback iteratively. Further, they are instantiated with human preference-aligned large language models whose criteria for evaluation are derived from actual human judgments. We experimentally validate our ResearchAgent on scientific publications across multiple disciplines, showcasing its effectiveness in generating novel, clear, and valid research ideas based on human and model-based evaluation results. submitted by /u/SeawaterFlows
Paper: https://arxiv.org/abs/2404.09967 Code: https://github.com/HL-hanlin/Ctrl-Adapter Models: https://huggingface.co/hanlincs/Ctrl-Adapter Project page: https://ctrl-adapter.github.io/ Abstract: ControlNets are widely used for adding spatial control in image generation with different conditions, such as depth maps, canny edges, and human poses. However, there are several challenges when leveraging the pretrained image ControlNets for controlled video generation. First, pretrained ControlNet cannot be directly plugged into new backbone models due to the mismatch of feature spaces, and the cost of training ControlNets for new backbones is a big burden. Second, ControlNet features for different frames might not effectively handle the temporal consistency. To address these challenges, we introduce Ctrl-Adapter, an efficient and versatile framework that adds diverse controls to any image/video diffusion models, by adapting pretrained ControlNets (and improving temporal alignment for videos). Ctrl-Adapter provides diverse capabilities including image control, video control, video control with sparse frames, multi-condition control, compatibility with different backbones, adaptation to unseen control conditions, and video editing. In Ctrl-Adapter, we train adapter layers that fuse pretrained ControlNet features to different image/video diffusion models, while keeping the parameters of the ControlNets and the diffusion models frozen. Ctrl-Adapter consists of temporal and spatial modules so that it can effectively handle the temporal consistency of videos. We also propose latent skipping and inverse timestep sampling for robust adaptation and sparse control. Moreover, Ctrl-Adapter enables control from multiple conditions by simply taking the (weighted) average of ControlNet outputs. With diverse image/video diffusion backbones (SDXL, Hotshot-XL, I2VGen-XL, and SVD), Ctrl-Adapter matches ControlNet for image control and outperforms all baselines for video control (achieving the SOTA accuracy on the DAVIS 2017 dataset) with significantly lower computational costs (less than 10 GPU hours). submitted by /u/SeawaterFlows
Hello! I am a researcher in computational neuroscience, looking to apply some contemporary machine learning techniques to fMRI timeseries data. I have a collection of high-dimensional 4D fMRI timeseries data collected while subjects were observing naturalistic images from COCO at regular intervals. We currently have decoding models that take preprocessed "snapshots" of this timeseries data flattened into an activation pattern that is aggregated over the short period the image was being observed, and use some machine learning models to decode and reconstruct the image content from the brain. (See some of my recent work). I am curious what sort of machine learning techniques exist that might be able to address the time-series data itself, without having to collapse the timeseries to a single snapshot to perform our decoding process. What I am envisioning is a model (perhaps a transformer) that can take as input a high-dimensional multichannel timeseries and output a flattened latent representation (say, a CLIP vector) corresponding to an image stimulus, or even a series of latent vectors separated by a known regular interval (as we have in our data for the different image presentations). To my knowledge, most of the work in machine learning with time series data is in forecasting, but what I want is a static (or potentially repetitive) output. My hope is that the more detailed timeseries data will have additional signal that will boost decoding performance for fMRI vision decoding. Is there any existing work in the field of ML that has tackled a similar problem? submitted by /u/reesespike
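As a rough illustration of the kind of architecture being described: a transformer encoder with a learned readout token can map a (batch, timepoints, channels) timeseries to a single fixed-size latent vector that could then be regressed against CLIP embeddings. Every size below (channel count, model width, latent dimension) is a placeholder assumption, and the sketch is untested against real fMRI data:

```python
import torch
import torch.nn as nn

class TimeseriesToLatent(nn.Module):
    def __init__(self, n_channels: int = 1024, d_model: int = 256,
                 latent_dim: int = 512, n_layers: int = 4, max_len: int = 64):
        super().__init__()
        self.input_proj = nn.Linear(n_channels, d_model)     # project voxels/ROIs per TR
        self.pos_emb = nn.Parameter(torch.zeros(1, max_len + 1, d_model))
        self.cls_token = nn.Parameter(torch.zeros(1, 1, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.to_latent = nn.Linear(d_model, latent_dim)       # e.g. a CLIP-sized target

    def forward(self, x):                                     # x: (batch, T, n_channels)
        b, t, _ = x.shape
        tokens = self.input_proj(x)                           # (batch, T, d_model)
        cls = self.cls_token.expand(b, -1, -1)
        tokens = torch.cat([cls, tokens], dim=1) + self.pos_emb[:, : t + 1]
        encoded = self.encoder(tokens)                        # (batch, T + 1, d_model)
        return self.to_latent(encoded[:, 0])                  # read out the CLS position

model = TimeseriesToLatent()
dummy = torch.randn(2, 30, 1024)   # 2 runs, 30 TRs, 1024 ROI channels (all made up)
print(model(dummy).shape)          # torch.Size([2, 512]) -> train against CLIP vectors
```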
I have built a model for detecting outliers in a data series using ML.NET. I read on the dev forums that ML.NET, using DirectML, can support the new NPUs built into the new Core Ultra processors from Intel. I have not been able to find evidence that this is true for AutoML from the Microsoft team. Does anyone have experience using AutoML with the DirectML backend? submitted by /u/tradingnumbers
I am wondering if there is some sort of multilevel model that I can use to identify likely buyers of specific products or create a lookalike audience. The issue is that I have thousands of products and around a million users, so it would be computationally infeasible to create a model for every product. The structure I am thinking of is that the first level is a product and the next level is all the users in my database. Is there some sort of ML algo I could use to achieve this? submitted by /u/Terrible-Hamster-342
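One common way to avoid training a model per product is a single shared model in which each product gets a learned embedding (playing the role of the product-level effect) and a user tower maps user features into the same space; scoring is a dot product, and nearest-neighbour search in that space gives a lookalike audience per product. A minimal PyTorch sketch with made-up sizes, offered as one possible direction rather than the answer:

```python
import torch
import torch.nn as nn

class ProductUserScorer(nn.Module):
    def __init__(self, n_products: int, n_user_features: int, dim: int = 32):
        super().__init__()
        self.product_emb = nn.Embedding(n_products, dim)      # per-product "random effect"
        self.user_tower = nn.Sequential(
            nn.Linear(n_user_features, 64), nn.ReLU(), nn.Linear(64, dim)
        )

    def forward(self, product_ids, user_features):
        p = self.product_emb(product_ids)                     # (batch, dim)
        u = self.user_tower(user_features)                    # (batch, dim)
        return (p * u).sum(dim=-1)                            # purchase-propensity logit

model = ProductUserScorer(n_products=5000, n_user_features=20)
logits = model(torch.randint(0, 5000, (16,)), torch.randn(16, 20))
loss = nn.BCEWithLogitsLoss()(logits, torch.randint(0, 2, (16,)).float())
loss.backward()   # one model covers every product; user embeddings near a product's
                  # embedding form its lookalike audience
```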