What are some ways to increase precision or recall in machine learning?


What are some ways to Boost Precision and Recall in Machine Learning?

Sensitivity vs Specificity?


In machine learning, recall is the ability of the model to find all relevant instances in the data while precision is the ability of the model to correctly identify only the relevant instances. A high recall means that most relevant results are returned while a high precision means that most of the returned results are relevant. Ideally, you want a model with both high recall and high precision but often there is a trade-off between the two. In this blog post, we will explore some ways to increase recall or precision in machine learning.
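As a rough illustration of the two definitions above, both metrics can be computed from counts of true positives (TP), false positives (FP), and false negatives (FN). The function names and the spam-filter counts below are made up for this sketch and do not come from any particular library or dataset.

```python
def precision(tp, fp):
    """Share of predicted positives that are actually positive: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Share of actual positives that the model found: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical spam-filter counts, chosen only for illustration.
tp, fp, fn = 40, 10, 20

print(f"precision = {precision(tp, fp):.2f}")  # 40 / (40 + 10)
print(f"recall    = {recall(tp, fn):.2f}")     # 40 / (40 + 20)
```

Returning 0.0 when the denominator is zero is one common convention; some tools report the value as undefined instead.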



How to increase recall:

Recall is TP / (TP + FN), so it goes up when the model catches more of the actual positives: true positives increase and false negatives decrease. The simplest way to do this is to lower your threshold for what constitutes a positive prediction. For example, if you are trying to predict whether or not an email is spam, you might lower the threshold for what constitutes spam so that more emails are classified as spam. This will usually result in more false positives (emails that are not actually spam being classified as spam), which costs precision, but it will also increase recall (more actual spam emails being classified as spam).


To decrease the number of false negatives,

you lower, rather than raise, your threshold for what constitutes a positive prediction. Raising the threshold has the opposite effect. For example, going back to the spam email prediction example, if you raise the threshold for what constitutes spam, fewer emails are classified as spam. This will result in more false negatives (actual spam emails not being classified as spam) and will therefore decrease recall (fewer actual spam emails being classified as spam).
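The effect of the threshold on recall can be sketched with a small example. The scores and labels below are invented toy data, and precision_recall is a hypothetical helper written for this post; it assumes the model outputs a spam score between 0 and 1 and that an email is classified as spam when its score is at or above the threshold.

```python
def precision_recall(scores, labels, threshold):
    """Return (precision, recall) when scores >= threshold are predicted spam."""
    tp = fp = fn = 0
    for score, is_spam in zip(scores, labels):
        predicted_spam = score >= threshold
        if predicted_spam and is_spam:
            tp += 1
        elif predicted_spam:
            fp += 1
        elif is_spam:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data: model scores and true labels (1 = spam, 0 = not spam).
scores = [0.95, 0.85, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

for threshold in (0.8, 0.5, 0.25):
    p, r = precision_recall(scores, labels, threshold)
    print(f"threshold={threshold:.2f}  precision={p:.2f}  recall={r:.2f}")
```

On this toy data, lowering the threshold from 0.8 to 0.25 raises recall from 0.50 to 1.00 while precision falls from 1.00 to about 0.67.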


How to increase precision:

Precision is TP / (TP + FP), so it goes up when false positives fall relative to true positives. The usual way to do this is to raise your threshold for what constitutes a positive prediction. For example, using the spam email prediction example again, you might raise the threshold for what constitutes spam so that fewer emails are classified as spam. This will usually result in fewer false positives (non-spam emails being classified as spam) and therefore higher precision, but it will also tend to decrease recall (some actual spam emails no longer being classified as spam).


To decrease the number of true negatives,

you can lower your threshold for what constitutes a positive prediction, but doing so works against precision rather than for it. For example, going back to the spam email prediction example once more, you might lower the threshold for what constitutes spam so that more emails are classified as spam. This will result in fewer true negatives (emails that are not actually spam not being classified as spam), but those emails become false positives, so precision will decrease (more non-spam emails being classified as spam). True negatives do not appear in the precision formula at all.
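One way to put this into practice is to search for the lowest threshold that still meets a precision target, which keeps as much recall as possible. This is only a sketch under assumptions: the data is a toy example, and precision_at and lowest_threshold_for_precision are hypothetical helpers, not functions from a real library.

```python
def precision_at(scores, labels, threshold):
    """Precision when scores >= threshold are predicted positive (None if nothing is)."""
    predicted = [label for score, label in zip(scores, labels) if score >= threshold]
    return sum(predicted) / len(predicted) if predicted else None


def lowest_threshold_for_precision(scores, labels, target):
    """Scan candidate thresholds in ascending order; return the first that meets the target."""
    for threshold in sorted(set(scores)):
        p = precision_at(scores, labels, threshold)
        if p is not None and p >= target:
            return threshold
    return None  # no threshold reaches the target precision


# Toy data: model scores and true labels (1 = spam, 0 = not spam).
scores = [0.95, 0.85, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

print(lowest_threshold_for_precision(scores, labels, 0.75))
print(lowest_threshold_for_precision(scores, labels, 0.90))
```

On this toy data precision is not strictly increasing in the threshold (0.80 at a threshold of 0.40, but 0.75 at 0.60), which is why the sketch checks every candidate instead of assuming that a higher threshold always gives higher precision.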


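To summarize,

there are a few ways to increase precision or recall in machine learning. One is to choose an evaluation metric that matches your goal, because the metric you optimize determines which model and threshold you end up selecting. For example, the F1 score is the harmonic mean of precision and recall and rewards a balance between the two; it does not maximize precision on its own. Another way to increase precision or recall is to adjust the classification threshold, which moves the decision boundary: raising it generally favors precision, and lowering it favors recall. If no threshold gives an acceptable trade-off, using a different algorithm altogether may help.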
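To make the metric choice concrete, here is a minimal sketch of the F-beta score, the standard generalization of F1 in which beta controls how much weight recall gets relative to precision. The function name f_beta and the input values are illustrative assumptions, not taken from a specific library.

```python
def f_beta(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall.

    beta = 1 gives F1; beta < 1 weights precision more; beta > 1 weights recall more.
    """
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# A balanced model and a high-precision, low-recall model (illustrative values).
print(f"F1   (p=0.75, r=0.75) = {f_beta(0.75, 0.75):.3f}")
print(f"F0.5 (p=1.00, r=0.50) = {f_beta(1.0, 0.5, beta=0.5):.3f}")
print(f"F2   (p=1.00, r=0.50) = {f_beta(1.0, 0.5, beta=2.0):.3f}")
```

The same high-precision, low-recall model scores about 0.833 under F0.5 but only about 0.556 under F2, so the choice of beta changes which model looks best.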


Sensitivity vs Specificity

In machine learning, sensitivity and specificity are two measures of the performance of a model. Sensitivity is the proportion of actual positives that the model correctly predicts as positive (it is the same quantity as recall), while specificity is the proportion of actual negatives that the model correctly predicts as negative.
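As a small sketch of these two definitions, both can be computed from confusion-matrix counts. The counts are hypothetical and the function names are chosen for this example only.

```python
def sensitivity(tp, fn):
    """True positive rate: TP / (TP + FN). Identical to recall."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tn, fp):
    """True negative rate: TN / (TN + FP)."""
    return tn / (tn + fp) if (tn + fp) else 0.0


# Hypothetical counts: 60 actual positives and 40 actual negatives.
tp, fn, tn, fp = 40, 20, 30, 10

print(f"sensitivity = {sensitivity(tp, fn):.2f}")  # 40 / 60
print(f"specificity = {specificity(tn, fp):.2f}")  # 30 / 40
```

Unlike precision, specificity does use the true negatives, so it is the metric to watch when correctly clearing negatives matters.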

Google Colab For Machine Learning

State of the Google Colab for ML (October 2022)

Google introduced computing units, which you purchase much as you would buy compute capacity from AWS or Azure. With Pro you get 100 computing units, and with Pro+ you get 500. Your choice of GPU or TPU and the High-RAM option affect how many computing units you use per hour. If you don’t have any computing units, you can’t use “Premium” tier GPUs (A100, V100), and even the P100 is not a viable option.


Google Colab Pro+ comes with the Premium tier GPU option, whereas in Pro, if you have computing units, you are randomly connected to a P100 or T4. After you use all of your computing units, you can buy more, or you can use a T4 GPU for half or most of the time (there can be many times in the day when you can’t get a T4 or any kind of GPU). In the free tier, the GPUs offered are mostly K80 and P4, which perform similarly to a 750 Ti (an entry-level GPU from 2014) but with more VRAM.

For reference, a T4 uses around 2 computing units per hour and an A100 around 15.
As far as is currently known, the computing-unit cost of a GPU tends to fluctuate based on some unknown factor.
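To put those figures in perspective, here is a back-of-the-envelope sketch of how many GPU hours each plan's included units would buy. It uses only the approximate numbers quoted in this post (100 and 500 units, about 2 and 15 units per hour) and assumes a constant hourly rate, which the post itself says may not hold.

```python
# Approximate figures quoted in this post; actual rates may fluctuate.
UNITS_INCLUDED = {"Pro": 100, "Pro+": 500}
UNITS_PER_HOUR = {"T4": 2, "A100": 15}


def gpu_hours(plan, gpu):
    """Estimated hours of GPU time covered by a plan's included units."""
    return UNITS_INCLUDED[plan] / UNITS_PER_HOUR[gpu]


for plan in UNITS_INCLUDED:
    for gpu in UNITS_PER_HOUR:
        print(f"{plan:4} on {gpu:4}: about {gpu_hours(plan, gpu):.1f} hours")
```

Under these assumptions, Pro covers roughly 50 hours on a T4 but fewer than 7 hours on an A100.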

Considering those:

  1. For hobbyists and (under)graduate coursework, it is better to use your own GPU if you have one with more than 4 GB of VRAM that is faster than a 750 Ti, or at least to purchase Colab Pro so you can reach a T4 even when you have no computing units remaining.
  2. For small research companies, non-trivial research at universities, and probably most other people, Colab is probably no longer a good option.
  3. Colab Pro+ can be considered if you want Pro but don’t sit in front of your computer, since Pro disconnects after 90 minutes of inactivity on your computer. But this can be overcome to some extent with scripts, so most of the time Colab Pro+ is not a good option.

If you have anything to add, please let me know so I can update this post. Thanks!

Conclusion:


In machine learning, precision and recall trade off against each other; increasing one often decreases the other. There is no single silver bullet solution for increasing either precision or recall; it depends on your specific use case which one is more important and which methods will work best for boosting whichever metric you choose. In this blog post, we explored some methods for increasing either precision or recall; hopefully this gives you a starting point for improving your own models!

 


Machine Learning and Data Science Breaking News 2022 – 2023

  • The end of finetuning
    by /u/rhiever (Data Science) on May 14, 2026 at 5:29 pm

    submitted by /u/rhiever [link] [comments]

  • [N] LangChain Interrupt 2026 announcements [N]
    by /u/Equal_Winter3150 (Machine Learning) on May 14, 2026 at 5:20 pm

    LangChain just wrapped Day 1 of Interrupt 2026 and announced a few things worth knowing about: SmithDB — A purpose-built distributed database for agent observability. The problem they're solving: agent traces are getting too large and complex for general-purpose databases. SmithDB is built with Rust, Apache DataFusion, and Vortex, designed specifically for multimodal content and long-span tracing. They're reporting P50 latency of 92ms for loading trace trees and 400ms for full-text search, with up to 12x speedup over previous LangSmith performance. Architecture is object storage + small Postgres metadata store + stateless services, so it scales elastically and can be self-hosted. Context Hub — A centralized system for managing agent context (AGENTS.md files, skills, policies, memory) in LangSmith. The interesting part is they're working with MongoDB, Pinecone, Elastic, and Redis on an open standard for agent memory — covering episodic, semantic, and procedural memory with versioning and portability across frameworks. Deep Agents v0.6 — New release includes ContextHubBackend integration, an installable code interpreter that gives agents a programmable workspace inside the agent loop (distinct from sandboxes — this is for composing tools and managing state within the reasoning process), and you can scope specific file paths to different backends. The conference also has production case studies from Toyota, Coinbase, Lyft, LinkedIn, Bridgewater Associates, and others on deploying agents at enterprise scale. Andrew Ng keynoted alongside Harrison Chase. submitted by /u/Equal_Winter3150 [link] [comments]

  • I think I need to rethink my career roadmap
    by /u/prattman333 (Data Science) on May 14, 2026 at 1:13 pm

    I had a meeting today that basically gave me an existential crisis. I spent most of the morning cleaning a mess of a dataset and building out what I thought was a pretty slick visualisation on consumer behaviour. I go into the meeting, present the findings, and instead of receiving questions about methodology as I expected, my manager asked me how to show him the actual strategy, which i never thought was part of my role in the first place. Actually, I would prefer no questions at all lol. Anyway, I am doing the technical work behind the scenes and it seems that it’s kind of invisible for everyone else. In fact, I am getting more requests on giving my input on strategy and consumer psychology lately, so I started doing some research. It’s actually interesting how everything changes, but also quite overwhelming because I really do not like the storytelling part. Usually, I do my bit, present it, and I’m out lol. What I wanted to share with you here is that while this situation is definitely not in my advantage, I started to do some digging and found some really interesting perspectives on this and what expectations organisations have now with the massive implementation of AI everywhere. I use AI daily and it makes my work sooooo much easier, but using AI is not enough anymore apparently. Here it is: https://www.qualtrics.com/articles/strategy-research/market-research-trends/ The main idea here is that technical skills are the baseline, not the real value added to the organisation...??? Does anyone else feel like the goalposts are moving? I’m genuinely wondering if I should stop grinding LeetCode and start reading business strategy books just to stay relevant. Would love to hear if your roles are actually changing or if I'm just overthinking one bad meeting. submitted by /u/prattman333 [link] [comments]

  • Would a 2000-2021 ML paper even get accepted today? [D]
    by /u/Hope999991 (Machine Learning) on May 14, 2026 at 11:39 am

    I keep hearing some version of this: “A paper that got accepted years ago wouldn’t stand a chance today.” Honestly, for a lot of ML subfields, this doesn’t sound crazy anymore. A paper that once looked solid can now look under-evaluated, under-ablated, weak on baselines, or just too obvious. So maybe the real claim is: A mediocre accepted ML paper from years ago would probably get rejected today. Do people agree? Has the bar actually gone up, or has the field just become more crowded and more competitive? submitted by /u/Hope999991 [link] [comments]

  • Continual Harness: Online Adaptation for Self-Improving Foundation Agents [R]
    by /u/PokeAgentChallenge (Machine Learning) on May 14, 2026 at 3:45 am

    https://preview.redd.it/p9cd2zmfy01h1.png?width=2000&format=png&auto=webp&s=a8e99bac438c2505d97ed3716983aa731da855f8 Sharing a new paper from the GPP and PokeAgent teams. Gemini Plays Pokémon (GPP) was the first AI system to complete Pokémon Blue, Yellow Legacy on hard mode, and Crystal without losing a battle. How? Early signs of iterative harness development. In the Blue era a human watched the stream and edited the harness. By Yellow Legacy and Crystal, the model itself was performing most of the editing through general meta-tools (define_agent, run_code, notepad edits). Our new paper, Continual Harness: Online Adaptation for Self-Improving Foundation Agents, formalizes the loop and automates the refining role end to end. We then carry the same loop into training, enabling model-harness co-learning. The takeaways: 1. Iterative harness refinement closes most of the gap to a hand-engineered version. 2. Long-horizon agency requires self-refinement, and self-refinement requires a useful model. 3. The future of agents is model-harness co-learning. Paper (arXiv). https://arxiv.org/abs/2605.09998 Article (Substack). https://sethkarten.substack.com/p/gemini-plays-pokemon-discovered-something Project page (video demos). https://sethkarten.ai/continual-harness submitted by /u/PokeAgentChallenge [link] [comments]

  • Trained transformer-based chess models to play like humans (including thinking time) [P]
    by /u/hazard02 (Machine Learning) on May 13, 2026 at 10:08 pm

    I trained a set of deep learning (transformer-based) chess models to play like humans (inspired by MAIA and Grandmaster Chess Without Search). There's a separate model for each 100-point rating bucket from ~800 to 2500+. I started with training a mid-strength model from scratch on a 8xH100 cluster, then fine-tuned models for the other rating ranges on my local 5090 GPU. The total training size was nearly a year of Lichess data, about 1B total games. Each rating range actually has 3 models: A move model, a thinking time model, and a white win / draw / black win model. Despite being quite small (only 9MM parameters!) the move models achieve better accuracy than MAIA-2 and are approximately on par with MAIA-3 (see here for MAIA-2 comparison). AFAIK this is the only attempt to train on thinking times in chess, so I don't have a benchmark to compare against for that. Likely because of the network size, at high ratings the models aren't quite as good as they could be. They see short tactical motifs but can't do deep calculation - probably a bigger model would help here. The move and win models take into account player ratings and clock times. For instance, under extreme time pressure a much stronger player has a lower win prob even if their opponent is weaker. The models blunder more under time pressure as well. The data pipeline is C++ via nanobind, then training with Pytorch. Getting this right was actually the thing I spent the most time on. Pre-shuffling the dataset and then being able to read the shuffled dataset sequentially at training time kept the GPU utilization high. Without this it spent a huge percentage of time on I/O while the GPU sat idle. Happy to answer questions about the rating-conditioning, the clock model, or the data pipeline. Code (including training code and model weights) is at https://github.com/thomasj02/1e4_ai/. A demo is at https://1e4.ai/ but all the frontend code is also in the repo if you want to self-host. 
submitted by /u/hazard02 [link] [comments]

  • Scenema Audio: Zero-shot expressive voice cloning and speech generation [N]
    by /u/a__side_of_fries (Machine Learning) on May 13, 2026 at 9:29 pm

    We've been building Scenema Audio as part of our video production platform at scenema.ai, and we're releasing the model weights and inference code. The core idea: emotional performance and voice identity are independent. You describe how the speech should be performed (rage, grief, excitement, a child's wonder), and optionally provide reference audio for voice identity. The reference provides the "who." The prompt provides the "how." Any voice can perform any emotion, even if that voice has never been recorded in that emotional state. Limitations (and why we still use it) This is a diffusion model, not a traditional TTS pipeline. Common issues include repetition and gibberish on some seeds. Different seeds give different results, and you will not get a perfect output with 0% error rate. This model is meant for a post-editing workflow: generate, pick the best take, trim if needed. Same way you'd work with any generative model. That said, we keep coming back to Scenema Audio over even Gemini 3.1 Flash TTS, which is already more controllable than most TTS systems out there. The reason is simple: the output just sounds more natural and less robotic. There's a quality to diffusion-generated speech that autoregressive TTS doesn't quite match, especially for emotional delivery. Audio-first video generation As this video points out, generating audio first and then using it to drive video generation is a powerful workflow. That's actually how we've used Scenema Audio in some cases. Generate the voice performance, then feed it into an A2V pipeline (LTX 2.3, Wan 2.6, Seedance 2.0, etc.) to generate video that matches the speech. Here's an example of that workflow in action. On distillation and speed A few people have asked this. Our bottleneck is not denoising steps. The diffusion pass is a small fraction of total generation time. The real costs are elsewhere in the pipeline. 
We're already at 8 steps (down from 50 in the base model), and that's the sweet spot where quality holds. Prompting matters This model is sensitive to prompting, the same way LTX 2.3 is for video. A generic voice description gives you generic output. A specific, theatrical description with action tags gives you a performance. There's also a pace parameter that controls how much time the model gets per word. Takes some experimentation to find what works for your use case, but once you do, you can generate hours of audio with minimal quality loss. Complex words and proper nouns benefit from phonetic spelling. Unlike traditional TTS, it doesn't have a phoneme-to-audio pipeline or a pronunciation dictionary. If it garbles "Tchaikovsky," you would spell it "Chai-koff-skee" or whatever makes sense to you. Docker REST API with automatic VRAM management We ship this as a Docker container with a REST API. Same setup we use in production on scenema.ai. The service auto-detects your GPU and picks the right configuration: VRAM Audio Model Gemma Notes 16 GB INT8 (4.9 GB) CPU streaming Needs 32 GB system RAM 24 GB INT8 (4.9 GB) NF4 on GPU Default config 48 GB bf16 (9.8 GB) bf16 on GPU Best quality We went with Docker because that's how we serve it. No dependency hell, no conda environments. Pull, set your HF token for Gemma access, then docker compose up. ComfyUI Native ComfyUI node support is planned. We're hoping to release it in the coming weeks, unless someone from the community beats us to it. In the meantime, the REST API is straightforward to call from a custom node since it's just a local HTTP service. Links All demos + article: scenema.ai/audio Model weights: huggingface.co/ScenemaAI/scenema-audio Code + setup: github.com/ScenemaAI/scenema-audio YouTube demo: youtu.be/VnEQ_ImOaAc This is fully open source. The model weights derive from the LTX-2 Community License but all inference and pipeline code is MIT. submitted by /u/a__side_of_fries [link] [comments]

  • Have the "on-hold" durations been getting longer for arXiv submissions? [D]
    by /u/Megixist (Machine Learning) on May 13, 2026 at 6:51 pm

    I have a paper that has been "on-hold" for about 2 weeks now. I understand that it might take a little longer now because of inundation of AI generated low-effort papers but my papers have gone from "on-hold" to "submitted" within a couple of days in the past. Wondering if anyone else is facing the same issue. submitted by /u/Megixist [link] [comments]

  • EEML Summer School (Eastern European ML) - Anyone here got accepted? [D]
    by /u/ade17_in (Machine Learning) on May 13, 2026 at 4:55 pm

    Has anyone got into EEML Summer School in Montenegro? I did and please feel free to DM to manage stay or other plans after the summer school. I see that it's tricky to get there and find a stay. submitted by /u/ade17_in [link] [comments]

  • Best examples of ML projects with good dataset/task code abstractions? [D]
    by /u/LetsTacoooo (Machine Learning) on May 13, 2026 at 2:57 pm

    I am working on a benchmark and need to manage several interlocking components: datasets and metadata, diverse ML tasks (varying inputs and outputs), and baseline experiments covering models, training, and evaluations. Any pointers to projects that handle these through clean/minimal data structures like Dataclasses or Pydantic. Specifically, I want to see how others manage: Dataset Information: Representing dataset cards, metadata, and split definitions as first-class objects. Task Schemas: Defining ML tasks with specific input and output types to ensure consistency across different models. Experiment Composition: Structures that link a model and training configuration to a specific evaluation and prediction set. If you have seen repositories that maintain these abstractions with minimal boilerplate and high type safety, please share them. I am interested in internal code organization rather than external tools like W&B or MLflow. Definitely aware of cookie-cutter data-science, looking for for datastructures. submitted by /u/LetsTacoooo [link] [comments]

  • Human-level performance via ML was *not* proven impossible with complexity theory [D]
    by /u/mike_uoftdcs (Machine Learning) on May 13, 2026 at 2:50 pm

    Van Rooij, Guest, de Haan, Adolfi, Kolokolova, and Rich claimed to have proven that AGI via ML is impossible in Computational Brain & Behavior in 2024. The basic idea was to try to reduce a known NP-hard problem to the problem of learning a human-level classifier from data. The purported result, called "Ingenia Theorem" by the authors, made some noise on the internet, including here. My paper showing that the proof is irreparably broken is now also out in CBB (ungated preprint here). The basic issue is that "human-level classifier" is not mathematically defined, which the authors solve by ... never defining it. They have a construct that corresponds to "distribution of human situation-behaviour tuples" when they introduce the problem, but the construct then gets swapped out for "for all polytime-sampleable distributions" when it comes time to doing the formal proof. This means that the paper, if you find-and-replace human situation-behavior tuples for ImageNet inputs/labels, also proves that learning to classify ImageNet is intractable. Blogpost discussion similar attempts from Penrose to Chomsky here. submitted by /u/mike_uoftdcs [link] [comments]

  • Built Support Vector Machine(SVM) from scratch in Rust [P]
    by /u/Yeet132416 (Machine Learning) on May 13, 2026 at 2:23 pm

    Built my own SVM classifier from scratch in Rust. It uses SMO optimization, have linear and rbf kernel, uses grid search to tune the hyperparameters. I tested it on two datasets one using Linear dataset and other using RBF, these were the results: Dataset Kernel Accuracy Recall F1 Banknote Auth Linear 96% 94% 95% Breast Cancer RBF 93% 100% 92% https://preview.redd.it/uw26u1uo0w0h1.jpg?width=720&format=pjpg&auto=webp&s=1784e1d7d310a26fa67efc63fa5191f45433a695 https://preview.redd.it/o0ahkq7p0w0h1.jpg?width=720&format=pjpg&auto=webp&s=dcb1053c34931d11b82831c6ad8cd4755ebc5816 The plot.rs file, used for plotting only was written using AI as I could not wrap my head around plotters crate, apart from that everything was by my own. Repo Link: Github Repo Happy to get some feedback! submitted by /u/Yeet132416 [link] [comments]

  • Training a number-aware embedding model + Text JEPA doesn't work too well + Text auto-encoders have a strange frequency bias [R][P]
    by /u/Academic_Sleep1118 (Machine Learning) on May 13, 2026 at 11:55 am

    Hi guys! I've spent 1y trying to predict company growth from the full text of their 10-k filings. It completely failed. But I've had a lot of fun playing with encoder transformers and making them good at numbers (bypassing the tokenizer/prediction head for numbers). I've MLM-trained a modified ModernBERT for this and it works really well. The model is available on HF: https://huggingface.co/edereynal/financial_bert Then, I've made this MLM-trained model into a nice sequence embedder. I've experimented with JEPA, but it failed. The auto-encoder setup worked much better. But I encountered a strange frequency bias, where the decoder only cared about high-frequency information, and I had to mitigate it by adding a Contrastive Loss term. I also investigated the tendency of transformers to have a low effective-dimensionality output space (compared to its input embedding space). So, here's the technical blog post, that reads a bit like "how to waste 1,000 hours and $400 trying to solve an unsolvable real-world problem, but having a lot of fun along the way": https://www.eloidereynal.com/p/i-spent-1-year-trying-to-predict submitted by /u/Academic_Sleep1118 [link] [comments]

  • Elastic Attention Cores for Scalable Vision Transformers [R]
    by /u/44seconds (Machine Learning) on May 13, 2026 at 11:51 am

    Wanted to share our latest paper on an alternative building block for Vision Transformers. Illustration of our model's accuracy and dense features Traditional ViTs utilize dense (N2) self-attention, which can become pretty costly at higher resolutions. In this work, we propose an alternative backbone with a core-periphery block-sparse attention structure that scales as (2NC + C2) for C core tokens. We further train this using nested dropout, which enables test-time elastic adjustments to the inference cost. The whole model can achieve very competitive dense & classification accuracy compared with DINOv3, and is stable across resolutions (256 all the way to 1024). Interestingly, the core-dense attention patterns exhibit strong emergent behavior. At early layers of the network the attention maps are isotropic (spherical), but become increasingly semantically aligned deeper into the network. Visual Elastic Core Attention paper abstract While adjusting the number of core tokens, if you decrease the number of cores, the attention patterns become more diffuse & cover a spatially larger region. If you increase the number of core tokens, the attention patterns become smaller & more concentrated. Paper: https://arxiv.org/abs/2605.12491 Project with the code (still in progress): https://github.com/alansong1322/VECA Happy to answer any questions about our research. submitted by /u/44seconds [link] [comments]

  • Learning, Fast and Slow: Towards LLMs That Adapt Continually [R]
    by /u/LakshyAAAgrawal (Machine Learning) on May 13, 2026 at 10:38 am

    Large language models (LLMs) are trained for downstream tasks by updating their parameters (e.g., via RL). However, updating parameters forces them to absorb task-specific information, which can result in catastrophic forgetting and loss of plasticity. In contrast, in-context learning with fixed LLM parameters can cheaply and rapidly adapt to task-specific requirements (e.g., prompt optimization), but cannot by itself typically match the performance gains available through updating LLM parameters. There is no good reason for restricting learning to being in-context or in-weights. Moreover, humans also likely learn at different time scales (e.g., System 1 vs 2). To this end, we introduce a fast-slow learning framework for LLMs, with model parameters as "slow" weights and optimized context as "fast" weights. These fast "weights" can learn from textual feedback to absorb the task-specific information, while allowing slow weights to stay closer to the base model and persist general reasoning behaviors. Fast-Slow Training (FST) is up to 3x more sample-efficient than only slow learning (RL) across reasoning tasks, while consistently reaching a higher performance asymptote. Moreover, FST-trained models remain closer to the base LLM (up to 70% less KL divergence), resulting in less catastrophic forgetting than RL-training. This reduced drift also preserves plasticity: after training on one task, FST trained models adapt more effectively to a subsequent task than parameter-only trained models. In continual learning scenarios, where task domains change on the fly, FST continues to acquire each new task while parameter-only RL stalls. https://arxiv.org/abs/2605.12484v1 submitted by /u/LakshyAAAgrawal [link] [comments]

  • Learnings From Crawling Technical Documentation
    by /u/rhazn (Data Science) on May 13, 2026 at 9:01 am

    submitted by /u/rhazn

  • Sharing all KGC 2026 decks. More production-grade KG systems than I've seen at any conference. [D]
    by /u/Ok_Gas7672 (Machine Learning) on May 13, 2026 at 8:12 am

    Didn't make it to New York for the Knowledge Graph Conference this year, but caught some talks virtually and managed to download all the decks. Sharing them below because some of what was shown is worth knowing about. The majority of the presentations described live production systems: enterprises showed up with real engineers delivering against real compliance requirements. That's unusual for AI events, where most talks are proofs of concept with a "coming soon to prod" slide at the end. For example, Bloomberg showed a formal dependency model for ontology governance. AbbVie walked through ARCH, their internal KG for drug and disease-area intelligence, connected to a scoring engine, a researcher dashboard, and an LLM companion for plain-language queries. The KG is the source of truth. The LLM is the interface. Even Morgan Stanley showed continuous SHACL drift detection on risk reporting data - automated weekly checks that alert when the semantic layer deviates from what's governed. Crux: knowledge graphs are being actively used as infrastructure, not as a retrieval layer on top of vectors. The graph is doing reasoning work, not lookup work. We've been skeptical of the "only using vector DBs" framing for a while. These production systems are the clearest evidence I've seen of where that breaks down - and of what the alternative actually looks like when it's running. All decks here: https://drive.google.com/drive/folders/1Csdv4hZePrBMJGggsisPXYBueTRCK1kV?usp=sharing submitted by /u/Ok_Gas7672

  • Looking for advice: Online Master's in Applied Math for ML while working full-time
    by /u/Lamba_ghoda (Data Science) on May 13, 2026 at 6:22 am

    Hi everyone, I'm looking for some honest input from people who've been down this road or know the landscape well. My background: B.Com in Finance & Accounting from Delhi University (2019). During Covid I made my way into machine learning through self-study at home. I'm currently a Senior ML Engineer at a large financial data/tech company in Bengaluru. Day-to-day work spans NLP/LLM systems, real-time ML pipelines, distributed data infra, and AWS. What I'm trying to do: I want to seriously deepen my foundations in applied mathematics for ML: think probability, linear algebra, optimization, statistical learning theory, the actual mathematical machinery behind modern ML rather than just the engineering side. I've been doing ML professionally for a few years now and I keep hitting a ceiling where deeper math intuition would make me significantly better at my job (and at research-leaning problems). My constraints: Can't leave my job. I need a fully online / part-time / WILP-style program. Based in India, so an Indian program is ideal (IISc, IIT online degrees, CMI, ISI, BITS, etc.). I know getting into a top-tier college is very hard for someone whose background isn't in engineering, but if there's any way they accept non-technical degree holders, I would like to know how one can enrol in such programs. Open to foreign universities too if the program is genuinely online and the time zones work out. What I'd love input on: Programs you'd actually recommend (and ones to avoid) for applied math / mathematical ML at the master's level, fully online. If anyone has done IIT/IISc online degrees in math/stats/ML coming from a non-technical background while working full-time, how was the experience and workload? Not looking for career change advice; I'm happy in my role. Just trying to build deeper foundations the right way. Any pointers appreciated. submitted by /u/Lamba_ghoda

  • How do you create a memorable poster for top-tier conferences (ICML/ICLR/NeurIPS, etc.)? [D]
    by /u/DazzlingPin3965 (Machine Learning) on May 13, 2026 at 12:05 am

    Hello everyone, I'm presenting at a top-tier conference for the first time and having a very hard time coming up with an appropriate design for my poster. Everything I do seems basic and banal. My paper is more theory-oriented, and apart from putting math formulas in bold in the middle, I am not sure what the best way to design the poster is. Even the sizing choice is complicated, as ICML gives 3 different recommendations to pick from, and from my computer I can’t tell what the PowerPoint slide will look like when printed at those dimensions. Printing a poster costs nearly $100 CAD, so there’s no room for trial and error. If anyone has any tips on how to do it properly, I'd appreciate them. I have been using PowerPoint, but perhaps I should switch to Canvas? Or does anyone have other software to recommend? submitted by /u/DazzlingPin3965

  • I created minimal one-file implementations (160 LOC) of the JEPA family (ijepa, vjepa, vjepa2, cjepa) for educational purposes [P]
    by /u/kwk236 (Machine Learning) on May 12, 2026 at 11:08 pm

    Hi all, I made my own minimal implementation of the JEPA algorithms. Making things minimal and removing everything needed for scaling the algorithm has always helped me understand the essence. So I stripped everything but the algorithm parts. What's left is 160-200 lines of code that distill the essence of the mathematics. It is easy to compare the math in the paper with the code and see how it can be implemented in PyTorch. I added [algo]_tutorial.md files to help with understanding. https://github.com/keon/jepa submitted by /u/kwk236
