DjamgaMind: Audio Intelligence for the C-Suite (Energy, Healthcare, Finance)
Are you drowning in dense legal text? DjamgaMind is the new audio intelligence platform that turns 100-page healthcare or energy mandates into 5-minute executive briefings. Whether you are navigating Bill C-27 (Canada) or the CMS-0057-F Interoperability Rule (USA), our AI agents decode the liability so you don’t have to. 👉 Start your specialized audio briefing today at Djamgamind.com
AI Jobs and Careers
I wanted to share an exciting opportunity for those of you looking to advance your careers in the AI space. You know how rapidly the landscape is evolving, and finding the right fit can be a challenge. That's why I'm excited about Mercor – they're a platform specifically designed to connect top-tier AI talent with leading companies. Whether you're a data scientist, machine learning engineer, or something else entirely, Mercor can help you find your next big role. If you're ready to take the next step in your AI career, check them out through my referral link: https://work.mercor.com/?referralCode=82d5f4e3-e1a3-4064-963f-c197bb2c8db1. It's a fantastic resource, and I encourage you to explore the opportunities they have available.
- Full Stack Engineer [$150K-$220K]
- Software Engineer, Tooling & AI Workflow, Contract [$90/hour]
- DevOps Engineer, India, Contract [$90/hour]
- More AI job opportunities here
| Job Title | Status | Pay |
|---|---|---|
| Full-Stack Engineer | Strong match, Full-time | $150K - $220K / year |
| Developer Experience and Productivity Engineer | Pre-qualified, Full-time | $160K - $300K / year |
| Software Engineer - Tooling & AI Workflows (Contract) | Contract | $90 / hour |
| DevOps Engineer (India) | Full-time | $20K - $50K / year |
| Senior Full-Stack Engineer | Full-time | $2.8K - $4K / week |
| Enterprise IT & Cloud Domain Expert - India | Contract | $20 - $30 / hour |
| Senior Software Engineer | Contract | $100 - $200 / hour |
| Senior Software Engineer | Pre-qualified, Full-time | $150K - $300K / year |
| Senior Full-Stack Engineer: Latin America | Full-time | $1.6K - $2.1K / week |
| Software Engineering Expert | Contract | $50 - $150 / hour |
| Generalist Video Annotators | Contract | $45 / hour |
| Generalist Writing Expert | Contract | $45 / hour |
| Editors, Fact Checkers, & Data Quality Reviewers | Contract | $50 - $60 / hour |
| Multilingual Expert | Contract | $54 / hour |
| Mathematics Expert (PhD) | Contract | $60 - $80 / hour |
| Software Engineer - India | Contract | $20 - $45 / hour |
| Physics Expert (PhD) | Contract | $60 - $80 / hour |
| Finance Expert | Contract | $150 / hour |
| Designers | Contract | $50 - $70 / hour |
| Chemistry Expert (PhD) | Contract | $60 - $80 / hour |
What Are the Best Machine Learning Algorithms for Imbalanced Datasets?
In machine learning, imbalanced datasets are those where one class heavily outnumbers the others. This can be due to the nature of the problem or simply because more data is available for one class than the others. Either way, imbalanced datasets can pose a challenge for machine learning algorithms. In this blog post, we’ll take a look at which machine learning algorithms are best suited for imbalanced datasets and why they tend to perform better than others.
For example, in a binary classification problem with 100 observations, if only 10 are positive (and the remaining 90 are negative), we say that the dataset is imbalanced: the ratio of positive to negative cases is 1:9.

There are a few reasons why some machine learning algorithms tend to perform better on imbalanced datasets than others. First, certain algorithms are designed to handle imbalanced datasets. Second, some algorithms are more robust to outliers, which can be more common in imbalanced datasets. And third, some algorithms are better able to learn from a limited amount of data, which can be an issue when one class is heavily outnumbered by the others.
Some of the best machine learning algorithms for imbalanced datasets include:
– Support Vector Machines (SVMs)
– Decision Trees
– Random Forests
– Naive Bayes Classifiers
– k-Nearest Neighbors (kNN)
Of these, SVMs are a popular choice because they adapt well to imbalance when class weights are used: misclassifying the minority class is penalized more heavily, so the rare class is not simply ignored. SVMs work by finding a hyperplane that maximizes the margin between the two classes, which helps to reduce overfitting and improve generalization. Decision trees and random forests are also popular choices, as they are less sensitive to outliers than algorithms such as linear regression. Naive Bayes classifiers are another good choice because they can learn from a limited amount of data. kNN can also work well, since it makes predictions locally and can pick up small pockets of minority-class examples; however, it can be computationally intensive for large datasets.
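To make the class-weighting idea concrete, here is a minimal sketch using scikit-learn; the library choice, the synthetic dataset, and the specific classifiers are illustrative assumptions rather than a fixed recipe:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report

# Synthetic data with roughly a 1:9 positive-to-negative ratio, as in the example above.
X, y = make_classification(n_samples=1000, n_features=20, weights=[0.9, 0.1], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=42)

# class_weight="balanced" up-weights errors on the rare class so it is not simply ignored.
models = {
    "SVM (class-weighted)": SVC(class_weight="balanced"),
    "Random forest (class-weighted)": RandomForestClassifier(class_weight="balanced", random_state=42),
}
for name, clf in models.items():
    clf.fit(X_train, y_train)
    print(name)
    print(classification_report(y_test, clf.predict(X_test), digits=3))
```

With "balanced" class weights, errors on the rare class are up-weighted in inverse proportion to its frequency, which typically trades a little overall accuracy for much better minority-class recall.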
There are two main types of machine learning algorithms: supervised and unsupervised. Supervised algorithms tend to perform better on imbalanced datasets than unsupervised algorithms. Below, we discuss why that is and look at some examples.
Supervised Algorithms
Supervised algorithms are those where the target variable is known. In other words, we have training data where the correct answers are already given. The algorithm then learns from this data and is able to generalize to new data. Some examples of supervised algorithms are regression and classification.
Unsupervised Algorithms
Unsupervised algorithms are those where the target variable is not known. With unsupervised algorithms, we only have input data, without any corresponding output labels. The algorithm has to learn from the data itself without any guidance. Some examples of unsupervised algorithms are clustering and dimensionality reduction.
Why Supervised Algorithms Perform Better on Imbalanced Datasets
Supervised algorithms perform better on imbalanced datasets because the labels tell them which cases matter most. With unsupervised algorithms, all data points are treated equally, regardless of whether they are in the minority or majority class.
For example, in a binary classification problem with an imbalanced dataset, let’s say that we want to predict whether a customer will default on their loan payment or not. We have a training dataset of 1000 customers, out of which only 100 (10%) have defaulted on their loan in the past.
If we use a supervised algorithm like logistic regression, the algorithm will learn from the training data that defaulting on a loan is rare (since only 10% of cases in the training data are positive). This means that it will be more likely to predict correctly that a new customer will not default on their loan (since this is the majority class in the training data).
However, if we use an unsupervised algorithm like k-means clustering, all data points will be treated equally since there is no target variable to guide the algorithm. This means that it might incorrectly cluster together customers who have defaulted on their loans with those who haven’t since there is no guidance provided by a target variable.
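The contrast is easy to see in a few lines of code. The sketch below is illustrative only: it assumes scikit-learn and uses a synthetic stand-in for the loan data, fitting a class-weighted logistic regression and a two-cluster k-means on the same imbalanced dataset and scoring both against the true labels:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans
from sklearn.metrics import f1_score

# 1,000 synthetic "customers", roughly 10% of whom default, as in the example above.
X, y = make_classification(n_samples=1000, n_features=20, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# Supervised: the labels tell the model which (rare) cases matter.
logreg = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X_train, y_train)
print("Logistic regression F1:", round(f1_score(y_test, logreg.predict(X_test)), 3))

# Unsupervised: k-means sees only the features, so its two clusters need not
# correspond to "default" vs. "no default" at all.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_test)
best_f1 = max(f1_score(y_test, clusters), f1_score(y_test, 1 - clusters))
print("k-means F1 (best cluster-to-label assignment):", round(best_f1, 3))
```

Because k-means never sees the labels, its clusters usually track the dominant directions of variation in the features rather than the rare default class, so its score on the minority class tends to fall well below the supervised model's.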
Conclusion
In conclusion, supervised machine learning algorithms tend to perform better on imbalanced datasets than unsupervised machine learning algorithms because they can learn from the training data which cases are more important.
Some machine learning algorithms tend to perform better on highly imbalanced datasets because they are designed to deal with imbalance or because they can learn from both classes simultaneously. If you are working with a highly imbalanced dataset, then you should consider using one of these algorithms.
Thanks for reading!
How are machine learning techniques being used to address unstructured data challenges?
Machine learning techniques are being used to address unstructured data challenges in a number of ways:
- Natural language processing (NLP): NLP algorithms can be used to extract meaningful information from unstructured text data, such as emails, documents, and social media posts. NLP algorithms can be trained to classify text data, identify key terms and concepts, and extract structured data from unstructured text.
- Image recognition: Machine learning algorithms can be used to analyze and classify images, enabling the automatic identification and classification of objects, people, and other elements in images. This can be useful for tasks such as image tagging and search, as well as for applications such as security and surveillance.
- Audio and speech recognition: Machine learning algorithms can be used to analyze and classify audio data, enabling the automatic transcription and translation of spoken language. This can be useful for tasks such as speech-to-text transcription, as well as for applications such as call center automation and language translation.
- Video analysis: Machine learning algorithms can be used to analyze and classify video data, enabling the automatic detection and classification of objects, people, and other elements in video. This can be useful for tasks such as video tagging and search, as well as for applications such as security and surveillance.
Overall, machine learning techniques are being used in a wide range of applications to extract meaningful information from unstructured data, and to enable the automatic classification and analysis of data in a variety of formats.
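As a small illustration of the NLP point above, here is a minimal sketch of turning unstructured text into structured predictions with a TF-IDF plus logistic-regression pipeline; the example messages, the labels, and the use of scikit-learn are all illustrative assumptions:

```python
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# A handful of made-up messages standing in for emails, tickets, or social media posts.
texts = [
    "Please reset my password, I cannot log in to the app",
    "Invoice attached for last month's consulting services",
    "The app crashes whenever I upload a photo",
    "Payment received, the invoice is now settled",
]
labels = ["support", "billing", "support", "billing"]

# TF-IDF turns free text into numeric features; the classifier maps those features to labels.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(texts, labels)

print(clf.predict(["Where is the invoice for this month's payment?"]))  # expected: billing
print(clf.predict(["The app crashes when I try to log in"]))            # expected: support
```

The same pattern scales to real corpora: the vectorizer converts free text into features, and the classifier then routes, tags, or categorizes it, which is exactly the kind of structure extraction described above.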
How is AI and machine learning impacting application development today?
Artificial intelligence (AI) and machine learning are having a significant impact on application development today in a number of ways:
- Enabling new capabilities: AI and machine learning algorithms can be used to enable applications to perform tasks that would be difficult or impossible for humans to do. For example, AI-powered applications can be used to analyze and classify large amounts of data, or to automate complex decision-making processes.
- Improving performance: AI and machine learning algorithms can be used to optimize the performance of applications, making them faster, more efficient, and more accurate. For example, machine learning algorithms can be used to improve the accuracy of predictive models, or to optimize the performance of search algorithms.
- Streamlining development: AI and machine learning algorithms can be used to automate various aspects of application development, such as testing, debugging, and deployment. This can help to streamline the development process and reduce the time and resources needed to build and maintain applications.
- Enhancing user experiences: AI and machine learning algorithms can be used to enhance the user experience of applications by providing personalized recommendations, or by enabling applications to anticipate and respond to the needs and preferences of users.
Overall, AI and machine learning are having a significant impact on application development today, and they are likely to continue to shape the way applications are built and used in the future.
How will advancements in artificial intelligence and machine learning shape the future of work and society?
Advancements in artificial intelligence (AI) and machine learning are likely to shape the future of work and society in a number of ways. Some potential impacts include:
- Automation: AI and machine learning algorithms can be used to automate tasks that are currently performed by humans, such as data entry, customer service, and manufacturing. This could lead to changes in the types of jobs that are available and the skills that are in demand, as well as to increased productivity and efficiency.
- Job displacement: While automation may create new job opportunities, it could also lead to job displacement, particularly for workers in industries that are more susceptible to automation. This could lead to social and economic challenges, including unemployment and income inequality.
- Increased efficiency: AI and machine learning algorithms can be used to optimize and streamline business processes, leading to increased efficiency and productivity. This could lead to economic growth and innovation, and could also help to reduce costs for businesses and consumers.
- Enhanced decision-making: AI and machine learning algorithms can be used to analyze large amounts of data and make more informed and accurate decisions. This could lead to improved outcomes in fields such as healthcare, finance, and education, and could also help to reduce bias and improve fairness.
Overall, the impact of AI and machine learning on the future of work and society is likely to be significant and complex, with both potential benefits and challenges. It will be important to consider and address these impacts as these technologies continue to advance and become more widely adopted.
- [D] Looks like we'll be having an AMA on AI agents in this subreddit later today. Illia is building IronClaw: a secure, open-source alternative to OpenClaw written in Rust 🦀 by /u/fiatisabubble (Machine Learning) on March 5, 2026 at 3:06 pm
🗓️ Today | 9:30 AM – 12 PM PST. I don't see this mentioned anywhere on this subreddit, hence sharing for visibility ✌️
- [R] Are keywords necessary for ECCV submission? by /u/Training-Adeptness57 (Machine Learning) on March 5, 2026 at 11:44 am
Hello, this is my first time submitting to ECCV, and no other team member has done it before. I'm not really sure whether including keywords is necessary in the submission. Can someone help me here, please? Thanks!
- [D] IJCAI 2026 reviews by /u/adi_gawd (Machine Learning) on March 5, 2026 at 10:18 am
Has anyone received their IJCAI 2026 reviews, and what is everyone expecting? I am also new to the chairing tool; if anyone has used it, could you tell me how to check reviews there, or whether they will simply appear when I open the paper's page?
- [D] Impact of EU AI Act on your work? by /u/spdazero (Machine Learning) on March 5, 2026 at 10:00 am
Greetings r/MachineLearning. I am studying the impact of the EU AI Act on data science practitioners, especially those working on models that are classified as high risk. I am outside the EU, so it has not impacted my company yet, but my country is drafting a similar act, and I am worried about its impact. From my understanding, the Act covers a broad range of models as high risk (https://artificialintelligenceact.eu/annex/3/), including credit scoring and insurance pricing, and imposes a very high standard for developing and maintaining those models. Prior to the Act, some companies in credit scoring could try lots of models at an arbitrary (usually small) scale on real customers and, if a model succeeded, go on to deploy it at a larger scale. Does the Act completely shut down that practice, given that the administrative cost of compliance for small test models is now so high? Anyone with experience working on high-risk models as defined by the Act?
- [P] Bypassing CoreML to natively train a 110M Transformer on the Apple Neural Engine (Orion) by /u/No_Gap_4296 (Machine Learning) on March 5, 2026 at 5:51 am
It is hard to communicate how frustrating the current Apple ML stack is for low-level research. CoreML imposes opaque abstractions that prevent direct ANE programming and do not support on-device training. Despite having up to 38 TOPS (INT8) and ~19 TFLOPS of fp16 compute, the ANE remains almost entirely unused for large language model workloads. Building on the foundational hardware reverse-engineering by maderix (who mapped the private API surface and benchmarked the 32 MB SRAM cliff), I wanted to see if we could bridge the gap from a raw hardware exploit to a mathematically stable runtime. I recently open-sourced ORION, to my knowledge the first open end-to-end system that combines direct ANE execution, a custom compiler pipeline, and stable multi-step training. Just to be transparent about the methodology: I approached this entire build as an exercise in what I'll call architectural delegation. My day job is Enterprise Program Management, not writing low-level C kernels. I used Claude to rapidly generate the Objective-C syntax while I acted as the system state manager—designing the compiler passes and forcing a probabilistic model to map deterministic hardware boundaries across 140 engineering tasks spanning 14 sessions. When you map it out, the ANE presents a massive wall of undocumented silicon behavior. We cataloged 17 total programming constraints, 11 of which were newly discovered during ORION's development. A few of the critical ones: • The concat operation causes an immediate compilation failure. • There is a minimum IOSurface size of approximately 49 KB for evaluation. • BLOBFILE weights require an undocumented offset of 64 bytes from the chunk header, which causes silent weight corruption if incorrect. • The compiler limits each process to ~119 compilations before silently failing. To handle this, ORION uses a custom compiler that lowers a 27-operation graph IR through five optimization passes (including Dead Code Elimination, Cast Fusion, and SRAM annotation against the 32 MB budget) to emit ANE-native MIL. The hardest part was what I'll call the numerical stability ceiling. Previous attempts at ANE training (like ANEgpt) suffered from 100% NaN divergence after the first training step. We solved this by isolating three interacting bugs: Stale Programs on Resume: ANE programs were compiling before checkpoint weights loaded. We fixed this via a deferred compilation pipeline. The leverage here is real. On an M4 Max, the system hits 170+ tokens/s for GPT-2 124M inference in decode mode. For training, we demonstrated stable multi-step training of a 110M-parameter transformer on TinyStories. Over 1,000 steps, the loss dropped from 12.29 to 6.19 with zero NaN occurrences. To bypass the 119-compilation limit, the runtime uses an exec() restart strategy, passing checkpoint state through the filesystem. There are real caveats here. Because the ANE bakes weights at compile time, every single weight update requires recompilation. In our loop, compilation consumes ~4.2 s per step, while the actual compute takes ~908 ms (achieving 0.612 TFLOPS). But imo, this is nowhere near "steady state" time for local AI—this is a layer change. Proving that we can execute mathematically stable, multi-step gradient descent directly on Apple's locked-down NPU opens up a lot of room for future work on weight patching or incremental compilation. 
The repo (Objective-C runtime, Python used only for one-time weight conversion) is MIT licensed and available here: https://github.com/mechramc/Orion I would love to hear thoughts from the systems ML folks here on the constraint catalog, or ideas on how to tackle the compile-time weight bottleneck.
- [D] A mathematical proof from an anonymous Korean forum: The essence of Attention is fundamentally a d^2 problem, not n^2. (PDF included) by /u/Ok-Preparation-3042 (Machine Learning) on March 5, 2026 at 5:50 am
Hello, r/MachineLearning. I am just a regular user from a Korean AI community ("The Singularity Gallery"). I recently came across an anonymous post with a paper attached. I felt that the mathematical proof inside was too important to be buried in a local forum and not go viral globally, so I used Gemini to help me write this English post to share it with you all. The author claims they do not work in the LLM industry, but they dropped a paper titled: "The d^2 Pullback Theorem: Why Attention is a d^2-Dimensional Problem". They argue that the field has been fundamentally misunderstanding the intrinsic geometry of Attention. Here is the core of their mathematical proof: The d^2 Pullback Theorem (The Core Proof): The author mathematically proves that if you combine the Forward pass (n X n) and the Backward gradient (n X n), the actual optimization landscape the parameter explores is strictly d^2-dimensional. The n X n bottleneck is merely an illusion caused by the softmax normalization choice. Softmax destroys the Euclidean Matching structure: Previous O(n) linear attention models failed because removing exp() (softmax) destroyed the contrast (matching). Softmax creates the "matching" but artificially inflates the rank to n, causing the O(n^2) curse. O(nd^3) Squared Attention without the instability: Because the true optimization geometry is d^2, we can swap softmax with a degree-2 polynomial kernel (x^2) and still explore the exact same optimization landscape. The author introduces CSQ (Centered Shifted-Quadratic) Attention with soft penalties. This retains the Euclidean matching property, stabilizes the training, and drops both training AND inference complexity to O(nd^3). The author wrote: "I'm not in the LLM industry, so I have nowhere to share this. I'm just posting it here hoping it reaches the researchers who can build better architectures." I strongly believe this math needs to be verified by the experts here. Could this actually be the theoretical foundation for replacing standard Transformers? Original PDF: https://drive.google.com/file/d/1IhcjxiiHfRH4_1QIxc7QFxZL3_Jb5dOI/view?usp=sharing Original Korean Forum Post: https://gall.dcinside.com/mgallery/board/view/?id=thesingularity&no=1016197
- [P] I built an open cognitive architecture for Android that maintains persistent beliefs, doubts, and goals across conversations. 13-section reasoning pipeline, local knowledge graph, flat cost at scale. Free. by /u/Edenisb (Machine Learning) on March 5, 2026 at 4:13 am
I'll keep this short and just show you what it does. I spent the last several months building The Orchard because I got frustrated with the same problem everyone in this space knows about: stateless conversations. You talk to a system for weeks, it forgets everything. The platform swaps the model underneath you and the behavior shifts overnight. Your context window grows until the API costs become absurd. So I built an architecture where none of that happens. The Orchard is an Android app that wraps any LLM provider (Anthropic, OpenAI, Google, local models through Ollama/OpenRouter) in a structured cognitive pipeline. You bring your own API key. Everything else runs locally. No servers, no accounts, no data collection. The persistent state lives in a SQLite database on your phone that never leaves the device. Here's the architecture and what actually makes this interesting from an ML perspective: Every message passes through a 13-section pipeline before a response is generated. It's not "send text to API, get response." The sections parse intent, check incoming claims against an existing knowledge graph, assess patterns, surface tensions and contradictions, model the user, track uncertainty, synthesize across past conversations, form new beliefs, evaluate them through an independent teacher model running a separate inference call, update goals, plan the response, and then generate it. Each section can be routed to a different model. You can watch the full trace in real time. The knowledge graph persists beliefs with confidence scores, claims awaiting validation, active doubts, and goals. Everything links through a weighted graph with co-retrieval reinforcement and decay. After a few weeks of conversation this graph gets genuinely interesting to explore. There's a full interactive browser with D3 force visualization, semantic search, and node expansion. After each conversation there's a sleep consolidation cycle. It strengthens important connections, decays stale ones, and occasionally surfaces emergent insights. Loosely inspired by memory consolidation literature but I won't oversell the analogy. Cost stays flat. This was important to me to prove out. At 400+ turns the per-message cost is effectively the same as turn 1. The architecture handles context management so there's no runaway token accumulation. One thing that made me laugh during testing: the system attempted to prompt inject itself through its own pipeline. The architecture caught it and continued normally. Screenshot included because I think it demonstrates something real about the robustness of structured reasoning over raw prompting. I want to be clear about what this is and isn't. This is not polished consumer software. I built it alone. The UI is functional, not pretty. If you're expecting Replika or Character.ai this is a completely different thing. It's rougher and it asks more of you upfront. But the architecture underneath is doing something I haven't seen elsewhere and I think this community would find it worth poking at. The prompt architecture is documented on GitHub. I filed a provisional patent on the core cognitive architecture (USPTO #63/979,094) but the research documentation is Creative Commons licensed because I want people building on this. APK available here: https://github.com/cedenburn-ai/Thought-Seed/releases Updates on the subreddit: https://www.reddit.com/r/OrchardApp/ Happy to go deep on any part of the architecture. 
The pipeline design, the knowledge graph schema, the anti-echo constraints, the cost model, whatever. I've been living in this codebase for months and I love talking about it. Apologies to iPhone users. I don't know the Apple development environment yet but it's on the roadmap.
- [D] Working on a photo-based calorie tracker app by /u/DinoDinac (Machine Learning) on March 5, 2026 at 2:43 am
Hey, I’m building a photo-based calorie tracking app. Apps like CalAI already do this, but from what I’ve seen they often struggle with mixed dishes, portion size estimation, and general hiccups with calorie estimates. I’m trying to approach it a bit more seriously from an ML perspective and I want to hear your thoughts. I really want to make the scan part as accurate as possible. I don't want it to be something as simple as an OpenAI API call. I'm wondering if there is another approach for this using classic ML or specific food datasets which will give me an edge for the calculations. Right now I’m experimenting with YOLOv8 for multi-food detection, and thinking about adding segmentation or some kind of regression model for portion/volume estimation. Curious what others here think: Would you model this as detection + regression, or go full segmentation? Any good datasets for portion-aware food recognition? Is monocular depth estimation practical for something like this on mobile? Would appreciate any thoughts, especially from anyone who’s worked on food recognition or similar real-world CV problems.
- [D] Intel Core Ultra 7 265K vs AMD Ryzen 7 7800X3D: Which one is better for ML? by /u/peter34512800 (Machine Learning) on March 4, 2026 at 8:56 pm
I am building a new PC for a mix of gaming and ML work and am having a hard time deciding whether I should go with Intel or AMD. Current specs: 5070 Ti, 32 GB RAM. What do you guys think? Edit: Intel is the better choice here; there's barely any performance difference in terms of gaming.
- [R] IJCAI-ECAI'26 Summary Rejects status by /u/AddendumNo5533 (Machine Learning) on March 4, 2026 at 1:37 pm
Hi, is there any update regarding summary rejects? The deadline is March 4 AoE, and my paper status is still "Submitted" on chairingtool. Does anyone know when they will be out?
- [R] GFlowNets for accelerating ray tracing for radio propagation modeling by /u/jeertmans (Machine Learning) on March 4, 2026 at 9:04 am
Hi everyone! I have just submitted my new journal paper on using Generative Flow Networks (GFlowNets) to speed up radio propagation modeling. Preprint on arXiv Tutorial notebook GitHub repository The problem and our solution Traditional point-to-point ray tracing suffers from exponential computational complexity, scaling with the number of objects raised to the interaction order. To fix this bottleneck, we define path finding as a sequential decision process and train a generative model to intelligently sample valid ray paths instead of relying on an exhaustive search. This work extends previous work I presented at ICMLCN 2025, but with much better results and details. Specifically, the proposed model achieves speedups of up to 10x on GPU and 1000x on CPU while maintaining high coverage accuracy! Comparison of the coverage map between the ground truth (upper left) and the prediction (upper right) using 20 samples. Lower left and right figures show the relative and log-relative differences (in dB) between the two coverage maps, as defined in the paper. Improvements from previous model While working on this project, I researched a lot about reinforcement learning and GFlowNets. Applying GFlowNets here meant traversing a tree rather than a generic directed graph, which led to a number of standard solutions not being applicable. However, a few of them led to positive outcomes: Sparse Rewards: Finding valid geometric paths is rare, leading to a massive sparse reward issue and model collapse. After exploring goal-oriented RL with no success, I solved this by introducing a successful experience replay buffer to capture and store rare valid paths. Exploration: Using a uniform exploratory policy (ε-greedy) turned out to slightly improve performance on higher-order paths (i.e., deeper trees). Action Masking: I applied a physics-based action masking strategy to filter out physically impossible paths before the model even considers them, drastically pruning the search space. Muon Optimizer: Finally, I recently tried the Muon optimizer instead of the traditional Adam I was always using, and noticed much better training performance and convergence speed. ML framework and hardware Everything was built using the JAX ecosystem (Equinox, Optax, and my own library DiffeRT). Sadly, sharing code isn't super common in my specific research community, but I strongly believe open-sourcing research data can only benefit everyone. As a result, I put a lot of effort into making the code clean and well-documented. I'm not an ML expert but a telecom researcher, and I performed these experiments entirely on my own using a single NVIDIA RTX 3070. FYI, training the three models (as shown in the tutorial) takes about 3 hours on my computer. It might not be ready to completely replace exhaustive ray tracing just yet, but the results are really promising. I'm very happy to receive questions, comments, or criticisms about this work. I hope you like it! 🙂
- [P] I open-sourced a synth framework for creating physics-simulated humanoids in Unity with MuJoCo -- train them with on-device RL and interact in VR by /u/arghyasur (Machine Learning) on March 4, 2026 at 7:31 am
I've been building a system to create physics-based humanoid characters in Unity that can learn through reinforcement learning -- and you can physically interact with them in mixed reality on Quest. Today I'm open-sourcing the three packages that make it up. What it does: synth-core -- Take any Daz Genesis 8 or Mixamo character, run it through an editor wizard (or one-click right-click menu), and get a fully physics-simulated humanoid with MuJoCo rigid-body dynamics, mesh-based collision geometry, configurable joints, and mass distribution. Extensible to other skeleton types via an adapter pattern. synth-training -- On-device SAC (Soft Actor-Critic) reinforcement learning using TorchSharp. No external Python server -- training runs directly in Unity on Mac (Metal/MPS), Windows, or Quest (CPU). Includes prioritized experience replay, automatic entropy tuning, crash-safe state persistence, and motion reference tooling for imitation learning. synth-vr -- Mixed reality on Meta Quest. The Synth spawns in your physical room using MRUK. Physics-based hand tracking lets you push, pull, and interact with it using your real hands. Passthrough rendering with depth occlusion and ambient light estimation. The workflow: (1) import a humanoid model into Unity; (2) right-click -> Create Synth (or use the full wizard); (3) drop the prefab in a scene and press Play -- it's physics-simulated; (4) add ContinuousLearningSkill and it starts learning; (5) build for Quest and interact with it in your room. Tech stack: Unity 6, MuJoCo (via patched Unity plugin), TorchSharp (with IL2CPP bridge for Quest), Meta XR SDK. Links: synth-core -- Physics humanoid creation; synth-training -- On-device RL training; synth-vr -- Mixed reality interaction. All Apache-2.0 licensed. The long-term goal is autonomous virtual beings with integrated perception, memory, and reasoning -- but right now the core infrastructure for creating and training physics humanoids is solid and ready for others to build on. Contributions welcome. Happy to answer questions about the architecture, MuJoCo integration challenges, or getting TorchSharp running on IL2CPP/Quest.
- [P] I trained Qwen2.5-1.5b with RLVR (GRPO) vs SFT and compared benchmark performance by /u/jayminban (Machine Learning) on March 3, 2026 at 9:44 pm
Hello everyone. I trained Qwen2.5-1.5B-Instruct with RLVR and SFT on the GSM8K dataset. RLVR boosted math reasoning by +11.9 points. SFT degraded it by -15.2. SFT (Supervised Fine-tuning): Standard next-token prediction training on labeled data. RLVR (Reinforcement Learning with Verifiable Rewards): The training approach behind DeepSeek-R1. The model is reinforced to produce responses that earn higher rewards from a verifiable signal (e.g., correct math answers). This is what enabled models to generate their own chain-of-thought reasoning and led to dramatic improvements in reasoning and agentic tasks. I ran three experiments: RLVR vs SFT on GSM8K train split: Standard training and comparison. Cheating analysis: Training directly on the GSM8K test set to measure data contamination effects. One-example RLVR: RLVR training with only a single example from two different data sources. Results: RLVR training significantly improves GSM8K performance while also improving unrelated MATH scores, suggesting general reasoning improvement, even when training with only one example. SFT degrades performance significantly on both benchmarks regardless of train or test data. SFT appears to override the model's pretrained knowledge, making it mimic surface patterns without actually improving reasoning ability. Notably, SFT does reduce the no-answer rate, meaning the model learns to produce answers in the expected format, but the answers themselves are less accurate. See the training progression plots and results table above. GPU whirring that went into this project:

| Experiment | GPUs | Duration | Epochs |
|---|---|---|---|
| GRPO GSM8K Train | 6× RTX 4090 | 32h 12m | 13 |
| GRPO GSM8K Test | 8× RTX 3090 | 20h 09m | 30 |
| GRPO GSM8K 1-Example | 8× RTX 3090 | 11h 16m | - |
| GRPO DSR 1-Example | 8× RTX 3090 | 12h 43m | - |
| SFT GSM8K Train | 1× RTX 5090 | 2h 46m | 7 |
| SFT GSM8K Test | 1× RTX 5090 | 1h 06m | 15 |
| Benchmarking (388 checkpoints) | 1× RTX 5090 | 17h 41m | - |

388 checkpoints were benchmarked for this project. Every prompt, model response, and extracted answer across all benchmarks is logged in a SQLite database, over 2.4 million rows, viewable live on Hugging Face Spaces via Datasette! https://huggingface.co/spaces/jayminban/RLVR-vs-SFT-Qwen2.5-1.5b For detailed analysis, all plots, training code, data, checkpoints, and more, check out the full project on GitHub. https://github.com/jayminban/RLVR-vs-SFT-Qwen2.5-1.5b Any feedback or ideas for my next project are greatly appreciated!
- [P] We made GoodSeed, a pleasant ML experiment tracker by /u/gQsoQa (Machine Learning) on March 3, 2026 at 9:09 pm
GoodSeed v0.3.0 🎉 My friend and I are pleased to announce GoodSeed - an ML experiment tracker which we are now using as a replacement for Neptune. Key Features Simple and fast: Beautiful, clean UI Metric plots: Zoom-based downsampling, smoothing, relative time x axis, fullscreen mode, ... Monitoring plots: GPU/CPU usage (both NVIDIA and AMD), memory consumption, GPU power usage Stdout/Stderr monitoring: View your program's output online. Structured Configs: View your hyperparams and other configs in a filesystem-like interactive table. Git Status Logging: Compare the state of your git repo across experiments. Remote Server (beta version): Back up your experiments to a remote server and view them online. For now, we only support metrics, strings, and configs (no files). Neptune Proxy: View your Neptune runs through the GoodSeed web app. You can also migrate your runs to GoodSeed (either to local storage or to the remote server). Try it Web: https://goodseed.ai/ Click on Demo to see the app with an example project. Connect to Neptune to see your Neptune runs in GoodSeed. pip install goodseed to log your experiments. Log In to create an account and sync your runs with a remote server (we only have limited seats now because the server is quite expensive - we might set up some form of subscription later). Repo (MIT): https://github.com/kripner/goodseed Migration guide from Neptune: https://docs.neptune.ai/transition_hub/migration/to_goodseed
- [R] Boundary-Metric Evaluation for Thin-Structure Segmentation under 2% Foreground Sparsity by /u/TheRealManual (Machine Learning) on March 3, 2026 at 1:50 pm
Hey! I'm currently an undergrad student graduating in May and soon starting my Master's in AI. I've wanted to write a research paper to start gaining some experience in that area and just recently finished my first one. This paper investigates segmentation under extreme foreground sparsity, around 1.8% positive pixels, for whiteboard digitization. It connects to a small project I was working on where you can take a photo of a whiteboard and it identifies which pixels are actual ink strokes rather than background or smudges, then exports the result to a OneNote page. Instead of proposing a new loss, I wanted to focus on evaluation methodology and extreme analysis of this method. The main things I focus on in this paper are region metrics such as F1 and IoU, boundary metrics such as BF1 and Boundary-IoU, core vs. thin-subset equity analysis, multi-seed training, and per-image robustness statistics. If anyone has any feedback on this, I'd love to talk more about it! I'm very new to this, so if people could advise me in certain areas or just tell me whether it's good enough to display on my resume, that would be amazing! https://arxiv.org/abs/2603.00163
- [R] AdamWClip: AdamW with adaptive gradient clipping by /u/ElectricVote (Machine Learning) on March 3, 2026 at 11:28 am
Hi, would you like to try out an optimizer that does (adaptive) gradient clipping, so you don't have to set clipping thresholds manually? We have developed AdamWClip, an extension to AdamW that does exactly that, with no additional memory required and only marginal computational overhead. In our preliminary experiments, it often outperformed AdamW with grad_norm clipping by quite a significant margin, so we would be interested to hear how it performs in your use cases. If you would like to try it, simply insert the following into your code:

```python
%pip install AdamWClip
from AdamWClip import AdamWClip
...
optimizer = AdamWClip(model.parameters(), *args)
```

The source code is available on GitHub: https://github.com/wandeln/AdamWClip
- [R] Are neurons the wrong primitive for modeling decision systems? by /u/TutorLeading1526 (Machine Learning) on March 3, 2026 at 2:09 am
A recent ICLR paper proposes Behavior Learning — replacing neural layers with learnable constrained optimization blocks. It models it as: "utility + constraints → optimal decision" https://openreview.net/forum?id=bbAN9PPcI1 If many real-world systems are optimization-driven, should "optimization modules" replace neurons as the basic building block of ML? Or is this just structured inductive bias rebranded as a new paradigm?
- [D] How much time do you actually lose trying to reproduce ML papers? by /u/votrinhan88 (Machine Learning) on March 2, 2026 at 11:49 pm
Hey folks! Long-time lurker, first-time poster. I’m a PhD student, and I’ve been wondering: how much time do you actually spend just trying to reproduce ML papers? Even when the code is available, it can take days (or weeks!) to get everything running—tracking down missing hyperparameters, figuring out weird environment issues, or just dealing with stuff that’s buried in an appendix. So I’m genuinely curious: + How much time do you lose each week just getting baselines or prior work running? + What’s the most annoying part? Is it missing code, bad documentation, hardware headaches, dataset versions, or something else? + How do you deal with it? Do you just accept the time loss, reach out to authors, skip the baseline, or have some other strategy? + Would you pay for a tool that automated all this? If yes, what would it need to do for you to trust it, and what’s a realistic price? + What would make you trust (or distrust) a tool’s results? Not trying to sell anything, just want to know how common this pain is before I think about building something. All answers welcome, even if you think I'm overthinking a non-issue!
- [D] The engineering overhead of Verifiable ML: Why GKR + Hyrax for on-device ZK-ML? by /u/bebo117722 (Machine Learning) on March 2, 2026 at 8:15 pm
The idea of "Privacy-Preserving AI" usually stops at local inference. You run a model on a phone, and the data stays there. But things get complicated when you need to prove to a third party that the output was actually generated by a specific, untampered model without revealing the input data. I’ve been looking into the recently open-sourced Remainder prover (the system Tools for Humanity uses for World). From an ML engineering perspective, the choice of a GKR (Goldwasser-Kalai-Rothblum) + Hyrax-based proof system is an interesting case study in balancing prover time vs. mobile hardware constraints. Most ZK-ML implementations (like those using Plonky2 or Halo2) struggle with the sheer scale of circuit depth when you start mapping even mid-sized neural networks. GKR is theoretically "doubly-efficient", but implementation-wise, it’s a nightmare to make it work on consumer-grade mobile GPUs. The hardware-heavy approach (relying on physical Orb sensors for every state update) was always the biggest scaling bottleneck. Shifting the compute to client-side ZK-SNARKs means the "trust" moves from the hardware's physical security to the mathematical integrity of the prover. We often talk about Edge AI in terms of latency, but we rarely talk about verifiability. If we want a future where "Proof of Personhood" or "Proof of Model" is decentralized, we need provers that don't melt a smartphone battery. Seeing a production-grade GKR prover that handles ML layers locally is a solid benchmark for the field, regardless of how you feel about the project itself. I’m curious if we’re reaching a point where the prover overhead is finally low enough for real-time applications, or if we’re still just scratching the surface of what mobile GPUs can handle in terms of ZK-proof generation.
- [D] How to get credits to run experiments on closed-source models as a student researcher by /u/Exciting_Wonder67 (Machine Learning) on March 2, 2026 at 5:25 pm
Hello! I am working on building a benchmark and evaluating frontier models on it. The task is overall pretty reasoning intensive and ends up consuming a lot of tokens. For reference, in our pilot tests, the average output for Gemini 3.1 Pro was around 30k tokens, and GPT 5.2 runs for around 15 minutes. I would need to evaluate the models on around 900 questions. What would be the best way to get credits for this?
























