DjamgaMind: Audio Intelligence for the C-Suite (Energy, Healthcare, Finance)
Are you drowning in dense legal text? DjamgaMind is the new audio intelligence platform that turns 100-page healthcare or energy mandates into 5-minute executive briefings. Whether you are navigating Bill C-27 (Canada) or the CMS-0057-F Interoperability Rule (USA), our AI agents decode the liability so you don't have to. 👉 Start your specialized audio briefing today at Djamgamind.com
AI Jobs and Career
I wanted to share an exciting opportunity for those of you looking to advance your careers in the AI space. You know how rapidly the landscape is evolving, and finding the right fit can be a challenge. That's why I'm excited about Mercor – they're a platform specifically designed to connect top-tier AI talent with leading companies. Whether you're a data scientist, machine learning engineer, or something else entirely, Mercor can help you find your next big role. If you're ready to take the next step in your AI career, check them out through my referral link: https://work.mercor.com/?referralCode=82d5f4e3-e1a3-4064-963f-c197bb2c8db1. It's a fantastic resource, and I encourage you to explore the opportunities they have available.
- Full Stack Engineer [$150K-$220K]
- Software Engineer, Tooling & AI Workflow, Contract [$90/hour]
- DevOps Engineer, India, Contract [$90/hour]
- More AI job opportunities here
| Job Title | Status / Type | Pay |
|---|---|---|
| Full-Stack Engineer | Strong match, Full-time | $150K - $220K / year |
| Developer Experience and Productivity Engineer | Pre-qualified, Full-time | $160K - $300K / year |
| Software Engineer - Tooling & AI Workflows (Contract) | Contract | $90 / hour |
| DevOps Engineer (India) | Full-time | $20K - $50K / year |
| Senior Full-Stack Engineer | Full-time | $2.8K - $4K / week |
| Enterprise IT & Cloud Domain Expert - India | Contract | $20 - $30 / hour |
| Senior Software Engineer | Contract | $100 - $200 / hour |
| Senior Software Engineer | Pre-qualified, Full-time | $150K - $300K / year |
| Senior Full-Stack Engineer: Latin America | Full-time | $1.6K - $2.1K / week |
| Software Engineering Expert | Contract | $50 - $150 / hour |
| Generalist Video Annotators | Contract | $45 / hour |
| Generalist Writing Expert | Contract | $45 / hour |
| Editors, Fact Checkers, & Data Quality Reviewers | Contract | $50 - $60 / hour |
| Multilingual Expert | Contract | $54 / hour |
| Mathematics Expert (PhD) | Contract | $60 - $80 / hour |
| Software Engineer - India | Contract | $20 - $45 / hour |
| Physics Expert (PhD) | Contract | $60 - $80 / hour |
| Finance Expert | Contract | $150 / hour |
| Designers | Contract | $50 - $70 / hour |
| Chemistry Expert (PhD) | Contract | $60 - $80 / hour |
What Are the Best Machine Learning Algorithms for Imbalanced Datasets?
In machine learning, imbalanced datasets are those where one class heavily outnumbers the others. This can be due to the nature of the problem or simply because more data is available for one class than the others. Either way, imbalanced datasets can pose a challenge for machine learning algorithms. In this blog post, we’ll take a look at which machine learning algorithms are best suited for imbalanced datasets and why they tend to perform better than others.
For example, in a binary classification problem, if there are 100 observations and only 10 of them are positive (the rest are negative), then we say that the dataset is imbalanced. The ratio of positive to negative cases is 1:10.
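To make the ratio above concrete, here is a minimal sketch (assuming scikit-learn and NumPy are available; the dataset is synthetic, generated purely for illustration) that builds an imbalanced binary dataset and reports its class ratio:

```python
import numpy as np
from sklearn.datasets import make_classification

# Build a synthetic binary dataset with roughly 10% positives,
# mirroring the 10-positives-out-of-100 example above.
X, y = make_classification(
    n_samples=1000,
    n_features=20,
    weights=[0.9, 0.1],  # ~90% negatives, ~10% positives
    random_state=42,
)

n_pos = int(np.sum(y == 1))
n_neg = int(np.sum(y == 0))
print(f"positives: {n_pos}, negatives: {n_neg}, ratio 1:{n_neg / n_pos:.1f}")
```

The `weights` argument controls the approximate class proportions; real datasets, of course, arrive imbalanced on their own.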

There are a few reasons why some machine learning algorithms tend to perform better on imbalanced datasets than others. First, certain algorithms are designed to handle imbalanced datasets. Second, some algorithms are more robust to outliers, which can be more common in imbalanced datasets. And third, some algorithms are better able to learn from a limited amount of data, which can be an issue when one class is heavily outnumbered by the others.
Some of the best machine learning algorithms for imbalanced datasets include:
– Support Vector Machines (SVMs)
– Decision Trees
– Random Forests
– Naive Bayes Classifiers
– k-Nearest Neighbors (kNN)
Of these, SVMs are a popular choice, and with class-weighted (cost-sensitive) penalties they handle imbalance well. SVMs work by finding a hyperplane that maximizes the margin between the two classes, which helps reduce overfitting and improve generalization. Decision trees and random forests are also popular choices, as they are less sensitive to outliers than linear models. Naive Bayes classifiers are another good choice because they can learn from a limited amount of data, which matters when the minority class has few examples. kNN is likewise robust to outliers and able to learn from limited data, though it can be computationally intensive on large datasets.
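In practice, the class weighting mentioned above is a one-line change in scikit-learn. The sketch below (synthetic data; any imbalanced dataset would do) fits a cost-sensitive SVM and random forest and reports minority-class recall:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic 10%-minority dataset for illustration.
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# class_weight='balanced' scales each class's error penalty inversely to its
# frequency, so mistakes on the rare class cost more during training.
recalls = {}
for model in (
    SVC(class_weight="balanced"),
    RandomForestClassifier(class_weight="balanced", random_state=0),
):
    model.fit(X_tr, y_tr)
    recalls[type(model).__name__] = recall_score(y_te, model.predict(X_te))
print(recalls)
```

Minority recall (how many rare-class cases are caught) is usually a more honest score than plain accuracy on data like this.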
There are two main types of machine learning algorithms: supervised and unsupervised. Supervised algorithms tend to perform better on imbalanced datasets than unsupervised algorithms. In this blog post, we will discuss why this is so and look at some examples.
Supervised Algorithms
Supervised algorithms are those where the target variable is known. In other words, we have training data where the correct answers are already given. The algorithm then learns from this data and is able to generalize to new data. Some examples of supervised algorithms are regression and classification.
Unsupervised Algorithms
Unsupervised algorithms are those where the target variable is not known. With unsupervised algorithms, we only have input data, without any corresponding output labels. The algorithm has to learn from the data itself without any guidance. Some examples of unsupervised algorithms are clustering and dimensionality reduction.
Why Supervised Algorithms Perform Better on Imbalanced Datasets
Supervised algorithms perform better on imbalanced datasets because the labels let them learn which cases matter. With unsupervised algorithms, all data points are treated equally, regardless of whether they fall in the minority or majority class.
For example, in a binary classification problem with an imbalanced dataset, let’s say that we want to predict whether a customer will default on their loan payment or not. We have a training dataset of 1000 customers, out of which only 100 (10%) have defaulted on their loan in the past.
If we use a supervised algorithm like logistic regression, the algorithm will learn from the training data that defaulting on a loan is rare (since only 10% of the training cases are positive). This means it will be more likely to predict correctly that a new customer will not default on their loan, since this is the majority class in the training data.
However, if we use an unsupervised algorithm like k-means clustering, all data points will be treated equally since there is no target variable to guide the algorithm. This means that it might incorrectly cluster together customers who have defaulted on their loans with those who haven’t since there is no guidance provided by a target variable.
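The loan example can be sketched in a few lines (assuming scikit-learn; the data is synthetic, not real loan records), contrasting a label-aware logistic regression with label-free k-means and an "always predict no default" baseline:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# 1000 "customers", roughly 10% defaulters, as in the loan example above.
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=1)

# Supervised: logistic regression sees the labels and learns the base rate.
clf = LogisticRegression(max_iter=1000).fit(X, y)
acc = accuracy_score(y, clf.predict(X))

# Naively predicting "no default" for everyone already scores ~90% accuracy,
# which is why raw accuracy is a misleading metric on imbalanced data.
baseline = accuracy_score(y, np.zeros_like(y))

# Unsupervised: k-means ignores the labels entirely, so its two clusters
# need not align with default vs. non-default at all.
clusters = KMeans(n_clusters=2, n_init=10, random_state=1).fit_predict(X)

print(f"logistic accuracy: {acc:.2f}, majority-class baseline: {baseline:.2f}")
```

Comparing the model against the majority-class baseline, rather than against zero, is the key habit when evaluating on imbalanced data.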
Conclusion:
In conclusion, supervised machine learning algorithms tend to perform better on imbalanced datasets than unsupervised machine learning algorithms because they can learn from the training data which cases are more important.
Some machine learning algorithms tend to perform better on highly imbalanced datasets because they are designed to deal with imbalance or because they can learn from both classes simultaneously. If you are working with a highly imbalanced dataset, then you should consider using one of these algorithms.
Thanks for reading!
How are machine learning techniques being used to address unstructured data challenges?
Machine learning techniques are being used to address unstructured data challenges in a number of ways:
- Natural language processing (NLP): NLP algorithms can be used to extract meaningful information from unstructured text data, such as emails, documents, and social media posts. NLP algorithms can be trained to classify text data, identify key terms and concepts, and extract structured data from unstructured text.
- Image recognition: Machine learning algorithms can be used to analyze and classify images, enabling the automatic identification and classification of objects, people, and other elements in images. This can be useful for tasks such as image tagging and search, as well as for applications such as security and surveillance.
- Audio and speech recognition: Machine learning algorithms can be used to analyze and classify audio data, enabling the automatic transcription and translation of spoken language. This can be useful for tasks such as speech-to-text transcription, as well as for applications such as call center automation and language translation.
- Video analysis: Machine learning algorithms can be used to analyze and classify video data, enabling the automatic detection and classification of objects, people, and other elements in video. This can be useful for tasks such as video tagging and search, as well as for applications such as security and surveillance.
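As a minimal sketch of the NLP bullet above (assuming scikit-learn; the corpus and labels are toy examples invented for illustration), a TF-IDF pipeline turns unstructured text into features a classifier can use:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy corpus; in practice this would be emails, documents, or social posts.
texts = [
    "meeting rescheduled to friday",
    "your invoice is attached",
    "win a free prize now",
    "limited time offer click here",
]
labels = ["work", "work", "spam", "spam"]

# TF-IDF turns free text into a sparse numeric matrix; logistic regression
# then classifies it — one common route from unstructured to structured.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["free prize offer"]))
```

Real systems would add far more data, tokenization choices, and evaluation, but the pipeline shape is the same.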
Overall, machine learning techniques are being used in a wide range of applications to extract meaningful information from unstructured data, and to enable the automatic classification and analysis of data in a variety of formats.
How is AI and machine learning impacting application development today?
Artificial intelligence (AI) and machine learning are having a significant impact on application development today in a number of ways:
- Enabling new capabilities: AI and machine learning algorithms can be used to enable applications to perform tasks that would be difficult or impossible for humans to do. For example, AI-powered applications can be used to analyze and classify large amounts of data, or to automate complex decision-making processes.
- Improving performance: AI and machine learning algorithms can be used to optimize the performance of applications, making them faster, more efficient, and more accurate. For example, machine learning algorithms can be used to improve the accuracy of predictive models, or to optimize the performance of search algorithms.
- Streamlining development: AI and machine learning algorithms can be used to automate various aspects of application development, such as testing, debugging, and deployment. This can help to streamline the development process and reduce the time and resources needed to build and maintain applications.
- Enhancing user experiences: AI and machine learning algorithms can be used to enhance the user experience of applications, by providing personalized recommendations, or by enabling applications to anticipate and respond to the needs and preferences of users.
Overall, AI and machine learning are having a significant impact on application development today, and they are likely to continue to shape the way applications are built and used in the future.
How will advancements in artificial intelligence and machine learning shape the future of work and society?
Advancements in artificial intelligence (AI) and machine learning are likely to shape the future of work and society in a number of ways. Some potential impacts include:
- Automation: AI and machine learning algorithms can be used to automate tasks that are currently performed by humans, such as data entry, customer service, and manufacturing. This could lead to changes in the types of jobs that are available and the skills that are in demand, as well as to increased productivity and efficiency.
- Job displacement: While automation may create new job opportunities, it could also lead to job displacement, particularly for workers in industries that are more susceptible to automation. This could lead to social and economic challenges, including unemployment and income inequality.
- Increased efficiency: AI and machine learning algorithms can be used to optimize and streamline business processes, leading to increased efficiency and productivity. This could lead to economic growth and innovation, and could also help to reduce costs for businesses and consumers.
- Enhanced decision-making: AI and machine learning algorithms can be used to analyze large amounts of data and make more informed and accurate decisions. This could lead to improved outcomes in fields such as healthcare, finance, and education, and could also help to reduce bias and improve fairness.
Overall, the impact of AI and machine learning on the future of work and society is likely to be significant and complex, with both potential benefits and challenges. It will be important to consider and address these impacts as these technologies continue to advance and become more widely adopted.
- [P] Random Forest on ~100k Polymarket questions — 80% accuracy (text-only) by /u/No_Syrup_4068 (Machine Learning) on February 17, 2026 at 9:50 pm
Built a text-only baseline: trained a Random Forest on ~90,000 resolved Polymarket questions (YES/NO). Features: TF-IDF (word ngrams, optional char ngrams) + a few cheap flags (date/number/%/currency, election/macro/M&A keywords). Result: ~80% accuracy on 15,000 held-out questions (plus decent Brier/logloss after calibration). Liked the idea, so I played a bit more with different data sets and did some cross-validation with Kalshi data and saw similar results. Now have this running with paper money, competing against state-of-the-art LLMs as benchmarks. Let's see. Currently it looks like, just from the formulation of the question on Polymarket (in the given data set), we can predict with 80% accuracy whether it resolves YES or NO. Happy to share further insights or get feedback if someone has tried something similar. Source of the paper trading — the model is called "mystery:rf-v1": Agent Leaderboard | Oracle Markets. Have not published accuracy there so far. submitted by /u/No_Syrup_4068 [link] [comments]
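The calibration step the post mentions can be sketched as follows (assuming scikit-learn; synthetic numeric features stand in for the post's TF-IDF n-grams and keyword/date flags — this is not the author's pipeline):

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, brier_score_loss
from sklearn.model_selection import train_test_split

# Synthetic stand-in features for illustration only.
X, y = make_classification(n_samples=2000, n_features=50, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Platt (sigmoid) calibration wraps the forest so its predicted probabilities
# are meaningful for Brier score / log-loss, as the post reports doing.
model = CalibratedClassifierCV(
    RandomForestClassifier(n_estimators=100, random_state=0),
    method="sigmoid",
    cv=3,
).fit(X_tr, y_tr)

proba = model.predict_proba(X_te)[:, 1]
acc = accuracy_score(y_te, proba > 0.5)
brier = brier_score_loss(y_te, proba)
print(f"accuracy: {acc:.3f}, Brier: {brier:.3f}")
```

Uncalibrated random forests tend to push probabilities toward the middle; calibration matters whenever the probability itself (not just the label) is the product, as in prediction markets.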
- [D] How often do you run into reproducibility issues when trying to replicate papers? by /u/ArtVoyager77 (Machine Learning) on February 17, 2026 at 9:27 pm
I’m a researcher currently trying to replicate published results, and I’m running into reproducibility issues more often than I expected. I’m trying to calibrate whether this is “normal” or a sign I’m missing something fundamental. I have been careful about all the parameters as stated in the papers. Despite that, I’m still seeing noticeable deviations from reported numbers—sometimes small but consistent gaps, sometimes larger swings across runs. For example, I was trying to replicate “Machine Theory of Mind” (ICML 2018), and I keep hitting discrepancies that I can’t fully understand. My labmates also tried to replicate the paper; they were not able to reproduce the results even approximately. What are the papers you tried but couldn’t replicate no matter what you did? submitted by /u/ArtVoyager77 [link] [comments]
- [P] I trained an XGBoost model with DuckLake and ADBC by /u/empty_cities (Machine Learning) on February 17, 2026 at 7:56 pm
I've been spending time with Apache ADBC (Arrow Database Connectivity) and DuckLake (lakehouse architecture using DuckDB) to read columnar data. I realized XGBoost took Arrow tables as a data input and I was able to pass arrow tables with little memory overhead to train. I also wanted to try to not use scikit-learn so I built a train and test split function with PyArrow instead. ADBC also allows you to stream larger than memory data and train a model in the right circumstances. submitted by /u/empty_cities [link] [comments]
- [D] Should unpublished research material be kept close and guarded, and how often does academic or IP theft occur during research? by /u/WadeEffingWilson (Machine Learning) on February 17, 2026 at 2:19 pm
I'm working on a research project where I've gotten to the point of confirmation and I'm working on the proof. The POC works and the results give extremely strong evidence supporting the proposed method across various datasets. Here's the heart of the problem: I'm not in academia, I've never attempted publication, and I have limited credentials. I'm in the public sector with close relationships with certain academic organizations and national labs, as well as a host of experienced folks in the operational workspace. The research is self-driven and self-motivated but is built off of years of personal experience and a literal ton of white papers, so I'm aware of the SOTA and other similar approaches (which will be included in the paper). I'd like to reach out to some folks in various capacities, maybe even reach out to the local university, to ask for guidance, recommendations, and review. I'm absolutely open to bringing in a partner for co-authorship as long as they contribute or provide mentorship. I just have zero sense as to the risk of doing so. I don't feel like theft is a common problem but theft is a spectrum--it could happen at any point with any level of granularity. I understand that it might sound like I'm conflating IP/copyright/patent theft but I'm not. I want other people to use the proposed method, to add on to it, to enhance it, to reference it in other work, or to just use it operationally, but to do so after it's been published or made available. If anyone has any advice on this, I'd love to hear it. submitted by /u/WadeEffingWilson [link] [comments]
- [R] Learning State-Tracking from Code Using Linear RNNs by /u/Yossarian_1234 (Machine Learning) on February 17, 2026 at 7:07 am
Link: https://arxiv.org/abs/2602.14814 Authors: Julien Siems, Riccardo Grazzi, Kirill Kalinin, Hitesh Ballani, Babak Rahmani Abstract: Over the last years, state-tracking tasks, particularly permutation composition, have become a testbed to understand the limits of sequence models like Transformers and RNNs (linear and non-linear). However, these are often sequence-to-sequence tasks: learning to map actions (permutations) to states, which is incompatible with the next-token prediction setting commonly used to train language models. We address this gap by converting permutation composition into code via REPL traces that interleave state-reveals through prints and variable transformations. We show that linear RNNs capable of state-tracking excel also in this setting, while Transformers still fail. Motivated by this representation, we investigate why tracking states in code is generally difficult: actions are not always fully observable. We frame this as tracking the state of a probabilistic finite-state automaton with deterministic state reveals and show that linear RNNs can be worse than non-linear RNNs at tracking states in this setup. submitted by /u/Yossarian_1234 [link] [comments]
- [D] Is content discovery becoming a bottleneck in generative AI ecosystems? by /u/Opposite-Alfalfa-700 (Machine Learning) on February 16, 2026 at 10:18 pm
I’ve been thinking about an emerging structural issue in generative AI. Model quality is improving rapidly. Creation cost is decreasing. Inference is becoming cheaper. But discovery mechanisms haven’t evolved at the same pace. As generative systems scale, the amount of produced content increases superlinearly. Ranking, filtering and relevance models often remain engagement-driven rather than quality-driven. From a machine learning perspective, I’m curious: Do we see discovery and relevance modeling becoming the next major bottleneck in generative ecosystems? Specifically: – Are current ranking systems fundamentally misaligned with user value? – Is engagement still the right optimization objective? – Could smaller, curated relevance models outperform large engagement-optimized feeds? Would appreciate perspectives from people working on recommender systems or ranking models. submitted by /u/Opposite-Alfalfa-700 [link] [comments]
- [D] SparseFormer and the future of efficient AI vision models by /u/SR1180 (Machine Learning) on February 16, 2026 at 9:30 pm
Hi everyone, I've been diving deep into sparse architectures for vision transformers, and I'm incredibly impressed with the potential of SparseFormer to solve the O(n²) compute bottleneck, especially for commercial applications like data labeling and industrial inspection. It feels like this is where the industry is heading for efficiency, and it seems to have more commercial potential than it's currently given credit for, especially with the push towards multimodal models. Is anyone here working with or researching SparseFormer? Curious to hear thoughts on its commercial viability versus other sparse MoE approaches for vision tasks. submitted by /u/SR1180 [link] [comments]
- Short Paper Reviews [R] by /u/Efficient_Ad_6772 (Machine Learning) on February 16, 2026 at 8:47 pm
Various venues offer, or have in the past offered, the opportunity to submit short papers, often with a four-page limit. This is currently true of the ACL. Short papers are not long papers, and there are usually explicit requirements as to how they should be treated differently by reviewers. See for example http://aclrollingreview.org/cfp section on short papers. Question to anyone who has submitted short papers in the past: do you think your paper was reviewed fairly as a short paper? I know we've all had some bad experiences with submitting any kind of paper, but do you think on average the reviewers understood the assignment and evaluated your work based on the criteria for short papers? I think it's true that ICLR used to have a short papers track and removed it. Does anyone know why it was removed? submitted by /u/Efficient_Ad_6772 [link] [comments]
- Collaboration invite - medical Imag!ng, algorithmic fairness or open track [D] by /u/ade17_in (Machine Learning) on February 16, 2026 at 7:13 pm
I'm a 2nd year PhD student looking to broaden my collaboration circle, and what better place than this community. I primarily work on developing frameworks for fairness (imaging models, LMs) (evaluation/mitigation for clinical deployment) but am really open to broader topics. If there's a possibility we can connect and work on something exciting (for a publication in a conference or a workshop), that would be great. If you have hold of a dataset which would be useful, we can make it formal with our institutes. Looking forward to hearing from brilliant minds! submitted by /u/ade17_in [link] [comments]
- [D] Supervisor support by /u/_karma_collector (Machine Learning) on February 16, 2026 at 4:19 pm
I just want to ask PhDs in AI on this sub: how much does your supervisor support your PhD? In terms of research output, how much help do you get from your supervisor? Only an ambiguous direction (e.g. Active Learning/RL for architecture X)? Or more detailed ideas, like the research gap itself? If you hit a certain problem (e.g. cannot solve X because it is too hard), do they give you any help, like potential solution directions to try, or just tell you "please do something about it"? How often do their suggestions actually help you? If they don't help much, do they ask their postdoc or other students to collaborate/help you solve the problem? Do they have KPIs for you (e.g. number of finished works per year)? In terms of networking/connections, how much do they help you? submitted by /u/_karma_collector [link] [comments]
- [P] eqx-learn: Classical machine learning using JAX and Equinox by /u/gvcallen (Machine Learning) on February 16, 2026 at 7:45 am
Hello everyone! I am writing here to share a library I am currently developing for research use that filled a niche for me in the Equinox/JAX ecosystem: eqx-learn. I am using Equinox as the foundation for my radio-frequency modelling library ParamRF, and I have absolutely loved the mixed OO/functional style. However, for my research, I require classical ML models (specifically PCA and Gaussian Process Regression), but could not find an Equinox-native library in the ecosystem that was as straightforward and consistent as scikit-learn. eqx-learn aims to address this, with a JAX-based take on the scikit-learn API. All models in the library are ultimately Equinox Modules, and can be fit using the library's free "fit" function. The design is such that models simply "advertise" their capabilities by implementing specific methods (e.g. solve(X, y), condition(X, y), loss()), and the "fit" function then fits/trains the model accordingly. I believe that this decoupling of capabilities vs fitting algorithm fits the JAX style better, and also has lots of potential. At the moment, eqx-learn addresses all my research needs, but I thought it may be useful to share the library online to advertise that it exists, and mention that I am happy to accept PRs for additional models and fitting algorithms! Although there are no docs, there are short examples in the repo :). Happy coding! Cheers, Gary submitted by /u/gvcallen [link] [comments]
- [D] ACL ARR Jan 2026 Reviews by /u/srkrrr (Machine Learning) on February 16, 2026 at 4:51 am
Hi I got 3 official reviews. OA: 2/2.5/2.5 (average OA is 2.33) and Confidence: 4/4/3 (average Confidence is 3.67) Thoughts? submitted by /u/srkrrr [link] [comments]
- [D] Interview experience for LLM inference systems position by /u/dividebyzero74 (Machine Learning) on February 16, 2026 at 1:04 am
Hi, I am preparing for an interview at an AI lab for the LLM inference team in a systems role, not MLE. I have been told I will have an LLM-inference-related coding round, a design round, and an inference-optimization discussion. I have been extensively preparing for these. My prep for coding is learning to code the following from scratch: self-attention, Transformer block, BPE tokenizer, sampling methods, KV cache, beam search. For the other two interviews, I am studying inference design and bottlenecks and the old/new work done to eliminate them. I would love to hear if anyone has had a similar interview and can share experiences. submitted by /u/dividebyzero74 [link] [comments]
- [D] Advice on sequential recommendations architectures by /u/adjgiulio (Machine Learning) on February 15, 2026 at 11:52 pm
I've tried to use a Transformer decoder architecture to model a sequence of user actions. Unlike an item_id paradigm where each interaction is described by the id of the item the user interacted with, I need to express the interaction through a series of attributes. For example "user clicked on a red button on the top left of the screen showing the word Hello", which today I'm tokenizing as something like [BOS][action:click][what:red_button][location:top_left][text:hello]. I concatenate a series of interactions together, add a few time-gap tokens, and then use standard CE to learn the sequential patterns and predict some key action (like a purchase 7 days in the future). I measure success with a recall@k metric. I've tried a bunch of architectures framed around gpt2, from standard next-token prediction, to weighting the down-funnel action more, to contrastive heads, but I can hardly move the needle compared to naive baselines (i.e. the user will buy whatever they clicked on the most). Is there any particular architecture that is a natural fit for the problem I'm describing? submitted by /u/adjgiulio [link] [comments]
- [R] TimeBase: The Power of Minimalism in Efficient Long-term Time Series Forecasting by /u/Whatever_635 (Machine Learning) on February 15, 2026 at 11:49 pm
The paper was accepted as a spotlight poster at ICML 2025. In industry, I know that when it comes to time series forecasting, many non-FAANG companies still use ARIMA due to resource cost and efficiency, and they focus on stationary data. I wonder if this model can be a good alternative that can be implemented. Worth noting that TimeBase is benchmarked on long-horizon tasks (96–720 steps), so if your ARIMA usage is for short-term forecasting, the comparison is less direct. What are your thoughts? Their code is public on github, I provided the link here submitted by /u/Whatever_635 [link] [comments]
- Can we stop these LLM posts and replies? [D] by /u/Playful-Fee-4318 (Machine Learning) on February 15, 2026 at 2:59 pm
I am tired of reading all these clearly LLM-generated ‘I implemented XYZ in python’ posts and nonsensical long replies on this subreddit. They add absolutely zero value and just create meaningless noise. Can we block these posts and replies? submitted by /u/Playful-Fee-4318 [link] [comments]
- [D] Advice on a Modern NLP Roadmap (for someone with strong ML theory background) by /u/meni_s (Machine Learning) on February 15, 2026 at 9:57 am
I have a strong background in ML theory (did a Ph.D. in the field) but I'm out of the loop on the current NLP state-of-the-art. I'm looking for a "roadmap" that respects a PhD-level understanding of math/optimization while skipping "Intro to Python" style tutorials. The end goal isn't academia but more of industry / research roles, maybe. If you had to design a 4-week "crash course" for someone who already understands backprop but hasn't touched a Transformer, what repos or advanced courses would you include? Going over some seminal papers? Is building from scratch (like NanoGPT) a good idea? submitted by /u/meni_s [link] [comments]
- [D] ICML assigned me a paper that I reviewed in ICLR by /u/famous-BlueRaincoat (Machine Learning) on February 14, 2026 at 8:12 pm
Basically the title says it all... I gave the paper a 6 in ICLR, but it ended up being rejected. Just wondering if this is normal? Should I review the paper and pretend it's my first time reading it? Btw, I'm not an expert in that field; the topic is from one of my collaborations. submitted by /u/famous-BlueRaincoat [link] [comments]
- [D] Average Number of Interviews to Get a Job (US) by /u/Zealousideal-Egg1354 (Machine Learning) on February 14, 2026 at 7:33 pm
Hi all, do you have a guess of what the average number of interviews is before getting a job offer in ML in the US? I've done 23 interviews in the last ~8 months without an offer. I don't know if they find my experience outdated, or if my background is actually okay but they keep choosing someone who worked in a job more recently, or if there is a problem in the way I communicate, or something else. Between 2020 and 2023, I worked as a Data Scientist for ~3 years. Here is what I did during this period: • Curated high-quality question–answer pairs from company documents and fine-tuned an LLM (RoBERTa) for extractive question answering. This resulted in a 20% improvement in exact match score. • Trained, optimized, and evaluated a deep learning model to predict whether changes in documents need to be reported. Experimented with MLflow and deployed it as a REST API. • Fine-tuned a BERT-based sentence transformer and built an NLP pipeline to extract key topics from company documents. Deployed and integrated the model into an application to deliver actionable document insights. • Designed and implemented end-to-end ETL pipelines with Python, Spark, and SQL to ingest data from different document sources, extract the right data from these documents, and apply various data/text preprocessing methods to ensure data quality, diversity, and compatibility with downstream machine learning models. • Built, optimized, and deployed a deep learning pipeline to classify regulatory questions into correct categories and integrated it into an application, which saved the department approximately $1,500,000. After 2023, I started my Master of Science program in Computer Science at a T20 university in the US. I graduated in May 2025. I did an agentic AI project like this: • Built a multi-agent data analytics chatbot using GPT-4 and LangGraph to orchestrate specialized LangChain tools for file parsing, automated statistical analysis, anomaly detection, and data visualization.
• Implemented production-ready infrastructure with authentication, session management, file management, caching, and rate limiting. • Implemented backend API with FastAPI and containerized deployment on AWS EC2 using Docker and Docker Compose. submitted by /u/Zealousideal-Egg1354 [link] [comments]
- [P] I trained YOLOX from scratch to avoid Ultralytics' AGPL (aircraft detection on iOS) by /u/MzCWzL (Machine Learning) on February 14, 2026 at 2:43 pm
submitted by /u/MzCWzL [link] [comments]