Machine learning techniques are being used to address unstructured data challenges in a number of ways:
Overall, machine learning techniques are being used in a wide range of applications to extract meaningful information from unstructured data, and to enable the automatic classification and analysis of data in a variety of formats.
Artificial intelligence (AI) and machine learning are having a significant impact on application development today in a number of ways:
Overall, AI and machine learning are having a significant impact on application development today, and they are likely to continue to shape the way applications are built and used in the future.
Advancements in artificial intelligence (AI) and machine learning are likely to shape the future of work and society in a number of ways. Some potential impacts include:
Overall, the impact of AI and machine learning on the future of work and society is likely to be significant and complex, with both potential benefits and challenges. It will be important to consider and address these impacts as these technologies continue to advance and become more widely adopted.
Hi all, I've been using PyTorch since 2019, and it has changed a lot in that time (especially since Hugging Face). Are there any modern guides/style-docs/example-repos you would recommend? For example, are named tensors a good/common practice? Is PyTorch Lightning recommended? What are the best config management tools these days? How often do you use torch.jit.script or torch.compile? submitted by /u/SirBlobfish
Hi everyone, I have a question about how to improve XGBoost model performance when the residuals between the forecasted values and the actual target values show a pattern. In the graph I uploaded, each point represents an actual versus predicted pair. I've also added a diagonal line indicating a perfect match between predicted and actual values. What strategies can I explore to reduce this residual pattern and improve the alignment between my forecast and the actual values? Should I consider changing the model type, or are there other methods, such as feature engineering or statistical adjustments, that can address this issue? Any advice on model diagnostics or evaluation techniques that might help refine my model further? https://preview.redd.it/shj4g7m9hvxc1.png?width=846&format=png&auto=webp&s=b8e658c9ee7b8ce7612e3cd1f199a1682e056dca submitted by /u/tipoviento
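One way to quantify the residual pattern described in this post is to average the residuals within bins of the predicted value. The sketch below is only an illustration: the function name `mean_residual_by_bin` and the toy numbers are assumptions, not taken from the post. If the mean residual changes sign from the low bins to the high bins, the predictions are compressed toward the middle of the target range.

```python
def mean_residual_by_bin(actual, predicted, n_bins=4):
    """Return (mean prediction, mean residual) for equal-sized bins of predictions.

    Residual is defined here as actual - predicted.
    """
    # Sort pairs by predicted value so each bin covers a prediction range.
    pairs = sorted(zip(predicted, actual))
    n = len(pairs)
    summary = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        mean_res = sum(a - p for p, a in chunk) / len(chunk)
        summary.append((mean_pred, mean_res))
    return summary


# Toy data (hypothetical): predictions are pulled toward the centre.
actual = [0, 2, 4, 6, 8, 10, 12, 14]
predicted = [2, 3, 5, 6, 8, 9, 11, 12]
for mean_pred, mean_res in mean_residual_by_bin(actual, predicted, n_bins=2):
    print(f"mean prediction {mean_pred:.1f}: mean residual {mean_res:+.1f}")
```

A flat profile near zero suggests no systematic bias; a sloped profile like the one in the toy data points toward missing features, a target transform, or a post-hoc calibration step as things to try.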
Hi everyone, I'd like to introduce Switchify, a text-prompt-to-image-segmentation labelling tool. Check it out at https://runswitchify.com. Just sign up, upload an image, and start labelling. I think it'd be really useful for anyone trying to clean and process their image training data. I'd love any feedback on the product and general thoughts. Hope you guys enjoy trying it out. submitted by /u/Fun_Win_6054
Has anyone tried using a moving average when training an unsupervised LSTM for anomaly detection, or for any other ML task? I have a dataset with a date and a value (essentially one feature). I trained on a dataset with no anomalous data, hoping the model could reconstruct what the normal trend should look like. However, while the validation loss is low, the reconstruction was really bad and failed (a constant line at 0). I was wondering whether I should use a moving average to smooth the values, because the values in my dataset jump a lot (they still stay within a certain range, so they are not outliers). Does anyone have any tips on this? I'm just trying out a simple stacked LSTM. submitted by /u/RandomPasserBy44
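The smoothing step asked about here can be sketched in a few lines. This is a minimal illustration with hypothetical helper names (`moving_average`, `min_max_scale`); the scaling helper is included on the assumption that unscaled inputs are one common cause of a flat reconstruction, which the post does not confirm.

```python
from collections import deque


def moving_average(values, window):
    """Trailing moving average; early outputs average over the points seen so far."""
    if window < 1:
        raise ValueError("window must be >= 1")
    buffer = deque()
    total = 0.0
    smoothed = []
    for v in values:
        buffer.append(v)
        total += v
        if len(buffer) > window:
            total -= buffer.popleft()  # drop the oldest value from the running sum
        smoothed.append(total / len(buffer))
    return smoothed


def min_max_scale(values):
    """Scale values to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


series = [1, 2, 3, 4, 5]
print(moving_average(series, window=3))  # [1.0, 1.5, 2.0, 3.0, 4.0]
print(min_max_scale([2, 4, 6]))          # [0.0, 0.5, 1.0]
```

Note that smoothing also flattens short spikes, so anomalies scored on the smoothed series may look weaker than on the raw one.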
Not that many people are paying attention to LLM interpretability research when capabilities research is moving as fast as it currently is, but interpretability is really important and, in my opinion, really interesting and exciting! Anthropic has made a lot of breakthroughs in recent months, the biggest one being "Towards Monosemanticity". The basic idea is that they found a way to train a sparse autoencoder to generate interpretable features based on transformer activations. This allows us to look at the activations of a language model during inference and understand which parts of the model are most responsible for predicting each next token. Something that really stood out to me was that the autoencoders they train to do this are actually very small and would not require a lot of compute to get working. This gave me the idea to try to replicate the research by training models on my M3 MacBook. After a lot of reading and experimentation, I was able to get pretty strong results! I wrote a more in-depth post about it on my blog here: https://jakeward.substack.com/p/monosemanticity-at-home-my-attempt I'm now working on a few follow-up projects using this tech, as well as a minimal implementation that can run in a Colab notebook to make it more accessible. If you read my blog, I'd love to hear any feedback! submitted by /u/neverboosh
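To make the sparse autoencoder idea in this post concrete, here is a simplified forward pass and loss in plain Python. It is a sketch under assumptions, not the author's or Anthropic's implementation: the function names, the tiny hand-picked weights, and the `l1_coeff` value are all hypothetical. The encoder applies a ReLU, the decoder reconstructs the activation, and the L1 term pushes most features to zero.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def sae_forward(x, w_enc, b_enc, w_dec, b_dec):
    """Encode an activation vector into non-negative features, then decode it."""
    pre = [h + b for h, b in zip(matvec(w_enc, x), b_enc)]
    features = [max(0.0, p) for p in pre]  # ReLU keeps features non-negative
    recon = [r + b for r, b in zip(matvec(w_dec, features), b_dec)]
    return features, recon


def sae_loss(x, features, recon, l1_coeff=0.1):
    """Reconstruction error plus an L1 sparsity penalty on the features."""
    mse = sum((a - b) ** 2 for a, b in zip(x, recon)) / len(x)
    l1 = sum(abs(f) for f in features)
    return mse + l1_coeff * l1


# Hypothetical 2-dimensional activation and 3 dictionary features.
x = [1.0, -1.0]
w_enc = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
b_enc = [0.0, 0.0, 0.0]
w_dec = [[1.0, 0.0, -1.0], [0.0, 1.0, 0.0]]
b_dec = [0.0, 0.0]

features, recon = sae_forward(x, w_enc, b_enc, w_dec, b_dec)
print(features, recon, sae_loss(x, features, recon))
```

In a real setup the weights would be learned by gradient descent over many activation vectors, and the feature dimension would be much larger than the activation dimension.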
Paper: https://arxiv.org/abs/2404.19756 Code: https://github.com/KindXiaoming/pykan Quick intro: https://kindxiaoming.github.io/pykan/intro.html Documentation: https://kindxiaoming.github.io/pykan/ Abstract: Inspired by the Kolmogorov-Arnold representation theorem, we propose Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs). While MLPs have fixed activation functions on nodes ("neurons"), KANs have learnable activation functions on edges ("weights"). KANs have no linear weights at all -- every weight parameter is replaced by a univariate function parametrized as a spline. We show that this seemingly simple change makes KANs outperform MLPs in terms of accuracy and interpretability. For accuracy, much smaller KANs can achieve comparable or better accuracy than much larger MLPs in data fitting and PDE solving. Theoretically and empirically, KANs possess faster neural scaling laws than MLPs. For interpretability, KANs can be intuitively visualized and can easily interact with human users. Through two examples in mathematics and physics, KANs are shown to be useful collaborators helping scientists (re)discover mathematical and physical laws. In summary, KANs are promising alternatives for MLPs, opening opportunities for further improving today's deep learning models which rely heavily on MLPs. https://preview.redd.it/r7vjmp31juxc1.png?width=2326&format=png&auto=webp&s=a2c722cf733510194659b9aaec24269a7f9e5d47 submitted by /u/SeawaterFlows
As the title says: this was a conversation I read just recently, and I am now looking for the source. A specific paper was mentioned in there as well. The conclusion drawn was that we might be at the limit of what we can do with statistical models and that there's nothing special about the models themselves; only the data they are fed matters. Any pointers would be appreciated, thanks! submitted by /u/SunraysInTheStorm
I know this might be a bit niche for a subreddit like this, but I'm just trying to get as much feedback as possible. I'm making a project. It was originally going to be a type of AI-powered SIEM. It would take in information from multiple network intrusion detection systems and detect patterns. E.g., Suricata might flag a low-level alert, and Snort and other similar systems might do the same. Previously these alerts might have been disregarded, but an AI-powered SIEM could detect a pattern across these alerts, decide whether it's an attack or just a false positive, and then decide what type of attack it is. After researching this project further, I have created an AI that can perform this task just from basic NetFlow info. I'm aware this is a very broad question to ask, but I was wondering if anyone had any ideas for the next step in this project, or maybe even a potential feature for the finished system. I've completed my original goal to a degree. If you have any questions, feel free to ask. Any feedback would be amazing. Thanks for reading submitted by /u/OpeningDirector1688
ICML 2024 paper acceptance results are supposed to be out soon. Creating a discussion thread for this year's results. There is so much noise in the reviews every year. Some good work that the authors are proud of might get rejected because of the noisy system, given that ICML has grown so large in recent years. We should keep in mind that the work is still valuable no matter what the final result is. submitted by /u/zy415
I'm working on a speaker verification project in which I'm exploring different techniques to verify a speaker by voice. The traditional approach is to extract MFCCs, filterbank features, and prosodic features. This method seems to be outdated, as most research now focuses on pre-trained models like Nvidia's TitaNet and Microsoft's WavLM; SpeechBrain also has a model for this. These pre-trained models output embeddings that represent the speaker's voice regardless of what was said in the recording. My question is: what do these embeddings represent? One of the architectures takes MFCCs as input and then passes them to a neural network such as an LSTM to capture the pattern. submitted by /u/Puzzleheaded_Bee5489
I know that the robot from the company Figure uses OpenAI's GPT as its brain and performs very well; it can already do many chores. So I wonder what areas of robot ability are left to improve? Any opinion is welcome! submitted by /u/CrisYou
I'm working specifically on music (MIDI) generation with transformers. Most of these models/datasets are quite small, e.g. <100 million parameters. If I understand correctly, relative positional encoding (RPE) was quickly adapted with the Music Transformer as a means to embed inter-token distance information into the attention calculations. Separately, rotary position embedding (RoPE) seems to have a similar objective, although I can't grasp from the RoFormer paper whether it embeds the same type (quality?) of knowledge (e.g. token X is ... distance from token Y). I guess my question is: for contexts where the models are small and inter-token distance is crucially important, would RPE still potentially be a superior approach? submitted by /u/leoholt
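On the question of whether RoPE carries inter-token distance: in the RoFormer formulation, queries and keys are rotated by position-dependent angles, and the resulting attention score depends only on the offset between the two positions. The sketch below checks that numerically. The helper names and the example vectors are hypothetical, and it says nothing about which scheme works better for small music models.

```python
import math


def rope_rotate(vec, position, base=10000.0):
    """Rotate consecutive pairs of vec by angles proportional to position."""
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("vector length must be even")
    out = []
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)  # lower frequency for later pairs
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


q = [0.3, -1.2, 0.7, 0.5]
k = [1.1, 0.4, -0.6, 0.9]

# Same offset (3) at very different absolute positions gives the same score.
score_near = dot(rope_rotate(q, 5), rope_rotate(k, 2))
score_far = dot(rope_rotate(q, 105), rope_rotate(k, 102))
# A different offset (1) gives a different score.
score_other = dot(rope_rotate(q, 5), rope_rotate(k, 4))

print(score_near, score_far, score_other)
```

So the distance information is present in the score, but it enters through a fixed rotation rather than through a learned per-distance embedding as in RPE.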
Hey folks! I'm Jonathan from TensorDock, and we're building a cloud GPU marketplace. We want to make GPUs truly affordable and accessible. I once started a web hosting service on self-hosted servers in middle school. But building servers isn't the same as selling cloud. There's a lot of open source software to manage your homelab for side projects, but there isn't anything to commercialize that. Large cloud providers charge obscene prices, so much so that they can often pay back their hardware in under 6 months with 24x7 utilization. We are building the software that allows anyone to become the cloud. We want to get to a point where any [insert company, data center, cloud provider with excess capacity] can install our software on their nodes and make money. They might not pay back their hardware in 6 months, but they don't need to do the grunt work: we handle support, software, payments, etc. In turn, you get to access a truly independent cloud: GPUs from around the world from suppliers who compete against each other on pricing and demonstrated reliability. So far, we've onboarded quite a few GPUs, including 200 NVIDIA H100 SXMs available from just $2.49/hr. But we also have A100 80Gs from $1.63/hr, A6000s from $0.47/hr, A4000s from $0.13/hr, etc. Because we are a true marketplace, prices fluctuate with supply and demand. All are available in plain Ubuntu 22.04 or with popular ML packages preinstalled (CUDA, PyTorch, TensorFlow, etc.), and all are hosted by a network of mining farms, data centers, or businesses that we've closely vetted. If you're looking for hosting for your next project, give us a try! Happy to provide testing credits, just email me at jonathan@tensordock.com. And if you do end up trying us, please provide feedback below (or directly!) 🙂 Deploy a GPU VM: https://dashboard.tensordock.com/deploy CPU-only VMs: https://dashboard.tensordock.com/deploy_cpu Apply to become a host: https://tensordock.com/host submitted by /u/jonathan-lei
Hi! I started pondering how LLMs know when to stop generating tokens in response to a prompt. Is the notion of a stop token still used in modern LLMs? Or does a combination of stop tokens and other tricks control the output length? From a fine-tuning standpoint, I get that you can train a model to produce outputs of more or less the same length as those in the training dataset. That is, I imagine that the outputs in instruction datasets are of similar length, and thus instruction-fine-tuned models learn to produce outputs of that length. If this is the case, then what about pretrained foundation models? Is the output length baked into the foundation model or only into subsequent fine-tuned models? submitted by /u/Maltmax
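For context on the stop-token question: decoding loops typically stop when the model emits an end-of-sequence token or when a cap on new tokens is reached, whichever comes first. The sketch below shows that control flow with a toy stand-in for the model. `EOS_ID`, `generate`, and `toy_model` are hypothetical names for illustration, not any library's API.

```python
EOS_ID = 0  # hypothetical end-of-sequence token id


def generate(next_token_fn, prompt_ids, max_new_tokens=20, eos_id=EOS_ID):
    """Append tokens until the model emits EOS or the length cap is hit."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        token = next_token_fn(ids)
        if token == eos_id:
            break  # the model itself signalled that the response is complete
        ids.append(token)
    return ids[len(prompt_ids):]


def toy_model(ids):
    """Stand-in for an LLM: counts down from the last token, then emits EOS."""
    last = ids[-1]
    return last - 1 if last > 1 else EOS_ID


print(generate(toy_model, [4]))                     # [3, 2, 1]
print(generate(toy_model, [10], max_new_tokens=3))  # [9, 8, 7]
```

In this picture, output length is not a separate setting in the model: it falls out of when the model assigns high probability to the end-of-sequence token, which is learned from where training examples end.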
In multimodal LLMs, the CLIP encoder is usually frozen. How does this work? Is it simply a linear layer connecting the two models? Are there any papers/guides on this (specifically on connecting two or more models together)? submitted by /u/Small_Emotion8420
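One common pattern for this, used for example in the first LLaVA model, is a small trainable projection that maps frozen image-encoder features into the language model's embedding space; the projected vectors are then placed in the input sequence alongside the text embeddings. The sketch below shows only that data flow. The function names, dimensions, and weights are hypothetical, and real systems operate on batched tensors.

```python
def linear(x, weight, bias):
    """Apply y = W x + b for one feature vector."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def build_input_sequence(image_feats, text_embs, weight, bias):
    """Project each image feature to the LLM width and prepend it to the text."""
    projected = [linear(f, weight, bias) for f in image_feats]
    return projected + text_embs


# Hypothetical sizes: vision features have width 3, LLM embeddings width 2.
weight = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 1.0]]   # trainable projection, shape (2, 3)
bias = [0.0, 0.5]
image_feats = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]  # output of the frozen encoder
text_embs = [[0.1, 0.2]]                          # embeddings of the text tokens

sequence = build_input_sequence(image_feats, text_embs, weight, bias)
print(sequence)
```

Only the projection (and optionally the language model) receives gradient updates in this setup; the image encoder's weights stay fixed.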
ICML 2024 paper acceptance results are supposed to be released in 24 hours or so. I thought I might create this thread for us to discuss anything related to it. There is some noise in the reviews every year. Don't forget that even though your paper might get rejected, this does not mean that it is not valuable work. Good luck everyone! submitted by /u/hugotothechillz
I am trying to use a Lagrangian neural network (LNN) on a large medical dataset with 50+ features and one output column. A. How feasible is that? I feel like using an LNN here loses some of its purpose, but I am confident it may work. B. How do I get an LNN to work on such a large dataset? It seems as though most LNNs I am seeing only have one input and one output column, but that would be next to impossible to implement with my dataset. Thanks submitted by /u/CruisingLettuce