What are the top 3 methods used to find Autoregressive Parameters in Data Science?


To find autoregressive parameters, you first need to understand what autoregression is. Autoregression is a statistical method that models a variable as a linear function of its own lagged values. In other words, it is a model that uses past values of a variable to predict future values of that same variable.

In time series analysis, autoregression is the use of previous values in a time series to predict future values. It is a form of regression in which the dependent variable is forecast using a linear combination of its own past values, which play the role of the independent variables. The parameter values of the autoregression model are most often estimated using the method of least squares.
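Written out, an autoregressive model of order p, usually abbreviated AR(p), takes the form below. The symbols are standard notation rather than anything defined earlier in this post: $c$ is an intercept, $\phi_1, \dots, \phi_p$ are the autoregressive parameters, and $\varepsilon_t$ is the error term at time $t$.

$$ y_t = c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p} + \varepsilon_t $$

Estimating an autoregression means choosing values for $c$ and the $\phi_i$ from observed data.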


The autoregressive parameters are the coefficients in the autoregressive model. These coefficients can be estimated in a number of ways, including ordinary least squares (OLS), maximum likelihood (ML), or least squares with L1 regularization (LASSO). Once estimated, the autoregressive parameters can be used to predict future values of the dependent variable.

The most common way to find the autoregressive parameters is least squares regression. This method finds the parameters that minimize the sum of the squared residuals. A residual is simply the difference between the actual value and the predicted value. So, in essence, you are finding the parameters that best fit the data.
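As a rough illustration of the least squares idea, the sketch below builds a matrix of lagged values and lets NumPy solve for the coefficients. The function name `fit_ar_ols`, the simulated series, and the "true" values 0.5 and 0.3 are all assumptions made for this example, and the model is fitted without an intercept to keep it short.

```python
import numpy as np


def fit_ar_ols(series, p):
    """Estimate AR(p) coefficients (no intercept) by ordinary least squares."""
    y = np.asarray(series, dtype=float)
    # The row for time t holds the p previous values: y[t-1], y[t-2], ..., y[t-p].
    X = np.column_stack([y[p - k : len(y) - k] for k in range(1, p + 1)])
    target = y[p:]
    # lstsq returns the coefficients that minimize the sum of squared residuals.
    coeffs, *_ = np.linalg.lstsq(X, target, rcond=None)
    return coeffs


# Simulate a zero-mean AR(2) series with known parameters (illustrative values).
rng = np.random.default_rng(0)
true_phi = np.array([0.5, 0.3])
n = 5000
y = np.zeros(n)
for t in range(2, n):
    y[t] = true_phi[0] * y[t - 1] + true_phi[1] * y[t - 2] + rng.normal()

phi_hat = fit_ar_ols(y, p=2)
print(phi_hat)
```

With a few thousand observations the estimates should land close to the values used in the simulation, which is a handy sanity check before applying the same function to real data.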


How to Estimate Autoregressive Parameters?

There are three main ways to estimate autoregressive parameters: ordinary least squares (OLS), maximum likelihood (ML), and least squares with L1 regularization (LASSO).

Ordinary Least Squares: Ordinary least squares is the simplest and most common method for estimating autoregressive parameters. This method estimates the parameters by minimizing the sum of squared errors between actual and predicted values.
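In symbols, and assuming an AR(p) model with intercept $c$, coefficients $\phi_1, \dots, \phi_p$ and observations $y_1, \dots, y_n$ (notation introduced here for illustration), the OLS estimate can be written as:

$$ (\hat{c}, \hat{\phi}_1, \dots, \hat{\phi}_p) = \arg\min_{c, \phi_1, \dots, \phi_p} \sum_{t=p+1}^{n} \left( y_t - c - \sum_{i=1}^{p} \phi_i y_{t-i} \right)^{2} $$

The sum starts at $t = p+1$ because the first $p$ observations have no complete set of lagged values.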

Maximum Likelihood: Maximum likelihood is another common method for estimating autoregressive parameters. This method estimates the parameters by maximizing the likelihood function. The likelihood function is a mathematical function that quantifies the probability of observing a given set of data given certain parameter values.
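To make the likelihood idea concrete, here is a small sketch for a zero-mean AR(1) model with Gaussian errors. It uses the conditional likelihood (treating the first observation as fixed) and a crude grid search instead of a proper optimizer. The function name, the simulated data, and the value 0.7 are assumptions for illustration only. Under these assumptions the conditional ML estimate coincides with the least squares estimate, which the comparison at the end makes visible up to the grid spacing.

```python
import numpy as np


def ar1_log_likelihood(phi, y):
    """Conditional Gaussian log-likelihood of a zero-mean AR(1), given y[0]."""
    resid = y[1:] - phi * y[:-1]
    m = len(resid)
    # For a fixed phi, the likelihood-maximizing variance is the mean squared residual.
    sigma2 = np.mean(resid ** 2)
    return -0.5 * m * (np.log(2 * np.pi * sigma2) + 1.0)


# Simulated AR(1) series (illustrative parameter value 0.7).
rng = np.random.default_rng(1)
n = 2000
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.7 * y[t - 1] + rng.normal()

# Crude maximization: evaluate the likelihood on a grid of candidate values (step 0.01).
grid = np.linspace(-0.99, 0.99, 199)
phi_ml = grid[np.argmax([ar1_log_likelihood(phi, y) for phi in grid])]

# Closed-form least squares estimate for comparison.
phi_ols = np.sum(y[1:] * y[:-1]) / np.sum(y[:-1] ** 2)
print(phi_ml, phi_ols)
```

Exact maximum likelihood, which also models the distribution of the first observation, can give slightly different answers on short series.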

Least Squares with L1 Regularization: Least squares with L1 regularization is another method for estimating autoregressive parameters. This method estimates the parameters by minimizing the sum of squared errors between actual and predicted values while also penalizing large coefficients. L1 regularization adds an extra term to the error function that is proportional to the sum of the absolute values of the coefficients, which tends to shrink the least useful coefficients to exactly zero.
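The following sketch shows one way the L1-penalized fit could be computed, using plain coordinate descent with soft thresholding. The objective scaling (squared error divided by twice the number of rows), the penalty weight `alpha=0.05`, the function names, and the simulated AR(1) data are all assumptions chosen for this example. In practice a tested library implementation would be the safer choice.

```python
import numpy as np


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0 inside [-threshold, threshold]."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)


def lag_matrix(series, p):
    """Return (X, target), where each row of X holds the p previous values."""
    y = np.asarray(series, dtype=float)
    X = np.column_stack([y[p - k : len(y) - k] for k in range(1, p + 1)])
    return X, y[p:]


def fit_ar_lasso(series, p, alpha, n_sweeps=200):
    """Minimize (1/(2m)) * ||target - X b||^2 + alpha * ||b||_1 by coordinate descent."""
    X, target = lag_matrix(series, p)
    m = len(target)
    b = np.zeros(p)
    col_scale = np.sum(X ** 2, axis=0) / m
    for _ in range(n_sweeps):
        for j in range(p):
            # Residual with lag j's current contribution added back in.
            partial_resid = target - X @ b + X[:, j] * b[j]
            rho = X[:, j] @ partial_resid / m
            b[j] = soft_threshold(rho, alpha) / col_scale[j]
    return b


# Simulated AR(1) data (true coefficient 0.6), deliberately fitted with 5 lags.
rng = np.random.default_rng(2)
n = 3000
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.6 * y[t - 1] + rng.normal()

b_lasso = fit_ar_lasso(y, p=5, alpha=0.05)
X, target = lag_matrix(y, 5)
b_ols, *_ = np.linalg.lstsq(X, target, rcond=None)
print(b_lasso)
print(b_ols)
```

Comparing the two printed vectors shows the penalty at work: the LASSO coefficients are pulled toward zero relative to the OLS ones, with the unnecessary higher lags affected most.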

Finding Autoregressive Parameters: The Math Behind It
To find the autoregressive parameters using least squares regression, you first need to set up your data in a certain way. You need the dependent variable (the current value of the series) in one column and the predictors (its lagged values) in the other columns. For example, let’s say you want to use three years of data to predict next year’s sales. Your data would look something like this:

| Year | Sales |
| 2016 | 100 |
| 2017 | 150 |
| 2018 | 200 |

Because the model is autoregressive, the predictor is the previous year’s sales rather than the year itself. Pairing each observation with its predecessor leaves two usable rows:

| Year | Sales $y_t$ | Previous sales $x_t = y_{t-1}$ |
| 2017 | 150 | 100 |
| 2018 | 200 | 150 |

Next, you need to calculate the mean of each column:

$$ \bar{x} = \frac{100+150}{2} = 125, \qquad \bar{y} = \frac{150+200}{2} = 175 $$

Now we can calculate the sums of squared deviations and cross-products that make up the (unnormalized) variance-covariance matrix. Dividing both by $n-1$ would give the usual sample variance and covariance, but that factor cancels in the ratio we need:

$$ \operatorname {Var} (X)=\sum _{i=1}^{n}\left({x_{i}}-{\bar {x}}\right)^{2} $$

$$ \operatorname {Cov} (X,Y)=\sum _{i=1}^{n}\left({x_{i}}-{\bar {x}}\right)\left({y_{i}}-{\bar {y}}\right) $$

For our sales example, that calculation would look like this:

$$ \operatorname {Var} (X)=\sum _{i=1}^{2}\left({x_{i}}-{\bar {x}}\right)^{2}=(100-125)^{2}+(150-125)^{2}=1250 $$

$$ \operatorname {Cov} (X,Y)=\sum _{i=1}^{2}\left({x_{i}}-{\bar {x}}\right)\left({y_{i}}-{\bar {y}}\right)=(100-125)(150-175)+(150-125)(200-175)=1250 $$

Now we can finally calculate our autoregressive parameter! In matrix form the least squares solution is

$$ \hat {\beta }=(X^{\prime }X)^{-1}X^{\prime }Y $$

and for a single lag with an intercept it reduces to

$$ \hat {\beta }=\frac {\operatorname {Cov} (X,Y)}{\operatorname {Var} (X)}=\frac {1250}{1250}=1, \qquad \hat {c}=\bar {y}-\hat {\beta }\,\bar {x}=175-1\times 125=50 $$

That’s it! Our autoregressive parameter is 1, with an intercept of 50. Once we have these values, we can plug them into our autoregressive equation:

$$ Y_{t+1}=50+1\cdot Y_{t}+\varepsilon _{t+1} $$

where $\varepsilon_{t+1}$ is the error term of this AR(1) model. The forecast for 2019 is therefore $50 + 200 = 250$. With only two lagged pairs the line passes through both points exactly, so this example illustrates the arithmetic rather than producing a reliable estimate. In reality you would be working with much larger datasets, but the underlying principles are still the same. Once you have your autoregressive parameters, you can plug them into the equation and start making predictions!
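As a quick numerical check, the sketch below fits an AR(1) model with an intercept to the three sales figures from the table, regressing each year’s sales on the previous year’s sales. The variable names are chosen for this example only. Because only two lagged pairs are available, the fitted line passes through both points exactly.

```python
sales = [100, 150, 200]          # 2016, 2017, 2018

x = sales[:-1]                   # previous year's sales: [100, 150]
y = sales[1:]                    # current year's sales:  [150, 200]

x_mean = sum(x) / len(x)
y_mean = sum(y) / len(y)

# Sums of squared deviations and cross-products.
sxx = sum((xi - x_mean) ** 2 for xi in x)
sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))

slope = sxy / sxx                # autoregressive parameter
intercept = y_mean - slope * x_mean

forecast_2019 = intercept + slope * sales[-1]
print(slope, intercept, forecast_2019)   # prints: 1.0 50.0 250.0
```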

Which Method Should You Use?
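The estimation method you should use depends on your particular situation and goals. If you are looking for simple and interpretable results, then Ordinary Least Squares may be the best method for you. Maximum Likelihood is worth considering when you want to assume a specific error distribution or extend the model (for example with moving-average terms), and Least Squares with L1 Regularization can help when you have many candidate lags and want the least useful ones shrunk toward zero. Neither is guaranteed to predict more accurately, so compare the methods on held-out data.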


Autoregressive models STEP BY STEP:

1) Download data: The first step is to download some data. This can be done by finding a publicly available dataset or by using your own data if you have any. For this example, we will be using data from the United Nations Comtrade Database.

2) Choose your variables: Once you have your dataset, you will need to choose the variables you want to use in your autoregression model. In our case, the dependent variable is the trade balance, and the predictors are its own lagged values. Lagged import and export values of goods between countries can be added as extra independent variables, which strictly speaking turns the model into an autoregression with exogenous inputs (often called ARX).

3) Estimate your model: After choosing your variables, you can estimate your autoregression model using the method of least squares. OLS estimation can be done in many statistical software packages such as R or Stata.

4) Interpret your results: Once you have estimated your model, it is important to interpret the results in order to understand what they mean. Each coefficient represents the effect that one predictor has on the dependent variable, holding the other predictors fixed. In our case, the coefficients on the lagged values describe how strongly past trade balances carry over into the current one, and any coefficients on imports and exports represent their effect on trade balance. A positive coefficient indicates that an increase in the predictor leads to an increase in the dependent variable, while a negative coefficient indicates that an increase in the predictor leads to a decrease in the dependent variable.

5) Make predictions: Finally, once you have interpreted your results, you can use your autoregression model to make predictions about future values of the dependent variable based on its past values (and on past values of any extra independent variables you included).
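The prediction step can be sketched as a short loop that feeds each forecast back in as input for the next one. The function `forecast_ar` and its arguments are hypothetical names for this example, and the coefficient and intercept in the usage line are placeholder values, not estimates from the Comtrade data.

```python
def forecast_ar(history, coeffs, intercept=0.0, steps=1):
    """Iterate an AR(p) model forward; coeffs[0] multiplies the most recent value."""
    values = list(history)
    p = len(coeffs)
    predictions = []
    for _ in range(steps):
        recent = values[-p:][::-1]   # most recent value first
        next_value = intercept + sum(c * v for c, v in zip(coeffs, recent))
        predictions.append(next_value)
        values.append(next_value)    # feed the forecast back in for later steps
    return predictions


print(forecast_ar([100, 150, 200], coeffs=[1.0], intercept=50.0, steps=3))
# prints: [250.0, 300.0, 350.0]
```

Forecast errors compound when predictions are fed back in like this, so the further ahead you forecast, the less weight the numbers deserve.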

Conclusion: In this blog post, we have discussed what autoregression is and how to find autoregressive parameters. 

Estimating an autoregression model is a relatively simple process that can be done in many statistical software packages such as R or Stata.

In statistics and machine learning, autoregression is a modeling technique used to describe the linear relationship between a variable and one or more of its own past values. To find the autoregressive parameters, you can use a method known as least squares regression, which minimizes the sum of squared residuals. This blog post also explains how to set up your data for least squares regression, as well as how to calculate the variance and covariance before finally calculating your autoregressive parameters. After finding your parameters, you can plug them into an autoregressive equation to start making predictions about future values!

We have also discussed three different methods for estimating those parameters: Ordinary Least Squares, Maximum Likelihood, and Least Squares with L1 Regularization. The appropriate estimation method depends on your particular goals and situation.


Autoregressive Model

Autoregressive generative models can estimate complex continuous data distributions such as trajectory rollouts in an RL environment, image intensities, and audio. Traditional techniques discretize continuous data into various bins and approximate the continuous data distribution using categorical distributions over the bins. This approximation is parameter inefficient as it cannot express abrupt changes in density without using a significant number of additional bins. Adaptive Categorical Discretization (ADACAT) is proposed in this paper as a parameterization of 1-D conditionals that is expressive, parameter efficient, and multimodal. A vector of interval widths and masses is used to parameterize the distribution known as ADACAT. Figure 1 showcases the difference between the traditional uniform categorical discretization approach with the proposed ADACAT.

Each component of the ADACAT distribution has non-overlapping support, making it a specific subfamily of mixtures of uniform distributions. ADACAT generalizes uniformly discretized 1-D categorical distributions. The proposed architecture allows for variable bin widths and approximates the modes of a mixture of two Gaussians more closely than a uniformly discretized categorical distribution, making it more expressive than the latter. Additionally, a distribution’s support is discretized using quantile-based discretization, which bins data into groups with similar numbers of measured data points. In problems with more than one dimension, ADACAT uses deep autoregressive frameworks to factorize the joint density into numerous 1-D conditional ADACAT distributions.

