What are the top 10 algorithms every software engineer should know by heart?


As a software engineer, you’re expected to know a lot about algorithms. After all, they are the bread and butter of your trade. But with so many different algorithms out there, how can you possibly keep track of them all?

Never fear! We’ve compiled a list of the top 10 algorithms every software engineer should know by heart. From sorting and searching to graph theory and dynamic programming, these are the algorithms that will make you a master of your craft. So without further ado, let’s get started!

Sorting Algorithms


Sorting algorithms are some of the most fundamental and well-studied algorithms in computer science. They are used to order a list of elements in ascending or descending order. Some of the most popular sorting algorithms include quicksort, heapsort, and mergesort. However, there are many more out there for you to explore.

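As a concrete illustration, here is a minimal quicksort sketch in Python. It uses a simple list-comprehension variant rather than an in-place partition, and the sample input is made up:

```python
def quicksort(arr):
    """Sort a list using the divide-and-conquer quicksort scheme."""
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]                    # middle element as pivot
    left = [x for x in arr if x < pivot]          # strictly smaller elements
    middle = [x for x in arr if x == pivot]       # elements equal to the pivot
    right = [x for x in arr if x > pivot]         # strictly larger elements
    return quicksort(left) + middle + quicksort(right)

print(quicksort([3, 6, 8, 10, 1, 2, 1]))  # [1, 1, 2, 3, 6, 8, 10]
```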

Searching Algorithms


Searching algorithms are used to find an element in a list of elements. The most famous search algorithm is probably binary search, which is used to find an element in a sorted list. However, there are many other search algorithms out there, such as linear search and interpolation search.

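For illustration, a minimal iterative binary search sketch in Python (it assumes the input list is already sorted; the sample data is made up):

```python
def binary_search(items, target):
    """Return the index of target in the sorted list items, or -1 if absent."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            lo = mid + 1   # target can only be in the right half
        else:
            hi = mid - 1   # target can only be in the left half
    return -1

print(binary_search([1, 3, 5, 7, 9, 11], 7))  # 3
```

Each comparison halves the remaining search space, giving O(log n) time.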

Graph Theory Algorithms


Graph theory is the study of graphs and their properties. Graph theory algorithms are used to solve problems on graphs, such as finding the shortest or lowest-cost path between two nodes. Some of the most famous graph theory algorithms include Dijkstra’s algorithm and the Bellman-Ford algorithm.

This graph has six nodes (A-F) and eight arcs. It can be represented by the following Python data structure:

    graph = {'A': ['B', 'C'],
             'B': ['C', 'D'],
             'C': ['D'],
             'D': ['C'],
             'E': ['F'],
             'F': ['C']}

    def find_all_paths(graph, start, end, path=None):
        path = (path or []) + [start]
        if start == end:
            return [path]
        if start not in graph:  # dict.has_key() was removed in Python 3
            return []
        paths = []
        for node in graph[start]:
            if node not in path:
                for newpath in find_all_paths(graph, node, end, path):
                    paths.append(newpath)
        return paths

A sample run:

    >>> find_all_paths(graph, 'A', 'D')
    [['A', 'B', 'C', 'D'], ['A', 'B', 'D'], ['A', 'C', 'D']]
    >>>

A breadth-first search finds a shortest path (fewest edges):

    # Code by Eryk Kopczyński, adapted for Python 3
    from collections import deque

    def find_shortest_path(graph, start, end):
        dist = {start: [start]}  # node -> shortest path from start
        q = deque([start])
        while q:
            at = q.popleft()
            for nxt in graph[at]:
                if nxt not in dist:
                    dist[nxt] = dist[at] + [nxt]
                    q.append(nxt)
        return dist.get(end)

Dynamic Programming Algorithms


Dynamic programming is a technique for solving problems that can be divided into subproblems. Dynamic programming algorithms are used to find the optimal solution to a problem by breaking it down into smaller subproblems and solving each one optimally. Some of the most famous dynamic programming algorithms include Floyd-Warshall algorithm and Knapsack problem algorithm.
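As a sketch of the technique, here is a minimal bottom-up solution to the 0/1 knapsack problem in Python; the item values, weights, and capacity are made up for illustration:

```python
def knapsack(weights, values, capacity):
    """0/1 knapsack via bottom-up dynamic programming.

    dp[w] holds the best total value achievable with capacity w using
    the items processed so far.
    """
    dp = [0] * (capacity + 1)
    for wt, val in zip(weights, values):
        # Iterate capacities downward so each item is used at most once.
        for w in range(capacity, wt - 1, -1):
            dp[w] = max(dp[w], dp[w - wt] + val)
    return dp[capacity]

print(knapsack([1, 3, 4, 5], [1, 4, 5, 7], 7))  # 9 (take the items of weight 3 and 4)
```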

Number Theory Algorithms


Number theory is the study of integers and their properties. Number theory algorithms are used to solve problems on integers, such as factorization or primality testing. Some of the most famous number theory algorithms include Pollard’s rho algorithm and Miller-Rabin primality test algorithm.

Example: a school-method based Python 3 program to check whether a number is prime:

    def isPrime(n):
        # Corner case
        if n <= 1:
            return False

        # Check from 2 to n - 1
        for i in range(2, n):
            if n % i == 0:
                return False

        return True

    # Driver program to test the function above
    print("true") if isPrime(11) else print("false")
    print("true") if isPrime(14) else print("false")


This code is contributed by Smitha Dinesh Semwal

Combinatorics Algorithms


Combinatorics is the study of combinatorial objects, such as permutations, combinations, and partitions. Combinatorics algorithms are used to solve problems on combinatorial objects, such as enumeration or generation problems. Some of the most famous combinatorics algorithms include Gray code algorithm and Lehmer code algorithm.

Example: a Python program to print all permutations using a library function:

    from itertools import permutations

    # Get all permutations of [1, 2, 3]
    perm = permutations([1, 2, 3])

    # Print the obtained permutations
    for p in perm:
        print(p)


Output: 

(1, 2, 3)
(1, 3, 2)
(2, 1, 3)
(2, 3, 1)
(3, 1, 2)
(3, 2, 1)

It generates n! permutations if the length of the input sequence is n.
If you want permutations of a fixed length L, pass it as the second argument: permutations(iterable, L).

Geometry Algorithms


Geometry is the study of shapes and their properties. Geometry algorithms are used to solve problems on shapes, such as finding the area or volume of a shape or finding the intersection point of two lines. Some of the most famous geometry algorithms include Heron’s formula and Bresenham’s line drawing algorithm.
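As a quick illustration, Heron's formula computes a triangle's area from its three side lengths; the side lengths below are made up:

```python
import math

def heron_area(a, b, c):
    """Area of a triangle from its three side lengths (Heron's formula)."""
    s = (a + b + c) / 2          # semi-perimeter
    return math.sqrt(s * (s - a) * (s - b) * (s - c))

print(heron_area(3, 4, 5))  # 6.0 (the classic 3-4-5 right triangle)
```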

Cryptography Algorithms

Cryptography is the study of encryption and decryption techniques. Cryptography algorithms are used to encrypt or decrypt data. Some of the most famous cryptography algorithms include the RSA algorithm and the Diffie-Hellman key exchange.

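A minimal O(n) rotational (Caesar-style) cipher sketch in Python, rotating letters and digits by a fixed shift; the sample input is illustrative:

```python
def rotate(text, shift):
    """Rotate each letter and digit by `shift` positions, O(n) in len(text)."""
    out = []
    for ch in text:
        if ch.isupper():
            out.append(chr((ord(ch) - ord('A') + shift) % 26 + ord('A')))
        elif ch.islower():
            out.append(chr((ord(ch) - ord('a') + shift) % 26 + ord('a')))
        elif ch.isdigit():
            out.append(chr((ord(ch) - ord('0') + shift) % 10 + ord('0')))
        else:
            out.append(ch)  # punctuation and spaces pass through unchanged
    return ''.join(out)

print(rotate("Zebra-493?", 3))  # "Cheud-726?"
```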

String Matching Algorithm

String matching algorithms are used to find occurrences of one string within another string or text. Some of the most famous string matching algorithms include the Knuth-Morris-Pratt algorithm and the Boyer-Moore string search algorithm.
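For illustration, a minimal Knuth-Morris-Pratt sketch in Python (a standard formulation of the algorithm, with a made-up text and pattern):

```python
def kmp_search(text, pattern):
    """Return all start indices of pattern in text (Knuth-Morris-Pratt)."""
    if not pattern:
        return []
    # Failure table: length of the longest proper prefix of pattern
    # that is also a suffix of pattern[:i + 1].
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    # Scan the text, reusing the table to avoid re-comparing prefixes.
    matches, k = [], 0
    for i, ch in enumerate(text):
        while k and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = fail[k - 1]
    return matches

print(kmp_search("abababca", "abab"))  # [0, 2]
```

The failure table is what lets KMP run in O(n + m) instead of the naive O(n·m).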

Data Compression Algorithms

Data compression algorithms are used to reduce the size of data files without losing any information. Some of the most famous data compression algorithms include the Lempel-Ziv-Welch (LZW) algorithm and run-length encoding (RLE). These are just some of the many important algorithms every software engineer should know by heart! Whether you’re just starting out in your career or looking to sharpen your skill set, learning these algorithms will certainly help you on your way!
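A minimal run-length encoding sketch in Python (illustrative only; real compressors pack the runs into bytes rather than tuples):

```python
def rle_encode(data):
    """Run-length encode a string into (char, count) pairs."""
    if not data:
        return []
    runs = []
    prev, count = data[0], 1
    for ch in data[1:]:
        if ch == prev:
            count += 1
        else:
            runs.append((prev, count))
            prev, count = ch, 1
    runs.append((prev, count))  # flush the final run
    return runs

def rle_decode(runs):
    """Inverse of rle_encode."""
    return ''.join(ch * count for ch, count in runs)

print(rle_encode("aaabccddd"))  # [('a', 3), ('b', 1), ('c', 2), ('d', 3)]
```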

According to Konstantinos Ameranis, here are also some of the top 10 algorithms every software engineer should know by heart:

I wouldn’t say specific algorithms so much as groups of algorithms.

Greedy algorithms.

If your problem can be solved with an algorithm that can make a decision now such that the decision is still optimal at the end, then you don’t need to look any further. Examples are Prim’s and Kruskal’s algorithms for Minimum Spanning Trees (MST) and the fractional knapsack problem.
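The greedy idea can be sketched for the fractional knapsack, where taking items in decreasing value-per-weight order is provably optimal (the item values and weights below are made up):

```python
def fractional_knapsack(items, capacity):
    """Greedy fractional knapsack: items are (value, weight) pairs.

    Fractions of items are allowed, so the best value/weight ratio
    first is always optimal.
    """
    total = 0.0
    for value, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if capacity <= 0:
            break
        take = min(weight, capacity)          # take as much of this item as fits
        total += value * take / weight
        capacity -= take
    return total

print(fractional_knapsack([(60, 10), (100, 20), (120, 30)], capacity=50))  # 240.0
```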

Divide and Conquer.

Examples of this group are binary search and quicksort. Basically, you divide your problem into smaller sub-problems, solve each one separately, and at the end combine the solutions. Concerning complexity, you will probably get a recurrence, e.g. T(n) = 2T(n/2) + n, which you can solve using the Master theorem.

Graph and search algorithms.

Other than the MST: Breadth-First Search (BFS), Depth-First Search (DFS), Dijkstra, and possibly A*. If you feel you want to go further, look at Bellman-Ford (for graphs with negative edge weights), Branch and Bound, Iterative Deepening, Minimax, and alpha-beta search.

Flows. Basically, Ford-Fulkerson.

Simulated Annealing.

This is a very easy, very powerful randomized optimization algorithm. It gobbles NP-hard problems like Travelling Salesman Problem (TSP) for breakfast.

Hashing. Properties of hashing, known hashing algorithms and how to use them to make a hashtable.

Dynamic Programming.

Examples are the Discrete Knapsack Problem and Longest Common Subsequence (LCS).

Randomized Algorithms.

Two great examples are given by Karger for the MST and Minimum Cut.

Approximation Algorithms.

There is sometimes a trade-off between solution quality and running time. Approximation algorithms can help you get a reasonably good solution to a very hard problem in a reasonable time.

Linear Programming.

Especially the simplex algorithm but also duality, rounding for integer programming etc.

These algorithms are the bread and butter of your trade and will serve you well in your career. Below, we count down another top 10 algorithms every software engineer should know by heart.

Binary Search Tree Insertion


Binary search trees are data structures that allow for fast data insertion, deletion, and retrieval. They are called binary trees because each node can have up to two children. Binary search trees are efficient because they are sorted; this means that when you search for an element in a binary search tree, you can eliminate half of the tree from your search space with each comparison.
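A minimal BST insertion sketch in Python (recursive; duplicates are ignored for simplicity, and the sample keys are made up):

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Insert key into the BST rooted at root; return the (possibly new) root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root  # equal keys are ignored

def inorder(root):
    """In-order traversal yields the keys in sorted order."""
    return inorder(root.left) + [root.key] + inorder(root.right) if root else []

root = None
for k in [8, 3, 10, 1, 6]:
    root = insert(root, k)
print(inorder(root))  # [1, 3, 6, 8, 10]
```

The in-order traversal printing a sorted list is exactly the "sorted" property the paragraph describes.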

Quicksort


Quicksort is an efficient sorting algorithm that works by partitioning the array around a pivot, then sorting each part recursively. Quicksort is a divide and conquer algorithm, which means it breaks down a problem into smaller subproblems, then solves each subproblem recursively. Quicksort is typically faster in practice than other comparison sorts, such as heapsort or mergesort, although its worst case is quadratic.


Dijkstra’s Algorithm


Dijkstra’s algorithm is used to find the shortest path between two nodes in a graph. It is a greedy algorithm, meaning that it makes the locally optimal choice at each step in order to find the global optimum. Dijkstra’s algorithm is used in routing protocols and network design; it is also used in manufacturing to find the shortest path between machines on a factory floor.
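A minimal Dijkstra sketch in Python using a binary heap; the example graph and weights are made up:

```python
import heapq

def dijkstra(graph, source):
    """Shortest distances from source in a graph with non-negative edge weights.

    graph: dict mapping node -> list of (neighbor, weight) pairs.
    """
    dist = {source: 0}
    pq = [(0, source)]                       # (distance, node) min-heap
    while pq:
        d, node = heapq.heappop(pq)
        if d > dist.get(node, float('inf')):
            continue                          # stale queue entry, skip it
        for nbr, w in graph.get(node, []):
            nd = d + w
            if nd < dist.get(nbr, float('inf')):
                dist[nbr] = nd                # greedy: lock in the best known distance
                heapq.heappush(pq, (nd, nbr))
    return dist

g = {'A': [('B', 1), ('C', 4)], 'B': [('C', 2), ('D', 5)], 'C': [('D', 1)], 'D': []}
print(dijkstra(g, 'A'))  # {'A': 0, 'B': 1, 'C': 3, 'D': 4}
```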

Linear Regression


Linear regression is a statistical method used to predict future values based on past values. It is used in many fields, such as finance and economics, to forecast future trends. Linear regression is a simple yet powerful tool that can be used to make predictions about the future.
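For simple one-variable linear regression, the least-squares fit has a closed form; here is a minimal Python sketch (the data points are illustrative):

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); intercept follows from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

a, b = fit_line([1, 2, 3, 4], [2, 4, 6, 8])
print(a, b)  # 2.0 0.0 (the data lies exactly on y = 2x)
```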

K-means Clustering



K-means clustering is a statistical technique used to group similar data points together. It is used in many fields, such as marketing and medicine, to group customers or patients with similar characteristics. K-means clustering is a simple yet powerful tool that can be used to group data points together for analysis.
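A toy 1-D k-means sketch in Python showing the assign/update loop; the points, k, and seed are illustrative, and real uses would reach for a library such as scikit-learn:

```python
import random

def kmeans_1d(points, k, iters=20, seed=0):
    """Toy 1-D k-means: returns the k cluster centers, sorted."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

centers = kmeans_1d([1.0, 1.2, 0.8, 9.0, 9.5, 8.5], k=2)
print([round(c, 3) for c in centers])  # [1.0, 9.0]
```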

Support Vector Machines


Support vector machines are supervised learning models used for classification and regression tasks. They are powerful machine learning models that can be used for data classification and prediction tasks. Support vector machines are widely used in many fields, such as computer vision and natural language processing.

Gradient Descent


Gradient descent is an optimization algorithm used to find the minimum of a function. It is a first-order optimization algorithm, meaning that it uses only first derivatives to find the minimum of a function. Gradient descent is widely used in many fields, such as machine learning and engineering design.
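A minimal gradient-descent sketch in Python on a one-dimensional function; the learning rate, step count, and objective are illustrative:

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Minimize a 1-D function given its derivative `grad`."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)  # step against the gradient (first derivative only)
    return x

# Minimize f(x) = (x - 3)^2, whose derivative is 2*(x - 3); minimum at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(round(x_min, 6))  # 3.0
```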

PageRank


PageRank is an algorithm used by Google Search to rank websites in their search engine results pages (SERP). It was developed by Google co-founder Larry Page and was named after him. PageRank is a link analysis algorithm that assigns a numerical weighting to each element of a hyperlinked set of documents, such as the World Wide Web (WWW), with the purpose of “measuring” its relative importance within the set.(Wikipedia)
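A minimal power-iteration PageRank sketch in Python; the tiny link graph and damping factor d = 0.85 are illustrative, and every node is assumed to appear as a key:

```python
def pagerank(links, d=0.85, iters=50):
    """Power-iteration PageRank over a dict node -> list of outbound links."""
    nodes = list(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        new = {v: (1 - d) / n for v in nodes}  # teleport term
        for v, outs in links.items():
            if outs:
                share = d * rank[v] / len(outs)
                for u in outs:
                    new[u] += share            # v passes rank to its targets
            else:
                for u in nodes:
                    new[u] += d * rank[v] / n  # dangling node: spread evenly
        rank = new
    return rank

r = pagerank({'A': ['B', 'C'], 'B': ['C'], 'C': ['A']})
print(max(r, key=r.get))  # 'C' gets the highest rank: it is linked by both A and B
```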

RSA Encryption


RSA encryption is a public-key encryption algorithm that uses asymmetric key cryptography.(Wikipedia) It was developed by Ron Rivest, Adi Shamir, and Len Adleman in 1977 and has since been widely used in many different applications.(Wikipedia) RSA encryption is used to secure communications between parties and is often used in conjunction with digital signatures.(Wikipedia)
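A toy RSA round trip in Python with deliberately tiny textbook primes (never use parameters like these in practice; requires Python 3.8+ for the modular inverse via `pow`):

```python
def toy_rsa():
    # Tiny illustrative parameters only; real RSA uses huge random primes.
    p, q = 61, 53
    n = p * q                  # public modulus
    phi = (p - 1) * (q - 1)    # Euler's totient of n
    e = 17                     # public exponent, coprime with phi
    d = pow(e, -1, phi)        # private exponent = modular inverse of e
    msg = 42
    cipher = pow(msg, e, n)    # encrypt with the public key (e, n)
    return pow(cipher, d, n)   # decrypt with the private key (d, n)

print(toy_rsa())  # 42
```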

Fourier Transform


The Fourier transform is an integral transform that decomposes a function into its constituent frequencies.(Wikipedia) It was developed by Joseph Fourier in 1807 and has since been widely used in many different applications.(Wikipedia) The Fourier transform has many applications in physics and engineering, such as signal processing and image compression.(Wikipedia)
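A naive discrete Fourier transform sketch in Python (O(n²), for illustration only; real applications use an FFT). A pure cosine at frequency 1 concentrates its energy in bins 1 and n-1:

```python
import cmath
import math

def dft(xs):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(xs)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(xs))
            for k in range(n)]

# One full period of a cosine sampled at 8 points.
signal = [math.cos(2 * math.pi * t / 8) for t in range(8)]
mags = [round(abs(c), 6) for c in dft(signal)]
print(mags)  # [0.0, 4.0, 0.0, 0.0, 0.0, 0.0, 0.0, 4.0]
```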

Conclusion:

These are the top 10 algorithms every software engineer should know by heart! Learning these algorithms will help you become a better software engineer and will give you a solid foundation on which to build your career!

Algorithm Breaking News 2022 – 2023

Instagram algorithm 2022 – 2023

The historical past of the Instagram Algorithm

Since its inception in 2010, Instagram has proved its worth. The platform that was once known simply as a photo-sharing hub has gradually evolved into an eCommerce platform with Instagram Shopping. Today most businesses use Instagram as a marketing tool to extend their reach across the platform. In Instagram's early days, hashtags became a trend for easy grouping and searching. Later on, a product-tagging feature was launched, which made it easier for people to search for products. In 2016, the Instagram algorithm changed in a major way: Instagram launched Stories, live videos, and new business tools for showing off products and gaining more followers.

Read more: How Instagram Algorithm Works In 2022: A Social Media Marketer’s Guide

Instagram uses “Read Path Models” to rank content. It’s an algorithm used by developers to find the best outcome in a project, or a basic filtering algorithm.

Here’s how the algorithm works to rank your content on the Explore page and home feed!


Until 2019, Instagram analyzed your posts using “Write Path Classifiers”.

First, your content is published after the Instagram algorithm confirms it complies with the Community Guidelines.

After that, the algorithm classifies your content based on your post design and captions.

Using photo recognition, Instagram scans your content and finds similarities between your new piece of content and your audience’s previous interactions with your old content.

The same process occurs with your post captions. Your post instantly starts reaching most of your followers, and as engagement rises it gets onto the Explore page.

In the words of an Instagram employee, this “Write Path Classifiers” algorithm didn’t track the most important metrics for keeping the Explore page fresh. That’s why they started building the new version of the algorithm, which you can read about below!

How The New Algorithm Works!

The new algorithm uses 3 Crucial ways to source content for Your Instagram Explore feed!

1. Calculation of Engagement

The Instagram algorithm calculates real-time engagement and upload-time signals to consider your post for the Explore page.

In simple words, Instagram measures how much engagement creators at your level get, how much engagement your recent posts get, and how that engagement has grown since upload time.

Tip: Look at your insights and see what time your followers are highly active and post 40-70 minutes before the peak time.

2. Content Sourcing for Explore Page

This step draws on search queries from Instagram users related to your post.


Instagram finds targeted users to show your post to based on their search queries. Your post will show up at the top of their Explore page.

A post on “Start Your Entrepreneurship Journey” will be shown to people searching for anything from entrepreneurship to a small query about passive income.

From those queries, Instagram sources content for the Explore page.

3. Ranking Step for Explore Page

How long you rank on the Explore page, and to what audience, depends on the engagement you get once you start ranking there.

Passing the sourcing step means your content is eligible to rank on the Explore page.

During this step, the algorithm tracks engagement metrics and their growth to keep your post on the Explore page.

4. Sensitivity Control

Instagram announced Sensitivity Control last year, which changed the Explore algorithm again!

What’s changed?

Instagram launched two new filters, a high-precision filter and a low-precision filter, to maintain better content on Instagram for different audiences.

The Explore page changes every second with every refresh. So does your content’s target audience.

With these two filters, Instagram tries to track engagement from different users and rotate pieces of content.

In simple words, Instagram doesn’t want to show people bad content. That’s why these filters run Explore page content through a database to decide whether it’s suitable to keep running for another minute, hour, or day on Instagram.

5. An Instagram Hashtags Algorithm Doesn’t Exist

You get hashtag reach because Instagram’s old “Write Path Classifier” algorithm applies to every single format of content.

This means your content ranks on hashtags based on its relevance to your post image and caption.

If it’s relevant and gets enough engagement relative to the hashtag’s size, you will rank on hashtags. There is no separate hashtag algorithm to crack. The advice: don’t focus on hashtags that much, and keep your eyes on creating content for the Explore page.

What I learned about Stories Algorithm

Instagram story views increase and decrease based on “navigation” and “interaction”.

What’s navigation?

In Instagram story insights, you will see a metric called “navigation”, and below that you will see:

• Back: the follower swiped back to see your previous story, or someone else’s story they saw before

• Forward: the follower tapped to see your next story

• Next story: the follower moved on to someone else’s story

• Exited: the follower left stories

Finding: if your stories get more Forward and Next Story taps, then Instagram will push your stories to more followers, as it wants users to watch more stories and stay in the Stories tab.

Why? After 2-3 stories, they hit users with an ad!

Interactions: polls, question stickers, quizzes

When viewers interact with these story features, Instagram sees that followers are interacting more than before, and that’s why it starts pushing the story more.

How do interactions like profile visits affect story views?

If your followers visit your profile through stories, then that particular story (if it’s the first one) will receive more views than average; my story with 44 profile visits received the most views. So do something worth a profile visit!

The New Reels Algorithm

I didn’t get much out of the conversation with Instagram employees about Reels.

The only tip was to upload at the highest video quality you can, because your video might lose some quality as Instagram processes and distributes it.

ACLS algorithm 2022 – 2023,

Free copy/print of all the ACLS algorithms

Algorithms for Advanced Cardiac Life Support


Tiktok algorithm 2022 – 2023,

  1. Your first few hours on TikTok are crucial to your growth.

  • Spend a few hours on the For You page, interacting with videos and creators in your niche. After a few hours, you can make your first video.

  2. The very first video plays a huge role in your future: quality content, unique but similar to your niche.

  • 9-15 seconds maximum!

  3. After uploading, wait a few hours before your second video.

  4. Your second video needs to have a hook:

  • “You won’t believe this”

  • “Nobody is talking about this, but”

  • “Did you know that..?”

  • “X tips on how to ..”

  • Your hook needs to be in the first few seconds of the video.

  5. Your videos need to be captivating or strange, so users spend more time on them.

  6. Your next 3 videos should be similar.

  • TikTok usually boosts your first videos; that’s their hook.

  • Now you need to hook TikTok onto your account to keep it boosting you.

  7. You will lose views and engagement.

  • It’s normal; you are not shadow-banned. You just have to do it on your own now.

  • Now it’s time to get more followers.

  8. Do duets/stitches/parts.

  • This way you hook your new followers and cycle up your old videos.

  • Now you need a schedule: 3-4 posts per day works best; wait 3-4 hours before your next post.

  9. Followers > Views.

  • If you have 10k followers, then you need at least 10k views per post to keep growing fast. Don’t follow people just because they follow you.

How Does The Tiktok Algorithm Work? (+10 Viral Hacks To Go)

Youtube algorithm 2022 – 2023,

Google algorithm update 2022 – 2023,

https://developers.google.com/search/updates/ranking

This page lists the latest ranking updates made to Google Search that are relevant to website owners. To learn more about how Google makes improvements to Search and why we share updates, check out our blog post on How Google updates Search. You can also find more updates about Google Search on our blog.

https://blog.google/products/search/how-we-update-search-improve-results/

https://www.seroundtable.com/category/google-updates

Twitter algorithm 2022 – 2023,

Twitter, which was founded in 2006, is still one of the world’s most popular social networking sites. As of 2020, there were over 340 million active Twitter users, with over 500 million tweets posted each day.


That’s a lot of information to sort through. And, if your company is going to utilize Twitter effectively, you must first grasp how Twitter’s timeline algorithm works and then learn the most dependable techniques of getting your information in front of your target audience.

Twitter Timeline Options: Top Tweets and Most Recent Tweets (Latest)

The Twitter Timeline may be configured to show tweets in two ways:

• Top Tweets

• Recent Tweets

These modes may be switched by clicking the stars icon in the upper-right corner of your timeline feed.

The Most Popular Tweets

Top Tweets use an algorithm to display tweets in the order that a user is most likely to be interested in. The algorithm is based on how popular and relevant tweets are. Because of the large number of tweets sent at any given time, Twitter news feed algorithms like this one were developed to protect users from becoming overwhelmed and to keep them up to date on material that they genuinely care about.

Recent Tweets

The Latest Tweets section reorders your timeline in reverse chronological order, with the most recent tweets at the top. It displays tweets as they are sent in real time, so more information from more people will appear, but it will not display every tweet. The algorithm still has some say in deciding which tweets to show at any given time.

Ranking Signals for the Twitter Timeline Algorithm:

The following are ranking signals for the Twitter timeline algorithm:

• How recent it is

• Use of rich media (pictures, gifs, video)

• Engagement (likes, responses, retweets)

• Author prominence

• User-author relationship

• User behavior

For example, a user is more likely to see a tweet in their timeline if it comes from a person with whom they frequently interact and has a large number of likes and responses.

What exactly are Twitter Topics?

Read more

Facebook Algorithm 2022 – 2023

Facebook can tend to feel like an uphill battle for businesses. The social media platform’s algorithm isn’t very clear about how your posts end up on users’ screens. When even the sponsored posts you’re investing in aren’t working, you know there has to be something you’re missing.

Paid or unpaid, the way you post on Facebook and reach the platform’s ever-expanding audience matters. Every time a user logs on to the website or app, Facebook is learning about what that user likes seeing and what they skip past. 

The social media giant has tried a lot of different algorithms over the years, ranging from focusing on the video to simply asking users what they want to see more of. Today, things look a little different, and knowing how the Facebook algorithm works can be a game-changer for businesses.

So here’s what you need to know about Facebook’s Algorithm in 2021:

Facebook is concerned with three things when its algorithm learns about user activity and begins curating their feed to these behaviors. 

Following these three elements to a great post can mean huge things for your engagement and reach on Facebook. Ignoring them ends up in things like these terrible Facebook ads we wish we never saw. 

First up, the accounts with which the user interacts matter. If someone is always checking up on certain friends and family members, then that’s going to mean their posts will show up sooner on their feed. 

The same goes for organizations and businesses that users interact with the most. That means it’s your job to post content that encourages users to not only follow and like you but also provide users the type of content that drives engagement.

What sort of posts do best on Facebook?

Users all have their own preferences for what they like to see. At the end of the day, a mix of videos, links to blogs and web pages, and photos are good to keep things diverse and dynamic. 

That said, the sort of posts that do best on your business account will depend on the final element of the Facebook algorithm that matters most: user interactions.

From sharing a post to simply giving it a like or reaction, interactions matter most when it comes to the Facebook algorithm. The social media platform wants users active and logging in as often as possible. That’s why their machine learning algorithm sees interactions as a huge plus for your account. 

Comments matter too! In fact, comments serve a dual purpose for your business account on Facebook. Not only do comments drive interactions on your page, but they also give you direct feedback from the audience.

If you listen to comments and take your users’ feedback seriously, you can avoid posting content that falls flat. That doesn’t just hurt your reach and engagement; it’s also a blunder for your digital brand.

Can you beat the Facebook Algorithm once and for all?

We don’t like putting negative energy into the universe, but the Facebook algorithm is sort of like a villain you need to take down to achieve your goals as a business. Understanding the Facebook algorithm can feel like a battle sometimes.

How Does Amazon’s Search Algorithm Work to Find the Right Products?

The search algorithm of Amazon is sophisticated and has a key goal. It aims to connect online shoppers with the products they are looking for as soon as possible. If you reach the top of the Search Pages, your brand visibility will improve, and sales will go up.

Not an essay but here’s a summary:

  • Based on a Vickrey-Clarke-Groves (VCG) auction.

  • Total Value = Bid*(eCTR*eCVR)+Value (info)

  • This creates an oCPM environment (info)

The core of this, according to the auction and engineering teams, has been more or less the same for years.

2018 and 2020 are different issues. The former (mostly) affected those who don’t understand oCPM, as FB prioritizes user experience; the latter caused issues for those still relying on attribution instead of lift (info).

Audio recognition software like Shazam – how does the algorithm work?

Have a read through this mate http://coding-geek.com/how-shazam-works/

It identifies songs by creating an audio fingerprint using a spectrogram. When a song is playing, Shazam creates an audio fingerprint of it (provided the noise is not too high), then checks whether it matches any of the millions of audio fingerprints in its database; if it finds a match, it returns the song’s info. Here is a really good blog: https://www.toptal.com/algorithms/shazam-it-music-processing-fingerprinting-and-recognition

How does the PALS algorithm in 2022 actually work?

https://www.palsprograms.org/

What are some ways we can use machine learning and artificial intelligence for algorithmic trading in the stock market?


Machine Learning and Artificial Intelligence are changing Algorithmic Trading. Algorithmic trading is the use of computer programs to make trading decisions in the financial markets. These programs are based on a set of rules that take into account a variety of factors, including market conditions and the behavior of other traders. In recent years, machine learning and artificial intelligence have begun to play a role in algorithmic trading. Here’s a look at how these cutting-edge technologies are changing the landscape of stock market trading.


Machine Learning in Algorithmic Trading

Machine learning is a type of artificial intelligence that allows computer programs to learn from data and improve their performance over time. This technology is well-suited for algorithmic trading because it can help programs to better identify trading opportunities and make more accurate predictions about future market movements.

One way that machine learning is being used in algorithmic trading is through the development of so-called “predictive models.” These models are designed to analyze past data (such as prices, volumes, and order types) in order to identify patterns that could be used to predict future market movements. By using predictive models, algorithmic trading systems can become more accurate over time, which can lead to improved profits.

How Does Machine Learning Fit into Algorithmic Trading?

Machine learning algorithms can be used to automatically generate trading signals. These signals can then be fed into an execution engine that will automatically place trades on your behalf. The beauty of using machine learning for algorithmic trading is that it can help you find patterns in data that would be impossible for humans to find. For example, you might use machine learning to detect small changes in the price of a stock that are not apparent to the naked eye but could indicate a potential buying or selling opportunity.
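As a deliberately simple stand-in for a learned signal model, here is a toy moving-average crossover that emits buy/sell/hold signals; the price series and window sizes are made up, and a real system would train a model on many features rather than use this fixed rule:

```python
def moving_average(xs, window):
    """Trailing moving average; None until enough history exists."""
    return [sum(xs[i - window + 1:i + 1]) / window if i >= window - 1 else None
            for i in range(len(xs))]

def crossover_signals(prices, fast=3, slow=5):
    """Emit +1 (buy) / -1 (sell) / 0 (hold) from a moving-average crossover."""
    f, s = moving_average(prices, fast), moving_average(prices, slow)
    signals = []
    for fa, sl in zip(f, s):
        if fa is None or sl is None:
            signals.append(0)   # not enough history yet
        else:
            signals.append(1 if fa > sl else -1 if fa < sl else 0)
    return signals

prices = [10, 11, 12, 13, 12, 11, 10, 11, 12, 13]
print(crossover_signals(prices))  # [0, 0, 0, 0, 1, 1, -1, -1, -1, 1]
```

In the pipeline the paragraph describes, a list like this would be handed to an execution engine that turns +1/-1 into orders.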

Artificial Intelligence in Algorithmic Trading

Artificial intelligence (AI) is another cutting-edge technology that is beginning to have an impact on algorithmic trading. AI systems are able to learn and evolve over time, just like humans do. This makes them well-suited for tasks such as identifying patterns in data and making predictions about future market movements. AI systems can also be used to develop “virtual assistants” for traders. These assistants can help with tasks such as monitoring the markets, executing trades, and managing risk.

According to Martha Stokes, Algorithmic Trading will continue to expand on the Professional Side of the market, in particular for these Market Participant Groups:

Buy Side Institutions, aka Dark Pools. Although the Buy Side will also continue to use trading-floor and proprietary desk traders, even outsourcing some of its trading needs, algorithms are an integral part of its advanced order types, which can have as many as 10 legs (different types of trading instruments across multiple financial markets, all tied to one primary order). The algorithms help manage these extremely complex orders.

Sell Side Institutions, aka Banks and Financial Services. Banks do the actual trading for corporate buybacks, which appear to be continuing even into 2020. Trillions of corporate dollars have been spent on buybacks in the past few years (often financed by heavy corporate borrowing), and the appetite for buybacks doesn't appear to be abating yet. Algorithms help trigger the price action that moves a stock upward; buybacks are used to create speculation and rising stock values.



High Frequency Trading Firms (HFTs) are heavily into algorithms and will continue to be on the cutting edge of this technology, creating advancements that other market participants will adopt later.

Hedge Funds also use algorithms, especially for contrarian trading and investments.

Corporations do not actually do their own buybacks; they defer this task to their bank of record.


Professional Trading Firms that offer trading services to the Dark Pools are increasing their usage of algorithms.

Smaller Funds Groups use algorithms less and tend to invest similarly to the retail side.

The advancements in Artificial Intelligence (AI), Machine Learning, and Dark Data Mining are all contributing to the increased use of algorithmic trading.

Computer programs that automatically make trading decisions use mathematical models and statistical analysis to make predictions about the future direction of prices. Machine learning and artificial intelligence can be used to improve the accuracy of these predictions.

1. Using machine learning for stock market prediction: Machine learning algorithms can be used to predict the future direction of prices. These predictions can be used to make buy or sell decisions in an automated fashion.

2. Improving the accuracy of predictions: The accuracy of predictions made by algorithmic trading programs can be improved by using more data points and more sophisticated machine learning algorithms.

3. Automating decision-making: Once predictions have been made, algorithmic trading programs can automatically make buy or sell decisions based on those predictions. This eliminates the need for human intervention and allows trades to be made quickly and efficiently.

4. Reducing costs: Automated algorithmic trading can help reduce transaction costs by making trades quickly and efficiently. This is because there are no delays caused by human decision-making processes.
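The four steps above can be sketched end to end: make predictions, trade on them automatically, and subtract a per-trade cost. Everything here (the synthetic returns, the 5-basis-point cost, the model choice) is a fabricated assumption for illustration, not trading advice.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
returns = rng.normal(0.0002, 0.01, 1200)  # synthetic daily returns

# Steps 1-2: predict next-day direction from the previous 5 returns.
window = 5
X = np.lib.stride_tricks.sliding_window_view(returns[:-1], window)
y = (returns[window:] > 0).astype(int)

split = int(0.7 * len(X))
model = LogisticRegression().fit(X[:split], y[:split])

# Step 3: automated decisions -- go long (+1) when the model predicts "up".
position = np.where(model.predict(X[split:]) == 1, 1, -1)
gross = position * returns[window:][split:]

# Step 4: costs -- charge 5 bps whenever the position changes.
cost = 0.0005
trades = np.abs(np.diff(position, prepend=position[0]))
net = gross - cost * (trades > 0)
print(f"gross {gross.sum():+.4f}, net {net.sum():+.4f} over {len(net)} days")
```

Note that each position is decided only from returns that precede the return it earns, so the sketch avoids look-ahead bias.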

Leveraging Artificial Intelligence To Build Algorithmic Trading Strategies

To conclude:

Machine learning and artificial intelligence are two cutting-edge technologies that are beginning to have an impact on algorithmic trading. By using these technologies, traders can develop more accurate predictive models and virtual assistants to help with tasks such as monitoring the markets and executing trades. In the future, we can expect machine learning and AI to play an even greater role in stock market trading. If you are interested in using machine learning and AI for algorithmic trading, we recommend that you consult with a professional who has experience in this area.

CAVEAT by Ross:


Can it predict?

Yes, to a certain extent. And let's be honest: all you care about is that it predicts in such a way that you can extract profit from your AI/ML model.

Ultimately, people drive the stock market, including through the models they build, no matter how fancy those AI/ML models are.

And people in general are stupid and make stupid mistakes. This will always account for "weird behavior" in the pricing of stocks and other financial derivatives. Therefore the search for an explanation of "what drives the stock market" is futile beyond the extent of simple macroeconomic indicators. When the economy does well, profits go up, people buy stocks, and this gets priced into the asset. When the economy tanks, firms do badly, people sell their stocks, and as a result the price reflects a lower value.

The drive to predict markets should be based on profits, not, as academia suggests, on "logic". Look back at all the people who drove businesses into the ground over the last 20-30 years; they account for noise in your information. This area deserves far more attention. The field of behavioral finance is very interesting, and unfortunately there isn't much literature in this field (except the work by Kahneman).


What are some ways to increase precision or recall in machine learning?


Sensitivity vs Specificity?


In machine learning, recall is the ability of the model to find all relevant instances in the data while precision is the ability of the model to correctly identify only the relevant instances. A high recall means that most relevant results are returned while a high precision means that most of the returned results are relevant. Ideally, you want a model with both high recall and high precision but often there is a trade-off between the two. In this blog post, we will explore some ways to increase recall or precision in machine learning.
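These definitions can be checked directly with scikit-learn; the labels below are fabricated purely to illustrate the arithmetic (1 = relevant, e.g. spam; 0 = not relevant).

```python
from sklearn.metrics import precision_score, recall_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]  # 3 TP, 1 FN, 1 FP, 5 TN

print(precision_score(y_true, y_pred))  # 3 / (3 + 1) = 0.75
print(recall_score(y_true, y_pred))     # 3 / (3 + 1) = 0.75
```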



To increase recall, reduce false negatives:

Recall is TP / (TP + FN), so it rises when the model misses fewer relevant instances. The usual lever is to lower your threshold for what constitutes a positive prediction. For example, if you are trying to predict whether or not an email is spam, you might lower the threshold for what constitutes spam so that more emails are classified as spam. More of the actual spam gets caught (fewer false negatives, higher recall), but more legitimate email gets flagged too (more false positives), which lowers precision.


To increase precision, reduce false positives:

Precision is TP / (TP + FP), so it rises when fewer of the returned results are irrelevant. The usual lever is to raise your threshold for what constitutes a positive prediction. Going back to the spam example, you might raise the threshold for what constitutes spam so that fewer emails are classified as spam. The emails that do get flagged are more likely to really be spam (fewer false positives, higher precision), but more actual spam slips through (more false negatives), which lowers recall.

Notice that the two adjustments pull in opposite directions: the threshold change that boosts recall hurts precision, and vice versa. Improving both at once requires a genuinely better model, for example through more informative features, more training data, or a different algorithm.
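The threshold trade-off can be seen by sweeping a toy example (scores and labels fabricated for illustration): a low threshold maximizes recall at the cost of precision, and a high threshold does the reverse.

```python
from sklearn.metrics import precision_score, recall_score

y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.95, 0.85, 0.70, 0.55, 0.40, 0.60, 0.45, 0.30, 0.20, 0.10]

for threshold in (0.3, 0.5, 0.7):
    y_pred = [int(s >= threshold) for s in scores]
    p = precision_score(y_true, y_pred)
    r = recall_score(y_true, y_pred)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
```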



To summarize,

there are a few ways to increase precision or recall in machine learning. One is to adjust the classification threshold, moving the decision boundary toward whichever error type you can better tolerate. Another is to optimize for an evaluation metric that balances the two, such as the F1 score, the harmonic mean of precision and recall, rather than tuning for one of them in isolation. Finally, you can change the model itself: a different algorithm, better features, or class weighting can raise both metrics together.
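One sketch of metric-driven threshold selection, assuming scikit-learn's `precision_recall_curve` (the scores and labels are made up): sweep every candidate threshold and keep the one that maximizes F1.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

y_true = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0])
scores = np.array([0.95, 0.85, 0.70, 0.55, 0.40, 0.60, 0.45, 0.30, 0.20, 0.10])

precision, recall, thresholds = precision_recall_curve(y_true, scores)
f1 = 2 * precision * recall / (precision + recall + 1e-12)
best = np.argmax(f1[:-1])  # the last curve point has no associated threshold
print(f"best threshold = {thresholds[best]:.2f}, F1 = {f1[best]:.2f}")
```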


Sensitivity vs Specificity

In machine learning, sensitivity and specificity are two complementary measures of a model's performance. Sensitivity (another name for recall) is the proportion of actual positives that the model correctly identifies, TP / (TP + FN), while specificity is the proportion of actual negatives that the model correctly identifies, TN / (TN + FP).
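Both measures fall out of a confusion matrix; the labels below are illustrative.

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

# ravel() yields counts in the order tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)  # recall on the positive class
specificity = tn / (tn + fp)  # recall on the negative class
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```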

Google Colab For Machine Learning

State of the Google Colab for ML (October 2022)

Google introduced compute units, which you can purchase just like any other cloud compute from AWS, Azure, etc. With Pro you get 100, and with Pro+ you get 500 compute units. Your choice of GPU or TPU and the high-RAM option affect how many compute units you use hourly. If you don't have any compute units, you can't use "premium" tier GPUs (A100, V100), and even the P100 is non-viable.


Google Colab Pro+ comes with the premium-tier GPU option, while on Pro, if you have compute units, you may randomly connect to a P100 or T4. After you use up your compute units, you can buy more, or fall back to a T4 GPU for much of the time (though there are stretches of the day when you can't get a T4, or any GPU at all). In the free tier, the offered GPUs are usually a K80 or P4, which perform similarly to a 750 Ti (an entry-level GPU from 2014) but with more VRAM.

For your consideration, a T4 consumes around 2 compute units per hour and an A100 around 15. Based on current knowledge, the compute-unit cost of a given GPU tends to fluctuate for reasons that are not publicly documented.
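Taking the quoted rates at face value (they fluctuate, as noted), a rough budget works out as follows:

```python
# Approximate compute-unit burn rates from the text: T4 ~2/h, A100 ~15/h.
rates = {"T4": 2, "A100": 15}
for plan, units in (("Pro", 100), ("Pro+", 500)):
    for gpu, rate in rates.items():
        print(f"{plan}: ~{units / rate:.0f} h on a {gpu}")
```

So Pro's 100 units buy roughly 50 hours on a T4 but under 7 hours on an A100.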

Considering those:

  1. For hobbyists and (under)graduate school duties, it is better to use your own GPU if you have something with more than 4 GB of VRAM that is faster than a 750 Ti, or at least to purchase Colab Pro so you can reach a T4 even with no compute units remaining.
  2. For small research companies, non-trivial research at universities, and probably most people, Colab is now probably not a good option.
  3. Colab Pro+ can be considered if you want Pro but don't sit in front of your computer, since Pro disconnects after 90 minutes of inactivity. This can be worked around to some extent with scripts, so most of the time Pro+ is not a good option either.

If you have anything more to add, please let me know so I can edit this post. Thanks!

Conclusion:


In machine learning, precision and recall trade off against each other; increasing one often decreases the other. There is no silver-bullet solution for boosting either metric: which one matters more, and which methods work best, depends on your specific use case. In this blog post, we explored some methods for increasing either precision or recall; hopefully this gives you a starting point for improving your own models!

 


Top 100 Data Science and Data Analytics and Data Engineering Interview Questions and Answers

What are some good datasets for Data Science and Machine Learning?

How can I get someone’s IP from WhatsApp?


If you’re wondering how to get someone’s IP address from WhatsApp, you’re not alone: it’s a question programmers, developers, and software engineers ask often. WhatsApp is one of the most popular messaging apps in the world, but getting someone’s IP address from it is not as simple as it sounds. The app uses a number of security measures to protect users’ privacy, so you cannot simply extract an IP address from a chat session.

That said, there are ways to work around this. Developers with reverse-engineering experience can sometimes recover IP addresses by analyzing WhatsApp’s network traffic, but this is definitely not something the average person can do. If you’re looking for a quick and easy way to get someone’s IP from WhatsApp, you’re out of luck.

Here is one approach that might work (the author has not tried it himself). WhatsApp claims its calls are peer-to-peer, so during a call the other party’s IP address may show up as a direct connection:

  • The other person must be online.
  • Your phone must be rooted.
  • Install BusyBox.
  • Install a terminal emulator such as Termux from the Google Play Store.
  • Disable internet access for every other app that uses the network.
  • Open WhatsApp and start a voice or video call with the person.
  • During the call, open Termux and run the command: netstat -n
  • If the connection is truly peer-to-peer, as WhatsApp claims, the person’s IP should be among the foreign addresses shown in the terminal.

*Note: this won’t work if the connection is not actually peer-to-peer; in that case you will see a relay server’s IP instead of the person’s. Again, the author has not tried this.*
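If you do capture `netstat -n` output, the foreign addresses can be pulled out programmatically. Below is a minimal sketch; the sample output is fabricated for illustration, and the exact column layout of `netstat` varies by platform:

```python
import re

def foreign_addresses(netstat_output: str) -> list[str]:
    """Extract foreign IPv4 addresses from `netstat -n` output."""
    addrs = []
    for line in netstat_output.splitlines():
        parts = line.split()
        # Typical rows: proto, recv-q, send-q, local address, foreign address, state
        if len(parts) >= 5 and parts[0] in ("tcp", "udp", "tcp6", "udp6"):
            m = re.match(r"(\d+\.\d+\.\d+\.\d+)[:.](\d+)$", parts[4])
            if m:
                addrs.append(m.group(1))
    return addrs

# Fabricated sample output, for demonstration only.
sample = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 192.168.1.5:44210       157.240.1.53:443        ESTABLISHED
udp        0      0 192.168.1.5:3478        203.0.113.7:3478        ESTABLISHED
"""
print(foreign_addresses(sample))  # ['157.240.1.53', '203.0.113.7']
```

A truly peer-to-peer call would show the other party’s address here; a relayed call shows only WhatsApp’s servers.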


Conclusion:

One of the most frequently asked questions from WhatsApp users is “How can I get someone’s IP from WhatsApp?” The answer to this question is not as simple as it may seem. Programmers and developers have created a number of ways to obtain someone’s IP address, but none of these methods are foolproof. The most reliable way to get someone’s IP address is to ask them for it. However, this isn’t always possible, and it certainly isn’t the most elegant solution. If you’re looking for a more sophisticated method, you can try asking a software engineer or smartphone developer. They may be able to help you figure out how to obtain someone’s IP address from WhatsApp. However, keep in mind that this isn’t an exact science, and there’s no guarantee that you’ll be able to get the information you’re looking for.



AI Unraveled: Demystifying Frequently Asked Questions on Artificial Intelligence (OpenAI, ChatGPT, Google Gemini, Generative AI, Discriminative AI, xAI, LLMs, GPUs, Machine Learning, NLP, Prompt Engineering)


What are the top 3 methods used to find Autoregressive Parameters in Data Science?


In order to find autoregressive parameters, you first need to understand what autoregression is. Autoregression is a statistical method used to create a model that describes data as a linear regression on lagged values of the dependent variable. In other words, it is a model that uses past values of a variable to predict future values of that same variable.

In time series analysis, autoregression is the use of previous values in a time series to predict future values. In other words, it is a form of regression in which the dependent variable is forecast using a linear combination of its own past values. The parameter values of the autoregression model are typically estimated using the method of least squares.

The autoregressive parameters are the coefficients in the autoregressive model. These coefficients can be estimated in a number of ways, including ordinary least squares (OLS), maximum likelihood (ML), or least squares with L1 regularization (LASSO). Once estimated, the autoregressive parameters can be used to predict future values of the dependent variable.

To find the autoregressive parameters, you need to use a method known as least squares regression. This method finds the parameters that minimize the sum of the squared residuals. The residual is simply the difference between the predicted value and the actual value. So, in essence, you are finding the parameters that best fit the data.
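As a sketch of this least-squares fit, the snippet below estimates an AR(1) coefficient with NumPy. The series is synthetic, generated from a known coefficient of 0.6, so we can check that the estimate lands near the true value:

```python
import numpy as np

# Generate a synthetic AR(1) series with known coefficient phi_true = 0.6.
rng = np.random.default_rng(0)
phi_true = 0.6
n = 500
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi_true * y[t - 1] + rng.normal()

# Regress y_t on y_{t-1}: the design matrix has a constant and one lag.
X = np.column_stack([np.ones(n - 1), y[:-1]])
coef, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
c_hat, phi_hat = coef
# phi_hat should be close to the true value 0.6, and c_hat close to 0.
```

The same construction extends to higher-order models by adding more lag columns to the design matrix.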


How to Estimate Autoregressive Parameters?


There are three main ways to estimate autoregressive parameters: ordinary least squares (OLS), maximum likelihood (ML), and least squares with L1 regularization (LASSO).

Ordinary Least Squares: Ordinary least squares is the simplest and most common method for estimating autoregressive parameters. This method estimates the parameters by minimizing the sum of squared errors between actual and predicted values.

Maximum Likelihood: Maximum likelihood is another common method for estimating autoregressive parameters. This method estimates the parameters by maximizing the likelihood function. The likelihood function is a mathematical function that quantifies the probability of observing a given set of data given certain parameter values.

Least Squares with L1 Regularization: Least squares with L1 regularization (LASSO) is another method for estimating autoregressive parameters. This method estimates the parameters by minimizing the sum of squared errors between actual and predicted values while also penalizing large coefficients. L1 regularization adds an extra term to the error function that is proportional to the sum of the absolute values of the coefficients, which shrinks some of them toward (or exactly to) zero.
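As a sketch of how the L1 penalty acts on a coefficient, consider the textbook one-predictor special case: with a standardized predictor, the lasso estimate is just the soft-thresholded OLS estimate. The data below is synthetic, with a true coefficient of 0.5:

```python
import numpy as np

def soft_threshold(z, lam):
    # Proximal operator of the L1 penalty: shrinks z toward zero by lam.
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

# One standardized predictor: lasso reduces to soft-thresholding the OLS fit.
rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=n)
x = (x - x.mean()) / x.std()            # standardize so that x @ x == n
y = 0.5 * x + rng.normal(scale=0.1, size=n)

beta_ols = x @ y / n                     # close to the true coefficient 0.5
beta_lasso = soft_threshold(beta_ols, lam=0.1)  # shrunk toward zero by 0.1
```

In the multi-parameter case the same shrinkage operator is applied coordinate-by-coordinate inside an iterative solver (coordinate descent), which is what packages like scikit-learn do internally.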

Finding Autoregressive Parameters: The Math Behind It
To find the parameters using least squares regression, you first need to set up your data in a certain way: the dependent variable in one column and the independent variables in other columns. For example, let’s say you want to use three years of data to predict next year’s sales (the dependent variable). (Strictly speaking, this toy example regresses sales on the year rather than on lagged sales, but the least-squares mechanics are identical.) Your data would look something like this:

| Year | Sales |
|------|-------|
| 2016 | 100   |
| 2017 | 150   |
| 2018 | 200   |

Next, you need to calculate the means for each column. For our sales example, that would look like this:



$$ \bar{X} = \frac{2016+2017+2018}{3} = 2017 \qquad \bar{Y} = \frac{100+150+200}{3} = 150$$

Now we can calculate each element in what’s called the variance-covariance matrix:

$$ \operatorname {Var} (X)=\sum _{i=1}^{n}\left({x_{i}}-{\bar {x}}\right)^{2} $$


and

$$ \operatorname {Cov} (X,Y)=\sum _{i=1}^{n}\left({x_{i}}-{\bar {x}}\right)\left({y_{i}}-{\bar {y}}\right) $$

For our sales example, that calculation would look like this:

$$ \operatorname {Var} (X)=\sum _{i=1}^{3}\left({x_{i}}-{\bar {x}}\right)^{2}=(2016-2017)^{2}+(2017-2017)^{2}+(2018-2017)^{2}=2 $$

and

$$ \operatorname {Cov} (X,Y)=\sum _{i=1}^{3}\left({x_{i}}-{\bar {x}}\right)\left({y_{i}}-{\bar {y}}\right)=(2016-2017)(100-150)+(2017-2017)(150-150)+(2018-2017)(200-150)=100 $$

Now we can finally calculate our parameter: the least-squares slope is the covariance divided by the variance of $X$:

$$ \hat {\beta }=\frac {\operatorname {Cov} (X,Y)}{\operatorname {Var} (X)}=\frac {100}{2}=50 $$

That’s it! Our estimated parameter is 50: sales grow by about 50 units per year, so the fitted line $\hat {Y}=\bar {Y}+\hat {\beta }\,(x-\bar {x})=150+50\,(x-2017)$ predicts sales of 250 for 2019. In a true autoregressive model you would apply exactly the same least-squares machinery with lagged values of the series as the regressors, e.g. $Y_{t}=\beta _{1}Y_{t-1}+\varepsilon _{t}$ for an AR(1). Of course, in reality you would be working with much larger datasets, but the underlying principles are the same. Once you have your autoregressive parameters, you can plug them into the equation and start making predictions!
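Computing the variance, covariance, slope, and prediction for the toy sales table directly (a quick NumPy sketch):

```python
import numpy as np

# The toy sales table from the text.
years = np.array([2016, 2017, 2018], dtype=float)
sales = np.array([100, 150, 200], dtype=float)

var_x = np.sum((years - years.mean()) ** 2)                       # 2.0
cov_xy = np.sum((years - years.mean()) * (sales - sales.mean()))  # 100.0
beta = cov_xy / var_x                                             # 50.0

# Fitted line: Y_hat = mean(Y) + beta * (x - mean(X)); prediction for 2019.
pred_2019 = sales.mean() + beta * (2019 - years.mean())           # 250.0
```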

Which Method Should You Use?
The estimation method you should use depends on your particular situation and goals. If you are looking for simple and interpretable results, then Ordinary Least Squares may be the best method for you. If you are looking for more accurate predictions, then Maximum Likelihood or Least Squares with L1 Regularization may be better methods for you.


Autoregressive models STEP BY STEP:

1) Download data: The first step is to download some data. This can be done by finding a publicly available dataset or by using your own data if you have any. For this example, we will be using data from the United Nations Comtrade Database.

2) Choose your variables: Once you have your dataset, you will need to choose the variables you want to use in your autoregression model. In our case, we will be using the import and export values of goods between countries as our independent variables.

3) Estimate your model: After choosing your independent variables, you can estimate your autoregression model using the method of least squares. OLS estimation can be done in many statistical software packages such as R or STATA.

4) Interpret your results: Once you have estimated your model, it is important to interpret the results in order to understand what they mean. The coefficients represent the effect that each independent variable has on the dependent variable. In our case, the coefficients represent the effect that imports and exports have on trade balance. A positive coefficient indicates that an increase in the independent variable leads to an increase in the dependent variable while a negative coefficient indicates that an increase in the independent variable leads to a decrease in the dependent variable.

5) Make predictions: Finally, once you have interpreted your results, you can use your autoregression model to make predictions about future values of the dependent variable based on its own past values.
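The estimation and prediction steps above can be sketched end to end. The data here is synthetic, standing in for a real dataset such as the UN Comtrade series mentioned in step 1, and the coefficients are illustrative:

```python
import numpy as np

# Synthetic series generated from a known AR(2) process.
rng = np.random.default_rng(42)
n = 400
y = np.zeros(n)
for t in range(2, n):
    y[t] = 0.5 * y[t - 1] + 0.3 * y[t - 2] + rng.normal()

# Step 3: estimate -- regress y_t on (1, y_{t-1}, y_{t-2}) by least squares.
X = np.column_stack([np.ones(n - 2), y[1:-1], y[:-2]])
coef, *_ = np.linalg.lstsq(X, y[2:], rcond=None)
c, b1, b2 = coef

# Step 4: interpret -- b1 and b2 are the estimated autoregressive parameters
# (they should land near the true values 0.5 and 0.3).

# Step 5: predict the next value from the two most recent observations.
y_next = c + b1 * y[-1] + b2 * y[-2]
```

In practice you would use a dedicated routine (e.g. an AR model class in a statistics package) rather than building the design matrix by hand, but the mechanics are the same.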

Conclusion: In this blog post, we have discussed what autoregression is and how to find autoregressive parameters. 

Estimating an autoregression model is a relatively simple process that can be done in many statistical software packages such as R or STATA.

In statistics and machine learning, autoregression is a modeling technique used to describe the linear relationship between a variable and one or more of its own past values. To find the autoregressive parameters, you can use least squares regression, which minimizes the sum of squared residuals. This blog post also explained how to set up your data for least squares regression and how to calculate the variance and covariance before finally computing your autoregressive parameters. After finding your parameters, you can plug them into an autoregressive equation to start making predictions about future values!

We have also discussed three different methods for estimating those parameters: Ordinary Least Squares, Maximum Likelihood, and Least Squares with L1 Regularization. The appropriate estimation method depends on your particular goals and situation.



Autoregressive Model

Autoregressive generative models can estimate complex continuous data distributions such as trajectory rollouts in an RL environment, image intensities, and audio. Traditional techniques discretize continuous data into various bins and approximate the continuous data distribution using categorical distributions over the bins. This approximation is parameter inefficient, as it cannot express abrupt changes in density without using a significant number of additional bins. Adaptive Categorical Discretization (ADACAT) is proposed in this paper as a parameterization of 1-D conditionals that is expressive, parameter efficient, and multimodal. The ADACAT distribution is parameterized by a vector of interval widths and a vector of masses. Figure 1 showcases the difference between the traditional uniform categorical discretization approach and the proposed ADACAT.

Each component of the ADACAT distribution has non-overlapping support, making it a specific subfamily of mixtures of uniform distributions. ADACAT generalizes uniformly discretized 1-D categorical distributions. The proposed architecture allows for variable bin widths and approximates the modes of a mixture of two Gaussians more closely than a uniformly discretized categorical distribution, making it considerably more expressive. Additionally, the distribution’s support is discretized using quantile-based discretization, which bins the data into groups containing roughly equal numbers of data points. In problems with more than one dimension, ADACAT uses deep autoregressive frameworks to factorize the joint density into numerous 1-D conditional ADACAT distributions.
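As an illustrative sketch (not the paper’s implementation), a 1-D ADACAT-style density on [0, 1) can be parameterized by width and mass logits, with softmaxes ensuring both vectors are positive and sum to one:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def adacat_density(x, width_logits, mass_logits):
    """Density at x of a mixture of non-overlapping uniforms on [0, 1)."""
    widths = softmax(width_logits)   # bin widths, positive, sum to 1
    masses = softmax(mass_logits)    # bin probabilities, positive, sum to 1
    edges = np.concatenate([[0.0], np.cumsum(widths)])
    k = np.searchsorted(edges, x, side="right") - 1
    k = np.clip(k, 0, len(widths) - 1)
    return masses[k] / widths[k]     # uniform density within the chosen bin

# Illustrative logits: a wide middle bin, with most mass on the last, narrow bin,
# producing a sharp density spike there -- something a uniform grid would need
# many extra bins to express.
w = np.array([0.0, 1.0, 0.0])
m = np.array([0.0, 0.0, 2.0])
d = adacat_density(0.95, w, m)   # high density in the narrow, heavy bin
```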


Pytorch – Computer Application

https://torchmetrics.readthedocs.io/en/stable//index.html

Best practices for training PyTorch model

What are some ways we can use machine learning and artificial intelligence for algorithmic trading in the stock market?


  • [P] Flan-T5 for Synthetic data generation?
    by /u/Theredeemer08 (Machine Learning) on May 3, 2024 at 7:21 pm

    Hi all, I'm trying to build a personal project on synthetic dataset generation. Been researching + laying out an initial structure for the project. The main question I have is can FLAN-T5 be used for data generation / mass text generation? I can't seem to find examples of people using it for that use-case. I've looked at mixtral-instruct models aswell. I am trying to avoid GPT4 due to cost. Please let me know of any other LMs that could be good for my purposes submitted by /u/Theredeemer08 [link] [comments]

  • [R][D] Quantization of Time-Series for improving performance of RNNs (possible use cases for LLMs)
    by /u/HungryhungryUgolino (Machine Learning) on May 3, 2024 at 5:10 pm

    Hello all, Wanted to ask if any of y'all had experience with using quantized/binned version of feature sets and/or goal sets to improve performance for sequence learners for time-series problems. I'm not very strong on NLP so sorry for any of the mistakes that may follow Set-up: f(X) -> ŷ with the goal of |ŷ-y| < eps X is a feature set with features that are hopefully informative on y, with varying frequencies of information, such as simple moving average with varying windows for each feature dimension, as a toy example. X and y are noisy Motivation I have seen some recent work modifying univariate time-series forecasting problems so they are digestible for LLMs, in particular : Chronos: Learning the Language of Time Series The general method is Scale a time series in some way, such as dividing each sequence by mean absolute value bin these values to make the possible values now discrete add start / end token to be digestible by LLMs and then use to forecast Hurrah now we have a time-series that can be passed into an LLM Quantization for RNNs rather than LLM Taking a step back, rather than using the above transformation for use with LLMs, I'm wondering if anyone here have used these techniques to make a time-series more amenable for an RNN. The two important parts of the transformation are (1) the scaling technique and (2) the number of bins N. As N -> infinity we get the same precision as the original time-series. Quantization as a function Q(.) can be applied to either X,y or both. Benefits I had in mind: Using integers as references to bins for faster/easier trading reduce noise in signal possibility of using feature embedding? Hopefully this was clear. Any help is appreciated. submitted by /u/HungryhungryUgolino [link] [comments]

  • [N] AI engineers report burnout and rushed rollouts as ‘rat race’ to stay competitive hits tech industry
    by /u/bregav (Machine Learning) on May 3, 2024 at 4:58 pm

    AI engineers report burnout and rushed rollouts as ‘rat race’ to stay competitive hits tech industry Summary from article: Artificial intelligence engineers at top tech companies told CNBC that the pressure to roll out AI tools at breakneck speed has come to define their jobs. They say that much of their work is assigned to appease investors rather than to solve problems for end users, and that they are often chasing OpenAI. Burnout is an increasingly common theme as AI workers say their employers are pursuing projects without regard for the technology’s effect on climate change, surveillance and other potential real-world harms. An especially poignant quote from the article: An AI engineer who works at a retail surveillance startup told CNBC that he’s the only AI engineer at a company of 40 people and that he handles any responsibility related to AI, which is an overwhelming task. He said the company’s investors have inaccurate views on the capabilities of AI, often asking him to build certain things that are “impossible for me to deliver.” submitted by /u/bregav [link] [comments]

  • [D] software to design figures
    by /u/_Hardric (Machine Learning) on May 3, 2024 at 3:13 pm

    I want to create graphs/figures for rl algorithms. I really like the style used in Deep Mind papers (AlphaZero, AlphaTensor, MuZero, ...). Does anyone know the software used for those images ? Or perhaps something else that achieves similar results ? https://preview.redd.it/4uohkcbxg8yc1.png?width=791&format=png&auto=webp&s=9136bd12eb797523a5ff73f2b0b02e811239d9c3 https://preview.redd.it/1vzin9izg8yc1.png?width=578&format=png&auto=webp&s=8046e1196347365b48ad2d3920ee0ba18119600c submitted by /u/_Hardric [link] [comments]

  • Ethical Dilemmas in Machine Learning Deployment [Discussion]
    by /u/Old_Coder45 (Machine Learning) on May 3, 2024 at 2:31 pm

    A couple of questions I have been discussing with colleagues recently, I wanted to get a broader idea of peoples thoughts so decided to pose the questions to this subreddit, thanks. How do we balance the ethical imperatives of transparency, fairness, and accountability with the practical necessities of deploying machine learning models in real-world decision-making contexts? What strategies, frameworks, or best practices can organizations adopt to navigate these challenges effectively while ensuring both ethical integrity and operational efficiency? submitted by /u/Old_Coder45 [link] [comments]

  • [D] How to train a text detection model that will detect it's orientation (rotation) ranging from +180 to -180 degrees.
    by /u/tmargary (Machine Learning) on May 3, 2024 at 2:29 pm

    Most models it seems like are able to detect rotated objects, but they use so called le90 convention, where objects are rotated from +90 to -90 degrees. In my case I would like to detect the text on the image in its correct orientation which means 0 and 180 degrees in my case are not the same (which is the case in MMOCR, MMDET, and MMRotate models). Can you guide me on this problem? How can I approach this issue? Do you have links to some open-source projects that tackle this issue? I know that usually the text orientation issue can be solved by training another small model, or by training the recognition stage with all possible rotations, but I would like to tackle this issue early in the detection stage. Any ideas would be highly appreciated. Thanks in advance. submitted by /u/tmargary [link] [comments]

  • How would you model this problem?
    by /u/LogisticDepression (Data Science) on May 3, 2024 at 2:27 pm

    Suppose I’m trying to predict churn based on previous purchases information. What I do today is come up with features like average spend, count of transactions and so on. I want to instead treat the problem as a sequence one, modeling the sequence of transactions using NN. The problem is that some users have 5 purchases, while others 15. How to handle this input size change from user to user, and more importantly which architecture to use? Thanks!! submitted by /u/LogisticDepression [link] [comments]

  • What makes a good or bad product manager?
    by /u/fioney (Data Science) on May 3, 2024 at 11:13 am

    Realised I’ve only ever worked with two product managers and would love your thoughts as to what makes a product manager good to work with or not so good to work with submitted by /u/fioney [link] [comments]

  • REVOLUTIONIZING DRAG-AND-DROP PYTHON GUI BUILDING
    by /u/amimiath (Data Science) on May 3, 2024 at 11:09 am

    submitted by /u/amimiath [link] [comments]

  • Apple silicone users: how do you make LLM’s run faster?
    by /u/Exact-Committee-8613 (Data Science) on May 3, 2024 at 10:19 am

    Just as the title says. I’m trying to build a rag using ollama but it’s taking so so long. I’m using apple m1 8gb ram (yes, I know, I brought a butter knife to a gun fight) but I’m broke and cannot afford a new one. Any suggestions? Thanks submitted by /u/Exact-Committee-8613 [link] [comments]

  • [R] HGRN2: Gated Linear RNNs with State Expansion
    by Machine Learning on May 3, 2024 at 9:47 am

    Paper: https://arxiv.org/abs/2404.07904 Code: https://github.com/OpenNLPLab/HGRN2 Standalone code (1): https://github.com/Doraemonzzz/hgru2-pytorch Standalone code (2): https://github.com/sustcsonglin/flash-linear-attention/tree/main/fla/models/hgrn2 Abstract: Hierarchically gated linear RNN (HGRN, Qin et al. 2023) has demonstrated competitive training speed and performance in language modeling, while offering efficient inference. However, the recurrent state size of HGRN remains relatively small, which limits its expressiveness. To address this issue, inspired by linear attention, we introduce a simple outer-product-based state expansion mechanism so that the recurrent state size can be significantly enlarged without introducing any additional parameters. The linear attention form also allows for hardware-efficient training. Our extensive experiments verify the advantage of HGRN2 over HGRN1 in language modeling, image classification, and Long Range Arena. Our largest 3B HGRN2 model slightly outperforms Mamba and LLaMa Architecture Transformer for language modeling in a controlled experiment setting; and performs competitively with many open-source 3B models in downstream evaluation while using much fewer total training tokens. [link] [comments]

  • [R] A Primer on the Inner Workings of Transformer-based Language Models
    by /u/SubstantialDig6663 (Machine Learning) on May 3, 2024 at 9:46 am

    Authors: Javier Ferrando (UPC), Gabriele Sarti (RUG), Arianna Bisazza (RUG), Marta Costa-jussà (Meta) Paper: https://arxiv.org/abs/2405.00208 Abstract: The rapid progress of research aimed at interpreting the inner workings of advanced language models has highlighted a need for contextualizing the insights gained from years of work in this area. This primer provides a concise technical introduction to the current techniques used to interpret the inner workings of Transformer-based language models, focusing on the generative decoder-only architecture. We conclude by presenting a comprehensive overview of the known internal mechanisms implemented by these models, uncovering connections across popular approaches and active research directions in this area. https://preview.redd.it/57y44wwdn6yc1.png?width=1486&format=png&auto=webp&s=7b7fb38a59f3819ce0d601140b1e031b98c17183 submitted by /u/SubstantialDig6663 [link] [comments]

  • [D] Help: 1. Current PhD position is alright? 2. (3d) computer vision; point cloud processing, is my Research Roadmap correct?
    by /u/Same_Half3758 (Machine Learning) on May 3, 2024 at 9:20 am

    Currently I am PhD student (in the middle of the second semester) = (almost 7 months), particularly I am focusing on point cloud research for classification and segmentation all on my own. no guidance from my prof or fella Ph.D. (s). I have tow particular questions: should I drop out my Ph.D. under current supervisor? why? because almost there is no supervision and guidance? in a world with this huge of knowledge and fast going research, is it possible to end up with a satisfactory PhD on my own? considering that still my understanding for the field (point cloud processing) and DL is quite elementary. while I have the courage to work even though it is quite difficult to not having a fertile environment. should I quit and find a better place to pursue my research in? how bad is this situation? lastly, I am more concerned about my research strategy which is kinda on the fly actually. previously. from the beginning of the program until 3 months ago, I was solely reading groundbreaking papers i.e. pointnet, pointnet ++, point transformer series. I spent 3-4 months only exploring the very surface of the field, because it was my first interaction with the field and also honestly I did not have very good understanding of deep learning either. just grasped simple and high level concepts and ideas. but around 3 months ago, i realized this way I never come up with my own idea and contribute to the field. lack of knowledge coupled with absolute zero supervision and this naïve reading was not promising. so, i decided this time starting from scratch with Pointnet paper go deep and understand end to end of it. concepts in the paper, and its code implementation, which is still ongoing. I definitely feel I am learning. but the thing is: what should be my next step? particularly, there are different methods that have been used and structured literature in the field. so, should i pursue same strategy in many directions or just stick to one for a long time? 
I do not know even what are the exact options I have here! 🙁 I hope it is clear enough. submitted by /u/Same_Half3758 [link] [comments]

  • [D] Fine-tune Phi-3 model for domain specific data - seeking advice and insights
    by /u/aadityaura (Machine Learning) on May 3, 2024 at 7:10 am

    Hi, I am currently working on fine-tuning the Phi-3 model for financial data. While the loss is decreasing during training, suggesting that the model is learning quite well, the results on a custom benchmark are surprisingly poor. In fact, the accuracy has decreased compared to the base model. Results I've observed: Phi-3-mini-4k-instruct (base model): Average domain accuracy of 40% Qlora - Phi-3-mini-4k-instruct (fine-tuned model): Average domain accuracy of 35% I have tried various approaches, including QLora, Lora, and FFT, but all the results are poor compared to the base model. Moreover, I have also experimented with reducing the sequence length to 2k in an attempt to constrain the model and prevent it from going off-track, but unfortunately, this has not yielded any improvement. I'm wondering if there might be issues with the hyperparameters, such as the learning rate, or if there are any recommendations on how I can effectively fine-tune this model for better performance on domain-specific data. If anyone has successfully fine-tuned the Phi-3 model on domain-specific data, I would greatly appreciate any insights or advice you could share. Thank you in advance for your help and support! qlora configuration: ​ sequence_len: 4000 sample_packing: true pad_to_sequence_len: true trust_remote_code: True adapter: qlora lora_r: 256 lora_alpha: 512 lora_dropout: 0.05 lora_target_linear: true lora_target_modules: - q_proj - v_proj - k_proj - o_proj - gate_proj - down_proj - up_proj gradient_accumulation_steps: 1 micro_batch_size: 2 num_epochs: 4 optimizer: adamw_torch lr_scheduler: cosine learning_rate: 0.00002 warmup_steps: 100 evals_per_epoch: 4 eval_table_size: saves_per_epoch: 1 debug: deepspeed: weight_decay: 0.0 https://preview.redd.it/7afyhxcjv5yc1.png?width=976&format=png&auto=webp&s=1ce3efe6df6e4533bad5ec2f23e4f4968736bd56 ​ submitted by /u/aadityaura [link] [comments]

  • [R] postive draws for bioDraws
    by /u/h2_so4_ (Machine Learning) on May 3, 2024 at 6:57 am

    I'm a beginner in python. Please help me with the following situation. My research is stuck. Consider the following equation in which have to generate random values (currently have set the method to NORMAL_MLHS). L1 = c + sigmaL1 * bioDraws('E_L1', 'NORMAL_MLHS'), where L1 is an endogenous variable, c is an estimable constant for which the lower bound is 0. The lower bound for sigmaL1 is also 0. Which method can I use instead of 'NORMAL_MLHS' to ensure that it generates positive values and hence L1 is positive? submitted by /u/h2_so4_ [link] [comments]

  • [D] Distance Estimation - Real World coordinates
    by /u/Embarrassed_Top_5901 (Machine Learning) on May 3, 2024 at 5:38 am

    Hello, I'm sorry for resposting this question again but this is very important and I need assistance. I have three cameras in a room in different locations ( front, left and right wall). I should be able to find distance among humans in the room in meters. I performed camera calibration for all the cameras. I tried matching the common points using SIFT, and then performed DLT method but the values are way off and not even close to the actual values. I tried stereo vision as well but that is not giving me close values as well. I also have distanced between cameras in meters too. I'm a beginner in computer vision and I should complete this task soon but I have been stuck with this since one month and I'm getting tired as I'm not able to solve this issue and I'm running out of solutions. I would really appreciate if someone helps me and guide me in the right direction. Thanks a lot for your help and time 😄 submitted by /u/Embarrassed_Top_5901 [link] [comments]

  • [R] Iterative Reasoning Preference Optimization
    by /u/topcodemangler (Machine Learning) on May 3, 2024 at 3:20 am

    submitted by /u/topcodemangler [link] [comments]

  • [D] Looking at hardware options for an AI/LLM development machine for work. Training and inference on small-to-mid sized models. Lost in hardware specs -- details in post.
    by /u/IThrowShoes (Machine Learning) on May 3, 2024 at 1:26 am

    Greetings, At work I've been tasked with researching and developing some stuff around using LLMs in tandem with our in-house software suite. I can't go into many details due to policies, but it would eventually involve some PII identification/extraction, some document summarization, probably a little bit of RAG, etc. Over the last month or two, I've done some preliminary groundwork using very small models to show that something "is possible", but we'd like to take it to the next level. At this point I've been using a combination of my laptop's GPU (just a mobile RTX 3060) and my boss' RTX 4080 on an AMD threadripper machine. The 3060 falls over pretty quickly even on some of the smaller models, but the 4080 does pretty good at inferencing. But as you'd imagine I run out of VRAM pretty quickly trying to do anything slightly more robust. Part of my marching orders is to spec out some hardware for use in a local development machine/desktop. We have already put in an order for more production-grade hardware with a very sizable amount of VRAM (I think it hovers in at around 1 terabyte of VRAM, but not 100% sure) for use in our datacenter, but that wont arrive for a few months at least. With that, I am looking for some recommendations for a development workstation. I can't quite come to the conclusion if I should run multiple GPUs, or shell out for something that has more VRAM built-in. For example, do I run dual 3090s? Do I run an A6000 or two? Or one? Would a single RTX 6000 Ada (48GB) be sufficient? Given that: This is for development only, not production I want to inference small-to-mid sized models (probably up to 30b params) I probably want to fine tune small-to-mid sized models, if anything as a point of comparison. 
Even using LoRA/QLoRA Fine-tuning would be done on the Python side, and inferencing would be done using HuggingFace's candle library for Rust Using something cloud-based is discouraged on my end (can't go into details), and whatever software gets built that eventually lands in production can't talk with any external API anyways I dont mind using quantized models for development, but at some point I'd like to try on full precision models (which may have to wait for the production hardware to show up) I would say money is not a factor, but if I can budget something under $15k that'd be ideal What would you all recommend? Thanks! submitted by /u/IThrowShoes [link] [comments]

  • [R] Language settings in PrivateGPT implementation
    by /u/povedaaqui (Machine Learning) on May 2, 2024 at 10:07 pm

    Hello. I'm running PrivateGPT in a language other than english, and I don't get very well how the language settings work. Based in the example file, does it mean that when the first three parameters match, the prompt style will be set (in this case, "llama2")? I'm looking for the best setting possible for the foundational model I'm using for langagues different than english. settings-en.yaml: local: llm_hf_repo_id: TheBloke/Mistral-7B-Instruct-v0.1-GGUF llm_hf_model_file: mistral-7b-instruct-v0.1.Q4_K_M.gguf embedding_hf_model_name: BAAI/bge-small-en-v1.5 prompt_style: "llama2" For example, for phi3: phi3: llm_hf_repo_id: microsoft/Phi-3-mini-4k-instruct-gguf llm_hf_model_file: Phi-3-mini-4k-instruct-q4.gguf embedding_hf_model_name: nomic-ai/nomic-embed-text-v1.5 prompt_style: "phi3" submitted by /u/povedaaqui [link] [comments]

  • [D] Good strategies / resources to improve MLOps skills as a PhD student / researcher
    by /u/fliiiiiiip (Machine Learning) on May 2, 2024 at 9:52 pm

    A lot of researchers / PhD students in ML have prospects of joining the industry eventually (in US about 80% of ML PhDs are in the industry, according to the recently released Stanford's AI Index). What are some good tips / resources for someone to ensure he develops more practical & deployment-oriented MLOps skills? For example - setting up clusters, relevant cloud services (e.g. AWS), Docker, Kubernetes, developing internal tools for model training / data labelling... Stuff like that. submitted by /u/fliiiiiiip [link] [comments]

 

How do you make a Python loop faster?

Programmers are always looking for ways to make their code more efficient, and loops are a common bottleneck. Python is a high-level programming language that is widely used by developers and software engineers, prized for its readability and ease of use. One downside, however, is that its interpreted loops can be slow, which becomes a real problem when you need to process large amounts of data.

There are several ways to make Python loops faster. One is to push the loop into compiled code, for example by rewriting the hot path in C. Another is to use an optimized library such as NumPy. Finally, you can vectorize your code, expressing the computation as whole-array operations that run in optimized native code or even on a GPU or other parallel computing platform. By using these techniques, you can significantly speed up your Python code.
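As a sketch of the NumPy/vectorization route mentioned above (assuming NumPy is installed; the function names are illustrative), an element-wise doubling can be expressed as one whole-array operation instead of a Python loop:

```python
import numpy as np

def double_loop(values):
    # Pure-Python loop: each iteration runs in the interpreter.
    out = []
    for v in values:
        out.append(v * 2)
    return out

def double_numpy(values):
    # NumPy version: a single vectorized multiply executed in optimized C.
    return np.asarray(values) * 2

data = [1, 2, 3, 4]
print(double_loop(data))            # [2, 4, 6, 8]
print(double_numpy(data).tolist())  # [2, 4, 6, 8]
```

Both produce the same result; for large arrays the vectorized form avoids per-element interpreter overhead entirely.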

According to Vladislav Zorov: if you're not reaching for NumPy or something similar, use list comprehension expressions where possible. Those are handled by the C code of the Python interpreter instead of looping in Python bytecode. It's basically the same idea as the NumPy solution — you just don't want the loop running in interpreted Python.

Example (Python 3):

lst = list(range(1000000))

def loops():
    # Explicit loop: each iteration and append runs in interpreted Python.
    newlst = []
    for n in lst:
        newlst.append(n * 2)
    return newlst

def lstcomp():
    # List comprehension: the looping happens in the interpreter's C code.
    return [n * 2 for n in lst]

from timeit import timeit
print(timeit(loops, number=100))    # 18.953254899999592 seconds
print(timeit(lstcomp, number=100))  # 11.669047399991541 seconds

Python list traversal tip — instead of this:

for i in range(len(l)):
    x = l[i]

use this:

for i, x in enumerate(l):
    ...

to keep track of indices and values inside a loop. It is roughly twice as fast, and the code reads better.
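As a minimal illustration of that tip (the list `l` and variable names are placeholders), both forms produce the same index/value pairs, but enumerate avoids the repeated indexing:

```python
l = ["a", "b", "c"]

# Index-based traversal: looks each element up by position.
pairs_indexed = []
for i in range(len(l)):
    pairs_indexed.append((i, l[i]))

# enumerate yields (index, value) pairs directly, no indexing needed.
pairs_enumerated = []
for i, x in enumerate(l):
    pairs_enumerated.append((i, x))

print(pairs_enumerated)  # [(0, 'a'), (1, 'b'), (2, 'c')]
```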

Another option is to write loops in C instead of Python. This can be done with Cython and its pyximport module, which compiles Cython source at import time. By doing this, programmers can take advantage of the speed of C while still using the convenient syntax of Python.

Finally, developers can also improve the performance of their code by making use of caching. By caching values that are computed inside a loop, programmers can avoid recalculating them on every iteration. By taking these steps, programmers can make their Python code more efficient and faster.
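A minimal sketch of that caching idea using the standard library's functools.lru_cache (the `expensive` function and the call counter are illustrative stand-ins for a genuinely costly computation):

```python
from functools import lru_cache

call_count = 0

@lru_cache(maxsize=None)
def expensive(n):
    # Stand-in for a costly computation; the counter reveals cache misses.
    global call_count
    call_count += 1
    return n * n

# Repeated inputs inside the loop are only computed once each.
results = [expensive(n % 3) for n in range(9)]
print(results)     # [0, 1, 4, 0, 1, 4, 0, 1, 4]
print(call_count)  # 3 -- only the three distinct inputs were computed
```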

Very Important: Don’t worry about code efficiency until you find yourself needing to worry about code efficiency.

The place where you think about efficiency is within the logic of your implementations.

This is where “big O” discussions come into play. If you aren’t familiar, here is a link on the topic
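To make the big-O point concrete, here is a small sketch (the size and names are arbitrary): membership testing is O(n) on a list but O(1) on average on a set, a difference that dwarfs any micro-tuning of the loop body:

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)

# Worst case for the list: the probed value is at the end, so the
# membership test scans all n elements; the set does a hash lookup.
list_time = timeit.timeit(lambda: (n - 1) in as_list, number=100)
set_time = timeit.timeit(lambda: (n - 1) in as_set, number=100)

print(f"list: {list_time:.4f}s  set: {set_time:.4f}s")
# The set lookup is typically orders of magnitude faster.
```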


Python Coding Breaking News

  • Stumbling my way through
    by /u/-Doodoofard (Python) on May 3, 2024 at 9:22 pm

    New to python and trying to create a simple calculator of my own design just wondering how I could get x to times input1 and input2 together (x is whatever is inputed)(intended to be either / or * or +/-) submitted by /u/-Doodoofard [link] [comments]

  • Migrate data from one MongoDB cluster to another MongoDB cluster using Python
    by /u/vmanel96 (Python) on May 3, 2024 at 6:00 pm

    I need to migrate around 1 billion records from one MongoDB cluster to another MongoDB cluster using Python. What's the best way to do it using Python. Basically I need to insert data in batches and batch size should be controllable, and multi threads/process should be running. Logging should be present, and if the script stops in between it should continue process from where it stopped. Exception handling and retries are required. Any other approach / libraries which would simplify this? submitted by /u/vmanel96 [link] [comments]

  • New book! The Quick Python Book, Fourth Edition by Naomi Ceder
    by /u/ManningBooks (Python) on May 3, 2024 at 1:42 pm

    Hello everybody, Thank you for having us here, and a huge "Thank you" to the moderators for letting us post. We have just released the latest edition of The Quick Python Book by the one-and-only Naomi Ceder, and I wanted to share that news with the community. Many of you are already familiar with Naomi's work and her massive contributions to the world of Python programming language. The Quick Python Book has aided over 100,000 developers in mastering Python. The Fourth Edition of the book has been revised to include the latest features, control structures, and libraries of Python, along with new coverage of working with AI-generated Python code. Naomi, the author, has beautifully balanced the details of the language with the insights and advice required to accomplish any task. Her personal touch has made learning Python an enjoyable experience for countless developers. 📚 You can find the book here: https://mng.bz/aEQj 📖 Get into the liveBook: https://mng.bz/gvee And last but not the least, get 46% off with code: receder46 Hope you find the book helpful. Thank you. Cheers, submitted by /u/ManningBooks [link] [comments]

  • Project: Simple Interactive Python Streamlit Maps With NASA GIS Data
    by /u/jgloewen (Python) on May 3, 2024 at 10:22 am

    Python Streamlit is terrific for putting together interactive dashboards. Combined with the geopandas library, streamlit can easily display GIS data points on a map for you. Forest fires in my home province of British Columbia, Canada have been really bad recently. NASA has a terrific dataset that keeps track of forest fires by country. Can I use Streamlit to access this dataset and display a map off all the fires within a certain area (BC) for a particular time frame (2021)? And can I give the user the ability to choose a month? You bet! Let me step you through how! FREE tutorial (with code): https://johnloewen.substack.com/p/simple-interactive-python-streamlit submitted by /u/jgloewen [link] [comments]

  • typedattr: Autocompletion and typechecking for CLI script arguments, using standard argparse syntax
    by /u/gings7 (Python) on May 3, 2024 at 7:45 am

    Excited to share my pypi package typedparser I have been working on for around 1 year now. What My Project Does: It enables writing CLI scripts and create an "args" variable with autocompleted members and type checks, but still keeps the simple and universally understood syntax of the stdlib argarse module. Target Audience: For stability, I battletested it in my research projects and added automatic builds as well as 80%+ test coverage. So I believe it is pretty stable. Comparison: For typing functionality it uses the attrs package as backend. It also provides some additional features for object and dictionary manipulation. Of course there are many other CLI argument packages out there, but this one stands out in that it tries to keep the syntax of the argparse standard library as much as possible, making it easy for others to figure out what your script does. Check it out and let me know what you think. submitted by /u/gings7 [link] [comments]

  • Friday Daily Thread: r/Python Meta and Free-Talk Fridays
    by /u/AutoModerator (Python) on May 3, 2024 at 12:00 am

    Weekly Thread: Meta Discussions and Free Talk Friday 🎙️ Welcome to Free Talk Friday on /r/Python! This is the place to discuss the r/Python community (meta discussions), Python news, projects, or anything else Python-related! How it Works: Open Mic: Share your thoughts, questions, or anything you'd like related to Python or the community. Community Pulse: Discuss what you feel is working well or what could be improved in the /r/python community. News & Updates: Keep up-to-date with the latest in Python and share any news you find interesting. Guidelines: All topics should be related to Python or the /r/python community. Be respectful and follow Reddit's Code of Conduct. Example Topics: New Python Release: What do you think about the new features in Python 3.11? Community Events: Any Python meetups or webinars coming up? Learning Resources: Found a great Python tutorial? Share it here! Job Market: How has Python impacted your career? Hot Takes: Got a controversial Python opinion? Let's hear it! Community Ideas: Something you'd like to see us do? tell us. Let's keep the conversation going. Happy discussing! 🌟 submitted by /u/AutoModerator [link] [comments]

  • Dash vs Reflex vs Others
    by /u/Sea_Split_1182 (Python) on May 2, 2024 at 10:56 pm

    Where can I find a decent comparison (pros and cons) of these 5 solutions? They seem to be solving the same problem, which is, afaiu, separating the frontend ‘annoyance’ from Python scripting / math. ⁠Reflex (used to be called Pynecone) https://reflex.dev ⁠Streamlit https://streamlit.io ⁠Gradio https://gradio.app ⁠Dash https://dash.plotly.com ⁠Panel https://panel.holoviz.org/ ⁠Anvil https://anvil.works/ Quarto My use case: user access the web app, choose some parameters, selects things that go or not into a model. Python returns results of my math. Needs to be somewhat eye-candy and I need to use a lot of pictures to get the user input (i.e. “which of these figures you like most? 1,2,3. User clicks on “3”, 3 is considered in the model. submitted by /u/Sea_Split_1182 [link] [comments]

  • I made a python package that can parse Excel Formula Strings into dictionary structures!
    by /u/MPGaming9000 (Python) on May 2, 2024 at 8:19 pm

    What my project does: It basically takes a formula string like you'd get from Openpyxl like "=SUM(A1:B2)" and breaks it all out into a dictionary structure for you to then navigate through, modify, and then reformat that modified structure back into an excel friendly formula string again! Target Audience: (People who modify Excel formula strings in automated spreadsheet modification scripts. Or people who need to analyze formulas in a spreadsheet to do some kind of logic based on that analysis). Disclaimer: For most people some simple regex pattern matching and str replaces would be fine to modify formulas but if you need a more structured approach to working with these strings, this package has you covered! How does it differ compared to other projects: There are libraries like Openpyxl that allow you to tokenize and translate formulas but that's currently where it ends. It doesn't allow you to systematically parse out a formula and replace those pieces and add new structures and what not into it. Currently the best you can really do is translate formulas and anything other than that would need to rely on regex string matching logic or string replacements. (Which still would be fine for most people, but this just adds another layer of organization and scalability to the format). More info about it here: https://github.com/Voltaic314/ExcelFormulaParser To install, just do: pip install ExcelFormulaParser Thank you for reading this!! Hope you guys find it useful if you're ever systematically modifying (or analyzing) spreadsheets! submitted by /u/MPGaming9000 [link] [comments]

  • Tutorial on Building a Server-to-Server Zoom App with Python
    by /u/SleekEagle (Python) on May 2, 2024 at 6:57 pm

    I made a tutorial on how to build a server-to-server Zoom OAuth application using Python. This application can transcribe Zoom meeting recordings, print the transcripts to the terminal, and save the transcripts as text files. video tutorial repo written tutorial This tutorial covers: Setting up OAuth authentication for server-to-server apps Utilizing the Zoom API to access recordings Implementing automatic transcription using Python submitted by /u/SleekEagle [link] [comments]

  • Starter Code for a LLM-based AI Assistant
    by /u/2bytesgoat (Python) on May 2, 2024 at 6:52 pm

    Hey everyone 👋 TL;DR Since everyone is talking about the Humane AI Pin and the Rabbit R1, I decided to make a short 5 minute tutorial on how people can setup and customize their own little AI assistant on their machine. I've uploaded a video tutorial here: https://www.youtube.com/watch?v=2fD_SAouoOs&ab_channel=2BytesGoat And the Github code is here: https://github.com/2BYTESGOAT/AI-ASSISTANT Longer version What my project does: It's the starter code for an AI assistant that you can run locally. More precisely, it's a ChatGPT / Llama 2 agent that has access to Google Search and can get businesses nearby based on your location. The tool can be easily extended to support other APIs. Target audience: Pythoneers that are curious about LLMs and LLM related libraries. Comparison: It was inspired by projects such as the Humane AI Pin and the Rabbit R1. Though it's a inferior version to those, it serves more as a playground for people to develop their own AI assistants. submitted by /u/2bytesgoat [link] [comments]

  • k8sAI - my open-source GPT CLI tool for Kubernetes!
    by /u/Wild_Plantain528 (Python) on May 2, 2024 at 3:52 pm

    What my project does: I wanted to share an open-source project I’ve been working on called k8sAI. It’s a personal AI Kubernetes expert that can answer questions about your cluster, suggests commands, and even executes relevant kubectl commands to help diagnose and suggest fixes to your cluster, all in the CLI! Target Audience: As a relative newcomer to k8s, this tool has really streamlined my workflow. I can ask questions about my cluster, k8sAI will run kubectl commands to gather info, and then answer those question. It’s also found several issues in my cluster for me - all I’ve had to do is point it in the right direction. I’ve really enjoyed making and using this so I thought it could be useful for others. Added bonus is that you don’t need to copy and paste into ChatGPT anymore! k8sAI operates with read-only kubectl commands to make sure your cluster stays safe. All you need is an OpenAI API key and a valid kubectl config. Start chatting with k8sAI using: $ pip install k8sAI $ k8sAI chat or to fix an issue: $ k8sAI fix -p="take a look at the failing pod in the test namespace" Would love to get any feedback you guys have! Here's the repo for anyone who wants to take a look Comparison: I found a tool (k8sGPT) that I enjoyed using, but I felt it was still missing a few pieces on the chatbot side. You can't chat back and forth with k8sGPT and it doesn't suggest commands for you to execute, so I decided to make this. submitted by /u/Wild_Plantain528 [link] [comments]

  • Multipart File Uploads to S3 with Python
    by /u/tylersavery (Python) on May 2, 2024 at 3:28 pm

    I created this tutorial after overcoming a difficult challenge myself: uploading 5GB+ files to AWS. This approach allows the browser to securely upload directly to an S3 bucket without the file having to travel through the backend server. The implementation is written in python (backend) and vanilla js (frontend). submitted by /u/tylersavery [link] [comments]

  • Hatch v1.10.0 - UV support, new test command and built-in script runner
    by /u/Ofekmeister (Python) on May 2, 2024 at 2:00 pm

    Hello everyone! I'd like to announce version 1.10.0: https://hatch.pypa.io/latest/blog/2024/05/02/hatch-v1100/ Feel free to provide any feedback either here or as a discussion on the repo: https://github.com/pypa/hatch submitted by /u/Ofekmeister [link] [comments]

  • The Python on Microcontrollers (and Raspberry Pi) Newsletter, a weekly news and project resource
    by /u/HP7933 (Python) on May 2, 2024 at 2:00 pm

    The Python on Microcontrollers (and Raspberry Pi) Newsletter: subscribe for free With the Python on Microcontrollers newsletter, you get all the latest information on Python running on hardware in one place! MicroPython, CircuitPython and Python on single Board Computers like Raspberry Pi & many more. The Python on Microcontrollers newsletter is the place for the latest news. It arrives Monday morning with all the week’s happenings. No advertising, no spam, easy to unsubscribe. 10,958 subscribers - the largest Python on hardware newsletter out there. Catch all the weekly news on Python for Microcontrollers with adafruitdaily.com. This ad-free, spam-free weekly email is filled with CircuitPython, MicroPython, and Python information that you may have missed, all in one place! Ensure you catch the weekly Python on Hardware roundup– you can cancel anytime – try our spam-free newsletter today! https://www.adafruitdaily.com/ submitted by /u/HP7933 [link] [comments]

  • What does your python development setup look like?
    by /u/Working_Noise_6043 (Python) on May 2, 2024 at 9:18 am

    I'd like to explore other people's setup and perhaps try need things or extra tools. What kind IDE, any extra tools to make it easier for you, etc. Looking forward to everyone's responses! submitted by /u/Working_Noise_6043 [link] [comments]

  • Suggestions for a self-hosted authentication as a service?
    by /u/FlyingRaijinEX (Python) on May 2, 2024 at 7:18 am

    I have a simple backend REST API service that is serving a few ML models. I have made it "secured" by implementing an API key in order call those endpoints. I was wondering, how common it is for people to use services that can be self-hosted as their authentication/authorization. If it is common and reliable, what are the best options to go for? I've read that building your own authentication/authorization service with email verification, password reset, and social auth can be a pain. Also, did some googling and found this General - Fief. Has anyone ever tried using this? If so, how was the experience? Thanks in advance. submitted by /u/FlyingRaijinEX [link] [comments]

  • One pytest marker to track the performance of your tests
    by /u/toodarktoshine (Python) on May 2, 2024 at 5:18 am

    Hello Pythonistas! I just wrote a blog post about measuring performance inside pytest test cases. We dive into why it’s important to test for performance and how to integrate the measurements in the CI. Here is the link to the blog: https://codspeed.io/blog/one-pytest-marker-to-track-the-performance-of-your-tests ​ submitted by /u/toodarktoshine [link] [comments]

  • PkgInspect - Inspect Local/External Python Packages
    by /u/yousefabuz (Python) on May 2, 2024 at 2:52 am

    GitHub What My Project Does PkgInspect is a comprehensive tool designed to inspect and compare Python packages and Python versions effortlessly. It equips users with a comprehensive set of tools and utility classes to retrieve essential information from installed Python packages, compare versions seamlessly, and extract various details about Python installations with ease. Target Audience Developers and Python enthusiasts looking to streamline the process of inspecting Python packages, comparing versions, and extracting vital information from Python installations will find PkgInspect invaluable. Many current modules such as importlib_metadata and pkg_resources are fairly limited on what items can be inspected and retrieved for a specified python package. Also noticed pkg_resources has also deprecated some of its important retrieval methods. Comparison PkgInspect stands out from other Python package inspection tools due to its robust features. Unlike traditional methods that require manual inspection and comparison, PkgInspect automates the process, saving developers valuable time and effort. With PkgInspect, you can effortlessly retrieve package information, compare versions across different Python installations, and extract crucial details with just a few simple commands. Key Features Inspect Packages: Retrieve comprehensive information about installed Python packages. Compare Versions: Seamlessly compare package data across different Python versions. Retrieve Installed Pythons: Identify and list installed Python versions effortlessly. Inspect PyPI Packages: Gather detailed information about packages from the Python Package Index (PyPI). Fetch Available Updates: Stay up-to-date with available updates for a package from the current version. List Inspection Fieldnames: Access a list of available fieldnames for package inspection. Retrieve Package Metrics: Extract OS statistics about a package effortlessly. 
Fetch GitHub Statistics: Retrieve insightful statistics about a package from GitHub effortlessly. Retrieve all Python Packages: Easily list all installed Python packages for a given Python version. Main Components Core Modules PkgInspect: Inspects Python packages and retrieves package information. PkgVersions: Retrieves and compares package data across different Python versions. PkgMetrics: Extracts OS statistics about a package. Functions - inspect_package: Inspects a Python package and retrieves package information. - inspect_pypi: Inspects a package from the Python Package Index (PyPI). - get_available_updates: Fetches available updates for a package from the current version. - get_installed_pythons: Identifies and lists installed Python versions. - get_version_packages: Lists all installed Python packages for a given Python version. - pkg_version_compare: Compares package data across different Python versions. Inspection Field Options Any other field name will be treated as a file name to inspect from the packages' site-path directory. - `short_meta` (dict[str, Any]): Returns a dictionary of the most important metadata fields. - If only one field is needed, you can use any of the following metadata fields. - Possible Fields instead of `short_meta`: - `Metadata-Version` (PackageVersion) - `Name` (str) - `Summary` (str) - `Author-email` (str) - `Home-page` (str) - `Download-URL` (str) - `Platform(s)` (set) - `Author` (str) - `Classifier(s)` (set) - `Description-Content-Type` (str) - `short_license` (str): Returns the name of the license being used. - `metadata` (str): Returns the contents of the METADATA file. - `installer` (str): Returns the installer tool used for installation. - `license` (str): Returns the contents of the LICENSE file. - `record` (str): Returns the list of installed files. - `wheel` (str): Returns information about the Wheel distribution format. - `requested` (str): Returns information about the requested installation. 
- `authors` (str): Returns the contents of the AUTHORS.md file. - `entry_points` (str): Returns the contents of the entry_points.txt file. - `top_level` (str): Returns the contents of the top_level.txt file. - `source_file` (str): Returns the source file path for the specified package. - `source_code` (str): Returns the source code contents for the specified package. - `doc` (str): Returns the documentation for the specified package. - `Pkg` Custom Class Fields - `PkgInspect fields`: Possible Fields from the `PkgInspect` class. - `site_path` (Path): Returns the site path of the package. - `package_paths` (Iterable[Path]): Returns the package paths of the package. - `package_versions` (Generator[tuple[str, tuple[tuple[Any, str]]]]): Returns the package versions of the package. - `pyversions` (tuple[Path]): Returns the Python versions of the package. - `installed_pythons` (TupleOfPkgVersions): Returns the installed Python versions of the package. - `site_packages` (Iterable[str]): Returns the site packages of the package. - `islatest_version` (bool): Returns True if the package is the latest version. - `isinstalled_version` (bool): Returns True if the package is the installed version. - `installed_version` (PackageVersion): Returns the installed version of the package. - `available_updates` (TupleOfPkgVersions): Returns the available updates of the package. - `PkgVersions fields`: Possible Fields from the `PkgVersions` class. - `initial_version` (PackageVersion): Returns the initial version of the package. - `installed_version` (PackageVersion): Returns the installed version of the package. - `latest_version` (PackageVersion): Returns the latest version of the package. - `total_versions` (int): Returns the total number of versions of the package. - `version_history` (TupleOfPkgVersions): Returns the version history of the specified package. - `package_url`: Returns the URL of the package on PyPI. 
- `github_stats_url` (str): Returns the GitHub statistics URL of the package. - `github_stats` (dict[str, Any]): Returns the GitHub statistics of the package. - The GitHub statistics are returned as a dictionary \ containing the following fields which can accessed using the `item` parameter: - `Forks` (int): Returns the number of forks on GitHub. - `Stars` (int): Returns the number of stars on GitHub. - `Watchers` (int): Returns the number of watchers on GitHub. - `Contributors` (int): Returns the number of contributors on GitHub. - `Dependencies` (int): Returns the number of dependencies on GitHub. - `Dependent repositories` (int): Returns the number of dependent repositories on GitHub. - `Dependent packages` (int): Returns the number of dependent packages on GitHub. - `Repository size` (NamedTuple): Returns the size of the repository on GitHub. - `SourceRank` (int): Returns the SourceRank of the package on GitHub. - `Total releases` (int): Returns the total number of releases on GitHub. - `PkgMetrics fields`: Possible Fields from the `PkgMetrics` class. - `all_metric_stats` (dict[str, Any]): Returns all the OS statistics of the package. - `total_size` (int): Returns the total size of the package. - `date_installed` (datetime): Returns the date the package was installed. - `pypistats fields`: Possible Fields from the `pypistats` module. - `all-pypi-stats` (dict[str, Any]): Returns all the statistics of the package on PyPI into a single dictionary. - `stats-overall` (dict[str, Any]): Returns the overall statistics of the package on PyPI. - `stats-major` (dict[str, Any]): Returns the major version statistics of the package on PyPI. - `stats-minor` (dict[str, Any]): Returns the minor version statistics of the package on PyPI. - `stats-recent` (dict[str, Any]): Returns the recent statistics of the package on PyPI. - `stats-system` (dict[str, Any]): Returns the system statistics of the package on PyPI. 
Downsides & Limitations

The algorithms work fairly well but do come with some important downsides:

- PkgInspect will ONLY inspect packages that are Python files or contain a dist-info folder in the site-packages folder for a given Python version. I was not able to find an efficient way to retrieve all necessary packages without also picking up irrelevant folders/files, so some personal packages may be skipped.
- Beta (pre-release) versions have not been implemented yet.
- Because many files may be handled, the runtime may be slow for some users.
- The demand for a project like this is not huge, but I have noticed many people, including myself, still looking for something like it. This type of project does seem to exceed my experience level with Python and algorithms (hence the downsides), so I am not entirely sure how far it may go in the future. I was hoping to make it GUI-based if possible.

Usage Examples

```python
from pkg_inspect import inspect_package

inspect_package("pkg_inspect", itemOrfile="initial_version")
# Output (Format - DateTimeAndVersion):
# ('May 02, 2024', '0.1.0')

inspect_package("pkg_inspect", itemOrfile="version_history")
# Output (Format - tuple[DateTimeAndVersion]):
# (('May 02, 2024', '0.1.2'), ('May 02, 2024', '0.1.1'), ('May 02, 2024', '0.1.0'))

inspect_package("pkg_inspect", pyversion="3.12", itemOrfile="short_meta")
# Output (Format - dict[str, Any]):
# {'Author': 'Yousef Abuzahrieh',
#  'Author-email': 'yousefzahrieh17@gmail.com',
#  'Classifiers': {'Development Status 4 Beta',
#                  'Intended Audience Developers',
#                  'License OSI Approved Apache Software License',
#                  'Operating System OS Independent',
#                  'Programming Language Python 3',
#                  'Programming Language Python 3 Only',
#                  'Topic Utilities'},
#  'Description-Content-Type': 'text/markdown',
#  'Download-URL': 'https://github.com/yousefabuz17/PkgInspect.git',
#  'Home-page': 'https://github.com/yousefabuz17/PkgInspect',
#  'License': 'Apache Software License',
#  'Metadata-Version': <Version('2.1')>,
#  'Name': 'pkg-inspect',
#  'Platforms': {'Windows', 'MacOS', 'Linux'},
#  'Summary': 'A comprehensive tools to inspect Python packages and Python installations.'}

inspect_package("pandas", pyversion="3.12", itemOrfile="github_stats")
# Output (Format - dict[str, Any]):
# {'Contributors': '1.09K',
#  'Dependencies': 3,
#  'Dependent packages': '41.3K',
#  'Dependent repositories': '38.4K',
#  'Forks': '17.3K',
#  'Repository size': Stats(symbolic='338.000 KB (Kilobytes)', calculated_size=338.0, bytes_size=346112.0),
#  'SourceRank': 32,
#  'Stars': '41.9K',
#  'Total releases': 126,
#  'Watchers': 1116}
```

submitted by /u/yousefabuz [link] [comments]

  • How to create architecture diagrams from code in Jupyter Notebook
    by /u/writer_on_rails (Python) on May 2, 2024 at 2:32 am

    Hello world, I wrote an article about creating diagrams from code in Jupyter Notebook inside VS Code. It will give you a brief on the setup and also an overview of the concepts. Within 5 minutes, you should be able to start making cool architecture diagrams. [TO MODERATOR: This link does not contain any paywalled or paid content. All the contents are available for free] Article link: https://ashgaikwad.substack.com/p/how-to-create-architecture-diagrams submitted by /u/writer_on_rails [link] [comments]

  • Thursday Daily Thread: Python Careers, Courses, and Furthering Education!
    by /u/AutoModerator (Python) on May 2, 2024 at 12:00 am

    Weekly Thread: Professional Use, Jobs, and Education 🏢 Welcome to this week's discussion on Python in the professional world! This is your spot to talk about job hunting, career growth, and educational resources in Python. Please note, this thread is not for recruitment. How it Works: Career Talk: Discuss using Python in your job, or the job market for Python roles. Education Q&A: Ask or answer questions about Python courses, certifications, and educational resources. Workplace Chat: Share your experiences, challenges, or success stories about using Python professionally. Guidelines: This thread is not for recruitment. For job postings, please see r/PythonJobs or the recruitment thread in the sidebar. Keep discussions relevant to Python in the professional and educational context. Example Topics: Career Paths: What kinds of roles are out there for Python developers? Certifications: Are Python certifications worth it? Course Recommendations: Any good advanced Python courses to recommend? Workplace Tools: What Python libraries are indispensable in your professional work? Interview Tips: What types of Python questions are commonly asked in interviews? Let's help each other grow in our careers and education. Happy discussing! 🌟 submitted by /u/AutoModerator [link] [comments]

What are the top 10 most insane myths about computer programmers?


Programmers are often seen as an eccentric breed. There are many myths about computer programmers that circulate both within and outside of the tech industry. Some of these myths are harmless misconceptions, while others can be damaging to both individual programmers and the industry as a whole.

 Here are 10 of the most insane myths about computer programmers:

1. Programmers are all socially awkward nerds who live in their parents’ basements.
2. Programmers only care about computers and have no other interests.
3. Programmers are all genius-level intellects with photographic memories.
4. Programmers can code anything they set their minds to, no matter how complex or impossible it may seem.
5. Programmers only work on solitary projects and never collaborate with others.
6. Programmers write code that is completely error-free on the first try.
7. All programmers use the same coding languages and tools.
8. Programmers can easily find jobs anywhere in the world thanks to the worldwide demand for their skills.
9. Programmers always work in dark, cluttered rooms with dozens of monitors surrounding them.
10. Programmers can’t have successful personal lives because they spend all their time working on code.

Here is another set of myths about computer programmers, in more detail:

Myth #1: Programmers are lazy.

This couldn’t be further from the truth! Programmers are some of the hardest working people in the tech industry. They are constantly working to improve their skills and keep up with the latest advancements in technology.

Myth #2: Programmers don’t need social skills.

While it is true that programmers don’t need to be extroverts, they do need to have strong social skills. Programmers need to be able to communicate effectively with other members of their team, as well as with clients and customers.

Myth #3: All programmers are nerds.

There is a common misconception that all programmers are nerdy introverts who live in their parents’ basements. This could not be further from the truth! While there are certainly some nerds in the programming community, there are also a lot of outgoing, social people. In fact, programming is a great field for people who want to use their social skills to build relationships and solve problems.


Myth #4: Programmers are just code monkeys.

Programmers are often seen as nothing more than people who write code all day long. However, this could not be further from the truth! Programmers are critical thinkers who use their analytical skills to solve complex problems. They are also creative people who use their coding skills to build new and innovative software applications.

Myth #5: Anyone can learn to code.

This myth is particularly damaging, as it dissuades people from pursuing careers in programming. The reality is that coding is a difficult skill to learn, and it takes years of practice to become a proficient programmer. While it is true that anyone can learn to code, it is important to understand that it is not an easy task.

Myth #6: Programmers don’t need math skills.

This myth is simply not true! Programmers use math every day, whether they’re calculating algorithms or working with big data sets. In fact, many programmers have degrees in mathematics or computer science because they know that math skills are essential for success in the field.



Myth #7: Programming is a dead-end job.

This myth likely comes from the fact that many people view programming as nothing more than code monkey work. However, this could not be further from the truth! Programmers have a wide range of career options available to them, including software engineering, web development, and data science.

Myth #8: Programmers only work on single projects.

Again, this myth likely comes from the outside world’s view of programming as nothing more than coding work. In reality, programmers often work on multiple projects at once. They may be responsible for coding new features for an existing application, developing a new application from scratch, or working on multiple projects simultaneously as part of a team.

Myth #9: Programming is easy once you know how to do it.

This myth is particularly insidious, as it leads people to believe that they can simply learn how to code overnight and become successful programmers immediately thereafter. The reality is that learning how to code takes time, practice, and patience. Even experienced programmers still make mistakes sometimes!


Myth #10: Programmers don’t need formal education

This myth likely stems from the fact that many successful programmers are self-taught. However, this does not mean that formal education is unnecessary. Many employers prefer candidates with degrees in computer science or related fields, and formal education can give you an important foundation in programming concepts and theory.

Myth #11: They put in immense amounts of time at the job

I worked for 38 years programming computers. During that time, there were only two periods when I needed to put in significant extra time at the job. For the first two years, I spent more time getting acclimated to the job (which I then left at the age of 22) with a blood pressure of 153/105. Not a good situation. The second time was at the end of my career, when I was the only person who could get a particular project completed (due to special knowledge of the area) in the timeframe required. I spent about five months putting in a lot of time.

Myth #12: They need to know advanced math

Some programmers may need to know advanced math, but in the areas where I (and others) worked, being able to estimate resulting values and visualization skills were more important. One needs to know when a displayed number is not correct. Visualization is the ability to see the “big picture” and envision the associated tasks necessary to build that big picture correctly. You need to be able to decompose each of those tasks to limit complexity and make the work easier to debug. In general, the less complex code is, the fewer errors/bugs it has and the easier they are to identify and fix.

Myth #13: Programmers remember thousands of lines of code.

No, we don’t. We know the approximate part of the program where the problem could be, and we can localize it using a debugger or logs. That’s all.

Myth #14: Everyone could be a programmer.

No. One must have not only the desire to be a programmer but also a certain addiction to it. Programming is not a closed or elite art; it’s just another human occupation. But just as not everyone can be a doctor or a businessman, not everyone can be a programmer.

Myth #15: A simple business request can be easily implemented

No. The ease of implementation is defined by the model used inside the software. Something that looks simple to business owners can be almost impossible to implement without significantly changing the model, which can take weeks. And vice versa: a seemingly hard business problem can sometimes be implemented in 15 minutes.

Myth #16: Please fix <put any electronic device here> or set up my printer – you are a programmer!

Yes, I’m a programmer – neither an electronics engineer nor a system administrator. I write programs; I don’t fix devices or set up software and hardware!

As you can see, there are many myths about computer programmers circulating within and outside of the tech industry. These myths can be damaging to both individual programmers and the industry as a whole. It’s important to dispel these myths so that we can continue attracting top talent into the field of programming!


What are the top 5 common Python patterns when using dictionaries?


In Python, a dictionary is a data structure that allows you to store data in a key/value format. This is similar to a Map in Java. A dictionary is mutable, which means you can add, remove, and update elements in a dictionary. Since Python 3.7, dictionaries preserve the order in which keys were inserted; in older versions they were unordered. Python dictionaries are extremely versatile data structures. They can be used to store data in a variety of ways and can be manipulated to perform a wide range of operations.

There are many different ways to use dictionaries in Python. In this blog post, we will explore some of the most popular patterns for using dictionaries in Python.

The first pattern is using the in operator to check if a key exists in a dictionary. This can be helpful when you want to avoid errors when accessing keys that may not exist.
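As a minimal sketch of this first pattern (the `inventory` dictionary here is a made-up example):

```python
inventory = {"apples": 3, "pears": 0}

# Membership tests with `in` look at keys, not values.
if "apples" in inventory:
    print(inventory["apples"])  # safe: the key is known to exist

if "bananas" not in inventory:
    print("no bananas tracked")
```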

The second pattern is using the get method to access values in a dictionary. This is similar to using the in operator, but it also allows you to specify a default value to return if the key does not exist.

The third pattern is using nested dictionaries. This is useful when you need to store multiple values for each key in a dictionary.
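As a quick sketch, a nested dictionary simply stores another dictionary as each value (the `users` structure here is illustrative):

```python
# Each user maps to its own dictionary of attributes.
users = {
    "alice": {"email": "alice@example.com", "age": 30},
    "bob": {"email": "bob@example.com", "age": 25},
}

print(users["alice"]["email"])  # prints alice@example.com

# Chaining .get() with an empty-dict default avoids a KeyError
# when the outer key is missing.
print(users.get("carol", {}).get("age"))  # prints None
```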

The fourth pattern is using the items method to iterate over the key-value pairs in a dictionary. This is handy when you need to perform some operation on each pair in the dictionary.

The fifth and final pattern is using the update method to merge two dictionaries together. This can be useful when you have two dictionaries with complementary data that you want to combine into one dictionary.
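As a minimal sketch of this fifth pattern (the `defaults`/`overrides` names are made up, and the `|` merge operator assumes Python 3.9+):

```python
defaults = {"theme": "light", "pagesize": 20}
overrides = {"theme": "dark"}

# update() merges in place; keys from `overrides` win on conflict.
settings = defaults.copy()
settings.update(overrides)
print(settings)  # {'theme': 'dark', 'pagesize': 20}

# Python 3.9+: the | operator builds a merged copy without
# mutating either input.
merged = defaults | overrides
print(merged)    # {'theme': 'dark', 'pagesize': 20}
```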

1) Creating a Dictionary
You can create a dictionary by using curly braces {} and separating key/value pairs with a comma. Keys must be unique and must be immutable (i.e., they cannot be changed). Values can be anything you want, including another dictionary. Here is an example of creating a dictionary:

```python
dict1 = {'a': 1, 'b': 2, 'c': 3}
```

 


2) Accessing Elements in a Dictionary
You can access elements in a dictionary by using square brackets [] and the key for the element you want to access. For example:
```python
print(dict1['a'])  # prints 1
```

If the key doesn’t exist in the dictionary, you will get a KeyError. You can avoid this by using the get() method, which returns None if the key doesn’t exist in the dictionary. For example:

```python
print(dict1.get('d'))  # prints None
```

If you want to get all of the keys or values from a dictionary, you can use the keys() or values() methods. For example:

```python
keys = dict1.keys()      # gets all of the keys
print(keys)              # dict_keys(['a', 'b', 'c'])

values = dict1.values()  # gets all of the values
print(values)            # dict_values([1, 2, 3])
```

3) Updating Elements in a Dictionary

You can update elements in a dictionary by using square brackets [] and assigning a new value to the key. For example:

```python
dict1['a'] = 10
print(dict1['a'])  # prints 10
```

You can add items to a dictionary by using the update() method. It takes another dictionary (or an iterable of key/value pairs) as an argument and adds each element to the dictionary as a key-value pair. If a key already exists in the dictionary, then the value of that key will be updated with the new value.

```python
dict1 = {'key1': 'value1', 'key2': 'value2', 'key3': 'value3'}
dict1.update({'key4': 'value4', 'key5': 'value5'})
print(dict1)
# {'key1': 'value1', 'key2': 'value2', 'key3': 'value3', 'key4': 'value4', 'key5': 'value5'}
```


4) Deleting Elements from a Dictionary

You can delete elements from a dictionary by using the del keyword and specifying the key for the element you want to delete. For example:

```python
del dict1['c']
print(dict1)  # prints {'a': 10, 'b': 2}
```

You can remove items from a dictionary by using either the pop() or clear() methods. The pop() method removes the item with the given key and returns its value (the related popitem() method removes and returns the last inserted key-value pair). The clear() method removes all items, leaving an empty dictionary {}.

```python
dict1 = {'key1': 'value1', 'key2': 'value2', 'key3': 'value3'}
value = dict1.pop('key1')  # returns 'value1'
dict1.popitem()            # removes and returns ('key3', 'value3')
dict1.clear()              # dict1 is now {}
```

 

5) Looping Through Elements in a Dictionary

You can loop through elements in a dictionary by using a for loop on either the keys(), values(), or items(). items() returns both the keys and values from the dictionary as tuples (key, value). For example:

```python
for key in dict1:
    print("{}: {}".format(key, dict1[key]))   # prints each key/value pair

for key, value in dict1.items():
    print("{}: {}".format(key, value))        # prints each key/value pair

for value in dict1.values():
    print("{}".format(value))                 # prints each value
```

6) For iterating over a dictionary and accessing the key and value at the same time:

```python
for key, value in d.items():
    ...
```

instead of:

```python
for key in d:
    value = d[key]
    ...
```

7) For getting a value with a default when the key might not exist:

```python
v = d.get(k, None)
```

instead of:

```python
if k in d:
    v = d[k]
else:
    v = None
```

8) For collating values against keys which can be duplicated.

```python
from collections import defaultdict

d = defaultdict(list)
for key, value in datasource:
    d[key].append(value)
```

instead of:

```python
d = {}
for key, value in datasource:
    if key in d:
        d[key].append(value)
    else:
        d[key] = [value]
```

9) And of course, if you find yourself doing this:

```python
from collections import defaultdict

d = defaultdict(int)
for key in datasource:
    d[key] += 1
```

then maybe you need to do this:

```python
from collections import Counter

c = Counter(datasource)
```

Dictionaries are one of the most versatile data structures available in Python. As you have seen from this blog post, there are many different ways that they can be used to store and manipulate data. Whether you are just starting out with Python or are an experienced programmer, understanding how to use dictionaries effectively is essential to writing efficient and maintainable code.

Dictionaries are powerful data structures that offer a lot of flexibility in how they can be used. By understanding and utilizing these common patterns, you can leverage the power of dictionaries to write more efficient and effective Python code. Thanks for reading!


 

Top 10 ways for software engineers or developers to gain more power in their companies?


Power is a relative term. In business, power is the ability to influence others to achieve a desired outcome. As a software engineer, you may not be the CEO of your company, but that doesn’t mean you can’t wield a considerable amount of power. Here are the top 10 ways for software engineers to gain more power in their companies.

1. Become an expert in your field.

The first way to gain more power in your company is to become an expert in your field. When you know more than anyone else in the room, people are going to listen to what you have to say. Be the go-to person for questions about your area of expertise and make yourself indispensable.

2. Make friends with the right people.

It’s not what you know, it’s who you know. Making friends with the right people can open doors and help you get your foot in the door. If you want to gain more power in your company, start by making friends with the people who already have it.

3. Speak up and be heard.

If you want to influence others, you need to be vocal about it. Don’t be afraid to speak up and share your ideas with others. The more you speak up, the more likely it is that your voice will be heard and that you’ll be able to make a difference.

4. Network, network, network!

Get out there and meet people! The more connections you have, the greater your chances of success will be. Go to industry events, meetups, and conferences and make yourself known. The more visible you are, the more likely it is that you’ll be able to exert your influence when it matters most.


5. Write articles or blog posts about your area of expertise.

Sharing your knowledge with others is a great way to build credibility and gain recognition as an expert in your field. Writing articles or blog posts about your area of expertise is a great way to get started. Not only will you get exposure for your ideas, but you’ll also get feedback from others that can help improve your thinking. And, if people like what they read, they may even start quoting you as an expert!

6. Stay up-to-date on industry trends.

If you want to be seen as an expert, it’s important that you stay up-to-date on industry trends. Keep abreast of new developments in your field so that you can provide valuable insights when opportunities or problems arise. Staying current will also give you an edge over the competition when it comes time for promotions or raises!

7. Give presentations or talks about your area of expertise.

Sharing your knowledge with others is a great way to build credibility and gain recognition as an expert in your field. If there are conferences or meetups related to your industry, see if there are any speaking opportunities available. Giving presentations or talks is also a great opportunity to network with other professionals in your field.

8. Volunteer to work on high-profile projects.

If you want to increase your visibility within your company, volunteering to work on high-profile projects is a great way to do it. Not only will you get to work on some challenging and interesting projects, but you’ll also get to rub shoulders with some influential people. And if you do a good job, your work will speak for itself, which can only improve your standing within the company.



9. Seek out mentorship opportunities.

A mentor can provide invaluable guidance, both professional and personal, as you navigate your career. Mentors can help you develop both professionally and personally, and they can also introduce you to their networks. If you’re interested in finding a mentor, seek out programs within your company or look for individuals who have already achieved what you aspire to accomplish.

10. Be assertive, but not aggressive.

Being assertive means being confident enough in your abilities to speak up for yourself when necessary, but not being so aggressive that your actions alienate others. When you’re assertive, people are more likely to listen to what you have to say because they respect your confidence in yourself. But being too aggressive can create hostility, which can shake people’s confidence in you.


According to Kurt Guntheroth, software developers can gain more power in their companies doing the following:

  • Working for the same company for many years gives you more power. The opinions of recent hires are always distrusted, no matter how smart they are. Sorry, that’s not a thing you can do now.
  • Being among the first developers at a startup gives you a lot of power, because you know how everything works. The guys who come in a year later will never know as much as you do.
  • Becoming good friends and golf/squash/kayaking buddies with the lead or manager gives you a ton of power. Not all managers are social enough to make that work, though. I’ve known several rather mediocre engineers who wielded power way beyond what they should have because they had been around awhile, and were friends with the boss.
  • A history of calling the outcomes of decisions correctly makes you slightly more powerful. The trick comes in balancing how much to advertise your correct calls. Too much and you sound snooty. Not enough and being right doesn’t even matter.
  • Willingness to write spec documents gives you a lot of power over a design. It’s my experience that only about half of the excellent software developers I have ever worked with could string two grammatical sentences together.

Are you a software engineer or developer who feels like you don’t have enough power or sway in your company? If so, don’t worry – you’re not alone. The truth is, there are often a lot of people in companies who feel like they don’t have enough power.

But the good news is that there are things you can do to change that. In this blog post, we’ll give you a list of actionable tips that you can use to gain more power in your company.

1. Get involved in decision-making processes.
2. Speak up when you have an idea or perspective to share.
3. Become a mentor or coach to others.
4. Be a thought leader by writing blog posts or articles, giving talks, or teaching classes.
5. Develop relationships with people in other departments or companies.
6. Join or create employee resource groups.
7. Serve on committees or working groups.
8. Volunteer for special projects.
9. Network outside of work hours.
10. Make sure your performance review focuses on your accomplishments.

11. Become a subject matter expert.
12. Develop a strong understanding of the business.
13. Get involved in strategic decision-making.
14. Foster relationships with key stakeholders.
15. Drive change within your team.
16. Improve process and efficiency within your department.
17. Champion new initiatives and ideas.
18. Invest in yourself and your career development.
19. Be an active participant in industry thought leadership.
20. Leverage your technical expertise to improve customer experiences or solve business problems.

If you’re a software engineer or developer who feels like you could have more power in your company, we hope this blog post was helpful for you! Remember, gaining more power in your company is all about taking action and putting yourself out there, so don’t be afraid to get started today with some of the tips we listed above.

These are just a few ways that software engineers can become more powerful within their companies. By becoming a subject matter expert, developing a strong understanding of the business, and getting involved in strategic decision-making, you can position yourself as a key player in your organization. So take action today and start making your voice heard!

Which programming language produces binaries that are the most difficult to reverse engineer?


Have you ever wondered how someone might go about taking apart your favorite computer program to figure out how it works? The process is called reverse engineering, and it’s done all the time by software developers in order to learn from other programs or to find security vulnerabilities. In this blog post, we’ll discuss why some programming languages make reverse engineering more difficult than others. We’re going to take a look at why binaries that were originally written in assembly code are generally the most difficult to reverse engineer.

A high-level programming language is compiled down to machine code before becoming a native binary. Because of this, the level of difficulty in reverse engineering a binary varies depending on the original high-level programming language.

Reverse Engineering

Reverse engineering is the process of taking something apart in order to figure out how it works. In the context of software, this usually means taking a compiled binary and figuring out what high-level programming language it was written in, as well as what the program is supposed to do. This can be difficult for a number of reasons, but one of the biggest factors is the level of optimization that was applied to the code during compilation.

In order to reverse engineer a program, one must first understand how that program was created. This usually involves decompiling the program into its original source code so that it can be read and understood by humans.

Once the source code has been decompiled, a reverse engineer can begin to understand how the program works and look for ways to modify or improve it. However, decompiling a program is not always a trivial task. It can be made significantly more difficult if the program was originally written in a language that produces binaries that are difficult to reverse engineer.

Some Languages Are More Difficult to Reverse Engineer Than Others

There are many factors that can make reversing a binary more difficult, but they all stem from the way that the compiled code is organized. For example, consider two different programs written in two different languages. Both programs do the same thing: print “Hello, world!” to the screen. One program is written in C++ and one is written in Java.

When these programs are compiled, the C++ compiler produces native machine code, while the Java compiler produces bytecode for the Java Virtual Machine. Because C++ allows programmers to specify things like data types and memory layout explicitly, and resolves all of it at compile time, C++ programs tend to be smaller and more efficient than Java programs when compiled into binaries.

However, this also means that C++ binaries are more difficult to reverse engineer than Java binaries. Native compilation discards most high-level information: names, types, and source structure are boiled down to raw addresses and machine instructions. Java bytecode, by contrast, has to retain class names, method signatures, and type information so that the JVM can verify and execute it, which is why Java decompilers can often reconstruct something close to the original source. Someone who wants to reverse engineer a C++ binary must first rediscover all of that structure before they can even begin to understand what the code does.
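The difference in what survives compilation is easy to demonstrate with a bytecode-based language. As an illustrative sketch (using Python here, whose bytecode plays a role broadly similar to Java’s), the compiled code object still carries the local variable names that a native binary would have thrown away:

```python
def sample():
    # mirrors the flavor of a tiny compiled function
    x = 5
    y = 10
    return x + y

# the compiled code object still knows its local variable names;
# this is exactly the kind of metadata a decompiler feeds on
print(sample.__code__.co_varnames)  # ('x', 'y')
```

A Java decompiler benefits from the same kind of retained metadata; a stripped native C++ binary offers none of it.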


Optimization

Optimization is a process where the compiler transforms the generated code so that it runs as fast as possible, takes up less memory, or both. This is generally accomplished by reorganizing the code in ways that make it much harder for a human to read. For example, consider this simple C++ program:

int main() {
    int x = 5;
    int y = 10;
    int z = x + y;
    return z;
}
With optimizations turned on, this compiles down to assembly code that looks something like this:

main:
    mov eax, 15 ; the compiler folded x + y into the constant 15
    ret

As you can see, even this very simple program has been transformed beyond recognition: the variables x, y, and z have vanished entirely, because the compiler computed the result at compile time (a transformation called constant folding). If you were trying to reverse engineer this program, the assembly tells you what it returns but nothing about how the original source code was structured.
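Constant folding of this kind isn’t unique to C++ compilers. As a quick illustration, CPython’s own compiler folds constant expressions too, and the result can be inspected directly from Python:

```python
# CPython's peephole optimizer evaluates 5 + 10 at compile time,
# so only the folded result 15 reaches the code object's constant table
code = compile("z = 5 + 10", "<example>", "exec")
print(code.co_consts)  # the operands 5 and 10 are gone; 15 remains
```

The operands never make it into the compiled artifact, which is precisely the information loss that makes reversing harder.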
Of course, there are ways to reverse engineer programs even if they’ve been heavily optimized. However, all things being equal, it’s generally going to be more difficult to reverse engineer a binary that was originally written in assembly code, or aggressively optimized native code, than a program written in a higher-level language such as Java or Python. Languages that run on a virtual machine or interpreter ship bytecode (or even source) that preserves much of the program’s names, types, and structure, so decompilers can recover something close to the original code. Hand-written assembly never had that high-level structure to begin with, and its author is free to use tricks no compiler would ever generate. As a result, binaries that were originally written in assembly tend to be the most difficult to reverse engineer.

According to Tim Mensch, the programming languages producing the binaries that are most difficult to reverse engineer are probably anything that goes through a modern optimizing backend like gcc or LLVM.

And note that gcc is now the GNU Compiler Collection, a backronym adopted after frontends for a number of languages were added. In addition to C, there are frontends for C++, Objective-C, Objective-C++, Fortran, Ada, D, and Go, plus others that are less mature.

LLVM has even more options. The Wikipedia page lists ActionScript, Ada, C#, Common Lisp, PicoLisp, Crystal, CUDA, D, Delphi, Dylan, Forth, Fortran, Free Basic, Free Pascal, Graphical G, Halide, Haskell, Java bytecode, Julia, Kotlin, Lua, Objective-C, OpenCL, PostgreSQL’s SQL and PLpgSQL, Ruby, Rust, Scala, Swift, XC, Xojo and Zig.

I don’t even know what all of those languages are. In some cases they may include enough of a runtime to make it easier to reverse engineer the underlying code (I’m guessing the Lisp dialects and Haskell would, among others), but in general, once compiled to a target architecture with maximum optimization, all of the above would be more or less equally difficult to reverse engineer.

Languages that are more rare (like Zig) may have an advantage by virtue of doing things differently enough that existing decompilers would have trouble. But that’s only an incremental increase in difficulty.

There exist libraries that you can add to a binary to make it much more difficult to reverse engineer. Tools that prevent trivial disassembly or that make code fail if run in a debugger, for instance. If you really need to protect code that you have to distribute, then using one of those products might be appropriate.
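As a minimal sketch of the debugger-detection idea (in Python rather than a native anti-tamper library, and trivially bypassed, so purely illustrative): CPython exposes the trace function that debuggers such as pdb install, and a program can check for its presence:

```python
import sys

def debugger_attached() -> bool:
    # pdb and many IDE debuggers drive CPython through a trace function;
    # if one is installed, sys.gettrace() returns it instead of None
    return sys.gettrace() is not None

status = "debugger detected" if debugger_attached() else "running normally"
print(status)
```

Real anti-debugging products hook much lower-level mechanisms (ptrace, debug registers, timing checks), but the shape of the check is the same.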

But overall the only way to be sure that no one can reverse engineer your code (aside from nuking it from orbit, which has the negative side effect of eliminating your customer base) is to never distribute your code: Run anything proprietary on servers and only allow people with active accounts to use it.

Generally, though? 99.9% of code isn’t worth reverse engineering. If you’re not being paid by some large company doing groundbreaking research (and you’re not if you would ask this question) then no one will ever bother to reverse engineer your code. This is a really, really frequent “noob” question, though: Because it was so hard for a new developer to write an app, they fear someone will steal the code and use it in their own app. As if anyone would want to steal code written by a junior developer. 🙄

More to the point, stealing your app and distributing it illegally can generally be done without reverse engineering it at all; I guarantee that many apps on the Play Store are hacked and republished with different art without the thieves even slightly understanding how the app works. It’s only if you embed some kind of copy protection/DRM into your app that they’d even need to hack it, and if you’re not clever about how you add the DRM, hacking it won’t take much effort or any decompiling at all. If you can point a debugger at the code, you can simply walk through the assembly language and find where it does the DRM check—and disable it. I managed to figure out how to do this as a teen, on my own, pre-Internet (for research purposes, of course). I guarantee I’m not unique or even that skilled at it, but start to finish I disabled DRM in a couple hours at most.
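To make that point concrete, here is a toy sketch (hypothetical names, not a real DRM scheme) of why a naive check is so easy to defeat: the whole protection collapses into one conditional branch, and patching that single jump in the binary disables it without understanding anything else about the program:

```python
def license_is_valid() -> bool:
    # stand-in for whatever the real check would be (key file, server call)
    return False

def launch() -> str:
    # in compiled form this is a single conditional jump; flipping it
    # in a debugger or hex editor defeats the entire protection
    if not license_is_valid():
        return "unlicensed"
    return "app running"

print(launch())
```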

So generally, don’t even bother worrying about how difficult something is to reverse engineer. No one cares to see your code, and you can’t stop them from hacking the app if you add DRM. So unless you can keep your unique code on a server and charge a subscription, count on the fact that if your app gets popular, it will be stolen. People will also share subscription accounts, so you need to worry about that as well when you design your server architecture.

There are a lot of myths and misconceptions out there about binary reversing.

Myth #1: Reversing a Binary is Impossible
This is simply not true. Given enough time and effort, anyone can reverse engineer a binary. It may be difficult, but it’s certainly not impossible. The first step is to understand what the program is supposed to do. Once you have a basic understanding of the program’s functionality, you can start to reverse engineer the code. This process will help you understand how the program works and how to modify it to suit your needs.

Myth #2: You Need Special Tools to Reverse Engineer a Binary
Again, this is not true. All you really need is a text editor and a disassembler. A disassembler will take the compiled code and turn it into assembly code, which is much easier to read and understand. Once you have the assembly code, you can start to reverse engineer the program. You may find it helpful to use a debugger during this process so that you can step through the code and see what each instruction does. However, a debugger is not strictly necessary; it just makes the process easier. If you don’t have access to a debugger, you can still reverse engineer the program by tracing through the code manually.
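Python even ships a disassembler in its standard library, which makes for a convenient, if simplified, illustration of what such a tool gives you: readable instructions recovered from compiled code:

```python
import dis

def greet():
    return "Hello, world!"

# dis turns the compiled code object back into readable instructions
dis.dis(greet)

# the string literal is still sitting in the constant table
print(greet.__code__.co_consts)
```

Native-code disassemblers like objdump or IDA do the analogous job for machine code, though with far less metadata to work from.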

Myth #3: Only Certain Types of Programs Can Be Reversed Engineered
This myth is half true. Open-source programs rarely need to be reverse engineered at all, because the source code is freely available to read; it’s closed-source programs, where all you have is the compiled binary, that pose the real challenge. However, with enough time and effort, you can reverse engineer any type of program. The key is to understand the program’s functionality and then start breaking down the code into smaller pieces that you can understand. Once you have a good understanding of how the program works, you can start to figure out ways to modify it to suit your needs.

In conclusion,

We can see that binaries written in assembly code, or heavily optimized during compilation, are generally more difficult to reverse engineer than those compiled from high-level languages. Optimization can make the generated code very difficult for humans to follow, and hand-written assembly offers no high-level structure to recover in the first place. However, with enough effort and expertise, it is still possible to reverse engineer any given binary.

So, which programming language produces binaries that are the most difficult to reverse engineer?

There is no definitive answer, as it depends on many factors, including the specific features of the language and the way individual programmers use them. However, languages like C++ that compile to native code with explicit control over data types and memory layout tend to produce binaries that are more difficult to reverse engineer than bytecode-based languages like Java.

