Food For Thought – Delicious Homemade Cuisine From All over the World


Who doesn't remember their favourite food from home when they were growing up? That delicious taste stays with us forever. We can move all over the world, but the thought of our favourite home-cooked meals always makes us happy.

In this blog, we are going to post pictures, recipes, videos, and stories about home-cooked meals from all over the world.

1- Charcoal-Barbecued Vegi Skewer: Canada

Vegi Skewer Canada

2- Ginger-Glazed Vegi Mix + Baked Salmon + Rice: Canada



Facebook, Instagram, Apple and Google Apps Search Ads Secrets – Make Money From Your Products

Pay Per Click – Google, Facebook, Instagram, Twitter

A bit about search ads first.

There are billions of apps and products out there, and it is becoming harder and harder to stand out. You don't want to spend countless hours developing your dream app or product just to see close to zero sales per month.

This blog aggregates the best Apple and Google app search ads secrets from successful app developers.


This blog also includes tips and tricks for successful Google Search Ads, Facebook Search Ads and Instagram Search Ads for any product.


Apple Search Ads uses a Cost-Per-Tap (CPT) model, meaning advertisers pay Apple every time someone "taps" on a Search Ads listing after performing a keyword search. On other traditional mobile ad networks, such as Google UAC or Facebook Ads, the advertiser usually pays per app install (the Cost-Per-Install model, or CPI) after a user sees or interacts with an ad.

Apple offers two types of search ads – Basic and Advanced. Which one should you choose?

I guess it depends on the type of app and the installs you want. Basic is CPI-based, while Advanced is CPT-based. This might make you think Basic is better because you only pay when you get an install, BUT that's not the best way of looking at it. Basic has a much higher cost per install (CPI) than the cost per tap (CPT) you get with Advanced. So unless your user buys an IAP or a paid app that makes you more money than the CPI you paid to acquire them, you might lose money.

Also, Advanced lets you focus on specific keywords, whereas Basic mostly relies on Apple's own hidden algorithm to decide where your ads show. Focusing on specific keywords is important because you don't just want users to download the app – you want them to open and use it too. Since we don't know how Apple will show your ad with Basic, you have no clue whether your app is being targeted well.

So you may or may not be paying more per install with Basic vs Advanced, as Advanced can get you a lot more impressions of the ad (and more downloads if your metadata is on point).
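
To make the economics concrete, here is a minimal back-of-the-envelope sketch in Python. All the dollar figures and the conversion rate are hypothetical placeholders, not Apple's actual rates:

    # Back-of-the-envelope: does an acquired user pay for themselves?
    # All numbers below are hypothetical examples, not real Apple rates.
    def profit_per_user(revenue_per_user, cost_per_install):
        """Net profit (or loss) on each user acquired through ads."""
        return revenue_per_user - cost_per_install

    basic_cpi = 3.00            # assumed CPI on a Basic campaign
    advanced_cpt = 0.20         # assumed average CPT on an Advanced campaign
    tap_to_install_rate = 0.30  # assumed share of taps that become installs
    advanced_effective_cpi = advanced_cpt / tap_to_install_rate  # ~$0.67/install

    revenue_per_user = 1.50     # assumed average IAP/ad revenue per acquired user

    print(profit_per_user(revenue_per_user, basic_cpi))              # -1.50: loss
    print(profit_per_user(revenue_per_user, advanced_effective_cpi)) # ~0.83: profit

The point is simply that the effective cost per install – not the headline CPT or CPI – is what has to stay below your revenue per user.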

Apple Search Ads is an intent-based channel

This is important in the post-IDFA era, because Apple looks at the context of a particular search to target ads based on keywords. By its very nature, ASA does not rely on device IDs to target individuals, which gives its attribution model an advantage over channels that depend on IDs for individual behavioural targeting.



With Apple Search Ads, you can tap into user intent signals that match your offerings and attract higher-quality users. That’s why Apple claims such impressive performance numbers, such as 50 percent average conversion rates and 65 percent download rates.



I personally would never run Basic for a free app (even one with an IAP), as the CPI is very high; unless I had a high conversion rate on the IAP, I would be losing money. For a paid app, it might work well though.

I have mostly tested Advanced. I did run Basic, but the CPI was way too high, so I stopped it. For Advanced, I would advise:

Start small but not too small – don't set a daily budget under $5 or over $20. Start with, let's say, $10, keep it there for 1-2 weeks, and see how it performs. Adjust the keywords in the search ad, and adjust your screenshots, icon, and other metadata to make the listing more attractive if you notice people are tapping on the ad but not hitting the download button.

Before running search ads, make sure you have your freemium app monetization and DAU (daily active users) absolutely nailed down. If you only have banner ads in the app and no way for the user to buy an in-app purchase, don't bother with search ads yet – your cost per acquisition will be too high relative to revenue. For example, if your CPA is $2 in an extremely competitive app category, then every $2 you spend acquires one new user – or is wasted on a user who taps the ad but doesn't hit download – and you may never make that money back from the ads inside your app. Banner ads aren't even worth it, in my opinion, unless you have thousands of active users; they hardly make a few pennies per 1,000 impressions. Interstitial ads are better and make more money, and rewarded ads are even better. But still, you need to look at the numbers to see whether you are at least breaking even.

Apple and Google each give you $100 in free credit to try it out, so use that to test things, look at the numbers, and make changes.

Set the search ad settings correctly. There is an option for audience targeting – who you would like to see your ad – with options such as "People who already have your app" and "People who don't have your app". Of course you don't want to select the first option: those people already have your app, and you want to acquire new users. You can also choose the age of the audience. For example, if you have an app meant for people who own houses, you don't want to target people under 25 or even 30 years old, because most of them won't own houses.

If you are getting taps (you spend money per tap) but not conversions (downloads), that means people are finding something on your App Store page that they don't like. This could be bad or missing reviews, bad screenshots, bad metadata, etc. So get honest opinions from people other than your friends about what they think of your App Store page.

Search ads for paid apps or apps with in-app purchases are different from search ads for free apps. Make sure your paid app or IAP is priced so that you at least break even – and preferably make a profit – on every customer acquisition. For example, if your cost per acquisition is $5 (it can be that high for paid apps, because many people will tap an ad and then decide not to download the app, maybe because of the pricing or some other metadata) and you have priced your app at $2.99, you are just burning money. Be intelligent.

Using the names of other apps in your category as keywords might work for you. But I wouldn't suggest bidding on trademarked app names, or on popular apps that have nothing to do with your app category. That can get you called out for IP/copyright/trademark violations. It also won't convert well: when people search for a specific app (let's say Facebook) and your calculator app shows up in the ad, nobody is going to tap it, because the user obviously only wants to download Facebook.

I personally don't like running ads in developing countries: AdMob pays very little there, people don't buy IAPs much, and people don't buy paid apps much.

Don’t bid for keywords which have high competition OR very high CPT. Companies with deep pockets will kill you.

I am not a fan of the option “Search Match” (Automatically match my ad to relevant searches) which Apple gives you. I always disable that option.

Search ads are good if you can afford it and if you have an app which fits the profile. It may or may not work for every app. Always look at numbers.

I’m guessing search ads are the ads you see in the App Store when you are searching for specific apps?


Yes, search ads appear in App Store search. If someone searches for a keyword you have targeted and you win the bidding battle for that ad space against the other bidders, your app's ad gets shown.

Is there an average price per click that you pay?

Yes, Apple Search Ads are CPT-based – cost per tap. If someone taps your ad, you pay what it took to win the bid against other advertisers. For example, if you bid on the keyword "car" with a maximum CPT of $0.20, and Bob, another app developer running ads, has set his "car" keyword at a CPT of $0.10, you will pay $0.11 – because that's what it took to win. Of course there are more factors – the level of competition for the keyword, higher CPT bids from others, etc. – which can drive your average CPT higher. That's why you get to set the maximum you are willing to pay per keyword.
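
The mechanics work like a second-price auction: the winner pays just enough to beat the runner-up, capped by their own maximum. A minimal sketch (the $0.01 increment is an assumption taken from the example above, not documented Apple behaviour):

    def winning_price(my_max_cpt, best_competing_bid, increment=0.01):
        """Price the winner pays: one increment above the runner-up's bid,
        never more than the winner's own max CPT."""
        if my_max_cpt <= best_competing_bid:
            return None  # outbid: the competitor wins the ad space
        return min(my_max_cpt, best_competing_bid + increment)

    print(winning_price(0.20, 0.10))  # 0.11 -- matches the "car" example above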

How many people searching for apps see my game as an ad and click on it per day for $10?

There is no general range of how many people might. You can use the maximum CPT to control the amount you spend per tap, and you can also set an optional CPA (cost per acquisition) goal to ensure you don't run at a loss. However, the first 2 weeks should usually be experimental: test things out with low budgets.

A very important thing to remember: you pay per tap, NOT per download. So if someone taps your ad, notices your screenshots look like crap, and doesn't download your app, you just lost money. This is why you need the metadata to be perfect, and why you should use the CPA field after 2 weeks to make sure you don't run at a loss.

Along with that, do you only pay for clicks? Do you pay more if they download your app after the click?

Yes, you pay per click (tap, to be technically correct). You don't pay more if they download.

I'm assuming you are constantly tracking how many active users you have and how much revenue you are generally getting, so you can ball-park any change in these numbers based on your ads being displayed.

Yes, I always monitor my ad spend and compare it to how many downloads I got (if this is for a paid app) or how many people bought the IAP, and how much revenue I am making per day via AdMob. I do this every morning. Unfortunately, Apple doesn't seem to let me track how many of those ad conversions converted into buying the in-app purchase, so this throws me off a bit.

So, your CPA. Is this your cost for running the ads per download?

Regarding CPA: they let you set an optional CPA goal when running your ad campaign. Determining it is a bit of work. When I am starting out, I don't have any numbers to look at, so I leave the CPA blank or set it to the same price as my IAP or paid app – basically, I don't want the cost per acquisition to exceed that price, because that would mean I am burning money and running at a loss instead of a profit. After running the campaign for 1-2 weeks and looking at the numbers for each day, I can estimate a better CPA and set it; just don't set it too low at first, or you won't get any impressions. I walk through a full worked example further down.

So if I spend $10 in one day and 5 people download the app, that would be a $2 CPA? Yes.

And I will repeat my previous statement: I always monitor my ad spend and compare it to how many downloads I got (if this is for a paid app) or how many people bought the IAP, and how much revenue I am making per day via AdMob. I compare and set the CPA based on these. I do this every morning. Unfortunately, Apple doesn't seem to let me track how many of those ad conversions converted into buying the in-app purchase, so this throws me off a bit.

Have you been able to verify your numbers and whether or not you are profiting based off these ads? Why not bump your ad spending even higher?

I have made money from certain types of apps and lost money by doing stupid stuff, such as:

  • running ad campaigns for a free, ad-supported app without an IAP to remove the ads;
  • running ad campaigns for apps with only poverty banner ads and no full-screen/interstitial/rewarded video ads, which at least make some money;
  • running ad campaigns on generic, very-high-competition keywords and getting out-bid by much bigger players with much deeper pockets;
  • running ads where my CPA was higher than the money I was making off the IAP or paid app;
  • running ad campaigns on a keyword for an app that was not even in my category, so users tapped my ad (costing me money) and then did not download;
  • running a campaign on a keyword that was trademarked.

Basically, be intelligent, research, start slow and experiment with the $100 credit Apple gives you.

A few people asked me about rewarded ads vs interstitial ads for monetization. This is a bit off topic but I will throw this in.

Rewarded ads have a higher eCPM than regular interstitial ads, meaning you get paid more. How much higher depends on the type of app, number of users, placement of the ads, etc. I use AdMob's rewarded ads mostly to unlock features or a number of XXX item uses in the app. Other companies offer them too. You can read a few points here, for example:

source: reddit

Rewarded Video Ads | ironSource – Rewarded video ads are a great mobile video advertising strategy to increase ad revenue & improve user experience. Learn how to monetize with video rewards.

The high eCPM is good. What's even better compared to regular interstitials is that rewarded ads provide a better user experience and attract fewer negative reviews, because the user willingly chooses to watch an ad instead of having their game randomly interrupted. In return, the user gets some kind of in-app reward – more coins, an unlocked feature, etc. So it's a win-win for the developer and the user.

How do you determine your CPA for an app with IAPs? (Like does iTunes Connect tell you this information?)

They let you set an optional CPA goal when running your ad campaign. Determining it is a bit of work. Like when I am starting out, I don’t have any numbers to look at, so I leave the CPA blank or set it as the same price as my IAP or paid app price. Basically I don’t want the cost per acquisition to exceed the IAP or paid app price because that would mean I am burning money and running at a loss instead of profit.

However, after running the campaign for 1-2 weeks and looking at the numbers for each day, I can estimate a better CPA; and if there is a number I definitely don't want to exceed (because going past it would mean losing money instead of breaking even or profiting), I will set it.

You don’t want to set the CPA too low – at least initially because then you won’t even get any impressions of your ads.

For example:

Looking at one of my ad campaigns right now, I have a default CPT of $0.10 (cost per tap – you pay every time someone taps your ad, whether or not they download). They also let you set a CPT per keyword, which overrides the default. NOTE that the CPT is the maximum amount you are willing to pay for a tap. This means that if you are in a battle with someone else who wants the same ad space, you can win it if your CPT is even a cent higher – and you only pay whatever it takes to win, not the maximum you set. So your average CPT will often be less than what you set, which is good.

So for this campaign, my default CPT is $0.10 and I have a few keywords with custom CPT of $0.20.

After looking at my numbers for the past few weeks, I see that most of my keywords have an average CPT of $0.15, $0.16, or $0.19, and an average CPA of $0.15, $0.33, or $0.29.

So if I want, after testing for a couple of weeks, I can cap the CPA at $0.50 so that I never run at a loss.
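
As a sanity check on numbers like these, here is a minimal sketch that flags keywords whose average CPA exceeds what an acquired user is worth. The revenue figure and keyword names are hypothetical placeholders:

    revenue_per_user = 0.99  # hypothetical: your IAP price or paid app price

    avg_cpa_by_keyword = {   # the average CPAs from the campaign stats above
        "keyword_a": 0.15,
        "keyword_b": 0.33,
        "keyword_c": 0.29,
    }

    for keyword, avg_cpa in avg_cpa_by_keyword.items():
        verdict = "profitable" if avg_cpa < revenue_per_user else "losing money"
        print(f"{keyword}: CPA ${avg_cpa:.2f} -> {verdict}")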

So essentially with $2,000 it's possible to have 10,000+ people click on your ad? That seems like a solid conversion rate if at least 1/10th of them download the app.

Depending on the type of app, your CPT can vary. For me, most have been around 20 cents, so yes, 10,000 taps from $2,000 is a good estimate. However, these are taps, not downloads. For downloads, you need to make sure your metadata is on point! You also need monetization in place – IAPs, paid apps, etc. – to make sure you are actually making money off the users you are spending money to acquire.

How long did it take for you to start seeing impressions? We have pretty competitive keywords, so I'm using an extremely high CPT – $10+ – and I'm still not seeing any impressions. It's been 24 hours.

If you haven't set up scheduled ads, it should be quick – I had mine within an hour, if I remember right. I would suggest trying less competitive keywords, though.

What’s your experience and tips for driving iOS game app downloads via paid ads platforms like Facebook Ads, Apple Search Ads, Youtube ads, etc…?

No experience, but as an iPhone user I often find myself downloading apps while browsing Instagram. So I'd assume you'd be spot on with Instagram/Snapchat/TikTok or maybe even YouTube Shorts.

App Store search ads keyword match types

Search Ads involve three different types of keyword matches.

Match types are how you tell Apple whether you want to bid on keywords exactly as you enter them or more broadly. The choice is driven by your campaign goals and will ultimately shape campaign results, so you must first understand the different keyword match types Apple offers.

Broad Match

Broad match is the default keyword match type. By selecting broad match, you are telling Apple that you want to bid on the keywords you select and other keywords that are broadly related to them.

Broad match includes misspellings, plurals, closely related words, synonyms, related searches, related phrases, and translations.


For example, when you type “Friends,” Apple also considers variations of “Friend,” “Amigo,” “Freind,” and more.

Exact match

Exact match helps you narrow your ad bid spread. By choosing exact match, you’re telling Apple that you want to bid exactly as entered for the selected keyword.

Common misspellings and plural forms will also be taken into account.

For example, when you type "friend," Apple will also consider "friends."

Search Match

Search Match is best suited for keyword discovery. By selecting Search Match, you allow Apple to use your app's metadata to automatically match your app to relevant keywords and search terms.

For Search Match to work, your app’s metadata needs to be up to date and optimized. This means that App Store optimizations have been completed and recently updated. In this way, Apple can easily pull information about your app and generate the best and most relevant keywords.

App Store Search campaign types

When creating an account to start keyword bidding, ASA best practice is to split your keywords into four different campaign types: Generic, Branded, Competitor, and Discovery.

Generic Campaigns

Typically set to broad match, generic campaigns use keywords that are relevant to your app. For example, if you have a fitness app, you should include keywords such as "fitness" or "exercise" in this campaign. The purpose of the generic campaign is to attract high-intent App Store visitors.

Branded campaigns

You will want to use a branded campaign to reach a more specific audience searching for your brand in the App Store, drive reinstalls, and protect your brand. Your keywords in this campaign will be keywords related to your brand name or variations of it. By bidding generously on your branded keywords, you ensure that your competitors don't take this valuable space away from you.

Competitor campaigns

Set up exact-match competitor campaigns to target App Store users who are searching for competitors. Keywords for these campaigns include your direct competitors' names or variations of them.

Discovery campaigns

You need to set up a discovery campaign to discover new keywords or find alternative keywords that you are not using in other campaigns.

To maximize the effectiveness of a Discovery campaign, new keywords from Discovery should be added as exact match keywords to the other three campaign types, and all keywords from branded, generic, and competitor campaigns should be added as negative keywords in Discovery.

Best practices for using Apple Search Ads

Getting started with Apple Search Ads isn’t a problem. But you need to make sure you adopt some best practices that will ultimately help you make the most of your investment. Here are some App Store advertising best practices you should follow when using Apple Search Ads.

Review app metadata before launching a campaign

Before launching a new campaign, you’ll want to visit App Store Connect and take a closer look at app metadata. The appearance of your ads will be based on your app’s metadata, and you won’t be able to change it later. Keep in mind that the same ad is unlikely to be shown to every user. Some people may get a simple description of the app, while others will see screenshots and preview videos.

USP-based targeted keywords

This is very important for marketers using ASA Advanced. You need to do some research and identify keywords that will increase installs. For example, if you have a fitness tracking app, use terms like "fitness tracker" or "diet plan" as keywords. You must understand the search patterns of your audience, because doing so can greatly improve your conversion rate.

You can always expect higher competition with general keywords, but if you can find more specific keywords, they will not only be cheaper to bid on, but will also have a higher conversion rate.

Tip: Use the keyword research in your ASO strategy to understand your options and sync your goals!

Use the 80/20 budget allocation method for App Store promotions

When comparing keywords, you must split your keywords between broad match and exact match. 80% of your spend should go to exact match and the remaining 20% should go to broad match. Both will be used primarily for discovery campaigns to identify keywords that perform better than others.

Exact match keywords will allow you to attract and convert interested users. They will be easier to convert and more likely to generate more revenue. They may cost more, but they will also pay off. Ideally, you should allocate an 80/20 budget to get the maximum return. Once you start generating interest, you can also reduce your budget allocation.

How can you leverage ASO and ASA together for your app business on the iOS App Store?

The great thing about Apple Search Ads is that you can use the search match feature to identify new keywords. When Search Match is enabled, your ads are automatically matched to new search terms based on metadata in your App Store listings, information about similar apps of the same type, and other available search data.

The ability to check keyword relevancy is an invaluable part of Apple Search Ads. In just a few hours, you can run a small test campaign to collect data and get a complete picture of which keywords to optimize for in your ASO efforts. By analyzing Tap Through Rate (similar to Click Through Rate on the web), in-store conversion rates, and actual downloads, you can begin to develop a more effective ASO strategy. In addition, you can use attribution tools to explore the LTV of each keyword for campaign analysis.

ASA can help you narrow down your ASO strategy, but it’s not a gold mine; ASO is a long-term strategy, and your goal should be to keep increasing natural downloads. A key learning point is to look at ASA data from a longer-term perspective so you can see the true trends and performance of each keyword.

Apple Search Ads only work if you know how to properly target your keywords. To ensure maximum app visibility and download rates, you need to target specific and general keywords and carefully determine how much you are willing to bid for each keyword. An easy way to find keywords is to use a tool that automatically compiles a list of targeted keywords. You should increase your bids until you reach your cost-per-acquisition target and start winning downloads from popular keywords related to your niche.

Unfortunately, simply outbidding your competitors for high-volume keywords isn’t enough to win the number one spot, because Apple also considers the relevance of your app to the keyword. To ensure you always rank #1, you need to combine winning bids with ASO optimization. Factors that affect your ASO include app name, URL, description, reviews, and ratings.

Source: How to Leverage ASA to Boost Your App Visibility?

So, how should you optimize your Search Ads campaigns for profitability?

1. Cost-Per-Acquisition (CPA) Goal:

The first thing you need to determine is how much you can afford to spend for every Search Ads install, so how much your target CPI (Cost-Per-Install) or Cost-Per-Acquisition (CPA) Goal — as Apple names it — should be. Note the difference in naming here: unlike other networks, Apple uses the word “Acquisition” and not “Install” because they actually only measure when users hit download and not when they have actually fully installed the game (we will hear more on that important difference later in this article).

To do this, if you are already running campaigns on other networks, you know your customer LTV (lifetime value) – how much every user will spend on average in your game.

Let’s say your game net LTV is $6 for iOS users in the United States.

On Apple Search Ads, you can either set your bids based on a Max CPT (Cost-Per-Tap) you are willing to pay, or choose a CPA Goal, which means Apple will try to display your ads automatically and maximize conversions. We don't recommend the CPA Goal option because, while it ensures you don't go above your target CPA, it limits your impressions quite a lot, so you will miss out on several opportunities to convert.

So, for Max CPT, we usually apply a 30% ratio of the LTV of the game we’re promoting, because we normally observe an average 30% conversion rate (from taps to installs) on Search Ads.

In that case, we would be using:

Max CPT Bid = $6 x 30% = $1.80

Source: Medium
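
A minimal sketch of that rule of thumb, using the article's example figures ($6 net LTV, 30% tap-to-install conversion); these are illustrative, not universal constants:

    def max_cpt_bid(net_ltv, tap_to_install_rate=0.30):
        """Rule of thumb from above: a tap is worth the user's LTV
        scaled by the expected tap-to-install conversion rate."""
        return net_ltv * tap_to_install_rate

    print(max_cpt_bid(6.00))  # 1.8 -> bid at most $1.80 per tap for a $6 LTV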

2. Measuring your ROAS:

Now comes the most important part: What’s the revenue generated from your Search Ads campaigns?

Apple doesn't track (or share) any detailed activity coming from the Search Ads installs they have provided you. So you will have to use your MMP (mobile measurement partner) for that.

Depending on the LTV curve of your game, you’d be looking at your Day 7, 15, 30 etc. ROAS (Return on Ad Spend) on a campaign, ad group or keyword level.

Cohort Reports for Search Ads Campaigns in Adjust

Let’s say you use Day-7 as a goal, you will then be doing this calculation:

Day-7 ROAS = Day-7 MMP Revenue / Search Ads Spend

Then compare that to your Day-7 ROAS goal. If it's above the goal, that's a good sign: keep your campaigns/ad groups active, but monitor the retention of these users in the long run to validate their performance.

If it’s below your goal, let’s say by more than 25%, then you should consider pausing or reducing the spend on these ad groups or campaigns.

That’s the formal way of assigning and reporting revenue coming from Search Ads.

But you also have to take into consideration the installs that are not seen by your MMP (for example, users with Limit Ad Tracking enabled), which may have generated revenue too. A common adjustment is:

ROAS = ((Revenue) * (1 + LAT Rate x 50%)) / Search Ads Spend
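
Here is a minimal sketch of both calculations. The inputs are hypothetical, and the 50% factor mirrors the adjustment above:

    def day7_roas(day7_mmp_revenue, spend):
        """Formal Day-7 ROAS from MMP-attributed revenue."""
        return day7_mmp_revenue / spend

    def lat_adjusted_roas(mmp_revenue, spend, lat_rate):
        """ROAS adjusted for installs the MMP cannot see, assuming roughly
        half of the LAT installs monetize like attributed ones."""
        return (mmp_revenue * (1 + lat_rate * 0.50)) / spend

    print(day7_roas(300.0, 1000.0))                # 0.30 -> 30% Day-7 ROAS
    print(lat_adjusted_roas(300.0, 1000.0, 0.20))  # 0.33 with a 20% LAT rate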

3. Bid Optimization:

Once you have launched your campaigns, give it a few days and then look at the performance of the ad groups you have created.

The first thing you need to check is whether the keywords you have selected convert to installs. If there are ad groups with a Conversion Rate below 20%-25%, it means the keywords you have chosen are either too broad or not relevant. You should then consider pausing or reducing the bid on these ad groups.

On the contrary, for ad groups and keywords with a high Conversion Rate – for example, anything above 30% – you should increase your bid for as long as it's aligned with your projected ROAS. To know how much is necessary, the Search Ads interface suggests a bid range as an indication of how much you should spend to match or beat your competitors. You should adjust your bids for every keyword that is below the suggested bid range (as long as it stays within your target CPA goal).
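
A minimal sketch of that decision rule. The conversion-rate thresholds come from the paragraphs above; the 10% raise step is an assumption for illustration, not Apple guidance:

    def bid_action(conv_rate, bid, suggested_low, cpa, target_cpa):
        """Crude ad-group triage following the thresholds above."""
        if conv_rate < 0.20:
            return "pause or reduce bid"  # keywords too broad or irrelevant
        if conv_rate > 0.30 and cpa <= target_cpa:
            new_bid = min(bid * 1.10, suggested_low)  # move toward suggested range
            return f"raise bid to ${new_bid:.2f}"
        return "keep as is"

    print(bid_action(0.35, bid=0.50, suggested_low=0.80, cpa=1.20, target_cpa=2.00))
    # raise bid to $0.55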

Many factors affect how your Apple Search Ads Basic app promotions perform, including relevancy, your maximum cost-per-install (max CPI) amount compared to your competitors, and user response to your ad. The following best practices can help improve your app promotion results.

  • Review your metadata in App Store Connect to ensure it’s the best representation of your app. Your app title, descriptions, and keywords are all considerations Apple Search Ads uses to assess your app’s relevance for specific search queries, so you should take great care in crafting them. Apple Search Ads Basic also uses the app name, subtitle, description, preview videos, and screenshots approved for your App Store product page to create your ad. Take the time to review your app metadata in App Store Connect before you start using Apple Search Ads Basic.

    Review App Store metadata best practices

    Note that if you change your App Store metadata, it can take up to 24 hours to be reflected in the ad preview within your account, and up to two hours to be reflected in your ad on the App Store.
  • Take a look at your ad creative. It can play a key role in your app promotion performance. Because Apple Search Ads uses the app name, subtitle, description, preview videos, and up to the first three screenshots approved for your App Store product page to create your ad, you may want to consider adjusting these assets if your ad isn’t performing well.
  • Consider your product page, too, as it can also help drive installs. With three app previews, 10 screenshots, and new text fields, product pages offer more opportunities to showcase your work.
  • If your ad isn’t delivering results, try raising your max CPI to increase the likelihood of your ad being shown. You can use the suggested max CPI in your dashboard as a guide to help determine the right amount.
  • Consider running your app promotion in all the countries and regions where your app is available. This will give you more opportunities to reach interested customers. Check your monthly budget to make sure you’re reaching as many customers as possible. You may need to increase your budget, especially if you’re running app promotions in multiple countries and regions.
  • Make sure you’re using the right business model. The right business model for your app balances your goals with the expectations of key audiences, and can also affect the performance of your app in App Store search, including with Apple Search Ads. If you’ve tried the above and still aren’t seeing results, it’s a good idea to review App Store best practices. Learn more here…

Google Search Ads Optimization Techniques

Tips for Scaling a performing Google Search Campaign

Don't dedicate an entire campaign to your top-performing keywords.

How long did you test "simply raising the budget" for? Are we talking about a week, a month, or multiple months?

Here are some other options for you:

  • Review your Impression Share and top-of-page rate metrics (Impr. (Top) % and Impr. (Abs. Top) %). Are these trending in the right direction? Are you losing out due to budget on high-performing campaigns? How do your ads perform when they place above organic search results vs. below (aka "Other")?
  • Look at 30-, 60-, and 90-day windows for things like audiences, demographics, and locations. Are there options here that are high-spending but underperforming, and could be excluded? This would allow, moving forward, all of the budget to be spent on better-performing targeting options.
  • Consider testing new ad copy. If you can achieve stronger CTR, this allows you to generate traffic within the existing impression volume.
  • My preferred setup is to group keywords by a shared intent. I have B2B SaaS clients, so the majority of my campaigns are focused on very high-intent searches that contain both context (around my clients' services/solutions/vertical) and intent (keywords matching search terms including "software", "platform", "solutions", etc.). To scale traffic, I've created a separate campaign that bids on keywords containing just the contextual terms, without the software intent, with lower (manual) bids, using negative keywords to appropriately filter traffic. Consider splitting out your campaigns/ad groups by high-intent vs low-intent keywords, with budget given to the higher performers.
  • Example: Let’s say your client offers a software for enterprise businesses to manage their cybersecurity. A high-intent keyword would be something like “enterprise cybersecurity software”, whereas a low-intent keyword would be just “enterprise cybersecurity”. We still require the user to use “enterprise cybersecurity” in some context, but that short-tail keyword does not require any specific intent like looking for a third-party tool/platform.

The keyword “enterprise cybersecurity software” will likely be significantly more expensive, and likely lower search volume/impressions, but has a clear, higher intent. The shorter-tail keyword will get you a larger number of impressions, but has a higher likelihood of leading to potentially lower-quality searches and clicks. I’d recommend starting out with trying to capture the high-intent searches first, but when you’re looking to scale, that’s where I’d add in the low-intent keywords, but separated into their own campaign, or at least a separate ad group.

On average, you may spend a good amount of money on Google Ads and still not get results worth the money. Spending money without the proper knowledge is a waste! And spending money with no results hurts, right? Don't worry – we will tell you how to get value for your money. We will discuss tips and tricks to improve your Google Ads conversion rates.

Follow the ways below to improve your Google Ads Conversion Rates:

• Lead With an Attractive Offer or Value

The cover is a book's first impression. You might have heard "don't judge a book by its cover" – well, that's exactly what we all do. We take a look at the cover, and if it doesn't please our eyes, we move on to the next one.

Similarly, the headline is the first impression of your content. If it doesn't please your visitors' eyes, they won't act on it. Hence, use some catchy phrases to create an attractive headline to lead your content.

• Refine your CTAs

You need to tell your visitors what to do, otherwise they won't act! Yes, that's true! It's you who has to direct your website's visitors to take an action by generating a need for it.

Studies show that the CTAs most used by top-notch brands are "get", "buy", and "shop". Phrases like these create an urge to take action, and that's what improves your conversion rate.

• Boost your CTRs

Create content copy that can convince a reader to click through to your product. Write blogs or ad copy that can convince your visitors to click. And for this, understand your audience: convince them that they are missing something big and that your product can fill that gap.

Don't try to rush them into buying your product. Remember, at this step you just have to convince them to walk through your content, not to buy your product. Use soft phrases like "get a quote", "get more details", etc.

• Align your Ad with an Accurate Landing Page

A common mistake is not checking up on our landing page: whether we aligned the ad with the right landing page, and whether the ad redirects to the correct landing page at all. If you don't get this right, you can lose a large audience.

For example, your ad is about American diamond earrings, but it points to a bangles landing page. This defeats the purpose of your ad, and you will lose your potential customer right there.

Create a landing page for every segment and align them with the Ad properly.

• Work on your Quality Score

When you create or run a Google Ad, it gets a rating called Quality Score. This score is based on the performance of your ad: how much it impacts the audience, how it performs in the market, how effective it is, and what value it gives.

All these factors decide your Ad’s quality score.

According to studies, the higher the Quality Score, the lower your overall click costs. Quality Score can be improved through three factors: the landing page, the CTR, and ad relevance.

• Don’t Miss out on your Social Proofs

People trust reviews. They are afraid of being the first to use or buy anything; they look for the assurance and experience of others to rely on. Hence, putting out your social proof is very important. Include the brands or firms you have worked with and their reviews – that will make you look authentic and preferred, and it will attract and convince visitors to become loyal customers.

• Step-On your Competitors

Sometimes, not getting enough conversions via Google can be a targeting issue. And to sort that, you should focus on the audience’s intent. Like, what they are looking to buy, what is their need, etc. And, a clear way of doing this is branded keyword search.

Branded keyword search is when a person looks for something brand specific.

For example: “dresses on Myntra”, “Sports shoes on Reebok”, etc.

When a person searches the above keywords, they will not only get results for those brands but also ads for alternatives. That's what stepping on your competitors is: running your ads on the branded keywords of competing brands. I know it sounds illegal, but it isn't!

• Enhance your Landing Page

Optimizing ads alone is not enough! You need to work on everything else too, and one of the major things is the landing page. Once visitors are directed to your landing page, your task is to fulfill what they are expecting from you. Your landing page should have all the needed information in an organized manner. Don't overload it; keep it on point.

Put up product videos or video testimonials of the product or service – they tend to have a greater chance of hooking your visitors, and videos can improve conversion rates.

• Run Mobile-Friendly Ads

With the world going mobile, it's important that you run mobile-friendly ads. Keep the dimensions of your posters or ad copies such that they fit a mobile screen efficiently, and make them easy for visitors to access. Desktop-only ads will not look good on a mobile screen, and you might lose a large set of your audience, as most people access things through their phones.

Hence, move with the trend.

• Use Remarketing

We often forget how important remarketing is! Many times, a customer leaves a product in the cart or wishlist and forgets about it. Remarketing can help you win back such customers. Look for older ads that performed great and run them again: they will bring back your old visitors and create new leads as well.

Google Ads can be a huge asset for converting your visitors into customers. You just need to do things right! If you implement the above tips properly, your Google Ads conversion rate will definitely go up.

If any of you bright people have more tips to add, please feel free to share your opinions and suggestions. It's always great to learn.

Read More: Conversion Rate Optimization Services

Another way to get good quality score on your ads these days is to write really awkward headlines that include the keywords, and then pinning any discounts. Kinda sucks but it’s been working better for me than traditional CTAs.

Quiz 1: Jim has created a Google Search ad with a bid of $5. Two other advertisers in the auction have bids of $2.50 and $2. How much would Jim pay for the first spot in the auction?

Answer 1: $2.51

Quiz 2: True or false? Google audiences are updated on every impression, so advertisers can reach only the most relevant consumers on YouTube.

Answer 2: True

Quiz 3: On which social network should you share content most frequently?

Answer 3: Twitter

Quiz 4: You want to find new, high-value customers using their data. Which audience solution should you use?

Answer 4: Similar Audiences

Meaning of key terms used in this blog:

Avg CPA: The average amount you’ve been charged for a conversion from your ad. Average cost per action (CPA) is calculated by dividing the total cost of conversions by the total number of conversions. 

  • For example, if your ad receives 2 conversions, one costing $2.00 and one costing $4.00, your average CPA for those conversions is $3.00.
  • Average CPA is based on your actual CPA (the actual amount you’re charged for a conversion from your ad), which might be different than your target CPA (the amount you’ve set as your desired average CPA if using Target CPA bidding).
  • Use performance targets to set an average CPA target for all campaigns in a campaign group.

Avg CPT: The average amount you've paid per tap on your ad (spend divided by taps). The max CPT bid, by contrast, is the maximum amount you're willing to pay for a tap.

Your default max CPT bid applies across all keywords in your ad group unless you specify a max CPT bid at the keyword level.

When calculating the amount of your max CPT bid:

  1. Decide what amount you can afford to spend on a new customer or action. Let’s say it’s $2.50 (U.S.).
  2. Estimate the percentage of customers who tap your ad and who you think will download your app or take your desired action. In this case, you estimate 40%.
  3. That means you can afford to pay up to 40% of $2.50 (U.S.) – or $1.00 (U.S.) – for each tap. Therefore, set your starting default maximum CPT bid to $1.00 (U.S.).
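
A minimal sketch of this three-step calculation, using the example figures above ($2.50 affordable per customer, 40% estimated tap-to-download rate):

    def starting_max_cpt(affordable_per_customer, tap_to_download_rate):
        """Steps 1-3 above: a tap is worth what you can pay per customer,
        scaled by the share of taps you expect to become downloads."""
        return affordable_per_customer * tap_to_download_rate

    print(starting_max_cpt(2.50, 0.40))  # 1.0 -> start the default max CPT at $1.00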

Avg CPM: Average cost-per-thousand-impressions (CPM) is the average amount you pay per one thousand ad impressions on the App Store.

CR: The conversion rate (CR) is the total number of installs received within a period divided by total number of taps within the same period.

Dimensions: A dimension is an element of your Apple Search Ads campaign that can be included in a custom report. For example, campaign ID or CPT bid. Dimensions appear as rows in your custom reports.

Impression Share: The share of impressions your ad(s) received from the total impressions served on the same search terms or keywords, in the same countries and regions. Impression share is displayed as a percentage range, such as 0-10%, 11-20%, and so on. This metric is only available in predefined Impression Share custom reports and on the Recommendations page.

Impressions: The number of times your ad appeared in App Store search results within the reporting time period.

Installs: The total number of conversions from new downloads and redownloads resulting from an ad within the reporting period. Apple Search Ads installs are attributed within a 30-day tap-through window. Note that total installs may not match totals of LAT Off and LAT On installs, as additional downloads may come from customers using iOS 14 or later.

LAT Off Installs: Downloads from users who are using iOS 13 or earlier and have not enabled Limit Ad Tracking (LAT) on their device.

LAT On Installs: Downloads from users who are using iOS 13 or earlier and have enabled Limit Ad Tracking (LAT) on their device.

Match Source: This identifies whether your impression was the result of Search Match or a bidded keyword.

New Downloads: These represent app downloads from new users who have never before downloaded your app.

Rank: How your app ranks in terms of impression share compared to other apps in the same countries and regions. Rank is displayed as numbers from 1 to 5 or >5, with 1 being the highest rank. This metric is only available in predefined Impression Share reports and on the Recommendations page.

Redownloads: Redownloads occur when a user downloads your app, deletes it, and downloads the same app again following a tap on an ad on the App Store, or downloads the same app on an additional device.

Search Popularity: The popularity of a keyword, based on App Store searches. Search popularity is displayed as numbers from 1 to 5, with 5 being the most popular.

Search Term: Search terms are keywords and phrases that people have used to find the particular type of app they’re looking for.

Spend: The sum of the cost of each customer tap on your ad over the period of time set for your reporting.

Taps: The number of times your ad was tapped by users within the reporting time period.

TTR: The tap-through rate (TTR) is the number of times your ad was tapped by customers divided by the total impressions your ad received.

Keywords: Keywords are relevant words or terms someone may use when searching for an app like yours on the App Store. With Apple Search Ads Advanced, you bid on keywords to trigger and include your ad within relevant App Store search results — so when an App Store customer types in a search query that uses one of your keywords, your ad could appear.

Apple Search Ads knows a lot about your app and its genre, and will provide a list of keyword recommendations to save you time when you add keywords to a search results ad group. You can also add keywords of your own, and Apple Search Ads will suggest a further set of keywords related to the ones you’ve provided. To add any of them to your ad group, simply click the plus sign next to them.
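
Tying several of these definitions together, here is a minimal sketch that derives TTR, CR, average CPT, and average CPA from raw campaign counts (the counts are hypothetical):

    impressions = 10_000  # times the ad appeared in search results
    taps = 600            # times the ad was tapped
    installs = 180        # new downloads + redownloads attributed to the ad
    spend = 90.00         # sum of the cost of each tap

    ttr = taps / impressions    # tap-through rate
    cr = installs / taps        # conversion rate
    avg_cpt = spend / taps      # average cost per tap
    avg_cpa = spend / installs  # average cost per acquisition

    print(f"TTR {ttr:.1%}, CR {cr:.1%}, avg CPT ${avg_cpt:.2f}, avg CPA ${avg_cpa:.2f}")
    # TTR 6.0%, CR 30.0%, avg CPT $0.15, avg CPA $0.50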

I’ve managed +$10M in paid media over the last 8 years. Here are a few “less mainstream” FREE tools/websites/extensions I use. Hope this helps!

1. Adveronix

Adveronix is a handy Google Sheets add-on that allows you to export data from Facebook Ads, Google Ads, or any other channel automatically into a spreadsheet daily. You can then connect this spreadsheet to Google Data Studio and have a free connector for most media channels.

2. Polymer Search

Polymer Search has been one of my latest finds and a beneficial tool for creative analysis (and a few other things). For example, I usually test new creatives on Facebook Ads using dynamic creative testing campaigns.

I can then simply export my Facebook Ads data into a spreadsheet, connect it to Polymer Search, and immediately see which creative elements are working the best and which ones aren’t. The Auto-Explainer tool uses AI to immediately sort “Above Average” and “Below Average” creatives.

There’s also a ton more this tool can do – massive potential for media buyers.

3. BuiltWith

Before taking on any new client, one of my first steps is always to look at their website.

Suppose I don't see anything like Klaviyo, Google Analytics, the Facebook Pixel, or any other marketing-related tech. In that case, it's usually a sign the client might be at too early a stage for me to help them out.

BuiltWith also helps you look into competitors and see what sorts of software they’re using.

4. Ad Creative Bank

The Ad Creative Bank is one of my top sources to find creative inspiration for new ads. It’s pretty simple: just look into the type of ads you want to create and browse through their well-organized library of great-looking ads.

5. Unicord Ads

Same as above, with the difference that you can sort by different industry/niche.

I find the ad quality slightly lower than Ad Creative Bank, but still a great library of ads to discover new brands and find inspiration for yourself!

6. One Click Extensions Manager

If you’re anything like me, your Google Chrome browser has +10 extensions cluttering your view. In short, One Click Extensions Manager allows you to organize all extensions into one single icon near your search tab, which makes everything feel a little more organized.

VidTao.com – YouTube ads searchable by ad spend over time. Perfect for modelling and competitive research.

And not forgetting:

Facebook Ad Library : Shouldn’t be overlooked.

Surferseo – has a free tier with a handful of tools
lsigraph.com – for when you have no idea what keywords to use

I've audited a dozen Facebook campaigns this month. Here are the common mistakes I'm seeing people make:

Most of these mistakes were from ad accounts that are in the early testing stage and spending under $100/day. The majority of these mistakes are related to what NOT to do during the testing stage in an ad account. I had a few people get audits that were spending higher amounts ($500/day and above) but their situation was very specific and the solution I provided was also specific so it most likely wouldn’t add much value to share that scenario.

1. Multiple interests and/or behaviors in one ad set (aka stacked audiences)

Doing this defeats the purpose of testing, because you don't know which interest is bringing in the results. There are many other reasons not to do this during testing, including that you could have a great interest stacked with a bad one, which skews the potential results. There are some instances where it might be okay to have 2 stacked interests if the audiences are very small, but what I often saw was people stacking over 10 interests and behaviors into a single ad set.

2. Using CBO (campaign budget optimization) too early

CBO is not recommended for the testing stage in Facebook ads. I've seen a couple of people do fine with CBO for testing, but it logically doesn't make sense, because you don't have much control over the budget allocation. Ad set budgets are better for testing: when you put $20/day into one ad set and $20/day into another, you know the test is even. CBO will most likely not even out that budget, even with ad set budget minimums and all of those constraints (which are sort of redundant anyway). Facebook will recommend CBO via messages inside the ads manager, but most of what Facebook says there is not based on your current situation. It doesn't know you are in a testing phase without enough data for CBO; it just sees you trying to spend a certain amount per day and recommends CBO. Facebook's ads manager isn't smart enough to say "I see you are testing headline combinations – you should switch to ad set budgets" or "I see you are trying to scale your store – you should use a CBO campaign". Use CBO once you've properly tested at least 4 audiences with ad set budget optimization.

3. Creating Lookalike audiences with low-quality data as a hail Mary

Yes, lookalike audiences are pretty neat, and when you don't have enough purchases, there are other data sources you can create them from: video views, website traffic, page engagement, etc. The problem is that you are then pretty much creating a lookalike audience based on people who DON'T buy – especially if nobody is buying your product yet. There is probably something wrong with your targeting as it is, and you need to stick to interest targeting and optimizing for purchase conversions. I've seen people run a traffic campaign, get a few hundred clicks, and make zero sales. That's because traffic campaigns bring very low-quality traffic from Facebook, and a lookalike will just find more people similar to that low-quality data. If you have a "niche product" and think you can't target it based on interests, you are not thinking outside the box enough to find interests to test (more on finding the right interests in a later section).

4. Spending too little per ad set while running multiple ad sets (I've seen budgets as low as $3/day)

For the campaigns I audited, I gave each a different recommended daily spend per ad set depending on their budget, niche, etc., so I don't want to say you should spend X amount per ad set – but $3/day is way too low. If you have a small budget, you are better off testing less and spending more per ad set. Instead of putting $3/day into 10 different ad sets to test 10 different audiences, you will get better data by spreading the same amount across 2-3 audiences.

5. Interest narrowing and exclusions

I've seen some exclusions that make sense, like excluding AliBaba and dropshipping when those were coming up in comments on the ads. But I've also seen narrowing where the target audience needed to have an interest in fashion AND apparel. Doing this is trying to out-target Facebook, which is usually not a good idea unless you've tested both audiences on their own, and unless they are different categories of interests (music taste with a hobby, industry interest with behavior targeting, etc.). At the testing stage, this will just push CPM higher than needed.

6. Trying to target high-income people

This is on par with the previous mistake, but I wanted to make this its own blurb. Just because someone has a lot of money doesn’t mean they are going to shop at your store. You aren’t going to have better luck targeting the top 10% of zip codes based on income for your $20 sunglasses. Higher income people resonate better with name brand products that have credibility behind them so you would probably need to build up credibility, stellar branding, and high-quality products before attempting to target high-income people on Facebook.

7. Targeting interests that are too obvious

Your target demographic has many layers to their personality and social media behavior. When you sell a certain product and only target the interest literally named after it, you limit yourself to interests your competition is probably targeting as well. Some of the best interests I've run ads toward on Facebook are two or three degrees of separation from the product. I've sold supplements geared toward people who engage in a certain activity, so instead of just targeting "supplement" I targeted "activity" interests. I've also targeted music interests based on certain elements of a product that wasn't music-related at all, because people who liked that product typically listened to a certain type of music.

8. Focusing on cheap link clicks instead of purchases

The amount you pay for a click does not matter if you are getting little to no sales. You want to pay more for expensive clicks from people Facebook deems likely to make a purchase (or whatever action you want them to take). I've audited campaigns where the owner ran two ad sets and concluded that "Ad Set 1" was better than "Ad Set 2" because it got clicks at half the cost – but neither got a sale, so neither was better than the other. Or the store owner says "this ad did well, it got over 1,000 clicks", but it got zero sales. Typically the campaign setup was improper anyway, so none of those clicks were going to convert either way.

9. Not testing ads/audiences long enough

One campaign I audited turned off an ad after just a few hours because Facebook was spending the money too fast. I recommend letting a test run for at least 5 days. If the ad is set up properly, you will have some good days, some bad days, and some okay days. I've often seen the best day ever come right after a very bad day. Know that a bad day is still data for Facebook, because it is learning what NOT to do.

10. Hanging on to an audience that stopped working

Audiences, ads, and campaigns can eventually stop working, regardless of how well they performed at one time. There are many reasons for this, which would be a whole post on its own, but if you’re struggling to get an audience to work, just move on and try again in the future. I audited a campaign running ads to a specific lookalike audience that was set up very oddly and wasn’t producing good results recently anyway, so I recommended turning it off and setting it up a different way that would be more likely to work. The user did not take the advice because that had been their best-performing audience many months ago. This is why you want to be diverse with your targeting: when an audience stops working, you don’t cling to it like the overly attached girlfriend meme.

11. Setting up a funnel that is filled with low quality data

Running traffic campaigns will get you a ton of traffic that is most likely never going to turn into purchases. You are more likely to get a purchase from 100 high-quality clicks than from 1,000 low-quality clicks, and traffic campaigns give you the absolute bottom-of-the-barrel traffic that Facebook has to offer. What I see people do is set up a funnel with traffic campaigns at the top and retargeting at the bottom with a campaign optimized for conversions. This makes sense in theory, but in practice you are just continuing to retarget the low-quality traffic, and it costs too much money to chase those low-quality clicks over and over when you could go straight for purchase-conversion campaign traffic. Those are the people who are more likely to purchase without needing to see the ads 5 times; there are a lot of impulse buyers within those campaigns. Run purchase-conversion campaigns even if your store has zero purchases.

12. Worrying about 4 steps ahead when they are still on step 1

“I’m spending $50/day, but what should I expect when I am scaling and spending $1,000/day?” That is going to be different for everybody, but this is one of those situations where you are trying to solve a problem that hasn’t happened yet, taking focus away from the step you are at right now and projecting it into a future scenario that may or may not happen.

13. Thinking the cost per purchase that they got on their own is what they’ll continue to see

If you are doing things incorrectly with Facebook ads, you should expect results that are not very good. It’s one thing to have a frame of mind like “I’m not getting good results on my own, but I think they could be better,” as opposed to “I’ve been running ads for two weeks with little to no experience and I’m paying too much to get a customer, so Facebook isn’t worth it.”

3 Lessons After Spending $350K Since iOS 14.5 Hit

1. Account Structure

For me, it feels as if Facebook now likes the account to be even more structured than before. I rarely use Cost Caps anymore because of how delayed sales come in, and I generally tend to have an account structure like this:

1 – TOF Scaling Campaign

2 – TOF Testing Campaign

3 – MOF/BOF Campaign (Try combining MOF/BOF in 1 Campaign if possible)

All in all, I try to consolidate my spend into as few campaigns as possible, and I still leverage Broad Targeting (No targeting at all). It has been working quite well for me on most accounts.

If you’re spending less than $500/day, I’d say lookalikes are also impacted. They are not getting as many data points as before, and therefore generally have a lower value than they used to.

If you’re in the sub-$500/day range, try big interests or just broad targeting if your lookalike audiences are struggling.

2. Retargeting

Retargeting has changed a lot for me.

Especially on lower-budget accounts, I have broadened the retargeting window: where I previously had a 14-day ATC audience, it is now 60 days. I also often combine multiple retargeting audiences, such as Add to Cart and View Content.

All in all, I try to have as few exclusions as possible, since even if you exclude, e.g., purchasers, those people still see the ads. I’ve noticed this because a lot of new TOF ads are getting comments from people who bought from the brand within the last 1-2 weeks.

So, with exclusions not being as effective, you want to prevent overlaps in retargeting audiences, which is why I consolidate.

3. Patience

Overall, tracking purchases has never been more challenging, and it feels to me as if Facebook is only tracking 40%-60% of all the purchases it drives. This is why it is now essential to look at your overall ROAS (Revenue / Ad Spend).

If your revenue increases when you scale up but your ads manager is not showing any purchases, those sales most likely come from your ads (unless you’re running a big email promotion, got featured in a big magazine, or something like that, of course).

Purchases tend to show up in bulk for me in the ads manager after a few days, so don’t freak out if you see a low ROAS on your side, as long as the revenue is there. Make fewer day-to-day changes and keep an eye on results for a longer time.
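
As a quick illustration of that overall-ROAS check, here is a minimal sketch; all the figures are hypothetical and the point is simply to compare blended ROAS against what the ads manager reports:

    # Compare blended ROAS (total store revenue / ad spend) against
    # ads-manager-reported ROAS. Numbers below are invented examples.

    store_revenue = 12_000.0    # from your store backend, e.g. Shopify
    reported_revenue = 6_500.0  # purchases attributed in ads manager
    ad_spend = 4_000.0

    print(f"Reported ROAS: {reported_revenue / ad_spend:.2f}")  # 1.62, looks weak
    print(f"Blended ROAS:  {store_revenue / ad_spend:.2f}")     # 3.00, actual picture
    print(f"Apparent tracking rate: {reported_revenue / store_revenue:.0%}")  # 54%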

Insights From Doing $150K+ a Day in Revenue on Facebook Ads

March 2022 update: For those just seeing this now, Facebook has become significantly harder, but the general strategy here still works, and that’s testing LOTS of creatives, not fancy hacks. We’ve since started spending over $10K per day on TikTok as well, and it’s doing WAY better than Facebook for us.

What’s up everyone! Just wanted to drop in and share some insights into what it takes to manage $20K-60K+ a day in spend on Facebook in DTC ecom. (I’ve done $150K-250K revenue days on Facebook; my personal best in terms of ROAS was a bit over $200K in revenue at about $60K in spend on a single one of our brands, not including Black Friday, which was insane.)

Just a caveat here: how I run ads might not work for you, especially if you’re super low in spend. Different brands require different strategies, and most importantly, my own strategies are constantly developing; how I test and scale on Facebook now is completely different from how it was 6 months ago, for example. Another caveat: some of the tactics we use are really only necessary at a super high level, as you’ll see here. If you’re a mom-and-pop shop they won’t be necessary (for example, running multiple Facebook pages, which I’ll get into).

When I first got started in online advertising, I was always searching for the “perfect” way to run ads through shitty gurus, and honestly there is NO perfect way. I recommend learning the basics and devising your own strategy, which is what I ended up doing. Another thing: at lowish spend (less than $5K-10K a day, I would say), you’re usually going to get decent fluctuations in performance day to day on Facebook. Consistency on Facebook comes from high spend and feeding the algo as many data points as possible.

I’m fortunate enough to be in a network of some of the most elite DTC brand owners, so I’ve accumulated a ton of knowledge about what works at this level of scale, but this game still requires constant learning! This isn’t set in stone; it’s just what I’ve found works for me, so here it goes.

Naming Conventions

Consistent naming conventions are super important for analyzing data in ad reporting at a glance. You can figure out your own but here are mine if you’re looking for a quick idea:

Campaign Names:

TOF: Prospecting (Top of Funnel)

BOF: Retargeting

T: Testing

S: Scaling

SS: Super Scaling (these campaigns are typically $2K-10K daily budget)

X.XX numbers at the end of campaign names or ad sets names: date of launch, i.e. 5.15 is May 15

Campaign name example: SS – TOF – CBO – Beast – 6.05

Ad set names:

Targeting – Countries – Age – Placement – Attribution – Date of launch

E.g. Broad – US + CA – 18+ – Auto – 7dc1dv – 3.15

e.g. INT – Theme parks – US – 18+ – Auto – 7dc – 3.24

E.g. LLA – Lookalike (US, 10%) – 2+ Purchase 180 Days – US – 18+ – Auto – 7dc – 2.16

Ad Names:

Brand – FB Page – video/image number – ad copy number – lander/advertorial number – post ID – date of launch

E.g.

PP – vv100 – adc49 – lp3 – 123434341834813 – 8.08

PP – p3 – vv100 – adc72 – lp53 – 123434341834813 – 8.08
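
If you want to generate names like these programmatically rather than typing them by hand, here is a small Python sketch following the ad set pattern shown above; the fields and values are just illustrations of this scheme, not an official format:

    from datetime import date

    def stamp(d):
        """Format a launch date as M.DD, e.g. 5.15 for May 15."""
        return f"{d.month}.{d.day:02d}"

    def ad_set_name(targeting, countries, age, placement, attribution, launch):
        """Assemble: Targeting – Countries – Age – Placement – Attribution – Date."""
        return " – ".join([targeting, countries, age, placement,
                           attribution, stamp(launch)])

    print(ad_set_name("Broad", "US + CA", "18+", "Auto", "7dc1dv",
                      date(2022, 3, 15)))
    # Broad – US + CA – 18+ – Auto – 7dc1dv – 3.15

A helper like this keeps the segment order consistent, which is the whole value of a naming convention when you are scanning reports at a glance.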

 
Account Structure – Testing (Post IDs)
Testing Campaigns (always running):
T – TOF – ABO – Interest Testing – 5.15
  • Testing random interests found in Facebook Audience Insights, interests similar to winning interests, etc., using the best 2-4 post IDs to “feed” the pixel data

  • Audience Insights is being phased out, so this might not be useful in the future

  • Small budget ad sets of $30-50

  • Can dupe winners out 2x in same campaign at slightly higher budget of $50-60

I do this with lookalikes too, but I do not run interests or lookalikes with any real budget whatsoever nowadays. I literally run all creative testing and scaling with completely wide-open targeting.

 
T – 1 – Creative – TOF – ABO – Broad – 2.18
  • Phase 1 testing campaign

  • All new videos/images get launched here

  • I like to do them in batches of 3-4 new videos/images at a time in a single broad ad set with the budget set to 1.5-2x AOV

  • Broad targeting (US + CA, 18+) so we can determine how effective the creatives truly are without being skewed by very good lookalikes/interests, etc. In the case of more niche products, you can try broad interest targeting, like the interest ‘fitness’ if selling fitness apparel or ‘coffee’ if selling a coffee product, with detailed targeting expansion checked ON

  • Using best copy variation, best offer, best lander/advertorial

  • Winners graduate to testing phase 2

 
T – 2 – Ad Copy – TOF – ABO – Broad – 2.19
  • Phase 2 testing campaign

  • Take each winning creative from phase 1 and put it into its own broad ad set in this second campaign, testing 4-5 different ad copy angles (one ad per angle), still using the best lander

  • E.g. ad set naming convention:

    • img192 – Broad – US + CA – 18+ – Auto – 7dc – 3.02

      • Means img192 is the constant image across the 4 ads, with 4 different copy

  • Winning ad copy variants graduate to step 3

 
T – 3 – Lander – TOF – ABO – Broad – 2.19
  • Phase 3 testing campaign

  • Here’s what differentiates us from most ecom brands: we test a TON of advertorials, like 3-5 new advertorials a month focused on different angles. Seriously, at scale this is what separates winners from losers. In this campaign I’ll also test running direct to our top sales lander as one of the ads. We NEVER run direct to a Shopify store; we have a subdomain with dedicated landing pages/advertorials that we run to, with a custom checkout that converts MUCH higher and has a much higher AOV with its upsells.

  • Take winning video/images + copy combo and test 3-5 different landers/advertorials as mentioned

  • E.g. ad set naming convention:

    • vv65 – adc220 – Broad – US + CA – 18+ – Auto – 7dc – 3.21

      • Denotes that vv65 and adc220 were the winning variables from the previous test, now testing 3-4 different landers/adverts with these two winning combos

  • By now the creative has run through 3 different testing campaigns/phases. If still performing, it can be moved to bigger budget testing to see its scaling potential

  • Can also be moved to optional step 4 for generating more winning post IDs

  • Also optional: Winner of this test can be moved back to step 2, testing more ad copy focused around the advertorial if a specific advertorial won during this test

T – 4 – Page – TOF – ABO – Broad – 2.19
  • Optional step 4

  • This is another tactic that I don’t see many bigger brands using. In this campaign I’ll take the winning ads from the previous steps and re-create them on 3-4 different Facebook pages that aren’t our main brand page. These are ‘blog’ style pages. For example, if you own a furniture store, the name of one of the pages might be “Home Decor Insider”. What you don’t want to do is create fake influencer pages like “Katie’s Homes” or something like that, as that’s not allowed.

  • Take the winning video/image + copy + lander/advert combo and test it on 3-4 different Facebook pages to generate more winning post IDs as mentioned.

  • The point of this is multi-fold:

    • Generate as many winning post IDs as possible, because at scale you’ll need them

    • Distributes negative feedback score away from your main brand page (negative feedback can become an issue at scale, especially last year with covid shipping delays)

    • Different pages perform differently in the auction; some page names may resonate with people more and get cheaper CPCs and CPMs.

As you can see, the point of all this testing is generating as many winning post IDs as possible.

BPA – TOF – ABO – Broad – 2.19
  • BPA meaning best performing ads

  • This campaign is for testing all the winning post IDs from steps 1-4 at higher budgets.

  • Like to do them in ad sets with batches of 2-4 ads

  • Also broad ad sets, but can also try with different LLA’s or broad interests

  • Budget 1.5-3x AOV, and scale it by duping. I.e., start the ad set at $300; if it’s doing well over the course of 3 days or so, dupe it out at double ($600). From here you’ll get a sense of how it does at higher budgets. Sometimes a creative can do very well in the smaller step 1-4 testing but fall flat here. If it was getting decent metrics in testing but falls flat here, you can try duplicating the ad set and trying again, or testing with a couple of different audiences.

DCT Testing (if applicable)
  • DCT seems to work better with lower CPA products, or requires a very high budget for higher CPA products

  • I haven’t had much success with dynamic creatives for testing, and especially now with the iOS update Facebook doesn’t show in breakdowns which creative variables are getting the purchases, so they seem essentially worthless.

  • If I were to do creative testing with DCT, I would do something like:

    • One broad ad set for each new video/image

    • $100-300 budget

    • 1x new video/image, 2 best copy + 1 new copy, 1 best headline + 1 new headline

  • Pull winning post IDs out, then follow testing steps 3-4 above to test different landers/adverts/offers/FB pages

  • What I DO like dynamic creative for lately is time-sensitive sales, like Black Friday, where I don’t have a ton of time to test stuff. What I usually do is toss in a ton of my existing winning videos/images/copy/headlines (I might just add a Black Friday-specific line to the top of the ad copy) running to my best advertorial/lander and let it rip at about a $1,000/day budget. If it does well after 1 day, I’ll duplicate it out into a cost cap/bid cap at $5K-10K a day or whatever.

CBO Angle testing:

This is a CBO with 5-7 ad sets; each ad set is a separate angle containing winning ads from the above campaigns, which get added to their respective angle ad set. Budget is about $1K per day for me. All ad sets are wide-open broad targeting.

SCALING!!!

Here’s the fun part. My methods of scaling have evolved with what works on Facebook. The good thing is that at this level of spend I learn quickly what does or does not work on Facebook anymore, so it keeps me current. I have a few different scaling campaign structures that I’m currently running simultaneously. This is what I’m finding works right now:

Scaling Campaign 1

Lowest cost CBO -> 1 ad set (completely broad) -> best 6-10 post IDs from testing campaigns. I’ll add new post IDs/turn off ads if performance is on a decline over a week period. I will increase the budget by 20-30% a day if performance has been consistently good over a 2-3 day period.

Scaling Campaign 2

Same as above, except this campaign is made up entirely of non-brand-page post IDs from the page testing campaigns.

^ These campaigns are both often running at $2-5K+ a day
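
For a feel of how the 20-30% daily increase rule from Scaling Campaign 1 compounds, here is a quick sketch; the starting budget is hypothetical:

    budget = 1_000.0  # hypothetical starting daily budget
    for day in range(1, 8):
        budget *= 1.20  # +20% per day, the conservative end of 20-30%
        print(f"Day {day}: ${budget:,.0f}/day")
    # After 7 straight increases, $1,000/day has become ~$3,583/day,
    # which is why budgets only get raised after 2-3 consistently good days.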

Scaling Campaign 3 – Bid Cap ABO

I duplicate the best ad sets 3x from the CBO angle testing campaign into a separate ABO campaign, each running at a different bid. Ad set one’s bid cap is set to target CPA + 25%; so if my target CPA is, for example, $50, the bid cap would be set to $62.50. Ad set two is set to +50% ($75) and ad set three to +100% ($99.99; I round down in this case, as my theory is that if I set the bid to $100, I’ll be put into a higher-tiered auction pool and may get outbid. Don’t quote me on this, lol.)

I set budgets at about $1K-5K per ad set here. And because you can have one of these campaigns for each angle, you can see how quickly scale can build up here.

Scaling Campaign 4 – Cost Cap ABO
  • Same as above, but the cost caps for this campaign will be +15%, +25% and +50%
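
Here is a minimal sketch of the cap ladders from Scaling Campaigns 3 and 4; the round-$100-down-to-$99.99 move mirrors the author’s own auction-pool theory, which they themselves say not to quote them on:

    def cap_ladder(target_cpa, pcts):
        """Bid/cost caps at the given percentages above target CPA."""
        caps = []
        for pct in pcts:
            cap = target_cpa * (1 + pct)
            if cap % 100 == 0:
                cap -= 0.01  # author's theory: $100.00 -> $99.99
            caps.append(round(cap, 2))
        return caps

    print(cap_ladder(50.0, (0.25, 0.50, 1.00)))  # bid caps:  [62.5, 75.0, 99.99]
    print(cap_ladder(50.0, (0.15, 0.25, 0.50)))  # cost caps: [57.5, 62.5, 75.0]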

Scaling Campaign 5 – Cost Cap CBO
  • 4 completely broad ad sets, duplicates of each other, all with the same cost cap. This campaign contains the best 6-12 post IDs overall from all testing campaigns. You’ll have to play with the cost cap here to get it to spend properly. This campaign is generally a big one for me, usually with a $10K daily budget. I’ll also set a minimum ad set spend of about 3-5x the CPA for each ad set.

The point in having so many scaling campaigns is multi-fold:

  • Prevents reliance on a single scaling campaign on poor days. For example, one or two of these campaigns might do mediocre numbers one day, but the rest are crushing it and make up for it

  • Optimizes differently and hits different points in the auction by utilizing both CBO and ABO

If you want to go crazy you can also take these exact scaling campaigns and scale them across multiple accounts as well. For that $200K day I had $10K+ cost cap campaigns scaled across like 4 different accounts.

And that’s it! Like I said, this is not the end-all-be-all of running ads, just what I’ve evolved to do after spending high budgets day in and day out for single brands.

The most important thing about scaling at this level of spend, and what separates the brands who do great online from those who don’t, is content. We’re testing about 10-15 NEW video ads per WEEK, plus variations of winning videos on top of that (different hooks, for example).

Audience “hacking” is no longer really a thing and hasn’t been for a while. I don’t run any interests at scale for the most part, and I barely use lookalikes nowadays either (they worked great last year up until Q3-Q4). Literally just wide-open 18+ targeting. Broad targeting might not work as well if you have a super niche brand, though.

It’s true that Facebook has certainly become a lot more difficult nowadays. We aren’t spending as much on it compared to last year (though still a lot, and it’s still our primary DTC revenue driver), and we’re trying to crack other traffic sources to diversify for cold traffic, especially TikTok, YouTube, GDN and Snapchat. Snap is at about $3K-5K a day in spend at so-so ROAS.

How to structure your entire Facebook ad campaign (From prospecting to retargeting)

Having a defined structure and strategy is essential to a successful Facebook ad campaign.

I run an ads agency, and one of the biggest mistakes I see with Facebook ads is a complete lack of structure. Many business owners and advertisers treat Facebook ads like darts, throwing Hail Marys at the board and hoping for a favorable outcome. This is especially apparent when it comes to scaling, which I think is what people struggle with most.

In this post I will give a complete overview of how to structure your Facebook ads, from TOF prospecting to BOF retargeting.

Quick disclaimer, this is just a general overview of strategy and structure. Every ad account should be approached differently and it’s important to tailor your strategy to your brand.

This is what it should look like from a birds-eye view:

TOF – 1 Testing Campaign & 1 Scaling Campaign

MOF- Retargeting Campaign for Soft Interest (Landing page view, video views etc)

BOF – Retargeting Campaign for Heavy Interest (ATC, IC etc)

BOF Post Purchase (Optional) – This is brand dependent and isn’t applicable for all. This is post-purchase retargeting.

TOF – Testing and Scaling

This stage of the funnel should ideally be split into two campaigns, it may require more with bigger accounts.

This entire stage of the funnel only involves cold audiences, a majority of your budget should be allocated to TOF.

  • Testing

The first campaign is the testing campaign. It’s important to test EVERYTHING. This campaign should be ABO, and every ad set should be allocated an equal daily spend. Test audiences and creatives for 1 week, kill ad sets that aren’t performing, and move winning ad sets and creatives to the scaling campaign.

It’s also possible to scale ad sets vertically in the testing campaign. However, be careful to not get overzealous as you risk sending the ad set back into learning. To scale vertically, slowly increase the ad set budget by 10%-20% every couple of days.

  • Scaling

All your winning ad sets from the testing campaign should be duplicated into the scaling campaign. Sometimes ad sets will perform vastly differently when duplicated, which is why we also scale vertically in the testing campaign. Sometimes it may just be a matter of duplicating the ad set twice before it performs; this is a result of Facebook’s learning phase always being different.

Now, this campaign should ideally be CBO as your goal is to maximise results. You should still be introducing new ad sets from your testing campaign, some people even introduce new ad sets directly to the scaling campaign. At this stage of the funnel, keep an eye on frequency as you don’t want to risk audience fatigue. It’s important to keep introducing new creatives to combat audience fatigue.

The TOF campaign should include both cold interest audiences and cold LLA audiences. As I said, test everything. It’s also important to start with logical audiences. Once you start getting traction you can begin introducing some more obscure interests.

Your copy at this stage should also be problem/solution focused, you are selling your product at this stage.

MOF – Retargeting Soft Interest

This stage of the funnel will only be effective if your cold campaigns were optimised for purchases, otherwise, you will be wasting money retargeting low-quality audiences.

The targeting for this stage is simple. It’s important that you exclude audiences that you will be targeting later down the funnel, such as ATCs, ICs, and Purchases.

The copy is really important at this stage of the funnel. You have already somewhat sold them on the product, hence why they clicked. I’ve found that trust-building copy and creatives are effective. Customer reviews/testimonials can be leveraged to build trust with your audience and convince them that your product delivers on what it promises, or at least, has a real customer base. People like to follow the herd, convince them that the herd buys your product.

Some advertisers skip this stage of the funnel completely, or combine it with the bottom of funnel retargeting. This is ok, but I like structure and separating the campaigns is much more orderly. It also allows you to ensure copy and creative is consistent with the funnel stage.

BOF – Retargeting Heavy Interest

This is the campaign that should provide you with the best results in terms of ROAS and CPA. However, as the audience will be much smaller, the daily ad spend will be relatively low.

It’s important that you exclude the MOF audiences, as well as purchasers.

Creative and copy should involve a strong CTA. This audience has already been involved in the purchase process and thus has shown strong interest in your product. We often use discount codes at this stage as a CTA.

You can also get creative with your copy. Remember, this audience already knows your brand and product.

BOF Post Purchase – Optional

This is only applicable for brands with multiple products for sale. Only a very small budget should be allocated to this campaign.

Again, this audience is already very familiar with your brand so use this to your advantage.

As mentioned in the beginning, this is just a basic structure and there are many variations. It’s important that you take your own situation into account when setting up your Facebook ads.

I hope this post has been helpful, it’s not as granular as my previous posts but I think it’s important that people understand how to structure an entire Facebook ad strategy.

Top 10 Most Expensive and Cheapest Facebook CPMs

Here are the top 10 most expensive CPMs for February-March 2022:

Australia – $19.57

Denmark – $18.98

Norway – $18.19

United States – $17.26

Singapore – $15.43

Israel – $14.68

New Zealand – $14.23

United Kingdom – $12.40

Canada – $11.86

Sweden – $11.71

Here are the top 10 cheapest CPMs for February-March 2022:

Uzbekistan – $0.06

Belarus – $0.09

Kyrgyzstan – $0.16

Tajikistan – $0.16

Turkmenistan – $0.21

Kazakhstan – $0.22

Guinea-Bissau – $0.41

India – $0.41

Azerbaijan – $0.42

Wallis and Futuna – $0.43

Your poor-performing Facebook ads are not as simple to fix as you probably think…

If you are experiencing poor results with your Facebook Ads and have a “quick fix” in mind, please read this post before you attempt to fix it.

When you create Facebook ad campaigns, you know that there are just so many different ways that it can be set up.

Like a dozen different campaign objectives… Many conversion optimization options… Hundreds (maybe thousands, idk) of interests you can target… Lookalike audiences… The different platforms you can place your ad on… Video vs. image… Square vs. rectangle… Long copy vs. short copy…

And the list goes on and on.

So whenever you launch a campaign on Facebook and it isn’t working after 5-7 days, you can see how many different things can be adjusted in an attempt to fix it.

I’ve worked on hundreds of Facebook ad campaigns and have had thousands of conversations about Facebook ads, either with my clients or with people who need help running their ads and come to me for consulting or to have me personally launch and scale their ads. Sometimes they will tell me what they think is causing their issues, and what they say ALWAYS falls into two categories: either “I have no idea,” or they think the fix is just one thing, like “I just need better targeting,” or “my ads don’t get enough likes,” or “I’m just not sure what my daily budget should be, that’s my main problem.”

And I’ve made the mistake of taking their word for it, so when I dive into their ad account, I go in expecting to just make that easy fix, with everything else in the ad account being set up properly. Just fix their targeting or budgeting and it’ll all be smooth sailing from there. Nope. There are always many more problems that I see as I go through their ad strategy and setup.

I’m going to go a bit deep here… people often apply this type of thinking to big problems in a lot of areas of life, assuming the solution is super simple. When people need to lose weight, they’ll say, “If I could afford healthy food and a gym membership, I would be in great shape,” but there are so many other problems, like their consistency or workout routine… their opinion of what “healthy food” is could be inaccurate. Give them free unlimited healthy food and a free gym membership and they’ll still be out of shape. And people think, “If I had a million dollars, I would be happy with my life,” but then they win the lottery and are still miserable.

Maybe there is some sort of psychological pattern that people fall into to feel less overwhelmed by their problems? I’m not an expert in that area!

Here’s the point I’m trying to make: the fix for your low-performing ads is MUCH more than just one small fix. It’s either a lot of little fixes or one big fix.

If I dive into your Facebook ad account and I see horrible campaign structure, improper budgeting, confusing ads, and terrible targeting… turning on “target people connected to Wi-Fi” is NOT going to fix your campaign. Finding the “perfect interest” to target won’t fix it either. But this is the type of thinking I hear from people with broken ads.

When it comes to fixing broken Facebook campaigns, all of the solutions fall into two main categories, each having their own criteria that MUST be met.

The categories

  1. Campaign structure

  2. Product (or offer)

The criteria that must be met for a winning ad campaign

  1. The campaign structure must cater to what Facebook prefers

  2. The product must cater to what your target demographic prefers

Some things overlap a little into both categories. For example, the ad design needs to be social-media friendly so that Facebook doesn’t throttle your reach with high CPM, and the ad must cater to your target demographic by making it easy for them to understand what you are selling; that’s a bit of both Facebook and target demographic. And your product staying within Facebook’s ad policy is clearly something that caters to Facebook’s preferences.

I could write a book going over all of the things that fall into these categories that will fix a failing ad campaign, but here are a few real examples I’ve seen inside of ad campaigns over the last few weeks.

1. Budget spread too thin among ad sets and/or ads

An ad account I started working on last week was using dynamic ads with as many ad variations as possible: a maxed-out number of creatives, ad copy, and headlines. They were spending about $100 per day on this dynamic ad, but because they had so many dynamic options, they basically had 200+ ad combinations in one ad set. Put $100/day into that and you’ve got 50 cents per day per ad, which is not nearly enough budget to give Facebook for any ad. If you are going to use dynamic ads or multiple ads in one ad set, try to give each ad a range of $5-15 per day.
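
The underlying arithmetic is worth checking on your own account; here is a sketch with illustrative slot counts (the actual limits depend on the ad format):

    creatives, copies, headlines = 10, 5, 5  # hypothetical maxed-out slots
    daily_budget = 100.0

    variations = creatives * copies * headlines
    print(f"{variations} combinations -> "
          f"${daily_budget / variations:.2f}/day each")
    # 250 combinations -> $0.40/day each, nowhere near the
    # recommended $5-15/day per ad.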

2. Ad talks more about the business or brand instead of the product

This one broke the rule of having the ad and product cater to the target demographic. Especially for newly established brands, your best target demographic is impulse buyers, and they typically don’t care about how long you’ve been in business or how your product is made. I’m not saying you should never put that into an ad, but I would recommend talking about the product or special offer at the top of the ad text and in the headline, which is the first thing a viewer will read.

3. Targeting is far too restricted and narrowed down

A rule of thumb with Facebook targeting: you want to make it easy for Facebook to find who you are looking for. When you add too many constraints to your targeting, Facebook has to work extra hard to figure out who to put your ad in front of, and it makes you pay for that extra work by raising your CPM substantially. The ad account I worked on had 5 entertainment-based interests in the first level, narrowed down to 3 hobby-based interests that also had to match, and then narrowed down again to engaged shoppers. So when Facebook finds someone in the first level of the audience, it needs to check whether they match the second level, and then the third as well. For best results, just test one or two interests in each ad set starting out.

4. Creative is not social media friendly

Your ad doesn’t need to be “good” so much as it needs to be designed in a way that Facebook prefers, so that Facebook shows it to a lot of people. This is the first warning sign I look for when I check a page’s ads in the Ad Library. I was on the phone consulting someone on their Facebook strategy and they said, “My biggest problem is the targeting. I have no idea what interest is the right one.” But then I looked at their ads in the Ad Library, and it didn’t matter who they targeted with those ads; Facebook doesn’t like them. Too much text on the ad and a low-quality image are the most common issues I see. The 20% text rule is no longer in effect, but if you put too much text on an ad, it will throttle the reach and increase the CPMs (usually by a TON, to the point where it is nearly impossible to counter). If you have some big bold text you want to put on the creative, just put it in the headline of the ad instead.

There are many more errors that I have witnessed, but I’m sure a lot of people reading this post are making errors similar to the few examples I’ve mentioned, and I hope this helps them fix their ad accounts at least a little bit.

How to leave less money on the table with your FB ads

I’ve audited hundreds of ad campaigns, from huge organizations like Greenpeace to startup dropshippers.

There are 9 areas I pay attention to when doing these audits:

  1. Structure

  2. Objectives

  3. Targeting

  4. Placements

  5. Customer Avatar / Personas

  6. Copywriting

  7. Visuals

  8. Landing Pages

  9. Funnel / Strategy

Here are the most common mistakes I see businesses make with each of those pillars, mistakes that hold them back from the ROI they need to grow.

Pillar 1 – Structure

Biggest Mistake: Not using clear naming protocols.

Explanation: This is possibly the least sexy area of FB ads, but if you don’t name your campaigns, ad sets and ads consistently, you end up with unclear names for things, and everything takes longer when trying to find your way around your account, look back at results, or compare the performance of two campaigns/ad sets.

How to avoid making the same mistake: The naming convention I recommend is as follows:

Campaign: Objective | description | date, i.e. “Guide download | Overwhelm | Jun 2019”

Ad Set: Description | date | testing variable, i.e. ad set 1: “Overwhelm | Jun 2019 | email lookalike”, ad set 2: “Overwhelm | Jun 2019 | Interest: Moz”

Ads: Description | date | testing variable | creative variable, i.e. ad 1: “Overwhelm | Jun 2019 | email LLA | H1C1V1”, ad 2: “Overwhelm | Jun 2019 | email LLA | H1C1V2” (H = headline, C = ad copy, V = visual)

Pillar 2 – Objectives

Biggest Mistake: Not using the conversion objective

Explanation: I think this comes down to people not quite understanding how Facebook’s targeting and objectives work.

Here’s an (over-simplified for the sake of clarity) overview:

There are two main factors that affect who sees your ads, your targeting and your objective. By choosing targeting options, you narrow down your potential audience from ‘Everyone who uses Facebook’ down to (for example) ‘people who like pages related to surfing’ or ‘women over 40 within 10 miles of my business’.

Then Facebook takes that group of people and ranks them in order of ‘most likely to complete the objective you’ve chosen’, based on the huge amount of historical data they have on everyone. This means that if you’ve selected an audience of 100,000 people and chosen the ‘traffic’ objective, Facebook will decide which of those 100,000 people are most likely to click your ad (based on things like how relevant it thinks the ad is to them, and how often they’ve historically clicked on things like this), and show it to them in rough order, from person 1 to person 100,000.

If you chose the ‘video views’ objective, Facebook will decide which of those 100,000 people are most likely to watch your video (based on things like how often they watch videos like yours), and show it to them in rough order, from person 1 to person 100,000.

So…

By choosing different objectives, your ads will show to different groups of people within your audience. This isn’t a big deal if you have an audience of 30,000, because your ad will likely show to all of them in a short timeframe; but if you’ve got an audience of 2 million people, then you want to show it to the people most likely to do the thing you want. And typically, when you’re sending someone to your website, it’s because you want them to do something when they’re there, i.e. download a guide, buy a product, or book an appointment. So by not choosing the ‘conversion’ objective you are likely getting worse results than you could be.

How to avoid making the same mistake:

Read through the following paragraphs to learn when to use the most common objectives:

Traffic – Use this when you’re sending people to your website but don’t have an action for them to do when they get there, or can’t track what they do when they get there, i.e. a blog post, press release, or new thing you’re doing, or when promoting third-party content (where you don’t have access to a tracking pixel on the end site).

Conversions – Use this when you want to send someone to your website AND have them do an action – i.e. getting them to buy something, sign up for an event, or download your awesome guide.

  • Within conversions – you can set up different objectives. Best practice is to start with the end goal you want, i.e. purchases, and then move back along the customer journey (purchase > initiate checkout > add to basket > view content > view landing page) if you don’t get results.

Page Post Engagement (PPE) (This is the same as boosting a post) – Use this when you want to get comments/likes/shares on a post – i.e. content that doesn’t require an action/ for a competition/ getting people to tag their friends. These are also great when you have a messenger bot setup, triggered by a comment.

Video views – If you’re building an audience of people to retarget, then video is likely to be the cheapest route, because you can track anyone who watches 3 seconds or more of your video. It’s also useful if you want cheap awareness of something that doesn’t include a direct action you want someone to take.

Lead Generation (Lead Forms) – These seem undervalued by many advertisers, probably because getting the leads from the form into anywhere useful, like your CRM, isn’t as easy as it should be*. But if you want people to sign up for something or give you their details, and they are already qualified, then lead forms can work great. For local businesses that want leads (i.e. gyms or cleaners), lead forms consistently get me the best results. *Use Zapier to easily get the info people fill in sent to your email/phone instantly.

Reach – Using the reach objective tells Facebook not to worry about any end objective, but rather to just show your ads to everyone in your chosen audience. This is useful when you’re targeting a small number of people (e.g. retargeting the 2,000 people who’ve watched a specific video of yours), or when targeting a small geographical area (e.g. the 5km radius around your business).

Brand Awareness – An underused objective – presumably because it doesn’t produce a very measurable end ‘result’ but brand awareness ads are actually very powerful. Facebook will choose who to show your ads to based on who is likely to remember your brand in a couple of days time. This means it can be very useful for ads going out to a broad cold audience, with a view to retargeting them. HOWEVER – I’ve also found it to be one of the most profitable objectives to use for retargeting in multi-tiered campaigns (i.e people who’ve visited your website but not signed up for your course yet)

Pillar 3 – Targeting

Biggest Mistake (Non-Local): Ignoring custom audiences.

Explanation: The following targeting options are listed (broadly speaking) in order of preference, because they go from warmest to coldest:

  1. Custom audiences

  2. Lookalike Audiences (LLA’s)

  3. Interest targeting

  4. Location

  5. Age & Gender

And obviously, the warmer the audience, the more likely they are to buy from you.

Yet I see a lot of businesses just constantly pumping out ads to a cold audience, ignoring the people who have already watched their videos, been to their website, or added a product to their cart. In e-commerce businesses, a retargeting campaign going out to people who have added something to their cart but not bought is the highest-ROI campaign 9 times out of 10, and it’s the same no matter what you sell.

How to avoid making the same mistake: Plan out a proper customer journey. What are all the different steps that someone goes through between first coming across your business and becoming a long-term customer?

  • Downloading a guide and getting on your email list?

  • Watching a video of you explaining how your process is ideal for them?

  • Browsing your website?

  • Scheduling a call with you personally?

And then create ads for each relevant stage to help guide them along that path. Remember, as they become more familiar with you, you will also speak to them differently.

Pillar 4 – Placements

Biggest mistake: Wasting money on the audience network.

Explanation: There are over a dozen different places where your ads can show, but not all of them tend to be equally effective, and Facebook will often push a high amount of traffic to the Audience Network because it is less saturated. The Audience Network is a huge number of websites and apps where Facebook also shows ads. There are times and places when the Audience Network is great: I’ve seen it work well for link clicks to blog posts, and as part of a retargeting campaign, allowing you to ‘be everywhere’. But too often it’s not the right choice.

In recent times (since sometime in 2019), Facebook’s ability to choose the right placement seems to have massively improved, to the point where I often leave placements on ‘automatic’ because I end up with a better end ROAS. But the Audience Network is the most common culprit for wasted spend, especially if you’re looking to get video views from a cold audience.

How to avoid making the same mistake:

Go to the ‘Performance and Clicks’ pulldown menu in Ads Manager, then use ‘Placements’ in the ‘Breakdown’ pulldown menu to see if any placements are performing above or below average.

If you see that you’re spending lots on the audience network and not getting results, then you might want to turn it off in future.

You do this at the ad set level: select the ‘Edit placements’ radio button instead of ‘Automatic’ and untick the placements you don’t want. Caveat: as mentioned, this is an area I’m encouraging people to play with a bit less recently. It’s worth testing, but I’ve seen many examples of CPMs increasing significantly when you remove too many placements.
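
If you prefer to sanity-check this outside Ads Manager, here is a hedged sketch that tallies spend and purchases per placement from an exported breakdown CSV; the file name and column headers are assumptions, so rename them to match your actual export:

    import csv
    from collections import defaultdict

    spend = defaultdict(float)
    purchases = defaultdict(int)

    # Hypothetical export made with the Placement breakdown applied.
    with open("placement_breakdown.csv", newline="") as f:
        for row in csv.DictReader(f):
            p = row["Placement"]
            spend[p] += float(row["Amount spent"] or 0)
            purchases[p] += int(row["Purchases"] or 0)

    # Placements with big spend and few purchases are the candidates to cut.
    for p in sorted(spend, key=spend.get, reverse=True):
        cpa = spend[p] / purchases[p] if purchases[p] else float("inf")
        print(f"{p:<30} ${spend[p]:>9.2f}  {purchases[p]:>3} purchases  CPA {cpa:.2f}")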

Pillar 5 – Customer Avatar/Personas

When it comes to defining their customer clearly (if you don’t know who you’re selling to, it’s hard to speak to them in an appealing way), there are two related, intertwined mistakes I see made most often.

Biggest Mistake: They don’t define their target customer at all in the first place, and just use generic language that (sort of) appeals to everyone.

The close second: if they have defined an avatar, they’ve lumped everyone in together, into some amalgamation of all their customers.

Explanation: Generic language speaks to (and disqualifies) nobody. Buying is first and foremost an emotional decision, and if we don’t trust the person selling to us, we’re not going to buy, so you need to show that you UNDERSTAND THEM, and UNDERSTAND THEIR PROBLEMS.

How to avoid making the same mistake: First, define all the different groups of people that buy from you. There should be at least 3, but if you’ve got loads, just identify the biggest few. Each of these personas will have different opinions/goals/pains etc., so once you’ve done that, ask yourself the following questions for each one:

  1. For each one we want to know the basic demographics that define them: 

    1. age,

    2. gender,

    3. location,

    4. income…

  2. Then the psychographics that relate to what you’re selling:

    1. What do they want?

    2. What do they care about?

    3. Who are their enemies?

    4. What are their dreams?

    5. What do they believe?

    6. What are their suspicions?

    7. How have they failed before?

    8. What are they afraid of?

Then when you create an ad campaign, create it for just one persona at a time, and craft your message and your offer to match them.

Pillar 6 – Copy/Offer

Biggest Mistake: Copywriting is a huge topic, but you don’t have to be a world-class copywriter to get results from Facebook ads. The biggest mistake I see being made is talking about yourself, not about your clients.

Explanation: This follows on from the customer persona section above, because if you don’t have a clear picture of who your ad is for, then you can’t write for them. And you need to write for them, because talking about yourself is NOT going to appeal to them. “We are the biggest supplier of…” “I am a skilled teacher and can do…” This isn’t interesting to the reader, and will not get them to click.

How to avoid making the same mistake: WIIFM – Every time you write a sentence, read it back and ask yourself (from your reader’s POV) “What’s In It For Me?” If you have a clearly defined picture of who you’re writing for, then you can go through everything you write and make sure that it’s relevant to them, their hopes, dreams, goals, objections, fears…

Pillar 7 – Visuals

Biggest Mistake: Not testing them.

Explanation: The PRIMARY job of the image/video that you use is to get enough attention to stop someone scrolling for a split second, so that they can scan the ad copy to see if it’s relevant/interesting.

If you just chuck up one photo and never try anything else, who knows how much money you’re leaving on the table.

How to avoid making the same mistake: Effective attention-getting-visuals tend to fit into one of 3 categories:

  1. The target market: Show an image/video of the type of person you’re speaking to; they will pay attention because it’s relevant to them. For example, if you run a food truck, a photo of your customers eating an awesome-looking burger in front of a recognizable place/landmark in your town.

  2. The problem/solution/aspirations: Demonstrate either the issue at hand or your product/service solving that issue; again, people will pay attention because it’s relevant. For example, if you sell waterproof hiking shoes, you could show someone with wet socks looking miserable.

  3. A pattern interrupt: Something that just seems out of place will get attention (read Purple Cow by Seth Godin), but beware of using ‘wacky’ but irrelevant images/videos for the sake of it. These might get people to stop/click, but they likely do nothing to qualify the right people. For example, I saw a FB ad a while back that was just a picture of a cute dog, with a headline along the lines of “Instead of you seeing a boring advert, I’m paying to show you this pup.” It got my attention, but that was that.

So find (or create) a bunch of images and videos that fit those categories and see which gets the best click-through rates and the most conversions.

Caveat: you can, of course, also use the video in your ads to teach/inspire/sell directly, but remember that without getting initial attention, your efforts will be passed over, and you still need to be testing different variations.

Pillar 8 – Landing pages
 

Biggest Mistake: S L O W loading times.

Explanation: Your landing page is the page that you send people to if they click on your ad. It could be a simple blog post, a product page on an e-commerce store, a booking page for a cafe, or an opt-in page where someone can give their info in exchange for a download/course/freebie.

Landing pages consistently get less attention than they need, especially compared to the ads sending people there, which is crazy because the landing page can easily increase or decrease the ROI on your ads by 100-500% or more. And the biggest culprit is loading speed: how long it takes for your page to load for the viewer. According to Neil Patel, “Nearly half of web users expect a site to load in 2 seconds or less, and they tend to abandon a site that isn’t loaded within 3 seconds.”

How to avoid making the same mistake: Google ‘PageSpeed Insights’ and click the top link, then enter your website/page. All those things that appear are costing you money. ‘Eliminate render-blocking resources’, ‘Defer unused CSS’, ‘Properly size images’ – it’s all geeky stuff, and it all counts, so find a website developer and pay them to fix it. The great thing about speeding up your site is that it pays for itself over and over. If you’re paying money every month to run ads, it’s worth paying a one-off fee to increase your conversion rate overnight.

Pillar 9 – Funnel/Strategy

Biggest Mistake: Randomness

Explanation: To put it bluntly – most businesses don’t have a plan when it comes to FB ads. They tried a couple of ads that worked, but now they aren’t working so well, and they just keep throwing things up without much of a clue.

How to avoid making the same mistake: It’s not complicated or groundbreaking, but it is effective. You find an established business like yours that’s already running ads, and you ‘model’ what they’re doing.

And the great thing that came from Facebook’s privacy push is that all this info is publicly available. Here’s how you find it:

– Find known successful companies on FB, OR search keywords for your niche.

– Look for the ‘Page Transparency’ box on the right.

– And if they’re running ads, Facebook will tell you.

– You click on ‘Go to Ad Library’

– And there you go, all the ads that they’re currently running.

– You can click on them, follow their funnel, see what they’re doing.

– And model it for your business.

This isn’t perfect, and you can’t just copy/paste a funnel from another business, but it gives you a starting point, and if you model what a similar business is doing, adapt it to your own products & clients, then test from there, you’re likely going in the right direction, rather than driving around without a map.

There you go – avoid these 9 mistakes and you’re probably halfway there.

 
  1. The hardest part of working on Facebook is working with Facebook.

  2. Set your conversion objective to your actual business goal, even if you can’t exit “Learning Limited”. You’ll get cheaper results.

  3. You can get incredible results if you go “broad” targeting, meaning no targeting parameters at all. But first you have to groom your pixel data with lookalikes, retargeting, etc.

  4. Videos are gold.

  5. Play it white hat. The “gurus” who teach you “scaling tactics” with duping and running small ad sets either haven’t advertised in 3 years or are just repeating what someone else told them.

These 5 rules will help any budding FB Advertiser. 

What’s your favorite FB hack?

Before running an ad for my target country, I run the same ad in low-cost countries (African and Asian countries) to gather an insane amount of likes, shares, and comments.

Then I use the same ad to run for my target country. The likes and shares serve as a social proof that the ad is worth watching.

This is a common strategy 🙂 But you don’t have to run the ad in third-world countries; you can simply run it optimized for Engagement in the US (or wherever your target market is). Engagement-optimized campaign CPMs can go as low as under $1.

It’s always better to accumulate social proof (especially comments) from your native country’s users.

How I Scaled An Ecom Brand From $45K To $120K In 30 Days

Your Landing Page/Purchase Flow and your offer.

I rarely see people testing landing pages, and even rarer, I see people talking about offers.

But changing these 2 things allowed me to scale an ecom brand from $45K/m to $120K/m within 30 days.

How?

Improving both Landing Page and Offer resulted in a conversion rate increase from 1.38% to 3.35%.

Let’s dive right into it, and hopefully, you can get something valuable out of this post:

Landing Page/Purchase Flow:

What is the purchase flow?

The purchase flow is each step that a customer has to take to buy the product.

A standard purchase flow usually looks like this:

Product Page – Add to Cart – Cart Page – Checkout – Purchase.

—-

In the brand I’m using in this example, the purchase flow looked like this:

Homepage – Offer $120 AOV Product Bundle (they have the option to add to cart here) – Product Page – Add to Cart – Cart Page – Checkout – Purchase

—–

Which is in itself a rather long flow with a high AOV. Generally speaking, you want to keep your purchase flow as short as possible to prevent drop-offs.

What a short purchase flow may look like:

Product Page – Add to Cart Button – Checkout (Skip cart page) – Purchase

Note: You might want to add upsells on the cart page, so this flow is not always ideal. It could also very well be that you need to explain your product to convince people to buy it, which is why e.g., sending people to a homepage or specific landing page can also be better than sending them straight to the product page. You need to test here.
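
To see why step count matters, here is an illustrative funnel calculation; the per-step continuation rates are made up, so measure your own before acting on this:

    def flow_conversion(step_rates):
        """Overall conversion = product of per-step continuation rates."""
        total = 1.0
        for rate in step_rates:
            total *= rate
        return total

    # product -> ATC -> cart -> checkout -> purchase (hypothetical rates)
    long_flow = [0.08, 0.90, 0.60, 0.50]
    # product -> ATC -> checkout -> purchase (cart page skipped)
    short_flow = [0.08, 0.75, 0.50]

    print(f"Long flow:  {flow_conversion(long_flow):.2%}")   # 2.16%
    print(f"Short flow: {flow_conversion(short_flow):.2%}")  # 3.00%

Every extra step multiplies in another drop-off rate, which is why removing a page can lift the end-to-end conversion rate even when nothing else changes.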

So, the landing page for people who came from Facebook was the homepage, combined with a relatively high-AOV product bundle (2 products) for $120.

This did a decent job at selling the product, and the conversion rate was 1.38%, with an AOV of $120.

So our revenue from 100 visitors looked like this:

(100 × 0.0138) × $120 ≈ $165

So, our RPV (Revenue per visitor) was $1.65 ($165/100)

This offer was not profitable for the client. The overall ROAS was way below the ROAS Targets, and I knew I needed to change something. However, on the ads side of things, everything looked great.

So, here’s what I changed:

  1. Landing Page

First of all, I started by redirecting the traffic to the product page to see if this affected the conversion rate.

This, however, wasn’t a success: the conversion rate didn’t increase significantly. In addition, the Facebook ads were still unprofitable, and I knew a bigger change was needed. So I built a specific landing page for that product bundle.

Since I’m not the greatest at building landing pages or writing landing page copy, here are two excellent guides where I learned a lot:

Landing Page example1

How My Landing Page Structure Looked, In Order:

Hero Banner (With a button that automatically scrolls to buy section)

“Featured In” Part

Why “Product” Part

Reviews Part

Guarantee

Product Buy Section

Reviews

How The Purchase Flow Looked:

Landing Page – Scroll Down – Add to Cart – Cart Page /w new Upsell – Checkout

I followed the structure from the 2 guides above, so if you’re interested in building your own landing page, I highly suggest you check them out!

Note: I always use GemPages for landing pages, so if you’re a Shopify store owner, I’d suggest you use GemPages to build your landing pages. Shogun is also pretty good, but I prefer GemPages.

While the new landing page did a slightly better job of selling (the conversion rate increased from 1.38% to 1.7%) than either the product page or the homepage, this still meant the Facebook ads were only barely profitable. So a more significant change needed to be made.

I changed the offer.

2. The Offer

Before, we were selling a product bundle upfront at a $120 AOV with (by now) a 1.7% conversion rate, which meant we were getting a $2.04 RPV (revenue per visitor).

Here’s what I changed:

I advertised a lower-priced product with a discount on the landing page (the core product) and instead created an in-cart upsell with the old 2nd bundle product. So if customers bought both products, it was basically the same bundle as before.

How the numbers changed:

AOV: Decreased by 10% (which was to be expected) from $120 to $108.

CV Rate: Increased from 1.7% to 3.15%

RPV: Increased from $2.04 to $3.78, which is a huge change.

So from the start ($1.65 per visitor) to the end ($3.78 per visitor), I was able to increase the revenue per visitor by $2.13, which is an increase of 129% just by changing the landing page and offer.
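
The RPV math generalizes to any store: RPV = conversion rate × AOV. Here is a small sketch with this case study’s numbers; note that the post’s quoted rates don’t reconcile exactly, so the final rate below is back-solved from the $3.78 RPV at a $108 AOV and should be read as an approximation:

    def rpv(conversion_rate, aov):
        """Revenue per visitor = conversion rate x average order value."""
        return conversion_rate * aov

    before = rpv(0.0138, 120)  # homepage + $120 bundle
    after = rpv(0.035, 108)    # landing page + in-cart upsell (back-solved rate)
    print(f"Before: ${before:.2f}/visitor, after: ${after:.2f}/visitor, "
          f"lift: {(after - before) / before:.0%}")
    # Before: $1.66/visitor, after: $3.78/visitor, lift: 128%
    # (~129% with the post's rounded $1.65 starting RPV)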

TL;DR: By changing a brand’s landing page and offer, I was able to increase their revenue per visitor by 129%.

I hope this post shows that it’s not only your Facebook ads you need to work on. In the end, your ads and your homepage are connected, and even something as simple as the offer can have a significant impact on your conversion rate.

Facebook Ads: How iOS 14 will affect your campaigns

Campaigns will be affected in a variety of ways including:

  1. Delayed Reporting: Real-time reporting for iOS devices will not be supported, and data may be delayed up to 3 days.

  2. No support for breakdowns: For both app and web conversions, delivery and action breakdowns, such as age, gender, region, and placement will not be supported.

  3. Attribution Changes: The attribution window for all new or active ad campaigns will be set at the ad set level, rather than at the account level. Additionally, going forward, 28-day click-through, 28-day view-through, and 7-day view-through attribution windows will not be supported for active campaigns.

  4. Targeting Limitations: As more people opt out of tracking on iOS 14 devices, the size of your app connections, app activity Custom Audiences, and website Custom Audiences may decrease.

  5. Dynamic Ads Limitations: As more devices update to iOS 14, the size of your retargeting audiences may decrease.

  6. Limited to 8 conversion events per domain: You’ll be restricted to configuring up to 8 unique conversion events per website domain, and ad sets optimizing for a conversion event that’s no longer available will be paused when Facebook implements Apple’s AppTrackingTransparency framework. Businesses that use more than 8 conversion events per domain for optimization or reporting should create an action plan for how to operate with 8 events maximum. (Note: Facebook will automatically configure the events it deems most relevant based on your account’s activity.)

  7. (There’s more, especially for mobile campaigns, but you can read about it at the link at the bottom of my post)

Action Items:

  1. We’ll want to preemptively verify our domain ownership in Business Manager. This will allow us to have authority over which conversion events are eligible for our domain, should we choose to do so: Apple dev verification

  2. We’ll have to be vigilant in terms of keeping these changes in mind when assessing campaign performance. For example, our FB ROAS will likely appear to be lower in the coming days, and we may not be able to simply look at yesterday’s data when assessing performance. Instead, we may need a 3-day window (see the sketch after this list).

  3. This will likely affect Google Ads as well, but I have not seen Google release a document outlining the specific impacts this will have. For now, we can assume that what’s happening to Facebook will be the same for Google.
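On the 3-day window point, here is a minimal sketch of a trailing 3-day ROAS report, assuming a daily spend/revenue export from Ads Manager; the file name and column names are hypothetical.

```python
import pandas as pd

# Hypothetical daily export with columns: date, spend, revenue.
df = pd.read_csv("fb_daily_performance.csv", parse_dates=["date"]).sort_values("date")

# Trailing 3-day sums smooth over the up-to-3-day reporting delay.
rolling = df[["spend", "revenue"]].rolling(window=3).sum()
df["roas_3d"] = rolling["revenue"] / rolling["spend"]

print(df[["date", "roas_3d"]].tail())  # judge performance on the 3-day figure
```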

Details here

How to Make a Good Landing Page: The PPC Advertiser’s Guide

Knowing how to make a good landing page makes a massive difference to your pay-per-click (PPC) advertising campaigns. When you design a landing page that offers a better user experience, you’ll see marked improvements in key metrics, including your Ad Rank (Quality Score & CPC), bounce rate, and conversion rate. As these factors improve, your costs will fall, ultimately helping you earn a higher return on investment (ROI).

In this guide, we’ll show you how to make a good landing page, covering each vital step to make it easy for you to deliver an experience people won’t forget.

What are the most critical aspects when designing a landing page?

When you’re learning how to make a good landing page, you should focus on the following:

  1. Relevancy of landing page

  2. Define your unique selling point (USP)

  3. Show your product/service in action

  4. Tell people what they need to know

  5. Make your landing page mobile-friendly

  6. Simplicity

  7. Make your call to action clear

  8. Remove distractions

  9. Provide transparent policies

  10. Leverage social proof

  11. Minimize loading times

  12. Build engagement

  13. Optimize for voice search

  14. Social Sharing & Feeds.

  15. Test and update

Let’s look at each one in more detail.

1. Relevancy of landing page

Here’s a common mistake in PPC advertising:

You promise one thing in your ad, but when people click it, your landing page fails to deliver that promise. For example, your ad may offer a 10% discount on brake pads, but when people arrive on the landing page, it offers a 5% discount on brake discs.

This inconsistency will deter users, and your business will lose out on possible leads and conversions. You must create relevant landing pages that align with your ads — and with user intent.

2. Define your USP (unique selling point)

Are your ad and landing page closely aligned now?

Good. Now, it’s time to define your unique selling proposition, which is how you differentiate your offer from your competition.

Your ad may address a problem that your target audience needs to solve. With a strong USP, you can show prospects that your product or service is the best solution available.

For example, if you run a quality pizza delivery company and your strength is fast delivery, you must emphasize both your quality and your delivery time on the landing page.

3. Show your product or service in action

Humans are visual creatures. If they see a product or service in action, their appreciation and desire for it will increase.

You can experiment with these ideas to improve engagement on your landing page:

  • Still photos

  • Animated explainer video

  • User tutorial video

  • Carousel shots that highlight specific features

  • Infographic

Showing your product also gives you the chance to explain it in more detail, answering common queries and dispelling doubts before they arise. For example, if your landing page asks users to complete a series of steps, walk them through in a way that keeps their interest active, like this:

Step 1: Fill in the form

Step 2: Get the offer

Step 3: Get Paid

4. Tell people what they need to know

Nowadays, there is zero room for fluffy content, especially in paid advertising. Your ads and landing pages must get to the point – fast!

Use your landing page to explain only vital information that prospects need to know, such as:

  • Benefits of your product or service

  • Pricing and purchasing options

  • Business contact details including physical location and phone number

  • Social media channels and email address

Focus on the essential information to maintain interest and build credibility with your landing pages.

5. Make your landing page mobile-friendly

In the mobile age, nobody wants to deal with confusing websites. Therefore, you must create landing pages that offer smooth and straightforward navigation, right to the point of sign-up.

Make your landing pages mobile-responsive, so users on smartphones and tablets can quickly scan through the page and complete any action that’s required.

Here are a few pointers:

  • Compact images – Make your images small (in dimensions and file size). This speeds up your loading times and makes pages easier to view (see the sketch after this list).

  • Reduce typing demands – Keep things simple for users.

  • Avoid auto-downloads – These annoy users by taking up space on their devices.

  • Avoid auto-play videos – Intrusive audio can embarrass or annoy users, especially if they are watching videos in a public place.

  • Minimize animations – Use color effects and GIFs sparingly to speed up loading times. Provide animation only when it is genuinely needed, for example to demo a feature; otherwise leave it out.
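On the compact-images pointer, much of this can be automated before upload. Here is a minimal sketch using the Pillow imaging library (an assumed tool choice; the file names are placeholders) that caps image width and re-encodes at a mobile-friendly quality:

```python
# A minimal sketch using Pillow (pip install Pillow); file names are placeholders.
from PIL import Image

def compress_for_mobile(src: str, dest: str, max_width: int = 800, quality: int = 75) -> None:
    """Cap the width and re-encode as an optimized JPEG for faster mobile loads."""
    img = Image.open(src).convert("RGB")  # JPEG has no alpha channel
    if img.width > max_width:
        new_height = int(img.height * max_width / img.width)
        img = img.resize((max_width, new_height))
    img.save(dest, "JPEG", quality=quality, optimize=True)

compress_for_mobile("hero-banner.png", "hero-banner-mobile.jpg")
```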

6. Simplicity

Learning how to make a good landing page may seem scary, but here’s the best tip of them all:

Keep it simple.

Here’s how:

  • Simple and direct copy

  • Clear, direct headlines

  • Minimalist design with plenty of white space to enhance the information rather than hide it.

  • A clear call-to-action (CTA) that tells users what you want them to do.

  • Fewer colors

  • High readability

Picture a cluttered landing page next to a simple, clean one; the clean page wins every time.

Keeping it simple will lead to better results in terms of engagement, clicks, and conversions.

7. Make your call to action clear

No landing page is complete without a strong CTA.

Whatever your product or service is, and however you make your offer, you need CTAs at decision points on the page to drive action.

Consider these strategies for better CTAs:

Less is more

It’s a good idea to avoid having too many CTAs. It may be best to use just one at the very bottom of the page. That said, adding another CTA above the fold is a popular choice.

If you decide on that, make sure you also include vital information above the fold, so users have those details to guide their decision.

Make it count

Have you ever seen an action button with the word “submit” on it?

This is a common choice, but not a great one because it lacks strength and inspiration. Instead, you want to incite action.

Create a stronger CTA that gets people to react. For example, “Don’t miss out on your FREE download” is better than “Download now.”

Step-by-step structures

Outline how easy your visitors will find your product or service to use. With clear, easy-to-follow directions, the value of your offer becomes undeniable — and often, irresistible.

8. Remove Distractions

Here’s something you should keep in mind when you want to know how to make a good landing page:

You must focus on a single conversion goal. Just one.

Therefore, anything else that distracts from your goal is surplus. Get rid of all distractions: external links, unnecessary CTAs, images, or information that dilutes your message or lures users away from your landing page.

Ideally, you want to streamline the journey on your landing page to funnel leads to your final CTA.

9. Provide transparent policies

As we move into 2020, consumer privacy concerns are at an all-time high. The data-breach scandals at Facebook, Yahoo, and Quora caused panic, and the General Data Protection Regulation (GDPR) has taken effect, with similar rules following across the globe.

Now, you must be transparent with the processes and practices you use for collecting, storing, and sharing consumer data. If people can’t trust your brand, you’ll never make a sale.

Follow these tips to nurture trust with people:

  • Use a cookie consent banner to notify people that you track on-site behavioral data.

  • Use a terms and conditions page to outline what your business is responsible for, and what it is not.

  • Share your privacy policy, so people understand how you use consumer data.

  • Publish an FAQ page that answers common questions people may have about your brand, and your products and services.

10. Leverage social proof

Imagine your company provides analytics services to major corporations. Once you have one or two big clients in your portfolio, you can leverage those relationships to convince others to convert.

By getting positive reviews, you’ll have strong social proof from happy customers who pay well. That can be enough to sway other top-tier clients.

To maximize this strategy, try to get video testimonials. Video content is much more engaging, and it will be a high-impact addition to your landing page.

11. Minimize loading times

Speed is crucial in the customer journey. Nobody wants to wait around for a slow website to load, especially on mobile.

Here are some tips to slash your loading times:

  • Use Accelerated Mobile Pages (AMP) where appropriate; page speed is an important ranking factor in Google’s mobile and desktop indexes.

  • Use compact-sized images and files.

  • Minify your HTML, CSS, and JavaScript files.

  • Opt for client-side scripting rather than server-side.

  • Use CDNs (content delivery networks)

  • Reduce redirects

  • Enable compression (gzip or Brotli); the sketch after this list checks this along with redirects
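A quick way to sanity-check the last two pointers is to inspect what your landing page actually serves. Here is a minimal sketch using the requests library; the URL is a placeholder.

```python
import requests

# Placeholder URL; substitute your own landing page.
resp = requests.get("https://example.com/landing")

print("Final URL:       ", resp.url)                                    # after any redirects
print("Redirect hops:   ", len(resp.history))                           # fewer is better
print("Content-Encoding:", resp.headers.get("Content-Encoding", "none"))
print("Body size:       ", len(resp.content), "bytes (after decompression)")
```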

12. Build engagement

Shoppers have a lot to choose from online. You need to work hard to convert prospective new customers, tailoring your marketing tools and techniques to engage your site visitors in ways that they appreciate.

For instance, you can harness data insights with a live chatbot feature, or utilize pop-up discounts that cater to each visitor’s interests.

These techniques keep people on your page and make them consider your offer or brand as an option.

13. Optimize for voice search

In 2019, voice search enjoyed significant growth, driven primarily by improvements in voice-enabled technology. Alexa, Siri, Cortana, and Google Assistant are battling to be king of voice-enabled devices, and in the process they are changing search engine optimization.

How?

Well, people who use voice search tend to do things a little differently than those who do a regular text-based search.

So, when you’re thinking of how to make a good landing page in 2020 and beyond, you should think about the following:

Focus on user intent

When people use voice search, they usually have a particular need, such as:

  • The address or opening hours of a store.

  • The price of a specific product.

  • Whether a business offers a specific type of service.

Keep user intent in mind and create content that answers the specific questions people ask.

Google may be a smart search engine, but it needs all the help it can get. The better you optimize your content, the easier it will be for Google to analyze it — and promote it.

Use schema markup

Schema markup makes it easier for search engines to comprehend the content of a webpage. Consider your website, your audience, and your CMS’s editing capabilities to choose the right schema markup that will help you get noticed by voice searchers.
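For illustration, a local business page might embed a JSON-LD block like the one generated below. This is a minimal sketch using only Python’s standard library; the business details are invented placeholders.

```python
import json

# Invented placeholder details; swap in your own business information.
schema = {
    "@context": "https://schema.org",
    "@type": "LocalBusiness",
    "name": "Example Pizza Co.",
    "telephone": "+1-555-0100",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "123 Main St",
        "addressLocality": "Springfield",
    },
    "openingHours": "Mo-Su 11:00-22:00",
}

# Paste the output into a <script type="application/ld+json"> tag on the page.
print(json.dumps(schema, indent=2))
```

Voice assistants and search engines can then read the opening hours, phone number, and address directly, which matches the intent-driven queries listed above.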

Use long-tail keywords

Voice search queries are typically conversational in style, often framed as questions or as full, grammatically correct sentences.

You can incorporate these long-tail, conversational keyword phrases into your landing page content to attract targeted traffic. As a bonus, this well-qualified traffic is often cheaper.

14. Social Sharing & Feeds

Show your social feeds and tweets on your landing page to demonstrate your presence on social media. Once a visitor purchases or converts, make it easy for them to brag about their purchase and share their experience by adding links to all your social media channels. This increases your credibility and your reach on social platforms.

15. Test and update

Like everything else in PPC advertising, your landing pages are not a set-and-forget task. Once you publish your landing pages, you must keep an eye on the analytics to gauge their performance.

Try A/B testing several ideas to determine the most effective version of your landing page. For example, you could test out two versions with different:

  • Headlines

  • Benefits

  • Images

  • CTAs

  • CTA positions

Run variants for a while, gather the data, and then analyze it to identify which version generates more clicks, leads, and conversions.
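When you analyze the results, check that the winner is statistically meaningful rather than noise. Here is a minimal sketch of a two-proportion z-test using only Python’s standard library; the visitor and conversion counts are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def ab_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple:
    """Two-proportion z-test: is variant B's conversion rate really different?"""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided
    return z, p_value

# Made-up counts: variant A converts 138/10,000 (1.38%), variant B 170/10,000 (1.7%).
z, p = ab_test(138, 10_000, 170, 10_000)
print(f"z = {z:.2f}, p = {p:.3f}")  # here p is about 0.07, so not yet conclusive
```

Note that even a lift as large as 1.38% to 1.7% needs roughly ten thousand visitors per variant before it approaches significance, which is why running variants for a while matters.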

This process of testing and monitoring should be ongoing, helping you continually update and improve your landing pages, eliminating flaws, and optimizing strong points to create the best possible user experience.

Remember only to change and test one aspect at a time. This makes it easier to determine the impact of the change. For example, test images one week, then pick the best image. Next week, test headlines, then select the best headline. The following week, test CTAs, etc.

Wrap Up

So, now you know how to make a good landing page. By analyzing these areas and putting in the time and effort to optimize each one, you’re sure to see dramatic improvements.

PPC advertising requires patience and strategy, more so than a big budget. Learning how to optimize your landing pages is crucial to maximizing your ROI.

Is Organic Search Traffic from Blog Posts superior to Google Ads?

In my experience, Google Ads cost me about $0.80 per click. Of course, it depends on the niche, so it might vary.

For $10 I can find someone on Upwork who will write me a 1,000-word blog post. Again, it depends on the niche, but that’s been my experience.

So $10 spent on Google Ads gives me about 12 clicks. Wouldn’t a $10 blog post give me much more traffic than 12 clicks over the years, assuming it has a good headline and maybe some tags?

If I had to bet, I would bet that the blog post would far outperform the Google Ads over time. But I don’t yet have the data, so I’m curious what you think.

Answer: 

The blog probably would get more unique visitors, yes. But are they qualified? Are you selling to them in the blog post? Does your $10-per-article writer understand their needs and have experience writing copy that converts?

With ads you can filter your keywords to find warm customers who are actively looking for a solution; that’s a little harder for articles. E.g., a search for ‘welders in Hackney’ would be a solid term to target with ads, but an article written on that topic probably wouldn’t rank well enough without a lot of research on the companies, their pricing, and their services, plus enough unique, smart content to rank above those services’ own websites.

If your plan is to replace every ad keyword you’re targeting with a $10 blog post, you’ll end up with hundreds of really low-quality articles that Google will recognize as low-effort and out of sync with the searcher’s intent, and you won’t rank for anything.

A blog post written with SEO in mind that ranks for specific keywords will have a good ROI. Just make sure it is quality content, as $10 content is likely to be worth exactly that.
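For the raw break-even on the asker’s numbers: a post only needs to out-earn its cost in clicks, though as noted above, not every click is equally qualified. A minimal sketch using the figures from the question:

```python
ad_cpc = 0.80      # the asker's cost per click on Google Ads
post_cost = 10.00  # price of one outsourced 1,000-word blog post

breakeven_clicks = post_cost / ad_cpc
print(f"The post beats ads on raw traffic after {breakeven_clicks:.1f} clicks")  # 12.5
```

The harder question, as the answer stresses, is whether those organic clicks convert as well as intent-targeted ad clicks.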

What advice would you give someone wanting to learn google ads in 2022?

  • Working on an actual account will teach you more than a course

  • Take a course only to cover the basics; to develop strategies, work on an actual account

  • Always look out for new features in the ads manager, as Google often favors new features and delivers results at lower cost

  • Courses are a great start but nothing beats just running ads. Personally, I think there is more than enough free info on YouTube to last a lifetime, and good info too.

    Learn the basics. Understand each feature in the dashboard. Your general marketing experience with FB will help you.

    I would recommend taking a client up on the offer or running ads for yourself to learn.

  • The best way to learn Google Ads is by doing. Do not buy a course! Google has some free beginner courses (Skillshop); take some of these and then ask an NGO if you can work for them. Google Ads is free for NGOs, so it is a nice way to get to know the interface and everything around it. After that, maybe you can go to an agency, where you could learn a lot.

  • Would you be interested in this idea
    by /u/skumati99 (Entrepreneur) on May 19, 2022 at 9:05 am

    I really enjoy listening to podcasts (about business, marketing and self development mostly). And I have been taking notes for almost 2 years from interviews with big names in business world. My idea is to summarize those valuable interviews bullet points Tips ? Like a 45 minute interview to be summarized in 5 slides max. View Poll submitted by /u/skumati99 [link] [comments]

  • Anyone Have a Script that will Automatically Turn Off All Ads and Extensions if it Contains a Specific Word?
    by /u/butterssucks (Ads on Google, Meta, Microsoft, etc.) on May 19, 2022 at 9:04 am

    Looking for Script that will turn off all ads that contains "Pre-order" or "preorder" automatically when ran submitted by /u/butterssucks [link] [comments]

  • Thank you Thursday! - May 19, 2022
    by /u/AutoModerator (Entrepreneur) on May 19, 2022 at 9:00 am

    Your opportunity to thank the /r/Entrepreneur community by offering free stuff, contests, discounts, electronic courses, ebooks and the best deals you know of. Please consolidate such offers here! Since this thread can fill up quickly, consider sorting the comments by "new" (instead of "best" or "top") to see the newest posts. submitted by /u/AutoModerator [link] [comments]

  • Am I making a mistake?
    by /u/Capable-Raccoon-6371 (Ads on Google, Meta, Microsoft, etc.) on May 19, 2022 at 8:57 am

    I'm using Apple Search Ads to promote an app. I've setup a few campaigns based on expected conversion rate, primarily three campaigns for an Exact, Broad, and AI Search campaign. Each with incrementally lower CPT rates. But I also setup a single Campaign for $1000 at $500 per day as what I call a Sprint. The goal is to have a high CPT and burn through $1000 to get more users for reviews. This is a new app with 0 reviews so I need to acquire them to make my future steady campaigns more effective. Am I thinking incorrectly? Is my Sprint to burn $1000 as a jump start something that makes sense? The CPT is setup at $2 matching my Exact campaign with a projected 70% conversation rate. This will also help me gather analytics about the keywords I am using in my other campaigns quickly. submitted by /u/Capable-Raccoon-6371 [link] [comments]

  • Performance Max - Budget
    by /u/Barnes77_ (Ads on Google, Meta, Microsoft, etc.) on May 19, 2022 at 8:26 am

    Hello! I currently have two performance max campaigns for a couple of weeks with "maximize conversion value" and the truth is that I am very happy with the results it is giving me. The doubt and fear that I have right now is that I want to increase the daily budget and I don't know what strategy to follow, I don't want the campaigns to be spoiled. Google recommends that I increase my budget and set a target ROAS, but I think that with two such new campaigns, it's still early days. What would you do? Thank you submitted by /u/Barnes77_ [link] [comments]

  • Does durring recession street food work
    by /u/galinaultima (Entrepreneur) on May 19, 2022 at 8:00 am

    Your honest opinion, would selling food from foodtruck work durring "dark times"? Or its success depends more of the local culture/ menu..? Does anyone here has experience with selling street food and got "burned"? submitted by /u/galinaultima [link] [comments]

  • Wolfram Alpha for Databases
    by /u/dailyidea (Entrepreneur) on May 19, 2022 at 7:44 am

    Hey guys! Every week I post a business idea here that I also sent out on my newsletter. If you’re interested, the link to that is at the bottom of this. Have any of you used Wolfram Alpha before? As someone taking a few university math classes, it’s really a godsend. I can type in just about any mathematical question, in plain english, and it will (usually) give me a correct, thought out answer with the steps needed to reproduce. It’s incredible. It got me thinking – this concept of parsing a human readable question into a more computer-usable format must be valuable elsewhere. This idea uses that simple idea. The Spark Notes Version I don’t use TL;DRs. But this is close to that. This idea is for a B2B software product that would integrate with your database. It would allow the non-technical teams (marketing, product, sales, etc.) to perform database queries by writing in plain English what info they need. Let’s say a sales team wanted to know how much revenue was generated in the past week. Normally, they would first go to an analyst or a data scientist, who would write the query in SQL (or whatever language they use). Then, the analyst would send either the number or the full report. This is so many touch points for a relatively simple need. Imagine if the sales manager instead could just go into this program, type “sum of revenue in the past week” and get the answer. The Long(er) Version Most database languages really aren’t that far from English. If you don’t know SQL, try to guess what the following will do: SELECT email FROM `customers` WHERE total_spend > 20 It’s really not that complex. It gives you the email address of all customers with total_spend greater than 20. Of course queries can get significantly more complex, and much harder to read. But that doesn’t change the fact that they usually are at least similar to the plain english account of requirements. This idea is just making software that goes in the opposite direction. Take a human-readable input (like “how many new customers did we onboard last week”) and turn it into an SQL query, then return the data. Basically an AI assistant that bridges the gap between the technical and non-technical departments in a firm. Sure, that’s cool. Is it feasible? Probably. We’ve seen recently that AI is shockingly good at interpreting what we need and turning it into code. OpenAI recently released Codex, which can correctly generate entire pieces of code from simple instructions almost 30% of the time! Parsing a request into a structured query is a much simpler feat than that. If you can get the tech side of this down, the marketing is almost done for you. Just hire some salespeople to pitch to B2B buyers. Purchasing cycle and compliance headaches notwithstanding, it is obvious how this program would benefit any company with separate marketing and engineering departments. Marketing teams don’t need to waste engineering resources to perform a simple query. Marketing resources aren’t bottlenecked by having to wait for data. All teams perform better – they are more willing to take a data-based approach, since they no longer have to wait forever to get the data back. It increases revenue, it decreases costs. There’s hardly a better sales pitch imaginable. What do you guys think? Would your business use this? If you liked this idea and want more like it in your inbox (no pressure), you can check out my free newsletter at lp.dailyidea.com submitted by /u/dailyidea [link] [comments]

  • No idea where to start
    by /u/moizbaig920 (Entrepreneur) on May 19, 2022 at 7:39 am

    Ok So I just turned 19 and still live with my parents, I have been thinking to start something of my own for about 5yrs or something but I have no idea where to start or which niche to pick. Would love to hear some ideas from you guys Keeping in mind 1) I am from Pakistan (being in a third world country means most buisness ideas won't work here) 2) my interest is more on the IT and services sector side Thanks submitted by /u/moizbaig920 [link] [comments]

  • Twitter Ads Metrics before launch
    by /u/Kniphe (Ads on Google, Meta, Microsoft, etc.) on May 19, 2022 at 7:10 am

    Hi all, We're tasked from a client to give predicted KPIs for twitter ads among others, they want numbers to set budgets and timelines for a cross platform campaign. Snapchat, Facebook etc all give predictions based on your ad details, but twitter seems to not show predicted reach, impressions, website clicks etc. like the other platforms do as you set your parameters. Are we missing something or does Twitter only provide that data during the ad campaign? Thanks! submitted by /u/Kniphe [link] [comments]

  • Semrush Review [2022]: Is It The Best Choice For You?
    by /u/amusedhearts (Ads on Google, Meta, Microsoft, etc.) on May 19, 2022 at 6:13 am

    https://flexsub.shop/semrush-review-2022-is-it-the-best-choice-for-you/ submitted by /u/amusedhearts [link] [comments]

  • Google Ads Change Sensitivity
    by /u/Loud_Yogurtcloset_90 (Ads on Google, Meta, Microsoft, etc.) on May 19, 2022 at 5:18 am

    Has anybody else noticed how sensitive Google ads is to changes recently? I feel like any changes made seem to put the account into a state of learning causing the CPA to go extremely high for a couple of days. The account doesn't actually say that it's in learning but it's extremely obvious that it is... submitted by /u/Loud_Yogurtcloset_90 [link] [comments]

  • Pre-Launch Campaign Help
    by /u/KFTAw (Entrepreneur) on May 19, 2022 at 4:15 am

    Hello everyone, I am running a pre-launch campaign for my startup candy company. I have made the website and have been sending out DMs to all of my friends. I'm at 70 emails right now, but unless I'm constantly DMing people, it doesn't seem to be growing. Would you please give me some ideas to get my pre-launch to do better? I am working with a graphic designer/photographer to start getting photos of my product, but what would you recommend would have the highest ROI? I'm not against running paid ads or an influencer campaign, but I do not have any social media presence at the moment, as I am waiting for designers to get me some quality content (I'm trash at creating content myself). I have also been looking into joining Facebook groups, but I really just don't know what to post, as the rules all say no advertising. Here's the link in case you want to see the site: www.redrojitos.com submitted by /u/KFTAw [link] [comments]

  • Google Ad Strength
    by /u/Blue_Wizard25 (Ads on Google, Meta, Microsoft, etc.) on May 19, 2022 at 3:54 am

    Did anyone do a test to see if "average" ad strength fared lower than going for "excellent"? Ad strength seems to be a combination of pre-defined rules. Meet them and you get full marks, regardless of your industry or whatever u sell. I really doubt letting them mix and match is better. Usually I pin a DKI line on the first position and it gets really high CTRs. Have not done an official test on this tho submitted by /u/Blue_Wizard25 [link] [comments]

  • Sending out Mass Text To Clientel
    by /u/Chosen_one184 (Entrepreneur) on May 19, 2022 at 2:54 am

    Hi, So my wife is a beautician with an extensive client list and would like to send out mass messages from time to time to inform them of either location changes or promotions etc. Ideally she would want to write the message herself then add clientele in and then send it out. My question, is anyone here familiar with such a service? submitted by /u/Chosen_one184 [link] [comments]

  • Advice on creating a takeout only african restaurant
    by /u/bipleg123 (Entrepreneur) on May 19, 2022 at 2:52 am

    We've been discussing opening a takeout only restaurant for my mom. It's been her dream for a long time, she worked in an african restaurant most of her childhood and teenage years. Although that was in Africa and we are now in U.S.A so totally different environment. We have enough saved up and took out some loans to help start, and we are aware of having no profits for a while when starting. Seeing as there isn't too many african restaurants in our environment as compared to the likes of chinese, japanese, and indian restaurants. Are there any tips any restaurant owner has on what to do? Are we in over our heads about this, with having no experience in an restaurant here in America? Any help or advice would be appreciated. submitted by /u/bipleg123 [link] [comments]

  • Porta Potty Business
    by /u/dapruf (Entrepreneur) on May 19, 2022 at 2:02 am

    I have the possibility to do a joint venture with an established dumpster company. Anyone have any experience with portable toilet businesses or know of anyone? Happy to compensate for a consultation. I know what a crappy business model. submitted by /u/dapruf [link] [comments]

  • Tiktok requires business license on landing page
    by /u/Andrewpg3 (Ads on Google, Meta, Microsoft, etc.) on May 19, 2022 at 1:42 am

    Getting my ads rejected because I don’t have my business license on my landing page. Any way around this without creating an LLC? submitted by /u/Andrewpg3 [link] [comments]

  • Need advice on getting clients
    by /u/Dalcynn (Entrepreneur) on May 19, 2022 at 1:35 am

    My fiancée recently opened an Allstate in Houston. She is fully staffed and has been open for almost a month now. Her branch has only closed 5 new policies and 2 of them were closed herself. I started working for her to try and help as much as I could, but I’ve never been in the insurance game before, so I’ve been trying to be the face people see and network outside of the office. Her employees cold call people all day long via internet leads, but it hasn’t produced a single result even though that’s what other agency owners suggested to her. I have managed to quote a few army buddies and fellow fireman, but they did not switch. We even beat the price they are currently paying AND offered better coverage. I just don’t understand any of this. It’s beyond frustrating and I’m just at a loss at this point on what to do. I need some advice so that I can help her as much as possible so that she doesn’t see her life savings and hard work go down the drain submitted by /u/Dalcynn [link] [comments]

  • How much would you sell automation to a company for?
    by /u/afraidtoleavemystoop (Entrepreneur) on May 19, 2022 at 1:15 am

    More specifically, lets say I could automate 2 people's jobs (my soft "estimate" is 2 - but really its a team of 5 people (and a couple of remote workers from overseas), one of which who's job it is to always be doing this work, and the others who hop in when they can, as you can imagine they get behind on the work that I would automate). Lets say this automation would be able to work 24/7, and produce less errors than the current employees, and can take care of 95% of cases brought up. Lets also say the company is worth ~$150 million, has a great balance sheet, is not a tech company, and is small enough (~100-125 employees) where the CEO is accessible and you can have conversations with him. If I were to go into his office and offer this, how much should I aim for? I'd prefer to not let the company have the program over paying me yearly to license it, as I think I could sell this to multiple businesses who use the same software. submitted by /u/afraidtoleavemystoop [link] [comments]

  • Cooperate with another business under a single name.
    by /u/Khazuk (Entrepreneur) on May 19, 2022 at 1:11 am

    Hi! I have been researching a bit and I can't seem to find my answers, and neither do my business partners. After much research, I know how to approach this in my own country, and I also read something about Co-Branding, but... We're both startups and we're located in different countries in Europe. We both have different brands and names. But we provide similar services that complement each other. As a result of this, most of our online identity, besides our logo, is identical. So we have Company-A (Netherlands) and Company-B (Italy). Company-A is a one-man company(Not a freelancer, has employees) and Company-B is a Limited Liability Company(LLC). Now I was wondering if and how it is possible to legally operate online under the same brand identity. So to the public: Company-C, while remaining as two companies. (Same logo, same name, same contact information, website, etc.) The primary reason why we won't merge is the cross-country administration and benefits our countries provide for startups. TL;DR: In the long run we will likely do a company merge, but for now this is not a feasible solution so we are looking for an alternative to limit our pains in terms of online identity and all caveats and investments that come with it. submitted by /u/Khazuk [link] [comments]

  • "Unknown Error: No error" when trying to boost Instagram posts via Ad Manager
    by /u/LaCaipirinha (Ads on Google, Meta, Microsoft, etc.) on May 19, 2022 at 12:18 am

    If I try to boost posts on my new Instagram account it goes into pending for 24h then as soon as the 24h ticks over, it switches of "currently boosteD" however no money is spent and no views generated. If I try to submit in Ad Manager I get the error message "Unknown Error: No error". Any ideas guys? I am pulling my hair out. submitted by /u/LaCaipirinha [link] [comments]

  • Targeting grocery delivery service to the elderly population?
    by /u/demonology26 (Entrepreneur) on May 19, 2022 at 12:18 am

    The grocery delivery market is very competitive, with big players such as UberEats, Deliveroo, and many smaller startups wanting a slice of the pie. However, most of grocery delivery is targeted towards customers and familial households with access to smartphones and computers, whilst the older generation/elderly population are unable to enjoy these services. We want to target our own local grocery delivery service to elderly homes, retirement centres, and nursing/care homes by allowing them to call us to place an order. They will simply tell us what they want from the store and we will deliver accordingly. We have the logistics planned out, such as payment handling, cash handling, call management, and so forth, so the logistics aren’t the issue. We just want to know if this is a good idea, or if there is indeed a market and demand from the older population? I guess the only way to find out is to try it? submitted by /u/demonology26 [link] [comments]

  • Search terms never cease to amaze!
    by /u/pickYourPass46 (Ads on Google, Meta, Microsoft, etc.) on May 19, 2022 at 12:00 am

    Just wanted to share this laugh. My client who is in the air duct cleaner business came up under the search term for “duct dynasty”. I would I love to keep keyword but not sure it would help the overall performance. But definitely brought back funnier memories of the show. Happy growth to you all! submitted by /u/pickYourPass46 [link] [comments]

  • What is the best free banner ad creation software?
    by /u/MyNameCannotBeSpoken (Ads on Google, Meta, Microsoft, etc.) on May 18, 2022 at 11:56 pm

    I recently downloaded Adobe Express and was satisfied until I discovered there can only be one animation. Which is the best free banner advertising creation software (that allows multiple animations)? submitted by /u/MyNameCannotBeSpoken [link] [comments]

  • [Shipping] How does one ship worldwide with products that are under $20?
    by /u/tzontzonel (Entrepreneur) on May 18, 2022 at 11:27 pm

    Hi everyone, To give you some context on my question. Next month, around 14-16th of June I will be opening my 3d printing small business in the Netherlands. And 90% of my products will be between 4 to 25 euros (excluding custom orders). Then I went and looked at a couple of shipping companies in the Netherlands which are DPD, DHL, PostNL and another one, but I forgot. All of the simulations (i.e. Netherlands-Romania) have given me between 20-30 euros for a 20x10x4 package (which probably means a product between 5 and 10 euros. Excluding the possibility of having multiple manufacturing partners in different parts of the world, how does one ship worldwide with products under the shipping price? submitted by /u/tzontzonel [link] [comments]

  • How do I make money as a 13 year old? Ive been trying to make money for a long time(5 years) and I haven't made any at all. do you guys have any advice short term or long term?
    by /u/Ok_Conflict1704 (Entrepreneur) on May 18, 2022 at 11:23 pm

    Ive been trying to make money and start a business for quite a while now. But I have no money to make money at all. where I live you can start working at 16 meanwhile im 13. ive tried lots of way to make money ive tried to flip free stuff online but I don't have a car to get the stuff im flipping nor the space to house it. after that failed attempt I tried to do surveys online but even if i worked for an entire day I wouldn't make a cent and to be honest I wasted a lot of time trying various server sites. and than I tried learning about shopify and drop shipping but , while I could set up the website for free and put up the product on the website I have no way to advertise because I have no capital , I tried making social media accounts to advertise my stuff but I had no followers whatsoever so I went on the internet and saw A guy who was partnering with people that were semi famous , I tried the same thing nobody responded though.I also tried doing virtual assistant tasks and similar work online but nobody would hire a 13 year old with no experience. I also tried selling candy at school ( this one absolutely failed). I tried to record audio books this also didn't work. I even tried making music but I have 0 music talent (it also wasent for me) this also didnt make me any money .tbh ive lost track of how much stuff i tried, I tried a lot of stuff I saw on the internet and youtube for 5 years (most of it turned out to be scams) .the likelihood of me making money short term is very low so Ive been learning programming (among other long term skills) so I can freelance (im not at this level yet though) and ive been trying to learn about investing and real estate so I can be better prepared in the future. (btw the reason im trying to make money is so my parents don't have to work another labor job as time passes the toll of all that labor slowly piles on my parents and it hurts to see that) anyhoo ive come to this subredit looking for advice whether it be short term (ive pretty much given up on short term for now and ive shifted my focus toward long term because I dont know how to make money short term as a 13 year old) or long term. Thank you very much :). submitted by /u/Ok_Conflict1704 [link] [comments]

  • Health insurance for contractors
    by /u/gbartlettbjj (Entrepreneur) on May 18, 2022 at 11:22 pm

    Hi guys, Not sure if this is the right place to ask so sorry I’m advance if it’s not… I’m a senior software engineer currently employed but I’m thinking of leaving my corporate job to start contracting full time. My wife is also self employed. Any suggestions on health insurance? My wife and I are 30 years old, no health problems, no children. We don’t really go to the doctors except for an annual physical. Trying to figure out what other people do in this situation. Any suggestions or advice would be greatly appreciated! submitted by /u/gbartlettbjj [link] [comments]

  • Google Ads Reps
    by /u/xvalid2 (Ads on Google, Meta, Microsoft, etc.) on May 18, 2022 at 11:11 pm

    Does anyone know the actual tiers of Google Ads reps or “account strategists”? Also which are actual Google employees and which are employed by a third party. And if there’s a way to tell if a specific rep is an actual Google employee? submitted by /u/xvalid2 [link] [comments]

  • Are there situations when using more than 20-30 keywords per ad group is appropriate?
    by /u/RefractHD (Ads on Google, Meta, Microsoft, etc.) on May 18, 2022 at 11:00 pm

    Hello everyone, ​ I am running some search campaigns on Bing, and as I am sure many of you know, Bing likes to recommend a hefty amount of keywords to add into a campaign. Some of which I think may be helpful, but these additions have increased the amount of keywords in my campaigns tremendously, and I always hear people say like "no more than 30 keywords per ad group" but in this case, if the ad copy is broad and covers the umbrella of the keywords within the campaign, does it matter if I go over? ​ Would really appreciate some replies, thanks! submitted by /u/RefractHD [link] [comments]

  • I need to interview an entrepreneur...
    by /u/gamer2412 (Entrepreneur) on May 18, 2022 at 9:08 pm

    Hey y’all I need to conduct an interview with an entrepreneur that I don’t know through family, friends, or previous work experience. I’m having more trouble finding one than I thought, I would really appreciate if I could reach out to anyone with entrepreneurial experience! submitted by /u/gamer2412 [link] [comments]

  • Google Shopping show free shipping over a certain amount or hide shipping
    by /u/sackling (Ads on Google, Meta, Microsoft, etc.) on May 18, 2022 at 8:45 pm

    Our store offers free shipping on orders over $99 total. The way google shopping is setup right now it shows free shipping on the PLA when the item itself is over $99 (as per the rules we have setup in merchant center). but for items below it is showing a table rate that is essentially the highest price possible (shipped to the opposite coast of the country). So in fact shipping is often cheaper than what is being shown in the ad. ​ Is there any way to indicate "free shipping on orders over $99" for those items? Or any better way of showing a lower shipping rate that would not violate policy? submitted by /u/sackling [link] [comments]

  • 21+ Effective Marketing Resources You Need!
    by /u/lazymentors (Entrepreneur) on May 18, 2022 at 8:01 pm

    Hey I’m Jaskaran, Author of The Social Juice. I curate weekly marketing updates and resources for my subscribers. Here are 21+ resources I found in last few weeks! Starting with Books This is Marketing by Seth Godin ( Voted by 137+ Redditors) Cashvertising By Drew Eric Whiteman ( 15 Tweets mentioning this book) Sell like Crazy By Sabri Suby (All over Instagram from 2020) Oglivy on Advertising by David Ogilvy (Best to learn Advertising principles) Building a storybrand by Donald Miller ( Top Marketing Books List on Google) Wining the Story Wars by Seth Godin ( Another Marketing Classic from Seth). Dotcom secrets by Russell Brunson ( Great to learn Internet marketing from Beginning) . Ad week Copywriting Handbook by Joseph sugarman ( Great to Understand Marketing & Advertising in one read). Moving To Newsletters Marketing Examples - Every Monday find Copywriting and marketing advice worth your time. The Social Juice- Get List of Marketing Updates and resources- Every Sunday* VeryGoodCopy - Find copywriting tips and resources shared by professional copywriter Eddie Shleyner. For The Interested - Best Newsletter for Creators and content marketers. [DTC Report](dtcreport.co)- Join 50,000+ readers reading DTC reports and useful Insights for free. Stopping At Blogs Seroundtable - A must Stop and read blog to find out everything related to Google and SEO. Seth’s Blog - If you want to constantly learn from Seth Godin. His blog is a must check out! Honeycopy - Learn Copywriting and storytelling from Cole Schafer. Copyhackers - Another amazing copywriting Blog + their Podcast is also worth listening Hubspot Blog - From Email marketing to Instagram growth, find everything here! 6.Ahrefs Blog - Great SEO advice shared in every post! Ending At Free Resources I found! Nira templates- Find more than 856 free templates related to marketing, planning and product managements. State of Email - A free Guide on state of email in 2022 by Mailmodo. Attentive Library - A free resource to find text marketing examples and get inspiration for your next campaign! SEO for Non SEOs - A 5-day free email course teaching SEO! Content Marketing - Steal this content marketing template for your next campaign! Hey, You liked this? I share marketing resources and updates in my newsletter and you can sign up here to receive those updates every week! If you are thinking why Backlinko and many other blogs, resources aren’t here. They are viral in niche and everyone knows about them. I tried to share the best kept resources helpful for you. Thanks for reading, Part 2 in few weeks! submitted by /u/lazymentors [link] [comments]

  • I quit my job to start my own design company. (Unique from other agencies)
    by /u/SpikeySanju (Entrepreneur) on May 18, 2022 at 8:00 pm

    💬 Little Intro about me Two months ago I quit my job to start my own digital design agency. THISUX was born from a desire to explore new opportunities, hoping to test myself and see what I could build from the ground up. It has always been a dream of mine since college and now that I have done it, I know for sure it's what I want to do going forward. Finally after lot's of struggle I'm starting my own design agency, which is a dream come true. I know that with hard work and passion, I can make it happen. 🎉 Build In Public I’ve been tweeting and keeping a blog about Thisux so that other people can learn from my mistakes. By working in public, I get more motivation to work harder. As someone who believes strongly in open source, we plan to contribute more to the community early on. We raise by lifting others up. I want to follow this approach in our company. ⁉️ Why Choose THISUX? Thisux offers unlimited design services for a flat monthly fee and is suitable for startups, agencies and entrepreneurs. TL;DR 🎨 Unlimited design request. ♾️ Unlimited design revision. 🥰 Unlimited branding. 📝 UX Research & AB Testing for each of your task (No matter what we provide quality service). 📊 Built in Dashboard to request and track your designs. 🗄️ We Offer variety of services. 💰 Flat monthly fee (No hidden charges). 😞 We Failed ProductHunt Launch I was super excited to launch my startup on ProductHunt . I have put so much effort to design, develop the website, creating payment flows & admin panels to track client request precisely with prebuilt templates completely from scratch to provide better user experience. After launching on PH I started tweeting about my launch for supporting me on PH. With 1,550 followers on twitter & 2,400 on LinkedIn I thought I would get decent upvotes on pH. But few hours later my post got only 14 upvotes 😞. I was very disappointed and upset. I had worked hard on this project for three months, but I didn't have any money saved to run my family in case the project failed. 💭 What we learned from our failed ProductHunt launch? Don't ever launch on weekends. Also on Monday! Get to know your followers better. See which times of day they interact with you most on social media. Make sure to take advantage of this information, because 30 close friends interacting with you is more meaningful than 1,000 members ghosting your profile. Remember that your purpose has to be shared in order for others to support you. Keep people in the loop by sharing the details of your product development process. By letting them see how the project is coming along, they'll feel more involved in it and more likely to support you. 🔭 What's next? I'm not giving up. I will keep pushing myself to achieve my goals and get better. I'll continue sharing my stories on social media to help others. 🔊 We're offering a 50% discount for the first 2 months. If you're interested in our service, claim it on our website https://www.thisux.in .Get your own design team for an entire month, at a savings of $1747. This offer ends in 3 days. 🎉 Coupon: THISUX50OFF Also, we appreciate your constructive feedbacks and comments! Wish me good luck 😊❤️... Yours Sincerely, Spikey Sanju submitted by /u/SpikeySanju [link] [comments]

  • I never want to go back to a day job!
    by /u/Oladapo25 (Entrepreneur) on May 18, 2022 at 7:18 pm

    I left 9 - 5 employment several years ago, took the leap into working for myself and after a failed business, pivot and lots of resilience, it's working out; decent income and more freedom. But entrepreneurship is one of the hardest things I've ever done in my life 🙂. It just occured to me recently that one of my top motivation is that I don't want to ever go back to a day job or work for someone full time. I feel it would be hard for me to fit in, and I will likely be a bad employee. Lol. This helps me to work harder, and fight through the numerous challenges I've faced in business. Am I the only one who feel the same way? Even though running a business is f****ing hard and the stability of 9 - 5/ full time employment can be tempting atimes, I'd rather bear the pains than go back to full time employment. I've burnt the boats! submitted by /u/Oladapo25 [link] [comments]

  • Providing workstations to new employees
    by /u/Ollep7 (Entrepreneur) on May 18, 2022 at 6:32 pm

    My company is growing fast. I’m hiring 3 full time positions. I already have two employees that I provided with a computer, and we work with Google Workspace and Zoom a lot. How should I plan to organize the new workstations for those new employees? Do I need to do anything besides providing them with the Office suite, email access, and Zoom? I feel like I should be doing something more. submitted by /u/Ollep7 [link] [comments]

  • Are these books still relevant / best for today in starting a business?
    by /u/15795After (Entrepreneur) on May 18, 2022 at 6:15 pm

    I stumbled upon this post from 7 years ago and was wondering if these are still well recommended / "best" books for today. Any insight? (71) The 5 books I read before starting a profitable business that replaced my day job. : Entrepreneur (reddit.com) The Millionaire Fastlane: Crack the Code to Wealth and Live Rich for a Lifetime The 4-hour Workweek The Launch Pad: Inside Y Combinator, Silicon Valley's Most Exclusive School for Startups The Lean Startup Rework submitted by /u/15795After [link] [comments]

  • Do you want to make a mobile app? Here are some lessons from 7000 design hours. Part 2: The Design
    by /u/ZonderHarry (Entrepreneur) on May 18, 2022 at 5:37 pm

    Hey guys, I’ve been busy traveling for the past few weeks, but after getting a great reception to my earlier post, I’m going to continue releasing my guide on how to build an awesome mobile app. For those of you who don’t know, I’m Harry, the creator of Zonder, the real-world exploration & travel game (www.zonderapp.com). I’ve personally spent over 7000 hours on the design, and many more overseeing and working with other designers and developers. I've made tons of mistakes and had learning moments so you won't have to. My first post, which shows you what to expect when you first decide to start making an app, can be found here: https://old.reddit.com/r/Entrepreneur/comments/uc0a4y/do_you_want_to_make_a_mobile_app_heres_a_guide_to/ In this edition, let’s talk about arguably the most important part of creating an app - the design of the app itself. As business oriented and utility apps have a lower design standard, we’ll focus on the concepts necessary to design the most difficult app possible; the consumer entertainment app. This section will focus on the Introductory and Login screens for maximum conversion, which are some of the most important parts of your app. General Design Concepts: The best way to start off is by compiling a list of all your desired features, then building the navigation flow. There will be 2 important user flows (which screens users can see, in order): The first-time login flow and the standard (recurring) user flow. The big difference between the 2 is that one includes the login and tutorial screens, while the other does not. Bottom nav bars are the standard screen navigation solution, and should have 3-5 of your most important features by screen. Avoid hamburger menus (the 3 line icon) for any features you want people to use. They don’t give people a hint as to what might be behind them, and users won’t feel motivated to open them. Don’t make too many different paths to get to the same screen (unless necessary). This confuses users and prevents them from forming a solid navigational path in their mind. Don’t have multiple screens that look broadly similar to each other. This can also confuse users. If you must have similarly designed screens, use a different color for each. Implement a solid design reference system that contains as many common elements as you can. Some examples can be screen headers, pressable buttons and icons. This system will keep every part of your app looking like it’s part of the same professional product. Only deviate from this system if you have a novel use case that isn’t previously covered. Your design system should be based on the feel of the app you’re trying to make. For example, game or fun apps should have softer, gradiented colors and avoid sharp corners, while professional work-related apps can have sharp-cornered tiles and solid block colors and default text without holder shapes. Icon design also varies significantly between app types - choose the right icon pack for your goal. Intro and Signup Screens: The Intro Screen is the first screen that a new user sees in your app. This screen is very important and should accomplish a few key objectives: It reinforces to the user that they downloaded the right app. Have a graphical representation of your app’s main purpose and features. For example, a travel guide app should have some travel photos or a map here. Do not just have the text of your logo - unless your logo itself reinforces the purpose of your app. It should communicate the high quality of your app. 
I would recommend putting your highest quality graphics and presentation on this screen. Animations are preferred if you can make them. The Intro Screen is extremely important to overall conversion, and a stellar one can increase your account creation rate by double digit %. Many people tend to drop off at the first screen if what they see isn’t up to their standards (it doesn’t meet either point A or B above) I would recommend your Intro Screen to have only 1 option - to go to the next screen. This makes it as easy and brainless as possible for the user to start interacting with your app. Once they start interacting with it, they’re more likely to keep doing so. I would not recommend giving people many things to decide from on the Intro Screen - such as choosing between login options. Also, if you have multiple login options it gives you less space to reach your objectives in point 1. For the Signup Screen, you should include as many social media options as you can for quick signup, including Google, Facebook and Apple. These will all be more effective than manual account creation since it’s much lower effort for the user. Your manual Account Creation process should be on 1 screen, not 2 or multiple screens. That’s because a user will be able to see the beginning AND end of the process at the same time, and be much more motivated to start. The confirm button must say something like “Start” or “Create” or similar - it cannot say “Next”. “Next” implies that they are not finished and they will have to do an unknown amount of additional work to sign up. The Account Creation screen should have a header that reinforces the value of signing up. Instead try something like “Create your personal profile” or “Start shopping in 1-2-3!” Do not use a header like “Sign up” because people will perceive those words as something that benefits your app more than it benefits them. Extremely important: if your app design requires usage of a username, DO NOT ask the user to create a username during the signup process. This action is very thought-intensive for people and can cause a lot of users to drop off. People will spend a lot of time creating the perfect username and may decide to “get back to it later” and never come back. Doubly so if the username they really like is already taken. By contrast, name, email and password are relatively mindless and easy to input. ​ Tutorials: There are 3 different types of tutorials, and they are useful for different purposes (my terms): Direct (3 or 4 screen tutorial) Interactive Tutorial Graduated Tutorial Popups The Direct Tutorial is ideally a 3-screen tutorial that a user can swipe through, while 4 screens is a bit harder to process but acceptable if your app is that complex. Your app needs to be designed so that a user is not completely lost if they don’t read your Direct Tutorial however, as many people mindlessly scroll past them in their excitement to start using the app. An Interactive Tutorial allows users to learn how a feature works by seeing it in action on their screen. An example would be a booking app prompting you to start a search for “hotels” as soon as you get into the app. These are usually not necessary unless you have some sort of unusual feature that isn’t immediately obvious. Many mobile games use a variant of this by letting a player play a tutorial level that includes all the features. Graduated tutorials are for more complex apps that have features that a user may not use right away, therefore making a direct tutorial useless. 
Once a user actually enters that screen for the first time, a popup will usually appear that reminds them of how the feature works. If your app has more than 4 large features, it's advisable to use these popups instead of a longer main tutorial.

Tutorial writing should be concise and stay within 2 lines of text, with 3 lines being less ideal but acceptable. Do not put 4 lines of text into one block. People will readily read 2 lines, and will sometimes read 3. If there are 4 or more lines, try to break them up by putting a graphic in the middle.

I know that most people go into startups to make money, but my goal is to help other creators make better products we can all enjoy. Therefore, I'm happy to help answer any questions from prospective app founders completely free of charge, whether it's about design, hiring, team management or ideation. Just shoot me a DM on reddit anytime. If you'd like to see these tips in action on my app, Zonder, or are interested in playing a real-life exploration game where you can earn XP and level up by traveling or going out, check out my website at www.zonderapp.com where you can find links to the app for both iOS and Android. Feel free to ask me any questions about app design in the comments! submitted by /u/ZonderHarry [link] [comments]

  • Is Craigslist Advertising Any Good?
    by /u/paperclip_specialist (Ads on Google, Meta, Microsoft, etc.) on May 18, 2022 at 5:31 pm

We're in the home improvement industry and one of our clients is wondering about Craigslist ads. Does anyone have any experience with these? submitted by /u/paperclip_specialist [link] [comments]

  • I lead a team of 8 people and I have no idea what I’m doing
    by /u/FunkSlim (Entrepreneur) on May 18, 2022 at 4:51 pm

We're developing an app as of right now. Our lead developer/web designer has begun hiring people under him for that process (it's all out of my league) and we have 2 marketing specialists, an accountant, a graphic designer, an analyst and a manager. This is very much a part-time job we've been working in the evenings and any free time we have. This is nowhere close to paying bills for anyone; we're all working towards this common goal with the hope and belief that it will eventually pay off. Multiple members of my team have come to me with questions of when I project we can work full time and start ACTUALLY getting paid. I don't have an answer, and as a leader it makes me feel incompetent. With where we're at, we could easily maintain our dev on the app while the rest of us started doing something else that could make us an income in the meantime, and potentially help fund the app as well. My thought process is this: I have a team of professionals/specialists with degrees and quality work ethic that would completely support any decision I made for the team (because they know I have all of our best interests at heart), so surely there's something we can do to make a full-time job for the 8 of us. We all work day jobs and if we could replace them with a job the team can do together, that would build a lot of morale and speed up the creation of the app we're working on. The truth is I'm way out of my league here. I'm a young guy, I've only ever held 1 other job where I was expected to have any leadership and now I'm THE leader. Everyone believes in our product, but it requires a lot more than belief - it requires a lot of investment that we don't have. I know I have a responsibility to my team and I don't know how to help them, or myself, to achieve this. We've all had a sit-down and decided that we're all ok with working for a period of time for no income, if that will lead to a sustainable period of time where we do receive income. submitted by /u/FunkSlim [link] [comments]

  • In what increments do you raise your Performance Max budgets? (Without breaking the algo)
    by /u/Euphoric-Priority755 (Ads on Google, Meta, Microsoft, etc.) on May 18, 2022 at 4:43 pm

    Bonus question: how often do you tweak your asset groups? submitted by /u/Euphoric-Priority755 [link] [comments]

  • Some questions about working with a Google Ads Freelancer
    by /u/leosmith66 (Ads on Google, Meta, Microsoft, etc.) on May 18, 2022 at 4:37 pm

    Hello, here are the questions: 1) I am the app owner. Should I create the Google Ads account myself, then add the freelancer as an admin? Or is it better to let them create the account and add me? 2) How is ad-spend money controlled? Am I the only one who can see bank account credentials and control the daily spend? Is the freelancer only allowed to allocate money? Please excuse my ignorance. submitted by /u/leosmith66 [link] [comments]

  • Facebook keeps rejecting ads immediately after publishing.
    by /u/fakerrre (Ads on Google, Meta, Microsoft, etc.) on May 18, 2022 at 4:12 pm

How can I prevent this? Can the Facebook specialist who is offering me an online meeting help me with this? My ads don't violate any rules. submitted by /u/fakerrre [link] [comments]

  • Remarketing for Shoes that help with Foot Pain?
    by /u/bogusjedi (Ads on Google, Meta, Microsoft, etc.) on May 18, 2022 at 4:07 pm

Has anyone in similar industries found a way to use remarketing for products whose primary function isn't necessarily a medical device but is known to help people with ailments? I am constantly playing whack-a-mole with disapprovals and I am wondering if it's just better to sunset the remarketing idea or if there are workarounds with this. Our Google rep has tried to work with an approval engineer to see if we can find what is triggering the flagging - be it our keywords, ad copy or website content. So far it seems to be alluding to keywords, but they haven't been able to confirm. Curious if anyone has any experience on what they have tried or if this is a lost cause. Would/could using P. Max be an alternative for remarketing in this case? submitted by /u/bogusjedi [link] [comments]

  • CPC jumped through the roof after switching to Max Conversions on Google Ads
    by /u/Stephan_Gunville (Ads on Google, Meta, Microsoft, etc.) on May 18, 2022 at 3:38 pm

Hi, I'm newish to this and had a campaign on max clicks for about a month and a half and was getting $1.50 to $2.50 clicks in the home cleaning niche in Ottawa, and had a few conversions, but then I switched to max conversions, and for the first 2 weeks I was getting way more conversions at a similar cost per click, but now just in the past week, Google is getting me $7 clicks... I'm on a $10/day budget as I can't afford more, but I've only received 5 clicks in the last week... wtf is going on? How do I fix this? submitted by /u/Stephan_Gunville [link] [comments]

  • Help me understand Target ROAS please
    by /u/got2b1 (Ads on Google, Meta, Microsoft, etc.) on May 18, 2022 at 2:26 pm

    Why does reducing budget put you in a better position to hit a ROAS target? What are the best practices for using this bidding strategy with smart shopping? submitted by /u/got2b1 [link] [comments]

  • According to your experience, which type of google ad campaign works better than others? Display Campaign or Search Campaign? I know it's based on what you want but what is your suggestion for my targets (job seekers)?
    by /u/Accomplished-Yam-418 (Ads on Google, Meta, Microsoft, etc.) on May 18, 2022 at 2:12 pm

    submitted by /u/Accomplished-Yam-418 [link] [comments]

  • What is the best service to use to track your competitors PPC campaigns/performance?
    by /u/VinoRosso96 (Ads on Google, Meta, Microsoft, etc.) on May 18, 2022 at 1:56 pm

    submitted by /u/VinoRosso96 [link] [comments]

  • Attribution Advice
    by /u/DFWGuy55 (Ads on Google, Meta, Microsoft, etc.) on May 18, 2022 at 1:50 pm

Background - OP is an SMB owner. We use Google Ads. Organic includes blog content, local SEO via GMB (now Maps) and reputation via GMB reviews. Leads originate via phone, Calendly consult bookings and a contact form on our website. We are multi-location. Dilemma - We don't know what is driving our leads. Our leads will simply say "the internet" when asked. Attribution using GTM hasn't proven successful. We have considered call tracking numbers for phone-driven leads. We have considered promo codes or unique offers. Advice requested - What are best practices? submitted by /u/DFWGuy55 [link] [comments]

  • $1,000,000 loan?
    by /u/IconicXIII (Entrepreneur) on May 18, 2022 at 1:47 pm

    How much money does my business need to be making monthly for income in order to obtain a $1,000,000 loan either SBA or through financing (equipment financing/loans). I am trying to expand but unsure what would make me eligible for a loan this size and what income I would need to prove on bank statements. Thanks! submitted by /u/IconicXIII [link] [comments]

  • Optimising for Store Capacity
    by /u/BoysenberryNope (Ads on Google, Meta, Microsoft, etc.) on May 18, 2022 at 1:33 pm

We have stores that take appointments from customers. We can get a view of whether some stores have more capacity for appointments than others. We'd like to reflect this in our ad spend on Google Ads (i.e. direct our spend where it is needed, and not direct traffic from Location X when Store X has a long waiting list for appointments). What is the best way to optimise this? I've heard of location-based bid adjustments and also proximity targets, but I don't understand the difference between the two. Or are there other ways to optimise this? Would appreciate any advice. submitted by /u/BoysenberryNope [link] [comments]

Azure Solutions Architect Expert Certification Questions And Answers Dumps

Azure Solutions Architect Expert Exam Preparation

This exam measures your ability to accomplish the following technical tasks: design identity, governance, and monitoring solutions; design data storage solutions; design business continuity solutions; and design infrastructure solutions.

This blog covers the Designing Microsoft Azure Infrastructure Solutions.

A candidate for this certification should have advanced experience and knowledge of IT operations, including networking, virtualization, identity, security, business continuity, disaster recovery, data platforms, and governance. A professional in this role should manage how decisions in each area affect an overall solution. In addition, they should have experience in Azure administration, Azure development, and DevOps processes.


Skills measured

  • Design identity, governance, and monitoring solutions (25-30%)
  • Design data storage solutions (25-30%)
  • Design business continuity solutions (10-15%)
  • Design infrastructure solutions (25-30%)

Below are the top 50 questions and answers for the AZ-303, AZ-304 and AZ-305 certification exams:

What is one reason to regularly review Azure role assignments?

A. To ensure naming conventions are properly applied.

B. To reduce the risk associated with stale role assignments.

C. To eliminate extra distribution groups that are no longer used.

Answer: B: You should regularly review access of privileged Azure resource roles to reduce the risk associated with stale role assignments.

What is an access package?

A. An access package is a group of users with the access they need to work on a project or perform a task.

B. An access package is a bundle of all the resources with the access a user needs to work on a project or perform their task.

C. An access package is used to create a transitive trust between B2B organizations.

Answer: B:  An access package is a bundle of all the resources with the access a user needs to work on a project or perform their task. For example, you may want to create an Access Package that includes all the applications that developers in your organization need, or all applications to which external users should have access.

How can Discovery and insights for privileged identity management help an organization?

A. Discovery and insights can find privileged role assignments across Azure AD, and then provide recommendations on how to secure them using Azure AD governance features like Privileged Identity Management (PIM).

B. Discovery and insights can find when guests access resources across Azure AD.



C. Discovery and insights can find security group assignments across Azure AD, and then provide recommendations on how to secure them using Azure AD governance features like Privileged Identity Management (PIM).


D. N/A


Answer: A – Discovery and insights can find privileged role assignments across Azure AD, and then provide recommendations on how to secure them using Azure AD governance features like Privileged Identity Management (PIM).

Whether to assign a role to a group instead of to individual users is a strategic decision. When planning, consider assigning a role to a group to manage role assignments when the desired outcome is to delegate assigning the role and what else?

A. You want to use Conditional Access policies.

B. Many Azure resources need to be managed.

C. Many users are assigned to a role.

D. N/A


Answer: C – Managing one group is much easier than managing many individual users.

Which roles can only be assigned using Privileged Identity Management?

A. Permanently active roles.

B. Eligible roles.

C. Transient roles.

D. N/A


Answer: B. – Permanently active roles are the normal roles assigned through Azure Active Directory and Azure resources while eligible roles can only be assigned in Privileged Identity Management.

What is the purpose of the audit logs?

A. Azure AD audit logs provide a comparison of budgeted Azure usage compared to actual.

B. Azure AD audit logs provide records of system activities for compliance reporting.

C. Azure AD audit logs allow customer to monitor activity when provisioning new services within Azure.

D. N/A


Answer: B. – An audit log has a default list view that shows data, like the date and time of the occurrence, the service that logged the occurrence, the category and name of the activity (what), the status of the activity (success or failure), the target, and the initiator/actor (who) of an activity.

Can Azure export logging data to third-party SIEM tools?

A. Yes, Azure supports exporting log data to several common third-party SIEM tools.

B. No, Azure only supports the export to Azure Sentinel.


C. Yes, Splunk is the 3rd Party SIEM Azure can export to.

D. N/A


Answer: A. – Azure can export to many of the most popular SIEM tools. The most common are Splunk, IBM QRadar, and ArcSight.

A Solutions Architect wants to configure email notifications to be sent from Azure AD Domain Services when issues are detected. In Azure, where would this be configured?

A. Azure Portal > Azure Active Directory > Monitoring > Notifications > Add email recipient.

B. Azure Portal > Azure AD Domain Services > Notification settings > Add email recipient.

C. Azure Portal > Notification Hubs > Azure Active Directory > Add email recipient.

D. N/A


Answer: B – The health of an Azure Active Directory Domain Services (Azure AD DS) managed domain is monitored by the Azure platform. The health status page in the Azure Portal shows any alerts for the managed domain. To make sure issues are responded to in a timely manner, email notifications can be configured to report health alerts as soon as they're detected in the Azure AD DS managed domain.

You are architecting a web application that constantly reads and writes important medical imaging data in blob storage.

To ensure the web application is resilient, you have been asked to configure Azure Storage as follows:

  • Protect against a regional disaster.
  • Leverage synchronous replication of storage data across multiple data centers.

How would you configure Azure Storage to meet these requirements?

Geo-zone-redundant storage (GZRS): GZRS provides synchronous replication across three availability zones within the primary region (the ZRS component), plus asynchronous replication to a single physical location in the secondary region, which protects against a regional disaster.

Video for reference: Storage Account Replication

 

You need to ensure your virtual machine boot and data volumes are encrypted. Your virtual machine is already deployed using an Azure marketplace Windows OS image and managed disks. Which tasks should you complete to enable the required encryption?

Configure a Key Vault Access Policy: A Key Vault Access Policy will be required to allow Azure Disk Encryption for volume encryption.

Create an Azure Key Vault: Azure Disk Encryption leverages a Key Vault for the secure storage of cryptographic information.

Video for reference: Azure Disk Encryption

You have configured Azure multi-factor authentication (MFA) for your company. Some staff have reported they are receiving MFA verification requests, even when they didn’t initiate any authentication themselves. They believe this might be hackers.
Which feature would you enable to help protect against this type of security issue?

Fraud alert helps users to protect against MFA verification requests they did not initiate. It provides the ability to report fraudulent attempts, as well as the ability to automatically block users who report fraud.

Reference: Fraud Alert

You are configuring a new storage account using PowerShell. The storage account must support Queue storage. The PowerShell command you are using is as follows:

New-AzStorageAccount -name "tpcstore01" -ResourceGroupName "rg1" -location "auseast" -SkuName "standard_lrs"

Which two arguments could you use to complete the PowerShell command to meet the above requirements?

-Kind "Storage"

General Purpose v1 supports blob, file, queue, table, and disk.

-Kind "StorageV2"

General Purpose v2 supports blob, file, queue, table, disk, and data lake.

You need to ensure your virtual machine boot and data volumes are encrypted. Your virtual machine is already deployed using an Azure marketplace Linux OS image and managed disks.
Which two commands would you use to enable the required encryption?

New-AzKeyvault

Azure Disk Encryption leverages a Key Vault for the secure storage of cryptographic information.

Set-AzVMDiskEncryptionExtension

Azure Disk Encryption leverages a VM extension to enable BitLocker (Windows) or DM-Crypt (Linux) to encrypt boot/OS/data volumes.

CompanyA is planning on making some significant changes to their governance solution. They have asked for your assistance with recommendations and questions. Here are the specific requirements.

– Consistency across subscriptions. It appears each subscription has different policies for the creation of virtual machines. The IT department would like to standardize the policies across the Azure subscriptions.

– Ensure critical storage is highly available. There are several critical applications that use storage. The IT department wants to ensure the storage is made highly available across regions.

– Identify R&D costs. The CTO wants to know how much a new project is costing. The costs are spread out across multiple departments.

– ISO compliance. CompanyA wants to certify that it complies with the ISO 27001 standard. The standard will require resource groups, policy assignments, and templates.

How can CompanyA ensure policies are implemented across multiple subscriptions?

Create a management group and place all the relevant subscriptions in the new management group.
A management group could include all the subscriptions. Then a policy could be scoped to the management group and applied to all the subscriptions.

How can CompanyA ensure applications use geo-redundancy to create highly available storage applications?

Add an Azure policy that requires geo-redundant storage.
An Azure policy can enforce different rules over your resource configurations.

How can CompanyA report all the costs associated with a new product?

Add a resource tag to identify which resources are used for the new product.
Resource tagging provides extra information, or metadata, about your resources. You could then run a cost report on all resources with that tag.

Which governance tool should CompanyA use for the ISO 27001 requirements?

Azure blueprints.
Azure blueprints will deploy all the artifacts for ISO 27001 compliance.

You are configuring an Azure Automation runbook using the Azure sandbox.
For your runbook to work, you need to install a PowerShell module. You would like to minimize the administrative overhead for maintaining and operating your runbook.
Which option should you choose to install an additional PowerShell module?

Navigate to Shared Resources > Modules, and configure the additional module.
Additional PowerShell modules can be added to the sandbox environment for use by your runbooks.

CompanyA is planning on making some significant changes to their identity and access management solution. They have asked for your assistance on some recommendations and questions. Here are the specific requirements.

– Device access to company applications. The CTO has agreed to allow some level of device access. Employees at the company’s retail stores will now be able to access certain company applications. This access, however, should be restricted to only approved devices.

– Company reorganization. A company-wide reorganization has affected many employees. These employees are now in new roles. The IT team needs to ensure users have the correct access based on their new jobs.

– External developer accounts. A new development project requires external software developers to access company data files. The IT team needs to create user accounts for approximately five developers.

– User sign-in attempts. A recent audit of user sign-ins attempts revealed anonymous IP addresses and unusual locations. The IT team wants to require multifactor authentication for these attempted sign-ins.

How can CompanyA ensure that employees at the company’s retail stores can access company applications only from approved tablet devices?

Conditional access: Conditional Access enables you to require users to access your applications only from approved, or managed, devices.

What should CompanyA do to ensure employees have the correct permissions for their job role?

Require an access review: An access review would give managers an opportunity to validate the employees access.

What should CompanyA do to give access to the partner developers?

Invite the developers as guest users to their directory: In business-to-business scenarios, guest user accounts are created. You can then apply the appropriate permissions.

What solution would be best for the user sign-in attempts requirement?

Create a sign-in risk policy: A sign-in risk policy can identify anonymous IP addresses and atypical locations. Multifactor authentication can then be required for these sign-in attempts.

You are working as a network administrator, managing the following virtual networks:

VNET1

  • Location: Australia East

  • Resource group: RG1

  • Address space: 10.1.0.0/16

VNET2

  • Location: Australia Southeast

  • Resource group: RG2

  • Address space: 10.1.0.0/16

You have been asked to connect VNET1 and VNET2, to allow private communication between resources in each virtual network. Do you need to modify either of the two virtual networks before virtual network peering is supported?

Yes: IP address ranges cannot overlap. One of the virtual networks must have its address space changed before VNet peering can be configured.


You are architecting identity management for a hybrid environment, and you plan to use Azure AD Connect with password hash sync (PHS).
It is important that you design the solution to be highly available. How would you implement high availability for the synchronization service?

Configure an additional server with Azure AD Connect in staging mode.

Azure AD Connect can be configured in staging mode, which helps with high availability.

You are responsible for monitoring a major web application for your company. The application is implemented using Azure App Service Web Apps and Application Insights.
The chief marketing officer has asked you to provide information to help analyze user behavior based on a group of characteristics. To start with, it will be a simple query looking at all active users from Australia.
Which of the following would you use to provide this information?

Cohorts leverage analytics queries to analyze users, sessions, events, or operations that have something in common (e.g., location, event, etc.). Reference: App insights

You work for a company with multiple Active Directory domains: exampledomain1.com and test.lab.com. Your company would like to use Azure AD Connect to synchronize your on-premises Active Directory domain, exampledomain1.com, with Azure AD. You do not wish to synchronize test.lab.com.

Which tasks should you complete, requiring minimal administrative effort and causing the least disruption to the existing environment?

Run the Azure AD Connect wizard, and configure Domain and OU filtering.

You are architecting a mission-critical processing solution for your company. The solution will leverage virtual machines for the processing tier, and it is critical that high performance levels are maintained at all times.
You need to leverage a managed disk that guarantees up to 900 MB/s throughput and 2,000 IOPS — but also minimizes costs.
Which of the following would you use within your solution?

Premium SSD Managed Disks:  Premium SSDs provide high performance and low latency, and include guaranteed capacity, IOPS, and throughput.

CompanyA wants to reduce storage costs by reducing duplicate content and, whenever applicable, migrating it to the cloud. The company would like a solution that centralizes maintenance while still providing nation-wide access for customers. Customers should be able to browse and purchase items online even in a case of a failure affecting an entire Azure region. Here are some specific requirements.

  • Warranty document retention. The company's risk and legal teams require that warranty documents be kept for three years.

  • New photos and videos. The company would like each product to have a photo or video to demonstrate the product features.

  • External vendor development. A vendor will create and develop some of the online ecommerce features. The developer will need access to the HTML files, but only during the development phase.

  • Product catalog updates. The product catalog is updated every few months. Older versions of the catalog aren’t viewed frequently but must be available immediately if accessed.

What is the best way for CompanyA to protect their warranty information?

Time-based retention policy: With a time-based retention policy, users can set policies to store data for a specified interval. When a time-based retention policy is in place, objects can be created and read, but not modified or deleted.

What type of storage should CompanyA use for their photos and videos?

Blob storage: Blob storage is best suited to their photos and videos.

What is the best way to provide the developer access to the ecommerce HTML files?

Shared access signatures: Shared access signatures provide secure delegated access. This functionality can be used to define permissions and how long access is allowed.

Which access tier should be used for the older versions of the product catalog?

Cool access tier: The cool access tier is for content that wouldn't be viewed frequently but must be available immediately if accessed.

What tool would you use to identify underutilized and idle Azure resources in order to help reduce overall spend?

Azure Advisor: Advisor helps you optimize and reduce your overall Azure spend by identifying idle and underutilized resources. Reference

You work as a network administrator for a company. You manage several virtual machines within the following virtual network:

  • Name: VNET1
  • Address space: 10.1.0.0/16
  • Subnet: SUBNET1 (10.1.1.0/24)

You need to configure DNS for a VM called VM1, which is located in SUBNET1. DNS should be set to 8.8.8.8. All other VMs must keep their existing settings.

What should you do?

Navigate to the network interface of VM1 > DNS servers, enable Custom DNS servers, and set the value to 8.8.8.8.

Custom DNS can be set at the network interface level, so that the settings only apply for a specific virtual machine.

You are architecting a web application that constantly reads and writes important medical imaging data in blob storage. To ensure the web application is resilient, you have proposed the use of storage account failover. Management has asked you whether any data loss might occur for this solution, in the event of a failover. How would you respond?

There may be data loss, and the extent of data loss can be estimated using the Last Sync Time.

The Last Sync Time property provides an indication of how far the secondary is behind from the primary. This can be used to estimate the extent of data loss that may occur. 

What storage service should you implement for an application that streams video content?

Azure Blobs: Azure blobs are used for storing large amounts of unstructured data, such as documents, images, and video files. This service is best used for streaming audio and video, particularly over HTTP/S.

What storage service should you implement for an application that needs to access data using SMB?

Azure Files: Azure files allow you to create and maintain highly available file shares that are accessible anywhere. They can be considered as a replacement to traditional file servers. They provide SMB access.

You are architecting a mission-critical solution for your company using virtual machines.
The solution must qualify for a Microsoft service level agreement (SLA) of 99.95%.
You deploy your solution to a single virtual machine in an availability set. The virtual machine uses premium storage. Does this meet the required SLA?

No: The virtual machine does use premium storage; however, this only provides a 99.9% SLA.

You are implementing Azure Backup using the Microsoft Azure Backup Server.
Which of the following would you use to allow the server to register with your recovery services vault?

Vault Credentials: Vault Credentials are used by the Microsoft Azure Backup Server software to register with the vault.

You are developing a solution on a server hosted on-premises. The solution needs to access data within Azure Key Vault.
Which two options would you use to ensure the application has access to Azure Key Vault?

Register the application in Azure AD and use a client secret.
To allow an on-premises application to authenticate with Azure AD, it can be registered in Azure AD and given a client secret (or client certificate). If this application was hosted on a supported Azure service, it could have been possible to use a managed identity instead.

Configure an access policy in Azure Key Vault.
To allow access to Key Vault, any identity (application, user, etc.) must be provided permissions using an Access Policy.

You have a Windows virtual machine within Azure, which must be backed up.
You have the following requirements:
– Back up the virtual machine three times per day
– Include system state backups
You configure a backup to a recovery services vault using the Microsoft Azure Recovery Services (MARS) agent.
Does this fulfill the requirements above?

Yes: The Microsoft Azure Recovery Services (MARS) agent can perform backups of files, folders, and system states up to three times a day.

You are planning a migration of machines to Azure from your on-premises Hyper-V host.
You would like to estimate how much it will cost to migrate your operating machines to Azure. Which of the following two items would you include in your migration solution?
The effort required to estimate pricing, and then ultimately go on to perform a migration, should be minimized.

Azure Migrate Project: All migrations (both assessment and migration) require an Azure Migrate Project for the storage of related metadata.

You are implementing Azure Blueprints to help improve standards and compliance for your Azure environment.
You would like to ensure that when an Azure Blueprint is used, a user is assigned ‘owner’ permissions to a specific resource group defined in the blueprint.
Does Azure Blueprints provide this functionality?

Yes: Azure Blueprints includes several different artifacts, one of which is ‘Role Assignment’. This allows a user to be assigned permissions as part of the blueprint definition.

You are planning a migration from on-premises to Azure.
Your on-premises environment is made up of the following:
– VMware hosted virtual machines
– Hyper-V hosted virtual machines
– Physical servers
Will the Azure Migrate: Server Migration tool provided by Microsoft support your environment for migrations to Azure?

Yes, for VMware, Hyper-V, and physical machines. The Azure Migrate: Server Migration tool supports migrating VMware VMs, Hyper-V VMs, and physical servers.

For a new container image you are developing, you need to ensure a local HTML file, index.html, is included in the image. Which command would you include in the Dockerfile?

COPY ./index.html /usr/share/nginx/html

The COPY command can be used within a Dockerfile to copy files and directories from source to destination.

You have developed a financial management application for your company.
It is currently hosted as an Azure App Service Web App within Azure.
To improve security, you need to ensure that the web application is only accessible when users connect from your head-office IP address of 14.78.162.190.
Within the Azure Portal settings for your web app, which section would you use to configure this security?

Networking > Access Restrictions
Access Restrictions allows you to filter inbound connectivity to Azure App service, based on the IP address of the requesting user/service.
This meets the requirements of this scenario, as an Access Restriction could be configured for the Web App. To configure this, an ALLOW rule would be created for the web app (and the management interface, SCM, if needed). Adding the ALLOW rule for the IP address of 14.78.162.190 would automatically create a DENY ALL rule, which will prevent any other network location from accessing this resource.

You are responsible for improving the availability of a web application. The web application has the following characteristics:
– Hosted using Azure App Service.
– Leverages an Azure SQL back-end.
You need to configure Azure SQL Database to meet the following needs:
Must be able to continue operations in the event of a region failure.
Must support automatic failover in the event of failure.
You must recommend a solution that requires the least amount of effort to implement, and can manage in the event of a failover. Which configuration do you recommend?

Azure SQL auto-failover group: Using Azure SQL auto-failover groups provides protection at a geographic scale. By using the read-write listener, an application will seamlessly point to the primary, even in the event of a failover. Azure SQL auto-failover groups simplify the deployment and management of geo-replicated databases. They support replication and failover for one or more databases on Azure SQL Database or Azure SQL Managed Instance. A key benefit of auto-failover groups is the built-in management of DNS for the read and read-write listeners.

You have been asked to implement high availability for an Azure SQL Managed Instance.
The solution is critical, and data loss must be minimized. If the data platform fails you must wait 1 hour before automatic failover occurs.
You must determine: (1) How to configure replication. (2) How to configure the 1 hour delay.

Enable replication using Auto-Failover Groups. Enable the 1 hour delay using the Grace Period.
Auto-Failover Groups are supported by Azure SQL Managed Instances, and the Grace Period is used to define how many hours to wait before an automatic read/write failover occurs.

You are helping to architect a social media application.
The solution must ensure that all users read data in the order it has been completely written.
You propose the use of Cosmos DB. What else do you include in your proposal to meet the requirements?

Cosmos DB Strong Consistency: Strong consistency ensures that reads are guaranteed to return the most recent committed write. This is useful when order matters.

You need to configure high availability for Azure SQL Databases.
You would like the service to include the following:
– Automatic failover policy.
– Ability to manually failover.
– DNS management for primary read/write access.
You configure Azure SQL Active Geo-Replication. Does this meet the requirements?

No: Active Geo-Replication does not include DNS automatically managed for primary read/write access. This is a feature of auto-failover groups. The inclusion of DNS for both the primary read/write endpoint, and the secondary read endpoint, reduces the management overhead for ensuring applications are pointing to the correct resources in the event of a disaster.

Top 100 Data Science and Data Analytics Interview Questions and Answers

Data Science Bias Variance Trade-off

Below are the Top 100 Data Science and Data Analytics Interview Questions and Answers dumps.

What is Data Science? 

Data Science is a blend of various tools, algorithms, and machine learning principles with the goal to discover hidden patterns from the raw data. How is this different from what statisticians have been doing for years? The answer lies in the difference between explaining and predicting: statisticians work a posteriori, explaining the results and designing a plan; data scientists use historical data to make predictions.


How does data cleaning play a vital role in the analysis? 

Data cleaning can help in analysis because:

  • Cleaning data from multiple sources helps transform it into a format that data analysts or data scientists can work with.
  • Data cleaning helps increase the accuracy of the model in machine learning.
  • It is a cumbersome process, because as the number of data sources increases, the time taken to clean the data increases exponentially due to the number of sources and the volume of data generated by these sources.
  • It might take up to 80% of the time just to clean the data, making it a critical part of the analysis task (a minimal cleaning sketch follows below).
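
Following on from the last point, here is a minimal pandas sketch of a typical cleaning pass. The DataFrame, column names, and cleaning rules are hypothetical, purely for illustration:

import pandas as pd

# Hypothetical raw data with the usual problems: duplicates, bad types, missing values
df = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, None],
    "city": [" ottawa", " ottawa", "Toronto ", "montreal", "Ottawa"],
    "amount": ["100", "100", "x", "250", "80"],
})

df = df.drop_duplicates()                                    # remove exact duplicate rows
df["city"] = df["city"].str.strip().str.title()              # normalize text columns
df["amount"] = pd.to_numeric(df["amount"], errors="coerce")  # invalid values become NaN
df["amount"] = df["amount"].fillna(df["amount"].median())    # impute missing amounts
df = df.dropna(subset=["customer_id"])                       # require a customer id
print(df)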

What is linear regression? What do the terms p-value, coefficient, and r-squared value mean? What is the significance of each of these components?

Reference  

Imagine you want to predict the price of a house. That will depend on some factors, called independent variables, such as location, size, and year of construction. If we assume there is a linear relationship between these variables and the price (our dependent variable), then our price is predicted by the following function: Y = a + bX
The p-value in the table is the minimum α (the significance level) at which the coefficient is relevant. The lower the p-value, the more important the variable is in predicting the price. Usually we set a 5% level, so that we have 95% confidence that our variable is relevant.
The p-value is used as an alternative to rejection points to provide the smallest level of significance at which the null hypothesis would be rejected. A smaller p-value means that there is stronger evidence in favor of the alternative hypothesis.
The coefficient value signifies how much the mean of the dependent variable changes given a one-unit shift in the independent variable, while holding other variables in the model constant. This property of holding the other variables constant is crucial because it allows you to assess the effect of each variable in isolation from the others.
R-squared (R²) is a statistical measure that represents the proportion of the variance of a dependent variable that is explained by an independent variable or variables in a regression model.
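
To see these quantities in practice, here is a minimal sketch with statsmodels (one common choice, not the only one) on synthetic house-price data; the fitted summary reports each coefficient, its p-value, and the R-squared value discussed above:

import numpy as np
import statsmodels.api as sm

# Synthetic data: price depends linearly on size, plus noise
rng = np.random.default_rng(0)
size = rng.uniform(50, 200, 100)                            # independent variable X
price = 50_000 + 1_200 * size + rng.normal(0, 20_000, 100)  # dependent variable Y

X = sm.add_constant(size)      # adds the intercept term a in Y = a + bX
model = sm.OLS(price, X).fit()
print(model.summary())         # table of coefficients, p-values and R-squared
print(model.pvalues)           # p-values for the intercept and for b
print(model.rsquared)          # proportion of variance explained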

Credit: Steve Nouri

What is sampling? How many sampling methods do you know? 

Reference

 

Data sampling is a statistical analysis technique used to select, manipulate and analyze a representative subset of data points to identify patterns and trends in the larger data set being examined. It enables data scientists, predictive modelers and other data analysts to work with a small, manageable amount of data about a statistical population to build and run analytical models more quickly, while still producing accurate findings.

Sampling can be particularly useful with data sets that are too large to efficiently analyze in full – for example, in big data analytics applications or surveys. Identifying and analyzing a representative sample is more efficient and cost-effective than surveying the entirety of the data or population.
An important consideration, though, is the size of the required data sample and the possibility of introducing a sampling error. In some cases, a small sample can reveal the most important information about a data set. In others, using a larger sample can increase the likelihood of accurately representing the data as a whole, even though the increased size of the sample may impede ease of manipulation and interpretation.
There are many different methods for drawing samples from data; the ideal one depends on the data set and situation. Sampling can be based on probability, an approach that uses random numbers that correspond to points in the data set to ensure that there is no correlation between points chosen for the sample. Further variations in probability sampling include:


Save 65% on select product(s) with promo code 65ZDS44X on Amazon.com

  • Simple random sampling: Software is used to randomly select subjects from the whole population.
  • Stratified sampling: Subsets of the data sets or population are created based on a common factor, and samples are randomly collected from each subgroup. A sample is drawn from each stratum (using a random sampling method like simple random sampling or systematic sampling). For example, say you need a sample size of 6: two members from each group (yellow, red, and blue) are selected randomly. Make sure to sample proportionally: in this simple example, 1/3 of each group (2/6 yellow, 2/6 red and 2/6 blue) has been sampled. If you have one group that's a different size, make sure to adjust your proportions. For example, if you had 9 yellow, 3 red and 3 blue, a 5-item sample would consist of 3/9 yellow (i.e. one third), 1/3 red and 1/3 blue.
  • Cluster sampling: The larger data set is divided into subsets (clusters) based on a defined factor, then a random sampling of clusters is analyzed. The sampling unit is the whole cluster; instead of sampling individuals from within each group, a researcher will study whole clusters. For example, where the strata are natural groupings by head color (yellow, red, blue) and a sample size of 6 is needed, two of the complete strata are selected randomly (say, groups 2 and 4).
  • Multistage sampling: A more complicated form of cluster sampling, this method also involves dividing the larger population into a number of clusters. Second-stage clusters are then broken out based on a secondary factor, and those clusters are then sampled and analyzed. This staging could continue as multiple subsets are identified, clustered and analyzed.
  • Systematic sampling: A sample is created by setting an interval at which to extract data from the larger population – for example, selecting every 10th row in a spreadsheet of 200 items to create a sample size of 20 rows to analyze.

Sampling can also be based on non-probability, an approach in which a data sample is determined and extracted based on the judgment of the analyst. As inclusion is determined by the analyst, it can be more difficult to extrapolate whether the sample accurately represents the larger population than when probability sampling is used.

Non-probability data sampling methods include:

  • Convenience sampling: Data is collected from an easily accessible and available group.
  • Consecutive sampling: Data is collected from every subject that meets the criteria until the predetermined sample size is met.
  • Purposive or judgmental sampling: The researcher selects the data to sample based on predefined criteria.
  • Quota sampling: The researcher ensures equal representation within the sample for all subgroups in the data set or population (random sampling is not used).

Once generated, a sample can be used for predictive analytics. For example, a retail business might use data sampling to uncover patterns about customer behavior and predictive modeling to create more effective sales strategies.
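
As an illustration, here is a pandas sketch of simple random and stratified sampling (requires pandas 1.1+ for group-wise sampling; the toy population mirrors the yellow/red/blue example above):

import pandas as pd

# Hypothetical population: 3 strata of different sizes (9 yellow, 3 red, 3 blue)
df = pd.DataFrame({
    "group": ["yellow"] * 9 + ["red"] * 3 + ["blue"] * 3,
    "value": range(15),
})

# Simple random sampling: 5 rows chosen uniformly at random
simple = df.sample(n=5, random_state=0)

# Stratified sampling: the same fraction (1/3) drawn from each stratum,
# so strata are represented proportionally (3 yellow, 1 red, 1 blue)
stratified = df.groupby("group").sample(frac=1/3, random_state=0)
print(stratified)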

Credit: Steve Nouri

What are the assumptions required for linear regression?

There are four major assumptions:

  • There is a linear relationship between the dependent variables and the regressors, meaning the model you are creating actually fits the data.
  • The errors or residuals of the data are normally distributed and independent from each other.
  • There is minimal multicollinearity between explanatory variables.
  • Homoscedasticity: the variance around the regression line is the same for all values of the predictor variable.
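
The normality and homoscedasticity assumptions can be checked on a fitted model's residuals. A sketch using statsmodels and scipy on synthetic data (one way of doing it, not the only one):

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan
from scipy import stats

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 200)
y = 3 + 2 * x + rng.normal(0, 1, 200)

X = sm.add_constant(x)
res = sm.OLS(y, X).fit()

# Normality of residuals (Shapiro-Wilk; a large p-value means no evidence against normality)
print(stats.shapiro(res.resid))

# Homoscedasticity (Breusch-Pagan; a large p-value means constant variance is plausible)
print(het_breuschpagan(res.resid, X))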

What is a statistical interaction?

Reference: Statistical Interaction

Basically, an interaction is when the effect of one factor (input variable) on the dependent variable (output variable) differs among levels of another factor. When two or more independent variables are involved in a research design, there is more to consider than simply the “main effect” of each of the independent variables (also termed “factors”). That is, the effect of one independent variable on the dependent variable of interest may not be the same at all levels of the other independent variable. Another way to put this is that the effect of one independent variable may depend on the level of the other independent variable. In order to find an interaction, you must have a factorial design, in which the two (or more) independent variables are “crossed” with one another so that there are observations at every combination of levels of the two independent variables. For example, stress level and practice in memorizing words: together they may produce lower performance.
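
An interaction can be estimated directly in a regression model. Here is a sketch with the statsmodels formula API on synthetic data; the formula score ~ stress * practice expands to both main effects plus their interaction term:

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 200
stress = rng.integers(0, 2, n)    # factor 1 (0 = low, 1 = high)
practice = rng.integers(0, 2, n)  # factor 2 (0 = no, 1 = yes)

# Practice helps, but less so under high stress: an interaction effect
score = 60 + 10 * practice - 5 * stress - 6 * stress * practice + rng.normal(0, 3, n)

df = pd.DataFrame({"score": score, "stress": stress, "practice": practice})
model = smf.ols("score ~ stress * practice", data=df).fit()
print(model.summary())  # the stress:practice coefficient estimates the interaction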

What is selection bias? 

Reference

Selection (or ‘sampling’) bias occurs when the sample data that is gathered and prepared for modeling has characteristics that are not representative of the true, future population of cases the model will see.
That is, active selection bias occurs when a subset of the data is systematically (i.e., non-randomly) excluded from analysis.

Selection bias is a kind of error that occurs when the researcher decides what has to be studied. It is associated with research where the selection of participants is not random. Therefore, some conclusions of the study may not be accurate.

The types of selection bias include:

  • Sampling bias: a systematic error due to a non-random sample of a population, causing some members of the population to be less likely to be included than others, resulting in a biased sample.
  • Time interval: a trial may be terminated early at an extreme value (often for ethical reasons), but the extreme value is likely to be reached by the variable with the largest variance, even if all variables have a similar mean.
  • Data: when specific subsets of data are chosen to support a conclusion, or bad data are rejected on arbitrary grounds, instead of according to previously stated or generally agreed criteria.
  • Attrition: attrition bias is a kind of selection bias caused by attrition (loss of participants), discounting trial subjects/tests that did not run to completion.

What is an example of a data set with a non-Gaussian distribution?

Reference


The Gaussian distribution is part of the Exponential family of distributions, but there are a lot more of them, with the same sort of ease of use, in many cases, and if the person doing the machine learning has a solid grounding in statistics, they can be utilized where appropriate.

  • Binomial, Bin(n, p) – e.g., multiple tosses of a coin: the binomial distribution consists of the probabilities of each of the possible numbers of successes on n trials for independent events that each have a probability p of occurring.
  • Bernoulli: Be(p) = Bin(1, p)
  • Poisson: Pois(λ)
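
A quick numpy sketch of drawing samples from these non-Gaussian distributions (the parameter values are arbitrary, just for illustration):

import numpy as np

rng = np.random.default_rng(4)
binomial = rng.binomial(n=10, p=0.5, size=5)  # successes in 10 coin tosses, Bin(10, 0.5)
bernoulli = rng.binomial(n=1, p=0.3, size=5)  # Bernoulli is Bin(1, p)
poisson = rng.poisson(lam=4.0, size=5)        # Pois(4)
print(binomial, bernoulli, poisson)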

What is bias-variance trade-off?

Bias: Bias is an error introduced in the model due to the oversimplification of the algorithm used (it does not fit the data properly). It can lead to under-fitting.
Low bias machine learning algorithms — Decision Trees, k-NN and SVM
High bias machine learning algorithms — Linear Regression, Logistic Regression

Variance: Variance is an error introduced in the model due to an overly complex algorithm; it performs very well on the training set but poorly on the test set. It can lead to high sensitivity and overfitting.
Possible high variance – polynomial regression

Normally, as you increase the complexity of your model, you will see a reduction in error due to lower bias in the model. However, this only happens until a particular point. As you continue to make your model more complex, you end up over-fitting your model and hence your model will start suffering from high variance.


Bias-Variance trade-off: The goal of any supervised machine learning algorithm is to have low bias and low variance to achieve good prediction performance.

1. The k-nearest neighbor algorithm has low bias and high variance, but the trade-off can be changed by increasing the value of k, which increases the number of neighbors that contribute to the prediction and in turn increases the bias of the model.
2. The support vector machine algorithm has low bias and high variance, but the trade-off can be changed by increasing the C parameter that influences the number of violations of the margin allowed in the training data, which increases the bias but decreases the variance.
3. The decision tree has low bias and high variance; you can decrease the depth of the tree or use fewer attributes.
4. Linear regression has low variance and high bias; you can increase the number of features or use another regression that better fits the data.

There is no escaping the relationship between bias and variance in machine learning. Increasing the bias will decrease the variance. Increasing the variance will decrease bias.
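
To make point 1 concrete, here is a minimal scikit-learn sketch (synthetic data from make_classification, purely illustrative) that sweeps k for a k-NN classifier. Very small k gives near-perfect training accuracy but weaker test accuracy (high variance); very large k raises bias and both scores settle lower:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for k in (1, 5, 25, 100):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr)
    # Small k: low bias / high variance; large k: higher bias / lower variance
    print(k, knn.score(X_tr, y_tr), knn.score(X_te, y_te))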

 

What is a confusion matrix?

The confusion matrix is a 2×2 table that contains the 4 outputs provided by a binary classifier.

A data set used for performance evaluation is called a test data set. It should contain the correct labels and predicted labels. The predicted labels will be exactly the same as the correct labels if the performance of the binary classifier is perfect; in real-world scenarios, the predicted labels usually match only part of the observed labels.
A binary classifier predicts all data instances of a test data set as either positive or negative. This produces four outcomes: true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN). Basic measures such as accuracy, precision, recall (sensitivity) and specificity are derived from these four counts.
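
A minimal scikit-learn sketch (hand-made toy labels, just for illustration) that extracts the four counts and derives the basic measures from them:

from sklearn.metrics import confusion_matrix, accuracy_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]  # observed labels
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]  # binary classifier output

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tn, fp, fn, tp)
print(accuracy_score(y_true, y_pred))   # (TP + TN) / total
print(precision_score(y_true, y_pred))  # TP / (TP + FP)
print(recall_score(y_true, y_pred))     # sensitivity: TP / (TP + FN)

Note that recall here is the same quantity as the sensitivity discussed later in this post.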

What is the difference between “long” and “wide” format data?

In the wide-format, a subject’s repeated responses will be in a single row, and each response is in a separate column. In the long-format, each row is a one-time point per subject. You can recognize data in wide format by the fact that columns generally represent groups (variables).
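
A small pandas sketch (hypothetical repeated-measures data with made-up column names) converting between the two formats:

import pandas as pd

# Wide format: one row per subject, one column per repeated response
wide = pd.DataFrame({
    "subject": ["A", "B"],
    "t1": [5.0, 6.1],
    "t2": [5.4, 6.0],
})

# Wide -> long: one row per subject per time point
long = wide.melt(id_vars="subject", var_name="time", value_name="response")

# Long -> wide again
back = long.pivot(index="subject", columns="time", values="response").reset_index()
print(long)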


What do you understand by the term Normal Distribution?

Data is usually distributed in different ways, with a bias to the left or to the right, or it can all be jumbled up. However, there are chances that data is distributed around a central value without any bias to the left or right, reaching a normal distribution in the form of a bell-shaped curve.


The random variables are distributed in the form of a symmetrical, bell-shaped curve. Properties of Normal Distribution are as follows:

1. Unimodal (Only one mode)
2. Symmetrical (left and right halves are mirror images)
3. Bell-shaped (maximum height (mode) at the mean)
4. Mean, Mode, and Median are all located in the center
5. Asymptotic
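
As a quick numerical check of these properties, here is a small numpy simulation (synthetic data; the specific mean and standard deviation are arbitrary):

import numpy as np

rng = np.random.default_rng(8)
sample = rng.normal(loc=100, scale=15, size=100_000)

print(sample.mean(), np.median(sample))  # mean and median coincide at the center, ~100
print((sample < 100).mean())             # ~0.5: the distribution is symmetric about the mean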

What is correlation and covariance in statistics?

Correlation is considered or described as the best technique for measuring and also for estimating the quantitative relationship between two variables. Correlation measures how strongly two variables are related. Given two random variables, it is the covariance between both divided by the product of the two standard deviations of the single variables, hence always between -1 and 1.


Covariance is a measure that indicates the extent to which two random variables change in tandem. It explains the systematic relationship between a pair of random variables, wherein a change in one variable is reciprocated by a corresponding change in the other variable.
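
Both quantities are a single call away in numpy. A sketch on synthetic data, which also verifies that the correlation equals the covariance divided by the product of the two standard deviations:

import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(size=1000)
y = 0.8 * x + rng.normal(scale=0.5, size=1000)

cov = np.cov(x, y)[0, 1]        # covariance (units depend on x and y)
corr = np.corrcoef(x, y)[0, 1]  # correlation, always between -1 and 1
print(cov, corr, cov / (x.std(ddof=1) * y.std(ddof=1)))  # the last two values match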


What is the difference between Point Estimates and Confidence Interval? 

Point Estimation gives us a particular value as an estimate of a population parameter. Method of Moments and Maximum Likelihood estimator methods are used to derive Point Estimators for population parameters.

A confidence interval gives us a range of values which is likely to contain the population parameter. The confidence interval is generally preferred, as it tells us how likely this interval is to contain the population parameter. This likeliness or probability is called the Confidence Level or Confidence Coefficient, and is represented by 1 − α, where α is the level of significance.
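
As a sketch of both ideas on a synthetic sample, using scipy (one common choice):

import numpy as np
from scipy import stats

# Synthetic sample (the true population mean is 170 here)
rng = np.random.default_rng(1)
sample = rng.normal(loc=170, scale=10, size=50)

point_estimate = sample.mean()                   # point estimate of the mean
ci = stats.t.interval(0.95, df=len(sample) - 1,  # 95% confidence interval for the mean
                      loc=point_estimate,
                      scale=stats.sem(sample))
print(point_estimate, ci)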

What is the goal of A/B Testing?

It is hypothesis testing for a randomized experiment with two variables, A and B.
The goal of A/B Testing is to identify any changes to the web page to maximize or increase the outcome of interest. A/B testing is a fantastic method for figuring out the best online promotional and marketing strategies for your business. It can be used to test everything from website copy to sales emails to search ads. An example of this could be identifying the click-through rate for a banner ad.
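
For example, comparing the click-through rates of two banner-ad variants can be framed as a two-proportion z-test. A sketch with statsmodels, using made-up counts:

from statsmodels.stats.proportion import proportions_ztest

clicks = [200, 240]      # click-throughs for variant A and variant B
views = [10_000, 10_000] # impressions for each variant

stat, p_value = proportions_ztest(count=clicks, nobs=views)
# A p-value below the chosen significance level would be evidence
# that the two click-through rates genuinely differ
print(stat, p_value)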

What is p-value?

When you perform a hypothesis test in statistics, a p-value can help you determine the strength of your results. p-value is the minimum significance level at which you can reject the null hypothesis. The lower the p-value, the more likely you reject the null hypothesis.

What do you understand by statistical power of sensitivity and how do you calculate it? 

Sensitivity is commonly used to validate the accuracy of a classifier (logistic regression, SVM, random forest, etc.). Sensitivity = TP / (TP + FN), i.e. the proportion of actual positives that are correctly identified.

 

Why is Re-sampling done?

A Gentle Introduction to Statistical Sampling and Resampling

  • Sampling is an active process of gathering observations with the intent of estimating a population variable.
  • Resampling is a methodology of economically using a data sample to improve the accuracy and quantify the uncertainty of a population parameter. Some resampling methods even nest one resampling procedure inside another.

Once we have a data sample, it can be used to estimate the population parameter. The problem is that we only have a single estimate of the population parameter, with little idea of the variability or uncertainty in the estimate. One way to address this is by estimating the population parameter multiple times from our data sample. This is called resampling. Statistical resampling methods are procedures that describe how to economically use available data to estimate a population parameter. The result can be both a more accurate estimate of the parameter (such as taking the mean of the estimates) and a quantification of the uncertainty of the estimate (such as adding a confidence interval).

Resampling methods are very easy to use, requiring little mathematical knowledge. A downside of the methods is that they can be computationally very expensive, requiring tens, hundreds, or even thousands of resamples in order to develop a robust estimate of the population parameter.

The key idea is to resample from the original data — either directly or via a fitted model — to create replicate datasets, from which the variability of the quantiles of interest can be assessed without longwinded and error-prone analytical calculation. Because this approach involves repeating the original data analysis procedure with many replicate sets of data, these are sometimes called computer-intensive methods. Each new subsample from the original data sample is used to estimate the population parameter. The sample of estimated population parameters can then be considered with statistical tools in order to quantify the expected value and variance, providing measures of the uncertainty of the estimate. Statistical sampling methods can be used in the selection of a subsample from the original sample.

A key difference is that the process must be repeated multiple times. The problem is that there will be some relationship between the samples, as observations will be shared across multiple subsamples. This means that the subsamples and the estimated population parameters are not strictly independent and identically distributed. This has implications for statistical tests performed downstream on the sample of estimated population parameters, i.e. paired statistical tests may be required.

Two commonly used resampling methods that you may encounter are k-fold cross-validation and the bootstrap.

  • Bootstrap. Samples are drawn from the dataset with replacement (allowing the same observation to appear more than once in a given resample), where those instances not drawn into the data sample may be used for the test set.
  • k-fold Cross-Validation. A dataset is partitioned into k groups, where each group is given the opportunity of being used as a held out test set leaving the remaining groups as the training set. The k-fold cross-validation method specifically lends itself to use in the evaluation of predictive models that are repeatedly trained on one subset of the data and evaluated on a second held-out subset of the data.  
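
As a minimal sketch of the bootstrap described above (the data here is synthetic and purely illustrative), we can resample with replacement many times and summarize the spread of the resulting estimates:

import numpy as np

rng = np.random.default_rng(42)
data = rng.normal(loc=50, scale=5, size=200)  # stand-in for an observed sample

# Draw 1,000 bootstrap resamples and record the mean of each
boot_means = [rng.choice(data, size=len(data), replace=True).mean()
              for _ in range(1000)]

print("Bootstrap estimate of the mean:", np.mean(boot_means))
print("95% confidence interval:", np.percentile(boot_means, [2.5, 97.5]))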

Resampling is done in any of these cases:

  • Estimating the accuracy of sample statistics by using subsets of accessible data or drawing randomly with replacement from a set of data points
  • Substituting labels on data points when performing significance tests
  • Validating models by using random subsets (bootstrapping, cross-validation)

What are the differences between over-fitting and under-fitting?

In statistics and machine learning, one of the most common tasks is to fit a model to a set of training data, so as to be able to make reliable predictions on unseen data.

In overfitting, a statistical model describes random error or noise instead of the underlying relationship.
Overfitting occurs when a model is excessively complex, such as having too many parameters relative to the number of observations. A model that has been overfitted has poor predictive performance, as it overreacts to minor fluctuations in the training data.

Underfitting occurs when a statistical model or machine learning algorithm cannot capture the underlying trend of the data. Underfitting would occur, for example, when fitting a linear model to non-linear data.
Such a model too would have poor predictive performance.

 

How to combat Overfitting and Underfitting?

To combat overfitting:
1. Add noise
2. Feature selection
3. Increase training set
4. L2 (ridge) or L1 (lasso) regularization; L1 can shrink weights all the way to zero (dropping features), whereas L2 only shrinks them toward zero
5. Use cross-validation techniques, such as k folds cross-validation
6. Boosting and bagging
7. Dropout technique
8. Perform early stopping
9. Reduce model complexity (e.g., remove inner layers)
To combat underfitting:
1. Add features
2. Increase training time


What is regularization? Why is it useful?

Regularization is the process of adding a tuning parameter (penalty term) to a model to induce smoothness and prevent overfitting. This is most often done by adding a constant multiple of a norm of the weight vector to the loss: the L1 norm (Lasso, Σ|w|) or the squared L2 norm (Ridge, Σw²). The model predictions should then minimize this regularized loss function on the training set.
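
A minimal sketch of L1 versus L2 regularization with scikit-learn (the synthetic data and alpha values are illustrative, not recommendations):

import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = 3.0 * X[:, 0] + rng.normal(scale=0.5, size=100)  # only feature 0 matters

lasso = Lasso(alpha=0.1).fit(X, y)  # L1: can drive irrelevant weights to zero
ridge = Ridge(alpha=1.0).fit(X, y)  # L2: shrinks weights toward zero

print("Lasso coefficients:", lasso.coef_.round(2))
print("Ridge coefficients:", ridge.coef_.round(2))

Running this typically shows the Lasso zeroing out most of the nine irrelevant coefficients, while the Ridge merely shrinks them.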

What Is the Law of Large Numbers? 

It is a theorem that describes the result of performing the same experiment a large number of times. This theorem forms the basis of frequency-style thinking. It says that the sample mean, the sample variance and the sample standard deviation converge to what they are trying to estimate. According to the law, the average of the results obtained from a large number of trials should be close to the expected value and will tend to become closer to the expected value as more trials are performed.

What Are Confounding Variables?

In statistics, a confounder is a variable that influences both the dependent variable and independent variable.

If you are researching whether a lack of exercise leads to weight gain:
lack of exercise = independent variable
weight gain = dependent variable
A confounding variable here would be any other variable that affects both of these variables, such as the age of the subject.

What is Survivorship Bias?

It is the logical error of focusing on the aspects that survived some process while casually overlooking those that did not, because of their lack of prominence. This can lead to wrong conclusions in numerous ways. For example, during a recession you look only at the businesses that survived and note that they are performing poorly. In fact, they performed better than the rest, which failed and were therefore removed from the time series.

Explain how a ROC curve works?

The ROC curve is a graphical representation of the contrast between true positive rates and false positive rates at various thresholds. It is often used as a proxy for the trade-off between the sensitivity (true positive rate) and false positive rate.

Data Science ROC Curve
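
A minimal sketch of computing ROC points with scikit-learn (the labels and scores are illustrative):

from sklearn.metrics import auc, roc_curve

y_true = [0, 0, 1, 1, 0, 1, 1, 0]                      # illustrative ground truth
y_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.65, 0.5]  # model scores

# roc_curve returns the false/true positive rates at each score threshold
fpr, tpr, thresholds = roc_curve(y_true, y_scores)
print("AUC:", auc(fpr, tpr))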

What is TF/IDF vectorization?

TF-IDF, short for term frequency-inverse document frequency, is a numerical statistic that is intended to reflect how important a word is to a document in a collection or corpus. It is often used as a weighting factor in information retrieval and text mining.

Data Science TF IDF Vectorization

The TF-IDF value increases proportionally to the number of times a word appears in the document but is offset by the frequency of the word in the corpus, which helps to adjust for the fact that some words appear more frequently in general.
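
A minimal TF-IDF sketch using scikit-learn's TfidfVectorizer (the three-document corpus is illustrative; get_feature_names_out assumes a recent scikit-learn):

from sklearn.feature_extraction.text import TfidfVectorizer

corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
]
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(corpus)  # rows = documents, columns = vocabulary
print(vectorizer.get_feature_names_out())
print(X.toarray().round(2))  # common words like "the" receive lower weights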

Python or R – Which one would you prefer for text analytics?

We would prefer Python for the following reasons:
• Python would be the best option because it has the Pandas library, which provides easy-to-use data structures and high-performance data analysis tools.
• R is better suited to general statistics and machine learning than to text analysis specifically.
• Python generally performs faster for most types of text analytics.

How does data cleaning play a vital role in the analysis? 

Data cleaning can help in analysis because:

  • Cleaning data from multiple sources helps transform it into a format that data analysts or data scientists can work with.
  • Data Cleaning helps increase the accuracy of the model in machine learning.
  • It is a cumbersome process because, as the number of data sources increases, the time taken to clean the data increases exponentially with the number of sources and the volume of data they generate.
  • Cleaning data can take up to 80% of a project's time, making it a critical part of the analysis task.

Differentiate between univariate, bivariate and multivariate analysis. 

Univariate analyses are descriptive statistical analysis techniques that involve only one variable at a given point in time. For example, a pie chart of sales by territory involves only one variable, so the analysis can be referred to as univariate analysis.

Bivariate analysis attempts to understand the relationship between two variables at a time, as in a scatterplot. For example, analyzing the volume of sales against spending can be considered an example of bivariate analysis.

Multivariate analysis deals with the study of more than two variables to understand the effect of variables on the responses.

Explain Star Schema

It is a traditional database schema with a central table. Satellite tables map IDs to physical names or descriptions and can be connected to the central fact table using the ID fields; these tables are known as lookup tables and are principally useful in real-time applications, as they save a lot of memory. Sometimes star schemas involve several layers of summarization to recover information faster.

What is Cluster Sampling?

Cluster sampling is a technique used when it becomes difficult to study the target population spread across a wide area and simple random sampling cannot be applied. Cluster Sample is a probability sample where each sampling unit is a collection or cluster of elements.

For example, a researcher wants to survey the academic performance of high school students in Japan. He can divide the entire population of Japan into different clusters (cities). Then the researcher selects a number of clusters depending on his research through simple or systematic random sampling.

What is Systematic Sampling? 

Systematic sampling is a statistical technique where elements are selected from an ordered sampling frame at regular intervals. In systematic sampling, the list is progressed in a circular manner, so once you reach the end of the list, you continue from the top again. The best-known example of systematic sampling is the equal-probability method.

What are Eigenvectors and Eigenvalues? 

Eigenvectors are used for understanding linear transformations. In data analysis, we usually calculate the eigenvectors for a correlation or covariance matrix. Eigenvectors are the directions along which a particular linear transformation acts by flipping, compressing or stretching.
Eigenvalue can be referred to as the strength of the transformation in the direction of eigenvector or the factor by which the compression occurs.
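
A minimal NumPy sketch (the covariance matrix is illustrative): np.linalg.eig returns the eigenvalues and eigenvectors described above:

import numpy as np

cov = np.array([[2.0, 0.8],
                [0.8, 1.0]])  # a toy covariance matrix

eigenvalues, eigenvectors = np.linalg.eig(cov)
print("Eigenvalues:", eigenvalues)      # strength of each direction
print("Eigenvectors:\n", eigenvectors)  # directions of the transformation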

Give Examples where a false positive is important than a false negative?

Let us first understand what false positives and false negatives are:

  • False Positives are the cases where you wrongly classified a non-event as an event a.k.a Type I error
  • False Negatives are the cases where you wrongly classify events as non-events, a.k.a Type II error.

Example 1: In the medical field, assume you have to give chemotherapy to patients. Assume a patient comes to the hospital and tests positive for cancer based on the lab prediction, but he actually doesn't have cancer. This is a case of a false positive. Here it is extremely dangerous to start chemotherapy on this patient when he does not actually have cancer. In the absence of cancerous cells, chemotherapy will damage his normal healthy cells and might lead to severe diseases, even cancer.

Example 2: Let's say an e-commerce company decided to give a $1,000 gift voucher to the customers whom they expect to purchase at least $10,000 worth of items. They send the voucher directly to 100 customers without any minimum purchase condition because they assume they will make at least 20% profit on items sold above $10,000. The issue arises if we send the $1,000 gift vouchers to customers who would not actually have purchased anything but were wrongly flagged as likely to make $10,000 worth of purchases: each such false positive costs the company money.

Give Examples where a false negative important than a false positive? And vice versa?

Example 1 FN: What if a jury or judge decides to set a guilty criminal free?

Example 2 FN: Fraud detection.

Example 3 FP: evaluating a promotional voucher: if many customers are recorded as having used it when in fact they did not, the promotion is judged incorrectly.

Give Examples where both false positive and false negatives are equally important? 

In the Banking industry giving loans is the primary source of making money but at the same time if your repayment rate is not good you will not make any profit, rather you will risk huge losses.
Banks don’t want to lose good customers and at the same point in time, they don’t want to acquire bad customers. In this scenario, both the false positives and false negatives become very important to measure.

What is the Difference between a Validation Set and a Test Set?

A Training Set:
• to fit the parameters i.e. weights

A Validation set:
• part of the training set
• for parameter selection
• to avoid overfitting

A Test set:
• for testing or evaluating the performance of a trained machine learning model, i.e. evaluating its predictive power and generalization.

What is cross-validation?

Reference: k-fold cross validation 

Cross-validation is a resampling procedure used to evaluate machine learning models on a limited data sample. The procedure has a single parameter called k that refers to the number of groups that a given data sample is to be split into; as such, the procedure is often called k-fold cross-validation. When a specific value for k is chosen, it may be used in place of k in the reference to the model, such as k=10 becoming 10-fold cross-validation. It is mainly used in settings where the objective is prediction and one wants to estimate how accurately a model will perform in practice.

Cross-validation is primarily used in applied machine learning to estimate the skill of a machine learning model on unseen data. That is, to use a limited sample in order to estimate how the model is expected to perform in general when used to make predictions on data not used during the training of the model.

It is a popular method because it is simple to understand and because it generally results in a less biased or less optimistic estimate of the model skill than other methods, such as a simple train/test split.

The general procedure is as follows:
1. Shuffle the dataset randomly.
2. Split the dataset into k groups
3. For each unique group:
a. Take the group as a hold out or test data set
b. Take the remaining groups as a training data set
c. Fit a model on the training set and evaluate it on the test set
d. Retain the evaluation score and discard the model
4. Summarize the skill of the model using the sample of model evaluation scores

Data Science Cross Validation

There is an alternative in Scikit-Learn called stratified k-fold, in which the split is made so that each fold contains a representative proportion of every class; plain k-fold gives no such assurance, which can be a problem with a very unbalanced dataset.
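
A minimal sketch contrasting KFold and StratifiedKFold in scikit-learn (the imbalanced synthetic dataset and logistic regression model are illustrative):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, StratifiedKFold, cross_val_score

# 90/10 class imbalance to make stratification matter
X, y = make_classification(n_samples=200, weights=[0.9, 0.1], random_state=0)
model = LogisticRegression(max_iter=1000)

for cv in (KFold(n_splits=5, shuffle=True, random_state=0),
           StratifiedKFold(n_splits=5, shuffle=True, random_state=0)):
    scores = cross_val_score(model, X, y, cv=cv)
    print(type(cv).__name__, scores.mean().round(3))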

What is Machine Learning?

Machine learning is the study of computer algorithms that improve automatically through experience. It is seen as a subset of artificial intelligence. Machine learning explores the study and construction of algorithms that can learn from and make predictions on data. In classical machine learning, you select a model to train and then manually perform feature extraction. Machine learning is used to devise complex models and algorithms that lend themselves to prediction; in commercial use, this is known as predictive analytics.

What is Supervised Learning? 

Supervised learning is the machine learning task of inferring a function from labeled training data. The training data consist of a set of training examples.

Algorithms: Support Vector Machines, Regression, Naive Bayes, Decision Trees, K-nearest Neighbor Algorithm and Neural Networks

Example: If you built a fruit classifier, the labels will be “this is an orange, this is an apple and this is a banana”, based on showing the classifier examples of apples, oranges and bananas.

What is Unsupervised learning?

Unsupervised learning is a type of machine learning algorithm used to draw inferences from datasets consisting of input data without labelled responses.

Algorithms: Clustering, Anomaly Detection, Neural Networks and Latent Variable Models

Example: With the same fruit data but no labels, a clustering algorithm might group the fruits into categories such as "fruits with soft skin and lots of dimples", "fruits with shiny hard skin" and "elongated yellow fruits".

What are the various Machine Learning algorithms?

Machine Learning Algorithms

What is “Naive” in a Naive Bayes?

Reference: Naive Bayes Classifier on Wikipedia

Naive Bayes methods are a set of supervised learning algorithms based on applying Bayes' theorem with the "naive" assumption of conditional independence between every pair of features given the value of the class variable. Bayes' theorem states the following relationship, given class variable y and dependent feature vector x1 through xn:

Machine Learning Algorithms Naive Bayes

P(y | x1, …, xn) = P(y) · P(x1, …, xn | y) / P(x1, …, xn), which under the naive independence assumption factorizes to P(y) · Π P(xi | y) / P(x1, …, xn).

What is PCA (Principal Component Analysis)? When do you use it?

Reference: PCA on wikipedia

Principal component analysis (PCA) is a statistical method used in machine learning. It consists of projecting data from a higher-dimensional space into a lower-dimensional space while maximizing the variance captured along each retained dimension.

The process works as follows. We define a matrix A with n rows (the single observations of a dataset: in a tabular format, each single row) and p columns, our features. For this matrix we construct a variable space with as many dimensions as there are features; each feature represents one coordinate axis. For each feature, the length is standardized according to a scaling criterion, normally by scaling to unit variance. It is essential to scale the features to a common scale, otherwise the features with a greater magnitude will weigh more in determining the principal components.

Once all the observations are plotted and the mean of each variable computed, that mean is represented by a point in the center of the plot (the center of gravity). We then subtract the mean from each observation, shifting the coordinate system so that its center is at the origin. The best-fitting line through this cloud of points is the line that best accounts for the shape of the point swarm; it represents the direction of maximum variance in the data. Each observation may be projected onto this line in order to get a coordinate value along the PC line; this value is known as a score. The next best-fitting line can be similarly chosen from the directions perpendicular to the first.
Repeating this process yields an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called principal components.

Machine Learning Algorithms PCA

PCA is mostly used as a tool in exploratory data analysis and for making predictive models. It is often used to visualize genetic distance and relatedness between populations.
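
A minimal PCA sketch with scikit-learn that mirrors the steps above, including the scaling step the text insists on (the Iris dataset stands in for any tabular data):

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X = load_iris().data
X_scaled = StandardScaler().fit_transform(X)  # center and scale to unit variance

pca = PCA(n_components=2)
scores = pca.fit_transform(X_scaled)  # coordinates of each observation on the PCs
print("Explained variance ratio:", pca.explained_variance_ratio_.round(3))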

SVM (Support Vector Machine)  algorithm

Reference: SVM on wikipedia

Classifying data is a common task in machine learning. Suppose some given data points each belong to one of two classes, and the goal is to decide which class a new data point will be in. In the case of support-vector machines, a data point is viewed as a p-dimensional vector (a list of p numbers), and we want to know whether we can separate such points with a (p − 1)-dimensional hyperplane. This is called a linear classifier. There are many hyperplanes that might classify the data. One reasonable choice as the best hyperplane is the one that represents the largest separation, or margin, between the two classes. So, we choose the hyperplane so that the distance from it to the nearest data point on each side is maximized. If such a hyperplane exists, it is known as the maximum-margin hyperplane, and the linear classifier it defines is known as a maximum-margin classifier; or equivalently, the perceptron of optimal stability.

  • SVMs are helpful in text and hypertext categorization, as their application can significantly reduce the need for labeled training instances in both the standard inductive and transductive settings.
  • Some methods for shallow semantic parsing are based on support vector machines.
  • Classification of images can also be performed using SVMs. Experimental results show that SVMs achieve significantly higher search accuracy than traditional query refinement schemes after just three to four rounds of relevance feedback.
  • Classification of satellite data like SAR data using supervised SVM.
  • Hand-written characters can be recognized using SVM.

What are the support vectors in SVM? 

Machine Learning Algorithms Support Vectors

In the diagram, we see that the sketched lines mark the distance from the classifier (the hyper plane) to the closest data points called the support vectors (darkened data points). The distance between the two thin lines is called the margin.

To extend SVM to cases in which the data are not linearly separable, we introduce the hinge loss function, max(0, 1 − yi(w · xi − b)). This function is zero if x lies on the correct side of the margin. For data on the wrong side of the margin, the function's value is proportional to the distance from the margin.

What are the different kernels in SVM?

There are four commonly used types of kernels in SVM:
1. Linear kernel
2. Polynomial kernel
3. Radial basis function (RBF) kernel
4. Sigmoid kernel
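
A minimal sketch comparing the four kernels listed above with scikit-learn's SVC (the synthetic dataset is illustrative):

from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, random_state=0)
for kernel in ("linear", "poly", "rbf", "sigmoid"):
    score = cross_val_score(SVC(kernel=kernel), X, y, cv=5).mean()
    print(f"{kernel}: {score:.3f}")  # cross-validated accuracy per kernel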

What are the most known ensemble algorithms? 

Reference: Ensemble Algorithms

The most popular tree-based ensemble algorithms are AdaBoost, Random Forest, and eXtreme Gradient Boosting (XGBoost).

AdaBoost is best used in a dataset with low noise, when computational complexity or timeliness of results is not a main concern and when there are not enough resources for broader hyperparameter tuning due to lack of time and knowledge of the user.

Random forests should not be used when dealing with time series data or any other data where look-ahead bias should be avoided, and the order and continuity of the samples need to be ensured. This algorithm can handle noise relatively well, but more knowledge from the user is required to adequately tune the algorithm compared to AdaBoost.

The main advantages of XGBoost are its lightning speed compared to other algorithms, such as AdaBoost, and its regularization parameter, which successfully reduces variance. Even aside from the regularization parameter, the algorithm leverages a learning rate (shrinkage) and subsamples from the features like random forests, which increases its ability to generalize even further. However, XGBoost is more difficult to understand, visualize and tune compared to AdaBoost and random forests. There is a multitude of hyperparameters that can be tuned to increase performance.

What is Deep Learning?

Deep Learning is a paradigm of machine learning that has shown incredible promise in recent years, in part because it draws a strong analogy with the functioning of neurons in the human brain.

Deep Learning

What is the difference between machine learning and deep learning?

Deep learning & Machine learning: what’s the difference?

Machine learning is a field of computer science that gives computers the ability to learn without being explicitly programmed. Machine learning can be categorized in the following four categories.
1. Supervised machine learning,
2. Semi-supervised machine learning,
3. Unsupervised machine learning,
4. Reinforcement learning.

Deep Learning is a subfield of machine learning concerned with algorithms inspired by the structure and function of the brain called artificial neural networks.

Machine Learning vs Deep Learning

• The main difference between deep learning and machine learning is the way data is presented to the system. Machine learning algorithms almost always require structured data, while deep learning networks rely on layers of ANN (artificial neural networks).

• Machine learning algorithms are designed to "learn" to act by understanding labeled data and then use it to produce new results with more datasets. However, when the result is incorrect, there is a need to "teach them". Because machine learning algorithms require labeled data, they are not suitable for solving complex queries that involve a huge amount of data.

• Deep learning networks do not require human intervention, as the multilevel layers in neural networks place data in a hierarchy of different concepts, and the networks ultimately learn from their own mistakes. However, even they can be wrong if the data quality is not good enough.

• Data decides everything. It is the quality of the data that ultimately determines the quality of the result.

• Both of these subsets of AI are somehow connected to data, which makes it possible to represent a certain form of “intelligence.” However, you should be aware that deep learning requires much more data than a traditional machine learning algorithm. The reason for this is that deep learning networks can identify different elements in neural network layers only when more than a million data points interact. Machine learning algorithms, on the other hand, are capable of learning by pre-programmed criteria.

What is the reason for the popularity of Deep Learning in recent times? 

Although Deep Learning has been around for many years, the major breakthroughs from these techniques came only in recent years. This is because of two main reasons:
• The increase in the amount of data generated through various sources
• The growth in hardware resources required to run these models
GPUs are many times faster than CPUs for this workload and help us build bigger and deeper deep learning models in comparatively less time than was required previously.

What is reinforcement learning?

Reinforcement Learning allows an agent to take actions so as to maximize cumulative reward. The agent learns by trial and error through a reward/penalty system: the environment rewards the agent, so over time the agent makes better decisions.
Example: robot = agent, maze = environment. RL is used for complex tasks such as self-driving cars and game AI.

RL is a series of time steps in a Markov Decision Process:

1. Environment: the space in which the RL agent operates
2. State: data describing the agent's current situation, resulting from its past actions
3. Action: the action taken by the agent
4. Reward: the numeric feedback the agent receives after its last action
5. Observation: data about the environment, which can be fully visible or partially hidden

What are Artificial Neural Networks?

Artificial Neural Networks are a specific set of algorithms that have revolutionized machine learning. They are inspired by biological neural networks. Neural networks can adapt to changing inputs, so the network generates the best possible result without needing to redesign the output criteria.

Artificial Neural Networks work on the same principle as a biological neural network. They consist of inputs which get processed with weighted sums and a bias, with the help of activation functions.

Machine Learning Artificial Neural Network

How Are Weights Initialized in a Network?

There are two methods here: we can either initialize the weights to zero or assign them randomly.

Initializing all weights to 0: This makes your model similar to a linear model. All the neurons and every layer perform the same operation, giving the same output and making the deep net useless.

Initializing all weights randomly: Here, the weights are assigned randomly by initializing them very close to 0. It gives better accuracy to the model since every neuron performs different computations. This is the most commonly used method.

What Is the Cost Function? 

Also referred to as "loss" or "error", the cost function is a measure used to evaluate how good your model's performance is. It's used to compute the error of the output layer during backpropagation. We push that error backwards through the neural network and use it during the different training functions.
The best-known one is the mean squared error.

Machine Learning Cost Function

What Are Hyperparameters?

With neural networks, you’re usually working with hyperparameters once the data is formatted correctly.
A hyperparameter is a parameter whose value is set before the learning process begins. It determines how a network is trained and the structure of the network (such as the number of hidden units, the learning rate, epochs, batches, etc.).

What Will Happen If the Learning Rate is Set inaccurately (Too Low or Too High)? 

When your learning rate is too low, training of the model will progress very slowly, as we are making minimal updates to the weights. It will take many updates to reach the minimum point.
If the learning rate is set too high, the loss function shows undesirable divergent behaviour due to drastic updates in weights. Training may then fail to converge to a good solution, or even diverge entirely.

What Is The Difference Between Epoch, Batch, and Iteration in Deep Learning? 

Epoch – Represents one iteration over the entire dataset (everything put into the training model).
Batch – Refers to when we cannot pass the entire dataset into the neural network at once, so we divide the dataset into several batches.
Iteration – if we have 10,000 images as data and a batch size of 200, then an epoch runs 50 iterations (10,000 divided by 200).

What Are the Different Layers on CNN?

Reference: Layers of CNN 

Machine Learning Layers of CNN

Convolutional neural networks are regularized versions of the multilayer perceptron (MLP). They were developed based on the working of the neurons of the animal visual cortex.

The objective of using the CNN:

The idea is that you give the computer this array of numbers and it will output numbers that describe the probability of the image being a certain class (.80 for a cat, .15 for a dog, .05 for a bird, etc.). It works similar to how our brain works. When we look at a picture of a dog, we can classify it as such if the picture has identifiable features such as paws or 4 legs. In a similar way, the computer is able to perform image classification by looking for low-level features such as edges and curves and then building up to more abstract concepts through a series of convolutional layers. The computer uses low-level features obtained at the initial levels to generate high-level features such as paws or eyes to identify the object.

There are four layers in CNN:
1. Convolutional Layer – the layer that performs a convolutional operation, creating several smaller picture windows to go over the data.
2. Activation Layer (ReLU Layer) – it brings non-linearity to the network and converts all the negative pixels to zero. The output is a rectified feature map. It follows each convolutional layer.
3. Pooling Layer – pooling is a down-sampling operation that reduces the dimensionality of the feature map. Stride = how much you slide, and you get the max of the n x n matrix
4. Fully Connected Layer – this layer recognizes and classifies the objects in the image.

What Is Pooling on CNN, and How Does It Work?

Pooling is used to reduce the spatial dimensions of a CNN. It performs down-sampling operations to reduce the dimensionality and creates a pooled feature map by sliding a filter matrix over the input matrix.

What are Recurrent Neural Networks (RNNs)? 

Reference: RNNs

RNNs are a type of artificial neural network designed to recognize patterns in sequences of data, such as time series from stock markets or government agencies.

Recurrent Neural Networks (RNNs) add an interesting twist to basic neural networks. A vanilla neural network takes in a fixed size vector as input which limits its usage in situations that involve a ‘series’ type input with no predetermined size.

Machine Learning RNN

RNNs are designed to take a series of input with no predetermined limit on size. One could ask what's the big deal, I can call a regular NN repeatedly too?

Machine Learning Regular NN

Sure can, but the ‘series’ part of the input means something. A single input item from the series is related to others and likely has an influence on its neighbors. Otherwise it’s just “many” inputs, not a “series” input (duh!).
Recurrent Neural Network remembers the past and its decisions are influenced by what it has learnt from the past. Note: Basic feed forward networks “remember” things too, but they remember things they learnt during training. For example, an image classifier learns what a “1” looks like during training and then uses that knowledge to classify things in production.
While RNNs learn similarly while training, in addition, they remember things learnt from prior input(s) while generating output(s). RNNs can take one or more input vectors and produce one or more output vectors and the output(s) are influenced not just by weights applied on inputs like a regular NN, but also by a “hidden” state vector representing the context based on prior input(s)/output(s). So, the same input could produce a different output depending on previous inputs in the series.

Machine Learning Vanilla NN

In summary, in a vanilla neural network, a fixed size input vector is transformed into a fixed size output vector. Such a network becomes “recurrent” when you repeatedly apply the transformations to a series of given input and produce a series of output vectors. There is no pre-set limitation to the size of the vector. And, in addition to generating the output which is a function of the input and hidden state, we update the hidden state itself based on the input and use it in processing the next input.

What is the role of the Activation Function?

The activation function is used to introduce non-linearity into the neural network, helping it to learn more complex functions. Without it, the neural network would only be able to learn linear functions, i.e. linear combinations of its input data. An activation function is a function in an artificial neuron that delivers an output based on inputs.

Machine Learning libraries for various purposes

Machine Learning Libraries

What is an Auto-Encoder?

Reference: Auto-Encoder

Auto-encoders are simple learning networks that aim to transform inputs into outputs with the minimum possible error. This means that we want the output to be as close to input as possible. We add a couple of layers between the input and the output, and the sizes of these layers are smaller than the input layer. The auto-encoder receives unlabeled input which is then encoded to reconstruct the input. 

An autoencoder is a type of artificial neural network used to learn efficient data coding in an unsupervised manner. The aim of an autoencoder is to learn a representation (encoding) for a set of data, typically for dimensionality reduction, by training the network to ignore signal “noise”. Along with the reduction side, a reconstructing side is learnt, where the autoencoder tries to generate from the reduced encoding a representation as close as possible to its original input, hence its name. Several variants exist to the basic model, with the aim of forcing the learned representations of the input to assume useful properties.
Autoencoders are effectively used for solving many applied problems, from face recognition to acquiring the semantic meaning of words.

Machine Learning Auto_Encoder

What is a Boltzmann Machine?

Boltzmann machines have a simple learning algorithm that allows them to discover interesting features that represent complex regularities in the training data. The Boltzmann machine is basically used to optimize the weights and the quantity for the given problem. The learning algorithm is very slow in networks with many layers of feature detectors. The "Restricted Boltzmann Machine" algorithm has a single layer of feature detectors, which makes it faster than the rest.

Machine Learning Boltzmann Machine

What Is Dropout and Batch Normalization?

Dropout is a technique of randomly dropping out hidden and visible nodes of a network to prevent overfitting (typically dropping 20 per cent of the nodes). It roughly doubles the number of iterations needed for the network to converge. It is used to avoid overfitting, as it improves the capacity for generalization.

Batch normalization is a technique to improve the performance and stability of neural networks by normalizing the inputs of every layer so that they have a mean output activation of zero and a standard deviation of one.
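
A minimal Keras sketch (assuming TensorFlow is installed; the layer sizes are illustrative) showing where Dropout and BatchNormalization typically sit in a network:

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(20,)),
    layers.Dense(64, activation="relu"),
    layers.BatchNormalization(),  # normalize this layer's activations
    layers.Dropout(0.2),          # randomly drop 20% of the nodes while training
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.summary()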

Why Is TensorFlow the Most Preferred Library in Deep Learning?

TensorFlow provides both C++ and Python APIs, making it easier to work on and has a faster compilation time compared to other Deep Learning libraries like Keras and PyTorch. TensorFlow supports both CPU and GPU computing devices.

What is Tensor in TensorFlow?

A tensor is a mathematical object represented as an array of higher dimensions: think of an n-dimensional matrix. These arrays of data with different dimensions and ranks, fed as input to the neural network, are called "tensors".

What is the Computational Graph?

Everything in TensorFlow is based on creating a computational graph: a network of nodes in which each node performs an operation. Nodes represent mathematical operations, and edges represent tensors. Since data flows through the graph, it is also called a "DataFlow Graph".
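
A minimal TensorFlow 2.x sketch: tf.function traces ordinary Python code into a dataflow graph whose nodes are operations and whose edges are tensors:

import tensorflow as tf

@tf.function  # traced into a computational graph on first call
def f(x, y):
    return x * y + 2.0

print(f(tf.constant(3.0), tf.constant(4.0)))  # tf.Tensor(14.0, shape=(), dtype=float32)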

What is logistic regression?

• Logistic Regression models a function of the target variable as a linear combination of the predictors, then converts this function into a fitted value in the desired range.

• Binary or Binomial Logistic Regression can be understood as the type of Logistic Regression that deals with scenarios wherein the observed outcomes for dependent variables can be only in binary, i.e., it can have only two possible types.

• Multinomial Logistic Regression works in scenarios where the outcome can have more than two possible types – type A vs type B vs type C – that are not in any particular order.


How is logistic regression done? 

Logistic regression measures the relationship between the dependent variable (our label of what we want to predict) and one or more independent variables (our features) by estimating probability using its underlying logistic function (sigmoid).
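
A minimal sketch of that idea (the feature values and coefficients are made up): the sigmoid squashes a linear combination of the features into a probability in (0, 1):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([2.0, -1.0])          # two illustrative feature values
w, b = np.array([0.5, 1.2]), -0.3  # illustrative fitted weights and intercept

probability = sigmoid(w @ x + b)   # fitted value in (0, 1)
print(round(float(probability), 3))  # about 0.378 here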

Explain the steps in making a decision tree. 

1. Take the entire data set as input
2. Calculate entropy of the target variable, as well as the predictor attributes
3. Calculate your information gain of all attributes (we gain information on sorting different objects from each other)
4. Choose the attribute with the highest information gain as the root node
5. Repeat the same procedure on every branch until the decision node of each branch is finalized
For example, let’s say you want to build a decision tree to decide whether you should accept or decline a job offer. The decision tree for this case is as shown:

Machine Learning Decision Tree

It is clear from the decision tree that an offer is accepted if:
• Salary is greater than $50,000
• The commute is less than an hour
• Coffee is offered
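
A minimal scikit-learn sketch mirroring the steps above (the tiny salary/commute/coffee dataset is invented for illustration):

from sklearn.tree import DecisionTreeClassifier, export_text

# Columns: salary > $50k, commute < 1 hour, coffee offered (1 = yes, 0 = no)
X = [[1, 1, 1], [1, 1, 0], [1, 0, 1], [0, 1, 1], [0, 0, 0]]
y = [1, 0, 0, 0, 0]  # 1 = accept the offer

tree = DecisionTreeClassifier(criterion="entropy").fit(X, y)  # entropy-based splits
print(export_text(tree, feature_names=["salary", "commute", "coffee"]))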

How do you build a random forest model?

A random forest is built up of a number of decision trees. If you split the data into different packages and make a decision tree in each of the different groups of data, the random forest brings all those trees together.

Steps to build a random forest model:

1. Randomly select k features from a total of m features, where k << m
2. Among the k features, calculate the node d using the best split point
3. Split the node into daughter nodes using the best split
4. Repeat steps two and three until leaf nodes are finalized
5. Build the forest by repeating steps one to four n times to create n trees
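
A minimal scikit-learn sketch (synthetic data; hyperparameters illustrative): max_features controls the random k << m feature subset from step one, and n_estimators is the number of trees:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
forest = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                                random_state=0).fit(X, y)
print("Training accuracy:", forest.score(X, y))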

Differentiate between univariate, bivariate, and multivariate analysis. 

Univariate data contains only one variable. The purpose of the univariate analysis is to describe the data and find patterns that exist within it.

Machine Learning Univariate Data

The patterns can be studied by drawing conclusions using mean, median, mode, dispersion or range, minimum, maximum, etc.

Bivariate data involves two different variables. The analysis of this type of data deals with causes and relationships and the analysis is done to determine the relationship between the two variables.

Bivariate data

Here, the relationship is visible from the table that temperature and sales are directly proportional to each other. The hotter the temperature, the better the sales.

Multivariate data involves three or more variables; any data set with three or more variables is categorized as multivariate. It is similar to bivariate data but contains more than one dependent variable.

Example: data for house price prediction
The patterns can be studied by drawing conclusions using mean, median, and mode, dispersion or range, minimum, maximum, etc. You can start describing the data and using it to guess what the price of the house will be.

What are the feature selection methods used to select the right variables?

There are two main methods for feature selection.
Filter Methods
This involves:
• Linear discriminant analysis
• ANOVA
• Chi-Square
The best analogy for selecting features is “bad data in, bad answer out.” When we’re limiting or selecting the features, it’s all about cleaning up the data coming in.

Wrapper Methods
This involves:
• Forward Selection: We test one feature at a time and keep adding them until we get a good fit
• Backward Selection: We test all the features and start removing them to see what works better
• Recursive Feature Elimination: Recursively looks through all the different features and how they pair together

Wrapper methods are very labor-intensive, and high-end computers are needed if a lot of data analysis is performed with the wrapper method.

You are given a data set consisting of variables with more than 30 percent missing values. How will you deal with them? 

If the data set is large, we can just simply remove the rows with missing data values. It is the quickest way; we use the rest of the data to predict the values.

For smaller data sets, we can impute missing values with the mean, median, or mode of the rest of the data using a Pandas DataFrame in Python. For example, df.fillna(df.mean()) fills each numeric gap with that column's mean.

Another imputation option is KNN for numeric or categorical values (KNN imputes a missing value from the k closest observations).
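
A minimal sketch of both options (the tiny DataFrame is illustrative; KNNImputer assumes scikit-learn is available):

import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer

df = pd.DataFrame({"age": [25, np.nan, 40, 35],
                   "income": [50, 60, np.nan, 80]})

df_mean = df.fillna(df.mean())  # mean imputation per column
df_knn = pd.DataFrame(KNNImputer(n_neighbors=2).fit_transform(df),
                      columns=df.columns)  # impute from the 2 nearest rows
print(df_mean, df_knn, sep="\n\n")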

How will you calculate the Euclidean distance in Python?

The Euclidean distance can be calculated as follows:

from math import sqrt

plot1 = [1, 3]
plot2 = [2, 5]

euclidean_distance = sqrt((plot1[0] - plot2[0])**2 + (plot1[1] - plot2[1])**2)
print(euclidean_distance)  # about 2.236

What are dimensionality reduction and its benefits? 

Dimensionality reduction refers to the process of converting a data set with vast dimensions into data with fewer dimensions (fields) to convey similar information concisely.

This reduction helps in compressing data and reducing storage space. It also reduces computation time as fewer dimensions lead to less computing. It removes redundant features; for example, there’s no point in storing a value in two different units (meters and inches).

How should you maintain a deployed model?

The steps to maintain a deployed model are (CREM):

1. Monitor: constant monitoring of all models is needed to determine their performance accuracy.
When you change something, you want to figure out how your changes are going to affect things.
This needs to be monitored to ensure it’s doing what it’s supposed to do.
2. Evaluate: evaluation metrics of the current model are calculated to determine if a new algorithm is needed.
3. Compare: the new models are compared to each other to determine which model performs the best.
4. Rebuild: the best performing model is re-built on the current state of data.

How can time-series data be declared stationary?

  1. The mean of the series should not be a function of time.
Machine Learning Stationary Time Series Data: Mean
  2. The variance of the series should not be a function of time. This property is known as homoscedasticity.
Machine Learning Stationary Time Series Data: Variance
  3. The covariance of the i-th term and the (i+m)-th term should not be a function of time.
Machine Learning Stationary Time Series Data: Covariance
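
One common way to check these conditions in practice is a unit-root test. A minimal sketch (assuming statsmodels is installed; the random-walk series is illustrative):

import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(0)
series = rng.normal(size=200).cumsum()  # a random walk, hence non-stationary

stat, p_value, *_ = adfuller(series)  # Augmented Dickey-Fuller test
print(f"ADF statistic: {stat:.2f}, p-value: {p_value:.3f}")
# A small p-value (e.g. < 0.05) would reject the unit-root null
# and support treating the series as stationary.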

‘People who bought this also bought…’ recommendations seen on Amazon are a result of which algorithm?

The recommendation engine is accomplished with collaborative filtering. Collaborative filtering exploits the behavior of other users and their purchase history in terms of ratings, selections, etc.
The engine makes predictions on what might interest a person based on the preferences of other users. In this algorithm, item features are unknown.
For example, a sales page shows that a certain number of people buy a new phone and also buy tempered glass at the same time. Next time, when a person buys a phone, he or she may see a recommendation to buy tempered glass as well.

What is a Generative Adversarial Network?

Suppose there is a wine shop purchasing wine from dealers, which they resell later. But some dealers sell fake wine. In this case, the shop owner should be able to distinguish between fake and authentic wine. The forger will try different techniques to sell fake wine and make sure specific techniques go past the shop owner’s check. The shop owner would probably get some feedback from wine experts that some of the wine is not original. The owner would have to improve how he determines whether a wine is fake or authentic.
The forger’s goal is to create wines that are indistinguishable from the authentic ones while the shop owner intends to tell if the wine is real or not accurately.

Machine Learning GAN illustration

• There is a noise vector coming into the forger who is generating fake wine.
• Here the forger acts as a Generator.
• The shop owner acts as a Discriminator.
• The Discriminator gets two inputs; one is the fake wine, while the other is the real authentic wine.
The shop owner has to figure out whether it is real or fake.

So, there are two primary components of Generative Adversarial Network (GAN) named:
1. Generator
2. Discriminator

The generator is typically a CNN that keeps producing images ever closer in appearance to the real images, while the discriminator tries to determine the difference between real and fake images. The two are trained adversarially: the discriminator learns to identify real versus fake images, while the generator learns to fool it.

You are given a dataset on cancer detection. You have built a classification model and achieved an accuracy of 96 percent. Why shouldn’t you be happy with your model performance? What can you do about it?

Cancer detection results in imbalanced data. In an imbalanced dataset, accuracy should not be used as a measure of performance. It is important to focus on the remaining four percent, which represents the patients who were wrongly diagnosed. Early diagnosis is crucial when it comes to cancer detection and can greatly improve a patient's prognosis.

Hence, to evaluate model performance, we should use Sensitivity (True Positive Rate), Specificity (True Negative Rate), and the F measure to determine the class-wise performance of the classifier.

We want to predict the probability of death from heart disease based on three risk factors: age, gender, and blood cholesterol level. What is the most appropriate algorithm for this case?

The most appropriate algorithm for this case is logistic regression.

After studying the behavior of a population, you have identified four specific individual types that are valuable to your study. You would like to find all users who are most similar to each individual type. Which algorithm is most appropriate for this study? 

As we are looking for grouping people together specifically by four different similarities, it indicates the value of k. Therefore, K-means clustering is the most appropriate algorithm for this study.

You have run the association rules algorithm on your dataset, and the two rules {banana, apple} => {grape} and {apple, orange} => {grape} have been found to be relevant. What else must be true? 

{grape, apple} must be a frequent itemset.

Your organization has a website where visitors randomly receive one of two coupons. It is also possible that visitors to the website will not receive a coupon. You have been asked to determine if offering a coupon to website visitors has any impact on their purchase decisions. Which analysis method should you use?

One-way ANOVA: in statistics, one-way analysis of variance is a technique that can be used to compare the means of two or more samples. This technique can be used only for numerical response data, the "Y", usually one variable, and numerical or categorical input data, the "X", always one variable, hence "one-way".
The ANOVA tests the null hypothesis, which states that samples in all groups are drawn from populations with the same mean values. To do this, two estimates are made of the population variance. The ANOVA produces an F-statistic, the ratio of the variance calculated among the means to the variance within the samples. If the group means are drawn from populations with the same mean values, the variance between the group means should be lower than the variance of the samples, following the central limit theorem. A higher ratio therefore implies that the samples were drawn from populations with different mean values.
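
A minimal sketch with SciPy's f_oneway (the three groups stand in for the two coupons plus the no-coupon control; the purchase amounts are invented):

from scipy.stats import f_oneway

coupon_a = [23.1, 25.0, 21.8, 24.4, 26.2]
coupon_b = [27.5, 26.1, 28.0, 25.9, 27.2]
no_coupon = [20.3, 19.8, 22.1, 21.0, 20.6]

f_stat, p_value = f_oneway(coupon_a, coupon_b, no_coupon)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")  # small p: group means differ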

What are the feature vectors?

A feature vector is an n-dimensional vector of numerical features that represent an object. In machine learning, feature vectors are used to represent numeric or symbolic characteristics (called features) of an object in a mathematical way that’s easy to analyze.

What is root cause analysis?

Root cause analysis was initially developed to analyze industrial accidents but is now widely used in other areas. It is a problem-solving technique used for isolating the root causes of faults or problems. A factor is called a root cause if its deduction from the problem-fault-sequence averts the final undesirable event from recurring.

Do gradient descent methods always converge to similar points?

They do not, because in some cases, they reach a local minimum or a local optimum point. You would not reach the global optimum point. This is governed by the data and the starting conditions.

 In your choice of language, write a program that prints the numbers ranging from one to 50. But for multiples of three, print “Fizz” instead of the number and for the multiples of five, print “Buzz.” For numbers which are multiples of both three and five, print “FizzBuzz.”

Python FizzBuzz algorithm
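
A straightforward Python answer to the question as asked:

for i in range(1, 51):
    if i % 15 == 0:      # multiple of both three and five
        print("FizzBuzz")
    elif i % 3 == 0:
        print("Fizz")
    elif i % 5 == 0:
        print("Buzz")
    else:
        print(i)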

What are the different Deep Learning Frameworks?

PyTorch: PyTorch is an open source machine learning library based on the Torch library, used for applications such as computer vision and natural language processing, primarily developed by Facebook’s AI Research lab. It is free and open-source software released under the Modified BSD license.
TensorFlow: TensorFlow is a free and open-source software library for dataflow and differentiable programming across a range of tasks. It is a symbolic math library and is also used for machine learning applications such as neural networks. Licensed under the Apache License 2.0. Developed by the Google Brain Team.
Microsoft Cognitive Toolkit: Microsoft Cognitive Toolkit describes neural networks as a series of computational steps via a directed graph.
Keras: Keras is an open-source neural-network library written in Python. It is capable of running on top of TensorFlow, Microsoft Cognitive Toolkit, R, Theano, or PlaidML. Designed to enable fast experimentation with deep neural networks, it focuses on being user-friendly, modular, and extensible. Licensed under MIT.

Data Sciences and Data Mining Glossary

Credit: Dr. Matthew North
Antecedent: In an association rules data mining model, the antecedent is the attribute which precedes the consequent in an identified rule. Attribute order makes a difference when calculating the confidence percentage, so identifying which attribute comes first is necessary even if the reciprocal of the association is also a rule.

Archived Data: Data which have been copied out of a live production database and into a data warehouse or other permanent system where they can be accessed and analyzed, but not by primary operational business systems.

Association Rules: A data mining methodology which compares attributes in a data set across all observations to identify areas where two or more attributes are frequently found together. If their frequency of coexistence is high enough throughout the data set, the association of those attributes can be said to be a rule.

Attribute: In columnar data, an attribute is one column. It is named in the data so that it can be referred to by a model and used in data mining. The term attribute is sometimes interchanged with the terms ‘field’, ‘variable’, or ‘column’.

Average: The arithmetic mean, calculated by summing all values and dividing by the count of the values.

Binomial: A data type for any set of values that is limited to one of two numeric options.

Binominal: In RapidMiner, the data type binominal is used instead of binomial, enabling both numerical and character-based sets of values that are limited to one of two options.

Business Understanding: See Organizational Understanding: The first step in the CRISP-DM process, usually referred to as Business Understanding, where the data miner develops an understanding of an organization’s goals, objectives, questions, and anticipated outcomes relative to data mining tasks. The data miner must understand why the data mining task is being undertaken before proceeding to gather and understand data.

Case Sensitive: A situation where a computer program recognizes the uppercase version of a letter or word as being different from the lowercase version of the same letter or word.

Classification: One of the two main goals of conducting data mining activities, with the other being prediction. Classification creates groupings in a data set based on the similarity of the observations’ attributes. Some data mining methodologies, such as decision trees, can predict an observation’s classification.

Code: Code is the result of a computer worker’s work. It is a set of instructions, typed in a specific grammar and syntax, that a computer can understand and execute. According to Lawrence Lessig, it is one of four methods humans can use to set and control boundaries for behavior when interacting with computer systems.

Coefficient: In data mining, a coefficient is a value that is calculated based on the values in a data set that can be used as a multiplier or as an indicator of the relative strength of some attribute or component in a data mining model.

Column: See Attribute. In columnar data, an attribute is one column. It is named in the data so that it can be referred to by a model and used in data mining. The term attribute is sometimes interchanged with the terms ‘field’, ‘variable’, or ‘column’.

Comma Separated Values (CSV): A common text-based format for data sets where the divisions between attributes (columns of data) are indicated by commas. If commas occur naturally in some of the values in the data set, a CSV file will misunderstand these to be attribute separators, leading to misalignment of attributes.

Conclusion: See Consequent: In an association rules data mining model, the consequent is the attribute which results from the antecedent in an identified rule. If an association rule were characterized as “If this, then that”, the consequent would be that—in other words, the outcome.

Confidence (Alpha) Level: A value, usually 5% or 0.05, used to test for statistical significance in some data mining methods. If statistical significance is found, a data miner can say that there is a 95% likelihood that a calculated or predicted value is not a false positive.

Confidence Percent: In predictive data mining, this is the percent of calculated confidence that the model has calculated for one or more possible predicted values. It is a measure for the likelihood of false positives in predictions. Regardless of the number of possible predicted values, their collective confidence percentages will always total to 100%.

Consequent: In an association rules data mining model, the consequent is the attribute which results from the antecedent in an identified rule. If an association rule were characterized as “If this, then that”, the consequent would be that—in other words, the outcome.

Correlation: A statistical measure of the strength of affinity, based on the similarity of observational values, of the attributes in a data set. These can be positive (as one attribute’s values go up or down, so too does the correlated attribute’s values); or negative (correlated attributes’ values move in opposite directions). Correlations are indicated by coefficients which fall on a scale between -1 (complete negative correlation) and 1 (complete positive correlation), with 0 indicating no correlation at all between two attributes.

CRISP-DM: An acronym for Cross-Industry Standard Process for Data Mining. This process was jointly developed by several major multi-national corporations around the turn of the new millennium in order to standardize the approach to mining data. It comprises six cyclical steps: Business (Organizational) Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, Deployment.

Cross-validation: A method of statistically evaluating a training data set for its likelihood of producing false positives in a predictive data mining model.

Data: Data are any arrangement and compilation of facts. Data may be structured (e.g. arranged in columns (attributes) and rows (observations)), or unstructured (e.g. paragraphs of text, computer log file).

Data Analysis: The process of examining data in a repeatable and structured way in order to extract meaning, patterns or messages from a set of data.

Data Mart: A location where data are stored for easy access by a broad range of people in an organization. Data in a data mart are generally archived data, enabling analysis in a setting that does not impact live operations.

Data Mining: A computational process of analyzing data sets, usually large in nature, using both statistical and logical methods, in order to uncover hidden, previously unknown, and interesting patterns that can inform organizational decision making.

Data Preparation: The third in the six steps of CRISP-DM. At this stage, the data miner ensures that the data to be mined are clean and ready for mining. This may include handling outliers or other inconsistent data, dealing with missing values, reducing attributes or observations, setting attribute roles for modeling, etc.

Data Set: Any compilation of data that is suitable for analysis.

Data Type: In a data set, each attribute is assigned a data type based on the kind of data stored in the attribute. There are many data types which can be generalized into one of three areas: Character (Text) based; Numeric; and Date/Time. Within these categories, RapidMiner has several data types. For example, in the Character area, RapidMiner has Polynominal, Binominal, etc.; and in the Numeric area it has Real, Integer, etc.

Data Understanding: The second in the six steps of CRISP-DM. At this stage, the data miner seeks out sources of data in the organization, and works to collect, compile, standardize, define and document the data. The data miner develops a comprehension of where the data have come from, how they were collected and what they mean.

Data Warehouse: A large-scale repository for archived data which are available for analysis. Data in a data warehouse are often stored in multiple formats (e.g. by week, month, quarter and year), facilitating large scale analyses at higher speeds. The data warehouse is populated by extracting data from operational systems so that analyses do not interfere with live business operations.

Database: A structured organization of facts that is organized such that the facts can be reliably and repeatedly accessed. The most common type of database is a relational database, in which facts (data) are arranged in tables of columns and rows. The data are then accessed using a query language, usually SQL (Structured Query Language), in order to extract meaning from the tables.

Decision Tree: A data mining methodology where leaves and nodes are generated to construct a predictive tree, whereby a data miner can see the attributes which are most predictive of each possible outcome in a target (label) attribute.

Denormalization: The process of removing relational organization from data, reintroducing redundancy into the data, but simultaneously eliminating the need for joins in a relational database, enabling faster querying.

Dependent Variable (Attribute): The attribute in a data set that is being acted upon by the other attributes. It is the thing we want to predict, the target, or label, attribute in a predictive model.

Deployment: The sixth and final of the six steps of CRISP-DM. At this stage, the data miner takes the results of data mining activities and puts them into practice in the organization. The data miner watches closely and collects data to determine if the deployment is successful and ethical. Deployment can happen in stages, such as through pilot programs before a full-scale roll out.

Descartes’ Rule of Change: An ethical framework set forth by René Descartes which states that if an action cannot be taken repeatedly, it cannot be ethically taken even once.

Design Perspective: The view in RapidMiner where a data miner adds operators to a data mining stream, sets those operators’ parameters, and runs the model.

Discriminant Analysis: A predictive data mining model which attempts to compare the values of all observations across all attributes and identify where natural breaks occur from one category to another, and then predict which category each observation in the data set will fall into.

Ethics: A set of moral codes or guidelines that an individual develops to guide his or her decision making in order to make fair and respectful decisions and engage in right actions. Ethical standards are higher than legally required minimums.

Evaluation: The fifth of the six steps of CRISP-DM. At this stage, the data miner reviews the results of the data mining model, interprets results and determines how useful they are. He or she may also conduct an investigation into false positives or other potentially misleading results.

False Positive: A predicted value that ends up not being correct.

Field: See Attribute: In columnar data, an attribute is one column. It is named in the data so that it can be referred to by a model and used in data mining. The term attribute is sometimes interchanged with the terms ‘field’, ‘variable’, or ‘column’.

Frequency Pattern: A recurrence of the same, or similar, observations numerous times in a single data set.

Fuzzy Logic: A data mining concept often associated with neural networks where predictions are made using a training data set, even though some uncertainty exists regarding the data and a model’s predictions.

Gain Ratio: One of several algorithms used to construct decision tree models.

Gini Index: An algorithm created by Corrado Gini that can be used to generate decision tree models.

Heterogeneity: In statistical analysis, this is the amount of variety found in the values of an attribute.

Inconsistent Data: These are values in an attribute in a data set that are out-of-the-ordinary among the whole set of values in that attribute. They can be statistical outliers, or other values that simply don’t make sense in the context of the ‘normal’ range of values for the attribute. They are generally replaced or removed during the Data Preparation phase of CRISP-DM.

Independent Variable (Attribute): These are attributes that act on the dependent attribute (the target, or label). They are used to help predict the label in a predictive model.

Jittering: The process of adding a small, random decimal to discrete values in a data set so that when they are plotted in a scatter plot, they are slightly apart from one another, enabling the analyst to better see clustering and density.

Join: The process of connecting two or more tables in a relational database together so that their attributes can be accessed in a single query, such as in a view.
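As a minimal sketch, using hypothetical customers and orders tables that share a customer_id key, a join looks like:

```sql
-- Relate two tables through their shared key so that attributes
-- from both can be returned by a single query.
SELECT c.customer_name, o.order_date, o.total
FROM customers c
JOIN orders o ON o.customer_id = c.customer_id;
```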

Kant’s Categorical Imperative: An ethical framework proposed by Immanuel Kant which states that if everyone cannot ethically take some action, then no one can ethically take that action.

k-Means Clustering: A data mining methodology that uses the mean (average) values of the attributes in a data set to group each observation into a cluster of other observations whose values are most similar to the mean for that cluster.

Label: In RapidMiner, this is the role that must be set in order to use an attribute as the dependent, or target, attribute in a predictive model.

Laws: These are regulatory statutes which have associated consequences that are established and enforced by a governmental agency. According to Lawrence Lessig, these are one of the four methods for establishing boundaries to define and regulate social behavior.

Leaf: In a decision tree data mining model, this is the terminal end point of a branch, indicating the predicted outcome for observations whose values follow that branch of the tree.

Linear Regression: A predictive data mining method which uses the algebraic formula for calculating the slope of a line in order to predict where a given observation will likely fall along that line.
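In the simple one-predictor case, the fitted line and its least-squares slope and intercept are:

$$ \hat{y} = \beta_0 + \beta_1 x, \qquad \beta_1 = \frac{\sum_{i} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i} (x_i - \bar{x})^2}, \qquad \beta_0 = \bar{y} - \beta_1 \bar{x} $$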

Logistic Regression: A predictive data mining method which uses the logistic (sigmoid) function to predict one of a set of categorical outcomes, along with a probability that the prediction will be the actual outcome.
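In the one-predictor case, the logistic function maps a linear combination of the inputs to a probability between 0 and 1:

$$ P(y = 1 \mid x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}} $$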

Markets: A socio-economic construct in which peoples’ buying, selling, and exchanging behaviors define the boundaries of acceptable or unacceptable behavior. Lawrence Lessig offers this as one of four methods for defining the parameters of appropriate behavior.

Mean: See Average: The arithmetic mean, calculated by summing all values and dividing by the count of the values. 

Median: With the Mean and Mode, this is one of three generally used Measures of Central Tendency. It is an arithmetic way of defining what ‘normal’ looks like in a numeric attribute. It is calculated by rank ordering the values in an attribute and finding the one in the middle. If there are an even number of observations, the two in the middle are averaged to find the median.

Meta Data: These are facts that describe the observational values in an attribute. Meta data may include who collected the data, when, why, where, how, how often; and usually include some descriptive statistics such as the range, average, standard deviation, etc.

Missing Data: These are instances in an observation where one or more attributes does not have a value. It is not the same as zero, because zero is a value. Missing data are like Null values in a database: they are either unknown or undefined. These are usually replaced or removed during the Data Preparation phase of CRISP-DM.

Mode: With Mean and Median, this is one of three common Measures of Central Tendency. It is the value in an attribute which is the most common. It can be numerical or text. If an attribute contains two or more values that appear an equal number of times and more than any other values, then all are listed as the mode, and the attribute is said to be Bimodal or Multimodal.

Model: A computer-based representation of real-life events or activities, constructed upon the basis of data which represent those events.

Name (Attribute): This is the text descriptor of each attribute in a data set. In RapidMiner, the first row of an imported data set should be designated as the attribute name, so that these are not interpreted as the first observation in the data set.

Neural Network: A predictive data mining methodology which tries to mimic human brain processes by comparing the values of all attributes in a data set to one another through the use of a hidden layer of nodes. The frequencies with which the attribute values match, or are strongly similar, create neurons which become stronger at higher frequencies of similarity.

n-Gram: In text mining, this is a combination of words or word stems representing a phrase that may carry more meaning or significance than a single word or stem would alone.

Node: A terminal or mid-point in decision trees and neural networks where an attribute branches or forks away from other terminals or branches because the values represented at that point have become significantly different from all other values for that attribute.

Normalization: In a relational database, this is the process of breaking data out into multiple related tables in order to reduce redundancy and eliminate multivalued dependencies.

Null: The absence of a value in a database. The value is unrecorded, unknown, or undefined. See Missing Values.

Observation: A row of data in a data set. It consists of the value assigned to each attribute for one record in the data set. It is sometimes called a tuple in database language.

Online Analytical Processing (OLAP): A database concept where data are collected and organized in a way that facilitates analysis, rather than practical, daily operational work. Evaluating data in a data warehouse is an example of OLAP. The underlying structure that collects and holds the data makes analysis faster, but would slow down transactional work.

Online Transaction Processing (OLTP): A database concept where data are collected and organized in a way that facilitates fast and repeated transactions, rather than broader analytical work. Scanning items being purchased at a cash register is an example of OLTP. The underlying structure that collects and holds the data makes transactions faster, but would slow down analysis.

Operational Data: Data which are generated as a result of day-to-day work (e.g. the entry of work orders for an electrical service company).

Operator: In RapidMiner, an operator is any one of more than 100 tools that can be added to a data mining stream in order to perform some function. Functions range from adding a data set, to setting an attribute’s role, to applying a modeling algorithm. Operators are connected into a stream by way of ports connected by splines.

Organizational Data: These are data which are collected by an organization, often in aggregate or summary format, in order to address a specific question or tell a story. They may be constructed from Operational Data, or added to through other means such as surveys, questionnaires or tests.

Organizational Understanding: The first step in the CRISP-DM process, usually referred to as Business Understanding, where the data miner develops an understanding of an organization’s goals, objectives, questions, and anticipated outcomes relative to data mining tasks. The data miner must understand why the data mining task is being undertaken before proceeding to gather and understand data.

Parameters: In RapidMiner, these are the settings that control values and thresholds that an operator will use to perform its job. These may be the attribute name and role in a Set Role operator, or the algorithm the data miner desires to use in a model operator.

Port: The input or output required for an operator to perform its function in RapidMiner. These are connected to one another using splines.

Prediction: The target, or label, or dependent attribute that is generated by a predictive model, usually for a scoring data set in a model.

Premise: See Antecedent: In an association rules data mining model, the antecedent is the attribute which precedes the consequent in an identified rule. Attribute order makes a difference when calculating the confidence percentage, so identifying which attribute comes first is necessary even if the reciprocal of the association is also a rule.

Privacy: The concept describing a person’s right to be let alone; to have information about them kept away from those who should not, or do not need to, see it. A data miner must always respect and safeguard the privacy of individuals represented in the data he or she mines.

Professional Code of Conduct: A helpful guide or documented set of parameters by which an individual in a given profession agrees to abide. These are usually written by a board or panel of experts and adopted formally by a professional organization.

Query: A method of structuring a question, usually using code, that can be submitted to, interpreted, and answered by a computer.

Record: See Observation: A row of data in a data set. It consists of the value assigned to each attribute for one record in the data set. It is sometimes called a tuple in database language.

Relational Database: A computerized repository, comprised of entities that relate to one another through keys. The most basic and elemental entity in a relational database is the table, and tables are made up of attributes. One or more of these attributes serves as a key that can be matched (or related) to a corresponding attribute in another table, creating the relational effect which reduces data redundancy and eliminates multivalued dependencies.

Repository: In RapidMiner, this is the place where imported data sets are stored so that they are accessible for modeling.

Results Perspective: The view in RapidMiner that is seen when a model has been run. It is usually comprised of two or more tabs which show meta data, data in a spreadsheet-like view, and predictions and model outcomes (including graphical representations where applicable).

Role (Attribute): In a data mining model, each attribute must be assigned a role. The role is the part the attribute plays in the model. It is usually equated to serving as an independent variable (regular), or dependent variable (label).

Row: See Observation: A row of data in a data set. It consists of the value assigned to each attribute for one record in the data set. It is sometimes called a tuple in database language.

Sample: A subset of an entire data set, selected randomly or in a structured way. This usually reduces a data set down, allowing models to be run faster, especially during development and proof-of-concept work on a model.

Scoring Data: A data set with the same attributes as a training data set in a predictive model, with the exception of the label. The training data set, with the label defined, is used to create a predictive model, and that model is then applied to a scoring data set possessing the same attributes in order to predict the label for each scoring observation.

Social Norms: These are the sets of behaviors and actions that are generally tolerated and found to be acceptable in a society. According to Lawrence Lessig, these are one of four methods of defining and regulating appropriate behavior.

Spline: In RapidMiner, these lines connect the ports between operators, creating the stream of a data mining model.

Standard Deviation: One of the most common statistical measures of how dispersed the values in an attribute are. This measure can help determine whether or not there are outliers (a common type of inconsistent data) in a data set.

Standard Operating Procedures: These are organizational guidelines that are documented and shared with employees which help to define the boundaries for appropriate and acceptable behavior in the business setting. They are usually created and formally adopted by a group of leaders in the organization, with input from key stakeholders in the organization.

Statistical Significance: In statistically-based data mining activities, this is the measure of whether or not the model has yielded any results that are mathematically reliable enough to be used. Any model lacking statistical significance should not be used in operational decision making.

Stemming: In text mining, this is the process of reducing like-terms down into a single, common token (e.g. country, countries, country’s, countryman, etc. → countr).

Stopwords: In text mining, these are small words that are necessary for grammatical correctness, but which carry little meaning or power in the message of the text being mined. These are often articles, prepositions or conjunctions, such as ‘a’, ‘the’, ‘and’, etc., and are usually removed in the Process Document operator’s sub-process.

Stream: This is the string of operators in a data mining model, connected through the operators’ ports via splines, that represents all actions that will be taken on a data set in order to mine it.

Structured Query Language (SQL): The set of codes, reserved keywords and syntax defined by the American National Standards Institute used to create, manage and use relational databases.

Sub-process: In RapidMiner, this is a stream of operators set up to apply a series of actions to all inputs connected to the parent operator.

Support Percent: In an association rule data mining model, this is the percent of observations in which the antecedent and the consequent are found together. Since it is calculated as the number of times the two are found together divided by the total number of observations, the Support Percent is the same for reciprocal rules.
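In symbols, for a rule A ⇒ B over N observations, support is symmetric in A and B while confidence is not:

$$ \mathrm{Support}(A \Rightarrow B) = \frac{\mathrm{count}(A \text{ and } B)}{N}, \qquad \mathrm{Confidence}(A \Rightarrow B) = \frac{\mathrm{count}(A \text{ and } B)}{\mathrm{count}(A)} $$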

Table: In data collection, a table is a grid of columns and rows, where in general, the columns are individual attributes in the data set, and the rows are observations across those attributes. Tables are the most elemental entity in relational databases.

Target Attribute: See Label; Dependent Variable: The attribute in a data set that is being acted upon by the other attributes. It is the thing we want to predict, the target, or label, attribute in a predictive model.

Technology: Any tool or process invented by mankind to do or improve work.

Text Mining: The process of data mining unstructured text-based data such as essays, news articles, speech transcripts, etc. to discover patterns of word or phrase usage to reveal deeper or previously unrecognized meaning.

Token (Tokenize): In text mining, tokenizing is the process of turning words in the input document(s) into attributes (tokens) that can be mined.

Training Data: In a predictive model, this data set already has the label, or dependent variable defined, so that it can be used to create a model which can be applied to a scoring data set in order to generate predictions for the latter.

Tuple: See Observation: A row of data in a data set. It consists of the value assigned to each attribute for one record in the data set. It is sometimes called a tuple in database language.

Variable: See Attribute: In columnar data, an attribute is one column. It is named in the data so that it can be referred to by a model and used in data mining. The term attribute is sometimes interchanged with the terms ‘field’, ‘variable’, or ‘column’.

View: A type of pseudo-table in a relational database which is actually a named, stored query. This query runs against one or more tables, retrieving a defined number of attributes that can then be referenced as if they were in a table in the database. Views can limit users’ ability to see attributes to only those that are relevant and/or approved for those users to see. They can also speed up the query process because although they may contain joins, the key columns for the joins can be indexed and cached, making the view’s query run faster than it would if it were not stored as a view. Views can be useful in data mining as data miners can be given read-only access to the view, upon which they can build data mining models, without having to have broader administrative rights on the database itself.
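As a minimal sketch, again with hypothetical customers and orders tables, a view is simply a stored, named query:

```sql
-- Define the view once...
CREATE VIEW customer_orders AS
SELECT c.customer_name, o.order_date, o.total
FROM customers c
JOIN orders o ON o.customer_id = c.customer_id;

-- ...then query it as if it were a table.
SELECT customer_name, SUM(total) AS lifetime_value
FROM customer_orders
GROUP BY customer_name;
```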

What is the Central Limit Theorem and why is it important?

An Introduction to the Central Limit Theorem

Answer: Suppose that we are interested in estimating the average height among all people. Collecting data for every person in the world is impractical, bordering on impossible. While we can’t obtain a height measurement from everyone in the population, we can still sample some people. The question now becomes: what can we say about the average height of the entire population given a single sample?
The Central Limit Theorem addresses this question exactly. Formally, it states that if we sample from a population using a sufficiently large sample size, the means of those samples (the sampling distribution of the sample mean) will be normally distributed (assuming true random sampling), with a mean tending to the mean of the population and a variance equal to the variance of the population divided by the sample size.
What’s especially important is that this will be true regardless of the distribution of the original population.
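In symbols, if the population has mean μ and variance σ², then for a sufficiently large sample size N the sample mean is approximately distributed as:

$$ \bar{X}_N \approx \mathcal{N}\!\left(\mu,\ \frac{\sigma^2}{N}\right) $$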

Central Limit Theorem: Population Distribution

As we can see, the distribution is pretty ugly. It certainly isn’t normal, uniform, or any other commonly known distribution. In order to sample from the above distribution, we need to define a sample size, referred to as N. This is the number of observations that we will sample at a time. Suppose that we choose N to be 3. This means that we will sample in groups of 3. So for the above population, we might sample groups such as [5, 20, 41], [60, 17, 82], [8, 13, 61], and so on.
Suppose that we gather 1,000 samples of 3 from the above population. For each sample, we can compute its average. If we do that, we will have 1,000 averages. This set of 1,000 averages is called a sampling distribution, and according to the Central Limit Theorem, the sampling distribution will approach a normal distribution as the sample size N used to produce it increases. Here is what our sampling distribution looks like for N = 3.

Sample Mean Distribution with N = 3

As we can see, it certainly looks uni-modal, though not necessarily normal. If we repeat the same process with a larger sample size, we should see the sampling distribution start to become more normal. Let’s repeat the same process again with N = 10. Here is the sampling distribution for that sample size.

Sample Mean Distribution with N = 10

Credit: Steve Nouri

What is bias-variance trade-off?

Bias: Bias is error introduced in the model by overly simplistic assumptions in the algorithm used (the model does not fit the data properly). It can lead to under-fitting.
Low bias machine learning algorithms — Decision Trees, k-NN and SVM
High bias machine learning algorithms — Linear Regression, Logistic Regression

Variance: Variance is error introduced in the model by an overly complex algorithm: the model performs very well on the training set but poorly on the test set. It can lead to high sensitivity to the training data and overfitting.
Possible high variance – polynomial regression

Normally, as you increase the complexity of your model, you will see a reduction in error due to lower bias in the model. However, this only happens until a particular point. As you continue to make your model more complex, you end up over-fitting your model and hence your model will start suffering from high variance.

bias-variance trade-off

Bias-Variance trade-off: The goal of any supervised machine learning algorithm is to have low bias and low variance to achieve good prediction performance.

1. The k-nearest neighbor algorithm has low bias and high variance, but the trade-off can be changed by increasing the value of k, which increases the number of neighbors that contribute to the prediction and in turn increases the bias (and decreases the variance) of the model.
2. The support vector machine algorithm has low bias and high variance, but the trade-off can be changed by adjusting the C parameter, which controls how many violations of the margin are allowed in the training data: decreasing C allows more violations, which increases the bias and decreases the variance.
3. The decision tree has low bias and high variance; decreasing the depth of the tree or using fewer attributes reduces the variance at the cost of some bias.
4. Linear regression has low variance and high bias; increasing the number of features, or using another regression that better fits the data, reduces the bias.

There is no escaping the relationship between bias and variance in machine learning. Increasing the bias will decrease the variance. Increasing the variance will decrease bias.
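This relationship is captured by the standard decomposition of expected squared prediction error, where σ² is the irreducible noise that no model can remove:

$$ \mathbb{E}\big[(y - \hat{f}(x))^2\big] = \mathrm{Bias}\big[\hat{f}(x)\big]^2 + \mathrm{Var}\big[\hat{f}(x)\big] + \sigma^2 $$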

The Best Medium-Hard Data Analyst SQL Interview Questions

compiled by Google Data Analyst Zachary Thomas!

Self-Join Practice Problems: MoM Percent Change

Context: Oftentimes it’s useful to know how much a key metric, such as monthly active users, changes between months.
Say we have a table logins in the form:

SQL Self-Join Practice Mom Percent Change

Task: Find the month-over-month percentage change for monthly active users (MAU).

Solution:
(This solution, like other solution code blocks you will see in this doc, contains comments about SQL syntax that may differ between flavors of SQL or other comments about the solutions as listed)

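One possible sketch, assuming logins has user_id and date columns and using Postgres-style syntax:

```sql
-- Count distinct monthly active users, then self-join the monthly
-- counts to the previous month to compute the percent change.
WITH mau AS (
    SELECT DATE_TRUNC('month', date) AS month,
           COUNT(DISTINCT user_id)   AS mau
    FROM logins
    GROUP BY 1
)
SELECT cur.month,
       cur.mau,
       ROUND(100.0 * (cur.mau - prev.mau) / prev.mau, 2) AS mom_pct_change
FROM mau cur
JOIN mau prev
  ON cur.month = prev.month + INTERVAL '1 month';
```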

Tree Structure Labeling with SQL

Context: Say you have a table tree with a column of nodes and a column of corresponding parent nodes

Task: Write SQL such that we label each node as a “Leaf”, “Inner” or “Root” node, such that for the nodes above we get:

A solution which works for the above example will receive full credit, although you can receive extra credit for providing a solution that is generalizable to a tree of any depth (not just depth = 2, as is the case in the example above).

Solution: This solution works for the example above with tree depth = 2, but is not generalizable beyond that.

An alternate solution, that is generalizable to any tree depth:
Acknowledgement: this more generalizable solution was contributed by Fabian Hofmann
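A sketch of such a generalizable approach, assuming the table is tree(node, parent):

```sql
-- Left-join each node to its children: a node with no parent is the Root,
-- a node with no children is a Leaf, and everything else is Inner.
SELECT t.node,
       CASE WHEN t.parent IS NULL  THEN 'Root'
            WHEN COUNT(c.node) = 0 THEN 'Leaf'
            ELSE 'Inner'
       END AS label
FROM tree t
LEFT JOIN tree c ON c.parent = t.node
GROUP BY t.node, t.parent;
```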

An alternate solution, without explicit joins:
Acknowledgement: William Chargin on 5/2/20 noted that WHERE parent IS NOT NULL is needed to make this solution return Leaf instead of NULL.
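A sketch along those lines, including the WHERE clause the acknowledgement refers to (without it, NOT IN against a set containing NULL evaluates to NULL rather than true):

```sql
SELECT node,
       CASE WHEN parent IS NULL THEN 'Root'
            -- a node that is never anyone's parent is a Leaf
            WHEN node NOT IN (SELECT parent FROM tree WHERE parent IS NOT NULL)
                 THEN 'Leaf'
            ELSE 'Inner'
       END AS label
FROM tree;
```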

Retained Users Per Month with SQL

Acknowledgement: this problem is adapted from Sisense’s “Using Self Joins to Calculate Your Retention, Churn, and Reactivation Metrics” blog post

PART 1:
Context: Say we have login data in the table logins:

Task: Write a query that gets the number of retained users per month. In this case, retention for a given month is defined as the number of users who logged in that month who also logged in the immediately previous month.

Solution:
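One possible sketch, again assuming logins(user_id, date) and Postgres-style syntax:

```sql
-- A user is retained in a month if they also logged in the month before:
-- self-join logins to itself shifted back one month.
SELECT DATE_TRUNC('month', cur.date) AS month,
       COUNT(DISTINCT cur.user_id)   AS retained_users
FROM logins cur
JOIN logins prev
  ON prev.user_id = cur.user_id
 AND DATE_TRUNC('month', prev.date) =
     DATE_TRUNC('month', cur.date) - INTERVAL '1 month'
GROUP BY 1;
```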

PART 2:

Task: Now we’ll take retention and turn it on its head: write a query to find how many users last month did not come back this month, i.e., the number of churned users.

Solution:
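A sketch using a LEFT JOIN, under the same schema assumptions:

```sql
-- Users active last month with no matching login this month are churned.
SELECT DATE_TRUNC('month', prev.date) + INTERVAL '1 month' AS month,
       COUNT(DISTINCT prev.user_id)                        AS churned_users
FROM logins prev
LEFT JOIN logins cur
  ON cur.user_id = prev.user_id
 AND DATE_TRUNC('month', cur.date) =
     DATE_TRUNC('month', prev.date) + INTERVAL '1 month'
WHERE cur.user_id IS NULL
GROUP BY 1;
```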

Note that there are solutions to this problem that can use LEFT or RIGHT joins.

PART 3:
Context: You now want to see the number of active users this month who have been reactivated — in other words, users who have churned but this month they became active again. Keep in mind a user can have churned before the previous month. An example of this could be a user active in February (appears in logins), with no activity in March and April, but then active again in May (appears in logins), so they count as a reactivated user for May.

Task: Create a table that contains the number of reactivated users per month.

Solution:
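A sketch under the same assumptions: a reactivated user is active this month, was not active last month, but was active at some point before that.

```sql
SELECT DATE_TRUNC('month', cur.date) AS month,
       COUNT(DISTINCT cur.user_id)   AS reactivated_users
FROM logins cur
JOIN logins past                       -- active at some point before last month
  ON past.user_id = cur.user_id
 AND DATE_TRUNC('month', past.date) <
     DATE_TRUNC('month', cur.date) - INTERVAL '1 month'
LEFT JOIN logins prev                  -- ...but not active last month
  ON prev.user_id = cur.user_id
 AND DATE_TRUNC('month', prev.date) =
     DATE_TRUNC('month', cur.date) - INTERVAL '1 month'
WHERE prev.user_id IS NULL
GROUP BY 1;
```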

Cumulative Sums with SQL

Acknowledgement: This problem was inspired by Sisense’s “Cash Flow modeling in SQL” blog post
Context: Say we have a table transactions in the form:

Where cash_flow is the revenues minus costs for each day.

Task: Write a query to get cumulative cash flow for each day such that we end up with a table in the form below:

Solution using a window function (more efficient):
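A sketch, assuming transactions(date, cash_flow):

```sql
-- A running SUM window over the dates gives the cumulative cash flow.
SELECT date,
       SUM(cash_flow) OVER (ORDER BY date) AS cumulative_cf
FROM transactions
ORDER BY date;
```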

Alternative Solution (less efficient):
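A self-join sketch of the less efficient approach: for each day, sum every row on or before that day.

```sql
SELECT t1.date,
       SUM(t2.cash_flow) AS cumulative_cf
FROM transactions t1
JOIN transactions t2 ON t2.date <= t1.date
GROUP BY t1.date
ORDER BY t1.date;
```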

Rolling Averages with SQL

Acknowledgement: This problem is adapted from Sisense’s “Rolling Averages in MySQL and SQL Server” blog post
Note: there are different ways to compute rolling/moving averages. Here we’ll use a preceding average which means that the metric for the 7th day of the month would be the average of the preceding 6 days and that day itself.
Context: Say we have table signups in the form:

Task: Write a query to get the 7-day rolling (preceding) average of daily sign-ups.

Solution 1:
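A self-join sketch, assuming signups(date, sign_ups) with one row per day:

```sql
-- For each day, average the sign-up counts of that day and the 6 days before it.
SELECT a.date,
       AVG(b.sign_ups) AS avg_sign_ups_7d
FROM signups a
JOIN signups b
  ON b.date BETWEEN a.date - INTERVAL '6 days' AND a.date
GROUP BY a.date
ORDER BY a.date;
```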

Solution 2 (using window functions, more efficient):
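A window-function sketch; note that the ROWS frame assumes no gaps in the dates (one row per day):

```sql
SELECT date,
       AVG(sign_ups) OVER (ORDER BY date
                           ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)
           AS avg_sign_ups_7d
FROM signups;
```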

Multiple Join Conditions in SQL

Acknowledgement: This problem was inspired by Sisense’s “Analyzing Your Email with SQL” blog post
Context: Say we have a table emails that includes emails sent to and from zach@g.com:

Task: Write a query to get the response time per email (id) sent to zach@g.com. Do not include ids that did not receive a response from zach@g.com. Assume each email thread has a unique subject. Keep in mind a thread may have multiple responses back-and-forth between zach@g.com and another email address.

Solution:
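One possible sketch, assuming emails(id, subject, "from", "to", timestamp); the from/to columns are double-quoted because they are reserved words:

```sql
-- For each email sent to zach@g.com, the response time is the gap to the
-- first later email in the same thread sent back by zach@g.com.
SELECT incoming.id,
       MIN(reply.timestamp) - incoming.timestamp AS response_time
FROM emails incoming
JOIN emails reply
  ON reply.subject   = incoming.subject
 AND reply."from"    = incoming."to"    -- zach@g.com replying
 AND reply."to"      = incoming."from"
 AND reply.timestamp > incoming.timestamp
WHERE incoming."to" = 'zach@g.com'
GROUP BY incoming.id, incoming.timestamp;
```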

SQL Window Function Practice Problems

#1: Get the ID with the highest value
Context: Say we have a table salaries with data on employee salary and department in the following format:

Task: Write a query to get the empno with the highest salary. Make sure your solution can handle ties!
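A sketch, assuming salaries(depname, empno, salary); RANK() assigns the same rank to ties, so every top-salary empno is returned:

```sql
SELECT empno
FROM (SELECT empno,
             RANK() OVER (ORDER BY salary DESC) AS rnk
      FROM salaries) ranked
WHERE rnk = 1;
```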

#2: Average and rank with a window function (multi-part)

PART 1:
Context: Say we have a table salaries in the format:

Task: Write a query that returns the same table, but with a new column that has average salary per depname. We would expect a table in the form:

Solution:
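A sketch under the same salaries(depname, empno, salary) assumption:

```sql
-- AVG as a window function keeps every row while adding the departmental average.
SELECT depname, empno, salary,
       AVG(salary) OVER (PARTITION BY depname) AS avg_salary
FROM salaries;
```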

PART 2:
Task: Write a query that adds a column with the rank of each employee based on their salary within their department, where the employee with the highest salary gets the rank of 1. We would expect a table in the form:

Solution:
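A sketch; RANK() gives tied employees the same rank (DENSE_RANK() or ROW_NUMBER() would treat ties differently):

```sql
SELECT depname, empno, salary,
       RANK() OVER (PARTITION BY depname ORDER BY salary DESC) AS salary_rank
FROM salaries;
```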

Predictive Modelling Questions

Source:  datasciencehandbook.me

 

1-  (Given a Dataset) Analyze this dataset and give me a model that can predict this response variable. 

2-  What could be some issues if the distribution of the test data is significantly different than the distribution of the training data?

3-  What are some ways I can make my model more robust to outliers?

4-  What are some differences you would expect in a model that minimizes squared error, versus a model that minimizes absolute error? In which cases would each error metric be appropriate?

5- What error metric would you use to evaluate how good a binary classifier is? What if the classes are imbalanced? What if there are more than 2 groups?

6-  What are various ways to predict a binary response variable? Can you compare two of them and tell me when one would be more appropriate? What’s the difference between these? (SVM, Logistic Regression, Naive Bayes, Decision Tree, etc.)

7-  What is regularization and where might it be helpful? What is an example of using regularization in a model?

8-  Why might it be preferable to include fewer predictors over many?

9-  Given training data on tweets and their retweets, how would you predict the number of retweets of a given tweet 7 days after posting, having observed only the first 2 days’ worth of data?

10-  How could you collect and analyze data to use social media to predict the weather?

11- How would you construct a feed to show relevant content for a site that involves user interactions with items?

12- How would you design the “People You May Know” feature on LinkedIn or Facebook?

13- How would you predict who someone may want to send a Snapchat or Gmail to?

14- How would you suggest to a franchise where to open a new store?

15- In a search engine, given partial data on what the user has typed, how would you predict the user’s eventual search query?

16- Given a database of all previous alumni donations to your university, how would you predict which recent alumni are most likely to donate?

17- You’re Uber and you want to design a heatmap to recommend to drivers where to wait for a passenger. How would you approach this?

18- How would you build a model to predict a March Madness bracket?

19- You want to run a regression to predict the probability of a flight delay, but there are flights with delays of up to 12 hours that are really messing up your model. How can you address this?

 

Data Analysis Interview Questions

Source:  datasciencehandbook.me

1- (Given a Dataset) Analyze this dataset and tell me what you can learn from it.

2- What is R²? What are some other metrics that could be better than R² and why?

3- What is the curse of dimensionality?

4- Is more data always better?

5- What are advantages of plotting your data before performing analysis?

6- How can you make sure that you don’t analyze something that ends up meaningless?

7- What is the role of trial and error in data analysis? What is the role of making a hypothesis before diving in?

8- How can you determine which features are the most important in your model?

9- How do you deal with some of your predictors being missing?

10- You have several variables that are positively correlated with your response, and you think combining all of the variables could give you a good prediction of your response. However, you see that in the multiple linear regression, one of the weights on the predictors is negative. What could be the issue?

11- Let’s say you’re given an unfeasible amount of predictors in a predictive modeling task. What are some ways to make the prediction more feasible?

12- Now you have a feasible amount of predictors, but you’re fairly sure that you don’t need all of them. How would you perform feature selection on the dataset?

13- Your linear regression fails to run and reports that there are an infinite number of best estimates for the regression coefficients. What could be wrong?

14- You run your regression on different subsets of your data, and find that in each subset, the beta value for a certain variable varies wildly. What could be the issue here?

15- What is the main idea behind ensemble learning? If I had many different models that predicted the same response variable, what might I want to do to incorporate all of the models? Would you expect this to perform better than an individual model or worse?

16- Given that you have wifi data in your office, how would you determine which rooms and areas are underutilized and overutilized?

17- How could you use GPS data from a car to determine the quality of a driver?

18- Given accelerometer, altitude, and fuel usage data from a car, how would you determine the optimum acceleration pattern to drive over hills?

19- Given position data of NBA players in a season’s games, how would you evaluate a basketball player’s defensive ability?

20- How would you quantify the influence of a Twitter user?

21- Given location data of golf balls in games, how would you construct a model that can advise golfers where to aim?

22- You have 100 mathletes and 100 math problems. Each mathlete gets to choose 10 problems to solve. Given data on who got what problem correct, how would you rank the problems in terms of difficulty?

23- You have 5000 people that rank 10 sushis in terms of saltiness. How would you aggregate this data to estimate the true saltiness rank of each sushi?

24- Given data on congressional bills and which congressional representatives co-sponsored the bills, how would you determine which other representatives are most similar to yours in voting behavior? How would you evaluate who is the most liberal? Most Republican? Most bipartisan?

25- How would you come up with an algorithm to detect plagiarism in online content?

26- You have data on all purchases of customers at a grocery store. Describe to me how you would program an algorithm that would cluster the customers into groups. How would you determine the appropriate number of clusters to include?

27- Let’s say you’re building the music recommendation engine at Spotify to recommend music to people based on their past listening history. How would you approach this problem?

28- Explain how boosted tree models work in simple language.

29- What sort of data sampling techniques would you use for a low signal temporal classification problem?

30- How would you deal with categorical variables and what considerations would you keep in mind?

31- How would you identify leakage in your machine learning model?

32- How would you apply a machine learning model in a live experiment?

33- What is the difference between sensitivity, precision, and recall? When would you use these over accuracy? Name a few situations.

34- What’s the importance of train, val, test splits and how would you split or create your dataset – how would this impact your model metrics?

35- What are some simple ways to optimise your model, and how would you know you’ve reached a stable and performant model?

Statistical Inference Interview Questions

Source:  datasciencehandbook.me

1- In an A/B test, how can you check if assignment to the various buckets was truly random?

2- What might be the benefits of running an A/A test, where you have two buckets that are exposed to the exact same product?

3- What would be the hazards of letting users sneak a peek at the other bucket in an A/B test?

4- What would be some issues if blogs decide to cover one of your experimental groups?

5- How would you conduct an A/B test on an opt-in feature?

6- How would you run an A/B test for many variants, say 20 or more?

7- How would you run an A/B test if the observations are extremely right-skewed?

8- I have two different experiments that both change the sign-up button to my website. I want to test them at the same time. What kinds of things should I keep in mind?

9- What is a p-value? What is the difference between type-1 and type-2 error?

10- You are AirBnB and you want to test the hypothesis that a greater number of photographs increases the chances that a buyer selects the listing. How would you test this hypothesis?

11- How would you design an experiment to determine the impact of latency on user engagement?

12- What is maximum likelihood estimation? Could there be any case where it doesn’t exist?

13- What’s the difference between a MAP, MOM, MLE estimator? In which cases would you want to use each?

14- What is a confidence interval and how do you interpret it?

15- What is unbiasedness as a property of an estimator? Is this always a desirable property when performing inference? What about in data analysis or predictive modeling?

Product Metric Interview Questions

Source:  datasciencehandbook.me

1- What would be good metrics of success for an advertising-driven consumer product? (Buzzfeed, YouTube, Google Search, etc.) A service-driven consumer product? (Uber, Flickr, Venmo, etc.)

2- What would be good metrics of success for a productivity tool? (Evernote, Asana, Google Docs, etc.) A MOOC? (edX, Coursera, Udacity, etc.)

3- What would be good metrics of success for an e-commerce product? (Etsy, Groupon, Birchbox, etc.) A subscription product? (Netflix, Birchbox, Hulu, etc.) Premium subscriptions? (OKCupid, LinkedIn, Spotify, etc.)

4- What would be good metrics of success for a consumer product that relies heavily on engagement and interaction? (Snapchat, Pinterest, Facebook, etc.) A messaging product? (GroupMe, Hangouts, Snapchat, etc.)

5- What would be good metrics of success for a product that offered in-app purchases? (Zynga, Angry Birds, other gaming apps)

6- A certain metric is violating your expectations by going down or up more than you expect. How would you try to identify the cause of the change?

7- Growth for total number of tweets sent has been slow this month. What data would you look at to determine the cause of the problem?

8- You’re a restaurant and are approached by Groupon to run a deal. What data would you ask from them in order to determine whether or not to do the deal?

9- You are tasked with improving the efficiency of a subway system. Where would you start?

10- Say you are working on Facebook News Feed. What would be some metrics that you think are important? How would you make the news each person gets more relevant?

11- How would you measure the impact that sponsored stories on Facebook News Feed have on user engagement? How would you determine the optimum balance between sponsored stories and organic content on a user’s News Feed?

12- You are on the data science team at Uber and you are asked to start thinking about surge pricing. What would be the objectives of such a product and how would you start looking into this?

13- Say that you are Netflix. How would you determine what original series you should invest in and create?

14- What kind of services would find churn (metric that tracks how many customers leave the service) helpful? How would you calculate churn?

15- Let’s say that you’re scheduling content for a content provider on television. How would you determine the best times to schedule content?

Programming Questions

Source:  datasciencehandbook.me

1- Write a function to calculate all possible assignment vectors of 2n users, where n users are assigned to group 0 (control), and n users are assigned to group 1 (treatment).

2- Given a list of tweets, determine the top 10 most used hashtags.

3- Program an algorithm to find the best approximate solution to the knapsack problem in a given time.

4- Program an algorithm to find the best approximate solution to the travelling salesman problem in a given time.

5- You have a stream of data coming in of size n, but you don’t know what n is ahead of time. Write an algorithm that will take a random sample of k elements. Can you write one that takes O(k) space?

6- Write an algorithm that can calculate the square root of a number.

7- Given a list of numbers, can you return the outliers?

8- When can parallelism make your algorithms run faster? When could it make your algorithms run slower?

9- What are the different types of joins? What are the differences between them?

10- Why might a join on a subquery be slow? How might you speed it up?

11- Describe the difference between primary keys and foreign keys in a SQL database.

12- Given a COURSES table with columns course_id and course_name, a FACULTY table with columns faculty_id and faculty_name, and a COURSE_FACULTY table with columns faculty_id and course_id, how would you return a list of faculty who teach a course given the name of a course?

13- Given an IMPRESSIONS table with ad_id, click (an indicator that the ad was clicked), and date, write a SQL query that will tell me the click-through rate of each ad by month.

14- Write a query that returns the name of each department and a count of the number of employees in each:
EMPLOYEES containing: Emp_ID (Primary key) and Emp_Name
EMPLOYEE_DEPT containing: Emp_ID (Foreign key) and Dept_ID (Foreign key)
DEPTS containing: Dept_ID (Primary key) and Dept_Name

Probability Questions

1- Bobo the amoeba has a 25%, 25%, and 50% chance of producing 0, 1, or 2 offspring, respectively. Each of Bobo’s descendants also has the same probabilities. What is the probability that Bobo’s lineage dies out?

2- In any 15-minute interval, there is a 20% probability that you will see at least one shooting star. What is the probability that you see at least one shooting star in the period of an hour?

3- How can you generate a random number between 1 and 7 with only a die?

4- How can you get a fair coin toss if someone hands you a coin that is weighted to come up heads more often than tails?

5- You have a 50-50 mixture of two normal distributions with the same standard deviation. How far apart do the means need to be in order for this distribution to be bimodal?

6- Given draws from a normal distribution with known parameters, how can you simulate draws from a uniform distribution?

7- A certain couple tells you that they have two children, at least one of which is a girl. What is the probability that they have two girls?

8- You have a group of couples that decide to have children until they have their first girl, after which they stop having children. What is the expected gender ratio of the children that are born? What is the expected number of children each couple will have?

9- How many ways can you split 12 people into 3 teams of 4?

10- Your hash function assigns each object to a number between 1 and 10, each with equal probability. With 10 objects, what is the probability of a hash collision? What is the expected number of hash collisions? What is the expected number of hashes that are unused?

11- You call 2 UberX’s and 3 Lyfts. If the time that each takes to reach you is IID, what is the probability that all the Lyfts arrive first? What is the probability that all the UberX’s arrive first?

12- I write a program that should print out all the numbers from 1 to 300, but prints Fizz instead if the number is divisible by 3, Buzz instead if the number is divisible by 5, and FizzBuzz if the number is divisible by both 3 and 5. What is the total count of numbers that are either Fizzed, Buzzed, or FizzBuzzed?

13- On a dating site, users can select 5 out of 24 adjectives to describe themselves. A match is declared between two users if they match on at least 4 adjectives. If Alice and Bob randomly pick adjectives, what is the probability that they form a match?

14- A lazy high school senior types up applications and envelopes for n different colleges, but puts the applications randomly into the envelopes. What is the expected number of applications that went to the right college?

15- Let’s say you have a very tall father. On average, what would you expect the height of his son to be? Taller, equal, or shorter? What if you had a very short father?

16- What’s the expected number of coin flips until you get two heads in a row? What’s the expected number of coin flips until you get two tails in a row?

17- Let’s say we play a game where I keep flipping a coin until I get heads. If the first time I get heads is on the nth flip, I pay you 2^(n−1) dollars. How much would you pay me to play this game?

18- You have two coins, one of which is fair and comes up heads with probability 1/2, and the other of which is biased and comes up heads with probability 3/4. You randomly pick a coin and flip it twice, getting heads both times. What is the probability that you picked the fair coin?

19- You have a 0.1% chance of picking up a two-headed coin, and a 99.9% chance of picking up a fair coin. You flip your coin and it comes up heads 10 times. What’s the chance that you picked up the fair coin, given what you observed?

Reference: 800 Data Science Questions & Answers doc by

 

Direct download here

Reference: 164 Data Science Interview Questions and Answers by 365 Data Science

Download it here

Data Warehouse Cheat Sheet

What are the differences between Supervised and Unsupervised Learning?

Supervised                              | Unsupervised
Input data is labelled                  | Input data is unlabeled
Split into training/validation/test     | No split
Used for prediction                     | Used for analysis
Classification and regression           | Clustering, dimension reduction, and density estimation

Python Cheat Sheet

Download it here

Data Science Cheat Sheet

Download it here

Pandas Cheat Sheet

Download it here

Learn SQL with Practical Exercises

SQL is definitely one of the most fundamental skills needed to be a data scientist.

This is a comprehensive handbook that can help you learn SQL (Structured Query Language), which can be downloaded directly here

Credit: D Armstrong

Data Visualization: A comprehensive VIP Matplotlib Cheat sheet

Credit: Matplotlib

Download it here

Power BI for Intermediates

Download it here

Credit: Soheil Bakhshi and Bruce Anderson

Python Frameworks for Data Science

Natural Language Processing (NLP) is one of the most active areas in the field today.

Some of the applications are:

  • Reading printed text and correcting reading errors
  • Find and replace
  • Correction of spelling mistakes
  • Development of aids
  • Text summarization
  • Language translation
  • and many more.

NLP is a great area to focus on if you are planning to work in artificial intelligence.

High Level Look of AI/ML Algorithms

Best Machine Learning Algorithms for Classification: Pros and Cons

Business Analytics in one image

Curated papers, articles, and blogs on data science & machine learning in production from companies like Google, LinkedIn, Uber, Facebook, Twitter, Airbnb, and more:

  1. Data Quality
  2. Data Engineering
  3. Data Discovery
  4. Feature Stores
  5. Classification
  6. Regression
  7. Forecasting
  8. Recommendation
  9. Search & Ranking
  10. Embeddings
  11. Natural Language Processing
  12. Sequence Modelling
  13. Computer Vision
  14. Reinforcement Learning
  15. Anomaly Detection
  16. Graph
  17. Optimization
  18. Information Extraction
  19. Weak Supervision
  20. Generation
  21. Audio
  22. Validation and A/B Testing
  23. Model Management
  24. Efficiency
  25. Ethics
  26. Infra
  27. MLOps Platforms
  28. Practices
  29. Team Structure
  30. Fails

How to get a job in data science – a semi-harsh Q/A guide.

HOW DO I GET A JOB IN DATA SCIENCE?

Hey you. Yes you, person asking “how do I get a job in data science/analytics/MLE/AI whatever BS job with data in the title?”. I got news for you. There are two simple rules to getting one of these jobs.

Have experience.

Don’t have no experience.

There are approximately 1000 entry level candidates who think they’re qualified because they did a 24 week bootcamp for every entry level job. I don’t need to be a statistician to tell you your odds of landing one of these aren’t great.

HOW DO I GET EXPERIENCE?

Are you currently employed? If not, get a job. If you are, figure out a way to apply data science in your job, then put it on your resume. Mega bonus points here if you can figure out a way to attribute a dollar value to your contribution. Talk to your supervisor about career aspirations at year-end/mid-year reviews. Maybe you’ll find a way to transfer to a role internally and skip the whole resume ignoring phase. Alternatively, network. Be friends with people who are in the roles you want to be in, maybe they’ll help you find a job at their company.

WHY AM I NOT GETTING INTERVIEWS?

IDK. Maybe you don’t have the required experience. Maybe there are 500+ other people applying for the same position. Maybe your resume stinks. If you’re getting 1/20 response rate, you’re doing great. Quit whining.

IS XYZ DEGREE GOOD FOR DATA SCIENCE?

Does your degree involve some sort of non-remedial math higher than college algebra? Does your degree involve taking any sort of programming classes? If yes, congratulations, your degree will pass most base requirements for data science. Is it the best? Probably not, unless you’re CS or some really heavy math degree where half your classes are taught in Greek letters. Don’t come at me with those art history and underwater basket weaving degrees unless you have multiple years experience doing something else.

SHOULD I DO XYZ BOOTCAMP/MICROMASTERS?

Do you have experience? No? This ain’t gonna help you as much as you think it might. Are you experienced and want to learn more about how data science works? This could be helpful.

SHOULD I DO XYZ MASTER’S IN DATA SCIENCE PROGRAM?

Congratulations, doing a Master’s is usually a good idea and will help make you more competitive as a candidate. Should you shell out 100K for one when you can pay 10K for one online? Probably not. In all likelihood, you’re not gonna get $90K in marginal benefit from the more expensive program. Pick a known school (probably avoid really obscure schools, the name does count for a little) and you’ll be fine. Big bonus here if you can sucker your employer into paying for it.

WILL XYZ CERTIFICATE HELP MY RESUME?

Does your certificate say “AWS” or “AZURE” on it? If not, no.

DO I NEED TO KNOW XYZ MATH TOPIC?

Yes. Stop asking. Probably learn probability, be familiar with linear algebra, and understand what the hell a partial derivative is. Learn how to test hypotheses. Ultimately you need to know what the heck is going on math-wise in your predictions otherwise the company is going to go bankrupt and it will be all your fault.

WHAT IF I’M BAD AT MATH?

Do some studying or something. MIT opencourseware has a bunch of free recorded math classes. If you want to learn some Linear Algebra, Gilbert Strang is your guy.

WHAT PROGRAMMING LANGUAGES SHOULD I LEARN?

STOP ASKING THIS QUESTION. I CAN GOOGLE “HOW TO BE A DATA SCIENTIST” AND EVERY SINGLE GARBAGE TDS ARTICLE WILL TELL YOU SQL AND PYTHON/R. YOU’RE LUCKY YOU DON’T HAVE TO DEAL WITH THE JOY OF SEGMENTATION FAULTS TO RUN A SIMPLE LINEAR REGRESSION.

SHOULD I LEARN PYTHON OR R?

Both. Python is more widely used and tends to be more general purpose than R. R is better at statistics and data analysis, but is a bit more niche. Take your pick to start, but ultimately you’re gonna want to learn both you slacker.

SHOULD I MAKE A PORTFOLIO?

Yes. And don’t put some BS housing price regression, iris classification, or titanic survival project on it either. Next question.

WHAT SHOULD I DO AS A PROJECT?

IDK, what are you interested in? If you say twitter sentiment stock market prediction, go sit in the corner and think about what you just said. Every half-brained first year student who can pip install sklearn and do model.fit() has tried unsuccessfully to predict the stock market. The efficient market hypothesis is a thing for a reason. There are literally millions of other free datasets out there, and you have one of the most powerful search engines at your fingertips to go find them. Pick something you’re interested in, find some data, and analyze it.

DO I NEED TO BE GOOD WITH PEOPLE? (courtesy of /u/bikeskata)

Yes! First, when you’re applying, no one wants to work with a weirdo. You should be able to have a basic conversation with people, and they shouldn’t come away from it thinking you’ll follow them home and wear their skin as a suit. Once you get a job, you’ll be interacting with colleagues, and you’ll need them to care about your analysis. Presumably, there are non-technical people making decisions you’ll need to bring in as well. If you can’t explain to a moderately intelligent person why they should care about the thing that took you 3 days (and cost $$$ in cloud computing costs), you probably won’t have your position for long. You don’t need to be the life of the party, but you should be pleasant to be around.

Credit: u/save_the_panda_bears

Top 75 Data Science YouTube Channels

1- Alex The Analyst
2- Tina Huang
3- Abhishek Thakur
4- Michael Galarnyk
5- How to Get an Analytics Job
6- Ken Jee
7- Data Professor
8- Nicholas Renotte
9- KNN Clips
10- Ternary Data: Data Engineering Consulting
11- AI Basics with Mike
12- Matt Brattin
13- Chronic Coder
14- Intersnacktional
15- Jenny Tumay
16- Coding Professor
17- DataTalksClub
18- Ken’s Nearest Neighbors Podcast
19- Karolina Sowinska
20- Lander Analytics
21- Lights OnData
22- CodeEmporium
23- Andreas Mueller
24- Nate at StrataScratch
25- Kaggle
26- Data Interview Pro
27- Jordan Harrod
28- Leo Isikdogan
29- Jacob Amaral
30- Bukola
31- AndrewMoMoney
32- Andreas Kretz
33- Python Programmer
34- Machine Learning with Phil
35- Art of Visualization
36- Machine Learning University

Data Science and Data Analytics Breaking News – Top Stories

  • What exactly does a Data Warehouse do?
    by /u/nexcorp (Data Science) on May 19, 2022 at 9:10 am


  • People Are Dating All Wrong, According to Data Science
    by /u/valprop1 (Data Science) on May 19, 2022 at 8:41 am


  • Title Races among the Tennis Big Three [OC]
    by /u/iamtheguy55 (DataIsBeautiful) on May 19, 2022 at 8:36 am


  • How to develop knowledge and skills in modern data science practices
    by /u/scriptosens (Data Science) on May 19, 2022 at 8:29 am

    I am a data science researcher at a company, but it is more like a university environment where the research component prevails. Everyone talks about AI and data, but in practice people either build regression models on tiny datasets or make PowerPoints for management with even less use. I tend to learn more about the medical domain and stick to my habits and previous knowledge when it comes to building algorithms and models. It feels like I can't match most of the requirements in a typical DS position description. I want to build up my skills in more production-level data science and development practices: MLOps, Azure services, advanced visualization, Tableau, Power BI, mlflow, pipelines. My goal would be to be able to pass an interview for a more IT-oriented DS position. What's your advice on where to start, considering I still need to do my main job? Which skills should I focus on first? How do I keep my motivation long-term, since this is not a one-week task? Which specific online courses can get me up to speed fast? Any personal experience in a similar situation? In another thread in this sub, people recommended this course: https://www.dunderdata.com/build-an-interactive-data-analytics-dashboard-with-python. It seems practical, well prepared, and not too expensive. Any more recommendations like this?
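    For anyone starting on the same list, here is a minimal sketch of experiment tracking with mlflow, one of the tools mentioned above; the model and dataset are just placeholders so the example is self-contained:

    # A minimal MLflow tracking sketch; placeholder sklearn model and data.
    import mlflow
    import mlflow.sklearn
    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    with mlflow.start_run():
        model = LogisticRegression(max_iter=5000).fit(X_train, y_train)
        mlflow.log_param("max_iter", 5000)
        mlflow.log_metric("test_accuracy", model.score(X_test, y_test))
        mlflow.sklearn.log_model(model, "model")  # saves the fitted model as an artifact
    # Run `mlflow ui` afterwards to browse the logged runs in a browser.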

  • How Should I Approach my Imposter Syndrome?
    by /u/bryceking24 (Data Science) on May 19, 2022 at 8:26 am

    Next week I’ll be starting an internship. The past two nights I’ve been kept awake by thoughts that I might not know enough, that I just won’t bring anything of value to the table. I look around this community and see all the amazing projects people work on and all the keywords and buzzwords people throw around that I know very little about. There is so much about deep learning and machine learning that I just haven’t had the opportunity to learn yet, and I’m not even quite sure it would interest me. In short, I graduated from undergrad in 2021 with a Bachelor of Science in Econometrics and Quantitative Economics (Concentration in Computer Science) and am currently studying for a Master of Science in Economics. I have added a Graduate Certificate in Data Science because I feel that my program has fallen short of my expectations. I do plenty of documentation reading, video watching, and code practice online, but at the end of the day I feel like a glorified “googler”, like I couldn’t do anything without access to the internet because I just don’t have enough practice and muscle memory to really make it stick. Has anyone else been in this situation? How did it turn out for you? What helped you push beyond your feelings of self-doubt?

  • PhD Nerds of r/datascience, where did you do your PhD? How did it go?
    by /u/Endosym_ (Data Science) on May 19, 2022 at 7:13 am

    Looking to hear about people's experiences! What was your thesis? What was life like during your PhD? Feel free to go into as much detail as you like! I'm a few years away from it myself, but I'm starting to look into the possibility of completing a PhD in Data Science & Machine Learning. I thought hearing about people's experiences and opinions could help guide undergrads like me toward the PhD that's right for them (if a PhD is right for them in the first place)!

  • [Not OC] 80 days of war mapped in Ukraine
    by /u/almocalifornia9 (DataIsBeautiful) on May 19, 2022 at 6:20 am


  • PySpark or Koalas?
    by /u/Apprehensive_Limit35 (Data Science) on May 19, 2022 at 6:19 am

    For those working in Databricks using Python, does your team try to use Koalas as common syntax, or does your team prefer PySpark? I am trying to understand if learning the new-to-me syntax of PySpark is a valuable skill compared to just writing in Koalas, which is more familiar given my pandas base. Data analytics/science team, not DE.
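    For context, Koalas was folded into Spark itself as pyspark.pandas from Spark 3.2 onward, so both styles now ship together. Here is a minimal sketch of the same aggregation both ways; the file and column names are hypothetical:

    # The same aggregation in plain PySpark vs. the pandas API on Spark (Koalas-style).
    from pyspark.sql import SparkSession, functions as F
    import pyspark.pandas as ps

    spark = SparkSession.builder.getOrCreate()

    # PySpark DataFrame API
    sdf = spark.read.csv("sales.csv", header=True, inferSchema=True)
    sdf.groupBy("region").agg(F.avg("revenue").alias("avg_revenue")).show()

    # pandas API on Spark: familiar if you come from pandas
    pdf = ps.read_csv("sales.csv")
    print(pdf.groupby("region")["revenue"].mean())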

  • Need Data Set for Class Project, was going with tobacco use and states with tax hikes but open to other ideas. Need a decent and free data set.
    by /u/TehDonkey117 (Data Science) on May 19, 2022 at 4:55 am

    If the tobacco product category is included, I would also like to see if usage of those products increased more in states where cigarette usage decreased. I went to the CDC and found survey data, but my professor said it wasn't good enough. I am open to other ideas; I need to show potential correlation between two or more items. I am trying to avoid race and crime since that seems to be covered by multiple students already. Again, if you know of one or more data sets with a different topic, that works too. I just need to show I can comprehend the fundamentals of the class.
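    For the correlation part, here is a minimal sketch of the kind of check the assignment needs; the file and column names are hypothetical, so adapt them to whatever dataset you settle on:

    # A minimal sketch: correlating tax levels with usage rates across states.
    import pandas as pd

    df = pd.read_csv("tobacco_by_state.csv")  # hypothetical columns: state, year, tax, smoking_rate
    df = df.sort_values(["state", "year"])

    print(df[["tax", "smoking_rate"]].corr())  # overall Pearson correlation

    # Per-state changes: did usage fall more where taxes rose more?
    change = df.groupby("state").agg(
        tax_change=("tax", lambda s: s.iloc[-1] - s.iloc[0]),
        usage_change=("smoking_rate", lambda s: s.iloc[-1] - s.iloc[0]),
    )
    print(change.corr())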

  • AI research to industrial data science transition
    by /u/Spike__Leon (Data Science) on May 19, 2022 at 3:46 am

    The title may be misleading: I'm not an AI researcher, my education and internships were just AI research-oriented. Currently lost, looking for career advice from data practitioners. I studied applied math for 4 years, then for my last year I got into a data science program focused on AI research. I mostly did deep learning during this year. Also, I had 2 deep learning research internships, the second one for my master's thesis. I wrote a paper in the first one and we tried to publish it in a peer-reviewed journal, but we got denied and just abandoned it. I still put it in my resume though. I tried to land a machine learning engineer job but with no success, for the reason that I have very limited skills. I know Python and most of its commonly used libraries/frameworks, and the basics of SQL. I have worked in R in my studies but I don't have a professional level in it. I thought about trying to land a PhD, but my grades are probably not good enough (3.2/5), and I figured that I'm not good enough to be an AI researcher, and that a PhD is considered overqualified for a data scientist role (in my country at least; others I don't know). I tried looking at data scientist job offers, but they required a lot of data engineering skills, which I don't have. I thought about data analyst/data engineer, but again I lack the core skills. The thing to know is that where I live, 90% of job offers are for data engineering and 9% for data analyst/scientist. But I have literally never done any data engineering, even in my studies. I know basic stuff about parallelism, but that's all. Now I'm kind of lost about what to do. I'm really interested in deep learning but it's too difficult for me; data science could be great, but I feel that I lack most of the skills, the technical ones but also the soft skills that come with them. What could I do? Did anyone here make the research-to-industry transition? Or the deep learning-to-data science transition? What do you think?

  • Tips on Version Control
    by /u/XhoniShollaj (Data Science) on May 19, 2022 at 3:09 am

    As a data scientist, what were some of your main discoveries regarding version control and collaboration platforms (i.e. Git and GitHub/GitLab)? What are some useful tips that you wish you had known when you first started getting your hands on CI/CD?

  • Which ETL tools are in demand?
    by /u/JAY_1520 (Data Science) on May 19, 2022 at 2:58 am

    I use Pentaho Data Integration as an ETL tool at work, but I want to know if there's a more powerful or better tool that's highly used. If yes, should I switch, or am I just fine with Pentaho? Note: I mostly use it for loading databases or transforming files and then outputting a dataset as CSV, depending on requirements.
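    For comparison with GUI tools like Pentaho, here is a minimal sketch of the same load-transform-export-to-CSV flow done in code with pandas and SQLAlchemy; the connection string, table, and column names are hypothetical:

    # A minimal code-based ETL sketch: extract from a database, transform, export CSV.
    import pandas as pd
    from sqlalchemy import create_engine

    engine = create_engine("postgresql://user:password@localhost:5432/warehouse")

    df = pd.read_sql("SELECT * FROM orders", engine)      # extract
    df["order_date"] = pd.to_datetime(df["order_date"])   # transform
    daily = df.groupby(df["order_date"].dt.date)["amount"].sum().reset_index()
    daily.to_csv("daily_orders.csv", index=False)         # load/export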

  • What method would be best for constrained multi-target regression?
    by /u/DataScience0 (Data Science) on May 19, 2022 at 2:24 am

    I have a dataset with multiple features and am attempting to train a model that can simultaneously predict several targets at once, constrained from 0 to 1. For example, let's pretend the targets are a series of variables describing the proportion of the day each observation (a person) spends sleeping, eating, working, and relaxing. The sum of these 4 variables will equal 1, but each is just a proportion of time, on average, per 24 hours for that individual. I also have a feature set with demographic info like age, height, weight, gender, country, etc. that will be the independents in the model. Example rows:

    person  age  gender  height  weight  sleeping  eating  working  relaxing
    A       25   M       72      200     0.2       0.1     0.4      0.3
    B       47   M       70      175     0.35      0.05    0.4      0.2

    What kind of model enables predictions of the four states while keeping those predictions constrained from 0 to 1 (and summing to 1)? An example in R or Python would be greatly appreciated.
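    One standard answer: predict unconstrained scores and push them through a softmax, which makes every output non-negative and forces each row to sum to 1 (Dirichlet regression is another option for compositional targets like these). A minimal PyTorch sketch on synthetic data:

    # A minimal sketch: multi-target regression constrained to the simplex via softmax.
    import torch
    import torch.nn as nn

    n, d, k = 500, 5, 4                       # rows, features, targets
    X = torch.randn(n, d)                     # synthetic stand-in for demographics
    raw = torch.rand(n, k)
    Y = raw / raw.sum(dim=1, keepdim=True)    # synthetic proportions summing to 1

    model = nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, k))
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)

    for _ in range(300):
        pred = torch.softmax(model(X), dim=1)  # softmax enforces 0-1 and sum-to-1
        loss = ((pred - Y) ** 2).mean()        # MSE on proportions; KL divergence also works
        opt.zero_grad()
        loss.backward()
        opt.step()

    print(torch.softmax(model(X[:3]), dim=1).sum(dim=1))  # each row sums to 1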

  • What are the differences / pros and cons of MS Visual Studio vs TOAD for SQL Server?
    by /u/Abject-Suit9107 (Data Science) on May 19, 2022 at 1:20 am

    Hello everyone. I was wondering what the list of differences is between using MS Visual Studio and TOAD for SQL Server, and which one would be more useful. My company is considering this transition given that our vendors operate on TOAD software and we don't; we use Visual Studio in conjunction with our ERP. I'm compiling a presentation on the pros and cons of both, plus my recommendations, but all I've found online is that TOAD can be more useful for automating simple tasks and that it can query multiple databases simultaneously. Is that it?

  • Heights of Mountains in Washington State over time
    by /u/raiderpower17 (DataIsBeautiful) on May 19, 2022 at 12:59 am


  • Is it possible to track dynamic variables via 3rd party integration?
    by /u/Ok_Cartoonist_2105 (Data Science) on May 18, 2022 at 10:04 pm

    Please delete if I'm not allowed to ask this question. I'm an accountant and out of my depth here. Within my industry I can't find a way to automatically track dynamic variables via 3rd-party integration. I'm not even sure I'm using the correct terminology here, but basically I need to be able to track certain articles across geos, clicks, etc. The only way to access the info is with my username and ID. I know it's a 3rd-party integration. So at the moment I have to manually pull reports and then manually input the data. Is it possible to build a dashboard that gives me this info along with other information? I have an idea of what I want already.

  • [OC] Today's Market Meltdown
    by /u/Janman14 (DataIsBeautiful) on May 18, 2022 at 9:00 pm


  • Looking for help (or internet resources) with motion capture/tracking
    by /u/showme_watchu_gaunt (Data Science) on May 18, 2022 at 9:00 pm

    Hi hi, I'm a data scientist and avid rock climber and want to merge the two, specifically by making my training more measurable. I thought it would be fun to try to deploy some sort of MoCap where I could track my center of gravity, so I could detect how fast I do certain activities (velocity training), e.g. pull-ups. I did some computer vision stuff a long time ago, but I'm a little out of touch with the field and was wondering what the current state of the art is for MoCap. Do you have any resources that I could look at, or are you aware of any open source stuff out there that's pretty plug and play? I'm not trying to be lazy and have done some research myself, but wanted to see what the community had to say. GOAL: record myself doing a pull-up and track center of gravity vs. time with limited equipment (tripod + phone would be best). Thanks!
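    One plug-and-play option worth a look is a pose-landmark model such as MediaPipe Pose. A minimal sketch, assuming a hypothetical video "pullup.mp4" from a tripod-mounted phone, using the hip midpoint as a crude center-of-gravity proxy:

    # A minimal sketch: approximate center of gravity (hip midpoint) per frame.
    import cv2
    import mediapipe as mp

    mp_pose = mp.solutions.pose
    cap = cv2.VideoCapture("pullup.mp4")
    trajectory = []  # (frame_index, x, y) in normalized image coordinates

    with mp_pose.Pose() as pose:
        frame_idx = 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            results = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if results.pose_landmarks:
                lm = results.pose_landmarks.landmark
                l = lm[mp_pose.PoseLandmark.LEFT_HIP]
                r = lm[mp_pose.PoseLandmark.RIGHT_HIP]
                trajectory.append((frame_idx, (l.x + r.x) / 2, (l.y + r.y) / 2))
            frame_idx += 1
    cap.release()
    # Velocity is roughly the frame-to-frame change in y times fps; smooth before using.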

  • Firearm-related deaths increased by 30% for those under 19 years of age from 2019 to 2020, surpassing motor vehicle crashes as the leading cause of death in children and teens. [OC]
    by /u/USAFacts_Official (DataIsBeautiful) on May 18, 2022 at 8:43 pm


  • Are there any advanced data science courses out there?
    by /u/PersonalGlove515 (Data Science) on May 18, 2022 at 8:40 pm

    I have about 6 years of experience in data science, with experience across the whole data cycle, from gathering data from APIs to building APIs myself with a machine learning model inside. I'm looking for an advanced course: not advanced in the sense of learning how to train a Bayesian belief network, but advanced in the sense of making insightful dashboards, tricks to engineer features better, and stuff like that. If you know any, please drop a comment. Thanks!