What is the tech stack behind Google Search Engine?
Google Search is one of the most popular search engines on the web, handling over 3.5 billion searches per day. But what is the tech stack that powers Google Search?
The PageRank algorithm is at the heart of Google Search. It was developed by Google co-founders Larry Page and Sergey Brin while they were at Stanford, and the patent application was filed in 1998. It ranks web pages based on their quality and importance, taking into account signals such as incoming links from other websites. PageRank has evolved constantly over the years, and it continues to be a key part of Google Search today.
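To make the link-based ranking idea concrete, here is a minimal sketch of the textbook PageRank power iteration. It is not Google's production code: the `pagerank` function, the damping factor of 0.85, the iteration count, and the four-page `toy_web` graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    All names and parameter values here are illustrative assumptions.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" has the most incoming links, "d" has none.
toy_web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = pagerank(toy_web)
best_page = max(ranks, key=ranks.get)
worst_page = min(ranks, key=ranks.get)
```

In this toy graph the page with the most incoming links ends up with the highest score, and the page nobody links to ends up with the lowest, which is the behavior the paragraph above describes.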
However, the PageRank algorithm is just one part of the story. The Google Search Engine also relies on a sophisticated infrastructure of servers and data centers spread around the world. This infrastructure enables Google to crawl and index billions of web pages quickly and efficiently. Additionally, Google has developed a number of proprietary technologies to further improve the quality of its search results. These include technologies like Spell Check, SafeSearch, and Knowledge Graph.
The technology stack that powers the Google Search Engine is immensely complex, and includes a number of sophisticated algorithms, technologies, and infrastructure components. At the heart of the system is the PageRank algorithm, which ranks pages based on a number of factors, including the number and quality of links to the page. The algorithm is constantly being refined and updated in order to deliver more relevant and accurate results. In addition to PageRank, Google reportedly uses a number of other algorithms, such as Latent Semantic Indexing, which helps to index and retrieve documents based on their meaning. The search engine also relies on a massive infrastructure, which includes hundreds of thousands of servers around the world. While Google is the dominant player in the search engine market, there are a number of other well-established competitors, such as Microsoft’s Bing and DuckDuckGo.
The original Google algorithm was called PageRank, named after inventor Larry Page (though, fittingly, the algorithm does rank web pages).
After 17 years of work by many software engineers, researchers, and statisticians, Google search uses algorithms upon algorithms upon algorithms.
- The various components used by Google Search are all proprietary, but most of the code is written in C++.
- Google publishes a number of technical explanations of how Search works, and these mark the limit of what can be shared publicly.
- Abseil (https://abseil.io) and GoogleTest (https://google.github.io/googletest/) are the main open source Google C++ libraries, and both are used extensively in Search.
- Bazel (https://bazel.build) is another open source tool, a build system, that is heavily used all across Google, including for Search.
- Google has general information on you, the kinds of things you might like, the sites you frequent, etc. When it fetches search results, they get ranked, and this personal info is used to adjust the rankings, resulting in different search results for each user.
How does Google’s indexing algorithm (so it can do things like fuzzy string matching) technically structure its index?
- There is no single technique that works.
- At a basic level, all search engines have something like an inverted index, so you can look up words and associated documents. There may also be a forward index.
- One way of constructing such an index is by stemming words. Stemming is done with an algorithm that boils words down to their basic root. The most famous stemming algorithm is the Porter stemmer.
- However, there are other approaches. One is to build n-grams, sequences of n letters, so that you can do partial matching. You often would choose multiple n’s, and thus have multiple indexes, since some n-letter combinations are common (e.g., “th”) for small n’s, but larger values of n undermine the intent.
- I don’t know that we can say “nothing absolute is known”. Look at misspellings. Google can resolve a lot of them. This isn’t surprising; we’ve had spellcheckers for at least 40 years. However, the less common a misspelling, the harder it is for Google to catch.
- One cool thing about Google is that they have been studying and collecting data on searches for more than 20 years. I don’t mean that they have been studying searching or search engines (although they have been), but that they have been studying how people search. They process several billion search queries each day. They have developed models of what people really want, which often isn’t what they say they want. That’s why they track every click you make on search results… well, that and the fact that they want to build effective models for ad placement.
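The n-gram approach mentioned in the list above can be sketched as follows. This is only an illustration of partial matching with letter trigrams, not a description of Google's index: the `NGramIndex` class, the `$` padding character, the Jaccard similarity measure, and the 0.3 threshold are all assumptions.

```python
from collections import defaultdict


def ngrams(word, n=3):
    """Return the set of letter n-grams of a word, padded with '$' at both ends."""
    padded = f"${word.lower()}$"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


class NGramIndex:
    """Hypothetical index from n-gram to the vocabulary words containing it."""

    def __init__(self, n=3):
        self.n = n
        self.postings = defaultdict(set)

    def add(self, word):
        for gram in ngrams(word, self.n):
            self.postings[gram].add(word)

    def search(self, query, min_similarity=0.3):
        query_grams = ngrams(query, self.n)
        # Candidate words share at least one n-gram with the query.
        candidates = set()
        for gram in query_grams:
            candidates |= self.postings.get(gram, set())
        scored = []
        for word in candidates:
            word_grams = ngrams(word, self.n)
            # Jaccard similarity: shared n-grams divided by all n-grams.
            similarity = len(query_grams & word_grams) / len(query_grams | word_grams)
            if similarity >= min_similarity:
                scored.append((similarity, word))
        # Best match first; ties broken alphabetically.
        return [word for _, word in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]


index = NGramIndex()
for term in ["search", "searching", "engine", "research", "ranking"]:
    index.add(term)

# "searh" is a misspelling with one letter missing.
matches = index.search("searh")
```

Because the misspelled query still shares several trigrams with the intended word, the lookup can recover it without an exact match. A real system would likely combine several values of n, as the list notes.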
Each year, Google changes its search algorithm around 500–600 times. While most of these changes are minor, Google occasionally rolls out a “major” algorithmic update (such as Google Panda and Google Penguin) that affects search results in significant ways.
For search marketers, knowing the dates of these Google updates can help explain changes in rankings and organic website traffic and ultimately improve search engine optimization. Below, we’ve listed the major algorithmic changes that have had the biggest impact on search.
The original algorithm took a starting page and added all the unique words on the page to the index (a word that occurred more than once on the page was counted only once), or incremented the index count if the word was already in the index.
The page was indexed by the number of references the algorithm found to the specific page. So each time the system found a link to the page on a newly discovered page, the page count was incremented.
When you did a search, the system would identify all the pages with those words on it and show you the ones that had the most links to them.
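The three steps just described (index unique words, count links to each page, show the most-linked matches first) can be sketched like this. The `PAGES` dictionary is a made-up in-memory stand-in for the web, and `crawl` and `search` are hypothetical names; nothing here reflects actual Google code.

```python
from collections import defaultdict

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "a.example": ("google search engine", ["b.example", "c.example"]),
    "b.example": ("search engine ranking", ["c.example"]),
    "c.example": ("search ranking algorithm search", ["a.example"]),
}


def crawl(start):
    """Follow links from `start`, indexing words and counting inbound links."""
    word_index = defaultdict(set)     # word -> URLs of pages containing it
    inbound_links = defaultdict(int)  # URL -> number of links found pointing to it
    seen, frontier = set(), [start]
    while frontier:
        url = frontier.pop()
        if url in seen or url not in PAGES:
            continue
        seen.add(url)
        text, links = PAGES[url]
        # set() ensures a word repeated on a page is only counted once.
        for word in set(text.split()):
            word_index[word].add(url)
        # Each discovered link increments the target page's count.
        for target in links:
            inbound_links[target] += 1
            frontier.append(target)
    return word_index, inbound_links


def search(query, word_index, inbound_links):
    """Return pages containing every query word, most-linked first."""
    words = query.split()
    if not words:
        return []
    matching = set.intersection(*(word_index.get(word, set()) for word in words))
    return sorted(matching, key=lambda url: (-inbound_links.get(url, 0), url))


word_index, inbound_links = crawl("a.example")
results = search("search ranking", word_index, inbound_links)
```

Here `c.example` is listed before `b.example` because two pages link to it and only one links to `b.example`.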
As people searched and visited pages from the search results, Google would also track the pages that people would click to from the search page. Those that people clicked would also be identified as a better quality match for that set of search terms. If the person quickly came back to the search page and clicked another link, the match quality would be reduced.
Now, Google is using natural language processing, a method of trying to guess what the user really wants. From that, it finds similar words that might give a better set of results, based on searches done by millions of other people like you. It might assume that you really meant this other word instead of the word you used in your search terms. It might just give you matches in the list with those other words as well as the words you provided.
It really all boils down to the fact that Google has been monitoring a lot of people doing searches for a very long time. It has a huge list of websites and search terms that have done the job for a lot of people.
There are a lot of proprietary algorithms, but the real magic is that they’ve been watching you and everyone else for a very long time.
What programming language powers Google’s search engine core?
C++, mostly. There are little bits in other languages, but the core of both the indexing system and the serving system is C++.
How does Google handle the technical aspect of fuzzy matching? How is the index implemented for that?
- With n-grams and word stemming, plus correction of misspelled words. N-grams allow partial matching of anything.
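As a rough illustration of the stemming and spelling-correction parts of that answer, the sketch below uses a deliberately crude suffix stripper (much simpler than the Porter stemmer) and a standard Levenshtein edit distance. The function names, the suffix list, and the `max_distance` cutoff are assumptions, not anything Google has published.

```python
def crude_stem(word):
    """Strip a few common English suffixes. Far simpler than the Porter stemmer."""
    word = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three letters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


def correct(word, vocabulary, max_distance=2):
    """Return the closest known word, or the input if nothing is close enough."""
    best = min(vocabulary, key=lambda known: (edit_distance(word, known), known))
    return best if edit_distance(word, best) <= max_distance else word


vocabulary = ["google", "search", "engine"]
fixed = correct("gogle", vocabulary)
stemmed = [crude_stem(w) for w in ["searching", "indexes", "ranked"]]
```

Comparing a query word against every vocabulary entry is only workable for a tiny vocabulary. At scale, an n-gram index would typically be used first to narrow down the candidates.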
Use a ping service. Ping services can speed up your indexing process.
- Search Google for “pingmylinks”
- Click on the “add url” in the upper left corner.
- Submit your website and make sure to use all the submission tools and your site should be indexed within hours.
Our ranking algorithm simply doesn’t rank google.com highly for the query “search engine.” There is not a single, simple reason why this is the case. If I had to guess, I would say that people who type “search engine” into Google are usually looking for general information about search engines or about alternative search engines, and neither query is well-answered by listing google.com.
To be clear, we have never manually altered the search results for this (or any other) specific query.
The basic idea is using an inverted index. This means for each word keeping a list of documents on the web that contain it.
Responding to a query means retrieving the matching documents (basically by intersecting the lists for the query words), processing the documents (extracting quality signals for each document-query pair), ranking the documents (using document quality signals like PageRank, query signals, and query/document signals), and then returning the top 10 documents.
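A minimal sketch of that retrieve-then-rank flow might look like the following. The four sample documents and the `QUALITY` scores (standing in for a signal such as PageRank) are invented for illustration, and a real ranker would combine many more signals.

```python
from collections import defaultdict

# Hypothetical document collection: document id -> text.
DOCUMENTS = {
    1: "google search engine architecture",
    2: "search engine ranking signals",
    3: "cooking recipes for pasta",
    4: "how a search engine builds an inverted index",
}
# Hypothetical per-document quality scores standing in for PageRank.
QUALITY = {1: 0.9, 2: 0.5, 3: 0.7, 4: 0.6}


def build_inverted_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def answer_query(query, index, quality, top_k=10):
    """Intersect the posting sets of the query words, then rank by quality."""
    words = query.lower().split()
    if not words:
        return []
    matching = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matching, key=lambda doc_id: -quality[doc_id])[:top_k]


inverted_index = build_inverted_index(DOCUMENTS)
top_documents = answer_query("search engine", inverted_index, QUALITY)
```

Document 3 has a high quality score but never appears in `top_documents`, because retrieval by intersection happens before ranking.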
Here are some tricks for doing the retrieval part efficiently:
- distribute the whole thing over thousands and thousands of machines
- do it in memory
- look first at the query word with the shortest document list
- keep the documents in each list in descending PageRank order, so that retrieval can stop early once enough good quality matches are found
- keep lists for pairs of words that frequently occur together
- shard by document id, so that the load is somewhat evenly distributed and the intersection is done in parallel
- compress messages that are sent across the network
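Three of the tricks above (sharding by document id, starting from the shortest list, and stopping early on quality-ordered lists) can be combined in a small sketch. The shard count, sample documents, quality scores, and function names are all assumptions. The shards are visited in a plain loop here, whereas a real deployment would query them in parallel on separate machines.

```python
from collections import defaultdict

NUM_SHARDS = 3  # hypothetical; each shard would live on its own machine

DOCUMENTS = {
    1: "search engine index",
    2: "search engine ranking",
    3: "search index sharding",
    4: "engine oil",
    5: "search engine latency",
    6: "web search engine",
}
# Hypothetical quality scores standing in for PageRank.
QUALITY = {1: 0.4, 2: 0.9, 3: 0.8, 4: 0.7, 5: 0.6, 6: 0.5}


def build_shards(documents, quality, num_shards=NUM_SHARDS):
    """Shard by document id; posting lists are kept in descending quality order."""
    shards = [defaultdict(list) for _ in range(num_shards)]
    # Visiting documents best-first keeps every posting list sorted by quality.
    for doc_id in sorted(documents, key=lambda d: -quality[d]):
        shard = shards[doc_id % num_shards]
        for word in set(documents[doc_id].lower().split()):
            shard[word].append(doc_id)
    return shards


def search_shard(shard, words, limit):
    """Intersect posting lists within one shard, stopping after `limit` hits."""
    lists = [shard.get(word, []) for word in words]
    lists.sort(key=len)  # scan the shortest list, probe the others
    others = [set(posting) for posting in lists[1:]]
    hits = []
    for doc_id in lists[0]:
        if all(doc_id in other for other in others):
            hits.append(doc_id)
            if len(hits) == limit:
                break  # lists are quality-ordered, so later hits are no better
    return hits


def search(shards, quality, query, limit=2):
    """Collect the best hits from every shard and merge them by quality."""
    words = query.lower().split()
    if not words:
        return []
    hits = []
    for shard in shards:
        hits.extend(search_shard(shard, words, limit))
    return sorted(hits, key=lambda d: -quality[d])[:limit]


shards = build_shards(DOCUMENTS, QUALITY)
top_hits = search(shards, QUALITY, "search engine")
```

Because each shard returns its own best `limit` matches, merging the per-shard results and keeping the top `limit` gives the same answer as searching one global index.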
Jeff Dean in this great talk explains quite a few bits of the internal Google infrastructure. He mentions a few of the previous ideas in the talk.
He goes through the evolution of the Google Search Serving Design and through MapReduce while giving general advice about building large scale systems.
As for complexity, it’s pretty hard to analyze because of all the moving parts, but Jeff mentions that the latency per query is about 0.2 s and that each query touches, on average, 1000 computers.
If Lemoine’s claims are true, it would be a milestone in the history of humankind and technological development.
Google strongly denies LaMDA has any sentient capacity.
Fun facts about Google Search Engine Competitors
Data Source: statcounterGS
Tools Used: Excel & PowerPoint
Edit: Note that the figure for Baidu/China is likely higher. The way statcounterGS collects its data might understate the number of users from China.
Baidu is popular in China, Yandex is popular in Russia.
Yandex is great for reverse image searches; Google just can’t compete with Yandex in that category.
Normal Google reverse search is a joke (except for finding a bigger version of a pic, it’s good for that), but Google Lens can be as good or sometimes better at finding similar images or locations than Yandex depending on the image type. Always good to try both, and also Bing can be decent sometimes.
Bing has been profitable since 2015 even with less than 3% of the market share. So just imagine how much money Google is taking in.
Firstly: Yahoo, DuckDuckGo, Ecosia, etc. all use Bing to get their search results. Which means Bing’s usage is more than the 3% indicated.
Secondly: This graph shows overall market share (phones and PCs). But search engines make most of their money on desktop searches, due to more screen space for ads. And Bing’s market share on desktop is WAY bigger, while its market share on phones is ~0%. Its American desktop market share is 10-15%. That is where the money is.
What you are saying is in fact true though. We make trillions of web searches – which means even three percent market-share equals billions of hits and a ton of money.
I like DuckDuckGo, and they have good privacy features. I just wish their maps were better, because if I’m searching for a local restaurant nothing is easier than Google for going from the search to the map to the company’s webpage. But for informative searches I think it gives a more objective, less curated return.
Use Ecosia and profits go to reforestation efforts!
Turns out people don’t care about their privacy, especially if it gets them results.
I recently switched to using the Brave browser and DuckDuckGo, and I basically can’t tell the difference from using Google and Chrome.
The only times I’ve needed to use Google are for really specific searches where DuckDuckGo doesn’t always seem to give the expected results. But for daily browsing it’s absolutely fine and far, far better for privacy.
Does Google Search have the most complex functionality hiding behind a simple looking UI?
There is a lot that happens between the moment a user types something in the input field and when they get their results.
Google Search has a high-level overview, but the gist of it is that there are dozens of sub systems involved and they all work extremely fast. The general idea is that search is going to process the query, try to understand what the user wants to know/accomplish, rank these possibilities, prepare a results page that reflects this and render it on the user’s device.
I would not describe the UI as simple. Yes, the initial state looks like a single input field on an otherwise empty page. But there is already a lot going on in that input field and how it’s presented to the user. And then, as soon as the user interacts with the field, for instance as they start typing, a ton of other things happen: Search is able to pre-populate suggested queries really fast. Plus there’s a whole “syntax” to search, with operators and whatnot, and there are many different modes (image, news, etc.).
One recent iteration of Google Search is Google Lens. The Google Lens interface is even simpler than the single input field: just take a picture with your phone! But under the hood a lot is going on.
The Google search engine is a remarkable feat of engineering, and its capabilities are only made possible by the use of cutting-edge technology. At the heart of the Google search engine is the PageRank algorithm, which is used to rank web pages in order of importance. This algorithm takes into account a variety of factors, including the number and quality of links to a given page. In order to effectively crawl and index the billions of web pages on the internet, Google has developed a sophisticated infrastructure that includes tens of thousands of servers located around the world. This infrastructure enables Google to rapidly process search queries and deliver relevant results to users in a matter of seconds. While Google is the dominant player in the search engine market, there are a number of other search engines that compete for users, including Bing and DuckDuckGo. However, none of these competitors has been able to replicate the success of Google, due in large part to the company’s unrivaled technological capabilities.