How does a database handle pagination?

It doesn’t. First, a database is a collection of related data, so I assume you mean DBMS or database language.

Second, pagination is generally a function of the front-end and/or middleware, not the database layer.

But some database languages provide facilities that aid in implementing pagination. For example, many SQL dialects provide LIMIT and OFFSET clauses that can be used to emit up to n rows starting at a given row number, i.e., a “page” of rows. If the query results are sorted via ORDER BY and are generally unchanged between successive invocations, then that can be used to implement pagination.
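Here is a minimal sketch of LIMIT/OFFSET paging, using Python's built-in sqlite3 module purely for illustration. The `users` table, its columns, and the helper names `make_demo_db` and `fetch_page` are made up for this example; other SQL dialects spell the clause differently.

```python
import sqlite3


def make_demo_db(n_rows=95):
    """Build an in-memory table of hypothetical users for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (id, name) VALUES (?, ?)",
        [(i, f"user{i}") for i in range(1, n_rows + 1)],
    )
    return conn


def fetch_page(conn, page, page_size):
    """Return one page of rows; pages are numbered from 1."""
    offset = (page - 1) * page_size
    # ORDER BY makes the row order stable, so page boundaries stay put
    # between successive calls as long as the data does not change.
    cur = conn.execute(
        "SELECT id, name FROM users ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    for row in fetch_page(conn, page=2, page_size=10):
        print(row)
```

Note that the database still has to walk past the skipped rows on every call, so the cost of a page grows with the offset.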

That may not be the most efficient or effective implementation, though.
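One commonly used alternative, which this thread does not cover and which I am adding only as an illustration, is keyset (or "seek") pagination: instead of skipping rows with OFFSET, the query asks for rows after the last key seen on the previous page. A sketch with sqlite3, where the `articles` table and the helper names are hypothetical:

```python
import sqlite3


def make_demo_db(n_rows=25):
    """Build an in-memory table of hypothetical articles for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO articles (id, title) VALUES (?, ?)",
        [(i, f"article {i}") for i in range(1, n_rows + 1)],
    )
    return conn


def fetch_after(conn, last_id, page_size):
    """Return up to page_size rows whose id is greater than last_id."""
    cur = conn.execute(
        "SELECT id, title FROM articles WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, page_size),
    )
    return cur.fetchall()


def iter_pages(conn, page_size):
    """Yield successive pages until the table is exhausted."""
    last_id = 0  # assumption: ids are positive integers
    while True:
        rows = fetch_after(conn, last_id, page_size)
        if not rows:
            return
        yield rows
        last_id = rows[-1][0]  # remember the last key seen on this page


if __name__ == "__main__":
    for page in iter_pages(make_demo_db(), page_size=10):
        print([row[0] for row in page])
```

The trade-off is that this only supports "next page" style navigation on a unique, ordered key; jumping straight to page 37 still needs OFFSET or some other scheme.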


So how do you propose pagination should be done?

In the context of web apps, let’s say there are 100 million users. One cannot dump all the users in a single response.

Cache database query results in the middleware layer using Redis or similar and serve out pages of rows from that.

What if you have 30,000+ rows? Do you fetch all of that from the database and cache it in Redis?

I feel the most efficient solution is still offset and limit. It doesn’t make sense to use a database and then end up putting all of your data in Redis, especially data that changes a lot. Redis is not for storing all of your data.


If you have a large data set, you should use offset and limit. Fetching only what is needed from the database into main memory at any point in time (and maybe caching those rows in Redis) is very efficient.
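To make the "fetch one page, maybe cache it" idea concrete, here is a rough sketch. It uses a small in-process dictionary with a time-to-live as a stand-in for Redis, so nothing here is a real Redis API. The `orders` table, the `PageCache` class, and `get_page` are all hypothetical names for this example.

```python
import sqlite3
import time


class PageCache:
    """Tiny in-process TTL cache; a stand-in for Redis in this sketch."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # entry is stale, drop it
            return None
        return value

    def set(self, key, value):
        self._store[key] = (self._clock() + self._ttl, value)


def make_demo_db(n_rows=30):
    """Build an in-memory table of hypothetical orders for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
    conn.executemany(
        "INSERT INTO orders (id, total) VALUES (?, ?)",
        [(i, i * 1.5) for i in range(1, n_rows + 1)],
    )
    return conn


def get_page(conn, cache, page, page_size):
    """Serve a page from the cache, falling back to LIMIT/OFFSET on a miss."""
    key = ("orders", page, page_size)
    rows = cache.get(key)
    if rows is None:
        offset = (page - 1) * page_size
        rows = conn.execute(
            "SELECT id, total FROM orders ORDER BY id LIMIT ? OFFSET ?",
            (page_size, offset),
        ).fetchall()
        cache.set(key, rows)
    return rows


if __name__ == "__main__":
    conn, cache = make_demo_db(), PageCache(ttl_seconds=30.0)
    print(get_page(conn, cache, page=1, page_size=10))  # hits the database
    print(get_page(conn, cache, page=1, page_size=10))  # served from cache
```

Only the pages that were actually requested end up in the cache, and the short TTL limits how stale a page can get for data that changes a lot.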

With 30,000 rows in a table, if offset/limit is the only viable or appropriate restriction, then that’s sometimes the way to go.

More often, there’s a much better way of restricting 30,000 rows via some search criteria that significantly reduces the displayed volume of rows — ideally to a single page or a few pages (which are appropriate to cache in Redis.)

It’s unlikely (though it does happen) that users really want to casually browse 30,000 rows, page by page. More often, they want this one record, or these small number of records.

Question: This is a general question that applies to MySQL, Oracle DB or whatever else might be out there.

I know for MySQL there is LIMIT offset,size; and for Oracle there is ‘ROW_NUMBER’ or something like that.

But when such ‘paginated’ queries are called back to back, does the database engine actually do the entire ‘select’ all over again and then retrieve a different subset of results each time? Or does it fetch the results only once, keep them in memory or something, and then serve subsets of results from that for subsequent queries based on offset and size?

If it does the full fetch every time, then it seems quite inefficient.

If it does full fetch only once, it must be ‘storing’ the query somewhere somehow, so that the next time that query comes in, it knows that it has already fetched all the data and just needs to extract next page from it. In that case, how will the database engine handle multiple threads? Two threads executing the same query?

Answer: First of all, do not assume in advance that something will be quick or slow without taking measurements, and do not complicate the code up front to download 12 pages at once and cache them because “it seems to me that it will be faster”.

YAGNI principle – the programmer should not add functionality until it is deemed necessary.
Do it in the simplest way (ordinary pagination, one page at a time) and measure how it works in production. If it is slow, try a different method; if the speed is satisfactory, leave it as it is.


From my own practice – an application retrieves data from a table containing about 80,000 records. The main table is joined with 4-5 additional lookup tables, and the whole query is paginated at about 25-30 records per page, about 2,500-3,000 pages in total. The database is Oracle 12c, there are indexes on a few columns, and queries are generated by Hibernate. Server-side measurements on the production system show that the median (50th percentile) time to retrieve one page is about 300 ms. The 95th percentile is less than 800 ms, meaning that 95% of single-page requests complete in less than 800 ms. When we add the transfer time from the server to the user and a rendering time of about 0.5-1 seconds, the total time is less than 2 seconds. That’s enough; users are happy.
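If you want to take the same kind of measurements yourself, something along these lines would do as a starting point. It is only a sketch: `time_calls` and `summarize` are names I made up, and in a real system you would record timings from production requests rather than a loop.

```python
import statistics
import time


def time_calls(fn, repeats):
    """Call fn repeatedly and return each call's duration in milliseconds."""
    durations_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations_ms.append((time.perf_counter() - start) * 1000.0)
    return durations_ms


def summarize(durations_ms):
    """Return the median (p50) and 95th percentile of the given timings."""
    # quantiles(..., n=100) returns 99 cut points; index 49 is the 50th
    # percentile and index 94 is the 95th percentile.
    cuts = statistics.quantiles(durations_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94]}


if __name__ == "__main__":
    # Placeholder workload; replace with the real "fetch one page" call.
    timings = time_calls(lambda: sum(range(10_000)), repeats=200)
    print(summarize(timings))
```

Looking at the 95th percentile as well as the median matters because a good median can hide a slow tail that some users hit on every visit.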


And some theory – see this answer to learn the purpose of the pagination pattern.

Budget to start a web app built on the MEAN stack

I want to start a web app built on the MEAN stack (MongoDB, Express.js, Angular, and Node.js). How much would it cost me to host this site? What resources are there for hosting websites built on the MEAN stack?

I went through the same questions and concerns and I actually tried a couple of different cloud providers for similar environments and machines.

  1. At Digital Ocean, you can get a fully loaded machine to develop and host for $5 per month (512 MB RAM, 20 GB disk). You can even get a $10 credit by using this link of mine.[1] It is very easy to sign up and start. Just don’t use their web console to connect to your host; it is slow. I recommend using an SSH client to connect instead, which is very fast.
  2. GoDaddy will charge you around $8 per month for a similar MEAN stack host (512 MB RAM, 1-core processor, 20 GB disk).
  3. Azure uses Bitnami’s MEAN stack on a minimum DS1_V2 machine (1 core, 3.5 GB RAM), and your average cost will be $52 per month if you never shut down the machine. The setup is a little more complicated than Digital Ocean’s, but very doable. I also recommend SSH to connect to the server and develop.
  4. AWS also offers Bitnami’s MEAN stack on EC2 instances similar to the Azure DS1_V2 described above, at around $55 per month.
  5. Other suggestions

All those solutions will work fine and it all depends on your budget. If you are cheap like me and don’t have a big budget, go with Digital Ocean and start with $10 off with this code.

Basic Gotcha Linux Questions for IT DevOps and SysAdmin Interviews

Some IT DevOps, SysAdmin, and Developer positions require knowledge of the basic Linux operating system. Most of the time we know the answers but forget them when we don’t practice very often. This refresher will help you prepare for the Linux portion of your IT interview by answering some gotcha Linux questions for IT DevOps and SysAdmin interviews.

I- Networking:

  1. How many bytes are there in a MAC address?
    6 bytes (48 bits).
    A MAC (Media Access Control) address is a globally unique identifier assigned to network devices, and is therefore often referred to as a hardware or physical address. MAC addresses are 6 bytes (48 bits) in length and are written in MM:MM:MM:SS:SS:SS format, where the first three bytes identify the manufacturer and the last three identify the device.
  2. What are the different parts of a TCP packet?
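    The term TCP packet appears in both informal and formal usage, whereas in more precise terminology segment refers to the TCP protocol data unit (PDU), datagram to the IP PDU, and frame to the data link layer PDU: … A TCP segment consists of a segment header and a data section. The header carries the source and destination ports, sequence and acknowledgment numbers, data offset, control flags (such as SYN, ACK, FIN and RST), window size, checksum, urgent pointer, and any options.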
  3. Which command is used to initialize an interface, assign an IP address, etc.?
    ifconfig (interface configuration). The equivalent command on Windows is ipconfig. On modern Linux distributions, the ip command from iproute2 (for example, ip addr) replaces ifconfig.
    Other useful networking commands are: ping, traceroute, netstat, dig, nslookup, route, lsof
  4. What’s the difference between TCP and UDP? Between DNS over TCP and DNS over UDP?
    TCP (Transmission Control Protocol) and UDP (User Datagram Protocol) are the two main transport protocols that run over the Internet Protocol (IP). TCP is connection-oriented: once a connection is established, data can be sent in both directions, with delivery and ordering guaranteed. UDP is a simpler, connectionless protocol with no delivery guarantees.
    DNS uses UDP port 53 for ordinary queries.
    DNS uses TCP port 53 for zone transfers.
    DNS queries can also fall back to TCP port 53 when UDP port 53 is not accepted or the response is too large for UDP.

  5. What are the default ports used by http, telnet, ftp, smtp, dns, snmp, dhcp, ssh, and squid?
    All of these services belong to the application layer of the TCP/IP protocol suite.
    http => 80
    telnet => 23
    ftp => 20 (data transfer), 21 (control connection)
    smtp => 25
    dns => 53
    snmp => 161
    dhcp => 67 (server), 68 (Client)
    ssh => 22
    squid => 3128
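  6. How many hosts are available in a subnet (Class B and Class C networks)?
    A subnet with h host bits has 2^h - 2 usable host addresses, because the network and broadcast addresses are reserved. A Class C network (/24) has 2^8 - 2 = 254 hosts; a Class B network (/16) has 2^16 - 2 = 65,534 hosts.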
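  7. How does DNS work?
    When you enter a URL into your web browser, the operating system’s resolver asks its configured DNS server to resolve the host name. That server answers from its cache or queries the root, top-level-domain, and authoritative name servers in turn, and returns the IP address of the appropriate web server.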
  8. What is the difference between class A, class B and class C IP addresses?
    Class A networks (/8 prefixes)
    These networks use an 8-bit network prefix. IP addresses range from 0.0.0.0 to 127.255.255.255.
    Class B networks (/16 prefixes)
    These networks use a 16-bit network prefix. IP addresses range from 128.0.0.0 to 191.255.255.255.
    Class C networks (/24 prefixes)
    These networks use a 24-bit network prefix. IP addresses range from 192.0.0.0 to 223.255.255.255.
  9. What is the difference between OSPF and BGP?
    Generally speaking, OSPF and BGP are routing protocols for two different things. OSPF is an IGP (Interior Gateway Protocol) and is used internally within a company’s network to provide routing. BGP is an exterior gateway protocol used to exchange routes between separate networks (autonomous systems), and it scales to Internet-sized routing tables, which a normal IGP like OSPF is not designed to handle.
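For the subnet and address-class questions above, the arithmetic can be checked with Python's standard ipaddress module. This is only a sketch: `usable_hosts` and `address_class` are helper names made up for this example, and the class boundaries are the classful ranges from the list above.

```python
import ipaddress


def usable_hosts(cidr):
    """Return the number of usable host addresses in an IPv4 subnet."""
    net = ipaddress.ip_network(cidr)
    host_bits = net.max_prefixlen - net.prefixlen
    if host_bits < 2:
        # /31 and /32 are special cases that the 2^h - 2 rule does not cover.
        raise ValueError("this sketch only handles prefixes up to /30")
    # Subtract the network address and the broadcast address.
    return 2 ** host_bits - 2


def address_class(address):
    """Return the classful class (A, B or C) of an IPv4 address."""
    first_octet = int(ipaddress.IPv4Address(address)) >> 24
    if first_octet < 128:
        return "A"
    if first_octet < 192:
        return "B"
    if first_octet < 224:
        return "C"
    return "D/E"  # multicast and reserved ranges


if __name__ == "__main__":
    print(usable_hosts("192.168.1.0/24"))  # a Class C sized network
    print(usable_hosts("172.16.0.0/16"))   # a Class B sized network
    print(address_class("10.1.2.3"))
```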

II- Operating System

  1. How to find the operating system version?
    $ uname -a
    To check the distribution, for Red Hat for example: $ cat /etc/redhat-release
  2. How to list all running processes?
    top (or ps -ef for a one-time snapshot)
    To list Java processes: ps -ef | grep java
    To list processes on a specific port:
    netstat -tulpn | grep :port_number
    lsof -i:80
    (On Windows, the equivalent is netstat -aon | findstr :port_number.)
  3. How to check disk space?
    df shows the amount of disk space used and available on each mounted file system.
    du displays the amount of disk space used by the specified files and by each subdirectory.
    To drill down and find out which file is filling up a drive: du -ks /drive_name/* | sort -nr | head
  4. How to check memory usage?
    free or cat /proc/meminfo
  5. What is the load average?
    It is the average number of processes that are either running or waiting in the run queue (on Linux, processes in uninterruptible sleep are also counted), measured over the last 1, 5 and 15 minutes. Use top or uptime to find the load average.
  6. What is a load balancer?
    A load balancer is a device that acts as a reverse proxy and distributes network or application traffic across a number of servers. Load balancers are used to increase capacity (concurrent users) and reliability of applications.
  7. What is the Linux Kernel?
    The Linux kernel is the low-level system software whose main role is to manage hardware resources on behalf of user programs. It also provides the interface (system calls) through which user-level programs interact with the hardware.
  8. What is the default kill signal?
    There are many different signals that can be sent (see man 7 signal for a full list), although the signals in which users are generally most interested are SIGTERM (“terminate”) and SIGKILL (“kill”). The default signal sent is SIGTERM, so the following commands are all equivalent:
    kill 1234
    kill -s TERM 1234
    kill -TERM 1234
    kill -15 1234
  9. Describe Linux boot process
    BIOS => MBR => GRUB => KERNEL => INIT => RUN LEVEL
    As power comes up, the BIOS (Basic Input/Output System) is given control and executes MBR (Master Boot Record). The MBR executes GRUB (Grand Unified Boot Loader). GRUB executes Kernel. Kernel executes /sbin/init. Init executes run level programs. Run level programs are executed from /etc/rc.d/rc*.d
    Mac OS X Boot Process:
    • Boot ROM: firmware, part of the hardware system. The BootROM firmware is activated.
    • POST (Power-On Self Test): initializes some hardware interfaces and verifies that sufficient memory is available and in a good state.
    • EFI (Extensible Firmware Interface): does basic hardware initialization and selects which operating system to use.
    • BOOTX / boot.efi: the boot loader. It loads the kernel environment.
    • Rooting/Kernel: the boot loader starts the kernel’s initialization procedure and the init routine of the kernel is executed. Various Mach/BSD data structures are initialized by the kernel, the I/O Kit is initialized, and the kernel starts /sbin/mach_init.
    • Run Level: mach_init starts /sbin/init. init determines the runlevel and runs /etc/rc.boot, which sets up the machine enough to run single-user. rc.boot figures out the type of boot (Multi-User, Safe, CD-ROM, Network etc.).
  10. List services enabled at a particular run level
    chkconfig --list | grep 5:on
    Enable or disable a service at a specific run level: chkconfig --level 5 service_name on|off
  11. How do you stop a bash fork bomb?
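    Limit the damage a fork bomb can do by capping the number of processes per user in /etc/security/limits.conf:
    root hard nproc 512
    A bash fork bomb looks like this (do not run it on a machine you care about):
    :(){ :|:& };:
    Assuming you still have access to a shell, first freeze the offending processes so that they cannot spawn replacements, then kill them:
    kill -STOP pid (a single process)
    killall -STOP -u user1
    killall -KILL -u user1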
  12. What is a fork?
    fork is an operation whereby a process creates a copy of itself. It is usually a system call, implemented in the kernel. Fork is the primary (and historically, only) method of process creation on Unix-like operating systems.
  13. What is the D state?
    The D state code means that the process is in uninterruptible sleep. That can mean different things, but usually the process is waiting on I/O.
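
The answer to item 8 (the default kill signal) can be checked from a shell. The sketch below assumes a POSIX-style shell such as bash or dash and the standard sleep utility; the function name default_kill_status is hypothetical. It starts a throwaway background process, sends it kill with no signal argument, and prints the exit status reported by wait. These shells report a process terminated by a signal as 128 plus the signal number, so SIGTERM (15) shows up as 143.

```shell
#!/bin/sh
# Hypothetical helper: show which signal a plain `kill PID` delivers.
default_kill_status() {
    sleep 60 &                 # throwaway background process
    pid=$!
    kill "$pid"                # no signal named, so the default (SIGTERM) is sent
    wait "$pid" 2>/dev/null    # reap the process; status is 128 + signal number
    echo "$?"
}

default_kill_status            # prints 143 (128 + 15) in bash and dash
```

Replacing kill "$pid" with kill -KILL "$pid" in the same sketch would report 137 (128 + 9) instead.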

III- File System

  1. What is umask?
    umask is “User File Creation Mask”, which determines the settings of a mask that controls which file permissions are set for files and directories when they are created.
  2. What is the role of the swap space?
    A swap space is a certain amount of space used by Linux to temporarily hold some programs that are running concurrently. This happens when RAM does not have enough memory to hold all programs that are executing.
  3. What is the null device in Linux?
    The null device is typically used for disposing of unwanted output streams of a process, or as a convenient empty file for input streams. This is usually done by redirection. The /dev/null device is a special file, not a directory, so one cannot move a whole file or directory into it with the Unix mv command.
    You might receive the “Bad file descriptor” error message if /dev/null has been deleted or overwritten. You can infer this cause when the file system is reported as read-only at boot time through error messages such as “/dev/null: Read-only filesystem” and “dup2: bad file descriptor”.
    In Unix and related computer operating systems, a file descriptor (FD, less frequently fildes) is an abstract indicator (handle) used to access a file or other input/output resource, such as a pipe or network socket.
  4. What is an inode?
    The inode is a data structure in a Unix-style file system that describes a filesystem object such as a file or a directory. Each inode stores the attributes and disk block location(s) of the object’s data.
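
To make the umask answer in item 1 concrete, here is a minimal sketch, assuming a POSIX shell with mktemp available; the function name umask_demo and the mask value 027 are only illustrative. New files are requested with mode 666 and new directories with mode 777, and the umask bits are cleared from those values.

```shell
#!/bin/sh
# Hypothetical demo: show how umask 027 shapes the permissions of new files.
umask_demo() {
    dir=$(mktemp -d) || return 1
    (
        umask 027              # applies only inside this subshell
        touch "$dir/file"      # 666 with 027 masked out -> 640
        mkdir "$dir/subdir"    # 777 with 027 masked out -> 750
    )
    ls -ld "$dir/file"   | cut -c1-10   # permission string only
    ls -ld "$dir/subdir" | cut -c1-10
    rm -rf "$dir"
}

umask_demo
```

On a typical system this prints -rw-r----- for the file and drwxr-x--- for the directory.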

IV- Databases

  1. What is the difference between a document store and a relational database?
    In a relational database system you must define a schema before adding records to a database. The schema is the structure described in a formal language supported by the database and provides a blueprint for the tables in a database and the relationships between tables of data. Within a table, you need to define constraints in terms of rows and named columns as well as the type of data that can be stored in each column.
    In contrast, a document-oriented database contains documents, which are records that describe the data in the document, as well as the actual data. Documents can be as complex as you choose; you can use nested data to provide additional sub-categories of information about your object. You can also use one or more documents to represent a real-world object.
  2. How to optimise a slow DB?
    • Rewrite the queries
    • Change indexing strategy
    • Change schema
    • Use an external cache
    • Server tuning and beyond
  3. How would you build a 1 Petabyte storage with commodity hardware?
    Use JBODs with large-capacity disks, running Linux, as the nodes of a distributed storage system, and keep stacking nodes until 1 PB is reached.
    JBOD (which stands for “just a bunch of disks”) generally refers to a collection of hard disks that have not been configured to act as a redundant array of independent disks (RAID) array.
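
How many nodes “stacking until 1 PB is reached” takes is simple arithmetic. The sketch below is a rough sizing helper, not a design: the function name nodes_needed, the 16 TB disk size, the 60 disks per node and the three copies of the data are all assumptions chosen for illustration and are not taken from the answer above.

```shell
#!/bin/sh
# Hypothetical sizing helper: how many JBOD nodes are needed for a target capacity?
# Arguments: target_tb disk_tb disks_per_node replicas
nodes_needed() {
    target_tb=$1
    disk_tb=$2
    disks_per_node=$3
    replicas=$4
    raw_tb=$((target_tb * replicas))               # raw capacity including copies
    node_tb=$((disk_tb * disks_per_node))          # raw capacity of one node
    echo $(( (raw_tb + node_tb - 1) / node_tb ))   # integer ceiling division
}

# 1 PB (1000 TB) usable, 16 TB disks, 60 disks per node, 3 copies of the data
nodes_needed 1000 16 60 3    # prints 4
```

With these numbers the cluster needs 3000 TB of raw disk and each node contributes 960 TB, so three nodes fall short and four are required.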

V- Scripting


  1. What is @INC in Perl?
    @INC is a special Perl array that is the equivalent of the shell’s PATH variable. Whereas PATH contains a list of directories to search for executables, @INC contains a list of directories from which Perl modules and libraries can be loaded.
  2. Strings comparison – operator – for loop – if statement
  3. Sort access log file by http Response Codes
    Via the shell, using Linux commands:
    cut -d '"' -f3 sample_log.log | cut -d ' ' -f2 | sort | uniq -c | sort -rn
  4. Sort access log file by http Response Codes Using awk
    awk '{print $9}' sample_log.log | sort | uniq -c | sort -rn
  5. Find broken links from access log file
    awk '($9 ~ /404/) {print $7}' sample_log.log | sort | uniq -c | sort -rn
  6. Most requested page:
    awk -F\" '{print $2}' sample_log.log | awk '{print $2}' | sort | uniq -c | sort -rn | head -1
  7. Count all occurrences of a word in a file
    grep -o "user" sample_log.log | wc -l
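
The one-liners in items 4 and 5 can be wrapped into small reusable functions. This is a sketch under the assumption that the log is in Combined Log Format, where field 7 is the request path and field 9 is the HTTP status code; the function names status_counts and broken_links and the sample log lines are made up for illustration. The trailing awk only normalizes the padding that uniq -c adds.

```shell
#!/bin/sh
# Hypothetical helpers for an access log in Combined Log Format
# (field 7 = request path, field 9 = HTTP status code).

# Count requests per status code, most frequent first: "<count> <status>".
status_counts() {
    awk '{print $9}' "$1" | sort | uniq -c | sort -rn | awk '{print $1, $2}'
}

# Count 404 responses per requested path: "<count> <path>".
broken_links() {
    awk '$9 == 404 {print $7}' "$1" | sort | uniq -c | sort -rn | awk '{print $1, $2}'
}

# Demo on a small made-up log file.
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326
10.0.0.2 - - [10/Oct/2023:13:55:37 +0000] "GET /missing.html HTTP/1.1" 404 153
10.0.0.3 - - [10/Oct/2023:13:55:38 +0000] "GET /index.html HTTP/1.1" 200 2326
EOF
status_counts "$log"
broken_links "$log"
rm -f "$log"
```

Comparing with $9 == 404 rather than $9 ~ /404/ avoids matching a status field that merely contains those digits.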

Learn more at http://career.guru99.com/top-50-linux-interview-questions/


Install and run your first noSQL MongoDB on Mac OSX


Classified as a NoSQL database, MongoDB is an open source, document-oriented database designed with both scalability and developer agility in mind. Instead of storing your data in tables and rows as you would with a relational database, in MongoDB you store JSON-like documents with dynamic schemas. This makes the integration of data in certain types of applications easier and faster.
Why?
MongoDB can help you make a difference to the business. Tens of thousands of organizations, from startups to the largest companies and government agencies, choose MongoDB because it lets them build applications that weren’t possible before. With MongoDB, these organizations move faster than they could with relational databases at one tenth of the cost. With MongoDB, you can do things you could never do before.

    1. Install Homebrew
      $ /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
      Homebrew installs the stuff you need that Apple didn’t.
      $ brew install wget
    2. Install MongoDB
      $ brew install mongodb
    3. Run MongoDB
      Create the data directory: $ mkdir -p /data/db
      Set permissions for the data directory: $ chown -R you:yourgroup /data/db, then $ chmod -R 775 /data/db
      Run MongoDB (as non-root): $ mongod
    4. Begin using MongoDB
      MongoDB is running as soon as mongod has started in the step above. Open another terminal and run: $ mongo
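
Creating /data/db at the filesystem root usually requires root privileges. A minimal sketch of an alternative, assuming you are free to keep the data somewhere you already own: create the directory yourself and point mongod at it with its --dbpath option. The function name prepare_dbpath and the path $HOME/mongodb-data are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: create a data directory with the permissions used in step 3.
prepare_dbpath() {
    mkdir -p "$1" && chmod 775 "$1"
}

# Usage (not executed here; the path is only an example):
#   prepare_dbpath "$HOME/mongodb-data"
#   mongod --dbpath "$HOME/mongodb-data"
```

Because the directory is created by your own user, no chown step is needed before starting the server.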


References: https://docs.mongodb.com/manual/tutorial/install-mongodb-on-os-x/

