How does a database handle pagination?


It doesn’t. First, a database is a collection of related data, so I assume you mean DBMS or database language.

Second, pagination is generally a function of the front-end and/or middleware, not the database layer.

But some database languages provide facilities that aid in implementing pagination. For example, many SQL dialects provide LIMIT and OFFSET clauses that can be used to emit up to n rows starting at a given row number, i.e., a “page” of rows. If the query results are sorted via ORDER BY and are generally unchanged between successive invocations, that can be used to implement pagination.
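As a concrete sketch of that idea, here is LIMIT/OFFSET paging run through SQLite from Python; the users table and its contents are invented purely for illustration.

```python
import sqlite3

# In-memory database with some sample rows (table and names are illustrative)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)",
                 [(f"user{i:03d}",) for i in range(1, 101)])

PAGE_SIZE = 10

def fetch_page(page_number):
    """Return one 'page' of rows; ORDER BY keeps pages stable between calls."""
    offset = (page_number - 1) * PAGE_SIZE
    return conn.execute(
        "SELECT id, name FROM users ORDER BY id LIMIT ? OFFSET ?",
        (PAGE_SIZE, offset),
    ).fetchall()

print(fetch_page(3)[0])   # first row of page 3 -> (21, 'user021')
```

Without the ORDER BY, the database is free to return rows in any order, and successive “pages” could overlap or skip rows.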

That may not be the most efficient or effective implementation, though.

So how do you propose pagination should be done?

In the context of web apps, let’s say there are 100 million users. One cannot dump all the users in a single response.

Cache database query results in the middleware layer using Redis or similar and serve out pages of rows from that.

What if you have 30,000-plus rows? Do you fetch all of that from the database and cache it in Redis?

I feel the most efficient solution is still offset and limit. It doesn’t make sense to use a database and then end up putting all of your data in Redis, especially data that changes a lot. Redis is not meant for storing all of your data.

If you have a large data set, you should use offset and limit; fetching only what is needed from the database into main memory (and perhaps caching those pages in Redis) at any point in time is very efficient.
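A minimal cache-aside sketch of that approach, with a plain dict standing in for Redis (real code would use a client such as redis-py and set a TTL on each key); the table and its contents are illustrative.

```python
import sqlite3

# Stand-in for Redis: a plain dict keyed by page number.
# In production this would be a Redis client with an expiry (TTL) per key.
page_cache = {}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)",
                 [(f"user{i}",) for i in range(1, 51)])

PAGE_SIZE = 10

def get_page(page_number):
    """Cache-aside: serve from the cache if present, else query and cache."""
    if page_number in page_cache:
        return page_cache[page_number]
    offset = (page_number - 1) * PAGE_SIZE
    rows = conn.execute(
        "SELECT id, name FROM users ORDER BY id LIMIT ? OFFSET ?",
        (PAGE_SIZE, offset),
    ).fetchall()
    page_cache[page_number] = rows
    return rows

get_page(1)   # misses the cache, hits the database
get_page(1)   # served from the cache
```

Only the pages actually requested end up cached, so the cache stays proportional to what users view, not to the table size.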

With 30,000 rows in a table, if offset/limit is the only viable or appropriate restriction, then that’s sometimes the way to go.

More often, there’s a much better way of restricting 30,000 rows via some search criteria that significantly reduces the displayed volume of rows, ideally to a single page or a few pages (which are appropriate to cache in Redis).

It’s unlikely (though it does happen) that users really want to casually browse 30,000 rows, page by page. More often, they want this one record, or this small set of records.
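On the efficiency point raised earlier: one technique the thread alludes to but never names is keyset (seek) pagination. Instead of skipping OFFSET rows, the query remembers the last key it returned, so the engine can seek straight into the index. A sketch in SQLite, with an invented users table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)",
                 [(f"user{i}",) for i in range(1, 101)])

PAGE_SIZE = 10

def next_page(last_seen_id=0):
    """Keyset pagination: WHERE id > last seen key instead of OFFSET.
    Cost does not grow with the page number the way a large OFFSET does,
    because the engine never has to scan and discard the skipped rows."""
    return conn.execute(
        "SELECT id, name FROM users WHERE id > ? ORDER BY id LIMIT ?",
        (last_seen_id, PAGE_SIZE),
    ).fetchall()

page1 = next_page()                 # ids 1..10
page2 = next_page(page1[-1][0])     # ids 11..20
```

The trade-off is that keyset pagination only supports “next page” style navigation over a stable sort key; jumping directly to page 500 still needs OFFSET or a precomputed key.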



Question: This is a general question that applies to MySQL, Oracle DB or whatever else might be out there.

I know for MySQL there is LIMIT offset,size; and for Oracle there is ‘ROW_NUMBER’ or something like that.



But when such ‘paginated’ queries are called back to back, does the database engine actually do the entire ‘select’ all over again and then retrieve a different subset of results each time? Or does it do the overall fetching of results only once, keeps the results in memory or something, and then serves subsets of results from it for subsequent queries based on offset and size?

If it does the full fetch every time, then it seems quite inefficient.

If it does full fetch only once, it must be ‘storing’ the query somewhere somehow, so that the next time that query comes in, it knows that it has already fetched all the data and just needs to extract next page from it. In that case, how will the database engine handle multiple threads? Two threads executing the same query?


Answer: First of all, do not assume in advance whether something will be quick or slow without taking measurements, and do not complicate the code up front by downloading 12 pages at once and caching them because “it seems to me that it will be faster”.

YAGNI principle – the programmer should not add functionality until it is deemed necessary.
Do it in the simplest way (ordinary pagination, one page at a time) and measure how it works in production; if it is slow, try a different method, and if the speed is satisfactory, leave it as it is.


From my own practice: an application retrieves data from a table containing about 80,000 records; the main table is joined with 4-5 additional lookup tables, and the whole query is paginated at about 25-30 records per page, roughly 2,500-3,000 pages in total. The database is Oracle 12c, there are indexes on a few columns, and the queries are generated by Hibernate. Measurements on the production server show that the average time (median, i.e. the 50th percentile) to retrieve one page is about 300 ms. The 95th percentile is under 800 ms, meaning 95% of requests for a single page take less than 800 ms; when we add the transfer time from the server to the user and a rendering time of about 0.5-1 seconds, the total time is less than 2 seconds. That’s enough; users are happy.
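In the same “measure first” spirit, here is a small sketch of collecting page-fetch timings and computing the median and 95th percentile, the two statistics quoted above. The table here is synthetic and the numbers will differ from any real system.

```python
import sqlite3
import statistics
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
conn.executemany("INSERT INTO t (v) VALUES (?)", [("x",)] * 10000)

def timed_page_ms(page, size=25):
    """Fetch one page and return the elapsed wall-clock time in milliseconds."""
    start = time.perf_counter()
    conn.execute("SELECT id, v FROM t ORDER BY id LIMIT ? OFFSET ?",
                 (size, page * size)).fetchall()
    return (time.perf_counter() - start) * 1000

samples = [timed_page_ms(p) for p in range(200)]
median_ms = statistics.median(samples)
p95_ms = statistics.quantiles(samples, n=20)[-1]  # 95th percentile
```

Percentiles matter more than the mean here: a handful of slow outliers can hide behind a healthy-looking average.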


And some theory: see this answer on the purpose of the Pagination pattern.


Budget to start a web app built on the MEAN stack


I want to start a web app built on the MEAN stack (MongoDB, Express.js, Angular, and Node.js). How much would it cost me to host this site? What resources are there for hosting websites built on the MEAN stack?

I went through the same questions and concerns and I actually tried a couple of different cloud providers for similar environments and machines.


  1. At DigitalOcean, you can get a fully loaded machine to develop and host on for $5 per month (512 MB RAM, 20 GB disk). You can even get a $10 credit by using this link of mine.[1] It is very easy to sign up and start. Just don’t use their web console to connect to your host; it is slow. I recommend using an SSH client to connect, and it is very fast.
  2. GoDaddy will charge you around $8 per month for a similar MEAN stack host (512 MB RAM, 1-core processor, 20 GB disk) for your MEAN stack development.
  3. On Azure, use Bitnami’s MEAN stack on a minimum DS1_v2 machine (1 core, 3.5 GB RAM); your average cost will be about $52 per month if you never shut down the machine. The setup is a little more complicated than DigitalOcean, but very doable. I also recommend SSH to connect to the server and develop.
  4. AWS also offers Bitnami’s MEAN stack on EC2 instances similar to the Azure DS1_v2 described above, at around $55 per month.
  5. Other suggestions

All those solutions will work fine and it all depends on your budget. If you are cheap like me and don’t have a big budget, go with Digital Ocean and start with $10 off with this code.

Basic Gotcha Linux Questions for IT DevOps and SysAdmin Interviews

Some IT DevOps, SysAdmin, and Developer positions require knowledge of the basics of the Linux operating system. Most of the time we know the answers but forget them when we don't practice often. This refresher will help you prepare for the Linux portion of your IT interview by answering some gotcha Linux questions for IT DevOps and SysAdmin interviews.

Get a $10 credit to have your own Linux server for your MEAN stack development and more. It is only $5 per month for a fully loaded Ubuntu machine.

I- Networking:

  1. How many bytes are there in a MAC address?
    6 bytes (48 bits).
    A MAC (Media Access Control) address is a globally unique identifier assigned to network devices, and is therefore often referred to as a hardware or physical address. MAC addresses are 6 bytes (48 bits) in length and are written in MM:MM:MM:SS:SS:SS format.
  2. What are the different parts of a TCP packet?
    The term TCP packet appears in both informal and formal usage; in more precise terminology, segment refers to the TCP protocol data unit (PDU), datagram to the IP PDU, and frame to the data-link-layer PDU. … A TCP segment consists of a segment header and a data section.
  3. Which command is used to initialize an interface, assign an IP address, etc.?
    ifconfig (interface configuration); on modern Linux systems, the ip command replaces it. The equivalent command on Windows is ipconfig.
    Other useful networking commands are: ping, traceroute, netstat, dig, nslookup, route, lsof
  4. What’s the difference between TCP and UDP? Between DNS over TCP and UDP?
    There are two types of Internet Protocol (IP) traffic: TCP (Transmission Control Protocol) and UDP (User Datagram Protocol). TCP is connection-oriented: once a connection is established, data can be sent bidirectionally. UDP is a simpler, connectionless Internet protocol.
    DNS uses UDP for DNS queries over port 53.
    DNS uses TCP for zone transfers over port 53.
    DNS queries can also fall back to TCP port 53 if UDP port 53 is not accepted.

  5. What are the default ports used by http, telnet, ftp, smtp, dns, dhcp, snmp, ssh, and squid?
    All those services are part of the Application level of the TCP/IP protocol.
    http => 80
    telnet => 23
    ftp => 21 (control connection), 20 (data transfer)
    smtp => 25
    dns => 53
    snmp => 161
    dhcp => 67 (server), 68 (Client)
    ssh => 22
    squid => 3128
  6. How many hosts are available in a subnet (Class B and C networks)?
    Usable hosts = 2^(host bits) - 2, since the network and broadcast addresses are reserved. A Class B (/16) network has 2^16 - 2 = 65,534 hosts; a Class C (/24) network has 2^8 - 2 = 254 hosts.
  7. How does DNS work?
    When you enter a URL into your web browser, your DNS server resolves the host name into the IP address of the appropriate web server.
  8. What is the difference between Class A, Class B and Class C IP addresses?
    Class A networks (/8 prefixes)
    8-bit network prefix; IP addresses range from 0.0.0.0 to 127.255.255.255.
    Class B networks (/16 prefixes)
    16-bit network prefix; IP addresses range from 128.0.0.0 to 191.255.255.255.
    Class C networks (/24 prefixes)
    24-bit network prefix; IP addresses range from 192.0.0.0 to 223.255.255.255.
  9. What is the difference between OSPF and BGP?
    Generally speaking, OSPF and BGP are routing protocols for two different things. OSPF is an IGP (Interior Gateway Protocol) and is used internally, within a company's network, to provide routing. BGP is an EGP (Exterior Gateway Protocol) used to route between autonomous systems, and it is far more scalable than OSPF; a normal IGP like OSPF cannot route at that scale.
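The host counts behind questions 6 and 8 follow directly from the prefix length: a /N prefix leaves 32 - N host bits, and two addresses (network and broadcast) are reserved. A quick shell sketch:

```shell
# Usable hosts in a subnet: 2^(32 - prefix) - 2
# (the all-zeros network address and all-ones broadcast are reserved).
for prefix in 16 24; do
  hosts=$(( (1 << (32 - prefix)) - 2 ))
  echo "/$prefix -> $hosts usable hosts"
done
```

This prints 65534 usable hosts for a /16 (Class B) and 254 for a /24 (Class C).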

II- Operating System

  1. How to find the Operating System version?
    $ uname -a
    To check the distribution, on Red Hat for example: $ cat /etc/redhat-release
  2. How to list all running processes?
    top
    To list Java processes: ps -ef | grep java
    To list processes listening on a specific port:
    netstat -tulpn | grep :port_number (on Windows: netstat -aon | findstr :port_number)
    lsof -i:80
  3. How to check disk space?
    df shows the amount of disk space used and available.
    du displays the amount of disk space used by the specified files and by each subdirectory.
    To drill down and find out which file is filling up a drive: du -ks /drive_name/* | sort -nr | head
  4. How to check memory usage?
    free or cat /proc/meminfo
  5. What is the load average?
    It is the average of the number of processes waiting in the run queue plus the number of processes currently executing, over periods of 1, 5 and 15 minutes. Use top or uptime to find the load average.
  6. What is a load balancer?
    A load balancer is a device that acts as a reverse proxy and distributes network or application traffic across a number of servers. Load balancers are used to increase capacity (concurrent users) and reliability of applications.
  7. What is the Linux Kernel?
    The Linux Kernel is a low-level systems software whose main role is to manage hardware resources for the user. It is also used to provide an interface for user-level interaction.
  8. What is the default kill signal?
    There are many different signals that can be sent (see signal for a full list), although the signals in which users are generally most interested are SIGTERM (“terminate”) and SIGKILL (“kill”). The default signal sent is SIGTERM.
    kill 1234
    kill -s TERM 1234
    kill -TERM 1234
    kill -15 1234
  9. Describe Linux boot process
    BIOS => MBR => GRUB => KERNEL => INIT => RUN LEVEL
    As power comes up, the BIOS (Basic Input/Output System) is given control and executes MBR (Master Boot Record). The MBR executes GRUB (Grand Unified Boot Loader). GRUB executes Kernel. Kernel executes /sbin/init. Init executes run level programs. Run level programs are executed from /etc/rc.d/rc*.d
    Mac OS X Boot Process:

    Boot ROM Firmware. Part of Hardware system
    BootROM firmware is activated
    POST Power-On Self Test
    initializes some hardware interfaces and verifies that sufficient memory is available and in a good state.
    EFI Extensible Firmware Interface
    EFI does basic hardware initialization and selects which operating system to use.
    BOOTX boot.efi boot loader
    load the kernel environment
    Rooting/Kernel The init routine of the kernel is executed
    boot loader starts the kernel’s initialization procedure
    Various Mach/BSD data structures are initialized by the kernel.
    The I/O Kit is initialized.
    The kernel starts /sbin/mach_init
    Run Level mach_init starts /sbin/init
    init determines the runlevel, and runs /etc/rc.boot, which sets up the machine enough to run single-user.
    rc.boot figures out the type of boot (Multi-User, Safe, CD-ROM, Network etc.)
  10. List services enabled at a particular run level
    chkconfig --list | grep '5:on'
    Enable or disable a service at a specific run level: chkconfig --level 5 <service> on|off
  11. How do you stop a bash fork bomb?
    First, limit the damage a fork bomb can do by capping the number of processes per user in /etc/security/limits.conf:
    user1 hard nproc 512
    A classic fork bomb looks like this (do not run it):
    :(){ :|:& };:
    Assuming you still have access to a shell, first STOP the user's processes so they cannot respawn, then KILL them:
    killall -STOP -u user1
    killall -KILL -u user1
  12. What is a fork?
    fork is an operation whereby a process creates a copy of itself. It is usually a system call, implemented in the kernel. Fork is the primary (and historically, only) method of process creation on Unix-like operating systems.
  13. What is the D state?
    D state code means that process is in uninterruptible sleep, and that may mean different things but it is usually I/O.
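To see the default kill signal from question 8 in action: a process terminated by SIGTERM exits with status 128 + 15 = 143, which the shell reports after wait. A small sketch:

```shell
# Start a long-running background job, terminate it with the default
# SIGTERM (signal 15), and inspect its exit status: 128 + 15 = 143.
sleep 60 &
pid=$!
kill -TERM "$pid"       # equivalent to: kill $pid / kill -15 $pid
wait "$pid"
echo "exit status: $?"  # prints: exit status: 143
```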

III- File System

  1. What is umask?
    umask is “User File Creation Mask”, which determines the settings of a mask that controls which file permissions are set for files and directories when they are created.
  2. What is the role of the swap space?
    A swap space is a certain amount of space used by Linux to temporarily hold some programs that are running concurrently. This happens when RAM does not have enough memory to hold all programs that are executing.
  • What is the null device in Linux?
    The null device is typically used for disposing of unwanted output streams of a process, or as a convenient empty file for input streams. This is usually done by redirection. The /dev/null device is a special file, not a directory, so one cannot move a whole file or directory into it with the Unix mv command.
    You might receive the "Bad file descriptor" error message if /dev/null has been deleted or overwritten. You can infer this cause when the file system is reported as read-only at boot time through error messages such as "/dev/null: Read-only filesystem" and "dup2: bad file descriptor".
    In Unix and related operating systems, a file descriptor (FD, less frequently fildes) is an abstract indicator (handle) used to access a file or other input/output resource, such as a pipe or network socket.
  • What is an inode?
    The inode is a data structure in a Unix-style file system that describes a filesystem object such as a file or a directory. Each inode stores the attributes and disk block location(s) of the object’s data.
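A quick way to see inodes at work is with hard links: two links to the same file share a single inode, so they report the same inode number. A small sketch, assuming GNU coreutils (Linux):

```shell
# Create a file and a hard link to it, then compare their inode numbers.
tmp=$(mktemp -d)
touch "$tmp/a"
ln "$tmp/a" "$tmp/b"            # hard link: new directory entry, same inode
ino_a=$(stat -c '%i' "$tmp/a")  # GNU stat; on macOS use: stat -f '%i'
ino_b=$(stat -c '%i' "$tmp/b")
[ "$ino_a" = "$ino_b" ] && echo "same inode: $ino_a"
rm -r "$tmp"
```

ls -i shows the same numbers; deleting one link leaves the inode (and the data) reachable through the other.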

IV- Databases

  1. What is the difference between a document store and a relational database?
    In a relational database system you must define a schema before adding records to the database. The schema is the structure, described in a formal language supported by the database, that provides a blueprint for the tables in the database and the relationships between them. Within a table, you define constraints in terms of rows and named columns, as well as the type of data that can be stored in each column.
    In contrast, a document-oriented database contains documents, which are records that describe the data in the document as well as the actual data. Documents can be as complex as you choose; you can use nested data to provide additional sub-categories of information about your object. You can also use one or more documents to represent a real-world object.
  2. How to optimise a slow DB?
    • Rewrite the queries
    • Change indexing strategy
    • Change schema
    • Use an external cache
    • Server tuning and beyond
  3. How would you build a 1 Petabyte storage with commodity hardware?
    Use JBODs of large-capacity disks running Linux in a distributed storage system, stacking nodes until 1 PB is reached.
    JBOD (which stands for "just a bunch of disks") generally refers to a collection of hard disks that have not been configured as a redundant array of independent disks (RAID).
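As a back-of-envelope sketch of the petabyte question (all numbers below are hypothetical: 3x replication and 16 TB commodity drives), the raw disk count is simple shell arithmetic:

```shell
# Raw disks needed for 1 PB usable with 3x replication on 16 TB drives,
# rounded up: ceil(1000 TB * 3 / 16 TB) = 188 disks.
usable_tb=1000   # 1 PB of usable capacity
replication=3    # copies kept of each block
disk_tb=16       # per-drive capacity
disks=$(( (usable_tb * replication + disk_tb - 1) / disk_tb ))
echo "$disks disks"
```

In practice you would also budget spare drives and headroom, but the same arithmetic drives the node count.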

V- Scripting

  1. What is @INC in Perl?
    The @INC Array. @INC is a special Perl variable that is the equivalent to the shell’s PATH variable. Whereas PATH contains a list of directories to search for executables, @INC contains a list of directories from which Perl modules and libraries can be loaded.
  2. String comparison, operators, for loops, if statements
  3. Sort an access log file by HTTP response codes
    Via shell, using Linux commands:
    cat sample_log.log | cut -d '"' -f3 | cut -d ' ' -f2 | sort | uniq -c | sort -rn
  4. Sort an access log file by HTTP response codes using awk
    awk '{print $9}' sample_log.log | sort | uniq -c | sort -rn
  5. Find broken links from an access log file
    awk '($9 ~ /404/) {print $7}' sample_log.log | sort | uniq -c | sort -rn
  6. Most requested page:
    awk -F'"' '{print $2}' sample_log.log | awk '{print $2}' | sort | uniq -c | sort -r
  7. Count all occurrences of a word in a file
    grep -o "user" sample_log.log | wc -l
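The awk pipelines above are easy to sanity-check against a tiny fabricated log. The entries below are made up, and sample_log.log is assumed to be in Apache combined format, where the status code is field $9 and the request path field $7:

```shell
# Build a three-line fake access log, then count HTTP status codes
# with the same awk pipeline used above.
log=$(mktemp)
printf '%s\n' \
  '1.2.3.4 - - [19/Sep/2022:10:00:00 +0000] "GET / HTTP/1.1" 200 512' \
  '1.2.3.4 - - [19/Sep/2022:10:00:01 +0000] "GET /missing HTTP/1.1" 404 0' \
  '5.6.7.8 - - [19/Sep/2022:10:00:02 +0000] "GET / HTTP/1.1" 200 512' \
  > "$log"
awk '{print $9}' "$log" | sort | uniq -c | sort -rn   # two 200s, one 404
rm "$log"
```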

Learn more at http://career.guru99.com/top-50-linux-interview-questions/

Install and run your first noSQL MongoDB on Mac OSX

Classified as a NoSQL database, MongoDB is an open source, document-oriented database designed with both scalability and developer agility in mind. Instead of storing your data in tables and rows as you would with a relational database, in MongoDB you store JSON-like documents with dynamic schemas; this makes the integration of data in certain types of applications easier and faster.
Why?
MongoDB can help you make a difference to the business. Tens of thousands of organizations, from startups to the largest companies and government agencies, choose MongoDB because it lets them build applications that weren’t possible before. With MongoDB, these organizations move faster than they could with relational databases at one tenth of the cost. With MongoDB, you can do things you could never do before.

  1. Install Homebrew
    $ /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
    Homebrew installs the stuff you need that Apple didn’t.
    $ brew install wget
  2. Install MongoDB
    $ brew install mongodb
  3. Run MongoDB
    Create the data directory: $ mkdir -p /data/db
    Set permissions for the data directory: $ chown -R you:yourgroup /data/db, then chmod -R 775 /data/db
    Run MongoDB (as non-root): $ mongod

  4. Begin using MongoDB (MongoDB will be running as soon as you have run mongod above).

    Open another terminal and run: mongo

References: https://docs.mongodb.com/manual/tutorial/install-mongodb-on-os-x/

