Linux & App Servers Monitoring Tricks




  • Prevent servers from going down.
  • Detect what caused a server to go down.
  • Get servers back up after a failure.
  • Explain what caused the server to fail.

I – Regularly watch your monitoring tool(s) (Nagios, Wily, top, …)

Some of these tools graph the application's CPU load average, CPU usage percentage, disk usage percentage, memory usage percentage, network bandwidth, and swap usage percentage.

II – Check ulimit count (number of files opened by applications like Tomcat, Oracle, …)

[Server]# ulimit -n
If the number of files opened is getting close to the ulimit count (for example, 1024), increase the ulimit and talk to the developers to identify and fix the process that is causing it.
To increase the ulimit count for a specific application account, run the command ulimit -n [value]
The ulimit can be set to whatever you want. It is a throttle put in place to keep a process from consuming too many resources. Some systems simply set it to unlimited.
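
Note that ulimit -n reports the limit, not the number of files currently open. The following is a minimal sketch of how the two could be compared for one process, assuming a Linux /proc filesystem and a numeric (not "unlimited") limit. The function names are illustrative, not part of any standard tool.

```shell
#!/bin/sh
# Sketch: compare a process's open file descriptors with its limit.
# Assumes a Linux /proc filesystem; function names are illustrative.

# Count the descriptors currently open by PID
# (needs permission to read /proc/PID/fd).
open_fd_count() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Read the soft "Max open files" limit of PID from /proc/PID/limits.
fd_soft_limit() {
    awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}

# Integer percentage of the limit in use: fd_usage_pct OPEN LIMIT
fd_usage_pct() {
    echo $(( $1 * 100 / $2 ))
}
```

For a given [pid], something like fd_usage_pct "$(open_fd_count [pid])" "$(fd_soft_limit [pid])" would print how much of the limit is in use, so you can alert well before the process hits it.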

III- Port Monitoring
Check the number of connections to the ports used by your apps.
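
One way to do this is to count the established TCP connections on a given local port. The sketch below assumes the input is the output of ss -tan or netstat -tan piped on stdin. The function name and the port number in the usage note are illustrative.

```shell
#!/bin/sh
# Sketch: count established TCP connections on a local port.
# Reads `ss -tan` or `netstat -tan` style output on stdin.
# The function name is illustrative.

# count_established PORT
count_established() {
    awk -v port="$1" '
        # ss prints the state in column 1 (ESTAB);
        # netstat prints it in column 6 (ESTABLISHED).
        # Both print the local address:port in column 4.
        ($1 == "ESTAB" || $6 == "ESTABLISHED") {
            n = split($4, parts, ":")   # the port follows the last colon
            if (parts[n] == port) count++
        }
        END { print count + 0 }
    '
}
```

For example, ss -tan | count_established 8080 would print the number of established connections to port 8080, assuming that is the port your app listens on.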

IV- Thread dump (stack trace of all threads) if you have a high CPU percentage

[Server]# kill -3 [pid]
Run this against the Java process to see what is causing the high CPU usage. The output is printed in catalina.out; send it to the developers.
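
To narrow the dump down to the busy thread, a common approach is to list per-thread CPU usage with top -H -p [pid] and then look for that thread in the dump. HotSpot thread dumps label each thread with its native ID in hexadecimal (nid=0x...), while top shows it in decimal, so a small conversion helps. The sketch below assumes a HotSpot-style dump; the function name is illustrative.

```shell
#!/bin/sh
# Sketch: convert a decimal thread ID (as shown by `top -H`) into the
# hexadecimal nid=0x... form used in HotSpot thread dumps.
# The function name is illustrative.
tid_to_nid() {
    printf 'nid=0x%x\n' "$1"
}
```

For a thread ID of 6699 taken from top, grep -w "$(tid_to_nid 6699)" catalina.out would then locate the matching stack trace, if the dump was written to that file.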

V- Disk space: /[drive_name] filling up quickly
Identify the file(s) that are filling up the disk. Most of the time, it will be log files.
[Server]# du -ks /[drive_name]/* | sort -nr | head
5719076 /[drive_name]/catalina
3675672 /[drive_name]/data
3287436 /[drive_name]/source
2044316 /[drive_name]/servers
319404 /[drive_name]/images
16 /[drive_name]/lost+found
Running this command again on the largest folder will lead you to the files that are eating the disk space.
Back up, remove, or empty the file in question, provided that doing so won't break the system.
If log files are responsible for the disk filling up, let the developers know so that they can fix it. In the meantime, empty the log file with the command:

[Log_File_Location]# echo -n > Large_Log_File_Name.log

VI- Watch catalina.out and log4j.out after staging and live deploy, especially when you are restarting the servers.



[Server]# tail -f log4j.log
VII- Start app servers properly
Before restarting app servers, make sure no process for that specific server is still running.


[Server_Name]$ ps -ef | grep oracle
Kill any leftover pid for that server before starting it.

IX – CPU load level

I would say that if we peak under 70% CPU during high traffic, we are doing well and have room to spare. A good level to be ticking over at would be about 30% used.
[Server]# top
top - 12:37:29 up 47 days, 23:09, 4 users, load average: 0.20, 0.20, 0.22
Tasks: 189 total, 1 running, 178 sleeping, 10 stopped, 0 zombie
Cpu(s): 1.2%us, 0.1%sy, 0.0%ni, 97.5%id, 1.0%wa, 0.0%hi, 0.1%si, 0.0%st


X- Server-specific status pings (to ensure the servers are up and serving content)
Write scripts for this.
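
Such a script could look like the minimal sketch below, which assumes curl is installed and treats any 2xx or 3xx HTTP status as "up". The function names and the URL in the usage note are illustrative, and the 10 second timeout is an arbitrary choice.

```shell
#!/bin/sh
# Sketch: status ping for an app server.
# Assumes curl is available; function names are illustrative.

# Map an HTTP status code to UP (2xx or 3xx) or DOWN (anything else).
status_label() {
    case "$1" in
        2[0-9][0-9]|3[0-9][0-9]) echo UP ;;
        *) echo DOWN ;;
    esac
}

# Fetch URL with a 10 second timeout and print only the HTTP status code.
# curl prints 000 when it cannot connect at all.
http_status() {
    curl -s -o /dev/null -m 10 -w '%{http_code}' "$1"
}

# Print "UP <url>" or "DOWN <url>".
ping_server() {
    echo "$(status_label "$(http_status "$1")") $1"
}
```

Calling ping_server http://[Server_Name]:8080/ from a cron job, and alerting on any DOWN line, would cover the basic case. Checking for a known string in the response body is a natural next step if a 200 status alone is not enough proof that content is being served.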

XI- Garbage collection stats

If you are interested in garbage collection stats, there are gc.log files on each of the app servers. The drawback is that they carry no date stamps, so you can see how memory fluctuates but it is difficult to chart it over time. In the past I've thought it might be a good idea to write a cron job that archives the log daily, so that you could at least break things down day by day.
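
A daily archive job along those lines could be sketched as follows. It copies gc.log to a dated file and then empties the original. This assumes the JVM keeps logging correctly after the file is truncated, which should be verified for your JVM and logging flags. The function name and the archive file naming are illustrative.

```shell
#!/bin/sh
# Sketch: archive a gc.log under a dated name, then empty the original.
# Assumption: the JVM tolerates the log being truncated while it is open.
# The function name and archive naming scheme are illustrative.

# archive_gc_log LOG_FILE ARCHIVE_DIR
archive_gc_log() {
    log_file=$1
    archive_dir=$2
    stamp=$(date +%Y-%m-%d)
    mkdir -p "$archive_dir" || return 1
    # Copy first, and only truncate if the copy succeeded.
    cp "$log_file" "$archive_dir/gc-$stamp.log" && : > "$log_file"
}
```

A script that calls this function could be scheduled from cron shortly before midnight, so that each archived file holds roughly one day of garbage collection output.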

XII- DB Connection

XIII- Load Average Monitoring script
Set up a cron job that emails the sysadmin when the load average is above 3.
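
A minimal sketch of such a script is shown below. It assumes Linux (/proc/loadavg), a working mail command on the server, and uses the 1-minute load average. The function names are illustrative, and the recipient address is whatever you pass in.

```shell
#!/bin/sh
# Sketch: email the sysadmin when the 1-minute load average is too high.
# Assumes Linux (/proc/loadavg) and a working `mail` command.
# Function names are illustrative.

# Print the 1-minute load average.
current_load() {
    cut -d ' ' -f 1 /proc/loadavg
}

# Print "yes" when LOAD is above THRESHOLD, otherwise "no".
# awk is used because the shell cannot compare decimals itself.
load_is_high() {
    awk -v load="$1" -v max="$2" \
        'BEGIN { if (load + 0 > max + 0) print "yes"; else print "no" }'
}

# check_load THRESHOLD ADDRESS
check_load() {
    load=$(current_load)
    if [ "$(load_is_high "$load" "$1")" = "yes" ]; then
        echo "Load average on $(hostname) is $load" |
            mail -s "High load average" "$2"
    fi
}
```

A script that calls check_load 3 with the sysadmin's address could then be run from cron every few minutes. On multi-core servers a threshold of 3 may be too low or too high, so it is worth tuning it to the number of CPUs.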

XIV – Find out who is monopolizing or eating the CPUs
[Server]# ps -eo pcpu,pid,user,args | sort -k 1 -nr | head -10
