Device does not come back after deleting it from Network Flow Analysis (NFA)

Network Traffic Analysis is the process of recording and analysing network traffic.

Broadcom Network Flow Analysis (NFA) is one of the best flow analysers available in the market.

Let's look into one of the common issues in NFA and its resolution.

Issue: Device is not coming back in NFA after deleting it

Environment: v9.3.3 or earlier

Cause: This can occur when orphaned records for the router or its interfaces remain in one of the NFA databases.

Resolution:

  1. Log in to the NFA console and navigate to Administration > Enable Interfaces; make sure the device is not listed there.
  2. Log in to the harvester server to which the device is sending flow.
  3. Open the command prompt and run the commands below.

(Use select commands to verify the information before deleting it)

a. mysql harvester

b. select * from routers where router = inet_aton('x.x.x.x');

(Note the router ID from the result and use it while running the commands against the poller database.)

c. delete from routers where router = inet_aton('x.x.x.x');

d. select * from interfaces where router = inet_aton('x.x.x.x');

e. delete from interfaces where router = inet_aton('x.x.x.x');

f. use poller


g. select * from persistent_map where routerid=xx;

h. delete from persistent_map where routerid=xx;

i. select * from interfaces_snmp where routerid=xx;

j. delete from interfaces_snmp where routerid=xx;

k. select * from routers where address='x.x.x.x';

l. delete from routers where address='x.x.x.x';

Commands to back up the routers and interfaces tables

1. Open command prompt
2. mysqldump -P3308 harvester routers > routers.sql
3. mysqldump -P3308 harvester interfaces > interfaces.sql
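If you find yourself repeating this cleanup, the same steps can be scripted. Below is a minimal Python sketch, assuming the mysql-connector-python package and the harvester database on port 3308 (as in the backup commands above); the host and credentials are placeholders you would need to adjust. Verify the SELECT output manually before uncommenting the DELETEs.

import mysql.connector

ROUTER_IP = 'x.x.x.x'  # the deleted device's address

# host/user/password are placeholders, not NFA defaults
conn = mysql.connector.connect(host='localhost', port=3308,
                               user='root', password='***',
                               database='harvester')
cur = conn.cursor()

# Verify the orphaned records first
cur.execute("SELECT * FROM routers WHERE router = INET_ATON(%s)", (ROUTER_IP,))
print(cur.fetchall())
cur.execute("SELECT * FROM interfaces WHERE router = INET_ATON(%s)", (ROUTER_IP,))
print(cur.fetchall())

# Then remove them
# cur.execute("DELETE FROM routers WHERE router = INET_ATON(%s)", (ROUTER_IP,))
# cur.execute("DELETE FROM interfaces WHERE router = INET_ATON(%s)", (ROUTER_IP,))
# conn.commit()

conn.close()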

That’s it in this post, more troubleshooting steps in the coming posts. Please feel free to add your queries/feedback in the comments section below.

Splunk Part II – Installation

In our Splunk Overview post, we discussed the basics of the product, and we also had Q&A posts on the same. In this post we will be covering how to install the software in your setup, which will get you started with your hands-on experience.

Let’s see how to get it done..

Splunk installation:

We have gone through a brief introduction to Splunk in the previous blog. Now let's go ahead and download Splunk.

In the previous blog we learned that there are multiple components in Splunk. Do we need to download them one by one? The answer is no.

We need to download only two packages:

  • Splunk Enterprise
  • Splunk Universal Forwarder

You can visit www.splunk.com to download Splunk. Click on “Free Splunk” and register yourself. Once successfully registered, you will be able to download Splunk.


To download the Universal Forwarder, you can use the link below or search Google for “Splunk Universal Forwarder”.

https://www.splunk.com/en_us/download/universal-forwarder.html

If you want to download the packages directly to a Unix server, you can use the wget command given on the Splunk download page.

You can install Splunk on Linux using the rpm command:

# rpm -ivh <Splunk.rpm>

Now you have installed one instance of Splunk. Next, you have to configure this instance as one of the Splunk components as per your architecture. We will come up with another blog on component configuration.

Hope you enjoyed reading this post. Please share your thoughts in the comments section.

Splunk Interview Questions and Answers – Part II

In our previous post, we covered a few questions and answers on Splunk. This is the second post in that series, where we add a few more important technical details. Without much of an intro, let's get directly into the details..

  • What are Buckets? Explain Splunk Bucket Lifecycle.

Buckets are directories that store the indexed data in Splunk. So, a bucket is a physical directory that chronicles the events of a specific period. A bucket undergoes several stages of transformation over time. They are:

  1. Hot – A hot bucket comprises the newly indexed data, and hence, it is open for writing and new additions. An index can have one or more hot buckets.
  2. Warm – A warm bucket contains the data that is rolled out from a hot bucket.
  3. Cold – A cold bucket has data that is rolled out from a warm bucket.
  4. Frozen – A frozen bucket contains the data rolled out from a cold bucket. The Splunk Indexer deletes the frozen data by default. However, there’s an option to archive it. An important thing to remember here is that frozen data is not searchable.


  • Define Sourcetype in Splunk.

In Splunk, Sourcetype refers to the default field that is used to identify the data structure of an incoming event. Sourcetype should be set at the forwarder level for indexer extraction to help identify different data formats. It determines how Splunk Enterprise formats the data during the indexing process.


  • Explain the difference between Stats and Eventstats commands.

In Splunk, the Stats command is used to generate summary statistics of all the existing fields in the search results and save them as values in newly created fields. Although the Eventstats command is pretty similar to the Stats command, it adds the aggregation results inline to each event (only where the aggregation is pertinent to that particular event). So, while both commands compute the requested statistics, the Eventstats command aggregates the statistics into the original raw data.
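If you are familiar with pandas, a rough analogy may help (this is only an illustration in Python, not Splunk code): Stats behaves like a groupby aggregation that collapses rows into a summary, while Eventstats behaves like a transform that writes the aggregate back onto every original row.

import pandas as pd

df = pd.DataFrame({'host': ['web1', 'web1', 'web2'],
                   'bytes': [10, 20, 30]})

# "stats"-like: collapse rows into one summary value per host
summary = df.groupby('host')['bytes'].mean()

# "eventstats"-like: keep every original row, append the aggregate inline
df['avg_bytes'] = df.groupby('host')['bytes'].transform('mean')

print(summary)
print(df)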

  • Differentiate between Splunk App and Add-on.

Splunk Apps refer to the complete collection of reports, dashboards, alerts, field extractions, and lookups. However, Splunk Add-ons only contain built-in configurations – they do not have dashboards or reports.

  • What is the command to stop and start Splunk service?

The command to start Splunk service is: ./splunk start

The command to stop Splunk service is: ./splunk stop

  • How can you clear the Splunk search history?

To clear the Splunk search history, you need to delete the following file from Splunk server:

$splunk_home/var/log/splunk/searches.log


  • What is a Fishbucket and what is the Index for it?

Fishbucket is an index directory resting at the default location, that is:

/opt/splunk/var/lib/splunk

Fishbucket includes seek pointers and CRCs for the indexed files. To access the Fishbucket, you can use the GUI for searching:

index=_thefishbucket

  • What is the Dispatch Directory?

The Dispatch Directory includes a directory for individual searches that are either running or have completed. The configuration for the Dispatch Directory is as follows:

$SPLUNK_HOME/var/run/splunk/dispatch

  • How does Splunk avoid duplicate indexing of logs?

The Splunk Indexer keeps track of all the indexed events in a directory – the fishbucket directory – that contains seek pointers and CRCs for all the files presently being indexed. So, if there is a seek pointer or CRC that has already been read, splunkd will point it out.
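As an illustration of the idea only (not Splunk's actual implementation), a file monitor can fingerprint the head of each file and remember how far it has read:

import os
import zlib

seen = {}  # CRC of the file head -> seek pointer (bytes already indexed)

def resume_offset(path, head=256):
    """Return the offset to resume reading from, skipping already-indexed data."""
    with open(path, 'rb') as f:
        crc = zlib.crc32(f.read(head))
    offset = seen.get(crc, 0)          # 0 means the file has never been seen
    seen[crc] = os.path.getsize(path)  # everything up to EOF is now "indexed"
    return offset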

That’s it in this post. We will soon be adding more Q&A in the next post in this series. Stay tuned.

Please share your feedback/queries in the comments section below.

Python Virtual Environments – Why and How?

You might have heard about Virtual Environments in Python already but may not be using them yet. Or you may be aware of the advantages, but not sure how to set one up. In this post, we are trying to help you connect the dots.

Why do you need Virtual Environments? How do you use them? Let's see..

Let’s take an example of 2 Python projects where you have the below requirements.

Project 1: Python version 2.7 and NumPy module version 1.17

Project 2: Python version 3.6 and NumPy module version 1.22

How are you going to achieve this? And these are just 2 projects; of course, you'll be working on many projects, each with its own requirements. Virtual Environments are the way..!


Think of Virtual Environments as individual sections in a wardrobe. You keep your dresses in one partition, your spouse's in a separate partition and your kid's items in another. Your kid is a Python project and his/her dresses are the modules/packages required for the project. The partition in the wardrobe is the Virtual Environment, containing only the required packages without messing with other projects' files. This makes management easy, as you can have the required packages with their respective versions in each project's own Virtual Environment without worrying about the other projects.

Working with Virtual Environments

You’ll need to install virtualenv (pip install virtualenv) if you are using Python 2.x. venv will be pre-installed in Python 3.x. (Let’s consider Python 3.x for the remaining examples)

python -m venv NewEnv

This command will create a new directory named NewEnv with the subdirectories below..

Scripts – This directory contains all the scripts including activate, deactivate, pip etc…

Include – C headers for the Python packages

Lib – Contains the site-packages folder containing all the dependencies

Now you have created the environment. You have to activate the environment before using it. You'll find the activate (batch) script inside the Scripts folder.

C:\Users\bforum\test>NewEnv\Scripts\activate.bat
(NewEnv) C:\Users\bforum\test>


As you can see in the above example, we have run the activation script for NewEnv, and the prompt prefix on the next line indicates we are inside 'NewEnv'.
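You can also confirm this from Python itself. Inside an active environment, sys.prefix points into the environment's directory while sys.base_prefix still points to the base installation (standard library only, Python 3.x):

import sys

print(sys.prefix)        # ...\test\NewEnv while the environment is active
print(sys.base_prefix)   # the base Python installation
print(sys.prefix != sys.base_prefix)  # True inside a virtual environment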

When I do a pip list now, it lists only pip and setuptools in this environment.

(NewEnv) C:\Users\bforum\test> pip list

Package    Version
---------- -------
pip        19.2.3
setuptools 41.2.0

(NewEnv) C:\Users\bforum\test>

Now I can use pip install ModuleName==Version to install the specific version of modules in this environment. I just installed NumPy and Pandas (and their dependencies). Now the output looks like below.

(NewEnv) C:\Users\bforum\test> pip list

Package         Version
--------------- -------
numpy           1.22.0
pandas          1.4.1
pip             19.2.3
python-dateutil 2.8.2
pytz            2021.3
setuptools      41.2.0

(NewEnv) C:\Users\bforum\test>


You can deactivate the Virtual Environment to come out of it using the deactivate (batch) script.

(NewEnv) C:\Users\bforum\test>NewEnv\Scripts\deactivate.bat

C:\Users\bforum\test>

That’s it in this post, hope this helped you. Comments section is open for any feedback/questions.

Splunk Interview Questions and Answers – Part I

In one of our recent posts, we discussed Splunk. We recommend reading that overview post before going through this one.

Due to the high demand for this skill in the market, one of our readers requested a Q&A on the same. In this first part of the series, we are covering some of the important questions related to Splunk. More will be added soon in the next post.

Let’s get started…

  1. What is Splunk?

Splunk is a software platform that allows users to analyse machine-generated data (from hardware devices, networks, servers, IoT devices, etc.). Splunk is widely used for searching, visualising, monitoring, and reporting enterprise data. It processes and analyses machine data and converts it into powerful operational intelligence by offering real-time insights into the data through accurate visualisations.


  2. Name the components of Splunk architecture.

The Splunk architecture is made of the following components:


  • Search Head – It provides the GUI for searching
  • Indexer – It indexes the machine data
  • Forwarder – It forwards logs to the Indexer
  • Deployment Server – It manages the Splunk components in a distributed environment and distributes configuration apps.


  3. Name the common port numbers used by Splunk.

The common port numbers for Splunk are:

  • Web Port: 8000
  • Management Port: 8089
  • Network port: 514
  • Index Replication Port: 8080
  • Indexing Port: 9997
  • KV store: 8191


  4. What are the different types of Splunk dashboards?

There are three different kinds of Splunk dashboards:

  • Real-time dashboards
  • Dynamic form-based dashboards
  • Dashboards for scheduled reports

  5. Name the types of search modes supported in Splunk.

Splunk supports three types of search modes, namely:

  • Fast mode
  • Smart mode
  • Verbose mode


  6. Name the different kinds of Splunk Forwarders.

There are two types of Splunk Forwarders:

  • Universal Forwarder (UF) – It is a lightweight Splunk agent installed on a non-Splunk system to gather data locally. UF cannot parse or index data.
  • Heavyweight Forwarder (HWF) – It is a heavyweight Splunk agent with advanced functionalities, including parsing and indexing capabilities. It is used for filtering data.


  7. What is the use of License Master in Splunk?

The License Master in Splunk is responsible for making sure that the right amount of data gets indexed. The Splunk license is based on the volume of data that comes into the platform within a 24-hour window.


  8. What happens if the License Master is unreachable?

If the License Master is unreachable, it is simply not possible to search the data. However, the data coming in to the Indexer will not be affected: the data will continue to flow into your Splunk deployment and the Indexers will continue to index it as usual. You will, however, get a warning message on top of your Search Head or web UI.


  9. What is the purpose of Splunk DB Connect?

Splunk DB Connect is a generic SQL database plugin designed for Splunk. It enables users to integrate database information with Splunk queries and reports seamlessly.


  10. What are some of the most important configuration files in Splunk?

The most crucial configuration files in Splunk are:

  • props.conf
  • indexes.conf
  • inputs.conf
  • outputs.conf
  • transforms.conf

That’s it in this part. Please see the second part of the series for more questions. Our comments section is open for any questions/comments/feedback.

Getting familiar with Splunk – a brief introduction

Are you getting started with your journey towards Splunk? Or are you in the early stages of the Splunk learning path? If your answer is 'yes', this post is for you. We will be uncovering some of the very basics about Splunk in this post.

Splunk is a software platform used to search and analyse machine data. This machine data can come from web applications, sensors, devices or any data created by users. It serves the needs of IT infrastructure by analysing the logs generated in various processes, but it can also analyse any structured or semi-structured data with proper data modelling. Splunk captures, indexes, and correlates real-time data in a searchable container and produces graphs, alerts, dashboards and visualisations. It provides easy access to data across the whole organisation for quick diagnostics and solutions to various business problems.


Let’s dive into the details..

Product categories

Splunk is available in three different product categories, as follows:

Splunk Enterprise – It is used by companies which have a large IT infrastructure and an IT-driven business. It helps in gathering and analysing the data from websites, applications, devices, sensors, etc.


Splunk Cloud – It is the cloud-hosted platform with the same features as the Enterprise version. It can be availed from Splunk itself or through the AWS cloud platform.

Splunk Light – It allows searching, reporting and alerting on all the log data in real time from one place. It has limited functionality and features compared to the other two versions.

Features of Splunk 

Data Types: Splunk supports any format and any amount of data, enabling centralised log management.
Dashboards and Visualisations: Customised dashboards and data visualisations. Dashboards integrate charts, reports and re-usable panels to display a comprehensive view of the data.
Monitoring and Alerting: Continuous monitoring of events, conditions, and critical KPIs helps to have greater visibility into your operations.
Reporting: Reports can be created in real time and scheduled to run at any interval.
Apps and Add-ons: Splunkbase has 1000+ apps and add-ons from Splunk, partners and the community.

Components of Splunk


The primary components in the Splunk architecture are the forwarder, the indexer, and the search head.

Splunk Forwarder:
The forwarder is an agent you deploy on IT systems, which collects logs and sends them to the indexer. Splunk has two types of forwarders:

Universal Forwarder – forwards the raw data without any prior treatment. This is faster and requires fewer resources on the host, but results in huge quantities of data being sent to the indexer.
Heavy Forwarder – performs parsing and indexing at the source, on the host machine, and sends only the parsed events to the indexer.
Splunk Indexer
The indexer transforms the data into events, stores them to disk and adds them to an index.

Splunk Search Head
The search head provides the UI users can use to interact with Splunk. It allows users to search and query Splunk data.



Alternatives to Splunk

Sumo Logic: allows you to monitor and visualise historical and real-time events.
Loggly: helps you to collect data from the system using Syslog compatibility.
ELK Stack: allows users to take data from any source, in any format, and to search, analyse, and visualise it.

Hope this gave you a brief idea about the software and its functions. We'll be adding more related content soon. Please feel free to add your feedback/questions in the comments section.

Basic commands for Linux OS Performance Monitoring

Monitoring system performance regularly is very important to ensure that services are delivered to end customers without any latency. OS performance monitoring is one important layer of overall system performance, along with other layers including application performance, network performance, etc.

OS Performance monitoring tools are used for monitoring, visualising, storing, and analysing system-level performance measurements. It allows the monitoring and management of real-time data, and logging and retrieval of historical data.

Red Hat Enterprise Linux provides several tools that can be used from the command line to monitor system performance.

We are discussing here some of the built-in command line tools for system monitoring.

top


The top program provides a dynamic real-time view of a running system. It can display system summary information as well as a list of processes or threads currently being managed by the Linux kernel.

The top command helps the system administrator find the processes and users that utilise the most resources in the system.



top is provided by the procps-ng package.

ps

It is the abbreviation of “process status”. ps displays information about a selection of the active processes. The output of the ps command may vary depending on the parameters used with it.



ps is provided by the procps-ng package. It captures a snapshot of a select group of active processes. By default, the examined group is limited to processes that are owned by the current user and associated with the terminal where the ps command is executed.

vmstat

It is the abbreviation of virtual memory statistics. vmstat reports information about processes, memory, paging, block I/O, traps, disks and CPU activity.


Virtual memory statistics (vmstat) is provided by the procps-ng package. If we run vmstat with no parameters, it shows a report containing the averages for each of the statistics since the last reboot.

sar

It is the abbreviation of System Activity Reporter. It collects and reports information about system activity that has occurred so far in the current day.

sar is provided by the sysstat package. It can be used to monitor a Linux system's resources such as CPU usage, memory utilisation, I/O device consumption, network activity, disk usage, process and thread allocation and more.



The sar command shows only CPU activity if no flag is specified by the user. It displays results on the screen by default; in addition, the results can be stored in a file specified using the -o filename option.

netstat

It is the abbreviation of network statistics. netstat prints information about the Linux networking subsystem: network connections, routing tables, interface statistics, masquerade connections, and multicast memberships.

netstat is provided by the net-tools package. By default, netstat displays a list of open sockets. If you don't specify any address families, then the active sockets of all configured address families will be printed.

For example, netstat -r prints the routing table.

iostat

It is the abbreviation of input/output statistics. The iostat command is used for monitoring system input/output device loading by observing the time the devices are active in relation to their average transfer rates.



iostat is provided by the sysstat package. The iostat command generates reports that can be used to change the system configuration to better balance the input/output load between physical disks.
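If you prefer collecting similar metrics from a Python script rather than the shell, the third-party psutil package (an assumption here: installed via pip install psutil) exposes much of what top, vmstat, netstat and iostat report. A minimal sketch:

import psutil

print(psutil.cpu_percent(interval=1))  # overall CPU utilisation, like top
print(psutil.virtual_memory())         # memory statistics, like vmstat
print(psutil.disk_io_counters())       # disk I/O counters, like iostat
print(psutil.net_io_counters())        # network counters, like netstat -i

# A ps-like listing of processes
for proc in psutil.process_iter(['pid', 'name', 'username']):
    print(proc.info)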

Those are some of the most basic and important commands and their usage. Hope they help you monitor your systems effectively. We will discuss more performance-related topics in upcoming posts.

Cisco IT Blog Awards 2021 – Finalist..!

It's a great pleasure to announce that we have been selected as one of the finalists in the IT Blog Awards 2021, hosted by Cisco. Can't explain how it feels to be in the list among leading IT blogs, for the third time (2018, 2020 and now 2021).

We would like to Congratulate all the finalists and wish them the best in the competition.

There are 58 entries in the Blogs category and 17 entries in the Vlogs and Podcasts category. Unlike previous years, there are only these two categories this time (previously, awards were given across different content categories).

You can vote now for the best blogs and vlogs/podcasts, based on the value they are creating, the quality of contents etc… This is your opportunity to vote for the contents which always help you at work or in your studies.

You can select up to 5 blogs and 5 vlogs/podcasts and rank them 1-5. We would be happy if you include our site in your 5.

VOTE NOW

You can find more details/rules in the above voting link. Have a detailed look at the blogs/vlogs/podcasts, and vote NOW..!

Getting familiar with Boto3 – AWS SDK for Python

Let's discuss Boto3, the AWS Software Development Kit (SDK) for Python, in this post. Boto3 enables you to perform most of the tasks in almost all the AWS services via Python scripts. For more details on the available services and actions, I would recommend referring to the Boto3 documentation.

In this post we will have a brief introduction to the SDK including the installation and a few examples.

Installation

Yes, as you expected, it is the pip install command. pip install boto3 will take care of the installation, and if you want a specific version you can use the syntax below (where version is the required version, 1.20.30 for example).

pip install boto3==version

C:\Users\bforum>python -m pip install boto3
Collecting boto3
Downloading boto3-1.20.31-py3-none-any.whl (131 kB)
|████████████████████████████████| 131 kB 168 kB/s
Collecting botocore<1.24.0,>=1.23.31
Downloading botocore-1.23.31-py3-none-any.whl (8.5 MB)
|████████████████████████████████| 8.5 MB 726 kB/s


==== truncated output ====


Successfully installed boto3-1.20.31 botocore-1.23.31

Setting up Boto3 

Now that you have installed the module, you have to import it in your program. import boto3 would do it.


For accessing the AWS services, you have to provide the credentials for Boto3 to connect to the environment. You can have the shared credentials (key_id and secret) for AWS access updated in the .aws/credentials file. If you are using an EC2 instance, you can use IAM roles for accessing the AWS environment.

Not at all recommended, but here I have my credentials inside the script itself.

Basic S3 tasks

aws_secret_access_key = 'YOUR_ACCESS_KEY'
s3 = boto3.client(service_name='s3', aws_access_key_id='test_key', aws_secret_access_key=aws_secret_access_key)

You can replace the above line with just s3 = boto3.client('s3') if you have your credentials defined in the credentials file or handled by IAM.
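If you keep several named profiles in the .aws/credentials file, you can also pick one explicitly through a session ('dev' below is a hypothetical profile name):

import boto3

session = boto3.Session(profile_name='dev')  # 'dev' is a hypothetical profile
s3 = session.client('s3')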

You can now invoke the below command to create an S3 bucket.

s3.create_bucket(Bucket='bucket_name')

# where bucket_name is the desired bucket name; it should be unique and must meet the naming requirements (lowercase letters, numbers, periods and dashes only, 3-63 characters long, etc…)

You will get an HTTP 200 response with bucket name and creation details.
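Bucket names are easy to get wrong, so a quick client-side sanity check can help before calling the API. A rough sketch of the core rule as a regex (illustrative only; the official rules include more cases, such as no consecutive periods):

import re

# Covers the basic rule only: 3-63 characters, lowercase letters, numbers,
# periods and dashes, starting and ending with a letter or number.
BUCKET_RE = re.compile(r'^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$')
print(bool(BUCKET_RE.match('beginnersforum-bf1')))  # True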

Now let’s see how we can list S3 buckets.

s3 = boto3.client('s3')
s3.list_buckets()

——————- advertisements ——————-

———————————————————

The above list_buckets command will list the existing buckets, but the result will be an HTTP response in JSON format. To filter only the names, let's use the commands below (basically a for loop).

out = s3.list_buckets()
for bucket in out['Buckets']:
    print(bucket['Name'])

The output will be the bucket names as below

beginnersforum-bf1
beginnersforum-bf2
beginnersforum1crrs3

You can download a file from S3 using the below commands

bucket_name = 'beginnersforum-bf1'
keyname = 'Test Text file1.txt'
output_filename = 'test1.txt'
s3.download_file(bucket_name, keyname, output_filename)
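Uploading works the same way in reverse: upload_file(Filename, Bucket, Key) is the counterpart of download_file. A small sketch reusing the names above (the key 'uploads/test1.txt' is just an example):

from botocore.exceptions import ClientError

try:
    s3.upload_file(output_filename, bucket_name, 'uploads/test1.txt')
except ClientError as err:
    print('Upload failed:', err)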

A few EC2 examples

ec2 = boto3.client('ec2')  # creates an EC2 client which can be used in the upcoming commands

Start an instance: ec2.start_instances(InstanceIds=[instance_id])

Reboot an instance: ec2.reboot_instances(InstanceIds=[instance_id])

Shut down an instance: ec2.stop_instances(InstanceIds=[instance_id])
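If a script should continue only once the instance is actually up, Boto3 waiters can poll the state for you (a small sketch, assuming instance_id is set as above):

ec2.start_instances(InstanceIds=[instance_id])
waiter = ec2.get_waiter('instance_running')
waiter.wait(InstanceIds=[instance_id])  # blocks until the instance reaches 'running'
print(instance_id, 'is running')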

That was just a very basic intro to this Python module. Based on your need, you may refer to the specific service and action details in the SDK documentation. Hope this post helped you in understanding the module, please feel free to have your thoughts/feedback in the comments section.

Machine Learning – Ways to identify outliers in your data

Cleaning the data is one of the most important and most time-consuming processes in any Machine Learning problem. You have to have a clear understanding of the data and process it well to get accurate results.

There are so many things to be considered for processing the data. Dealing with outliers – finding outliers and cleaning them if required – is one of them.

Let’s see what outliers are, ways to identify them and how to remove them.

First things first, what are outliers?

“In statistics, an outlier is a data point that differs significantly from other observations. An outlier may be due to variability in the measurement or it may indicate experimental error; the latter are sometimes excluded from the data set. An outlier can cause serious problems in statistical analyses.”


Hope the above Wikipedia definition is clear enough for everyone. We have to review the data and identify the data points which are potential outliers. We then have to work with the domain experts to confirm whether those are real outliers and can be removed. Depending on the type of data and the problem being worked upon, sometimes we may have to keep the outliers as well.

Now, let’s look at different ways to find outliers.

IQR method

We can use the IQR rule to isolate and identify data points which appear to be outliers.

Any point more than 1.5 IQR above the 75th percentile (3rd quartile), or more than 1.5 IQR below the 25th percentile (1st quartile), is considered an outlier. The explanation below will make this easy to understand.

To apply the rule, calculate the quartiles (25th, 50th and 75th percentiles) of the data. IQR is the difference between the 3rd quartile (Q3) and the 1st quartile (Q1). Anything above the maximum point or below the minimum point is an outlier and may need to be removed, where the maximum point is 1.5 IQR above Q3 and the minimum point is 1.5 IQR below Q1.


Here is a small piece of Python code for the same.

import numpy as np

data = (2, 13, 14, 12, 18, 15, 16, 14, 25)
q1, q3 = np.percentile(data, 25), np.percentile(data, 75)
iqr = q3 - q1
lower = q1 - (1.5 * iqr)
upper = q3 + (1.5 * iqr)
outliers = [val for val in data if val < lower or val > upper]

print("Outliers for the datapoints {} are :\n {}".format(data, outliers))
The output should look like:

Outliers for the datapoints (2, 13, 14, 12, 18, 15, 16, 14, 25) are :
[2, 25]

Box plots give a good visualisation of the outliers.
Z-Score Method
This method is applicable to data that follows (at least approximately) a normal distribution, and it is based on the 68-95-99.7 rule in statistics. According to the 68-95-99.7 (or empirical) rule, 68% of the data points reside within 1 standard deviation from the mean of the distribution, 95% within 2 standard deviations, and 99.7% within 3 standard deviations. That means almost all the data points reside within 3 standard deviations.

The z-score is calculated with the equation below:

z = (X - mean) / standard deviation
A cutoff of 3 standard deviations is chosen most often in z-score calculations (as per the empirical rule), but it is a user choice.
You may see references to the Standard Deviation method as well, which is basically similar to the z-score method: there, we consider the data points outside the cutoff (usually 3 standard deviations) as outliers.
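Here is a small sketch of the z-score method on the same datapoints used in the IQR example. Note that with only nine points a cutoff of 3 flags nothing, so the cutoff is lowered to 2 purely for illustration:

import numpy as np

data = np.array([2, 13, 14, 12, 18, 15, 16, 14, 25])
z_scores = (data - data.mean()) / data.std()

threshold = 2  # 3 is the usual empirical-rule cutoff; lowered for this tiny sample
outliers = data[np.abs(z_scores) > threshold]
print(outliers)  # [2]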
Hope this post helps you. Please share your feedback/suggestions in the comments section below.
