In our Splunk Overview post, we discussed the basics of the product, and we also had a couple of Q&A posts on the same topic. In this post we will cover how to install the software in your setup, which will get you started with your hands-on experience.
Let’s see how to get it done..
Splunk installation:
We went through a brief introduction to Splunk in the previous blog. Now let's go ahead and download Splunk.
In the previous blog we learned that there are multiple components in Splunk. Do we need to download them one by one? The answer is NO.
We need to download only two packages:
Splunk Enterprise
Splunk Universal Forwarder
You can visit www.splunk.com to download Splunk. Click on "Free Splunk" and register yourself. Once successfully registered, you will be able to download Splunk.
To download the Universal Forwarder, use the download link on the same site or search Google for "Splunk Universal Forwarder".
If you want to download the packages directly to a Unix server, you can use the wget command given on the Splunk download page.
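For example, a hypothetical wget command might look like the lines below. The exact URL, version and package name change with every release, so always copy the command shown on the download page rather than typing this verbatim:

# Hypothetical example only: copy the real URL from the Splunk download page
wget -O splunk-<version>-linux-x86_64.rpm "https://download.splunk.com/products/splunk/releases/<version>/linux/splunk-<version>-linux-x86_64.rpm"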
You can install Splunk on Linux using the rpm command:
# rpm -ivh <Splunk.rpm>
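Once the package is installed, you can start Splunk for the first time. A minimal sketch, assuming the default /opt/splunk install path:

# Start Splunk and accept the license on first run
/opt/splunk/bin/splunk start --accept-license
# Optional: configure Splunk to start automatically at boot
/opt/splunk/bin/splunk enable boot-start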
You have now installed one instance of Splunk. Next, you have to configure this instance as one of the Splunk components, as per your architecture. We will follow up with another blog on component configuration.
Hope you enjoyed reading this post. Please share your thoughts in the comments section.
In our previous post, we covered a few questions and answers on Splunk. This is the second post in that series, where we will be adding a few more important technical details. Without much of an intro, let's get directly into the details..
What are Buckets? Explain Splunk Bucket Lifecycle.
Buckets are directories that store the indexed data in Splunk. So, a bucket is a physical directory that holds the events of a specific period. A bucket undergoes several stages of transformation over time. They are:
Hot – A hot bucket comprises the newly indexed data, and hence, it is open for writing and new additions. An index can have one or more hot buckets.
Warm – A warm bucket contains the data that is rolled out from a hot bucket.
Cold – A cold bucket has data that is rolled out from a warm bucket.
Frozen – A frozen bucket contains the data rolled out from a cold bucket. The Splunk Indexer deletes the frozen data by default. However, there’s an option to archive it. An important thing to remember here is that frozen data is not searchable.
What is Sourcetype in Splunk?
In Splunk, Sourcetype refers to the default field that is used to identify the data structure of an incoming event. Sourcetype should be set at the forwarder level for indexer extraction to help identify different data formats. It determines how Splunk Enterprise formats the data during the indexing process.
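As an illustration, a sourcetype is typically assigned in inputs.conf on the forwarder. The stanza below is only a sketch; the monitored path and the sourcetype name are made-up examples:

# inputs.conf on the forwarder (illustrative values)
[monitor:///var/log/myapp/app.log]
sourcetype = myapp:log
index = main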
Explain the difference between Stats and Eventstats commands.
In Splunk, the Stats command is used to generate summary statistics of all the existing fields in the search results and save them as values in newly created fields. The Eventstats command is pretty similar to the Stats command, but it adds the aggregation results inline to each event (and only if the aggregation is pertinent to that particular event). So, while both commands compute the requested statistics, the Eventstats command aggregates the statistics into the original raw data.
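A quick illustration, assuming web access events that carry a bytes field (the index and sourcetype names here are just examples):

index=web sourcetype=access_combined | stats avg(bytes) AS avg_bytes
index=web sourcetype=access_combined | eventstats avg(bytes) AS avg_bytes

The first search returns only the summary row, while the second returns every original event with an extra avg_bytes field attached to it.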
Differentiate between Splunk App and Add-on.
Splunk Apps refer to the complete collection of reports, dashboards, alerts, field extractions, and lookups. However, Splunk Add-ons only contain built-in configurations – they do not have dashboards or reports.
What is the command to stop and start Splunk service?
The command to start Splunk service is: ./splunk start
The command to stop Splunk service is: ./splunk stop
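Both commands are run from the Splunk bin directory. For example, assuming the default install path:

cd /opt/splunk/bin
./splunk start      # start splunkd
./splunk status     # check whether splunkd is running
./splunk stop       # stop splunkd
./splunk restart    # stop and start in one step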
How can you clear the Splunk search history?
To clear the Splunk search history, you need to delete the following file from the Splunk server:
$splunk_home/var/log/splunk/searches.log
What is a Fishbucket and what is the Index for it?
The Fishbucket is an index directory resting at the default location, which is:
/opt/splunk/var/lib/splunk
The Fishbucket includes seek pointers and CRCs for the indexed files. To access the Fishbucket, you can search in the GUI using:
index=_thefishbucket
What is the Dispatch Directory?
The Dispatch Directory contains one directory for each search that is either running or has completed. The default location of the Dispatch Directory is:
$SPLUNK_HOME/var/run/splunk/dispatch
How does Splunk avoid duplicate indexing of logs?
The Splunk Indexer keeps track of all the indexed events in a directory – the Fishbucket directory, which contains seek pointers and CRCs for all the files presently being indexed. So, if a seek pointer or CRC has already been read, splunkd recognises it and skips re-indexing that data.
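As a side note, if you ever need to force Splunk to re-index a file, you can reset that file's Fishbucket entry with the btprobe utility. This is a sketch assuming the default install and Fishbucket paths; the log file path is just an example:

# Reset the CRC/seek pointer for one file, then restart Splunk to re-index it
/opt/splunk/bin/splunk cmd btprobe -d /opt/splunk/var/lib/splunk/fishbucket/splunk_private_db --file /var/log/myapp/app.log --reset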
That’s it in this post. We will soon be adding more Q&A in the next post in this series. Stay tuned.
Please share your feedback/queries in the comments section below.
In one of our recent posts, we discussed Splunk. We would recommend you read the overview post before going through this one.
Due to the high demand for this skill in the market, we were requested by one of our readers to have a Q&A on the same. In this first part of the series, we are covering some of the important questions related to Splunk. More will be added soon in the next post.
Let’s get started…
What is Splunk?
Splunk is a software platform that allows users to analyse machine-generated data (from hardware devices, networks, servers, IoT devices, etc.). Splunk is widely used for searching, visualising, monitoring, and reporting enterprise data. It processes and analyses machine data and converts it into powerful operational intelligence by offering real-time insights into the data through accurate visualisations.
What are the components of the Splunk architecture?
The Splunk architecture is made up of the following components:
Search Head – It provides the GUI for searching
Indexer – It indexes the machine data
Forwarder – It forwards logs to the Indexer
Deployment server – It manages the Splunk components in a distributed environment and distributes configuration apps.
Name the common port numbers used by Splunk.
The common port numbers for Splunk are:
Web Port: 8000
Management Port: 8089
Network port: 514
Index Replication Port: 8080
Indexing Port: 9997
KV store: 8191
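Most of these ports can be changed in the configuration files. For example, the web port is controlled by web.conf; a minimal sketch, assuming a default install path (a restart is needed for the change to take effect):

# $SPLUNK_HOME/etc/system/local/web.conf
[settings]
httpport = 8000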
What are the different types of Splunk dashboards?
There are three different kinds of Splunk dashboards:
Real-time dashboards
Dynamic form-based dashboards
Dashboards for scheduled reports
Name the types of search modes supported in Splunk.
Splunk supports three search modes, namely:
Fast mode
Smart mode
Verbose mode
Name the different kinds of Splunk Forwarders.
There are two types of Splunk Forwarders:
Universal Forwarder (UF) – It is a lightweight Splunk agent installed on a non-Splunk system to gather data locally. UF cannot parse or index data.
Heavyweight Forwarder (HWF) – It is a heavyweight Splunk agent with advanced functionalities, including parsing and indexing capabilities. It is used for filtering data.
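As an example, a Universal Forwarder is normally pointed at an indexer through outputs.conf. The stanza below is a sketch; the group name and the indexer IP are placeholders:

# outputs.conf on the Universal Forwarder (placeholder values)
[tcpout]
defaultGroup = my_indexers

[tcpout:my_indexers]
server = 10.10.10.50:9997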
What is the use of License Master in Splunk?
The License Master in Splunk is responsible for making sure that the right amount of data gets indexed. Splunk licensing is based on the volume of data that comes into the platform within a 24-hour window.
What happens if the License Master is unreachable?
If the License Master is unreachable, it is simply not possible to search the data. However, the data coming in to the Indexer will not be affected: the data will continue to flow into your Splunk deployment and the Indexers will continue to index it as usual. You will, however, get a warning message on top of your Search Head or web UI.
What is the purpose of Splunk DB Connect?
Splunk DB Connect is a generic SQL database plugin designed for Splunk. It enables users to integrate database information with Splunk queries and reports seamlessly.
What are some of the most important configuration files in Splunk?
The most crucial configuration files in Splunk are:
props.conf
indexes.conf
inputs.conf
outputs.conf
transforms.conf
That’s it in this part. Please see the second part of the series for more questions. Our comments section is open for any questions/comments/feedback.
Are you getting started on your journey towards Splunk? Or are you in the early stages of the Splunk learning path? If your answer is 'yes', this post is for you. We will be uncovering some of the very basics of Splunk in this post.
Splunk is a software platform used to search and analyse machine data. This machine data can come from web applications, sensors, devices or any data created by users. Splunk serves the needs of IT infrastructure by analysing the logs generated in various processes, but it can also analyse any structured or semi-structured data with proper data modelling. It captures, indexes and correlates real-time data in a searchable container and produces graphs, alerts, dashboards and visualisations. Splunk makes data easily accessible across the whole organisation, enabling easy diagnostics and solutions to various business problems.
Splunk is available in three different product categories, as follows:
Splunk Enterprise – It is used by companies that have a large IT infrastructure and an IT-driven business. It helps in gathering and analysing data from websites, applications, devices, sensors, etc.
Splunk Cloud – It is the cloud-hosted platform with the same features as the Enterprise version. It can be availed from Splunk itself or through the AWS cloud platform.
Splunk Light – It allows searching, reporting and alerting on all the log data in real time from one place. It has limited functionality and features compared to the other two versions.
Features of Splunk
Data Types: Splunk supports any format and any amount of data, enabling centralised log management.
Dashboards and Visualisations: Customised dashboards and data visualisations. Dashboards integrate charts, reports and re-usable panels to display comprehensive data.
Monitoring and Alerting: Continuous monitoring of events, conditions and critical KPIs helps you have greater visibility into your operations.
Reporting: Reports can be created in real time and scheduled to run at any interval.
Apps and Add-ons: Splunkbase has 1000+ apps and add-ons from Splunk, partners and the community.
Components of Splunk
The primary components in the Splunk architecture are the forwarder, the indexer, and the search head.
Splunk Forwarder:
The forwarder is an agent you deploy on IT systems, which collects logs and sends them to the indexer. Splunk has two types of forwarders:
Universal Forwarder – forwards the raw data without any prior treatment. This is faster and requires fewer resources on the host, but results in huge quantities of data being sent to the indexer.
Heavy Forwarder – performs parsing and indexing at the source, on the host machine, and sends only the parsed events to the indexer.
Splunk Indexer
The indexer transforms data into events, stores it to disk and adds it to an index.
Splunk Search Head
The search head provides the UI users can use to interact with Splunk. It allows users to search and query Splunk data.
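For instance, a user on the search head might run a simple SPL query like the one below (the index, sourcetype and field values are just examples):

index=main sourcetype=access_combined status=500 | stats count by host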
What Splunk can Index
Splunk can index almost any machine data: application and server logs, syslog, Windows events, configuration files, metrics, network data and more.
Alternatives to Splunk
Sumo Logic: allows you to monitor and visualise historical and real-time events.
Loggly: helps you to collect data from the system using Syslog compatibility.
ELK Stack: allows users to take data from any source, in any format, and to search, analyse and visualise that data.
Hope this gave you a brief overview of the software and its functions. We'll be adding more related content soon. Please feel free to add your feedback/questions in the comments section.
Jenkins is one of the important tools in DevOps, and we often need to execute Jenkins jobs using remote REST API calls. Jobs can be either parameterized or non-parameterized; a parameterized job needs certain inputs from the user for execution. Here we will discuss how to call both types of jobs using the REST API:
Triggering a parameterized job by sending values in the URL
Triggering a parameterized job by sending values in a JSON file
Introduction to Jenkins API
[From Jenkins Documentation] “Jenkins is the market leading continuous integration system, originally created by Kohsuke Kawaguchi. This API makes Jenkins even easier to use by providing an easy to use conventional python interface.”
Jenkins provides a rich set of REST-based APIs.
Setting up a Jenkins Job to respond to the REST API
The REST API trigger can be enabled on a per-job basis. To enable the REST API trigger for a job, navigate to Your JobName -> Configure -> Build Triggers tab and check 'Trigger builds remotely'.
Find out the API URL for the Job
Once you enable the check box to trigger builds remotely, Jenkins will show you the URL to access the particular job and give you an option to provide the API token for the build. Say my Jenkins server URL is 10.10.10.100:8080 and my job name is 'test-job'; then the URLs will be as follows:
http://10.10.10.100:8080/job/test-job/build?token=MyTestAPIToken -> for a non-parameterized build
http://10.10.10.100:8080/job/test-job/buildWithParameters?token=MyTestAPIToken -> for a parameterized build
Handling Authentication
Jenkins uses a combination of user-credential-based authentication and API token authentication. We can set a token for each build as shown above. In user credential authentication, you can pass either username+password or username+token. To get a token for your username, log in with your user account and navigate to Manage Jenkins -> Manage Users -> select the user name -> click Add New Token -> give the token a name -> click Generate. This will display the token for your user. Copy this token and save it in a safe place, as the token cannot be recovered later.
Building REST API request
There are two steps involved in making a successful API request. The first step is to send an authentication request and get the CRUMB value; this crumb data needs to be sent as a header on further API requests, as Jenkins uses it to prevent Cross-Site Request Forgery (CSRF). The second step is the actual job request, where you specify the job name and the parameters for the job. The following is an example of getting the CRUMB data using a curl query.
Getting CRUMB data:
Format: crumb=$(curl -vvv -u "username:password" -s 'http://jenkinsurl/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,":",//crumb)')
Triggering a non-parameterized job is easy, as there is no need to send any additional data for the build. Below is an example API request; assume we have got the crumb data from the step above.
curl -H "$crumb" --user apiuser:1104fbd9d00f4e9e0240365c20a358c2b7 -X POST http://10.10.10.100:8080/job/test-job/build?token=MyTestAPIToken
Here 'test-job' is the name of my job and 'MyTestAPIToken' is the token keyword which I set manually on the job's Configure page. Refer to the 'Find out the API URL for the Job' section above for setting up the token keyword.
How to create a Parameterized Job:
Consider a Jenkins job where I ask for user inputs before executing the job; these types of jobs are called parameterized jobs. We can enable parameter requests by checking the 'This project is parameterized' option under the General tab. Here I am enabling the parameterized option for the job named 'test-job' and adding two string parameters, 'message1' and 'message2'.
Click on the job name -> Configure -> click on the General tab -> enable 'This project is parameterized' -> click Add Parameter and select 'String Parameter' from the drop-down box. You can see multiple input parameter types which you can use as per your requirements.
In the window that appears, enter the name as message1 and give a description. Click 'Add Parameter' and repeat the steps for 'message2' as well.
Execute the job by selecting your job name and clicking 'Build with Parameters'. This will prompt for user input before initiating the build. You can use the data provided by the user inside your build bash script or pipeline steps in the '$message1' format, for example:
Eg: echo Message 1 to the user $message1
echo Message 2 to the user $message2
Triggering a Parameterized Job:
You can trigger a parameterized job using an API call and pass in values. You can pass values in a URL-encoded format or as part of a JSON file if you have many parameters.
Passing parameters using URL encoding:
Step 1: Collect the CRUMB data
See the above section ‘Building REST API request’
Step 2: Send the parameters in URL
curl -v -H "$crumb" --user apiuser:apiuser -X POST 'http://10.10.10.100:8080/job/test-job/buildWithParameters?token=MyTestAPIToken&message1=hai&message2=hello'
Note: message1 and message2 are the names of the parameters; please see above.
Passing parameters using URL-encoded JSON format:
Step 1: Collect the crumb data
See the above section ‘Building REST API request’
Step 2: Send the parameters in URL-encoded JSON format
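A sketch of such a request is given below. Jenkins accepts a URL-encoded json form field on the /build endpoint; the job name, credentials and parameter values are the same placeholders used earlier in this post:

curl -v -H "$crumb" --user apiuser:apiuser -X POST 'http://10.10.10.100:8080/job/test-job/build?token=MyTestAPIToken' --data-urlencode json='{"parameter": [{"name":"message1","value":"hai"},{"name":"message2","value":"hello"}]}'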