Splunk Part II – Installation

In our Splunk Overview post, we discussed the basic details of the product, and we also had Q&A posts on the same. In this post we will be covering how to install the software in your setup, which will get you started with your hands-on experience.

Let’s see how to get it done…

Splunk installation:

We have gone through a brief introduction of Splunk in the previous blog. Now let’s go ahead and download Splunk.

In the previous blog we learned that there are multiple components in Splunk. Do we need to download them one by one? The answer is NO.

We need to download only 2 packages:

  • Splunk Enterprise
  • Splunk Universal Forwarder

You can visit www.splunk.com to download Splunk. Click on “Free Splunk” and register yourself. Once successfully registered, you will be able to download Splunk.


For downloading the Universal Forwarder, you can use the link below or search Google for “Splunk Universal Forwarder”.

https://www.splunk.com/en_us/download/universal-forwarder.html

If you want to download the packages directly to a Unix server, you can use the wget command given on the Splunk download page.

You can install Splunk on Linux using the rpm command:

# rpm -ivh <Splunk.rpm>
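
If you prefer to do everything from the command line, the flow looks roughly like this. This is only an illustrative sketch: the wget URL is the one you copy from the Splunk download page, and /opt/splunk is the default install location.

# download link copied from the wget option on the Splunk download page
wget -O splunk.rpm "<wget URL copied from the download page>"
rpm -ivh splunk.rpm
/opt/splunk/bin/splunk start --accept-license   # first start, sets the admin credentials
/opt/splunk/bin/splunk enable boot-start        # optional: start Splunk automatically at boot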

Now you have installed one instance of Splunk. Next, you have to configure this instance as one of the Splunk components as per your architecture. We will come up with another blog on component configuration.

Hope you enjoyed reading this post. Please share your thoughts in the comments section.

Splunk Interview Questions and Answers – Part II

In our previous post, we had covered a few questions and answers on Splunk. This is the second post in that series, where we will be adding a few more important technical details. Without much of an intro, let’s get directly into the details…

  • What are Buckets? Explain Splunk Bucket Lifecycle.

Buckets are directories that store the indexed data in Splunk. So, a bucket is a physical directory that holds the events of a specific time period. A bucket undergoes several stages of transformation over time (a sample indexes.conf sketch follows the list). They are:

  1. Hot – A hot bucket contains the newly indexed data, and hence, it is open for writing and new additions. An index can have one or more hot buckets.
  2. Warm – A warm bucket contains the data that is rolled out from a hot bucket.
  3. Cold – A cold bucket has data that is rolled out from a warm bucket.
  4. Frozen – A frozen bucket contains the data rolled out from a cold bucket. The Splunk Indexer deletes the frozen data by default. However, there’s an option to archive it. An important thing to remember here is that frozen data is not searchable.
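
For illustration (not part of the original answer), the rollover behaviour is controlled per index in indexes.conf. A minimal sketch, assuming a custom index named web and sample path/retention values:

[web]
homePath   = $SPLUNK_DB/web/db         # hot and warm buckets
coldPath   = $SPLUNK_DB/web/colddb     # cold buckets
thawedPath = $SPLUNK_DB/web/thaweddb   # thawed (restored frozen) data
maxHotBuckets = 3                      # how many hot buckets can be open at once
frozenTimePeriodInSecs = 15552000      # roll buckets to frozen after ~180 days
coldToFrozenDir = /archive/splunk/web  # archive frozen buckets here instead of deleting them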


  • Define Sourcetype in Splunk.

In Splunk, Sourcetype refers to the default field that is used to identify the data structure of an incoming event. Sourcetype should be set at the forwarder level for indexer extraction to help identify different data formats. It determines how Splunk Enterprise formats the data during the indexing process.
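
As a small illustration (a sketch, not from the original answer; the monitored path, sourcetype name and index are assumptions), a sourcetype is usually assigned in inputs.conf on the forwarder:

[monitor:///var/log/secure]
sourcetype = linux_secure
index = os_logs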


  • Explain the difference between Stats and Eventstats commands.

In Splunk, the Stats command is used to generate summary statistics of all the existing fields in the search results and save them as values in newly created fields. Although the Eventstats command is pretty similar to the Stats command, it adds the aggregation results inline to each event (and only if the aggregation is pertinent to that particular event). So, while both commands compute the requested statistics, the Eventstats command aggregates the statistics into the original raw data.
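
A quick illustration (the index, sourcetype and field names here are assumptions):

index=web sourcetype=access_combined | stats avg(bytes) AS avg_bytes BY host
index=web sourcetype=access_combined | eventstats avg(bytes) AS avg_bytes BY host

The first search returns one summary row per host, while the second returns every original event with an avg_bytes field appended to it.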

  • Differentiate between Splunk App and Add-on.

Splunk Apps refer to the complete collection of reports, dashboards, alerts, field extractions, and lookups. However, Splunk Add-ons only contain built-in configurations – they do not have dashboards or reports.

  • What is the command to stop and start Splunk service?

The command to start Splunk service is: ./splunk start

The command to stop Splunk service is: ./splunk stop

  • How can you clear the Splunk search history?

To clear the Splunk search history, you need to delete the following file from the Splunk server:

$splunk_home/var/log/splunk/searches.log


  • What is a Fishbucket and what is the Index for it?

Fishbucket is an index directory residing at the default location, that is:

/opt/splunk/var/lib/splunk

The Fishbucket includes seek pointers and CRCs for the indexed files. To access the Fishbucket, you can search from the GUI with:

index=_thefishbucket

  • What is the Dispatch Directory?

The Dispatch Directory includes a directory for each individual search that is either running or has completed. The Dispatch Directory is located at:

$SPLUNK_HOME/var/run/splunk/dispatch

  • How does Splunk avoid duplicate indexing of logs?

The Splunk Indexer keeps track of all the indexed events in a directory – the Fishbucket directory, which contains seek pointers and CRCs for all the files currently being indexed. So, if a file’s CRC and seek pointer show that it has already been read, splunkd will not index it again.

That’s it in this post. We will soon be adding more Q&A in the next post in this series. Stay tuned.

Please share your feedback/queries in the comments section below.

Splunk Interview Questions and Answers – Part I

In one of our recent posts, we discussed Splunk. We would recommend reading the overview post before going through this post.

Due to the high demand for this skill in the market, we were requested by one of our readers to have a Q&A on the same. In this first part of the series, we are covering some of the important questions related to Splunk. More will be added soon in the next post.

Let’s get started…

  1. What is Splunk ?

Splunk is a software platform that allows users to analyse machine-generated data (from hardware devices, networks, servers, IoT devices, etc.). Splunk is widely used for searching, visualising, monitoring, and reporting enterprise data. It processes and analyses machine data and converts it into powerful operational intelligence by offering real-time insights into the data through accurate visualisations.


  2. Name the components of Splunk architecture.

The Splunk architecture is made of the following components:


  • Search Head – It provides the GUI for searching
  • Indexer – It indexes the machine data
  • Forwarder – It forwards logs to the Indexer
  • Deployment Server – It manages the Splunk components in a distributed environment and distributes configuration apps.

 

  3. Name the common port numbers used by Splunk.

The common port numbers for Splunk are:

  • Web Port: 8000
  • Management Port: 8089
  • Network port: 514
  • Index Replication Port: 8080
  • Indexing Port: 9997
  • KV store: 8191

 

  4. What are the different types of Splunk dashboards?

There are three different kinds of Splunk dashboards:

  • Real-time dashboards
  • Dynamic form-based dashboards
  • Dashboards for scheduled reports

  5. Name the types of search modes supported in Splunk.

Splunk supports three types of search modes, namely:

  • Fast mode
  • Smart mode
  • Verbose mode

 

  6. Name the different kinds of Splunk Forwarders.

There are two types of Splunk Forwarders:

  • Universal Forwarder (UF) – It is a lightweight Splunk agent installed on the source (non-Splunk) system to gather data locally. A UF cannot parse or index data (see the outputs.conf sketch after this list).
  • Heavyweight Forwarder (HWF) – It is a heavyweight Splunk agent with advanced functionalities, including parsing and indexing capabilities. It is used for filtering data.
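
For context (an illustrative sketch, not part of the original answer; the group name and indexer address are assumptions), a Universal Forwarder is typically pointed at the indexer tier in outputs.conf:

[tcpout]
defaultGroup = primary_indexers

[tcpout:primary_indexers]
server = indexer01.example.com:9997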

 

  7. What is the use of License Master in Splunk?

The License Master in Splunk is responsible for making sure that the right amount of data gets indexed. Splunk licensing is based on the volume of data that comes into the platform within a 24-hour window.


  8. What happens if the License Master is unreachable?

If the License Master is unreachable, it is simply not possible to search the data. However, the data coming in to the Indexer will not be affected: the data will continue to flow into your Splunk deployment and the Indexers will continue to index it as usual. You will, however, get a warning message at the top of your Search Head or web UI.

 

  9. What is the purpose of Splunk DB Connect?

Splunk DB Connect is a generic SQL database plugin designed for Splunk. It enables users to integrate database information with Splunk queries and reports seamlessly.

 

  10. What are some of the most important configuration files in Splunk?

The most crucial configuration files in Splunk are:

  • props.conf
  • indexes.conf
  • inputs.conf
  • outputs.conf
  • transforms.conf

That’s it in this part. Please see the second part of the series for more questions. Our comments section is open for any questions/comments/feedback.

Getting familiar with Splunk – a brief introduction

Are you getting started on your journey towards Splunk? Or are you in the early stages of the Splunk learning path? If your answer is ‘yes’, this post is for you. We will be uncovering some of the very basics about Splunk in this post.

Splunk is a software used to search and analyse machine data. This machine data can come from web applications, sensors, devices or any data created by users. It serves the needs of IT infrastructure by analysing the logs generated in various processes, but it can also analyse any structured or semi-structured data with proper data modelling. Splunk captures, indexes, and correlates real-time data in a searchable container and produces graphs, alerts, dashboards and visualisations. Splunk makes data easily accessible across the whole organisation for easy diagnostics and solutions to various business problems.


Let’s dive into the details..

Product categories

Splunk is available in three different product categories as follows −

Splunk Enterprise − It is used by companies which have large IT infrastructure and IT driven business. It helps in gathering and analysing the data from websites, applications, devices and sensors, etc.


Splunk Cloud − It is the cloud-hosted platform with the same features as the enterprise version. It is available from Splunk itself or through the AWS cloud platform.

Splunk Light − It allows search, report and alert on all the log data in real time from one place. It has limited functionalities and features as compared to the other two versions.

Features of Splunk 

Data Types : Splunk supports any format and any amount of data, enabling centralised log management.
Dashboards and Visualisations : customised dashboards and data visualisations. Dashboards integrate charts, reports and re-usable panels to display comprehensive data.
Monitoring and Alerting : continuous monitoring of events, conditions, and critical KPIs gives greater visibility into your operations.
Reporting : reports can be created in real time and scheduled to run at any interval.
Apps and Add-ons : Splunkbase has 1000+ apps and add-ons from Splunk, partners and the community.

Components of Splunk


The primary components in the Splunk architecture are the forwarder, the indexer, and the search head.

Splunk Forwarder:
The forwarder is an agent you deploy on IT systems, which collects logs and sends them to the indexer. Splunk has two types of forwarders:

Universal Forwarder – forwards the raw data without any prior treatment. This is faster and requires fewer resources on the host, but results in huge quantities of data being sent to the indexer.
Heavy Forwarder – performs parsing and indexing at the source, on the host machine and sends only the parsed events to the indexer.
Splunk Indexer
The indexer transforms data into events, stores them to disk and adds them to an index.

Splunk Search Head
The search head provides the UI users can use to interact with Splunk. It allows users to search and query Splunk data.


What Splunk can Index

Alternatives to Splunk

Sumo Logic : allows you to monitor and visualize historical and real-time events.
Loggly : helps you to collect data from the system using Syslog compatibility.
ELK stack : ELK Stack allows users to take data from any source, in any format, and to search, analyze, and visualize that data.

Hope this gave you a brief overview of the software and its functions. We’ll be adding more related content soon. Please feel free to add your feedback/questions in the comments section.

Getting familiar with Boto3 – AWS SDK for Python

Let’s discuss Boto3, the AWS Software Development Kit (SDK) for Python, in this post. Boto3 enables you to perform most of the tasks in almost all the AWS services via Python scripts. For more details on the available services and actions, I would recommend referring to the Boto3 documentation.

In this post we will have a brief introduction to the SDK including the installation and a few examples.

Installation

Yes, as you expected, it is the pip install command. pip install boto3 will take care of the installation, and if you want a specific version you can use the syntax below (where version is the required version, 1.20.30 for example).

pip install boto3==version

C:\Users\bforum>python -m pip install boto3
Collecting boto3
Downloading boto3-1.20.31-py3-none-any.whl (131 kB)
|████████████████████████████████| 131 kB 168 kB/s
Collecting botocore<1.24.0,>=1.23.31
Downloading botocore-1.23.31-py3-none-any.whl (8.5 MB)
|████████████████████████████████| 8.5 MB 726 kB/s

 

==== truncated output ====

 

Successfully installed boto3-1.20.31 botocore-1.23.31

Setting up Boto3 

Now that you have installed the module, you have to import it in your program. import boto3 would do it.


For accessing the AWS services, you have to provide credentials for Boto3 to connect to the environment. You can keep shared credentials (the access key ID and secret key for AWS access) in the ~/.aws/credentials file. If you are using an EC2 instance, you can use IAM roles for accessing the AWS environment.

Not at all recommended, but here I am keeping my credentials inside the script itself.

Basic S3 tasks

aws_secret_access_key = 'YOUR_SECRET_KEY'
s3 = boto3.client(service_name='s3', aws_access_key_id='YOUR_ACCESS_KEY_ID', aws_secret_access_key=aws_secret_access_key)

You can replace the above line with just s3 = boto3.client('s3') if you have your credentials defined in the credentials file or taken care of by an IAM role.

You can now invoke the below command to create an S3 bucket.

s3.create_bucket(Bucket='bucket_name')

# where bucket_name is the desired bucket name; it should be globally unique and must meet the naming requirements (lowercase letters, numbers, periods and dashes only, 3-63 characters long, etc…)

You will get an HTTP 200 response with bucket name and creation details.

Now let’s see how we can list S3 buckets.

s3 = boto3.client('s3')
s3.list_buckets()


The above list_buckets command will list the existing buckets, but the result is an HTTP response in JSON format. To filter only the names, let’s use the commands below (basically a for loop).

out = s3.list_buckets()
for bucket in out['Buckets']:
   print(bucket["Name"])

The output will be the bucket names as below

beginnersforum-bf1
beginnersforum-bf2
beginnersforum1crrs3

You can download a file from S3 using the below commands

bucket_name = 'beginnersforum-bf1'
keyname = 'Test Text file1.txt'
output_filename = 'test1.txt'
s3.download_file(bucket_name, keyname, output_filename)

Few EC2 examples

ec2 = boto3.client('ec2')  # will create an EC2 client which can be used in the upcoming commands.

start an instance : ec2.start_instances(InstanceIds=[instance_id])

Reboot an instance : ec2.reboot_instances(InstanceIds=[instance_id])

Shutdown an instance : ec2.stop_instances(InstanceIds=[instance_id])
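
To round this off, here is a small illustrative sketch (not from the original post) that starts an instance and then checks its state; the instance ID is a placeholder:

import boto3

ec2 = boto3.client('ec2')
instance_id = 'i-0123456789abcdef0'  # placeholder instance ID

ec2.start_instances(InstanceIds=[instance_id])

# describe_instances returns reservations -> instances -> state
resp = ec2.describe_instances(InstanceIds=[instance_id])
state = resp['Reservations'][0]['Instances'][0]['State']['Name']
print(instance_id, 'is now', state)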

That was just a very basic intro to this Python module. Based on your need, you may refer to the specific service and action details in the SDK documentation. Hope this post helped you in understanding the module; please feel free to share your thoughts/feedback in the comments section.

AWS Solutions Architect Associate Certification preparation – short notes-VI

Our sixth and last post in our AWS Solutions Architect Associate Certification preparation series. Hope you have gone through the previous posts and you are happy with the content we shared.
.
[ Disclaimer : This is not a complete training material for the certification. This is just random (short) notes which we captured from course curricula, which will help the readers for their final revision/rewind before appearing for the exam. We do not offer any guarantee in passing the exam with this content ]
.
VPC (Continued)
The steps for setting up a VPC are very important for your exam (as below; an illustrative CLI sketch follows the list)
* Create the VPC – this creates a route table, NACL and security group
* Create additional subnets – set auto-assign public IP to Yes for the public subnet
* Create an internet gateway and attach it to the VPC
* Create an additional route table – for allowing public access (we should never open the main route table to the public)
* Associate the subnets with the route tables – associate the public subnet with the new route table
* Launch instances into the new VPC and the different subnets
* Create a security group for the private instance to allow access to it from the public instance
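
The same flow on the AWS CLI, as an illustrative sketch (the CIDR blocks are examples and the resource IDs are placeholders returned by the earlier commands):

aws ec2 create-vpc --cidr-block 10.0.0.0/16
aws ec2 create-subnet --vpc-id vpc-xxxx --cidr-block 10.0.1.0/24    # public subnet
aws ec2 create-subnet --vpc-id vpc-xxxx --cidr-block 10.0.2.0/24    # private subnet
aws ec2 modify-subnet-attribute --subnet-id subnet-xxxx --map-public-ip-on-launch   # on the public subnet
aws ec2 create-internet-gateway
aws ec2 attach-internet-gateway --internet-gateway-id igw-xxxx --vpc-id vpc-xxxx
aws ec2 create-route-table --vpc-id vpc-xxxx                        # the additional (public) route table
aws ec2 create-route --route-table-id rtb-xxxx --destination-cidr-block 0.0.0.0/0 --gateway-id igw-xxxx
aws ec2 associate-route-table --route-table-id rtb-xxxx --subnet-id subnet-xxxx     # the public subnet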


NAT instances and NAT gateways are ways of performing NAT so that systems in the private subnet can communicate with the internet. NAT instances are single EC2 instances without any redundancy. NAT gateways are highly available within an AZ; a gateway cannot span AZs, so it is better to have a separate gateway in each AZ.
A NAT instance forwards traffic for which it is neither the source nor the destination, so for this to work we have to disable the source/destination check on the instance.
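
For reference (an illustrative command; the instance ID is a placeholder), the check is disabled like this:

aws ec2 modify-instance-attribute --instance-id i-xxxx --no-source-dest-check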

NACL (Network Access Control Lists) : a default NACL is created when the VPC is created, with all-allow rules. Any subnet being created is added by default to the default NACL. We can create additional NACLs and associate subnets with them; one subnet can be part of only a single NACL. Rules are evaluated in ascending order of rule number and the first matching rule applies, so a deny rule no. 100 takes precedence over an allow rule no. 200 (e.g. rule 100 allows all access on port 80 and rule 200 denies all access on port 80: the allow is in effect).
The NACL is checked before the corresponding rule in the Security Group.
At least 2 public subnets are required for creating the LoadBalancers.
Network flow logs capture IP traffic (TCP flow) information using CloudWatch. They can be enabled at the VPC, subnet or network-interface level.
You cannot enable flow logs for a VPC peered with a VPC in another AWS account. The flow log configuration can’t be modified after creation (e.g. you can’t change the IAM role).
A Bastion host allows administration of instances in the private subnet. A NAT gateway/instance allows internet access for the private instances, but administration through it is not possible.
AWS Direct Connect : Direct Connect (DX) locations are widely available, and we have a customer/partner cage there with routers. These connect (over the AWS backbone network) to the AWS cage routers. The customer/partner routers connect back to the customer premises (office/DC), and the AWS routers connect to the AWS services (instances/S3/VPC etc…).


VPC endpoints allow you to connect the VPC to AWS services without going out of the AWS network. There are two types: VPC gateway endpoints (supported for S3 and DynamoDB) and VPC interface endpoints.
High Availability
Application load balancers : application aware, operate at layer 7 of the OSI model. Handle HTTP and HTTPS requests.
Network load balancers : TCP traffic balancing, for extreme performance.
Classic load balancers : can do both. Legacy option. Not application aware, so it may return a 504 (gateway timeout) error without knowing whether the underlying problem is a database issue or a web server issue.
X-Forwarded-For : this header carries the client’s public IP as the load balancer forwards the request to the actual application.
That’s it..! We know many topics are still not covered, but we have made an effort to help your certification preparation.
Your feedbacks are very much valuable and it helps us improve our contents. Thank You..!

AWS Solutions Architect Associate Certification preparation – short notes-V

Our fifth post on the AWS Solutions Architect Associate certification preparation topic. Hope you have enjoyed the previous posts in this series where we discussed many important topics including EC2, S3, Databases etc…
.
[ Disclaimer : This is not a complete training material for the certification. This is just random (short) notes which we captured from course curricula, which will help the readers for their final revision/rewind before appearing for the exam. We do not offer any guarantee in passing the exam with this content ]
.
Now, let’s continue
Route53
Note : ELBs do not have a pre-defined IP address, you route to them using route53.
1. Simple routing policy – Can have multiple entries against one name and the policy picks the IPs randomly during the request.


2. Weighted Routing policy – We can set a weight for each record (individual host records have to be created) and requests are distributed in proportion to those weights. [We can create health checks for the instances, and the routing policy omits records that fail their health checks.] (A sample CLI command for a weighted record follows this list.)

3. Latency-based policy – Route53 picks the DNS record/instance with the least network latency.
4. Failover routing policy – We can define active and passive records. A health check monitors the active record, and traffic fails over to the passive record if it becomes unhealthy.
5. Geolocation routing policy – The DNS record/EC2 instance is chosen based on the location of the user querying DNS. Not the same as latency-based routing.
6. Geoproximity routing policy – A complicated one. Routes based on the location of both the users and the resources; Bias (keyword) expands or shrinks the area from which traffic is routed to a resource.
7. Multivalue routing policy – Similar to the simple routing policy, but allows health checks for multiple records.
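
As an illustration (not from the original notes; the hosted zone ID, record name and IP are placeholders), a weighted record can be created or updated like this:

aws route53 change-resource-record-sets --hosted-zone-id Z0000000000000 --change-batch '{
  "Changes": [{
    "Action": "UPSERT",
    "ResourceRecordSet": {
      "Name": "app.example.com",
      "Type": "A",
      "SetIdentifier": "primary",
      "Weight": 70,
      "TTL": 60,
      "ResourceRecords": [{"Value": "192.0.2.10"}]
    }
  }]
}'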

VPC (Virtual Private Cloud)
Virtual Private Cloud allows network segregation, letting you create your own logically isolated AWS environment. You get complete control of the network settings (including IP addresses, subnets, route tables, internet gateways etc…). You can separate hosts into private (without internet) and public (with internet) segments, adding security, and you can create a VPN connection to the VPC and use AWS as an office/datacenter extension.
* Launch instances into a chosen subnet
* Assigning custom IP address ranges in each subnet
* Configuring route tables between subnets

* Create internet gateway and attach to our VPC
* Better security control over AWS resources
* instance security groups
* Subnet Network ACLs.

The default VPC allows easy instance deployments. All subnets in the default VPC have a route to the internet. Each EC2 instance will have both private and public IPs.
VPC Peering : allows direct communication with hosts in another VPC. Peering can be done with VPCs in another AWS account and in another region as well. No transitive peering (a direct peering connection between each pair of VPCs is required).
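
An illustrative CLI sketch (the VPC and peering connection IDs are placeholders):

aws ec2 create-vpc-peering-connection --vpc-id vpc-aaaa1111 --peer-vpc-id vpc-bbbb2222
aws ec2 accept-vpc-peering-connection --vpc-peering-connection-id pcx-cccc3333
# routes to the peer VPC's CIDR still have to be added in each VPC's route tables
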
1 AZ can have one or more Subnets, but 1 subnet can’t span across AZs.
Only 1 Internet Gateway per VPC.
We are not done with VPC yet, we will add additional notes in the next post in this series. Hope these contents are helping you in your preparation.
Feel free to share your feedback/suggestions in the comments section.

AWS Solutions Architect Associate Certification preparation – short notes-IV

Into our fourth post in the AWS Solutions Architect Associate certification preparation series.

In our previous posts, we discussed the common topics including S3, EC2 etc… In this post, we will cover the databases section.

.
[ Disclaimer : This is not a complete training material for the certification. This is just random (short) notes which we captured from course curricula, which will help the readers for their final revision/rewind before appearing for the exam. We do not offer any guarantee in passing the exam with this content ]
.

Relational Databases

6 DBs available in AWS are – SQL server, Oracle, MySQL, PostgreSQL, Amazon Aurora, MariaDB

Multi-AZ for Disaster recovery and Read Replicas for Performance.
DynamoDB is amazon’s No SQL solution.
Redshift is Amazon’s data warehousing solution (for Online Analytic Processing – OLAP).
ElastiCache – improves performance with an in-memory cache in the cloud. Supports 2 open-source in-memory caching engines – Memcached and Redis.


RDS runs on VMs but we cannot access those; AWS takes care of managing the VMs. RDS is NOT serverless (except Aurora Serverless).

RDS Backups : Automated daily backups and snapshots. Retention period 1-35 days.
Automated backups are enabled by default. Data will be saved in S3, and you get space for free.
During backup window, IO will be suspended and there may be performance issue.

DB snapshots are manual.
Restored DB (from manual snapshot or Automated backup), will be a new RDS instance with new endpoint (URL)

D@RE encryption is supported (with AWS KMS) for SQL server, Oracle, MySQL, PostgreSQL, Amazon Aurora, MariaDB. Stored data, backups and snapshots are all encrypted.
Multi-AZ : For disaster recovery. AWS will automatically switch to the secondary copy in case of any maintenance or disaster. Supported for SQL Server, Oracle, MySQL, PostgreSQL, and MariaDB. Amazon Aurora, by its architecture, supports multi-AZ failure.
Read-replica : for performance improvement on read-intensive database instances. Reads can be redirected to any of the asynchronous copies of the actual instance, while writes still go to the primary DB. Supported by MySQL, PostgreSQL, Amazon Aurora and MariaDB.
Can have up to 5 copies/replicas of the primary. Can have read-replicas of read-replicas (performance may reduce). Automatic backups must be turned on.
We can have read-replicas that can have multi-AZ. Can create read-replicas of multi-az source DB.
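
An illustrative CLI sketch (the DB instance identifiers are placeholders):

aws rds create-db-instance-read-replica --db-instance-identifier mydb-replica-1 --source-db-instance-identifier mydb
aws rds promote-read-replica --db-instance-identifier mydb-replica-1   # later, if the replica must become standalone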

DynamoDB : AWS’s NoSQL DB. Uses SSD and is spread across 3 geographically separate facilities.
Eventually consistent reads (default) – data consistency is ensured within about a second of a write.
Strongly consistent reads – needed if data will be read by the application within a second of the write.
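
For example (an illustrative command; the table and key are assumptions, with Artist as the only key attribute), a strongly consistent read is requested with the --consistent-read flag; omitting it gives the default eventually consistent read:

aws dynamodb get-item --table-name Music --key '{"Artist": {"S": "Norah Jones"}}' --consistent-read
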
Redshift is used for business intelligence. OLAP solution for data warehousing. Available in 1 AZ at present (can’t span across multiple AZs).
Backup is enabled by default with 1-day retention. Can be modified to a maximum of 35 days.
Always 3 copies (1x original + 1x replica + 1x backup in S3) are kept.
For disaster recovery, Redshift can automatically replicate the snapshots to an S3 bucket in a different region.
Redshift configuration:
Single node with 160 GB, or multi-node (which will have a leader node – which receives the queries and manages client connections – and up to 128 compute nodes – which process the queries and computations). Users are charged for the hours the compute nodes are operating, not the leader node.
D@RE for Redshift uses AES-256 encryption. Redshift takes care of key management by default; we can also manage keys using a Hardware Security Module (HSM) or AWS KMS.
Uses an advanced level of compression, which identifies similar data and compresses it.

Amazon Aurora
MySQL compatible relational database engine, 5x better performance than MySQL.
Starts with 10G and increments by 10G up to 64TB. Compute resources can scale up to 32 vCPUs and 244G memory.
6 copies of data (2 copies in each of 3 AZs). Can lose 2 copies of data without affecting write availability, and 3 copies without affecting read availability. Aurora read replicas are better and can go up to 15 replicas (vs 5 for MySQL read replicas). Automated failover (to a read replica) is supported.

Elasticache – improves performance with an in-memory cache in the cloud. Supports 2 open-source in-memory caching engines – Memcached (simple solution) and Redis (supports Multi-AZ and backups).

Another short post is coming to an end. Hope it was helpful and you enjoyed reading it. Please share your feedbacks as comments.

AWS Solutions Architect Associate Certification preparation – short notes-III

Third post in our AWS Solutions Architect Associate certification preparation series. Hope you have enjoyed the first post and the second in the series. We have a few more topics to cover in this series and some of them are in this post.

[ Disclaimer : This is not a complete training material for the certification. This is just random (short) notes which we captured from course curricula, which will help the readers for their final revision/rewind before appearing for the exam. We do not offer any guarantee in passing the exam with this content ]

Let’s continue…

EBS (Elastic Block Storage)
types :
General Purpose (SSD) (GP2) – General purpose, cost-effective storage. 100 – 16000 IOPS. Mixed workload
Provisioned IOPS (SSD) (IO1) – For IO intensive workloads.
Throughput Optimized HDD (ST1) – low cost magnetic storage, performance in terms of throughput.
Cold HDD (SC1) – For large, sequential cold-data workloads.
Magnetic – Uses magnetic storage, for infrequently accessed data.


Migrating an EC2 instance from one region to another ::> create a snapshot of the root volume > create an AMI from the snapshot > copy the AMI to the other region > launch an instance from the AMI in that region.
Snapshots are stored in S3.
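
An illustrative CLI sketch of that flow (IDs, names and regions are placeholders):

aws ec2 create-image --instance-id i-xxxx --name "web-ami"     # creates an AMI backed by EBS snapshots
aws ec2 copy-image --source-image-id ami-xxxx --source-region us-east-1 --region eu-west-1 --name "web-ami-copy"
# then launch a new instance from the copied AMI in eu-west-1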


AMI (Amazon Machine Images) can also be copied to another region for VM deployments.

When you delete/terminate an instance, the additional drives won’t get deleted by default.
For AMIs backed by EBS volumes, the OS root device is created on an EBS snapshot of an EBS volume. For AMIs backed by instance store, the root device is created from a template stored in AWS S3.
Instance store root volumes will not be listed under EC2 > EBS > Volumes as they are not EBS volumes. We can create an instance from an instance store, but only with a limited hardware (instance type) selection. We cannot stop an instance that is running on instance store; only reboot or terminate options are available. If there’s an issue in the underlying hardware, data will be lost. It is also called ephemeral (short-lived) storage.
The root volume (of the instance) can be encrypted by ::> create a snapshot of the root volume > copy the snapshot with encryption enabled > create an AMI from the encrypted copy > launch an instance from it.
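
The copy-with-encryption step, as an illustrative command (the snapshot ID and region are placeholders):

aws ec2 copy-snapshot --source-snapshot-id snap-xxxx --source-region us-east-1 --encrypted
# add --kms-key-id to use a specific KMS key instead of the default EBS key
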
Cloudwatch and Cloudtrail :
Cloudwatch (Gym trainer to remember) is for performance monitoring – Compute (EC2, Route53, Elastic Load Balancers…), Storage (EBS volumes, Storage Gateway) and CDN (CloudFront).
Cloudtrail (CCTV to remember) is for checking who is calling for whom (a kind of access logging in my understanding).
Cloudwatch monitors at 5-minute intervals by default; this can be reduced to 1 minute as well.
There are 2 ways of using the AWS CLI: one is giving the user the permissions required for the CLI and using their credentials in the CLI; the second is creating an IAM role for CLI access and attaching it to the EC2 instance.


Sample commands :
aws s3 ls   (to list the S3 buckets)
aws s3 mb s3://bforumnewbucket  (to create a bucket with the given name. mb=make bucket)
credentials are saved in plain text in ~/.aws directory.
curl http://169.254.169.254/latest/meta-data – Captures any meta data about the instance
curl http://169.254.169.254/latest/user-data – captures bootstrap data
EFS (Elastic File System)
supports NFSv4. Pay as you use. Petabyte scale. Thousands of concurrent NFS connections. Read after write consistency.
Clustered placement group :- for high performance computing requiring high throughput or low latency. Within a single AZ.
Spread placement group :- for applications with a small number of critical instances that should be kept separate. Can span across AZs.
Placement groups names must be unique.
That is another short post, many more topics to come. Hope you are enjoying this series. Your feedbacks will help in improving our contents, please feel free to add in the comments section.

AWS Solutions Architect Associate Certification preparation – short notes-II

Hope you have gone through the first post in our series on AWS Solutions Architect Associate certification preparation. This is a continuation of a few topics we covered in the first post. In this series of posts, we will be covering the topics required for you to prepare for the AWS Solutions Architect Associate examination, thus helping you in your certification journey.

[ Disclaimer : This is not a complete training material for the certification. This is just random (short) notes which we captured from course curricula, which will help the readers for their final revision/rewind before appearing for the exam. We do not offer any guarantee in passing the exam with this content ]

Let’s continue with our contents.

S3 (continued)

Data @ Rest Encryption (D@RE) can be achieved via (an illustrative CLI example follows this list):
SSE-S3 – server-side encryption with S3 managing the keys
SSE-KMS – server-side encryption using customer-managed keys in AWS KMS (Key Management Service)
SSE-C – server-side encryption using customer-provided keys
Client-side encryption – encrypt the data before putting it in S3. Customer’s responsibility.
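
An illustrative example of the server-side options on upload (the bucket, file and key ID are placeholders):

aws s3 cp report.csv s3://my-bucket/report.csv --sse AES256                                   # SSE-S3
aws s3 cp report.csv s3://my-bucket/report.csv --sse aws:kms --sse-kms-key-id <kms-key-id>    # SSE-KMS
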
Cross Region Replication : existing files (stored before enabling CRR) are not replicated; deletions and delete markers are not replicated.


Edge location – the location where CloudFront caches the content.
Origin – origin of the file; can be S3, EC2, an Elastic Load Balancer or Route53.
Distribution – name given to the CDN (a collection of edge locations).
Web distribution – for websites.
RTMP – for media streaming (Real Time Messaging Protocol, Adobe’s media sharing protocol).

You can invalidate cached content from the CDN, but you will be charged for it.

Snowball – 50 and 80 TB variants. About 1/5 the cost compared to network transfer. 256-bit encryption. Can import/export to/from S3.
Snowball Edge – 100TB, with compute and storage.
Snowmobile – exabyte-scale, up to 100PB. For effective DC migration.
EC2
On-Demand plan – allows for payment as you use (hours/minutes). Good for testing and dev.
Good for short-term testing, small workloads and application testing; no upfront payment.
Reserved plan – for a 1 or 3 year contract and is cheaper.
Standard – up to 75% discount on pricing; the instance type can’t be changed.
Convertible – flexibility to change instance types.
Scheduled – for scheduled scalability.
Spot – as in the share market: if your bid price matches the spot price, you may get the capacity.
If a spot instance is terminated by AWS, the partial hour will not be charged. It will be charged if the termination was initiated by the user.
Dedicated – physical server. Compliance or licensing use cases.
Boot drive can’t be encrypted by EC2, only additional drives can be encrypted. We have to use third party tools in OS (like bitlocker) for boot drive encryption.

Security groups :
When we create an inbound rule, the return traffic is allowed automatically: security groups are stateful, while NACLs (Network ACLs) are stateless.
Can allow traffic for an IP or port, but can’t deny. There’s no deny option for security groups; it is possible with NACLs.
Everything is blocked by default in a security group. You have to go and allow what you want.
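
For illustration (the security group ID is a placeholder), an allow rule is added like this; note that there is no equivalent deny action:

aws ec2 authorize-security-group-ingress --group-id sg-xxxx --protocol tcp --port 443 --cidr 0.0.0.0/0
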
Let’s continue the EC2 discussions and more topics in our next post. Hope you are enjoying reading the series. You are always welcome to post your comments/suggestions/feedback in the comments section below.
