Machine Learning / Artificial Intelligence – basic pre-requisites to get started

Machine Learning and Artificial Intelligence technologies are booming (or should I say boooooming..!!) nowadays. Many of you are planning to get started with your AI or ML studies, but are not clear on where to start or what the pre-requisites are. Yes, it is a bit different from your current Infrastructure/Software/Database/Coding domain, and it requires a bit (actually, a lot) of additional learning.
Let’s try to take a look at the most common pre-requisites for getting started with your AI/ML journey.

Maths, but how much maths?
Not too much; it is basically a refresher of your high-school maths. You just have to remember linear algebra (linear equations, matrices and vectors) and the basics of calculus (rate of change, differentiation, integration etc.). Remember, you need this to understand how the models work. You do not have to do these calculations on your own, and you do not have to code them in maths terms. The model will do it; the maths is for you to understand what is happening in the back end.

Statistics and Probability 

It’s a lot to learn if you are not already good at it. You need a good level of statistics knowledge to understand AI/ML, as the basics are all built on statistical terms.
You should be well versed in descriptive statistics, including the mean, median, standard deviation, variance etc. of the data. The same applies to probability theory. You have to be very good at it and able to understand how data is distributed, including the distribution functions PDF (Probability Density Function), CDF (Cumulative Distribution Function) and PMF (Probability Mass Function).
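To make these terms a little more concrete, here is a minimal sketch using only Python’s standard `statistics` module. The `visits` numbers are made up for illustration, and fitting a normal distribution to them is an assumption of the sketch, not something real data is guaranteed to follow.

```python
import statistics
from statistics import NormalDist

# Hypothetical sample: daily website visits (made-up numbers for illustration).
visits = [120, 135, 150, 110, 160, 145, 130]

mean_visits = statistics.mean(visits)
median_visits = statistics.median(visits)
variance_visits = statistics.pvariance(visits)  # population variance
std_visits = statistics.pstdev(visits)          # population standard deviation

# Assumption: model the data with a normal distribution of the same mean and spread.
dist = NormalDist(mu=mean_visits, sigma=std_visits)
density_at_mean = dist.pdf(mean_visits)  # PDF: relative likelihood around a value
prob_below_mean = dist.cdf(mean_visits)  # CDF: probability of a value <= the given one

print(mean_visits, median_visits, variance_visits, std_visits)
print(density_at_mean, prob_below_mean)
```

The PDF tells you how dense the distribution is around a value, while the CDF tells you how much of the distribution lies at or below it.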

You should also know histograms and other common plotting (visualisation) methods, along with the measures that give a better understanding of the data (covariance, correlation etc.).

The more statistics you know the better you can understand. 
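As a small illustration of covariance and correlation, the sketch below computes both by hand in plain Python. The `hours` and `scores` lists are hypothetical data, and the helper functions are written here only for the example.

```python
import math

# Hypothetical data: hours studied vs. exam score (made-up numbers).
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 65, 70, 80]


def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    """Population covariance: average product of deviations from the means."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def correlation(xs, ys):
    """Pearson correlation: covariance scaled into the range -1..1."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


print(covariance(hours, scores))
print(correlation(hours, scores))
```

A positive covariance says the two variables move in the same direction; the correlation rescales it so that values close to 1 or -1 indicate a strong linear relationship.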
Is it necessary to know coding?
I would say, yes. There are MLaaS (Machine Learning as a Service) tools which will do most of the work for you without your having to write a single line of code. But I would prefer coding it yourself for a better understanding, and that may be necessary depending on your job role.
Python is where everyone lands, based on its popularity and it being Machine Learning friendly (just kidding, it has a rich set of modules available for ML). If you are good at R, C/C++, Java, Julia, Scala, Ruby… you are still in a better position: one item (programming) is already checked.

Useful links
Let me add these soon…
I believe that covers the basic requirements to get started with your ML/AI learning. The more you work with data, the better you will be with it. It requires a lot of learning and even more practice. Please feel free to add your suggestions/queries in the comments section.

Linux basics – interview questions and answers -Booting Part 2

This is the second post in our blog series on Linux basics. We hope you have already gone through part 1 of this series; if not, we recommend reading it. Continuing the booting Q&A, we are adding a few more questions in this post.

Let’s get into the questions and answers

  • How to set a password for single-user mode.

Change the definition of the single-user login shell in /etc/sysconfig/init from sushell to sulogin:

# sed -i "s,^SINGLE=.*,SINGLE=/sbin/sulogin," /etc/sysconfig/init

  • How to reinstall boot loader.

# grub-install /dev/sda

  • What is initial RAM disk image.

The initial RAM disk (initrd) is an initial root file system that is mounted before the real root file system is available. The initrd is bound to the kernel and loaded as part of the kernel boot procedure.

  • How to create initramfs in rescue mode, what are the two utilities

The mkinitrd utility can be used to recreate the initrd image in RHEL 4 and 5.

The dracut utility can be used in later versions of RHEL to rebuild the initramfs image.

  • How to list the content of initramfs

lsinitrd <initramfs image file>


  • What will happen if the grub.conf file is deleted, and how to recover

The system will fail to boot and drop to the GRUB prompt.

To recover, restore the boot partition on hd0 and set up GRUB again.

  • What is a kernel module

Kernel modules are pieces of code that can be loaded into and unloaded from the kernel on demand. They extend the functionality of the kernel without the need to reboot the system.

  • Which package is required for the kernel module utilities

module-init-tools (kmod on RHEL 7 and later)


  • How to list loaded kernel modules

lsmod


  • How to get the information about module

modinfo <module name>

  • How to load module into kernel

modprobe <module name>


  • From which location the modprobe command will load modules

/lib/modules/$(uname -r)

  • How does the modprobe command resolve dependencies, and which file contains the dependency information.

modprobe expects an up-to-date modules.dep.bin file (or fallback human readable modules.dep file), as generated by the depmod utility. This file lists what other modules each module needs (if any), and modprobe uses this to add or remove these dependencies automatically.
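To picture how a dependency file can drive the load order, here is a rough Python sketch. It assumes the plain-text modules.dep layout of one `module: dependencies` entry per line; the sample paths are illustrative, and `parse_modules_dep` and `load_order` are hypothetical helpers written for this example, not part of modprobe.

```python
# Hypothetical excerpt in the modules.dep layout ("module: dep1 dep2 ...").
# The paths are illustrative, not copied from a real system.
SAMPLE_MODULES_DEP = """\
kernel/fs/ext4/ext4.ko: kernel/fs/jbd2/jbd2.ko kernel/fs/mbcache.ko
kernel/fs/jbd2/jbd2.ko:
kernel/fs/mbcache.ko:
"""


def parse_modules_dep(text):
    """Map each module path to the list of module paths it depends on."""
    deps = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        module, _, rest = line.partition(":")
        deps[module.strip()] = rest.split()
    return deps


def load_order(module, deps, seen=None):
    """Return the modules to load, dependencies first (assumes no cycles)."""
    if seen is None:
        seen = []
    for dep in deps.get(module, []):
        load_order(dep, deps, seen)
    if module not in seen:
        seen.append(module)
    return seen


deps = parse_modules_dep(SAMPLE_MODULES_DEP)
print(load_order("kernel/fs/ext4/ext4.ko", deps))
```

The point of the sketch is only the idea: the dependencies of a module are loaded before the module itself, which is what makes modprobe more convenient than loading each file by hand.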

  • What is the difference between modprobe and insmod

modprobe is the intelligent version of insmod. insmod simply adds a module, whereas modprobe looks for any dependencies (if that particular module depends on any other module) and loads them as well.

  • What are the two commands to unload a module from the kernel

modprobe -r <module name>

rmmod <module name>

  • How to blacklist a module

You can modify the /etc/modprobe.d/blacklist.conf file that already exists on the system by default. However, the preferred method is to create a separate configuration file, /etc/modprobe.d/<module_name>.conf, that will contain settings specific only to the given kernel module.

  • What is udev

udev is a generic device manager running as a daemon on a Linux system and listening (via a netlink socket) to uevents the kernel sends out if a new device is initialized or a device is removed from the system

  • How to view the serial number of system

dmidecode -t system

That’s it for this post. We hope you are enjoying the content. Please feel free to add your suggestions/comments/feedback in the comments section.

Linux basics – interview questions and answers -Booting Part 1

It’s been a while since our last Linux/Unix post, so we are starting a series here: a series of posts covering some of the basics in a Q&A format. We are attempting to help you improve your basics, which can be helpful in your revision for job interviews as well.

Here comes the first part, where we will be discussing some of the Q&As from the booting part. This will be helpful for those of you who are at an L1-L2 level in your Linux knowledge.

Let’s get into the stuff…

  • Which files are responsible for starting/killing services depending on the runlevel

The scripts in the rc0.d to rc6.d directories under /etc/rc.d/

  • Which file configures the Ctrl+Alt+Del key combination to shut down the system at the console.

/etc/inittab: to disable the key combination, comment out the line "ca:12345:ctrlaltdel:/sbin/shutdown -t1 -a -r now"

  • What are the two display managers?

GDM (GNOME Display Manager) — The default display manager for Red Hat Enterprise Linux.

KDM — KDE’s display manager which allows the user to shutdown, restart or log in to the system

  • How to switch a run level from one to another?

init <run level>

  • What is happening when we switch into another run level

When init is requested to change the runlevel, it sends the warning signal SIGTERM to all processes that are undefined in the new runlevel. It then waits 5 seconds before forcibly terminating these processes via the SIGKILL signal

  • How to find the current run level

who -r

  • What is rescue mode?

Rescue mode provides the ability to boot a small Red Hat Enterprise Linux environment entirely from CD-ROM, or some other boot method, instead of the system’s hard drive.

There may be times when you are unable to get Red Hat Enterprise Linux running completely enough to access files on your system’s hard drive. Using rescue mode, you can access the files stored on your system’s hard drive, even if you cannot run Red Hat Enterprise Linux from that hard drive

  • How to enter rescue mode?

To boot into rescue mode, you must be able to boot the system using one of the following methods:

By booting the system from an installation boot CD-ROM.

By booting the system from other installation boot media, such as USB flash devices.

By booting the system from the Red Hat Enterprise Linux CD-ROM #1.

Once you have booted using one of the described methods, add the keyword rescue as a kernel parameter. For example, for an x86 system, type the following command at the installation boot prompt: linux rescue

  • How to load a driver at the time of booting in to rescue mode

Type linux dd at the boot prompt at the start of the installation process and press Enter

  • If a driver that is part of the Red Hat Enterprise Linux 6 distribution prevents the system from booting, How to blacklist that driver

Boot the system into rescue mode with the command linux rescue rdblacklist=name_of_driver

Open the /mnt/sysimage/boot/grub/grub.conf file with the vi text editor

# vi /mnt/sysimage/boot/grub/grub.conf

Edit the kernel line by adding the entry rdblacklist=name_of_driver, for example:

kernel /vmlinuz-2.6.32-71.18-2.el6.i686 ro root=/dev/sda1 rhgb quiet rdblacklist=foobar

Create a new file under /etc/modprobe.d/ that contains the command blacklist name_of_driver

# echo "blacklist foobar" >> /mnt/sysimage/etc/modprobe.d/blacklist-foobar.conf

Reboot the system

  • What is chroot, what are the uses.

A chroot is an operation that changes the apparent root directory for the current running process and its children.

  • What is single user mode, how to enter into single user mode ?

Single-user mode provides a Linux environment for a single user that allows you to recover your system from problems that cannot be resolved in networked multi-user environment. You do not need an external boot device to be able to boot into single-user mode, and you can switch into it directly while the system is running

At the GRUB boot screen, press any key to enter the GRUB interactive menu.

Select Red Hat Enterprise Linux with the version of the kernel that you want to boot and press the a key to append to the line.

Type single as a separate word at the end of the line and press Enter to exit GRUB edit mode. Alternatively, you can type 1 instead of single

  • What is emergency mode, how to enter emergency mode, and what is the main difference between single-user mode and emergency mode

Emergency mode provides the minimal bootable environment and allows you to repair your system even in situations when rescue mode is unavailable. In emergency mode, the system mounts only the root file system, and it is mounted as read-only. Also, the system does not activate any network interfaces and only a minimum of the essential services are set up.

At the GRUB boot screen, press any key to enter the GRUB interactive menu.

Select Red Hat Enterprise Linux with the version of the kernel that you want to boot and press the a key to append to the line.

Type emergency as a separate word at the end of the line and press Enter to exit GRUB edit mode.

In emergency mode, you are booted into the most minimal environment possible. The root file system is mounted read-only and almost nothing is set up. The main advantage of emergency mode over single-user mode is that the init files are not loaded. If init is corrupted or not working, you can still mount file systems to recover data that could be lost during a re-installation.

In single-user mode, your computer boots to runlevel 1. Your local file systems are mounted, but your network is not activated. You have a usable system maintenance shell.

Hope you have enjoyed reading this post. Please feel free to add your feedback in the comments section.

Understanding Supervised and Unsupervised (Machine) Learning

As you start with Machine Learning, you will hear a lot about the terms Supervised and Unsupervised learning. You will find a lot of blogs, videos etc. explaining and differentiating these 2 types. This is just another attempt to explain them with some examples.

Supervised learning

Supervised learning means you (the supervisor) are training the machine to identify a few patterns from the data you provide. Here the data carries clear indications (labels) of the pattern for the machine to learn from. The machine can then use this learning to find similar patterns in a new dataset.

As an example, suppose you give a few Apples and Oranges to a kid and you identify some of them as Apples and others as Oranges based on their characteristics (colour, hardness etc.). The next time you give him an Orange, he will be able to identify it as an Orange from his previous experience.

Classification and Regression are 2 types of Machine Learning models that fall under the Supervised Learning category. We will learn these in detail in later posts, but just to give you an overview..

A classification model is used for classifying the input data into categories (classifying emails as spam or not, medical diagnosis such as cancer or not etc.)

A regression model is used to predict a continuous output value from the given input values (predicting the rent amount for houses from their listed features)
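The Apples and Oranges example can be sketched as a tiny classifier in plain Python. This is a nearest-centroid classifier, picked here only because it is short; the feature values and the `train` and `predict` helpers are hypothetical and written for this illustration.

```python
import math

# Hypothetical labelled training data: (redness 0-1, hardness 0-1) -> label.
training = [
    ((0.9, 0.8), "apple"),
    ((0.8, 0.9), "apple"),
    ((0.2, 0.3), "orange"),
    ((0.3, 0.2), "orange"),
]


def train(samples):
    """Learn one centroid (average feature vector) per label."""
    grouped = {}
    for features, label in samples:
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(sum(col) / len(col) for col in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Classify a new fruit as the label whose centroid is closest."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


centroids = train(training)
print(predict(centroids, (0.85, 0.75)))
print(predict(centroids, (0.25, 0.35)))
```

The labels in `training` play the role of the supervisor: the model only learns what an apple looks like because we told it which samples were apples.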

Unsupervised learning

In this type of learning, the data provided is not labelled or classified. The machine tries to find patterns or similarities in the supplied data and groups or sorts it accordingly.

Consider an example similar to the one we discussed for supervised learning: you give a basket of Apples and Oranges to your kid. You do not give any special instructions about the fruits in the basket, but your kid will still be able to form 2 groups, 1 of Apples and the other of Oranges. He/she is able to do it by observing characteristics like colour, softness etc.

Clustering and Association are 2 common types of models falling under this type of learning.

Clustering is forming groups from the given input based on similarities
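Clustering can be sketched with a very small k-means loop in plain Python. The measurements are made up, and starting from two hand-picked points is an assumption that keeps the example repeatable; real libraries choose the starting points more carefully.

```python
import math

# Hypothetical unlabelled fruit measurements: (redness 0-1, softness 0-1).
fruits = [(0.9, 0.2), (0.8, 0.3), (0.85, 0.25), (0.2, 0.7), (0.3, 0.8), (0.25, 0.75)]


def k_means(points, centroids, iterations=10):
    """Basic k-means: assign points to the nearest centroid, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(col) / len(col) for col in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return clusters


# Start from two of the points as initial guesses; no labels are given.
groups = k_means(fruits, centroids=[fruits[0], fruits[3]])
print(groups)
```

Notice that nothing in the data says "apple" or "orange"; the two groups come purely from how close the measurements are to each other.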

Association is mostly used for identifying customers’ buying patterns, which in some cases is called Market Basket Analysis: for example, how many of the customers who buy milk also buy bread.
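The milk and bread question can be expressed with two simple measures, support and confidence. The sketch below uses made-up baskets, and the two helper functions are written for this example only.

```python
# Hypothetical shopping baskets (made-up transactions for illustration).
baskets = [
    {"milk", "bread", "eggs"},
    {"milk", "bread"},
    {"milk", "butter"},
    {"bread", "jam"},
    {"milk", "bread", "butter"},
]


def support(itemset, transactions):
    """Fraction of all baskets that contain every item in itemset."""
    hits = sum(1 for basket in transactions if itemset <= basket)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the baskets containing antecedent, the fraction that also contain consequent."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)


print(support({"milk", "bread"}, baskets))
print(confidence({"milk"}, {"bread"}, baskets))
```

Here 3 of the 5 baskets contain both milk and bread, and 3 of the 4 milk baskets also contain bread, which is the kind of buying pattern association models look for.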

Hope you enjoyed reading this post. Please feel free to share your thoughts in the comments section.

AWS Solutions Architect Associate Certification preparation – short notes-VI

This is the sixth and last post in our AWS Solutions Architect Associate Certification preparation series. We hope you have gone through the previous posts and are happy with the content we shared.
[ Disclaimer : This is not a complete training material for the certification. This is just random (short) notes which we captured from course curricula, which will help the readers for their final revision/rewind before appearing for the exam. We do not offer any guarantee in passing the exam with this content ]
VPC (Continued)
The steps in setting up a VPC are very important for your exam (as below)
* Create the VPC – this creates a route table, ACL and SG
* Create additional subnets – set auto-assign public IP to Yes for the public subnet
* Create an internet gateway and attach it to the VPC
* Create an additional route table – for allowing public access (we should never allow public access on the main route table)
* Associate the subnets with the route tables – the public subnet goes with the new route table
* Launch instances in the new VPC and the different subnets
* Create a security group for the private instance to allow access to it from the public instance

NAT instances and NAT gateways are ways of NATing so that systems in the private subnet can communicate with the internet. NAT instances are single EC2 instances without any redundancy. NAT gateways are HA-enabled within an AZ. A NAT gateway cannot span AZs, so it is better to have a separate gateway in each AZ.
NAT instances forward traffic for which they are neither the source nor the destination. For this to work, we have to disable the source/destination check on the instance.
NACL (Network Access Control List) : the default NACL is created when the VPC is created, with allow-all rules. Any new subnet is associated with the default NACL by default. We can create additional NACLs and associate subnets with them. One subnet can be associated with only a single NACL. Rules are evaluated in ascending rule-number order and the first match wins: if there is a deny rule (rule no. 100) and an allow rule (rule no. 200) for the same traffic, the deny rule takes precedence because it has the lower rule number (numerical order, not the order of creation). (e.g. rule 100 allows all access via port 80 and rule 200 denies all access via port 80: the allow will be in effect.)
NACL rules are checked before the matching rules in the Security group.
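The rule-number ordering is easy to get backwards, so here is a minimal Python sketch of it. The rule dictionaries and the `evaluate_nacl` function are hypothetical and only model the "lowest rule number that matches wins" idea; they are not an AWS API.

```python
# Hypothetical rule set mirroring the example: rule 100 allows port 80, rule 200 denies it.
rules = [
    {"number": 200, "port": 80, "action": "deny"},
    {"number": 100, "port": 80, "action": "allow"},
]


def evaluate_nacl(rules, port):
    """Check rules in ascending rule-number order; the first match decides."""
    for rule in sorted(rules, key=lambda r: r["number"]):
        if rule["port"] == port:
            return rule["action"]
    return "deny"  # models the final catch-all rule, which denies unmatched traffic


print(evaluate_nacl(rules, 80))
print(evaluate_nacl(rules, 22))
```

Because rule 100 is checked before rule 200, port 80 is allowed even though a deny rule for it exists further down the list.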
At least 2 public subnets are required for creating load balancers.
VPC flow logs are the way of capturing information about the IP traffic using CloudWatch. They can be created at the VPC, subnet or network interface level.
Flow logs cannot be enabled for a VPC that is peered with your VPC if the peer VPC is in another AWS account. A flow log configuration can’t be modified after creation (e.g. changing the IAM role).
A Bastion host allows administration of instances in a private network. A NAT gateway/instance allows internet access for the private instance, but administration is not possible through it.
AWS Direct Connect : Direct Connect (DX) locations are available in many places, and we need a customer/partner cage there with routers. These connect to the AWS cage routers, which sit on the AWS backbone network. The customer/partner router connects to the customer premises (office/DC) and the AWS routers connect to our AWS services (instances/S3/VPC etc.).

VPC endpoints allow connecting the VPC to AWS services (without leaving the AWS network). There are 2 types: VPC gateway endpoints (supported for S3 and DynamoDB) and VPC interface endpoints.
High Availability
Application load balancers : application aware, operate at layer 7 of the OSI model. For HTTP and HTTPS requests.
Network load balancers : TCP traffic balancing, for extreme performance.
Classic load balancers : can do both. The legacy option. May not be application aware; because of that, it may simply return error 504 (gateway timeout) without knowing whether it is a database issue or a web server issue.
X-Forwarded-For : this header carries the client’s public IP when the load balancer forwards the request to the actual application.
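On the application side, reading the client IP from this header could look like the sketch below. It assumes the common convention that the header is a comma-separated list with the original client first; `client_ip_from_xff` is a hypothetical helper, and the addresses are illustrative only.

```python
def client_ip_from_xff(header_value):
    """Return the left-most address in an X-Forwarded-For header, or None if empty."""
    addresses = [part.strip() for part in header_value.split(",") if part.strip()]
    return addresses[0] if addresses else None


# Hypothetical header value: the client's address followed by one proxy hop.
print(client_ip_from_xff("203.0.113.7, 10.0.1.25"))
```

Keep in mind that a client can send this header itself, so an application should only trust it when the request really arrived through its own load balancer.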
That’s it..! We know many topics are still not covered, but we have made an effort to help your certification preparation.
Your feedback is very valuable and it helps us improve our content. Thank You..!

AWS Solutions Architect Associate Certification preparation – short notes-V

Our fifth post on the AWS Solutions Architect Associate certification preparation topic. Hope you have enjoyed the previous posts in this series where we discussed many important topics including EC2, S3, Databases etc…
[ Disclaimer : This is not a complete training material for the certification. This is just random (short) notes which we captured from course curricula, which will help the readers for their final revision/rewind before appearing for the exam. We do not offer any guarantee in passing the exam with this content ]
Now, let’s continue
Note : ELBs do not have a pre-defined IP address; you route to them using Route53.
1. Simple routing policy – can have multiple IP entries against one name, and Route53 returns them in random order for each request.

2. Weighted routing policy – we can set a weight for each record (individual host records have to be created) and requests are distributed in proportion to those weights. [We can create health checks for the instances, and the routing policy omits the records with failing health checks]
3. Latency based policy – Route53 picks the DNS record/instance with the lowest network latency for the user.
4. Failover routing policy – we can define active and passive records. A health check monitors the active record, and traffic fails over to the passive one when the check fails.
5. Geolocation routing policy – the DNS record/EC2 instance is chosen based on the location from which the user’s DNS query originates. Not the same as latency-based routing.
6. Geoproximity routing policy – the complicated one. Routes traffic based on the location of the users and of the resources. Bias is the keyword.
7. Multivalue routing policy – similar to the simple routing policy, but allows health checks for multiple instances.
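To get a feel for the weighted policy combined with health checks, here is a small Python simulation. It is only a sketch of the idea, not how Route53 is implemented; the record list, the weights and `pick_record` are all made up for this example.

```python
import random

# Hypothetical weighted records; IPs and weights are made up.
records = [
    {"ip": "192.0.2.10", "weight": 70, "healthy": True},
    {"ip": "192.0.2.20", "weight": 20, "healthy": True},
    {"ip": "192.0.2.30", "weight": 10, "healthy": False},
]


def pick_record(records, rng=random):
    """Pick one healthy record with probability proportional to its weight."""
    healthy = [r for r in records if r["healthy"]]
    if not healthy:
        return None
    weights = [r["weight"] for r in healthy]
    return rng.choices(healthy, weights=weights, k=1)[0]


rng = random.Random(42)  # fixed seed so the sketch is repeatable
counts = {}
for _ in range(1000):
    ip = pick_record(records, rng)["ip"]
    counts[ip] = counts.get(ip, 0) + 1
print(counts)
```

The unhealthy record never receives traffic, and the remaining requests split roughly in proportion to the weights of the healthy records.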

VPC (Virtual Private Cloud)
Virtual Private Cloud allows segregation of the network, letting you create your own logically isolated AWS environment. You get complete control of the network settings (including IP addresses, subnets, route tables, internet gateways etc.). You can separate hosts into private (without internet) and public (with internet) segments, which adds security. You can also create a VPN connection to the VPC and use AWS as an extension of your office/datacenter.
* Launch instances into a chosen subnet
* Assigning custom IP address ranges in each subnet
* Configuring route tables between subnets
* Create internet gateway and attach to our VPC
* Better security control over AWS resources
* instance security groups
* Subnet Network ACLs.

The default VPC allows easy instance deployments. All subnets in the default VPC have a route to the internet. Each EC2 instance will have both private and public IPs.
VPC Peering : allows direct communication with hosts in another VPC. Peering can be done with VPCs in another AWS account and in another region as well. No transitive peering (direct peering between the VPCs is required)
1 AZ can have one or more subnets, but 1 subnet can’t span across AZs.
Only 1 Internet Gateway per VPC.
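The subnet and AZ relationship can be sketched with Python’s standard `ipaddress` module. The VPC range, the /24 size and the AZ names are assumptions chosen only for illustration; nothing here calls AWS.

```python
import ipaddress

# Hypothetical VPC range and AZ names, chosen only for illustration.
vpc = ipaddress.ip_network("10.0.0.0/16")
availability_zones = ["az-a", "az-b"]

# Carve /24 subnets out of the VPC range; each subnet belongs to exactly one AZ,
# while one AZ may hold several subnets.
subnet_blocks = list(vpc.subnets(new_prefix=24))[:4]
plan = [
    {"cidr": str(block), "az": availability_zones[i % len(availability_zones)]}
    for i, block in enumerate(subnet_blocks)
]

for entry in plan:
    print(entry["cidr"], entry["az"])
```

Each entry in `plan` has a single AZ, while each AZ appears more than once, which mirrors the rule that a subnet lives in one AZ but an AZ can hold many subnets.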
We are not done with VPC yet, we will add additional notes in the next post in this series. Hope these contents are helping you in your preparation.
Feel free to share your feedback/suggestions in the comments section.

AWS Solutions Architect Associate Certification preparation – short notes-IV

Into our fourth post in the AWS Solutions Architect Associate certification preparation series.

In our previous posts, we discussed the common topics including S3, EC2 etc… In this post, we will cover the databases section.

[ Disclaimer : This is not a complete training material for the certification. This is just random (short) notes which we captured from course curricula, which will help the readers for their final revision/rewind before appearing for the exam. We do not offer any guarantee in passing the exam with this content ]

Relational Databases

The 6 DB engines available in RDS are: SQL Server, Oracle, MySQL, PostgreSQL, Amazon Aurora and MariaDB

Multi-AZ is for disaster recovery and read replicas are for performance.
DynamoDB is Amazon’s NoSQL solution.
Redshift is Amazon’s data warehousing solution (for Online Analytical Processing, OLAP).
ElastiCache – improves performance with an in-memory cache in the cloud. Supports 2 open-source in-memory caching engines: Memcached and Redis

RDS runs on VMs, but we cannot access them. AWS takes care of managing the VMs. RDS is NOT serverless (except Aurora Serverless)

RDS backups : automated daily backups and snapshots. Retention period is 1-35 days.
Automated backups are enabled by default. Data is saved in S3, and you get free backup storage equal to the size of your DB.
During the backup window, storage I/O may be suspended and there may be performance issues.

DB snapshots are manual.
A restored DB (from a manual snapshot or an automated backup) will be a new RDS instance with a new endpoint (URL)

D@RE (data at rest) encryption is supported (with AWS KMS) for SQL Server, Oracle, MySQL, PostgreSQL, Amazon Aurora and MariaDB. Stored data, backups and snapshots are all encrypted.
Multi-AZ : for disaster recovery. AWS will automatically switch to the secondary copy in case of any maintenance or disaster. Supported for SQL Server, Oracle, MySQL, PostgreSQL and MariaDB. Amazon Aurora tolerates AZ failure by its architecture.
Read replicas : for performance improvement of read-intensive database instances. Reads can be redirected to any of the asynchronous copies of the actual instance. Writes still go to the primary DB. Supported by MySQL, PostgreSQL, Amazon Aurora and MariaDB
Can have up to 5 copies/replicas of the primary. Can have read replicas of read replicas (performance may reduce). Automatic backups must be turned on.
We can have read replicas that are themselves multi-AZ. We can also create read replicas of a multi-AZ source DB.

DynamoDB : AWS’s NoSQL DB. Uses SSD storage and is spread across 3 geographically separate data centers.
Eventually consistent reads (default) – consistency across all copies is usually reached within 1-2 seconds of a write.
Strongly consistent reads – needed if the data will be read by the application within a second of the write.
Redshift is used for business intelligence. It is the OLAP solution for data warehousing. Available in 1 AZ at present (can’t span across multiple AZs)
Backup is enabled by default with 1 day retention. Can be modified to a maximum of 35 days.
3 copies are always kept (1 x original + 1 x replica + 1 x backup in S3).
For disaster recovery, Redshift can automatically replicate the snapshots to an S3 bucket in a different region.
Redshift configuration:
Single node with 160 GB, or multi-node (which will have a leader node, which receives the queries and manages client connections, and up to 128 compute nodes, which process the queries and computations). Users are charged for the hours the compute nodes are operating, not for the leader node.
D@RE for Redshift uses AES-256 encryption. Redshift takes care of key management by default. We can also manage keys using a Hardware Security Module (HSM) or AWS KMS.
Uses an advanced level of compression, which identifies similar data and compresses it.

Amazon Aurora
MySQL-compatible relational database engine, with up to 5x better performance than MySQL.
Storage starts at 10 GB and grows in 10 GB increments up to 64 TB. Compute resources can scale up to 32 vCPUs and 244 GB of memory.
6 copies of the data (2 copies in each of 3 AZs). Can lose 2 copies of the data without affecting the ability to write. Can lose 3 copies without affecting the ability to read. Aurora read replicas are better and we can have up to 15 of them (5 for MySQL read replicas). Automated failover (to a read replica) is supported.
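The "lose 2 copies and still write, lose 3 and still read" rule can be pictured as a quorum check. The quorum sizes below (4 of 6 for writes, 3 of 6 for reads) are an assumption based on how Aurora storage is commonly described, and `aurora_availability` is a hypothetical helper for this sketch.

```python
TOTAL_COPIES = 6   # 2 copies in each of 3 AZs, as in the notes above
WRITE_QUORUM = 4   # assumption: the commonly described 4-of-6 write quorum
READ_QUORUM = 3    # assumption: the commonly described 3-of-6 read quorum


def aurora_availability(lost_copies):
    """Return (can_write, can_read) for a given number of lost copies."""
    remaining = TOTAL_COPIES - lost_copies
    return remaining >= WRITE_QUORUM, remaining >= READ_QUORUM


for lost in range(5):
    print(lost, aurora_availability(lost))
```

With 2 copies lost there are still 4 left, enough to write; with 3 lost, only reads are possible; beyond that, neither quorum can be reached.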

ElastiCache – improves performance with an in-memory cache in the cloud. Supports 2 open-source in-memory caching engines: Memcached (the simple solution) and Redis (supports Multi-AZ and backups)

Another short post is coming to an end. Hope it was helpful and you enjoyed reading it. Please share your feedback in the comments.

AWS Solutions Architect Associate Certification preparation – short notes-III

Third post in our AWS Solutions Architect Associate certification preparation series. Hope you have enjoyed the first post and the second in the series. We have a few more topics to cover in this series and some of them are in this post.

[ Disclaimer : This is not a complete training material for the certification. This is just random (short) notes which we captured from course curricula, which will help the readers for their final revision/rewind before appearing for the exam. We do not offer any guarantee in passing the exam with this content ]

Let’s continue…

EBS (Elastic Block Store)
Types :
General Purpose SSD (GP2) – general purpose, cost-effective storage. 100 – 16000 IOPS. For mixed workloads
Provisioned IOPS SSD (IO1) – for IO-intensive workloads.
Throughput Optimized HDD (ST1) – low-cost magnetic storage, with performance measured in terms of throughput.
Cold HDD (SC1) – for large, sequential cold-data workloads.
Magnetic – uses magnetic storage, for infrequently accessed data.

Migrating an EC2 instance from one region to another : create a snapshot of the root volume > create an AMI from the snapshot > copy the AMI to the other region > launch an instance from the copied AMI.
Snapshots are stored in S3.

AMI (Amazon Machine Images) can also be copied to another region for VM deployments.

When you delete/terminate an instance, the additional drives won’t get deleted by default.
For AMIs backed by EBS volumes, the root device is an EBS volume created from an EBS snapshot. For AMIs backed by instance store, the root device is created from a template stored in S3.
Instance store root volumes are not listed under EC2 > EBS > Volumes, as they are not EBS volumes. We can launch an instance backed by instance store, but only with a limited selection of hardware (instance types). We cannot stop an instance that is running on instance store; only the reboot and terminate options are available. If there is an issue in the underlying hardware, the data will be lost. That is why it is also called ephemeral (short-lived) storage.
The root volume (of the instance) can be encrypted as follows : create a snapshot of the root volume > copy the snapshot with encryption enabled > create an AMI from the encrypted copy > launch an instance from it.
CloudWatch and CloudTrail :
CloudWatch (think of a gym trainer to remember it) is for performance monitoring – Compute (EC2, Route53, Elastic Load Balancers..), Storage (EBS volumes, Storage Gateway) and CDN (CloudFront)
CloudTrail (think of CCTV to remember it) is for checking who is calling what (a kind of access logging, in my understanding)
CloudWatch monitors at 5-minute intervals by default; this can be reduced to 1 minute.
There are 2 ways of giving access to the AWS CLI. The first is to give a user the permissions required for the CLI and use that user’s credentials in the CLI. The second is to create an IAM role for CLI access and attach it to the EC2 instance.

Sample commands :
aws s3 ls   (to list the S3 buckets)
aws s3 mb s3://bforumnewbucket  (to create a bucket with the given name. mb=make bucket)
credentials are saved in plain text in ~/.aws directory.
curl – Captures any meta data about the instance
curl – captures bootstrap data
EFS (Elastic File System)
Supports NFSv4. Pay as you use. Petabyte scale. Thousands of concurrent NFS connections. Read-after-write consistency.
Clustered placement group :- for high performance computing, requiring high throughput or low latency. Within a single AZ.
Spread placement group :- for applications with a small number of critical instances that should be kept separate. Can span across AZs.
Placement group names must be unique.
That is another short post, with many more topics to come. Hope you are enjoying this series. Your feedback will help in improving our content, so please feel free to add it in the comments section.

AWS Solutions Architect Associate Certification preparation – short notes-II

Hope you have gone through the first post in our series on AWS Solutions Architect Associate certification preparation. This is a continuation of a few topics we covered in the first post. In this series of posts, we will be covering the topics required for you to prepare for the AWS Solutions Architect Associate examination, and in that way we are trying to help you in your certification journey.

[ Disclaimer : This is not a complete training material for the certification. This is just random (short) notes which we captured from course curricula, which will help the readers for their final revision/rewind before appearing for the exam. We do not offer any guarantee in passing the exam with this content ]

Let’s continue with our contents.

S3 (continued)

Data at Rest Encryption (D@RE) can be achieved via
SSE-S3 – server-side encryption with S3 managing the keys
SSE-KMS – server-side encryption with keys managed in AWS KMS (Key Management Service)
SSE-C – server-side encryption with customer-provided keys
Client-side encryption – encrypt the data before putting it in S3. Customer’s responsibility.
Cross Region Replication : existing files (from before CRR was enabled) are not replicated; deletions and delete markers are not replicated

Edge location – the location where content is cached
Origin – the origin of the files; can be S3, EC2, an Elastic Load Balancer or Route53
Distribution – the name given to the CDN (a collection of edge locations)
Web distribution – for websites
RTMP – for media streaming (Real Time Messaging Protocol, Adobe’s media sharing protocol)

You can invalidate cached content in the CDN, but you will be charged for it.

Snowball – 50 TB and 80 TB variants. About 1/5 the cost of network transfer. 256-bit encryption. Can import to and export from S3.
Snowball Edge – 100 TB, with compute and storage.
Snowmobile – Exabyte-scale, up to 100 PB per Snowmobile. For effective DC migration.
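
To get a feel for why shipping a device can beat the network, here is a small back-of-the-envelope sketch. The function name and the 100 Mbps link speed are assumptions chosen for illustration; they are not figures from the course notes.

```python
def network_transfer_days(data_tb, link_mbps, utilisation=1.0):
    """Estimate the days needed to push `data_tb` terabytes over a network link.

    Hypothetical helper: uses decimal units (1 TB = 10**12 bytes) and assumes
    the link is dedicated to the transfer at the given utilisation.
    """
    bits = data_tb * 1e12 * 8                      # total bits to send
    seconds = bits / (link_mbps * 1e6 * utilisation)
    return seconds / 86400                         # seconds in a day


# One 80 TB Snowball's worth of data over a fully used 100 Mbps link
days = network_transfer_days(80, 100)
```

With these assumed numbers the transfer takes a little over 74 days, which is the kind of case the Snow family is meant for.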
On Demand plan – pay as you use (by the hour/minute), with no upfront payment. Good for short-term testing, development, application testing and small workloads.
Reserved plan – for a 1 or 3 year contract, and is cheaper
Standard – up to 75% discount on pricing, but the instance type can’t be changed
Convertible – flexibility to change instance types
Scheduled – for scheduled scalability.
Spot – You bid a price, as in a share market; if the current spot rate matches your bid, you may get the instance.
If a spot instance is terminated by AWS, the partial hour will not be charged. It will be charged if the termination was initiated by the user.
Dedicated – Physical server. For compliance or licensing use cases.
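As per the course material, the boot drive can’t be encrypted by EC2 when the instance is created; only additional drives can be encrypted, and third-party tools in the OS (like BitLocker) are used for boot drive encryption. Current EC2 does support encrypted EBS root volumes.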
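
The Spot billing rule noted above can be sketched as a small function. This is only an illustration under the assumption of hourly billing as described in these notes; `billed_hours` is a made-up name, not an AWS API.

```python
import math


def billed_hours(minutes_run, terminated_by):
    """Hours charged for a Spot instance under the hourly rule in these notes.

    terminated_by is "aws" or "user" (hypothetical labels for this sketch).
    """
    if terminated_by == "aws":
        # AWS reclaimed the instance: the partial hour is free.
        return minutes_run // 60
    if terminated_by == "user":
        # The user stopped it: the partial hour is charged in full.
        return math.ceil(minutes_run / 60)
    raise ValueError("terminated_by must be 'aws' or 'user'")
```

For an instance that ran 150 minutes, the function gives 2 billed hours if AWS terminated it and 3 if the user did.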

Security groups :
Security groups are stateful: when an inbound rule allows a request, the response traffic is allowed out automatically. NACLs (Network ACLs) are stateless.
Can allow traffic for an IP or port, but can’t deny it. There’s no deny option for security groups. Denying is possible with a NACL.
All inbound traffic is blocked by default in an SG. You have to go and allow what you want.
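
The stateful versus stateless difference can be pictured with a toy model. The classes below are purely illustrative and not part of any AWS SDK; they ignore IP ranges and protocols, and they assume that NACL rules are checked in rule-number order with an implicit deny at the end.

```python
from dataclasses import dataclass, field


@dataclass
class SecurityGroup:
    """Toy stateful firewall: allow rules only, replies follow automatically."""
    inbound_allowed_ports: set = field(default_factory=set)

    def allows_inbound(self, port):
        # Anything not explicitly allowed is blocked; there is no deny rule.
        return port in self.inbound_allowed_ports

    def allows_reply(self, request_port):
        # Stateful: a reply to an accepted request is always let out.
        return self.allows_inbound(request_port)


@dataclass
class NetworkAcl:
    """Toy stateless firewall: numbered allow/deny rules in each direction."""
    inbound: list = field(default_factory=list)   # (rule_number, port, action)
    outbound: list = field(default_factory=list)

    @staticmethod
    def _evaluate(rules, port):
        # The lowest rule number that matches the port wins.
        for _, rule_port, action in sorted(rules):
            if rule_port == port:
                return action == "allow"
        return False  # implicit deny when nothing matches

    def allows_inbound(self, port):
        return self._evaluate(self.inbound, port)

    def allows_outbound(self, port):
        # Stateless: replies need their own outbound rule.
        return self._evaluate(self.outbound, port)


sg = SecurityGroup(inbound_allowed_ports={443})
nacl = NetworkAcl(inbound=[(90, 22, "deny"), (100, 22, "allow"), (110, 443, "allow")])
```

Here `sg` lets HTTPS in and its replies out with a single rule, while `nacl` accepts HTTPS inbound but drops the reply until an outbound rule is added, and its lower-numbered deny rule blocks port 22.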
Let’s continue the EC2 discussion and more topics in our next post. Hope you are enjoying the series. You are always welcome to post your comments/suggestions/feedback in the comments section below.

AWS Solutions Architect Associate Certification preparation – short notes-I

Cloud computing certifications are in very high demand in the market, and many of you are preparing or planning for them. We recently had a series on the Azure fundamentals (AZ900) certification preparation.

Now it is time for an AWS certification series.

Here we are starting a series on the AWS Solutions Architect Associate certification preparation. We recommend that you attend a complete course on this topic or refer to the official documentation for your preparation. These posts are just for your revision, or to help you with some short notes on the course content.

[ Disclaimer : This is not a complete training material for the certification. This is just random (short) notes which we captured from course curricula, which will help the readers for their final revision/rewind before appearing for the exam. We do not offer any guarantee in passing the exam with this content ]

Let’s get into the contents :

AWS Region, Availability Zones and Edge locations
Region : a geographical area containing 2 or more Availability Zones. Examples: the Sydney, Singapore and Northern Virginia regions.
AZ : An Availability Zone can be considered as a datacenter, or it can be more than one DC. In case of a local disaster like a flood or earthquake, data in the AZ may become unavailable or be lost. AWS keeps multiple copies of the data in different AZs to ensure data availability.
Edge locations : the local endpoints through which customers access the data. If a customer is far from the AZ where the data is stored, there could be latency in accessing the data. To avoid this delay, data is cached at the edge locations. This is achieved by CloudFront, AWS’s Content Delivery Network.
IAM (Identity and Access Management)
Allows/controls access to AWS via user management. Provides shared access to resources and centralised access control.
Makes Identity Federation (allowing login via different accounts including Facebook, Google etc…) possible
Users : The users who access the AWS console
Groups : A set of users, as in usual access management like AD (for example, groups for the Finance and HR departments in an organization)
Policies : Define which account can do what task. These are saved in JSON (JavaScript Object Notation) format.
Roles : An identity which has a set of permission rules and can be assigned to different individuals/resources.
IAM is universal; any identity created in AWS is global (not specific to any region).
A root user is the user with which an AWS account is created. It has complete admin access. New users can be created and assigned permissions (a new user will not have any permissions when created).
An access key ID and secret access key are provided when a new user is created. These can be used for accessing AWS resources via the CLI or APIs. They cannot be used for AWS console access.
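
Since policies are stored as JSON, a minimal sketch of one may help. The helper name `s3_read_only_policy` and the bucket name `example-notes-bucket` are made up for this example; the document layout (Version, Statement, Effect, Action, Resource) follows the usual IAM policy structure.

```python
import json


def s3_read_only_policy(bucket_name):
    """Build an IAM policy document that allows read-only access to one bucket.

    Hypothetical helper for illustration; it only builds a dictionary.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",      # the bucket itself (for listing)
                    f"arn:aws:s3:::{bucket_name}/*",    # the objects inside it
                ],
            }
        ],
    }


# Serialise to the JSON text that would be attached to a user, group or role
policy_json = json.dumps(s3_read_only_policy("example-notes-bucket"), indent=2)
```

Anything not listed under an Allow statement stays denied, which matches the note that a new user starts with no permissions.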

S3 (Simple Storage Service)
S3 saves files in buckets. A bucket is a container or folder and must have a universally unique name.
Successful file upload – HTTP 200 code
Files are saved as Key (name), Value (actual file) and version
Sub-resources – Access control list and torrent
11 nines (99.999999999%) durability. Built for 99.99% availability (the Amazon SLA guarantees 99.9%). Data is saved at different sites, and S3 is designed to survive the loss of 2 sites at a time.
S3-IA (Infrequently Accessed) – lower fee storage for infrequently accessed data
S3 One Zone-IA – cheaper version of S3-IA, with data kept at one site only (comparable to the older Reduced Redundancy Storage, RRS)
S3 Intelligent Tiering – Auto-tiering
Multi factor authentication can be enabled for Delete operations for protecting the data.
S3 Glacier and S3 Glacier Deep Archive – For archival. Deep Archive is the cheapest storage, but the retrieval time is 12 hours.
S3 is billed for the storage capacity, number of requests, storage tier, data transfer and cross region replication.
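
To put the 11 nines durability figure noted above into perspective, here is a small worked calculation. It assumes the figure is read as an average annual survival probability per object; the function names and the 10 million object count are chosen only for illustration.

```python
from decimal import Decimal

# 99.999999999% expressed as a fraction (Decimal avoids floating point noise)
ELEVEN_NINES = Decimal("0.99999999999")


def expected_annual_losses(object_count, durability=ELEVEN_NINES):
    """Average number of objects expected to be lost per year."""
    return object_count * (1 - durability)


def years_per_lost_object(object_count, durability=ELEVEN_NINES):
    """Average number of years until a single object is lost."""
    return 1 / expected_annual_losses(object_count, durability)


# Storing 10 million objects
years = years_per_lost_object(10_000_000)
```

Under that reading, 10 million stored objects would lose one object every 10,000 years on average.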

Bucket policies – Works at the bucket level
ACL – Works at the individual object level
Bucket access logging is possible, and the logs can be saved to a different bucket.
We will discuss further on S3 and many other topics in the next post in this series. Hope this section was helpful for you.
Please share your suggestions/feedback in the comments section.