Getting familiar with Boto3 – AWS SDK for Python

Let's discuss Boto3, the AWS Software Development Kit (SDK) for Python, in this post. Boto3 enables you to perform most tasks in almost all AWS services via Python scripts. For more details on the available services and actions, I would recommend referring to the Boto3 documentation.

In this post, we will give a brief introduction to the SDK, including installation and a few examples.


Installing Boto3

Yes, as you expected, it is the pip install command. pip install boto3 will take care of the installation, and if you want a specific version you can use the syntax below (where version is the required version, 1.20.30 for example).

pip install boto3==version

C:\Users\bforum>python -m pip install boto3
Collecting boto3
Downloading boto3-1.20.31-py3-none-any.whl (131 kB)
|████████████████████████████████| 131 kB 168 kB/s
Collecting botocore<1.24.0,>=1.23.31
Downloading botocore-1.23.31-py3-none-any.whl (8.5 MB)
|████████████████████████████████| 8.5 MB 726 kB/s


==== truncated output ====


Successfully installed boto3-1.20.31 botocore-1.23.31

Setting up Boto3 

Now that you have installed the module, you have to import it in your program. import boto3 would do it.

For accessing AWS services, you have to provide credentials for Boto3 to connect to the environment. You can use shared credentials (the access key ID and secret key) stored in the ~/.aws/credentials file. If you are using an EC2 instance, you can use IAM roles for accessing the AWS environment.
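For reference, the shared credentials file (~/.aws/credentials) is a simple INI-style file; a minimal example with placeholder values:

[default]
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY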

Not at all recommended, but here I am keeping my credentials inside the script itself.

Basic S3 tasks

s3 = boto3.client(service_name='s3', aws_access_key_id='test_key_id', aws_secret_access_key='test_secret_key')

You can replace the above line with just s3 = boto3.client('s3') if you have your credentials defined in the credentials file or if they are taken care of by an IAM role.

You can now invoke the command below to create an S3 bucket.

s3.create_bucket(Bucket=bucket_name)
# Where bucket_name is the desired bucket name. It should be unique and must meet the naming requirements (lowercase letters, numbers, periods and dashes only, a length of 3-63 characters etc...). In regions other than us-east-1, you also have to pass a CreateBucketConfiguration with a LocationConstraint.

You will get an HTTP 200 response with the bucket name and creation details.

Now let’s see how we can list S3 buckets.

s3 = boto3.client('s3')

The list_buckets command lists the existing buckets, but the raw output is an HTTP response in JSON format. To print only the names, let's use the commands below (basically a for loop).

out = s3.list_buckets()
for bucket in out['Buckets']:
    print(bucket['Name'])

The output will be just the bucket names.


You can download a file from S3 using the commands below.

bucket_name = 'beginnersforum-bf1'
keyname = 'Test Text file1.txt'
output_filename = 'test1.txt'
s3.download_file(bucket_name, keyname, output_filename)

A few EC2 examples

ec2 = boto3.client('ec2')  # creates an EC2 client, which can be used in the upcoming commands

Start an instance : ec2.start_instances(InstanceIds=[instance_id])

Reboot an instance : ec2.reboot_instances(InstanceIds=[instance_id])

Shut down an instance : ec2.stop_instances(InstanceIds=[instance_id])
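Putting those together, a minimal hedged sketch (the instance ID is a placeholder you would replace with your own) that starts an instance and waits until it is running:

import boto3

ec2 = boto3.client('ec2')
instance_id = 'i-0123456789abcdef0'  # placeholder instance ID

# Start the instance and block until it reaches the 'running' state
ec2.start_instances(InstanceIds=[instance_id])
ec2.get_waiter('instance_running').wait(InstanceIds=[instance_id])
print('Instance', instance_id, 'is running')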

That was just a very basic intro to this Python module. Based on your needs, you may refer to the specific service and action details in the SDK documentation. Hope this post helped you in understanding the module; please feel free to share your thoughts/feedback in the comments section.

AWS Solutions Architect Associate Certification preparation – short notes-VI

Our sixth and last post in our AWS Solutions Architect Associate certification preparation series. Hope you have gone through the previous posts and you are happy with the content we shared.
[ Disclaimer : This is not a complete training material for the certification. This is just random (short) notes which we captured from course curricula, which will help the readers for their final revision/rewind before appearing for the exam. We do not offer any guarantee in passing the exam with this content ]
VPC (Continued)
The steps for setting up a VPC are very important for your exam (listed below; a boto3 sketch follows the list).
* Create VPC – this also creates a route table, NACL and SG
* Create additional subnets – set "assign public IP" to Yes for the public subnet
* Create an internet gateway and attach it to the VPC
* Create an additional route table – for allowing public access (we should never allow public access on the main route table)
* Associate the subnets with the route tables – the public subnet with the new route table
* Launch instances into the new VPC, in the different subnets
* Create a security group for the private instance to allow access to it from the public instance
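Those steps map fairly directly to boto3 calls. A hedged sketch of the flow (CIDR blocks are placeholders, error handling omitted):

import boto3

ec2 = boto3.client('ec2')

# 1. Create the VPC (a main route table, NACL and security group come with it)
vpc_id = ec2.create_vpc(CidrBlock='10.0.0.0/16')['Vpc']['VpcId']

# 2. Create a subnet and make it assign public IPs on launch
subnet_id = ec2.create_subnet(VpcId=vpc_id, CidrBlock='10.0.1.0/24')['Subnet']['SubnetId']
ec2.modify_subnet_attribute(SubnetId=subnet_id, MapPublicIpOnLaunch={'Value': True})

# 3. Create an internet gateway and attach it to the VPC
igw_id = ec2.create_internet_gateway()['InternetGateway']['InternetGatewayId']
ec2.attach_internet_gateway(InternetGatewayId=igw_id, VpcId=vpc_id)

# 4. Create an additional route table with a public route (leave the main one private)
rtb_id = ec2.create_route_table(VpcId=vpc_id)['RouteTable']['RouteTableId']
ec2.create_route(RouteTableId=rtb_id, DestinationCidrBlock='0.0.0.0/0', GatewayId=igw_id)

# 5. Associate the public subnet with the new route table
ec2.associate_route_table(RouteTableId=rtb_id, SubnetId=subnet_id)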


NAT instances and NAT gateways are ways of performing NAT so that systems in the private subnet can communicate with the internet. NAT instances are single EC2 instances without any redundancy. NAT gateways are highly available within an AZ, but cannot span AZs; it is better to have a separate gateway in each AZ.
NAT instances forward communications without being the source or destination. For this to work, we have to disable source/destination checks for the instance.
NACL (Network Access Control Lists) : a default NACL is created when the VPC is created, with ALL Allow rules. Any subnet being created is added by default to the default NACL. We can create additional NACLs and associate subnets with them. One subnet can be part of only a single NACL. Rules are evaluated in ascending order of rule number, and the first matching rule applies; so if a Deny rule has a lower number than an Allow rule, the Deny takes precedence (e.g., rule 100 allows all access via port 80 and rule 200 denies all access via port 80: the Allow will be in effect).
NACLs are checked first, before the corresponding rule in a security group.
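To make the rule ordering concrete, here is a hedged boto3 sketch (the NACL ID is a placeholder) creating the two rules from the example; rule 100 is evaluated first, so the allow wins:

import boto3

ec2 = boto3.client('ec2')
nacl_id = 'acl-0123456789abcdef0'  # placeholder NACL ID

# Rule 100: allow inbound HTTP (lowest number, evaluated first)
ec2.create_network_acl_entry(NetworkAclId=nacl_id, RuleNumber=100,
                             Protocol='6', RuleAction='allow', Egress=False,
                             CidrBlock='0.0.0.0/0', PortRange={'From': 80, 'To': 80})

# Rule 200: deny inbound HTTP (never reached for traffic matching rule 100)
ec2.create_network_acl_entry(NetworkAclId=nacl_id, RuleNumber=200,
                             Protocol='6', RuleAction='deny', Egress=False,
                             CidrBlock='0.0.0.0/0', PortRange={'From': 80, 'To': 80})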
At least 2 public subnets are required for creating load balancers.
Network flow logs are the way of capturing TCP flow information using CloudWatch. They can be enabled at the VPC, subnet or network interface level.
You cannot enable flow logs for a VPC peered with a VPC in another AWS account. Flow log configuration cannot be modified after creation (e.g., modifying the IAM role etc...).
A bastion host allows administration of instances in a private network. A NAT gateway/instance allows internet access for private instances, but administration through it is not possible.
AWS Direct Connect : Direct Connect (DX) locations are available in many places, and we have a customer/partner cage there with routers. These connect (over the AWS backbone network) to the routers in the AWS cage. The customer/partner routers connect to the customer premises (office/DC) and the AWS routers connect to our AWS services (instances/S3/VPC etc...).

VPC endpoints allow connecting the VPC to AWS services without going out of the AWS network. There are two types: VPC gateway endpoints (supported for S3 and DynamoDB) and VPC interface endpoints.
High Availability
Application load balancers : application aware, operate at layer 7 of the OSI model. For HTTP and HTTPS requests.
Network load balancers : TCP traffic balancing, for extreme performance.
Classic load balancers : can do both. The legacy option, and may not be application aware. As it is not app aware, it may return error 504 (gateway timeout) without knowing whether the problem is a database issue or a webserver issue.
X-Forwarded-For : this header carries the customer's public IP as the load balancer forwards the request to the actual application.
That's it..! We know many topics are still not covered, but we have made an effort to help your certification preparation.
Your feedback is very valuable and helps us improve our content. Thank You..!

AWS Solutions Architect Associate Certification preparation – short notes-V

Our fifth post on the AWS Solutions Architect Associate certification preparation topic. Hope you have enjoyed the previous posts in this series where we discussed many important topics including EC2, S3, Databases etc…
[ Disclaimer : This is not a complete training material for the certification. This is just random (short) notes which we captured from course curricula, which will help the readers for their final revision/rewind before appearing for the exam. We do not offer any guarantee in passing the exam with this content ]
Now, let’s continue
Note : ELBs do not have a pre-defined IP address; you route to them using Route 53.
1. Simple routing policy – can have multiple entries against one name; the policy picks among the IPs randomly for each request.


2. Weighted routing policy – we can set a weight for each record (individual host records have to be created) and traffic is distributed in proportion to those weights. [We can create health checks for the instances, and the routing policy omits records that fail their health checks.] A boto3 sketch of a weighted record set follows the list.
3. Latency-based policy – Route 53 picks the DNS record/instance with the lowest network latency.
4. Failover routing policy – we can define active and passive records. A health check monitors the active one.
5. Geolocation routing policy – the DNS record/EC2 instance is chosen based on the location of the user querying DNS. Not the same as latency-based routing.
6. Geoproximity routing policy – a complicated one. Routes based on the location of the users and of the resources. Bias is the keyword to remember.
7. Multivalue routing policy – similar to the simple routing policy, but allows health checks for multiple instances.
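To make the weighted policy concrete, here is a hedged boto3 sketch (the hosted zone ID, record name and IPs are placeholders) creating two weighted A records; roughly 70% of responses would point to the first IP:

import boto3

r53 = boto3.client('route53')
zone_id = 'Z0123456789EXAMPLE'  # placeholder hosted zone ID

for identifier, ip, weight in [('primary', '192.0.2.10', 70),
                               ('secondary', '192.0.2.20', 30)]:
    r53.change_resource_record_sets(
        HostedZoneId=zone_id,
        ChangeBatch={'Changes': [{
            'Action': 'UPSERT',
            'ResourceRecordSet': {
                'Name': 'app.example.com',
                'Type': 'A',
                'SetIdentifier': identifier,  # required for weighted records
                'Weight': weight,
                'TTL': 60,
                'ResourceRecords': [{'Value': ip}],
            },
        }]},
    )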

VPC (Virtual Private Cloud)
Virtual Private Cloud allows segregation of the network, letting you create your own logically isolated AWS environment, with complete control of the network settings (including IP addresses, subnets, route tables, internet gateways etc...). You can separate hosts into private (without internet) and public (with internet) segments, adding security, and can create a VPN connection with the VPC to use AWS as an office/datacenter extension.
* Launch instances into a chosen subnet
* Assign custom IP address ranges in each subnet
* Configure route tables between subnets
* Create an internet gateway and attach it to our VPC
* Better security control over AWS resources
* Instance security groups
* Subnet network ACLs

The default VPC allows easy instance deployments. All subnets in the default VPC have a route to the internet. Each EC2 instance gets both private and public IPs.
VPC peering : allows direct communication with hosts in another VPC. Peering can be done with VPCs in another AWS account, and in another region as well. No transitive peering (direct peering between the VPCs is required).
One AZ can have one or more subnets, but one subnet can't span AZs.
Only one internet gateway per VPC.
We are not done with VPC yet; we will add additional notes in the next post in this series. Hope this content is helping you in your preparation.
Feel free to share your feedback/suggestions in the comments section.

AWS Solutions Architect Associate Certification preparation – short notes-IV

Into our fourth post in the AWS Solutions Architect Associate certification preparation series.

In our previous posts, we discussed common topics including S3, EC2 etc... In this post, we will cover the databases section.

[ Disclaimer : This is not a complete training material for the certification. This is just random (short) notes which we captured from course curricula, which will help the readers for their final revision/rewind before appearing for the exam. We do not offer any guarantee in passing the exam with this content ]

Relational Databases

The 6 relational DB engines available in AWS are – SQL Server, Oracle, MySQL, PostgreSQL, Amazon Aurora and MariaDB.

Multi-AZ is for disaster recovery; read replicas are for performance.
DynamoDB is Amazon's NoSQL solution.
Redshift is Amazon's data warehousing solution (for Online Analytic Processing - OLAP).
ElastiCache – improves performance with an in-memory cache in the cloud. Supports 2 open-source in-memory caching engines – Memcached and Redis.


RDS runs on VMs, but we cannot access those; AWS takes care of managing the VMs. RDS is NOT serverless (the exception being Aurora Serverless).

RDS backups : automated daily backups and snapshots. Retention period 1-35 days.
Automated backups are enabled by default. The data is saved in S3, and you get backup space for free.
During the backup window, I/O is suspended and there may be performance degradation.

DB snapshots are taken manually.
A restored DB (from a manual snapshot or an automated backup) will be a new RDS instance with a new endpoint (URL).

D@RE encryption is supported (with AWS KMS) for SQL Server, Oracle, MySQL, PostgreSQL, Amazon Aurora and MariaDB. Stored data, backups and snapshots are all encrypted.
Multi-AZ : for disaster recovery. AWS automatically switches to the secondary copy in case of maintenance or disaster. Supported for SQL Server, Oracle, MySQL, PostgreSQL and MariaDB. Amazon Aurora, by its architecture, supports multi-AZ failover.
Read replicas : are for performance improvement of read-intensive database instances. Reads can be redirected to any of the asynchronous copies of the actual instance; writes still go to the primary DB. Supported by MySQL, PostgreSQL, Amazon Aurora and MariaDB.
Can have up to 5 copies/replicas of the primary. Can have read replicas of read replicas (performance may suffer). Automatic backups must be turned on.
Read replicas can themselves be multi-AZ, and we can create read replicas of a multi-AZ source DB.
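As a quick illustration, a hedged boto3 sketch (the instance identifiers are placeholders) creating a read replica from an existing source instance:

import boto3

rds = boto3.client('rds')

# Create an asynchronous read replica of the (placeholder) source DB instance
rds.create_db_instance_read_replica(
    DBInstanceIdentifier='mydb-replica-1',
    SourceDBInstanceIdentifier='mydb',
)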

DynamoDB : AWS's NoSQL DB. Uses SSDs and is spread across 3 geographically separate facilities.
Eventually consistent reads (default) – data consistency is ensured within 1-2 seconds of a write.
Strongly consistent reads – needed if data must be read by the application within a second of the write.
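In code the difference is a single flag; a hedged boto3 sketch (table name and key are placeholders) requesting a strongly consistent read:

import boto3

dynamodb = boto3.client('dynamodb')

# ConsistentRead=True forces a strongly consistent read;
# omit it (or set False) for the default eventually consistent read
resp = dynamodb.get_item(
    TableName='my-table',
    Key={'pk': {'S': 'item-1'}},
    ConsistentRead=True,
)
print(resp.get('Item'))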
Redshift is used for business intelligence; an OLAP solution for data warehousing. Available in 1 AZ at present (can't span multiple AZs).
Backup is enabled by default with 1-day retention, which can be increased to a maximum of 35 days.
Redshift always keeps 3 copies of the data (the original, a replica, and a backup in S3).
For disaster recovery, Redshift can automatically replicate the snapshots to an S3 bucket in a different region.
Redshift configuration :
Single node (with 160 GB) or multi-node, which has a leader node – which receives the queries and manages client connections – and up to 128 compute nodes – which process the queries and computations. Users are charged for the hours the compute nodes are operating, not for the leader node.
D@RE for Redshift uses AES-256 encryption. By default, Redshift takes care of key management; we can instead manage keys using a Hardware Security Module (HSM) or AWS KMS.
Uses an advanced level of compression, which identifies similar data and compresses it.

Amazon Aurora
MySQL-compatible relational database engine, with 5x better performance than MySQL.
Starts with 10 GB and increments by 10 GB up to 64 TB. Compute resources can scale up to 32 vCPUs and 244 GB of memory.
6 copies of data (2 copies in each of 3 AZs). Can lose 2 copies of data without affecting write availability, and 3 copies without affecting read availability. Aurora read replicas are better and can be up to 15 copies (vs 5 for MySQL read replicas). Automated failover (to a read replica) is supported.

ElastiCache – improves performance with an in-memory cache in the cloud. Supports 2 open-source in-memory caching engines – Memcached (the simple solution) and Redis (supports Multi-AZ and backups).

Another short post is coming to an end. Hope it was helpful and you enjoyed reading it. Please share your feedback as comments.

AWS Solutions Architect Associate Certification preparation – short notes-III

Third post in our AWS Solutions Architect Associate certification preparation series. Hope you have enjoyed the first post and the second in the series. We have a few more topics to cover in this series and some of them are in this post.

[ Disclaimer : This is not a complete training material for the certification. This is just random (short) notes which we captured from course curricula, which will help the readers for their final revision/rewind before appearing for the exam. We do not offer any guarantee in passing the exam with this content ]

Let’s continue…

EBS (Elastic Block Storage)
Types :
General Purpose SSD (GP2) – general-purpose, cost-effective storage. 100 – 16,000 IOPS. Mixed workloads.
Provisioned IOPS SSD (IO1) – for I/O-intensive workloads.
Throughput Optimized HDD (ST1) – low-cost magnetic storage, with performance measured in terms of throughput.
Cold HDD (SC1) – for large, sequential cold-data workloads.
Magnetic – uses magnetic storage, for infrequently accessed data.


Migrating an EC2 instance from one region to another : create a snapshot of the root volume > create an AMI from the snapshot > create an instance from the AMI in the other region (the AMI has to be copied there first).
Snapshots are stored in S3.

AMIs (Amazon Machine Images) can also be copied to another region for VM deployments.
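As a rough illustration of that copy route, a hedged boto3 sketch (the regions, AMI ID and instance type are placeholders):

import boto3

src_region = 'us-east-1'  # placeholder source region
ec2_dst = boto3.client('ec2', region_name='ap-southeast-2')  # destination region

# Copy the (placeholder) AMI into the destination region, wait, then launch
copied = ec2_dst.copy_image(SourceImageId='ami-0123456789abcdef0',
                            SourceRegion=src_region, Name='migrated-ami')
ec2_dst.get_waiter('image_available').wait(ImageIds=[copied['ImageId']])
ec2_dst.run_instances(ImageId=copied['ImageId'], InstanceType='t3.micro',
                      MinCount=1, MaxCount=1)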

When you delete/terminate an instance, the additional drives won't get deleted by default.
For AMIs backed by EBS volumes, the OS root device is created on an EBS snapshot of an EBS volume. For AMIs backed by the instance store, the root device is created from a template stored in AWS S3.
Instance store root volumes are not listed under EC2 > EBS > Volumes, as they are not EBS volumes. We can create an instance from an instance store, but only with a limited hardware (instance type) selection. We cannot stop an instance running on instance store; only reboot or terminate options are available. If there's an issue with the underlying hardware, the data is lost. It is also called ephemeral (short-lived) storage.
The root volume (of the instance) can be encrypted as follows : create a snapshot of the root volume > copy it while encrypting it > create an AMI from the encrypted copy > launch an instance from it.
CloudWatch and CloudTrail :
CloudWatch (think of a gym trainer, to remember) is for performance monitoring – compute (EC2, Route 53, Elastic Load Balancers...), storage (EBS volumes, Storage Gateway) and CDN (CloudFront).
CloudTrail (think of CCTV, to remember) is for recording who is calling what (a kind of API access logging, in my understanding).
CloudWatch monitors at a 5-minute interval by default, which can be reduced to 1 minute.
There are 2 ways of setting up access for the AWS CLI: one is giving the user the permissions required for the CLI and using their credentials in the CLI; the second is creating an IAM role for CLI access and attaching it to the EC2 instance.

Sample commands :
aws s3 ls   (to list the S3 buckets)
aws s3 mb s3://bforumnewbucket  (to create a bucket with the given name. mb=make bucket)
Credentials are saved in plain text in the ~/.aws directory.
curl http://169.254.169.254/latest/meta-data/ – fetches metadata about the instance
curl http://169.254.169.254/latest/user-data/ – fetches the bootstrap (user) data
EFS (Elastic File System)
Supports NFSv4. Pay as you use. Petabyte scale. Thousands of concurrent NFS connections. Read-after-write consistency.
Clustered placement group :- for high-performance computing requiring high throughput or low latency. Within a single AZ.
Spread placement group :- for applications with a small number of critical instances that should be kept separate. Can span AZs.
Placement group names must be unique.
That is another short post; many more topics to come. Hope you are enjoying this series. Your feedback will help in improving our content, so please feel free to add it in the comments section.

AWS Solutions Architect Associate Certification preparation – short notes-II

Hope you have gone through the first post in our series on AWS Solutions Architect Associate certification preparation. This is a continuation of a few topics we covered in the first post. In this series of posts, we will be covering the topics required to prepare for the AWS Solutions Architect Associate examination, and thus we are trying to help you in your certification journey.

[ Disclaimer : This is not a complete training material for the certification. This is just random (short) notes which we captured from course curricula, which will help the readers for their final revision/rewind before appearing for the exam. We do not offer any guarantee in passing the exam with this content ]

Let’s continue with our contents.

S3 (continued)

Data @ Rest Encryption (D@RE) can be achieved via the options below (a boto3 upload sketch follows the list):
SSE-S3 – server-side encryption with S3 managing the keys
SSE-KMS – the customer manages the keys in KMS (Key Management Service), available via AWS
SSE-C – using keys provided by the customer
Client-side encryption – encrypt the data before putting it in S3; the customer's responsibility.
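As an upload-side illustration, a hedged boto3 sketch (the bucket, key names and KMS key alias are placeholders) requesting SSE-S3 and SSE-KMS:

import boto3

s3 = boto3.client('s3')

# SSE-S3: S3 manages the encryption keys
s3.put_object(Bucket='my-bucket', Key='doc1.txt', Body=b'hello',
              ServerSideEncryption='AES256')

# SSE-KMS: encrypt with a customer-chosen KMS key
s3.put_object(Bucket='my-bucket', Key='doc2.txt', Body=b'hello',
              ServerSideEncryption='aws:kms',
              SSEKMSKeyId='alias/my-key')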
Cross-Region Replication : files existing before enabling CRR are not replicated; deletions and delete markers are not replicated.


CloudFront (CDN) :
Edge location – the endpoint where content is cached, close to the users
Origin – the origin of the files; can be S3, EC2, an Elastic Load Balancer or Route 53
Distribution – the name given to the CDN (a collection of edge locations)
Web distribution – for websites
RTMP – for media streaming (Real-Time Messaging Protocol, Adobe's media streaming protocol)
You can invalidate cached content from the CDN, but you will be charged for it.

Snowball – 50 and 80 TB variants. About 1/5 the cost of network transfer. 256-bit encryption. Can import/export to/from S3.
Snowball Edge – 100 TB, with compute and storage.
Snowmobile – exabyte-scale, up to 100 PB. For effective DC migration.
On-Demand plan – allows payment as you use (per hour/minute). Good for testing and dev, short-term or small workloads, and application testing; no upfront payment.
Reserved plan – for a 1- or 3-year contract, and is cheaper.
Standard – up to 75% discount on pricing; the instance type can't be changed.
Convertible – flexibility in instance types.
Scheduled – for scheduled scalability.
Spot – as in the stock market: you bid a price, and if the spot rate matches, you may get the instance.
If a spot instance is terminated by AWS, the partial hour is not charged. It is charged if the termination was initiated by the user.
Dedicated – a physical server. For compliance or licensing use cases.
The boot drive can't be encrypted by EC2 directly; only additional drives can be encrypted. We have to use third-party tools in the OS (like BitLocker) for boot drive encryption.
Security groups :
When we create an inbound rule, the corresponding outbound (return) traffic is allowed automatically. Security groups are stateful; NACLs (Network ACLs) are stateless.
Can allow traffic for an IP or port, but can't deny; there is no deny option for security groups. Denying is possible with NACLs.
Everything is blocked by default in an SG; you have to explicitly allow what you want (a boto3 sketch follows).
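For example, a hedged boto3 sketch (the security group ID is a placeholder) allowing inbound HTTP; because SGs are stateful, the matching return traffic is allowed automatically:

import boto3

ec2 = boto3.client('ec2')

# Allow inbound TCP port 80 from anywhere on a (placeholder) security group
ec2.authorize_security_group_ingress(
    GroupId='sg-0123456789abcdef0',
    IpPermissions=[{'IpProtocol': 'tcp', 'FromPort': 80, 'ToPort': 80,
                    'IpRanges': [{'CidrIp': '0.0.0.0/0'}]}],
)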
Let’s continue the EC2 discussions and more topics in our next post. Hope you are enjoying reading the series. You are always welcome to post your comments/suggestions/feedback in the comments section below.

AWS Solutions Architect Associate Certification preparation – short notes-I

Cloud computing certifications are in very high demand in the market, and many of you are preparing or planning for cloud computing certifications. We recently had a series on the Azure fundamentals (AZ-900) certification preparation.

Now it is time for an AWS certification series.

Here we are starting a series on the AWS Solutions Architect Associate certification preparation. We recommend attending a complete course on this topic or referring to the official documentation for your preparation. These posts are just for your revision, or to help you with some short notes on the course content.

[ Disclaimer : This is not a complete training material for the certification. This is just random (short) notes which we captured from course curricula, which will help the readers for their final revision/rewind before appearing for the exam. We do not offer any guarantee in passing the exam with this content ]

Let’s get into the contents :

AWS Region, Availability Zones and Edge locations
Region : a geographical area containing 2 or more Availability Zones. Examples: the Sydney, Singapore and Northern Virginia regions.
AZ : an Availability Zone can be considered a datacenter (or more than one DC). In case of a local disaster like a flood or an earthquake, we could face data unavailability or data loss for data in that AZ, but AWS makes sure the data has multiple copies in different AZs to ensure availability.
Edge locations : the local endpoints customers use for accessing data. If a customer is far away from the AZ where the data is stored, there could be latency when accessing it. To avoid this delay, data is cached at the edge locations. This is achieved by CloudFront, AWS's Content Delivery Network.
IAM (Identity Access Management)
Allows/controls access to AWS via user management: shared access to resources and centralised access control.
Makes identity federation (allowing login via different accounts, including Facebook, Google etc...) possible.
Users : the end users who access the AWS console
Groups : a set of users, grouped by access needs as in AD (groups for the Finance and HR departments in an organization, for example)
Policies : the defined access policies, specifying which account can do what task. These are saved in JSON (JavaScript Object Notation) format.
Roles : an identity with a set of permissions, which can be assigned to different individuals/resources.
IAM is universal; any identity created in AWS is global (not specific to any region).
The root user is the user with which the AWS account is created, and it has complete admin access. New users can be created and assigned permissions (a new user has no permissions when created).
An access key ID and secret access key are provided when a new user is created. These can be used for accessing AWS resources via the CLI or APIs; they cannot be used for AWS console access.

S3 (Simple Storage Service)
S3 saves files in buckets. A bucket is a container (like a folder) and must have a globally unique name.
A successful file upload returns an HTTP 200 code.
Files are saved as a key (the name), a value (the actual file) and a version.
Sub-resources – access control lists and torrent.
11 x 9s (99.999999999%) durability guarantee, and 99.99% availability guaranteed by Amazon. Data is saved at different sites, and S3 is designed to survive the loss of 2 sites at a time.
S3-IA (Infrequently Accessed) – lower-fee storage for infrequently accessed data.
S3 One Zone-IA – a cheaper version of S3-IA, with data kept at one site only (similar in spirit to the older Reduced Redundancy Storage - RRS).
S3 Intelligent-Tiering – automatic tiering.
Multi-factor authentication can be enabled for delete operations to protect the data.
S3 Glacier and S3 Glacier Deep Archive – for archival. Deep Archive is the cheapest storage, but the retrieval time is 12 hours. S3 is billed by storage capacity, number of access requests, tiers, transfer and cross-region replication.

Bucket policies – work at the bucket level
ACLs – work at the individual object level
Bucket access logging is possible, and the logs can be saved to a different bucket as well (a boto3 sketch of attaching a bucket policy follows).
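As an illustration of a bucket-level policy, a hedged boto3 sketch (the bucket name is a placeholder) attaching a policy that allows public read of objects:

import json
import boto3

s3 = boto3.client('s3')

# A policy allowing anyone to read objects in the (placeholder) bucket
policy = {
    'Version': '2012-10-17',
    'Statement': [{
        'Effect': 'Allow',
        'Principal': '*',
        'Action': 's3:GetObject',
        'Resource': 'arn:aws:s3:::my-bucket/*',
    }],
}
s3.put_bucket_policy(Bucket='my-bucket', Policy=json.dumps(policy))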
We will discuss further on S3 and many other topics in the next post in this series. Hope this section was helpful for you.
Please share your suggestions/feedback in the comments section.