How to Create a DynamoDB Table

 



Amazon DynamoDB is a NoSQL database commonly used in serverless applications. It supports both key-value and document data models. If you need to create a DynamoDB table for your application service, here are the basic steps.


1.  The bare-minimum requirements to create a DynamoDB table are the table name and a partition key; a sort key is optional, but we use one here to form a composite primary key. For example, the table below:

Tablename: dbaworks-courses

PartitionKey: teacher_id

SortKey: course_id

The name of the table is dbaworks-courses. Table names are case-sensitive, must be between 3 and 255 characters long, and may contain only the characters a-z or A-Z, the numbers 0-9, or the symbols dash (-), underscore (_), and dot (.).
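These naming rules can be checked up front, before calling the CLI. Below is a minimal shell sketch; valid_table_name is a hypothetical helper, not part of the AWS CLI:

```shell
# Check a proposed table name against DynamoDB's naming rules:
# 3-255 characters drawn from a-z, A-Z, 0-9, dash, underscore, or dot.
valid_table_name() {
    case "$1" in
        *[!A-Za-z0-9._-]*) return 1 ;;   # contains a forbidden character
    esac
    [ "${#1}" -ge 3 ] && [ "${#1}" -le 255 ]
}

valid_table_name "dbaworks-courses" && echo "valid"
```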

The partition key and sort key are the two types of primary key attributes for DynamoDB tables. The partition key value is used in an internal hash function that determines the physical storage partition where the data item is stored. In the example, the partition key is teacher_id, so the data items in the table dbaworks-courses can be accessed directly using the teacher_id value.

Using a sort key together with the partition key makes up a composite primary key. In this case, more than one data item can have the same value for the partition key teacher_id, and those items are stored together in their respective partition. Within each partition, the data is further sorted by the sort key, in this case course_id.
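DynamoDB's hash function is internal and not exposed, but the idea behind partition placement can be illustrated with a rough sketch: checksum the partition key value and take it modulo a partition count. The checksum and partition count below are illustrative assumptions, not DynamoDB's actual behaviour:

```shell
# Illustrative only: bucket a partition key into one of N partitions
# the way a hash-based placement scheme would.
partition_for() {
    key=$1
    partitions=$2
    # cksum gives a deterministic CRC; DynamoDB's real function differs
    hash=$(printf '%s' "$key" | cksum | cut -d' ' -f1)
    echo $(( hash % partitions ))
}

partition_for "teacher-42" 4   # the same key always lands in the same bucket
```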


2.  You can optionally add tags to your new table for better categorisation and reporting.

Examples of tags and their values:

creator = dbaworks

environment = prod

department = computer-science

jiraticket = CS-2233 

requestor = cmary 

application = courseapps


3.  Create the DynamoDB table in the AWS Console or using the AWS CLI command below. Note that create-table also needs a billing mode: either --billing-mode PAY_PER_REQUEST for on-demand capacity, or --provisioned-throughput with read/write capacity units for provisioned capacity.

aws dynamodb create-table \
  --table-name dbaworks-courses \
  --attribute-definitions \
         AttributeName=teacher_id,AttributeType=S \
         AttributeName=course_id,AttributeType=S \
  --key-schema AttributeName=teacher_id,KeyType=HASH \
         AttributeName=course_id,KeyType=RANGE \
  --billing-mode PAY_PER_REQUEST


You need to supply the required parameters such as the table-name, the attribute-definitions, and the key-schema. Our table name is dbaworks-courses. The attribute-definitions parameter is divided further into the attribute name and type. This is where you define the key attributes of your table and their data types. Note that DynamoDB is schemaless: only the attributes used in the key schema (and in any secondary indexes) are declared here, not every column. In our example there are 2 key attributes, teacher_id and course_id.

The key-schema is where you define the primary or composite key for your table. Again, you declare the key name and key type under this parameter. In our case, teacher_id is the partition key, and it has the HASH key type to indicate that it is the hash attribute used in the internal hash function to determine the storage partition. The other key attribute is course_id, with the RANGE key type to indicate that it is the sort key.

The output of the create-table command is below: 

{
    "TableDescription": {
        "AttributeDefinitions": [
            {
                "AttributeName": "course_id",
                "AttributeType": "S"
            },
            {
                "AttributeName": "teacher_id",
                "AttributeType": "S"
            }
        ],
        "TableName": "dbaworks-courses",
        "KeySchema": [
            {
                "AttributeName": "teacher_id",
                "KeyType": "HASH"
            },
            {
                "AttributeName": "course_id",
                "KeyType": "RANGE"
            }
        ],
        "TableStatus": "CREATING",
        "CreationDateTime": 1595058082.896,
        "ProvisionedThroughput": {
            "NumberOfDecreasesToday": 0,
            "ReadCapacityUnits": 0,
            "WriteCapacityUnits": 0
        },
        "TableSizeBytes": 0,
        "ItemCount": 0,
        "TableArn": "arn:aws:dynamodb:us-west-1:123456789123:table/dbaworks-courses",
        "TableId": "123456f3-1234-5f33-65cc-4cfd456e3nn7",
        "BillingModeSummary": {
            "BillingMode": "PAY_PER_REQUEST"
        }
    }
}


Take note of the resource ARN, or TableArn, in the output of the create-table command, because you will need it for further modifications or updates to your table. In this example, the TableArn is "arn:aws:dynamodb:us-west-1:123456789123:table/dbaworks-courses".


4.  You can monitor the table creation using the command below:


aws dynamodb describe-table --table-name dbaworks-courses | grep TableStatus


        "TableStatus": "ACTIVE",
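Rather than re-running describe-table by hand, you can poll until the status reaches ACTIVE. The AWS CLI also ships a built-in waiter (aws dynamodb wait table-exists --table-name dbaworks-courses) that does this for you. Below is a generic polling sketch; the status command is stubbed with echo so the sketch runs without AWS credentials, and in real use you would pass the describe-table call shown in the comment:

```shell
# Poll a status command until it prints the target status.
# The real status command (assuming credentials/region are configured):
#   aws dynamodb describe-table --table-name dbaworks-courses \
#       --query Table.TableStatus --output text
wait_for_status() {
    status_cmd=$1
    target=$2
    while [ "$(eval "$status_cmd")" != "$target" ]; do
        sleep 5
    done
    echo "$target"
}

wait_for_status "echo ACTIVE" "ACTIVE"   # stubbed status command
```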


5.  Once the table is in ACTIVE status, you can optionally add tags to it, using the tags from Step 2 as values. Here is how to add tags to the new DynamoDB table. The resource-arn is the TableArn value returned by the create-table command in Step 3.

aws dynamodb tag-resource \
  --resource-arn arn:aws:dynamodb:us-west-1:123456789123:table/dbaworks-courses \
  --tags \
     Key=creator,Value=dbaworks \
     Key=environment,Value=prod \
     Key=department,Value=computer-science \
     Key=jiraticket,Value=CS-2233 \
     Key=requestor,Value=cmary \
     Key=application,Value=courseapps
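If you keep the Step 2 tags in a plain key=value list, a small loop can build the Key=...,Value=... arguments that tag-resource expects. This is a sketch; the /tmp/tags.txt file is a hypothetical example:

```shell
# Hypothetical tag list, one key=value pair per line
printf 'creator=dbaworks\nenvironment=prod\n' > /tmp/tags.txt

# Build the tag-resource argument string
TAG_ARGS=""
while IFS='=' read -r k v; do
    TAG_ARGS="$TAG_ARGS Key=$k,Value=$v"
done < /tmp/tags.txt

echo "$TAG_ARGS"
```

The result can then be passed unquoted as --tags $TAG_ARGS so the shell splits it into separate Key=...,Value=... arguments.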


You can list the tags added for this table by running the command below:  

aws dynamodb list-tags-of-resource \
  --resource-arn arn:aws:dynamodb:us-west-1:123456789123:table/dbaworks-courses


6.  You can also add secondary indexes to the table. These can be either global secondary indexes (GSI) or local secondary indexes (LSI). Secondary indexes let you query data using keys other than the primary key. A GSI has its own partition key (and optionally a sort key) different from the table's primary key, while an LSI uses the table's partition key but a different sort key. Note that an LSI can only be defined at table-creation time, whereas a GSI can be added to an existing table. These secondary indexes give more flexibility in querying the data. Secondary index declarations are similar in form to creating the table and its attributes, and require parameter values such as IndexName, KeySchema, and Projection.

In the example below, we use the update-table command to add a global secondary index to the table:

aws dynamodb update-table \
  --table-name dbaworks-courses \
  --attribute-definitions \
      AttributeName=student_uid,AttributeType=S \
      AttributeName=created,AttributeType=N \
  --global-secondary-index-updates \
    '[{"Create":{"IndexName": "student-index","KeySchema":[{"AttributeName":"student_uid","KeyType":"HASH"},{"AttributeName":"created","KeyType":"RANGE"}],"Projection":{"ProjectionType":"ALL"}}}]'

The Projection declaration specifies which attributes will be copied, or projected, from the main table into the index. The projection type ALL means all attributes are projected into the index. You can also choose to project only the index and primary keys (KEYS_ONLY) or a specified list of table attributes (INCLUDE).
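For example, an INCLUDE projection that copies just one extra attribute into the index might look like the fragment below (course_name is a hypothetical non-key attribute, not one defined earlier):

```json
"Projection": {
    "ProjectionType": "INCLUDE",
    "NonKeyAttributes": ["course_name"]
}
```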


7.  You can explore the full set of options and functionality of the create-table command by using the help option as below:

aws dynamodb create-table help





Download RDS DB logs through AWS CLI

If you need a quicker way to download RDS DB logs, you can do so through AWS CLI (command line interface).  

Although the AWS console also allows downloading these logs, it has disadvantages. Since the console is a graphical user interface, you must select the RDS instance, click on its Monitoring tab, scroll down to the list of instance logs, and sort the list in descending order to get to the most recent one. Then you click the selected log and the download button, which brings up a page where you can right-click the log and use "save as". In short, that is a lot of mouse clicks before you get to the logs you need -- a real problem if you are in the middle of an incident and need the logs to diagnose the issue.

Another disadvantage is that you can only download each log file in its entirety. So if the log file is large, you cross your fingers that the network connection holds steady and doesn't drop in the middle of the download. If that happens, you have to start the download all over again.

For a faster and more reliable download of RDS DB log files, you can use the AWS CLI instead. Here is how it is done in Linux; the same CLI commands are also available for Windows.


First, run the describe-db-log-files command and pipe it to the Linux grep command to list the existing log files. You must supply a value for the db-instance-identifier parameter; in this example it is dba-works-01.

dbaworks@dbasrv1:~$ aws rds describe-db-log-files --db-instance-identifier dba-works-01 | grep "LogFileName"

This will return the list of RDS log files for the dba-works-01 instance.


            "LogFileName": "error/postgresql.log.2021-05-16-12",
            "LogFileName": "error/postgresql.log.2021-05-17-00",
            "LogFileName": "error/postgresql.log.2021-05-17-12",
            "LogFileName": "error/postgresql.log.2021-05-18-00",
            "LogFileName": "error/postgresql.log.2021-05-18-12",
            "LogFileName": "error/postgresql.log.2021-05-19-00",
            "LogFileName": "error/postgresql.log.2021-05-19-12",


Next, run the download-db-log-file-portion command to download the selected log file. Again, you supply a value for the db-instance-identifier parameter (dba-works-01) and for the log-file-name parameter (in this case "error/postgresql.log.2021-05-19-12"). The next parameter, starting-token, specifies where to start paginating the log file; the value 0 means download the entire file. The last parameter is output, and we use text so the log file is saved as plain text. Lastly, we use shell output redirection to save the results to a local file named dba-works-01-postgresql.log.2021-05-19-12.log


dbaworks@dbasrv1:~$ aws rds download-db-log-file-portion --db-instance-identifier dba-works-01 --log-file-name "error/postgresql.log.2021-05-19-12" --starting-token 0 --output text > dba-works-01-postgresql.log.2021-05-19-12.log
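To pull down every log file in one pass, the two commands can be combined in a loop. The sketch below only defines and demonstrates the file-naming helper (local_log_name is a hypothetical name); the loop with the actual aws calls is shown in comments, since it needs live credentials:

```shell
# Turn an RDS log file name into a flat local file name, e.g.
# "error/postgresql.log.2021-05-19-12" -> "dba-works-01-postgresql.log.2021-05-19-12.log"
local_log_name() {
    instance=$1
    logname=$2
    echo "${instance}-${logname##*/}.log"
}

local_log_name dba-works-01 error/postgresql.log.2021-05-19-12

# A full download loop would look like:
# for f in $(aws rds describe-db-log-files --db-instance-identifier dba-works-01 \
#              --query 'DescribeDBLogFiles[].LogFileName' --output text); do
#     aws rds download-db-log-file-portion --db-instance-identifier dba-works-01 \
#         --log-file-name "$f" --starting-token 0 --output text \
#         > "$(local_log_name dba-works-01 "$f")"
# done
```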


You can also download only the latest portion of the log file by using a simplified command that skips the starting-token parameter, as below:


dbaworks@dbasrv1:~$ aws rds download-db-log-file-portion --db-instance-identifier dba-works-01 --log-file-name "error/postgresql.log.2021-05-19-12" --output text > dba-works-01-postgresql.log.2021-05-19-12.log

 

The download-db-log-file-portion command is a nifty tool. You can run it with the help option for more information on the other ways to use it:


aws rds download-db-log-file-portion help





How to Load or Import a .csv File to a PostgreSQL Table



If you need to populate a PostgreSQL table with data that comes in a .csv (comma-separated values) file, here are the steps:

1.  Log in to PostgreSQL with a user that can access and manipulate the table

psql -h dba-works-01 -U dbaworks salesdb

2.  Use the COPY command to populate the table with data from the .csv file


COPY transactions FROM '/path-to-csv-file/transactions.csv' WITH (FORMAT csv);


NOTE that COPY reads the file on the database server, so it requires superuser privileges (or the pg_read_server_files role).

If you do not have superuser access, use the psql \COPY meta-command instead, which reads the file from the client machine:


\COPY transactions FROM '/path-to-csv-file/transactions.csv' DELIMITER ',' CSV
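As a quick end-to-end sketch: generate a small sample file and run the client-side copy in one shot with psql -c. The file path and the two columns (id, amount) are hypothetical, and the psql call is shown as a comment since it needs a running database:

```shell
# Create a tiny sample .csv file (hypothetical columns: id, amount)
printf '1,9.99\n2,4.50\n' > /tmp/transactions.csv
wc -l < /tmp/transactions.csv   # 2 rows to load

# Client-side copy, runnable non-interactively:
# psql -h dba-works-01 -U dbaworks salesdb \
#     -c "\copy transactions FROM '/tmp/transactions.csv' DELIMITER ',' CSV"
```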




Add a Command Alias to Linux User Profile




Problem:  I'm tired of typing long commands at the Linux command prompt whenever I want something done. Worse, typing these commands is prone to typos.

Solution: Add a command alias to your user profile.

Here's how:

1.  Edit your Linux OS user profile (.profile or .bash_profile) and add a command alias

dbaworks@dbasrv1:~$ vi .profile

-- add the command alias

# set PATH so it includes user's private bin directories
PATH="$HOME/bin:$HOME/.local/bin:$PATH"

alias check_mongo='ps -ef | grep mongo'


2.  Reload your OS user profile

dbaworks@dbasrv1:~$ . .profile

3.  You can now run the command alias and get the desired result:

dbaworks@dbasrv1:~$ check_mongo
mongod 9655 1 1 Jun29 ? 04:15:13 /usr/bin/mongod --configsvr --replSet csReplSet1 --port 27050 -f /data/mongod.conf --bind_ip_all
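One limitation: an alias cannot take an argument in the middle of the command. If you want to grep for an arbitrary process name, define a small shell function in the same profile instead (check_proc is a hypothetical name):

```shell
# Function version of the alias; $1 is the process name to look for
check_proc() {
    ps -ef | grep "$1" | grep -v grep
}

# Usage (same output as the check_mongo alias):
# check_proc mongod
```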



Shell Script to Automatically Create an AMI Backup of an EC2 Instance (AWS)




Requirements: 

- Access to a Linux machine where shell script will be created
- Access keys to an Amazon Web Services (AWS) Account where the EC2 Instance is hosted


Script name: ami-backup.sh

Input: Name of the instance

How to run: ami-backup.sh <instance_name>

What it does: 

1.  Takes the INSTANCE_NAME as a command-line argument

2.  Sets the LOG directory and LOGFILE to record execution

3.  Runs the AWS CLI describe-instances to extract the INSTANCE_ID

4.  Sets the name of the AMI IMAGE_NAME  

5.  Runs the AWS CLI create-image to create the AMI

6.  Runs a loop to check if the image has been created and available

Finished result: An AMI backup of the instance


Script:
=================================================================================
#!/bin/bash
# name:  ami-backup.sh
# usage: ami-backup.sh <instance_name>
#        ami-backup.sh my-db-instance

INSTANCE_NAME=$1

LOGDIR=/tmp/log
mkdir -p $LOGDIR
LOGFILE=${LOGDIR}/ami-backup-$INSTANCE_NAME-$(date +"%Y-%m-%d-%H-%M-%S").log

# Look up the instance ID from the instance's Name tag
INSTANCE_ID=`aws ec2 describe-instances --filters \
    "Name=tag:Name,Values=$INSTANCE_NAME" \
    | grep "InstanceId" \
    | grep -Po ': "\K[^"]*'`
IMAGE_NAME=ami-backup-$INSTANCE_NAME-$(date +"%Y-%m-%d-%H-%M")

echo "$(date +"%Y-%m-%d-%H:%M:%S") ami-backup.sh started." >> $LOGFILE
echo "$IMAGE_NAME AMI will be created." >> $LOGFILE

IMAGE_ID=`aws ec2 create-image --instance-id $INSTANCE_ID \
     --name "$IMAGE_NAME" \
     --description "$INSTANCE_NAME AMI - $(date +"%Y-%m-%d-%H-%M")" \
     | grep -Po ': "\K[^"]*'`

# Poll every 5 minutes until the AMI state becomes "available"
while true
do
   echo "AMI creation in progress..." >> $LOGFILE
   IMAGE_CREATION_STATUS=`aws ec2 describe-images --filters \
      "Name=name,Values=$IMAGE_NAME" | grep "State" | grep -Po ': "\K[^"]*'`
   if [ "$IMAGE_CREATION_STATUS" = "available" ]; then
      echo "AMI $IMAGE_NAME created." >> $LOGFILE
      break
   fi
   sleep 300
done

echo "$(date +"%Y-%m-%d-%H:%M:%S") ami-backup.sh completed." >> $LOGFILE

=================================================================================
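To run the backup automatically, the script can be scheduled from cron. A sketch of a crontab entry (the script path and instance name are assumptions):

```shell
# Run the AMI backup every night at 01:30
30 1 * * * /home/dbaworks/bin/ami-backup.sh my-db-instance
```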


