How to Create a DynamoDB Table

 



Amazon DynamoDB is a NoSQL database used in serverless applications.  It supports key-value and document data models. If you need to create a DynamoDB table that will be used by your application service, here are the basic steps to do that.


1.  The bare-minimum requirements to create a DynamoDB table are the table name, a partition key and a sort key column.  For example, the table below:

Tablename: dbaworks-courses

PartitionKey: teacher_id

SortKey: course_id

The name of the table is  dbaworks-courses.  Table names are case-sensitive, and contains only characters a-z or A-Z, the numbers 0-9, or the symbols dash (-), underscore (_) or dot (.)

The partition key and sort key are the types of primary keys for DynamoDB tables.  The partition key values are used in an internal hash function that determines the physical storage partition where the data item is stored. In the example, the partition key is teacher_id. The data items in the table dbaworks-courses can be accessed directly using the teacher-id value. 

The use of sort key together with the partition key makes up the composite primary key.  In this case, more than one data items can have the same value for the partition key teacher-id and are stored together in their respective partitions. In each partition, the data are further sorted based on the sort key, in this case the course_id


2.  You may optionally need to add tags to your new table for better categorisation and reporting.

Examples of tags and their values:

creator = dbaworks

environment = prod

department = computer-science

jiraticket = CS-2233 

requestor = cmary 

application = courseapps


3.  Create the DynamoDB table in the AWS Console or using the AWS CLI command below:

aws dynamodb create-table \

  --table-name dbaworks-courses \

  --attribute-definitions \

         AttributeName=teacher_id,AttributeType=S \

         AttributeName=course_id,AttributeType=S  \

  --key-schema AttributeName=teacher_id,KeyType=HASH \

         AttributeName=course_id,KeyType=RANGE 


You need to supply the required parameters such as the table-name,the attribute-definitions, and the key-schema. Our table name is dbaworks-courses. The attribute-definitions parameter section is divided further into the attribute name and type.  This is the part where you define all the columns of your table and their data types. In our example, these are 2 columns or attributes, and they are the teacher_id and the course_id. 

The  key-schema is where you define the primary or composite key for your table.  Again, you can declare specific attributes like key name and key type under this parameter.  In our case, the teacher_id is the the partition key, and it has the HASH value for its key type to indicate that this is a hash attribute and will be used in the internal hash function to define the storage partition.  The other key attribute is course_id and it has the RANGE value for its key type to indicate that this is the sort key.

The output of the create-table command is below: 

{

    "TableDescription": {

        "AttributeDefinitions": [

            {

                "AttributeName": "course_id",

                "AttributeType": "S"

            },

            {

                "AttributeName": "teacher_id",

                "AttributeType": "S"

            }

        ],

        "TableName": "dbaworks-courses",

        "KeySchema": [

            {

                "AttributeName": "teacher_id",

                "KeyType": "HASH"

            },

            {

                "AttributeName": "course_id",

                "KeyType": "RANGE"

            }

        ],

        "TableStatus": "CREATING",

        "CreationDateTime": 1595058082.896,

        "ProvisionedThroughput": {

            "NumberOfDecreasesToday": 0,

            "ReadCapacityUnits": 0,

            "WriteCapacityUnits": 0

        },

        "TableSizeBytes": 0,

        "ItemCount": 0,

        "TableArn": "arn:aws:dynamodb:us-west-1:123456789123:table/dbaworks-courses",

        "TableId": "123456f3-1234-5f33-65cc-4cfd456e3nn7",

        "BillingModeSummary": {

            "BillingMode": "PAY_PER_REQUEST"

        }

    }

}


Take note of the resource ARN  or the TableArn in the output of the create-table command, because you will need this for further modifications or updates in your table.  In this example, the resource ARN or TableArn is "arn:aws:dynamodb:us-west-1:123456789123:table/dbaworks-courses"


4.  You can monitor the table creation using the command below:


aws dynamodb describe-table --table-name dbaworks-courses | grep TableStatus


        "TableStatus": "ACTIVE",


5.  Once the table is created or in ACTIVE status, you can optionally  add the tags to the table, using the tags in Step 2 as values.  Here is how to add tags to the new DynamoDB table.  The resource-arn is the TableArn value that resulted from the create-table command from Step 3.  

aws dynamodb tag-resource \

  --resource-arn arn:aws:dynamodb:us-west-1:123456789123:table/dbaworks-courses \

  --tags \

     Key=creator,Value=dbaworks \

     Key=environment,Value=prod \

     Key=department,Value=computer-science \

     Key=jiraticket,Value=CS-2233 \

     Key=requestor,Value=cmary \

     Key=application,Value=courseapps 


You can list the tags added for this table by running the command below:  

aws dynamodb list-tags-of-resource \

  --resource-arn arn:aws:dynamodb:us-west-1:123456789123:table/dbaworks-courses


6.  You can also add secondary indexes to the table.  These indexes can either be a global secondary index (GSI) or a local secondary index (LSI). Secondary indexes are used to query data using different keys other than the primary keys. GSI is an index with its own partition key and sort key different from the table's primary key,  and LSI uses the partition key of the table but has a different sort key. These secondary indexes give more flexibility in querying the data.  Secondary index declarations are also similar in form as creating the table and its attributes, and it requires such parameter values as IndexName, KeySchema, and Projection. 

In the example below, we used the update-table command to add a global secondary index to the table:

aws dynamodb update-table \

  --table-name dbaworks-courses \

  --attribute-definitions \

      AttributeName=student_uid,AttributeType=S \

      AttributeName=created,AttributeType=N \

  --global-secondary-index-updates \

    '[{"Create":{"IndexName": "student-index","KeySchema":[{"AttributeName":"student_uid","KeyType":"HASH"},{"AttributeName":"created","KeyType":"RANGE"}],"Projection":{"ProjectionType":"ALL"}}}]'

The Projection declaration specifies that these column attributes will be copied or projected from the main table into the index. The projection type ALL means all the attributes will be projected to the index.  You can also chose to project the index and primary keys only (KEYS_ONLY) or a specified list of table attributes only (INCLUDE).


7.  You can learn more of the comprehensive options and functionalities of the create-table command by using the help option such as below:

aws dynamodb create-table help


If you want to deeper-dive into the concepts of Amazon DynamoDB, you can learn a lot from these books:





Technology Photo by fabio on Unsplash



Download RDS DB logs through AWS CLI

If you need a quicker way to download RDS DB logs, you can do so through AWS CLI (command line interface).  

Although the AWS console also allows downloading these logs, there are disadvantages to this.  Since the console is a graphical user interface, you must click/select the RDS instance first, click on the Monitoring tab of the instance, scroll down to the list of instance logs, and even sort the list of logs in descending order to get to the most recent one -- if you are in the middle of an incident and need to look at the logs to diagnose the issue.  Then you click on the selected log, click the download button, which will bring up a page where you can right-click the selected log using  "save as".  In short, too many mouse clicks before you can get to the needed logs.

Another disadvantage is that you are only allowed to download the log files as they are.  So if the log file is large, you cross your fingers that the network connection will hold steady and won't disconnect in the middle of the download.  If that happens, you have to resume the download and start again.

For faster and more reliable download of RDS DB log files, you can use AWS CLIs instead.  Here is how it is done in Linux.  The same CLIs are also available for Windows.


First, you run the describe-db-log-files command and combine it with the grep Linux OS command to list the existing log files.  You must supply the value for the db-instance-identifier parameter, in this example it is dba-works-01

dbaworks@dbasrv1:~$ aws rds describe-db-log-files --db-instance-identifier dba-works-01 | grep "LogFileName"

This will return the list of RDS log files for the dba-works-01 instance.


            "LogFileName": "error/postgresql.log.2021-05-16-12",

            "LogFileName": "error/postgresql.log.2021-05-17-00",

            "LogFileName": "error/postgresql.log.2021-05-17-12",

            "LogFileName": "error/postgresql.log.2021-05-18-00",

            "LogFileName": "error/postgresql.log.2021-05-18-12",

            "LogFileName": "error/postgresql.log.2021-05-19-00",

            "LogFileName": "error/postgresql.log.2021-05-19-12",


Next, run the download-db-log-file-portion  command to begin downloading the selected log file. Again, you should supply the value for the db-instance-identifier parameter, in this example it is dba-works-01, the value for the log-file-name parameter, in this case it is "error/postgresql.log.2021-05-19-12". The next parameter -- starting-token -- allows you to specify where to start paginating the log file.  In this example the value of 0 is used, which means to download the entire file.  The last parameter is output and we use text to specify that the log file should be formatted as text when it is downloaded.  Lastly, we use the Linux pipe to save the results to a local file name dba-works-01-postgresql.log.2021-05-19-12.log


dbaworks@dbasrv1:~$ aws rds download-db-log-file-portion --db-instance-identifier dba-works-01 --log-file-name "error/postgresql.log.2021-05-19-12" --starting-token 0 --output text > dba-works-01-postgresql.log.2021-05-19-12.log


You can also download only the latest portion of the log file by using a simplified command, skipping the -- starting-token parameter, see below: 


dbaworks@dbasrv1:~$ aws rds download-db-log-file-portion --db-instance-identifier dba-works-01 --log-file-name "error/postgresql.log.2021-05-19-12" --output text dba-works-01-postgresql.log.2021-05-19-12.log

 

The  download-db-log-file-portion command is a nifty tool.  You can run the command with the help option to get more help or information on the other different ways to use this command for your convenience.


aws rds download-db-log-file-portion --db-instance-identifier help



Want to learn more about AWS Command Line Interface (CLI)?  Update your knowledge with this book below: