Reflection on my current personal project¶
I have created a personal project to explore and learn about implementing event-driven applications in AWS...
That sounds pretty easy and straight forward, right?
Well, it's just over a month since I started, and of the 11 labs I originally planned, I am still busy with lab 3. Just after a month into this little experiment, I'm pausing a little bit just to reflect on a couple of topics and I thought it may be helpful to others if I share them here as well. Also, my blog needs a little activity too!
The basic idea¶
In my minds eye I had this idea that try to focus on AWS core services to try and implement an event-driven application. I identified essentially two stacks or paths that I want to explore:
- AWS Kinesis that is geared toward fast ingestion of many events
- A different approach (and cheaper) that involved AWS S3, SNS, and SQS.
To make things a little more practical, I thought I would create a simulation of a real life application, and I was wondering which type of application would be ideal to experiment with in terms of both the two paths mentioned above. I'm sure a lot of readers would be able to think of something, but I settled on a building access card management system.
My thinking was as follow:
- The more administrative back-office type tasks, like linking a physical access card to an employee could be implemented using the S3, SNS and SQS path
- Assuming we have a scenario where a lot of people have to enter/exit a premises at a fairly high rate, it might be a good scenario for a Kinesis implementation.
Keep in mind, the actual practicality is not that important (it's not an application I'm going to sell or anything), but I will still consider the practical aspects to see if certain design decisions makes practical sense.
Although the focus is not on security, I do want to implement at least a basic acceptable level of security measures. I believe this is what is missing from many learning projects and security is often left as an after thought and something that should be figured out later. With this exercise I will try to stick with some of the basics as far as possible. I am sure by including the security topic in my focus it will also change the dynamic somewhat, and I am keen to learn how this focus changes things.
The "Full Stack"¶
It turns out I really challenged myself with this one... There are a number of front-end components required, so in addition to the primary objectives focusing on the event infrastructure and integration, I also plan to look at some web hosting options. I have done basic S3 hosting implementations before (like this blog), so for this exercise I thought I would play with EC2 based hosting, using AWS FSX as a common network storage back-end.
With that said, it's probably a good idea to quickly map out all the technologies and services I have touched up to this point as well as those I will probably use soon:
- ELastic Load Balancer
- Route 53
- Certificate Manager
- API Gateway
- SSM (for secure EC2 access)
- Athena (planning to use)
- Docker, for hosting an Nginx web server on EC2 (could potentially be replaced with ECS? If I still have energy left at the end of the project I would like to explore this)
- Python, for Lambda functions and some other scripts, for example the initial populating of the DynamoDB table
- Shell scripting for build and deployment logic
- Bootstrap web framework with one of their free templates
This is a one-man exercise, and I am focused on learning. Therefore, there is a kind of a plan, which I have laid out in phases I called labs, where each lab focus on a specific outcome. A lot relies on the discovery through practical implementation, so I am not spending a lot of time on planning everything upfront - rather, I allow things to evolve more naturally. So, in terms of planning and design, I basically look ahead no more than about two weeks - and even within that two weeks, things tend to change.
I spend between two and three hours a day during the week and probably a total of around eight to ten hours over weekends on this exercise. That's about 20 hours a week. That is of course over and above my normal forty-ish hour work week, so it is fairly intense, but focused.
All Infrastructure is deployed via AWS CloudFormation templates and artifacts required for deployment is staged on S3. This allow me to delete the entire project at the end of each session and then re-deploy each day from scratch. To compensate for the wait time during the CLoudFormation actions, I use the time waiting during deployment for planning the session's activities and start to do some actual work as far as possible. At the end of the session, during the deletion of all the stacks, I use the time to reflect, update notes and progress and perhaps also cleanup some bits if required.
I work mainly early mornings during the week, as I get up at around 0400 each morning. I don't work in the evening as I need time to also exercise a bit, do some chores around the home and all that kind of stuff.
I think I have found a nice rhythm now so I am happy with this approach. My only frustration is that I have to break for real work, and this sometimes interrupts my thought processes at that time. It's always a race against time for me in these cases to pen down all of the stuff in my head as quick as I can. And then the rest of my normal work day can play out.
Progress and early reflection¶
Progress is slower than I though it would be, especially since I already have a fairly good understanding of many of the AWS technologies as I have used these in one form or another in many other projects previously. However, each implementation is different and presents with unique challenges that have to be solved. In total, I have now spend about 120 hours and I am approaching the end of lab 3. It is really hard to judge the effort that will be required in the other labs, but I would think about 50 hours per lab seems to be a fair estimate at this point. I also plan to take a break during December so at this point I doubt I will finish this project this year.
But what have I learned so far? This is the whole point of this exercise after all...
The line between project code and Infrastructure as code...¶
Something that is dawning on me is that the lines between Infrastructure and project code is becoming very blurry. From my perspective, when you buy into a cloud platform like AWS, developers will have to know some Infrastructure as Code, as there are some cases where Infrastructure as Code and project code are really integrated.
This, in my mind, presents several challenges:
- Developers generally focus on solving a problem and is therefore not focused on Infrastructure topics
- Software developers do not always know about topics like networking, routing, load balancing and related infrastructure topics. THerefore, the question is how do they solve this, or what role does Infrastructure engineers now play in the team?
- When integrating with cloud services, developers have to consider how data will be passed into their applications. In traditional on-premise or hosted solutions, developers could dictate a lot more on how the API must look like and these application do not always have to consider the infrastructure integration bits. For example, when integrating with the AWS API Gateway, your API will not consume the message as you dictated in a OpenAPI specification. Rather, you first have to extract the input data from the AWS API Gateway message, which has a lot of other information about the request as well. Having said that, from a service consumer, they still integrate as per the OpenAPI specification.
- Infrastructure engineers, on the other hand, may not know all the details related to how a software solution needs to operate within an environment. This will require a close working relationship with the developers to ensure all the supporting services are considered. A good example here was the intricate AWS Load Balancer configuration, especially for the web application that had to be secured through a Cognito login. IN this same example, there were also the Route 53 (DNS) and certificates to consider - something developers typically would not concern themselves with.
- As you develop your solution, you constantly need to keep in mind how information is passed from one system to the next. There are many options, and each service can potentially integrate with many other services, and this present challenges in how your application distinguish between the different message specifications. Of course, this is also very cloud provider specific, so writing a more portable application will also be a great challenge.
Development Support and Operational Challenges¶
This is actually not news to me personally, but once again I see that the effort to prepare the Infrastructure and development environments also take up a fair amount of time.
For example, the integration between GitHub and AWS to automatically deploy the web application involved several AWS services, including:
- AWS Lambda function with URL (which the GitHub webhook is integrated to) that identifies the GitHub event and prepares a release event that is posted on SNS/SQS
- SNS/SQS to relay new release events
- Another AWS Lambda function to orchestrate the deployment of an EC2 instance, that in turn will release the GitHub changes. This EC2 instance self terminates after not receiving more release events on SQS for more than 10 or so minutes.
The above sounds like a pretty straight forward solution, and even though I knew exactly how to implement this, it still took almost a whole week (< 20 hours) to implement. The actual input of the code is one thing, but the real time killer here was the process testing and refining until it is just right.
In the end I am pretty happy with the solution. When I deploy this stack on AWS, I just need to update the week-hook URL on GitHub and then each time I update my web application, the new release is automatically deployed to AWS. The first deployment take a couple of minutes, as the EC2 instance need to start up, but after that releases are fast. Since the EC2 release server only stays up for a little over 10 minutes, it gives me an incentive to release something every 10 minutes - even if it's the smallest of changes. I found that this approach really focuses me on the task at hand.
DynamoDB is different... in a good way¶
I forced myself to try and really understand DynamoDB and in particular how it differs from relational databases.
In a nutshell, I have learned a lot!!
The design of the data storage and indexes has a very different approach to RDBMS systems, but I found a couple of really helpful resources on the topic, and so far I feel like I have taken a giant leap forward in understanding how this amazing database product works. I now find it as easy as working with a RDBMS - perhaps even easier...
I have documented all the resources I consulted in the learning effort, and I hope to write a lot more about this in future.
I really like this little project I embarked on. I have brushed up on a lot of old skills and learned about some new technologies, like AWS FSX. The experience also helped me gain some new perspective on what it takes to really developed a practical event-driven solution on AWS, since I am usually just focused on AWS Infrastructure.
Hopefully there will be more updates in the near future, especially once I have lab 3 complete, which will allow some real practical deep dives to start take shape.
Until my next update, this is it for now. Today I have taken a break from the project to take time to reflect, and it also helped me to see that it is a worthwhile exercise. I will strongly encourage everyone that want to grow in any area of their live to also try a similar approach: embark on a project and do it practically.