
Building an Event Analytics Hub: A Journey in Software Development

  • Writer: Nick Shimokochi
  • Dec 16, 2024
  • 5 min read

Updated: Dec 21, 2024



Extracting clean, robust architecture from business requirements is not unlike carving a path through the woods for others to follow...

Over the years, I've mentored several students, guiding them in software development skills and methodologies. There are many ways to learn coding and software engineering, but it's always better (when possible) to motivate technical effort with specific system requirements: don't just write code aimlessly...create software that serves a purpose! With this in mind, I’m starting a new, ambitious project to develop an Event Analytics Hub. This platform aims not only to showcase modern software development practices but also to offer my students a concrete example of how to design, implement, and deploy a cloud-native application. In this article, I’ll discuss the motivations for the project, the architectural choices shaping its design, and the phased strategy we’ll employ to bring it to fruition.


Motivating the Project's Requirements


Before diving into the technical requirements, it’s essential to outline the user stories that drive this project. Understanding the platform's purpose from the perspective of its end-users ensures that every design choice and implementation detail aligns with real-world utility.


The primary user of the Event Analytics Hub is an operator or analyst responsible for monitoring and understanding data streams from various sensors. For example, a facility manager might use this platform to track environmental conditions in real time and identify anomalies that could indicate equipment failure or environmental risks. To serve this user base, the platform needs to perform several key functions:


  • Ingest and Validate Sensor Data: The platform must be capable of receiving telemetry data from multiple sensors in real time. Data validation ensures that only accurate and complete information progresses through the pipeline.

  • Transform and Store Data for Analysis: After ingestion, the system should clean, aggregate, and transform the raw data into meaningful insights, storing it in a format that is easy to query and analyze.

  • Provide Insightful Analytics: Users should be able to access aggregated metrics, such as average temperature and humidity over a specific period, and visualize trends and anomalies that require attention.

  • Enable Proactive Monitoring: The platform should highlight anomalies or deviations from expected patterns, helping users take timely actions to mitigate potential issues.

  • Support Administrative Tasks: For users managing sensors or the system itself, the platform should provide tools for registering new sensors, updating metadata, and tracking ETL job statuses.


In addition to industrial and operational users, the platform also caters to home users interested in monitoring environmental conditions in their living spaces. For example, a homeowner might use the platform to track temperature and humidity levels to optimize HVAC usage or identify trends that could indicate mold risks or energy inefficiencies. By providing real-time data visualization and anomaly alerts, the platform empowers home users to make informed decisions about maintaining a comfortable and energy-efficient environment.


The home user use case adds another dimension of adaptability, illustrating how the same architecture can be tailored to different scales and contexts of use. By addressing these user stories, the platform becomes both a practical tool for operational decision-making and a rich example of modern software architecture for educational purposes.


Architectural Choices


The Event Analytics Hub will process, transform, and analyze simulated sensor data. The frontend will consist of a React dashboard for visualizing trends, anomalies, and summaries, alongside a Python Django admin interface for managing data and ETL jobs. On the backend, a Python Flask service will ingest raw sensor data and publish it to a message queue (in this case, AWS SQS) to decouple ingestion from processing. The Django service will handle ETL operations, transforming raw data into insights, while a Golang service will provide high-performance APIs for querying aggregated data and analytics.
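To make that ingestion hand-off concrete, here's a rough sketch of what the Flask endpoint might look like. The route name, payload fields, and queue environment variable are placeholders for illustration, not the platform's final API.

```python
# Minimal sketch of the Flask ingestion service: accept a sensor reading
# and publish it to SQS so downstream processing stays decoupled.
import json
import os

import boto3
from flask import Flask, jsonify, request

app = Flask(__name__)
sqs = boto3.client("sqs")
QUEUE_URL = os.environ["TELEMETRY_QUEUE_URL"]  # assumed environment variable


@app.route("/telemetry", methods=["POST"])
def ingest_telemetry():
    reading = request.get_json(silent=True)
    if not reading or "sensor_id" not in reading:
        return jsonify({"error": "sensor_id is required"}), 400

    # Forward the raw reading; validation and ETL happen downstream.
    sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=json.dumps(reading))
    return jsonify({"status": "queued"}), 202
```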


Data will be stored across Amazon RDS for structured data and DynamoDB for metadata, balancing relational and NoSQL capabilities. Serverless functions, implemented with AWS Lambda, will validate incoming data and trigger ETL workflows asynchronously. This architecture will rely heavily on Amazon SQS to ensure that data flows smoothly between ingestion, processing, and analysis.
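Here's a rough sketch of what the validation Lambda could look like when triggered by the SQS queue, assuming the event source mapping is configured to report partial batch failures. The field names and value ranges are assumptions for illustration only.

```python
# Sketch of a validation Lambda consuming messages from the SQS queue.
# Invalid readings are reported back via the partial-batch-failure response
# so SQS can retry or dead-letter them.
import json

REQUIRED_FIELDS = {"sensor_id", "timestamp", "temperature", "humidity"}


def is_valid(reading: dict) -> bool:
    if not REQUIRED_FIELDS.issubset(reading):
        return False
    # Plausible physical bounds; purely illustrative.
    return -40.0 <= reading["temperature"] <= 85.0 and 0.0 <= reading["humidity"] <= 100.0


def handler(event, context):
    failures = []
    for record in event.get("Records", []):
        try:
            reading = json.loads(record["body"])
            if not is_valid(reading):
                raise ValueError("reading failed validation")
            # A real handler would persist the reading or kick off the ETL step here.
        except (ValueError, KeyError, TypeError):
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```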


To facilitate deployment and scaling, the system will be hosted on Amazon EKS (Elastic Kubernetes Service). An AWS Elastic Load Balancer (ELB) will serve as the ingress point, distributing traffic securely to backend services within the Kubernetes cluster. CI/CD pipelines will be driven through GitHub Actions, defined in the respective service repos, leveraging workflows for testing, building, and deploying containerized services to the Kubernetes cluster. Additionally, Terraform will be used to provision cloud infrastructure, ensuring multi-cloud compatibility and declarative management of resources. Monitoring and observability will be implemented using CloudWatch, providing real-time insights into system health and performance.


The platform’s data flow will illustrate how distributed systems operate. Mock sensors will push telemetry data, such as temperature and humidity readings, to the Flask API. This data will be validated by Lambda functions and stored in RDS, while SQS will decouple ingestion from downstream processing. The Django service will extract, transform, and load this data into RDS, performing calculations like averaging values and detecting anomalies. Transformed data will be indexed in DynamoDB for fast querying by the Go service, which will power the analytics dashboard.
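As a sketch of that first hop, a mock sensor might be nothing more than a small script that posts randomized readings to the Flask API on an interval. The endpoint URL and payload shape below are placeholders matching the earlier ingestion sketch.

```python
# Sketch of a mock sensor that pushes randomized temperature/humidity
# readings to the Flask ingestion API every ten seconds.
import random
import time
from datetime import datetime, timezone

import requests

INGEST_URL = "http://localhost:5000/telemetry"  # hypothetical local endpoint


def make_reading(sensor_id: str) -> dict:
    return {
        "sensor_id": sensor_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "temperature": round(random.uniform(18.0, 26.0), 2),
        "humidity": round(random.uniform(30.0, 60.0), 2),
    }


if __name__ == "__main__":
    while True:
        response = requests.post(INGEST_URL, json=make_reading("sensor-001"), timeout=5)
        print(response.status_code, response.text)
        time.sleep(10)
```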


For security, we will use Keycloak as our main auth service. It will handle login redirects as well as the creation and management of the client credentials required for API access (e.g., the JWT tokens used by clients/sensors pushing data to the platform).
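As a sketch of the client side of that flow, a sensor could obtain a token from Keycloak via the standard OAuth2 client-credentials grant and attach it as a bearer token when pushing data. The Keycloak host, realm, and client credentials below are placeholders.

```python
# Sketch of a sensor client fetching a JWT from Keycloak via the OAuth2
# client-credentials grant, then calling the ingestion API with it.
import requests

# Placeholder host and realm; not the project's real Keycloak deployment.
TOKEN_URL = "https://keycloak.example.com/realms/analytics-hub/protocol/openid-connect/token"


def fetch_token(client_id: str, client_secret: str) -> str:
    response = requests.post(
        TOKEN_URL,
        data={
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
        },
        timeout=5,
    )
    response.raise_for_status()
    return response.json()["access_token"]


token = fetch_token("sensor-001", "example-secret")
requests.post(
    "http://localhost:5000/telemetry",  # hypothetical ingestion endpoint
    json={"sensor_id": "sensor-001", "temperature": 21.5, "humidity": 45.0},
    headers={"Authorization": f"Bearer {token}"},
    timeout=5,
)
```

On the server side, the ingestion API would verify the bearer token (for example, against the realm's published signing keys) before accepting the request.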


Phases of Execution


In addition to building the core functionality of the platform, significant emphasis will be placed on automating the development workflow and implementing robust CI/CD pipelines for each repository. The goal is to create a seamless, repeatable, and efficient development process that aligns with modern DevOps practices. Every repository will include infrastructure-as-code configurations, automated tests, and deployment pipelines, ensuring rapid iteration and consistent deployments.


For each service (Flask, Django, Go, and React), dedicated pipelines will be created using GitHub Actions, utilizing workflows for automated testing, container builds, and deployments to EKS. These workflows will also integrate with Terraform to manage infrastructure changes, ensuring compatibility across different cloud environments. Although we are using AWS, we want to "imagine" we're in a business setting where changing deployment platforms might be necessary for product or security reasons. Therefore, we aim to maintain maximum flexibility.


Monitoring and observability will be crucial, with logs and metrics centralized in CloudWatch to offer actionable insights and simplify debugging processes. This DevOps strategy not only reduces manual workload but also highlights the significance of operational excellence in software development.
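As one small illustration, a service could publish a custom metric to CloudWatch alongside its logs; the namespace and dimension names below are assumptions, not settled conventions for the project.

```python
# Sketch of emitting a custom CloudWatch metric from a backend service,
# e.g. counting readings that fail validation.
import boto3

cloudwatch = boto3.client("cloudwatch")


def record_invalid_reading(sensor_id: str) -> None:
    cloudwatch.put_metric_data(
        Namespace="EventAnalyticsHub/Ingestion",
        MetricData=[
            {
                "MetricName": "InvalidReadings",
                "Dimensions": [{"Name": "SensorId", "Value": sensor_id}],
                "Value": 1.0,
                "Unit": "Count",
            }
        ],
    )
```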


To execute this project efficiently, I’m dividing it into six distinct phases:


  1. Lately, I've been spending most of my time either writing Python or handling project planning and people management tasks, so I could use a refresher on the rest of my tech stack. The first phase will involve revisiting React and Go to bridge any gaps I may have. Concurrently, I will set up the foundational platform infrastructure using Terraform, provisioning resources like EKS clusters, RDS databases, and SQS queues.

  2. In the second phase, I will develop the Flask service, focusing on data ingestion. This will include integrating the service with SQS and Lambda to establish a robust data pipeline.

  3. The third phase will introduce the Django service, which will combine backend logic with a server-rendered admin interface, showcasing full-stack development capabilities.

  4. In the fourth phase, I will implement the Go service, building high-performance APIs optimized for serving analytics queries.

  5. Phase five will center on creating the React dashboard. This frontend will visualize data trends and anomalies, providing users with an intuitive interface for interacting with the system.

  6. Finally, the sixth phase will involve integrating all components and conducting rigorous testing to ensure the platform operates cohesively and meets scalability requirements.


Final Thoughts


The Event Analytics Hub is more than just a software project: it’s a teaching tool designed to illustrate the "why" behind architectural decisions. By tying requirements to deliberate choices in tools and techniques, this platform will serve as a comprehensive guide to understanding modern software systems. For my students, it’s both an inspiration and a hands-on example of how to approach complex problems with elegant solutions.

In future articles, I’ll document each phase of the project, sharing detailed insights and lessons learned.


Stay tuned!

