Firemind created a fully managed solution using Amazon Kinesis Data Streams, Kinesis Firehose, Amazon S3, Amazon CloudFront and AWS Lambda.
The system was built from the ground up purely on AWS services. Because live traffic was very frequent, Firemind made sure the design could scale with demand, with auto scaling in place so that data sent through Kinesis Data Streams was processed rather than discarded.
Kinesis Data Streams received logs from CloudFront real-time logging and forwarded them to Kinesis Firehose, an AWS managed service.
Firemind then created a Lambda function to modify the records passing through the Firehose stream so that they complied with GDPR. Specifically, the function removed the last two octets of each IP address in the CloudFront logs (for example, 188.8.131.52) and removed any null values returned by CloudFront.
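To make the record modification described above more concrete, here is a minimal sketch of a Firehose transformation Lambda in Python. It is not Firemind's actual code. The field list `FIELDS`, the `-` null marker, the tab-separated input, the JSON output and the choice to drop any address that is not IPv4 are all assumptions made for illustration. The only parts taken from the Firehose transformation contract are the `recordId`, `result` and base64-encoded `data` keys.

```python
import base64
import json

# Hypothetical field order: the real list depends on how the CloudFront
# real-time log configuration was set up.
FIELDS = ["timestamp", "c-ip", "sc-status", "cs-uri-stem"]
IP_FIELD = "c-ip"
NULL_MARKER = "-"  # assumed placeholder for an empty log value


def truncate_ipv4(ip):
    """Keep the first two octets of an IPv4 address; return None otherwise."""
    parts = ip.split(".")
    if len(parts) == 4 and all(p.isdigit() for p in parts):
        return ".".join(parts[:2])
    return None  # assumption: anything that is not IPv4 is dropped entirely


def transform(line):
    """Turn one tab-separated log line into a dict without null values."""
    values = line.rstrip("\n").split("\t")
    record = {}
    for name, value in zip(FIELDS, values):
        if value in (NULL_MARKER, ""):
            continue  # skip null values
        if name == IP_FIELD:
            value = truncate_ipv4(value)
            if value is None:
                continue
        record[name] = value
    return record


def lambda_handler(event, context):
    """Firehose transformation entry point: returns one result per input record."""
    output = []
    for rec in event["records"]:
        try:
            line = base64.b64decode(rec["data"]).decode("utf-8")
            payload = json.dumps(transform(line)) + "\n"
            output.append({
                "recordId": rec["recordId"],
                "result": "Ok",
                "data": base64.b64encode(payload.encode("utf-8")).decode("utf-8"),
            })
        except ValueError:
            # Undecodable input is handed back to Firehose as a failed record.
            output.append({
                "recordId": rec["recordId"],
                "result": "ProcessingFailed",
                "data": rec["data"],
            })
    return {"records": output}
```

Emitting one JSON object per record is a deliberate choice in this sketch, since Firehose's Parquet conversion expects JSON input. Whether the original function produced JSON directly is not stated in the source.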
Once the Lambda function had processed the records through Firehose's data transformation option, Firemind used Kinesis Firehose to deliver the data to an S3 bucket in a different account. The data was written in Apache Parquet format and gave DFL members access to the business intelligence platform and reports.
Firemind also set up an AWS Glue crawler to go through the Parquet files automatically, pick up new data and detect whether the data model had changed. The crawler ran every 24 hours.
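As an illustration of a crawler on a 24-hour schedule, the sketch below builds the arguments for the AWS Glue `create_crawler` call. The crawler name, role ARN, database name and S3 path are hypothetical placeholders, and the midnight UTC run time and schema change policy are assumptions, since the source only says the crawler ran every 24 hours and watched for data model changes.

```python
def daily_crawler_config(name, role_arn, database, s3_path):
    """Build keyword arguments for glue.create_crawler (all values illustrative)."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
        # Glue cron format: minutes hours day-of-month month day-of-week year.
        # This expression runs once a day at 00:00 UTC (assumed time).
        "Schedule": "cron(0 0 * * ? *)",
        # Assumed policy: update the catalog table when the schema changes.
        "SchemaChangePolicy": {
            "UpdateBehavior": "UPDATE_IN_DATABASE",
            "DeleteBehavior": "LOG",
        },
    }


config = daily_crawler_config(
    name="example-parquet-crawler",                                 # hypothetical
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueRole",      # hypothetical
    database="example_logs",                                        # hypothetical
    s3_path="s3://example-analytics-bucket/cloudfront/",            # hypothetical
)

# With boto3 installed and credentials configured, the crawler could be created with:
#   import boto3
#   boto3.client("glue").create_crawler(**config)
```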