Streaming Data Project
Streaming isn't just for Netflix aficionados! This project focuses on streaming data, using tools like Apache Kafka and ksqlDB. The data source for extraction will be a Python script that generates random text. Kafka will ingest that text, and the transformations will be done in ksqlDB. Further transformation and loading of the data will be handled in a future post.

Let's Start With Some Text!

As mentioned, this process starts with a Python script that generates oodles of random sentences. This functionality is brought to us by the module "Faker," which is described as "a Python package that generates fake data for you." We're also going to use a Kafka client for Python, kafka-python, which will help us generate a data stream and shove it into Kafka's hungry maw. Once they're installed via pip (or pip3, as some would have it), the import string is straightforward:

from kafka import KafkaProducer
from faker import...
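
To make the "generate sentences, then hand them to Kafka" idea concrete, here is a minimal sketch of how the two libraries might fit together. It is not necessarily the script this project uses: the topic name random-sentences, the broker address localhost:9092, the JSON record shape, and the helper names (sentence_stream, encode, produce_sentences) are all assumptions made up for illustration.

```python
import json
from typing import Callable, Iterator


def sentence_stream(make_sentence: Callable[[], str], count: int) -> Iterator[dict]:
    """Yield `count` records, each wrapping one generated sentence."""
    for i in range(count):
        # Hypothetical record shape: a running id plus the sentence text.
        yield {"id": i, "text": make_sentence()}


def encode(record: dict) -> bytes:
    """Serialize a record to UTF-8 JSON bytes, the form sent to Kafka here."""
    return json.dumps(record).encode("utf-8")


def produce_sentences(count: int = 10,
                      topic: str = "random-sentences",        # assumed topic name
                      servers: str = "localhost:9092") -> None:  # assumed broker
    """Send `count` fake sentences to Kafka. Needs a reachable broker."""
    # Imported here so the helpers above work without either package installed.
    from faker import Faker
    from kafka import KafkaProducer

    fake = Faker()
    producer = KafkaProducer(bootstrap_servers=servers, value_serializer=encode)
    for record in sentence_stream(fake.sentence, count):
        producer.send(topic, value=record)
    producer.flush()  # block until buffered messages have been sent
    producer.close()
```

With a broker listening on the assumed address, calling produce_sentences() would push ten sentences into the topic. Keeping the generator separate from the producer means the record shape can be checked without Kafka running at all.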