A starting point for an Apache Flink project using Kotlin.
- Pre-requisites
- Up & Running
- Running The Job
- Observing the Job in Action
- Additional Containers
- Useful Commands
- Useful References
- Additional Topics - Visit Wiki
## Pre-requisites

- Docker on Mac
- Gradle - You have a few options here
  - If you're using IntelliJ, just make sure it's enabled.
  - Run `brew install gradle`
## Up & Running

Let's first clone the repo and fire up our system:

```sh
git clone git@github.com:aedenj/apache-flink-kotlin-starter.git ~/projects/apache-flink-kotlin-starter
cd ~/projects/apache-flink-kotlin-starter; make kafka-start
```
Now you have a single-node Kafka cluster with various admin tools to make life a little easier. See the Kafka cluster repo for its operating details.
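To sanity-check that everything came up, you can list the running containers (names and port mappings will vary with the compose file):

```sh
# Show each running container alongside its port mappings
docker ps --format 'table {{.Names}}\t{{.Ports}}'
```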
## Running The Job

The sample job in this repo reads from a topic named `source` and writes to a topic named `destination`. There are a couple of ways of running this job depending on what you're trying to accomplish. First, let's set up the Kafka topics by running `make create-default-topics`.
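To make the topology concrete, here is a minimal sketch of what such a pass-through job can look like using Flink's `KafkaSource`/`KafkaSink` connector API. The broker address, class layout, and job name are assumptions for illustration, not the repo's actual code:

```kotlin
import org.apache.flink.api.common.eventtime.WatermarkStrategy
import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema
import org.apache.flink.connector.kafka.sink.KafkaSink
import org.apache.flink.connector.kafka.source.KafkaSource
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment

fun main() {
    val env = StreamExecutionEnvironment.getExecutionEnvironment()

    // Consume raw string records from the `source` topic.
    // localhost:9092 is an assumed broker address for the local cluster.
    val source = KafkaSource.builder<String>()
        .setBootstrapServers("localhost:9092")
        .setTopics("source")
        .setStartingOffsets(OffsetsInitializer.earliest())
        .setValueOnlyDeserializer(SimpleStringSchema())
        .build()

    // Write each record, unchanged, to the `destination` topic.
    val sink = KafkaSink.builder<String>()
        .setBootstrapServers("localhost:9092")
        .setRecordSerializer(
            KafkaRecordSerializationSchema.builder<String>()
                .setTopic("destination")
                .setValueSerializationSchema(SimpleStringSchema())
                .build()
        )
        .build()

    env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source")
        .sinkTo(sink)

    env.execute("kafka-pass-through")
}
```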
For quick feedback it's easiest to run the job locally:

- If you're using IntelliJ, use the usual methods.
- On the command line, run `make run`

Alternatively, you can run the job on a local Flink cluster:

- Run `make flink-start`.
- Verify the cluster is running by navigating to the UI.

This will run the job with the cluster defined in `flink-job-cluster.yml`.
By default the cluster runs one task manager with one slot, but this can be changed. The `make flink-start` command accepts two parameters, `NUM_TASK_MANAGERS` and `NUM_TASK_SLOTS`. For example, running

```sh
make flink-start NUM_TASK_MANAGERS=2 NUM_TASK_SLOTS=3
```

will result in a job cluster running two task managers with three slots each and a default parallelism of six.
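The arithmetic: total slots = `NUM_TASK_MANAGERS` × `NUM_TASK_SLOTS` (2 × 3 = 6 above), and a job's parallelism can never exceed the total slots available. If you want a job to pin its own parallelism rather than inherit the cluster default, here's a hedged sketch (this repo's job may not do this):

```kotlin
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment

fun main() {
    val env = StreamExecutionEnvironment.getExecutionEnvironment()
    // Cap this job at parallelism 3. It must fit within the cluster's
    // total slots (NUM_TASK_MANAGERS * NUM_TASK_SLOTS), or the job will
    // sit waiting for resources.
    env.setParallelism(3)
    // ... build and execute the pipeline as usual ...
}
```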
To stop the job, run `make flink-stop`.
## Observing the Job in Action

After starting the job with one of the methods above, let's observe it reading messages from one topic and writing them to another:

- Start the job using one of the methods above.
- If kcat isn't installed, run `brew install kcat`
- To populate the source topic, run

  ```sh
  head -n 10 nyctaxi-dataset/yellowcab-trips-sample.json | kcat -b localhost -t source -T -P -l
  ```

- Navigate to the Kafka UI and view the messages in both the `source` and `destination` topics.

You should see the same messages in both topics.
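If you prefer the command line to the Kafka UI, kcat can read a topic back directly; for example, to dump everything currently in `destination` and exit:

```sh
# -C consume, -o beginning start from the earliest offset, -e exit at end of topic
kcat -b localhost -t destination -C -o beginning -e
```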
## Additional Containers

Both Prometheus and Grafana are available via Docker. In addition, the Prometheus exporter has been enabled in the local cluster. To see this all in action:
- Ensure Kafka is running. If it isn't, run `make kafka-start`. I prefer to see the output; however, if you don't, you can add the `-d` option to the `kafka-start` command in the Makefile.
- If the topics for the default Flink job don't already exist, run `make create-default-topics`
- Now let's start up Grafana and Prometheus by running `make monitor-start`
- Start the Flink cluster by executing `make flink-start`
- Navigate to the example dashboard
You may not see results immediately. Wait a minute or so and you should start seeing data. Lastly, if you run the Flink cluster with more than one task manager, the additional nodes will be discovered automatically by the service discovery configured in Prometheus.
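As a starting point in Prometheus, try querying one of Flink's built-in metrics; for example, assuming the standard Prometheus reporter naming, the number of registered task managers:

```
flink_jobmanager_numRegisteredTaskManagers
```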
Here is a list of useful links:

| Description | Link |
|---|---|
| Prometheus - Useful for exploring the raw measurements of a metric | http://localhost:9090 |
| Grafana - Home of all the dashboards | http://localhost:9003 |
| Job Manager Prometheus Exporter | http://localhost:9249 |
| Task Manager Prometheus Exporter | Run `docker ps` and identify the containers mapped to port 9249 |
- ZooNavigator - A web-based UI for ZooKeeper that you can start and stop using `make zoonav-start` and `make zoonav-stop`. Navigate to http://localhost:9001/ for access.
## Useful Commands

| Service | Command(s) |
|---|---|
| Kafka Cluster | `make kafka-start`, `make kafka-stop` |
| Kafka | `make create-default-topics`, `make delete-default-topics`, `make create-topics topic='"topicA:2:3" "topicB:5:1"'`, `make delete-topics topic='"topicA" "topicB"'` |
| Flink Cluster | `make flink-start`, `make flink-start NUM_TASK_SLOTS=2 NUM_TASK_MANAGERS=3`, `make flink-stop` |
| Monitoring | `make monitor-start`, `make monitor-stop` |
| ZooNavigator | `make zoonav-start`, `make zoonav-stop` |
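The `topic` argument to `make create-topics` appears to follow a `name:partitions:replication-factor` convention; for example, a hypothetical `events` topic with four partitions and a replication factor of one:

```sh
make create-topics topic='"events:4:1"'
```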
## Useful References

- kcat (formerly kafkacat) - A generic non-JVM producer and consumer for Apache Kafka >= 0.8; think of it as a netcat for Kafka. In addition, the Confluent docs have helpful examples.
- Kafka UI - A feature-rich UI on par with the Confluent UI for monitoring and managing Kafka clusters.