
Write integration tests for data conversions for schema #77

Open
ikbalkaya opened this issue May 26, 2022 · 4 comments
Labels
enhancement New feature or improved functionality.

Comments

@ikbalkaya
Contributor

ikbalkaya commented May 26, 2022

Currently we use EmbeddedConnectCluster to run our integration tests. This provides a way to run tests without having to provision an external cluster for tests.

However, it looks like it is currently not possible to exchange data with a schema registry, and it doesn't seem to be possible to produce data with a schema (an Avro schema in the case I tried).

While trying to find a way to write unit tests, I found myself using classes from https://github.com/confluentinc/schema-registry. It is easy to serialize, deserialize and convert messages with Avro using this repo, as it contains the AvroConverter itself. It also has a MockSchemaRegistryClient that can be used to exchange schemas between producers and consumers in unit tests.

I think this particular repo can be checked further to see if there are embedded classes/utilities that provide the ability to wire up schemas with producers and connectors, as well as utilities that provide a way to send data with a schema to Kafka.

It is also worth checking whether we can use MockSchemaRegistryClient in our current test setup.
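For context on what those registry-aware serializers actually put on the wire: Confluent's Avro serializer frames each record value as a magic byte `0x0`, a 4-byte big-endian schema ID, and then the Avro-encoded payload. A minimal self-contained sketch of that framing (plain Java, no Confluent dependency; the `WireFormat`, `frame` and `schemaId` names are illustrative, not from any library):

```java
import java.nio.ByteBuffer;

// Illustrative helpers mirroring the Confluent wire format:
// [magic byte 0x0][4-byte big-endian schema id][avro payload]
public class WireFormat {
    private static final byte MAGIC_BYTE = 0x0;

    // Frame an already Avro-encoded payload with its registry schema id.
    static byte[] frame(int schemaId, byte[] avroPayload) {
        return ByteBuffer.allocate(1 + 4 + avroPayload.length)
                .put(MAGIC_BYTE)
                .putInt(schemaId)
                .put(avroPayload)
                .array();
    }

    // Read back the schema id from a framed record value, as a
    // registry-aware consumer or converter would before deserializing.
    static int schemaId(byte[] framed) {
        ByteBuffer buf = ByteBuffer.wrap(framed);
        if (buf.get() != MAGIC_BYTE) {
            throw new IllegalArgumentException("Unknown magic byte");
        }
        return buf.getInt();
    }

    public static void main(String[] args) {
        byte[] framed = frame(42, new byte[] {1, 2, 3});
        System.out.println(schemaId(framed));  // 42
        System.out.println(framed.length);     // 8
    }
}
```

This is why producer and consumer must share a registry (real or mock): the payload carries only the schema ID, and both sides need the same ID-to-schema mapping to round-trip data.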


@ikbalkaya ikbalkaya added the enhancement New feature or improved functionality. label May 26, 2022
@lmars
Member

lmars commented May 26, 2022

@ikbalkaya I provided a few pointers for how we might do this in our internal Slack channel. Can you explain here the things you tried and what you couldn't quite get to work, so that anyone picking this up in future can build on your learnings?

@ikbalkaya ikbalkaya changed the title Write integration tests for Avro data conversion Write integration tests for data conversions for schema May 27, 2022
@ikbalkaya
Contributor Author

ikbalkaya commented May 27, 2022

@lmars I tried to summarize my findings and some potential options we can try - hopefully it is clear.

@lmars
Member

lmars commented May 30, 2022

@ikbalkaya here are the pointers I provided in the Slack channel:

With regards to starting a schema registry:

Here's how the schema-registry tests start one for an integration test:
https://github.com/confluentinc/schema-registry/blob/master/core/src/test/java/io/confluent/kafka/schemaregistry/RestApp.java#L71-L87

and with regards to producing Avro-encoded data:

if we look at the underlying implementation of produce, I think we can just replicate that in our test code?
https://github.com/apache/kafka/blob/3.1.0/connect/runtime/src/test/java/org/apache/kafka/connect/util/clusters/EmbeddedKafkaCluster.java#L407-L414

it's just converting the strings to bytes anyway

we could init our own producer like this: https://github.com/apache/kafka/blob/3.1.0/connect/runtime/src/test/java/org/apache/kafka/connect/util/clusters/EmbeddedKafkaCluster.java#L156-L163

Did you try out any of these approaches?
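One concrete way to combine those pointers without standing up a real registry: Confluent serializers accept a `mock://<scope>` registry URL, which is backed by a MockSchemaRegistryClient shared by every client using the same scope name. A producer initialized as in the EmbeddedKafkaCluster link above might be configured roughly like this (a sketch; the scope name and bootstrap address are illustrative):

```properties
# Hypothetical producer settings for exchanging Avro data via a mock registry.
# The mock:// URL scheme is resolved by Confluent serializers to an in-memory
# MockSchemaRegistryClient shared by every client using the same scope name.
bootstrap.servers=localhost:9092
key.serializer=org.apache.kafka.common.serialization.StringSerializer
value.serializer=io.confluent.kafka.serializers.KafkaAvroSerializer
schema.registry.url=mock://integration-test-scope
auto.register.schemas=true
```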

@ikbalkaya
Contributor Author

@lmars I tried both and wasn't able to combine all the elements. As far as I remember, with RestApp it wasn't possible to connect it to the embedded Connect and Kafka clusters. Producing Avro-compatible data had a similar issue: we can produce Avro-compatible data, but it wasn't possible to add an intermediate schema registry to exchange schemas between the producer and the sink connector. I was looking for an embedded schema registry server to use but couldn't find one, or the one I found wasn't compatible with the embedded Kafka cluster. I was hoping to look into this later, as it was taking up a lot of my time.
But I think if there is nothing available, I might be able to create a compatible embedded schema registry that plays well with our current test setup.
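If a shared in-memory registry does turn out to work for the producer side, the sink connector could in principle point its converter at the same scope, since the Connect AvroConverter takes a registry URL of its own. A sketch of the relevant connector properties (the scope name is illustrative and must match the producer's):

```properties
# Hypothetical sink connector converter settings sharing the producer's
# mock registry scope, so schema IDs resolve on both sides.
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=mock://integration-test-scope
```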
