ADR Consumer Topic Identification [ Replay Design ]

Status

  • Proposed
  • Trialing
  • Under review
  • Approved
  • Retired

Problem Context

Today, the storage service publishes RecordChange messages to “recordstopic”.

When the Storage service publishes a RecordChange message to the “recordstopic” topic of the service bus, all consumers are notified (e.g. the Indexer service and the Notification service). In scenarios such as replaying records for a reindex, notifying every consumer may not be required. Hence, we need a way to instruct the Storage service to publish RecordChange messages to a custom topic depending on the use case. For example, if the replay is done for a reindex, we can instruct the Storage service to publish the RecordChange messages to a “reindex” topic that only the indexer listens to, instead of publishing them to “recordstopic”, which has many consumers. This ensures that only the Indexer service is notified of the events.
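The difference between the two publishing paths can be illustrated with a toy in-memory bus. This is only a sketch: the InMemoryBus class, the payload shape, and the subscriber wiring are assumptions made for illustration and do not reflect the real service bus client.

```python
from collections import defaultdict


class InMemoryBus:
    """Toy stand-in for the service bus: topic name -> subscriber callbacks."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Every subscriber of this topic is notified; other topics are untouched.
        for callback in self._subscribers[topic]:
            callback(message)


bus = InMemoryBus()
received = {"indexer": [], "notification": []}

# Both consumers listen to the shared topic.
bus.subscribe("recordstopic", received["indexer"].append)
bus.subscribe("recordstopic", received["notification"].append)
# Only the indexer listens to the dedicated replay topic.
bus.subscribe("reindex", received["indexer"].append)

record_change = {"id": "record-1", "op": "update"}  # hypothetical payload shape
bus.publish("recordstopic", record_change)  # normal flow: both consumers notified
bus.publish("reindex", record_change)       # replay flow: indexer only
```

After both publishes, the indexer has seen two messages and the notification consumer only the one sent to “recordstopic”.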

Therefore, the producer's design must allow operations to be routed to their intended topics. The question is how the Storage service can determine the topic to which each message should be directed, based on the specific functionality or operation. To address this, we explored the following design options, which will serve as the foundation for the development of our Replay API.

Design options (each with its detailed approach and pros/cons):

      1. Create a different topic for each operation and provide the operation name (e.g. reindex) as input to the replay API. [Preferred Approach]

There will be a separate topic for each operation.

For example, indexer service can listen to a topic called “reindex” and notification service can listen to the topic “notify” in addition to “recordstopic”.

The replay API takes the operation name (e.g. reindex) as input and, based on it, decides which topic to publish the RecordChange messages to. This ensures that only the indexer is notified.

[Figure: Picture1]

Note: One operation maps to exactly one topic in the service (1:1), while a single topic can have multiple consumers.
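A minimal sketch of the operation-to-topic lookup described in this approach, assuming a simple in-memory mapper. The names OPERATION_TO_TOPIC, resolve_topic, replay, and the publish callback are hypothetical and are not part of the existing Storage service.

```python
# Hypothetical 1:1 mapping from operation name to internal topic name.
# The topic names on the right are internal details and could differ per CSP.
OPERATION_TO_TOPIC = {
    "reindex": "reindex",
    "notify": "notify",
}


class UnknownOperationError(ValueError):
    """Raised when the caller passes an operation that has no mapped topic."""


def resolve_topic(operation: str) -> str:
    """Return the topic mapped to the given operation name."""
    key = operation.strip().lower()
    try:
        return OPERATION_TO_TOPIC[key]
    except KeyError:
        raise UnknownOperationError(f"unsupported operation: {operation!r}") from None


def replay(operation, record_changes, publish):
    """Publish each RecordChange message to the topic mapped to the operation.

    `publish` is any callable taking (topic, message); the real service bus
    client is deliberately left out of this sketch.
    """
    topic = resolve_topic(operation)
    for message in record_changes:
        publish(topic, message)
    return topic


sent = []
replay("reindex", [{"id": "record-1"}], lambda topic, msg: sent.append((topic, msg)))
```

Callers only ever pass the operation name, so the topic names can change without affecting the API contract.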

Pros: 
  • Abstraction and statelessness as users need not know about internal topics. 
  • Consistency, as different CSPs can agree on a common operation name irrespective of internal implementation details.
  • Decoupling of the internal implementation from Replay operation.

Cons:
  • A mapper that maps the functionality (e.g. reindex) to a topic name has to be maintained.
  • Implementation will take time.
  • Producers must know the consumer topic mapping. [Remark: every producer already knows the topic names, whether through a registry, an in-memory store, or an environment-variable mapper; currently they are passed as hardcoded values from the deployment YAML to the application properties.]

      2. Create a different topic for each operation and provide the topic name/ID as input to the replay API.

There will be a separate topic for each operation.

For example, indexer service can listen to a topic called “reindex” and notification service can listen to the topic “notify” in addition to “recordstopic”.

If a replay is required for the reindex scenario, the replay API can be called with the topicId parameter set to “reindex”.

This causes the Storage service to publish RecordChange messages to the “reindex” topic instead of “recordstopic”, which ensures that only the indexer is notified.
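A hedged sketch of how the topicId parameter could be handled under this approach. The allow-list, the fallback to “recordstopic” when no topicId is supplied, and the function name select_topic are assumptions made for illustration only.

```python
# Hypothetical allow-list of topics that a replay request may target.
ALLOWED_REPLAY_TOPICS = frozenset({"reindex", "notify"})
DEFAULT_TOPIC = "recordstopic"


def select_topic(topic_id=None):
    """Pick the topic for a replay request that carries the topic name directly.

    Assumption: a request without topicId falls back to the default topic.
    """
    if topic_id is None:
        return DEFAULT_TOPIC
    if topic_id not in ALLOWED_REPLAY_TOPICS:
        # Reject unknown names so callers cannot publish to arbitrary topics.
        raise ValueError(f"unknown topicId: {topic_id!r}")
    return topic_id


chosen = select_topic("reindex")
fallback = select_topic()
```

No mapper is needed here, but the caller has to know the internal topic name, which is the trade-off listed in the cons for this option.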

Pros:
  • No need to maintain the internal mapper if the topic name is used.
  • Implementation is easy if the topic name is used directly.

Cons:
  • If the topic ID is used as input to the replay API, we must still maintain a mapper that associates topic IDs with their topic names.
  • If the topic name is used, users need access to internal topic details.
  • Since these APIs will be introduced at the community level, tying them to specific topics, whose names or implementations may differ between CSPs, could hurt uniformity.

      3. Create a different topic for each consumer and let the specific consumer (e.g. the indexer) call the replay API with the topicId.

In this approach, a new Reindex API in the indexer calls the replay API and provides the topic name in the request body.

Pros:
  • Internal details like topic name need not be known to the user.
  • The consumer can perform pre-requisite operations like deleting indices before calling the replay API.

Cons:
  • Users have to use different APIs, which makes for a bad experience.
  • If replay is invoked through the Reindex API, the response payload must change to include the status ID, which requires an ADR in the community and a code merge.
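
The flow in approach 3 can be sketched as follows, assuming the indexer exposes a reindex handler that first performs its prerequisites and then calls the replay API. The function names, the callbacks, and the request body field are hypothetical.

```python
def reindex(delete_indices, call_replay_api, topic_name="reindex"):
    """Hypothetical Reindex handler living in the indexer (approach 3).

    delete_indices: callable performing the consumer-side prerequisite step.
    call_replay_api: callable that sends the request body to the replay API.
    """
    # The consumer can do its own preparation before asking for a replay.
    delete_indices()
    # The topic name travels in the request body, so the end user never sees it.
    request_body = {"topicId": topic_name}
    return call_replay_api(request_body)


calls = []
response = reindex(
    delete_indices=lambda: calls.append("delete_indices"),
    call_replay_api=lambda body: calls.append(("replay", body)) or {"accepted": True},
)
```

The stubbed callbacks only record the call order; a real implementation would call the indexer's storage layer and the replay endpoint.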

Conclusion: We are going with approach 1, based on the pros listed above.

Edited by Akshat Joshi