Zephyr Enterprise Services

Introduction

The Webhook component of the Zephyr Enterprise suite is the module that handles data exchange and synchronization between Zephyr Enterprise (ZE) and JIRA, and it is responsible for requirements and defects. Given the high-volume data growth at our large customers and the need for a more scalable solution, we have decided to decouple webhook processing and run it entirely asynchronously.

Problem Statement

The main problem areas identified in the current implementation are as follows:

Extreme Resource Utilization

  • Currently, the JIRA webhook and Zephyr Enterprise applications are deployed on the same infrastructure, which becomes a bottleneck during peak loads. The shared infrastructure is then over-utilized, resulting in poor application performance.

Scalability Issues

  • The existing webhook processing module is tightly integrated with core Zephyr Enterprise components, so it is difficult in practice to scale out webhook processing.

    • Tight Coupling of Webhook with ZE core

    • Tight Coupling of event processor with ZE core

    • Legacy Queue – no cross visibility in queues

Data Inconsistencies

  • Data inconsistencies, such as duplicated or incomplete records, are often observed as a result of JIRA sync processing.

    • Events are often not processed in the order in which they were received.

    • Duplicate requirements get created in ZE.

    • Longer processing time causes a huge backlog and lag.

    • Data is lost when failures occur during processing.

Excessive JIRA Calls

  • In the existing implementation of webhook processing, JIRA APIs are being called excessively, resulting in a delay in overall execution, additional resource utilization, and more burden on the JIRA environment.

Traceability

  • Insufficient audit trails are maintained in the existing webhook processing components, so there is no way to trace and backtrack events when investigating failures and other issues.

Proposed Solution

The following solutions have been identified to address the existing webhook processing issues:

Decouple

  • The webhook system will be decoupled from the existing ZE and implemented as a separate runtime.

  • Decoupling will allow us to run the Webhook solution outside of the ZE core infrastructure, allowing both systems to function independently.

  • JIRA events will be processed asynchronously, independent of ZE application resources.

Centralized Queue Management

  • We plan to transition from the existing legacy/file-based queueing mechanism to RabbitMQ, which is a well-known and widely used message broker system.

  • Centralized queueing shall provide flexibility and ensure the order in which events are processed.

  • Better visibility in terms of event processing, e.g., events in the queue, consumption and traffic trends, etc., through RabbitMQ user panels.

  • RabbitMQ also supports a multi-node setup, which helps meet scalability needs as and when required.
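
One way to keep events in order while still scaling out is to send every event for the same JIRA issue to the same queue. The following is only a minimal sketch of that idea, not the actual Webhook implementation: the queue naming scheme (jira.events.N), the queue count, and the function name are assumptions made for illustration, and the routing is shown with plain Python rather than a RabbitMQ client.

```python
import hashlib


def queue_for_issue(issue_key: str, queue_count: int) -> str:
    """Map a JIRA issue key to one of `queue_count` hypothetical queue names.

    A stable hash is used so that the same issue key always maps to the
    same queue, regardless of which process computes it.
    """
    digest = hashlib.sha256(issue_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % queue_count
    return f"jira.events.{index}"  # hypothetical naming scheme


# Events in arrival order: (issue key, action).
events = [("ZE-101", "created"), ("ZE-202", "created"), ("ZE-101", "updated")]

# Group events by target queue; order within each queue is arrival order.
routed = {}
for issue_key, action in events:
    routed.setdefault(queue_for_issue(issue_key, 4), []).append((issue_key, action))
```

Because both ZE-101 events land in the same queue, a single consumer on that queue sees "created" before "updated", while other queues can be consumed in parallel.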

Fault Tolerance

  • Implement persistent queues to prevent data loss if the RabbitMQ server and/or the event consumer components go down or reboot.

  • Follow a handshake (acknowledgement) mechanism so that a request is removed from RabbitMQ only after the consumer has successfully consumed and processed the event.

  • Explore the feasibility of implementing a handshake between JIRA and the Webhook system and introducing a retry mechanism in the event of service failures.

  • Explore ways to make the Webhook system aware of JIRA rate limits so that it avoids blocking other JIRA users.
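
The handshake between the consumer and RabbitMQ can be sketched as a delivery handler that acknowledges an event only after processing succeeds. This is an illustrative sketch under assumptions, not the shipped consumer: the handler name, the retry budget (MAX_ATTEMPTS), the JSON event shape with an "id" field, and the RecordingChannel stand-in are all hypothetical. The channel object is only expected to expose basic_ack and basic_nack, which AMQP client channels such as pika's provide.

```python
import json

MAX_ATTEMPTS = 3  # hypothetical retry budget per event


def handle_delivery(channel, delivery_tag, body, process, attempts):
    """Process one event; acknowledge it only if `process` succeeds.

    On failure the delivery is negatively acknowledged. It is requeued
    until MAX_ATTEMPTS is reached, after which requeue=False lets the
    broker drop it or dead-letter it, depending on queue configuration.
    """
    event = json.loads(body)
    event_id = event["id"]  # assumed field, for illustration only
    try:
        process(event)
    except Exception:
        attempts[event_id] = attempts.get(event_id, 0) + 1
        requeue = attempts[event_id] < MAX_ATTEMPTS
        channel.basic_nack(delivery_tag=delivery_tag, requeue=requeue)
        return False
    channel.basic_ack(delivery_tag=delivery_tag)
    attempts.pop(event_id, None)
    return True


class RecordingChannel:
    """Stand-in for a broker channel so the sketch runs without RabbitMQ."""

    def __init__(self):
        self.calls = []

    def basic_ack(self, delivery_tag):
        self.calls.append(("ack", delivery_tag))

    def basic_nack(self, delivery_tag, requeue):
        self.calls.append(("nack", delivery_tag, requeue))


def always_fails(event):
    raise RuntimeError("ZE temporarily unavailable")


channel = RecordingChannel()
attempts = {}
handle_delivery(channel, 1, b'{"id": "evt-1"}', lambda event: None, attempts)
for tag in (2, 3, 4):
    handle_delivery(channel, tag, b'{"id": "evt-2"}', always_fails, attempts)
```

In this sketch the attempt counter lives in memory, so it resets if the consumer restarts. A real deployment would more likely track retries in message headers or rely on a dead-letter queue.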

Audit Logs

  • Bridge the gap in the current implementation and capture adequate logs and audit trails for webhook processing.
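
As an illustration of what a traceable audit trail might look like, the sketch below writes one structured record per processing stage and then reconstructs the history of a single event. The field names (event_id, issue_key, stage, status, detail), the stage names, and the JSON-lines format are assumptions chosen for the example, not the format used by the Webhook services.

```python
import json
from datetime import datetime, timezone


def audit_record(event_id, issue_key, stage, status, detail=""):
    """Build one audit entry for a processing stage of a webhook event."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event_id": event_id,
        "issue_key": issue_key,
        "stage": stage,    # e.g. "received", "processed" (hypothetical stages)
        "status": status,  # e.g. "ok", "failed"
        "detail": detail,
    }


def to_log_line(record):
    """Serialize a record as a single JSON line."""
    return json.dumps(record, sort_keys=True)


def trace(log_lines, event_id):
    """Return the (stage, status) history of one event, in log order."""
    records = (json.loads(line) for line in log_lines)
    return [(r["stage"], r["status"]) for r in records if r["event_id"] == event_id]


log_lines = [
    to_log_line(audit_record("evt-1", "ZE-101", "received", "ok")),
    to_log_line(audit_record("evt-2", "ZE-202", "received", "ok")),
    to_log_line(audit_record("evt-1", "ZE-101", "processed", "failed", "ZE API timeout")),
]
history = trace(log_lines, "evt-1")
```

With records like these, a failed event can be followed from receipt to the stage where it failed by filtering on its identifier.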

 

  • The services can be deployed on both Windows and Linux.

  • This service is optional.

  • Java 17.0.10 is required to run this service, so we recommend running the services on a separate server.

For more details on deployment of the services
