Leveraging Predictive Analytics for Improved Patient Care and Operational Excellence

The healthcare industry is changing rapidly, and the integration of Striim with GenAI applications marks a significant breakthrough. Hospitals face challenges such as consumerization, workforce shortages, and the need for digital transformation. Striim and GenAI offer a way forward, enabling efficient, effective, patient-centered care. Striim aims to help organizations navigate these complexities and take healthcare delivery to new heights.

Empowering Modern Healthcare with Advanced Technology

Healthcare’s transition toward value-based care, along with its emphasis on price transparency, demands robust, adaptable solutions. Striim and GenAI emerge as such solutions, redefining healthcare delivery and management. Their role in integrating real-time data is crucial, ensuring quality care and operational efficiency, two pillars essential for modern healthcare success.

Addressing the Challenges in Modern Healthcare 

Healthcare providers today face a myriad of challenges, each impacting care quality and operational efficiency:

  1. Patient Disconnection: Consider patients with chronic conditions, who without timely updates about their health status may feel disconnected from their care plans. This challenge often stems from limited access to real-time patient data, which hinders the delivery of personalized care and robust patient engagement.
  2. Operational Efficiency Amid Staffing Shortages: Envision a hospital during an unexpected patient surge – perhaps due to a local health crisis. Without real-time data, efficiently allocating resources and adjusting staffing levels becomes a daunting task, often leading to strained services and compromised patient care.
  3. Revenue Leakage: A common yet critical issue arises from inefficiencies and errors in billing processes. Such challenges, often due to outdated or fragmented systems, can lead to significant financial losses, undercutting a hospital’s operational viability.
  4. Limited Real-Time Information: The healthcare sector’s Achilles’ heel is often the delayed access to vital patient data. For instance, a lag in updating a patient’s diagnostic results can impede timely treatment, affecting patient outcomes and care quality.

Transforming Healthcare with Striim and AI

In response to these challenges, Striim and GenAI offer transformative solutions:

  • Personalized Patient Care: Integrating data from electronic medical records (EMRs), IoT devices, and direct patient feedback, these platforms enable healthcare providers to craft individualized care plans. This tailored approach enhances treatment effectiveness and elevates patient outcomes.
  • Workforce Optimization: By consolidating data from various hospital systems, Striim and GenAI provide critical real-time insights for effective staffing and resource management. This capability is especially valuable in times of workforce fluctuations, helping maintain high-quality patient care without overstraining staff.
  • Integrated Revenue Cycle Management: Utilizing real-time data integration and processing, these tools create a seamless and efficient revenue cycle. From patient registration to final billing and payment reconciliation, every step is optimized for accuracy and speed, reducing the likelihood of revenue loss due to administrative errors.
  • Streamlined Clinical Workflows: Immediate access to comprehensive patient information is crucial for informed decision-making in healthcare. Striim and GenAI streamline clinical workflows by integrating real-time data and advanced analytics, enhancing efficiency and reducing the administrative burden on healthcare providers.

Shaping a Future-Ready Healthcare System

The healthcare industry has achieved a significant milestone by adopting Striim and GenAI technologies. The union of these technologies has improved patient outcomes, enhanced operational agility, and boosted financial health. In a sector where efficiency, responsiveness, and patient-centricity are critical, Striim and GenAI aim to set new standards.

We invite healthcare professionals to explore the transformative potential of Striim and GenAI. How can these innovative technologies revolutionize patient care, operational management, and financial efficiency? Join the conversation and share your insights on embracing these advanced solutions in healthcare.

Book a demo today.

A Guide to Seamless Data Fabric Implementation

Organizations are grappling with the increasing complexity and diversity of their data sources. Traditional approaches often fall short in addressing the challenges posed by disparate data silos, and there arises a need for a more cohesive and integrated solution. Enter Data Fabric — a paradigm that promises a unified, scalable, and agile approach to managing the intricacies of modern data.

What is Data Fabric? 

Data Fabric is a comprehensive data management approach that goes beyond traditional methods, offering a framework for seamless integration across diverse sources. It is not a standalone product but comprises key elements, including data integration, ensuring the smooth merging of data; data quality, maintaining high data standards; metadata management, organizing and understanding data context; and security, safeguarding data integrity. Together, these four elements form a cohesive fabric, unifying disparate data sources and providing organizations with a holistic and coherent perspective on their data landscape.

The 4 Key Pillars of Data Fabric

Data Integration: Breaking Down Silos
At the core of Data Fabric is the imperative need for seamless data integration. This element ensures the smooth merging of data from various sources, fostering a unified and comprehensive view. By dismantling data silos, organizations can promote collaboration and unlock valuable insights that were previously hidden in isolated pockets of information.

Data Quality: Building Trust in Information
Maintaining high standards for data quality such as accuracy, consistency, and reliability is paramount. By upholding data quality, organizations can trust the information they rely on for decision-making, fostering a data-driven culture built on dependable insights.

Metadata Management: Navigating the Data Landscape
Effective metadata management is the key to navigating the vast data landscape. This element involves organizing and understanding the context of data, enhancing discoverability and interpretability. With well-managed metadata, users can gain insights into the origin, structure, and relationships of integrated data, facilitating more informed decision-making.

Security: Safeguarding Data Integrity
Security is a non-negotiable aspect of the Data Fabric approach. It involves implementing robust measures to safeguard the integrity of data. By ensuring confidentiality and reliability through stringent security protocols, organizations can protect their data from unauthorized access, instilling trust in their data management practices.

How Striim Supports Data Fabric Implementation

While there are various ways to build a data fabric, the ideal solution simplifies the transition by complementing your existing technology stack. Striim serves as the foundation for a data fabric by connecting with legacy and modern solutions alike. Its flexible and scalable data integration backbone supports real-time data delivery via intelligent pipelines that span hybrid cloud and multi-cloud environments. 

  1. Real-Time Data Integration
    Striim provides a powerful streaming integration platform that employs change data capture (CDC) and streaming data processing to ensure data is captured and processed promptly, minimizing latency and delivering timely insights. Striim continuously ingests transaction data and metadata from on-premises and cloud sources, and an in-memory streaming SQL engine transforms, enriches, correlates, and analyzes transaction event streams (see the sketch after this list).
  2. Enhanced Data Quality
    Striim incorporates robust data quality measures such as validation rules and data cleansing processes. By enforcing data quality standards throughout the integration pipeline, Striim ensures the integrity and accuracy of data, and fresh data provides the latest operational insights for profitable real-time decisions.
  3. Metadata-Driven Architecture
    Rich metadata management is at the core of Striim’s platform. It captures and utilizes metadata, including information on data lineage, quality, and transformations, providing a solid backbone for guiding activities within the data management system.
  4. Scalability and Flexibility
    Striim’s architecture is inherently modular, scaling by adding processing and storage resources as needed, without extra planning or execution overhead, saving time and money. Whether a database schema changes, a node fails, or a transaction is larger than expected, Striim’s intelligent integration pipelines take corrective action the instant a problem arises.
  5. Security Measures
    Striim ensures end-to-end security in data streaming and integration. It offers encryption protocols, access controls, and monitoring features to safeguard sensitive information and address security concerns. Striim’s hybrid and multi-cloud vault securely stores passwords, secrets, and keys, and it integrates seamlessly with third-party vaults such as HashiCorp.
  6. AI Innovation Support
    Striim serves as a crucial component for organizations aiming to harness the power of Artificial Intelligence (AI). Its seamless integration capabilities align with Data Fabric’s role as a bedrock for AI initiatives, providing a unified view essential for training robust machine learning models.
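
To make the streaming SQL capability mentioned in item 1 concrete, here is a minimal, illustrative TQL sketch. The stream, cache, and column names are hypothetical; a real pipeline would declare its sources and caches first.

-- Illustrative only: component and field names are hypothetical.
-- A continuous query (CQ) filters change events from an input stream,
-- enriches them against a reference cache, and emits the results to an
-- output stream that downstream targets can consume.
CREATE CQ EnrichOrderEvents
INSERT INTO EnrichedOrderStream
SELECT o.OrderId,
       o.Amount,
       c.CustomerName,
       c.Region
FROM OrderChangeStream o
JOIN CustomerCache c ON o.CustomerId = c.CustomerId
WHERE o.Amount > 0;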

Empowering GenAI Innovation

Data Fabric has emerged as a pivotal framework that goes beyond integration, offering a comprehensive solution for organizations aiming to harness the power of AI. At its core, Data Fabric serves as the bedrock for AI initiatives by seamlessly integrating diverse data sources, providing a unified view essential for training robust machine learning models.

Organizations leveraging the synergies of GenAI and data fabric can unlock a multitude of advantages. By enabling natural language access, these technologies empower organizations to democratize data, offering a ChatGPT-like interface for seamless queries. Addressing the complexities of data integration in hybrid and multi-cloud environments, generative AI and LLMs streamline real-time integration through automated code generation, supporting dynamic entity resolution and automated data mapping. Leveraging vector databases, these technologies enable groundbreaking similarity searches based on connected context within the data fabric, fostering data intelligence and uncovering untapped data assets. Furthermore, they address the critical challenge of real-time data quality by automating anomaly detection, data cleansing, and validation, ensuring a heightened overall data quality. Finally, in the realm of data security and governance, GenAI and data fabric automate processes such as discovery, classification, categorization, and data access in real time, establishing a foundation for secure and governed data management.

Implementation Strategies for Data Fabric in Your Organization

While the promises of Data Fabric are compelling, the road to implementation requires careful consideration and strategic planning. Organizations embarking on the journey of adopting Data Fabric should begin by conducting a comprehensive assessment of their existing data landscape. Understanding the current state of data sources, quality, and integration points is crucial to formulating an effective implementation strategy.

Collaboration between IT and business units is key during the implementation phase. Data Fabric is not just a technological solution but a holistic framework that requires alignment with the organization’s business goals. Engaging stakeholders from various departments ensures that the Data Fabric implementation is tailored to meet the specific needs and objectives of the organization.

Additionally, organizations should adopt an iterative approach to implementation, focusing on quick wins and gradually expanding the scope. This allows for continuous feedback and adjustments, ensuring that the Data Fabric evolves alongside the changing needs of the organization.

Real-World Applications of Data Fabric with Striim

To illustrate the real-world impact of Data Fabric, let’s explore a few use cases across different industries.

Revolutionize Patient Care: Seamless Data Integration in Healthcare
Healthcare institutions grapple with fragmented patient data across various systems. Implementing Data Fabric unifies electronic health records, diagnostic tools, and wearable device data in real time. This results in a comprehensive patient view, enhancing medical decision-making, personalized treatment plans, and accelerating medical research for breakthrough innovations.

Elevate Customer Experience: Real-time Insights in Retail Operations
Retail giants aim to enhance customer experience by integrating data from multiple sources, including sales transactions, customer behaviors, and inventory levels. With Data Fabric, the organization achieves real-time data integration, optimizing pricing strategies, improving inventory management, and ultimately delivering a seamless and personalized retail experience.

Ensure Regulatory Compliance: Robust Data Management in Financial Services
Financial institutions face the challenge of meeting stringent regulatory requirements. Data Fabric is implemented to ensure compliance by integrating and managing data with a focus on security and accuracy. This not only streamlines compliance processes but also enhances risk assessment, fraud detection, and personalized customer services in the fast-paced financial landscape.

Enhance Drug Discovery: Data Integration in Pharmaceutical Research
In the pharmaceutical industry, research teams grapple with the integration of diverse datasets critical for drug discovery. Data Fabric accelerates drug development by seamlessly integrating data from clinical trials, research studies, and external sources. This unified data approach promotes collaboration, data-driven decision-making, and accelerates the pace of innovation in pharmaceutical research.

Optimize Supply Chain: Real-time Data for Manufacturing and Logistics
Manufacturing companies seek to optimize their supply chain by integrating data from production processes, logistics, and inventory management. Data Fabric enables real-time data processing, providing a unified and up-to-date view of the entire supply chain. This results in improved operational efficiency, reduced lead times, and enhanced agility in responding to market demands.

Transforming Data Challenges with Data Fabric and Striim

Data Fabric has arrived as a transformative force, offering a unified, scalable, and agile solution to the burgeoning challenges posed by disparate data sources. Comprising essential elements such as data integration, data quality, metadata management, and security, Data Fabric transcends traditional limitations. This cohesive framework not only breaks down data silos but also fosters a culture of collaboration, enabling organizations to make informed decisions based on a unified and comprehensive data landscape.

Ready to build a global and agile data environment that can track, analyze, and govern data across applications, environments, and users? Start using Striim for free today and scale limitlessly!

Striim Cloud for Application Integration

Introducing Striim Cloud for Application Integration: a fully managed, simple, and scalable SaaS service for application connectors. With this new application integration service, users can stream real-time CRM, ERP, billing, and payment data from their cloud applications to data warehouses in minutes with zero coding. Instantly unlock the value of your application data through real-time insights, reports, and dashboards for your business. Data integration users can now take advantage of a single service that joins application and transactional data to generate business-critical insights.

The number of cloud applications has exploded; research says enterprises, on average, deploy 500 applications, and the adoption of new applications is growing. Businesses that continuously deploy these applications are facing inevitable challenges in controlling data integration and presenting insightful data to their management and customers. As the leader in change data capture (CDC) from databases, Striim is introducing the new service Striim Cloud for Application Integration, which is built on a proven real-time streaming, scalable, and highly available Striim Cloud platform.

As a Google Cloud native, fully managed service, Striim removes the complexity of data integration, allowing businesses to focus on deriving valuable insights without worrying about the underlying technical challenges. This combination of ease of use, exceptional performance, and comprehensive management makes Striim an ideal choice for organizations aiming to leverage data from applications like HubSpot, Stripe, Zendesk, and more for strategic advantage.

Key features:

  • Offers dedicated single-tenant architecture & modern network security features to ensure the highest level of data security 
  • Automated schema creation, initial load of historical data, and continuous syncs in real-time to BigQuery
  • Secure, OAuth connectivity and SAML 2.0 Authentication
  • Ability to transform data in-flight, in real-time to deliver business-ready application data to BigQuery
  • Real-time monitoring of data delivery and data quality SLAs

Getting Started:

This blog covers getting started with Striim Cloud for Application Integration using an example of our new HubSpot connector. With just a few clicks and no coding, anyone in the organization with access to their cloud application and BigQuery can set up the pipeline and show the value of application data to management in minutes.

Simply follow these easy steps to build your first data streaming pipeline between HubSpot and BigQuery:

  1. Login to Google Marketplace, search for Striim or HubSpot
  2. Subscribe to a 10-day Trial or purchase the plan
  3. Signup with Striim Cloud
  4. Create the first integration service (Infra)
  5. Create your first pipeline (Requires HubSpot and BigQuery access) 

Step 1: Log in to Google Marketplace

Go to Google Marketplace, search for Striim or HubSpot, and select the solution HubSpot connector by Striim.

Step 2: Choose & subscribe to the plan

Striim offers a 10-day trial through the marketplace; if you want to see the value first, simply select the trial plan. Provide your billing account information to Google, read the Striim Cloud SLA, and agree to proceed. In this step, Google redirects users to the Striim Cloud signup page. Go ahead and sign up and activate your account.

Step 3: Signup with Striim Cloud

After the subscription step in the marketplace, Google will redirect users to the Striim Cloud signup page as shown below. Sign up with your email address and a unique domain name, typically a department or company name, to generate a Striim Cloud tenant with a URL for accessing the service. You may need to activate your account from your email inbox and sign in before going to the next step.

Step 4: Create Application Adapter service (Infrastructure to create pipelines)

Select the region and create a service. Striim Cloud automatically creates the infrastructure required to run Striim adapter data pipelines, including a Kubernetes (K8s) cluster, networking, and storage services, and configures the Striim software with smart defaults.

After the service is in a running state, simply Launch the service to get started.

 

Step 5: Create the first data pipeline to stream HubSpot data to BigQuery

After launching the service, users will land on the Application Connectors homepage. Simply select the HubSpot to BigQuery wizard.

Configure the HubSpot-to-BigQuery pipeline using the wizard. By default, Striim creates the schema on the target (BigQuery) for your selected HubSpot objects, and the wizard automatically validates connections, permissions, and other necessary requirements.

Configure BigQuery access; the service account key can be stored securely in the key vault. Check the Striim documentation on how to use key vaults to store keys.

The wizard validates both the selected source objects and the selected BigQuery dataset, then presents a summary for you to confirm before streaming begins.

With that, the user is all set to stream data, and the Striim Application Adapter service starts moving data from HubSpot to BigQuery.

Cultivating Developer Communities and Revolutionizing Data Analysis with Viktor Gamov

Unlock the secrets of engaging developer communities and the transformative world of real-time data analytics with our guest, Viktor Gamov of StarTree. From crafting code to leading developer relations, Viktor unravels his career evolution, highlighting how fostering connections and sharing knowledge with developers has reshaped the landscape of tech communication. His take on the democratization of technical know-how reveals the profound impact of making what was once consultancy-exclusive accessible to all. Tune in for a masterclass on the importance of community in the tech industry and how it can break barriers for innovation.

Are you ready to see data come to life? Viktor’s thrilling exposition on Kafka, KSQL, and Apache Pinot turns the arcane into the amazing, using a real-time Pac-Man game dashboard to illustrate the revolutionary shift from batch to stream processing. Witness the rebirth of open-source technologies and grasp the concept of ‘data in motion’ as we discuss the critical importance of streaming platforms in modern data architecture. Viktor’s expertise in developer relations shines as he demonstrates the value of making complex tech relatable and relevant to business needs.

The data landscape is ever-evolving, and with the rise of AI, the stakes have never been higher. In an era where milliseconds matter, Viktor peels back the layers of how Apache Pinot is driving real-world solutions for industries galore. From restaurant load management to transaction tracking, discover how real-time analytics are informing strategic business decisions. As we journey with our guest back to his roots in data and game development, we’re reminded of the cyclical nature of passion and profession—where one’s beginnings often foretell the trajectory of their career. Don’t miss out on this episode, where we connect the dots between nostalgia and the next wave of data innovation.

Customer Royalty, the Revolution of Unified Profiles, and AI-driven Engagement with Alex Levin

Get ready to revolutionize your approach to customer engagement with the insights from Alex Levin, the visionary CEO behind Regal.io. In our latest episode, we break down the transformative power of treating every customer like royalty, the secret sauce that’s reshaping business strategy in a digital-first marketplace. As Alex takes us from his unexpected tech pilgrimage to the founding of Regal.io, you’ll discover the game-changing potential of unified customer profiles that tailor experiences across the board—from retail therapy to the nurturing of patient relationships in healthcare.

Imagine a world where your business could predict and fulfill every customer’s desire before they even click ‘send’ on that service inquiry. That’s the crux of our discussion with Alex, where we dissect the intricacies of customer service at scale, the revolution in email marketing, and the marvels of journey builders that are setting new standards for brand engagement.

Finally, join us as we navigate the digital marketing terrain that’s swiftly moving away from costly, ephemeral channels to forging direct, lasting relationships with customers. As Alex shares his expertise, we look at the pivotal role of AI and large language models in gleaning context from data—ushering in an era of customer interaction that’s both incredibly personalized and meticulously overseen by the human touch. This isn’t just a conversation; it’s a masterclass in staying indispensable in the SaaS consolidation wave and becoming the data steward your company needs to succeed in the ever-competitive marketplace.

What’s New In Data is a data thought leadership series hosted by John Kutay, who leads data and products at Striim. The show features industry practitioners discussing the latest trends, common real-world data patterns, and analytics success stories.

Striim’s Dynamic Duo: A Powerful Partnership with Yugabyte Redefines Data Management

Exciting news is on the horizon as Striim proudly announces its technology partnership with Yugabyte, a collaboration set to reshape the landscape of data management. In this dynamic partnership, the fusion of Striim’s real-time data integration and streaming analytics capabilities with Yugabyte’s distributed SQL database, YugabyteDB, promises businesses unprecedented scalability, resilience, and global reach. As we embark on this thrilling journey, we share a vision of empowering organizations with the tools they need to thrive in a data-driven world.

Striim, a frontrunner in real-time data solutions, is renowned for enabling enterprises to ingest, process, and analyze data in real-time. YugabyteDB, on the other hand, is a leader in distributed SQL databases, providing scalability and resilience to meet the evolving needs of modern applications. Together, these two companies bring a wealth of expertise to the table, promising a comprehensive solution that addresses the complexities of modern data management.

“At Striim, we believe in the transformative potential of data. Our partnership with Yugabyte is a testament to our commitment to providing innovative solutions for businesses seeking to harness the full power of their data,” says Phillip Cockrell, Senior Vice President, Business Development at Striim. “By integrating our real-time data capabilities with YugabyteDB, Yugabyte’s distributed SQL database, we’re not just facilitating data management; we’re enabling organizations to scale, adapt, and derive meaningful insights from their data like never before.”

YugabyteDB is distributed PostgreSQL built for modern apps, delivering on-demand scalability, built-in resilience, and global distribution. It allows businesses to deploy their databases across multiple regions and cloud platforms. This capability ensures that companies can provide low-latency access to data for users worldwide, facilitating global expansion.

Danny Zaidifard, VP of Business Development at Yugabyte, adds, “Yugabyte is excited to join forces with Striim to deliver a comprehensive solution for organizations navigating the complexities of modern data management. Striim’s expertise in real-time data integration aligns seamlessly with Yugabyte’s commitment to providing a distributed SQL database that is scalable, resilient, and globally distributed. Together, we offer businesses a powerful platform to manage, analyze, and derive value from their data, setting the stage for a new era of data agility and innovation.”

The Striim and Yugabyte partnership marks a significant milestone in the realm of data management. By combining the strengths of real-time data integration with a distributed SQL database, businesses can expect a seamless, scalable, and resilient solution that empowers them to make the most of their data assets. As organizations strive to stay competitive in a data-centric world, this partnership provides the essential tools needed to thrive and innovate in an ever-evolving landscape.

Enhancing Business Efficiency with Striim’s ‘Read Once, Stream Anywhere’ CDC Pattern

In today’s data-driven business landscape, the ability to effectively capture and utilize real-time data is paramount. Change Data Capture (CDC) is not just a technical process; it’s a gateway to unparalleled business efficiency and intelligence. Let’s explore how Striim’s ‘Read Once, Stream Anywhere’ CDC pattern is revolutionizing how businesses handle data.

Simplifying Data Complexity

Imagine a world where data from your core systems is seamlessly captured and synchronized across multiple platforms in real-time. Striim’s CDC approach simplifies this complexity, turning a maze of data streams into a streamlined flow of actionable insights.

The Striim Advantage: Efficiency and Scale

Striim’s CDC methodology stands out for its efficiency. Unlike traditional methods that create separate read clients for each data consumer – adding stress to your systems – Striim uses a single read client. This approach significantly reduces the load on source databases and paves the way for scalable data management.

Persistent Streams: The Heart of Reliable Data Flow

At the core of Striim’s CDC strategy are Persistent Streams. These components ensure that data is not only captured accurately but also managed effectively across different applications. They are the guardians of your data integrity, especially during critical recovery processes.

Business Benefits of Striim’s CDC Pattern

  1. Reliable Data Recovery: Persistent Streams enable controlled data rewinds during recovery, ensuring business continuity without data loss or duplication.
  2. Autonomous Application Interaction: With Striim, each application interacts with the data stream independently, enhancing data consistency and integrity across your business operations.
  3. Scalable Data Management: Striim’s CDC pattern allows businesses to manage data flows efficiently, adapting to the ever-changing data landscapes without overwhelming your source systems.
  4. Enhanced Decision-Making: Real-time data synchronization means your business decisions are always informed by current data, giving you a competitive edge in responsiveness and strategic planning.

Streamlining Operations with Advanced Data Routing

Striim’s Router component is another jewel in the CDC crown. It efficiently directs data flows, ensuring that each piece of data reaches its intended destination without unnecessary complexity. This means more time focusing on core business operations and less on managing data pathways.

Conclusion: A New Era of Data Integration

Striim’s ‘Read Once, Stream Anywhere’ CDC pattern is not just about moving data – it’s about transforming how businesses approach real-time data integration and analytics. Striim is leading businesses into a new era of efficiency and intelligence by simplifying data complexity, ensuring reliability, and enhancing scalability.

Embrace Striim’s innovative approach to CDC and unlock the true potential of your business data. It’s not just an upgrade to your data processes; it’s a strategic move towards a smarter, more agile business model.

Read our full technical deep-dive here and start your free 14-day trial of Striim.

Change Data Capture Best Practices with a ‘Read Once, Stream Anywhere’ Pattern in Striim

Note: To follow this best practices guide, you must have the Persistent Streams add-on in Striim Cloud or Striim Platform.

Introduction

Change Data Capture (CDC) is a critical methodology, particularly in scenarios demanding real-time data integration and analytics. CDC is a technique designed to efficiently capture and track changes made in a source database, thereby enabling real-time data synchronization and streamlining the process of updating data warehouses, data lakes, or other systems.

Change Data Capture to Multiple Subscribers

It is common for organizations to stream transactional data from a database to multiple consumers – whether it be different lines of the business or separate technical infrastructure (databases, data warehouses, and messaging systems like Kafka). 

However, a common anti-pattern in CDC implementation is creating a separate read client for each subscriber. This might seem intuitive but is actually inefficient due to competing I/O and additional overhead created on the source database. 

The more efficient approach is to have a single read client that pipes out to a stream, which is then ingested by multiple writers. Striim addresses this challenge through its implementation of Persistent Streams, which manage data delivery and recovery across application boundaries.

Concepts and definitions for this article

  1. Striim App: Deployable Directed Acyclic Graph (DAG) of data processing components in Striim. 
  2. Stream: Time-ordered log of events transferring data between processing components.
  3. Source (e.g., OracleReader): Captures real-time data changes from an external system and emits a stream of events
  4. Targets: Writes a stream of events to various external systems.
  5. Router: Directs data from an input stream to two or more stream components based on rules 
  6. Continuous Query (CQ): Processes data streams using a rich Streaming SQL language
  7. Persistent Streams: Transfers data between components in a durable, replayable manner.
  8. Transaction log: A chronological record of all transactions and the database changes made by each transaction, used for ensuring data integrity and recovery purposes.

Enhanced Role of Persistent Streams in Striim for Data Recovery and Application Boundary Management

Persistent Streams in Striim play a critical role in data recovery and managing data flows across application boundaries. They address a common challenge in data streaming applications: ensuring data consistency and integrity, especially during recovery processes and when data crosses boundaries between different applications. 

In-Memory Streams vs. Persistent Streams

In traditional streaming setups without Persistent Streams, data recovery and consistency across application boundaries can be problematic. Consider a scenario with two Striim applications (App1 and App2), where App1 captures change data and publishes it to an in-memory stream that App2 consumes. Because an in-memory stream keeps no durable record of what has been delivered, a restart or failure of either application can lead to lost or duplicated events, since App2 has no independent checkpoint to resume from.

Persistent Streams for Recovery and Application Boundary Negotiation

To mitigate these challenges, Persistent Streams offer a more sophisticated approach.

Persistent Streams allow for a more controlled and error-free data flow, especially during restarts or recovery processes. Here’s how they work:

  1. Rewind and Reprocessing: Persistent Streams enable the rewinding of data flows, allowing components to reprocess a portion of the stream backlog. This ensures that data flows are reset properly during recovery.
  2. Independent Checkpoints: Each subscribing application (for example, App2) maintains its own checkpoint on the Persistent Stream. This means that each app interacts with the stream independently, enhancing data consistency and integrity.
  3. Private Stream Management: Applications can stop, start, or reposition their interaction with the Persistent Stream without affecting other applications. This autonomy is crucial for maintaining uninterrupted data processing across different applications.
  4. Controlled Data Flow: Each application reads from the Persistent Stream at its own pace, reaching the current data head and then waiting for new data as needed. This controlled flow prevents data loss or duplication that can occur with traditional streams.
  5. Flexibility for Upstream Applications: The upstream application (like App1 in our example) can write data into the Persistent Stream at any time, without impacting the downstream applications. This flexibility is vital for dynamic data environments.

Designing pipelines to read once, stream anywhere 

Application Boundary Negotiation with Persistent Streams

Persistent Streams in Striim play a pivotal role in managing data flows across application boundaries, ensuring exactly-once processing. They provide a robust mechanism for recovery by rewinding data flows and allowing reprocessing of the stream backlog. For example, in a scenario where two applications share a stream, Persistent Streams allow each application to maintain its own checkpoint and process data independently, avoiding issues like data duplication or missed data that can occur when traditional streams are used across application boundaries. This approach ensures that each downstream application can operate independently, maintaining data integrity and consistency.

Kafka-backed Streams in Striim

Striim offers integration with Kafka, either fully managed by Striim or with an external Kafka cluster, to manage Persistent Streams. This add-on is available in both Striim Cloud and Striim Platform.

See enabling Kafka Streams in Striim Platform (self-hosted). 

To create a Persistent Stream, follow the steps in our documentation. 

To set up Persistent Streams in Striim Cloud, simply check the ‘Persistent Streams’ box when launching your service. You can then use the ‘admin.CloudKafkaProperties’ to automatically persist your streams to the Kafka cluster fully managed by Striim.
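
For reference, a Persistent Stream backed by the Striim-managed Kafka cluster is declared in TQL as shown below; the same pattern appears in the sample applications later in this guide (the stream name is illustrative).

CREATE STREAM CDCStream OF Global.WAEvent PERSIST USING admin.CloudKafkaProperties;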

Striim Router for Load Distribution

Striim provides several methods to distribute and route change data capture streams to multiple subscribers. To route data based on simple rules (e.g. table name, operation type), we generally recommend using a Striim Router component. 

A Router component channels events to related streams based on specific conditions. Each DatabaseWriter is connected to a stream defined by the Router.

				
					CREATE STREAM CustomerTableGroupStream OF Global.WAEvent;
CREATE STREAM DepartmentTableGroupStream OF Global.WAEvent;
CREATE OR REPLACE ROUTER event_router INPUT FROM OracleCDCOut AS src 
CASE
    WHEN meta(src, "TableName").toString().equals("QATEST.DEPARTMENTS") OR 
         meta(src, "TableName").toString().equals("QATEST.EMPLOYEES") OR 
         meta(src, "TableName").toString().equals("QATEST.TASKS") THEN
        ROUTE TO DepartmentTableGroupStream,
    WHEN meta(src, "TableName").toString().equals("QATEST.CUSTOMERS") OR 
         meta(src, "TableName").toString().equals("QATEST.ORDERS") THEN
        ROUTE TO CustomerTableGroupStream;
				
			

The event router is assigned the task of routing incoming data from the OracleCDCOut source to various streams, determined by predefined conditions. This ensures efficient data segregation and processing based on the incoming event’s table name.

Flow Representation:

The Reader reads from the tables of interest and writes to a single stream. The Router then uses table-name-based case statements to direct the events to multiple writers.

Designing the Striim Apps with Resource Isolation

For a ‘Read Once, Stream Anywhere’ CDC pattern, you will create Striim apps that handle reading from a source database, moving data with Persistent Streams, and routing data to various consumers.

Be sure to enable Recovery on all applications created in this process. Review Striim docs on steps to enable recovery.

CDC Reader App: An app with a CDC reader that connects to the source database and outputs to the Persistent Stream.

For each database that you will perform CDC from, create an application to source data and route to downstream consumers.

Sample TQL:

				
					CREATE OR REPLACE APPLICATION CDC_App USE EXCEPTIONSTORE TTL : '7d', RECOVERY 10 seconds ;
CREATE SOURCE OracleCDC USING Global.OracleReader ( 
  Username: 'cdcuser', 
  ConnectionURL: <your connection URL>, 
  Password: <your password>, 
  Password_encrypted: 'true', 
  Tables: <your tables>, 
  connectionRetryPolicy: 'retryInterval=30, maxRetries=3', 
  FilterTransactionBoundaries: true ) 
OUTPUT TO CDCStream PERSIST USING admin.CloudKafkaProperties;
END APPLICATION CDC_App;
				
			

This app will then contain a CQ or a Router to route the ‘CDCStream’ to multiple streams. For simplicity and to minimize event data, we recommend using the Router. Create a Router to route data from the persistent CDCStream to a dedicated stream for each downstream app that you will create. 
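
If you do use a CQ instead of a Router, a filtering continuous query per output stream plays the same role. Below is a minimal sketch reusing the table names from the Router example; the metadata-access syntax can vary by Striim version, so confirm it against the CQ documentation.

-- Illustrative CQ alternative to the Router: one filtering query per output stream.
CREATE CQ RouteCustomerTables
INSERT INTO CustomerTableGroupStream
SELECT * FROM CDCStream src
WHERE meta(src, "TableName").toString().equals("QATEST.CUSTOMERS")
   OR meta(src, "TableName").toString().equals("QATEST.ORDERS");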

The entire app will have the components below.

Sample TQL:

				
					CREATE OR REPLACE APPLICATION CDC_App USE EXCEPTIONSTORE TTL : '7d', RECOVERY 10 seconds ;
CREATE SOURCE OracleCDC USING Global.OracleReader ( 
  Username: 'cdcuser', 
  ConnectionURL: <your connection URL>, 
  Password: <your password>, 
  Password_encrypted: 'true', 
  Tables: <your tables>, 
  connectionRetryPolicy: 'retryInterval=30, maxRetries=3', 
  FilterTransactionBoundaries: true ) 
OUTPUT TO CDCStream PERSIST USING admin.CloudKafkaProperties;
CREATE STREAM CustomerTableGroupStream OF Global.WAEvent PERSIST USING admin.CloudKafkaProperties ;
CREATE STREAM DepartmentTableGroupStream OF Global.WAEvent PERSIST USING admin.CloudKafkaProperties;
CREATE OR REPLACE ROUTER event_router INPUT FROM CDCStream AS src 
CASE
    WHEN meta(src, "TableName").toString().equals("QATEST.DEPARTMENTS") OR 
         meta(src, "TableName").toString().equals("QATEST.EMPLOYEES") OR 
         meta(src, "TableName").toString().equals("QATEST.TASKS") THEN
        ROUTE TO DepartmentTableGroupStream,
    WHEN meta(src, "TableName").toString().equals("QATEST.CUSTOMERS") OR 
         meta(src, "TableName").toString().equals("QATEST.ORDERS") THEN
        ROUTE TO CustomerTableGroupStream;
END APPLICATION CDC_App;
				
			

Create apps for each downstream subscriber

Now you will create an app that reads data from the streams created by the CDC Reader App (CDC_App in the sample TQL).

				
					CREATE OR REPLACE APPLICATION CustomerTableGroup_App USE EXCEPTIONSTORE TTL : '7d', RECOVERY 10 seconds ;
CREATE TARGET WriteCustomerTable USING DatabaseWriter (
  connectionurl: 'jdbc:mysql://192.168.1.75:3306/mydb',
  Username:'striim',
  Password:'******',
  Tables: <your tables>
) INPUT FROM CDC_App.CustomerTableGroupStream;
END APPLICATION CustomerTableGroup_App;
				
			

Here we create a second application that reads from another of the CDC Reader App’s streams:

				
					CREATE OR REPLACE APPLICATION Departments_App USE EXCEPTIONSTORE TTL : '7d', RECOVERY 10 seconds ;
CREATE TARGET WriteDepartment USING DatabaseWriter (
  connectionurl: 'jdbc:mysql://192.168.1.75:3306/mydb',
  Username:'striim',
  Password:'******',
  Tables: <your tables>
) INPUT FROM CDC_App.DepartmentTableGroupStream;
END APPLICATION Departments_App;

				
			

You can repeat this pattern n-number of times for n-number of consumers. It’s also worth noting you can have multiple consumers (Striim apps with a Target component) on the same physical target data platform – e.g. multiple Striim Targets for a single Snowflake instance. For instance, you may split out the target for a few critical tables that require dedicated compute and runtime resources to integrate. 

Once you’ve designed these Striim applications, they will be independently recoverable and maintain their own upstream checkpointing with Striim’s best-in-class transaction watermarking capabilities. 

Designing the Striim Apps with Resource Sharing

To consolidate runtime and share resources between consumers, you can also bundle both Targets in the same Striim app. This means if one consumer goes down, it will halt the pipeline for all consumers. If your workloads are uniform and you have no operational advantages from decoupling, you can easily keep all the components in the same app.

You would simply create all components within one application.

				
					CREATE OR REPLACE APPLICATION CDC_App USE EXCEPTIONSTORE TTL : '7d', RECOVERY 10 seconds ;
CREATE SOURCE OracleCDC USING Global.OracleReader ( 
  Username: 'cdcuser', 
  ConnectionURL: <your connection URL>, 
  Password: <your password>, 
  Password_encrypted: 'true', 
  Tables: <your tables>, 
  connectionRetryPolicy: 'retryInterval=30, maxRetries=3', 
  FilterTransactionBoundaries: true ) 
OUTPUT TO CDCStream PERSIST USING admin.CloudKafkaProperties;
CREATE STREAM CustomerTableGroupStream OF Global.WAEvent PERSIST USING admin.CloudKafkaProperties;
CREATE STREAM DepartmentTableGroupStream OF Global.WAEvent PERSIST USING admin.CloudKafkaProperties;
CREATE OR REPLACE ROUTER event_router INPUT FROM CDCStream AS src 
CASE
    WHEN meta(src, "TableName").toString().equals("QATEST.DEPARTMENTS") OR 
         meta(src, "TableName").toString().equals("QATEST.EMPLOYEES") OR 
         meta(src, "TableName").toString().equals("QATEST.TASKS") THEN
        ROUTE TO DepartmentTableGroupStream,
    WHEN meta(src, "TableName").toString().equals("QATEST.CUSTOMERS") OR 
         meta(src, "TableName").toString().equals("QATEST.ORDERS") THEN
        ROUTE TO CustomerTableGroupStream;
CREATE TARGET WriteCustomer USING DatabaseWriter (
  connectionurl: 'jdbc:mysql://192.168.1.75:3306/mydb',
  Username:'striim',
  Password:'******',
  Tables: <your tables>
) INPUT FROM CDC_App.CustomerTableGroupStream;
CREATE TARGET WriteDepartment USING DatabaseWriter (
  connectionurl: 'jdbc:mysql://192.168.1.75:3306/mydb',
  Username:'striim',
  Password:'******',
  Tables: <your tables>
) INPUT FROM CDC_App.DepartmentTableGroupStream;
END APPLICATION CDC_App;

				
			

Grouping Tables

Striim provides exceptional flexibility for processing multi-variate workloads by allowing you to group different tables into their own apps and targets. As demonstrated above, you can either group tables into their own target within the same app (as in the resource-sharing example), or achieve fine-grained resource isolation by giving each table group its own target in its own Striim app. An added benefit is that you can still use a single source to feed the grouped tables. 

Here are some criteria for grouping tables:

  • Data freshness SLAs
    • Group tables into a Striim target based on data freshness SLA. You can configure and tune the Striim target’s batchpolicy/uploadpolicy based on the SLA
  • Data volumes
    • Group tables with similar transaction volumes into their own apps and targets. This will allow you to optimize the performance of delivering data to business consumers. 
  • Tables with relationships – such as Foreign Keys – should also be grouped together

After you group your tables into their respective targets based on the above criteria, you can apply heuristic-based tuning to get maximum performance, cost reduction, and service uptime. 

Tuning Pipelines for Performance

We recommend heuristic-based tuning of Striim applications to meet the business and technical performance requirements of your data pipelines. You can set your Striim Target adapter’s batch policy accordingly:

  1. Choose a data freshness SLA for the set of tables you are integrating to the target – defined as N.
  2. Measure the number of DMLs created by your source database during the timeframe of N – defined as M.

Tuning Change Data Capture Pipelines

When writing to data warehouse targets such as Google BigQuery, Snowflake, Microsoft Fabric, and Databricks, Striim’s target writers have a property called ‘BatchPolicy’ (or UploadPolicy) which defines the interval at which batches should be uploaded to the target systems. The BatchPolicy combines a time interval and a number of events, loading events to the target based on whichever criterion is met first.

Batches are created on a per-table basis, regardless of how many tables are defined in your Target’s ‘tables’ property. Using the above variables, set your BatchPolicy as follows:

Time interval: N/2 (where N is the data freshness SLA you are trying to achieve)

Events: M (the number of events produced in the N time period)

This means Striim will create batches to upload to your target either every N/2 time units or every M events, whichever comes first. Because these analytical systems are designed for efficient ingestion of large data volumes in periodic chunks, Striim recommends setting N/2 and M to values that maximize the write performance of the target system while meeting business SLAs and technical objectives.
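
As a worked illustration, suppose a table group has a 10-minute data freshness SLA (N = 600 seconds) and the source generates roughly 400,000 DMLs in that window (M = 400,000). The adapter, numbers, and exact property format below are hypothetical; confirm the precise BatchPolicy syntax in your target writer’s documentation.

-- Hypothetical values and property format; verify against your writer's documentation.
-- N = 600 s  =>  time interval = N/2 = 300 s;  M = 400,000 events per window.
-- Connection properties are omitted for brevity.
CREATE TARGET WriteOrdersToBigQuery USING BigQueryWriter (
  Tables: 'QATEST.ORDERS,mydataset.ORDERS',
  BatchPolicy: 'EventCount: 400000, Interval: 300s'
) INPUT FROM CustomerTableGroupStream;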

Slower-than-expected upload performance often occurs when batches from various tables are enqueued faster than they are dequeued. If jobs are enqueued as batch policies expire and job processing does not keep up, the queue grows and fails to sustain workload demands. This reduced write performance results in delays (lag) in data availability and can lead to reliability issues such as out-of-memory (OOM) conditions or API service contention, especially during streaming uploads.

Excessive enqueueing of batches can also occur for reasons other than target performance:

  1. Rapid Batch Policy Expiration:  A common cause of rapid expiration is setting a low EventCount in the BatchPolicy. The batch policy would expire and enqueue target loads faster than the time required for upload and merge operations, leading to an increasing backlog. To mitigate this, adjust the EventCount and Interval values in the BatchPolicy.
  2. High Source Input Rate: An elevated rate of incoming data can intensify upload contention. A diverse group of high-frequency tables may require multiple target writers. Each additional writer increases the number of job-processing compute threads and effectively increases the available load-processing queues servicing the target system. While beneficial for compute parallelism, this may increase quota usage and/or resource utilization of the target data system. 
  3. High Average Wait Time in Queue: Reducing the number of batches by increasing batch size and extending time intervals can also be effective in alleviating these bottlenecks and enhancing overall system performance.

To monitor and address bottlenecked uploads causing latency, processing delays, and OOM errors, focus on the following monitoring metrics:

  • Set alerts for SourceIdle and TargetIdle, customized to reflect your data freshness SLA of n.
  • Run the ‘LEE <source> <target>’ command
    • If LEE is within n, then you may be meeting your freshness SLA. However, it is still possible that your last transaction read watermark is greater than n; in that case, validate with the steps below.
  • Run the ‘Mon <source>’ command
    • If the current time minus ‘Reader Last Timestamp’ is less than n, then you may be meeting your freshness SLAs. However, if that number is greater than n, start triaging the rest of the pipeline with the steps below.
  • Run ‘Mon <target>’ command
    • The following metrics are critical for diagnosing issues; the time it takes to load data is heavily influenced by them:
  • Avg Integration Time in ms – Indicates the average time for batches to successfully write to the target based on current runtime performance
  • Avg Waiting Time in Queue in ms – Indicates the average time a batch is queued for integration to the target
  • Total Batches Queued – Indicates the number of batches queued for integration to the target
  • Batch Accumulation Time – The time it takes for a batch to hit the expiration threshold
    • Update your BatchPolicy’s time interval value. This value should be equal to ‘Avg Integration Time in ms’ multiplied by the total number of tables in the writer.
    • If a significant number of batches are queuing up (per Total Batches Queued) and ‘Avg Integration Time in ms’ is greater than n, increase the BatchPolicy interval value. Test again with incremental values until you are consistently meeting your SLAs.
      • If your target is Snowflake, keep in mind that file upload sizes should stay between 100 and 250 MB before copy performance begins to degrade.
    • An alternative to increasing batch sizes is splitting the tables into another target and applying the same heuristic. Creating a Target dedicated to high-volume tables allocates dedicated compute to those tables. However, if you hit CPU bottlenecks at the Striim service level, you may need to increase your Striim service size.
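
Putting these checks together, a triage pass in the Striim console might look like the sketch below, using component names from the sample applications earlier in this guide; exact figures and output will differ per deployment.

LEE OracleCDC WriteCustomerTable
Mon OracleCDC
Mon WriteCustomerTable

If LEE and the time since ‘Reader Last Timestamp’ are both within n, the pipeline is meeting its freshness SLA. If ‘Avg Integration Time in ms’ and ‘Total Batches Queued’ keep growing, adjust the BatchPolicy or split high-volume tables into a dedicated target as described above.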

Note: If a Striim Target is a data warehouse writer that uses the ‘optimized merge’ setting and the source generates many Primary Key (PK) updates, each update will be batched separately and write performance will decline.

FAQ

Q: What if I have tables with different data delivery SLAs and priorities? For example, my Orders table has a 1-minute SLA and my Store Locations table has a 1-hour SLA. 

A: You can use a Router to split the CDC Reader’s Persistent Stream into multiple persistent streams, then create a Striim app with its own target for each set of tables with varying SLAs. 

Q: What if I don’t have the Persistent Streams add-on in Striim? Can I follow these steps?

A: You can still minimize overhead on your source database by running one reader per database, but you will need to design a solution to handle recovery across apps if you have multiple consumers or prepare to do full re-syncs in the event of a transient failure.  

Conclusion

Optimizing CDC in Striim involves leveraging key components like Persistent Streams, Routers, and Continuous Queries. The ‘Read Once, Stream Anywhere’ pattern, facilitated by Persistent Streams, ensures efficient data distribution, integrity, and recovery across application boundaries. This approach is essential for effective real-time data integration and analytics in modern business environments.

Enhancing Emergency Room Efficiency with Real-Time Data Analytics

In emergency rooms, where the stakes are highest and every second counts, having access to real-time patient data is not just a convenience—it’s a life-saving necessity. The ability to instantly process and act on critical information can drastically improve patient outcomes and, in many cases, make the difference between life and death. 

Given the fast-paced and unpredictable nature of emergency care, real-time data is a cornerstone of effective decision-making, resource allocation, and patient management. Here’s what you need to know about increasing ER efficiency with the help of real-time data analytics.

The Critical Role of Real-Time Data in Emergency Rooms

Emergency rooms are high-stakes environments where speed, accuracy, and resource management are paramount. However, managing large volumes of patient data, especially across multiple disconnected systems, can prove a cumbersome challenge. With the increasing complexity of patient care and the demand for faster, more precise decision-making, the integration of real-time data is a game-changer in improving emergency room operations.

Real-time data enables healthcare providers to gain immediate insights into patient conditions, providing a clearer picture of care needs and resource availability. This immediate access to data supports timely decisions, helps prioritize care based on urgency, and ensures that resources such as staff and medical equipment are optimally allocated.

Improved Decision-Making with Interactive Dashboards

In emergency rooms, clinicians must make critical decisions quickly. Real-time, interactive dashboards offer healthcare teams a dynamic view of patient conditions, available resources, and key operational metrics. These dashboards present data in a way that not only tracks patient flow but also reflects the real-time status of hospital resources like beds, staff availability, and medical equipment, providing healthcare practitioners with the information necessary to make the best decision — all in real time. 

Instead of having to wait for reports or updates from other departments, healthcare organizations have the information they need the moment they need it. Better yet, the data isn’t outdated as it would be with batch processing. With live data at their fingertips, clinicians can prioritize patient care more effectively and coordinate efforts across departments to reduce delays.

Streamlining Communication for Better Collaboration

Another way real-time data analytics enhance emergency room response is through improved communication. Effective communication is key in emergency rooms, where teams of healthcare professionals must work together seamlessly to deliver rapid care. However, without timely data, communication can break down, leading to mistakes and delays. Real-time data integration enhances communication by ensuring that all team members have immediate access to relevant, up-to-date information.

Whether it’s coordinating care with other departments or updating patients on their status, real-time insights allow for better collaboration, enabling healthcare providers to respond quickly and appropriately to changing patient needs.

Optimizing Resource Allocation and Workflow Efficiency

With healthcare facilities facing staffing shortages and growing patient numbers, optimizing resource allocation has never been more important. Real-time data integration allows hospitals to monitor resources in real-time, ensuring that staff, equipment, and treatment areas are allocated where they are needed most.

By leveraging real-time data, hospitals can dynamically adjust their operations to match real-time patient volumes and needs. For example, bed availability and staffing levels can be adjusted as patient conditions evolve, helping to reduce wait times, improve patient care, and prevent overcrowding in emergency departments.

Looking Ahead: The Future of Real-Time Healthcare

The potential of real-time data in healthcare extends far beyond emergency rooms. From pharmacy order monitoring to proactive management of chronic conditions, the benefits of real-time data are transformative for all areas of healthcare. This level of data integration allows for more personalized care, faster treatments, and improved operational efficiency, contributing to both better patient outcomes and a more streamlined healthcare system overall.

Better Data, Better Patient Outcomes 

The ability to integrate and act on real-time data in emergency rooms is not a luxury—it’s a necessity for providing high-quality, patient-centered care. As healthcare systems continue to evolve, embracing real-time data analytics will be crucial in ensuring that hospitals can meet the demands of a modern, fast-paced healthcare environment. This technology not only enables immediate response times but also lays the groundwork for a more efficient, responsive, and patient-focused healthcare system.

Ready to discover how Striim can help your healthcare organization enhance emergency room efficiency and more? Get a demo today.

 
