Irem Radzik


Getting Started with Real-Time ETL to Azure SQL Database

Running production databases in the cloud has become the new norm. For us at Striim, real-time ETL to Azure SQL Database and other popular cloud databases has become a common use case. Striim customers run critical operational workloads in cloud databases and rely on our enterprise-grade streaming data pipelines to keep their cloud databases up-to-date with existing on-premises or cloud data sources.

Striim supports your cloud journey starting with the first step. In addition to powering fully connected hybrid and multi-cloud architectures, the streaming data integration platform enables cloud adoption by minimizing risks and downtime during data migration. When you can migrate your data to the cloud without database downtime or data loss, it is easier to modernize your mission-critical systems. And when you liberate data trapped in legacy databases and stream it to Azure SQL DB with sub-second latency, you can run high-value operational workloads in the cloud and drive business transformation faster.

Streaming Integration from Oracle to Azure SQL DB

Building continuous, streaming data pipelines from on-premises databases to production cloud databases for critical workloads requires a secure, scalable, and reliable integration solution. Especially if you have enterprise database sources that cannot tolerate performance degradation, traditional batch ETL will not suffice. Striim’s low-impact change data capture (CDC) feature minimizes overhead on the source systems while moving database operations (inserts, updates, and deletes) to Azure SQL DB in real time with security, reliability, and transactional integrity.

Striim is available as a PaaS offering in major cloud marketplaces such as Microsoft Azure Cloud, AWS, and Google Cloud. You can run Striim in the Azure Cloud to simplify real-time ETL to Azure SQL Database and other Azure targets, such as Azure Synapse Analytics, Azure Cosmos DB, Event Hubs, ADLS, and more. The service includes heterogeneous data ingestion, enrichment, and transformation in a single solution before delivering the data to Azure services with sub-second latency. What users love about Striim is that it offers a non-intrusive, quick-to-deploy, and easy-to-iterate solution for streaming data integration into Azure.

To illustrate the ease of use of Striim and to help you get started with your cloud database integration project, we have prepared a Tech Guide: Getting Started with Real-Time Data Integration to Microsoft Azure SQL Database. You will find step-by-step instructions on how to move data from an on-premises Oracle Database to Azure SQL Database using Striim’s PaaS offering available in the Azure Marketplace. In this tutorial you will see how Striim’s log-based CDC enables a solution that doesn’t impact your source Oracle Database’s performance.

If you have, or plan to have, Azure SQL Databases that run operational workloads, I highly recommend that you use a free trial of Striim along with this tutorial to find out how fast you can set up enterprise-grade, real-time ETL to Azure SQL Database. On our website you can find additional tutorials for different cloud databases. So be sure to check out our other resources as well. For any streaming integration questions, please feel free to reach out.

Mitigating Data Migration and Integration Risks for Hybrid Cloud Architecture


Cloud computing has transformed how businesses use technology and drive innovation for improved outcomes. However, the journey to the cloud, which includes data migration from legacy systems, and integration of cloud solutions with existing systems, is not a trivial task. There are multiple cloud adoption risks that businesses need to mitigate to achieve the cloud’s full potential.


Common Risks in Data Migration and Integration to Cloud Environments

In addition to data security and privacy, there are additional concerns and risks in cloud migration and integration. These include:

Downtime: The bulk data loading technique, which takes a snapshot of the source database, requires you to lock the legacy database to preserve the consistent state. This translates to downtime and business disruption for your end users. While this disruption can be acceptable for some of your business systems, the mission-critical ones that need modernization are typically the ones that cannot tolerate even planned downtime. And sometimes, planned downtime extends beyond the expected duration, turning into unplanned downtime with detrimental effects on your business.

Data loss: Some data migration tools might lose or corrupt data in transit because of a process failure or network outage. Or they may fail to apply the data to the target system in the right transactional order. As a result, your cloud database ends up diverging from the legacy system, also negatively impacting your business operations.

Inadequate Testing: Many migration projects operate under tense time pressures to minimize downtime, which can lead to a rushed testing phase. When the new environment is not tested thoroughly, the end result can be an unstable cloud environment. Certainly, not the desired outcome when your goal is to take your business systems to the next level.

Stale Data: Many migration solutions focus on the “lift and shift” of existing systems to the cloud. While it is a critical part of cloud adoption, your journey does not end there. Having a reliable and secure data integration solution that keeps your cloud systems up-to-date with existing data sources is critical to maintaining your hybrid cloud or multi-cloud architecture. Working with outdated technologies can lead to stale data in the cloud and create delays, errors, and other inefficiencies for your operational workloads.


Upcoming Webinar on the Role of Streaming Data Integration for Data Migration and Integration to Cloud

Streaming data integration is a new approach to data integration that addresses the multifaceted challenges of cloud adoption. By combining bulk loading with real-time change data capture technologies, it minimizes downtime and risks mentioned above and enables reliable and continuous data flow after the migration.
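The combined approach can be sketched in a few lines. The following Python sketch is purely illustrative (hypothetical names and data shapes, not Striim's API): while the bulk snapshot is loaded, CDC captures changes committing on the source; once the load completes, the captured changes are applied in order so the target converges with the source.

```python
# Illustrative sketch of combining an initial bulk load with change data
# capture (CDC). All names and structures here are hypothetical.

def migrate(source_rows, change_log, target):
    """Bulk-load a snapshot, then apply changes captured during the load."""
    # Phase 1: bulk load the snapshot while the source stays open for writes.
    for key, value in source_rows.items():
        target[key] = value
    # Phase 2: replay changes captured by CDC during (and after) the load,
    # in commit order, so the target converges with the source.
    for op, key, value in change_log:
        if op in ("INSERT", "UPDATE"):
            target[key] = value
        elif op == "DELETE":
            target.pop(key, None)
    return target

snapshot = {1: "alice", 2: "bob"}
# Changes that committed on the source while the bulk load was running.
changes = [("UPDATE", 2, "bobby"), ("INSERT", 3, "carol"), ("DELETE", 1, None)]
print(migrate(snapshot, changes, {}))   # {2: 'bobby', 3: 'carol'}
```

Because the change log is replayed rather than the source being locked, the legacy database stays open for transactions throughout the migration.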

Striim - Data Migration to Cloud

In our next live, interactive webinar, Cloud Adoption: How Streaming Data Integration Minimizes Risks, we dive into this topic. Our Co-Founder and CTO, Steve Wilkes, will present practical ways you can mitigate data migration risks and handle integration challenges for cloud environments. Striim’s Solution Architect, Edward Bell, will walk you through a live demo of zero-downtime data migration and continuous streaming integration to major cloud platforms, such as AWS, Azure, and Google Cloud.

I hope you can join this live, practical presentation on Thursday, May 7th, at 10:00 AM PT / 1:00 PM ET to learn more about how to:

  • Reduce migration downtime and data loss risks, as well as allow unlimited testing time of the new cloud environment.
  • Set up streaming data pipelines in just minutes to reliably support operational workloads in the cloud.
  • Handle strict security, reliability, and scalability requirements of your mission-critical systems with an enterprise-grade streaming data integration platform.

Until we see you at the webinar, and afterward, please feel free to reach out to get a customized Striim demo for cloud data migration and integration that supports your specific IT environment.


Striim 3.9.8 Adds Advanced Security Features for Cloud Adoption


We are pleased to announce the general availability of Striim 3.9.8 with a rich set of features that span multiple areas, including advanced data security, enhanced development productivity, data accountability, performance and scalability, and extensibility with new data targets.


Let’s review the key themes and features of the new release starting with the security topic.

Advanced Platform and Adapter Security:

With a sharp focus on business-critical systems and use cases, the Striim team has been boosting the platform’s security features for the last several years. Version 3.9.8 introduces a broad range of advanced security features to both the platform and its adapters, providing users with robust security for the end-to-end solution and greater control over managing data security.

The new platform security features include the following components:

  • Striim KeyStore, a secure, centralized repository based on the Java KeyStore for storing passwords and encryption keys, streamlines security management across the platform.
  • Ultra-secure algorithms for user password encryption across all parts of the platform reduce the platform’s vulnerability to external or internal breaches.
  • Stronger encryption for inter-node cluster communication, with an internally generated long password and unified security management for all nodes and agents.
  • Multi-layered application security via advanced support for exporting and importing pipeline applications within the platform. In Striim, all password properties of an application are encrypted using their own keys. When exporting applications containing passwords or other encrypted property values, you can now add a second level of encryption with a passphrase that will be required at import time, strengthening application security.
  • Encryption support using customer-provided keys for securing permanent files via the File Writer, and intermediate temporary files via the Google Cloud Storage Writer. Supported encryption algorithms include RSA, AES, and PGP. You can generate keys with widely available tools or an in-house Java program, and easily configure the adapters’ encryption settings via the Encryption Policy property in the UI.

Overall, these new security features enable:

  • Enhanced platform and adapter security for hybrid cloud deployments and mission-critical environments
  • Strengthened end-to-end data protection from ingestion to file delivery
  • Enhanced compliance with strict security policies and regulations
  • Secured application sharing between platform users

Improved Data Accountability:

Striim version 3.9.8 includes an application-specific exception store for events discarded by the application. The feature allows viewing discarded records and their details in real time, and you can configure it with a simple on/off option when building an application. With this feature, Striim improves accountability for all data passing through the platform and lets users build applications to replay and process discarded records.
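The idea behind an exception store can be illustrated with a minimal Python sketch (hypothetical record format and structure, not Striim's implementation): records that fail processing are kept with their error details instead of silently disappearing, so they can be inspected and replayed later.

```python
# Illustrative sketch of an exception store: records the pipeline discards
# are kept with their error details so they can be inspected and replayed.
# This mimics the concept only; it is not Striim's implementation.

exception_store = []

def process(record):
    """Parse a hypothetical 'key:value' record; route bad ones to the store."""
    try:
        key, value = record.split(":")
        return (key, int(value))
    except ValueError as err:
        exception_store.append({"record": record, "reason": str(err)})
        return None

results = [r for r in map(process, ["a:1", "bad-record", "b:2"]) if r]
print(results)                          # [('a', 1), ('b', 2)]
print(exception_store[0]["record"])     # bad-record
```

Replaying is then just re-running the stored records through a corrected pipeline.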

Enhanced Application Development Support and Ease of Use

The new release also includes features that accelerate and ease developing integration applications, especially in high-volume data environments.

  • A New Enrichment Transformer: Expanding the existing library of out-of-the-box transformers, the new enrichment transformer allows you to enrich your streaming data in-flight without any manual coding. You only need Striim’s drag-and-drop UI to create a real-time data pipeline that performs in-memory data lookups. With this transformer you can, for example, add City Name and County Name fields to an event containing a Zip Code.

  • External Lookups: Striim provides an in-memory data cache to enrich data in-flight at very high speeds. With the new release, Striim gives you the option to enrich data with lookups from external data stores. The platform can now execute a database query to fetch data from an external database and return the data as a batch. The external lookup option helps users avoid preloading data into the Striim cache, which is especially beneficial for lookups from, or joins with, large data sets. External lookups also eliminate the need for a cache refresh, since the data is fetched from the external database. External lookups are supported for all major databases, including Oracle, SQL Server, MySQL, PostgreSQL, and HPE NonStop.
  • The Option to Use Sample Data for Continuous Queries: With this ability, Striim reduces the data required for computation or for displaying results on dashboards. You can choose to use only a portion of your streaming data for the application if your use case can benefit from this approach. As a result, it increases the speed of computing and displaying results, especially when working with very large data volumes.
  • Dynamic Output Names for Writers: The Striim platform now makes it easy to organize and consume files and objects on the target system by giving flexible options for naming them. Striim file and object output names can include data, metadata, and user data field values from the source event. This dynamic output naming feature is available for the following targets: Azure Data Lake Store Gen 1 and Gen 2, Azure Blob Storage, Azure File Storage, Google Cloud Storage, Apache HDFS, and Amazon S3.
  • Event-Augmented Kafka Message Header: Starting with Apache Kafka 0.11, Striim 3.9.8 introduces a new property called MessageHeader that enriches the Kafka message header with a mix of the event’s dynamic and static values before delivering with sub-second latency. With the help of this additional contextual information, downstream consumer applications can rapidly determine how to use the messages arriving via Striim.
  • Simplified User Experience: The new UI for configuring complex adapter properties, such as rollover policy, flush policy, and encryption policy, speeds new application development.

  • New Sample Application for Real-Time Dashboards: Striim version 3.9.8 adds a new sample dashboard application that uses real-time data from the Meetup website and displays details of meetup events happening around the globe, demonstrating the Vector Map visualization.
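In-flight enrichment with an in-memory lookup, like the zip-code example mentioned above, boils down to joining each streaming event against cached reference data. The following Python sketch illustrates the concept with made-up data; it is not Striim's cache component or query syntax.

```python
# Illustrative sketch of in-flight enrichment with an in-memory lookup
# cache. Data and field names are hypothetical.

# Preloaded lookup cache (in Striim terms, a cache or an external lookup).
zip_lookup = {"94301": ("Palo Alto", "Santa Clara"),
              "10001": ("New York", "New York")}

def enrich(event, cache):
    """Join a streaming event with reference data held in memory."""
    city, county = cache.get(event["zip"], (None, None))
    return {**event, "city": city, "county": county}

event = {"order_id": 17, "zip": "94301"}
print(enrich(event, zip_lookup))
# {'order_id': 17, 'zip': '94301', 'city': 'Palo Alto', 'county': 'Santa Clara'}
```

An external lookup follows the same shape, except the `cache.get` call is replaced by a query against an external database, which avoids preloading and cache refreshes.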

Other platform improvements for ease of use and manageability include:

  • The Open Processor component, which allows bringing external code into the Striim platform, can be loaded and unloaded dynamically without having to restart Striim.
  • The Striim REST API allows safely deleting or post-processing the files processed by the Striim File Reader.
  • The Striim REST API for application monitoring reports consolidated statistics of various application components within a specified time range.

Increased Performance and Scalability:

To further improve performance and scalability, the release includes multiple features, including dynamic partitioning and performance fine-tuning for writers:

  • Dynamic Partitioning with a Higher Level of Control: Partitions allow parallel processing of the events in a stream by splitting them across multiple servers in the deployment. Striim’s partitioning distributes events dynamically at run time across server nodes in a cluster, enabling high performance and easy scalability. In prior releases, Striim used one or more fields of the events in the stream as the partitioning key. In the new release, users have additional, flexible options for distributing and processing large data volumes in streams or windows. Striim 3.9.8 allows the partitioning key to be one or more expressions composed from the fields of the events in the stream. This flexible partitioning enables load balancing for applications that are deployed on multi-node clusters and process large data volumes. Window-based partitioning enables grouping the events in windows that can, for example, be consumed by specific downstream writers. As a result, you can load-balance across multiple writers to improve writing performance.
  • Writer Fine-Tuning Options: Striim 3.9.8 now offers the ability to configure the number of parallel threads for writing into the target system and simplifies writer configuration for achieving even higher throughput from the platform. The fine-tuning option is available for the following list of writers at this time: Azure Synapse Analytics and Azure SQL Data Warehouse, Google BigQuery, Google Cloud Spanner, Azure Cosmos DB, Apache HBase, Apache Kudu, MapR Database, Amazon Redshift, and Snowflake.
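The essence of expression-based partitioning described above can be sketched in a few lines of Python (hypothetical event fields and routing, not Striim's syntax): the partition key is computed from an expression over event fields, and its hash decides which node processes the event.

```python
# Illustrative sketch of expression-based partitioning: the partition key is
# an expression over event fields (not just a single field), and its hash
# decides which node processes the event. Hypothetical, not Striim's syntax.
NUM_NODES = 3

def partition(event, key_expr):
    """Route an event to a node based on an expression over its fields."""
    key = key_expr(event)
    return hash(key) % NUM_NODES

events = [{"region": "EU", "store": 7}, {"region": "US", "store": 7}]
# Key expression combining two fields, e.g. region plus store id.
key_expr = lambda e: f'{e["region"]}-{e["store"]}'
nodes = [partition(e, key_expr) for e in events]
print(all(0 <= n < NUM_NODES for n in nodes))   # True
```

Because events with the same key expression always hash to the same node, related events stay together while the overall load spreads across the cluster.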

Increased Extensibility with New Data Targets

  • The Striim platform now supports SAP HANA as a target with direct integration. SAP HANA customers can now stream real-time data from a diverse set of sources into the platform with in-flight, in-memory data processing. With the availability of real-time data pipelines to SAP HANA, deployed on-premises or in the cloud, customers can rapidly develop time-sensitive analytics applications that transform their business operations.
  • Expanded HTTP Reader capabilities to send custom responses back to the requestor. The HTTP Reader can now defer responding until events reach a corresponding HTTP Writer. This feature enables users to build REST services using Striim.

Other extensibility improvements are:

  • Improved support for handling special characters for table names in Oracle and SQL Server databases
  • Hazelcast Writer supports multi-column primary keys to enable more complex Hot Cache use cases
  • Performance improvement options for the SQL Server CDC Reader

These are only a portion of the new features of Striim 3.9.8. There is more to discover. If you would like to learn more about the new release, please reach out to schedule a demo with a Striim expert.

The Top 4 Use Cases for Streaming Data Integration: Whiteboard Wednesdays

Today we are talking about the top four use cases for streaming data integration. If you’re not familiar with streaming data integration, please check out our channel for a deeper dive into the technology. In this 7-minute video, let’s focus on the use cases.

Use Case #1 Cloud Adoption – Online Database Migration

The first one is cloud adoption – specifically online database migration. When you have your legacy database and you want to move it to the cloud and modernize your data infrastructure, if it’s a critical database, you don’t want to experience downtime. The streaming data integration solution helps with that. When you’re doing an initial load from the legacy system to the cloud, the Change Data Capture (CDC) feature captures all the new transactions happening in this database as they happen. Once this database is loaded and ready, all the changes that happened in the legacy database can be applied in the cloud. During the migration, your legacy system is open for transactions – you don’t have to pause it.

While the migration is happening, CDC helps you to keep these two databases continuously in-sync by moving the real-time data between the systems. Because the system is open to transactions, there is no business interruption. And if this technology is designed for both validating the delivery and checkpointing the systems, you will also not experience any data loss.

Because this cloud database has production data, is open to transactions, and is continuously updated, you can take your time to test it before you move your users. So you have basically unlimited testing time, which helps you minimize your risks during such a major transition. Once the system is completely in-sync and you have checked and tested it, you can point your applications to it and run on your cloud database.

This is a single switch-over scenario. But streaming data integration gives you the ability to move the data bi-directionally. You can have both systems open to transactions. Once you test this, you can run some of your users in the cloud and some of your users in the legacy database.

All the changes happening with these users can be moved between databases, synchronized so that they’re constantly in-sync. You can gradually move your users to the cloud database to further minimize your risk. Phased migration is a very popular use case, especially for mission-critical systems that cannot tolerate risk and downtime.

Use Case #2 Hybrid Cloud Architecture

Once you’re in the cloud and you have a hybrid cloud architecture, you need to maintain it. You need to connect it with the rest of your enterprise. It needs to be a natural extension of your data center. Continuous real-time data movement with streaming data integration allows you to have your cloud databases and services as part of your data center.

The important thing is that these workloads in the cloud can be operational workloads because there’s fresh (i.e., continuously updated) information available. Your databases, your machine data, your log files, your other cloud sources, messaging systems, and sensors can move data continuously to enable operational workloads.

What do we see in hybrid cloud architectures? Heavy use of cloud analytics solutions. If you want operational reporting or operational intelligence, you want comprehensive data delivered continuously so that you can trust it’s up-to-date, and gain operational intelligence from your analytics solutions.

You can also connect your data sources with the messaging systems in the cloud to support event distribution for your new apps that you’re running in the cloud so that they are completely part of your data center. If you’re adopting multi-cloud solutions, you can again connect your new cloud systems with existing cloud systems, or send data to multiple cloud destinations.

Use Case #3 Real-Time Modern Applications

A third use case is real-time modern applications. Cloud is a big trend right now, but not everything is necessarily in the cloud. You can have modern applications on-premises. So, if you’re building any real-time, modern application that needs timely information, you need continuous, real-time data pipelines. Streaming data integration enables you to run real-time apps with real-time data.

Use Case #4 Hot Cache

Last, but not least, when you have an in-memory data grid to help with your data retrieval performance, you need to make sure it is continuously up-to-date so that you can rely on that data – it’s something that users can depend on. If the source system is updated, but your cache is not updated, it can create business problems. By continuously moving real-time data using CDC technology, streaming data integration helps you to keep your data grid up-to-date. It can serve as your hot cache to support your business with fresh data.
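Keeping a hot cache fresh with CDC amounts to applying the source database's change events to the in-memory grid as they occur, so reads never return stale values. Here is a minimal Python sketch of the idea, with hypothetical event shapes rather than any real data-grid API.

```python
# Illustrative sketch of keeping a hot cache in sync with its source via
# CDC events, so reads hit fresh data. Event shapes are hypothetical.

def apply_cdc(cache, change_events):
    """Apply source-database changes to the in-memory data grid."""
    for op, key, value in change_events:
        if op in ("INSERT", "UPDATE"):
            cache[key] = value
        elif op == "DELETE":
            cache.pop(key, None)

cache = {"sku-1": 9.99}
# Changes captured from the source database's log.
apply_cdc(cache, [("UPDATE", "sku-1", 8.49), ("INSERT", "sku-2", 3.00)])
print(cache)   # {'sku-1': 8.49, 'sku-2': 3.0}
```

Because the change stream carries deletes as well as inserts and updates, the cache tracks the source exactly instead of drifting out of date.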


To learn more about streaming data integration use cases, please visit our Products section, schedule a demo with a Striim expert, or download the Striim platform to get started.


On-Premises-to-Cloud Migration: How to Minimize the Risks

On-premises-to-cloud migration is the necessary first step to cloud adoption, which offers a fast lane to data infrastructure modernization, innovation, and the ability to rapidly transform business operations. But many companies still restrict themselves to using the cloud for non-critical projects, rather than mission-critical operations, out of concern over the difficulties and the risks of migration. Are you one of them? If so, read on to discover a new approach that addresses critical data migration challenges.

Common Risks of On-Premises-to-Cloud Migration

A major component of the cloud migration effort is data migration from existing legacy databases. Many data migration solutions require you to lock the legacy database to preserve the consistent state after a snapshot is taken.

Depending on the size of the database, network bandwidth, and required transformations, the whole process of loading the data to the cloud, restoring the database, and testing the new system can take days, weeks, or even months. I am not aware of any digital business that could tolerate locking databases that support critical business operations for such an extended time.

In addition, you run the risk of having a database with an inconsistent state after the migration process. Some solutions might lose data in transit because of a process failure or network outage. Or the data might not be applied to the target system in the right transactional order. As a result, your cloud database winds up diverging from the source legacy system.

To ensure that the new environment is stable, you have to test the new system thoroughly before moving all your users over. Time pressures to minimize downtime can lead to rushed testing, which in turn results in an unstable cloud environment after you do a big bang switchover. Certainly, not the goal of your modernization effort!

It is no wonder that, with all these risks and disruptions to operations, the systems that should move to the cloud as the top priority – because they can bring the greatest positive impact for business transformation – end up being de-prioritized in favor of less risky migrations. As a result, your organization may fail to extract full value from its cloud investment, limiting the speed of innovation and modernization.

Mitigating the Risks of On-Premises-to-Cloud Migration

Here comes the good news that I love sharing: Today, newer, more sophisticated streaming data integration with change data capture technology minimizes disruptions and risks mentioned earlier. This solution combines initial batch load with real-time change data capture (CDC) and delivery capabilities.

As the system performs the bulk load, the CDC component collects the changes in real time as they occur. As soon as the initial load is complete, the system applies the changes to the target environment to keep the legacy and cloud databases consistent.

Let’s review how the streaming data integration approach tackles each of the risks that delay your business from getting the full benefits of your cloud investments.

Eliminating Database Downtime

Combining bulk load with CDC removes the need to pause the legacy database. During the bulk load process, your database is open to any new transactions. All new transactions are immediately captured and applied to the target as soon as the bulk load is complete, keeping the two systems in-sync.

The only downtime for the migration process occurs during the application switchover process. Therefore, this configuration enables zero database downtime during on-premises-to-cloud migration.

Striim - Cloud Migration

Avoiding Data Loss

To prevent data loss throughout the data migration process, streaming data integration tracks data movement and processing. Striim’s streaming data integration platform provides delivery validation that all your data has been moved to the target.

Also, with built-in exactly once processing (E1P), the software platform can avoid data duplicates. Striim’s CDC offering is designed to maintain the transaction integrity (i.e., ACID properties) during the real-time data movement so the target database remains consistent with the source.

Thorough Testing Without Time Limitation

Because CDC keeps up with transactions happening in the legacy system during and after the initial load, your team can take the time necessary to thoroughly test the new system before moving users. Having live production data in the cloud database, combined with unlimited testing time, provides the comprehensive assessments and assurances that many mission-critical systems need for such a significant transition.

Fallback Option

After the switchover, performing reverse real-time data movement from the cloud database back to the legacy database enables you to keep the legacy system up-to-date with the new transactions taking place in the cloud. In short, if necessary, you have a fallback option to put everyone back on the old system as you troubleshoot any issues in the new system.

During this troubleshooting and retesting time, the CDC process can be set up to collect the new transactions happening in the legacy database to bring the cloud database to a consistent state with the legacy system. You can point the application to the cloud database, once again, after testing thoroughly.

Phased Migration with Parallel Use

A more complex but highly effective approach to further facilitate your risk mitigation and thorough testing is a gradual migration. Bi-directional real-time data replication is your solution to keep both the cloud and the on-premises legacy systems in-sync while they are both open to transactions and support the application.

Striim - Bi-Directional Replication for Phased Cloud Migration

You can move some users to the new system and leave others in the old, running both in parallel as you test the new system. As testing with the production workload progresses as planned, you can add new users in a phased, gradual manner that minimizes risks.

Migration Is Only the First Step

Streaming data integration is not only for on-premises-to-cloud migration. Once you have the cloud database in production, you perform ongoing integration with relevant data sources and applications across the enterprise, including in other clouds.

Striim is designed for continuous data integration to support your hybrid cloud architecture with stream processing capabilities, as well. When you use a single cloud integration solution for both the database migration and ongoing data integration, you minimize development efforts, shorten the learning curve, and reduce risks with simplified solution architecture.

With strong partnerships with leading cloud vendors, Striim offers proven solutions that minimize your risks during data migration and ongoing integration. To learn more about how Striim can help with your on-premises-to-cloud migration, I invite you to schedule a brief demo with a Striim technologist.


Log-Based Change Data Capture: the Best Method for CDC

Change data capture, and in particular log-based change data capture, has become popular in the last two decades as organizations have discovered that sharing real-time transactional data from OLTP databases enables a wide variety of use cases. The fast adoption of cloud solutions requires building real-time data pipelines from in-house databases to ensure the cloud systems are continually up to date. Turning enterprise databases into a streaming source, without the constraints of batch windows, lays the foundation for today’s modern data architectures. In this blog post, I would like to discuss Striim’s CDC capabilities, along with the unique features that enhance change data capture, processing, and delivery across a wide range of sources and targets.

Log-Based Change Data Capture

In our blog post about Change Data Capture, we explained why log-based change data capture is a better method to identify and capture change data. Striim uses the log-based CDC technique for the same reasons we stated in that post: log-based CDC minimizes the overhead on the source systems, reducing the chances of performance degradation. In addition, it is non-intrusive: it does not require changes to the application, such as adding triggers to tables. It is a lightweight yet highly performant way to ingest change data. While Striim reads DML operations (INSERTs, UPDATEs, DELETEs) from the database logs, these systems continue to run with high performance for their end users.

Striim’s strengths for real-time CDC are not limited to the ingestion point. Here are a few capabilities of the Striim platform that build on its real-time, log-based change data capture in enabling robust, end-to-end streaming data integration solutions:

Log-based CDC from heterogeneous databases for non-intrusive, low-impact real-time data ingestion

Striim uses log-based change data capture when ingesting from major enterprise databases including Oracle, HPE NonStop, MySQL, PostgreSQL, and MongoDB, among others. It minimizes CPU overhead on sources, does not require application changes, and avoids substantial management overhead to maintain the solution.

Ingestion from multiple, concurrent data sources to combine database transactions with semi-structured and unstructured data

Striim’s real-time data ingestion is not limited to databases and the CDC method. With Striim you can merge real-time transactional data from OLTP systems with real-time log data (i.e., machine data), messaging systems’ events, sensor data, NoSQL, and Hadoop data to obtain rich, comprehensive, and reliable information about your business.

End-to-end change data integration

Striim is designed from the ground-up to ingest, process, secure, scale, monitor, and deliver change data across a diverse set of sources and targets in real time. It does so by offering several robust capabilities out of the box:

  • Transaction integrity: When ingesting the change data from database logs, Striim moves committed transactions with the transactional context (i.e., ACID properties) maintained. Throughout the whole data movement, processing, and delivery steps, this transactional context is preserved so that users can create reliable replica databases, such as in the case of cloud bursting.
  • In-flight change data processing: Striim offers out-of-the-box transformers and in-memory stream processing capabilities to filter, aggregate, mask, transform, and enrich change data while it is in motion. Using SQL-based continuous queries, Striim immediately turns change data into a consumable format for end users, without losing transactional context.
  • Built-in checkpointing for reliability: As data moves and gets processed through the in-memory components of the Striim platform, every operation is recorded and tracked by the solution. If there is an outage, Striim can replay the transactions from where it left off, without missing data or creating duplicates.
  • Distributed processing in a clustered environment: Striim comes with a clustered environment for scalability and high availability. Without much effort, and using inexpensive hardware, you can scale out for very high data volumes with failover and recoverability assurances. With Striim, you don’t need to build your own clusters with third-party products.
  • Continuous monitoring of change data streams: Striim continuously tracks change data capture, movement, processing, and delivery processes, as well as the end-to-end integration solution via real-time dashboards. With Striim’s transparent pipelines, you have a clear view into the health of your integration solutions.
  • Schema change replication: When the source Oracle database schema is modified via a DDL statement, Striim applies the schema change to the target system without pausing its processes.
  • Data delivery validation: For database sources and targets, Striim offers out-of-the-box data delivery verification. The platform continuously compares the source and target systems as the data is moving, validating that the databases are consistent and that all changed data has been applied to the target. In use cases where data loss must be avoided, such as migration to a new cloud data store, this feature significantly reduces migration risk.
  • Concurrent, real-time delivery to a wide range of targets: With the same software, Striim can deliver change data in real time not only to on-premises databases but also to databases running in the cloud, cloud services, messaging systems, files, IoT solutions, and Hadoop and NoSQL environments. Striim’s integration applications can have multiple targets with concurrent, real-time data delivery.
  • Pre-packaged applications for initial load and CDC: Striim comes with example integration applications that include initial load and CDC for PostgreSQL environments. These integration applications enable setting up data pipelines in seconds, and serve as a template for other CDC sources as well.
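The checkpointing behavior described above can be sketched in a few lines. The event format and `deliver()` helper below are hypothetical, purely to illustrate how replaying from a recorded position after an outage avoids both gaps and duplicates:

```python
# Hypothetical sketch of checkpoint-based recovery: each delivered event's
# log position is recorded, so after an outage replay resumes past the
# last checkpoint without dropping or duplicating events.

def deliver(events, target, checkpoint):
    """Deliver events after the checkpoint position; return new checkpoint."""
    for position, payload in events:
        if position <= checkpoint:
            continue  # already delivered before the outage
        target.append(payload)
        checkpoint = position  # a real system would persist this durably
    return checkpoint

events = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
target = []

# First run fails partway through (only positions 1 and 2 get delivered).
checkpoint = deliver(events[:2], target, checkpoint=0)

# On restart, replay the full stream; delivered events are skipped.
checkpoint = deliver(events, target, checkpoint)
```

The key design point is that the checkpoint is persisted atomically with delivery, which is what makes exactly-once behavior possible across restarts.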

Turning Change Data to Time-Sensitive Insights

In addition to building real-time integration solutions for change data, Striim can perform streaming analytics with flexible time windows allowing you to gain immediate insights from your data in motion. For example, if you are moving financial transactions using Striim, you can build real-time dashboards that alert on potential fraud cases before Striim delivers the data to your analytics solution.

Log-based change data capture is the modern way to turn databases into streaming data sources. However, ingesting the change data is only the first of many concerns that integration solutions should address. You can learn more about Striim’s CDC offering by scheduling a demo with a Striim technologist or experience its enterprise-grade streaming integration solution first-hand by downloading a free trial.


Moving Real-Time Data to Azure Cosmos DB with Striim

In this video, you will see how Striim can help feed Cosmos DB in real time through our wizard-based UI and intuitive data pipelines.

Azure Cosmos DB is Microsoft’s globally distributed, multi-model database service. You have chosen Cosmos DB to store ever-increasing volumes of data and make this data available in milliseconds. However, most of your source data resides elsewhere, in a wide variety of on-premises or cloud sources. How do you continually move this data to Cosmos DB in real time, so that your fast analytics and insights report on timely data?


Video Transcription:

Azure Cosmos DB was built to achieve low latency and high availability in a globally distributed world. By elastically and independently scaling throughput and storage across multiple Azure regions worldwide, you can access your data when and where you want. And support for multiple models means you can use SQL, Cassandra, MongoDB, and other APIs to get to your data.

However, residing in the cloud means you have to determine how to move your existing data to Cosmos DB. This could be migrating an existing SQL Server, Oracle, MySQL, or PostgreSQL operational database, or continually populating Cosmos DB with newly generated on-premises data from logs or devices. For Cosmos DB to provide up-to-date information, there should be as little latency as possible between the original data creation and its delivery to the cloud.

The Striim platform can help with all these requirements and more. Our database adapters support change data capture, or CDC, from enterprise or cloud databases. CDC directly intercepts database activity and collects all the inserts, updates, and deletes as they happen, ready to stream into Cosmos DB. Adapters for machine logs and other files read from the end of multiple files in parallel to stream out data as it is written, removing the inherent latency of batch. Data from devices and messaging systems can be collected easily, independent of its format, through a variety of high-speed adapters and parsers.

After being collected continuously, the streaming data can be delivered directly into Azure Cosmos DB with very low latency, or pushed through a data pipeline where it can be pre-processed through filtering, transformation, enrichment, and correlation using SQL-based queries before delivery into Cosmos DB. This enables such things as data denormalization, change detection, deduplication, and quality checking before the data is ever stored.

In addition, because Striim is an enterprise-grade platform, it can scale with Cosmos DB and reliably guarantee delivery of source data, while also providing built-in dashboards and verification of data pipelines for operational monitoring purposes.

The Striim wizard-based UI enables users to rapidly create a new data flow to move data to Cosmos DB. In this example, real-time change data from Oracle is being continually delivered to Cosmos DB through the SQL API. The wizard walks you through all the configuration steps, checking that everything is set up properly, and results in a data flow application. This data flow can be enhanced to filter, transform, and enrich the data through SQL-based queries. Here we are adding a name and email address from a cache, based on an ID present in the original data.

When the application is started, data will begin flowing in real time from Oracle to Cosmos DB. Making changes in Oracle results in the transformed data being written continually to Cosmos DB, as you can see through the Cosmos DB data explorer UI.

Of course, we are not limited to writing through the SQL API. In this example, we are writing Oracle data to a Cassandra model, which can be utilized directly by existing or new Cassandra applications. Here’s what the data looks like in this case.

Striim and Cosmos DB can change the way you do analytics, with Cosmos DB providing global rapid access to the real-time data provided by Striim. The globally distributed cloud database service needs data delivered to the cloud, and Striim can continually feed Cosmos DB with the data you need to run your business.

Try Striim and Cosmos DB today through the Striim for Real-Time Data Integration to Cosmos DB offering on the Azure Marketplace, to see your data how, where, and when you want it.

Streaming Integration to Azure Cosmos DB

Real-time integration to Azure Cosmos DB enables companies to make the most of the environment’s globally distributed, multi-model database service. With Striim’s streaming integration to Azure Cosmos DB, companies can continuously feed real-time operational data into Cosmos DB from a wide range of on-premises and cloud-based data sources.

What is Striim?

The Striim software platform offers continuous, real-time data movement from enterprise document and relational databases, sensors, messaging systems, and log files into Azure Cosmos DB with in-flight transformations and built-in delivery validation to support real-time reporting, IoT analytics, and transaction processing.


Offload Operational Reporting

  • Move real-time unstructured and structured data to Cosmos DB to support operational workloads including real-time reporting
  • Continuously collect data from a diverse set of sources (such as Internet of Things (IoT) sensors) for timely and rich insight

Accelerate and Simplify Processing

  • Perform filtering, transformations, aggregation, and enrichments in-flight before delivery to Cosmos DB
  • Avoid adding latency by processing data in-flight as a stream rather than in batches
  • Easily convert structured data to document form
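As a rough illustration of the structured-to-document conversion mentioned above (the tables and `to_document()` helper are invented for the example, not a Striim API), normalized relational rows can be folded into the nested document form a store like Cosmos DB expects:

```python
# Hypothetical sketch: denormalizing relational rows into one JSON
# document per customer, with that customer's orders embedded.

import json

customers = {101: {"name": "Ada", "email": "ada@example.com"}}
orders = [
    {"order_id": 1, "customer_id": 101, "amount": 40},
    {"order_id": 2, "customer_id": 101, "amount": 15},
]

def to_document(customer_id: int) -> dict:
    """Embed a customer's orders inside a single customer document."""
    doc = dict(customers[customer_id], id=str(customer_id))
    doc["orders"] = [
        {"order_id": o["order_id"], "amount": o["amount"]}
        for o in orders
        if o["customer_id"] == customer_id
    ]
    return doc

doc = to_document(101)
print(json.dumps(doc, indent=2))
```

Doing this shaping in-flight means the document store only ever receives data in its final, query-ready form.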

Ease the Cosmos DB Adoption Process

  • Use phased and zero-downtime migration from MongoDB by running source and target in parallel
  • Continuously visualize and monitor data pipelines with real-time alerts
  • Prevent data loss with built-in validation

How Striim Delivers Streaming Integration to Azure Cosmos DB

Low-Impact Change Data Capture from Enterprise Databases

  • Continuous, non-intrusive data ingestion for high-volume data
  • Support for databases such as Oracle, SQL Server, HPE NonStop, MySQL, PostgreSQL, MongoDB, Amazon RDS for Oracle, and Amazon RDS for MySQL
  • Real-time data collection from logs, sensors, Hadoop and message queues to support rich and timely analytics

Continuous, In-Flight Data Processing

  • In-line transformation, filtering, aggregation, enrichment to store only the data you need, in the right format
  • Uses SQL-based continuous queries via a drag-and-drop UI

Real-Time Data Delivery with Built-In Monitoring

  • Continuous verification of source and target database consistency
  • Interactive, live dashboards for streaming data pipelines
  • Real-time alerts via web, text, email


To learn more about how to leverage Striim’s solution for streaming integration to Azure Cosmos DB, check out our Striim for Azure Cosmos DB solution page, schedule a brief demo with a Striim technologist, provision Striim for Cosmos DB on the Azure marketplace, or download a free trial of the Striim platform and get started today!

Streaming Integration to Azure

To adopt modern data warehousing, advanced big data analytics, and machine learning solutions in the Azure Cloud, businesses need streaming integration to Azure. They need to be able to continuously feed real-time operational data from existing on-premises and cloud-based data stores and data warehouses.

What is Striim?

The Striim software platform offers continuous, real-time data movement from heterogeneous, on-premises systems and AWS into Azure with in-flight transformations and built-in delivery validation to make data immediately available in Azure, in the desired format.


Implement Operational Data Warehouse on Azure Cloud

  • Rapidly set up real-time data pipelines from on-prem databases and AWS to enable real-time operational data store
  • Perform transformations, including denormalization, in-flight
  • Use phased and zero-downtime migration from Oracle Exadata, Teradata, or Amazon Redshift by running source and target in parallel
  • Prevent data loss with built-in validation

Run Operational Workloads in Azure Databases

  • Continuously stream on-prem and AWS data to Azure SQL DB, Cosmos DB, Azure Database for MySQL, and Azure Database for PostgreSQL
  • Use non-intrusive change data capture to avoid impacting sources
  • Offload operational reporting
  • Move data continuously from MongoDB, sensors and other sources to Cosmos DB

Use Pre-Processed, Real-Time Data for Advanced Big Data Analytics and ML

  • Feed real-time data to Azure Data Lake Storage, Azure Databricks, and Azure HDInsight from on-premises or AWS databases, log files, messaging systems, Hadoop, and sensors
  • Pre-process data-in-motion to reduce ETL efforts and accelerate insight
  • Continuously visualize and monitor data pipelines with real-time alerts

How Striim Works to Achieve Streaming Integration to Azure

Low Impact Change Data Capture from Enterprise Databases

  • Non-stop, non-intrusive data ingestion for high-volume data
  • Support for data warehouses such as Oracle Exadata, Teradata, Amazon Redshift; and databases such as Oracle, SQL Server, HPE NonStop, MySQL, PostgreSQL, MongoDB, Amazon RDS for Oracle, Amazon RDS for MySQL
  • Real-time data collection from logs, sensors, Hadoop and message queues to support operational decision making

Continuous Data Processing and Delivery

  • In-flight transformation, including denormalization, filtering, aggregation, and enrichment to store only the data you need, in the right format
  • Real-time data delivery to Azure SQL Data Warehouse, SQL Server on Azure, Azure SQL Database, Azure Data Lake Storage, Azure Databricks, Kafka, Azure HDInsight, and Cosmos DB

Built-In Monitoring and Validation

  • Interactive, live dashboards for streaming data pipelines
  • Continuous verification of source and target database consistency
  • Real-time alerts via web, text, and email

Why Striim?

As an enterprise-grade platform with built-in high-availability, scalability, and reliability, Striim is designed to deliver tangible ROI with low TCO to meet the real-time requirements for streaming integration to Azure in mission-critical environments.

With a broad set of supported sources, Striim enables you to make virtually any data available on Azure in real time and the desired format to support next-generation cloud analytics and operational decision making on a continuous basis.

To learn more about how to use Striim for streaming integration to Azure, check out our Striim for Azure product page, schedule a short demo with a Striim technologist, or download a free trial of the Striim platform and get started today.

Real-Time Data Visualization and Data Exploration

When business operations run at lightning speed, generating large data volumes, and operational complexity abounds, real-time data visualization and data exploration become increasingly critical to managing daily operations. Striim enables businesses to access, analyze, visualize, and explore live operational data to understand their “Now” and take control of business operations.

Real-Time, Comprehensive Insight Made Easy

By combining real-time data integration, streaming analytics, and rich data visualization in a single, enterprise-grade platform, Striim allows businesses to respond to business trends and emerging issues proactively and with full context. With Striim, users not only have up-to-the-second visibility into all corners of the business with advanced custom metrics, but also the flexibility to explore streaming data without needing to write code.

Create Sophisticated Metrics Easily

Unlike packaged solutions with fixed and generic metrics, Striim’s software platform gives businesses the flexibility to gain fast and deep insight using business-specific metrics. By ingesting, filtering, aggregating, transforming, enriching, and analyzing real-time data from virtually any source, it enables custom metrics using all relevant data, and the ability to slice and dice the metrics across a wide range of dimensions for fast insight. A comprehensive set of built-in SQL operations and functions – such as math, statistics, date, spatial, and string – along with customizable jumping and sliding time windows, provide the granular and precise metric definitions that deliver accurate performance assessment.
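As a rough sketch of the two window types just mentioned (the events and window size are invented for the example), a jumping window aggregates each fixed interval exactly once, while a sliding window re-evaluates over the most recent interval as new events arrive:

```python
# Hypothetical sketch of jumping (tumbling) vs. sliding time windows
# over a stream of (timestamp, value) events.

events = [(0, 10), (1, 20), (2, 30), (3, 40), (4, 50)]
WINDOW = 2  # window size in seconds

def jumping_sums(events, size):
    """One sum per non-overlapping window [0,size), [size,2*size), ..."""
    sums = {}
    for ts, value in events:
        bucket = ts // size
        sums[bucket] = sums.get(bucket, 0) + value
    return [sums[b] for b in sorted(sums)]

def sliding_sum(events, size, now):
    """Sum of the events within the last `size` seconds as of `now`."""
    return sum(v for ts, v in events if now - size < ts <= now)

jumping = jumping_sums(events, WINDOW)       # one result per interval
latest = sliding_sum(events, WINDOW, now=4)  # recomputed per event
```

A jumping window suits periodic reporting (one metric value per interval), while a sliding window suits alerting, where the metric must reflect the most recent data at every moment.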

Gain Real-Time and Flexible Visibility into Operations

By combining streaming integration and analytics capabilities with in-memory processing, Striim updates all metrics in real time as new data streams in from various sources, and stores historical data within the built-in results store for time-based comparisons.

Via the dashboards, users can compare live data to historical averages or to a specific date and time in the past, without having to write code. Real-time, interactive dashboards allow business users to view live data with detailed field and time-based filtering at the page or chart level. In addition, users can search streaming data directly on the dashboard and drill down to detail pages.


Key Platform Features for Real-Time Data Visualization and Data Exploration

Striim offers an end-to-end, enterprise-grade platform to deliver instant insights from high-volume, high-velocity data. Some of the key features for real-time data visualization and data exploration are as follows:

  • Real-time data ingestion from diverse sources: Ingests, processes, and enriches unstructured, semi-structured, and structured data from databases, log files, message queues, and sensors
  • Multi-source stream processing and analytics: Performs SQL-based continuous processing on multiple streams of live data including enrichment with static and streaming reference data
  • Flexible time windows: Offers time-based, event-based, and session-based windowing
  • Interactive, live dashboards: Delivers push-based visualization with automatic refresh
  • Rewinding: Enables users to view and compare historical data via the UI
  • Search: Offers keyword search on live, streaming data
  • Field and time-based filtering: Allows filtering and comparing each chart by different dimensions
  • Page and chart level filtering: Gives the flexibility to filter at the chart or page level
  • Embedding into custom websites: Striim charts can be embedded into any HTML5 page via iframe, along with filtering and search capabilities

Deploy and Modify Easily as Business Needs Change

Businesses can quickly gain real-time visibility into their operations via Striim’s intuitive UI without any coding. Using Striim’s simple yet powerful streaming SQL engine, Striim applications can ingest millions of data points per second and create visualization-specific aggregates. Striim’s GUI and SQL-based language make it easy to correlate live, streaming data with historical aggregates.

Data Visualization and Data Exploration
Striim offers an intuitive UI to easily set up data flows and correlate historical data with streaming data

Within seconds of establishing data sources and flows, users can create dashboards to view live data, and modify the dashboards and charts as needed to meet ever-changing business needs. Visualizations can be built with a variety of chart types, such as line, area, column, map, heat map, and table charts. Dashboards can contain multiple pages, with in-page filtering and drill-down available for a deeper understanding of operational metrics.

Striim’s charts can be embedded in any custom dashboard or web page to support broad collaboration and distribution of real-time insights. Striim issues real-time alerts based on custom thresholds, and can trigger workflows to enable timely action.

Benefits of Data Exploration with Striim

Using Striim for live operational dashboards and streaming data exploration, businesses gain several competitive advantages including:

  • Real-time, granular, and comprehensive insights with business-specific metrics
  • Correlation of real-time and historical data to detect deviations immediately
  • Rapid iteration of the dashboards and data flows as business needs change
  • Proactive response to emerging trends based on in-time, in-context insights
  • The ability to easily meet strict SLAs and improve customer experience

Striim enables businesses to accurately track operational performance with the right metrics, in real time, so they can course-correct fast, with full confidence.

To learn more about Striim’s real-time data visualization and data exploration capabilities, visit our Creating and Monitoring Operational Metrics solutions page, schedule a demo with a Striim technologist, or download a free trial of the platform and try it for yourself!
