Striim Announces Strategic Partnership with Snowflake to Drive Cloud-Based Data-Driven Analytics

We are excited to announce that we’ve entered into a strategic partnership with Snowflake, the data warehouse built for the cloud, in which Striim will be used to move real-time data into Snowflake. Through this strategic partnership, Snowflake users will be empowered to gain fast insights from their cloud-based analytics.

Enterprise companies are quickly adopting Snowflake because its architecture is built from the ground up for the cloud. Snowflake offers speed, scalability, and cost-effectiveness, along with zero management. In order to attain fast analytics, you need access to real-time data, and that’s where Striim comes in. Striim is leveraging its vast real-time data integration capabilities to enable Snowflake users to collect and move data from a variety of sources into their environment to accelerate their data-driven analytics.

Striim uses low-impact change data capture (CDC) to move data from existing on-prem databases, including SQL Server, Oracle, MongoDB, HPE NonStop, PostgreSQL, MySQL, and Amazon RDS. Striim can also help you migrate data warehouses such as Teradata, Netezza, Amazon Redshift, and Oracle Exadata. Additionally, Striim can collect data from messaging systems, Hadoop, log files, sensors, security devices, and other systems. Striim also has analytical capabilities to monitor and measure transaction lag and alert when SLAs are not met.

Through CDC, Striim can handle large volumes of enterprise data securely and reliably. Along with its CDC capabilities, Striim adds further value through in-flight processing, transformations, and denormalization to further assist Snowflake users in providing quicker analysis by continuously delivering data to Snowflake in the right format, and with added context.
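
To make the idea of in-flight denormalization concrete, here is a minimal Python sketch of the pattern: each change event is joined against an in-memory reference cache so records arrive at the warehouse already carrying their context. The field and table names are hypothetical, and this illustrates the pattern rather than Striim's implementation.

    # In-flight denormalization sketch: join each change event against an
    # in-memory reference cache so records land in Snowflake already carrying
    # their context. Field and table names here are hypothetical.
    customers = {  # a small dimension table, preloaded into memory
        101: {"name": "Acme Corp", "region": "EMEA"},
        102: {"name": "Globex", "region": "APAC"},
    }

    def denormalize(change_event: dict) -> dict:
        """Flatten a normalized order event by folding in customer attributes."""
        enriched = dict(change_event)
        customer = customers.get(change_event["customer_id"], {})
        enriched["customer_name"] = customer.get("name")
        enriched["customer_region"] = customer.get("region")
        return enriched

    print(denormalize({"order_id": 1, "customer_id": 101, "amount": 250.0}))
    # -> {'order_id': 1, 'customer_id': 101, 'amount': 250.0,
    #     'customer_name': 'Acme Corp', 'customer_region': 'EMEA'}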

Striim has a number of use cases with customers using the solution for both online migrations and continuous integration to Snowflake.

For example, a company offering HR and well-being solutions is a joint customer that was searching for a scalable, low-latency streaming integration solution paired with a secure data warehouse offering analytical options. This organization’s goal was to enable employees to instantly query their personal information, as well as to allow employers to identify trends and patterns in the data.

With Striim + Snowflake, this business has been delivering real-time data and analytics using CDC from Oracle to Azure for streamlined operations. The partnership between Striim and Snowflake has dramatically enhanced the company’s operations, enabling them to make faster, smarter decisions based on their real-time data.

To learn more about the Striim-Snowflake solution and Striim’s partnership with Snowflake, please read our press release, visit our Striim for Snowflake product page, or set up a quick demo with a Striim technologist.

Google Cloud Next – Cloud Spanner Demo

Alok Pareek, EVP of Products at Striim, and Codin Pora, Director of Partner Technology at Striim, provide a demo of the Striim platform at Google Cloud Next SF, April 2019. Alok goes into detail about how Google Cloud users can move real-time data from a variety of sources into their Google Cloud Spanner environment using the Striim platform.

Transcript:

So with that, I’d like to invite Alok and Codin up to the stage to give us a demo of Spanner. Their company Striim is a strategic partner of ours that does, basically, replication and migration of data into Google Cloud. Thank you. Thank you.

Thank you, Tobias. So today I’m going to show a demonstration. You have these wonderful endpoints on the Google cloud. How do you actually use them? How do you actually move your data into them? And I’m going to talk about, in this demo, how we move real-time data from your applications, from an on-premise Oracle database, into Cloud Spanner. So before I get into the demo, just a little bit about Striim. Striim is a next-generation platform that helps in three solution categories: cloud adoption, hybrid cloud data integration, and in-memory stream processing. Today I’m going to be focusing on cloud adoption, specifically, how do we move data into Spanner? So with that, we’re going to jump into the demo.

Okay. So what you see on the screen is the landing page. And I’m gonna keep this going pretty fast. We’re going to step into the apps part of the demo. That’s where the data pipelines are defined that help you move the data from on premise to Spanner. In this case, what you are seeing, there are two pipelines. One of them is meant to do an initial load, or an instantiation, of your existing data onto Cloud Spanner tables. And the other one is meant to catch it up. So while you are actually moving the data, you might have very large tables, for example, or massive amounts of volume. So how do you actually go ahead and not lose any data, and keep all of the consistency things that we heard about from Tobias earlier?

It’s important that while you are moving the data, you also don’t have disruption to your applications and to your business. So let’s step into the pipeline here. So this is a very simple pipeline. It actually has a simple flow. You have at the top a data source, which is in this case Oracle; it’s running on premise. So we connect into this Oracle database. It has a line items table. We’re going to show you a movement of about a hundred thousand records. And also there’s an orders table where we’re going to show you the delta processing. The way this application is constructed is by using these components on the left side of the UI in the flow designer: you drag and drop one of these things and you push them into the pipeline.

And that’s how you actually construct your data flow. And once we go in, we can also step into the Spanner target definition, and this is your service account and the connectivity and the config for your Spanner. We’re gonna next deploy this application, or the pipeline, and once we deploy it, this is where you can sort of see that I can actually run this within the Striim platform. This can be run either on premise or on the Google Cloud. We want to probably show, Codin, that there’s nothing available yet in the tables on the Spanner side. So let’s go ahead and execute a query against the line items table. And in this case you’re seeing that there are zero records there, and you can take my word that there are a hundred thousand records on the Oracle side.

In the interest of time we’ll assume that, and let’s go ahead and run the application. And as soon as we run the application, you can see in the preview in the lower part of your screen the records running live. This is while we are uploading the data and applying it into Cloud Spanner. You can see that we have completed 100,000 records, and it was pretty fast. This morning I’d done a million records, so I was holding my breath there, but that was pretty fast as well. So now you can see that the data part is completed. I mentioned to you that there’s a second phase here. That’s the change data capture phase. So while you’re actually executing this query, of course, this query is consistent as of a specific snapshot.

Meanwhile, on the Oracle side, there’s also DML activity against your application. So how do we actually take this data? This is the second pipeline now, so we can step into pipeline number two. Codin has already deployed it, and in this case we use a special reader that operates against the redo logs of the Oracle database and monitors them. So it doesn’t actually have any impact on the production system per se; at least it’s not doing any query impact there. We grab the data from the redo logs and then we reapply that as DML, as inserts, updates and so forth, on the Cloud Spanner system. So let’s go ahead and run this application. We are going to generate some DML using a data generator.

And let’s go ahead and run the generator, and you’ll see that there’s a number of inserts, updates and deletes against the orders table. And now let’s switch over to the Cloud Spanner system and query the orders table here. As you can see, there’s data in the orders table. This was also something that was just propagated. So this is sort of like the two-phase, very fast demo of how you get data from your on-prem databases into Cloud Spanner. And of course this can work against other databases that we support as well. And this is available in the Google Cloud. So with that, I’m gonna hand the control back to Tobias.

Kafka to HDFS

The real-time integration of messaging data from Kafka to HDFS augments transactional data for richer context. This allows organizations to gain optimal value from their analytics solutions and achieve a deeper understanding of operations – essential to establishing and sustaining competitive advantage.

To truly leverage the high volumes of data residing in Kafka stores, companies need to be able to move it, process it, and deliver it to a variety of on-premises and cloud systems with sub-second latency. It also needs to be integrated with operational data from a wide variety of sources.

Traditional batch-based solutions are not designed for situations where data is time-sensitive – they are simply too slow. To allow organizations to use their data to enhance operations, tailor services, and improve customer experiences, data delivery from Kafka to HDFS systems needs to be scalable and in real time.

Continuously Deliver Data

With Striim, companies can continuously deliver data in real time from Kafka to HDFS, as well as to a wide range of targets including Hadoop and cloud environments. Depending on the requirements of the organization, all the Kafka data can be written to a number of different targets simultaneously. In use cases where not all the data is required, data can be matched to specific criteria to deliver a highly relevant subset of data to the target.

Striim can create data flows to deliver the data from Kafka to HDFS in milliseconds, “as-is.” However, depending on how the data is going to be utilized, the user may require the data to be processed, prepared, and delivered in the right format. Striim supports continuous queries to filter, transform, aggregate, enrich, and analyze the data in-flight before delivering it with sub-second latency.
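
As an illustration of the kind of in-flight filtering and transformation described above, here is a hand-rolled Python sketch using the kafka-python and hdfs client libraries. The topic, host, and path names are hypothetical, and Striim expresses this logic declaratively through SQL-based continuous queries rather than custom code.

    # Hand-rolled filter/transform between Kafka and HDFS, approximating what
    # a Striim continuous query expresses declaratively in SQL. Topic, host,
    # and path names are hypothetical.
    import json

    from hdfs import InsecureClient  # pip install hdfs
    from kafka import KafkaConsumer  # pip install kafka-python

    consumer = KafkaConsumer(
        "orders",  # hypothetical topic
        bootstrap_servers="kafka:9092",
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    )
    hdfs_client = InsecureClient("http://namenode:9870", user="etl")

    batch = []
    for message in consumer:
        event = message.value
        if event.get("amount", 0) < 1000:
            continue  # filter: keep only the relevant subset of the stream
        event["tier"] = "premium" if event["amount"] >= 10000 else "standard"
        batch.append(json.dumps(event))
        if len(batch) >= 500:  # micro-batch: HDFS favors large appends
            hdfs_client.write("/data/orders/high_value.jsonl",
                              data="\n".join(batch) + "\n",
                              append=True)  # assumes the file already exists
            batch.clear()

Micro-batching the writes matters because HDFS is optimized for large sequential appends rather than one small write per record.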

Analyze Data In-Flight

By analyzing the data in-flight, Kafka users can capture time-sensitive information as the data flows through the stream. Striim pushes insights and alerts to interactive dashboards that highlight real-time data and the results of pattern matching, correlation, outlier detection, and predictive analytics, and it further enables drill-down and in-page filtering.

To learn more about integrating and processing data from Kafka to HDFS in real time, please visit our Kafka integration page.

Our experts can show you how to get maximum value from your analytics solutions using Striim for real-time data integration from Kafka to HDFS. Please contact us to schedule a demo.

Oracle CDC to Postgres

Real-Time Data Movement with Oracle CDC to Postgres

As an open source alternative, Postgres offers a lower total cost of ownership and the ability to store structured and unstructured data. Real-time movement of transactional data using Oracle CDC to Postgres is essential to creating a rich and up-to-date view of operations and improving customer experiences.

IDC projects that by the year 2025, 80% of all data will be unstructured. Emails and social media posts are good examples of unstructured data. The ability to integrate unstructured, semi-structured, and structured data from transactional databases into the enterprise is vital for timely and relevant analysis. To gain a deep understanding of all the data an organization captures and records, and to get the most value from it, the data must be in the right place and in the right format – in real time.

Continuous movement of transactional data using Oracle CDC to Postgres ensures the organization is utilizing the real-time information from on-prem transactional databases and other data stores that is needed to make decisions that optimize user experience and drive higher revenue.

Moving data from enterprise databases to Postgres using traditional ETL processes introduces latency. Delays incurred while the data is being migrated or updated result in an out-of-date picture of the business and limit the extent to which decisions can have any significant impact. Organizations that move all of their data as-is also face challenges in managing storage and in accessing the data that can produce real value for the business.

How Striim Simplifies Oracle CDC to Postgres

Striim enables organizations to generate real value from the transactional data residing in their existing Oracle databases. Using non-intrusive change data capture (CDC), Striim enables continuous data ingestion from Oracle to Postgres with sub-second latency. Users can easily set up ingestion via Striim’s pre-configured CDC wizards and drag-and-drop UI.

Moving and processing data in-flight, Striim filters out data that is not required and delivers what is important to Postgres – in real time. The data can also be transformed and enriched so it is delivered in the required format. Oracle CDC to Postgres allows organizations to gain access to critical insights sooner and make more informed operational decisions faster.

Once the real-time data pipelines are built and the initial data load using Oracle CDC to Postgres has been performed, continuous updating with every new database transaction ensures that analytics applications have the most up-to-date information. Built-in monitoring continuously compares the source and target, validating database consistency and providing assurance that the replicated environment is completely up-to-date with the on-prem Oracle instance.
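
To illustrate the idea behind source/target validation, the sketch below compares row counts for a few replicated tables using the python-oracledb and psycopg2 drivers. The connection strings and table names are hypothetical; Striim's built-in monitoring performs this kind of check continuously rather than on demand.

    # Row-count consistency check between an Oracle source and a Postgres
    # target, using the python-oracledb and psycopg2 drivers. Connection
    # details and table names are hypothetical.
    import oracledb   # pip install oracledb
    import psycopg2   # pip install psycopg2-binary

    TABLES = ["orders", "line_items", "customers"]

    ora = oracledb.connect(user="app", password="secret",
                           dsn="oraclehost:1521/ORCLPDB1")
    pg = psycopg2.connect("host=pghost dbname=analytics user=app password=secret")

    with ora.cursor() as ocur, pg.cursor() as pcur:
        for table in TABLES:
            ocur.execute(f"SELECT COUNT(*) FROM {table}")
            source_count = ocur.fetchone()[0]
            pcur.execute(f"SELECT COUNT(*) FROM {table}")
            target_count = pcur.fetchone()[0]
            status = "OK" if source_count == target_count else "LAGGING"
            print(f"{table}: source={source_count} target={target_count} [{status}]")

A naive count comparison can race with in-flight transactions, which is why production-grade validation compares the two sides at consistent points in time (for example, as of a specific Oracle SCN).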

For more information on real-time data integration and processing using Striim’s Oracle CDC to Postgres solution, please visit our Change Data Capture page.

To see first-hand how easy it is to move data to Postgres using Striim’s Oracle CDC to Postgres functionality, please schedule a demo with one of our technologists.

Striim Announces Real-Time Data Migration to Google Cloud Spanner

Google Cloud Marketplace

The Striim team has been working closely with Google to deliver an enterprise-grade solution for online data migration to Google Cloud Spanner. We’re happy to announce that it is available in the Google Cloud Marketplace. This PaaS solution facilitates the initial load of data (with exactly-once processing and delivery validation), as well as the ongoing, continuous movement of data to Cloud Spanner.

The real-time data pipelines enabled by Striim from both on-prem and cloud sources are scalable, reliable and high-performance. Cloud Spanner users can further leverage change data capture to replicate data in transactional databases to Cloud Spanner without impacting the source database, or interrupting operations.

Google Cloud Spanner is a cloud-based database system that is ACID compliant, horizontally scalable, and global. Spanner is the database that underlies much of Google’s own data collection, and it has been designed to offer the consistency of a relational database with the scale and performance of a non-relational database.

Migration to Google Cloud Spanner requires a low-latency, low-risk solution to feed mission-critical applications. Striim offers an easy-to-use solution to move data in real time from Oracle, SQL Server, PostgreSQL, MySQL, and HPE NonStop to Cloud Spanner while ensuring zero downtime and zero data loss. Striim is also used for real-time data migration from Kafka, Hadoop, log files, sensors, and NoSQL databases to Cloud Spanner.

While the data is streaming, Striim enables in-flight processing and transformation of the data to maximize usability of the data the instant it lands in Cloud Spanner.

Learn More

To learn more about Striim’s Real-Time Migration to Google Cloud Spanner, read the related press release or provision Striim’s Real-Time Data Integration to Cloud Spanner in the Google Cloud Marketplace.

Real-Time Database CDC to Cloudera

As Cloudera increasingly invests in its Enterprise Data Cloud, the ability to move data via change data capture (CDC) to Cloudera has never been more important. Database CDC to Cloudera helps Cloudera users gain more operational value from their analytics solutions by loading critical database transactions in real time.

The timely ingestion of large volumes of data to Cloudera is imperative to realizing the true operational value of the platform. The explosion in the amount of data generated and the variety of data formats residing in traditional relational databases and data warehouses requires an ingestion process that is real-time and scalable.

Traditional methods or batch ETL uploads fall short in today’s business timeframes. Latency renders operational and transactional data obsolete and unable to provide Cloudera solutions with the real-time data required for operational intelligence and reporting. The negative performance impact of batch processing on transactional databases is also a major reason to move only the changed data in a continuous fashion.

To address the concerns mentioned above, there is a solution to ingest changed data in real time from databases: CDC to Cloudera from Striim. This enterprise-grade streaming data integration solution for Cloudera supports high-volume environments and allows users to move real-time data from a wide variety of sources without impacting source systems.

By moving only change data – continuously and with essential scalability – Cloudera users can rely on the Striim platform for the delivery of data. Data can be loaded as-is, or with a variety of processing, transformations or enrichments applied, all with sub-second latency and in the right format to support specific use cases.

A one-time initial load with continuous change updates ensures up-to-the-second data delivery to Cloudera to support operational decision making. Striim also offers real-time pipeline monitoring with alerting, which is particularly important in the context of mission-critical solutions.

Striim currently offers low-impact, log-based CDC to Cloudera from the following data sources: Oracle, Microsoft SQL Server, MySQL, PostgreSQL, HPE NonStop SQL/MX, HPE NonStop SQL/MP, HPE NonStop Enscribe, MongoDB, and MariaDB. All of these databases can be accessed via Striim’s easy-to-use Wizards and drag-and-drop UI, speeding delivery of CDC to Cloudera solutions. In addition, Striim offers pre-built starter integration applications, such as PostgreSQL CDC to Kafka, that can be leveraged to significantly reduce development efforts of any CDC-based application.

If you’d like a brief walk-through of Striim’s CDC to Cloudera offering, please schedule a demo.

What is iPaaS for Data?

Organizations can leverage a wide variety of cloud-based services today, and one of the fastest growing offerings is integration platform as a service. But what is iPaaS?

There are two major categories of iPaaS solutions available, focusing on application integration and data integration. Application integration works at the API level, typically involves relatively low volumes of messages, and enables multiple SaaS applications to be woven together.

Integration platform as a service for data enables organizations to develop, execute, monitor, and govern integration across disparate data sources and targets, both on-premises and in the cloud, with processing and enrichment of the data as it streams.

Within the scope of iPaaS for data there are older batch offerings, and more modern real-time streaming solutions. The latter are better suited to the on-demand and continuous way organizations are utilizing cloud resources.

Streaming data iPaaS solutions facilitate integration through intuitive UIs, by providing pre-configured connectors, automated operators, wizards and visualization tools to facilitate creation of data pipelines for real-time integration. With the iPaaS model, companies can develop and deploy the integrations they need without having to install or manage additional hardware or middleware, or acquire specific skills related to data integration. This can result in significant cost savings and accelerated deployment.

This is particularly useful as enterprise-scale cloud adoption becomes more prevalent, and organizations are required to integrate on-premises data and cloud data in real time to serve the company’s analytics and operational needs.

Factors such as increasing awareness of the benefits of iPaaS among enterprises – including reduced cost of ownership and operational optimization – are fueling the growth of the market worldwide.

For example, a report by Markets and Markets notes that the Integration Platform as a Service market is estimated to grow from $528 million in 2016 to nearly $3 billion by 2021, at a compound annual growth rate (CAGR) of 42% during the forecast period.

“The iPaaS market is booming as enterprises [embrace] hybrid and multi-cloud strategies to reduce cost and optimize workload performance” across on-premises and cloud infrastructure, the report says. Organizations around the world are adopting iPaaS and considering the deployment model an important enabler for their future, the study says.

Research firm Gartner, Inc. notes that the enterprise iPaaS market is an increasingly attractive space due to the need for users to integrate multi-cloud data and applications, with various on-premises assets. The firm expects the market to continue to achieve high growth rates over the next several years.

By 2021, enterprise iPaaS will be the largest market segment in application middleware, Gartner says, potentially consuming the traditional software delivery model along the way.

“iPaaS is a key building block for creating platforms that disrupt traditional integration markets, due to a faster time-to-value proposition,” Gartner states.

The Striim platform can be deployed on-premises, but is also available as an iPaaS solution on Microsoft Azure, Google Cloud Platform, and Amazon Web Services. This solution can integrate with on-premise data through a secure agent installation. For more information, we invite you to schedule a demo with one of our lead technologists, or download the Striim platform.

Oracle Change Data Capture: Methods, Benefits, Challenges

If there’s one thing today’s economy values, it’s speed. To enable faster decisions, businesses are rapidly moving data to the cloud, building powerful AI-driven applications, and increasingly relying on operational analytics. These initiatives all depend on one thing: a constant, reliable stream of real-time data.

But many organizations struggle to deliver real-time data; their data strategies are stuck in the past. Traditional data movement, built on slow, scheduled batch jobs (ETL), simply can’t keep up with the industry’s need for speed. This legacy approach creates data latency, leaving decision-makers with stale information and preventing applications from responding to events as they happen.

Sound familiar? Perhaps you already know the consequences of stale data. When you can’t get data when you need it, you risk missing key opportunities, creating inefficiencies, and widening the gap between data and its potential value.

This is where Oracle Change Data Capture (CDC) comes in. CDC offers a powerful and efficient way to capture every insert, update, and delete from your critical Oracle databases in real time. When implemented correctly, it can become the engine for modern, event-driven data architectures. But without the right strategy and tools, navigating the complexities of Oracle CDC can be challenging.

This guide will provide a clear roadmap to mastering Oracle CDC. We’ll explore what it is, how it works, and how to choose the right approach for your business—transforming your data infrastructure from a slow-moving liability into a real-time strategic asset.

What is Oracle Change Data Capture?

Oracle Change Data Capture (CDC) is a technology designed to identify and capture changes made to data in an Oracle database. It can capture DML changes (INSERT, UPDATE, and DELETE) and DDL changes (CREATE, ALTER, DROP, and TRUNCATE) in your database the moment they occur. Think of it as a surveillance system for your data, noting every single modification in real time. This is about building infrastructure that can understand and react to new events. By tracking changes as they happen, CDC provides a continuous stream of change events that form the foundation of a responsive data strategy. This capability is essential for businesses that need to power streaming analytics, execute seamless cloud migrations with zero downtime, and build sophisticated, event-driven AI applications that rely on the freshest data possible.

Common Use Cases for Oracle Change Data Capture

At its best, Oracle CDC doesn’t just move data; it enables better outcomes. By providing a real-time stream of changes, CDC unlocks new capabilities for companies of all sizes, from agile startups to large enterprises across finance, retail, manufacturing, and more.

Cloud Migration and Adoption

For any company moving its Oracle workloads to the cloud, minimizing downtime is critical. Oracle CDC facilitates zero-downtime migrations by continuously syncing the on-premises source database with the new cloud target. This allows for a phased, low-risk cutover, ensuring business operations are never disrupted.

Streaming Data Pipelines for Analytics and AI

Advanced analytics and AI applications thrive on fresh data. CDC is the engine that feeds real-time data from Oracle databases into cloud data warehouses like Snowflake, Google BigQuery, and Databricks, or into streaming platforms like Apache Kafka. This allows data science teams to build dashboards with up-to-the-second accuracy and train machine learning models on the most current dataset available.

Offloading Operational Reporting and Upstream Analytics

Running heavy analytical queries against a live production (OLTP) database can degrade its performance, impacting core business applications. CDC allows companies to replicate transactional data to a secondary database or another backup storage option in real time. This offloads the reporting workload, ensuring that intensive analytics don’t slow down critical operational systems.

Event-Driven Application Development and Platform Modernization

In event-driven architecture, services communicate by reacting to events as they happen. Oracle CDC turns database changes into a stream of events. For example, a new entry in an orders table can trigger a notification to the shipping department, update inventory levels, and alert the customer, all in real time. This is invaluable for industries like e-commerce and logistics that need to automate complex workflows.
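
The orders-table example above can be reduced to a few lines of Python: change events fan out to independent handlers, each standing in for a downstream service. The event shape and handler names are invented for illustration; in practice the events would arrive from a CDC stream rather than an in-process queue.

    # Toy event-driven fan-out: each change event captured from the orders
    # table is dispatched to independent downstream handlers. The event shape
    # and handlers are invented for illustration.
    import queue

    def notify_shipping(order):
        print(f"shipping: pick order {order['id']} ({order['sku']})")

    def update_inventory(order):
        print(f"inventory: decrement {order['sku']} by {order['qty']}")

    def alert_customer(order):
        print(f"email: order {order['id']} confirmed")

    HANDLERS = {"INSERT": [notify_shipping, update_inventory, alert_customer]}

    events = queue.Queue()  # stands in for the real-time CDC stream
    events.put({"op": "INSERT", "id": 42, "sku": "A-100", "qty": 2})

    while not events.empty():
        event = events.get()
        for handler in HANDLERS.get(event.pop("op"), []):
            handler(event)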

Disaster Recovery and High Availability

For mission-critical systems, maintaining a real-time, up-to-date replica of a production database is essential for disaster recovery. Oracle CDC ensures that a standby database is always in sync with the primary system. In the event of an outage, the business can fail over to the replica with minimal data loss and disruption.

Data Synchronization Across Systems

Enterprises often have multiple systems that need a consistent view of the same data. Whether it’s keeping a CRM and an ERP system in sync or ensuring data consistency across geographically distributed databases, CDC is a reliable solution for real-time data synchronization, eliminating data silos and inconsistencies before they spring up.

Regulatory Compliance and Audit Readiness

For industries with strict regulatory requirements, like finance and healthcare, maintaining a detailed audit trail of all data changes is non-negotiable. Oracle CDC provides an immutable, chronological log of every insert, update, and delete. This creates a reliable audit history that can be used to ensure compliance and simplify audit processes.

AI Enablement

When it comes to getting AI-ready, enterprises need the freshest data available to fuel AI models with relevant insights. Real-time CDC ensures AI applications get the most up-to-date insights to power RAG engines with continuous, accurate updates. The result: faster, smarter, more responsive AI outputs based on relevant business contexts.

How Oracle Change Data Capture Works

Unlike systems that repeatedly poll tables for changes—an approach that is both inefficient and resource-intensive—Change Data Capture (CDC) taps directly into Oracle’s internal mechanisms. The most robust and performant CDC methods leverage Oracle’s transaction logs to capture changes with minimal impact on the source system.

At the core of this process are Oracle redo logs. Every data-modifying transaction—whether an insert, update, or delete—is first recorded in a redo log file. This built-in mechanism ensures data integrity and supports recovery in the event of a system failure. Once redo logs reach capacity, they are archived into archive logs for persistence and historical tracking.

Log-based CDC tools like Striim connect to the database and “mine” these redo and archive logs in a non-intrusive way. Striim offers two Oracle CDC adapters:

  • LogMiner-based Oracle Reader – Uses an Oracle LogMiner session to scan and capture server-side changes.
  • OJet Adapter – A high-performance, API-driven solution designed for large-scale, real-time data capture.

Both approaches are highly efficient and have minimal overhead, preserving the performance and stability of the source database. Learn more about Striim’s Oracle CDC adapters here.

Simple Oracle CDC Flow:

  1. Transaction Occurs: An application performs an INSERT, UPDATE, or DELETE, or a DDL change, on an Oracle database table.
  2. Log Write: Oracle writes the change to its redo log.
  3. CDC Capture: A CDC tool (like Striim) reads the change from the redo log in real time.
  4. Stream Processing (Optional): The data can be transformed, filtered, or enriched in-flight.
  5. Data Delivery: The processed data is delivered to the target (e.g., Snowflake, Kafka, BigQuery).
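
For readers who want to see what step 3 looks like at the database level, here is a bare-bones Python sketch that captures changes through Oracle LogMiner, the same server-side interface Striim's LogMiner-based Oracle Reader builds on. The SCN window, log file path, and schema name are hypothetical, and a production reader additionally manages archived logs, transaction assembly, restart positions, and the required mining privileges.

    # Bare-bones capture (step 3) through Oracle LogMiner. The SCN window,
    # log file path, and schema name are hypothetical.
    import oracledb  # pip install oracledb

    conn = oracledb.connect(user="c##miner", password="secret",
                            dsn="oraclehost:1521/ORCLCDB")
    cur = conn.cursor()

    # Register the redo log to mine (19c+ requires explicit registration).
    cur.execute("""
        BEGIN
          DBMS_LOGMNR.ADD_LOGFILE(
            LOGFILENAME => '/opt/oracle/oradata/ORCLCDB/redo01.log',
            OPTIONS     => DBMS_LOGMNR.NEW);
        END;""")

    # Start a LogMiner session over an SCN window, resolving names from the
    # online catalog and returning only committed changes.
    cur.execute("""
        BEGIN
          DBMS_LOGMNR.START_LOGMNR(
            STARTSCN => :start_scn,
            ENDSCN   => :end_scn,
            OPTIONS  => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG
                      + DBMS_LOGMNR.COMMITTED_DATA_ONLY);
        END;""", start_scn=2764801, end_scn=2765900)

    # Each row is one captured change: the operation plus the SQL to replay it.
    cur.execute("""
        SELECT scn, operation, seg_owner, table_name, sql_redo
          FROM v$logmnr_contents
         WHERE seg_owner = 'APP'
           AND operation IN ('INSERT', 'UPDATE', 'DELETE')""")
    for scn, op, owner, table, sql_redo in cur:
        print(scn, op, f"{owner}.{table}", sql_redo)

    cur.execute("BEGIN DBMS_LOGMNR.END_LOGMNR; END;")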

Methods of Implementing CDC in Oracle

There are multiple ways to implement CDC in Oracle, each with its own trade-offs in performance, complexity, and cost. There’s no one “correct” method to choose—it comes down to selecting the approach that best matches the needs of your data management strategy and business goals.



Log-Based CDC
Reads changes directly from Oracle redo/archive logs. The gold standard for high-performance, low-latency pipelines where source performance is critical.
Impact: Very Low | Complexity: Moderate to High | Cost: Variable

Trigger-Based CDC
Uses database triggers on each table to write changes to audit tables. Best for low-volume tables or when log access is restricted.
Impact: High | Complexity: Low to High | Cost: High (performance)

Oracle GoldenGate
Oracle’s proprietary log-reading replication software. Ideal for enterprise Oracle-to-Oracle replication with a large budget.
Impact: Low | Complexity: High | Cost: Very High

Oracle Native CDC (Deprecated)
A built-in feature in older Oracle versions using triggers and system objects. It is no longer supported and should not be used for new projects.
Impact: Moderate to High | Complexity: High | Cost: N/A

Log-Based Oracle API CDC

The gold standard for high-performance Oracle CDC leverages Oracle’s native APIs to capture changes directly from Logical Change Records (LCRs)—Oracle’s internal representation of both DML (INSERT, UPDATE, DELETE) and DDL (CREATE, ALTER, DROP) operations. These records are derived from the database’s redo logs, offering a highly accurate, low-latency stream of transactional and structural changes. Because this method uses the same internal mechanisms Oracle relies on for replication and recovery, it ensures minimal performance impact on the source system. However, interacting directly with LCRs and Oracle’s APIs can be complex and requires advanced database knowledge. Striim simplifies this by providing a fully managed, Oracle-integrated CDC solution that captures both data and schema changes in real time—without the need for extensive manual configuration.

Trigger-Based CDC

This approach involves placing database triggers on each source table. When a row is inserted, updated, or deleted, the trigger fires and copies the change into a separate “shadow” or audit table. While conceptually simple, this method adds significant overhead to the production database, as every transaction now requires an additional write operation. This can slow down applications and become a major performance bottleneck, especially in high-throughput environments. It’s also difficult to maintain as the number of tables grows.
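
A minimal sketch of what trigger-based CDC installs on the source makes the overhead easy to see: every row change now performs a second insert into a shadow table. The table and column names are hypothetical.

    # What trigger-based CDC installs on the source. Every row change now
    # performs a second insert into a shadow table, which is exactly where
    # the overhead comes from. Table and column names are hypothetical.
    import oracledb  # pip install oracledb

    conn = oracledb.connect(user="app", password="secret",
                            dsn="oraclehost:1521/ORCLPDB1")
    cur = conn.cursor()

    cur.execute("""
        CREATE TABLE orders_shadow (
            change_op  VARCHAR2(6),
            changed_at TIMESTAMP DEFAULT SYSTIMESTAMP,
            order_id   NUMBER,
            amount     NUMBER)""")

    # Fires once per modified row, doubling the write work of each transaction.
    cur.execute("""
        CREATE OR REPLACE TRIGGER orders_cdc_trg
        AFTER INSERT OR UPDATE OR DELETE ON orders
        FOR EACH ROW
        BEGIN
          IF DELETING THEN
            INSERT INTO orders_shadow (change_op, order_id, amount)
            VALUES ('DELETE', :OLD.order_id, :OLD.amount);
          ELSIF UPDATING THEN
            INSERT INTO orders_shadow (change_op, order_id, amount)
            VALUES ('UPDATE', :NEW.order_id, :NEW.amount);
          ELSE
            INSERT INTO orders_shadow (change_op, order_id, amount)
            VALUES ('INSERT', :NEW.order_id, :NEW.amount);
          END IF;
        END;""")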

Oracle GoldenGate

Oracle GoldenGate is a premium, feature-rich data replication solution known for its deep integration with the Oracle database and its ability to support high-volume, low-latency replication. While it excels in large-scale, mission-critical environments—particularly for Oracle-to-Oracle replication—its complexity and high licensing costs can be a barrier for many organizations. Striim offers a unique advantage by allowing customers to leverage existing GoldenGate trail files without requiring a full GoldenGate deployment. This capability enables organizations to preserve their investment in GoldenGate infrastructure while using Striim’s modern, flexible platform for real-time data integration, transformation, and delivery. Striim is one of the few solutions on the market that can read GoldenGate trail files directly, providing a cost-effective and simplified alternative for operationalizing data across diverse targets like Snowflake, BigQuery, Kafka, and more.

Oracle Native LogMiner

Oracle previously offered a built-in feature called Continuous Mine Mode to support Change Data Capture (CDC) in earlier versions of its database. However, this mode was complex, less performant than modern alternatives, and has been deprecated starting with Oracle 19c. While CONTINUOUS_MINE is no longer supported, LogMiner remains fully functional and officially supported by Oracle. LogMiner traditionally reads redo and archived redo logs to extract transactional changes, enabling real-time CDC.

With the deprecation of Continuous Mine Mode, organizations have sought more efficient and forward-compatible solutions. To meet this need, Striim introduced Active Log Mining Mode (ALM)—a high-performance, real-time CDC capability built for Oracle 19c and beyond. ALM enables Striim to efficiently mine redo and archive logs without relying on deprecated features, ensuring low-latency, uninterrupted CDC across supported Oracle versions.

For organizations seeking a future-proof CDC solution, Striim also offers Oracle OJET—an API-based integration that reads Logical Change Records (LCRs) directly from Oracle. OJET is Oracle’s strategic path forward for CDC, providing robust, enterprise-grade replication with long-term compatibility and official support.

Choosing the Right Oracle CDC Approach

To choose the right CDC method, you’ll need to align your technical strategy with your business goals, budget, and scalability needs. Striim has developed two CDC adapters for integrating data from Oracle. The first is an Oracle Reader that captures CDC data using a LogMiner session on the server side. The second is the OJet adapter, which uses a high-performance log mining API and offers the best performance for high-scale workloads. To learn more, check out this performance study, which demonstrates the advantages of each adapter option.

The Benefits of Using Oracle CDC

When implemented with a clear strategy, Oracle CDC offers transformational benefits that go far beyond simple data replication. It empowers organizations to:

  • Enable real-time operational visibility for faster decision-making. By streaming every transaction, CDC provides an up-to-the-second view of business operations. This allows leaders to monitor KPIs, detect anomalies, and react to market changes instantly, rather than waiting for end-of-day reports.
  • Support phased and zero-downtime cloud migrations. CDC de-risks one of the most challenging aspects of cloud adoption: data downtime. By keeping on-premises and cloud databases perfectly in sync, businesses can migrate at their own pace without service interruptions, ensuring a smooth and seamless transition.
  • Streamline data ingestion for analytics, AI, and customer personalization. Feeding fresh, granular data to analytical systems is crucial for competitive advantage. CDC provides a continuous, low-latency stream of data that powers everything from dynamic pricing models and fraud detection algorithms to hyper-personalized customer experiences.

Challenges and Limitations of Change Data Capture

While Oracle CDC is a powerful way to get fresh data into downstream tools and systems, a poorly planned implementation can be risky and hugely costly. Without the right platform and strategy, data teams can run into several major challenges.

Performance Overhead on Source Systems

The Challenge: Trigger-based CDC or inefficient log-mining can place a heavy burden on production OLTP systems, slowing down the applications that the business depends on. This is especially damaging for startups and scaling companies with resource-constrained databases.

How Striim Helps: Striim uses a highly optimized, agentless, log-based CDC method on the source database, ensuring production workloads are not compromised. Striim also supports reading from Oracle ADG (Active Data Guard) or other downstream databases to minimize impact on the primary database.

Complexity of Managing Schema Changes

The Challenge: When the structure of a source table changes (e.g., a new column is added), it’s known as schema drift. These DDL changes can easily break data pipelines, forcing teams to manually intervene to resynchronize systems. This is a common struggle for mid-size and enterprise teams managing evolving applications.

How Striim Helps: Striim offers built-in, automated schema migration services and schema evolution capabilities that automatically detect and propagate schema changes from data source to target, ensuring pipelines remain resilient and data stays in sync without manual effort.
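
As a toy illustration of schema evolution, the sketch below shows the essential move: when the CDC stream surfaces an "add column" DDL event from the source, the equivalent change is applied to the target before data flow resumes. The event shape and type mapping are invented for illustration; Striim detects and propagates these changes automatically.

    # Toy schema-drift propagation: translate a captured Oracle "ADD COLUMN"
    # event into equivalent target-side DDL. The event shape and type map are
    # invented for illustration.
    ORACLE_TO_PG_TYPES = {
        "NUMBER": "numeric",
        "VARCHAR2": "varchar",
        "TIMESTAMP": "timestamp",
        "DATE": "date",
    }

    def propagate_add_column(ddl_event: dict) -> str:
        """Build the target DDL for a source-side ADD COLUMN event."""
        pg_type = ORACLE_TO_PG_TYPES[ddl_event["column_type"]]
        return (f"ALTER TABLE {ddl_event['table']} "
                f"ADD COLUMN {ddl_event['column']} {pg_type}")

    event = {"op": "DDL", "kind": "ADD_COLUMN", "table": "orders",
             "column": "loyalty_tier", "column_type": "VARCHAR2"}
    print(propagate_add_column(event))
    # -> ALTER TABLE orders ADD COLUMN loyalty_tier varchar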

High Licensing and Operational Costs

The Challenge: Native Oracle solutions like GoldenGate come with a hefty price tag, adding a significant licensing burden to any project. This can be a major roadblock for enterprises looking to control the costs of their initiatives.

How Striim Helps: Striim provides a cost-effective solution with scalable pricing and cloud-native architecture, reducing the total cost of ownership (TCO) for real-time data integration.

Lack of Real-Time Observability and Alerting

The Challenge: Many traditional CDC solutions are “black boxes.” Teams often don’t know a pipeline has failed until a downstream report is broken or a user complains about stale data. This is particularly painful for lean IT teams and cloud-first startups that can’t afford to spend hours troubleshooting.

How Striim Helps: Striim provides comprehensive, real-time monitoring dashboards, data validation, and proactive alerting. This gives teams end-to-end observability into their data pipelines, allowing them to identify and resolve issues before they impact the business.

Real-Time AI Model Enablement on Live Enterprise Data Streams

The Challenge: Businesses struggle to apply AI in real time because traditional methods rely on batch processing and siloed systems, causing delays in detecting sensitive data, anomalies, or insights. Integrating AI directly into live data streams to enable instant action remains a complex problem.

How Striim Helps: Striim offers highly performant AI agents that embed advanced AI capabilities directly into streaming pipelines, enabling real-time intelligence and automation:

    • Sherlock AI: Uses large language models to classify and tag sensitive fields on-the-fly.
    • Sentinel AI: Detects and protects sensitive data in real time within streaming applications.
    • Euclid: Enables semantic search and categorization through vector embeddings for deeper analysis.
    • Foreseer: Provides real-time anomaly detection and time series forecasting for predictive monitoring.

By integrating these AI agents seamlessly, Striim empowers organizations to operationalize AI-driven insights instantly, improve data privacy, detect risks early, and make faster, smarter decisions.

Simplify Oracle Change Data Capture With Striim

When it comes to moving data from Oracle systems, Oracle CDC is a trusted approach—but building and managing reliable, scalable pipelines without the right platform is complex, risky, and costly. Manual infrastructure and legacy tools often introduce delays and budget overruns, putting projects at risk before they even start. Striim streamlines Oracle CDC with a comprehensive, agentless platform designed for high-throughput, real-time data integration. Optimized for modern cloud environments, Striim enables you to:

  • Deliver data with sub-second latency using best-in-class, log-based CDC.
  • Process and transform data on the fly through a powerful SQL-based streaming analytics engine.
  • Achieve enterprise-grade observability with real-time monitoring, alerting, and data validation.
  • Securely connect to any cloud platform with extensive, pre-built, scalable integrations.

With Striim, Oracle CDC becomes simpler, faster, and more reliable—empowering your data initiatives to succeed from day one. Ready to learn more? Here are a few ways to dive in with Striim:

Stop wrestling with brittle pipelines and start building the future of your data infrastructure.
Book a Demo with a Striim Expert or Start Your Free Trial Today

Moving Real-Time Data to Azure Cosmos DB with Striim

In this video you will see how Striim can help feed Cosmos DB in real-time through our wizard-based UI and intuitive data pipelines.

Azure Cosmos DB is Microsoft’s globally distributed, multi-model database service. You have chosen Cosmos DB to store ever-increasing volumes of data and make this data available in milliseconds. However, most of your source data resides elsewhere – in a wide variety of on-premise or cloud sources. How do you continually move this data to Cosmos DB in real-time, so that your fast analytics and insights are reporting on timely data?

Video Transcription:

Azure Cosmos DB was built to achieve low latency and high availability in a globally distributed world. By elastically and independently scaling throughput and storage across multiple Azure regions worldwide, you can access your data when and where you want. And support for multiple models means you can use SQL, Cassandra, MongoDB, and other APIs to get to your data.

However, residing in the cloud means you have to determine how to move your existing data to Cosmos DB. This could be migrating an existing SQL Server, Oracle, MySQL, or PostgreSQL operational database, or continually populating Cosmos DB with newly generated on-premise data from logs, or device information. In order for Cosmos DB to provide up-to-date information, there should be as little latency as possible between the original data creation and its delivery to the cloud.

The Striim platform can help with all these requirements and more. Our database adapters support change data capture, or CDC, from enterprise or cloud databases. CDC directly intercepts database activity and collects all the inserts, updates, and deletes as they happen, ready to stream into Cosmos DB. Adapters for machine logs and other files read from the end of multiple files in parallel to stream out data as it is written, removing the inherent latency of batch processing. Meanwhile, data from devices and messaging systems can be collected easily, independent of its format, through a variety of high-speed adapters and parsers.

After being collected continuously, the streaming data can be delivered directly into Azure Cosmos DB with very low latency, or pushed through a data pipeline where it can be pre-processed through filtering, transformation, enrichment, and correlation using SQL-based queries, before delivery into Cosmos DB. This enables such things as data denormalization, change detection, deduplication, and quality checking before the data is ever stored.

In addition to this, because Striim is an enterprise grade platform, it can scale with Cosmos DB and reliably guarantee delivery of source data while also providing built-in dashboards and verification of data pipelines for operational monitoring purposes.

The Striim wizard-based UI enables users to rapidly create a new data flow to move data to Cosmos DB. In this example, real-time change data from Oracle is being continually delivered to Cosmos DB through the SQL API. The wizard walks you through all the configuration steps, checking that everything is set up properly, and results in a data flow application. This data flow can be enhanced to filter, transform and enrich the data through SQL based queries. Here we are adding a name and email address from a cache, based on an ID present in the original data.

When the application is started, data will begin flowing in real-time from Oracle to Cosmos DB. Making changes in Oracle results in the transformed data being written continually to Cosmos DB, as you can see through the Cosmos DB data explorer UI.

Of course, we are not limited to writing through the SQL API. In this example, we are writing Oracle data to a Cassandra model, which can be utilized directly by existing or new Cassandra applications. Here’s what the data looks like in this case.

Striim and Cosmos DB can change the way you do analytics, with Cosmos DB providing global rapid access to the real-time data provided by Striim. The globally distributed cloud database service needs data delivered to the cloud, and Striim can continually feed Cosmos DB with the data you need to run your business.

Try Striim and Cosmos DB today through the Striim for Real-Time Data Integration to Cosmos DB offering on the Azure Marketplace, to see your data how, where, and when you want it.

Streaming Integration to Azure Cosmos DB

Real-time integration to Azure Cosmos DB enables companies to make the most of the environment’s globally distributed, multi-model database service. With Striim’s streaming integration to Azure Cosmos DB solution, companies can continuously feed Cosmos DB with real-time operational data from a wide range of on-premises and cloud-based data sources.

What is Striim?

The Striim software platform offers continuous, real-time data movement from enterprise document and relational databases, sensors, messaging systems, and log files into Azure Cosmos DB with in-flight transformations and built-in delivery validation to support real-time reporting, IoT analytics, and transaction processing.

Offload Operational Reporting

  • Move real-time unstructured and structured data to Cosmos DB to support operational workloads including real-time reporting
  • Continuously collect data from a diverse set of sources (such as Internet of Things (IoT) sensors) for timely and rich insight

Accelerate and Simplify Processing

  • Perform filtering, transformations, aggregation, and enrichments in-flight before delivery to Cosmos DB
  • Avoid adding latency via stream processing
  • Easily convert structured data to document form

Ease the Cosmos DB Adoption Process

  • Use phased and zero-downtime migration from MongoDB by running both databases in parallel
  • Continuously visualize and monitor data pipelines with real-time alerts
  • Prevent data loss with built-in validation

How Striim Delivers Streaming Integration to Azure Cosmos DB

Low-Impact Change Data Capture from Enterprise Databases

  • Continuous, non-intrusive data ingestion for high-volume data
  • Support for databases such as Oracle, SQL Server, HPE NonStop, MySQL, PostgreSQL, MongoDB, Amazon RDS for Oracle, and Amazon RDS for MySQL
  • Real-time data collection from logs, sensors, Hadoop and message queues to support rich and timely analytics

Continuous, In-Flight Data Processing

  • In-line transformation, filtering, aggregation, enrichment to store only the data you need, in the right format
  • Uses SQL-based continuous queries via a drag-and-drop UI

Real-Time Data Delivery with Built-In Monitoring

  • Continuous verification of source and target database consistency
  • Interactive, live dashboards for streaming data pipelines
  • Real-time alerts via web, text, email

To learn more about how to leverage Striim’s solution for streaming integration to Azure Cosmos DB, check out our Striim for Azure Cosmos DB solution page, schedule a brief demo with a Striim technologist, provision Striim for Cosmos DB on the Azure marketplace, or download a free trial of the Striim platform and get started today!
