Ryan Siss


Stream Data into Snowflake with Streaming Data Integration

In this video, learn why enterprises must stream data into Snowflake to take full advantage of this data warehouse built for the cloud.

To learn more about Striim for Snowflake Data Warehouse, visit our Snowflake solution page.

 

Video Transcription: 

You chose Snowflake to provide rapid insights into your data on a massive scale, on AWS or Azure. However, most of your source data resides elsewhere – in a wide variety of on-premise or cloud sources. How do you continually move data to Snowflake in real-time, processing it along the way, so that your fast analytics and insights are reporting on timely data?

Snowflake was built for the cloud, and built for speed. By separating compute from storage, you can easily scale up and down as needed. This gives you instant elasticity supporting any amount of data, and high-speed queries for any number of users, coupled with the peace of mind provided by secure data sharing. The per-second pricing and support for multiple clouds allow you to choose your infrastructure and only pay when you are using the data warehouse.

However, residing in the cloud means you have to determine how to most effectively move data to Snowflake. This could mean migrating an existing Teradata or Exadata data warehouse, or continually populating Snowflake with newly generated on-premises data from operational databases, logs, or device information. In order for the warehouse to provide up-to-date information, there should be as little latency as possible between the original data creation and its delivery to Snowflake.

The Striim platform can help with all these requirements and more. Our database adapters support change data capture, or CDC, from enterprise or cloud databases. CDC directly intercepts database activity and collects all the inserts, updates, and deletes as they happen, ready to stream into Snowflake. Adapters for machine logs and other files read from the end of multiple files in parallel to stream out data as it is written, removing the inherent latency of batch. Meanwhile, data from devices and messaging systems can be collected easily, independent of its format, through a variety of high-speed adapters and parsers.

After being collected continuously, the streaming data can be delivered directly into Snowflake with very low latency, or pushed through a data pipeline where it can be pre-processed through filtering, transformation, enrichment, and correlation using SQL-based queries, before delivery into Snowflake. This enables such things as data denormalization, change detection, de-duplication, and quality checking before the data is ever stored.
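
As a rough illustration of the kind of SQL-based pre-processing described here, a continuous query along the following lines could filter out incomplete records as a quality check before they ever reach Snowflake. This is a minimal sketch in the same style as the streaming SQL examples later on this page; the stream and column names (OrderStream, orderId, custId, amount) are hypothetical and not taken from the video.

SELECT orderId, custId, itemName, amount
FROM OrderStream
WHERE orderId IS NOT NULL
AND   amount > 0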

In addition to this, because Striim is an enterprise-grade platform, it can scale with Snowflake and reliably guarantee delivery of source data while also providing built-in dashboards and verification of data pipelines for operational monitoring purposes.

The Striim wizard-based UI enables users to rapidly create a new data flow to move data to Snowflake. In this example, real-time change data from Oracle is being continually delivered to Snowflake. The wizard walks you through all the configuration steps, checking that everything is set up properly, and results in a data flow application. This data flow can be enhanced to filter, transform and enrich the data through SQL-based queries. In the video, we add a name and email address from a cache, based on an ID present in the original data.
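
The enrichment step described in the video can be thought of as a simple join with an in-memory cache, similar to the enrichment examples later in this series. The sketch below uses hypothetical stream, cache, and column names (OracleCDCStream, CustomerCache, custId) rather than the exact query shown in the video.

SELECT o.*,
       c.name, c.email
FROM OracleCDCStream o,
     CustomerCache c
WHERE o.custId = c.id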

When the application is started, data flows in real-time from Oracle to Snowflake. Making changes in Oracle results in the transformed data being written continually to Snowflake, visible through the Snowflake UI.

Striim and Snowflake can change the way you do analytics, with Snowflake providing rapid insight to the real-time data provided by Striim. The data warehouse that is built for the cloud needs data delivered to the cloud, and Striim can continuously move data to Snowflake to support your business operations and decision-making.

To learn more about how Striim makes it easy to continuously move data to Snowflake, visit our Striim for Snowflake product page, schedule a demo with a Striim technologist, or download the platform and try it for yourself.

Rapid Adoption of Google Cloud SQL Using Streaming Integration & CDC

In this video, we will demonstrate how Striim can provide continuous data integration via CDC to Google Cloud SQL through a pipeline for the real-time collection, processing, and delivery of enterprise data, sourcing from Oracle on-prem.

Are you looking to migrate to Google Cloud Platform? Start our free 90-day migration service today!

To learn more about the Striim platform, visit our platform overview page.

This was originally published as a blog post here.

 

Unedited Transcript:

Rapid adoption of Google Cloud SQL. Using Striim, you can stream data from any enterprise data source into Google Cloud SQL, but much of your data may currently be locked up elsewhere. Maybe this is in operational databases, data warehouses, legacy systems, and other locations. You need a new hybrid cloud integration strategy for the continuous movement of enterprise data to and from Google Cloud, with continuous collection, processing, and delivery of enterprise data in real time, not batch, to ensure Google Cloud SQL is always up to date. Data from on-premise and cloud sources needs to be delivered to Google Cloud SQL, including the one-time load and continuous change data delivery with in-flight processing, to ensure up-to-the-second information for your users, and that’s where Striim comes in. Striim is a next-generation streaming integration and intelligence platform that supports your hybrid cloud initiatives and has integration with multiple Google Cloud technologies. We will demonstrate how Striim can provide continuous data integration into Google Cloud SQL through a pipeline for the real-time collection, processing, and delivery of enterprise data.

Sourcing from Oracle on-premise, in this case we’ll be doing an initial load followed by continuous change delivery from Oracle to Google Cloud SQL. Striim’s UI makes it easy to continuously and non-intrusively ingest all your enterprise data from a variety of sources in real time. We’ll start by doing an initial load of data from Oracle on-premise to Google Cloud SQL using a data flow. When the flow is started, the full contents of the on-premise customer table are loaded into Google Cloud SQL. After a short time, all the rows in the source table are present in the Google Cloud SQL customer target table. This can be monitored using Striim and the Google Cloud monitoring UI. Once the initial load is complete, we can continuously deliver changes using CDC from Oracle into the Google Cloud SQL instance. A separate flow is used so that the initial load and CDC can be coordinated. After many changes, you can see that Google Cloud SQL is completely up to date with the on-premise Oracle instance. The continuous updates can also be monitored through the Striim UI. You can see how Striim can enable your hybrid cloud initiatives and accelerate the adoption of Google Cloud SQL. Get started with Striim now with a trial download on our website, or contact us if you want to know more.

 

Simplify Your Azure Hybrid Cloud Architecture with Streaming Data Integration

While the typical conversation about Azure hybrid cloud architecture may be centered around scaling applications, VMs, and microservices, the bigger consideration is the data. Spinning up additional services on-demand in Azure is useless if the cloud services cannot access the data they need, when they need it.

“According to a March 2018 hybrid cloud report from 451 Research and NTT Communications, around 63% of firms have a formal strategy for hybrid infrastructure. In this case, hybrid cloud does not simply mean using a public cloud and a private cloud. It means having a seamless flow of data between all clouds, on and off-premises.” – Data Foundry

To help simplify providing a seamless flow of data to your Microsoft Azure hybrid cloud infrastructure, we’re happy to announce that the Striim platform is available in the Microsoft Azure Marketplace.

How Streaming Data Integration Simplifies Your Azure Hybrid Cloud Architecture

Enterprise-grade streaming data integration enables continuous real-time data movement and processing for hybrid cloud, connecting on-prem data sources and cloud environments, as well as bridging a wide variety of cloud services. With in-memory stream processing for hybrid cloud, companies can store only the data they need, in the format that they need. Additionally, streaming data integration enables delivery validation and data pipeline monitoring in real time.

Streaming data integration simplifies real-time streaming data pipelines for cloud environments. Through non-intrusive change data capture (CDC), organizations can collect real-time data without affecting source transactional databases. This enables cloud migration with zero database downtime and minimized risk, and feeds real-time data to targets with full context – ready for rich analytics on the cloud – by performing filtering, transformation, aggregation, and enrichment on data-in-motion.

Azure Hybrid Cloud Architecture

Key Traits of a Streaming Data Integration Solution for Your Azure Hybrid Cloud Architecture

There are three important objectives to consider when implementing a streaming data integration solution in an Azure hybrid cloud architecture:

  • Make it easy to build and maintain – The ability to use a graphical user interface (GUI) and a SQL-based language can significantly reduce the complexity of building streaming data pipelines, allowing more team members within the company to maintain the environment.
  • Make it reliable – Enterprise hybrid cloud environments require a data integration solution that is inherently reliable with failover, recovery and exactly-once processing guaranteed end-to-end, not just in one slice of the architecture.
  • Make it secure – Security needs to be treated holistically, with a single authentication and authorization model protecting everything from individual data streams to complete end-user dashboards. The security model should be role-based with fine-grained access, and provide encryption for sensitive resources.

Striim for Microsoft Azure

The Striim platform for Azure is an enterprise-grade data integration platform that simplifies an Azure-based hybrid cloud infrastructure. Striim provides real-time data collection and movement from a variety of sources, such as enterprise databases (e.g., Oracle, HPE NonStop, SQL Server, PostgreSQL, Amazon RDS for Oracle, and Amazon RDS for MySQL via low-impact, log-based change data capture), as well as log files, sensors, messaging systems, NoSQL, and Hadoop solutions.

Once the data is collected in real time, it can be streamed to a wide variety of Azure services including Azure Cosmos DB, Azure SQL Database, Azure SQL Data Warehouse, Azure Event Hubs, Azure Data Lake Storage, and Azure Database for PostgreSQL.

While the data is streaming to Azure, Striim enables in-stream processing such as filtering, transformations, aggregations, masking, and enrichment, making the data more valuable when it lands. This is all done with sub-second latency, reliability, and security via an easy-to-use interface and SQL-based programming language.
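
As a sketch of what such in-stream processing can look like in a SQL-based language, the query below filters payment events and masks the card number before the data lands in Azure. All names here, including PaymentStream, cardNumber, and the maskCardNumber function, are hypothetical stand-ins rather than documented Striim or Azure identifiers.

SELECT txId, custId,
       maskCardNumber(cardNumber) as cardNumber,
       amount
FROM PaymentStream
WHERE amount > 0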

To learn more about Striim’s capabilities to support the data integration requirements for an Azure hybrid cloud architecture, read today’s press release announcing the availability of the Striim platform in the Microsoft Azure Marketplace, and check out all of Striim’s solutions for Azure.

Striim Sweeps 2019 Best Places to Work Awards

We are proud to announce that Striim has received two 2019 best places to work awards in the Bay Area from three highly regarded local publications: the San Francisco Business Times, the Silicon Valley Business Journal, and the Bay Area News Group (publisher of The Mercury News in San Jose). This is the third year in a row that Striim has been among the top companies on both lists.

This past week, Striim ranked #1 in the Small Companies category of the Bay Area News Group’s Top Workplaces award. This is the second time in three years that Striim has received the top ranking.

In late April, the San Francisco Business Times and the Silicon Valley Business Journal recognized Striim as the #7 best place to work in the Bay Area, up 3 spots from its #10 ranking in 2018.

Striim is honored to consistently rank among the top 10, and even more so to achieve the Bay Area News Group’s #1 spot. These rankings are a reflection of Striim’s ability to attract amazing employees in Silicon Valley, and showcase the positive experience of the Striim team members currently working at the company.

What’s great is that both awards were 100% driven by employee feedback. Employees were asked a number of multiple choice and open-ended questions pertaining to a variety of workplace considerations: culture, pay, benefits, work-life balance, team collaboration, etc. Striim employees ranked the company extremely high in all categories.

Striim does not take these 2019 best places to work awards lightly. As a tech startup, it’s difficult to attract and retain top talent in Silicon Valley. Striim, like many other small companies in the Valley, needs to compete with big tech organizations and well-funded start-ups alike.

Along with its own unique perks and offerings, Striim offers a close-knit environment that promotes respect, hard work, and collaboration. Also, every day, employees are given the opportunity to work on emerging technology that is changing the way enterprise companies interact with and handle their data.

It’s our belief that this combination is why Striim has done so well with these best places to work awards over the years.

If you’re interested in learning more about why Striim has been recognized as one of the top 2019 best places to work in the Bay Area, please read our San Francisco Business Times/Silicon Valley Business Journal and Bay Area News Group Top Workplaces press releases. And please check our Careers page if you think Striim might be a fit for you!

Striim Announces Strategic Partnership with Snowflake to Drive Cloud-Based Data-Driven Analytics

We are excited to announce that we’ve entered into a strategic partnership with Snowflake, the data warehouse built for the cloud, in which Striim will be used to move real-time data into Snowflake. Through this strategic partnership, Snowflake users will be empowered to gain fast insights from their cloud-based analytics.

Enterprise companies are quickly adopting Snowflake because its architecture is built from the ground up for the cloud. Snowflake offers speed, scalability, and cost-effectiveness, along with zero management. In order to attain fast analytics, you need access to real-time data, and that’s where Striim comes in. Striim is leveraging its vast real-time data integration capabilities to enable Snowflake users to collect and move data from a variety of sources into their environment to accelerate their data-driven analytics.

Striim uses low-impact change data capture (CDC) to move data from existing on-prem databases, including SQL Server, Oracle, MongoDB, HPE NonStop, PostgreSQL, MySQL, and Amazon RDS. Striim can also help you migrate data warehouses such as Teradata, Netezza, Amazon Redshift, and Oracle Exadata. Additionally, Striim can collect from messaging systems, Hadoop, log files, sensors, security devices, and other systems. Striim also has analytical capabilities to monitor and measure transaction lag and alert when SLAs are not met.

Through CDC, Striim can handle large volumes of enterprise data securely and reliably. Along with its CDC capabilities, Striim adds further value through in-flight processing, transformations, and denormalization to further assist Snowflake users in providing quicker analysis by continuously delivering data to Snowflake in the right format, and with added context.

Striim has a number of use cases with customers using the solution for both online migrations and continuous integration to Snowflake.

For example, a company offering HR and well-being solutions is a joint customer that was searching for a low-latency streaming integration solution that was scalable and also offered a secure data warehouse with analytical options. This organization’s goal was to enable employees to instantly query their personal information, as well as allow employers to identify trends and patterns from the data.

With Striim + Snowflake, this business has been delivering real-time data and analytics using CDC from Oracle to Azure for streamlined operations. The partnership between Striim and Snowflake has dramatically enhanced the company’s operations, enabling them to make faster, smarter decisions based on their real-time data.

To learn more about the Striim-Snowflake solution and Striim’s partnership with Snowflake, please read our press release, visit our Striim for Snowflake product page, or set up a quick demo with a Striim technologist.

Google Cloud Next – Cloud Spanner Demo

Alok Pareek, EVP of Products at Striim, and Codin Pora, Director of Partner Technology at Striim, provide a demo of the Striim platform at Google Cloud Next SF, April 2019. Alok goes into detail about how Google Cloud users can move real-time data from a variety of sources into their Google Cloud Spanner environment using the Striim platform.

Unedited Transcript:

So with that, I’d like to invite Alok and Codin up to the stage to give us a demo of Spanner. Their company Striim is a strategic partner of ours that does basically replication and migration of data into Google Cloud. Thank you. Thank you.

Thank you, Tobias. So today I’m going to show a demonstration. You have these wonderful endpoints on the Google Cloud. How do you actually use them? How do you actually move your data into them? And I’m going to talk about in this demo how we move real-time data from your applications, from an on-premise Oracle database, into Cloud Spanner. So before I get into the demo, just a little bit about Striim. Striim is a next-generation platform that helps in three solution categories. These are cloud adoption, hybrid cloud data integration, and in-memory stream processing. Today I’m going to be focusing on cloud adoption, specifically, how do we move data into Spanner? So with that, we’re going to jump into the demo.

Okay. So what you see on the screen is the landing page. And I’m gonna keep this going pretty fast. We’re going to step into the apps part of the demo. That’s where the data pipelines are defined that help you move the data from on-premise to Spanner. In this case, what you are seeing, there are two pipelines. One of them is meant to do an initial load or an instantiation of your existing data onto Cloud Spanner tables. And the other one is also meant to catch it up. So while you are actually moving the data, you might have very large tables, for example, or massive amounts of volume. So how do you actually go ahead and not lose any data, and keep all of the consistency things that we heard about from Tobias earlier?

It’s important that while you are moving the data, you also don’t have disruption to your applications and to your business. So let’s step into the pipeline here. So this is a very simple pipeline. It actually has a simple flow. You have at the top a data source, which is in this case Oracle; it’s running on-premise. So we connect into this Oracle database. It has a line items table. We’re going to show you a movement of about a hundred thousand records. And also there’s an orders table where we’re going to show you the delta processing. The way this application is constructed is by using these components on the left side of the UI in the flow designer, as you drag and drop one of these things and push them into the pipeline.

And that’s how you actually construct your data flow. And once we actually go in, we can also step into the Spanner target definition, and this is your service account and the connectivity and the config for your Spanner. We’re gonna next deploy this application, or the pipeline, and once we deploy it, this is where you can sort of see that I can actually run this within the Striim platform. This can be run either on-premise or on the Google Cloud. We want to probably show, Codin, that there’s nothing available yet in the tables on the Spanner side. So let’s go ahead and execute a query against the line item table. And in this case you’re seeing that there are zero records there, and you can take my word that there are a hundred thousand records on the Oracle side.

In the interest of time we’ll assume that, and let’s go ahead and run the application. And as soon as we run the application, you can see in the preview in the lower part of your screen the records running live. This is while we are uploading the data and applying it into Cloud Spanner. You can see that we have completed 100,000 records, and it was pretty fast. This morning I’d done a million records, so I was holding my breath there, but that was pretty fast as well. So now you can see that the data part is completed. I mentioned to you that there’s a second phase here. That’s the change data capture phase. So while you’re actually executing this query, of course, this query is consistent as of a specific snapshot.

On the Oracle side, there’s also DML activity against your application. So how do we actually take this data? This is the second pipeline now, so we can step into pipeline number two. Codin has already deployed it, and in this case we use a special reader that actually operates against the redo logs of the Oracle database and monitors that. So it doesn’t actually have any impact on the production system per se; at least it’s not doing any query impact there. We grab the data from the redo logs and then we are going to reapply that as DML, as inserts, updates and so forth, on the Cloud Spanner system. So let’s go ahead and run this application. We are going to generate some DML using a data generator.

And let’s go ahead and run the generator, and you’ll see that there’s a number of inserts, updates and deletes against the orders table. And now let’s switch over to the Cloud Spanner system and query the orders table here. As you can see, there’s data in the orders table. This was also something that was just propagated. So this is sort of like the two-phase, very fast demo of how you get data from your on-prem databases into Cloud Spanner. And of course this can work against other databases that we support as well. And this is available on the Google Cloud. So with that, I’m gonna hand the control back to Tobias.

Striim Recognized on FORTUNE’s “2019 Best Workplaces in the Bay Area” List

We are excited to announce that Striim has been recognized as a “Best Workplace” on FORTUNE’s “2019 Best Workplaces in the Bay Area” list.

Striim was selected based on a survey that was created, launched, and evaluated by Great Place to Work, a global people analytics and consulting firm.

The rankings took into account more than 30,000 surveys by employees across the Bay Area, designed to evaluate more than 60 elements of an employee’s job and work environment, including trust in leadership, camaraderie in a team setting, and respect among colleagues. Employee perks and benefits were also factored into the rankings.

This Best Workplaces in the Bay Area recognition is very important to Striim because the rankings were completely driven by employee feedback that Great Place to Work collected and evaluated. Additionally, given the fierce competition in not only attracting, but also retaining the best talent in the Bay Area, having our employees thrive in a culture that the Striim team works so hard to foster is extremely rewarding and indicative that we’re on the right track for employee satisfaction.

Striim scored high across the board in many categories including Justice (100%), Camaraderie (98%), Integrity (96%), Credibility (96%), and Innovation (96%), just to name a few.

Additionally, according to the survey, the overall Striim employee experience was rated 96%. Other great indications that our employees noted include:

  • “Managers avoid playing favorites.” – 100%
  • “I can be myself around here.” – 100%
  • “When you join the company, you’re made to feel welcome.” – 100%
  • “Management is approachable, easy to talk with.” – 98%
  • “People here are given a lot of responsibility.” – 98%

Learn more about what employees had to say about Striim, as well as further information on the company, by reading the full Great Place to Work review.

To learn more about why Striim was included on FORTUNE’s Best Workplaces in the Bay Area, as well as to see the full list of winners, please read our press release, “Striim Named One of the 2019 Best Workplaces in the Bay Area by FORTUNE and Great Place to Work.”

The Power of Streaming SQL for Real-Time Data Solutions

In this video, Striim Founder and CTO, Steve Wilkes, discusses streaming integration, the need for stream processing and streaming SQL, and why they’re essential to real-world real-time solutions.

To learn more about the Striim platform, go here.

 

Unedited Transcript:

You’ve heard about streaming integration, the need for stream processing, and often hear the term streaming SQL. But what is streaming SQL, and why is it so essential to real-world real-time solutions?

IBM created the Structured Query Language, or SQL, in the 1970s as a declarative mechanism for working with relational data. It has been used for four decades as a way of creating, modifying and querying data in almost every database on the planet. However, because databases store data before it is available for querying, this data is invariably old.

In the world of real-time data and streaming systems there is also a need to work with data, and Striim chose 5 years ago to use a variant of SQL for stream processing. This streaming SQL looks very much like the static database variant, but needs new constructs to deal with the differences between stored and real-time continuous data.

Database SQL works against an existing set of data and produces a result set. If the data changes, the SQL needs to be run again. Streaming SQL receives a continuous and never-ending amount of data, and continually produces new results as new data arrives.

The simplest things that can be done with this data are filtering and transformation. These operations work event-by-event with every input potentially creating zero or one output.

For example, if we want to limit data moving from one stream to another to a certain location, we could write a simple WHERE clause.

SELECT *
FROM OrderStream
WHERE zip = 94301

And if we want to combine first and last names into full name, we can use concatenation, with other, more complex, functions of course available.

SELECT *,
       FirstName + ' ' + LastName as FullName
FROM OrderStream
WHERE zip = 94301

However, because streaming queries receive events one-by-one, additional constructs are required for aggregate queries that work against a set of data, so windows and event tables need to be introduced.

A window contains a set of events bounded by some criteria. This could be the last 5 minutes worth of data, last 100 events, or hold events until no more arrive within a certain time. Windows can also be partitioned, so the sets are based on the criteria per some data value, for example last 100 actions carried out per customer. Event tables hold the last event that occurred for some key, for example the last temperature reading per room.

Streaming SQL can work against windows and event tables and will output results whenever there is any change. Aggregate queries against windows will recalculate whenever the window is updated, giving running counts, sums over micro-batches, or activity within a session.

For example, to create a running count and sum of purchases per item in the last hour, from a stream of orders, you would use a window and the familiar GROUP BY clause.

CREATE WINDOW OrderWindow
OVER OrderStream
KEEP WITHIN 1 HOUR
PARTITION BY itemId
 
SELECT itemId, itemName,
       COUNT(*) as itemCount,
       SUM(price) as totalAmount
FROM OrderWindow
GROUP BY itemId

Enriching data is just as easy; it uses the standard notion of a JOIN. The Striim platform supports all types of joins familiar to database users, including inner, outer, cross, and self-joins through nested queries. Striim enables users to load large amounts of data into in-memory caches and event tables from databases, files, HDFS, and other sources. This can be reference, context, or historical data, and can be updated through the incorporation of CDC.

For example, if we want to enrich the orders stream to include details about customer and location, we can join with reference data loaded into caches from the customer table and location database.

SELECT o.orderid, o.itemname,
       o.custid, o.price, o.quantity,
       c.name, c.age, c.gender, c.zip,
       z.city, z.state, z.country
FROM OrderStream o,
     CustInfo c, ZipInfo z
WHERE o.custid = c.id
AND   c.zip = z.zip

Of course, this just scratches the surface of what can be achieved through Streaming SQL. Production queries can be much more complex, utilizing case statements and even pattern matching syntax.
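
For instance, a query in the same style as the examples above might bucket orders by value using a CASE expression. This is only a sketch, with hypothetical thresholds and column names.

SELECT itemId, price,
       CASE
         WHEN price >= 1000 THEN 'large'
         WHEN price >= 100 THEN 'medium'
         ELSE 'small'
       END as orderSize
FROM OrderStream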

Real-Time Cloud Migration Monitoring with Striim

In this cloud migration monitoring demo, we will show how, by collecting change data from source and target and matching transactions applied to each in real time, you can ensure your cloud database is completely synchronized with on-premise, and detect any data divergence when migrating from an on-premise database.

This was originally published as a blog post here.

To learn more about the Striim platform, visit our platform overview page here.

 

Unedited Transcript:

Migrating applications to AWS requires more than just being able to run in VMs or cloud containers. Applications rely on data, and that data needs to be migrated as well. In most cases, the original applications are essential to the business and cannot be stopped during this process, since it takes time to migrate the data and time to verify the application after migration. It is essential that data changes are collected and delivered during and after that initial load. As the data is so crucial to the business and change data will be continually applied for a long time, mechanisms to verify that the data is delivered correctly are an important aspect of any cloud migration. This migration monitoring demo will show how, by collecting change data from source and target and matching transactions applied to each in real time, you can ensure your cloud database is completely synchronized with on-premise, and detect any data divergence when migrating from an on-premise database.

The key challenges with monitoring cloud database migrations include enabling data migration without a production outage, with monitoring during and after migration; detecting out-of-sync data should any divergence occur, with this detection happening immediately at the time of divergence; preventing further data corruption; running the monitoring solution non-intrusively with low overhead; and obtaining sufficient information to enable fast resynchronization. In our scenario, we’re monitoring the migration of an on-premise application to AWS. A Striim dashboard shows real-time status, complete with alerts, and is powered by a continuously running data pipeline. The on-premise application uses an Oracle database and cannot be stopped. The database transactions are continually replicated to an Amazon Aurora MySQL database. The underlying migration solution could either be Striim’s migration solution or another solution such as AWS DMS. The objective is to monitor ongoing migration of transactions and alert when any transactions go out of sync, indicating any potential data discrepancy.

This is achieved in the Striim platform through its continuous query processing layer. Transactions are continuously collected from the source and target databases in real time and matched within a time window. If matching transactions do not occur within a period of time, they’re considered long-running. If no match occurs in an additional time period, the transaction is considered missing. Alerts are generated in both cases. The number of alerts from missing transactions and long-running transactions are displayed in the dashboard. Transaction rates and operation activity are also available in the dashboard and can be displayed for all tables or just for critical tables and users. You can immediately see live updates and alerts when transactions do not get propagated to the target within a user-configured window, with long-running transactions that eventually make it to the target also tracked. The dashboard is fully customizable, making it easy to add additional visualizations for specific monitoring as necessary. You’ve seen how Striim can be used for continuous monitoring of your on-premise to cloud migrations. Talk to us today about this solution and get started immediately using a download from our website, or test out Striim in the AWS Marketplace.
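
Conceptually, the matching half of this can be written in the same streaming SQL style shown in the other posts on this page: hold recent source and target transactions in partitioned windows and join them on a transaction identifier to measure propagation lag. This is only a sketch with hypothetical stream, window, and column names (SourceTxStream, TargetTxStream, txnId, commitTime, applyTime), not the actual monitoring application.

CREATE WINDOW SourceTxWindow
OVER SourceTxStream
KEEP WITHIN 10 MINUTE
PARTITION BY txnId

CREATE WINDOW TargetTxWindow
OVER TargetTxStream
KEEP WITHIN 10 MINUTE
PARTITION BY txnId

SELECT s.txnId,
       t.applyTime - s.commitTime as lagMillis
FROM SourceTxWindow s, TargetTxWindow t
WHERE s.txnId = t.txnId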

Rapid Adoption of AWS Using Streaming Data Integration with CDC

In this video, Striim Founder and CTO, Steve Wilkes, talks about moving data to Amazon Web Services in real-time and explains why streaming data integration to AWS – with change data capture (CDC) and stream processing – is a necessary part of the solution.

To learn how Striim can help you continuously move real-time data into AWS, visit our Striim for AWS page.

 

Unedited Transcript:

Adopting Amazon Web Services is important to your business, and real-time data movement through streaming integration, change data capture, and stream processing are necessary parts of this process. You’ve already decided that you want to adopt Amazon Web Services, whether it’s Amazon RDS, Amazon Redshift, Amazon S3, Amazon Kinesis, Amazon EMR, or any number of other technologies. You may want to migrate existing applications to AWS, scale elastically as necessary, or use the cloud for analytics or machine learning. But running applications in AWS as VMs or containers is only part of the problem. You also need to consider how to move data to the cloud so that your applications and analytics are always up to date, and make sure the data is in the right format to be valuable. The most important starting point is ensuring you can stream data to the cloud in real time. Batch data movement can cause unpredictable load on cloud targets and has high latency, meaning the data is often hours old by the time it reaches an application.

For many applications, having up-to-the-second information is essential, for example, to provide current customer information, accurate business reporting, or real-time decision making. Streaming data from on-premise to Amazon Web Services requires making use of appropriate data collection technologies. For databases, this is change data capture, or CDC, which directly and continuously intercepts database activity and collects all the inserts, updates, and deletes as events, as they happen. Log data requires file tailing, which reads at the end of one or more files across potentially multiple machines and streams the latest records as they are written. Other sources like IoT data or third-party SaaS applications also require specific treatment in order to ensure data can be streamed in real time. Once you have streaming data, the next consideration is what processing is necessary to make that data valuable for your specific AWS destination, and this depends on the use case. For database migration or scalability use cases, where the target schema is similar to the source, moving raw data from on-premise databases to Amazon RDS or Aurora may be sufficient.

An important consideration here is that the source applications typically cannot be stopped, and it takes time to do an initial load. Because of this, collecting and delivering database change during and after the initial load is essential for zero-downtime migrations. For real-time applications sourcing from Amazon Kinesis, or analytics use cases built on Amazon Redshift or Amazon EMR, it may be necessary to perform stream processing before the data is delivered to the cloud. This processing can transform the data structure and enrich it with additional context information while the data is in flight, adding value to the data and optimizing downstream analytics. The Striim streaming integration platform can continuously collect data from on-premise or other cloud sources and deliver it to all of your Amazon Web Services endpoints. It can take care of initial loads as well as CDC for the continuous application of change, and these data flows can be created rapidly and monitored and validated continuously through our intuitive UI. With Striim, your cloud migration, scaling, and analytics can be built and iterated on at the speed of your business, ensuring your data is always where you want it, when you want it.

 
