In this episode of What’s New in Data, AWS VP and Distinguished Engineer Marc Brooker joins us to break down DSQL, Amazon’s latest innovation in serverless, distributed databases. We discuss how DSQL balances consistency, availability, and scalability—without the headaches of traditional relational databases. Tune in to hear how this new approach simplifies architecture, eliminates operational pain points, and sets a new standard for high-performance cloud databases.
Follow Marc on: X, Bluesky, LinkedIn, or his blog for more insights on distributed systems, databases, and the future of cloud computing.
Join us for a deep dive into the world of databases with CMU professor Andy Pavlo. We discuss everything from OLTP vs. OLAP to the challenges of distributed databases and why cloud-native databases require a fundamentally different approach than legacy systems. We also cover modern vector databases, RAG, embeddings, text-to-SQL, and industry trends.
What’s New In Data is a data thought leadership series hosted by John Kutay, who leads data and products at Striim. What’s New In Data hosts industry practitioners to discuss the latest trends, common real-world data patterns, and analytics success stories.
Discover how Alex Noonan transitioned from the flight deck of a Marine aircraft to the intricate world of data engineering. His unique journey, enriched by a stint in finance, gives us a firsthand view of the diverse backgrounds shaping the data industry. As Alex recounts his experiences, we explore the vibrant community he found on data Twitter, a realm buzzing with shared insights and collaborative spirit. However, the landscape shifted following Elon Musk’s takeover of Twitter, leading to content fragmentation and a migration towards emerging platforms like Bluesky. Join us as Alex discusses how these changes have impacted the cohesion and knowledge-sharing dynamics within the data community.
Navigate the complex world of professional networking with tips from Alex, as he breaks down the strategic use of platforms like LinkedIn, Reddit, and Hacker News for data professionals. Learn how to creatively tailor your content to fit the quirks of each platform’s algorithm, and prepare to engage with varied audiences. The conversation also highlights the transformative potential of AI tools in elevating data processes, reducing mundane tasks, and fostering high-value work. Discover innovations like Dagster and its role as an orchestrator, integrating key business intelligence tools to streamline the data engineer’s experience. This episode is a must-listen for anyone intrigued by the evolving interplay of technology, social media, and the power of community.
Everett Berry returns to the show with a treasure trove of insights on reshaping sales strategies through cutting-edge go-to-market data and AI advancements. Discover how Everett’s journey from prior roles to his pivotal role at Clay has equipped him to tackle the challenges of cleaning and enriching go-to-market data. He unveils how Clay’s innovative tools enhance data accuracy and coverage, empowering businesses to streamline their revenue operations by effectively leveraging both internal and third-party data. If you’re eager to work smarter and optimize your sales and marketing strategies, this episode promises invaluable lessons from a seasoned expert.
As AI technology rapidly evolves, Everett and John explore its transformative potential in sales operations and revenue processes. We dissect the interplay between AI agents and human interactions, the integration of customer data platforms with CRMs, and the blurred boundaries between RevOps and data teams. Imagine a future where AI agents autonomously manage data tasks, reshaping organizational structures and emphasizing collaboration between data and go-to-market teams. This episode is a must for those keeping pace with the swift evolution of sales technology, offering a glimpse into the future of autonomous data management and its implications for business success.
What does it take to go from leading Kafka development at Confluent to becoming a key figure in the PostgreSQL world? Join us as we talk with Gwen Shapira, co-founder and chief product officer at Nile, about her transition from cloud-native technologies to the vibrant PostgreSQL community. Gwen shares her journey, including the shift from conferences like O’Reilly Strata to PostgresConf and JavaScript events, and how the Postgres community is evolving with tools like Discord that keep it both grounded and dynamic.
We dive into the latest developments in PostgreSQL, like hypothetical indexes that enable performance tuning without affecting live environments, and the growing importance of SSL for secure database connections in cloud settings. Plus, we explore the potential of integrating PostgreSQL with Apache Arrow and Parquet, signaling new possibilities for data processing and storage.
At the intersection of AI and PostgreSQL, we examine how companies are using vector embeddings in Postgres to meet modern AI demands, balancing specialized vector stores with integrated solutions. Gwen also shares insights from her work at Nile, highlighting how PostgreSQL’s flexibility supports SaaS applications across diverse customer needs, making it a top choice for enterprises of all sizes.
Join us as we catch up with Chad Sanderson and Mark Freeman from Gable, live from Big Data London. Discover Chad’s insights from his well-attended talk and why the data scene in London has everyone buzzing. We’re diving deep into the concept of shifting data quality left, ensuring upstream data producers are as invested in data governance, privacy, and quality as their downstream counterparts. Chad and Mark also give us a sneak peek into their upcoming O’Reilly book on Data Contracts, complete with the charming Algerian racer lizard as its symbolic mascot.
In this engaging conversation, Chad and Mark offer practical advice for data operators ready to embark on the journey of data contracts. They emphasize the importance of starting small and nurturing a strong cultural initiative to ensure success. Listen as they share strategies on engaging leadership and fostering a collaborative environment, providing a framework not just for implementation but also for securing leadership buy-in. This episode is packed with expert advice and real-world experiences that are a must-listen for anyone in the data field.
John Kutay chimes in with examples of innovative data operators such as George Tedstone deploying Data Contracts at National Grid. Data Contracts and shifting data quality left will certainly be an area that many data teams prioritize as their workloads become increasingly operational.
Join us as we sit down with Joe Reis, live at Big Data LDN (London) 2024. Joe shares his partnership with DeepLearning.ai and AWS through his new course on Data Engineering. Joe’s new course promises to elevate your data skills with hands-on exercises that marry foundational knowledge with cutting-edge practices. We dive into how this course complements his seminal book, “Fundamentals of Data Engineering,” and why certification is valuable for those looking for foundational, hands-on knowledge to be a data practitioner.
But that’s not all; we also dissect the hurdles of adopting modern data architectures like data mesh in traditionally siloed companies. Using Conway’s Law as a lens, Joe discusses why businesses struggle to transition from outdated infrastructures to decentralized systems and why cross-disciplinary skills are crucial in this endeavor, a concept inspired by mixed martial arts that he calls ‘Mixed Model Arts’.
Can AI really make your data analysis as easy as talking to a friend? Join us for an enlightening conversation with Ethan Ding, the co-founder and CEO of TextQL, as he shares his journey from Berkeley graduate to pioneering the text-to-SQL technology that’s transforming how businesses interact with their data. Discover how natural language queries are breaking down barriers, making data analysis accessible to everyone, regardless of technical skill. Ethan delves into the historical hurdles and the game-changing advancements that are pushing the boundaries of AI and large language models in data querying.
Ever wondered how the quest for full autonomy in self-driving cars relates to data querying? We draw fascinating parallels between these two cutting-edge fields, emphasizing the importance of structured systems over chaotic, AI-driven approaches. This chapter reveals the often-overlooked limitations of current data management practices and underscores the critical need for high-quality data and robust modeling. Through a comparison of traditional business intelligence tools and advanced AI-driven solutions, we explore what truly makes data querying effective and insightful.
Hear from Ethan Ding, co-founder and CEO of TextQL, as he explains how their innovative tool integrates seamlessly with existing BI infrastructures, boosting productivity without the need for disruptive overhauls. Tune in to find out how TextQL is making data-driven decisions faster and smarter, paving the way for a future where data is everyone’s best friend.
What makes MotherDuck and DuckDB a game-changer for data analytics? Join us as we sit down with Jacob Matson, a renowned expert in SQL Server, dbt, and Excel, who recently became a developer advocate at MotherDuck.
During this episode, Jacob shares his compelling journey to MotherDuck, driven by his frequent use of DuckDB for solving data challenges. We explore the unique attributes of DuckDB, comparing it to SQLite for analytics, and uncover its architectural benefits, such as utilizing multi-core machines for parallel query execution. Jacob also sheds light on how MotherDuck is pushing the envelope with their innovative concept of multiplayer analytics.
Our discussion takes a deep dive into MotherDuck’s innovative tenancy model and how it impacts database workloads, highlighting the use of DuckDB format in Wasm for enhanced data visualization. Jacob explains how this approach offers significant compression and faster query performance, making data visualization more interactive. We also touch on the potential and limitations of replacing traditional BI tools with Mosaic, and where MotherDuck stands in the modern data stack landscape, especially for organizations that don’t require the scale of BigQuery or Snowflake. Plus, get a sneak peek into the upcoming Small Data Conference in San Francisco on September 23rd, where we’ll explore how small data solutions can address significant problems without relying on big data. Don’t miss this episode packed with insights on DuckDB and MotherDuck innovations!
Prepare to transform your understanding of data and cloud architecture with visionary CEO Alex Gallego of Redpanda. Discover how Alex’s journey from building racing motorcycles and tattoo machines as a child led him to revolutionize stream processing and cloud infrastructure. This episode promises invaluable insights into the shift from batch to real-time data processing, and the practical applications across multiple industries that make this transition not just beneficial but necessary.
Explore the intricate challenges and groundbreaking innovations in data storage and streaming. From Kafka’s distributed logs to the pioneering Redpanda, Alex shares the operational advantages of streaming over traditional batch processing. Learn about the core concepts of stream processing through real-world examples, such as fraud detection and real-time reward systems, and see how Redpanda is simplifying these complex distributed systems to make real-time data processing more accessible and efficient for engineers everywhere.
Finally, we delve into emerging trends that are reshaping the landscape of data infrastructure. Examine how lightweight, embedded databases are revolutionizing edge computing environments and the growing emphasis on data sovereignty and “Bring Your Own Cloud” solutions. Get a glimpse into the future of data ownership and AI, where local inferencing and traceability of AI models are becoming paramount. Join us for this compelling conversation that not only highlights the evolution from Kafka to Redpanda but paints a visionary picture of the future of real-time systems and data architecture.