Latest news with #MateiZaharia


Cision Canada
2 days ago
- Business
Databricks Donates Declarative Pipelines to Apache Spark™ Open Source Project
SAN FRANCISCO, June 11, 2025 /CNW/ -- Data + AI Summit -- Databricks, the Data and AI company, today announced it is open-sourcing the company's core declarative ETL framework as Apache Spark™ Declarative Pipelines. This initiative comes on the heels of Apache Spark reaching two billion downloads and the recent launch of Apache Spark 4.0. These releases build on Databricks' long-standing commitment to open ecosystems, ensuring users have the flexibility and control they need without vendor lock-in.

Spark Declarative Pipelines tackles one of the biggest challenges in data engineering, making it easy to build and operate reliable, scalable data pipelines end-to-end. It provides an easier way to define and execute data pipelines for both batch and streaming ETL workloads across any Apache Spark-supported data source, including cloud storage, message buses, change data feeds and external systems. This battle-tested declarative framework for building data pipelines helps engineers address common pain points like complex pipeline authoring, manual operations overhead and siloed batch/streaming.

Spark Declarative Pipelines is based on Databricks' core declarative ETL framework, which is used by thousands of customers. With the proven ability to handle complex data engineering workloads and low-latency streaming, Spark Declarative Pipelines lays the foundation for the next generation of data processing and governance. With Spark Declarative Pipelines, more community members can begin to cut engineering time and costs and reliably support new AI agent systems and other workloads in production.

"Our commitment to open source is unwavering. With origins in academia and the open source community, Databricks was founded in 2013 by the original creators of the lakehouse architecture and open source projects including Apache Spark, Delta Lake, MLflow and Unity Catalog," said Matei Zaharia, Co-founder and CTO of Databricks. "We worked closely with the community to help remove friction around data formats that kept information siloed. Spark Declarative Pipelines now gives enterprises an open way to build high-quality pipelines."

Key benefits of Spark Declarative Pipelines include:

- Simplifying pipeline authoring: Data engineers and analysts can quickly declare robust pipelines with minimal coding, focusing on delivering business-critical insights.
- Improved operability by design: Spark Declarative Pipelines helps catch issues earlier in development through clear pipeline definitions that are validated in full prior to execution, reducing the risk of failures downstream and making pipelines easier to troubleshoot and maintain.
- Unified batch and streaming: Data teams can flexibly meet both real-time and periodic processing needs through a single API for defining and managing batch and streaming data pipelines, simplifying development and maintenance.

"Declarative pipelines hide the complexity of modern data engineering under a simple, intuitive programming model. As an engineering manager, I love the fact that my engineers can focus on what matters most to the business. It's exciting to see this level of innovation now being open-sourced, making it accessible to even more teams."
-- Jian (Miracle) Zhou, Senior Engineering Manager, Navy Federal Credit Union

"At 84.51° we're always looking for ways to make our data pipelines easier to build and maintain, especially as we move toward more open and flexible tools. The declarative approach has been a big help in reducing the amount of code we have to manage, and it's made it easier to support both batch and streaming without stitching together separate systems. Open-sourcing this framework as Spark Declarative Pipelines is a great step for the Spark community."
-- Brad Turnbaugh, Sr. Data Engineer, 84.51°

About Databricks

Databricks is the Data and AI company. More than 15,000 organizations worldwide, including Block, Comcast, Condé Nast, Rivian, Shell and over 60% of the Fortune 500, rely on the Databricks Data Intelligence Platform to take control of their data and put it to work with AI. Databricks is headquartered in San Francisco, with offices around the globe, and was founded by the original creators of Lakehouse, Apache Spark™, Delta Lake, MLflow, and Unity Catalog. To learn more, follow Databricks on X, LinkedIn and Facebook.


Cision Canada
22-05-2025
- Business
Databricks Announces 2025 Data + AI Summit Keynote Lineup and Data Intelligence Programming
- The Data Intelligence for All themed event will feature keynotes from co-founders Ali Ghodsi, Matei Zaharia, Arsalan Tavakoli-Shiraji and Reynold Xin
- Attendees will hear from Jamie Dimon, Chairman and CEO, JPMorgan Chase; Dario Amodei, Co-founder and CEO, Anthropic; and Satya Nadella, Chairman and CEO, Microsoft

SAN FRANCISCO, May 22, 2025 /CNW/ -- Databricks, the Data and AI company, today announced the full agenda and lineup of featured speakers for this year's Data + AI Summit, a global event for the data and AI community. From June 9-12, more than 20,000 data and AI practitioners, leaders, and visionaries from over 160 countries are expected to convene at San Francisco's Moscone Center, with tens of thousands more joining virtually, to discover the latest developments in generative AI, machine learning, analytics and data governance, alongside Databricks' latest product innovations.

This year's Summit underscores Databricks' deep and ongoing commitment to San Francisco, building on the company's recent announcement that it will invest $1 billion in the city over the next three years. Tickets are still available for data leaders, open source enthusiasts, and Databricks customers and partners.

Data + AI Summit will feature keynotes from Databricks co-founders and original creators of Apache Spark™, Delta Lake, MLflow, and Unity Catalog: Ali Ghodsi, Matei Zaharia and Reynold Xin. Attendees will also hear from a broad lineup of data and AI luminaries, open source pioneers and global thought leaders, including:

- Jamie Dimon, Chairman and CEO, JPMorgan Chase
- Dario Amodei, Co-founder and CEO, Anthropic
- Satya Nadella, Chairman and CEO, Microsoft, in a pre-recorded, virtual fireside chat

This year's theme, Data Intelligence for All, will showcase how every organization can harness the power of data to develop its own AI apps and agent systems. Attendees can expect to hear from top experts, researchers and open source contributors as they share actionable best practices and compelling insights about their data journeys. Highlights include thought-provoking sessions from data leaders at pioneering companies like Adobe, Apple, AT&T, Block, DoorDash, Doctors Without Borders, GM, Joby, Mastercard, Meta, NVIDIA, SAP, Unilever, U.S. Department of Veterans Affairs, Virgin Atlantic and Walmart.

The annual event will feature a compelling program including technical training sessions, open source community meetups, networking opportunities and industry-specific breakout events, including the following highlights:

- 700+ breakout sessions highlighting data intelligence, generative AI, data sharing, data governance and industry trends. Speakers will cover the latest innovations in Databricks AI/BI, Databricks SQL, Delta Sharing, Lakeflow, Mosaic AI, Unity Catalog and more in keynotes, lightning talks and hands-on training. Attendees will also receive technical deep dives on leading open source projects and technologies like Apache Spark™, Delta Lake, MLflow and more.
- Sessions covering everything from data management and governance to building AI agent systems in practice. Hear from Databricks experts and industry leaders on how they're using data intelligence to drive real business transformation.
- Industry-specific content tracks that dive into the power of data intelligence within the Financial Services, Healthcare and Life Sciences, Public Sector, Retail and Consumer Goods, Manufacturing & Transportation, Energy & Utilities, Communications, Media and Advertising, Gaming and Tech industries. These forums and breakouts will highlight industry-specific data and AI use cases, customer panels, interactive demos and opportunities to connect with peers and partners in the Industry Lounge.
- Training and certifications with 30+ paid trainings, instructor-led courses and certification options. Courses will cover everything in the Databricks Data Intelligence Platform, from advanced data engineering to Gen AI model deployment, all hosted by industry-leading technical experts.
- Networking events for attendees to interact and collaborate with other industry experts. A Welcome Reception will be held on Tuesday, June 10, at 6:00 pm PT, along with additional partner networking events, executive networking events and dozens of evening receptions. Additionally, Databricks' annual party, Data After Hours, will be held at 7 pm PT on Wednesday, June 11.
- Databricks' annual Women in Data + AI meetup to honor women's remarkable contributions and achievements in artificial intelligence, open source and software development. Learn more about data and AI as panelists share their career journeys, demonstrating how they blazed their own trail in the data and AI field.
- The new Marketing Forum, which gathers industry experts and innovators to explore how data and AI are transforming marketing. Discover real-world use cases, customer panels and best practices for data-driven marketing.
- An all-day AI Agents Hackathon to kick off the first day of Summit, welcoming data scientists, engineers and analysts to join forces to create a novel proof of concept or unique technique for developing innovative AI agents using the Databricks Data Intelligence Platform.

Data + AI Summit will also showcase nearly 200 sponsors and partners, including AWS, Accenture, Deloitte, Google Cloud and Microsoft. Check out the event's full agenda to learn more.

About Databricks

Databricks is the Data and AI company. More than 10,000 organizations worldwide, including Block, Comcast, Condé Nast, Rivian, Shell and over 60% of the Fortune 500, rely on the Databricks Data Intelligence Platform to take control of their data and put it to work with AI. Databricks is headquartered in San Francisco, with offices around the globe, and was founded by the original creators of Lakehouse, Apache Spark™, Delta Lake, MLflow, and Unity Catalog. To learn more, follow Databricks on X, LinkedIn and Facebook.