- Videos: 384
- Views: 234,738
Starburst
USA
Joined Jan 29, 2018
Unlock the power of a data Icehouse architecture, an open data lakehouse built on Trino and Apache Iceberg.
Federating data with Starburst Galaxy | Starburst Academy
Many organizations struggle with data spread across multiple systems (data lakes, warehouses, and operational databases), creating silos that slow down decision-making. Traditionally, businesses have relied on complex ETL pipelines to move and consolidate this data, but this approach is expensive, time-consuming, and difficult to maintain.
Data federation offers a more efficient solution by enabling direct access to data across different sources without requiring physical movement. Instead of duplicating and storing copies, a federated approach provides a virtual query layer, allowing teams to analyze data in place. This is particularly valuable for companies looking to maintain flexibility w...
Views: 27
Videos
Federated data products for data migrations | Starburst Virtual Events
27 views, 4 hours ago
Learn how data products and data federation work together to create a flexible and scalable data architecture with Starburst, powered by Trino. This webinar explores how data products enable a structured, governed approach to accessing and sharing data across an organization while leveraging data federation to connect disparate sources seamlessly. See how this combination provides a powerful st...
How Halliburton uses Starburst data products | Starburst Virtual Events
16 views, 4 hours ago
Discover how Halliburton leveraged Starburst data products to unlock real-time insights and drive data-driven decision-making. This webinar explores how Starburst’s federated query engine enables seamless access to distributed data, eliminating silos and accelerating analytics. Learn how Halliburton optimized its data architecture to support high-performance queries, improve operational efficie...
Starburst 101 Workshop: Iceberg + Trino | Starburst Virtual Events
24 views, 7 hours ago
Unlock the power of Icehouse architecture with Starburst and Apache Iceberg. This webinar explores how combining Trino with Iceberg enables a scalable, high-performance data lakehouse, delivering fast, secure, and cost-effective analytics. Learn how to transform raw data into governed, curated data products, ready to fuel AI, BI, and other critical workloads. Discover how Starburst Galaxy simpl...
Migrating workloads to a data lakehouse | Starburst Enterprise
31 views, 7 hours ago
Discover how to seamlessly migrate your workloads to a data lakehouse with Starburst Enterprise. This webinar explores how organizations can modernize their data architecture by transitioning from legacy systems to a flexible, high-performance lakehouse model. Learn how Starburst enables federated queries, supports Apache Iceberg, and provides a cost-effective, scalable alternative to tradition...
Starburst 101: Starburst Galaxy demo | Starburst Galaxy
19 views, 7 hours ago
Unlock the full potential of your data with Starburst Galaxy and Apache Iceberg. This webinar demo explores how an Icehouse architecture, powered by Starburst and Iceberg, is the best way forward for companies looking to modernize their data infrastructure. Learn how Starburst enables a hybrid lakehouse experience, allowing organizations to seamlessly query data across cloud and on-prem environ...
Highly scalable, policy-based data products | Starburst auf Deutsch
7 views, 7 hours ago
Discover how highly scalable, policy-based data products are revolutionizing data management. Learn how Starburst and Trino enable a flexible and secure data architecture to efficiently support data-driven decisions. #Datenprodukte #Datenmanagement #Starburst #Trino #DataGovernance
Build a data lakehouse with Trino and Iceberg | Starburst Virtual Events
17 views, 7 hours ago
Unlock the power of modern data architecture with Starburst and Apache Iceberg. This webinar explores how the combination of Trino and Iceberg, known as an Icehouse architecture, enables organizations to transform raw data in their data lakes into secure, curated data products ready for downstream analytics. Learn how to leverage Starburst Galaxy to build a scalable, high-performance data lakehou...
Discover Starburst & Trino in 30 minutes | Starburst en Français
8 views, 7 hours ago
Discover Starburst and Trino, the ideal solution for querying your data wherever it lives. Learn how to optimize your data architecture and harness the full potential of the data lakehouse. #Starburst #Trino #DataAnalytics #BigData #DataLakehouse
Building scalable Data Products with Starburst & AWS | Starburst Virtual Events
7 views, 7 hours ago
Explore how Starburst empowers organizations to create scalable data products and unlock the full potential of decentralized data architecture. This webinar features Resilience, a biopharmaceutical and medicine manufacturer, sharing their journey to adopting data products on AWS using Starburst. Learn what data products are, why Resilience moved beyond traditional data pipelines, and how they b...
Starburst powered data applications featuring Vectra | Starburst Virtual Events
37 views, 7 hours ago
Discover how Starburst enables the creation of powerful data applications and embedded analytics use cases, bringing big data directly into software applications. This webinar explores how Vectra, a cybersecurity leader, leveraged Starburst Galaxy to build an AI-driven threat prevention platform, unlocking new revenue opportunities and expanding into new markets. Learn how Starburst powers real...
Creating & querying data lake tables | Starburst Virtual Events
8 views, 7 hours ago
Learn how to create and query data lake tables with Starburst, powered by Trino, in this hands-on training session. This webinar walks through using Starburst Galaxy to connect with an AWS S3 object store, build external tables, and optimize data storage with columnar file formats. Discover best practices for designing partitioned tables to enhance query performance and see how federated querie...
Modern table formats & Apache Iceberg | Starburst Virtual Events
11 views, 7 hours ago
Explore how Apache Iceberg is transforming data lakehouses and why it's a game-changer compared to older technologies like Hive. This webinar dives into the advantages of modern table formats, showcasing how Iceberg enables ACID-compliant transactions, versioned tables, and time-travel queries while simplifying partitioning without costly table rebuilds. See how Starburst, powered by Trino, sea...
Build data products from your optimized data lake | Starburst Virtual Events
14 views, 7 hours ago
Discover how Starburst, powered by Trino, enables the creation of secure, high-performance data products on an open data lakehouse using Apache Iceberg. This webinar introduces the fundamentals of Starburst and Trino before diving into a hands-on lab where you’ll learn to build federated data pipelines that transform raw data into curated, governed data products. Using Starburst Galaxy, see how...
Exploring data pipelines, views, and data products | Starburst Virtual Events
1 view, 7 hours ago
Building efficient data pipelines doesn’t have to be complicated. This webinar explores how Starburst, powered by Trino, enables SQL-based data transformation, making ETL processes faster and more flexible. Learn how to construct data pipelines that integrate data from both your data lake and other sources, creating clean, optimized, and curated datasets, known as data products. See how data pro...
The next generation of data architecture | Starburst Virtual Events
13 views, 7 hours ago
Supercharge dbt with Starburst | Starburst Virtual Events
22 views, 7 hours ago
Unlock secure & compliant data access across regions | Starburst Virtual Events
16 views, 9 hours ago
Icehouse 101 Webinar | Starburst Virtual Events
68 views, 9 hours ago
How data leaders are closing the gap on data analytics | Starburst Galaxy
38 views, 14 hours ago
What do data products mean for your business? | Starburst Galaxy
35 views, 14 hours ago
Starburst + dbt = love at first sight in SQL | Starburst en Français
18 views, 14 hours ago
How to build an Agile data platform | Starburst Galaxy
76 views, 21 hours ago
The modern data stack illusion | Starburst Galaxy
53 views, 1 day ago
Reimagining data governance in the age of AI | Starburst
27 views, 1 day ago
What is data driven innovation? | Starburst
61 views, 1 day ago
Dell + Starburst: The future of Analytics and AI | Starburst Galaxy
189 views, 1 day ago
How to secure flexible supply chain data | Starburst Galaxy
6 views, 1 day ago
Game theory in data analytics | Starburst Galaxy
47 views, 1 day ago
PyStarburst - Dataframe API for Trino | Starburst ask-me-anything #3
43 views, 1 day ago
Worth watching every second. I was looking for a video about how the workflow goes and here it is. keep doing the good work <3 earned a subscriber <3
Fantastic! We love to hear this, and we've got lots of new content in the pipeline similar to this :)
Fantastic high-level explanation, thank you!
Thanks so much! With so much interest in Apache Iceberg this year, we thought it was the perfect time.
How to be a part of similar live Q/As in the future and get the queries answered?
We'll be having regular Q&As/AMAs in the future so there will be lots of opportunities down the line. In the meantime, feel free to reach out on the Starburst Community forum (as you have been) or on here and Lester can answer your questions in the comments.
You guided me on a community post a couple of days ago regarding cross-region data sources on Galaxy. Glad to be the first viewer on this video.
That's amazing! We're so glad that you're finding answers on the Starburst Community forum. Great to have you over at YouTube too :)
Yes, I remember it! Thanks for the comments.
great explanation! thanks
Glad it was helpful!
Loved the presentation delivery!
Awesome! So glad you liked it.
It seems like it will be a future mess
Many companies actually find that an Iceberg/Icehouse architecture makes things easier to manage for the long term compared to older, alternative approaches. What kind of use case were you thinking about? We can always chat through the options with you if you're considering different angles.
Great explanation.
Thank you! We loved the analogy too.
Incredible talk. Definitely agree, better to catch data issues as early as possible in the pipeline, which requires buy-in from data producers. I like the idea of automated testing on changes to schema as well
Glad it was helpful!
It was very interesting to see this presentation reflect efforts similar to what I undertook in my past jobs. The producers were often not tech or data savvy and lacked understanding of how downstream users leveraged the data they created - left hand not talking to the right hand. There was a significant gap between producers, the data team (aka bottlenecks), and data consumers. It became difficult to determine the ROI for data management/governance tools like Collibra as it did not directly hit bottom lines of companies. I'm excited to see this becoming a trend, as it will undoubtedly help organizations and employees easily access data, understand, trust, and use it for data-driven decision-making. I am watching this space closely. Thank you for sharing!
@TeeA4 Thanks for sharing. We're excited too! It feels like these things are only becoming more talked about and more important.
Can a view be created in Starburst using federation, and is a union possible instead of a join?
Yeah, you can use unions instead of joins if you prefer when federating.
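A minimal sketch of the idea in the reply above: a view that stacks rows from two sources with UNION ALL instead of joining them. This is not Starburst or Trino code. It uses Python's built-in sqlite3 with two attached in-memory databases as stand-ins for two catalogs, and the names `lake`, `pg`, `orders`, and `all_orders` are hypothetical. In Trino the same shape would use fully qualified `catalog.schema.table` names.

```python
import sqlite3

# Two attached in-memory databases stand in for two federated sources.
# The schema names "lake" and "pg" are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS lake")
conn.execute("ATTACH DATABASE ':memory:' AS pg")

conn.execute("CREATE TABLE lake.orders (order_id INTEGER, amount REAL)")
conn.execute("CREATE TABLE pg.orders (order_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO lake.orders VALUES (?, ?)", [(1, 10.0), (2, 20.0)])
conn.executemany("INSERT INTO pg.orders VALUES (?, ?)", [(3, 30.0)])

# SQLite only lets TEMP views reference tables in other attached databases.
# The view unions the two sources rather than joining them.
conn.execute(
    """
    CREATE TEMP VIEW all_orders AS
    SELECT order_id, amount, 'lake' AS source FROM lake.orders
    UNION ALL
    SELECT order_id, amount, 'pg' AS source FROM pg.orders
    """
)

rows = conn.execute(
    "SELECT order_id, amount, source FROM all_orders ORDER BY order_id"
).fetchall()
for row in rows:
    print(row)
```

Both branches of the union must return the same number of columns with compatible types, which is why each SELECT lists its columns explicitly.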
Hello. When I populate an Iceberg table (with the catalog in AWS Glue) with data through Trino, the location property becomes empty and I cannot build statistics, optimize the table, etc. Could you tell me how to keep the location property intact?
Thanks for reaching out. If you're using Trino and Iceberg with AWS Glue, I think this article might help unpack it: www.starburst.io/blog/aws-glue-iceberg-s3/ Hope it helps!
ruclips.net/video/wYX_zhlTDr8/видео.htmlsi=YJHpgqQi44Oxlo8O
Hey, I figured out if you roll the starburst into balls, they are more juicy and flavorful. And if you mix the flavors together it makes it better. You should roll all the starburst into balls
We never tire of Starburst puns around here.
good starting example!
Thanks! Glad you enjoyed it.
It would have been nice if the relationship between Iceberg and S3 had been explained, because S3 was suddenly browsed to look at the Iceberg table metadata and manifests. Thanks 👍
Great video, thanks.
Glad you liked it!
I love these values!
Thank you so much!
Nice🔥
Thanks!
Hive no longer uses MapReduce but Apache Tez, which follows a DAG model and avoids reloading data multiple times.
Yes, good call out! Although Hive/Tez isn't used in data lakehouses either, Apache Tez is used in some Hive implementations of data lakes and it does reduce some of the issues associated with traditional Hive. You can think of those Hive implementations as moving a bit closer to a data lakehouse but a true lakehouse requires one of the 3 modern table formats: Iceberg, Delta Lake, or Hudi.
Thank you - a very easy to understand precis of the technology.
Awesome! Glad you enjoyed it. And we think Trino is only going to get more popular in the years to come with some of the changes happening in the data landscape.
Niice
Glad you liked it! Apache Iceberg is so big right now.
Reminds me of my Army core values with a bit of the oz principle. Loyalty, duty, respect, selfless service, honor, integrity, personal courage.
Thank you!
I just applied for a role there and this video certainly has me excited about Starburst. Love the start-up mentality and environment, the fact that you ground yourself in specific core values, and Justin comes across as an authentic and genuinely good person. Wish me luck!
Sounds awesome... A comparison to existing stacks would be useful, though, to show what the alternatives would be.
Glad you enjoyed it! That's a great idea for another video. Look out for something similar soon. In the meantime, you can read more about the data Icehouse here: www.starburst.io/info/icehouse/
Should there be any technical concern if I use the ODBC connector instead of the recommended Starburst Trino driver for connecting to Tableau?
Our Tableau extension is released for JDBC and tuned for it. You can use the Starburst ODBC Driver with Tableau and point to Starburst as a generic ODBC source, but it's not generally recommended unless you can't install JDBC drivers. We have done a number of optimizations and bug fixes for our Tableau extension that you won't get when using the ODBC driver, and in our experience the ODBC setup is more difficult and error-prone.
Hope that answers your question OP!
Incredible
Thanks!
I was just imagining how much data storage is consumed on a daily basis and the exponential rise in use, in view of exploding social media. I'm thinking about its sustainability, or whether we need new tech to reduce its footprint.
We love thinking about sustainable solutions when it comes to data and data architecture!
Half that data is bullshit misinformation and pop up ads
But to use the Iceberg connector for Trino you need a Hive Metastore. Why?
You don't need a Hive metastore. You need a metastore; that's how Iceberg works. Trino supports Hive, Glue, Nessie, JDBC, REST and Snowflake metastores with the Iceberg connector.
This is correct! Although you can use the Hive metastore, you have many other options. This is a big part of the draw for the openness of Iceberg. You get to decide which components to use, including the metastore.
You're probably thinking that the metadata is stored on the data lake along with the data. Well, it is, but the Iceberg spec calls out that the name of the specific metadata file that contains the current snapshot is also stored. This is crucial to optimistic concurrency when a write happens. Basically, a new version needs to be based on the prior version (or a descendant of it in certain circumstances), and the catalog helps keep this under control.
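A toy sketch of the optimistic-concurrency idea described above, assuming a catalog that stores only a pointer to each table's current metadata file and swaps it atomically. This is not Iceberg's or Trino's API. `ToyCatalog`, `CommitConflict`, the table name, and the metadata file names are all hypothetical, and the check here is the simplest case, where the writer's base must equal the current pointer.

```python
import threading


class CommitConflict(Exception):
    """Raised when the table pointer moved after the writer read it."""


class ToyCatalog:
    """Hypothetical in-memory catalog: table name -> current metadata file."""

    def __init__(self):
        self._pointers = {}
        self._lock = threading.Lock()

    def current(self, table):
        # Returns None if the table has no committed metadata yet.
        return self._pointers.get(table)

    def commit(self, table, expected, new):
        # Compare-and-swap: the swap succeeds only if the pointer still
        # matches the metadata file the writer based its new version on.
        with self._lock:
            if self._pointers.get(table) != expected:
                raise CommitConflict(f"{table} is no longer at {expected}")
            self._pointers[table] = new


catalog = ToyCatalog()
catalog.commit("db.events", None, "v1.metadata.json")

# Two writers read the same base version.
base_a = catalog.current("db.events")
base_b = catalog.current("db.events")

# Writer A commits first and moves the pointer to v2.
catalog.commit("db.events", base_a, "v2.metadata.json")

# Writer B is still based on v1, so its commit is rejected.
try:
    catalog.commit("db.events", base_b, "v3.metadata.json")
    conflict = False
except CommitConflict:
    conflict = True

print(conflict, catalog.current("db.events"))
```

In this sketch the rejected writer would have to re-read the current pointer, rebase its changes on that version, and try the commit again.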
But we are not storing data at the subatomic level. So how could you just add up the number of electrons and use that as the weight of the data?
If you read what's on screen at that point, it says, "This is a huge oversimplification, but it's a lower bound." Explaining transistors as part of a 60s short is a little tough, but it's a good topic for a follow-up video!
Really cool! I wonder what the composition of that 100 ZB is 🤔
Yeah, it's interesting to think about!
Nice job on this, cool data!
Glad you enjoyed it!
You forgot to put in a video about how it works; it really helps a lot.
Thanks for commenting. This video was designed to tackle Trino's strategic displacement of similar query technologies in the ecosystem. Other videos do go into granular technical detail about how Trino works as a query engine. You might really enjoy this one for instance: ruclips.net/video/uE2rc7HxpCs/видео.html
How on earth are consumers able to generate the complex ETL logic required to produce the final dataset that they want in the first place? If not producers, who will generate the final dataset on which consumers will build reports? It seems too theoretical, just adding new jargon to the same thing.
Thanks for your comment! You're right that data products don't replace an ETL pipeline, but what they do allow is certain core ETL processes to be abstracted out from data producers/data engineering teams and packaged up as a curated "product". This product is easy to access and can be shared with data producers and data consumers alike according to access controls. This can mean different things for different organizations, but it allows some of the workload to shift to data consumer teams and frees up data engineering resources for more complex tasks. It's part of the process of "democratizing" data engineering and although it doesn't replace traditional ETL, it makes a subset of that workload more accessible to a wider number of teams. Hope that helps! Since you're interested in data products, we've got some other resources to help you explore: www.starburst.io/solutions/data-products/
Didn’t even explain what is meant by context for consumers
makes lot of sense
Glad you enjoyed the video!
legend
Glad you enjoyed it!
Great Learning and explanation.
So glad it was helpful for you!
The data which cannot be handled by traditional systems
Totally. Also a good description of the kinds of systems that Trino and Starburst handle best.
42 ftw
Impressive presentation!
Glad you liked it!
Great analogy, Adrian! I suppose we're at the "compounding apothecary" stage now but the future looks bright for the vision you've described.
So glad you enjoyed it!
Thank you honda
Thanks for sharing
Thank you. Glad you enjoyed it!
@StarburstData I want to talk with you about your YouTube channel monetization
Ownership and authenticity are the most important. I look forward to working with your company. I just applied for an intern UI/UX role.
Great and amazing
Great values to have!
I applied to your company. I would love to work remotely as an intern with your company.
Thanks for the waste of time: the first part of the video presents a SQL client, the second part presents a visualization... the target audience of the video is kindergarteners.
This makes sense
Perfect! Glad you enjoyed it. If you're interested in learning more about Starburst data products, we have a free course on it: academy.starburst.io/exploring-data-products
Own it. Love this!
can you please share if there is a demo for Starburst and Power BI
The best step-by-step walkthrough we have of Starburst and PowerBI is in our documentation: docs.starburst.io/clients/powerbi.html Hope this helps :)