The GeekNarrator

Kaivalya Apte

The GeekNarrator podcast is a show hosted by Kaivalya Apte who is a Software Engineer and loves to talk about Technology, Technical Interviews, Self Improvement, Best Practices and Hustle. Connect with Kaivalya Apte https://www.linkedin.com/in/kaivalya-apte-2217221a Tech blogs: https://kaivalya-apte.medium.com/ Wanna talk? Book a slot here: https://calendly.com/speakwithkv/hey Enjoy the show and please follow to get more updates. Also please don’t forget to rate and review the show. Cheers

  1. 7 GIỜ TRƯỚC

    Building a new Database Query Optimiser - @cmu ​

    Read more about Kafka Diskless-topics, KIP by Aiven:KIP-1150: https://fnf.dev/3EuL7mvSummary:In this conversation, Kaivalya Apte and Alexis Schlomer discuss the internals of query optimization with the new project optd. They explore the challenges faced by existing query optimizers, the importance of cost models, and the advantages of using Rust for performance and safety. The discussion also covers the innovative streaming model of query execution, feedback mechanisms for refining optimizations, and the future developments planned for optd, including support for various databases and enhanced cost models.Chapters00:00 Introduction to optd and Its Purpose03:57 Understanding Query Optimization and Its Importance10:26 Defining Query Optimization and Its Challenges17:32 Exploring the Limitations of Existing Optimizers21:39 The Role of Calcite in Query Optimization26:54 The Need for a Domain-Specific Language40:10 Advantages of Using Rust for optd44:37 High-Level Overview of optd's Functionality48:36 Optimizing Query Execution with Coroutines50:03 Streaming Model for Query Optimization51:36 Client Interaction and Feedback Mechanism54:18 Adaptive Decision Making in Query Execution54:56 Persistent Memoization for Enhanced Performance57:12 Guided Scheduling in Query Optimization59:55 Balancing Execution Time and Optimization01:01:43 Understanding Cost Models in Query Optimization01:04:22 Exploring Storage Solutions for Query Optimization01:07:13 Enhancing Observability and Caching Mechanisms01:07:44 Future Optimizations and System Improvements01:18:02 Challenges in Query Optimization Development01:20:33 Upcoming Features and Roadmap for optdReferences:- NeuroCard: learned Cardinality Estimation: https://vldb.org/pvldb/vol14/p61-yang.pdf- RL-based QO: https://arxiv.org/pdf/1808.03196- Microsoft book about QO: https://www.microsoft.com/en-us/research/publication/extensible-query-optimizers-in-practice/- Cascades paper: https://15721.courses.cs.cmu.edu/spring2016/papers/graefe-ieee1995.pdf- optd source code: https://github.com/cmu-db/optd- optd website (for now): https://db.cs.cmu.edu/projects/optd/For memberships: join this channel as a member here:https://www.youtube.com/channel/UC_mGuY4g0mggeUGM6V1osdA/joinDon't forget to like, share, and subscribe for more insights!=============================================================================Like building stuff? Try out CodeCrafters and build amazing real world systems like Redis, Kafka, Sqlite. Use the link below to signup and get 40% off on paid subscription.https://app.codecrafters.io/join?via=geeknarrator=============================================================================Database internals series: https://youtu.be/yV_Zp0Mi3xsPopular playlists:Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_dModern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsNStay Curios! Keep Learning!#database #queryoptimization #sql #postgres

    1 giờ 24 phút
  2. 8 GIỜ TRƯỚC

    Fast Observability on S3 with Parseable

    For memberships: join this channel as a member here:https://www.youtube.com/channel/UC_mGuY4g0mggeUGM6V1osdA/joinSummaryIn this conversation, Nitish Tiwari discusses Parseable, an observability platform designed to address the challenges of managing and analyzing large volumes of data. The discussion covers the evolution of observability systems, the design principles behind Parseable, and the importance of efficient data ingestion and storage in S3. Nitish explains how Parseable allows for flexible deployment, handles data organization, and supports querying through SQL. The conversation also touches on the correlation of logs and traces, failure modes, scaling strategies, and the optional nature of indexing for performance optimization.References:Parseable: https://www.parseable.com/GitHub Repository: https://github.com/parseablehq/parseableArchitecture: https://parseable.com/docs/architecture Chapters:00:00 Introduction to Parseable and Observability Challenges05:17 Key Features of Parseable12:03 Deployment and Configuration of Parseable18:59 Ingestion Process and Data Handling32:52 S3 Integration and Data Organisation35:26 Organising Data in Parseable38:50 Metadata Management and Retention39:52 Querying Data: User Experience and SQL44:28 Caching and Performance Optimisation46:55 User-Friendly Querying: SQL vs. UI48:53 Correlating Logs and Traces50:27 Handling Failures in Ingestion53:31 Managing Spiky Workloads54:58 Data Partitioning and Organisation58:06 Creating Indexes for Faster Reads01:00:08 Parseable's Architecture and Optimisation01:03:09 AI for Enhanced Observability01:05:41 Getting Involved with ParseableFor memberships: join this channel as a member here:https://www.youtube.com/channel/UC_mGuY4g0mggeUGM6V1osdA/joinDon't forget to like, share, and subscribe for more insights!=============================================================================Like building stuff? Try out CodeCrafters and build amazing real world systems like Redis, Kafka, Sqlite. Use the link below to signup and get 40% off on paid subscription.https://app.codecrafters.io/join?via=geeknarrator=============================================================================Database internals series: https://youtu.be/yV_Zp0Mi3xsPopular playlists:Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_dModern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsNStay Curios! Keep Learning!#database #s3 #objectstorage #opentelemetry #logs #metrics

    1 giờ 6 phút
  3. 8 GIỜ TRƯỚC

    How does AWS Lambda work?

    For memberships: join this channel as a member here:https://www.youtube.com/channel/UC_mGuY4g0mggeUGM6V1osdA/joinSummary:In this conversation, Kaivalya Apte and Rajesh Pandey talk about the engineering behind AWS Lambda, exploring its architecture, use cases, and best practices. They discuss the challenges of event handling, concurrency, and load balancing, as well as the importance of observability and testing in serverless environments. The conversation highlights the innovative solutions AWS Lambda provides for developers, emphasizing the balance between simplicity and complexity in cloud computing.Chapters:00:00 Introduction to AWS Lambda04:36 Use Cases and Best Practices for AWS Lambda09:34 Event Handling and Queue Management19:41 Idempotency and Event Duplication Challenges29:39 Cold Starts and Performance Optimization34:37 Statelessness and Resource Management in Lambda42:18 Understanding Micro-VMs and Cold Starts45:14 Resource Management and Recommendations for Developers47:04 Scaling and Back Pressure in Serverless Systems51:33 Cellular Architecture and Fairness in Resource Allocation55:23 Handling Problematic Events and Poison Pills01:01:03 Testing and Operational Readiness in Lambda01:14:11 Preparing for High Traffic EventsReferences:Handling Billions of invocations: https://aws.amazon.com/blogs/compute/handling-billions-of-invocations-best-practices-from-aws-lambda/Firecracker: https://firecracker-microvm.github.io/AWS Lambda: https://aws.amazon.com/lambda/Connect with Rajesh: https://x.com/RPandeyViewshttps://www.linkedin.com/in/rajeshpandeyiiit/Don't forget to like, share, and subscribe for more insights!=============================================================================Like building stuff? Try out CodeCrafters and build amazing real world systems like Redis, Kafka, Sqlite. Use the link below to signup and get 40% off on paid subscription.https://app.codecrafters.io/join?via=geeknarrator=============================================================================Database internals series: https://youtu.be/yV_Zp0Mi3xsPopular playlists:Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_dModern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsNStay Curios! Keep Learning!#aws #awslambda #serverless #distributedsystems #scalability #reliability

    1 giờ 17 phút
  4. 8 GIỜ TRƯỚC

    Breaking Distributed Systems with Kyle Kingsbury from Jepsen

    For memberships: join this channel as a member here:https://www.youtube.com/channel/UC_mGuY4g0mggeUGM6V1osdA/joinSummary:In this episode of The Geek Narrator podcast, host Kaivalya Apte interviews Kyle Kingsbury, a renowned expert in database and distributed systems safety analysis. They discuss the world of testing distributed systems, the challenges faced, common bugs and patterns. Kyle shares insights on the importance of understanding system documentation, the role of formal verification, and the balance between performance and safety in testing. He also provides valuable advice for aspiring engineers in the field of distributed systems.Chapters:00:00 Introduction to Kyle Kingsbury and His Work06:59 Common Bugs in Distributed Systems12:37 Functional Bugs vs Safety Bugs17:54 Changes in Testing Over the Years26:03 False Positives and Negatives in Testing32:33 The Importance of Experimentation in Testing39:28 Tools and Technologies for Testing48:58 The Role of Formal Verification57:04 Reusability of TestsImportant links:Distributed systems class: https://github.com/aphyr/distsys-classWrite your own distributed system: https://github.com/jepsen-io/maelstromJepsen Analyses: https://jepsen.io/analysesKey takeaways:- Reading documentation is a crucial first step in testing systems.- Testing distributed systems involves understanding their semantics and guarantees.- Common bugs often arise from mismanagement of definite versus indefinite failures.- Testing strategies for cloud-based systems require cooperation with providers.- Performance testing can reveal unexpected behaviours in systems under stress.- Formal verification remains a challenging but valuable tool in ensuring system safety.- The testing process is iterative and requires collaboration with engineering teams.- Aspiring engineers should immerse themselves in practical experiences to build intuition.For memberships: join this channel as a member here:https://www.youtube.com/channel/UC_mGuY4g0mggeUGM6V1osdA/joinDon't forget to like, share, and subscribe for more insights!=============================================================================Like building stuff? Try out CodeCrafters and build amazing real world systems like Redis, Kafka, Sqlite. Use the link below to signup and get 40% off on paid subscription.https://app.codecrafters.io/join?via=geeknarrator=============================================================================Database internals series: https://youtu.be/yV_Zp0Mi3xsPopular playlists:Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_dModern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsNStay Curios! Keep Learning!#databasearchitecture #distributedsystems #cloudcomputing #testing #jepsen

    1 giờ 5 phút
  5. 7 THG 4

    How do vector (search) databases work? ft: turbopuffer

    For memberships: join this channel as a member here:https://www.youtube.com/channel/UC_mGuY4g0mggeUGM6V1osdA/joinSummary:In this conversation, Kaivalya Apte and Simon Eskildsen talk about vector databases, particularly focusing on TurboPuffer. They discuss the importance of vector search, embeddings, and the challenges associated with building efficient search engines. The conversation covers various aspects such as cost considerations, chunking strategies, multi-tenancy, and performance optimization. Simon shares insights on the future of vector search and the significance of observability and metrics in database performance. The discussion emphasizes the need for practical application and experimentation in understanding these technologies.Chapters:00:00 Introduction to Vector Databases10:34 Understanding Vectors and Embeddings15:03 Example: Designing a Search Engine for Podcasts27:53 Scaling Challenges in Vector Search36:46 Indexing and Querying in TurboPuffer38:12 Understanding Indexing and Query Planning45:45 Exploring Index Types and Their Performance50:27 Data Ingestion and Embedding Retrieval54:19 Use Cases and Challenges in Vector Search01:01:22 Metrics and Observability in Vector Databases01:03:52 Future Trends in Vector Search and DatabasesReferences:How do build a database on Object Storage? https://youtu.be/RFmajOeUKnETurbopuffer https://turbopuffer.com/Continous Recall measurement: https://turbopuffer.com/blog/continuous-recallTurbopuffer architecture: https://turbopuffer.com/architecture

    1 giờ 9 phút
  6. 7 THG 4

    Are your Data Pipelines Complex?

    The GeekNarrator memberships can be joined here: https://www.youtube.com/channel/UC_mGuY4g0mggeUGM6V1osdA/joinMembership will get you access to member only videos, exclusive notes and monthly 1:1 with me. Here you can see all the member only videos: https://www.youtube.com/playlist?list=UUMO_mGuY4g0mggeUGM6V1osdA------------------------------------------------------------------------------------------------------------------------------------------------------------------About this episode: ------------------------------------------------------------------------------------------------------------------------------------------------------------------In this conversation, Jacopo and Ciro discuss their journey in building Bauplan, a platform designed to simplify data management and enhance developer experience. They explore the challenges faced in data bottlenecks, the integration of development and production environments, and the unique approach of Bauplan using serverless functions and Git-like versioning for data. The discussion also touches on scalability, handling large data workloads, and the critical aspects of reproducibility and compliance in data management. Chapters:00:00 Introduction03:00 The Data Bottleneck: Challenges in Data Management06:14 Bridging Development and Production: The Need for Integration09:06 Serverless Functions and Git for Data17:03 Developer Experience: Reducing Complexity in Data Management19:45 The Role of Functions in Data Pipelines: A New Paradigm23:40 Building Robust Data Solutions: Versioning and Parameters30:13 Optimizing Data Processing: Bauplan Runtime46:46 Understanding Control Planes and Data Management48:51 Ensuring Robustness in Data Pipelines52:38 Data Quality and Testing Mechanisms54:43 Branching and Collaboration in Data Development57:09 Scalability and Resource Management in Data Functions01:01:13 Handling Large Data Workloads and Use Cases01:09:05 Reproducibility and Compliance in Data Management01:16:46 Future Directions in Data Engineering and Use CasesLinks and References:Bauplan website:https://www.bauplanlabs.com

    1 giờ 23 phút
  7. 14 THG 3

    Redpanda - High Performance Streaming Platform for Data Intensive Applications

    The GeekNarrator memberships can be joined here: https://www.youtube.com/channel/UC_mGuY4g0mggeUGM6V1osdA/joinMembership will get you access to member only videos, exclusive notes and monthly 1:1 with me. Here you can see all the member only videos: https://www.youtube.com/playlist?list=UUMO_mGuY4g0mggeUGM6V1osdA------------------------------------------------------------------------------------------------------------------------------------------------------------------About this episode: ------------------------------------------------------------------------------------------------------------------------------------------------------------------In this conversation, Alex from Red Panda discusses his engineering background, the challenges faced in reliability engineering, and the journey of building a better streaming system. He emphasizes the importance of understanding latency and performance in engineering systems, the market position of Red Panda in relation to Kafka, and the complexities involved in optimizing codebases for better performance. In this conversation, Alex discusses Red Panda's architecture, focusing on its thread architecture, memory allocation mechanics, and the importance of protocol correctness. He highlights how Red Panda stands out in the data systems landscape by eliminating unnecessary complexities and optimizing performance across various latency spectrums. The discussion also touches on the future of data processing, emphasizing the shift towards agentic workloads and the integration of analytical and operational layers.Chapters00:00 Introduction11:07 Building a Better Streaming System19:10 Market Position and Competition25:06 Optimizing Latency and Performance32:38 Understanding Complexity in Codebases33:36 Thread Architecture and Concurrency Models39:39 Memory Allocation Mechanics47:31 Protocol Correctness and Optimization Strategies56:27 Red Panda's Unique Position in Data Systems01:02:05 The Future of Data Processing and Agentic WorkloadsBlogs:TPC buffers: https://www.redpanda.com/blog/tpc-buffershttps://www.redpanda.com/blog/always-on-production-memory-profiling-seastarhttps://www.redpanda.com/blog/end-to-end-data-pipelines-types-benefits-and-process------------------------------------------------------------------------------------------------------------------------------------------------------------------Like building real stuff?------------------------------------------------------------------------------------------------------------------------------------------------------------------Try out CodeCrafters and build amazing real world systems like Redis, Kafka, Sqlite. Use the link below to signup and get 40% off on paid subscription.https://app.codecrafters.io/join?via=geeknarrator------------------------------------------------------------------------------------------------------------------------------------------------------------------Link to other playlists. LIKE, SHARE and SUBSCRIBE------------------------------------------------------------------------------------------------------------------------------------------------------------------If you like this episode, please hit the like button and share it with your network. Also please subscribe if you haven't yet.Database internals series: https://youtu.be/yV_Zp0Mi3xsPopular playlists:Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_dModern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsNStay Curios! Keep Learning!#streaming #kafka #redpanda #c++ #databasesystems #SQL #distributedsystems #memoryallocation #garbagecollection

    1 giờ 5 phút

Xếp Hạng & Nhận Xét

5
/5
3 Xếp hạng

Giới Thiệu

The GeekNarrator podcast is a show hosted by Kaivalya Apte who is a Software Engineer and loves to talk about Technology, Technical Interviews, Self Improvement, Best Practices and Hustle. Connect with Kaivalya Apte https://www.linkedin.com/in/kaivalya-apte-2217221a Tech blogs: https://kaivalya-apte.medium.com/ Wanna talk? Book a slot here: https://calendly.com/speakwithkv/hey Enjoy the show and please follow to get more updates. Also please don’t forget to rate and review the show. Cheers

Có Thể Bạn Cũng Thích