The GeekNarrator Kaivalya Apte
- Technology
The GeekNarrator podcast is hosted by Kaivalya Apte, a software engineer who loves to talk about technology, technical interviews, self-improvement, best practices, and hustle.
Connect with Kaivalya Apte: https://www.linkedin.com/in/kaivalya-apte-2217221a
Tech blogs: https://kaivalya-apte.medium.com/
Wanna talk? Book a slot here: https://calendly.com/speakwithkv/hey
Enjoy the show and please follow to get more updates. Also please don’t forget to rate and review the show.
Cheers
-
Demystifying Real-time Analytics, Search and Hybrid Search with Dhruba, CTO @Rockset
In this video, I talk to Dhruba, CTO @Rockset, about search and real-time analytics. We discuss the deep internals of Rockset, its architecture, and why it is a great fit for search and real-time analytics use cases.
Chapters:
00:00 Introduction
02:45 The Evolution of Data Systems: From Hadoop to Rockset
07:30 Understanding Rockset: Real-Time Analytics and Search Defined
12:01 The Technical Edge: Rockset vs. Elasticsearch
18:16 Deep Dive into Rockset's Architecture and Internals
28:21 Partitioning, Hashing, and Data Distribution in Rockset
36:56 Exploring Hot Storage and Cache Layers
37:40 Why Hot Storage is Essential for Low Latency
39:05 Optimizing Data Storage with Compression and Delta Encoding
39:49 Balancing Cost and Performance in Data Storage
41:50 The Power of Converged Indexing in Rockset
45:50 Efficient Query Execution and Index Management
54:51 Leveraging Mutability for Real-Time Analytics
59:24 Deep Dive into Query Processing and Optimization
01:04:21 Understanding Joins and Reporting Queries in Rockset
01:12:23 Future Directions and Vector Search Innovations
Index Conference: https://rockset.com/index-conf/
Follow me on LinkedIn and Twitter: https://www.linkedin.com/in/kaivalyaapte/ and https://twitter.com/thegeeknarrator
If you like this episode, please hit the like button and share it with your network.
Also please subscribe if you haven't yet.
Database internals series: https://youtu.be/yV_Zp0Mi3xs
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Stay Curious! Keep Learning!
#rockset #elasticsearch #search #vectorsearch #realtime #databases #sql #joins #indexes
-
Rapidly Simulate Production Traffic ft. Michael Drogalis
In this episode we explore how to rapidly simulate production traffic with Michael Drogalis, using his creation, ShadowTraffic. You will likely relate to the problems discussed in this episode and appreciate how ShadowTraffic aims to solve them.
I hope you like this conversation.
Chapters:
00:00 Welcome to The Geek Narrator Podcast: Exploring Deep Tech
00:18 The Challenge of Simulating Production Traffic
00:59 Introducing ShadowTraffic: A Solution to Data Simulation
02:34 Understanding the Problem Space of Data Simulation
06:03 How ShadowTraffic Works: A Deep Dive
08:17 The Power of Declarative Data Generation with ShadowTraffic
10:40 ShadowTraffic's Architecture and Deployment
13:02 Configuring Load Testing and Throttling with ShadowTraffic
15:47 Testing and Validation in ShadowTraffic
20:42 Mimicking Production Data Distribution with ShadowTraffic
26:48 Innovative Features for Stream Processing Testing
28:47 ShadowTraffic: Adding Faults to Data for Robust Testing
29:04 Antithesis and ShadowTraffic: A Synergistic Approach
32:46 The Challenge of Generating Realistic Test Data
40:04 Enhancing Observability in Data Generation
41:50 Customer-Driven Roadmap and Future Vision
45:27 Closing Thoughts
ShadowTraffic: https://shadowtraffic.io/
Contact Michael: https://shadowtraffic.io/contact.html
#kafka #s3 #postgres #testing #streamprocessing #loadtesting #chaostesting #demo
-
High Performance with GraalVM - Alina Yurenko
If you're involved in the Java space, chances are you've come across #GraalVM. And if you are active in the tech community, you might have heard about the recent One Billion Row Challenge (1BRC) initiated by Gunnar Morling.
GraalVM truly showcased its capabilities in this challenge, which sparked my curiosity. That's why I reached out to Alina to delve deeper into GraalVM, explore its features, and uncover how it excels in such endeavors. And here we are, talking about GraalVM.
Chapters:
00:00 Introduction
01:47 GraalVM's Impact on the 1BRC Challenge and Its Features
04:34 Exploring GraalVM's Core Features and Benefits
08:34 Real-World Success Stories: GraalVM in Action
16:18 Understanding Native Image Compilation with GraalVM
20:34 Framework Compatibility and GraalVM Integration
25:04 Testing and Integration with GraalVM
25:26 Exploring Testing and Development with GraalVM
25:58 Best Practices for Developing with GraalVM
28:11 Migrating to GraalVM: Strategies and Considerations
31:25 Performance Optimization in GraalVM
35:15 Building and Resource Considerations for GraalVM
38:45 Expanding Horizons: Polyglot Programming with GraalVM
43:15 Future Directions and Limitations of GraalVM
47:40 Engaging the Java Community: GraalVM's Impact
50:21 Getting Started with GraalVM: Resources and Recommendations
References and Links:
- The GraalVM website with docs, downloads, guides: https://www.graalvm.org/
- Nicolai Parlog's "Modern Java in Action" demo: https://github.com/nipafx/modern-java-demo
- Alina's native version of Nicolai's demo: https://github.com/alina-yur/native-modern-java-demo
- For news, follow GraalVM: https://twitter.com/graalvm
#Java #jvm #graalvm #highperformance #JITcompiler #AOT #nativeimage #security #rust #c++
-
Taming TimeSeries Data with QuestDB - Javier Ramirez
In this episode I talk to Javier Ramirez from QuestDB about everything QuestDB. This episode is a great resource for understanding how QuestDB works, its architecture, what it is optimised for, and what's upcoming on the roadmap.
If you have timeseries data and need a simple yet highly scalable solution, #QuestDB is a great option.
Chapters:
00:00 Introduction
03:04 Understanding QuestDB: Origins and Use Cases
09:21 Deep Dive into QuestDB's Architecture and Data Ingestion
19:07 Optimizing Data Reads and Writes in QuestDB
28:40 Exploring Data Granularity and Partitioning in QuestDB
29:29 Optimizing Query Performance with Partition Strategies
30:26 Handling Data Ingestion and Query Efficiency
32:58 In-depth Look at Data Duplication and Ingestion Performance
34:55 Understanding Compression and Its Impact on Performance
38:51 Replication and Data Distribution Strategies
47:10 Observability and Metrics in QuestDB
50:57 Future Developments and Enhancements in QuestDB
58:45 Closing Remarks
Links:
QuestDB: https://questdb.io/
Github: https://github.com/questdb/questdb
===============================================================================
For a discount on the courses below:
Appsync: https://appsyncmasterclass.com/?affiliateId=41c07a65-24c8-4499-af3c-b853a3495003
Testing serverless: https://testserverlessapps.com/?affiliateId=41c07a65-24c8-4499-af3c-b853a3495003
Production-Ready Serverless: https://productionreadyserverless.com/?affiliateId=41c07a65-24c8-4499-af3c-b853a3495003
Click the "Add Discount" button and enter the discount code "geeknarrator" to get a 20% discount.
===============================================================================
#questdb #sql #timeseries #timeseriesanalysis #databases #highscale #scaleup #performance #parquet #S3 #replication #writeaheadlog #wal #durability #columnstore
-
Beat the CAP Theorem: Make Distributed Consistency Simple
In this episode I talk to Andras Gerlits, who founded omniledger.io. Andras has a very interesting view on how distributed consistency should work, one that can remove several of the bottlenecks involved in maintaining it.
He argues that getting rid of a global wall clock and using causality to approach distributed consistency helps you build resilient, simple, and performant systems. We go deeper into how that can be achieved and how the product works.
Chapters:
00:00 Introduction
00:52 Andras's Journey into Distributed Consistency
03:04 The Evolution of Data Consistency in Banking and Beyond
08:04 Introducing Client-Centric Consistency
10:36 Exploring the Standard Model of Distributed Consistency
16:01 Redefining Strong Consistency with a Relativistic Approach
34:25 Practical Implications of Client-Centric Consistency in Banking
36:20 Mitigating Latencies and Partitions in Distributed Systems
41:08 Exploring System Reliability and Availability
41:52 Tuning System Properties for Specific Use Cases
43:07 Comparing Standard and New Models for Data Management
45:08 Understanding Local Progress and Mutex-Free Updates
47:23 Deep Dive into Token-Based Ordering and Global Calibration
58:30 Introducing OmniLedger: A New Approach to Distributed Consistency
01:02:41 Performance Optimizations and Tunable Consistency
01:08:20 Ideal Use Cases and Potential Limitations of OmniLedger
01:14:30 Future Directions and Closing Thoughts
Links:
OmniLedger website:
https://omniledger.io
A long-form essay on the thinking behind the model:
https://medium.com/p/5e397cb12e63
A demo of transactionality:
https://www.youtube.com/watch?v=XJSSjY4szZE
Andras's blog, which may also be of interest:
https://medium.com/@andrasgerlits
The research paper, with all its mathematical rigour:
https://www.researchgate.net/publication/359578461_Continuous_Integration_of_Data_Histories_into_Consistent_Namespaces
#databases #sql #consistency #distributedsystems
-
A Graph Database That You Can Embed - KuzuDB
In this video I talk to Semih Salihoglu about KuzuDB: a highly scalable, extremely fast, easy-to-use embeddable graph database.
Chapters:
00:00 Introduction
00:40 The Genesis of KuzuDB: From Academic Research to Startup
06:40 Graph Databases 101: Understanding the Basics and Beyond
10:24 When to Opt for a Graph Database: Use Cases and Advantages
19:16 KuzuDB vs. Traditional Databases: A Comparative Analysis
24:39 Inside KuzuDB: Optimizations and Data Ingestion Explained
31:08 Exploring Query Optimizations in Graph Databases
31:34 The Relational Nature of Graph Databases
33:33 Factorization: A Key Optimization Technique
38:50 Integrating New Data Sources and Handling Joins
43:39 Optimizing Write Operations and Index Management
50:23 Comparing Kuzu with Other Graph Databases
58:50 Future Developments and Vision for Kuzu
Important links:
- History of DBMSs and of IDS, the first database in history, which had a graph-based model: https://dl.acm.org/doi/abs/10.1145/1147376.1147382 (a good paper by a CS historian and a must-read for everyone interested in the birth of databases as a field)
- Blog post on what every GDBMS should do and the vision of Kùzu: https://blog.kuzudb.com/post/what-every-gdbms-should-do-and-vision/
- The user survey paper that got Semih into GDBMSs: https://arxiv.org/pdf/1709.03188.pdf
- Blog post on factorization: https://blog.kuzudb.com/post/factorization/
- Kùzu's RDFGraphs feature: https://docs.kuzudb.com/rdf-graphs/