The Backend Engineering Show with Hussein Nasser

Hussein Nasser
The Backend Engineering Show with Hussein Nasser

Welcome to the Backend Engineering Show podcast with your host Hussein Nasser. If you like software engineering you’ve come to the right place. I discuss all sorts of software engineering technologies and news with specific focus on the backend. All opinions are my own. Most of my content in the podcast is an audio version of videos I post on my youtube channel here http://www.youtube.com/c/HusseinNasser-software-engineering Buy me a coffee https://www.buymeacoffee.com/hnasr 🧑‍🏫 Courses I Teach https://husseinnasser.com/courses

  1. ٢٩ ربيع الآخر

    Six stages of a good software engineer

    You get better as a software engineer when you go through these stages. 0:00 Intro  1:15 Understand a technology 7:07 Articulate how it works 15:30 Understand its’ limitations 19:48 Try to build something better 27:45 Realize what you built also has limitations 32:48 Appreciate the original tech as is Understand a technology   We use technologies all the time without knowing how it works. And it is ok not knowing how things work if interests isn’t there. But when there is interest to understand how something works, pursue it. It feels good when you understand how something works because you work better with it, you swim with the tide instead of against it.  When I learned how TCP/IP work..  you would appreciate every connection request, how you read requests. You will ask questions,  what is my code doing here?  When exactly I’m creating connections? When am I reading from the connection?  Is it safe to share connections? Articulate how it works This one is not easy, you might think you understand something until you try to explain how it works. If you find yourself using jargon you probably don’t understand and you just try to impress others. Have you seen people who want to talk about something to show they understand it? It’s the opposite. Try to truly articlate how it works, you will really understand it , back to 1. I thought I understand how backend reads requests until I tried to speak to it.  Understand the technology limitations Once 1,2 are done you will truly understand the tech, now you are confidant, you are excited about the tech and you will truly see when you can use the tech to its full potential and also know the weak points of the tech where it breaks, this happens a lot with TCP/IP. We know tcps limitations.  Try to build something better This one is optional and can be skipped, but attempting to design or building something better then the tech because you know the limitations will truly reveal how you became better. But the challenge here is the ego, you might understand the limitations but you problem is thinking that what you will build is flawless. This step must be proceed with caution.  Realize what you build also has limitation Dust settles.. this step hurts, and you may take a while to realize it, but whatever you build will have flaws… and when you realize this it is when you get better as an engineer.  Appreciate the tech as is This is when you are back full circle you are back to the first stage, look at the technology and understand it but don’t judge it.. just know the limitations and its strength and flow with it. Stop fighting and instead build around a tech, does that mean you shouldn’t build anything new, of course not. Go build, but don’t stress around making something better to defeat existing tech. But actually build it for building it.

    ٣٩ من الدقائق
  2. ١٥ ربيع الآخر

    Cloudflare's 150ms global cache purge | Deep Dive

    Cloudflare built a global cache purge system that runs under 150 ms. This is how they did it. Using RockDB to maintain local CDN cache, and a peer-to-peer data center distributed system and clever engineering, they went from 1.5 second purge, down to 150 ms. However, this isn’t full picture, because that 150 ms is just actually the P50. In this video I explore Clouldflare CDN work, how the old core-based centralized quicksilver, lazy purge work compared to the new coreless, decentralized active purge. In it I explore the pros and cons of both systems and give you my thoughts of this system. 0:00 Intro 4:25 From Core Base Lazy Purge to Coreless Active 12:50 CDN Basics 16:00 TTL Freshness 17:50 Purge 20:00 Core-Based Purge 24:00 Flexible Purges 26:36 Lazy Purge 30:00 Old Purge System Limitations 36:00 Coreless / Active Purge 39:00 LSM vs BTree 45:30 LSM Performance issues 48:00 How Active Purge Works 50:30 My thoughts about the new system 58:30 Summary Cloudflare blog https://blog.cloudflare.com/instant-purge/ Mentioned Videos Cloudflare blog https://blog.cloudflare.com/instant-purge/ Percentile Tail Latency Explained (95%, 99%) Monitor Backend performance with this metric https://www.youtube.com/watch?v=3JdQOExKtUY How Discord Stores Trillions of Messages | Deep Dive https://www.youtube.com/watch?v=xynXjChKkJc Fundamentals of Operating Systems Course https://os.husseinnasser.com Backend Troubleshooting Course https://performance.husseinnasser.com

    ١ س ٢ د
  3. ١٠ ربيع الأول

    When do you use threads?

    Fundamentals of Operating Systems Course https://os.husseinnasser.com When do you use threads? I would say in scenarios where the task is either 1) IO blocking task 2) CPU heavy 3) Large volume of small tasks In any of the cases above, it is favorable to offload the task to a thread. 1) IO blocking task When you read from or write to disk, depending on how you do it and the kernel interface you used, the write might be blocking. This means the process that executes the IO will not be allowed to execute any more code until the write/read completes. That is why you see most logging operations are done on a secondary thread (like libuv that Node uses) this way the thread is blocked but the main process/thread can resume its work. If you can do file reads/writes asynchronously with say io_uring then you technically don't need threading. Now notice how I said file IO because it is different than socket IO which is always done asynchronously with epoll/select etc. 2) CPU heavy The second use case is when the task requires lots of CPU time, which then starves/blocks the rest of the process from doing its normal job. So offloading that task to a thread so that it runs on a different core can allow the main process to continue running on its the original core. 3) Large volume of small tasks The third use case is when you have large amount of small tasks and single process can't deliver as much throughput. An example would be accepting connections, a single process can only accept connections so fast, to increase the throughput in case where you have massive amount of clients connecting, you would spin multiple threads to accept those connections and of course read and process requests. Perhaps you would also enable port reuse so that you avoid accept mutex locking. Keep in mind threads come with challenges and problems so when it is not required. 0:00 Intro 1:40 What are threads? 7:10 IO blocking Tasks 17:30 CPU Intensive Tasks 22:00 Large volume of small tasks

    ٣١ من الدقائق
  4. ٢٩ صفر

    Postgres is combining IO in version 17

    Learn more about database and OS internals, check out my courses  Fundamentals of database engineering https://databases.win  Fundamentals of operating systems https://oscourse.win This new PostgreSQL 17 feature is game changer. You see, postgres like most databases work with fixed size pages. Pretty much everything is in this format, indexes, table data, etc. Those pages are 8K in size, each page will have the rows, or index tuples and a fixed header. The pages are just bytes in files and they are read and cached in the buffer pool. To read page 0, for example, you would call read on offset 0 for 8192 bytes, To read page 1 that is another read system call from offset 8193 for 8192, page 7 is offset 57,345 for 8192 and so on.  If table is 100 pages stored a file, to do a full table scan, we would be making 100 system calls, each system call had an overhead (I talk about all of that in my OS course).  The enhancement in Postgres 17 is to combine I/Os you can specify how much IO to combine, so technically while possible you can scan that entire table in one system call doesn’t mean its always a good idea of course and Ill talk about that.  This also seems to included a vectorized I/O, with preadv system call which takes an array of offsets and lengths for random reads.  The challenge will become how to not read too much, say I’m doing a seq scan to find something, I read page 0 and found it and quit I don’t need to read any more pages. With this feature I might read 10 pages in one I/O and pull all its content, put in shared buffers only to find my result in the first page (essentially wasting disk bandwidth, memory etc)  It is going to be interesting to balance this out.

    ٢٨ من الدقائق

التقييمات والمراجعات

٥
من ٥
‫٤ من التقييمات‬

حول

Welcome to the Backend Engineering Show podcast with your host Hussein Nasser. If you like software engineering you’ve come to the right place. I discuss all sorts of software engineering technologies and news with specific focus on the backend. All opinions are my own. Most of my content in the podcast is an audio version of videos I post on my youtube channel here http://www.youtube.com/c/HusseinNasser-software-engineering Buy me a coffee https://www.buymeacoffee.com/hnasr 🧑‍🏫 Courses I Teach https://husseinnasser.com/courses

قد يعجبك أيضًا

للاستماع إلى حلقات ذات محتوى فاضح، قم بتسجيل الدخول.

اطلع على آخر مستجدات هذا البرنامج

قم بتسجيل الدخول أو التسجيل لمتابعة البرامج وحفظ الحلقات والحصول على آخر التحديثات.

تحديد بلد أو منطقة

أفريقيا والشرق الأوسط، والهند

آسيا والمحيط الهادئ

أوروبا

أمريكا اللاتينية والكاريبي

الولايات المتحدة وكندا