287 episodes

Technical interviews about software topics.

Software Daily SoftwareDaily.com

    Internet Archive Book Scanning with Davide Semenzin

    The Internet Archive collects historical records of the Internet. The Wayback Machine is one Internet Archive tool that you may be familiar with. A project you may be less familiar with is book scanning: the Internet Archive scans high volumes of books in order to digitize them.
    In today’s episode, Davide Semenzin joins the show to talk through the history of the Internet Archive and the engineering behind book digitization. We talk through OCR, storage, architecture, and scalability.

    Software Daily

    For the last five months, we have been working on a new version of Software Daily, the platform we built to host and present our content. 
    We are creating a platform that integrates the podcast with a set of other features that make it easier to learn from the audio interviews. 
    Software Daily includes the following features:


    The world of software is large, and growing bigger every day. Software Daily is a place to explore this world of software companies and projects.
    If the podcast is a useful resource for you to learn about software, then Software Daily might also provide you with value. This post (and episode) is a brief description of the features that we have built into Software Daily.
    If you want to listen to Software Engineering Daily without ads, you can become a paid subscriber, paying $10/month or $100/year by going to softwaredaily.com/subscribe. We now have an RSS feed that paid customers can add to a podcast player like Overcast (on iOS) or Podcast Addict (on Android). You can also listen to the premium episodes using our apps for iOS or Android.
    Whether you are a listener who is fine with listening to ads, or you are a listener who pays to hear episodes without ads, we are happy to have you tuning in.
    Apple Podcasts limits the number of episodes in an RSS feed to 300. The feed with the last 300 episodes is available by searching for Software Daily. In total, we have more than 1200 episodes in our back catalog.
    Listeners often want to find all our episodes on React, or Kubernetes, or serverless, or self-driving cars. We have been covering these topics for years, and much of the old content has retained its value. Software Daily allows you to easily find all the episodes relating to a subject that you are interested in.
    You can also find our most popular episodes, ranked by how people interact with them.
    Additionally, episode transcripts have interactive features with highlighting, commenting, and discussions. We want to create a Medium-like experience for the episodes.
    Software Daily is a place where listeners can write about the topics they are listening to. When you are listening to lots of episodes about a topic such as GraphQL, you may find it useful to write about that topic as a form of active learning. The topic pages also have a Q&A section. Post questions about a topic, or post an answer. Engage in the community dialogue surrounding a topic you are passionate or curious about. If there is a topic you want to write about, check out softwaredaily.com/write.
    We will be turning the best written content into short podcast episodes, published on weekends, in which we read your contribution and mention your name. If you write something awesome, we want to turn it into audio for larger distribution.
    Every topic on Software Daily has a Q&A section. We have covered lots of niche software companies and open source projects, and on Software Daily we want to collect more information about the world of software with Q&A.
    If you want to write about a specific company or topic that you heard about on Software Daily, Q&A is also an option. Our goal with Q&A is to provide a companion experience to listening to the podcast. It is not always easy to retain what you hear in a podcast episode. Answering some questions after you listen to an episode can help with that retention.
    Are you looking to hire someone specific in the world of software? Post a job on the Software Daily jobs board. We will be announcing some of these jobs on the podcast, especially the more interesting postings, and ones that align with content we are producing.
    We appreciate you tuning into Software Daily. We would welcome your feedback, and hope you take the time to check out SoftwareDaily.com.

    Data Infrastructure Investing with Eric Anderson

    In a modern data platform, distributed streaming systems are used to read data coming off of an application in real time. There is a wide variety of streaming systems, including Kafka Streams, Apache Samza, Apache Flink, Spark Streaming, and more.
    When Eric Anderson joined the show back in 2016, he was working at Google on Google Cloud Dataflow, a managed service for handling streaming data. Today, he works as an investor at Scale Venture Partners. In his current job, he analyzes companies built around data infrastructure, developer tooling, and other enterprise engineering domains.
    Eric also hosts the podcast Contributor, which explores open source maintainers and the stories of their projects. His podcast has featured the creators of projects such as Envoy, Alluxio, and Chef. In today’s episode, Eric returns to the show to discuss data infrastructure, investing, and the evolving world of open source.

    Data Warehouse ETL with Matthew Scullion

    A data warehouse provides low latency access to large volumes of data. 
    A data warehouse is a crucial piece of infrastructure for a large company, because it can be used to answer complex questions involving a large number of data points. But a data warehouse usually cannot hold all of a company’s data at any given time. Users need to move a subset of the data into the data warehouse by reading large files from a data lake on disk and putting that data into the data warehouse.
    The process of moving data from one place into another is broken down into three sequential steps, often called “ETL” (extract, transform, load) or “ELT” (extract, load, transform). In ETL, the data is extracted from a source such as a data lake, transformed into a schema that is customized for the data warehouse application, and then loaded into the data warehouse. In ELT, the last two steps are reversed, because modern systems can often leave the necessary schema transformation until after the data has been loaded into the data warehouse.
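The difference in step order can be illustrated with a minimal sketch. It is not taken from the episode and does not represent Matillion's tooling: it uses Python's built-in sqlite3 module as a stand-in for a data warehouse, and the table names, column names, and sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical raw records, as they might be extracted from a data lake.
RAW_ROWS = [("2020-01-01", "19.99"), ("2020-01-02", "5.00")]


def run_etl(conn, raw_rows):
    """ETL: transform in application code, then load the finished rows."""
    transformed = [(day, round(float(amount) * 100)) for day, amount in raw_rows]
    conn.execute("CREATE TABLE sales_etl (day TEXT, amount_cents INTEGER)")
    conn.executemany("INSERT INTO sales_etl VALUES (?, ?)", transformed)


def run_elt(conn, raw_rows):
    """ELT: load the raw rows first, then transform inside the warehouse."""
    conn.execute("CREATE TABLE raw_sales (day TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_sales VALUES (?, ?)", raw_rows)
    conn.execute(
        "CREATE TABLE sales_elt AS "
        "SELECT day, CAST(ROUND(CAST(amount AS REAL) * 100) AS INTEGER) AS amount_cents "
        "FROM raw_sales"
    )


def fetch(conn, table):
    """Return all rows of a table in a stable order."""
    return conn.execute(f"SELECT day, amount_cents FROM {table} ORDER BY day").fetchall()


conn = sqlite3.connect(":memory:")  # sqlite3 stands in for the warehouse here
run_etl(conn, RAW_ROWS)
run_elt(conn, RAW_ROWS)
etl_rows = fetch(conn, "sales_etl")
elt_rows = fetch(conn, "sales_elt")
```

Both paths end with the same rows. In the ELT path, the untransformed data also remains available in the warehouse as raw_sales, so the transformation can be revised and re-run later without extracting the data again.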
    Matthew Scullion is the CEO of Matillion, a company that specializes in building tools for data transformations. Matthew joins the show to talk about the problem of data transformation, and how that problem has evolved over the nine years since he started Matillion.
    If you enjoy the show, you can find all of our past episodes about data infrastructure by going to SoftwareDaily.com and searching for the technologies or companies mentioned. And if there is a subject that you want to hear covered, feel free to leave a comment on the episode, or send us a tweet @software_daily.

    Presto with Justin Borgman

    A data platform contains all of the data that a company has accumulated over the years. Across a data platform, there is a multitude of data sources: databases, a data lake, data warehouses, a distributed queue like Kafka, and external data sources like Salesforce and Zendesk.
    A user of the data platform often has a question that requires multiple data sources to answer. How does this user join two data sources from a data lake? How does this user join data across a transactional database and a data lake? How does the user join data from two different data warehouse technologies? 
    Presto is an open source tool originally developed at Facebook. Presto allows a user to query a data platform with a SQL statement. That query gets parsed and executed across the data platform, reading from heterogeneous data sources. For some use cases, Presto is replacing Hive, a technology based on Hadoop MapReduce. For other use cases, Presto is solving a problem in a completely novel way.
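The idea of answering one SQL statement from more than one data source can be shown with a small analogy. The sketch below does not use Presto or its API: it relies on SQLite's ATTACH DATABASE, through Python's built-in sqlite3 module, to join tables that live in two separate databases. The schema name lake, the tables, and the sample rows are hypothetical.

```python
import sqlite3


def federated_join():
    """Join tables from two separate databases with one SQL statement."""
    conn = sqlite3.connect(":memory:")  # stands in for a transactional database
    # Attach a second, independent in-memory database to play the data lake.
    conn.execute("ATTACH DATABASE ':memory:' AS lake")

    conn.execute("CREATE TABLE main.users (id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE lake.events (user_id INTEGER, action TEXT)")
    conn.executemany(
        "INSERT INTO main.users VALUES (?, ?)",
        [(1, "ada"), (2, "grace")],
    )
    conn.executemany(
        "INSERT INTO lake.events VALUES (?, ?)",
        [(1, "login"), (1, "purchase"), (2, "login")],
    )

    # A single query that names both sources by their schema prefix.
    rows = conn.execute(
        "SELECT u.name, COUNT(*) AS event_count "
        "FROM main.users AS u "
        "JOIN lake.events AS e ON e.user_id = u.id "
        "GROUP BY u.name ORDER BY u.name"
    ).fetchall()
    conn.close()
    return rows


result = federated_join()
```

Here both sources are SQLite databases in a single process, which is far simpler than the distributed execution across different storage systems described above. The sketch only shows what the query looks like from the user's side.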
    Justin Borgman joins the show to discuss the motivation for Presto, the problems it solves, and the architecture of Presto. He also talks about the company he started, Starburst Data, which sells and supports technologies built around Presto.
    If you enjoy the show, you can find all of our past episodes about data infrastructure by going to SoftwareDaily.com and searching for the technologies or companies mentioned. And if there is a subject that you want to hear covered, feel free to leave a comment on the episode, or send us a tweet @software_daily.

    Software Media with Tim O’Reilly

    Software has changed the way the world functions. The rapid pace of change has made it difficult to know how to navigate the new world. Knowledge workers who want to keep advancing in their careers develop a strategy of continuous learning in order to adapt to these changes.
    O’Reilly Media has existed for almost 40 years, providing resources for the technical consumer. As O’Reilly has expanded its product line from books to conferences to online learning, the business has grown slowly but steadily. That business trajectory stands in contrast to many of the software companies that are financially structured to either grow rapidly or die.
    Today, O’Reilly has a large impact on the software ecosystem. Software professionals congregate at O’Reilly conferences. Enterprises pay O’Reilly to educate their employees. And O’Reilly continues to grow into new product lines, recently acquiring the interactive learning platform Katacoda, which can be used to learn about Kubernetes and other popular technologies.
    In a previous episode, we discussed Tim O’Reilly’s book “What’s The Future”. In today’s show, Tim returns to discuss his experience building O’Reilly, and how his business philosophy contrasts with much of the assumed wisdom of software company building.
