100 episodes

Technical interviews about software topics.

Software Engineering Daily

    • Tech News

    Data Warehouse ETL with Matthew Scullion

    A data warehouse provides low latency access to large volumes of data.  A data warehouse is a crucial piece of infrastructure for a large company, because it can be used to answer complex questions involving a large number of data points. But a data warehouse usually cannot hold all of a company’s data at any

    • 57 min
    Anyscale with Ion Stoica

    Machine learning applications are widely deployed across the software industry. Most of these applications use supervised learning, a process in which labeled data sets are used to find correlations between the labels and the trends in the underlying data. But supervised learning is only one application of machine learning. Another broad set of machine learning

    • 56 min
    Flink and BEAM Stream Processing with Maximilian Michels

    Distributed stream processing systems are used to read large volumes of data and perform operations across those data streams.  These stream processing systems often build off of the MapReduce algorithm for collecting and aggregating large volumes of data, but instead of processing a calculation over a single large batch of data, they process data on

    • 51 min
    Druid Analytics with Jad Naous

    Large companies generate large volumes of data. This data gets dumped into a data lake for long-term storage, then pulled into memory for processing and analysis. Once it is in memory, it is often read into a dashboard, which presents a human with a visualization of the data.  The end-user who is consuming this data

    • 56 min
    The Data Exchange with Ben Lorica

    Data infrastructure has been transformed over the last fifteen years.  The open source Hadoop project led to the creation of multiple companies based around commercializing the MapReduce algorithm and Hadoop distributed file system. Cheap cloud storage popularized the usage of data lakes. Cheap cloud servers led to wide experimentation for data tools. Apache Spark emerged

    • 1 hr 8 min
    Presto with Justin Borgman

    A data platform contains all of the data that a company has accumulated over the years. Across a data platform, there is a multitude of data sources: databases, a data lake, data warehouses, a distributed queue like Kafka, and external data sources like Salesforce and Zendesk. A user of the data platform often has a

    • 1 hr 16 min
