A Convergent Architecture for Big Data, Machine Learning and Real-Time Computing, 2/28/2017, Computer Architecture Seminar Series

In the quest for more intelligent consumer devices, machine learning lets appliances understand what is happening around the computer and what is asked of it, while big data provides the history and context of the environment. But devices must also react to be useful, and for many applications the reaction needs to happen on a human timescale to be valuable. For example, an advertisement beacon must beam a discount coupon to a shopper's cellphone within a few hundred milliseconds, or the shopper will walk past. Today, big data analytics and machine learning are typically run in large shared data centers accessed remotely through the public internet, because this is the most cost-effective and energy-efficient way to do large-scale computing. But integrating real-time computing with big data and machine learning may make that approach impractical: exchanging messages over the internet can itself consume a substantial fraction of a second, leaving almost no time for computation if the application must guarantee a response time of a few hundred milliseconds.

In this talk I propose a FLASH-based parallel computer built from large numbers of low-power processor chips with vector units. Such a system is much smaller, cheaper, and lower-power than one of equal memory capacity and instruction throughput built entirely from DRAM, x86 processors, and GPUs. It is small enough to install locally in retail and office locations, or on mobile platforms such as trucks and ships, and inexpensive enough that it need not be a shared computing resource. Yet because it relies primarily on FLASH memory, which is extremely dense, its storage capacity can be as large as or larger than that of any DRAM-based in-memory big data analytics server.
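
To make the latency-budget argument concrete, here is a minimal arithmetic sketch. The round-trip times and the exact budget below are illustrative assumptions, not figures from the talk, which specifies only a target of "a few hundred milliseconds".

```python
# Illustrative latency-budget arithmetic for the real-time argument above.
# All numbers are assumptions for the sake of the sketch: the budget stands
# in for the "few hundred milliseconds" target, and the round-trip times
# (RTTs) are plausible placeholders, not measurements.

RESPONSE_BUDGET_MS = 300.0   # assumed end-to-end response target
INTERNET_RTT_MS = 200.0      # assumed round trip to a remote shared data center
LOCAL_RTT_MS = 2.0           # assumed round trip to an on-premises appliance

def compute_budget(total_ms: float, messaging_ms: float) -> float:
    """Milliseconds left for actual computation after message exchange."""
    return max(0.0, total_ms - messaging_ms)

if __name__ == "__main__":
    remote = compute_budget(RESPONSE_BUDGET_MS, INTERNET_RTT_MS)
    local = compute_budget(RESPONSE_BUDGET_MS, LOCAL_RTT_MS)
    print(f"remote data center: {remote:.0f} ms left for compute")  # 100 ms
    print(f"local appliance:    {local:.0f} ms left for compute")   # 298 ms
```

Under these assumed numbers, the remote path leaves only about a third of the budget for actual computation, which is the motivation for placing a dense, inexpensive machine on-site rather than relying on a shared data center.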
