Main Program

Sunday February 17, 2019

[18:00-20:00]     Reception

Monday February 18, 2019

[08:15-09:30]     Plenary Session

HPCA Keynote: Towards Secure High-Performance Computer Architectures
Srini Devadas (MIT)

[09:35-10:25]     Lightning Talks

90-second lightning talks for 26 papers presented on Monday.

[10:25-10:55]     Coffee Break
[10:55-12:35]     Session 1: BEST PAPER NOMINEES

The Accelerator Wall: Limits of Chip Specialization
Adi Fuchs and David Wentzlaff (Princeton University)

Stretch: Balancing QoS and Throughput for Colocated Server Workloads on SMT Cores
Artemiy Margaritov (University of Edinburgh); Siddharth Gupta (EPFL); Rekai Gonzalez-Alberquilla (Arm Ltd, Cambridge, UK); Boris Grot (University of Edinburgh)

CIDR: A Cost-Effective In-line Data Reduction System for Terabit-per-Second Scale SSD Arrays
Mohammadamin Ajdari (POSTECH); Pyeongsu Park, Joonsung Kim, Dongup Kwon, and Jangwoo Kim (Seoul National University)

Composite-ISA Cores: Enabling Multi-ISA Heterogeneity Using a Single ISA
Ashish Venkat (UCSD/UVA); Harsha Basavaraj and Dean Tullsen (UCSD)

[12:35-14:00]     Lunch provided by the conference
[14:00-15:40]     Session 2A: ACCELERATORS FOR DNNs

HyPar: Towards Hybrid Parallelism for Deep Learning Accelerator Array
Linghao Song and Jiachen Mao (Duke University); Youwei Zhuo and Xuehai Qian (University of Southern California); Hai Li and Yiran Chen (Duke University)

E-RNN: Design Optimization for Efficient Recurrent Neural Networks in FPGAs
Zhe Li (Syracuse University); Caiwen Ding and Siyue Wang (Northeastern University); Wujie Wen (Florida International University); Youwei Zhuo (University of Southern California); Chang Liu (Carnegie Mellon University); Qinru Qiu (Syracuse University); Wenyao Xu (University at Buffalo (SUNY)); Xue Lin (Northeastern University); Xuehai Qian (University of Southern California); Yanzhi Wang (Northeastern University)

Bit Prudent In-Cache Acceleration of Deep Convolutional Neural Networks
Xiaowei Wang and Jiecao Yu (University of Michigan, Ann Arbor); Charles Augustine and Ravi Iyer (Intel); Reetuparna Das (University of Michigan, Ann Arbor)

Shortcut Mining: Exploiting Cross-layer Shortcut Reuse in DCNN Accelerators
Arash Azizimazreah and Lizhong Chen (Oregon State University)

[14:00-15:40]     Session 2B: POWER EFFICIENCY

Fine-tuning the Active Timing Margin (ATM) Control Loop for Maximizing Multicore Efficiency on an IBM POWER Server
Yazhou Zu and Daniel Richins (University of Texas at Austin); Charles Lefurgy (IBM Research); Vijay Janapa Reddi (University of Texas at Austin/Google)

μDPM: Dynamic Power Management for the Microsecond Era
Chih-Hsun Chou, Daniel Wong, and Laxmi N Bhuyan (University of California, Riverside)

Adaptive Voltage/Frequency Scaling and Core Allocation for Balanced Energy and Performance on Multicore CPUs
George Papadimitriou, Athanasios Chatzidimitriou, and Dimitris Gizopoulos (University of Athens)

Resilient Low Voltage Acceleration for High Energy Efficiency
Nandhini Chandramoorthy, Karthik Swaminathan, Martin Cochet, Schuyler Eldridge, Arun Paidimarri, Rajiv Joshi, Matthew Ziegler, Alper Buyuktosunoglu, and Pradip Bose (IBM Research)

[15:40-15:50]     Coffee Break
[15:50-17:30]     Session 3A: DATACENTER

Pliant: Leveraging Approximation to Improve Datacenter Resource Efficiency
Neeraj Kulkarni, Feng Qi, and Christina Delimitrou (Cornell University)

Kelp: QoS for Accelerators in Machine Learning Platforms
Haishan Zhu (Microsoft); David Lo, Liqun Cheng, Rama Govindaraju, and Parthasarathy Ranganathan (Google); Mattan Erez (The University of Texas at Austin)

Enhancing Server Efficiency in the Face of Killer Microseconds
Amirhossein Mirhosseini, Akshitha Sriraman, and Thomas F. Wenisch (University of Michigan)

Poly: Efficient Heterogeneous System and Application Management for Interactive Applications
Shuo Wang and Yun Liang (Peking University); Wei Zhang (Hong Kong University of Science and Technology)

[15:50-17:30]     Session 3B: EMERGING TECHNOLOGIES

The What's Next Intermittent Computing Architecture
Karthik Ganesan (University of Toronto); Joshua San Miguel (University of Wisconsin-Madison); Natalie Enright Jerger (University of Toronto)

eQASM: An Executable Quantum Instruction Set Architecture
Xiang Fu (QuTech, Delft University of Technology; Quantum Computer Architecture Lab, Delft University of Technology); Leon Riesebos (Quantum Computer Architecture Lab and QuTech, Delft University of Technology); M. A. Rol (QuTech, Delft University of Technology; Kavli Institute of Nanoscience, Delft University of Technology); Jeroen van Straten (Computer Engineering Lab, Delft University of Technology); Hans van Someren, Nader Khammassi, and Imran Ashraf (Quantum Computer Architecture Lab and QuTech, Delft University of Technology); Raymond Vermeulen (QuTech, Delft University of Technology; Kavli Institute of Nanoscience, Delft University of Technology); Vincent Newsum and Kelvin Loh (Netherlands Organisation for Applied Scientific Research (TNO); QuTech, Delft University of Technology); Jacob de Sterke (Topic Embedded Systems; QuTech, Delft University of Technology); Wouter Vlothuizen (Netherlands Organisation for Applied Scientific Research (TNO); QuTech, Delft University of Technology); Raymond Schouten (QuTech, Delft University of Technology; Kavli Institute of Nanoscience, Delft University of Technology); Carmina G. Almudever (Quantum Computer Architecture Lab and QuTech, Delft University of Technology); Leo DiCarlo (QuTech, Delft University of Technology; Kavli Institute of Nanoscience, Delft University of Technology); Koen Bertels (Quantum Computer Architecture Lab and QuTech, Delft University of Technology)

Reliability Evaluation of Mixed-Precision Architectures
Fernando Fernandes dos Santos, Daniel Oliveira, Caio Lunardi, Fabiano Pereira Libano, and Paolo Rech (UFRGS)

Architecting Waferscale Processors - A GPU Case Study
Saptadeep Pal (UCLA); Daniel Petrisko and Matthew Tomei (UIUC); Puneet Gupta and Subramanian S. Iyer (UCLA); Rakesh Kumar (UIUC)

[17:40-18:55]     Session 4A: SECURITY

Conditional Speculation: An Effective Approach to Safeguard Out-of-Order Execution Against Spectre Attacks
Peinan Li, Lutan Zhao, and Rui Hou (Institute of Information Engineering, CAS); Lixin Zhang (HXT Semiconductor Co., Ltd.); Dan Meng (Institute of Information Engineering, CAS)

FPGA-based High-Performance Parallel Architecture for Homomorphic Computing on Encrypted Data
Sujoy Sinha Roy, Furkan Turan, Frederik Vercauteren, and Ingrid Verbauwhede (COSIC, ESAT, KU Leuven); Kimmo Järvinen (University of Helsinki)

POWERT Channels: A Novel Class of Covert Communication Exploiting Power Management Vulnerabilities
S. Karen Khatamifard (University of Minnesota); Longfei Wang (University of South Florida); Amitabh Das; Selcuk Kose (University of Rochester); Ulya R. Karpuzcu (University of Minnesota)

[17:40-18:55]     Session 4B: INDUSTRY SESSION 1, MOBILE & LOW POWER

Killi: Runtime Fault Classification to Deploy Low Voltage Caches without MBIST
Shrikanth Ganapathy (AMD Research), John Kalamatianos (AMD Research), Brad Beckmann (AMD Research), Steven Raasch (AMD Research), Lukasz Szafaryn (Intel)

Gables: A Roofline Model for Mobile SoCs with Many Accelerators and Ceilings
Mark Hill (Google), Vijay Janapa Reddi (Google)

Machine Learning at Facebook: Understanding Inference at the Edge
Carole-Jean Wu (Facebook), David Brooks (Facebook), Kevin Chen (Facebook), Douglas Chen (Facebook), Sy Choudhury (Facebook), Marat Dukhan (Facebook), Kim Hazelwood (Facebook), Eldad Isaac (Facebook), Yangqing Jia (Facebook), Bill Jia (Facebook), Tommer Leyvand (Facebook), Hao Lu (Facebook), Yang Lu (Facebook), Lin Qiao (Facebook), Brandon Reagen (Facebook), Joe Spisak (Facebook), Fei Sun (Facebook), Andrew Tulloch (Facebook), Peter Vajda (Facebook), Xiaodong Wang (Facebook), Yanghan Wang (Facebook), Bram Wasti (Facebook), Yiming Wu (Facebook), Ran Xian (Facebook), Sungjoo Yoo (Facebook), Peizhao Zhang (Facebook)

[19:00-19:45]     Business Meeting

Tuesday February 19, 2019

[08:15-09:30]     Plenary Session

PPoPP Keynote: Karin Strauss (Microsoft Research)

[09:35-10:25]     Lightning Talks

90-second lightning talks for 32 papers presented on Tuesday and Wednesday.

[10:25-10:55]     Coffee Break

VIP: A Versatile Inference Processor
Skand Hurkat and José F. Martínez (Cornell University)

Darwin-WGA: A Co-processor Provides Increased Sensitivity in Whole Genome Alignments with High Speed
Yatish Turakhia, Sneha D. Goenka, and Gill Bejerano (Stanford University); William J. Dally (Stanford University, NVIDIA Research)

Analysis and Optimization of the Memory Hierarchy for Graph Processing Workloads
Abanti Basak, Xing Hu, Shuangchen Li, Sang Min Oh, and Xinfeng Xie (University of California, Santa Barbara); Li Zhao and Xiaowei Jiang (Alibaba Inc.); Yuan Xie (University of California, Santa Barbara)

FPGA Accelerated INDEL Realignment in the Cloud
Lisa Wu and David Bruns-Smith (University of California Berkeley); Frank A. Nothaft (Databricks); Qijing Huang, Sagar Karandikar, Johnny Le, Andrew Lin, Howard Mao, Brendan Sweeney, Krste Asanović, David A. Patterson, and Anthony D. Joseph (University of California Berkeley)

[10:55-12:35]     Session 5B: MEMORY HIERARCHY MANAGEMENT

Bingo Spatial Data Prefetcher
Mohammad Bakhshalipour (Sharif University of Technology); Mehran Shakerinava (Sharif University of Technology); Pejman Lotfi-Kamran (Institute for Research in Fundamental Sciences (IPM)); Hamid Sarbazi-Azad (Sharif University of Technology)

NoMap: Speeding-Up JavaScript Using Hardware Transactional Memory
Thomas Shull and Jiho Choi (University of Illinois Urbana-Champaign); Maria Garzaran (Intel); Josep Torrellas (University of Illinois Urbana-Champaign)

FUSE: Fusing STT-MRAM into GPUs to Alleviate Off-Chip Memory Access Overheads
Jie Zhang and Myoungsoo Jung (Yonsei University); Mahmut Kandemir (Penn State University)

Featherlight Reuse-distance Measurement
Qingsen Wang (College of William & Mary); Milind Chabbi (Uber); Xu Liu (College of William & Mary)

[12:35-14:00]     Lunch provided by the conference, Awards

Efficient Load Value Prediction using Multiple Predictors and Filters
Rami Sheikh (Qualcomm), Derek Hower (Qualcomm)

BRB: Mitigating Branch Predictor Side-Channels
Ilias Vougioukas (ARM Research/U. of Southampton), Nikos Nikoleris (ARM Research), Andreas Sandberg (ARM Research), Stephan Diestelhorst (ARM Research), Bashir M. Al-Hashimi (U. of Southampton), Geoff V. Merrett (U. of Southampton)

Elastic Instruction Fetching
Arthur Perais (Qualcomm), Rami Sheikh (Qualcomm), Luke Yen (Qualcomm), Michael McIlvaine (Qualcomm), Robert D. Clancy (Qualcomm)


Amoeba: An Autonomous Backup and Recovery SSD for Ransomware Attack Defense
Donghyun Min (Sogang University), Donggyu Park (Sogang University), Jinwoo Ahn (Sogang University), Ryan Walker (University of Texas at San Antonio), Junghee Lee (University of Texas at San Antonio), Sungyong Park (Sogang University), Youngjae Kim (Sogang University)

The Architectural Implications of Cloud Microservices
Yu Gan and Christina Delimitrou (Cornell University)

An Alternative Analytical Approach to Associative Processing
Soroosh Khoram, Yue Zha, and Jing Li (University of Wisconsin-Madison)

[15:15-15:45]     Coffee Break
[15:45-17:00]     Session 7A: GPUs/MODELING

Poise: Balancing Thread-Level Parallelism and Memory System Performance in GPUs using Machine Learning
Saumay Dublish, Vijay Nagarajan, and Nigel Topham (The University of Edinburgh)

A Hybrid Framework for Fast and Accurate GPU Performance Estimation through Source-Level Analysis and Trace-Based Simulation
Xiebing Wang (Technical University of Munich); Kai Huang (Sun Yat-sen University); Xuehai Qian (University of Southern California); Alois Knoll (Technical University of Munich)

Understanding the Future of Energy Efficiency in Multi-Module GPUs
Akhil Arunkumar (Arizona State University); Evgeny Bolotin and David Nellans (Nvidia); Carole-Jean Wu (Arizona State University)

[15:45-17:00]     Session 7B: MICROARCHITECTURE

R3-DLA (Reduce, Reuse, Recycle): A More Efficient Approach to Decoupled Look-Ahead Architectures
Sushant Kondguli and Michael Huang (University of Rochester)

Recycling Data Slack in Out-of-Order Cores
Gokul Subramanian Ravi and Mikko Lipasti (University of Wisconsin - Madison)

Freeway: Maximizing MLP for Slice-Out-of-Order Execution
Rakesh Kumar (Norwegian University of Science and Technology (NTNU), Norway); Mehdi Alipour and David Black-Schaffer (Uppsala University, Sweden)

[17:10-18:30]     Panel: How Do We Make HPCA Serve the Community Better?

Josep Torrellas (University of Illinois at Urbana-Champaign)
Reetuparna Das (University of Michigan)
Lisa Hsu (Microsoft)
Ulya Karpuzcu (University of Minnesota)
John Kim (KAIST)

[19:00]     Excursion and Banquet Dinner

Wednesday February 20, 2019

[08:15-09:30]     Plenary Session

CGO Keynote: Rethinking Compilation in a Heterogeneous World
Michael O'Boyle (University of Edinburgh)

[09:35-10:50]     Session 8A: MEMORY

Enabling Transparent Memory-Compression for Commodity Memory Systems
Vinson Young, Sanjay Kariyappa, and Moinuddin Qureshi (Georgia Institute of Technology)

D-RaNGe: Using Commodity DRAM Devices to Generate True Random Numbers with Low Latency and High Throughput
Jeremie S Kim (Carnegie Mellon University; ETH Zurich); Minesh Patel and Hasan Hassan (ETH Zurich); Lois Orosa (ETH Zurich; Universidade Estadual de Campinas); Onur Mutlu (ETH Zurich; Carnegie Mellon University)

PageSeer: Using Page Walks to Trigger Page Swaps in Hybrid Memory Systems
Apostolos Kokolis, Dimitrios Skarlatos, and Josep Torrellas (University of Illinois, Urbana-Champaign)

[09:35-10:50]     Session 8B: ACCELERATORS FOR GRAPHICS/VR

PIM-VR: Erasing Motion Anomalies In Highly-Interactive Virtual Reality World With Customized Memory Cube
Chenhao Xie and Xingyao Zhang (University of Houston); Ang Li (Pacific Northwest National Laboratory); Xin Fu (University of Houston); Shuaiwen Leon Song (Pacific Northwest National Laboratory)

Rendering Elimination: Early Discard of Redundant Tiles in the Graphics Pipeline
Martí Anglada, Joan-Manuel Parcerisa, and Enrique de Lucas (Universitat Politècnica de Catalunya); Juan Luis Aragón (Universidad de Murcia); Pedro Marcuello (unaffiliated); Antonio González (Universitat Politècnica de Catalunya)

Early Visibility Resolution for Removing Ineffectual Computations in the Graphics Pipeline
Martí Anglada, Enrique de Lucas, and Joan-Manuel Parcerisa (Universitat Politècnica de Catalunya); Juan Luis Aragón (Universidad de Murcia); Antonio González (Universitat Politècnica de Catalunya)

[10:50-11:20]     Coffee Break
[11:20-12:35]     Session 9A: EMERGING MEMORY TECHNOLOGIES

String Figure: A Scalable and Elastic Memory Network Architecture
Matheus Ogleari (University of California, Santa Cruz); Ye Yu (University of Kentucky); Chen Qian and Ethan Miller (University of California, Santa Cruz); Jishen Zhao (University of California, San Diego)

NAND-Net: Minimizing Computational Complexity of In-Memory Processing for Binary Neural Networks
Hyeonuk Kim, Jaehyeong Sim, Yeongjae Choi, and Lee-Sup Kim (KAIST)

Active-Routing: Compute on the Way for Near-Data Processing
Jiayi Huang (Texas A&M University); Ramprakash Reddy Puli (NVIDIA); Pritam Majumder and SungKeun Kim (Texas A&M University); Rahul Boyapati (Intel Corporation); Ki Hwan Yum and Eun Jung Kim (Texas A&M University)

[11:20-12:35]     Session 9B: INDUSTRY SESSION 3, SERVERS

Understanding the Impact of Socket Density in Density Optimized Servers
Manish Arora (AMD Research/UC San Diego), Matt Skach (University of Michigan), Wei Huang (AMD Research), Xudong An (AMD Research), Jason Mars (U. of Michigan), Lingjia Tang (U. of Michigan), Dean M. Tullsen (UC San Diego)

Challenges in Power Management Adoption for Public Clouds
Yang Li (IBM/Carnegie Mellon U.), Charles Lefurgy (IBM), Karthick Rajamani (IBM), Malcolm Ware (IBM), Guillermo J. Silva (IBM), Daniel D. Heimsoth (IBM), Saugata Ghose (Carnegie Mellon U.), Onur Mutlu (ETH Zurich/Carnegie Mellon U.)

Power Aware Heterogeneous Node Assembly
Bilge Acun (IBM TJ Watson), Alper Buyuktosunoglu (IBM TJ Watson), Eun Kyung Lee (IBM TJ Watson), Yoonho Park (IBM TJ Watson)

[12:35]     Best Paper Award, Closing