Supply-voltage scaling has stagnated in recent technology nodes, leading to so-called dark silicon. To increase overall chip multiprocessor (CMP) performance, it is necessary to improve the energy efficiency of individual tasks so that more tasks can be executed simultaneously within thermal limits. In this article, the authors investigate the limits of voltage scaling combined with task parallelization, which maintains task completion latency while reducing energy consumption. Additionally, they examine the improvements in energy efficiency and parallelism obtained when serial portions of code are overcome by quickly boosting a core’s operating voltage. When parallelization overheads are accounted for, minimum task energy is obtained at near-threshold supply voltages across six commercial technology nodes, providing a 4× improvement in overall CMP performance. Boosting is most effective when the task is modestly, rather than highly, parallelizable.
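The trade-off between voltage scaling, parallelization, and boosting can be illustrated with a toy model. The sketch below is not the authors' methodology: it assumes an alpha-power-law frequency model, an Amdahl-style split of the task into serial and parallel fractions, and dynamic energy proportional to the square of the supply voltage. The constants `V_NOM`, `V_TH`, and `ALPHA` and the function names are hypothetical, and leakage and parallelization overheads are ignored, so the model does not reproduce the near-threshold energy minimum reported in the article.

```python
import math

# All constants are illustrative assumptions, not values from the article.
V_NOM = 1.0   # nominal supply voltage (V)
V_TH = 0.3    # threshold voltage (V)
ALPHA = 1.5   # alpha-power-law exponent


def frequency(v):
    """Relative clock frequency under the alpha-power law (valid for v > V_TH)."""
    return (v - V_TH) ** ALPHA / v


def cores_needed(v, p, boost=False):
    """Smallest core count that matches single-core latency at V_NOM.

    v     : scaled supply voltage, V_TH < v <= V_NOM
    p     : parallel fraction of the task, 0 < p <= 1
    boost : if True, the serial fraction runs at V_NOM instead of v
    Returns None when the latency target cannot be met with any core count.
    """
    if not (0 < p <= 1) or not (V_TH < v <= V_NOM):
        raise ValueError("expected 0 < p <= 1 and V_TH < v <= V_NOM")
    slowdown = frequency(V_NOM) / frequency(v)   # per-core slowdown at v
    serial_time = (1 - p) * (1.0 if boost else slowdown)
    budget = 1.0 - serial_time                   # latency left for parallel work
    if budget <= 0:
        return None
    # Small epsilon guards against floating-point rounding just above an integer.
    return math.ceil(p * slowdown / budget - 1e-9)


def dynamic_energy(v, p, boost=False):
    """Task dynamic energy relative to running everything at V_NOM."""
    v_serial = V_NOM if boost else v
    return ((1 - p) * v_serial ** 2 + p * v ** 2) / V_NOM ** 2


if __name__ == "__main__":
    for p in (0.5, 0.99):
        for boost in (False, True):
            n = cores_needed(0.6, p, boost)
            e = dynamic_energy(0.6, p, boost)
            print(f"p={p}, boost={boost}: cores={n}, relative energy={e:.2f}")
```

Under these assumptions, a task with a parallel fraction of 0.5 cannot meet its latency target at 0.6 V with any number of cores unless the serial portion is boosted, whereas a task with a parallel fraction of 0.99 needs the same core count with or without boosting. This is consistent with the qualitative claim that boosting helps most for modestly parallelizable tasks.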