In the Directed Steiner Tree (DST) problem we are given an $n$-vertex directed edge-weighted graph, a root $r$, and a collection of $k$ terminal nodes. Our goal is to find a minimum-cost arborescence that contains a directed path from $r$ to every terminal. We present an $O(\log^2 k/\log\log k)$-approximation algorithm for DST that runs in quasi-polynomial time, i.e., in time $n^{\mathrm{poly}\log(k)}$. By assuming the Projection Game Conjecture and $\mathrm{NP} \not\subseteq \bigcap_{0<\epsilon<1} \mathrm{ZPTIME}(2^{n^{\epsilon}})$, and adjusting the parameters in the hardness result of Halperin and Krauthgamer [STOC'03], we show a matching lower bound of $\Omega(\log^2 k/\log\log k)$ for the class of quasi-polynomial-time algorithms, meaning that our approximation ratio is asymptotically the best possible. This is the first improvement on the DST problem since the classical quasi-polynomial-time $O(\log^3 k)$-approximation algorithm by Charikar et al. [SODA'98 & J. Algorithms'99]. (That paper erroneously claims an $O(\log^2 k)$ approximation due to a mistake in prior work.)

Our approach is based on two main ingredients. First, we derive an approximation-preserving reduction to the Label-Consistent Subtree (LCST) problem. Here we are given a rooted tree with node labels, and a feasible solution is a subtree satisfying proper constraints on the labels. The LCST instance has quasi-polynomial size and logarithmic height. We remark that, in contrast, Zelikovsky's height-reduction theorem [Algorithmica'97], used in all prior work on DST, achieves a reduction to a tree instance of the related Group Steiner Tree (GST) problem of similar height, but loses a logarithmic factor in the approximation ratio.

Our second ingredient is an LP-rounding algorithm to approximately solve LCST instances, which is inspired by the framework developed by [Rothvoß, Preprint'11; Friggstad et al., IPCO'14]. We consider a Sherali-Adams lifting of a proper LP relaxation of LCST.
Our rounding algorithm proceeds level by level from the root to the leaves, rounding and conditioning each time on a proper subset of label variables. The limited height of the tree and the small number of labels on root-to-leaf paths guarantee that a small enough (namely, polylogarithmic) number of Sherali-Adams lifting levels is sufficient to condition up to the leaves.

We believe that our basic strategy of combining label-based reductions with a round-and-condition type of LP-rounding over hierarchies might find applications to other related problems.

In the Directed Steiner Tree (DST) problem, we are given an $n$-vertex digraph $G = (V, E)$ with cost $c_e$ on each edge $e \in E$, a root vertex $r \in V$, and a set of $k$ terminals $K \subseteq V \setminus \{r\}$. The goal is to find a minimum-cost out-arborescence $H \subseteq G$ rooted at $r$ that contains an $r \to t$ directed path for every terminal $t \in K$. W.l.o.g. we assume that edge costs satisfy the triangle inequality.

The DST problem is a fundamental problem in the area of network design that is known for its bizarre behaviors. While constant-approximation algorithms have been known for its undirected counterpart (see, e.g., [3,29,31]), the best known polynomial-ti...