Proceedings of the 24th International Conference on Machine Learning, 2007
DOI: 10.1145/1273496.1273501

Scalable training of L1-regularized log-linear models

Cited by 378 publications (367 citation statements)
References 9 publications
“…We present this strategy as a two phase method, composed of an active set identification phase using an infinitesimal line search, and a subspace minimization phase that utilizes the Hessian subsampling technique. We present numerical results for a sparse version of the speech recognition problem, and we have shown that our algorithm is able to outperform the OWL algorithm presented in [2], both in terms of sparsity and objective value.…”
Section: Final Remarks (citation type: mentioning)
Confidence: 95%
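
The excerpt above compares a two-phase active-set method against the OWL algorithm on the L1-regularized log-linear objective of the cited paper. Purely as orientation, here is a minimal sketch, assuming a logistic-loss objective with an L1 penalty, of how such a problem can be minimized with plain proximal gradient descent (ISTA) and soft-thresholding. It is not the OWL-QN method nor the two-phase method described in the excerpt, and the data, step size, and regularization weight lam are illustrative assumptions.

# Minimal sketch: L1-regularized logistic regression via proximal gradient (ISTA).
# Not the method of the cited paper or the excerpt; data and hyperparameters are illustrative.
import numpy as np

def soft_threshold(w, t):
    """Proximal operator of t * ||w||_1."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def l1_logistic_prox_grad(X, y, lam=0.1, step=0.1, iters=500):
    """Minimize (1/n) * sum_i log(1 + exp(-y_i x_i^T w)) + lam * ||w||_1."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        margins = y * (X @ w)
        # Gradient of the smooth logistic-loss term only.
        grad = -(X.T @ (y * (1.0 / (1.0 + np.exp(margins))))) / n
        # Gradient step on the smooth part, then soft-threshold for the L1 part.
        w = soft_threshold(w - step * grad, step * lam)
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 50))
    true_w = np.zeros(50)
    true_w[:5] = 1.0  # sparse ground truth
    y = np.sign(X @ true_w + 0.1 * rng.standard_normal(200))
    w = l1_logistic_prox_grad(X, y)
    print("nonzero weights:", np.count_nonzero(np.abs(w) > 1e-8))

The soft-thresholding step is what drives many weights to exactly zero, which is the kind of sparsity the excerpt uses to compare its method against the OWL baseline.
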
“…These methods must, however, perform a vast number of iterations before an appreciable improvement in the objective is obtained, and due to the sequential nature of these iterations, it can be difficult to parallelize them; see [23,1,10] and the references therein. On the other hand, batch (or mini-batch) algorithms can easily exploit parallelism in the function and gradient evaluation, and are able to yield high accuracy in the solution of the optimization problem [30,2,31], if so desired. Motivated by the potential of function/gradient parallelism, the sole focus of this paper is on batch and mini-batch methods.…”
Section: Preliminaries (citation type: mentioning)
Confidence: 99%
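
The excerpt above points out that batch and mini-batch methods can parallelize the function and gradient evaluation. A minimal sketch of that idea, assuming a logistic-loss gradient and a simple thread pool; the sharding scheme and worker count are illustrative, not taken from the cited papers.

# Minimal sketch: a batch gradient is a sum of independent per-shard terms,
# so shards can be evaluated concurrently and the partial results summed.
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def shard_grad(args):
    """Gradient of the logistic loss on one shard of the batch (unnormalized)."""
    X_s, y_s, w = args
    margins = y_s * (X_s @ w)
    return -(X_s.T @ (y_s * (1.0 / (1.0 + np.exp(margins)))))

def batch_gradient(X, y, w, n_shards=4):
    """Compute per-shard gradients in parallel, sum them, and normalize by n."""
    shards = zip(np.array_split(X, n_shards), np.array_split(y, n_shards))
    with ThreadPoolExecutor(max_workers=n_shards) as pool:
        partials = pool.map(shard_grad, [(X_s, y_s, w) for X_s, y_s in shards])
    return sum(partials) / X.shape[0]

Because the per-shard terms are independent, the same decomposition applies whether the shards are threads on one machine or workers in a distributed setting.
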