Jeff M. Phillips scite author profile

On Measuring and Mitigating Biased Inferences of Word Embeddings

Word embeddings carry stereotypical connotations from the text they are trained on, which can lead to invalid inferences in downstream models that rely on them. We use this observation to design a mechanism for measuring stereotypes using the task of natural language inference. We demonstrate a reduction in invalid inferences via bias mitigation strategies on static word embeddings (GloVe). Further, we show that for gender bias, these techniques extend to contextualized embeddings when applied selectively only to the static components of contextualized embeddings (ELMo, BERT).


Frequent Directions: Simple and Deterministic Matrix Sketching

Ghashami

Liberty

Phillips

et al. 2016

SIAM J. Comput.


Mergeable summaries

Agarwal

Cormode

Huang

et al. 2012


We study the mergeability of data summaries. Informally speaking, mergeability requires that, given two summaries on two data sets, there is a way to merge the two summaries into a single summary on the union of the two data sets, while preserving the error and size guarantees. This property means that the summaries can be merged in a way like other algebraic operators such as sum and max, which is especially useful for computing summaries on massive distributed data. Several data summaries are trivially mergeable by construction, most notably all the sketches that are linear functions of the data sets. But some other fundamental ones, like those for heavy hitters and quantiles, are not (known to be) mergeable. In this paper, we demonstrate that these summaries are indeed mergeable or can be made mergeable after appropriate modifications. Specifically, we show that for ε-approximate heavy hitters, there is a deterministic mergeable summary of size $O(1/\varepsilon)$; for ε-approximate quantiles, there is a deterministic summary of size $O(\frac{1}{\varepsilon} \log(\varepsilon n))$ that has a restricted form of mergeability, and a randomized one of size $O(\frac{1}{\varepsilon} \log^{3/2} \frac{1}{\varepsilon})$ with full mergeability. We also extend our results to geometric summaries such as ε-approximations and ε-kernels. We also achieve two results of independent interest: (1) we provide the best known randomized streaming bound for ε-approximate quantiles that depends only on ε, of size $O(\frac{1}{\varepsilon} \log^{3/2} \frac{1}{\varepsilon})$, and (2) we demonstrate that the MG and the SpaceSaving summaries for heavy hitters are isomorphic.
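To make the idea of merging heavy-hitter summaries concrete, here is a minimal Python sketch of a Misra-Gries (MG) summary with a merge step. It is an illustration under assumptions, not the paper's code: the function names `mg_update`, `mg_build`, and `mg_merge` and the parameter `k` (maximum number of counters) are hypothetical, and the merge rule shown (add counters, then subtract the (k+1)-th largest count and drop non-positive counters) is one common way to describe it.

```python
from collections import Counter


def mg_update(summary, item, k):
    """Process one stream item in a Misra-Gries summary with at most k counters."""
    if item in summary or len(summary) < k:
        summary[item] = summary.get(item, 0) + 1
    else:
        # No free counter: decrement every counter and drop those that hit zero.
        for key in list(summary):
            summary[key] -= 1
            if summary[key] == 0:
                del summary[key]
    return summary


def mg_build(stream, k):
    """Build a summary from an iterable of items."""
    summary = {}
    for item in stream:
        mg_update(summary, item, k)
    return summary


def mg_merge(a, b, k):
    """Merge two summaries into one that again holds at most k counters."""
    combined = Counter(a)
    combined.update(b)  # add counts item by item
    if len(combined) <= k:
        return dict(combined)
    # Subtract the (k+1)-th largest count; only strictly larger counters survive.
    cut = sorted(combined.values(), reverse=True)[k]
    return {key: count - cut for key, count in combined.items() if count > cut}
```

Two machines could each call `mg_build` on their local data and ship only the small dictionaries to a coordinator, which calls `mg_merge`. Each surviving counter is an underestimate of the item's true frequency in the combined data.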


Outlier Robust ICP for Minimizing Fractional RMSD

2007


We describe a variation of the iterative closest point (ICP) algorithm for aligning two point sets under a set of transformations. Our algorithm is superior to previous algorithms because (1) in determining the optimal alignment, it identifies and discards likely outliers in a statistically robust manner, and (2) it is guaranteed to converge to a locally optimal solution. To this end, we formalize a new distance measure, fractional root mean squared distance (FRMSD), which incorporates the fraction of inliers into the distance function. We lay out a specific implementation, but our framework can easily incorporate most techniques and heuristics from modern registration algorithms. We experimentally validate our algorithm against previous techniques on 2- and 3-dimensional data exposed to a variety of outlier types.

$$\mathrm{RMSD}(D, M, \mu) = \sqrt{\frac{1}{|D|} \sum_{p \in D} \|p - \mu(p)\|^2}$$

When convenient we sometimes write RMSD(D, M), letting µ match every point in D to the closest point in M.

Problem 2.1 [minimizing RMSD]. Given a model point set M and an input data point set D where $D, M \subset \mathbb{R}^d$, compute the transformation $T \in \mathcal{T}$ to minimize RMSD(T(D), M):

$$\min_{T \in \mathcal{T}} \sqrt{\frac{1}{|D|} \sum_{p \in D} \|T(p) - \mu(p)\|^2}.$$
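As a rough illustration of the RMSD quantity mentioned here, the following Python sketch computes it with µ matching each data point to its closest model point. It assumes RMSD is the square root of the mean squared matching distance (as the name "root mean squared distance" suggests); the helper names `closest_point` and `rmsd` are hypothetical, and the brute-force nearest-neighbor search is only for clarity. It does not implement the fractional variant (FRMSD) or the ICP iteration itself.

```python
import math


def closest_point(p, model):
    """Return the model point nearest to p (brute force)."""
    return min(model, key=lambda m: math.dist(p, m))


def rmsd(data, model):
    """Root mean squared distance from each data point to its closest model point."""
    total = sum(math.dist(p, closest_point(p, model)) ** 2 for p in data)
    return math.sqrt(total / len(data))
```

An ICP-style loop would alternate between this matching step and re-estimating the transformation T that reduces the resulting value.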


Relative Errors for Deterministic Low-Rank Matrix Approximations

Ghashami

Phillips

2013


We consider processing an n × d matrix A in a stream with row-wise updates according to a recent algorithm called Frequent Directions (Liberty, KDD 2013). This algorithm maintains an ℓ × d matrix Q deterministically, processing each row in $O(d\ell^2)$ time; the processing time can be decreased to $O(d\ell)$ with a slight modification in the algorithm and a constant increase in space. Then for any unit vector x, the matrix Q satisfies $0 \le \|Ax\|^2 - \|Qx\|^2 \le \|A\|_F^2/\ell$. We show that if one sets $\ell = \lceil k + k/\varepsilon \rceil$ and returns $Q_k$, a k × d matrix that is simply the top k rows of Q, then we achieve the following properties:


Radio tomographic imaging and tracking of stationary and moving people via kernel distance

et al. 2013


Network radio frequency (RF) environment sensing (NRES) systems pinpoint and track people in buildings using changes in the signal strength measurements made by a wireless sensor network. It has been shown that such systems can locate people who do not participate in the system by wearing any radio device, even through walls, because of the changes that moving people cause to the static wireless sensor network. However, many such systems cannot locate stationary people. We present and evaluate a system which can locate stationary or moving people, without calibration, by using kernel distance to quantify the difference between two histograms of signal strength measurements. From five experiments, we show that our kernel distance-based radio tomographic localization system performs better than the state-of-the-art NRES systems in different non-line-of-sight environments.


Comparing distributions and shapes using the kernel distance

Joshi

Kommaraju

Phillips

et al. 2011


Starting with a similarity function between objects, it is possible to define a distance metric on pairs of objects, and more generally on probability distributions over them. These distance metrics have a deep basis in functional analysis, measure theory and geometric measure theory, and have a rich structure that includes an isometric embedding into a (possibly infinite dimensional) Hilbert space. They have recently been applied to numerous problems in machine learning and shape analysis. In this paper, we provide the first algorithmic analysis of these distance metrics. Our main contributions are as follows: (i) We present fast approximation algorithms for computing the kernel distance between two point sets P and Q that run in near-linear time in the size of P ∪ Q (note that an explicit calculation would take quadratic time). (ii) We present polynomial-time algorithms for approximately minimizing the kernel distance under rigid transformation; they run in time O(n + poly(1/ε, log n)). (iii) We provide several general techniques for reducing complex objects to convenient sparse representations (specifically to point sets or sets of point sets) which approximately preserve the kernel distance. In particular, this allows us to reduce problems of computing the kernel distance between various types of objects such as curves, surfaces, and distributions to computing the kernel distance between point sets. These take advantage of the reproducing kernel Hilbert space and a new relation linking binary range spaces to continuous range spaces with bounded fat-shattering dimension.
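The explicit quadratic-time calculation mentioned in the abstract can be sketched in a few lines of Python. This is a hedged illustration, not the paper's algorithm: it assumes the commonly used form D_K(P, Q)² = κ(P, P) + κ(Q, Q) − 2κ(P, Q), where κ sums a similarity function over all pairs, and it assumes a Gaussian similarity with a bandwidth parameter `sigma`. The function names are hypothetical.

```python
import math


def gaussian_kernel(p, q, sigma=1.0):
    """Similarity between two points; equals 1 when p == q and decays with distance."""
    return math.exp(-math.dist(p, q) ** 2 / sigma ** 2)


def cross_similarity(P, Q, sigma=1.0):
    """Sum of kernel values over all pairs (p, q): |P| * |Q| kernel evaluations."""
    return sum(gaussian_kernel(p, q, sigma) for p in P for q in Q)


def kernel_distance(P, Q, sigma=1.0):
    """Exact kernel distance between two point sets, in quadratic time."""
    squared = (
        cross_similarity(P, P, sigma)
        + cross_similarity(Q, Q, sigma)
        - 2.0 * cross_similarity(P, Q, sigma)
    )
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(squared, 0.0))
```

The nested loops in `cross_similarity` are what make the direct computation quadratic; the near-linear algorithms described in the abstract approximate this value instead of evaluating every pair.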


Distributed trajectory similarity search

Xie

Phillips

2017

Proc. VLDB Endow.


Mobile and sensing devices have already become ubiquitous. They have made tracking moving objects an easy task. As a result, mobile applications like Uber and many IoT projects have generated massive amounts of trajectory data that can no longer be processed by a single machine efficiently. Among the typical query operations over trajectories, similarity search is a common yet expensive operator in querying trajectory data. It is useful for applications in different domains such as traffic and transportation optimizations, weather forecast and modeling, and sports analytics. It is also a fundamental operator for many important mining operations such as clustering and classification of trajectories. In this paper, we propose a distributed query framework to process trajectory similarity search over a large set of trajectories. We have implemented the proposed framework in Spark, a popular distributed data processing engine, by carefully considering different design choices. Our query framework supports both the Hausdorff distance and the Fréchet distance. Extensive experiments have demonstrated the excellent scalability and query efficiency achieved by our design, compared to other methods and design alternatives.
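For readers unfamiliar with the two distances named in the abstract, here is a small single-machine Python sketch. It is an assumption-laden illustration rather than the paper's distributed framework: trajectories are treated as lists of sampled points, the Fréchet distance is the discrete variant computed by dynamic programming, and the function names `hausdorff` and `discrete_frechet` are hypothetical.

```python
import math


def hausdorff(A, B):
    """Hausdorff distance between two point sequences (ignores point order)."""
    def directed(X, Y):
        # Farthest that any point of X is from its nearest point in Y.
        return max(min(math.dist(x, y) for y in Y) for x in X)
    return max(directed(A, B), directed(B, A))


def discrete_frechet(A, B):
    """Discrete Fréchet distance via dynamic programming (respects point order)."""
    n, m = len(A), len(B)
    dp = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(A[i], B[j])
            if i == 0 and j == 0:
                dp[i][j] = d
            elif i == 0:
                dp[i][j] = max(dp[0][j - 1], d)
            elif j == 0:
                dp[i][j] = max(dp[i - 1][0], d)
            else:
                # Best of the three ways to reach (i, j), then pay the current gap.
                best_prev = min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])
                dp[i][j] = max(best_prev, d)
    return dp[n - 1][m - 1]
```

The two measures differ on a trajectory and its reversal: the Hausdorff distance is zero because the point sets coincide, while the discrete Fréchet distance is not, because it accounts for the direction of travel. Both routines take time proportional to the product of the trajectory lengths, which is part of why similarity search over many trajectories is expensive.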



Jeff M. Phillips

On Measuring and Mitigating Biased Inferences of Word Embeddings
