Fuat Basık scite author profile

We are witnessing an enormous growth in the volume of data generated by various online services. An important portion of this data contains geographic references, since many of these services are location-enhanced and thus produce spatio-temporal records of their usage. We postulate that the spatio-temporal usage records belonging to the same real-world entity can be matched across records from different location-enhanced services. Linking spatio-temporal records enables data analysts and service providers to obtain information that they cannot derive by analyzing only one set of usage records. In this paper, we develop a new linkage model that can be used to match entities from two sets of spatio-temporal usage records belonging to two different location-enhanced services. This linkage model is based on the concept of k-l diversity -that we developed to capture both spatial and temporal aspects of the linkage. To realize this linkage model in practice, we develop a scalable linking algorithm called ST-Link, which makes use of effective spatial and temporal filtering mechanisms that significantly reduce the search space for matching users. Furthermore, ST-Link utilizes sequential scan procedures to avoid random disk access and thus scales to large datasets. We evaluated our work with respect to accuracy and performance using several datasets. Experiments show that ST-Link is effective in practice for performing spatio-temporal linkage and can scale to large datasets.

show abstract

Fair Task Allocation in Crowdsourced Delivery

Basık

Gedik

Ferhatosmanoğlu

et al. 2021

IEEE Trans. Serv. Comput.

View full text Add to dashboard Cite

Faster and more cost-efficient, crowdsourced delivery is needed to meet the growing customer demands of many industries, including online shopping, on-demand local delivery, and on-demand transportation. The power of crowdsourced delivery stems from the large number of workers potentially available to provide services and reduce costs. It has been shown in social psychology literature that fairness is key to ensuring high worker participation. However, existing assignment solutions fall short on modeling the dynamic fairness metric. In this work, we introduce a new assignment strategy for crowdsourced delivery tasks. This strategy takes fairness towards workers into consideration, while maximizing the task allocation ratio. Since redundant assignments are not possible in delivery tasks, we first introduce a 2-phase allocation model that increases the reliability of a worker to complete a given task. To realize the effectiveness of our model in practice, we present both offline and online versions of our proposed algorithm called F-Aware. Given a task-to-worker bipartite graph, F-Aware assigns each task to a worker that minimizes unfairness, while allocating tasks to use worker capacities as much as possible. We present an evaluation of our algorithms with respect to running time, task allocation ratio (TAR), as well as unfairness and assignment ratio. Experiments show that F-Aware runs around 10 7 × faster than the TAR-optimal solution and allocates 96.9% of the tasks that can be allocated by it. Moreover, it is shown that, F-Aware is able to provide a much fair distribution of tasks to workers than the best competitor algorithm.

show abstract

S $$^3$$ 3 -TM: scalable streaming short text matching

et al. 2015

View full text Add to dashboard Cite

Micro-blogging services have become major venues for information creation, as well as channels of information dissemination. Accordingly, monitoring them for relevant information is a critical capability. This is typically achieved by registering content-based subscriptions with the micro-blogging service. Such subscriptions are long-running queries that are evaluated against the stream of posts. Given the popularity and scale of micro-blogging services like Twitter and Weibo, building a scalable infrastructure to evaluate these subscriptions is a challenge. To address this challenge, we present the S 3 -TM system for streaming short text matching. S 3 -TM is organized as a stream processing application, in the form of a data parallel flow graph designed to be run on a data center environment. It takes advantage of the structure of the publications (posts) and subscriptions to perform the matching in a scalable manner, without broadcasting publications or subscriptions to all of the matcher instances. The basic design of S 3 -TM uses a scoped multicast for publications and scoped anycast for subscriptions. To further improve throughput, we introduce publication routing algorithms that aim at minimizing the scope of the multicasts. First set of algorithms we develop are based on partitioning the word co-occurrence frequency graph, with the aim of routing posts that include commonly co-occurring words to a small set of matchers. While effective, these algorithms fell short in balancing the load. To address this, we develop the SALB algorithm, which provides better load balance by modeling the load more accurately using the word-to-post bipartite graph. We also develop a subscription placement algorithm, called LASP, to group together similar subscrip-B Fuat Basık fuat.basik@bilkent.edu.tr 1 Computer Engineering Department, Bilkent University, Ankara, Turkey tions, in order to minimize the subscription matching cost. Furthermore, to achieve good scalability for increasing number of nodes, we introduce techniques to handle workload skew. Finally, we introduce load shedding techniques for handling unexpected load spikes with small impact on the accuracy. Our experimental results show that S 3 -TM is scalable. Furthermore, the SALB algorithm provides more than 2.5× throughput compared to the baseline multicast and outperforms the graph partitioning-based approaches.

show abstract

SLIM: Scalable Linkage of Mobility Data

Basık

Ferhatosmanoğlu

Gedik

2020

View full text Add to dashboard Cite

show abstract

DBPal

Basık

Hättasch

Ilkhechi

et al. 2018

View full text Add to dashboard Cite

In this demo, we present DBPal, a novel data exploration tool with a natural language interface. DBPal leverages recent advances in deep models to make query understanding more robust in the following ways: First, DBPal uses novel machine translation models to translate natural language statements to SQL, making the translation process more robust to paraphrasing and linguistic variations. Second, to support the users in phrasing questions without knowing the database schema and the query features, DBPal provides a learned auto-completion model that suggests to users partial query extensions during query formulation and thus helps to write complex queries.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Fuat Basık

Spatio-Temporal Linkage over Location-Enhanced Services

Fair Task Allocation in Crowdsourced Delivery

S $$^3$$ 3 -TM: scalable streaming short text matching

SLIM: Scalable Linkage of Mobility Data

DBPal

Contact Info

Product

Resources

About