students | Gabriel Scalosub

A note for prospective graduate students and postdocs

If you are interested in pursuing graduate studies (M.Sc./Ph.D. program) in either theoretical or systems aspects of computer networks, cloud computing, or HW security, feel free to drop me a line and schedule a meeting where we could discuss things further. It is recommended you check out my recent publications beforehand to see if any of my current research topics are of specific interest.

Current

Ph.D.

Saar Ben-Yohana

Joint supervision with Chen Avin

Ph.D.

Nadav Keren

Joint supervision with Gil Einziger

Publications (1)

2026

NSDI

Latency-Aware Caching with Delayed Hits: From Bursty Traffic to Pipeline Architectures

N. Keren, G. Einziger, and G. Scalosub

Proceedings of the 23rd USENIX Symposium on Networked Systems Design and Implementation (NSDI), 2026

Abs Bib

Modern computing systems rely on caching to reduce access latency and optimize resource utilization. However, in heterogeneous storage and cloud environments, non-uniform access latencies across storage tiers, network locations, and intermediary caches undermine traditional caching. Moreover, modern cache algorithms that attempt to capture multiple access patterns, recency, frequency, and burstiness, often become complex and difficult to maintain. As a key contribution, we propose an adaptive caching architecture that treats caching strategies as a pipeline of simple, orthogonal policies, each focused on a distinct access bias. This modular design is easier to expand, debug, and integrate, and it self-adjusts the memory resources allocated to each stage to optimize overall workload performance. New heuristics can be introduced dynamically without disrupting existing behaviors. In addition, in latency-aware caching, one often encounters the phenomenon of delayed hits, where items not yet available in the cache are requested repeatedly. We introduce the Least Bursty Used (LBU) heuristic, which retains items exhibiting high burstiness even when they are neither recent nor frequent, thereby mitigating delayed hits that degrade request latency. We embed LBU within our pipeline and derive the Recency-Frequency-Burstiness (RFB) policy, which balances resources among recency, frequency, and burstiness. Evaluations on thirteen real-world storage traces from IBM, Twitter and Meta using latencies drawn from real-life deployments show that RFB reduces average request latency by 10\% compared to the best state-of-the-art alternative, while maintaining consistent performance, with a low standard deviation across bursty and non-bursty workloads.

@inproceedings{keren2026latency,
  bibtex_show = {true},
  abbr = {NSDI},
  themes = {caching},
  title = {Latency-Aware Caching with Delayed Hits: From Bursty Traffic to Pipeline Architectures},
  author = {Keren, N. and Einziger, G. and Scalosub, G.},
  booktitle = {Proceedings of the 23rd USENIX Symposium on Networked Systems Design and Implementation (NSDI)},
  pages = {2389--2405},
  year = {2026},
  abstract = {Modern computing systems rely on caching to reduce access latency and optimize resource utilization. However, in heterogeneous storage and cloud environments, non-uniform access latencies across storage tiers, network locations, and intermediary caches undermine traditional caching. Moreover, modern cache algorithms that attempt to capture multiple access patterns, recency, frequency, and burstiness, often become complex and difficult to maintain. As a key contribution, we propose an adaptive caching architecture that treats caching strategies as a pipeline of simple, orthogonal policies, each focused on a distinct access bias. This modular design is easier to expand, debug, and integrate, and it self-adjusts the memory resources allocated to each stage to optimize overall workload performance. New heuristics can be introduced dynamically without disrupting existing behaviors. In addition, in latency-aware caching, one often encounters the phenomenon of delayed hits, where items not yet available in the cache are requested repeatedly. We introduce the Least Bursty Used (LBU) heuristic, which retains items exhibiting high burstiness even when they are neither recent nor frequent, thereby mitigating delayed hits that degrade request latency. We embed LBU within our pipeline and derive the Recency-Frequency-Burstiness (RFB) policy, which balances resources among recency, frequency, and burstiness. Evaluations on thirteen real-world storage traces from IBM, Twitter and Meta using latencies drawn from real-life deployments show that RFB reduces average request latency by 10\% compared to the best state-of-the-art alternative, while maintaining consistent performance, with a low standard deviation across bursty and non-bursty workloads.},
  selected = {true}
}

M.Sc.

Yarin Oziel

Joint supervision with Yossi Oren

Publications (1)

2026

uASC

$\mu$-ops, I Did it Again: A Second Look at Port Assignment on Intel CPUs (WiP)

Y. Oziel, T. Laor, S. Levy, C. Maurice, Y. Oren, T. Rokicki, and G. Scalosub

Proceedings of the 2nd Microarchitecture Security Conference (uASC), 2026

Abs Bib

Modern Intel CPUs use a superscalar, out-of-order execution (OoOE) pipeline that maximizes instruction throughput through instruction-level parallelism (ILP). A critical component affecting system performance is the process in which $\mu$-ops are mapped to execution ports, which lays at the core of the ability to perform OoOE. Significant effort has been undertaken in the attempt to reverse-engineer and model this process. Our work revisits this question from a security perspective, focusing on potential vulnerabilities stemming from $\mu$-op port assignment and execution. Using carefully designed code gadgets, we expose behaviors that contradict state-of-the-art models. In this Work In Progress, we ask: Which undocumented behaviors in Intel's port assignment algorithm affect instruction scheduling? By extending the micro-benchmarks used by Abel and Reineke, we identify corner cases that strongly deviates from the common case, where the CPU adopts at least three different port assignment strategies. These depend not only on the instruction and the microarchitectural generation, but, surprisingly, also on immediate operands.

@inproceedings{oziel26uops,
  bibtex_show = {true},
  abbr = {uASC},
  themes = {security},
  title = {$\mu$-ops, I Did it Again: A Second Look at Port Assignment on Intel CPUs (WiP)},
  author = {Oziel, Y. and Laor, T. and Levy, S. and Maurice, C. and Oren, Y. and Rokicki, T. and Scalosub, G.},
  booktitle = {Proceedings of the 2nd Microarchitecture Security Conference (uASC)},
  year = {2026},
  abstract = {Modern Intel CPUs use a superscalar, out-of-order execution (OoOE) pipeline that maximizes instruction throughput through instruction-level parallelism (ILP). A critical component affecting system performance is the process in which $\mu$-ops are mapped to execution ports, which lays at the core of the ability to perform OoOE. Significant effort has been undertaken in the attempt to reverse-engineer and model this process. Our work revisits this question from a security perspective, focusing on potential vulnerabilities stemming from $\mu$-op port assignment and execution. Using carefully designed code gadgets, we expose behaviors that contradict state-of-the-art models. In this Work In Progress, we ask: Which undocumented behaviors in Intel's port assignment algorithm affect instruction scheduling? By extending the micro-benchmarks used by Abel and Reineke, we identify corner cases that strongly deviates from the common case, where the CPU adopts at least three different port assignment strategies. These depend not only on the instruction and the microarchitectural generation, but, surprisingly, also on immediate operands.},
  selected = {true}
}

M.Sc.

Shir Cohen

2026

M.Sc.

Saar Ben-Yohana

Joint supervision with Chen Avin

Publications (1)

2025

NAIC

T3P: Topology-Tailored Tensor Parallelism

S. Ben-Yohana, C. Avin, and G. Scalosub

Proceedings of the 2nd Workshop on Networks for AI Computing (NAIC), 2025

Abs Bib

As deep learning models continue to grow in scale and complexity, methods of distributed machine learning training, and particularly those used for large language models (LLMs), have become a critical ingredient in making such computations efficient and feasible. In such contexts, tensor parallelism (TP) is widely employed to distribute computations across multiple accelerators. However, since TP mandates frequent and high-volume communication between devices, the underlying network characteristics significantly influence performance. Previous work was mostly either model-agnostic or topology-agnostic and did not pick provably optimal configurations. This study presents Topology-Tailored Tensor Parallelism, T3P, an efficient algorithm that identifies the communication-optimal TP sharding configuration (within the considered search space) based on both the model architecture and the network topology. In particular, we show that T3P is optimal for any given resharding cost model.

@inproceedings{benyohana2025tttp,
  bibtex_show = {true},
  abbr = {NAIC},
  themes = {networking_ai},
  author = {Ben-Yohana, S. and Avin, C. and Scalosub, G.},
  title = {{T3P}: Topology-Tailored Tensor Parallelism},
  booktitle = {Proceedings of the 2nd Workshop on Networks for AI Computing (NAIC)},
  pages = {81--88},
  year = {2025},
  abstract = {As deep learning models continue to grow in scale and complexity, methods of distributed machine learning training, and particularly those used for large language models (LLMs), have become a critical ingredient in making such computations efficient and feasible. In such contexts, tensor parallelism (TP) is widely employed to distribute computations across multiple accelerators. However, since TP mandates frequent and high-volume communication between devices, the underlying network characteristics significantly influence performance. Previous work was mostly either model-agnostic or topology-agnostic and did not pick provably optimal configurations. This study presents Topology-Tailored Tensor Parallelism, T3P, an efficient algorithm that identifies the communication-optimal TP sharding configuration (within the considered search space) based on both the model architecture and the network topology. In particular, we show that T3P is optimal for any given resharding cost model.},
  selected = {true}
}

2025

M.Sc.

Shahar Blank

Joint supervision with Gil Einziger

2024

M.Sc.

Nadav Keren

Joint supervision with Gil Einziger

Publications (1)

2023

SYSTOR

On Latency Awareness with Delayed Hits (Poster)

N. Keren, G. Einziger, and G. Scalosub

Proceedings of the 16th ACM International Conference on Systems and Storage (SYSTOR), 2023

Abs Bib

We consider a new locality pattern in the form of burstiness to improve cache effectiveness in workflows where items are requested in possibly infrequent yet costly batches. Adding a cache that handles only bursty items to existing State-Of-The-Art algorithms shows a significant improvement in overall average time per query.

@inproceedings{keren2023latency,
  bibtex_show = {true},
  abbr = {SYSTOR},
  themes = {caching},
  title = {On Latency Awareness with Delayed Hits (Poster)},
  author = {Keren, N. and Einziger, G. and Scalosub, G.},
  booktitle = {Proceedings of the 16th ACM International Conference on Systems and Storage (SYSTOR)},
  pages = {144},
  abstract = {We consider a new locality pattern in the form of burstiness to improve cache effectiveness in workflows where items are requested in possibly infrequent yet costly batches. Adding a cache that handles only bursty items to existing State-Of-The-Art algorithms shows a significant improvement in overall average time per query.},
  year = {2023}
}

M.Sc.

Elioz Geller

Joint supervision with Chen Avin

2023

M.Sc.

Uriah Elkayam

Joint supervision with Niv Gilboa
M.Sc.

Tom Kessous

Joint supervision with Niv Gilboa

M.Sc.

Itamar Gozlan

Joint supervision with Gil Einziger

Publications (1)

2023

SIGMETRICS

Go-to-Controller is Better: Efficient and Optimal LPM Caching with Splicing

I. Gozlan, C. Avin, G. Einziger, and G. Scalosub

Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS), 2023

Abs Bib

Data center networks must support huge forwarding policies as they handle the traffic of the various tenants. Since such policies cannot be stored within the limited memory available at commodity switches, SDN controllers can manage the memory available at the switch as a cache, updating and changing the forwarding rules in the cache according to the policy and workloads dynamics. Most policies, such as Longest-prefix-match (LPM) policies, include dependencies between the forwarding rules, which introduce consistency constraints on the structure of the cached content, affecting the performance in terms of throughput and delay. Previous work suggested the concept of splicing to address such deficiencies, where modified Go-to-Controller rules can be inserted into the cache to improve performance while maintaining consistency. We present the first optimal algorithm for determining the cache content with splicing, as well as several efficient heuristics with some performance guarantees. We evaluate our solutions using traces derived from real systems and traffic, and show that splicing can reduce the cache miss ratio by as much as 30%, without increasing the cache size. We further propose a new metric which can provide a quick estimate as to the potential benefits of splicing compared to classical LPM-caching.

@article{gozlan2023goto,
  bibtex_show = {true},
  abbr = {SIGMETRICS},
  themes = {caching},
  title = {Go-to-Controller is Better: Efficient and Optimal LPM Caching with Splicing},
  author = {Gozlan, I. and Avin, C. and Einziger, G. and Scalosub, G.},
  journal = {Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS)},
  volume = {7},
  number = {1},
  pages = {10:1--10:33},
  year = {2023},
  abstract = {Data center networks must support huge forwarding policies as they handle the traffic of the various tenants. Since such policies cannot be stored within the limited memory available at commodity switches, SDN controllers can manage the memory available at the switch as a cache, updating and changing the forwarding rules in the cache according to the policy and workloads dynamics. Most policies, such as Longest-prefix-match (LPM) policies, include dependencies between the forwarding rules, which introduce consistency constraints on the structure of the cached content, affecting the performance in terms of throughput and delay. Previous work suggested the concept of splicing to address such deficiencies, where modified Go-to-Controller rules can be inserted into the cache to improve performance while maintaining consistency. We present the first optimal algorithm for determining the cache content with splicing, as well as several efficient heuristics with some performance guarantees. We evaluate our solutions using traces derived from real systems and traffic, and show that splicing can reduce the cache miss ratio by as much as 30%, without increasing the cache size. We further propose a new metric which can provide a quick estimate as to the potential benefits of splicing compared to classical LPM-caching.},
  selected = {true}
}

2021

M.Sc.

Raz Segal

Joint supervision with Chen Avin

Publications (3)

2024

TNSM

SOAR: minimizing network utilization with bounded in-network computing

R. Segal, C. Avin, and G. Scalosub

IEEE Transactions on Network and Service Management, 2024

Abs Bib

In-network computing via smart networking devices is a recent trend in modern datacenter networks. State-of-the-art switches with near line-rate computing and aggregation capabilities enable acceleration and improved resource utilization for modern applications like large-scale distributed and federated machine learning, as well as big data analytics. We study the problem of activating a \em limited number of in-network computing devices within a network, aiming at reducing the overall cost incurred by such a deployment. Such limitations on the number of in-network computing elements arise, e.g., in incremental upgrades of network infrastructure, and are also due to requiring specialized middleboxes, or FPGAs, for supporting heterogeneous workloads, and multiple tenants. We present an efficient optimal algorithm for placing such devices in tree networks with arbitrary link rates, and further evaluate its performance in various scenarios and for various tasks, including federated/distributed ML and big data analytics. Our results show that even a small fraction of network devices supporting in-network aggregation leads to a significant reduction in network utilization cost. Furthermore, we show that various intuitive strategies for performing such placements are significantly inferior compared with our solution, for varying workloads, tasks, and link rates.

@article{segal2024soar,
  bibtex_show = {true},
  abbr = {TNSM},
  themes = {networking_ai},
  title = {SOAR: minimizing network utilization with bounded in-network computing},
  author = {Segal, R. and Avin, C. and Scalosub, G.},
  journal = {IEEE Transactions on Network and Service Management},
  volume = {21},
  number = {2},
  pages = {1832--1851},
  year = {2024},
  abstract = {In-network computing via smart networking devices is a recent trend in modern datacenter networks. State-of-the-art switches with near line-rate computing and aggregation capabilities enable acceleration and improved resource utilization for modern applications like large-scale distributed and federated machine learning, as well as big data analytics. We study the problem of activating a {\em limited number} of in-network computing devices within a network, aiming at reducing the overall cost incurred by such a deployment. Such limitations on the number of in-network computing elements arise, e.g., in incremental upgrades of network infrastructure, and are also due to requiring specialized middleboxes, or FPGAs, for supporting heterogeneous workloads, and multiple tenants. We present an efficient optimal algorithm for placing such devices in tree networks with arbitrary link rates, and further evaluate its performance in various scenarios and for various tasks, including federated/distributed ML and big data analytics. Our results show that even a small fraction of network devices supporting in-network aggregation leads to a significant reduction in network utilization cost. Furthermore, we show that various intuitive strategies for performing such placements are significantly inferior compared with our solution, for varying workloads, tasks, and link rates.},
  selected = {true}
}

2022

INFOCOM

Constrained In-network Computing with Low Congestion in Datacenter Networks

R. Segal, C. Avin, and G. Scalosub

Proceedings of IEEE INFOCOM 2022, the 41st Annual Joint Conference of the IEEE Computer and Communications Societies, 2022

Abs Bib

Distributed computing has become a common practice nowadays, where recent focus has been given to the usage of smart networking devices with in-network computing capabilities. State-of-the-art switches with near-line rate computing and aggregation capabilities enable acceleration and improved performance for various modern applications like big data analytics and large-scale distributed and federated machine learning. In this work, we formulate and study the theoretical algorithmic foundations of such approaches, and focus on how to deploy and use constrained in-network computing capabilities within the data center. We focus our attention on reducing the network congestion, i.e., the most congested link in the network, while supporting the given workload(s). We present an efficient optimal algorithm for tree-like network topologies and show that our solution provides as much as an x13 improvement over common alternative approaches. In particular, our results show that having merely a small fraction of network devices that support in-network aggregation can significantly reduce the network congestion, both for single and multiple workloads.

@inproceedings{segal2022constrained,
  bibtex_show = {true},
  abbr = {INFOCOM},
  themes = {networking_ai},
  title = {Constrained In-network Computing with Low Congestion in Datacenter Networks},
  author = {Segal, R. and Avin, C. and Scalosub, G.},
  booktitle = {Proceedings of IEEE INFOCOM 2022, the 41st Annual Joint Conference of the IEEE Computer and Communications Societies},
  pages = {1639--1648},
  year = {2022},
  abstract = {Distributed computing has become a common practice nowadays, where recent focus has been given to the usage of smart networking devices with in-network computing capabilities. State-of-the-art switches with near-line rate computing and aggregation capabilities enable acceleration and improved performance for various modern applications like big data analytics and large-scale distributed and federated machine learning. In this work, we formulate and study the theoretical algorithmic foundations of such approaches, and focus on how to deploy and use constrained in-network computing capabilities within the data center. We focus our attention on reducing the network congestion, i.e., the most congested link in the network, while supporting the given workload(s). We present an efficient optimal algorithm for tree-like network topologies and show that our solution provides as much as an x13 improvement over common alternative approaches. In particular, our results show that having merely a small fraction of network devices that support in-network aggregation can significantly reduce the network congestion, both for single and multiple workloads.},
  selected = {true}
}

2021

CoNEXT

SOAR: minimizing network utilization with bounded in-network computing

R. Segal, C. Avin, and G. Scalosub

Proceedings of CoNEXT 2021, the 17th International Conference on emerging Networking EXperiments and Technologies, 2021

Abs Bib

In-network computing via smart networking devices is a recent trend for modern datacenter networks. State-of-the-art switches with near line rate computing and aggregation capabilities are developed to enable, e.g., acceleration and better utilization for modern applications like big data analytics, and large scale distributed and federated machine learning. We formulate and study the problem of activating a limited number of in-network computing devices within a network, aiming at reducing the overall network utilization for a given workload. Such limitations on the number of in-network computing elements per workload arise, e.g., in incremental upgrades of network infrastructure, and are also due to requiring specialized middleboxes, or FPGAs, that should support heterogeneous workloads, and multiple tenants. We present an optimal and efficient algorithm for placing such devices in tree networks with arbitrary link rates, and further evaluate our proposed solution in various scenarios and for various tasks. Our results show that having merely a small fraction of network devices support in-network aggregation can lead to a significant reduction in network utilization. Furthermore, we show that various intuitive strategies for performing such placements exhibit significantly inferior performance compared to our solution, for varying workloads, tasks, and link rates.

@inproceedings{segal2021soar,
  bibtex_show = {true},
  abbr = {CoNEXT},
  themes = {networking_ai},
  title = {SOAR: minimizing network utilization with bounded in-network computing},
  author = {Segal, R. and Avin, C. and Scalosub, G.},
  booktitle = {Proceedings of CoNEXT 2021, the 17th International Conference on emerging Networking EXperiments and Technologies},
  pages = {16--29},
  abstract = {In-network computing via smart networking devices is a recent trend for modern datacenter networks. State-of-the-art switches with near line rate computing and aggregation capabilities are developed to enable, e.g., acceleration and better utilization for modern applications like big data analytics, and large scale distributed and federated machine learning. We formulate and study the problem of activating a limited number of in-network computing devices within a network, aiming at reducing the overall network utilization for a given workload. Such limitations on the number of in-network computing elements per workload arise, e.g., in incremental upgrades of network infrastructure, and are also due to requiring specialized middleboxes, or FPGAs, that should support heterogeneous workloads, and multiple tenants. We present an optimal and efficient algorithm for placing such devices in tree networks with arbitrary link rates, and further evaluate our proposed solution in various scenarios and for various tasks. Our results show that having merely a small fraction of network devices support in-network aggregation can lead to a significant reduction in network utilization. Furthermore, we show that various intuitive strategies for performing such placements exhibit significantly inferior performance compared to our solution, for varying workloads, tasks, and link rates.},
  year = {2021}
}

2019

Ph.D.

Itamar Cohen

Now at Ariel University

Publications (10)

2023

ToN

Dynamic Service Provisioning in the Edge-cloud Continuum with Bounded Resources

I. Cohen, C. Chiasserini, P. Giaccone, and G. Scalosub

IEEE/ACM Transactions on Networking, 2023

Abs Bib

We consider a hierarchical edge-cloud architecture in which services are provided to mobile users as chains of virtual network functions. Each service has specific computation requirements and target delay performance, which require placing the corresponding chain properly and allocating a suitable amount of computing resources. Furthermore, chain migration may be necessary to meet the services' target delay. We model and formalize the problem of finding a feasible chain placement and resource allocation, while minimizing the migration, bandwidth, and computation costs. We tackle this problem by partitioning it into a (i) CPU allocation problem, and a (ii) placement problem. For the CPU allocation problem, we find an optimal solution. For the placement problem, we show that even finding a feasible solution is NP-hard, and envision an algorithm that is guaranteed to find a feasible solution while leveraging a bounded amount of resource augmentation. Our algorithms are incorporated into a solution framework that aims to minimize both the cost and the required resource augmentation. The results, obtained through trace-driven, large-scale simulations, show that our framework can provide a close-to-optimal solution while running several orders of magnitude faster than an ILP solver.

@article{cohen2023dynamic,
  bibtex_show = {true},
  abbr = {ToN},
  themes = {edge_cloud},
  title = {Dynamic Service Provisioning in the Edge-cloud Continuum with Bounded Resources},
  author = {Cohen, I. and Chiasserini, C.-F. and Giaccone, P. and Scalosub, G.},
  journal = {IEEE/ACM Transactions on Networking},
  volume = {31},
  number = {6},
  pages = {3096--3111},
  year = {2023},
  abstract = {We consider a hierarchical edge-cloud architecture in which services are provided to mobile users as chains of virtual network functions. Each service has specific computation requirements and target delay performance, which require placing the corresponding chain properly and allocating a suitable amount of computing resources. Furthermore, chain migration may be necessary to meet the services' target delay. We model and formalize the problem of finding a feasible chain placement and resource allocation, while minimizing the migration, bandwidth, and computation costs. We tackle this problem by partitioning it into a (i) CPU allocation problem, and a (ii) placement problem. For the CPU allocation problem, we find an optimal solution. For the placement problem, we show that even finding a feasible solution is NP-hard, and envision an algorithm that is guaranteed to find a feasible solution while leveraging a bounded amount of resource augmentation. Our algorithms are incorporated into a solution framework that aims to minimize both the cost and the required resource augmentation. The results, obtained through trace-driven, large-scale simulations, show that our framework can provide a close-to-optimal solution while running several orders of magnitude faster than an ILP solver.},
  selected = {true}
}

TNSM

High Throughput VMs Placement with Constrained Communication Overhead and Provable Guarantees

I. Cohen, G. Einziger, M. Goldstein, Y. Sa'ar, G. Scalosub, and E. Waisbard

IEEE Transactions on Network and Service Management, 2023

Abs Bib

Placement of VMs in the cloud is one of the most fundamental problems in systems research. Traditionally, placement algorithms assume that the schedulers have complete information about the currently available resources at each host. However, this assumption is in many cases unrealistic, as gathering fresh status information from each of the thousands of hosts in a large data center incurs excessive communication overhead, which results in long queueing delays. Efforts to resolve this problem by employing several parallel schedulers typically exhibit collisions when several schedulers are simultaneously trying to place VMs on the same host. Our work analyzes the performance of various placement algorithms and provides empirical evidence that using multiple randomized schedulers obtains high throughput, while significantly decreasing both the communication overhead, and the number of collisions between schedulers. We, therefore, introduce Adaptive Partial State Random (APSR) -- an efficient parallel random resource management algorithm that samples only from a small number of hosts and dynamically adjusts the degree of parallelism to provide provable guarantees on the probability of collisions between distinct schedulers. We formally analyze APSR, evaluate it on real workloads, and integrate it into the popular OpenStack cloud management platform. Our evaluation shows that APSR matches the throughput provided by other parallel schedulers, while achieving up to 13x lower decline ratio and a reduction of over 85\% in communication overheads.

@article{cohen2023highthroughput,
  bibtex_show = {true},
  abbr = {TNSM},
  themes = {edge_cloud},
  title = {High Throughput VMs Placement with Constrained Communication Overhead and Provable Guarantees},
  author = {Cohen, I. and Einziger, G. and Goldstein, M. and Sa'ar, Y. and Scalosub, G. and Waisbard, E.},
  journal = {IEEE Transactions on Network and Service Management},
  volume = {20},
  number = {3},
  pages = {3148--3161},
  abstract = {Placement of VMs in the cloud is one of the most fundamental problems in systems research. Traditionally, placement algorithms assume that the schedulers have complete information about the currently available resources at each host. However, this assumption is in many cases unrealistic, as gathering fresh status information from each of the thousands of hosts in a large data center incurs excessive communication overhead, which results in long queueing delays. Efforts to resolve this problem by employing several parallel schedulers typically exhibit collisions when several schedulers are simultaneously trying to place VMs on the same host. Our work analyzes the performance of various placement algorithms and provides empirical evidence that using multiple randomized schedulers obtains high throughput, while significantly decreasing both the communication overhead, and the number of collisions between schedulers. We, therefore, introduce Adaptive Partial State Random (APSR) -- an efficient parallel random resource management algorithm that samples only from a small number of hosts and dynamically adjusts the degree of parallelism to provide provable guarantees on the probability of collisions between distinct schedulers. We formally analyze APSR, evaluate it on real workloads, and integrate it into the popular OpenStack cloud management platform. Our evaluation shows that APSR matches the throughput provided by other parallel schedulers, while achieving up to 13x lower decline ratio and a reduction of over 85\% in communication overheads.},
  year = {2023}
}

2022

ToN

False Negative Awareness in Indicator-Based Caching Systems

I. Cohen, G. Einziger, and G. Scalosub

IEEE/ACM Transactions on Networking, 2022

Abs Bib

Distributed caching systems such as content distribution networks often advertise their content via lightweight approximate indicators (e.g., Bloom filters) to efficiently inform clients where each datum is likely cached. While false-positive indications are necessary and well understood, most existing works assume no false-negative indications. Our work illustrates practical scenarios where false-negatives are unavoidable and ignoring them significantly impacts system performance. Specifically, we focus on false-negatives induced by indicator staleness, which arises whenever the system advertises the indicator only periodically, rather than immediately reporting every change in the cache. Such scenarios naturally occur, e.g., in bandwidth-constraint environments or when latency impedes each client's ability to obtain an updated indicator. Our work introduces novel false-negative aware access policies that continuously estimate the false-negative ratio and sometimes access caches despite negative indications. We present optimal policies for homogeneous settings and provide approximation guarantees for our algorithms in heterogeneous environments. We further perform an extensive simulation study with multiple real system traces. We show that our false-negative aware algorithms incur a significantly lower service cost than existing approaches or match the cost of these approaches while requiring an order of magnitude fewer resources (e.g., caching capacity or bandwidth).

@article{cohen2022falsenegative,
  bibtex_show = {true},
  abbr = {ToN},
  themes = {caching},
  title = {False Negative Awareness in Indicator-Based Caching Systems},
  author = {Cohen, I. and Einziger, G. and Scalosub, G.},
  journal = {IEEE/ACM Transactions on Networking},
  volume = {30},
  number = {6},
  pages = {2674--2687},
  year = {2022},
  abstract = {Distributed caching systems such as content distribution networks often advertise their content via lightweight approximate indicators (e.g., Bloom filters) to efficiently inform clients where each datum is likely cached. While false-positive indications are necessary and well understood, most existing works assume no false-negative indications. Our work illustrates practical scenarios where false-negatives are unavoidable and ignoring them significantly impacts system performance. Specifically, we focus on false-negatives induced by indicator staleness, which arises whenever the system advertises the indicator only periodically, rather than immediately reporting every change in the cache. Such scenarios naturally occur, e.g., in bandwidth-constraint environments or when latency impedes each client's ability to obtain an updated indicator. Our work introduces novel false-negative aware access policies that continuously estimate the false-negative ratio and sometimes access caches despite negative indications. We present optimal policies for homogeneous settings and provide approximation guarantees for our algorithms in heterogeneous environments. We further perform an extensive simulation study with multiple real system traces. We show that our false-negative aware algorithms incur a significantly lower service cost than existing approaches or match the cost of these approaches while requiring an order of magnitude fewer resources (e.g., caching capacity or bandwidth).},
  selected = {true}
}

2021

ToN

Access Strategies for Network Caching

I. Cohen, G. Einziger, R. Friedman, and G. Scalosub

IEEE/ACM Transactions on Networking, 2021

Abs Bib

Having multiple data stores that can potentially serve content is common in modern networked applications. Data stores often publish approximate summaries of their content to enable effective utilization. Since these summaries are not entirely accurate, forming an efficient access strategy to multiple data stores becomes a complex risk management problem. This paper formally models this problem as a cost minimization problem, while taking into account both access costs, the inaccuracy of the approximate summaries, as well as the penalties incurred by failed requests. We introduce practical algorithms with guaranteed approximation ratios and further show that they are optimal in various settings. We also perform an extensive simulation study based on real data and show that our algorithms are more robust than existing heuristics. That is, they exhibit near-optimal performance in various settings, whereas the efficiency of existing approaches depends upon system parameters that may change over time, or be otherwise unknown.

@article{cohen2021access,
  bibtex_show = {true},
  abbr = {ToN},
  themes = {caching},
  title = {Access Strategies for Network Caching},
  author = {Cohen, I. and Einziger, G. and Friedman, R. and Scalosub, G.},
  journal = {IEEE/ACM Transactions on Networking},
  volume = {29},
  number = {2},
  pages = {609--622},
  abstract = {Having multiple data stores that can potentially serve content is common in modern networked applications. Data stores often publish approximate summaries of their content to enable effective utilization. Since these summaries are not entirely accurate, forming an efficient access strategy to multiple data stores becomes a complex risk management problem. This paper formally models this problem as a cost minimization problem, while taking into account both access costs, the inaccuracy of the approximate summaries, as well as the penalties incurred by failed requests. We introduce practical algorithms with guaranteed approximation ratios and further show that they are optimal in various settings. We also perform an extensive simulation study based on real data and show that our algorithms are more robust than existing heuristics. That is, they exhibit near-optimal performance in various settings, whereas the efficiency of existing approaches depends upon system parameters that may change over time, or be otherwise unknown.},
  year = {2021}
}

IFIP Networking

Parallel VM Deployment with Provable Guarantees

I. Cohen, G. Einziger, M. Goldstein, Y. Sa'ar, G. Scalosub, and E. Waisbard

Proceedings of IFIP Networking 2021, the 20th International Federation for Information Processing (IFIP) Networking Conference, 2021

Abs Bib

Network Function Virtualization (NFV) carries the potential for on-demand deployment of network algorithms in virtual machines (VMs). In large clouds, however, VM resource allocation incurs delays that hinder the dynamic scaling of such NFV deployment. Parallel resource management is a promising direction for boosting performance, but it may significantly increase the communication overhead and the decline ratio of deployment attempts. Our work analyzes the performance of various placement algorithms and provides empirical evidence that state of the art parallel resource management dramatically increases the decline ratio of deterministic algorithms, but hardly affects randomized algorithms. We therefore introduce APSR -- an efficient parallel random resource management algorithm that requires information only from a small number of hosts and dynamically adjusts the degree of parallelism to provide provable decline ratio guarantees. We formally analyze APSR, evaluate it on real workloads, and integrate it into the popular OpenStack cloud management platform. Our evaluation shows that APSR matches the throughput provided by other parallel schedulers, while achieving up to 13x lower decline ratio and a reduction of over 85% in communication overheads.

@inproceedings{cohen2021parallel,
  bibtex_show = {true},
  abbr = {IFIP Networking},
  themes = {edge_cloud},
  title = {Parallel VM Deployment with Provable Guarantees},
  author = {Cohen, I. and Einziger, G. and Goldstein, M. and Sa'ar, Y. and Scalosub, G. and Waisbard, E.},
  booktitle = {Proceedings of IFIP Networking 2021, the 20th International Federation for Information Processing (IFIP) Networking Conference},
  year = {2021},
  abstract = {Network Function Virtualization (NFV) carries the potential for on-demand deployment of network algorithms in virtual machines (VMs). In large clouds, however, VM resource allocation incurs delays that hinder the dynamic scaling of such NFV deployment. Parallel resource management is a promising direction for boosting performance, but it may significantly increase the communication overhead and the decline ratio of deployment attempts. Our work analyzes the performance of various placement algorithms and provides empirical evidence that state of the art parallel resource management dramatically increases the decline ratio of deterministic algorithms, but hardly affects randomized algorithms. We therefore introduce APSR -- an efficient parallel random resource management algorithm that requires information only from a small number of hosts and dynamically adjusts the degree of parallelism to provide provable decline ratio guarantees. We formally analyze APSR, evaluate it on real workloads, and integrate it into the popular OpenStack cloud management platform. Our evaluation shows that APSR matches the throughput provided by other parallel schedulers, while achieving up to 13x lower decline ratio and a reduction of over 85% in communication overheads.},
  additional_info = {Poster appeared in IEEE INFOCOM 2020.}
}

ICDCS

On the Power of False Negative Awareness in Indicator-based Caching Systems

I. Cohen, G. Einziger, and G. Scalosub

Proceedings of ICDCS 2021, the 41st IEEE International Conference on Distributed Computing Systems, 2021

Abs Bib

Distributed caching systems such as content distribution networks often advertise their content via lightweight approximate indicators (e.g., Bloom filters) to efficiently inform clients where each datum is likely cached. While false-positive indications are necessary and well understood, most existing works assume no false-negative indications. Our work illustrates practical scenarios where false-negatives are unavoidable and ignoring them has a significant impact on system performance. Specifically, we focus on false-negatives induced by indicator staleness, which arises whenever the system advertises the indicator only periodically, rather than immediately reporting every change in the cache. Such scenarios naturally occur, e.g., in bandwidth-constraint environments or when latency impedes each client's ability to obtain an updated indicator. Our work introduces novel false-negative aware access policies that continuously estimate the false-negative ratio and sometimes access caches despite negative indications. We present optimal policies for homogeneous settings and provide approximation guarantees for our algorithms in heterogeneous environments. We further perform an extensive simulation study with multiple real system traces. We show that our false-negative aware algorithms incur a significantly lower access cost than existing approaches or match the cost of these approaches while requiring an order of magnitude fewer resources (e.g., caching capacity or bandwidth).

@inproceedings{cohen2021power,
  bibtex_show = {true},
  abbr = {ICDCS},
  themes = {caching},
  title = {On the Power of False Negative Awareness in Indicator-based Caching Systems},
  author = {Cohen, I. and Einziger, G. and Scalosub, G.},
  booktitle = {Proceedings of ICDCS 2021, the 41st IEEE International Conference on Distributed Computing Systems},
  pages = {46--56},
  abstract = {Distributed caching systems such as content distribution networks often advertise their content via lightweight approximate indicators (e.g., Bloom filters) to efficiently inform clients where each datum is likely cached. While false-positive indications are necessary and well understood, most existing works assume no false-negative indications. Our work illustrates practical scenarios where false-negatives are unavoidable and ignoring them has a significant impact on system performance. Specifically, we focus on false-negatives induced by indicator staleness, which arises whenever the system advertises the indicator only periodically, rather than immediately reporting every change in the cache. Such scenarios naturally occur, e.g., in bandwidth-constraint environments or when latency impedes each client's ability to obtain an updated indicator. Our work introduces novel false-negative aware access policies that continuously estimate the false-negative ratio and sometimes access caches despite negative indications. We present optimal policies for homogeneous settings and provide approximation guarantees for our algorithms in heterogeneous environments. We further perform an extensive simulation study with multiple real system traces. We show that our false-negative aware algorithms incur a significantly lower access cost than existing approaches or match the cost of these approaches while requiring an order of magnitude fewer resources (e.g., caching capacity or bandwidth).},
  year = {2021}
}

INFOCOM

Self-adjusting Advertisement of Cache Indicators with Bandwidth Constraints

I. Cohen, G. Einziger, and G. Scalosub

Proceedings of IEEE INFOCOM 2021, the 40th Annual Joint Conference of the IEEE Computer and Communications Societies, 2021

Abs Bib

Cache advertisements reduce the access cost by allowing users to skip the cache when it does not contain their datum. Such advertisements are used in multiple networked domains such as 5G networks, wide area networks, and information-centric networking. The selection of an advertisement strategy exposes a trade-off between the access cost and bandwidth consumption. Still, existing works mostly apply a trial-and-error approach for selecting the best strategy, as the rigorous foundations required for optimizing such decisions is lacking.Our work shows that the desired advertisement policy depends on numerous parameters such as the cache policy, the workload, the cache size, and the available bandwidth. In particular, we show that there is no ideal single configuration. Therefore, we design an adaptive, self-adjusting algorithm that periodically selects an advertisement policy. Our algorithm does not require any prior information about the cache policy, cache size, or work-load, and does not require any apriori configuration. Through extensive simulations, using several state-of-the-art cache policies, and real workloads, we show that our approach attains a similar cost to that of the best static configuration (which is only identified in retrospect) in each case.

@inproceedings{cohen2021selfadjusting,
  bibtex_show = {true},
  abbr = {INFOCOM},
  themes = {caching},
  title = {Self-adjusting Advertisement of Cache Indicators with Bandwidth Constraints},
  author = {Cohen, I. and Einziger, G. and Scalosub, G.},
  booktitle = {Proceedings of IEEE INFOCOM 2021, the 40th Annual Joint Conference of the IEEE Computer and Communications Societies},
  year = {2021},
  abstract = {Cache advertisements reduce the access cost by allowing users to skip the cache when it does not contain their datum. Such advertisements are used in multiple networked domains such as 5G networks, wide area networks, and information-centric networking. The selection of an advertisement strategy exposes a trade-off between the access cost and bandwidth consumption. Still, existing works mostly apply a trial-and-error approach for selecting the best strategy, as the rigorous foundations required for optimizing such decisions is lacking.Our work shows that the desired advertisement policy depends on numerous parameters such as the cache policy, the workload, the cache size, and the available bandwidth. In particular, we show that there is no ideal single configuration. Therefore, we design an adaptive, self-adjusting algorithm that periodically selects an advertisement policy. Our algorithm does not require any prior information about the cache policy, cache size, or work-load, and does not require any apriori configuration. Through extensive simulations, using several state-of-the-art cache policies, and real workloads, we show that our approach attains a similar cost to that of the best static configuration (which is only identified in retrospect) in each case.},
  additional_info = {Also appeared in the Highlights Track in SYSTOR 2021.}
}

2019

INFOCOM

Access Strategies for Network Caching

I. Cohen, G. Einziger, R. Friedman, and G. Scalosub

Proceedings of IEEE INFOCOM 2019, the 38th Annual Joint Conference of the IEEE Computer and Communications Societies, 2019

Abs Bib

Having multiple data stores that can potentially serve content is common in modern networked applications. Data stores often publish approximate summaries of their content to enable effective utilization. Since these summaries are not entirely accurate, forming an efficient access strategy to multiple data stores becomes a complex risk management problem. This paper formally models this problem, and introduces practical algorithms with guaranteed approximation ratios, and in particular we show that our algorithms are optimal in a variety of settings. We also perform an extensive simulation study based on real data, and show that our algorithms are more robust than existing heuristics. That is, they exhibit near optimal performance in various settings whereas the efficiency of existing approaches depends upon system parameters that may change over time, or be otherwise unknown.

@inproceedings{cohen2019access,
  bibtex_show = {true},
  abbr = {INFOCOM},
  themes = {caching},
  title = {Access Strategies for Network Caching},
  author = {Cohen, I. and Einziger, G. and Friedman, R. and Scalosub, G.},
  booktitle = {Proceedings of IEEE INFOCOM 2019, the 38th Annual Joint Conference of the IEEE Computer and Communications Societies},
  pages = {28--36},
  year = {2019},
  pdf = {publications/cohen19access.pdf},
  abstract = {Having multiple data stores that can potentially serve content is common in modern networked applications. Data stores often publish approximate summaries of their content to enable effective utilization. Since these summaries are not entirely accurate, forming an efficient access strategy to multiple data stores becomes a complex risk management problem. This paper formally models this problem, and introduces practical algorithms with guaranteed approximation ratios, and in particular we show that our algorithms are optimal in a variety of settings. We also perform an extensive simulation study based on real data, and show that our algorithms are more robust than existing heuristics. That is, they exhibit near optimal performance in various settings whereas the efficiency of existing approaches depends upon system parameters that may change over time, or be otherwise unknown.}
}

2018

CompNet

Queueing in the Mist: Buffering and Scheduling with Limited Knowledge

I. Cohen, and G. Scalosub

Computer Networks, 2018

Abs Bib

Scheduling and managing queues with bounded buffers are among the most fundamental problems in computer networking. Traditionally, it is often assumed that all the properties of each packet are known immediately upon arrival. However, as traffic becomes increasingly heterogeneous and complex, such assumptions are in many cases invalid. In particular, in various scenarios information about packet characteristics becomes available only after the packet has undergone some initial processing. In this work, we study the problem of managing queues with limited knowledge. We start by showing lower bounds on the competitive ratio of any algorithm in such settings. The techniques used in our proofs, which make use of a carefully crafted Markov process, may be of independent interest, and can potentially be used in other similar settings as well. Next, we use the insight obtained from these bounds to identify several algorithmic concepts appropriate for the problem, and use these guidelines to design a concrete algorithmic framework. We analyze the performance of our proposed algorithm, and further show how it can be implemented in various settings, which differ by the type and nature of the unknown information. We further validate our results and algorithmic approach by an extensive simulation study that provides further insights as to our algorithmic design principles in face of limited knowledge.

@article{cohen2018queueing,
  bibtex_show = {true},
  abbr = {CompNet},
  themes = {buffer_scheduling},
  title = {Queueing in the Mist: Buffering and Scheduling with Limited Knowledge},
  author = {Cohen, I. and Scalosub, G.},
  journal = {Computer Networks},
  volume = {147},
  pages = {204--220},
  abstract = {Scheduling and managing queues with bounded buffers are among the most fundamental problems in computer networking. Traditionally, it is often assumed that all the properties of each packet are known immediately upon arrival. However, as traffic becomes increasingly heterogeneous and complex, such assumptions are in many cases invalid. In particular, in various scenarios information about packet characteristics becomes available only after the packet has undergone some initial processing. In this work, we study the problem of managing queues with limited knowledge. We start by showing lower bounds on the competitive ratio of any algorithm in such settings. The techniques used in our proofs, which make use of a carefully crafted Markov process, may be of independent interest, and can potentially be used in other similar settings as well. Next, we use the insight obtained from these bounds to identify several algorithmic concepts appropriate for the problem, and use these guidelines to design a concrete algorithmic framework. We analyze the performance of our proposed algorithm, and further show how it can be implemented in various settings, which differ by the type and nature of the unknown information. We further validate our results and algorithmic approach by an extensive simulation study that provides further insights as to our algorithmic design principles in face of limited knowledge.},
  year = {2018}
}

2017

IWQoS

Queueing in the Mist: Buffering and Scheduling with Limited Knowledge

I. Cohen, and G. Scalosub

Proceedings of IWQoS 2017, the 25th IEEE/ACM International Symposium on Quality of Service, 2017

Abs Bib

Scheduling and managing queues with bounded buffers are among the most fundamental problems in computer networking. Traditionally, it is often assumed that all the properties of each packet are known immediately upon arrival. However, as traffic becomes increasingly heterogeneous and complex, such assumptions are in many cases invalid. In particular, in various scenarios information about packet characteristics becomes available only after the packet has undergone some initial processing. In this work, we study the problem of managing queues with limited knowledge. We start by showing lower bounds on the competitive ratio of any algorithm in such settings. Next, we use the insight obtained from these bounds to identify several algorithmic concepts appropriate for the problem, and use these guidelines to design a concrete algorithmic framework. We analyze the performance of our proposed algorithm, and further show how it can be implemented in various settings, which differ by the type and nature of the unknown information. We further validate our results and algorithmic approach by a simulation study that provides further insights as to our algorithmic design principles in face of limited knowledge.

@inproceedings{cohen2017queueing,
  bibtex_show = {true},
  abbr = {IWQoS},
  title = {Queueing in the Mist: Buffering and Scheduling with Limited Knowledge},
  author = {Cohen, I. and Scalosub, G.},
  booktitle = {Proceedings of IWQoS 2017, the 25th IEEE/ACM International Symposium on Quality of Service},
  year = {2017},
  pdf = {publications/cohen17queueing.pdf},
  abstract = {Scheduling and managing queues with bounded buffers are among the most fundamental problems in computer networking. Traditionally, it is often assumed that all the properties of each packet are known immediately upon arrival. However, as traffic becomes increasingly heterogeneous and complex, such assumptions are in many cases invalid. In particular, in various scenarios information about packet characteristics becomes available only after the packet has undergone some initial processing. In this work, we study the problem of managing queues with limited knowledge. We start by showing lower bounds on the competitive ratio of any algorithm in such settings. Next, we use the insight obtained from these bounds to identify several algorithmic concepts appropriate for the problem, and use these guidelines to design a concrete algorithmic framework. We analyze the performance of our proposed algorithm, and further show how it can be implemented in various settings, which differ by the type and nature of the unknown information. We further validate our results and algorithmic approach by a simulation study that provides further insights as to our algorithmic design principles in face of limited knowledge.}
}

M.Sc.

Arbel Levy

Joint supervision with Niv Gilboa
M.Sc.

Ben Amos

Joint supervision with Niv Gilboa

2014

M.Sc.

Yakov Sandomirsky

Joint supervision with Omer Gurewitz

Publications (1)

2014

INFOCOM

Cellular Multi-Coverage with Non-uniform Rates

O. Gurewitz, Y. Sandomirsky, and G. Scalosub

Proceedings of IEEE INFOCOM 2014, the 33rd Annual Joint Conference of the IEEE Computer and Communications Societies, 2014

Abs Bib

Recent advances in the standardization of 4G cellular networks introduce the notion of multi-coverage, where multiple base stations may collaboratively satisfy the demands of mobile users. We provide a theoretical model for studying such multi-coverage environments, in highly heterogeneous settings, where users demands and profits may vary, as can base stations' capacities and the rates with which they can service the users. Whereas previous works provided solutions that were only applicable to scenarios where rates are uniform throughout the network, or allowed a mobile user to be serviced by at most one base station, we present several algorithms for the multi-coverage problem in the presence of non-uniform rates, and analyze their performance. We complete our study by a simulation study that further validates our results and provides further insight into algorithm design, depending on the users' characteristics.

@inproceedings{gurewitz2014cellular,
  bibtex_show = {true},
  abbr = {INFOCOM},
  title = {Cellular Multi-Coverage with Non-uniform Rates},
  author = {Gurewitz, O. and Sandomirsky, Y. and Scalosub, G.},
  booktitle = {Proceedings of IEEE INFOCOM 2014, the 33rd Annual Joint Conference of the IEEE Computer and Communications Societies},
  pages = {1330--1338},
  year = {2014},
  pdf = {publications/gurewitz14cellular.pdf},
  abstract = {Recent advances in the standardization of 4G cellular networks introduce the notion of multi-coverage, where multiple base stations may collaboratively satisfy the demands of mobile users. We provide a theoretical model for studying such multi-coverage environments, in highly heterogeneous settings, where users demands and profits may vary, as can base stations' capacities and the rates with which they can service the users. Whereas previous works provided solutions that were only applicable to scenarios where rates are uniform throughout the network, or allowed a mobile user to be serviced by at most one base station, we present several algorithms for the multi-coverage problem in the presence of non-uniform rates, and analyze their performance. We complete our study by a simulation study that further validates our results and provides further insight into algorithm design, depending on the users' characteristics.}
}

M.Sc.

Michael Markovitch

Now at Brown University

Publications (2)

2022

CompCommun

Bounded Delay Scheduling with Packet Dependencies

M. Markovitch, and G. Scalosub

Computer Communications, 2022

Abs Bib

A common situation occurring when dealing with multimedia traffic is having large data frames fragmented into smaller IP packets, and having these packets sent independently through the network. For real-time multimedia traffic, dropping even few packets of a frame may render the entire frame useless. Such traffic is usually modeled as having inter-packet dependencies. We study the problem of scheduling traffic with such dependencies, where each packet has a deadline by which it should arrive at its destination. Such deadlines are common for real-time multimedia applications, and are derived from stringent delay constraints posed by the application. The figure of merit in such environments is maximizing the system's goodput, namely, the number of frames successfully delivered. We study online algorithms for the problem of maximizing goodput of delay-bounded traffic with inter-packet dependencies, and use competitive analysis to evaluate their performance. We present competitive algorithms for the problem, as well as matching lower bounds that are tight up to a constant factor. We further present the results of a simulation study which further validates our algorithmic approach and shows that insights arising from our analysis are indeed manifested in practice.

@article{markovitch2022bounded,
  bibtex_show = {true},
  abbr = {CompCommun},
  themes = {buffer_scheduling},
  title = {Bounded Delay Scheduling with Packet Dependencies},
  author = {Markovitch, M. and Scalosub, G.},
  journal = {Computer Communications},
  volume = {182},
  pages = {98--109},
  abstract = {A common situation occurring when dealing with multimedia traffic is having large data frames fragmented into smaller IP packets, and having these packets sent independently through the network. For real-time multimedia traffic, dropping even few packets of a frame may render the entire frame useless. Such traffic is usually modeled as having inter-packet dependencies. We study the problem of scheduling traffic with such dependencies, where each packet has a deadline by which it should arrive at its destination. Such deadlines are common for real-time multimedia applications, and are derived from stringent delay constraints posed by the application. The figure of merit in such environments is maximizing the system's goodput, namely, the number of frames successfully delivered. We study online algorithms for the problem of maximizing goodput of delay-bounded traffic with inter-packet dependencies, and use competitive analysis to evaluate their performance. We present competitive algorithms for the problem, as well as matching lower bounds that are tight up to a constant factor. We further present the results of a simulation study which further validates our algorithmic approach and shows that insights arising from our analysis are indeed manifested in practice.},
  year = {2022}
}

2014

INFOCOM-WKSHP

Bounded Delay Scheduling with Packet Dependencies

M. Markovitch, and G. Scalosub

Proceedings of the 1st IEEE INFOCOM 2014 Workshop on Communication and Networking Techniques for Contemporary Video, 2014

Abs Bib

A common situation occurring when dealing with multimedia traffic is having large data frames fragmented into smaller IP packets, and having these packets sent independently through the network. For real-time multimedia traffic, dropping even few packets of a frame may render the entire frame useless. Such traffic is usually modeled as having inter-packet dependencies. We study the problem of scheduling traffic with such dependencies, where each packet has a deadline by which it should arrive at its destination. Such deadlines are common for real-time multimedia applications, and are derived from stringent delay constraints posed by the application. The figure of merit in such environments is maximizing the system's goodput, namely, the number of frames successfully delivered. We study deterministic online algorithms for the problem of maximizing goodput of delay-bounded traffic with inter-packet dependencies, and use competitive analysis to evaluate performance. We present a competitive algorithm for the problem, as well as matching lower bounds that are tight up to a constant factor. We further present the results of a simulation study which further validates our algorithmic approach and shows that insights arising from our analysis are indeed manifested in practice.

@inproceedings{markovitch2014bounded,
  bibtex_show = {true},
  abbr = {INFOCOM-WKSHP},
  title = {Bounded Delay Scheduling with Packet Dependencies},
  author = {Markovitch, M. and Scalosub, G.},
  booktitle = {Proceedings of the 1st IEEE INFOCOM 2014 Workshop on Communication and Networking Techniques for Contemporary Video},
  pages = {257--262},
  year = {2014},
  pdf = {publications/markovitch14bounded.pdf},
  abstract = {A common situation occurring when dealing with multimedia traffic is having large data frames fragmented into smaller IP packets, and having these packets sent independently through the network. For real-time multimedia traffic, dropping even few packets of a frame may render the entire frame useless. Such traffic is usually modeled as having inter-packet dependencies. We study the problem of scheduling traffic with such dependencies, where each packet has a deadline by which it should arrive at its destination. Such deadlines are common for real-time multimedia applications, and are derived from stringent delay constraints posed by the application. The figure of merit in such environments is maximizing the system's goodput, namely, the number of frames successfully delivered. We study deterministic online algorithms for the problem of maximizing goodput of delay-bounded traffic with inter-packet dependencies, and use competitive analysis to evaluate performance. We present a competitive algorithm for the problem, as well as matching lower bounds that are tight up to a constant factor. We further present the results of a simulation study which further validates our algorithmic approach and shows that insights arising from our analysis are indeed manifested in practice.}
}