I am strongly driven by the desire to produce research with long-lasting impact, and I believe that good researchers should know not only how to approach a problem at hand but also how to spot the important problems in their field.
My general research interests are in networking, a constantly evolving field with ample research opportunities. My previous research projects have focused on two complementary problems. The first concerns the end-to-end characterization of network properties through the construction of compact, efficient network models that capture network dynamics. Such models would assist in the development of network-aware services. The second concerns the adaptation of the control strategies of transport protocols and network applications at massively accessed Internet servers in order to utilize shared network resources more efficiently and optimize the content delivery process.
My approach to solving these (as well as other) problems typically begins with modeling and mathematical analysis to generate exact results. This is followed by more empirical approaches involving implementation and deployment, which enable experimental evaluation under realistic conditions and assumptions.
1. Overview
Most applications today operate without service guarantees, so hosts may experience varying performance over the lifetime of a connection. Network-aware applications attempt to react to changes in network resource availability and/or network performance. This reaction to network conditions is essential to better utilize network resources and optimize content delivery, especially given the considerable strain on Internet resources imposed by the phenomenal growth of the World Wide Web. To function properly, network-aware applications need accurate, efficient, and scalable models that capture network conditions and properties over the parts of the network they care about.
Massively Accessed Scalable Servers (a.k.a. Mass servers) are popular Internet servers that produce a substantial fraction of the traffic flowing through the network. Mass servers are uniquely positioned (1) to observe and diagnose network conditions by tracking the flows that they generate, and (2) to manage and control network resources by better regulating and scheduling the traffic they inject into the network.
It is desirable to achieve these goals over a wide spectrum of time scales. Over shorter time scales, a Mass server can minimize packet loss by smoothing the (bursty) process of injecting packets into the network. Over longer time scales, a Mass server can perform aggregate congestion control by wisely bundling like connections to avoid the burstiness that results from competition among flows.
Network diagnostic models can play a major role in Internet characterization. They can also optimize the deployment of a variety of applications and services. Examples include server selection, overlay network organization, admission control, flow scheduling, and cache/replica placement. The many implications and benefits of network models provide a large pool of research opportunities and promise significant research impact.
I have made a number of research contributions related to the above general goals. My recent contributions have been in the areas of Internet measurements and diagnosis, Network modeling and management, and control protocols and services. For example:
In the following sections I describe three representative pieces of work that I have done. In each section I highlight the problem, describe the solution approach, and summarize the impact of my research.
2. Metric-Induced Network Topologies
Approach: We proposed a general framework for constructing metric-induced models from end-to-end measurements. We instantiated our approach using one such property, packet loss rates, and presented an analytical framework for the characterization of Internet loss topologies. From the perspective of a server, the loss topology is a logical tree rooted at the server with clients at its leaves, in which edges represent lossy paths between a pair of internal network nodes. We showed how end-to-end unicast packet probing techniques can be used to (1) infer a loss topology and (2) identify the loss rates of links in an existing loss topology. We reported simulation, implementation, and Internet deployment results that show the effectiveness of our approach and the robustness of its accuracy and convergence over a wide range of network conditions. One contribution of this work is a mechanism for integrating metric-induced models collected at different hosts into one larger model. Another is a mechanism for integrating different metric-induced models collected from the same host at different points in time. These integration mechanisms uncover more network detail than any single model does.
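To make item (2) above concrete, here is a minimal Python sketch for the simplest loss topology: a two-leaf tree in which one shared link leads to two branches. It assumes that both packets of a back-to-back probe pair meet the same fate on the shared link and that links drop packets independently. Under those assumptions, the fraction of probes received by client 1 is (shared x branch1), by client 2 is (shared x branch2), and by both is (shared x branch1 x branch2), so each pass rate follows by division. The function name and input format are hypothetical, and this is a textbook-style estimator shown for illustration, not the exact procedure of [1].

```python
from typing import Iterable, Tuple


def estimate_pass_rates(pairs: Iterable[Tuple[bool, bool]]) -> Tuple[float, float, float]:
    """Estimate (shared, branch1, branch2) pass rates for a two-leaf loss tree.

    Each element of `pairs` records whether the probe to client 1 and the
    probe to client 2 of one back-to-back pair were received.
    Assumption: both probes of a pair see the same outcome on the shared
    link, and links drop packets independently of each other.
    """
    n = n1 = n2 = n12 = 0
    for got1, got2 in pairs:
        n += 1
        n1 += got1
        n2 += got2
        n12 += got1 and got2
    if n == 0 or n12 == 0:
        raise ValueError("need at least one pair in which both probes arrived")
    p1, p2, p12 = n1 / n, n2 / n, n12 / n
    shared = p1 * p2 / p12   # (a*b1)(a*b2) / (a*b1*b2) = a
    branch1 = p12 / p2       # (a*b1*b2) / (a*b2) = b1
    branch2 = p12 / p1       # (a*b1*b2) / (a*b1) = b2
    return shared, branch1, branch2


if __name__ == "__main__":
    # Hand-built observations: 10 probe pairs sent from the server.
    observations = ([(True, True)] * 2 + [(True, False)] * 2
                    + [(False, True)] * 2 + [(False, False)] * 4)
    shared, branch1, branch2 = estimate_pass_rates(observations)
    print(f"shared-link loss rate: {1 - shared:.2f}")
    print(f"branch loss rates: {1 - branch1:.2f}, {1 - branch2:.2f}")
```

The loss rate of each link is one minus its estimated pass rate. With real measurements the estimates are noisy, and larger trees require applying the same idea recursively.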
Output and Impact: This work has resulted in two publications. The framework itself is described in a paper to appear in INFOCOM 2002 [1], and the framework implementation in the Linux kernel (a.k.a. the Periscope Toolkit) is described in a paper to appear in PAM 2002 [2]. Periscope is being used by a number of researchers investigating various network-aware Internet and peer-to-peer applications.
Approach: We developed packet-probing techniques to determine whether a pair of connections experience shared congestion. Our extensive simulation results demonstrated that the conditional (Bayesian) probing approach we employ is more accurate, converges faster, and tolerates a wider range of network conditions than the recently proposed memoryless (Markovian) probing approaches [5] to this problem.
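The intuition behind conditional probing can be sketched in a few lines of Python. The sketch assumes probes are sent as back-to-back pairs, one packet on each connection, and compares the probability that the second probe is lost given that the first was lost with the unconditional loss probability of the second probe. The function name, the input format, and the decision margin are hypothetical choices made for illustration; this is not the test used in the published work.

```python
from typing import Sequence, Tuple


def shares_congestion(pairs: Sequence[Tuple[bool, bool]], margin: float = 0.2) -> bool:
    """Guess whether two connections share a congested link.

    Each pair is (lost_on_a, lost_on_b) for one back-to-back probe pair.
    Hypothetical rule: report sharing when P(lost_b | lost_a) exceeds the
    unconditional P(lost_b) by more than `margin`.
    """
    total = len(pairs)
    lost_a = sum(1 for a, _ in pairs if a)
    if total == 0 or lost_a == 0:
        raise ValueError("need at least one probe pair with a loss on connection a")
    lost_b = sum(1 for _, b in pairs if b)
    lost_both = sum(1 for a, b in pairs if a and b)
    conditional = lost_both / lost_a   # P(lost_b | lost_a)
    unconditional = lost_b / total     # P(lost_b)
    return conditional - unconditional > margin
```

When losses occur on a link that both connections traverse, a loss on one probe makes a loss on its companion far more likely, so the conditional probability rises well above the unconditional one. When losses are independent, the two probabilities stay close.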
Output and Impact: This work appeared in the proceedings of ICNP 2000 [4], and the Bayesian probing techniques are now part of the Periscope toolkit.
Approach: We developed and simulated end-to-end probing methods that measure bottleneck bandwidth along arbitrary, targeted sub-paths of a path between two end-points in the network (including sub-paths shared by a set of flows). As another important contribution, we described a number of practical applications that stand to benefit from solutions to this problem, especially in emerging, flexible network architectures such as overlay networks, ad-hoc networks, peer-to-peer architectures, and massively accessed content servers.
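The sub-path techniques themselves are not reproduced here. As background, the Python sketch below shows the classic packet-pair estimate, a common building block for end-to-end bandwidth probing: two equal-sized packets sent back to back leave the bottleneck link separated by the time that link needs to transmit one packet, so the bandwidth is the packet size divided by the observed gap. The function name, parameters, and the use of the median are illustrative assumptions, and the sketch measures the bottleneck of a whole path rather than of a targeted sub-path.

```python
from statistics import median
from typing import Sequence


def packet_pair_bandwidth(packet_size_bytes: int, gaps_seconds: Sequence[float]) -> float:
    """Estimate bottleneck bandwidth in bits per second from packet-pair gaps.

    Each gap is the arrival-time spacing of two equal-sized packets that
    were sent back to back. Without cross traffic, the bottleneck link
    spaces them by its transmission time for one packet, so
    bandwidth = size / gap. The median damps outliers caused by cross
    traffic (an illustrative choice, not a prescribed filter).
    """
    valid = [g for g in gaps_seconds if g > 0]
    if not valid:
        raise ValueError("need at least one positive inter-arrival gap")
    return packet_size_bytes * 8 / median(valid)
```

For example, 1500-byte packets arriving 1.2 ms apart suggest a bottleneck of roughly 10 Mb/s.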
Output and Impact: This work has been submitted for publication to SIGCOMM 2002 [3] (and is available as a Technical Report).
As a follow-up to my thesis research, there are many directions that I would like to explore further. Let me briefly discuss some of these directions:
References
[1] Bestavros, Azer; Byers, John; Harfoush, Khaled. Inference and Labeling of Metric-Induced Network Topologies. To appear in Proceedings of IEEE INFOCOM 2002, New York City, New York, June 2002.
[2] Harfoush, Khaled; Bestavros, Azer; Byers, John. PeriScope: An Active Probing API. To appear in Proceedings of PAM 2002, Passive and Active Measurement Workshop, Fort Collins, Colorado, March 2002.
[3] Harfoush, Khaled; Bestavros, Azer; Byers, John. Measuring Bottleneck Bandwidth of Targeted Path Segments. Technical Report BUCS-TR-2001-016. Submitted to ACM SIGCOMM 2002.
[4] Harfoush, Khaled; Bestavros, Azer; Byers, John. Unicast-based Characterization of Network Loss Topologies. In Proceedings of ICNP 2000: The 6th IEEE International Conference on Network Protocols, Osaka, Japan, October 2000.
[5] Rubenstein, D.; Kurose, J.; Towsley, D. Detecting Shared Congestion of Flows via End-to-end Measurement. In Proceedings of ACM SIGMETRICS 2000, Santa Clara, CA, June 2000.