Recovery for Service Oriented Applications
Project Award Number IIS-0533625
This material is based upon work supported by the National Science
Foundation under Grant No. IIS-0533625. Any opinions, findings, and
conclusions or recommendations expressed in this material are those of
the author(s) and do not necessarily reflect the views of the National
Science Foundation.
Principal Investigator
Betty J. Salzberg
College of Computer and Information Science
Northeastern University
Boston, MA 02115
Phone: (617) 373-2229
Fax: (617) 373-5121
Email: salzberg@ccs.neu.edu
URL: http://www.ccs.neu.edu/home/salzberg
Keywords
recovery, logging, optimistic logging, pessimistic logging, server processes, reliability, performance, application fault tolerance
Project Summary
This project provides a recovery system for processes running methods
at middleware servers. It uses four types of logging: (1) pessimistic
logging for communication with the outside world or between distinct
service domains; (2) optimistic logging for communication among
servers in the same service domain; (3) separate shared-variable
logging; and (4) logging for communication between a server method
and a DBMS. These four types of logging are integrated to provide
recovery for server methods that guarantees exactly-once execution,
good performance, and correct semantics.
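To make the difference between the first two logging types concrete, here is a minimal sketch. It assumes the textbook meaning of the terms: pessimistic logging forces a record to stable storage before the logged action proceeds, while optimistic logging buffers the record and forces it later. The class SketchLog, its method names, and the sample records are hypothetical illustrations and are not the project's implementation.

```python
import os
import tempfile


class SketchLog:
    """Hypothetical append-only log; illustrative only, not the project's code."""

    def __init__(self, path):
        self.path = path
        self.buffer = []  # records not yet forced to stable storage
        self.file = open(path, "a", encoding="utf-8")

    def _force(self):
        # Write every buffered record, then force the file to disk.
        for record in self.buffer:
            self.file.write(record + "\n")
        self.buffer.clear()
        self.file.flush()
        os.fsync(self.file.fileno())

    def log_pessimistic(self, record):
        # Pessimistic: the record (and anything buffered before it) is on
        # stable storage before this call returns.
        self.buffer.append(record)
        self._force()

    def log_optimistic(self, record):
        # Optimistic: only buffer the record; a later force makes it stable.
        self.buffer.append(record)

    def stable_records(self):
        # Records that would survive a crash right now.
        with open(self.path, encoding="utf-8") as f:
            return [line.rstrip("\n") for line in f]

    def close(self):
        self.file.close()


def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    log = SketchLog(path)
    # Message from a server in the same service domain: buffered only.
    log.log_optimistic("recv m1 from peer server")
    after_optimistic = log.stable_records()
    # Reply to the outside world: forced before the reply would be sent.
    log.log_pessimistic("send reply to external client")
    after_pessimistic = log.stable_records()
    log.close()
    os.remove(path)
    return after_optimistic, after_pessimistic
```

In this sketch the optimistic record costs no disk write of its own; it reaches stable storage when the next pessimistic record is forced. Buffered records can be lost in a crash before that force, which is the trade-off optimistic logging accepts for better performance.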
Publications and Products
Kanoulas, E. et al., "Derivation of the Tumor Position From External Respiratory Surrogates with Periodical Updating of External/Internal Correlation [abstract]", Medical Physics, vol. 33, (2006) p. 2232.
Kanoulas, E. et al., "Finding Fastest Paths on a Road Network with Speed Patterns", ICDE 2006 vol. 22, (2006) pp. 10-20.
Rui Wang, Betty Salzberg and David Lomet, "Log-Based Recovery for Middleware Servers", accepted in SIGMOD 2007.
Rui Wang, "Log-Based Recovery for Middleware Servers", Ph.D. thesis, Northeastern University, 2006.
Panfeng Zhou, "Querying Multidimensional Data and Spatio-Temporal Data with Non-Overlapping Access Methods", Ph.D. thesis, Northeastern University, 2006.
Project Impact
Human Resources: Several graduate students working toward the
Ph.D. degree in the College of Computer and Information Science at
Northeastern University were supported by this project. Two of these
students finished the Ph.D. in 2006 and are working in the database industry.
Education and Curriculum Development at all levels: Professor
Salzberg teaches courses in Database systems and in Algorithms for
undergraduate and graduate students. Materials for these classes are
influenced by this research. In particular, Professor Salzberg
presents recovery algorithms in database courses in more detail than can
be found in most standard textbooks.
Industry Collaboration: This work is in collaboration with Dr. David Lomet
at Microsoft Research in Redmond, Washington.
Goals, Objectives and Targeted Activities
The goal of this project is to provide algorithms and simulations which
will enable recovery of server methods in spite of the nondeterminism resulting
from message receiving, shared variables and interaction with DBMSs.
The targeted activities are:
1. Publications on server recovery.
2. Experiments and simulations measuring performance on a commercial web services platform.
3. Integration of logging and recovery for shared variables and for interaction with DBMSs, together with the effects of message passing.
Area Background
Although Database Management Systems (DBMSs)
provide recovery for data that is written to the database,
applications that use this data lose all other state when there is a
system failure. This project is a step towards making application
state recoverable. Specifically, when an application calls a method
run at a server, our algorithms can be used to make the state of the
method at the server recoverable. Thus, server state, and not only
data in a DBMS, is protected against system failure.
Reliability is an important property for applications. A great deal of
work must now be done by programmers and system managers to provide
recovery on a case-by-case basis. Our work will allow programmers to
call methods without having to provide recovery logic. Systems can be
restarted with confidence that consistency of state will be recovered.
Area References
Roger Barga and Shimin Chen and David Lomet,
"Improving Logging and Recovery Performance in Phoenix/App",
ICDE,
2004, pp. 486-497.
Roger Barga and David Lomet and Stelios Paparizos and Haifeng Yu and Sirish Chandrasekaran,
"Persistent Component-Based Applications via Automatic Recovery",
IDEAS,
2003, pp. 258-267.
Roger Barga and David Lomet and Gerhard Weikum,
"Recovery Guarantees for General Multi-Tier Applications",
ICDE,
2002, pp. 543-554.
Philip A. Bernstein and Meichun Hsu and Bruce Mann,
"Implementing Recoverable Requests Using Queues",
SIGMOD, 1990, pp. 112-122.
Om P. Damani and Ashis Tarafdar and Vijay K. Garg,
"Optimistic Recovery in Multi-threaded Distributed Systems",
SRDS, 1999, pp. 234-243.
E. N. Elnozahy and Lorenzo Alvisi and Yimin Wang and David B. Johnson,
"A Survey of Rollback-Recovery Protocols in Message Passing Systems",
ACM Comput. Surv., 2002, vol. 34, number 3, pp. 375-408.
David Lomet and Gerhard Weikum, "Efficient Transparent Application
Recovery In Client-Server Information Systems", SIGMOD,
1998, pp. 460-471.
Jeff Napper and Lorenzo Alvisi and Harrick Vin,
"A Fault-Tolerant Java Virtual Machine", IEEE Dependable Systems and Networks,
2003, pp. 425-434.
Michiel Ronsse et al., "Record/Replay for Nondeterministic
Program Executions", Commun. ACM, vol. 46, number 9, 2003, pp. 62-67.
Robert E. Strom and Shaula Yemini, "Optimistic Recovery in
Distributed Systems", ACM Trans. on Computer Systems, vol. 3, number 3, 1985, pp. 204-226.
Project Websites
The project website is at
http://www.ccs.neu.edu/home/salzberg/soa (this site).