I am a PhD student in the
College of Computer and Information Science at
Northeastern University.
Before coming here, I worked at
Baidu, Inc. from 2011 to 2012.
I received my BS from Beihang University
in 2011.
I'm interested in High Performance Computing and Operating Systems. Specifically,
fault tolerance plays an important role in HPC,
because hardware and software
failures become more likely as a system scales. Meanwhile, system
architectures and APIs are evolving rapidly.
Hence, a general fault-tolerance solution
has become an interesting research topic. My research project
DMTCP (Distributed MultiThreaded CheckPointing)
is a transparent checkpoint/restart
package that supports fault tolerance for large-scale distributed computation,
with no modification to the operating system
or to the user application.
Currently, I'm working with Prof. Gene Cooperman in the High Performance Computing Lab.
jiajun "at" ccs.neu.edu
370 West Village H
College of Computer and Information Science
Northeastern University
360 Huntington Avenue
Boston, MA 02115, USA