My research goal: To build comprehensive, knowledge-based systems that analyze, index, and make available the text and graphics of the biomedical literature. Systems that can achieve this goal will have elements that can help to solve a much broader range of problems. But the biomedical literature alone is quite enough of a challenge. The literature indexed on PubMed would represent about ten terabytes of data if all the full papers were available electronically, and the most recent ones are. My strong background in chemical physics (MIT PhD) and biology (UIUC biology professor) allows me to bring expert insight to content issues, rather than approaching the problems entirely from the CS point of view.
Building large systems: A comprehensive system such as this ultimately has many technical components, including text processing and natural language understanding, image/diagram analysis, knowledge-based indexing and retrieval, and human interfaces for apps and the Web. The intellectual issues revolve around the design, generation, and use of knowledge representations for the content.
Undergraduate research: In the past, undergraduate students have worked with me through course projects, directed study courses, or in their free time, on image processing, text analysis, interactive interfaces, machine learning, web apps, and more. My lab, the Biological Knowledge Laboratory in the College of Computer and Information Science, has been focused on these problems for many years. A number of undergraduates have been co-authors on papers the lab has published in conferences and journals.
The advantages of undergraduate research: There are many. First, it gives you a chance to work one-on-one with a faculty member, whether me or someone else. It allows me to get to know you and how you work. This can be invaluable when it comes to getting a recommendation for a job or for graduate school. A letter that simply says, "I had this student in class X and she did a great job," is little different from simply noting the course grade on a transcript. Second, it gives you insight into the research process, so you can consider whether or not you want to continue such work in a graduate program, MS or PhD, in computer science or in some other field such as bioinformatics. Lastly, it gives you a chance to do research. Research projects, such as a directed study, are not like classes at all. They are open-ended, often with goals that shift as we learn more. You need to be proactive and able to work both collaboratively and independently. You don't have to come prepared with all these skills; you pick them up through the research experience.
Graduate research: I have supervised PhD students in biology and, more recently and currently, in computer science. My current PhD student is working on machine learning to classify diagrams from the biomedical literature, using multi-class boosting and clustering. Many graduate students are overly cautious, thinking that if they don't pursue a PhD in a "tried and true" area (databases, networks, security, programming languages, etc.) they "won't be able to get a job". That is just wrong, and it can cut you off from exciting future possibilities. In fact, the experience of getting a PhD prepares you to dive into any number of positions and projects after graduation. I had a graduate student who got an MS working for me on text analysis, then a PhD from Harvard on CPU performance analysis, and then went to IBM Research, where she immediately started working on Unix security issues. None of this presented any problems for her. I do research in the areas of natural language and image understanding. If you simply look around you at Google and the many graphics/image-based companies out there, you should immediately recognize how important these topics are and how well work in my group could prepare you to operate and produce things for the computer-based world of the future.
Current research (2007-2008): I want to diversify and extend my research by learning more about database-backed web systems, Flex, and other web tools, and embedded database systems such as Berkeley DB. My interest is not so much in Web 2.0 creations for general consumption as in substantial client apps that are web-aware. Eclipse, Protégé, and the Weka machine learning system are major examples in this genre. My lab has already developed diagram parsing and image processing tools, and GUIs for them. These need some redesign and extension to get them working well and to set them up for general distribution to the research community. My own web site needs serious rethinking and reworking to make my many resources more readily available (I've published about 60 scientific papers), to add videos, and so forth.
If you think you might be interested, get in touch: Email me at futrelle@ccs.neu.edu