Sunday, August 26, 2007
A Plethora of Databases
August 2008: Things have been happening. I need good DBs for our research on natural language processing and diagram understanding. We've settled on Apache Derby, specifically its embedded mode, which runs in the same JVM as our app. The DB is simply a collection of files whose location you specify, so it's easy to move around.
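For the curious, here is a minimal sketch of what opening an embedded Derby database looks like from Java through plain JDBC. The class name, the helper method, and the directory `data/researchdb` are made up for illustration; the only Derby-specific piece is the `jdbc:derby:` URL with its `create=true` attribute. It assumes `derby.jar` is on the classpath; without it, the connection attempt simply reports that no driver was found.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

class EmbeddedDerbySketch {
    // Builds a JDBC URL for Derby's embedded driver. The database lives in
    // the directory dbDir; create=true asks Derby to create it if missing.
    static String embeddedUrl(String dbDir, boolean createIfMissing) {
        return "jdbc:derby:" + dbDir + (createIfMissing ? ";create=true" : "");
    }

    public static void main(String[] args) {
        // Hypothetical location; any writable directory path would do.
        String url = embeddedUrl("data/researchdb", true);
        try (Connection conn = DriverManager.getConnection(url)) {
            // The engine is now running inside this JVM; no server process.
            System.out.println("Connected to "
                    + conn.getMetaData().getDatabaseProductName());
        } catch (SQLException e) {
            // Typically reached when derby.jar is not on the classpath.
            System.out.println("Could not open embedded DB: " + e.getMessage());
        }
    }
}
```

Since the database is just that directory, moving it to another machine should amount to copying the directory while the app is shut down and pointing the URL at the new location.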
Earlier: I'm teaching the undergraduate Database Design course, CSU430, this Fall 2007. A lot has changed since I last taught a database course. I'll continue to use the O'Neils' DB book, now in its second edition. What has changed is the industry. Major players such as Oracle, IBM, and Microsoft have been supplemented by the appearance and growth of MySQL and embedded DBs such as Berkeley DB (now owned by Oracle) and Caché. I have used Berkeley DB Java Edition (BDB JE) and it works fine. BDB JE is just one small jar file. Java annotations are used in classes to indicate various DB options. I've yet to try Caché. Caché claims to be an embedded OO store that simultaneously supports SQL. We'll see.
I can't demo all of the DBs for my class, so I'll focus on one, MySQL. A fully functional, freely downloadable version is available for many platforms. I actually had a painful time getting it installed under Mac OS X, but, using a whole bucket of pixie dust, I finally got it installed and it's just fine. It starts up at boot time and is just always there. The students can install it themselves or use some other SQL DB, as long as they can generate things to hand in for comments and grading. In addition, the College has MySQL running on its CGI server that any student can access. There is also MS SQL available, and students are welcome to use it, but I'm not a Windows person. Our textbook is fully SQL-oriented, so SQL makes sense.
MySQL has two or three nice GUI-based controllers - I downloaded two and they work just fine. (MySQL is installed on the Mac OS side of my Intel MacBook Pro and I'm running pure Windows while I type this, so I can't give all the interesting details at this moment.)
I will definitely demo BDB JE and encourage students to choose it or Caché for a project. Since I'll be doing GUIs in my HCI class, also this Fall, there may be some interesting synergies.
More recently (mid-Fall 2007) I've discovered Hadoop, MapReduce, and HBase for handling massive amounts of data on clusters. To work on this, I've acquired a small cluster - one Apple Xserve and four Apple Minis. Two undergrads in my DB Design course are working on this with me. They have code working and will move it to the cluster shortly; we expect to have it powered on by November 16, 2007. The first applications will be text processing and image analysis. More on this as work progresses.
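To give a feel for the MapReduce idea as applied to text processing, here is a toy word count that runs in a single JVM. It is only a sketch of the map / shuffle / reduce pattern in plain Java: it does not use the Hadoop API, it is not the students' code, and all class and method names are invented for illustration.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class WordCountSketch {
    // Map phase: turn one line of text into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.toLowerCase().split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(token, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all emitted values by their key.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        }
        return grouped;
    }

    // Reduce phase: collapse the values for one key into a single count.
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }

    // Runs the three phases in sequence over a small in-memory input.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : shuffle(pairs).entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(wordCount(Arrays.asList("the cat sat", "The dog sat down.")));
    }
}
```

On a real cluster the map and reduce calls would be spread across machines and the shuffle would move data over the network; here everything happens in one loop, which is enough to show the shape of the computation.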