Sunday, 26 July 2009

SQL and the Database Architecture - Week 2

SQL and the database architecture introduction were topics that I was very keen on exploring when I started the course. I have been working with databases for several years, but I was always working with their output. The technological underpinnings of the ILS or DAMS were always obscure to me. I have seen a SQL update query from time to time, but even if I had some idea what that script did, I was not really sure how the process was carried out. After these two weeks the fog has lifted a tiny bit and database management is now less magic and more craft.

To my surprise, I found it much easier to work with tables - altering or updating them - in the CLI than in either of the GUI applications, Webmin and phpMyAdmin. It is not that selecting from drop-down menus would not be easier once one gets familiar with them, but I found both interfaces very cluttered and confusing. It was easier to get to the MySQL monitor and type out the whole command than to search on the left side of the screen, then on the top, and then in the middle below or between tables and boxes. However, I was missing the Linux shell's auto-completion feature.

Creating and populating tables was not difficult, but designing the database from scratch was a challenge, especially once I learned about the normal forms. Still, it was an interesting exercise to think about the character of the entered data and gradually simplify the tables.

The SQL queries were (are) hard to learn, and I still need to check the correct syntax for the first line, the one with the SQL command itself. The parameters are somewhat easier, as their syntax does not vary. I had no luck with data aggregation and SQL functions. In the video lectures and examples it all looks logical and easy, but when I tried to come up with a query myself for one of the practice databases I failed miserably (I was trying to calculate how many images each artist has in the collections). I guess that topic will need to be revisited.
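For the record, here is a minimal sketch of the kind of aggregation I was after: counting images per artist with COUNT and GROUP BY. The table and column names (artists, images, artist_id) are made up for illustration and are not the ones from the practice database, and the sketch runs against an in-memory SQLite database from Python rather than MySQL, although the query itself is plain SQL.

```python
import sqlite3


def build_sample_db():
    """Create a small in-memory database with hypothetical tables and rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE artists (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE images (
            id INTEGER PRIMARY KEY,
            artist_id INTEGER REFERENCES artists(id),
            title TEXT
        );
        INSERT INTO artists (id, name)
            VALUES (1, 'Artist A'), (2, 'Artist B'), (3, 'Artist C');
        INSERT INTO images (artist_id, title)
            VALUES (1, 'Image 1'), (1, 'Image 2'), (2, 'Image 3');
    """)
    return conn


def images_per_artist(conn):
    """Return (artist name, image count) rows, one per artist."""
    query = """
        SELECT a.name, COUNT(i.id) AS image_count
        FROM artists AS a
        LEFT JOIN images AS i ON i.artist_id = a.id
        GROUP BY a.id, a.name
        ORDER BY a.name
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for name, image_count in images_per_artist(build_sample_db()):
        print(name, image_count)
```

The LEFT JOIN keeps artists who have no images at all, and COUNT(i.id) counts only the matching image rows, so such artists show up with a count of zero instead of disappearing from the result.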

Still, I was impressed by how powerful, yet in principle simple, SQL queries are.

Sunday, 19 July 2009

Database Design and SQL

This was definitely a tough week, very theory-heavy, but it also included a relatively gentle introduction to SQL. I admit that at first I did not believe Joshua Mostafa's assertion that SQL is easy to comprehend, but after several lessons I realized I was able to comprehend the most fundamental SQL rules of query construction and of database manipulation commands. SQL actually makes sense.

The different data types are an interesting and useful concept. I had some difficulties with understanding some of the numerical data types. The difficulty was rather in imagining the practical use for some of the features. It made me think about what kind of data I use, and how I enter them. I also found out the rationale behind some conventions of data entry which I followed and took for granted. So many times I was told that a certain field can't be longer than 255 characters, and it often made me wonder why. Now I can guess that it may be because the given column uses the char, varchar or tinytext data type. The same number turns up with tinyint as well, where 255 is the largest unsigned value rather than a length.
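The arithmetic behind that guess can be sketched in a few lines of Python. The assumption here is that the length of a short text field is recorded in a single byte, which is my reading of where the 255 comes from; the helper names are invented for illustration.

```python
def max_unsigned(num_bytes):
    """Largest unsigned integer that fits in the given number of bytes."""
    return 2 ** (8 * num_bytes) - 1


def fits_one_byte_length(text, encoding="latin-1"):
    """True if the encoded text is short enough for a one-byte length counter."""
    return len(text.encode(encoding)) <= max_unsigned(1)


if __name__ == "__main__":
    print(max_unsigned(1))                    # one byte
    print(max_unsigned(2))                    # two bytes
    print(fits_one_byte_length("a" * 255))
    print(fits_one_byte_length("a" * 256))
```

Note that the limit is counted in bytes, so with a multi-byte encoding such as UTF-8 a field can fill up well before 255 characters have been typed.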

The normal forms still puzzle me. I may get a better understanding of table normalization by looking at actual examples and trying to understand the relationships and changes as data are entered, how they are retrieved, and how they may have been manipulated in between.
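One small example of the kind I have in mind, written as a rough Python sketch rather than SQL: a flat list of rows in which the artist's country is repeated for every image is split into an artists table and an images table that only points to the artist by id. The rows and field names are invented, and this shows just one normalization step (removing a fact that depends on the artist rather than on the image), not a full treatment of the normal forms.

```python
def normalize(flat_rows):
    """Split flat (title, artist name, country) rows into two tables.

    Returns (artists, images), where artists holds (id, name, country)
    and images holds (title, artist_id).
    """
    artist_ids = {}  # (name, country) -> artist id
    artists = []
    images = []
    for title, name, country in flat_rows:
        key = (name, country)
        if key not in artist_ids:
            artist_ids[key] = len(artists) + 1
            artists.append((artist_ids[key], name, country))
        images.append((title, artist_ids[key]))
    return artists, images


if __name__ == "__main__":
    flat = [
        ("Image 1", "Artist A", "CZ"),
        ("Image 2", "Artist A", "CZ"),
        ("Image 3", "Artist B", "US"),
    ]
    artists, images = normalize(flat)
    print(artists)
    print(images)
```

After the split, correcting an artist's country means changing one row in the artists table instead of every image row that mentions that artist.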

Sunday, 12 July 2009

On Technology Planning

This week's readings were an interesting bunch, containing articles on the historical background of several federal support programs to foster technology in libraries. Some documents were very useful and detailed howtos, others rather impish and somewhat personal like Michael Schuyler's "Life is What Happens to You When You Are Making Other Plans." Even if I understand his frustration and irritation with "meddlesome" bureaucracy, I did not find his article particularly helpful as guiding material. There were moments when he came to his senses and tried to treat the subject seriously, but it's difficult for me to believe that he tried "really hard not to rant." It was fun to read it, though.

Two documents, however, stood out for me: OCLC's The 2003 OCLC Environmental Scan: Pattern Recognition, and ALA's Information Technology at ALA: 2000-2005. These were really thorough and detailed analyses. The OCLC report was general in nature: its authors tried to depict trends in the ways knowledge is being accessed, transmitted, and stored, not only on the national but also on the global level. They attempted to capture the position of libraries (the term that they use to represent libraries, as well as museums, archives and other institutions dealing with education, heritage, or culture). I was particularly impressed with "The Technology Landscape" chapter. It did not bring up any surprises, but I liked how the authors managed to succinctly wrap up the current state of the field. I do occasionally follow the weblog of Lorcan Dempsey, who was one of the crafters of the report. He often posts various reports and findings related to literacy, knowledge management, learning and research, and also social networking on his weblog, and in doing so continually updates the OCLC report.

The other document that impressed me was produced by ALA: the goals, needs, assessments, benchmarks, evaluations were all clearly stated, no verbiage - everything to the point. It was a well-structured and transparent document. Some of the other texts on technology planning advised not to set too rigid goals, and even if the ALA document called for flexible planning, their plan when it came down to implementation was very specific. The only criticism I would have is that there was no mention of any person responsible for any of the steps taken. In addition, even if the plan contains a list of possible implementation issues and conflicts, there is no hint how they are going to be resolved.

If I were to participate in any technology planning activities, I think I would start with documenting my current activities and workflows. Once these activities are documented, they can be evaluated and eventually improved.

There are also some more system-oriented concerns that I would have now - how do systems scale up with the needs and workflows in place? Are the applications modular, expandable? Do they support open and shared standards - or is it a closed black-box operation? And if the system is closed are there any open source alternatives? And if there are alternatives, do we have the manpower to run them and maintain them?

Saturday, 4 July 2009

On Learning XML, its Processing, and Unicode

In learning XML, I used both Mark Long's Introduction to XML and the W3schools' XML Tutorial. As much as I like the W3schools tutorials, I rarely learn a technology or language that would be completely new to me from them. I visit the site relatively often, but usually for reference questions or background information. I liked Mark Long's casual and unpretentious style, which was to the point and clear of jargon.

I think that one of the greatest benefits of XML is its ability to accommodate a variety of languages and scripts; that one can have a source code containing several languages and scripts and yet the browser will display only that language or script the user expects, if set up properly. Therefore, I looked at the encoding and special character issues in both tutorials. I am a great fan of Unicode, which I find to be an amazing intellectual achievement. It is a pity that for many web developers there is virtually no alternative to ISO-8859-1 (a basic Latin Western character set). Even on the W3schools' pages aimed at examples of XML in real life, in spite of the fact that UTF-8 is a default encoding for XML, one finds this unnecessary, suboptimal encoding value in the processing instruction.
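The difference between the two encodings is easy to demonstrate. What follows is a small Python sketch, with a sample string of my own choosing: text mixing several scripts cannot be expressed in ISO-8859-1 at all, while a UTF-8 encoded XML document needs no encoding declaration for a parser to read it back correctly.

```python
import xml.etree.ElementTree as ET

# Sample text mixing Latin with diacritics, Greek, and Japanese.
SAMPLE = "Dvo\u0159\u00e1k, \u0395\u03bb\u03bb\u03b7\u03bd\u03b9\u03ba\u03ac, \u65e5\u672c\u8a9e"


def can_encode(text, encoding):
    """True if every character of the text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def roundtrip_utf8_xml(text):
    """Wrap the text in an element, encode as UTF-8, and parse it back.

    The document carries no encoding declaration, so the parser has to
    fall back on the XML default, which is UTF-8.
    """
    document = ("<title>" + text + "</title>").encode("utf-8")
    return ET.fromstring(document).text


if __name__ == "__main__":
    print(can_encode(SAMPLE, "iso-8859-1"))
    print(can_encode(SAMPLE, "utf-8"))
    print(roundtrip_utf8_xml(SAMPLE) == SAMPLE)
```

The helper does not escape characters such as < or &, so it is only meant for plain sample text like the string above.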

However, the issue that I was most wondering about was the processing of XML. I had heard about both SAX and DOM and knew that there were different XML parsers, but the difference was never really clear to me. I think that after watching Mark Long's video I now understand the utilities much better than I did before.
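As I understand it, the difference comes down to this: a SAX parser walks through the document once and reports events (an element starts, an element ends) to a handler, while a DOM parser first builds the whole document as a tree in memory and lets the program query it afterwards. A rough sketch using the parsers that ship with Python (xml.sax and xml.dom.minidom) follows; the tiny document and its element names are invented for illustration.

```python
import xml.sax
from xml.dom import minidom

XML_DOC = (
    b"<collection>"
    b"<image>One</image><image>Two</image><artist>X</artist>"
    b"</collection>"
)


class ImageCounter(xml.sax.ContentHandler):
    """SAX handler that counts <image> start tags as they stream past."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "image":
            self.count += 1


def count_images_sax(data):
    """Count <image> elements by reacting to parser events."""
    handler = ImageCounter()
    xml.sax.parseString(data, handler)
    return handler.count


def count_images_dom(data):
    """Count <image> elements by loading the whole tree and querying it."""
    document = minidom.parseString(data)
    return len(document.getElementsByTagName("image"))


if __name__ == "__main__":
    print(count_images_sax(XML_DOC))
    print(count_images_dom(XML_DOC))
```

Both functions give the same answer here. The practical difference would show with a very large file, where the SAX version never has to hold the whole document in memory, or with a task that needs to move back and forth in the document, where the DOM tree is more convenient.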