The idea of a pre-installed virtual machine is appealing. Building the server by hand served its instructional purpose in 672, when it was important to understand how a server is set up and to see all the tasks that need to be accomplished to make it operational. In 675, however, it became rather a waste of time to start every new software package with a clean standard installation of the VM. It became a mechanical routine that I performed automatically, so I was not learning anything new from the process, and the installation took valuable time that I could have dedicated to installing the actual software package. Occasional typos in /etc/network/interfaces were a particular source of frustration, and I do not think I learned much from retyping 168 instead of 186. The other problem was related to the hosts file, which resides on the host machine; many may not have even changed it, using the fixed IP 192.168.X.3 for all their applications anyway.
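A slip such as 186 for 168 is the kind of error that can be caught before the network is restarted. The following is only a minimal sketch, assuming the static address is meant to fall in the 192.168.X.3 family mentioned above; the function name `check_static_address` and the sample addresses are hypothetical, and only Python's standard `ipaddress` module is used.

```python
import ipaddress

# Private range the VMs are assumed to use (addresses of the form 192.168.X.3).
EXPECTED_NETWORK = ipaddress.ip_network("192.168.0.0/16")


def check_static_address(value):
    """Return a list of problems found in a static IP address string."""
    try:
        address = ipaddress.ip_address(value)
    except ValueError:
        return [f"{value!r} is not a valid IP address"]
    if address not in EXPECTED_NETWORK:
        return [f"{value} is outside {EXPECTED_NETWORK}"]
    return []


if __name__ == "__main__":
    # The second candidate contains the transposed octet (186 instead of 168).
    for candidate in ("192.168.56.3", "192.186.56.3"):
        print(candidate, check_static_address(candidate) or "looks fine")
```

A check like this does not replace understanding the configuration file; it only flags a value that is syntactically valid but sits in the wrong network.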
I would probably be satisfied with a pre-configured "standard" Ubuntu install, with a manual install reserved for less typical installations, as was the case with DSpace, which does not use the usual LAMP set-up. A manual install of that kind would refresh the whole process, and in that way the pedagogical purpose of the exercise would still be valid.
Tuesday, 17 November 2009
Tuesday, 3 November 2009
OAI-PMH and Collection Development
I have been thinking for a while about the opportunities OAI-PMH has to offer. At the beginning my acceptance of OAI-PMH was rather unreflective; it was just the right thing to do, something that helps to connect various digital resources. The literature usually did not provide many guidelines as to how OAI-PMH can be employed as a collection development tool. There were a lot of articles about OAI-PMH, but many of them were rather technical, and OAI-PMH was treated just as a tool to populate databases with records from other repositories. Interoperability was often mentioned, but there were few case studies showing how these distributed resources can be aggregated in a meaningful way that would complement the material offered by the institution. As a result, there is not sufficient granularity in the records provided for harvesting, and few institutions offer meaningfully created sets. Those sets that do exist are often indiscriminate aggregations of resources produced under different projects or by different agencies.
OAI-PMH was developed for institutional repositories, which often exist separately from special collections or archives within university libraries, and that may be one of the reasons why this technology has been underused in heritage repositories. Those huge pools of records with no clearly defined scope and audience had little to offer in terms of collection development and build-up; they could hardly supplement one's own material with complementary resources from other repositories with a similar area of interest. Recently, however, a number of OAI-PMH service providers have appeared that harvest records from more narrowly specified sets. These services are mostly tied to a project, so the metadata seem more consistent. This is the case of the Sheet Music collection, which even posted its cataloging guidelines. Another project with a clearly defined scope, even if broader than that of the Sheet Music project, is American Social History Online. Projects like these allow a high level of customization of their services. They can provide users with more precise and useful results; a metadata filter can bring browsing users almost effortlessly several levels deep into the collection hierarchy to the resources they seek.
The use of Web 2.0 tools and intelligent client-side scripting can make browsing and searching useful even within more general aggregations. I was particularly impressed with the ELib service administered by the University of Bremen, Germany, which also presents harvested material in a very intuitive and effective way, taking advantage of subject headings for creating browsable hierarchies, as well as tag clouds for keywords and further refinement of the query.
The DLF (Digital Library Federation) OAI Portal is a reminder of an earlier period when OAI-PMH served merely for aggregating material from various sources without any further manipulation or repurposing of metadata. Users can search or browse two hierarchies, one based on subject headings, which are still very broad, and one based on data providers, but no other tools for narrowing the record sets are available. The records obviously originated in various formats and were based on different rules, and the effort to normalize them was rather limited.
In order to take advantage of OAI-PMH and make it a useful tool for metadata sharing that can help to round out and complement virtual collections, data providers need to make sure that the records are available in meaningful, granular sets. Clearly defined metadata sets and cataloging that follows accepted standards and takes into account the new contexts in which metadata can exist make it possible for other repositories to integrate these records into their collections, and in this way to provide additional exposure to those resources.
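To make the point about granular sets concrete: in OAI-PMH a harvester asks for records with the `ListRecords` verb and can narrow the request with the optional `set` argument. The sketch below only builds such a request URL; the endpoint `repository.example.org` and the set name `covers:yiddish` are made-up placeholders, not a real data provider.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint, used only for illustration.
BASE_URL = "https://repository.example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request, optionally limited to one set."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # Without a set, the harvester receives the provider's whole pool of records.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(list_records_url(BASE_URL))
    print(list_records_url(BASE_URL, set_spec="covers:yiddish"))
```

The second request is only useful if the data provider has defined such a set in the first place, which is exactly the granularity argued for above.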
Labels:
Aquifer,
collection development,
metadata,
OAI-PMH,
ta
Tuesday, 27 October 2009
Consistency of Metadata
I am a strong believer in the sharing and interoperability of metadata, and therefore I try to advocate descriptive standards with regard to both the syntax of elements and their content. I decided to use controlled vocabularies for my collection; most of them are well established and time-tested within cultural and heritage institutions, such as the Thesaurus for Graphic Materials and the Library of Congress Subject Headings. However, for the style element, which is a VRA-inspired extension of the DCTerms set, I use a short local list that was prepared for the collection.
It is not only subject headings that I try to control in order to minimize errors and typos; in all tested applications I tried to provide a drop-down menu for the languages used within my collection, because terms like Yiddish or Lithuanian can cause difficulties even for experienced catalogers.
In order to make the retrieval of metadata functional and effective, I try to keep the values of metadata fields simple and relatively short, so that the data display properly and do not conflict with the layout of the page. Shorter entries are also easier for users to scan on the result screen.
One of the difficulties I have to face is that the collection is small and thematically dispersed, so it is relatively difficult to come up with good browsing categories; therefore I tried to choose broader access terms rather than specific ones.
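The effect of a drop-down menu is that only values from a closed list are ever stored. As a rough illustration, assuming a hypothetical list of languages (a real collection would maintain its own), the same rule can be expressed in a few lines; `normalize_language` is an invented helper name.

```python
# Hypothetical short list; a real collection would maintain its own.
LANGUAGES = ("Hebrew", "Lithuanian", "Polish", "Russian", "Yiddish")


def normalize_language(value, allowed=LANGUAGES):
    """Return the controlled form of a language name, or None if it is not on the list."""
    lookup = {name.casefold(): name for name in allowed}
    return lookup.get(value.strip().casefold())


if __name__ == "__main__":
    print(normalize_language(" yiddish "))   # matched despite case and stray spaces
    print(normalize_language("Lithuanain"))  # a typo is rejected rather than stored
```

Rejecting the misspelled value outright, instead of guessing what was meant, keeps the decision with the cataloger.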
Tuesday, 13 October 2009
OAI-PMH and Benefits of DC
For a while now I have been wondering about metadata interoperability. Günter Waibel and Mary W. Elings demonstrated* that interoperability is possible even if different communities use different metadata standards or, more in the spirit of the article, even if different materials are described by different standards. OAI-PMH is essential for this type of interoperability, but OAI-PMH is just a tool, a protocol for exchange or sharing; what in fact makes the exchange possible is Dublin Core.
I was never a big fan of Dublin Core, whether in its qualified or unqualified form. I always suspected that the effort to generalize the concept of description and remove it from the material to be described did not bode well for practices in cultural and heritage institutions. However, a title is undeniably a title, whether it is the title of a book, of a painting, or of an archival artefact. Depending on the descriptive standard, the title does not always have to be constructed the same way and look the same, but the concept is basically understandable: the title is the property under which an object is usually known. Once I accepted this truism, my opposition to DC as an intermediary layer became less intense.
OAI-PMH was one reason I changed my mind; the other was DigiTool and its mapping file harvesting_schema, which is based on a modified, extended qualified DC and which effectively manages to channel data from various metadata standards into descriptive facets that are then presented to the user in resource discovery. There is little chance that the user will recognize the native format of the metadata. Some residual delimiters may give away a MARC record, but content-wise the harvesting_schema allows for a lot of flexibility. It is also extensible, so one is not bound by the DCTerms set.
When it comes to description, I am in favour of MODS, and I can live with MARC as well, but I have started to appreciate the fact that there is a lightweight DC somewhere out there. And I am glad that we can make metadata available through OAI-PMH in both formats in addition to DC elements or, more precisely, OAI_DC.
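The idea of DC as an intermediary layer can be sketched as a simple crosswalk. The mappings below are a deliberately simplified illustration (a handful of common MARC fields and MODS elements) and are an assumption for the sake of the example; they are not the contents of DigiTool's harvesting_schema, and the record structure and function name are invented.

```python
# Simplified, illustrative crosswalks; NOT DigiTool's harvesting_schema.
MARC_TO_DC = {
    "245a": "title",
    "100a": "creator",
    "650a": "subject",
    "260b": "publisher",
    "260c": "date",
}
MODS_TO_DC = {
    "titleInfo/title": "title",
    "name/namePart": "creator",
    "subject/topic": "subject",
    "originInfo/publisher": "publisher",
    "originInfo/dateIssued": "date",
}


def to_dublin_core(record, crosswalk):
    """Channel a flat record (field -> list of values) into DC elements."""
    dc = {}
    for field, values in record.items():
        element = crosswalk.get(field)
        if element is None:
            continue  # fields without a DC equivalent are simply dropped
        dc.setdefault(element, []).extend(values)
    return dc


if __name__ == "__main__":
    marc = {"245a": ["A sample title"], "100a": ["A. Author"], "300a": ["24 p."]}
    mods = {"titleInfo/title": ["A sample title"], "name/namePart": ["A. Author"]}
    # Both native records end up in the same DC shape.
    print(to_dublin_core(marc, MARC_TO_DC))
    print(to_dublin_core(mods, MODS_TO_DC))
```

The dropped `300a` field shows the price of the lightweight layer: whatever has no DC counterpart is lost in the shared view, which is why it helps to expose the native formats as well.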
*Metadata for All: Descriptive Standards and Metadata Sharing across Libraries, Archives and Museums by Mary W. Elings and Günter Waibel. First Monday, volume 12, number 3 (March 2007), URL: http://firstmonday.org/issues/issue12_3/elings/index.html (Accessed on 2009-10-13)
Monday, 28 September 2009
Drupal Experiences
Installing Drupal was quite an experience. I downloaded several image-rendering modules, mostly because I was not sure how they worked; as a result, some are installed without probably being used and merely duplicate image.module.
I felt that the taxonomy part needed some additional features. I downloaded and installed the tag cloud block – Cumulus and Taxonomy Breadcrumb. Cumulus provides a user-friendly overview of available terms and the quantity of these terms is also visually presented in an intuitive way.
Taxonomy Breadcrumb, on the other hand, provides users with a functional clue about their current location and allows them to backtrack within a collection. However, this feature would be much more useful if the collection were more structured and more deeply nested.
I feel that Drupal is a useful presentation tool, relatively easy to use and to set up. The set-up is definitely the more difficult part, but I think that a Drupal site can indeed be designed in a very user-friendly way, both for the visitor of the site and for the content creator, who, on the other hand, does not need web design or HTML skills in order to produce aesthetically appealing and easy-to-navigate pages.
Drupal, however, is more about access to digital resources; it is not necessarily suitable for managing them or for the tasks required for digital preservation. I can imagine that Drupal could work well in tandem with a DAMS application that would handle the storage and manipulation of the various manifestations, while Drupal would enable access to the view manifestations and the appropriate descriptive metadata.
Labels:
digital libraries,
Drupal,
Drupal modules,
irls675
Tuesday, 15 September 2009
Pace
I find the pace of the tech classes fine. There is a lot to explore and try. The reading seems a bit more time-consuming than it was in the 672 class. The management component complements the technology part well. One needs to change gears and, instead of one's fingertips, start using one's head again.
However, it is much more difficult to plan work for the whole week now, because there are three days between the time the technology assignments are made available and the time the management ones appear in d2l. Since I do not know what I will have to do from Wednesday on, I try to do as much of the hands-on work as possible by then. Then I work on the theoretical parts and later return to the practical exercises, which after that intermission look new again, so I have to go through some of the steps once more. In the end this can be seen as a benefit, because repetition makes perfect.
Monday, 7 September 2009
Content Management Systems in Libraries
Content management systems (CMS) are becoming an important presence on campuses and in various cultural and heritage repositories. They often bring unified policies concerning web publishing, reduce the need for technical expertise on the part of the contributing staff, and can also enable more immediate interaction between an institution and its audience.
What I found interesting while reading through the articles in the special issue of Library Hi-Tech dedicated to CMS (2006, Volume 24, Issue 1) is that most of the scenarios presented were home-developed systems; few institutions would adopt an already developed package. Even in the case of Morehead State University, as described in Migrating a Library's Web Site to a Commercial CMS within a Campus-wide Implementation *), where the university opted for an outside vendor to provide a CMS, it chose a relative newcomer to academic software, which caused considerable problems. Most likely, the universities wanted to build on the expertise of their own staff and preserve workflows that may have been considered key and well tested. In some cases, the technolust of those in charge was not hidden very well (Kent State, for instance).
The question is whether an opportunity to revise workflows and streamline operations was missed when the architecture of the new CMS tried to preserve the existing set-up. I was also uneasy when reading about the elimination of the code view and sole reliance on WYSIWYG and forms. That is often very limiting. I can see why in many cases eliminating the possibility of editing the code of the page is imperative, but on the other hand I have experienced several situations in the past when a small touch in the HTML source could easily have fixed an issue, yet it was not possible because of the system set-up.
I read with particular interest the article CMS/CMS: content management system/change management strategies **). This article transparently listed the methodologies explored, lessons learned, possible pitfalls, and, in the end, practical recommendations. The article focused more on managerial issues and questions of project management. The most important point I took from it is that the preparation and planning phase is key. Knowing one's own structure, role, and mission is important when implementing a CMS, so that the structure and workflows are channeled as effectively as possible and the possibility of duplication is minimized. I was also reminded that it is unreasonable to expect that the current state of the institution will be preserved, as the "Virtual change will drive organizational change."
After all, implementing an institution-wide CMS is an opportunity to rethink one's role and to reevaluate the services provided to the institution's users and their effectiveness. In short, it is an opportunity to improve the functioning of an institution.
References:
* Tom Kmetz, Ray Bailey: Migrating a Library's Web Site to a Commercial CMS within a Campus-wide Implementation. Library Hi-Tech (2006), Volume 24, Issue 1, pp. 102-114.
** Susan Goodwin, Nancy Burford, Martha Bedard, Esther Carrigan, Gale C. Hannigan: CMS/CMS: content management system/change management strategies. Library Hi-Tech (2006), Volume 24, Issue 1, pp. 54-60.
Labels:
CMS,
Content management systems,
irls675,
local development
Monday, 31 August 2009
On Tagging and Controlled Vocabularies
Clay Shirky's Ontology is Overrated: Categories, Links, and Tags is certainly a provocative piece, and the discussion about the importance of controlled vocabularies and the effective use of tags is still ongoing in various forums where catalogers and those interested in the description of resources congregate (even if mostly virtually). When it comes to ontologies and controlled vocabularies, I believe that there is a need for the profession to use consistent terminology. If several nomenclatures do exist, they should always be documented and properly referenced. I think that librarians, archivists, and museum curators should provide metadata that contain controlled vocabulary, even pre-coordinated. That said, I also believe that popular taxonomies (folksonomies) should be considered when providing access to digital resources. The controlled vocabularies can serve mostly system-related and administrative functions for the staff; they also provide the initial categorization. Once the object is in the system, it can move freely throughout the collection, prodded by whatever probabilistic algorithm. Including folksonomies in the metadata (or maybe rather in the indexes) for searching and browsing can make finding objects in collections easier, as it provides users with more options. It is also a social commentary on a given culture or society and an indicator of interest among users. However, that does not mean giving up established terms, because the scholarly and professional needs of expert users have to be addressed as well. There is no dichotomy. It is not either folksonomies or controlled vocabularies, because both systems can coexist and complement each other.
Labels:
controlled vocabularies,
folksonomies,
irls675,
tagging
Proposed Collection for Drupal Test
There are a few occasions when digitised material does not make it into a repository and remains sitting in the staging area. There are usually legitimate reasons for the delay, and sooner or later the project will be picked up. This particular project contains images that were digitised for publication. The book is out, and so are most of the digitised images, but there are still some that have not made it into the repository yet.
I would like to create a collection of images of front covers and title pages from this project and look at them as cultural artifacts in their own right, rather than just as manifestations of a book.
The collection may offer several ways to be browsed: by the language or script used, the date of publication, the authors of the books, and the creators of the book design. The subject depicted on the front cover, if any, certainly should also be a descriptive facet, and one particularly suitable for tagging. Many of these books and their covers had a message to convey to those who took the book into their hands. The message of the books has certainly changed, and users can approach it from different angles now; the same book can be a piece of Bolshevik propaganda for some, and for others just a popular children's book.
Monday, 10 August 2009
Project Management and Digital Libraries
There is ample evidence that planning for a digital project is important. However, in all of those texts on project management I encountered there was always an assumption, sometimes explicitly stated, that the project was something new, unique, outside of the usual operation of the given body. I work for an organization where projects are our daily operation - all staff members are managed through projects. As a result, all members of the team usually know what their roles are; there is no need to worry about buy-in, because our products are based on material submitted by the partners. Previously, there were some initiatives coming from the team in order to participate in grants and gain some additional funds, but those projects could have been realized only because the Partners - the stakeholders - were interested.
The whole environment and operation are relatively low risk, and the whole team is relatively successful and accepted. The relationships within the team are built on trust. I realize that this can easily backfire; nevertheless, I consider it an asset. This trust is also a motivating factor and generates interest, commitment, and a sense of professionalism on the part of the team. The members of the team often bring up suggestions that lead to improved workflows and amended policies, which further improves the work environment.
It is clear that a project has to have a clear sense of direction, because not all suggestions necessarily lead in the right direction. H. Frank Cervone's articles brought the philosophy of project management, which had seemed a bit remote and too technical, onto the familiar turf of libraries and other cultural and heritage repositories. They were a reminder that current practices should not be taken for granted and always have to be validated in practice during implementation and execution. Monitoring and constant evaluation are really key.
Labels:
digital libraries,
irls672,
project management
Monday, 3 August 2009
LAMP Framework and Digital Collections
I do not think my perspective on the way digital information is managed has changed that much since the course started, but I have gained a much better understanding of how digital information is managed, manipulated, used, and re-purposed. Databases always seemed to be a key aspect of digital information management, and thanks to the MySQL crash course I was able to better understand how data are entered into a database, in what form and format (data type), that these things matter, and that they can be quite helpful and powerful when used right. The date data types could enhance the information users require, because chronological data can then be computed and manipulated. Unfortunately, in most library and archival content standards chronological data are of little use from the point of view of machine manipulation, as they are usually entered as unrestricted text strings.
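The difference between a typed date and a free-text date can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the `items` table, its columns, and the sample rows are hypothetical. SQLite keeps dates as ISO 8601 text, but its date functions can still compute with them, while a free-text date cannot be computed on at all.

```python
import sqlite3

# In-memory database; table and column names are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (title TEXT, created TEXT, created_note TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [
        ("Letter A", "1890-03-15", "March 15th, 1890"),
        ("Letter B", "1890-04-02", "early April 1890"),
    ],
)

# Dates in ISO 8601 form can be compared and subtracted by the database.
days_apart = conn.execute(
    "SELECT julianday(MAX(created)) - julianday(MIN(created)) FROM items"
).fetchone()[0]

# Free-text dates are not recognized: date() yields NULL (None in Python).
unparsed = conn.execute("SELECT date(created_note) FROM items").fetchall()

print(days_apart)
print(unparsed)
```

In MySQL the same idea would use a DATE column and functions such as DATEDIFF; the point of the sketch is only that the machine can work with the first column and not with the second.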
The real eye opener, however, was the last part of the course on PHP and MySQL. The ease with which the database could be queried, data retrieved, and delivered to the browser really surprised me. However, the effortless update of the database was the real coup for me. It made me think again about the Web 2.0 participatory aspects and how useful they can be to the cultural heritage field. I realize that both delivery and update actions were rather unsophisticated and many blocks of code would be required to update databases with useful, normalized data and also to retrieve them in a more user-friendly way.
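To give a rough idea of the extra code a "useful, normalized" update needs, here is a minimal sketch. It is written in Python with sqlite3 rather than the PHP and MySQL used in the course, and the `tags` table and the `add_tag` helper are hypothetical. The user's input is tidied before it is stored, and it is passed to the database through placeholders rather than pasted into the SQL string.

```python
import sqlite3


def add_tag(conn, item_id, raw_tag):
    """Normalize a user-supplied tag and store it with a parameterized query."""
    # Collapse runs of whitespace and lowercase the tag so variants match.
    tag = " ".join(raw_tag.split()).lower()
    if not tag:
        raise ValueError("empty tag")
    # The ? placeholders keep user input out of the SQL text itself.
    conn.execute("INSERT INTO tags (item_id, tag) VALUES (?, ?)", (item_id, tag))
    conn.commit()
    return tag


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (item_id INTEGER, tag TEXT)")
add_tag(conn, 1, "  Art   Nouveau ")
rows = conn.execute("SELECT item_id, tag FROM tags").fetchall()
print(rows)
```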
The relative ease with which the whole system is built and operates convinced me that the LAMP framework is a viable concept, quite suitable for libraries, archives, museums, and other cultural and heritage institutions both big and small.
Sunday, 26 July 2009
SQL and the Database Architecture - Week 2
SQL and the database architecture introduction were topics that I was very keen on exploring when I started the course. I have been working with databases for several years, but I was always working with their output. The technological underpinnings of the ILS or DAMS were always obscure to me. I have seen an SQL update query from time to time, but even if I had some idea what that script did, I was not really sure how the process was carried out. After these two weeks the fog has lifted a tiny bit and database management is now less magic and more craft.
To my surprise, I found it much easier to work with tables, altering or updating them, in the CLI than in either of the GUI applications, Webmin and phpMyAdmin. It is not that selecting from drop-down menus would not be easier once one gets familiar with them, but I found both interfaces very cluttered and confusing. It was easier to get to the MySQL monitor and code the whole command than to search on the left side of the screen, then at the top, and then in the middle below or between tables and boxes. However, I was missing Linux's auto-completion feature.
Creating and populating tables was not difficult, but designing the database from scratch was a challenge, especially once I learned about normal forms. It was still an interesting exercise to think about the character of the entered data and gradually simplify the tables.
The SQL queries were (are) hard to learn and I still need to check the correct syntax for the first line with SQL commands. The parameters are somewhat easier, as their syntax remains the same and does not vary. I had no luck with data aggregation and SQL functions. In the video lectures and examples it looks logical and easy, but when I tried to come up with one myself for one of the practice databases I failed miserably (I was trying to calculate how many images each artist has in collections). I guess that topic will need to be revisited.
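The kind of aggregation mentioned above, counting how many images each artist has, can be sketched as follows. This is an illustration under assumptions, not the course's practice database: the `artists` and `images` tables and their rows are hypothetical, and Python's sqlite3 module stands in for MySQL, although the SELECT statement itself is standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical two-table layout: each image row points at one artist.
conn.executescript("""
CREATE TABLE artists (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE images (id INTEGER PRIMARY KEY, artist_id INTEGER, title TEXT);
INSERT INTO artists VALUES (1, 'Artist A'), (2, 'Artist B'), (3, 'Artist C');
INSERT INTO images VALUES (1, 1, 'Image 1'), (2, 1, 'Image 2'), (3, 2, 'Image 3');
""")

# GROUP BY makes one output row per artist; COUNT(i.id) counts that artist's
# images. LEFT JOIN keeps artists with no images, counted as 0.
counts = conn.execute("""
    SELECT a.name, COUNT(i.id) AS image_count
    FROM artists AS a
    LEFT JOIN images AS i ON i.artist_id = a.id
    GROUP BY a.id, a.name
    ORDER BY a.name
""").fetchall()

for name, image_count in counts:
    print(name, image_count)
```

Counting `i.id` rather than `*` is the detail that is easy to miss: with a LEFT JOIN, `COUNT(*)` would report 1 for an artist without any images.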
I was still impressed by how powerful, yet in principle simple, SQL queries are.
Sunday, 19 July 2009
Database Design and SQL
This was definitely a tough week, very theory-heavy, but it also included a relatively gentle introduction to SQL. I admit that at first I did not believe Joshua Mostafa's assertion that SQL is easy to comprehend, but after several lessons I realized I was able to comprehend the most fundamental SQL rules of query construction and of database manipulation commands. SQL actually makes sense.
The different data types are an interesting and useful concept. I had some difficulties with understanding some of the numerical data types. The difficulty was rather in imagining the practical use for some of the features. It made me think about what kind of data I use, and how I enter them. I also found out the rationale behind some conventions of data entry which I followed and took for granted. So many times I was told that a certain field can't be longer than 255 characters, and it often made me wonder why. Now I can guess that it may be because the given column requires the char, varchar, or tinytext data type. Even if it could be a tinyint, as well.
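One plausible origin of the recurring number 255 is simple arithmetic: it is the largest value that fits in a single unsigned byte, which is also the upper bound of an unsigned tinyint. The sketch below only demonstrates that arithmetic in Python; the helper name `max_unsigned` is made up for illustration, and whether a given system's limit really comes from a one-byte length field is an assumption.

```python
def max_unsigned(n_bytes):
    """Largest value an unsigned integer of n_bytes bytes can hold."""
    return 2 ** (8 * n_bytes) - 1


one_byte = max_unsigned(1)
two_bytes = max_unsigned(2)

# 255 still fits in one byte; 256 does not.
(255).to_bytes(1, "big")
try:
    (256).to_bytes(1, "big")
    fits = True
except OverflowError:
    fits = False

print(one_byte, two_bytes, fits)
```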
The normal forms still puzzle me. I may get a better understanding of table normalization by looking at actual examples and trying to understand the relationships and changes as data are entered, how they are retrieved, and how they may have been manipulated in between.
Sunday, 12 July 2009
On Technology Planning
This week's readings were an interesting bunch, containing articles on the historical background of several federal support programs to foster technology in libraries. Some documents were very useful and detailed howtos, others rather impish and somewhat personal like Michael Schuyler's "Life is What Happens to You When You Are Making Other Plans." Even if I understand his frustration and irritation with "meddlesome" bureaucracy, I did not find his article particularly helpful as guiding material. There were moments when he came to his senses and tried to treat the subject seriously, but it's difficult for me to believe that he tried "really hard not to rant." It was fun to read it, though.
Two documents, however, stood out for me: OCLC's The 2003 OCLC Environmental Scan: Pattern Recognition, and ALA's Information Technology at ALA: 2000-2005. These were really thorough and detailed analyses. While the OCLC report was general in nature, its authors tried to depict trends in the ways knowledge is being accessed, transmitted, and stored, not only on the national but also on the global level. They attempted to capture the position of libraries (the term that they use to represent libraries, as well as museums, archives, and other institutions dealing with education, heritage, or culture). I was particularly impressed with "The Technology Landscape" chapter. It did not bring up any surprises, but I liked how the authors managed to succinctly wrap up the current state of the field. I do occasionally follow the weblog of Lorcan Dempsey, who was one of the crafters of the report. He often posts various reports and findings related to literacy, knowledge management, learning and research, and also social networking on his weblog, and in doing so continually updates the OCLC report.
The other document that impressed me was produced by ALA: the goals, needs, assessments, benchmarks, and evaluations were all clearly stated, with no verbiage and everything to the point. It was a well-structured and transparent document. Some of the other texts on technology planning advised against setting goals that are too rigid, and even if the ALA document called for flexible planning, their plan was very specific when it came down to implementation. The only criticism I would have is that there was no mention of any person responsible for any of the steps taken. In addition, even if the plan contains a list of possible implementation issues and conflicts, there is no hint as to how they are going to be resolved.
If I were to participate in any technology planning activities, I think I would start with documenting my current activities and workflows. Once these activities are documented, they can be evaluated and eventually improved.
There are also some more system-oriented concerns that I would have now - how do systems scale up with the needs and workflows in place? Are the applications modular, expandable? Do they support open and shared standards - or is it a closed black-box operation? And if the system is closed are there any open source alternatives? And if there are alternatives, do we have the manpower to run them and maintain them?
Labels:
interoperability,
irls672,
technology planning
Saturday, 4 July 2009
On Learning XML, its Processing, and Unicode
In learning XML, I used both Mark Long's Introduction to XML and the W3schools' XML Tutorial. As much as I like the W3schools tutorials, I rarely learn a technology or language that would be completely new to me from them. I visit the site relatively often, but usually for reference questions or background information. I liked Mark Long's casual and unpretentious style, which was to the point and clear of jargon.
I think that one of the greatest benefits of XML is its ability to accommodate a variety of languages and scripts: a source file can contain several languages and scripts, and yet the browser, if set up properly, will display only the language or script the user expects. Therefore, I looked at the encoding and special character issues in both tutorials. I am a great fan of Unicode, which I find to be an amazing intellectual achievement. It is a pity that for many web developers there is virtually no alternative to ISO-8859-1 (a basic Latin Western character set). Even on the W3schools pages with examples of XML in real life, in spite of the fact that UTF-8 is the default encoding for XML, one finds this unnecessary, suboptimal value in the processing instruction.
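A small sketch may show why ISO-8859-1 is a limiting choice for multilingual records. It assumes a hypothetical one-element document with Czech and Greek text in it; the same text that encodes and parses cleanly as UTF-8 cannot be written in ISO-8859-1 at all, because that character set has no code points for those letters.

```python
import xml.etree.ElementTree as ET

# Hypothetical record title mixing Czech and Greek characters.
title = "Dvořák: Rusalka / Ελληνικά"
xml_text = f'<?xml version="1.0" encoding="UTF-8"?><title>{title}</title>'

# UTF-8 can represent every character in the title.
utf8_bytes = xml_text.encode("utf-8")

# ISO-8859-1 cannot: characters such as "ř" and the Greek letters are missing.
try:
    xml_text.encode("iso-8859-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# The UTF-8 bytes parse back to exactly the original text.
parsed_title = ET.fromstring(utf8_bytes).text

print(latin1_ok, parsed_title == title)
```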
However, the issue that I was most wondering about was the processing of XML. I heard about both SAX and DOM and knew that there are different XML parsers, but the difference was never really clear to me. I think that after watching Mark Long's video I now understand the utilities much better than I did before.
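The difference between the two approaches can be sketched with the parsers in Python's standard library (an illustration, not taken from Mark Long's video; the sample document and the `ItemHandler` class are hypothetical). A DOM parser loads the whole document into a tree that can then be searched, while a SAX parser reads the document once from start to end and calls handler methods as it meets each tag.

```python
import xml.sax
from xml.dom import minidom

XML = b"<collection><item>Map</item><item>Letter</item><item>Photo</item></collection>"

# DOM: the whole document becomes an in-memory tree that we query afterwards.
dom = minidom.parseString(XML)
dom_titles = [node.firstChild.data for node in dom.getElementsByTagName("item")]


# SAX: the parser streams through the document and reports events to a handler.
class ItemHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_item = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "item":
            self._in_item = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect it until the end tag.
        if self._in_item:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "item":
            self.titles.append("".join(self._buffer))
            self._in_item = False


handler = ItemHandler()
xml.sax.parseString(XML, handler)
sax_titles = handler.titles

print(dom_titles)
print(sax_titles)
```

Both give the same list here. The trade-off shows up with large files: DOM needs memory for the entire tree but allows random access, whereas SAX keeps only what the handler chooses to store.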
Sunday, 28 June 2009
HTML resources and editors
The W3schools website is a site I always go to when I need a quick reference when it comes to mark-up or some other web development or web design issue, so this time I again went to this site first. I looked at some of the less familiar tags that I do not believe I have ever used, like fieldset, but most of them were related to forms and I hardly ever deal with them. The page on HTML Events was interesting, but those are mostly tied to JavaScript and my knowledge thereof is, unfortunately, negligible.
However, what really piqued my interest was a table of HTTP messages. I have long wondered about them. True, I have not seen that many, maybe three or four in real life, but it was interesting how they are grouped. There are five groups - 1xx: Information; 2xx: Successful; 3xx: Redirection; 4xx: Client Error; 5xx: Server Error. The MARC-like affinity is very appealing. I recall I have seen only error 403 - Forbidden access and 404 - Not found, which is not surprising as those are client errors. Other errors, like 503 - Service Unavailable, can be hidden or replaced by another customized page, so that the user would read a meaningful message redirecting her to alternative resources.
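The grouping rule is easy to express in code: the first digit of the status code names the group. The following is a minimal sketch in Python; the group labels are the ones quoted above, the reason phrases come from the standard library's `http.HTTPStatus`, and the helper names `status_class` and `describe` are made up for illustration.

```python
from http import HTTPStatus

# Group labels as listed in the table of HTTP messages discussed above.
CLASSES = {
    1: "Information",
    2: "Successful",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code):
    """Return the group a status code belongs to, based on its first digit."""
    return CLASSES[code // 100]


def describe(code):
    """Combine the code, its standard reason phrase, and its group."""
    status = HTTPStatus(code)
    return f"{code} {status.phrase} ({status_class(code)})"


for code in (403, 404, 503):
    print(describe(code))
```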
I used to use Dreamweaver (but it was owned by Macromedia back then) and Notetab (a nifty shareware with a cute icon of the Swiss coat of arms). Ironically, shortly after I bought my own copy of Dreamweaver I became aware of oXygen - a relatively inexpensive XML editor that also supports HTML - and I have not touched Dreamweaver since.
Labels:
Dreamweaver,
HTML messages,
oXygen,
W3schools
Monday, 22 June 2009
Volatility of Learning Styles: On Footnotes and Hyperlinks
This course has been one big experiment for me. How does one study remotely without a fixed schedule, a dedicated space for learning and/or without the physical presence of the instructor/tutor/lecturer? I am starting to understand that it is similar to remote work or work from home - that also sounds great in theory, but in practice one finds out that it requires much more discipline and concentration to work at home than at the work place. I do not mean to sound negative; it is great if somebody has the flexibility and can work from home, but the home environment can be very disruptive. Strangely enough, working or studying at home requires some learning and practicing. It does not come naturally. After all, most people usually come home to rest after being at work the whole day, so some old habits need to be broken and one needs to learn how to work efficiently at home.
I find it similar with remote study. I normally study material one document at a time, one after another - if I do not find in one document what I am looking for I go to the next one. However, if I find the answer to my question I do not usually read the next document. On the other hand, every time I pick up a manual I promise myself to read it methodically from beginning to end. It never happens - once the exercises start, I start practicing and playing with examples and as a result find myself skipping sections, pages, even chapters. I usually do not go back to the pages I skipped. Hence, I am learning something, but something else is usually missed. Those may be the perils of learning through creativity and experimentation.
With this class it is different. I am trying to read everything which has been assigned, but it is a challenge, especially the lectures. They provide a rather dense summary of the subject matter with external links to more detailed material (I have no problem with Wikipedia entries - they are usually well structured and the content is very informative. They include external references, as well). I like to think of these links as a form of footnotes, which represent for me the same dilemma - will I read them all once I finish the page/chapter, or will I read them as they are marked in the text? I am still not sure what the best way is; in the end it depends on the text and on the character of the footnotes. If I am more familiar with the content I tend to check the footnotes (or hyperlinks) as they appear in the text, otherwise I focus on the text and leave the footnotes for later. However, sometimes when the footnotes are too long or too detailed I skip them entirely and read them after I finish the chapter. Similarly, I usually leave out podcasts or links to video presentations and watch or listen to them last.
Videos often repeat a lot of information covered in the text of the lecture or in the reading, so they function as a good summary and represent another way of looking at the issue at hand. However, that same repetition is very irritating, so I often watch videos impatiently waiting for something new. Videos and podcasts are difficult to quickly scan for content. Out of the learning tools presented so far, they work the least well for me. Yet, I still enjoyed the Warriors of the Net.
Labels:
footnotes,
learning styles,
Warriors of the Net
Monday, 15 June 2009
Permissions, Groups, and Users
The installation of all the Perl components was not too difficult. It definitely helped that I used the filename completion feature with the TAB key. That saved me some typing, but more importantly some unnecessary typos. I managed to create a new user group with a user account with all three applications. It was a relatively straightforward process in all three cases. The experience and knowledge I gained through each exercise helped me in completing subsequent exercises. The reading, too, provided enough context that I was able to understand the steps I was taking. I went back to Arthur Griffith's Linux: Introduction to Linux, especially the chapter on Group and Shadow file, which helped my understanding of the assigned reading.
It was not too difficult to set up the new user groups and users in the CLI, but it would be too laborious and prone to error had the user permissions been set up on a larger scale. It seems also quite unnecessary as the GUI applications are very convenient to use and as far as I can tell they do their job well.
I needed to change file permissions occasionally in the past. I looked up the documentation and changed them. I still think I would use the command line for these ad hoc tasks. If, however, I needed to deal with permissions on a more regular basis I would choose a GUI application.
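For the occasional ad hoc change, the link between the octal numbers given to chmod and the familiar `rw-` strings can be sketched in Python. This is an illustration only: it assumes a Unix-like system, works on a throwaway temporary file, and the `set_mode` helper is a made-up name.

```python
import os
import stat
import tempfile


def set_mode(path, mode):
    """Apply an octal permission mode and return it in ls-style notation."""
    os.chmod(path, mode)
    return stat.filemode(os.stat(path).st_mode)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "notes.txt")
    with open(path, "w") as fh:
        fh.write("example\n")

    # 6 = read + write, 4 = read only, 0 = nothing; digits are owner, group, others.
    private = set_mode(path, 0o600)
    shared = set_mode(path, 0o640)

print(private)
print(shared)
```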
Monday, 8 June 2009
Computer Configuration and vi Text Editor
Every time I use a new computer I try to configure it so that it is as similar as possible to the interface I am used to. Setting up a different view in the MS Windows Explorer is one of the first things I do: I set the Explorer to the 'Details' view, then I enable the display of file extensions and hidden files. I am glad that I learned about the alias, or rather how surprisingly easy it is to create one and how powerful a shortcut an alias can be.
When configuring a computer it is important to recognize whether the change happens on the level of a single user or on the system level for all users. In Ubuntu this difference is reflected in using 'sudo' in combination with the regular Linux command. I modified the 'ls' command so that it would display the hidden files by default, as well: I created an alias for 'ls' to behave as 'ls -a'. This should happen on the level of a single user, so I did not need to use 'sudo'. In the exercise that enabled a look-up for third-party supported applications for Ubuntu, on the other hand, the change happens on the system level for all users and 'sudo' had to be used. At first I forgot to use 'sudo' when opening the appropriate file, and only a read-only copy of it opened.
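Why the system file opened read-only without 'sudo', while a personal alias needs no special rights, follows from the basic Unix permission check. Below is a simplified sketch of that check in Python; the `may_write` function, the user and group IDs, and the file modes are hypothetical examples, and the sketch ignores refinements such as ACLs or read-only file systems.

```python
import stat


def may_write(file_mode, file_uid, file_gid, user_uid, user_gids):
    """Simplified Unix write check: root always may; otherwise the owner,
    group, or other permission bits are consulted, in that order."""
    if user_uid == 0:
        return True
    if user_uid == file_uid:
        return bool(file_mode & stat.S_IWUSR)
    if file_gid in user_gids:
        return bool(file_mode & stat.S_IWGRP)
    return bool(file_mode & stat.S_IWOTH)


# A system-wide configuration file: owned by root, mode 644 (rw-r--r--).
system_file = dict(file_mode=0o644, file_uid=0, file_gid=0)

ordinary_user = may_write(**system_file, user_uid=1000, user_gids={1000})
as_root = may_write(**system_file, user_uid=0, user_gids={0})

# A file in the user's own home directory, such as one holding aliases.
own_file = may_write(0o644, 1000, 1000, user_uid=1000, user_gids={1000})

print(ordinary_user, as_root, own_file)
```

Running a command through 'sudo' corresponds to the second call: the process acts as user ID 0, so the check succeeds.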
I did not use nano for the exercises this week. The interface resembled pico and since I have been using vi for some time, I stuck with it mainly because I find vi more responsive and faster. I usually learn by reading a book, and some material that makes sense and/or that I use in my work I will remember, but most of what I learn I forget. However, I am glad I have the opportunity to work on all those tutorials, and the material makes more sense to me now.
Labels:
computer configuration,
irls672,
Linux,
text editors,
vi
Friday, 29 May 2009
Ubuntu LiveCD Installation and Command Line Tutorials
I did not have any difficulties installing the Ubuntu LiveCD, but reading the instructions made me really anxious. I used Ubuntu for some time two years ago, so I knew what to expect. I knew the interface, set-up, how the OS works, but I had installed Ubuntu on a computer back then and this time I am going to use the LiveCD on a computer which is not mine and which I am not going to use in a month. Once I did it, I relaxed. It was the old interface and everything worked as expected. I repeated the re-boot, played with the language options, and then started the Terminal application.
I worked my way through the tutorials - the ones in the VTC were well accompanied by the lessons from Learning the Shell from Linuxcommand. I do not deal with permissions often, but I have to do it from time to time and every time I have to revisit the relevant documentation, hence I read this part of the tutorials carefully. Navigation in the file system and directories and manipulating files were good summaries, but fairly basic. The Input/Output chapter was quite interesting. The chapters on the layout of the Linux file system were rather technical, and so was the tutorial on partitions. I was scratching my head over all those blocks and different drives, but the practical examples made the partition tutorial clearer. However, I would still appreciate a more realistic scenario and reasons why I would want to have different partitions on one disc.
Wednesday, 20 May 2009
Wine, Microsoft, and Gaming
I became intrigued by Wine. The post in Ubuntu Forum that originally piqued my interest was about interoperability problems between Firefox and Wine. There were other threads related to Wine there, as well - one of them was dealing with playing commercial games on Ubuntu machines. I am not a big player myself, but I occasionally wondered what Linux users do, and I could not imagine they would just keep whining and would not figure out a solution. There seem to be very few commercial games that run on Linux. A Wikipedia entry on Linux gaming (http://en.wikipedia.org/wiki/Linux_gaming, accessed on 2009-05-20) revealed that the situation is not that bad, even if one did not see any of the usual suspects mentioned. Wine delivers a compatibility layer that enables Linux users to enjoy the hottest products of the gaming industry.
I have to admit that I was impressed by the concept, and I can imagine there may be even some interest and support for such compatibility-enabling and/or cross-platform tools in different economic and legal climates. Wine epitomizes the challenges of open source software. On one hand, it enables Linux users to work with Microsoft applications, even if the software may be acquired legally and paid for. On the other hand, the application is often clunky, does not always run smoothly and reliably, and often other scripts and applications have to be employed in order to run a desired application. It does not help that Microsoft does not seem to be interested and tries to make it rather difficult for Wine developers to keep their product up-to-date.
The prospects are not all that bleak. Google bundled Wine into their digital image editing and organizing software Picasa, so that it runs on Windows, as well as on Linux. In addition, Google seems continuously interested in Wine which is a rather positive sign, for now...
Sources:
http://en.wikipedia.org/wiki/Wine_(software) (Accessed 2009-05-20)
http://www.winehq.org/ (Accessed 2009-05-20)