The day finally arrived. After what seems like countless hours of work and a huge amount of effort by the team (mainly because it did involve countless hours and a huge amount of effort!), APMS launched formally on Monday.

An email was sent to all academics informing them of the launch, but I know this kind of launch can often be fairly quiet, in that it can take a few days or even longer before people actually start logging in and using the system. However, I know from the logs that quite a number of people, mainly academics, have been having a look around the system, which is very positive. I have even had a couple of queries about the programmes in it and how changes can be made to those programmes, which is great.

One glitch we came across quite late relates to External Examiner email addresses. APMS integrates with our Active Directory (AD) for its login credentials and, along with University staff, External Examiners have user accounts in our AD that get transferred into APMS. External Examiners will need to access APMS to approve modifications to programmes, and to submit annual monitoring reports for the programmes they examine. An integral feature of APMS is that it emails users to let them know when something has happened that involves them, or to notify them that something is waiting for them to do. This notification process uses what is known as the ‘Primary email address’ in AD. We want to make sure these emails go to External Examiners’ ‘real’ addresses (e.g. joe.bloggs@cambridge.ac.uk), but for various reasons it isn’t feasible to enter those as the primary email addresses. Instead we need to use a secondary email address field, and Worktribe are looking at how to pick up email addresses from different AD fields depending on the group the user is a member of. Hopefully we will have all of that sorted out in a few weeks, but in the meantime, if we need to notify any External Examiners of anything waiting for them (which is fairly unlikely for now), we can use manual workarounds.
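To make the idea of group-dependent addresses concrete, here is a minimal sketch in Python. It is not Worktribe's implementation: the `DirectoryUser` record, its field names and the group name are all hypothetical stand-ins for whatever AD attributes end up being used.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical group name; the real AD group may be called something else.
EXTERNAL_EXAMINER_GROUP = "External Examiners"


@dataclass
class DirectoryUser:
    """Hypothetical, simplified view of a user record pulled from AD."""
    username: str
    primary_email: str
    secondary_email: Optional[str] = None
    groups: frozenset = frozenset()


def notification_address(user: DirectoryUser) -> str:
    """Pick the address that notifications should be sent to.

    External Examiners get their secondary ('real') address when one is
    recorded; everyone else, and any examiner without a secondary
    address, falls back to the primary address.
    """
    if EXTERNAL_EXAMINER_GROUP in user.groups and user.secondary_email:
        return user.secondary_email
    return user.primary_email
```

Falling back to the primary address keeps notifications flowing for any examiner whose secondary field has not been filled in yet.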

Since we now have access to a very large amount of course data, it is possible to look at ways of improving the presentation of, and access to, this data for audiences such as potential students. As such, I’m looking at building a prototype ‘Course Finder’ application.

Building on the work I outlined in my previous post, we can now identify keywords for all of the courses offered at the university. This offers one way of identifying suitable courses for users of the application, who are likely to be potential students. These courses can also be linked to JACS codes, representing the subjects covered in the courses. Courses are also delivered at a particular level: foundation degree, bachelor’s degree, master’s and so on.

The criteria that I am currently considering using to identify potential courses for users are: subjects previously studied; subjects of interest; and keywords (identified with OpenCalais).

As well as using these parameters to produce search results, I have also included features within the application to record ‘click-through’ on search results, as well as the ability to ‘recommend’ a search result as being appropriate and relevant to the search parameters outlined above. As such, the application should ‘learn’ as more and more searches are carried out. If parameters A, B and C are specified and one of the courses is recommended as being relevant, then the next time a search is carried out using parameters A, B and C, the same course should be highlighted as being potentially more useful and relevant to the user.
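The ‘learning’ behaviour can be sketched in a few lines of Python. This is only an illustration of the idea, not the application's actual code: `search_key`, `RecommendationLog` and the course names are hypothetical, and the real application stores this data in database tables rather than in memory.

```python
from collections import Counter, defaultdict


def search_key(studied, interests, keywords):
    """Build an order-insensitive key from the three search parameters."""
    return (frozenset(studied), frozenset(interests), frozenset(keywords))


class RecommendationLog:
    """In-memory record of which courses were recommended for which search."""

    def __init__(self):
        self._counts = defaultdict(Counter)

    def recommend(self, key, course):
        """Record that a user marked `course` as relevant for this search."""
        self._counts[key][course] += 1

    def rank(self, key, results):
        """Move previously recommended courses to the top of `results`.

        The sort is stable, so courses with the same number of
        recommendations keep their original search order.
        """
        counts = self._counts.get(key, Counter())
        return sorted(results, key=lambda course: -counts[course])
```

For example, after `log.recommend(key, "BSc Computer Science")`, a later call to `log.rank(key, results)` with the same parameters lists that course first and leaves the remaining results in their original order.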

Most of the data required to execute the searches is available from our Academic Programme Management System, through our Nucleus data store. The data relating to the individual search instances, as well as the recorded click-throughs and recommendations, will obviously need to be stored within the application’s database. It may also be necessary to store some of the data from Nucleus locally, in order to improve performance by essentially caching views on the data that are unlikely to change too often, such as links between keywords and courses.
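As a rough illustration of caching a slowly changing view, the Python sketch below wraps any fetch function and only calls it again once the cached copy is older than a chosen age. The `CachedView` class is hypothetical, and the fetch function stands in for whatever call actually retrieves the data from Nucleus; nothing here reflects the real Nucleus API.

```python
import time

_MISSING = object()  # sentinel meaning "nothing has been fetched yet"


class CachedView:
    """Keep a local copy of a slowly changing data set.

    `fetch` is any zero-argument function that returns the data.
    `clock` is injectable so that expiry can be tested without waiting.
    """

    def __init__(self, fetch, max_age_seconds, clock=time.monotonic):
        self._fetch = fetch
        self._max_age = max_age_seconds
        self._clock = clock
        self._value = _MISSING
        self._fetched_at = 0.0

    def get(self):
        """Return the cached data, refreshing it first if it is too old."""
        now = self._clock()
        expired = now - self._fetched_at >= self._max_age
        if self._value is _MISSING or expired:
            self._value = self._fetch()
            self._fetched_at = now
        return self._value
```

A long maximum age suits data such as keyword-to-course links, which are regenerated only occasionally.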

Tables storing data locally include:

  • keyword course links
  • search instances
  • search click-throughs
  • search interests
  • search keywords
  • search studied
  • search recommendations
  • subjects
  • similar courses

The majority of the data stored in the tables listed above relates to Course Finder-specific functionality; some data has been ‘cached’ from Nucleus, purely to save multiple API calls for data that will change very rarely.
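To give a feel for how a few of these tables might fit together, here is a small sketch using Python's built-in `sqlite3` module. Only the table names come from the list above; the column names, the choice of SQLite and the helper functions are assumptions made for illustration, not the application's real schema.

```python
import sqlite3

# Hypothetical columns for four of the tables listed above.
SCHEMA = """
CREATE TABLE search_instances (
    id INTEGER PRIMARY KEY,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE search_keywords (
    search_id INTEGER NOT NULL REFERENCES search_instances(id),
    keyword TEXT NOT NULL
);
CREATE TABLE search_click_throughs (
    search_id INTEGER NOT NULL REFERENCES search_instances(id),
    course_id TEXT NOT NULL
);
CREATE TABLE search_recommendations (
    search_id INTEGER NOT NULL REFERENCES search_instances(id),
    course_id TEXT NOT NULL
);
"""


def create_database(path=":memory:"):
    """Open a database and create the tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_search(conn, keywords):
    """Store one search instance and its keywords; return the search id."""
    cursor = conn.execute("INSERT INTO search_instances DEFAULT VALUES")
    search_id = cursor.lastrowid
    conn.executemany(
        "INSERT INTO search_keywords (search_id, keyword) VALUES (?, ?)",
        [(search_id, keyword) for keyword in keywords],
    )
    conn.commit()
    return search_id


def record_click_through(conn, search_id, course_id):
    """Store the fact that a result was clicked for a given search."""
    conn.execute(
        "INSERT INTO search_click_throughs (search_id, course_id) VALUES (?, ?)",
        (search_id, course_id),
    )
    conn.commit()
```

Keying click-throughs and recommendations on the search instance means each one can be traced back to the exact parameters that produced the result.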

In a follow-up post, I’ll show the created application and describe the benefits and limitations.


Or, ‘How I went round and round in circles….. and then round and round some more’

One of the initial ideas that was suggested back at the beginning of the project was a way of mining course data in order to provide suggestions for similar courses. Since we now have access to the APMS data (a lot more data than my dummy set), finding ways of suggesting similar courses is now something that I can attempt properly.

The first step in the process was deciding on a method of finding keywords from within the various text descriptions of programmes and modules. OpenCalais, which has been used in a previous project at the university (JISCPress), was one such option.

OpenCalais, which has an easy-to-use API, takes a body of text and returns a series of keywords, broken down by type, each with a relevancy score that indicates how strongly the keyword relates to the body of text. Initially I looked at using an existing PHP library to interface with the API (to avoid reinventing the wheel and all that), but found that the libraries available did not return all of the data I wanted in an easy-to-use manner, so I wrote my own code, which will be available on GitHub.

When I first started this process, I had access to data relating to 6,436 modules of study, which are part of 878 individual programmes of study for a total of 349 courses. (If one course has two different years’ intake, it will be represented by 2 programmes. A similar situation exists with modules.)

In the first instance, I looked at generating keywords for all programmes of study (a mistake, which I cover later) and modules. I decided to use the following fields to generate keywords for programmes:

  • ‘Aims and Objectives’
  • ‘Introduction’
  • ‘Specialist Facilities’
  • ‘Career Opportunities’

I also used the ‘Synopsis’ field for modules.

This process generated 3,335 keywords, broken down into 36 types of keywords. 17,835 links were generated between programmes of study and keywords and 19,255 links between keywords and modules.
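One simple way that keyword links like these could feed ‘similar course’ suggestions is to compare the sets of keywords attached to each programme. The Python sketch below uses Jaccard overlap, which is my choice for illustration rather than the method settled on in the project; the programme names and keywords in the usage note are made up.

```python
def jaccard(a, b):
    """Share of keywords two sets have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def most_similar(target, keyword_links, top_n=3):
    """Rank other programmes by keyword overlap with `target`.

    `keyword_links` maps a programme name to the set of keywords
    linked to it. Returns (programme, score) pairs, best match first.
    """
    target_keywords = keyword_links[target]
    scores = [
        (other, jaccard(target_keywords, keywords))
        for other, keywords in keyword_links.items()
        if other != target
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]
```

Given links such as `{"Programme A": {"design", "media", "print"}, "Programme B": {"media", "print", "film"}}`, the two programmes share two of four distinct keywords, so each scores 0.5 against the other. A fuller version could weight each shared keyword by its relevancy score.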

Since my last post there has been a lot of work making final tweaks and fixing bugs that were found during testing. We have already done a ‘technical’ launch of APMS, which means we are now populating programme information directly into the live system. We’re now getting ready for the full launch of the system to the rest of the University, which will take place in a few weeks.

All of the programme information needed to produce Diploma Supplements for this year’s graduates is now in the system, and work will be continuing for some time to enter all of the other information about remaining programmes. Temporary staff who were helping with that task are now concentrating on entering all of the current marketing information about our programmes. That marketing information will then be output in the form of an XCRI-CAP feed to replace the current one on our public website, and as XML and PDF files that will feed the programme pages.

Worktribe have done the work to take reading lists in real time from our new reading list system (Talis Aspire) and released it to us for testing on our test system. Depending on how things work out, this could either go in the live system before we fully launch or slightly afterwards.

Worktribe has also now provided access to the APIs on the test system, which give programmatic read access to the information stored in APMS. For ON Course, this means we can start to look at using live data from APMS to feed outputs from ON Course.

We are doing a final push to get all the outstanding items finished, and then we’ll be ready to hit the metaphorical launch button (if only the button really existed like that!). I am on holiday for two weeks shortly, so the launch should happen very soon after I’m back.

The title of this blog post is taken from Noah Iliinsky’s master’s thesis and relates to designing complex diagrams that contain multiple elements, are multi-layered and well ordered, as opposed to ‘spaghetti diagrams’ that “have undefined axes, little or no order, and are a hodgepodge of similar elements”. The thesis highlights key concepts to consider when designing data visualizations and suggests a design process to use, to help create better defined and more meaningful and useful data visualizations. This blog post acts as a quick overview of some of the concepts covered in the thesis, providing a useful list of concepts to consider when designing diagrams / data visualizations. How these concepts directly apply to my work on ON Course will be outlined in a future blog post.

Firstly, how do we define what constitutes a complex diagram? Dictionary definitions don’t really help in this particular context: “The state or quality of being intricate or complicated”. The thesis suggests three criteria that may contribute towards a diagram being considered complex.

  1. At least four different types of information to present
    Why four different types? This number was arrived at because the vast majority of visualizations encode three or fewer data types, whereas far fewer encode four or more types of data.
  2. Large amounts of information
    This concept is fairly self-explanatory: if you have to present an inordinate amount of data, then whichever method of presentation you choose, it’s going to be a complex diagram.
  3. Presenting qualitative information
    Presenting qualitative information leads towards complex diagrams because it lacks the standard metaphors / representations / scales that often exist for quantitative information. As a result, the designer / author has to create the representative metaphor themselves, in such a way that the target audience can understand the data being shown to them.

Why be concerned with how complex a diagram is going to be? More data types to present in the visualization means that more distinct encoding methods are required. “Once the most obvious encoding has been used, the author must venture into more subtle encodings to make their points. Clearly, the more characteristics there are to be represented, the more difficult this becomes.”

