Monday, November 23, 2009

Live-Tweeting Now: IRGO unConference

Thursday, November 19, 2009

IRGO unConference Live Tweets



What: Inaugural IRGO unConference 1.0: NZ Digital Futures

Where: Centre for Innovation, the University of Otago, Dunedin, New Zealand (visitor information here)

When: 23-24th November, 2009

Digital artifacts from the unConference will be available online after the unConference. Information on further proceedings here.

I'll be on the panel for "Building Communities" (talking about blogging as a tool). Interested? Best thing about this event is that YOU get to follow it here - on my blog - FOR FREE!! :) *Woot woot* So don't forget to send yourself a reminder for this event!

Or simply follow me on Twitter -----> @classyadele.

Saturday, November 14, 2009

The Wayback Machine

Picture source here
(Remembering Mr. Peabody and Sherman, from the Rocky and Bullwinkle show)

As noted in a recent entry, another hurdle that I had to grapple with was the mundane task of archiving 6 months' (Jan - June 2009) worth of blog entries from 16 corporate blogs. Adrian suggested free blog software downloads. Although I was already aware of these tools, again - unfortunately the 'good stuff' isn't free. A good example is Spinn3r (USD500 service rate/month!). I did email them (the guys at Spinn3r) for a student price quote - and they were kind enough to offer me FREE access to the software, though with limited functions - the primary (and most important) limitation being the inability to archive previous blog entries. I could only archive blog entries that were 'current' (from the day I signed up). Ahhh.... so close, yet so far. Bummer. I was also introduced to some free tools such as httrack; unfortunately, I suspect it works the same way as wget (mirroring websites).

Unfortunately, these tools work best with static websites (basic HTML), not blogs (dynamic text). Don't believe me? Click here, and you'll find the inquiry as to whether httrack archives blog data is still unanswered. The 'grabbing of urls, etc.' (rich media content) would have been an 'endless loop' process on these tools. I've done my research on this (even enlisted the help of the university's School of Business IT department - who so kindly helped me - I suspect they were just as curious as me about the whole process), and as expected, it turned out to be an endless process -- e.g. a particular blog was archiving gigs of data (because of the rich content). Here's some stuff that I found online that may help to explain this better (taken from here):-

Not all sites can be archived – same reasons given for the unsuccessful attempt at mirroring blogs – only HTML-friendly sites are okay. Dynamic pages are a bit more difficult.
How do you archive dynamic pages?
There are many different kinds of dynamic pages, some of which are easily stored in an archive and some of which fall apart completely. When a dynamic page renders standard html, the archive works beautifully. When a dynamic page contains forms, JavaScript, or other elements that require interaction with the originating host, the archive will not contain the original site's functionality.
Why are some sites harder to archive than others?
If you look at our collection of archived sites, you will find some broken pages, missing graphics, and some sites that aren't archived at all. Here are some things that make it difficult to archive a web site:
  • Robots.txt -- We respect robot exclusion headers.
  • Javascript -- Javascript elements are often hard to archive, but especially if they generate links without having the full name in the page. Plus, if javascript needs to contact the originating server in order to work, it will fail when archived.
  • Server side image maps -- Like any functionality on the web, if it needs to contact the originating server in order to work, it will fail when archived.
  • Unknown sites -- The archive contains crawls of the Web completed by Alexa Internet. If Alexa doesn't know about your site, it won't be archived. Use the Alexa Toolbar (available at www.alexa.com), and it will know about your page. Or you can visit Alexa's Archive Your Site page at http://pages.alexa.com/help/webmasters/index.html#crawl_site.
  • Orphan pages -- If there are no links to your pages, the robot won't find it (the robots don't enter queries in search boxes.)
As a general rule of thumb, simple html is the easiest to archive.
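For the curious, the 'endless loop' problem described earlier can be kept in check by giving a crawler hard limits. Below is a minimal Python sketch, not how httrack or wget actually work internally: it follows links breadth-first, stays on one host, and stops at a maximum depth and page count. The names `bounded_crawl` and `LinkCollector`, and the `fetch` callable you pass in, are all hypothetical and only there for illustration.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag found in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def bounded_crawl(start_url, fetch, max_depth=2, max_pages=50):
    """Breadth-first crawl that stays on one host and stops at fixed limits.

    `fetch` is a hypothetical callable: it takes a URL and returns the page's
    HTML as text, or None on failure. Plug in whatever downloader you use.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}                  # URLs already queued, to avoid repeats
    queue = deque([(start_url, 0)])     # (url, depth) pairs still to visit
    pages = {}                          # url -> html for everything saved

    while queue and len(pages) < max_pages:
        url, depth = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        if depth >= max_depth:
            continue                    # save the page but do not follow its links

        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            link, _fragment = urldefrag(urljoin(url, href))
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))

    return pages
```

A blog whose every page links to a 'next' page forever would normally keep a mirroring tool busy indefinitely; with `max_depth` and `max_pages` set, the crawl ends after a predictable number of pages. The trade-off is that anything deeper than the limit is simply not saved.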

Which brings me to highlight something - heard of the Wayback Machine? It's an Internet-based non-profit digital library that archives important sites around the world. Read more about it here. Also good to check out their FAQ. You can make a request to them to archive specific websites, as long as you provide a valid justification for them to do it, of course.
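If you'd rather check from a script whether a page has been archived, here is a small Python sketch. It assumes the Internet Archive's availability endpoint at `https://archive.org/wayback/available` and a JSON reply containing an `archived_snapshots` / `closest` entry; do check both against their current documentation before relying on them. The function names are made up for illustration, and the sketch only builds the query address and reads a reply you have already downloaded.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against the Internet Archive's own documentation.
WAYBACK_AVAILABILITY = "https://archive.org/wayback/available"


def availability_query(page_url, timestamp=None):
    """Build the query address asking whether `page_url` has a snapshot.

    `timestamp` is an optional YYYYMMDD string; the service is assumed to
    return the snapshot closest to that date.
    """
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return WAYBACK_AVAILABILITY + "?" + urlencode(params)


def closest_snapshot(response):
    """Return the snapshot address from an already-parsed JSON reply, or None.

    The key names used here are assumptions about the reply's layout.
    """
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None
```

You would fetch the address from `availability_query(...)` with any HTTP tool, parse the reply as JSON, and hand the resulting dictionary to `closest_snapshot`.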

Pretty cool, eh? It's not without some criticisms though. It's believed that archives such as this Wayback Machine may eventually die away as more and more websites become managed dynamically, unless someone comes up with an easy way to archive a dynamic site. Like, literally shooting a moving target.

Now back to that archiving of blog entries. Thank God for Scrapbook, which enabled me to easily save the webpages in a manageable format. After that was done, I also had to 'print' the webpages into .pdf format (to maintain the page layout and format) for further analysis in the qualitative data analysis software that I have chosen (ATLAS.ti, which by the way - was another steeeeeeeeeep learning curve for me - as I had to play around with the software myself while referring to a 400-page manual).

Yes, I know, I could have auto-printed the webpages into .pdf docs .........................IF I had a MAC. Unfortunately, I don't - Yup, I've always been a PC person :( Dangit. Where's a MAC when one needs it, eh?
Picture source here.

Although I do have friends who own MACs - I learned that tip a wee bit too late. I did pose this as an issue on Twitter (as the free software tool pdf995 wasn't capturing my blogs nicely), and people were helpful enough to offer other software suggestions (pdfcute, etc.), but by that time, I had already archived my webpages on Scrapbook and couldn't 'transfer' the urls from Scrapbook to a directory, then later access them using a MAC. On Scrapbook, I could only transfer the saved pages as complete saved pages in folders. Not sure if I'm making myself clear - but let's just say, I wish the MAC idea had come earlier :) And also get this - the free software tools aren't perfect (hence the word 'free'), but there are tools out there that beautifully capture the wanted pages with a click of a mouse, but they aren't free :( So a poor PhD student like me has to work within my means lah.


Regardless of all that - it's a great learning process - a steep one, no doubt - but hey, all this stuff is nicely sitting in my database of skills, knowledge and training - they will come in handy again one day. But have no fear - just in case you foresee yourself in my predicament months from now - I strongly believe that as time progresses, we'll be seeing plenty of these tools that would not only be free, easy-to-use but effective as well. Only time will tell. As for now, as I've reached this stage of my PhD - I can only work with whatever means that I have.

"The best a gal can get" (sounds familiar?).

On another note - yup, I can't believe it either. Such a techy entry for an un-techy person like moi! LOL.

Thursday, November 12, 2009

The Inundation of Data

Just like Mariah Carey's "The Emancipation of Mimi", I feel like coming out with an album too - "The Inundation of Addy". Or something to that effect. Silly, I know. But that's really what my mind's like at the moment. Silly nilly addy. LOL.

This is what my desk at home looks like now. Trust me, this is only stage 1. Give me a month more, you'll see more stuff lying around. Not kidding when I said I was laying low...buried under heaps of data.

I'm beginning to wonder if I collected too much data! Ahhh....
What's been completed so far (yes, typing all this down somehow makes me feel better):-
  • 19 interviews; 19 transcriptions
  • 16 blogs' data (overall stats)
  • Archiving 6 months of data from each blog - 753 blog entries in total
  • Individual stats from each of the 753 blog entries (I went nearly blind at this stage, I kid you not)
  • Coding each interview transcript - ~160 codes
I'm at the analyzing stage now, starting with within-case analysis (looking at each individual company's codes and accompanying quotations) and concurrently running the cross-case analysis. Am I seeing some emerging patterns? Yes, but I've got 12 more interviews to analyze, which might (or might not) validate my suspicions. We'll see how it goes. So patience, my young Padawan. Here, have a cookie... (don't say I didn't offer you now!)

Picture source here


Anyway, I think there's a good story to tell so far.
When I was ranting about the amount of mini-tasks that had to be done at each stage, a friend (yes, that's you Adrian) offered some advice that might quicken the process. For instance, he suggested software programs that can convert speech to text. Although I'm aware of tools that can do that - unfortunately, the researcher would still have to 'train' the program to recognize certain words that are pronounced differently, while introducing new words that aren't regularly used (and I'm sure we all know that there are some 'new words' in the social media realm, e.g. flog, spam, twitter, tweets, etc., that aren't common words used in everyday language).

I have no doubt that it might work better with American-accented speech. Unfortunately, my interviewees have a mix of Scottish, Kiwi, British, Malaysian, Welsh and American accents. I also suspect these software tools are more effective at transcribing narrated speech (e.g. in a narrative format, as narrated by 1 person), as compared to interviews (dynamic sessions involving two or more people), unfortunately. E.g. someone reading out a story/diary would be easily transcribed by the software VS 2 people actively engaging in a discussion. Plus, not forgetting the fact that the researcher would also have to carry out frequent checks on the transcripts to ensure their accuracy (against the audio interview). Which may result in me spending more time (possibly) correcting them, thus making it more sensible for me to transcribe my own interviews accurately (simply because I was there, hence I know what was said).

In my case, I only had to check the interviews that I outsourced to be transcribed, which weren't many (9) - leaving me with the remaining 10 interviews to transcribe myself. A lot, eh? Well, to be honest, I didn't really mind transcribing the interviews because as a researcher, you have to get as close to the data as possible anyway. But I did cringe as I heard myself on tape - arghhhh.....!! LOL


Okay, that's enough for now. My next entry will be about my rant on me wanting to die a painful death from dealing with the problem of archiving blog data and cantankerous *free* pdf printing tools.

Saturday, November 07, 2009

Si Translator (Films)


I'm continuing the saga (riding on the wave while it's still strong, eh) with films now. So go ahead, try and guess these movie titles! Good luck!

#filemBIdalamBM (Guess these Malay-translated English films)

1. Lusa

2. Bini si musafir masa

3. Aku tahu apa yang kamu buat musim panas lepas

4. Muzik sekolah tinggi
5. Tukang pos selalu berdering dua kali
6. Senarai baldi


#filemBMdalamBI (Guess these English-translated Malay films)
7. Female, Wife and ...

8. Expired Bachelor

9. Ghost Thumb Descendant

10. Fortune-telling Father Locusts

:-) Somehow trying to translate English films into Malay seemed much more fun! LOL

Hints:- I've chosen mostly classic Malay films (doesn't matter if you've watched them or not, but they have such interesting titles that it's hardly possible to forget them!). And to make it slightly more difficult, I had to 'tune' them to make them sound logical, rather than translating verbatim.

As for the English films - a mix of popular and classic films is in there!